Hacker News | wood_spirit's comments

Not a rubyist, so just curious about the background: is this the “good” or the “bad” side in the spat? What’s the other side, and what has been the broader community impact?

From what I can tell, this story is primarily about personalities. The community essentially ended up with several factions, but I’ll try to explain this without it degenerating into the schoolyard fighting that it appears to be.

1. Ruby Central is the surviving Ruby non-profit that another Ruby non-profit, Ruby Together, merged into. This is where part of the legal ambiguity/dispute comes from, which will make sense in (2).

2. RubyGems (the code, GitHub repo, etc) and RubyGems.org are two separate things. RubyGems code appears to not have been legally transferred in the merger. RubyGems.org is run by Ruby Central, but this transfer is also extremely muddy.

3. For reasons in dispute, Ruby Central seized the GitHub repos of RubyGems. It is not clear they have the legal or ethical right to do this (based on the evidence, I believe they do not and they have committed theft).

4. Ruby Central has made various noises about the need to do this for security and other things despite the extremely sloppy nature of the takeover.

5. Ruby Central then “gave” RubyGems to the Ruby core team without resolving anything in what appears to be an attempt to try and end the controversy.

In the background of all of this appears to be a lack of trust, dhh posting crap like this: https://world.hey.com/dhh/as-i-remember-london-e7d38e64, resulting in a fight about the future of the Ruby ecosystem.


https://joel.drapper.me/p/rubygems-takeover/

Read the above, but tl;dr is that Shopify executed a hostile takeover of Ruby Central for its own benefit, at the expense of long-term maintainers and the general community. I'm not sure if there's been any real change since then, but there are many reasons not to trust anything that the board says at this point.


IMHO, Ruby Central keeps trying to find a way to frame all of this in a good light, but it seems like they keep falling flat. They tried doing filtered Q&A avoiding all the obvious questions that people hostile to what happened would ask, temporarily providing transparency reports that didn’t really say much. It all felt like very incompetent damage control.

I think they were hoping that handing it off to the Ruby core team would allow them to move on, but that requires ownership of their failings, or at least actions that demonstrate they will do better going forward, and none of that has happened.


Wait, I had no idea dhh was on the outs now. This is the first I've heard of this. I have to go look for more information about this. What did he do?

I would recommend as a starting point this beautiful piece from November: https://okayfail.com/2025/in-praise-of-dhh.html

Not sure he's "on the outs"; he's on Shopify's board.

Sidekiq's solo dev (Mike Perham) has for many years made a generous donation to Ruby Central. He informed them that he didn't want his money spent platforming dhh at their conference; they ignored his request, so he stopped his annual donations.

If you want to read about dhh's colorful blog posts and tweets: https://jakelazaroff.com/words/dhh-is-way-worse-than-i-thoug...


Colorful is an odd way to spell "vocally bigoted".

He came out as a white nationalist [1]. And he's always been contentious.

[1] https://jakelazaroff.com/words/dhh-is-way-worse-than-i-thoug...


If

MINASWAN...

WTFIDHHSAFA???


If you’d like to read, in his own words, his “coming out” as an ultra right wing racist piece of shit, feel free to look on his blog for the post titled “As I Remember London.”

Shopify and/or its technical leadership worked its connections to oust a Rubygems maintainer they saw as a threat to Ruby projects Shopify has invested in.

This was especially provocative because it involved Ruby Central asserting control over Rubygems, which it does not own.

It was (by credible accounts) a "preemptive strike" on this maintainer, and thus was not communicated to other RG maintainers, who were understandably angry.

The statement from RC at the time sounded like a lot of CYA, and this doesn't read as all that sincere either.


The company has an internal policy of “open company, no bullshit” and an internal channel for venting called literally “outrage”. I don’t see an “official internal” and “unofficial internal” distinction here.


(Perhaps a good library for timestamp code in data pipelines https://github.com/williame/TimeMillis)

Thanks! I'm using Instant.parse at present and this is supposedly 37x faster. Will definitely give it a try.

And report back please! :)

Also zap the timestamp instant objects if you really need speed; see https://github.com/williame/TimeMillis

Yes! Obligatory link to the seminal work on the subject:

https://gwern.net/doc/cs/2005-09-30-smith-whyihateframeworks...


A subject close to my heart, I write a lot of heavily optimised code including a lot of hot data pipelines in Java.

And aside from algorithms, it usually comes down to avoiding memory allocations.

I have my go-to zero-alloc grpc and parquet and json and time libs etc and they make everything fast.

It’s mostly how idiomatic Java uses objects for everything that makes it slow overall.

But eventually, after building a JVM app that keeps data in something like data frames and feels a long way from J2EE beans, you can still bump up against limits that only C/C++/Rust etc. can get you past.
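As a sketch of what the zero-alloc style looks like in practice (an illustrative example of the technique, not code from the libraries mentioned above): a hot loop can parse a numeric field, such as an epoch-millis timestamp, straight out of a CharSequence slice without creating a String or Instant per record.

```java
// Illustrative zero-allocation parsing: reads a non-negative decimal
// long directly from a CharSequence slice, so a hot loop creates no
// intermediate String or Instant objects per record.
public final class ZeroAllocParse {
    private ZeroAllocParse() {}

    public static long parseLong(CharSequence s, int from, int to) {
        long value = 0;
        for (int i = from; i < to; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                throw new NumberFormatException("non-digit at index " + i);
            }
            value = value * 10 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        // Parse the millis field of a CSV-ish record without substring().
        CharSequence line = "event,1717171717123,ok";
        long millis = parseLong(line, 6, 19);
        System.out.println(millis); // prints 1717171717123
    }
}
```

The same indices-into-a-reused-buffer idea is how zero-alloc JSON/parquet readers typically avoid per-field garbage.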


> And aside from algorithms, it usually comes down to avoiding memory allocations.

I’ve heard about HFT people using Java for workloads where micro optimization is needed.

To be frank, I just never understood it. From what I’ve seen/heard, you have to write the code in such a way that it looks clumsy and is incompatible with pretty much any third-party dependencies out there.

And at that point, why are you even using Java? Surely you could use C, C++, or any variety of popular or unpopular languages that would be more fitting and ergonomic (sorry, but as a language Java just feels inferior to C# even). The biggest selling point of Java is the ecosystem, and you can’t even really use that.


I am very interested in this and would like an authoritative answer. I even went as far as buying some books on code optimization in the context of HFT, and I was not impressed: not a single snippet of assembly. How are you optimizing anything if you don't look at what the compiler produces?

But on Java specifically: every Java object still carries a header (typically 12–16 bytes on HotSpot). How doesn't that thrash your cache?

The advice on avoiding allocations in Java also results in terrible code. For example, in math libraries, you'll often see void Add(Vector3 a, Vector3 b, Vector3 out) as opposed to the more natural Vector3 Add(Vector3 a, Vector3 b). There you go: function composition goes out the window, and the resulting code is garbage to read and write. Not even C is that bad; the compiler will optimize the temporaries away. So you end up with Java that is worse than a low-level imperative language.
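For concreteness, a minimal sketch of that out-parameter style (the Vector3 class and method names here are illustrative, not taken from any particular library):

```java
// Minimal sketch of the allocation-avoiding "out parameter" style.
final class Vector3 {
    double x, y, z;

    // Zero-alloc form: the result is written into 'out', nothing is
    // created, so a hot loop can reuse one scratch Vector3 throughout.
    static void add(Vector3 a, Vector3 b, Vector3 out) {
        out.x = a.x + b.x;
        out.y = a.y + b.y;
        out.z = a.z + b.z;
    }

    // The "natural" composable form: allocates a fresh result per call.
    static Vector3 add(Vector3 a, Vector3 b) {
        Vector3 result = new Vector3();
        add(a, b, result);
        return result;
    }
}
```

In a tight loop you would create one scratch Vector3 up front and keep passing it to the three-argument form, which is exactly the composition-hostile pattern being complained about.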

And, as far as I know, the best GC for Java still incurs no less than 1ms pauses? I think the stock ones are as bad as 10ms. How anyone does low-latency anything in Java then boggles my mind.


Modern ZGC guarantees under 1ms pause times, and Azul's pauseless C4 has been around for a while too.

The biggest selling point of Java is that you can easily find programmers that know it. They will need some training to do HFT style code but you'll still pay them less than C++ prima donnas and they'll churn out reasonably robust code in good time.

Can you share the libs you're using?

Does this have corresponding speed ups or memory gains for normal CPUs too? Just thinking about all the cups of coffee that have been made and drunk while scikit-learn kmeans chugs through a notebook :)

On CPU with bigger K you would put the centroids in a search tree to take advantage of the sparsity, while a GPU would calculate the full NxK distance matrix. So from my understanding the bottleneck they are fixing doesn't show up on CPU.
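To make the distinction concrete, here is a minimal illustrative sketch (in Java, names my own) of the per-point assignment step that a GPU batches over all N points as one dense NxK distance computation; the CPU alternative for large K would replace the inner loop over centroids with a search-tree query:

```java
// Illustrative brute-force assignment step of k-means: compute the
// squared distance from one point to all K centroids and pick the
// nearest. A GPU runs this for all N points as one dense N x K
// computation; CPU code with large K would instead query a search
// tree (e.g. a k-d tree at low dimension) to skip most centroids.
public final class KMeansAssign {
    public static int nearestCentroid(double[] point, double[][] centroids) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < centroids.length; k++) {
            double d = 0;
            for (int j = 0; j < point.length; j++) {
                double diff = point[j] - centroids[k][j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centroids = { { 0, 0 }, { 10, 10 } };
        System.out.println(nearestCentroid(new double[] { 1, 1 }, centroids)); // prints 0
    }
}
```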

search trees tend not to scale well to higher dimensions though, right?

from what I've seen I had the impression that Yinyang k-means was the best way to take advantage of the sparsity.


Most data I've used is geospatial with D<=4 (x, y, z, t), so for me search trees worked great. But for things like descriptor or embedding clustering, yes, trees wouldn't be useful.

Then these papers with these instructions get included in the training corpus for the next frontier models and those models learn to put these kinds of instructions into what they generate and …?

This seems a pretty big claim! What truths do people believe that are wrong, and what do you believe the truth to be? And why would they protect a generation and from what and why? And will they not protect the generation who are coming up now, still learning the false truth because the real truth hasn’t been revealed yet?

> What truths do people believe that are wrong, and what do you believe the truth to be?

I'm not certain.

> And why would they protect a generation and from what and why?

It's very common to delay information to minimize its impact. I suppose for "national security." My intuition is that it is a matter of country pride, and cultural "ownership" over a world wonder.

> And will they not protect the generation who are coming up now, still learning the false truth because the real truth hasn’t been revealed yet?

No, that's how delaying information works. They'll all be dead by the time it comes out anyway.

