
I think it's worth noting that EleutherAI is a grassroots collection of researchers, which distinguishes it from academia/industry labs.

As part of their work on democratizing AI, they're now hoping to replicate GPT-3 and release it for free (unlike OpenAI's API).

I would encourage everyone interested to join their discord server (https://discord.gg/BK2v3EJ) -- they're extremely friendly and I think it's a project worth contributing to.



How are they sourcing/funding the compute to train these massive models?


TFRC, a program that lets you borrow Google's TPUs when they're not being used. You can apply here: https://www.tensorflow.org/tfrc


Connor Leahy, who I think is a sort of BDFL figure for EleutherAI, mentioned in a Slatestarcodex online meetup I attended that Google donated millions of dollars' worth of preemptible TPU credits to the project. There is a video of the meetup on YouTube somewhere. He struck me as a really smart kid with a lot of passion.


Haha Connor (although one of the main participants) definitely isn't a BDFL - we don't have any BDFLs :)

We don't really have much of a hierarchy at all - it's mostly just a collection of researchers from widely varying backgrounds, all interested in ML research.


I'm not sure what a BDFL figure is, but Google does not give us millions of dollars. We are a part of TFRC, a program where researchers and non-profits can borrow TPUs when they're not being used. You could say that we are indirectly funded as a result, but it's nowhere near millions of dollars and it doesn't reflect any kind of special relationship with Google.


Benevolent dictator for life


EleutherAI has a very flat hierarchy; we do not have any BDFL-like figure.


They'll probably run it on scientific clusters at various universities, or on collections of idle lab desktop machines. Both tend to sit idle a lot of the time, based on my experience at uni in Europe.


Any idea how large the dataset used to train GPT-3 was?


The filtered Common Crawl portion alone was 570GB, but CC makes up only 60% of the training mix, and only about 40% of the CC data was seen even once during training. You could work through the math to get a rough size for GPT-3's full training set, but it sounds like The Pile is of comparable size.
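
As a rough back-of-envelope sketch using just the figures quoted above (and assuming each corpus's sampling weight roughly tracks its size, which is only approximately true since the smaller corpora were upsampled):

    # Back-of-envelope estimate from the numbers above; not exact
    # figures from the GPT-3 paper.
    cc_filtered_gb = 570   # filtered Common Crawl size
    cc_mix_weight = 0.60   # fraction of the training mix drawn from CC

    est_total_gb = cc_filtered_gb / cc_mix_weight
    print(f"rough GPT-3 training corpus size: ~{est_total_gb:.0f} GB")
    # ~950 GB, in the same ballpark as The Pile (~825 GiB)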


Yeah, the Pile is approximately the size of the GPT-3 training data, which is not a coincidence - one major reason we created the Pile (though certainly not the only one) was our GPT-3 replication project.



