
I think it's worth noting that EleutherAI is a grassroots collection of researchers, which distinguishes it from academia/industry labs.

As part of their work on democratizing AI, they're now hoping to replicate GPT-3 and release it for free (unlike OpenAI's API).

I would encourage everyone interested to join their discord server (https://discord.gg/BK2v3EJ) -- they're extremely friendly and I think it's a project worth contributing to.



How are they sourcing/funding the compute to train these massive models?


TFRC, a program that lets you borrow Google's TPUs when they're not being used. You can apply here: https://www.tensorflow.org/tfrc


Connor Leahy, who I think is a sort of BDFL figure for EleutherAI, mentioned in a Slatestarcodex online meetup I attended that Google donated millions of dollars' worth of preemptible TPU credits to the project. There is a video of the meetup on YouTube somewhere. He struck me as a really smart kid with a lot of passion.


Haha Connor (although one of the main participants) definitely isn't a BDFL - we don't have any BDFLs :)

We don't really have much of a hierarchy at all - it's mostly just a collection of researchers from widely varying backgrounds, all interested in ML research.


I'm not sure what a BDFL figure is, but Google does not give us millions of dollars. We are a part of TFRC, a program where researchers and non-profits can borrow TPUs when they're not being used. You could say that we are indirectly funded as a result, but it's nowhere near millions of dollars and it doesn't reflect any kind of special relationship with Google.


Benevolent dictator for life


EleutherAI has a very flat hierarchy; we do not have any BDFL-like figure.


They'll probably run it on scientific clusters at various universities, or on collections of idle lab desktop machines. Both tend to sit idle a lot of the time, based on my experience at uni in Europe.


Any idea how large the dataset used to train GPT-3 was?


The filtered Common Crawl portion alone was 570GB, but CC makes up only 60% of the training mix, and only about 40% of the CC data was seen even once during training. You could work through the math to get a rough size for GPT-3's full training set, but it sounds like The Pile is of comparable size.
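
As a rough back-of-envelope sketch using just the figures quoted above (and assuming each corpus's sampling weight roughly tracks its size, which is only approximately true since the smaller corpora were upsampled):

    # Back-of-envelope estimate from the numbers above; not exact
    # figures from the GPT-3 paper.
    cc_filtered_gb = 570   # filtered Common Crawl size
    cc_mix_weight = 0.60   # fraction of the training mix drawn from CC

    est_total_gb = cc_filtered_gb / cc_mix_weight
    print(f"rough GPT-3 training corpus size: ~{est_total_gb:.0f} GB")
    # ~950 GB, in the same ballpark as The Pile (~825 GiB)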


Yeah, the Pile is approximately the size of the GPT-3 training data, which is not a coincidence - one major reason we created the Pile (though certainly not the only one) was our GPT-3 replication project.



