Hacker News | gpjt's submissions
1.Provision: LLM-powered server setup from Markdown (provision.sh)
2 points by gpjt 4 days ago | past | discuss
2.LLM from scratch, part 32j – trying to train a better model in the cloud (gilesthomas.com)
2 points by gpjt 5 days ago | past | discuss
3.Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com)
1 point by gpjt 7 days ago | past | discuss
4.Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com)
7 points by gpjt 11 days ago | past | discuss
5.Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
2 points by gpjt 21 days ago | past
6.Writing an LLM from scratch, part 32f – Interventions: weight decay (gilesthomas.com)
6 points by gpjt 22 days ago | past
7.Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com)
3 points by gpjt 35 days ago | past
8.Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com)
6 points by gpjt 67 days ago | past
9.Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)
1 point by gpjt 68 days ago | past
10.Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com)
2 points by gpjt 69 days ago | past
11.Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com)
1 point by gpjt 70 days ago | past
12.Getting a Custom PyTorch LLM onto the Hugging Face Hub (gilesthomas.com)
1 point by gpjt 76 days ago | past
13.Writing an LLM from scratch, part 31 – the models are now on Hugging Face (gilesthomas.com)
2 points by gpjt 87 days ago | past
14.Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results (gilesthomas.com)
1 point by gpjt 3 months ago | past
15.LLM from scratch, part 29 – using DDP to train a base model in the cloud (gilesthomas.com)
2 points by gpjt 3 months ago | past
16.LLM from scratch, part 28 – training a base model from scratch on an RTX 3090 (gilesthomas.com)
540 points by gpjt 4 months ago | past | 121 comments
17.Writing an LLM from scratch, part 27 – what's left, and what's next? (gilesthomas.com)
1 point by gpjt 5 months ago | past
18.Writing an LLM from scratch, part 26 – evaluating the fine-tuned model (gilesthomas.com)
4 points by gpjt 5 months ago | past
19.Writing an LLM from scratch, part 25 – instruction fine-tuning (gilesthomas.com)
2 points by gpjt 5 months ago | past
20.Writing an LLM from scratch, part 24 – the transcript hack (gilesthomas.com)
1 point by gpjt 5 months ago | past
21.Retro Language Models: Rebuilding Karpathy's RNN in PyTorch (gilesthomas.com)
3 points by gpjt 5 months ago | past
22.Writing an LLM from scratch, part 23 – fine-tuning for classification (gilesthomas.com)
1 point by gpjt 5 months ago | past
23.Writing an LLM from scratch, part 22 – training our LLM (gilesthomas.com)
254 points by gpjt 6 months ago | past | 10 comments
24.Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks' (gilesthomas.com)
2 points by gpjt 6 months ago | past
25.Writing an LLM from scratch, part 21 – perplexed by perplexity (gilesthomas.com)
1 point by gpjt 6 months ago | past
26.Writing an LLM from scratch, part 20 – starting training, and cross entropy loss (gilesthomas.com)
41 points by gpjt 6 months ago | past | 3 comments
27.How Do LLMs Work? (gilesthomas.com)
2 points by gpjt 6 months ago | past | 1 comment
28.The maths you need to start understanding LLMs (gilesthomas.com)
616 points by gpjt 7 months ago | past | 120 comments
29.What AI chatbots are doing under the hood (gilesthomas.com)
2 points by gpjt 7 months ago | past
30.LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud (gilesthomas.com)
2 points by gpjt 7 months ago | past
