Is it easy to find where the matvecs are in LLaMA, if you are someone who is curious and wants to poke around at the “engine” without understanding the “transmission,” so to speak? I was hoping to mess around with this for Stable Diffusion, but the matvecs seemed to be buried under quite a few layers of indirection. Which is entirely reasonable: the goal is to ship software, not satisfy people who’d just want to poke things and see what happens, haha.
Did you see that tinygrad can run LLaMA and Stable Diffusion? It's an intentionally simple framework, somewhere between PyTorch and micrograd in size, which helped me dig into the underlying math. https://spreadsheets-are-all-you-need.ai/ is also a good one for learning how LLMs work.
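For a rough idea of where the matvecs sit, here is a minimal pure-Python sketch of the feed-forward half of a LLaMA-style layer for a single token (RMSNorm, then a SwiGLU MLP, then a residual add). The names `matvec`, `W_gate`, `W_up`, `W_down` and the tiny dimensions are made up for illustration and are not taken from any particular codebase; real implementations run batched matmuls through linear layers rather than explicit matvecs, and the attention half adds a few more projections of the same kind.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rmsnorm(x, weight, eps=1e-5):
    """Scale x by the reciprocal of its root-mean-square, then by a learned weight."""
    scale = 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v * scale for w, v in zip(weight, x)]

def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu_mlp(x, W_gate, W_up, W_down):
    """SwiGLU feed-forward: three matvecs per token."""
    gate = matvec(W_gate, x)                          # matvec 1
    up = matvec(W_up, x)                              # matvec 2
    hidden = [silu(g) * u for g, u in zip(gate, up)]  # elementwise, no matvec
    return matvec(W_down, hidden)                     # matvec 3

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
dim, hidden_dim = 8, 16  # toy sizes, far smaller than any real model
x = [rng.uniform(-1, 1) for _ in range(dim)]
norm_w = [1.0] * dim
W_gate = random_matrix(hidden_dim, dim, rng)
W_up = random_matrix(hidden_dim, dim, rng)
W_down = random_matrix(dim, hidden_dim, rng)

h = rmsnorm(x, norm_w)
# Residual connection: add the MLP output back onto the input.
out = [xi + yi for xi, yi in zip(x, swiglu_mlp(h, W_gate, W_up, W_down))]
print(len(out))
```

If you want something to poke at, swapping `matvec` for a version that logs or perturbs its inputs is an easy way to see how much each projection matters.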