I think for fine-tuned GPT-3.5 to be competitive with GPT-4 on your use cases (assistance with Angular), you'd have to fine-tune on so much data that the process resembles continued pre-training more than fine-tuning. And it wouldn't be worth the hassle unless you're building a product around it.
That said, many valuable LLM products / features are narrower in scope and can see a huge lift from fine-tuning. We've run a bunch of experiments on this (SQL query generation is a good example), where even a fine-tuned 7B Llama-2 model surprisingly outperforms GPT-4 [1]. That's a very different type of problem from teaching software engineering, of course.
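
For anyone curious what "narrow in scope" means in practice: the training data for a task like SQL generation is basically (schema, question, SQL) triples serialized into prompt/completion pairs. A minimal sketch of that preprocessing step (the field names and record format here are my own illustration, not necessarily what the post below uses):

    # Hypothetical sketch: serialize (schema, question, sql) triples into
    # prompt/completion JSONL for fine-tuning. Field names are assumptions.
    import json

    examples = [
        {
            "schema": "CREATE TABLE users (id INT, name TEXT, signup_date DATE)",
            "question": "How many users signed up in 2023?",
            "sql": "SELECT COUNT(*) FROM users "
                   "WHERE signup_date BETWEEN '2023-01-01' AND '2023-12-31';",
        },
    ]

    with open("sql_finetune.jsonl", "w") as f:
        for ex in examples:
            record = {
                "prompt": f"Given the schema:\n{ex['schema']}\n"
                          f"Write SQL for: {ex['question']}",
                "completion": ex["sql"],
            }
            f.write(json.dumps(record) + "\n")

The point is that the task is fully specified by a few thousand examples like this, which is exactly the regime where a small fine-tuned model can beat a much larger general-purpose one.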
[1] https://www.anyscale.com/blog/fine-tuning-llama-2-a-comprehe...