I threw a challenging rendering problem at it and was pretty impressed with the overall structure and implementation. But as I looked deeper, the flaws became apparent: it simply made up APIs that didn't exist, and when prompted to fix them, it couldn't figure it out.
Still, despite being fundamentally wrong it did send me down some different paths.
I asked ChatGPT about the API for an old programming game called ChipWits. It invented a whole programming language it called ChipTalk, an amalgam of the original ChipWits stuff, missing some bits and adding others, and generated a parser for it, which I implemented and got working. Only after talking to the original ChipWits devs did I figure out how much of it was imaginary. They found it pretty amusing.
I'm fast learning Django and even though it's an extremely well documented space, ChatGPT has sent me down the wrong path more than a handful of times.
This is especially difficult because I don't know when it's wrong, and it's so damn confident. I've gotten better at questioning its correctness when the code doesn't perform as expected, but initially it cost me upwards of 30 minutes each time.
Still, I would say between ChatGPT and Copilot - I'm WAY further ahead.
My biggest problem with it is that it doesn't seem to understand its own knowledge. If you go back and forth with it on a coding problem for a while, it will often suddenly start using syntax that doesn't exist, even though it has responded correctly many times before and so should already "know" that syntax can't possibly exist. In human terms, it has read the documentation and must know the syntax is wrong, yet it doesn't know that ten seconds later. That's currently what makes it seem like a not-real intelligence to me.
It's very likely it was using other languages as "inspiration", given there's very little Zig code out there, so it's perhaps natural that it would use APIs that don't yet exist. Maybe informing it that it also needs to implement those APIs could work?
A simple confidence metric could do the trick. As the model grows larger, it gets more difficult to understand what's going on inside, but that doesn't mean it needs to be a total black box. At least let it expose some proxy metrics; in due course, we'll learn to interpret those metrics and adjust our internal trust model.
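One cheap proxy along these lines, assuming the API exposes per-token log-probabilities (as some completion APIs do), is the geometric mean of the token probabilities: answers the model was "sure" about sit near 1, shaky ones drift toward 0. A minimal sketch, not tied to any particular vendor's API:

```python
import math

def confidence_proxy(token_logprobs):
    """Geometric mean of per-token probabilities as a rough confidence proxy.

    token_logprobs: one log-probability per generated token, as exposed by
    APIs that return logprobs. Result is in (0, 1].
    """
    if not token_logprobs:
        raise ValueError("no tokens to score")
    # exp(mean of logprobs) == geometric mean of the probabilities
    return math.exp(sum(token_logprobs) / len(token_logprobs))

# A confident answer has logprobs near 0; a hallucination-prone one drifts negative.
confident = confidence_proxy([-0.05, -0.10, -0.02])
shaky = confidence_proxy([-2.3, -1.8, -3.1])
```

This is only a heuristic — models can be confidently wrong — but it's one concrete number you could surface and learn to calibrate against.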
I've had very good results from running the code and pasting the errors back into ChatGPT and asking it what to do. Sometimes it corrects itself quite well.
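That run-paste-errors loop is easy to semi-automate. A minimal sketch of the "run the code, capture the error text" half; the chat call itself is left as a hypothetical placeholder for whatever API or window you use:

```python
import subprocess
import sys

def run_and_capture(path, timeout=30):
    """Run a Python script and return (ok, combined output).

    The combined stdout/stderr text is exactly what you'd paste back
    into the chat when asking for a fix.
    """
    result = subprocess.run(
        [sys.executable, path],
        capture_output=True, text=True, timeout=timeout,
    )
    ok = result.returncode == 0
    return ok, result.stdout + result.stderr

# Hypothetical loop — ask_model_for_fix is your chat call, not a real API:
#
# for attempt in range(3):
#     ok, output = run_and_capture("generated.py")
#     if ok:
#         break
#     code = ask_model_for_fix(code, output)
```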
That has been my experience with Zig. It led me to the conclusion that there are just too many 'non-indexed' developer tools in use these days, so there isn't much training data for newer topics. But it was happy to hallucinate APIs, along with proof of their existence.
Yeah, I find it to be wrong a lot when coding.
But it's faster for me to fix existing code than to write code from scratch, so it's still better than nothing for me.