
I threw a challenging rendering problem at it and I was pretty impressed with the overall structure and implementation. But as I looked deeper, the flaws became apparent. It simply made up APIs that didn’t exist, and when prompted to fix it, couldn’t figure it out.

Still, despite being fundamentally wrong it did send me down some different paths.



Using APIs that don't exist is the biggest problem I've seen with ChatGPT, and it seems GPT-4 as well.


I asked ChatGPT about the API for an old programming game called ChipWits. It invented a whole programming language that it called chiptalk, an amalgam of the original ChipWits stuff with some bits missing and others added, and generated a parser for it. I implemented the parser and got it to work before figuring out how much was imaginary, after talking to the original ChipWits devs. They found it pretty amusing.


> and got to work

Can you elaborate?


I'm fast learning Django and even though it's an extremely well-documented space, ChatGPT has sent me down the wrong path more than a handful of times.

This is especially difficult because I don't know when it's wrong and it's so damn confident. I've gotten better at questioning its correctness when the code doesn't perform as expected, but initially it cost me upwards of 30 minutes each time.

Still, I would say between ChatGPT and Copilot - I'm WAY further ahead.


ChatGPT or GPT-4?

Public Copilot uses GPT-3.5, as does non-premium ChatGPT.


My biggest problem with it is that it doesn't seem to understand its own knowledge. If you go back and forth with it on a coding problem for a while, it will often suddenly start using syntax that doesn't exist, even though by that point it should already know for sure that this syntax can't exist, because it has responded correctly many times. In human terms, it has read the documentation and must know that this syntax can't possibly exist, and yet it doesn't know that 10 seconds later. That's currently what makes it seem like not a real intelligence to me.


One of the advantages of Bing, and I guess now ChatGPT with the browsing plugin, is that it's able to search the web for the right API.


To be fair, using APIs that I think should exist is how I develop most of my APIs.


Except that I wasn't asking it to develop a new API.


It's very likely it was using other languages as "inspiration" given there's very little Zig code out there... so it's maybe natural it would use APIs that don't yet exist... perhaps informing it that it also needs to implement those APIs could work?


Then I guess you're not using it to its fullest potential ;)


We can’t keep blaming the prompter.


A simple confidence metric could do the trick. As the model grows larger, it is getting more difficult to understand what is going on, but that doesn't mean that it needs to be a total black box. At least let it expose some proxy metrics. In due course, we will learn to interpret those metrics and adjust our internal trust model.


You can just ask it to give you its confidence in the output on a scale of 0 to 1.


I wonder if a plugin to let it query API docs would solve this problem.


Also it makes up Python libraries, macOS apps to do certain tasks, etc.
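
For the Python case at least, one cheap sanity check before trusting a suggestion is to ask the interpreter whether the name exists at all. A minimal sketch using only the standard library; `api_exists` is a name made up here for illustration, and note that it imports the module in order to inspect it:

```python
import importlib
import importlib.util


def api_exists(module_name: str, attr_path: str = "") -> bool:
    """Return True if the module can be imported and the dotted attribute path resolves."""
    try:
        # find_spec returns None for a missing top-level module and raises
        # ModuleNotFoundError (an ImportError) when a parent package is missing.
        if importlib.util.find_spec(module_name) is None:
            return False
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        return False

    # Walk the dotted path one attribute at a time, e.g. "path.join" on os.
    obj = module
    for part in attr_path.split(".") if attr_path else []:
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(api_exists("json", "dumps"))        # real function
print(api_exists("json", "dump_pretty"))  # plausible-sounding, but not in the json module
```

This only tells you that a name exists, not that it does what the model claims, so it catches the invented-library case and nothing more.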


I’ve had very good results from running the code and pasting the errors back into ChatGPT and asking it what to do. Sometimes it corrects itself quite well.
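
A rough sketch of that run-and-paste-back cycle, for anyone who wants to automate it. `ask_model` is a hypothetical stand-in for whatever sends a prompt to the model and returns revised code; it is not a real API, and the prompt wording is just an assumption. Everything else is standard library:

```python
import subprocess
import sys
from typing import Callable, Optional


def run_snippet(code: str, timeout: float = 10.0) -> Optional[str]:
    """Run code in a fresh interpreter; return error text on failure, None on success."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"timed out after {timeout} seconds"
    return None if proc.returncode == 0 else proc.stderr


def fix_until_it_runs(
    code: str,
    ask_model: Callable[[str], str],  # hypothetical: prompt in, revised code out
    max_rounds: int = 3,
) -> Optional[str]:
    """Feed errors back to the model up to max_rounds times; None if it never runs."""
    for round_number in range(max_rounds + 1):
        error = run_snippet(code)
        if error is None:
            return code
        if round_number == max_rounds:
            break
        prompt = f"This code:\n{code}\n\nfails with:\n{error}\n\nPlease fix it."
        code = ask_model(prompt)
    return None
```

Two caveats: this executes model-written code on your machine, so it belongs in a sandbox, and "exits without an error" is a much weaker bar than "correct".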


Put that in a loop and see if AGI emerges.


>It simply made up APIs that didn’t exist

That has been my experience with Zig. It led me to the conclusion that there are just too many 'non indexed' developer tools in use these days, so there isn't much training data for new topics. But it was happy to hallucinate APIs and proof of their existence.


Yeah, I find it to be wrong a lot when coding. But it's faster for me to fix existing code than to write code from scratch, so it's still better than nothing for me.


Same. It seems similar to Copilot in that regard, but better at text-to-code, porting between languages or frameworks, and generating test cases and readmes: https://notes.osteele.com/gpt-experiments/using-chatgpt-to-p...



