My bet (previously discussed by others, and here) is that they have cascades/steps of models. There's probably a 'simple' model that looks at your query first and detects whether it could result in a problematic (racist, sexist, etc.) GPT answer, returning some boilerplate text instead of sending the query to GPT. That saves a lot of compute and time. If I were them I'd focus more on those auxiliary models that hold the main GPT model's hand; there's probably more low-hanging fruit there. This would also explain why they didn't announce GPT-4 details; my bet is that the model itself isn't very impressive, and you're just getting the illusion that it got better from these additional 'simpler' models.
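To make the cascade idea concrete, here's a rough sketch of what I mean. Everything in it is hypothetical: the keyword check stands in for whatever small classifier they might use, and `cheap_filter`, `answer`, `FLAGGED_TERMS`, and `BOILERPLATE` are names I made up. I have no idea what they actually run.

```python
from typing import Callable

# Hypothetical canned refusal returned without touching the big model.
BOILERPLATE = "I'm sorry, but I can't help with that request."

# Hypothetical placeholder terms; a real system would presumably use a
# small trained classifier rather than a keyword list.
FLAGGED_TERMS = {"slur_a", "slur_b"}


def cheap_filter(query: str) -> bool:
    """Return True if the query looks problematic (toy keyword check)."""
    words = set(query.lower().split())
    return bool(words & FLAGGED_TERMS)


def answer(query: str, main_model: Callable[[str], str]) -> str:
    """Run the cheap filter first; only call the expensive model if it passes."""
    if cheap_filter(query):
        return BOILERPLATE
    return main_model(query)
```

The point is just the ordering: the expensive call only happens when the cheap check passes, so every refused query costs almost nothing.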
> His point is that the raw model that became GPT-4 would do literally anything you asked it to.
It's unfortunate that people would abuse that and we can't have the raw model just for personal use. The storytelling and characters alone would be worth it. The safety guards tend to seep into fictional scenarios, making them more bland and preachy.
I have been writing prompts for a GPT-based document 'digester' for business-internal people who can't code but do have the right background knowledge. Every day I have to expand the prompt because I've found a new spot where I have to hold the thing's hand so it does the right thing :)