
I gave it a photo of Tiananmen Square in the present day and prompted it with:

> where might this photo have been taken? what historical significance & does this location have?

And got back a normal response describing the image, until it got to this:

> One of the most significant events that comes to mind is the Tian

Where it then errored out before finishing…



How does this even work?

Is Hugging Face hosting just the weights, or some custom code too?

If it's just weights, then I don't see how it could error out; it's just math. Do these Chinese models have extra code checking the output for anti-totalitarian content? Can it be turned off?


At the least, that shows the censorship mechanism operates at the token-sampling level and not post-generation.


On the contrary, it shows that the censorship mechanism is post-generation and stops it once it deems the output accumulated so far "improper". It just runs after every token.
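
To make that concrete, here is a rough sketch of what a filter that "runs after every token" could look like. Everything in it is made up for illustration (`blocked`, `stream_with_filter`, `ModerationError`, the banned phrase); nothing here is known about how the hosted demo is actually implemented. The point is only that a check on the accumulated text can sit entirely outside the weights and still cut a streamed reply off mid-sentence.

```python
from typing import Callable, Iterable, Iterator, Tuple


class ModerationError(RuntimeError):
    """Raised when the accumulated output trips the (hypothetical) filter."""


def blocked(text: str, banned: Tuple[str, ...] = ("forbidden phrase",)) -> bool:
    # Hypothetical check: a plain substring match on the text generated so far.
    lowered = text.lower()
    return any(term in lowered for term in banned)


def stream_with_filter(
    tokens: Iterable[str],
    is_blocked: Callable[[str], bool] = blocked,
) -> Iterator[str]:
    # The model itself is untouched; the check wraps the token stream.
    so_far = ""
    for tok in tokens:
        so_far += tok
        # Re-check the whole accumulated output after every new token.
        if is_blocked(so_far):
            raise ModerationError("output stopped by filter")
        yield tok


def demo() -> str:
    # Stand-in for a model's token stream.
    tokens = ["This ", "is ", "a ", "forbidden ", "phrase ", "here."]
    shown = []
    try:
        for tok in stream_with_filter(tokens):
            shown.append(tok)
    except ModerationError:
        pass  # the user sees a partial reply followed by an error
    return "".join(shown)
```

With this setup the user sees everything up to the token that completes the banned phrase, then an error, which matches the cut-off reply described above. A sampling-level mechanism would instead suppress the offending tokens before they are chosen, so generation would continue with different wording rather than abort.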



