Not yet, but these are great suggestions. It's always a dilemma whether to add features that compensate for a weak model or to just build a better model. Most of the problems go away with a better language and colorization model, and many model-specific features end up being wasted effort.
I think there are two uses for an AI colorizer. One is to generate a color image that looks great, another is to generate an image that accurately reflects the true color of things.
A better AI model helps a lot with the first goal, but helps only so far with the second one. Truth be told, there is a lot of contextual color information in black and white photos that an AI model can exploit; but nothing beats someone who knows, for sure, the color of the dress of someone in the photo.
I mean, take a look at https://www.reddit.com/r/ColorizedHistory/ - some of those color artists do a lot of research to know the exact shade of green of the military uniform of some country in the 19th century, and things like that, just to have an accurate reference.
So I think that the ability to direct the color output (either by rejecting a color textually, or by painting over the figure as a starting point, even if I'm not painting with the exact tone or texture but just a rough color that should help the AI figure out the details) is essential for a colorization product, even if the model is flawless!
My concern here isn't that the professional photo has higher quality (in this case it does, but give it some time - months or years - and maybe technology will catch up). It's that sometimes we already know the right color, while the AI must always guess.
So you can kind of do that: the first step creates a description, and the second step colorizes using the description.
So you can modify the text, with varying degrees of success, and specify 'orange dress' within the text.
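To make the "edit the description between the two steps" idea concrete, here is a minimal sketch in Python. It only covers the text-editing part; the function name `set_object_color`, the `COLOR_WORDS` list, and the example caption are all assumptions made for illustration, not part of any actual colorizer API. Whether the colorization step honors the edited text is up to the model.

```python
import re

# Hypothetical list of color words the description step might emit.
COLOR_WORDS = (
    "red", "orange", "yellow", "green", "blue", "purple",
    "pink", "brown", "black", "white", "gray",
)


def set_object_color(description: str, obj: str, color: str) -> str:
    """Return the description with `obj` described as `color`.

    If a known color word sits directly before `obj`, it is replaced;
    otherwise the color is inserted in front of `obj`.
    Only the first mention of `obj` is changed.
    """
    pattern = re.compile(
        r"\b(?:(?:" + "|".join(COLOR_WORDS) + r")\s+)?" + re.escape(obj) + r"\b",
        re.IGNORECASE,
    )
    return pattern.sub(f"{color} {obj}", description, count=1)


# Step 1 (assumed): the description model produced this caption.
caption = "woman in blue dress, standing in a garden"

# Manual correction before step 2 (the colorization pass).
edited = set_object_color(caption, "dress", "orange")
print(edited)
```

The edited string would then be passed to the second step in place of the generated description.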
What about a conversation, like, "the dress isn't blue, it should be orange", applied on top of the previous prompt?
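One rough way to picture the conversational version: keep the original description and stack each correction on top of it, then rebuild the prompt every turn. This is only a sketch under assumptions; `ColorizeSession`, its methods, and the "Corrections:" prompt layout are made up here, and how well a given model follows such corrections is an open question.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColorizeSession:
    """Hypothetical helper that accumulates user corrections over a base description."""

    description: str
    corrections: List[str] = field(default_factory=list)

    def say(self, message: str) -> str:
        """Record one correction and return the prompt for the next colorization pass."""
        self.corrections.append(message.strip())
        return self.prompt()

    def prompt(self) -> str:
        """Base description followed by all corrections, oldest first."""
        if not self.corrections:
            return self.description
        return f"{self.description}. Corrections: " + "; ".join(self.corrections)


session = ColorizeSession("woman in blue dress, standing in a garden")
next_prompt = session.say("the dress isn't blue, it should be orange")
print(next_prompt)
```

Each call to `say` returns the full prompt, so the colorization step can simply be re-run with it after every message.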