We’ve all seen it done countless times in movies and TV (especially if you’re a fan of crime procedurals like CSI): Some leading man/woman hands a techie a some grainy photo apparently taken from a city block away… during a sandstorm… while the photographer had a seizure… resulting in an image that is just a mess of blurry pixels. But our hero spots something small in one corner, and urgently asks, “Can you zoom and enhance that?”. This usually prompts the techie to start fervently mashing his keyboard before slamming on the Enter key, and seeing the image zoom in and then resolve from a blocky smudge to perfectly clear image of whatever the hero/heroine was looking for.
Of course, all of that is nothing more than prime Grade-A bullcrap. Much like how no “hacker” is ever seen using a mouse, or you never actually see a terminal prompt where they are typing, or elaborate “downloading” screens whenever something just needs to be copied, it’s a technobabble garbage trope popularized by Hollywood screenwriters who either don’t know how these things work, or just couldn’t be bothered by factual accuracy. But thanks to some imaging tech from Google, it now turns out these writers may just have been right all along*.
Google’s long-term deep learning project Google Brain has reportedly developed a new image enhancement AI software that does exactly that: enhancing the clarity of a low-res picture by using what are termed “hallucinations” to fill in the missing detail. This is the official description of what this new software does taken from the originally published paper:
We present a pixel recursive super resolution model that synthesizes realistic details into images while enhancing their resolution. A low resolution image may correspond to multiple plausible high-resolution images, thus modeling the super resolution process with a pixel independent conditional model often results in averaging different details– hence blurry edges. By contrast, our model is able to represent a multimodal conditional distribution by properly modeling the statistical dependencies among the high-resolution image pixels, conditioned on a low resolution input. We employ a PixelCNN architecture to define a strong prior over natural images and jointly optimize this prior with a deep conditioning convolutional network. Human evaluations indicate that samples from our proposed model look more photo realistic than a strong L2 regression baseline.
If anybody went cross-eyed reading that, don’t worry, here’s an easier explanation. With pictures! In the image below you will see three columns of pics. The rightmost column contains the original picture, while the leftmost column contains an 8×8 low-res version of that same pic. The centre column is the result of what the new Google Brain “super resolution” software tech was able come up with.
As you can see, it’s not a perfect duplicate, and certainly a far way off from the enhanced ultra clear 4K rubbish images Hollywood always comes up with, but the closeness is startling. Those differences that do creep in, are due to the fact that the AI algorithm that generates these pics are making nothing more than educated guesses. For lack of a better term, it’s imagining what data it thinks is supposed to be there, hence the term “hallucinations” used to describe this process. The algorithm basically uses two “networks” that have been trained using a dataset of celebrity faces. The “conditioning network” first maps out the low-res 8×8 image to onto a similar looking higher resolution image. The “prior network” then finds and uses other existing images with what it determines are similar pixel maps, and uses them to extrapolate what the original image could potentially look like.
This highly complex guesswork does raise concerns about this tech being used by law enforcement officials, per se, as a means of identifying a person, as it can, and does, make mistakes. It might still be able to assist though, as its results are still far more recognizable than the original pixelated source. In fact, in real world studies conducted with 40 test subjects, Google Brain received really positive results in being able to fool their subjects into not know what was an image enhanced by their super resolution AI technique and what was actually a photo taken by a camera.
It may still be in its early days, but it’s all very impressive stuff. Now I just hope that the folks over on NBC doesn’t get wind of this, or we’re never going to stop seeing CSI technicians pull off imaging miracles every week.
*At least about the image enhancing tech. Everything else is still utter rubbish. For now.