|Lead image: Image generated from the following text, ‘A peaceful lake surrounded by tall trees in a foggy day.’|
NVIDIA has announced the latest version of NVIDIA Research’s AI painting demo, GauGAN2. The model is powered by deep learning and now includes a text-to-image feature. Whereas the original version could only turn a rough sketch into a detailed image, GauGAN2 can generate images from phrases like ‘sunset at a beach,’ which can then be further modified with adjectives like ‘rocky beach,’ or by changing ‘sunset’ to a different time of day or even modifying weather conditions. GauGAN is powered by generative adversarial networks (GANs), which you can learn more about in this NVIDIA article.
Back to GauGAN2, NVIDIA writes, ‘With the press of a button, users can generate a segmentation map, a high-level outline that shows the location of objects in the scene. From there, they can switch to drawing, tweaking the scene with rough sketches using labels like sky, tree, rock and river, allowing the smart paintbrush to incorporate these doodles into stunning images.’
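To make the idea of a segmentation map more concrete, here is a minimal Python sketch. It is not GauGAN2's code or its actual data format, which NVIDIA's description does not specify. It only assumes that a segmentation map can be modeled as a grid in which every cell holds a label. The label names come from NVIDIA's quote, while the numeric IDs, the grid size and the helper functions `blank_map`, `paint_rect` and `label_coverage` are hypothetical and exist purely for illustration.

```python
# Illustrative only: a segmentation map modeled as a grid of label IDs.
# The label names come from the article; the numeric IDs are made up.
LABELS = {0: "sky", 1: "tree", 2: "rock", 3: "river"}


def blank_map(width, height, label_id=0):
    """Return a height x width grid filled with a single label."""
    return [[label_id] * width for _ in range(height)]


def paint_rect(seg_map, label_id, top, left, bottom, right):
    """Paint a rectangle, a crude stand-in for a brush stroke.

    The bottom and right bounds are exclusive, and the rectangle is
    clipped to the edges of the map.
    """
    for row in range(max(top, 0), min(bottom, len(seg_map))):
        for col in range(max(left, 0), min(right, len(seg_map[row]))):
            seg_map[row][col] = label_id


def label_coverage(seg_map):
    """Count how many cells carry each label name."""
    counts = {}
    for row in seg_map:
        for label_id in row:
            name = LABELS[label_id]
            counts[name] = counts.get(name, 0) + 1
    return counts


seg_map = blank_map(8, 6)            # start with sky everywhere
paint_rect(seg_map, 3, 4, 0, 6, 8)   # a river across the bottom two rows
paint_rect(seg_map, 1, 2, 0, 4, 2)   # trees on the left bank
paint_rect(seg_map, 2, 3, 5, 4, 8)   # a strip of rock on the right

coverage = label_coverage(seg_map)
for name, cells in sorted(coverage.items()):
    print(f"{name}: {cells} cells")
```

In a tool like GauGAN2, a map of this kind would be what the image generator reads to decide what belongs where. The sketch stops at building the label grid and does not attempt any image generation.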
You can try GauGAN2 for yourself on NVIDIA AI Demos. You can also see it in action in the video below.
By adding text-to-image capabilities, the new version of GauGAN is more customizable and can be tuned much more quickly. Even a quick sketch is not nearly as fast and simple as typing a phrase. The latest version is also one of the first AI models to incorporate multiple modalities (text, semantic segmentation, sketch, and style) within a single GAN.
Your text-based starting point, such as a ‘snow-capped mountain range,’ can be further customized with sketching. You can add trees, change the height and size of objects, add clouds to the sky, and much more. GauGAN2 then generates a new, modified image.
|‘Endless tall mountains in a sunny day’|
You don’t need to keep your ideas based in reality, either. GauGAN2 may prove useful for concept artists, as you can create worlds with two suns, like Tatooine in Star Wars. NVIDIA writes, ‘It’s an iterative process, where every word the user types into the text box adds more to the AI-created image.’
GauGAN continues to improve its results. When we looked at it in early 2019, the results were impressive, but there were visible limitations. Earlier this year, NVIDIA released NVIDIA Canvas, a tool built upon GauGAN that can be used on any NVIDIA RTX GPU. At this point, GauGAN2 has been trained on 10 million landscape images using the NVIDIA Selene supercomputer, which is among the world’s 10 most powerful supercomputers.
To learn more about NVIDIA Research and its projects, click here. There are a lot of exciting AI-powered projects in the works and it’s impressive to see how far GauGAN has come in just a few years.