
What! An AI Made All These Faces

Let's talk about the glorious tale of AI-based human face generation and showcase an absolutely unbelievable new paper in this area. You may be surprised, but this kind of face generation is not recent at all. This is four-year-old news! Insanity. Later, researchers turned this whole problem around and performed something that was previously thought to be impossible: they started using these networks to generate photorealistic images from a written text description. We could create new bird species by specifying that the bird should have orange legs and a short yellow bill. Then, researchers at NVIDIA recognized and addressed two shortcomings: one, the images were not that detailed, and two, even though we could input text, we couldn't exert much artistic control over the results. In came StyleGAN to the rescue, which was able to perform both of these difficult tasks really well. However, some features remain highly localized as we exert control over these images: you can see how this part of the teeth and eyes were pinned to a particular location, and the algorithm just refuses to let them go, sometimes to the detriment of their surroundings.

A followup work titled StyleGAN2 addressed all of these problems in one go. StyleGAN2 was able to perform near-perfect synthesis of human faces, and remember, none of the people that you see here really exist. Quite remarkable. So, how can we improve on this magnificent technique? Well, this new work can do so many things, I don't even know where to start. First, and most importantly, we now have much, much more intuitive artistic control over the output images. We can add or remove a beard, make the subject younger or older, change their hairstyle, make their hairline recede, put a smile on their face, or even make their nose pointier. Absolute witchcraft. So why can we do all this with this new method? The key idea is that it is not using a Generative Adversarial Network, a GAN for short. A GAN consists of two competing neural networks, where one is trained to generate new images, and the other one is used to tell whether the generated images are real or fake. GANs dominated this field for a long while because of their powerful generation capabilities, but, on the other hand, they are quite difficult to train, and we have only limited control over their output.
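To make the two-network idea a little more tangible, here is a minimal GAN training step written in PyTorch. This is a generic sketch, not StyleGAN or the new method; the network sizes, the flattened 28x28 images, and all names are purely illustrative.

```python
# A minimal sketch of the two competing networks described above (PyTorch).
import torch
import torch.nn as nn

latent_dim = 64

# Generator: maps a random latent vector to a (flattened) fake image.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 28 * 28), nn.Tanh(),
)

# Discriminator: predicts whether a (flattened) image is real or fake.
discriminator = nn.Sequential(
    nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_images):
    """One adversarial update: real_images is a (batch, 28*28) tensor."""
    batch = real_images.size(0)
    z = torch.randn(batch, latent_dim)
    fake_images = generator(z)

    # Discriminator: label real images 1, generated images 0.
    opt_d.zero_grad()
    d_loss = bce(discriminator(real_images), torch.ones(batch, 1)) + \
             bce(discriminator(fake_images.detach()), torch.zeros(batch, 1))
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator call its images real.
    opt_g.zero_grad()
    g_loss = bce(discriminator(fake_images), torch.ones(batch, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```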

Among other changes, this work disassembles the generator network into F and G, and the discriminator network into E and D, or in other words, adds an encoder and a decoder network. Why? The key idea is that the encoder compresses the image data down into a representation that we can edit more easily. This is the land of beards and smiles; in other words, all of these intuitive features that we can edit live here, and when we are done, we can decompress the output with the decoder network and produce these beautiful images. This is already incredible, but what else can we do with this new architecture? A lot more. For instance, two, if we take a source and a destination subject, their coarse, middle, or fine styles can also be mixed. What does that mean exactly? The coarse part means that high-level attributes, like pose, hairstyle, and face shape, will resemble the source subject; in other words, the child will remain a child and inherit some of the properties of the destination subject. However, as we transition to the "fine from source" part, the effect of the destination subject will be stronger, and the source will only be used to change the color scheme and microstructure of the image. Interestingly, it also changes the background of the subject.
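To make the encoder-decoder idea a bit more concrete, here is a minimal sketch of how editing and style mixing could work once an image has been compressed into that editable representation. This is not the authors' code: the interfaces of E and G, the number of per-layer styles, and the smile_direction vector are all illustrative assumptions.

```python
# Sketch of latent editing and style mixing with an assumed encoder E
# (image -> latent code w) and decoder/generator G, which is assumed here
# to take one style code per layer, StyleGAN-style.
NUM_LAYERS = 14  # illustrative number of per-layer style inputs

def edit_attribute(E, G, image, smile_direction, strength=1.5):
    """Encode, move the code along a (hypothetical) learned attribute direction, decode."""
    w = E(image)                               # compress into the editable space
    w_edited = w + strength * smile_direction  # e.g. add (or, with negative strength, remove) a smile
    return G([w_edited] * NUM_LAYERS)          # feed the same edited code to every layer

def mix_styles(E, G, source_img, destination_img, coarse_layers=4):
    """Coarse layers (pose, face shape) follow the source; the rest follow the destination."""
    w_src, w_dst = E(source_img), E(destination_img)
    per_layer = [w_src] * coarse_layers + [w_dst] * (NUM_LAYERS - coarse_layers)
    return G(per_layer)
```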

Three, it can also perform image interpolation. This means that we have these four images as starting points, and it can compute intermediate images between them. You see here that as we slowly become Bill Gates, somewhere along the way, glasses appear. Now note that interpolating between images is not difficult in the slightest and has been possible for a long, long time: all we need to do is compute an average of these images. So what makes a good interpolation process? Well, we are talking about a good interpolation when each of the intermediate images makes sense and can stand on its own. I think this technique does amazingly well at that. I'll stop the process at different places so you can see for yourself, and let me know in the comments if you agree or not. I also kindly thank the authors for creating more footage just for us to showcase in this series. That is a huge honor, thank you so much! And note that StyleGAN2 appeared around December of 2019, and now, this paper by the name "Adversarial Latent Autoencoders" appeared only four months later. Four months later. My goodness! This is so much progress in so little time, it truly makes my head spin. What a time to be alive!
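For the curious, here is a small sketch of why interpolating in that latent space differs from plain pixel averaging. Again, E and G stand for the assumed encoder and decoder from the sketch above, and the per-layer interface is an illustrative assumption.

```python
# Pixel-space averaging vs. latent-space interpolation (sketch, assumed E and G).
NUM_LAYERS = 14  # illustrative number of per-layer style inputs

def pixel_interpolate(img_a, img_b, alpha):
    # Naive averaging: intermediate images look like ghostly double exposures.
    return (1 - alpha) * img_a + alpha * img_b

def latent_interpolate(E, G, img_a, img_b, alpha):
    # Blend the latent codes and decode: every intermediate point is
    # decoded into a plausible face that can stand on its own.
    w = (1 - alpha) * E(img_a) + alpha * E(img_b)
    return G([w] * NUM_LAYERS)
```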
