
What! An AI Made All These Faces

Let's talk about the glorious tale of AI-based human face generation and showcase an absolutely unbelievable new paper in this area. You may be surprised, but AI-generated faces are not recent at all. This is four-year-old news! Insanity. Later, researchers turned this whole problem around and performed something that was previously thought to be impossible: they started using these networks to generate photorealistic images from a written text description. We could create new bird species by specifying that they should have orange legs and a short yellow bill. Then, researchers at NVIDIA recognized and addressed two shortcomings: one, the images were not that detailed, and two, even though we could input text, we couldn't exert much artistic control over the results. In came StyleGAN to the rescue, which was able to perform both of these difficult tasks really well. However, some features remained highly localized: as we exert control over these images, you can see how parts of the teeth and eyes were pinned to a particular location, and the algorithm just refuses to let them go, sometimes to the detriment of their surroundings.

A follow-up work titled StyleGAN2 addressed all of these problems in one go. StyleGAN2 was able to perform near-perfect synthesis of human faces, and remember, none of the people that you see here really exist. Quite remarkable. So, how can we improve this magnificent technique? Well, this new work can do so many things, I don't even know where to start. First, and most importantly, we now have much, much more intuitive artistic control over the output images. We can add or remove a beard, make the subject younger or older, change their hairstyle, make their hairline recede, put a smile on their face, or even make their nose pointier. Absolute witchcraft. So why can we do all this with this new method? The key idea is that it is not using a Generative Adversarial Network, or GAN, in short. A GAN means two competing neural networks, where one is trained to generate new images, and the other one is used to tell whether the generated images are real or fake. GANs dominated this field for a long while because of their powerful generation capabilities, but, on the other hand, they are quite difficult to train and we have only limited control over their output.
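The competition between the two networks can be made concrete with the classic GAN losses. Here is a minimal sketch, using toy numbers rather than real networks: `d_real` and `d_fake` stand in for the discriminator's probability that real and generated images are real, and the loss functions show what each side is trying to minimize.

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # The discriminator wants d_real -> 1 and d_fake -> 0:
    # it maximizes log(D(x)) + log(1 - D(G(z))), i.e. minimizes the negative.
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def generator_loss(d_fake):
    # Non-saturating generator loss: the generator tries to make
    # its fakes look real, i.e. push D(G(z)) toward 1.
    return -np.mean(np.log(d_fake))

# Toy scores: the discriminator is fairly confident on real images
# and mostly catches the fakes.
d_real = np.array([0.9, 0.8, 0.95])
d_fake = np.array([0.1, 0.3, 0.2])

print(discriminator_loss(d_real, d_fake))  # low: D is currently winning
print(generator_loss(d_fake))              # high: G still has work to do
```

Training alternates between the two losses, which is exactly the tug-of-war that makes GANs powerful but also notoriously unstable to train.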

Among other changes, this work disassembles the generator network into F and G, and the discriminator network into E and D; in other words, it adds an encoder and a decoder network. Why? The key idea is that the encoder compresses the image data down into a representation that we can edit more easily. This is the land of beards and smiles; in other words, all of these intuitive features that we can edit live here, and when we are done, we can decompress the output with the decoder network and produce these beautiful images. This is already incredible, but what else can we do with this new architecture? A lot more. For instance, two: if we add a source and a destination subject, their coarse, middle, or fine styles can also be mixed. What does that mean exactly? The coarse part means that high-level attributes, like pose, hairstyle, and face shape, will resemble the source subject; in other words, the child will remain a child and inherit some of the properties of the destination subject. However, as we transition to the "fine from source" part, the effect of the destination subject will be stronger, and the source will only be used to change the color scheme and microstructure of the image. Interestingly, it also changes the background behind the subject. Three: it can also perform image interpolation. This means that we have these four images as starting points, and it can compute intermediate images between them.
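The encode-edit-decode idea can be sketched in a few lines. Everything below is a toy stand-in, not the paper's architecture: the encoder and decoder are just linear maps, and `smile_direction` is a hypothetical semantic direction in the latent space, assumed to have been found beforehand.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim, image_dim = 4, 16
E = rng.normal(size=(latent_dim, image_dim))  # toy encoder: image -> latent
G = np.linalg.pinv(E)                         # toy decoder: latent -> image

# A hypothetical "smile" direction in the editable latent space.
smile_direction = np.array([1.0, 0.0, 0.0, 0.0])

def edit(image, direction, strength):
    w = E @ image                  # 1. compress into the editable representation
    w_edited = w + strength * direction  # 2. move along the semantic direction
    return G @ w_edited            # 3. decompress back into image space

image = rng.normal(size=image_dim)
edited = edit(image, smile_direction, strength=3.0)
```

The point of the sketch: adding or removing a beard, a smile, or a few years of age all reduce to the same operation, a vector addition in the latent space, which is why this representation gives such intuitive control.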

You see here that as we slowly become Bill Gates, somewhere along the way, glasses appear. Now note that interpolating between images is not difficult in the slightest and has been possible for a long, long time; all we need to do is compute the average of these images. So what makes a good interpolation process? Well, we are talking about a good interpolation when each of the intermediate images makes sense and can stand on its own. I think this technique does amazingly well at that. I'll stop the process at different places so you can see for yourself, and let me know in the comments if you agree or not. I also kindly thank the authors for creating more footage just for us to showcase in this series. That is a huge honor, thank you so much! And note that StyleGAN2 appeared around December of 2019, and this paper, by the name "Adversarial Latent Autoencoders", appeared only four months later. Four months later. My goodness! This is so much progress in so little time that it truly makes my head spin. What a time to be alive!
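The difference between trivial and good interpolation can be sketched directly. Averaging pixels produces ghostly double exposures; latent-space methods instead interpolate the codes and decode each one, so every frame is a plausible image on its own. The `decode` function below is a hypothetical placeholder for a real generator network, purely for illustration.

```python
import numpy as np

def lerp(a, b, t):
    # Linear interpolation: t=0 gives a, t=1 gives b.
    return (1.0 - t) * a + t * b

# Naive pixel-space interpolation: intermediates are blurry overlays.
img_a = np.zeros((2, 2))   # toy "image" of black pixels
img_b = np.ones((2, 2))    # toy "image" of white pixels
halfway_pixels = lerp(img_a, img_b, 0.5)  # every pixel is a gray 0.5

# Latent-space interpolation (sketched): blend the codes, then decode
# each intermediate code so every frame is a valid output of the model.
def decode(w):
    return np.tanh(w)  # placeholder for a real decoder/generator

w_a, w_b = np.full(4, -2.0), np.full(4, 2.0)
frames = [decode(lerp(w_a, w_b, t)) for t in np.linspace(0.0, 1.0, 5)]
```

This is why glasses can suddenly "appear" mid-sequence: the path runs through the model's latent space, where nearby codes decode to semantically coherent faces, not through raw pixel averages.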
