Could this be the technological innovation that hairstylists have been dying for? I'm sure most of us have had a bad haircut or two. But hopefully, with this AI, you'll never have to guess what a new haircut will look like ever again.
This AI can transfer a new hairstyle and/or color onto a portrait so you can see how it would look before committing to the change. Learn more about it below!
►The full article:
►Peihao Zhu et al., (2021), Barbershop,
►Project link:
►Code:
This article is not about a new technology in itself. Instead, it is about a new and exciting application of GANs. Indeed, you saw the title, and it wasn't clickbait: this AI can transfer a new hairstyle onto your picture so you can see how it would look before committing to the change.
We all know that it can be hard to change your hairstyle, even if you'd like to. At least in my case, I've kept the same haircut for years, telling my hairdresser "same as last time" every 3 or 4 months even when I'd like a change. I just can't commit, afraid it would look weird and unusual. Of course, this is all in our heads, as we are usually the only ones who care about our haircut, but this tool could be a real game-changer for some of us, helping us decide whether or not to commit to such a change by giving us a good idea of how it will look on us.
Moments where you can see into the future before taking a chance are rare. Even if it's not totally accurate, it's still pretty cool to have such a good approximation of how something like a new haircut could look, relieving us of some of the stress of trying something new while keeping the exciting part.
Of course, haircuts are very superficial compared to more useful applications. Still, it is a step towards "seeing into the future" using AI, which is pretty cool. Indeed, this new technique sort of enables us to predict the future, even if it's just the future of our haircut. But before diving into how it works, I am curious to know what you think about this: in any field, what other applications would you like to see use AI to "see into the future"?
It can change not only the style of your hair but also its color, from multiple example images. You basically give the algorithm three things:
- a picture of yourself,
- a picture of someone with the hairstyle you would like to have,
- and another picture (or the same one) with the hair color you would like to try,
and it merges everything onto you realistically, as sketched below.
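To make those three inputs concrete, here is a minimal sketch of that interface in Python. The function name and signature are hypothetical stand-ins for the authors' pipeline, not their actual API:

```python
from PIL import Image

def barbershop_transfer(identity: Image.Image,
                        hairstyle_ref: Image.Image,
                        color_ref: Image.Image) -> Image.Image:
    """Return `identity` with the hair geometry of `hairstyle_ref`
    and the hair color/appearance of `color_ref` (hypothetical wrapper)."""
    raise NotImplementedError("placeholder for the full GAN pipeline")

# The color reference can simply be the same picture as the hairstyle one:
# result = barbershop_transfer(my_photo, ref_photo, ref_photo)
```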
The results are seriously impressive. If you do not trust my judgment, which I would completely understand given my artistic skill level, they also conducted a user study with 396 participants: their solution was preferred 95 percent of the time! Of course, you can find more details about this study in the references below if it seems too hard to believe.
As you may suspect, we are playing with faces here, so it uses a process very similar to past papers I covered, such as changing faces into cartoons or other styles, all using GANs. Since it is extremely similar, I'll let you watch my other videos where I explain how GANs work in depth, and I'll focus here on what is new with this method and why it works so well.
A GAN architecture can learn to transpose specific features or styles of an image onto another. The problem is that the results often look unrealistic because of differences in lighting, occlusions, or even simply the position of the head between the two pictures. All of these small details make this problem very challenging and cause artifacts in the generated image.
Here's a simple example to visualize this problem: if you take someone's hair from a picture shot in a dark room and try to put it on a photo of yourself taken outside in daylight, it will still look weird even if the hair is transposed perfectly onto your head.
Typically, these other GAN-based techniques try to encode the pictures' information, explicitly identify the region of the encoding associated with the hair attributes, and switch them.
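To picture what "switching" attributes in an encoding means, here is a toy example. The latent size and the dimensions assumed to control hair are made up for illustration; in practice, finding such disentangled dimensions is the hard part:

```python
import torch

z_me = torch.randn(512)   # latent code of our own picture
z_ref = torch.randn(512)  # latent code of the hairstyle picture

hair_dims = slice(100, 164)  # hypothetical hair-related dimensions

z_swapped = z_me.clone()
z_swapped[hair_dims] = z_ref[hair_dims]  # copy over the "hair" part only
```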
This works well when the two pictures are taken in similar conditions, but most of the time it won't look real, for the reasons I just mentioned. They then had to use another network to fix the relighting, holes, and other weird artifacts caused by the merging.
So the goal here was to transpose the hairstyle and color of a specific picture onto your own picture while adapting the result to the lighting and properties of your picture, making it convincing and realistic all at once and reducing the steps and sources of error. If this last paragraph was unclear, I strongly recommend watching the video at the end of this article, as it has more visual examples to help you understand.
To achieve that, Peihao Zhu et al. added a missing but essential alignment step to GANs. Indeed, instead of simply encoding the images and merging them, their method slightly alters the encodings, following a target segmentation mask, to make the latent codes of the two images more similar.
As I mentioned, they can edit both the structure and the style, or appearance, of the hair. Here, the structure is, of course, the geometry of the hair, telling us whether it's curly, wavy, or straight.
If you've seen my other videos, you already know that GANs encode information using convolutions. This means they use kernels to downscale the information at each layer, making it smaller and smaller, thus iteratively removing spatial details while giving more and more weight to general information in the resulting output. This structural information is taken, as always, from the early layers of the GAN, before the encoding becomes too general and, well, too encoded to represent spatial features.
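Here is a toy convolutional encoder, not the paper's network, just to show this downscaling at work: each stride-2 convolution halves the resolution, so early feature maps still localize details like hair strands while deeper ones only keep coarse, general information.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),    # 256 -> 128
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),   # 128 -> 64
    nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),  # 64 -> 32
)

x = torch.randn(1, 3, 256, 256)  # one fake 256x256 RGB image
for layer in encoder:
    x = layer(x)
    if isinstance(layer, nn.Conv2d):
        print(tuple(x.shape))  # spatial size shrinks at every conv layer
```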
Appearance refers to the deeply encoded information, including hair color, texture, and lighting. So now you know where the information is taken from in the different images, but how do they merge it and make it look more realistic than previous approaches?
This is done using segmentation maps of the images. More precisely, the desired new image is generated based on an aligned version of our target and reference images. The reference image is our own picture, and the target image shows the hairstyle we want to apply. These segmentation maps tell us what the image contains and where: hair, skin, eyes, nose, etc.
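In code, a segmentation map is just an integer label per pixel. Here is a toy example with a made-up label set:

```python
import numpy as np

LABELS = {0: "background", 1: "skin", 2: "hair", 3: "eyes", 4: "nose"}

seg = np.zeros((256, 256), dtype=np.uint8)  # everything starts as background
seg[30:110, 60:200] = 2                     # a rough hair region
seg[100:220, 80:180] = 1                    # a rough face/skin region

hair_mask = seg == 2                        # boolean mask of the hair pixels
print(f"hair covers {hair_mask.mean():.1%} of the image")
```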
Using this information from the different images, they can align the heads following the target image's structure before sending the images for encoding to a modified StyleGAN2-based architecture, one that I have already covered numerous times. This alignment makes the encoded information much easier to compare and reconstruct.
Then, for the appearance and illumination problem, they find an appropriate mixture ratio of the appearance encodings from the target and reference images for the same segmented regions, making the result look as real as possible.
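Here is a simplified sketch of that mixing idea. In the actual method, the ratio is optimized rather than fixed by hand, and the tensor shapes and names below are my own assumptions:

```python
import torch

def blend_appearance(app_ref, app_target, hair_mask, ratio):
    """Blend the target appearance into the reference appearance,
    but only inside the hair region (hair_mask: 1 = hair, 0 = elsewhere)."""
    mixed = ratio * app_target + (1.0 - ratio) * app_ref
    return hair_mask * mixed + (1.0 - hair_mask) * app_ref

app_ref = torch.randn(1, 512, 32, 32)     # appearance code of our picture
app_target = torch.randn(1, 512, 32, 32)  # appearance code of the hair picture
hair_mask = torch.zeros(1, 1, 32, 32)
hair_mask[..., :16, :] = 1.0              # pretend the top half is hair

blended = blend_appearance(app_ref, app_target, hair_mask, ratio=0.8)
```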
Here's what the results look like without the alignment, in the left column, versus their approach, on the right. Of course, this process is a bit more complicated than that, and all the details can be found in the paper linked in the references.
Note that, just like most GAN implementations, their architecture needed to be trained. Here, they used a StyleGAN2-based network trained on the FFHQ dataset. Then, since they made many modifications, as we just discussed, they trained their modified StyleGAN2 network a second time, using 198 pairs of images as hairstyle transfer examples, to optimize the model's choices for both the appearance mixture ratio and the structural encodings.
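Schematically, that second stage looks like any small fine-tuning loop. Everything below (the dummy generator, the random pairs, the plain reconstruction loss) is a stand-in for the paper's actual networks and objectives:

```python
import torch
import torch.nn as nn

# Stand-ins for the pretrained generator and the 198 transfer pairs.
generator = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
pairs = [(torch.randn(512), torch.randn(512)) for _ in range(198)]

optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)
for latent, target in pairs:  # one pass over the transfer examples
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(generator(latent), target)
    loss.backward()
    optimizer.step()
```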
Also, as you may expect, there are still some imperfections, like these cases where their approach fails to align the segmentation masks or to reconstruct the face. Still, the results are extremely impressive, and it is great that they openly share the limitations.
As they state in the paper, the source code for their method will be made public after the paper's eventual publication. The link to the official GitHub repo is in the references below; hopefully, the code will be released soon.
Thank you for watching!