Unsupervised Learning - Part 4


FAU LECTURE NOTES ON DEEP LEARNING

These are the lecture notes for FAU’s YouTube Lecture “Deep Learning”. This is a full transcript of the lecture video & matching slides. We hope you enjoy this as much as the videos. Of course, this transcript was created with deep learning techniques largely automatically and only minor manual modifications were performed. Try it yourself! If you spot mistakes, please let us know!


Navigation

Previous Lecture / Watch this Video / Top Level / Next Lecture


Need a cover for your new album? I GAN help you. Image created using gifify. Source: YouTube

Welcome back to deep learning! Today we want to talk about a couple of the more advanced GAN concepts, in particular, the conditional GANs and Cycle GANs.


Image under CC BY 4.0 from the Deep Learning Lecture.

So, let’s have a look at what I have here on my slides. It’s part four of our unsupervised deep learning lecture. First, we start with conditional GANs. One problem that we had so far is that the generator creates a fake, but generic, image. Unfortunately, it is not specific to a certain condition or characteristic. Let’s say you have text-to-image generation: then, of course, the image should depend on the text. So, you need to be able to model this dependency somehow. If you want to generate zeros, then you don’t want to generate ones. So, you need to put in some condition that determines whether you want to generate the digit 0, 1, 2, 3, and so on. This can be done by encoding the condition, as introduced in [15].
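As a small illustration of what such a condition can look like in practice, here is a minimal sketch (not from the lecture; all names and sizes are my own assumptions) that one-hot encodes the desired digit class and concatenates it with the noise vector that goes into the generator:

```python
import torch

def make_condition(digit: int, num_classes: int = 10) -> torch.Tensor:
    """One-hot encode the desired digit class as the conditioning vector y."""
    y = torch.zeros(num_classes)
    y[digit] = 1.0
    return y

# Illustrative sizes: a 100-dimensional noise vector z plus a 10-dimensional condition y.
z = torch.randn(100)                 # random latent code
y = make_condition(3)                # "please generate a 3"
generator_input = torch.cat([z, y])  # what a conditional generator would receive
print(generator_input.shape)         # torch.Size([110])
```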


Image under CC BY 4.0 from the Deep Learning Lecture.

The idea here is that you essentially split up the generator’s input: you keep the latent vector that carries the random observation, and in addition you have the condition, which is encoded here in the conditioning vector y. You concatenate the two and use them in order to generate something. The discriminator then gets the generated image, but it also gets access to the conditioning vector y. So, it knows what it is supposed to see for that condition as well as the specific output of the generator. Both of them receive the conditioning, and this again results in a two-player minimax game that can be described as a loss that depends on the discriminator. The extension here is that you additionally have the conditioning on y in the loss.
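To make the concatenation on both sides concrete, here is a hedged PyTorch-style sketch. The fully connected layers, the sizes (a 100-dimensional z, a 10-dimensional y, flattened 28×28 images), and all names are illustrative assumptions of mine, not the architecture from [15]:

```python
import torch
import torch.nn as nn

Z_DIM, Y_DIM, IMG_DIM = 100, 10, 28 * 28   # illustrative sizes

class ConditionalGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(Z_DIM + Y_DIM, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh(),
        )

    def forward(self, z, y):
        # The condition y is simply concatenated to the latent code z.
        return self.net(torch.cat([z, y], dim=1))

class ConditionalDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM + Y_DIM, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),
        )

    def forward(self, img, y):
        # The discriminator also sees the condition, so it judges
        # "is this a real image *given* this condition y".
        return self.net(torch.cat([img, y], dim=1))

G, D = ConditionalGenerator(), ConditionalDiscriminator()
z = torch.randn(4, Z_DIM)
y = torch.eye(Y_DIM)[torch.tensor([0, 1, 2, 3])]   # one-hot conditions for digits 0-3
fake = G(z, y)
score = D(fake, y)   # both players receive the conditioning
```

The value function stays the familiar minimax game; the only change is that both the real and the generated images are judged given y.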


Image under CC BY 4.0 from the Deep Learning Lecture.

So how does this thing work? You add a conditional feature like smiling, gender, age, or other properties of the image. Then, the generator and the discriminator learn to operate in those modes. This leads to the property that you are able to generate a face with a certain attribute. The discriminator learns that this is a face given that specific attribute. So, here, you see different examples of generated faces. The first row shows just random samples. The second row is conditioned on the attribute of old age. The third row is given the condition old age plus smiling, and here you see that the conditioning vector still produces similar images, but you can actually stack those conditions on top of each other.


A GAN conditioned for age. Image created using gifify. Source: YouTube

So, this then allows creating really very nice things like image-to-image translation. Below, you have several examples of inputs and outputs. You can essentially translate labels to street scenes, aerial images to maps, labels to facades, black & white to color, day to night, and edges to photo.


Image under CC BY 4.0 from the Deep Learning Lecture.

The idea here is that we use the label image again as a conditioning vector. This leads us to the observation that this is domain translation. It is simply a conditional GAN. The positive examples are given to the discriminator. The example below shows a handbag and its edges. The negative examples are then constructed by giving the edges of the handbag to the generator to create a handbag that fools the discriminator.
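To sketch how those positive and negative examples could be put together, here is a minimal, purely illustrative snippet; the tiny single-layer stand-ins for G and D (real pix2pix-style models use U-Net generators and patch-based discriminators) and all variable names are assumptions of mine:

```python
import torch
import torch.nn as nn

# Tiny stand-ins so the sketch runs; real generators and discriminators are far deeper.
G = nn.Sequential(nn.Conv2d(1, 3, kernel_size=3, padding=1), nn.Tanh())
D = nn.Sequential(nn.Conv2d(1 + 3, 1, kernel_size=3, padding=1), nn.Sigmoid())

edges    = torch.randn(4, 1, 256, 256)   # conditioning images (edge maps)
real_bag = torch.randn(4, 3, 256, 256)   # aligned real handbag photos

fake_bag = G(edges)                      # the generator translates edges -> handbag

# Positive examples: the discriminator sees (condition, real image) pairs.
d_real = D(torch.cat([edges, real_bag], dim=1))
# Negative examples: (condition, generated image) pairs that try to fool it.
d_fake = D(torch.cat([edges, fake_bag], dim=1))
```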


Image under CC BY 4.0 from the Deep Learning Lecture.

You can see that we are able to generate really complex images just by using conditional GANs. Now, a key problem here is, of course, that you need the two images to be aligned. So, your conditioning image, like the edge image here, has to exactly match the respective handbag image. If they don’t, you won’t be able to train this. So, for domain translation using conditional GANs, you need exact matches. In many cases, you don’t have access to exact matches. So, let’s say you have a scene that shows zebras. You will probably not find a paired data set that shows exactly the same scene, but with horses. So, you cannot just use it with a conditional GAN.


Image under CC BY 4.0 from the Deep Learning Lecture.

The key ingredient here is the so-called cycle consistency loss. So, you couple GANs with trainable inverse mappings. The key idea is that you have one conditional GAN G that takes x as the conditioning image and then generates some new output. If you take this new output and use it as the conditioning variable of F, it should produce x again. So, you use the conditioning variables to form a loop, and the key component here is that G and F should essentially be inverses of each other.
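A minimal sketch of this loop, assuming two image-to-image mapping networks G (from domain X to Y) and F (from Y back to X); the single convolution layers and the L1 distance are placeholders I chose so that the snippet runs, not the networks from the lecture:

```python
import torch
import torch.nn as nn

# Placeholder mapping networks; real Cycle GAN generators are much deeper.
G = nn.Conv2d(3, 3, kernel_size=3, padding=1)   # domain X -> Y (e.g. horse -> zebra)
F = nn.Conv2d(3, 3, kernel_size=3, padding=1)   # domain Y -> X (e.g. zebra -> horse)

x = torch.randn(4, 3, 128, 128)                 # unpaired images from domain X

x_back = F(G(x))                                # go to domain Y and back again
forward_cycle_loss = torch.mean(torch.abs(x_back - x))   # penalizes F(G(x)) drifting from x
```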


Image under CC BY 4.0 from the Deep Learning Lecture.

So, if you take F(G(x)), you should end up with x again. Of course, if you take G(F(y)), then you should also end up with y again. This gives rise to the following concept: you take two generators and two discriminators. One GAN G generates y from x. One GAN F generates x from y. You still need two discriminators, Dₓ and Dᵧ. The Cycle GAN loss then has the consistency conditions as additions to the loss. Of course, you have the typical discriminator losses, the original GAN losses, with Dᵧ coupled to G and Dₓ coupled to F. On top, you put the cycle consistency loss. The cycle consistency loss is a coupled loss that translates x to y and y back to x at the same time, and it makes sure that the zebra that is generated in y is still not recognized as fake by the discriminator. At the same time, you have the inverse cycle consistency, which translates y into x using F and then x into y again using G while fooling the discriminator regarding x. So, you need the two discriminators. This then gives rise to the cycle consistency loss that we have noted down for you here. You can, for example, use L1 norms and the expected values of those L1 norms to form the specific identities. So, the total loss is then given as the GAN losses that we’ve already discussed earlier plus λ times the cycle consistency loss.
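Putting these pieces together, here is a hedged sketch of how the generator-side part of such a total objective could be assembled; the tiny stand-in networks, the binary cross-entropy form of the adversarial term, the weight λ = 10, and all names are illustrative assumptions, and the separate real/fake updates for the two discriminators are omitted for brevity:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()   # adversarial criterion (illustrative choice)
l1  = nn.L1Loss()    # criterion for the cycle consistency terms
lam = 10.0           # weight of the cycle consistency loss (illustrative)

def generator_objective(G, F, D_x, D_y, x, y):
    """Sketch of the generator-side Cycle GAN objective for one batch."""
    fake_y, fake_x = G(x), F(y)

    # Adversarial terms: G and F try to make the discriminators output "real" (1).
    pred_y, pred_x = D_y(fake_y), D_x(fake_x)
    adversarial = bce(pred_y, torch.ones_like(pred_y)) + \
                  bce(pred_x, torch.ones_like(pred_x))

    # Cycle consistency: translating there and back should reproduce the input.
    cycle = l1(F(fake_y), x) + l1(G(fake_x), y)

    return adversarial + lam * cycle

# Tiny stand-ins so the sketch runs; real Cycle GAN networks are far larger.
G   = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())
F   = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())
D_x = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())
D_y = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())

x = torch.randn(2, 3, 64, 64)   # unpaired batch from domain X (e.g. horses)
y = torch.randn(2, 3, 64, 64)   # unpaired batch from domain Y (e.g. zebras)
loss = generator_objective(G, F, D_x, D_y, x, y)
```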


Examples of Cycle GANs. Image under CC BY 4.0 from the Deep Learning Lecture.

So, this concept is fairly easy to grasp, and I can tell you it has been widely applied. There are many, many examples. You can translate from Monet paintings to photos, from zebras to horses, from summer to winter, and perform the respective inverse operations. If you couple this with more GANs and more cycle consistency losses, then you’re even able to take one photograph and translate it into the styles of Monet, Van Gogh, and other artists.


Image under CC BY 4.0 from the Deep Learning Lecture.

This is, of course, also interesting for autonomous driving, where you can, for example, input a scene and then generate different segmentation masks. So, you can also use it for image segmentation in this setting. Here, we have an ablation study for the Cycle GAN, where we show the cycle consistency loss alone, the GAN alone, the GAN plus forward loss, the GAN plus backward loss, and the complete Cycle GAN loss. You can see that with the Cycle GAN loss, you get much better back-and-forth translations when you compare them to the respective ground truth.


Image under CC BY 4.0 from the Deep Learning Lecture.

Okay, there are a couple more things to say about GANs, and these are the advanced GAN concepts that we’ll talk about next time in deep learning. So, I hope you enjoyed this video, and I’m looking forward to seeing you in the next one. Good-bye!


Cycle GANs make surgical training slightly more realistic. Image created using gifify. Source: YouTube

If you liked this post, you can find more essays here, more educational material on Machine Learning here, or have a look at our Deep Learning Lecture. I would also appreciate a follow on YouTube, Twitter, Facebook, or LinkedIn in case you want to be informed about more essays, videos, and research in the future. This article is released under the Creative Commons 4.0 Attribution License and can be reprinted and modified if referenced. If you are interested in generating transcripts from video lectures, try AutoBlog.


Links

Link — Variational Autoencoders
Link — NIPS 2016 GAN Tutorial of Goodfellow
Link — How to train a GAN? Tips and tricks to make GANs work (careful, not everything is true anymore!)
Link — Ever wondered about how to name your GAN?


Translated from: https://towardsdatascience.com/unsupervised-learning-part-4-eeb4d3ab601
