CSE 559A: Computer Vision


Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).
Course Staff: Zhihao Xia, Charlie Wu, Han Liu

http://www.cse.wustl.edu/~ayan/courses/cse559a/

December 6, 2018

General

  • Last class. No office hours tomorrow.
  • Project Reports Due Sunday Night!
  • Keys for PSET 5 can be picked up from Jolley 205 on Tuesday, between 12:30pm and 1:30pm.

Generative Adversarial Networks

Clarification about Loss

  • Maximize \(-\log (1-D(G))\) vs. minimize \(-\log(D(G))\)
  • Binary classifier: the output of \(D\) is \(\sigma(y)\) for some logit \(y\)
  • Suppose the discriminator is doing really well, so \(D(G)\) is almost 0 (\(y \ll 0\))
  • \(\max -\log (1-D(G)) = \min \log(1-D(G))\)

\[\nabla_y = -(D(G)-0) = -D(G) \approx 0\]

  • \(\min -\log(D(G))\)

\[\nabla_y = D(G)-1 \approx -1\]

  • Both gradients are negative (i.e., the correct sign: both say to try to increase \(D(G)\)).
  • But the second version has much larger magnitude, so the generator still gets a strong learning signal.
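The two gradients above can be checked numerically. A minimal NumPy sketch; the logit value \(y = -6\) is just an illustrative stand-in for "the discriminator is winning":

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

# Discriminator is doing really well: its logit for a generated sample
# is very negative, so D(G) = sigmoid(y) is almost 0.
y = -6.0
DG = sigmoid(y)  # roughly 0.0025

# Saturating version: minimize log(1 - D(G)).
# d/dy log(1 - sigmoid(y)) = -sigmoid(y)
grad_saturating = -DG          # correct sign, but vanishes as D(G) -> 0

# Non-saturating version: minimize -log(D(G)).
# d/dy -log(sigmoid(y)) = sigmoid(y) - 1
grad_nonsaturating = DG - 1.0  # correct sign, and stays close to -1
```

Both gradients are negative, but only the second keeps a useful magnitude when the discriminator is confident.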

DCGAN: Radford et al.

  • Images generated by a DCGAN trained on bedroom photographs (the LSUN dataset).

Neyshabur et al., Stabilizing GAN Training with Multiple Random Projections

Denton et al., Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

Karras et al., Progressive Growing of GANs for Improved Quality, Stability, and Variation

Conditional GANs

Adversarial Loss

  • Train with

\[L(G) = \|G(x)-y\|^2 - \lambda \log D(G(x))\] \[L(D) = - \log(1-D(G(x))) - \log D(y)\]

  • The GAN loss is unconditional, but it is paired with a reconstruction loss.
  • So the loss says: be close to the true answer, but also make your output resemble natural images.
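As a concrete check of the two losses above, here is a minimal NumPy sketch; the generator output \(G(x)\), the discriminator scores, and the weight \(\lambda\) are made-up numbers for illustration:

```python
import numpy as np

# Hypothetical quantities for a single training pair (x, y); all numbers
# here are made up for illustration.
Gx = np.array([0.2, 0.8])      # generator output G(x)
y_true = np.array([0.0, 1.0])  # ground-truth output y
d_fake = 0.3                   # D(G(x)): discriminator's score on the generated output
d_real = 0.9                   # D(y): discriminator's score on the real output
lam = 0.01                     # weight lambda on the adversarial term

# L(G) = ||G(x) - y||^2 - lambda * log D(G(x))
L_G = np.sum((Gx - y_true) ** 2) - lam * np.log(d_fake)

# L(D) = -log(1 - D(G(x))) - log D(y)
L_D = -np.log(1.0 - d_fake) - np.log(d_real)
```

Note that the reconstruction term dominates \(L(G)\) for small \(\lambda\); the adversarial term only nudges the output toward the natural-image manifold.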

Unsupervised Learning

  • For a lot of tasks, it is hard to collect enough training data.
  • We saw with the stereo example how you can use indirect supervision.
  • But in other cases, you have to use transfer learning.
    • Train a network on a large dataset for a related task for which you do have ground truth.
    • Remove the last layer, and use / fine-tune the feature extractor for the new task.
  • Researchers are exploring tasks designed specifically to transfer from.
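The remove-the-last-layer recipe can be sketched in a few lines of NumPy. The frozen random projection below is a hypothetical stand-in for a real pre-trained network, and all the data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained network with its last layer removed: a frozen
# random projection + ReLU.  (Hypothetical; a real pipeline would load
# actual pre-trained weights here.)
W_frozen = rng.normal(size=(16, 4))

def features(x):
    return np.maximum(0.0, x @ W_frozen)  # frozen feature extractor

# New task with its own (toy, synthetic) data.
X = rng.normal(size=(64, 16))   # inputs for the new task
t = rng.normal(size=(64,))      # regression targets for the new task
F = features(X)

# Fit only a fresh linear "last layer" on top of the frozen features
# (closed-form least squares instead of SGD, to keep the sketch short).
w_head, *_ = np.linalg.lstsq(F, t, rcond=None)
pred = F @ w_head
```

In practice, you would often also fine-tune the feature extractor with a small learning rate rather than keeping it fully frozen.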

Unsupervised Learning

  • Pre-train by learning to add color to grayscale images

Larsson, Maire, Shakhnarovich, CVPR 2017.

Unsupervised Learning

  • Pre-train by solving jigsaw puzzles

Unsupervised Learning

  • Pre-train by predicting sound from video

Domain Adaptation

  • Generate synthetic training data using renderers.
  • But networks trained on synthetic data need not generalize to real data.
  • (In fact, they may not even transfer from high-quality Flickr images to cell-phone camera images.)

Problem Setting

  • Have input-output training pairs \((x',y)\) from the source domain: renderings / high-quality images / ...
  • Have only inputs \(x\) from the target domain: where we actually want to use the network.
  • Train the network so that features computed from \(x'\) and \(x\) have the same distribution ...

i.e., use GANs!
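Here is a minimal NumPy sketch of that idea: a domain discriminator tries to tell source features from target features, while the shared feature extractor would be trained against it. All weights, data, and shapes below are hypothetical, and a real implementation would alternate gradient updates on the two networks:

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

rng = np.random.default_rng(0)

# Shared feature extractor applied to both domains (hypothetical weights).
W_feat = 0.1 * rng.normal(size=(8, 4))

def features(x):
    return np.tanh(x @ W_feat)

# Synthetic stand-ins: labeled source inputs x' and unlabeled target inputs x,
# with a deliberate shift between the two domains.
x_source = rng.normal(size=(32, 8))
x_target = rng.normal(size=(32, 8)) + 0.5

f_s, f_t = features(x_source), features(x_target)

# Domain discriminator (just a linear classifier here): source = 1, target = 0.
w_dom = 0.1 * rng.normal(size=(4,))
p_s = sigmoid(f_s @ w_dom)
p_t = sigmoid(f_t @ w_dom)

# Discriminator's loss: tell the two feature distributions apart.
L_dom = -np.mean(np.log(p_s + 1e-8)) - np.mean(np.log(1.0 - p_t + 1e-8))

# Adversarial loss for the feature extractor: make target features look
# like source features to the domain discriminator.
L_adv = -np.mean(np.log(p_t + 1e-8))
```

Minimizing `L_adv` with respect to the extractor's weights pushes the two feature distributions together, which is exactly the GAN game applied to features instead of images.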

That's all folks!
  • We've covered the foundations of state-of-the-art vision algorithms
    • This will help you read, understand, and implement vision papers
    • But things are changing rapidly: not just new solutions, but new problems
    • So keep reading!

We hope that you ...

  • Have an understanding of the basic math and programming tools to approach vision problems
  • Are as surprised as we are that humans and animals are able to solve these problems so easily

Reminders

  • Fill out course evaluations!
  • Repeat Advertisement: CSE 659A in the Spring, "Advances in Computer Vision".