CSE 659A Advances in Computer Vision

SP2020: Tue/Th 1:00‑2:20pm @ Zoom (originally Whitaker 216)


Computer vision is a fast-moving field, and the past few years have seen tremendous advances in computational algorithms for solving visual tasks. This course is designed to introduce you to advanced and recently published techniques for problems in low-level vision, recognition and classification, and computational photography.

Pre-requisites: CSE 559A

Instructor: Ayan Chakrabarti

All remaining classes will be held on Zoom. See Canvas for links & recordings. Also see the updated grade policies & due dates: there will be no project presentations this semester. Stay safe, everyone!

Slides

Jan 14: Course Intro & norm penalty gradient regularizers. HTML | PDF 1UP | PDF 4UP
Jan 16: Optimization: IRLS & HQS. Intro to Learned Priors. HTML | PDF 1UP | PDF 4UP
Jan 21: Gaussian Mixture Model-based Priors: Learning and Inference. HTML | PDF 1UP | PDF 4UP
Jan 23: Sparse Dictionaries. CNN-denoiser priors. Intro to MRFs. Fields of Experts. HTML | PDF 1UP | PDF 4UP
Jan 28: Belief Propagation for Max-marginal and MAP. Loopy Belief Propagation. HTML | PDF 1UP | PDF 4UP
Jan 30: Fully-connected MRFs. Mean Field, including as an RNN. Intro to Graph Cuts. HTML | PDF 1UP | PDF 4UP
Feb 4: Multi-label Graph Cuts. Diverse M-Best Modes. HTML | PDF 1UP | PDF 4UP
Feb 6: Stereo with planar models: Stereo-SLIC, MRFs, Dense Patch Consensus. HTML | PDF 1UP | PDF 4UP
Feb 11: Plane sweep stereo. Monocular Flow. Scene Flow. Large Displacement Optical Flow. NN-based stereo matching. HTML | PDF 1UP | PDF 4UP
Feb 13: Neural network-based stereo and flow. Monocular Depth Estimation. Correspondences and Semantics. HTML | PDF 1UP | PDF 4UP
Feb 18: SIFT. Content-based Image Retrieval. Dalal-Triggs Human Detection. HTML | PDF 1UP | PDF 4UP
Feb 20: DPMs. Object Detection with NNs: R-CNN, Fast(er) R-CNN, YOLO. Instance Segmentation: SDS. HTML | PDF 1UP | PDF 4UP
Feb 25: Deep Watershed Transform. Image Captioning. HTML | PDF 1UP | PDF 4UP
Feb 27: CG2Real. Seam Carving. Texture Synthesis & Style Transfer. Poisson Blending. HTML | PDF 1UP | PDF 4UP
Mar 5: Image Generation and Editing with GANs. HTML | PDF 1UP | PDF 4UP
Mar 24: Motion Magnification. Blind Deblurring. Patch Recurrence. HTML | PDF 1UP | PDF 4UP
Mar 26: Dehazing. Amplifying Irregularities. Bas-relief Ambiguities. HTML | PDF 1UP | PDF 4UP
Mar 31: Shading Ambiguities. Intrinsic Images. SIRFS. HTML | PDF 1UP | PDF 4UP
Apr 2: More Intrinsic Images. Shape from Shading. RGB Photometric Stereo. HTML | PDF 1UP | PDF 4UP
Apr 7: DL methods for Intrinsic Images. General Photometric Stereo. HTML | PDF 1UP | PDF 4UP
Apr 9: Illumination estimation and separation. Designing & learning sensors. HTML | PDF 1UP | PDF 4UP
Apr 14: Flash + No-flash. HDR Imaging. Depth from Defocus. HTML | PDF 1UP | PDF 4UP

Readings

T1.1 Krishnan and Fergus, "Fast Image Deconvolution using Hyper-Laplacian Priors," NIPS 2009.
T1.2 Wikipedia: Expectation Maximization.
T1.3 Zoran and Weiss, "From Learning Models of Natural Image Patches to Whole Image Restoration," ICCV 2011.
T1.4 Roth and Black, "Fields of Experts: A Framework for Learning Image Priors," CVPR 2005.
T1.5 Zhang et al., "Learning Deep CNN Denoiser Prior for Image Restoration," CVPR 2017.
T1.6 Kevin Murphy, Machine Learning: A Probabilistic Perspective, Ch. 19.
T1.7 Boykov, Veksler, and Zabih, "Fast Approximate Energy Minimization via Graph Cuts," PAMI 2001.
T1.8 Batra et al., "Diverse M-Best Solutions in Markov Random Fields," ECCV 2012.
T1.9 Zheng et al., "Conditional Random Fields as Recurrent Neural Networks," ICCV 2015.
T2.1 Yamaguchi et al., "Continuous Markov Random Fields for Robust Stereo Estimation," ECCV 2012.
T2.2 Chakrabarti et al., "Low-level Vision by Consensus in a Spatial Hierarchy of Regions," CVPR 2015.
T2.3 Vogel et al., "3D Scene Flow Estimation with a Piecewise Rigid Scene Model," IJCV 2015.
T2.4 Hu et al., "Efficient Coarse-to-Fine PatchMatch for Large Displacement Optical Flow," CVPR 2016.
T2.5 Revaud et al., "EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow," CVPR 2015.
T2.6 Zbontar and LeCun, "Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches," JMLR 2016.
T2.7 Chang and Chen, "Pyramid Stereo Matching Network," CVPR 2018.
T2.8 Sun et al., "PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume," CVPR 2018.
T2.9 Chen et al., "Surface Normals in the Wild," ICCV 2017.
T3.1 Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," IJCV 2004.
T3.2 Sivic and Zisserman, "Video Google: A Text Retrieval Approach to Object Matching in Videos," ICCV 2003.
T3.3 Dalal and Triggs, "Histograms of Oriented Gradients for Human Detection," CVPR 2005.
T3.4 Felzenszwalb et al., "Object Detection with Discriminatively Trained Part Based Models," PAMI 2010.
T3.5 Girshick et al., "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," CVPR 2014.
T3.6 Hariharan et al., "Simultaneous Detection and Segmentation," ECCV 2014.
T3.7 Bai and Urtasun, "Deep Watershed Transform for Instance Segmentation," CVPR 2017.

Syllabus*

*The list of topics is tentative and may change. In the last month, one class per week will be for student project presentations.

  1. Image Priors & Spatial Models
    • Gradient regularizers, patch models based on GMMs, sparse dictionaries, fields of experts, etc. Optimization algorithms for inference. Restoration with plug-and-play denoisers.
    • Graphical Models and Markov Random Fields (MRFs). Inference algorithms including loopy belief propagation, graph cuts with expansion and swap moves, etc. Mean field with efficient data structures for fully-connected MRFs.
    • Combining neural network outputs with MRF models. Incorporating MRF inference within a network.
  2. Depth and Motion Estimation
    • Un-rectified and multi-view stereo with plane sweep.
    • Inference with planarity and higher-order priors on depth.
    • Large displacement optical flow, and layered models for flow.
    • CNN-based methods for stereo and flow.
    • Monocular Depth and Normal Estimation.
  3. Classification & Recognition
    • Interest point detectors. Traditional region (SIFT, HoG) and scene (GIST) descriptors.
    • Content-based Image Retrieval. Object Detection with the Deformable Parts Model.
    • CNN-based object detection. Image and Instance Segmentation.
    • Image Captioning. Visual Question Answering.
  4. Computational Photography I
    • Texture Synthesis. Seam Carving.
    • Image Harmonization. CG2Real.
    • Motion Magnification.
    • Photo UnCrop. Image Inpainting. Image Editing with Smart Contours.
  5. Advanced Photometric Reasoning
    • Uniqueness results: when does shading determine shape?
    • Modern algorithms for shape from shading, intrinsic image decomposition, and photometric stereo.
    • Neural network-based Color Constancy. Lighting separation with flash/no-flash.
  6. Computational Photography II
    • Dark flash photography.
    • Coded aperture and coded exposure photography. Light-field cameras.
    • Structured light, Time-of-Flight cameras.
    • Other non-traditional cameras.

Policies

Grade: Evaluation in this course will be based on the following:

  • 25%: Five homework "paper reviews". You will be asked to read a paper and write a short (1-2 page) review.
    The following is a tentative schedule of when the papers will be assigned and reviews due:
    • Review 1. Assigned: Jan 23, Due: Feb 6.
    • Review 2. Assigned: Feb 6, Due: Feb 20.
    • Review 3. Assigned: Feb 20, Due: Mar 26.
    • Review 4. Assigned: Mar 26, Due: Apr 9.
    • Review 5. Assigned: Apr 9, Due: Apr 23.
  • 60%: Two Projects (30% each).
    • Project I: Topics 1-3. Report Due: Mar 24.
    • Project II: Topics 3-5. Report Due: Apr 26.
  • 15%: Presentation (based on Project I). Presentation dates will be assigned randomly and posted during the semester. Irrespective of the presentation date, presentation slides for all students will be due before the first presentation date (around late March).
All submissions will be through Canvas.

Updates: There will be no presentations. The homework and project total will be scaled up (by 100% / 85%) to determine your final grade. Grade boundaries for letter grades will be determined based on the distribution of scores. For those taking the course P/F, the threshold for a pass will be 70 (after scaling).
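As a rough illustration of the rescaling described above, the sketch below computes a final score from the remaining components (25% reviews, 60% projects) and multiplies by 100/85. The function names, the 0-100 scale for each component, and treating a score of exactly 70 as a pass are assumptions made for this example, not stated policy.

```python
def final_score(review_avg, project1, project2):
    """Return the rescaled course score on a 0-100 scale.

    Assumes each input is already a percentage (0-100):
    review_avg is the average over the five paper reviews,
    project1 and project2 are the two project scores.
    """
    # Weighted total of the remaining components (maximum 85).
    raw = 0.25 * review_avg + 0.30 * project1 + 0.30 * project2
    # Scale up by 100% / 85% since the 15% presentation was dropped.
    return raw * 100.0 / 85.0


def passes(score, threshold=70.0):
    """P/F check; treating exactly 70 as a pass is an assumption."""
    return score >= threshold


if __name__ == "__main__":
    score = final_score(review_avg=80, project1=90, project2=70)
    print(f"scaled score: {score:.1f}, pass: {passes(score)}")
```

For example, a review average of 80 with project scores of 90 and 70 gives a weighted total of 68 out of 85, which scales to about 80.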

Late Policy: All homework reviews, project reports, and presentation slides must be submitted by 11:59 PM on their due dates. There will be no extensions given. We recommend you submit early, leaving a buffer of a few days to account for unexpected delays.

Collaboration and Academic Honesty: Discussion about course topics with your classmates is encouraged (in person, and on Piazza), but all homework reviews and projects are expected to be completed individually. While completing the projects, it is OK to rely on code posted online, but this should be acknowledged in the project report. Any instances of plagiarism will be reported to the school, and will attract strict penalties.

Office Hours: By appointment. Please make a private post on Piazza, and let me know what times in the next 2-3 days work for you. I'll get back to you with a time slot.