CSE 40535 is an upper-level Computer Science and Engineering course at the University of Notre Dame that introduces students to the field of computer vision. The aim of computer vision is to give computers the ability to "understand" what they "see" in images and videos taken by one or more sensors (most often visible-light cameras). The goal of this course is to introduce and discuss methods for interpreting the visual information captured by machines to give them this ability.
The course is divided into five parts. In the first part, we define the notion of computer vision and review the progress made in this discipline in recent decades, its current challenges, successful applications, and limitations. We also discuss selected biological vision mechanisms as an inspiration for creating better computer vision solutions. The second part explains the basics of signal processing from a computer vision perspective. This part includes image formation, image acquisition, understanding and effective use of color (and in general multi-wavelength) information, and image processing (filtering and segmentation). The third part focuses on the automatic recognition of patterns. It covers feature extraction and selection, texture descriptors, Bayesian inference, classification, and decision making. In this part, we will also discuss how these tasks can be solved using deep learning techniques, especially convolutional neural networks. Several meetings in this third part will be devoted to the reliability of modern deep learning-based, generative, and image-to-image translation models. The fourth part considers multiple-view and geometry topics in vision: motion analysis (including object tracking), projective geometry, the geometric camera model, camera calibration, and 3D reconstruction. Finally, the fifth part introduces a few practical applications of computer vision. One meeting at the end of the semester will be devoted to the non-technical aspects of designing trustworthy and reliable computer vision (and AI in general) systems.
After completing this course, students will be able to understand the computer vision literature, recognize the frontiers of state-of-the-art computer vision systems, and select appropriate mathematical and software tools to develop algorithms that solve the most important computer vision problems. Practical classes will utilize high-level programming languages (Python will be our main coding language) and popular computer vision tools and machine learning packages, such as OpenCV, Keras, TensorFlow, or PyTorch, to illustrate in practice selected topics discussed in class. The goal of the semester project is to exercise the entire computer vision pipeline on a selected vision problem. This version of the course was adapted by Prof. Walter Scheirer from the one originally developed by Prof. Adam Czajka.
Upon successful completion of this course, students will be able to:
Describe the principles of image formation and understand biological inspirations in computer vision.
Understand how images are coded and compressed.
Implement selected algorithms of image enhancement, based on point and neighborhood operations.
Design and implement selected image segmentation methods to separate a useful part of the image from background information.
Build algorithms to extract and select appropriate image features suitable for a computer vision task being solved.
Understand pattern recognition principles and select appropriate classification techniques.
Deploy statistical methods for performance evaluation.
Understand geometrical transformations suitable for 2D and 3D images.
Use selected theoretical and programming tools, along with computer vision libraries, in developing their own computer vision systems.
Unit | Date | Topics | Assignment |
---|---|---|---|
Course Introduction | 01/13 | Syllabus, Course Organization, and Graded Material Slides | Homework 00 Project 00 |
Introduction to Computer Vision | |||
Field of Computer Vision | 01/15 | Introduction to Computer Vision (Part 1) Slides | Homework 01 Torralba et al. Notation and Chpt. 2; Optional: Szeliski Sec. 1.1; Daugman pp. 5-20 |
01/17 | Introduction to Computer Vision (Part 2) Slides | ||
Biological Vision | 01/20 | Martin Luther King Jr. Day (No Class) | |
01/22 | Human Vision and Bio Inspirations in CV (Part 1) Slides | Torralba et al. Chpt. 1; Optional: Szeliski Sec. 1.2; Daugman pp. 29-55 | |
01/24 | Human Vision and Bio Inspirations in CV (Part 2) Slides | Homework 02 Project 01 CVPR 2024 AI Art Gallery | |
Digital Image Analysis | |||
Image Acquisition and Formation | 01/27 | Image Acquisition Slides | Torralba et al. Chpts. 3, 5, 6, 7 |
01/29 | Color and Multispectral Imaging (Part 1) Slides | Torralba et al. Chpt. 8 | |
01/31 | Color and Multispectral Imaging (Part 2) Slides | Practical 00 (Before-Class) | |
02/03 | Practical 00 (In-Class) | ||
Image Processing | 02/05 | Point Operators (Part 1) | |
02/07 | Point Operators (Part 2) | ||
02/10 | Neighborhood Operators (Part 1) | ||
02/12 | Neighborhood Operators (Part 2) | ||
02/14 | Neighborhood Operators (Part 3) | ||
02/17 | Edge Detection and Linking (Part 1) | ||
02/19 | Edge Detection and Linking (Part 2) | ||
02/21 | Image Segmentation | ||
02/24 | Practical 01 (In-Class) | ||
Visual Recognition | |||
Feature Extraction | 02/26 | Extraction of Visual Features (Part 1) | |
02/28 | Extraction of Visual Features (Part 2) | ||
03/03 | Extraction of Visual Features (Part 3) | ||
Geometric Transformations | 03/05 | Geometric Transformations | |
03/07 | Practical 02 (In-Class) | ||
03/10 | Spring Break (No Class) | ||
03/12 | Spring Break (No Class) | ||
03/14 | Spring Break (No Class) | ||
Feature Classification and Evaluation | 03/17 | Feature Classification (Part 1) | |
03/19 | Feature Classification (Part 2) | ||
03/21 | Feature Classification (Part 3) | ||
Deep Learning | 03/24 | Deep Learning in Computer Vision (Part 1) | |
03/26 | Practical 03 (In-Class) | ||
03/28 | Deep Learning in Computer Vision (Part 2) | ||
03/31 | Deep Learning in Computer Vision (Part 3) | ||
04/02 | Object Detection and Tracking (Part 1) | ||
04/04 | Object Detection and Tracking (Part 2) | ||
04/07 | Generative Models | ||
3D Computer Vision | |||
Camera Calibration | 04/09 | Camera Calibration | |
3D Reconstruction | 04/11 | 3D Reconstruction (Part 1) | |
04/14 | 3D Reconstruction (Part 2) | ||
Applications of Computer Vision | |||
Applications | 04/16 | Practical 04 (In-Class) | |
04/18 | Easter Break (No Class) | ||
04/21 | Easter Break (No Class) | ||
04/23 | Human Biometrics | ||
04/25 | Digital Humanities | ||
04/28 | Computer Vision for Ecology | ||
04/30 | Generalization of Computer Vision | ||
Final Exam | 05/05 4:15-6:15pm | Tentative |
Component | Points |
---|---|
Participation Participation in class, office hours, and Slack chats. | 200 |
Homeworks Homework assignments. | 10 × 100 |
Practicals In-class programming projects. | 5 × 200 |
Project Final group project. | 500 |
Exam Final Exam. | 300 |
Total | 3000 |
Grade | Points | Grade | Points | Grade | Points |
---|---|---|---|---|---|
A | 2790-3000 | A- | 2700-2789 | ||
B+ | 2601-2699 | B | 2499-2600 | B- | 2400-2498 |
C+ | 2301-2399 | C | 2199-2300 | C- | 2100-2198 |
D | 1950-2099 | F | 0-1949 |
All assignments are to be submitted to your own private GitHub repository, unless specified otherwise.
In-person attendance is mandatory. Students are expected to attend and contribute regularly in class. This means answering questions in class, participating in discussions, and helping other students.
Foreseeable absences should be discussed with the instructor ahead of time.
Any student who has a documented disability and is registered with Sara Bea Accessibility Services should speak with the professor as soon as possible regarding accommodations. Students who are not registered should contact Sara Bea Accessibility Services.
Any academic misconduct in this course is considered a serious offense, and the strongest possible academic penalties will be pursued for such behavior. Students may discuss high-level ideas with other students, but at the time of implementation (i.e., programming), each person must do his/her own work. Use of the Internet as a reference is allowed, but directly copying code or other information is cheating. It is cheating to copy, or to allow another person to copy, all or part of an exam or an assignment, or to fake program output. It is also a violation of the Undergraduate Academic Code of Honor to observe and then fail to report academic dishonesty. You are responsible for the security and integrity of your own work.
I am a big advocate of the responsible use of modern AI tools, including generative AI models that are in the headlines (this is a computer vision course, after all). I will create opportunities to use the output of these systems in assignments, and we will discuss their usefulness for solving computer vision problems in class. But you must not use generative AI to simply "do your work for you" (e.g., generate ready-to-submit homework answers or code snippets). This is not what I would call "responsible use" and it is harmful to your long-term learning goals. The prose generated by ChatGPT is spelled and punctuated correctly and has proper grammar, but it frequently concocts false information, and supports its arguments with very shallow reasoning that relies on description rather than inference. Use it as a "brainstorming buddy" (ChatGPT's confabulations can be very inspirational as they may take you to ideas you have never thought about). Or use it to simply correct your grammar / language. But do not cheat (see the Academic Honesty section of this page).
In the case of a serious illness or other excused absence, as defined by university policies, coursework submissions will be accepted late by the same number of days as the excused absence. Otherwise, a late penalty, as determined by the instructor, will be assessed to any late submission of an assignment. In general, the late penalty is 10 points off for each day after the assigned deadline. The instructor reserves the right to refuse any unexcused late work.
Notre Dame has implemented a classroom recording system. This system allows us to record and distribute lectures to you in a secure environment. You can watch these recordings on your computer, tablet, or smartphone. The recordings can be accessed via a request made to the instructor.
Because the instructor reserves the right to record in the classroom on select occasions, your questions and comments may be recorded. (Video recordings typically only capture the front of the classroom.) If you have any concerns about your voice or image being recorded, please speak to me to determine an alternative means of participating. No content will be shared with individuals outside of your course without your permission except for faculty and staff that need access for support or specific academic purposes.
These recordings are jointly copyrighted by the University of Notre Dame and your instructor. Posting them to other websites, including YouTube, Facebook, Vimeo, or elsewhere without express, written permission may result in disciplinary action and possible civil prosecution.
For the assignments in this class, you may discuss them with other students and consult printed and online resources. You may quote from books and online sources as long as you cite them properly. However, you may not look at another student's solution, and you may not copy solutions.
For further guidance please refer to the CSE Honor Code or ask the instructor.