This advanced PhD-level course is designed to prepare students for research in computer vision. The course will explore both recent developments and foundational work in the field, with lectures covering neural network architectures, methods for solving recognition tasks (e.g., classification, detection, segmentation), generative models, vision-language models, and supervised and unsupervised learning techniques, among other topics. While the focus is on contemporary research, we will also examine impactful older papers that offer valuable lessons in methodology and technical writing. As student research projects develop, lecture topics may be adjusted to better align with students' interests and project needs. This course is project-based, with a strong emphasis on independent research, experimentation, and the clear communication of results. It is not an introductory course; students are expected to have a solid foundation in machine learning, prior experience in computer vision, and proficiency in Python and LaTeX.

Students will first work individually and then in teams. Each student will initially propose three potential research projects; teams of 2-3 members will then form, and each team will select and develop one of the proposed projects. The course requires the submission of a detailed project proposal (a 2-page report and a 3-minute presentation), a midterm project update (a 4-page report and a 4-minute presentation), and a final 8-page report accompanied by a poster presentation. In-person participation is mandatory for all sessions, particularly for project presentations, to foster a collaborative and interactive learning environment. This course is ideal for students who are seriously interested in advancing their expertise in computer vision and contributing to the field through research.


  • Time: Monday/Wednesday, 4:00 PM – 5:15 PM
  • Location: Morrill I N326
  • Discussion: TBD
  • Lecture slides: TBD
  • Contact: TBD