About This Course
Course Responses
In the simplest terms, computer vision is the discipline of "teaching machines how to see." This field dates back more than forty years, but the recent explosive growth of digital imaging technology makes the problems of automated image interpretation more exciting and relevant than ever. There are two major themes in the computer vision literature: 3D geometry and recognition. The first theme is about using vision as a source of metric 3D information: given one or more images of a scene taken by a camera with known or unknown parameters, how can we go from 2D to 3D, and how much can we tell about the 3D structure of the environment pictured in those images? The second theme, by contrast, is all about vision as a source of semantic information: can we recognize the objects, people, or activities pictured in the images, and understand the structure and relationships of different scene components just as a human would? This course will strive to provide a unified perspective on the different aspects of computer vision, and give students the ability to understand vision literature and implement components that are fundamental to many modern vision systems.
Who is this class for: This is an entry-level course in computer vision. Some basic programming knowledge is assumed, and learners are required to complete programming tasks in Matlab, C, C++, or Python.
Language: Chinese, English
Programming skill: C/C++, Python, Matlab, OpenCV
Equipment: Desktop or notebook computer. OS: Windows/Linux/Mac
Reading report: Submitted via the web page. Each report must be at least 300 words
Programming report: Submitted via the web page. Each must include a brief report of at most 1000 words, with plenty of illustrations of the program
No plagiarism in reports or programs.
Feel free to discuss assignments with each other, but coding must be done individually
Feel free to incorporate code or tips you find on the Web, provided this doesn’t make the assignment trivial and you explicitly acknowledge your sources
Remember: I can Google too (and I have copies of everybody’s assignments from the last three years this class was offered)
Generative AI (GAI) tools, such as ChatGPT, Gemini, Claude, and Copilot, are allowed with constraints.
GAI may be used to help you: explain concepts, explain source code, read articles, suggest ideas, and proofread your reports.
GAI should not be used to: write code, compile code, generate code results, or write complete reports.
GAI is only your thought partner.
GAI usage regulation
You must disclose your use of AI tools.
Disclosures must include: tool names, prompts, and responses.
[CVAA] Computer Vision: Algorithms and Applications, by Richard Szeliski, Springer, 2011. [Free PDF]
[OpenCV4] OpenCV 4 Computer Vision Application Programming Cookbook, 4th Edition, D. M. Escrivá and R. Laganière, 2019. GitHub
[CMV] Computer and Machine Vision: Theory, Algorithms, Practicalities, Fourth Edition. E.R. Davies, Academic Press, 2012. Free PDF @ FJU Lib
[DIP] Digital Image Processing, 3rd Edition. R.C. Gonzalez and R.E. Woods, Prentice Hall, 2008.
[OpenCV3] OpenCV 3 Computer Vision Application Programming Cookbook, 3rd Edition, R. Laganière, Packt Publishing, 2017. GitHub, Simplified Chinese edition
Computer Vision: A Modern Approach, 2nd Edition, D. Forsyth and J. Ponce, Prentice Hall, 2012.