Due: Jan 31, 2025 by 11:59pm. Points: 50

Prepare a short description (between 500 and 1000 words) of your proposed high-level solution to the semester project you selected, and of the datasets you think you would need to acquire to implement it. It is, of course, too early in the semester to provide details of your final deliverable. Rather, this is an exercise to start thinking about what you need to learn to solve the problem. For instance, do you need to detect the object (i.e., tell where it is) in the frame? If so, is there anything specific about the object that you would like to use (e.g., color, shape, "key points" that are very specific to that object, etc.)? Are you trying to recognize people by their faces? What kinds of features do you think you would need to calculate, and which image properties should your solution be agnostic to?

To get credit for this part, push a description to the GitHub repository you shared with Walter and Louisa last week (Part 00). The best way is to prepare a readme.md file written in Markdown and split it into sections (Part 1, Part 2, ...).

The aim of this assignment is to get you thinking about a high-level solution before diving into detailed tools (feature extractors and classifiers). It allows Walter to direct you towards a correct solution early in the semester, and it allows you to focus on those elements of the class material that are useful for your semester project. More advanced students can certainly include lower-level details (and we can talk right away about feature extractors, appropriate classifiers, loss functions and/or neural architectures in deep learning, etc.). But the primary goal of this assignment is to convince you to start thinking right now about the problem you are solving and its solution, and to keep both in mind throughout our lectures. In class, you may then keep asking yourself, "How can the method presented today help me in my semester project?"

Note 1: When thinking about the data, consider three subsets: one for training, one for validation, and one for final testing. You will use the training set to find your best solution (parameters of feature extraction and classification). You will use the validation set to check how this interim solution works on new data not used in training (this lets you detect when your classifier is overfitting to the training set). Finally, you will use the "unknown" test subset to evaluate your final deliverable (so don't touch this data until the final evaluation and report). To think ahead of time about some concrete datasets, you can look at this amazing list of links to various Computer Vision databases: http://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm. (Don't be discouraged by the old-school HTML design; this list is quite well maintained.)
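The three-way split described in Note 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a required procedure: the function name split_dataset, the 70/15/15 proportions, and the file names are all hypothetical, and your project may call for a different strategy.

```python
import random


def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle items once and split them into train, validation, and test lists.

    The fractions and the seed are illustrative defaults, not course requirements.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(items)                 # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed gives the same split every run

    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)

    test = shuffled[:n_test]               # set aside until the final evaluation
    val = shuffled[n_test:n_test + n_val]  # used to check interim solutions
    train = shuffled[n_test + n_val:]      # used to fit parameters
    return train, val, test


# Hypothetical file names standing in for a real dataset.
image_paths = [f"img_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(image_paths)
print(len(train), len(val), len(test))
```

Fixing the seed means the test subset stays the same between runs, which makes it easier to keep that data untouched until the final evaluation.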

Note 2: You can (and you are encouraged to) use ChatGPT (or another similar tool) as your "brainstorming buddy". That is, work with such tools to generate sensible high-level solutions by treating ChatGPT's responses as "inspirational triggers". Don't copy-paste ChatGPT's generated text as your submission: this is cheating.

Note 3 (for teams): Indicate in your report the individual contributions of each team member. This is required for the assignment to be graded.