Exam project

When the teaching in our course finishes, you are to hand in a project. The formal requirements are found here along with details on grading. The exam period runs until August 23 at 10 AM.

In this blog post we want to expand a little on what a good project looks like. The focus of the project is to pose an interesting question to data that can be collected publicly and to attempt to answer it. Importantly, more points are given to projects that collect or construct a data set independently (e.g. through API services or by scraping).

We emphasize that some good projects use machine learning to model a variable of interest, while others use it to parse text data. When using machine learning, we DO NOT EXPECT you to use any tools beyond those covered in this course, i.e. anything other than logistic and linear regression with regularization. Other projects are great without using machine learning at all: they collect and structure interesting or challenging data and provide descriptive statistics and good visualizations. Therefore, you should always ask yourself: what value does modeling an outcome provide? Is it interesting? Some of the best projects simply visualize data and demonstrate a good understanding through visualization and descriptive statistics.
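To make "linear regression with regularization" concrete, here is a minimal sketch of ridge (L2-penalized) regression with a single feature, written in plain Python. The function name `ridge_fit`, the penalty value, and the toy data are our own illustrative assumptions, not part of the course material or the formal requirements; in a real project you would more likely rely on a library implementation.

```python
from statistics import mean


def ridge_fit(x, y, lam):
    """Fit y ~ intercept + slope * x, penalizing slope**2 by lam (hypothetical helper)."""
    x_bar, y_bar = mean(x), mean(y)
    # Centered sum of squares and sum of cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Closed-form solution; lam = 0 reduces to ordinary least squares.
    slope = sxy / (sxx + lam)
    # The intercept is left unpenalized.
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up toy data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols = ridge_fit(x, y, lam=0.0)
ridge = ridge_fit(x, y, lam=10.0)
print("OLS (intercept, slope):  ", ols)
print("Ridge (intercept, slope):", ridge)
```

The penalty shrinks the slope toward zero, which is the trade-off to keep in mind when asking what value the model adds over a plain descriptive analysis.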

We also note that a good project is succinct and to the point. You should describe your data structuring at a high level to convey what you have done. Sometimes many hundreds of lines of complicated code can be summarized quite briefly while still conveying the essence.

Earlier projects span quite different subjects, data sets, writing styles, and uses of machine learning.