|Due:||Presentation 12/2; Report 12/9|
|Format:||8- to 10-page report + 5-minute presentation|
|Percentage of Grade:||25%|
Identifying an interesting problem.
Addressing it with social data.
Quality of presenting solutions in report.
If presenting, clear, engaging presentation.
The goal of the final project in this class is to identify an interesting question or problem that you can address by analyzing social data. The tools and papers we have discussed in class should help you along the way. You are free to implement the project in whatever language you like, but again I recommend RStudio or IPython Notebook as the go-to platform. For working with text, I recommend Pattern or scikit-learn. Again, please make sure to document what you do, especially in languages where you don't get your command history recorded for free.
I am not prescribing the problem. Whereas I intended the mini-project as an opportunity for you to gain experience analyzing and mucking around with data, in this project I want you to stretch your legs. It is reminiscent of the midterm design project. Above all, find a problem that interests you, one you think you can answer with social data. I would like a short email from each team describing the problem and the data source by December 3. This can be very short — a paragraph or two. I may negotiate the problem and/or data source if I think it's too ambitious (or not enough so).
Once you have a problem, you need data. I have two recommendations for places to find it. The first is your own social media data. You might consider downloading all of your email or IM logs. Or, you could download all your Facebook data. My second recommendation is Hilary Mason's research dataset page. Many of the datasets listed there are social and suggest interesting research questions.
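If you go the email route, a first pass over the data might look like the sketch below. It assumes you have exported your mail as an mbox file (the export format and the file name `my_mail.mbox` are assumptions, not requirements of this assignment), and the helper `count_senders` is a hypothetical name used only for illustration. It relies solely on Python's standard library.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_senders(mbox_path):
    """Return a Counter mapping lowercased sender address -> message count.

    Hypothetical helper: assumes mbox_path points at an mbox-format export.
    """
    counts = Counter()
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            # parseaddr splits 'Name <addr>' into (name, addr).
            _, address = parseaddr(str(message.get("From", "")))
            if address:
                counts[address.lower()] += 1
    finally:
        box.close()
    return counts


if __name__ == "__main__":
    # Assumed file name; replace it with the path to your own export.
    for address, n in count_senders("my_mail.mbox").most_common(10):
        print(address, n)
```

Counting who writes to you most often is only a starting point, but a quick tally like this can help you decide whether the data is rich enough to support the question you have in mind.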
Every team will hand in an approximately eight-page report summarizing their findings.
When writing up your findings, look to the research papers we have read in class.
Start with a section describing your Problem and why it interests you.
Next, please present a Method section and then a longer Results section. The Method section describes what data you used and how you got it. The Results section describes your analysis and contains your graphics.
I also want an Interpretation section following the Results section. Here,
you will discuss what you found and what it suggests.