Applied Statistical Methods

STAT3105, Fall 2021

Tue & Thu 11:40am-12:55pm, 517 Hamilton Hall

COVID-19 CORONAVIRUS: The policies set forth in this course are subject to change as we try to determine how best to keep you safe from the COVID-19 coronavirus while we provide the education we promised you.

Instructor: Xiaofei Shi xs2427[at]columbia(dot)edu
TAs: Zhanhao Zhang zz2760[at]columbia(dot)edu
    Yizi Zhang yz4123[at]columbia(dot)edu

Office Hours: Office hours will be held on Zoom (Meeting ID: 956 4866 1979, Passcode: 067214).
    Xiaofei Shi: Thu 2:30pm-3:30pm
    Zhanhao Zhang: Mon 11:00am-12:00pm
    Yizi Zhang: Tue 4:00pm-5:00pm
Course Description: This course is a survey of topics in applied statistical methods and the practical aspects of data analysis. Through a mix of lectures, student discussions, and assignments, the course will cover the various stages of modern data analysis pipelines, as well as other relevant applied learning topics.
Course Prerequisites: Exposure to foundational statistics and probability; a computing course that involved manipulating data; Linear Regression Models.
Textbook: References are available on the class schedule below.

Homework: There will be five homework assignments, approximately evenly spaced throughout the semester. The homework will be posted on CourseWork. We highly recommend using Piazza for discussion. We will use CourseWork for submitting and grading. Homeworks submitted after the deadline will not be considered, so please plan in advance. In the case of an emergency (sudden sickness, family problems, etc.), a reasonable extension will be assigned. But we emphasize that this is reserved for true emergencies.

Project: There will be one class project, which you must complete on your own.

Evaluation: 50% for the five homework assignments + 30% for the project + 20% for participation.

Schedule

Date | Topic | Readings | Note
Thu 09/09 | Introduction | |
Tue 09/14 | R Tutorial 1 | Code | HW1 out
Thu 09/16 | Data Collection | Sampling: Design and Analysis, Chapter 1-2.2 |
Tue 09/21 | Data Quality | Modeling Ideology and Predicting Policy Change with Social Media |
Thu 09/23 | Fairness, Accountability, and Transparency | References for Crowdsourcing | HW2 out
Tue 09/28 | HW1 Discussion | | HW1 due
Thu 09/30 | Regression Refresher | |
Tue 10/05 | R Tutorial 2 | |
Thu 10/07 | Crash in Bayesian Statistics | | HW3 out
Tue 10/12 | Modern Regression | | HW2 due
Thu 10/14 | HW2 Discussion | |
Tue 10/19 | Naive Bayes | |
Thu 10/21 | Boosting and Ensemble Methods | | HW4 out
Tue 10/26 | R Tutorial 3 | | HW3 due
Thu 10/28 | HW3 Discussion | |
Tue 11/02 | No Class | |
Thu 11/04 | Time Series Data | | HW5 out
Tue 11/09 | Spatial Data | | HW4 due
Thu 11/11 | HW4 Discussion | |
Tue 11/16 | Survival Data and the Issue of Censoring | |
Thu 11/18 | Missing Data | |
Tue 11/23 | No Class | |
Thu 11/25 | No Class | |
Tue 11/30 | HW5 Discussion | | HW5 due
Thu 12/02 | Causal Inference I | |
Tue 12/07 | Causal Inference II | |
Thu 12/09 | Project Session | |
Tue 12/14 | Study and Exam Days | |
Mon 12/20 | Study and Exam Days | | Final Project Due

Logistics

Policy on Collaboration: You are encouraged to work together on the homework. Discussing the homework problems with one another can be a valuable learning experience. However, it is a violation of the rules on academic integrity to copy another student's solution and submit it as your own. You should write up your solutions separately, without referring to a common document. Furthermore, you should not submit any work that you do not fully understand. You should be able to start with a clean sheet of paper and, without notes or assistance, write out the solution to any homework problem you submit. If you do that with every homework you submit, the similarity between your solutions and those of other students will not arouse suspicion. More importantly, you will be well prepared for the exams. You are not permitted to use homework solutions for this course from previous years or solutions you find from other sources, including the internet.

Take Care of Yourself: It is easy for me to say and hard for all of us, including me, to do, but taking care of your physical and mental health is essential, especially during the COVID-19 pandemic. Life is a marathon, and you need to pace yourself. Do your best to maintain a healthy lifestyle by eating well, exercising, avoiding drugs and alcohol, getting enough sleep, and taking some time to relax. This will help you achieve your goals and cope with stress.
If you or anyone you know experiences extreme academic stress, difficult life events, or feelings of anxiety or depression, I strongly encourage you to seek support. Counseling and Psychological Services is here to help 24/7, and everything will be confidential: call 212-854-2878 or visit the Counseling and Psychological Services website.
In addition, consider reaching out to a friend, faculty member, or family member you trust for help getting connected to support. Keep in mind that for serious psychological issues, the first counselor you meet with may not be the right one for you, but this does not mean you should give up on counseling. Keep looking for someone who can help you.
If you or someone you know is feeling suicidal or in danger of self-harm, call immediately, day or night:
    Counseling and Psychological Services: 212-854-2878
    If the situation is life-threatening, call the police:
    • On campus: Columbia Police: 212-854-2797
    • Off campus: 911
If you have questions about this advice, your coursework, or anything else about which I might be helpful, please let me know.