Instructor Email Role
Xiling Shen Professor
Michael Gao Instructor

Class Times:

Tuesdays and Thursdays 3:30-4:45PM

Class Location:

ONLINE ONLY. This course will be held entirely online. DukeHub may list a location for in-person instruction, but please ignore this. You will be able to reach the Zoom link via Sakai, and a passcode will be provided.

Office Hours:

Times for office hours will be announced at a later date. Office hours are also available by appointment.

Course Logistics

Prerequisites

The prerequisites for this course are an introductory computer science course and an introductory statistics/probability course. For undergraduate students, the expected background is a 100- or 200-level computer science course and STAT 111 or higher. These prerequisites are strongly encouraged. However, self-motivated learners may be able to learn the necessary skills to take this course, and we will spend 1 week reviewing core concepts. If you do not have these prerequisites, it is up to you to spend the extra time picking up the skills necessary to follow the course content.

An understanding of basic calculus is required. We will mostly be discussing this in the context of finding maxima and minima and in some aspects of probability theory.

Required Text

All texts used in this course will be provided. There are no required textbooks.

Software

We will be using GitHub as the primary way to submit programming assignments in this class. If you do not have a GitHub account, please create one at http://www.github.com.

You can find all of the installation instructions for software necessary for the course here:

Communications

Communication will be done primarily through Sakai. We will also send email updates about important announcements. In addition, we will have a Slack channel where you can ask questions and discuss course material with classmates. The link to the Slack for this semester will be posted in Sakai.

Sakai will be used for assignment submission (in part - click on the ‘How to Submit Homework’ tab to learn more) and grade management. Piazza will be a place to have discussions and ask questions of the instructors/TAs.

Grading

The final grade for the class will be determined from the following constituents:

  • 50% Problem Sets / Programming Assignments
  • 20% Midterm Project
  • 30% Final Project

We reserve the right to modify this slightly as the course plays out. If you have any concerns about your grade, please do not hesitate to reach out to one of us and we will address them.

Late Work Policy

Late assignments will be penalized 10% if turned in within 24 hours after the due date and 20% if turned in between 24 and 48 hours after the due date. Assignments will not be accepted after 48 hours.

Academic Integrity

I expect all students to adhere to the Duke Community Standard (https://studentaffairs.duke.edu/conduct/about-us/duke-community-standard). While I encourage group work and group studying, you must turn in your own work. Copying homework solutions or another student’s work will result in a grade of 0 for the assignment. This class has group assignments for the final project. If you feel a team member is not contributing, please let me know.

Problem Sets/Programming Assignments

Each of the skills that we will be covering throughout this course will have an associated problem set or programming assignment (or combined assignment). The purpose of these exercises is to reinforce material taught in lecture as well as explore applications that will not be covered in the lectures. These assignments will be assigned as we reach topics throughout the course.

Midterm Project

The midterm project will be focused on clinical risk prediction algorithms. You will be implementing your own classifier along with the associated solver. Then, you will apply this algorithm to the MIMIC-III dataset to create your own clinical risk algorithm. You will then compare its performance to other well-known risk prediction models and write a report explaining the differences. More information will be released at a later date.

Final Project

The final project will make use of all of the skills that you will learn throughout the course. You will write a proposal about a data science question that you want to answer using a clinical dataset and then step through the entire data science process including a verbal presentation to the class. More information about this project will be released later in the semester.

Course Schedule

This course schedule is subject to change, but we will notify everyone in advance of any changes. Please note that the source of truth for due dates is Sakai. Check Sakai under the assignments tab to make sure you have the correct due date.



Copyright © 2021 Michael Gao. All rights reserved.