Data Science

Harvard University Online Course Highlights
  • 2-4 months long
  • 102 – 184 hours of effort
  • Free to learn, upgradable
  • Self-Paced
  • Taught by: Rafael Irizarry, Professor of Biostatistics

Online Course Details:

The demand for skilled data science practitioners in industry, academia, and government is rapidly growing. The HarvardX Data Science program prepares you with the necessary knowledge base and useful skills to tackle real-world data analysis challenges. The program covers concepts such as probability, inference, regression, and machine learning and helps you develop an essential skill set that includes R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with Unix/Linux, version control with git and GitHub, and reproducible document preparation with RStudio.

In each course, we use motivating case studies, ask specific questions, and learn by answering these through data analysis. Case studies include: Trends in World Health and Economics, US Crime Rates, The Financial Crisis of 2007-2008, Election Forecasting, Building a Baseball Team (inspired by Moneyball), and Movie Recommendation Systems.

Throughout the program, we will be using the R software environment. You will learn R, statistical concepts, and data analysis techniques simultaneously. We believe that you retain R knowledge better when you learn it by solving a specific problem.


Data Science: R Basics
  • 1–2 hours per week, for 8 weeks
  • Build a foundation in R and learn how to wrangle, analyze, and visualize data.
Data Science: Visualization
  • 1–2 hours per week, for 8 weeks
  • Learn basic data visualization principles and how to apply them using ggplot2.
Data Science: Probability
  • 1–2 hours per week, for 8 weeks
  • Learn probability theory, essential for a data scientist, using a case study on the financial crisis of 2007–2008.
Data Science: Inference and Modeling
  • 1–2 hours per week, for 8 weeks
  • Learn inference and modeling, two of the most widely used statistical tools in data analysis.
Data Science: Productivity Tools
  • 1–2 hours per week, for 8 weeks
  • Keep your projects organized and produce reproducible reports using GitHub, git, Unix/Linux, and RStudio.
Data Science: Wrangling
  • 1–2 hours per week, for 8 weeks
  • Learn to process and convert raw data into formats needed for analysis.
Data Science: Linear Regression
  • 1–2 hours per week, for 8 weeks
  • Learn how to use R to implement linear regression, one of the most common statistical modeling approaches in data science.
Data Science: Machine Learning
  • 2–4 hours per week, for 8 weeks
  • Build a movie recommendation system and learn the science behind one of the most popular and successful data science techniques.
Data Science: Capstone
  • 15–20 hours per week, for 2 weeks
  • Show what you’ve learned from the Professional Certificate Program in Data Science.

What you will learn:

  • Fundamental R programming skills
  • Statistical concepts such as probability, inference, and modeling, and how to apply them in practice
  • Experience with the tidyverse, including data visualization with ggplot2 and data wrangling with dplyr
  • Familiarity with essential tools for practicing data scientists, such as Unix/Linux, git and GitHub, and RStudio
  • How to implement machine learning algorithms
  • In-depth knowledge of fundamental data science concepts through motivating real-world case studies
