This course is a gentle yet thorough introduction to Data Science, Statistics and R using real-life examples.
Gentle, yet thorough: This course does not require a prior quantitative or mathematics background. It starts by introducing basic concepts such as the mean and the median, and eventually covers the main tasks of an analytics or data science role, from analysing and preparing raw data to visualising your findings.
Data Science, Statistics and R: This course is an introduction to Data Science and Statistics using the R programming language. It covers both the theory behind the statistical concepts and their practical implementation in R.
Real-life examples: Every concept is explained with the help of examples, case studies and, wherever necessary, source code in R. The examples cover a wide array of topics, ranging from A/B testing in an Internet company context to the Capital Asset Pricing Model in a quant finance context.
Yep! MBA graduates or business professionals who are looking to move to a heavily quantitative role.
Yep! Engineers who want to understand basic statistics and lay a foundation for a career in Data Science.
Yep! Analytics professionals who have mostly worked in Descriptive analytics and want to make the shift to being modelers or data scientists.
Yep! Folks who've worked mostly with tools like Excel and want to learn how to use R for statistical analysis.
Data Analysis with R: Data types and Data structures in R, Vectors, Arrays, Matrices, Lists, Data Frames, Reading data from files, Aggregating, Sorting & Merging Data Frames.
Linear Regression: Regression, Simple Linear Regression in Excel, Simple Linear Regression in R, Multiple Linear Regression in R, Categorical variables in regression, Robust regression, Parsing regression diagnostic plots.
Data Visualization in R: Line plot, Scatter plot, Bar plot, Histogram, Scatterplot matrix, Heat map, Packages for Data Visualization: RColorBrewer, ggplot2.
Descriptive Statistics: Mean, Median, Mode, IQR, Standard Deviation, Frequency Distributions, Histograms, Boxplots.
No prerequisites: We start from the basics and cover everything you need to know. We will install R and RStudio as part of the course and use them for most of the examples. Excel is used for one of the examples, and basic knowledge of Excel is assumed for it.
Lesson 1
Top Down vs Bottom Up : The Google vs McKinsey way of looking at data
Lesson 2
R and RStudio installed
Lesson 3
The 10-second answer : Descriptive Statistics - Descriptive Statistics : Mean, Median, Mode
Lesson 3
Our first foray into R : Frequency Distributions
Lesson 4
Draw your first plot : A Histogram
Lesson 5
Computing Mean, Median, Mode in R
Lesson 6
What is IQR (Inter-quartile Range)?
Lesson 7
Box and Whisker Plots
Lesson 8
The Standard Deviation
Lesson 9
Computing IQR and Standard Deviation in R
Lesson 10
Drawing inferences from data
Lesson 11
Random Variables are ubiquitous
Lesson 12
The Normal Probability Distribution
Lesson 13
Sampling is like fishing
Lesson 14
Sample Statistics and Sampling Distributions
Lesson 15
Case studies in Inferential Statistics - Case Study 1 : Football Players (Estimating Population Mean from a Sample)
Lesson 16
Case Study 2 : Election Polling (Estimating Population Proportion from a Sample)
Lesson 17
Case Study 3 : A Medical Study (Hypothesis Test for the Population Mean)
Lesson 18
Case Study 4 : Employee Behavior (Hypothesis Test for the Population Proportion)
Lesson 19
Case Study 5: A/B Testing (Comparing the means of two populations)
Lesson 20
Case Study 6: Customer Analysis (Comparing the proportions of 2 populations)
Lesson 21
Diving into R - Harnessing the power of R
Lesson 22
Assigning Variables
Lesson 23
Printing an output
Lesson 24
Numbers are of type numeric
Lesson 25
Characters and Dates
Lesson 26
Logicals
Lesson 27
Vectors - Data Structures are the building blocks of R
Lesson 28
Creating a Vector
Lesson 29
The Mode of a Vector
Lesson 30
Vectors are Atomic
Lesson 31
Doing something with each element of a Vector
Lesson 32
Aggregating Vectors
Lesson 33
Operations between vectors of the same length
Lesson 34
Operations between vectors of different lengths
Lesson 35
Generating Sequences
Lesson 36
Using conditions with Vectors
Lesson 37
Find the lengths of multiple strings using Vectors
Lesson 38
Generate a complex sequence (using recycling)
Lesson 39
Vector Indexing (using numbers)
Lesson 40
Vector Indexing (using conditions)
Lesson 41
Vector Indexing (using names)
Lesson 42
Creating an Array
Lesson 43
Indexing an Array
Lesson 44
Operations between 2 Arrays
Lesson 45
Operations between an Array and a Vector
Lesson 46
Outer Products
Lesson 47
A Matrix is a 2-Dimensional Array
Lesson 48
Creating a Matrix
Lesson 49
Matrix Multiplication
Lesson 50
Merging Matrices
Lesson 51
Solving a set of linear equations
Lesson 52
What is a factor?
Lesson 53
Find the distinct values in a dataset (using factors)
Lesson 54
Replace the levels of a factor
Lesson 55
Aggregate factors with table()
Lesson 56
Aggregate factors with tapply()
Lesson 57
Lists and Data Frames - Introducing Lists
Lesson 58
Introducing Data Frames
Lesson 59
Reading Data from files
Lesson 60
Indexing a Data Frame
Lesson 61
Aggregating and Sorting a Data Frame
Lesson 62
Merging Data Frames
Lesson 63
Introducing Regression
Lesson 64
What is Linear Regression?
Lesson 65
A Regression Case Study : The Capital Asset Pricing Model (CAPM)
Lesson 66
Linear Regression in Excel : Preparing the data
Lesson 67
Linear Regression in Excel : Using LINEST()
Lesson 68
Linear Regression in R : Preparing the data
Lesson 69
Linear Regression in R : lm() and summary()
Lesson 70
Multiple Linear Regression
Lesson 71
Adding Categorical Variables to a linear model
Lesson 72
Robust Regression in R : rlm()
Lesson 73
Parsing Regression Diagnostic Plots
Lesson 74
Data Visualization in R
Lesson 75
The plot() function in R
Lesson 76
Control color palettes with RColorBrewer
Lesson 77
Drawing barplots
Lesson 78
Drawing a heatmap
Lesson 79
Drawing a Scatterplot Matrix
Lesson 80
Plot a line chart with ggplot2
Loony Corn
Loonycorn is us: Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh. Between the four of us, we have studied at Stanford, IIM Ahmedabad and the IITs, and have spent years (decades, actually) working in tech in the Bay Area, New York, Singapore and Bangalore. Janani: 7 years at Google (New York, Singapore); studied at Stanford; also worked at Flipkart and Microsoft. Vitthal: also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too. Swetha: early Flipkart employee, IIM Ahmedabad and IIT Madras alum. Navdeep: longtime Flipkart employee too, and IIT Guwahati alum. We hope you will try our offerings, and think you'll like them.