Lessons: 81
Duration: 09:03:59
Language: English
Level: beginner
Access: lifetime

Statistics and Data Science in R

By: Loony Corn
Posted: 2017-02-16
Filmed by: Self

Overview

This course is a gentle yet thorough introduction to Data Science, Statistics and R using real-life examples.

Gentle, yet thorough: This course does not require a prior quantitative or mathematics background. It starts by introducing basic concepts such as the mean and median, and eventually covers all aspects of an analytics or data science career, from analysing and preparing raw data to visualising your findings.

Data Science, Statistics and R: This course is an introduction to Data Science and Statistics using the R programming language. It covers both the theory behind statistical concepts and their practical implementation in R.

Real-life examples: Every concept is explained with the help of examples, case studies and, wherever necessary, source code in R. The examples cover a wide array of topics, ranging from A/B testing in an Internet company context to the Capital Asset Pricing Model in a quant finance context.

Is this course for me?

Yep! MBA graduates or business professionals who are looking to move to a heavily quantitative role.

Yep! Engineers who want to understand basic statistics and lay a foundation for a career in Data Science.

Yep! Analytics professionals who have mostly worked in descriptive analytics and want to make the shift to being modelers or data scientists.

Yep! Folks who've worked mostly with tools like Excel and want to learn how to use R for statistical analysis.

What will I gain from this course?

Data Analysis with R: Datatypes and Data structures in R, Vectors, Arrays, Matrices, Lists, Data Frames, Reading data from files, Aggregating, Sorting & Merging Data Frames.

Linear Regression: Regression, Simple Linear Regression in Excel, Simple Linear Regression in R, Multiple Linear Regression in R, Categorical variables in regression, Robust regression, Parsing regression diagnostic plots.

Data Visualization in R: Line plot, Scatter plot, Bar plot, Histogram, Scatterplot matrix, Heat map, Packages for Data Visualization: RColorBrewer, ggplot2.

Descriptive Statistics: Mean, Median, Mode, IQR, Standard Deviation, Frequency Distributions, Histograms, Boxplots.

How do I prepare before taking this course? Is there a prerequisite skill set?

No prerequisites: We start from the basics and cover everything you need to know. We will install R and RStudio as part of the course and use them for most of the examples. Excel is used for one of the examples, and basic knowledge of Excel is assumed for that example.

  • Lesson 1

    Top Down vs Bottoms Up : The Google vs McKinsey way of looking at data

  • Lesson 2

    R and RStudio installed

  • Lesson 3

    The 10 second answer : Descriptive Statistics - Descriptive Statistics : Mean, Median, Mode

  • Lesson 3

    Our first foray into R : Frequency Distributions

  • Lesson 4

    Draw your first plot : A Histogram

  • Lesson 5

    Computing Mean, Median, Mode in R

  • Lesson 6

    What is IQR (Inter-quartile Range)?

  • Lesson 7

    Box and Whisker Plots

  • Lesson 8

    The Standard Deviation

  • Lesson 9

    Computing IQR and Standard Deviation in R

  • Lesson 10

    Drawing inferences from data

  • Lesson 11

    Random Variables are ubiquitous

  • Lesson 12

    The Normal Probability Distribution

  • Lesson 13

    Sampling is like fishing

  • Lesson 14

    Sample Statistics and Sampling Distributions

  • Lesson 15

    Case studies in Inferential Statistics - Case Study 1 : Football Players (Estimating Population Mean from a Sample)

  • Lesson 16

    Case Study 2 : Election Polling (Estimating Population Proportion from a Sample)

  • Lesson 17

    Case Study 3 : A Medical Study (Hypothesis Test for the Population Mean)

  • Lesson 18

    Case Study 4 : Employee Behavior (Hypothesis Test for the Population Proportion)

  • Lesson 19

    Case Study 5: A/B Testing (Comparing the means of two populations)

  • Lesson 20

    Case Study 6: Customer Analysis (Comparing the proportions of 2 populations)

  • Lesson 21

    Diving into R - Harnessing the power of R

  • Lesson 22

    Assigning Variables

  • Lesson 23

    Printing an output

  • Lesson 24

    Numbers are of type numeric

  • Lesson 25

    Characters and Dates

  • Lesson 26

    Logicals

  • Lesson 27

    Vectors - Data Structures are the building blocks of R

  • Lesson 28

    Creating a Vector

  • Lesson 29

    The Mode of a Vector

  • Lesson 30

    Vectors are Atomic

  • Lesson 31

    Doing something with each element of a Vector

  • Lesson 32

    Aggregating Vectors

  • Lesson 33

    Operations between vectors of the same length

  • Lesson 34

    Operations between vectors of different length

  • Lesson 35

    Generating Sequences

  • Lesson 36

    Using conditions with Vectors

  • Lesson 37

    Find the lengths of multiple strings using Vectors

  • Lesson 38

    Generate a complex sequence (using recycling)

  • Lesson 39

    Vector Indexing (using numbers)

  • Lesson 40

    Vector Indexing (using conditions)

  • Lesson 41

    Vector Indexing (using names)

  • Lesson 42

    Creating an Array

  • Lesson 43

    Indexing an Array

  • Lesson 44

    Operations between 2 Arrays

  • Lesson 45

    Operations between an Array and a Vector

  • Lesson 46

    Outer Products

  • Lesson 47

    A Matrix is a 2-Dimensional Array

  • Lesson 48

    Creating a Matrix

  • Lesson 49

    Matrix Multiplication

  • Lesson 50

    Merging Matrices

  • Lesson 51

    Solving a set of linear equations

  • Lesson 52

    What is a factor?

  • Lesson 53

    Find the distinct values in a dataset (using factors)

  • Lesson 54

    Replace the levels of a factor

  • Lesson 55

    Aggregate factors with table()

  • Lesson 56

    Aggregate factors with tapply()

  • Lesson 57

    Lists and Data Frames - Introducing Lists

  • Lesson 58

    Introducing Data Frames

  • Lesson 59

    Reading Data from files

  • Lesson 60

    Indexing a Data Frame

  • Lesson 61

    Aggregating and Sorting a Data Frame

  • Lesson 62

    Merging Data Frames

  • Lesson 63

    Introducing Regression

  • Lesson 64

    What is Linear Regression?

  • Lesson 65

    A Regression Case Study : The Capital Asset Pricing Model (CAPM)

  • Lesson 66

    Linear Regression in Excel : Preparing the data

  • Lesson 67

    Linear Regression in Excel : Using LINEST()

  • Lesson 68

    Linear Regression in R : Preparing the data

  • Lesson 69

    Linear Regression in R : lm() and summary()

  • Lesson 70

    Multiple Linear Regression

  • Lesson 71

    Adding Categorical Variables to a linear model

  • Lesson 72

    Robust Regression in R : rlm()

  • Lesson 73

    Parsing Regression Diagnostic Plots

  • Lesson 74

    Data Visualization in R

  • Lesson 75

    The plot() function in R

  • Lesson 76

    Control color palettes with RColorBrewer

  • Lesson 77

    Drawing barplots

  • Lesson 78

    Drawing a heatmap

  • Lesson 79

    Drawing a Scatterplot Matrix

  • Lesson 80

    Plot a line chart with ggplot2


Loony Corn

Loonycorn is us: Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh. Between the four of us, we have studied at Stanford, IIM Ahmedabad and the IITs, and have spent years (decades, actually) working in tech in the Bay Area, New York, Singapore and Bangalore. Janani: 7 years at Google (New York, Singapore); studied at Stanford; also worked at Flipkart and Microsoft. Vitthal: also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too. Swetha: early Flipkart employee, IIM Ahmedabad and IIT Madras alum. Navdeep: longtime Flipkart employee too, and IIT Guwahati alum. We hope you will try our offerings, and think you'll like them.
