    Main Content:

  • Module 1: Programming concepts in R or Python (candidate's choice)
  • Module 2: Advanced Data Analytics concepts using R or Python
  • Module 3: Basic to Advanced Statistics concepts using R or Python
  • Module 4: Data Visualization using R or Python
  • Module 5: Machine Learning
  • Project

Detailed Structure: We have divided our course structure into 3 parts:

    Part A: Programming Concepts in R or Python
    Part B: Data Analytics
    Part C: Data Science

Part A: Programming concepts in R or Python
  • R Overview and setup
  • Syntax
  • Data types - vectors, lists, matrices, arrays, factors, data frames
  • R Variables
  • R Operators - arithmetic, relational, logical, assignment and miscellaneous
  • Decision making & loops
  • Functions in R
  • Reading from files (CSV, Excel)
  • Working with databases
  • Data Visualization
    • Overview of R Packages used for Visualization
    • Basic and Advanced Plots. ggplot2, Interactive Plots: Multiple Graphs, Titles, Axes, Labels, Color Palettes, vcd, Tableplot, Googlecharts, Shiny, rCharts, D3.js
Part B: Data Analytics concepts using R or Python
    • 1. What is Data Science and why study it
      • Tools and Techniques in Data Science
      • Applications of Data Science
    • 2. Getting to know your data, data cleansing and data transformation using R
    • 3. Standard linear regression using R
      • Building a linear model
      • Performance tuning of linear model
      • Building a polynomial model
      • Ridge Regression
      • Penalty-based variable selection in regression models with many parameters (Lasso)
      • Case study
    • 4. Logistic regression using R
      • Building a linear model for binary response data
      • Interpretation of regression coefficients
      • Classification of new cases
      • Building a polynomial model for binary response data
      • Performance tuning of model
      • Multinomial logistic regression
      • Case study
    • 5. Clustering using R
      • K-Means clustering
      • K-Medoids clustering
      • Hierarchical clustering procedures
      • Density based clustering procedures
      • Case study
    • 6. Time series analysis
      • Reading, plotting and decomposing time series data
      • Forecasting using exponential smoothing
      • Case study
    • 7. Market basket analysis using R
Part C: Basic to Advanced Statistics concepts
    • Basic Probability
      • Statistical Terminology and Basic Notations
      • Importance of Data and Numbers, with domain-specific examples
      • Measures of Central Tendency & Measures of Dispersion
      • Variance and its importance across the business
      • Legendre's Least Squares Principle
      • Scatter Diagram and Data points distribution
      • Trend lines and Trend Pattern Discussion
      • Outlier and Missing Value Treatment Analysis
      • Central Limit Theorem
    • Probability Terminology and Notations
      • Sample Space, Events and Experiments
      • Probability Rules & Probability Types
      • Bayes' Theorem & Error Matrix
      • Probability scores and their importance in the banking domain
      • Discussion of churn probability
      • Significance of p-values in model outputs
    • Understanding Distributions
      • Discrete and Continuous distributions
      • Binomial distribution & Poisson distribution
      • Exponential distribution & t-distribution
      • Normal/Gaussian distribution
      • Concepts of confidence intervals
      • Industry examples for understanding the distributions
    • Advanced Statistical Concepts
      • Theory of Hypothesis Testing
      • Small Sample and Large Sample Tests (t and Chi-square tests)
      • ANOVA (One-way and Two-way)
      • Explanation of F-tests and Z-tests in summary outputs
      • Theory of Association
      • Bivariate and Multivariate Analysis
      • Importance of Linearity
      • Correlation (positive, negative and other types of correlation)
      • Regression Theory and Assumptions
      • Exploratory Data Analysis
Module 5: Machine Learning
    • Artificial Neural Networks
    • Deep Learning
    • Support Vector Machines and Kernel Regression
    • Hidden Markov Models
    • Decision Trees
    • Ensemble Methods: Bagging, Random Forest, Boosting (Gradient Boosting, XGBoost, AdaBoost)
    • Regularization models
    • Feature selection


    • Advanced topics taught by an IIT professor
    • Personalized training programs
    • Project work covered
    • Practice assignments

