Project Description
Data Science Online Training
DURATION | MODE OF TRAINING | LEVEL |
---|---|---|
16 Weeks | Online | Advanced |
Overview
Sieve Software's Data Science online training course helps you master data analysis, statistical computing with R, connecting R with the Hadoop framework, Machine Learning algorithms, time-series analysis, K-Means clustering, Naïve Bayes, business analytics, and more. In this Data Science online course and certification, you will gain hands-on experience by working on several real-life projects in domains such as banking, finance, entertainment, and e-commerce. Get the best online Data Science training from top data scientists!
Data Science Course Description
Why Data Science Course?
- Data is an integral part of all businesses and economies. All businesses draw insights and tangible actions from data.
- Data Scientist salaries are 113% higher than the average salary for all job postings, according to Indeed.com.
- By 2020, the global estimate of Data Science jobs will reach 2.7 million.
Why Data Science Training At Sieve Software?
- Sieve Software offers one of the best Data Science online training programs in Hyderabad, with a comprehensive course curriculum.
- Build your practical knowledge and confidence with quizzes, assignments, competitions, and hackathons in our hands-on Data Science training.
- Data Science online training in Hyderabad at Sieve Software makes you industry-ready with coaching sessions, interview preparation advice, resume guidance, and 1-1 mentoring.
Data Science Course Curriculum
Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. In this first module we will introduce you to the field of Data Science and how it relates to other fields such as Artificial Intelligence, Machine Learning and Deep Learning.
Introduction to Data Science
High level view of Data Science, Artificial Intelligence & Machine Learning
Subtle differences between Data Science, Machine Learning & Artificial Intelligence
Approaches to Machine Learning
Terms & Terminologies of Data Science
Understanding an end to end Data Science Pipeline, Implementation cycle
Mathematics is very important in the field of data science, as mathematical concepts aid in identifying patterns and assist in creating algorithms. An understanding of Statistics and Probability Theory is key to implementing such algorithms in data science.
Linear Algebra
Matrices, Matrix Operations
Eigenvalues, Eigenvectors
Scalar, Vector and Tensors
Prior and Posterior Probability
Conditional Probability
Calculus
Differentiation, Gradient and Cost Functions
Graph Theory
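To illustrate how differentiation, gradients and cost functions fit together, here is a minimal sketch of gradient descent on a one-variable quadratic cost. The cost function, learning rate and step count are illustrative assumptions, not part of the course material.

```python
def cost(w):
    """Illustrative quadratic cost with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the cost with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the cost."""
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


w_min = gradient_descent(start=0.0)
```

Each step shrinks the distance to the minimum by a constant factor, so `w_min` ends up very close to 3. A learning rate that is too large would make the steps overshoot instead.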
This module focuses on understanding statistical concepts required for Data Science, Machine Learning and Deep Learning. In this module, you will be introduced to the estimation of various statistical measures of a data set, simulating random distributions, performing hypothesis testing, and building statistical models.
Descriptive Statistics
Types of Data (Discrete vs Continuous)
Types of Data (Nominal, Ordinal)
Measures of Central Tendency (Mean, Median, Mode)
Measures of Dispersion (Variance, Standard Deviation)
Range, Quartiles, Inter Quartile Ranges
Measures of Shape (Skewness and Kurtosis)
Tests for Association (Correlation and Regression)
Random Variables
Probability Distributions
Standard Normal Distribution
Probability Distribution Function
Probability Mass Function
Cumulative Distribution Function
Inferential Statistics
Statistical sampling & Inference
Hypothesis Testing
Null and Alternate Hypothesis
Margin of Error
Type I and Type II errors
One Sided Hypothesis Test, Two-Sided Hypothesis Test
Tests of Inference: Chi-Square, T-test, Analysis of Variance
t-value and p-value
Confidence Intervals
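As a small illustration of the descriptive and inferential topics above, the sketch below computes a few summary statistics and a one-sample t-value using only the Python standard library. The sample values and the hypothesized mean of 12 are made up for the example.

```python
import math
import statistics

sample = [12.0, 15.0, 11.0, 14.0, 13.0, 15.0, 18.0, 14.0]  # made-up data

mean = statistics.mean(sample)
median = statistics.median(sample)
std_dev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# One-sample t-value for the null hypothesis "population mean = 12".
hypothesized_mean = 12.0
standard_error = std_dev / math.sqrt(len(sample))
t_value = (mean - hypothesized_mean) / standard_error
```

To finish the test, the t-value would be compared against a t-distribution with `len(sample) - 1` degrees of freedom to obtain a p-value. That step is left out here because it needs a statistics library beyond the standard one.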
Python for Data Science
Numpy
Pandas
Matplotlib & Seaborn
Jupyter Notebook
Numpy
NumPy is a Python library for working with arrays in scientific computing. Explore how to initialize and load data into arrays and learn about basic array manipulation operations using NumPy.
Loading data with Numpy
Comparing Numpy with Traditional Lists
Numpy Data Types
Indexing and Slicing
Copies and Views
Numerical Operations with Numpy
Matrix Operations on Numpy Arrays
Aggregation functions
Shape Manipulations
Broadcasting
Statistical operations using Numpy
Resize, Reshape, Ravel
Image Processing with Numpy
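A few of the NumPy topics listed above (reshaping, broadcasting, copies versus views, and aggregation) can be sketched in a handful of lines. The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Broadcasting: the 1-D row is stretched across both rows of the matrix.
row = np.array([10, 20, 30])
shifted = matrix + row                # [[10, 21, 32], [13, 24, 35]]

# Basic indexing returns a view, so writing to it changes the original array.
view = matrix[0]
view[0] = 99

# copy() returns independent data, so the original is left untouched.
independent = matrix.copy()
independent[1, 1] = -1

column_means = matrix.mean(axis=0)    # aggregate down each column
flat = matrix.ravel()                 # flatten to 1-D
```

After these lines run, `matrix[0, 0]` is 99 because of the write through the view, while `matrix[1, 1]` is still 4 because `independent` is a separate copy.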
Pandas
Pandas is a Python library that provides utilities to deal with structured data stored in the form of rows and columns. Discover how to work with series and tabular data, including initialization, population, and manipulation of Pandas Series and DataFrames.
Basics of Pandas
Loading data with Pandas
Series
Operations on Series
DataFrames and Operations of DataFrames
Selection and Slicing of DataFrames
Descriptive statistics with Pandas
Map, Apply, Iterations on Pandas DataFrame
Working with text data
Multi Index in Pandas
GroupBy Functions
Merging, Joining and Concatenating DataFrames
Visualization using Pandas
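The sketch below touches several of the Pandas topics above: building DataFrames, selecting rows, merging, grouping, and `apply`. The tables, column names and values are invented for the example.

```python
import pandas as pd

# Made-up data for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "amount": [250.0, 100.0, 50.0, 75.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "city": ["Hyderabad", "Pune", "Hyderabad"],
})

# Selection: rows where the order amount is at least 75.
large_orders = orders[orders["amount"] >= 75.0]

# Merge the two tables on their shared key, then aggregate per city.
merged = orders.merge(customers, on="customer_id", how="left")
total_by_city = merged.groupby("city")["amount"].sum()

# apply() runs a function on every value of a Series.
merged["is_large"] = merged["amount"].apply(lambda amount: amount >= 100)
```

Here `total_by_city` is a Series indexed by city, so `total_by_city["Hyderabad"]` gives the summed order amount for that city.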
Data Visualization using Matplotlib
Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK+.
Anatomy of Matplotlib figure
Plotting Line plots with labels and colors
Adding markers to line plots
Histogram plots
Scatter plots
Size, Color and Shape selection in Scatter plots.
Applying Legend to Scatter plots
Displaying multiple plots using subplots
Boxplots, scatter_matrix and Pair plots
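As a rough sketch of the object-oriented API mentioned above, the example below draws a line plot and a scatter plot side by side using subplots. The data, colors and marker sizes are arbitrary choices, and the non-interactive `Agg` backend is selected only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# One figure holding two side-by-side axes.
fig, (ax_line, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with color, markers, axis labels and a legend.
ax_line.plot(x, y, color="tab:blue", marker="o", label="y = x squared")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.legend()

# Scatter plot where each point has its own marker size.
ax_scatter.scatter(x, y, s=[20, 40, 60, 80, 100], c="tab:red", marker="^")
ax_scatter.set_title("Scatter with varying marker size")

fig.tight_layout()
```

Calling `fig.savefig("plots.png")` afterwards would write the figure to disk. The file name is just an example.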
Data Visualization using Seaborn
Seaborn is a data visualization library that provides a high-level interface for drawing graphs. These graphs are able to convey a lot of information, while also being visually appealing.
Basic Plotting using Seaborn
Violin Plots
Box Plots
Cat Plots
Facet Grid
Swarm Plot
Pair Plot
Bar Plot
LM Plot
Variations in LM plot using hue, markers, row and col
Exploratory Data Analysis (EDA) helps identify patterns in data using basic statistical methods and visualization tools that display graphs and charts. With EDA we can assess the distribution of the data and decide which models are suitable.
Pipeline ideas
Exploratory Data Analysis
Feature Creation
Evaluation Measures
Data Analytics Cycle ideas
Data Acquisition
Data Preparation
Data cleaning
Data Visualization
Plotting
Model Planning & Model Building
Data Inputting
Reading and writing data to text files
Reading data from a csv
Reading data from JSON
Data preparation
Selection and Removal of Columns
Transform
Rescale
Standardize
Normalize
Binarize
One hot Encoding
Imputing
Train, Test Splitting
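Some of the data preparation steps listed above can be sketched directly with NumPy. This is a minimal illustration on made-up values. In practice a library such as scikit-learn provides ready-made utilities for the same steps.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

ages = np.array([20.0, 30.0, 40.0, 50.0])  # made-up numeric column

# Rescale (min-max) to the [0, 1] range.
rescaled = (ages - ages.min()) / (ages.max() - ages.min())

# Standardize to zero mean and unit variance.
standardized = (ages - ages.mean()) / ages.std()

# One-hot encode a categorical column: one 0/1 column per category.
colors = np.array(["red", "green", "red", "blue"])
categories = np.unique(colors)                       # sorted: blue, green, red
one_hot = (colors[:, None] == categories).astype(int)

# Shuffle the row indices, then hold out 25% of the rows for testing.
indices = rng.permutation(len(ages))
test_size = int(0.25 * len(ages))
test_idx, train_idx = indices[:test_size], indices[test_size:]
```

Splitting by shuffled indices keeps the features and labels aligned, since the same `train_idx` and `test_idx` can be used to index every column.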
In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. This module on Machine Learning is a deep dive into Supervised and Unsupervised learning and Gaussian / Naive-Bayes methods. You will also be exposed to different classification, clustering and regression methods.
Introduction to Machine Learning
Applications of Machine Learning
Supervised Machine Learning
Classification
Regression
Unsupervised Machine Learning
Reinforcement Learning
Latest advances in Machine Learning
Model Representation
Model Evaluation
Hyper Parameter tuning of Machine Learning Models.
Evaluation of ML Models.
Estimation and Prediction with Machine Learning Models
Deployment strategy of ML Models.
Supervised learning is one of the most popular techniques in machine learning. In this module, you will learn about more complicated supervised learning models and how to use them to solve problems.
Classification methods & respective evaluation
K Nearest Neighbors
Decision Trees
Naive Bayes
Stochastic Gradient Descent
SVM
Linear
Non linear
Radial Basis Function
Random Forest
Gradient Boosting Machines
XGBoost
Logistic regression
Ensemble methods
Combining models
Bagging
Boosting
Voting
Choosing best classification method
Model Tuning
Train Test Splitting
K-fold cross validation
Bias-variance tradeoff
L1 and L2 norm
Overfitting and underfitting, with learning curves and bias-variance sensitivity shown using graphs
Hyper Parameter Tuning using Grid Search CV
Respective Performance measures
Different Errors (MAE, MSE, RMSE)
Accuracy, Confusion Matrix, Precision, Recall
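The classification measures listed above all derive from the four cells of a binary confusion matrix. The sketch below computes them in plain Python. The labels and predictions are made up, and the helper name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # made-up predictions

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)   # share of all predictions that are correct
precision = tp / (tp + fp)           # share of predicted positives that are correct
recall = tp / (tp + fn)              # share of actual positives that were found
```

Precision and recall answer different questions, which is why both are usually reported alongside accuracy, especially when the classes are imbalanced.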
Regression is a predictive modelling technique widely used to derive the relationship between dependent and independent variables. It is used mostly in forecasting, time series modelling, and finding cause-and-effect relationships between variables. This module discusses regression, its types, and their usage and applicability in detail.
Regression
Linear Regression
Variants of Regression
Lasso
Ridge
Multi Linear Regression
Logistic Regression (effectively, classification only)
Regression Model Improvement
Polynomial Regression
Random Forest Regression
Support Vector Regression
Respective Performance measures
Different Errors (MAE, MSE, RMSE)
Mean Absolute Error
Mean Square Error
Root Mean Square Error
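To make the regression topics and error measures above concrete, here is a minimal sketch that fits a straight line by ordinary least squares with NumPy and then computes MAE, MSE and RMSE. The data points are invented for the example.

```python
import numpy as np

# Made-up data that follows y = 2x + 1 exactly, except for the last point.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 10.0])

# Fit y ~ slope * x + intercept by ordinary least squares.
design = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(design, y, rcond=None)

predictions = slope * x + intercept
errors = y - predictions

mae = np.mean(np.abs(errors))   # Mean Absolute Error
mse = np.mean(errors ** 2)      # Mean Square Error
rmse = np.sqrt(mse)             # Root Mean Square Error
```

RMSE is in the same units as `y`, which often makes it easier to interpret than MSE. Both penalize large errors more heavily than MAE does.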
Unsupervised learning can provide powerful insights on data without the need to annotate examples. In this module, you will learn several different techniques in unsupervised machine learning.
Clustering
K means
Hierarchical Clustering
DBSCAN
Association Rule Mining
Association Rule Mining.
Market Basket Analysis using Apriori Algorithm
Dimensionality reduction using Principal Component analysis (PCA)
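As an illustration of clustering without labels, the sketch below implements a plain version of K-means in NumPy. The function name, the random initialization, the fixed iteration count and the sample points are all assumptions made for this example, not a reference implementation.

```python
import numpy as np


def k_means(points, k, iterations=10, seed=0):
    """Plain k-means: alternate between assigning points and moving centroids."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Distance from every point to every centroid, shape (n_points, k).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids


# Two well-separated, made-up groups of 2-D points.
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]])
labels, centroids = k_means(points, k=2)
```

The first three points end up in one cluster and the last three in the other. Which cluster is numbered 0 depends on the random initialization.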
Natural language is essential to human communication, which makes the ability to process it an important one for computers. In this module, you will be introduced to natural language processing and some of the basic tasks.
Text Analytics
Stemming, Lemmatization and Stop word removal.
POS tagging and Named Entity Recognition
Bigrams, N-grams and collocations
Term Document Matrix
Count Vectorizer
Term Frequency and TF-IDF
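Term frequency and TF-IDF can be sketched in plain Python as follows. This uses the basic unsmoothed formula, tf × log(N / df), on a made-up three-document corpus with whitespace tokenization. Libraries often apply smoothing or normalization, so their numbers will differ.

```python
import math
from collections import Counter

documents = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]  # made-up corpus

tokenized = [doc.split() for doc in documents]

# Document frequency: in how many documents each term appears.
document_frequency = Counter(term for tokens in tokenized for term in set(tokens))


def tf_idf(tokens):
    """Return {term: tf-idf weight} for one tokenized document."""
    counts = Counter(tokens)
    weights = {}
    for term, count in counts.items():
        tf = count / len(tokens)
        idf = math.log(len(tokenized) / document_frequency[term])
        weights[term] = tf * idf
    return weights


weights_doc0 = tf_idf(tokenized[0])
```

In the first document, "cat" gets a higher weight than "the" even though "the" occurs twice, because "the" also appears in another document and is therefore less distinctive.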
Advanced Analytics covers various areas like Time series Analysis, ARIMA models, Recommender systems etc.
Time series
Time series Analysis.
ARIMA example
Recommender Systems
Content Based Recommendation
Collaborative Filtering
Reinforcement learning is an area of Machine Learning concerned with taking suitable actions to maximize reward in a particular situation. It is employed by various software and machines to find the best possible behavior or path to take in a specific situation.
Basic concepts of Reinforcement Learning
Action
Reward
Penalty Mechanism
Feedback loop
Deep Q Learning
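The action, reward, penalty and feedback-loop ideas above can be sketched with tabular Q-learning, the simpler relative of Deep Q Learning in which a lookup table takes the place of a neural network. The five-state corridor environment, the reward values and the hyperparameters below are all invented for the example.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = [-1, +1]    # move left or move right


def step(state, action):
    """Environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True       # reward for reaching the goal
    return next_state, -0.01, False        # small penalty for every other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Feedback loop: move the estimate toward reward + discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states: the action with the highest Q-value.
policy = [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q_table[:-1]]
```

After training, the greedy policy moves right in every non-terminal state, which is the shortest path to the goal in this toy environment.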
Artificial intelligence (AI) is the ability of a computer program or a machine to think and learn. It is also a field of study that tries to make computers "smart".
Artificial Neural Networks
Neural Networks & terminologies
Non linearity problem, illustration
Perceptron learning
Feed Forward Network and Back propagation
Gradient Descent
Mathematics of Artificial Neural Networks
Gradients
Partial derivatives
Linear algebra
Li
LD
Eigen vectors
Projections
Vector quantization
Overview of tools used in Neural Networks
TensorFlow
Keras
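Perceptron learning, listed above, can be illustrated without any framework. The sketch below trains a single perceptron with a step activation on the logical AND function. The learning rate and epoch count are arbitrary choices for the example.

```python
def train_perceptron(samples, labels, learning_rate=0.1, epochs=20):
    """Perceptron learning rule for two inputs and a step activation."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in zip(samples, labels):
            output = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
            error = target - output
            # Nudge weights and bias in the direction that reduces the error.
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, x1, x2):
    """Apply the trained perceptron to one input pair."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0


# Logical AND is linearly separable, so a single perceptron can learn it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x1, x2) for x1, x2 in samples]
```

The same code cannot learn XOR, because no single straight line separates its classes. That is the non-linearity problem that motivates multi-layer networks and back propagation.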
Deep learning is part of a broader family of machine learning methods based on the layers used in artificial neural networks. In this module, you’ll dive deep into the concepts of Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Auto Encoders and many more.
Deep Learning
TensorFlow & Keras installation
More elaborate discussion on cost function
Measuring accuracy of hypothesis function
Role of gradient function in minimizing cost function
Explicit discussion of Bayes models
Hidden Markov Models (HMM)
Optimization basics
Sales Prediction of a Gaming company using Neural Networks
Build an Image similarity engine.
Deep Learning with Convolutional Neural Nets
Architecture of CNN
Types of layers in CNN
Different Filters and Kernels
Building an Image classifier with and without CNN
Recurrent neural nets
Fundamental notions & ideas
Recurrent neurons
Handling variable length sequences
Training a sequence classifier
Training to predict Time series
Cloud computing is massively growing in importance in the IT sector as more and more companies are eschewing traditional IT and moving applications and business processes to the cloud. This section covers detailed information about how to deploy Data Science models on Cloud environments.
Topics
Introduction to Cloud Computing
Amazon Web Services Preliminaries – S3, EC2, RDS
Big data processing on AWS using Elastic Map Reduce (EMR)
Machine Learning using Amazon SageMaker
Deep Learning on AWS Cloud
Natural Language processing using AWS Lex
Analytics services on AWS Cloud
Data Warehousing on AWS Cloud
Creating Data Pipelines on AWS Cloud
DevOps plays a pivotal role in bridging the gap between Development and Operations teams. This section covers key DevOps tools that a Data Scientist needs to be aware of for day-to-day data science work.
Topics
Introduction to DevOps for Data Science
Tasks in Data Science Development
Deploying Models in Production
Deploying Machine Learning Models as Services
Running Machine Learning Services in Containers
Scaling ML Services with Kubernetes