uCertify

DAT-650: Advanced Data Analytics

Name: DAT-650: Advanced Data Analytics
Brand: uCertify
SKU: 978-1-61691-081-5
Price: 279.99 USD

(SNHU-DAT650.AE1) / ISBN : 978-1-61691-081-5

This course includes:

Free pre-assessment and first 2 lessons

12+ Interactive Lessons

Accessible on mobile and tablet too

Certificate of completion

Are you an instructor?

Access detailed information about the course content, learning objectives, activities, and assessments before adding it to your curriculum.

Skills You’ll Get

Interactive Lessons

12+ Interactive Lessons

Gamified TestPrep

Hands-On Labs

19+ LiveLab | 15+ Video tutorials | 31+ Minutes

Case Study, Part 1: Business Understanding, Data Preparation, and EDA

Cross-Industry Standard Practice for Data Mining
Business Understanding Phase
Data Understanding Phase, Part 1: Getting a Feel for the Data Set
Data Preparation Phase
Data Understanding Phase, Part 2: Exploratory Data Analysis

Milestone

Multivariate Statistics

Two-Sample t-Test for Difference in Means
Two-Sample Z-Test for Difference in Proportions
Test for the Homogeneity of Proportions
Chi-Square Test for Goodness of Fit of Multinomial Data
Analysis of Variance
Reference
The R Zone
R Reference
Exercises

Case Study, Part 2: Clustering and Principal Components Analysis

Partitioning the Data
Developing the Principal Components
Validating the Principal Components
Profiling the Principal Components
Choosing the Optimal Number of Clusters Using Birch Clustering
Choosing the Optimal Number of Clusters Using k-Means Clustering
Application of k-Means Clustering
Validating the Clusters
Profiling the Clusters

k-Nearest Neighbor Algorithm

Classification Task
k-Nearest Neighbor Algorithm
Distance Function
Combination Function
Quantifying Attribute Relevance: Stretching the Axes
Database Considerations
k-Nearest Neighbor Algorithm for Estimation and Prediction
Choosing k
Application of k-Nearest Neighbor Algorithm Using IBM/SPSS Modeler
The R Zone
R References
Exercises

Association Rules

Affinity Analysis and Market Basket Analysis
Support, Confidence, Frequent Itemsets, and the A Priori Property
How Does The A Priori Algorithm Work (Part 1)? Generating Frequent Itemsets
How Does The A Priori Algorithm Work (Part 2)? Generating Association Rules
Extension From Flag Data to General Categorical Data
Information-Theoretic Approach: Generalized Rule Induction Method
Association Rules are Easy to do Badly
How Can We Measure the Usefulness of Association Rules?
Do Association Rules Represent Supervised or Unsupervised Learning?
Local Patterns Versus Global Models
The R Zone
R References
Exercises

Multiple Regression and Model Building

An Example of Multiple Regression
The Population Multiple Regression Equation
Inference in Multiple Regression
Regression With Categorical Predictors, Using Indicator Variables
Adjusting R²: Penalizing Models for Including Predictors That Are Not Useful
Sequential Sums of Squares
Multicollinearity
Variable Selection Methods
Gas Mileage Data Set
An Application of Variable Selection Methods
Using the Principal Components as Predictors in Multiple Regression
The R Zone
R References
Exercises

Variable Selection Methods

Naïve Bayes and Bayesian Networks

Bayesian Approach
Maximum A Posteriori (MAP) Classification
Posterior Odds Ratio
Balancing The Data
Naïve Bayes Classification
Interpreting The Log Posterior Odds Ratio
Zero-Cell Problem
Numeric Predictors for Naïve Bayes Classification
WEKA: Hands-on Analysis Using Naïve Bayes
Bayesian Belief Networks
Clothing Purchase Example
Using The Bayesian Network to Find Probabilities
The R Zone
R References
Exercises

Imputation of Missing Data

Need for Imputation of Missing Data
Imputation of Missing Data: Continuous Variables
Standard Error of the Imputation
Imputation of Missing Data: Categorical Variables
Handling Patterns in Missingness
Reference
The R Zone
R References

Case Study, Part 3: Modeling and Evaluation for Performance and Interpretability

Do You Prefer the Best Model Performance, or a Combination of Performance and Interpretability?
Modeling and Evaluation Overview
Cost-Benefit Analysis Using Data-Driven Costs
Variables to Be Input to the Models
Establishing the Baseline Model Performance
Models That Use Misclassification Costs
Models That Need Rebalancing as a Surrogate for Misclassification Costs
Combining Models Using Voting and Propensity Averaging
Interpreting the Most Profitable Model

Case Study, Part 4: Modeling and Evaluation for High Performance Only

Variables to be Input to the Models
Models that use Misclassification Costs
Models that Need Rebalancing as a Surrogate for Misclassification Costs
Combining Models using Voting and Propensity Averaging
Lessons Learned
Conclusions

Milestone

Milestone I

k-Nearest Neighbor Algorithm

Running KNN
Calculating the Euclidean Distance

Association Rules

Milestone 2

Multiple Regression and Model Building

Approximating the Relationship between the Variables in a Scatterplot
Identifying Confidence Intervals
Creating a Dot Plot
Determining the Sequential Sums of Squares
Analyzing Multicollinearity

Variable Selection Methods

Applying the Best Subsets Procedure in a Regression Model
Applying Forward Selection Procedure
Applying the Backward Elimination Procedure
Applying the Stepwise Selection Procedure in a Regression Model
Using the Principal Components as Predictors in Multiple Regression

Naïve Bayes and Bayesian Networks

Calculating Posterior Odds Ratio
Calculating the Log Posterior Odds Ratio
Calculating the Numeric Predictors for Naïve Bayes Classification
Milestone 3

Imputation of Missing Data

Case Study, Part 3: Modeling and Evaluation for Performance and Interpretability

Final Project
