Course Outline:
Learning goals: recognize ML problem types; understand terminology, ML workflow, role of ML engineer
• supervised learning – classification and regression • unsupervised learning – clustering (introduced, but not practiced)
Learning goals: analyze data; sample, prepare, and organize data in data matrices; identify features and labels for supervised learning
• sampling, filtering, and cleaning data • plotting data with Seaborn and Matplotlib • data types • feature engineering (mapping predictive or causal concepts to data representation, manipulating data so that it is appropriate for common machine learning APIs) • feature transformations (binary indicator, one-hot encoding, functional transformations, interaction terms, binning, scaling) • correlation, covariance, mutual information • outliers and missing data • common statistics review
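As a taste of the feature transformations listed above, a binary indicator / one-hot encoding can be sketched in a few lines of pure Python (`one_hot` is an illustrative helper name, not a library API; in practice tools like scikit-learn's `OneHotEncoder` or `pandas.get_dummies` handle this):

```python
def one_hot(values, categories):
    """Map each categorical value to a binary indicator vector,
    one position per known category."""
    return [[1 if v == c else 0 for c in categories] for v in values]

colors = ["red", "blue", "red"]
encoded = one_hot(colors, categories=["blue", "red"])
# each row contains exactly one 1, marking the observed category
```

The resulting columns are exactly the kind of numeric representation that common machine learning APIs expect in a data matrix.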
Learning goals: train and evaluate supervised learning models (k-nearest neighbors, decision trees), understand overfitting and underfitting
• splitting data into testing, training, validation sets • zero-one loss function • hyperparameters • model-based learning vs instance-based learning • k-nearest neighbors (kNN) and decision trees for classification • distance functions • drawing 2D decision boundaries • entropy (for decision trees) • optimizing models • bias-variance tradeoff
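The kNN classifier above follows directly from its definition: compute a distance to every training point, then take a majority vote among the k nearest neighbors. A toy pure-Python sketch using Euclidean distance (`knn_predict` is an illustrative name, not a library function):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point (instance-based:
    # there is no training step, the data itself is the model)
    dists = sorted(
        (math.dist(p, x), label) for p, label in zip(train_X, train_y)
    )
    # majority vote among the k closest neighbors
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

For example, `knn_predict([(0, 0), (0, 1), (5, 5), (6, 5)], ["a", "a", "b", "b"], (0.5, 0.5))` finds two "a" points among the three nearest neighbors and returns "a".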
Learning goals: train and evaluate linear models – logistic regression, linear regression; understand mechanics and use of gradient descent for optimization and loss functions for evaluation; understand math using vectors and matrices and its role in implementing a linear model
• advantages and disadvantages of linear models • improving linear model by minimizing loss function • loss functions for classification and regression problems (log loss, mean squared error) • use of weighted sum and sigmoid in making predictions • review of vector and matrix math • optimization using gradient descent • learning rates • regularization • difference between logistic and linear regression
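The core loop here — improve a linear model by minimizing a loss function with gradient descent at a chosen learning rate — can be sketched for one-feature linear regression under mean squared error (a minimal pure-Python version; the name `fit_line` and the default hyperparameters are illustrative):

```python
def fit_line(xs, ys, lr=0.01, steps=2000):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # step downhill, scaled by the learning rate
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

On data generated by y = 2x + 1, the loop converges to roughly w ≈ 2, b ≈ 1; logistic regression uses the same machinery with a sigmoid on the weighted sum and log loss in place of MSE.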
Learning goals: create and select model candidates using evaluation metrics, set up training, validation, and test splits; tune hyperparameters; select features to improve model performance
• out-of-sample validation • k-fold cross validation • feature selection (heuristic, stepwise, regularization) • hyperparameter optimization • confusion matrix and classification metrics (accuracy, precision, recall) • AUC-ROC curve for model evaluation • calibration curve
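The index bookkeeping behind k-fold cross validation can be sketched as follows (a simplified pure-Python version that uses contiguous, unshuffled folds; `k_fold_indices` is an illustrative name — libraries like scikit-learn provide a full-featured `KFold`):

```python
def k_fold_indices(n, k):
    """Yield k (train, validation) index pairs covering indices 0..n-1.
    Each point appears in exactly one validation fold."""
    fold = n // k
    for i in range(k):
        # the last fold absorbs any remainder when k does not divide n
        val = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        train = [j for j in range(n) if j not in val]
        yield train, val
```

Training on each `train` split and scoring on the matching `val` split, then averaging the k scores, gives the out-of-sample estimate used for hyperparameter tuning and model selection.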
Learning goals: understand what ensemble methods are and when to use them, understand mechanics of random forests and gradient boosted decision trees, build and tune different models to improve performance using ensemble methods
• model error (bias + variance) • types of ensemble methods: stacking, bagging, boosting • random forests • gradient boosting and gradient boosted decision trees • optimizing gradient boosted decision trees
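Of the ensemble mechanics above, bagging is the simplest to sketch: resample the training data with replacement, fit a model on each resample, and average the models' predictions to reduce variance (a toy pure-Python illustration; the function names are hypothetical):

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) points with replacement: the resample behind bagging."""
    return [rng.choice(data) for _ in data]

def bagged_predict(models, x):
    """Average individual model predictions; averaging reduces variance."""
    return sum(m(x) for m in models) / len(models)

rng = random.Random(0)
sample = bootstrap_sample([1, 2, 3, 4], rng)  # some values may repeat
```

A random forest is this idea applied to decision trees, with an extra random subsample of features at each split; boosting instead fits models sequentially, each one targeting the previous ensemble's errors.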
Learning goals: use NLP preprocessing techniques to convert text to data suitable for ML, understand how word embeddings are used to convert text into numerical features, implement ML models to make predictions from text data, understand basic ideas behind feedforward and recurrent neural networks
• preprocessing methods: lemmatization, n-grams, stop words • vectorization: binary, count, TF-IDF • word embeddings • cosine similarity • using word embeddings with sparse data sets • pooling approaches to capture different concepts in text • neural networks and non-linear transformations (high-level description and interactive) • RNNs (high-level description)
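One of the vectorization schemes above, TF-IDF, can be sketched in pure Python. This is one common variant (term frequency times log inverse document frequency); real implementations such as scikit-learn's `TfidfVectorizer` add smoothing and normalization:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Score each term in each tokenized document: frequent in this
    document, discounted if it appears across many documents."""
    n = len(docs)
    # document frequency: how many docs contain each term at least once
    df = Counter(t for doc in docs for t in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        )
    return scores
```

A term that appears in every document (like a stop word) scores zero, while a term unique to one document scores highest — exactly the behavior that makes TF-IDF features useful for prediction from text.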
Learning goals: apply the bias-variance tradeoff to model evaluation, diagnose how feature issues contribute to degraded model performance, understand sources of discriminatory bias and how to measure and mitigate them, understand how to improve the fairness and accountability of a model
• machine learning-based risk and ML model failure modes • model developer best practices (Agile model development, applying unit tests, writing code to be reproducible, creating good documentation) • model deployment • data issues: bottlenecks, bias-variance tradeoff, class imbalance • feature issues: irrelevant features, feature leakage • algorithmic fairness (allocative harm, representational harm) • algorithmic accountability (transparency) • roles on ML project teams
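The class-imbalance issue listed above is easy to make concrete with the accuracy and recall metrics from the evaluation module: on a 90%-negative data set, a model that always predicts the majority class looks accurate yet finds no positives (a toy illustration; function names are hypothetical):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions matching the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Of the truly positive examples, the fraction the model caught."""
    preds_on_pos = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_on_pos) / len(preds_on_pos)

y_true = [0] * 9 + [1]   # 90% negative class
y_pred = [0] * 10        # majority-class baseline: always predict 0
# accuracy is high (0.9) while recall is 0.0 — accuracy alone hides the failure
```

This is one reason a confusion matrix and per-class metrics, not a single accuracy number, belong in any evaluation of a deployed model.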
As a result of this intentional design, you will learn the habits, skills, and mindsets to be a successful entry-level ML or AI engineer, including the following competencies:
• Master machine learning fundamentals to develop your portfolio of artifacts. • Cultivate your authentic leadership and community to use machine learning for social good. • Master project management and collaboration skills to maximize your impact. • Become an effective communicator, presenter, and interviewer. • Learn the skills to effectively navigate your work environment.
This course will provide you with the skills to unlock value in unstructured data sets and to make informed recommendations about which models to use when solving machine learning problems. You will become familiar with the factors to consider and the questions to ask to make sure those models are implemented effectively, including how to train a more powerful model using advanced evaluation and hyperparameter tuning methods.