[This document is under construction]
Contributors
Provide a list of people who contributed to this document by writing sections, sharing ideas, or participating in discussions.
Introduction
Provide a brief introduction to this document, the goals, what this isn't, and the process used by the focus group to develop this document.
Traditional Machine Learning (Minu Mathew, Sandeep Puthanveetil Satheesan)
Provide a brief introduction to machine learning and list major areas within machine learning with short descriptions.
Introductory Courses/Blogs
- Machine Learning, Andrew Ng, Stanford University/Coursera, https://www.coursera.org/learn/machine-learning/
- Machine Learning Crash Course, Google, https://developers.google.com/machine-learning/crash-course/
- Machine Learning Mastery, Jason Brownlee, https://machinelearningmastery.com/
Deep Learning - Text Analysis (Minu Mathew)
Natural language has no inherent structure, but computers work best with structured input, so the first step is to introduce some structure.
Regular Expressions:
Good for quick string comparisons and transformations.
Tokenization, normalization, and stemming are methods that add some structure.
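The preprocessing steps above can be sketched with the standard-library re module alone. This is a minimal illustration, not a production pipeline: the function names and the SUFFIXES list are hypothetical, and the stemmer is a deliberately naive suffix stripper (real stemmers such as Porter's apply many more rules).

```python
import re

# Hypothetical suffix list for illustration only; checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def normalize(text):
    """Normalization: lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def tokenize(text):
    """Tokenization: pull out runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text)

def naive_stem(token):
    """Stemming: strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Run normalization, tokenization, and stemming in sequence."""
    return [naive_stem(t) for t in tokenize(normalize(text))]

tokens = preprocess("The  cats were Jumping over walls")
print(tokens)  # ['the', 'cat', 'were', 'jump', 'over', 'wall']
```

The output is a flat list of comparable tokens, which is the kind of structure the later text-to-numeric methods expect as input.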
Dimensionality reduction:
Capture the most important structure.
Convert a high-dimensional space to a low-dimensional one by keeping only the most important directions (the eigenvectors of the data's covariance matrix with the largest eigenvalues). Highly correlated dimensions are collapsed into a single dimension.
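The description above matches principal component analysis (PCA), although the notes do not name a specific method. The sketch below assumes PCA and uses power iteration to find the leading eigenvector of the covariance matrix. All function names and the toy data are hypothetical, and the code is pure Python for readability; a real project would normally use a numerical library.

```python
import math

def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenvector(matrix, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue."""
    d = len(matrix)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, direction):
    """Reduce each row to a single coordinate along the given direction."""
    return [sum(x * d for x, d in zip(row, direction)) for row in rows]

# Toy data with two highly correlated features: the second is twice the first.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(data)
component = top_eigenvector(covariance(centered))
reduced = project(centered, component)  # one number per row instead of two
```

Because the two features are perfectly correlated, the single principal direction (proportional to [1, 2]) carries all of the variance, so reducing from two dimensions to one loses no information in this toy case.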
Methods to transform text to numeric form:
- Vocab count / Bag of Words (BOW) - no contextual info kept
- Remove stop words
- One-hot encoding
- Frequency count - no contextual info kept
- TF-IDF - no contextual info kept
- Word embeddings - preserve contextual information and capture the semantics of a word.
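The count-based methods in the list above can be sketched in a few lines of pure Python. The documents, the STOP_WORDS set, and the function names are hypothetical, and the TF-IDF formula shown (term frequency times log(N / document frequency)) is only one common variant; libraries typically use smoothed versions. Word embeddings are learned from data and are not covered by this sketch.

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
STOP_WORDS = {"the", "on", "and"}  # hypothetical, tiny stop-word list

# Remove stop words, then build a sorted vocabulary over all documents.
tokenized = [[w for w in d.split() if w not in STOP_WORDS] for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})

def one_hot(word):
    """One-hot encoding: a 1 at the word's vocabulary index, 0 elsewhere."""
    return [1 if word == v else 0 for v in vocab]

def bag_of_words(doc):
    """Bag of words / frequency count: word order and context are discarded."""
    counts = Counter(doc)
    return [counts[v] for v in vocab]

def tf_idf(doc):
    """TF-IDF: down-weight words that appear in many documents."""
    counts = Counter(doc)
    n_docs = len(tokenized)
    vector = []
    for v in vocab:
        tf = counts[v] / len(doc)
        df = sum(1 for d in tokenized if v in d)
        vector.append(tf * math.log(n_docs / df))
    return vector

print(vocab)                       # ['cat', 'cats', 'dog', 'dogs', 'log', 'mat', 'sat']
print(bag_of_words(tokenized[0]))  # [1, 0, 0, 0, 0, 1, 1]
```

In the first document, "cat" receives a higher TF-IDF weight than "sat" because "sat" also appears in the second document. None of these vectors record word order, which is the "no contextual info kept" limitation noted in the list.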
Models:
RNN
LSTM
CNN
Transformer architecture:
Attention mechanism:
BERT model
XLNet (by Google Brain and Carnegie Mellon University) - BERT and GPT-3 work better in general
GPT-3 model:
ML Ops (Kastan Day, Todd Nicholson)
<Monitoring and Data flow, tools, ML framework deployment issues>
Using GPUs for Speeding up ML (Vismayak Mohanarajan)
RAPIDS - cuDF (GPU DataFrames with a pandas-like API) and cuML (GPU machine learning with a scikit-learn-like API)
Colab Page - https://colab.research.google.com/drive/1bzL-mhGNvh7PF_MzsSgMmw9TQjyP6DCe?usp=sharing