[This document is under construction]

Contributors

Provide a list of contributors who have contributed to this document either by writing sections or by sharing ideas and participating in discussions.

Introduction

Provide a brief introduction to this document, the goals, what this isn't, and the process used by the focus group to develop this document.

Traditional Machine Learning (Minu Mathew, Sandeep Puthanveetil Satheesan)

Provide a brief introduction to machine learning and list major areas within machine learning with short descriptions.

Introductory Courses/Blogs

Deep Learning - Text Analysis (Minu Mathew)

Natural language has no fixed structure, but computers work best with structured input. The first step in text analysis is therefore to introduce some structure.

Regular Expressions:

Good for quick string comparisons and transformations.
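As a small illustration of both uses, the sketch below relies only on Python's standard `re` module. The sample text, the deliberately loose email pattern, and the helper names `find_emails` and `collapse_whitespace` are assumptions made for this example, not part of any particular pipeline.

```python
import re

# Deliberately loose pattern for illustration; real email validation is stricter.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def find_emails(text):
    """Comparison: return every substring that looks like an email address."""
    return EMAIL_PATTERN.findall(text)

def collapse_whitespace(text):
    """Transformation: squeeze runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()

sample = "Contact us at  Info@Example.COM or visit https://example.com today!"
print(find_emails(sample))
print(collapse_whitespace(sample))
```

Compiling the pattern once keeps repeated matching cheap when the same expression is applied to many strings.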

Tokenization, normalization, and stemming are methods that add some structure.
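The three steps can be sketched in plain Python as follows. This is a toy pipeline: the token pattern, the lowercase-only normalization, and the suffix list in `naive_stem` are all assumptions for illustration, and the stemmer is not the Porter algorithm or any library implementation.

```python
import re

def tokenize(text):
    """Split text into word tokens (runs of letters, digits, or apostrophes)."""
    return re.findall(r"[A-Za-z0-9']+", text)

def normalize(tokens):
    """Normalize tokens; here this only means lowercasing."""
    return [t.lower() for t in tokens]

def naive_stem(token):
    """Toy suffix stripper (hypothetical helper, NOT the Porter stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Run tokenization, normalization, and stemming in sequence."""
    return [naive_stem(t) for t in normalize(tokenize(text))]

print(preprocess("The cats were jumping over the walls."))
```

In practice a tested stemmer or lemmatizer from an NLP library would replace `naive_stem`, since simple suffix stripping mangles many words.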

Dimensionality reduction

Capture the most important structure.

Convert a high-dimensional space to a low-dimensional one by keeping only the most important directions (the eigenvectors with the largest eigenvalues, as in principal component analysis). Highly correlated dimensions are collapsed into a single dimension.
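To make the eigenvector idea concrete, here is a minimal sketch that reduces two highly correlated dimensions to one. It assumes a PCA-style approach and finds the dominant eigenvector of the covariance matrix by power iteration, using only the standard library. The data points and the function names are made up for illustration; a numerical library would normally do this work.

```python
import math

def mean_center(rows):
    """Subtract the per-dimension mean from every row."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
            for i in range(dims)]

def top_eigenvector(matrix, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue.

    Caveat: it fails if the all-ones start vector is orthogonal to that eigenvector.
    """
    dims = len(matrix)
    v = [1.0] * dims
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return v
        v = [x / norm for x in w]
    return v

def project_to_1d(rows):
    """Project each row onto the dominant direction, giving one number per row."""
    centered = mean_center(rows)
    direction = top_eigenvector(covariance(centered))
    return [sum(r[j] * direction[j] for j in range(len(direction))) for r in centered]

# Second column is roughly twice the first, so one dimension carries almost all the variance.
points = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
print(project_to_1d(points))
```

Each input row becomes a single coordinate along the direction of greatest variance, which is what "reducing correlated dimensions to one" amounts to here.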

Methods to transform text into numeric form:

  • Vocabulary count / Bag of Words (BOW) - no contextual information kept
    • Remove stop words
  • One-hot encoding
  • Frequency count - no contextual information kept
  • TF-IDF - no contextual information kept
  • Word embeddings: preserve contextual information and capture the semantics of a word.
    • Learn word embeddings from n-grams (PyTorch, Keras)
    • Word2Vec (pre-trained word embeddings from Google) - based on word distributions and local context (window size)
    • GloVe (pre-trained, from Stanford) - based on global context
    • BERT
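The count-based representations in the list above can be sketched with the standard library alone. The stop-word set, the two sample documents, and the specific TF-IDF variant (term frequency divided by document length, times the natural log of N over document frequency, with no smoothing) are assumptions for illustration; libraries use slightly different formulas.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "on", "is"}  # tiny illustrative list, an assumption

def tokenize(doc):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in doc.lower().split() if w not in STOP_WORDS]

def build_vocab(docs):
    """Sorted list of every distinct token across all documents."""
    return sorted({w for doc in docs for w in tokenize(doc)})

def one_hot(word, vocab):
    """Vector with a single 1 at the word's position in the vocabulary."""
    return [1 if w == word else 0 for w in vocab]

def bag_of_words(doc, vocab):
    """Per-document count of each vocabulary word; word order is discarded."""
    counts = Counter(tokenize(doc))
    return [counts[w] for w in vocab]

def tf_idf(docs, vocab):
    """TF-IDF per document: (count / doc length) * ln(N / document frequency)."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = {w: sum(1 for toks in tokenized if w in toks) for w in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        length = max(len(toks), 1)  # guard against documents made only of stop words
        vectors.append([(counts[w] / length) * math.log(len(docs) / doc_freq[w])
                        for w in vocab])
    return vectors

docs = ["the cat sat on the mat", "the dog sat on the log"]
vocab = build_vocab(docs)
print(vocab)
print(bag_of_words(docs[0], vocab))
print(one_hot("cat", vocab))
print(tf_idf(docs, vocab))
```

A word such as "sat" that appears in every document gets a TF-IDF weight of zero, which shows how the scheme down-weights terms that do not distinguish documents. None of these vectors encode word order or context, which is the gap word embeddings are meant to fill.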



Models:

RNN

LSTM

CNN

Transformer architecture:

BERT model

XLNet (by Carnegie Mellon University and Google) - BERT and GPT-3 work better in general

GPT-3 model:


ML Ops (Kastan Day, Todd Nicholson)

<Monitoring and Data flow, tools, ML framework deployment issues>

Using GPUs for Speeding up ML (Vismayak Mohanarajan)

References


