...

Transformer architecture:

Attention mechanism:
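
As a rough illustration of the attention mechanism used in Transformers, here is a minimal sketch of scaled dot-product attention in plain Python. The function names (`softmax`, `scaled_dot_product_attention`) and the list-of-lists input format are assumptions made for this example; real models use batched tensors, multiple heads, and learned projections, none of which are shown here.

```python
import math


def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of query, key, and value vectors.

    Illustrative sketch only: single head, no masking, no learned projections.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Dot product of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [
            sum(wj * v[i] for wj, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    outs, attn = scaled_dot_product_attention(q, k, v)
    print(attn[0], outs[0])
```

The query here is closer to the first key, so the first value vector receives the larger weight in the output.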

BERT model

XLNet (by Google Brain and Carnegie Mellon University) - BERT and GPT-3 work better in general

...

Colab Page - https://colab.research.google.com/drive/1bzL-mhGNvh7PF_MzsSgMmw9TQjyP6DCe?usp=sharing

References