
NLP Overview

Part-of-Speech (POS) tagging

  • Byte-pair encoding
  • Morphological Parsing
  • Named Entity Recognition (NER)
  • Precision: fraction of retrieved documents that are relevant
  • Recall: fraction of relevant documents that are retrieved
  • IO vs IOB (inside-outside-beginning) tagging
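
As a rough illustration of how IOB tags and precision/recall fit together for NER, the sketch below converts IOB tag sequences into entity spans and scores a prediction against a gold standard at the span level. The function names (iob_to_spans, precision_recall) and the PER/LOC labels are made up for this example, and exact-match span scoring is only one common convention.

```python
def iob_to_spans(tags):
    """Convert IOB tags (B-PER, I-PER, O, ...) into a set of
    (start, end, label) spans, with `end` exclusive."""
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        begins = tag.startswith("B-")
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        # Assumption: a stray I- tag with no open span starts a new span.
        if begins or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


def precision_recall(gold_tags, pred_tags):
    """Span-level precision and recall; a span counts as correct only
    if its boundaries and label both match exactly."""
    gold, pred = iob_to_spans(gold_tags), iob_to_spans(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


gold = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "O", "O"]
print(precision_recall(gold, pred))  # (1.0, 0.5)
```

Here the single predicted entity is correct (precision 1.0), but only one of the two gold entities was found (recall 0.5). The B- prefix is what lets IOB separate two adjacent entities of the same type, which plain IO tagging cannot do.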


  • Future state depends on past state (t depends on t-1)
  • Conditional Markov Models (CMM)
  • Maximum Entropy Markov Model (MEMM)
  • Flavors:
    • t depends on t-1 and t+1
    • t depends on t-1, t-2 …
  • Greedy vs beam search
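
To make the greedy vs beam search distinction concrete, here is a minimal sketch over a toy first-order model in which each tag depends only on the previous tag. The probability table TOY_MODEL and the tags N and V are invented for illustration; a real MEMM would compute these conditional probabilities from features of the input.

```python
import math

# Hypothetical toy model: one dict per position, mapping
# previous tag -> {tag: P(tag | previous tag)}.
# Position 0 conditions on the start symbol "<s>".
TOY_MODEL = [
    {"<s>": {"N": 0.6, "V": 0.4}},
    {"N": {"N": 0.55, "V": 0.45}, "V": {"N": 0.9, "V": 0.1}},
]


def beam_search(model, beam_width):
    """Return (best tag sequence, its log-probability).
    With beam_width=1 this reduces to greedy search."""
    beam = [(0.0, ["<s>"])]  # (log-probability so far, tags so far)
    for step in model:
        candidates = []
        for logp, tags in beam:
            for tag, p in step[tags[-1]].items():
                candidates.append((logp + math.log(p), tags + [tag]))
        # Keep only the highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    best_logp, best_tags = beam[0]
    return best_tags[1:], best_logp  # drop the start symbol


greedy_tags, greedy_logp = beam_search(TOY_MODEL, beam_width=1)
beam_tags, beam_logp = beam_search(TOY_MODEL, beam_width=2)
print(greedy_tags, round(math.exp(greedy_logp), 2))  # ['N', 'N'] 0.33
print(beam_tags, round(math.exp(beam_logp), 2))      # ['V', 'N'] 0.36
```

Greedy search commits to N at the first position because 0.6 > 0.4 and can never recover. A beam of width 2 keeps V alive long enough to find the higher-probability sequence V, N.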



  • Unigram – each word is tagged in isolation. Its tag depends only on the word itself.
  • Bigram – the first-order Markov model. A word's tag also depends on the previous word.
  • N-gram – the general case: a word's tag depends on the previous N-1 words, for some N > 1.
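
The difference between unigram and bigram tagging can be sketched with simple counting. The example below is a toy, assuming a tiny hand-made corpus and a hypothetical default tag of NOUN for unseen words; the function names train and tag are illustrative. The bigram tagger conditions on the previous word and backs off to unigram counts when a word pair was never seen.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Count tags per word (unigram) and per (previous word, word)
    pair (bigram). "<s>" marks the start of a sentence."""
    unigram = defaultdict(Counter)
    bigram = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = "<s>"
        for word, t in sentence:
            unigram[word][t] += 1
            bigram[(prev, word)][t] += 1
            prev = word
    return unigram, bigram


def tag(words, unigram, bigram, default="NOUN"):
    """Pick the most frequent tag, using bigram context when it was
    seen in training and backing off to unigram counts otherwise."""
    tags = []
    prev = "<s>"
    for word in words:
        if (prev, word) in bigram:
            counts = bigram[(prev, word)]
        elif word in unigram:
            counts = unigram[word]
        else:
            counts = None  # unseen word: fall back to the default tag
        tags.append(counts.most_common(1)[0][0] if counts else default)
        prev = word
    return tags


# Toy corpus (made up): "book" is usually a noun, but a verb after "they".
corpus = [
    [("the", "DET"), ("book", "NOUN")],
    [("a", "DET"), ("book", "NOUN")],
    [("they", "PRON"), ("book", "VERB"), ("flights", "NOUN")],
]
unigram, bigram = train(corpus)
sentence = ["they", "book", "flights"]
print(tag(sentence, unigram, {}))      # unigram only: ['PRON', 'NOUN', 'NOUN']
print(tag(sentence, unigram, bigram))  # with bigrams: ['PRON', 'VERB', 'NOUN']
```

Passing an empty bigram table forces the unigram behaviour, which mislabels "book" as a noun because that is its most frequent tag overall. The bigram context "they book" resolves it to a verb.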

Naive Bayes

Neural Language Models

Word Vectors


Attention and Transformers

Finetuning and Prompting

Reinforcement Learning from Human Feedback (RLHF)

Books and References

