
Natural Language Processing & Large Language Models Track


Goal

Focus: Master the complete lifecycle of Large Language Models, from foundational architectures and embeddings to building production-ready applications with RAG, agents, and fine-tuning.

Learn how language models understand, generate, and reason with text, and develop the skills to evaluate, optimize, and deploy LLM systems in real-world scenarios.

Curriculum

1. Text Representation & Classical NLP

Key Topics:

  • Text preprocessing techniques: tokenization, normalization, stopword removal
  • Bag-of-Words representation and TF-IDF weighting
  • N-grams and basic statistical language modeling intuition
  • Vocabulary construction, dimensionality growth, and sparsity issues
  • Core limitations of classical NLP (loss of context, semantics, scalability)

Action Items:

  • Implement a full preprocessing pipeline on a sample dataset
  • Build BoW and TF-IDF vectors and compare feature distributions
  • Experiment with different n-gram sizes and analyze performance impact
  • Measure vocabulary size growth and sparsity levels
  • Compare classical feature-based models with embedding-based approaches
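The preprocessing and vectorization steps above can be sketched with scikit-learn's text feature extraction API. The corpus and parameters below are illustrative toys, not a recommended configuration:

```python
# Minimal sketch: preprocessing -> Bag-of-Words -> TF-IDF, plus the
# vocabulary-growth effect of n-grams. Corpus is illustrative.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "Cats and dogs are pets.",
]

# Bag-of-Words: raw term counts (lowercasing, tokenization, and
# stopword removal are handled by the vectorizer)
bow = CountVectorizer(stop_words="english")
X_bow = bow.fit_transform(corpus)

# TF-IDF: the same counts, reweighted by inverse document frequency
tfidf = TfidfVectorizer(stop_words="english")
X_tfidf = tfidf.fit_transform(corpus)

vocab_size = len(bow.vocabulary_)
sparsity = 1.0 - X_bow.nnz / (X_bow.shape[0] * X_bow.shape[1])
print(f"vocabulary size: {vocab_size}")
print(f"BoW sparsity: {sparsity:.2f}")

# Adding bigrams grows the vocabulary quickly -- the sparsity and
# dimensionality issue listed under Key Topics
bigram = CountVectorizer(ngram_range=(1, 2), stop_words="english")
X_bi = bigram.fit_transform(corpus)
print(f"unigram+bigram vocabulary size: {len(bigram.vocabulary_)}")
```

Comparing `vocab_size` across n-gram ranges on a real dataset makes the dimensionality-growth limitation concrete.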
Resources:

  • Stanford CS224N — NLP with Deep Learning (Lectures 1-2) (course, beginner, 3-4 hours)
  • Scikit-learn Text Feature Extraction Guide (tutorial, beginner, 3-4 hours)

2. Word Embeddings & Distributional Semantics

Key Topics:

  • Distributional hypothesis and meaning-from-context principle
  • Word2Vec architectures: CBOW vs Skip-gram
  • Negative sampling and efficient training approximation
  • GloVe model intuition and global co-occurrence learning
  • Embedding geometry, cosine similarity, and semantic relationships
  • Bias and social artifacts encoded in vector representations

Action Items:

  • Train Word2Vec on a small corpus using CBOW and Skip-gram
  • Compare embedding quality using similarity queries
  • Visualize embeddings with PCA or t-SNE
  • Experiment with negative sampling parameters
  • Evaluate and detect bias patterns in trained embeddings
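The embedding geometry and analogy ideas above can be illustrated with hand-crafted vectors. The 4-dimensional vectors below are toys chosen so the classic king - man + woman analogy holds, not trained embeddings:

```python
# Illustrative sketch of embedding geometry: cosine similarity and the
# vector-offset analogy. Vectors are hand-crafted, not learned.
import numpy as np

emb = {
    # rough dimensions: [royalty, male, female, animal]
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.1, 0.8, 0.0]),
    "man":   np.array([0.1, 0.9, 0.1, 0.0]),
    "woman": np.array([0.1, 0.1, 0.9, 0.0]),
    "cat":   np.array([0.0, 0.1, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: angle-based closeness, independent of length."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantically related words sit close together
print(cosine(emb["king"], emb["queen"]))  # high
print(cosine(emb["king"], emb["cat"]))    # low

# Analogy as vector arithmetic: king - man + woman ~= queen
target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(emb[w], target))
print(best)  # -> queen
```

With real Word2Vec or GloVe vectors (e.g. loaded via Gensim), the same similarity queries work on a vocabulary of hundreds of thousands of words.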
Resources:

  • Lecture 2: Word Vectors and Word Senses (course, intermediate, 2-3 hours)
  • Word2Vec Paper (paper, intermediate, 2-3 hours)
  • GloVe Paper (Conceptual Overview) (paper, intermediate, 2-3 hours)
  • Practical Word Embeddings with Gensim (tutorial, intermediate, 2-3 hours)

3. Attention Mechanisms & Transformers

Key Topics:

  • Core intuition behind attention and weighted context aggregation
  • Self-attention and token-to-token interaction modeling
  • Query, Key, Value (QKV) formulation and attention score computation
  • Positional encoding and sequence order representation
  • Transformer encoder and decoder architecture components
  • Computational efficiency and scalability advantages over RNNs

Action Items:

  • Implement a basic self-attention layer from scratch
  • Visualize attention weight matrices for sample inputs
  • Experiment with positional encodings and observe output changes
  • Build a minimal Transformer encoder block
  • Measure training speed and parallelism of a Transformer block against an RNN baseline
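The QKV formulation above can be implemented in a few lines of NumPy. Shapes and the random seed below are illustrative:

```python
# Sketch of single-head scaled dot-product self-attention, following the
# Query/Key/Value formulation described above.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) -> (output, attention weights)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # token-to-token interaction
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights          # weighted context aggregation

seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape, attn.shape)  # (4, 8) (4, 4)
```

Printing `attn` row by row is a quick way to do the attention-weight visualization listed in the Action Items.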
Resources:

  • Stanford CS231N Deep Learning I 2025 (Lecture 8) (course, intermediate, 2-3 hours)
  • Attention Is All You Need (paper, intermediate, 2-3 hours)
  • Transformer Neural Networks, ChatGPT's foundation (tutorial, intermediate, 2-3 hours)

4. Large Language Models (LLMs)

Key Topics:

  • Language modeling objective and next-token prediction principle
  • End-to-end LLM training pipeline (data, architecture, optimization)
  • Pretraining vs fine-tuning roles and interaction
  • Causal (GPT-style) vs masked (BERT-style) language models
  • Scaling laws and performance vs compute/data trade-offs
  • Emergent behaviors from scale (reasoning, in-context learning)
  • Common limitations: hallucinations, bias, brittleness, context limits

Action Items:

  • Implement a mini GPT-style model on a small dataset
  • Compare causal vs masked model behavior on the same task
  • Run scaling experiments with different model sizes
  • Analyze failure cases and hallucinated outputs
  • Benchmark fine-tuned vs base model performance
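The causal vs masked distinction above comes down to the attention mask: a GPT-style model lets position i attend only to positions <= i, while a BERT-style model sees the whole sequence. A minimal sketch of the causal mask:

```python
# Sketch of the causal attention mask used by GPT-style models.
import numpy as np

seq_len = 5
# Lower-triangular matrix of allowed attention; 0 entries are masked out
causal_mask = np.tril(np.ones((seq_len, seq_len)))
print(causal_mask)

# Applied to attention scores before softmax: masked positions get -inf,
# so they receive zero weight and future tokens cannot leak into the past
scores = np.zeros((seq_len, seq_len))  # uniform scores, for illustration
masked = np.where(causal_mask == 1, scores, -np.inf)
weights = np.exp(masked) / np.exp(masked).sum(axis=-1, keepdims=True)
print(weights[0])  # first token attends only to itself: [1. 0. 0. 0. 0.]
```

Dropping the mask (attending everywhere) is exactly what a masked language model does at inference time, which is why BERT-style models are bidirectional.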
Resources:

  • Building GPT from Scratch (Andrej Karpathy) (course, intermediate, 3-4 hours)
  • Large Language Models from Scratch (Book, Chapters 3-5) (course, intermediate, 3-4 hours)

5. Fine-Tuning Methods

Key Topics:

  • Difference between pretraining, fine-tuning, and prompting
  • Full fine-tuning vs parameter-efficient fine-tuning (PEFT) trade-offs
  • LoRA fundamentals and low-rank adaptation mechanism
  • QLoRA and quantized training for memory-efficient scaling
  • GPU memory, compute, and precision optimization strategies
  • Training stability, overfitting risks, and regularization techniques
  • Deployment strategies for fine-tuned and adapter-based models

Action Items:

  • Compare outputs of base, prompted, and fine-tuned models
  • Train a small LoRA adapter and measure VRAM usage
  • Run one QLoRA experiment on limited hardware
  • Monitor training vs validation loss to detect overfitting
  • Export and benchmark a deployment-ready fine-tuned model
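The low-rank adaptation mechanism above can be sketched in NumPy: the pretrained weight W stays frozen, and only a rank-r pair (A, B) is trained, with the update scaled by alpha / r. Dimensions, rank, and alpha below are illustrative:

```python
# Sketch of the LoRA update: h = W x + (alpha / r) * B A x,
# with W frozen and only A, B trainable.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 512, 512, 8, 16

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init, so the
                                       # adapter starts as a no-op

def lora_forward(x):
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B = 0, the adapted layer matches the base layer exactly
assert np.allclose(lora_forward(x), W @ x)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs {full_params} "
      f"({lora_params / full_params:.1%})")
```

The parameter count printed at the end (about 3% of the full matrix here) is the source of LoRA's VRAM savings; QLoRA compounds this by keeping W in 4-bit precision.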
Resources:

  • Fundamentals of LLM Fine-Tuning (course, intermediate, 4-6 hours)
  • LoRA: Low-Rank Adaptation of Large Language Models (paper, advanced, 2-3 hours)
  • QLoRA: Efficient Finetuning of Quantized LLMs (paper, advanced, 2-3 hours)
  • Practical Fine-Tuning with Unsloth (docs, advanced, 2-3 hours)

6. RAG & AI Agent Orchestration

Key Topics

  • Retrieval-Augmented Generation (RAG) pipelines
  • Document chunking strategies and context window optimization
  • Embeddings and vector database integration
  • Tool calling, function execution, and agent-based workflows
  • Context management, memory layers, and session handling
  • Single-agent vs multi-agent system design
  • Planning, execution loops, and coordination strategies

Action Items:

  • Build a basic RAG pipeline using LangChain
  • Experiment with different chunk sizes and overlap strategies
  • Integrate a vector database and benchmark retrieval quality
  • Implement tool calling for external API interaction
  • Design a two-agent system with specialized roles
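The retrieval step of the RAG pipeline above can be sketched without any framework: embed the document chunks, embed the query, and put the most similar chunk into the prompt. TF-IDF stands in for a neural embedding model here, and the chunks are illustrative:

```python
# Minimal sketch of RAG retrieval: vectorize chunks, rank by cosine
# similarity to the query, and assemble the retrieved context.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

chunks = [
    "LoRA trains low-rank adapter matrices instead of full weights.",
    "Vector databases store embeddings for fast similarity search.",
    "Transformers replaced RNNs thanks to parallel self-attention.",
]

vectorizer = TfidfVectorizer()
chunk_vecs = vectorizer.fit_transform(chunks)  # the "vector store"

def retrieve(query, k=1):
    """Return the k chunks most similar to the query."""
    q_vec = vectorizer.transform([query])
    sims = cosine_similarity(q_vec, chunk_vecs)[0]
    top = sims.argsort()[::-1][:k]
    return [chunks[i] for i in top]

context = retrieve("How do I store embeddings for similarity search?")[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
print(context)
```

In a full pipeline (e.g. the LangChain tutorial below in spirit), the vectorizer is replaced by an embedding model, the in-memory matrix by a vector database, and `prompt` is sent to an LLM.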
Resources:

  • Build a RAG agent with LangChain (tutorial, advanced, 3-4 hours)
  • AI Agent Orchestration with CrewAI (course, advanced, 4-6 hours)

Capstone Project

Build an advanced LLM-powered system with RAG, vector databases, multi-agent orchestration, and a web interface for retrieval and intelligent interaction.