LLMs & Generative AI
LLM Operations
Prompt Engineering
- Prompt Engineering Fundamentals
- Prompt Design Patterns
- Prompt Optimization Techniques
- Prompt Injection and Safety
Fine-Tuning Strategies
- Fine-Tuning Overview
- Full Fine-Tuning
- LoRA (Low-Rank Adaptation)
- QLoRA (Quantized LoRA)
- P-Tuning
- Adapter Methods
Model Evaluation
Context Management
- Context Window Fundamentals
- Token Optimization
- Context Compression Techniques
- Sliding Window Attention
- Long-Context Models
Production LLM Systems
Model Serving & Inference
- Model Serving Architecture
- Inference Optimization
- Batching Strategies
- KV Cache Optimization
- Speculative Decoding
- Model Quantization for Inference
Caching & Cost Management
- Semantic Caching
- Response Caching Strategies
- Cost Optimization Techniques
- Request Deduplication
- Token Budget Management
Safety & Alignment
RAG Systems
- RAG Index
- Vector Databases for RAG
- Embedding Models
- Retrieval Strategies
- Hybrid Search
- Re-ranking
- RAG Evaluation
Multi-Modal Systems
- Multi-Modal Models Overview
- Joint Embedding Spaces
- Vision-Language Models
- Text-to-Image Generation
- Image-to-Text Generation
- Audio Processing with LLMs
Agent Systems
- LLM Agents Fundamentals
- ReAct Pattern
- Tool Use and Function Calling
- Agent Orchestration
- Multi-Agent Systems
- Agent Memory Systems
Generative Models
- Transformer Language Models
- GPT Architecture
- BERT and Encoder Models
- T5 and Encoder-Decoder Models
- Diffusion Models
- GANs (Generative Adversarial Networks)
- VAEs (Variational Autoencoders)
Back to: ML & AI Index