Data Engineering

Overview

Master data pipelines, infrastructure, and management practices essential for ML systems. Learn to build robust data foundations for AI applications.


💾 Data Management

Data Versioning

Feature Engineering

Feature Stores

Data Quality

Data Privacy

Handling Imbalanced Data


🔧 Data Infrastructure

ETL/ELT Pipelines

Stream Processing

Data Storage

Vector Databases

Data Orchestration

Data Annotation

Data Catalog


📊 Progress Tracking

TABLE
  status as "Status",
  difficulty as "Difficulty",
  last_modified as "Last Updated"
FROM "01 - ML & AI Concepts/05 - Data Engineering"
WHERE contains(tags, "concept")
SORT file.name ASC

🎓 Learning Path

Recommended Order:

  1. Start with Data Quality and Validation
  2. Learn Feature Engineering and Feature Stores
  3. Study ETL/ELT Pipelines
  4. Understand Stream Processing
  5. Master Data Storage solutions
  6. Explore Vector Databases for AI
  7. Advanced: privacy-preserving techniques (Data Privacy) and Data Orchestration
