DataHub

Free tier available

Open-source metadata platform for the modern data stack

Category: Data Catalogs. Tags: catalog, metadata, lineage

📖 Overview

DataHub is an open-source metadata platform originally built at LinkedIn. It provides data discovery, lineage, and governance capabilities, and its extensible architecture was designed to handle metadata at LinkedIn's scale. Acryl Data offers a managed cloud version.

Key Features

  • Metadata Ingestion: 50+ source connectors
  • Lineage: Column-level data lineage
  • Search & Discovery: Find data assets fast
  • Data Quality: Integrate quality metrics
  • Governance: Tags, glossary, ownership
  • Real-time Updates: Stream metadata changes
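
To make the "column-level data lineage" item concrete, here is a minimal conceptual sketch in plain Python. It does not use DataHub's API or data model; the table names, column names, and the `upstream_columns` helper are all hypothetical and only illustrate what column-level lineage tracks: which upstream columns feed a given downstream column.

```python
from collections import deque

# Hypothetical lineage edges: downstream column -> list of upstream columns.
# Each column is a (dataset, column) pair. These names are illustrative only
# and are not taken from DataHub.
LINEAGE = {
    ("reports.revenue", "total_usd"): [
        ("staging.orders", "amount"),
        ("staging.fx_rates", "usd_rate"),
    ],
    ("staging.orders", "amount"): [("raw.orders", "amount_cents")],
    ("staging.fx_rates", "usd_rate"): [("raw.fx", "rate")],
}


def upstream_columns(column, lineage=LINEAGE):
    """Return every column that feeds `column`, directly or transitively."""
    seen = set()
    queue = deque([column])
    while queue:
        current = queue.popleft()
        # Columns with no recorded parents are treated as sources.
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    for dataset, col in sorted(upstream_columns(("reports.revenue", "total_usd"))):
        print(f"{dataset}.{col}")
```

A breadth-first walk like this is one simple way to answer "where does this column come from?"; a real catalog would store the edges in a metadata graph rather than an in-memory dictionary.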

💰 Pricing

  • Model: Open source
  • Starting Price: $0
  • Cloud/Pro: Managed cloud version offered by Acryl Data
  • Free tier available
  • Enterprise plans available

👍 Pros

  • True open source with an active community
  • LinkedIn-proven scale
  • Highly extensible architecture
  • Good lineage capabilities
  • Self-hosted option
  • Acryl offers managed cloud

👎 Cons

  • Complex to deploy self-hosted
  • Steeper learning curve
  • UI less polished than commercial tools
  • Requires investment to configure

🎯 Best For

Organizations that want an open-source data catalog at scale. A good fit for teams with the engineering resources to customize and maintain it.
