Baselinr Documentation
Welcome to the Baselinr documentation! This directory contains all documentation organized by topic.
📚 Documentation Structure
🚀 Getting Started
- Quick Start Guide - Get up and running in 5 minutes
- Installation Guide - Detailed installation instructions
📖 Guides
- Data Validation ✨ NEW - Rule-based data quality validation with format, range, enum, null, uniqueness, and referential integrity checks
- Smart Table Selection ✨ NEW - Automatically recommend tables to monitor based on usage patterns and database metadata
- Smart Selection Quick Start ✨ NEW - Get started with intelligent table selection in 5 minutes
- Root Cause Analysis ✨ NEW - Automatically correlate anomalies with pipeline runs, code changes, and upstream data issues
- Python SDK - Complete guide to the Python SDK for programmatic access to Baselinr
- Airflow Integration ✨ NEW - Complete guide to integrating Baselinr with Apache Airflow 2.x
- Airflow Quick Start ✨ NEW - Get started with Airflow integration in 5 minutes
- Profiling Enrichment - Enhanced profiling metrics: null ratios, uniqueness, schema tracking, and data quality metrics
- Column-Level Configurations - Fine-grained control over profiling, drift, and anomaly detection per column
- Drift Detection - Understanding and configuring drift detection, including type-specific thresholds
- Statistical Drift Detection - Advanced statistical tests for drift detection (KS test, PSI, chi-square, etc.)
- Slack Alerts - Set up Slack notifications for drift detection events
- Partition & Sampling - Advanced profiling strategies
- Parallelism & Batching - Optional parallel execution for faster profiling
- Incremental Profiling - Skip unchanged tables and control profiling costs
- Prometheus Metrics - Setting up monitoring and metrics
- Retry & Recovery - Automatic retry for transient warehouse failures
- Retry Quick Start - Quick reference for retry system
- Retry Implementation - Technical implementation details
📋 Schemas & CLI
- Query Examples - Query command examples and patterns
- Status Command - Status command reference and examples
- UI Command - Start local dashboard with
baselinr ui - Schema Reference - Database schema documentation
- Migration Guide - Schema upgrade procedures
🏗️ Architecture
- Project Overview - High-level system architecture
- Events & Hooks - Event system and hook architecture
- Events Implementation - Implementation details
🎨 Dashboard
- Dashboard Quick Start - Dashboard setup guide
- Dashboard README - Dashboard overview and features
- Dashboard Architecture - Dashboard technical architecture
- Setup Complete - Post-setup verification
- Dashboard Integration - Integrating with Baselinr
Backend
- Backend README - Backend API documentation
- Fix Missing Tables - Troubleshooting guide
- Fix Multiple Tables - Database schema fix
Frontend
- Frontend README - Frontend development guide
- Node.js Setup - Node.js installation troubleshooting
🛠️ Development
- Development Guide - Contributing and development setup
- Git Hooks - Pre-commit and pre-push hooks setup
- Build Complete - Build status and completion notes
🐳 Docker
- Metrics Setup - Docker metrics and monitoring setup
📝 Quick Links
- Main README: ../README.md - Project overview and quick start
- Roadmap: ../ROADMAP.md - Planned features and future enhancements
- Examples: ../examples/ - Configuration examples
- Makefile: ../Makefile - Common commands
🔍 Finding What You Need
- New to Baselinr? → Start with Getting Started
- Want automated setup? ✨ NEW → See Smart Selection Quick Start for zero-touch configuration
- Need data validation? ✨ NEW → See Data Validation Guide
- Want automatic table discovery? ✨ NEW → See Smart Table Selection
- Need root cause analysis? ✨ NEW → See Root Cause Analysis
- Using the Python SDK? → See Python SDK Guide
- Setting up the dashboard? → See Dashboard Quick Start
- Setting up Slack alerts? → See Slack Alerts Guide
- Profiling many tables? → Enable Parallelism & Batching
- Using enrichment metrics? → See Profiling Enrichment
- Configuring column-level controls? → See Column-Level Configurations
- Configuring drift detection? → Check Drift Detection Guide
- Using statistical tests? → See Statistical Drift Detection
- Checking system status? → See Status Command
- Starting the dashboard? → See UI Command
- Querying metadata? → See Query Examples
- Understanding the architecture? → Read Project Overview
- Troubleshooting? → Check the relevant component's README or fix guides
📄 Documentation Standards
All documentation follows these conventions:
- Markdown format (
.md) - Clear headings and structure
- Code examples with syntax highlighting
- Links to related documentation
- Step-by-step instructions where applicable