The Real-Time Data Quality Monitor is an open-source observability tool that tracks six key data quality dimensions across streaming pipelines. It is built on Apache Kafka and dbt, uses Isolation Forest anomaly detection, and monitors with sub-10 ms latency, so data engineering teams can detect issues before they affect business decisions without relying on expensive enterprise observability platforms.
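
To make the Isolation Forest idea concrete, here is a minimal standard-library sketch of the technique; it is not the project's actual implementation. The two features (`null_fraction`, `rows_per_batch`), the synthetic history, and all function names are assumptions made up for illustration. Each tree isolates points with random axis-aligned splits, and points that are isolated after only a few splits receive a higher anomaly score.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows using a random feature and a random split value."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    low = min(row[feature] for row in rows)
    high = max(row[feature] for row in rows)
    if low == high:  # nothing left to split on for this feature
        return {"size": len(rows)}
    split = rng.uniform(low, high)
    left = [row for row in rows if row[feature] < split]
    right = [row for row in rows if row[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(node, row, depth=0):
    """Depth at which row lands in a leaf, adjusted for the unsplit points in that leaf."""
    if "split" not in node:
        return depth + average_path_length(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(child, row, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of rows."""
    if len(rows) < 2:
        raise ValueError("need at least two rows to build a forest")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, row):
    """Score in (0, 1]; values closer to 1 mean the row is isolated unusually quickly."""
    trees, sample_size = forest
    mean_depth = sum(path_length(tree, row) for tree in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Hypothetical per-batch metrics: (null_fraction, rows_per_batch).
data_rng = random.Random(42)
history = [(data_rng.gauss(0.01, 0.002), data_rng.gauss(100.0, 10.0)) for _ in range(500)]

forest = fit_forest(history)
typical_score = anomaly_score(forest, (0.01, 100.0))
outlier_score = anomaly_score(forest, (0.40, 900.0))
print(f"typical batch: {typical_score:.3f}, suspicious batch: {outlier_score:.3f}")
```

The suspicious batch should score noticeably higher than the typical one, and a monitor could alert when the score crosses a chosen threshold. In practice a library implementation such as scikit-learn's `IsolationForest` would normally stand in for this hand-rolled version.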
