Welcome to DataProbe’s documentation!

DataProbe is a comprehensive Python toolkit for debugging, profiling, and optimizing data pipelines.

Features

  • Operation Tracking: Track execution time, memory usage, and data shapes

  • Visual Pipeline Flow: Generate pipeline execution visualizations

  • Memory Profiling: Monitor and identify memory-intensive operations

  • Data Lineage: Track data transformations throughout the pipeline

  • Bottleneck Detection: Identify slow operations and memory peaks

  • Performance Reports: Generate debugging reports with optimization suggestions

Quick Example

from dataprobe import PipelineDebugger

debugger = PipelineDebugger(name="My_Pipeline")

@debugger.track_operation("Load Data")
def load_data():
    return pd.read_csv("data.csv")

Indices and tables