Online Data Migration from SQL Sources to FalkorDB

The DM-SQL-to-FalkorDB repository provides Rust-based CLI loaders to perform an initial load from SQL systems into FalkorDB and optionally keep FalkorDB continuously synchronized using incremental watermarks.

It also includes an optional control plane (web UI + REST API) for creating configurations, starting runs, monitoring progress, and viewing persisted metrics snapshots.

Supported sources

  • ClickHouse
  • Databricks (Databricks SQL / warehouses)
  • MariaDB
  • MySQL
  • PostgreSQL
  • Snowflake
  • SQL Server

When to use this approach

Use these tools when you want:

  • A one-time migration from SQL tables into a FalkorDB property graph
  • Ongoing one-way sync so FalkorDB stays updated as rows change in the source system

Prerequisites

  • Rust toolchain (Cargo)
  • Network access to your SQL source (ClickHouse, Databricks SQL warehouse, MariaDB, MySQL, PostgreSQL, Snowflake, or SQL Server)
  • A reachable FalkorDB endpoint (for example falkor://127.0.0.1:6379)
  • Node.js + npm (optional; only needed for the control plane UI)

Most configurations reference environment variables for secrets and credentials, for example: $CLICKHOUSE_URL, $DATABRICKS_TOKEN, $MARIADB_URL, $MYSQL_URL, $POSTGRES_URL, $SNOWFLAKE_PASSWORD, $SQLSERVER_CONNECTION_STRING.
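For example, a shell session might export the variables a configuration references before starting a run (the values below are placeholders, not real endpoints or secrets):

```shell
# Placeholder values -- substitute your real endpoints and secrets.
export POSTGRES_URL="postgres://loader:s3cret@db.internal:5432/sales"
export SQLSERVER_CONNECTION_STRING="Server=mssql.internal,1433;Database=sales;User Id=loader;Password=s3cret"
export SNOWFLAKE_PASSWORD="s3cret"
```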

Getting the tools

git clone https://github.com/FalkorDB/DM-SQL-to-FalkorDB.git
cd DM-SQL-to-FalkorDB

How the loaders work (high level)

Each loader uses JSON/YAML configuration to define:

  • How to read rows from the source (table + optional filter, or custom SELECT)
  • How rows map to:
    • Nodes (labels, keys, property mappings)
    • Edges (relationship type, direction, and endpoint matching rules)
  • Whether each mapping is full or incremental
  • Optional soft-delete behavior
  • Where incremental state/watermarks are persisted (typically a file-backed state.json)
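As a rough sketch, a mapping configuration could look like the following. The field names here are illustrative assumptions, not the actual schema; consult each loader's sample configs for the real keys:

```yaml
falkordb:
  endpoint: falkor://127.0.0.1:6379
  graph: sales

state_file: state.json        # where incremental watermarks are persisted

mappings:
  - name: customers
    mode: incremental         # or: full
    source:
      table: public.customers # alternatively a custom SELECT with a filter
    node:
      label: Customer
      key: customer_id
      properties: [name, email, updated_at]
  - name: purchases
    mode: incremental
    edge:
      type: PURCHASED
      from: { label: Customer, key: customer_id }
      to:   { label: Product,  key: product_id }
```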

Common concepts

  • Declarative mapping: you define mappings, and the loader handles extraction + load.
  • Idempotent writes: loaders use Cypher UNWIND + MERGE patterns.
  • Incremental safety: watermarks advance only after successful writes.
  • Restart safety: after failures, reruns continue from the last successful watermark.
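The incremental pattern above can be sketched in a few lines. This is illustrative only: the real loaders are configured declaratively and persist JSON state, while this sketch stores a single integer watermark and uses invented function names:

```rust
use std::fs;
use std::path::Path;

// Load the last successful watermark; a missing or unreadable state file
// means "start from scratch" (watermark 0).
fn load_watermark(path: &Path) -> u64 {
    fs::read_to_string(path)
        .ok()
        .and_then(|s| s.trim().parse().ok())
        .unwrap_or(0)
}

fn save_watermark(path: &Path, value: u64) {
    fs::write(path, value.to_string()).expect("persist watermark");
}

// Rows are (timestamp, payload) pairs; `write_batch` stands in for the
// Cypher UNWIND + MERGE upsert, which is idempotent on the stable key.
fn run_once<F>(rows: &[(u64, &str)], state: &Path, mut write_batch: F) -> usize
where
    F: FnMut(&[(u64, &str)]),
{
    let wm = load_watermark(state);
    // Select only rows newer than the last successful watermark.
    let batch: Vec<(u64, &str)> = rows.iter().copied().filter(|(ts, _)| *ts > wm).collect();
    if let Some(max_ts) = batch.iter().map(|(ts, _)| *ts).max() {
        write_batch(&batch);           // write first...
        save_watermark(state, max_ts); // ...advance only after the write succeeds
    }
    batch.len()
}
```

Because the watermark advances only after a successful write, a crash mid-run simply causes the next run to re-upsert the same batch, which MERGE makes harmless.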

Option A: Run a loader directly (CLI)

ClickHouse → FalkorDB

cd ClickHouse-to-FalkorDB
cargo build --release

# One-shot run
cargo run --release -- --config clickhouse.incremental.yaml

# Continuous sync
cargo run --release -- --config clickhouse.incremental.yaml --daemon --interval-secs 60

Databricks → FalkorDB

cd Databricks-to-FalkorDB/databricks-to-falkordb
cargo build --release
cargo run --release -- --config path/to/config.yaml

Note: this loader currently supports one-shot execution (no daemon/purge flags in the manifest).

MariaDB → FalkorDB

cd MariaDB-to-FalkorDB
cargo build --release

# One-shot run
cargo run --release -- --config mariadb.incremental.yaml

# Continuous sync
cargo run --release -- --config mariadb.incremental.yaml --daemon --interval-secs 60

MySQL → FalkorDB

cd MySQL-to-FalkorDB
cargo build --release

# One-shot run
cargo run --release -- --config mysql.incremental.yaml

# Continuous sync
cargo run --release -- --config mysql.incremental.yaml --daemon --interval-secs 60

PostgreSQL → FalkorDB

cd PostgreSQL-to-FalkorDB/postgres-to-falkordb
cargo build --release

# One-shot run
cargo run --release -- --config path/to/config.yaml

# Continuous sync
cargo run --release -- --config path/to/config.yaml --daemon --interval-secs 60

Snowflake → FalkorDB

cd Snowflake-to-FalkorDB
cargo build --release

# One-shot run
cargo run --release -- --config path/to/config.yaml

# Continuous sync
cargo run --release -- --config path/to/config.yaml --daemon --interval-secs 300

SQL Server → FalkorDB

cd SQLServer-to-FalkorDB
cargo build --release

# One-shot run
cargo run --release -- --config sqlserver.incremental.yaml

# Continuous sync
cargo run --release -- --config sqlserver.incremental.yaml --daemon --interval-secs 60

Optional purge modes (supported by ClickHouse, MariaDB, MySQL, Snowflake, and SQL Server):

# Purge full graph before loading
cargo run --release -- --config path/to/config.yaml --purge-graph

# Purge selected mappings
cargo run --release -- --config path/to/config.yaml --purge-mapping customers

Option B: Use the control plane (web UI + API)

The control plane discovers tools by scanning for tool.manifest.json files and provides a UI/API to:

  • Create and edit per-tool YAML/JSON configurations
  • Start runs (one-shot or daemon where supported)
  • Stop running jobs
  • Stream logs live (SSE)
  • Store run history and artifacts (SQLite + file-backed data directory)
  • Auto-wire tool metrics ports for metrics-capable tools
  • Persist per-tool/per-mapping metrics snapshots

Start the server:

cd control-plane/server

# Optional: require API key on /api (except /api/health)
export CONTROL_PLANE_API_KEY="..."

cargo run --release

Default server URL: http://localhost:3003

Configuration (environment variables):

  • CONTROL_PLANE_BIND (default 0.0.0.0:3003)
  • CONTROL_PLANE_REPO_ROOT (optional; repository root for manifest scan)
  • CONTROL_PLANE_DATA_DIR (default control-plane/data/)
  • CONTROL_PLANE_UI_DIST (default control-plane/ui/dist/)
  • CONTROL_PLANE_API_KEY (optional bearer token requirement)

Selected metrics API endpoints:

  • GET /api/metrics
  • GET /api/metrics/:tool_id
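The URL shape, assuming the default server address from above (the tool id used here is an invented example, not a real id from the repository):

```shell
BASE_URL="http://localhost:3003"
TOOL_ID="postgres-to-falkordb"   # invented example id

# All tools' persisted snapshots:
echo "GET $BASE_URL/api/metrics"
# One tool's snapshots:
echo "GET $BASE_URL/api/metrics/$TOOL_ID"

# If CONTROL_PLANE_API_KEY is set on the server, send it as a bearer token:
#   curl -H "Authorization: Bearer $CONTROL_PLANE_API_KEY" "$BASE_URL/api/metrics"
```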

Notes:

  • Runs execute locally on the machine hosting the control plane server.
  • Log streaming uses SSE.
  • For tools that declare supports_metrics: true, the control plane injects --metrics-port and persists snapshots in SQLite.
  • The Metrics UI uses persisted snapshots and does not expose internal scrape endpoint/port settings.

UI development (optional):

cd control-plane/ui
npm install
npm run dev

Screenshots:

The main tools menu with migration selection options.

Manually executing a migration run, with visibility into the latest incremental watermark and an option to clear it to restart the incremental migration from scratch.

The log view after a successful run.

The metrics view summarizing a run.

Metrics feature (all SQL loaders)

All current SQL loaders expose Prometheus-style metrics with:

  • Global counters:
    • total runs
    • failed runs
    • rows fetched
    • rows written
    • rows deleted
  • Per-mapping counters:
    • runs
    • failed runs
    • rows fetched
    • rows written
    • rows deleted
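Prometheus-style output is line-oriented name/value text. As a minimal sketch, a scrape body can be parsed like this (the metric names used in the sample are invented for illustration, not the loaders' actual names):

```rust
// Parse Prometheus-style exposition text into (metric, value) pairs,
// skipping comment/help lines. Labels stay attached to the metric name.
fn parse_metrics(body: &str) -> Vec<(String, f64)> {
    body.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .filter_map(|l| {
            let (name, value) = l.rsplit_once(' ')?;
            Some((name.to_string(), value.parse().ok()?))
        })
        .collect()
}
```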

Default metrics ports:

  • ClickHouse: 9991
  • Databricks: 9994
  • MariaDB: 9997
  • MySQL: 9995
  • PostgreSQL: 9993
  • Snowflake: 9992
  • SQL Server: 9996

You can override ports with --metrics-port (or each tool’s corresponding environment variable).

Operational tips

  • Define node mappings before edge mappings (edges depend on nodes).
  • Choose stable keys for MERGE (primary keys are usually best).
  • Use RUST_LOG=info (or debug) for richer loader diagnostics.
  • Keep state and control-plane data on durable storage for long-running sync setups.

Additional resources

  • DM-SQL-to-FalkorDB repository: https://github.com/FalkorDB/DM-SQL-to-FalkorDB
  • FalkorDB docs: https://docs.falkordb.com/