Bulk Loader

The falkordb-bulk-loader is a Python utility for building FalkorDB graphs from CSV files. It uses the GRAPH.BULK command to import nodes and relationships in binary batches, which is much faster than issuing individual CREATE queries.

Requirements

Python 3.10 or later
A running FalkorDB instance (see Get Started)

Installation

pip install falkordb-bulk-loader

Quick Start

Given two CSV files — Person.csv (nodes) and KNOWS.csv (relationships) — import them into a graph named SocialGraph:

falkordb-bulk-insert SocialGraph \
  -n Person.csv \
  -r KNOWS.csv

The node label and the relationship type are derived from each CSV filename, so Person.csv produces the label Person and KNOWS.csv the type KNOWS. To import several node or relationship files, repeat the corresponding flag:

falkordb-bulk-insert SocialGraph \
  -n Person.csv \
  -n Country.csv \
  -r KNOWS.csv \
  -r VISITED.csv
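
To preview which labels and relationship types a set of files will produce, the filename rule described above can be mimicked in a few lines of Python. This is only a sketch: it assumes the loader takes the file's base name with the extension removed, and `derived_name` is a hypothetical helper, not part of the loader.

```python
from pathlib import Path


def derived_name(csv_path: str) -> str:
    """Return the label or relationship type implied by a CSV filename.

    Assumption: the loader uses the base name without its extension.
    """
    return Path(csv_path).stem


if __name__ == "__main__":
    # Directory components are ignored; only the file name matters here.
    for path in ["Person.csv", "data/Country.csv", "KNOWS.csv", "VISITED.csv"]:
        print(f"{path}: {derived_name(path)}")
```

If a filename does not match the label you want, the -N and -R flags listed under Key Options accept an explicit label or type instead.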

Connecting to FalkorDB

By default the loader connects to redis://127.0.0.1:6379. Use --server-url to point it at a different instance:

falkordb-bulk-insert SocialGraph \
  --server-url redis://myhost:6379 \
  -n Person.csv
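
When the import is driven from a Python script, the same command can be assembled as an argument list. The sketch below uses only flags shown in this guide; `build_insert_command` is a hypothetical helper, and the host name is a placeholder.

```python
import shlex
from typing import Optional, Sequence


def build_insert_command(
    graph: str,
    nodes: Sequence[str] = (),
    relations: Sequence[str] = (),
    server_url: Optional[str] = None,
) -> list[str]:
    """Assemble a falkordb-bulk-insert invocation as an argv list."""
    cmd = ["falkordb-bulk-insert", graph]
    if server_url is not None:
        cmd += ["--server-url", server_url]
    for path in nodes:
        cmd += ["-n", path]  # repeat -n once per node file
    for path in relations:
        cmd += ["-r", path]  # repeat -r once per relationship file
    return cmd


if __name__ == "__main__":
    cmd = build_insert_command(
        "SocialGraph",
        nodes=["Person.csv"],
        server_url="redis://myhost:6379",  # placeholder host
    )
    # Print the shell-quoted command so it can be reviewed before running.
    print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`, which raises an exception if the loader exits with a non-zero status.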

Key Options

Flag	Extended flag	Description
`-u`	`--server-url TEXT`	Server URL (default: `redis://127.0.0.1:6379`)
`-n`	`--nodes TEXT`	Node CSV file (filename → label)
`-N`	`--nodes-with-label TEXT`	Explicit label followed by node CSV file
`-r`	`--relations TEXT`	Relationship CSV file (filename → type)
`-R`	`--relations-with-type TEXT`	Explicit type followed by relationship CSV file
`-o`	`--separator CHAR`	Field delimiter (default: `,`)
`-d`	`--enforce-schema`	Require typed column headers (see below)
`-j`	`--id-type TEXT`	Type of node ID property: `STRING` or `INTEGER`
`-s`	`--skip-invalid-nodes`	Skip duplicate node IDs instead of erroring
`-e`	`--skip-invalid-edges`	Skip edges with unknown endpoints instead of erroring
`-i`	`--index Label:Property`	Create a range index after import
`-f`	`--full-text-index Label:Property`	Create a full-text index after import

Enforcing a Schema

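By default the loader infers each property’s type. Use --enforce-schema (-d) when you want explicit control. With this flag, every column header must follow the name:TYPE format. ID columns may leave the name empty and carry a namespace in parentheses, as in :ID(User):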

User.csv

:ID(User),name:STRING,rank:INT
0,"Alice",5
1,"Bob",8

FOLLOWS.csv

:START_ID(User),:END_ID(User),weight:DOUBLE
0,1,0.9
1,0,0.4

falkordb-bulk-insert SocialGraph \
  --enforce-schema \
  -n User.csv \
  -r FOLLOWS.csv

Accepted type strings: ID, START_ID, END_ID, IGNORE, STRING, INT / INTEGER / LONG, DOUBLE / FLOAT, BOOL / BOOLEAN, ARRAY.
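
Before a large import it can help to check header rows against the list above. The following is a rough pre-flight sketch, not the loader's own parser: `invalid_header_fields` is a hypothetical helper, it assumes type strings are written in upper case, and it splits on the separator without handling quoted header fields.

```python
import re

# Type strings listed in this guide.
ACCEPTED_TYPES = {
    "ID", "START_ID", "END_ID", "IGNORE", "STRING",
    "INT", "INTEGER", "LONG", "DOUBLE", "FLOAT",
    "BOOL", "BOOLEAN", "ARRAY",
}

# name:TYPE with an optional "(Namespace)" suffix, as in ":ID(User)".
HEADER_FIELD = re.compile(r"^[^:]*:([A-Z_]+)(?:\([^)]*\))?$")


def invalid_header_fields(header_line: str, separator: str = ",") -> list[str]:
    """Return the header fields that do not look like name:TYPE."""
    bad = []
    for field in header_line.strip().split(separator):
        match = HEADER_FIELD.match(field.strip())
        if match is None or match.group(1) not in ACCEPTED_TYPES:
            bad.append(field)
    return bad


if __name__ == "__main__":
    print(invalid_header_fields(":ID(User),name:STRING,rank:INT"))
    print(invalid_header_fields("name,rank:NUMBER"))
```

An empty result means every field names a type from the accepted list; anything returned is worth fixing before running the loader with --enforce-schema.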

Bulk Updates

The companion command falkordb-bulk-update reads a CSV file in batches and applies a Cypher query to every row, with the row's fields available as row[0], row[1], and so on. It is useful for incremental updates or when you want full control over the Cypher:

falkordb-bulk-update SocialGraph \
  --csv User.csv \
  --query "MERGE (:User {id: row[0], name: row[1], rank: row[2]})"

Note: falkordb-bulk-update commits changes incrementally. Sanitize your CSV inputs beforehand to avoid leaving the graph in a partially-updated state.
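
One simple sanitization step is to confirm that every row has the number of fields the query expects; the query above reads row[0] through row[2], so three fields per row. The sketch below is one possible check, with `rows_with_wrong_width` as a hypothetical helper name. It relies on Python's standard csv module and assumes the default comma separator.

```python
import csv
from typing import Iterable


def rows_with_wrong_width(lines: Iterable[str], expected_fields: int) -> list[int]:
    """Return 1-based row numbers whose field count differs from expected_fields."""
    reader = csv.reader(lines)
    return [
        number
        for number, row in enumerate(reader, start=1)
        if len(row) != expected_fields
    ]


if __name__ == "__main__":
    # Inline sample data; an open file works too,
    # for example open("User.csv", newline="").
    sample = ['0,"Alice",5', '1,"Bob"', '2,"Carol",7']
    print(rows_with_wrong_width(sample, expected_fields=3))
```

Running the update only when the returned list is empty reduces the chance of a failure partway through the file.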

Diagnostics

Both falkordb-bulk-insert and falkordb-bulk-update install a SIGUSR1 handler at startup. Sending SIGUSR1 to a running loader process writes the tracebacks of all Python threads to stderr, which is useful for diagnosing hangs or unexpectedly slow loads without attaching a debugger:

kill -SIGUSR1 <pid>

This relies on Python’s faulthandler module and is only available on platforms that support SIGUSR1 (i.e., not Windows). On unsupported platforms, registration is silently skipped.
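
The same technique can be added to your own long-running scripts. The sketch below shows how a handler of this kind can be registered with `faulthandler.register`; it illustrates the mechanism and is not the loader's actual code, and `install_traceback_handler` is a hypothetical name.

```python
import faulthandler
import signal
import sys


def install_traceback_handler(file=sys.stderr) -> bool:
    """Dump all thread tracebacks to `file` when SIGUSR1 arrives.

    Returns False on platforms without SIGUSR1 (for example Windows).
    `file` must stay open for as long as the handler is registered.
    """
    if not hasattr(signal, "SIGUSR1"):
        return False
    # all_threads=True includes every Python thread, not only the current one.
    faulthandler.register(signal.SIGUSR1, file=file, all_threads=True)
    return True


if __name__ == "__main__":
    if install_traceback_handler():
        print("SIGUSR1 handler installed")
    else:
        print("SIGUSR1 is not available on this platform")
```

The handler writes directly to the file descriptor, so it still works when the interpreter is blocked in a long-running call.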