Bulk Loader

The falkordb-bulk-loader is a Python utility for building FalkorDB graphs from CSV files. It uses the GRAPH.BULK command to import nodes and relationships in binary batches, which is much faster than issuing individual CREATE queries.

Requirements

  • Python 3.10 or later
  • A running FalkorDB instance (see Get Started)

Installation

pip install falkordb-bulk-loader

Quick Start

Given two CSV files — Person.csv (nodes) and KNOWS.csv (relationships) — import them into a graph named SocialGraph:

falkordb-bulk-insert SocialGraph \
  -n Person.csv \
  -r KNOWS.csv

The node label and the relationship type are derived from each CSV filename. To load several node or relationship files, repeat the corresponding flag:

falkordb-bulk-insert SocialGraph \
  -n Person.csv \
  -n Country.csv \
  -r KNOWS.csv \
  -r VISITED.csv
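
The filename rule can be sketched in Python. This is an illustration only, not the loader's own code: the helper name label_from_path is hypothetical, and it assumes the derived name is the filename without its directory or extension.

```python
from pathlib import PurePath


def label_from_path(csv_path: str) -> str:
    """Return the label or relationship type implied by a CSV filename.

    Assumption: the name is the file's stem (no directory, no extension).
    """
    return PurePath(csv_path).stem


# Files like those in the command above; one path includes a directory.
node_files = ["Person.csv", "data/Country.csv"]
relation_files = ["KNOWS.csv", "VISITED.csv"]

node_labels = [label_from_path(p) for p in node_files]
relation_types = [label_from_path(p) for p in relation_files]
```

If a filename does not match the label you want, the -N and -R flags listed under Key Options take an explicit label or type instead.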

Connecting to FalkorDB

By default the loader connects to redis://127.0.0.1:6379. Use --server-url to point it at a different instance:

falkordb-bulk-insert SocialGraph \
  --server-url redis://myhost:6379 \
  -n Person.csv
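
When the loader is driven from a script, it can help to assemble the command line programmatically. The sketch below uses a hypothetical helper, build_insert_argv, that emits only the flags shown above (--server-url, -n, -r). It builds the argument list without running anything, so it needs no server.

```python
import shlex
from typing import Optional, Sequence


def build_insert_argv(
    graph: str,
    nodes: Sequence[str] = (),
    relations: Sequence[str] = (),
    server_url: Optional[str] = None,
) -> list[str]:
    """Assemble a falkordb-bulk-insert command line (hypothetical helper)."""
    argv = ["falkordb-bulk-insert", graph]
    if server_url is not None:
        # Omitting the flag leaves the loader on its default URL.
        argv += ["--server-url", server_url]
    for path in nodes:
        argv += ["-n", path]
    for path in relations:
        argv += ["-r", path]
    return argv


argv = build_insert_argv(
    "SocialGraph",
    nodes=["Person.csv"],
    server_url="redis://myhost:6379",
)
print(shlex.join(argv))

# To run it against a live server:
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with unusual file paths.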

Key Options

Flag  Extended flag                     Description
-u    --server-url TEXT                 Server URL (default: redis://127.0.0.1:6379)
-n    --nodes TEXT                      Node CSV file (filename → label)
-N    --nodes-with-label TEXT           Explicit label followed by node CSV file
-r    --relations TEXT                  Relationship CSV file (filename → type)
-R    --relations-with-type TEXT        Explicit type followed by relationship CSV file
-o    --separator CHAR                  Field delimiter (default: ,)
-d    --enforce-schema                  Require typed column headers (see below)
-j    --id-type TEXT                    Type of node ID property: STRING or INTEGER
-s    --skip-invalid-nodes              Skip duplicate node IDs instead of erroring
-e    --skip-invalid-edges              Skip edges with unknown endpoints instead of erroring
-i    --index Label:Property            Create a range index after import
-f    --full-text-index Label:Property  Create a full-text index after import

Enforcing a Schema

By default the loader infers each property’s type. Pass --enforce-schema (-d) when you want explicit control. With this flag set, every column header must follow the name:TYPE format:

User.csv

:ID(User),name:STRING,rank:INT
0,"Alice",5
1,"Bob",8

FOLLOWS.csv

:START_ID(User),:END_ID(User),weight:DOUBLE
0,1,0.9
1,0,0.4

falkordb-bulk-insert SocialGraph \
  --enforce-schema \
  -n User.csv \
  -r FOLLOWS.csv

Accepted type strings: ID, START_ID, END_ID, IGNORE, STRING, INT / INTEGER / LONG, DOUBLE / FLOAT, BOOL / BOOLEAN, ARRAY.
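
Checking headers before a large import can catch typos early. The following is a minimal sketch of such a check, not the loader's own parser. The names parse_header_field and check_header are hypothetical. It assumes that the slash-separated type strings above are aliases of one another, that type names are upper-case, and that the parenthesised suffix on ID columns (called a namespace here) is optional.

```python
import re

# Type strings listed above; aliases map to one canonical name (an assumption).
CANONICAL_TYPES = {
    "ID": "ID", "START_ID": "START_ID", "END_ID": "END_ID", "IGNORE": "IGNORE",
    "STRING": "STRING",
    "INT": "INTEGER", "INTEGER": "INTEGER", "LONG": "INTEGER",
    "DOUBLE": "DOUBLE", "FLOAT": "DOUBLE",
    "BOOL": "BOOLEAN", "BOOLEAN": "BOOLEAN",
    "ARRAY": "ARRAY",
}

# name (may be empty), a colon, the type, then an optional "(Namespace)" suffix.
HEADER_FIELD = re.compile(
    r"^(?P<name>[^:]*):(?P<type>[A-Z_]+)(?:\((?P<ns>[^)]*)\))?$"
)


def parse_header_field(field: str):
    """Split 'name:TYPE' or ':ID(Namespace)' into (name, type, namespace)."""
    match = HEADER_FIELD.match(field.strip())
    if match is None:
        raise ValueError(f"header {field!r} is not in name:TYPE format")
    type_name = match["type"]
    if type_name not in CANONICAL_TYPES:
        raise ValueError(f"unknown type {type_name!r} in header {field!r}")
    return match["name"], CANONICAL_TYPES[type_name], match["ns"]


def check_header(line: str, separator: str = ","):
    """Parse every field of a header line, raising on the first bad one."""
    return [parse_header_field(f) for f in line.split(separator)]


user_header = check_header(":ID(User),name:STRING,rank:INT")
follows_header = check_header(":START_ID(User),:END_ID(User),weight:DOUBLE")
```

A header such as rank:NUMBER or a bare rank raises ValueError, so the check can run before any data is sent to the server.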

Bulk Updates

The companion command falkordb-bulk-update reads a CSV in batches and issues a parameterized Cypher query for each row, with that row's fields available as row[0], row[1], and so on. It is useful for incremental updates or when you want full control over the Cypher:

falkordb-bulk-update SocialGraph \
  --csv User.csv \
  --query "MERGE (:User {id: row[0], name: row[1], rank: row[2]})"

Note: falkordb-bulk-update commits changes incrementally, so a failure partway through the file leaves the earlier changes applied. Validate your CSV inputs beforehand to avoid leaving the graph in a partially updated state.
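
One way to act on this note is a pre-flight pass over the CSV before the update runs. The sketch below is a hypothetical check, not part of the loader: find_bad_rows reports rows with the wrong number of fields or with non-integer values in columns the query expects to be integers. The sample contains data rows only. If your file has a header row, skip it before checking.

```python
import csv
import io


def find_bad_rows(csv_text: str, expected_columns: int, int_columns=()):
    """Return (line_number, reason) pairs for rows likely to break an update.

    Hypothetical pre-flight check; it never contacts the server.
    """
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    for row in reader:
        if len(row) != expected_columns:
            reason = f"expected {expected_columns} fields, got {len(row)}"
            problems.append((reader.line_num, reason))
            continue
        for index in int_columns:
            try:
                int(row[index])
            except ValueError:
                reason = f"field {index} is not an integer: {row[index]!r}"
                problems.append((reader.line_num, reason))
    return problems


# Row 2 has a non-numeric rank; row 3 is missing a field.
sample = '0,"Alice",5\n1,"Bob",eight\n2,"Carol"\n'
problems = find_bad_rows(sample, expected_columns=3, int_columns=(0, 2))
for line_number, reason in problems:
    print(f"line {line_number}: {reason}")
```

Running falkordb-bulk-update only when the returned list is empty keeps malformed rows from stopping the update midway.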

Diagnostics

Both falkordb-bulk-insert and falkordb-bulk-update install a SIGUSR1 handler at startup. Sending SIGUSR1 to a running loader process writes the tracebacks of all Python threads to stderr, which is useful for diagnosing hangs or unexpectedly slow loads without attaching a debugger:

kill -SIGUSR1 <pid>

This relies on Python’s faulthandler module and is only available on platforms that support SIGUSR1 (i.e., not Windows). On unsupported platforms, registration is silently skipped.
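
The same pattern is easy to reuse in your own long-running scripts. The sketch below shows one way to register such a handler with faulthandler. It is not the loader's actual source, and the function name install_traceback_handler is hypothetical.

```python
import faulthandler
import os
import signal
import sys


def install_traceback_handler(file=None) -> bool:
    """Dump all thread tracebacks to `file` when SIGUSR1 arrives.

    Returns False, doing nothing, on platforms without SIGUSR1 support
    (for example Windows, where faulthandler.register is unavailable).
    """
    if not hasattr(signal, "SIGUSR1") or not hasattr(faulthandler, "register"):
        return False
    target = file if file is not None else sys.stderr
    # The file must stay open for as long as the handler is registered.
    faulthandler.register(signal.SIGUSR1, file=target, all_threads=True)
    return True


if __name__ == "__main__":
    if install_traceback_handler():
        print(f"dump tracebacks with: kill -SIGUSR1 {os.getpid()}", file=sys.stderr)
```

Because faulthandler writes the tracebacks from a low-level signal handler, the dump works even when the Python threads are blocked.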

Further Reading