Community Detection using Label Propagation (CDLP)

Overview

The Community Detection using Label Propagation (CDLP) algorithm identifies communities in networks by propagating labels through the graph structure. Each node starts with a unique label, and through iterative propagation, nodes adopt the most frequent label among their neighbors, naturally forming communities where densely connected nodes share the same label.

CDLP serves as a powerful algorithm in scenarios such as:

Social network community detection
Biological network module identification
Web page clustering and topic detection
Market segmentation analysis
Fraud detection networks

Algorithm Details

CDLP initializes by assigning each node a unique label (typically its node ID). The algorithm then iteratively updates each node’s label to the most frequent label among its neighbors. During each iteration, nodes are processed in random order to avoid deterministic bias. The algorithm continues until labels stabilize (no changes occur) or a maximum number of iterations is reached. The final labels represent community assignments, where nodes sharing the same label belong to the same community.

The algorithm’s strength lies in its ability to discover communities without requiring prior knowledge of the number of communities or their sizes. It runs in near-linear time and mimics epidemic contagion by spreading labels through the network.

Performance

CDLP operates with a time complexity of O(m + n) per iteration, where:

n represents the total number of nodes
m represents the total number of edges

The algorithm typically converges within a few iterations, making it highly efficient for large-scale networks.

Syntax

CALL algo.labelPropagation([config])

Parameters

The procedure accepts an optional configuration Map with the following parameters:

Name	Type	Default	Description
`nodeLabels`	Array	All labels	Array of node labels to filter which nodes are included in the computation
`relationshipTypes`	Array	All relationship types	Array of relationship types to define which edges are traversed
`maxIterations`	Integer	10	Maximum number of iterations to run the algorithm

Return Values

The procedure returns a stream of records with the following fields:

Name	Type	Description
`node`	Node	The node entity included in the community
`communityId`	Integer	Identifier of the community the node belongs to

Examples

Let’s take this Social Network as an example:

    (Alice)---(Bob)---(Charlie)  (Kate)
       |       |         |
    (Diana)    |      (Eve)---(Frank)
       |       |         |      |
    (Grace)--(Henry)   (Iris)--(Jack)

There are 3 different communities that should emerge from this network:

Alice, Bob, Charlie, Diana, Grace, Henry
Eve, Frank, Iris, Jack
Any isolated nodes

Create the Graph

CREATE 
  (alice:Person {name: 'Alice'}),
  (bob:Person {name: 'Bob'}),
  (charlie:Person {name: 'Charlie'}),
  (diana:Person {name: 'Diana'}),
  (eve:Person {name: 'Eve'}),
  (frank:Person {name: 'Frank'}),
  (grace:Person {name: 'Grace'}),
  (henry:Person {name: 'Henry'}),
  (iris:Person {name: 'Iris'}),
  (jack:Person {name: 'Jack'}),
  (kate:Person {name: 'Kate'}),

  (alice)-[:KNOWS]->(bob),
  (bob)-[:KNOWS]->(charlie),
  (alice)-[:KNOWS]->(diana),
  (bob)-[:KNOWS]->(henry),
  (diana)-[:KNOWS]->(grace),
  (grace)-[:KNOWS]->(henry),
  (charlie)-[:KNOWS]->(eve),
  (eve)-[:KNOWS]->(frank),
  (eve)-[:KNOWS]->(iris),
  (frank)-[:KNOWS]->(jack),
  (iris)-[:KNOWS]->(jack)

Example: Detect all communities in the network

CALL algo.labelPropagation() YIELD node, communityId RETURN node.name AS name, communityId ORDER BY communityId, name 

Expected Results

| name | communityId | |————|————-| | Alice | 0 | | Bob | 0 | | Charlie | 0 | | Diana | 0 | | Grace | 0 | | Henry | 0 | | Eve | 2 | | Frank | 2 | | Iris | 2 | | Jack | 2 | | Kate | 10 |

Example: Detect communities with limited iterations

CALL algo.labelPropagation({maxIterations: 5}) YIELD node, communityId

Example: Focus on specific node types

CALL algo.labelPropagation({nodeLabels: ['Person']}) YIELD node, communityId

Example: Use only certain relationship types

CALL algo.labelPropagation({relationshipTypes: ['KNOWS', 'FRIENDS_WITH']}) YIELD node, communityId

Example: Combine node and relationship filtering

CALL algo.labelPropagation({
  nodeLabels: ['Person'], 
  relationshipTypes: ['KNOWS']
}) YIELD node, communityId

Example: Group communities together

CALL algo.labelPropagation() YIELD node, communityId 
RETURN collect(node.name) AS community_members, communityId, count(*) AS community_size
ORDER BY community_size DESC

Expected Results

| community_members | communityId | community_size | |———————————————————-|————-|—————-| | ["Alice", "Bob", "Charlie", "Diana", "Grace", "Henry"] | 0 | 6 | | ["Eve", "Frank", "Iris", "Jack"] | 2 | 4 | | ["Kate"] | 10 | 1 |

Example: Find the largest communities

CALL algo.labelPropagation() YIELD node, communityId 
RETURN communityId, collect(node) AS nodes, count(*) AS size
ORDER BY size DESC
LIMIT 1