Community Detection
Find clusters and groups with Louvain and Label Propagation
What You'll Learn
- Louvain - Modularity-based community detection
- Label Propagation - Fast iterative community assignment
- Comparison - When to use each algorithm
# Cell 1: Parameters
USERNAME = "_FILL_ME_IN_"  # Set your email before running

# Cell 2: Connect
from graph_olap import GraphOLAPClient
client = GraphOLAPClient(username=USERNAME)

# Cell 3: Provision
from notebook_setup import provision
personas, conn = provision(USERNAME)
analyst = personas["analyst"]
admin = personas["admin"]
ops = personas["ops"]
client = analyst

print(f"Connected | {conn.query_scalar('MATCH (n) RETURN count(n)')} nodes")

Louvain Community Detection
Find communities by optimising modularity
The Louvain algorithm detects communities by maximising modularity — a measure of how densely connected nodes are within a community compared to connections between communities. It works in two phases: first assigning each node to the community that gives the largest modularity gain, then collapsing communities into super-nodes and repeating.
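To make the modularity score concrete, here is a minimal pure-Python sketch of the standard formula for an undirected, unweighted graph. The `modularity` helper and the toy graph are illustrative assumptions, not part of `graph_olap`, and the sketch only scores a given partition; it does not perform the Louvain search itself.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community id.
    """
    m = len(edges)
    intra_edges = defaultdict(int)  # edges with both ends in the same community
    degree_sum = defaultdict(int)   # summed degree of each community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra_edges[community_of[u]] += 1
    # Q = sum over communities of (observed intra-edge share - expected share)
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by one bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = modularity(edges, two_groups)  # 5/14, about 0.357
q_one = modularity(edges, one_group)   # 0.0
print(f"two groups: {q_two:.3f}, one group: {q_one:.3f}")
```

Splitting the toy graph at the bridge scores higher than lumping everything together, which is the kind of gain Louvain looks for when it moves nodes between communities.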
In our test graph, the 5 customers are linked to one another by 6 SHARES_ACCOUNT edges, forming a single tightly connected cluster. Louvain correctly assigns them all to the same community.
# Run Louvain community detection
result = conn.algo.louvain(
    node_label="Customer",
    property_name="comm_louvain",
    edge_type="SHARES_ACCOUNT",
)
print(f"Louvain {result.status} \u2014 {result.nodes_updated} nodes assigned to communities")

# View community assignments
df = conn.query_df("""
    MATCH (c:Customer)
    RETURN c.id AS name, c.comm_louvain AS community
    ORDER BY c.comm_louvain, c.id
""")
df

All 5 customers are assigned to community 0. This is expected: the test graph is small and densely connected, so there is no natural split into separate groups.
Label Propagation
Fast community detection through neighbour voting
Label Propagation works differently from Louvain. Each node starts with a unique label, then iteratively adopts the most common label among its neighbours. The process converges when no node wants to change its label.
Label Propagation is faster than Louvain (near-linear time complexity) but not deterministic: different runs may produce slightly different community assignments, typically because ties between labels and the order in which nodes are updated are resolved randomly. It is a good choice when speed matters more than precision, especially on very large graphs.
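The neighbour-voting loop can be sketched in a few lines of pure Python. This is a minimal asynchronous variant written for illustration; the `label_propagation` function, its `seed` argument, and the toy graph are assumptions and do not reflect how `conn.algo.label_propagation` is implemented.

```python
import random
from collections import Counter


def label_propagation(adjacency, max_iterations=100, seed=None):
    """Asynchronous label propagation on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours.
    Returns a dict mapping each node to its final label.
    """
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}  # unique starting labels
    nodes = list(adjacency)
    for _ in range(max_iterations):
        rng.shuffle(nodes)  # visiting order is one source of run-to-run variation
        changed = False
        for node in nodes:
            counts = Counter(labels[nb] for nb in adjacency[node])
            if not counts:
                continue  # an isolated node keeps its own label
            top = max(counts.values())
            best = sorted(label for label, c in counts.items() if c == top)
            if labels[node] in best:
                continue  # current label is already among the most common
            labels[node] = rng.choice(best)  # random tie-break is the other source
            changed = True
        if not changed:
            break  # converged: no node wants to change its label
    return labels


# Hypothetical toy graph: two separate triangles.
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
labels = label_propagation(adjacency, seed=42)
groups = {}
for node, label in labels.items():
    groups.setdefault(label, []).append(node)
print(sorted(sorted(members) for members in groups.values()))
```

On this toy graph each triangle ends up sharing a single label, although which label survives depends on the seed. On graphs with weaker structure the grouping itself can differ between runs.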
# Run Label Propagation
result = conn.algo.label_propagation(
    node_label="Customer",
    property_name="comm_lp",
    edge_type="SHARES_ACCOUNT",
    max_iterations=100,
)
print(f"Label Propagation {result.status} \u2014 {result.nodes_updated} nodes assigned to communities")

df = conn.query_df("""
    MATCH (c:Customer)
    RETURN c.id AS name, c.comm_lp AS community
    ORDER BY c.comm_lp, c.id
""")
df

Comparing Methods
Louvain vs Label Propagation side by side
Both algorithms assigned all customers to the same community in our test graph. Let us compare the assignments side by side and discuss when to choose which.
| Criterion | Louvain | Label Propagation |
|---|---|---|
| Quality | Higher (modularity-optimised) | Good but non-deterministic |
| Speed | Roughly O(n log n) in practice | Near-linear in the number of edges |
| Best for | Production analysis | Exploratory / large graphs |
| Deterministic | Yes | No (may vary between runs) |
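When comparing two sets of assignments, keep in mind that community ids are typically arbitrary labels, so two algorithms can produce the same grouping under different ids. The sketch below checks whether two assignments describe the same partition. The `same_partition` helper and the sample data are hypothetical and not part of `graph_olap`.

```python
def same_partition(assignment_a, assignment_b):
    """True if two node -> community mappings group the nodes identically,
    regardless of which community ids each one uses."""
    if assignment_a.keys() != assignment_b.keys():
        return False
    a_to_b, b_to_a = {}, {}
    for node, a in assignment_a.items():
        b = assignment_b[node]
        # Each id on one side must map to exactly one id on the other side.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True


# Hypothetical assignments for illustration (not output from the notebook).
louvain = {"n1": 0, "n2": 0, "n3": 1}
label_prop = {"n1": 7, "n2": 7, "n3": 3}
coarser = {"n1": 0, "n2": 0, "n3": 0}

print(same_partition(louvain, label_prop))  # True: same groups, different ids
print(same_partition(louvain, coarser))     # False: n3 is merged into the group
```

In a notebook, the two dictionaries could be built from the `name` and community columns of a query result before calling the helper.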
# Compare both community assignments in one table
df = conn.query_df("""
    MATCH (c:Customer)
    RETURN c.id AS name, c.comm_louvain AS louvain, c.comm_lp AS label_propagation
    ORDER BY c.id
""")
df

Tuning with Resolution
Control community granularity with parameters
The Louvain algorithm accepts a resolution parameter that controls community granularity:
- resolution < 1.0 — Fewer, larger communities (merge small groups)
- resolution = 1.0 — Default behaviour
- resolution > 1.0 — More, smaller communities (split large groups)
Higher resolution values make the algorithm more aggressive about splitting nodes into separate communities. On our small test graph the effect is limited, but on larger production graphs this parameter is essential for tuning results.
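The effect of resolution can be seen directly in the objective. The sketch below assumes the common formulation in which resolution multiplies the expected-edges term of modularity; `conn.algo.louvain` may define the parameter differently, and the `modularity` helper and toy graph here are illustrative only.

```python
from collections import defaultdict
from itertools import combinations


def modularity(edges, community_of, resolution=1.0):
    """Modularity with a resolution factor on the expected-edges term
    (an assumed, commonly used formulation)."""
    m = len(edges)
    intra_edges = defaultdict(int)  # edges with both ends in the same community
    degree_sum = defaultdict(int)   # summed degree of each community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra_edges[community_of[u]] += 1
    return sum(
        intra_edges[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: 4 nodes with every pair connected (6 edges).
edges = list(combinations(range(4), 2))
merged = {0: "a", 1: "a", 2: "a", 3: "a"}
split = {0: "a", 1: "a", 2: "b", 3: "b"}

for resolution in (1.0, 2.0):
    q_merged = modularity(edges, merged, resolution)
    q_split = modularity(edges, split, resolution)
    winner = "merged" if q_merged > q_split else "split"
    print(f"resolution={resolution}: merged={q_merged:.3f}, split={q_split:.3f} -> {winner}")
```

At resolution 1.0 the single merged community scores higher, while at 2.0 the split partition wins. The larger penalty on big communities is what pushes the optimiser towards smaller groups.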
# Run Louvain with higher resolution to produce finer-grained communities
result = conn.algo.louvain(
    node_label="Customer",
    property_name="comm_louvain_hi",
    edge_type="SHARES_ACCOUNT",
    resolution=2.0,
)
print(f"Louvain (resolution=2.0) {result.status} \u2014 {result.nodes_updated} nodes assigned")

# Compare default vs high-resolution communities
df = conn.query_df("""
    MATCH (c:Customer)
    RETURN c.id AS name, c.comm_louvain AS default_res_1_0, c.comm_louvain_hi AS high_res_2_0
    ORDER BY c.id
""")
df

With the default resolution (1.0), all customers form a single community. Increasing the resolution to 2.0 causes Louvain to split the graph into two communities: MR LAU XIAOMING and KWONG XIAO TONG (the higher-degree nodes) remain together, while the three lower-degree customers form a separate group.
Key Takeaways
- Louvain finds high-quality communities by optimising modularity (conn.algo.louvain)
- Label Propagation is faster but non-deterministic, which makes it a good fit for large-scale exploration (conn.algo.label_propagation)
- The resolution parameter controls community granularity: higher values produce smaller, more numerous communities
- In banking, communities reveal shared-account networks and related parties