Skip to content

Connected Components

Tutorial

Connected Components

Find disconnected subgraphs with WCC, SCC, and Kosaraju algorithms

20 min Intermediate
AlgorithmsComponentsConnectivity

What You'll Learn

  • Weakly Connected Components - Find connected subgraphs ignoring edge direction
  • Strongly Connected Components - Find subgraphs where every node reaches every other
  • Kosaraju SCC - An alternative SCC algorithm using two-pass DFS
  • Comparison - When WCC and SCC differ and what that means
Info:
Prerequisites
Complete 01 Algorithm Concepts first.
# Cell 1 — Parameters
USERNAME = "_FILL_ME_IN_" # Set your email before running
# Cell 2 — Connect
from graph_olap import GraphOLAPClient
client = GraphOLAPClient(username=USERNAME)
# Cell 3 — Provision
from notebook_setup import provision
personas, conn = provision(USERNAME)
analyst = personas["analyst"]
admin = personas["admin"]
ops = personas["ops"]
client = analyst
print(f"Connected | {conn.query_scalar('MATCH (n) RETURN count(n)')} nodes")
1

Weakly Connected Components (WCC)

Group nodes that can reach each other when edge direction is ignored

A weakly connected component is a maximal set of nodes where every pair is connected by some path, ignoring the direction of edges. In a banking context, WCC reveals clusters of customers linked through shared accounts — even if the account relationship only points one way.

If the entire graph is one component, every customer is reachable from every other through some chain of shared accounts.

# Weakly Connected Components — ignores edge direction
result = conn.algo.connected_components(
node_label="Customer",
property_name="wcc",
edge_type="SHARES_ACCOUNT",
)
print(f"Status: {result.status}, Nodes updated: {result.nodes_updated}")
# Group customers by their component ID
components = conn.query("""
MATCH (c:Customer)
RETURN c.wcc AS component, collect(c.id) AS members
ORDER BY component
""")
node_count = result.nodes_updated
num_components = len(components)
print(f"\nAll {node_count} customers belong to {num_components} component — the graph is fully connected.")
components.show()
2

Strongly Connected Components (SCC)

Find subgraphs where every node can reach every other following edge direction

A strongly connected component is stricter than WCC: every node must be reachable from every other node following the direction of edges. In a directed graph, SCC can split what WCC considers one component into several smaller ones.

For our test data the SHARES_ACCOUNT edges create bidirectional reachability among all five customers, so SCC produces the same single component as WCC.

# Strongly Connected Components — respects edge direction
result = conn.algo.scc(
node_label="Customer",
property_name="scc",
edge_type="SHARES_ACCOUNT",
)
print(f"Status: {result.status}, Nodes updated: {result.nodes_updated}")
scc_components = conn.query("""
MATCH (c:Customer)
RETURN c.scc AS component, collect(c.id) AS members
ORDER BY component
""")
print(f"\nSCC found {len(scc_components)} component(s) — same as WCC for this fully connected graph.")
scc_components.show()
3

SCC — Kosaraju Algorithm

An alternative two-pass depth-first search approach

Kosaraju’s algorithm finds the same strongly connected components using a different strategy: two depth-first traversals, one on the original graph and one on the transposed (reversed) graph. It produces identical results to the default SCC but may perform differently on certain graph shapes.

Use conn.algo.scc_kosaraju() when you want an alternative implementation for validation or benchmarking.

# Kosaraju SCC — alternative two-pass DFS algorithm
result = conn.algo.scc_kosaraju(
node_label="Customer",
property_name="scc_k",
edge_type="SHARES_ACCOUNT",
)
print(f"Status: {result.status}, Nodes updated: {result.nodes_updated}")
kosaraju = conn.query("""
MATCH (c:Customer)
RETURN c.scc_k AS component, collect(c.id) AS members
ORDER BY component
""")
kosaraju.show()
4

Comparing Results

Side-by-side view of all three algorithms

# Compare all three component algorithms side by side
df = conn.query_df("""
MATCH (c:Customer)
RETURN c.id AS name,
c.wcc AS wcc,
c.scc AS scc,
c.scc_k AS kosaraju
ORDER BY c.id
""")
print("All three algorithms agree: 1 component containing all 5 customers.")
print("In a graph with one-way edges or disconnected clusters, SCC and WCC would diverge.")
df

Key Takeaways

  • WCC ignores edge direction — use it to find clusters of related entities
  • SCC respects direction — use it when mutual reachability matters (e.g. circular fund flows)
  • Kosaraju is an alternative SCC algorithm for validation or benchmarking
  • In our test graph all 5 customers form a single component under all three algorithms
  • In production, multiple components reveal distinct customer networks for segmented analysis