Structural Analysis
Analyse graph topology with K-Core, Triangle Count, and Clustering Coefficient
What You'll Learn
- K-Core Decomposition - Identify the densest subgraph cores
- Triangle Count - Count closed triads per node
- Clustering Coefficient - Measure how tightly neighbours are connected
- Structural Comparison - Combine measures to characterise node roles
# Cell 1: Parameters
USERNAME = "_FILL_ME_IN_"  # Set your email before running

# Cell 2: Connect
from graph_olap import GraphOLAPClient
client = GraphOLAPClient(username=USERNAME)

# Cell 3: Provision
from notebook_setup import provision
personas, conn = provision(USERNAME)
analyst = personas["analyst"]
admin = personas["admin"]
ops = personas["ops"]
client = analyst

print(f"Connected | {conn.query_scalar('MATCH (n) RETURN count(n)')} nodes")

K-Core Decomposition
Find the densest subgraph where every node has at least k connections
A k-core is a maximal subgraph in which every node has degree at least k within that subgraph. The coreness of a node is the largest k for which the node belongs to the k-core. Nodes with high coreness sit in the densely connected heart of the network.
In banking, high-coreness customers share accounts with many other high-coreness customers — a pattern worth reviewing in AML investigations.
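To make the definition concrete, here is a minimal pure-Python sketch of the peeling procedure that is commonly used to compute coreness: repeatedly remove the node with the lowest remaining degree and record the largest degree seen so far. It is an illustration only and says nothing about how conn.algo.kcore is implemented. The toy graph and the helper names build_adjacency and coreness are hypothetical and not part of the tutorial's dataset or API.

```python
# Hypothetical toy graph (not the tutorial's dataset): a triangle A-B-C
# with a pendant node D attached to C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]


def build_adjacency(edge_list):
    """Build an undirected adjacency map from a list of node pairs."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def coreness(adj):
    """Return {node: coreness} by repeatedly peeling the lowest-degree node."""
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick the remaining node with the smallest current degree
        # (ties broken by name so the run is deterministic).
        node = min(remaining, key=lambda n: (degree[n], n))
        # The core level never decreases as peeling proceeds.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        # Removing the node lowers the degree of its remaining neighbours.
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


core_numbers = coreness(build_adjacency(edges))
for name in sorted(core_numbers):
    print(name, core_numbers[name])
```

In this toy graph the pendant node D is peeled first with coreness 1, while the triangle A, B, C forms the 2-core. The linear scan for the minimum keeps the sketch short; a bucket-based queue would be the usual choice for large graphs.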
# K-Core decomposition: assign a coreness value to each node
result = conn.algo.kcore(
    node_label="Customer",
    property_name="kcore",
    edge_type="SHARES_ACCOUNT",
)
print(f"Status: {result.status}, Nodes updated: {result.nodes_updated}")

cores = conn.query("""
    MATCH (c:Customer)
    RETURN c.id AS name, c.kcore AS coreness
    ORDER BY c.kcore DESC, c.id
""")

print("\nAll nodes are in the 2-core: every customer connects to at least 2 others.")

cores.show()

Triangle Count
Count closed triads to measure local density
A triangle is a set of three nodes that are all connected to each other. The triangle count for a node tells you how many such closed triads it participates in.
High triangle counts indicate tightly knit groups. In a shared-account network, triangles mean three customers all share accounts with each other — a potentially suspicious pattern worth investigating.
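The per-node triangle count can be sketched in a few lines of plain Python: for each node, count the pairs of its neighbours that are themselves linked. This is an illustrative sketch on a hypothetical four-node graph with made-up node names, not the tutorial's data and not the implementation behind conn.algo.triangle_count.

```python
from itertools import combinations

# Hypothetical four-node graph (names are made up for illustration):
# A and B are linked, and C and D are each linked to both A and B.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("B", "D")]

# Undirected adjacency map.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def triangles_per_node(adj):
    """For each node, count the neighbour pairs that are directly linked."""
    counts = {}
    for node, nbrs in adj.items():
        counts[node] = sum(
            1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a]
        )
    return counts


triangle_counts = triangles_per_node(adj)
for name in sorted(triangle_counts):
    print(name, triangle_counts[name])

# Every triangle is counted once at each of its three corners.
total_triangles = sum(triangle_counts.values()) // 3
print("distinct triangles:", total_triangles)
```

Here A and B each sit in two triangles (A-B-C and A-B-D), while C and D each sit in one.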
# Triangle count: number of closed triads per node
result = conn.algo.triangle_count(
    node_label="Customer",
    property_name="triangles",
    edge_type="SHARES_ACCOUNT",
)
print(f"Status: {result.status}, Nodes updated: {result.nodes_updated}")

triangles = conn.query("""
    MATCH (c:Customer)
    RETURN c.id AS name, c.triangles AS tri
    ORDER BY c.triangles DESC, c.id
""")

print("\nLAU and KWONG participate in the most triangles (degree-3 hub nodes).")

triangles.show()

Clustering Coefficient
How connected are a node's neighbours to each other?
The clustering coefficient of a node is the ratio of the actual number of connections between its neighbours to the maximum possible number, which is d(d-1)/2 for a node with d neighbours. A value of 1.0 means every pair of neighbours is connected; 0.0 means none are.
This uses the NetworkX integration via conn.networkx.clustering_coefficient().
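The ratio described above can be checked by hand with a short pure-Python sketch. The graph is a hypothetical four-node example with made-up names, and local_clustering is an illustrative helper rather than part of the conn.networkx API. Nodes with fewer than two neighbours are given 0.0, which is the same convention NetworkX uses.

```python
from itertools import combinations

# Hypothetical four-node graph (names are made up for illustration):
# A and B are degree-3 nodes, C and D are degree-2 nodes.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("B", "D")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are directly connected."""
    nbrs = adj[node]
    possible = len(nbrs) * (len(nbrs) - 1) // 2
    if possible == 0:
        return 0.0  # fewer than two neighbours: no pairs to connect
    actual = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
    return actual / possible


clustering = {node: local_clustering(adj, node) for node in sorted(adj)}
for name, value in clustering.items():
    print(name, round(value, 3))
```

In this example the degree-2 nodes C and D score 1.0 because their single neighbour pair is connected, and the degree-3 nodes A and B score 0.667 because 2 of their 3 neighbour pairs are connected.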
# Clustering coefficient via NetworkX integration
result = conn.networkx.clustering_coefficient(
    node_label="Customer",
    property_name="clustering",
)
print(f"Status: {result.status}")

clustering = conn.query("""
    MATCH (c:Customer)
    RETURN c.id AS name, round(c.clustering, 3) AS cc
    ORDER BY c.clustering, c.id
""")

print("\nDegree-2 nodes have clustering 1.0 (both neighbours connected).")
print("Degree-3 hub nodes have 0.667 (2 of 3 neighbour pairs connected).")

clustering.show()

Comparing Structural Measures
Side-by-side view of K-Core, triangles, and clustering
# Compare all structural measures side by side
df = conn.query_df("""
    MATCH (c:Customer)
    RETURN c.id AS name,
           c.kcore AS kcore,
           c.triangles AS triangles,
           round(c.clustering, 3) AS clustering
    ORDER BY c.triangles DESC, c.id
""")

print("Hub nodes (LAU, KWONG): high triangles, lower clustering (more neighbours to connect).")
print("Peripheral nodes: fewer triangles but perfect clustering (small tight group).")

df

Key Takeaways
- K-Core reveals the densest core of the network — nodes that survive iterative pruning
- Triangle Count measures local density; high counts indicate tightly knit groups
- Clustering Coefficient normalises the triangle count by the number of neighbour pairs, d(d-1)/2; high values mean neighbours are well-connected
- Hub nodes often have high triangle count but lower clustering (many neighbours, not all connected)
- Combining these measures characterises node roles: hubs vs. peripheral members
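As a rough illustration of how the measures combine, the sketch below derives degree, triangle count, and clustering for each node of a hypothetical four-node graph and attaches a role label. The rule "hub if degree is above the mean degree" is an assumption made purely for this example, not a threshold taken from the tutorial.

```python
from itertools import combinations

# Hypothetical four-node graph (names are made up for illustration).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("B", "D")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

mean_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)

rows = []
for node in sorted(adj):
    d = len(adj[node])
    # Triangles at this node: linked pairs among its neighbours.
    t = sum(1 for a, b in combinations(sorted(adj[node]), 2) if b in adj[a])
    # Clustering is the triangle count divided by the d*(d-1)/2 neighbour pairs.
    cc = 2 * t / (d * (d - 1)) if d > 1 else 0.0
    # Assumed rule for illustration only: above-average degree means "hub".
    role = "hub" if d > mean_degree else "peripheral"
    rows.append((node, d, t, round(cc, 3), role))

print("node degree triangles clustering role")
for row in rows:
    print(*row)
```

The higher-degree nodes end up with more triangles but lower clustering, while the degree-2 nodes have a single triangle and a clustering of 1.0.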