Structural Analysis

Analyse graph topology with K-Core, Triangle Count, and Clustering Coefficient

25 min | Intermediate
Tags: Algorithms, Structural, K-Core, Triangles

What You'll Learn

  • K-Core Decomposition - Identify the densest subgraph cores
  • Triangle Count - Count closed triads per node
  • Clustering Coefficient - Measure how tightly neighbours are connected
  • Structural Comparison - Combine measures to characterise node roles

Prerequisites: Complete 01 Algorithm Concepts first.

# Cell 1 — Parameters
USERNAME = "_FILL_ME_IN_" # Set your email before running
# Cell 2 — Connect
from graph_olap import GraphOLAPClient
client = GraphOLAPClient(username=USERNAME)
# Cell 3 — Provision
from notebook_setup import provision
personas, conn = provision(USERNAME)
analyst = personas["analyst"]
admin = personas["admin"]
ops = personas["ops"]
client = analyst
print(f"Connected | {conn.query_scalar('MATCH (n) RETURN count(n)')} nodes")

1. K-Core Decomposition

Find the maximal subgraph in which every node has at least k connections

A k-core is a maximal subgraph in which every node has degree at least k within that subgraph. The coreness of a node is the largest k for which the node still belongs to the k-core. Nodes with high coreness sit in the densely connected heart of the network.

In banking, high-coreness customers share accounts with many other high-coreness customers — a pattern worth reviewing in AML investigations.
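The definition above can be illustrated with a minimal pure-Python sketch of the usual "peeling" approach: repeatedly remove the node with the smallest remaining degree and record the highest degree seen at removal time. This is not how `conn.algo.kcore` is implemented (the source does not say). The `coreness` helper and the five-node toy graph with nodes A to E are hypothetical and are not the tutorial dataset.

```python
from collections import defaultdict


def coreness(edges):
    """Return {node: coreness} for an undirected simple graph given as edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the node with the smallest remaining degree (ties broken by name).
        node = min(remaining, key=lambda n: (degree[n], str(n)))
        # Coreness never decreases as peeling proceeds.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical toy graph: a "diamond" (A, B, C, D) plus a pendant node E.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("D", "E")]
core = coreness(toy_edges)
for node in sorted(core):
    print(node, core[node])
# E has coreness 1; A, B, C and D have coreness 2
```

The pendant node E is pruned first, which is why it ends up outside the 2-core even though its neighbour D stays inside it.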

# K-Core decomposition — assign coreness value to each node
result = conn.algo.kcore(
    node_label="Customer",
    property_name="kcore",
    edge_type="SHARES_ACCOUNT",
)
print(f"Status: {result.status}, Nodes updated: {result.nodes_updated}")
cores = conn.query("""
MATCH (c:Customer)
RETURN c.id AS name, c.kcore AS coreness
ORDER BY c.kcore DESC, c.id
""")
print("\nAll nodes are in the 2-core: every customer connects to at least 2 others.")
cores.show()

2. Triangle Count

Count closed triads to measure local density

A triangle is a set of three nodes that are all connected to each other. The triangle count for a node tells you how many such closed triads it participates in.

High triangle counts indicate tightly knit groups. In a shared-account network, triangles mean three customers all share accounts with each other — a potentially suspicious pattern worth investigating.
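As a rough illustration of what a per-node triangle count means, the sketch below counts, for each node, the pairs of its neighbours that are themselves connected. It is a plain-Python approximation for small graphs, not the implementation behind `conn.algo.triangle_count`. The `triangle_counts` helper and the four-node toy graph are hypothetical.

```python
from itertools import combinations


def triangle_counts(edges):
    """Return {node: number of triangles the node participates in}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {}
    for node, nbrs in adj.items():
        # Each connected pair of neighbours closes one triangle through `node`.
        counts[node] = sum(
            1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a]
        )
    return counts


# Hypothetical toy graph: A and B are linked to each other and to both C and D.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]
tri = triangle_counts(toy_edges)
for node in sorted(tri):
    print(node, tri[node])
# A and B each sit in 2 triangles; C and D each sit in 1
```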

# Triangle count — number of closed triads per node
result = conn.algo.triangle_count(
    node_label="Customer",
    property_name="triangles",
    edge_type="SHARES_ACCOUNT",
)
print(f"Status: {result.status}, Nodes updated: {result.nodes_updated}")
triangles = conn.query("""
MATCH (c:Customer)
RETURN c.id AS name, c.triangles AS tri
ORDER BY c.triangles DESC, c.id
""")
print("\nLAU and KWONG participate in the most triangles (degree-3 hub nodes).")
triangles.show()

3. Clustering Coefficient

How connected are a node's neighbours to each other?

The clustering coefficient of a node is the ratio of actual connections between its neighbours to the maximum possible number of such connections, which is k(k-1)/2 for a node with k neighbours. A value of 1.0 means all neighbours are connected to each other; 0.0 means none are.

This uses the NetworkX integration via conn.networkx.clustering_coefficient().
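To make the ratio concrete, here is a small pure-Python sketch that computes it directly from an edge list. It is only an illustration of the definition, not the code behind `conn.networkx.clustering_coefficient()`. The `local_clustering` helper and the toy graph are hypothetical, and assigning 0.0 to nodes with fewer than two neighbours is an assumed convention (it matches what NetworkX's `clustering` does, as far as I know).

```python
from itertools import combinations


def local_clustering(edges):
    """Return {node: local clustering coefficient} for an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    coeffs = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            # Assumed convention: no neighbour pairs means a coefficient of 0.0.
            coeffs[node] = 0.0
            continue
        # Actual links among the neighbours of `node`.
        links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        # Maximum possible links among k neighbours is k * (k - 1) / 2.
        coeffs[node] = links / (k * (k - 1) / 2)
    return coeffs


# Hypothetical toy graph: A and B have degree 3, C and D have degree 2.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]
cc = local_clustering(toy_edges)
for node in sorted(cc):
    print(node, round(cc[node], 3))
# A and B: 0.667 (2 of 3 neighbour pairs linked); C and D: 1.0
```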

# Clustering coefficient via NetworkX integration
result = conn.networkx.clustering_coefficient(
    node_label="Customer",
    property_name="clustering",
)
print(f"Status: {result.status}")
clustering = conn.query("""
MATCH (c:Customer)
RETURN c.id AS name, round(c.clustering, 3) AS cc
ORDER BY c.clustering, c.id
""")
print("\nDegree-2 nodes have clustering 1.0 (both neighbours connected).")
print("Degree-3 hub nodes have 0.667 (2 of 3 neighbour pairs connected).")
clustering.show()

4. Comparing Structural Measures

Side-by-side view of K-Core, triangles, and clustering

# Compare all structural measures side by side
df = conn.query_df("""
MATCH (c:Customer)
RETURN c.id AS name,
       c.kcore AS kcore,
       c.triangles AS triangles,
       round(c.clustering, 3) AS clustering
ORDER BY c.triangles DESC, c.id
""")
print("Hub nodes (LAU, KWONG): high triangles, lower clustering (more neighbours to connect).")
print("Peripheral nodes: fewer triangles but perfect clustering (small tight group).")
df
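One way to turn the side-by-side view into explicit role labels is sketched below. The rule is a hypothetical heuristic chosen purely for illustration: a node is called a hub when it has the maximum triangle count and a clustering coefficient below 1.0. The source does not prescribe any such threshold. The `label_roles` helper and the toy values are likewise assumptions, not output from the tutorial dataset.

```python
def label_roles(measures):
    """Label nodes from {node: (triangles, clustering)} using a hypothetical rule."""
    max_tri = max(tri for tri, _ in measures.values())
    roles = {}
    for node, (tri, clustering) in measures.items():
        # Hypothetical rule: most triangles, but neighbours not fully connected.
        is_hub = tri == max_tri and clustering < 1.0
        roles[node] = "hub" if is_hub else "peripheral"
    return roles


# Toy values for a four-node graph where A and B link to each other and to C and D.
toy_measures = {
    "A": (2, 2 / 3),
    "B": (2, 2 / 3),
    "C": (1, 1.0),
    "D": (1, 1.0),
}
roles = label_roles(toy_measures)
for node in sorted(roles):
    print(node, roles[node])
# A and B are labelled "hub"; C and D are labelled "peripheral"
```

On real data the cut-offs would need tuning, and coreness could be added as a third criterion.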

Key Takeaways

  • K-Core reveals the densest core of the network — nodes that survive iterative pruning
  • Triangle Count measures local density; high counts indicate tightly knit groups
  • Clustering Coefficient normalises a node's triangle count by its number of neighbour pairs, k(k-1)/2; high values mean neighbours are well-connected
  • Hub nodes often have high triangle count but lower clustering (many neighbours, not all connected)
  • Combining these measures characterises node roles: hubs vs. peripheral members