Combining Algorithms
Layer centrality, community, and structural measures for richer insights
What You'll Learn
- Multi-Algorithm Pipeline - Run PageRank, Louvain, and degree centrality in sequence
- Unified Query - Combine all results into a single DataFrame
- Cross-Algorithm Insight - Identify influential members within communities
# Cell 1 - Parameters
USERNAME = "_FILL_ME_IN_"  # Set your email before running

# Cell 2 - Connect
from graph_olap import GraphOLAPClient
client = GraphOLAPClient(username=USERNAME)

# Cell 3 - Provision
from notebook_setup import provision
personas, conn = provision(USERNAME)
analyst = personas["analyst"]
admin = personas["admin"]
ops = personas["ops"]
client = analyst

print(f"Connected | {conn.query_scalar('MATCH (n) RETURN count(n)')} nodes")

Run Multiple Algorithms
Build a multi-dimensional node profile with three algorithms
No single algorithm tells the whole story. PageRank identifies globally important nodes, Louvain groups nodes into communities, and degree centrality counts direct connections. Running all three and querying the results together reveals which customers are influential within their community — a pattern that matters for AML case prioritisation.
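For intuition about the third measure, here is a minimal sketch of normalised degree centrality on a toy edge list. It assumes the common definition (a node's degree divided by n - 1, as in NetworkX); whether the platform's `degree_centrality` call normalises in exactly the same way is an assumption. The edge list and node names are invented for illustration and are not taken from the test graph.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # self-loops are ignored in this sketch
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n <= 1:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

# Hypothetical toy graph: A is a hub connected to B, C and D.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
scores = degree_centrality(toy_edges)
for node, score in sorted(scores.items()):
    print(f"{node}: {score:.2f}")
```

Because the sketch only sees nodes that appear in an edge, isolated nodes are not scored here; a real graph engine would normally give them a centrality of zero.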
# 1. PageRank: global importance
pr = conn.algo.pagerank(
    node_label="Customer",
    property_name="pr",
    edge_type="SHARES_ACCOUNT",
)
print(f"PageRank: {pr.status} ({pr.nodes_updated} nodes)")

# 2. Louvain: community assignment
lv = conn.algo.louvain(
    node_label="Customer",
    property_name="community",
    edge_type="SHARES_ACCOUNT",
)
print(f"Louvain: {lv.status} ({lv.nodes_updated} nodes)")

# 3. Degree centrality: normalised connection count
dc = conn.networkx.degree_centrality(
    node_label="Customer",
    property_name="dc",
)
print(f"Degree Centrality: {dc.status}")

print("\nAll three algorithms stored results as node properties.")

Unified Results Table
Query all algorithm results in a single DataFrame
# Query all algorithm results together
df = conn.query_df("""
    MATCH (c:Customer)
    RETURN c.id AS name,
           round(c.pr, 3) AS pagerank,
           c.community AS community,
           round(c.dc, 2) AS degree
    ORDER BY c.pr DESC
""")
df

Cross-Algorithm Insights
Identify influential members within each community
The real power of combining algorithms emerges when you layer the results:
- Community + PageRank → Who is the most influential person in each group?
- Community + Degree → Which member of each cluster has the most direct connections?
- PageRank + Degree → Do well-connected nodes always rank highest?
In our small test graph, all customers are in one community and the hub nodes (LAU, KWONG) dominate both PageRank and degree. In production graphs with thousands of customers and multiple communities, this approach pinpoints the key individuals to review first.
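With several communities, a review queue usually needs more than one name per group. The following is a minimal sketch of picking the top k members of each community by PageRank using only the standard library. The column names (`comm`, `name`, `pr`, `dc`), the node names, and the scores are placeholders chosen for illustration, not results from the test graph.

```python
import heapq
from collections import defaultdict

def top_by_pagerank(rows, k=2):
    """Group rows by community and keep the k highest-PageRank members of each."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["comm"]].append(row)
    return {
        comm: heapq.nlargest(k, members, key=lambda m: m["pr"])
        for comm, members in groups.items()
    }

# Hypothetical rows, one dict per customer.
rows = [
    {"comm": 0, "name": "N1", "pr": 0.31, "dc": 0.75},
    {"comm": 0, "name": "N2", "pr": 0.12, "dc": 0.50},
    {"comm": 0, "name": "N3", "pr": 0.09, "dc": 0.25},
    {"comm": 1, "name": "N4", "pr": 0.27, "dc": 0.50},
    {"comm": 1, "name": "N5", "pr": 0.21, "dc": 0.75},
]
review_queue = top_by_pagerank(rows, k=2)
for comm, members in sorted(review_queue.items()):
    print(comm, [m["name"] for m in members])
```

`heapq.nlargest` does not rely on the rows arriving pre-sorted, so the helper still works if the query's `ORDER BY` clause changes.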
# Find the leader of each community (highest PageRank within group)
# Query all customers with their algorithm results
df = conn.query_df("""
    MATCH (c:Customer)
    RETURN c.community AS comm, c.id AS name, c.pr AS pr, c.dc AS dc
    ORDER BY c.pr DESC
""")

# Group by community in Python (avoids Cypher list function limitations)
communities = {}
for row in df.iter_rows(named=True):
    comm = row["comm"]
    if comm not in communities:
        communities[comm] = []
    communities[comm].append(row)

print("Cross-algorithm insights:")
for comm, members in sorted(communities.items()):
    leader = members[0]  # Already sorted by PR desc
    most_connected = max(members, key=lambda m: m["dc"])
    print(f"\nCommunity {comm} ({len(members)} members):")
    print(f"  Leader (highest PageRank): {leader['name']} ({leader['pr']:.3f})")
    print(f"  Most connected (highest degree): {most_connected['name']} ({most_connected['dc']:.2f})")

df

Key Takeaways
- Run multiple algorithms in sequence — results accumulate as node properties
- A single Cypher query can fetch all algorithm outputs into one DataFrame
- Combining centrality with community detection reveals leaders within clusters
- In production, multi-algorithm profiles help prioritise AML investigations
- Divergence between measures (e.g. high PageRank but low degree) flags unusual node roles
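One simple way to surface the divergence mentioned in the last takeaway is to compare each node's rank position under the two measures. The sketch below is one possible approach under stated assumptions, not the notebook's method: the `min_gap` threshold, the row layout, and all scores are hypothetical.

```python
def rank_positions(rows, key):
    """Map node name -> rank position (0 = highest) for the given score column."""
    ordered = sorted(rows, key=lambda r: r[key], reverse=True)
    return {r["name"]: pos for pos, r in enumerate(ordered)}

def divergent_nodes(rows, min_gap=2):
    """Return (name, pr_rank, dc_rank) where the two ranks differ by >= min_gap."""
    pr_rank = rank_positions(rows, "pr")
    dc_rank = rank_positions(rows, "dc")
    return [
        (name, pr_rank[name], dc_rank[name])
        for name in pr_rank
        if abs(pr_rank[name] - dc_rank[name]) >= min_gap
    ]

# Hypothetical scores: N2 ranks second on PageRank but fourth on degree.
rows = [
    {"name": "N1", "pr": 0.30, "dc": 0.80},
    {"name": "N2", "pr": 0.25, "dc": 0.20},
    {"name": "N3", "pr": 0.20, "dc": 0.60},
    {"name": "N4", "pr": 0.15, "dc": 0.40},
    {"name": "N5", "pr": 0.10, "dc": 0.10},
]
flagged = divergent_nodes(rows, min_gap=2)
print(flagged)
```

Ties are broken by input order because Python's sort is stable; on real data with many equal scores, a tie-aware ranking may be a better choice.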