
Python SDK Architecture

Document Type: SDK Architecture Specification
Version: 1.1
Status: Ready for Architectural Review
Author: Graph OLAP Platform Team
Last Updated: 2026-02-04


This architecture documentation is organized into five focused documents:

| Document | Content |
|---|---|
| Detailed Architecture | Executive Summary + C4 Architecture Viewpoints + Resource Management |
| This document | Python SDK, Resource Managers, Authentication |
| Domain & Data Architecture | Domain Model, State Machines, Data Flows |
| Platform Operations | Technology, Security, Integration, Operations, NFRs |
| Authorization & Access Control | RBAC Roles, Permission Matrix, Ownership Model, Enforcement |

The Graph OLAP Platform is notebook-first by design. All user interactions happen through the Python SDK in Jupyter notebooks—there is no separate web console or GUI.

| Operation Category | SDK Resource | Key Methods |
|---|---|---|
| Mapping Management | client.mappings | create(), list(), get(), update(), delete(), copy() |
| Instance Lifecycle | client.instances | create_and_wait(), terminate(), update_cpu(), list() |
| Graph Queries | conn.query() | query(), query_df(), query_scalar(), query_one() |
| Graph Algorithms | conn.algo / conn.networkx | pagerank(), louvain(), wcc(), 500+ NetworkX algorithms |
| Schema Discovery | client.schema | list_catalogs(), list_tables(), search_tables() |
| Favorites | client.favorites | add(), remove(), list() |
| Operations (Ops) | client.ops | get_cluster_health(), get_lifecycle_config(), trigger_job() |
| Administration | client.admin | bulk_delete() |

Why Notebook-First?

  1. Reproducibility: All operations are code, making workflows reproducible and version-controllable
  2. Automation: Scripts can automate common tasks without GUI interaction
  3. Integration: Seamless integration with data science workflows (pandas, polars, visualization)
  4. Auditability: Every operation is logged with the user who executed it

graph-olap-sdk-architecture

Mermaid Source
---
config:
  layout: elk
---
flowchart TB
accTitle: Graph OLAP SDK Architecture
accDescr: Shows SDK components from GraphOLAPClient through resource managers to Control Plane and Wrapper APIs
classDef user fill:#F3E5F5,stroke:#7B1FA2,stroke-width:2px,color:#4A148C
classDef client fill:#E1F5FE,stroke:#0277BD,stroke-width:2px,color:#01579B
classDef resource fill:#E8F5E9,stroke:#2E7D32,stroke-width:2px,color:#1B5E20
classDef http fill:#FFF8E1,stroke:#F57F17,stroke-width:2px,color:#E65100
classDef api fill:#E3F2FD,stroke:#1565C0,stroke-width:2px,color:#0D47A1
classDef conn fill:#FCE4EC,stroke:#C2185B,stroke-width:2px,color:#880E4F
Jupyter["Jupyter Notebook<br/>(Analyst)"]:::user
subgraph SDK["Python SDK (graph-olap-sdk)"]
Client["GraphOLAPClient<br/>───────────<br/>Main entry point<br/>from_env() / direct init"]:::client
subgraph Resources["Resource Managers"]
Mappings["MappingResource<br/>CRUD + versioning"]:::resource
Instances["InstanceResource<br/>lifecycle + CPU"]:::resource
Schema["SchemaResource<br/>Starburst metadata"]:::resource
Ops["OpsResource<br/>cluster config"]:::resource
Admin["AdminResource<br/>bulk ops"]:::resource
Health["HealthResource<br/>health checks"]:::resource
end
HTTP["HTTPClient<br/>───────────<br/>Retry logic<br/>Auth headers<br/>Error mapping"]:::http
subgraph Connection["InstanceConnection"]
Conn["Connection<br/>───────────<br/>Cypher queries<br/>query_df()"]:::conn
Algo["AlgorithmManager<br/>───────────<br/>Native algorithms<br/>pagerank, louvain"]:::conn
NX["NetworkXManager<br/>───────────<br/>500+ algorithms<br/>client-side graphs"]:::conn
end
end
CP["Control Plane API<br/>/api/*"]:::api
Wrapper["Wrapper Pod API<br/>/query, /algorithms"]:::api
Jupyter --> Client
Client --> Mappings & Instances & Schema & Ops & Admin & Health
Mappings & Instances & Schema & Ops & Admin & Health --> HTTP
HTTP --> CP
Instances -.->|"connect()"| Conn
Conn --> Algo & NX
Conn --> Wrapper
Algo --> Wrapper

This section provides a scannable reference of SDK capabilities for architects and technical leads. For each resource, method signatures show what operations are available.

from graph_olap import GraphOLAPClient

# Production: reads GRAPH_OLAP_* environment variables
client = GraphOLAPClient.from_env()

# Direct configuration (identity is a username; API keys were removed per ADR-104/ADR-105)
client = GraphOLAPClient(
    api_url="https://graph.example.com",
    username="your-username",
    timeout=60.0,
)

client.mappings — Mappings define what data to load from Starburst into a graph.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| list | *, owner, search, sort_by, offset, limit | PaginatedList[Mapping] | List mappings with filters |
| get | mapping_id | Mapping | Get mapping by ID |
| create | name, description, node_definitions, edge_definitions | Mapping | Create new mapping |
| update | mapping_id, change_description, *, name, description, node_definitions, edge_definitions | Mapping | Update mapping (creates new version) |
| delete | mapping_id | None | Delete mapping |
| copy | mapping_id, new_name | Mapping | Copy mapping with new name |
| get_version | mapping_id, version | MappingVersion | Get specific version |
| list_versions | mapping_id | list[MappingVersion] | List all versions |
| diff | mapping_id, from_version, to_version | MappingDiff | Compare two versions |
| list_snapshots | mapping_id, *, offset, limit | PaginatedList[Snapshot] | List snapshots for mapping |
| list_instances | mapping_id, *, offset, limit | PaginatedList[Instance] | List instances using mapping |
| set_lifecycle | mapping_id, *, ttl, inactivity_timeout | Mapping | Set auto-cleanup policy |
| get_tree | mapping_id, *, include_instances, status | dict | Get hierarchy (mapping → versions → snapshots → instances) |

client.instances — Manage running graph instances (lifecycle, scaling, connectivity).

| Method | Parameters | Returns | Description |
|---|---|---|---|
| list | *, snapshot_id, owner, status, search, sort_by, offset, limit | PaginatedList[Instance] | List instances with filters |
| get | instance_id | Instance | Get instance by ID |
| create | mapping_id, name, wrapper_type, *, mapping_version, description, ttl, inactivity_timeout, cpu_cores | Instance | Create instance (async) |
| create_and_wait | mapping_id, name, wrapper_type, *, timeout, poll_interval, on_progress, ... | Instance | Create and wait until running |
| update | instance_id, *, name, description | Instance | Update instance metadata |
| terminate | instance_id | None | Terminate and delete instance |
| update_cpu | instance_id, cpu_cores | Instance | Scale CPU (1-8 cores) |
| update_memory | instance_id, memory_gb | Instance | Upgrade memory (2-32 GB) |
| extend_ttl | instance_id, hours=24 | Instance | Extend TTL from current expiry |
| set_lifecycle | instance_id, *, ttl, inactivity_timeout | Instance | Set lifecycle parameters |
| get_progress | instance_id | InstanceProgress | Get startup progress details |
| get_health | instance_id, *, timeout | dict | Get wrapper health status |
| check_health | instance_id, *, timeout | bool | Check if wrapper is healthy |
| wait_until_running | instance_id, *, timeout, poll_interval | Instance | Wait for running status |
| connect | instance_id | InstanceConnection | Get connection for queries |
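create_and_wait and wait_until_running accept timeout, poll_interval, and (for create_and_wait) on_progress. A polling loop with those parameters could look roughly like the sketch below. It is not the SDK's implementation: the status strings, the `get_status` callable, and the injectable `sleep` and `clock` arguments are assumptions made so the example runs without a live instance.

```python
import time
from typing import Callable, Optional


def wait_until_running(
    get_status: Callable[[], str],
    *,
    timeout: float = 300.0,
    poll_interval: float = 2.0,
    on_progress: Optional[Callable[[str], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it reports "running", reports "failed", or times out."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if on_progress is not None:
            on_progress(status)  # let the notebook render progress
        if status == "running":
            return status
        if status == "failed":
            raise RuntimeError("instance failed to start")
        if clock() >= deadline:
            raise TimeoutError(f"instance not running after {timeout}s")
        sleep(poll_interval)


# Simulated startup: three status reads (hypothetical status names), no real sleeping.
_statuses = iter(["waiting_for_snapshot", "starting", "running"])
seen = []
final = wait_until_running(
    lambda: next(_statuses),
    on_progress=seen.append,
    sleep=lambda _seconds: None,
)
```

Using a monotonic clock for the deadline keeps the timeout correct even if the wall clock is adjusted while a notebook cell is waiting.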

client.schema — Browse Starburst metadata (cached, refreshed every 24h).

| Method | Parameters | Returns | Description |
|---|---|---|---|
| list_catalogs | | list[Catalog] | List all Starburst catalogs |
| list_schemas | catalog | list[Schema] | List schemas in a catalog |
| list_tables | catalog, schema | list[Table] | List tables in a schema |
| list_columns | catalog, schema, table | list[Column] | Get columns for a table |
| search_tables | pattern, limit=100 | list[Table] | Search tables by name pattern |
| search_columns | pattern, limit=100 | list[Column] | Search columns by name pattern |
| admin_refresh | | dict | Trigger cache refresh (admin) |
| get_stats | | CacheStats | Get cache statistics (admin) |

client.ops — Cluster configuration, jobs, and metrics. Requires Ops role.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| get_lifecycle_config | | LifecycleConfig | Get TTL defaults for all resource types |
| update_lifecycle_config | *, mapping, snapshot, instance | bool | Update lifecycle defaults |
| get_concurrency_config | | ConcurrencyConfig | Get instance limits |
| update_concurrency_config | *, per_analyst, cluster_total | ConcurrencyConfig | Update instance limits |
| get_maintenance_mode | | MaintenanceMode | Get maintenance status |
| set_maintenance_mode | enabled, message="" | MaintenanceMode | Enable/disable maintenance |
| get_export_config | | ExportConfig | Get export settings |
| update_export_config | *, max_duration_seconds | ExportConfig | Update export timeout |
| get_cluster_health | | ClusterHealth | Check cluster health |
| get_cluster_instances | | ClusterInstances | Get cluster-wide instance summary |
| get_metrics | | str | Get Prometheus metrics |
| trigger_job | job_name, reason="manual-trigger" | dict | Trigger background job |
| get_job_status | | dict | Get all job statuses |
| get_state | | dict | Get system state summary |
| get_export_jobs | status=None, limit=100 | list[dict] | Get export jobs for debugging |

client.favorites — User bookmarks for quick access.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| list | resource_type=None | list[Favorite] | List favorites |
| add | resource_type, resource_id | Favorite | Add to favorites |
| remove | resource_type, resource_id | None | Remove from favorites |

client.admin — Privileged operations. Requires Admin role.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| bulk_delete | resource_type, filters, reason, expected_count=None, dry_run=False | dict | Bulk delete with safety checks |
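The bulk_delete signature suggests two safeguards: a dry run that previews the match count, and an expected_count that guards against deleting more than intended. The sketch below illustrates one plausible reading of those parameters against an in-memory list. The exact semantics (a mismatch aborting the call, the shape of the returned dict, the required non-empty reason) are assumptions, not confirmed server behavior.

```python
from typing import Callable, Dict, List, Optional


def bulk_delete(
    store: List[Dict],
    matches: Callable[[Dict], bool],
    *,
    reason: str,
    expected_count: Optional[int] = None,
    dry_run: bool = False,
) -> Dict:
    """Delete matching records from store, guarded by two safety checks."""
    if not reason.strip():
        raise ValueError("a non-empty reason is required")
    matched = [record for record in store if matches(record)]
    # Safety check (assumed): refuse to proceed if the caller's expectation is stale.
    if expected_count is not None and len(matched) != expected_count:
        raise ValueError(f"expected {expected_count} matches, found {len(matched)}")
    if dry_run:
        return {"dry_run": True, "matched": len(matched), "deleted": 0}
    store[:] = [record for record in store if not matches(record)]
    return {"dry_run": False, "matched": len(matched), "deleted": len(matched)}


def is_failed(record: Dict) -> bool:
    return record["status"] == "failed"


instances = [
    {"id": 1, "status": "failed"},
    {"id": 2, "status": "running"},
    {"id": 3, "status": "failed"},
]

# Preview first, then delete only if the count still matches the preview.
preview = bulk_delete(instances, is_failed, reason="cleanup", dry_run=True)
result = bulk_delete(
    instances, is_failed, reason="cleanup", expected_count=preview["matched"]
)
```

Running the dry run first and passing its count as expected_count means a concurrent change between the two calls is surfaced as an error rather than as an unexpectedly large deletion.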

client.health — Health checks (no authentication required).

| Method | Parameters | Returns | Description |
|---|---|---|---|
| check | | HealthStatus | Basic health check |
| ready | | HealthStatus | Readiness check with DB connectivity |

conn = client.instances.connect(instance_id) — Query interface to a running instance.

Cypher Queries

| Method | Parameters | Returns | Description |
|---|---|---|---|
| query | cypher, parameters=None, *, timeout, coerce_types | QueryResult | Execute Cypher query |
| query_df | cypher, parameters=None, *, backend="polars" | DataFrame | Query returning DataFrame |
| query_scalar | cypher, parameters=None | Any | Query returning single value |
| query_one | cypher, parameters=None | dict \| None | Query returning single row |
| get_schema | | Schema | Get graph schema (labels, types, properties) |
| get_lock | | LockStatus | Get current lock status |
| status | | dict | Get instance status and resource usage |

Native Algorithms: conn.algo

| Method | Parameters | Returns | Description |
|---|---|---|---|
| algorithms | category=None | list[dict] | List available algorithms |
| algorithm_info | algorithm | dict | Get algorithm parameters |
| run | algorithm, node_label, property_name, edge_type, *, params, timeout, wait | AlgorithmExecution | Run any native algorithm |
| pagerank | node_label, property_name, edge_type, *, damping, max_iterations, tolerance | AlgorithmExecution | PageRank centrality |
| louvain | node_label, property_name, *, edge_type, resolution | AlgorithmExecution | Louvain community detection |
| connected_components | node_label, property_name, edge_type | AlgorithmExecution | Weakly connected components |
| scc | node_label, property_name, *, edge_type | AlgorithmExecution | Strongly connected components |
| kcore | node_label, property_name, *, edge_type | AlgorithmExecution | K-Core decomposition |
| label_propagation | node_label, property_name, edge_type, *, max_iterations | AlgorithmExecution | Label propagation |
| triangle_count | node_label, property_name, edge_type | AlgorithmExecution | Triangle count per node |
| shortest_path | source_id, target_id, *, relationship_types, max_depth | AlgorithmExecution | Find shortest path |

NetworkX Algorithms: conn.networkx (500+ algorithms)

| Method | Parameters | Returns | Description |
|---|---|---|---|
| algorithms | category=None | list[dict] | List available algorithms |
| algorithm_info | algorithm | dict | Get algorithm parameters |
| run | algorithm, node_label, property_name, *, params, timeout, wait | AlgorithmExecution | Run any NetworkX algorithm |
| degree_centrality | node_label, property_name | AlgorithmExecution | Degree centrality |
| betweenness_centrality | node_label, property_name, *, k | AlgorithmExecution | Betweenness centrality |
| closeness_centrality | node_label, property_name | AlgorithmExecution | Closeness centrality |
| eigenvector_centrality | node_label, property_name, *, max_iter | AlgorithmExecution | Eigenvector centrality |
| clustering_coefficient | node_label, property_name | AlgorithmExecution | Clustering coefficient |

QueryResult — Flexible output from Cypher queries.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| to_polars | | polars.DataFrame | Convert to Polars DataFrame |
| to_pandas | | pandas.DataFrame | Convert to Pandas DataFrame |
| to_networkx | | networkx.DiGraph | Convert to NetworkX graph |
| to_dicts | | list[dict] | Convert to list of dicts |
| scalar | | Any | Get single scalar value |
| to_csv | path | None | Export to CSV file |
| to_parquet | path | None | Export to Parquet file |
| show | max_rows=20 | | Display in Jupyter (auto-visualization) |

Iteration: for row in result: yields dict[str, Any] for each row.
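To make the iteration contract concrete, here is a small stand-in that yields one dict per row and offers to_dicts() and scalar() in the same spirit as the table above. `SketchResult` and its columns/rows storage are hypothetical; the real QueryResult internals are not described in this document.

```python
from typing import Any, Dict, Iterator, List


class SketchResult:
    """Hypothetical column/row container that iterates as dict[str, Any] per row."""

    def __init__(self, columns: List[str], rows: List[List[Any]]) -> None:
        self.columns = columns
        self.rows = rows

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        # Pair each row's values with the column names.
        for row in self.rows:
            yield dict(zip(self.columns, row))

    def to_dicts(self) -> List[Dict[str, Any]]:
        return list(self)

    def scalar(self) -> Any:
        # A scalar only makes sense for a 1x1 result.
        if len(self.rows) != 1 or len(self.columns) != 1:
            raise ValueError("scalar() needs exactly one row and one column")
        return self.rows[0][0]


result = SketchResult(["name", "score"], [["alice", 0.42], ["bob", 0.17]])
names = [row["name"] for row in result]
count = SketchResult(["n"], [[3]]).scalar()
```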


The SDK enables a complete analytical workflow from data discovery to graph analysis:

sdk-workflow-sequence

Mermaid Source
sequenceDiagram
accTitle: SDK Workflow Sequence
accDescr: Shows typical user workflow from schema discovery through instance creation to algorithm execution
participant Analyst as Analyst<br/>(Jupyter)
participant SDK as Python SDK
participant CP as Control Plane
participant Worker as Export Worker
participant Starburst as Starburst
participant GCS as GCS
participant Wrapper as Graph Instance
Note over Analyst,Wrapper: 1. Schema Discovery
Analyst->>SDK: client.schema.search_tables("customer")
SDK->>CP: GET /api/schema/search/tables?q=customer
CP-->>SDK: Matching tables
SDK-->>Analyst: Table list with columns
Note over Analyst,Wrapper: 2. Create Mapping
Analyst->>SDK: client.mappings.create(name, nodes, edges)
SDK->>CP: POST /api/mappings
CP-->>SDK: Mapping created
SDK-->>Analyst: Mapping object
Note over Analyst,Wrapper: 3. Create Instance (triggers export)
Analyst->>SDK: client.instances.create_and_wait(mapping_id)
SDK->>CP: POST /api/instances {mapping_id}
CP->>CP: Create snapshot + export job
Worker->>CP: Claim export job
Worker->>Starburst: UNLOAD query
Starburst->>GCS: Write Parquet files
Worker->>CP: Mark complete
CP->>Wrapper: Create Pod
Wrapper->>GCS: COPY FROM Parquet
Wrapper-->>CP: Ready
CP-->>SDK: Instance ready
SDK-->>Analyst: Instance object
Note over Analyst,Wrapper: 4. Connect and Query
Analyst->>SDK: conn = client.instances.connect(instance.id)
Analyst->>SDK: conn.query_df("MATCH (n) RETURN n")
SDK->>Wrapper: POST /query {cypher}
Wrapper-->>SDK: Query results
SDK-->>Analyst: DataFrame (polars by default)
Note over Analyst,Wrapper: 5. Run Algorithms
Analyst->>SDK: conn.algo.pagerank("Customer", "score")
SDK->>Wrapper: POST /algorithms/pagerank
Wrapper->>Wrapper: Execute algorithm
Wrapper-->>SDK: Execution complete
SDK-->>Analyst: AlgorithmExecution

graph-olap-sdk/
├── src/graph_olap/
│   ├── client.py            # GraphOLAPClient main entry point
│   ├── config.py            # Configuration and authentication
│   ├── notebook.py          # Jupyter integration (connect(), init())
│   ├── resources/
│   │   ├── mappings.py      # MappingResource
│   │   ├── instances.py     # InstanceResource
│   │   ├── schema.py        # SchemaResource (Starburst metadata)
│   │   ├── ops.py           # OpsResource (config, cluster, jobs)
│   │   ├── favorites.py     # FavoriteResource
│   │   ├── admin.py         # AdminResource (bulk delete)
│   │   └── health.py        # HealthResource
│   ├── instance/
│   │   ├── connection.py    # InstanceConnection class
│   │   └── algorithms.py    # Algorithm execution
│   ├── models/              # Pydantic models
│   ├── exceptions.py        # Exception hierarchy
│   └── http.py              # HTTP client wrapper
└── examples/
    ├── basic_workflow.ipynb
    ├── algorithms.ipynb
    └── visualization.ipynb

| Decision | Choice | Rationale |
|---|---|---|
| HTTP Client | httpx | Modern async support, connection pooling, HTTP/2 |
| Models | Pydantic | Type safety, validation, JSON serialization |
| DataFrame Support | pandas + polars | Industry standard, analyst familiarity |
| API Style | Synchronous default | Notebook-friendly, with async support available |
| Error Handling | Typed exceptions | Clear, actionable error messages |
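The typed-exception decision can be illustrated with a status-to-exception mapping of the kind an HTTP client wrapper might apply to responses. Every class name and the particular status codes chosen below are hypothetical; the SDK's actual hierarchy lives in exceptions.py and is not reproduced here.

```python
from typing import Dict, Type


class SDKError(Exception):
    """Hypothetical base class; carries the HTTP status for programmatic handling."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"{status}: {message}")
        self.status = status


class PermissionDeniedError(SDKError):
    pass


class NotFoundError(SDKError):
    pass


class ConflictError(SDKError):
    pass


class ServerError(SDKError):
    pass


_STATUS_TO_ERROR: Dict[int, Type[SDKError]] = {
    403: PermissionDeniedError,
    404: NotFoundError,
    409: ConflictError,
}


def raise_for_status(status: int, message: str = "") -> None:
    """Raise the most specific exception for an error status; do nothing below 400."""
    if status < 400:
        return
    fallback = ServerError if status >= 500 else SDKError
    raise _STATUS_TO_ERROR.get(status, fallback)(status, message)


# A notebook user can then catch exactly the case they care about.
try:
    raise_for_status(404, "mapping 42 not found")
except NotFoundError as exc:
    caught = exc
```

Because every class derives from one base, callers can choose between a narrow `except NotFoundError` and a broad `except SDKError` without inspecting status codes themselves.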

Per ADR-104 and ADR-105, identity is carried end-to-end by a single canonical header, X-Username. Bearer tokens and internal API keys have been removed from the SDK and from the control-plane auth middleware.

| Header | Status | Read by |
|---|---|---|
| X-Username | Canonical (ADR-105) | Control-plane middleware (packages/control-plane/src/control_plane/middleware/identity.py); wrapper dependencies (packages/{ryugraph,falkordb}-wrapper/src/wrapper/dependencies.py). |
| X-User-ID | Deprecated alias | Accepted only by the wrapper get_user_id dependency as a fallback when X-Username is absent, for backward compatibility with legacy callers. The control-plane does NOT accept this alias. |
| X-User-Name | Deprecated alias | Accepted only by the wrapper get_user_name dependency as a fallback when X-Username is absent, for backward compatibility with legacy callers. The control-plane does NOT accept this alias. |

The SDK always sends X-Username (see packages/graph-olap-sdk/src/graph_olap/http.py:78 and instance/connection.py:82). New callers MUST send X-Username; the aliases above exist solely so that wrappers do not break during rolling upgrades from older SDK/tool versions.
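The fallback rule can be sketched as a plain function over a header mapping: wrappers prefer X-Username and consult a deprecated alias only when it is absent, while the control plane looks at X-Username alone. The function names below are illustrative and are not the real get_user_id or get_user_name dependencies; treating blank values as absent is also an assumption.

```python
from typing import Mapping, Optional

CANONICAL_HEADER = "X-Username"


def _lookup(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Case-insensitive header lookup; blank values count as absent (assumption)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get(name.lower(), "").strip()
    return value or None


def wrapper_identity(headers: Mapping[str, str], alias: str) -> Optional[str]:
    """Wrapper-side rule: canonical header first, deprecated alias as fallback."""
    return _lookup(headers, CANONICAL_HEADER) or _lookup(headers, alias)


def control_plane_identity(headers: Mapping[str, str]) -> Optional[str]:
    """Control-plane rule: only the canonical header is honored."""
    return _lookup(headers, CANONICAL_HEADER)


legacy_request = {"X-User-ID": "legacy-tool"}
modern_request = {"X-Username": "analyst-1", "X-User-ID": "ignored"}

legacy_on_wrapper = wrapper_identity(legacy_request, "X-User-ID")
legacy_on_control_plane = control_plane_identity(legacy_request)
modern_on_wrapper = wrapper_identity(modern_request, "X-User-ID")
```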

| Header | Purpose |
|---|---|
| X-Use-Case-Id | Starburst use-case identifier passed through the middleware (ADR-102) |
| X-User-Role | NOT sent by the SDK. Wrappers optionally read it if injected by an upstream component; absent → treated as analyst (see ADR-105 §F8). |

The SDK does not send the role in a header. The control plane resolves the user's role from the users.role column after matching X-Username. Role hierarchy: Analyst < Admin < Ops. Each higher role inherits the permissions of the roles below it; only Ops can access the config, cluster, and jobs endpoints.
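The ordering Analyst < Admin < Ops maps naturally onto an ordered enum, where "inherits the permissions of lower roles" becomes a greater-than-or-equal comparison. This is a sketch of the idea only; the `Role` enum, its numeric values, and `has_role` are hypothetical and do not describe the control plane's enforcement code.

```python
from enum import IntEnum


class Role(IntEnum):
    """Ordered roles; a larger value inherits everything below it."""
    ANALYST = 1
    ADMIN = 2
    OPS = 3


def has_role(user_role: Role, required: Role) -> bool:
    """True when the user's role is at or above the required role."""
    return user_role >= required


# Ops-only endpoints (config, cluster, jobs) would require Role.OPS.
admin_can_bulk_delete = has_role(Role.ADMIN, Role.ADMIN)
admin_can_trigger_job = has_role(Role.ADMIN, Role.OPS)
ops_can_bulk_delete = has_role(Role.OPS, Role.ADMIN)
```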

See Authorization & Access Control for the complete permission matrix.

# Environment-based configuration (reads GRAPH_OLAP_API_URL, GRAPH_OLAP_USERNAME)
client = GraphOLAPClient.from_env()

# Explicit username (overrides GRAPH_OLAP_USERNAME and identity.DEFAULT_USERNAME)
client = GraphOLAPClient(
    api_url="https://graph.example.com",
    username="[email protected]",
    timeout=60.0,
)

# Notebook persona switching (see ADR-105 §F3)
import graph_olap.identity
graph_olap.identity.DEFAULT_USERNAME = "[email protected]"
bob_client = GraphOLAPClient(api_url="https://graph.example.com")
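The comments above name three possible username sources. One way the precedence could be resolved is sketched below. Only the first rule (an explicit username wins over both other sources) is stated in this document; the relative order of identity.DEFAULT_USERNAME and GRAPH_OLAP_USERNAME is an assumption, and `resolve_username` is a hypothetical helper, not an SDK function.

```python
import os
from typing import Mapping, Optional


def resolve_username(
    explicit: Optional[str] = None,
    *,
    default_username: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first configured username.

    Order: explicit argument, then the module-level default (standing in for
    graph_olap.identity.DEFAULT_USERNAME), then the GRAPH_OLAP_USERNAME variable.
    The order of the last two is assumed, not documented.
    """
    for candidate in (explicit, default_username, env.get("GRAPH_OLAP_USERNAME")):
        if candidate:
            return candidate
    raise ValueError("no username configured")


env = {"GRAPH_OLAP_USERNAME": "env-user"}
from_explicit = resolve_username("alice", default_username="bob", env=env)
from_default = resolve_username(default_username="bob", env=env)
from_env = resolve_username(env=env)
```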


This is part of the Graph OLAP Platform architecture documentation. See also: Detailed Architecture, Domain & Data Architecture, Platform Operations, Authorization.