SchemaResource

Data catalog exploration

10 min Beginner

Accessed via client.schema, this resource provides read-only access to Starburst catalog metadata — catalogs, schemas, tables, and columns.

All operations read from a metadata cache that is refreshed every 24 hours, so calls are fast and do not hit Starburst directly. Results can therefore lag the live catalog by up to a day.

Setup

Connect to the platform

# Cell 1 — Parameters
USERNAME = "_FILL_ME_IN_"  # Set your email before running

# Cell 2 — Connect
from graph_olap import GraphOLAPClient
client = GraphOLAPClient(username=USERNAME)

# Cell 3 — Provision
from notebook_setup import provision
personas, _ = provision(USERNAME)
analyst = personas["analyst"]
admin = personas["admin"]
ops = personas["ops"]
client = analyst  # use the analyst persona for the read-only calls below

Listing Catalogs

Discover available data sources

`list_catalogs() -> list[Catalog]`

List all cached Starburst catalogs, sorted by name.

Returns: List of Catalog objects.

Field	Type	Description
`catalog_name`	`str`	Catalog name
`schema_count`	`int`	Number of schemas in the catalog
`cached_at`	`str | None`	ISO 8601 timestamp of when metadata was cached

# Ensure schema cache is populated (admin-only operation)
import time

admin.schema.admin_refresh()
print("Cache refresh triggered, waiting for data...")

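# Poll every 5s (up to 120s) until the cache returns at least one catalog
for _ in range(24):
    time.sleep(5)
    catalogs = client.schema.list_catalogs()
    if catalogs:
        break
    print("  waiting...")
else:
    # Loop finished without a break: the cache is still empty
    print("  cache still empty after 120s")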

print(f"\nCatalogs: {len(catalogs)}\n")
for cat in catalogs:
    print(f"  {cat.catalog_name}: {cat.schema_count} schemas")
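Because metadata is refreshed every 24h and each Catalog carries a `cached_at` timestamp, it can be useful to check how old an entry is. The following is a minimal sketch, not part of the SDK: `cache_age` and `is_stale` are hypothetical helpers, and the code assumes that `cached_at` is an ISO 8601 string and that timestamps without an offset are in UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

REFRESH_INTERVAL = timedelta(hours=24)  # refresh interval stated in this reference


def cache_age(cached_at: Optional[str], now: Optional[datetime] = None) -> Optional[timedelta]:
    """Return how long ago metadata was cached, or None if no timestamp is set."""
    if cached_at is None:
        return None
    # Older Python versions reject a trailing "Z", so normalise it to an offset.
    stamp = datetime.fromisoformat(cached_at.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        stamp = stamp.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - stamp


def is_stale(cached_at: Optional[str], now: Optional[datetime] = None) -> bool:
    """Treat a missing timestamp, or one older than the refresh interval, as stale."""
    age = cache_age(cached_at, now)
    return age is None or age > REFRESH_INTERVAL
```

With the `catalogs` list from the cell above, `[c.catalog_name for c in catalogs if is_stale(c.cached_at)]` would list the entries that may be worth refreshing.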

Listing Schemas & Tables

Navigate the catalog hierarchy

`list_schemas(catalog) -> list[Schema]`

List all schemas in a catalog.

Parameter	Type	Description
`catalog`	`str`	Catalog name (e.g., `"analytics"`)

Returns: List of Schema objects.

Raises: NotFoundError if the catalog is not in cache.

# Use the first catalog discovered above
catalog_name = catalogs[0].catalog_name if catalogs else "default"
schemas = client.schema.list_schemas(catalog_name)

for sch in schemas[:5]:
    print(f"  {sch.schema_name}: {sch.table_count} tables")

`list_tables(catalog, schema) -> list[Table]`

List all tables in a schema.

Parameter	Type	Description
`catalog`	`str`	Catalog name
`schema`	`str`	Schema name

Returns: List of Table objects.

Raises: NotFoundError if the schema is not in cache.

# Use the first schema discovered above
schema_name = schemas[0].schema_name if schemas else "public"
tables = client.schema.list_tables(catalog_name, schema_name)

for tbl in tables[:5]:
    print(f"  {tbl.table_name} ({tbl.table_type})")
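The list calls raise NotFoundError when the requested object is not in the cache, which can happen with the placeholder names used as fallbacks above. One way to handle this is a small wrapper that turns the error into an empty result. This is a sketch under assumptions: `list_or_default` is a hypothetical helper, and the exception class is passed in as an argument because this page does not state where NotFoundError is imported from.

```python
from typing import Callable, List, Optional, Type


def list_or_default(
    fetch: Callable[[], List],
    not_found_error: Type[Exception],
    default: Optional[List] = None,
) -> List:
    """Call fetch() and return its result, or a default list if the lookup is not cached."""
    try:
        return fetch()
    except not_found_error:
        # Only the "not found" case is swallowed; any other error still propagates.
        return [] if default is None else default
```

A call might look like `list_or_default(lambda: client.schema.list_tables(catalog_name, schema_name), NotFoundError)`, once NotFoundError has been imported from wherever the SDK exposes it.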

`list_columns(catalog, schema, table) -> list[Column]`

Get all columns for a table, sorted by ordinal position.

Parameter	Type	Description
`catalog`	`str`	Catalog name
`schema`	`str`	Schema name
`table`	`str`	Table name

Returns: List of Column objects.

Raises: NotFoundError if the table is not in cache.

# Use the first table discovered above
table_name = tables[0].table_name if tables else "customers"
columns = client.schema.list_columns(catalog_name, schema_name, table_name)

for col in columns[:5]:
    nullable = "NULL" if col.is_nullable else "NOT NULL"
    print(f"  {col.column_name:20s} {col.data_type:15s} {nullable}")
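The four list calls compose into a full walk of the hierarchy. The generator below is an illustrative sketch rather than an SDK feature: `iter_columns` and its `max_tables` parameter are hypothetical, and it relies only on the methods and field names documented on this page.

```python
from typing import Iterator, Optional, Tuple


def iter_columns(schema_resource, max_tables: Optional[int] = None) -> Iterator[Tuple[str, str, str]]:
    """Yield (qualified_table, column_name, data_type) for each cached column."""
    seen = 0
    for cat in schema_resource.list_catalogs():
        for sch in schema_resource.list_schemas(cat.catalog_name):
            for tbl in schema_resource.list_tables(cat.catalog_name, sch.schema_name):
                # Stop early once the optional table budget is used up.
                if max_tables is not None and seen >= max_tables:
                    return
                seen += 1
                qualified = f"{cat.catalog_name}.{sch.schema_name}.{tbl.table_name}"
                for col in schema_resource.list_columns(
                    cat.catalog_name, sch.schema_name, tbl.table_name
                ):
                    yield qualified, col.column_name, col.data_type
```

Passing `client.schema` and a small `max_tables` keeps the walk short on large catalogs, since each table costs one `list_columns` call.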

Searching

Find tables and columns by name

`search_tables(pattern, limit=100) -> list[Table]`

Search tables by name pattern (prefix match, case-insensitive).

Parameter	Type	Default	Description
`pattern`	`str`	required	Search pattern
`limit`	`int`	`100`	Maximum results (max: 1000)

Returns: List of Table objects matching the pattern.

results = client.schema.search_tables("customer", limit=10)

print(f"Found {len(results)} tables\n")
for tbl in results:
    print(f"  {tbl.catalog_name}.{tbl.schema_name}.{tbl.table_name}")
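To make "prefix match, case-insensitive" concrete, the snippet below reproduces that behaviour locally on a plain list of names. It is an illustration only, not the service's implementation: `prefix_match` is a hypothetical function, and it assumes the pattern is compared against the bare table name rather than the fully qualified one.

```python
from typing import Iterable, List


def prefix_match(names: Iterable[str], pattern: str, limit: int = 100) -> List[str]:
    """Local illustration of a case-insensitive prefix search with a result cap."""
    needle = pattern.lower()
    hits = [name for name in names if name.lower().startswith(needle)]
    return hits[:limit]


names = ["Customers", "customer_orders", "big_customer", "CUSTOMER_EMAIL"]
print(prefix_match(names, "customer"))
# → ['Customers', 'customer_orders', 'CUSTOMER_EMAIL']
```

Note that `big_customer` is not returned: the pattern must match from the start of the name, not anywhere inside it.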

`search_columns(pattern, limit=100) -> list[Column]`

Search columns by name pattern (prefix match, case-insensitive).

Parameter	Type	Default	Description
`pattern`	`str`	required	Search pattern
`limit`	`int`	`100`	Maximum results (max: 1000)

Returns: List of Column objects matching the pattern.

results = client.schema.search_columns("email", limit=10)

print(f"Found {len(results)} columns\n")
for col in results:
    print(f"  {col.catalog_name}.{col.schema_name}.{col.table_name}.{col.column_name}: {col.data_type}")
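Column search results span all catalogs, so grouping them by table can make the output easier to scan. A minimal sketch follows, assuming only the Column fields used in the example above; `group_by_table` is a hypothetical helper, not an SDK method.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_table(columns: Iterable) -> Dict[str, List[str]]:
    """Map 'catalog.schema.table' to the matching column names, in result order."""
    grouped = defaultdict(list)
    for col in columns:
        key = f"{col.catalog_name}.{col.schema_name}.{col.table_name}"
        grouped[key].append(col.column_name)
    return dict(grouped)
```

For example, `group_by_table(client.schema.search_columns("email", limit=10))` would return one entry per table that has a column starting with "email".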

Key Takeaways

client.schema provides read-only access to cached Starburst metadata
Navigate the hierarchy: list_catalogs() → list_schemas() → list_tables() → list_columns()
search_tables() and search_columns() find objects by name across all catalogs
No instances are needed: schema metadata is served from the cache and is available once the cache has been populated