rasteret.catalog
Catalog entries and the in-memory registry used by build() and rasteret datasets ....
Dataset registry: spec-aligned descriptors for known COG collections.
Each `DatasetDescriptor` is a proto-spec-descriptor that captures
identity, access, and band-mapping metadata for a cloud-native GeoTIFF
collection. The `DatasetRegistry` stores them in memory and
auto-populates `rasteret.constants.BandRegistry` and
`rasteret.cloud.CloudConfig`, keyed by STAC collection id.
Users can register custom datasets at runtime:

    import rasteret
    from rasteret.catalog import DatasetDescriptor

    rasteret.register(DatasetDescriptor(
        id="acme/field-survey-2024",
        name="ACME Field Survey",
        stac_api="https://acme.example.com/stac/v1",
        stac_collection="field-survey-2024",
        band_map={"RGB": "image"},
        separate_files=False,
        license="proprietary",
        license_url="https://acme.example.com/license",
    ))

Classes
DatasetDescriptor
dataclass
DatasetDescriptor(
    id: str,
    name: str,
    description: str = "",
    stac_api: str | None = None,
    stac_collection: str | None = None,
    geoparquet_uri: str | None = None,
    column_map: dict[str, str] | None = None,
    href_column: str | None = None,
    band_index_map: dict[str, int] | None = None,
    bbox_columns: dict[str, str] | None = None,
    band_map: dict[str, str] | None = None,
    separate_files: bool = True,
    spatial_coverage: str = "",
    temporal_range: tuple[str, str] | None = None,
    requires_auth: bool = False,
    license: str = "",
    license_url: str = "",
    commercial_use: bool = True,
    static_catalog: bool = False,
    s3_credentials_url: str | None = None,
    cloud_config: dict[str, str] | None = None,
    example_bbox: tuple[float, float, float, float] | None = None,
    example_date_range: tuple[str, str] | None = None,
    torchgeo_class: str | None = None,
    torchgeo_verified: bool = False,
)
A dataset descriptor: identity + access + band mapping.
Proto-spec-descriptor. Each entry will migrate to YAML format when the spec ships. Fields map to spec axes:
- id, name, description -> dataset identity
- stac_api, stac_collection -> access (stac_query)
- geoparquet_uri -> access (parquet_record_table)
- band_map -> field roles (input bands)
- spatial_coverage, temporal_range -> coverage metadata
- license, license_url, commercial_use -> licensing
- static_catalog -> static STAC catalog traversal
- column_map, href_column, band_index_map, bbox_columns -> normalisation hints
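
The field-to-axis mapping above can be written down as plain data. The following is a minimal sketch, not part of rasteret: `SPEC_AXES` and `group_by_axis` are hypothetical names, the axis keys are shortened labels chosen for this example, and the descriptor is represented as a plain dict rather than a `DatasetDescriptor`.

```python
# Hypothetical helper: groups descriptor fields by the spec axes listed above.
# The axis keys are illustrative labels, not names defined by rasteret.
SPEC_AXES = {
    "identity": ("id", "name", "description"),
    "access_stac_query": ("stac_api", "stac_collection"),
    "access_parquet_record_table": ("geoparquet_uri",),
    "field_roles": ("band_map",),
    "coverage": ("spatial_coverage", "temporal_range"),
    "licensing": ("license", "license_url", "commercial_use"),
    "static_catalog": ("static_catalog",),
    "normalisation_hints": (
        "column_map",
        "href_column",
        "band_index_map",
        "bbox_columns",
    ),
}


def group_by_axis(descriptor_fields: dict) -> dict:
    """Split a flat field dict into one sub-dict per axis, skipping empty axes."""
    grouped = {}
    for axis, names in SPEC_AXES.items():
        present = {n: descriptor_fields[n] for n in names if n in descriptor_fields}
        if present:
            grouped[axis] = present
    return grouped


# Field values taken from the registration example in the module docstring.
example = {
    "id": "acme/field-survey-2024",
    "name": "ACME Field Survey",
    "stac_api": "https://acme.example.com/stac/v1",
    "stac_collection": "field-survey-2024",
    "band_map": {"RGB": "image"},
    "license": "proprietary",
}
grouped = group_by_axis(example)
```

Axes with no fields set (here, coverage and the normalisation hints) are simply absent from `grouped`.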
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `id` | `str` | Namespaced identifier (e.g. … | required |
| `name` | `str` | Human-readable name. | required |
| `description` | `str` | One-liner description. | `''` |
| `stac_api` | `str` | STAC API endpoint URL. For static STAC catalogs (no … | `None` |
| `stac_collection` | `str` | STAC collection identifier. May be … | `None` |
| `geoparquet_uri` | `str` | URI to a GeoParquet record table. | `None` |
| `column_map` | `dict` | | `None` |
| `href_column` | `str` | Column in the GeoParquet containing COG URLs. When set and … | `None` |
| `band_index_map` | `dict` | | `None` |
| `bbox_columns` | `dict` | | `None` |
| `band_map` | `dict` | Mapping of band code to STAC asset name. | `None` |
| `separate_files` | `bool` | | `True` |
| `spatial_coverage` | `str` | Geographic coverage hint (e.g. … | `''` |
| `temporal_range` | tuple of `str` | | `None` |
| `requires_auth` | `bool` | Whether credentials are needed to access the data. | `False` |
| `license` | `str` | License identifier. Use the value reported by the STAC API (typically an SPDX id like … | `''` |
| `license_url` | `str` | URL to the full license text. Sourced from the STAC collection's … | `''` |
| `commercial_use` | `bool` | | `True` |
| `static_catalog` | `bool` | | `False` |
| `s3_credentials_url` | `str` | Endpoint for obtaining temporary S3 credentials for auth-gated datasets. When set, … | `None` |
| `example_bbox` | tuple of `float` | Example bounding box (minx, miny, maxx, maxy) known to return data. Used in docs and live smoke tests. | `None` |
| `example_date_range` | tuple of `str` | Example ISO date range (start, end) known to return data. Used in docs and live smoke tests. | `None` |
| `cloud_config` | `dict` | Cloud provider configuration for URL resolution. | `None` |
| `torchgeo_class` | `str` | Equivalent TorchGeo class name (reference only, not a dependency). | `None` |
| `torchgeo_verified` | `bool` | | `False` |
DatasetRegistry
Registry of dataset descriptors. Proto-spec catalog.
Built-in datasets are registered at module import time.
Users can add entries via `register` or the top-level
`rasteret.register` helper.
Functions
register
classmethod
register(descriptor: DatasetDescriptor) -> None
Register a dataset descriptor.
Also populates `rasteret.constants.BandRegistry` and
`rasteret.cloud.CloudConfig`, keyed by the descriptor id so that
provider-specific conventions do not collide (e.g. Planetary Computer
vs Earth Search for sentinel-2-l2a).
Source code in src/rasteret/catalog.py
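
To illustrate why keying by descriptor id avoids collisions, here is a minimal stand-in registry. It is a sketch under stated assumptions, not rasteret's implementation: `MiniDescriptor` and `MiniRegistry` are hypothetical classes, the provider ids are made up, and the behaviours shown (re-registering an id overwrites it, `unregister` returns the removed entry) are assumptions inferred from the signatures on this page.

```python
from dataclasses import dataclass
from typing import ClassVar, Optional


@dataclass(frozen=True)
class MiniDescriptor:
    """Hypothetical stand-in carrying three of DatasetDescriptor's fields."""

    id: str
    name: str
    stac_collection: Optional[str] = None


class MiniRegistry:
    """Hypothetical stand-in for DatasetRegistry: a class-level dict keyed by id."""

    _entries: ClassVar[dict] = {}

    @classmethod
    def register(cls, descriptor: MiniDescriptor) -> None:
        # Assumption: registering an existing id replaces the earlier entry.
        cls._entries[descriptor.id] = descriptor

    @classmethod
    def get(cls, dataset_id: str) -> Optional[MiniDescriptor]:
        return cls._entries.get(dataset_id)

    @classmethod
    def unregister(cls, dataset_id: str) -> Optional[MiniDescriptor]:
        # Assumption: returns the removed descriptor, or None if it was absent.
        return cls._entries.pop(dataset_id, None)


# Two providers expose the same STAC collection id under different namespaces.
MiniRegistry.register(
    MiniDescriptor("provider-a/sentinel-2-l2a", "Sentinel-2 (provider A)", "sentinel-2-l2a")
)
MiniRegistry.register(
    MiniDescriptor("provider-b/sentinel-2-l2a", "Sentinel-2 (provider B)", "sentinel-2-l2a")
)
```

Because the key is the namespaced id rather than the shared collection id, both entries remain retrievable side by side.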
unregister
classmethod
unregister(dataset_id: str) -> DatasetDescriptor | None
get
classmethod
get(dataset_id: str) -> DatasetDescriptor | None
Look up a descriptor by namespaced ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset_id` | `str` | Full namespaced id (e.g. … | required |
Source code in src/rasteret/catalog.py
list
classmethod
list() -> list[DatasetDescriptor]
search
classmethod
search(keyword: str) -> list[DatasetDescriptor]
Search descriptors by keyword in id, name, or description.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `keyword` | `str` | Case-insensitive search term. | required |
Source code in src/rasteret/catalog.py
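
A case-insensitive keyword search over id, name, and description could look like the sketch below. It assumes substring matching, which this page does not state, and it operates on plain dicts; `search_entries` and the sample catalog are hypothetical and are not rasteret code.

```python
def search_entries(entries, keyword):
    """Return entries whose id, name, or description contains `keyword`.

    Matching is case-insensitive; substring matching is an assumption.
    """
    needle = keyword.lower()
    return [
        entry
        for entry in entries
        if any(
            needle in entry.get(field, "").lower()
            for field in ("id", "name", "description")
        )
    ]


# Hypothetical sample data for illustration only.
sample_catalog = [
    {"id": "acme/field-survey-2024", "name": "ACME Field Survey", "description": ""},
    {"id": "demo/tiles", "name": "Demo tiles", "description": "Surface reflectance mosaics"},
]
```

With this data, `search_entries(sample_catalog, "FIELD")` matches the first entry through its id and name, while `"reflectance"` matches the second through its description.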
Functions
load_local_descriptors
load_local_descriptors(
    path: str | Path | None = None,
) -> list[DatasetDescriptor]
Load persisted local dataset descriptors from JSON.
Invalid entries are skipped with a warning.
Source code in src/rasteret/catalog.py
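
The skip-with-a-warning behaviour can be sketched as follows. The on-disk layout (a JSON list of objects) and the validity rule (an entry must be an object with `id` and `name`, the two fields without defaults) are assumptions made for this example; `load_entries` is a hypothetical function, and the actual file format used by rasteret is not documented on this page.

```python
import json
import tempfile
import warnings
from pathlib import Path

# Assumption: an entry is valid if it is an object with these two keys.
REQUIRED_KEYS = ("id", "name")


def load_entries(path):
    """Load a JSON list of descriptor dicts, skipping invalid entries."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    valid = []
    for index, entry in enumerate(raw):
        if not isinstance(entry, dict) or any(k not in entry for k in REQUIRED_KEYS):
            warnings.warn(f"Skipping invalid descriptor entry at index {index}")
            continue
        valid.append(entry)
    return valid


# Demonstration with one valid and one invalid entry in a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "descriptors.json"
    demo_path.write_text(
        json.dumps([{"id": "acme/a", "name": "A"}, {"name": "missing id"}]),
        encoding="utf-8",
    )
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        loaded = load_entries(demo_path)
```

The invalid entry produces one warning and is left out of `loaded`; the valid entry loads normally.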
save_local_descriptor
save_local_descriptor(
    descriptor: DatasetDescriptor,
    path: str | Path | None = None,
) -> None
Persist a local dataset descriptor to JSON (upsert by id).
Source code in src/rasteret/catalog.py
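
"Upsert by id" means that saving a descriptor whose id already exists replaces the stored entry instead of adding a duplicate. A minimal sketch of that pattern, assuming a JSON list of objects on disk (the real file layout is not specified here) and using a hypothetical `upsert_entry` function:

```python
import json
import tempfile
from pathlib import Path


def upsert_entry(path, entry):
    """Insert `entry`, or replace the stored entry that has the same id."""
    path = Path(path)
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    by_id = {existing["id"]: existing for existing in entries}
    # Same id: the old entry is overwritten in place. New id: appended at the end.
    by_id[entry["id"]] = entry
    path.write_text(json.dumps(list(by_id.values()), indent=2), encoding="utf-8")


# Demonstration in a temporary directory: the third call updates "acme/a".
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "descriptors.json"
    upsert_entry(store, {"id": "acme/a", "name": "First name"})
    upsert_entry(store, {"id": "acme/b", "name": "Other dataset"})
    upsert_entry(store, {"id": "acme/a", "name": "Renamed"})
    saved = json.loads(store.read_text(encoding="utf-8"))
```

After the three calls the file holds two entries, and `acme/a` carries the most recent name.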
remove_local_descriptor
remove_local_descriptor(
    dataset_id: str, path: str | Path | None = None
) -> DatasetDescriptor | None
Remove one persisted local descriptor (if present).
Source code in src/rasteret/catalog.py
unregister_local_descriptor
unregister_local_descriptor(
    dataset_id: str, path: str | Path | None = None
) -> DatasetDescriptor | None
Unregister a local dataset from persisted and in-memory registries.
Source code in src/rasteret/catalog.py
export_local_descriptor
export_local_descriptor(
    dataset_id: str,
    output_path: str | Path,
    path: str | Path | None = None,
) -> Path
Export one local descriptor as JSON for sharing.