Quickstart: xarray
This notebook shows the minimal Rasteret workflow:
- Build a Collection from the dataset catalog (one-time, cached)
- Fetch pixels for a small AOI + time window as xarray.Dataset
- Compute NDVI
from pathlib import Path
import xarray as xr
from shapely.geometry import Polygon
import rasteret
Define area of interest
We use a small polygon over Bengaluru, India. The STAC query uses the polygon's bounding box to find matching Sentinel-2 L2A scenes.
aoi = Polygon(
    [
        (77.55, 13.01),
        (77.58, 13.01),
        (77.58, 13.08),
        (77.55, 13.08),
        (77.55, 13.01),
    ]
)
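To make the bounding-box step concrete, the sketch below computes the box by hand from the same vertex list. It relies only on the standard library; `bounding_box` is a hypothetical helper written for illustration, and it returns values in the same `(minx, miny, maxx, maxy)` order as shapely's `Polygon.bounds`.

```python
# Vertices of the AOI polygon from the text, as (longitude, latitude) pairs.
coords = [
    (77.55, 13.01),
    (77.58, 13.01),
    (77.58, 13.08),
    (77.55, 13.08),
    (77.55, 13.01),
]


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a sequence of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


bbox = bounding_box(coords)
print(bbox)  # (77.55, 13.01, 77.58, 13.08)
```

Because this AOI is an axis-aligned rectangle, the bounding box and the polygon cover the same area. For an irregular polygon, a bounding-box query can match scenes that touch the box but not the polygon itself.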
Build the Collection
build() picks a dataset from the catalog (here Sentinel-2 on Earth Search),
queries the STAC API, parses COG headers for every matching scene, and writes
the result to a local Parquet index. On the next run, the cache is loaded in
milliseconds.
For more on the catalog and local descriptors, see Dataset Catalog & Descriptors.
For full-control STAC indexing, use build_from_stac() with explicit API and
collection parameters; see the
Collection Management how-to guide.
workspace = Path.home() / "rasteret_workspace"
collection = rasteret.build(
    "earthsearch/sentinel-2-l2a",
    name="bangalore",
    bbox=aoi.bounds,
    date_range=("2024-01-01", "2024-01-31"),
    workspace_dir=workspace,
)
print(f"Collection: {collection.name}")
print(f"Scenes: {collection.dataset.count_rows()}")
print(f"Columns: {collection.dataset.schema.names[:8]}...")
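The build-once, load-afterwards behaviour described above follows a common caching pattern. The sketch below illustrates that pattern in general terms and is not Rasteret's implementation: `load_or_build` and `expensive_build` are hypothetical names, the index is a made-up one-row list, and JSON stands in for Parquet so the example needs only the standard library.

```python
import json
import tempfile
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Return (index, from_cache); call build_fn only if no cache file exists."""
    if cache_path.exists():
        return json.loads(cache_path.read_text()), True
    index = build_fn()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(index))
    return index, False


def expensive_build():
    # Hypothetical stand-in for the STAC query and COG header parsing.
    return [{"scene_id": "scene-a", "cloud_cover": 12.5}]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "bangalore" / "index.json"
    first, first_cached = load_or_build(cache_file, expensive_build)
    second, second_cached = load_or_build(cache_file, expensive_build)

print(first_cached, second_cached)  # False True
```

The second call never invokes `expensive_build`, which is the property that makes repeated runs cheap.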
Fetch pixels as xarray
get_xarray() reads only the tiles that intersect the AOI. Filters
(cloud_cover_lt, date_range) are applied as Arrow pushdown predicates
before any HTTP requests are made.
ds = collection.get_xarray(
    geometries=[aoi],
    bands=["B04", "B08"],
    cloud_cover_lt=20,
    date_range=("2024-01-10", "2024-01-30"),
)
ds
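Reading only the tiles that intersect the AOI comes down to integer arithmetic on pixel offsets. The sketch below shows one way that selection could work for an internally tiled raster; it is an illustration under assumptions, not Rasteret's code. `intersecting_tiles` is a hypothetical helper, the 512-pixel tile size and the pixel window are made up, and 10980 x 10980 is the size of a Sentinel-2 10 m band.

```python
def intersecting_tiles(window, tile_size, image_size):
    """Return (row, col) indices of the tiles that overlap a pixel window.

    window:     (col_off, row_off, width, height) in pixels
    tile_size:  (tile_width, tile_height) in pixels
    image_size: (image_width, image_height) in pixels
    """
    col_off, row_off, width, height = window
    tile_w, tile_h = tile_size
    img_w, img_h = image_size

    # Clip the window to the image so out-of-range requests select nothing.
    x0, y0 = max(col_off, 0), max(row_off, 0)
    x1, y1 = min(col_off + width, img_w), min(row_off + height, img_h)
    if x0 >= x1 or y0 >= y1:
        return []

    # The last covered pixel is x1 - 1 / y1 - 1, hence the "- 1" below.
    first_col, last_col = x0 // tile_w, (x1 - 1) // tile_w
    first_row, last_row = y0 // tile_h, (y1 - 1) // tile_h
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


tiles = intersecting_tiles(
    window=(900, 400, 300, 200),   # assumed AOI window in pixel coordinates
    tile_size=(512, 512),          # assumed internal tile size
    image_size=(10980, 10980),     # Sentinel-2 10 m band dimensions
)
print(tiles)  # [(0, 1), (0, 2), (1, 1), (1, 2)]
```

Under these assumptions the window touches four tiles, so only those four would need to be requested rather than the whole band.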
Compute NDVI
NDVI is computed with standard xarray operations, because Rasteret hands off a standard xr.Dataset.
ndvi = (ds["B08"] - ds["B04"]) / (ds["B08"] + ds["B04"])
out = xr.Dataset({"ndvi": ndvi}, coords=ds.coords, attrs=ds.attrs)
out
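As a worked example of the formula, the sketch below evaluates NDVI for a few made-up reflectance pairs in plain Python. The `ndvi` helper and the sample values are hypothetical. The guard for a zero denominator is an addition for illustration: a pixel where both bands are zero (for example nodata) has no defined NDVI.

```python
import math


def ndvi(nir, red):
    """Return (nir - red) / (nir + red), or NaN when the denominator is zero."""
    total = nir + red
    if total == 0:
        return math.nan
    return (nir - red) / total


# Made-up (NIR, red) reflectance pairs: vegetation, bare surface, nodata.
samples = [(0.5, 0.1), (0.3, 0.3), (0.0, 0.0)]
values = [ndvi(nir, red) for nir, red in samples]

print([f"{v:.3f}" for v in values])  # ['0.667', '0.000', 'nan']
```

On an xarray object a comparable guard could be written by masking with `.where(...)` on the denominator before or after the division.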
What happened
- build() looked up earthsearch/sentinel-2-l2a in the catalog, queried STAC once, parsed every COG header, and cached the result as partitioned Parquet.
- get_xarray() read the Parquet index, computed which tiles intersect the AOI, fetched those tiles in parallel, and assembled an xr.Dataset.
- Subsequent runs skip the STAC query and header parsing entirely.
Next: TorchGeo Integration