Cloud-native geospatial —
without the copy.
Query petabyte-scale Zarr and NetCDF archives with SQL, directly in object storage. No pipelines to copy data into a database first.
Query in place
No ETL, no duplicated petabytes, no separate database to maintain.
Open standards, no lock-in
Built on Zarr and Apache Arrow/DataFusion — open formats, permissively licensed.
Boutique engineering
Small senior team, hands-on production engagements, deep specialization.
zarr-datafusion
SQL on Zarr-native array data, powered by Apache DataFusion. Open source, and the technical foundation behind everything we build.
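One way to picture SQL over array data: every cell of an N-dimensional array becomes a row keyed by its coordinates. The sketch here is illustrative only. It is not the zarr-datafusion implementation, and the tiny in-memory cube, its coordinate values, and the helper names (`rows`, `avg_temperature_by_time`) are all hypothetical. It mirrors a filter-then-average-by-time query in plain Python.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy cube: temperature[time][lat][lon], plus coordinate labels.
# A real archive would hold these as chunked arrays in object storage.
times = ["2019-12-31", "2020-01-02", "2020-01-03"]
lats = [10.0, 20.0]
lons = [100.0, 110.0]
temperature = [
    [[280.0, 281.0], [282.0, 283.0]],
    [[270.0, 272.0], [274.0, 276.0]],
    [[290.0, 290.0], [292.0, 292.0]],
]


def rows():
    """Yield one (time, lat, lon, temperature) row per array cell."""
    for (i, t), (j, lat), (k, lon) in product(
        enumerate(times), enumerate(lats), enumerate(lons)
    ):
        yield t, lat, lon, temperature[i][j][k]


def avg_temperature_by_time(after, limit):
    """Rough equivalent of:
    SELECT time, AVG(temperature) ... WHERE time > after GROUP BY time LIMIT limit
    """
    sums = defaultdict(lambda: [0.0, 0])  # time -> [running total, count]
    for t, _lat, _lon, value in rows():
        if t > after:  # ISO-8601 date strings compare correctly as text
            acc = sums[t]
            acc[0] += value
            acc[1] += 1
    # SQL gives no ordering guarantee without ORDER BY; sorting here only
    # keeps the toy output deterministic.
    averaged = [(t, total / count) for t, (total, count) in sorted(sums.items())]
    return averaged[:limit]


result = avg_temperature_by_time("2020-01-01", 5)
for t, avg in result:
    print(t, avg)
```

The toy version materializes every cell as a row. A production engine would presumably avoid that by reading only the chunks a query touches, which is the point of querying in place.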
zarr> CREATE EXTERNAL TABLE era5 STORED AS ZARR
LOCATION 'gs://gcp-public-data-arco-era5/...';
zarr> SELECT time, AVG(temperature)
FROM era5
WHERE time > '2020-01-01'
GROUP BY time
LIMIT 5;
Get in touch
Working with large geospatial array data and hitting the copy-everything wall? Let's talk.
hi@stratoscale.io