Introduction

zarrs is a Rust library for the Zarr V2 and Zarr V3 array storage formats. If you don't know what Zarr is, check out:

zarrs was originally designed exclusively as a Rust library for Zarr V3. However, it now supports a V3-compatible subset of Zarr V2, and has Python and C/C++ bindings. This book details the Rust implementation.

🚀 zarrs is Fast 🚀

The zarr_benchmarks repository benchmarks zarrs against other Zarr V3 implementations. The benchmarks below measure the time to round trip a \(1024 \times 2048 \times 2048\) uint16 array encoded in various ways. The repository includes additional benchmarks.

[Figure: standalone benchmark]
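
For a sense of scale, the uncompressed size of the benchmark array follows from its shape and data type. The arithmetic can be sketched in Python, assuming only that a uint16 element occupies 2 bytes (the variable names are illustrative):

```python
import math

# Shape and element size of the benchmark array described above.
shape = (1024, 2048, 2048)
itemsize = 2  # bytes per uint16 element

num_elements = math.prod(shape)      # 2**32 elements
num_bytes = num_elements * itemsize  # 2**33 bytes

print(f"{num_elements} elements, {num_bytes / 2**30:.0f} GiB uncompressed")
# prints "4294967296 elements, 8 GiB uncompressed"
```

The decoded array is therefore 8 GiB, whatever encoding is used on disk.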

Python Bindings: zarrs-python

zarrs-python exposes a high-performance zarrs-backed codec pipeline to the reference zarr-python Python package. It is enabled as follows:

from zarr import config
import zarrs # noqa: F401

# Route zarr-python's encoding and decoding through the zarrs codec pipeline.
config.set({"codec_pipeline.path": "zarrs.ZarrsCodecPipeline"})

That's it! There is no need to learn a new API, and it is supported by downstream libraries like dask. However, zarrs-python has some limitations. Consult the zarrs-python README or PyPI docs for more details.
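
Where zarrs-python may not be installed in every environment, the configuration call can be wrapped in a guard. This is a minimal sketch, not part of either package: `enable_zarrs_pipeline` is a hypothetical helper name, and it assumes that leaving the configuration untouched keeps zarr-python's default codec pipeline.

```python
def enable_zarrs_pipeline() -> bool:
    """Try to switch zarr-python to the zarrs codec pipeline.

    Hypothetical helper: returns True if the pipeline was configured,
    and False if zarr-python or zarrs-python could not be imported
    (the existing configuration is then left untouched).
    """
    try:
        from zarr import config
        import zarrs  # noqa: F401
    except ImportError:
        return False
    config.set({"codec_pipeline.path": "zarrs.ZarrsCodecPipeline"})
    return True


if __name__ == "__main__":
    print("zarrs pipeline enabled:", enable_zarrs_pipeline())
```

Calling the helper once at program start-up keeps the rest of the code identical whether or not the zarrs-backed pipeline is available.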

Rust Crates

The Zarr specification is inherently unstable. It is under active development and new extensions are regularly being introduced.

The zarrs crate has been split into multiple crates to:

  • allow external implementations of stores and extension points to target a relatively stable API compatible with a range of zarrs versions,
  • enable automatic backporting of metadata compatibility fixes and changes due to standardisation,
  • stay up-to-date with unstable public dependencies (e.g. opendal, object_store, icechunk, etc.) without impacting the release cycle of zarrs, and
  • improve compilation times.

Below is an overview of the crate structure:

The core crate is:

  • zarrs

For local filesystem stores (referred to as native Zarr), this is the only crate you need to depend on.

zarrs has quite a few supplementary crates:

  • zarrs_metadata
  • zarrs_metadata_ext
  • zarrs_storage
  • zarrs_plugin
  • zarrs_data_type
  • zarrs_registry

Tip: The supplementary crates are transitive dependencies of zarrs, and are re-exported in the crate root. You do not need to add them as direct dependencies.

Note: The supplementary crates are separated from zarrs to enable development of Zarr extensions and stores targeting a more stable API than zarrs itself.

Additional crates need to be added as dependencies in order to use:

  • remote stores (e.g. HTTP, S3, GCP, etc.),
  • zip stores, or
  • icechunk transactional storage.

The Stores chapter details the various types of stores and their associated crates.

C/C++ Bindings: zarrs_ffi

zarrs_ffi exposes a subset of zarrs as a C/C++ API. It is a single-header library: zarrs.h. Consult the zarrs_ffi README and API docs for more information.

CLI Tools: zarrs_tools

Various tools for creating and manipulating Zarr V3 data with the zarrs Rust crate. This crate is detailed in the zarrs_tools chapter.

Zarr Metadata Conventions

ome_zarr_metadata

A Rust library for OME-Zarr (previously OME-NGFF) metadata.

OME-Zarr, formerly known as OME-NGFF (Open Microscopy Environment Next Generation File Format), is a specification designed to support modern scientific imaging needs. It is widely used in microscopy, bioimaging, and other scientific fields requiring high-dimensional data management, visualisation, and analysis.