Introduction

zarrs is a Rust library for the Zarr V2 and Zarr V3 array storage formats. If you don't know what Zarr is, check out:

zarrs was originally designed exclusively as a Rust library for Zarr V3. However, it now supports a V3-compatible subset of Zarr V2 and has Python and C/C++ bindings. This book details the Rust implementation.

🚀 zarrs is Fast 🚀

The zarr_benchmarks repository includes benchmarks of zarrs against other Zarr V3 implementations. The benchmarks below measure the time to round trip a \(1024 \times 2048 \times 2048\) uint16 array encoded in various ways. Additional benchmarks are available in the repository.

[Figure: benchmark standalone]
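For a sense of scale, the uncompressed size of the benchmark array follows directly from its shape and data type. The snippet below is only an illustrative sketch: the helper name `uncompressed_size_bytes` is hypothetical and is not part of zarrs or zarr_benchmarks.

```python
from math import prod


def uncompressed_size_bytes(shape, itemsize):
    """Return the raw (uncompressed) size in bytes of a dense array.

    Hypothetical helper for illustration only; not part of any zarrs API.
    """
    return prod(shape) * itemsize


shape = (1024, 2048, 2048)  # benchmark array shape quoted in the text
itemsize = 2                # a uint16 element occupies 2 bytes

size = uncompressed_size_bytes(shape, itemsize)
print(f"{size} bytes = {size / 2**30:.0f} GiB")  # prints "8589934592 bytes = 8 GiB"
```

Each round trip therefore moves about 8 GiB of raw data before any compression codec is applied.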

Python Bindings: zarrs-python

zarrs-python exposes a high-performance zarrs-backed codec pipeline to the reference zarr-python Python package. It is enabled as follows:

from zarr import config
import zarrs # noqa: F401

config.set({"codec_pipeline.path": "zarrs.ZarrsCodecPipeline"})

That's it! There is no need to learn a new API, and it is supported by downstream libraries like dask. However, zarrs-python has some limitations. Consult the zarrs-python README or PyPI docs for more details.

Rust Crates

The Zarr specification is inherently unstable. It is under active development and new extensions are regularly being introduced.

The zarrs crate has been split into multiple crates to:

  • allow external implementations of stores and extension points to target a relatively stable API compatible with a range of zarrs versions,
  • enable automatic backporting of metadata compatibility fixes and changes due to standardisation,
  • stay up-to-date with unstable public dependencies (e.g. opendal, object_store, icechunk) without impacting the release cycle of zarrs, and
  • improve compilation times.

Below is a slightly simplified overview of the crate structure:

graph LR
    subgraph tools[CLI Tools]
        zarrs_tools
    end
    subgraph metadata_conventions[Zarr Metadata Conventions]
        ome_zarr_metadata
    end
    subgraph Stores
        direction LR
        zarrs_filesystem[zarrs_filesystem <br> zarrs::filesystem]
        zarrs_object_store
        zarrs_opendal
        zarrs_http
        zarrs_icechunk
    end
    subgraph Core
        zarrs_storage[zarrs_storage <br> zarrs::storage]
        zarrs_metadata_ext[zarrs_metadata_ext <br> zarrs::metadata_ext]
        zarrs_metadata[zarrs_metadata <br> zarrs::metadata]
        zarrs_registry[zarrs_registry <br> zarrs::registry]
        zarrs_plugin[zarrs_plugin <br> zarrs::plugin]
        subgraph Extensions
            direction LR
            zarrs_data_type[zarrs_data_type <br> zarrs::array::data_type]
            %% zarrs_codec TODO
            %% zarrs_chunk_grid TODO
        end
        zarrs
    end
    subgraph storage_adapters[Storage Adapters]
        zarrs_zip
    end
    subgraph Bindings
        %% direction LR
        zarrs_ffi[zarrs_ffi <br> C/C++]
        zarrs-python[zarrs-python <br> Python]
    end
    zarrs_storage --> zarrs
    %% zarrs_registry --> zarrs
    zarrs_metadata_ext --> zarrs
    zarrs_metadata --> zarrs_metadata_ext
    zarrs_registry --> zarrs_metadata_ext
    %% zarrs_metadata --> zarrs
    %% zarrs_metadata --> Extensions
    zarrs_metadata_ext --> Extensions
    zarrs_plugin --> Extensions
    Extensions --> zarrs
    %% zarrs_plugin ---> zarrs
    ome_zarr_metadata --> zarrs_tools
    Stores --> storage_adapters
    storage_adapters --> zarrs_storage
    Stores --> zarrs_storage
    Core --> tools
    Core --> Bindings

The core crate is:

  • zarrs

For local filesystem stores (referred to as native Zarr), this is the only crate you need to depend on. zarrs has quite a few supplementary crates that are typically just used as transitive dependencies:

  • zarrs_metadata
  • zarrs_metadata_ext
  • zarrs_storage
  • zarrs_plugin
  • zarrs_data_type
  • zarrs_registry

Additional crates need to be added as dependencies in order to use:

  • remote stores (e.g. HTTP, S3, GCP),
  • zip stores, or
  • icechunk transactional storage.

The Stores chapter details the various types of stores and their associated crates.

C/C++ Bindings: zarrs_ffi

zarrs_ffi exposes a subset of zarrs as a C/C++ API. It is a single-header library: zarrs.h. Consult the zarrs_ffi README and API docs for more information.

CLI Tools: zarrs_tools

Various tools for creating and manipulating Zarr V3 data with the zarrs Rust crate. This crate is detailed in the zarrs_tools chapter.

Zarr Metadata Conventions

ome_zarr_metadata

A Rust library for OME-Zarr (previously OME-NGFF) metadata.

OME-Zarr, formerly known as OME-NGFF (Open Microscopy Environment Next Generation File Format), is a specification designed to support modern scientific imaging needs. It is widely used in microscopy, bioimaging, and other scientific fields requiring high-dimensional data management, visualisation, and analysis.