Release notes¶

Unreleased¶

New Features¶

GRIB1/GRIB2 files can now be read as virtual datasets via the GribberishParser from the gribberish library (>=1.0.0), installable with pip install "virtualizarr[grib]". Each GRIB message becomes one chunk, decoded on read through gribberish's registered zarr codec. Like the TIFF parser, the parser itself lives in the third-party package; VirtualiZarr just adds the optional dependency, docs, and tests. By Tom Nicholas.
ManifestArray.with_fill_value_only(fill_value) — return a new ManifestArray with the same schema (shape, chunks, codecs, dimension names, attributes) as the original but with an empty chunk manifest and the given fill_value. Useful as a typed placeholder for a variable that is absent from one source but present in others. By Tom Nicholas.
open_virtual_dataset and open_virtual_datatree now populate ds.encoding["source"] with the normalized source URI, mirroring xarray.open_dataset's behaviour. Parsers that have already set encoding["source"] are left untouched. By Tom Nicholas.

Documentation¶

Document that virtual concatenation also requires homogeneous CF encoding (scale_factor/add_offset) across files — xarray's default attribute-merging silently drops mismatched values and produces incorrectly-decoded data on read. Added a new FAQ bullet and a warning admonition under "Combining virtual datasets" in the usage docs. See #1004. (#1006). By Tom Nicholas.
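
To illustrate why heterogeneous CF packing matters, here is a minimal sketch of the CF unpacking rule and a pre-flight consistency check. The helper names `decode` and `check_homogeneous_cf_encoding` are hypothetical and not part of VirtualiZarr; plain dicts stand in for each file's variable attributes.

```python
def decode(raw, scale_factor=1.0, add_offset=0.0):
    """CF unpacking rule: decoded = raw * scale_factor + add_offset."""
    return raw * scale_factor + add_offset


def check_homogeneous_cf_encoding(per_file_attrs, keys=("scale_factor", "add_offset")):
    """Raise ValueError if the files disagree on any CF packing attribute.

    per_file_attrs is a list of dicts, one per source file (a hypothetical
    stand-in for each variable's attributes).
    """
    for key in keys:
        values = {attrs.get(key) for attrs in per_file_attrs}
        if len(values) > 1:
            raise ValueError(
                f"inconsistent {key!r} across files: {sorted(values, key=repr)}"
            )


file_a = {"scale_factor": 0.01, "add_offset": 0.0}
file_b = {"scale_factor": 0.1, "add_offset": 0.0}

# The same stored integer means different physical values in each file, so
# decoding both with a single surviving attribute would corrupt one of them.
decoded_a = decode(1500, **file_a)
decoded_b = decode(1500, **file_b)
```

Running a check like this before concatenating is one way to fail loudly instead of reading back incorrectly decoded values.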

v2.6.2 (18th May 2026)¶

Adds an IcechunkParser for reading existing icechunk repositories as virtual datasets without copying data, chunk-aligned indexing on ManifestArray (so xarray.Dataset.isel works end-to-end on virtual datasets), and limited sub-chunk slicing for uncompressed arrays.

New Features¶

New IcechunkParser — opens an existing icechunk repository and converts it into a ManifestStore without copying data. Maps icechunk virtual refs straight through, exposes native (managed) chunks as VirtualiZarr virtual refs under {native_chunks_prefix}/{chunk_id}, and preserves inline chunks. Provides both the protocol-conformant IcechunkParser()(url, registry) entry point and an IcechunkParser().parse_session(session, registry, ...) escape hatch for callers that already have an open icechunk Session. Requires icechunk >= 2.0.5; the [icechunk] extra still pins >=2.0.3 so writer-only users aren't forced to upgrade. (#991). By Tom Nicholas.
ManifestArray now supports chunk-aligned integer and slice indexing along each axis, including multi-chunk slices, mixed integer + slice indexers, and selections that include a partial final chunk. Integer indexers drop the indexed axis (numpy / array-API semantics) and are legal only when chunk_size == 1 along that axis; slice indexers preserve the axis. This makes xarray.Dataset.isel work end-to-end on virtual datasets for any chunk-aligned selection. Indexers that would split individual chunks raise a new SubChunkIndexingError (a ValueError subclass). Closes #51, supersedes #499. (#994). By Tom Nicholas.
Slicing along the largest-stride storage axis of an uncompressed ManifestArray can now sub-divide a chunk — the result is a new reference into the same source file with a bumped byte offset and a smaller length. Eligible codec stacks are [BytesCodec] (C-order) and [TransposeCodec(...), BytesCodec] (e.g. F-order). Useful for picking a single timestep from a multi-row chunk produced by, e.g., the netCDF3 parser, without rechunking. Limited to slices that fit within one source chunk. Addresses part of #86. (#996). By Tom Nicholas.
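
The two indexing rules above can be sketched in plain Python. This is an illustration of the arithmetic only, under stated assumptions: the function names are hypothetical, `SubChunkIndexingError` is redefined locally as a stand-in, and the byte-range helper assumes an uncompressed C-order chunk sliced along axis 0.

```python
class SubChunkIndexingError(ValueError):
    """Local stand-in for the error described above (not imported from VirtualiZarr)."""


def chunk_aligned_slice(start, stop, chunk_size, axis_length):
    """Map an element slice [start, stop) onto a range of chunk indices.

    Aligned means: start lies on a chunk boundary, and stop lies on a chunk
    boundary or at the end of the axis (which permits a partial final chunk).
    """
    if start % chunk_size != 0 or (stop % chunk_size != 0 and stop != axis_length):
        raise SubChunkIndexingError(
            f"slice({start}, {stop}) would split chunks of size {chunk_size}"
        )
    first_chunk = start // chunk_size
    last_chunk = -(-stop // chunk_size)  # ceiling division
    return first_chunk, last_chunk


def sub_chunk_reference(offset, shape, itemsize, row_start, row_stop):
    """Byte range for rows [row_start, row_stop) of one uncompressed C-order chunk.

    In C order, axis 0 has the largest stride, so consecutive rows are
    contiguous and the selection is a single byte range inside the chunk.
    """
    row_nbytes = itemsize
    for n in shape[1:]:
        row_nbytes *= n
    new_offset = offset + row_start * row_nbytes
    new_length = (row_stop - row_start) * row_nbytes
    return new_offset, new_length
```

For example, picking row 2 of a `(4, 3, 2)` float64 chunk stored at byte offset 1000 gives a reference starting at byte 1096 with length 48.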

Bug fixes¶

Fix vds.vz.to_icechunk() raising IcechunkError("invalid zarr key format") when the manifest contains inlined chunks. The icechunk writer now always emits c/0/0/0-form chunk keys regardless of the manifest's stored chunk-key separator. Mainly surfaces with IcechunkParser (icechunk inlines small chunks aggressively); existing parsers don't produce inlined chunks and aren't affected. (#991). By Tom Nicholas.
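
As a rough illustration of the key normalisation described above, the sketch below rewrites a stored chunk key into the `c/0/0/0` form. The function name is hypothetical, and the sketch assumes that stored keys are integer indices joined by a single separator character; it is not the writer's actual implementation.

```python
def to_slash_chunk_key(chunk_key, separator="."):
    """Rewrite a stored chunk key such as '0.0.0' as 'c/0/0/0'.

    Assumes the key is a sequence of non-negative integer indices joined by
    `separator` (an assumption made for this sketch).
    """
    indices = chunk_key.split(separator)
    if not all(part.isdigit() for part in indices):
        raise ValueError(f"not a chunk key: {chunk_key!r}")
    return "c/" + "/".join(indices)
```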

Documentation¶

Add guidance on fill values and scale/offset to the custom parser docs. (#974). By Max Jones.
Align all parser docstrings with IcechunkParser — promote per-parser docstrings to the class level so the rendered API reference is consistent across parsers. (#999). By Tom Nicholas.

Internal changes¶

Mark test_read_netcdf3 as also requiring kerchunk, since NetCDF3Parser lazily imports kerchunk.netCDF3 and the test would otherwise raise ModuleNotFoundError in environments with scipy but not kerchunk. (#998). By Tom Nicholas.
Move dev-version git sources out of the pixi workspace (pyproject.toml's upstream dependency group) into ci/upstream-overrides.txt, applied as a pip overlay in the Upstream CI job. Stops every pixi solve (docs, minimum-versions, etc.) from having to build wheels for those packages, which was causing intermittent SIGSEGV / ETXTBSY failures on memory-constrained CI runners. Closes #995. (#997). By Tom Nicholas.

v2.6.1 (3rd May 2026)¶

Adds end-to-end support for inlined chunk references in ChunkManifest (read via Kerchunk parsers, write via Kerchunk and Icechunk writers), plus Zarr-Python 3.2.0 compatibility and several bug fixes.

New Features¶

ChunkManifest can now hold inlined chunks — raw chunk bytes carried directly in memory rather than as references to external files. Intended for parser authors (e.g., loading Kerchunk references with inlined data); not exposed via loadable_variables. (#938). By Max Jones and Tom Nicholas.
KerchunkJSONParser and KerchunkParquetParser now parse inline chunk data (both raw-string and base64:-prefixed forms) into inlined ChunkManifest entries, instead of raising NotImplementedError. Fixes the read side of #489. (#979). By Tom Nicholas.
The kerchunk writer now serializes inlined ChunkManifest entries as kerchunk's base64:-prefixed inline form, rather than emitting broken ["__inlined__", 0, length] triples. Together with the read-side support added in #979, this means a virtual dataset with inlined chunks can be round-tripped through both to_kerchunk(format="json"/"parquet") and the corresponding KerchunkJSONParser/KerchunkParquetParser. (#980). By Tom Nicholas.
The icechunk writer now handles ChunkManifest entries containing inlined chunk data. For arrays with no inlined chunks the existing fast bulk set_virtual_refs_arr path is unchanged; otherwise inlined positions are sent to icechunk as empty (missing) virtual refs and the inlined bytes are written separately as managed chunks. A virtual dataset with inlined chunks can now be to_icechunk'd and re-opened via xr.open_zarr without data loss. (#981). By Tom Nicholas.
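
The two kerchunk inline forms mentioned above (raw string and `base64:`-prefixed) can be sketched with the standard library alone. The helper names are hypothetical, and treating the raw-string form as UTF-8 is an assumption of this sketch rather than a statement about the parsers' internals.

```python
import base64

PREFIX = "base64:"


def encode_inline(data: bytes) -> str:
    """Serialize raw chunk bytes as a base64:-prefixed inline string."""
    return PREFIX + base64.b64encode(data).decode("ascii")


def decode_inline(value: str) -> bytes:
    """Recover chunk bytes from either inline form."""
    if value.startswith(PREFIX):
        return base64.b64decode(value[len(PREFIX):])
    # Raw-string form: the text itself is the data (UTF-8 assumed here).
    return value.encode("utf-8")
```

Because the prefixed form can carry arbitrary bytes, encoding and then decoding returns the original chunk unchanged, which is the property a round trip relies on.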

Bug fixes¶

Fix KerchunkParser rejecting cloud URI roots (e.g. s3://bucket) in fs_root. (#976). By Tom Nicholas.
Fix HDFParser failing on HDF5 datasets with a zero-length dimension under zarr-python >= 3.2.0, which forbids zero-length chunk dimensions. Chunk dimensions are now clamped to a minimum of 1 when falling back from dataset shape. (#977). By Tom Nicholas.
Fix KerchunkParser mangling scheme-only fs_root values like s3:// into s3:/key when joining paths. (#984). By Tom Nicholas.
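
Two of the fixes above lend themselves to a short sketch. Both helpers are hypothetical illustrations, not the code used in the parsers: `clamp_chunk_shape` shows the minimum-of-1 clamp, and `join_fs_root` shows one way to join a relative path onto a root while keeping the `://` of a scheme intact. The final line shows how generic POSIX path normalisation collapses a double slash, which is one way this kind of mangling can arise.

```python
import posixpath


def clamp_chunk_shape(shape):
    """Clamp each chunk dimension to at least 1 (zero-length chunks are invalid)."""
    return tuple(max(1, n) for n in shape)


def join_fs_root(fs_root: str, relative_path: str) -> str:
    """Join a relative path onto fs_root without disturbing a URI scheme."""
    scheme, sep, rest = fs_root.partition("://")
    if not sep:
        # Plain filesystem root: ordinary POSIX joining is fine.
        return posixpath.join(fs_root, relative_path)
    # Scheme-only roots such as "s3://" have an empty remainder.
    joined = posixpath.join(rest, relative_path) if rest else relative_path.lstrip("/")
    return f"{scheme}://{joined}"


# Generic path normalisation is not URI-aware and drops one of the slashes.
mangled = posixpath.normpath("s3://key")
```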

Documentation¶

Expand ChunkManifest documentation with detail on construction and the shape argument. (#961). By Tyler Anderson.

Internal changes¶

Add compatibility with Zarr-Python 3.2.0. (#957). By Max Jones.
Update typing for internal changes in Zarr-Python 3.2.0. (#985). By Max Jones.

v2.6.0 (16th April 2026)¶

Now requires icechunk 2.x, enabling a ~3x performance improvement for writing virtual references. Also drops support for Python 3.11.

New Features¶

Breaking changes¶

Now requires icechunk >= 2.0.3. (#967). By Tom Nicholas.
Dropped support for Python 3.11. Python 3.12+ is now required, matching icechunk 2.x. (#969). By Tom Nicholas.

Bug fixes¶

Fix scalar variable manifests getting shape (1,) instead of () from kerchunk references. (#965). By Tom Nicholas.

Documentation¶

Internal changes¶

Use set_virtual_refs_arr for ~3x faster virtual ref writing to icechunk. (#967). By Tom Nicholas.

v2.5.1 (9th April 2026)¶

Adds support for sharded Zarr V3 arrays, and includes several other bug fixes.

New Features¶

Support for sharded Zarr V3 arrays in ZarrParser and icechunk writer. (#946, #952). By Tom Nicholas.

Breaking changes¶

Bug fixes¶

Fix handling of scalar Zarr V3 arrays with None dimension_names. (#897). By Lars Buntemeyer.
Fix allowing Azure URLs. (#943). By Max Jones.
Add h5py import for dimension variable handling. (#955). By Tom Nicholas.
Fix mypy error in FITSParser for optional reader_options. (#959). By Tom Nicholas.

Documentation¶

Fix note markdown in developer docs. (#948). By Aimee Barciauskas.

Internal changes¶

Add iteration helpers to ChunkManifest. (#939). By Max Jones.
Fix simple_netcdf4 test fixture to explicitly use netcdf4 engine. (#958). By Tom Nicholas.
Fix flaky region test dimension ordering. (#960). By Tom Nicholas.

v2.5.0 (23rd March 2026)¶

Brings region-writing support in .to_icechunk(), a ZarrParser with orders of magnitude better performance, more FAQ docs, and various bugfixes.

New Features¶

Added region parameter to to_icechunk(). (#873). By Vladislav Wohlrath.
Support configurable chunk separator. (#917). By Max Jones.
Improved ZarrParser performance enormously by using obstore to list chunks in a directory instead of getting all their sizes individually. (#892). By Raphael Hagen.

Breaking changes¶

Minimum required version of obspec_utils is now 0.9.0.

Bug fixes¶

Fix setting fill_value for Zarr V2 arrays if data type is a subtype of integer or float. (#845). By Hauke Schulz.
Fix reading kerchunk parquet references with sparse arrays (missing chunks represented as NULL). (#864). By Tom Nicholas.
Raise clearer error when kerchunk references have malformed codec specifications. (#864).
Fix warnings caused by outdated imports from obspec_utils (#863). By Tom Nicholas.
Allow ZarrParser to work from inside a running event loop (e.g. inside a Jupyter Notebook) (#900) By Julius Busecke.
Fix Lithops executor to allow use of functools.partial, and update get_executor function to ensure ProcessPoolExecutor uses "forkserver" mode on platforms that default to "fork" (#899). By Chuck Daniels.
Fix ZarrParser not using the store-relative path when the zarr store is nested inside the object store root (#913). By Tom Nicholas.
Fix ZarrParser not correctly parsing scalar variables from v2 native zarr stores (#936). By Julius Busecke
Fix dmrpp error handling (#880). By Luis López.
Fix error when running with Zarr-Python 3.1.0 (#868). By Rajat Shinde.
Fix coordinate name issue (#924). By UserNobody14.
Fix ZarrParser to use public attribute instead of private one (#916). By Max Jones.
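
For the executor fix above, the following is a simplified sketch of selecting a non-fork start method for a `ProcessPoolExecutor` using only the standard library. The function names are hypothetical, and unlike the behaviour described above this sketch prefers "forkserver" wherever it is available rather than only on platforms that default to "fork".

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def preferred_context():
    """Return a multiprocessing context that avoids the 'fork' start method."""
    methods = multiprocessing.get_all_start_methods()
    # "forkserver" is unavailable on some platforms (e.g. Windows); fall back to "spawn".
    name = "forkserver" if "forkserver" in methods else "spawn"
    return multiprocessing.get_context(name)


def make_process_pool(max_workers=None):
    """Build a ProcessPoolExecutor that uses the preferred start method."""
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=preferred_context())
```

Callables submitted to such a pool must be picklable; a `functools.partial` wrapping a module-level function satisfies that.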

Documentation¶

Added FAQ answer comparing the Kerchunk and Icechunk serialization formats. (#818). By Tom Nicholas.
FAQ answer on "why still write native zarr?" (#918). By Tom Nicholas.
Updated FAQ regarding virtualizing existing Zarr V2 data (#893). By Tom Nicholas.
R2 docs (#937). By Tom Nicholas.

Internal changes¶

Inlined virtualizarr.writers.icechunk.generate_chunk_key in virtualizarr.writers.icechunk.write_manifest_virtual_refs, and deleted the original function. (#873). By Vladislav Wohlrath.
Skip unnecessary re-validation of already-validated paths during manifest concatenation (#910). By Tom Nicholas.
Completely rewrote the ZarrParser to use numpy string arrays for efficiency (#927). By Tom Nicholas.
Testing across all supported python versions (#932). By Julius Busecke
Compile regular expressions for improved performance (#909). By Chuck Daniels.

v2.4.0 (24th January 2026)¶

This release moves the ObjectStoreRegistry to a separate package obspec_utils, and provides a way to customize how files are read, which can easily allow open_virtual_dataset to run over ~5x faster.

New Features¶

Added reader_factory parameter to HDFParser to allow customizing how files are read (#844). By Max Jones.

Breaking changes¶

Move ObjectStoreRegistry and Reader functionality to obspec_utils (#844). By Max Jones.
ObjectStoreRegistry has moved from virtualizarr.registry to obspec_utils.registry. The old import path still works but emits a DeprecationWarning and will be removed in a future release.
ObstoreReader has been removed from virtualizarr.utils. This should not break users' code, as it was not part of the public/documented API. See obspec_utils for public file handlers.
Added obspec_utils>=0.7.0 as a required dependency. This package provides the ObjectStoreRegistry that was previously part of VirtualiZarr.
Minimum required version of obstore is now 0.7.0 (previously 0.5.1). This was the first release to implement obspec protocols.

Documentation¶

Added example of virtualizing GOES using caching and request splitting (#855). By Max Jones.
Updated kerchunk comparison in FAQ (#856). By Tom Nicholas.

v2.3.0 (20th January 2026)¶

New Features¶

Implement open_virtual_datatree. (#838). By Max Jones.
Set supports_consolidated_metadata property on ManifestStore to False. (#809). By Julia Signell.

Internal changes¶

Remove the undocumented, non-functional wrapper of Kerchunk's TIFF parser. (#849). By Max Jones.

v2.2.1 (17th November 2025)¶

Bug fixes¶

Allow storing scalar arrays under 'c' key. (#836). By Max Jones
Improve ManifestStore.list_dir for arrays and nested groups. (#837) By Max Jones

v2.2.0 (12th November 2025)¶

New Features¶

Allow nested-groups inside ManifestStore and ManifestGroup objects and update HDFParser to be able to create nested zarr.Group objects. (#790). By Ilan Gold
ZarrParser now handles Zarr V2 and V3 array parsing. (#822). By Neil Schroeder
Add Virtual TIFF as an optional dependency for TIFF parsing. (#810) By Max Jones

Breaking changes¶

Bug fixes¶

ZarrParser no longer uses ZARR_DEFAULT_FILL_VALUE lookup to infer missing fill_value. (#812). By Raphael Hagen.
Return None for Zarr V2/consolidated metadata requests. (#827). By Max Jones
Raise informative error on Zarr V2 parsing with Zarr-Python<3.1.3 (#829). By Max Jones.
Revert "Remove unnecessary dtype conversion in icechunk writer" (#805). By Tom Nicholas.

Documentation¶

Internal changes¶

v2.1.2 (3rd September 2025)¶

Patch release with minor bug fixes for the DMRPParser and Icechunk writing behavior.

Bug fixes¶

Enable DMRPParser to process scalar, dimensionless variables that lack chunks. (#666). By Miguel Jimenez-Urias.
Enable DMRPParser to parse flattened dmrpp metadata reference files, which contain container attributes. (#581). By Miguel Jimenez-Urias.
Support dtypes without an endianness (#787). By Justus Magin.

Internal changes¶

Change default Icechunk writing behavior to not validate or write "empty" chunks (#791). By Sean Harkins.

v2.1.1 (14th August 2025)¶

Extremely minor release to ensure compatibility with the soon-to-be released version of xarray (likely named v2025.07.2).

Bug fixes¶

Adjust for minor upcoming change in private xarray API xarray.structure.combine._nested_combine. (#779). By Tom Nicholas.

v2.1.0 (14th August 2025)¶

This release fixes a number of important bugs that could silently lead to referenced data being read back incorrectly. In particular, note that writing virtual chunks to Icechunk now requires that all virtual chunk containers are set correctly by default. It also unpins our dependency on xarray, so that VirtualiZarr is compatible with the latest released version of Xarray. Please upgrade!

New Features¶

Expose validate_containers kwarg in .to_icechunk, allowing it to be set to False (#567, #774). By Tom Nicholas.

Breaking changes¶

Writing to Icechunk now requires that virtual chunk containers are set correctly for all virtual references by default. (#774). This change is needed because otherwise it can lead to situations in which attempting to read data back returns fill values instead of real data, silently! (See #763) By Tom Nicholas.
Update minimum required version of Icechunk to v1.1.2 (#774). By Tom Nicholas.
Unpin dependency on xarray, by adjusting our tests to pass despite minor changes to the bytes of netCDF files written between versions of xarray (#774). By Max Jones and Tom Nicholas.

Bug fixes¶

Fixed bug where VirtualiZarr was incorrectly failing to raise if virtual chunk containers with correct prefixes were not set for every virtual reference (#774). By Tom Nicholas.
Fix handling of big-endian data in Icechunk by making sure that non-default zarr serializers are included in the zarr array metadata (#766). By Max Jones.
Fix handling of big-endian data in Kerchunk references (#769). By Max Jones.

Documentation¶

Updated Icechunk examples now that virtual chunk containers are required by default (#774). By Tom Nicholas.

Internal changes¶

extract_codecs function inside convert_to_codec_pipeline now raises if it encounters a codec which does not inherit from the correct zarr.abc.codec base classes. (#775). By Tom Nicholas.

v2.0.1 (30th July 2025)¶

Minor release to ensure compatibility with incoming changes to Icechunk.

Bug fixes¶

Fixed bug caused by writing empty virtual chunks to Icechunk (#745). By Tom Nicholas.
Rewrote the internals of ManifestArray.__getitem__ to ensure it actually obeys the array API standard under myriad edge cases (#734). By Tom Nicholas.

Documentation¶

Added recommendation to use icechunk.Repository.save_config() to persist icechunk.VirtualChunkContainers (#746). By Tom Nicholas.

v2.0.0 (21st July 2025)¶

New Features¶

Added a pluggable system of "parsers" for generating virtual references from different filetypes. These follow the virtualizarr.parsers.typing.Parser typing protocol, and return ManifestStore objects wrapping obstore stores. (#498, #601)
Added a Zarr parser that allows opening Zarr V3 stores as virtual datasets. (#271) By Raphael Hagen.
Added ManifestStore for loading data from ManifestArrays (#490). By Max Jones.
Added ManifestStore.to_virtual_dataset() method (#522). By Tom Nicholas.
Added open_virtual_mfdataset function (#345, #349). By Tom Nicholas.
Added datatree_to_icechunk function for writing an xarray.DataTree to an Icechunk store (#244). By Chuck Daniels.
Added a .vz custom accessor to xarray.DataTree, exposing the method xarray.DataTree.vz.to_icechunk() for writing an xarray.DataTree to an Icechunk store (#244). By Chuck Daniels.
Added a warning if you attempt to write an entirely non-virtual dataset to a virtual references format (#657). By Tom Nicholas.
Support big-endian data via zarr-python 3.0.9 and zarr v3's new data types system (#618, #677). By Max Jones and Tom Nicholas.
Added a V1 -> V2 usage migration guide (#637). By Raphael Hagen.

Breaking changes¶

As virtualizarr.open_virtual_dataset now uses parsers, its API has changed (#601). See the migration guide for more details.
The recommended virtualizarr Xarray accessor name is vz rather than virtualize.
Which variables are loadable by default has changed. The behaviour is now to make loadable by default the same variables which xarray.open_dataset would create indexes for: i.e. one-dimensional coordinate variables whose name matches the name of their only dimension (also known as "dimension coordinates"). Pandas indexes will also now be created by default for these loadable variables. This is intended to provide a more friendly default, as often you will want these small variables to be loaded (or "inlined", for efficiency of storage in icechunk/kerchunk), and you will also want to have in-memory indexes for these variables (to allow xarray.combine_by_coords to sort using them). The old behaviour is equivalent to passing loadable_variables=[] and indexes={}. (#335, #477) by Tom Nicholas.
Moved ChunkManifest, ManifestArray etc. to be behind a dedicated .manifests namespace. (#620, #624) By Tom Nicholas.
Now by default when writing virtual chunks to Icechunk, the last_updated_time for the chunk will be set to the current time. This helps protect users against reading from stale or overwritten chunks stored in Icechunk, by default. (#436, #480) by Tom Nicholas.
Minimum supported version of Icechunk is now v1.0
Minimum supported version of Zarr is now v3.1.0
Xarray is pinned to v2025.6.0. We expect to loosen the upper bound shortly.
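
The new default for loadable variables described above (dimension coordinates only) amounts to a simple rule, sketched here with plain dicts. The function name and the name-to-dimensions mapping are hypothetical stand-ins and do not reflect how VirtualiZarr inspects a file.

```python
def default_loadable_variables(variable_dims):
    """Pick the dimension coordinates from a mapping of name -> dimension names.

    A dimension coordinate is a one-dimensional variable whose name equals
    the name of its only dimension.
    """
    return [name for name, dims in variable_dims.items() if dims == (name,)]


variables = {
    "time": ("time",),
    "lat": ("lat",),
    "air": ("time", "lat", "lon"),
    "lon": ("lon",),
    "height": (),
}
loadable = default_loadable_variables(variables)
```

Here `air` stays virtual because it is multi-dimensional, and `height` stays virtual because a scalar has no dimension to match.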

Bug fixes¶

Fixed bug causing ManifestArrays to compare as not equal when they were actually identical (#501, #502) By Tom Nicholas.
Fixed bug causing coordinates to be demoted to data variables when writing to Icechunk (#574, #588) By Tom Nicholas.
Removed checks forbidding paths in virtual references without file suffixes (#659) By Tom Nicholas.
Fixed bug when indexing a scalar ManifestArray with an ellipsis (#596, #641) By Max Jones and Tom Nicholas.

Documentation¶

Added more detail to error messages when an indexer of ManifestArray is invalid (#630, #635). By Danny Kaufman.
Added new docs page on how to write a custom parser for bespoke file formats (#452, #580) By Tom Nicholas.
Added new docs page on how to scale VirtualiZarr effectively (#590). By Tom Nicholas.
Documented the new virtualizarr.open_virtual_mfdataset function (#590). By Tom Nicholas.
Added MUR SST virtual and zarr icechunk store generation using lithops example. (#475) by Aimee Barciauskas.
Added FAQ answer about what data can be virtualized (#430, #532) By Tom Nicholas.
Switched docs build to use mkdocs-material instead of sphinx (#615) By Max Jones.
Moved examples into a V1/ directory and added notes that the examples use the VirtualiZarr V1 syntax (#644). By Raphael Hagen.

Internal Changes¶

ManifestArrays now internally use zarr.core.metadata.v3.ArrayV3Metadata. This replaces the ZArray class that was previously used to store metadata about manifest arrays. (#429) By Aimee Barciauskas. Notable internal changes:
Make zarr-python a required dependency with a minimum version >=3.0.2.
Specify a minimum numcodecs version of >=0.15.1.
When creating a ManifestArray, the metadata property should be a zarr.core.metadata.v3.ArrayV3Metadata object. There is a helper function create_v3_array_metadata which should be used, as it has some useful defaults and includes convert_to_codec_pipeline (see next bullet).
The function convert_to_codec_pipeline ensures the codec pipeline passed to ArrayV3Metadata has valid codecs in the expected order (ArrayArrayCodecs, ArrayBytesCodec, BytesBytesCodecs) and includes the required ArrayBytesCodec using the default for the data type.
- Note: convert_to_codec_pipeline uses the zarr-python function get_codec_class to convert codec configurations (i.e. dicts with a name and configuration key, see parse_named_configuration) to valid Zarr V3 codec classes.
Parser changes are minimal.
Writer changes:
- Kerchunk uses Zarr version format 2 so we convert ArrayV3Metadata to ArrayV2Metadata using the convert_v3_to_v2_metadata function. This means the to_kerchunk_json function is now a bit more complex because we're converting ArrayV2Metadata filters and compressor to serializable objects.
zarr-python 3.0 does not yet support the big endian data type. This means that FITS and NetCDF-3 are not currently supported (zarr-python issue #2324).
zarr-python 3.0 does not yet support datetime and timedelta data types (zarr-python issue #2616).
The continuous integration workflows and developer environment now use pixi (#407).
Added loadable_variables kwarg to ManifestStore.to_virtual_dataset. (#543) By Tom Nicholas.
Ensure that the KerchunkJSONParser can be used to parse in-memory kerchunk dictionaries using obstore.store.MemoryStore. (#631) By Tom Nicholas.
Move the virtualizarr.translators.kerchunk module to virtualizarr.parsers.kerchunk.translator, to better indicate that it is private. Also refactor the two kerchunk readers into one module. (#633) By Tom Nicholas.

v1.3.2 (3rd Mar 2025)¶

Small release which fixes a problem causing the docs to be out of date, fixes some issues in the tests with unclosed file handles, and increases the performance of writing large numbers of virtual references to Icechunk!

New Features¶

Breaking changes¶

Minimum supported version of Icechunk is now v0.2.4 (#462) By Tom Nicholas.

Deprecations¶

Bug fixes¶

Documentation¶

Internal Changes¶

Updates store.set_virtual_ref to store.set_virtual_refs in write_manifest_virtual_refs (#443) By Raphael Hagen.

v1.3.1 (18th Feb 2025)¶

New Features¶

Examples use new Icechunk syntax

Breaking changes¶

Reading and writing Zarr chunk manifest formats are no longer supported. (#359), (#426). By Raphael Hagen.

Deprecations¶

Bug fixes¶

Documentation¶

Internal Changes¶

v1.3.0 (3rd Feb 2025)¶

This release stabilises our dependencies - you can now use released versions of VirtualiZarr, Kerchunk, and Icechunk all in the same environment!

It also fixes a number of bugs, adds minor features, changes the default reader for HDF/netCDF4 files, and includes refactors to reduce code redundancy with zarr-python v3. You can also choose which sets of dependencies you want at installation time.

New Features¶

Optional dependencies can now be installed in groups via pip. See the installation docs. (#309) By Tom Nicholas.
Added a .nbytes accessor method which displays the bytes needed to hold the virtual references in memory. (#167, #227) By Tom Nicholas.
Upgrade icechunk dependency to >=0.1.0a12. (#406) By Julia Signell.
Sync with Icechunk v0.1.0a8 (#368) By Matthew Iannucci. This also adds support for the to_icechunk method to add timestamps as checksums when writing virtual references to an icechunk store. This is useful for ensuring that virtual references are not stale when reading from an icechunk store, which can happen if the underlying data has changed since the virtual references were written.
Add group=None keyword-only parameter to the VirtualiZarrDatasetAccessor.to_icechunk method to allow writing to a nested group at a specified group path (rather than defaulting to the root group, when no group is specified). (#341) By Chuck Daniels.
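
The timestamp-as-checksum idea above can be illustrated with a small comparison. This is a conceptual sketch only: `is_stale` is a hypothetical helper, and the exact check performed by Icechunk is not specified in these notes.

```python
from datetime import datetime, timezone


def is_stale(object_last_modified: datetime, last_updated_at: datetime) -> bool:
    """True if the referenced object changed after the reference was written."""
    return object_last_modified > last_updated_at


written_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
unchanged = datetime(2024, 12, 1, tzinfo=timezone.utc)
overwritten = datetime(2025, 2, 1, tzinfo=timezone.utc)
```

A reference recorded at `written_at` remains trustworthy for the `unchanged` object but not for the `overwritten` one.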

Breaking changes¶

Passing group=None (the default) to open_virtual_dataset for a file with multiple groups no longer raises an error, instead it gives you the root group. This new behaviour is more consistent with xarray.open_dataset. (#336, #338) By Tom Nicholas.
Indexes are now created by default for any loadable one-dimensional coordinate variables. Also a warning is no longer thrown when indexes=None is passed to open_virtual_dataset, and the recommendations in the docs updated to match. This also means that xarray.combine_by_coords will now work when the necessary dimension coordinates are specified in loadable_variables. (#18, #357, #358) By Tom Nicholas.
The append_dim and last_updated_at parameters of the VirtualiZarrDatasetAccessor.to_icechunk method are now keyword-only parameters, rather than positional or keyword. This change is breaking only where arguments for these parameters are currently given positionally. (#341) By Chuck Daniels.
The default backend for netCDF4 and HDF5 is now the custom HDFVirtualBackend replacing the previous default which was a wrapper around the kerchunk backend. (#374, #395) By Julia Signell.
Optional dependency on kerchunk is now the newly-released v0.2.8. This release of kerchunk is compatible with zarr-python v3.0.0, which means a released version of kerchunk can now be used with both VirtualiZarr and Icechunk. (#392, #406, #412) By Julia Signell and Tom Nicholas.
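
The keyword-only change above affects only positional callers. The stand-in function below mimics that parameter layout to show the difference; it is hypothetical and not the real accessor method, whose full signature is not reproduced here.

```python
def to_icechunk(store, *, group=None, append_dim=None, last_updated_at=None):
    """Stand-in with keyword-only options after the bare `*` (not the real method)."""
    return {
        "store": store,
        "group": group,
        "append_dim": append_dim,
        "last_updated_at": last_updated_at,
    }


# Keyword arguments keep working.
ok = to_icechunk("my-store", append_dim="time")

# Passing the same option positionally now fails with TypeError.
try:
    to_icechunk("my-store", None, "time")
    positional_allowed = True
except TypeError:
    positional_allowed = False
```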

Deprecations¶

Bug fixes¶

Fix bug preventing generating references for the root group of a file when a subgroup exists. (#336, #338) By Tom Nicholas.
Fix bug in HDF reader where dimension names of dimensions in a subgroup would be incorrect. (#364, #366) By Tom Nicholas.
Fix bug in dmrpp reader so _FillValue is included in variables' encodings. (#369) By Aimee Barciauskas.
Fix bug passing arguments to FITS reader, and test it on Hubble Space Telescope data. (#363) By Tom Nicholas.

Documentation¶

Change intro text in readme and docs landing page to be clearer, less about the relationship to Kerchunk, and more about why you would want virtual datasets in the first place. (#337) By Tom Nicholas.

Internal Changes¶

Add netCDF3 test. (#397) By Tom Nicholas.

v1.2.0 (5th Dec 2024)¶

This release brings a stricter internal model for manifest paths, support for appending to existing icechunk stores, an experimental non-kerchunk-based HDF5 reader, handling of nested groups in DMR++ files, as well as many other bugfixes and documentation improvements.

New Features¶

Add a virtual_backend_kwargs keyword argument to file readers and to open_virtual_dataset, to allow reader-specific options to be passed down. (#315) By Tom Nicholas.
Added append functionality to to_icechunk (#272) By Aimee Barciauskas.

Breaking changes¶

Minimum required version of Xarray is now v2024.10.0. (#284) By Tom Nicholas.
Minimum required version of Icechunk is now v0.1.1. (#419) By Tom Nicholas.
Minimum required version of Kerchunk is now v0.2.8. (#406) By Julia Signell.
Opening kerchunk-formatted references from disk which contain relative paths now requires passing the fs_root keyword argument via virtual_backend_kwargs. (#243) By Tom Nicholas.

Deprecations¶

Bug fixes¶

Handle root and nested groups with dmrpp backend (#265) By Ayush Nag.
Fixed bug with writing of dimension_names into zarr metadata. (#286) By Tom Nicholas.
Fixed bug causing CF-compliant variables not to be identified as coordinates (#191) By Ayush Nag.

Documentation¶

FAQ answers on Icechunk compatibility, converting from existing Kerchunk references to Icechunk, and how to add a new reader for a custom file format. (#266) By Tom Nicholas.
Clarify which readers actually currently work in FAQ, and temporarily remove tiff from the auto-detection. (#291, #296) By Tom Nicholas.
Minor improvements to the Contributing Guide. (#298) By Tom Nicholas.
More minor improvements to the Contributing Guide. (#304) By Doug Latornell.
Correct some links to the API. (#325) By Tom Nicholas.
Added links to recorded presentations on VirtualiZarr. (#313) By Tom Nicholas.
Added links to existing example notebooks. (#329, #331) By Tom Nicholas.

Internal Changes¶

Added experimental new HDF file reader which doesn't use kerchunk, accessible by importing virtualizarr.readers.hdf.HDFVirtualBackend. (#87) By Sean Harkins.
Support downstream type checking by adding py.typed marker file. (#306) By Max Jones.
File paths in chunk manifests are now always stored as absolute URIs. (#243) By Tom Nicholas.

v1.1.0 (22nd Oct 2024)¶

New Features¶

Can open kerchunk reference files with open_virtual_dataset. (#251, #186) By Raphael Hagen & Kristen Thyng.
Adds defaults for open_virtual_dataset_from_v3_store. (#234) By Raphael Hagen.
New group option on open_virtual_dataset enables extracting specific HDF Groups. (#165) By Scott Henderson.
Adds decode_times to open_virtual_dataset (#232) By Raphael Hagen.
Add parser for the OPeNDAP DMR++ XML format and integration with open_virtual_dataset (#113) By Ayush Nag.
Load scalar variables by default. (#205) By Gustavo Hidalgo.
Support empty files (#260) By Justus Magin.
Can write virtual datasets to Icechunk stores using virtualize.to_icechunk (#256) By Matt Iannucci.

Breaking changes¶

Serialize valid ZarrV3 metadata and require full compressor numcodec config (for #193) By Gustavo Hidalgo.
VirtualiZarr's ZArray, ChunkEntry, and Codec no longer subclass pydantic.BaseModel (#210)
ZArray's __init__ signature has changed to match zarr.Array's (#210)

Deprecations¶

Deprecates cftime_variables in open_virtual_dataset in favor of decode_times. (#232) By Raphael Hagen.

Bug fixes¶

Exclude empty chunks during ChunkDict construction. (#198) By Gustavo Hidalgo.
Fixed regression in fill_value handling for datetime dtypes making virtual Zarr stores unreadable (#206) By Timothy Hodson

Documentation¶

Adds virtualizarr + coiled serverless example notebook (#223) By Raphael Hagen.

Internal Changes¶

Refactored internal structure significantly to split up everything to do with reading references from that to do with writing references. (#229) (#231) By Tom Nicholas.
Refactored readers to consider every filetype as a separate reader, all standardized to present the same open_virtual_dataset interface internally. (#261) By Tom Nicholas.

v1.0.0 (9^th July 2024)¶

This release marks VirtualiZarr as mostly feature-complete, in the sense of achieving feature parity with kerchunk's logic for combining datasets, providing an easier way to manipulate kerchunk references in memory and generate kerchunk reference files on disk.

Future VirtualiZarr development will focus on generalizing and upstreaming useful concepts into the Zarr specification, the Zarr-Python library, Xarray, and possibly some new packages. See the roadmap in the documentation for details.

New Features¶

Now successfully opens both TIFF and FITS files. (#160, #162) By Tom Nicholas.
Added a .rename_paths convenience method to rename paths in a manifest according to a function. (#152) By Tom Nicholas.
New cftime_variables option on open_virtual_dataset enables encoding/decoding time. (#122) By Julia Signell.
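
The idea behind renaming paths "according to a function" can be sketched without the library. Everything below is an illustrative assumption: the manifest is modelled as a plain dict of chunk key to `{"path", "offset", "length"}` entries, and the free function `rename_paths` stands in for the real method, whose exact signature is not shown in these notes.

```python
from typing import Callable

# Hypothetical stand-in for a chunk manifest: chunk key -> reference entry.
Manifest = dict[str, dict[str, object]]


def rename_paths(manifest: Manifest, new: Callable[[str], str]) -> Manifest:
    """Return a copy of `manifest` with `new` applied to every entry's path."""
    return {
        key: {**entry, "path": new(str(entry["path"]))}
        for key, entry in manifest.items()
    }


manifest: Manifest = {
    "0.0": {"path": "file:///local/air.nc", "offset": 100, "length": 50},
    "0.1": {"path": "file:///local/air.nc", "offset": 150, "length": 50},
}


def local_to_s3(path: str) -> str:
    # Example rule: the files were moved from local disk to a bucket.
    return path.replace("file:///local/", "s3://bucket/")


renamed = rename_paths(manifest, local_to_s3)
```

Only the path of each entry changes; byte offsets and lengths are carried over untouched, which is what makes this safe after files are moved without being rewritten.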

Breaking changes¶

Requires numpy 2.0 (for #107). By Tom Nicholas.

Deprecations¶

Bug fixes¶

Ensure that _ARRAY_DIMENSIONS are dropped from variable .attrs. (#150, #152) By Tom Nicholas.
Ensure that .attrs on coordinate variables are preserved during round-tripping. (#155, #154) By Tom Nicholas.
Ensure that non-dimension coordinate variables described via the CF conventions are preserved during round-tripping. (#105, #156) By Tom Nicholas.

Documentation¶

Added example of using cftime_variables to usage docs. (#169, #174) By Tom Nicholas.
Updated the development roadmap in preparation for v1.0. (#164) By Tom Nicholas.
Warn if user passes indexes=None to open_virtual_dataset to indicate that this is not yet fully supported. (#170) By Tom Nicholas.
Clarify that virtual datasets cannot be treated like normal xarray datasets. (#173) By Tom Nicholas.

Internal Changes¶

Refactor ChunkManifest class to store chunk references internally using numpy arrays. (#107) By Tom Nicholas.
Mark tests which require network access so that they are only run when --run-network-tests is passed as a command-line argument to pytest. (#144) By Tom Nicholas.
Determine file format from magic bytes rather than name suffix (#143) By Scott Henderson.
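
Detecting a format from magic bytes rather than a filename suffix can be sketched as follows. This is not VirtualiZarr's implementation: the function name `sniff_format` and the returned labels are hypothetical, the signature table covers only a few formats, and the check looks only at offset 0 (an HDF5 file with a user block carries its signature at a later offset).

```python
# Leading-byte signatures for a few formats (not an exhaustive list).
MAGIC_BYTES = {
    b"\x89HDF\r\n\x1a\n": "hdf5",
    b"CDF\x01": "netcdf3",  # classic format
    b"CDF\x02": "netcdf3",  # 64-bit offset format
    b"II*\x00": "tiff",     # little-endian TIFF
    b"MM\x00*": "tiff",     # big-endian TIFF
    b"SIMPLE": "fits",
}


def sniff_format(header: bytes) -> str:
    """Hypothetical helper: name the format whose signature starts `header`."""
    for magic, name in MAGIC_BYTES.items():
        if header.startswith(magic):
            return name
    raise ValueError(f"unrecognised file signature: {header[:8]!r}")


# Usage: read the first few bytes of a file and classify it.
# with open("some_file", "rb") as f:
#     kind = sniff_format(f.read(8))
```

Reading the header keeps detection correct for files with a missing or misleading extension, for example an HDF5-based netCDF4 file named `.nc`.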

v0.1 (17^th June 2024)¶

v0.1 is the first release of VirtualiZarr!! It contains functionality for using kerchunk to find byte ranges in netCDF files, constructing an xarray.Dataset containing ManifestArray objects, then writing out such a dataset to kerchunk references as either json or parquet.
