virtualizarr.accessor.VirtualiZarrDatasetAccessor.to_kerchunk

VirtualiZarrDatasetAccessor.to_kerchunk(filepath: None, format: Literal['dict']) → KerchunkStoreRefs
VirtualiZarrDatasetAccessor.to_kerchunk(filepath: str | Path, format: Literal['json']) → None
VirtualiZarrDatasetAccessor.to_kerchunk(filepath: str | Path, format: Literal['parquet'], record_size: int = 100000, categorical_threshold: int = 10) → None

Serialize all virtualized arrays in this xarray dataset into the kerchunk references format.

Parameters:
  • filepath – Path of the file to write the kerchunk references to. Not required if format is ‘dict’, in which case the references are returned instead of written.

  • format – Format to serialize the kerchunk references as. If ‘json’ or ‘parquet’ then the ‘filepath’ argument is required.

  • record_size – Number of references to store in each reference file (default 100,000). Bigger values mean fewer read requests but larger memory footprint. Only available when format is ‘parquet’.

  • categorical_threshold – Encode URLs as pandas.Categorical to reduce memory footprint if the ratio of the number of unique URLs to the total number of refs for each variable is greater than or equal to this number (default 10). Only available when format is ‘parquet’.
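
The three call shapes in the signatures above can be sketched as a small wrapper. This is only an illustration: `export_refs` is a hypothetical helper that is not part of VirtualiZarr, and choosing the format from the file suffix is an assumption made for the example. It takes an already-obtained accessor object and uses only the keyword arguments documented on this page.

```python
from pathlib import Path
from typing import Any, Optional, Union


def export_refs(accessor: Any, filepath: Optional[Union[str, Path]] = None) -> Any:
    """Hypothetical helper: choose `format` from `filepath` and call to_kerchunk.

    `accessor` is assumed to be a VirtualiZarrDatasetAccessor (or any object
    exposing a compatible `to_kerchunk` method).
    """
    if filepath is None:
        # No path given: the references come back as an in-memory dict.
        return accessor.to_kerchunk(filepath=None, format="dict")

    suffix = Path(filepath).suffix.lower()
    if suffix == ".json":
        # A path is required for 'json'; the call returns None.
        return accessor.to_kerchunk(filepath=filepath, format="json")
    if suffix in {".parquet", ".parq"}:
        # record_size and categorical_threshold only apply to 'parquet';
        # the values below are the documented defaults, written out explicitly.
        return accessor.to_kerchunk(
            filepath=filepath,
            format="parquet",
            record_size=100_000,
            categorical_threshold=10,
        )
    raise ValueError(f"cannot infer a kerchunk format from suffix {suffix!r}")
```

Calling `export_refs(accessor)` returns the references dict, while `export_refs(accessor, "refs.json")` writes a file and returns None, mirroring the overloads listed at the top of this page.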

References

https://fsspec.github.io/kerchunk/spec.html
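
For orientation, version 1 of the specification linked above describes references as a mapping with a `version` field and a `refs` field, where each entry in `refs` is either inline data (a string) or a `[url, offset, length]` triple pointing into another file. The sketch below builds such a mapping by hand using only the standard library. The bucket URL, variable name, offset, and length are made-up placeholders, and `chunk_entries` is a hypothetical helper, not a VirtualiZarr or kerchunk function.

```python
import json

# Hand-written example of a version 1 kerchunk references mapping.
# All concrete values here are illustrative placeholders.
refs = {
    "version": 1,
    "refs": {
        # Inline entries hold Zarr metadata as JSON-encoded strings.
        ".zgroup": json.dumps({"zarr_format": 2}),
        # Chunk entries point at a byte range: [url, offset, length].
        "air/0.0.0": ["s3://example-bucket/air.nc", 15419, 7738000],
    },
}


def chunk_entries(store: dict) -> dict:
    """Return only the entries that reference byte ranges in other files."""
    return {key: ref for key, ref in store["refs"].items() if isinstance(ref, list)}


# The mapping is plain JSON-serializable data, so it survives a round trip.
serialized = json.dumps(refs)
roundtrip = json.loads(serialized)
```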