import os
import xarray as xr
Discover EOPF Zarr - Sentinel-2 L2A
Introduction
This tutorial introduces you to the structure of an EOPF Zarr product sample for Sentinel-2 L2A data. We will demonstrate how to access and open a Zarr product sample with xarray, how to visualise the zarr encoding structure, explore embedded information, and retrieve relevant metadata for further processing.
What we will learn
- ⚙️ How to open a .zarr file using xarray
- 🛰️ The general structure of a Sentinel-2 L2A item
- 🔎 How to access metadata that describes the .zarr encoding
Prerequisites
This tutorial uses a re-processed sample dataset from the EOPF Sentinel Zarr Samples Service STAC API that is available for direct access here.
The selected zarr product is a Sentinel-2 L2A tile from the 10th of June 2025:
- File name: S2C_MSIL2A_20250610T103641_N0511_R008_T32UMD_20250610T132001.zarr
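The file name itself already carries some metadata. As a minimal sketch, assuming the product follows the standard Sentinel-2 compact naming convention (platform, product level, sensing start, processing baseline, relative orbit, tile and product discriminator, separated by underscores), it can be split into its parts. The helper name `parse_s2_product_name` is hypothetical and not part of any library.

```python
from datetime import datetime


def parse_s2_product_name(file_name):
    """Split a Sentinel-2 product name into its underscore-separated fields."""
    stem = file_name.removesuffix(".zarr")  # requires Python 3.9+
    platform, product, sensing, baseline, orbit, tile, discriminator = stem.split("_")
    return {
        "platform": platform,
        "product_level": product,
        "sensing_start": datetime.strptime(sensing, "%Y%m%dT%H%M%S"),
        "processing_baseline": baseline,
        "relative_orbit": orbit,
        "tile": tile,
        "product_discriminator": discriminator,
    }


parts = parse_s2_product_name(
    "S2C_MSIL2A_20250610T103641_N0511_R008_T32UMD_20250610T132001.zarr"
)
print(parts["sensing_start"].date(), parts["tile"])
```

The sensing date recovered this way matches the 10th of June 2025 stated above.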
Import libraries
Helper functions
print_gen_structure
This function helps us to retrieve and visualise the names of each of the groups stored inside a zarr. As an output, it prints a general overview of the elements inside the zarr.
def print_gen_structure(node, indent=""):
    print(f"{indent}{node.name}")  # prints the name of the current node
    for child_name, child_node in node.children.items():  # loops over the child nodes to extract their names
        print_gen_structure(child_node, indent + "  ")  # repeats the step for each child, one level deeper
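To see what this recursion produces without opening a remote store, the sketch below runs the same traversal on a small stand-in tree. The `Node` class is hypothetical: it only mimics the `name` and `children` attributes the helper relies on and is not part of xarray. The variant `collect_structure` (also a hypothetical name) returns the lines instead of printing them, so the result can be inspected.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in exposing only `name` and `children`."""
    name: str
    children: dict = field(default_factory=dict)


def collect_structure(node, indent=""):
    """Return one indented line per node, parents before their children."""
    lines = [f"{indent}{node.name}"]
    for child_node in node.children.values():
        lines.extend(collect_structure(child_node, indent + "  "))
    return lines


# Build a tiny tree shaped like one branch of the product
r10m = Node("r10m")
r20m = Node("r20m")
reflectance = Node("reflectance", {"r10m": r10m, "r20m": r20m})
measurements = Node("measurements", {"reflectance": reflectance})
root = Node("root", {"measurements": measurements})

structure_lines = collect_structure(root)
print("\n".join(structure_lines))
```

Each level of nesting adds one step of indentation, which is what makes the group hierarchy readable in the printed overview.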
Open a Zarr Store
In a first step, we use the function open_datatree() from the xarray library to open a Zarr store as a DataTree.
Inside, we need to define the following keyword arguments:
- filename_or_obj: path leading to a zarr store
- engine: 'eopf-zarr', designed for the EOPF zarr by ESA.
- op_mode: extension provided by the xarray-eopf development that allows an analysis or native mode. For more information, visit the xarray-eopf documentation.
- chunks: loads the data with dask using the engine’s preferred chunk size, generally identical to the format’s chunk size
The final print of the DataTree object is commented out, as the display can be quite extensive, showing the entire content within the Zarr. An alternative is to apply a helper function that only displays the higher level structure as shown in the next code cell.
url = 'https://objects.eodc.eu/e05ab01a9d56408d82ac32d69a5aae2a:202506-s02msil2a/10/products/cpm_v256/S2C_MSIL2A_20250610T103641_N0511_R008_T32UMD_20250610T132001.zarr'
s2l2a_zarr_sample = xr.open_datatree(
    url,
    engine="eopf-zarr",  # storage format
    op_mode="native",  # no analysis mode
    chunks={},  # opens the data with the default chunking
)
If we apply the helper function print_gen_structure on the root of the DataTree object, we will get a listing of the tree-like structure of the object. We can see all Zarr groups, such as measurements, quality and conditions, their sub-groups and content.
print("Zarr Sentinel 2 L2A Structure")
print_gen_structure(s2l2a_zarr_sample.root)
print("-" * 30)
Zarr Sentinel 2 L2A Structure
None
  conditions
    geometry
    mask
      detector_footprint
        r10m
        r20m
        r60m
      l1c_classification
        r60m
      l2a_classification
        r20m
        r60m
    meteorology
      cams
      ecmwf
  measurements
    reflectance
      r10m
      r20m
      r60m
  quality
    atmosphere
      r10m
      r20m
      r60m
    l2a_quicklook
      r10m
      r20m
      r60m
    mask
      r10m
      r20m
      r60m
    probability
      r20m
------------------------------
Extract information from Zarr groups
In a next step, we can explore the content of individual Zarr groups. By specifying the name of the group and subgroup and adding it into square brackets, we can extract the content of the relevant group. Let us, for example, extract the content of the subgroup reflectance under measurements.
As a result, it is visible that there are three subgroups of the parent node measurements/reflectance: r10m, r20m and r60m, which hold the DataArrays at the three different spatial resolutions of the Sentinel-2 L2A data.
The xarray.DataTree structure allows the exploration of additional group-related metadata and information. For example, we can find the chunksize of each array and the coordinates.
# Retrieving the reflectance groups:
# s2l2a_zarr_sample["measurements/reflectance"] # Run it yourself for an interactive overview
Extract Zarr metadata on different levels
Through s2l2a_zarr_sample.attrs[] we are able to visualise both the stac_discovery and other_metadata included in the zarr store.
For stac_discovery, for example, we can list the parameters it includes:
# STAC metadata style:
print(list(s2l2a_zarr_sample.attrs["stac_discovery"].keys()))
['assets', 'bbox', 'geometry', 'id', 'links', 'properties', 'stac_extensions', 'stac_version', 'type']
We are also able to retrieve specific information by diving deeper into the stac_discovery metadata, such as:
print('Date of Item Creation: ', s2l2a_zarr_sample.attrs['stac_discovery']['properties']['created'])
print('Item Bounding Box : ', s2l2a_zarr_sample.attrs['stac_discovery']['bbox'])
print('Item EPSG : ', s2l2a_zarr_sample.attrs['stac_discovery']['properties']['proj:epsg'])
print('Sentinel Platform : ', s2l2a_zarr_sample.attrs['stac_discovery']['properties']['platform'])
print('Item Processing Level: ', s2l2a_zarr_sample.attrs['stac_discovery']['properties']['processing:level'])
Date of Item Creation: 2025-06-10T13:20:01+00:00
Item Bounding Box : [9.146276872400831, 52.25344953517325, 7.500940412097549, 53.24953673463324]
Item EPSG : 32632
Sentinel Platform : sentinel-2c
Item Processing Level: L2A
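Chained square-bracket lookups raise a KeyError as soon as one key is missing, which can happen when the same workflow is replicated on other items. A minimal sketch of a more forgiving lookup is shown below. The helper `get_nested` is hypothetical, and `sample_attrs` is a hand-written dictionary that copies a few values from the output above rather than the real attributes of the store.

```python
def get_nested(mapping, *keys, default=None):
    """Walk nested dictionaries key by key, returning `default` if a key is absent."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hand-written stand-in for s2l2a_zarr_sample.attrs (values copied from the output above)
sample_attrs = {
    "stac_discovery": {
        "properties": {
            "created": "2025-06-10T13:20:01+00:00",
            "platform": "sentinel-2c",
            "proj:epsg": 32632,
            "processing:level": "L2A",
        }
    }
}

epsg = get_nested(sample_attrs, "stac_discovery", "properties", "proj:epsg")
# This key is deliberately absent from the stand-in dictionary
cloud_cover = get_nested(sample_attrs, "stac_discovery", "properties", "eo:cloud_cover")
print(epsg, cloud_cover)
```

With the opened sample, the first argument would be `s2l2a_zarr_sample.attrs` instead of the stand-in dictionary.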
And from other_metadata, we are able to retrieve the information specific to the instrument variables.
# Complementing metadata:
print(list(s2l2a_zarr_sample.attrs["other_metadata"].keys()))
['AOT_retrieval_model', 'L0_ancillary_data_quality', 'L0_ephemeris_data_quality', 'NUC_table_ID', 'SWIR_rearrangement_flag', 'UTM_zone_identification', 'absolute_location_assessment_from_AOCS', 'band_description', 'declared_accuracy_of_AOT_model', 'declared_accuracy_of_radiative_transfer_model', 'declared_accuracy_of_water_vapour_model', 'electronic_crosstalk_correction_flag', 'eopf_category', 'geometric_refinement', 'history', 'horizontal_CRS_code', 'horizontal_CRS_name', 'mean_sensing_time', 'mean_sun_azimuth_angle_in_deg_for_all_bands_all_detectors', 'mean_sun_zenith_angle_in_deg_for_all_bands_all_detectors', 'mean_value_of_aerosol_optical_thickness', 'mean_value_of_total_water_vapour_content', 'meteo', 'multispectral_registration_assessment', 'onboard_compression_flag', 'onboard_equalization_flag', 'optical_crosstalk_correction_flag', 'ozone_source', 'ozone_value', 'percentage_of_degraded_MSI_data', 'planimetric_stability_assessment_from_AOCS', 'product_quality_status', 'reflectance_correction_factor_from_the_Sun-Earth_distance_variation_computed_using_the_acquisition_date', 'spectral_band_of_reference']
💪 Now it is your turn
As we are able to retrieve several items from the EOPF Sentinel Zarr Samples Service STAC API, let us try the following:
Task
Go to the Sentinel-2 Level-2A collection and:
- Choose an item of interest.
- Replicate the workflow and explore the item’s metadata. When was it retrieved?
- What are the dimensions?
- What is the detailed location of the item?
Conclusion
This tutorial provides an initial understanding of the zarr structure for a Sentinel-2 L2A product sample. By using the xarray library, we can effectively navigate and inspect the different components within the zarr format, including its metadata and array organisation.
What’s next?
Now that you’ve been introduced to the zarr format, learned its core concepts, and understood the basics of how to explore it, you are prepared for the next step. In the following chapter, we will introduce you to STAC and the EOPF Zarr STAC Catalog. As we go along, we will increasingly transition from theory to practice, providing you with hands-on tutorials for working with EOPF Zarr products.