Access EOPF Sentinel Zarr using Rarr

Introduction

This notebook demonstrates how to retrieve remotely stored Zarr data using the Rarr package in R. We will explore how to read and visualize Zarr data (zarrays) and their metadata. For subsequent analysis and visualization we turn the zarray into stars objects which are more suitable for spatial data.

Tip

This notebook has a sibling, which demonstrates how to access the same data using the GDAL Zarr driver. You can find it here.

What we will learn

  • ✏️ How to edit URLs of Zarr stores to make them readable for GDAL
  • 🔎 Which read-functions and arguments to use in stars and terra
  • đźš§ Current limitations of these packages

Prerequisites

To follow this notebook, you will need to load a STAC item in Zarr format from the EOPF Zarr STAC Catalog. If you are new to STAC inside the R environment, we suggest to follow the Access the EOPF Zarr STAC API with R tutorial. The item we are using for this example mirrors the one used in the Sentinel-2 L1C MSI Zarr Product Exploration notebook which is hosted on EODC’s object storage and is accessible under the URL below.

zarr_url = "https://objects.eodc.eu/e05ab01a9d56408d82ac32d69a5aae2a:sample-data/tutorial_data/cpm_v253/S2B_MSIL1C_20250113T103309_N0511_R108_T32TLQ_20250113T122458.zarr"

Import packages

library(Rarr)      # reading; BiocManager::install("Rarr")
library(stars)     # spatial objects
library(dplyr)     # table operations
library(jsonlite)  # parsing metadata
library(mapview)   # interactive maps

Metadata exploration with Rarr

Important

In this notebook we show the current capabilities and limitations of Rarr in terms of reading Zarr array(s) and their metadata. Successful attempts are marked with a âś…, failed attempts with a â›”.

Using the Rarr::zarr_overview function we can get meta data for all assets in the remote Zarr store as a table. Here list all arrays and their data type, compression, dimensions, etc.

zarr_content = zarr_overview(zarr_url, as_data_frame = T) |> 
  mutate(var = sub(zarr_url, "", path)) |> 
  select(var, everything(), -path) 

filter(zarr_content, grepl("reflectance", var))
                                  var data_type endianness compressor
1  /measurements/reflectance/r10m/b02    uint16     little      blosc
2  /measurements/reflectance/r10m/b03    uint16     little      blosc
3  /measurements/reflectance/r10m/b04    uint16     little      blosc
4  /measurements/reflectance/r10m/b08    uint16     little      blosc
5    /measurements/reflectance/r10m/x     int64     little      blosc
6    /measurements/reflectance/r10m/y     int64     little      blosc
7  /measurements/reflectance/r20m/b05    uint16     little      blosc
8  /measurements/reflectance/r20m/b06    uint16     little      blosc
9  /measurements/reflectance/r20m/b07    uint16     little      blosc
10 /measurements/reflectance/r20m/b11    uint16     little      blosc
11 /measurements/reflectance/r20m/b12    uint16     little      blosc
12 /measurements/reflectance/r20m/b8a    uint16     little      blosc
13   /measurements/reflectance/r20m/x     int64     little      blosc
14   /measurements/reflectance/r20m/y     int64     little      blosc
15 /measurements/reflectance/r60m/b01    uint16     little      blosc
16 /measurements/reflectance/r60m/b09    uint16     little      blosc
17 /measurements/reflectance/r60m/b10    uint16     little      blosc
18   /measurements/reflectance/r60m/x     int64     little      blosc
19   /measurements/reflectance/r60m/y     int64     little      blosc
            dim  chunk_dim nchunks
1  10980, 10980 1830, 1830    6, 6
2  10980, 10980 1830, 1830    6, 6
3  10980, 10980 1830, 1830    6, 6
4  10980, 10980 1830, 1830    6, 6
5         10980      10980       1
6         10980      10980       1
7    5490, 5490   915, 915    6, 6
8    5490, 5490   915, 915    6, 6
9    5490, 5490   915, 915    6, 6
10   5490, 5490   915, 915    6, 6
11   5490, 5490   915, 915    6, 6
12   5490, 5490   915, 915    6, 6
13         5490       5490       1
14         5490       5490       1
15   1830, 1830   305, 305    6, 6
16   1830, 1830   305, 305    6, 6
17   1830, 1830   305, 305    6, 6
18         1830       1830       1
19         1830       1830       1

We can see that all Sentinel-2 bands are present, but split into directories (or groups) based on spatial resolution (10, 20, 60 meters), which have different dimensions (number and size of pixels).

                                                   var  data_type endianness
1                           /conditions/geometry/angle unicode224     little
2                            /conditions/geometry/band  unicode96     little
3                        /conditions/geometry/detector      int64     little
4                 /conditions/geometry/mean_sun_angles    float64     little
5   /conditions/geometry/mean_viewing_incidence_angles    float64     little
6                      /conditions/geometry/sun_angles    float64     little
7        /conditions/geometry/viewing_incidence_angles    float64     little
8                               /conditions/geometry/x      int64     little
9                               /conditions/geometry/y      int64     little
10        /conditions/mask/detector_footprint/r10m/b02      uint8       <NA>
11        /conditions/mask/detector_footprint/r10m/b03      uint8       <NA>
12        /conditions/mask/detector_footprint/r10m/b04      uint8       <NA>
13        /conditions/mask/detector_footprint/r10m/b08      uint8       <NA>
14          /conditions/mask/detector_footprint/r10m/x      int64     little
15          /conditions/mask/detector_footprint/r10m/y      int64     little
16        /conditions/mask/detector_footprint/r20m/b05      uint8       <NA>
17        /conditions/mask/detector_footprint/r20m/b06      uint8       <NA>
18        /conditions/mask/detector_footprint/r20m/b07      uint8       <NA>
19        /conditions/mask/detector_footprint/r20m/b11      uint8       <NA>
20        /conditions/mask/detector_footprint/r20m/b12      uint8       <NA>
21        /conditions/mask/detector_footprint/r20m/b8a      uint8       <NA>
22          /conditions/mask/detector_footprint/r20m/x      int64     little
23          /conditions/mask/detector_footprint/r20m/y      int64     little
24        /conditions/mask/detector_footprint/r60m/b01      uint8       <NA>
25        /conditions/mask/detector_footprint/r60m/b09      uint8       <NA>
26        /conditions/mask/detector_footprint/r60m/b10      uint8       <NA>
27          /conditions/mask/detector_footprint/r60m/x      int64     little
28          /conditions/mask/detector_footprint/r60m/y      int64     little
29        /conditions/mask/l1c_classification/r60m/b00      uint8       <NA>
30          /conditions/mask/l1c_classification/r60m/x      int64     little
31          /conditions/mask/l1c_classification/r60m/y      int64     little
32                /conditions/meteorology/cams/aod1240    float32     little
33                 /conditions/meteorology/cams/aod469    float32     little
34                 /conditions/meteorology/cams/aod550    float32     little
35                 /conditions/meteorology/cams/aod670    float32     little
36                 /conditions/meteorology/cams/aod865    float32     little
37               /conditions/meteorology/cams/bcaod550    float32     little
38               /conditions/meteorology/cams/duaod550    float32     little
39          /conditions/meteorology/cams/isobaricInhPa    float64     little
40               /conditions/meteorology/cams/latitude    float64     little
41              /conditions/meteorology/cams/longitude    float64     little
42                 /conditions/meteorology/cams/number      int64     little
43               /conditions/meteorology/cams/omaod550    float32     little
44               /conditions/meteorology/cams/ssaod550    float32     little
45                   /conditions/meteorology/cams/step      int64     little
46               /conditions/meteorology/cams/suaod550    float32     little
47                /conditions/meteorology/cams/surface    float64     little
48                   /conditions/meteorology/cams/time      int64     little
49             /conditions/meteorology/cams/valid_time      int64     little
50                      /conditions/meteorology/cams/z    float32     little
51         /conditions/meteorology/ecmwf/isobaricInhPa    float64     little
52              /conditions/meteorology/ecmwf/latitude    float64     little
53             /conditions/meteorology/ecmwf/longitude    float64     little
54                   /conditions/meteorology/ecmwf/msl    float32     little
55                /conditions/meteorology/ecmwf/number      int64     little
56                     /conditions/meteorology/ecmwf/r    float32     little
57                  /conditions/meteorology/ecmwf/step      int64     little
58               /conditions/meteorology/ecmwf/surface    float64     little
59                  /conditions/meteorology/ecmwf/tco3    float32     little
60                  /conditions/meteorology/ecmwf/tcwv    float32     little
61                  /conditions/meteorology/ecmwf/time      int64     little
62                   /conditions/meteorology/ecmwf/u10    float32     little
63                   /conditions/meteorology/ecmwf/v10    float32     little
64            /conditions/meteorology/ecmwf/valid_time      int64     little
65                  /measurements/reflectance/r10m/b02     uint16     little
66                  /measurements/reflectance/r10m/b03     uint16     little
67                  /measurements/reflectance/r10m/b04     uint16     little
68                  /measurements/reflectance/r10m/b08     uint16     little
69                    /measurements/reflectance/r10m/x      int64     little
70                    /measurements/reflectance/r10m/y      int64     little
71                  /measurements/reflectance/r20m/b05     uint16     little
72                  /measurements/reflectance/r20m/b06     uint16     little
73                  /measurements/reflectance/r20m/b07     uint16     little
74                  /measurements/reflectance/r20m/b11     uint16     little
75                  /measurements/reflectance/r20m/b12     uint16     little
76                  /measurements/reflectance/r20m/b8a     uint16     little
77                    /measurements/reflectance/r20m/x      int64     little
78                    /measurements/reflectance/r20m/y      int64     little
79                  /measurements/reflectance/r60m/b01     uint16     little
80                  /measurements/reflectance/r60m/b09     uint16     little
81                  /measurements/reflectance/r60m/b10     uint16     little
82                    /measurements/reflectance/r60m/x      int64     little
83                    /measurements/reflectance/r60m/y      int64     little
84                    /quality/l1c_quicklook/r10m/band      int64     little
85                     /quality/l1c_quicklook/r10m/tci      uint8       <NA>
86                       /quality/l1c_quicklook/r10m/x      int64     little
87                       /quality/l1c_quicklook/r10m/y      int64     little
88                              /quality/mask/r10m/b02      uint8       <NA>
89                              /quality/mask/r10m/b03      uint8       <NA>
90                              /quality/mask/r10m/b04      uint8       <NA>
91                              /quality/mask/r10m/b08      uint8       <NA>
92                                /quality/mask/r10m/x      int64     little
93                                /quality/mask/r10m/y      int64     little
94                              /quality/mask/r20m/b05      uint8       <NA>
95                              /quality/mask/r20m/b06      uint8       <NA>
96                              /quality/mask/r20m/b07      uint8       <NA>
97                              /quality/mask/r20m/b11      uint8       <NA>
98                              /quality/mask/r20m/b12      uint8       <NA>
99                              /quality/mask/r20m/b8a      uint8       <NA>
100                               /quality/mask/r20m/x      int64     little
101                               /quality/mask/r20m/y      int64     little
102                             /quality/mask/r60m/b01      uint8       <NA>
103                             /quality/mask/r60m/b09      uint8       <NA>
104                             /quality/mask/r60m/b10      uint8       <NA>
105                               /quality/mask/r60m/x      int64     little
106                               /quality/mask/r60m/y      int64     little
    compressor          dim    chunk_dim      nchunks
1        blosc            2            2            1
2        blosc           13           13            1
3        blosc            7            7            1
4        blosc            2            2            1
5        blosc        13, 2        13, 2         1, 1
6        blosc    2, 23, 23    2, 23, 23      1, 1, 1
7        blosc 13, 7, 2.... 7, 4, 2,.... 2, 2, 1,....
8        blosc           23           23            1
9        blosc           23           23            1
10       blosc 10980, 10980   1830, 1830         6, 6
11       blosc 10980, 10980   1830, 1830         6, 6
12       blosc 10980, 10980   1830, 1830         6, 6
13       blosc 10980, 10980   1830, 1830         6, 6
14       blosc        10980        10980            1
15       blosc        10980        10980            1
16       blosc   5490, 5490     915, 915         6, 6
17       blosc   5490, 5490     915, 915         6, 6
18       blosc   5490, 5490     915, 915         6, 6
19       blosc   5490, 5490     915, 915         6, 6
20       blosc   5490, 5490     915, 915         6, 6
21       blosc   5490, 5490     915, 915         6, 6
22       blosc         5490         5490            1
23       blosc         5490         5490            1
24       blosc   1830, 1830     305, 305         6, 6
25       blosc   1830, 1830     305, 305         6, 6
26       blosc   1830, 1830     305, 305         6, 6
27       blosc         1830         1830            1
28       blosc         1830         1830            1
29       blosc   1830, 1830     305, 305         6, 6
30       blosc         1830         1830            1
31       blosc         1830         1830            1
32       blosc         9, 9         9, 9         1, 1
33       blosc         9, 9         9, 9         1, 1
34       blosc         9, 9         9, 9         1, 1
35       blosc         9, 9         9, 9         1, 1
36       blosc         9, 9         9, 9         1, 1
37       blosc         9, 9         9, 9         1, 1
38       blosc         9, 9         9, 9         1, 1
39        <NA>                                       
40       blosc            9            9            1
41       blosc            9            9            1
42        <NA>                                       
43       blosc         9, 9         9, 9         1, 1
44       blosc         9, 9         9, 9         1, 1
45        <NA>                                       
46       blosc         9, 9         9, 9         1, 1
47        <NA>                                       
48        <NA>                                       
49        <NA>                                       
50       blosc         9, 9         9, 9         1, 1
51        <NA>                                       
52       blosc            9            9            1
53       blosc            9            9            1
54       blosc         9, 9         9, 9         1, 1
55        <NA>                                       
56       blosc         9, 9         9, 9         1, 1
57        <NA>                                       
58        <NA>                                       
59       blosc         9, 9         9, 9         1, 1
60       blosc         9, 9         9, 9         1, 1
61        <NA>                                       
62       blosc         9, 9         9, 9         1, 1
63       blosc         9, 9         9, 9         1, 1
64        <NA>                                       
65       blosc 10980, 10980   1830, 1830         6, 6
66       blosc 10980, 10980   1830, 1830         6, 6
67       blosc 10980, 10980   1830, 1830         6, 6
68       blosc 10980, 10980   1830, 1830         6, 6
69       blosc        10980        10980            1
70       blosc        10980        10980            1
71       blosc   5490, 5490     915, 915         6, 6
72       blosc   5490, 5490     915, 915         6, 6
73       blosc   5490, 5490     915, 915         6, 6
74       blosc   5490, 5490     915, 915         6, 6
75       blosc   5490, 5490     915, 915         6, 6
76       blosc   5490, 5490     915, 915         6, 6
77       blosc         5490         5490            1
78       blosc         5490         5490            1
79       blosc   1830, 1830     305, 305         6, 6
80       blosc   1830, 1830     305, 305         6, 6
81       blosc   1830, 1830     305, 305         6, 6
82       blosc         1830         1830            1
83       blosc         1830         1830            1
84       blosc            3            3            1
85       blosc 3, 10980.... 1, 1830,....      3, 6, 6
86       blosc        10980        10980            1
87       blosc        10980        10980            1
88       blosc 10980, 10980   1830, 1830         6, 6
89       blosc 10980, 10980   1830, 1830         6, 6
90       blosc 10980, 10980   1830, 1830         6, 6
91       blosc 10980, 10980   1830, 1830         6, 6
92       blosc        10980        10980            1
93       blosc        10980        10980            1
94       blosc   5490, 5490     915, 915         6, 6
95       blosc   5490, 5490     915, 915         6, 6
96       blosc   5490, 5490     915, 915         6, 6
97       blosc   5490, 5490     915, 915         6, 6
98       blosc   5490, 5490     915, 915         6, 6
99       blosc   5490, 5490     915, 915         6, 6
100      blosc         5490         5490            1
101      blosc         5490         5490            1
102      blosc   1830, 1830     305, 305         6, 6
103      blosc   1830, 1830     305, 305         6, 6
104      blosc   1830, 1830     305, 305         6, 6
105      blosc         1830         1830            1
106      blosc         1830         1830            1

Now that we have explored the content, let’s get some related metadata. Rarr provides the read_zarr_attributes function for this purpose, which looks for the .zattrs file inside the Zarr store.

â›” Unfortunately we were not successful when applying it to our remote store.

read_zarr_attributes(zarr_url)
Error:
! No file that could contain attributes (either `.zattrs` for v2 or `zarr.json` for v3) was found in the path.
read_zarr_attributes(file.path(zarr_url, ".zattrs"))
Error:
! No file that could contain attributes (either `.zattrs` for v2 or `zarr.json` for v3) was found in the path.

âś… As a simple workaround we can read the JSON directly:

zattrs = read_json(file.path(zarr_url, ".zattrs"))
names(zattrs)
[1] "other_metadata"  "other_metadata4" "stac_discovery" 
                                                   var  data_type endianness
1                           /conditions/geometry/angle unicode224     little
2                            /conditions/geometry/band  unicode96     little
3                        /conditions/geometry/detector      int64     little
4                 /conditions/geometry/mean_sun_angles    float64     little
5   /conditions/geometry/mean_viewing_incidence_angles    float64     little
6                      /conditions/geometry/sun_angles    float64     little
7        /conditions/geometry/viewing_incidence_angles    float64     little
8                               /conditions/geometry/x      int64     little
9                               /conditions/geometry/y      int64     little
10        /conditions/mask/detector_footprint/r10m/b02      uint8       <NA>
11        /conditions/mask/detector_footprint/r10m/b03      uint8       <NA>
12        /conditions/mask/detector_footprint/r10m/b04      uint8       <NA>
13        /conditions/mask/detector_footprint/r10m/b08      uint8       <NA>
14          /conditions/mask/detector_footprint/r10m/x      int64     little
15          /conditions/mask/detector_footprint/r10m/y      int64     little
16        /conditions/mask/detector_footprint/r20m/b05      uint8       <NA>
17        /conditions/mask/detector_footprint/r20m/b06      uint8       <NA>
18        /conditions/mask/detector_footprint/r20m/b07      uint8       <NA>
19        /conditions/mask/detector_footprint/r20m/b11      uint8       <NA>
20        /conditions/mask/detector_footprint/r20m/b12      uint8       <NA>
21        /conditions/mask/detector_footprint/r20m/b8a      uint8       <NA>
22          /conditions/mask/detector_footprint/r20m/x      int64     little
23          /conditions/mask/detector_footprint/r20m/y      int64     little
24        /conditions/mask/detector_footprint/r60m/b01      uint8       <NA>
25        /conditions/mask/detector_footprint/r60m/b09      uint8       <NA>
26        /conditions/mask/detector_footprint/r60m/b10      uint8       <NA>
27          /conditions/mask/detector_footprint/r60m/x      int64     little
28          /conditions/mask/detector_footprint/r60m/y      int64     little
29        /conditions/mask/l1c_classification/r60m/b00      uint8       <NA>
30          /conditions/mask/l1c_classification/r60m/x      int64     little
31          /conditions/mask/l1c_classification/r60m/y      int64     little
32                /conditions/meteorology/cams/aod1240    float32     little
33                 /conditions/meteorology/cams/aod469    float32     little
34                 /conditions/meteorology/cams/aod550    float32     little
35                 /conditions/meteorology/cams/aod670    float32     little
36                 /conditions/meteorology/cams/aod865    float32     little
37               /conditions/meteorology/cams/bcaod550    float32     little
38               /conditions/meteorology/cams/duaod550    float32     little
39          /conditions/meteorology/cams/isobaricInhPa    float64     little
40               /conditions/meteorology/cams/latitude    float64     little
41              /conditions/meteorology/cams/longitude    float64     little
42                 /conditions/meteorology/cams/number      int64     little
43               /conditions/meteorology/cams/omaod550    float32     little
44               /conditions/meteorology/cams/ssaod550    float32     little
45                   /conditions/meteorology/cams/step      int64     little
46               /conditions/meteorology/cams/suaod550    float32     little
47                /conditions/meteorology/cams/surface    float64     little
48                   /conditions/meteorology/cams/time      int64     little
49             /conditions/meteorology/cams/valid_time      int64     little
50                      /conditions/meteorology/cams/z    float32     little
51         /conditions/meteorology/ecmwf/isobaricInhPa    float64     little
52              /conditions/meteorology/ecmwf/latitude    float64     little
53             /conditions/meteorology/ecmwf/longitude    float64     little
54                   /conditions/meteorology/ecmwf/msl    float32     little
55                /conditions/meteorology/ecmwf/number      int64     little
56                     /conditions/meteorology/ecmwf/r    float32     little
57                  /conditions/meteorology/ecmwf/step      int64     little
58               /conditions/meteorology/ecmwf/surface    float64     little
59                  /conditions/meteorology/ecmwf/tco3    float32     little
60                  /conditions/meteorology/ecmwf/tcwv    float32     little
61                  /conditions/meteorology/ecmwf/time      int64     little
62                   /conditions/meteorology/ecmwf/u10    float32     little
63                   /conditions/meteorology/ecmwf/v10    float32     little
64            /conditions/meteorology/ecmwf/valid_time      int64     little
65                  /measurements/reflectance/r10m/b02     uint16     little
66                  /measurements/reflectance/r10m/b03     uint16     little
67                  /measurements/reflectance/r10m/b04     uint16     little
68                  /measurements/reflectance/r10m/b08     uint16     little
69                    /measurements/reflectance/r10m/x      int64     little
70                    /measurements/reflectance/r10m/y      int64     little
71                  /measurements/reflectance/r20m/b05     uint16     little
72                  /measurements/reflectance/r20m/b06     uint16     little
73                  /measurements/reflectance/r20m/b07     uint16     little
74                  /measurements/reflectance/r20m/b11     uint16     little
75                  /measurements/reflectance/r20m/b12     uint16     little
76                  /measurements/reflectance/r20m/b8a     uint16     little
77                    /measurements/reflectance/r20m/x      int64     little
78                    /measurements/reflectance/r20m/y      int64     little
79                  /measurements/reflectance/r60m/b01     uint16     little
80                  /measurements/reflectance/r60m/b09     uint16     little
81                  /measurements/reflectance/r60m/b10     uint16     little
82                    /measurements/reflectance/r60m/x      int64     little
83                    /measurements/reflectance/r60m/y      int64     little
84                    /quality/l1c_quicklook/r10m/band      int64     little
85                     /quality/l1c_quicklook/r10m/tci      uint8       <NA>
86                       /quality/l1c_quicklook/r10m/x      int64     little
87                       /quality/l1c_quicklook/r10m/y      int64     little
88                              /quality/mask/r10m/b02      uint8       <NA>
89                              /quality/mask/r10m/b03      uint8       <NA>
90                              /quality/mask/r10m/b04      uint8       <NA>
91                              /quality/mask/r10m/b08      uint8       <NA>
92                                /quality/mask/r10m/x      int64     little
93                                /quality/mask/r10m/y      int64     little
94                              /quality/mask/r20m/b05      uint8       <NA>
95                              /quality/mask/r20m/b06      uint8       <NA>
96                              /quality/mask/r20m/b07      uint8       <NA>
97                              /quality/mask/r20m/b11      uint8       <NA>
98                              /quality/mask/r20m/b12      uint8       <NA>
99                              /quality/mask/r20m/b8a      uint8       <NA>
100                               /quality/mask/r20m/x      int64     little
101                               /quality/mask/r20m/y      int64     little
102                             /quality/mask/r60m/b01      uint8       <NA>
103                             /quality/mask/r60m/b09      uint8       <NA>
104                             /quality/mask/r60m/b10      uint8       <NA>
105                               /quality/mask/r60m/x      int64     little
106                               /quality/mask/r60m/y      int64     little
    compressor          dim    chunk_dim      nchunks
1        blosc            2            2            1
2        blosc           13           13            1
3        blosc            7            7            1
4        blosc            2            2            1
5        blosc        13, 2        13, 2         1, 1
6        blosc    2, 23, 23    2, 23, 23      1, 1, 1
7        blosc 13, 7, 2.... 7, 4, 2,.... 2, 2, 1,....
8        blosc           23           23            1
9        blosc           23           23            1
10       blosc 10980, 10980   1830, 1830         6, 6
11       blosc 10980, 10980   1830, 1830         6, 6
12       blosc 10980, 10980   1830, 1830         6, 6
13       blosc 10980, 10980   1830, 1830         6, 6
14       blosc        10980        10980            1
15       blosc        10980        10980            1
16       blosc   5490, 5490     915, 915         6, 6
17       blosc   5490, 5490     915, 915         6, 6
18       blosc   5490, 5490     915, 915         6, 6
19       blosc   5490, 5490     915, 915         6, 6
20       blosc   5490, 5490     915, 915         6, 6
21       blosc   5490, 5490     915, 915         6, 6
22       blosc         5490         5490            1
23       blosc         5490         5490            1
24       blosc   1830, 1830     305, 305         6, 6
25       blosc   1830, 1830     305, 305         6, 6
26       blosc   1830, 1830     305, 305         6, 6
27       blosc         1830         1830            1
28       blosc         1830         1830            1
29       blosc   1830, 1830     305, 305         6, 6
30       blosc         1830         1830            1
31       blosc         1830         1830            1
32       blosc         9, 9         9, 9         1, 1
33       blosc         9, 9         9, 9         1, 1
34       blosc         9, 9         9, 9         1, 1
35       blosc         9, 9         9, 9         1, 1
36       blosc         9, 9         9, 9         1, 1
37       blosc         9, 9         9, 9         1, 1
38       blosc         9, 9         9, 9         1, 1
39        <NA>                                       
40       blosc            9            9            1
41       blosc            9            9            1
42        <NA>                                       
43       blosc         9, 9         9, 9         1, 1
44       blosc         9, 9         9, 9         1, 1
45        <NA>                                       
46       blosc         9, 9         9, 9         1, 1
47        <NA>                                       
48        <NA>                                       
49        <NA>                                       
50       blosc         9, 9         9, 9         1, 1
51        <NA>                                       
52       blosc            9            9            1
53       blosc            9            9            1
54       blosc         9, 9         9, 9         1, 1
55        <NA>                                       
56       blosc         9, 9         9, 9         1, 1
57        <NA>                                       
58        <NA>                                       
59       blosc         9, 9         9, 9         1, 1
60       blosc         9, 9         9, 9         1, 1
61        <NA>                                       
62       blosc         9, 9         9, 9         1, 1
63       blosc         9, 9         9, 9         1, 1
64        <NA>                                       
65       blosc 10980, 10980   1830, 1830         6, 6
66       blosc 10980, 10980   1830, 1830         6, 6
67       blosc 10980, 10980   1830, 1830         6, 6
68       blosc 10980, 10980   1830, 1830         6, 6
69       blosc        10980        10980            1
70       blosc        10980        10980            1
71       blosc   5490, 5490     915, 915         6, 6
72       blosc   5490, 5490     915, 915         6, 6
73       blosc   5490, 5490     915, 915         6, 6
74       blosc   5490, 5490     915, 915         6, 6
75       blosc   5490, 5490     915, 915         6, 6
76       blosc   5490, 5490     915, 915         6, 6
77       blosc         5490         5490            1
78       blosc         5490         5490            1
79       blosc   1830, 1830     305, 305         6, 6
80       blosc   1830, 1830     305, 305         6, 6
81       blosc   1830, 1830     305, 305         6, 6
82       blosc         1830         1830            1
83       blosc         1830         1830            1
84       blosc            3            3            1
85       blosc 3, 10980.... 1, 1830,....      3, 6, 6
86       blosc        10980        10980            1
87       blosc        10980        10980            1
88       blosc 10980, 10980   1830, 1830         6, 6
89       blosc 10980, 10980   1830, 1830         6, 6
90       blosc 10980, 10980   1830, 1830         6, 6
91       blosc 10980, 10980   1830, 1830         6, 6
92       blosc        10980        10980            1
93       blosc        10980        10980            1
94       blosc   5490, 5490     915, 915         6, 6
95       blosc   5490, 5490     915, 915         6, 6
96       blosc   5490, 5490     915, 915         6, 6
97       blosc   5490, 5490     915, 915         6, 6
98       blosc   5490, 5490     915, 915         6, 6
99       blosc   5490, 5490     915, 915         6, 6
100      blosc         5490         5490            1
101      blosc         5490         5490            1
102      blosc   1830, 1830     305, 305         6, 6
103      blosc   1830, 1830     305, 305         6, 6
104      blosc   1830, 1830     305, 305         6, 6
105      blosc         1830         1830            1
106      blosc         1830         1830            1

The same information can be retrieved with sf::gdal_utils('mdiminfo', ...).

Reading Zarr data with Rarr

Reading zarrays from a remote Zarr store with Rarr and turning them into useful spatial R objects is not as straightforward as with GDAL-based stars and terra functions. We will demonstrate the current capabilities and limitations of Rarr by using a reference object retrieved through the GDAL Zarr driver which is loaded in the background.

# GDAL-based reference object
zz_gdal
stars object with 2 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
       Min. 1st Qu. Median      Mean 3rd Qu.   Max.
b01  0.1299  0.1542 0.1628 0.3541124  0.4304 1.6556
dimension(s):
  from   to offset delta                refsys x/y
x    1 1830  3e+05    60 WGS 84 / UTM zone 32N [x]
y    1 1830  5e+06   -60 WGS 84 / UTM zone 32N [y]

We start with an naive attempt to read the entire Zarr store using Rarr::read_zarr_array.

â›” Rarr, like stars and terra, is not capable of reading multiple arrays from the remote Zarr store simultaneously.

read_zarr_array(zarr_url)
Error:
! The requested `.zarray` metadata file (possibly listed in `.zmetadata`) does not exist.

âś… But if a single band array is selected we are able to access the desired information.

band_variable = "/measurements/reflectance/r60m/b01"

zarray = read_zarr_array(paste0(zarr_url, band_variable))
str(zarray)
 int [1:1830, 1:1830] 9548 8117 7158 5906 7007 7523 5999 6570 7165 6666 ...

Raster geometry

The function returns a matrix (2D array) lacking any spatial metadata (e.g. Coordinate Reference System) or map coordinates. Let’s plot it together with the reference object for comparison.

image(zarray)
image(zz_gdal$b01)

Rarr (raw zarray)

GDAL (directly retrieved stars object)
WarningCaution

The raw zarray has flipped X- and Y-dimensions! But we can access X and Y coordinates for each pixel separately and handle the issue by transposing them later (using t()).

# retrieve X and Y coordinates
zarray_60m_x = paste0(zarr_url, "/measurements/reflectance/r60m/x") |> 
  read_zarr_array()
zarray_60m_y = paste0(zarr_url, "/measurements/reflectance/r60m/y") |> 
  read_zarr_array()

range(zarray_60m_x)
[1] 300030 409770
range(zarray_60m_y)
[1] 4890270 5000010

Regardless of their ordering, are the coordinate values consistent with those retrieved via GDAL?

range(st_get_dimension_values(zz_gdal, "x"))
[1] 300030 409770
range(st_get_dimension_values(zz_gdal, "y"))
[1] 4890270 5000010

âś… Yes!

Reflectance values

Before building the spatial object we should also check if the actual data values are consistent between both methods.

head(t(zarray)[1,])     # flipped zarray
[1] 9548 8117 7158 5906 7007 7523
head(zz_gdal$b01[1,])   # reference arrray
[1] 0.8548 0.7117 0.6158 0.4906 0.6007 0.6523
WarningCaution

By comapring the first few values of each array we can observe 2 important differences:

  • zarray contains integer values while zz_gdal has floating point values scaled with factor 10,000.
  • after rescaling, min and max for example still differ (0.1299 vs. 0.2299 / 1.6556 vs. 1.7556), indicating an offset of 0.1.

đź’ˇ Explanation: The data provider applies scale and offset to store more memory-efficient integer values. A scale factor of 10,000 preserves 4 decimals, while an offset of 0.1 avoids the multiplication of 0. When reading raw data like that we need to apply scale and offset manually to get the original reflectance values.

Scale and offset parameters are important metadata, but currently not accessible via Rarr. At this point we can only find it by digging deep into the metadata retrieved by GDAL:

# GDAL-comaptible VSI URL
vsi_url = paste0("ZARR:/vsicurl/", dplyr::as_label(zarr_url)) 

# get entire metadata
info = sf::gdal_utils("mdiminfo", source = vsi_url, quiet = T) |> fromJSON()

# access scale and offset for 60 meter reflectance bands
(scale = info$groups$measurements$groups$reflectance$groups$r60m$arrays$b01$scale)
[1] 1e-04
(offset = info$groups$measurements$groups$reflectance$groups$r60m$arrays$b01$offset)
[1] -0.1

Constructing a stars object from zarray components

Let’s combine all pieces of the puzzle to form a spatial object:

  • data array
  • scale and offset
  • X and Y coordinates
  • CRS
# read CRS from Zarr attributes
zarr_crs = zattrs$stac_discovery$properties$`proj:epsg`

# band name
var = basename(band_variable)

# read and transpose data matrix 
# apply scale and offset
# convert to stars
zz = st_as_stars(t(zarray) * scale + offset) |>
  st_set_dimensions(1, names = "x", values = zarray_60m_x, is_raster = TRUE) |> 
  st_set_dimensions(2, names = "y", values = zarray_60m_y, is_raster = TRUE) |> 
  setNames(var)

# assign CRS
st_crs(zz) = st_crs(zarr_crs)

# print the object
zz
stars object with 2 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
       Min. 1st Qu. Median      Mean 3rd Qu.   Max.
b01  0.1299  0.1542 0.1628 0.3541124  0.4304 1.6556
dimension(s):
  from   to offset delta                refsys point x/y
x    1 1830  3e+05    60 WGS 84 / UTM zone 32N FALSE [x]
y    1 1830  5e+06   -60 WGS 84 / UTM zone 32N FALSE [y]

We can compare the newly constructed object with reference read via GDAL. They should have identical geometry and values.

identical(zz[[1]] |> str(),
          zz_gdal[[1]] |> str())
 num [1:1830, 1:1830] 0.855 0.753 0.7 0.756 0.718 ...
 num [1:1830, 1:1830] 0.855 0.753 0.7 0.756 0.718 ...
[1] TRUE

Looks good! Let’s visualize both objects.

# plot as maps
plot(zz, axes = TRUE)

Rarr (constructed stars object)
plot(zz_gdal, axes = TRUE)

GDAL (directly retrieved stars object)

The identical maps show Sentinel-2 band 1 (coastal aerosol band) reflectance at 60m spatial resolution for north-western Italy with the Alps in the West and the city of Torino in the North-East.

Wrapping it all up in a function

We can enclose these steps in a single function with some flexibility for variable names and resolutions. This allows us to retrieve any band at any available resolution from the remote Zarr store. Since scale and offset are not accessible via Rarr, we need to provide them as function arguments.

st_read_zarray = function(path, var, res, scale = 1, offset = 0, ...){
  
  # room for checks and input tests:
  # stopifnot valid zarr url / variable / resolution ...

  # get metadata including CRS
  zattrs = jsonlite::read_json(file.path(path, ".zattrs"))
  zarr_crs = zattrs$stac_discovery$properties$`proj:epsg`
  
  # selecte the correct spatial resolution
  # zattrs$stac_discovery$properties$`eopf:resolutions`
  res_char = switch(as.character(res),
    "10" = "r10m",
    "20" = "r20m", 
    "60" = "r60m",
    stop("Resolution must be one of 10, 20, or 60.")
  )

  # get 2D data array and 1D coordinate arrays
  # ...: make use of index arguments of read_zarr_array if desired
  zarray = file.path(path, "measurements/reflectance", res_char, var) |> 
    read_zarr_array(...)
  zarray_x = file.path(path, "measurements/reflectance", res_char, "x") |> 
    read_zarr_array(...)
  zarray_y = file.path(path, "measurements/reflectance", res_char, "y") |> 
    read_zarr_array(...)
    
  # transpose matrix before converting to stars, apply scale and offset
  # assign x and y coordinates and CRS
  z = st_as_stars(t(zarray) * scale + offset) |>
  st_set_dimensions(1, names = "x", values = zarray_60m_x, is_raster = TRUE) |> 
  st_set_dimensions(2, names = "y", values = zarray_60m_y, is_raster = TRUE) |> 
    setNames(var)
  st_crs(z) = st_crs(zarr_crs)
  return(z)
}

Read and combine multiple zarrays

✅ Now we can read multiple bands, combine them to a single multi-band object, and visualize them in a static map…

system.time({
  # scale: 0.0001 = 1 / 10,000
  b01 = st_read_zarray(zarr_url, "b01", 60, scale=0.0001, offset = 0.1)
  b09 = st_read_zarray(zarr_url, "b09", 60, scale=0.0001, offset = 0.1)
  multi_band = c(b01, b09) |> merge(name="band") |> setNames("reflectance")
})
   user  system elapsed 
  1.933   0.070  31.699 
plot(multi_band, axes = TRUE)

… or by generating an interactive map:

mapview(multi_band)

Benchmark

Using Rarr to read remote Zarr stores is just one of several options. R packages stars and terra provide functionality, that uses GDAl’s Zarr driver in the background. To see if there is a benefit to each of the methods, we can compare their execution time and memory allocation with a little benchmark.

# GDAL-compatible VSI URL
vsi_prefix = "ZARR:/vsicurl/"
vsi_url = paste0(vsi_prefix, dplyr::as_label(zarr_url)) 

# functions for reading a single Sentinel-2 band to memory
f_stars = function(){rs = read_stars(paste(vsi_url, band_variable, sep = ":"))}
f_mdim = function(){sm = read_mdim(vsi_url, band_variable, proxy = FALSE) |> 
  setNames(basename(band_variable))}
f_rast = function(){tr = terra::rast(vsi_url, band_variable) |> terra::toMemory()}
f_rarr = function(){rr = st_read_zarray(zarr_url, "b01", 60)}
bench::mark(f_stars(), f_mdim(), f_rast(), f_rarr(), 
  check = F, iterations = 5)
# A tibble: 4 Ă— 12
  expression min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time
  <chr>      <chr> <chr>      <dbl> <chr>        <dbl> <int> <dbl> <chr>     
1 f_stars()  24s   24.2s     0.0409 25.6MB     0.0163      5     2 2.04m     
2 f_mdim()   23.2s 23.3s     0.0427 25.9MB     0.00854     5     1 1.95m     
3 f_rast()   23.8s 24s       0.0407 12.2KB     0           5     0 2.05m     
4 f_rarr()   13.7s 15.6s     0.0647 147.6MB    0.104       5     8 1.29m     
# ℹ 3 more variables: result <list>, time <list>, gc <list>

The benchmark does not reveal a clear winner in terms of speed. However, Rarr seems not to be faster than GDAL but allocates more memory compared to stars and terra.

đź’Ş Now it is your turn

  • đź”­ Task: Supply more arguments to st_read_zarray and pass a subset to read only parts on an array. Visit ?Rarr::read_zarr_array for more details.
  • đź’ľ Task: Try writing zarrays to disk using Rarr::write_zarr_array().
  • đź“– Read the Rarr documentation.

Conclusion

In this notebook we have demonstrated how to access remote Zarr stores in R using the Rarr package. We have seen that it is possible to read specific data arrays, but read_zarr_array lacks the ability of accessing anything other than the raw array. To combine it with a CRS or coordinates we need to leverage a spatial class like stars and manually set the relevant metadata. Further, the benchmark did not reveal a significant benefit regarding processing time or memory allocation over “traditional” GDAL-based data retrieval.

What’s next?

Further tests should evaluate wether data retrieved via Rarr versus GDAL are identical in terms of spatial extent and offset.

Explore this related notebook which demonstrates how to access the same Zarr data using the GDAL Zarr driver via the stars and terra packages.