To get it, you first see it, and then let it go
In this tutorial 🧑🏫, we’ll step through an Earth Observation 🛰️ data pipeline using torchdata. By the end of this lesson, you should be able to:
Find Cloud-Optimized GeoTIFFs (COGs) from STAC catalogs 🥞
Construct a DataPipe that iteratively reads several COGs in a stream 🌊
Loop through batches of images in a DataPipe with a DataLoader 🏋️
🎉 Getting started#
These are the tools 🛠️ you’ll need.
# Geospatial libraries
import pystac
import planetary_computer
import rioxarray
# Deep Learning libraries
import torch
import torchdata
import zen3geo
Just to make sure we’re on the same page 📃, let’s check that we’ve got compatible versions installed.
print(f"pystac version: {pystac.__version__}")
print(f"planetary-computer version: {planetary_computer.__version__}")
print(f"torch version: {torch.__version__}")
print(f"torchdata version: {torchdata.__version__}")
print(f"zen3geo version: {zen3geo.__version__}")
rioxarray.show_versions()
pystac version: 1.10.0
planetary-computer version: 1.0.0
torch version: 2.2.2+cu121
torchdata version: 0.7.1
zen3geo version: 0.6.2
rioxarray (0.15.3) deps:
rasterio: 1.3.9
xarray: 2024.3.0
GDAL: 3.6.4
GEOS: 3.11.1
PROJ: 9.0.1
PROJ DATA: /home/docs/checkouts/readthedocs.org/user_builds/zen3geo/envs/latest/lib/python3.11/site-packages/rasterio/proj_data
GDAL DATA: /home/docs/checkouts/readthedocs.org/user_builds/zen3geo/envs/latest/lib/python3.11/site-packages/rasterio/gdal_data
Other python deps:
scipy: 1.13.0
pyproj: 3.6.1
python: 3.11.6 (main, Feb 1 2024, 16:47:41) [GCC 11.4.0]
executable: /home/docs/checkouts/readthedocs.org/user_builds/zen3geo/envs/latest/bin/python
machine: Linux-5.19.0-1028-aws-x86_64-with-glibc2.35
0️⃣ Find Cloud-Optimized GeoTIFFs 🗺️#
Let’s get some optical satellite data using STAC! How about Sentinel-2 L2A data over Singapore 🇸🇬?
🔗 Links:
item_url = "https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a/items/S2A_MSIL2A_20220115T032101_R118_T48NUG_20220115T170435"
# Load the individual item metadata and sign the assets
item = pystac.Item.from_file(item_url)
signed_item = planetary_computer.sign(item)
(output abridged)
- type "Feature"
- stac_version "1.0.0"
- id "S2A_MSIL2A_20220115T032101_R118_T48NUG_20220115T170435"
- collection "sentinel-2-l2a"
- datetime "2022-01-15T03:21:01.024000Z"
- platform "Sentinel-2A"
- constellation "Sentinel 2"
- proj:epsg 32648
- s2:mgrs_tile "48NUG"
- eo:cloud_cover 17.352597
- bbox [103.20205689, 0.81602476, 104.18934086, 1.8096362]
- assets, listed by title (each href is a signed URL, omitted here; the GeoTIFF assets have type "image/tiff; application=geotiff; profile=cloud-optimized"):
  - "Aerosol optical thickness (AOT)": 10980 x 10980 pixels, gsd 10.0
  - "Band 1 - Coastal aerosol - 60m": 1830 x 1830 pixels, gsd 60.0
  - "Band 2 - Blue - 10m": 10980 x 10980 pixels, gsd 10.0
  - "Band 3 - Green - 10m": 10980 x 10980 pixels, gsd 10.0
  - "Band 4 - Red - 10m": 10980 x 10980 pixels, gsd 10.0
  - "Band 5 - Vegetation red edge 1 - 20m": 5490 x 5490 pixels, gsd 20.0
  - "Band 6 - Vegetation red edge 2 - 20m": 5490 x 5490 pixels, gsd 20.0
  - "Band 7 - Vegetation red edge 3 - 20m": 5490 x 5490 pixels, gsd 20.0
  - "Band 8 - NIR - 10m": 10980 x 10980 pixels, gsd 10.0
  - "Band 9 - Water vapor - 60m": 1830 x 1830 pixels, gsd 60.0
  - "Band 11 - SWIR (1.6) - 20m": 5490 x 5490 pixels, gsd 20.0
  - "Band 12 - SWIR (2.2) - 20m": 5490 x 5490 pixels, gsd 20.0
  - "Band 8A - Vegetation red edge 4 - 20m": 5490 x 5490 pixels, gsd 20.0
  - "Scene classfication map (SCL)": 5490 x 5490 pixels, gsd 20.0
  - "Water vapour (WVP)": 10980 x 10980 pixels, gsd 10.0
  - "True color image": 10980 x 10980 pixels, gsd 10.0, bands B04 (red), B03 (green), B02 (blue)
  - "Thumbnail" (role: thumbnail)
  - "SAFE manifest", "Granule metadata", "INSPIRE metadata", "Product metadata", "Datastrip metadata" (type "application/xml", role: metadata)
  - "TileJSON with default rendering" (type "application/json", role: tiles)
  - "Rendered preview" (type "image/png", role: overview)
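Since a STAC item is plain JSON, the Cloud-Optimized GeoTIFF assets can be picked out by their media type. The following is a minimal sketch on a hand-written dictionary that mimics the item structure. The hrefs are placeholders, the `product-metadata` key is only illustrative, and the helper name `find_cog_assets` is hypothetical (it is not part of pystac).

```python
# Media type that the item above reports for its GeoTIFF assets
COG_MEDIA_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"

# Hand-written stand-in for a STAC item (hrefs are placeholders, not real URLs)
item_dict = {
    "id": "S2A_MSIL2A_20220115T032101_R118_T48NUG_20220115T170435",
    "assets": {
        "B02": {"href": "https://example.com/B02_10m.tif", "type": COG_MEDIA_TYPE},
        "visual": {"href": "https://example.com/TCI_10m.tif", "type": COG_MEDIA_TYPE},
        "product-metadata": {
            "href": "https://example.com/MTD_MSIL2A.xml",
            "type": "application/xml",
        },
    },
}


def find_cog_assets(item: dict) -> dict:
    """Return {asset key: href} for assets whose media type marks them as COGs."""
    return {
        key: asset["href"]
        for key, asset in item["assets"].items()
        if asset.get("type") == COG_MEDIA_TYPE
    }


cog_assets = find_cog_assets(item_dict)
print(sorted(cog_assets))
```

With a real `pystac.Item`, calling `item.to_dict()` should give a dictionary of this general shape, so the same filter could be applied to it.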
Inspect one of the data assets 🍱#
The Sentinel-2 STAC item contains several assets. These include different 🌈 bands (e.g. ‘B02’, ‘B03’, ‘B04’). Let’s just use the ‘visual’ product for now which includes the RGB bands.
url: str = signed_item.assets["visual"].href
da = rioxarray.open_rasterio(filename=url)
<xarray.DataArray (band: 3, y: 10980, x: 10980)> Size: 362MB
[361681200 values with dtype=uint8]
Coordinates:
  * band         (band) int64 24B 1 2 3
  * x            (x) float64 88kB 3e+05 3e+05 3e+05 ... 4.098e+05 4.098e+05
  * y            (y) float64 88kB 2e+05 2e+05 2e+05 ... 9.026e+04 9.024e+04
    spatial_ref  int64 8B 0
Attributes:
    AREA_OR_POINT:  Area
    _FillValue:     0
    scale_factor:   1.0
    add_offset:     0.0
This is what the Sentinel-2 image looks like over Singapore on 15 Jan 2022.
1️⃣ Construct DataPipe 📡#
A torch DataPipe is a way of composing data (rather than inheriting data). Yes, I don’t know what it really means either, so here’s some extra reading.
🔖 References:
Create an Iterable 📏#
Start by wrapping a list of URLs to the Cloud-Optimized GeoTIFF files. We only have 1 item so we’ll use [url], but if you have more, you can do [url1, url2, url3], etc. Pass this iterable list into torchdata.datapipes.iter.IterableWrapper:
dp = torchdata.datapipes.iter.IterableWrapper(iterable=[url])
The dp variable is the DataPipe! Now to apply some more transformations/functions on it.
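Conceptually, wrapping a list this way just gives an object that yields its items one by one each time it is looped over. A rough pure-Python analogue follows. `ListWrapper` is a hypothetical stand-in for illustration, not how torchdata implements `IterableWrapper`, and the URLs are placeholders.

```python
from typing import Iterable, Iterator


class ListWrapper:
    """Toy stand-in for an iterable-style DataPipe wrapping an in-memory list."""

    def __init__(self, iterable: Iterable[str]) -> None:
        self.iterable = iterable

    def __iter__(self) -> Iterator[str]:
        # A fresh pass over the items starts every time the pipe is iterated
        yield from self.iterable


urls = ["https://example.com/a.tif", "https://example.com/b.tif"]  # placeholders
toy_dp = ListWrapper(iterable=urls)

first_pass = list(toy_dp)
second_pass = list(toy_dp)  # the pipe can be looped over again
print(first_pass == second_pass)
```

Nothing is downloaded at this stage. The pipe only holds the URLs until a later step reads them.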
Read using RioXarrayReader 🌐#
This is where ☯ zen3geo comes in. We’ll be using its rioxarray-based reader class or, rather, its short alias zen3geo.datapipes.RioXarrayReader. Confusingly, there are two ways of applying RioXarrayReader: a class-based method and a functional method.
# Using class constructors
dp_rioxarray = zen3geo.datapipes.RioXarrayReader(source_datapipe=dp)
# Using functional form (recommended)
dp_rioxarray = dp.read_from_rioxarray()
Note that both ways are equivalent (they produce the same IterDataPipe output), but the latter (functional) form is preferred. See also https://pytorch.org/data/0.4/tutorial.html#registering-datapipes-with-the-functional-api
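Roughly speaking, the two forms are equivalent because the class is registered under a method name on the base DataPipe class. The toy below imitates that idea in plain Python. `BasePipe`, `register`, `Source` and `UpperCaser` are hypothetical names, and torchdata's real mechanism (the `functional_datapipe` decorator) is more involved than this sketch.

```python
class BasePipe:
    """Toy base class; registered pipes become methods on it."""

    def __iter__(self):
        raise NotImplementedError


def register(name):
    """Class decorator that exposes a pipe class as a method of BasePipe."""

    def decorator(cls):
        def method(self, *args, **kwargs):
            # dp.<name>(...) is shorthand for Cls(dp, ...)
            return cls(self, *args, **kwargs)

        setattr(BasePipe, name, method)
        return cls

    return decorator


class Source(BasePipe):
    """Pipe that yields items from an in-memory list."""

    def __init__(self, items):
        self.items = items

    def __iter__(self):
        yield from self.items


@register("to_upper")
class UpperCaser(BasePipe):
    """Pipe that upper-cases every string coming from its source pipe."""

    def __init__(self, source_datapipe):
        self.source_datapipe = source_datapipe

    def __iter__(self):
        for item in self.source_datapipe:
            yield item.upper()


src = Source(["a.tif", "b.tif"])
class_form = list(UpperCaser(source_datapipe=src))  # class constructor
functional_form = list(src.to_upper())  # functional form
print(class_form == functional_form)
```

The functional form reads left to right, which keeps longer chains of steps easier to follow.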
What if you don’t want the whole Sentinel-2 scene at the full 10m resolution?
Since we’re using Cloud-Optimized GeoTIFFs, you could set an overview_level (following https://corteva.github.io/rioxarray/stable/examples/COG.html).
dp_rioxarray_zoom3 = dp.read_from_rioxarray(overview_level=3)
Extra keyword arguments will be handled by rioxarray.open_rasterio() or rasterio.open().
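To get a feel for how much data an overview saves, the arithmetic below assumes that each overview level halves the width and height of the previous one (rounding up), with `overview_level=0` being the first reduced level. That factor-of-two layout is a common convention but an assumption here, since the actual overview factors are stored in each file. `overview_size` is a hypothetical helper.

```python
import math


def overview_size(full_size: int, overview_level: int) -> int:
    """Pixels along one axis, assuming level 0 is a 2x reduction and each
    further level halves the size again (rounded up)."""
    factor = 2 ** (overview_level + 1)
    return math.ceil(full_size / factor)


full = 10980  # width and height of the 10m 'visual' asset
bands = 3  # RGB, one byte (uint8) per value

side = overview_size(full, overview_level=3)
full_values = bands * full * full
overview_values = bands * side * side

print(side)  # 687
print(full_values)  # 361681200, about 362 MB as uint8
print(overview_values)  # 1415907, about 1.4 MB as uint8
```

Under these assumptions, overview level 3 gives a 687 x 687 pixel array, which matches the shape printed when iterating over `dp_rioxarray_zoom3` later in this tutorial.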
Other DataPipe classes/functions can be stacked or joined to this basic GeoTIFF reader. For example, clipping by bounding box or reprojecting to a certain Coordinate Reference System. If you would like to implement this, check out the Contributing Guidelines to get started!
2️⃣ Loop through DataPipe ⚙️#
A DataPipe describes a flow of information. Through a series of steps it goes, as one piece comes in, another might follow.
Basic iteration ♻️#
At the most basic level, you could iterate through the DataPipe like so:
it = iter(dp_rioxarray_zoom3)
dataarray = next(it)
<xarray.DataArray (band: 3, y: 687, x: 687)> Size: 1MB
[1415907 values with dtype=uint8]
Coordinates:
  * band         (band) int64 24B 1 2 3
  * x            (x) float64 5kB 3.001e+05 3.002e+05 ... 4.096e+05 4.097e+05
  * y            (y) float64 5kB 2e+05 1.998e+05 ... 9.048e+04 9.032e+04
    spatial_ref  int64 8B 0
Attributes:
    AREA_OR_POINT:  Area
    _FillValue:     0
    scale_factor:   1.0
    add_offset:     0.0
Or if you’re more familiar with a for-loop, here it is:
for dataarray in dp_rioxarray_zoom3:
    print(dataarray)
    # Run model on this data batch
StreamWrapper<<xarray.DataArray (band: 3, y: 687, x: 687)> Size: 1MB
[1415907 values with dtype=uint8]
Coordinates:
  * band         (band) int64 24B 1 2 3
  * x            (x) float64 5kB 3.001e+05 3.002e+05 ... 4.096e+05 4.097e+05
  * y            (y) float64 5kB 2e+05 1.998e+05 ... 9.048e+04 9.032e+04
    spatial_ref  int64 8B 0
Attributes:
    AREA_OR_POINT:  Area
    _FillValue:     0
    scale_factor:   1.0
    add_offset:     0.0>
Into a DataLoader 🏋️#
For the deep learning folks, you might need one extra step. The xarray.DataArray needs to be converted to a tensor. In the PyTorch world, that can happen via torch.as_tensor().
def fn(da):
    # Convert the DataArray's underlying array into a torch.Tensor
    return torch.as_tensor(da.data)
Using torchdata.datapipes.iter.Mapper (functional name: map), we’ll apply the tensor conversion function to each dataarray in the DataPipe.
dp_tensor = dp_rioxarray_zoom3.map(fn=fn)
Finally, let’s put our DataPipe into a torch.utils.data.DataLoader:
dataloader = torch.utils.data.DataLoader(dataset=dp_tensor)
for batch in dataloader:
    tensor = batch
    print(tensor)
tensor([[[[ 46,  29,  34,  ..., 241, 246, 255],
          [ 83,  73,  66,  ..., 248, 255, 251],
          [ 53,  43,  55,  ..., 246, 247, 243],
          ...,
          [101, 101, 104,  ...,  78, 179,  83],
          [ 99, 103, 105,  ...,  68,  60,  45],
          [ 95, 103, 102,  ...,  50,  34,  42]],

         [[ 58,  24,  44,  ..., 255, 255, 255],
          [ 60,  51,  57,  ..., 255, 255, 254],
          [ 47,  22,  47,  ..., 255, 255, 255],
          ...,
          [110, 111, 114,  ...,  95, 189,  87],
          [110, 113, 113,  ...,  85,  62,  48],
          [108, 112, 112,  ...,  62,  60,  62]],

         [[ 42,  22,  29,  ..., 255, 255, 255],
          [ 43,  41,  39,  ..., 255, 255, 254],
          [ 35,  30,  37,  ..., 255, 255, 255],
          ...,
          [ 82,  82,  83,  ...,  74, 174,  57],
          [ 82,  84,  84,  ...,  57,  46,  28],
          [ 80,  83,  82,  ...,  37,  31,  31]]]], dtype=torch.uint8)
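Note the extra leading dimension in the printed tensor: with its default `batch_size=1`, the DataLoader stacks each sample into a batch of one. A small sketch with a dummy dataset shows the resulting shape. The `OneImage` class and its all-zero image are made up for illustration, so that no download is needed.

```python
import torch
from torch.utils.data import DataLoader, IterableDataset


class OneImage(IterableDataset):
    """Dummy dataset yielding one all-zero image shaped like the overview."""

    def __iter__(self):
        yield torch.zeros((3, 687, 687), dtype=torch.uint8)


dummy_loader = DataLoader(dataset=OneImage())  # batch_size defaults to 1
batch = next(iter(dummy_loader))
print(batch.shape)  # torch.Size([1, 3, 687, 687])
```

The dimensions are (batch, band, y, x). Passing a larger `batch_size` would stack several same-sized images along the first axis.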
And so it begins 🌄
That’s all 🎉! For more information on how to use DataPipes, check out:
If you have any questions 🙋, feel free to ask us anything at weiji14/zen3geo#discussions or visit the PyTorch forums at https://discuss.pytorch.org/c/data/37.