API Reference#
DataPipes#
Iterable-style DataPipes for geospatial raster and vector data.
Datashader#
DataPipes for datashader.
- zen3geo.datapipes.DatashaderRasterizer#
alias of
zen3geo.datapipes.datashader.DatashaderRasterizerIterDataPipe
- class zen3geo.datapipes.datashader.DatashaderRasterizerIterDataPipe(source_datapipe, vector_datapipe, agg=None, **kwargs)[source]#
Takes vector geopandas.GeoSeries or geopandas.GeoDataFrame geometries and rasterizes them using datashader.Canvas to yield an xarray.DataArray raster with the input geometries aggregated into a fixed-sized grid (functional name: rasterize_with_datashader).
- Parameters
source_datapipe (IterDataPipe[datashader.Canvas]): A DataPipe that contains datashader.Canvas objects with a .crs attribute. This will be the template defining the output raster's spatial extent and x/y range.
vector_datapipe (IterDataPipe[geopandas.GeoDataFrame]): A DataPipe that contains geopandas.GeoSeries or geopandas.GeoDataFrame vector geometries with a .crs attribute.
agg (Optional[datashader.reductions.Reduction]): Reduction operation to compute. Default depends on the input vector type:
  - For points, default is datashader.reductions.count
  - For lines, default is datashader.reductions.any
  - For polygons, default is datashader.reductions.any
For more information, refer to the section on Aggregation under datashader's Pipeline docs.
kwargs (Optional): Extra keyword arguments to pass to the datashader.Canvas class's aggregation methods such as datashader.Canvas.points.
- Yields
raster (xarray.DataArray): An xarray.DataArray object containing the raster data. This raster will have a rioxarray.rioxarray.XRasterBase.crs property and a proper affine transform viewable with rioxarray.rioxarray.XRasterBase.transform().
- Raises
ModuleNotFoundError: If spatialpandas is not installed. Please install it (e.g. via pip install spatialpandas) before using this class.
ValueError: If the length of the vector_datapipe is neither 1 nor equal to the length of the source_datapipe. That is, the ratio of vector:canvas must be 1:N or exactly N:N.
AttributeError: If either the canvas in source_datapipe or the vector geometry in vector_datapipe is missing a .crs attribute. Please set the coordinate reference system (e.g. using canvas.crs = 'EPSG:4326' for the datashader.Canvas input or vector = vector.set_crs(epsg=4326) for the geopandas.GeoSeries or geopandas.GeoDataFrame input) before passing them into the datapipe.
NotImplementedError: If the input vector geometry type to vector_datapipe is not supported, typically when a shapely.geometry.GeometryCollection is used. Supported types include Point, LineString, and Polygon, plus their multipart equivalents MultiPoint, MultiLineString, and MultiPolygon.
- Return type
None
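The length rule behind the ValueError can be illustrated with a small standard-library sketch. The helper name pair_canvas_with_vector is hypothetical and not part of zen3geo. Reusing the single vector for every canvas in the 1:N case is an assumption about what "1:N" means here, and plain strings stand in for the canvas and vector objects.

```python
from itertools import repeat
from typing import Iterator, Sequence, Tuple, TypeVar

C = TypeVar("C")  # stands in for a datashader.Canvas
V = TypeVar("V")  # stands in for a GeoSeries or GeoDataFrame


def pair_canvas_with_vector(
    canvases: Sequence[C], vectors: Sequence[V]
) -> Iterator[Tuple[C, V]]:
    """Hypothetical helper: pair each canvas with a vector (1:N or N:N)."""
    if len(vectors) != 1 and len(vectors) != len(canvases):
        raise ValueError(
            f"Got {len(vectors)} vector(s) for {len(canvases)} canvas(es); "
            "the vector:canvas ratio must be 1:N or N:N."
        )
    if len(vectors) == 1:
        # Assumed 1:N behaviour: reuse the single vector for every canvas
        return zip(canvases, repeat(vectors[0]))
    # N:N case: pair canvases and vectors position by position
    return zip(canvases, vectors)


pairs_broadcast = list(pair_canvas_with_vector(["c0", "c1", "c2"], ["v0"]))
pairs_matched = list(pair_canvas_with_vector(["c0", "c1"], ["v0", "v1"]))
```

In this sketch, two vectors against three canvases fall into neither case and are rejected before any pairing happens.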
Example
>>> import pytest
>>> datashader = pytest.importorskip("datashader")
>>> pyogrio = pytest.importorskip("pyogrio")
>>> spatialpandas = pytest.importorskip("spatialpandas")
...
>>> from torchdata.datapipes.iter import IterableWrapper
>>> from zen3geo.datapipes import DatashaderRasterizer
...
>>> # Read in a vector point data source
>>> geodataframe = pyogrio.read_dataframe(
...     "https://github.com/geopandas/pyogrio/raw/v0.4.0/pyogrio/tests/fixtures/test_gpkg_nulls.gpkg",
...     read_geometry=True,
... )
>>> assert geodataframe.crs == "EPSG:4326"  # longitude/latitude coords
>>> dp_vector = IterableWrapper(iterable=[geodataframe])
...
>>> # Setup blank raster canvas where we will burn vector geometries onto
>>> canvas = datashader.Canvas(
...     plot_width=5,
...     plot_height=6,
...     x_range=(160000.0, 620000.0),
...     y_range=(0.0, 450000.0),
... )
>>> canvas.crs = "EPSG:32631"  # UTM Zone 31N, North of Gulf of Guinea
>>> dp_canvas = IterableWrapper(iterable=[canvas])
...
>>> # Rasterize vector point geometries onto blank canvas
>>> dp_datashader = dp_canvas.rasterize_with_datashader(
...     vector_datapipe=dp_vector
... )
...
>>> # Loop or iterate over the DataPipe stream
>>> it = iter(dp_datashader)
>>> dataarray = next(it)
>>> dataarray
<xarray.DataArray (y: 6, x: 5)>
array([[0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 1, 0, 0, 0],
       [1, 0, 0, 0, 0]], dtype=uint32)
Coordinates:
  * x            (x) float64 2.094e+05 3.083e+05 4.072e+05 5.06e+05 6.049e+05
  * y            (y) float64 4.157e+05 3.47e+05 2.783e+05 ... 1.41e+05 7.237e+04
    spatial_ref  int64 0
...
>>> dataarray.rio.crs
CRS.from_epsg(32631)
>>> dataarray.rio.transform()
Affine(98871.00388807665, 0.0, 160000.0,
       0.0, -68660.4193667199, 450000.0)
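To see how the affine transform relates to the x and y coordinates in the raster, the six coefficients printed by dataarray.rio.transform() in the example can be applied by hand. The sketch below uses only the standard library. The pixel_center helper is hypothetical, and the half-pixel offset assumes the transform origin is the top-left corner of the top-left pixel while the coordinates are pixel centers.

```python
from typing import Tuple

# Coefficients (a, b, c, d, e, f) copied from the Affine(...) output in the example
TRANSFORM = (98871.00388807665, 0.0, 160000.0, 0.0, -68660.4193667199, 450000.0)


def pixel_center(
    col: int, row: int, transform: Tuple[float, ...] = TRANSFORM
) -> Tuple[float, float]:
    """Map a (col, row) pixel index to the x/y coordinate of its center."""
    a, b, c, d, e, f = transform
    # Offset by 0.5 because the transform origin is a pixel corner, not a center
    x = a * (col + 0.5) + b * (row + 0.5) + c
    y = d * (col + 0.5) + e * (row + 0.5) + f
    return x, y


x0, y0 = pixel_center(col=0, row=0)  # top-left pixel
x4, y5 = pixel_center(col=4, row=5)  # bottom-right pixel of the 5 x 6 grid
```

Under these assumptions, the computed centers round to the first and last x and y coordinate values shown in the DataArray repr (about 2.094e+05 and 4.157e+05 for the top-left pixel, 6.049e+05 and 7.237e+04 for the bottom-right one).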
- zen3geo.datapipes.XarrayCanvas#
alias of
zen3geo.datapipes.datashader.XarrayCanvasIterDataPipe
- class zen3geo.datapipes.datashader.XarrayCanvasIterDataPipe(source_datapipe, **kwargs)[source]#
Bases: torch.utils.data.datapipes.datapipe.IterDataPipe[Union[xarray.core.dataarray.DataArray, xarray.core.dataset.Dataset]]
Takes an xarray.DataArray or xarray.Dataset and creates a blank datashader.Canvas based on the spatial extent and coordinates of the input (functional name: canvas_from_xarray).
- Parameters
source_datapipe (IterDataPipe[xarray.DataArray]): A DataPipe that contains xarray.DataArray or xarray.Dataset objects. These data objects need to have both a .rio.x_dim and a .rio.y_dim attribute. Both are present if the original dataset was opened using rioxarray.open_rasterio(), and can otherwise be set manually using rioxarray.rioxarray.XRasterBase.set_spatial_dims().
kwargs (Optional): Extra keyword arguments to pass to datashader.Canvas.
- Yields
canvas (datashader.Canvas): A datashader.Canvas object representing the same spatial extent and x/y coordinates of the input raster grid. This canvas will also have a .crs attribute that captures the original Coordinate Reference System from the input xarray object's rioxarray.rioxarray.XRasterBase.crs property.
- Raises
ModuleNotFoundError: If datashader is not installed. Follow install instructions for datashader before using this class.
- Return type
None
Example
>>> import pytest
>>> import numpy as np
>>> import xarray as xr
>>> datashader = pytest.importorskip("datashader")
...
>>> from torchdata.datapipes.iter import IterableWrapper
>>> from zen3geo.datapipes import XarrayCanvas
...
>>> # Create blank canvas from xarray.DataArray using DataPipe
>>> y = np.arange(0, -3, step=-1)
>>> x = np.arange(0, 6)
>>> dataarray: xr.DataArray = xr.DataArray(
...     data=np.zeros(shape=(1, 3, 6)),
...     coords=dict(band=[1], y=y, x=x),
... )
>>> dataarray = dataarray.rio.set_spatial_dims(x_dim="x", y_dim="y")
>>> dp = IterableWrapper(iterable=[dataarray])
>>> dp_canvas = dp.canvas_from_xarray()
...
>>> # Loop or iterate over the DataPipe stream
>>> it = iter(dp_canvas)
>>> canvas = next(it)
>>> print(canvas.raster(source=dataarray))
<xarray.DataArray (band: 1, y: 3, x: 6)>
array([[[0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.]]])
Coordinates:
  * x        (x) int64 0 1 2 3 4 5
  * y        (y) int64 0 -1 -2
  * band     (band) int64 1
...
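As a rough illustration of what "the same spatial extent and x/y coordinates" could mean for a regular grid, the sketch below derives canvas-style arguments from 1-D coordinate lists. It is not zen3geo's implementation. The canvas_params helper is hypothetical, and it assumes evenly spaced pixel-center coordinates with an extent measured to the outer pixel edges. The dictionary keys mirror the datashader.Canvas keyword names used in the DatashaderRasterizer example.

```python
from typing import Dict, Sequence, Tuple


def _extent(coords: Sequence[float]) -> Tuple[float, float]:
    """Outer-edge extent of evenly spaced pixel-center coordinates."""
    # Assumes at least two coordinates with a constant step between them
    half_pixel = abs(coords[1] - coords[0]) / 2
    return (min(coords) - half_pixel, max(coords) + half_pixel)


def canvas_params(x: Sequence[float], y: Sequence[float]) -> Dict[str, object]:
    """Hypothetical helper: Canvas-style arguments for a regular grid."""
    return {
        "plot_width": len(x),   # one canvas column per x coordinate
        "plot_height": len(y),  # one canvas row per y coordinate
        "x_range": _extent(x),
        "y_range": _extent(y),
    }


# Same coordinates as the example: x = 0..5, y = 0, -1, -2
params = canvas_params(x=[0, 1, 2, 3, 4, 5], y=[0, -1, -2])
```

For the example grid this gives a 6 x 3 canvas spanning x from -0.5 to 5.5 and y from -2.5 to 0.5 under the stated assumptions.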
Pyogrio#
DataPipes for pyogrio.
- zen3geo.datapipes.PyogrioReader#
alias of
zen3geo.datapipes.pyogrio.PyogrioReaderIterDataPipe
- class zen3geo.datapipes.pyogrio.PyogrioReaderIterDataPipe(source_datapipe, **kwargs)[source]#
Bases: torch.utils.data.datapipes.datapipe.IterDataPipe[torch.utils.data.datapipes.utils.common.StreamWrapper]
Takes vector files (e.g. FlatGeoBuf, GeoPackage, GeoJSON) from local disk or URLs (as long as they can be read by pyogrio) and yields geopandas.GeoDataFrame objects (functional name: read_from_pyogrio).
Based on https://github.com/pytorch/data/blob/v0.4.0/torchdata/datapipes/iter/load/iopath.py#L42-L97
- Parameters
source_datapipe (IterDataPipe[str]): A DataPipe that contains filepaths or URL links to vector files such as FlatGeoBuf, GeoPackage, GeoJSON, etc.
kwargs (Optional): Extra keyword arguments to pass to pyogrio.read_dataframe().
- Yields
stream_obj (geopandas.GeoDataFrame): A geopandas.GeoDataFrame object containing the vector data.
- Raises
ModuleNotFoundError: If pyogrio is not installed. See install instructions for pyogrio, and ensure that geopandas is installed too (e.g. via pip install pyogrio[geopandas]) before using this class.
- Return type
None
Example
>>> import pytest
>>> pyogrio = pytest.importorskip("pyogrio")
...
>>> from torchdata.datapipes.iter import IterableWrapper
>>> from zen3geo.datapipes import PyogrioReader
...
>>> # Read in GeoPackage data using DataPipe
>>> file_url: str = "https://github.com/geopandas/pyogrio/raw/v0.4.0/pyogrio/tests/fixtures/test_gpkg_nulls.gpkg"
>>> dp = IterableWrapper(iterable=[file_url])
>>> dp_pyogrio = dp.read_from_pyogrio()
...
>>> # Loop or iterate over the DataPipe stream
>>> it = iter(dp_pyogrio)
>>> geodataframe = next(it)
>>> geodataframe
StreamWrapper<   col_bool  col_int8  ...  col_float64                 geometry
0       1.0       1.0  ...          1.5  POINT (0.00000 0.00000)
1       0.0       2.0  ...          2.5  POINT (1.00000 1.00000)
2       1.0       3.0  ...          3.5  POINT (2.00000 2.00000)
3       NaN       NaN  ...          NaN  POINT (4.00000 4.00000)
[4 rows x 12 columns]>
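The reader DataPipes in this reference share one pattern: take a stream of paths, open each one with a reader function, and forward the extra keyword arguments to that function. The standard-library sketch below illustrates only that pattern. ReaderSketch and fake_read are hypothetical stand-ins, not zen3geo or pyogrio code, and the sketch does not reproduce the StreamWrapper wrapping seen in the example output.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional


class ReaderSketch:
    """Hypothetical stand-in for a reader DataPipe (not zen3geo's code)."""

    def __init__(
        self, source: Iterable[str], opener: Callable[..., Any], **kwargs: Any
    ) -> None:
        self.source = source
        self.opener = opener
        self.kwargs = kwargs  # stored once, forwarded on every read

    def __iter__(self) -> Iterator[Any]:
        for path in self.source:
            # Each path is opened lazily, only when the consumer asks for it
            yield self.opener(path, **self.kwargs)


def fake_read(path: str, columns: Optional[List[str]] = None) -> Dict[str, Any]:
    """Stand-in reader that records what it was called with."""
    return {"path": path, "columns": columns}


reader = ReaderSketch(["a.gpkg", "b.gpkg"], opener=fake_read, columns=["geometry"])
results = list(reader)
```

Every item in results carries the same columns value, showing that keyword arguments given once at construction apply to each file in the stream.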
Rioxarray#
DataPipes for rioxarray.
- zen3geo.datapipes.RioXarrayReader#
alias of
zen3geo.datapipes.rioxarray.RioXarrayReaderIterDataPipe
- class zen3geo.datapipes.rioxarray.RioXarrayReaderIterDataPipe(source_datapipe, **kwargs)[source]#
Bases: torch.utils.data.datapipes.datapipe.IterDataPipe[torch.utils.data.datapipes.utils.common.StreamWrapper]
Takes raster files (e.g. GeoTIFFs) from local disk or URLs (as long as they can be read by rioxarray and/or rasterio) and yields xarray.DataArray objects (functional name: read_from_rioxarray).
Based on https://github.com/pytorch/data/blob/v0.4.0/torchdata/datapipes/iter/load/online.py#L55-L96
- Parameters
source_datapipe (IterDataPipe[str]): A DataPipe that contains filepaths or URL links to raster files such as GeoTIFFs.
kwargs (Optional): Extra keyword arguments to pass to rioxarray.open_rasterio() and/or rasterio.open().
- Yields
stream_obj (xarray.DataArray): An xarray.DataArray object containing the raster data.
- Return type
None
Example
>>> from torchdata.datapipes.iter import IterableWrapper
>>> from zen3geo.datapipes import RioXarrayReader
...
>>> # Read in GeoTIFF data using DataPipe
>>> file_url: str = "https://github.com/GenericMappingTools/gmtserver-admin/raw/master/cache/earth_day_HD.tif"
>>> dp = IterableWrapper(iterable=[file_url])
>>> dp_rioxarray = dp.read_from_rioxarray()
...
>>> # Loop or iterate over the DataPipe stream
>>> it = iter(dp_rioxarray)
>>> dataarray = next(it)
>>> dataarray.encoding["source"]
'https://github.com/GenericMappingTools/gmtserver-admin/raw/master/cache/earth_day_HD.tif'
>>> dataarray
StreamWrapper<<xarray.DataArray (band: 1, y: 960, x: 1920)>
[1843200 values with dtype=uint8]
Coordinates:
  * band         (band) int64 1
  * x            (x) float64 -179.9 -179.7 -179.5 -179.3 ... 179.5 179.7 179.9
  * y            (y) float64 89.91 89.72 89.53 89.34 ... -89.53 -89.72 -89.91
    spatial_ref  int64 0
...
Xbatcher#
DataPipes for xbatcher.
- zen3geo.datapipes.XbatcherSlicer#
alias of
zen3geo.datapipes.xbatcher.XbatcherSlicerIterDataPipe
- class zen3geo.datapipes.xbatcher.XbatcherSlicerIterDataPipe(source_datapipe, input_dims, **kwargs)[source]#
Bases: torch.utils.data.datapipes.datapipe.IterDataPipe[Union[xarray.core.dataarray.DataArray, xarray.core.dataset.Dataset]]
Takes an xarray.DataArray or xarray.Dataset and creates a sliced window view (also known as a chip or tile) of the n-dimensional array (functional name: slice_with_xbatcher).
- Parameters
source_datapipe (IterDataPipe[xarray.DataArray]): A DataPipe that contains xarray.DataArray or xarray.Dataset objects.
input_dims (dict): A dictionary specifying the size of the inputs in each dimension to slice along, e.g. {'lon': 64, 'lat': 64}. These are the dimensions the machine learning library will see. All other dimensions will be stacked into one dimension called batch.
kwargs (Optional): Extra keyword arguments to pass to xbatcher.BatchGenerator().
- Yields
chip (xarray.DataArray): An xarray.DataArray or xarray.Dataset object containing the sliced raster data, with the size/shape defined by the input_dims parameter.
- Raises
ModuleNotFoundError: If xbatcher is not installed. Follow install instructions for xbatcher before using this class.
- Return type
None
Example
>>> import pytest
>>> import numpy as np
>>> import xarray as xr
>>> xbatcher = pytest.importorskip("xbatcher")
...
>>> from torchdata.datapipes.iter import IterableWrapper
>>> from zen3geo.datapipes import XbatcherSlicer
...
>>> # Sliced window view of xarray.DataArray using DataPipe
>>> dataarray: xr.DataArray = xr.DataArray(
...     data=np.ones(shape=(3, 128, 128)),
...     name="foo",
...     dims=["band", "y", "x"]
... )
>>> dp = IterableWrapper(iterable=[dataarray])
>>> dp_xbatcher = dp.slice_with_xbatcher(input_dims={"y": 64, "x": 64})
...
>>> # Loop or iterate over the DataPipe stream
>>> it = iter(dp_xbatcher)
>>> dataarray_chip = next(it)
>>> dataarray_chip
<xarray.Dataset>
Dimensions:  (band: 3, y: 64, x: 64)
Dimensions without coordinates: band, y, x
Data variables:
    foo      (band, y, x) float64 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
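The window arithmetic implied by input_dims can be sketched without xbatcher. The window_slices helper below is hypothetical and uses only the standard library. It assumes non-overlapping windows and drops any partial window at the edge, and it ignores whatever extra options are passed through kwargs, so it shows the idea rather than xbatcher's actual behaviour.

```python
from itertools import product
from typing import Dict, Iterator


def window_slices(
    sizes: Dict[str, int], input_dims: Dict[str, int]
) -> Iterator[Dict[str, slice]]:
    """Yield one {dim: slice} mapping per non-overlapping window."""
    # Start offsets along each sliced dimension; partial windows are dropped
    starts = {
        dim: range(0, sizes[dim] - size + 1, size)
        for dim, size in input_dims.items()
    }
    for combo in product(*starts.values()):
        yield {
            dim: slice(start, start + input_dims[dim])
            for dim, start in zip(starts, combo)
        }


# Same shape as the example: 3 bands of 128 x 128 pixels, cut into 64 x 64 chips
chips = list(window_slices({"band": 3, "y": 128, "x": 128}, {"y": 64, "x": 64}))
```

With these sizes the sketch produces four windows. The band dimension is not listed in input_dims, so it is left unsliced, which matches the (band: 3, y: 64, x: 64) chip in the example.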