iSDM.environment module

A module for all environmental layers functionality.

class iSDM.environment.ClimateLayer(source=None, file_path=None, name_layer=None, **kwargs)

Bases: iSDM.environment.RasterEnvironmentalLayer

class iSDM.environment.DEMLayer(source=None, file_path=None, name_layer=None, **kwargs)

Bases: iSDM.environment.RasterEnvironmentalLayer

class iSDM.environment.EnvironmentalLayer(source=None, file_path=None, name_layer=None, **kwargs)

Bases: object

A generic EnvironmentalLayer class used for subclassing different global-scale environmental data sources.

Variables:
  • source (iSDM.environment.Source) – The source of the global environmental data.
  • name_layer (string) – Description of the layer
get_data()

Needs to be implemented in a subclass.

get_source()
load_data(file_path=None)

Needs to be implemented in a subclass.

save_data(full_name=None, dir_name=None, file_name=None)

Needs to be implemented in a subclass.

set_source(source)
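
The subclassing contract described above can be sketched as follows. This is a minimal stand-in, not the library's code: the Source member EXAMPLE, the InMemoryLayer subclass, and the use of NotImplementedError in the base methods are all assumptions made for illustration.

```python
from enum import Enum


class Source(Enum):
    # Hypothetical member, for illustration only; the real members are not listed here.
    EXAMPLE = 1


class EnvironmentalLayer:
    """Stand-in base class mirroring the documented interface (assumed behaviour)."""

    def __init__(self, source=None, file_path=None, name_layer=None, **kwargs):
        self.source = source
        self.file_path = file_path
        self.name_layer = name_layer

    def get_source(self):
        return self.source

    def set_source(self, source):
        self.source = source

    def load_data(self, file_path=None):
        raise NotImplementedError("Needs to be implemented in a subclass.")

    def get_data(self):
        raise NotImplementedError("Needs to be implemented in a subclass.")


class InMemoryLayer(EnvironmentalLayer):
    """Hypothetical subclass that keeps its data in a plain list."""

    def load_data(self, file_path=None):
        # A real subclass would read from file_path; here we fake some values.
        self._data = [1, 2, 3]
        return self._data

    def get_data(self):
        return self._data
```

A subclass only overrides the data-handling methods; get_source() and set_source() are inherited unchanged.
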
class iSDM.environment.RasterEnvironmentalLayer(source=None, file_path=None, name_layer=None, **kwargs)

Bases: iSDM.environment.EnvironmentalLayer

A class for encapsulating the raster environmental layer functionality. Operations such as reprojecting, overlaying, sampling pseudo-absence pixels, and converting to world map coordinates are implemented as wrappers around the corresponding rasterio/Numpy operations and methods. This class should be used when the expected layer data is in raster format, i.e., a 2-dimensional (multi-band) array of data.

Variables:
  • file_path (string) – Location of the raster file from which the raster map data is loaded.
  • raster_affine (rasterio.transform.Affine) – Affine transformation used in the environmental raster map.
  • resolution (tuple(int, int)) – The resolution of the raster map, as a tuple (height, width) in pixels.
  • raster_reader (rasterio._io.RasterReader) – file reader for the corresponding rasterized data.
close_dataset()

Close the rasterio._io.RasterReader file reader, if open. This releases resources such as memory.

get_data()
Returns:A raster file reader, from which any band data can be read using .read(band_number)
Return type:rasterio._io.RasterReader
load_data(file_path=None)

Loads the raster data from a previously-saved raster file. Provides information about the loaded data, and returns a rasterio file reader handle, which allows you to read individual raster bands.

Parameters:file_path (string) – The full path to the target GeoTIFF raster file (including the directory and filename in one string).
Returns:Rasterio RasterReader file object which can be used to read individual bands from the raster file.
Return type:rasterio._io.RasterReader
pixel_to_world_coordinates(raster_data=None, filter_no_data_value=True, no_data_value=0, band_number=1)

Map the pixel coordinates to world coordinates. The affine transformation matrix is used for this purpose. The convention is to reference the pixel corner. To reference the pixel center instead, we translate each pixel by 50%. The “no value” pixels (cells) can be filtered out.

A dataset’s pixel coordinate system has its origin at the “upper left” (imagine it displayed on your screen). Column index increases to the right, and row index increases downward. The mapping of these coordinates to “world” coordinates in the dataset’s reference system is done with an affine transformation matrix.

Parameters:
  • raster_data (np.ndarray) – The raster data (2-dimensional array) to translate to world coordinates. If not provided, the method tries to load the existing rasterized data of the RasterEnvironmentalLayer.
  • no_data_value (int) – The pixel value depicting non-burned cells. Default is 0.
  • filter_no_data_value (bool) – Whether to filter out the no-data pixel values. Default is true. If set to false, all pixels in a 2-dimensional array will be converted to world coordinates. Typically this option is used to get a “base” map of the coordinates of all pixels in an image (map).
Returns:

A tuple of numpy ndarrays. The first array contains the latitude values for each non-zero cell, the second array contains the longitude values for each non-zero cell.

TODO: document raster-affine

Return type:tuple(np.ndarray, np.ndarray)
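
The pixel-to-world mapping described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name pixel_centers_to_world is hypothetical, the raster is a list of lists rather than a Numpy array, and the affine coefficients are passed as a plain tuple (a, b, c, d, e, f) in the coefficient order used by rasterio's Affine.

```python
def pixel_centers_to_world(raster, affine, no_data_value=0, filter_no_data_value=True):
    """Return (latitudes, longitudes) of pixel centers.

    `affine` is (a, b, c, d, e, f) with
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    a, b, c, d, e, f = affine
    latitudes, longitudes = [], []
    for row, values in enumerate(raster):
        for col, value in enumerate(values):
            if filter_no_data_value and value == no_data_value:
                continue  # skip "no value" cells
            # Shift by half a pixel to reference the center instead of the corner.
            col_c, row_c = col + 0.5, row + 0.5
            longitudes.append(a * col_c + b * row_c + c)
            latitudes.append(d * col_c + e * row_c + f)
    return latitudes, longitudes


# A 2 x 3 raster covering x in [0, 3] and y in [0, 2], 1-unit pixels, origin at upper left.
raster = [[0, 1, 0],
          [1, 0, 1]]
affine = (1.0, 0.0, 0.0, 0.0, -1.0, 2.0)
lats, lons = pixel_centers_to_world(raster, affine)
```

With filter_no_data_value=False the same call returns one coordinate pair per pixel, which corresponds to the “base” map use described above.
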
plot(figsize=(25, 20), band_number=1)

A simple plot of the raster image data. The data should be loaded before calling this method.

Parameters:
  • figsize (tuple) – A tuple containing the (width, height) of the plot, in inches. Default is (25, 20)
  • band_number (int) – The index of the band to use for plotting the raster data.
classmethod plot_world_coordinates(coordinates=None, figsize=(16, 12), projection='merc', facecolor='crimson')

Visually plots coordinates on a Basemap. Basemap supports projections (with coastlines and political boundaries) using matplotlib. The coordinates data must be provided as a tuple of Numpy arrays, one for the x, and one for the y values of the coordinates. First, the data is converted to a pandas.DataFrame with the x and y arrays transposed as decimallatitude and decimallongitude columns. Next, the __geometrize__() method is used to convert the dataframe into a geopandas format (with a geometry column).

Parameters:
  • coordinates (tuple) – A tuple containing Numpy arrays, one for the x, and one for the y values of the coordinates.
  • figsize (tuple) – A tuple containing the (width, height) of the plot, in inches. Default is (16, 12)
  • projection (string) – The projection to use for plotting. Supported values are the projections available in Basemap. Default is ‘merc’ (Mercator).
  • facecolor (string) – Fill color for the geometries. Default is “crimson” (red).
Returns:a map with geometries plotted, zoomed to the total boundaries of the geometry Series (column) of the DataFrame.
polygonize(band_number=1)

Extract shapes from raster features. This is the inverse operation of rasterizing shapes. Uses the Rasterio library (https://mapbox.github.io/rasterio/_modules/rasterio/features.html) for this purpose. The data is loaded into a geopandas GeoDataFrame. GeoDataFrame data structures are pandas DataFrames with added functionality, containing a geometry column for the Shapely geometries. The raster data should be loaded in the layer before calling this method.

Parameters:band_number (int) – The index of the raster band which is to be used as input for extracting geometrical shapes.
Returns:geopandas.GeoDataFrame
read(band_number=1)

Read a particular band from the raster data array.

Parameters:band_number (int) – The index of the band to read.
Returns:A 2-dimensional Numpy array containing the pixel values of that particular band.
reproject(destination_file, source_file=None, resampling=0, **kwargs)

Reprojects the pixels of a source raster map to a destination raster, with a different reference coordinate system and Affine transform. It uses Rasterio calculate_default_transform() to calculate parameters such as the resolution (if not provided), and the destination transform and dimensions.

Parameters:
  • source_file (string) – Full path to the source file containing a raster map.
  • destination_file (string) – Full path to the destination file containing a raster map.
  • resampling (int) – Resampling method to use. Can be one of the following: Resampling.nearest, Resampling.bilinear, Resampling.cubic, Resampling.cubic_spline, Resampling.lanczos, Resampling.average, Resampling.mode.
  • kwargs (dict) – Optional additional arguments passed to the method, to parametrize the reprojection. For example: dst_crs for the target coordinate reference system, resolution for the target resolution, in units of target coordinate reference system.
sample_pseudo_absences(species_raster_data, suitable_habitat=None, bias_grid=None, band_number=1, number_of_pseudopoints=1000)

Samples number_of_pseudopoints points from the RasterEnvironmentalLayer data (raster map), based on a given species raster map which is assumed to contain species presence points (or potential presence). The species_raster_data is used to determine which distinct regions (cell values) of the entire environmental raster map should be considered as potential pseudo-absence sampling regions; in other words, which realms or ecoregions should be taken into account. Optionally, a suitable habitat raster (with binary 0/1 values) can be provided to further limit the sampling area. Finally, presence pixels are removed from this map, and the remaining pixels are used as the base for sampling pseudo-absences. Optionally, a bias grid can be provided to bias the “random” sampling of pseudo-absences. If the number of remaining pixels is smaller than the number of requested pseudo-absence points, all pixels are automatically taken as pseudo-absence points, and no random sampling is done.

Otherwise, number_of_pseudopoints pixel positions (indices) are randomly chosen at once (for speed), rather than randomly sampling one by one until the desired number of pseudo-absences is reached.

Parameters:
  • species_raster_data (np.ndarray) – A raster map containing the species presence pixels. If not provided, the previously loaded one is used, if available; otherwise .load_data() should be called first.
  • suitable_habitat (np.ndarray) – A raster map containing the species suitable habitat. It should contain only values of 0 and 1, with 1s depicting suitable areas and 0s unsuitable ones.
  • bias_grid (np.ndarray) – A raster map containing the sampling bias grid. It should contain integer values depicting a sampling intensity at every pixel location.
  • band_number (int) – The index of the band from the species_raster_data to use as input. Default is 1.
  • number_of_pseudopoints (int) – Number of pseudo-absence points to sample from the raster environmental layer data.
Returns:

A tuple containing two raster maps, one with all potential background pixels chosen to sample from, and second with all the actual sampled pixels.

Return type:

tuple(np.ndarray, np.ndarray)
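
The selection steps described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: it works on lists of lists instead of Numpy arrays, omits the optional bias grid, treats any non-zero species cell as a presence, and the function name and seed parameter are hypothetical.

```python
import random


def sketch_pseudo_absences(env, species, suitable=None, n_points=1000, seed=None):
    """Return (background, sampled) as 0/1 grids (lists of lists)."""
    rows, cols = len(env), len(env[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]

    # Regions (cell values) of the environmental raster that contain presences.
    regions = {env[r][c] for r, c in cells if species[r][c] != 0}

    # Candidates: inside those regions, suitable (if given), and not a presence pixel.
    candidates = [
        (r, c) for r, c in cells
        if env[r][c] in regions
        and (suitable is None or suitable[r][c] == 1)
        and species[r][c] == 0
    ]

    if len(candidates) <= n_points:
        chosen = candidates  # too few pixels left: take them all
    else:
        # Draw all positions in one call rather than one by one.
        chosen = random.Random(seed).sample(candidates, n_points)

    background = [[0] * cols for _ in range(rows)]
    sampled = [[0] * cols for _ in range(rows)]
    for r, c in candidates:
        background[r][c] = 1
    for r, c in chosen:
        sampled[r][c] = 1
    return background, sampled


env = [[1, 1, 2],
       [1, 2, 2],
       [3, 3, 3]]
species = [[1, 0, 0],
           [0, 0, 0],
           [0, 0, 0]]
background, sampled = sketch_pseudo_absences(env, species, n_points=1, seed=0)
```

Here only region 1 contains a presence, so the background consists of the two remaining region-1 pixels, and one of them is sampled.
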

class iSDM.environment.Source

Bases: enum.Enum

Possible sources of global environmental data.

class iSDM.environment.VectorEnvironmentalLayer(source=None, file_path=None, name_layer=None, **kwargs)

Bases: iSDM.environment.EnvironmentalLayer

A class for encapsulating the vector environmental layer functionality, with operations such as rasterizing.

Variables:
  • file_path (string) – Full location of the shapefile containing the data for this layer.
  • data_full (geopandas.GeoDataFrame) – Data frame containing the full data for the environmental layer geometries.
  • raster_file (string) – Full location of the corresponding raster map data for this layer.
  • raster_affine (rasterio.transform.Affine) – Affine transformation used in the corresponding raster map of this layer.
  • raster_reader (rasterio._io.RasterReader) – file reader for the corresponding rasterized data.
get_classifier()
get_data()

Returns the (pre)loaded environmental layer data in a (geo)pandas DataFrame.

Returns:self.data_full
Return type:geopandas.GeoDataFrame or pandas.DataFrame
get_pixel_size()
get_raster_file()
load_data(file_path=None)

Loads the environmental data from the provided file_path shapefile into a geopandas.GeoDataFrame. A GeoDataFrame is a tabular data structure that contains a column called geometry which contains a GeoSeries of Shapely geometries. All other metadata column names are converted to lower case, for consistency.

Parameters:file_path (string) – The full path to the shapefile file (including the directory and filename in one string).
Returns:None
load_raster_data(raster_file=None)

Loads the raster data from a previously-saved raster file. Provides information about the loaded data, and returns a rasterio file reader.

Parameters:raster_file (string) – The full path to the target GeoTIFF raster file (including the directory and filename in one string).
Returns:Rasterio RasterReader file object which can be used to read individual bands from the raster file.
Return type:rasterio._io.RasterReader
rasterize(raster_file=None, pixel_size=None, all_touched=False, no_data_value=0, default_value=1, crs=None, cropped=False, classifier_column=None, *args, **kwargs)

Rasterize (burn) the environment rangemaps (geometrical shapes) into pixels (cells), i.e., a 2-dimensional image array of type numpy ndarray. Uses the Rasterio library for this purpose. All the shapes from the VectorEnvironmentalLayer object data are burned in a single band of the image. Rasterio datasets can generally have one or more bands, or layers. Following the GDAL convention, these are indexed starting with 1.

Parameters:
  • raster_file (string) – The full path to the target GeoTIFF raster file (including the directory and filename in one string).
  • pixel_size (int) – The size of the pixel in degrees, i.e., the resolution to use for rasterizing.
  • all_touched (bool) – If true, all pixels touched by geometries will be burned in. If false, only pixels whose center is within the polygon, or that are selected by Bresenham’s line algorithm, will be burned in.
  • no_data_value (int) – Used as value of the pixels which are not burned in. Default is 0.
  • default_value (int) – Used as value of the pixels which are burned in. Default is 1.
  • crs – The Coordinate Reference System to use. Default is “EPSG:4326”
  • cropped (bool) – If true, the resulting pixel array (image) is cropped to the region borders, which contain the burned pixels (i.e., an envelope within the range). Otherwise, a “global world map” is used, i.e., the boundaries are set to (-180, -90, 180, 90) for the resulting array.
Returns:

Rasterio RasterReader file object which can be used to read individual bands from the raster file.

Return type:

rasterio._io.RasterReader
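
The pixel-center rule used when all_touched is false can be sketched in plain Python. This is only an illustration of the idea: the library delegates the actual burning to Rasterio, and the helper names below are hypothetical. The sketch handles a single polygon given as a list of (x, y) vertices, and does not cover Bresenham line handling or all_touched=True.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def burn_polygon(polygon, bounds, pixel_size, no_data_value=0, default_value=1):
    """Burn one polygon into a grid using the pixel-center rule."""
    west, south, east, north = bounds
    width = round((east - west) / pixel_size)
    height = round((north - south) / pixel_size)
    grid = [[no_data_value] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            # Row 0 is the northernmost row, matching the raster convention.
            x = west + (col + 0.5) * pixel_size
            y = north - (row + 0.5) * pixel_size
            if point_in_polygon(x, y, polygon):
                grid[row][col] = default_value
    return grid


square = [(1, 1), (3, 1), (3, 3), (1, 3)]
grid = burn_polygon(square, bounds=(0, 0, 4, 4), pixel_size=1)
```

With the “global world map” bounds (-180, -90, 180, 90) and a pixel size of 10 degrees, the same function produces a grid of 18 rows by 36 columns.
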

save_data(full_name=None, driver='ESRI Shapefile', overwrite=False)

Saves the current geopandas.GeoDataFrame data in a shapefile. The data is expected to have a ‘geometry’ column, besides other metadata. If the full location and name of the file is not provided, then overwrite should be set to True to overwrite the existing shapefile from which the data was previously loaded.

Parameters:
  • full_name (string) – The full path to the target shapefile (including the directory and filename in one string).
  • driver (string) – The driver to use for storing the geopandas.GeoDataFrame data into a file. Default is “ESRI Shapefile”.
  • overwrite (bool) – Whether to overwrite the shapefile from which the data was previously loaded, if a new full_name is not supplied.
Returns:

None

set_classifier(classifier_column)
set_data(data_frame)

Set the environmental layer data to the contents of data_frame. The data passed must be in a pandas or geopandas DataFrame. Careful, it overwrites the existing data!

Parameters:data_frame (pandas.DataFrame) – The new data.
Returns:None
set_pixel_size(pixel_size)
set_raster_file(raster_file)