Processor RasterExtract
Part 1: What is Processor RasterExtract
RasterExtract is a processor that enriches point geometries with the value of the raster cell that each point overlaps.
This processor requires additional libraries that are not included with the BDT jar. Before using it, follow the Setup for Raster guide for your environment.
Edge Cases
If a point lies on a border of a raster cell, the processor always looks to the cell below it.
If a point lies on a corner of a raster cell, the processor always looks to the lower right cell.
If there is no cell below or to the lower right, the point is considered outside of the raster bounds and no value is emitted.
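The edge-case rules above can be sketched in plain Python. This is only an illustration of one reading of those rules, not BDT's implementation: the function name `locate_cell`, its parameters, and the 2x2 grid used below are all hypothetical, and the raster is assumed to be north-up with square cells. Flooring the column offset measured from the left edge and the row offset measured from the top edge sends a point on a border to the lower cell and a point on a corner to the lower right cell.

```python
import math
from typing import Optional, Tuple


def locate_cell(
    x: float,
    y: float,
    x_min: float,
    y_max: float,
    cell_size: float,
    n_rows: int,
    n_cols: int,
) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the cell a point maps to, or None if there is none.

    Hypothetical helper: row 0 is the top row and col 0 is the left column
    of a north-up raster whose upper-left corner is (x_min, y_max).
    """
    # Flooring pushes a point on a vertical border into the cell to its right.
    col = math.floor((x - x_min) / cell_size)
    # Rows grow downward, so flooring pushes a point on a horizontal border
    # into the cell below it.
    row = math.floor((y_max - y) / cell_size)
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return row, col
    # No cell below or to the lower right: treat the point as out of bounds.
    return None


# Hypothetical 2x2 raster covering x in [0, 2] and y in [0, 2], cell size 1.
grid = dict(x_min=0.0, y_max=2.0, cell_size=1.0, n_rows=2, n_cols=2)

interior = locate_cell(0.5, 1.5, **grid)       # strictly inside the top-left cell
on_border = locate_cell(0.5, 1.0, **grid)      # on a horizontal border
on_corner = locate_cell(1.0, 1.0, **grid)      # on the shared center corner
bottom_corner = locate_cell(0.0, 0.0, **grid)  # corner with no cell below it
outside = locate_cell(2.5, 1.5, **grid)        # beyond the right edge
```

In this sketch the border point resolves to the bottom-left cell, the center corner resolves to the bottom-right cell, and the last two points return `None` because no cell lies below or to the lower right of them.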
Part 2: Processor RasterExtract Example
Setup BDT
[1]:
import bdt
bdt.auth("../bdt.lic")
from bdt import processors as P
BDT has been successfully authorized!
Welcome to
___ _ ___ __ ______ __ __ _ __
/ _ ) (_) ___ _ / _ \ ___ _ / /_ ___ _ /_ __/ ___ ___ / / / /__ (_) / /_
/ _ | / / / _ `/ / // // _ `// __// _ `/ / / / _ \/ _ \ / / / '_/ / / / __/
/____/ /_/ \_, / /____/ \_,_/ \__/ \_,_/ /_/ \___/\___//_/ /_/\_\ /_/ \__/
/___/
BDT python version: v3.4.0-develop-12-ge79ea377
BDT jar version: v3.4.0-develop-12-ge79ea377
Input Data
Create a DataFrame of Point geometries with IDs. Ensure the DataFrame has geometry metadata.
[6]:
point_df = spark.createDataFrame([
    (1, 0.0, 0.0),
    (2, 1.0, 1.0),
    (3, 0.5, 0.7),
    (4, 2.5, 1.5),
], schema="ID int, X double, Y double") \
    .selectExpr("ID", "ST_MakePoint(X, Y) AS SHAPE") \
    .withMeta("Point", 4326)
[7]:
point_df.show(truncate=False)
+---+--------------------------------------------------------------------------------------+
|ID |SHAPE |
+---+--------------------------------------------------------------------------------------+
|1 |{[01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00], 0.0, 0.0, 0.0, 0.0}|
|2 |{[01 01 00 00 00 00 00 00 00 00 00 F0 3F 00 00 00 00 00 00 F0 3F], 1.0, 1.0, 1.0, 1.0}|
|3 |{[01 01 00 00 00 00 00 00 00 00 00 E0 3F 66 66 66 66 66 66 E6 3F], 0.5, 0.7, 0.5, 0.7}|
|4 |{[01 01 00 00 00 00 00 00 00 00 00 04 40 00 00 00 00 00 00 F8 3F], 2.5, 1.5, 2.5, 1.5}|
+---+--------------------------------------------------------------------------------------+
Running Processor RasterExtract
Let’s run Processor RasterExtract using the same simple.tif pictured above.
[8]:
out_df = P.raster_extract(point_df, "simple.tif", miid_field="ID", shape_field="SHAPE")
[10]:
out_df.show()
+---+-----------------+
| ID| value|
+---+-----------------+
| 2|6.481858253479004|
| 3|2.502182722091675|
+---+-----------------+
Point 1 was not emitted since it was on a raster cell corner and no cell was found to its lower right.
Point 4 was not emitted since it was outside of the raster bounds.
The resulting dataframe can be joined back to the original dataframe to enrich the point geometries with the raster cell values.
[11]:
out_df.join(point_df, "ID").show(truncate=False)
+---+-----------------+--------------------------------------------------------------------------------------+
|ID |value |SHAPE |
+---+-----------------+--------------------------------------------------------------------------------------+
|2 |6.481858253479004|{[01 01 00 00 00 00 00 00 00 00 00 F0 3F 00 00 00 00 00 00 F0 3F], 1.0, 1.0, 1.0, 1.0}|
|3 |2.502182722091675|{[01 01 00 00 00 00 00 00 00 00 00 E0 3F 66 66 66 66 66 66 E6 3F], 0.5, 0.7, 0.5, 0.7}|
+---+-----------------+--------------------------------------------------------------------------------------+