Setup for Raster Processing in Windows
The BDT raster processors require additional libraries that are not included with the BDT jar. This guide describes how to pull these libraries into your environment.
For Spark to install the required libraries, an ivySettings.xml file must be provided. The snippet below shows what this file should contain. Save the file to an easily accessible location on your machine.
<ivysettings>
  <settings defaultResolver="chain"/>
  <resolvers>
    <chain name="chain">
      <ibiblio name="osgeo" m2compatible="true" root="https://repo.osgeo.org/repository/release" />
      <ibiblio name="central" m2compatible="true" root="https://repo1.maven.org/maven2/" />
    </chain>
  </resolvers>
</ivysettings>
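As a quick sanity check before launching Spark, the settings file can be written and parsed with Python's standard library. This is a minimal sketch, not part of BDT: the helper names `write_ivy_settings` and `resolver_names` are hypothetical, and the check only confirms that the XML is well-formed and lists both resolvers, not that the repositories are reachable.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# The same settings shown above, kept as a string so they can be written anywhere.
IVY_SETTINGS = """<ivysettings>
  <settings defaultResolver="chain"/>
  <resolvers>
    <chain name="chain">
      <ibiblio name="osgeo" m2compatible="true" root="https://repo.osgeo.org/repository/release" />
      <ibiblio name="central" m2compatible="true" root="https://repo1.maven.org/maven2/" />
    </chain>
  </resolvers>
</ivysettings>
"""


def write_ivy_settings(directory):
    """Write ivySettings.xml into `directory` and return its path (hypothetical helper)."""
    path = Path(directory) / "ivySettings.xml"
    path.write_text(IVY_SETTINGS, encoding="utf-8")
    return path


def resolver_names(path):
    """Parse the settings file and return the resolver names in chain order."""
    root = ET.parse(path).getroot()
    return [r.get("name") for r in root.findall("./resolvers/chain/ibiblio")]


if __name__ == "__main__":
    # A temporary directory is used here; in practice pick a stable location.
    with tempfile.TemporaryDirectory() as tmp:
        settings_path = write_ivy_settings(tmp)
        print(resolver_names(settings_path))
```

If parsing raises `xml.etree.ElementTree.ParseError`, the file was likely damaged when it was copied and should be saved again.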
Use this init script to launch a Jupyter notebook with BDT:
pyspark ^
--master local[*] ^
--driver-java-options "-XX:+UseCompressedOops -Djava.awt.headless=true" ^
--conf spark.executor.extraJavaOptions="-XX:+UseCompressedOops -Djava.awt.headless=true" ^
--conf spark.sql.execution.arrow.pyspark.enabled=true ^
--conf spark.submit.pyFiles="C:\<path>\<to>\<bdt_zip>\" ^
--conf spark.jars="C:\<path>\<to>\<bdt_jar>\" ^
--conf spark.jars.packages="org.geotools:gt-main:24.6,org.geotools:gt-geotiff:24.6,org.geotools:gt-epsg-hsql:24.6" ^
--conf spark.jars.excludes="org.scala-lang:scala-reflect,com.fasterxml.jackson.core:jackson-core" ^
--conf spark.jars.ivySettings="file:///<path-to-ivySettings.xml>"
This init script adds the libraries required for raster processing. Adjust the paths to the jar, zip, and ivySettings files to match their locations on your machine.
Note: The Spark packages option (spark.jars.packages), which this raster processing setup requires, has a known issue on Windows: packages may not work. This may produce null pointer exceptions in the command prompt when a notebook cell is first run. However, the notebook should still run without issues. This issue was not seen when running on a local cluster.
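The path adjustments can also be checked with a short script. The sketch below is an illustration rather than a BDT utility: it turns an absolute Windows path into the `file:///` form used for `spark.jars.ivySettings` and verifies that each entry in `spark.jars.packages` has the `group:artifact:version` shape. The function names and the `C:\bdt\ivySettings.xml` example path are hypothetical.

```python
from pathlib import PureWindowsPath

# Package list copied from the launch script above.
PACKAGES = (
    "org.geotools:gt-main:24.6,"
    "org.geotools:gt-geotiff:24.6,"
    "org.geotools:gt-epsg-hsql:24.6"
)


def to_file_uri(windows_path):
    """Convert an absolute Windows path to a file:/// URI."""
    # PureWindowsPath does no file-system access, so this runs on any OS.
    return PureWindowsPath(windows_path).as_uri()


def split_coordinates(packages):
    """Split a comma-separated package list into (group, artifact, version) tuples."""
    coords = []
    for item in packages.split(","):
        parts = item.strip().split(":")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected group:artifact:version, got {item!r}")
        coords.append(tuple(parts))
    return coords


if __name__ == "__main__":
    # Hypothetical location; substitute the real path to ivySettings.xml.
    print(to_file_uri(r"C:\bdt\ivySettings.xml"))
    for group, artifact, version in split_coordinates(PACKAGES):
        print(group, artifact, version)
```

A malformed entry, such as one missing its version, raises `ValueError` here instead of surfacing later as a dependency resolution failure when Spark starts.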