ST_DateLine

Table of Contents

  1. What is ST_DateLine

  2. ST_DateLine Input Data

  3. Spatial Partitioning Without ST_DateLine

  4. Applying ST_DateLine

  5. Spatial Partitioning With ST_DateLine

[1]:
import bdt
bdt.auth("../bdt.lic")
import bdt.functions as F
from bdt.processors import *
from pyspark.sql.functions import explode
BDT has been successfully authorized!

            Welcome to
             ___    _                ___         __             ______             __   __     _   __
            / _ )  (_)  ___ _       / _ \ ___ _ / /_ ___ _     /_  __/ ___  ___   / /  / /__  (_) / /_
           / _  | / /  / _ `/      / // // _ `// __// _ `/      / /   / _ \/ _ \ / /  /  '_/ / / / __/
          /____/ /_/   \_, /      /____/ \_,_/ \__/ \_,_/      /_/    \___/\___//_/  /_/\_\ /_/  \__/
                      /___/

BDT python version: v3.5.0-v3.4.0-develop-27-gde79522b
BDT jar version: v3.5.0-v3.4.0-develop-27-gde79522b

Part 1: What is ST_DateLine

ST_DateLine is a function that repairs geometries that cross the date line or antimeridian.

Typically, a geometry that crosses the date line is interpreted as wrapping the long way around the globe, away from the date line. To address this, ST_DateLine splits the geometry at the date line (antimeridian) so that it is represented correctly.

Applying this function to geometries that cross the date line removes inefficiencies in QR spatial partitioning, as the rest of this notebook demonstrates.

Geometries passed into this function must be in a pannable spatial reference, meaning one of the following:

  • Any geographic coordinate system.

  • A rectangular projected coordinate system where the x coordinate range is equivalent to a 360 degree range on the defining geographic coordinate system.
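To make the second condition concrete, here is a minimal sketch using Web Mercator as an illustrative projected system. The constant and the helper names are assumptions made for this example and are not part of BDT; whether a particular WKID is accepted by ST_DateLine should be confirmed against the BDT documentation.

```python
import math

# Assumption: spherical Web Mercator with the WGS84 semi-major axis as radius.
SPHERE_RADIUS_M = 6378137.0


def web_mercator_x(lon_deg):
    """Project a longitude in degrees to a Web Mercator x coordinate in meters."""
    return SPHERE_RADIUS_M * math.radians(lon_deg)


def x_range_in_degrees(x_min, x_max):
    """Convert an x coordinate range in meters back to a longitude range in degrees."""
    return math.degrees((x_max - x_min) / SPHERE_RADIUS_M)


x_min = web_mercator_x(-180.0)
x_max = web_mercator_x(180.0)

# The full x extent corresponds to a 360 degree range of longitude.
print(x_min, x_max, x_range_in_degrees(x_min, x_max))
```

In this sketch the antimeridian sits at the two ends of the x extent, which is what allows a geometry to be panned across it.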

Part 2: ST_DateLine Input Data

Create a sample line that is intended to cross the date line:

[2]:
wkt = "LINESTRING (170.0 0.0, -170.0 0.0)"

df = spark.sql(f"SELECT ST_FromText('{wkt}') AS SHAPE")

df.show()
+--------------------+
|               SHAPE|
+--------------------+
|{[01 05 00 00 00 ...|
+--------------------+

Visualizing this geometry using GeoPandas reveals how this line is actually interpreted. Instead of going from (170.0, 0) to (-170.0, 0) over the date line, the line travels the longer way around the globe to get to (-170.0, 0).

Geometries typically cannot cross the antimeridian of the spatial reference they are in; it serves as a boundary.
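The difference between the two interpretations can be quantified with a small plain-Python sketch. The helper names below are hypothetical and exist only for this illustration; they are not BDT functions.

```python
def naive_lon_span(lon_a, lon_b):
    """Longitude span when the segment is drawn without crossing +/-180."""
    return abs(lon_b - lon_a)


def wrapped_lon_span(lon_a, lon_b):
    """Shortest longitude span, allowing the path to cross the antimeridian."""
    naive = abs(lon_b - lon_a)
    return min(naive, 360.0 - naive)


start_lon, end_lon = 170.0, -170.0

# How the line is interpreted: the long way around the globe.
print(naive_lon_span(start_lon, end_lon))

# How the line was intended: the short way, over the date line.
print(wrapped_lon_span(start_lon, end_lon))
```

For the sample line, the interpreted path spans 340 degrees of longitude while the intended path spans only 20.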

[3]:
df.to_geo_pandas(4326).explore(tiles="Esri.WorldTopoMap")

Part 3: Spatial Partitioning Without ST_DateLine

Spatially partition the sample line from above and construct geometries representing the QR boundaries for visualization.

[4]:
cell_size = 1.0
qrs = (
    df
        .select(explode(F.st_asQR("SHAPE", cell_size)).alias("QR"))
        .select("*", F.st_qr_to_box("QR", cell_size, 0.0).alias("SHAPE"))
)

Spatially partitioning this geometry results in many more QRs than intended, and it creates QRs in areas of the world that the line was never meant to cross.

Because of this high number of QRs for a relatively small geometry, the geometry is passed to many more partitions than it should be. This makes it inefficient to use in spatial partitioning workflows, which includes most BDT Processors.
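A rough estimate of the overhead can be made without Spark. The sketch below counts grid columns along the x axis only, under the assumption that a column index is floor(x / cell_size) and that cells touching either end of the range are included. BDT's actual QR indexing and boundary handling may differ, so the numbers are illustrative rather than exact QR counts.

```python
import math


def columns_touched(x_min, x_max, cell_size):
    """Count grid columns overlapped by the closed x range [x_min, x_max].

    Assumption: a column index is floor(x / cell_size). This is a simplified
    stand-in for QR indexing, not BDT's implementation.
    """
    first = math.floor(x_min / cell_size)
    last = math.floor(x_max / cell_size)
    return last - first + 1


cell_size = 1.0

# The line as interpreted: one piece running from -170 to 170.
naive_columns = columns_touched(-170.0, 170.0, cell_size)

# The line as intended: two short pieces on either side of the date line.
intended_columns = (
    columns_touched(170.0, 180.0, cell_size)
    + columns_touched(-180.0, -170.0, cell_size)
)

print(naive_columns, intended_columns)
```

Under these assumptions the interpreted line touches 341 columns, while the intended path touches 22.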

[5]:
qrs.to_geo_pandas(4326).explore(tiles="Esri.WorldTopoMap")

Part 4: Applying ST_DateLine

Since a geometry cannot cross the date line, the geometry can alternatively be represented by drawing the line right up to the date line, splitting it, and then continuing to draw it on the other side of the date line. ST_DateLine will accomplish this.

[6]:
df_dateline = (
    df.select(explode(F.st_date_line("SHAPE", 4326)).alias("SHAPE"))
)

The WKT of the geometries output from ST_DateLine shows that the function split the geometry at 180 degrees (the antimeridian in 4326).

[7]:
(
    df_dateline
        .select(F.st_asText("SHAPE").alias("WKT"))
        .show(truncate=False)
)
+----------------------------------+
|WKT                               |
+----------------------------------+
|MULTILINESTRING ((170 0, 180 0))  |
|MULTILINESTRING ((-180 0, -170 0))|
+----------------------------------+
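The splitting idea can be sketched in plain Python for a single two-point segment. This is not how ST_DateLine is implemented; the function name is hypothetical, the crossing latitude is found by simple linear interpolation in degrees rather than along a geodesic, and the sketch assumes that a longitude difference greater than 180 degrees signals an intended crossing.

```python
def split_segment_at_antimeridian(start, end):
    """Split a two-point segment in degrees where it crosses +/-180.

    Assumption: a longitude difference larger than 180 degrees means the
    segment was intended to take the short way across the antimeridian.
    Returns a list of one or two parts, each a list of (lon, lat) tuples.
    """
    (lon0, lat0), (lon1, lat1) = start, end
    if abs(lon1 - lon0) <= 180.0:
        # No crossing: return the segment unchanged as a single part.
        return [[start, end]]

    # Shift the end point so the segment is continuous past +/-180.
    shifted_lon1 = lon1 + 360.0 if lon0 > 0 else lon1 - 360.0
    edge = 180.0 if lon0 > 0 else -180.0

    # Linearly interpolate the latitude at the crossing point.
    t = (edge - lon0) / (shifted_lon1 - lon0)
    lat_cross = lat0 + t * (lat1 - lat0)

    return [
        [(lon0, lat0), (edge, lat_cross)],
        [(-edge, lat_cross), (lon1, lat1)],
    ]


parts = split_segment_at_antimeridian((170.0, 0.0), (-170.0, 0.0))
for part in parts:
    print(part)
```

For the sample line this yields one part ending at longitude 180 and another starting at longitude -180, matching the shape of the two rows in the output above.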

Visualizing with GeoPandas shows the two sides of the split geometry.

[9]:
df_dateline.to_geo_pandas(4326).explore(tiles="Esri.WorldTopoMap")

Part 5: Spatial Partitioning With ST_DateLine

Now that ST_DateLine has been applied to the line geometry, spatial partitioning with the line results in far fewer QRs, and QRs are generated only over the intended area of the geometry.

[10]:
cell_size = 1.0
qrs_dateline = (
    df_dateline
        .select(explode(F.st_asQR("SHAPE", cell_size)).alias("QR"))
        .select("*", F.st_qr_to_box("QR", cell_size, 0.0).alias("SHAPE"))
)
[11]:
qrs_dateline.to_geo_pandas(4326).explore(tiles="Esri.WorldTopoMap")