Train Test Split
Description
Create two subsets of a ground truth raster for training and testing purposes by randomly sampling user-defined percentages of each ground truth class.
Usage
Create training and testing subsets from a ground truth raster for supervised classification applications. The training raster is created by randomly sampling a user-defined percentage of each discrete class in the ground truth data. The testing raster is the complement of the training raster and consists of all remaining cells within the ground truth extent. It represents a reserved group of ground truth cells that can be used to assess model accuracy in areas where the model was not trained.
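
The per-class sampling described above can be illustrated with a minimal NumPy sketch. This is not the tool's implementation. The function name `split_ground_truth`, the `NODATA` fill value of -1, the `seed` argument, and the rounding of cell counts are all assumptions made for illustration, and the optional area constraint is ignored.

```python
import numpy as np

NODATA = -1  # assumed fill value for cells not assigned to a subset


def split_ground_truth(truth, percent_per_class, seed=None):
    """Split an integer class array into (train, test) arrays of the same shape.

    percent_per_class maps each class value to the percentage (0-100)
    of that class's cells to place in the training array.
    """
    rng = np.random.default_rng(seed)
    flat = truth.ravel()
    train = np.full(flat.size, NODATA, dtype=np.int32)
    test = np.full(flat.size, NODATA, dtype=np.int32)

    for cls, percent in percent_per_class.items():
        # Flat indices of every cell labelled with this class, in random order.
        cells = rng.permutation(np.flatnonzero(flat == cls))
        # Number of training cells for this class (rounding is an assumption).
        n_train = int(round(cells.size * percent / 100.0))
        train[cells[:n_train]] = cls   # sampled cells go to training
        test[cells[n_train:]] = cls    # the remainder goes to testing

    return train.reshape(truth.shape), test.reshape(truth.shape)
```

With this sketch, every labelled cell ends up in exactly one of the two arrays, so overlaying the training and testing outputs reproduces the original ground truth.
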
Parameters
Parameter Name | Type | Direction | Data Type | Dialog Reference |
---|---|---|---|---|
Input Ground Truth Data (*.tif) | Required | Input | Raster Layer | Input raster of integer type in which each discrete value corresponds to a ground truth class. Class values must be 0-indexed and sequential (for example, 0, 1, 2). |
Training Sampling Percentage per Class | Required | Input | Value Table | Percentage of cells to sample from each discrete class in the ground truth dataset. If a value less than 100 is entered, the training dataset will contain a random subset of that class's cells according to the specified percentage, and the remaining cells will be added to the testing dataset. Users may find that undersampling or oversampling classes in an imbalanced ground truth dataset improves modeling results. |
Output Training Raster (*.tif) | Required | Output | Raster Dataset | Name of the resulting training raster. Must be in TIFF format. The directory will be created if it does not exist. |
Output Testing Raster (*.tif) | Required | Output | Raster Dataset | Name of the resulting testing raster. Must be in TIFF format. The directory will be created if it does not exist. |
Optional Training Sampling Area Constraint | Optional | Input | Feature Layer | Polygon defining the area within which training cells will be sampled. The polygon must be fully encompassed by the ground truth raster. If no constraint is used, training cells will be sampled from the entire ground truth raster. |
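
Before running the tool, it can help to confirm that the ground truth class values meet the input requirement of being 0-indexed and sequential. The following is a minimal sketch of such a check on an array already loaded into NumPy. The function name `check_class_values` and the optional `nodata` argument are hypothetical and are not part of the tool.

```python
import numpy as np


def check_class_values(truth, nodata=None):
    """Return True if the class values in `truth` are exactly 0, 1, ..., n-1."""
    # The ground truth raster is expected to be of integer type.
    if not np.issubdtype(truth.dtype, np.integer):
        return False
    values = np.unique(truth)  # sorted distinct cell values
    if nodata is not None:
        # Ignore an assumed placeholder value for unlabelled cells.
        values = values[values != nodata]
    if values.size == 0:
        return False
    return bool(np.array_equal(values, np.arange(values.size)))
```

For example, a raster containing the values 0, 1, and 2 passes this check, while one containing 1, 2, and 3 (not 0-indexed) or 0 and 2 (a gap) does not.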