Train Random Trees

Description

Train a Random Trees model using the Scikit-Learn implementation of the Breiman (2001) algorithm.

Usage

Given predictor variables and ground truth labels, train a Random Trees model that can subsequently be used to assign new areas to one of the ground truth labels.
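The training step described above can be sketched with Scikit-Learn directly. This is a minimal illustration, not the tool's actual implementation: the function name train_random_trees, the use of 0 as the value for unlabelled cells, and the synthetic arrays standing in for rasters are all assumptions made for the example. Note that feature_importances_ is Scikit-Learn's impurity-based measure, which may differ from the importance values the tool writes to its output file.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def train_random_trees(labels, predictors, nodata=0, n_trees=100, seed=0):
    """Fit a random forest on the labelled cells of a training raster.

    labels:     2-D integer array of class labels; cells equal to `nodata`
                are treated as unlabelled (an assumption of this sketch).
    predictors: list of 2-D arrays with the same shape as `labels`.
    """
    stack = np.stack(predictors, axis=-1)      # rows x cols x n_variables
    X = stack.reshape(-1, stack.shape[-1])     # one row per cell
    y = labels.reshape(-1)
    keep = y != nodata                         # train on labelled cells only
    model = RandomForestClassifier(
        n_estimators=n_trees,
        max_depth=None,
        max_features="sqrt",
        class_weight="balanced",
        random_state=seed,
    )
    model.fit(X[keep], y[keep])
    return model


# Synthetic stand-ins for the training raster and two predictor rasters.
rng = np.random.default_rng(0)
labels = np.zeros((20, 20), dtype=int)
labels[:10, :] = 1
labels[10:, :] = 2
labels[0, 0] = 0                               # one unlabelled cell
band_a = labels + rng.normal(0, 0.1, labels.shape)   # informative variable
band_b = rng.normal(0, 1, labels.shape)              # pure noise

model = train_random_trees(labels, [band_a, band_b])

# Save the model in JOBLIB format and load it back, as a later
# prediction step would.
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, model_path)
reloaded = joblib.load(model_path)
importances = reloaded.feature_importances_
```

In this sketch the predictor arrays are stacked in a fixed order, which is why a composite raster with the same band order would be needed at prediction time.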

Parameters

Parameter Name	Type	Direction	Data Type	Dialog Reference
Input Training Raster (*.tif)	Required	Input	Raster Layer	Input raster where cells are assigned an integer value representing a ground truth label. The known locations of each class are used to train the Random Trees model to identify that class from the predictor variables. Must be in TIF format.
Input Predictor Variables Raster (*.tif)	Required	Input	Multiple Value	Input raster(s) with the same extents as the training raster, where cells represent characteristics that may describe the ground truth classes. Must be in TIF format.
Output Trained Model (*.JOBLIB)	Required	Output	File	Name for the trained Random Trees model. Must be in JOBLIB format. If the directory does not exist, it will be created.
Prepared Predictor Variable Raster (*.tif)	Required	Output	Raster Dataset	Name of the output prepared predictor variable raster. This is either a composite of all input predictor variable rasters or a copy of the single input predictor variable raster. If a composite raster is created, it must be used as the input to Run Random Trees. If only one predictor variable is used, the input to Run Random Trees can be either this output or the original raster.
Output Variable Importance (*.txt)	Required	Output	File	Name for the variable importance file. For each input predictor variable, the variable importance represents the estimated decrease in model accuracy if that variable were removed from the training phase. Must be in TXT format. If the directory does not exist, it will be created.
Number of Trees	Optional	Input	Double	(Integer) The number of trees that are "grown" in the Random Trees algorithm. Each tree represents a decision tree model that is built for a bootstrapped selection of cells from the input predictor variable raster(s). The final model predictions represent the majority vote among all of the trees grown.
Maximum Tree Depth	Optional	Input	Any Value	(Integer or None) The maximum number of levels (i.e., decisions) made in each tree. Each decision in the tree aims to split the collection of predictor variables into unique groups that belong to a ground truth class with minimal impurity within each group. The default is "None", which expands nodes until all leaves are pure.
Maximum Number of Features	Optional	Input	Any Value	(Integer, float, string or None) The maximum number of variables to consider when making a decision. Acceptable strings are "auto", "sqrt", "log2" and "None"; do not include quotes when entering a string. Integers and floats (<1) are also acceptable. See Scikit-Learn documentation for further details. The default is "auto", which sets the maximum number of features to the square root of the number of features.
Class Weights	Optional	Input	Value Table	Weights assigned to each target class to predict, where the weight represents the penalty for misclassifying that class. The default is "balanced" (blank), which will set the weight for each class to be inversely proportional to the occurrence of that class in the training data.
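Two of the defaults above can be worked through by hand. The sketch below is illustrative only: balanced_class_weights and resolve_max_features are hypothetical helper names, not part of the tool. The weight formula, n_samples / (n_classes * class_count), is the one Scikit-Learn documents for "balanced"; the handling of "auto" follows the description in the table above, and a float is assumed to be interpreted as a fraction of the variables, as in Scikit-Learn.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to how often it occurs in the labels."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


def resolve_max_features(value, n_features):
    """Turn a Maximum Number of Features entry into a count of variables."""
    if value is None or value == "None":
        return n_features                       # consider every variable
    if value in ("auto", "sqrt"):
        return max(1, int(math.sqrt(n_features)))
    if value == "log2":
        return max(1, int(math.log2(n_features)))
    if isinstance(value, int):
        return value                            # used as an absolute count
    if isinstance(value, float):
        return max(1, int(value * n_features))  # used as a fraction
    raise ValueError(f"Unsupported max features value: {value!r}")


# Three cells of class 1 and one cell of class 2: the rarer class
# receives the larger weight.
weights = balanced_class_weights([1, 1, 1, 2])

# With nine predictor variables, the default considers three per decision.
n_considered = resolve_max_features("auto", 9)
```

The rarer a class is in the training raster, the larger its weight, so misclassifying it is penalized more heavily during training.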
