Create a forest model config object
This function is intended for advanced use cases in which users require detailed control of sampling algorithms and data structures. Minimal input validation and error checks are performed; users are responsible for providing the correct inputs. For tutorials on the "proper" usage of stochtree's advanced workflow, we provide several vignettes at stochtree.ai.
Usage
createForestModelConfig(
feature_types = NULL,
sweep_update_indices = NULL,
num_trees = NULL,
num_features = NULL,
num_observations = NULL,
variable_weights = NULL,
leaf_dimension = 1,
alpha = 0.95,
beta = 2,
min_samples_leaf = 5,
max_depth = -1,
leaf_model_type = 1,
leaf_model_scale = NULL,
variance_forest_shape = 1,
variance_forest_scale = 1,
cutpoint_grid_size = 100,
num_features_subsample = NULL
)
Arguments
- feature_types
Vector of integer-coded feature types (integers where 0 = numeric, 1 = ordered categorical, 2 = unordered categorical)
- sweep_update_indices
Vector of (0-indexed) indices of trees to update in a sweep
- num_trees
Number of trees in the forest being sampled
- num_features
Number of features in training dataset
- num_observations
Number of observations in training dataset
- variable_weights
Vector specifying sampling probability for all p covariates in ForestDataset
- leaf_dimension
Dimension of the leaf model (default: 1)
- alpha
Root node split probability in tree prior (default: 0.95)
- beta
Depth prior penalty in tree prior (default: 2.0)
- min_samples_leaf
Minimum number of samples in a tree leaf (default: 5)
- max_depth
Maximum depth of any tree in the ensemble. Setting to -1 does not enforce any depth limit on trees. Default: -1.
- leaf_model_type
Integer specifying the leaf model type (0 = constant leaf, 1 = univariate leaf regression, 2 = multivariate leaf regression). Default: 1.
- leaf_model_scale
Scale parameter used in Gaussian leaf models (can either be a scalar or a q x q matrix, where q is the dimensionality of the basis and is only >1 when leaf_model_type = 2). Calibrated internally as 1/num_trees, propagated along diagonal if needed for multivariate leaf models.
- variance_forest_shape
Shape parameter for IG leaf models (applicable when leaf_model_type = 3). Default: 1.
- variance_forest_scale
Scale parameter for IG leaf models (applicable when leaf_model_type = 3). Default: 1.
- cutpoint_grid_size
Number of unique cutpoints to consider (default: 100)
- num_features_subsample
Number of features to subsample for the GFR algorithm
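
Because the constructor performs little validation, it can help to assemble and check the inputs by hand before calling it. The following is a minimal sketch, not an official example: the toy dimensions, the equal variable weights, and the hand-written checks (including the assumption that the weights should sum to 1) are illustrative choices, and the argument names are taken from the Usage signature above. The call itself runs only if the stochtree package is installed.

```r
# Hypothetical toy dimensions, chosen only for illustration
num_observations <- 500
num_features <- 4
num_trees <- 50

# Integer-coded feature types:
# 0 = numeric, 1 = ordered categorical, 2 = unordered categorical
feature_types <- as.integer(c(0, 0, 1, 2))

# Equal sampling probability for every covariate (an illustrative choice)
variable_weights <- rep(1 / num_features, num_features)

# Update every tree in a sweep; tree indices are 0-based
sweep_update_indices <- 0:(num_trees - 1)

# Hand-written sanity checks (assumptions of this sketch, not package rules)
stopifnot(
  length(feature_types) == num_features,
  length(variable_weights) == num_features,
  all(feature_types %in% 0:2),
  abs(sum(variable_weights) - 1) < 1e-8,
  all(sweep_update_indices >= 0),
  all(sweep_update_indices < num_trees)
)

# Build the config only when stochtree is available
forest_model_config <- NULL
if (requireNamespace("stochtree", quietly = TRUE)) {
  forest_model_config <- stochtree::createForestModelConfig(
    feature_types = feature_types,
    sweep_update_indices = sweep_update_indices,
    num_trees = num_trees,
    num_features = num_features,
    num_observations = num_observations,
    variable_weights = variable_weights,
    leaf_dimension = 1,
    alpha = 0.95,
    beta = 2,
    min_samples_leaf = 5,
    max_depth = -1,
    leaf_model_type = 0,               # constant leaf
    leaf_model_scale = 1 / num_trees,  # same value as the internal calibration
    cutpoint_grid_size = 100
  )
}
```

Arguments left out of the call (variance_forest_shape, variance_forest_scale, num_features_subsample) keep the defaults shown in the Usage signature.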