Wrapper around a C++ dataset class used to sample a forest.
A dataset consists of three matrices / vectors: covariates,
bases, and variance weights. Both the basis matrix and the variance
weights are optional.
This class is intended for advanced use cases in which users require detailed control of sampling algorithms and data structures.
Minimal input validation and error checking are performed; users are responsible for providing correct inputs.
For tutorials on the "proper" usage of stochtree's advanced workflow, we provide several vignettes at https://stochtree.ai/
Public fields
data_ptr
External pointer to a C++ ForestDataset class
Methods
Method new()
Create a new ForestDataset object.
Usage
ForestDataset$new(covariates, basis = NULL, variance_weights = NULL)
Arguments
covariates
Matrix of covariates
basis
(Optional) Matrix of bases used to define a leaf regression
variance_weights
(Optional) Vector of observation-specific variance weights
Returns
A new ForestDataset object.
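A minimal sketch of constructing a dataset, following the Usage signature above. It assumes stochtree is installed and that the ForestDataset class is accessible after library(stochtree); the simulated data and the names X, W, and v are illustrative only.

```r
library(stochtree)

# Simulated inputs (illustrative only)
set.seed(1)
n <- 100
p <- 5
X <- matrix(runif(n * p), nrow = n, ncol = p)  # covariates
W <- cbind(1, rnorm(n))                        # two-column basis for a leaf regression
v <- rep(1, n)                                 # observation-specific variance weights

# Covariates only: basis and variance_weights keep their NULL defaults
dataset_simple <- ForestDataset$new(X)

# Covariates plus the optional basis and variance weights
dataset_full <- ForestDataset$new(X, basis = W, variance_weights = v)
```

Because little input validation is performed, it is worth checking that all three inputs have the same number of rows / elements before constructing the object.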
Method update_basis()
Update basis matrix in a dataset
Usage
ForestDataset$update_basis(basis)
Arguments
basis
Updated matrix of bases used to define a leaf regression
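As a hedged illustration, the sketch below swaps in a new basis for an existing dataset and reads it back with get_basis(). It assumes the replacement basis has the same dimensions as the original one (this page does not state whether that is required); the data and variable names are made up for the example.

```r
library(stochtree)

set.seed(2)
n <- 50
X <- matrix(runif(n * 3), nrow = n)
W <- cbind(1, rnorm(n))
dataset <- ForestDataset$new(X, basis = W)

# Replace the basis with a matrix of the same shape
# (ASSUMPTION: the new basis should match the original dimensions)
W_new <- cbind(1, rnorm(n))
dataset$update_basis(W_new)

# Read the stored basis back as an R matrix
basis_stored <- dataset$get_basis()
```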
Method update_variance_weights()
Update variance_weights in a dataset
Usage
ForestDataset$update_variance_weights(variance_weights, exponentiate = F)
Arguments
variance_weights
Updated vector of observation-specific variance weights (case weights)
exponentiate
Whether the input vector should be exponentiated before being written to the dataset's variance weights. Default: F (i.e. FALSE).
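The sketch below shows both ways of calling update_variance_weights(): with weights on their natural scale, and with log-scale weights that are exponentiated on write. It assumes the dataset was created with variance weights in the first place; the data and variable names are illustrative.

```r
library(stochtree)

set.seed(3)
n <- 50
X <- matrix(runif(n * 3), nrow = n)
dataset <- ForestDataset$new(X, variance_weights = rep(1, n))

# Weights supplied on their natural scale
w <- runif(n, min = 0.5, max = 2)
dataset$update_variance_weights(w)

# The same weights supplied on the log scale and exponentiated on write
dataset$update_variance_weights(log(w), exponentiate = TRUE)

# Read the stored weights back as an R vector
w_stored <- dataset$get_variance_weights()
```

The exponentiate option is convenient when a model works with log-variances, since the caller does not need to transform the vector before each update.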
Method num_observations()
Return number of observations in a ForestDataset object
Usage
ForestDataset$num_observations()
Returns
Observation count
Method num_covariates()
Return number of covariates in a ForestDataset object
Usage
ForestDataset$num_covariates()
Method num_basis()
Return number of bases in a ForestDataset object
Usage
ForestDataset$num_basis()
Method get_covariates()
Return covariates as an R matrix
Usage
ForestDataset$get_covariates()
Method get_basis()
Return bases as an R matrix
Usage
ForestDataset$get_basis()
Method get_variance_weights()
Return variance weights as an R vector
Usage
ForestDataset$get_variance_weights()
Returns
Variance weight data
Method has_basis()
Whether or not a dataset has a basis matrix
Usage
ForestDataset$has_basis()
Returns
TRUE if a basis matrix is loaded, FALSE otherwise
Method has_variance_weights()
Whether or not a dataset has variance weights
Usage
ForestDataset$has_variance_weights()
Returns
TRUE if variance weights are loaded, FALSE otherwise
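Since the basis and variance weights are optional, it can help to check for them before calling the corresponding getters. A minimal sketch of the inspection methods documented above, using simulated covariates (names are illustrative, and the behaviour of the getters on a dataset without those components is not described on this page):

```r
library(stochtree)

set.seed(4)
n <- 20
X <- matrix(runif(n * 4), nrow = n)
dataset <- ForestDataset$new(X)  # no basis, no variance weights

# Dimensions of the stored data
n_obs <- dataset$num_observations()
n_cov <- dataset$num_covariates()

# Guard the optional components before reading them
if (dataset$has_basis()) {
  basis <- dataset$get_basis()
}
if (dataset$has_variance_weights()) {
  weights <- dataset$get_variance_weights()
}

# Covariates are always present
X_stored <- dataset$get_covariates()
```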
Method has_auxiliary_dimension()
Whether or not a dataset has auxiliary data stored at the dimension indicated
Usage
ForestDataset$has_auxiliary_dimension(dim_idx)
Arguments
dim_idx
Dimension of auxiliary data
Returns
TRUE if auxiliary data has been allocated for dim_idx, FALSE otherwise
Method add_auxiliary_dimension()
Initialize a new dimension / lane of auxiliary data and allocate data in its place
Usage
ForestDataset$add_auxiliary_dimension(dim_size)
Arguments
dim_size
Size of the new vector of data to allocate
Method get_auxiliary_data_value()
Retrieve auxiliary data value
Usage
ForestDataset$get_auxiliary_data_value(dim_idx, element_idx)
Arguments
dim_idx
Dimension from which the data value is to be retrieved
element_idx
Element to retrieve from dimension dim_idx
Returns
Floating point value stored in the requested auxiliary data space
Method set_auxiliary_data_value()
Set auxiliary data value
Usage
ForestDataset$set_auxiliary_data_value(dim_idx, element_idx, value)
Arguments
dim_idx
Dimension in which the data value is to be set
element_idx
Element to set within dimension dim_idx
value
Data value to set at auxiliary data dimension dim_idx and element element_idx
Method get_auxiliary_data_vector()
Retrieve entire auxiliary data vector
Usage
ForestDataset$get_auxiliary_data_vector(dim_idx)
Arguments
dim_idx
Dimension to retrieve
Returns
Vector of all of the auxiliary data stored at dimension dim_idx
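A rough sketch of the auxiliary data workflow: allocate a dimension, write a value, and read it back. This page does not state the indexing convention for dim_idx and element_idx; the example ASSUMES both are zero-based (as in the underlying C++ code), so verify this against your installed version before relying on it. All data and variable names are illustrative.

```r
library(stochtree)

set.seed(5)
n <- 10
X <- matrix(runif(n * 2), nrow = n)
dataset <- ForestDataset$new(X)

# ASSUMPTION: dimension and element indices are zero-based;
# the documentation above does not confirm this.
dim_idx <- 0
element_idx <- 0

# Allocate a first auxiliary dimension holding n values
dataset$add_auxiliary_dimension(n)
allocated <- dataset$has_auxiliary_dimension(dim_idx)

# Write a single value, then read it back
dataset$set_auxiliary_data_value(dim_idx, element_idx, 1.5)
value <- dataset$get_auxiliary_data_value(dim_idx, element_idx)

# Retrieve the whole vector stored at this dimension
aux <- dataset$get_auxiliary_data_vector(dim_idx)
```

Given the minimal error checking noted in the description, calling has_auxiliary_dimension() before reading or writing is a cheap safeguard against addressing a dimension that was never allocated.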