StochTree 0.0.1
Container of TreeEnsemble forest objects. This is the primary (in-memory) storage interface for multiple "samples" of a decision tree ensemble in stochtree.
#include <container.h>
Public Member Functions | |
ForestContainer (int num_trees, int output_dimension=1, bool is_leaf_constant=true, bool is_exponentiated=false) | |
Construct a new ForestContainer object. | |
ForestContainer (int num_samples, int num_trees, int output_dimension=1, bool is_leaf_constant=true, bool is_exponentiated=false) | |
Construct a new ForestContainer object. | |
void | DeleteSample (int sample_num) |
Remove a forest from a container of forest samples and delete the corresponding object, freeing its memory. | |
void | AddSample (TreeEnsemble &forest) |
Add a new forest to the container by copying forest . | |
void | InitializeRoot (double leaf_value) |
Initialize a "root" forest of univariate trees as the first element of the container, setting all root node values in every tree to leaf_value . | |
void | InitializeRoot (std::vector< double > &leaf_vector) |
Initialize a "root" forest of multivariate trees as the first element of the container, setting all root node values in every tree to leaf_vector . | |
void | AddSamples (int num_samples) |
Pre-allocate space for num_samples additional forests in the container. | |
void | CopyFromPreviousSample (int new_sample_id, int previous_sample_id) |
Copy the forest stored at previous_sample_id to the forest stored at new_sample_id . | |
std::vector< double > | Predict (ForestDataset &dataset) |
Predict from every forest in the container on every observation in the provided dataset. The resulting vector is "column-major", where every forest in a container defines the columns of a prediction matrix and every observation in the provided dataset defines the rows. The (i ,j ) element of this prediction matrix can be read from the j * num_rows + i element of the returned std::vector<double> , where num_rows is equal to the number of observations in dataset (i.e. dataset.NumObservations() ). | |
std::vector< double > | PredictRaw (ForestDataset &dataset) |
Predict from every forest in the container on every observation in the provided dataset. The resulting vector stores a possibly three-dimensional array, where the dimensions are arranged as follows. | |
nlohmann::json | to_json () |
Save to JSON. | |
void | from_json (const nlohmann::json &forest_container_json) |
Load from JSON. | |
void | append_from_json (const nlohmann::json &forest_container_json) |
Append to a forest container from JSON, requires that the ensemble already contains a nonzero number of forests. | |
StochTree::ForestContainer::ForestContainer(int num_trees, int output_dimension = 1, bool is_leaf_constant = true, bool is_exponentiated = false)
Construct a new ForestContainer object.
num_trees | Number of trees in each forest. |
output_dimension | Dimension of the leaf node parameter in each tree of each forest. |
is_leaf_constant | Whether or not the leaves of each tree are treated as "constant." If true, then predicting from an ensemble is simply a matter of determining which leaf node an observation falls into. If false, prediction will multiply a leaf node's parameter(s) for a given observation by a basis vector. |
is_exponentiated | Whether or not the leaves of each tree are stored in log scale. If true, leaf predictions are exponentiated before their prediction is returned. |
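The effect of the two flags on a single leaf's contribution can be sketched as follows. This is a toy model of the behavior described above, not stochtree's implementation: `LeafPrediction` and its arguments are hypothetical names, and the sketch assumes a constant leaf stores a single value.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Toy illustration only: LeafPrediction is a hypothetical helper,
// not a stochtree function.
double LeafPrediction(const std::vector<double>& leaf_params,
                      const std::vector<double>& basis,
                      bool is_leaf_constant, bool is_exponentiated) {
  double value = 0.0;
  if (is_leaf_constant) {
    // Constant leaf: the prediction is the stored leaf parameter itself
    // (assumed univariate here); the basis is ignored.
    value = leaf_params[0];
  } else {
    // Non-constant leaf: multiply the leaf parameters by the
    // observation's basis vector (an inner product).
    for (std::size_t k = 0; k < leaf_params.size(); ++k) {
      value += leaf_params[k] * basis[k];
    }
  }
  // Leaves stored on the log scale are exponentiated before being returned.
  return is_exponentiated ? std::exp(value) : value;
}
```

With leaf parameters `{1.0, 2.0}` and basis `{3.0, 4.0}`, the non-constant, non-exponentiated case yields 1*3 + 2*4 = 11.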
StochTree::ForestContainer::ForestContainer(int num_samples, int num_trees, int output_dimension = 1, bool is_leaf_constant = true, bool is_exponentiated = false)
Construct a new ForestContainer object.
num_samples | Initial size of a container of forest samples. |
num_trees | Number of trees in each forest. |
output_dimension | Dimension of the leaf node parameter in each tree of each forest. |
is_leaf_constant | Whether or not the leaves of each tree are treated as "constant." If true, then predicting from an ensemble is simply a matter of determining which leaf node an observation falls into. If false, prediction will multiply a leaf node's parameter(s) for a given observation by a basis vector. |
is_exponentiated | Whether or not the leaves of each tree are stored in log scale. If true, leaf predictions are exponentiated before their prediction is returned. |
void StochTree::ForestContainer::DeleteSample(int sample_num)
Remove a forest from a container of forest samples and delete the corresponding object, freeing its memory.
sample_num | Index of forest to be deleted. |
void StochTree::ForestContainer::AddSample(TreeEnsemble &forest)
Add a new forest to the container by copying forest.
forest | Forest to be copied and added to the container of retained forest samples. |
void StochTree::ForestContainer::InitializeRoot(double leaf_value)
Initialize a "root" forest of univariate trees as the first element of the container, setting all root node values in every tree to leaf_value.
leaf_value | Value to assign to the root node of every tree. |
void StochTree::ForestContainer::InitializeRoot(std::vector<double> &leaf_vector)
Initialize a "root" forest of multivariate trees as the first element of the container, setting all root node values in every tree to leaf_vector.
leaf_vector | Vector of values to assign to the root node of every tree. |
void StochTree::ForestContainer::AddSamples(int num_samples)
Pre-allocate space for num_samples additional forests in the container.
num_samples | Number of (default-constructed) forests to allocate space for in the container. |
void StochTree::ForestContainer::CopyFromPreviousSample(int new_sample_id, int previous_sample_id)
Copy the forest stored at previous_sample_id to the forest stored at new_sample_id.
new_sample_id | Index of the new forest to be copied from an earlier sample. |
previous_sample_id | Index of the previous forest to copy to new_sample_id . |
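The semantics of AddSamples, CopyFromPreviousSample, and DeleteSample described above can be illustrated with a toy container. Everything in this sketch (`ToyForest`, `ToyContainer`, `At`, `NumSamples`) is a hypothetical stand-in, and the storage layout (a vector of owning pointers) is an assumption made for illustration rather than a description of stochtree's internals.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in, not a stochtree type: a "forest" is reduced to one root
// value per tree.
struct ToyForest {
  std::vector<double> root_values;
};

// Toy stand-in for a container of forest samples.
class ToyContainer {
 public:
  explicit ToyContainer(int num_trees) : num_trees_(num_trees) {}

  // Append num_samples default-initialized forests.
  void AddSamples(int num_samples) {
    for (int s = 0; s < num_samples; ++s) {
      auto forest = std::make_unique<ToyForest>();
      forest->root_values.assign(num_trees_, 0.0);
      forests_.push_back(std::move(forest));
    }
  }

  // Overwrite the forest at new_sample_id with a deep copy of the forest
  // at previous_sample_id.
  void CopyFromPreviousSample(int new_sample_id, int previous_sample_id) {
    *forests_[new_sample_id] = *forests_[previous_sample_id];
  }

  // Erase one forest; the owning pointer frees its memory. In this toy,
  // later forests shift down by one index.
  void DeleteSample(int sample_num) {
    forests_.erase(forests_.begin() + sample_num);
  }

  ToyForest& At(int i) { return *forests_[i]; }
  int NumSamples() const { return static_cast<int>(forests_.size()); }

 private:
  int num_trees_;
  std::vector<std::unique_ptr<ToyForest>> forests_;
};
```

A sampler built on such a container would typically pre-allocate a slot, copy the previous sample into it, and then modify the copy, leaving the earlier sample untouched.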
std::vector<double> StochTree::ForestContainer::Predict(ForestDataset &dataset)
Predict from every forest in the container on every observation in the provided dataset. The resulting vector is "column-major", where every forest in a container defines the columns of a prediction matrix and every observation in the provided dataset defines the rows. The (i, j) element of this prediction matrix can be read from the j * num_rows + i element of the returned std::vector<double>, where num_rows is equal to the number of observations in dataset (i.e. dataset.NumObservations()).
dataset | Data object containing training data, including covariates, leaf regression bases, and case weights. |
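The column-major indexing rule above can be sketched with two small helpers. Both `PredictionAt` and `MeanOverForests` are hypothetical names introduced for illustration; they operate on a plain std::vector<double> laid out as described and do not call into stochtree.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of stochtree): read element (i, j) of a
// column-major prediction matrix, where i indexes observations (rows)
// and j indexes forests (columns).
double PredictionAt(const std::vector<double>& preds, std::size_t num_rows,
                    std::size_t i, std::size_t j) {
  return preds[j * num_rows + i];
}

// Hypothetical helper: average the predictions for observation i across
// all forests, inferring the number of forests from the vector length.
double MeanOverForests(const std::vector<double>& preds, std::size_t num_rows,
                       std::size_t i) {
  const std::size_t num_forests = preds.size() / num_rows;
  double total = 0.0;
  for (std::size_t j = 0; j < num_forests; ++j) {
    total += preds[j * num_rows + i];
  }
  return total / static_cast<double>(num_forests);
}
```

For a vector `{1, 2, 3, 4, 5, 6}` with num_rows equal to 3 (two forests), element (1, 1) is read from position 1 * 3 + 1 = 4, which holds 5.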
std::vector<double> StochTree::ForestContainer::PredictRaw(ForestDataset &dataset)
Predict from every forest in the container on every observation in the provided dataset. The resulting vector stores a possibly three-dimensional array, where the dimensions are arranged as follows.
If the leaf nodes have univariate values, then the "first dimension" is 1 and the resulting array has the exact same layout as in Predict.
dataset | Data object containing training data, including covariates, leaf regression bases, and case weights. |