| StochTree 0.1.1
    | 
Class storing a "forest," or an ensemble of decision trees.
#include <ensemble.h>
| Public Member Functions | |
| TreeEnsemble (int num_trees, int output_dimension=1, bool is_leaf_constant=true, bool is_exponentiated=false) | |
| Initialize a new TreeEnsemble. | |
| TreeEnsemble (TreeEnsemble &ensemble) | |
| Initialize an ensemble based on the state of an existing ensemble. | |
| void | MergeForest (TreeEnsemble &ensemble) | 
| Combine two forests into a single forest by merging their trees. | |
| void | AddValueToLeaves (double constant_value) | 
| Add a constant value to every leaf of every tree in an ensemble. If leaves are multi-dimensional, constant_value will be added to every dimension of the leaves. | |
| void | MultiplyLeavesByValue (double constant_multiple) | 
| Multiply every leaf of every tree by a constant value. If leaves are multi-dimensional, constant_multiple will be multiplied through every dimension of the leaves. | |
| Tree * | GetTree (int i) | 
| Return a pointer to a tree in the forest. | |
| void | ResetRoot () | 
| Reset a TreeEnsemble to all single-node "root" trees. | |
| void | ResetTree (int i) | 
| Reset a single tree in an ensemble. | |
| void | ResetInitTree (int i) | 
| Reset a single tree in an ensemble. | |
| void | CloneFromExistingTree (int i, Tree *tree) | 
| Clone a single tree in an ensemble from an existing tree, overwriting current tree. | |
| void | ReconstituteFromForest (TreeEnsemble &ensemble) | 
| Reset an ensemble to clone another ensemble. | |
| int | GetMaxLeafIndex () | 
| Obtain a 0-based "maximum" leaf index for an ensemble, which is equivalent to the sum of the number of leaves in each tree. This is used in conjunction with PredictLeafIndicesInplace, which returns an observation-specific leaf index for every observation-tree pair. | |
| void | PredictLeafIndicesInplace (ForestDataset *dataset, std::vector< int32_t > &output, int num_trees, data_size_t n) | 
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation in a ForestDataset. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. | |
| void | PredictLeafIndicesInplace (Eigen::Map< Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &covariates, std::vector< int32_t > &output, int num_trees, data_size_t n) | 
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. | |
| void | PredictLeafIndicesInplace (Eigen::Map< Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &covariates, Eigen::Map< Eigen::Matrix< int, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &output, int column_ind, int num_trees, data_size_t n) | 
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix, writing the result into one column of an output matrix. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. | |
| void | PredictLeafIndicesInplace (Eigen::MatrixXd &covariates, std::vector< int32_t > &output, int num_trees, data_size_t n) | 
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. | |
| std::vector< int32_t > | PredictLeafIndices (ForestDataset *dataset) | 
| Same as PredictLeafIndicesInplace but assumes responsibility for allocating and returning output vector. | |
| json | to_json () | 
| Save to JSON. | |
| void | from_json (const json &ensemble_json) | 
| Load from JSON. | |
Class storing a "forest," or an ensemble of decision trees.
| 
 | inline | 
Initialize a new TreeEnsemble.
| num_trees | Number of trees in a forest | 
| output_dimension | Dimension of the leaf node parameter | 
| is_leaf_constant | Whether or not the leaves of each tree are treated as "constant." If true, then predicting from an ensemble is simply a matter of determining which leaf node an observation falls into. If false, prediction will multiply a leaf node's parameter(s) for a given observation by a basis vector. | 
| is_exponentiated | Whether or not the leaves of each tree are stored in log scale. If true, leaf predictions are exponentiated before their prediction is returned. | 
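The interaction of `is_leaf_constant` and `is_exponentiated` can be illustrated with a small standalone sketch. It does not use the StochTree headers: `ToyLeaf` and `ToyPredict` are hypothetical names, the ensemble prediction is assumed to be a sum over trees, and the exponential is assumed to be applied after that summation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the leaf that one observation reaches in one tree.
// `params` has length output_dimension.
struct ToyLeaf {
  std::vector<double> params;
};

// Combine one leaf per tree into a prediction for a single observation.
// - is_leaf_constant == true: each leaf contributes its (single) parameter.
// - is_leaf_constant == false: each leaf contributes dot(params, basis).
// - is_exponentiated == true: the sum is treated as log scale and
//   exponentiated before it is returned (assumption: exp after summing).
double ToyPredict(const std::vector<ToyLeaf>& leaves_hit,
                  const std::vector<double>& basis,
                  bool is_leaf_constant, bool is_exponentiated) {
  double pred = 0.0;
  for (const ToyLeaf& leaf : leaves_hit) {
    if (is_leaf_constant) {
      pred += leaf.params[0];
    } else {
      for (std::size_t k = 0; k < leaf.params.size(); ++k) {
        pred += leaf.params[k] * basis[k];
      }
    }
  }
  return is_exponentiated ? std::exp(pred) : pred;
}
```

With two trees whose reached leaves hold 0.5 and 1.5, the constant-leaf prediction in this sketch is 2.0, and exp(2.0) when `is_exponentiated` is set.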
| 
 | inline | 
Initialize an ensemble based on the state of an existing ensemble.
| ensemble | TreeEnsemble used to initialize the current ensemble | 
| 
 | inline | 
Combine two forests into a single forest by merging their trees.
| ensemble | Reference to another TreeEnsemble that will be merged into the current ensemble | 
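As a rough illustration of what merging means for tree counts, the sketch below appends copies of another forest's trees to the current one. `SketchTree` and `SketchForest` are hypothetical types, not the StochTree classes, and whether the real method copies or moves trees is an assumption here (copying is shown).

```cpp
#include <memory>
#include <vector>

// Hypothetical minimal tree: only leaf values are kept.
struct SketchTree {
  std::vector<double> leaf_values;
};

// Hypothetical forest that owns its trees through unique_ptr.
struct SketchForest {
  std::vector<std::unique_ptr<SketchTree>> trees;

  int NumTrees() const { return static_cast<int>(trees.size()); }

  // Append deep copies of every tree in `other`; `other` is left unchanged.
  void MergeForest(const SketchForest& other) {
    trees.reserve(trees.size() + other.trees.size());
    for (const auto& tree : other.trees) {
      trees.push_back(std::make_unique<SketchTree>(*tree));
    }
  }
};
```

Under these assumptions, merging a 2-tree forest into a 1-tree forest yields a 3-tree forest and leaves the source forest with its 2 trees.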
| 
 | inline | 
Add a constant value to every leaf of every tree in an ensemble. If leaves are multi-dimensional, constant_value will be added to every dimension of the leaves. 
| constant_value | Value that will be added to every leaf of every tree | 
| 
 | inline | 
Multiply every leaf of every tree by a constant value. If leaves are multi-dimensional, constant_multiple will be multiplied through every dimension of the leaves. 
| constant_multiple | Value that will be multiplied by every leaf of every tree | 
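The "every dimension of every leaf" behavior of the add and multiply operations can be sketched on a plain nested vector. `ToyForest`, `ToyAddValueToLeaves` and `ToyMultiplyLeavesByValue` are hypothetical names used only for illustration; they are not part of the library.

```cpp
#include <vector>

// Hypothetical toy forest: forest[t][l][d] is dimension d of leaf l in tree t.
using ToyForest = std::vector<std::vector<std::vector<double>>>;

// Add the same constant to every dimension of every leaf of every tree.
void ToyAddValueToLeaves(ToyForest& forest, double constant_value) {
  for (auto& tree : forest) {
    for (auto& leaf : tree) {
      for (double& value : leaf) value += constant_value;
    }
  }
}

// Multiply every dimension of every leaf of every tree by the same constant.
void ToyMultiplyLeavesByValue(ToyForest& forest, double constant_multiple) {
  for (auto& tree : forest) {
    for (auto& leaf : tree) {
      for (double& value : leaf) value *= constant_multiple;
    }
  }
}
```

If predictions are sums over trees of constant, one-dimensional leaves (an assumption of this sketch), adding `c` to every leaf shifts each prediction by `num_trees * c`, while multiplying by `c` scales each prediction by `c`.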
| 
 | inline | 
Return a pointer to a tree in the forest.
| i | Index (0-based) of a tree to be queried | 
| 
 | inline | 
Reset a single tree in an ensemble.
| i | Index (0-based) of the tree to be reset | 
| 
 | inline | 
Reset a single tree in an ensemble.
| i | Index (0-based) of the tree to be reset | 
| 
 | inline | 
Clone a single tree in an ensemble from an existing tree, overwriting current tree.
| i | Index of the tree to be overwritten | 
| tree | Pointer to tree used to clone tree i | 
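One way to picture the overwrite is as a deep copy into slot `i`, so that later changes to the source tree do not affect the ensemble. The sketch below uses hypothetical `ToyTree` and `ToyEnsemble` types and assumes copy semantics; it is not the library's implementation.

```cpp
#include <memory>
#include <vector>

// Hypothetical minimal tree: only leaf values are kept.
struct ToyTree {
  std::vector<double> leaf_values;
};

// Hypothetical ensemble that owns a fixed number of trees.
struct ToyEnsemble {
  std::vector<std::unique_ptr<ToyTree>> trees;

  explicit ToyEnsemble(int num_trees) {
    for (int t = 0; t < num_trees; ++t) {
      trees.push_back(std::make_unique<ToyTree>());
    }
  }

  ToyTree* GetTree(int i) { return trees[i].get(); }

  // Replace tree i with a deep copy of `tree`; the old tree i is destroyed.
  void CloneFromExistingTree(int i, const ToyTree* tree) {
    trees[i] = std::make_unique<ToyTree>(*tree);
  }
};
```

Because the copy is deep in this sketch, modifying the source tree after the call leaves tree `i` in the ensemble unchanged.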
| 
 | inline | 
Reset an ensemble to clone another ensemble.
| ensemble | Reference to an existing TreeEnsemble | 
| 
 | inline | 
Obtain a 0-based leaf index for every tree in an ensemble and for each observation in a ForestDataset. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. 
Note: this assumes the creation of a vector of column indices of size dataset.NumObservations() x ensemble.NumTrees() 
| dataset | Pointer to the ForestDataset with which to predict leaf indices from the tree | 
| output | Vector of length num_trees*n which stores the leaf node prediction | 
| num_trees | Number of trees in an ensemble | 
| n | Size of dataset | 
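The renumbering described above can be sketched without the library. In the toy function below, `leaves[t]` holds the node IDs of the leaves of tree `t` and `node_hit[t][i]` holds the node ID that observation `i` reaches in tree `t`. All names are hypothetical. Two details are assumptions rather than documented behavior: the output is laid out tree by tree (`output[t * n + i]`), and each tree's local indices are shifted by the number of leaves in the preceding trees so that indices are unique across the ensemble.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map node IDs to 0-based leaf indices, tree by tree.
std::vector<int32_t> ToyLeafIndices(
    const std::vector<std::vector<int32_t>>& leaves,
    const std::vector<std::vector<int32_t>>& node_hit,
    std::size_t n) {
  std::vector<int32_t> output(leaves.size() * n);
  int32_t offset = 0;  // number of leaves in all previous trees
  for (std::size_t t = 0; t < leaves.size(); ++t) {
    for (std::size_t i = 0; i < n; ++i) {
      // Position of the reached node within this tree's leaf list
      // is the renumbered index in 0 .. leaves[t].size() - 1.
      auto it = std::find(leaves[t].begin(), leaves[t].end(), node_hit[t][i]);
      int32_t local = static_cast<int32_t>(it - leaves[t].begin());
      output[t * n + i] = offset + local;
    }
    offset += static_cast<int32_t>(leaves[t].size());
  }
  return output;
}
```

For two trees with leaf node IDs `{1, 2}` and `{3, 4, 2}`, the largest index this sketch can produce is 4, one less than the total of 5 leaves.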
| 
 | inline | 
Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. 
Note: this assumes the creation of a vector of column indices of size dataset.NumObservations() x ensemble.NumTrees() 
| covariates | Matrix of covariates | 
| output | Vector of length num_trees*n which stores the leaf node prediction | 
| num_trees | Number of trees in an ensemble | 
| n | Size of dataset | 
| 
 | inline | 
Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix, writing the result into one column of an output matrix. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. 
Note: this assumes the creation of a matrix of column indices with num_trees*n rows and as many columns as forests that were requested from R / Python 
| covariates | Matrix of covariates | 
| output | Matrix with num_trees*n rows and as many columns as forests that were requested from R / Python | 
| column_ind | Index of column in output into which the result should be unpacked | 
| num_trees | Number of trees in an ensemble | 
| n | Size of dataset | 
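Writing one forest's result into a single column of a column-major matrix amounts to a strided copy. The sketch below uses a flat `std::vector<int>` as a stand-in for the column-major output matrix instead of Eigen, so that it has no external dependencies; `ToyUnpackColumn` is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy `leaf_indices` (length num_rows = num_trees * n) into column
// `column_ind` of a column-major matrix stored in a flat buffer.
// In column-major storage, element (r, c) lives at c * num_rows + r.
void ToyUnpackColumn(const std::vector<int32_t>& leaf_indices,
                     std::vector<int>& matrix,
                     std::size_t num_rows,
                     std::size_t column_ind) {
  for (std::size_t r = 0; r < num_rows; ++r) {
    matrix[column_ind * num_rows + r] = static_cast<int>(leaf_indices[r]);
  }
}
```

Each requested forest would fill its own column in this layout, so the other columns of the matrix are left untouched by a single call.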
| 
 | inline | 
Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix. Internally, trees are stored as essentially vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree-level and coordinate this computation at the ensemble level. 
Note: this assumes the creation of a vector of column indices of size dataset.NumObservations() x ensemble.NumTrees() 
| covariates | Matrix of covariates | 
| output | Vector of length num_trees*n which stores the leaf node prediction | 
| num_trees | Number of trees in an ensemble | 
| n | Size of dataset | 
| 
 | inline | 
Same as PredictLeafIndicesInplace but assumes responsibility for allocating and returning output vector. 
| dataset | Pointer to the ForestDataset with which to predict leaf indices from the tree |
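The relationship between the allocating and in-place variants can be sketched as a thin wrapper. Both functions below are hypothetical stand-ins: the "trees" are stumps that split a single covariate at the threshold `t`, which is a placeholder rule chosen only to make the example self-contained, and the tree-by-tree output layout is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// In-place variant: the caller owns `output`, which must already have
// num_trees * n elements. Toy rule: tree t sends x <= t to leaf 0, else leaf 1.
void ToyPredictLeafIndicesInplace(const std::vector<double>& x,
                                  std::vector<int32_t>& output,
                                  int num_trees, std::size_t n) {
  for (int t = 0; t < num_trees; ++t) {
    const double threshold = static_cast<double>(t);
    for (std::size_t i = 0; i < n; ++i) {
      output[static_cast<std::size_t>(t) * n + i] = (x[i] <= threshold) ? 0 : 1;
    }
  }
}

// Allocating variant: sizes the output vector, delegates, and returns it.
std::vector<int32_t> ToyPredictLeafIndices(const std::vector<double>& x,
                                           int num_trees) {
  const std::size_t n = x.size();
  std::vector<int32_t> output(static_cast<std::size_t>(num_trees) * n);
  ToyPredictLeafIndicesInplace(x, output, num_trees, n);
  return output;
}
```

The in-place form lets a caller reuse a buffer across repeated calls, while the allocating form is more convenient when the result is needed only once.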