|
| TreeEnsemble (int num_trees, int output_dimension=1, bool is_leaf_constant=true, bool is_exponentiated=false) |
| Initialize a new TreeEnsemble.
|
|
| TreeEnsemble (TreeEnsemble &ensemble) |
| Initialize an ensemble based on the state of an existing ensemble.
|
|
Tree * | GetTree (int i) |
| Return a pointer to a tree in the forest.
|
|
void | ResetRoot () |
| Reset a TreeEnsemble to all single-node "root" trees.
|
|
void | ResetTree (int i) |
| Reset a single tree in an ensemble.
|
|
void | ResetInitTree (int i) |
| Reset a single tree in an ensemble.
|
|
void | CloneFromExistingTree (int i, Tree *tree) |
| Clone a single tree in an ensemble from an existing tree, overwriting current tree.
|
|
void | ReconstituteFromForest (TreeEnsemble &ensemble) |
| Reset an ensemble to clone another ensemble.
|
|
int | GetMaxLeafIndex () |
| Obtain a 0-based "maximum" leaf index for an ensemble, which is equivalent to the sum of the number of leaves in each tree. This is used in conjunction with PredictLeafIndicesInplace, which returns an observation-specific leaf index for every observation-tree pair.
|
|
void | PredictLeafIndicesInplace (ForestDataset *dataset, std::vector< int32_t > &output, int num_trees, data_size_t n) |
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation in a ForestDataset. Internally, trees are stored as vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree level and coordinate the computation at the ensemble level.
|
|
void | PredictLeafIndicesInplace (Eigen::Map< Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &covariates, std::vector< int32_t > &output, int num_trees, data_size_t n) |
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix. Internally, trees are stored as vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree level and coordinate the computation at the ensemble level.
|
|
void | PredictLeafIndicesInplace (Eigen::Map< Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &covariates, Eigen::Map< Eigen::Matrix< int, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &output, int column_ind, int num_trees, data_size_t n) |
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix, writing the result into one column of an output matrix. Internally, trees are stored as vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree level and coordinate the computation at the ensemble level.
|
|
void | PredictLeafIndicesInplace (Eigen::MatrixXd &covariates, std::vector< int32_t > &output, int num_trees, data_size_t n) |
| Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix. Internally, trees are stored as vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree level and coordinate the computation at the ensemble level.
|
|
std::vector< int32_t > | PredictLeafIndices (ForestDataset *dataset) |
| Same as PredictLeafIndicesInplace, but allocates the output vector itself and returns it.
|
|
json | to_json () |
| Save to JSON.
|
|
void | from_json (const json &ensemble_json) |
| Load from JSON.
|
|
Class storing a "forest," or an ensemble of decision trees.
void StochTree::TreeEnsemble::PredictLeafIndicesInplace (Eigen::Map< Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &covariates, Eigen::Map< Eigen::Matrix< int, Eigen::Dynamic, Eigen::Dynamic, Eigen::ColMajor > > &output, int column_ind, int num_trees, data_size_t n)
inline
Obtain a 0-based leaf index for every tree in an ensemble and for each observation (row) in a covariate matrix. Internally, trees are stored as vectors of node information, and the leaves_ vector gives us node IDs for every leaf in the tree. Here, we would like to know, for every observation in a dataset, which leaf number it is mapped to. Since the leaf numbers themselves do not carry any information, we renumber them from 0 to leaves_.size()-1. We compute this at the tree level and coordinate the computation at the ensemble level.
Note: this assumes that the caller has already created an output matrix with num_trees*n rows and as many columns as forests that were requested from R / Python.
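The renumbering and the ensemble-level coordination described above could look roughly like the following sketch. ToyTree, ToyPredictLeafIndexInplace and ToyPredictLeafIndices are hypothetical names used for illustration only and are not part of StochTree. The sketch assumes that each tree's block of n results is stored contiguously in the output and that leaf indices are shifted by the number of leaves in all preceding trees, so that every index stays below the sum of the per-tree leaf counts. To keep it short, the leaf node ID of each observation is supplied directly instead of being found by traversing splits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a tree: only the pieces needed for this sketch.
struct ToyTree {
  std::vector<int> leaves;            // node IDs of the leaves (plays the role of leaves_)
  std::vector<int> leaf_node_of_obs;  // node ID of the leaf each observation falls into
};

// Tree level: renumber leaf node IDs to 0..leaves.size()-1, then shift by
// leaf_offset so indices from different trees do not collide.
void ToyPredictLeafIndexInplace(const ToyTree& tree, std::vector<int32_t>& output,
                                std::size_t output_offset, int32_t leaf_offset) {
  std::unordered_map<int, int32_t> renumber;
  for (std::size_t k = 0; k < tree.leaves.size(); ++k) {
    renumber[tree.leaves[k]] = static_cast<int32_t>(k);
  }
  for (std::size_t i = 0; i < tree.leaf_node_of_obs.size(); ++i) {
    output[output_offset + i] = leaf_offset + renumber.at(tree.leaf_node_of_obs[i]);
  }
}

// Ensemble level: advance the write position by n and the leaf offset by the
// number of leaves after each tree (an assumption of this sketch).
std::vector<int32_t> ToyPredictLeafIndices(const std::vector<ToyTree>& trees, std::size_t n) {
  std::vector<int32_t> output(trees.size() * n);
  std::size_t output_offset = 0;
  int32_t leaf_offset = 0;
  for (const ToyTree& tree : trees) {
    assert(tree.leaf_node_of_obs.size() == n);
    ToyPredictLeafIndexInplace(tree, output, output_offset, leaf_offset);
    output_offset += n;
    leaf_offset += static_cast<int32_t>(tree.leaves.size());
  }
  return output;
}
```

Under these assumptions, the result has num_trees*n entries, one per observation-tree pair, and the final value of leaf_offset matches the sum of leaf counts that GetMaxLeafIndex is described as returning.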
Parameters
covariates | Matrix of covariates |
output | Matrix with num_trees*n rows and as many columns as forests that were requested from R / Python |
column_ind | Index of column in output into which the result should be unpacked |
num_trees | Number of trees in an ensemble |
n | Size of dataset |
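To illustrate how column_ind and the num_trees*n row count fit together, here is a minimal sketch that uses a plain column-major std::vector in place of the Eigen::Map output. UnpackIntoColumn is a hypothetical helper, not a StochTree function, and the sketch only shows the index arithmetic for writing one forest's leaf indices into its column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy one forest's leaf indices (length num_rows, which
// would be num_trees * n) into column column_ind of a column-major buffer.
void UnpackIntoColumn(const std::vector<int32_t>& leaf_indices, std::vector<int>& output,
                      std::size_t num_rows, std::size_t column_ind) {
  assert(leaf_indices.size() == num_rows);
  assert(output.size() >= (column_ind + 1) * num_rows);
  for (std::size_t row = 0; row < num_rows; ++row) {
    // In column-major storage, element (row, col) lives at col * num_rows + row.
    output[column_ind * num_rows + row] = static_cast<int>(leaf_indices[row]);
  }
}
```

With two forests, two trees each and n = 2, the buffer would hold 4 rows and 2 columns, and the second forest's indices would occupy positions 4 through 7.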