Forest API
stochtree.forest.Forest
In-memory Python wrapper around a C++ tree ensemble object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_trees | int | Number of trees that each forest should contain | required |
output_dimension | int | Dimension of the leaf node parameters in each tree | 1 |
leaf_constant | bool | Whether the leaf node model is "constant" (i.e. prediction is simply a sum of leaf node parameters for every observation in a dataset) or not (i.e. each leaf node parameter is multiplied by a "basis vector" before being returned as a prediction). | True |
is_exponentiated | bool | Whether or not the leaf node parameters are stored in log scale (in which case, they must be exponentiated before being returned as predictions). | False |
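The difference between a constant leaf model, a basis-vector leaf model, and log-scale leaf parameters can be illustrated with a small pure-Python sketch. This is a toy model, not stochtree's implementation: `toy_predict` and its arguments are hypothetical names, and it assumes that exponentiation is applied to the summed tree contributions.

```python
import math


def toy_predict(leaf_params, basis=None, is_exponentiated=False):
    """Combine per-tree leaf parameters for ONE observation (toy model).

    leaf_params: one list per tree, holding the parameters of the leaf
                 the observation falls into (length = output dimension).
    basis:       None for a "constant" leaf model; otherwise a list with
                 the same length as each entry of leaf_params.
    """
    total = 0.0
    for params in leaf_params:
        if basis is None:
            # Constant leaf: the tree contributes its scalar leaf value.
            total += params[0]
        else:
            # Non-constant leaf: each parameter is multiplied by its basis entry.
            total += sum(p * b for p, b in zip(params, basis))
    # Assumption: log-scale parameters are exponentiated after summing.
    return math.exp(total) if is_exponentiated else total


constant_pred = toy_predict([[0.5], [-0.25], [1.0]])
basis_pred = toy_predict([[1.0, 2.0], [0.5, -1.0]], basis=[1.0, 3.0])
log_scale_pred = toy_predict([[0.0], [0.0]], is_exponentiated=True)
```

Here `constant_pred` is the plain sum of three leaf values, while `basis_pred` weights each two-dimensional leaf parameter by the observation's basis vector before summing.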
reset_root()
Reset the forest to one in which every tree is a single-node (i.e. "root") tree.
reset(forest_container, forest_num)
Reset the forest to the forest indexed by `forest_num` in `forest_container`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_container | ForestContainer | Stochtree object storing tree ensembles | required |
forest_num | int | Index of the ensemble used to reset the forest | required |
predict(dataset)
Predict from the forest, using the provided `Dataset` object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | Python object wrapping the "dataset" class used by C++ sampling and prediction data structures. | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array with length equal to the number of observations in `dataset`. |
predict_raw(dataset)
Predict raw leaf values from the forest, using the provided `Dataset` object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | Python object wrapping the "dataset" class used by C++ sampling and prediction data structures. | required |
Returns:
Type | Description |
---|---|
array | Numpy array with ( |
set_root_leaves(leaf_value)
Set constant (root) leaf node values for every tree in the forest. Assumes the forest consists of all root (single-node) trees.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
leaf_value | float or array | Constant values to which root nodes are to be set. If the trees in forest | required |
add_numeric_split(tree_num, leaf_num, feature_num, split_threshold, left_leaf_value, right_leaf_value)
Add a numeric (i.e. `X[,i] <= c`) split to a given tree in the forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be split | required |
leaf_num | int | Leaf to be split | required |
feature_num | int | Feature that defines the new split | required |
split_threshold | float | Value that defines the cutoff of the new split | required |
left_leaf_value | float or array | Value (or array of values) to assign to the newly created left node | required |
right_leaf_value | float or array | Value (or array of values) to assign to the newly created right node | required |
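To make the semantics of a numeric split concrete, the following sketch grows a single toy tree and routes observations through it. It is a hypothetical stand-in written for illustration (the `ToyTree` class, its array layout, and the `-1` "no child" marker are assumptions), not the C++ structure that stochtree wraps.

```python
class ToyTree:
    """Minimal array-backed binary tree with scalar leaf values."""

    def __init__(self, root_value=0.0):
        self.left = [-1]                  # -1 marks "no child", i.e. a leaf
        self.right = [-1]
        self.feature = [-1]
        self.threshold = [float("inf")]
        self.value = [root_value]

    def add_numeric_split(self, leaf_num, feature_num, split_threshold,
                          left_leaf_value, right_leaf_value):
        if self.left[leaf_num] != -1:
            raise ValueError("node %d is not a leaf" % leaf_num)
        # Append the two new leaves, left child first.
        for child_value in (left_leaf_value, right_leaf_value):
            self.left.append(-1)
            self.right.append(-1)
            self.feature.append(-1)
            self.threshold.append(float("inf"))
            self.value.append(child_value)
        # Turn the former leaf into a split node pointing at the new leaves.
        self.left[leaf_num] = len(self.value) - 2
        self.right[leaf_num] = len(self.value) - 1
        self.feature[leaf_num] = feature_num
        self.threshold[leaf_num] = split_threshold

    def predict_row(self, x):
        node = 0
        while self.left[node] != -1:
            # Numeric split rule: go left when x[feature] <= threshold.
            go_left = x[self.feature[node]] <= self.threshold[node]
            node = self.left[node] if go_left else self.right[node]
        return self.value[node]


tree = ToyTree(root_value=0.0)
tree.add_numeric_split(0, 1, 0.5, -1.0, 1.0)
preds = [tree.predict_row(x) for x in ([0.9, 0.2], [0.1, 0.5], [0.3, 0.8])]
```

An observation whose feature value equals the threshold goes to the left child, matching the `<=` in the split rule.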
get_tree_leaves(tree_num)
Retrieve a vector of indices of leaf nodes for a given tree in the forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree for which leaf indices will be retrieved | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array, containing the indices of leaf nodes in a given tree. |
get_tree_split_counts(tree_num, num_features)
Retrieve a vector of split counts for every training set variable in a given tree in the forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree for which split counts will be retrieved | required |
num_features | int | Total number of features in the training set | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array with as many elements as there are features in the forest model's training set, containing the split count for each feature for a given tree of the forest. |
get_overall_split_counts(num_features)
Retrieve a vector of split counts for every training set variable in the forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_features | int | Total number of features in the training set | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array with as many elements as there are features in the forest model's training set, containing the overall split count in the forest for each feature. |
get_granular_split_counts(num_features)
Retrieve a vector of split counts for every training set variable in the forest, reported separately for each tree.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_features | int | Total number of features in the training set | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array with as many elements as in the forest model's training set, containing the split count for each feature for every tree in the forest. |
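The relationship between per-tree, granular, and overall split counts can be sketched in a few lines. This is a toy illustration: `forest_splits` is a hypothetical forest described only by the feature index used at each split, and the nested-list layout is chosen for readability rather than to mirror the array returned by the library.

```python
def tree_split_counts(split_features, num_features):
    """Count how often each feature is used as a split variable in one tree."""
    counts = [0] * num_features
    for feature in split_features:
        counts[feature] += 1
    return counts


# Hypothetical forest of three trees; the second tree is still a root.
forest_splits = [[0, 2], [], [2, 2, 1]]
num_features = 4

# One row of counts per tree ("granular" view).
granular = [tree_split_counts(tree, num_features) for tree in forest_splits]
# Summing the rows gives one count per feature for the whole forest.
overall = [sum(column) for column in zip(*granular)]
```

Passing `num_features` explicitly is what lets features that are never split on (feature 3 here) still appear with a count of zero.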
num_forest_leaves()
Return the total number of leaves in a forest.
Returns:
Type | Description |
---|---|
int | Number of leaves in a forest |
sum_leaves_squared()
Return the total sum of squared leaf values in a forest.
Returns:
Type | Description |
---|---|
float | Sum of squared leaf values in a forest |
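As a quick sanity check on what these two summaries measure, the sketch below computes both for a hypothetical forest whose leaves are given as plain lists of scalar values (one list per tree). The variable names are illustrative only.

```python
# Hypothetical forest: three trees with 2, 1 and 3 scalar leaf values.
forest_leaf_values = [[0.5, -0.5], [1.0], [0.0, 2.0, -1.0]]

# Total number of leaves across all trees.
total_leaves = sum(len(tree) for tree in forest_leaf_values)

# Sum of squared leaf values across all trees.
total_squared = sum(v * v for tree in forest_leaf_values for v in tree)
```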
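is_leaf_node(tree_num, node_id)
Whether or not a given node of a given tree of a forest is a leaf.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
bool | `True` if the node is a leaf, `False` otherwise |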
is_numeric_split_node(tree_num, node_id)
Whether or not a given node of a given tree of a forest is a numeric split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
bool | `True` if the node is a numeric split node, `False` otherwise |
is_categorical_split_node(tree_num, node_id)
Whether or not a given node of a given tree of a forest is a categorical split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
bool | `True` if the node is a categorical split node, `False` otherwise |
parent_node(tree_num, node_id)
Parent node of a given node of a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Index of the parent of node |
left_child_node(tree_num, node_id)
Left child node of a given node of a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Index of the left child of node |
right_child_node(tree_num, node_id)
Right child node of a given node of a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Index of the right child of node |
node_depth(tree_num, node_id)
Depth of a given node of a given tree of a forest. Returns `-1` if the node is a leaf.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Depth of node |
node_split_index(tree_num, node_id)
Split index of a given node of a given tree of a forest. Returns `-1` if the node is a leaf.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Split index of the node |
node_split_threshold(tree_num, node_id)
Threshold that defines a numeric split for a given node of a given tree of a forest. Returns `np.inf` if the node is a leaf or a categorical split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
float | Threshold that defines a numeric split for node |
node_split_categories(tree_num, node_id)
Array of category indices that define a categorical split for a given node of a given tree of a forest. Returns `np.array([np.inf])` if the node is a leaf or a numeric split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of category indices that define a categorical split for node |
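The sentinel values described for the split queries (`-1` for a leaf's split index, infinity for a missing threshold or category list) suggest a simple way to classify a node from its query results. The helper below is a hypothetical sketch that takes those three values as plain Python arguments; it does not call the library.

```python
import math


def describe_node(split_index, threshold, categories):
    """Classify a node from the sentinel values quoted above (toy helper)."""
    if split_index == -1:
        # A leaf has no split variable.
        return "leaf"
    if not math.isinf(threshold):
        # A finite threshold means the node is a numeric split.
        return "numeric split on feature %d at %.2f" % (split_index, threshold)
    # Otherwise the split is defined by a set of category indices.
    return "categorical split on feature %d over %s" % (split_index, sorted(categories))


leaf_desc = describe_node(-1, math.inf, [math.inf])
numeric_desc = describe_node(2, 0.5, [math.inf])
categorical_desc = describe_node(0, math.inf, [1, 3])
```

In practice the dedicated `is_leaf_node`, `is_numeric_split_node`, and `is_categorical_split_node` queries are the more direct way to branch on node type; the sketch only shows how the sentinels fit together.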
node_leaf_values(tree_num, node_id)
Leaf node value(s) for a given node of a given tree of a forest. Values are stale if the node is a split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of parameter values for node |
num_nodes(tree_num)
Number of nodes in a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of nodes in tree |
num_leaves(tree_num)
Number of leaves in a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of leaves in tree |
num_leaf_parents(tree_num)
Number of leaf parents (split nodes with two leaves as children) in a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of leaf parents in tree |
num_split_nodes(tree_num)
Number of split nodes in a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of split nodes in tree |
nodes(tree_num)
Array of node indices in a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of indices of nodes in tree |
leaves(tree_num)
Array of leaf indices in a given tree of a forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of indices of leaf nodes in tree |
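The structural queries above (parent and child lookups, node and leaf listings, leaf-parent and split-node counts) all describe the same underlying tree. The sketch below derives each quantity from parent and child index arrays for a hypothetical five-node tree. The `NO_NODE` sentinel and the depth convention (root at depth 0) are assumptions of this toy, not documented library behavior.

```python
# Toy tree: node 0 splits into 1 and 2; node 2 splits into 3 and 4.
NO_NODE = -1  # hypothetical sentinel for "no such node"
parent = [NO_NODE, 0, 0, 2, 2]
left = [1, NO_NODE, 3, NO_NODE, NO_NODE]
right = [2, NO_NODE, 4, NO_NODE, NO_NODE]


def is_leaf(node_id):
    # A leaf has neither a left nor a right child.
    return left[node_id] == NO_NODE and right[node_id] == NO_NODE


def depth(node_id):
    # Walk up to the root, counting edges (root has depth 0 in this toy).
    steps = 0
    while parent[node_id] != NO_NODE:
        node_id = parent[node_id]
        steps += 1
    return steps


nodes = list(range(len(parent)))
leaves = [n for n in nodes if is_leaf(n)]
split_nodes = [n for n in nodes if not is_leaf(n)]
# Leaf parents: split nodes whose two children are both leaves.
leaf_parents = [n for n in split_nodes if is_leaf(left[n]) and is_leaf(right[n])]
depths = [depth(n) for n in nodes]
```

Note that the number of leaves is always one more than the number of split nodes in a binary tree, which is a handy consistency check when inspecting a sampled tree.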
stochtree.forest.ForestContainer
Container that stores sampled (and retained) tree ensembles from BART, BCF or a custom sampler.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_trees | int | Number of trees that each forest should contain | required |
output_dimension | int | Dimension of the leaf node parameters in each tree | 1 |
leaf_constant | bool | Whether the leaf node model is "constant" (i.e. prediction is simply a sum of leaf node parameters for every observation in a dataset) or not (i.e. each leaf node parameter is multiplied by a "basis vector" before being returned as a prediction). | True |
is_exponentiated | bool | Whether or not the leaf node parameters are stored in log scale (in which case, they must be exponentiated before being returned as predictions). | False |
predict(dataset)
Predict from each forest in the container, using the provided `Dataset` object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | Python object wrapping the "dataset" class used by C++ sampling and prediction data structures. | required |
Returns:
Type | Description |
---|---|
array | Numpy array with ( |
predict_raw(dataset)
Predict raw leaf values for every forest in the container, using the provided `Dataset` object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | Python object wrapping the "dataset" class used by C++ sampling and prediction data structures. | required |
Returns:
Type | Description |
---|---|
array | Numpy array with ( |
predict_raw_single_forest(dataset, forest_num)
Predict raw leaf values for a specific forest (indexed by `forest_num`), using the provided `Dataset` object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | Python object wrapping the "dataset" class used by C++ sampling and prediction data structures. | required |
forest_num | int | Index of the forest from which to predict. Forest indices are 0-based. | required |
Returns:
Type | Description |
---|---|
array | Numpy array with ( |
predict_raw_single_tree(dataset, forest_num, tree_num)
Predict raw leaf values for a specific tree of a specific forest (indexed by `tree_num` and `forest_num` respectively), using the provided `Dataset` object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | Python object wrapping the "dataset" class used by C++ sampling and prediction data structures. | required |
forest_num | int | Index of the forest from which to predict. Forest indices are 0-based. | required |
tree_num | int | Index of the tree from which to predict (within the forest indexed by `forest_num`). | required |
Returns:
Type | Description |
---|---|
array | Numpy array with ( |
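The three levels of raw prediction (single tree, single forest, every forest) nest inside one another. The sketch below shows that nesting for a hypothetical container of stumps with constant scalar leaves, where a forest's raw prediction is taken to be the sum of its trees' leaf values. All names are illustrative, and the forest-by-observation layout of `all_forests` is chosen for readability rather than to match the library's array shape.

```python
def stump(feature, threshold, left_value, right_value):
    """Build a one-split toy tree as a function from a row to a leaf value."""
    def predict(row):
        return left_value if row[feature] <= threshold else right_value
    return predict


# Hypothetical container: two forests of two stumps each.
container = [
    [stump(0, 0.5, -1.0, 1.0), stump(1, 0.0, 0.25, 0.75)],
    [stump(0, 0.2, -2.0, 2.0), stump(1, 1.0, 0.5, 1.5)],
]
rows = [[0.1, -1.0], [0.9, 2.0]]


def predict_single_tree(forest_num, tree_num):
    # Leaf value reached by each observation in one tree.
    return [container[forest_num][tree_num](row) for row in rows]


def predict_single_forest(forest_num):
    # Assumption: with constant leaves, a forest sums its trees' leaf values.
    per_tree = [predict_single_tree(forest_num, t)
                for t in range(len(container[forest_num]))]
    return [sum(values) for values in zip(*per_tree)]


all_forests = [predict_single_forest(f) for f in range(len(container))]
```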
set_root_leaves(forest_num, leaf_value)
Set constant (root) leaf node values for every tree in the forest indexed by `forest_num`. Assumes the forest consists of all root (single-node) trees.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest for which we will set root node parameters. | required |
leaf_value | float or array | Constant values to which root nodes are to be set. If the trees in forest | required |
save_to_json_file(json_filename)
Save the forests in the container to a JSON file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
json_filename | str | Name of JSON file to which forest container state will be saved. May contain absolute or relative paths. | required |
load_from_json_file(json_filename)
Load a forest container from output stored in a JSON file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
json_filename | str | Name of JSON file from which forest container state will be restored. May contain absolute or relative paths. | required |
dump_json_string()
Dump a forest container into an in-memory JSON string (which can be directly serialized or combined with other JSON strings before serialization).
Returns:
Type | Description |
---|---|
str | In-memory string containing state of a forest container. |
load_from_json_string(json_string)
Reload a forest container from an in-memory JSON string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
json_string | str | In-memory string containing state of a forest container. | required |
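The idea of dumping state to an in-memory JSON string, combining it with other JSON content, and reloading it later can be sketched with the standard `json` module. The dictionary below is a hypothetical stand-in for a container's state; its keys (`num_trees`, `forests`, `sigma2`, and so on) are invented for illustration and do not reflect stochtree's actual JSON schema.

```python
import json

# Hypothetical container state: two forests of two root-only trees each.
toy_state = {
    "num_trees": 2,
    "output_dimension": 1,
    "forests": [[[0.5], [-0.5]], [[0.1], [0.2]]],
}

# Dump to an in-memory string, then restore from it.
json_string = json.dumps(toy_state)
restored_state = json.loads(json_string)

# Combine the dumped string with other model output before serialization.
combined_string = json.dumps({
    "forest_container": json.loads(json_string),
    "sigma2": [1.0, 0.9],
})
combined = json.loads(combined_string)
```

Working with strings rather than files makes it straightforward to bundle several components into a single document and write that document once.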
add_sample(leaf_value)
Add a new all-root ensemble to the container, with all of the leaves set to the value / vector provided.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
leaf_value | float or array | Value (or vector of values) to initialize root nodes of every tree in a forest | required |
add_numeric_split(forest_num, tree_num, leaf_num, feature_num, split_threshold, left_leaf_value, right_leaf_value)
Add a numeric (i.e. `X[,i] <= c`) split to a given tree in the ensemble.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest which contains the tree to be split | required |
tree_num | int | Index of the tree to be split | required |
leaf_num | int | Leaf to be split | required |
feature_num | int | Feature that defines the new split | required |
split_threshold | float | Value that defines the cutoff of the new split | required |
left_leaf_value | float or array | Value (or array of values) to assign to the newly created left node | required |
right_leaf_value | float or array | Value (or array of values) to assign to the newly created right node | required |
get_tree_leaves(forest_num, tree_num)
Retrieve a vector of indices of leaf nodes for a given tree in a given forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest which contains the tree | required |
tree_num | int | Index of the tree for which leaf indices will be retrieved | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array, containing the indices of leaf nodes in a given tree. |
get_tree_split_counts(forest_num, tree_num, num_features)
Retrieve a vector of split counts for every training set feature in a given tree in a given forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest which contains the tree | required |
tree_num | int | Index of the tree for which split counts will be retrieved | required |
num_features | int | Total number of features in the training set | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array with as many elements as there are features in the forest model's training set, containing the split count for each feature for a given forest and tree. |
get_forest_split_counts(forest_num, num_features)
Retrieve a vector of split counts for every training set feature in a given forest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest for which split counts will be retrieved | required |
num_features | int | Total number of features in the training set | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array with as many elements as there are features in the forest model's training set, containing the split count for each feature for a given forest (summed across every tree in the forest). |
get_overall_split_counts(num_features)
Retrieve a vector of split counts for every training set feature, aggregated across ensembles and trees.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_features | int | Total number of features in the training set | required |
Returns:
Type | Description |
---|---|
array | One-dimensional numpy array with as many elements as there are features in the forest model's training set, containing the split count for each feature summed across every tree of every forest in the container. |
get_granular_split_counts(num_features)
Retrieve split counts for every training set variable, reported separately for each ensemble and tree in the container.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_features | int | Total number of features in the training set | required |
Returns:
Type | Description |
---|---|
array | Three-dimensional numpy array, containing the number of splits a variable receives in each tree of each forest in a `ForestContainer`. |
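The container-level split count queries are successive aggregations of the same three-way table of counts. The sketch below uses a nested list as a stand-in for the three-dimensional array. The index order (forest, tree, feature) is an assumption made for this toy and should be checked against the library before relying on it.

```python
# Hypothetical granular counts, indexed as [forest][tree][feature].
granular_counts = [
    [[1, 0, 0], [0, 2, 0]],  # forest 0: two trees, three features
    [[0, 0, 1], [1, 1, 0]],  # forest 1
]

# Per-forest counts: sum over the trees within each forest.
forest_counts = [
    [sum(column) for column in zip(*forest)] for forest in granular_counts
]

# Overall counts: sum the per-forest counts over all forests.
overall_counts = [sum(column) for column in zip(*forest_counts)]
```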
num_forest_leaves(forest_num)
Return the total number of leaves for a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
Returns:
Type | Description |
---|---|
int | Number of leaves in a given forest in a `ForestContainer` |
sum_leaves_squared(forest_num)
Return the total sum of squared leaf values for a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
Returns:
Type | Description |
---|---|
float | Sum of squared leaf values in a given forest in a `ForestContainer` |
is_leaf_node(forest_num, tree_num, node_id)
Whether or not a given node of a given tree in a given forest in the `ForestContainer` is a leaf.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
bool | `True` if the node is a leaf, `False` otherwise |
is_numeric_split_node(forest_num, tree_num, node_id)
Whether or not a given node of a given tree in a given forest in the `ForestContainer` is a numeric split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
bool | `True` if the node is a numeric split node, `False` otherwise |
is_categorical_split_node(forest_num, tree_num, node_id)
Whether or not a given node of a given tree in a given forest in the `ForestContainer` is a categorical split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
bool | `True` if the node is a categorical split node, `False` otherwise |
parent_node(forest_num, tree_num, node_id)
Parent node of a given node of a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Index of the parent of node |
left_child_node(forest_num, tree_num, node_id)
Left child node of a given node of a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Index of the left child of node |
right_child_node(forest_num, tree_num, node_id)
Right child node of a given node of a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Index of the right child of node |
node_depth(forest_num, tree_num, node_id)
Depth of a given node of a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Depth of node |
node_split_index(forest_num, tree_num, node_id)
Split index of a given node of a given tree in a given forest in the `ForestContainer`. Returns `-1` if the node is a leaf.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
int | Split index of the node |
node_split_threshold(forest_num, tree_num, node_id)
Threshold that defines a numeric split for a given node of a given tree in a given forest in the `ForestContainer`. Returns `np.inf` if the node is a leaf or a categorical split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
float | Threshold that defines a numeric split for node |
node_split_categories(forest_num, tree_num, node_id)
Array of category indices that define a categorical split for a given node of a given tree in a given forest in the `ForestContainer`. Returns `np.array([np.inf])` if the node is a leaf or a numeric split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of category indices that define a categorical split for node |
node_leaf_values(forest_num, tree_num, node_id)
Node parameter value(s) for a given node of a given tree in a given forest in the `ForestContainer`. Values are stale if the node is a split node.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
node_id | int | Index of the node to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of parameter values for node |
num_nodes(forest_num, tree_num)
Number of nodes in a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of nodes in tree |
num_leaves(forest_num, tree_num)
Number of leaves in a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of leaves in tree |
num_leaf_parents(forest_num, tree_num)
Number of leaf parents (split nodes with two leaves as children) in a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of leaf parents in tree |
num_split_nodes(forest_num, tree_num)
Number of split nodes in a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
int | Total number of split nodes in tree |
nodes(forest_num, tree_num)
Array of node indices in a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of indices of nodes in tree |
leaves(forest_num, tree_num)
Array of leaf indices in a given tree in a given forest in the `ForestContainer`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be queried | required |
tree_num | int | Index of the tree to be queried | required |
Returns:
Type | Description |
---|---|
array | Array of indices of leaf nodes in tree |
delete_sample(forest_num)
Modify the `ForestContainer` by removing the forest sample indexed by `forest_num`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
forest_num | int | Index of the forest to be removed from the `ForestContainer` | required |
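Adding and deleting samples changes which forest a given `forest_num` refers to, which matters when indices are cached across calls. The sketch below models a container as a plain Python list. `ToyContainer` is a hypothetical stand-in, and the assumption that later forests shift down by one position after a deletion is a property of this toy that has not been confirmed for the library itself.

```python
class ToyContainer:
    """Hypothetical stand-in: a list of forests, each a list of root values."""

    def __init__(self, num_trees):
        self.num_trees = num_trees
        self.forests = []

    def add_sample(self, leaf_value):
        # New all-root forest: every tree is a single leaf set to leaf_value.
        self.forests.append([leaf_value] * self.num_trees)

    def delete_sample(self, forest_num):
        # Removing a forest shifts every later forest down by one index.
        del self.forests[forest_num]


container = ToyContainer(num_trees=3)
for value in (0.0, 1.0, 2.0):
    container.add_sample(value)

container.delete_sample(1)
remaining_roots = [forest[0] for forest in container.forests]
```

After the deletion, index 1 in this toy refers to the forest that was previously at index 2.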