coverforest#

coverforest is a Python library that extends scikit-learn’s random forest implementation to provide prediction sets/intervals with guaranteed coverage using conformal prediction methods. It offers a simple and efficient way to get uncertainty estimates for both classification and regression tasks.

Useful links: Source Repository | Issues & Ideas

Key Features#

- Scikit-learn compatible API
- Three conformal prediction methods:
  - CV+ (Cross-Validation+) [1] [2]
  - Jackknife+-after-Bootstrap [3]
  - Split Conformal [4]
- Efficient conformity score calculation with parallel processing support
- Regularized set predictions for classification tasks [5]
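
To make the coverage idea concrete, here is a minimal pure-Python sketch of split conformal prediction for regression. It is not coverforest's implementation: the helper names `conformal_quantile` and `split_conformal_interval` are hypothetical, and the point prediction and residuals are toy values standing in for the output of any fitted model.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    Returns infinity when the calibration set is too small for the
    requested level, which yields an uninformative (infinite) interval.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def split_conformal_interval(prediction, calib_residuals, alpha):
    """Build a symmetric interval around a point prediction."""
    scores = [abs(r) for r in calib_residuals]  # absolute-residual scores
    q = conformal_quantile(scores, alpha)
    return prediction - q, prediction + q


# Residuals (y - prediction) on a held-out calibration set of 10 points.
residuals = [0.1, -0.4, 0.3, -0.2, 0.9, -0.7, 0.5, -0.6, 0.8, -1.0]
lower, upper = split_conformal_interval(5.0, residuals, alpha=0.2)
print(lower, upper)
```

The finite-sample correction `(n + 1)` in the quantile index is what separates a conformal interval from a plain empirical quantile of the residuals.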

Installation#

You can install coverforest using pip:

pip install coverforest

Requirements:

Python >=3.9
Scikit-learn >=1.6.0

Quick Start#

Classification Example#

from coverforest import CoverForestClassifier

clf = CoverForestClassifier(n_estimators=100, method='cv')  # using CV+
clf.fit(X_train, y_train)
y_pred, y_sets = clf.predict(X_test, alpha=0.05)            # 95% coverage sets
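
A quick way to sanity-check the returned sets is to measure how often they contain the true label and how large they are on average. The sketch below assumes `y_sets` is a sequence holding one collection of class labels per test sample; the helper `set_coverage_and_size` is a hypothetical name, and the toy data stands in for real held-out labels and predictions.

```python
def set_coverage_and_size(y_true, y_sets):
    """Return (fraction of sets containing the true label, mean set size)."""
    if len(y_true) != len(y_sets):
        raise ValueError("y_true and y_sets must have the same length")
    n = len(y_true)
    hits = sum(1 for label, s in zip(y_true, y_sets) if label in s)
    mean_size = sum(len(s) for s in y_sets) / n
    return hits / n, mean_size


# Toy stand-ins for held-out labels and the sets returned by predict().
y_test = [0, 1, 2, 1]
y_sets = [[0], [0, 1], [1], [1, 2]]
coverage, avg_size = set_coverage_and_size(y_test, y_sets)
print(coverage, avg_size)
```

With `alpha=0.05`, the empirical coverage on a reasonably large test set should land near or above 0.95; smaller average set sizes at the same coverage indicate more informative predictions.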

Regression Example#

from coverforest import CoverForestRegressor

reg = CoverForestRegressor(n_estimators=100, method='bootstrap')  # using J+-a-Bootstrap
reg.fit(X_train, y_train)
y_pred, y_intervals = reg.predict(X_test, alpha=0.05)             # 95% coverage intervals
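
The same kind of check applies to intervals: count how often the true value falls inside and track the average width. This sketch assumes each entry of `y_intervals` can be unpacked as a `(lower, upper)` pair; the helper `interval_coverage_and_width` is a hypothetical name and the numbers are toy values.

```python
def interval_coverage_and_width(y_true, intervals):
    """Return (fraction of intervals containing the truth, mean width)."""
    if len(y_true) != len(intervals):
        raise ValueError("y_true and intervals must have the same length")
    n = len(y_true)
    hits = sum(1 for y, (lo, hi) in zip(y_true, intervals) if lo <= y <= hi)
    mean_width = sum(hi - lo for lo, hi in intervals) / n
    return hits / n, mean_width


# Toy stand-ins for held-out targets and the intervals returned by predict().
y_test = [1.0, 2.5, 4.0, 0.0]
y_intervals = [(0.5, 1.5), (2.0, 3.0), (4.5, 5.5), (-1.0, 1.0)]
coverage, avg_width = interval_coverage_and_width(y_test, y_intervals)
print(coverage, avg_width)
```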

Performance Tips#

- Use the n_jobs parameter in fit() and predict() to control parallel processing (n_jobs=-1 uses all CPU cores)
- For large test sets, consider batch processing to optimize memory usage when calculating conformity scores
- The memory requirement for prediction scales with (n_train × n_test × n_classes)
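
As an illustration of the batching advice, the sketch below splits a test set into fixed-size chunks and estimates the size of an array with the stated n_train × n_test × n_classes shape. The helpers `batch_slices` and `estimated_score_bytes` are hypothetical names, and the 8 bytes per value (float64) is an assumption rather than a measured figure for coverforest.

```python
def batch_slices(n_samples, batch_size):
    """Yield slice objects that cover range(n_samples) in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_samples, batch_size):
        yield slice(start, min(start + batch_size, n_samples))


def estimated_score_bytes(n_train, n_test, n_classes, bytes_per_value=8):
    """Rough size of an n_train x n_test x n_classes array (float64 assumed)."""
    return n_train * n_test * n_classes * bytes_per_value


X_test = list(range(10))  # stand-in for a real test matrix
for sl in batch_slices(len(X_test), batch_size=4):
    batch = X_test[sl]
    # With a fitted model: y_pred, y_sets = clf.predict(batch, alpha=0.05)
    print(sl.start, sl.stop, len(batch))

print(estimated_score_bytes(10_000, 50_000, 10) / 1e9, "GB for the full test set")
print(estimated_score_bytes(10_000, 1_000, 10) / 1e9, "GB per batch of 1,000")
```

Under these assumptions, predicting in batches of 1,000 keeps the estimate at a fixed fraction of the full-test-set figure, since the size grows linearly with the number of test samples processed at once.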
