coverforest

coverforest is a Python library that extends scikit-learn’s random forest implementation with conformal prediction methods. It produces prediction sets (for classification) and prediction intervals (for regression) that come with coverage guarantees, giving a simple and efficient way to quantify uncertainty for both tasks.

Useful links: Source Repository | Issues & Ideas

Key Features

  • Scikit-learn compatible API

  • Three conformal prediction methods:
    • CV+ (Cross-Validation+) [1] [2]

    • Jackknife+-after-Bootstrap [3]

    • Split Conformal [4]

  • Efficient conformity score calculation with parallel processing support

  • Regularized set predictions for classification tasks [5]
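To make the idea behind these methods concrete, here is a minimal sketch of split conformal prediction for regression in plain Python. It is an illustration of the general technique, not coverforest's implementation: the function name `split_conformal_interval` and its arguments are hypothetical, and absolute residuals are assumed as the conformity score.

```python
import math


def split_conformal_interval(cal_true, cal_pred, test_pred, alpha=0.05):
    """Build symmetric intervals around test predictions (illustrative sketch).

    cal_true / cal_pred: targets and model predictions on a held-out
    calibration set that was not used to fit the model.
    """
    # Conformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)

    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))

    # With too few calibration points the rank exceeds n, and the
    # interval has to be unbounded to keep the coverage guarantee.
    q = scores[rank - 1] if rank <= n else math.inf

    return [(p - q, p + q) for p in test_pred]
```

CV+ and Jackknife+-after-Bootstrap follow the same spirit but reuse the training data through out-of-fold or out-of-bag predictions instead of setting aside a separate calibration set.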

Installation

You can install coverforest using pip:

pip install coverforest

Requirements:

  • Python >=3.9

  • Scikit-learn >=1.6.0

Quick Start

Classification Example

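from coverforest import CoverForestClassifier

# X_train, y_train, and X_test are assumed to be array-likes you have already prepared
clf = CoverForestClassifier(n_estimators=100, method='cv')  # using CV+
clf.fit(X_train, y_train)
y_pred, y_sets = clf.predict(X_test, alpha=0.05)            # alpha=0.05 targets 95% coverage sets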
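A quick way to sanity-check the returned sets is to measure how often they contain the true label on held-out data. The helper below is a sketch: `set_coverage` is a hypothetical name, and it assumes `y_sets` is a sequence holding one collection of candidate labels per test sample.

```python
def set_coverage(y_true, y_sets):
    """Return (empirical coverage, mean set size) for prediction sets."""
    if len(y_true) != len(y_sets):
        raise ValueError("y_true and y_sets must have the same length")

    # A sample counts as covered when its true label is in its prediction set.
    hits = sum(1 for y, labels in zip(y_true, y_sets) if y in set(labels))

    # Smaller sets are more informative at the same coverage level.
    mean_size = sum(len(labels) for labels in y_sets) / len(y_sets)

    return hits / len(y_true), mean_size
```

With `alpha=0.05`, the coverage measured this way on a reasonably large test set should land near or above 0.95. The mean set size indicates how informative the sets are.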

Regression Example

from coverforest import CoverForestRegressor

# X_train, y_train, and X_test are assumed to be array-likes you have already prepared
reg = CoverForestRegressor(n_estimators=100, method='bootstrap')  # using J+-a-Bootstrap
reg.fit(X_train, y_train)
y_pred, y_intervals = reg.predict(X_test, alpha=0.05)             # alpha=0.05 targets 95% coverage intervals
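The same kind of check works for intervals. This sketch assumes `y_intervals` can be iterated as (lower, upper) pairs, one per test sample. The helper name `interval_coverage` is hypothetical.

```python
def interval_coverage(y_true, y_intervals):
    """Return (empirical coverage, mean interval width) for prediction intervals."""
    if len(y_true) != len(y_intervals):
        raise ValueError("y_true and y_intervals must have the same length")

    # A target counts as covered when it falls inside its interval (inclusive).
    covered = sum(
        1 for y, (lower, upper) in zip(y_true, y_intervals) if lower <= y <= upper
    )

    # Narrower intervals are more useful at the same coverage level.
    mean_width = sum(upper - lower for lower, upper in y_intervals) / len(y_intervals)

    return covered / len(y_true), mean_width
```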

Performance Tips

  • Use the n_jobs parameter in fit() and predict() to control parallel processing (n_jobs=-1 uses all CPU cores)

  • For large test sets, predict in batches to limit peak memory use when calculating conformity scores

  • The memory requirement for prediction scales with (n_train × n_test × n_classes)
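The batching and memory tips above can be sketched as follows. All three helper names are hypothetical, and the 8 bytes per value (float64) is an assumption used only to turn the scaling rule into a rough size estimate. `predict_fn` stands in for any callable that returns a pair of per-sample sequences, such as a wrapper around `predict`.

```python
def estimate_score_memory_bytes(n_train, n_test, n_classes, bytes_per_value=8):
    """Rough size of an n_train x n_test x n_classes array of scores."""
    return n_train * n_test * n_classes * bytes_per_value


def max_batch_size(n_train, n_classes, budget_bytes, bytes_per_value=8):
    """Largest number of test rows whose scores fit in the given memory budget."""
    per_test_row = n_train * n_classes * bytes_per_value
    return max(1, budget_bytes // per_test_row)


def predict_in_batches(predict_fn, X_test, batch_size):
    """Call predict_fn on consecutive slices of X_test and stitch the results."""
    preds, extras = [], []
    for start in range(0, len(X_test), batch_size):
        batch = X_test[start:start + batch_size]
        # predict_fn is expected to return (point predictions, sets or intervals).
        y_pred, y_extra = predict_fn(batch)
        preds.extend(y_pred)
        extras.extend(y_extra)
    return preds, extras
```

For example, `predict_in_batches(lambda X: clf.predict(X, alpha=0.05), X_test, batch_size)` would process `X_test` in slices, with `batch_size` chosen via `max_batch_size` for the memory you are willing to spend.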

References