Even though scikit-learn supports probabilistic classification and, thus, discrete output distributions, it has a few problems:
- the `predict_proba` API is inconsistent with the `skpro` regression API, as it returns a numpy array and not a distribution object
- there is no API for multivariate probabilistic classification, or ordinal regression.
It was suggested by @felipeangelimvieira to add a skpro native probabilistic classifier API that can represent a wider range of return distributions.
Someone working on this should give more details about the API design first.
As a starting point, I suggest:
- a new module `classification`
- a base class `BaseProbaClassifier` with methods `fit`, `predict` and `predict_proba`
- if ordinal classification is also covered, then `predict_quantiles` and `predict_interval` may also make sense; in this case, a capability tag for ordinal classification should be added
- a distribution `DiscreteClass` used for the output
- a `scikit-learn` adapter that exposes any sklearn probabilistic classifier under the new API
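
To make the starting point above concrete, here is a minimal pure-Python sketch of how the pieces could fit together. It is not an existing `skpro` API. `BaseProbaClassifier` and `DiscreteClass` are the names proposed above; `SklearnProbaClassifier`, `pmf`, `mode` and the `_FrequencyClassifier` stand-in are hypothetical names chosen for illustration. For simplicity, `predict_proba` returns a list with one distribution per sample, whereas an actual design would more likely return a single array-like distribution object, as the regression API does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscreteClass:
    """Hypothetical discrete distribution over class labels for one sample."""

    classes: tuple
    probs: tuple

    def __post_init__(self):
        if len(self.classes) != len(self.probs):
            raise ValueError("classes and probs must have the same length")
        if abs(sum(self.probs) - 1.0) > 1e-6:
            raise ValueError("probs must sum to 1")

    def pmf(self, label):
        """Probability of a label; labels outside the support get 0."""
        return dict(zip(self.classes, self.probs)).get(label, 0.0)

    def mode(self):
        """Most probable label (first one wins on ties)."""
        return max(zip(self.classes, self.probs), key=lambda cp: cp[1])[0]


class BaseProbaClassifier:
    """Sketch of the proposed base class: fit, predict, predict_proba."""

    def fit(self, X, y):
        raise NotImplementedError

    def predict_proba(self, X):
        raise NotImplementedError

    def predict(self, X):
        # Default point prediction: the mode of each predictive distribution.
        return [dist.mode() for dist in self.predict_proba(X)]


class SklearnProbaClassifier(BaseProbaClassifier):
    """Hypothetical adapter for any estimator with the sklearn classifier
    interface (fit, classes_, predict_proba)."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        self.estimator.fit(X, y)
        return self

    def predict_proba(self, X):
        classes = tuple(self.estimator.classes_)
        # Each row of the wrapped output is one sample's class probabilities.
        return [
            DiscreteClass(classes, tuple(float(p) for p in row))
            for row in self.estimator.predict_proba(X)
        ]


class _FrequencyClassifier:
    """Stand-in for an sklearn classifier so the sketch runs without sklearn:
    it always predicts the class frequencies seen in training."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self._freq = [list(y).count(c) / len(y) for c in self.classes_]
        return self

    def predict_proba(self, X):
        return [list(self._freq) for _ in X]


clf = SklearnProbaClassifier(_FrequencyClassifier())
clf.fit([[0], [1], [2], [3]], ["a", "b", "b", "b"])
dists = clf.predict_proba([[5], [6]])
labels = clf.predict([[5], [6]])
```

Any real sklearn classifier that implements `predict_proba` should be usable in place of `_FrequencyClassifier` here, since the adapter only relies on `fit`, `classes_` and `predict_proba`. Ordinal methods such as `predict_quantiles` are left out of this sketch.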