Description
Our current implementation of predict_proba in BaseClassifier uses an ensemble approach: we build a kernel around a new point, make predictions using all the local models within the bandwidth, and take a normalised weighted average. This can be very expensive with large bandwidths, because every local model within the bandwidth has to be evaluated.
Georganos et al. use a different approach:
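For reference, the ensemble step described above can be sketched roughly as follows. This is a minimal illustration, not the code in BaseClassifier: the bisquare kernel, the `ensemble_proba` helper, and the representation of local models as `(location, probabilities)` pairs are all assumptions made for the example.

```python
import math


def bisquare(distance, bandwidth):
    """Bisquare kernel (assumed here): 1 at distance 0, 0 at or beyond the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def ensemble_proba(point, local_models, bandwidth):
    """Normalised kernel-weighted average of local-model probabilities.

    `local_models` is a hypothetical list of (location, probabilities) pairs,
    where `probabilities` stands in for that local model's prediction at `point`.
    """
    weighted = None
    total = 0.0
    for location, proba in local_models:
        weight = bisquare(math.dist(point, location), bandwidth)
        if weight == 0.0:
            continue  # model lies outside the bandwidth, so it is not used
        if weighted is None:
            weighted = [0.0] * len(proba)
        for i, p in enumerate(proba):
            weighted[i] += weight * p
        total += weight
    if total == 0.0:
        raise ValueError("no local model within the bandwidth")
    return [value / total for value in weighted]


models = [((0.0, 0.0), [1.0, 0.0]), ((1.0, 0.0), [0.0, 1.0]), ((10.0, 0.0), [0.5, 0.5])]
result = ensemble_proba((0.5, 0.0), models, bandwidth=2.0)
```

In this sketch the number of local models that contribute grows with the bandwidth, which is where the cost comes from.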
For predicting, we fuse the global and local estimates using a weight parameter (a). Fusing the predictions allows us to extract the locally heterogeneous signal (low bias) from the local sub-model and merging it to that of a global model which uses more data (low variance).
...
To predict on new spatial locations, the closest available GRF model is used.
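A rough sketch of what the quoted approach could look like, assuming the fusion is a convex combination of the local and global probabilities. The quote only mentions "a weight parameter (a)", so the exact formula is an assumption, and `nearest_model_index` and `fuse` are hypothetical helper names.

```python
import math


def nearest_model_index(point, model_locations):
    """Index of the local model whose location is closest to `point`."""
    return min(
        range(len(model_locations)),
        key=lambda i: math.dist(point, model_locations[i]),
    )


def fuse(local_proba, global_proba, global_model_weight=0.0):
    """Blend local and global probabilities (assumed convex combination).

    A weight of 0 returns the local prediction, a weight of 1 the global one.
    """
    if not 0.0 <= global_model_weight <= 1.0:
        raise ValueError("global_model_weight must be between 0 and 1")
    w = global_model_weight
    return [(1.0 - w) * l + w * g for l, g in zip(local_proba, global_proba)]


locations = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
idx = nearest_model_index((0.9, 0.0), locations)
fused = fuse([0.8, 0.2], [0.4, 0.6], global_model_weight=0.25)
```

Only one local model is evaluated per new point here, so the cost no longer depends on the bandwidth.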
I think we should enhance our prediction in three ways:
- Allow selection of nearest local model instead of the ensemble - ENH: prediction based on nearest model only or a custom bandwidth #52
- Allow custom bandwidth, to potentially use more localised subset of the original ensemble - ENH: prediction based on nearest model only or a custom bandwidth #52
- Allow fusion of global prediction, independent of whether the local prediction is pulled from the nearest model or from the ensemble of models. - ENH: implement fusion with the global model in prediction #54
The signature could look something like
predict_proba(
X,
geometry,
bandwidth: "nearest" | int | float | None = "nearest",
global_model_weight: float = 0,
)
In bandwidth, "nearest" selects the nearest local model only, an int or float is interpreted as the new bandwidth (most likely smaller than the original, but there's no restriction), and None is interpreted as self.bandwidth.
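One possible way to interpret the `bandwidth` argument, as a sketch only. `resolve_bandwidth` is a hypothetical helper, and rejecting non-positive values or unknown strings is an assumption that the proposal itself does not specify.

```python
def resolve_bandwidth(bandwidth, fitted_bandwidth):
    """Map the `bandwidth` argument to a (mode, value) pair.

    Returns ("nearest", None) for nearest-model prediction, or
    ("ensemble", value) for ensemble prediction with the given bandwidth.
    `fitted_bandwidth` stands in for self.bandwidth.
    """
    if bandwidth is None:
        # None falls back to the bandwidth used when fitting
        return ("ensemble", fitted_bandwidth)
    if isinstance(bandwidth, str):
        if bandwidth == "nearest":
            return ("nearest", None)
        raise ValueError(f"unknown bandwidth option: {bandwidth!r}")
    # bool is a subclass of int, so it is excluded explicitly
    if isinstance(bandwidth, bool) or not isinstance(bandwidth, (int, float)):
        raise TypeError("bandwidth must be 'nearest', a number, or None")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    return ("ensemble", bandwidth)


modes = [resolve_bandwidth(b, fitted_bandwidth=50) for b in ("nearest", None, 10, 2.5)]
```

Keeping this resolution separate from the prediction itself would let global_model_weight be applied the same way in both modes.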