Skip to content

feature_importance for multiinput models with data as a list of arrays#142

Open
jmaspons wants to merge 8 commits intoModelOriented:masterfrom
jmaspons:multiinput_feature_importance
Open

feature_importance for multiinput models with data as a list of arrays#142
jmaspons wants to merge 8 commits intoModelOriented:masterfrom
jmaspons:multiinput_feature_importance

Conversation

@jmaspons
Copy link
Copy Markdown
Contributor

Datasets can be 2d or 3d arrays

@hbaniecki
Copy link
Copy Markdown
Member

hbaniecki commented Mar 20, 2022

Hi @jmaspons, thanks for this contribution. I will try to review it next week.

  1. Could you provide some code examples (use case) for this functionality? Ideally, it would also be later incorporated into the package tests.
  2. Can you tell me how it relates to Support 3d arrays as input data for feature_importance #141?

@hbaniecki hbaniecki added the feature 💡 New feature or request label Mar 20, 2022
@jmaspons
Copy link
Copy Markdown
Contributor Author

Hello,

You can find a test script with dummy data at https://gist.github.com/jmaspons/0199ef922571bafe5eaac1a056963a83 (it requires keras, abind, DALEX and data.table packages). The patch implements feature_importance for models with more than one input datasets as 2D and 3D arrays. It can be useful for time series data (3D to a RNN) with some static variables (2D). DALEX::explainer doesn't support this kind of data input, so no changes to feature_importance.explainer

#141 implements feature_importance for a single input model. I should add some changes to that PR for the variable_groups and the autogenerated variables following this patch which I tested much more cases.

The feature_importance.default and feature_importance.multiinput

In order to add tests, do you think it's acceptable to add all the dependencies or skip some by saving some data in the package?

@hbaniecki
Copy link
Copy Markdown
Member

For starters, we should use underscore notation for function parameters instead of camelCase, e.g. perm_dim.

In order to add tests, do you think it's acceptable to add all the dependencies or skip some by saving some data in the package?

All such dependencies should be added to suggests; we wouldn't want more dependencies in imports. It would be nice to have some tests; they can run on generated data.

jmaspons added a commit to jmaspons/MLTools that referenced this pull request Mar 22, 2022
Static variables can be also categorical
Requires ModelOriented/ingredients#142
jmaspons added a commit to jmaspons/MLTools that referenced this pull request Apr 29, 2022
Implements support for 3D arrays in a list of inputs and ... to predict function
Waiting for ModelOriented/ingredients#142 and ModelOriented/ingredients#143
jmaspons added a commit to jmaspons/MLTools that referenced this pull request Apr 29, 2022
Implements support for 3D arrays in a list of inputs and ... to predict function
Waiting for ModelOriented/ingredients#142 and ModelOriented/ingredients#143
@pbiecek
Copy link
Copy Markdown
Member

pbiecek commented Jan 13, 2023

let's not have data.table and keras as dependencies

@jmaspons
Copy link
Copy Markdown
Contributor Author

let's not have data.table and keras as dependencies

I'll find alternatives implementations for the data.table part. For the keras tests, is it ok to make it conditional and add keras in the suggested packages section?

@pbiecek
Copy link
Copy Markdown
Member

pbiecek commented Mar 15, 2023

we are trying to have DALEX as light as possible and keras is quite heavy package
so maybe a valid solution would be to move this function to DALEXtra?
(it has some heavy dependencies)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature 💡 New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants