WIP, ENH: Add ML training and inference #28
adamwitmer wants to merge 20 commits into nidhin_data_analysis_backup
Conversation
* Added neat_ml/model for ML
* Added neat_ml/utils for plotting
* Added neat_ml/phase_diagram for plotting the phase diagram
* Added test scripts and baseline images
* Modified run_workflow.py for training, inference, and feature importance
* Updated test_workflow.py to incorporate new tests
* Updated README.md with command-line examples for train, infer, and plot
* Removed extraneous code from test scripts
* Modified baseline images to ensure the tolerance is within 1e-4
* Added ci.yml
* Added LANL copyright assertion ID
* add test file assets
* fix mypy error
@adamwitmer let me know when this is ready for review -- I believe it was to be presented today (March 20th) after being delayed from 4 months ago. I gave you a detailed review of gh-4 after you at least did a few things for the ASC polymer project (it does still seem a bit sluggish over there). I'll review this roughly 3 days after it is presented for review, assuming that duties on the ASC project are kept up.
Initial TODO items for reviewing this branch:
    plt.gcf().set_size_inches(8, 6)
    plt.tight_layout()
    plt.savefig(out_dir / "shap_summary.png", dpi=300)
I think we should be able to use fig, ax handles here...
I do not think that shap.summary_plot has support for fig, ax handles, per: shap/shap#3411, which can be seen at: https://github.com/shap/shap/blob/93dc2a1e446616fb0858b2ec108f80e4969ba6d9/shap/plots/summary.py#L45. This was an issue in the ldrd_virus_work project, which still uses global plt handles.
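Since `shap.summary_plot` draws on the implicit current pyplot figure and returns nothing, one workaround is to recover the figure afterwards with `plt.gcf()` and use explicit handles from that point on. A minimal sketch of that pattern -- `plot_like_shap_summary` is a hypothetical stand-in for the shap call, not the real API:

```python
# Workaround sketch: recover the implicit figure after a plotting call
# that (like shap.summary_plot with show=False) draws on global pyplot
# state and returns no fig/ax handles.
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts/CI
import matplotlib.pyplot as plt


def plot_like_shap_summary(values):
    # hypothetical stand-in for shap.summary_plot(..., show=False):
    # draws on the current figure and returns None
    plt.barh(range(len(values)), values)


def save_summary(values, out_path):
    plot_like_shap_summary(values)
    fig = plt.gcf()             # recover the implicit figure handle
    fig.set_size_inches(8, 6)
    fig.tight_layout()
    fig.savefig(out_path, dpi=100)
    plt.close(fig)              # avoid leaking global state between calls
```

Closing the figure explicitly keeps repeated calls from accumulating figures in pyplot's global registry, which matters in long test runs.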
Well, I'm pretty sure I asked for help resolving that upstream for that other project too. In fact, I did here--https://lisdi-git.lanl.gov/treddy/ldrd_virus_work/-/issues/89#note_31565.
And I resolved the matter myself on the rng side at shap/shap#3945. That could have been a good opportunity to help out the community...
Quoting from the internal issue:
My expectation is that the team will clearly communicate which shap issues still remain, and really that you'll help me solve them proactively. That took about a month for a fairly simple patch, so look at your calendar and think about how long larger changes may take.
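Until the minimum supported shap version includes the fix from shap/shap#3945, one defensive option is to pass `rng` only when the installed `shap.summary_plot` actually accepts it, since older releases raise `TypeError: summary_legacy() got an unexpected keyword argument 'rng'`. A stdlib-only sketch -- the helper name `call_with_supported_kwargs` is mine, not part of the project:

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, silently dropping keyword arguments its signature does
    not accept (e.g. the `rng` keyword that older shap.summary_plot
    versions reject with a TypeError)."""
    params = inspect.signature(func).parameters
    accepts_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not accepts_var_kw:
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return func(*args, **kwargs)


# usage sketch (the shap call itself is the one from this PR):
# call_with_supported_kwargs(
#     shap.summary_plot, vals, features=X, max_display=top, show=False, rng=rng
# )
```

The trade-off is that dropping `rng` silently gives non-reproducible plots on old shap versions, so logging a warning in the dropped-kwarg branch may be preferable.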
I opened a new issue, #29, to track this bug.
Please don't self-resolve comments where I've made a request for a change.
This isn't resolved--you've simply opened an issue to delay doing something I requested in November 2024, so the preferable route forward there is clear.
that you'll help me solve them proactively
@tylerjereddy I have completed my initial checklist and addressed all review comments, including the initial comments made on PR #5. I made sure the workflow runs on
I'll make a note to do a first round of review on Friday, April 3, assuming activity is kept up on the ASC polymer project at 2 days effort/week (or if charging was completely stopped there this week for a new project substitution). Otherwise, I'll wait for you to catch up over there. As discussed in person, presenting this volume of work this close to the deadline does place a lot of review burden on the team that is best avoided by progressively presenting the work over months of more digestible back and forth.
tylerjereddy left a comment
I've tried to provide a detailed review here despite the extraordinary (months) delay in presenting for review.
The value of hyperopt here seems to be essentially zero. I also noted in my review that you aren't even using hyperopt on one of your two estimators, so this really is a poor effort on the hyperopt side of things. An issue should probably be opened, with the source pointing to that issue, for future improvements in more complex situations where it actually matters. Grid search probably isn't sustainable in more complex situations where a guided/Bayesian search is likely required.
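The scaling concern is that a grid enumerates every combination, so its cost grows multiplicatively with each parameter added, whereas a sampled (and, further along, Bayesian) search is bounded by a fixed evaluation budget. A stdlib-only sketch of the sampled variant -- the helper and the example search space are illustrative, not the project's actual configuration:

```python
import random


def random_search(objective, space, n_iter=20, seed=0):
    """Sample n_iter candidate hyperparameter dicts from `space` instead of
    enumerating the full grid; cost stays O(n_iter) no matter how many
    parameters are added.  `objective` returns a score to maximize."""
    rng = random.Random(seed)  # seeded for reproducible searches
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A Bayesian optimizer (e.g. hyperopt's TPE) replaces the uniform sampling with a model of which regions of the space have scored well so far, which is what makes it worthwhile once evaluations are expensive.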
Test line coverage seems fine. However, when running the test suite locally on this PR branch I saw several failures, possibly caused by dependency versions. It would be helpful to support a reasonably wide range of dependency versions and to error out immediately at runtime when importing a dependency at an unsupported version, so that what is happening is clear to the user (and to the reviewer, who is spending time trying to figure this out...). Test suite error output is below the fold -- I'll leave resolution of that to you, and will probably just blindly change dependency versions locally until things work.
Details
============================================================================================ FAILURES ============================================================================================
___________________________________________________________________________________ test_train_with_validation ___________________________________________________________________________________
[gw1] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python
sample_data = feature1 feature2 feature3 exclude_col target
0 0.773956 9.085807 A 0 1.0
1 0.438878...06 1.964347 B 98 1.0
99 0.961898 3.103237 B 99 0.0
[100 rows x 5 columns]
def test_train_with_validation(sample_data: pd.DataFrame):
X, y = preprocess(sample_data, target="target")
# perfectly align all the feature data with the target
X['feature1'] = np.where(
y == 1.0,
np.random.uniform(0.6, 1.0, len(X)),
np.random.uniform(0.0, 0.4, len(X))
)
X['feature2'] = np.where(
y == 1.0,
np.random.uniform(6, 10, len(X)),
np.random.uniform(0, 4, len(X))
)
X_train, y_train = X.iloc[:80], y.iloc[:80]
X_val, y_val = X.iloc[80:], y.iloc[80:]
> model, metrics, _, actual_val_proba = train_with_validation(
X_train, y_train, X_val, y_val, n_jobs=1, ml_hyper_opt=False,
)
neat_ml/tests/test_train.py:74:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
neat_ml/model/train.py:229: in train_with_validation
final_model = pipeline.fit(X_train, y_train)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py:621: in fit
self._final_estimator.fit(Xt, y, **last_step_params["fit"])
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:405: in fit
return super().fit(X, transformed_y, **fit_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:80: in fit
names, clfs = self._validate_estimators()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = VotingClassifier(estimators=[('rf',
RandomForestClassifier(class_weight='balanced',
...ee=None,
random_state=42, ...))],
n_jobs=1, voting='soft')
def _validate_estimators(self):
if len(self.estimators) == 0 or not all(
isinstance(item, (tuple, list)) and isinstance(item[0], str)
for item in self.estimators
):
raise ValueError(
"Invalid 'estimators' attribute, 'estimators' should be a "
"non-empty list of (string, estimator) tuples."
)
names, estimators = zip(*self.estimators)
# defined by MetaEstimatorMixin
self._validate_names(names)
has_estimator = any(est != "drop" for est in estimators)
if not has_estimator:
raise ValueError(
"All estimators are dropped. At least one is required "
"to be an estimator."
)
is_estimator_type = is_classifier if is_classifier(self) else is_regressor
for est in estimators:
if est != "drop" and not is_estimator_type(est):
> raise ValueError(
"The estimator {} should be a {}.".format(
est.__class__.__name__, is_estimator_type.__name__[3:]
)
)
E ValueError: The estimator XGBClassifier should be a classifier.
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py:243: ValueError
_____________________________________________________________________________________ test_save_model_bundle _____________________________________________________________________________________
[gw1] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python
tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw1/test_save_model_bundle0')
sample_data = feature1 feature2 feature3 exclude_col target
0 0.773956 9.085807 A 0 1.0
1 0.438878...06 1.964347 B 98 1.0
99 0.961898 3.103237 B 99 0.0
[100 rows x 5 columns]
def test_save_model_bundle(tmp_path: Path, sample_data: pd.DataFrame):
X, y = preprocess(sample_data, target="target")
X_train, y_train = X.iloc[:80], y.iloc[:80]
X_val, y_val = X.iloc[80:], y.iloc[80:]
> expected_model, expected_metrics, expected_params, _ = train_with_validation(
X_train, y_train, X_val, y_val, n_jobs=1, ml_hyper_opt=False
)
neat_ml/tests/test_train.py:100:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
neat_ml/model/train.py:229: in train_with_validation
final_model = pipeline.fit(X_train, y_train)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py:621: in fit
self._final_estimator.fit(Xt, y, **last_step_params["fit"])
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:405: in fit
return super().fit(X, transformed_y, **fit_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:80: in fit
names, clfs = self._validate_estimators()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = VotingClassifier(estimators=[('rf',
RandomForestClassifier(class_weight='balanced',
...ee=None,
random_state=42, ...))],
n_jobs=1, voting='soft')
def _validate_estimators(self):
if len(self.estimators) == 0 or not all(
isinstance(item, (tuple, list)) and isinstance(item[0], str)
for item in self.estimators
):
raise ValueError(
"Invalid 'estimators' attribute, 'estimators' should be a "
"non-empty list of (string, estimator) tuples."
)
names, estimators = zip(*self.estimators)
# defined by MetaEstimatorMixin
self._validate_names(names)
has_estimator = any(est != "drop" for est in estimators)
if not has_estimator:
raise ValueError(
"All estimators are dropped. At least one is required "
"to be an estimator."
)
is_estimator_type = is_classifier if is_classifier(self) else is_regressor
for est in estimators:
if est != "drop" and not is_estimator_type(est):
> raise ValueError(
"The estimator {} should be a {}.".format(
est.__class__.__name__, is_estimator_type.__name__[3:]
)
)
E ValueError: The estimator XGBClassifier should be a classifier.
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py:243: ValueError
________________________________________________________________________________ test_compare_methods_end_to_end _________________________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python
tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_compare_methods_end_to_en0')
classification_dataset = ( PEO 10 kg/mol (wt%) ... graph_num_components
0 2.334594 ... 2.416491
1 -2.1...18
[10 rows x 5 columns], 0 0
1 1
2 0
3 0
4 1
5 0
6 1
7 0
8 1
9 1
Name: y, dtype: int64)
stable_rc = {'axes.labelsize': 10, 'axes.linewidth': 1.0, 'axes.titlesize': 12, 'figure.dpi': 100, ...}, baseline_dir = PosixPath('/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/tests/baseline')
def test_compare_methods_end_to_end(
tmp_path: Path,
classification_dataset: tuple[pd.DataFrame, pd.Series],
stable_rc,
baseline_dir,
):
"""
End-to-end test of compare_methods.
Test consistency of mean rank of important features
PNG compared via inline NumPy RMS diff.
"""
rng = np.random.default_rng(0)
X, y = classification_dataset
# "preprocess" dataset to remove composition columns
X = X.drop(columns=["PEO 10 kg/mol (wt%)", "Dextran 10 kg/mol (wt%)"])
model = RandomForestClassifier(random_state=0).fit(X, y)
with mpl.rc_context(stable_rc):
> fi.compare_methods(model, X, y, out_dir=tmp_path, top=3, rng=rng)
neat_ml/tests/test_feature_importance.py:154:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
neat_ml/model/feature_importance.py:335: in compare_methods
shap_imp = _run_shap(model, X, out_dir, top=top, rng=rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
model = RandomForestClassifier(random_state=0)
X = num_blobs coverage_percentage graph_num_components
0 -1.698423 2.336225 2.416491
1 1.....170261
8 2.200476 -2.275953 0.915428
9 -0.377733 -1.104456 -1.824018
out_dir = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_compare_methods_end_to_en0'), top = 3, n_jobs = -1
rng = Generator(PCG64) at 0x3441CFCA0
def _run_shap(
model, X: pd.DataFrame,
out_dir: Path,
top: int = 20,
n_jobs: int = -1,
rng: np.random.Generator | None = None,
) -> pd.Series:
"""
Compute global SHAP values for *model* and derive per-feature importance.
A permutation explainer is instantiated on the fly because it works with
any black box predict_proba** function. The absolute SHAP values are
averaged across all rows, giving a single scalar importance per feature.
Parameters
----------
model : Any
Fitted classifier exposing a predict_proba(X) -> ndarray method whose
second dimension contains probabilities for the positive class.
X : pandas.DataFrame
Numeric feature matrix used both as background data for the explainer
and as the evaluation set whose SHAP values are summarized.
out_dir : pathlib.Path
Directory where the SHAP bar chart (shap_summary.png) will be saved.
top : int, default 20
Maximum number of features to display in the SHAP summary figure.
n_jobs : int
number of parallel processes to run for shap explainer. n_jobs=-1 uses
all cores.
rng : np.random.Generator | None
pseudorandom number generator
Returns
-------
imp : pandas.Series
Index = feature names, values = mean absolute SHAP value (descending).
"""
explainer = shap.Explainer(
model.predict_proba,
masker=X.values,
algorithm="permutation",
n_jobs=n_jobs,
feature_names=X.columns.to_list(),
)
vals = explainer(X.values).values
vals = vals[:, :, 1] if vals.ndim == 3 else vals
imp = pd.Series(np.abs(vals).mean(0), index=X.columns).sort_values(ascending=False)
> shap.summary_plot(vals, features=X, max_display=top, show=False, rng=rng)
E TypeError: summary_legacy() got an unexpected keyword argument 'rng'
neat_ml/model/feature_importance.py:74: TypeError
_____________________________________________________________________________ test_stage_train_model_column_mismatch _____________________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python
tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_train_model_column_0')
sample_data = feature1 feature2 feature3 exclude_col target
0 0.773956 9.085807 A 0 1.0
1 0.438878...06 1.964347 B 98 1.0
99 0.961898 3.103237 B 99 0.0
[100 rows x 5 columns]
caplog = <_pytest.logging.LogCaptureFixture object at 0x39ea10530>
def test_stage_train_model_column_mismatch(
tmp_path: Path, sample_data, caplog
):
caplog.set_level(logging.WARNING)
train_ds = {"id": "TR4"}
train_path = tmp_path / "train.csv"
val_path = tmp_path / "val.csv"
train_paths = {"agg_csv": train_path, "model_dir": tmp_path / "model"}
val_paths = {"agg_csv": val_path}
sample_data.to_csv(train_path, index=False)
val_data = sample_data.drop(columns=["feature1", "exclude_col"])
val_data.to_csv(val_path, index=False)
> wf.stage_train_model(
train_ds,
train_paths,
val_ds={"id": "VAL"},
val_paths=val_paths,
target="target"
)
neat_ml/tests/test_workflow.py:726:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
neat_ml/workflow/lib_workflow.py:407: in stage_train_model
model, metrics, best_params, val_proba = train_with_validation(
neat_ml/model/train.py:224: in train_with_validation
grid_search.fit(X, y)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1053: in fit
self._run_search(evaluate_candidates)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1612: in _run_search
evaluate_candidates(ParameterGrid(self.param_grid))
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1030: in evaluate_candidates
_warn_or_raise_about_fit_failures(out, self.error_score)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
results = [{'fit_error': 'Traceback (most recent call last):\n File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python...fier should be a classifier.\n', 'fit_time': 0.0013880729675292969, 'n_test_samples': 99, 'score_time': 0.0, ...}, ...]
error_score = nan
def _warn_or_raise_about_fit_failures(results, error_score):
fit_errors = [
result["fit_error"] for result in results if result["fit_error"] is not None
]
if fit_errors:
num_failed_fits = len(fit_errors)
num_fits = len(results)
fit_errors_counter = Counter(fit_errors)
delimiter = "-" * 80 + "\n"
fit_errors_summary = "\n".join(
f"{delimiter}{n} fits failed with the following error:\n{error}"
for error, n in fit_errors_counter.items()
)
if num_failed_fits == num_fits:
all_fits_failed_message = (
f"\nAll the {num_fits} fits failed.\n"
"It is very likely that your model is misconfigured.\n"
"You can try to debug the error by setting error_score='raise'.\n\n"
f"Below are more details about the failures:\n{fit_errors_summary}"
)
> raise ValueError(all_fits_failed_message)
E ValueError:
E All the 72 fits failed.
E It is very likely that your model is misconfigured.
E You can try to debug the error by setting error_score='raise'.
E
E Below are more details about the failures:
E --------------------------------------------------------------------------------
E 72 fits failed with the following error:
E Traceback (most recent call last):
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 833, in _fit_and_score
E estimator.fit(X_train, y_train, **fit_params)
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E return fit_method(estimator, *args, **kwargs)
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py", line 621, in fit
E self._final_estimator.fit(Xt, y, **last_step_params["fit"])
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E return fit_method(estimator, *args, **kwargs)
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 405, in fit
E return super().fit(X, transformed_y, **fit_params)
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 80, in fit
E names, clfs = self._validate_estimators()
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py", line 243, in _validate_estimators
E raise ValueError(
E ValueError: The estimator XGBClassifier should be a classifier.
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:479: ValueError
--------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------
WARNING neat_ml.workflow.lib_workflow:lib_workflow.py:396 Feature mismatch: using 1common features (train=Index(['feature1', 'feature2', 'exclude_col'], dtype='object'), val=Index(['feature2'], dtype='object')).
_____________________________________________________________________ test_stage_train_model_happy_path_saves_bundle_and_roc _____________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python
tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_train_model_happy_p0')
sample_data = feature1 feature2 feature3 exclude_col target
0 0.773956 9.085807 A 0 1.0
1 0.438878...06 1.964347 B 98 1.0
99 0.961898 3.103237 B 99 0.0
[100 rows x 5 columns]
def test_stage_train_model_happy_path_saves_bundle_and_roc(
tmp_path: Path,
sample_data,
):
train_ds = {"id": "TR5"}
train_paths = {"agg_csv": tmp_path / "train.csv", "model_dir": tmp_path / "model"}
val_paths = {"agg_csv": tmp_path / "val.csv"}
sample_data.to_csv(val_paths["agg_csv"], index=False)
sample_data.to_csv(train_paths["agg_csv"], index=False)
> wf.stage_train_model(
train_ds,
train_paths,
val_ds={"id": "VAL"},
val_paths=val_paths,
target="target"
)
neat_ml/tests/test_workflow.py:747:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
neat_ml/workflow/lib_workflow.py:407: in stage_train_model
model, metrics, best_params, val_proba = train_with_validation(
neat_ml/model/train.py:224: in train_with_validation
grid_search.fit(X, y)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
return fit_method(estimator, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1053: in fit
self._run_search(evaluate_candidates)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1612: in _run_search
evaluate_candidates(ParameterGrid(self.param_grid))
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1030: in evaluate_candidates
_warn_or_raise_about_fit_failures(out, self.error_score)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
results = [{'fit_error': 'Traceback (most recent call last):\n File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python...assifier should be a classifier.\n', 'fit_time': 0.000946044921875, 'n_test_samples': 99, 'score_time': 0.0, ...}, ...]
error_score = nan
def _warn_or_raise_about_fit_failures(results, error_score):
fit_errors = [
result["fit_error"] for result in results if result["fit_error"] is not None
]
if fit_errors:
num_failed_fits = len(fit_errors)
num_fits = len(results)
fit_errors_counter = Counter(fit_errors)
delimiter = "-" * 80 + "\n"
fit_errors_summary = "\n".join(
f"{delimiter}{n} fits failed with the following error:\n{error}"
for error, n in fit_errors_counter.items()
)
if num_failed_fits == num_fits:
all_fits_failed_message = (
f"\nAll the {num_fits} fits failed.\n"
"It is very likely that your model is misconfigured.\n"
"You can try to debug the error by setting error_score='raise'.\n\n"
f"Below are more details about the failures:\n{fit_errors_summary}"
)
> raise ValueError(all_fits_failed_message)
E ValueError:
E All the 72 fits failed.
E It is very likely that your model is misconfigured.
E You can try to debug the error by setting error_score='raise'.
E
E Below are more details about the failures:
E --------------------------------------------------------------------------------
E 72 fits failed with the following error:
E Traceback (most recent call last):
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 833, in _fit_and_score
E estimator.fit(X_train, y_train, **fit_params)
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E return fit_method(estimator, *args, **kwargs)
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py", line 621, in fit
E self._final_estimator.fit(Xt, y, **last_step_params["fit"])
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E return fit_method(estimator, *args, **kwargs)
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 405, in fit
E return super().fit(X, transformed_y, **fit_params)
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 80, in fit
E names, clfs = self._validate_estimators()
E ^^^^^^^^^^^^^^^^^^^^^^^^^^^
E File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py", line 243, in _validate_estimators
E raise ValueError(
E ValueError: The estimator XGBClassifier should be a classifier.
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:479: ValueError
__________________________________________________________________ test_stage_explain_aligns_features_and_calls_compare_methods __________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python
tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_explain_aligns_feat0')
sample_inference_data = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/infer0/inference_data.csv')
trained_model_bundle = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/model0/model.joblib')
def test_stage_explain_aligns_features_and_calls_compare_methods(
tmp_path: Path,
sample_inference_data,
trained_model_bundle,
):
explain_out = tmp_path / "explain_out"
train_ds = {"id": "TRX", "composition_cols": ["PEG"]}
paths = {"agg_csv": sample_inference_data, "explain_dir": explain_out}
> wf.stage_explain(train_ds, paths, trained_model_bundle, target="ground_truth")
neat_ml/tests/test_workflow.py:766:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
neat_ml/workflow/lib_workflow.py:479: in stage_explain
compare_methods(
neat_ml/model/feature_importance.py:335: in compare_methods
shap_imp = _run_shap(model, X, out_dir, top=top, rng=rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
model = Pipeline(steps=[('impute', SimpleImputer(strategy='median')),
('scale', StandardScaler()),
('clf', LogisticRegression(random_state=42))])
X = feat_a feat_b
0 0.682352 0.0
1 0.053821 1.0
2 0.220360 2.0
3 0.184372 3.0
4 0.175906 ...173632 44.0
45 0.312742 45.0
46 0.014474 46.0
47 0.032552 47.0
48 0.496702 48.0
49 0.468313 49.0
out_dir = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_explain_aligns_feat0/explain_out'), top = 20, n_jobs = -1
rng = None
def _run_shap(
model, X: pd.DataFrame,
out_dir: Path,
top: int = 20,
n_jobs: int = -1,
rng: np.random.Generator | None = None,
) -> pd.Series:
"""
Compute global SHAP values for *model* and derive per-feature importance.
A permutation explainer is instantiated on the fly because it works with
any black box predict_proba** function. The absolute SHAP values are
averaged across all rows, giving a single scalar importance per feature.
Parameters
----------
model : Any
Fitted classifier exposing a predict_proba(X) -> ndarray method whose
second dimension contains probabilities for the positive class.
X : pandas.DataFrame
Numeric feature matrix used both as background data for the explainer
and as the evaluation set whose SHAP values are summarized.
out_dir : pathlib.Path
Directory where the SHAP bar chart (shap_summary.png) will be saved.
top : int, default 20
Maximum number of features to display in the SHAP summary figure.
n_jobs : int
number of parallel processes to run for shap explainer. n_jobs=-1 uses
all cores.
rng : np.random.Generator | None
pseudorandom number generator
Returns
-------
imp : pandas.Series
Index = feature names, values = mean absolute SHAP value (descending).
"""
explainer = shap.Explainer(
model.predict_proba,
masker=X.values,
algorithm="permutation",
n_jobs=n_jobs,
feature_names=X.columns.to_list(),
)
vals = explainer(X.values).values
vals = vals[:, :, 1] if vals.ndim == 3 else vals
imp = pd.Series(np.abs(vals).mean(0), index=X.columns).sort_values(ascending=False)
> shap.summary_plot(vals, features=X, max_display=top, show=False, rng=rng)
E TypeError: summary_legacy() got an unexpected keyword argument 'rng'
neat_ml/model/feature_importance.py:74: TypeError
======================================================================================== warnings summary ========================================================================================
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/shap/plots/colors/_colorconv.py:819: 7272 warnings
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/shap/plots/colors/_colorconv.py:819: DeprecationWarning: Converting `np.inexact` or `np.floating` to a dtype is deprecated. The current result is `float64` which is not strictly correct.
if np.issubdtype(dtype_in, np.dtype(dtype).type):
neat_ml/tests/test_analysis.py: 30 warnings
/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/analysis/data_analysis.py:410: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
nbr_dists = np.fromiter((d["distance"] for _, _, d in
neat_ml/tests/test_detection.py::test_detect_single_image_no_blobs
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/numpy/lib/_nanfunctions_impl.py:1231: RuntimeWarning: Mean of empty slice
return np.nanmean(a, axis, out=out, keepdims=keepdims)
neat_ml/tests/test_lib.py: 15 warnings
neat_ml/tests/test_feature_importance.py: 719 warnings
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but RandomForestClassifier was fitted with feature names
warnings.warn(
neat_ml/tests/test_bubblesam.py: 89 warnings
neat_ml/tests/test_workflow.py: 59 warnings
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/torch/jit/_script.py:1480: DeprecationWarning: `torch.jit.script` is deprecated. Please switch to `torch.compile` or `torch.export`.
warnings.warn(
neat_ml/tests/test_detection.py::test_detect_single_image_processed
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[opencv-None-paths2-bubble_data]
neat_ml/tests/test_detection.py::test_visual_regression_debug_overlay
neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds1-paths1-exp_columns1]
/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/opencv/detection.py:70: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
columns=["bubble_number", "center", "radius", "area", "bbox"]).fillna(np.nan)
neat_ml/tests/test_workflow.py: 5049 warnings
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but SimpleImputer was fitted with feature names
warnings.warn(
neat_ml/tests/test_bubblesam.py::test_sam_internal_api[mps]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[cpu]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-cpu-paths0-masks_filtered]
neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds0-paths0-exp_columns0]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[cpu]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[mps]
neat_ml/tests/test_bubblesam.py::test_sam_internal_api[cpu]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-gpu-paths1-masks_filtered]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[gpu]
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/sam2_image_predictor.py:431: UserWarning: cannot import name '_C' from 'sam2' (/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/__init__.py)
Skipping the post-processing step due to the error above. You can still use SAM 2 and it's OK to ignore the error above, although some post-processing functionality may be limited (which doesn't affect the results in most cases; see https://github.com/facebookresearch/sam2/blob/main/INSTALL.md).
masks = self._transforms.postprocess_masks(
neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: ConvergenceWarning: Number of distinct clusters (1) found smaller than n_clusters (2). Possibly due to duplicate points in X.
return fit_method(estimator, *args, **kwargs)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================================== short test summary info =====================================================================================
FAILED neat_ml/tests/test_train.py::test_train_with_validation - ValueError: The estimator XGBClassifier should be a classifier.
FAILED neat_ml/tests/test_train.py::test_save_model_bundle - ValueError: The estimator XGBClassifier should be a classifier.
FAILED neat_ml/tests/test_feature_importance.py::test_compare_methods_end_to_end - TypeError: summary_legacy() got an unexpected keyword argument 'rng'
FAILED neat_ml/tests/test_workflow.py::test_stage_train_model_column_mismatch - ValueError:
FAILED neat_ml/tests/test_workflow.py::test_stage_train_model_happy_path_saves_bundle_and_roc - ValueError:
FAILED neat_ml/tests/test_workflow.py::test_stage_explain_aligns_features_and_calls_compare_methods - TypeError: summary_legacy() got an unexpected keyword argument 'rng'
=================================================================== 6 failed, 166 passed, 2 skipped, 13249 warnings in 56.22s ====================================================================
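One defensive pattern for the `TypeError: summary_legacy() got an unexpected keyword argument 'rng'` failure above (not the project's actual code, and not a substitute for fixing version pins): probe the callee's signature and only forward keyword arguments it accepts, so a newer-only option like shap's `rng` degrades gracefully on older releases. The stub below stands in for `shap.summary_plot`; all names are illustrative.

```python
import inspect


def filter_supported_kwargs(func, **kwargs):
    """Drop keyword args that *func* does not accept, so newer-only
    options (like the ``rng`` kwarg shap gained upstream) are silently
    omitted on older releases instead of raising TypeError."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return kwargs  # func already accepts arbitrary **kwargs
    return {k: v for k, v in kwargs.items() if k in params}


# Toy stand-in mimicking a legacy summary-plot signature without ``rng``.
def summary_legacy_stub(vals, features=None, max_display=20, show=True):
    return {"show": show, "max_display": max_display}


kwargs = filter_supported_kwargs(summary_legacy_stub, show=False, rng=123)
summary_legacy_stub(None, **kwargs)  # no TypeError: ``rng`` was dropped
```

The cleaner long-term fix is still a minimum-version pin on shap, but this keeps a single code path working across the boundary.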
```diff
 on:
   push:
-    branches: [ main ]
+    branches: [ main, nidhin_data_analysis_backup ]
```
I think you only need the temporary pull_request modification below, since we don't plan to merge into the non-main branch here.
```yaml
      if: runner.os == 'macOS'
      run: |
        echo "Limiting OpenMP to 1 thread for macOS performance"
        echo "OMP_NUM_THREADS=1" >> $GITHUB_ENV
```
There is not sufficient detail here to motivate the need for this shim, and you should try to avoid burdening the reviewer with having to go fishing for the related information.
Even if you explained this somewhere else, the most helpful course of action tends to be to help the reader out with a clear and concise comment in your CI configuration that explains exactly why this is needed--which library is affected? Is it an upstream bug?
Why are we doing this instead of using a canonical Python-level tool like threadpoolctl, which helps limit the number of threads used in native libraries that manage their own internal threadpool (BLAS and OpenMP implementations)?
I'm not necessarily saying you're wrong here, but you're asking the reviewer to do some heavy lifting to figure out what is going on, which isn't great for clarity/efficiency of reviewer time.
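To make the in-process alternative concrete, a minimal sketch of the env-var route expressed as one documented helper rather than bare YAML; the helper name and call site are assumptions, and threadpoolctl's `threadpool_limits` remains the option that can adjust already-loaded native libraries.

```python
def limit_openmp_threads(env: dict, n: int = 1) -> dict:
    """Record an OpenMP thread cap in an environment mapping, leaving any
    user-provided value alone. Must run before the native library loads;
    threadpoolctl can instead adjust live thread pools after import."""
    env.setdefault("OMP_NUM_THREADS", str(n))
    return env
```

Typical usage would be `limit_openmp_threads(os.environ, 1)` at the top of `conftest.py`, which keeps the rationale comment next to the code instead of buried in CI configuration.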
```python
        ebm_act = tmp_path / "ebm_importance.png"
        ebm_exp = baseline_dir / "ebm_importance_expected.png"
        result = compare_images(ebm_exp, ebm_act, tol=1e-4)  # type: ignore[call-overload]
        assert result is None
```
This test is failing pretty consistently for me locally on ARM Mac, with the traceback below the fold. Tests should be constructed to be reliable--if dependency versions cause issues, that should be cleaned up somehow (a shim in the source code, or erroring out on unsupported versions of the dep). If something is missing a random seed, it should be pinned, etc.
Details
_______________________________________________________________________________ test_compare_methods_end_to_end _________________________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python
tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-15/popen-gw2/test_compare_methods_end_to_en0')
classification_dataset = ( PEO 10 kg/mol (wt%) ... graph_num_components
0 2.334594 ... 2.416491
1 -2.1...18
[10 rows x 5 columns], 0 0
1 1
2 0
3 0
4 1
5 0
6 1
7 0
8 1
9 1
Name: y, dtype: int64)
stable_rc = {'axes.labelsize': 10, 'axes.linewidth': 1.0, 'axes.titlesize': 12, 'figure.dpi': 100, ...}, baseline_dir = PosixPath('/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/tests/baseline')
def test_compare_methods_end_to_end(
tmp_path: Path,
classification_dataset: tuple[pd.DataFrame, pd.Series],
stable_rc,
baseline_dir,
):
"""
End-to-end test of compare_methods.
Test consistency of mean rank of important features
PNG compared via inline NumPy RMS diff.
"""
rng = np.random.default_rng(0)
X, y = classification_dataset
# "preprocess" dataset to remove composition columns
X = X.drop(columns=["PEO 10 kg/mol (wt%)", "Dextran 10 kg/mol (wt%)"])
model = RandomForestClassifier(random_state=0).fit(X, y)
with mpl.rc_context(stable_rc):
fi.compare_methods(model, X, y, out_dir=tmp_path, top=3, rng=rng)
actual_csv_path = tmp_path / "feature_importance_comparison.csv"
actual_df = pd.read_csv(actual_csv_path, index_col=0)
# SHAP importance values fluctuate on the order of 1e-2 floating
# point precision between calls, so check that the mean ranking of
# the feature importance values is preserved.
assert_allclose(actual_df["mean_rank"], [1.3333333333333333, 2.0, 2.6666666666666665])
# check the output of ebm importance ranking.
# for the same reason that SHAP values are difficult to compare,
# the SHAP plot and FIC plots also fluctuate between runs,
# by a floating point value big enough to make image comparison difficult.
ebm_act = tmp_path / "ebm_importance.png"
ebm_exp = baseline_dir / "ebm_importance_expected.png"
result = compare_images(ebm_exp, ebm_act, tol=1e-4) # type: ignore[call-overload]
> assert result is None
E AssertionError: assert 'Error: Image files did not match.\n RMS Value: 9.395480684439018\n Expected: \n /Users/treddy/LANL/gitlab/ldrd_...f-treddy/pytest-15/popen-gw2/test_compare_methods_end_to_en0/ebm_importance-failed-diff.png\n Tolerance: \n 0.0001' is None
neat_ml/tests/test_feature_importance.py:171: AssertionError
======================================================================================== warnings summary ========================================================================================
neat_ml/tests/test_analysis.py: 30 warnings
/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/analysis/data_analysis.py:410: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
nbr_dists = np.fromiter((d["distance"] for _, _, d in
neat_ml/tests/test_detection.py::test_detect_single_image_no_blobs
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/numpy/lib/_nanfunctions_impl.py:1241: RuntimeWarning: Mean of empty slice
return np.nanmean(a, axis, out=out, keepdims=keepdims)
neat_ml/tests/test_lib.py: 15 warnings
neat_ml/tests/test_feature_importance.py: 729 warnings
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but RandomForestClassifier was fitted with feature names
warnings.warn(
neat_ml/tests/test_bubblesam.py: 89 warnings
neat_ml/tests/test_workflow.py: 59 warnings
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/torch/jit/_script.py:1480: DeprecationWarning: `torch.jit.script` is deprecated. Please switch to `torch.compile` or `torch.export`.
warnings.warn(
neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds0-paths0-exp_columns0]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[cpu]
neat_ml/tests/test_bubblesam.py::test_sam_internal_api[mps]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-cpu-paths0-masks_filtered]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-gpu-paths1-masks_filtered]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[mps]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[cpu]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[gpu]
neat_ml/tests/test_bubblesam.py::test_sam_internal_api[cpu]
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/sam2_image_predictor.py:431: UserWarning: cannot import name '_C' from 'sam2' (/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/__init__.py)
Skipping the post-processing step due to the error above. You can still use SAM 2 and it's OK to ignore the error above, although some post-processing functionality may be limited (which doesn't affect the results in most cases; see https://github.com/facebookresearch/sam2/blob/main/INSTALL.md).
masks = self._transforms.postprocess_masks(
neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds0-paths0-exp_columns0]
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/joblib/memory.py:607: UserWarning: Persisting input arguments took 1.23s to run.If this happens often in your code, it can cause performance problems (results will be correct in all cases). The reason for this is probably some large input arguments for a wrapped function.
return self._cached_call(args, kwargs, shelving=False)[0]
neat_ml/tests/test_workflow.py::test_stage_train_model_column_mismatch
neat_ml/tests/test_workflow.py::test_stage_train_model_happy_path_saves_bundle_and_roc
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2684: UserWarning: X has feature names, but SimpleImputer was fitted without feature names
warnings.warn(
neat_ml/tests/test_workflow.py: 5099 warnings
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but SimpleImputer was fitted with feature names
warnings.warn(
neat_ml/tests/test_workflow.py::test_stage_explain_aligns_features_and_calls_compare_methods
/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/model/feature_importance.py:74: FutureWarning: The NumPy global RNG was seeded by calling `np.random.seed`. In a future version this function will no longer use the global RNG. Pass `rng` explicitly to opt-in to the new behaviour and silence this warning.
shap.summary_plot(vals, features=X, max_display=top, show=False, rng=rng)
neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: ConvergenceWarning: Number of distinct clusters (1) found smaller than n_clusters (2). Possibly due to duplicate points in X.
return fit_method(estimator, *args, **kwargs)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================================== short test summary info =====================================================================================
FAILED neat_ml/tests/test_feature_importance.py::test_compare_methods_end_to_end - AssertionError: assert 'Error: Image files did not match.\n RMS Value: 9.395480684439018\n Expected: \n /Users/treddy/LANL/gitlab/ldrd_...f-treddy/pytest-15/popen-gw2/test_compare_met...
==================================================================== 1 failed, 171 passed, 2 skipped, 6037 warnings in 46.57s ====================================================================
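On the "if something is missing a random seed it should be pinned" point, a minimal illustration of the pattern (stdlib `random` standing in for NumPy's `np.random.default_rng(seed)`): route every stochastic piece of a test through one explicitly seeded generator so repeated runs produce identical artifacts. This does not by itself fix SHAP's cross-platform floating-point drift, but it removes seeding as a source of baseline-image churn.

```python
import random


def seeded_draws(seed: int, n: int = 5) -> list[float]:
    """Return *n* pseudo-random floats from a generator seeded with *seed*.
    Reruns with the same seed are bit-identical; NumPy code would use
    ``np.random.default_rng(seed)`` the same way."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]
```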
```python
def _run_shap(
    model, X: pd.DataFrame,
```
Don't know why this was marked as resolved since it isn't; I'll reopen it...
It would be good if resolutions were checked and explained.
```python
        second dimension contains probabilities for the positive class.
    X : pandas.DataFrame
        Numeric feature matrix used both as background data for the explainer
        and as the evaluation set whose SHAP values are summarized.
```
This description is confusing. Is X the design matrix used for training the estimators or something else? Not clear.
What is an "evaluation set"? Is that different from training data? Often we use feature importance techniques on the training data, but I'm finding this description not particularly clear...
```python
    train_dataset_config: dict[str, Any],
    paths: dict[str, Path],
    model_path: Path,
    target: str = "Phase_Separation",
```
no way to set number of top features to use from this public function?
```python
    infer_dataset_config: dict[str, Any],
    paths: dict[str, Path],
    model_path: Path,
    steps: list[str],
```
Does this take any str or just a Literal of a few possible string options?
```python
        The path to the trained model file.
    steps : list[str]
        A list of active workflow steps to determine
        whether to run inference, plotting, or both.
```
There's only a small finite/literal set of string options here, right?
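A sketch of what constraining `steps` could look like with `typing.Literal` instead of a bare `list[str]`. The stage names below are inferred from the README excerpt (`train`, `infer`, `explain`, `plot`) and may not match the actual workflow's set; the validator is a hypothetical helper, not the project's code.

```python
from typing import Literal, get_args

# Assumed stage names; the real workflow may define a different set.
Step = Literal["train", "infer", "explain", "plot"]


def validate_steps(steps: list[str]) -> list[str]:
    """Reject unknown stage names early instead of silently ignoring them."""
    allowed = set(get_args(Step))
    unknown = [s for s in steps if s not in allowed]
    if unknown:
        raise ValueError(f"unknown workflow steps: {unknown}")
    return steps
```

With the `Literal` alias in the signature, mypy also flags typos like `"trian"` at the call site rather than at runtime.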
Detection and analysis must be run for every dataset to be used for training, validation, and inference. For the `train`, `infer`, `explain`, and `plot` steps, a separate `dataset: -id:` entry must be used for each input dataset, with the appropriate `role` for each (`train`, `val`, or `infer`). Paths for saving the model and the training/inference results can be set with `root: model` and `root: results` respectively, and `inference_model` can be set to explicitly provide the path to the trained model when performing inference separately from training.

The user can also control whether machine learning classifier hyperparameter optimization is performed via exhaustive grid search by setting `ml_hyper_opt` to True or False (the default is True if the parameter is not specified).
it might be sensible to let them control estimator concurrency
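One way the suggestion could look in the config layer; the key name `n_jobs` and the flat config shape are assumptions, not the project's actual schema.

```python
from typing import Any


def resolve_n_jobs(cfg: dict[str, Any]) -> int:
    """Read estimator concurrency from the workflow config, defaulting to
    all cores (-1), and reject values scikit-learn would not accept."""
    val = int(cfg.get("n_jobs", -1))
    if val == 0 or val < -1:
        raise ValueError("n_jobs must be -1 or a positive integer")
    return val
```

The resolved value could then be threaded through to `RandomForestClassifier(n_jobs=...)` and the SHAP explainer so one setting governs all parallel work.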
```python
    val_ds = val_list[0]
    train_id = train_ds.get("id")
    trained_model = Path(model_path) / f"{train_id}_model.joblib"
    if not trained_model.exists():
```
Weird--in the stage_train_model module the comment says `# check to see if the model path already exists, if so, skip re-training`; which one is correct?
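A hypothetical way to reconcile the two contradictory behaviours: one helper that owns the skip-retraining decision, shared by both call sites so they cannot drift apart. The `force` flag is an assumed escape hatch, not existing project code.

```python
from pathlib import Path


def needs_training(model_file: Path, force: bool = False) -> bool:
    """Single source of truth for the (re)train decision: train when no
    saved bundle exists yet, or when the user explicitly forces it."""
    return force or not model_file.exists()
```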
Also, for this PR, gh-4 (as emphasized at #4 (comment)) and elsewhere, please disclose any scenarios where AI was used to write test cases (or anything else). There is a general "feeling" of the tests being verbose and repetitive instead of being crafted with care. It may also be that Nidhin did that initially, or it may be a symptom of rushed copy-pasting (neither great)--either way, the quality control has been quite time-consuming.
This is a work-in-progress branch of PR #5 that has been rebased against PR #4 to show only the changes relevant to this branch (as opposed to all of the changes from all previous PRs). It is intended to be used for addressing review comments (both from PR #4 and any new comments made here).