WIP, ENH: Add ML training and inference #28

Open
adamwitmer wants to merge 20 commits into nidhin_data_analysis_backup from nidhin_train_inference_rebase

Conversation

@adamwitmer
Collaborator

This is a work-in-progress branch of PR #5 that has been rebased against PR #4 to show only the changes relevant to this branch (as opposed to all of the changes from all previous PRs). It is intended to be used for addressing review comments (both those from PR #4 and any new comments made here).

Nidhin Thomas and others added 6 commits March 17, 2026 09:07
Added neat_ml/model for ML
Added neat_ml/utils for plotting
Added neat_ml/phase_diagram for plotting phase diagram
Added test scripts and baseline images
Modified run_workflow.py for training, inference and
feature importance
Updated the test_workflow.py to incorporate new tests
Updated README.md with command-line examples for train,
infer, and plot.

Removed extraneous code from test scripts
Modified baseline images to ensure the tolerance
is within 1e-4
Added ci.yml
Added LANL copyright assertion ID
* add test file assets
* fix mypy error
@adamwitmer adamwitmer changed the title WIP: Nidhin train inference rebase WIP, ENH: Add ML training and inference Mar 17, 2026
@tylerjereddy
Collaborator

@adamwitmer let me know when this is ready for review -- I believe it was to be presented today (March 20th) after a delay of four months. I gave you a detailed review of gh-4 after you at least did a few things for the ASC polymer project (it does still seem a bit sluggish over there).

I'll review this roughly 3 days after it is presented for review, assuming that duties on the ASC project are kept up.

@adamwitmer
Collaborator Author

adamwitmer commented Mar 22, 2026

Initial TODO items for reviewing this branch:

  • read/understand diff/PR (read line-by-line, probing for weaknesses)
  • check for unnecessary complexity; areas for improvement/simplification
  • rebase branch against main (and push backup branch)
  • fix issues with git lfs (i.e. image storage with zenodo https://github.com/lanl/ldrd_neat_ml_images)
  • fix github CI
  • run test-suite and check test coverage
  • run branch according to README.md instructions
  • copy remaining review comments from gitlab
  • perform detailed code review
  • address all review comments
  • triple check diff
    • 1st check
    • 2nd check
    • 3rd check

Comment thread neat_ml/model/inference.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment on lines +67 to +69
plt.gcf().set_size_inches(8, 6)
plt.tight_layout()
plt.savefig(out_dir / "shap_summary.png", dpi=300)
Collaborator Author


I think we should be able to use fig, ax handles here...

Collaborator Author


I do not think that shap.summary_plot has support for fig, ax handles, per: shap/shap#3411, which can be seen at: https://github.com/shap/shap/blob/93dc2a1e446616fb0858b2ec108f80e4969ba6d9/shap/plots/summary.py#L45. This was an issue in the ldrd_virus_work project, which still uses global plt handles.
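Since shap.summary_plot draws only on pyplot's global state and returns no handles, one common workaround is to capture the implicitly-created figure with plt.gcf() immediately after the call and manage it through an explicit handle from there. A minimal sketch of that pattern, using a hypothetical legacy_summary_plot stand-in (assumed here, not the project's code) so the example is self-contained:

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt


def legacy_summary_plot(values):
    """Stand-in for a routine (like shap.summary_plot) that draws only on
    pyplot's global state and returns no figure/axes handles."""
    plt.barh(range(len(values)), values)


def save_summary(values, out_path, size=(8, 6), dpi=300):
    """Capture the implicitly-created figure with plt.gcf() right after the
    legacy call, then manage it through an explicit handle."""
    legacy_summary_plot(values)
    fig = plt.gcf()              # the figure the legacy routine drew on
    fig.set_size_inches(*size)
    fig.tight_layout()
    fig.savefig(out_path, dpi=dpi)
    plt.close(fig)               # avoid leaking global state between calls
    return fig


out = Path(tempfile.mkdtemp()) / "shap_summary.png"
fig = save_summary([0.42, 0.17, 0.31], out)
print(fig.get_size_inches().tolist())  # → [8.0, 6.0]
```

This does not remove the dependence on global pyplot state inside the plotting routine itself, but it at least confines the use of implicit handles to one line per call site.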

Collaborator


Well, I'm pretty sure I asked for help resolving that upstream for that other project too. In fact, I did here--https://lisdi-git.lanl.gov/treddy/ldrd_virus_work/-/issues/89#note_31565.

And I resolved the matter myself on the rng side at shap/shap#3945. That could have been a good opportunity to help out the community...

Quoting from the internal issue:

My expectation is that the team will clearly communicate which shap issues still remain, and really that you'll help me solve them proactively. That took about a month for a fairly simple patch, so look at your calendar and think about how long larger changes may take.

Collaborator Author


I opened a new issue to track this bug #29.

Collaborator


Please don't self-resolve comments where I've made a request for a change.

This isn't resolved--you've simply opened an issue to delay doing something I requested in November 2024, so the preferable route forward there is clear.

that you'll help me solve them proactively

Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/train.py Outdated
Comment thread neat_ml/tests/conftest.py Outdated
Comment thread neat_ml/tests/test_workflow.py
Comment thread neat_ml/workflow/lib_workflow.py Outdated
Comment thread neat_ml/workflow/lib_workflow.py Outdated
Comment thread neat_ml/workflow/lib_workflow.py Outdated
Comment thread run_workflow.py Outdated
Comment thread neat_ml/model/feature_importance.py Outdated
Comment thread neat_ml/model/train.py Outdated
Comment thread neat_ml/tests/test_workflow.py Outdated
Comment thread neat_ml/tests/test_workflow.py Outdated
Comment thread neat_ml/workflow/lib_workflow.py Outdated
Comment thread neat_ml/workflow/lib_workflow.py Outdated
Comment thread run_workflow.py Outdated
Comment thread run_workflow.py Outdated
@adamwitmer
Collaborator Author

@tylerjereddy I have completed my initial checklist and addressed all review comments, including the initial comments made on PR #5. I made sure the workflow runs on glycan using real data and compared the outputs of running hyperparameter optimization vs. not for opencv and bubblesam (#28 (comment)) in relation to the request at #5 (comment). I re-read the diff several times, looking for unnecessary complexity and other points of emphasis from previous PRs. This branch should be ready for your review, thanks.

@tylerjereddy
Collaborator

I'll make a note to do a first round of review on Friday, April 3, assuming activity is kept up on the ASC polymer project at 2 days effort/week (or if charging was completely stopped there this week for a new project substitution). Otherwise, I'll wait for you to catch up over there.

As discussed in person, presenting this volume of work this close to the deadline places a heavy review burden on the team; that is best avoided by presenting the work progressively, over months of more digestible back and forth.

@tylerjereddy tylerjereddy added the enhancement New feature or request label Apr 3, 2026
Collaborator

@tylerjereddy tylerjereddy left a comment


I've tried to provide a detailed review here despite the extraordinary (months) delay in presenting for review.

The value of hyperopt seems to be basically 0. I also commented in my review that you aren't even using hyperopt on one of your two estimators, so this really is a poor effort on the hyperopt side of things. Probably an issue should be opened, and the source should point to that issue for future improvements in more complex situations where it actually matters. Grid search probably isn't sustainable in more complex situations where a guided/Bayesian search is likely required.

Test line coverage seems fine. However, when running the test suite locally on this PR branch I saw several failures, possibly caused by dependency versions. It would be helpful to support a reasonably wide range of dependency versions and to error out right away at runtime when importing a dependency at a version we do not support, so that what is happening is clear to the user (and the reviewer, who is spending time trying to figure this out...). Test suite error output is below the fold--I'll leave resolution of that to you, and will probably just blindly change dependency versions locally until things work.

Details
============================================================================================ FAILURES ============================================================================================
___________________________________________________________________________________ test_train_with_validation ___________________________________________________________________________________
[gw1] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python

sample_data =     feature1  feature2 feature3  exclude_col  target
0   0.773956  9.085807        A            0     1.0
1   0.438878...06  1.964347        B           98     1.0
99  0.961898  3.103237        B           99     0.0

[100 rows x 5 columns]

    def test_train_with_validation(sample_data: pd.DataFrame):
        X, y = preprocess(sample_data, target="target")
        # perfectly align all the feature data with the target
        X['feature1'] = np.where(
            y == 1.0,
            np.random.uniform(0.6, 1.0, len(X)),
            np.random.uniform(0.0, 0.4, len(X))
        )
        X['feature2'] = np.where(
            y == 1.0,
            np.random.uniform(6, 10, len(X)),
            np.random.uniform(0, 4, len(X))
        )
        X_train, y_train = X.iloc[:80], y.iloc[:80]
        X_val, y_val = X.iloc[80:], y.iloc[80:]
    
>       model, metrics, _, actual_val_proba = train_with_validation(
            X_train, y_train, X_val, y_val, n_jobs=1, ml_hyper_opt=False,
        )

neat_ml/tests/test_train.py:74: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
neat_ml/model/train.py:229: in train_with_validation
    final_model = pipeline.fit(X_train, y_train)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py:621: in fit
    self._final_estimator.fit(Xt, y, **last_step_params["fit"])
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:405: in fit
    return super().fit(X, transformed_y, **fit_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:80: in fit
    names, clfs = self._validate_estimators()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = VotingClassifier(estimators=[('rf',
                              RandomForestClassifier(class_weight='balanced',
    ...ee=None,
                                            random_state=42, ...))],
                 n_jobs=1, voting='soft')

    def _validate_estimators(self):
        if len(self.estimators) == 0 or not all(
            isinstance(item, (tuple, list)) and isinstance(item[0], str)
            for item in self.estimators
        ):
            raise ValueError(
                "Invalid 'estimators' attribute, 'estimators' should be a "
                "non-empty list of (string, estimator) tuples."
            )
        names, estimators = zip(*self.estimators)
        # defined by MetaEstimatorMixin
        self._validate_names(names)
    
        has_estimator = any(est != "drop" for est in estimators)
        if not has_estimator:
            raise ValueError(
                "All estimators are dropped. At least one is required "
                "to be an estimator."
            )
    
        is_estimator_type = is_classifier if is_classifier(self) else is_regressor
    
        for est in estimators:
            if est != "drop" and not is_estimator_type(est):
>               raise ValueError(
                    "The estimator {} should be a {}.".format(
                        est.__class__.__name__, is_estimator_type.__name__[3:]
                    )
                )
E               ValueError: The estimator XGBClassifier should be a classifier.

../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py:243: ValueError
_____________________________________________________________________________________ test_save_model_bundle _____________________________________________________________________________________
[gw1] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python

tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw1/test_save_model_bundle0')
sample_data =     feature1  feature2 feature3  exclude_col  target
0   0.773956  9.085807        A            0     1.0
1   0.438878...06  1.964347        B           98     1.0
99  0.961898  3.103237        B           99     0.0

[100 rows x 5 columns]

    def test_save_model_bundle(tmp_path: Path, sample_data: pd.DataFrame):
        X, y = preprocess(sample_data, target="target")
        X_train, y_train = X.iloc[:80], y.iloc[:80]
        X_val, y_val = X.iloc[80:], y.iloc[80:]
>       expected_model, expected_metrics, expected_params, _ = train_with_validation(
            X_train, y_train, X_val, y_val, n_jobs=1, ml_hyper_opt=False
        )

neat_ml/tests/test_train.py:100: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
neat_ml/model/train.py:229: in train_with_validation
    final_model = pipeline.fit(X_train, y_train)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py:621: in fit
    self._final_estimator.fit(Xt, y, **last_step_params["fit"])
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:405: in fit
    return super().fit(X, transformed_y, **fit_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py:80: in fit
    names, clfs = self._validate_estimators()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = VotingClassifier(estimators=[('rf',
                              RandomForestClassifier(class_weight='balanced',
    ...ee=None,
                                            random_state=42, ...))],
                 n_jobs=1, voting='soft')

    def _validate_estimators(self):
        if len(self.estimators) == 0 or not all(
            isinstance(item, (tuple, list)) and isinstance(item[0], str)
            for item in self.estimators
        ):
            raise ValueError(
                "Invalid 'estimators' attribute, 'estimators' should be a "
                "non-empty list of (string, estimator) tuples."
            )
        names, estimators = zip(*self.estimators)
        # defined by MetaEstimatorMixin
        self._validate_names(names)
    
        has_estimator = any(est != "drop" for est in estimators)
        if not has_estimator:
            raise ValueError(
                "All estimators are dropped. At least one is required "
                "to be an estimator."
            )
    
        is_estimator_type = is_classifier if is_classifier(self) else is_regressor
    
        for est in estimators:
            if est != "drop" and not is_estimator_type(est):
>               raise ValueError(
                    "The estimator {} should be a {}.".format(
                        est.__class__.__name__, is_estimator_type.__name__[3:]
                    )
                )
E               ValueError: The estimator XGBClassifier should be a classifier.

../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py:243: ValueError
________________________________________________________________________________ test_compare_methods_end_to_end _________________________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python

tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_compare_methods_end_to_en0')
classification_dataset = (   PEO 10 kg/mol (wt%)  ...  graph_num_components
0             2.334594  ...              2.416491
1            -2.1...18

[10 rows x 5 columns], 0    0
1    1
2    0
3    0
4    1
5    0
6    1
7    0
8    1
9    1
Name: y, dtype: int64)
stable_rc = {'axes.labelsize': 10, 'axes.linewidth': 1.0, 'axes.titlesize': 12, 'figure.dpi': 100, ...}, baseline_dir = PosixPath('/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/tests/baseline')

    def test_compare_methods_end_to_end(
        tmp_path: Path,
        classification_dataset: tuple[pd.DataFrame, pd.Series],
        stable_rc,
        baseline_dir,
    ):
        """
        End-to-end test of compare_methods.
        Test consistency of mean rank of important features
        PNG compared via inline NumPy RMS diff.
        """
        rng = np.random.default_rng(0)
        X, y = classification_dataset
        # "preprocess" dataset to remove composition columns
        X = X.drop(columns=["PEO 10 kg/mol (wt%)", "Dextran 10 kg/mol (wt%)"])
        model = RandomForestClassifier(random_state=0).fit(X, y)
    
        with mpl.rc_context(stable_rc):
>           fi.compare_methods(model, X, y, out_dir=tmp_path, top=3, rng=rng)

neat_ml/tests/test_feature_importance.py:154: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
neat_ml/model/feature_importance.py:335: in compare_methods
    shap_imp = _run_shap(model, X, out_dir, top=top, rng=rng)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

model = RandomForestClassifier(random_state=0)
X =    num_blobs  coverage_percentage  graph_num_components
0  -1.698423             2.336225              2.416491
1   1.....170261
8   2.200476            -2.275953              0.915428
9  -0.377733            -1.104456             -1.824018
out_dir = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_compare_methods_end_to_en0'), top = 3, n_jobs = -1
rng = Generator(PCG64) at 0x3441CFCA0

    def _run_shap(
        model, X: pd.DataFrame,
        out_dir: Path,
        top: int = 20,
        n_jobs: int = -1,
        rng: np.random.Generator | None = None,
    ) -> pd.Series:
        """
        Compute global SHAP values for *model* and derive per-feature importance.
    
        A permutation explainer is instantiated on the fly because it works with
        any black box predict_proba** function.  The absolute SHAP values are
        averaged across all rows, giving a single scalar importance per feature.
    
        Parameters
        ----------
        model : Any
            Fitted classifier exposing a predict_proba(X) -> ndarray method whose
            second dimension contains probabilities for the positive class.
        X : pandas.DataFrame
            Numeric feature matrix used both as background data for the explainer
            and as the evaluation set whose SHAP values are summarized.
        out_dir : pathlib.Path
            Directory where the SHAP bar chart (shap_summary.png) will be saved.
        top : int, default 20
            Maximum number of features to display in the SHAP summary figure.
        n_jobs : int
            number of parallel processes to run for shap explainer. n_jobs=-1 uses
            all cores.
        rng : np.random.Generator | None
            pseudorandom number generator
    
        Returns
        -------
        imp : pandas.Series
            Index = feature names, values = mean absolute SHAP value (descending).
        """
        explainer = shap.Explainer(
            model.predict_proba,
            masker=X.values,
            algorithm="permutation",
            n_jobs=n_jobs,
            feature_names=X.columns.to_list(),
        )
        vals = explainer(X.values).values
        vals = vals[:, :, 1] if vals.ndim == 3 else vals
        imp = pd.Series(np.abs(vals).mean(0), index=X.columns).sort_values(ascending=False)
    
>       shap.summary_plot(vals, features=X, max_display=top, show=False, rng=rng)
E       TypeError: summary_legacy() got an unexpected keyword argument 'rng'

neat_ml/model/feature_importance.py:74: TypeError
_____________________________________________________________________________ test_stage_train_model_column_mismatch _____________________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python

tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_train_model_column_0')
sample_data =     feature1  feature2 feature3  exclude_col  target
0   0.773956  9.085807        A            0     1.0
1   0.438878...06  1.964347        B           98     1.0
99  0.961898  3.103237        B           99     0.0

[100 rows x 5 columns]
caplog = <_pytest.logging.LogCaptureFixture object at 0x39ea10530>

    def test_stage_train_model_column_mismatch(
        tmp_path: Path, sample_data, caplog
    ):
        caplog.set_level(logging.WARNING)
        train_ds = {"id": "TR4"}
        train_path = tmp_path / "train.csv"
        val_path = tmp_path / "val.csv"
        train_paths = {"agg_csv": train_path, "model_dir": tmp_path / "model"}
        val_paths = {"agg_csv": val_path}
        sample_data.to_csv(train_path, index=False)
        val_data = sample_data.drop(columns=["feature1", "exclude_col"])
        val_data.to_csv(val_path, index=False)
    
>       wf.stage_train_model(
            train_ds,
            train_paths,
            val_ds={"id": "VAL"},
            val_paths=val_paths,
            target="target"
        )

neat_ml/tests/test_workflow.py:726: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
neat_ml/workflow/lib_workflow.py:407: in stage_train_model
    model, metrics, best_params, val_proba = train_with_validation(
neat_ml/model/train.py:224: in train_with_validation
    grid_search.fit(X, y)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1053: in fit
    self._run_search(evaluate_candidates)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1612: in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1030: in evaluate_candidates
    _warn_or_raise_about_fit_failures(out, self.error_score)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

results = [{'fit_error': 'Traceback (most recent call last):\n  File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python...fier should be a classifier.\n', 'fit_time': 0.0013880729675292969, 'n_test_samples': 99, 'score_time': 0.0, ...}, ...]
error_score = nan

    def _warn_or_raise_about_fit_failures(results, error_score):
        fit_errors = [
            result["fit_error"] for result in results if result["fit_error"] is not None
        ]
        if fit_errors:
            num_failed_fits = len(fit_errors)
            num_fits = len(results)
            fit_errors_counter = Counter(fit_errors)
            delimiter = "-" * 80 + "\n"
            fit_errors_summary = "\n".join(
                f"{delimiter}{n} fits failed with the following error:\n{error}"
                for error, n in fit_errors_counter.items()
            )
    
            if num_failed_fits == num_fits:
                all_fits_failed_message = (
                    f"\nAll the {num_fits} fits failed.\n"
                    "It is very likely that your model is misconfigured.\n"
                    "You can try to debug the error by setting error_score='raise'.\n\n"
                    f"Below are more details about the failures:\n{fit_errors_summary}"
                )
>               raise ValueError(all_fits_failed_message)
E               ValueError: 
E               All the 72 fits failed.
E               It is very likely that your model is misconfigured.
E               You can try to debug the error by setting error_score='raise'.
E               
E               Below are more details about the failures:
E               --------------------------------------------------------------------------------
E               72 fits failed with the following error:
E               Traceback (most recent call last):
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 833, in _fit_and_score
E                   estimator.fit(X_train, y_train, **fit_params)
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E                   return fit_method(estimator, *args, **kwargs)
E                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py", line 621, in fit
E                   self._final_estimator.fit(Xt, y, **last_step_params["fit"])
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E                   return fit_method(estimator, *args, **kwargs)
E                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 405, in fit
E                   return super().fit(X, transformed_y, **fit_params)
E                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 80, in fit
E                   names, clfs = self._validate_estimators()
E                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py", line 243, in _validate_estimators
E                   raise ValueError(
E               ValueError: The estimator XGBClassifier should be a classifier.

../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:479: ValueError
--------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------
WARNING  neat_ml.workflow.lib_workflow:lib_workflow.py:396 Feature mismatch: using 1common features (train=Index(['feature1', 'feature2', 'exclude_col'], dtype='object'), val=Index(['feature2'], dtype='object')).
_____________________________________________________________________ test_stage_train_model_happy_path_saves_bundle_and_roc _____________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python

tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_train_model_happy_p0')
sample_data =     feature1  feature2 feature3  exclude_col  target
0   0.773956  9.085807        A            0     1.0
1   0.438878...06  1.964347        B           98     1.0
99  0.961898  3.103237        B           99     0.0

[100 rows x 5 columns]

    def test_stage_train_model_happy_path_saves_bundle_and_roc(
        tmp_path: Path,
        sample_data,
    ):
    
        train_ds = {"id": "TR5"}
        train_paths = {"agg_csv": tmp_path / "train.csv", "model_dir": tmp_path / "model"}
        val_paths = {"agg_csv": tmp_path / "val.csv"}
        sample_data.to_csv(val_paths["agg_csv"], index=False)
        sample_data.to_csv(train_paths["agg_csv"], index=False)
    
>       wf.stage_train_model(
            train_ds,
            train_paths,
            val_ds={"id": "VAL"},
            val_paths=val_paths,
            target="target"
        )

neat_ml/tests/test_workflow.py:747: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
neat_ml/workflow/lib_workflow.py:407: in stage_train_model
    model, metrics, best_params, val_proba = train_with_validation(
neat_ml/model/train.py:224: in train_with_validation
    grid_search.fit(X, y)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1053: in fit
    self._run_search(evaluate_candidates)
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1612: in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_search.py:1030: in evaluate_candidates
    _warn_or_raise_about_fit_failures(out, self.error_score)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

results = [{'fit_error': 'Traceback (most recent call last):\n  File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python...assifier should be a classifier.\n', 'fit_time': 0.000946044921875, 'n_test_samples': 99, 'score_time': 0.0, ...}, ...]
error_score = nan

    def _warn_or_raise_about_fit_failures(results, error_score):
        fit_errors = [
            result["fit_error"] for result in results if result["fit_error"] is not None
        ]
        if fit_errors:
            num_failed_fits = len(fit_errors)
            num_fits = len(results)
            fit_errors_counter = Counter(fit_errors)
            delimiter = "-" * 80 + "\n"
            fit_errors_summary = "\n".join(
                f"{delimiter}{n} fits failed with the following error:\n{error}"
                for error, n in fit_errors_counter.items()
            )
    
            if num_failed_fits == num_fits:
                all_fits_failed_message = (
                    f"\nAll the {num_fits} fits failed.\n"
                    "It is very likely that your model is misconfigured.\n"
                    "You can try to debug the error by setting error_score='raise'.\n\n"
                    f"Below are more details about the failures:\n{fit_errors_summary}"
                )
>               raise ValueError(all_fits_failed_message)
E               ValueError: 
E               All the 72 fits failed.
E               It is very likely that your model is misconfigured.
E               You can try to debug the error by setting error_score='raise'.
E               
E               Below are more details about the failures:
E               --------------------------------------------------------------------------------
E               72 fits failed with the following error:
E               Traceback (most recent call last):
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py", line 833, in _fit_and_score
E                   estimator.fit(X_train, y_train, **fit_params)
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E                   return fit_method(estimator, *args, **kwargs)
E                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/pipeline.py", line 621, in fit
E                   self._final_estimator.fit(Xt, y, **last_step_params["fit"])
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py", line 1336, in wrapper
E                   return fit_method(estimator, *args, **kwargs)
E                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 405, in fit
E                   return super().fit(X, transformed_y, **fit_params)
E                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_voting.py", line 80, in fit
E                   names, clfs = self._validate_estimators()
E                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                 File "/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/ensemble/_base.py", line 243, in _validate_estimators
E                   raise ValueError(
E               ValueError: The estimator XGBClassifier should be a classifier.

../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/model_selection/_validation.py:479: ValueError
__________________________________________________________________ test_stage_explain_aligns_features_and_calls_compare_methods __________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python

tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_explain_aligns_feat0')
sample_inference_data = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/infer0/inference_data.csv')
trained_model_bundle = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/model0/model.joblib')

    def test_stage_explain_aligns_features_and_calls_compare_methods(
        tmp_path: Path,
        sample_inference_data,
        trained_model_bundle,
    ):
        explain_out = tmp_path / "explain_out"
        train_ds = {"id": "TRX", "composition_cols": ["PEG"]}
        paths = {"agg_csv": sample_inference_data, "explain_dir": explain_out}
    
>       wf.stage_explain(train_ds, paths, trained_model_bundle, target="ground_truth")

neat_ml/tests/test_workflow.py:766: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
neat_ml/workflow/lib_workflow.py:479: in stage_explain
    compare_methods(
neat_ml/model/feature_importance.py:335: in compare_methods
    shap_imp = _run_shap(model, X, out_dir, top=top, rng=rng)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

model = Pipeline(steps=[('impute', SimpleImputer(strategy='median')),
                ('scale', StandardScaler()),
                ('clf', LogisticRegression(random_state=42))])
X =       feat_a  feat_b
0   0.682352     0.0
1   0.053821     1.0
2   0.220360     2.0
3   0.184372     3.0
4   0.175906 ...173632    44.0
45  0.312742    45.0
46  0.014474    46.0
47  0.032552    47.0
48  0.496702    48.0
49  0.468313    49.0
out_dir = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-13/popen-gw2/test_stage_explain_aligns_feat0/explain_out'), top = 20, n_jobs = -1
rng = None

    def _run_shap(
        model, X: pd.DataFrame,
        out_dir: Path,
        top: int = 20,
        n_jobs: int = -1,
        rng: np.random.Generator | None = None,
    ) -> pd.Series:
        """
        Compute global SHAP values for *model* and derive per-feature importance.
    
        A permutation explainer is instantiated on the fly because it works with
    any black-box predict_proba function. The absolute SHAP values are
        averaged across all rows, giving a single scalar importance per feature.
    
        Parameters
        ----------
        model : Any
            Fitted classifier exposing a predict_proba(X) -> ndarray method whose
            second dimension contains probabilities for the positive class.
        X : pandas.DataFrame
            Numeric feature matrix used both as background data for the explainer
            and as the evaluation set whose SHAP values are summarized.
        out_dir : pathlib.Path
            Directory where the SHAP bar chart (shap_summary.png) will be saved.
        top : int, default 20
            Maximum number of features to display in the SHAP summary figure.
        n_jobs : int
            number of parallel processes to run for shap explainer. n_jobs=-1 uses
            all cores.
        rng : np.random.Generator | None
            pseudorandom number generator
    
        Returns
        -------
        imp : pandas.Series
            Index = feature names, values = mean absolute SHAP value (descending).
        """
        explainer = shap.Explainer(
            model.predict_proba,
            masker=X.values,
            algorithm="permutation",
            n_jobs=n_jobs,
            feature_names=X.columns.to_list(),
        )
        vals = explainer(X.values).values
        vals = vals[:, :, 1] if vals.ndim == 3 else vals
        imp = pd.Series(np.abs(vals).mean(0), index=X.columns).sort_values(ascending=False)
    
>       shap.summary_plot(vals, features=X, max_display=top, show=False, rng=rng)
E       TypeError: summary_legacy() got an unexpected keyword argument 'rng'

neat_ml/model/feature_importance.py:74: TypeError
======================================================================================== warnings summary ========================================================================================
../../../python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/shap/plots/colors/_colorconv.py:819: 7272 warnings
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/shap/plots/colors/_colorconv.py:819: DeprecationWarning: Converting `np.inexact` or `np.floating` to a dtype is deprecated. The current result is `float64` which is not strictly correct.
    if np.issubdtype(dtype_in, np.dtype(dtype).type):

neat_ml/tests/test_analysis.py: 30 warnings
  /Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/analysis/data_analysis.py:410: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
    nbr_dists = np.fromiter((d["distance"] for _, _, d in

neat_ml/tests/test_detection.py::test_detect_single_image_no_blobs
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/numpy/lib/_nanfunctions_impl.py:1231: RuntimeWarning: Mean of empty slice
    return np.nanmean(a, axis, out=out, keepdims=keepdims)

neat_ml/tests/test_lib.py: 15 warnings
neat_ml/tests/test_feature_importance.py: 719 warnings
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but RandomForestClassifier was fitted with feature names
    warnings.warn(

neat_ml/tests/test_bubblesam.py: 89 warnings
neat_ml/tests/test_workflow.py: 59 warnings
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/torch/jit/_script.py:1480: DeprecationWarning: `torch.jit.script` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

neat_ml/tests/test_detection.py::test_detect_single_image_processed
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[opencv-None-paths2-bubble_data]
neat_ml/tests/test_detection.py::test_visual_regression_debug_overlay
neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds1-paths1-exp_columns1]
  /Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/opencv/detection.py:70: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
    columns=["bubble_number", "center", "radius", "area", "bbox"]).fillna(np.nan)

neat_ml/tests/test_workflow.py: 5049 warnings
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but SimpleImputer was fitted with feature names
    warnings.warn(

neat_ml/tests/test_bubblesam.py::test_sam_internal_api[mps]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[cpu]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-cpu-paths0-masks_filtered]
neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds0-paths0-exp_columns0]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[cpu]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[mps]
neat_ml/tests/test_bubblesam.py::test_sam_internal_api[cpu]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-gpu-paths1-masks_filtered]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[gpu]
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/sam2_image_predictor.py:431: UserWarning: cannot import name '_C' from 'sam2' (/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/__init__.py)
  
  Skipping the post-processing step due to the error above. You can still use SAM 2 and it's OK to ignore the error above, although some post-processing functionality may be limited (which doesn't affect the results in most cases; see https://github.com/facebookresearch/sam2/blob/main/INSTALL.md).
    masks = self._transforms.postprocess_masks(

neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: ConvergenceWarning: Number of distinct clusters (1) found smaller than n_clusters (2). Possibly due to duplicate points in X.
    return fit_method(estimator, *args, **kwargs)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================================== short test summary info =====================================================================================
FAILED neat_ml/tests/test_train.py::test_train_with_validation - ValueError: The estimator XGBClassifier should be a classifier.
FAILED neat_ml/tests/test_train.py::test_save_model_bundle - ValueError: The estimator XGBClassifier should be a classifier.
FAILED neat_ml/tests/test_feature_importance.py::test_compare_methods_end_to_end - TypeError: summary_legacy() got an unexpected keyword argument 'rng'
FAILED neat_ml/tests/test_workflow.py::test_stage_train_model_column_mismatch - ValueError: 
FAILED neat_ml/tests/test_workflow.py::test_stage_train_model_happy_path_saves_bundle_and_roc - ValueError: 
FAILED neat_ml/tests/test_workflow.py::test_stage_explain_aligns_features_and_calls_compare_methods - TypeError: summary_legacy() got an unexpected keyword argument 'rng'
=================================================================== 6 failed, 166 passed, 2 skipped, 13249 warnings in 56.22s ====================================================================

Comment thread .github/workflows/ci.yml
on:
push:
branches: [ main ]
branches: [ main, nidhin_data_analysis_backup ]

I think you only need the temporary pull_request modification below, since we don't plan to merge into the non-main branch here.

Comment thread .github/workflows/ci.yml
if: runner.os == 'macOS'
run: |
  echo "Limiting OpenMP to 1 thread for macOS performance"
  echo "OMP_NUM_THREADS=1" >> $GITHUB_ENV

There is not sufficient detail here to motivate the need for this shim, and you should try to avoid burdening the reviewer with having to go fishing for the related information.

Even if you explained this somewhere else, the most helpful course of action tends to be to help the reader out with a clear and concise comment in your CI configuration that explains exactly why this is needed--which library is affected? Is it an upstream bug?

Why are we doing this instead of using a canonical Python-level tool like threadpoolctl, which helps limit the number of threads used in native libraries that manage their own internal threadpools (BLAS and OpenMP implementations)?

I'm not necessarily saying you're wrong here, but you're asking the reviewer to do some heavy lifting to figure out what is going on, which isn't great for clarity/efficiency of reviewer time.
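
The two approaches the review contrasts could be sketched as follows; this is an illustrative shim, not the repo's actual CI code, and the comment text is an assumption about why the env-var route was chosen:

```python
import os

# Hypothetical conftest.py-style shim: native libraries such as OpenBLAS
# and libomp typically read OMP_NUM_THREADS once, at import time, so an
# environment-variable cap must be in place before sklearn/xgboost load.
os.environ.setdefault("OMP_NUM_THREADS", "1")

# The reviewer's suggested alternative, threadpoolctl, caps threads at
# runtime and can be scoped to a single block, e.g.:
#   from threadpoolctl import threadpool_limits
#   with threadpool_limits(limits=1, user_api="openmp"):
#       grid_search.fit(X, y)
```

Either way, a one-line comment in ci.yml naming the affected library would answer the reviewer's question at the point of use.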

ebm_act = tmp_path / "ebm_importance.png"
ebm_exp = baseline_dir / "ebm_importance_expected.png"
result = compare_images(ebm_exp, ebm_act, tol=1e-4) # type: ignore[call-overload]
assert result is None

This test is failing pretty consistently for me locally on ARM Mac, with the traceback below the fold. Tests should be constructed to be reliable--if dependency versions cause issues, that should be cleaned up somehow (shim in the source code, or error out on unsupported versions of the dep). If something is missing a random seed, it should be pinned, etc.

Details
_______________________________________________________________________________ test_compare_methods_end_to_end _________________________________________________________________________________
[gw2] darwin -- Python 3.12.3 /Users/treddy/python_venvs/py_312_ldrd_neat_dev/bin/python

tmp_path = PosixPath('/private/var/folders/5_/hm0ft57n6dn2ksgg2p0bx5h0000w2g/T/pytest-of-treddy/pytest-15/popen-gw2/test_compare_methods_end_to_en0')
classification_dataset = (   PEO 10 kg/mol (wt%)  ...  graph_num_components
0             2.334594  ...              2.416491
1            -2.1...18

[10 rows x 5 columns], 0    0
1    1
2    0
3    0
4    1
5    0
6    1
7    0
8    1
9    1
Name: y, dtype: int64)
stable_rc = {'axes.labelsize': 10, 'axes.linewidth': 1.0, 'axes.titlesize': 12, 'figure.dpi': 100, ...}, baseline_dir = PosixPath('/Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/tests/baseline')

    def test_compare_methods_end_to_end(
        tmp_path: Path,
        classification_dataset: tuple[pd.DataFrame, pd.Series],
        stable_rc,
        baseline_dir,
    ):
        """
        End-to-end test of compare_methods.
        Test consistency of mean rank of important features
        PNG compared via inline NumPy RMS diff.
        """
        rng = np.random.default_rng(0)
        X, y = classification_dataset
        # "preprocess" dataset to remove composition columns
        X = X.drop(columns=["PEO 10 kg/mol (wt%)", "Dextran 10 kg/mol (wt%)"])
        model = RandomForestClassifier(random_state=0).fit(X, y)
    
        with mpl.rc_context(stable_rc):
            fi.compare_methods(model, X, y, out_dir=tmp_path, top=3, rng=rng)
    
        actual_csv_path = tmp_path / "feature_importance_comparison.csv"
    
        actual_df = pd.read_csv(actual_csv_path, index_col=0)
        # SHAP importance values fluctuate on the order of 1e-2 floating
        # point precision between calls, so check that the mean ranking of
        # the feature importance values is preserved.
        assert_allclose(actual_df["mean_rank"], [1.3333333333333333, 2.0, 2.6666666666666665])
    
        # check the output of ebm importance ranking.
        # for the same reason that SHAP values are difficult to compare,
        # the SHAP plot and FIC plots also fluctuate between runs,
        # by a floating point value big enough to make image comparison difficult.
        ebm_act = tmp_path / "ebm_importance.png"
        ebm_exp = baseline_dir / "ebm_importance_expected.png"
        result = compare_images(ebm_exp, ebm_act, tol=1e-4) # type: ignore[call-overload]
>       assert result is None
E       AssertionError: assert 'Error: Image files did not match.\n  RMS Value: 9.395480684439018\n  Expected:  \n    /Users/treddy/LANL/gitlab/ldrd_...f-treddy/pytest-15/popen-gw2/test_compare_methods_end_to_en0/ebm_importance-failed-diff.png\n  Tolerance: \n    0.0001' is None

neat_ml/tests/test_feature_importance.py:171: AssertionError
======================================================================================== warnings summary ========================================================================================
neat_ml/tests/test_analysis.py: 30 warnings
  /Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/analysis/data_analysis.py:410: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
    nbr_dists = np.fromiter((d["distance"] for _, _, d in

neat_ml/tests/test_detection.py::test_detect_single_image_no_blobs
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/numpy/lib/_nanfunctions_impl.py:1241: RuntimeWarning: Mean of empty slice
    return np.nanmean(a, axis, out=out, keepdims=keepdims)

neat_ml/tests/test_lib.py: 15 warnings
neat_ml/tests/test_feature_importance.py: 729 warnings
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but RandomForestClassifier was fitted with feature names
    warnings.warn(

neat_ml/tests/test_bubblesam.py: 89 warnings
neat_ml/tests/test_workflow.py: 59 warnings
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/torch/jit/_script.py:1480: DeprecationWarning: `torch.jit.script` is deprecated. Please switch to `torch.compile` or `torch.export`.
    warnings.warn(

neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds0-paths0-exp_columns0]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[cpu]
neat_ml/tests/test_bubblesam.py::test_sam_internal_api[mps]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-cpu-paths0-masks_filtered]
neat_ml/tests/test_workflow.py::test_run_workflow_single_image_path[bubblesam-gpu-paths1-masks_filtered]
neat_ml/tests/test_bubblesam.py::test_bubblesam_detection_generates_pngs[mps]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[cpu]
neat_ml/tests/test_bubblesam.py::test_run_bubblesam[gpu]
neat_ml/tests/test_bubblesam.py::test_sam_internal_api[cpu]
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/sam2_image_predictor.py:431: UserWarning: cannot import name '_C' from 'sam2' (/Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sam2/__init__.py)
  
  Skipping the post-processing step due to the error above. You can still use SAM 2 and it's OK to ignore the error above, although some post-processing functionality may be limited (which doesn't affect the results in most cases; see https://github.com/facebookresearch/sam2/blob/main/INSTALL.md).
    masks = self._transforms.postprocess_masks(

neat_ml/tests/test_workflow.py::test_stage_detect_pipeline_runs[ds0-paths0-exp_columns0]
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/joblib/memory.py:607: UserWarning: Persisting input arguments took 1.23s to run.If this happens often in your code, it can cause performance problems (results will be correct in all cases). The reason for this is probably some large input arguments for a wrapped function.
    return self._cached_call(args, kwargs, shelving=False)[0]

neat_ml/tests/test_workflow.py::test_stage_train_model_column_mismatch
neat_ml/tests/test_workflow.py::test_stage_train_model_happy_path_saves_bundle_and_roc
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2684: UserWarning: X has feature names, but SimpleImputer was fitted without feature names
    warnings.warn(

neat_ml/tests/test_workflow.py: 5099 warnings
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/utils/validation.py:2691: UserWarning: X does not have valid feature names, but SimpleImputer was fitted with feature names
    warnings.warn(

neat_ml/tests/test_workflow.py::test_stage_explain_aligns_features_and_calls_compare_methods
  /Users/treddy/LANL/gitlab/ldrd_neat_ml/neat_ml/model/feature_importance.py:74: FutureWarning: The NumPy global RNG was seeded by calling `np.random.seed`. In a future version this function will no longer use the global RNG. Pass `rng` explicitly to opt-in to the new behaviour and silence this warning.
    shap.summary_plot(vals, features=X, max_display=top, show=False, rng=rng)

neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
neat_ml/tests/test_workflow.py::test_stage_run_inference_calls_inference_and_makes_pred_dir
  /Users/treddy/python_venvs/py_312_ldrd_neat_dev/lib/python3.12/site-packages/sklearn/base.py:1336: ConvergenceWarning: Number of distinct clusters (1) found smaller than n_clusters (2). Possibly due to duplicate points in X.
    return fit_method(estimator, *args, **kwargs)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================================== short test summary info =====================================================================================
FAILED neat_ml/tests/test_feature_importance.py::test_compare_methods_end_to_end - AssertionError: assert 'Error: Image files did not match.\n  RMS Value: 9.395480684439018\n  Expected:  \n    /Users/treddy/LANL/gitlab/ldrd_...f-treddy/pytest-15/popen-gw2/test_compare_met...
==================================================================== 1 failed, 171 passed, 2 skipped, 6037 warnings in 46.57s ====================================================================
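
The RMS metric that `compare_images` reports (9.39 here, against a tolerance of 1e-4) can be reproduced directly on pixel arrays, which helps when picking a realistic tolerance; this is an illustrative sketch of the same root-mean-square calculation, not the repo's test code:

```python
import numpy as np

def rms_diff(expected: np.ndarray, actual: np.ndarray) -> float:
    """Root-mean-square pixel difference, the metric matplotlib's
    compare_images reports (computed here on raw arrays)."""
    diff = expected.astype(np.float64) - actual.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

# Two nearly identical synthetic 8-bit "images": one pixel differs by 1.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
b = a.copy()
b[0, 0] ^= 1  # flip the low bit of one pixel (difference of exactly 1)
```

A single off-by-one pixel in a 32x32 image already yields a nonzero RMS of about 0.03, so a tolerance of 1e-4 effectively demands bit-identical output--fragile across BLAS/platform differences unless every source of randomness is pinned.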



def _run_shap(
model, X: pd.DataFrame,

Don't know why this was marked as resolved since it isn't, I'll reopen it...

It would be good if resolutions were checked and explained.

second dimension contains probabilities for the positive class.
X : pandas.DataFrame
Numeric feature matrix used both as background data for the explainer
and as the evaluation set whose SHAP values are summarized.

This description is confusing. Is X the design matrix used for training the estimators or something else? Not clear.

What is an "evaluation set?" Is that different from training data? Often we use feature importance techniques on the training data, but I'm finding this description not particularly clear...

train_dataset_config: dict[str, Any],
paths: dict[str, Path],
model_path: Path,
target: str = "Phase_Separation",

no way to set number of top features to use from this public function?
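
One way to address this would be to surface the knob on the public signature; all names below are illustrative, mirroring the quoted parameters rather than the repo's actual implementation:

```python
from typing import Any

# Hypothetical widening of the public stage signature: expose `top`
# (number of features shown in the importance summaries) instead of
# hard-coding it in the private helpers.
def stage_explain(
    train_dataset_config: dict[str, Any],
    paths: dict[str, Any],
    model_path: Any,
    target: str = "Phase_Separation",
    top: int = 20,
) -> int:
    # a real implementation would forward `top` to compare_methods(...)
    return top
```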

infer_dataset_config: dict[str, Any],
paths: dict[str, Path],
model_path: Path,
steps: list[str],

Does this take any str or just a Literal of a few possible string options?

The path to the trained model file.
steps : list[str]
A list of active workflow steps to determine
whether to run inference, plotting, or both.

There's only a small finite/literal set of string options here, right?
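
The reviewer's point could be expressed with `typing.Literal`, so a type checker rejects typos at call sites; the specific step names here are guesses based on the quoted docstring ("inference, plotting, or both"), not the repo's actual values:

```python
from typing import Literal, get_args

# Hypothetical tightening of the `steps` parameter: a closed set of
# workflow stages instead of arbitrary strings.
Step = Literal["infer", "plot"]

def run_inference_stages(steps: list[Step]) -> list[Step]:
    """Validate the requested stages against the allowed Literal set."""
    valid = set(get_args(Step))
    unknown = [s for s in steps if s not in valid]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    return steps
```

With this shape, mypy flags `run_inference_stages(["plto"])` statically, and the runtime check catches the same mistake for untyped callers.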

Comment thread README.md
Detection and analysis must be run for every dataset to be used for training, validation and inference. For running the `train`, `infer`, `explain` and `plot` steps, a separate `dataset: -id:` must be used for each input dataset with the appropriate `role` for each dataset, i.e. `train`, `val` or `infer`. Paths for saving the model, training/inference results can be set with `root: model` and `root: results` respectively, and `inference_model` can be set to explicitly provide the path to the trained model when performing inference separately from training.

The user can also determine whether or not to perform machine learning classifier hyperparameter optimization via exhaustive grid search by setting the `ml_hyper_opt` to True or False (the default is True if no parameter is specified.)


it might be sensible to let them control estimator concurrency
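
A minimal sketch of what "letting them control estimator concurrency" might look like, assuming a config-dict plumbing similar to the README's other options; `ml_n_jobs` is an invented key, not part of the repo's current schema:

```python
from typing import Any

# Hypothetical: thread a user-facing concurrency knob from the workflow
# config through to the grid search / estimators.
def grid_search_kwargs(config: dict[str, Any]) -> dict[str, int]:
    # default to a single worker so shared CI runners are not
    # oversubscribed; users can raise it locally
    return {"n_jobs": int(config.get("ml_n_jobs", 1))}
```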

Comment thread run_workflow.py
val_ds = val_list[0]
train_id = train_ds.get("id")
trained_model = Path(model_path) / f"{train_id}_model.joblib"
if not trained_model.exists():

weird, in stage_train_model module it says # check to see if the model path already exists, if so, skip re-training; which one is correct?
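
One way to reconcile the two behaviours the review flags is a single guard that either reuses an existing bundle or trains a new one, with the choice made explicit; `train_fn` and the logging are illustrative, not the repo's code:

```python
from pathlib import Path
from typing import Callable

# Hypothetical single source of truth for the "skip re-training" rule:
# reuse the bundle if it exists, otherwise train and write it.
def get_or_train_model(model_path: Path,
                       train_fn: Callable[[Path], None]) -> Path:
    if model_path.exists():
        print(f"reusing existing model bundle: {model_path}")
        return model_path
    model_path.parent.mkdir(parents=True, exist_ok=True)
    train_fn(model_path)  # expected to write the bundle to model_path
    return model_path
```

Whichever behaviour is intended (error out vs. silently skip), encoding it in one helper avoids the contradiction between run_workflow.py and stage_train_model.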

@tylerjereddy

Also, for this PR, gh-4 (as emphasized at #4 (comment)) and elsewhere, please disclose any scenarios where AI was used to write test cases (or anything else). There is a general "feeling" of the tests being verbose and repetitive instead of being crafted with care.

It may also be that Nidhin did that initially, or a symptom of rushed copy pasting (neither great)--either way, the quality control has been quite time consuming.

Labels: enhancement (New feature or request)