ENH: Add ML training and inference by nidhinthomas-ai · Pull Request #5 · lanl/ldrd_neat_ml

nidhinthomas-ai · 2025-09-25T06:27:15Z

Summary

This MR introduces a compact ML toolkit for phase diagram construction including and model training, inference, explainability and plotting:

Training: an RF+XGBoost soft‑voting pipeline with simple pre‑processing, class‑imbalance handling, and a lightweight grid search over a validation set.
Inference: robust prediction on new CSVs, feature alignment against a saved model bundle, and CSV export of probabilities and labels.
Explainability: SHAP, EBM, and LIME importances with a consensus view and publication‑quality figures.
Utilities: helpers for plotting/analysis, including tools for GMM‑based decision regions & phase‑boundary extraction in composition space.

What’s included

neat_ml/
├── model/
│   ├── train.py                # training pipeline, tuning, ROC plotting, model bundling
│   ├── inference.py            # load bundle, align features, predict, save CSV
│   └── feature_importance.py   # SHAP / EBM / LIME + consensus and plots
└── utils/
    └── utils.py                # plotting helpers, GMM wrappers, boundaries, styling

Design highlights

Ensemble learner: VotingClassifier over RandomForestClassifier and XGBClassifier with soft voting.
Pre‑processing:
- All features coerced to numeric, columns that are entirely NaN are dropped.
- Median imputation (performed in preprocess() and again inside the pipeline for robustness when new features are missing at inference).
- Standardization via StandardScaler.
Imbalance handling: uses scale_pos_weight = (#neg / #pos) for XGBoost, computed on the training (and refit on train+val) target.
Hyperparameter search: exhaustive product over a small grid (72 combos; see below) scored by validation ROC‑AUC; best model is then refit on combined train+val.
Reproducibility: RANDOM_STATE=42 across learners.
Model artifact: single joblib bundle containing the fitted pipeline, feature list, validation metrics, and best parameters.

Main modules & functions

`neat_ml/model/train.py`

Public API

preprocess(df, target, exclude=None) -> (X, y)
- Drops rows with missing target, coerces numerics, drops all‑NaN columns, median‑imputes remaining NaNs.
train_with_validation(X_train, y_train, X_val, y_val)
- Grid‑searches the ensemble, picks best by ROC‑AUC, reports PR‑AUC for the chosen model, then refits on train+val.
- Returns: (final_model_pipeline, metrics_dict, best_params_dict, val_proba_best).
plot_roc(y_true, y_prob, out_png, label="Validation")
- Saves a 4x4 in ROC figure (PNG).
save_model_bundle(model, features, metrics, best_params, path)
- Serializes { "model": Pipeline, "features": List[str], "metrics": Dict[str, float], "best_params": Dict[str, Any] }.

Model pipeline

[ SimpleImputer(median) ] → [ StandardScaler ] → [
  VotingClassifier(
    estimators = [
      ("rf",  RandomForestClassifier(class_weight="balanced")),
      ("xgb", XGBClassifier(objective="binary:logistic", scale_pos_weight=...))
    ],
    voting="soft"
  )
]

Hyperparameter grid (72 total)

RF: n_estimators=[500], max_depth=[None]
XGB: n_estimators=[10,20,50,100,200,400] × learning_rate=[0.05,0.1] × max_depth=[3,4,5,8,10,20]

Note: RF/XGB have defaults in _build_pipeline that are overridden via .set_params(**grid_params) during search.

`neat_ml/model/inference.py`

Public API

save_predictions(df, y_prob, out_csv)
- Appends:
  - Pred_Prob (positive‑class probability)
  - Pred_Label (1 if >=0.5, else 0)
- Writes CSV.
run_inference(model_in, data_csv, target: Optional[str], exclude_cols: list[str], pred_csv)
- Loads the joblib bundle, reads data_csv, runs preprocess (with optional target to enable label alignment but no evaluation/ROC is performed here), adds any missing training features as all‑NaN columns, reorders to match the saved feature list, and computes probabilities with the bundled pipeline.
- Calls save_predictions(...).

`neat_ml/model/feature_importance.py`

Public API

compare_methods(model, X, y, out_dir, top=20)
- Runs:
  - SHAP (permutation): saves shap_summary.png; returns mean |SHAP| importances.
  - EBM (ExplainableBoostingClassifier): saves ebm_importance.png & ebm_importance.csv.
  - LIME (aggregated over sampled rows): returns mean |weight| importances.
- Merges importances into feature_importance_comparison.csv and feature_importance_comparison.png, ranked by mean rank across methods.
- Builds a consensus (majority vote across methods for top‑k) and saves feat_imp_consensus.png.
feature_importance_consensus(pos_class_feat_imps, feature_names, top_feat_count)
- Majority‑vote ranking across importance vectors; returns (ranked_names, ranked_counts, num_models).
plot_feat_import_consensus(ranked_names, ranked_counts, num_models, top_feat_count, out_dir)
- Horizontal bar chart of % occurrence in top‑k per method.
get_k_best_scores(X, y, k, metrics)
- Convenience helper to collect SelectKBest scores (raw vectors) for various scoring functions.

Dependencies

Requires shap, interpret (glassbox), and lime. Currently imported at module import time (i.e., behave as hard dependencies). See TODOs for lazy import to make them optional.

`neat_ml/utils/utils.py`

Public API & behavior

_axis_ranges(df1, df2, x_col, y_col, pad=2) -> ([0,x_max],[0,y_max]) Shared axis ranges across two frames (integer bounds).
GMM utilities
- GMMWrapper(gmm, x_comp, x_features) Uses a GaussianMixture trained on 1D phase features to classify any (x,y) composition point by nearest neighbor mapping back to the observed feature domain.
- _standardise_labels(cluster_labels, x_comp) Deterministic remap of raw GMM labels to the convention:
  - 0 = Two Phase (triangle‑up, aquamarine)
  - 1 = Single Phase (square, light‑steel‑blue)
- extract_boundary_from_contour(z, xs, ys, level=0.5) From a grid z, returns the longest (x,y) contour at threshold level (e.g., phase boundary).
- plot_gmm_decision_regions(df, x_col, y_col, phase_col, ax, xrange, yrange, ..., plot_regions=True, ...) -> (gmm, std_labels, boundary_points)
  - Fits GMM on phase_col, projects decisions onto composition grid (xrange×yrange), optionally fills regions and draws the decision boundary; returns model, standardized labels for observed points, and extracted boundary coordinates.
- plot_gmm_composition_phase(df, x_col, y_col, phase_col, ax, point_cmap=None)
  - Fits GMM on phase_col, scatters (x,y) points with standardized phase labels and legend.

Styling helpers

_set_axis_style(ax, xrange, yrange) Equal aspect, integer tick formatting, thicker spines/ticks.

Usage examples

1) Train -> validate -> bundle

import pandas as pd
from sklearn.model_selection import train_test_split
from neat_ml.model.train import preprocess, train_with_validation, plot_roc, save_model_bundle

df = pd.read_csv("train.csv")
X, y = preprocess(df, target="label", exclude=["id"])  # binary labels 0/1 expected

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

model, metrics, best_params, val_proba = train_with_validation(X_tr, y_tr, X_val, y_val)

print(metrics)       # {'val_roc_auc': ..., 'val_pr_auc': ...}
print(best_params)   # chosen RF/XGB params

plot_roc(y_val, val_proba, out_png="artifacts/val_roc.png")

save_model_bundle(
    model=model,
    features=X.columns.tolist(),
    metrics=metrics,
    best_params=best_params,
    path="artifacts/model_bundle.joblib",
)

2) Run inference on a new CSV

from pathlib import Path
from neat_ml.model.inference import run_inference

run_inference(
    model_in=Path("artifacts/model_bundle.joblib"),
    data_csv=Path("inference.csv"),
    target=None,                 # or "label" if present; used only for alignment
    exclude_cols=["id"],
    pred_csv=Path("predictions.csv"),
)
# predictions.csv includes: ..., Pred_Prob, Pred_Label

3) Explainability & consensus

import pandas as pd, joblib
from neat_ml.model.train import preprocess
from neat_ml.model.feature_importance import compare_methods

bundle = joblib.load("path_to_model_bundle.joblib")
model = bundle["model"]

df = pd.read_csv("train.csv")
X, y = preprocess(df, target="label", exclude=["id"])

compare_methods(model, X, y, out_dir=Path("explain/"), top=20)
# Produces: shap_summary.png, ebm_importance.png/.csv,
#           feature_importance_comparison.png/.csv, feat_imp_consensus.png

MR checklist

Add preprocessing.py for enhancing input images Add detection.py for OpenCV blob detection Add input data for testing OpenCV detection Add scripts for testing OpenCV detection Add raw_processed.png to baseline for test Add raw_detection.png to baseline for test Modified pyproject.toml to pull .tiff and .png

Removed fixtures from test scripts Use importlib.resources to retrieve images Reduced the tolerances for compare_images() Added ruff.toml to enforce no mutable default args

Added .gitattributes with image files

…rocessed.png,neat_ml/tests/data/images/raw.tiff,neat_ml/tests/data/images_Processed/raw.tiff: convert to Git LFS

Modified preprocessing.py to remove mutable defaults Updated .gitattributes to define image paths

Added lib_workflow.py to neat_ml/workflow Added test_workflow.py to test lib_workflow.py Modified run_workflow.py to import lib_workflow Modified test yaml file to remove results

Removed unnecessary git lfs installation commands from .gitlab-ci.yml Edited run_workflow.py and test_workflow.py to fix mypy errors.

Modified the test scripts to include pytest.raises(match..) Removed extraneous code in detection.py Added test method for checking a warning message

Added bubblesam.py and SAM.py into neat_ml/bubblesam Added stage_bubblesam() into run_workflow.py Added test_bubblesam.py and associated baseline images Modified .gitlab-ci.yml, mypy.ini and pyproject.toml to incorporate sam2 module Updated README.md with sam-2 installation instructions Added bubblesam_detection_test.yaml for functional test

Created neat_ml/workflow and added lib_workflow.py Added test_workflow.py to test lib_workflow.py

Added git config --global --add safe.directory to fix the dubious ownership error

Added neat_ml/analysis module for bubble analysis Added test_analysis.py for testing analysis module

Updated README.md with instructions to run OpenCV and BubbleSAM workflow with detection and analysis

Added neat_ml/model for ML Added neat_ml/utils for plotting Added neat_ml/phase_diagram for plotting phase diagram Added test scripts and baseline images Modified run_workflow.py for training, inference and feature importance

Updated the test_workflow.py to incorporate new tests Updated README.md with commandline example with train, infer,plot. Removed extraneous code from test scripts Modified baseline images to ensure the tolerance to be within 1e-4

Added ci.yml Added LANL copyright assertion ID

tylerjereddy · 2025-10-20T23:10:18Z

+# Or broader by type (covers future files anywhere)
+*.tif   filter=lfs diff=lfs merge=lfs -text
+*.tiff  filter=lfs diff=lfs merge=lfs -text
+*.png   filter=lfs diff=lfs merge=lfs -text


As noted in gh-8, we can't really use LFS with LANL GitHub org.

Indeed, if you try to checkout this branch out locally you'll get an error similar to:

Error downloading object: neat_ml/tests/baseline/circles_filtered_contours.png (1d6b164): Smudge error: Error downloading neat_ml/tests/baseline/circles_filtered_contours.png (1d6b164340ac27d913e2552b3109cb620f5befec2816b50183e51b1816f8dcda): [1d6b164340ac27d913e2552b3109cb620f5befec2816b50183e51b1816f8dcda] Object does not exist on the server: [404] Object does not exist on the server

As in the other DR project, you can work around it to some degree with export GIT_LFS_SKIP_SMUDGE=1.

But if I check the branch out that way locally, I then end up with other issues with i.e., python -m pytest -n 4 resulting in 6 failing and 5 erroring tests.

Details

FAILED neat_ml/tests/test_detection.py::test_visual_regression_debug_overlay - FileNotFoundError: Unable to read image file: /Users/treddy/github_projects/ldrd_n... FAILED neat_ml/tests/test_feature_importance.py::test_plot_feat_import_consensus_image - PIL.UnidentifiedImageError: cannot identify image file '/Users/treddy/github_proje... FAILED neat_ml/tests/test_preprocessing.py::test_process_directory_single_image - cv2.error: OpenCV(4.10.0) /Users/xperience/GHA-Actions-OpenCV/_work/opencv-python/... FAILED neat_ml/tests/test_phase_diagram.py::test_construct_phase_diagram_image - PIL.UnidentifiedImageError: cannot identify image file '/Users/treddy/github_proje... FAILED neat_ml/tests/test_train.py::test_plot_roc - PIL.UnidentifiedImageError: cannot identify image file '/Users/treddy/github_proje... FAILED neat_ml/tests/test_feature_importance.py::test_compare_methods_end_to_end - AssertionError: ERROR neat_ml/tests/test_bubblesam.py ERROR neat_ml/tests/test_workflow.py ERROR neat_ml/tests/test_analysis.py::test_voronoi_qhull_error ERROR neat_ml/tests/test_analysis.py::test_calculate_nnd_stats_warns_on_no_finite_distances ERROR neat_ml/tests/test_analysis.py::test_calculate_graph_metrics_warns_on_exception

That seems like something @adamwitmer could help with--finding a sustainable solution to binary asset storage here so I/others can run the tests easily on this branch, etc.

tylerjereddy · 2025-10-20T23:34:45Z

+  push:
+    branches: [ main ]
+  pull_request:   # run for all PRs (parity with GitLab MR rule)
+    types: [opened, synchronize, reopened, ready_for_review]


Purge this out--seems unnecessary. In general, this CI config file has far too much duplication/complexity, and merge conflicts need to be resolved against the more concise CI config I added in gh-9.

tylerjereddy · 2025-10-20T23:36:06Z

+            setup.cfg
+            **/requirements*.txt
+            neat_ml/sam2/pyproject.toml
+            neat_ml/sam2/setup.cfg


let's not use caching for now--it is just extra complexity we don't need--this is a very small project that probably won't see much GitHub activity once published.

tylerjereddy · 2025-10-20T23:37:31Z

+          set -euxo pipefail
+          git config --global --add safe.directory "$GITHUB_WORKSPACE"
+          git config --global --add safe.directory "$GITHUB_WORKSPACE/neat_ml/sam2"
+          git submodule update --init --recursive


this all seems superfluous--we are already using submodules: recursive in the checkout action above, so why is this repeated here?

tylerjereddy · 2025-10-20T23:39:31Z

+            openpyxl jinja2 python-ternary pytest shap interpret \
+            types-tqdm types-Pillow scikit-image lightgbm lime \
+            opencv-python-headless==4.10.0.84 \
+            pyyaml types-pyyaml torch


this is verbose and gets repeated below--I think we should just list the dependencies in pyproject.toml and use the standard approach of installing them with pip from that file...

tylerjereddy · 2025-10-20T23:41:20Z

+      - name: Run end-to-end script
+        run: |
+          set -euxo pipefail
+          python main.py


Let's just delete the functional test job above entirely--it hasn't passed for ages and just times out at about an hour internally on LISDI GitLab. I also don't think it tests much of the "new stuff" for the paper anyway. Using pytest tests is also generally preferable of course.

tylerjereddy · 2025-10-20T23:42:03Z

+            setup.cfg
+            **/requirements*.txt
+            neat_ml/sam2/pyproject.toml
+            neat_ml/sam2/setup.cfg


get rid of caching for this small team/project

tylerjereddy · 2025-10-20T23:44:02Z

+          set -euxo pipefail
+          git config --global --add safe.directory "$GITHUB_WORKSPACE"
+          git config --global --add safe.directory "$GITHUB_WORKSPACE/neat_ml/sam2"
+          git submodule update --init --recursive


Again, submodules: recursive should be handled above. Way too much repetition in this file. Using a matrix approach like in gh-9 probably makes sense when resolving merge conflicts.

tylerjereddy · 2025-10-20T23:45:12Z

+            openpyxl jinja2 python-ternary pytest shap interpret \
+            types-tqdm types-Pillow scikit-image lightgbm lime \
+            opencv-python-headless==4.10.0.84 \
+            pyyaml types-pyyaml torch


move these long repeated lists of deps to the appropriate parts of pyproject.toml

tylerjereddy · 2025-10-20T23:47:55Z

+  - git config --global --add safe.directory "$CI_PROJECT_DIR/neat_ml/sam2"
+  - python -m pip install -U mypy ruff pandas-stubs "numpy>=1.26.3,<=2.1.3" matplotlib xgboost pandas scikit-learn openpyxl jinja2 python-ternary pytest shap interpret types-tqdm types-Pillow scikit-image lightgbm lime opencv-python-headless==4.10.0.84  pyyaml types-pyyaml pytest-mock torch
+  - git submodule update --init --recursive
+  - pip install -e ./neat_ml/sam2/


just delete this file--we're not going to continue to use LISDI GitLab long-term (it has a subscription fee, etc.); the code should be open for others to see in any case

tylerjereddy · 2025-10-20T23:54:28Z

+conda activate ldrd_neat_ml
+
+conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
+pip install -r requirements.txt


This is different than the steps the CI is following; if we have an updated requirements.txt file we could use that in CI instead of having multiple lists of deps needing to be maintained in different locations.

tylerjereddy · 2025-10-20T23:55:23Z

+
+## Installation
+
+Download the package from the following GitLab repository:  


Let's just rewrite the instructions for the public GitHub--that's all anyone is going to care about at publication time since they cannot access our internal GitLab.

tylerjereddy · 2025-10-20T23:57:08Z

@@ -0,0 +1,144 @@
+from pathlib import Path
+from typing import Any, Dict, List, Optional


there shouldn't be any need to import Dict and List, can just use the builtin list and dict types now

tylerjereddy · 2025-10-20T23:58:44Z

+                    if not graph.has_edge(i, j):
+                        dist = float(np.linalg.norm(points[i] - points[j]))
+                        graph.add_edge(int(i), int(j), distance=dist)
+    except Exception as exc:


Let's try to avoid capturing the generic Exception class for the usual reasons

tylerjereddy · 2025-10-21T00:03:54Z

+        )
+        ax.add_patch(rect)
+    ax.axis("off")
+    plt.savefig(output_path, bbox_inches='tight')


Here and elsewhere--let's try to use fig, ax objects instead of the "global" plt approach, for the usual reasons.

tylerjereddy · 2025-10-21T01:02:11Z

+    print(f"[INFO] scale_pos_weight={spw:.3f}  |  train neg/pos={np.bincount(y_train)}")
+
+    param_names = list(_PARAM_GRID.keys())
+    param_values = list(_PARAM_GRID.values())


let's avoid using "module globals" like _PARAM_GRID, instead generally passing variables in through functions arguments to avoid spaghetti logic...

tylerjereddy · 2025-10-21T01:04:43Z

+            val_proba_best = val_proba
+
+    assert best_model is not None, "No model was fitted!"
+    assert val_proba_best is not None, "Validation probabilities were not calculated!"


avoid plain assert in production code, for the usual reasons

tylerjereddy · 2025-10-21T01:05:31Z

+    plt.ylabel("True-Positive Rate")
+    plt.legend()
+    plt.tight_layout()
+    plt.savefig(out_png, dpi=300)


Let's avoid the "global"/MATLAB-like plt usage, instead favoring the object-oriented fig, ax approach..

tylerjereddy · 2025-10-21T01:06:04Z

+    plt.tight_layout()
+    plt.savefig(out_png, dpi=300)
+    plt.close()
+    print(f"[INFO] ROC curve saved -> {out_png}")


logging not print() for production output

tylerjereddy · 2025-10-21T01:09:16Z

+        auc = roc_auc_score(y_val, val_proba)
+        print(f"[GRID] params={params}  |  val ROC-AUC={auc:.4f}")
+
+        if auc > best_auc:


this manual checking shouldn't be necessary--sklearn has built-in grid searching functions/classes... let's not reinvent the wheel and create unnecessary reviewer burden with manually coded loops

@adamwitmer since this PR hasn't been touched in months and I've mentioned this a few times in person now, I'll mention it in writing next -- please makes sure that:

hyperparameter optimization is justified as useful for OpenCV and SAM2 approaches (model performs better after hyperopt vs. before)

hyperopt is performed in a standard way rather then reinventing the wheel--this comment is from October 2025, was hoping it might be addressed a little faster than this since a few iterations are likely needed and the manuscript revisions are due quite soon..

tylerjereddy · 2025-10-21T01:16:23Z

+_PARAM_GRID: Dict[str, List[int | float | None]] = {
+    # Random‑Forest
+    "ensemble__rf__n_estimators": [500],
+    "ensemble__rf__max_depth": [None],


I don't get it--the Random Forest hyperparameter grid seems pointless--there is no work to be done here with a single entry for each of only two hyperparameters. I also already indicated that n_estimators shouldn't be in here for RandomForest since the trend of "larger is better" always holds true modulo noise in the tail of the distribution.

It feels like we're basically just "pretending" to do hyperparameter optimization, and I haven't seen any indication that it is helpful...

XGB below also probably has too few hyperparameters for it to matter (and brute force may take a while vs. Bayesian search approach...)

tylerjereddy · 2025-10-21T01:20:25Z

I'll revisit this after @adamwitmer cleans things up a bit, gets the tests/CI running/passing without LFS, so I can start doing more thorough reviews with the testsuite working locally..

tylerjereddy · 2026-03-10T02:51:36Z

@adamwitmer paper revisions are due in a few weeks max, and it could very well take me a few weeks to review this even if it were ready for review now, so definitely way past due to have this ready for review given December 2025 deadline.

Nidhin Thomas added 30 commits July 22, 2025 15:16

TST: Modified test scripts

a354192

Removed fixtures from test scripts Use importlib.resources to retrieve images Reduced the tolerances for compare_images() Added ruff.toml to enforce no mutable default args

CI: Added git lfs to .gitlab-ci.yml

029b676

Added .gitattributes with image files

CI: Updated .gitlab-ci.yml to fix libcrypt error

edfccb5

neat_ml/tests/baseline/raw_detection.png,neat_ml/tests/baseline/raw_p…

13a4562

…rocessed.png,neat_ml/tests/data/images/raw.tiff,neat_ml/tests/data/images_Processed/raw.tiff: convert to Git LFS

MAINT: Added run_workflow.py

a9403da

Modified preprocessing.py to remove mutable defaults Updated .gitattributes to define image paths

MAINT: Added test yaml file for functional test

13ee02c

DOC: Updated README.md with instructions

7021b30

MAINT: Created neat_ml/workflow module

37662de

Added lib_workflow.py to neat_ml/workflow Added test_workflow.py to test lib_workflow.py Modified run_workflow.py to import lib_workflow Modified test yaml file to remove results

CI: Modified .gitlab-ci.yml

7349079

CI: Fixed the git lfs

add5f03

Removed unnecessary git lfs installation commands from .gitlab-ci.yml Edited run_workflow.py and test_workflow.py to fix mypy errors.

TST: Modified test scripts

c5048c9

Modified the test scripts to include pytest.raises(match..) Removed extraneous code in detection.py Added test method for checking a warning message

CI: Removed functional test

b8329fa

MAINT: Added .gitmodules for sam2

da468c3

TST: Added baseline images for the bubblesam test

a561f0d

MAINT: Add sam2 as submodule

e3bfd74

MAINT: Add sam2 as submodule

0898d5a

ENH: Created workflow module

0014cf2

Created neat_ml/workflow and added lib_workflow.py Added test_workflow.py to test lib_workflow.py

CI: Added torch module to .gitlab-ci.yml

4a5e34a

TST: Modified test_bubblesam.py to improve coverage

4aed3df

CI: Modified .gitlab-ci.yml

a61e38f

Added git config --global --add safe.directory to fix the dubious ownership error

ENH: Added analysis module

9cb1255

Added neat_ml/analysis module for bubble analysis Added test_analysis.py for testing analysis module

TST: Fixed test_workflow.py to improve coverage

17b3e2a

DOC: Updated the README.md

68a63cb

Updated README.md with instructions to run OpenCV and BubbleSAM workflow with detection and analysis

TST: Added test scripts to improve coverage

816a11f

CI: Added pytest-mock module to .gitlab-ci.yaml

6090f9d

ENH: Implemented ML model

c68a1b9

Added neat_ml/model for ML Added neat_ml/utils for plotting Added neat_ml/phase_diagram for plotting phase diagram Added test scripts and baseline images Modified run_workflow.py for training, inference and feature importance

TST: Fixed bugs and improved coverage

b93f8a8

Updated the test_workflow.py to incorporate new tests Updated README.md with commandline example with train, infer,plot. Removed extraneous code from test scripts Modified baseline images to ensure the tolerance to be within 1e-4

CI: Added .github/workflows/ci.yml

80e73f8

Added ci.yml Added LANL copyright assertion ID

adamwitmer mentioned this pull request Oct 10, 2025

MAINT: TODO for open PRs #6

Open

5 tasks

tylerjereddy added the enhancement New feature or request label Oct 15, 2025

tylerjereddy reviewed Oct 20, 2025

View reviewed changes

tylerjereddy reviewed Oct 21, 2025

View reviewed changes

tylerjereddy mentioned this pull request Oct 22, 2025

ENH: Nidhin opencv detection #2

Merged

adamwitmer mentioned this pull request Feb 25, 2026

ENH: Add BubbleSAM detection method #3

Merged

This was referenced Mar 17, 2026

WIP, ENH: Add ML training and inference #28

Open

ENH: Add scripts to analyze BubbleSAM and OpenCV detection results #4

Open


		## Installation

		Download the package from the following GitLab repository:

		@@ -0,0 +1,144 @@
		from pathlib import Path
		from typing import Any, Dict, List, Optional

Conversation

nidhinthomas-ai commented Sep 25, 2025

Summary

What’s included

Design highlights

Main modules & functions

neat_ml/model/train.py

neat_ml/model/inference.py

neat_ml/model/feature_importance.py

neat_ml/utils/utils.py

Usage examples

1) Train -> validate -> bundle

2) Run inference on a new CSV

3) Explainability & consensus

MR checklist

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tylerjereddy commented Oct 21, 2025

Uh oh!

tylerjereddy commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`neat_ml/model/train.py`

`neat_ml/model/inference.py`

`neat_ml/model/feature_importance.py`

`neat_ml/utils/utils.py`