49 changes: 24 additions & 25 deletions README.md
@@ -16,20 +16,15 @@ The Patra Toolkit is a component of the Patra ModelCards framework designed to s

## Features

1. **Encourages Accountability**
- Incorporate essential model information (metadata, dataset details, fairness, explainability) at training time, ensuring AI models remain transparent from development to deployment.
- **Encourages Accountability**: Incorporate essential model information (metadata, dataset details, fairness, explainability) at training time, ensuring AI models remain transparent from development to deployment.

2. **Semi-Automated Capture**
- Automated *Fairness* and *Explainability* scanners compute demographic parity, equal odds, SHAP-based feature importances, etc., for easy integration into Model Cards.
- **Semi-Automated Capture**: Automated *Fairness* and *Explainability* scanners compute demographic parity, equal odds, SHAP-based feature importances, etc., for easy integration into Model Cards.

3. **Machine-Actionable Model Cards**
- Produce a structured JSON representation for ingestion into the Patra Knowledge Base. Ideal for advanced queries on model selection, provenance, versioning, or auditing.
- **Machine-Actionable Model Cards**: Produce a structured JSON representation for ingestion into the Patra Knowledge Base. Ideal for advanced queries on model selection, provenance, versioning, or auditing.

4. **Flexible Repository Support**
- Pluggable backends for storing models/artifacts on **Hugging Face** or **GitHub**, unifying the model publishing workflow.
- **Flexible Repository Support**: Pluggable backends for storing models/artifacts on **Hugging Face** or **GitHub**, unifying the model publishing workflow.

5. **Versioning & Model Relationship Tracking**
- Maintain multiple versions of a model with recognized edges (e.g., `revisionOf`, `alternateOf`) using embedding-based similarity. This ensures clear lineages and easy forward/backward provenance.
- **Versioning & Model Relationship Tracking**: Maintain multiple versions of a model with recognized edges (e.g., `revisionOf`, `alternateOf`) using embedding-based similarity. This ensures clear lineages and easy forward/backward provenance.
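As a rough illustration of the fairness metrics named in the feature list, the sketch below computes a demographic parity difference and an equalized odds difference (the list calls the latter "equal odds") in plain Python. This is not the Patra scanner's implementation; the function names and the toy data are assumptions made only for this example.

```python
from collections import defaultdict


def positive_rate_by_group(y_pred, groups):
    """Fraction of positive predictions (1) within each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(y_pred, groups):
        counts[group][0] += int(pred == 1)
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = positive_rate_by_group(y_pred, groups)
    if not rates:  # no samples at all: report no gap
        return 0.0
    return max(rates.values()) - min(rates.values())


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the true-positive-rate gap and the false-positive-rate gap."""
    gaps = []
    for label in (1, 0):  # label 1 gives the TPR gap, label 0 the FPR gap
        rows = [(p, g) for t, p, g in zip(y_true, y_pred, groups) if t == label]
        preds = [p for p, _ in rows]
        grps = [g for _, g in rows]
        gaps.append(demographic_parity_difference(preds, grps))
    return max(gaps)


# Toy data (hypothetical): binary labels, predictions, and one protected attribute.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

dp_gap = demographic_parity_difference(y_pred, groups)
eo_gap = equalized_odds_difference(y_true, y_pred, groups)
print(dp_gap, eo_gap)
```

A value of 0 means the groups are treated identically under that metric; larger values indicate a larger disparity.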

## Getting Started

@@ -52,20 +47,28 @@ Find the descriptions of the Model Card parameters in the [schema descriptions d
from patra_toolkit import ModelCard

mc = ModelCard(
name="UCI Adult Data Analysis model using Tensorflow",
version="0.1",
short_description="UCI Adult Data analysis using Tensorflow for demonstration of Patra Model Cards.",
full_description="We have trained a ML model using the tensorflow framework to predict income for the UCI Adult Dataset. We leverage this data to run the Patra model cards to capture metadata about the model as well as fairness and explainability metrics.",
keywords="uci adult, tensorflow, explainability, fairness, patra",
author="Sachith Withana",
input_type="Tabular",
category="classification",
foundational_model="None"
name="UCI Adult Data Analysis model using Tensorflow",
version="0.1",
short_description="UCI Adult Data analysis using Tensorflow for demonstration of Patra Model Cards.",
full_description="We have trained a ML model using the tensorflow framework to predict income for the UCI Adult Dataset. We leverage this data to run the Patra model cards to capture metadata about the model as well as fairness and explainability metrics.",
keywords="uci adult, tensorflow, explainability, fairness, patra",
input_type="Tabular",
category="classification",
foundational_model="None"
)

# Add Model Metadata
mc.input_data = 'https://archive.ics.uci.edu/dataset/2/adult'
mc.output_data = 'https://huggingface.co/Data-to-Insight-Center/UCI-Adult'

# Add User Information
mc.populate_user(
username="neelk",
orcid="0000-0002-1234-5678",
name="Neelesh Karthikeyan",
institution="Indiana University",
email="neelk@iu.edu"
)
```

### Initialize an AI/ML Model
@@ -117,7 +120,7 @@ mc.validate()
mc.save(<file_path>)
```

## Submit
## Submit Model Card

Use `mc.submit()` to either upload just a model card, an AI model along with the model card, just the artifacts, or all at once!

@@ -139,7 +142,7 @@ If a name-version conflict arises, increment `mc.version`. In case of failure, `
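
One way to handle a name-version conflict is to bump the version string and retry. The sketch below is a minimal illustration under stated assumptions: `bump_version`, `submit_with_retry`, and the `try_submit` callable are hypothetical helpers, not part of the Patra Toolkit, and how `mc.submit()` actually signals a conflict is not shown in this diff.

```python
from types import SimpleNamespace


def bump_version(version: str) -> str:
    """Increment the last dot-separated numeric component, e.g. "0.1" -> "0.2"."""
    parts = version.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


def submit_with_retry(card, try_submit, max_attempts=3):
    """Call try_submit(card); on a conflict, bump card.version and try again.

    try_submit is a hypothetical callable that returns True on success and
    False on a name-version conflict. It stands in for a wrapper around
    mc.submit().
    """
    for _ in range(max_attempts):
        if try_submit(card):
            return card.version
        card.version = bump_version(card.version)
    raise RuntimeError(f"still conflicting after {max_attempts} attempts")


# Stand-in for a server that already holds versions 0.1 and 0.2 of this card.
taken = {"0.1", "0.2"}
card = SimpleNamespace(version="0.1")
final_version = submit_with_retry(card, lambda c: c.version not in taken)
print(final_version)
```

Note that this scheme treats the last component as an integer, so "0.9" becomes "0.10" rather than "1.0".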

---

## Authentication with TACC Credentials
### Authentication with TACC Credentials

To authenticate against a Patra server hosted in TAPIS, use Patra's built-in `authenticate()` method to obtain an access token:

@@ -149,11 +152,7 @@ from patra_toolkit import ModelCard
mc = ModelCard(...)

tapis_token = mc.authenticate(username="<your_tacc_username>", password="<your_tacc_password>")
```

This will print and return a valid `X-Tapis-Token` (JWT). You can then pass this token to `mc.submit()`:

```python
mc.submit(
patra_server_url=<tapis_hosted_patra_server_url>,
model=<trained_model>,
72 changes: 53 additions & 19 deletions docs/index.rst
@@ -60,30 +60,36 @@ Usage

Create a Model Card
^^^^^^^^^^^^^^^^^^^

Find the descriptions of the Model Card parameters in the
``docs/schema_description.md``.
Find the descriptions of the Model Card parameters in the `schema descriptions document <./docs/schema_description.md>`_.

.. code-block:: python

from patra_toolkit import ModelCard

mc = ModelCard(
name="UCI Adult Data Analysis model using Tensorflow",
version="0.1",
short_description="UCI Adult Data analysis using Tensorflow for demonstration of Patra Model Cards.",
full_description="We have trained a ML model using the tensorflow framework to predict income for the UCI Adult Dataset. We leverage this data to run the Patra model cards to capture metadata about the model as well as fairness and explainability metrics.",
keywords="uci adult, tensorflow, explainability, fairness, patra",
author="Sachith Withana",
input_type="Tabular",
category="classification",
foundational_model="None"
name="UCI Adult Data Analysis model using Tensorflow",
version="0.1",
short_description="UCI Adult Data analysis using Tensorflow for demonstration of Patra Model Cards.",
full_description="We have trained a ML model using the tensorflow framework to predict income for the UCI Adult Dataset. We leverage this data to run the Patra model cards to capture metadata about the model as well as fairness and explainability metrics.",
keywords="uci adult, tensorflow, explainability, fairness, patra",
input_type="Tabular",
category="classification",
foundational_model="None"
)

# Add Model Metadata
mc.input_data = 'https://archive.ics.uci.edu/dataset/2/adult'
mc.output_data = 'https://huggingface.co/Data-to-Insight-Center/UCI-Adult'

# Add User Information
mc.populate_user(
username="neelk",
orcid="0000-0002-1234-5678",
name="Neelesh Karthikeyan",
institution="Indiana University",
email="neelk@iu.edu"
)

Initialize an AI/ML Model
^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -141,22 +147,50 @@ Validate and Save the Model Card
Submit
------

Use ``mc.submit()`` to register only a model card, an AI model along with the model card,
just the artifacts, or all at once!
Use ``mc.submit()`` to either upload just a model card, an AI model along with the model card, just the artifacts, or all at once!

.. code-block:: python

mc.submit(
patra_server_url=<patra_server_url>,
model=<trained_model>,
file_format="pt", # or "h5"
model_store="huggingface", # or "github"
file_format="pt", # or "h5"
model_store="huggingface", # or "github"
inference_labels="labels.txt",
artifacts=[<artifact1_path>, <artifact2_path>]
artifacts=[<artifact1_path>, <artifact2_path>],
token=<optional_token> # optional authentication token
)

The ``token`` parameter is **optional**. If your hosted Patra server requires authentication, provide a valid token.

If a name-version conflict arises, increment ``mc.version``. In case of failure, ``submit()`` attempts partial rollbacks to avoid orphaned uploads.

----

Authentication with TACC Credentials
====================================

To authenticate against a Patra server hosted in TAPIS, use Patra's built-in ``authenticate()`` method to obtain an access token:

.. code-block:: python

from patra_toolkit import ModelCard

mc = ModelCard(...)

tapis_token = mc.authenticate(username="<your_tacc_username>", password="<your_tacc_password>")

This will print and return a valid ``X-Tapis-Token`` (JWT). You can then pass this token to ``mc.submit()``:

.. code-block:: python

mc.submit(
patra_server_url=<tapis_hosted_patra_server_url>,
model=<trained_model>,
token=tapis_token
)

If a name-version conflict arises, increment ``mc.version``. In case of failure,
``submit()`` attempts partial rollbacks to avoid orphaned uploads.
----

Examples
--------
1 change: 0 additions & 1 deletion examples/notebooks/CNN_Example.ipynb
@@ -306,7 +306,6 @@
" short_description=\"CNN model trained on CIFAR-10 dataset for image classification.\",\n",
" full_description=\"We have trained a Convolutional Neural Network (CNN) model using TensorFlow to classify images from the CIFAR-10 dataset. The dataset consists of 10 classes of images, including airplane, automobile, bird, cat, deer, dog, frog, horse, ship,and truck.\",\n",
" keywords=\"cifar-10, tensorflow, cnn, image classification, deep learning\",\n",
" author=\"Isuru Gamage\",\n",
" input_type=\"Image\",\n",
" category=\"classification\",\n",
" foundational_model=\"None\"\n",