
docs: Update service docs with OpenAI details#1681

Merged
tgasser-nv merged 8 commits into develop from feat/openai-doc-update on Mar 3, 2026

Conversation


@tgasser-nv commented Mar 3, 2026

Description

The Guardrails server was recently updated to be OpenAI-compatible, adding /models and /chat/completions endpoints; this work was merged in PRs #1340 and #1637. The documentation is now updated with the OpenAI-compatible request and response formats, and with an explanation of how environment variables set the main-model engine and base URL.
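The migration to the OpenAI-compatible format can be sketched as follows. The nested `guardrails` object comes from this PR's description; the model name, config id, and helper function are illustrative placeholders, not the project's actual code:

```python
import json

# Sketch of the new OpenAI-compatible /v1/chat/completions request body.
# The nested `guardrails` object is from this PR; model name and config id
# are placeholders.
def build_chat_request(model: str, user_message: str, config_id: str) -> dict:
    """Build a chat-completions body with guardrails options nested."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        # Guardrails-specific fields now live under a `guardrails` object,
        # not as top-level `config_id`/`options` fields.
        "guardrails": {"config_id": config_id},
    }

body = build_chat_request("meta/llama-3.1-8b-instruct", "Hello!", "demo_config")
print(json.dumps(body, indent=2))
```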

Related Issue(s)

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@tgasser-nv changed the title from "docs: Update service with OpenAI details" to "docs: Update service docs with OpenAI details" on Mar 3, 2026

github-actions bot commented Mar 3, 2026

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1681

@tgasser-nv tgasser-nv requested review from cparisien and miyoungc March 3, 2026 17:59
@tgasser-nv tgasser-nv marked this pull request as ready for review March 3, 2026 18:00

greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR updates the NeMo Guardrails server documentation to reflect the recently added OpenAI-compatible /v1/chat/completions and /v1/models endpoints (merged in PRs #1340 and #1637). The changes migrate all examples from the old config_id/options top-level fields to the new OpenAI-compatible format where guardrails-specific fields are nested under a guardrails object, and introduce a new list-models.md page documenting provider configuration via environment variables.

Key changes include:

  • index.md (API reference): Comprehensive rewrite documenting OpenAI-compatible request/response schemas, the new /v1/models endpoint, streaming format, error shapes, and environment variable reference.
  • list-models.md (new): Documents the GET /v1/models endpoint with per-provider setup examples. The Authentication section contains an inaccuracy: for the nim provider, fetch_models() defaults api_key_env to OPENAI_API_KEY (since nim maps to an empty dict {} in the PROVIDERS table), not NVIDIA_API_KEY. Users who configure cloud NIM per run-guardrails-server.md and set only NVIDIA_API_KEY will receive unauthenticated 401 responses when calling /v1/models.
  • run-guardrails-server.md: Adds the new Model Provider Configuration section with environment variable setup examples for all supported providers.
  • chat-with-guardrailed-model.md, overview.md, list-guardrail-configs.md: Updated to use the new request format consistently.
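The environment-variable-driven provider selection documented in run-guardrails-server.md might be sketched as follows. The variable names `MAIN_MODEL_ENGINE` and `MAIN_MODEL_BASE_URL` come from this PR, but the resolution logic and default values here are assumptions, not the server's actual code:

```python
# Illustrative resolution of the main-model engine and base URL from the
# environment variables this PR documents. The defaults are assumptions.
def resolve_main_model(env: dict) -> tuple:
    engine = env.get("MAIN_MODEL_ENGINE", "openai")   # assumed default
    base_url = env.get("MAIN_MODEL_BASE_URL")         # None -> provider default
    return engine, base_url

engine, base_url = resolve_main_model({
    "MAIN_MODEL_ENGINE": "nim",
    "MAIN_MODEL_BASE_URL": "http://localhost:8000",
})
print(engine, base_url)  # nim http://localhost:8000
```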

The documentation is largely accurate and well-structured, but the NIM authentication section in list-models.md needs clarification to prevent silent 401 failures for users following the documented setup.

Confidence Score: 3/5

  • The documentation is largely accurate and well-written, but contains one factual inaccuracy regarding NIM authentication that will cause silent failures for users following the documented setup.
  • The PR successfully documents the new OpenAI-compatible endpoints with accurate request/response formats and environment variable references. However, the list-models.md file's Authentication section (lines 150–151) is misleading for the NIM provider: the code defaults to OPENAI_API_KEY for auth fallback, not NVIDIA_API_KEY. This contradicts the run-guardrails-server.md setup guide and will cause users to encounter silent 401 failures when calling /v1/models without the API key clarification. The score reflects one clear, actionable finding that needs resolution.
  • docs/run-rails/using-fastapi-server/list-models.md — the Authentication section needs clarification on the NIM-specific API key behavior to match the implementation.

Sequence Diagram

sequenceDiagram
    participant Client
    participant GuardrailsServer
    participant UpstreamLLM

    Note over Client,UpstreamLLM: POST /v1/chat/completions
    Client->>GuardrailsServer: POST /v1/chat/completions<br/>{model, messages, guardrails: {config_id}}
    GuardrailsServer->>GuardrailsServer: Apply input rails (config_id)
    GuardrailsServer->>UpstreamLLM: Forward to LLM<br/>(engine from MAIN_MODEL_ENGINE)
    UpstreamLLM-->>GuardrailsServer: LLM response
    GuardrailsServer->>GuardrailsServer: Apply output rails
    GuardrailsServer-->>Client: ChatCompletion + guardrails object

    Note over Client,UpstreamLLM: GET /v1/models
    Client->>GuardrailsServer: GET /v1/models<br/>[Authorization header optional]
    GuardrailsServer->>GuardrailsServer: Resolve engine via MAIN_MODEL_ENGINE<br/>Look up PROVIDERS table
    alt Known provider with custom URL (anthropic, azure, cohere)
        GuardrailsServer->>UpstreamLLM: GET provider-specific URL<br/>Auth via provider-specific env var
    else OpenAI-compatible (openai, nim, vllm, trt_llm)
        GuardrailsServer->>UpstreamLLM: GET MAIN_MODEL_BASE_URL/v1/models<br/>Auth falls back to OPENAI_API_KEY
    else Unknown engine + MAIN_MODEL_BASE_URL unset
        GuardrailsServer-->>Client: empty list response
    end
    UpstreamLLM-->>GuardrailsServer: Model list
    GuardrailsServer-->>Client: data array of model objects

Last reviewed commit: 63ee956
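The GET /v1/models flow in the diagram can be exercised from a client roughly like this. The server URL is a placeholder, and the header handling mirrors the forwarding behavior the diagram describes (the request is built but not sent):

```python
import os
import urllib.request

# Build (but do not send) a GET /v1/models request. The base URL is a
# placeholder; the Authorization header, when present, is forwarded by the
# server to the upstream provider per the diagram above.
def build_models_request(base_url: str, api_key=None) -> urllib.request.Request:
    req = urllib.request.Request(f"{base_url}/v1/models", method="GET")
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    return req

req = build_models_request("http://localhost:9000", os.environ.get("NVIDIA_API_KEY"))
print(req.full_url)  # http://localhost:9000/v1/models
```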

@greptile-apps bot left a comment


7 files reviewed, 4 comments



tgasser-nv commented Mar 3, 2026

@greptile review PR as of latest SHA: 822719e. Update score and summary

@tgasser-nv commented

@greptile review PR with latest commit SHA e727784

@tgasser-nv commented

@greptile Review commit SHA 505c6ca and update summary and score

@tgasser-nv merged commit 909c4ce into develop on Mar 3, 2026
4 checks passed
@tgasser-nv deleted the feat/openai-doc-update branch March 3, 2026 22:49
Comment on lines +150 to +151
The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.


Inaccurate authentication fallback description for NIM

The claim "the server uses the API key from the appropriate environment variable for the configured engine" is not accurate for the nim provider. Looking at nemoguardrails/server/schemas/utils.py, the nim provider is defined as an empty dict {} in the PROVIDERS table, which means fetch_models() defaults api_key_env to "OPENAI_API_KEY" — not "NVIDIA_API_KEY".

As a result, users who follow the cloud NIM setup from run-guardrails-server.md (setting NVIDIA_API_KEY) will find that /v1/models sends unauthenticated requests to NVIDIA's API, producing a 401 response. The chat completions path (model_engine.py:141-143) correctly reads NVIDIA_API_KEY for NIM, but the models-listing path does not.

The Authentication section should clarify that for cloud NIM users must either:

  • Pass the Authorization: Bearer <NVIDIA_API_KEY> header explicitly when calling /v1/models, or
  • Set OPENAI_API_KEY to their NVIDIA API key in addition to (or instead of) NVIDIA_API_KEY
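The fallback the reviewer describes reduces to a few lines. The table contents below are illustrative stand-ins for the real `PROVIDERS` table in `nemoguardrails/server/schemas/utils.py`; only the `nim` entry's emptiness and the `OPENAI_API_KEY` default are taken from the comment:

```python
# Illustrative PROVIDERS table: `nim` maps to an empty dict, so the lookup
# falls through to the OPENAI_API_KEY default, as the review comment notes.
PROVIDERS = {
    "anthropic": {"api_key_env": "ANTHROPIC_API_KEY"},  # assumed entry
    "nim": {},  # no provider-specific api_key_env
}

def api_key_env_for(engine: str) -> str:
    # Mirrors the fetch_models() default described in the comment.
    return PROVIDERS.get(engine, {}).get("api_key_env", "OPENAI_API_KEY")

print(api_key_env_for("nim"))        # OPENAI_API_KEY, not NVIDIA_API_KEY
print(api_key_env_for("anthropic"))  # ANTHROPIC_API_KEY
```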
Suggested change

`````markdown
The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.

```{note}
For NVIDIA NIM cloud (e.g., `https://integrate.api.nvidia.com`), the `/v1/models` endpoint uses `OPENAI_API_KEY` for authentication (not `NVIDIA_API_KEY`). Pass your NVIDIA API key via the `Authorization` header, or set `OPENAI_API_KEY` to your NVIDIA API key when using cloud NIM.
```
`````

<details><summary>Prompt To Fix With AI</summary>

`````markdown
This is a comment left during a code review.
Path: docs/run-rails/using-fastapi-server/list-models.md
Line: 150-151

Comment:
**Inaccurate authentication fallback description for NIM**

The claim "the server uses the API key from the appropriate environment variable for the configured engine" is not accurate for the `nim` provider. Looking at `nemoguardrails/server/schemas/utils.py`, the `nim` provider is defined as an empty dict `{}` in the `PROVIDERS` table, which means `fetch_models()` defaults `api_key_env` to `"OPENAI_API_KEY"` — not `"NVIDIA_API_KEY"`.

As a result, users who follow the cloud NIM setup from `run-guardrails-server.md` (setting `NVIDIA_API_KEY`) will find that `/v1/models` sends unauthenticated requests to NVIDIA's API, producing a 401 response. The chat completions path (`model_engine.py:141-143`) correctly reads `NVIDIA_API_KEY` for NIM, but the models-listing path does not.

The Authentication section should clarify that for cloud NIM users must either:
- Pass the `Authorization: Bearer <NVIDIA_API_KEY>` header explicitly when calling `/v1/models`, **or**
- Set `OPENAI_API_KEY` to their NVIDIA API key in addition to (or instead of) `NVIDIA_API_KEY`

````suggestion
The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.

```{note}
For NVIDIA NIM cloud (e.g., `https://integrate.api.nvidia.com`), the `/v1/models` endpoint uses `OPENAI_API_KEY` for authentication (not `NVIDIA_API_KEY`). Pass your NVIDIA API key via the `Authorization` header, or set `OPENAI_API_KEY` to your NVIDIA API key when using cloud NIM.
```
````

How can I resolve this? If you propose a fix, please make it concise.
`````

</details>

