
docs: Update service docs with OpenAI details#1681

Merged
tgasser-nv merged 8 commits into develop from feat/openai-doc-update on Mar 3, 2026

Conversation


@tgasser-nv commented Mar 3, 2026

Description

The Guardrails server was recently updated to be OpenAI-compatible, adding /models and /chat/completions endpoints; this work was merged in PRs #1340 and #1637. The documentation is now updated with the OpenAI-compatible request and response formats, and with an explanation of how environment variables set the main-model engine and base URL.
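The migration to the OpenAI-compatible format can be sketched as follows. The nested `guardrails` object comes from this PR's description; the model name, config id, and helper function are illustrative placeholders, not the project's actual code:

```python
import json

# Sketch of the new OpenAI-compatible /v1/chat/completions request body.
# The nested `guardrails` object is from this PR; model name and config id
# are placeholders.
def build_chat_request(model: str, user_message: str, config_id: str) -> dict:
    """Build a chat-completions body with guardrails options nested."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        # Guardrails-specific fields now live under a `guardrails` object,
        # not as top-level `config_id`/`options` fields.
        "guardrails": {"config_id": config_id},
    }

body = build_chat_request("meta/llama-3.1-8b-instruct", "Hello!", "demo_config")
print(json.dumps(body, indent=2))
```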

Related Issue(s)

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@tgasser-nv changed the title from "docs: Update service with OpenAI details" to "docs: Update service docs with OpenAI details" on Mar 3, 2026

github-actions bot commented Mar 3, 2026

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1681

@tgasser-nv tgasser-nv requested review from cparisien and miyoungc March 3, 2026 17:59
@tgasser-nv tgasser-nv marked this pull request as ready for review March 3, 2026 18:00

greptile-apps bot commented Mar 3, 2026

Greptile Summary

This PR updates the NeMo Guardrails server documentation to reflect the recently added OpenAI-compatible /v1/chat/completions and /v1/models endpoints (merged in PRs #1340 and #1637). The changes migrate all examples from the old config_id/options top-level fields to the new OpenAI-compatible format where guardrails-specific fields are nested under a guardrails object, and introduce a new list-models.md page documenting provider configuration via environment variables.

Key changes include:

  • index.md (API reference): Comprehensive rewrite documenting OpenAI-compatible request/response schemas, the new /v1/models endpoint, streaming format, error shapes, and environment variable reference.
  • list-models.md (new): Documents the GET /v1/models endpoint with per-provider setup examples. The Authentication section contains an inaccuracy: for the nim provider, fetch_models() defaults api_key_env to OPENAI_API_KEY (since nim maps to an empty dict {} in the PROVIDERS table), not NVIDIA_API_KEY. Users who configure cloud NIM per run-guardrails-server.md and set only NVIDIA_API_KEY will receive unauthenticated 401 responses when calling /v1/models.
  • run-guardrails-server.md: Adds the new Model Provider Configuration section with environment variable setup examples for all supported providers.
  • chat-with-guardrailed-model.md, overview.md, list-guardrail-configs.md: Updated to use the new request format consistently.
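The environment-variable-driven provider selection documented in run-guardrails-server.md might be sketched as follows. The variable names `MAIN_MODEL_ENGINE` and `MAIN_MODEL_BASE_URL` come from this PR, but the resolution logic and default values here are assumptions, not the server's actual code:

```python
# Illustrative resolution of the main-model engine and base URL from the
# environment variables this PR documents. The defaults are assumptions.
def resolve_main_model(env: dict) -> tuple:
    engine = env.get("MAIN_MODEL_ENGINE", "openai")   # assumed default
    base_url = env.get("MAIN_MODEL_BASE_URL")         # None -> provider default
    return engine, base_url

engine, base_url = resolve_main_model({
    "MAIN_MODEL_ENGINE": "nim",
    "MAIN_MODEL_BASE_URL": "http://localhost:8000",
})
print(engine, base_url)  # nim http://localhost:8000
```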

The documentation is largely accurate and well-structured, but the NIM authentication section in list-models.md needs clarification to prevent silent 401 failures for users following the documented setup.

Confidence Score: 3/5

  • The documentation is largely accurate and well-written, but contains one factual inaccuracy regarding NIM authentication that will cause silent failures for users following the documented setup.
  • The PR successfully documents the new OpenAI-compatible endpoints with accurate request/response formats and environment variable references. However, the list-models.md file's Authentication section (lines 150–151) is misleading for the NIM provider: the code defaults to OPENAI_API_KEY for auth fallback, not NVIDIA_API_KEY. This contradicts the run-guardrails-server.md setup guide and will cause users to encounter silent 401 failures when calling /v1/models without the API key clarification. The score reflects one clear, actionable finding that needs resolution.
  • docs/run-rails/using-fastapi-server/list-models.md — the Authentication section needs clarification on the NIM-specific API key behavior to match the implementation.

Sequence Diagram

sequenceDiagram
    participant Client
    participant GuardrailsServer
    participant UpstreamLLM

    Note over Client,UpstreamLLM: POST /v1/chat/completions
    Client->>GuardrailsServer: POST /v1/chat/completions<br/>{model, messages, guardrails: {config_id}}
    GuardrailsServer->>GuardrailsServer: Apply input rails (config_id)
    GuardrailsServer->>UpstreamLLM: Forward to LLM<br/>(engine from MAIN_MODEL_ENGINE)
    UpstreamLLM-->>GuardrailsServer: LLM response
    GuardrailsServer->>GuardrailsServer: Apply output rails
    GuardrailsServer-->>Client: ChatCompletion + guardrails object

    Note over Client,UpstreamLLM: GET /v1/models
    Client->>GuardrailsServer: GET /v1/models<br/>[Authorization header optional]
    GuardrailsServer->>GuardrailsServer: Resolve engine via MAIN_MODEL_ENGINE<br/>Look up PROVIDERS table
    alt Known provider with custom URL (anthropic, azure, cohere)
        GuardrailsServer->>UpstreamLLM: GET provider-specific URL<br/>Auth via provider-specific env var
    else OpenAI-compatible (openai, nim, vllm, trt_llm)
        GuardrailsServer->>UpstreamLLM: GET MAIN_MODEL_BASE_URL/v1/models<br/>Auth falls back to OPENAI_API_KEY
    else Unknown engine + MAIN_MODEL_BASE_URL unset
        GuardrailsServer-->>Client: empty list response
    end
    UpstreamLLM-->>GuardrailsServer: Model list
    GuardrailsServer-->>Client: data array of model objects

Last reviewed commit: 63ee956
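The GET /v1/models flow in the diagram can be exercised from a client roughly like this. The server URL is a placeholder, and the header handling mirrors the forwarding behavior the diagram describes (the request is built but not sent):

```python
import os
import urllib.request

# Build (but do not send) a GET /v1/models request. The base URL is a
# placeholder; the Authorization header, when present, is forwarded by the
# server to the upstream provider per the diagram above.
def build_models_request(base_url: str, api_key=None) -> urllib.request.Request:
    req = urllib.request.Request(f"{base_url}/v1/models", method="GET")
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    return req

req = build_models_request("http://localhost:9000", os.environ.get("NVIDIA_API_KEY"))
print(req.full_url)  # http://localhost:9000/v1/models
```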

@greptile-apps bot left a comment


7 files reviewed, 4 comments



tgasser-nv commented Mar 3, 2026

@greptile review PR as of latest SHA: 822719e. Update score and summary

@tgasser-nv commented

@greptile review PR with latest commit SHA e727784

@tgasser-nv commented

@greptile Review commit SHA 505c6ca and update summary and score

@tgasser-nv merged commit 909c4ce into develop on Mar 3, 2026
4 checks passed
@tgasser-nv deleted the feat/openai-doc-update branch March 3, 2026 22:49
Comment on lines +150 to +151
The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.


Inaccurate authentication fallback description for NIM

The claim "the server uses the API key from the appropriate environment variable for the configured engine" is not accurate for the nim provider. Looking at nemoguardrails/server/schemas/utils.py, the nim provider is defined as an empty dict {} in the PROVIDERS table, which means fetch_models() defaults api_key_env to "OPENAI_API_KEY" — not "NVIDIA_API_KEY".

As a result, users who follow the cloud NIM setup from run-guardrails-server.md (setting NVIDIA_API_KEY) will find that /v1/models sends unauthenticated requests to NVIDIA's API, producing a 401 response. The chat completions path (model_engine.py:141-143) correctly reads NVIDIA_API_KEY for NIM, but the models-listing path does not.

The Authentication section should clarify that for cloud NIM users must either:

  • Pass the Authorization: Bearer <NVIDIA_API_KEY> header explicitly when calling /v1/models, or
  • Set OPENAI_API_KEY to their NVIDIA API key in addition to (or instead of) NVIDIA_API_KEY
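The fallback the reviewer describes reduces to a few lines. The table contents below are illustrative stand-ins for the real `PROVIDERS` table in `nemoguardrails/server/schemas/utils.py`; only the `nim` entry's emptiness and the `OPENAI_API_KEY` default are taken from the comment:

```python
# Illustrative PROVIDERS table: `nim` maps to an empty dict, so the lookup
# falls through to the OPENAI_API_KEY default, as the review comment notes.
PROVIDERS = {
    "anthropic": {"api_key_env": "ANTHROPIC_API_KEY"},  # assumed entry
    "nim": {},  # no provider-specific api_key_env
}

def api_key_env_for(engine: str) -> str:
    # Mirrors the fetch_models() default described in the comment.
    return PROVIDERS.get(engine, {}).get("api_key_env", "OPENAI_API_KEY")

print(api_key_env_for("nim"))        # OPENAI_API_KEY, not NVIDIA_API_KEY
print(api_key_env_for("anthropic"))  # ANTHROPIC_API_KEY
```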
Suggested change

`````markdown
The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.

```{note}
For NVIDIA NIM cloud (e.g., `https://integrate.api.nvidia.com`), the `/v1/models` endpoint uses `OPENAI_API_KEY` for authentication (not `NVIDIA_API_KEY`). Pass your NVIDIA API key via the `Authorization` header, or set `OPENAI_API_KEY` to your NVIDIA API key when using cloud NIM.
```
`````

<details><summary>Prompt To Fix With AI</summary>

`````markdown
This is a comment left during a code review.
Path: docs/run-rails/using-fastapi-server/list-models.md
Line: 150-151

Comment:
**Inaccurate authentication fallback description for NIM**

The claim "the server uses the API key from the appropriate environment variable for the configured engine" is not accurate for the `nim` provider. Looking at `nemoguardrails/server/schemas/utils.py`, the `nim` provider is defined as an empty dict `{}` in the `PROVIDERS` table, which means `fetch_models()` defaults `api_key_env` to `"OPENAI_API_KEY"` — not `"NVIDIA_API_KEY"`.

As a result, users who follow the cloud NIM setup from `run-guardrails-server.md` (setting `NVIDIA_API_KEY`) will find that `/v1/models` sends unauthenticated requests to NVIDIA's API, producing a 401 response. The chat completions path (`model_engine.py:141-143`) correctly reads `NVIDIA_API_KEY` for NIM, but the models-listing path does not.

The Authentication section should clarify that for cloud NIM users must either:
- Pass the `Authorization: Bearer <NVIDIA_API_KEY>` header explicitly when calling `/v1/models`, **or**
- Set `OPENAI_API_KEY` to their NVIDIA API key in addition to (or instead of) `NVIDIA_API_KEY`

````suggestion
The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.

```{note}
For NVIDIA NIM cloud (e.g., `https://integrate.api.nvidia.com`), the `/v1/models` endpoint uses `OPENAI_API_KEY` for authentication (not `NVIDIA_API_KEY`). Pass your NVIDIA API key via the `Authorization` header, or set `OPENAI_API_KEY` to your NVIDIA API key when using cloud NIM.
```
````

How can I resolve this? If you propose a fix, please make it concise.
`````

</details>

