docs: Update service docs with OpenAI details #1681
Conversation
Greptile Summary: This PR updates the NeMo Guardrails server documentation to reflect the recently added OpenAI-compatible endpoints.
The documentation is largely accurate and well-structured, but the NIM authentication section in list-models.md needs clarification to prevent silent 401 failures for users following the documented setup.
@greptile review PR as of latest SHA: 822719e. Update score and summary

@greptile review PR with latest commit SHA e727784
Review thread on docs/run-rails/using-fastapi-server/chat-with-guardrailed-model.md (outdated, resolved)
@greptile Review commit SHA 505c6ca and update summary and score
> The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
> If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.
**Inaccurate authentication fallback description for NIM**

The claim "the server uses the API key from the appropriate environment variable for the configured engine" is not accurate for the `nim` provider. Looking at `nemoguardrails/server/schemas/utils.py`, the `nim` provider is defined as an empty dict `{}` in the `PROVIDERS` table, which means `fetch_models()` defaults `api_key_env` to `"OPENAI_API_KEY"` — not `"NVIDIA_API_KEY"`.

As a result, users who follow the cloud NIM setup from `run-guardrails-server.md` (setting `NVIDIA_API_KEY`) will find that `/v1/models` sends unauthenticated requests to NVIDIA's API, producing a 401 response. The chat completions path (`model_engine.py:141-143`) correctly reads `NVIDIA_API_KEY` for NIM, but the models-listing path does not.
The Authentication section should clarify that, for cloud NIM, users must either:

- Pass the `Authorization: Bearer <NVIDIA_API_KEY>` header explicitly when calling `/v1/models`, **or**
- Set `OPENAI_API_KEY` to their NVIDIA API key in addition to (or instead of) `NVIDIA_API_KEY`
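The fallback behavior described in this comment can be sketched like so (a minimal illustration of the described `PROVIDERS` lookup; the `resolve_api_key` helper name is hypothetical and simplifies what `fetch_models()` actually does):

```python
import os

# Simplified sketch of the PROVIDERS table described above: the "nim"
# entry is an empty dict, so it defines no provider-specific api_key_env.
PROVIDERS = {
    "openai": {"api_key_env": "OPENAI_API_KEY"},
    "nim": {},  # empty -> falls back to the default below
}

def resolve_api_key(engine: str):
    # The lookup defaults api_key_env to "OPENAI_API_KEY" when the
    # provider entry does not override it, so "nim" also reads OPENAI_API_KEY.
    api_key_env = PROVIDERS.get(engine, {}).get("api_key_env", "OPENAI_API_KEY")
    return os.environ.get(api_key_env)

# A user who only sets NVIDIA_API_KEY gets no key for the "nim" engine,
# so the /v1/models call goes out unauthenticated and NVIDIA returns a 401.
os.environ.pop("OPENAI_API_KEY", None)
os.environ["NVIDIA_API_KEY"] = "nvapi-example"
print(resolve_api_key("nim"))  # -> None
```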
````suggestion
The endpoint forwards the `Authorization` header from the incoming request to the upstream provider.
If no `Authorization` header is present, the server uses the API key from the appropriate environment variable for the configured engine.

```{note}
For NVIDIA NIM cloud (e.g., `https://integrate.api.nvidia.com`), the `/v1/models` endpoint uses `OPENAI_API_KEY` for authentication (not `NVIDIA_API_KEY`). Pass your NVIDIA API key via the `Authorization` header, or set `OPENAI_API_KEY` to your NVIDIA API key when using cloud NIM.
```
````
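For users who prefer the header route, the client side of the workaround can be sketched like this (the base URL and key are placeholders, and `build_models_request` is a hypothetical helper, not part of NeMo Guardrails):

```python
import urllib.request

def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    # The Authorization header is forwarded by the server to the upstream
    # provider, so cloud NIM is authenticated even if OPENAI_API_KEY is unset.
    return urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

# Placeholder server address and key; do not hard-code real keys.
req = build_models_request("http://localhost:8000", "nvapi-example")
print(req.full_url)                     # -> http://localhost:8000/v1/models
print(req.get_header("Authorization"))  # -> Bearer nvapi-example
```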
(Comment location: docs/run-rails/using-fastapi-server/list-models.md, lines 150-151)
Description
The Guardrails server was recently updated to be OpenAI-compatible, with `/models` and `/chat/completions` endpoints. The work was merged in PRs #1340 and #1637. The documents are now updated with the OpenAI-compatible request and response formats, and with how environment variables are used to set the main-model engine and base URL.

Related Issue(s)
Checklist