docs: fix inaccuracies in init-function, knowledge base, and cache docs (#1709)

Pouyanpi · miyoungc · greptile-apps[bot] · web-flow · commit 2646995c5306 · 2026-03-10T20:58:06.000-07:00
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
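
Two behaviors described by these doc fixes can be sketched in plain Python: the whitespace normalization applied before cache key matching, and what happens when an `async def init` is called without `await`. This is an illustration only, not the library's code. `normalize_whitespace`, `FakeApp`, `sync_init`, `async_init`, and `calls` are hypothetical names, and the cache's actual normalization routine may differ in detail.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into one space and trim both ends."""
    # str.split() with no arguments splits on any run of whitespace and
    # ignores leading/trailing whitespace, so joining yields single spaces.
    return " ".join(text.split())


calls = []  # records which init bodies actually executed


def sync_init(app):
    calls.append("sync")


async def async_init(app):
    calls.append("async")


class FakeApp:
    """Hypothetical stand-in for the LLMRails instance (assumption)."""


app = FakeApp()

sync_init(app)             # body runs immediately
pending = async_init(app)  # only creates a coroutine object; body does not run
pending.close()            # close the unused coroutine so it is not left pending

print(normalize_whitespace("  Is   this\n\tsafe?  "))  # Is this safe?
print(calls)                                           # ['sync']
```

Under this sketch, two prompts that differ only in spacing or line breaks normalize to the same string, and a registration placed in the async variant never executes.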
diff --git a/docs/configure-rails/caching/model-memory-cache.md b/docs/configure-rails/caching/model-memory-cache.md
@@ -20,7 +20,7 @@ The NVIDIA NeMo Guardrails library supports an in-memory cache that avoids makin
 
 In-memory caches are supported for all NemoGuard models: [Content-Safety](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-content-safety), [Topic-Control](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control), and [Jailbreak Detection](https://build.nvidia.com/nvidia/nemoguard-jailbreak-detect). You can configure each model independently.
 
-The cache uses exact matching (after removing whitespace) on LLM prompts with a Least-Frequently-Used (LFU) algorithm for cache evictions.
+The cache uses exact matching (after normalizing whitespace) on LLM prompts with a Least-Frequently-Used (LFU) algorithm for cache evictions. Whitespace normalization collapses consecutive whitespace characters into a single space and trims leading/trailing whitespace.
 
 For observability, cache hits and misses are visible in OpenTelemetry (OTEL) telemetry and stored in logs at a configurable interval.
 
@@ -166,14 +166,7 @@ The most important metric is the *Hit Rate*, which represents the proportion of
 These statistics accumulate while the library is running.
 
 ```text
-"LFU Cache Statistics - "
-"Size: 23/10000 | "
-"Hits: 20 | "
-"Misses: 3 | "
-"Hit Rate: 87% | "
-"Evictions: 0 | "
-"Puts: 21 | "
-"Updates: 4"
+Cache Stats :: Size: 23/10000 | Hits: 20 | Misses: 3 | Hit Rate: 87.00% | Evictions: 0 | Puts: 21 | Updates: 4
 ```
 
 The following list describes the metrics included in the cache statistics:
diff --git a/docs/configure-rails/custom-initialization/init-function.md b/docs/configure-rails/custom-initialization/init-function.md
@@ -24,6 +24,12 @@ content:
 
 If `config.py` contains an `init` function, it is called during `LLMRails` initialization. Use it to set up shared resources and register action parameters.
 
+```{important}
+The `init` function **must be synchronous** (`def init`, not `async def init`). The framework calls it without `await`, so an async function would silently do nothing.
+```
+
+Any top-level code in `config.py` runs at import time, before `init()` is called. This can be used for provider registration that does not require the `LLMRails` instance.
+
 ## Basic Usage
 
 ```python
@@ -39,17 +45,19 @@ def init(app: LLMRails):
 
 ## Registering Action Parameters
 
-Action parameters registered in `config.py` are automatically injected into actions that declare them:
+Action parameters registered in `config.py` are automatically injected into actions that declare them. The runtime matches parameters by name, i.e., the parameter name in the action must match the name used during registration.
 
 **config.py:**
 
 ```python
+import os
+
 from nemoguardrails import LLMRails
 
 def init(app: LLMRails):
     # Initialize shared resources
     db = DatabaseConnection(host="localhost", port=5432)
-    api_client = ExternalAPIClient(api_key="...")
+    api_client = ExternalAPIClient(api_key=os.environ.get("API_KEY"))
 
     # Register as action parameters
     app.register_action_param("db", db)
@@ -72,6 +80,20 @@ async def call_external_service(query: str, api_client=None):
     return await api_client.search(query)
 ```
 
+## Built-in Action Parameters
+
+In addition to parameters you register, the runtime automatically injects these built-in parameters into any action that declares them:
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `config` | `RailsConfig` | The full rails configuration object |
+| `context` | `dict` | The current conversation context |
+| `events` | `list` | The event history |
+| `llm` | LLM instance | The main LLM (auto-registered during initialization) |
+| `llm_task_manager` | `LLMTaskManager` | Manages LLM task execution |
+
+See [Custom Data](custom-data.md) for details on accessing `config.custom_data` inside actions.
+
 ## Accessing the Configuration
 
 The `app` parameter provides access to the full configuration:
@@ -91,37 +113,32 @@ def init(app: LLMRails):
 ## Example: Database Connection
 
 ```python
-import asyncpg
+import os
+import psycopg2
 from nemoguardrails import LLMRails
 
-async def create_db_pool():
-    return await asyncpg.create_pool(
-        host="localhost",
-        database="mydb",
-        user="user",
-        password="password"
-    )
-
 def init(app: LLMRails):
-    import asyncio
-
     # Create connection pool
-    loop = asyncio.get_event_loop()
-    db_pool = loop.run_until_complete(create_db_pool())
+    conn = psycopg2.connect(
+        host=os.environ.get("DB_HOST", "localhost"),
+        database=os.environ.get("DB_NAME", "mydb"),
+        user=os.environ.get("DB_USER", "user"),
+        password=os.environ.get("DB_PASSWORD"),
+    )
 
-    # Register for use in actions
-    app.register_action_param("db_pool", db_pool)
+    app.register_action_param("db_conn", conn)
 ```
 
 ## Example: API Client Initialization
 
 ```python
+import os
 import httpx
 from nemoguardrails import LLMRails
 
 def init(app: LLMRails):
     # Get API key from custom_data in config.yml
-    api_key = app.config.custom_data.get("api_key")
+    api_key = os.environ.get("API_KEY") or app.config.custom_data.get("api_key")
 
     # Create HTTP client with authentication
     client = httpx.AsyncClient(
diff --git a/docs/configure-rails/other-configurations/knowledge-base.md b/docs/configure-rails/other-configurations/knowledge-base.md
@@ -40,7 +40,7 @@ Currently, only the Markdown format is supported.
 
 Documents in the knowledge base `kb` folder are automatically processed and indexed for retrieval. The system:
 
-1. Splits documents into topic chunks based on markdown headers.
+1. Splits documents into topic chunks based on markdown headers. Large chunks are further split at blank lines to stay within a maximum chunk size.
 2. Uses the configured embedding model to create vector representations of each chunk.
 3. Stores the embeddings for efficient similarity search.
 
@@ -126,26 +126,37 @@ Place markdown files in the `kb` folder as described above. This is the simplest
 Implement a custom action to retrieve chunks from external sources:
 
 ```python
+from typing import Optional
+
 from nemoguardrails.actions import action
+from nemoguardrails.actions.actions import ActionResult
+from nemoguardrails.kb.kb import KnowledgeBase
 
-@action()
-async def retrieve_relevant_chunks(context: dict, llm: BaseLLM):
-    """Custom retrieval from external knowledge base."""
-    user_message = context.get("last_user_message")
+@action(is_system_action=True)
+async def retrieve_relevant_chunks(
+    context: Optional[dict] = None,
+    kb: Optional[KnowledgeBase] = None,
+):
+    user_message = context.get("last_user_message") if context else None
 
     # Implement custom retrieval logic
     # For example, query an external vector database
     chunks = await query_external_kb(user_message)
+    relevant_chunks = "\n".join(chunks)
 
-    return chunks
+    return ActionResult(
+        return_value=relevant_chunks,
+        context_updates={"relevant_chunks": relevant_chunks},
+    )
 ```
 
 ### 3. Using Custom EmbeddingSearchProvider
 
 For advanced use cases, implement a custom embedding search provider:
 
 ```python
-from nemoguardrails.embeddings.index import EmbeddingsIndex
+from typing import List, Optional
+from nemoguardrails.embeddings.index import EmbeddingsIndex, IndexItem
 
 class CustomEmbeddingSearchProvider(EmbeddingsIndex):
     """Custom embedding search provider."""
@@ -154,8 +165,7 @@ class CustomEmbeddingSearchProvider(EmbeddingsIndex):
         # Custom indexing logic
         pass
 
-    async def search(self, text: str, max_results: int) -> List[IndexItem]:
-        # Custom search logic
+    async def search(self, text: str, max_results: int, threshold: Optional[float] = None) -> List[IndexItem]:
         pass
 ```