NLWeb is a modular Python framework for natural language search over web content using vector database retrieval, LLM-based ranking, and multiple protocol support (HTTP, MCP, A2A).
- Ask API — semantic search API
- Crawler — master/worker pipeline that parses schema.org sitemaps, embeds content, and uploads to a vector database
- Chat App — React frontend for the search interface
- Azure Infrastructure — Bicep templates and Helm charts for one-command Azure deployment
The Ask API is a pure Python service, so strictly speaking you only need Python 3.12 to get started, but a full development toolchain includes:
- Python 3.12
- Node.js (for the graphical UI)
- pnpm (for the graphical UI)
- Docker (for building images for cloud deployment)
- uv (for Python package management)
- Helm (for Kubernetes setup)
- Kubectl (for interacting with the Kubernetes application)
Use your platform's package manager to install the above: Homebrew on macOS, apt/rpm on Linux (depending on distribution), or Chocolatey on Windows. On Windows you will also need WSL for the shell scripts to run.
You can use any provider that offers necessary infrastructure, but this repo has a completely worked example setup on Azure. For that you additionally need:
- Azure CLI (2.50+)
Other cloud providers would require their own tooling (`aws`, `gcloud`, etc.), but support for them is not implemented yet.
Here you have two choices.

Install from source: clone this repository to get the full stack (ask API, crawler, chat app, deployment tooling):

```shell
git clone https://github.com/nlweb-ai/nlweb-ask-agent.git
cd nlweb-ask-agent
```

Install from PyPI: this is coming soon. `pip install` the appropriate packages.
NLWeb is governed by a config file.
If you pip-installed, run the following to generate a sample configuration:

```shell
nlweb-init ./config.yaml
```

If you installed from source, simply edit `./ask_api/packages/core/nlweb_core/data/config.yaml` in place, and your changes will be picked up by the various scripts.
NLWeb's configuration lets you customize many aspects of the RAG and ranking flow. Most implementations can be swapped at runtime through dynamic imports. The broad sections are as follows:
| Section | Purpose | Default provider |
|---|---|---|
| `generative_model` | Text/JSON generation (high and low quality tiers) | Azure OpenAI (gpt-4.1 / gpt-4.1-mini) |
| `embedding` | Vector embeddings for search | Azure OpenAI (text-embedding-3-small) |
| `scoring_model` | Result relevance ranking | Pi Labs |
| `retrieval` | Vector database search | Azure AI Search |
| `object_storage` | Full document storage | Azure Cosmos DB |
| `site_config` | Per-site behavior configuration | Cosmos DB with 5-min cache |
| `ranking_config` | Scoring questions for result ranking | — |
Each provider references credentials via `*_env` keys (e.g., `endpoint_env: AZURE_OPENAI_ENDPOINT`).
Each of these sections defines a map of different configurations, which code can load on-demand.
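To make this concrete, here is a sketch of what one such section could look like. The key names below are illustrative assumptions based on the description above, not the shipped schema; consult the generated sample config for the real structure.

```yaml
# Hypothetical config.yaml fragment -- key names are illustrative,
# not the exact shipped schema.
generative_model:
  default:                                # one named configuration in this section's map
    provider: azure_openai                # implementation resolved via dynamic import
    model: gpt-4.1
    endpoint_env: AZURE_OPENAI_ENDPOINT   # credential read from this env var
    key_env: AZURE_OPENAI_KEY
  low_quality:                            # a cheaper tier, loadable on demand
    provider: azure_openai
    model: gpt-4.1-mini
    endpoint_env: AZURE_OPENAI_ENDPOINT
    key_env: AZURE_OPENAI_KEY
```

Because each section is a map of named configurations, code can select a tier (e.g., a cheap model for reformulation, an expensive one for answers) without hardcoding providers.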
Environment variables referenced in the config need to be available when you run the application.
The default configuration (which assumes an Azure backend) needs the following:
| Variable | Used by |
|---|---|
| `AZURE_OPENAI_ENDPOINT` | Generative models, embeddings |
| `AZURE_OPENAI_KEY` | Generative models, embeddings |
| `AZURE_SEARCH_ENDPOINT` | Vector retrieval |
| `AZURE_SEARCH_KEY` | Vector retrieval |
| `AZURE_SEARCH_INDEX_NAME` | Vector retrieval (default: `crawler-vectors`) |
| `COSMOS_DB_ENDPOINT` | Object storage, site config |
| `COSMOS_DB_DATABASE_NAME` | Object storage, site config |
| `COSMOS_DB_CONTAINER_NAME` | Object storage |
| `PI_LABS_ENDPOINT` | Scoring model |
| `PI_LABS_KEY` | Scoring model |
Set these to point at your own infrastructure, or update the configuration to enable different providers.
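For local development, one way to provide them is to export the variables in your shell before starting the server. Every value below is a placeholder (the database and container names in particular are assumptions); substitute the endpoints and keys from your own resources.

```shell
# All values are placeholders -- substitute your own resource endpoints and keys.
export AZURE_OPENAI_ENDPOINT="https://my-openai.openai.azure.com"
export AZURE_OPENAI_KEY="replace-me"
export AZURE_SEARCH_ENDPOINT="https://my-search.search.windows.net"
export AZURE_SEARCH_KEY="replace-me"
export AZURE_SEARCH_INDEX_NAME="crawler-vectors"   # matches the documented default
export COSMOS_DB_ENDPOINT="https://my-cosmos.documents.azure.com"
export COSMOS_DB_DATABASE_NAME="nlweb"             # assumed name
export COSMOS_DB_CONTAINER_NAME="documents"        # assumed name
export PI_LABS_ENDPOINT="https://my-pilabs-endpoint.example.com"
export PI_LABS_KEY="replace-me"
```

A tool like direnv or a `.env` file can keep these scoped per-project if you prefer.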
Note: If you set up an Azure sandbox as described in [deployment/README.md](deployment/README.md), simply run `make init_environment ENV_NAME=yourenv` to prepopulate your environment.
If you pip-installed, run the following in one terminal:

```shell
ASK_API_PORT=8081 nlweb-server ./config.yaml
```

You can then try the universal viewer from another terminal:

```shell
ASK_API_URL=http://localhost:8081/ask PORT=8080 npx @nlweb-ai/chat-app
```

Check it out at http://localhost:8080 for the UI, and http://localhost:8081/ask for the API.
If you installed from source, you can launch the API service and UI together with:

```shell
make ask
```

That will launch the UI at http://localhost:8080 and the API at http://localhost:8080/ask.
Important: The crawler is not supported via pip yet; you'll need the source install.

Start the full stack (includes crawler master + worker):

```shell
make fullstack
```

Add a site to crawl:
```shell
# Trigger indexing of a domain (requires a sitemap.xml at the root)
curl -X POST http://localhost:8080/crawler/api/sites \
  -H 'Content-Type: application/json' \
  -d '{"site_url": "https://example.com"}'

# Trigger indexing of a particular sitemap (for testing with an alternative sitemap)
curl -X POST http://localhost:8080/crawler/api/sites/example.com/schema-files \
  -H 'Content-Type: application/json' \
  -d '{"schema_map_url": "https://example.com/sitemap.xml"}'

# List all indexed sites
curl http://localhost:8080/crawler/api/sites

# Check the status of a specific site
curl http://localhost:8080/crawler/api/sites/example.com
```

The crawler discovers schema.org sitemaps, queues pages for processing, embeds content, and uploads vectors to Azure AI Search. Monitor progress at http://localhost:8080/crawler.
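For context, the structured content the crawler extracts is schema.org markup, which pages typically embed as JSON-LD. The snippet below is an illustrative example of such markup, not data from this repository:

```json
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Classic Pasta Carbonara",
  "description": "A quick carbonara with eggs, pecorino, and guanciale.",
  "recipeIngredient": ["spaghetti", "eggs", "pecorino romano", "guanciale"]
}
```

Pages with markup like this are what end up embedded and searchable through the Ask API.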
Important: The crawler currently supports only an Azure backend and does not yet support configuration for other backends.
Use the following to query (substituting your own API URL):

```shell
# Non-streaming query
curl -X POST http://localhost:8080/ask \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {"text": "best pasta recipes"},
    "prefer": {"streaming": false}
  }'

# Streaming query (SSE)
curl -N -X POST http://localhost:8080/ask \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {"text": "best pasta recipes"}
  }'
```

The API also supports the MCP (`/mcp`) and A2A (`/a2a`) protocols.
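As a sketch, assuming the `/mcp` endpoint speaks standard MCP JSON-RPC over HTTP (an assumption based on the protocol, not something this repository documents), listing the exposed tools might look like:

```shell
# Hypothetical request: assumes /mcp accepts standard MCP JSON-RPC over HTTP.
# The exact payload this server expects may differ; check the API docs.
MCP_REQUEST='{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'
curl -s -X POST http://localhost:8080/mcp \
  -H 'Content-Type: application/json' \
  -d "$MCP_REQUEST" || echo "no server running at localhost:8080"
```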
Every directory has Makefile targets for common developer journeys. Run `make help` anywhere in the repo to print available actions.
All development uses Docker Compose with an nginx gateway on http://localhost:8080. This enables local testing as close as possible to the default deployment on Kubernetes, with all services behind a Gateway.
```shell
make ask        # Ask API + Chat App
make fullstack  # Full stack (+ crawler)
make down       # Stop all services
make logs       # Tail logs
make check      # Run all checks across all modules
```

Each service directory has the same targets, for convenience:

```shell
cd ask_api && make dev    # gateway + ask-api (localhost:8080/ask)
cd frontend && make dev   # chat-app (localhost:8080)
cd crawler && make dev    # gateway + crawler (localhost:8080/crawler)

cd ask_api && make check   # ruff check, ruff format, pyright, pytest
cd crawler && make check   # ruff check, ruff format, pyright, pytest
cd frontend && make check  # eslint, prettier, tsc --noEmit
```

```
nlweb-ask-agent/
├── ask_api/                  # Semantic search API
│   ├── packages/
│   │   ├── core/             # Framework & orchestration (nlweb-core)
│   │   ├── network/          # Protocol adapters (nlweb-network)
│   │   └── providers/        # Azure, Pi Labs provider implementations
├── crawler/                  # Web crawler (master/worker)
├── frontend/                 # pnpm workspace
│   ├── chat-app/             # React chat UI
│   └── search-components/    # Shared component library
├── deployment/               # Azure Bicep templates & scripts
└── helm/                     # Kubernetes Helm charts
```
See [deployment/README.md](deployment/README.md) for infrastructure provisioning. It walks you through initializing a resource group (named by `ENV_NAME`) that contains a full NLWeb stack. Multiple such environments can exist in a subscription and are fully independent.

After creation, run `make init_environment ENV_NAME=yourenv` to configure your local development environment to communicate with a particular NLWeb stack. Environments can be shared within a team.
Run the following to build your current source tree and deploy it to the chosen Azure environment:

```shell
make build-all ENV_NAME=yourenv   # Build all Docker images and push them to ACR
make deploy-all ENV_NAME=yourenv  # Deploy all services to AKS via Helm
```