The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. Whether you're building an AI-powered IDE, enhancing a chat interface, or creating custom AI workflows, MCP provides a standardized way to connect LLMs with the context they need.
This repository is an example of how to create an MCP server for Qdrant, a vector search engine.
This fork reduces token usage for documentation queries by 90-97% and adds search and maintenance features:
- Minimal mode (default) - Returns metadata only (IPs, hostnames, URLs, file names) without full chunk content (2.7x cheaper than full mode)
- Reduced default limit - 5 results instead of 10
- Optional limit parameter - Override when deeper search needed
- Point IDs in results - All search results include point IDs for precise deletion operations
- Delete tool - Remove points by ID or filter for collection maintenance
- Optional reranking (enabled by default) - Retrieve more candidates and rerank with BGE reranker for improved relevance (zero token cost, server-side operation)
- Multi-collection search - Search multiple collections in parallel or use wildcard to search all collections at once
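The reranking item in the list above amounts to a two-stage search: fetch a larger candidate pool, rescore it, and keep only the best results. The sketch below illustrates that flow under stated assumptions; it is not this fork's implementation. The names rerank_top_k and toy_score are hypothetical, and toy_score is a simple word-overlap stand-in for the BGE reranker that the server reaches through its reranker API.

```python
from typing import Callable, List, Tuple


def rerank_top_k(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float],
    top_k: int = 8,
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and keep the top_k best."""
    scored = [(text, score(query, text)) for text in candidates]
    # Highest score first; list.sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_score(query: str, text: str) -> float:
    """Hypothetical stand-in for a reranker: fraction of query words found in text."""
    words = query.lower().split()
    return sum(word in text.lower() for word in words) / max(len(words), 1)


# Stage 1 (vector search) would return the candidate pool; three items shown here.
pool = ["nginx server configuration", "backup schedule", "server rack layout"]
best = rerank_top_k("server configuration", pool, toy_score, top_k=2)
print(best)
```

Because the rescoring happens before anything is returned, the client only receives the final top_k entries, which is presumably why the list describes reranking as having no token cost.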
```
# Default: minimal mode, 5 results (~2k tokens)
qdrant-find(query="server ip address")

# Full content mode (~3-6k tokens)
qdrant-find(query="configuration procedure", mode="full")

# More results when needed
qdrant-find(query="architecture", mode="full", limit=10)

# Delete specific points by ID (recommended)
qdrant-delete(collection_name="my-collection", point_ids=["uuid-1", "uuid-2"])

# Delete by filter
qdrant-delete(collection_name="my-collection", query_filter={"must": [{"key": "status", "match": {"value": "expired"}}]})

# Reranking enabled by default (zero token cost)
qdrant-find(query="server configuration")  # Automatically reranks for best 8 results

# Multi-collection search
qdrant-find(query="service ports", collections=["homelab-docs", "docker-stacks"])

# Search all collections
qdrant-find(query="architecture overview", collections=["*"], mode="full", limit=10)
```

See README.custom.md for detailed comparison and configuration.
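The collection arguments used in the examples above could be resolved along the following lines. This is a minimal sketch, not the server's code: resolve_collections is a hypothetical helper, available stands for the collection names the server would list from Qdrant, and treating any list that contains "*" as the wildcard is an assumption.

```python
from typing import List, Optional, Sequence


def resolve_collections(
    collections: Optional[Sequence[str]],
    collection_name: Optional[str],
    default_collection: Optional[str],
    available: Sequence[str],
) -> List[str]:
    """Decide which collections a find call should search."""
    if collections:
        # Assumption: a "*" entry anywhere in the list means "search everything".
        if "*" in collections:
            return list(available)
        return list(collections)
    # Fall back to the deprecated single-collection argument, then the default.
    fallback = collection_name or default_collection
    if fallback is None:
        raise ValueError("No collection specified and no default configured")
    return [fallback]


existing = ["homelab-docs", "docker-stacks"]
print(resolve_collections(["*"], None, None, existing))
print(resolve_collections(None, None, "homelab-docs", existing))
```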
An official Model Context Protocol server for keeping and retrieving memories in the Qdrant vector search engine. It acts as a semantic memory layer on top of the Qdrant database.
- qdrant-store: Store some information in the Qdrant database
  - Input:
    - information (string): Information to store
    - metadata (JSON): Optional metadata to store
    - collection_name (string): Name of the collection to store the information in. This field is required if there is no default collection name. If there is a default collection name, this field is not enabled.
  - Returns: Confirmation message
- qdrant-find: Retrieve relevant information from the Qdrant database
  - Input:
    - query (string): Query to use for searching
    - collection_name (string, optional, DEPRECATED): Name of the collection to search in. Use the collections parameter instead.
    - collections (array of strings, optional): List of collections to search. Use ["*"] to search all collections. If not provided, collection_name or the default collection is used.
    - mode (string, optional): Response mode: "minimal" (default; metadata only, for 2.7x token efficiency) or "full" (complete content)
    - limit (integer, optional): Maximum number of results to return per collection (default: 5)
    - rerank (boolean, optional): Enable reranking for improved relevance (default: true). When enabled, the server retrieves more candidates and reranks them using the BGE reranker (zero token cost, server-side operation).
  - Returns: Information stored in the Qdrant database as separate messages, including the point ID and source collection for each result
- qdrant-delete: Delete points from the Qdrant database by IDs or filter conditions
  - Input:
    - collection_name (string): Name of the collection to delete from. This field is required if there is no default collection name. If there is a default collection name, this field is not enabled.
    - point_ids (array of strings, optional): List of point IDs (UUIDs) to delete. Preferred method for reliable deletion.
    - query_filter (JSON, optional): Filter conditions to identify points to delete
  - Returns: Confirmation message with operation status
  - Note: Provide either point_ids OR query_filter, not both
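The "either point_ids or query_filter" rule for qdrant-delete can be checked before any request reaches Qdrant. The following is a sketch under assumptions rather than the server's actual validation: validate_delete_args is a hypothetical helper, and rejecting a call that supplies neither argument is an assumption the text above does not state.

```python
from typing import Any, Dict, List, Optional


def validate_delete_args(
    point_ids: Optional[List[str]] = None,
    query_filter: Optional[Dict[str, Any]] = None,
) -> str:
    """Return which selector to use, or raise if the arguments are ambiguous."""
    if point_ids and query_filter:
        raise ValueError("Provide either point_ids or query_filter, not both")
    if point_ids:
        return "ids"
    if query_filter:
        return "filter"
    # Assumption: deleting with no selector at all is treated as an error.
    raise ValueError("Provide point_ids or query_filter")


selector = validate_delete_args(point_ids=["uuid-1", "uuid-2"])
print(selector)
```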
The configuration of the server is done using environment variables:
| Name | Description | Default Value |
|---|---|---|
| QDRANT_URL | URL of the Qdrant server | None |
| QDRANT_API_KEY | API key for the Qdrant server | None |
| COLLECTION_NAME | Name of the default collection to use. | None |
| QDRANT_LOCAL_PATH | Path to the local Qdrant database (alternative to QDRANT_URL) | None |
| EMBEDDING_PROVIDER | Embedding provider to use (currently only "fastembed" is supported) | fastembed |
| EMBEDDING_MODEL | Name of the embedding model to use | sentence-transformers/all-MiniLM-L6-v2 |
| TOOL_STORE_DESCRIPTION | Custom description for the store tool | See default in settings.py |
| TOOL_FIND_DESCRIPTION | Custom description for the find tool | See default in settings.py |
| TOOL_DELETE_DESCRIPTION | Custom description for the delete tool | See default in settings.py |
| RERANKER_ENABLED | Enable reranker globally | false |
| RERANKER_URL | Reranker API endpoint URL | https://reranker.example.com/rerank |
| RERANKER_API_KEY | Bearer token for reranker API | None |
| RERANKER_CANDIDATE_POOL_SIZE | Number of candidates to retrieve before reranking | 30 |
| RERANKER_TOP_K | Number of top results to return after reranking | 8 |
| RERANKER_TIMEOUT | HTTP timeout in seconds for reranker API | 10 |
Note: You cannot provide both QDRANT_URL and QDRANT_LOCAL_PATH at the same time.
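The rule in the note above can be enforced when the settings are read. The sketch below shows one way to do that, assuming a hypothetical helper named resolve_qdrant_location; the keys of the returned dictionary are illustrative, and the server's real settings code may be organized differently.

```python
import os
from typing import Mapping, Optional


def resolve_qdrant_location(env: Mapping[str, str] = os.environ) -> dict:
    """Pick the Qdrant location from environment-style settings."""
    url: Optional[str] = env.get("QDRANT_URL")
    local_path: Optional[str] = env.get("QDRANT_LOCAL_PATH")
    if url and local_path:
        raise ValueError("Set QDRANT_URL or QDRANT_LOCAL_PATH, not both")
    if local_path:
        # Local mode: an on-disk database, so no API key is involved.
        return {"path": local_path}
    return {"url": url, "api_key": env.get("QDRANT_API_KEY")}


location = resolve_qdrant_location({"QDRANT_URL": "http://localhost:6333"})
print(location)
```

Passing a plain dictionary instead of os.environ keeps the check easy to exercise in tests.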
Important: Command-line arguments are not supported anymore! Please use environment variables for all configuration.
Since mcp-server-qdrant is based on FastMCP, it also supports all the FastMCP environment variables. The most
important ones are listed below:
| Environment Variable | Description | Default Value |
|---|---|---|
| FASTMCP_DEBUG | Enable debug mode | false |
| FASTMCP_LOG_LEVEL | Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO |
| FASTMCP_HOST | Host address to bind the server to | 127.0.0.1 |
| FASTMCP_PORT | Port to run the server on | 8000 |
| FASTMCP_WARN_ON_DUPLICATE_RESOURCES | Show warnings for duplicate resources | true |
| FASTMCP_WARN_ON_DUPLICATE_TOOLS | Show warnings for duplicate tools | true |
| FASTMCP_WARN_ON_DUPLICATE_PROMPTS | Show warnings for duplicate prompts | true |
| FASTMCP_DEPENDENCIES | List of dependencies to install in the server environment | [] |
When using uvx, no specific installation is needed to directly run mcp-server-qdrant.

```bash
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2" \
uvx mcp-server-qdrant
```

The server supports different transport protocols that can be specified using the --transport flag:
```bash
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
uvx mcp-server-qdrant --transport sse
```

Supported transport protocols:

- stdio (default): Standard input/output transport, usable only by local MCP clients
- sse: Server-Sent Events transport, suitable for remote clients
- streamable-http: Streamable HTTP transport, suitable for remote clients and more recent than SSE

The default transport is stdio if not specified.
When SSE transport is used, the server listens on the specified port and waits for incoming connections. The default port is 8000, but it can be changed using the FASTMCP_PORT environment variable.

```bash
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="my-collection" \
FASTMCP_PORT=1234 \
uvx mcp-server-qdrant --transport sse
```

A Dockerfile is available for building and running the MCP server:
```bash
# Build the container
docker build -t mcp-server-qdrant .

# Run the container
docker run -p 8000:8000 \
  -e FASTMCP_HOST="0.0.0.0" \
  -e QDRANT_URL="http://your-qdrant-server:6333" \
  -e QDRANT_API_KEY="your-api-key" \
  -e COLLECTION_NAME="your-collection" \
  mcp-server-qdrant
```

Tip: Note that FASTMCP_HOST="0.0.0.0" is set to make the server listen on all network interfaces. This is necessary when running the server in a Docker container.
To install Qdrant MCP Server for Claude Desktop automatically via Smithery:

```bash
npx @smithery/cli install mcp-server-qdrant --client claude
```

To use this server with the Claude Desktop app, add the following configuration to the "mcpServers" section of your claude_desktop_config.json:
```json
{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_URL": "https://xyz-example.eu-central.aws.cloud.qdrant.io:6333",
      "QDRANT_API_KEY": "your_api_key",
      "COLLECTION_NAME": "your-collection-name",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
    }
  }
}
```

For local Qdrant mode:
```json
{
  "qdrant": {
    "command": "uvx",
    "args": ["mcp-server-qdrant"],
    "env": {
      "QDRANT_LOCAL_PATH": "/path/to/qdrant/database",
      "COLLECTION_NAME": "your-collection-name",
      "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
    }
  }
}
```

This MCP server will automatically create a collection with the specified name if it doesn't exist.
By default, the server will use the sentence-transformers/all-MiniLM-L6-v2 embedding model to encode memories.
For the time being, only FastEmbed models are supported.
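If you would rather generate a Claude Desktop entry than edit it by hand, something like the following could produce the same structure as the JSON shown above. It is only a sketch: build_server_entry is a hypothetical helper, not part of mcp-server-qdrant, and it simply mirrors the remote and local variants while refusing to mix them.

```python
import json
from typing import Dict, Optional


def build_server_entry(
    collection_name: str,
    qdrant_url: Optional[str] = None,
    local_path: Optional[str] = None,
    api_key: Optional[str] = None,
) -> Dict[str, dict]:
    """Build the "qdrant" entry for the mcpServers section."""
    # Exactly one of the two locations must be given.
    if bool(qdrant_url) == bool(local_path):
        raise ValueError("Set exactly one of qdrant_url or local_path")
    env = {"COLLECTION_NAME": collection_name}
    if qdrant_url:
        env["QDRANT_URL"] = qdrant_url
        if api_key:
            env["QDRANT_API_KEY"] = api_key
    else:
        env["QDRANT_LOCAL_PATH"] = str(local_path)
    return {"qdrant": {"command": "uvx", "args": ["mcp-server-qdrant"], "env": env}}


entry = build_server_entry("your-collection-name", local_path="/path/to/qdrant/database")
print(json.dumps(entry, indent=2))
```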
This MCP server can be used with any MCP-compatible client. For example, you can use it with Cursor and VS Code, which provide built-in support for the Model Context Protocol.
You can configure this MCP server to work as a code search tool for Cursor or Windsurf by customizing the tool descriptions:

```bash
QDRANT_URL="http://localhost:6333" \
COLLECTION_NAME="code-snippets" \
TOOL_STORE_DESCRIPTION="Store reusable code snippets for later retrieval. \
The 'information' parameter should contain a natural language description of what the code does, \
while the actual code should be included in the 'metadata' parameter as a 'code' property. \
The value of 'metadata' is a Python dictionary with strings as keys. \
Use this whenever you generate some code snippet." \
TOOL_FIND_DESCRIPTION="Search for relevant code snippets based on natural language descriptions. \
The 'query' parameter should describe what you're looking for, \
and the tool will return the most relevant code snippets. \
Use this when you need to find existing code snippets for reuse or reference." \
uvx mcp-server-qdrant --transport sse # Enable SSE transport
```

In Cursor/Windsurf, you can then configure the MCP server in your settings by pointing to this running server using the SSE transport protocol. Instructions for adding an MCP server to Cursor can be found in the Cursor documentation. If you are running Cursor/Windsurf locally, you can use the following URL:
http://localhost:8000/sse
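If the port has been changed through FASTMCP_PORT, the URL changes with it. A small sketch of how a client-side script might derive it, assuming a hypothetical helper sse_url, a local server, and the /sse path shown above:

```python
import os
from typing import Mapping


def sse_url(env: Mapping[str, str] = os.environ, host: str = "localhost") -> str:
    """Build the SSE endpoint URL, falling back to the default port 8000."""
    port = int(env.get("FASTMCP_PORT", "8000"))
    return f"http://{host}:{port}/sse"


print(sse_url({"FASTMCP_PORT": "1234"}))  # http://localhost:1234/sse
```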
Tip: We suggest SSE transport as the preferred way to connect Cursor/Windsurf to the MCP server, as it can support remote connections. That makes it easy to share the server with your team or use it in a cloud environment.
This configuration transforms the Qdrant MCP server into a specialized code search tool that can:
- Store code snippets, documentation, and implementation details
- Retrieve relevant code examples based on semantic search
- Help developers find specific implementations or usage patterns
You can populate the database by storing natural language descriptions of code snippets (in the information parameter)
along with the actual code (in the metadata.code property), and then search for them using natural language queries
that describe what you're looking for.
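As an illustration of that split, the arguments for a single qdrant-store call could be assembled as below. This is a hedged sketch: build_store_arguments is a hypothetical helper, and the extra language key is an invented example of additional metadata, not something the server requires.

```python
import json
from typing import Any, Dict


def build_store_arguments(description: str, code: str, **extra: Any) -> Dict[str, Any]:
    """Put the natural language description in information and the code in metadata.code."""
    metadata = {"code": code, **extra}
    return {"information": description, "metadata": metadata}


args = build_store_arguments(
    "Reverse a string in Python",
    "def reverse(s: str) -> str:\n    return s[::-1]",
    language="python",  # hypothetical extra metadata field
)
print(json.dumps(args, indent=2))
```

A later search would then use a natural language query such as "reverse a string" rather than a fragment of the code itself.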
Note: The tool descriptions provided above are examples and may need to be customized for your specific use case. Consider adjusting the descriptions to better match your team's workflow and the specific types of code snippets you want to store and retrieve.
If you have successfully installed mcp-server-qdrant but still can't get it to work with Cursor, consider creating Cursor rules so that the MCP tools are always used when the agent produces a new code snippet. You can restrict the rules to certain file types to avoid using the MCP server for documentation or other types of content.
You can enhance Claude Code's capabilities by connecting it to this MCP server, enabling semantic search over your existing codebase.
- Add the MCP server to Claude Code:

  ```bash
  # Add mcp-server-qdrant configured for code search
  claude mcp add code-search \
    -e QDRANT_URL="http://localhost:6333" \
    -e COLLECTION_NAME="code-repository" \
    -e EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2" \
    -e TOOL_STORE_DESCRIPTION="Store code snippets with descriptions. The 'information' parameter should contain a natural language description of what the code does, while the actual code should be included in the 'metadata' parameter as a 'code' property." \
    -e TOOL_FIND_DESCRIPTION="Search for relevant code snippets using natural language. The 'query' parameter should describe the functionality you're looking for." \
    -- uvx mcp-server-qdrant
  ```

- Verify the server was added:

  ```bash
  claude mcp list
  ```

Tool descriptions, specified in TOOL_STORE_DESCRIPTION and TOOL_FIND_DESCRIPTION, guide Claude Code on how to use the MCP server. The ones provided above are examples and may need to be customized for your specific use case. However, Claude Code should already be able to:

- Use the qdrant-store tool to store code snippets with descriptions.
- Use the qdrant-find tool to search for relevant code snippets using natural language.
The MCP server can be run in development mode using the fastmcp dev command. This will start the server and open the MCP inspector in your browser.

```bash
COLLECTION_NAME=mcp-dev fastmcp dev src/mcp_server_qdrant/server.py
```
Add the following JSON block to your User Settings (JSON) file in VS Code. You can do this by pressing Ctrl + Shift + P and typing Preferences: Open User Settings (JSON).
```json
{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "qdrantUrl",
        "description": "Qdrant URL"
      },
      {
        "type": "promptString",
        "id": "qdrantApiKey",
        "description": "Qdrant API Key",
        "password": true
      },
      {
        "type": "promptString",
        "id": "collectionName",
        "description": "Collection Name"
      }
    ],
    "servers": {
      "qdrant": {
        "command": "uvx",
        "args": ["mcp-server-qdrant"],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}"
        }
      }
    }
  }
}
```

Or if you prefer using Docker, add this configuration instead:
```json
{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "qdrantUrl",
        "description": "Qdrant URL"
      },
      {
        "type": "promptString",
        "id": "qdrantApiKey",
        "description": "Qdrant API Key",
        "password": true
      },
      {
        "type": "promptString",
        "id": "collectionName",
        "description": "Collection Name"
      }
    ],
    "servers": {
      "qdrant": {
        "command": "docker",
        "args": [
          "run",
          "-p", "8000:8000",
          "-i",
          "--rm",
          "-e", "QDRANT_URL",
          "-e", "QDRANT_API_KEY",
          "-e", "COLLECTION_NAME",
          "mcp-server-qdrant"
        ],
        "env": {
          "QDRANT_URL": "${input:qdrantUrl}",
          "QDRANT_API_KEY": "${input:qdrantApiKey}",
          "COLLECTION_NAME": "${input:collectionName}"
        }
      }
    }
  }
}
```

Alternatively, you can create a .vscode/mcp.json file in your workspace with the following content:
```json
{
  "inputs": [
    {
      "type": "promptString",
      "id": "qdrantUrl",
      "description": "Qdrant URL"
    },
    {
      "type": "promptString",
      "id": "qdrantApiKey",
      "description": "Qdrant API Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "collectionName",
      "description": "Collection Name"
    }
  ],
  "servers": {
    "qdrant": {
      "command": "uvx",
      "args": ["mcp-server-qdrant"],
      "env": {
        "QDRANT_URL": "${input:qdrantUrl}",
        "QDRANT_API_KEY": "${input:qdrantApiKey}",
        "COLLECTION_NAME": "${input:collectionName}"
      }
    }
  }
}
```

For workspace configuration with Docker, use this in .vscode/mcp.json:
```json
{
  "inputs": [
    {
      "type": "promptString",
      "id": "qdrantUrl",
      "description": "Qdrant URL"
    },
    {
      "type": "promptString",
      "id": "qdrantApiKey",
      "description": "Qdrant API Key",
      "password": true
    },
    {
      "type": "promptString",
      "id": "collectionName",
      "description": "Collection Name"
    }
  ],
  "servers": {
    "qdrant": {
      "command": "docker",
      "args": [
        "run",
        "-p", "8000:8000",
        "-i",
        "--rm",
        "-e", "QDRANT_URL",
        "-e", "QDRANT_API_KEY",
        "-e", "COLLECTION_NAME",
        "mcp-server-qdrant"
      ],
      "env": {
        "QDRANT_URL": "${input:qdrantUrl}",
        "QDRANT_API_KEY": "${input:qdrantApiKey}",
        "COLLECTION_NAME": "${input:collectionName}"
      }
    }
  }
}
```

If you have suggestions for how mcp-server-qdrant could be improved, or want to report a bug, open an issue! We'd love all and any contributions.
The MCP inspector is a developer tool for testing and debugging MCP servers. It runs both a client UI (default port 5173) and an MCP proxy server (default port 3000). Open the client UI in your browser to use the inspector.
```bash
QDRANT_URL=":memory:" COLLECTION_NAME="test" \
fastmcp dev src/mcp_server_qdrant/server.py
```

Once started, open your browser to http://localhost:5173 to access the inspector interface.
This MCP server is licensed under the Apache License 2.0. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the Apache License 2.0. For more details, please see the LICENSE file in the project repository.