HR Buddy is an application inspired by HR chatbot portals. It combines a Llama LLM with a Nomic embedding model and uses Retrieval-Augmented Generation (RAG) to ground its answers strictly in the HR policies PDF.
flowchart TD
UI["Streamlit UI"]
SESS["Session State<br/>(history, session_id)"]
MDB[("MongoDB<br/>Auth & Chat History")]
subgraph RAG [RAG Engine]
direction LR
subgraph Hybrid [Hybrid Retrieval]
SEM["Semantic Search<br/>ChromaDB + OllamaEmbeddings<br/>(MMR, fetch_k=18)"]
BM25["Keyword Search<br/>BM25 (heading-enriched<br/>chunks, top_k=6)"]
FUSION["Weighted Fusion<br/>(semantic=0.7, bm25=0.3)"]
SEM --> FUSION
BM25 --> FUSION
end
CTX["Context Assembly<br/>(top_k=6 documents)"]
LLM["Ollama Llama 3.2<br/>3B params<br/>(temp=0.1, ctx=4096)"]
Hybrid --> CTX --> LLM
end
UI --> SESS
SESS --> MDB
UI -->|"user input + history"| RAG
LLM -->|"response stream"| UI
Before running the application, ensure your system has the following:
- Docker & Docker Compose installed.
- Hardware: Minimum 8GB RAM (16GB+ recommended) to run the Llama 3.2 model smoothly.
- OS: Linux or macOS (Windows users should use WSL2).
If you are on macOS or Linux, make the shell script executable:
chmod +x run.sh
Then, just run the shell script.
./run.sh
Note: If you use a shell other than bash, edit the first line of run.sh (the shebang) to point to the shell of your choice.
The script sets up the Ollama package and the model, and builds the Docker container.
- Frontend: Streamlit
- AI/LLM: Ollama (Llama 3.2 3B)
- Embeddings: Nomic Embed Text
- Vector Store: ChromaDB
- Retrieval: Hybrid search (semantic + BM25 keyword)
- Database: MongoDB (for user authentication and chat history)
- Orchestration: LangChain & Docker
The app uses hybrid search combining semantic (vector) and keyword (BM25) retrieval:
| Parameter | Default | Description |
|---|---|---|
| `enabled` | `true` | Toggle hybrid search on/off |
| `semantic_weight` | `0.7` | Weight for semantic (embedding) similarity scores |
| `bm25_weight` | `0.3` | Weight for BM25 keyword matching scores |
| `top_k` | `6` | Number of final documents returned to the LLM |
| `fetch_k` | `18` | Documents fetched per retriever before fusion |
Disable hybrid search ("enabled": false) to fall back to semantic-only retrieval.
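
To make the weighting concrete, here is a minimal sketch of how a weighted fusion of the two retrievers could work. It is not taken from the app's code: the function names `min_max` and `weighted_fusion` are hypothetical, and the min-max normalization step is an assumption (the README does not say how the two score scales are made comparable). Only the default values (0.7, 0.3, top_k=6) come from the table above.

```python
from collections import defaultdict


def min_max(scores):
    """Scale a {doc_id: score} mapping into [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a full match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_fusion(semantic, bm25, semantic_weight=0.7, bm25_weight=0.3, top_k=6):
    """Combine two score mappings into one ranked list of (doc_id, score)."""
    fused = defaultdict(float)
    for doc, s in min_max(semantic).items():
        fused[doc] += semantic_weight * s
    for doc, s in min_max(bm25).items():
        fused[doc] += bm25_weight * s
    # Highest fused score first; keep only the top_k documents.
    ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]
```

For example, `weighted_fusion({"a": 0.9, "b": 0.5, "c": 0.1}, {"b": 12.0, "d": 4.0}, top_k=2)` ranks `a` first and `b` second: `a` wins on semantic similarity alone, while `b` is lifted by appearing in both result sets.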
By default, the application uses the provided 2016 HR Manual. To use your own data:
- Delete the existing PDF in the `rag_source/` directory.
- Place your company's HR policy PDF into `rag_source/`.
- Update the `PDF_PATH` variable in `main.py` if the filename changes.
- Restart the containers to trigger a fresh vector embedding.
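
Before restarting, it can help to confirm that the source directory contains exactly one PDF. The helper below is a hypothetical sketch and not part of the app: `find_policy_pdf` is an illustrative name, and it assumes it is run from the directory that contains `rag_source/`.

```python
from pathlib import Path


def find_policy_pdf(source_dir="rag_source"):
    """Return the single PDF in source_dir, or raise if there is not exactly one."""
    # Note: the "*.pdf" pattern is case-sensitive on Linux.
    pdfs = sorted(Path(source_dir).glob("*.pdf"))
    if len(pdfs) != 1:
        raise ValueError(
            f"expected exactly one PDF in {source_dir}, found {len(pdfs)}"
        )
    return pdfs[0]
```

The returned path's name is the value to compare against `PDF_PATH` in `main.py`.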
Ollama Connection Refused inside Docker: If the Streamlit app cannot reach Ollama, configure Ollama to listen on all interfaces so that it is reachable from the Docker bridge network (the steps below assume Ollama runs as a systemd service on Linux).
- Run `sudo systemctl edit ollama.service`.
- Add the following under the `[Service]` block: `Environment="OLLAMA_HOST=0.0.0.0"`
- Save, then run `sudo systemctl daemon-reload` and `sudo systemctl restart ollama`.
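
To check whether the fix worked, a plain TCP probe against the Ollama port is often enough. The sketch below is not part of the app; `can_reach` is a hypothetical helper. Port 11434 is Ollama's default, while the hostname to probe from inside the container depends on your Docker setup and is an assumption you will need to adjust.

```python
import socket


def can_reach(host, port=11434, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

Calling `can_reach("localhost")` on the host checks that Ollama is listening at all; calling it from inside the app container with the host's address checks that the bridge network can reach it.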
ToastCoder * GitHub: @ToastCoder