This document provides step-by-step instructions for using the main components of HRAftu-LM-RAG. Each section covers how to run scripts and services in the src/ directory, along with example commands.
## Configuration File

Before invoking any script, ensure you have copied `config/example_config.yaml` to `config/config.yaml` and filled in:

- `fastgpt.api_key` and `fastgpt.host`
- Vision model device/port settings
- ClickHouse connection details (if using similarity evaluation)
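The authoritative schema is whatever ships in `config/example_config.yaml`; as a rough illustration only (key names other than `fastgpt.api_key` and `fastgpt.host` are assumptions, not taken from the repository), the file might look like:

```yaml
fastgpt:
  api_key: "fastgpt-xxxxxxxx"     # your FastGPT API key
  host: "http://localhost:3000"   # FastGPT server address (assumed value)

vision:                           # illustrative section name
  device: "cuda:0"
  port: 8000

clickhouse:                       # only needed for similarity evaluation
  host: "localhost"
  port: 9000
  user: "default"
  password: ""
  database: "clkg"
```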
## Environment Activation

```bash
cd HRAftu-LM-RAG
source venv/bin/activate   # or: conda activate hraftu
```
## Data Import

The Data Import script (`import_data_to_knowledge_datatabase.py`) uploads a directory of local files (PDF, TXT, DOCX, etc.) into a FastGPT dataset/collection.
```bash
python src/import_data/import_data_to_knowledge_datatabase.py \
    --directory_path <PATH_TO_FILES> \
    --database <FASTGPT_DATASET_NAME> \
    [--collect_name <COLLECTION_NAME>] \
    [--parm <create|update|delete>] \
    [--parentId <PARENT_COLLECTION_ID>]
```

- `--directory_path` (required): Absolute or relative path to the folder containing documents.
- `--database` (required): The FastGPT dataset/collection name under which to import.
- `--collect_name` (optional): A human-readable name for the collection; if omitted, defaults to the directory name.
- `--parm` (optional):
  - `create` (default) → add all new documents, skip existing ones.
  - `update` → replace documents that already exist.
  - `delete` → remove documents matching names in the folder.
- `--parentId` (optional): ID of an existing parent collection to nest this collection under.
```bash
python src/import_data/import_data_to_knowledge_datatabase.py \
    --directory_path ./data/technical_papers \
    --database ResearchCorpus \
    --collect_name "TechPapers" \
    --parm create
```

This will:

- Read every file under `./data/technical_papers`.
- Create or update the FastGPT collection named `ResearchCorpus/TechPapers`.
- Skip any files that are already indexed.
## Embedding Service

The Embedding Service (`embedding_web.py`) is a Flask-based HTTP server that loads multiple embedding models. Clients send text and specify which model to use, and the service returns normalized vectors.

```bash
python src/embedding_service/embedding_web.py --port 55443
```

- `--port` (optional): Port number on which Flask will listen (default: `55443`).
When the service starts, you'll see console output similar to:

```
[INFO] Loading model: pubmedbert
[INFO] Loading model: all-MiniLM-L6-v2
[INFO] Loading model: bge-large-en-v1.5
[INFO] Loading model: gte-large
[INFO] Flask server running on http://0.0.0.0:55443
```
- URL: `POST http://<host>:<port>/v1/embeddings`
- Request body (JSON):

  ```json
  {
    "input": ["Text sentence 1", "Text sentence 2", ...],
    "model": "all-MiniLM-L6-v2"
  }
  ```

- Response body (JSON array):

  ```json
  [
    [0.123, -0.456, ...],  // normalized vector for "Text sentence 1"
    [0.789, 0.012, ...]    // normalized vector for "Text sentence 2"
  ]
  ```
```bash
curl -X POST http://localhost:55443/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
        "input": ["The quick brown fox jumps over the lazy dog."],
        "model": "all-MiniLM-L6-v2"
      }'
```

You should receive a JSON array containing one embedding vector.
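Because the service returns normalized (unit-length) vectors, cosine similarity between two embeddings reduces to a plain dot product. A small self-contained sketch, using made-up vectors in place of real model output:

```python
import math

def normalize(vec):
    """Scale a vector to unit length, as the embedding service does."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """For unit vectors, cosine similarity is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

# Toy vectors standing in for real embedding output.
v1 = normalize([0.123, -0.456, 0.789])
v2 = normalize([0.120, -0.450, 0.800])

print(cosine_similarity(v1, v2))  # close to 1.0 for near-identical vectors
```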
## LLM Batch Query

The LLM Batch Query script (`llm_query.py`) sends multiple prompts in parallel to FastGPT's Chat Completions endpoint and collects responses.
```bash
python src/llm_query/llm_query.py \
    --input_file <PATH_TO_PROMPTS_FILE> \
    --output_file <PATH_TO_OUTPUT_JSON> \
    [--max_workers <NUM_THREADS>] \
    [--timeout <SECONDS>]
```

- `--input_file` (required): Path to a CSV or JSON file containing a list of prompts.
- `--output_file` (required): Path where the script will write the aggregated JSON responses.
- `--max_workers` (optional): Number of parallel threads to use (default: `4`).
- `--timeout` (optional): Timeout in seconds per request (default: `30`).
- CSV:

  ```csv
  id,prompt
  1,"What is the capital of France?"
  2,"Explain the theory of relativity in simple terms."
  ```

- JSON:

  ```json
  [
    {"id": "1", "prompt": "What is the capital of France?"},
    {"id": "2", "prompt": "Explain the theory of relativity in simple terms."}
  ]
  ```
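The parallel fan-out that `llm_query.py` performs can be sketched with Python's `concurrent.futures`. The `send_prompt` function below is a stub standing in for the real FastGPT HTTP call; the actual request logic lives in the script itself:

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed

def send_prompt(item):
    """Stub for the real FastGPT /v1/chat/completions request."""
    return {"id": item["id"], "response": f"(answer to: {item['prompt']})"}

def run_batch(prompts, max_workers=4):
    """Submit all prompts in parallel and collect results in id order."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(send_prompt, p) for p in prompts]
        for fut in as_completed(futures):
            results.append(fut.result())
    return sorted(results, key=lambda r: r["id"])

prompts = [
    {"id": "1", "prompt": "What is the capital of France?"},
    {"id": "2", "prompt": "Explain the theory of relativity in simple terms."},
]
print(json.dumps(run_batch(prompts, max_workers=8), indent=2))
```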
```bash
python src/llm_query/llm_query.py \
    --input_file examples/sample_prompts.csv \
    --output_file output/llm_responses.json \
    --max_workers 8
```

The script will read each prompt, submit it to FastGPT's `/v1/chat/completions` endpoint, and write a list of objects like:

```json
[
  {"id": "1", "response": "Paris is the capital of France."},
  {"id": "2", "response": "The theory of relativity, developed by Einstein, states that ..."}
]
```
## LVM Query

LVM Query refers to querying vision-language models separately from the LLM batch query. Each model (LLaVa, Llama-3.2-Vision, Phi-3-Vision, Phi-3.5-Vision, Pixtral) exposes its own RESTful endpoint. Below are example commands for each.

Note: Ensure you have already started the relevant LVM service (see `docs/installation.md`).
### LLaVa

- Endpoint: `POST http://localhost:11434/completions`
- Example `curl`:

  ```bash
  curl -X POST http://localhost:11434/completions \
    -H "Content-Type: application/json" \
    -d '{
          "prompt": "Describe this image: <Base64-encoded-image-data>"
        }'
  ```

- Response:

  ```json
  {
    "id": "abc123",
    "choices": [
      {"text": "A dog is running in a grassy field.", "finish_reason": "stop"}
    ]
  }
  ```
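All of the LVM endpoints accept image content as a Base64 string. One way to produce that string in Python (the commented-out filename is only an example, not a file in the repo):

```python
import base64
import json

def encode_image_bytes(data: bytes) -> str:
    """Return the Base64 text form of raw image bytes."""
    return base64.b64encode(data).decode("ascii")

# In practice you would read a real image file:
#   data = open("example.png", "rb").read()
data = b"\x89PNG\r\n\x1a\n"  # fake PNG header bytes, for illustration only

payload = {"prompt": f"Describe this image: {encode_image_bytes(data)}"}
print(json.dumps(payload))
```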
### Llama-3.2-Vision

- Endpoint: `POST http://localhost:8000`
- Payload format:

  ```json
  {
    "inputs": [
      {"type": "image", "data": "<Base64-or-URL>"},
      {"type": "text", "text": "What objects do you see in this image?"}
    ]
  }
  ```

- Example `curl`:

  ```bash
  curl -X POST http://localhost:8000 \
    -H "Content-Type: application/json" \
    -d '{
          "inputs": [
            {"type": "image", "data": "<Base64-encoded-image>"},
            {"type": "text", "text": "What objects do you see in this image?"}
          ]
        }'
  ```

- Response:

  ```json
  {
    "id": "req-456",
    "outputs": [
      {"text": "I see a cat sitting on a windowsill.", "entities": []}
    ]
  }
  ```
### Phi-3-Vision

- Endpoint: `POST http://localhost:8001`
- Example `curl`:

  ```bash
  curl -X POST http://localhost:8001 \
    -H "Content-Type: application/json" \
    -d '{
          "inputs": [
            {"type": "image", "data": "<Base64>"},
            {"type": "text", "text": "Identify any medical anomalies in this X-ray image."}
          ]
        }'
  ```

- Response:

  ```json
  {
    "id": "req-789",
    "outputs": [
      {"text": "There is a small opacity in the lower right lung field.",
       "entities": [{"label": "Anomaly", "text": "opacity", "confidence": 0.92}]}
    ]
  }
  ```
### Phi-3.5-Vision

- Endpoint: `POST http://localhost:8002`
- Example `curl`:

  ```bash
  curl -X POST http://localhost:8002 \
    -H "Content-Type: application/json" \
    -d '{
          "inputs": [
            {"type": "image", "data": "<Base64>"},
            {"type": "text", "text": "Extract any chemical structures from this image."}
          ]
        }'
  ```

- Response:

  ```json
  {
    "id": "req-012",
    "outputs": [
      {"text": "I see ethanol and benzene rings.",
       "entities": [{"label": "Molecule", "text": "ethanol", "confidence": 0.87},
                    {"label": "Molecule", "text": "benzene", "confidence": 0.90}]}
    ]
  }
  ```
### Pixtral

- Endpoint: `POST http://localhost:8003`
- Example `curl`:

  ```bash
  curl -X POST http://localhost:8003 \
    -H "Content-Type: application/json" \
    -d '{
          "inputs": [
            {"type": "image", "data": "<Base64>"},
            {"type": "text", "text": "Describe the scene and list any textual labels visible."}
          ]
        }'
  ```

- Response:

  ```json
  {
    "id": "req-345",
    "outputs": [
      {"text": "A street sign says 'Main St.' and a traffic light is green.",
       "entities": [{"label": "Text", "text": "Main St.", "confidence": 0.95}]}
    ]
  }
  ```
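The LVM responses above share an `outputs`/`entities` shape, which the similarity evaluation later consumes. A sketch (assuming exactly this JSON layout) of collapsing a response body into a set of `(label, text)` pairs:

```python
import json

def entity_set(response_json: str) -> set:
    """Extract (label, text) pairs from an LVM response body."""
    doc = json.loads(response_json)
    return {
        (e["label"], e["text"])
        for out in doc.get("outputs", [])
        for e in out.get("entities", [])
    }

raw = """{
  "id": "req-789",
  "outputs": [
    {"text": "There is a small opacity in the lower right lung field.",
     "entities": [{"label": "Anomaly", "text": "opacity", "confidence": 0.92}]}
  ]
}"""
print(entity_set(raw))  # {('Anomaly', 'opacity')}
```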
## Similarity Evaluation

The Similarity Evaluation script (`jaccard_similarity.py`) compares LVM outputs against ground-truth annotations and writes an Excel report containing Jaccard similarity (and optional EMD) scores.
```bash
python src/similarity/jaccard_similarity.py \
    --ground_truth <PATH_TO_GROUND_TRUTH_XLSX> \
    --clickhouse_table <CLICKHOUSE_TABLE_NAME> \
    --output <PATH_TO_OUTPUT_XLSX> \
    [--host <CLICKHOUSE_HOST>] \
    [--port <CLICKHOUSE_PORT>] \
    [--user <CLICKHOUSE_USER>] \
    [--password <CLICKHOUSE_PASSWORD>] \
    [--database <CLICKHOUSE_DB>]
```

- `--ground_truth` (required): Excel file with annotated entities/relations (e.g., `schema-test.xlsx`).
- `--clickhouse_table` (required): Table where LVM outputs are stored (e.g., `clkg.vision_outputs`).
- `--output` (required): Path to the resulting Excel report (e.g., `results_with_all_similarity_and_emd5.xlsx`).
- ClickHouse connection parameters default to those in `config/config.yaml` if not provided.
Your ground-truth Excel file should have columns such as:

| id    | entity_name | entity_label | relation_type | ... |
|-------|-------------|--------------|---------------|-----|
| img_1 | "cat"       | "Animal"     | "has_tail"    | ... |
| img_2 | "benzene"   | "Molecule"   | "..."         | ... |

Each row corresponds to one entity or relation annotation for a specific `id`.
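Jaccard similarity between a predicted entity set and a ground-truth set is |A ∩ B| / |A ∪ B|. A minimal reference implementation of just that core (the actual script layers ClickHouse I/O and Excel reporting on top):

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

predicted = {("Molecule", "ethanol"), ("Molecule", "benzene")}
truth     = {("Molecule", "benzene"), ("Molecule", "toluene")}

print(jaccard(predicted, truth))  # 1 shared item out of 3 distinct → 1/3
```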
```bash
python src/similarity/jaccard_similarity.py \
    --ground_truth data/schema-test.xlsx \
    --clickhouse_table clkg.vision_outputs \
    --output results_with_all_similarity_and_emd5.xlsx
```

The script will:

- Connect to ClickHouse (`host`, `port`, `user`, `password`, `database` from `config/config.yaml` by default).
- Query all rows in `clkg.vision_outputs` and parse the `output_json` column into entity/relation sets.
- Compare against the `schema-test.xlsx` annotations.
- Compute Jaccard similarity for each record and write results to `results_with_all_similarity_and_emd5.xlsx`.
## End-to-End Workflow

1. Import Data

   ```bash
   python src/import_data/import_data_to_knowledge_datatabase.py \
       --directory_path ./data/technical_papers \
       --database ResearchCorpus
   ```

2. Start Embedding Service

   ```bash
   python src/embedding_service/embedding_web.py --port 55443
   ```

3. Verify Embedding API

   ```bash
   curl -X POST http://localhost:55443/v1/embeddings \
     -H "Content-Type: application/json" \
     -d '{"input": ["Deep learning for NLP."], "model": "all-MiniLM-L6-v2"}'
   ```

4. Run LLM Batch Query

   ```bash
   python src/llm_query/llm_query.py \
       --input_file examples/sample_prompts.csv \
       --output_file output/llm_responses.json
   ```

5. Query LVM (LLaVa example)

   ```bash
   curl -X POST http://localhost:11434/completions \
     -H "Content-Type: application/json" \
     -d '{"prompt": "Describe the following image: <Base64-encoded-image>"}'
   ```

6. Store LVM Output in ClickHouse

   This step assumes your inference client writes its results to `clkg.vision_outputs` in ClickHouse.

7. Run Similarity Evaluation

   ```bash
   python src/similarity/jaccard_similarity.py \
       --ground_truth data/schema-test.xlsx \
       --clickhouse_table clkg.vision_outputs \
       --output results_with_all_similarity_and_emd5.xlsx
   ```
After following these steps, you should have:
- Documents indexed in FastGPT.
- Embedding Service running and returning float vectors.
- LLM responses saved to JSON.
- LVM inferences stored in ClickHouse.
- Similarity scores exported to an Excel report.