Realtime analytics server for AI DIAL. The service consumes the log stream from AI DIAL Core, analyzes the conversations, and writes the analytics to InfluxDB.
Refer to the Documentation to learn how to configure AI DIAL Core and other necessary components.
Check the AI DIAL Core documentation to configure how logs are sent to the instance of the realtime analytics server.
The realtime analytics server analyzes the log stream provided by Vector in real time and writes metrics to InfluxDB.
The logs for /chat/completions and /embeddings endpoints are saved to the analytics measurement with the following tags and fields:
| Tag | Description |
|---|---|
| model | The model name for the request. |
| deployment | The deployment name of the model or application for the request. |
| parent_deployment | The deployment name of the model or application that called the current deployment. |
| execution_path | A list of deployment calls representing the call stack of the request. E.g. ['app1', 'app2', 'model1'] means app1 called app2 and app2 called model1. The last element of the list equals to the deployment tag. The penultimate element of the list (when present) equals to the parent_deployment tag. |
| trace_id | OpenTelemetry trace ID. |
| core_span_id | OpenTelemetry span ID generated by DIAL Core. |
| core_parent_span_id | OpenTelemetry span ID generated by DIAL Core that called the span core_span_id. |
| project_id | The project ID for the request. |
| language | The language detected for the content of the request. |
| upstream | The upstream endpoint used by the DIAL model. |
| topic | The topic detected for the content of the request. |
| title | The title of the person making the request. |
| response_id | Unique ID of the response. For a chat completion response, it equals the id response field; for an embedding request, it is generated from scratch as a UUID. |
| Field | Type | Description |
|---|---|---|
| user_hash | string | The unique hash identifying the user. |
| deployment_price | float | The cost of this specific request, excluding the cost of any requests it directly or indirectly initiated. |
| price | float | The total cost of the request, including the cost of this request and all related requests it directly or indirectly triggered. It always holds that price >= deployment_price. |
| number_request_messages | int | The total number of messages in the request. For chat completion requests, it is the number of messages in the chat history; for embedding requests, it is the number of inputs. |
| chat_id | string | The unique identifier for the conversation that this request is part of. |
| prompt_tokens | int | The number of tokens in the request. |
| cached_prompt_tokens | int | The number of tokens read from the model cache. It always holds that cached_prompt_tokens <= prompt_tokens. |
| completion_tokens | int | The number of tokens in the response. |
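The relationships among these tags and fields can be illustrated with a short sketch. The dict layout below is hypothetical (records are actually stored as InfluxDB points); only the tag and field names follow the tables above:

```python
# Sketch: check the documented invariants of an "analytics" record.
# The dict layout is hypothetical; tag/field names follow the tables above.

def check_analytics_record(record: dict) -> None:
    path = record["execution_path"]
    # The last element of execution_path equals the deployment tag.
    assert path[-1] == record["deployment"]
    # The penultimate element (when present) equals parent_deployment.
    if len(path) > 1:
        assert path[-2] == record["parent_deployment"]
    # The total price includes the deployment's own price plus downstream calls.
    assert record["price"] >= record["deployment_price"]
    # Cached prompt tokens can never exceed the prompt tokens.
    assert record["cached_prompt_tokens"] <= record["prompt_tokens"]

record = {
    "execution_path": ["app1", "app2", "model1"],
    "deployment": "model1",
    "parent_deployment": "app2",
    "price": 0.0009,
    "deployment_price": 0.0003,
    "prompt_tokens": 25,
    "cached_prompt_tokens": 10,
}
check_analytics_record(record)  # all invariants hold
```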
The logs for the /rate endpoint are saved to the rate_analytics measurement:
| Tag | Description |
|---|---|
| deployment | The deployment name of the model or application for the request. |
| project_id | The project ID for the request. |
| title | The title of the person making the request. |
| response_id | Unique ID of the response. |
| user_hash | The unique hash identifying the user. |
| chat_id | The unique identifier for the conversation that this request is part of. |
| Field | Type | Description |
|---|---|---|
| dislike_count | int | 1 for a thumbs down request, otherwise 0. |
| like_count | int | 1 for a thumbs up request, otherwise 0. |
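Each /rate event increments exactly one of these counters. A minimal sketch (the boolean vote input is an assumed shape, not the actual DIAL request schema):

```python
# Sketch: derive the rate_analytics counters from a thumbs up/down vote.
# The boolean "like" input is an assumed payload shape, not the DIAL schema.

def rate_fields(like: bool) -> dict:
    return {
        "like_count": 1 if like else 0,     # thumbs up
        "dislike_count": 0 if like else 1,  # thumbs down
    }

assert rate_fields(True) == {"like_count": 1, "dislike_count": 0}
assert rate_fields(False) == {"like_count": 0, "dislike_count": 1}
```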
The logs for the /mcp endpoint are saved to the mcp_analytics measurement:
| Tag | Description |
|---|---|
| project_id | The project ID for the request. |
| title | The title of the person making the request. |
| deployment | The deployment name of a DIAL toolset corresponding to the MCP call. |
| parent_deployment | The deployment name of the model or application that called the DIAL toolset. |
| mcp_method | The MCP method name, such as tools/list, tools/call, etc. |
| Field | Type | Description |
|---|---|---|
| execution_path | string | A list of deployment calls representing the call stack of the request. E.g. ['app1', 'app2', 'toolset1'] means app1 called app2 and app2 called toolset1. The last element of the list equals to the deployment tag. The penultimate element of the list (when present) equals to the parent_deployment tag. |
| chat_id | string | The unique identifier for the conversation that this request is part of. |
| user_hash | string | The unique hash identifying the user. |
| upstream | string | The upstream endpoint of the DIAL toolset. |
| trace_id | string | OpenTelemetry trace ID. |
| core_span_id | string | OpenTelemetry span ID generated by DIAL Core. |
| core_parent_span_id | string | OpenTelemetry span ID generated by DIAL Core that called the span core_span_id. |
| mcp_tool_call_name | string | The name of the requested tool, present when mcp_method equals tools/call. |
Note
Only requests with HTTP status code 200 are processed by the analytics server.
Copy .env.example to .env and customize it for your environment.
You need to specify the connection options to the InfluxDB instance using the environment variables:
| Variable | Description |
|---|---|
| INFLUX_URL | URL to the InfluxDB to write the analytics data |
| INFLUX_ORG | Name of the InfluxDB organization to write the analytics data |
| INFLUX_BUCKET | Name of the bucket to write the analytics data |
| INFLUX_API_TOKEN | InfluxDB API Token |
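A small sketch of consuming these variables, failing fast when one is missing. The validation helper below is illustrative, not part of the service:

```python
# Sketch: read the required InfluxDB 2 connection settings from the
# environment, failing fast when a variable is missing or empty.
import os

REQUIRED = ("INFLUX_URL", "INFLUX_ORG", "INFLUX_BUCKET", "INFLUX_API_TOKEN")

def influx_settings() -> dict:
    missing = [name for name in REQUIRED if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing InfluxDB settings: {', '.join(missing)}")
    return {name: os.environ[name] for name in REQUIRED}

# Example: populate the environment as .env would and read it back.
os.environ.update({
    "INFLUX_URL": "http://localhost:8086",
    "INFLUX_ORG": "my-org",
    "INFLUX_BUCKET": "analytics",
    "INFLUX_API_TOKEN": "my-token",
})
print(influx_settings()["INFLUX_URL"])  # http://localhost:8086
```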
You can follow the InfluxDB 2 documentation to set up InfluxDB locally and acquire the required configuration parameters.
You need to specify the connection options to the InfluxDB instance using the environment variables:
| Variable | Description |
|---|---|
| INFLUX_URL | URL to the InfluxDB to write the analytics data |
| INFLUX_DATABASE | Name of the InfluxDB 3 database to write the analytics data |
| INFLUX_API_TOKEN | InfluxDB API Token with the write access to the target database |
You can follow the InfluxDB 3 documentation to set up InfluxDB locally and acquire the required configuration parameters.
Important
The INFLUX_DATABASE variable was introduced in version 0.22.0. For earlier versions, set the INFLUX_BUCKET variable to the target database name and the INFLUX_ORG variable to any non-empty value (e.g. "ignored") to enable InfluxDB 3 support.
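For example, a pre-0.22.0 .env targeting InfluxDB 3 would look like this (URL and token values are placeholders):

```
# Pre-0.22.0 configuration for InfluxDB 3 (values are placeholders)
INFLUX_URL=http://localhost:8181
# INFLUX_BUCKET holds the InfluxDB 3 database name
INFLUX_BUCKET=analytics
# INFLUX_ORG must be any non-empty value; it is otherwise ignored
INFLUX_ORG=ignored
INFLUX_API_TOKEN=my-token
```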
This project includes optional aggregated Grafana dashboards that visualize 6-hour and monthly trends.
To enable these dashboards, you must manually create the required InfluxDB buckets and tasks. These steps are not automated via Helm and must be applied manually.
See influxdb/README.md for full instructions.
Important
Aggregated Dashboards are only supported for InfluxDB 2.
Also, the following environment variables can be used to configure the service behavior:
| Variable | Default | Description |
|---|---|---|
| MODEL_RATES | {} | Specifies per-token price rates for models in JSON format |
| TOPIC_MODEL | | Specifies the name or path of the topic model. If the model is specified by name, it is downloaded from Hugging Face. When unset or set to an empty string, the topic classification feature is disabled. |
| TOPIC_EMBEDDINGS_MODEL | | Specifies the name or path of the embeddings model used with the topic model. If the model is specified by name, it is downloaded from Hugging Face. When unset or set to an empty string, the name from the topic model config is used. |
| LOG_LEVEL | INFO | The server logging level. Use DEBUG for development and INFO in production. |
Example of the MODEL_RATES configuration:
```json
{
  "gpt-4": {
    "unit": "token",
    "prompt_price": "0.00003",
    "completion_price": "0.00006"
  },
  "gpt-35-turbo": {
    "unit": "token",
    "prompt_price": "0.0000015",
    "completion_price": "0.000002"
  },
  "gpt-4-32k": {
    "unit": "token",
    "prompt_price": "0.00006",
    "completion_price": "0.00012"
  },
  "text-embedding-ada-002": {
    "unit": "token",
    "prompt_price": "0.0000001"
  },
  "chat-bison@001": {
    "unit": "char_without_whitespace",
    "prompt_price": "0.0000005",
    "completion_price": "0.0000005"
  }
}
```

This project requires Python ≥3.11 and Poetry ≥2.1.1 for dependency management.
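The per-token rates in the MODEL_RATES example above can be applied like this (an illustrative sketch, not the service's actual pricing code):

```python
# Sketch: compute a request price from a MODEL_RATES entry.
# This helper is illustrative; it is not the service's actual implementation.
from decimal import Decimal

MODEL_RATES = {
    "gpt-4": {
        "unit": "token",
        "prompt_price": "0.00003",
        "completion_price": "0.00006",
    },
}

def request_price(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    rates = MODEL_RATES[model]
    assert rates["unit"] == "token"  # char-based units need a character count instead
    return (Decimal(rates["prompt_price"]) * prompt_tokens
            + Decimal(rates["completion_price"]) * completion_tokens)

print(request_price("gpt-4", 1000, 500))  # 0.06000
```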
- Install Poetry. See the official installation guide.
- (Optional) Specify custom Python or Poetry executables in `.env.dev`. This is useful if multiple versions are installed. By default, `python` and `poetry` are used.

  ```
  POETRY_PYTHON=path-to-python-exe
  POETRY=path-to-poetry-exe
  ```

- Create and activate the virtual environment:

  ```
  make init_env
  source .venv/bin/activate
  ```

- Install project dependencies (including linting, formatting, and test tools):

  ```
  make install
  ```
To build the wheel packages, run:

```
make build
```

To run the development server locally, run:

```
make serve
```

The server will be available at http://localhost:5001.

To build the Docker image, run:

```
make docker_build
```

To run the server locally from the Docker image, run:

```
make docker_serve
```

The server will be available at http://localhost:5001.

Run the linting before committing:

```
make lint
```

To auto-fix formatting issues, run:

```
make format
```

Run unit tests locally:

```
make test
```

To remove the virtual environment and build artifacts, run:

```
make clean
```