Problem
The current token counting mechanism is a mix of a simple estimation (length / 4) and a call to the Google AI countTokens API. This is not a robust solution for a production environment for several reasons:
- Inaccurate Fallback: The length / 4 estimation is highly inaccurate for different types of content (code, JSON, etc.) and different languages.
- Vendor Lock-in: The countTokens API is specific to Google AI models. The system should be able to handle models from other vendors (e.g., OpenAI, Anthropic) that use different tokenizers.
- No Local Tokenization: The system relies on a network call for accurate token counting, which introduces latency and a point of failure. For models where the tokenizer is available locally (like tiktoken for OpenAI models), it should be used.
- No Model-Specific Tokenization: The current implementation does not account for different tokenization rules for different models from the same vendor (e.g., gpt-3.5-turbo vs. gpt-4).
Desired State
A production-ready token counting system should have the following characteristics:
- Pluggable Tokenizers: The system should support multiple tokenization strategies and allow for new ones to be added easily.
- Model-Specific Configuration: The system should be able to determine which tokenizer to use based on the model being used.
- Local First, Remote Fallback: For models with available local tokenizers (e.g., tiktoken), the system should use them first to avoid network latency. If a local tokenizer is not available, it should fall back to a remote API call if possible.
- Improved Estimation: The fallback estimation logic should be more sophisticated than a simple character ratio, taking into account content type and language.
- Caching: All token counting operations (local and remote) should be cached to avoid redundant computations.
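The caching requirement above could be met with a hash-keyed memoizing wrapper that sits in front of any tokenizer, local or remote. The withCache helper below is a hypothetical sketch, not part of the existing codebase:

```typescript
// Sketch (hypothetical helper): memoize token counts keyed by a hash of
// (tokenizer name, text), so repeated counts of identical content are free.
import { createHash } from "node:crypto";

type CountFn = (text: string) => Promise<number>;

function withCache(
  tokenizerName: string,
  count: CountFn,
  cache: Map<string, number> = new Map()
): CountFn {
  return async (text: string) => {
    // Hash the text so large payloads do not bloat the cache keys.
    const key =
      tokenizerName + ":" + createHash("sha256").update(text).digest("hex");
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const n = await count(text);
    cache.set(key, n);
    return n;
  };
}
```

Keying by tokenizer name as well as content matters because the same text produces different counts under different tokenizers.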
Implementation Plan
Phase 1: Create a Pluggable Tokenizer Framework in TypeScript
- Define ITokenizer Interface (src/core/tokenizers/ITokenizer.ts)
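The interface might look like the following sketch. The method name, the isLocal flag, and the trivial whitespace-splitting implementation are illustrative assumptions, not the final API:

```typescript
// Sketch of the proposed ITokenizer interface (member names are assumptions).
interface ITokenizer {
  /** Identifier, e.g. "tiktoken", "google-ai", "estimation". */
  readonly name: string;
  /** True when counting needs no network access. */
  readonly isLocal: boolean;
  /** Count tokens in the given text for the configured model. */
  countTokens(text: string): Promise<number>;
}

// Minimal implementing example, for illustration only: one token per
// whitespace-separated word.
class WhitespaceTokenizer implements ITokenizer {
  readonly name = "whitespace";
  readonly isLocal = true;
  async countTokens(text: string): Promise<number> {
    const trimmed = text.trim();
    return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  }
}
```

countTokens is async even for local tokenizers so that local and remote implementations share one call shape.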
- Create a TokenizerFactory (src/core/tokenizers/TokenizerFactory.ts)
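One plausible shape for the factory is a prefix registry over model names with an estimation fallback, so "gpt-" maps to tiktoken and "gemini-" to the Google AI tokenizer. This is a sketch under those assumptions:

```typescript
// Sketch: resolve a tokenizer by model-name prefix, falling back to
// estimation. Names and the prefix-matching strategy are assumptions.
interface ITokenizer {
  readonly name: string;
  countTokens(text: string): Promise<number>;
}

type TokenizerCtor = () => ITokenizer;

class TokenizerFactory {
  private registry = new Map<string, TokenizerCtor>();

  constructor(private fallback: TokenizerCtor) {}

  /** Register a tokenizer for a model-name prefix, e.g. "gpt-" or "gemini-". */
  register(modelPrefix: string, ctor: TokenizerCtor): void {
    this.registry.set(modelPrefix, ctor);
  }

  /** Longest-prefix match, so a "gpt-4" entry beats a generic "gpt-" entry. */
  forModel(model: string): ITokenizer {
    let best: TokenizerCtor | undefined;
    let bestLen = -1;
    for (const [prefix, ctor] of this.registry) {
      if (model.startsWith(prefix) && prefix.length > bestLen) {
        best = ctor;
        bestLen = prefix.length;
      }
    }
    return (best ?? this.fallback)();
  }
}
```

Longest-prefix matching addresses the model-specific requirement: gpt-3.5-turbo and gpt-4 can get distinct encodings while still sharing a vendor-level default.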
- Implement TiktokenTokenizer (src/core/tokenizers/TiktokenTokenizer.ts)
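In the real implementation the encoder would come from the tiktoken npm package (obtained per model); the sketch below injects the encoder through the constructor so the example carries no external dependency, and the Encoder shape shown is an assumption:

```typescript
// Sketch: a local tokenizer that delegates to an injected BPE encoder.
// In production the encoder would be created from the tiktoken package
// for the target model; here it is injected to keep the example offline.
interface Encoder {
  encode(text: string): number[] | Uint32Array;
}

class TiktokenTokenizer {
  readonly name = "tiktoken";
  readonly isLocal = true;

  constructor(private encoder: Encoder) {}

  async countTokens(text: string): Promise<number> {
    // Token count is simply the length of the encoded sequence.
    return this.encoder.encode(text).length;
  }
}
```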
- Implement GoogleAITokenizer (src/core/tokenizers/GoogleAITokenizer.ts)
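This tokenizer wraps the existing remote countTokens call behind the same interface. The client shape below is injected and assumed for illustration; in production it would be the vendor SDK's model object:

```typescript
// Sketch: a remote tokenizer that delegates to an injected client exposing
// a countTokens call. The client interface is an assumption standing in for
// the Google AI SDK, so the example runs offline.
interface CountTokensClient {
  countTokens(text: string): Promise<{ totalTokens: number }>;
}

class GoogleAITokenizer {
  readonly name = "google-ai";
  readonly isLocal = false;

  constructor(private client: CountTokensClient) {}

  async countTokens(text: string): Promise<number> {
    // Network call: callers should treat failures as a signal to fall back.
    const { totalTokens } = await this.client.countTokens(text);
    return totalTokens;
  }
}
```

Because isLocal is false, the local-first ordering in the Desired State section naturally deprioritizes this tokenizer.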
- Implement EstimationTokenizer (src/core/tokenizers/EstimationTokenizer.ts)
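A content-aware estimator could replace the flat length / 4 rule. The heuristics and chars-per-token ratios below are rough assumptions for illustration, not measured constants: structural characters suggest code or JSON (which tokenizes denser), and CJK scripts average far fewer characters per token than English prose.

```typescript
// Sketch of an improved fallback estimator. All ratios and thresholds are
// illustrative assumptions; a real implementation would calibrate them
// against measured tokenizer output.
class EstimationTokenizer {
  readonly name = "estimation";
  readonly isLocal = true;

  async countTokens(text: string): Promise<number> {
    // CJK characters typically map to roughly 1-2 tokens each.
    const cjk =
      (text.match(/[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]/g) ?? []).length;
    const rest = text.length - cjk;
    // Many structural characters suggest code/JSON, which tokenizes denser.
    const structural = (text.match(/[{}[\]();=<>]/g) ?? []).length;
    const charsPerToken =
      structural / Math.max(text.length, 1) > 0.05 ? 3 : 4;
    return Math.ceil(rest / charsPerToken) + Math.ceil(cjk / 1.5);
  }
}
```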
Phase 2: Refactor the TokenCounter to Use the New Framework
- Update src/core/token-counter.ts
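The refactored counter would order tokenizers local-first and fall through on failure, per the Desired State section. The class shape and return value below are assumptions sketching that behavior:

```typescript
// Sketch of the refactored TokenCounter: try local tokenizers first, fall
// back to remote, and report which tokenizer produced the count.
interface ITokenizer {
  readonly name: string;
  readonly isLocal: boolean;
  countTokens(text: string): Promise<number>;
}

class TokenCounter {
  constructor(private tokenizers: ITokenizer[]) {}

  async count(text: string): Promise<{ count: number; source: string }> {
    // Local tokenizers first, to avoid network latency and failure modes.
    const ordered = [...this.tokenizers].sort(
      (a, b) => Number(b.isLocal) - Number(a.isLocal)
    );
    for (const t of ordered) {
      try {
        return { count: await t.countTokens(text), source: t.name };
      } catch {
        // This tokenizer failed (e.g. remote API error): try the next one.
      }
    }
    throw new Error("no tokenizer available for this text");
  }
}
```

Surfacing source alongside the count lets callers (and the count_tokens tool) report whether a number is exact or estimated.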
- Update the count_tokens MCP Tool
Phase 3: Update the PowerShell Orchestrator
- Modify hooks/handlers/token-optimizer-orchestrator.ps1: