
KV Cache optimization based on sequence length #1974

Open
chilukam-qti wants to merge 1 commit into microsoft:main from CodeLinaro:chilukam/windowed_kv_cache_optimization


Conversation

@chilukam-qti
Contributor

Optimizes the sliding-window KV cache update by copying only the cache entries for the current sequence length (seqlen) instead of the entire context length.
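
To illustrate the idea, here is a minimal sketch. It is not the code in this PR. It assumes a single cache buffer of fixed context length in which the valid entries are right-aligned and the unused slots on the left hold zero padding. The function names `update_full` and `update_valid_only` and the list-based buffer are hypothetical, chosen only for illustration.

```python
def update_full(cache, new_entries):
    """Baseline: shift the whole buffer left, padding included."""
    c, w = len(cache), len(new_entries)
    assert w <= c, "window must fit in the context length"
    cache[:c - w] = cache[w:]      # copies c - w slots regardless of seq_len
    cache[c - w:] = new_entries    # append the new window at the end
    return c - w                   # slots copied during the shift


def update_valid_only(cache, seq_len, new_entries):
    """Sketch of the optimization: shift only the valid (seq_len) entries.

    Assumes the valid entries occupy the last `seq_len` slots and that
    every slot to their left is already zero padding.
    """
    c, w = len(cache), len(new_entries)
    assert w <= c, "window must fit in the context length"
    keep = min(seq_len, c - w)     # oldest entries fall out once the buffer is full
    cache[c - w - keep:c - w] = cache[c - keep:]
    cache[c - w:] = new_entries
    return keep                    # slots copied during the shift


if __name__ == "__main__":
    context_len = 8
    a = [0] * context_len
    b = [0] * context_len
    seq_len = 0
    for window in ([1, 2], [3, 4], [5, 6]):
        copied_full = update_full(a, window)
        copied_valid = update_valid_only(b, seq_len, window)
        seq_len = min(seq_len + len(window), context_len)
        print(copied_full, copied_valid, a == b)
```

Under these assumptions both functions leave the buffer in the same state, but the second copies at most `seq_len` slots per update, so the saving is largest early in generation, when `seq_len` is much smaller than the context length.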
