ModelTC repositories

mtc-token-healing

Public

Token healing implementation in Rust

Rust

•

Apache License 2.0

•0•4•0•1•Updated

Feb 23, 2026

MoDES

Public

[CVPR 2026] This is the official PyTorch implementation of "MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping"…

moe cvpr vlm multimodal mixture-of-experts kimi-vl qwen3-vl cvpr-2026

Python

•

Apache License 2.0

•0•3•0•0•Updated

Feb 21, 2026

general-sam-py

Public

Python bindings for general-sam and some utilities

Python

•

Apache License 2.0

•0•5•0•0•Updated

Feb 20, 2026

LightTTS

Public

LightTTS is a lightweight TTS inference framework optimized for CosyVoice2 and CosyVoice3, enabling fast and scalable speech synthesis in Python and supports st…

text-to-speech real-time tts speech-synthesis low-latency tensorrt inference-optimization audio-generation cosyvoice cosyvoice2

Python

•

Apache License 2.0

•7•28•2•0•Updated

Feb 20, 2026

LightLLM

Public

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed perf…

nlp deep-learning llama gpt model-serving llm openai-triton

Python

•

Apache License 2.0

•300•3.9k•82•35•Updated

Feb 20, 2026

Qwen-Image-Edit-Causal

Public

In our implementation of Qwen-Image-Edit, we employ block causal attention to improve inference speed.

Python

•

Apache License 2.0

•2•31•1•0•Updated

Feb 16, 2026

LightMem

Public

Python

•

Apache License 2.0

•0•4•0•0•Updated

Feb 14, 2026

GenRL

Public

Reinforcement Learning Framework for Visual Generation

reinforcement-learning infra rl wan dpo imagegeneration videogeneration grpo wan-video wan21

Python

•

Apache License 2.0

•1•43•0•0•Updated

Feb 13, 2026

QVGen

Public

[ICLR 2026] This is the official PyTorch implementation of "QVGen: Pushing the Limit of Quantized Video Generative Models".

wan iclr qat video-generation diffusion-models videogen model-quantization quantization-aware-training generative-ai text-to-video-generation

Python

•

Apache License 2.0

•0•13•0•0•Updated

Feb 11, 2026

SageAttention3-sparse

Public

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across la…

Cuda

•

Apache License 2.0

•355•1•0•0•Updated

Feb 11, 2026

LightX2V

Public

Light Image Video Generation Inference Framework

video-generation diffusion-models wan-video auto-regressive-diffusion-model

Python

•

Apache License 2.0

•163•2k•131•2•Updated

Feb 11, 2026

VideoAlign

Public

Python

•

MIT License

•0•1•0•0•Updated

Feb 10, 2026

HPSv3

Public

Python

•

MIT License

•0•1•0•0•Updated

Feb 10, 2026

SageAttention

Public

Quantized Attention achieves speedup of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and v…

Cuda

•

Apache License 2.0

•355•3•0•1•Updated

Feb 10, 2026

general-sam

Public

A general suffix automaton implementation in Rust with Python bindings

Rust

•

Apache License 2.0

•0•9•0•1•Updated

Feb 9, 2026

LightKernel

Public

HTML

•

Apache License 2.0

•0•3•0•0•Updated

Feb 4, 2026

Prototype

Public

Python

•

Apache License 2.0

•3•14•0•0•Updated

Feb 3, 2026

ComfyUI-Lightx2vWrapper

Public

ComfyUI custom node for lightx2v

comfyui comfyui-nodes

Python

•

MIT License

•7•79•4•0•Updated

Feb 3, 2026

lightx2v_examples

Public

0•0•0•0•Updated

Jan 23, 2026

modeltc.github.io

Public

HTML

•0•0•0•0•Updated

Jan 14, 2026

SpargeAttn

Public

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda

•

Apache License 2.0

•86•0•0•0•Updated

Jan 12, 2026

Qwen-Image-Lightning

Public

Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

Python

•

Apache License 2.0

•44•1.2k•29•0•Updated

Jan 1, 2026

verl

Public

verl: Volcano Engine Reinforcement Learning for LLMs

Python

•

Apache License 2.0

•3.3k•1•0•0•Updated

Dec 15, 2025

slime

Public

slime is an LLM post-training framework for RL Scaling.

Python

•

Apache License 2.0

•566•0•0•0•Updated

Dec 8, 2025

lightllm-blog

Public

SCSS

•

MIT License

•1•1•0•1•Updated

Nov 26, 2025

greedy-tokenizer

Public

Greedily tokenize strings with the longest tokens iteratively.

Python

•

Apache License 2.0

•0•0•0•3•Updated

Nov 24, 2025

LightCompress

Public

[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

benchmark deployment tool evaluation pruning quantization wan awq large-language-models llm

Python

•

Apache License 2.0

•70•681•40•0•Updated

Nov 19, 2025

Wan2.2-Lightning

Public

Wan2.2-Lightning: Speed up wan2.2 model with distillation

Python

•

Apache License 2.0

•1.7k•270•21•0•Updated

Nov 7, 2025

LTX-Video-Q8-Kernels

Public

Python

•17•0•0•0•Updated

Nov 6, 2025

SageAttention-1104

Public

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across la…

Cuda

•

Apache License 2.0

•355•0•0•0•Updated

Nov 6, 2025

72 repositories
