#huggingface
9 pages tagged with "huggingface"
programming
- attention interface: the unified AttentionInterface for centralized management of attention methods, with runtime switching
- modular transformers: modular Transformers architecture for reduced code duplication and easier model contributions
- quantization: 18+ quantization methods as first-class citizens in Transformers v5.0
- serving and continuous batching: the "transformers serve" command and continuous batching with paged attention for efficient inference
- tokenizer backend changes: unified tokenizer backend system eliminating the fast/slow distinction in Transformers v5.0
- TokenizersBackend does not exist: how to fix the "Tokenizer class TokenizersBackend does not exist or is not currently imported" error when loading models
- transformers v5.0: Hugging Face Transformers v5.0 release overview, features, and migration notes
- transformers v5.0 migration guide: breaking changes and upgrade steps for moving from Transformers v4.x to v5.0
- vllm and sglang integration: Transformers as a backend for the vLLM and SGLang inference engines, with near-native performance