Incompatibility with transformers >= 4.41: DynamicCache.seen_tokens deprecated
Issue Description
The model's custom code (modeling_deepseek.py) uses the deprecated DynamicCache.seen_tokens attribute, which causes errors with newer versions of the transformers library (4.41+).
Error Message
AttributeError: 'DynamicCache' object has no attribute 'seen_tokens'
Environment
- transformers version: 4.57.1 (also affects 4.49.0+)
- PyTorch version: 2.7.0
- Platform: macOS (Apple Silicon M4)
- Python: 3.9
Root Cause
The seen_tokens attribute on DynamicCache was deprecated in transformers v4.41 and has been removed in later versions. The model's modeling_deepseek.py file references this attribute in the prepare_inputs_for_generation method.
Reference: https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite-Chat/discussions/9
Suggested Fix
Update the model code to use cache_position instead of seen_tokens, as recommended by the transformers deprecation warning:
"The seen_tokens attribute is deprecated and will be removed in v4.41. Use the cache_position model input instead."
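As a rough illustration of what such a change could look like, here is a minimal sketch of a version-tolerant helper. The function name get_past_length and its fallback order are assumptions made for illustration, not the code in modeling_deepseek.py or the patch from any PR. It relies only on Cache.get_seq_length(), which DynamicCache provides, and falls back to seen_tokens or the legacy tuple layout on older versions.

```python
def get_past_length(past_key_values, cache_position=None):
    """Return how many tokens are already cached, without requiring `seen_tokens`.

    Hypothetical helper: the name and the fallback order are illustrative only.
    """
    # Preferred: the first entry of cache_position is the index of the
    # first new token, which equals the number of tokens already cached.
    if cache_position is not None:
        return int(cache_position[0])
    # No cache yet: nothing has been processed.
    if past_key_values is None:
        return 0
    # Cache objects (e.g. DynamicCache) expose get_seq_length().
    if hasattr(past_key_values, "get_seq_length"):
        return int(past_key_values.get_seq_length())
    # Older Cache objects that still carry the deprecated attribute.
    if hasattr(past_key_values, "seen_tokens"):
        return int(past_key_values.seen_tokens)
    # Legacy tuple-of-tuples cache: the key tensor is assumed to have shape
    # (batch, num_heads, seq_len, head_dim).
    return int(past_key_values[0][0].shape[2])
```

Inside prepare_inputs_for_generation, a call such as past_length = get_past_length(past_key_values, cache_position) could then stand in for the direct past_key_values.seen_tokens access, assuming the surrounding logic only needs the number of cached tokens.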
Workaround
Currently, users must downgrade to transformers==4.43.4 to use this model, which may conflict with other dependencies.
Context
We're integrating this MPS-compatible model into Docling to provide DeepSeek-OCR support on Apple Silicon. The transformers version constraint (>=4.46.0) in Docling's dependencies makes this incompatibility a blocker for MPS users.
Thank you for creating this MPS-compatible fork! It would be great to have it updated for newer transformers versions.
Thank you for the suggestion. I went ahead and found the PR that introduced that change, and applied the patch.
https://github.com/huggingface/transformers/pull/29467/files
I also tested it with both old and new transformers versions. Let me know if there are any issues. I am happy to hear you are adding this model to Docling.
Awesome! Thank you for your prompt response.
I will test it here.