# FlashAttention Windows Wheel

Unofficial Windows-compatible wheels of flash-attention, for Python 3.11, 3.12, and 3.13 only.
## Overview

This repository provides Windows-compatible wheels for FlashAttention-2, which are not officially distributed.

Pre-built version: flash_attn-2 with Python 3.11, 3.12, and 3.13 support.
## Key Features

- ✅ Native Windows support (Python 3.11, 3.12, and 3.13)
- ⚡ FlashAttention-2
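After installing a wheel, a quick sanity check is to import the package and run its public `flash_attn_func` on small random tensors. The sketch below is illustrative, not part of the distribution; it assumes a CUDA-capable GPU and fp16 inputs, and the tensor shapes are arbitrary.

```python
# Sanity check: import the wheel and run one FA2 forward pass.
import torch
from flash_attn import flash_attn_func

# flash_attn expects (batch, seqlen, nheads, headdim) tensors
# in fp16 or bf16, resident on the GPU.
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # expected: torch.Size([1, 128, 8, 64])
```

If the import fails or the kernel raises a CUDA error, the wheel most likely does not match your PyTorch/CUDA build (see the Changelog below).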
## Changelog
- 15.11.2025 Uploaded v2.8.3 based on PyTorch 2.9.1+cu130
- 12.02.2026 Uploaded v2.8.3 based on PyTorch 2.10.0+cu130
- 29.03.2026 Uploaded v2.8.3 based on PyTorch 2.11.0+cu130
- 13.05.2026 Uploaded v2.9.0 based on PyTorch 2.10.0+cu130 — unofficial fork-only build (not an official release). Includes FA2 A-1/A-2 optimizations.
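Each wheel above targets a specific PyTorch and CUDA combination. A minimal sketch for checking that your installed versions line up with the build you downloaded (the values in the comments are examples taken from the changelog, not requirements):

```python
# Print the versions that determine wheel compatibility.
import torch
import flash_attn

print("torch:", torch.__version__)            # e.g. 2.9.1+cu130
print("cuda:", torch.version.cuda)            # e.g. 13.0
print("flash_attn:", flash_attn.__version__)  # e.g. 2.8.3
```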
## About v2.9.0
v2.9.0 is not an official FlashAttention release.
It is an independent fork build that continues FA2 kernel development while upstream focuses on FA3/FA4.
- Optimization plan (GitHub): https://github.com/ussoewwin/flash-attention/blob/main/AI/FA2_BACKPORT_FROM_FA4_PLAN.md
- Kernel change notes (GitHub): https://github.com/ussoewwin/flash-attention/blob/main/md/FA2_CHANGES_v1.1.md
## Disclaimer
- No performance benchmarks have been run on this build.
- No multi-environment testing has been performed.
- This is an unofficial fork build. Use at your own risk.
※ Unofficial build!! It works correctly in my environment, but I cannot guarantee it will work in yours.