FlashAttention Windows Wheel

Unofficial Windows-compatible wheels of flash-attention.
Python 3.11, 3.12 & 3.13 only.

Overview

This repository provides Windows-compatible wheels for FlashAttention-2 that are not officially distributed.
Pre-built version: flash_attn-2 with Python 3.11, 3.12 & 3.13 support.
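
For a quick sanity check after installing the wheel, a minimal sketch like the following can be used. It assumes a CUDA-capable GPU, a PyTorch CUDA build matching the wheel, and that the wheel filename in the comment is a placeholder for whichever release you downloaded; flash_attn_func is the standard FlashAttention-2 Python entry point.

```python
# Minimal sanity check for an installed flash-attn Windows wheel (sketch).
# Install the downloaded wheel first, e.g.:
#   pip install flash_attn-<version>-cp312-cp312-win_amd64.whl   (placeholder filename)
import torch
from flash_attn import flash_attn_func

# FlashAttention-2 expects (batch, seqlen, nheads, headdim) tensors in fp16/bf16 on CUDA.
q = torch.randn(1, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(1, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(1, 1024, 8, 64, dtype=torch.float16, device="cuda")

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # expected: torch.Size([1, 1024, 8, 64])
```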

Key Features

  • ✅ Native Windows support (Python 3.11 & 3.12 & 3.13)
  • ⚡ FlashAttention-2

Changelog

  • 15.11.2025 Uploaded v2.8.3 based on PyTorch 2.9.1+cu130
  • 12.02.2026 Uploaded v2.8.3 based on PyTorch 2.10.0+cu130
  • 29.03.2026 Uploaded v2.8.3 based on PyTorch 2.11.0+cu130
  • 13.05.2026 Uploaded v2.9.0 based on PyTorch 2.10.0+cu130 — unofficial fork-only build (not an official release). Includes FA2 A-1/A-2 optimizations.

About v2.9.0

v2.9.0 is not an official FlashAttention release.
It is an independent fork build that continues FA2 kernel development while upstream focuses on FA3/FA4.

Disclaimer

  • No performance benchmarks have been run on this build.
  • No multi-environment testing has been performed.
  • This is an unofficial fork build. Use at your own risk.

※Unofficial build!! It works correctly in my environment, but I am not sure it will work in yours.
