HuggingFaceFW's Collections
🌐 FineWiki
📄 FinePDFs
🥂 FineWeb2
🍷 FineWeb
📚 FineWeb-Edu
📀 Dataset comparison models
🧪 FineWeb v1 data experiments

🥂 FineWeb2

updated Jun 27

  • FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

    Paper • 2506.20920 • Published Jun 26 • 75

  • HuggingFaceFW/fineweb-2

    Viewer • Updated Oct 27 • 4.48B • 95.7k • 700

  • Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

    Running • 83

    Evaluate multilingual models using FineTasks
