
FLUX.2 [klein]

Generates images from text or edits existing ones using a combination of text and image inputs.

Pricing: Free + from $0.01 per use


Overview

FLUX.2 [klein] is Black Forest Labs' ultra-fast image generation model family released on January 15, 2026. Designed for interactive visual intelligence workflows, it introduces 4-billion and 9-billion parameter variants optimized through 4-step distillation to achieve sub-second inference times. The 4B version operates under Apache 2.0 open-source license, enabling commercial deployment on consumer-grade GPUs with as little as 13GB VRAM. Unlike previous FLUX releases, [klein] unifies text-to-image generation, image editing, and multi-reference composition in a single architecture, eliminating the need to switch between specialized models.

What's New

Ultra-Fast 4-Step Distillation

FLUX.2 [klein] reduces inference to just 4 steps through aggressive distillation, compared to traditional diffusion models requiring 20-50 steps. The 9B variant generates images in approximately 0.5 seconds on GB200 GPUs and 2 seconds on RTX 5090 hardware. The 4B model achieves approximately 0.3 seconds on GB200 and 1.2 seconds on RTX 5090, making real-time creative workflows feasible. Base (undistilled) variants remain available for users requiring maximum fine-tuning flexibility, though they require 17-35 seconds per image depending on parameter count.
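
These latency figures translate directly into throughput ceilings for batch workloads. A quick sketch, using the per-image times quoted above (the undistilled figure takes the slow end of the 17-35 second range):

```python
# Rough throughput from the per-image latencies quoted above. Real
# numbers vary with batch size, resolution, and pipeline overhead.

def images_per_minute(seconds_per_image: float) -> float:
    """Convert per-image latency into approximate images per minute."""
    return 60.0 / seconds_per_image

latencies = {
    "klein 9B, GB200": 0.5,
    "klein 9B, RTX 5090": 2.0,
    "klein 4B, GB200": 0.3,
    "klein 4B, RTX 5090": 1.2,
    "base (undistilled), slow end": 35.0,
}

for setup, sec in latencies.items():
    print(f"{setup}: ~{images_per_minute(sec):.0f} images/minute")
```

Even the slower RTX 5090 path yields roughly 30-50 images per minute, which is what makes interactive iteration practical.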

Unified Multi-Modal Architecture

A single [klein] checkpoint now handles text-to-image generation, single-reference image editing, and multi-reference composition tasks. This architecture consolidates capabilities previously distributed across FLUX.1 Tools (Fill, Redux, Canny, Depth) and FLUX.1 Kontext editing models. Users can switch between generation modes without reloading weights, reducing workflow friction. Multi-reference editing supports combining up to 4 reference images via API (higher limits may apply to other FLUX.2 tiers or playground environments), aiming to improve consistency across elements and compositions with results depending on references and prompting.
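
A minimal sketch of how a client might assemble a multi-reference edit request, enforcing the 4-image API cap locally. The field names (`prompt`, `input_images`, `steps`) are illustrative assumptions, not the documented BFL API schema; consult the official API reference before use:

```python
import json

def build_klein_request(prompt: str, reference_images_b64: list[str]) -> dict:
    """Assemble a hypothetical multi-reference edit payload.

    Enforces the documented 4-reference limit for [klein] via API.
    Field names are illustrative, not the official schema.
    """
    if len(reference_images_b64) > 4:
        raise ValueError("klein accepts at most 4 reference images via API")
    return {
        "prompt": prompt,
        "input_images": reference_images_b64,  # base64-encoded references
        "steps": 4,                            # distilled 4-step inference
    }

payload = build_klein_request(
    "place the product from the references on a marble countertop",
    ["<b64-ref-1>", "<b64-ref-2>"],
)
print(json.dumps(payload, indent=2))
```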

Quantization Support for Broader Accessibility

[klein] includes native FP8 and NVIDIA NVFP4 quantization formats, reducing VRAM requirements by 40-55% while preserving visual fidelity. FP8 quantization delivers approximately 1.6× speed improvements with ~40% memory savings; NVFP4 achieves up to 2.7× throughput gains with ~55% VRAM reduction. These optimizations enable the 4B model to run on RTX 3090/4070 GPUs (16GB VRAM) and the 9B model on RTX 4090 (24GB VRAM) without quality degradation.
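
The quoted savings can be sanity-checked against the official VRAM figures for the distilled checkpoints (approximately 8.4GB for 4B and 19.6GB for 9B, per the benchmarks cited later in this page); actual usage depends on resolution and runtime overhead:

```python
# Back-of-envelope VRAM estimates after quantization, using the
# official baseline figures and the quoted savings percentages.

BASELINE_VRAM_GB = {"klein 4B": 8.4, "klein 9B": 19.6}
SAVINGS = {"FP8": 0.40, "NVFP4": 0.55}

for model, vram in BASELINE_VRAM_GB.items():
    for fmt, cut in SAVINGS.items():
        print(f"{model} + {fmt}: ~{vram * (1 - cut):.1f} GB VRAM")
```

By this estimate, NVFP4 brings the 9B model under 9GB, which is how quantization opens up mid-range cards.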

Apache 2.0 Licensing for 4B Variants

The 4B parameter models (both distilled and base) are released under Apache 2.0 license with fully open weights available on Hugging Face. This allows commercial use, modification, and redistribution, subject to standard Apache license notice and attribution requirements. The 9B variants are released under the FLUX Non-Commercial License; commercial deployment requires separate permission or licensing via Black Forest Labs. Base variants are designed for fine-tuning, LoRA training, and custom pipelines where control and flexibility matter for research and production use.

Enhanced Photorealism and Prompt Adherence

[klein] maintains visual quality competitive with models 5× its size, with particular improvements in typography rendering, spatial consistency, and lighting accuracy. The model demonstrates stronger adherence to multi-clause prompts, correctly interpreting spatial relationships, color specifications, and style modifiers. Black Forest Labs claims improved prompt following and strong quality for real-time workflows, though text rendering quality still depends on prompts and resolution settings.

Availability & Access

Access Methods

  • Open Weights: 4B variants available on Hugging Face under Apache 2.0; 9B variants provided under FLUX Non-Commercial License and may require accepting license terms on distribution platforms
  • Official API: Both 4B and 9B models accessible via Black Forest Labs API
  • Third-Party Platforms: Integration available through Replicate, fal.ai, and ComfyUI workflows
  • Local Deployment: Downloadable checkpoints support Diffusers, ComfyUI, and custom inference pipelines via PyTorch-based toolchains

System Requirements & Limitations

Minimum Hardware for Local Deployment:

Official benchmarks list approximately 8.4GB VRAM for [klein] 4B distilled and 19.6GB VRAM for 9B distilled. Black Forest Labs notes the 4B model is designed to run on consumer GPUs and "fits in approximately 13GB VRAM" depending on runtime settings and end-to-end pipeline overhead. The 9B variant typically requires RTX 4090 or A100/H100 equivalents with 24GB+ VRAM.

Quantization Options:

FP8 quantization reduces VRAM by up to 40% and delivers approximately 1.6× speed improvements. NVFP4 quantization achieves up to 55% VRAM reduction with up to 2.7× throughput gains. Quantized variants may broaden the range of compatible GPUs depending on resolution and implementation.

Software Requirements:

  • CUDA 11.8+ for NVIDIA GPUs
  • Local usage typically via PyTorch-based toolchains (e.g., Diffusers/ComfyUI) or via official API
  • Recommended: Linux Ubuntu 22.04+ or Windows 11 with WSL2

Known Limitations:

  • Distilled models optimize for speed over diversity—base variants offer broader output variation
  • 9B non-commercial license prohibits commercial deployment without separate agreement
  • Multi-reference editing limited to 4 images per generation via API
  • Maximum output resolution: 4 megapixels (e.g., 2048×2048), recommended up to 2MP, and width/height must be multiples of 16
  • As a new release, independent community testing and long-tail failure analysis may still be limited
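
The resolution constraints above (sides as multiples of 16, area capped at 4 megapixels) are easy to enforce client-side. A minimal helper, assuming "4 megapixels" means the 2048×2048 area given as the example:

```python
# Clamp a requested size to klein's documented constraints:
# width/height as multiples of 16, total area at most 4 MP.

MAX_PIXELS = 2048 * 2048  # 4 megapixels, per the 2048x2048 example

def snap_resolution(width: int, height: int) -> tuple[int, int]:
    # Round each side down to the nearest multiple of 16.
    w, h = (width // 16) * 16, (height // 16) * 16
    # If the area exceeds the cap, scale both sides down proportionally,
    # then re-snap to multiples of 16.
    if w * h > MAX_PIXELS:
        scale = (MAX_PIXELS / (w * h)) ** 0.5
        w = (int(w * scale) // 16) * 16
        h = (int(h * scale) // 16) * 16
    return w, h

print(snap_resolution(2048, 2048))  # already valid
print(snap_resolution(3000, 3000))  # scaled down to fit under 4 MP
```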

Pricing & Plans

FLUX.2 [klein] offers flexible pricing depending on access method:

API Pricing (Pay-Per-Use)

Black Forest Labs charges in credits (1 credit = $0.01). Pricing for [klein] is per-megapixel:

  • klein 4B: $0.014 per image + $0.001 per megapixel
  • klein 9B: $0.015 per image + $0.002 per megapixel

For enterprise volume pricing or custom terms, contact Black Forest Labs sales.
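
A worked example of the per-megapixel pricing, assuming the listed rates are a flat per-image fee in dollars plus a per-megapixel charge, with 1 MP = 1,000,000 pixels (both assumptions; confirm against the official pricing page):

```python
# Estimate per-image API cost from the rates listed above.
# Rates are (base $/image, $/megapixel); interpretation is assumed.

RATES = {
    "klein 4B": (0.014, 0.001),
    "klein 9B": (0.015, 0.002),
}

def image_cost(model: str, width: int, height: int) -> float:
    base, per_mp = RATES[model]
    megapixels = (width * height) / 1_000_000
    return base + per_mp * megapixels

# A 2048x2048 output is about 4.2 MP:
print(f"klein 4B: ${image_cost('klein 4B', 2048, 2048):.4f}")
print(f"klein 9B: ${image_cost('klein 9B', 2048, 2048):.4f}")
```

Under these assumptions a maximum-resolution 4B image costs under two cents, which is where the "from $0.01 per use" figure comes from.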

Open-Source / Self-Hosting

  • 4B Variants: Free under Apache 2.0 (no usage fees, commercial use permitted with standard license attribution)
  • 9B Variants: Free for non-commercial research and personal use under FLUX Non-Commercial License; commercial deployment requires separate licensing

Hardware Cost Considerations: Local deployment requires GPUs ranging from RTX 3090/4070 (for 4B) to RTX 4090/A100 (for 9B). Quantized models reduce hardware requirements and may enable deployment on a broader range of GPUs.

Pros & Cons

Pros:

  • Industry-Leading Speed: 0.5-2 second generation times enable real-time creative iteration, eliminating traditional render delays
  • Unified Architecture: Single model handles generation, editing, and multi-reference tasks without checkpoint switching, reducing workflow friction
  • True Open-Source Option: Apache 2.0 licensing for 4B variants permits commercial use with standard license attribution requirements
  • Consumer Hardware Accessibility: 4B model designed to fit in approximately 13GB VRAM, making deployment feasible on mid-range GPUs
  • Production-Ready Quality: Visual quality competitive with models 3-5× larger in parameter count
  • Flexible Quantization: FP8/NVFP4 support enables up to 2.7× speed gains and 55% VRAM reduction

Cons:

  • Licensing Fragmentation: 9B variants require non-commercial license negotiation for commercial deployment, complicating enterprise adoption
  • Distillation Trade-Offs: 4-step models sacrifice output diversity compared to base variants, limiting creative exploration in some scenarios
  • Hardware Requirements Still Significant: 13GB minimum (4B) excludes most laptop GPUs and budget desktop cards (RTX 3060/4060 series)
  • Limited Multi-Reference Capacity: 4-image cap via API for style/character consistency may be insufficient for complex scene composition
  • Early Release Limitations: As a new release, independent community testing and long-tail failure analysis may still be limited

Best For

  • Real-Time Creative Applications: Developers building interactive design tools, live streaming overlays, or in-game asset generators requiring sub-second response times
  • High-Volume Content Producers: Marketing agencies, e-commerce platforms, or social media teams generating 1,000+ images daily where API costs justify hardware investment
  • Independent AI Researchers: Teams requiring unrestricted commercial experimentation with open-weight models under Apache 2.0 terms
  • Mid-Size Studios with GPU Infrastructure: Production houses with RTX 4090/A100 access seeking to eliminate cloud API dependencies and recurring costs
  • Product Designers and UI/UX Teams: Professionals iterating on visual concepts where immediate feedback loops (under 2 seconds) directly impact workflow efficiency
  • Early Adopters of Multi-Reference Workflows: Artists and illustrators requiring consistent character/style transfer across varied compositions (portraits, scenes, marketing visuals)

FAQ

What hardware do I need to run FLUX.2 [klein] locally?

For the 4B distilled model, official benchmarks list approximately 8.4GB VRAM, though Black Forest Labs notes it's designed to fit in approximately 13GB VRAM in typical runtime environments (RTX 3090, RTX 4070, or equivalent). The 9B distilled variant requires approximately 19.6GB VRAM (RTX 4090, A100, or H100). Quantized versions (FP8/NVFP4) reduce VRAM needs by 40-55%, potentially enabling deployment on a broader range of GPUs depending on resolution and implementation. Base (undistilled) models require similar VRAM but generate slower (17-35 seconds per image).

How does [klein] compare to FLUX.2 Pro or FLUX1.1?

Black Forest Labs positions [klein] as a Pareto-optimal option for quality versus latency, with sub-second inference for distilled variants. While FLUX.2 Pro delivers higher fidelity for single high-stakes images, [klein] provides competitive quality at significantly faster speeds. Compared to FLUX1.1, [klein] adds unified editing capabilities and reduces inference steps from typical 20-30 down to 4. Use Pro for final production assets requiring maximum quality; use [klein] for iteration, prototyping, and high-volume workflows where speed matters.

Can I use the 9B model commercially?

Only under a separate commercial license agreement with Black Forest Labs. The 9B weights fall under FLUX Non-Commercial License, restricting commercial deployment, redistribution, and monetization without explicit permission. The 4B variant has no such restrictions due to Apache 2.0 licensing. For commercial projects, either use 4B exclusively or contact Black Forest Labs to negotiate 9B commercial terms.

Does [klein] support fine-tuning and LoRA training?

Yes, but with variant-specific considerations. Base (undistilled) models are designed for fine-tuning, LoRA training, and custom pipelines where control and flexibility matter for research and production use. Distilled models are optimized for 4-step inference, which may reduce fine-tuning stability and require adjusted learning rates. Black Forest Labs recommends starting with base variants for training, then optionally distilling custom checkpoints. The 4B Apache 2.0 license explicitly permits modification and derivative works.

What is the quality difference between 4-step distilled and base models?

Distilled models sacrifice some output diversity and fine detail for speed. In side-by-side tests, base models show approximately 10-15% higher variability in compositional choices and subtle texture rendering when using identical prompts. For production workflows requiring exact creative control or maximum fidelity, base models remain preferable despite 8-17× longer generation times. Distilled models excel in scenarios where speed enables iterative refinement (e.g., generating 20 variations to select one best output).

Version History

FLUX.2 [klein]

Current Version

Released on January 15, 2026

What's new:
  • Introduce 4B and 9B parameter models with 4-step distillation for sub-second inference—9B achieves 0.5s on GB200 GPUs, 4B runs under 1.2s on RTX 5090.
  • Deliver unified architecture supporting text-to-image generation, single-reference editing, and multi-reference composition without switching models.
  • Release 4B variants under Apache 2.0 license with open weights, enabling commercial use and local deployment on consumer hardware (13GB VRAM minimum).

FLUX.2

Released on November 25, 2025

What's new:
  • Combine up to 10 reference images in a single generation to maintain consistent characters, products, styles and scenes across complex creative projects like e-commerce campaigns and brand materials
  • Generate images up to 4MP resolution with significantly improved typography, perfect for high-quality advertisements, posters and detailed product photography
  • Control quality and speed with flexible step parameters in FLUX.2 [flex], allowing you to quickly draft concepts at low steps then refine winners at high steps for optimal cost efficiency

FLUX.1 Kontext

Released on May 29, 2025

What's new:
  • Prompt with both text and images to extract and modify visual concepts from existing photos or generated images, enabling rapid variations like changing product backgrounds or updating clothing while preserving subjects
  • Edit images iteratively with multi-turn instructions up to 8x faster than competing models, supporting progressive refinement workflows for portrait retouching and complex scene building
  • Test FLUX capabilities without technical integration using the new BFL Playground, perfect for POCs, team evaluations and product demos before API implementation

FLUX1.1 [pro]

Released on October 2, 2024

What's new:
  • Generate images 6x faster than FLUX.1 [pro] while improving image quality, prompt adherence and output diversity, enabling commercial-scale batch production for advertising and content platforms
  • Access transparent per-image pricing through the generally available BFL API at 4 cents per image for FLUX1.1 [pro], simplifying cost forecasting for enterprise budgets
  • Benefit from 2x speedup on updated FLUX.1 [pro] without changing outputs, allowing existing workflows to scale immediately

FLUX.1

Released on August 1, 2024

What's new:
  • Choose between three variants tailored for different needs - [pro] for maximum quality, [dev] for downloadable non-commercial weights, and [schnell] for ultra-fast local iteration under Apache 2.0 license
  • Generate images across flexible aspect ratios and resolutions from 0.1 to 2.0 megapixels, accommodating everything from thumbnails to high-definition marketing materials
  • Leverage 12B parameter architecture built on flow matching with rotary positional embeddings and parallel attention layers, providing the technical foundation for speed and stability improvements
