Overview
Veo 3.1 is a major update to Google DeepMind's AI video generation model. It was announced and released to preview on October 15, 2025, and reached general availability on Vertex AI on November 17, 2025. The release focuses on creative control and consistency, introducing reference image-driven generation (up to 3 images), video extension for Veo-generated content, and first/last frame constraints. Each generation produces an 8-second video at 720p or 1080p resolution with natively generated audio. Designed for creators and enterprises that need consistent visual narratives across multiple shots, Veo 3.1 keeps the same pricing structure as Veo 3 while delivering significantly tighter control over character appearance, prop styling, and scene transitions.
What's New
Reference Image-Driven Generation
Veo 3.1 introduces the ability to use up to three reference images to guide video generation, ensuring consistent characters, props, and visual styles across different scenes. This addresses a common challenge in AI video generation where the same character or object might appear differently in separate generations. The feature is particularly valuable for brand campaigns requiring consistent product appearance, film storyboards needing fixed character designs, and game cutscenes maintaining visual continuity.
In Google Labs Flow, this experience is surfaced as "Ingredients to Video"; in the Gemini API, it maps to image-based direction using reference images. Available through preview model IDs in the Gemini API, this capability reduces rework by maintaining visual consistency without manual post-production adjustments.
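The 3-image ceiling is easy to enforce client-side before submitting a request. The sketch below is a minimal, hypothetical helper — the function name and error messages are our own, not part of any official Gemini SDK:

```python
MAX_REFERENCE_IMAGES = 3  # documented limit for Veo 3.1 image-based direction


def validate_reference_images(paths: list[str]) -> list[str]:
    """Return the reference image list unchanged, or raise if it breaks the limit."""
    if not paths:
        raise ValueError("at least one reference image is required for image-based direction")
    if len(paths) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"Veo 3.1 accepts at most {MAX_REFERENCE_IMAGES} reference images, got {len(paths)}"
        )
    return paths


# Two references (e.g. a character sheet and a prop shot) pass validation.
print(validate_reference_images(["hero_character.png", "branded_prop.png"]))
```

Failing fast on the client avoids burning a billable request on input the preview models would reject anyway.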
Video Extension
Video extension allows users to extend videos previously generated with Veo into longer, more complete sequences. This feature bridges the gap between short video segments and continuous narratives, enabling creators to build extended scenes by chaining multiple generations. Each generation produces an 8-second video; longer sequences can be achieved through extension workflows, with Google Labs Flow documentation indicating that videos can be extended to a minute or more through iterative extensions.
Flow labels this workflow as "Extend" or "Scene extension," while Gemini API and Vertex AI documentation refer to it as "Video extension." The feature is currently available through preview model IDs, with capabilities varying across different access points (Flow, Gemini API, Vertex AI).
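Reaching a target duration through chained extensions is simple arithmetic. Assuming each extension appends one full 8-second segment (an assumption — the exact per-extension length may differ by surface), the number of generations needed is:

```python
import math

SEGMENT_SECONDS = 8  # each Veo 3.1 generation produces 8 seconds


def generations_needed(target_seconds: int) -> int:
    """Number of 8-second generations (initial clip plus extensions) to cover a target duration."""
    return math.ceil(target_seconds / SEGMENT_SECONDS)


# A one-minute sequence would take 8 chained generations under this assumption.
print(generations_needed(60))  # -> 8
```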
First and Last Frame Control
This new capability enables precise control over scene transitions by allowing users to specify both the starting and ending frames of generated videos. It makes it significantly easier to create controlled animations showing state changes, such as product transformations, character movement from rest to action, or UI animation demonstrations. The feature ensures smoother continuity between shots and more predictable results for planned sequences.
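Conceptually, a first/last frame request pairs a text prompt with two boundary images and lets the model interpolate between them. The dictionary below is a hypothetical payload shape for illustration only — the field names are our own and do not match any official SDK schema; consult the Gemini API documentation for the real request format:

```python
def build_frame_controlled_request(prompt: str, first_frame: str, last_frame: str) -> dict:
    """Assemble an illustrative request: the model animates between two fixed frames."""
    return {
        "model": "veo-3.1-generate-preview",  # preview model ID cited in this article
        "prompt": prompt,
        "first_frame": first_frame,  # hypothetical field name
        "last_frame": last_frame,    # hypothetical field name
    }


# Example: a product-transformation shot with fixed start and end states.
req = build_frame_controlled_request(
    "The closed box opens to reveal the product, camera slowly pushing in.",
    "box_closed.png",
    "box_open.png",
)
print(req["model"])
```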
Object Editing Capabilities
Object editing capabilities vary significantly across different Veo access points:
Google Labs Flow: Object insertion ("Insert") is currently available, while object removal ("Remove") is coming soon. These tools allow creators to iteratively refine generated videos by adding elements to existing scenes.
Vertex AI: Object insertion and removal are available as preview features in Veo 2 (not Veo 3.1), providing video-equivalent inpainting capabilities for enterprise workflows with audit trails and permission management.
Gemini API: Object insertion and removal features are not currently available through the API according to official documentation.
Users seeking object editing capabilities should verify current availability status for their specific access point, as these features are distributed unevenly across Veo product surfaces.
Native Audio Generation
Continuing from Veo 3, version 3.1 maintains native audio generation capabilities, producing synchronized dialogue, sound effects, and ambient noise alongside video content. All videos are generated with natively integrated audio without additional cost. Audio-video synchronization quality varies depending on prompt complexity and scene requirements, with simpler scenarios generally producing more consistent results.
Availability & Access
Veo 3.1 is available through three primary access points, each with different feature availability:
Gemini API: Available in paid preview through preview model IDs (veo-3.1-generate-preview for Standard, veo-3.1-fast-generate-preview for Fast). Supports image-based direction (reference images), video extension, and first/last frame control through the preview models.
Vertex AI: Enterprise platform with both generally available (GA) and preview model IDs. GA models (veo-3.1-generate-001, veo-3.1-fast-generate-001) provide baseline video generation. Preview models add support for reference images, video extension, and frame control. Note that certain prompts (such as person or child generation) may require project approval on Vertex AI.
Google Labs Flow: Consumer-facing interface providing access to Veo 3.1 features including "Ingredients to Video" (reference images), Extend (video extension), and Insert (object insertion, with Remove coming soon).
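Because the model IDs differ by platform and release channel, a small lookup table (using only the IDs cited above) keeps client code from hard-coding them in several places. This is a sketch, not an exhaustive catalog of Veo model IDs:

```python
# (platform, tier, channel) -> model ID, taken from the IDs listed in this section
VEO_MODEL_IDS = {
    ("gemini_api", "standard", "preview"): "veo-3.1-generate-preview",
    ("gemini_api", "fast", "preview"): "veo-3.1-fast-generate-preview",
    ("vertex_ai", "standard", "ga"): "veo-3.1-generate-001",
    ("vertex_ai", "fast", "ga"): "veo-3.1-fast-generate-001",
}


def model_id(platform: str, tier: str, channel: str) -> str:
    """Resolve a model ID, raising a clear error for unknown combinations."""
    try:
        return VEO_MODEL_IDS[(platform, tier, channel)]
    except KeyError:
        raise ValueError(f"no known model ID for {platform}/{tier}/{channel}") from None


print(model_id("vertex_ai", "fast", "ga"))  # -> veo-3.1-fast-generate-001
```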
Geographic & Account Requirements: Availability varies by product surface (Gemini app/Flow/API) and country/region. Check official availability lists for your location. API access requires appropriate credentials; Vertex AI requires a Google Cloud account with billing enabled. Standard Google Cloud regional availability and quota limits apply.
Technical Specifications: All generations produce 8-second videos at 720p or 1080p resolution with natively generated audio. You are charged only when a video is successfully generated.
Pricing & Plans
Veo 3.1 maintains the same pricing structure as Veo 3 on the Gemini API paid tier:
- Standard Veo 3.1: $0.40 per second of generated video (includes audio)
- Veo 3.1 Fast: $0.15 per second of generated video (includes audio)
Fast is optimized for speed and lower cost, while Standard is positioned for highest quality and creative control. All pricing includes native audio generation (dialogue, sound effects, ambient noise) at no additional cost. Since each generation produces a fixed 8-second clip, a single generation works out to $3.20 on Standard and $1.20 on Fast, making per-project budgeting straightforward.
Vertex AI Pricing: May differ from Gemini API rates. Enterprise usage can leverage provisioned throughput or fixed quota configurations. Consult Google Cloud documentation for Vertex AI-specific pricing.
Billing: You are charged only when a video is successfully generated. Preview features may have separate quota limits during the preview period. API access requires a Google Cloud account with billing enabled.
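At the quoted per-second rates, per-generation cost is easy to estimate. A minimal sketch, with the Gemini API rates hard-coded from the list above (real billing may differ on Vertex AI and during preview periods):

```python
RATE_PER_SECOND = {"standard": 0.40, "fast": 0.15}  # USD per second, audio included
SEGMENT_SECONDS = 8  # fixed clip length per generation


def generation_cost(tier: str, generations: int = 1) -> float:
    """Estimated cost in USD for a number of successful 8-second generations."""
    return round(RATE_PER_SECOND[tier] * SEGMENT_SECONDS * generations, 2)


print(generation_cost("standard"))  # -> 3.2  (one 8-second Standard clip)
print(generation_cost("fast", 8))   # -> 9.6  (eight Fast clips, ~64 seconds of footage)
```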
Pros & Cons
Pros:
- Visual consistency across shots — Reference image generation (up to 3 images) ensures characters and props maintain consistent appearance across multiple video generations
- Extended creative control — Video extension for Veo-generated content and first/last frame constraints provide fine-grained control over output
- Competitive pricing — Standard at $0.40/second and Fast at $0.15/second, with audio included at no extra cost
- Native audio included — Synchronized dialogue, sound effects, and ambient audio generated alongside video; you're only charged for successful generations
- Enterprise-ready infrastructure — Vertex AI integration provides permissions, quotas, audit trails, and compliance features for business workflows
- Multiple access methods — Available through API (paid preview), enterprise platform (GA + preview features), and consumer interface to suit different user needs
Cons:
- Preview status for advanced features — Reference images, video extension, and frame control available primarily through preview model IDs; object editing features fragmented across platforms
- Feature availability varies by platform — GA models on Vertex AI lack reference image and extension support; object editing not available on Gemini API; capabilities differ between Flow, API, and Vertex AI
- 8-second generation limit per request — Longer videos require chaining multiple generations through extension workflows; workflow complexity increases with desired length
- Geographic and approval restrictions — Availability varies by country/region and product surface; certain use cases (person/child generation) may require project approval on Vertex AI
- Learning curve for advanced features — Reference image generation (3-image limit), frame control, and extension workflows require understanding optimal input formats and prompt engineering
Best For
- Content creators producing video series or campaigns requiring consistent character appearances across multiple episodes or scenes through reference image workflows
- Brand marketers generating product videos where consistent product styling and presentation are critical across variations (up to 3 reference images)
- Film and animation studios using AI for storyboarding and previsualization with controlled scene transitions and frame-level control
- Enterprise content teams needing scalable video production with audit trails, approval workflows, and compliance features through Vertex AI
- Developers building applications with AI video generation features requiring programmatic control over visual consistency through paid preview APIs
- Creative professionals who need to extend AI-generated video clips into longer sequences (particularly through Flow's extension capabilities)
FAQ
Can I use Veo 3.1 to extend any video or only Veo-generated videos?
Official documentation specifies that video extension is designed to "extend videos previously generated using Veo." This indicates the feature is optimized for Veo-generated content rather than arbitrary external videos. Compatibility with specific Veo versions (Veo 2 vs Veo 3) should be confirmed through current Gemini API or Vertex AI documentation, as the feature is still in preview across platforms.
How many reference images can I use for consistent generation?
According to Gemini API documentation, image-based direction supports up to 3 reference images per generation. This limit applies to the preview model IDs on the Gemini API; limits on other platforms (Flow, Vertex AI) should be verified through their respective documentation.
What's the difference between Scene Extension and Video Extension?
These are different names for the same underlying capability used across Google's product surfaces. Flow labels this workflow as "Extend" or "Scene extension," while Gemini API and Vertex AI documentation refer to it as "Video extension." All variants allow you to extend videos previously generated with Veo into longer sequences.
Is Veo 3.1 Fast available with the same control features?
Veo 3.1 Fast is available on both Gemini API (as veo-3.1-fast-generate-preview) and Vertex AI (GA model veo-3.1-fast-generate-001 plus preview variants). On Vertex AI, the GA Fast model does not support reference images or video extension; these features require using preview model IDs. Check current capability matrices in Gemini API and Vertex AI documentation for the latest feature availability across Standard and Fast variants.
When will object insertion and removal move from preview to general availability?
No official timeline has been announced. Note that object insertion and removal are currently preview features in Veo 2 on Vertex AI (not Veo 3.1). On Google Labs Flow, object insertion ("Insert") is already available, while object removal ("Remove") is marked as "coming soon." The Gemini API does not currently offer object editing capabilities. Google typically provides advance notice through Vertex AI Release Notes and blog posts when preview features approach general availability.