LTX-2
First open-source DiT-based foundation model for synchronized 4K video and audio generation with 19B parameters
Experience AI-powered 4K video and audio generation in real-time
Explore in-depth guides and practical tutorials for LTX-2
LTX-2 Prompting Guide: Master AI Video Generation
Learn the core principles and six essential elements of LTX-2 video generation. Master how to write high-quality prompts for cinematic 4K videos.
Explore advanced prompting strategies for different video types. Learn 4K/50FPS optimization and multi-shot sequencing techniques.
Step-by-step guide to run LTX-2 on consumer GPUs. Learn about GGUF quantization and performance optimization.
Explore the advanced features that make LTX-2 the leading open-source AI video generation model
Generate high-quality videos from text prompts with LTX-2's advanced DiT architecture (see the code sketch after this list)
Transform static images into dynamic videos with smooth motion and natural transitions
Create perfectly synchronized audio and video content in a single unified model
Generate production-ready 4K videos with spatial upscaling capabilities
Customize LTX-2 for specific styles, motions, or appearances with efficient LoRA training
Choose from dev, distilled, or quantized (fp8/fp4) models for optimal speed-quality balance
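For a concrete starting point, here is a minimal text-to-video sketch. It assumes the diffusers LTXPipeline published for the earlier LTX-Video checkpoints and uses that checkpoint id; LTX-2 weights may ship under a different id or entry point, so treat the settings below as illustrative and check the official repository for the exact API.

import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

# Load the pipeline; swap in the LTX-2 checkpoint id from the model card when available.
pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Prompt borrowed from the example gallery below; resolution and step count are assumptions.
video = pipe(
    prompt="A dramatic sunset over mountains with flowing clouds",
    negative_prompt="worst quality, blurry, jittery, distorted",
    width=704,
    height=480,
    num_frames=121,
    num_inference_steps=50,  # the distilled variant needs as few as 8 steps
).frames[0]

export_to_video(video, "sunset.mp4", fps=24)

If VRAM is tight, reduce the resolution and frame count first, or reach for the fp8/fp4 quantized variants, which trade a little quality for speed and memory.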
LTX-2 leverages cutting-edge Diffusion Transformer technology with 19B parameters
LTX-2 is built on a Diffusion Transformer (DiT) architecture, the first of its kind to generate synchronized audio and video in a single unified model. With 19 billion parameters, it delivers production-ready quality for professional workflows.
Discover how LTX-2 empowers creators across industries
Generate engaging social media videos from text descriptions with LTX-2's text-to-video capabilities
Rapid prototyping and pre-visualization for filmmakers using LTX-2's 4K generation
Create promotional videos with synchronized audio using LTX-2's audio-visual synthesis
Produce educational content and tutorials with LTX-2's image-to-video animation (sketched in code below)
Experiment with AI video generation techniques using LTX-2's open-source architecture
Generate cinematic cutscenes and trailers with LTX-2's video-to-video transformation
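The image-to-video use cases above follow the same pattern as text-to-video. A minimal sketch, again assuming the diffusers pipeline from the earlier LTX-Video release and a hypothetical local input file:

import torch
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Checkpoint id is the published LTX-Video one; substitute the LTX-2 id when available.
pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = load_image("portrait.png")  # hypothetical input image
video = pipe(
    image=image,
    prompt="Static portrait brought to life with natural motion",
    width=704,
    height=480,
    num_frames=121,
    num_inference_steps=50,
).frames[0]

export_to_video(video, "portrait.mp4", fps=24)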
Explore stunning examples generated by LTX-2
A dramatic sunset over mountains with flowing clouds
Static portrait brought to life with natural motion
Synchronized audio and video generation
Transform existing video with new artistic style
LTX-2 fine-tuned for specific artistic style
Spatial and temporal upscaling demonstration
Install and run LTX-2 locally in minutes
git clone https://github.com/Lightricks/LTX-2.git   # fetch the source
cd LTX-2
uv sync                                             # create .venv and install dependencies
source .venv/bin/activate                           # activate the virtual environment
Clone the LTX-2 repository and set up the environment using the uv package manager
Python Version: ≥ 3.12
CUDA Version: > 12.7
PyTorch Version: ~ 2.7
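Before a first run, it can help to confirm the environment matches these requirements. A small sanity check using only standard PyTorch calls (the file name check_env.py is arbitrary):

# check_env.py: compare the local setup against the stated requirements.
import sys
import torch

print(f"Python : {sys.version.split()[0]} (need >= 3.12)")
print(f"PyTorch: {torch.__version__} (need ~ 2.7)")
print(f"CUDA   : {torch.version.cuda} (need > 12.7)")

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU    : {props.name}, {props.total_memory / 1024**3:.1f} GiB VRAM")
else:
    print("No CUDA device detected; LTX-2 requires an NVIDIA GPU.")

Run it inside the synced environment with: uv run python check_env.py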
Find answers to common questions about LTX-2
LTX-2 is a 19B parameter DiT-based AI foundation model for synchronized audio-video generation. It's the first open-source model of its kind, capable of generating high-quality 4K videos with synchronized audio from text prompts, images, or existing videos.
LTX-2 supports multiple generation modes: text-to-video, image-to-video, video-to-video transformation, audio-to-video, and joint audio-visual content creation. It can generate videos up to 4K resolution with synchronized audio.
LTX-2 requires Python ≥ 3.12, CUDA > 12.7, PyTorch ~ 2.7, and an NVIDIA GPU with sufficient VRAM. The exact VRAM requirements depend on the model variant and generation settings you choose.
Yes, LTX-2 is fully open-source under the Apache 2.0 license. You can freely use, modify, and distribute LTX-2 for both personal and commercial projects.
LTX-2 offers several variants: dev (bf16 full precision), fp8 and fp4 quantized versions for faster inference, and a distilled version optimized for speed. Additionally, spatial and temporal upscaler models are available.
Yes, LTX-2 supports LoRA fine-tuning for custom styles, motions, and appearances. In many settings, a motion, style, or likeness LoRA can be trained in under an hour.
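Loading a trained LoRA at inference time is a one-liner in diffusers-style pipelines. A minimal sketch, assuming the diffusers LTXPipeline and a hypothetical LoRA repository id:

import torch
from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Repository id and adapter name are illustrative placeholders.
pipe.load_lora_weights("your-username/ltx-style-lora", adapter_name="style")
pipe.set_adapters(["style"], adapter_weights=[0.8])  # scale the LoRA's influence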
LTX-2 supports up to 4K resolution with spatial upscaling capabilities. The base model generates videos at various resolutions, and the spatial upscaler can enhance them to 4K quality.
Generation time depends on the model variant you choose. The distilled version is fastest, completing in just 8 inference steps, while the dev version offers the highest quality but takes longer. Quantized versions (fp8/fp4) provide a good balance.
Yes, LTX-2 is the first DiT-based model to generate synchronized audio and video in a single unified model. It can create perfectly matched audio-visual content for various applications.
You can try the live demo on HuggingFace Spaces at huggingface.co/spaces/Lightricks/ltx-2-distilled, or install LTX-2 locally from GitHub for full control and customization.