Marking the first anniversary of the Chinese video generation tool Kling AI, its parent company, Kuaishou, has launched its most advanced model yet: Kling 2.1. After the success of Kling 1.6 and 2.0, users and creators have been waiting for the release of Kling AI's next big thing, and it's finally here. With advanced video generation capabilities and better coherence and rendering quality, Kling 2.1 stands as a formidable contender in the AI video generation space against proprietary models such as Google's Veo 3 and OpenAI's Sora. In this article, we'll explore the features and video generation capabilities of Kling 2.1 and see how well it performs against Veo 3.
What Is Kling 2.1?
Kling 2.1 is an advanced AI-powered video generation model developed by Kuaishou. It transforms reference images and text prompts into high-definition, cinematic videos, leveraging sophisticated technologies like 3D spatiotemporal attention mechanisms and diffusion transformer architectures. Designed to simulate real-world physics and complex motion dynamics, Kling 2.1 aims to deliver videos that are both visually stunning and contextually coherent. Building upon its predecessor, Kling 2.0, this latest iteration introduces improvements that cater to both beginners and seasoned professionals.
Features of Kling 2.1
Here are some of the key features of Kling 2.1:
- Frame-based Video Generation: As opposed to most video generation models, which focus on text-to-video generation, Kling 2.1 generates videos based on input images used as reference frames.
- Realistic Motion and Physics Simulation: Using a 3D spatiotemporal joint attention mechanism, Kling 2.1 accurately models complex movements, ensuring that generated videos adhere to the laws of physics and exhibit natural motion.
- Dynamic Facial Expressions: The model excels at generating life-like facial expressions and accurate movements, enhancing the realism of characters and making them more engaging.
- Multiple Video Options: Kling 2.1 can create multiple videos from the same prompt, giving users more freedom and choice without the need for multiple iterations.
- AI-powered Prompting: For those who find it difficult to write detailed and accurate prompts for video generation, the model offers a DeepSeek-powered AI tool for generating prompts.
How to Access Kling 2.1
Kling 2.1 and its Master version are both available on the Kling AI website and app. Users around the world can sign up with just an email ID and try the models directly for image-to-video generation, using the free credits given at sign-up. Note that, as of now, these models can only be used for image-to-video generation.
How to Use Kling 2.1
Here's how you can generate videos from images using Kling 2.1 and Kling 2.1 Master:
- Select the Model on Kling AI
Once you open the website, select Kling 2.1 (or Kling 2.1 Master) from the model selection drop-down menu at the top.
- Add Reference Images
Under the image-to-video tab, select "Frames" and add a reference image to be used as the start frame or end frame of the generated video. Please note that the Elements feature is currently not supported by Kling 2.1.
- Add a Prompt
You have the option of adding a prompt to describe the video, or a negative prompt explaining what you do not want in the video. You can even use DeepSeek to generate detailed prompts for you based on your description, theme, or idea.
- Configure the Settings
Once you have the reference image and prompts (optional) in place, choose whether you want a standard or professional (for VIP users) video. Then decide on the length of the video (5 or 10 seconds) and the number of outputs you wish to generate (up to 4). Please note that only VIP users have the option of generating multiple videos from a single image/prompt.
- Generate the Video
Now that you're all set, simply click on "Generate" and wait in line for the model to generate your video. In the free version, this can take up to 120 minutes.
- Generate Sound (optional)
Once the video is generated, Kling gives you the option of adding sound to it using its sound generation tool. You can add your prompt here and generate 4 different sounds and dialogues to match the scene. However, please note that the tool only generates audio in Chinese for now and does not automatically lip sync with the video.
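The options described in the steps above can be summarized in a small Python sketch. This is not Kling's API: the class, field, and function names are hypothetical, and the code only models the constraints mentioned in this article (standard or professional mode, 5 or 10 second videos, up to 4 outputs, with professional mode and multiple outputs reserved for VIP users).

```python
from dataclasses import dataclass
from typing import List

# Hypothetical names: this models the options described in the article,
# not an actual Kling AI interface.
ALLOWED_MODES = {"standard", "professional"}
ALLOWED_DURATIONS = {5, 10}  # video length in seconds
MAX_OUTPUTS = 4


@dataclass
class GenerationSettings:
    image_path: str              # reference image used as start or end frame
    prompt: str = ""             # optional description of the video
    negative_prompt: str = ""    # optional list of things to avoid
    mode: str = "standard"
    duration_seconds: int = 5
    num_outputs: int = 1
    is_vip: bool = False


def validate(settings: GenerationSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if not settings.image_path:
        problems.append("a reference image is required")
    if settings.mode not in ALLOWED_MODES:
        problems.append("unknown mode: " + settings.mode)
    if settings.mode == "professional" and not settings.is_vip:
        problems.append("professional mode is for VIP users")
    if settings.duration_seconds not in ALLOWED_DURATIONS:
        problems.append("duration must be 5 or 10 seconds")
    if not 1 <= settings.num_outputs <= MAX_OUTPUTS:
        problems.append("number of outputs must be between 1 and 4")
    elif settings.num_outputs > 1 and not settings.is_vip:
        problems.append("multiple outputs are for VIP users")
    return problems
```

For example, `validate(GenerationSettings(image_path="frame.png", num_outputs=4))` reports that multiple outputs need a VIP account, while the same settings with `is_vip=True` pass without problems.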
Video Generation Capabilities of Kling 2.1
Users have taken to social media, praising Kling 2.1's ability to produce videos with realistic motion and expressive characters. Let's check out a few of the videos generated by Kling 2.1 from different image prompts to see how good this tool really is.
1. Hyper-realistic Human Video
Input Image:

Prompt: "A lady is dancing to fast-paced music."
Output:
Source: Kling AI Library
2. Animated Gaming Video
Input Image:

Description: "car in the city racing, 4K ultra realistic high-octane chase. Smooth movement, photorealistic, high quality."
DeepSeek-generated Prompt: "A sleek hover-car weaving between towering holographic billboards, blue plasma thrusters igniting, cityscape reflecting off its chrome body, 4K ultra realistic, dynamic motion"
Output:
Source: Kling AI Library
3. Dynamic Action Video
Input Image:

Prompt: "Cinematic action shot in the style of an action movie with a drone racing through a forest woodland at noon, navigating between trees. Sunlight streaking through leaves, close front tracking angle, dynamic movement, high contrast, intense atmosphere, detailed composition."
Negative Prompt: "morphing, erratic fluctuation in motion, noisy, bad quality, distorted, poorly drawn, blurry, grainy, low resolution, oversaturated, lack of detail, inconsistent lighting. Wrong anatomy, unnatural facial expressions, unnatural movements, blur, warp, distortion, disfigurement, pixelation, noisy, grainy, overly bright colors, harsh shadows, oversaturated colors, erratic fluctuation, artefacts, glitch, low quality, bad face, transition, morphing, titles, texts, logos, Cartoonish features."
Output:
Source: Kling AI Library
Kling 2.1 vs Veo 3 vs Sora: Features Comparison
Speaking of advanced video generation, we must find out how good this free tool is compared to proprietary models like Google's Veo 3 and OpenAI's Sora. Here's a standard comparison of the features of all three video generation models.
Feature | Kling 2.1 | Veo 3 | Sora |
Max Video Length | 3 minutes | 1 minute | 1 minute |
Resolution | 1080p | 1080p | 1080p |
Lip-Sync Capability | No | Yes | No |
Physics Simulation | Yes | Yes | No |
Aspect Ratio Flexibility | Low | Moderate | Low |
Editing Tools | Basic | Basic | Basic |
Access Availability | Global (Beta) | Limited (US only) | Limited |
Kling 2.1 vs Veo 3: Performance Comparison
Now, let's compare the performance of the two models we currently have access to: Kling 2.1 and Veo 3.
Here's a video I found online, which was generated using Veo 3.
I'll use a screenshot of this video as the first-frame reference image, add a prompt describing the scene, and see what Kling 2.1 does with it.
Input Image:

Prompt: "An American man wearing a blue t-shirt is at the boarding counter at the airport with his pet penguin. The airline staff, a lady dressed in blue, doesn't let him take the penguin on board. He's frustrated as she tries to explain the situation to him."
Video Generated by Kling 2.1
Now let's use Kling 2.1 to add audio to the generated video.
Comparative Analysis
Veo 3 generated a very realistic video with great detailing, appropriate expressions, and very well lip-synced audio. Even the flow of the action and the clarity and tone of the dialogue were top notch. On the whole, this is one of the best AI tools I've ever come across for video generation.
Kling 2.1 is exceptionally good at recreating videos from reference frames, as seen above. It generated quite realistic people and animals with accurate expressions and details. As a free tool, it does a better job than most others. However, when it comes to generating audio and syncing it, Kling 2.1 is quite disappointing. Be it the tone or the timing, the audio simply doesn't align with the video. So that's something I think the tool still needs to work on.
Conclusion
Kling 2.1 proves to be a promising model in the AI-powered video generation landscape. Its easy-to-use interface, its ability to create coherent videos, and the option to add audio to them make it one of the best free-to-use AI video generators out there. Its capabilities in realistic motion simulation, facial expression rendering, and creative artistry take it a step ahead of most of its contemporaries. That being said, the model still has room for improvement when it comes to generating audio and accurately lip syncing. So, here's looking forward to Kling AI's next version, which will probably fix these issues as well.