What is Lip-sync?

Lip-sync, short for lip synchronization, refers to matching lip movements to spoken words or sounds. Our AI model adjusts the speaker’s lip positions to ensure that every word spoken aligns with the speaker’s mouth movements.

Can I lip-sync videos without dubbing?

No, the Lip-sync feature can only be applied to a translated video. You must create and dub your project before using Lip-sync.

Is Lip-sync free?

Lip-syncing your video is billed separately from the dubbing process. The number of minutes charged for lip-syncing corresponds to the length of your video.

However, minutes for lip-syncing are only charged once per project. This means you can make changes and regenerate the lip-sync as many times as needed without incurring additional charges.

What do I start with?

  1. Create a project following the steps outlined here, or use an existing project.
  2. Ensure the project is dubbed, i.e., the translated_video property in the Get Project response is populated with a download link (a minimal programmatic check is sketched below).
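
As a quick programmatic check, the Python sketch below verifies the dubbing prerequisite. The base URL, endpoint path, and auth header are placeholders, not the actual API values; substitute the ones from your API reference.

import requests

API_BASE = "https://api.example.com/v1"   # placeholder: use the real base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder auth header

def is_dubbed(project_id: str) -> bool:
    # Call the Get Project endpoint (path is a guess; check the API reference).
    resp = requests.get(f"{API_BASE}/projects/{project_id}", headers=HEADERS)
    resp.raise_for_status()
    # translated_video holds the download link once dubbing has finished.
    return bool(resp.json().get("translated_video"))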

Check face task

For a video to be eligible for lip-syncing, it must contain at least one face. The Check Face task is responsible for verifying this. This task only needs to be run once per project, and is typically started automatically.

To confirm that the task has been completed and the video is eligible for lip-syncing, check the response from the Get lip-sync info endpoint. Ensure the following fields are set:

"check_face_task_status": "done",
"video_has_face": true,

If the task has not been started yet, you can manually initiate it using the Run check face endpoint.
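
Both steps can be combined in a small Python sketch. The endpoint paths and auth header below are placeholders; the real values are in the API reference.

import requests

API_BASE = "https://api.example.com/v1"   # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder

def face_check(project_id: str) -> bool:
    # Get lip-sync info for the project (path is a guess; check the API reference).
    resp = requests.get(f"{API_BASE}/projects/{project_id}/lipsync", headers=HEADERS)
    resp.raise_for_status()
    info = resp.json()
    status = info.get("check_face_task_status")
    if status is None:
        # Task not started yet: trigger it via the Run check face endpoint
        # (hypothetical path) and poll again later.
        requests.post(f"{API_BASE}/projects/{project_id}/check-face",
                      headers=HEADERS).raise_for_status()
        return False
    # Eligible only once the task is done and a face was found.
    return status == "done" and info.get("video_has_face", False)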

Generate Lip-synced video

Once you’re ready to generate a lip-synced version of your video, you can do so by running the Lip-sync task.

To generate the full lip-synced version of your video, pass the following parameters:

"is_multiple_speakers": "True",
"is_free_lipsync": "False
If your video always has only one face on screen at a time (this refers to faces visible on screen simultaneously, not the number of speakers), you can set is_multiple_speakers to false. This will help speed up the process.
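
For illustration, here is a minimal Python sketch of starting the task. The endpoint path and auth header are placeholders; use the actual values from the API reference.

import requests

API_BASE = "https://api.example.com/v1"   # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder

def start_lipsync(project_id: str, multiple_speakers: bool = True) -> None:
    payload = {
        "is_multiple_speakers": multiple_speakers,  # set False if only one face is ever on screen
        "is_free_lipsync": False,
    }
    # Run the Lip-sync task (path is a guess; check the API reference).
    resp = requests.post(f"{API_BASE}/projects/{project_id}/lipsync",
                         headers=HEADERS, json=payload)
    resp.raise_for_status()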

Check progress and download result

Once you have successfully started the Lip-sync process, you can monitor the progress using the Get lip-sync info endpoint.

Task statuses
  1. Queued

While the task status is:

"lipsync_task_status": "started",
"lipsync_task_progress": null

your video is queued for processing.

To monitor the queue’s progress, you can check the tasks_in_lipsync_queue value, which provides the number of tasks ahead in the queue. For details about queueing and processing speed, refer to Introducing the New Lip-sync Model.

  2. Processing

When lipsync_task_progress starts changing (e.g., 0 → 50 → 100), it indicates that your video is actively being processed.

  3. Completed

Once the process is finished, you’ll see:

"lipsync_task_status": "done",
"lipsync_task_progress": 100

At this point, the lipsync_result_path property will be populated with a download link for your completed lip-synced video.
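
A simple polling loop can cover all three statuses. The sketch below is illustrative Python with placeholder endpoint paths and auth; it assumes the field names shown above.

import time
import requests

API_BASE = "https://api.example.com/v1"   # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder

def wait_for_lipsync(project_id: str, poll_seconds: int = 30) -> str:
    while True:
        resp = requests.get(f"{API_BASE}/projects/{project_id}/lipsync", headers=HEADERS)
        resp.raise_for_status()
        info = resp.json()
        status = info.get("lipsync_task_status")
        if status == "done":
            # lipsync_result_path holds the download link for the finished video.
            return info["lipsync_result_path"]
        if status == "failed":
            raise RuntimeError(f"Lip-sync failed for project {project_id}; contact support.")
        # Progress is null while queued and counts 0 -> 100 while processing.
        print(f"status={status}, progress={info.get('lipsync_task_progress')}, "
              f"ahead in queue={info.get('tasks_in_lipsync_queue')}")
        time.sleep(poll_seconds)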

Re-generate Lip-sync after changes to the video

Lip-sync can only be generated once per dubbed version of the video. If you’ve made changes to the segments - such as text, timestamps, speakers, or speaker voices - you’ll first need to regenerate the voiceover to create a new version of the dubbed video. Follow the steps outlined here to do this.

Once the new version is ready and the status is merging_done, you can run the Lip-sync task again for this project to account for the changes made to the voiceover.

Re-generating lip-sync is free, and you don’t need to run the Check Face Task again.
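
As a sketch, assuming the merging_done status appears in the Get Project response (and with placeholder endpoint paths and auth as before):

import time
import requests

API_BASE = "https://api.example.com/v1"   # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}   # placeholder

def rerun_lipsync(project_id: str, poll_seconds: int = 15) -> None:
    # Wait until the regenerated dub is merged (assumes the status field is
    # returned by the Get Project endpoint).
    while True:
        resp = requests.get(f"{API_BASE}/projects/{project_id}", headers=HEADERS)
        resp.raise_for_status()
        if resp.json().get("status") == "merging_done":
            break
        time.sleep(poll_seconds)
    # Re-running lip-sync is free; no need to repeat the Check Face task.
    requests.post(f"{API_BASE}/projects/{project_id}/lipsync", headers=HEADERS,
                  json={"is_multiple_speakers": True, "is_free_lipsync": False}
                  ).raise_for_status()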

Failures

If your check_face_task_status or lipsync_task_status shows as failed, please contact us at support@rask.ai and include your project_id for assistance.