Video generation
Time for you to have a go at creating a video entirely from a text prompt! You'll use the CogVideoXPipeline and the following prompt to guide the generation:
A robot doing the robot dance. The dance floor has colorful squares and a glitterball.
Note: Inference on video generation models can take a long time, so we've pre-loaded the generated video for you. Running different prompts will not generate new videos.
The CogVideoXPipeline class has already been imported for you.
This exercise is part of the course Multi-Modal Models with Hugging Face.
Exercise instructions
- Create a CogVideoXPipeline from the THUDM/CogVideoX-2b checkpoint.
- Run the pipeline with the provided prompt, setting the number of inference steps to 20, the number of frames to generate to 20, and the guidance scale to 6.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
prompt = "A robot doing the robot dance. The dance floor has colorful squares and a glitterball."
# Create a CogVideoXPipeline
pipe = ____(
    "____",
    torch_dtype=torch.float16
)
# Run the pipeline with the provided prompt
video = ____
# Take the frames of the first (and only) generated video
video = video.frames[0]

# Export the frames to an MP4 file, then convert it to a GIF
video_path = export_to_video(video, "output.mp4", fps=8)
video = VideoFileClip(video_path)
video.write_gif("video_ex.gif")
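If you get stuck, here is a minimal sketch of one possible solution. It assumes the standard diffusers API for CogVideoXPipeline and that moviepy provides VideoFileClip; in the exercise environment these are pre-loaded, and the import lines below are shown only so the sketch is self-contained:

import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video
from moviepy.editor import VideoFileClip  # from moviepy import VideoFileClip on moviepy 2.x

prompt = "A robot doing the robot dance. The dance floor has colorful squares and a glitterball."

# Create a CogVideoXPipeline from the THUDM/CogVideoX-2b checkpoint
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
)

# Run the pipeline with the provided prompt, 20 inference steps,
# 20 frames, and a guidance scale of 6
video = pipe(
    prompt=prompt,
    num_inference_steps=20,
    num_frames=20,
    guidance_scale=6
)

# Take the frames of the first (and only) generated video
video = video.frames[0]

# Export the frames to an MP4 file, then convert it to a GIF
video_path = export_to_video(video, "output.mp4", fps=8)
video = VideoFileClip(video_path)
video.write_gif("video_ex.gif")

On real hardware you would typically also move the pipeline to a GPU before calling it, for example with pipe.to("cuda") or pipe.enable_model_cpu_offload(); in this exercise the output video is pre-loaded, so execution is handled for you.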