Build a video!
Time for you to have a go at creating a video entirely from a text prompt! You'll use a CogVideoXPipeline and the following prompt to guide the generation:
A robot doing the robot dance. The dance floor has colorful squares and a glitterball.
Note: Inference on video generation models can take a long time, so we've pre-loaded the generated video for you. Running different prompts will not generate new videos.
The CogVideoXPipeline class has already been imported for you.
This exercise is part of the course
Multi-Modal Models with Hugging Face
Exercise instructions
- Create a CogVideoXPipeline from the THUDM/CogVideoX-2b checkpoint.
- Run the pipeline with the provided prompt, setting the number of inference steps to 20, the number of frames to generate to 20, and the guidance scale to 6.
Interactive exercise
Complete the sample code to finish this exercise.
prompt = "A robot doing the robot dance. The dance floor has colorful squares and a glitterball."
# Create a CogVideoXPipeline
pipe = ____(
    "____",
    torch_dtype=torch.float16
)

# Run the pipeline with the provided prompt
video = pipe(
    prompt=____,
    num_inference_steps=____,
    num_frames=____,
    guidance_scale=____
)
video = video.frames[0]
video_path = export_to_video(video, "output.mp4", fps=8)
video = VideoFileClip(video_path)
video.write_gif("video_ex.gif")
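For reference, a completed version of the scaffold might look like the sketch below. It uses the values from the instructions (the `THUDM/CogVideoX-2b` checkpoint, 20 inference steps, 20 frames, guidance scale 6). Because downloading the model and running inference are slow, the heavy calls are gated behind an environment flag (`RUN_INFERENCE`, a name chosen here for illustration); the parameter values themselves are defined up front.

```python
# Completed sketch of the exercise, assuming the diffusers CogVideoXPipeline API.
# Model download and inference are slow, so they only run when RUN_INFERENCE=1.
import os

MODEL_ID = "THUDM/CogVideoX-2b"
PROMPT = "A robot doing the robot dance. The dance floor has colorful squares and a glitterball."
GEN_KWARGS = {
    "num_inference_steps": 20,  # denoising steps per the instructions
    "num_frames": 20,           # frames of video to generate
    "guidance_scale": 6,        # how strongly the prompt steers generation
}

if os.environ.get("RUN_INFERENCE") == "1":
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    # Create the pipeline from the pretrained checkpoint in half precision
    pipe = CogVideoXPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

    # Generate the video and export the first batch of frames to an MP4
    video = pipe(prompt=PROMPT, **GEN_KWARGS).frames[0]
    export_to_video(video, "output.mp4", fps=8)
```

The GIF-export step in the exercise (via `VideoFileClip.write_gif`) can then be applied to `output.mp4` as shown above.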