Schedule a Databricks job
In this exercise, you will use the Databricks Workspace Client to create a job that runs a notebook named HelloWorld at a scheduled time.
You already learned how to create a job that is triggered explicitly. By also passing the schedule parameter, with a cron expression, to the same function, you can make the job run at a specific time.
You can assume there is a predefined variable called cron_expression that holds the cron expression for running once a day at 3 AM Eastern Time.
We have already created a WorkspaceClient(), stored as w, and a notebook_path variable.
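The exercise does not show the value stored in cron_expression, so the following is only a minimal sketch of what a daily Quartz cron expression could look like. The helper name daily_quartz_cron is hypothetical and is not part of the Databricks SDK. It assumes the standard Quartz field order (seconds, minutes, hours, day of month, month, day of week). The time zone is not part of the expression itself; in the job definition it is supplied separately, as "America/New_York".

```python
def daily_quartz_cron(hour: int, minute: int = 0) -> str:
    """Build a Quartz cron expression that fires once a day at hour:minute.

    Hypothetical helper for illustration only; not part of the Databricks SDK.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    # Quartz field order: seconds minutes hours day-of-month month day-of-week.
    # "*" matches every value; "?" means "no specific value" and is used for
    # day-of-week because day-of-month is already given as "*".
    return f"0 {minute} {hour} * * ?"


# An expression for a run at 3:00 AM every day (assumed shape of cron_expression).
example_expression = daily_quartz_cron(3)
print(example_expression)  # 0 0 3 * * ?
```

Keeping the time zone out of the expression means the same string can be reused for jobs in other regions by changing only the time zone argument.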
This exercise is part of the course
Databricks with the Python SDK
Exercise instructions
- Use the create() method in the WorkspaceClient.jobs API to create a job.
- Schedule the job being created to run every day at 3 AM in the Eastern time zone, using the preset cron_expression variable.
- Configure the job to time out after 25 seconds.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a job scheduled to run every day at 3 AM
created_job = w.jobs.____(name='sdk-dc-project-task',
    tasks=[
        jobs.Task(description="schedule_notebook_test",
                  notebook_task=jobs.____(
                      ____=notebook_path),
                  task_key="my-key")
    ],
    # Schedule it to run every day at 3 AM
    schedule=jobs.____(
        quartz_cron_expression=____, ____="America/New_York"),
    # The job should time out after 25 seconds
    ____=25,
)