
Schedule Workflows in Apache Airflow
Automate your workflows with Apache Airflow! In this hands-on Code Lab, you’ll learn how to schedule, trigger, and manage workflows using Airflow’s powerful scheduling features. Whether you're manually triggering tasks via the CLI or automating execution with cron expressions, this lab will give you the practical skills needed to build reliable, time-driven data pipelines.

Challenge: Creating a DAG with a Scheduled Interval
In this lab, you will learn how to schedule workflows in Apache Airflow. Scheduling is a crucial aspect of workflow automation, allowing tasks to run at specified intervals without manual intervention. You will gain hands-on experience in defining Directed Acyclic Graphs (DAGs), manually triggering workflows, and configuring automatic execution using `schedule_interval` and `schedule`.

What is a DAG?
A Directed Acyclic Graph (DAG) in Apache Airflow represents a sequence of tasks with dependencies, ensuring an orderly execution. DAGs define the relationships and execution order of tasks but do not contain data themselves. Each DAG run executes a predefined workflow based on a schedule or manual trigger.
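For orientation, a minimal scheduled DAG definition might look like the sketch below. This assumes Airflow 2.4+ and uses the newer `schedule` parameter; the DAG id and printed message match what this lab describes, but the exact contents of the lab's `tasks.py` may differ:

```python
# A minimal sketch of a scheduled DAG (Airflow 2.4+ assumed).
# The task id and schedule below are illustrative choices.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def print_message():
    print("Executing the scheduled task")


with DAG(
    dag_id="scheduled_workflow",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # replaces the deprecated schedule_interval
    catchup=False,       # do not backfill runs for past intervals
) as dag:
    print_task = PythonOperator(
        task_id="print_task",
        python_callable=print_message,
    )
```

The `with DAG(...)` context manager automatically attaches every operator created inside it to the DAG, so no explicit `dag=` argument is needed on the task.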
Defining a DAG with a scheduled interval allows automation of tasks at predefined times, reducing manual effort and ensuring consistency in data processing. Before proceeding with the lab, please ensure that you run the following command to validate the `tasks.py` script and confirm that your DAG is correctly defined. Open a terminal and run `python tasks.py`.
You will observe this warning:
⚠ Warning:
/home/ps-user/workspace/tasks.py:17 RemovedInAirflow3Warning: Param schedule_interval is deprecated and will be removed in a future release. Please use schedule instead.
You're seeing this warning because `schedule_interval` is deprecated in Airflow 2.4+ and will be removed in Airflow 3. The new parameter is `schedule`. After changing the parameter from `schedule_interval` to `schedule`, validate that the warning has been resolved by running the `python tasks.py` command in the terminal again. If successful, you will see no output.

###### Load your DAG into Airflow

- Airflow scans the dags/ folder. DAGs placed elsewhere will not be detected.
- Storing DAGs in the right location ensures smooth scheduling.
- Keeping all DAGs in one place helps with version control and debugging.
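Loading the DAG is typically just a matter of copying the file into the scheduler's dags/ folder. The path below assumes the default `AIRFLOW_HOME` of `~/airflow`; your lab environment may configure a different location:

```shell
# Copy the DAG file into the folder Airflow scans for DAGs.
# ~/airflow/dags/ is the default location; adjust if AIRFLOW_HOME differs.
cp tasks.py ~/airflow/dags/

# Confirm Airflow has picked the DAG up (this may take one scheduler
# scan cycle, so re-run after a short wait if it is not listed yet).
airflow dags list
```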
Verify Your DAG in Airflow UI
In this step, you will access the Airflow web interface and visualize your DAG's structure.
🟦 Why It Matters:
- The Airflow UI is an essential tool for monitoring and managing workflows.
- Visualizing DAGs helps ensure that tasks and dependencies are correctly defined and working as intended.
Instructions
- Open the Web Browser tab in the lab environment and navigate to http://localhost:8081/. Refresh the browser tab if you encounter a 502 response or a blank screen.
- Log in to the Airflow UI using the following credentials:
  - Username: admin
  - Password: admin
- Once logged in:
- Navigate to the list of DAGs and locate the scheduled_workflow entry.
- Click on the DAG name to open its details page.
- Click on the Graph tab to visualize the DAG structure. This is a simple DAG that prints `Executing the scheduled task` when run.
Challenge: Triggering Workflows Manually Using CLI
🟦 Why It Matters:
- Triggering DAGs manually via the CLI lets you test and validate a workflow before deploying its scheduled execution. This helps you catch dependency and task-logic issues early and confirm the DAG behaves as expected before relying on automation.
- It also provides on-demand execution when you cannot wait for the next scheduled interval, which is especially useful during development and troubleshooting.
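A manual trigger from the CLI might look like the following sketch, using the `scheduled_workflow` DAG id from this lab:

```shell
# If the DAG is still paused, unpause it first so the triggered
# run can actually execute rather than sit queued.
airflow dags unpause scheduled_workflow

# Trigger one run of the DAG immediately, outside its schedule.
airflow dags trigger scheduled_workflow

# Inspect recent runs to confirm the trigger was registered.
airflow dags list-runs -d scheduled_workflow
```

Manually triggered runs appear in the UI alongside scheduled ones, tagged with a `manual__` run id prefix, so they are easy to distinguish when reviewing run history.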
Challenge: Automating Workflow Execution
🟦 Why It Matters:
- Automating workflows ensures tasks run without manual intervention, improving efficiency and reliability.
- Airflow pauses newly created DAGs by default to prevent accidental executions.
- Unpausing the DAG enables its tasks to be executed according to its schedule or when triggered manually.

###### 🟦 Important:
A cron expression is a string consisting of five fields (or six in some systems) that specify the schedule for a job or task to run. The general structure for cron schedules is:
```
* * * * * *
│ │ │ │ │ │
│ │ │ │ │ └─ Year (optional)
│ │ │ │ └──── Day of the week (0 - 6) (Sunday = 0)
│ │ │ └────── Month (1 - 12)
│ │ └─────── Day of the month (1 - 31)
│ └──────── Hour (0 - 23)
└───────── Minute (0 - 59)
```
Special Cron Abbreviations for Airflow:
| Abbreviation | Description |
| ------------ | ----------- |
| `@hourly` | Equivalent to `0 * * * *` (runs at the start of every hour). |
| `@daily` | Equivalent to `0 0 * * *` (runs at midnight every day). |
| `@weekly` | Equivalent to `0 0 * * 0` (runs at midnight on Sunday). |
| `@monthly` | Equivalent to `0 0 1 * *` (runs at midnight on the 1st day of every month). |
| `@yearly` | Equivalent to `0 0 1 1 *` (runs at midnight on January 1st). |
| `@once` | Runs only once at the time of DAG execution. |
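The preset abbreviations above (other than `@once`, which has no cron equivalent) are just shorthand for plain five-field cron expressions. A small sketch makes the mapping concrete; `expand_preset` is a hypothetical helper for illustration, not an Airflow API:

```python
# Mapping of Airflow's cron preset abbreviations to the five-field
# cron expressions they stand for, mirroring the table above.
# "@once" is omitted because it has no cron equivalent.
CRON_PRESETS = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@yearly": "0 0 1 1 *",
}


def expand_preset(schedule: str) -> str:
    """Return the cron expression for a preset abbreviation, or the
    schedule unchanged if it is already a plain cron expression."""
    return CRON_PRESETS.get(schedule, schedule)


print(expand_preset("@daily"))     # 0 0 * * *
print(expand_preset("0 6 * * 1"))  # 0 6 * * 1 (already cron, unchanged)
```

Either form is accepted by the `schedule` parameter, so `schedule="@daily"` and `schedule="0 0 * * *"` produce the same run cadence.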
Congratulations on Completing the Lab! 🎉
In this lab, you explored how to schedule workflows in Apache Airflow by defining DAGs with cron-based schedules, triggering workflows manually, and automating task execution. By following the steps, you were able to implement scheduled tasks, trigger them using the CLI, and ensure their automated execution using Airflow’s scheduling capabilities.
Key Takeaways:
- Defining Scheduled Workflows: You learned how to define DAGs with cron expressions to automate task execution at specific intervals, minimizing manual work and ensuring timely execution.
- Manual DAG Triggering: By triggering DAGs manually via the CLI, you gained the ability to test workflows before automating them, ensuring proper functionality.
- Automation and Monitoring: You saw how to ensure DAGs run automatically based on the schedule, as well as how to monitor task executions and logs to confirm successful operations.