Kestra: An Extra powerful Orchestrator

Ajith Shetty
May 27, 2023
source: https://github.com/kestra-io/kestra

Automation is a key theme in every part of software development and engineering.

Automation is moving at a very fast pace, and more and more companies are looking for a single place to cover all their requirements.

One requirement common to almost every organization is orchestration and scheduling.

Building a process is one task; managing, orchestrating and scheduling it is the next big one.

With increasingly complex business requirements and ever-changing infrastructure and cloud processes, life becomes much easier for engineers when there is one common place to deploy and monitor their processes.

Kestra

To put it in a simple one-liner, Kestra is an:

“Event-driven declarative orchestrator to simplify data operations”

Source: https://github.com/kestra-io/kestra

Kestra is an open-source orchestrator built to simplify the business processes shared between engineers and business users.

Kestra is built to create reliable workflows and to manage and monitor them in a single place.

Kestra shines thanks to declarative YAML. You provide the business logic in a YAML file, and the orchestration logic can be updated from the UI or through an API.

Ref: https://kestra.io/

Key Terms

Kestra is built around the following key concepts:

Flow

A flow is the main concept; it consists of tasks and orchestration logic.

Namespace

Namespaces are like folders, and they can be nested. Think of a namespace as a logical separation, for example between your DEV and PROD environments.
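As a rough illustration of the folder analogy (the demo later in this post notes that namespaces are nested with the dot symbol), the Python sketch below lists a namespace together with its parent levels. The helper name and the namespace `prod.sales.reports` are made up for illustration; this is not a Kestra API.

```python
def namespace_ancestors(namespace: str) -> list:
    """Return the namespace and all of its parents, outermost first."""
    parts = namespace.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    # "prod.sales.reports" is a hypothetical namespace, used only as an example.
    for level in namespace_ancestors("prod.sales.reports"):
        print(level)
```

Read this way, a flow stored in `prod.sales.reports` sits inside `prod.sales`, which in turn sits inside `prod`, much like nested folders.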

Tasks

Tasks are the operations within a flow. You can create multiple tasks and define them to run either sequentially or in parallel.

Triggers

Triggers define when your flow runs. There are multiple kinds of triggers:

  1. A regular schedule
  2. API call
  3. Trigger from another flow
  4. Custom events like a file sensor
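To make the "API call" option more concrete, here is a minimal Python sketch that starts an execution over HTTP. It assumes a local instance at http://localhost:8080 and a flow `flow-001` in namespace `dev`, the values used in the demo later in this post. The endpoint path `/api/v1/executions/trigger/{namespace}/{flowId}` and the bare POST without inputs are assumptions about the Kestra version in use, so check the API documentation of your instance before relying on them.

```python
import urllib.request


def build_trigger_url(base_url: str, namespace: str, flow_id: str) -> str:
    """Build the URL used to start an execution.

    The path layout is an assumption; verify it against your Kestra version.
    """
    return f"{base_url.rstrip('/')}/api/v1/executions/trigger/{namespace}/{flow_id}"


def trigger_flow(base_url: str, namespace: str, flow_id: str, timeout: float = 10.0) -> int:
    """Send a POST request to start the flow and return the HTTP status code."""
    request = urllib.request.Request(
        build_trigger_url(base_url, namespace, flow_id),
        data=b"",  # no inputs are sent in this sketch
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Requires a running instance; the values match the demo in this post.
    print(trigger_flow("http://localhost:8080", "dev", "flow-001"))
```

Keeping the URL construction in its own function makes it easy to adjust the path if your version exposes a different route.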

Inputs

Inputs are the parameters you pass to a flow when it runs, for its tasks to use. They are strongly typed and allow additional validation rules.

Plugins

Kestra ships with pre-defined tasks, referred to as plugins.

At a high level, it supports most of the standard integrations, such as:

S3, DynamoDB, DBT, Fivetran, Git, DuckDB, Rockset, Spark, PowerBI and many more.

Currently available plugins: https://kestra.io/plugins

If you have a custom requirement, you can write your own plugins.

https://kestra.io/docs/plugin-developer-guide

Additional features

Along with the standard set of features, Kestra supports other capabilities such as:

  1. retries
  2. timeout
  3. error handling
  4. conditional branching
  5. dynamic tasks
  6. sequential and parallel tasks
  7. skipping tasks or triggers when needed by setting the flag disabled to true.
  8. configuring dependencies between tasks, flows and triggers
  9. advanced scheduling and trigger conditions
  10. backfills
  11. documenting your flows, tasks and triggers by adding a markdown description to any component

Ref: https://github.com/kestra-io/kestra#rich-orchestration-capabilities

Simplicity in writing logic

Kestra provides a UI where the user can write the business requirements as a YAML file.

The UI supports auto-completion and provides examples that help the user write better logic more quickly.

Using the Topology view, the user can easily monitor the different tasks and their real-time status.

Before we start the demo, I would really encourage you to read the discussion below, where I found an interesting point of view on Kestra vs. Airflow.

DEMO

We will use Docker Compose to set up the demo.

https://demo.kestra.io/ui/

Download the Docker Compose file and start it:

curl -o docker-compose.yml https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
docker-compose up

Open localhost in your browser:

http://localhost:8080

Homepage

Click on Create my First Flow

Let's use a simple hello-world program.

But let's add a bit of action.

id: flow-001
namespace: dev
tasks:
  - id: bash-print-1
    type: io.kestra.core.tasks.log.Log
    message: Hello world from bash print 1!

  - id: bash-print-2
    type: io.kestra.core.tasks.log.Log
    message: Hello world from bash print 2!

  - id: python-print-1
    type: io.kestra.core.tasks.scripts.Python
    inputFiles:
      data.json: |
        {"status": "OK"}
      main.py: |
        import json
        import sys
        result = json.loads(open(sys.argv[1]).read())
        print(f"python script {result['status']}")
    args:
      - data.json

id: the name of the flow. It must be unique within the namespace.

namespace: the logical separation of flows, for example DEV and PROD. Namespaces can also be nested using the dot symbol.

tasks: the actions to run; here, the simple hello-world steps.
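If you want to check the Python part of the flow before pasting it into the UI, the sketch below reproduces it outside Kestra. It writes the same `data.json` content to a temporary directory and applies the same logic as `main.py`. The function names are illustrative, and this only mimics what the task's script does; it does not reproduce how Kestra itself stages the input files.

```python
import json
import tempfile
from pathlib import Path


def render_status(data_path: str) -> str:
    """Same logic as main.py in the flow: load the JSON file and format its status."""
    with open(data_path) as handle:
        result = json.load(handle)
    return f"python script {result['status']}"


def run_local_check() -> str:
    """Create data.json in a temporary directory and run the script logic on it."""
    with tempfile.TemporaryDirectory() as workdir:
        data_file = Path(workdir) / "data.json"
        data_file.write_text('{"status": "OK"}')
        return render_status(str(data_file))


if __name__ == "__main__":
    print(run_local_check())  # prints: python script OK
```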

Did you notice the auto-completion? It helps you write your logic even faster.

To make it friendlier, the UI comes with integrated examples.

Now save it and click Execute on the Flow page.

Execution

You can check the logs on the Logs page.

You can see that your tasks ran sequentially.

Let's add a schedule

id: flow-001
namespace: dev
tasks:
  - id: bash-print-1
    type: io.kestra.core.tasks.log.Log
    message: Hello world from bash print 1!

  - id: bash-print-2
    type: io.kestra.core.tasks.log.Log
    message: Hello world from bash print 2!

  - id: python-print-1
    type: io.kestra.core.tasks.scripts.Python
    inputFiles:
      data.json: |
        {"status": "OK"}
      main.py: |
        import json
        import sys
        result = json.loads(open(sys.argv[1]).read())
        print(f"python script {result['status']}")
    args:
      - data.json
triggers:
  - id: schedule
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "*/15 * * * *"
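The cron expression `*/15 * * * *` means "every 15 minutes", that is, at minutes 0, 15, 30 and 45 of every hour. The Python sketch below shows which times that produces from a given moment. It is hard-coded for this one expression and is not a general cron parser or Kestra's own scheduler; the function name is illustrative.

```python
from datetime import datetime, timedelta


def next_quarter_hour_runs(now: datetime, count: int = 3) -> list:
    """Return the next `count` fire times of '*/15 * * * *' strictly after `now`."""
    base = now.replace(second=0, microsecond=0)
    # Minutes remaining until the next multiple of 15 (a full 15 if already on one).
    minutes_to_next = 15 - (base.minute % 15)
    first = base + timedelta(minutes=minutes_to_next)
    return [first + timedelta(minutes=15 * i) for i in range(count)]


if __name__ == "__main__":
    for run in next_quarter_hour_runs(datetime(2023, 5, 27, 10, 7, 30)):
        print(run.isoformat())
```

For a start time of 10:07:30, the next three runs are 10:15, 10:30 and 10:45.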

Click on Topology.

You can see the changes there as well.

You can monitor the execution status of the entire namespace on the homepage.

Feel free to use the hosted demo setup for an even easier demo: https://demo.kestra.io/ui/

Conclusion

Kestra is a powerful orchestrator that can help you manage and monitor your jobs in one place.

Writing logic in Kestra is very easy thanks to the built-in plugins, and it does not expect you to learn a new programming language. You define the process in a YAML file and you are good to go.

Kestra offers much more than what we discussed in this blog. I strongly encourage you to read the documentation below to understand the unique set of features you can leverage.

Features: https://kestra.io/features

Kestra also provides a Terraform provider.

https://kestra.io/docs/terraform/


Ajith Shetty

Bigdata Engineer — Bigdata, Analytics, Cloud and Infrastructure.


Interested in a weekly newsletter on big data analytics from around the world? Subscribe to my weekly newsletter, Just Enough Data.
