Matt Rickard

Workflow Engine Paradigms

blog.matt-rickard.com

Jan 24, 2023

All happy workflow engines are alike; every unhappy workflow engine is unhappy in its own way. – Tolstoy, on workflow engines.

Workflow engines automate a series of tasks. These tasks are usually related to CI/CD, infrastructure automation, ETL, or some other data or batch processing.

Execution environment – Modern workflow engines have mostly converged on either container-native or serverless execution environments. This is done for idempotence and reproducibility, testability, and cost savings. Argo is one of the best examples of a Kubernetes and container-native workflow engine.

AWS Step Functions stitches together AWS Lambda functions (and other AWS services) into a serverless workflow engine.

DAG – Most workflow engines, like Airflow, operate on a static graph. Each job defines its dependencies and downstream tasks.
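
As a rough illustration of the static-graph model, here is a minimal sketch using Python's standard-library `graphlib`. The task names and the `run_order` helper are made up for this example; real engines layer scheduling, state, and retries on top of this ordering step.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dag = {
    "build": set(),
    "lint": set(),
    "test": {"build"},
    "deploy": {"test", "lint"},
}


def run_order(graph):
    """Return one valid execution order for a static task graph.

    Raises graphlib.CycleError if the graph is not actually acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


order = run_order(dag)
print(order)
```

Because the whole graph is known up front, the engine can validate it (for example, reject cycles) before any task runs.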

Another variation on the DAG-as-ground-truth workflow engine is the event-based engine. The DAG is defined implicitly – workflows emit or trigger events that are consumed by certain services. Those services know little about the workflow topology besides the event they are listening for. Brigade is an example of an event-driven workflow engine for Kubernetes.

Configuration – Workflow tasks are defined in a variety of ways. Argo uses Kubernetes resource definitions (YAML). GitHub Actions uses its own YAML definition. Prefect, Airflow, Dagster, Luigi, and other data-centric workflow engines define jobs through a Python API.
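
For a feel of the code-as-configuration style, here is a deliberately tiny sketch. The `task` decorator and `REGISTRY` are invented for this example and do not match the API of Prefect, Airflow, Dagster, or Luigi; each of those has its own decorators and classes.

```python
# Hypothetical registry: task name -> function and declared dependencies.
REGISTRY = {}


def task(*, depends_on=()):
    """Decorator that records a function as a workflow task."""
    def register(fn):
        REGISTRY[fn.__name__] = {"fn": fn, "depends_on": tuple(depends_on)}
        return fn
    return register


@task()
def extract():
    return [1, 2, 3]


@task(depends_on=["extract"])
def transform(rows):
    return [r * 10 for r in rows]


@task(depends_on=["transform"])
def load(rows):
    return sum(rows)


# The dependency graph is recovered from the declarations themselves.
graph = {name: set(spec["depends_on"]) for name, spec in REGISTRY.items()}
```

The trade-off against YAML is the usual one: a Python API gives you loops, functions, and tests for free, while a declarative file is easier to inspect and validate without executing anything.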

Long-running or fault-tolerant workflows – Retry logic is often the hardest part to get right. For many workflows, it doesn't matter: CI/CD workflows that fail are annoying to re-run but never impact the customer directly. Dealing with production-critical workflows is a different story. Temporal solves this problem as the basis of its engine (as do Cadence (Uber) and Conductor (Netflix)).
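
The simplest form of retry logic is a loop with exponential backoff, sketched below. This is not how Temporal, Cadence, or Conductor work internally (they persist workflow state so execution can resume after a crash); `run_with_retries` and `flaky_task` are hypothetical names used only to show the basic idea and why it falls short for long-running work.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is exhausted.

    Waits base_delay * 2**(attempt - 1) seconds between attempts and
    re-raises the last exception if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_task():
    """Fails twice with a transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


# Pass a no-op sleep so the example finishes instantly.
result = run_with_retries(flaky_task, sleep=lambda seconds: None)
```

Note what this sketch does not handle: if the process running the loop dies, the attempt count and any partial progress are lost, and a non-idempotent task may be applied twice. Those gaps are what make retries hard for production-critical workflows.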
