
> People use the big data frameworks as glorified distributed-job management tools

Do you have any tools you like for job management without all the distributed-systems baggage?

I've heard folks advocate for Make for this kind of thing. Perhaps that, or some other orchestration tool that deals with job dependency graphs, would be the Unix way? (Having a nice way to visualize failed steps would of course be a plus; a common use case is "re-run the intermediate pipeline, and everything downstream".)
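To make the "re-run a step and everything downstream" use case concrete, here is a minimal Python sketch of the idea, not how Make or any particular orchestrator implements it. The step names, the PIPELINE mapping, and the rerun_plan helper are all hypothetical; the only library used is the standard-library graphlib module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE = {
    "extract": set(),
    "clean": {"extract"},
    "features": {"clean"},
    "report": {"clean"},
    "model": {"features"},
}


def rerun_plan(deps, changed):
    """Return `changed` plus every step downstream of it, in a valid run order."""
    # Invert the dependency map: step -> steps that consume its output.
    consumers = {step: set() for step in deps}
    for step, parents in deps.items():
        for parent in parents:
            consumers.setdefault(parent, set()).add(step)

    # Walk forward from the changed step to collect everything affected.
    affected, stack = set(), [changed]
    while stack:
        step = stack.pop()
        if step not in affected:
            affected.add(step)
            stack.extend(consumers.get(step, ()))

    # Filter a full topological order down to the affected steps only,
    # so dependencies still run before the steps that need them.
    order = TopologicalSorter(deps).static_order()
    return [step for step in order if step in affected]


if __name__ == "__main__":
    # Re-running "clean" should pull in features, report, and model,
    # but leave "extract" alone.
    print(rerun_plan(PIPELINE, "clean"))
```

Make gets a similar effect from file timestamps: a target is rebuilt when any of its prerequisites is newer than it, so touching an intermediate file causes everything that depends on it to rebuild on the next run.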



There's a bunch, at various levels of abstraction and with slightly different primary use cases: Luigi, Dask, Airflow, Celery, Dagster, Prefect, Metaflow, Snakemake, Nextflow, etc.


Have a look at Airflow.

However, so far I haven't switched from Rundeck and Make.


Airflow is really limiting in some non-obvious ways: https://medium.com/the-prefect-blog/why-not-airflow-4cfa4232...



