![]() ![]() Highly inspired by the great work puckel/docker-airflow.for manual creating Kubernetes services and deployments to run Airflow on Kubernetes.Dockerfile(.template) of airflow for Docker images published to the public Docker Hub Registry.See this SO question for gotchas) consumes all the resources a single airflow-scheduler pod provides, which will be after the pretty long time. It will work until the concurrent kubectl run executions(up to the concurrency implied by scheduler's max_threads and LocalExecutor's parallelism. Just keep using LocalExecutor instead of CeleryExecutor and delegate actual tasks to Kubernetes by calling e.g. If you have ever considered to avoid Celery for task parallelism, yes, K8S can still help you for a while.If you already had a K8S cluster, just let K8S manage them for you. This is where Kubernetes comes into play, again. However, managing a H/A backend database and Celery workers just for parallelising task executions sounds like a hassle. ![]() The common way to scale out workers in Airflow is to utilize Celery.It is possible but can't we just utilize a kind of cluster manager here? This is where Kubernetes comes into play. Someone in the internet tried to implement a wrapper to implement leader election on top of the scheduler so that only one scheduler executes the tasks at a time. Running multiple schedulers for high availability isn't safe so it isn't the way to go in the first place.Easy high availability of the Airflow scheduler.Kube-airflow provides a set of tools to run Airflow in a Kubernetes cluster. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |