7/23/2023

Astronomer kubernetes

The Kubernetes project has participants from all around the globe. Some are friends, some are colleagues, and some are strangers. The one thing that unifies them, no matter their differences, is that they all have an interesting story. It is my pleasure to be the documentarian for the stories of the Kubernetes community in the weekly Kubernetes Podcast from Google. With every new Kubernetes release comes an interview with the release team lead, telling the story of that release, but also their own personal story.

With 1.25 around the corner, the tradition continues with a look back at the story of 1.24. That release was led by James Laverack of Jetstack. James was on the podcast in May, and while you can read his story below, if you can, please do listen to it in his own voice.

Make sure you subscribe, wherever you get your podcasts, so you hear all our stories from the cloud native community, including the story of 1.25 next week.

This transcript has been lightly edited and condensed for clarity.

CRAIG BOX: Your journey to Kubernetes went through the financial technology (fintech) industry. Tell me a little bit about how you came to software?

JAMES LAVERACK: I took a pretty traditional path to software engineering. I went through school and then I did a computer science degree at the University of Bristol, and then I just ended up taking a software engineer job from there.

Hi, if you're not seeing logs being exposed in the Airflow UI, it's likely tied to this Airflow bug. We're seeing it pop up more and filed that JIRA ticket directly in the OSS project.

Currently, Airflow commits a hostname to the backend db after the task completes, not before or during. If there is a hard crash on the task mid-process or it's otherwise interrupted, the database never gets the hostname and you can't fetch the logs. In other words, the task didn't "finish" failing/succeeding/executing at all - it crashed halfway. The JIRA issue above essentially asks for that process to happen upfront, so that you can see those logs if they exist even if the task crashes or is otherwise interrupted.

Things to Try

- Can you try re-running that task by going into the corresponding Task Instance from Airflow and hitting "Clear"? Do you have retries/retry_delays configured?
- That task may be crashing due to too much memory - is that job pulling a lot of data into memory? Have you tried raising the AUs allocated to your Scheduler?
- Do you have the "Worker Termination Grace Period" configured? (Celery only)
  - Workers restart after each astro airflow deploy to make sure they're looking at the most up-to-date code.
  - To minimize disruption to tasks running at the time you deploy, you can leverage the worker termination grace period.
  - If configured, the Celery worker will wait x minutes before restarting if it is in the middle of executing a task.
  - Our default grace period is currently 10 mins, but that's something you can adjust freely in the Astronomer UI (Deployments > Configure > Worker Termination Grace Period).
- Change the log_fetch_timeout_sec to something more than 5 seconds (default).
- Exec into the corresponding Celery worker to look for the log files there (Enterprise only).
  - You'll need to be set up with Kubectl, but you can run: kubectl exec -it bash.
- If this is a sensor, you could set mode=reschedule (Airflow v1.10.2+).
  - If you're using Airflow 1.10.1 or a prior version, sensors run continuously and occupy a task slot in perpetuity until they find what they're looking for, so they have a tendency to cause concurrency issues.
  - If you have more sensors than worker slots, the sensor will now get thrown into a new up_for_reschedule state, thus unblocking a worker slot.

Kubernetes Executor (coming soon to Astronomer Cloud)

- The KubernetesExecutor will gracefully handle this process.

Worker/Scheduler/Webserver Logging (coming soon to Astronomer Cloud)

- Already released for Astronomer Enterprise, Worker/Scheduler/Webserver logs will all be exposed in real-time within the Astronomer UI.
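For the retries question under Things to Try, here is a minimal sketch of what a retry configuration might look like. `retries` and `retry_delay` are standard Airflow task arguments; the specific values and the `owner` name are assumptions chosen for illustration, not recommendations from the post.

```python
from datetime import timedelta

# Illustrative values only; tune them for your own workload.
default_args = {
    "owner": "data-team",                 # hypothetical owner name
    "retries": 2,                         # re-run a failed task up to 2 more times
    "retry_delay": timedelta(minutes=5),  # wait 5 minutes between attempts
}

# In a DAG file this dictionary would typically be handed to the DAG,
# e.g. DAG("example_dag", default_args=default_args, ...), so that every
# task in the DAG inherits the retry settings.
```

With retries in place, a task that crashes before its logs can be fetched gets another attempt without anyone having to hit "Clear" by hand.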
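The log_fetch_timeout_sec suggestion can also be applied through an environment variable rather than by editing airflow.cfg. The sketch below assumes the option sits in the [webserver] section and relies on Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention for config overrides. The helper name `airflow_env_var` and the value of 15 seconds are hypothetical, picked only to show the shape of the setting.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# log_fetch_timeout_sec is assumed here to live in the [webserver] section.
name = airflow_env_var("webserver", "log_fetch_timeout_sec")
value = "15"  # assumed value in seconds; anything above the default of 5

print(f"{name}={value}")
# prints: AIRFLOW__WEBSERVER__LOG_FETCH_TIMEOUT_SEC=15
```

How the variable gets into the deployment (Dockerfile, UI, or otherwise) depends on your setup and is not covered by the original post.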
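To make the sensor point under Things to Try concrete, the toy model below counts how many worker slots waiting sensors hold between two pokes in each mode. It is not Airflow code: the function name `slots_held_between_pokes` and the numbers are hypothetical, and only `mode` and `poke_interval` correspond to real sensor arguments.

```python
def slots_held_between_pokes(num_sensors: int, worker_slots: int, mode: str) -> int:
    """Toy model: worker slots occupied by waiting sensors between two pokes."""
    if mode == "poke":
        # A poke-mode sensor sleeps inside its slot, so it never lets go of it.
        return min(num_sensors, worker_slots)
    if mode == "reschedule":
        # A reschedule-mode sensor sits in up_for_reschedule between pokes
        # and holds no slot until its next check is due.
        return 0
    raise ValueError(f"unknown sensor mode: {mode!r}")


worker_slots = 16  # assumed worker capacity
num_sensors = 20   # more sensors than slots, as in the scenario described above

for mode in ("poke", "reschedule"):
    held = slots_held_between_pokes(num_sensors, worker_slots, mode)
    print(f"{mode}: {worker_slots - held} slots free for other tasks")

# Keyword arguments one might pass to a sensor on Airflow 1.10.2+ (values assumed):
sensor_kwargs = {"mode": "reschedule", "poke_interval": 300}
```

In this model, poke mode leaves no slots free once sensors outnumber slots, while reschedule mode leaves all of them free between checks. A longer `poke_interval` generally suits reschedule mode, since each check has to be scheduled again.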