I have a daily ingest pipeline running in production on Cloud Composer, and the code has been unchanged for a while. One day an error was raised from database_health.py, which is
After that, in Logs Explorer there are errors from “airflow-scheduler” and “airflow-worker” with the same error text as above, such as “server closed connection unexpectedly”.
The message “server closed the connection unexpectedly” means that a connection to the Airflow database was dropped by the server side. This can occur in Cloud Composer and can be caused by several factors, including:
A network issue between the Airflow database and the Airflow scheduler or worker.
A problem with the Airflow database itself.
A problem with the Airflow scheduler or worker.
To troubleshoot this issue, you can:
Check the network connection between the Airflow database and the Airflow scheduler and worker. Ensure they can communicate.
Review the Airflow database logs for errors and address them.
Restart the Airflow scheduler and worker.
If the issue persists, consider restarting the Airflow database.
If unresolved, contact Google Cloud support.
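The first troubleshooting step (checking that the scheduler and workers can reach the database) can be sketched as a plain TCP probe with a few retries. This is a minimal sketch using only the Python standard library; the function name, the host and port in the usage comment, and the retry values are assumptions for illustration, and a successful TCP handshake only shows that the network path is open, not that the database accepts queries.

```python
import socket
import time


def database_reachable(host, port, attempts=3, timeout=3.0, delay=1.0):
    """Return True if a TCP connection to host:port succeeds within `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            # create_connection performs the TCP handshake and raises OSError on failure.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError as exc:
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                # Pause briefly so a short network blip does not count as an outage.
                time.sleep(delay)
    return False


# Hypothetical usage; replace the host and port with the endpoint your environment uses:
# database_reachable("127.0.0.1", 5432)
```

Running a probe like this from the same network location as the scheduler or worker helps separate a network problem from a problem with the database itself.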
Impact on Production Pipelines: If this error occurs in production, running Airflow pipelines might be interrupted. You may need to manually restart the affected pipelines once the issue is resolved.
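To reduce how often a manual restart is needed, tasks can be given automatic retries. The sketch below uses standard Airflow task arguments (retries, retry_delay, retry_exponential_backoff, max_retry_delay); the values are illustrative assumptions, not recommendations. Task-level retries only cover transient failures and will not help while the scheduler itself cannot reach the database.

```python
from datetime import timedelta

# Hypothetical retry settings; tune the values to your own pipeline.
default_args = {
    "retries": 3,                              # retry a failed task up to 3 times
    "retry_delay": timedelta(minutes=5),       # wait before the first retry
    "retry_exponential_backoff": True,         # lengthen the wait on each retry
    "max_retry_delay": timedelta(minutes=30),  # cap on the wait between retries
}

# In a DAG file this dict would be passed as DAG(..., default_args=default_args),
# so that every task in the DAG inherits these retry settings.
```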
Prevention: To reduce the chance of recurrence:
Monitor the network connection between the Airflow database and the scheduler/worker.
Regularly check the Airflow database logs for errors.
Keep the Airflow components updated with the latest patches.
Have a contingency plan, including manual pipeline restarts.
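The log checks above can be partly automated by counting how often the error text appears per component. This is a minimal sketch; the (component, message) input shape and the sample entries are made up for illustration and are not the Cloud Logging export format.

```python
import re
from collections import Counter

# Matches both wordings seen in this thread: with and without "the".
PATTERN = re.compile(r"server closed (the )?connection unexpectedly", re.IGNORECASE)


def count_connection_drops(entries):
    """Count matching log entries per component.

    `entries` is an iterable of (component, message) pairs (an assumed shape).
    """
    counts = Counter()
    for component, message in entries:
        if PATTERN.search(message):
            counts[component] += 1
    return counts


# Made-up sample entries for illustration only.
sample = [
    ("airflow-scheduler", "server closed the connection unexpectedly"),
    ("airflow-worker", "server closed connection unexpectedly"),
    ("airflow-worker", "task finished successfully"),
    ("airflow-worker", "Server closed the connection unexpectedly"),
]

print(dict(count_connection_drops(sample)))
# prints {'airflow-scheduler': 1, 'airflow-worker': 2}
```

A rising count for one component but not the other can hint at whether the problem is on the database side or specific to the scheduler or a worker.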
Additional Tips:
Cloud Composer keeps the Airflow database in Cloud SQL. Consider adjusting the connection pool size, but first make sure the database can handle the resulting number of connections.
Distributing tasks across multiple Airflow workers can help spread the load and keep any single worker from being overloaded.
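For the pool-size tip, a rough way to check whether the database can handle the connections is to estimate an upper bound before changing anything. The sketch below assumes each process holds its own SQLAlchemy-style pool, which can open at most pool_size + max_overflow connections; the function names, process count, database limit, and reserved headroom are hypothetical. Airflow exposes the pool settings as sql_alchemy_pool_size and sql_alchemy_max_overflow, but whether they can be overridden in your Composer version is something to verify.

```python
def max_connections_needed(processes, pool_size, max_overflow):
    """Upper bound on open connections if every process fills its own pool."""
    return processes * (pool_size + max_overflow)


def fits_database_limit(processes, pool_size, max_overflow,
                        db_max_connections, reserved=0):
    """True if the upper bound fits under the database limit minus reserved headroom."""
    needed = max_connections_needed(processes, pool_size, max_overflow)
    return needed <= db_max_connections - reserved


# Hypothetical numbers: 8 processes, pool of 5 with overflow of 10, limit of 100.
print(max_connections_needed(8, 5, 10))                # prints 120
print(fits_database_limit(8, 5, 10, 100))              # prints False
print(fits_database_limit(8, 5, 5, 100, reserved=10))  # prints True
```

This is only an upper bound, since pools rarely fill completely, but it shows why raising the pool size and adding workers at the same time can exhaust the database's connection limit.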