Hello,
I’m seeking help with a persistent database connection issue between a Cloud Run service and a Cloud SQL for PostgreSQL instance that I have been unable to solve after extensive troubleshooting.
The Problem
My Python/Flask application, deployed on Cloud Run, fails to connect to my Cloud SQL for PostgreSQL database.
- The application log shows the connection fails with: (psycopg2.OperationalError) connection to server on socket … failed: FATAL: password authentication failed for user "cloudrunuser".
- The Cloud SQL database log shows the corresponding rejection: FATAL: password authentication failed for user "cloudrunuser".
Troubleshooting Steps Performed
I have worked through the likely causes one by one and confirmed the following:
- Correct User Type: The cloudrunuser is a standard SQL user with a password, not an IAM database user.
- Correct Permissions: The cloudrunuser has been made the owner of all tables in the database using REASSIGN OWNED and can successfully see all tables during a manual login.
- Correct Connection Method: The application correctly uses the secure Unix socket for the connection, which is confirmed by host=[local] in the database logs.
- Clean Builds: Every deployment is built using gcloud builds submit --no-cache to ensure the latest code is always used.
- Removed Secret Manager Override: We discovered an early issue where the deployment was configured to use a password from Secret Manager. We have since deployed multiple times using the --remove-secrets=DB_PASS flag to ensure this override is no longer active.
- Application-Level Debugging: I have added a diagnostic test directly into my application's startup code (app/__init__.py) to verify the credentials it's using.
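A sketch of a fuller version of that diagnostic, which I can add if it would help. It is not my exact code: describe_secret is a hypothetical helper name, and reading the password from a DB_PASS environment variable is an assumption for illustration. Beyond the prefix, it reports the length and whether there is stray leading or trailing whitespace (for example a newline), which a prefix check alone would not reveal.

```python
import os


def describe_secret(value: str, prefix_len: int = 4) -> dict:
    """Summarize a credential for logging without printing it in full."""
    return {
        # First few characters only, enough to recognize the value.
        "prefix": value[:prefix_len],
        # Total length, so an extra trailing character shows up.
        "length": len(value),
        # True if there is leading/trailing whitespace such as '\n' or ' '.
        "has_outer_whitespace": value != value.strip(),
        # False if the value contains non-ASCII characters (e.g. smart quotes).
        "is_ascii": value.isascii(),
    }


if __name__ == "__main__":
    # Assumption: the password is supplied via the DB_PASS environment variable.
    print(describe_secret(os.environ.get("DB_PASS", "")))
```

Comparing the reported length with the length of the password I type into psql would show whether the two values are really identical.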
The Core Contradiction
This is the part I cannot solve. The logs from my most recent deployment show a direct contradiction:
- Fact 1: My app log proves it has the correct password. My diagnostic code prints the password it is about to use, and the log confirms it is correct (e.g., Password starts with: 'XXXX').
- Fact 2: My app log proves the connection is rejected. My diagnostic code then catches the database error and prints my custom message: --- DATABASE CONNECTION FAILED: PASSWORD REJECTED ---
- Fact 3: My manual test with the exact same password succeeds. When I log in to the database manually from Cloud Shell (psql) as cloudrunuser and type the exact same password that is in my code, I can connect successfully and see all my tables.
My question is: How is this possible? How can the database reject a password from the application, when the application log shows it is using the correct password, and a manual test with that same password works?
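One difference I have not yet ruled out: a password typed at the psql prompt is sent as-is, whereas a password embedded in a connection URL must be percent-encoded, or characters such as '@', '/', '#' or '%' can be misread as URL delimiters. Below is a minimal sketch of how I understand the encoding should work, assuming the URL is assembled as a string. The build_db_url helper and all values are hypothetical placeholders, not my real configuration.

```python
from urllib.parse import quote_plus


def build_db_url(user: str, password: str, db_name: str, socket_dir: str) -> str:
    """Build a SQLAlchemy-style PostgreSQL URL that connects over a Unix socket."""
    # Percent-encode the credentials so special characters in them are
    # not interpreted as part of the URL structure.
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@/{db_name}?host={socket_dir}"
    )


if __name__ == "__main__":
    # Placeholder values only; the socket directory follows the
    # /cloudsql/PROJECT:REGION:INSTANCE pattern used on Cloud Run.
    print(
        build_db_url(
            "cloudrunuser",
            "p@ss/w#rd%1",
            "mydb",
            "/cloudsql/PROJECT:REGION:INSTANCE",
        )
    )
```

If the password contains none of these characters, this would not explain the failure, but it is one way the same password could reach the server in a different form.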
Any insight into what could cause this discrepancy between a manual psql connection and a psycopg2 connection from within the Cloud Run environment would be greatly appreciated.
Thank you.