We performed a routine database version upgrade from Postgres 15 to Postgres 17. 21 instances upgraded without issues, but one did not.
Instance information:
- db-custom-1-3840
- maintenanceVersion: POSTGRES_15_8.R20240910.01_02
- 10 GB SSD
- HA mode
- database size approx. 4 GB
- located in europe-west3-c
Symptoms:
About 15 minutes after initiating the upgrade, an error was displayed in Operations and logs:

The instance itself shows that it is under maintenance.
Troubleshooting:
- All actions such as restart/patch/failover from the web UI and gcloud are unavailable, which is expected for an instance in maintenance mode.
- Looking at the logs did not reveal any new information; they end at the instance shutdown.
- After waiting a few hours, a backup restore was initiated: a new instance was created in the same region with the same specifications. The process did not finish within two hours, and this time no errors are visible. The new instance is now also stuck in maintenance mode.
- Then a third instance was created, this time in europe-north1-b. Same result as with the previous attempt; it has been stuck for 20 at this time.
Edit: The second restore attempt failed with the same error in Operations and logs after 2 hours.
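For anyone hitting this, the pending operation and any recorded error can still be inspected from the CLI even while the UI actions are locked out. A minimal read-only sketch, assuming a placeholder instance name `my-pg-instance` (substitute your own):

```shell
#!/usr/bin/env bash
# Hypothetical instance name; replace with your own.
INSTANCE="my-pg-instance"

# Skip gracefully when the gcloud CLI is not installed locally.
if command -v gcloud >/dev/null 2>&1; then
  # List the five most recent operations on the instance, including the
  # stuck upgrade, with their status and any error message (read-only).
  gcloud sql operations list --instance="$INSTANCE" --limit=5

  # To block until a specific operation finishes, take the OPERATION_ID
  # from the NAME column above and run:
  # gcloud sql operations wait OPERATION_ID --timeout=unlimited
else
  echo "gcloud CLI not installed; run this from Cloud Shell instead."
fi
```

This surfaces the same information as the Operations and logs page, which can be useful when the web console only shows "under maintenance".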
Hi @a-korolkovs ,
According to the troubleshooting documentation, a Cloud SQL instance can get stuck this way for a few potential reasons. Here are the most common:
- Large Temporary Data Size: The instance can get stuck if there’s a lot of temporary data being created during the upgrade or high query load, especially when it exceeds the available disk space. This can happen if many temporary tables are created at once.
- Fatal Upgrade Error: Sometimes, a fatal error can occur during an upgrade, which leaves the instance stuck in maintenance mode, unable to complete the upgrade process.
- Running Out of Disk Space: If the instance runs out of disk space, especially during an upgrade, it can get stuck on restart.
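To rule the disk-space theory in or out before the next attempt, you can compare the database size against the provisioned disk. A sketch assuming `psql` connectivity and a placeholder instance name `my-pg-instance` (both are assumptions, not values from this thread):

```shell
#!/usr/bin/env bash
# Placeholder connection target; replace with your instance's values.
DBNAME="postgres"

# Standard PostgreSQL query: report the current database size in
# human-readable form.
SIZE_QUERY="SELECT pg_size_pretty(pg_database_size(current_database()));"

if command -v psql >/dev/null 2>&1; then
  psql "dbname=$DBNAME" -c "$SIZE_QUERY"
else
  echo "psql not installed; query to run: $SIZE_QUERY"
fi

if command -v gcloud >/dev/null 2>&1; then
  # Provisioned disk size in GB for the instance (read-only).
  gcloud sql instances describe my-pg-instance \
    --format="value(settings.dataDiskSizeGb)"
fi
```

In this case a ~4 GB database on a 10 GB disk leaves limited headroom for upgrade-time temporary data, so this check is worth doing before retrying.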
You may try the following options to resolve the issue:
- Temporary Tables and Storage: If large temporary tables are causing the issue, one workaround is to create them with ROW_FORMAT=COMPRESSED. This stores the temporary tables in file-per-table tablespaces within the temporary file directory, which can help reduce the load on the instance. However, this may come with a performance tradeoff, as creating and removing these tablespaces can be slower. (Note that ROW_FORMAT=COMPRESSED is a MySQL/InnoDB setting, so it will not apply to a PostgreSQL instance like yours.)
- Restart the Instance: Unfortunately, the only way to shrink the ibtmp1 file (also a MySQL/InnoDB concept) is by restarting the service, which can help clear out excess temporary data that is clogging up the system.
- Automatic Storage Increase: If your instance runs out of storage, and automatic storage increase isn’t enabled, the instance will go offline. To prevent this, you can enable automatic storage increase for future instances. This ensures your instance can scale up automatically when needed, avoiding outages.
- Logs Are Limited: If the logs aren’t providing much insight, it may be time to reach out to Google Cloud Support. They can help force the recreation of the instance if needed.
Hope this helps!
I’m running into this too, on a PostgreSQL 14 to 15 update. Looks like the old instance got shut down, but the new one didn’t start, so we’re just at 1 hour 28 minutes and counting (“a few minutes”?).
Many of the suggestions - like restarting the instance - aren’t available while the instance is in this state. There’s no useful log information for where/why. Feels like something in the backend should detect this situation (that no progress is being made) and handle it.