Hello everyone,
I have been trying to solve a strange, consistent daily uptime check failure that I have been seeing for nearly 3 years on my GCP e2-micro VM (2 vCPUs, 1 GB memory). It is a standard VM, not a spot or preemptible one, so preemption is not the issue.
My website is consistently offline for ~16 hours a day, then recovers and works perfectly for the other ~8 hours. What puzzles me is that the failures seem to follow a schedule: I get a failed uptime check alert email every day at ~8:20 AM UTC and an uptime check recovered email every day at about 0:00 UTC. In other words, it recovers by itself at the start of each UTC day, which seems very strange to me.
When I get the failed uptime check alerts, I check my VM instances and see that the VM is in fact still running.
I found this article, which seems very similar to my issue:
I would like some assistance setting up my Cloud Armor Security Policies. The article says to “download all the Uptime Check source IP addresses”, which I have done, but I am not sure how to complete the next step: “configure your Cloud Armor Security Policies to allow these IPs making requests to resources in your project”.
I cannot tell for sure, but I don’t think I even have “Cloud Armor Security Policies that deny specific IP ranges.” I certainly never set up the instance to deny specific ranges.
How do I go about whitelisting these Uptime Check source IP addresses? When I go into Cloud Armor Security Policies, I only see “create policy”, which I take to mean that I do not have any policies currently in place. When I click to create one, it only allows me to input 10 IP addresses, yet the Uptime Check list includes 50+ IPs across different regions of the world. Do I need to create multiple policies, one for each region (USA, South America, Asia Pacific, Europe)?
This is getting very complicated for me to understand. If anyone has experience setting this up, I would really appreciate the help!
I don’t have access to support in my GCP tier, which is why I am asking here.
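To illustrate what I mean by the 10-address limit, here is a rough Python sketch of splitting the downloaded list into groups of 10. It assumes, and I have not confirmed this, that a single policy can hold several allow rules with up to 10 addresses each. The `batch_ips` helper is just a name I made up, and the addresses are placeholders from the 203.0.113.0/24 documentation range, not the real Uptime Check IPs.

```python
def batch_ips(ips, size=10):
    """Split a list of IP strings into consecutive batches of at most `size`."""
    return [ips[i:i + size] for i in range(0, len(ips), size)]


# Placeholder addresses (documentation range), standing in for the
# downloaded Uptime Check source IPs. Assumption: about 50+ entries.
uptime_ips = [f"203.0.113.{n}" for n in range(1, 54)]  # 53 addresses

batches = batch_ips(uptime_ips)
for number, batch in enumerate(batches, start=1):
    # Each batch would correspond to one hypothetical allow rule.
    print(f"rule {number}: {len(batch)} addresses")
```

If that assumption holds, the 53 placeholder addresses would fit in six rules rather than needing a separate policy per region.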
Checking my error logs, I see the following error: “Error response: Guest attributed endpoint access is disabled”
Screenshots of my error logs: 403 error
Screenshot of uptime failures:
Screenshot of quotas and limits:
Screenshot of CPU utilization:
Let me know if you need any additional info!
I will tag a few staff members here, hoping to get a response:
@alexmoore @willie @dchiesa1 @Rhett @DamianS @caryna @reinc @JuatonCJ @greb @ChieKo
Thank you for your help!