Unable to see running processes in Observability tab

Issue:

I updated the Google Cloud Ops Agent to version 2.48.0-1.el7 on my CentOS 7 VMs. After the update, process monitoring stopped working on some of these VMs: the Ops Agent no longer collects or displays process data for them, while it continues to function correctly on the others.

Troubleshooting Steps Taken:

I have already tried the following to resolve the issue:

  • Updated and downgraded the Ops Agent: I tested different versions of the agent to see whether the problem was specific to 2.48.0-1.el7.
  • Verified processes are running: I used the ps -aux command to confirm that processes are actively running on the affected VMs.
  • Reinstalled the Ops Agent: I completely removed and reinstalled the agent to rule out a corrupted installation.
  • Rebooted the VMs: I restarted the VMs to rule out any temporary system issues.

Error Messages:

I am not seeing any error messages in the Ops Agent logs or the system logs, which makes the problem difficult to diagnose.

Hi @jatinbishtg4s,

Welcome to Google Cloud Community!

I understand you have done your troubleshooting on your end; nonetheless, please review these documents for further investigation:

Is the agent sending logs to Cloud Logging?

  • If the agent is running but not sending logs, then check the status of the agent’s runtime health checks.

Agent is logging ‘metrics receiver with type “nvml” is not supported’

  • You see the error message ‘metrics receiver with type “nvml” is not supported’ after installing Ops Agent version 2.38.0 or newer if you were using the preview nvml receiver and had overridden the default collection interval in your user-specified configuration file. The error occurs because the nvml receiver no longer exists but your user-specified configuration file still refers to it.

To correct this problem, update your user-specified configuration file to override the collection interval on the hostmetrics receiver instead.
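As a rough illustration of that override, the sketch below writes a candidate metrics section to a scratch file so it can be reviewed before anything on the VM changes. The 30s interval and the `scratch` variable are assumptions for illustration only. The default Linux config path, /etc/google-cloud-ops-agent/config.yaml, is also assumed; none of these values come from this thread.

```shell
# Write a candidate hostmetrics override to a scratch file for review.
# Assumption: 30s is only an example; use the interval you had set on nvml.
scratch="$(mktemp)"
cat > "$scratch" <<'EOF'
metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 30s
EOF

# Show the candidate configuration.
cat "$scratch"

# After review, merge these keys into the user config by hand
# (assumed path: /etc/google-cloud-ops-agent/config.yaml), remove any
# remaining nvml receiver entries, and restart the agent:
#   sudo systemctl restart google-cloud-ops-agent
```

Writing to a scratch file first keeps the live configuration untouched until the change has been checked.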

Status: FAIL

  • If the Ops Agent is not sending both logs and metrics from the VM, then you see a status description like the following:
    Agent is installed, but it’s failing to send both logs and metrics to Google Cloud.
    Is Ops Agent sending logs? (Yes)
    Is Ops Agent sending metrics? (No)

If the Ops Agent is not sending logs or metrics from the VM, then use the agent health checks for start-time errors to determine and correct the problem.
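One way to look at the start-time health checks from the VM itself is sketched below. It assumes the default Linux log location, /var/log/google-cloud-ops-agent/health-checks.log, and assumes that failing checks contain the text FAIL. The function name `show_health_checks` is hypothetical. Reading the log may require sudo.

```shell
# Hypothetical helper: print failing health-check lines from a log file.
# Assumption: default log path; pass another path as the first argument.
show_health_checks() {
  log_file="${1:-/var/log/google-cloud-ops-agent/health-checks.log}"
  if [ ! -r "$log_file" ]; then
    echo "health-check log not readable: $log_file" >&2
    return 1
  fi
  # Assumption: failing checks are reported with the text FAIL.
  if ! grep -F 'FAIL' "$log_file"; then
    echo "no FAIL entries in $log_file"
  fi
}

# Run against the default location; ignore a missing log on non-agent hosts.
show_health_checks || true
```

Running the same command on a working VM and an affected VM makes it easier to spot which check differs.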

Upgrade the agent

  • Note: If you upgraded your instance’s Linux operating system to a new major release, then you should first remove the agent and then re-install it using the procedures on this page, instead of completing these upgrade procedures.

The update may have changed the agent’s configuration files, which could be affecting process monitoring on certain VMs. Check the Ops Agent configuration file on an affected VM and verify that it includes the settings needed for process monitoring.
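A quick first pass at that check could look like the sketch below. It assumes the default user config path, /etc/google-cloud-ops-agent/config.yaml, and assumes that process metrics appear under the agent.googleapis.com/processes prefix, so a filter that drops them would mention that prefix. The function name `check_process_metrics` is hypothetical, and a match only marks the file for manual review.

```shell
# Hypothetical helper: flag a config file that mentions process metrics.
check_process_metrics() {
  config_file="$1"
  if [ ! -f "$config_file" ]; then
    echo "no user config found; built-in defaults apply"
    return 0
  fi
  # Assumption: an exclusion of process metrics would name this prefix.
  if grep -qF 'agent.googleapis.com/processes' "$config_file"; then
    echo "config references process metrics; review any exclude_metrics patterns"
    return 1
  fi
  echo "config does not mention process metrics"
}

# Check the assumed default location; set OPS_AGENT_CONFIG to override.
check_process_metrics "${OPS_AGENT_CONFIG:-/etc/google-cloud-ops-agent/config.yaml}" || true
```

Comparing the result, and the file itself, between an affected VM and a working one should show whether the configurations have diverged.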

Other necessary documentation: Configure the Ops Agent

You may also reach out to Google Cloud Support for more detailed insights and assistance.

I hope the information above is helpful.