Apigee hybrid runtime server_nio exposed metrics

We are using Prometheus to scrape the runtime metrics. Some of them, such as server_nio, have interesting labels, and I’d like to know what they mean. Is there any documentation that explains these metrics in more detail?

Some of these metrics are used by Apigee itself to scale the runtime, as described in the doc: Scale and autoscale runtime services  |  Apigee  |  Google Cloud

What I want to understand is:

  • What do these metrics actually mean?

  • What factors are affected by them? For example, target performance, proxy performance, etc.

  • How can we tune and test the HPA behaviour based on these metrics?

The metrics are:

server_fault_count{source="apigee_errors"}
server_fault_count{source="policy_errors"}
server_fault_count{source="target_errors"}
server_heap{state="committed"}
server_heap{state="init"}
server_heap{state="max"}
server_heap{state="p_used"}
server_heap{state="used"}
server_nio{state="accepted_total"}
server_nio{state="accepted"}
server_nio{state="AX_FAILED_COUNT"}
server_nio{state="AX_SUCCESS_COUNT"}
server_nio{state="close_failed"}
server_nio{state="close_success"}
server_nio{state="conn_pending"}
server_nio{state="connected_total"}
server_nio{state="connected"}
server_nio{state="heap_committed"}
server_nio{state="heap_init"}
server_nio{state="heap_max"}
server_nio{state="heap_usage"}
server_nio{state="main_task_queue_depth"}
server_nio{state="main_task_wait_time"}
server_nio{state="max_conn"}
server_nio{state="max_mc_queue_size"}
server_nio{state="mc_queue_size"}
server_nio{state="MINT_FAILED_COUNT"}
server_nio{state="MINT_SUCCESS_COUNT"}
server_nio{state="netty_task_wait_time"}
server_nio{state="nio_task_queue_depth"}
server_nio{state="nio_task_wait_time"}
server_nio{state="non_heap_committed"}
server_nio{state="non_heap_init"}
server_nio{state="non_heap_max"}
server_nio{state="non_heap_usage"}
server_nio{state="pool_size"}
server_nio{state="servers"}
server_nio{state="timeouts"}
server_nio{state="TRACE_FAILED_COUNT"}
server_nio{state="TRACE_SUCCESS_COUNT"}
server_num_threads{}
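As a starting point, the exported label values and the fault counters can be inspected directly in Prometheus, for example (a minimal PromQL sketch; it assumes the metrics are scraped unmodified and that the scrape config attaches a pod label):

# list every state label value currently exported for server_nio
count by (state) (server_nio)

# per-pod fault rate over the last 5 minutes, split by source
sum by (pod, source) (rate(server_fault_count[5m]))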


Hi, we’ve noticed your question hasn’t been answered yet, but we’ll keep it on our radar and ask others in the community to chime in.

In the coming weeks, we’ll be hosting two live Q&A sessions where Google experts will answer pre-selected forum questions. Feel free to sign up and join us on November 20 or December 4 for a 30-minute Apigee Q&A session. RSVP here!


Hi Andre,

See here for more details on advanced metrics-based autoscaling configuration: Scale and autoscale runtime services  |  Apigee  |  Google Cloud Documentation

The full list of details is only shared under NDA with customers and partners. If you reach out to your contact, they will be able to provide the documentation you are after, with the full details and descriptions of the metrics. Alternatively, send me a private message.

Kindly,
Nicola

Hey,

Hope you’re keeping well.

Those server_nio metrics come directly from the Netty-based NIO layer that Apigee hybrid runtime uses for request handling. They represent internal event-loop states and thread-pool activity within the Message Processor JVM. Metrics like accepted, connected, and timeouts track socket-level I/O operations, while those with heap_* and non_heap_* describe JVM memory regions used by that process. Counters such as AX_SUCCESS_COUNT and MINT_SUCCESS_COUNT map to successful proxy request executions within the Apigee transport layer; their “FAILED” counterparts count exceptions or dropped transactions before policy execution.
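To relate those states to observed traffic, it helps to graph the connection gauges and the success/failure counters side by side, for example (a rough PromQL sketch; it assumes the *_COUNT states are exported as monotonically increasing counters and that your scrape config attaches a pod label):

# open vs. pending connections per runtime pod
sum by (pod) (server_nio{state="connected"})
sum by (pod) (server_nio{state="conn_pending"})

# share of failed AX operations over the last 5 minutes
sum(rate(server_nio{state="AX_FAILED_COUNT"}[5m]))
  / (sum(rate(server_nio{state="AX_SUCCESS_COUNT"}[5m])) + sum(rate(server_nio{state="AX_FAILED_COUNT"}[5m])))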

These metrics primarily affect runtime scaling because the Horizontal Pod Autoscaler evaluates CPU, memory, and connection backlog signals derived from them. You can tune HPA behavior by adjusting the target utilization thresholds in your apigee-hybrid-values.yaml under runtime.scaling and then simulate traffic through a controlled load test to observe changes in server_nio{state="main_task_queue_depth"} and server_nio{state="nio_task_wait_time"}; those are strong indicators of thread saturation. The “Scale and autoscale runtime services” documentation defines which metrics are sampled for autoscaling decisions; others are best interpreted as JVM and I/O telemetry for performance debugging rather than direct scaling triggers.
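For the load-test side of that, queries like the following can be watched while traffic ramps up and compared against the HPA’s scaling events (again only a sketch; the pod label comes from your scrape config, the last query assumes kube-state-metrics is installed, and the HPA name pattern is a guess that will differ per installation):

# queue depth and task wait time per runtime pod under load
max by (pod) (server_nio{state="main_task_queue_depth"})
avg by (pod) (server_nio{state="nio_task_wait_time"})

# replica count the HPA actually settles on (kube-state-metrics)
kube_horizontalpodautoscaler_status_current_replicas{horizontalpodautoscaler=~"apigee-runtime.*"}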

Thanks and regards,

Taz