Part-19: GCE Ops Agent: Logging & Monitoring in Google Cloud Platform (GCP)

When running workloads on Google Compute Engine (GCE), monitoring and logging are critical to keeping your systems healthy and your applications reliable. Google now recommends using the Ops Agent, a modern, unified solution for collecting logs, metrics, and traces from your VMs.

Let's break it down. 👇

Why Ops Agent?

Google's legacy Logging and Monitoring agents are still around, but they come with limitations:

  • โŒ No new feature development
  • โŒ No support for newer OS versions
  • โš ๏ธ Maintenance-only mode

That's why Ops Agent is the recommended choice for all new workloads. If you're still running the old agents, it's time to migrate.

What is Ops Agent?

Ops Agent is a single agent that runs on Compute Engine VMs to:

  • 📜 Collect logs → send to Cloud Logging
  • 📊 Collect metrics & traces → send to Cloud Monitoring and Cloud Trace

Under the hood, it:

  • 🛠 Uses Fluent Bit for logs
  • 🛠 Uses OpenTelemetry Collector for metrics & traces

It's designed for both Linux and Windows VMs, with flexible installation options.

Key Features

🔧 Installation & Management

You can deploy Ops Agent in multiple ways:

  • Auto-install during VM creation
  • Fleet installation using gcloud or automation tools such as Ansible, Chef, Puppet, or Terraform
  • Agent policies via CLI
  • Manual install on individual VMs

๐Ÿ“ YAML-based Configuration

  • Simple and flexible config files
  • Easy customization for log collection, parsing, and filtering
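
To make the configuration shape concrete, here is a minimal sketch that models a logging configuration as a Python dictionary and renders it as YAML. The `files` receiver type, the `parse_json` processor type, and the `service.pipelines` layout are taken from the Ops Agent configuration schema, but treat them as something to verify against the official documentation. The IDs `my_app_logs`, `parse_app_json`, and `my_app_pipeline`, the log path, and the small `to_yaml` helper are hypothetical and exist only for illustration.

```python
# Hypothetical Ops Agent logging config, modeled as nested dicts.
# The receiver/processor/pipeline IDs and the log path are made up.
CONFIG = {
    "logging": {
        "receivers": {
            "my_app_logs": {
                "type": "files",
                "include_paths": ["/var/log/my-app/*.log"],
            },
        },
        "processors": {
            "parse_app_json": {"type": "parse_json"},
        },
        "service": {
            "pipelines": {
                "my_app_pipeline": {
                    "receivers": ["my_app_logs"],
                    "processors": ["parse_app_json"],
                },
            },
        },
    },
}


def to_yaml(node, indent=0):
    """Render nested dicts and lists of plain strings as block-style YAML lines.

    Deliberately tiny: no quoting or escaping, so it only suits simple values.
    """
    pad = "  " * indent
    lines = []
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1))
        elif isinstance(value, list):
            lines.append(f"{pad}{key}:")
            lines.extend(f"{pad}  - {item}" for item in value)
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


yaml_text = "\n".join(to_yaml(CONFIG))
print(yaml_text)
```

On Linux, the agent reads user configuration from /etc/google-cloud-ops-agent/config.yaml and needs a restart to pick up changes, so output like this would be reviewed and written there rather than applied blindly.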

Logging Features

🚀 Better performance than the legacy logging agent

📂 Collects logs from:

  • System logs (/var/log/syslog, /var/log/messages)
  • File-based logs (customizable paths)
  • TCP protocol streams
  • Forward protocol (Fluent Bit/Fluentd)

🛠 Flexible processing:

  • Parse unstructured logs into structured JSON
  • Regex-based parsing
  • Exclude logs with labels/regex
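
The processing steps above can be sketched in plain Python. This is not how Ops Agent implements them (it delegates log handling to Fluent Bit); it only illustrates the idea of turning an unstructured line into structured JSON with a regex and dropping entries that match an exclusion pattern. The log format, the field names, and the `parse_line` and `keep` helpers are hypothetical.

```python
import json
import re

# Hypothetical log format: "<date> <time> <SEVERITY> <message>"
LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<severity>[A-Z]+) "
    r"(?P<message>.*)$"
)

# Entries whose message matches this pattern are dropped.
EXCLUDE_RE = re.compile(r"health[- ]?check", re.IGNORECASE)


def parse_line(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def keep(entry):
    """Keep an entry unless its message matches the exclusion pattern."""
    return EXCLUDE_RE.search(entry["message"]) is None


raw_lines = [
    "2024-05-01 12:00:00 INFO user login succeeded",
    "2024-05-01 12:00:05 DEBUG health-check ok",
    "not a structured line",
]

# Step 1: regex parsing; lines that do not match the format are skipped here.
parsed = [entry for entry in map(parse_line, raw_lines) if entry is not None]
# Step 2: exclusion by regex on the parsed message field.
structured = [entry for entry in parsed if keep(entry)]

for entry in structured:
    print(json.dumps(entry))
```

In an Ops Agent config, the corresponding building blocks are the `parse_regex` and `exclude_logs` processors, wired into a logging pipeline.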

🔌 Third-party app support: Apache Kafka, Nginx, Hadoop, MongoDB, MySQL, Redis, Oracle DB, SAP HANA, and more.

The full list of supported applications is in the Ops Agent documentation.

Monitoring Features

📊 System metrics out of the box:

  • CPU, disk, memory, processes, networking, swap
  • GPU (Linux)
  • IIS, MSSQL, Pagefile (Windows)

🔌 Third-party app integrations (Kafka, Nginx, MariaDB, MongoDB, Redis, WildFly, etc.)

📡 Prometheus metrics collection for apps running on Compute Engine

🎮 NVIDIA GPU monitoring with DCGM integration

Final Thoughts

If you're running workloads on GCE, adopting Ops Agent is a no-brainer:

✅ One agent for both logs & metrics
✅ Actively developed and future-proof
✅ Better performance & third-party support
✅ Flexible deployment at scale

Google has made it clear: transition your workloads to Ops Agent now and unlock better observability for your infrastructure.

👉 Have you already migrated from the legacy agents? What has your experience with Ops Agent been so far?
