Google Cloud Guidance Report

Note

The Cost and Security recommendations in Site24x7 Guidance Report will be available only in ManageEngine CloudSpend as Recommendation Reports. If you use both Site24x7 and CloudSpend, you can continue to get these recommendations from CloudSpend > Reports > Recommendations Reports.

The Recommendations Report in CloudSpend helps you optimize cloud costs, and improve fault tolerance and performance of your cloud infrastructure across AWS, Azure, and GCP accounts. It provides tailored recommendations that can help you achieve significant savings and improve the overall efficiency of your cloud environment.

If you have not subscribed to CloudSpend and want to keep getting these recommendations, you can get started with CloudSpend now.

Site24x7's Google Cloud Guidance Report offers insights that help you fine-tune your cloud resources. By implementing the recommendations in the Guidance Report, you can identify bottlenecks, optimize configurations, and ensure peak performance for your Google Cloud setup.

Where can I view the Guidance Report?

To view the Guidance Report for Google Cloud, log in to Site24x7 and navigate to Cloud in the left navigation pane > GCP > your monitor name > Guidance Report.

List of Google Cloud services covered under Guidance Report

The supported Google services covered under Guidance Report are:

Cloud SQL

1. Enable Automated Backups (Priority: High)

Category:

Reliability

Baseline:

Automated backups protect your data by creating regular, scheduled backups of your Cloud SQL databases. In case of accidental data loss, database corruption, or other unforeseen issues, you can restore your data to a previous state.

Recommendation:

In the Backups section of your Cloud SQL instance, check whether Automated Backups are enabled, and enable them if they are not.

2. Enable High Availability (Priority: High)

Baseline:

Checks for instances that are configured with ZONAL availability.

Description:

Data redundancy is maintained during planned maintenance or outages by enabling a High Availability (HA) configuration or database cluster in Google Cloud SQL. As it operates across both a primary and secondary zone within the designated Google Cloud region, a Cloud SQL instance configured for high availability is referred to as a regional instance.

Recommendation:

Make sure that HA and automatic failover support are set up for all of your production and mission-critical Google Cloud SQL database instances.

3. Enable Point-in-Time Recovery (Priority: Moderate)

Baseline:

Checks for instances that do not have the Point-in-Time Recovery flag configured.

Description:

Point-in-Time Recovery (PITR) allows you to restore a Cloud SQL for MySQL database instance to a precise moment, down to the exact second. This feature is particularly valuable if data is lost due to an error or if the database becomes corrupted, because you can revert the database to the state it was in before the issue occurred.

Recommendation:

Ensure that the Point-in-Time Recovery (PITR) feature is enabled for all MySQL database instances in your GCP account. This allows you to restore data from a specific point in time while maintaining cost efficiency. Before enabling PITR, ensure that automated backups and binary logging are both activated for your MySQL database instances.
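The prerequisite order described above (automated backups and binary logging first, then PITR) can be sketched as a small check. This is only an illustration, not how Site24x7 evaluates the rule. It assumes the instance is available as a dictionary shaped like the Cloud SQL Admin API instance resource, where settings.backupConfiguration carries the enabled and binaryLogEnabled fields. The function name pitr_blockers is hypothetical.

```python
def pitr_blockers(instance):
    """Return the prerequisites that are still missing for PITR on a MySQL instance.

    Assumption: `instance` mirrors the Cloud SQL Admin API instance resource.
    """
    backup = instance.get("settings", {}).get("backupConfiguration", {})
    blockers = []
    if not backup.get("enabled", False):
        blockers.append("automated backups are disabled")
    if not backup.get("binaryLogEnabled", False):
        blockers.append("binary logging is disabled")
    return blockers


# Example: backups are on, but binary logging has not been enabled yet.
example = {"settings": {"backupConfiguration": {"enabled": True}}}
print(pitr_blockers(example))
```

An empty list means both prerequisites are in place.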

4. Cloud SQL - Enable slow_query_log flag in MySQL (Priority: Medium)

Baseline:

Checks if the slow_query_log flag is enabled for MySQL instances.

Description:

The slow_query_log flag enables the logging of queries that exceed a defined execution time. This helps identify performance issues and potential optimization opportunities in your database.

Recommendation:

Enable the slow_query_log flag in your Cloud SQL MySQL instance to identify and optimize slow-performing queries.

5. Cloud SQL - Set log_error_verbosity flag in PostgreSQL (Priority: Medium)

Baseline:

Checks if the log_error_verbosity flag is set to verbose for PostgreSQL instances.

Description:

The log_error_verbosity flag controls the detail level of messages logged. The verbose setting includes function names, line numbers, and other details that are crucial for effective debugging and troubleshooting.

Recommendation:

Set the log_error_verbosity flag to verbose in your Cloud SQL PostgreSQL instance to ensure comprehensive error information is captured in logs.

6. Cloud SQL - Disable log_planner_stats flag in PostgreSQL (Priority: Medium)

Baseline:

Checks if the log_planner_stats flag is disabled for PostgreSQL instances.

Description:

The log_planner_stats flag logs query planner performance statistics, which can generate excessive log entries in production environments. This level of detail is typically only needed during specific debugging or optimization sessions.

Recommendation:

Disable the log_planner_stats flag in your Cloud SQL PostgreSQL instance for production environments to prevent excessive logging and potential performance impact.
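The three database flag checks above (slow_query_log, log_error_verbosity, and log_planner_stats) can be sketched as one lookup over an instance's flags. This is a rough illustration under stated assumptions: the instance is a dictionary shaped like the Cloud SQL Admin API instance resource (databaseVersion plus settings.databaseFlags as a list of name/value pairs), and a flag that is not set falls back to the upstream engine default listed in ENGINE_DEFAULTS. The names EXPECTED_FLAGS, ENGINE_DEFAULTS, and flag_findings are hypothetical.

```python
# Desired values, taken from the three recommendations above.
EXPECTED_FLAGS = {
    "MYSQL": {"slow_query_log": "on"},
    "POSTGRES": {"log_error_verbosity": "verbose", "log_planner_stats": "off"},
}

# Assumption: values that apply when a flag is not set explicitly.
ENGINE_DEFAULTS = {
    "slow_query_log": "off",
    "log_error_verbosity": "default",
    "log_planner_stats": "off",
}


def flag_findings(instance):
    """List the flags whose effective value differs from the recommended one."""
    engine = instance.get("databaseVersion", "").split("_")[0]  # "MYSQL_8_0" -> "MYSQL"
    flags = instance.get("settings", {}).get("databaseFlags", [])
    configured = {f["name"]: str(f.get("value", "")).lower() for f in flags}
    findings = []
    for name, wanted in EXPECTED_FLAGS.get(engine, {}).items():
        actual = configured.get(name, ENGINE_DEFAULTS[name])
        if actual != wanted:
            findings.append(f"{name} is '{actual}', expected '{wanted}'")
    return findings
```

For example, a MySQL instance with no flags set would yield a single finding for slow_query_log, because the assumed default is off.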

Compute Engine - VM

1. Underutilized Compute instance (Priority: Moderate)

Baseline:

Checks the resource utilization of Google Compute Engine instances and labels an instance as underutilized if its CPU usage has been less than 2% for the past 48 hours.

Recommendation:

For Google Compute Engine, you are billed based on the instance type and the number of hours consumed. You can lower your costs by identifying and stopping underutilized instances. In addition, Site24x7's Guidance Report shows the Current Machine Type and recommends a smaller instance type (Suggested Machine Type) that you can downgrade to for further savings.
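The underutilization baseline can be sketched as a threshold test over recent CPU samples. The report does not say how the 48-hour window is aggregated, so this sketch assumes one sample per hour and requires every sample in the window to be below 2%. The function name and the hourly sampling are assumptions for illustration only.

```python
def is_underutilized(hourly_cpu_percent, threshold=2.0, window_hours=48):
    """True when every hourly CPU sample in the most recent window is below the threshold.

    Assumption: samples are ordered oldest to newest, one per hour.
    """
    recent = hourly_cpu_percent[-window_hours:]
    if len(recent) < window_hours:
        return False  # not enough history to decide
    return all(sample < threshold for sample in recent)


# Example: a busy period followed by 48 idle hours is flagged.
print(is_underutilized([50.0] * 10 + [0.5] * 48))
```

An instance with less than 48 hours of history is never flagged in this sketch, which avoids labeling newly created instances as idle.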

2. Highly utilized Compute instance (Priority: High)

Baseline:

Checks the performance counters for GCP Compute and identifies instances that appear to be highly utilized.

Description:

A Compute instance is deemed overutilized if it meets the following criteria:

The average daily CPU usage for the Compute instance is more than 90% for the last seven days.
The average daily memory utilization for the Compute instance is more than 90% for the last seven days (applicable only if you've deployed our agent on the Compute instance).

Recommendation:

Consider changing the instance size or adding the instance to an autoscaling group.
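The overutilization criteria above can be sketched as follows. The text does not state how the CPU and memory criteria are combined, so this sketch assumes both must hold when memory data is available, and that the CPU criterion alone decides when no agent is deployed. The function names and the list-of-daily-averages input are hypothetical.

```python
def exceeds_every_day(daily_averages, threshold=90.0, days=7):
    """True when each of the last `days` daily averages is above the threshold."""
    recent = daily_averages[-days:]
    return len(recent) == days and all(value > threshold for value in recent)


def is_overutilized(daily_cpu, daily_memory=None):
    """Apply the CPU criterion, plus the memory criterion when agent data exists.

    Assumption: the criteria are combined with AND; pass daily_memory=None
    when no agent is deployed on the instance.
    """
    cpu_high = exceeds_every_day(daily_cpu)
    if daily_memory is None:
        return cpu_high
    return cpu_high and exceeds_every_day(daily_memory)


# Example: CPU above 90% all week, no memory data available.
print(is_overutilized([95.0] * 7))
```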

3. Compute maintenance configuration (Priority: High)

Baseline:

Checks whether the instance's On host maintenance option is set to TERMINATE.

Description:

Google Cloud Compute Engine can migrate VM instances during infrastructure maintenance without any downtime. Set the On host maintenance option under Availability policies to Migrate to ensure that VMs are moved to new hardware.

Recommendation:

Configure VM instances for live migration to ensure that they are moved to a new host, preventing downtime during maintenance.

4. Preemptible instances (Priority: High)

Baseline:

Checks whether the instance's preemptible flag is enabled.

Description:

Preemptible instances are cost-effective, short-lived VMs that Google Cloud can stop at any moment. Designed for interruptible workloads, they provide substantial cost savings but have a maximum runtime of 24 hours.

Recommendation:

To ensure that your instances are not preemptible, follow these steps:

Navigate to the GCP console > Compute Engine section.
Stop the VM instance you wish to modify.
Edit the VM instance settings and set Preemptibility to Regular instead of Preemptible.
Save the changes and restart the instance.

5. Auto restart disabled instances (Priority: Moderate)

Baseline:

Checks whether the instance's automaticRestart flag is enabled.

Description:

A Google Cloud Compute Engine instance may stop for reasons that are not user-initiated, including maintenance events, hardware issues, and software failures.

Recommendations:

Enable automatic restart to ensure that your instance is automatically restarted in case of VM host failure.
Automatic restart helps maintain availability by recovering instances without manual intervention.
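The three checks above (on host maintenance, preemptibility, and automatic restart) all relate to an instance's availability policies. A minimal sketch, assuming the instance is a dictionary shaped like the Compute Engine API instance resource, where the scheduling object holds onHostMaintenance, preemptible, and automaticRestart. The function name is hypothetical, and this is not a description of how Site24x7 runs the checks.

```python
def scheduling_findings(instance):
    """Return availability-policy findings for one Compute Engine instance."""
    scheduling = instance.get("scheduling", {})
    findings = []
    if scheduling.get("onHostMaintenance") == "TERMINATE":
        findings.append("On host maintenance is TERMINATE; set it to MIGRATE")
    if scheduling.get("preemptible", False):
        findings.append("Instance is preemptible")
    if not scheduling.get("automaticRestart", False):
        findings.append("Automatic restart is disabled")
    return findings


# Example: a regular VM that does not live-migrate during maintenance.
vm = {"scheduling": {"onHostMaintenance": "TERMINATE", "automaticRestart": True}}
print(scheduling_findings(vm))
```

Because preemptible VMs cannot live-migrate or restart automatically, a preemptible instance will typically produce all three findings at once; switching it to a regular instance is the first step toward clearing the other two.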

6. Stopped instances (Priority: Moderate)

Baseline:

Checks whether any instances have remained stopped for more than the allowed number of days.

Description:

When instances are stopped, you can still be charged for storage. However, when you terminate them, you are freed of all charges. Additionally, if an instance has not run for a specified time, it can pose a high risk because the instance may not be actively maintained.

Recommendation:

Ensure that there are no stopped instances after the specified period.
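The stopped-instance baseline can be sketched as an age check. This sketch assumes the instance is a dictionary shaped like the Compute Engine API instance resource, which reports a stopped VM with the status TERMINATED and records lastStopTimestamp in RFC 3339 format. The allowed number of days is not stated in this document, so it is a required parameter here; the function name is hypothetical.

```python
from datetime import datetime, timezone


def stopped_too_long(instance, max_stopped_days, now=None):
    """True when a stopped instance has been stopped for more than max_stopped_days."""
    if instance.get("status") != "TERMINATED":  # stopped VMs are reported as TERMINATED
        return False
    stamp = instance.get("lastStopTimestamp")
    if not stamp:
        return False  # no stop time recorded, nothing to compare
    # fromisoformat() on older Python versions does not accept a trailing "Z".
    stopped_at = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (now - stopped_at).days > max_stopped_days


# Example: stopped on 1 January, evaluated on 11 January with a 7-day limit.
vm = {"status": "TERMINATED", "lastStopTimestamp": "2024-01-01T00:00:00.000+00:00"}
print(stopped_too_long(vm, 7, now=datetime(2024, 1, 11, tzinfo=timezone.utc)))
```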

Compute Engine - Disks

1. Unattached Disks (Priority: Moderate)

Baseline:

Checks the Compute Engine disk configuration for an associated instance ID.

Description:

Compute Engine disks can persist independently even after instance termination or after you explicitly unmount and detach the volume from the instance. Unattached volumes are still charged based on the provisioned storage and for input/output operations per second (IOPS).

Recommendation:

Associate the configured Compute Engine disks with an active instance or delete the disk.
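The unattached-disk check can be sketched as a filter over disk records. This assumes each disk is a dictionary shaped like the Compute Engine API disk resource, where users lists the instances the disk is attached to and sizeGb is the provisioned size as a string. The function name and the shape of the returned report are hypothetical.

```python
def unattached_disks(disks):
    """Return name and provisioned size for every disk with no attached instance."""
    report = []
    for disk in disks:
        if disk.get("users"):  # a non-empty list means the disk is attached
            continue
        report.append({"name": disk["name"], "size_gb": int(disk.get("sizeGb", 0))})
    return report


# Example: only "data-old" has no attached instance.
disks = [
    {"name": "boot-1", "sizeGb": "100", "users": ["zones/us-central1-a/instances/vm-1"]},
    {"name": "data-old", "sizeGb": "50"},
]
print(unattached_disks(disks))
```

Including the provisioned size in the report makes it easier to decide which disks to reattach or delete first.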

Kubernetes Cluster

1. Enable auto repair cluster nodes (Priority: Moderate)

Baseline:

Checks whether the cluster node auto repair property is disabled.

Description:

Auto-repair helps maintain the health of your GKE cluster nodes. When enabled, GKE periodically checks the health of each node, and if a node fails multiple health checks within a set timeframe, GKE automatically initiates a repair process.

Recommendation:

Enable the auto-repair feature for all GKE cluster nodes to maintain their health and ensure smooth operation.
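As a rough illustration of this check, the sketch below lists node pools that do not have auto-repair enabled. It assumes the cluster is a dictionary shaped like the GKE API cluster resource, where each entry in nodePools has a management object with an autoRepair field. The function name is hypothetical.

```python
def pools_without_auto_repair(cluster):
    """Return the names of node pools whose management.autoRepair is not enabled."""
    flagged = []
    for pool in cluster.get("nodePools", []):
        if not pool.get("management", {}).get("autoRepair", False):
            flagged.append(pool.get("name", "<unnamed>"))
    return flagged


# Example: one pool has auto-repair enabled, the other does not.
cluster = {
    "nodePools": [
        {"name": "default-pool", "management": {"autoRepair": True}},
        {"name": "batch-pool", "management": {"autoUpgrade": True}},
    ]
}
print(pools_without_auto_repair(cluster))
```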

Filestore

1. Restrict unauthorized access (Priority: High)

Baseline:

Checks whether Filestore's access control is restricted to an IP address or range.

Description:

By default, Filestore allows unrestricted access for clients in the same project and VPC network, which can result in data breaches. To enhance security, implement IP-based access control to limit access to trusted IP addresses and block all others.

Recommendation:

Ensure that you establish trusted IP addresses or ranges to prevent any unauthorized access to sensitive data.
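The access-control check can be sketched as an inspection of each file share's export options. This assumes the instance is a dictionary shaped like the Filestore API instance resource, where each entry in fileShares may carry nfsExportOptions with an ipRanges list. Treating a missing ipRanges list or a range covering every address (prefix length 0) as unrestricted is an assumption of this sketch, and the function names are hypothetical.

```python
import ipaddress


def share_is_open(share):
    """True when a file share has no IP restriction or an all-addresses range."""
    options = share.get("nfsExportOptions", [])
    if not options:
        return True  # no export options: default access applies
    for option in options:
        ranges = option.get("ipRanges", [])
        if not ranges:
            return True
        # strict=False accepts single addresses as well as CIDR ranges.
        if any(ipaddress.ip_network(r, strict=False).prefixlen == 0 for r in ranges):
            return True
    return False


def unrestricted_shares(instance):
    """Return the names of file shares that are not limited to specific IP ranges."""
    return [s.get("name", "<unnamed>") for s in instance.get("fileShares", []) if share_is_open(s)]


# Example: "vol1" is restricted to a subnet, "vol2" has no export options.
instance = {
    "fileShares": [
        {"name": "vol1", "nfsExportOptions": [{"ipRanges": ["10.0.0.0/24"]}]},
        {"name": "vol2"},
    ]
}
print(unrestricted_shares(instance))
```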

Cloud Run functions (formerly Cloud Functions)

1. Enable CMEKs (Priority: High)

Baseline:

Checks whether the functions' CMEKs are configured.

Description:

Google Cloud automatically encrypts stored data with Google-managed keys. For additional control, consider using customer-managed encryption keys (CMEKs) through Cloud KMS for secure key management, rotation, and revocation.

Recommendation:

Use CMEKs instead of Google-managed encryption keys for greater control and compliance.

2. Minimum instance configuration (Priority: Moderate)

Baseline:

Checks whether the functions are configured for minimum instance settings.

Description:

Cloud Run functions can experience cold starts, which increase latency. To minimize this, set a minimum number of function instances so that some instances stay warm and ready. This improves response times and reliability, and it is especially important for production workloads with consistent traffic or low-latency needs.

Recommendation:

Reduce cold start times and improve performance by setting enough warm instances for Cloud Run functions.

Cloud Run

1. Enable end-to-end HTTP/2 (Priority: Moderate)

Baseline:

Checks whether end-to-end HTTP/2 is disabled for Cloud Run services.

Description:

Enabling end-to-end HTTP/2 improves performance by allowing multiplexing of requests and reducing latency, which can enhance the user experience for applications running on Cloud Run.

Recommendation:

Enable end-to-end HTTP/2 for your Cloud Run services to improve performance and reduce latency.

2. Minimum instances (Priority: Moderate)

Baseline:

Checks whether the minimum number of instances is configured for Cloud Run services.

Description:

Configuring the minimum number of instances helps to ensure that your Cloud Run services are always available and can handle sudden spikes in traffic.

Recommendation:

Set a minimum number of instances for your Cloud Run services to ensure availability and handle traffic spikes effectively.

Cloud Storage

1. Enable versioning (Priority: Moderate)

Baseline:

Checks whether the versioning settings are enabled for your GCP storage buckets.

Description:

Enabling versioning helps protect against accidental deletions and overwrites by keeping multiple versions of an object.

Recommendation:

Enable versioning for your storage buckets to protect against data loss and maintain object history.

Cloud Pub/Sub

1. Cloud Pub/Sub - Dead letter policy disabled (Priority: Low)

Baseline:

Checks Cloud Pub/Sub subscriptions to identify those without a dead letter policy configured.

Description:

Dead letter policies provide a mechanism to handle messages that cannot be processed after repeated attempts. Without a dead letter policy, problem messages can block processing and cause delivery delays for other messages in the subscription.

Recommendation:

Configure dead letter policies for your Pub/Sub subscriptions to ensure proper message handling and prevent unprocessable messages from causing service disruptions.
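A minimal sketch of this check, assuming each subscription is a dictionary shaped like the Pub/Sub API subscription resource, where a configured dead letter policy appears as deadLetterPolicy with a deadLetterTopic. The function name is hypothetical.

```python
def subscriptions_without_dead_letter(subscriptions):
    """Return the names of subscriptions that have no dead letter topic configured."""
    return [
        sub["name"]
        for sub in subscriptions
        if not sub.get("deadLetterPolicy", {}).get("deadLetterTopic")
    ]


# Example: only the "orders" subscription forwards failed messages.
subs = [
    {
        "name": "projects/demo/subscriptions/orders",
        "deadLetterPolicy": {
            "deadLetterTopic": "projects/demo/topics/orders-dead",
            "maxDeliveryAttempts": 5,
        },
    },
    {"name": "projects/demo/subscriptions/audit"},
]
print(subscriptions_without_dead_letter(subs))
```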

Managed Instance Group

1. Managed Instance Group - Auto-healing disabled instance groups (Priority: High)

Baseline:

Checks whether the instance group's auto-healing configuration is disabled.

Description:

Auto-healing helps maintain the health and availability of your managed instance groups by automatically repairing unhealthy VMs. When enabled, Google Cloud regularly checks the health of each instance and recreates instances that fail health checks, ensuring your applications remain available and resilient.

Recommendation:

Enable auto-healing for your managed instance groups to increase application reliability and reduce manual intervention during instance failures.

2. Managed Instance Group - Single-zone instance groups (Priority: Low)

Baseline:

Checks whether the instance group is deployed in a single zone.

Description:

Single-zone instance groups are vulnerable to zone-specific outages, which can affect application availability. Regional (multi-zone) managed instance groups distribute VM instances across multiple zones within a region, providing higher availability and resilience against zone failures.

Recommendation:

Convert single-zone managed instance groups to regional managed instance groups to improve application availability and protect against zone-level failures.
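The two managed instance group checks can be sketched together. This assumes the group is a dictionary shaped like the Compute Engine API instanceGroupManager resource, where autoHealingPolicies is a list and a zonal group carries a zone field while a regional group carries a region field. That field-based distinction and the function name are assumptions of this sketch.

```python
def mig_findings(group):
    """Return reliability findings for one managed instance group."""
    findings = []
    if not group.get("autoHealingPolicies"):
        findings.append("auto-healing is disabled")
    if "zone" in group and "region" not in group:
        findings.append("deployed in a single zone")
    return findings


# Example: a zonal group with no health-check-based auto-healing.
group = {"name": "web-mig", "zone": "zones/us-central1-a"}
print(mig_findings(group))
```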

Load Balancer

1. Cloud CDN not enabled (Priority: Medium)

Baseline:

Checks if Cloud CDN is enabled for Google Cloud load balancers.

Description:

Cloud Content Delivery Network (CDN) caches content at Google's edge locations to reduce latency, minimize origin server load, and reduce serving costs. Without Cloud CDN, your applications might experience increased latency, higher origin server load, and higher data transfer costs, especially for static content.

Recommendation:

Enable Cloud CDN for your HTTP(S) load balancers to improve performance and reduce costs. Configure appropriate cache settings based on your application's content characteristics. For dynamic content that must be served from the origin, consider using cache control headers to optimize caching behavior.

GCP Dataflow

1. Hung jobs (Priority: Medium)

Baseline:

Checks if any Dataflow jobs have been running for more than six hours.

Description:

Dataflow jobs that run for extended periods of time might be stuck or experiencing issues, which can lead to unnecessary resource consumption and costs.

Recommendation:

Review long-running Dataflow jobs to determine whether they are functioning correctly or need to be terminated. Consider implementing job timeouts or monitoring to automatically detect and address potentially hung jobs.
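The six-hour baseline can be sketched as a filter over job records. This assumes each job is a dictionary shaped like the Dataflow API job resource, with currentState set to JOB_STATE_RUNNING for active jobs and startTime in RFC 3339 format with at most microsecond precision. The function name is hypothetical, and this is an illustration rather than the report's actual logic.

```python
from datetime import datetime, timedelta, timezone


def long_running_jobs(jobs, now=None, limit=timedelta(hours=6)):
    """Return the names of running jobs that started more than `limit` ago."""
    now = now or datetime.now(timezone.utc)
    flagged = []
    for job in jobs:
        if job.get("currentState") != "JOB_STATE_RUNNING":
            continue
        # fromisoformat() on older Python versions does not accept a trailing "Z".
        started = datetime.fromisoformat(job["startTime"].replace("Z", "+00:00"))
        if now - started > limit:
            flagged.append(job["name"])
    return flagged


# Example: evaluated at 12:00 UTC, only the job started at 02:00 exceeds six hours.
jobs = [
    {"name": "nightly-etl", "currentState": "JOB_STATE_RUNNING", "startTime": "2024-01-01T02:00:00Z"},
    {"name": "hourly-agg", "currentState": "JOB_STATE_RUNNING", "startTime": "2024-01-01T11:00:00Z"},
]
print(long_running_jobs(jobs, now=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)))
```

Streaming jobs are expected to run continuously, so a flagged job is a prompt for review rather than proof that it is stuck.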
