Amazon DLM monitoring

Amazon Data Lifecycle Manager (DLM) automates the creation, retention, and deletion of Amazon EBS snapshots and EBS-backed Amazon Machine Images (AMIs). You define lifecycle policies that run these tasks on a schedule, which keeps your backup and cleanup operations consistent and cost-efficient.

Overview

By integrating AWS DLM with Site24x7, you can monitor and validate your automated backup and AMI creation processes. Site24x7 tracks the execution status of your DLM policies, helping you confirm that snapshots are being created and deleted as expected. This ensures your data retention and cleanup rules are functioning correctly.

When you integrate AWS DLM with Site24x7, two individual monitors are created:

  • DLM-EBS Snapshots: Site24x7 collects information about snapshot lifecycle policies, including creation and deletion schedules, recent executions, and any failed actions. It helps you confirm that snapshot automation is working correctly and that old snapshots are being cleaned up according to the defined retention rules.
  • DLM-EBS Backed AMI: This monitor focuses on AMI lifecycle policies. It checks for successful AMI creation and deregistration activities, identifies failed executions, and confirms that AMI retention settings are applied properly. Monitoring this helps maintain a consistent and optimized AMI inventory.

Together, these monitors help validate the overall health of your DLM configurations, ensuring both EBS volume backups and AMIs are managed as expected across your AWS environment.

Use case

If you manage multiple EBS volumes or AMIs across accounts, you may rely on DLM policies to automate snapshot creation and deletion. However, policy failures or misconfigurations such as missing permissions, invalid schedules, or resource tagging errors can cause backup gaps without immediate notice.

Site24x7’s DLM integration helps detect such issues early by monitoring lifecycle policy activity and execution results. You can verify whether each policy is working correctly, track any failures, and ensure data protection tasks are running as intended.

Benefits of Site24x7's Amazon DLM integration

Integrate your Amazon DLM environment with Site24x7 and leverage the following benefits:

  • Operational assurance: Ensure data backup and cleanup automation continues without interruption.
  • Proactive alerts: Receive notifications when policy executions fail or deviate from the defined schedule.
  • Audit support: Maintain visibility into policy executions for compliance and retention verification.
  • Cross-service correlation: Combine DLM insights with EC2, EBS, and other AWS services for complete infrastructure monitoring.

Setup and configuration

  1. Log in to your Site24x7 account.
  2. Go to Cloud > AWS > Integrate AWS Account and create a cross-account IAM role to enable Site24x7 to access your AWS resources.
  3. On the Integrate AWS Account page, select DLM from the Services to be discovered list.

Permissions

Ensure that Site24x7 receives the following permissions to monitor Amazon DLM:

  • "dlm:GetLifecyclePolicies"
  • "dlm:GetLifecyclePolicy"
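The two read-only actions above would typically be granted through an IAM policy attached to the cross-account role. The following is a minimal sketch in Python that builds such a policy document. The helper name `build_dlm_read_policy` and the wildcard `Resource` value are assumptions made for illustration, so narrow the resource to suit your own security requirements.

```python
import json

# The two DLM read actions listed in this document.
DLM_READ_ACTIONS = ["dlm:GetLifecyclePolicies", "dlm:GetLifecyclePolicy"]


def build_dlm_read_policy(resource="*"):
    """Return an IAM policy document (as a dict) allowing the DLM read actions.

    The wildcard resource is an assumption; narrow it if your setup requires.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(DLM_READ_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the IAM policy editor.
    print(json.dumps(build_dlm_read_policy(), indent=2))
```

Running the script prints a JSON document that can be reviewed before it is attached to the role.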

Polling frequency

Site24x7 queries AWS service-level APIs according to the set polling frequency (from once a minute to once a day) to collect metrics from DLM monitors.

Supported metrics

The supported metrics for DLM monitors are given below.

DLM-EBS Snapshots

The supported metrics for the DLM-EBS Snapshots monitor are given below.

| Metric name | Description | Statistics | Unit |
| --- | --- | --- | --- |
| Resources Targeted | The number of resources targeted by the tags specified in a snapshot or EBS-backed AMI policy. | Sum | Count |
| Snapshots Create Started | The number of snapshot create actions initiated by a snapshot policy. | Sum | Count |
| Snapshots Create Completed | The number of snapshots created by a snapshot policy. This includes successful retries within 60 minutes of the scheduled time. | Sum | Count |
| Snapshots Create Failed | The number of snapshots that could not be created by a snapshot policy. This includes unsuccessful retries within 60 minutes of the scheduled time. | Sum | Count |
| Snapshots Shared Completed | The number of snapshots shared across accounts by a snapshot policy. | Sum | Count |
| Snapshots Delete Completed | The number of snapshots deleted by a snapshot or EBS-backed AMI policy. This metric applies only to snapshots created by the policy. | Sum | Count |
| Snapshots Delete Failed | The number of snapshots that could not be deleted by a snapshot or EBS-backed AMI policy. This metric applies only to snapshots created by the policy. | Sum | Count |
| Snapshots Copied Region Started | The number of cross-region snapshot copy actions initiated by a snapshot policy. | Sum | Count |
| Snapshots Copied Region Completed | The number of cross-region snapshot copies created by a snapshot policy. This includes successful retries within 24 hours of the scheduled time. | Sum | Count |
| Snapshots Copied Region Failed | The number of cross-region snapshot copies that could not be created by a snapshot policy. This includes unsuccessful retries within 24 hours of the scheduled time. | Sum | Count |
| Snapshots Copied Region Delete Completed | The number of cross-region snapshot copies deleted, as designated by the retention rule, by a snapshot policy. | Sum | Count |
| Snapshots Copied Region Delete Failed | The number of cross-region snapshot copies that could not be deleted, as designated by the retention rule, by a snapshot policy. | Sum | Count |
| Snapshots Archive Deletion Failed | The number of archived snapshots that could not be deleted from the archive tier by a snapshot policy. | Sum | Count |
| Snapshots Archive Scheduled | The number of snapshots that were scheduled to be archived by a snapshot policy. | Sum | Count |
| Snapshots Archive Completed | The number of snapshots that were successfully archived by a snapshot policy. | Sum | Count |
| Snapshots Archive Failed | The number of snapshots that could not be archived by a snapshot policy. | Sum | Count |
| Snapshots Archive Deletion Completed | The number of archived snapshots that were successfully deleted from the archive tier by a snapshot policy. | Sum | Count |
| Pre-Script Started | The number of instances for which a pre-script was successfully initiated. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| Pre-Script Completed | The number of instances for which a pre-script was successfully completed. The metric is emitted even if the pre-script completes outside of the specified timeout period. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| Pre-Script Failed | The number of instances for which a pre-script failed to complete successfully. The metric is emitted even if the pre-script completes outside of the specified timeout period. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| Post-Script Started | The number of instances for which a post-script was successfully initiated. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| Post-Script Completed | The number of instances for which a post-script was successfully completed. The metric is emitted even if the post-script completes outside of the specified timeout period. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| Post-Script Failed | The number of instances for which a post-script failed to complete successfully. The metric is emitted even if the post-script completes outside of the specified timeout period. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| VSS Backup Started | The number of instances for which a Volume Shadow Copy Service (VSS) backup was successfully initiated. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| VSS Backup Completed | The number of instances for which a VSS backup was successfully completed. The metric is emitted even if the VSS backup completes outside of the timeout period. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| VSS Backup Failed | The number of instances for which a VSS backup failed to complete successfully. The metric is emitted even if the VSS backup completes outside of the timeout period. If script retries are enabled, this metric can be emitted multiple times per policy run. | Sum | Count |
| Snapshots Copied Account Started | The number of cross-account snapshot copy actions initiated by a cross-account copy event policy. | Sum | Count |
| Snapshots Copied Account Completed | The number of snapshots copied from another account by a cross-account copy event policy. This includes successful retries within 24 hours of the scheduled time. | Sum | Count |
| Snapshots Copied Account Failed | The number of snapshots that could not be copied from another account by a cross-account copy event policy. This includes unsuccessful retries within 24 hours of the scheduled time. | Sum | Count |
| Snapshots Copied Account Delete Completed | The number of cross-region snapshot copies deleted, as designated by the retention rule, by a cross-account copy event policy. | Sum | Count |
| Snapshots Copied Account Delete Failed | The number of cross-region snapshot copies that could not be deleted, as designated by the retention rule, by a cross-account copy event policy. | Sum | Count |

DLM-EBS Backed AMI

The supported metrics for the DLM-EBS Backed AMI monitor are given below.

| Metric name | Description | Statistics | Unit |
| --- | --- | --- | --- |
| Resources Targeted | The number of resources targeted by the tags specified in a snapshot or EBS-backed AMI policy. | Sum | Count |
| Snapshots Delete Completed | The number of snapshots deleted by a snapshot or EBS-backed AMI policy. This metric applies only to snapshots created by the policy. | Sum | Count |
| Snapshots Delete Failed | The number of snapshots that could not be deleted by a snapshot or EBS-backed AMI policy. This metric applies only to snapshots created by the policy. | Sum | Count |
| Snapshots Copied Region Delete Completed | The number of cross-region snapshot copies deleted, as designated by the retention rule, by a snapshot policy. | Sum | Count |
| Snapshots Copied Region Delete Failed | The number of cross-region snapshot copies that could not be deleted, as designated by the retention rule, by a snapshot policy. | Sum | Count |
| Images Create Started | The number of create image actions initiated by an EBS-backed AMI policy. | Sum | Count |
| Images Create Completed | The number of AMIs created by an EBS-backed AMI policy. | Sum | Count |
| Images Create Failed | The number of AMIs that could not be created by an EBS-backed AMI policy. | Sum | Count |
| Images Deregister Completed | The number of AMIs deregistered by an EBS-backed AMI policy. | Sum | Count |
| Images Deregister Failed | The number of AMIs that could not be deregistered by an EBS-backed AMI policy. | Sum | Count |
| Images Copied Region Started | The number of cross-region copy actions initiated by an EBS-backed AMI policy. | Sum | Count |
| Images Copied Region Completed | The number of cross-region AMI copies created by an EBS-backed AMI policy. | Sum | Count |
| Images Copied Region Failed | The number of cross-region AMI copies that could not be created by an EBS-backed AMI policy. | Sum | Count |
| Images Copied Region Deregister Completed | The number of cross-region AMI copies deregistered, as designated by the retention rule, by an EBS-backed AMI policy. | Sum | Count |
| Images Copied Region Deregistered Failed | The number of cross-region AMI copies that could not be deregistered, as designated by the retention rule, by an EBS-backed AMI policy. | Sum | Count |
| Enable Image Deprecation Completed | The number of AMIs that were marked for deprecation by an EBS-backed AMI policy. | Sum | Count |
| Enable Image Deprecation Failed | The number of AMIs that could not be marked for deprecation by an EBS-backed AMI policy. | Sum | Count |
| Enable Copied Image Deprecation Completed | The number of cross-region AMI copies that were marked for deprecation by an EBS-backed AMI policy. | Sum | Count |
| Enable Copied Image Deprecation Failed | The number of cross-region AMI copies that could not be marked for deprecation by an EBS-backed AMI policy. | Sum | Count |

Threshold configuration

To configure thresholds for DLM monitors:

  1. Log in to your Site24x7 account and navigate to Admin > Configuration Profiles > Threshold and Availability.
  2. Click Add Threshold Profile.
  3. Select the applicable monitor type from the Monitor Type drop-down menu. The available monitor types are DLM-EBS Snapshots and DLM-EBS Backed AMI.
  4. Provide an appropriate name in the Display Name field.
  5. The supported metrics are displayed in the Threshold Configuration section. You can set threshold values for all the metrics mentioned above.
  6. Click Save.
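When choosing threshold values, it can help to reason about failures relative to total activity rather than as raw counts. The sketch below is a hypothetical helper and not part of Site24x7 or AWS. It takes the Sum values of a Completed metric and a Failed metric for one polling interval and reports whether the failure percentage exceeds a limit you pick.

```python
def failure_percentage(completed, failed):
    """Return failed actions as a percentage of all finished actions."""
    total = completed + failed
    if total == 0:
        # No actions finished in this interval, so there is nothing to flag.
        return 0.0
    return 100.0 * failed / total


def breaches_threshold(completed, failed, limit_percent):
    """True when the failure percentage is strictly above the chosen limit."""
    return failure_percentage(completed, failed) > limit_percent
```

For example, 18 completed and 2 failed snapshot creations give a failure percentage of 10, which would breach a 5 percent limit.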

Licensing

  • Each DLM-EBS Snapshots monitor utilizes one basic monitor license.
  • Each DLM-EBS Backed AMI monitor utilizes one basic monitor license.

Viewing Amazon DLM data

To view the DLM-EBS Snapshots monitor:

  • From the Site24x7 console, navigate to Cloud > AWS > DLM-EBS Snapshots.

To view the DLM-EBS Backed AMI monitor:

  • From the Site24x7 console, navigate to Cloud > AWS > DLM-EBS Backed AMI.

Monitor data

The monitor data for each Amazon DLM monitor is given below.

DLM-EBS Snapshots

The monitor data for the DLM-EBS Snapshots monitor is given below.

Summary

The Summary tab provides an overview of the events timeline and metrics in the form of charts.

Cross Region Policies

The Cross Region Policies tab shows details of your snapshot copy and deletion activities across AWS regions. You can track metrics such as the completion and failure counts of cross-region snapshot copy and delete operations. The charts and data help confirm whether your cross-region backup policies are running successfully and highlight any failures that might need attention.

Configuration Details

The Configuration Details tab displays the key configuration details of the DLM-EBS Snapshots monitor. You can view details such as Policy ID, DLM - EBS Snapshot Policy Name, Resource Type, and Policy Type.

Schedules

The Schedules tab displays the schedule configuration details for the DLM policy. It includes information about when snapshots are created, their retention periods, and how frequently they are taken or deleted.
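To sanity-check the values shown on this tab, you can work out how many snapshots a schedule should produce and how long each one is kept. The following sketch assumes a simple fixed-interval schedule with count-based retention. The function names and parameters are hypothetical and do not correspond to DLM or Site24x7 fields.

```python
def snapshots_per_day(interval_hours):
    """Number of snapshots created per day for a fixed-interval schedule."""
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    return 24 / interval_hours


def retention_window_days(interval_hours, retain_count):
    """Approximate age, in days, at which the oldest snapshot is rotated out.

    Assumes count-based retention: once retain_count newer snapshots exist,
    the oldest one is deleted.
    """
    if interval_hours <= 0 or retain_count <= 0:
        raise ValueError("interval_hours and retain_count must be positive")
    return retain_count * interval_hours / 24
```

Under these assumptions, a 12-hour interval that retains 14 snapshots creates two snapshots a day and keeps each one for about seven days.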

Zia Forecast

A forecast chart displays future points of a performance metric (measurement of resource usage) based on historical time series data. Thirty days of historical data is used to predict what your metric usage will be in the next seven days.
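To illustrate the general idea of projecting a metric forward from its history, here is a minimal sketch that fits a least-squares straight line to a series of daily values and extends it seven days. This is only an illustration under that simplifying assumption. It is not the algorithm Zia uses, and the function name is hypothetical.

```python
def linear_forecast(history, days_ahead=7):
    """Fit y = a + b*x by least squares and project days_ahead future points."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two historical points")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    # Slope is covariance(x, y) divided by variance(x).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    # Future x positions continue directly after the last historical index.
    return [intercept + slope * (n + k) for k in range(days_ahead)]
```

Passing 30 daily values returns seven projected values, one for each of the following days.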

Outages

The Outages tab provides details on an outage's start time, end time, duration, and comments, if any.

Inventory

Obtain details like Resource Name, Monitor Licensing Category, and Check Frequency from the Inventory tab. You can also view and set the Threshold and Availability Profile and the Notification Profile from this tab.

Log Report

This tab provides a consolidated report of the DLM-EBS Snapshots monitor's log status, which can be downloaded as a CSV file.

Alert Logs

This tab displays a chronological list of all triggered alerts related to the DLM-EBS Snapshots monitor. It helps you trace alert history and severity to assess issues and validate threshold settings.

DLM-EBS Backed AMI

The monitor data for the DLM-EBS Backed AMI monitor is given below.

Summary

The Summary tab provides an overview of the events timeline and metrics in the form of charts.

Cross Region Policies

The Cross Region Policies tab displays data related to the cross-region operations of your AMI lifecycle policies. You can view metrics showing when AMI copies are initiated, completed, or failed across regions. It also tracks the deregistration status of copied AMIs and the deletion status of associated snapshots, helping you confirm that your cross-region image copy and cleanup tasks are working as expected.

Configuration Details

The Configuration Details tab displays the key configuration details of the DLM-EBS Backed AMI monitor. You can view details such as Policy ID, policy name, Resource Type, and Policy Type.

Schedules

The Schedules tab lists the schedule configuration details defined in your DLM policy. It shows when AMIs are created, copied, and deleted, along with their retention periods and frequency. This helps verify that your AMI creation and rotation schedules align with your organization’s backup and compliance requirements.

Zia Forecast

A forecast chart displays future points of a performance metric (measurement of resource usage) based on historical time series data. Thirty days of historical data is used to predict what your metric usage will be in the next seven days.

Outages

The Outages tab provides details on an outage's start time, end time, duration, and comments, if any.

Inventory

Obtain details like Resource Name, Region, and Monitor Licensing Category from the Inventory tab. You can also view and set the Threshold and Availability Profile and the Notification Profile from this tab.

Log Report

This tab provides a consolidated report of the DLM-EBS Backed AMI monitor's log status, which can be downloaded as a CSV file.

Alert Logs

This tab displays a chronological list of all triggered alerts related to the DLM-EBS Backed AMI monitor. It helps you trace alert history and severity to assess issues and validate threshold settings.
