AWS S3 Issues and Troubleshooting: A Comprehensive Guide

In the nearly two decades since its 2006 launch, AWS S3 has cemented its position as a leading cloud storage solution. Its scalability, high availability, user-friendly interface, and strong security features make it a preferred choice among businesses and developers.

However, as any developer who’s worked with S3 would tell you, the service can sometimes encounter issues related to performance, configurations, and permissions. Therefore, the purpose of this guide is to provide a comprehensive resource for troubleshooting common S3 problems. Let’s get started!

AWS S3 basics

Amazon Simple Storage Service (S3) is an object storage service built from the ground up with scalability, security, and durability in mind. It allows users to store and retrieve any amount of data at any time. Data stored in S3 can be accessed via the AWS Management Console, the AWS CLI, SDKs, or directly through REST APIs.

The following components make up the S3 architecture:

  • Buckets: These act as containers for object storage. Each bucket has a unique name and is associated with a specific AWS region.
  • Objects: These are the actual files stored in S3. Each object consists of data, metadata, and a unique key for identification.
  • Keys: These are unique identifiers for each object within a bucket. Similar to a file path, a key allows you to pinpoint and retrieve a specific object.
  • Storage classes: S3 offers different storage classes (e.g., Standard, Intelligent-Tiering, Glacier) that optimize cost and performance based on access patterns.
  • Access control: Permissions and policies control who can access data and what actions they can perform.
  • Versioning: This feature stores multiple versions of an object, which helps protect against accidental deletions or overwrites by letting you recover earlier versions.
  • Event notifications: S3 can trigger events for operations like object uploads or deletions, which can be used as triggers for AWS Lambda functions or sent to Simple Notification Service (SNS).

Common use cases

S3 powers a variety of use cases across industries, including:

  • Backup and disaster recovery: Businesses store critical data in S3 as part of their redundancy and recovery initiatives.
  • Big data and analytics: S3 is often used as a data lake for storing and analyzing large data sets.
  • Static website hosting: Developers host static websites directly from an S3 bucket, without needing to set up a web server.
  • Media storage and streaming: Streaming platforms store and serve large media files from S3.
  • Machine learning and AI: S3 serves as a repository for training data sets and model storage.

Why prompt troubleshooting of S3 issues is critical

Here are some reasons why you should never delay troubleshooting an S3 issue:

  • If not fixed early, small issues can escalate into widespread outages. For example, after a release upgrade, a minor misconfiguration in bucket permissions starts to prevent access to important files. Without immediate intervention, this leads to a full-blown service outage for a SaaS platform.
  • Unresolved S3 issues can slow down applications and frustrate users. For example, suppose an online learning platform stores course videos in S3. Slow retrieval speeds due to misconfigured storage classes cause videos to buffer, leading to complaints.
  • Delays in fixing security gaps can result in data leaks or unauthorized access. For example, due to a user error, a company’s S3 bucket accidentally becomes public. Without immediate action, hackers may have enough time to scrape sensitive customer data.
  • Unchecked issues can result in excessive API calls, unnecessary data transfers, and higher storage costs. For example, a logging system continuously writes duplicate data to S3 due to a script bug. By the time the issue is caught, the company has racked up thousands in unexpected charges.
  • Failure to address security or data retention issues can lead to legal complications. For example, suppose a healthcare provider is required to keep patient records in protected storage. Due to an unmonitored bucket misconfiguration, they fail an audit and face fines.

Tools for S3 troubleshooting

To become an effective S3 troubleshooter, here are some tools you should have in your repertoire:

AWS CloudTrail

CloudTrail records actions taken on S3, including bucket-level changes such as permission updates and, when data events are enabled, object-level operations such as uploads and deletions. While performing security audits or debugging access issues, you can use it to track who made changes and when.

  • Sample use case: If an S3 bucket becomes public, CloudTrail can show which user or service modified its permissions.

AWS CloudWatch

CloudWatch provides detailed metrics on S3 usage, including request counts, errors, and data transfer statistics. You can set up alarms on unusual activity to detect issues early.

  • Sample use case: If a web application slows down due to high S3 request latencies, CloudWatch can help pinpoint potential performance bottlenecks.

AWS Config

AWS Config continuously monitors S3 bucket configurations and alerts users when compliance rules are violated. You can even use AWS Systems Manager Automation documents to automate remediation steps when a misconfiguration is detected.

  • Sample use case: If an S3 bucket’s versioning is disabled, AWS Config can detect the change and alert administrators to restore the correct configuration before any chance of data loss.

Amazon S3 Storage Lens

This tool provides visibility into S3 storage usage and activity. You can use it to identify trends and anomalies.

  • Sample use case: If S3 costs suddenly increase, Storage Lens can show whether the cause is excessive API requests, unnecessary object versions, or something else.

Access Analyzer for S3

Access Analyzer for S3 helps detect unintended public or cross-account access to buckets.

  • Sample use case: If a bucket is shared with external accounts without authorization, Access Analyzer will flag it for review.

Site24x7

Site24x7 is a dedicated, all-in-one monitoring tool that provides real-time insights into S3 availability, latency, and performance. It helps track access failures, response times, and unusual data usage patterns.

  • Sample use case: If an S3 bucket experiences a sudden spike in failed access requests, Site24x7 can trigger alerts to administrators.

S3 issue troubleshooting guide

This section covers common AWS S3 issues categorized into different problem areas.

Access and permission issues

Let’s start with issues related to access and permissions.

Access denied errors

You try to access an S3 bucket or object but receive an "Access Denied" message.

Symptoms:

  • Error message: 403 Forbidden - Access Denied
  • Unable to read, write, or delete objects even with valid credentials

Troubleshooting:

  • Ensure that the bucket policy or object ACL does not explicitly deny access.
  • Confirm that the IAM user or role has the necessary s3:GetObject, s3:PutObject, or s3:DeleteObject permissions.
  • Run the IAM Policy Simulator to test whether a policy is blocking access.
  • If accessing from another AWS account, ensure that the bucket policy allows cross-account access.
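As a sketch of the cross-account point, a bucket policy granting read access to another account might look like the following. The account ID and bucket name are placeholders, not values from this guide:

```python
import json

# Hypothetical account ID and bucket name, for illustration only.
cross_account_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountRead",
            "Effect": "Allow",
            # The external account whose principals may read this bucket.
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # ListBucket applies to the bucket ARN, GetObject to the objects.
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

print(json.dumps(cross_account_policy, indent=2))
```

You could attach this with aws s3api put-bucket-policy. Note that for cross-account access, IAM principals in the other account also need matching permissions in their own identity policies; the bucket policy alone is not sufficient.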

Incorrect S3 presigned URL access

Generated presigned URLs fail when accessed.

Symptoms:

  • Error message: 403 Forbidden - SignatureDoesNotMatch
  • URL works for some users but not others
  • URL expires too soon or does not work at all

Troubleshooting:

  • Ensure that the expiration time set when generating the URL is long enough. Presigned URLs also stop working once the temporary credentials used to sign them expire.
  • A presigned URL is bound to the HTTP method it was signed for (GET, PUT, etc.); using it with a different method fails.
  • Ensure correct region and credentials. The URL must be signed for the bucket’s region; a region mismatch produces signature errors.
  • Some headers (such as Host) must match those used during URL signing.

Storage issues

Next, let’s dissect some common storage-related problems.

Unexpected S3 storage costs

Your AWS bill shows unexpectedly high S3 costs.

Symptoms:

  • Increased storage charges despite minimal uploads
  • High request costs due to excessive API calls

Troubleshooting:

  • Check S3 Storage Lens or Cost Explorer to identify buckets with high usage or request rates.
  • Enable S3 Lifecycle Policies to automatically delete or move infrequently accessed objects to Glacier or Intelligent-Tiering.
  • Use AWS CloudWatch to track excessive GET or PUT requests.
  • Ensure that logging is not generating large amounts of redundant data.
  • Identify and reduce cross-region or external data transfers.
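As a sketch of the lifecycle-policy suggestion above, a configuration that transitions objects under a hypothetical logs/ prefix to Glacier after 30 days and deletes them after a year might look like this; the rule ID, prefix, and thresholds are illustrative:

```python
import json

# Illustrative rule: adjust the prefix, day counts, and storage class
# to match your own retention needs.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            # Move objects to Glacier 30 days after creation...
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            # ...and delete them entirely after a year.
            "Expiration": {"Days": 365},
        }
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```

This shape matches what aws s3api put-bucket-lifecycle-configuration expects when passed as a JSON file.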

S3 object versioning leading to excessive storage

Versioning increases bucket storage size due to old object versions accumulating.

Symptoms:

  • Large bucket size despite minimal new uploads
  • High storage costs from multiple versions of the same file

Troubleshooting:

  • Configure lifecycle policy rules to delete older versions automatically.
  • List versions using aws s3api list-object-versions and remove unused ones.
  • Enable Intelligent-Tiering to automatically move older versions to cheaper storage classes.
  • Use S3 Storage Lens to track how much space old versions consume.
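A lifecycle rule for the first point might look like the following sketch, which expires noncurrent versions 30 days after they are superseded while retaining the newest few; the numbers are illustrative:

```python
import json

# Illustrative rule: keep the 3 most recent noncurrent versions and
# expire the rest 30 days after they become noncurrent.
versioning_cleanup = {
    "Rules": [
        {
            "ID": "expire-old-versions",
            "Filter": {},  # an empty filter applies to the whole bucket
            "Status": "Enabled",
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 30,
                "NewerNoncurrentVersions": 3,
            },
        }
    ]
}

print(json.dumps(versioning_cleanup, indent=2))
```

Current object versions are untouched by this rule, so application behavior does not change; only the accumulated history is trimmed.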

Performance issues

Performance issues in S3 can slow down applications, cause timeouts, or result in unpredictable response times.

High latency in S3 API requests

S3 API requests take longer than expected, which affects application performance.

Symptoms:

  • Slow response times when making PUT, GET, or DELETE requests
  • Increased timeout errors in applications using S3

Troubleshooting:

  • Use AWS CloudWatch to track request latency and other related metrics.
  • Use AWS Global Accelerator to improve latency for global users accessing S3.
  • Avoid using a single object prefix too often; S3 optimizes performance when objects are spread across multiple prefixes.
  • Reduce API calls by using batch processing for large workloads.
  • Run ping and traceroute to check for network-related delays.
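The prefix advice above can be sketched as a simple key-naming scheme: deriving a short hash-based shard from each key spreads objects, and therefore request load, across many prefixes. The fanout value and naming convention are illustrative choices, not an S3 requirement:

```python
import hashlib

def spread_key(key: str, fanout: int = 16) -> str:
    """Prefix the key with a hash-derived shard so requests for different
    objects land on different prefixes instead of one hot prefix."""
    shard = int(hashlib.md5(key.encode()).hexdigest(), 16) % fanout
    return f"{shard:02x}/{key}"

print(spread_key("2024/06/01/app.log"))
```

The trade-off: listing objects by their original logical prefix now requires listing each shard, so this exchanges listing convenience for request-rate headroom.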

Inconsistent performance across different S3 storage classes

Objects in different storage classes experience varying performance levels.

Symptoms:

  • Faster access to objects in S3 Standard but slower retrieval from Glacier or Deep Archive
  • Variations in response times depending on object class

Troubleshooting:

  • First up, it’s important to understand storage class behavior. S3 Standard is designed for frequent, quick access, while Glacier and Deep Archive are intended for long-term archival and require a retrieval request, which incurs a delay.
  • Use S3 Intelligent-Tiering to automatically move objects between tiers based on access patterns.
  • If faster access is needed, use expedited retrieval for Glacier objects (note that expedited retrieval is not available for Deep Archive).
  • Serve frequently accessed files through Amazon CloudFront, which caches them closer to users for faster retrieval.

Bucket and object management issues

Misconfigurations in bucket management can lead to problems like failed deletions, stalled uploads, or storage waste.

Bucket deletion fails

You cannot delete an S3 bucket, even when it appears empty.

Symptoms:

  • BucketNotEmpty or similar errors
  • AWS Management Console shows the bucket as empty, but deletion still fails

Troubleshooting:

  • Run aws s3api list-object-versions to find and delete all versions before deleting the bucket.
  • If MFA delete is enabled, you need to authenticate with MFA before you can delete the bucket or its object versions.
  • Confirm that bucket ownership settings allow deletion.
  • Manually delete contents using the AWS CLI. Use aws s3 rb s3://bucket-name --force to remove the bucket and all objects. Note that --force does not remove object versions, so versioned buckets require deleting all versions and delete markers first.

Multipart uploads not completing

Large files uploaded via multipart upload remain incomplete.

Symptoms:

  • Uploaded parts consume storage (and incur charges) but no completed object appears in the bucket
  • Uploads fail midway, leaving partial object data

Troubleshooting:

  • Run aws s3api list-multipart-uploads to find unfinished multipart uploads.
  • Use aws s3api complete-multipart-upload or aws s3api abort-multipart-upload to finalize or clean up the upload.
  • If you are uploading from an application, ensure that timeout values are high enough to allow completion.
  • Use S3 Transfer Acceleration to improve multipart upload speeds, especially for large files.
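A common preventative measure, assuming it fits your workflow, is a lifecycle rule that aborts multipart uploads left incomplete for a set number of days, so stale parts stop accruing storage charges. The rule ID and threshold below are illustrative:

```python
import json

# Illustrative rule: abort any multipart upload that is still incomplete
# 7 days after it was initiated, freeing the stored parts.
abort_stale_uploads = {
    "Rules": [
        {
            "ID": "abort-stale-multipart-uploads",
            "Filter": {},  # applies bucket-wide
            "Status": "Enabled",
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

print(json.dumps(abort_stale_uploads, indent=2))
```

This complements the manual cleanup via aws s3api abort-multipart-upload by handling future stragglers automatically.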

Encryption issues

Encryption ensures that S3 data remains secure, but misconfigurations can cause access problems or expose sensitive data.

S3 default encryption not working

Objects uploaded to an S3 bucket are not automatically encrypted.

Symptoms:

  • New objects show as unencrypted despite bucket encryption settings
  • Security audits flag missing encryption on objects

Troubleshooting:

  • Check that bucket default encryption is enabled (SSE-S3/AES-256 or SSE-KMS). Note that since January 2023, S3 automatically applies SSE-S3 encryption to all new objects by default.
  • If using the AWS SDK or API, include the x-amz-server-side-encryption header when uploading.
  • Check IAM policies to rule out any misconfigurations. Some IAM policies may restrict encryption settings.
  • Set up AWS Config rules to ensure all objects are encrypted at rest.
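To enforce the header-based point above, a bucket policy can deny s3:PutObject requests that do not explicitly request SSE-KMS. This is a sketch with a placeholder bucket name; test it against your workflow before enforcing, because uploads that rely on the bucket's default encryption omit the header and would also be denied:

```python
import json

# Placeholder bucket name; denies uploads that do not request SSE-KMS.
enforce_encryption = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyNonKmsUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
            # StringNotEquals also matches when the header is absent,
            # so unencrypted-header uploads are denied too.
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "aws:kms"
                }
            },
        }
    ],
}

print(json.dumps(enforce_encryption, indent=2))
```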

KMS key issues preventing object access

S3 objects encrypted with AWS Key Management Service (KMS) cannot be accessed.

Symptoms:

  • Access Denied, KMS Access Denied, or similar error messages
  • Users cannot download or modify objects encrypted with KMS

Troubleshooting:

  • Ensure that users have kms:Decrypt and kms:GenerateDataKey permissions.
  • Make sure that the KMS key allows the required roles or users to access the encrypted objects.
  • KMS keys are region-specific; use the correct key for the bucket’s region.
  • Use AWS CloudTrail to identify any failed KMS decryption attempts.
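For the first point, the required permissions might be granted with an identity policy like this sketch; the key ARN is a placeholder:

```python
import json

# Placeholder key ARN for illustration.
kms_access = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Decrypt is needed to read KMS-encrypted objects;
            # GenerateDataKey is needed to upload new ones.
            "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
            "Resource": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
        }
    ],
}

print(json.dumps(kms_access, indent=2))
```

Remember that the key policy on the KMS key itself must also allow the principal (or delegate authorization to IAM); an identity policy alone is not enough otherwise.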

Scalability problems

AWS S3 is designed to handle massive amounts of data, but improper scaling strategies can lead to bottlenecks, API throttling, and inefficient data management.

API throttling due to request limits

S3 supports at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per partitioned prefix, and exceeding these rates can cause throttling.

Symptoms:

  • Error messages like SlowDown or 503 Service Unavailable
  • Increased request latency and intermittent failures when accessing S3

Troubleshooting:

  • Avoid overloading a single prefix by spreading objects across different key names.
  • If feasible, use Amazon S3 Express One Zone. It provides low-latency, high-throughput access for high-performance workloads.
  • Configure retries in applications to avoid overwhelming S3 with repeated requests.
  • Track API request patterns to avoid hitting S3 limits.
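The retry suggestion above is usually implemented as exponential backoff with jitter (the AWS SDKs apply a variant of this automatically). Here is a minimal, library-free sketch; the exception type is a hypothetical stand-in for botocore's SlowDown / 503 errors:

```python
import random
import time

class ThrottledError(Exception):
    """Hypothetical stand-in for an S3 SlowDown / 503 throttling response."""

def with_backoff(call, max_attempts=5, base=0.5, cap=8.0, sleep=time.sleep):
    """Retry `call` on throttling, waiting an exponentially growing,
    jittered delay between attempts ("full jitter")."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
            sleep(delay)

# Example: a request that is throttled twice, then succeeds.
attempts = {"n": 0}

def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ThrottledError("SlowDown")
    return "ok"

print(with_backoff(flaky_request, sleep=lambda s: None))  # skip real sleeps here
```

The jitter matters: without it, many throttled clients retry in lockstep and hit S3 with synchronized bursts, prolonging the throttling.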

Challenges with very large buckets

While S3 is designed to scale, performance or operational challenges can arise when dealing with extremely large buckets.

Symptoms:

  • Slower performance when listing objects in large buckets
  • Increased time for operations like inventory reports or lifecycle transitions

Troubleshooting:

  • Use S3 Inventory to generate reports that help you analyze and manage large data sets efficiently.
  • Use S3 Batch Operations to automate large-scale operations like copying, tagging, or deleting objects.
  • Split data into multiple buckets to reduce the size of individual buckets.
  • Use a hierarchical prefix structure to improve listing performance.
  • Use S3 Storage Lens to gain insights into storage usage and optimize bucket organization.

Preventative measures and best practices

To avoid several of the aforementioned issues, follow these best practices when managing S3 buckets:

  • Implement the Principle of Least Privilege (PoLP): Assign the minimum required permissions to users, applications, and roles. Create IAM policies, bucket policies, and Access Control Lists (ACLs) with care to prevent unauthorized access.
  • Set Up CloudWatch and CloudTrail monitoring: Enable AWS CloudWatch for performance tracking and AWS CloudTrail for auditing all S3-related activities.
  • Use S3 Object Lock for data protection: If you need to prevent accidental deletions or modifications, enable S3 Object Lock in governance or compliance mode.
  • Optimize storage costs with lifecycle policies: Define rules to transition objects to lower-cost storage tiers like Glacier or delete them when they are no longer needed. This reduces unnecessary storage costs.
  • Ensure proper encryption for security: Use Server-Side Encryption (SSE-S3, SSE-KMS, or SSE-C) or Client-Side Encryption to protect sensitive data at rest. Always enforce HTTPS for secure data transfer.
  • Distribute load with multiple prefixes: To avoid throttling and request limits, spread objects across multiple prefixes within a bucket instead of overloading a single one.
  • Enable versioning for data recovery: Turn on versioning to protect against accidental overwrites and deletions. This also helps recover from ransomware attacks and user errors.
  • Regularly audit and rotate access keys: Avoid using long-lived access keys and instead use IAM roles with temporary security credentials. If access keys can’t be avoided, rotate them frequently.
  • Limit public access to buckets: Avoid exposing buckets to the public unless absolutely necessary. Use AWS Block Public Access settings to restrict accidental public exposure.
  • Set up alerts for anomalies: Use AWS Config, CloudWatch Alarms, or Site24x7 to detect unusual activities, such as sudden spikes in API requests, unauthorized access, or unexpected storage growth.

Conclusion

S3 is a staple of modern cloud infrastructures. However, despite its inherent fault-tolerance and reliability, it can sometimes run into issues that can be hard to debug. We hope the insights in this guide help you be better prepared to handle them the next time they arise.

For comprehensive monitoring and visibility into your bucket infrastructure, don’t forget to check out the S3 monitoring tool by Site24x7.
