Overview

Amazon Simple Storage Service (S3) is a cloud-based object storage service launched by Amazon Web Services (AWS) in 2006. It provides a highly scalable, durable, and available storage infrastructure designed to store and retrieve any amount of data from anywhere on the web, across various AWS regions. S3 operates on a RESTful API and is foundational for many cloud-native architectures, serving as a backend for applications, data lakes, content distribution, and backup solutions.

S3's design emphasizes durability and availability, offering multiple storage classes to address different access patterns and cost optimization strategies. For instance, S3 Standard is intended for frequently accessed data, while S3 Glacier Deep Archive is designed for long-term data retention with retrieval times measured in hours. This tiered approach allows users to manage costs effectively based on their data's lifecycle and access requirements. The service also incorporates features like versioning, lifecycle policies, and replication, which contribute to data protection and management efficiency.
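
As a rough illustration of how a lifecycle policy can express this tiering, the sketch below builds a rule that moves objects to S3 Glacier Deep Archive and later expires them. The helper name `build_archive_rule`, the `logs/` prefix, and the day counts are assumptions chosen for the example, not recommended values.

```python
def build_archive_rule(prefix, archive_after_days, expire_after_days):
    """Return one lifecycle rule that archives, then expires, objects under a prefix."""
    if expire_after_days <= archive_after_days:
        raise ValueError("expiration must come after the archive transition")
    return {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        # Move objects to Deep Archive once they reach the given age
        "Transitions": [
            {"Days": archive_after_days, "StorageClass": "DEEP_ARCHIVE"}
        ],
        # Delete objects once they reach the expiration age
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical policy: archive logs after 90 days, delete them after 2555 days
lifecycle_config = {"Rules": [build_archive_rule("logs/", 90, 2555)]}

# With credentials configured, the policy could be applied roughly like this:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-unique-example-bucket-12345",
#       LifecycleConfiguration=lifecycle_config,
#   )
```

Building the rule as a plain dictionary keeps it easy to review and test before it is sent to S3.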

S3 is particularly well-suited for scenarios requiring massive data scalability without significant upfront infrastructure investment. Use cases include building data lakes for analytics, hosting static websites, storing backups and disaster recovery archives, and serving as primary storage for cloud applications. Its integration with other AWS services, such as Amazon EC2, AWS Lambda, and Amazon Athena, extends its utility across a broad spectrum of computing and data processing workloads. For example, S3 can act as the source for Lambda functions triggered by object uploads, or as the data store for analytical queries performed by Athena. The breadth of its applications and its operational track record make it a significant component in many enterprise cloud strategies, as observed in discussions on forums like r/aws.
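
To make the upload-triggered Lambda case more concrete, here is a minimal handler sketch. It assumes the standard S3 event notification payload, in which object keys arrive URL-encoded; the helper name `extract_s3_objects` and the sample event are hypothetical and trimmed to the fields the code reads.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if not s3_info:
            continue  # skip records that did not come from S3
        bucket = s3_info["bucket"]["name"]
        # Keys are URL-encoded in the event, with spaces sent as '+'
        key = unquote_plus(s3_info["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def lambda_handler(event, context):
    """Entry point Lambda would call; here it only logs each uploaded object."""
    objects = extract_s3_objects(event)
    for bucket, key in objects:
        print(f"New object: s3://{bucket}/{key}")
    return {"processed": len(objects)}


# Hypothetical, heavily trimmed event for local experimentation
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-unique-example-bucket-12345"},
                "object": {"key": "uploads/my+report.csv"},
            },
        }
    ]
}
```

Keeping the parsing in a separate function makes it possible to exercise the logic locally, for example with `lambda_handler(sample_event, None)`, before wiring up a real trigger.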

Developers and technical buyers often choose S3 for its demonstrated reliability and extensive ecosystem. The service supports a wide array of compliance programs and regulatory frameworks, including SOC 1, SOC 2 Type II, and SOC 3, as well as HIPAA, PCI DSS, and GDPR, addressing stringent regulatory requirements. Comprehensive SDKs for languages like Python, Java, and JavaScript, along with a robust CLI, facilitate programmatic access and management, enabling automation of storage operations within CI/CD pipelines and operational scripts.

Key features

  • Object Storage: Stores data as objects within buckets, each identified by a unique key and addressable via a URL, supporting arbitrary data types.
  • Storage Classes: Offers various classes (e.g., S3 Standard, S3 Intelligent-Tiering, S3 Glacier) optimized for different access patterns and cost points, allowing for automatic or manual data lifecycle management.
  • Scalability and Durability: Designed for high durability (99.999999999%) and availability, automatically scaling to accommodate growing data volumes without requiring infrastructure provisioning.
  • Versioning: Preserves multiple versions of an object, protecting against accidental deletions or overwrites.
  • Lifecycle Management: Defines rules to automatically transition objects between storage classes or expire them after a specified period, optimizing storage costs.
  • Replication: Provides Cross-Region Replication (CRR) and Same-Region Replication (SRR) for automated, asynchronous copying of objects across buckets in different or the same AWS Regions.
  • Static Website Hosting: Supports hosting static websites directly from an S3 bucket, serving web content without managing web servers.
  • Security and Access Management: Integrates with AWS Identity and Access Management (IAM), bucket policies, Access Control Lists (ACLs), and encryption options (at rest and in transit) to secure data.
  • Event Notifications: Triggers notifications to AWS Lambda, Amazon SNS, or Amazon SQS in response to S3 object-level operations.
  • Data Lake Foundation: Commonly used as the storage layer for data lakes, allowing for scalable ingestion and processing of structured and unstructured data.
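
As one example of the security features listed above, the sketch below builds a bucket policy that denies requests not sent over TLS, which is one way to enforce encryption in transit. The function name `tls_only_policy` is an assumption for this example, and the bucket name is the same placeholder used elsewhere on this page.

```python
import json


def tls_only_policy(bucket_name):
    """Return a bucket policy document that denies any request made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # aws:SecureTransport is "false" for plain HTTP requests
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy_json = json.dumps(tls_only_policy("my-unique-example-bucket-12345"), indent=2)

# With credentials configured, the policy could be attached roughly like this:
#   import boto3
#   boto3.client("s3").put_bucket_policy(
#       Bucket="my-unique-example-bucket-12345", Policy=policy_json
#   )
```

An explicit Deny statement takes precedence over Allow statements in IAM and bucket policies, so this rule applies regardless of what other permissions grant.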

Pricing

AWS S3 pricing is based on a pay-as-you-go model, with costs varying depending on several factors:

  • Storage: The amount of data stored per month, which differs across storage classes (e.g., S3 Standard, S3 Intelligent-Tiering, S3 Glacier Deep Archive).
  • Requests: The number and type of requests made to your S3 objects and buckets (e.g., GET, PUT, COPY, POST, LIST).
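  • Data Transfer Out: Data transferred from S3 to the internet or to other AWS Regions. Transfers from S3 to Amazon CloudFront are generally not billed as S3 data transfer; CloudFront applies its own delivery charges.
  • Data Retrieval: For infrequent-access and archive storage classes, including S3 Glacier and S3 Glacier Deep Archive, retrieval costs vary based on speed and volume.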
  • Monitoring and Management: Features like S3 Storage Lens, S3 Inventory, and S3 Select incur additional costs.

Pricing is also region-dependent, with different rates applied in various AWS geographic regions.

Example S3 Standard Pricing (us-east-1, as of 2026-04-26)

Billing Component                                   Price (per GB or per 1,000 requests)
First 50 TB/Month (Storage)                         $0.023 per GB
Next 450 TB/Month (Storage)                         $0.022 per GB
Over 500 TB/Month (Storage)                         $0.021 per GB
PUT, COPY, POST, LIST Requests                      $0.005 per 1,000 requests
GET, SELECT, and all other Requests                 $0.0004 per 1,000 requests
Data Transfer Out to Internet (up to 10 TB/Month)   $0.090 per GB

For detailed and up-to-date pricing information across all storage classes and regions, refer to the official AWS S3 Pricing page.
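
To show how the billing components combine, the following sketch estimates a monthly S3 Standard bill from the example rates listed above. It assumes 1 TB equals 1,024 GB, ignores any free allowances and every charge not in that table, and uses a hypothetical function name, `estimate_monthly_cost`; treat the result as a rough illustration rather than a quote.

```python
# Example us-east-1 S3 Standard rates copied from the table above
STORAGE_TIERS = [
    (50 * 1024, 0.023),      # first 50 TB, price per GB
    (450 * 1024, 0.022),     # next 450 TB
    (float("inf"), 0.021),   # everything over 500 TB
]
PUT_PRICE_PER_1000 = 0.005   # PUT, COPY, POST, LIST
GET_PRICE_PER_1000 = 0.0004  # GET, SELECT, and all other requests
EGRESS_PRICE_PER_GB = 0.09   # internet egress, valid up to 10 TB/month
EGRESS_LIMIT_GB = 10 * 1024


def estimate_monthly_cost(stored_gb, put_requests=0, get_requests=0, egress_gb=0):
    """Estimate a monthly bill in USD from storage, requests, and egress."""
    if egress_gb > EGRESS_LIMIT_GB:
        raise ValueError("the example table only lists egress rates up to 10 TB")

    # Walk the storage tiers, charging each slice at its own rate
    storage_cost = 0.0
    remaining = stored_gb
    for tier_size_gb, price_per_gb in STORAGE_TIERS:
        portion = min(remaining, tier_size_gb)
        storage_cost += portion * price_per_gb
        remaining -= portion
        if remaining <= 0:
            break

    request_cost = (
        put_requests / 1000 * PUT_PRICE_PER_1000
        + get_requests / 1000 * GET_PRICE_PER_1000
    )
    egress_cost = egress_gb * EGRESS_PRICE_PER_GB
    return round(storage_cost + request_cost + egress_cost, 2)


# Hypothetical workload: 1,000 GB stored, 100k writes, 1M reads, 50 GB egress
example_cost = estimate_monthly_cost(1000, 100_000, 1_000_000, 50)
```

For that hypothetical workload the components are about $23.00 for storage, $0.90 for requests, and $4.50 for egress, so storage dominates.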

Common integrations

  • Amazon EC2: S3 is often used for storing application data, logs, and backups generated by Amazon EC2 instances.
  • AWS Lambda: S3 events (e.g., object creation, deletion) can trigger Lambda functions for serverless data processing. See Invoking Lambda functions using Amazon S3 events.
  • Amazon Redshift / Amazon Athena: S3 serves as the primary data lake storage for analytical queries performed by these services. Refer to Creating tables in Amazon Athena.
  • Amazon CloudFront: S3 buckets can act as origin servers for CloudFront distributions to accelerate content delivery globally. Detailed in What is Amazon CloudFront?.
  • AWS Backup: Integrates for centralized backup and recovery of S3 data and other AWS services. More information at What is AWS Backup?.
  • AWS Storage Gateway: Provides hybrid cloud storage by connecting on-premises applications to cloud storage in S3. See How AWS Storage Gateway works.
  • Amazon SageMaker: S3 is a common repository for training data, model artifacts, and results in machine learning workflows.

Alternatives

  • Google Cloud Storage: Google's object storage offering with various storage classes and global reach.
  • Azure Blob Storage: Microsoft Azure's solution for storing large amounts of unstructured object data.
  • Cloudflare R2: An object storage service that seeks to eliminate egress fees.

Getting started

To interact with Amazon S3 using the AWS SDK for Python (Boto3), you can perform operations like creating a bucket, uploading a file, and downloading a file. Ensure you have the boto3 library installed (pip install boto3) and your AWS credentials and default region configured. Bucket names are globally unique, so replace the example name with one of your own.


import boto3
import os

# Configure S3 client
s3 = boto3.client('s3')

bucket_name = 'my-unique-example-bucket-12345'
file_name = 'hello.txt'
object_name = 'path/to/hello.txt'
# Defined before the try block so the cleanup in `finally` can always reference it
download_name = 'downloaded_hello.txt'

# 1. Create a simple text file
with open(file_name, 'w') as f:
    f.write('Hello, CloudPicker S3 example!')

try:
    # 2. Create an S3 bucket
    # Outside us-east-1, S3 requires the region as a LocationConstraint
    region = s3.meta.region_name
    print(f"Creating bucket: {bucket_name}")
    if region in (None, 'us-east-1', 'aws-global'):
        s3.create_bucket(Bucket=bucket_name)
    else:
        s3.create_bucket(
            Bucket=bucket_name,
            CreateBucketConfiguration={'LocationConstraint': region},
        )
    print(f"Bucket '{bucket_name}' created successfully.")

    # 3. Upload the file to the bucket
    print(f"Uploading '{file_name}' to '{object_name}' in '{bucket_name}'")
    s3.upload_file(file_name, bucket_name, object_name)
    print("File uploaded successfully.")

    # 4. Download the file from the bucket
    print(f"Downloading '{object_name}' from '{bucket_name}' to '{download_name}'")
    s3.download_file(bucket_name, object_name, download_name)
    print(f"File downloaded successfully to '{download_name}'.")

    # 5. Read the content of the downloaded file
    with open(download_name, 'r') as f:
        content = f.read()
        print(f"Content of downloaded file: '{content}'")

except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Clean up: delete the created files and bucket
    if os.path.exists(file_name):
        os.remove(file_name)
    if os.path.exists(download_name):
        os.remove(download_name)
    
    try:
        # First, delete objects in the bucket
        response = s3.list_objects_v2(Bucket=bucket_name)
        if 'Contents' in response:
            for obj in response['Contents']:
                print(f"Deleting object: {obj['Key']}")
                s3.delete_object(Bucket=bucket_name, Key=obj['Key'])

        # Then, delete the bucket itself
        print(f"Deleting bucket: {bucket_name}")
        s3.delete_bucket(Bucket=bucket_name)
        print("Bucket deleted successfully.")
    except Exception as e:
        print(f"Error during cleanup: {e}")
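
A single list_objects_v2 call returns at most 1,000 keys, so the cleanup step above only covers small buckets. The sketch below shows one possible way to empty a larger bucket with a paginator and batched deletes. The function name `empty_bucket` is an assumption for this example; it takes any Boto3 S3 client and does not remove old object versions in versioned buckets.

```python
def empty_bucket(s3_client, bucket_name):
    """Delete every current object in a bucket and return how many were removed."""
    paginator = s3_client.get_paginator("list_objects_v2")
    deleted = 0
    # Each page holds up to 1,000 keys, which is also the delete_objects batch limit
    for page in paginator.paginate(Bucket=bucket_name):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if not keys:
            continue
        response = s3_client.delete_objects(
            Bucket=bucket_name,
            Delete={"Objects": keys, "Quiet": True},
        )
        # Surface partial failures instead of silently continuing
        errors = response.get("Errors", [])
        if errors:
            raise RuntimeError(f"Failed to delete {len(errors)} object(s): {errors[:3]}")
        deleted += len(keys)
    return deleted


# Usage, assuming credentials are configured and the bucket is yours to empty:
#   import boto3
#   s3 = boto3.client("s3")
#   empty_bucket(s3, "my-unique-example-bucket-12345")
#   s3.delete_bucket(Bucket="my-unique-example-bucket-12345")
```

Passing the client in as a parameter keeps the function easy to test with a stub and lets callers reuse an existing session.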