AWS Storage Services Cheat Sheet

S3
- provides key-value based object storage for the internet with unlimited total storage and an unlimited number of objects, each up to 5 TB in size
- offers an extremely durable, highly available, and infinitely scalable data storage infrastructure at very low costs.
- is object-level storage (not block-level storage), so it cannot host an OS or server-side dynamic websites (static sites can still call AWS services through the JavaScript SDK)
- provides durability by redundantly storing objects on multiple facilities within a region
- regularly verifies the integrity of data using checksums and provides the auto-healing capability
- S3 resources consist of globally unique buckets with objects and related metadata. The data model is a flat structure with no hierarchies or folders.
- As of March 2026, S3 stores more than 500 trillion objects and serves more than 200 million requests per second globally across hundreds of exabytes of data.
- S3 Replication enables automatic, asynchronous copying of objects across S3 buckets in the same AWS Region (SRR) or in different AWS Regions (CRR). Replication requires versioning to be enabled on both the source and destination buckets.
- S3 Transfer Acceleration helps speed data transport over long distances between a client and an S3 bucket using CloudFront edge locations.
- S3 supports cost-effective Static Website hosting with Client-side scripts.
- S3 CORS – Cross-Origin Resource Sharing allows cross-origin access to S3 resources.
- S3 Access Logs enables tracking access requests to an S3 bucket.
- S3 notification feature enables notifications to be triggered when certain events happen in the bucket.
- S3 Inventory helps manage the storage and can be used to audit and report on the replication and encryption status of the objects for business, compliance, and regulatory needs.
- Requester Pays lets the bucket owner specify that the requester, rather than the bucket owner, is charged for the request and the data download.
- S3 Batch Operations help perform large-scale batch operations on S3 objects and can perform a single operation on lists of specified S3 objects.
- Pre-Signed URLs can be shared for uploading/downloading objects for a limited time without requiring the recipient to have AWS security credentials.
- Multipart Uploads allow
  - parallel uploads with improved throughput and bandwidth utilization
  - fault tolerance and quick recovery from network issues
  - the ability to pause and resume uploads
  - beginning an upload before the final object size is known
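As a rough illustration of how a client might split an object for a multipart upload, the sketch below picks a part size and part count. The limits used (5 MiB minimum part size, 5 GiB maximum part size, 10,000 parts) are my understanding of the S3 multipart limits and are not stated in the list above; the maximum object size treats the 5 TB figure from this cheat sheet as 5 TiB. `plan_parts` is a hypothetical helper, not part of any AWS SDK.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

# Assumed S3 multipart limits (not taken from the cheat sheet itself).
MIN_PART_SIZE = 5 * MIB      # every part except the last must be at least this big
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000
MAX_OBJECT_SIZE = 5 * TIB    # the "5 TB" figure above, taken as 5 TiB


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating point."""
    return -(-a // b)


def plan_parts(object_size: int, preferred_part_size: int = 8 * MIB) -> tuple[int, int]:
    """Return (part_size, part_count) for an object of object_size bytes."""
    if not 0 < object_size <= MAX_OBJECT_SIZE:
        raise ValueError("object size must be between 1 byte and the maximum object size")
    # Start from the preferred size, never below the minimum part size.
    part_size = max(preferred_part_size, MIN_PART_SIZE)
    # Grow the part size if the object would otherwise need more than MAX_PARTS parts.
    part_size = max(part_size, ceil_div(object_size, MAX_PARTS))
    if part_size > MAX_PART_SIZE:
        raise ValueError("part size exceeds the maximum part size")
    return part_size, ceil_div(object_size, part_size)


if __name__ == "__main__":
    size, count = plan_parts(100 * MIB)
    print(f"100 MiB object: {count} parts of {size // MIB} MiB")
```

Each part can then be uploaded on its own thread and retried independently, which is where the throughput and fault-tolerance benefits listed above come from.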
- Versioning
- helps preserve, retrieve, and restore every version of every object
- protect from unintended overwrites and accidental deletions
- protects individual files but does NOT protect from Bucket deletion
- MFA Delete (Multi-Factor Authentication) can be enabled to require an additional authentication factor before object versions are permanently deleted.
- Integrates with CloudTrail, CloudWatch, and SNS for event notifications
- S3 Object Lock
- provides Write-Once-Read-Many (WORM) protection for S3 objects
- prevents objects from being deleted or overwritten for a fixed amount of time or indefinitely
- Governance Mode – users with specific IAM permissions can remove the lock
- Compliance Mode – no user, including the root account, can remove the lock until retention period expires
- supports Legal Hold which prevents object deletion indefinitely until explicitly removed
- requires versioning to be enabled on the bucket
- S3 Storage Classes
- S3 Standard
- default storage class, ideal for frequently accessed data
- 99.999999999% durability & 99.99% availability
- Low latency and high throughput performance
- designed to sustain the loss of data in two facilities
- S3 Intelligent-Tiering
- automatically moves data between access tiers based on access patterns with no retrieval charges
- includes Frequent Access (default), Infrequent Access (after 30 days, 40% lower cost), and Archive Instant Access (after 90 days, 68% lower cost) tiers
- optional Archive Access (90-730 days) and Deep Archive Access (180-730 days) tiers can be enabled
- 99.999999999% durability & 99.9% availability
- ideal for data with unknown or changing access patterns
- small monthly monitoring and automation charge per object; no retrieval charges
- S3 Express One Zone
- high-performance storage class launched in November 2023
- delivers up to 10x better performance than S3 Standard with consistent single-digit millisecond latency
- request costs up to 50% lower than S3 Standard
- uses directory buckets (a new bucket type) stored in a single Availability Zone
- supports up to 2 million requests per second per directory bucket
- ideal for ML training, interactive analytics, financial modeling, and real-time advertising
- allows co-locating storage and compute in the same AZ for optimal performance
- S3 Standard-Infrequent Access (S3 Standard-IA)
- optimized for long-lived and less frequently accessed data
- designed to sustain the loss of data in two facilities
- 99.999999999% durability & 99.9% availability
- suitable for objects greater than 128 KB kept for at least 30 days
- S3 One Zone-Infrequent Access (S3 One Zone-IA)
- optimized for rapid access, less frequently accessed data
- ideal for secondary backups and reproducible data
- stores data in a single AZ, data stored in this storage class will be lost in the event of AZ destruction.
- 99.999999999% durability & 99.5% availability
- S3 Reduced Redundancy Storage (Not Recommended)
- designed for noncritical, reproducible data stored at lower levels of redundancy than the STANDARD storage class
- reduces storage costs
- 99.99% durability & 99.99% availability
- designed to sustain the loss of data in a single facility
- S3 Glacier Instant Retrieval
- lowest-cost storage for long-lived data that is rarely accessed but requires milliseconds retrieval
- ideal for medical images, news media assets, or genomics data accessed once per quarter
- 99.999999999% durability & 99.9% availability
- Minimum storage duration of 90 days
- up to 68% lower cost than S3 Standard-IA
- S3 Glacier Flexible Retrieval (formerly S3 Glacier)
- suitable for low cost data archiving, where data access is infrequent
- provides retrieval time of minutes to hours
- Expedited – 1 to 5 minutes
- Standard – 3 to 5 hours
- Bulk – 5 to 12 hours (free)
- 99.999999999% durability & 99.99% availability
- Minimum storage duration of 90 days
- S3 Glacier Deep Archive
- provides lowest cost data archiving, where data access is infrequent
- 99.999999999% durability & 99.99% availability
- provides retrieval time of several (12-48) hours
- Standard – 12 hours
- Bulk – 48 hours
- Minimum storage duration of 180 days
- supports long-term retention and digital preservation for data that may be accessed once or twice a year
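To make the durability and availability percentages above more concrete, here is a small worked calculation. It assumes the durability figure is read as an average annual per-object survival probability and that availability is measured over a 365-day year; both readings are my simplifying assumptions, and the function names are hypothetical.

```python
from decimal import Decimal

MINUTES_PER_YEAR = Decimal(365 * 24 * 60)


def years_per_lost_object(durability_percent: str, object_count: int) -> Decimal:
    """Average number of years between single-object losses for a given fleet size."""
    annual_loss_rate = 1 - Decimal(durability_percent) / 100
    expected_losses_per_year = annual_loss_rate * object_count
    return 1 / expected_losses_per_year


def max_downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Minutes per year a service may be unavailable at the given availability."""
    return (1 - Decimal(availability_percent) / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Eleven nines of durability with ten million stored objects.
    print(years_per_lost_object("99.999999999", 10_000_000))
    for availability in ("99.99", "99.9", "99.5"):
        print(availability, max_downtime_minutes_per_year(availability))
```

`Decimal` is used instead of floats because subtracting eleven nines from 1 in binary floating point introduces visible rounding error.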
- Lifecycle Management policies
- transition to move objects to different storage classes and Glacier
- expiration to remove objects and object versions
- can be applied to both current and noncurrent object versions when versioning is enabled.
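The transition and expiration actions described above are usually expressed as a lifecycle configuration document. The sketch below builds one as a plain Python dictionary; to my understanding the shape matches what the S3 lifecycle API accepts, but the helper name, the rule ID format, and the `logs/` prefix are hypothetical choices for illustration.

```python
import json


def build_lifecycle_configuration(prefix, transitions, expire_after_days,
                                  noncurrent_expire_after_days=None):
    """Build a lifecycle configuration with a single rule.

    transitions is a list of (days_after_creation, storage_class) pairs.
    """
    days_seen = [days for days, _ in transitions]
    if days_seen != sorted(days_seen):
        raise ValueError("transitions must be listed in increasing day order")
    if days_seen and expire_after_days <= days_seen[-1]:
        raise ValueError("expiration must come after the last transition")

    rule = {
        "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        # Transition actions: move current versions to cheaper storage classes.
        "Transitions": [{"Days": d, "StorageClass": sc} for d, sc in transitions],
        # Expiration action: remove current versions after this many days.
        "Expiration": {"Days": expire_after_days},
    }
    if noncurrent_expire_after_days is not None:
        # Only meaningful when versioning is enabled on the bucket.
        rule["NoncurrentVersionExpiration"] = {
            "NoncurrentDays": noncurrent_expire_after_days
        }
    return {"Rules": [rule]}


config = build_lifecycle_configuration(
    "logs/", [(30, "STANDARD_IA"), (90, "GLACIER")], 365, 30
)
print(json.dumps(config, indent=2))
```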
- Data Consistency Model
- provides strong read-after-write consistency for PUT and DELETE requests of objects in the S3 bucket in all AWS Regions
- updates to a single key are atomic
- S3 Security
- IAM policies – grant users within your own AWS account permission to access S3 resources
- Bucket and Object ACL – grant other AWS accounts (not specific users) access to S3 resources
- Bucket policies – allow or deny permissions across some or all of the objects within a single bucket
- S3 Access Points simplify data access for any AWS service or customer application that stores data in S3.
- S3 Glacier Vault Lock helps deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy.
- S3 VPC Gateway Endpoint enables private connections between a VPC and S3, without requiring that you use an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
- supports SSL/TLS encryption of data in transit and encryption of data at rest
- S3 Block Public Access – provides settings to block public access at the account and bucket level (enabled by default on new buckets)
- SSE-C disabled by default – as of April 2026, Server-Side Encryption with Customer-Provided Keys (SSE-C) is disabled by default on all new general purpose buckets for enhanced security
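As an example of how a bucket policy can enforce one of the controls above, the sketch below generates a policy that denies any request not sent over TLS, using the `aws:SecureTransport` condition key. The bucket name is a placeholder and the helper function is hypothetical; treat this as a starting point rather than a complete security policy.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Return a bucket policy that rejects requests made without TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# "example-bucket" is a placeholder name.
policy = deny_insecure_transport_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

An explicit Deny takes precedence over any Allow granted through IAM policies or ACLs, which is why this pattern works regardless of how access was granted.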
- S3 Data Encryption
- supports data at rest and data in transit encryption
- All new objects are encrypted by default with SSE-S3 (Amazon S3-managed keys)
- Server-Side Encryption
- SSE-S3 – encrypts S3 objects using keys handled & managed by AWS (default)
- SSE-KMS – leverage AWS Key Management Service to manage encryption keys. KMS provides control and audit trail over the keys.
- SSE-C – when you want to manage your own encryption keys. AWS does not store the encryption key. Requires HTTPS. Disabled by default on new buckets since April 2026.
- DSSE-KMS – Dual-layer Server-Side Encryption with KMS keys, provides two layers of encryption for compliance requirements
- Client-Side Encryption
- Client library such as the S3 Encryption Client
- Clients must encrypt data themselves before sending it to S3
- Clients must decrypt data themselves when retrieving from S3
- Customer fully manages the keys and encryption cycle
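The server-side encryption options above are selected per request through HTTP headers. The sketch below shows the header values as I understand them from the S3 REST API, including how an SSE-C key is encoded; the helper name and the mapping variable are hypothetical, and real SDKs normally build these headers for you.

```python
import base64
import hashlib
import os

# Header values for the server-managed options (assumed from the S3 REST API).
SERVER_MANAGED_HEADERS = {
    "SSE-S3": {"x-amz-server-side-encryption": "AES256"},
    "SSE-KMS": {"x-amz-server-side-encryption": "aws:kms"},
    "DSSE-KMS": {"x-amz-server-side-encryption": "aws:kms:dsse"},
}


def sse_c_headers(key: bytes) -> dict[str, str]:
    """Build the SSE-C headers for a customer-provided 256-bit key."""
    if len(key) != 32:
        raise ValueError("SSE-C requires a 32-byte (256-bit) key")
    key_md5 = hashlib.md5(key).digest()  # integrity check of the key, not a security measure
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(key_md5).decode("ascii"),
    }


if __name__ == "__main__":
    headers = sse_c_headers(os.urandom(32))
    print(sorted(headers))
```

Because the key travels in a request header, SSE-C only makes sense over HTTPS, which matches the requirement noted in the list above.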
- S3 Best Practices
- use parallel threads and Multipart upload for faster writes
- use parallel threads and Range Header GET for faster reads
- for list operations with a large number of objects, it’s better to build a secondary index in DynamoDB
- use Versioning to protect from unintended overwrites and deletions, but this does not protect against bucket deletion
- use VPC endpoints for S3 to transfer data between a VPC and S3 over the Amazon internal network
- use S3 Object Lock for WORM compliance and ransomware protection
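The parallel-read practice above relies on splitting an object into byte ranges, each fetched with its own `Range` header. A minimal sketch of the range arithmetic, with a hypothetical helper name and an arbitrary chunk size:

```python
def byte_ranges(object_size: int, chunk_size: int) -> list[str]:
    """Return HTTP Range header values that together cover the whole object."""
    if object_size <= 0 or chunk_size <= 0:
        raise ValueError("object_size and chunk_size must be positive")
    ranges = []
    for start in range(0, object_size, chunk_size):
        # Range end offsets are inclusive, hence the "- 1".
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


if __name__ == "__main__":
    for header in byte_ranges(25, 10):
        print(header)
```

Each value can be sent on a separate GET request from its own thread, and the responses are concatenated in order to rebuild the object.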
S3 Bucket Types
- General Purpose Buckets – traditional S3 buckets for most workloads with flat storage namespace
- Directory Buckets – used with S3 Express One Zone storage class, organized with a hierarchical directory structure for low-latency workloads
- Table Buckets – purpose-built for storing tabular data in Apache Iceberg format (launched December 2024), with automatic compaction, snapshot management, and garbage collection
- Vector Buckets – optimized for durable, low-cost vector storage for AI embeddings (GA December 2025), supports up to 2 billion vectors per index with dedicated APIs for storing, accessing, and querying vectors
S3 Files (2026)
- provides fully-featured, high-performance NFS file system access to S3 data
- first cloud object store to provide full file system semantics without data ever leaving S3
- enables accessing S3 objects using file-based protocols for applications requiring file system interfaces
Instance Store
- provides temporary or ephemeral block-level storage for an EC2 instance
- is physically attached to the Instance
- deliver very high random I/O performance, which is a good option when storage with very low latency is needed
- cannot be dynamically resized
- data persists when an instance is rebooted
- data does not persist if
  - the underlying disk drive fails
  - the instance stops, i.e. an EBS-backed instance with instance store volumes attached is stopped
  - the instance terminates
- can be attached to an EC2 instance only when the instance is launched
- is ideal for the temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.
EBS
- is virtual network-attached block storage
- provides highly available, reliable, durable, block-level storage volumes that can be attached to a running instance
- provides high durability and are redundant in an AZ, as the data is automatically replicated within that AZ to prevent data loss due to any single hardware component failure
- persists and is independent of EC2 lifecycle
- multiple volumes can be attached to a single EC2 instance
- can be detached & attached to another EC2 instance in that same AZ only
- volumes are Zonal i.e. created in a specific AZ and CAN’T span across AZs
- snapshots
- to make a volume available in a different AZ, create a snapshot of the volume and restore it to a new volume in any AZ within the region
- to make a volume available in a different Region, copy the snapshot of the volume to that Region and restore it as a volume there
- Multi-Attach enables attaching a single Provisioned IOPS SSD (io1 or io2) volume to multiple instances that are in the same AZ.
- EBS Volume Types:
- General Purpose SSD (gp3) – default and recommended for most workloads
- baseline 3,000 IOPS and 125 MiB/s throughput included (independent of volume size)
- as of September 2025, supports up to 64 TiB (4x previous 16 TiB), 80,000 IOPS (5x previous 16,000), and 2,000 MiB/s throughput (2x previous 1,000 MiB/s)
- 99.9% durability
- 20% lower cost than gp2 with ability to independently provision IOPS and throughput
- General Purpose SSD (gp2) – legacy, still supported
- IOPS scales with volume size (3 IOPS per GiB), up to 16,000 IOPS
- suitable for boot volumes, dev/test environments
- recommended to migrate to gp3 for cost savings
- Provisioned IOPS SSD (io2 Block Express) – highest performance
- up to 256,000 IOPS, 4,000 MiB/s throughput, 64 TiB volume size
- 99.999% durability (100x higher than io1)
- sub-millisecond latency
- 1,000 IOPS per GiB ratio (20x higher than io1)
- supports Multi-Attach
- same price as io1, recommended as replacement
- available in all commercial and GovCloud regions (2025)
- Provisioned IOPS SSD (io1) – legacy, being superseded by io2
- up to 64,000 IOPS, 50 IOPS per GiB
- 99.9% durability
- recommended to upgrade to io2 Block Express for better performance at same cost
- Throughput Optimized HDD (st1)
- low-cost HDD for frequently accessed, throughput-intensive workloads
- big data, data warehouses, log processing
- max throughput 500 MiB/s, max IOPS 500
- cannot be a boot volume
- Cold HDD (sc1)
- lowest cost HDD for less frequently accessed workloads
- max throughput 250 MiB/s, max IOPS 250
- cannot be a boot volume
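The IOPS-per-GiB ratios listed for the SSD volume types translate into simple arithmetic. The sketch below applies the ratios and caps from the list above; the 100 IOPS floor for gp2 is from my recollection of the EBS documentation and is not stated in this cheat sheet, and the function names are hypothetical.

```python
GP2_MIN_IOPS = 100  # assumed floor for small gp2 volumes (not from the list above)


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2: 3 IOPS per GiB, capped at 16,000."""
    return min(16_000, max(GP2_MIN_IOPS, 3 * size_gib))


def gp3_baseline_iops(size_gib: int) -> int:
    """gp3: 3,000 IOPS baseline regardless of volume size."""
    return 3_000


def io1_max_iops(size_gib: int) -> int:
    """io1: up to 50 IOPS per GiB, capped at 64,000."""
    return min(64_000, 50 * size_gib)


def io2_max_iops(size_gib: int) -> int:
    """io2 Block Express: up to 1,000 IOPS per GiB, capped at 256,000."""
    return min(256_000, 1_000 * size_gib)


if __name__ == "__main__":
    for size in (10, 100, 1_000, 6_000):
        print(size, gp2_baseline_iops(size), gp3_baseline_iops(size),
              io1_max_iops(size), io2_max_iops(size))
```

The comparison shows why gp3 is attractive for small volumes: a 100 GiB gp2 volume gets a 300 IOPS baseline, while gp3 starts at 3,000 at any size.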
EBS Encryption
- allows encryption of volumes using the EBS encryption feature.
- All data stored at rest, disk I/O, and snapshots created from the volume are encrypted.
- uses the AES-256 algorithm with keys managed through AWS KMS (the AWS managed key by default, or a customer managed key)
- Snapshots of encrypted EBS volumes are automatically encrypted.
- EBS encryption by default can be enabled at the account level for all new volumes
EBS Snapshots
- help create backups of EBS volumes
- are incremental
- occur asynchronously
- are regional and CANNOT span across regions
- can be copied across regions to make it easier to leverage multiple regions for geographical expansion, data center migration, and disaster recovery
- can be shared by making them public or with specific AWS accounts by modifying the access permissions of the snapshots
- support EBS encryption
- Snapshots of encrypted volumes are automatically encrypted
- Volumes created from encrypted snapshots are automatically encrypted
- All data in flight between the instance and the volume is encrypted
- Volumes created from an unencrypted snapshot that you own or have access to can be encrypted on the fly.
- An encrypted snapshot that you own or have access to can be re-encrypted with a different key during the copy process.
- can be automated using AWS Data Lifecycle Manager (DLM)
- EBS Snapshots Archive – move rarely-accessed snapshots to a low-cost archive tier (up to 75% cheaper), with retrieval taking 24-72 hours
- Recycle Bin – protects against accidental deletion by retaining deleted snapshots for a configurable retention period
EFS
- fully-managed, easy to set up, scale, and cost-optimize file storage
- can automatically scale from gigabytes to petabytes of data without needing to provision storage
- provides managed NFS (network file system) that can be mounted on and accessed by multiple EC2 in multiple AZs simultaneously
- highly durable, highly scalable and highly available.
- stores data redundantly across multiple Availability Zones
- grows and shrinks automatically as files are added and removed, so there is no need to manage storage procurement or provisioning.
- uses the Network File System version 4 (NFS v4) protocol
- is compatible with all Linux-based AMIs for EC2 and provides a POSIX file system with a standard file API
- does not support Windows AMI (use FSx for Windows instead)
- offers the ability to encrypt data at rest using KMS and in transit.
- can be accessed from on-premises using an AWS Direct Connect or AWS VPN connection between the on-premises datacenter and VPC.
- can be accessed concurrently from servers in the on-premises datacenter as well as EC2 instances in the Amazon VPC
- supports up to 10,000 access points per file system (10x increase from previous 1,000 limit, February 2025)
- Performance
- Elastic Throughput (recommended) – automatically scales throughput up or down based on workload
- up to 60 GiB/s read and 10 GiB/s write throughput (October 2024 increase)
- Provisioned Throughput – specify throughput independent of storage
- Bursting Throughput – scales with file system size
- supports up to 2.5 million read IOPS and 500,000 write IOPS per file system (November 2024, 10x increase)
- Storage Classes
- EFS Standard – for frequently accessed files, multi-AZ redundancy
- EFS Standard-IA (Infrequent Access) – lower cost for infrequently accessed files, multi-AZ redundancy
- EFS One Zone – single-AZ, lower cost for frequently accessed data
- EFS One Zone-IA – single-AZ, lowest cost for infrequent access
- Lifecycle Management automatically moves data between storage classes based on access patterns
- EFS Replication – enables automatic replication of file systems to another AWS Region or within the same Region for disaster recovery
- EFS is a shared POSIX system for Linux systems and does not work for Windows
Amazon FSx for Windows File Server
- is a fully managed, highly reliable, and scalable Windows file system share drive
- supports SMB protocol & Windows NTFS
- supports Microsoft Active Directory integration, ACLs, user quotas
- built on SSD, scales up to tens of GB/s of throughput, millions of IOPS, and hundreds of PB of data
- is accessible from Windows, Linux, and macOS compute instances
- can be accessed from on-premises infrastructure
- can be configured to be Multi-AZ (high availability)
- supports encryption of data at rest and in transit
- provides data deduplication, which enables further cost optimization by removing redundant data.
- data is backed-up daily to S3
Amazon FSx for Lustre
- provides an easy and cost-effective way to launch and run Lustre, the world’s most popular high-performance file system.
- is a type of parallel distributed file system, for large-scale computing
- Lustre is derived from “Linux” and “cluster”
- suited to Machine Learning and High Performance Computing (HPC), e.g. Video Processing, Financial Modeling, Electronic Design Automation
- scales up to 100s GB/s, millions of IOPS, sub-ms latencies
- seamless integration with S3, it transparently presents S3 objects as files and allows you to write changed data back to S3.
- can “read S3” as a file system (through FSx)
- can write the output of the computations back to S3 (through FSx)
- supports encryption of data at rest and in transit
- can be used from on-premises servers
Amazon FSx for NetApp ONTAP
- fully managed shared storage built on NetApp’s popular ONTAP file system
- supports NFS, SMB, and iSCSI protocols — accessible from Linux, Windows, and macOS
- provides enterprise features: snapshots, cloning, replication, compression, deduplication, and tiering
- supports Multi-AZ deployments for high availability
- ideal for migrating on-premises NetApp/NAS workloads to AWS
- second-generation file systems (July 2024) deliver up to 6 GBps throughput per HA pair
- supports S3 Access Points (2025) — access file data through S3 APIs for AI/ML and analytics workloads without moving data
- supports Autonomous Ransomware Protection (ARP) (April 2025) — detects unusual activity and generates automatic snapshots
- can be accessed from on-premises via Direct Connect or VPN
Amazon FSx for OpenZFS
- fully managed shared file storage built on the OpenZFS file system
- supports NFS protocol (v3, v4, v4.1, v4.2)
- delivers up to 1 million IOPS with sub-millisecond latencies
- provides data management capabilities: snapshots, cloning, compression
- ideal for migrating Linux-based file servers and applications to AWS
- supports S3 Access Points (2025) — seamless access to file data through S3 APIs
- accessible from Linux, Windows, and macOS compute instances
CloudFront
- provides low latency and high data transfer speeds for distribution of static, dynamic web or streaming content to web users
- delivers the content through a worldwide network of data centers called Edge Locations (700+ locations globally)
- keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.
- dramatically reduces the number of network hops that users’ requests must pass through
- supports multiple origin server options, such as AWS-hosted services (e.g. S3, EC2, ELB) or an on-premises server, which store the original, definitive version of the objects
- a single distribution can have multiple origins, and the path pattern in a cache behavior determines which requests are routed to which origin
- supports Web distribution for static and dynamic web content, on-demand video using progressive download and HLS, and live streaming video content
- RTMP Streaming distribution was deprecated and removed on December 31, 2020
- supports HTTPS using either
  - a dedicated IP address, which is expensive because a dedicated IP address is assigned to each CloudFront edge location
  - Server Name Indication (SNI), which is free but works only with browsers and clients that send the domain name in the TLS handshake
- For an end-to-end HTTPS connection,
  - Viewers -> CloudFront needs a certificate issued by a trusted CA or by ACM (self-signed certificates are not accepted)
  - CloudFront -> Origin needs a certificate issued by ACM for ELB and by a trusted CA for other origins
- Security
- Origin Access Control (OAC) is the recommended method to restrict S3 origin access to CloudFront only. OAC supports SSE-KMS, all S3 bucket types, and dynamic requests (PUT/DELETE).
- Origin Access Identity (OAI) is legacy and deprecated for new distributions as of March 2026. Migrate to OAC.
- VPC Origins (November 2024) – enables CloudFront to connect directly to ALBs, NLBs, or EC2 instances in private subnets, making CloudFront the single point of entry without exposing origins to the internet
- supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
- Signed URLs
- to restrict access to individual files, for e.g., an installation download for your application.
- users using a client, for e.g. a custom HTTP client, that doesn’t support cookies
- Signed Cookies
- provide access to multiple restricted files, for e.g., video part files in HLS format or all of the files in the subscribers’ area of a website.
- don’t want to change the current URLs
- integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
- integrates with AWS Shield (Standard included free) for DDoS protection
- Edge Compute
- CloudFront Functions – lightweight functions executing at 700+ edge locations with sub-millisecond startup, for simple request/response manipulations (URL redirects, header manipulation, cache key normalization)
- Lambda@Edge – runs at 13 Regional Edge Caches, supports longer execution (up to 30 seconds), network access, and larger packages for complex logic
- CloudFront KeyValueStore (2023) – globally distributed low-latency data store for CloudFront Functions, enabling data lookups without network calls (A/B testing, feature flags, geo-routing)
- Connection Functions (November 2025) – functions for mutual TLS (mTLS) viewer authentication
- supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
- only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
- does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
- object removal from cache
- objects are removed from the cache when they expire (TTL, 24 hours by default)
- objects can be invalidated explicitly, which has a cost; viewers might still see the old version until it expires from browser or intermediate caches
- use versioned object names to serve a different version without waiting for expiry
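One common way to apply the versioned-name approach above is to embed a content fingerprint in the object key, so every change produces a new URL and no invalidation is needed. This is a sketch of that general technique, not something CloudFront provides itself; the helper name and the 8-character fingerprint length are arbitrary choices.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_key(key: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension of an object key."""
    path = PurePosixPath(key)
    fingerprint = hashlib.sha256(content).hexdigest()[:digest_len]
    # "css/site.css" becomes "css/site.<fingerprint>.css"
    return str(path.with_name(f"{path.stem}.{fingerprint}{path.suffix}"))


if __name__ == "__main__":
    print(versioned_key("css/site.css", b"body { color: black; }"))
    print(versioned_key("css/site.css", b"body { color: navy; }"))
```

Pages that reference the asset are updated to point at the new key, and the old object simply ages out of the caches.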
- supports adding or modifying custom headers before the request is sent to the origin, which can be used to
  - validate that the user is accessing the content through the CDN
  - identify the CloudFront distribution that forwarded the request, when multiple distributions are in use
  - return the Access-Control-Allow-Origin header for every request, for viewers that do not support CORS
- supports Partial GET requests using range header to download object in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
- supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
- supports different price classes: one that includes all regions, one that includes only the least expensive regions, and one that excludes only the most expensive regions
- CloudFront Pricing Plans (2025) – flat-rate plans (Free, Pro $15/mo, Business $200/mo, Premium $1000/mo) combining CDN, WAF, DDoS protection, bot management, Route 53, and S3 credits into predictable monthly pricing
- Origin Shield – additional caching layer between edge locations and origin that reduces origin load and improves cache hit ratios
- Continuous Deployment – enables safe deployment of CloudFront configuration changes using staging distributions for testing with a subset of traffic
- supports access logs which contain detailed information about every user request
AWS Import/Export & Data Transfer
⚠️ AWS Import/Export Disk is a legacy service and has been superseded by the AWS Snow Family. AWS Snow Family devices (Snowball Edge) are no longer available to new customers as of November 7, 2025.
Alternatives for new customers:
- AWS DataSync — for online data transfers
- AWS Data Transfer Terminal — for secure physical transfers
- AWS Partner solutions — for specialized migration needs
- AWS Outposts — for edge computing needs
AWS Snow Family (Existing Customers Only)
- physical devices for transferring large amounts of data into and out of AWS
- Snowball Edge Storage Optimized – 80 TB usable storage, 40 vCPUs
- Snowball Edge Compute Optimized – 28 TB usable storage, 104 vCPUs, optional GPU
- suitable for large-scale data migrations, disaster recovery, and edge computing
- supports S3-compatible storage and EC2 compute instances at the edge
- No longer available to new customers as of November 7, 2025
AWS Data Transfer Terminal (2024)
- secure, physical locations where customers bring their storage devices for high-speed data transfer to/from AWS
- provides at least two 100 Gigabit Ethernet (100 GbE) ports per terminal
- supports transfer to Amazon S3, EFS, and other AWS endpoints
- available in multiple locations globally (US, Europe, etc.)
- reservation-based model — book date and time through AWS Console
- ideal replacement for Snow Family for physical data transfer use cases
- charges based on number of ports used during reservation (per port-hour)
AWS DataSync
- online data transfer service that simplifies, automates, and accelerates moving data between on-premises storage and AWS
- supports transfer to/from S3, EFS, FSx, and between AWS storage services
- automatically handles many transfer tasks: network optimization, data integrity validation, encryption
- can transfer up to 10 Gbps over a Direct Connect link
- recommended alternative to Snow Family for online transfers
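When choosing between an online transfer and a physical one, a back-of-the-envelope transfer time estimate is often enough. The sketch below uses the 10 Gbps and 100 GbE figures mentioned above; the `utilization` parameter is my own simplification for protocol overhead and link sharing, and the function name is hypothetical.

```python
def transfer_hours(data_bytes: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Estimate hours needed to move data_bytes over a link of link_gbps."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    effective_bits_per_second = link_gbps * 1e9 * utilization
    seconds = data_bytes * 8 / effective_bits_per_second
    return seconds / 3600


if __name__ == "__main__":
    hundred_tb = 100e12  # 100 TB in bytes (decimal units)
    print(f"10 Gbps:  {transfer_hours(hundred_tb, 10):.1f} h")
    print(f"100 Gbps: {transfer_hours(hundred_tb, 100):.1f} h")
    print(f"10 Gbps at 70% utilization: {transfer_hours(hundred_tb, 10, 0.7):.1f} h")
```

Real transfers are also limited by source storage throughput and file counts, so the result is best read as a lower bound.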