AWS FSx for Lustre

  • FSx for Lustre is a fully managed service that makes it easy and cost-effective to launch and run Lustre, the world’s most popular high-performance file system.
  • FSx for Lustre is built on the open-source Lustre file system designed for applications that require fast storage, where the storage needs to keep up with the compute.
  • handles the traditional complexity of setting up and managing high-performance Lustre file systems.
  • is POSIX-compliant and can be used with existing Linux-based applications without having to make any changes.
  • provides a native file system interface and works as any file system does with the Linux operating system.
  • provides read-after-write consistency and supports file locking.
  • is compatible with the most popular Linux-based AMIs, including Amazon Linux, Amazon Linux 2, Amazon Linux 2023, Red Hat Enterprise Linux (RHEL), CentOS, SUSE Linux, and Ubuntu.
  • is accessible from compute workloads running on EC2 instances and containers running on Amazon EKS, and from on-premises servers.
  • can be accessed from a Linux instance by installing the open-source Lustre client and mounting the file system using standard Linux commands.
  • is ideal for use cases where speed matters, such as machine learning (including AI/ML training), high-performance computing (HPC), video processing, financial modelling, genome sequencing, and electronic design automation (EDA).
  • delivers the fastest storage performance for GPU instances in the cloud with up to 1,200 Gbps per-client throughput using Elastic Fabric Adapter (EFA) and NVIDIA GPUDirect Storage (GDS).
  • delivers virtually unlimited storage capacity, millions of IOPS, up to terabytes per second of throughput, and sub-millisecond latencies.
  • supports Lustre LTS versions 2.10, 2.12, and 2.15, with in-place version upgrades supported.
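As a rough illustration of the "standard Linux commands" mentioned above, the following sketch assembles a mount command for a file system. The DNS name and mount name are hypothetical placeholders, and the `relatime,flock` options are an assumption based on commonly documented usage; check the console's "Attach" instructions for the exact command for a given file system.

```python
import shlex

def lustre_mount_command(dns_name: str, mount_name: str, mount_point: str = "/fsx") -> str:
    """Build a mount command string for an FSx for Lustre file system.

    Assumes the Lustre client is already installed on the instance.
    """
    source = f"{dns_name}@tcp:/{mount_name}"
    args = ["sudo", "mount", "-t", "lustre", "-o", "relatime,flock", source, mount_point]
    # Quote each argument so the string is safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in args)

# Hypothetical file system DNS name and mount name, for illustration only.
print(lustre_mount_command("fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com", "abcdefgh"))
```

The function only builds the string; running it still requires root privileges and an existing mount point directory.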

FSx for Lustre Deployment Options

  • FSx for Lustre provides two file system deployment options: Scratch and Persistent.

Scratch file systems

  • designed for temporary storage and short-term processing of data.
  • provide high burst throughput of up to six times the baseline throughput of 200 MBps per TiB of storage capacity.
  • data is not replicated and does not persist if a file server fails.
  • ideal for cost-optimized storage for short-term, processing-heavy workloads.
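The burst figure above is simple arithmetic. A minimal sketch, assuming the 200 MBps per TiB baseline and the six-times burst multiplier quoted in the text (the function name is illustrative only):

```python
BASELINE_MBPS_PER_TIB = 200  # baseline figure quoted in the text
BURST_MULTIPLIER = 6         # "up to six times the baseline"

def scratch_throughput_mbps(storage_tib: int) -> tuple[int, int]:
    """Return (baseline, maximum burst) throughput in MBps for a given capacity."""
    if storage_tib <= 0:
        raise ValueError("storage_tib must be positive")
    baseline = storage_tib * BASELINE_MBPS_PER_TIB
    return baseline, baseline * BURST_MULTIPLIER

for tib in (1, 10, 100):
    baseline, burst = scratch_throughput_mbps(tib)
    print(f"{tib} TiB: baseline {baseline} MBps, burst up to {burst} MBps")
```

Burst throughput is an upper bound, so sustained workloads should be sized against the baseline figure.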

Persistent file systems

  • designed for long-term storage and workloads.
  • are highly available, with data automatically replicated within the AZ that is associated with the file system.
  • data volumes attached to the file servers are replicated independently from the file servers to which they are attached.
  • if a file server becomes unavailable, it is replaced automatically within minutes of failure.
  • are continuously monitored for hardware failures, and infrastructure components are automatically replaced in the event of a failure.
  • ideal for workloads that run for extended periods or indefinitely, and that might be sensitive to disruptions in availability.
  • Persistent-2 file systems are the latest generation, built on AWS Graviton processors, providing higher throughput per TiB (up to 1 GB/s per TiB) and lower cost of throughput compared to previous generation file systems.
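To make the Persistent-2 options concrete, here is a hedged sketch that builds the parameter dictionary one might pass to the FSx `CreateFileSystem` API (for example via boto3's `create_file_system`). The allowed throughput tiers and capacity increments in the code are assumptions drawn from the public API documentation and may change; the subnet ID is a hypothetical placeholder.

```python
# Assumed Persistent-2 SSD throughput tiers, in MB/s per TiB of storage.
PERSISTENT_2_THROUGHPUT_TIERS = (125, 250, 500, 1000)

def persistent_2_request(storage_gib: int, per_unit_throughput: int,
                         subnet_id: str, compress: bool = True) -> dict:
    """Build CreateFileSystem parameters for a Persistent-2 Lustre file system."""
    if per_unit_throughput not in PERSISTENT_2_THROUGHPUT_TIERS:
        raise ValueError(f"throughput must be one of {PERSISTENT_2_THROUGHPUT_TIERS}")
    # Assumed capacity rule: 1200 GiB, or a positive multiple of 2400 GiB.
    if storage_gib <= 0 or (storage_gib != 1200 and storage_gib % 2400 != 0):
        raise ValueError("storage must be 1200 GiB or a multiple of 2400 GiB")
    return {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": storage_gib,
        "SubnetIds": [subnet_id],
        "LustreConfiguration": {
            "DeploymentType": "PERSISTENT_2",
            "PerUnitStorageThroughput": per_unit_throughput,
            "DataCompressionType": "LZ4" if compress else "NONE",
        },
    }

# Hypothetical subnet ID; nothing is sent to AWS here.
print(persistent_2_request(4800, 1000, "subnet-0123456789abcdef0"))
```

The dictionary is only constructed and printed, so the sketch runs without AWS credentials.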

Figure: FSx for Lustre - Scratch vs Persistent

FSx for Lustre Storage Classes

  • FSx for Lustre provides three storage classes: SSD, Intelligent-Tiering, and HDD.

SSD Storage Class

  • delivers consistent sub-millisecond latencies for the entire dataset.
  • ideal for latency-sensitive workloads that require all-flash performance.
  • available with both scratch and persistent deployment types.

Intelligent-Tiering Storage Class (New – 2025)

  • launched in May 2025, delivers virtually unlimited scalability, fully elastic Lustre file storage, and the lowest-cost Lustre file storage in the cloud.
  • automatically scales storage up and down based on access patterns — pay only for what you use.
  • automatically tiers data between three access tiers:
    • Frequent Access tier for actively used data.
    • Infrequent Access tier for less frequently accessed data.
    • Archive Instant Access tier for rarely accessed data.
  • offers an optional SSD read cache that delivers SSD-level performance at HDD pricing for latency-sensitive workloads.
  • delivers up to 34% better price-performance compared to on-premises HDD file storage.
  • delivers up to 70% better price-performance compared to other cloud-based Lustre storage.
  • starting at less than $0.005 per GB-month.
  • optimized for HDD-based or mixed HDD/SSD workloads with a mix of hot and cold data.
  • ideal for workloads like weather forecasting, seismic imaging, genomic analysis, and ADAS training.

HDD Storage Class

  • provides lower-cost storage for throughput-oriented workloads that don’t require sub-millisecond latencies.
  • suitable for workloads with large sequential I/O patterns.

FSx for Lustre Performance

  • FSx for Lustre file systems scale to terabytes per second of throughput and millions of IOPS.
  • supports concurrent access to the same file or directory from thousands of compute instances.
  • provides consistent, sub-millisecond latencies for file operations.

Elastic Fabric Adapter (EFA) and GPUDirect Storage (GDS) Support (2024)

  • launched in November 2024, provides the fastest storage performance for GPU instances in the cloud.
  • delivers up to 12x higher throughput per client instance (up to 1,200 Gbps) compared to previous FSx for Lustre systems.
  • NVIDIA GPUDirect Storage (GDS) creates a direct data path between storage and GPU memory, bypassing CPU and system memory.
  • supported on Nitro v4 (or higher) EC2 instances with EFA support (e.g., P5 GPU instances).
  • accelerates machine learning training jobs and reduces workload costs.
  • also supports ENA Express for enhanced networking.

Scalable Metadata Performance (2024)

  • increased maximum metadata IOPS by 15x (launched June 2024).
  • allows provisioning metadata IOPS independently of file system storage capacity.
  • supports up to 192,000 metadata IOPS per file system.
  • metadata IOPS can be configured in AUTOMATIC mode (scales with storage capacity) or USER_PROVISIONED mode.
  • available on Persistent-2 file systems.
  • up to 5x faster directory listing performance (launched November 2025).
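The two metadata modes can be sketched as a small helper that produces the `MetadataConfiguration` block used inside `LustreConfiguration`. This is an illustration under assumptions: the helper name is made up, and the service accepts only specific IOPS values, which this sketch does not enumerate; it only enforces the 192,000 ceiling stated above.

```python
from typing import Optional

MAX_METADATA_IOPS = 192_000  # per-file-system ceiling quoted in the text

def metadata_configuration(mode: str = "AUTOMATIC", iops: Optional[int] = None) -> dict:
    """Build a MetadataConfiguration block for a Persistent-2 file system."""
    if mode == "AUTOMATIC":
        # In AUTOMATIC mode the service scales metadata IOPS with storage capacity.
        if iops is not None:
            raise ValueError("iops must not be set in AUTOMATIC mode")
        return {"Mode": "AUTOMATIC"}
    if mode == "USER_PROVISIONED":
        if iops is None or not 0 < iops <= MAX_METADATA_IOPS:
            raise ValueError(f"iops must be between 1 and {MAX_METADATA_IOPS}")
        return {"Mode": "USER_PROVISIONED", "Iops": iops}
    raise ValueError(f"unknown mode: {mode}")

print(metadata_configuration())
print(metadata_configuration("USER_PROVISIONED", 12_000))
```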

Data Compression

  • uses the LZ4 compression algorithm optimized to deliver high levels of compression without adversely impacting performance.
  • newly written files are automatically compressed before writing to disk and uncompressed when read.
  • reduces storage consumption of both file system storage and backups.

FSx for Lustre with S3

  • FSx for Lustre integrates natively with S3, making it easy to process cloud data sets with the Lustre high-performance file system.
  • FSx for Lustre file system transparently presents S3 objects as files and allows writing changed data back to S3.
  • supports Data Repository Associations (DRAs) — links between a directory on the file system and an S3 bucket or prefix.
  • supports up to 8 DRAs per file system, enabling links to multiple S3 buckets or prefixes.
  • provides full bi-directional synchronization including deleted files and objects.
  • S3 objects are lazy-loaded by default:
    • FSx automatically loads the corresponding objects from S3 only when first accessed by applications.
    • Subsequent reads are served directly from the file system with low, consistent latencies.
    • FSx for Lustre file system can optionally be batch hydrated.
  • FSx for Lustre uses parallel data transfer techniques to transfer data from S3 at up to hundreds of GB/s.
  • Files from the file system can be exported back to the S3 bucket.
  • supports automatic import and export policies to keep file system and S3 synchronized.
  • DRAs are supported on Lustre 2.12 and newer file systems (excluding scratch_1 deployment type).
  • supports cross-account S3 access for sharing data across AWS accounts.
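A data repository association pairs a file system directory with an S3 bucket or prefix. The sketch below builds the parameter dictionaries one might pass to the FSx `CreateDataRepositoryAssociation` API, with automatic import and export enabled for new, changed, and deleted items. The file system ID, bucket names, and helper name are hypothetical, and the 8-DRA limit is taken from the text above.

```python
MAX_DRAS_PER_FILE_SYSTEM = 8            # limit quoted in the text
SYNC_EVENTS = ("NEW", "CHANGED", "DELETED")

def dra_requests(file_system_id: str, links: dict) -> list:
    """Build one CreateDataRepositoryAssociation request per directory-to-S3 link.

    `links` maps a file system directory (e.g. "/team-a") to an S3 URI.
    """
    if len(links) > MAX_DRAS_PER_FILE_SYSTEM:
        raise ValueError(f"at most {MAX_DRAS_PER_FILE_SYSTEM} DRAs per file system")
    requests = []
    for fs_path, s3_uri in links.items():
        if not fs_path.startswith("/") or not s3_uri.startswith("s3://"):
            raise ValueError(f"invalid link: {fs_path!r} -> {s3_uri!r}")
        requests.append({
            "FileSystemId": file_system_id,
            "FileSystemPath": fs_path,
            "DataRepositoryPath": s3_uri,
            # Load file metadata from S3 when the association is created.
            "BatchImportMetaDataOnCreate": True,
            "S3": {
                "AutoImportPolicy": {"Events": list(SYNC_EVENTS)},
                "AutoExportPolicy": {"Events": list(SYNC_EVENTS)},
            },
        })
    return requests

# Hypothetical identifiers, for illustration only.
for req in dra_requests("fs-0123456789abcdef0",
                        {"/team-a": "s3://example-bucket-a/data/",
                         "/team-b": "s3://example-bucket-b/"}):
    print(req["FileSystemPath"], "->", req["DataRepositoryPath"])
```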

FSx for Lustre Security

  • FSx for Lustre provides encryption at rest for the file system and the backups, by default, using KMS.
  • FSx encrypts data-in-transit when accessed from supported EC2 instances.
  • complies with PCI DSS, ISO 9001, 27001, 27017, and 27018, and SOC 1, 2, and 3.
  • is HIPAA eligible.
  • file systems are accessed from endpoints in a VPC, enabling network isolation.
  • integrated with AWS IAM for resource-level permissions.
  • supports storage quotas for monitoring and controlling user- and group-level storage consumption.
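Storage quotas are managed with the standard Lustre `lfs` tool on a client that has the file system mounted. As a hedged sketch, the helper below assembles an `lfs setquota` command for a user; the user name, limits, and mount point are hypothetical, and block limits are assumed to be expressed in kilobytes.

```python
import shlex

def setquota_command(user: str, block_soft_kb: int, block_hard_kb: int,
                     inode_soft: int, inode_hard: int, mount_point: str = "/fsx") -> str:
    """Build an `lfs setquota` command enforcing per-user block and inode limits."""
    if block_soft_kb > block_hard_kb or inode_soft > inode_hard:
        raise ValueError("soft limits must not exceed hard limits")
    args = ["sudo", "lfs", "setquota", "-u", user,
            "-b", str(block_soft_kb), "-B", str(block_hard_kb),  # block soft/hard limits
            "-i", str(inode_soft), "-I", str(inode_hard),        # inode soft/hard limits
            mount_point]
    return " ".join(shlex.quote(a) for a in args)

# Hypothetical user and limits, for illustration only.
print(setquota_command("user1", 5000, 8000, 2000, 3000, "/mnt/fsx"))
```

Group quotas follow the same pattern with `-g` in place of `-u`.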

FSx for Lustre Availability and Durability

  • On a scratch file system, file servers are not replaced if they fail and data is not replicated.
  • On a persistent file system, if a file server becomes unavailable it is replaced automatically and within minutes.
  • FSx for Lustre provides a parallel file system, where data is stored across multiple network file servers to maximize performance and reduce bottlenecks, and each server has multiple disks.
  • FSx takes daily automatic incremental backups of the file systems, and allows manual backups at any point.
  • Backups are stored in Amazon S3 with 99.999999999% (11 9’s) of durability.
  • Backups are highly durable and file-system-consistent.
  • Supports cross-region and cross-account backup copies using AWS Backup for disaster recovery.
  • Supports copying backups across AWS opt-in Regions (launched April 2026).

FSx for Lustre Version Management

  • Supports Lustre LTS versions 2.10, 2.12, and 2.15.
  • In-place Lustre version upgrades supported (launched February 2025) — upgrade file systems to newer versions within minutes using the console or CLI/SDK.
  • Newer versions provide performance enhancements, new features, and support for the latest Linux kernel versions.
  • No downtime required for version upgrades.

FSx for Lustre Monitoring

  • Provides enhanced monitoring dashboard with performance insights and recommendations (launched September 2024).
  • Provides additional performance metrics for improved visibility into file system activity.
  • Integrates with Amazon CloudWatch for file system metrics.
  • Provides performance warnings and recommendations when metrics exceed thresholds.

FSx for Lustre Integration with Compute Services

  • Accessible from Amazon EC2 instances, containers on Amazon EKS, and on-premises servers.
  • Integrates with Amazon SageMaker as an input data source for ML training jobs.
  • Integrates with AWS Batch through EC2 Launch Templates for batch scheduling.
  • Integrates with AWS ParallelCluster for HPC cluster deployments.
  • Supports Lustre client on Amazon Linux, Amazon Linux 2, Amazon Linux 2023, RHEL, CentOS, SUSE Linux, and Ubuntu (including Ubuntu 24.04 with Kernel 6.14).

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion, and correction.
  1. A solutions architect is designing storage for a high performance computing (HPC) environment based on Amazon Linux. The workload stores and processes a large amount of engineering drawings that require shared storage and heavy computing. Which storage option would be the optimal solution?
    1. Amazon Elastic File System (Amazon EFS)
    2. Amazon FSx for Lustre
    3. Amazon EC2 instance store
    4. Amazon EBS Provisioned IOPS SSD (io1)
  2. A company is planning to deploy a High Performance Computing (HPC) cluster in its VPC that requires a scalable, high performance file system. The storage service must be optimized for efficient workload processing, and the data must be accessible via a fast and scalable file system interface. It should also work natively with Amazon S3 that enables you to easily process your S3 data with a high-performance POSIX interface. Which of the following is the MOST suitable service that you should use for this scenario?
    1. Amazon Elastic File System (Amazon EFS)
    2. Amazon FSx for Lustre
    3. Amazon Elastic Block Store
    4. Amazon EBS Provisioned IOPS SSD (io1)
  3. A machine learning team needs to train large language models using GPU instances and requires the fastest possible storage throughput to keep GPUs fully utilized. The training data is stored in S3 and the team wants sub-millisecond latency access. Which FSx for Lustre feature should they enable for maximum GPU throughput?
    1. HDD storage class with burst throughput
    2. Scratch file system with increased storage capacity
    3. EFA-enabled file system with NVIDIA GPUDirect Storage (GDS)
    4. Persistent file system with data compression enabled
  4. A company runs large-scale genomics workloads with petabytes of data that has a mix of frequently and infrequently accessed files. They want the lowest-cost Lustre storage that automatically scales with their data and eliminates the need to provision capacity upfront. Which FSx for Lustre configuration best meets these requirements?
    1. Persistent SSD file system with data compression
    2. Scratch file system with HDD storage
    3. Intelligent-Tiering storage class with SSD read cache
    4. Persistent-2 file system with maximum throughput per TiB
  5. A company needs to link their FSx for Lustre file system to data in multiple S3 buckets for different teams. How many Data Repository Associations (DRAs) can be configured on a single FSx for Lustre file system?
    1. 1
    2. 4
    3. 8
    4. 16
  6. An organization is running metadata-intensive workloads on FSx for Lustre and needs to increase the number of file creation and listing operations. Which feature allows them to scale metadata performance independently of storage capacity?
    1. Increasing storage capacity
    2. Enabling data compression
    3. User-provisioned metadata IOPS on Persistent-2 file systems
    4. Switching to scratch file system deployment

AWS FSx for Windows

AWS FSx for Windows File Server

  • Amazon FSx for Windows File Server provides fully managed, highly reliable, and scalable file storage that is accessible over the industry-standard Server Message Block (SMB) protocol.
  • FSx for Windows is built on Windows Server, delivering a wide range of administrative features such as user quotas, end-user file restore, File Server Resource Manager (FSRM), ACLs, and Microsoft Active Directory (AD) integration.
  • FSx for Windows provides high levels of throughput and IOPS and consistent sub-millisecond latencies.
  • FSx for Windows supports up to 12 GBps throughput capacity and up to 400,000 IOPS.
  • FSx for Windows offers single-AZ and multi-AZ deployment options, fully managed backups, and encryption of data at rest and in transit.
  • FSx for Windows File Server backups are file-system-consistent, highly durable, and incremental.
  • Amazon FSx is accessible from Windows, Linux, and macOS compute instances and devices.
  • Amazon FSx provides concurrent access to the file system to thousands of compute instances and devices.
  • Amazon FSx can connect the file system to EC2, VMware Cloud on AWS, Amazon WorkSpaces, Amazon AppStream 2.0, and Amazon ECS instances.
  • Integrated with CloudWatch to monitor storage capacity and file system activity.
  • Integrated with CloudTrail to monitor all Amazon FSx API calls.
  • Amazon FSx is designed for use cases that require Windows shared file storage, such as CRM, ERP, custom or .NET applications, home directories, data analytics, media and entertainment workflows, web serving and content management, software build environments, and Microsoft SQL Server.
  • FSx file system is accessible from the on-premises environment using an AWS Direct Connect or AWS VPN connection.
  • FSx is accessible from multiple VPCs, AWS accounts, and AWS Regions using VPC Peering connections or AWS Transit Gateway.
  • FSx provides consistent sub-millisecond latencies with SSD storage and single-digit millisecond latencies with HDD storage.
  • FSx supports Microsoft’s Distributed File System (DFS) to organize shares into a single folder structure up to hundreds of PB in size.
  • FSx supports DNS aliases to access file systems using custom DNS names (up to 50 aliases per file system), enabling seamless migration from on-premises file servers.
  • FSx supports two network type options: IPv4-only and dual-stack (for both IPv4 and IPv6), allowing access from IPv6 clients without complex address translation.

FSx for Windows Performance

  • FSx for Windows supports up to 12 GBps of throughput capacity per file system.
  • Maximum IOPS levels up to 400,000 for file systems with 12 GBps throughput capacity.
  • SSD IOPS can be provisioned independently of storage capacity, up to 400,000 IOPS.
  • Throughput capacity and storage capacity can be increased or decreased independently at any time.
  • Storage type can be updated from HDD to SSD without creating a new file system.
  • Each file system can be provisioned up to 64 TB in size.
  • Data deduplication helps reduce storage consumption by identifying and removing duplicate data.

FSx for Windows Security

  • FSx works with Microsoft Active Directory (AD) to integrate with existing Windows environments, which can either be an AWS Managed Microsoft AD or self-managed Microsoft AD.
  • FSx integrates with AWS Secrets Manager for enhanced management of Active Directory credentials for domain join operations.
  • FSx provides standard Windows permissions (full support for Windows access control lists, or ACLs) for files and folders.
  • FSx for Windows File Server supports encryption at rest for the file system and backups using KMS managed keys.
  • FSx encrypts data-in-transit using SMB Kerberos session keys when accessing the file system from clients that support SMB 3.0.
  • FSx supports file-level or folder-level restores to previous versions by supporting Windows shadow copies, which are point-in-time snapshots of the file system.
  • FSx supports Windows shadow copies so that end users can easily undo file changes and compare file versions by restoring files to previous versions, and supports backups to meet backup retention and compliance needs.
  • FSx complies with ISO, PCI-DSS, and SOC certifications, and is HIPAA eligible.
  • FSx supports AWS PrivateLink interface VPC endpoints (including dual-stack endpoints) to access the FSx API from within a VPC without sending traffic over the internet.

FSx for Windows File Access Auditing

  • FSx supports file access auditing to log end-user accesses on files, folders, and file shares.
  • File access auditing helps meet security and compliance requirements by tracking who accessed, modified, or changed permissions on files.
  • Audit event logs can be sent to Amazon CloudWatch Logs or streamed to Amazon Kinesis Data Firehose.
  • Supports configuring audit levels independently for file/folder accesses and file share accesses.
  • Audit log levels include: SUCCESS_ONLY, FAILURE_ONLY, SUCCESS_AND_FAILURE, and DISABLED.
  • File access auditing is supported on file systems with a throughput capacity of 32 MBps or greater.
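The audit settings above map onto an `AuditLogConfiguration` block inside the file system's `WindowsConfiguration`. The sketch below builds that block and enforces the 32 MBps minimum from the text; the helper name and the log group ARN are hypothetical, and destination naming requirements are not checked here.

```python
from typing import Optional

AUDIT_LEVELS = {"DISABLED", "SUCCESS_ONLY", "FAILURE_ONLY", "SUCCESS_AND_FAILURE"}
MIN_THROUGHPUT_MBPS = 32  # minimum throughput capacity quoted in the text

def audit_log_configuration(throughput_mbps: int, file_level: str, share_level: str,
                            destination_arn: Optional[str] = None) -> dict:
    """Build an AuditLogConfiguration with independent file and share audit levels."""
    if throughput_mbps < MIN_THROUGHPUT_MBPS:
        raise ValueError(f"auditing needs at least {MIN_THROUGHPUT_MBPS} MBps throughput")
    for level in (file_level, share_level):
        if level not in AUDIT_LEVELS:
            raise ValueError(f"unknown audit level: {level}")
    config = {
        "FileAccessAuditLogLevel": file_level,        # file and folder accesses
        "FileShareAccessAuditLogLevel": share_level,  # file share accesses
    }
    if destination_arn is not None:
        # ARN of a CloudWatch Logs log group or a Firehose delivery stream.
        config["AuditLogDestination"] = destination_arn
    return config

# Hypothetical log group ARN, for illustration only.
print(audit_log_configuration(
    64, "SUCCESS_AND_FAILURE", "FAILURE_ONLY",
    "arn:aws:logs:us-east-1:111122223333:log-group:/aws/fsx/audit"))
```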

FSx for Windows File Server Resource Manager (FSRM)

  • Amazon FSx supports File Server Resource Manager (FSRM), a Windows Server feature that provides capabilities to manage, govern, and monitor file data.
  • FSRM enables:
    • File Classification – Automatically classify and identify sensitive data (e.g., PII).
    • File Screening – Block unauthorized file types from being saved to business folders.
    • Folder-level Quotas – Set storage limits to prevent users from consuming excessive storage.
    • Storage Reports – Generate detailed reports about storage usage patterns.
    • Retention Policies – Create data retention and lifecycle policies.
  • FSRM events can be published to Amazon CloudWatch Logs or Amazon Kinesis Data Firehose for monitoring and automation.
  • FSRM events can trigger AWS Lambda functions to take reactive actions based on file events.
  • FSRM is supported on file systems with SSD storage and a throughput capacity of 128 MB/s or greater.

FSx for Windows Availability and Durability

  • FSx for Windows automatically replicates the data within an Availability Zone (AZ) to protect it from component failure.
  • FSx continuously monitors for hardware failures and automatically replaces infrastructure components in the event of a failure.
  • FSx supports Multi-AZ deployment
    • automatically provisions and maintains a standby file server in a different Availability Zone.
    • any changes written to disk in the file system are synchronously replicated across AZs to standby.
    • helps enhance availability during planned system maintenance.
    • helps protect the data against instance failure and AZ disruption.
    • In the event of planned file system maintenance or unplanned service disruption, FSx automatically fails over to the secondary file server, allowing data accessibility without manual intervention.
  • Multi-AZ file systems automatically fail over from the preferred file server to the standby file server if
    • An Availability Zone outage occurs.
    • Preferred file server becomes unavailable.
    • Preferred file server undergoes planned maintenance.
  • FSx supports automatic daily backups of the file systems, which incrementally store only the changes after the most recent backup.
  • FSx stores backups in S3.
  • FSx supports copying backups cross-region (to another AWS Region) and in-region for disaster recovery and compliance.
  • FSx is integrated with AWS Backup for centralized backup management, cross-account backup, and cross-region backup copy.

FSx for Windows and FSx File Gateway

  • Note: Amazon FSx File Gateway is no longer available to new customers as of October 28, 2024. Existing customers can continue using the service.
  • FSx File Gateway previously provided low-latency, on-premises access to fully managed file shares in the cloud by caching frequently accessed data locally.
  • For on-premises access, AWS now recommends accessing FSx for Windows File Server directly using AWS Direct Connect or AWS VPN connections.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion, and correction.
  1. A data processing facility wants to move a group of Microsoft Windows servers to the AWS Cloud. These servers require access to a shared file system that can integrate with the facility’s existing Active Directory (AD) infrastructure for file and folder permissions. The solution needs to provide seamless support for shared files with AWS and on-premises servers and allow the environment to be highly available. The chosen solution should provide added security by supporting encryption at rest and in transit. The solution should also be cost-effective to implement and manage. Which storage solution would meet these requirements?
    1. An AWS Storage Gateway file gateway joined to the existing AD domain
    2. An Amazon FSx for Windows File Server file system joined to the existing AD domain
    3. An Amazon Elastic File System (Amazon EFS) file system joined to an AWS managed AD domain
    4. An Amazon S3 bucket mounted on Amazon EC2 instances in multiple Availability Zones running Windows Server and joined to an AWS managed AD domain.
  2. A company needs to audit file access patterns on its Amazon FSx for Windows File Server file system to meet compliance requirements. The security team needs to track who accessed, modified, or changed permissions on files and folders. Which feature should the solutions architect configure?
    1. Enable CloudTrail logging for FSx API calls
    2. Enable file access auditing with audit logs sent to CloudWatch Logs
    3. Configure Windows Event Viewer on the file server
    4. Enable VPC Flow Logs on the FSx file system’s network interfaces
  3. A solutions architect needs to manage storage costs for Amazon FSx for Windows File Server. The organization requires the ability to classify sensitive data, block unauthorized file types, set storage limits per department folder, and generate storage usage reports. Which feature should the architect use?
    1. Configure data deduplication on the file system
    2. Use Amazon Macie to classify data on FSx
    3. Enable File Server Resource Manager (FSRM) on the file system
    4. Use AWS Config rules to monitor storage usage
  4. A company is deploying Amazon FSx for Windows File Server and requires the file system to be accessible from both IPv4 and IPv6 clients within their VPC and on-premises network. Which configuration should the solutions architect choose?
    1. Create a file system with IPv4-only network type and use a NAT64 gateway
    2. Create a file system with dual-stack network type
    3. Create two file systems, one for IPv4 and one for IPv6 clients
    4. Deploy an Application Load Balancer with dualstack in front of the file system
  5. A company is migrating from on-premises Windows file servers to Amazon FSx for Windows File Server. They want to ensure end users can continue accessing file shares using the same DNS names without any client-side configuration changes. Which approach should the solutions architect recommend?
    1. Update the on-premises DNS to point to the FSx file system’s default DNS name
    2. Associate DNS aliases with the FSx file system matching the existing on-premises file server DNS names
    3. Create a Route 53 private hosted zone with CNAME records
    4. Configure AWS Global Accelerator with the FSx file system as an endpoint

AWS Storage Services Cheat Sheet

AWS Storage Services

Simple Storage Service – S3

  • provides key-value based object storage for the internet, with unlimited storage and an unlimited number of objects of up to 5 TB each
  • offers an extremely durable, highly available, and infinitely scalable data storage infrastructure at very low costs.
  • is object-level storage (not block-level storage) and cannot be used to host an OS or dynamic websites (but can work with the JavaScript SDK)
  • provides durability by redundantly storing objects on multiple facilities within a region
  • regularly verifies the integrity of data using checksums and provides the auto-healing capability
  • S3 resources consist of globally unique buckets with objects and related metadata. The data model is a flat structure with no hierarchies or folders.
  • S3 Replication enables automatic, asynchronous copying of objects across S3 buckets in the same or different AWS regions using SRR or CRR. Replication requires versioning to be enabled on both the source and destination buckets.
  • S3 Transfer Acceleration helps speed data transport over long distances between a client and an S3 bucket using CloudFront edge locations.
  • S3 supports cost-effective Static Website hosting with Client-side scripts.
  • S3 CORS – Cross-Origin Resource Sharing allows cross-origin access to S3 resources.
  • S3 Access Logs enables tracking access requests to an S3 bucket.
  • S3 notification feature enables notifications to be triggered when certain events happen in the bucket.
  • S3 Inventory helps manage the storage and can be used to audit and report on the replication and encryption status of the objects for business, compliance, and regulatory needs.
  • Requester Pays lets the bucket owner specify that the requester, rather than the bucket owner, is charged for the request and the data download.
  • S3 Batch Operations help perform large-scale batch operations on S3 objects and can perform a single operation on lists of specified S3 objects.
  • Pre-Signed URLs can be shared for uploading/downloading objects for a limited time without requiring AWS security credentials.
  • Multipart Upload allows
    • parallel uploads with improved throughput and bandwidth utilization
    • fault tolerance and quick recovery from network issues
    • ability to pause and resume uploads
    • begin an upload before the final object size is known
  • Versioning
    • helps preserve, retrieve, and restore every version of every object
    • protects from unintended overwrites and accidental deletions
    • protects individual files but does NOT protect from Bucket deletion
  • MFA (Multi-Factor Authentication) can be enabled for additional security for the deletion of objects.
  • Integrates with CloudTrail, CloudWatch, and SNS for event notifications
  • S3 Storage Classes
    • S3 Standard
      • default storage class, ideal for frequently accessed data
      • 99.999999999% durability & 99.99% availability
      • Low latency and high throughput performance
      • designed to sustain the concurrent loss of data in two facilities
    • S3 Standard-Infrequent Access (S3 Standard-IA)
      • optimized for long-lived and less frequently accessed data
      • designed to sustain the concurrent loss of data in two facilities
      • 99.999999999% durability & 99.9% availability
      • suitable for objects greater than 128 KB kept for at least 30 days
    • S3 One Zone-Infrequent Access (S3 One Zone-IA)
      • optimized for less frequently accessed data that still requires rapid access
      • ideal for secondary backups and reproducible data
      • stores data in a single AZ, data stored in this storage class will be lost in the event of AZ destruction.
      • 99.999999999% durability & 99.5% availability
    • S3 Reduced Redundancy Storage (Not Recommended)
      • designed for noncritical, reproducible data stored at lower levels of redundancy than the STANDARD storage class
      • reduces storage costs
      • 99.99% durability & 99.99% availability
      • designed to sustain the loss of data in a single facility
    • S3 Glacier
      • suitable for low cost data archiving, where data access is infrequent
      • provides retrieval time of minutes to several hours
        • Expedited – 1 to 5 minutes
        • Standard – 3 to 5 hours
        • Bulk – 5 to 12 hours
      • 99.999999999% durability & 99.9% availability
      • Minimum storage duration of 90 days
    • S3 Glacier Deep Archive
      • provides lowest cost data archiving, where data access is infrequent
      • 99.999999999% durability & 99.9% availability
      • provides retrieval times of 12 to 48 hours
        • Standard – within 12 hours
        • Bulk – within 48 hours
      • Minimum storage duration of 180 days
      • supports long-term retention and digital preservation for data that may be accessed once or twice a year
  • Lifecycle Management policies
    • transition to move objects to different storage classes and Glacier
    • expiration to remove objects and object versions
    • can be applied to both current and noncurrent object versions when versioning is enabled.
  • Data Consistency Model
    • provides strong read-after-write consistency for PUT and DELETE requests of objects in the S3 bucket in all AWS Regions
    • updates to a single key are atomic
    • does not currently support object locking for concurrent writes
  • S3 Security
    • IAM policies – grant users within your own AWS account permission to access S3 resources
    • Bucket and Object ACL – grant other AWS accounts (not specific users) access to S3 resources
    • Bucket policies – allow adding or denying permissions across some or all of the objects within a single bucket
    • S3 Access Points simplify data access for any AWS service or customer application that stores data in S3.
    • S3 Glacier Vault Lock helps deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy.
    • S3 VPC Gateway Endpoint enables private connections between a VPC and S3, without requiring that you use an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
    • supports SSL/TLS encryption of data in transit and encryption of data at rest
  • S3 Data Encryption
    • supports data at rest and data in transit encryption
    • Server-Side Encryption
      • SSE-S3 – encrypts S3 objects using keys handled & managed by AWS
      • SSE-KMS – leverage AWS Key Management Service to manage encryption keys. KMS provides control and audit trail over the keys.
      • SSE-C – when you want to manage your own encryption keys. AWS does not store the encryption key. Requires HTTPS.
    • Client-Side Encryption
      • Client library such as the S3 Encryption Client
      • Clients must encrypt data themselves before sending it to S3
      • Clients must decrypt data themselves when retrieving from S3
      • Customer fully manages the keys and encryption cycle
  • S3 Best Practices
    • use a random hash prefix for keys to ensure a random access pattern; as S3 stores objects lexicographically, randomness helps distribute the contents across multiple partitions for better performance
    • use parallel threads and Multipart upload for faster writes
    • use parallel threads and Range Header GET for faster reads
    • for list operations with a large number of objects, it’s better to build a secondary index in DynamoDB
    • use Versioning to protect from unintended overwrites and deletions, but this does not protect against bucket deletion
    • use VPC endpoints for S3 to transfer data from within a VPC over the Amazon internal network
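
The "parallel threads and Range Header GET" practice above depends on splitting an object into byte ranges that can be fetched independently. A minimal sketch of that split, assuming the object size is already known (for example from a HEAD request); the function name and part size are illustrative only:

```python
def split_byte_ranges(object_size: int, part_size: int) -> list[str]:
    """Return HTTP Range header values that together cover the whole object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    for start in range(0, object_size, part_size):
        # The end offset of an HTTP byte range is inclusive.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


# A 10-byte object split into 4-byte parts.
print(split_byte_ranges(10, 4))
```

Each returned value could then be sent as the Range header of a separate GET request, one per worker thread, with the parts reassembled in order on the client.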

Instance Store

  • provides temporary or ephemeral block-level storage for an EC2 instance
  • is physically attached to the Instance
  • delivers very high random I/O performance, which is a good option when storage with very low latency is needed
  • cannot be dynamically resized
  • data persists when an instance is rebooted
  • data does not persist if the
    • underlying disk drive fails
    • instance stops, i.e. an EBS-backed instance with instance store volumes attached is stopped
    • instance terminates
  • can be attached to an EC2 instance only when the instance is launched
  • is ideal for the temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.

Elastic Block Store – EBS

  • is virtual network-attached block storage
  • provides highly available, reliable, durable, block-level storage volumes that can be attached to a running instance
  • provides high durability, as volumes are redundant within an AZ; the data is automatically replicated within that AZ to prevent data loss due to any single hardware component failure
  • persists and is independent of EC2 lifecycle
  • multiple volumes can be attached to a single EC2 instance
  • can be detached & attached to another EC2 instance in that same AZ only
  • volumes are Zonal i.e. created in a specific AZ and CAN’T span across AZs
  • snapshots
    • to make a volume available in a different AZ, create a snapshot of the volume and restore it to a new volume in any AZ within the region
    • to make a volume available in a different Region, copy the snapshot of the volume to that region and restore it as a volume
  • PIOPS (Provisioned IOPS) volumes are designed for transactional applications that require high and consistent I/O, e.g. relational databases, NoSQL databases, etc.
  • a volume can be attached to only one EC2 instance at a time by default; use EFS for shared file storage
  • Multi-Attach enables attaching a single Provisioned IOPS SSD (io1 or io2) volume to multiple instances that are in the same AZ.
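
The snapshot-based steps for making a volume available in another AZ or Region can be sketched as a small planning function. This is only an illustration of the ordering described above: the function names are hypothetical, no AWS API is called, and the Region is derived by dropping the trailing letter of the AZ name, which is an assumption that holds for names such as us-east-1a.

```python
def region_of(az: str) -> str:
    """Derive the Region from an AZ name (assumes the '<region><letter>' form)."""
    return az[:-1]


def plan_volume_move(source_az: str, target_az: str) -> list[str]:
    """List the steps needed to make a volume's data available in target_az."""
    if source_az == target_az:
        return []  # same AZ: the volume can simply be detached and re-attached
    source_region, target_region = region_of(source_az), region_of(target_az)
    steps = [f"create snapshot of the volume in {source_region}"]
    if source_region != target_region:
        # Snapshots are regional, so a cross-Region move needs an explicit copy.
        steps.append(f"copy snapshot from {source_region} to {target_region}")
    steps.append(f"create volume from snapshot in {target_az}")
    return steps


for step in plan_volume_move("us-east-1a", "eu-west-1b"):
    print(step)
```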

EBS Encryption

  • volumes can be encrypted using the EBS encryption feature.
  • All data stored at rest, disk I/O, and snapshots created from the volume are encrypted.
  • uses the 256-bit AES algorithm (AES-256) and Amazon-managed key infrastructure (KMS)
  • Snapshots of encrypted EBS volumes are automatically encrypted.

EBS Snapshots

  • helps create backups of EBS volumes
  • are incremental
  • occur asynchronously and consume instance IOPS while in progress
  • are regional and CANNOT span across regions
  • can be copied across regions to make it easier to leverage multiple regions for geographical expansion, data center migration, and disaster recovery
  • can be shared by making them public or with specific AWS accounts by modifying the access permissions of the snapshots
  • support EBS encryption
    • Snapshots of encrypted volumes are automatically encrypted
    • Volumes created from encrypted snapshots are automatically encrypted
    • All data in flight between the instance and the volume is encrypted
    • Volumes created from an unencrypted snapshot that you own or have access to can be encrypted on the fly.
    • An encrypted snapshot that you own or have access to can be re-encrypted with a different key during the copy process.
  • can be automated using AWS Data Lifecycle Manager
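
The encryption rules listed above reduce to two cases: a key requested during copy or restore wins, otherwise the encryption state of the source is inherited. A minimal sketch of that decision, with hypothetical key names and no AWS calls; note that it offers no way to turn an encrypted source into an unencrypted result.

```python
from typing import Optional


def resulting_key(source_key: Optional[str],
                  requested_key: Optional[str] = None) -> Optional[str]:
    """Return the key protecting the new snapshot or volume (None = unencrypted).

    source_key    -- key of the source snapshot/volume, None if it is unencrypted
    requested_key -- key asked for during copy/restore, None if none was given
    """
    if requested_key is not None:
        # Encrypt on the fly, or re-encrypt with a different key during copy.
        return requested_key
    # Otherwise the encryption state is inherited from the source.
    return source_key


print(resulting_key("key-a"))           # snapshot of an encrypted volume
print(resulting_key(None, "key-b"))     # unencrypted snapshot, encrypted on restore
print(resulting_key("key-a", "key-b"))  # encrypted snapshot copied with a new key
```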

EBS vs Instance Store

Refer to the blog post EBS vs Instance Store.

Glacier

  • suitable for archiving data, where data access is infrequent and a retrieval time of several hours (3 to 5 hours) is acceptable (Not true anymore with enhancements from AWS)
  • provides high durability by storing archives in multiple facilities and on multiple devices, at a very low storage cost
  • performs regular, systematic data integrity checks and is built to be automatically self-healing
  • aggregate files into bigger files before sending them to Glacier and use range retrievals to retrieve partial file and reduce costs
  • improve speed and reliability with multipart upload
  • automatically encrypts the data using AES-256
  • upload or download data to Glacier via SSL encrypted endpoints
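
The advice to aggregate files before upload and then use range retrievals can be illustrated with the standard tarfile module. This sketch assumes an uncompressed tar archive, so each member sits at a fixed byte offset; the function names are illustrative, and any alignment rules the service imposes on retrieval ranges are not modelled.

```python
import io
import tarfile


def aggregate_files(files: dict[str, bytes]) -> bytes:
    """Pack many small files into one uncompressed tar archive held in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def member_byte_ranges(archive_bytes: bytes) -> dict[str, tuple[int, int]]:
    """Map each member name to the inclusive byte range holding its data."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as archive:
        return {m.name: (m.offset_data, m.offset_data + m.size - 1)
                for m in archive.getmembers()}


blob = aggregate_files({"a.log": b"one", "b.log": b"three"})
start, end = member_byte_ranges(blob)["b.log"]
print(blob[start:end + 1])  # only the bytes of b.log, without unpacking the archive
```

Keeping the name-to-range map alongside the archive ID is what makes a later partial retrieval possible without downloading the whole archive.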

EFS

  • fully-managed, easy to set up, scale, and cost-optimize file storage
  • can automatically scale from gigabytes to petabytes of data without needing to provision storage
  • provides managed NFS (network file system) that can be mounted on and accessed by multiple EC2 in multiple AZs simultaneously
  • highly durable, highly scalable and highly available.
    • stores data redundantly across multiple Availability Zones
    • grows and shrinks automatically as files are added and removed, so there is no need to manage storage procurement or provisioning.
  • expensive (3x gp2), but you pay per use
  • uses the Network File System version 4 (NFS v4) protocol
  • is compatible with all Linux-based AMIs for EC2 and provides a POSIX file system (~Linux) with a standard file API
  • does not support Windows AMI
  • offers the ability to encrypt data at rest using KMS and in transit.
  • can be accessed from on-premises using an AWS Direct Connect or AWS VPN connection between the on-premises datacenter and VPC.
  • can be accessed concurrently from servers in the on-premises datacenter as well as EC2 instances in the Amazon VPC
  • Performance mode
    • General purpose (default)
      • latency-sensitive use cases (web server, CMS, etc…)
    • Max I/O
      • higher latency, throughput, highly parallel (big data, media processing)
  • Storage Tiers
    • Standard
      • for frequently accessed files
      • ideal for active file system workloads and you pay only for the file system storage you use per month
    • Infrequent access (EFS-IA)
      • a lower cost storage class that’s cost-optimized for files infrequently accessed i.e. not accessed every day
      • cost to retrieve files, lower price to store
    • choosing an age-off policy in EFS Lifecycle Management allows moving files to EFS IA
    • Lifecycle Management automatically moves the data to the EFS IA storage class according to the lifecycle policy, e.g. files can be moved automatically into EFS IA after fourteen days of not being accessed.
    • EFS is a shared POSIX system for Linux systems and does not work for Windows
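
The age-off behaviour of EFS Lifecycle Management can be sketched as a policy document plus a simple age check. The policy shape follows the EFS lifecycle configuration request as understood here (a TransitionToIA value such as AFTER_14_DAYS); the set of allowed day counts is an assumption and the service may accept other values. Function names are illustrative.

```python
from datetime import datetime, timedelta

# Assumed subset of supported age-off periods; not an exhaustive list.
ALLOWED_IA_DAYS = (7, 14, 30, 60, 90)


def transition_to_ia_policy(days: int) -> dict:
    """Build a lifecycle configuration that moves files to EFS IA after `days`."""
    if days not in ALLOWED_IA_DAYS:
        raise ValueError(f"days must be one of {ALLOWED_IA_DAYS}")
    return {"LifecyclePolicies": [{"TransitionToIA": f"AFTER_{days}_DAYS"}]}


def is_due_for_ia(last_access: datetime, now: datetime, days: int = 14) -> bool:
    """True if the file has not been accessed for at least `days` days."""
    return now - last_access >= timedelta(days=days)


print(transition_to_ia_policy(14))
print(is_due_for_ia(datetime(2024, 1, 10), datetime(2024, 1, 31)))
```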

Amazon FSx for Windows

  • is a fully managed, highly reliable, and scalable Windows file system share drive
  • supports SMB protocol & Windows NTFS
  • supports Microsoft Active Directory integration, ACLs, user quotas
  • built on SSD, scale up to 10s of GB/s, millions of IOPS, 100s PB of data
  • is accessible from Windows, Linux, and MacOS compute instances
  • can be accessed from on-premises infrastructure
  • can be configured to be Multi-AZ (high availability)
  • supports encryption of data at rest and in transit
  • provides data deduplication, which enables further cost optimization by removing redundant data.
  • data is backed-up daily to S3

Amazon FSx for Lustre

  • provides an easy and cost-effective way to launch and run the world’s most popular high-performance file system.
  • is a type of parallel distributed file system, for large-scale computing
  • Lustre is derived from “Linux” and “cluster”
  • use cases include Machine Learning and High Performance Computing (HPC), esp. Video Processing, Financial Modeling, and Electronic Design Automation
  • scales up to 100s GB/s, millions of IOPS, sub-ms latencies
  • seamless integration with S3, it transparently presents S3 objects as files and allows you to write changed data back to S3.
  • can “read S3” as a file system (through FSx)
  • can write the output of the computations back to S3 (through FSx)
  • supports encryption of data at rest and in transit
  • can be used from on-premises servers

CloudFront

  • provides low latency and high data transfer speeds for distribution of static, dynamic web or streaming content to web users
  • delivers the content through a worldwide network of data centers called Edge Locations
  • keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.
  • dramatically reduces the number of network hops that users’ requests must pass through
  • supports multiple origin server options, e.g. AWS-hosted services such as S3, EC2, and ELB, or an on-premises server, which store the original, definitive version of the objects
  • a single distribution can have multiple origins, and the path pattern in a cache behavior determines which requests are routed to which origin
  • supports Web Download distribution and RTMP Streaming distribution
    • Web distribution supports static, dynamic web content, on demand using progressive download & HLS and live streaming video content
    • RTMP supports streaming of media files using Adobe Media Server and the Adobe Real-Time Messaging Protocol (RTMP) ONLY
  • supports HTTPS using either
    • dedicated IP address, which is expensive as dedicated IP address is assigned to each CloudFront edge location
    • Server Name Indication (SNI), which is free but supported by modern browsers only with the domain name available in the request header
  • For E2E HTTPS connection,
    • Viewers -> CloudFront needs a certificate issued by a trusted CA or by ACM
    • CloudFront -> Origin needs certificate issued by ACM for ELB and by CA for other origins
  • Security
    • Origin Access Identity (OAI) can be used to restrict the content from S3 origin to be accessible from CloudFront only
    • supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
    • Signed URLs are useful
      • for RTMP distributions, as signed cookies aren’t supported
      • to restrict access to individual files, e.g. an installation download for your application
      • when users use a client, e.g. a custom HTTP client, that doesn’t support cookies
    • Signed Cookies are useful
      • to provide access to multiple restricted files, e.g. video part files in HLS format or all of the files in the subscribers’ area of a website
      • when you don’t want to change the current URLs
    • integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
  • supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
    • only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
    • does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
  • object removal from cache
    • would be removed upon expiry (TTL) from the cache, by default 24 hrs
    • can be invalidated explicitly, but invalidation has a cost associated; viewers might also continue to see the old version until it expires from browser or intermediate caches
    • objects can be invalidated only for Web distribution
    • change object name, versioning, to serve different version
  • supports adding or modifying custom headers before the request is sent to origin which can be used to
    • validate that the user is accessing the content through the CDN
    • identify the CDN from which the request was forwarded, in case of multiple CloudFront distributions
    • return the Access-Control-Allow-Origin header for every request, for viewers that do not support CORS
  • supports Partial GET requests using range header to download object in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
  • supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
  • supports different price classes: all regions, only the least expensive regions, or all regions excluding the most expensive ones
  • supports access logs which contain detailed information about every user request for both web and RTMP distribution
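
How the path pattern of a cache behavior selects an origin can be sketched with wildcard matching from the standard library. This is a simplified model under stated assumptions: behaviors are checked in order and the first match wins, anything unmatched falls through to a default origin, and fnmatchcase stands in for CloudFront's own matcher (it is case-sensitive and supports * and ?, but also accepts [..] sets that are not claimed here to be CloudFront syntax). Origin names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_origin(path: str,
                  behaviors: list[tuple[str, str]],
                  default_origin: str) -> str:
    """Return the origin of the first cache behavior whose pattern matches path."""
    normalized = path.lstrip("/")
    for pattern, origin in behaviors:
        # Leading slashes are treated as optional on both sides.
        if fnmatchcase(normalized, pattern.lstrip("/")):
            return origin
    return default_origin


behaviors = [("/images/*", "s3-images"), ("/api/*", "alb-api")]
print(select_origin("/images/cat.jpg", behaviors, "s3-site"))
print(select_origin("/index.html", behaviors, "s3-site"))
```

Ordering matters in this model: a broad pattern placed first would shadow any more specific pattern listed after it.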

AWS Import/Export

  • accelerates moving large amounts of data into and out of AWS using portable storage devices for transport and transfers data directly using Amazon’s high speed internal network, bypassing the internet.
  • suitable for use cases with
    • large datasets
    • low bandwidth connections
    • first time migration of data
  • Importing data to several types of AWS storage, including EBS snapshots, S3 buckets, and Glacier vaults.
  • Exporting data out from S3 only, with versioning enabled only the latest version is exported
  • Import data can be encrypted (optional but recommended), while export data is always encrypted using TrueCrypt
  • Amazon will wipe the device if specified, however it will not destroy the device