AWS FSx for Lustre
- FSx for Lustre is a fully managed service that makes it easy and cost-effective to launch and run Lustre, the world’s most popular high-performance file system.
- Lustre is an open-source file system designed for applications that require fast storage, where the storage needs to keep up with the compute.
- handles the traditional complexity of setting up and managing high-performance Lustre file systems.
- is POSIX-compliant and can be used with existing Linux-based applications without having to make any changes.
- provides a native file system interface and works as any file system does with the Linux operating system.
- provides read-after-write consistency and supports file locking.
- is compatible with the most popular Linux-based AMIs, including Amazon Linux, Amazon Linux 2, Red Hat Enterprise Linux (RHEL), CentOS, SUSE Linux and Ubuntu.
- is accessible from compute workloads running on EC2 instances and containers running on EKS.
- can be accessed from a Linux instance, by installing the open-source Lustre client and mounting the file system using standard Linux commands.
- is ideal for use cases where speed matters, such as machine learning, high-performance computing (HPC), video processing, financial modelling, genome sequencing, and electronic design automation (EDA).
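The mount step mentioned above can be sketched as a small Python helper that assembles the `mount` command line. This is only an illustration: the function name, the DNS name, the mount name and the `/fsx` mount point are placeholders, and the `relatime,flock` options are an assumption based on commonly documented usage. Take the real values from your own file system's details.

```python
import shlex


def lustre_mount_command(dns_name: str, mount_name: str,
                         mount_point: str = "/fsx") -> list[str]:
    """Build the argv for mounting an FSx for Lustre file system.

    The Lustre client must already be installed on the instance.
    """
    if not dns_name or not mount_name:
        raise ValueError("dns_name and mount_name are required")
    # Lustre mount sources have the form <server>@tcp:/<mount name>.
    source = f"{dns_name}@tcp:/{mount_name}"
    return ["sudo", "mount", "-t", "lustre",
            "-o", "relatime,flock", source, mount_point]


# Placeholder DNS name and mount name, for illustration only.
cmd = lustre_mount_command(
    "fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com", "fsx")
print(shlex.join(cmd))
# sudo mount -t lustre -o relatime,flock fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com@tcp:/fsx /fsx
```

Returning an argument list rather than a single string keeps the command safe to hand to `subprocess.run` without shell quoting issues.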
FSx for Lustre Deployment Options
Scratch file systems
- designed for temporary storage and short-term processing of data.
- provide high burst throughput of up to six times the baseline throughput of 200 MBps per TiB of storage capacity.
- data is not replicated and does not persist if a file server fails.
- ideal for cost-optimized storage for short-term, processing-heavy workloads.
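As a rough worked example of the throughput figures above, the sketch below scales the 200 MBps per TiB baseline and the "up to six times" burst factor to a given capacity. The function name and the 12 TiB capacity are made up for illustration, and actual burst behaviour depends on more than this simple multiplication.

```python
BASELINE_MBPS_PER_TIB = 200  # baseline figure quoted in the notes above
BURST_MULTIPLIER = 6         # "up to six times" the baseline


def scratch_throughput(capacity_tib: float) -> tuple[float, float]:
    """Return (baseline, maximum burst) throughput in MBps."""
    if capacity_tib <= 0:
        raise ValueError("capacity_tib must be positive")
    baseline = capacity_tib * BASELINE_MBPS_PER_TIB
    return baseline, baseline * BURST_MULTIPLIER


# Hypothetical 12 TiB scratch file system.
baseline, burst = scratch_throughput(12)
print(f"baseline {baseline:.0f} MBps, burst up to {burst:.0f} MBps")
# baseline 2400 MBps, burst up to 14400 MBps
```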
Persistent file systems
- designed for long-term storage and workloads.
- is highly available, and data is automatically replicated within the AZ that is associated with the file system.
- data volumes attached to the file servers are replicated independently from the file servers to which they are attached.
- if a file server becomes unavailable, it is replaced automatically within minutes of failure.
- is continuously monitored for hardware failures, and infrastructure components are replaced automatically in the event of a failure.
- ideal for workloads that run for extended periods or indefinitely, and that might be sensitive to disruptions in availability.
FSx for Lustre with S3
- FSx for Lustre also integrates seamlessly with S3, making it easy to process cloud data sets with the Lustre high-performance file system.
- FSx for Lustre file system transparently presents S3 objects as files and allows writing changed data back to S3.
- FSx for Lustre file system can be linked with a specified S3 bucket, making the data in the bucket accessible to the file system.
- S3 objects’ names and prefixes will be visible as files and directories.
- S3 objects are lazy-loaded by default.
- FSx automatically loads the corresponding objects from S3 only when first accessed by the applications.
- Subsequent reads of these files are served directly out of the file system with low, consistent latencies.
- FSx for Lustre file system can optionally be batch hydrated, i.e. the S3 data is loaded into the file system ahead of first access.
- FSx for Lustre uses parallel data transfer techniques to transfer data from S3 at up to hundreds of GB/s.
- Files from the file system can be exported back to the S3 bucket.
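Batch hydration and export, as described above, can be driven from a client with the Lustre `lfs` tool. The sketch below only builds the command lines. It assumes the `hsm_restore` (load file contents from S3) and `hsm_archive` (export file contents to the linked bucket) subcommands are the mechanism available on your file system; some configurations may use a different export mechanism. The helper name and the paths are hypothetical.

```python
def hsm_command(action: str, paths: list[str]) -> list[str]:
    """Build the argv for an `lfs` HSM call on files in a linked file system.

    action: "restore" loads file contents from S3 into the file system,
            "archive" exports file contents back to the linked bucket.
    """
    subcommands = {"restore": "hsm_restore", "archive": "hsm_archive"}
    if action not in subcommands:
        raise ValueError(f"unsupported action: {action!r}")
    if not paths:
        raise ValueError("at least one path is required")
    return ["sudo", "lfs", subcommands[action], *paths]


# Hypothetical paths under a file system mounted at /fsx.
preload = hsm_command("restore", ["/fsx/input/a.dat", "/fsx/input/b.dat"])
export = hsm_command("archive", ["/fsx/output/result.dat"])
print(" ".join(preload))  # sudo lfs hsm_restore /fsx/input/a.dat /fsx/input/b.dat
print(" ".join(export))   # sudo lfs hsm_archive /fsx/output/result.dat
```

Preloading this way trades some up-front transfer time for avoiding the first-access latency of lazy loading.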
FSx for Lustre Security
- FSx for Lustre encrypts the file system and its backups at rest by default, using KMS.
- FSx encrypts data in transit only when the file system is accessed from supported EC2 instances.
FSx for Lustre Scalability
- FSx for Lustre file systems scale to hundreds of GB/s of throughput and millions of IOPS.
- FSx for Lustre also supports concurrent access to the same file or directory from thousands of compute instances.
- FSx for Lustre provides consistent, sub-millisecond latencies for file operations.
FSx for Lustre Availability and Durability
- On a scratch file system, file servers are not replaced if they fail and data is not replicated.
- On a persistent file system, if a file server becomes unavailable it is replaced automatically and within minutes.
- FSx for Lustre provides a parallel file system, where data is stored across multiple network file servers to maximize performance and reduce bottlenecks, and each server has multiple disks.
- FSx takes daily automatic incremental backups of the file systems, and allows manual backups at any point.
- Backups are highly durable and file-system-consistent.
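The idea of a parallel file system spreading one file across several file servers can be illustrated with a simplified round-robin striping model. This is a teaching sketch, not FSx's actual placement logic: the function name, the 1 MiB stripe size and the stripe count of 4 are illustrative assumptions.

```python
MIB = 1024 * 1024


def stripe_location(offset: int, stripe_size: int,
                    stripe_count: int) -> tuple[int, int]:
    """Map a byte offset in a file to (server index, offset on that server).

    Stripes are handed out round-robin: stripe 0 goes to server 0,
    stripe 1 to server 1, and so on, wrapping after stripe_count servers.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; sizes and counts must be > 0")
    stripe_number = offset // stripe_size
    server = stripe_number % stripe_count
    # Full stripes already stored on this server, plus position in the stripe.
    local_offset = (stripe_number // stripe_count) * stripe_size + offset % stripe_size
    return server, local_offset


# Byte 10 of the sixth 1 MiB stripe, striped over 4 servers.
print(stripe_location(5 * MIB + 10, MIB, 4))  # (1, 1048586)
```

Because consecutive stripes land on different servers, a large sequential read or write can be served by several servers at once, which is where the aggregate throughput comes from.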
AWS Certification Exam Practice Questions
- Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
- AWS services are updated every day, and both the questions and the answers might soon be outdated, so research accordingly.
- AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
- Open to further feedback, discussion and correction.
- A solutions architect is designing storage for a high performance computing (HPC) environment based on Amazon Linux. The workload stores and processes a large amount of engineering drawings that require shared storage and heavy computing. Which storage option would be the optimal solution?
  - Amazon Elastic File System (Amazon EFS)
  - Amazon FSx for Lustre (answer)
  - Amazon EC2 instance store
  - Amazon EBS Provisioned IOPS SSD (io1)
- A company is planning to deploy a High Performance Computing (HPC) cluster in its VPC that requires a scalable, high performance file system. The storage service must be optimized for efficient workload processing, and the data must be accessible via a fast and scalable file system interface. It should also work natively with Amazon S3, enabling you to easily process your S3 data through a high-performance POSIX interface. Which of the following is the MOST suitable service that you should use for this scenario?
  - Amazon Elastic File System (Amazon EFS)
  - Amazon FSx for Lustre (answer)
  - Amazon Elastic Block Store
  - Amazon EBS Provisioned IOPS SSD (io1)