AWS EC2 Storage

EC2 Storage Overview

EC2 Storage Options - EBS, S3 & Instance Store

Storage Types

Elastic Block Store – EBS

  • Elastic Block Store – EBS provides highly available, reliable, durable, block-level storage volumes that can be attached to an EC2 instance.
  • persists independently from the running life of an instance.
  • behaves like a raw, unformatted, external block device that can be attached to a single EC2 instance at a time.
  • is recommended for data that requires frequent and granular updates e.g. running a database or filesystem.
  • is Zonal and can be attached to any instance within the same Availability Zone and can be used like any other physical hard drive.
  • is particularly well-suited for use as the primary storage for file systems, databases, or any applications that require fine granular updates and access to raw, unformatted, block-level storage.

Instance Store Storage

  • Instance store provides temporary or Ephemeral block-level storage
  • is located on the disks that are physically attached to the host computer.
  • consists of one or more instance store volumes exposed as block devices.
  • The size of an instance store varies by instance type.
  • Virtual devices for instance store volumes that are ephemeral[0-23], starting the first one as ephemeral0 and so on.
  • While an instance store is dedicated to a particular instance, the disk subsystem is shared among instances on a host computer.
  • is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.
  • delivers very high random I/O performance and is a good option for storage with very low latency requirements, but you don’t need the data to persist when the instance terminates or you can take advantage of fault-tolerant architectures.

Amazon EBS vs Instance Store

More detailed @ Comparison of EBS vs Instance Store

Simple Storage Service – S3

More details @ AWS S3

Elastic File Store – EFS

  • Elastic File Store – EFS provides a simple, fully managed, easy-to-set-up, scalable, serverless, and cost-optimized file storage
  • can automatically scale from gigabytes to petabytes of data without needing to provision storage.
  • provides managed NFS (network file system) that can be mounted on and accessed by multiple EC2 in multiple AZs simultaneously.
  • offers highly durable, highly scalable, and highly available.
    • stores data redundantly across multiple AZs in the same region
    • grows and shrinks automatically as files are added and removed, so there is no need to manage storage procurement or provisioning.
  • supports the Network File System version 4 (NFSv4.1 and NFSv4.0) protocol.
  • provides file system access semantics, such as strong data consistency and file locking.
  • is compatible with all Linux-based AMIs for EC2,  POSIX file system (~Linux) that has a standard file API.
  • is a shared POSIX system for Linux systems and does not work for Windows.
  • offers the ability to encrypt data at rest using KMS and in transit.
  • can be accessed from on-premises using an AWS Direct Connect or AWS VPN connection between the on-premises datacenter and VPC.
  • can be accessed concurrently from servers in the on-premises data center as well as EC2 instances in the VPC.

Block Device Mapping

  • A block device is a storage device that moves data in sequences of bytes or bits (blocks) and supports random access and generally use buffered I/O for e.g. hard disks, CD-ROM etc
  • Block devices can be physically attached to a computer (like an instance store volume) or can be accessed remotely as if it was attached (like an EBS volume)
  • Block device mapping defines the block devices to be attached to an instance, which can either be done while creation of an AMI or when an instance is launched
  • Block device must be mounted on the instance, after being attached to the instance, to be able to be accessed
  • When a block device is detached from an instance, it is unmounted by the operating system and you can no longer access the storage device.
  • Additional Instance store volumes can be attached only when the instance is launched while EBS volumes can be attached to a running instance.
  • Viewing the block device mapping for an instance only shows the EBS volumes and not the instance store volumes. Instance metadata can be used to query the complete block device mapping.

Public Data Sets

  • Amazon Web Services provides a repository of public data sets that can be seamlessly integrated into AWS cloud-based applications.
  • Amazon stores the data sets at no charge to the community and, as with all AWS services, you pay only for the compute and storage you use for your own applications.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. When you view the block device mapping for your instance, you can see only the EBS volumes, not the instance store volumes.
    1. Depends on the instance type
    2. FALSE
    3. Depends on whether you use API call
    4. TRUE
  1. Amazon EC2 provides a repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. What is the monthly charge for using the public data sets?
    1. A 1 time charge of 10$ for all the datasets.
    2. 1$ per dataset per month
    3. 10$ per month for all the datasets
    4. There is no charge for using the public data sets
  1. How many types of block devices does Amazon EC2 support?
    1. 2
    2. 4
    3. 3
    4. 1

References

AWS EFS vs EBS Multi-Attach

AWS EFS vs EBS Multi-Attach

AWS EFS vs EBS Multi-Attach

EFS vs EBS Multi-Attach features

  • Elastic File Store – EFS is a file storage service for use with Amazon compute (EC2, containers, serverless) and on-premises servers. EFS provides a file system interface, file system access semantics (such as strong consistency and file locking), and concurrently accessible storage for up to thousands of EC2 instances.
  • Elastic Block Store – EBS is a block-level storage service for use with EC2. EBS can deliver performance for workloads that require the lowest-latency access to data from a single EC2 instance.
  • Service type
    • Elastic File Store is fully managed by AWS
    • EBS needs to be managed by the user.
  • Accessibility
    • EFS can be accessed concurrently from all AZs in the Region.
    • EBS Multi-Attach can be accessed concurrently from instances within the same AZ.
  • Data Scalability
    • EFS provides unlimited data storage
    • EBS Multi-Attach has limits on the storage it can provide.
  • Instance Scalability
    • EFS can be attached to Tens, hundreds, or even thousands of compute instances.
    • EBS Multi-Attach enabled volumes can be attached to up to 16 Linux instances built on the Nitro System.
  • Supported Instances
    • EFS is compatible with all Linux-based AMIs for EC2,  POSIX file system (~Linux) that has a standard file API
    • Multi-Attach enabled volumes can be attached to up to 16 Linux instances built on the Nitro System that are in the same AZ. Multi-Attach enabled volume can be attached to Windows instances, but the OS does not recognize the data on the volume that is shared between the instances, which can result in data inconsistency.
  • Pricing
    • EFS is priced as per the pay-as-you-use model
    • EBS is priced as per the provisioned capacity

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company wants to organize the contents of multiple websites in managed file storage. The company must be able to scale the storage based on demand without needing to provision storage. Multiple servers across multiple Availability Zones within a region should be able to access this storage concurrently. Which services should the Solutions Architect recommend?
    1. Amazon S3
    2. Amazon EBS Multi-Attach
    3. Amazon EFS
    4. AWS Storage Gateway – Volume gateway

References

Amazon_EBS & Amazon_EFS

Amazon EBS Multi-Attach

EBS Multi-Attach

  • EBS Multi-Attach enables attaching a single Provisioned IOPS SSD (io1 or io2) volume to multiple instances that are in the same AZ.
  • Multiple Multi-Attach enabled volumes can be attached to an instance or set of instances.
  • Each instance to which the volume is attached has full read and write permission to the shared volume.
  • Multi-Attach helps achieve higher application availability in clustered Linux applications that manage concurrent write operations.

EBS Multi-Attach Considerations & Limitations

  • Multi-Attach is supported exclusively on Provisioned IOPS SSD volumes.
  • Multi-Attach enabled volumes can be attached
    • to up to 16 Linux instances built on the Nitro System that are in the same AZ.
    • to Windows instances, but the operating system does not recognize the data on the volume that is shared between the instances, which can result in data inconsistency.
  • Multi-Attach enabled volumes can be attached to one block device mapping per instance.
  • Multi-Attach enabled volumes are deleted on instance termination if the last attached instance is terminated and if that instance is configured to delete the volume on termination.
  • Multi-Attach enabled volumes can’t be created as boot volumes.
  • Multi-Attach can’t be enabled during instance launch using either the EC2 console or RunInstances API.
  • Multi-Attach enabled volumes do not support I/O fencing. I/O fencing protocols control write access in a shared storage environment to maintain data consistency
  • Multi-Attach can’t be enabled or disabled while the volume is attached to an instance.
  • Multi-Attach option is disabled by default.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.

References

Amazon_EBS_Multi-Attach

AWS EBS Snapshot

EBS Snapshot

  • EBS provides the ability to create snapshots (backups) of any EBS volume and write a copy of the data in the volume to S3, where it is stored redundantly in multiple Availability Zones
  • Snapshots are incremental backups and store only the data that was changed from the time the last snapshot was taken.
  • Snapshots can be used to create new volumes, increase the size of the volumes or replicate data across Availability Zones.
  • Snapshot size can probably be smaller than the volume size as the data is compressed before being saved to S3.
  • Even though snapshots are saved incrementally, the snapshot deletion process is designed so that you need to retain only the most recent snapshot in order to restore the volume.
  • EBS Snapshots can be used to migrate or create EBS volumes in different AZs or regions.

Multi-Volume Snapshots

  • Snapshots can be used to create a backup of critical workloads, such as a large database or a file system that spans across multiple EBS volumes.
  • Multi-volume snapshots help take exact point-in-time, data-coordinated, and crash-consistent snapshots across multiple EBS volumes attached to an EC2 instance.
  • It is no longer required to stop the instance or to coordinate between volumes to ensure crash consistency because snapshots are automatically taken across multiple EBS volumes.

EBS Snapshot creation

  • Snapshots can be created from EBS volumes periodically and are point-in-time snapshots.
  • Snapshots are incremental and only store the blocks on the device that changed since the last snapshot was taken
  • Snapshots occur asynchronously; the point-in-time snapshot is created immediately while it takes time to upload the modified blocks to S3. While it is completing, an in-progress snapshot is not affected by ongoing reads and writes to the volume.
  • Snapshots can be taken from in-use volumes. However, snapshots will only capture the data that was written to the EBS volumes at the time the snapshot command is issued excluding the data which is cached by any applications of OS.
  • Recommended ways to create a Snapshot from an EBS volume are
    • Pause all file writes to the volume
    • Unmount the Volume -> Take Snapshot -> Remount the Volume
    • Stop the instance – Take Snapshot (for root EBS volumes)
  • EBS volume created based on a snapshot
    • begins as an exact replica of the original volume that was used to create the snapshot.
    • replicated volume loads data in the background so that it can be used immediately.
    • If data that hasn’t been loaded yet is accessed, the volume immediately downloads the requested data from S3 and then continues loading the rest of the volume’s data in the background.

EBS Snapshot Deletion

  • When a snapshot is deleted only the data exclusive to that snapshot is removed.
  • Deleting previous snapshots of a volume does not affect the ability to restore volumes from later snapshots of that volume.
  • Active snapshots contain all of the information needed to restore your data (from the time the snapshot was taken) to a new EBS volume.
  • Even though snapshots are saved incrementally, the snapshot deletion process is designed so that you need to retain only the most recent snapshot in order to restore the volume.
  • Snapshot of the root device of an EBS volume used by a registered AMI can’t be deleted. AMI needs to be deregistered to be able to delete the snapshot.

EBS Snapshot Copy

  • Snapshots are constrained to the region in which they are created and can be used to launch EBS volumes within the same region only
  • Snapshots can be copied across regions to make it easier to leverage multiple regions for geographical expansion, data center migration, and disaster recovery
  • Snapshots are copied with S3 server-side encryption (256-bit Advanced Encryption Standard) to encrypt the data and the snapshot copy receives a snapshot ID that’s different from the original snapshot’s ID.
  • User-defined tags are not copied from the source to the new snapshot.
  • First Snapshot copy to another region is always a full copy, while the rest are always incremental.
  • When a snapshot is copied,
    • it can be encrypted if currently unencrypted or
    • can be encrypted using a different encryption key. Changing the encryption status of a snapshot or using a non-default EBS CMK during a copy operation always results in a full copy (not incremental)

EBS Snapshot Sharing

  • Snapshots can be shared by making them public or with specific AWS accounts by modifying the access permissions of the snapshots
  • Encrypted snapshots cannot be made available publicly.
  • Only unencrypted snapshots can be shared. Encrypted snapshots cannot be shared between accounts or made public
  • Encrypted snapshot can be shared with specific AWS accounts by sharing the custom CMK key used must also be shared to encrypt it
  • Cross-account permissions may be applied to a custom key either when it is created or at a later time.
  • Users, with access to snapshots, can copy the snapshot and create their own EBS volumes based on the snapshot while the original snapshot remains unaffected
  • AWS prevents you from sharing snapshots that were encrypted with the default CMK

EBS Snapshot Encryption

  • EBS snapshots fully support EBS encryption.
  • Snapshots of encrypted volumes are automatically encrypted
  • Volumes created from encrypted snapshots are automatically encrypted
  • All data in flight between the instance and the volume is encrypted
  • Volumes created from an unencrypted snapshot owned or have access to can be encrypted on the fly.
  • Unencrypted snapshots can be encrypted during the copy process.
  • Encrypted snapshots that you own or have access to, can be encrypted with a different key during the copy process.
  • First snapshot of an encrypted volume that has been created from an unencrypted snapshot is always a full snapshot.
  • First snapshot of a re-encrypted volume, which has a different CMK compared to the source snapshot, is always a full snapshot.

EBS Snapshot Lifecycle Automation

  • Amazon Data Lifecycle Manager can be used to automate the creation, retention, and deletion of snapshots taken to back up the EBS volumes.
  • Automating snapshot management helps you to:
    • Protect valuable data by enforcing a regular backup schedule.
    • Retain backups as required by auditors or internal compliance.
    • Reduce storage costs by deleting outdated backups.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. An existing application stores sensitive information on a non-boot Amazon EBS data volume attached to an Amazon Elastic Compute Cloud instance. Which of the following approaches would protect the sensitive data on an Amazon EBS volume?
    1. Upload your customer keys to AWS CloudHSM. Associate the Amazon EBS volume with AWS CloudHSM. Remount the Amazon EBS volume.
    2. Create and mount a new, encrypted Amazon EBS volume. Move the data to the new volume. Delete the old Amazon EBS volume.
    3. Unmount the EBS volume. Toggle the encryption attribute to True. Re-mount the Amazon EBS volume.
    4. Snapshot the current Amazon EBS volume. Restore the snapshot to a new, encrypted Amazon EBS volume. Mount the Amazon EBS volume (Need to create a snapshot, create an encrypted copy of snapshot and then create an EBS volume and mount it)
  2. Is it possible to access your EBS snapshots?
    1. Yes, through the Amazon S3 APIs.
    2. Yes, through the Amazon EC2 APIs
    3. No, EBS snapshots cannot be accessed; they can only be used to create a new EBS volume.
    4. EBS doesn’t provide snapshots.
  3. Which of the following approaches provides the lowest cost for Amazon Elastic Block Store snapshots while giving you the ability to fully restore data?
    1. Maintain two snapshots: the original snapshot and the latest incremental snapshot
    2. Maintain a volume snapshot; subsequent snapshots will overwrite one another
    3. Maintain a single snapshot the latest snapshot is both Incremental and complete
    4. Maintain the most current snapshot, archive the original and incremental to Amazon Glacier.
  4. Which procedure for backing up a relational database on EC2 that is using a set of RAIDed EBS volumes for storage minimizes the time during which the database cannot be written to and results in a consistent backup?
    1. Detach EBS volumes, 2. Start EBS snapshot of volumes, 3. Re-attach EBS volumes
    2. Stop the EC2 Instance. 2. Snapshot the EBS volumes
    3. Suspend disk I/O, 2. Create an image of the EC2 Instance, 3. Resume disk I/O
    4. Suspend disk I/O, 2. Start EBS snapshot of volumes, 3. Resume disk I/O
    5. Suspend disk I/O, 2. Start EBS snapshot of volumes, 3. Wait for snapshots to complete, 4. Resume disk I/O
  5. How can an EBS volume that is currently attached to an EC2 instance be migrated from one Availability Zone to another?
    1. Detach the volume and attach it to another EC2 instance in the other AZ.
    2. Simply create a new volume in the other AZ and specify the original volume as the source.
    3. Create a snapshot of the volume, and create a new volume from the snapshot in the other AZ
    4. Detach the volume, then use the ec2-migrate-volume command to move it to another AZ.
  6. How are the EBS snapshots saved on Amazon S3?
    1. Exponentially
    2. Incrementally
    3. EBS snapshots are not stored in the Amazon S3
    4. Decrementally
  7. EBS Snapshots occur _____
    1. Asynchronously
    2. Synchronously
    3. Weekly
  8. What will be the status of the snapshot until the snapshot is complete?
    1. Running
    2. Working
    3. Progressing
    4. Pending
  9. Before I delete an EBS volume, what can I do if I want to recreate the volume later?
    1. Create a copy of the EBS volume (not a snapshot)
    2. Create and Store a snapshot of the volume
    3. Download the content to an EC2 instance
    4. Back up the data in to a physical disk
  10. Which of the following are true regarding encrypted Amazon Elastic Block Store (EBS) volumes? Choose 2 answers
    1. Supported on all Amazon EBS volume types
    2. Snapshots are automatically encrypted
    3. Available to all instance types
    4. Existing volumes can be encrypted
    5. Shared volumes can be encrypted
  11. Amazon EBS snapshots have which of the following two characteristics? (Choose 2.) Choose 2 answers
    1. EBS snapshots only save incremental changes from snapshot to snapshot
    2. EBS snapshots can be created in real-time without stopping an EC2 instance (the snapshot can be taken real time however it will not be consistent and the recommended way is to stop or freeze the IO)
    3. EBS snapshots can only be restored to an EBS volume of the same size or smaller (EBS volume restored from snapshots need to be of the same size of larger size)
    4. EBS snapshots can only be restored and mounted to an instance in the same Availability Zone as the original EBS volume (Snapshots are specific to Region and can be used to create a volume in any AZ and does not depend on the original EBS volume AZ)
  12. A user is planning to schedule a backup for an EBS volume. The user wants security of the snapshot data. How can the user achieve data encryption with a snapshot?
    1. Use encrypted EBS volumes so that the snapshot will be encrypted by AWS (Refer link)
    2. While creating a snapshot select the snapshot with encryption
    3. By default the snapshot is encrypted by AWS
    4. Enable server side encryption for the snapshot using S3
  13. A sys admin is trying to understand EBS snapshots. Which of the below mentioned statements will not be useful to the admin to understand the concepts about a snapshot?
    1. Snapshot is synchronous
    2. It is recommended to stop the instance before taking a snapshot for consistent data
    3. Snapshot is incremental
    4. Snapshot captures the data that has been written to the hard disk when the snapshot command was executed
  14. When creation of an EBS snapshot is initiated but not completed, the EBS volume
    1. Cannot be detached or attached to an EC2 instance until me snapshot completes
    2. Can be used in read-only mode while me snapshot is in progress
    3. Can be used while the snapshot is in progress
    4. Cannot be used until the snapshot completes
  15. You have a server with a 5O0GB Amazon EBS data volume. The volume is 80% full. You need to back up the volume at regular intervals and be able to re-create the volume in a new Availability Zone in the shortest time possible. All applications using the volume can be paused for a period of a few minutes with no discernible user impact. Which of the following backup methods will best fulfill your requirements?
    1. Take periodic snapshots of the EBS volume
    2. Use a third-party Incremental backup application to back up to Amazon Glacier
    3. Periodically back up all data to a single compressed archive and archive to Amazon S3 using a parallelized multi-part upload
    4. Create another EBS volume in the second Availability Zone attach it to the Amazon EC2 instance, and use a disk manager to mirror me two disks
  16. A user is creating a snapshot of an EBS volume. Which of the below statements is incorrect in relation to the creation of an EBS snapshot?
    1. Its incremental
    2. It can be used to launch a new instance
    3. It is stored in the same AZ as the volume (stored in the same region)
    4. It is a point in time backup of the EBS volume
  17. A user has created a snapshot of an EBS volume. Which of the below mentioned usage cases is not possible with respect to a snapshot?
    1. Mirroring the volume from one AZ to another AZ
    2. Launch an instance
    3. Decrease the volume size
    4. Increase the size of the volume
  18. What is true of the way that encryption works with EBS?
    1. Snapshotting an encrypted volume makes an encrypted snapshot; restoring an encrypted snapshot creates an encrypted volume when specified / requested.
    2. Snapshotting an encrypted volume makes an encrypted snapshot when specified / requested; restoring an encrypted snapshot creates an encrypted volume when specified / requested.
    3. Snapshotting an encrypted volume makes an encrypted snapshot; restoring an encrypted snapshot always creates an encrypted volume.
    4. Snapshotting an encrypted volume makes an encrypted snapshot when specified / requested; restoring an encrypted snapshot always creates an encrypted volume.
  19. Why are more frequent snapshots of EBS Volumes faster?
    1. Blocks in EBS Volumes are allocated lazily, since while logically separated from other EBS Volumes, Volumes often share the same physical hardware. Snapshotting the first time forces full block range allocation, so the second snapshot doesn’t need to perform the allocation phase and is faster.
    2. The snapshots are incremental so that only the blocks on the device that have changed after your last snapshot are saved in the new snapshot.
    3. AWS provisions more disk throughput for burst capacity during snapshots if the drive has been pre-warmed by snapshotting and reading all blocks.
    4. The drive is pre-warmed, so block access is more rapid for volumes when every block on the device has already been read at least one time.
  20. Which is not a restriction on AWS EBS Snapshots?
    1. Snapshots which are shared cannot be used as a basis for other snapshots (Snapshots shared with other users are usable in full by the recipient, including but limited to the ability to base modified volumes and snapshots)
    2. You cannot share a snapshot containing an AWS Access Key ID or AWS Secret Access Key
    3. You cannot share encrypted snapshots (NOTE: this has be updated partially where you can share a encrypted snapshot with other accounts)
    4. Snapshot restorations are restricted to the region in which the snapshots are created
  21. There is a very serious outage at AWS. EC2 is not affected, but your EC2 instance deployment scripts stopped working in the region with the outage. What might be the issue?
    1. The AWS Console is down, so your CLI commands do not work.
    2. S3 is unavailable, so you can’t create EBS volumes from a snapshot you use to deploy new volumes. (EBS volume snapshots are stored in S3. If S3 is unavailable, snapshots are unavailable)
    3. AWS turns off the <code>DeployCode</code> API call when there are major outages, to protect from system floods.
    4. None of the other answers make sense. If EC2 is not affected, it must be some other issue.

AWS EBS Performance

AWS EBS Performance Tips

  • EBS Performance depends on several factors including I/O characteristics,  instances and volumes configuration and can be improved using PIOPS, EBS-Optimized instances, Pre-Warming, and RAIDed configuration.

EBS-Optimized or 10 Gigabit Network Instances

  • An EBS-Optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for EBS I/O.
  • Optimization provides the best performance for the EBS volumes by minimizing contention between EBS I/O and other traffic from an instance
  • EBS-Optimized instances deliver dedicated throughput to EBS depending on the instance type used.
  • Not all instance types support EBS-Optimization
  • Some Instance types enable EBS-Optimization by default, while it can be enabled for some.
  • EBS optimization enabled for an instance, that is not EBS-Optimized by default, an additional low, hourly fee for the dedicated capacity is charged
  • When attached to an EBS–optimized instance,
    • General Purpose (SSD) volumes are designed to deliver within 10% of their baseline and burst performance 99.9% of the time in a given year
    • Provisioned IOPS (SSD) volumes are designed to deliver within 10% of their provisioned performance 99.9 percent of the time in a given year.

EBS Volume Initialization – Pre-warming

  • Empty EBS volumes receive their maximum performance the moment that they are available and DO NOT require initialization (pre-warming).
  • EBS volumes needed a pre-warming, previously, before being used to get maximum performance to start with. Pre-warming of the volume was possible by writing to the entire volume with 0 for new volumes or reading the entire volume for volumes from snapshots.
  • Storage blocks on volumes that were restored from snapshots must be initialized (pulled down from S3 and written to the volume) before the block can be accessed.
  • This preliminary action takes time and can cause a significant increase in the latency of an I/O operation the first time each block is accessed.
  • To avoid this initial performance hit in a production environment, the following options can be used
    • Force the immediate initialization of the entire volume by using the dd or fio utilities to read from all of the blocks on a volume.
    • Enable fast snapshot restore – FSR on a snapshot to ensure that the EBS volumes created from it are fully-initialized at creation and instantly deliver all of their provisioned performance.

RAID Configuration

  • EBS volumes can be striped, if a single EBS volume does not meet the performance and more is required.
  • Striping volumes allows pushing tens of thousands of IOPS.
  • EBS volumes are already replicated across multiple servers in an AZ for availability and durability, so AWS generally recommend striping for performance rather than durability.
  • For greater I/O performance than can be achieved with a single volume, RAID 0 can stripe multiple volumes together; for on-instance redundancy, RAID 1 can mirror two volumes together.
  • RAID 0 allows I/O distribution across all volumes in a stripe, allowing straight gains with each addition.
  • RAID 1 can be used for durability to mirror volumes, but in this case, it requires more EC2 to EBS bandwidth as the data is written to multiple volumes simultaneously and should be used with EBS–optimization.
  • EBS volume data is replicated across multiple servers in an AZ to prevent the loss of data from the failure of any single component
  • AWS doesn’t recommend RAID 5 and 6 because the parity write operations of these modes consume the IOPS available for the volumes and can result in 20-30% fewer usable IOPS than RAID 0.
  • A 2-volume RAID 0 config can outperform a 4-volume RAID 6 that costs twice as much.

RAID Configuration

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A user is trying to pre-warm a blank EBS volume attached to a Linux instance. Which of the below mentioned steps should be performed by the user?
    1. There is no need to pre-warm an EBS volume (with latest update no pre-warming is needed)
    2. Contact AWS support to pre-warm (This used to be the case before, but pre warming is not necessary now)
    3. Unmount the volume before pre-warming
    4. Format the device
  2. A user has created an EBS volume of 10 GB and attached it to a running instance. The user is trying to access EBS for first time. Which of the below mentioned options is the correct statement with respect to a first time EBS access?
    1. The volume will show a size of 8 GB
    2. The volume will show a loss of the IOPS performance the first time (the volume needed to be wiped cleaned before for new volumes, however pre warming is not needed any more)
    3. The volume will be blank
    4. If the EBS is mounted it will ask the user to create a file system
  3. You are running a database on an EC2 instance, with the data stored on Elastic Block Store (EBS) for persistence At times throughout the day, you are seeing large variance in the response times of the database queries Looking into the instance with the isolate command you see a lot of wait time on the disk volume that the database’s data is stored on. What two ways can you improve the performance of the database’s storage while maintaining the current persistence of the data? Choose 2 answers
    1. Move to an SSD backed instance
    2. Move the database to an EBS-Optimized Instance
    3. Use Provisioned IOPs EBS
    4. Use the ephemeral storage on an m2.4xLarge Instance Instead
  4. You have launched an EC2 instance with four (4) 500 GB EBS Provisioned IOPS volumes attached. The EC2 Instance is EBS-Optimized and supports 500 Mbps throughput between EC2 and EBS. The two EBS volumes are configured as a single RAID 0 device, and each Provisioned IOPS volume is provisioned with 4,000 IOPS (4000 16KB reads or writes) for a total of 16,000 random IOPS on the instance. The EC2 Instance initially delivers the expected 16,000 IOPS random read and write performance. Sometime later in order to increase the total random I/O performance of the instance, you add an additional two 500 GB EBS Provisioned IOPS volumes to the RAID. Each volume is provisioned to 4,000 IOPS like the original four for a total of 24,000 IOPS on the EC2 instance Monitoring shows that the EC2 instance CPU utilization increased from 50% to 70%, but the total random IOPS measured at the instance level does not increase at all. What is the problem and a valid solution?
    1. Larger storage volumes support higher Provisioned IOPS rates: increase the provisioned volume storage of each of the 6 EBS volumes to 1TB.
    2. EBS-Optimized throughput limits the total IOPS that can be utilized use an EBS-Optimized instance that provides larger throughput. (EC2 Instance types have limit on max throughput and would require larger instance types to provide 24000 IOPS)
    3. Small block sizes cause performance degradation, limiting the I’O throughput, configure the instance device driver and file system to use 64KB blocks to increase throughput.
    4. RAID 0 only scales linearly to about 4 devices, use RAID 0 with 4 EBS Provisioned IOPS volumes but increase each Provisioned IOPS EBS volume to 6.000 IOPS.
    5. The standard EBS instance root volume limits the total IOPS rate, change the instant root volume to also be a 500GB 4,000 Provisioned IOPS volume
  5. A user has deployed an application on an EBS backed EC2 instance. For a better performance of application, it requires dedicated EC2 to EBS traffic. How can the user achieve this?
    1. Launch the EC2 instance as EBS provisioned with PIOPS EBS
    2. Launch the EC2 instance as EBS enhanced with PIOPS EBS
    3. Launch the EC2 instance as EBS dedicated with PIOPS EBS
    4. Launch the EC2 instance as EBS optimized with PIOPS EBS

AWS EBS Volume Types

EBS Volume Types

AWS EBS Volume Types

  • AWS provides the following EBS volume types, which differ in performance characteristics and price and can be tailored for storage performance and cost to the needs of the applications.
  • Solid state drives (SSD-backed) volumes optimized for transactional workloads involving frequent read/write operations with small I/O size, where the dominant performance attribute is IOPS
    • General Purpose SSD (gp2/gp3)
    • Provisioned IOPS SSD (io1/io2/io2 block express)
  • Hard disk drives (HDD-backed) volumes optimized for large streaming workloads where throughput (measured in MiB/s) is a better performance measure than IOPS
    • Throughput Optimized HDD (st1)
    • Cold HDD (sc1)
    • Magnetic Volumes (standard) (Previous Generation)

EBS Volume Types (New Generation)

EBS Volume Types

Solid state drives (SSD-backed) volumes

Solid state drives (SSD-backed) volumes

General Purpose SSD Volumes (gp2/gp3)

  • General Purpose SSD volumes offer cost-effective storage that is ideal for a broad range of workloads.
  • General Purpose SSD volumes deliver single-digit millisecond latencies
  • General Purpose SSD volumes can range in size from 1 GiB to 16 TiB.
  • General Purpose SSD (gp2) volumes
    • has a maximum throughput of 160 MiB/s (at 214 GiB and larger).
    • provides a baseline performance of 3 IOPS/GiB
    • provides the ability to burst to 3,000 IOPS for extended periods of time for volume size less than 1 TiB and up to a maximum of 16,000 IOPS (at 5,334 GiB).
    • If the volume performance is frequently limited to the baseline level (due to an empty I/O credit balance),
      • consider using a larger General Purpose SSD volume (with a higher baseline performance level) or
      • switching to a Provisioned IOPS SSD volume for workloads that require sustained IOPS performance greater than 16,000 IOPS.
  • General Purpose SSD (gp3) volumes
    • deliver a consistent baseline rate of 3,000 IOPS and 125 MiB/s, included with the price of storage.
    • additional IOPS (up to 16,000) and throughput (up to 1,000 MiB/s) can be provisioned for an additional cost.
    • the maximum ratio of provisioned IOPS to provisioned volume size is 500 IOPS per GiB
    • the maximum ratio of provisioned throughput to provisioned IOPS is .25 MiB/s per IOPS.

I/O Credits and Burst Performance

  • I/O credits represent the available bandwidth that the General Purpose SSD volume can use to burst large amounts of I/O when more than the baseline performance is needed.
  • General Purpose SSD (gp2) volume performance is governed by volume size, which dictates the baseline performance level of the volume for e.g. 100 GiB volume has a 300 IOPS @ 3 IOPS/GiB
  • General Purpose SSD volume size also determines how quickly it accumulates I/O credits for e.g. 100 GiB with a performance of 300 IOPS can accumulate 180K IOPS/10 mins (300 * 60 * 10).
  • Larger volumes have higher baseline performance levels and accumulate I/O credits faster for e.g. 1 TiB has a baseline performance of 3000 IOPS
  • More credits the volume has for I/O, the more time it can burst beyond its baseline performance level and the better it performs when more performance is needed for e.g. 300 GiB volume with 180K I/O credit can burst @ 3000 IOPS for 1 minute (180K/3000)
  • Each volume receives an initial I/O credit balance of 5,400,000 I/O credits, which is enough to sustain the maximum burst performance of 3,000 IOPS for 30 minutes.
  • Initial credit balance is designed to provide a fast initial boot cycle for boot volumes and a good bootstrapping experience for other applications.
  • Each volume can accumulate I/O credits over a period of time which can be to burst to the required performance level, up to a max of 3,000 IOPS
  • Unused I/O credit cannot go beyond 54,00,000 I/O credits.

IOPS vs Volume size

  • Volumes till 1 TiB can burst up to 3000 IOPS over and above its baseline performance
  • Volumes larger than 1 TiB have a baseline performance that is already equal to or greater than the maximum burst performance, and their I/O credit balance never depletes.
  • Baseline performance cannot be beyond 10000 IOPS for General Purpose SSD volumes and this limit is reached @ 3333 GiB

IOPS vs Volume Size

Baseline Performance

  • Formula – 3 IOPS i.e. GiB * 3
  • Calculation example
    • 1 GiB volume size =  3 IOPS (1 * 3 IOPS)
    • 250 GiB volume size = 750 IOPS (250* 3 IOPS)

Maximum burst duration @ 3000 IOPS

  • How much time can 5400000 IO credit be sustained @ the burst performance of 3000 IOPS. Subtract the baseline performance from 3000 IOPS which would be contributed by the volume size
  • Formula – 5400000/(3000 – Baseline performance)
  • Calculation example
    • 1 GiB volume size @ 3000 IOPS with 5400000 the burst performance can be maintained for 5400000/(3000-3) = 1802 secs
    • 250 GiB volume size @ 3000 IOPS with 5400000 the burst performance can be maintained for 5400000/(3000-3*250) = 2400 secs

Time to fill the 5400000 I/O credit balance

  • Formula – 5400000/Baseline performance
  • Calculation
    • 1 GiB volume size @ 3 IOPS would require 5400000/3 = 1800000 secs
    • 250 GiB volume size @ 750 IOPS would require 5400000/750 = 7200 secs

Provisioned IOPS SSD (io1/io2) Volumes

  • are designed to meet the needs of I/O intensive workloads, particularly database workloads, that are sensitive to storage performance and consistency in random access I/O throughput.
  • IOPS rate can be specified when the volume is created, and EBS delivers within 10% of the provisioned IOPS performance 99.9% of the time over a given year.
  • can range in size from 4 GiB to 16 TiB
  • have a throughput limit range of 256 KiB for each IOPS provisioned, up to a maximum of 320 500 MiB/s (at 32000 IOPS)
  • can be provision up to 20,000 32,000 64,000 IOPS per volume.
  • Ratio of IOPS provisioned to the volume size requested can be a maximum of 30 50; e.g., a volume with 5,000 IOPS must be at least 100 GiB.
  • can be striped together in a RAID configuration for larger size and greater performance over 20000 IOPS

Hard disk drives (HDD-backed) volumes

Hard disk drives (HDD-backed) volumes

Throughput Optimized HDD (st1) Volumes

  • provide low-cost magnetic storage that defines performance in terms of throughput rather than IOPS.
  • is a good fit for large, sequential workloads such as EMR, ETL, data warehouses, and log processing
  • do not support Bootable sc1 volumes
  • are designed to support frequently accessed data
  • uses a burst-bucket model for performance similar to gp2. Volume size determines the baseline throughput of the volume, which is the rate at which the volume accumulates throughput credits. Volume size also determines the burst throughput of your volume, which is the rate at which you can spend credits when they are available.

Cold HDD (sc1) Volumes

  • provide low-cost magnetic storage that defines performance in terms of throughput rather than IOPS.
  • With a lower throughput limit than st1, sc1 is a good fit ideal for large, sequential cold-data workloads.
  • ideal for infrequent access to data and are looking to save costs, sc1 provides inexpensive block storage
  • do not support Bootable sc1 volumes
  • though are similar to Throughput Optimized HDD (st1) volumes, are designed to support infrequently accessed data.
  • uses a burst-bucket model for performance similar to gp2. Volume size determines the baseline throughput of the volume, which is the rate at which the volume accumulates throughput credits. Volume size also determines the burst throughput of your volume, which is the rate at which you can spend credits when they are available.

Magnetic Volumes (standard)

Magnetic volumes provide the lowest cost per gigabyte of all EBS volume types. Magnetic volumes are backed by magnetic drives and are ideal for workloads performing sequential reads, workloads where data is accessed infrequently, and scenarios where the lowest storage cost is important.

  • Magnetic volumes can range in size from1 GiB to 1 TiB
  • These volumes deliver approximately 100 IOPS on average, with burst capability of up to hundreds of IOPS
  • Magnetic volumes can be striped together in a RAID configuration for larger size and greater performance.

EBS Volume Types (Previous Generation – Reference Only)

EBS Volume Types Comparision

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You are designing an enterprise data storage system. Your data management software system requires mountable disks and a real filesystem, so you cannot use S3 for storage. You need persistence, so you will be using AWS EBS Volumes for your system. The system needs as low-cost storage as possible, and access is not frequent or high throughput, and is mostly sequential reads. Which is the most appropriate EBS Volume Type for this scenario?
    1. gp1
    2. io1
    3. standard (Standard or Magnetic volumes are suited for cold workloads where data is infrequently accessed, or scenarios where the lowest storage cost is important)
    4. gp2
  2. Which EBS volume type is best for high performance NoSQL cluster deployments?
    1. io1 (io1 volumes, or Provisioned IOPS (PIOPS) SSDs, are best for: Critical business applications that require sustained IOPS performance, or more than 10,000 IOPS or 160 MiB/s of throughput per volume, like large database workloads, such as MongoDB.)
    2. gp1
    3. standard
    4. gp2
  3. Provisioned IOPS Costs: you are charged for the IOPS and storage whether or not you use them in a given month.
    1. FALSE
    2. TRUE
  4. A user is trying to create a PIOPS EBS volume with 8 GB size and 450 IOPS. Will AWS create the volume?
    1. Yes, since the ratio between EBS and IOPS is less than 50
    2. No, since the PIOPS and EBS size ratio is less than 50
    3. No, the EBS size is less than 10 GB
    4. Yes, since PIOPS is higher than 100
  5. A user has provisioned 2000 IOPS to the EBS volume. The application hosted on that EBS is experiencing fewer IOPS than provisioned. Which of the below mentioned options does not affect the IOPS of the volume?
    1. The application does not have enough IO for the volume
    2. Instance is EBS optimized
    3. The EC2 instance has 10 Gigabit Network connectivity
    4. Volume size is too large
  6. A user is trying to create a PIOPS EBS volume with 6000 IOPS and 100 GB size. AWS does not allow the user to create this volume. What is the possible root cause for this?
    1. The ratio between IOPS and the EBS volume is higher than 50
    2. The maximum IOPS supported by EBS is 3000
    3. The ratio between IOPS and the EBS volume is lower than 100
    4. PIOPS is supported for EBS higher than 500 GB size

References

AWS Storage Options – EBS & Instance Store

AWS Storage Options – EBS & Instance Store

  • Elastic Block Store – EBS and Instance Store provide block-level storage options for EC2 instances.

Elastic Block Store (EBS) volume

  • EBS provides durable block-level storage for use with EC2 instances
  • EBS volumes are off-instance, network-attached storage (NAS) that persists independently from the running life of a single EC2 instance.
  • EBS volume is attached to an instance and can be used as a physical hard drive, typically by formatting it with the file system of your choice and using the file I/O interface provided by the instance operating system.
  • EBS volume can be used to boot an EC2 instance (EBS-root AMIs only), and multiple EBS volumes can be attached to a single EC2 instance.
  • EBS volume can be attached to a single EC2 instance only at any point in time.
  • EBS Multi-Attach volume can be attached to multiple EC2 instances.
  • EBS provides the ability to take point-in-time snapshots, which are persisted in S3. These snapshots can be used to instantiate new EBS volumes and to protect data for long-term durability
  • EBS snapshots can be copied across AWS regions as well, making it easier to leverage multiple AWS regions for geographical expansion, data center migration, and disaster recovery

Ideal Usage Patterns

  • EBS is meant for data that changes relatively frequently and requires long-term persistence.
  • EBS volume provides access to raw block-level storage and is particularly well-suited for use as the primary storage for a database or file system
  • EBS Provisioned IOPS volumes are particularly well-suited for use with databases applications that require a high and consistent rate of random disk reads and writes

Anti-Patterns

  • Temporary Storage
    • EBS volume persists independent of the attached EC2 life cycle.
    • For temporary storage such as caches, buffers, queues, etc it is better to use local instance store volumes, SQS, or Elastic Cache
  • Highly-durable storage
    • EBS volumes with less than 20 GB of modified data since the last snapshot are designed for between 99.5% and 99.9% annual durability; volumes with more modified data can be expected to have proportionally lower durability
    • For highly durable storage, use S3 or Glacier which provides 99.999999999% annual durability per object
  • Static data or web content
    • For static web content, where data infrequently changes, EBS with EC2 would require a web server to serve the pages.
    • S3 may represent a more cost-effective and scalable solution for storing this fixed information and is served directly out of S3.

EBS Performance

  • EBS provides two volume types: standard volumes and Provisioned IOPS volumes which differ in performance characteristics and pricing model, allowing you to tailor the storage performance and cost to the needs of the applications.
  • EBS Volumes can be attached and striped across multiple similarly-provisioned EBS volumes using RAID 0 or logical volume manager software, thus aggregating available IOPs, total volume throughput, and total volume size.
  • Standard volumes offer cost-effective storage for applications with moderate or bursty I/O requirements. Standard volumes are also well suited for use as boot volumes, where the burst capability provides fast instance start-up times.
  • Provisioned IOPS volumes are designed to deliver predictable, high performance for I/O intensive workloads such as databases. With Provisioned IOPS, you specify an IOPS rate when creating a volume, and then EBS provisions that rate for the lifetime of the volume.
  • As EBS volumes are network-attached devices, other network I/O performed by the instance, as well as the total load on the shared network, can affect individual EBS volume performance.
  • EBS-optimized instances can be launched which deliver dedicated throughput between EC2 and EBS and enables instances to fully utilize the Provisioned IOPS on an EBS volume,
  • Each separate EBS volume can be configured as EBS standard or EBS Provisioned IOPS as needed. Alternatively, you could stripe the data.

EBS Durability & Availability

  • EBS volumes are designed to be highly available and reliable.
  • EBS volume data is replicated across multiple servers in a single AZ to prevent the loss of data from the failure of any single component.
  • EBS volume durability depends on both the size of the volume and the amount of data that has changed since your last snapshot
  • EBS snapshots are incremental, point-in-time backups, containing only the data blocks changed since the last snapshot.
  • Frequent snapshots are recommended to maximize both the durability and availability of their  EBS data
  • EBS snapshots provide an easy-to-use disk clone or disk image mechanism for backup, sharing, and disaster recovery.

EBS Cost Model

  • EBS pricing has 3 components: provisioned storage, I/O requests, and snapshot storage
  • Standard volumes are charged per GB-month of provisioned storage and per million I/O requests
  • EBS Provisioned IOPS volumes are charged per GB-month of provisioned storage and per Provisioned IOPS-month
  • For both volumes, EBS snapshots are charged per GB-month of data stored. EBS snapshot copy is charged for the data transferred between regions, and for the standard EBS snapshot charges in the destination region.
  • EBS volume storage capacity is allocated at the time of volume creation, and you are charged for this allocated storage even if not used.
  • For EBS snapshots, you are charged only for storage actually used (consumed). Note that EBS snapshots are incremental and compressed, so the storage used in any snapshot is generally much less than the storage consumed on an EBS volume

EBS Scalability and Elasticity

  • EBS volumes can easily and rapidly be provisioned and released to scale in and out with the changing total storage demands
  • EBS volumes cannot be resized, and if additional storage is needed either
    • An additional volume can be attached
    • Create a snapshot and create a new volume from the snapshot with a higher volume size
  • EBS volumes can be resized dynamically, but cannot be reduced by size.

Interfaces

  • AWS offers management APIs for EBS in both SOAP and REST formats which can be used to create, delete, describe, attach, and detach EBS volumes for the EC2 instances as well as to create, delete, and describe snapshots from EBS to S3; and to copy snapshots across regions.
  • Amazon also offers the same capabilities through AWS Management Console

Instance Store Volumes

  • Instance Store volumes are also referred to as Ephemeral Storage.
  • Instance Store volumes provide temporary block-level storage and consist of a preconfigured and pre-attached block of disk storage on the same physical server as the EC2 instance
  • Instance storage’s amount of disk storage depends on the Instance type and larger instances provide both more and larger instance store volumes. Smaller instance types such as micro instances can only be launched with EBS volumes.
  • Storage-optimized instances provide special purpose instance storage targeted to specific uses case for e.g. HI1 provides very fast solid-state drive (SSD) backed instance storage capable of supporting over 120,000 random read IOPS, and is optimized for very high random I/O performance and low cost per IOPS. While, HS1 instances are optimized for very high storage density, low storage cost, and high sequential I/O performance.
  • Instance store volumes, unlike EBS volumes, cannot be detached or attached to another instance.

Ideal Usage Patterns

  • EC2 local instance store volumes are fast, free (that is, included in the price of the EC2 instance) “scratch volumes” best suited for storing temporary data that is continually changing, such as buffers, caches, scratch data or can easily be regenerated, or data that is replicated for durability
  • High I/O instances provide instance store volumes backed by SSD, and are ideally suited for many high performance database workloads. for e.g. applications include NoSQL databases like Cassandra and MongoDB.
  • High storage instances support much higher storage density per EC2 instance and are ideally suited for applications that benefit from high sequential I/O performance across very large datasets. e.g. applications include data warehouses, Hadoop storage nodes, seismic analysis, cluster file systems, etc.

Anti-Patterns

  • Persistent storage
    • For persistent virtual disk storage similar to a physical disk drive for files or other data that must persist longer than the lifetime of a single  EC2 instance, EBS volumes or S3 are more appropriate.
  • Relational database storage
    • In most cases, relational databases require storage that persists beyond the lifetime of a single EC2 instance, making EBS volumes the natural choice.
  • Shared storage
    • Instance store volumes are dedicated to a single EC2 instance, and cannot be shared with other systems or users.
    • If you need storage that can be detached from one instance and attached to a different instance, or if you need the ability to share data easily, S3 or EBS volumes are the better choices.
  • Snapshots
    • If you need the convenience, long-term durability, availability, and shareability of point-in-time disk snapshots, EBS volumes are a better choice.

Instance Store Performance

  • Non-SSD-based instance store volumes in most EC2 instance families have performance characteristics similar to standard EBS volumes.
  • EC2 instance virtual machine and the local instance store volumes are located in the same physical server, and interaction with the storage is very fast, particularly for sequential access.
  • To further increase aggregate IOPS, or to improve sequential disk throughput, multiple instance store volumes can be grouped together using RAID 0 (disk striping) software.
  • Because the bandwidth to the disks is not limited by the network, aggregate sequential throughput for multiple instance volumes can be higher than for the same number of EBS volumes.
  • SSD instance store volumes in the EC2 high I/O instances provide from tens of thousands to hundreds of thousands of low-latency, random 4 KB random IOPS.
  • Because of the I/O characteristics of SSD devices, write performance can be variable.
  • Instance store volumes on EC2 high storage instances provide very high storage density and high sequential read and write performance. High storage instances are capable of delivering 2.6 GB/sec of sequential read and write performance when using a block size of 2 MB.

Instance Store Durability and Availability

  • EC2 local instance store volumes are not intended to be used as durable disk storage and they persist only during the life of the associate EC2 instance

Cost Model

  • Cost of the EC2 instance includes any local instance store volumes if the instance type provides them.
  • While there is no additional charge for data storage on local instance store volumes, note that data transferred to and from EC2 instance store volumes from other AZs or outside of an EC2 region may incur data transfer charges, and additional charges will apply for use of any persistent storage, such as S3, Glacier, EBS volumes, and EBS snapshots

Scalability and Elasticity

  • Local instance store volumes are tied to a particular EC2 instance and are fixed in number and size for a given EC2 instance type, so the scalability and elasticity of this storage are tied to the number of EC2 instances.

Interfaces

  • Instance store volumes are specified using the block device mapping feature of the EC2 API and the AWS Management Console
  • To the EC2 instance, an instance store volume appears just like a local disk drive. To write to and read data from instance store volumes, use the native file system I/O interfaces of the chosen operating system.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Which of the following provides the fastest storage medium?
    1. Amazon S3
    2. Amazon EBS using Provisioned IOPS (PIOPS)
    3. SSD Instance (ephemeral) store (SSD Instance Storage provides 100,000 IOPS on some instance types, much faster than any network-attached storage)
    4. AWS Storage Gateway

References

AWS S3 vs EBS vs EFS

S3 vs EBS vs EFS

EFS, EBS, and S3 are AWS’ three different storage types that are applicable for different types of workload needs

S3 vs EBS vs EFS Comparision

S3 vs EBS vs EFS

Simple Storage Service – S3

  • is an object store with a simple key, value store design, and good at storing vast numbers of backups or user files.
  • offers pay for the storage you actually use. Offers cost-saving storage classes ideal for infrequently access data or for data archival
  • provides unlimited storage
  • provides durability as the data is replicated and stored across at least three geographically dispersed AZs with a maximum of 99.999999999% (11! 9’s)
  • provide high availability with a maximum of 99.99%
  • provides security with a range of access control mechanisms and abilities to encrypt data at rest and in transit
  • data can be accessed programmatically or directly from services such as AWS CloudFront.
  • provides backup capability using versioning and cross-region replication

Elastic Block Storage – EBS

  • delivers high-availability block-level storage volumes for EC2 instances.
  • offers pay for the provisioned storage, even if you do not use it
  • provides limited storage capability and cannot scale infinitely
  • stores data on a file system which can be retained after the EC2 instance is shut down.
  • provides durability by replicating data across multiple servers in an AZ to prevent the loss of data from the failure of any single component
  • designed for 99.999% availability
  • provides low-latency performance – using SSD EBS volumes, it offers reliable I/O performance scaled to meet your workload needs.
  • provides secure storage with access control and providing data at rest and in transit encryption
  • is only accessible from a single EC2 instance in the particular AWS region and AZ
  • provides Multi-Attach option to share storage across multiple EC2 instances, but within a particular AWS region and AZ
  • provides backup capability using backups and snapshots

Elastic File Storage – EFS

  • scalable file storage, also optimized for EC2.
  • offers pay for the storage you actually use. There’s no advance provisioning, up-front fees, or commitments
  • multiple instances can be configured to mount the file system.
  • allows mounting the file system across multiple regions and instances.
  • is designed to be highly durable and highly available. Data is redundantly stored across multiple AZs.
  • provides elasticity – scales up and down automatically, even to meet the most abrupt workload spikes.
  • provides performance that scales to support any workload: EFS offers the throughput changing workloads need. It can provide higher throughput in spurts that match sudden file system growth, even for workloads up to 500,000 IOPS or 10 GB per second.
  • provides accessible file storage, which can be accessed by On-premises servers and EC2 instances concurrently.
  • provides security and compliance – access to the file system can be secured with the current security solution, or control access to EFS file systems using IAM, VPC, or POSIX permissions.
  • provides data encryption in transit or at rest.
  • allows EC2 instances to access EFS file systems located in other AWS regions through VPC peering.
  • a file system can be accessed concurrently from all AZs in the region where it is located, which means the application can be architected to failover from one AZ to other AZs in the region in order to ensure the highest level of application availability. Mount targets themselves are designed to be highly available.
  • used as a common data source for any application or workload that runs on numerous instances.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company runs an application on a group of Amazon Linux EC2 instances. The application writes log files using standard API calls. For compliance reasons, all log files must be retained indefinitely and will be analyzed by a reporting tool that must access all files concurrently. Which storage service should a solutions architect use to provide the MOST cost-effective solution?
    1. Amazon EBS
    2. Amazon EFS
    3. Amazon EC2 instance store
    4. Amazon S3
  2. A new application is being deployed on Amazon EC2. The Application needs to read write upto 3 TB of data to an external data store and requires read-after-write consistency across all AWS regions for writing new objects into this data store.
    1. Amazon EBS
    2. Amazon Glacier
    3. Amazon EFS
    4. Amazon S3
  3. To meet the requirements of an application, an organization needs to save a constantly increasing volume of files on a cloud storage system with the following features and abilities. What below AWS service will meet these requirements?
      1. Pay only for the storage used
      2. Create different security policies for different groups of files
      3. Allow access to the public
      4. Retrieve the files at any time
      5. Store an unlimited number of files
    1. Amazon EBS
    2. Amazon S3
    3. Amazon Glacier
    4. Amazon EFS
  4. An administrator runs a highly available application in AWS. A file storage layer is needed that can share between instances and scale the platform more easily. The storage should also be POSIX compliant. Which AWS service can perform this action?
    1. Amazon EBS
    2. Amazon S3
    3. Amazon EFS
    4. Amazon EC2 Instance store

Reference

AWS_When_to_choose_EFS

AWS Storage Services Cheat Sheet

AWS Storage Services Cheat Sheet

AWS Storage Services

Simple Storage Service – S3

  • provides key-value based object storage with unlimited storage, unlimited objects up to 5 TB for the internet
  • offers an extremely durable, highly available, and infinitely scalable data storage infrastructure at very low costs.
  • is Object-level storage (not a Block level storage) and cannot be used to host OS or dynamic websites (but can work with Javascript SDK)
  • provides durability by redundantly storing objects on multiple facilities within a region
  • regularly verifies the integrity of data using checksums and provides the auto-healing capability
  • S3 resources consist of globally unique buckets with objects and related metadata. The data model is a flat structure with no hierarchies or folders.
  • S3 Replication enables automatic, asynchronous copying of objects across S3 buckets in the same or different AWS regions using SRR or CRR. Replication needs versioning enabled on either side.
  • S3 Transfer Acceleration helps speed data transport over long distances between a client and an S3 bucket using CloudFront edge locations.
  • S3 supports cost-effective Static Website hosting with Client-side scripts.
  • S3 CORS – Cross-Origin Resource Sharing allows cross-origin access to S3 resources.
  • S3 Access Logs enables tracking access requests to an S3 bucket.
  • S3 notification feature enables notifications to be triggered when certain events happen in the bucket.
  • S3 Inventory helps manage the storage and can be used to audit and report on the replication and encryption status of the objects for business, compliance, and regulatory needs.
  • Requestor Pays help bucket owner to specify that the requester requesting the download will be charged for the download.
  • S3 Batch Operations help perform large-scale batch operations on S3 objects and can perform a single operation on lists of specified S3 objects.
  • Pre-Signed URLs can be used shared for uploading/downloading objects for a limited time without requiring AWS security credentials.
  • Multipart Uploads allows
    • parallel uploads with improved throughput and bandwidth utilization
    • fault tolerance and quick recovery from network issues
    • ability to pause and resume uploads
    • begin an upload before the final object size is known
  • Versioning
    • helps preserve, retrieve, and restore every version of every object
    • protect from unintended overwrites and accidental deletions
    • protects individual files but does NOT protect from Bucket deletion
  • MFA (Multi-Factor Authentication) can be enabled for additional security for the deletion of objects.
  • Integrates with CloudTrail, CloudWatch, and SNS for event notifications
  • S3 Storage Classes
    • S3 Standard
      • default storage class, ideal for frequently accessed data
      • 99.999999999% durability & 99.99% availability
      • Low latency and high throughput performance
      • designed to sustain the loss of data in a two facilities
    • S3 Standard-Infrequent Access (S3 Standard-IA)
      • optimized for long-lived and less frequently accessed data
      • designed to sustain the loss of data in a two facilities
      • 99.999999999% durability & 99.9% availability
      • suitable for objects greater than 128 KB kept for at least 30 days
    • S3 One Zone-Infrequent Access (S3 One Zone-IA)
      • optimized for rapid access, less frequently access data
      • ideal for secondary backups and reproducible data
      • stores data in a single AZ, data stored in this storage class will be lost in the event of AZ destruction.
      • 99.999999999% durability & 99.5% availability
    • S3 Reduced Redundancy Storage (Not Recommended)
      • designed for noncritical, reproducible data stored at lower levels of redundancy than the STANDARD storage class
      • reduces storage costs
      • 99.99% durability & 99.99% availability
      • designed to sustain the loss of data in a single facility
    • S3 Glacier
      • suitable for low cost data archiving, where data access is infrequent
      • provides retrieval time of minutes to several hours
        • Expedited – 1 to 5 minutes
        • Standard – 3 to 5 hours
        • Bulk – 5 to 12 hours
      • 99.999999999% durability & 99.9% availability
      • Minimum storage duration of 90 days
    • S3 Glacier Deep Archive (S3 Glacier Deep Archive)
      • provides lowest cost data archiving, where data access is infrequent
      • 99.999999999% durability & 99.9% availability
      • provides retrieval time of several (12-48) hours
        • Standard – 12 hours
        • Bulk – 48 hours
      • Minimum storage duration of 180 days
      • supports long-term retention and digital preservation for data that may be accessed once or twice a year
  • Lifecycle Management policies
    • transition to move objects to different storage classes and Glacier
    • expiration to remove objects and object versions
    • can be applied to both current and non-current objects, in case, versioning is enabled.
  • Data Consistency Model
    • provides strong read-after-write consistency for PUT and DELETE requests of objects in the S3 bucket in all AWS Regions
    • updates to a single key are atomic
    • does not currently support object locking for concurrent writes
  • S3 Security
    • IAM policies – grant users within your own AWS account permission to access S3 resources
    • Bucket and Object ACL – grant other AWS accounts (not specific users) access to  S3 resources
    • Bucket policies – allows to add or deny permissions across some or all of the objects within a single bucket
    • S3 Access Points simplify data access for any AWS service or customer application that stores data in S3.
    • S3 Glacier Vault Lock helps deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy.
    • S3 VPC Gateway Endpoint enables private connections between a VPC and S3, without requiring that you use an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
    • Support SSL encryption of data in transit and data encryption at rest
  • S3 Data Encryption
    • supports data at rest and data in transit encryption
    • Server-Side Encryption
      • SSE-S3 – encrypts S3 objects using keys handled & managed by AWS
      • SSE-KMS – leverage AWS Key Management Service to manage encryption keys. KMS provides control and audit trail over the keys.
      • SSE-C – when you want to manage your own encryption keys. AWS does not store the encryption key. Requires HTTPS.
    • Client-Side Encryption
      • Client library such as the S3 Encryption Client
      • Clients must encrypt data themselves before sending it to S3
      • Clients must decrypt data themselves when retrieving from S3
      • Customer fully manages the keys and encryption cycle
  • S3 Best Practices
    • use random hash prefix for keys and ensure a random access pattern, as S3 stores object lexicographically randomness helps distribute the contents across multiple partitions for better performance
    • use parallel threads and Multipart upload for faster writes
    • use parallel threads and Range Header GET for faster reads
    • for list operations with a large number of objects, it’s better to build a secondary index in DynamoDB
    • use Versioning to protect from unintended overwrites and deletions, but this does not protect against bucket deletion
    • use VPC S3 Endpoints with VPC to transfer data using Amazon internal network

Instance Store

  • provides temporary or ephemeral block-level storage for an EC2 instance
  • is physically attached to the Instance
  • deliver very high random I/O performance, which is a good option when storage with very low latency is needed
  • cannot be dynamically resized
  • data persists when an instance is rebooted
  • data does not persists if the
    • underlying disk drive fails
    • instance stops i.e. if the EBS backed instance with instance store volumes attached is stopped
    • instance terminates
  • can be attached to an EC2 instance only when the instance is launched
  • is ideal for the temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.

Elastic Block Store – EBS

  • is virtual network-attached block storage
  • provides highly available, reliable, durable, block-level storage volumes that can be attached to a running instance
  • provides high durability and are redundant in an AZ, as the data is automatically replicated within that AZ to prevent data loss due to any single hardware component failure
  • persists and is independent of EC2 lifecycle
  • multiple volumes can be attached to a single EC2 instance
  • can be detached & attached to another EC2 instance in that same AZ only
  • volumes are Zonal i.e. created in a specific AZ and CAN’T span across AZs
  • snapshots
  • for making volume available to different AZ, create a snapshot of the volume and restore it to a new volume in any AZ within the region
  • for making the volume available to different Region, the snapshot of the volume can be copied to a different region and restored as a volume
  • PIOPS is designed to run transactions applications that require high and consistent IO for e.g. Relation database, NoSQL, etc
  • volumes CANNOT be shared with multiple EC2 instances, use EFS instead
  • Multi-Attach enables attaching a single Provisioned IOPS SSD (io1 or io2) volume to multiple instances that are in the same AZ.

EBS Encryption

  • allow encryption using the EBS encryption feature.
  • All data stored at rest, disk I/O, and snapshots created from the volume are encrypted.
  • uses 256-bit AES algorithms (AES-256) and an Amazon-managed KMS
  • Snapshots of encrypted EBS volumes are automatically encrypted.

EBS Snapshots

  • helps create backups of EBS volumes
  • are incremental
  • occur asynchronously, consume the instance IOPS
  • are regional and CANNOT span across regions
  • can be copied across regions to make it easier to leverage multiple regions for geographical expansion, data center migration, and disaster recovery
  • can be shared by making them public or with specific AWS accounts by modifying the access permissions of the snapshots
  • support EBS encryption
    • Snapshots of encrypted volumes are automatically encrypted
    • Volumes created from encrypted snapshots are automatically encrypted
    • All data in flight between the instance and the volume is encrypted
    • Volumes created from an unencrypted snapshot owned or have access to can be encrypted on the fly.
    • Encrypted snapshot owned or having access to, can be encrypted with a different key during the copy process.
  • can be automated using AWS Data Lifecycle Manager

EBS vs Instance Store

Refer blog post @ EBS vs Instance Store

Glacier

  • suitable for archiving data, where data access is infrequent and a retrieval time of several hours (3 to 5 hours) is acceptable (Not true anymore with enhancements from AWS)
  • provides a high durability by storing archive in multiple facilities and multiple devices at a very low cost storage
  • performs regular, systematic data integrity checks and is built to be automatically self healing
  • aggregate files into bigger files before sending them to Glacier and use range retrievals to retrieve partial file and reduce costs
  • improve speed and reliability with multipart upload
  • automatically encrypts the data using AES-256
  • upload or download data to Glacier via SSL encrypted endpoints

EFS

  • fully-managed, easy to set up, scale, and cost-optimize file storage
  • can automatically scale from gigabytes to petabytes of data without needing to provision storage
  • provides managed NFS (network file system) that can be mounted on and accessed by multiple EC2 in multiple AZs simultaneously
  • highly durable, highly scalable and highly available.
    • stores data redundantly across multiple Availability Zones
    • grows and shrinks automatically as files are added and removed, so you there is no need to manage storage procurement or provisioning.
  • expensive (3x gp2), but you pay per use
  • uses the Network File System version 4 (NFS v4) protocol
  • is compatible with all Linux-based AMIs for EC2,  POSIX file system (~Linux) that has a standard file API
  • does not support Windows AMI
  • offers the ability to encrypt data at rest using KMS and in transit.
  • can be accessed from on-premises using an AWS Direct Connect or AWS VPN connection between the on-premises datacenter and VPC.
  • can be accessed concurrently from servers in the on-premises datacenter as well as EC2 instances in the Amazon VPC
  • Performance mode
    • General purpose (default)
      • latency-sensitive use cases (web server, CMS, etc…)
    • Max I/O
      • higher latency, throughput, highly parallel (big data, media processing)
  • Storage Tiers
    • Standard
      • for frequently accessed files
      • ideal for active file system workloads and you pay only for the file system storage you use per month
    • Infrequent access (EFS-IA)
      • a lower cost storage class that’s cost-optimized for files infrequently accessed i.e. not accessed every day
      • cost to retrieve files, lower price to store
    • EFS Lifecycle Management with choosing an age-off policy allows moving files to EFS IA
    • Lifecycle Management automatically moves the data to the EFS IA storage class according to the lifecycle policy. for e.g., you can move files automatically into EFS IA fourteen days of not being accessed.
    • EFS is a shared POSIX system for Linux systems and does not work for Windows

Amazon FSx for Windows

  • is a fully managed,  highly reliable, and scalable Windows file system share drive
  • supports SMB protocol & Windows NTFS
  • supports Microsoft Active Directory integration, ACLs, user quotas
  • built on SSD, scale up to 10s of GB/s, millions of IOPS, 100s PB of data
  • is accessible from Windows, Linux, and MacOS compute instances
  • can be accessed from the on-premise infrastructure
  • can be configured to be Multi-AZ (high availability)
  • supports encryption of data at rest and in transit
  • provides data deduplication, which enables further cost optimization by removing redundant data.
  • data is backed-up daily to S3

Amazon FSx for Lustre

  • provides easy and cost effective way to launch and run the world’s most popular high-performance file system.
  • is a type of parallel distributed file system, for large-scale computing
  • Lustre is derived from “Linux” and “cluster”
  • Machine Learning, High Performance Computing (HPC) esp. Video Processing, Financial Modeling, Electronic Design Automation
  • scales up to 100s GB/s, millions of IOPS, sub-ms latencies
  • seamless integration with S3, it transparently presents S3 objects as files and allows you to write changed data back to S3.
  • can “read S3” as a file system (through FSx)
  • can write the output of the computations back to S3 (through FSx)
  • supports encryption of data at rest and in transit
  • can be used from on-premise servers

CloudFront

  • provides low latency and high data transfer speeds for distribution of static, dynamic web or streaming content to web users
  • delivers the content through a worldwide network of data centers called Edge Locations
  • keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.
  • dramatically reduces the number of network hops that users’ requests must pass through
  • supports multiple origin server options, like AWS hosted service for e.g. S3, EC2, ELB or an on premise server, which stores the original, definitive version of the objects
  • single distribution can have multiple origins and Path pattern in a cache behavior determines which requests are routed to the origin
  • supports Web Download distribution and RTMP Streaming distribution
    • Web distribution supports static, dynamic web content, on demand using progressive download & HLS and live streaming video content
    • RTMP supports streaming of media files using Adobe Media Server and the Adobe Real-Time Messaging Protocol (RTMP) ONLY
  • supports HTTPS using either
    • dedicated IP address, which is expensive as dedicated IP address is assigned to each CloudFront edge location
    • Server Name Indication (SNI), which is free but supported by modern browsers only with the domain name available in the request header
  • For E2E HTTPS connection,
    • Viewers -> CloudFront needs either self signed certificate, or certificate issued by CA or ACM
    • CloudFront -> Origin needs certificate issued by ACM for ELB and by CA for other origins
  •  Security
    • Origin Access Identity (OAI) can be used to restrict the content from S3 origin to be accessible from CloudFront only
    • supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
    • Signed URLs 
      • for RTMP distribution as signed cookies aren’t supported
      • to restrict access to individual files, for e.g., an installation download for your application.
      • users using a client, for e.g. a custom HTTP client, that doesn’t support cookies
    • Signed Cookies
      • provide access to multiple restricted files, for e.g., video part files in HLS format or all of the files in the subscribers’ area of a website.
      • don’t want to change the current URLs
    • integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
  • supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
    • only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
    • does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
  • object removal from cache
    • would be removed upon expiry (TTL) from the cache, by default 24 hrs
    • can be invalidated explicitly, but has a cost associated, however might continue to see the old version until it expires from those caches
    • objects can be invalidated only for Web distribution
    • change object name, versioning, to serve different version
  • supports adding or modifying custom headers before the request is sent to origin which can be used to
    • validate if user is accessing the content from CDN
    • identifying CDN from which the request was forwarded from, in case of multiple CloudFront distribution
    • for viewers not supporting CORS to return the Access-Control-Allow-Origin header for every request
  • supports Partial GET requests using range header to download object in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
  • supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
  • supports different price class to include all regions, to include only least expensive regions and other regions to exclude most expensive regions
  • supports access logs which contain detailed information about every user request for both web and RTMP distribution

AWS Import/Export

  • accelerates moving large amounts of data into and out of AWS using portable storage devices for transport and transfers data directly using Amazon’s high speed internal network, bypassing the internet.
  • suitable for use cases with
    • large datasets
    • low bandwidth connections
    • first time migration of data
  • Importing data to several types of AWS storage, including EBS snapshots, S3 buckets, and Glacier vaults.
  • Exporting data out from S3 only, with versioning enabled only the latest version is exported
  • Import data can be encrypted (optional but recommended) while export is always encrypted using Truecrypt
  • Amazon will wipe the device if specified, however it will not destroy the device

AWS Elastic Block Store Storage – EBS

EC2 Elastic Block Storage – EBS

  • Elastic Block Storage – EBS provides highly available, reliable, durable, block-level storage volumes that can be attached to an EC2 instance.
  • EBS as a primary storage device is recommended for data that requires frequent and granular updates e.g. running a database or filesystem.
  • An EBS volume
    • behaves like a raw, unformatted, external block device that can be attached to a single EC2 instance at a time.
    • persists independently from the running life of an instance.
    • is Zonal and can be attached to any instance within the same Availability Zone and can be used like any other physical hard drive.
    • is particularly well-suited for use as the primary storage for file systems, databases, or any applications that require fine granular updates and access to raw, unformatted, block-level storage.

Elastic Block Storage Features

  • EBS Volumes are created in a specific Availability Zone and can be attached to any instance in that same AZ.
  • Volumes can be backed up by creating a snapshot of the volume, which is stored in S3.
  • Volumes can be created from a snapshot that can be attached to another instance within the same region.
  • Volumes can be made available outside of the AZ by creating and restoring the snapshot to a new volume anywhere in that region.
  • Snapshots can also be copied to other regions and then restored to new volumes, making it easier to leverage multiple AWS regions for geographical expansion, data center migration, and disaster recovery.
  • Volumes allow encryption using the EBS encryption feature. All data stored at rest, disk I/O, and snapshots created from the volume are encrypted.
  • Encryption occurs on the EC2 instance, providing encryption of data-in-transit from EC2 to the EBS volume.
  • Elastic Volumes help easily adapt the volumes as the needs of the applications change. Elastic Volumes allow you to dynamically increase capacity, tune performance, and change the type of any new or existing current generation volume with no downtime or performance impact.
  • You can dynamically increase size, modify the provisioned IOPS capacity, and change volume type on live production volumes.
  • General Purpose (SSD) volumes support up to 10,000 16000 IOPS and 160 250 MB/s of throughput and Provisioned IOPS (SSD) volumes support up to 20,000 64000 IOPS and 320 1000 MB/s of throughput.
  • EBS Magnetic volumes can be created from 1 GiB to 1 TiB in size; EBS General Purpose (SSD) and Provisioned IOPS (SSD) volumes can be created up to 16 TiB in size.

EBS Benefits

  • Data Availability
    • Data is automatically replicated in an Availability Zone to prevent data loss due to the failure of any single hardware component.
  • Data Persistence
    • persists independently of the running life of an EC2 instance
    • persists when an instance is stopped, started, or rebooted
    • Root volume is deleted, by default, on Instance termination but the behaviour can be changed using the DeleteOnTermination flag
    • All attached volumes persist, by default, on instance termination
  • Data Encryption
    • can be encrypted by the EBS encryption feature
    • uses 256-bit AES-256 and an Amazon-managed key infrastructure.
    • Encryption occurs on the server that hosts the EC2 instance, providing encryption of data-in-transit from the EC2 instance to EBS storage
    • Snapshots of encrypted EBS volumes are automatically encrypted.
  • Snapshots
    • provides the ability to create snapshots (backups) of any EBS volume and write a copy of the data in the volume to S3, where it is stored redundantly in multiple Availability Zones.
    • can be used to create new volumes, increase the size of the volumes or replicate data across Availability Zones or Regions.
    • are incremental backups and store only the data that was changed from the time the last snapshot was taken.
    • Snapshot size can probably be smaller than the volume size as the data is compressed before being saved to S3.
    • Even though snapshots are saved incrementally, the snapshot deletion process is designed so that you need to retain only the most recent snapshot in order to restore the volume.

EBS Volume Types

Refer blog post @ EBS Volume Types

EBS Volume

EBS Volume Creation

  • Creating New volumes
    • Completely new from console or command line tools and can then be attached to an EC2 instance in the same Availability Zone.
  • Restore volume from Snapshots
    • Volumes can also be restored from previously created snapshots
    • New volumes created from existing snapshots are loaded lazily in the background.
    • There is no need to wait for all of the data to transfer from S3 to the volume before the attached instance can start accessing the volume and all its data.
    • If the instance accesses the data that hasn’t yet been loaded, the volume immediately downloads the requested data from S3, and continues loading the rest of the data in the background.
    • Volumes restored from encrypted snapshots are always encrypted, by default.
  • Volumes can be created and attached to a running EC2 instance by specifying a block device mapping

EBS Volume Detachment

  • EBS volumes can be detached from an instance explicitly or by terminating the instance.
  • EBS root volumes can be detached by stopping the instance.
  • EBS data volumes, attached to a running instance, can be detached by unmounting the volume from the instance first.
  • If the volume is detached without being unmounted, it might result in the volume being stuck in a busy state and could possibly damage the file system or the data it contains.
  • EBS volume can be force detached from an instance, using the Force Detach option, but it might lead to data loss or a corrupted file system as the instance does not get an opportunity to flush file system caches or file system metadata.
  • Charges are still incurred for the volume after its detachment

EBS Volume Deletion

  • EBS volume deletion would wipe out its data and the volume can’t be attached to any instance. However, it can be backed up before deletion using EBS snapshots

EBS Volume Resize

  • EBS Elastic Volumes can be modified to increase the volume size, change the volume type, or adjust the performance of your EBS volumes.
  • If the instance supports Elastic Volumes, changes can be performed without detaching the volume or restarting the instance.

EBS Volume Snapshots

Refer blog post @ EBS Snapshot

EBS Encryption

  • EBS volumes can be created and attached to a supported instance type and support the following types of data
    • Data at rest
    • All disk I/O i.e All data moving between the volume and the instance
    • All snapshots created from the volume
    • All volumes created from those snapshots
  • Encryption occurs on the servers that host EC2 instances, providing encryption of data-in-transit from EC2 instances to EBS storage.
  • EBS encryption is supported with all EBS volume types (gp2, io1, st1, and sc1), and has the same IOPS performance on encrypted volumes as with unencrypted volumes, with a minimal effect on latency
  • EBS encryption is only available on select instance types.
  • Volumes created from encrypted snapshots and snapshots of encrypted volumes are automatically encrypted using the same encryption key.
  • EBS encryption uses AWS KMS customer master keys (CMK) when creating encrypted volumes and any snapshots created from the encrypted volumes.
  • EBS volumes can be encrypted using either
    • a default CMK created for you automatically.
    • a CMK that you created separately using AWS KMS, giving you more flexibility, including the ability to create, rotate, disable, define access controls, and audit the encryption keys used to protect your data.
  • Public or shared snapshots of encrypted volumes are not supported, because other accounts would be able to decrypt your data and needs to be migrated to an unencrypted status before sharing.
  • Existing unencrypted volumes cannot be encrypted directly, but can be migrated by
    • Option 1
      • create an unencrypted snapshot from the volume
      • create an encrypted copy of an unencrypted snapshot
      • create an encrypted volume from the encrypted snapshot
    • Option 2
      • create an unencrypted snapshot from the volume
      • create an encrypted volume from an unencrypted snapshot
  • An encrypted snapshot can be created from an unencrypted snapshot by creating an encrypted copy of the unencrypted snapshot.
  • Unencrypted volume cannot be created from an encrypted volume directly but needs to be migrated

EBS Multi-Attach

Refer blog Post @ EBS Multi-Attach

EBS Performance

Refer blog Post @ EBS Performance

EBS vs Instance Store

Refer blog post @ EBS vs Instance Store

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. _____ is a durable, block-level storage volume that you can attach to a single, running Amazon EC2 instance.
    1. Amazon S3
    2. Amazon EBS
    3. None of these
    4. All of these
  2. Which Amazon storage do you think is the best for my database-style applications that frequently encounter many random reads and writes across the dataset?
    1. None of these.
    2. Amazon Instance Storage
    3. Any of these
    4. Amazon EBS
  3. What does Amazon EBS stand for?
    1. Elastic Block Storage
    2. Elastic Business Server
    3. Elastic Blade Server
    4. Elastic Block Store
  4. Which Amazon Storage behaves like raw, unformatted, external block devices that you can attach to your instances?
    1. None of these.
    2. Amazon Instance Storage
    3. Amazon EBS
    4. All of these
  5. A user has created numerous EBS volumes. What is the general limit for each AWS account for the maximum number of EBS volumes that can be created?
    1. 10000
    2. 5000
    3. 100
    4. 1000
  6. Select the correct set of steps for exposing the snapshot only to specific AWS accounts
    1. Select Public for all the accounts and check mark those accounts with whom you want to expose the snapshots and click save.
    2. Select Private and enter the IDs of those AWS accounts, and click Save.
    3. Select Public, enter the IDs of those AWS accounts, and click Save.
    4. Select Public, mark the IDs of those AWS accounts as private, and click Save.
  7. If an Amazon EBS volume is the root device of an instance, can I detach it without stopping the instance?
    1. Yes but only if Windows instance
    2. No
    3. Yes
    4. Yes but only if a Linux instance
  8. Can we attach an EBS volume to more than one EC2 instance at the same time?
    1. Yes
    2. No
    3. Only EC2-optimized EBS volumes.
    4. Only in read mode.
  9. Do the Amazon EBS volumes persist independently from the running life of an Amazon EC2 instance?
    1. Only if instructed to when created
    2. Yes
    3. No
  10. Can I delete a snapshot of the root device of an EBS volume used by a registered AMI?
    1. Only via API
    2. Only via Console
    3. Yes
    4. No
  11. By default, EBS volumes that are created and attached to an instance at launch are deleted when that instance is terminated. You can modify this behavior by changing the value of the flag_____ to false when you launch the instance
    1. DeleteOnTermination
    2. RemoveOnDeletion
    3. RemoveOnTermination
    4. TerminateOnDeletion
  12. Your company policies require encryption of sensitive data at rest. You are considering the possible options for protecting data while storing it at rest on an EBS data volume, attached to an EC2 instance. Which of these options would allow you to encrypt your data at rest? (Choose 3 answers)
    1. Implement third party volume encryption tools
    2. Do nothing as EBS volumes are encrypted by default
    3. Encrypt data inside your applications before storing it on EBS
    4. Encrypt data using native data encryption drivers at the file system level
    5. Implement SSL/TLS for all services running on the server
  13. Which of the following are true regarding encrypted Amazon Elastic Block Store (EBS) volumes? Choose 2 answers
    1. Supported on all Amazon EBS volume types
    2. Snapshots are automatically encrypted
    3. Available to all instance types
    4. Existing volumes can be encrypted
    5. Shared volumes can be encrypted
  14. How can you secure data at rest on an EBS volume?
    1. Encrypt the volume using the S3 server-side encryption service
    2. Attach the volume to an instance using EC2’s SSL interface.
    3. Create an IAM policy that restricts read and write access to the volume.
    4. Write the data randomly instead of sequentially.
    5. Use an encrypted file system on top of the EBS volume
  15. A user has deployed an application on an EBS backed EC2 instance. For a better performance of application, it requires dedicated EC2 to EBS traffic. How can the user achieve this?
    1. Launch the EC2 instance as EBS dedicated with PIOPS EBS
    2. Launch the EC2 instance as EBS enhanced with PIOPS EBS
    3. Launch the EC2 instance as EBS dedicated with PIOPS EBS
    4. Launch the EC2 instance as EBS optimized with PIOPS EBS
  16. A user is trying to launch an EBS backed EC2 instance under free usage. The user wants to achieve encryption of the EBS volume. How can the user encrypt the data at rest?
    1. Use AWS EBS encryption to encrypt the data at rest (Encryption is allowed on micro instances)
    2. User cannot use EBS encryption and has to encrypt the data manually or using a third party tool (Encryption was not allowed on micro instances before)
    3. The user has to select the encryption enabled flag while launching the EC2 instance
    4. Encryption of volume is not available as a part of the free usage tier
  17. A user is planning to schedule a backup for an EBS volume. The user wants security of the snapshot data. How can the user achieve data encryption with a snapshot?
    1. Use encrypted EBS volumes so that the snapshot will be encrypted by AWS
    2. While creating a snapshot select the snapshot with encryption
    3. By default the snapshot is encrypted by AWS
    4. Enable server side encryption for the snapshot using S3
  18. A user has launched an EBS backed EC2 instance. The user has rebooted the instance. Which of the below mentioned statements is not true with respect to the reboot action?
    1. The private and public address remains the same
    2. The Elastic IP remains associated with the instance
    3. The volume is preserved
    4. The instance runs on a new host computer
  19. A user has launched an EBS backed EC2 instance. What will be the difference while performing the restart or stop/start options on that instance?
    1. For restart it does not charge for an extra hour, while every stop/start it will be charged as a separate hour
    2. Every restart is charged by AWS as a separate hour, while multiple start/stop actions during a single hour will be counted as a single hour
    3. For every restart or start/stop it will be charged as a separate hour
    4. For restart it charges extra only once, while for every stop/start it will be charged as a separate hour
  20. A user has launched an EBS backed instance. The user started the instance at 9 AM in the morning. Between 9 AM to 10 AM, the user is testing some script. Thus, he stopped the instance twice and restarted it. In the same hour the user rebooted the instance once. For how many instance hours will AWS charge the user?
    1. 3 hours
    2. 4 hours
    3. 2 hours
    4. 1 hour
  21. You are running a database on an EC2 instance, with the data stored on Elastic Block Store (EBS) for persistence At times throughout the day, you are seeing large variance in the response times of the database queries Looking into the instance with the isolate command you see a lot of wait time on the disk volume that the database’s data is stored on. What two ways can you improve the performance of the database’s storage while maintaining the current persistence of the data? Choose 2 answers
    1. Move to an SSD backed instance
    2. Move the database to an EBS-Optimized Instance
    3. Use Provisioned IOPs EBS
    4. Use the ephemeral storage on an m2.4xLarge Instance Instead
  22. An organization wants to move to Cloud. They are looking for a secure encrypted database storage option. Which of the below mentioned AWS functionalities helps them to achieve this?
    1. AWS MFA with EBS
    2. AWS EBS encryption
    3. Multi-tier encryption with Redshift
    4. AWS S3 server-side storage
  23. A user has stored data on an encrypted EBS volume. The user wants to share the data with his friend’s AWS account. How can user achieve this?
    1. Create an AMI from the volume and share the AMI
    2. Copy the data to an unencrypted volume and then share
    3. Take a snapshot and share the snapshot with a friend
    4. If both the accounts are using the same encryption key then the user can share the volume directly
  24. A user is using an EBS backed instance. Which of the below mentioned statements is true?
    1. The user will be charged for volume and instance only when the instance is running
    2. The user will be charged for the volume even if the instance is stopped
    3. The user will be charged only for the instance running cost
    4. The user will not be charged for the volume if the instance is stopped
  25. A user is planning to use EBS for his DB requirement. The user already has an EC2 instance running in the VPC private subnet. How can the user attach the EBS volume to a running instance?
    1. The user must create EBS within the same VPC and then attach it to a running instance.
    2. The user can create EBS in the same zone as the subnet of instance and attach that EBS to instance. (Should be in the same AZ)
    3. It is not possible to attach an EBS to an instance running in VPC until the instance is stopped.
    4. The user can specify the same subnet while creating EBS and then attach it to a running instance.
  26. A user is creating an EBS volume. He asks for your advice. Which advice mentioned below should you not give to the user for creating an EBS volume?
    1. Take the snapshot of the volume when the instance is stopped
    2. Stripe multiple volumes attached to the same instance
    3. Create an AMI from the attached volume (AMI is created from the snapshot)
    4. Attach multiple volumes to the same instance
  27. An EC2 instance has one additional EBS volume attached to it. How can a user attach the same volume to another running instance in the same AZ?
    1. Terminate the first instance and only then attach to the new instance
    2. Attach the volume as read only to the second instance
    3. Detach the volume first and attach to new instance
    4. No need to detach. Just select the volume and attach it to the new instance, it will take care of mapping internally
  28. What is the scope of an EBS volume?
    1. VPC
    2. Region
    3. Placement Group
    4. Availability Zone

Reference

Amazon_EBS