AWS EC2 Storage

EC2 Storage Overview

EC2 Storage Options - EBS, S3 & Instance Store

Storage Types

Elastic Block Store – EBS

  • Elastic Block Store – EBS provides highly available, reliable, durable, block-level storage volumes that can be attached to an EC2 instance.
  • persists independently from the running life of an instance.
  • behaves like a raw, unformatted, external block device that can be attached to a single EC2 instance at a time.
  • is recommended for data that requires frequent and granular updates e.g. running a database or filesystem.
  • is Zonal and can be attached to any instance within the same Availability Zone and can be used like any other physical hard drive.
  • is particularly well-suited for use as the primary storage for file systems, databases, or any applications that require fine granular updates and access to raw, unformatted, block-level storage.

Instance Store Storage

  • Instance store provides temporary or Ephemeral block-level storage
  • is located on the disks that are physically attached to the host computer.
  • consists of one or more instance store volumes exposed as block devices.
  • The size of an instance store varies by instance type.
  • Virtual devices for instance store volumes are named ephemeral[0-23], starting with ephemeral0 for the first volume.
  • While an instance store is dedicated to a particular instance, the disk subsystem is shared among instances on a host computer.
  • is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.
  • delivers very high random I/O performance and is a good option for storage with very low latency requirements, when the data does not need to persist after the instance terminates or when the architecture is fault tolerant.

Amazon EBS vs Instance Store

More detailed @ Comparison of EBS vs Instance Store

Simple Storage Service – S3

More details @ AWS S3

Elastic File System – EFS

  • Elastic File System – EFS provides simple, fully managed, easy-to-set-up, scalable, serverless, and cost-optimized file storage.
  • can automatically scale from gigabytes to petabytes of data without needing to provision storage.
  • provides managed NFS (network file system) that can be mounted on and accessed by multiple EC2 in multiple AZs simultaneously.
  • offers highly durable, highly scalable, and highly available storage.
    • stores data redundantly across multiple AZs in the same region
    • grows and shrinks automatically as files are added and removed, so there is no need to manage storage procurement or provisioning.
  • supports the Network File System version 4 (NFSv4.1 and NFSv4.0) protocol.
  • provides file system access semantics, such as strong data consistency and file locking.
  • is compatible with all Linux-based AMIs for EC2 and exposes a POSIX file system with a standard file API.
  • is a shared POSIX file system for Linux systems and does not work with Windows.
  • offers the ability to encrypt data at rest using KMS and in transit.
  • can be accessed from on-premises using an AWS Direct Connect or AWS VPN connection between the on-premises datacenter and VPC.
  • can be accessed concurrently from servers in the on-premises data center as well as EC2 instances in the VPC.
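As a rough illustration of the NFSv4.1 access described above, the sketch below only builds the mount command for an EFS file system; it does not run it. The file system ID, region, and mount point are placeholder assumptions, and the mount options are the ones commonly recommended for EFS, so check them against current AWS guidance before use.

```python
from typing import List


def efs_mount_command(file_system_id: str, region: str, mount_point: str) -> List[str]:
    """Build (but do not execute) an NFSv4.1 mount command for an EFS file system."""
    # EFS file systems are reachable through a regional DNS name of this form.
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    # Assumed options: NFS version 4.1 with large read/write buffers and hard mounts.
    options = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
    return ["sudo", "mount", "-t", "nfs4", "-o", options, f"{dns_name}:/", mount_point]


# Placeholder values for illustration only.
cmd = efs_mount_command("fs-12345678", "us-east-1", "/mnt/efs")
print(" ".join(cmd))
```

Running the same command on several Linux instances in different AZs gives each of them a view of the same shared file system.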

Block Device Mapping

  • A block device is a storage device that moves data in sequences of bytes or bits (blocks), supports random access, and generally uses buffered I/O, e.g. hard disks and CD-ROM drives.
  • Block devices can be physically attached to a computer (like an instance store volume) or can be accessed remotely as if it was attached (like an EBS volume)
  • Block device mapping defines the block devices to be attached to an instance, and can be specified either when an AMI is created or when an instance is launched.
  • After a block device is attached to the instance, it must be mounted on the instance before it can be accessed.
  • When a block device is detached from an instance, it is unmounted by the operating system and you can no longer access the storage device.
  • Additional Instance store volumes can be attached only when the instance is launched while EBS volumes can be attached to a running instance.
  • Viewing the block device mapping for an instance only shows the EBS volumes and not the instance store volumes. Instance metadata can be used to query the complete block device mapping.
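A block device mapping can be pictured as a list of entries, one per device. The sketch below builds such a list in plain Python using the field names of the EC2 launch API's block device mapping parameter (DeviceName, Ebs, VirtualName). The device names, volume size, and volume type are assumptions chosen for illustration; valid device names depend on the AMI and instance type.

```python
from typing import Any, Dict, List


def ebs_mapping(device_name: str, size_gib: int, volume_type: str = "gp3",
                delete_on_termination: bool = True) -> Dict[str, Any]:
    """Describe an EBS volume to be created and attached at launch."""
    return {
        "DeviceName": device_name,
        "Ebs": {
            "VolumeSize": size_gib,
            "VolumeType": volume_type,
            "DeleteOnTermination": delete_on_termination,
        },
    }


def instance_store_mapping(device_name: str, index: int) -> Dict[str, Any]:
    """Map an instance store volume (ephemeral0..ephemeral23) to a device name."""
    if not 0 <= index <= 23:
        raise ValueError("virtual names run from ephemeral0 to ephemeral23")
    return {"DeviceName": device_name, "VirtualName": f"ephemeral{index}"}


# Hypothetical layout: one extra EBS volume plus two instance store volumes.
mappings: List[Dict[str, Any]] = [
    ebs_mapping("/dev/sdf", size_gib=100),
    instance_store_mapping("/dev/sdb", 0),
    instance_store_mapping("/dev/sdc", 1),
]
```

Because instance store volumes can only be added at launch, entries like the last two would have to be part of the launch request or the AMI, while the EBS entry could also be attached later.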

Public Data Sets

  • Amazon Web Services provides a repository of public data sets that can be seamlessly integrated into AWS cloud-based applications.
  • Amazon stores the data sets at no charge to the community and, as with all AWS services, you pay only for the compute and storage you use for your own applications.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. When you view the block device mapping for your instance, you can see only the EBS volumes, not the instance store volumes.
    1. Depends on the instance type
    2. FALSE
    3. Depends on whether you use API call
    4. TRUE
  2. Amazon EC2 provides a repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. What is the monthly charge for using the public data sets?
    1. A one-time charge of $10 for all the data sets
    2. $1 per data set per month
    3. $10 per month for all the data sets
    4. There is no charge for using the public data sets
  3. How many types of block devices does Amazon EC2 support?
    1. 2
    2. 4
    3. 3
    4. 1

AWS EC2 Instance Store Storage

EC2 Instance Store

  • An instance store provides temporary or Ephemeral block-level storage for an Elastic Compute Cloud – EC2 instance.
  • is located on the disks that are physically attached to the host computer.
  • consists of one or more instance store volumes exposed as block devices.
  • The size of an instance store varies by instance type.
  • Virtual devices for instance store volumes are named ephemeral[0-23], starting with ephemeral0 for the first volume.
  • While an instance store is dedicated to a particular instance, the disk subsystem is shared among instances on a host computer.
  • is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.
  • delivers very high random I/O performance and is a good option for storage with very low latency requirements, when the data does not need to persist after the instance terminates or when the architecture is fault tolerant.

Instance Store Lifecycle

  • Instance store data lifetime is dependent on the lifecycle of the Instance to which it is attached.
  • Data on the Instance store persists when an instance is rebooted.
  • However, the data on the instance store does not persist if the
    • underlying disk drive fails
    • instance terminates
    • instance hibernates
    • instance stops i.e. if the EBS-backed instance with instance store volumes attached is stopped
  • Stopping, hibernating, or terminating an instance causes every block of storage in the instance store to be reset.
  • If an AMI is created from an Instance with an Instance store volume, the data on its instance store volume isn’t preserved.
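The lifecycle rules above reduce to a small lookup: only a reboot preserves instance store data. The following sketch encodes that rule; the event names are labels chosen for this example, not AWS API values.

```python
from enum import Enum


class Event(Enum):
    REBOOT = "reboot"
    STOP = "stop"
    HIBERNATE = "hibernate"
    TERMINATE = "terminate"
    DISK_FAILURE = "disk_failure"


# Per the list above, a reboot is the only event that keeps the data.
_DATA_PRESERVING_EVENTS = {Event.REBOOT}


def instance_store_data_survives(event: Event) -> bool:
    """Return True if instance store data is expected to persist across the event."""
    return event in _DATA_PRESERVING_EVENTS


for event in Event:
    outcome = "persists" if instance_store_data_survives(event) else "is lost"
    print(f"{event.value}: data {outcome}")
```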

Instance Store Volumes

  • Instance type of an instance determines the size of the instance store available for the instance and the type of hardware used for the instance store volumes.
  • Instance store volumes are included as part of the instance’s hourly cost.
  • Some instance types use solid-state drives (SSD) to deliver very high random I/O performance, which is a good option when storage with very low latency is needed and either the data does not need to persist when the instance terminates or the architecture is fault tolerant.

Instance Store Volumes with EC2 instances

  • EBS volumes and instance store volumes for an instance are specified using a block device mapping.
  • Instance store volume
    • can be attached to an EC2 instance only when the instance is launched.
    • cannot be detached and reattached to a different instance.
  • After an instance is launched, its instance store volumes must be formatted and mounted before they can be used.
  • Root volume of an instance store-backed instance is mounted automatically

Instance Store Optimizing Writes

  • Because of the way that EC2 virtualizes disks, the first write to any location on an instance store volume performs more slowly than subsequent writes.
  • Amortizing (gradually writing off) this cost over the lifetime of the instance might be acceptable.
  • However, if high disk performance is required, AWS recommends initializing the drives by writing once to every drive location before production use
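Initializing a drive in this sense simply means writing once to every location. The sketch below shows the idea in Python by writing zeros sequentially over a given path; it is demonstrated on a temporary scratch file rather than a real device. The function name, chunk size, and demo file are assumptions for illustration. Pointing it at a real block device would overwrite all data on that device.

```python
import os
import tempfile


def initialize_volume(path: str, size_bytes: int, chunk_bytes: int = 1024 * 1024) -> int:
    """Write zeros once to every byte in [0, size_bytes) of `path`.

    WARNING: this destroys any existing data at `path`.
    Returns the number of bytes written.
    """
    zeros = bytes(chunk_bytes)
    written = 0
    with open(path, "r+b") as dev:
        while written < size_bytes:
            n = min(chunk_bytes, size_bytes - written)
            dev.write(zeros[:n])
            written += n
        dev.flush()
        os.fsync(dev.fileno())  # make sure the writes actually reach the disk
    return written


# Demonstration on a scratch file standing in for an instance store device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.truncate(3 * 1024 * 1024 + 123)
    demo_path = tmp.name

demo_written = initialize_volume(demo_path, os.path.getsize(demo_path))
os.remove(demo_path)
```

On a real instance the same effect is usually achieved with a standard disk utility; the point is only that the first-write penalty is paid once, before production traffic arrives.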

EBS vs Instance Store

Refer blog post @ EBS vs Instance Store

AWS Certification Exam Practice Questions

  1. Please select the most correct answer regarding the persistence of the Amazon Instance Store
    1. The data on an instance store volume persists only during the life of the associated Amazon EC2 instance
    2. The data on an instance store volume is lost when the security group rule of the associated instance is changed.
    3. The data on an instance store volume persists even after associated Amazon EC2 instance is deleted
  2. A user has launched an EC2 instance from an instance store backed AMI. The user has attached an additional instance store volume to the instance. The user wants to create an AMI from the running instance. Will the AMI have the additional instance store volume data?
    1. Yes, the block device mapping will have information about the additional instance store volume
    2. No, since the instance store backed AMI can have only the root volume bundled
    3. It is not possible to attach an additional instance store volume to the existing instance store backed AMI instance
    4. No, since this is ephemeral storage it will not be a part of the AMI
  3. When an EC2 instance that is backed by an S3-based AMI is terminated, what happens to the data on the root volume?
    1. Data is automatically saved as an EBS volume.
    2. Data is automatically saved as an EBS snapshot.
    3. Data is automatically deleted
    4. Data is unavailable until the instance is restarted.
  4. A user has launched an EC2 instance from an instance store backed AMI. If the user restarts the instance, what will happen to the ephemeral storage data?
    1. All the data will be erased but the ephemeral storage will stay connected
    2. All data will be erased and the ephemeral storage is released
    3. It is not possible to restart an instance launched from an instance store backed AMI
    4. The data is preserved
  5. When an EC2 EBS-backed instance is stopped, what happens to the data on any ephemeral store volumes?
    1. Data will be deleted and will no longer be accessible
    2. Data is automatically saved in an EBS volume.
    3. Data is automatically saved as an EBS snapshot
    4. Data is unavailable until the instance is restarted
  6. A user has launched an EC2 Windows instance from an instance store backed AMI. The user has also set the Instance initiated shutdown behavior to stop. What will happen when the user shuts down the OS?
    1. It will not allow the user to shutdown the OS when the shutdown behavior is set to Stop
    2. It is not possible to set the termination behavior to Stop for an Instance store backed AMI instance
    3. The instance will stay running but the OS will be shutdown
    4. The instance will be terminated
  7. Which of the following will occur when an EC2 instance in a VPC (Virtual Private Cloud) with an associated Elastic IP is stopped and started? (Choose 2 answers)
    1. The Elastic IP will be dissociated from the instance
    2. All data on instance-store devices will be lost
    3. All data on EBS (Elastic Block Store) devices will be lost
    4. The ENI (Elastic Network Interface) is detached
    5. The underlying host for the instance is changed

AWS Storage Options – EBS & Instance Store

  • Elastic Block Store – EBS and Instance Store provide block-level storage options for EC2 instances.

Elastic Block Store (EBS) volume

  • EBS provides durable block-level storage for use with EC2 instances
  • EBS volumes are off-instance, network-attached storage (NAS) that persists independently from the running life of a single EC2 instance.
  • EBS volume is attached to an instance and can be used as a physical hard drive, typically by formatting it with the file system of your choice and using the file I/O interface provided by the instance operating system.
  • EBS volume can be used to boot an EC2 instance (EBS-root AMIs only), and multiple EBS volumes can be attached to a single EC2 instance.
  • EBS volume can, by default, be attached to only one EC2 instance at any point in time.
  • EBS Multi-Attach, available on Provisioned IOPS volumes, allows a volume to be attached to multiple EC2 instances in the same AZ.
  • EBS provides the ability to take point-in-time snapshots, which are persisted in S3. These snapshots can be used to instantiate new EBS volumes and to protect data for long-term durability
  • EBS snapshots can be copied across AWS regions as well, making it easier to leverage multiple AWS regions for geographical expansion, data center migration, and disaster recovery

Ideal Usage Patterns

  • EBS is meant for data that changes relatively frequently and requires long-term persistence.
  • EBS volume provides access to raw block-level storage and is particularly well-suited for use as the primary storage for a database or file system
  • EBS Provisioned IOPS volumes are particularly well-suited for use with databases applications that require a high and consistent rate of random disk reads and writes

Anti-Patterns

  • Temporary Storage
    • EBS volume persists independent of the attached EC2 life cycle.
    • For temporary storage such as caches, buffers, queues, etc., it is better to use local instance store volumes, SQS, or ElastiCache
  • Highly-durable storage
    • EBS volumes with less than 20 GB of modified data since the last snapshot are designed for between 99.5% and 99.9% annual durability; volumes with more modified data can be expected to have proportionally lower durability
    • For highly durable storage, use S3 or Glacier which provides 99.999999999% annual durability per object
  • Static data or web content
    • For static web content, where data infrequently changes, EBS with EC2 would require a web server to serve the pages.
    • S3 may represent a more cost-effective and scalable solution for storing this fixed information and is served directly out of S3.
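To make the durability figures above concrete, the sketch below converts them into expected losses per year. An annual durability of 99.5% to 99.9% corresponds to an annual failure rate of 0.5% to 0.1%, and eleven nines of durability corresponds to a failure rate of 1e-11. The fleet sizes are assumptions picked for easy arithmetic.

```python
def expected_annual_losses(item_count: int, annual_failure_rate: float) -> float:
    """Expected number of items (volumes or objects) lost per year."""
    return item_count * annual_failure_rate


# EBS: 99.9% to 99.5% annual durability is a 0.1% to 0.5% annual failure rate.
ebs_best = expected_annual_losses(1000, 0.001)
ebs_worst = expected_annual_losses(1000, 0.005)

# S3: 99.999999999% annual durability per object is a 1e-11 failure rate.
s3_losses = expected_annual_losses(10_000, 1e-11)
s3_years_per_loss = 1 / s3_losses

print(f"1,000 EBS volumes: about {ebs_best:.0f} to {ebs_worst:.0f} failures per year")
print(f"10,000 S3 objects: about one loss every {s3_years_per_loss:,.0f} years")
```

This gap is why frequent snapshots, which are stored in S3, are the usual way to protect EBS data.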

EBS Performance

  • EBS volume types differ in performance characteristics and pricing model, allowing you to tailor the storage performance and cost to the needs of the applications. The two original types are standard (magnetic) volumes and Provisioned IOPS volumes; current generations also include General Purpose SSD (gp2/gp3), Throughput Optimized HDD (st1), and Cold HDD (sc1) volumes.
  • Multiple similarly provisioned EBS volumes can be attached to an instance and striped using RAID 0 or logical volume manager software, thus aggregating available IOPS, total volume throughput, and total volume size.
  • Standard volumes offer cost-effective storage for applications with moderate or bursty I/O requirements. Standard volumes are also well suited for use as boot volumes, where the burst capability provides fast instance start-up times.
  • Provisioned IOPS volumes are designed to deliver predictable, high performance for I/O intensive workloads such as databases. With Provisioned IOPS, you specify an IOPS rate when creating a volume, and then EBS provisions that rate for the lifetime of the volume.
  • As EBS volumes are network-attached devices, other network I/O performed by the instance, as well as the total load on the shared network, can affect individual EBS volume performance.
  • EBS-optimized instances can be launched, which deliver dedicated throughput between EC2 and EBS and enable instances to fully utilize the Provisioned IOPS on an EBS volume.
  • Each separate EBS volume can be configured as EBS standard or EBS Provisioned IOPS as needed. Alternatively, you could stripe the data.
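The aggregation that striping provides can be sketched with simple arithmetic. The example below is an idealized model under stated assumptions: RAID 0 with a stripe unit of one block, usable capacity limited by the smallest member, and no instance-level bandwidth cap. The Volume fields and the numbers are hypothetical, and real results also depend on the instance's EBS bandwidth.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Volume:
    size_gib: int
    iops: int
    throughput_mib_s: int


def raid0_aggregate(volumes: List[Volume]) -> Volume:
    """Idealized RAID 0 totals: IOPS and throughput add up across members."""
    if not volumes:
        raise ValueError("need at least one volume")
    return Volume(
        # RAID 0 can only use as much of each member as the smallest one offers.
        size_gib=len(volumes) * min(v.size_gib for v in volumes),
        iops=sum(v.iops for v in volumes),
        throughput_mib_s=sum(v.throughput_mib_s for v in volumes),
    )


def locate_block(logical_block: int, volume_count: int) -> Tuple[int, int]:
    """Round-robin striping: return (volume index, block offset on that volume)."""
    return logical_block % volume_count, logical_block // volume_count


# Four similarly provisioned volumes (hypothetical figures).
members = [Volume(size_gib=500, iops=4000, throughput_mib_s=250)] * 4
array = raid0_aggregate(members)
```

Consecutive logical blocks land on different volumes, which is what lets sequential and random workloads draw on all members at once.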

EBS Durability & Availability

  • EBS volumes are designed to be highly available and reliable.
  • EBS volume data is replicated across multiple servers in a single AZ to prevent the loss of data from the failure of any single component.
  • EBS volume durability depends on both the size of the volume and the amount of data that has changed since your last snapshot
  • EBS snapshots are incremental, point-in-time backups, containing only the data blocks changed since the last snapshot.
  • Frequent snapshots are recommended to maximize both the durability and availability of your EBS data
  • EBS snapshots provide an easy-to-use disk clone or disk image mechanism for backup, sharing, and disaster recovery.

EBS Cost Model

  • EBS pricing has 3 components: provisioned storage, I/O requests, and snapshot storage
  • Standard volumes are charged per GB-month of provisioned storage and per million I/O requests
  • EBS Provisioned IOPS volumes are charged per GB-month of provisioned storage and per Provisioned IOPS-month
  • For both volumes, EBS snapshots are charged per GB-month of data stored. EBS snapshot copy is charged for the data transferred between regions, and for the standard EBS snapshot charges in the destination region.
  • EBS volume storage capacity is allocated at the time of volume creation, and you are charged for this allocated storage even if not used.
  • For EBS snapshots, you are charged only for storage actually used (consumed). Note that EBS snapshots are incremental and compressed, so the storage used in any snapshot is generally much less than the storage consumed on an EBS volume
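The three pricing components above can be combined into a rough monthly estimate. In the sketch below every price is a made-up placeholder passed in as a parameter, not an actual AWS rate; only the structure of the calculation follows the cost model described above.

```python
def standard_volume_cost(provisioned_gb: float, million_io_requests: float,
                         price_per_gb_month: float, price_per_million_io: float) -> float:
    """Standard volume: provisioned storage plus I/O requests."""
    return provisioned_gb * price_per_gb_month + million_io_requests * price_per_million_io


def piops_volume_cost(provisioned_gb: float, provisioned_iops: int,
                      price_per_gb_month: float, price_per_iops_month: float) -> float:
    """Provisioned IOPS volume: provisioned storage plus provisioned IOPS."""
    return provisioned_gb * price_per_gb_month + provisioned_iops * price_per_iops_month


def snapshot_cost(stored_gb: float, price_per_gb_month: float) -> float:
    """Snapshots are billed on the data actually stored, not the volume size."""
    return stored_gb * price_per_gb_month


# Hypothetical prices, for illustration only.
standard = standard_volume_cost(100, 5, price_per_gb_month=0.10, price_per_million_io=0.10)
piops = piops_volume_cost(100, 1000, price_per_gb_month=0.125, price_per_iops_month=0.10)
snapshot = snapshot_cost(40, price_per_gb_month=0.095)
print(f"standard: {standard:.2f}, provisioned IOPS: {piops:.2f}, snapshot: {snapshot:.2f}")
```

Note that the volume functions take provisioned capacity, since allocated storage is charged whether or not it is used, while the snapshot function takes consumed storage.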

EBS Scalability and Elasticity

  • EBS volumes can easily and rapidly be provisioned and released to scale in and out with the changing total storage demands
  • EBS volumes can be resized dynamically to increase their size, but they cannot be reduced in size.
  • If additional storage is needed, either
    • Increase the size of the existing volume and then extend the file system to use the new space
    • Attach an additional volume
    • Create a snapshot and create a new volume from the snapshot with a higher volume size

Interfaces

  • AWS offers management APIs for EBS in both SOAP and REST formats which can be used to create, delete, describe, attach, and detach EBS volumes for the EC2 instances as well as to create, delete, and describe snapshots from EBS to S3; and to copy snapshots across regions.
  • Amazon also offers the same capabilities through AWS Management Console

Instance Store Volumes

  • Instance Store volumes are also referred to as Ephemeral Storage.
  • Instance Store volumes provide temporary block-level storage and consist of a preconfigured and pre-attached block of disk storage on the same physical server as the EC2 instance
  • Instance storage’s amount of disk storage depends on the Instance type and larger instances provide both more and larger instance store volumes. Smaller instance types such as micro instances can only be launched with EBS volumes.
  • Storage-optimized instances provide special-purpose instance storage targeted at specific use cases. For example, HI1 instances provide very fast solid-state drive (SSD) backed instance storage capable of supporting over 120,000 random read IOPS, and are optimized for very high random I/O performance and low cost per IOPS, while HS1 instances are optimized for very high storage density, low storage cost, and high sequential I/O performance.
  • Instance store volumes, unlike EBS volumes, cannot be detached or attached to another instance.

Ideal Usage Patterns

  • EC2 local instance store volumes are fast, free (that is, included in the price of the EC2 instance) “scratch volumes” best suited for storing temporary data that is continually changing, such as buffers, caches, scratch data or can easily be regenerated, or data that is replicated for durability
  • High I/O instances provide instance store volumes backed by SSD, and are ideally suited for many high-performance database workloads, e.g. NoSQL databases like Cassandra and MongoDB.
  • High storage instances support much higher storage density per EC2 instance and are ideally suited for applications that benefit from high sequential I/O performance across very large datasets, e.g. data warehouses, Hadoop storage nodes, seismic analysis, cluster file systems, etc.

Anti-Patterns

  • Persistent storage
    • For persistent virtual disk storage similar to a physical disk drive for files or other data that must persist longer than the lifetime of a single  EC2 instance, EBS volumes or S3 are more appropriate.
  • Relational database storage
    • In most cases, relational databases require storage that persists beyond the lifetime of a single EC2 instance, making EBS volumes the natural choice.
  • Shared storage
    • Instance store volumes are dedicated to a single EC2 instance, and cannot be shared with other systems or users.
    • If you need storage that can be detached from one instance and attached to a different instance, or if you need the ability to share data easily, S3 or EBS volumes are the better choices.
  • Snapshots
    • If you need the convenience, long-term durability, availability, and shareability of point-in-time disk snapshots, EBS volumes are a better choice.

Instance Store Performance

  • Non-SSD-based instance store volumes in most EC2 instance families have performance characteristics similar to standard EBS volumes.
  • EC2 instance virtual machine and the local instance store volumes are located in the same physical server, and interaction with the storage is very fast, particularly for sequential access.
  • To further increase aggregate IOPS, or to improve sequential disk throughput, multiple instance store volumes can be grouped together using RAID 0 (disk striping) software.
  • Because the bandwidth to the disks is not limited by the network, aggregate sequential throughput for multiple instance volumes can be higher than for the same number of EBS volumes.
  • SSD instance store volumes in the EC2 high I/O instances provide from tens of thousands to hundreds of thousands of low-latency, random 4 KB IOPS.
  • Because of the I/O characteristics of SSD devices, write performance can be variable.
  • Instance store volumes on EC2 high storage instances provide very high storage density and high sequential read and write performance. High storage instances are capable of delivering 2.6 GB/sec of sequential read and write performance when using a block size of 2 MB.
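The performance figures quoted in this article can be related to each other with a little arithmetic. The sketch below uses decimal units (1 GB = 1000 MB) and treats the quoted numbers as ideal sustained rates, which is an assumption; the 10 TB dataset is a hypothetical example.

```python
def blocks_per_second(throughput_gb_s: float, block_mb: float) -> float:
    """How many blocks per second a given sequential throughput implies."""
    return throughput_gb_s * 1000 / block_mb


def iops_to_mb_s(iops: int, io_kb: float) -> float:
    """Throughput implied by a random IOPS figure at a given I/O size."""
    return iops * io_kb / 1000


def scan_seconds(dataset_tb: float, throughput_gb_s: float) -> float:
    """Time to read a dataset once at a sustained sequential rate."""
    return dataset_tb * 1000 / throughput_gb_s


# 2.6 GB/sec at a 2 MB block size, as quoted for high storage instances.
hs_blocks = blocks_per_second(2.6, 2)
# 120,000 random 4 KB read IOPS, as quoted for HI1 instances.
hi_throughput = iops_to_mb_s(120_000, 4)
# Hypothetical 10 TB dataset read sequentially at 2.6 GB/sec.
scan_time = scan_seconds(10, 2.6)
print(f"{hs_blocks:.0f} blocks/s, {hi_throughput:.0f} MB/s, {scan_time / 60:.0f} min")
```

The comparison shows why the two families target different workloads: one is tuned for many small random operations, the other for moving large blocks sequentially.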

Instance Store Durability and Availability

  • EC2 local instance store volumes are not intended to be used as durable disk storage; they persist only during the life of the associated EC2 instance

Cost Model

  • Cost of the EC2 instance includes any local instance store volumes if the instance type provides them.
  • While there is no additional charge for data storage on local instance store volumes, note that data transferred to and from EC2 instance store volumes from other AZs or outside of an EC2 region may incur data transfer charges, and additional charges will apply for use of any persistent storage, such as S3, Glacier, EBS volumes, and EBS snapshots

Scalability and Elasticity

  • Local instance store volumes are tied to a particular EC2 instance and are fixed in number and size for a given EC2 instance type, so the scalability and elasticity of this storage are tied to the number of EC2 instances.

Interfaces

  • Instance store volumes are specified using the block device mapping feature of the EC2 API and the AWS Management Console
  • To the EC2 instance, an instance store volume appears just like a local disk drive. To write to and read data from instance store volumes, use the native file system I/O interfaces of the chosen operating system.

AWS Certification Exam Practice Questions

  1. Which of the following provides the fastest storage medium?
    1. Amazon S3
    2. Amazon EBS using Provisioned IOPS (PIOPS)
    3. SSD Instance (ephemeral) store (SSD Instance Storage provides 100,000 IOPS on some instance types, much faster than any network-attached storage)
    4. AWS Storage Gateway
