AWS EC2 Instance Types

EC2 Instance Types

  • EC2 Instance types determine the hardware of the host computer used for the instance.
  • EC2 Instance types offer different compute, memory & storage capabilities and are grouped in instance families based on these capabilities.
  • EC2 provides each instance with a consistent and predictable amount of CPU capacity, regardless of its underlying hardware.
  • EC2 dedicates some resources of the host computer, such as CPU, memory, and instance storage, to a particular instance.
  • EC2 shares other resources of the host computer, such as the network and the disk subsystem, among instances. If each instance on a host computer tries to use as much of one of these shared resources as possible, each receives an equal share of that resource. However, when a resource is under-utilized, an instance can consume a higher share of that resource while it’s available.

EC2 Instance Types Selection criteria

  • Some instance types support only the HVM virtualization type, while others support both PV and HVM. AWS recommends using HVM to take full advantage of the underlying hardware.
  • All EC2 instance types are available in a VPC; however, a few are not available on the EC2-Classic platform. AWS recommends using a VPC to take advantage of enhanced networking, multiple IP addresses, finer security control, etc.
  • Some instances support only EBS volumes, while others support both EBS and Instance store volumes. Some instances that support instance store volumes use solid-state drives (SSD) to deliver very high random I/O performance.
  • Some EC2 instance types can be launched as EBS optimized instances with a dedicated capacity for EBS I/O.
  • Some EC2 instance types can be launched in a placement group to optimize instances for High-Performance Computing (HPC)
  • Some instances support Enhanced Networking, to get significantly higher packets per second (PPS) performance, lower network jitter, and lower latencies
  • Some instance types allow attached EBS volumes to be encrypted. Most of these capabilities can be checked programmatically, as in the sketch below.
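A minimal boto3 (Python) sketch for checking these capabilities before choosing an instance type, assuming default credentials and region are configured; the instance type names are just examples:

```python
import boto3

ec2 = boto3.client("ec2")

# DescribeInstanceTypes returns the capability matrix for each instance type
resp = ec2.describe_instance_types(InstanceTypes=["t3.micro", "c5.large"])

for it in resp["InstanceTypes"]:
    print(it["InstanceType"])
    print("  virtualization :", it["SupportedVirtualizationTypes"])    # e.g. ['hvm']
    print("  EBS optimized  :", it["EbsInfo"]["EbsOptimizedSupport"])  # default / supported / unsupported
    print("  instance store :", it.get("InstanceStorageSupported"))
    print("  ENA support    :", it["NetworkInfo"]["EnaSupport"])       # enhanced networking
```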

EBS-Optimized

  • EBS-optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for EBS I/O.
  • EBS-optimized instances enable you to get consistently high performance for the EBS volumes by eliminating contention between EBS I/O and other network traffic from the instance.
  • EBS-optimized instances deliver dedicated throughput between Amazon EC2 and EBS, with options between 500 and 60,000 Megabits per second (Mbps) depending on the instance type used.
  • When attached to an EBS–optimized instance, General Purpose (SSD) volumes are designed to deliver within 10 percent of their baseline and burst performance 99.9 percent of the time in a given year, and Provisioned IOPS (SSD) volumes are designed to deliver within 10 percent of their provisioned performance 99.9 percent of the time in a given year.
  • EBS optimization can also be enabled for instance types that are not EBS-optimized by default (see the sketch below)
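A minimal sketch of launching an instance with EBS optimization explicitly enabled, using boto3; the AMI ID and instance type below are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# EbsOptimized is ignored (always on) for instance types that are EBS-optimized by default
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="m4.large",           # a type where EBS optimization is optional
    MinCount=1,
    MaxCount=1,
    EbsOptimized=True,
)
print(resp["Instances"][0]["InstanceId"])
```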

Placement Groups

  • EC2 Placement groups determine how the instances are placed on the underlying hardware.
  • AWS now provides three types of placement groups
    • Cluster – clusters instances into a low-latency group in a single AZ
    • Partition – spreads instances across logical partitions, ensuring that instances in one partition do not share underlying hardware with instances in other partitions
    • Spread – strictly places a small group of instances across distinct underlying hardware to reduce correlated failures (see the sketch below for creating and using a placement group)
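A minimal sketch of creating a placement group and launching instances into it with boto3; the group name, AMI ID, and instance type are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Strategy can be 'cluster', 'spread', or 'partition'
ec2.create_placement_group(GroupName="hpc-cluster-pg", Strategy="cluster")
# For 'partition', a PartitionCount can also be specified

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="c5n.18xlarge",
    MinCount=2,
    MaxCount=2,
    Placement={"GroupName": "hpc-cluster-pg"},
)
```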

NOTE – AWS keeps releasing new instance types; refer to the AWS documentation for the latest list.

EC2 Instance Types – Current Generation

EC2 Instance Types Comparison

T2 Instances (General Purpose)

  • T2 instances are designed to provide moderate baseline performance and the capability to burst to significantly higher performance as required
  • Mainly intended for workloads that don’t use the full CPU often or consistently, but occasionally need to burst.
  • T2 instances are well suited for
    • general-purpose workloads, such as web servers, developer environments, remote desktops, and small databases
  • Requirements
    • can be launched only with HVM AMI
    • can be launched into a  VPC only, and not supported on the EC2-Classic platform
    • are available as EBS-backed instances only
    • are available as On-Demand, Reserved, Dedicated (T3 only), and Spot Instances
    • By default, 20 (soft limit) T2 instances can run simultaneously
    • cannot be launched on a Dedicated Host
  • T2 Unlimited Instances
    • can sustain high CPU performance for as long as a workload needs it.
    • for most general-purpose workloads, it provides ample performance without any additional charges.
    • If the instance needs to run at higher CPU utilization for a prolonged period, it can also do so at a flat additional rate

CPU Credits

  • CPU Credits (similar to I/O Credits for EBS General Purpose storage) provide the performance of a full CPU core for one minute
  • T2 instances provide a baseline level of CPU performance, while CPU credits govern the ability to burst above that baseline
  • One CPU credit is equal to one vCPU running at 100% utilization for one minute, e.g. one vCPU at 100% for 1 minute, OR one vCPU at 50% for 2 minutes, OR two vCPUs at 25% for 2 minutes
  • Each T2 instance receives a healthy initial credit balance for startup performance
  • Initial CPU credits do not expire, but they are used first when an instance uses CPU credits.
  • Each T2 instance then continuously (at a millisecond-level resolution) receives a set rate of CPU credits per hour, depending on instance size for e.g. t2.nano earns 3/hour while a t2.large earns 36/hour
  • Each T2 instance accumulates the CPU credit when it uses fewer CPU resources than its allowed baseline performance levels
  • Maximum earned credit balance for an instance is equal to the number of CPU credits received per hour times 24 hours for e.g. t2.nano can earn max 72 (24 * 3) credits
  • Earned CPU credits expire 24 hours after they are earned; expired credits are removed from the balance before new ones are added
  • CPU credits do not persist across an instance stop/start; however, after the start, the instance receives its initial CPU credits again
  • When the credit balance is completely exhausted, the instance performs at its baseline level (the credit balance can be monitored via CloudWatch, as sketched below)
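A minimal sketch of watching the credit balance through CloudWatch with boto3; the instance ID is a placeholder:

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")
instance_id = "i-0123456789abcdef0"    # placeholder instance ID

now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUCreditBalance",     # CPUCreditUsage is published as well
    Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
    StartTime=now - timedelta(hours=6),
    EndTime=now,
    Period=300,
    Statistics=["Average"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))
```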

C4 Instances (Compute Intensive)

  • C4 instances are ideal for compute-bound applications that benefit from high-performance processors
  • Well suited for
    • Batch processing workloads,
    • Media transcoding,
    • High-traffic web servers, massively multiplayer online (MMO) gaming servers, and ad serving engines
  • Features
    • are EBS-optimized, by default
    • can be enabled for Enhanced Networking capabilities
    • can be clustered in a placement group
  • Requirements
    • requires 64-bit HVM AMI
    • can be launched into a  VPC only, and not supported on the EC2-Classic platform

G2 Instances (Graphic Intensive)

  • GPU instances provide  high parallel processing capability
  • Well suited for
    • to accelerate many scientific, engineering, and rendering applications by leveraging the Compute Unified Device Architecture (CUDA) or OpenCL parallel computing frameworks
    • graphics applications, including game streaming, 3-D application streaming, and other graphics workloads
  • Requirements
    • requires HVM AMI
    • can’t access the GPU unless NVIDIA drivers are installed
  • Features
    • can be clustered in a placement group

I2 Instances (I/O Intensive)

  • I2 instances are optimized to deliver tens of thousands of low-latency, random I/O operations per second (IOPS) to applications.
  • Well suited for applications
    • NoSQL databases (for example, Cassandra and MongoDB)
    • Clustered databases
    • Online transaction processing (OLTP) systems
  • Features
    • Primary data storage is SSD-based instance storage.
    • can be enabled for Enhanced Networking capabilities
    • can be clustered in a placement group
    • can enable EBS–optimization to obtain additional, dedicated capacity for Amazon EBS I/O
  • Requirements
    • requires HVM AMI
  • HI1 is the equivalent previous generation instance
    • supports both PV and HVM AMIs

D2 Instances (Density Intensive)

  • D2 instances are designed for workloads with very high storage density and that require high sequential read and write access to very large data sets on local storage.
  • Well suited for applications
    • Massively parallel processing (MPP) data warehousing
    • MapReduce and Hadoop distributed computing
    • Log or data processing applications
  • Features
    • Primary data storage for D2 instances is HDD-based instance storage
    • are EBS-optimized, by default
    • can be enabled for Enhanced Networking capabilities
    • can be clustered in a placement group
  • Requirements
    • requires 64-bit HVM AMI
  • HS1 is the equivalent previous generation instance
    • supports both EBS and Instance store backed AMIs
    • supports both PV and HVM AMIs

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which of the following instance types are available as Amazon EBS-backed only? Choose 2 answers
    1. General purpose T2
    2. General purpose M3
    3. Compute-optimized C4
    4. Compute-optimized C3
    5. Storage-optimized I2
  2. A t2.medium EC2 instance type must be launched with what type of Amazon Machine Image (AMI)?
    1. An Instance store Hardware Virtual Machine AMI
    2. An Instance store Paravirtual AMI
    3. An Amazon EBS-backed Hardware Virtual Machine AMI
    4. An Amazon EBS-backed Paravirtual AMI
  3. You have identified network throughput as a bottleneck on your m1.small EC2 instance when uploading data into Amazon S3 in the same region. How do you remedy this situation?
    1. Add an additional ENI
    2. Change to a larger Instance
    3. Use DirectConnect between EC2 and S3
    4. Use EBS PIOPS on the local volume
  4. You are using an m1.small EC2 Instance with one 300 GB EBS volume to host a relational database. You determined that write throughput to the database needs to be increased. Which of the following approaches can help achieve this? Choose 2 answers
    1. Use an array of EBS volumes (Striping to increase throughput)
    2. Enable Multi-AZ mode.
    3. Place the instance in an Auto Scaling Groups
    4. Add an EBS volume and place into RAID 5 (RAID 5 is not recommended as it provides parity and EBS volumes are already replicated across multiple servers in an Availability Zone for availability and durability, so AWS recommends striping for performance rather than durability)
    5. Increase the size of the EC2 Instance.
    6. Put the database behind an Elastic Load Balancer.
  5. You are tasked with setting up a cluster of EC2 instances for a NoSQL database. The database requires random read I/O disk performance of up to 100,000 IOPS at 4 KB block size per node. Which of the following EC2 instances will perform the best for this workload?
    1. A High-Memory Quadruple Extra Large (m2.4xlarge) with EBS-Optimized set to true and a PIOPs EBS volume
    2. A Cluster Compute Eight Extra Large (cc2.8xlarge) using instance storage
    3. High I/O Quadruple Extra Large (hi1.4xlarge) using instance storage
    4. A Cluster GPU Quadruple Extra Large (cg1.4xlarge) using four separate 4000 PIOPS EBS volumes in a RAID 0 configuration
  6. You are implementing a URL whitelisting system for a company that wants to restrict outbound HTTPS connections to specific domains from their EC2-hosted applications. You deploy a single EC2 instance running proxy software and configure it to accept traffic from all subnets and EC2 instances in the VPC. You configure the proxy to only pass through traffic to domains that you define in its whitelist configuration. You have a nightly maintenance window of 10 minutes where all instances fetch new software updates. Each update is about 200MB in size and there are 500 instances in the VPC that routinely fetch updates. After a few days you notice that some machines are failing to successfully download some, but not all, of their updates within the maintenance window. The download URLs used for these updates are correctly listed in the proxy’s whitelist configuration and you are able to access them manually using a web browser on the instances. What might be happening? (Choose 2 answers) [PROFESSIONAL]
    1. You are running the proxy on an undersized EC2 instance type so network throughput is not sufficient for all instances to download their updates in time.
    2. You have not allocated enough storage to the EC2 instance running the proxy, so the network buffer is filling up, causing some requests to fail
    3. You are running the proxy in a public subnet but have not allocated enough EIPs to support the needed network throughput through the Internet Gateway (IGW)
    4. You are running the proxy on a sufficiently-sized EC2 instance in a private subnet and its network throughput is being throttled by a NAT running on an undersized EC2 instance
    5. The route table for the subnets containing the affected EC2 instances is not configured to direct network traffic for the software update locations to the proxy.
  7. You have been asked to design the storage layer for an application. The application requires disk performance of at least 100,000 IOPS in addition; the storage layer must be able to survive the loss of an individual disk, EC2 instance, or Availability Zone without any data loss. The volume you provide must have a capacity of at least 3TB. Which of the following designs will meet these objectives? [PROFESSIONAL]
    1. Instantiate an i2.8xlarge instance in us-east-1a. Create a RAID 0 volume using the four 800GB SSD ephemeral disks provided with the instance. Provision 3×1 TB EBS volumes attach them to the instance and configure them as a second RAID 0 volume. Configure synchronous, block-level replication from the ephemeral backed volume to the EBS-backed volume. (Same AZ will not survive the AZ loss)
    2. Instantiate an i2.8xlarge instance in us-east-1a. Create a RAID 0 volume using the four 800GB SSD ephemeral disks provided with the instance. Configure synchronous block-level replication to an identically configured instance in us-east-1b.
    3. Instantiate a c3.8xlarge Instance in us-east-1. Provision an AWS Storage Gateway and configure it for 3 TB of storage and 100,000 IOPS. Attach the volume to the instance. (Need synchronous replication to prevent any data loss)
    4. Instantiate a c3.8xlarge instance in us-east-1 provision 4x1TB EBS volumes, attach them to the instance, and configure them as a single RAID 5 volume Ensure that EBS snapshots are performed every 15 minutes. (RAID 5 not recommended by AWS and Need synchronous replication to prevent any data loss)
    5. Instantiate a c3.8xlarge instance in us-east-1. Provision 3x1TB EBS volumes, attach them to the instance, and configure them as a single RAID 0 volume. Ensure that EBS snapshots are performed every 15 minutes. (Need synchronous replication to prevent any data loss)

AWS EC2 Best Practices

AWS recommends the following best practices to get maximum benefit and satisfaction from EC2

Security & Network

  • Implement the least permissive rules for the security group.
  • Regularly patch, update, and secure the operating system and applications on the instance
  • Manage access to AWS resources and APIs using identity federation, IAM users, and IAM roles
  • Establish credential management policies and procedures for creating, distributing, rotating, and revoking AWS access credentials
  • Launch the instances into a VPC instead of EC2-Classic (newly created AWS accounts use a default VPC)
  • Encrypt EBS volumes and snapshots (a minimal sketch of creating an encrypted volume follows this list)
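A minimal boto3 sketch of creating an encrypted, tagged data volume; the Availability Zone and KMS key alias are placeholders, and KmsKeyId can be omitted to fall back to the AWS-managed EBS key:

```python
import boto3

ec2 = boto3.client("ec2")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,                      # GiB
    VolumeType="gp3",
    Encrypted=True,
    KmsKeyId="alias/my-ebs-key",   # placeholder customer managed key
    TagSpecifications=[{
        "ResourceType": "volume",
        "Tags": [{"Key": "Name", "Value": "app-data"}],
    }],
)
print(volume["VolumeId"])
```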

Storage

  • EC2 supports Instance store and EBS volumes, so it’s best to understand the implications of the root device type for data persistence, backup, and recovery
  • Use separate Amazon EBS volumes for the operating system (root device) versus your data.
  • Ensure that the data volume (with the data) persists after instance termination.
  • Use the instance store available for the instance to only store temporary data. Remember that the data stored in the instance store is deleted when an instance is stopped or terminated.
  • If you use instance store for database storage, ensure that you have a cluster with a replication factor that ensures fault tolerance.

Resource Management

  • Use instance metadata and custom resource tags to track and identify your AWS resources (a tagging sketch follows this list)
  • View your current limits for Amazon EC2. Plan to request any limit increases in advance of the time that you’ll need them.
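A minimal tagging sketch with boto3; the resource IDs and tag values are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Tag an instance and its volume so cost and ownership can be tracked
ec2.create_tags(
    Resources=["i-0123456789abcdef0", "vol-0123456789abcdef0"],  # placeholder IDs
    Tags=[
        {"Key": "Name", "Value": "web-server-1"},
        {"Key": "Environment", "Value": "production"},
        {"Key": "Owner", "Value": "platform-team"},
    ],
)
```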

Backup & Recovery

  • Regularly back up the instance using Amazon EBS snapshots (not done automatically) or a backup tool; a minimal snapshot sketch follows this list
  • Use Data Lifecycle Manager (DLM) to automate the creation, retention, and deletion of the snapshots taken to back up the EBS volumes
  • Create an Amazon Machine Image (AMI) from the instance to save the configuration as a template for launching future instances.
  • Implement High Availability by deploying critical components of the application across multiple Availability Zones, and replicate the data appropriately
  • Monitor and respond to events.
  • Design the applications to handle dynamic IP addressing when the instance restarts.
  • Implement failover. For a basic solution, you can manually attach a network interface or Elastic IP address to a replacement instance
  • Regularly test the process of recovering your instances and Amazon EBS volumes if they fail.
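A minimal boto3 sketch covering the snapshot and AMI points above; the volume and instance IDs are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Point-in-time backup of a data volume
snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",        # placeholder volume ID
    Description="nightly backup of app-data",
)
print(snapshot["SnapshotId"])

# Capture the whole instance configuration as an AMI template
image = ec2.create_image(
    InstanceId="i-0123456789abcdef0",        # placeholder instance ID
    Name="web-server-golden-2024-01-01",
    NoReboot=True,   # avoid a reboot; file systems may not be fully consistent
)
print(image["ImageId"])
```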

References

AWS Encrypting Data at Rest – Whitepaper – Certification

Encrypting Data at Rest

  • AWS delivers a secure, scalable cloud computing platform with high availability, offering the flexibility for you to build a wide range of applications
  • AWS offers several options for encrypting data at rest, for an additional layer of security, ranging from completely automated AWS encryption solutions to manual client-side options
  • Encryption requires 3 things
    • Data to encrypt
    • Encryption keys
    • Cryptographic algorithm method to encrypt the data
  • AWS provides different models for securing data at rest based on the following parameters
    • Encryption method
      • Encryption algorithm selection involves evaluating security, performance, and compliance requirements specific to your application
    • Key Management Infrastructure (KMI)
      • KMI enables managing & protecting the encryption keys from unauthorized access
      • KMI provides
        • Storage layer that protects plain text keys
        • Management layer that authorizes key usage
  • Hardware Security Module (HSM)
    • Common way to protect keys in a KMI is using HSM
    • An HSM is a dedicated storage and data processing device that performs cryptographic operations using keys on the device.
    • An HSM typically provides tamper evidence, or resistance, to protect keys from unauthorized use.
    • A software-based authorization layer controls who can administer the HSM and which users or applications can use which keys in the HSM
  • AWS CloudHSM
    • AWS CloudHSM appliance has both physical and logical tamper detection and response mechanisms that trigger zeroization of the appliance.
    • Zeroization erases the HSM’s volatile memory where any keys in the process of being decrypted were stored and destroys the key that encrypts stored objects, effectively causing all keys on the HSM to be inaccessible and unrecoverable.
    • AWS CloudHSM can be used to generate and store key material and can perform encryption and decryption operations,
    • AWS CloudHSM, however, does not perform any key lifecycle management functions (e.g., access control policy, key rotation) and needs a compatible KMI.
    • KMI can be deployed either on-premises or within Amazon EC2 and can communicate to the AWS CloudHSM instance securely over SSL to help protect data and encryption keys.
    • AWS CloudHSM service uses SafeNet Luna appliances, any key management server that supports the SafeNet Luna platform can also be used with AWS CloudHSM
  • AWS Key Management Service (KMS)
    • AWS KMS is a managed encryption service that allows you to provision and use keys to encrypt data in AWS services and your applications.
    • Master keys, after creation, are designed never to be exported from the service.
    • AWS KMS gives you centralized control over who can access your master keys to encrypt and decrypt data, and it gives you the ability to audit this access.
    • Data can be sent to KMS to be encrypted or decrypted under a specific master key in your account.
    • AWS KMS is natively integrated with other AWS services (for e.g. Amazon EBS, Amazon S3, and Amazon Redshift) and AWS SDKs to simplify encryption of your data within those services or custom applications
    • AWS KMS provides global availability, low latency, and a high level of durability for your keys.

Encryption Models in AWS

Encryption models in AWS depend on whether you or AWS provides the encryption method and the KMI

  • You control the encryption method and the entire KMI
  • You control the encryption method, AWS provides the storage component of the KMI, and you provide the management layer of the KMI.
  • AWS controls the encryption method and the entire KMI.

Model A: You control the encryption method and the entire KMI

  • You use your own KMI to generate, store, and manage access to keys as well as control all encryption methods in your applications
  • Proper storage, management, and use of keys to ensure the confidentiality, integrity, and availability of your data is your responsibility
  • AWS has no access to your keys and cannot perform encryption or decryption on your behalf.
  • Amazon S3
    • Encryption of the data is done before the object is sent to AWS S3
    • Encryption of the data can be done using any encryption method and the encrypted data can be uploaded using the PUT request in the Amazon S3 API
    • Key used to encrypt the data needs to be stored securely in your KMI
    • To decrypt this data, the encrypted object can be downloaded from Amazon S3 using the GET request in the Amazon S3 API and then decrypted using the key in your KMI
    • AWS provides client-side encryption handling, where you can provide your key to the AWS S3 encryption client, which will encrypt and decrypt the data on your behalf. However, AWS never has access to the keys or the unencrypted data (a minimal client-side encryption sketch follows this section).
  • Amazon EBS
    • Amazon Elastic Block Store (Amazon EBS) provides block-level storage volumes for use with Amazon EC2 instances. Amazon EBS volumes are network-attached, and persist independently from the life of an instance.
    • Because Amazon EBS volumes are presented to an instance as a block device, you can leverage most standard encryption tools for file system-level or block-level encryption
    • Block level encryption
      • Block level encryption tools usually operate below the file system layer using kernel space device drivers to perform encryption and decryption of data.
      • These tools are useful when you want all data written to a volume to be encrypted regardless of what directory the data is stored in
    • File System level encryption
      • File system level encryption usually works by stacking an encrypted file system on top of an existing file system.
      • This method is typically used to encrypt a specific directory
    • These solutions require you to provide keys, either manually or from your KMI.
    • Both block-level and file system-level encryption tools can only be used to encrypt data volumes that are not Amazon EBS boot volumes, as they don’t allow you to automatically make a trusted key available to the boot volume at startup
    • There are third-party solutions available, which can help encrypt both the boot and data volumes as well as supply and protect keys
  • AWS Storage Gateway
    • AWS Storage Gateway is a service connecting an on-premises software appliance with Amazon S3. Data on disk volumes attached to the AWS Storage Gateway will be automatically uploaded to Amazon S3 based on policy
    • Encryption of the source data on the disk volumes can be either done before writing to the disk or using block level encryption on the iSCSI endpoint that AWS Storage Gateway exposes to encrypt all data on the disk volume.
  • Amazon RDS
    • Since Amazon RDS doesn’t expose the attached disks it uses for data storage, the transparent disk encryption techniques described in the EBS section cannot be applied.
    • However, individual fields can be encrypted before the data is written to RDS and decrypted after reading it.
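A minimal Model A sketch in Python. It does not use an SDK encryption client; instead it hand-rolls the client-side step with the cryptography package's Fernet cipher as a stand-in for "any encryption method", so the bucket name, key name, and key handling are all placeholders and the key storage is assumed to live in your own KMI:

```python
import boto3
from cryptography.fernet import Fernet   # symmetric cipher; stands in for any client-side method

s3 = boto3.client("s3")
bucket, key = "my-model-a-bucket", "reports/q1.csv"   # placeholders

# In Model A, generating, storing, and protecting this key is entirely your responsibility
data_key = Fernet.generate_key()
fernet = Fernet(data_key)

# Encrypt locally, then upload only the ciphertext; AWS never sees the key or plaintext
ciphertext = fernet.encrypt(b"plaintext file contents")
s3.put_object(Bucket=bucket, Key=key, Body=ciphertext)

# Download and decrypt with the key retrieved from your KMI
obj = s3.get_object(Bucket=bucket, Key=key)
plaintext = fernet.decrypt(obj["Body"].read())
```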

Model B: You control the encryption method, AWS provides the KMI storage component, and you provide the KMI management layer

  • Model B is similar to Model A where the encryption method is managed by you
  • Model B differs from Model A in that the keys are maintained in AWS CloudHSM rather than in an on-premises key storage system
  • Only you have access to the cryptographic partitions within the dedicated HSM to use the keys

Model C: AWS controls the encryption method and the entire KMI

  • AWS provides and manages the server-side encryption of your data, transparently managing the encryption method and the keys.
  • AWS KMS and other services that encrypt your data directly use a method called envelope encryption to provide a balance between performance and security.
  • Envelope Encryption method
    • A master key is defined either by you or AWS
    • A data key (data encryption key) is generated by the AWS service at the time when data encryption is requested
    • Data key is used to encrypt your data.
    • Data key is then encrypted with a key-encrypting key (master key) unique to the service storing your data.
    • Encrypted data key and the encrypted data are then stored by the AWS storage service on your behalf.
  • Master keys (key-encrypting keys) used to encrypt data keys are stored and managed separately from the data and the data keys
  • For decryption, the process is reversed: the encrypted data key is decrypted using the key-encrypting key, and the data key is then used to decrypt your data (a KMS envelope-encryption sketch follows this section)
  • Authorized use of encryption keys is done automatically and is securely managed by AWS.
  • Because unauthorized access to those keys could lead to the disclosure of your data, AWS has built systems and processes with strong access controls that minimize the chance of unauthorized access and had these systems verified by third-party audits to achieve security certifications including SOC 1, 2, and 3, PCI-DSS, and FedRAMP.
  • Amazon S3
    • SSE-S3
      • AWS encrypts each object using a unique data key
      • Data key is encrypted with a periodically rotated master key managed by S3
      • Amazon S3 server-side encryption uses 256-bit Advanced Encryption Standard (AES) keys for both object and master keys
    • SSE-KMS
      • Master keys are defined and managed in KMS for your account
      • Object Encryption
        • When an object is uploaded, a request is sent to KMS to create an object key.
        • KMS generates a unique object key and encrypts it using the master key; KMS then returns this encrypted object key along with the plaintext object key to Amazon S3.
        • Amazon S3 web server encrypts your object using the plaintext object key and stores the now encrypted object (with the encrypted object key) and deletes the plaintext object key from memory.
      • Object Decryption
        • To retrieve the encrypted object, Amazon S3 sends the encrypted object key to AWS KMS.
        • AWS KMS decrypts the object key using the correct master key and returns the decrypted (plaintext) object key to S3.
        • Amazon S3 decrypts the encrypted object, with the plaintext object key, and returns it to you.
    • SSE-C
      • Amazon S3 is provided an encryption key, while uploading the object
      • Encryption key is used by Amazon S3 to encrypt your data using AES-256
      • After object encryption, Amazon S3 deletes the encryption key
      • For downloading, you need to provide the same encryption key, which AWS matches, decrypts and returns the object
  • Amazon EBS
    • When Amazon EBS volume is created, you can choose the master key in KMS to be used for encrypting the volume
    • Volume encryption
      • Amazon EC2 server sends an authenticated request to AWS KMS to create a volume key.
      • AWS KMS generates this volume key, encrypts it using the master key, and returns the plaintext volume key and the encrypted volume key to the Amazon EC2 server.
      • Plaintext volume key is stored in memory to encrypt and decrypt all data going to and from your attached EBS volume.
    • Volume decryption
      • When the encrypted volume (or any encrypted snapshots derived
        from that volume) needs to be re-attached to an instance, a call is made to AWS KMS to decrypt the encrypted volume key.
      • AWS KMS decrypts this encrypted volume key with the correct master key and returns the decrypted volume key to Amazon EC2.
  • Amazon Glacier
    • Glacier provides encryption of the data, by default
    • Before it’s written to disk, data is always automatically encrypted using 256-bit AES keys unique to the Amazon Glacier service that are stored in separate systems under AWS control
  • AWS Storage Gateway
    • AWS Storage Gateway transfers your data to AWS over SSL
    • AWS Storage Gateway stores data encrypted at rest in Amazon S3 or Amazon Glacier using their respective server side encryption schemes.
  • Amazon RDS – Oracle
    • Oracle Advanced Security option for Oracle on Amazon RDS can be used to leverage the native Transparent Data Encryption (TDE) and Native Network Encryption (NNE) features
    • Oracle encryption module creates data and key-encrypting keys to encrypt the database
    • Key-encrypting keys specific to your Oracle instance on Amazon RDS are themselves encrypted by a periodically rotated 256-bit AES master key.
    • Master key is unique to the Amazon RDS service and is stored in separate systems under AWS control
  • Amazon RDS -SQL server
    • Transparent Data Encryption (TDE) can be provisioned for Microsoft SQL Server on Amazon RDS.
    • SQL Server encryption module creates data and key-encrypting keys to encrypt the database.
    • Key-encrypting keys specific to your SQL Server instance on Amazon RDS are themselves encrypted by a periodically rotated, regional 256-bit AES master key
    • Master key is unique to the Amazon RDS service and is stored in separate systems under AWS control
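The KMS-integrated services above perform envelope encryption for you; the sketch below reproduces the same pattern manually with boto3 so the flow is visible. The master key alias is a placeholder, and the local cipher (Fernet from the cryptography package) is an assumption, not the algorithm AWS services use internally:

```python
import base64
import boto3
from cryptography.fernet import Fernet

kms = boto3.client("kms")
master_key_id = "alias/my-master-key"   # placeholder customer master key alias

# 1. Ask KMS for a data key under the master key: plaintext and encrypted copies are returned
dk = kms.generate_data_key(KeyId=master_key_id, KeySpec="AES_256")
plaintext_key = base64.urlsafe_b64encode(dk["Plaintext"])   # Fernet expects base64-encoded 32 bytes
encrypted_key = dk["CiphertextBlob"]                        # store this alongside the data

# 2. Encrypt the data locally with the plaintext data key, then discard the plaintext key
ciphertext = Fernet(plaintext_key).encrypt(b"sensitive payload")

# 3. To decrypt later, unwrap the stored data key with KMS and reverse the process
unwrapped = base64.urlsafe_b64encode(kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"])
payload = Fernet(unwrapped).decrypt(ciphertext)
```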

Sample Exam Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. How can you secure data at rest on an EBS volume?
    1. Encrypt the volume using the S3 server-side encryption service
    2. Attach the volume to an instance using EC2’s SSL interface.
    3. Create an IAM policy that restricts read and write access to the volume.
    4. Write the data randomly instead of sequentially.
    5. Use an encrypted file system on top of the EBS volume
  2. Your company policies require encryption of sensitive data at rest. You are considering the possible options for protecting data while storing it at rest on an EBS data volume, attached to an EC2 instance. Which of these options would allow you to encrypt your data at rest? (Choose 3 answers)
    1. Implement third party volume encryption tools
    2. Do nothing as EBS volumes are encrypted by default
    3. Encrypt data inside your applications before storing it on EBS
    4. Encrypt data using native data encryption drivers at the file system level
    5. Implement SSL/TLS for all services running on the server
  3. A company is storing data on Amazon Simple Storage Service (S3). The company’s security policy mandates that data is encrypted at rest. Which of the following methods can achieve this? Choose 3 answers
    1. Use Amazon S3 server-side encryption with AWS Key Management Service managed keys
    2. Use Amazon S3 server-side encryption with customer-provided keys
    3. Use Amazon S3 server-side encryption with EC2 key pair.
    4. Use Amazon S3 bucket policies to restrict access to the data at rest.
    5. Encrypt the data on the client-side before ingesting to Amazon S3 using their own master key
    6. Use SSL to encrypt the data while in transit to Amazon S3.
  4. Which 2 services provide native encryption?
    1. Amazon EBS
    2. Amazon Glacier
    3. Amazon Redshift (is optional)
    4. Amazon RDS (is optional)
    5. Amazon Storage Gateway
  5. With which AWS services can CloudHSM be used? (Select 2)
    1. S3
    2. DynamoDb
    3. RDS
    4. ElastiCache
    5. Amazon Redshift

References

AWS S3 Data Consistency Model

  • S3 Data Consistency provides strong read-after-write consistency for PUT and DELETE requests of objects in the S3 bucket in all AWS Regions
  • This behavior applies to both writes to new objects as well as PUT requests that overwrite existing objects and DELETE requests.
  • Read operations on S3 Select, S3 ACLs, S3 Object Tags, and object metadata (for example, the HEAD object) are strongly consistent.
  • Updates to a single key are atomic, e.g. if you PUT to an existing key, a subsequent read might return the old data or the updated data, but it will never return corrupted or partial data.
  • S3 achieves high availability by replicating data across multiple servers within Amazon’s data centers. If a PUT request is successful, the data is safely stored. Any read (GET or LIST request) that is initiated following the receipt of a successful PUT response will return the data written by the PUT request.
  • S3 Data Consistency behavior examples
    • A process writes a new object to S3 and immediately lists keys within its bucket. The new object appears in the list.
    • A process replaces an existing object and immediately tries to read it. S3 returns the new data.
    • A process deletes an existing object and immediately tries to read it. S3 does not return any data because the object has been deleted.
    • A process deletes an existing object and immediately lists keys within its bucket. The object does not appear in the listing.
  • S3 does not currently support object locking for concurrent writes. for e.g. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you will need to build an object-locking mechanism into your application.
  • Updates are key-based; there is no way to make atomic updates across keys. for e.g, an update of one key cannot be dependent on the update of another key unless you design this functionality into the application.
  • S3 Object Lock is different as it allows to store objects using a write-once-read-many (WORM) model, which prevents an object from being deleted or overwritten for a fixed amount of time or indefinitely.
  • Previously, S3 provided strong Read-after-Write consistency only for PUTs of new objects
    • For a PUT request, S3 synchronously stores data across multiple facilities before returning SUCCESS
    • A process that wrote a new object to S3 was immediately able to read the object, i.e. PUT 200 -> GET 200
    • A process that wrote a new object to S3 and immediately listed keys within its bucket might not have seen the object in the list until the change was fully propagated
    • However, if a HEAD or GET request was made to a key name before the object was created, and the object was then created shortly after, a subsequent GET might not have returned the object due to eventual consistency, i.e. GET 404 -> PUT 200 -> GET 404
  • Previously, S3 provided only Eventual Consistency for overwrite PUTS and DELETES in all regions
    • For updates and deletes to objects, the changes were eventually reflected and not available immediately, i.e. PUT 200 -> PUT 200 -> GET 200 (might be older version) OR DELETE 200 -> GET 200
    • if a process replaced an existing object and immediately attempted to read it, S3 might have returned the prior data until the change was fully propagated
    • if a process deleted an existing object and immediately attempted to read it, S3 might have returned the deleted data until the deletion was fully propagated
    • if a process deleted an existing object and immediately listed keys within its bucket, S3 might have listed the deleted object until the deletion was fully propagated (a quick check of the current strong-consistency behavior is sketched below)
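A minimal boto3 sketch exercising the current strong read-after-write behavior; the bucket name and key are placeholders and the bucket is assumed to already exist:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-consistency-demo-bucket", "demo/object.txt"   # placeholders

# Write, then immediately read and list: with strong read-after-write consistency
# the new content and the new key are visible right away
s3.put_object(Bucket=bucket, Key=key, Body=b"color=white")

body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
assert body == b"color=white"

keys = [o["Key"] for o in s3.list_objects_v2(Bucket=bucket, Prefix="demo/").get("Contents", [])]
assert key in keys
```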

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which of the following are valid statements about Amazon S3? Choose 2 answers
    1. S3 provides read-after-write consistency for any type of PUT or DELETE. (S3 now provides strong read-after-write consistency)
    2. Consistency is not guaranteed for any type of PUT or DELETE.
    3. A successful response to a PUT request only occurs when a complete object is saved
    4. Partially saved objects are immediately readable with a GET after an overwrite PUT.
    5. S3 provides eventual consistency for overwrite PUTS and DELETES
  2. A customer is leveraging Amazon Simple Storage Service in eu-west-1 to store static content for web-based property. The customer is storing objects using the Standard Storage class. Where are the customers’ objects replicated?
    1. Single facility in eu-west-1 and a single facility in eu-central-1
    2. Single facility in eu-west-1 and a single facility in us-east-1
    3. Multiple facilities in eu-west-1
    4. A single facility in eu-west-1
  3. A user has an S3 object in the US Standard region with the content “color=red”. The user updates the object with the content “color=white”. If the user tries to read the value 1 minute after it was uploaded, what will S3 return?
    1. It will return “color=white” (strong read-after-write consistency)
    2. It will return “color=red”
    3. It will return an error saying that the object was not found
    4. It may return either “color=red” or “color=white” i.e. any of the value (Eventual Consistency)

AWS Simple Storage Service – S3

  • Amazon Simple Storage Service – S3 is a simple key-value object store designed for the Internet
  • provides unlimited storage space and works on the pay-as-you-use model. Service rates get cheaper as the usage volume increases
  • offers an extremely durable, highly available, and infinitely scalable data storage infrastructure at very low costs.
  • is Object-level storage (not Block level storage like EBS volumes) and cannot be used to host OS or dynamic websites.
  • S3 resources e.g. buckets and objects are private by default.

S3 Buckets & Objects

S3 Buckets

  • A bucket is a container for objects stored in S3
  • Buckets help organize the S3 namespace.
  • A bucket is owned by the AWS account that creates it and helps identify the account responsible for storage and data transfer charges.
  • Bucket names are globally unique, regardless of the AWS region in which they are created, and the namespace is shared by all AWS accounts
  • Even though S3 is a global service, buckets are created within a region specified during the creation of the bucket.
  • Every object is contained in a bucket
  • There is no limit to the number of objects that can be stored in a bucket and no difference in performance whether a single bucket or multiple buckets are used to store all the objects
  • The S3 data model is a flat structure i.e. there are no hierarchies or folders within the buckets. However, logical hierarchy can be inferred using the key name prefix e.g. Folder1/Object1
  • Restrictions
    • 100 buckets (soft limit) and a maximum of 1000 buckets can be created in each AWS account
    • Bucket names should be globally unique and DNS compliant
    • Bucket ownership is not transferable
    • Buckets cannot be nested and cannot have a bucket within another bucket
    • Bucket name and region cannot be changed, once created
  • Both empty and non-empty buckets can be deleted
  • S3 list requests return up to 1,000 objects per call and provide pagination support (a bucket-creation sketch, including the region constraint, follows this list)
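A minimal boto3 sketch of creating a bucket in a specific region; the bucket name is a placeholder and must be globally unique:

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

# Bucket names are global and DNS-compliant; the region is fixed at creation time.
# Note: us-east-1 is the default and must NOT be passed as a LocationConstraint.
s3.create_bucket(
    Bucket="my-globally-unique-bucket-name",   # placeholder
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)
```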

Objects

  • Objects are the fundamental entities stored in a bucket
  • An object is uniquely identified within a bucket by a key name and version ID (if S3 versioning is enabled on the bucket)
  • Objects consist of object data, metadata, and others
    • Key is the object name and a unique identifier for an object
    • Value is actual content stored
    • Metadata is the data about the data and is a set of name-value pairs that describe the object e.g. content-type, size, last modified. Custom metadata can also be specified at the time the object is stored.
    • Version ID is the version id for the object and in combination with the key helps to uniquely identify an object within a bucket
    • Subresources help provide additional information for an object
    • Access Control Information helps control access to the objects
  • S3 objects allow two kinds of metadata
    • System metadata
      • Metadata such as the Last-Modified date is controlled by the system. Only S3 can modify the value.
      • System metadata that the user can control, e.g., the storage class, and encryption configured for the object.
    • User-defined metadata
      • User-defined metadata can be assigned during uploading the object or after the object has been uploaded.
      • User-defined metadata is stored with the object and is returned when an object is downloaded
      • S3 does not process user-defined metadata.
      • User-defined metadata must begin with the prefix “x-amz-meta“, otherwise S3 will not set the key-value pair as you define it
  • Object metadata cannot be modified after the object is uploaded; it can only be changed by performing a copy operation and setting the metadata (see the metadata sketch after this section)
  • Objects belonging to a bucket that reside in a specific AWS region never leave that region, unless explicitly copied using Cross Region Replication
  • Each object can be up to 5 TB in size
  • An object can be retrieved as a whole or partially
  • With Versioning enabled, current as well as previous versions of an object can be retrieved
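A minimal boto3 sketch of setting, reading, and replacing object metadata; the bucket name, key, and metadata values are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-globally-unique-bucket-name", "docs/report.pdf"   # placeholders

# User-defined metadata set here is stored and returned as x-amz-meta-* headers
s3.put_object(
    Bucket=bucket,
    Key=key,
    Body=b"%PDF-...",
    ContentType="application/pdf",           # system metadata the user controls
    Metadata={"project": "alpha", "owner": "data-team"},
)

head = s3.head_object(Bucket=bucket, Key=key)
print(head["ContentType"], head["Metadata"])

# Metadata can only be changed by copying the object over itself with new metadata
s3.copy_object(
    Bucket=bucket, Key=key,
    CopySource={"Bucket": bucket, "Key": key},
    MetadataDirective="REPLACE",
    ContentType="application/pdf",
    Metadata={"project": "alpha", "owner": "platform-team"},
)
```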

S3 Bucket & Object Operations

  • Listing
    • S3 allows the listing of all the keys within a bucket
    • A single listing request would return a max of 1000 object keys with pagination support using an indicator in the response to indicate if the response was truncated
    • Keys within a bucket can be listed using Prefix and Delimiter.
    • Prefix limits the results to only those keys (a kind of filtering) that begin with the specified prefix, and the delimiter causes the list to roll up all keys that share a common prefix into a single summary list result (see the listing sketch after this section).
  • Retrieval
    • An object can be retrieved as a whole
    • An object can be retrieved in parts or partially (a specific range of bytes) by using the Range HTTP header.
    • Range HTTP header is helpful
      • if only a partial object is needed for e.g. multiple files were uploaded as a single archive
      • for fault-tolerant downloads where the network connectivity is poor
    • Objects can also be downloaded by sharing Pre-Signed URLs
    • Metadata of the object is returned in the response headers
  • Object Uploads
    • Single Operation – objects up to 5 GB in size can be uploaded in a single PUT operation
    • Multipart upload – must be used for objects larger than 5 GB, supports a maximum object size of 5 TB, and is recommended for objects above 100 MB.
    • Pre-Signed URLs can also be used and shared for uploading objects
    • A successful upload can be verified by checking that the request received a successful response; additionally, the returned ETag can be compared to the calculated MD5 value of the uploaded object
  • Copying Objects
    • Objects up to 5 GB can be copied in a single operation; the multipart upload API can be used for copies up to 5 TB
    • When an object is copied
      • user-controlled system metadata e.g. storage class and user-defined metadata are also copied.
      • system controlled metadata e.g. the creation date etc is reset
    • Copying Objects can be needed to
      • Create multiple object copies
      • Copy objects across locations or regions
      • Renaming of the objects
      • Change object metadata for e.g. storage class, encryption, etc
      • Updating any metadata for an object requires all the metadata fields to be specified again
  • Deleting Objects
    • S3 allows deletion of a single object or multiple objects (max 1000) in a single call
    • For Non Versioned buckets,
      • the object key needs to be provided and the object is permanently deleted
    • For Versioned buckets,
      • if an object key is provided, S3 inserts a delete marker and the previous current object becomes the non-current object
      • if an object key with a version ID is provided, the object is permanently deleted
      • if the version ID is of the delete marker, the delete marker is removed and the previous non-current version becomes the current version object
    • Deletion can be MFA enabled for adding extra security
  • Restoring Objects from Glacier
    • Objects must be restored before accessing an archived object
    • Restoration of an Object takes time and costs more. Glacier now offers expedited retrievals within minutes.
    • Restoration request also needs to specify the number of days for which the object copy needs to be maintained.
    • During this period, storage cost applies for both the archive and the copy.
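A minimal boto3 sketch covering listing with Prefix/Delimiter (the paginator handles the 1,000-key page limit) and a partial GET with the Range header; the bucket name and keys are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-globally-unique-bucket-name"   # placeholder

# List keys under a "folder" prefix
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix="Folder1/", Delimiter="/"):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])
    for cp in page.get("CommonPrefixes", []):   # rolled-up "sub-folders"
        print("prefix:", cp["Prefix"])

# Partial retrieval of the first 1 KB of an object using the Range header
part = s3.get_object(Bucket=bucket, Key="Folder1/Object1", Range="bytes=0-1023")
print(len(part["Body"].read()))
```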

Pre-Signed URLs

  • All buckets and objects are by default private.
  • Pre-signed URLs allows user to be able to download or upload a specific object without requiring AWS security credentials or permissions.
  • Pre-signed URL allows anyone to access the object identified in the URL, provided the creator of the URL has permission to access that object.
  • Pre-signed URL creation requires the creator to provide security credentials, a bucket name, an object key, an HTTP method (GET for downloading objects & PUT for uploading objects), and an expiration date and time
  • Pre-signed URLs are valid only until the expiration date & time (see the sketch after this section).
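A minimal boto3 sketch of generating pre-signed URLs; the bucket name and keys are placeholders, and the URLs inherit the permissions of the credentials that signed them:

```python
import boto3

s3 = boto3.client("s3")

# Anyone holding this URL can GET the object until it expires
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-globally-unique-bucket-name", "Key": "docs/report.pdf"},  # placeholders
    ExpiresIn=3600,   # seconds
)
print(url)

# A PUT URL for uploads works the same way
upload_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-globally-unique-bucket-name", "Key": "uploads/new-file.bin"},
    ExpiresIn=900,
)
```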

Multipart Upload

  • Multipart upload allows the user to upload a single large object as a set of parts. Each part is a contiguous portion of the object’s data.
  • Multipart uploads support 1 to 10000 parts and each part can be from 5MB to 5GB with the last part size allowed to be less than 5MB
  • Multipart uploads allow a max upload size of 5TB
  • Object parts can be uploaded independently and in any order. If transmission of any part fails, it can be retransmitted without affecting other parts.
  • After all parts of the object are uploaded and the completion request is initiated, S3 assembles these parts and creates the object.
  • Using multipart upload provides the following advantages:
    • Improved throughput – parallel upload of parts to improve throughput
    • Quick recovery from any network issues – Smaller part size minimizes the impact of restarting a failed upload due to a network error.
    • Pause and resume object uploads – Object parts can be uploaded over time. Once a multipart upload is initiated there is no expiry; you must explicitly complete or abort the multipart upload.
    • Begin an upload before the final object size is known – an object can be uploaded as it is being created
  • Three Step process
    • Multipart Upload Initiation
      • Initiation of a Multipart upload request to S3 returns a unique ID for each multipart upload.
      • This ID needs to be provided for each part upload, completion or abort request and listing of parts call.
      • All the Object metadata required needs to be provided during the Initiation call
    • Parts Upload
      • Parts upload of objects can be performed using the unique upload ID
      • A part number (between 1 – 10000) needs to be specified with each request which identifies each part and its position in the object
      • If a part with the same part number is uploaded, the previous part would be overwritten
      • After the part upload is successful, S3 returns an ETag header in the response which must be recorded along with the part number to be provided during the multipart completion request
    • Multipart Upload Completion or Abort
      • On Multipart Upload Completion request, S3 creates an object by concatenating the parts in ascending order based on the part number and associates the metadata with the object
      • Multipart Upload Completion request should include the unique upload ID with all the parts and the ETag information
      • The response includes an ETag that uniquely identifies the combined object data
      • On a Multipart Upload Abort request, the upload is aborted, all uploaded parts are removed, and any new part upload for that upload ID fails. However, part uploads already in progress may still complete, so an abort request should be issued (or retried) after all part uploads have finished.
      • S3 must receive a multipart upload completion or abort request, otherwise it will not delete the parts and storage will keep being charged (see the sketch below).
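A minimal boto3 sketch of the three-step multipart flow using the low-level API (the higher-level upload_file helper does this automatically); the bucket name, key, and local file name are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-globally-unique-bucket-name", "videos/large-file.bin"   # placeholders
PART_SIZE = 8 * 1024 * 1024   # 8 MB; every part except the last must be >= 5 MB

# 1. Initiate: S3 returns the upload ID used by every subsequent call
upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

parts = []
try:
    with open("large-file.bin", "rb") as f:   # placeholder local file
        part_number = 1
        while True:
            chunk = f.read(PART_SIZE)
            if not chunk:
                break
            # 2. Upload each part; record the returned ETag with its part number
            resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=upload_id,
                                  PartNumber=part_number, Body=chunk)
            parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
            part_number += 1

    # 3. Complete: S3 concatenates the parts in part-number order into one object
    s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                                 MultipartUpload={"Parts": parts})
except Exception:
    # Abort so the uploaded parts don't keep accruing storage charges
    s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
    raise
```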

S3 Transfer Acceleration

  • S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between the client and a bucket.
  • Transfer Acceleration takes advantage of CloudFront’s globally distributed edge locations. As the data arrives at an edge location, data is routed to S3 over an optimized network path.
  • Transfer Acceleration incurs additional charges, while uploading data to S3 over the public Internet is free (enabling and using it is sketched below).
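A minimal boto3 sketch of enabling Transfer Acceleration on a bucket and sending uploads through the accelerated endpoint; the bucket and file names are placeholders:

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket
s3.put_bucket_accelerate_configuration(
    Bucket="my-globally-unique-bucket-name",   # placeholder
    AccelerateConfiguration={"Status": "Enabled"},
)

# Clients then opt in to the accelerated endpoint (bucketname.s3-accelerate.amazonaws.com)
accelerated = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
accelerated.upload_file("big-video.mp4", "my-globally-unique-bucket-name", "videos/big-video.mp4")
```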

S3 Batch Operations

  • S3 Batch Operations help perform large-scale batch operations on S3 objects and can perform a single operation on lists of specified S3 objects.
  • A single job can perform a specified operation on billions of objects containing exabytes of data.
  • S3 tracks progress, sends notifications, and stores a detailed completion report of all actions, providing a fully managed, auditable, and serverless experience.
  • Batch Operations can be used with S3 Inventory to get the object list and use S3 Select to filter the objects.
  • Batch Operations can be used for copying objects, modifying object metadata, applying ACLs, encrypting objects, transforming objects, invoking a custom Lambda function, etc.

Virtual Hosted Style vs Path-Style Request

S3 allows the buckets and objects to be referred to in Path-style or Virtual hosted-style URLs

Path-style

  • Bucket name is not part of the domain (unless region specific endpoint used)
  • Endpoint used must match the region in which the bucket resides for e.g, if you have a bucket called mybucket that resides in the EU (Ireland) region with object named puppy.jpg, the correct path-style syntax URI is http://s3-eu-west-1.amazonaws.com/mybucket/puppy.jpg.
  • A “PermanentRedirect” error is received with an HTTP response code 301, and a message indicating what the correct URI is for the resource if a bucket is accessed outside the US East (N. Virginia) region with path-style syntax that uses either of the following:
    • http://s3.amazonaws.com
    • An endpoint for a region different from the one where the bucket resides for e.g., if you use http://s3-eu-west-1.amazonaws.com for a bucket that was created in the US West (N. California) region
  • Path-style requests would not be supported after September 30, 2020

Virtual hosted-style

  • S3 supports virtual hosted-style and path-style access in all regions.
  • In a virtual-hosted-style URL, the bucket name is part of the domain name in the URL for e.g. http://bucketname.s3.amazonaws.com/objectname
  • S3 virtual hosting can be used to address a bucket in a REST API call by using the HTTP Host header
  • Benefits
    • attractiveness of customized URLs,
    • provides an ability to publish to the “root directory” of the bucket’s virtual server. This ability can be important because many existing applications search for files in this standard location.
  • S3 updates DNS to reroute the request to the correct location when a bucket is created in any region, which might take time.
  • S3 routes any virtual hosted-style requests to the US East (N.Virginia) region, by default, if the US East (N. Virginia) endpoint s3.amazonaws.com is used, instead of the region-specific endpoint (for e.g., s3-eu-west-1.amazonaws.com) and S3 redirects it with HTTP 307 redirect to the correct region.
  • When using virtual hosted-style buckets with SSL, the SSL wildcard certificate only matches buckets that do not contain periods. To work around this, use HTTP or write your own certificate verification logic.
  • If you make a request to the http://bucket.s3.amazonaws.com endpoint, the DNS has sufficient information to route the request directly to the region where your bucket resides.

S3 Pricing

  • S3 costs vary by Region
  • Charges are incurred for
    • Storage – cost is per GB/month
    • Requests – per request cost varies depending on the request type GET, PUT
    • Data Transfer
      • data transfer-in is free
      • data transfer out is charged per GB (except within the same region or to Amazon CloudFront)

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. What does Amazon S3 stand for?
    1. Simple Storage Solution.
    2. Storage Storage Storage (triple redundancy Storage).
    3. Storage Server Solution.
    4. Simple Storage Service
  2. What are characteristics of Amazon S3? Choose 2 answers
    1. Objects are directly accessible via a URL
    2. S3 should be used to host a relational database
    3. S3 allows you to store objects of virtually unlimited size
    4. S3 allows you to store virtually unlimited amounts of data
    5. S3 offers Provisioned IOPS
  3. You are building an automated transcription service in which Amazon EC2 worker instances process an uploaded audio file and generate a text file. You must store both of these files in the same durable storage until the text file is retrieved. You do not know what the storage capacity requirements are. Which storage option is both cost-efficient and scalable?
    1. Multiple Amazon EBS volume with snapshots
    2. A single Amazon Glacier vault
    3. A single Amazon S3 bucket
    4. Multiple instance stores
  4. A user wants to upload a complete folder to AWS S3 using the S3 Management console. How can the user perform this activity?
    1. Just drag and drop the folder using the flash tool provided by S3
    2. Use the Enable Enhanced Folder option from the S3 console while uploading objects
    3. The user cannot upload the whole folder in one go with the S3 management console
    4. Use the Enable Enhanced Uploader option from the S3 console while uploading objects (NOTE – It's no longer supported by AWS)
  5. A media company produces new video files on-premises every day with a total size of around 100GB after compression. All files have a size of 1-2 GB and need to be uploaded to Amazon S3 every night in a fixed time window between 3am and 5am. Current upload takes almost 3 hours, although less than half of the available bandwidth is used. What step(s) would ensure that the file uploads are able to complete in the allotted time window?
    1. Increase your network bandwidth to provide faster throughput to S3
    2. Upload the files in parallel to S3 using multipart upload
    3. Pack all files into a single archive, upload it to S3, then extract the files in AWS
    4. Use AWS Import/Export to transfer the video files
  6. A company is deploying a two-tier, highly available web application to AWS. Which service provides durable storage for static content while utilizing lower Overall CPU resources for the web tier?
    1. Amazon EBS volume
    2. Amazon S3
    3. Amazon EC2 instance store
    4. Amazon RDS instance
  7. You have an application running on an Amazon Elastic Compute Cloud instance, that uploads 5 GB video objects to Amazon Simple Storage Service (S3). Video uploads are taking longer than expected, resulting in poor application performance. Which method will help improve performance of your application?
    1. Enable enhanced networking
    2. Use Amazon S3 multipart upload
    3. Leveraging Amazon CloudFront, use the HTTP POST method to reduce latency.
    4. Use Amazon Elastic Block Store Provisioned IOPs and use an Amazon EBS-optimized instance
  8. When you put objects in Amazon S3, what is the indication that an object was successfully stored?
    1. Each S3 account has a special bucket named_s3_logs. Success codes are written to this bucket with a timestamp and checksum.
    2. A success code is inserted into the S3 object metadata.
    3. A HTTP 200 result code and MD5 checksum, taken together, indicate that the operation was successful.
    4. Amazon S3 is engineered for 99.999999999% durability. Therefore there is no need to confirm that data was inserted.
  9. You have private video content in S3 that you want to serve to subscribed users on the Internet. User IDs, credentials, and subscriptions are stored in an Amazon RDS database. Which configuration will allow you to securely serve private content to your users?
    1. Generate pre-signed URLs for each user as they request access to protected S3 content
    2. Create an IAM user for each subscribed user and assign the GetObject permission to each IAM user
    3. Create an S3 bucket policy that limits access to your private content to only your subscribed users’ credentials
    4. Create a CloudFront Origin Identity user for your subscribed users and assign the GetObject permission to this user
  10. You run an ad-supported photo sharing website using S3 to serve photos to visitors of your site. At some point you find out that other sites have been linking to the photos on your site, causing loss to your business. What is an effective method to mitigate this?
    1. Remove public read access and use signed URLs with expiry dates.
    2. Use CloudFront distributions for static content.
    3. Block the IPs of the offending websites in Security Groups.
    4. Store photos on an EBS volume of the web server.
  11. You are designing a web application that stores static assets in an Amazon Simple Storage Service (S3) bucket. You expect this bucket to immediately receive over 150 PUT requests per second. What should you do to ensure optimal performance?
    1. Use multi-part upload.
    2. Add a random prefix to the key names.
    3. Amazon S3 will automatically manage performance at this scale. (With latest S3 performance improvements, S3 scaled automatically)
    4. Use a predictable naming scheme, such as sequential numbers or date time sequences, in the key names
  12. What is the maximum number of S3 buckets available per AWS Account?
    1. 100 Per region
    2. There is no Limit
    3. 100 Per Account (Refer documentation)
    4. 500 Per Account
    5. 100 Per IAM User
  13. Your customer needs to create an application to allow contractors to upload videos to Amazon Simple Storage Service (S3) so they can be transcoded into a different format. She creates AWS Identity and Access Management (IAM) users for her application developers, and in just one week, they have the application hosted on a fleet of Amazon Elastic Compute Cloud (EC2) instances. The attached IAM role is assigned to the instances. As expected, a contractor who authenticates to the application is given a pre-signed URL that points to the location for video upload. However, contractors are reporting that they cannot upload their videos. Which of the following are valid reasons for this behavior? Choose 2 answers { “Version”: “2012-10-17”, “Statement”: [ { “Effect”: “Allow”, “Action”: “s3:*”, “Resource”: “*” } ] }
    1. The IAM role does not explicitly grant permission to upload the object. (The role has all permissions for all activities on S3)
    2. The contractors' accounts have not been granted “write” access to the S3 bucket. (with pre-signed URLs the contractors' accounts don't need access; only the creator of the pre-signed URLs does)
    3. The application is not using valid security credentials to generate the pre-signed URL.
    4. The developers do not have access to upload objects to the S3 bucket. (the developers are not uploading the objects; the uploads are done using pre-signed URLs)
    5. The S3 bucket still has the associated default permissions. (does not matter as long as the user has permission to upload)
    6. The pre-signed URL has expired.

AWS S3 Data Protection

AWS S3 Data Protection

  • S3 protects data using a highly durable storage infrastructure designed for mission-critical and primary data storage.
  • Objects are redundantly stored on multiple devices across multiple facilities in an S3 region.
  • S3 PUT and PUT Object copy operations synchronously store the data across multiple facilities before returning SUCCESS.
  • Once the objects are stored, S3 maintains its durability by quickly detecting and repairing any lost redundancy.
  • S3 also regularly verifies the integrity of data stored using checksums. If S3 detects data corruption, it is repaired using redundant data.
  • In addition, S3 calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data
  • Data protection against accidental overwrites and deletions can be added by enabling Versioning to preserve, retrieve and restore every version of the object stored
  • S3 also provides the ability to protect data in transit (as it travels to and from S3) and at rest (while it is stored in S3)
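
A minimal sketch (Python with boto3, hypothetical bucket and key names) of enabling the Versioning mentioned above for added protection against accidental overwrites and deletions:

  import boto3

  s3 = boto3.client("s3")
  bucket = "examplebucket"  # hypothetical bucket name

  # Enable versioning so accidental overwrites and deletes remain recoverable.
  s3.put_bucket_versioning(
      Bucket=bucket,
      VersioningConfiguration={"Status": "Enabled"},
  )

  # With versioning enabled, each PUT creates a new version of the object.
  resp = s3.put_object(Bucket=bucket, Key="report.csv", Body=b"version 2 of the data")
  print(resp["VersionId"])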

S3 Encryption

Refer blog post @ S3 Encryption

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A customer is leveraging Amazon Simple Storage Service in eu-west-1 to store static content for a web-based property. The customer is storing objects using the Standard Storage class. Where are the customers objects replicated?
    1. A single facility in eu-west-1 and a single facility in eu-central-1
    2. A single facility in eu-west-1 and a single facility in us-east-1
    3. Multiple facilities in eu-west-1
    4. A single facility in eu-west-1
  2. A system admin is planning to encrypt all objects being uploaded to S3 from an application. The system admin does not want to implement his own encryption algorithm; instead he is planning to use server side encryption by supplying his own key (SSE-C). Which parameter is not required while making a call for SSE-C?
    1. x-amz-server-side-encryption-customer-key-AES-256
    2. x-amz-server-side-encryption-customer-key
    3. x-amz-server-side-encryption-customer-algorithm
    4. x-amz-server-side-encryption-customer-key-MD5

References

AWS_S3_Security

AWS S3 Permissions

AWS S3 Permissions

  • By default, all S3 buckets, objects, and related subresources are private.
  • Only the Resource owner, the AWS account (not the user) that creates the resource, can access the resource.
  • Resource owner can be
    • AWS account that creates the bucket or object owns those resources
    • If an IAM user creates the bucket or object, the AWS account of the IAM user owns the resource
    • If the bucket owner grants cross-account permissions to other AWS account users to upload objects to the buckets, the objects are owned by the AWS account of the user who uploaded the object and not the bucket owner except for the following conditions
      • Bucket owner can deny access to the object, as it is still the bucket owner who pays for the object
      • Bucket owner can delete or apply archival rules to the object and perform restoration
  • User is the AWS Account or the IAM user who accesses the resource
  • Bucket owner is the AWS account that created a bucket
  • Object owner is the AWS account that uploads the object, even if the bucket is owned by another account
  • S3 permissions are classified into
    • Resource based policies and
    • User policies

User Policies

  • User policies use IAM with S3 to control the type of access a user or group of users has to specific parts of an S3 bucket the AWS account owns
  • User policy is always attached to a User, Group, or a Role
  • Anonymous permissions cannot be granted
  • If an AWS account that owns a bucket wants to grant permission to users in its account, it can use either a bucket policy or a user policy
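
A minimal sketch (Python with boto3) of attaching an inline user policy; the user name, policy name, and bucket ARN are hypothetical and only illustrate granting an IAM user access to part of a bucket owned by the same account.

  import json
  import boto3

  iam = boto3.client("iam")

  # Hypothetical user and bucket names; the policy grants the user read/write
  # access to a single prefix of a bucket the account owns.
  policy = {
      "Version": "2012-10-17",
      "Statement": [{
          "Effect": "Allow",
          "Action": ["s3:GetObject", "s3:PutObject"],
          "Resource": "arn:aws:s3:::examplebucket/team-data/*",
      }],
  }

  iam.put_user_policy(
      UserName="example-user",
      PolicyName="ExampleS3Access",
      PolicyDocument=json.dumps(policy),
  )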

Resource-Based policies

  • Bucket policies and access control lists (ACLs) are resource-based because they are attached to the S3 resources

Bucket Policies

  • Bucket policy can be used to grant cross-account access to other AWS accounts or IAM users in other accounts for the bucket and objects in it.
  • Bucket policies provide centralized access control to buckets and objects based on a variety of conditions, including S3 operations, requesters, resources, and aspects of the request (e.g. IP address)
  • If an AWS account that owns a bucket wants to grant permission to users in its account, it can use either a bucket policy or a user policy
  • Permissions attached to a bucket apply to all of the objects in that bucket created and owned by the bucket owner
  • Policies can either add or deny permissions across all (or a subset) of objects within a bucket
  • Only the bucket owner is allowed to associate a policy with a bucket
  • Bucket policies can cater to multiple use cases
    • Granting permissions to multiple accounts with added conditions
    • Granting read-only permission to an anonymous user
    • Limiting access to specific IP addresses
    • Restricting access to a specific HTTP referer
    • Restricting access to a specific HTTP header for e.g. to enforce encryption
    • Granting permission to a CloudFront OAI
    • Adding a bucket policy to require MFA
    • Granting cross-account permissions to upload objects while ensuring the bucket owner has full control
    • Granting permissions for S3 inventory and Amazon S3 analytics
    • Granting permissions for S3 Storage Lens
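
As an example of the use case of limiting access to specific IP addresses listed above, the sketch below (Python with boto3, hypothetical bucket name and CIDR range) applies a bucket policy that denies any request not originating from the allowed IP range.

  import json
  import boto3

  s3 = boto3.client("s3")
  bucket = "examplebucket"  # hypothetical bucket name

  # Deny all requests to the bucket that do not originate from the allowed
  # CIDR range (placeholder range shown; adjust before applying for real).
  policy = {
      "Version": "2012-10-17",
      "Statement": [{
          "Sid": "DenyOutsideAllowedRange",
          "Effect": "Deny",
          "Principal": "*",
          "Action": "s3:*",
          "Resource": [
              f"arn:aws:s3:::{bucket}",
              f"arn:aws:s3:::{bucket}/*",
          ],
          "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
      }],
  }

  s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))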

Access Control Lists (ACLs)

  • Each bucket and object has an ACL associated with it.
  • An ACL is a list of grants identifying grantee and permission granted
  • ACLs are used to grant basic read/write permissions on resources to other AWS accounts.
  • ACL supports limited permissions set and
    • cannot grant conditional permissions, nor can you explicitly deny permissions
    • cannot be used to grant permissions for bucket subresources
  • Permission can be granted to an AWS account by the email address or the canonical user ID (which is just an obfuscated Account ID). If an email address is provided, S3 will still find the canonical user ID for the user and add it to the ACL.
  • It is recommended to use the canonical user ID, as granting by email address is not supported in all AWS Regions
  • Bucket ACL
    • The only recommended use case for the bucket ACL is to grant write permission to the S3 Log Delivery group to write access log objects to the bucket
    • Bucket ACL helps grant write permission on the bucket to the Log Delivery group if access log delivery is needed to the bucket
    • A bucket ACL is the only way to grant the necessary permissions to the Log Delivery group
  • Object ACL
    • Object ACLs control only Object-level Permissions
    • Object ACL is the only way to manage permission to an object in the bucket not owned by the bucket owner i.e. If the bucket owner allows cross-account object uploads and if the object owner is different from the bucket owner, the only way for the object owner to grant permissions on the object is through Object ACL
    • If the Bucket and Object is owned by the same AWS account, Bucket policy can be used to manage the permissions
    • If the Object and User is owned by the same AWS account, User policy can be used to manage the permissions
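
A minimal sketch (Python with boto3; bucket, key, and canonical user IDs are placeholders) covering both ACL uses described above: granting the Log Delivery group write access on a log bucket through the canned log-delivery-write ACL, and an object owner granting READ on its object to another account.

  import boto3

  s3 = boto3.client("s3")

  # Canned ACL granting the S3 Log Delivery group write access to a log bucket
  # (the recommended bucket ACL use case).
  s3.put_bucket_acl(Bucket="example-log-bucket", ACL="log-delivery-write")

  # Object ACL: the object owner grants READ to another account by canonical
  # user ID. Grant headers replace the existing ACL, so the owner's own
  # FULL_CONTROL grant is restated as well (IDs below are placeholders).
  s3.put_object_acl(
      Bucket="examplebucket",
      Key="shared/report.csv",
      GrantFullControl='id="OWNER_CANONICAL_USER_ID"',
      GrantRead='id="GRANTEE_CANONICAL_USER_ID"',
  )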

S3 Request Authorization

When S3 receives a request, it must evaluate all the user policies, bucket policies, and ACLs to determine whether to authorize or deny the request.

S3 evaluates the policies in 3 contexts

  • User context is basically the context in which S3 evaluates the User policy that the parent AWS account (context authority) attaches to the user
  • Bucket context is the context in which S3 evaluates the access policies owned by the bucket owner (context authority) to check if the bucket owner has not explicitly denied access to the resource
  • Object context is the context where S3 evaluates policies owned by the Object owner (context authority)

Analogy

  • Consider 3 Parents (AWS Account) A, B and C with Child (IAM User) AA, BA and CA respectively
  • Parent A owns a Toy box (Bucket) with Toy AAA and also allows toys (Objects) to be dropped and picked up
  • Parent A can grant permission (User Policy OR Bucket policy OR both) to his Child AA to access the Toy box and the toys
  • Parent A can grant permissions (Bucket policy) to Parent B (different AWS account) to drop toys into the toys box. Parent B can grant permissions (User policy) to his Child BA to drop Toy BAA
  • Parent B can grant permissions (Object ACL) to Parent A to access Toy BAA
  • Parent A can grant permissions (Bucket Policy) to Parent C to pick up the Toy AAA who in turn can grant permission (User Policy) to his Child CA to access the toy
  • Parent A can grant permission (through IAM Role) to Parent C to pick up the Toy BAA who in turn can grant permission (User Policy) to his Child CA to access the toy

Bucket Operation Authorization

  1. If the requester is an IAM user, the user must have permission (User Policy) from the parent AWS account to which it belongs
  2. Amazon S3 evaluates a subset of policies owned by the parent account. This subset of policies includes the user policy that the parent account attaches to the user.
  3. If the parent also owns the resource in the request (in this case, the bucket), Amazon S3 also evaluates the corresponding resource policies (bucket policy and bucket ACL) at the same time.
  4. Requester must also have permissions (Bucket Policy or ACL) from the bucket owner to perform a specific bucket operation.
  5. Amazon S3 evaluates a subset of policies owned by the AWS account that owns the bucket. The bucket owner can grant permission by using a bucket policy or bucket ACL.
  6. Note that, if the AWS account that owns the bucket is also the parent account of an IAM user, then it can configure bucket permissions in a user policy or bucket policy or both

Object Operation Authorization

  1. If the requester is an IAM user, the user must have permission (User Policy) from the parent AWS account to which it belongs.
  2. Amazon S3 evaluates a subset of policies owned by the parent account. This subset of policies includes the user policy that the parent attaches to the user.
  3. If the parent also owns the resource in the request (bucket, object), Amazon S3 evaluates the corresponding resource policies (bucket policy, bucket ACL, and object ACL) at the same time.
  4. If the parent AWS account owns the resource (bucket or object), it can grant resource permissions to its IAM user by using either the user policy or the resource policy.
  5. S3 evaluates policies owned by the AWS account that owns the bucket.
  6. If the AWS account that owns the object in the request is not the same as the bucket owner, in the bucket context Amazon S3 checks the policies if the bucket owner has explicitly denied access to the object.
  7. If there is an explicit deny set on the object, Amazon S3 does not authorize the request.
  8. Requester must have permissions from the object owner (Object ACL) to perform a specific object operation.
  9. Amazon S3 evaluates the object ACL.
  10. If bucket and object owners are the same, access to the object can be granted in the bucket policy, which is evaluated in the bucket context.
  11. If the owners are different, the object owners must use an object ACL to grant permissions.
  12. If the AWS account that owns the object is also the parent account to which the IAM user belongs, it can configure object permissions in a user policy, which is evaluated in the user context.

Permission Delegation

  • If an AWS account owns a resource, it can grant those permissions to another AWS account.
  • That account can then delegate those permissions, or a subset of them, to users in the account. This is referred to as permission delegation.
  • But an account that receives permissions from another account cannot delegate permission cross-account to another AWS account.
  • If the Bucket owner wants to grant another AWS account permission to an object it does not own, it cannot do so through cross-account permissions; it needs to define an IAM role that can be assumed by the other AWS account to gain access
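
A minimal sketch of that pattern (Python with boto3; the role ARN, bucket, and key are hypothetical): the other AWS account assumes a role defined in the bucket owner's account and uses the temporary credentials to access the object.

  import boto3

  # The bucket owner's account defines a role (hypothetical ARN below) that the
  # other AWS account is allowed to assume; the temporary credentials returned
  # by STS are then used to access the object.
  sts = boto3.client("sts")
  creds = sts.assume_role(
      RoleArn="arn:aws:iam::111122223333:role/example-s3-object-access",
      RoleSessionName="cross-account-object-access",
  )["Credentials"]

  s3 = boto3.client(
      "s3",
      aws_access_key_id=creds["AccessKeyId"],
      aws_secret_access_key=creds["SecretAccessKey"],
      aws_session_token=creds["SessionToken"],
  )
  obj = s3.get_object(Bucket="examplebucket", Key="partner-upload/data.csv")
  print(obj["ContentLength"])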

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which features can be used to restrict access to data in S3? Choose 2 answers
    1. Set an S3 ACL on the bucket or the object.
    2. Create a CloudFront distribution for the bucket.
    3. Set an S3 bucket policy.
    4. Enable IAM Identity Federation
    5. Use S3 Virtual Hosting
  2. Which method can be used to prevent an IP address block from accessing public objects in an S3 bucket?
    1. Create a bucket policy and apply it to the bucket
    2. Create a NACL and attach it to the VPC of the bucket
    3. Create an ACL and apply it to all objects in the bucket
    4. Modify the IAM policies of any users that would access the bucket
  3. A user has granted read/write permission of his S3 bucket using ACL. Which of the below mentioned options is a valid ID to grant permission to other AWS accounts (grantee) using ACL?
    1. IAM User ID
    2. S3 Secure ID
    3. Access ID
    4. Canonical user ID
  4. A root account owner has given full access of his S3 bucket to one of the IAM users using the bucket ACL. When the IAM user logs in to the S3 console, which actions can he perform?
    1. He can just view the content of the bucket
    2. He can do all the operations on the bucket
    3. It is not possible to give access to an IAM user using ACL
    4. The IAM user can perform all operations on the bucket using only API/SDK
  5. A root AWS account owner is trying to understand various options to set the permission to AWS S3. Which of the below mentioned options is not the right option to grant permission for S3?
    1. User Access Policy
    2. S3 Object Policy
    3. S3 Bucket Policy
    4. S3 ACL
  6. A system admin is managing buckets, objects and folders with AWS S3. Which of the below mentioned statements is true and should be taken in consideration by the sysadmin?
    1. Folders support only ACL
    2. Both the object and bucket can have an Access Policy but folder cannot have policy
    3. Folders can have a policy
    4. Both the object and bucket can have ACL but folders cannot have ACL
  7. A user has created an S3 bucket which is not publicly accessible. The bucket is having thirty objects which are also private. If the user wants to make the objects public, how can he configure this with minimal efforts?
    1. User should select all objects from the console and apply a single policy to mark them public
    2. User can write a program which programmatically makes all objects public using S3 SDK
    3. Set the AWS bucket policy which marks all objects as public
    4. Make the bucket ACL as public so it will also mark all objects as public
  8. You need to configure an Amazon S3 bucket to serve static assets for your public-facing web application. Which methods ensure that all objects uploaded to the bucket are set to public read? Choose 2 answers
    1. Set permissions on the object to public read during upload.
    2. Configure the bucket ACL to set all objects to public read.
    3. Configure the bucket policy to set all objects to public read.
    4. Use AWS Identity and Access Management roles to set the bucket to public read.
    5. Amazon S3 objects default to public read, so no action is needed.
  9. Amazon S3 doesn’t automatically give a user who creates _____ permission to perform other actions on that bucket or object.
    1. a file
    2. a bucket or object
    3. a bucket or file
    4. a object or file
  10. A root account owner is trying to understand the S3 bucket ACL. Which of the below mentioned options cannot be used to grant ACL on the object using the authorized predefined group?
    1. Authenticated user group
    2. All users group
    3. Log Delivery Group
    4. Canonical user group
  11. A user is enabling logging on a particular bucket. Which of the below mentioned options may be best suitable to allow access to the log bucket?
    1. Create an IAM policy and allow log access
    2. It is not possible to enable logging on the S3 bucket
    3. Create an IAM Role, which has access to the log bucket
    4. Provide ACL for the logging group
  12. A user is trying to configure access with S3. Which of the following options is not possible to provide access to the S3 bucket / object?
    1. Define the policy for the IAM user
    2. Define the ACL for the object
    3. Define the policy for the object
    4. Define the policy for the bucket
  13. A user is having access to objects of an S3 bucket, which is not owned by him. If he is trying to set the objects of that bucket public, which of the below mentioned options may be a right fit for this action?
    1. Make the bucket public with full access
    2. Define the policy for the bucket
    3. Provide ACL on the object
    4. Create an IAM user with permission
  14. A bucket owner has allowed another account’s IAM users to upload or access objects in his bucket. The IAM user of Account A is trying to access an object created by the IAM user of account B. What will happen in this scenario?
    1. The bucket policy may not be created as S3 will give error due to conflict of Access Rights
    2. It is not possible to give permission to multiple IAM users
    3. AWS S3 will verify proper rights given by the owner of Account A, the bucket owner as well as by the IAM user B to the object
    4. It is not possible that the IAM user of one account accesses objects of the other IAM user

References

AWS_S3_Access_Control

AWS S3 Object Lifecycle Management

S3 Object Lifecycle Management

  • S3 Object lifecycle can be managed by using a lifecycle configuration, which defines how S3 manages objects during their lifetime.
  • Lifecycle configuration simplifies object lifecycle management, for e.g. moving less frequently accessed objects to cheaper storage, backup or archival of data for several years, or permanent deletion of objects
  • S3 controls all transitions automatically
  • Lifecycle Management rules applied to a bucket are applicable to all the existing objects in the bucket as well as the objects added later
  • S3 Object lifecycle management allows 2 types of behavior
    • Transition in which the storage class for the objects changes
    • Expiration where the objects expire and are permanently deleted
  • Lifecycle Management can be configured with Versioning, which allows storage of one current object version and zero or more non-current object versions
  • Object’s lifecycle management applies to both Non Versioning and Versioning enabled buckets
  • For Non Versioned buckets
    • Transitioning period is considered from the object’s creation date
  • For Versioned buckets,
    • Transitioning period for the current object is calculated for the object creation date
    • Transitioning period for a non-current object is calculated for the date when the object became a noncurrent versioned object
    • S3 uses the number of days since its successor was created as the number of days an object is noncurrent.
  • S3 calculates the time by adding the number of days specified in the rule to the object creation time and rounding the resulting time to the next day midnight UTC, for e.g. if an object was created at 15/1/2016 10:30 AM UTC and the rule specifies 3 days, the result is 18/1/2016 10:30 AM UTC, rounded off to the next day midnight 19/1/2016 00:00 UTC.
  • Lifecycle configuration on MFA-enabled buckets is not supported.
  • 1000 lifecycle rules can be configured per bucket
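
A minimal sketch of a lifecycle configuration (Python with boto3; bucket name, prefix, and day counts are hypothetical) combining both behaviors, transitions for current versions and expiration for current and noncurrent versions:

  import boto3

  s3 = boto3.client("s3")

  # Hypothetical bucket and prefix: transition to STANDARD_IA after 30 days,
  # archive to GLACIER after 90 days, expire current versions after 365 days,
  # and clean up noncurrent versions 30 days after they become noncurrent.
  s3.put_bucket_lifecycle_configuration(
      Bucket="examplebucket",
      LifecycleConfiguration={
          "Rules": [{
              "ID": "archive-and-expire-logs",
              "Filter": {"Prefix": "logs/"},
              "Status": "Enabled",
              "Transitions": [
                  {"Days": 30, "StorageClass": "STANDARD_IA"},
                  {"Days": 90, "StorageClass": "GLACIER"},
              ],
              "Expiration": {"Days": 365},
              "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
          }]
      },
  )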

S3 Object Lifecycle Management Rules

Lifecycle Transitions Constraints

  1. STANDARD -> (128 KB & 30 days) -> STANDARD-IA or One Zone-IA or S3 Intelligent-Tiering
    • Larger Objects – Only objects with a size more than 128 KB can be transitioned, as cost benefits for transitioning to STANDARD-IA or One Zone-IA can be realized only for larger objects
    • Smaller Objects < 128 KB – S3 does not transition objects that are smaller than 128 KB
    • Minimum 30 days – Objects must be stored for at least 30 days in the current storage class before being transitioned to the STANDARD-IA or One Zone-IA, as younger objects are accessed more frequently or deleted sooner than is suitable for STANDARD-IA or One Zone-IA
  2. GLACIER -> (90 days) -> Permanent Deletion OR GLACIER Deep Archive -> (180 days) -> Permanent Deletion
    • Deleting data that is archived to Glacier is free if the objects deleted are archived for three months or longer.
    • S3 charges a prorated early deletion fee if the object is deleted or overwritten within three months of archiving it.
  3. Archival of objects to Glacier by using object lifecycle management is performed asynchronously and there may be a delay between the transition date in the lifecycle configuration rule and the date of the physical transition. However, AWS charges Glacier prices based on the transition date specified in the rule
  4. For a versioning-enabled bucket
    • Transition and Expiration actions apply to current versions.
    • NoncurrentVersionTransition and NoncurrentVersionExpiration actions apply to noncurrent versions and work similarly to the non-versioned objects except the time period is from the time the objects became noncurrent
  5. Expiration Rules
    • For Non Versioned bucket
      • Object is permanently deleted
    • For Versioned bucket
      • Expiration is applicable to the Current object only and does not impact any of the non-current objects
      • S3 will insert a Delete Marker object with a unique id and the previous current object becomes a non-current version
      • S3 will not take any action if the Current object is a Delete Marker
      • If the bucket has a single object which is the Delete Marker (referred to as expired object delete marker), S3 removes the Delete Marker
    • For Versioned Suspended bucket
      • S3 will insert a Delete Marker object with version ID null and overwrite any object with version ID null
  6. When an object reaches the end of its lifetime, S3 queues it for removal and removes it asynchronously. There may be a delay between the expiration date and the date at which S3 removes an object. Charges for storage time associated with an object that has expired are not incurred.
  7. Cost is incurred if objects are expired in STANDARD-IA before 30 days, GLACIER before 90 days, and GLACIER_DEEP_ARCHIVE before 180 days.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. If an object is stored in the Standard S3 storage class and you want to move it to Glacier, what must you do in order to properly migrate it?
    1. Change the storage class directly on the object.
    2. Delete the object and re-upload it, selecting Glacier as the storage class.
    3. None of the above.
    4. Create a lifecycle policy that will migrate it after a minimum of 30 days. (Any object uploaded to S3 must first be placed into either the Standard, Reduced Redundancy, or Infrequent Access storage class. Once in S3 the only way to move the object to glacier is through a lifecycle policy)
  2. A company wants to store their documents in AWS. Initially, these documents will be used frequently, and after a duration of 6 months, they would not be needed anymore. How would you architect this requirement?
    1. Store the files in Amazon EBS and create a Lifecycle Policy to remove the files after 6 months.
    2. Store the files in Amazon S3 and create a Lifecycle Policy to remove the files after 6 months.
    3. Store the files in Amazon Glacier and create a Lifecycle Policy to remove the files after 6 months.
    4. Store the files in Amazon EFS and create a Lifecycle Policy to remove the files after 6 months.
  3. Your firm has uploaded a large amount of aerial image data to S3. In the past, in your on-premises environment, you used a dedicated group of servers to batch process this data and used Rabbit MQ, an open source messaging system, to get job information to the servers. Once processed the data would go to tape and be shipped offsite. Your manager told you to stay with the current design, and leverage AWS archival storage and messaging services to minimize cost. Which is correct?
    1. Use SQS for passing job messages, use Cloud Watch alarms to terminate EC2 worker instances when they become idle. Once data is processed, change the storage class of the S3 objects to Reduced Redundancy Storage (Need to replace On-Premises Tape functionality)
    2. Setup Auto-Scaled workers triggered by queue depth that use spot instances to process messages in SQS. Once data is processed, change the storage class of the S3 objects to Reduced Redundancy Storage (Need to replace On-Premises Tape functionality)
    3. Setup Auto-Scaled workers triggered by queue depth that use spot instances to process messages in SQS. Once data is processed, change the storage class of the S3 objects to Glacier (Glacier suitable for Tape backup)
    4. Use SNS to pass job messages use Cloud Watch alarms to terminate spot worker instances when they become idle. Once data is processed, change the storage class of the S3 object to Glacier.
  4. You have a proprietary data store on-premises that must be backed up daily by dumping the data store contents to a single compressed 50GB file and sending the file to AWS. Your SLAs state that any dump file backed up within the past 7 days can be retrieved within 2 hours. Your compliance department has stated that all data must be held indefinitely. The time required to restore the data store from a backup is approximately 1 hour. Your on-premise network connection is capable of sustaining 1gbps to AWS. Which backup methods to AWS would be most cost-effective while still meeting all of your requirements?
    1. Send the daily backup files to Glacier immediately after being generated (will not meet the RTO)
    2. Transfer the daily backup files to an EBS volume in AWS and take daily snapshots of the volume (Not cost effective)
    3. Transfer the daily backup files to S3 and use appropriate bucket lifecycle policies to send to Glacier (Store in S3 for seven days and then archive to Glacier)
    4. Host the backup files on a Storage Gateway with Gateway-Cached Volumes and take daily snapshots (Not Cost-effective as local storage as well as S3 storage)

References

AWS_S3_Object_Lifecycle_Management

AWS Interaction Tools – Certification

AWS Interaction Tools Overview

AWS is API driven, and AWS Interaction Tools provide plenty of options to interact with its services, which include the following.

AWS Management console

  • AWS Management console is a graphical user interface to access AWS
  • AWS Management Console requires credentials in the form of User Name and Password to log in and uses the Query APIs underneath for its interaction with AWS

AWS Command Line Interface (CLI)

  • AWS Command Line Interface (CLI) is a unified tool that provides a consistent interface for interacting with all parts of AWS
  • Provides commands for a broad set of AWS products, and is supported on Windows, Mac, and Linux
  • CLI requires Access Key & Secret Key credentials and uses the Query APIs underneath for its interaction with AWS
  • CLI constructs and sends requests to AWS for you and, as part of that process, signs the requests using an access key that you provide.
  • CLI also takes care of many of the connection details, such as calculating signatures, handling request retries, and error handling.

Software Development Kit (SDKs)

  • Software Development Kits (SDKs) simplify using AWS services in your applications with an API tailored to your programming language or platform
  • SDKs currently support a wide range of languages which include Java, PHP, Ruby, Python, .Net, GO, Node.js etc
  • SDKs construct and send requests to AWS for you, and as part of that process, they sign the requests using an access key that you provide.
  • SDKs also take care of many of the connection details, such as calculating signatures, handling request retries, and error handling.
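
A minimal sketch with the Python SDK (boto3): the client signs every request with the configured credentials and applies its retry logic, which can be tuned rather than implemented by hand.

  import boto3
  from botocore.config import Config

  # The SDK signs requests with the configured credentials and retries
  # throttled or failed calls according to the retry mode set here.
  s3 = boto3.client(
      "s3",
      config=Config(retries={"max_attempts": 5, "mode": "standard"}),
  )

  for bucket in s3.list_buckets()["Buckets"]:
      print(bucket["Name"])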

Query APIs

  • Query APIs are HTTP or HTTPS requests that use the HTTP verb GET or POST and a Query parameter named “Action”
  • Query APIs require Access Key & Secret Key credentials for the interaction
  • Query APIs are the core of all the access tools and require you to calculate signatures and attach them to the request
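
To illustrate, the sketch below signs a raw Query API request (EC2 DescribeRegions chosen as an arbitrary example) with Signature Version 4; it leans on botocore's signing helper and the requests library instead of hand-calculating the signature.

  import requests
  from botocore.auth import SigV4Auth
  from botocore.awsrequest import AWSRequest
  from botocore.session import Session

  # Build a raw Query API request (Action parameter in the query string) and
  # sign it with Signature Version 4 before sending it.
  credentials = Session().get_credentials()
  request = AWSRequest(
      method="GET",
      url="https://ec2.us-east-1.amazonaws.com/?Action=DescribeRegions&Version=2016-11-15",
  )
  SigV4Auth(credentials, "ec2", "us-east-1").add_auth(request)

  response = requests.get(request.url, headers=dict(request.headers))
  print(response.status_code)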

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. REST or Query requests are HTTP or HTTPS requests that use an HTTP verb (such as GET or POST) and a parameter named Action or Operation that specifies the API you are calling.
    1. FALSE
    2. TRUE (Refer link)
  2. Through which of the following interfaces is AWS Identity and Access Management available?
    A) AWS Management Console
    B) Command line interface (CLI)
    C) IAM Query API
    D) Existing libraries

    1. Only through Command line interface (CLI)
    2. A, B and C
    3. A and C
    4. All of the above
  3. Which of the following programming languages have an officially supported AWS SDK? Choose 2 answers
    1. PHP
    2. Pascal
    3. Java
    4. SQL
    5. Perl
  4. HTTP Query-based requests are HTTP requests that use the HTTP verb GET or POST and a Query parameter named_____________.
    1. Action
    2. Value
    3. Reset
    4. Retrieve

AWS DDoS Resiliency – Best Practices – Whitepaper

AWS DDoS Resiliency Whitepaper

  • Denial of Service (DoS) is an attack, carried out by a single attacker, which attempts to make a website or application unavailable to the end users.
  • Distributed Denial of Service (DDoS) is an attack, carried out by multiple attackers either controlled or compromised by a group of collaborators, which generates a flood of requests to the application making it unavailable to the legitimate end users

Mitigation techniques

Minimize the Attack Surface Area

  • This is all about reducing the attack surface, the different Internet entry points that allow access to your application
  • Strategy to minimize the Attack surface area
    • reduce the number of necessary Internet entry points,
    • don’t expose back end servers,
    • eliminate non-critical Internet entry points,
    • separate end user traffic from management traffic,
    • obfuscate necessary Internet entry points to the level that untrusted end users cannot access them, and
    • decouple Internet entry points to minimize the effects of attacks.
  • Benefits
    • Minimizes the effective attack vectors and targets
    • Less to monitor and protect
  • Strategy can be achieved using AWS Virtual Private Cloud (VPC)
    • helps define a logically isolated virtual network within the AWS
    • provides ability to create Public & Private Subnets to launch the internet facing and non-public facing instances accordingly
    • provides NAT gateway which allows instances in the private subnet to have internet access without the need to launch them in public subnets with Public IPs
    • allows creation of Bastion host which can be used to connect to instances in the private subnets
    • provides the ability to configure security groups for instances and NACLs for subnets, which act as a firewall, to control and limit outbound and inbound traffic
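
A minimal sketch of limiting Internet entry points (Python with boto3; the VPC ID and CIDR ranges are placeholders): a security group for the public-facing tier that exposes only HTTPS to the Internet and restricts SSH to a bastion range.

  import boto3

  ec2 = boto3.client("ec2")

  # Hypothetical security group: expose only HTTPS from the Internet to the
  # public-facing tier; management access (SSH) is limited to a bastion CIDR.
  sg = ec2.create_security_group(
      GroupName="web-entry-point",
      Description="Only HTTPS from the Internet",
      VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
  )
  ec2.authorize_security_group_ingress(
      GroupId=sg["GroupId"],
      IpPermissions=[
          {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
           "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
          {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
           "IpRanges": [{"CidrIp": "10.0.0.0/24"}]},  # bastion subnet only
      ],
  )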

VPC Architecture

Be Ready to Scale to Absorb the Attack

  • DDoS attacks mainly aim to load the systems to the point where they cannot handle the load and are rendered unusable.
  • Scaling out Benefits
    • help build a resilient architecture
    • makes the attacker work harder
    • gives you time to think, analyze and adapt
  • AWS provided services :-
    • Auto Scaling & ELB
      • Horizontal scaling using Auto Scaling with ELB
      • Auto Scaling allows instances to be added and removed as the demand changes
      • ELB helps distribute the traffic across multiple EC2 instances while acting as a Single point of contact.
      • Auto Scaling automatically registers and deregisters EC2 instances with the ELB during scale out and scale in events
    • EC2 Instance
      • Vertical scaling can be achieved by using appropriate EC2 instance types for e.g. EBS optimized or ones with 10 Gigabit network connectivity to handle the load
    • Enhanced Networking
      • Use Instances with Enhanced Networking capabilities which can provide high packet-per-second performance, low latency networking, and improved scalability
    • Amazon CloudFront
      • CloudFront is a CDN, acts as a proxy between end users and the Origin servers, and helps distribute content to the end users without sending traffic to the Origin servers.
      • CloudFront has the inherent ability to help mitigate against both infrastructure and some application layer DDoS attacks by dispersing the traffic across multiple locations.
      • AWS has multiple Internet connections for capacity and redundancy at each location, which allows it to isolate attack traffic while serving content to legitimate end users
      • CloudFront also has filtering capabilities to ensure that only valid TCP connections and HTTP requests are made while dropping invalid requests. This takes the burden of handling invalid traffic (commonly used in UDP & SYN floods, and slow reads) off the origin.
    • Route 53
      • DDoS attacks are also targeted towards DNS, because if the DNS is unavailable, your application is effectively unavailable.
      • AWS Route 53 is a highly available and scalable DNS service and has capabilities to ensure access to the application even when under a DDoS attack
        • Shuffle Sharding – Shuffle sharding is similar to the concept of database sharding, where horizontal partitions of data are spread across separate database servers to spread load and provide redundancy. Similarly, Amazon Route 53 uses shuffle sharding to spread DNS requests over numerous PoPs, thus providing multiple paths and routes for your application.
        • Anycast Routing – Anycast routing increases redundancy by advertising the same IP address from multiple PoPs. In the event that a DDoS attack overwhelms one endpoint, shuffle sharding isolates failures while providing additional routes to your infrastructure.

Safeguard Exposed & Hard to Scale Expensive Resources

  • If entry points cannot be limited, additional measures are needed to restrict access and protect those entry points without interrupting legitimate end user traffic
  • AWS provided services :-
    • CloudFront
      • CloudFront can restrict access to content using Geo Restriction and Origin Access Identity
      • With Geo Restriction, access can be restricted to a set of whitelisted countries or prevent access from a set of black listed countries
      • Origin Access Identity is the CloudFront special user which allows access to the resources only through CloudFront while denying direct access to the origin content for e.g. if S3 is the Origin for CloudFront, S3 can be configured to allow access only from OAI and hence deny direct access
    • Route 53
      • Route 53 provides two features Alias Record sets & Private DNS to make it easier to scale infrastructure and respond to DDoS attacks
    • WAF
      • WAFs act as filters that apply a set of rules to web traffic. Generally, these rules cover exploits like cross-site scripting (XSS) and SQL injection (SQLi) but can also help build resiliency against DDoS by mitigating HTTP GET or POST floods
      • WAF provides a lot of features like
        • OWASP Top 10
        • HTTP rate limiting (where only a certain number of requests are allowed per user in a timeframe),
        • Whitelist or blacklist (customizable rules)
        • inspect and identify requests with abnormal patterns,
        • CAPTCHA etc
      • To prevent WAF from being a Single point of failure, a WAF sandwich pattern can be implemented where an autoscaled WAF sits between the Internet and Internal Load Balancer

DDOS Resiliency - WAF Sandwich Architecture

Learn Normal Behavior

  • Understand the normal levels and patterns of traffic for your application and use that as a benchmark for identifying abnormal traffic levels or resource usage spikes
  • Benefits
    • allows one to spot abnormalities
    • configure Alarms with accurate thresholds
    • assists with generating forensic data
  • AWS provided services for tracking
    • AWS CloudWatch monitoring
      • CloudWatch can be used to monitor your infrastructure and applications running in AWS. Amazon CloudWatch can collect metrics, log files, and set alarms for when these metrics have passed predetermined thresholds
    • VPC Flow Logs
      • Flow Logs help capture traffic to the instances in a VPC and can be used to understand the traffic patterns
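
A minimal sketch of encoding such a baseline as a CloudWatch alarm (Python with boto3; instance ID, threshold, and SNS topic ARN are placeholders): the alarm fires when inbound network traffic stays well above the observed normal level.

  import boto3

  cloudwatch = boto3.client("cloudwatch")

  # Alarm when inbound network traffic on an instance stays well above its
  # normal baseline for three consecutive 5-minute periods.
  cloudwatch.put_metric_alarm(
      AlarmName="abnormal-network-in",
      Namespace="AWS/EC2",
      MetricName="NetworkIn",
      Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
      Statistic="Sum",
      Period=300,
      EvaluationPeriods=3,
      Threshold=5_000_000_000,  # bytes per period; tune to the observed baseline
      ComparisonOperator="GreaterThanThreshold",
      AlarmActions=["arn:aws:sns:us-east-1:111122223333:ops-alerts"],  # placeholder SNS topic
  )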

Create a Plan for Attacks

  • Have a plan in place before an attack, which ensures that:
    • Architecture has been validated and techniques selected work for the infrastructure
    • Costs for increased resiliency have been evaluated and the goals of your defense are understood
    • Contact points have been identified

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You are designing a social media site and are considering how to mitigate distributed denial-of-service (DDoS) attacks. Which of the below are viable mitigation techniques? (Choose 3 answers)
    1. Add multiple elastic network interfaces (ENIs) to each EC2 instance to increase the network bandwidth.
    2. Use dedicated instances to ensure that each instance has the maximum performance possible.
    3. Use an Amazon CloudFront distribution for both static and dynamic content.
    4. Use an Elastic Load Balancer with auto scaling groups at the web app and Amazon Relational Database Service (RDS) tiers
    5. Add alert Amazon CloudWatch to look for high Network in and CPU utilization.
    6. Create processes and capabilities to quickly add and remove rules to the instance OS firewall.
  2. You’ve been hired to enhance the overall security posture for a very large e-commerce site. They have a well architected multi-tier application running in a VPC that uses ELBs in front of both the web and the app tier with static assets served directly from S3. They are using a combination of RDS and DynamoDB for their dynamic data and then archiving nightly into S3 for further processing with EMR. They are concerned because they found questionable log entries and suspect someone is attempting to gain unauthorized access. Which approach provides a cost effective scalable mitigation to this kind of attack?
    1. Recommend that they lease space at a DirectConnect partner location and establish a 1G DirectConnect connection to their VPC they would then establish Internet connectivity into their space, filter the traffic in hardware Web Application Firewall (WAF). And then pass the traffic through the DirectConnect connection into their application running in their VPC. (Not cost effective)
    2. Add previously identified hostile source IPs as an explicit INBOUND DENY NACL to the web tier subnet. (does not protect against new source)
    3. Add a WAF tier by creating a new ELB and an AutoScaling group of EC2 Instances running a host-based WAF. They would redirect Route 53 to resolve to the new WAF tier ELB. The WAF tier would then pass the traffic to the current web tier. The web tier Security Groups would be updated to only allow traffic from the WAF tier Security Group
    4. Remove all but TLS 1.2 from the web tier ELB and enable Advanced Protocol Filtering. This will enable the ELB itself to perform WAF functionality. (No advanced protocol filtering in ELB)

References

DDOS Whitepaper