AWS EBS Snapshot

EBS Snapshot

  • EBS provides the ability to create snapshots (backups) of any EBS volume and write a copy of the data in the volume to S3, where it is stored redundantly in multiple Availability Zones
  • Snapshots are incremental backups and store only the data that was changed from the time the last snapshot was taken.
  • Snapshots can be used to create new volumes, increase the size of the volumes or replicate data across Availability Zones.
  • Snapshot size can be smaller than the volume size, as the data is compressed before being saved to S3.
  • Even though snapshots are saved incrementally, the snapshot deletion process is designed so that you need to retain only the most recent snapshot in order to restore the volume.
  • EBS Snapshots can be used to migrate or create EBS volumes in different AZs or regions.

Multi-Volume Snapshots

  • Snapshots can be used to create a backup of critical workloads, such as a large database or a file system that spans across multiple EBS volumes.
  • Multi-volume snapshots help take exact point-in-time, data-coordinated, and crash-consistent snapshots across multiple EBS volumes attached to an EC2 instance.
  • It is no longer required to stop the instance or to coordinate between volumes to ensure crash consistency because snapshots are automatically taken across multiple EBS volumes.
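
As an illustration, a minimal boto3 sketch of taking a crash-consistent, multi-volume snapshot set in a single call (the instance ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")

# One API call snapshots every EBS volume attached to the instance at the
# same point in time, without stopping the instance.
response = ec2.create_snapshots(
    InstanceSpecification={
        "InstanceId": "i-0123456789abcdef0",  # placeholder instance ID
        "ExcludeBootVolume": False,           # include the root volume too
    },
    Description="Crash-consistent snapshot set",
    CopyTagsFromSource="volume",
)

for snapshot in response["Snapshots"]:
    print(snapshot["SnapshotId"], snapshot["VolumeId"], snapshot["State"])
```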

EBS Snapshot creation

  • Snapshots can be created from EBS volumes periodically and are point-in-time snapshots.
  • Snapshots are incremental and only store the blocks on the device that changed since the last snapshot was taken
  • Snapshots occur asynchronously; the point-in-time snapshot is created immediately while it takes time to upload the modified blocks to S3. While it is completing, an in-progress snapshot is not affected by ongoing reads and writes to the volume.
  • Snapshots can be taken from in-use volumes. However, snapshots will only capture data that was written to the EBS volume at the time the snapshot command is issued, excluding data cached by any applications or the OS.
  • Recommended ways to create a Snapshot from an EBS volume are
    • Pause all file writes to the volume
    • Unmount the Volume -> Take Snapshot -> Remount the Volume
    • Stop the instance – Take Snapshot (for root EBS volumes)
  • EBS volume created based on a snapshot
    • begins as an exact replica of the original volume that was used to create the snapshot.
    • replicated volume loads data in the background so that it can be used immediately.
    • If data that hasn’t been loaded yet is accessed, the volume immediately downloads the requested data from S3 and then continues loading the rest of the volume’s data in the background.
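
For example, a minimal boto3 sketch that snapshots a single volume and optionally waits for completion (the volume ID is a placeholder); the snapshot ID is returned immediately even though the upload to S3 continues in the background:

```python
import boto3

ec2 = boto3.client("ec2")

# The point-in-time snapshot is created immediately; modified blocks are
# uploaded to S3 asynchronously while the snapshot is 'pending'.
snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # placeholder volume ID
    Description="Pre-maintenance backup",
)
print(snapshot["SnapshotId"], snapshot["State"])  # e.g. 'pending'

# Optionally block until the upload finishes and the state is 'completed'.
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])
```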

EBS Snapshot Deletion

  • When a snapshot is deleted only the data exclusive to that snapshot is removed.
  • Deleting previous snapshots of a volume does not affect the ability to restore volumes from later snapshots of that volume.
  • Active snapshots contain all of the information needed to restore your data (from the time the snapshot was taken) to a new EBS volume.
  • Even though snapshots are saved incrementally, the snapshot deletion process is designed so that you need to retain only the most recent snapshot in order to restore the volume.
  • The snapshot of the root device of an EBS volume used by a registered AMI can’t be deleted; the AMI needs to be deregistered to be able to delete the snapshot.

EBS Snapshot Copy

  • Snapshots are constrained to the region in which they are created and can be used to launch EBS volumes within the same region only
  • Snapshots can be copied across regions to make it easier to leverage multiple regions for geographical expansion, data center migration, and disaster recovery
  • Snapshots are copied with S3 server-side encryption (256-bit Advanced Encryption Standard) to encrypt the data and the snapshot copy receives a snapshot ID that’s different from the original snapshot’s ID.
  • User-defined tags are not copied from the source to the new snapshot.
  • The first snapshot copy to another region is always a full copy, while subsequent copies of the same snapshot are incremental.
  • When a snapshot is copied,
    • it can be encrypted if currently unencrypted or
    • can be re-encrypted using a different encryption key. Changing the encryption status of a snapshot or using a non-default EBS CMK during a copy operation always results in a full (not incremental) copy.
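
A minimal boto3 sketch of a cross-region copy that also encrypts the snapshot (region names, IDs, and the key alias are placeholders); note the client is created in the destination region:

```python
import boto3

# copy_snapshot is called in the DESTINATION region.
ec2_dest = boto3.client("ec2", region_name="us-west-2")

copy = ec2_dest.copy_snapshot(
    SourceRegion="us-east-1",                   # where the source snapshot lives
    SourceSnapshotId="snap-0123456789abcdef0",  # placeholder snapshot ID
    Description="DR copy",
    Encrypted=True,                             # encrypt during the copy
    KmsKeyId="alias/my-cmk",                    # placeholder CMK; omit to use the default key
)

# The copy receives its own, new snapshot ID.
print(copy["SnapshotId"])
```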

EBS Snapshot Sharing

  • Snapshots can be shared by making them public or with specific AWS accounts by modifying the access permissions of the snapshots
  • Encrypted snapshots cannot be made public.
  • Snapshots encrypted with the default AWS-managed CMK cannot be shared at all; unencrypted snapshots and snapshots encrypted with a custom CMK can be shared.
  • An encrypted snapshot can be shared with specific AWS accounts; the custom CMK used to encrypt it must also be shared with those accounts.
  • Cross-account permissions may be applied to a custom key either when it is created or at a later time.
  • Users, with access to snapshots, can copy the snapshot and create their own EBS volumes based on the snapshot while the original snapshot remains unaffected
  • AWS prevents you from sharing snapshots that were encrypted with the default CMK
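
A minimal boto3 sketch of sharing a snapshot with a specific account (snapshot ID, key ID, and account ID are placeholders); for an encrypted snapshot, the custom CMK can be shared with a KMS grant:

```python
import boto3

ec2 = boto3.client("ec2")
kms = boto3.client("kms")

# Grant the target account permission to create volumes from the snapshot.
ec2.modify_snapshot_attribute(
    SnapshotId="snap-0123456789abcdef0",   # placeholder snapshot ID
    Attribute="createVolumePermission",
    OperationType="add",
    UserIds=["111122223333"],              # placeholder target account
)

# For an encrypted snapshot, also share the custom CMK used to encrypt it.
kms.create_grant(
    KeyId="1234abcd-12ab-34cd-56ef-1234567890ab",       # placeholder key ID
    GranteePrincipal="arn:aws:iam::111122223333:root",  # placeholder account
    Operations=["Decrypt", "DescribeKey", "CreateGrant"],
)
```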

EBS Snapshot Encryption

  • EBS snapshots fully support EBS encryption.
  • Snapshots of encrypted volumes are automatically encrypted
  • Volumes created from encrypted snapshots are automatically encrypted
  • All data in flight between the instance and the volume is encrypted
  • Volumes created from an unencrypted snapshot that you own or have access to can be encrypted on the fly by enabling encryption when creating the volume.
  • Unencrypted snapshots can be encrypted during the copy process.
  • Encrypted snapshots that you own or have access to can be re-encrypted with a different key during the copy process.
  • The first snapshot of an encrypted volume that has been created from an unencrypted snapshot is always a full snapshot.
  • The first snapshot of a re-encrypted volume, which has a different CMK than the source snapshot, is always a full snapshot.

EBS Snapshot Lifecycle Automation

  • Amazon Data Lifecycle Manager can be used to automate the creation, retention, and deletion of snapshots taken to back up the EBS volumes.
  • Automating snapshot management helps you to:
    • Protect valuable data by enforcing a regular backup schedule.
    • Retain backups as required by auditors or internal compliance.
    • Reduce storage costs by deleting outdated backups.
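
A minimal boto3 sketch of a Data Lifecycle Manager policy that snapshots all volumes tagged Backup=true once a day and retains the last 7 (the role ARN and tag are placeholders):

```python
import boto3

dlm = boto3.client("dlm")

dlm.create_lifecycle_policy(
    ExecutionRoleArn="arn:aws:iam::111122223333:role/AWSDataLifecycleManagerDefaultRole",
    Description="Daily EBS snapshots, keep 7",
    State="ENABLED",
    PolicyDetails={
        "ResourceTypes": ["VOLUME"],
        # Target every volume carrying this tag (placeholder tag).
        "TargetTags": [{"Key": "Backup", "Value": "true"}],
        "Schedules": [{
            "Name": "DailySnapshots",
            "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
            "RetainRule": {"Count": 7},   # outdated backups are deleted automatically
            "CopyTags": True,
        }],
    },
)
```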

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. An existing application stores sensitive information on a non-boot Amazon EBS data volume attached to an Amazon Elastic Compute Cloud instance. Which of the following approaches would protect the sensitive data on an Amazon EBS volume?
    1. Upload your customer keys to AWS CloudHSM. Associate the Amazon EBS volume with AWS CloudHSM. Remount the Amazon EBS volume.
    2. Create and mount a new, encrypted Amazon EBS volume. Move the data to the new volume. Delete the old Amazon EBS volume.
    3. Unmount the EBS volume. Toggle the encryption attribute to True. Re-mount the Amazon EBS volume.
    4. Snapshot the current Amazon EBS volume. Restore the snapshot to a new, encrypted Amazon EBS volume. Mount the Amazon EBS volume (Need to create a snapshot, create an encrypted copy of snapshot and then create an EBS volume and mount it)
  2. Is it possible to access your EBS snapshots?
    1. Yes, through the Amazon S3 APIs.
    2. Yes, through the Amazon EC2 APIs
    3. No, EBS snapshots cannot be accessed; they can only be used to create a new EBS volume.
    4. EBS doesn’t provide snapshots.
  3. Which of the following approaches provides the lowest cost for Amazon Elastic Block Store snapshots while giving you the ability to fully restore data?
    1. Maintain two snapshots: the original snapshot and the latest incremental snapshot
    2. Maintain a volume snapshot; subsequent snapshots will overwrite one another
    3. Maintain a single snapshot the latest snapshot is both Incremental and complete
    4. Maintain the most current snapshot, archive the original and incremental to Amazon Glacier.
  4. Which procedure for backing up a relational database on EC2 that is using a set of RAIDed EBS volumes for storage minimizes the time during which the database cannot be written to and results in a consistent backup?
    1. Detach EBS volumes, 2. Start EBS snapshot of volumes, 3. Re-attach EBS volumes
    2. Stop the EC2 Instance. 2. Snapshot the EBS volumes
    3. Suspend disk I/O, 2. Create an image of the EC2 Instance, 3. Resume disk I/O
    4. Suspend disk I/O, 2. Start EBS snapshot of volumes, 3. Resume disk I/O
    5. Suspend disk I/O, 2. Start EBS snapshot of volumes, 3. Wait for snapshots to complete, 4. Resume disk I/O
  5. How can an EBS volume that is currently attached to an EC2 instance be migrated from one Availability Zone to another?
    1. Detach the volume and attach it to another EC2 instance in the other AZ.
    2. Simply create a new volume in the other AZ and specify the original volume as the source.
    3. Create a snapshot of the volume, and create a new volume from the snapshot in the other AZ
    4. Detach the volume, then use the ec2-migrate-volume command to move it to another AZ.
  6. How are the EBS snapshots saved on Amazon S3?
    1. Exponentially
    2. Incrementally
    3. EBS snapshots are not stored in the Amazon S3
    4. Decrementally
  7. EBS Snapshots occur _____
    1. Asynchronously
    2. Synchronously
    3. Weekly
  8. What will be the status of the snapshot until the snapshot is complete?
    1. Running
    2. Working
    3. Progressing
    4. Pending
  9. Before I delete an EBS volume, what can I do if I want to recreate the volume later?
    1. Create a copy of the EBS volume (not a snapshot)
    2. Create and Store a snapshot of the volume
    3. Download the content to an EC2 instance
    4. Back up the data in to a physical disk
  10. Which of the following are true regarding encrypted Amazon Elastic Block Store (EBS) volumes? Choose 2 answers
    1. Supported on all Amazon EBS volume types
    2. Snapshots are automatically encrypted
    3. Available to all instance types
    4. Existing volumes can be encrypted
    5. Shared volumes can be encrypted
  11. Amazon EBS snapshots have which of the following two characteristics? (Choose 2.) Choose 2 answers
    1. EBS snapshots only save incremental changes from snapshot to snapshot
    2. EBS snapshots can be created in real-time without stopping an EC2 instance (the snapshot can be taken in real time; however, it will not be consistent, and the recommended way is to stop the instance or freeze the I/O)
    3. EBS snapshots can only be restored to an EBS volume of the same size or smaller (an EBS volume restored from a snapshot needs to be of the same or larger size)
    4. EBS snapshots can only be restored and mounted to an instance in the same Availability Zone as the original EBS volume (Snapshots are specific to a region and can be used to create a volume in any AZ; they do not depend on the original EBS volume’s AZ)
  12. A user is planning to schedule a backup for an EBS volume. The user wants security of the snapshot data. How can the user achieve data encryption with a snapshot?
    1. Use encrypted EBS volumes so that the snapshot will be encrypted by AWS (Refer link)
    2. While creating a snapshot select the snapshot with encryption
    3. By default the snapshot is encrypted by AWS
    4. Enable server side encryption for the snapshot using S3
  13. A sys admin is trying to understand EBS snapshots. Which of the below mentioned statements will not be useful to the admin to understand the concepts about a snapshot?
    1. Snapshot is synchronous
    2. It is recommended to stop the instance before taking a snapshot for consistent data
    3. Snapshot is incremental
    4. Snapshot captures the data that has been written to the hard disk when the snapshot command was executed
  14. When creation of an EBS snapshot is initiated but not completed, the EBS volume
    1. Cannot be detached or attached to an EC2 instance until the snapshot completes
    2. Can be used in read-only mode while the snapshot is in progress
    3. Can be used while the snapshot is in progress
    4. Cannot be used until the snapshot completes
  15. You have a server with a 500 GB Amazon EBS data volume. The volume is 80% full. You need to back up the volume at regular intervals and be able to re-create the volume in a new Availability Zone in the shortest time possible. All applications using the volume can be paused for a period of a few minutes with no discernible user impact. Which of the following backup methods will best fulfill your requirements?
    1. Take periodic snapshots of the EBS volume
    2. Use a third-party Incremental backup application to back up to Amazon Glacier
    3. Periodically back up all data to a single compressed archive and archive to Amazon S3 using a parallelized multi-part upload
    4. Create another EBS volume in the second Availability Zone, attach it to the Amazon EC2 instance, and use a disk manager to mirror the two disks
  16. A user is creating a snapshot of an EBS volume. Which of the below statements is incorrect in relation to the creation of an EBS snapshot?
    1. Its incremental
    2. It can be used to launch a new instance
    3. It is stored in the same AZ as the volume (stored in the same region)
    4. It is a point in time backup of the EBS volume
  17. A user has created a snapshot of an EBS volume. Which of the below mentioned usage cases is not possible with respect to a snapshot?
    1. Mirroring the volume from one AZ to another AZ
    2. Launch an instance
    3. Decrease the volume size
    4. Increase the size of the volume
  18. What is true of the way that encryption works with EBS?
    1. Snapshotting an encrypted volume makes an encrypted snapshot; restoring an encrypted snapshot creates an encrypted volume when specified / requested.
    2. Snapshotting an encrypted volume makes an encrypted snapshot when specified / requested; restoring an encrypted snapshot creates an encrypted volume when specified / requested.
    3. Snapshotting an encrypted volume makes an encrypted snapshot; restoring an encrypted snapshot always creates an encrypted volume.
    4. Snapshotting an encrypted volume makes an encrypted snapshot when specified / requested; restoring an encrypted snapshot always creates an encrypted volume.
  19. Why are more frequent snapshots of EBS Volumes faster?
    1. Blocks in EBS Volumes are allocated lazily, since while logically separated from other EBS Volumes, Volumes often share the same physical hardware. Snapshotting the first time forces full block range allocation, so the second snapshot doesn’t need to perform the allocation phase and is faster.
    2. The snapshots are incremental so that only the blocks on the device that have changed after your last snapshot are saved in the new snapshot.
    3. AWS provisions more disk throughput for burst capacity during snapshots if the drive has been pre-warmed by snapshotting and reading all blocks.
    4. The drive is pre-warmed, so block access is more rapid for volumes when every block on the device has already been read at least one time.
  20. Which is not a restriction on AWS EBS Snapshots?
    1. Snapshots which are shared cannot be used as a basis for other snapshots (Snapshots shared with other users are usable in full by the recipient, including but not limited to the ability to base modified volumes and snapshots on them)
    2. You cannot share a snapshot containing an AWS Access Key ID or AWS Secret Access Key
    3. You cannot share encrypted snapshots (NOTE: this has been partially updated; an encrypted snapshot can now be shared with other accounts when a custom CMK is used)
    4. Snapshot restorations are restricted to the region in which the snapshots are created
  21. There is a very serious outage at AWS. EC2 is not affected, but your EC2 instance deployment scripts stopped working in the region with the outage. What might be the issue?
    1. The AWS Console is down, so your CLI commands do not work.
    2. S3 is unavailable, so you can’t create EBS volumes from a snapshot you use to deploy new volumes. (EBS volume snapshots are stored in S3. If S3 is unavailable, snapshots are unavailable)
    3. AWS turns off the DeployCode API call when there are major outages, to protect from system floods.
    4. None of the other answers make sense. If EC2 is not affected, it must be some other issue.

AWS EBS Performance

AWS EBS Performance Tips

  • EBS performance depends on several factors, including I/O characteristics and the configuration of instances and volumes, and can be improved using PIOPS, EBS-Optimized instances, pre-warming (initialization), and RAID configurations.

EBS-Optimized or 10 Gigabit Network Instances

  • An EBS-Optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for EBS I/O.
  • Optimization provides the best performance for the EBS volumes by minimizing contention between EBS I/O and other traffic from an instance
  • EBS-Optimized instances deliver dedicated throughput to EBS depending on the instance type used.
  • Not all instance types support EBS-Optimization
  • Some instance types have EBS-Optimization enabled by default, while for others it can be enabled explicitly.
  • When EBS optimization is enabled for an instance that is not EBS-Optimized by default, an additional low hourly fee for the dedicated capacity is charged.
  • When attached to an EBS–optimized instance,
    • General Purpose (SSD) volumes are designed to deliver within 10% of their baseline and burst performance 99.9% of the time in a given year
    • Provisioned IOPS (SSD) volumes are designed to deliver within 10% of their provisioned performance 99.9% of the time in a given year.

EBS Volume Initialization – Pre-warming

  • Empty EBS volumes receive their maximum performance the moment that they are available and DO NOT require initialization (pre-warming).
  • Previously, EBS volumes needed pre-warming before being used to get maximum performance from the start. Pre-warming was done by writing 0s to the entire volume for new volumes, or by reading the entire volume for volumes restored from snapshots.
  • Storage blocks on volumes that were restored from snapshots must be initialized (pulled down from S3 and written to the volume) before the block can be accessed.
  • This preliminary action takes time and can cause a significant increase in the latency of an I/O operation the first time each block is accessed.
  • To avoid this initial performance hit in a production environment, the following options can be used
    • Force the immediate initialization of the entire volume by using the dd or fio utilities to read from all of the blocks on a volume.
    • Enable fast snapshot restore – FSR on a snapshot to ensure that the EBS volumes created from it are fully-initialized at creation and instantly deliver all of their provisioned performance.
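
For the FSR option, a minimal boto3 sketch (the snapshot ID and Availability Zone are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# Volumes created from this snapshot in the listed AZs are fully initialized
# at creation and deliver their provisioned performance instantly
# (per-AZ charges apply while FSR is enabled).
ec2.enable_fast_snapshot_restores(
    AvailabilityZones=["us-east-1a"],               # placeholder AZ
    SourceSnapshotIds=["snap-0123456789abcdef0"],   # placeholder snapshot ID
)
```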

RAID Configuration

  • EBS volumes can be striped if a single EBS volume does not meet the performance requirements.
  • Striping volumes allows pushing tens of thousands of IOPS.
  • EBS volumes are already replicated across multiple servers in an AZ for availability and durability, so AWS generally recommend striping for performance rather than durability.
  • For greater I/O performance than can be achieved with a single volume, RAID 0 can stripe multiple volumes together; for on-instance redundancy, RAID 1 can mirror two volumes together.
  • RAID 0 allows I/O distribution across all volumes in a stripe, allowing straight gains with each addition.
  • RAID 1 can be used for durability to mirror volumes, but it requires more EC2-to-EBS bandwidth as the data is written to multiple volumes simultaneously, and should be used with EBS-Optimized instances.
  • EBS volume data is replicated across multiple servers in an AZ to prevent the loss of data from the failure of any single component
  • AWS doesn’t recommend RAID 5 and 6 because the parity write operations of these modes consume the IOPS available for the volumes and can result in 20-30% fewer usable IOPS than RAID 0.
  • A 2-volume RAID 0 config can outperform a 4-volume RAID 6 that costs twice as much.


AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A user is trying to pre-warm a blank EBS volume attached to a Linux instance. Which of the below mentioned steps should be performed by the user?
    1. There is no need to pre-warm an EBS volume (with latest update no pre-warming is needed)
    2. Contact AWS support to pre-warm (This used to be the case before, but pre warming is not necessary now)
    3. Unmount the volume before pre-warming
    4. Format the device
  2. A user has created an EBS volume of 10 GB and attached it to a running instance. The user is trying to access EBS for first time. Which of the below mentioned options is the correct statement with respect to a first time EBS access?
    1. The volume will show a size of 8 GB
    2. The volume will show a loss of the IOPS performance the first time (new volumes previously needed to be pre-warmed by wiping them clean; however, pre-warming is not needed any more)
    3. The volume will be blank
    4. If the EBS is mounted it will ask the user to create a file system
  3. You are running a database on an EC2 instance, with the data stored on Elastic Block Store (EBS) for persistence. At times throughout the day, you are seeing large variance in the response times of the database queries. Looking into the instance with the iostat command, you see a lot of wait time on the disk volume that the database’s data is stored on. What two ways can you improve the performance of the database’s storage while maintaining the current persistence of the data? Choose 2 answers
    1. Move to an SSD backed instance
    2. Move the database to an EBS-Optimized Instance
    3. Use Provisioned IOPs EBS
    4. Use the ephemeral storage on an m2.4xLarge Instance Instead
  4. You have launched an EC2 instance with four (4) 500 GB EBS Provisioned IOPS volumes attached. The EC2 instance is EBS-Optimized and supports 500 Mbps throughput between EC2 and EBS. The four EBS volumes are configured as a single RAID 0 device, and each Provisioned IOPS volume is provisioned with 4,000 IOPS (4,000 16KB reads or writes) for a total of 16,000 random IOPS on the instance. The EC2 instance initially delivers the expected 16,000 IOPS random read and write performance. Sometime later, in order to increase the total random I/O performance of the instance, you add an additional two 500 GB EBS Provisioned IOPS volumes to the RAID. Each volume is provisioned to 4,000 IOPS like the original four, for a total of 24,000 IOPS on the EC2 instance. Monitoring shows that the EC2 instance CPU utilization increased from 50% to 70%, but the total random IOPS measured at the instance level does not increase at all. What is the problem and a valid solution?
    1. Larger storage volumes support higher Provisioned IOPS rates: increase the provisioned volume storage of each of the 6 EBS volumes to 1TB.
    2. EBS-Optimized throughput limits the total IOPS that can be utilized; use an EBS-Optimized instance that provides larger throughput. (EC2 instance types have a limit on max throughput and a larger instance type would be required to provide 24,000 IOPS)
    3. Small block sizes cause performance degradation, limiting the I/O throughput; configure the instance device driver and file system to use 64KB blocks to increase throughput.
    4. RAID 0 only scales linearly to about 4 devices; use RAID 0 with 4 EBS Provisioned IOPS volumes, but increase each Provisioned IOPS EBS volume to 6,000 IOPS.
    5. The standard EBS instance root volume limits the total IOPS rate; change the instance root volume to also be a 500GB 4,000 Provisioned IOPS volume.
  5. A user has deployed an application on an EBS backed EC2 instance. For a better performance of application, it requires dedicated EC2 to EBS traffic. How can the user achieve this?
    1. Launch the EC2 instance as EBS provisioned with PIOPS EBS
    2. Launch the EC2 instance as EBS enhanced with PIOPS EBS
    3. Launch the EC2 instance as EBS dedicated with PIOPS EBS
    4. Launch the EC2 instance as EBS optimized with PIOPS EBS

AWS EBS Volume Types

EBS Volume Types

  • AWS provides the following EBS volume types, which differ in performance characteristics and price and can be tailored for storage performance and cost to the needs of the applications.
  • Solid state drives (SSD-backed) volumes optimized for transactional workloads involving frequent read/write operations with small I/O size, where the dominant performance attribute is IOPS
    • General Purpose SSD (gp2/gp3)
    • Provisioned IOPS SSD (io1/io2/io2 block express)
  • Hard disk drives (HDD-backed) volumes optimized for large streaming workloads where throughput (measured in MiB/s) is a better performance measure than IOPS
    • Throughput Optimized HDD (st1)
    • Cold HDD (sc1)
    • Magnetic Volumes (standard) (Previous Generation)


Solid state drives (SSD-backed) volumes


General Purpose SSD Volumes (gp2/gp3)

  • General Purpose SSD volumes offer cost-effective storage that is ideal for a broad range of workloads.
  • General Purpose SSD volumes deliver single-digit millisecond latencies
  • General Purpose SSD volumes can range in size from 1 GiB to 16 TiB.
  • General Purpose SSD (gp2) volumes
    • has a maximum throughput of 160 MiB/s (at 214 GiB and larger).
    • provides a baseline performance of 3 IOPS/GiB
    • provides the ability to burst to 3,000 IOPS for extended periods of time for volume size less than 1 TiB and up to a maximum of 16,000 IOPS (at 5,334 GiB).
    • If the volume performance is frequently limited to the baseline level (due to an empty I/O credit balance),
      • consider using a larger General Purpose SSD volume (with a higher baseline performance level) or
      • switching to a Provisioned IOPS SSD volume for workloads that require sustained IOPS performance greater than 16,000 IOPS.
  • General Purpose SSD (gp3) volumes
    • deliver a consistent baseline rate of 3,000 IOPS and 125 MiB/s, included with the price of storage.
    • additional IOPS (up to 16,000) and throughput (up to 1,000 MiB/s) can be provisioned for an additional cost.
    • the maximum ratio of provisioned IOPS to provisioned volume size is 500 IOPS per GiB
    • the maximum ratio of provisioned throughput to provisioned IOPS is 0.25 MiB/s per IOPS.
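
A minimal boto3 sketch of provisioning a gp3 volume above the 3,000 IOPS / 125 MiB/s baseline (the AZ and sizes are illustrative and stay within the 500:1 IOPS-per-GiB and 0.25 MiB/s-per-IOPS ratios):

```python
import boto3

ec2 = boto3.client("ec2")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # placeholder AZ
    Size=100,                        # GiB
    VolumeType="gp3",
    Iops=6000,                       # above the included 3,000 IOPS baseline
    Throughput=250,                  # MiB/s, above the included 125 MiB/s baseline
)
print(volume["VolumeId"])
```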

I/O Credits and Burst Performance

  • I/O credits represent the available bandwidth that the General Purpose SSD volume can use to burst large amounts of I/O when more than the baseline performance is needed.
  • General Purpose SSD (gp2) volume performance is governed by volume size, which dictates the baseline performance level of the volume for e.g. 100 GiB volume has a 300 IOPS @ 3 IOPS/GiB
  • General Purpose SSD volume size also determines how quickly the volume accumulates I/O credits, e.g., 100 GiB with a baseline of 300 IOPS accumulates 180,000 I/O credits per 10 minutes (300 × 60 × 10).
  • Larger volumes have higher baseline performance levels and accumulate I/O credits faster for e.g. 1 TiB has a baseline performance of 3000 IOPS
  • The more credits a volume has for I/O, the more time it can burst beyond its baseline performance level and the better it performs when more performance is needed, e.g., a 300 GiB volume with 180,000 I/O credits can burst @ 3,000 IOPS for 1 minute (180,000/3,000).
  • Each volume receives an initial I/O credit balance of 5,400,000 I/O credits, which is enough to sustain the maximum burst performance of 3,000 IOPS for 30 minutes.
  • Initial credit balance is designed to provide a fast initial boot cycle for boot volumes and a good bootstrapping experience for other applications.
  • Each volume can accumulate I/O credits over a period of time, which can be used to burst to the required performance level, up to a max of 3,000 IOPS.
  • Unused I/O credits cannot accumulate beyond the 5,400,000 I/O credit balance.

IOPS vs Volume size

  • Volumes up to 1 TiB can burst up to 3,000 IOPS over and above their baseline performance.
  • Volumes larger than 1 TiB have a baseline performance that is already equal to or greater than the maximum burst performance, and their I/O credit balance never depletes.
  • Baseline performance cannot go beyond the maximum of 16,000 IOPS for General Purpose SSD volumes; this limit is reached at 5,334 GiB.


Baseline Performance

  • Formula – volume size (GiB) × 3 IOPS/GiB
  • Calculation example
    • 1 GiB volume size = 3 IOPS (1 × 3)
    • 250 GiB volume size = 750 IOPS (250 × 3)

Maximum burst duration @ 3,000 IOPS

  • How long can the 5,400,000 I/O credit balance be sustained at the burst performance of 3,000 IOPS? Subtract the baseline performance (contributed by the volume size) from 3,000 IOPS, since credits are spent only on the IOPS above the baseline.
  • Formula – 5,400,000 / (3,000 − baseline performance)
  • Calculation example
    • 1 GiB volume size @ 3,000 IOPS with 5,400,000 credits: the burst performance can be maintained for 5,400,000/(3,000−3) ≈ 1,802 secs
    • 250 GiB volume size @ 3,000 IOPS with 5,400,000 credits: the burst performance can be maintained for 5,400,000/(3,000−750) = 2,400 secs

Time to fill the 5,400,000 I/O credit balance

  • Formula – 5,400,000 / baseline performance
  • Calculation example
    • 1 GiB volume size @ 3 IOPS would require 5,400,000/3 = 1,800,000 secs
    • 250 GiB volume size @ 750 IOPS would require 5,400,000/750 = 7,200 secs
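
The three formulas above can be captured in a small illustrative Python calculation (it mirrors the document’s simplified model; the actual gp2 service also applies a 100 IOPS floor to the baseline):

```python
CREDIT_BALANCE = 5_400_000   # initial and maximum I/O credit balance
BURST_IOPS = 3_000           # maximum burst rate for gp2

def baseline_iops(size_gib: int) -> float:
    """Baseline performance: 3 IOPS per GiB (simplified; gp2 floors this at 100)."""
    return size_gib * 3

def burst_seconds(size_gib: int) -> float:
    """How long the full credit balance sustains a 3,000 IOPS burst."""
    base = baseline_iops(size_gib)
    if base >= BURST_IOPS:
        return float("inf")  # >= 1 TiB: the credit balance never depletes
    return CREDIT_BALANCE / (BURST_IOPS - base)

def refill_seconds(size_gib: int) -> float:
    """Time to accumulate a full credit balance at the baseline rate."""
    return CREDIT_BALANCE / baseline_iops(size_gib)

print(round(burst_seconds(1)))     # ~1802 secs
print(round(burst_seconds(250)))   # 2400 secs
print(round(refill_seconds(250)))  # 7200 secs
```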

Provisioned IOPS SSD (io1/io2) Volumes

  • are designed to meet the needs of I/O intensive workloads, particularly database workloads, that are sensitive to storage performance and consistency in random access I/O throughput.
  • IOPS rate can be specified when the volume is created, and EBS delivers within 10% of the provisioned IOPS performance 99.9% of the time over a given year.
  • can range in size from 4 GiB to 16 TiB
  • have a throughput limit of 256 KiB/s for each IOPS provisioned, up to a maximum of 500 MiB/s (at 32,000 IOPS)
  • can be provisioned with up to 64,000 IOPS per volume.
  • Ratio of IOPS provisioned to the volume size requested can be a maximum of 50:1; e.g., a volume with 5,000 IOPS must be at least 100 GiB.
  • can be striped together in a RAID 0 configuration for larger size and performance greater than a single volume provides

Hard disk drives (HDD-backed) volumes


Throughput Optimized HDD (st1) Volumes

  • provide low-cost magnetic storage that defines performance in terms of throughput rather than IOPS.
  • is a good fit for large, sequential workloads such as EMR, ETL, data warehouses, and log processing
  • cannot be used as boot volumes
  • are designed to support frequently accessed data
  • uses a burst-bucket model for performance similar to gp2. Volume size determines the baseline throughput of the volume, which is the rate at which the volume accumulates throughput credits. Volume size also determines the burst throughput of your volume, which is the rate at which you can spend credits when they are available.

Cold HDD (sc1) Volumes

  • provide low-cost magnetic storage that defines performance in terms of throughput rather than IOPS.
  • With a lower throughput limit than st1, sc1 is a good fit for large, sequential cold-data workloads.
  • ideal for infrequently accessed data where cost savings matter; sc1 provides inexpensive block storage
  • cannot be used as boot volumes
  • though are similar to Throughput Optimized HDD (st1) volumes, are designed to support infrequently accessed data.
  • uses a burst-bucket model for performance similar to gp2. Volume size determines the baseline throughput of the volume, which is the rate at which the volume accumulates throughput credits. Volume size also determines the burst throughput of your volume, which is the rate at which you can spend credits when they are available.

Magnetic Volumes (standard)

Magnetic volumes provide the lowest cost per gigabyte of all EBS volume types. Magnetic volumes are backed by magnetic drives and are ideal for workloads performing sequential reads, workloads where data is accessed infrequently, and scenarios where the lowest storage cost is important.

  • Magnetic volumes can range in size from 1 GiB to 1 TiB
  • These volumes deliver approximately 100 IOPS on average, with burst capability of up to hundreds of IOPS
  • Magnetic volumes can be striped together in a RAID configuration for larger size and greater performance.

EBS Volume Types (Previous Generation – Reference Only)

EBS Volume Types Comparison

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You are designing an enterprise data storage system. Your data management software system requires mountable disks and a real filesystem, so you cannot use S3 for storage. You need persistence, so you will be using AWS EBS Volumes for your system. The system needs as low-cost storage as possible, and access is not frequent or high throughput, and is mostly sequential reads. Which is the most appropriate EBS Volume Type for this scenario?
    1. gp1
    2. io1
    3. standard (Standard or Magnetic volumes are suited for cold workloads where data is infrequently accessed, or scenarios where the lowest storage cost is important)
    4. gp2
  2. Which EBS volume type is best for high performance NoSQL cluster deployments?
    1. io1 (io1 volumes, or Provisioned IOPS (PIOPS) SSDs, are best for: Critical business applications that require sustained IOPS performance, or more than 10,000 IOPS or 160 MiB/s of throughput per volume, like large database workloads, such as MongoDB.)
    2. gp1
    3. standard
    4. gp2
  3. Provisioned IOPS Costs: you are charged for the IOPS and storage whether or not you use them in a given month.
    1. FALSE
    2. TRUE
  4. A user is trying to create a PIOPS EBS volume with 8 GB size and 450 IOPS. Will AWS create the volume?
    1. Yes, since the ratio between EBS and IOPS is less than 50
    2. No, since the PIOPS and EBS size ratio is less than 50
    3. No, the EBS size is less than 10 GB
    4. Yes, since PIOPS is higher than 100
  5. A user has provisioned 2000 IOPS to the EBS volume. The application hosted on that EBS is experiencing fewer IOPS than provisioned. Which of the below mentioned options does not affect the IOPS of the volume?
    1. The application does not have enough IO for the volume
    2. Instance is EBS optimized
    3. The EC2 instance has 10 Gigabit Network connectivity
    4. Volume size is too large
  6. A user is trying to create a PIOPS EBS volume with 6000 IOPS and 100 GB size. AWS does not allow the user to create this volume. What is the possible root cause for this?
    1. The ratio between IOPS and the EBS volume is higher than 50
    2. The maximum IOPS supported by EBS is 3000
    3. The ratio between IOPS and the EBS volume is lower than 100
    4. PIOPS is supported for EBS higher than 500 GB size


AWS EC2 Network – Enhanced Networking

EC2 Enhanced Networking

  • Enhanced networking results in higher bandwidth, higher packet-per-second (PPS) performance, lower latency, consistency, scalability, and lower jitter
  • EC2 provides enhanced networking capabilities using single root I/O virtualization (SR-IOV) only on supported instance types
    • SR-IOV is a method of device virtualization that provides higher I/O performance and lower CPU utilization
  • Amazon Linux AMIs and Windows Server 2012 R2 AMI already have the module installed with the attributes set and do not require any additional configurations.
  • It can be enabled for other OS distributions by installing the module with the correct attributes configured
  • Enhanced Networking is supported using
    • Elastic Network Adapter (ENA)
      • The Elastic Network Adapter (ENA) supports network speeds of up to 100 Gbps for supported instance types.
      • The current generation instances use ENA for enhanced networking, except for C4, D2, and M4 instances smaller than m4.16xlarge.
    • Intel 82599 Virtual Function (VF) interface
      • The Intel 82599 Virtual Function interface supports network speeds of up to 10 Gbps for supported instance types.
      • supported instance types: C3, C4, D2, I2, M4 (excl. m4.16xlarge), and R3.

VF Enhanced Networking Key Requirements

  • VPC, as enhanced networking can’t be enabled for instances in EC2-Classic
  • an HVM virtualization type AMI
  • Instance kernel version
    • Linux kernel version of 2.6.32+
    • Windows: Server 2008 R2+
  • Appropriate Virtual Function (VF) driver
    • Linux – should have the ixgbevf module installed and the sriovNetSupport attribute set for the instance
    • Windows- Intel 82599 Virtual Function driver
  • supported instance types: C3, C4, D2, I2, M4 (excl. m4.16xlarge), and R3.
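
As an illustration, a minimal boto3 sketch for checking and enabling the enhanced-networking attributes (the instance ID is a placeholder; the instance must be stopped before modifying these attributes):

```python
import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0123456789abcdef0"  # placeholder instance ID

# Check the Intel 82599 VF attribute ('simple' means enabled).
attr = ec2.describe_instance_attribute(
    InstanceId=instance_id, Attribute="sriovNetSupport"
)
print(attr.get("SriovNetSupport"))

# Enable VF-based enhanced networking (stop the instance first).
ec2.modify_instance_attribute(
    InstanceId=instance_id, SriovNetSupport={"Value": "simple"}
)

# For ENA-based enhanced networking, the equivalent attribute is EnaSupport.
ec2.modify_instance_attribute(InstanceId=instance_id, EnaSupport={"Value": True})
```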

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You have multiple Amazon EC2 instances running in a cluster across multiple Availability Zones within the same region. What combination of the following should be used to ensure the highest network performance (packets per second), lowest latency, and lowest jitter? Choose 3 answers
    1. Amazon EC2 placement groups (would not work for multiple AZs)
    2. Enhanced networking (provides network performance, lowest latency)
    3. Amazon PV AMI (Requires HVM)
    4. Amazon HVM AMI (Requires HVM)
    5. Amazon Linux (Can be on others as well)
    6. Amazon VPC (works only in VPC, can’t enable enhanced networking if the instance is in EC2-Classic)
  2. A group of researchers is studying the migration pattern of a beetle that eats and destroys gram. The researchers must process massive amounts of data and run statistics. Which one of the following options provides the high performance computing for this purpose?
    1. Configure an Auto Scaling group to launch dozens of spot instances to run the statistical analysis simultaneously
    2. Launch AMI instances that support SR-IOV in a single Availability Zone
    3. Launch compute optimized (C4) instances in at least two Availability Zones
    4. Launch enhanced network type instances in a placement group


AWS Web Application Firewall – WAF

AWS Web Application Firewall – WAF

  • AWS WAF – Web Application Firewall protects web applications from attacks by allowing rules configuration that allow, block, or monitor (count) web requests based on defined conditions.
  • helps protect from common attack techniques like SQL injection and Cross-Site Scripting (XSS); conditions can be based on IP addresses, HTTP headers, HTTP body, and URI strings.
  • tightly integrates with CloudFront, API Gateway, AppSync, and the Application Load Balancer (ALB) services used to deliver content for their websites and applications.
    • AWS WAF with Amazon CloudFront
      • AWS WAF rules run in all AWS Edge Locations, located around the world close to the end users.
      • Blocked requests are stopped before they reach the web servers.
      • Helps support custom origins outside of AWS.
    • AWS WAF with Application Load Balancer
      • WAF rules run in the region and can be used to protect internet-facing as well as internal load balancers.
    • AWS WAF with API Gateway
      • Can help secure and protect the REST APIs.
  • helps protect applications and can inspect web requests transmitted over HTTP or HTTPS.
  • provides Managed Rules, which are pre-configured rules to protect applications from common threats such as OWASP application vulnerabilities, bots, or Common Vulnerabilities and Exposures (CVE).
  • logs can be sent to the CloudWatch Logs log group, an S3 bucket, or Kinesis Data Firehose.

WAF Benefits

  • Additional protection against web attacks using specified conditions
  • Conditions can be defined by using characteristics of web requests such as the following:
    • IP addresses that the requests originate from
    • Values in request headers
    • Strings that appear in the requests
    • Length of requests
    • Presence of SQL code that is likely to be malicious (this is known as SQL injection)
    • Presence of a script that is likely to be malicious (this is known as cross-site scripting)
  • Managed Rules to get you started quickly
  • Rules that you can reuse for multiple web applications
  • Real-time metrics and sampled web requests
  • Automated administration using the WAF API

How WAF Works

WAF allows controlling the behaviour of web requests by creating conditions, rules, and web access control lists (web ACLs).


Conditions

  • Conditions define basic characteristics to watch for in a web request
    • Malicious script – XSS (Cross-Site Scripting) – attackers embed scripts that can exploit vulnerabilities in web applications
    • IP addresses or address ranges that requests originate from.
    • Size – Length of specified parts of the request, such as the query string.
    • Malicious SQL – SQL injection – Attackers try to extract data from the database by embedding malicious SQL code in a web request
    • Geographic match – Allow or block requests based on the country from which the requests originate.
    • Strings that appear in the request, for e.g., values that appear in the User-Agent header or text strings that appear in the query string.
      Some conditions take multiple values.

Actions

  • Allow all requests except the ones specified – blacklisting, e.g., allow all IP addresses except the ones specified
  • Block all requests except the ones specified – whitelisting, e.g., allow only the IP addresses the requests originate from
  • Monitor (Count) the requests that match the specified properties – allows counting of the requests that match the defined properties, which can be useful when configuring and testing allow or block requests using new properties. After confirming that the config did not accidentally block all of the traffic to the website, the configuration can be applied to change the behaviour to allow or block requests.
  • CAPTCHA – runs a CAPTCHA check against the request.

Rules

  • AWS WAF rule defines how to inspect HTTP(S) web requests and the action to take on a request when it matches the inspection criteria.
  • Each rule requires one top-level rule statement, which might contain nested statements at any depth, depending on the rule and statement type.
  • AWS WAF also supports the logical statements AND, OR, and NOT, which are used to combine statements in a rule, e.g.,
    • based on recent requests that you’ve seen from an attacker, you might create a rule that includes the following conditions with logical AND condition:
      • The requests come from 192.0.2.44.
      • They contain the value BadBot in the User-Agent header.
      • They appear to include malicious SQL code in the query string.
    • All 3 conditions must be satisfied for the rule to match and the associated action to be taken.
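
As a sketch of how such a rule could look with the WAFv2 API via boto3 (names, scope, and the IP set are illustrative; the three statements mirror the example above and are combined with AndStatement):

```python
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")

# IP set holding the attacker address (illustrative).
ip_set = wafv2.create_ip_set(
    Name="attacker-ips", Scope="REGIONAL",
    IPAddressVersion="IPV4", Addresses=["192.0.2.44/32"],
)["Summary"]

rule = {
    "Name": "block-badbot-sqli",
    "Priority": 0,
    "Statement": {"AndStatement": {"Statements": [
        # 1) the request originates from 192.0.2.44
        {"IPSetReferenceStatement": {"ARN": ip_set["ARN"]}},
        # 2) the User-Agent header contains 'BadBot'
        {"ByteMatchStatement": {
            "SearchString": b"BadBot",
            "FieldToMatch": {"SingleHeader": {"Name": "user-agent"}},
            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
            "PositionalConstraint": "CONTAINS",
        }},
        # 3) the query string appears to contain malicious SQL
        {"SqliMatchStatement": {
            "FieldToMatch": {"QueryString": {}},
            "TextTransformations": [{"Priority": 0, "Type": "URL_DECODE"}],
        }},
    ]}},
    "Action": {"Block": {}},
    "VisibilityConfig": {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "BlockBadBotSqli",
    },
}

wafv2.create_web_acl(
    Name="example-web-acl",
    Scope="REGIONAL",              # "CLOUDFRONT" (in us-east-1) for CloudFront
    DefaultAction={"Allow": {}},   # requests matching no rule are allowed
    Rules=[rule],
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "ExampleWebAcl",
    },
)
```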

Rule Groups

  • A Rule Group is a reusable set of rules that can be added to a Web ACL.
  • Rule groups fall into the following main categories
    • Managed rule groups, which AWS Managed Rules and AWS Marketplace sellers create and maintain for you
    • Your own rule groups, which you create and maintain
    • Rule groups that are owned and managed by other services, like AWS Firewall Manager and Shield Advanced.

Web ACLs – Access Control Lists

  • A Web Access Control List – Web ACL provides fine-grained control over all of the HTTP(S) web requests that the protected resource responds to.
  • Web ACLs provide
    • Rule groups or a combination of rules
    • Action – allow, block or count to perform for each rule
      • WAF compares a request with the rules in a web ACL in the order in which it is listed and takes the action that is associated with the first rule that the request matches.
      • For multiple rules in a web ACL, WAF evaluates each request against the rules in the order they are listed in the web ACL.
      • When a web request matches all of the conditions in a rule, WAF immediately takes the action – allow or block – and doesn’t evaluate the request against the remaining rules in the web ACL, if any.
    • Default action
      • determines whether WAF allows or blocks a request that does not match all of the conditions in any of the rules
  • Supports criteria like the following to allow or block requests
    • IP address origin of the request
    • Country of origin of the request
    • String match or regular expression (regex) match in a part of the request
    • Size of a particular part of the request
    • Detection of malicious SQL code or scripting
    • Rate based rules

AWS WAF based Architecture

AWS WAF Blacklist Example
  1. AWS WAF integration with CloudFront and Lambda to dynamically update WAF rules
  2. CloudFront receives requests on behalf of the web application, it sends access logs to an S3 bucket that contains detailed information about the requests.
  3. For every new access log stored in the S3 bucket, a Lambda function is triggered. The Lambda function parses the log files and looks for requests that resulted in error codes 400, 403, 404, and 405.
  4. Lambda function then counts the number of bad requests and temporarily stores results in the S3 bucket
  5. Lambda function updates AWS WAF rules to block the IP addresses for a period of time that you specify.
  6. After this blocking period has expired, AWS WAF allows those IP addresses to access your application again, but continues to monitor the requests from those IP addresses.
  7. Lambda function publishes execution metrics in CloudWatch, such as the number of requests analyzed and IP addresses blocked.
  8. CloudWatch metrics can be integrated with SNS for notification
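
Step 5 above could, for example, update a WAFv2 IP set that a blocking rule references; a minimal boto3 sketch (names and IDs are placeholders; WAFv2 updates require the current LockToken):

```python
import boto3

wafv2 = boto3.client("wafv2", region_name="us-east-1")

# Fetch the IP set first to obtain the LockToken required for any update.
current = wafv2.get_ip_set(
    Name="blocked-ips", Scope="REGIONAL",
    Id="a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",   # placeholder IP set ID
)

# Replace the address list with the offenders found in the access logs.
wafv2.update_ip_set(
    Name="blocked-ips", Scope="REGIONAL",
    Id="a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",
    Addresses=["192.0.2.44/32", "198.51.100.0/24"],  # placeholder offenders
    LockToken=current["LockToken"],
)
```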

Web Application Firewall Sandwich Architecture

NOTE: From the DDoS Resiliency whitepaper – this pattern doesn’t use AWS WAF and is no longer valid.


  • DDoS attacks at the application layer commonly target web applications with lower volumes of traffic compared to infrastructure attacks.
  • WAF can be included as part of the infrastructure to mitigate these types of attacks
  • WAFs act as filters that apply a set of rules to web traffic, which cover exploits like XSS and SQL injection but can also help build resiliency against DDoS by mitigating HTTP GET or POST floods.
  • HTTP works as a request-response protocol between end users and applications, where end users request data (GET) and submit data to be processed (POST). GET floods work by requesting the same URL at a high rate or requesting all objects from your application. POST floods work by finding expensive application processes, i.e., logins or database searches, and triggering those processes to overwhelm your application.
  • WAFs have several features that may prevent these types of attacks from affecting the application availability for e.g. HTTP rate limiting which limits the number of requests per end user within a certain time period. Once the threshold is exceeded, WAFs can block or buffer new requests to ensure other end users have access to the application.
  • WAFs can also inspect HTTP requests and identify those that don’t conform to normal patterns
  • In the “WAF sandwich,” the EC2 instance running the WAF software (not the AWS WAF service) is included in an Auto Scaling group and placed in between two ELB load balancers. A basic load balancer in the default VPC is the frontend, public-facing load balancer that distributes all incoming traffic to the WAF EC2 instances.
  • With the WAF sandwich pattern, the WAF tier can scale and add additional WAF EC2 instances should the traffic spike to elevated levels.
  • Once the traffic has been inspected and filtered, the WAF EC2 instance forwards traffic to the internal, backend load balancer, which then distributes traffic across the application EC2 instances.
  • This configuration allows the WAF EC2 instances to scale and meet capacity demands without affecting the availability of the application EC2 instances.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. The Web Application Development team is worried about malicious activity from 200 random IP addresses. Which action will ensure security and scalability from this type of threat?
    1. Use inbound security group rules to block the IP addresses.
    2. Use inbound network ACL rules to block the IP addresses.
    3. Use AWS WAF to block the IP addresses.
    4. Write iptables rules on the instance to block the IP addresses.
  2. You’ve been hired to enhance the overall security posture for a very large e-commerce site. They have a well architected multi-tier application running in a VPC that uses ELBs in front of both the web and the app tier with static assets served directly from S3. They are using a combination of RDS and DynamoDB for their dynamic data and then archiving nightly into S3 for further processing with EMR. They are concerned because they found questionable log entries and suspect someone is attempting to gain unauthorized access. Which approach provides a cost effective scalable mitigation to this kind of attack? [Old Exam Question]
    1. Recommend that they lease space at a Direct Connect partner location and establish a 1G Direct Connect connection to their VPC. They would then establish Internet connectivity into their space, filter the traffic through a hardware Web Application Firewall (WAF), and then pass the traffic through the Direct Connect connection into their application running in their VPC. (Not cost effective)
    2. Add previously identified hostile source IPs as an explicit INBOUND DENY NACL to the web tier subnet. (does not protect against new source)
    3. Add a WAF tier by creating a new ELB and an AutoScaling group of EC2 Instances running a host-based WAF. They would redirect Route 53 to resolve to the new WAF tier ELB. The WAF tier would then pass the traffic to the current web tier. Web tier Security Groups would be updated to only allow traffic from the WAF tier Security Group
    4. Remove all but TLS 1.2 from the web tier ELB and enable Advanced Protocol Filtering. This will enable the ELB itself to perform WAF functionality. (No advanced protocol filtering in ELB)


AWS Identity Services Cheat Sheet

AWS Identity Services Cheat Sheet


IAM – Identity & Access Management

  • securely control access to AWS services and resources
  • helps create and manage user identities and grant permissions for those users to access AWS resources
  • helps create groups for multiple users with similar permissions
  • not appropriate for application authentication
  • is Global and does not need to be migrated to a different region
  • helps define Policies,
    • in JSON format
    • all permissions are implicitly denied by default
    • most restrictive policy wins
  • IAM Role
    • helps grant and delegate access to users and services without the need to create permanent credentials
    • IAM users or AWS services can assume a role to obtain temporary security credentials that can be used to make AWS API calls (see the sketch after this list)
    • needs Trust policy to define who and Permission policy to define what the user or service can access
    • used with Security Token Service (STS), a lightweight web service that provides temporary, limited privilege credentials for IAM users or for authenticated federated users
    • IAM role scenarios
      • Service access for e.g. EC2 to access S3 or DynamoDB
      • Cross Account access for users
        • with a user within the same account
        • with a user within an AWS account owned by the same owner
        • with a user from a third-party AWS account, with an External ID for enhanced security
      • Identity Providers & Federation
        • AssumeRoleWithWebIdentity – Web Identity Federation, where the user can be authenticated using external identity providers like Amazon, Google, or any OpenID IdP
        • AssumeRoleWithSAML – Identity Provider using SAML 2.0, where the user can be authenticated using on-premises Active Directory, OpenLDAP, or any SAML 2.0-compliant IdP
        • AssumeRole (recommended) or GetFederationToken – For other Identity Providers, use Identity Broker to authenticate and provide temporary Credentials
  • IAM Best Practices
    • Do not use Root account for anything other than billing
    • Create Individual IAM users
    • Use groups to assign permissions to IAM users
    • Grant least privilege
    • Use IAM roles for applications on EC2
    • Delegate using roles instead of sharing credentials
    • Rotate credentials regularly
    • Use Policy conditions for increased granularity
    • Use CloudTrail to keep a history of activity
    • Enforce a strong IAM password policy for IAM users
    • Remove all unused users and credentials
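
A minimal boto3 sketch of the role-assumption flow referenced above (the role ARN, session name, and external ID are placeholders):

```python
import boto3

sts = boto3.client("sts")

# Assume a role in another account; ExternalId applies to the
# third-party cross-account scenario.
resp = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/CrossAccountRole",  # placeholder
    RoleSessionName="demo-session",
    ExternalId="example-external-id",  # placeholder, only if the trust policy requires it
    DurationSeconds=3600,
)

creds = resp["Credentials"]  # temporary, limited-privilege credentials

# Use the temporary credentials for subsequent AWS API calls.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```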

AWS Organizations

  • is an account management service that enables consolidating multiple AWS accounts into an organization that can be centrally managed.
  • includes consolidated billing and account management capabilities that enable one to better meet the budgetary, security, and compliance needs of a business.
  • As an administrator of an organization, new accounts can be created in the organization and existing accounts can be invited to join the organization.
  • enables you to
    • Automate AWS account creation and management, and provision resources with AWS CloudFormation Stacksets.
    • Maintain a secure environment with policies and management of AWS security services
    • Govern access to AWS services, resources, and regions
    • Centrally manage policies across multiple AWS accounts
    • Audit your environment for compliance 
    • View and manage costs with consolidated billing 
    • Configure AWS services across multiple accounts 
  • supports Service Control Policies – SCPs (see the SCP sketch after this list), which
    • offer central control over the maximum available permissions for all of the accounts in your organization, ensuring member accounts stay within the organization’s access control guidelines.
    • are one type of policy that helps manage the organization.
    • are available only in an organization that has all features enabled, and aren’t available if the organization has enabled only the consolidated billing features.
    • are NOT sufficient for granting access to the accounts in the organization.
    • define a guardrail for what actions accounts within the organization root or OU can do, but IAM policies need to be attached to the users and roles in the organization’s accounts to actually grant permissions to them.
    • Effective permissions are the logical intersection between what is allowed by the SCP and what is allowed by the IAM and resource-based policies.
    • with an SCP attached to member accounts, identity-based and resource-based policies grant permissions to entities only if those policies and the SCP allow the action.
    • don’t affect users or roles in the management account; they affect only the member accounts in the organization.
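
A minimal sketch of creating and attaching an SCP with boto3, assuming an organization with all features enabled; the policy name, approved regions, and OU ID are hypothetical placeholders.

```python
import json
import boto3

# Hypothetical guardrail: deny all actions outside the approved regions.
# (Production region SCPs typically also exempt global services.)
scp_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyOutsideApprovedRegions",
        "Effect": "Deny",
        "Action": "*",
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {"aws:RequestedRegion": ["us-east-1", "eu-west-1"]}
        },
    }],
}

org = boto3.client("organizations")
policy = org.create_policy(
    Name="deny-unapproved-regions",  # placeholder
    Description="Restrict member accounts to approved regions",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp_document),
)

# Attach the SCP to an OU so it applies to all member accounts under it;
# IAM policies inside those accounts still need to grant the actual access.
org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="ou-exam-11111111",  # placeholder OU id
)
```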

AWS Directory Services

  • gives applications in AWS access to Active Directory services
  • different from SAML + AD, where the access is granted to AWS services through Temporary Credentials
  • Simple AD
    • least expensive but does not support Microsoft AD advanced features
    • provides a Samba 4 Microsoft Active Directory compatible standalone directory service on AWS
    • No single point of Authentication or Authorization, as a separate copy is maintained
    • trust relationships cannot be setup between Simple AD and other Active Directory domains
    • Don’t use it if the requirement is to leverage access and control through a centralized authentication service
  • AD Connector
    • acts just as a hosted proxy service for instances in AWS to connect to the on-premises Active Directory
    • enables consistent enforcement of existing security policies, such as password expiration, password history, and account lockouts, whether users are accessing resources on-premises or in the AWS cloud
    • needs VPN connectivity (or Direct Connect)
    • integrates with existing RADIUS-based MFA solutions to enable multi-factor authentication
    • does not cache data, which might lead to latency
  • Read-only Domain Controllers (RODCs)
    • works out as a Read-only Active Directory
    • holds a copy of the Active Directory Domain Service (AD DS) database and respond to authentication requests
    • they cannot be written to and are typically deployed in locations where physical security cannot be guaranteed
    • helps maintain a single point of authentication & authorization control; however, it needs to be synced
  • Writable Domain Controllers
    • are expensive to set up
    • operate in a multi-master model; changes can be made on any writable server in the forest, and those changes are replicated to servers throughout the entire forest

AWS Single Sign-On – SSO

  • is a cloud-based single sign-on (SSO) service that makes it easy to centrally manage SSO access to all of the AWS accounts and cloud applications.
  • helps manage access and permissions to commonly used third-party software as a service (SaaS) applications, AWS SSO-integrated applications as well as custom applications that support SAML 2.0.
  • includes a user portal where the end-users can find and access all their assigned AWS accounts, cloud applications, and custom applications in one place.

Amazon Cognito

  • Amazon Cognito provides authentication, authorization, and user management for the web and mobile apps.
  • Users can sign in directly with a username and password, or through a third party such as Facebook, Amazon, Google, or Apple.
  • Cognito has two main components (see the sign-up/sign-in sketch after this list).
    • User pools are user directories that provide sign-up and sign-in options for the app users.
    • Identity pools enable you to grant the users access to other AWS services.
  • Cognito Sync helps synchronize data across a user’s devices so that their app experience remains consistent when they switch between devices or upgrade to a new device.
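
A minimal sketch of the user pool sign-up and sign-in flow with boto3; the app client ID and user details are hypothetical, and this assumes the USER_PASSWORD_AUTH flow is enabled on the app client.

```python
import boto3

idp = boto3.client("cognito-idp")

# Register a new user in the user pool (the directory component).
idp.sign_up(
    ClientId="example-app-client-id",  # placeholder
    Username="jane@example.com",
    Password="CorrectHorse42!",
)

# After the user confirms the sign-up, authenticate to obtain JWT tokens,
# which the app uses directly (or exchanges via an identity pool for
# temporary AWS credentials).
resp = idp.initiate_auth(
    ClientId="example-app-client-id",
    AuthFlow="USER_PASSWORD_AUTH",
    AuthParameters={"USERNAME": "jane@example.com", "PASSWORD": "CorrectHorse42!"},
)
print(resp["AuthenticationResult"]["IdToken"][:40], "...")
```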

AWS Security Services Cheat Sheet

AWS Identity and Security Services

Key Management Service – KMS

  • is a managed encryption service that allows the creation and control of encryption keys to enable data encryption.
  • provides a highly available key storage, management, and auditing solution to encrypt the data across AWS services & within applications.
  • uses hardware security modules (HSMs), validated under the FIPS 140-2 Cryptographic Module Validation Program, to protect the KMS keys.
  • seamlessly integrates with several AWS services to make encrypting data in those services easy.
  • supports multi-region keys, which are AWS KMS keys in different AWS Regions. Multi-Region keys are not global and each multi-region key needs to be replicated and managed independently.
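
A minimal sketch of direct KMS encryption with boto3, suitable for small payloads (the Encrypt API accepts up to 4 KB; larger data would use envelope encryption with a data key); the key alias is a hypothetical placeholder.

```python
import boto3

kms = boto3.client("kms")

# Encrypt a small secret under a (placeholder) customer managed key.
ciphertext = kms.encrypt(
    KeyId="alias/example-app-key",
    Plaintext=b"sensitive configuration value",
)["CiphertextBlob"]

# Decrypt; KMS resolves the key from metadata embedded in the ciphertext.
plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
assert plaintext == b"sensitive configuration value"
```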

CloudHSM

  • provides secure cryptographic key storage to customers by making hardware security modules (HSMs) available in the AWS cloud
  • helps manage your own encryption keys using FIPS 140-2 Level 3 validated HSMs.
  • single tenant, dedicated physical device to securely generate, store, and manage cryptographic keys used for data encryption
  • are inside the VPC (not EC2-classic) & isolated from the rest of the network
  • can use VPC peering to connect to CloudHSM from multiple VPCs
  • integrated with Amazon Redshift and Amazon RDS for Oracle
  • EBS volume encryption, S3 object encryption and key management can be done with CloudHSM but requires custom application scripting
  • is NOT fault-tolerant on its own; a cluster needs to be built, as all the keys are lost if a single HSM fails
  • enables quick scaling by adding and removing HSM capacity on-demand, with no up-front costs.
  • automatically load balances requests and securely duplicates keys stored in any HSM to all of the other HSMs in the cluster.
  • expensive; prefer AWS Key Management Service (KMS) if cost is a criterion.

AWS WAF

  • is a web application firewall that helps monitor the HTTP/HTTPS traffic and allows controlling access to the content.
  • helps protect web applications from attacks by allowing rules configuration that allow, block, or monitor (count) web requests based on defined conditions. These conditions include IP addresses, HTTP headers, HTTP body, URI strings, SQL injection and cross-site scripting.
  • helps define Web ACLs, which are combinations of Rules, where each Rule is a combination of Conditions and an Action to allow or block
  • integrates with CloudFront, Application Load Balancer (ALB), and API Gateway, services commonly used to deliver content and applications
  • supports custom origins outside of AWS, when integrated with CloudFront

AWS Secrets Manager

  • helps protect secrets needed to access applications, services, and IT resources.
  • enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle.
  • secure secrets by encrypting them with encryption keys managed using AWS KMS.
  • offers native secret rotation with built-in integration for RDS, Redshift, and DocumentDB.
  • supports Lambda functions to extend secret rotation to other types of secrets, including API keys and OAuth tokens.
  • supports IAM and resource-based policies for fine-grained access control to secrets and centralized secret rotation audit for resources in the AWS Cloud, third-party services, and on-premises.
  • enables secret replication in multiple AWS regions to support multi-region applications and disaster recovery scenarios.
  • supports private access using VPC Interface endpoints
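
A minimal sketch of fetching a database credential at runtime instead of hard-coding it; the secret name and its JSON shape are hypothetical placeholders.

```python
import json
import boto3

sm = boto3.client("secretsmanager")

# Retrieve the current version of a (placeholder) secret.
secret = sm.get_secret_value(SecretId="prod/example-app/db")

# Secrets are commonly stored as JSON key/value pairs.
creds = json.loads(secret["SecretString"])
print("connecting as", creds["username"])
```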

AWS Shield

  • is a managed service that provides protection against Distributed Denial of Service (DDoS) attacks for applications running on AWS
  • provides protection for all AWS customers against common and most frequently occurring infrastructure (layer 3 and 4) attacks like SYN/UDP floods, reflection attacks, and others to support high availability of applications on AWS.
  • AWS Shield Advanced provides additional protection against more sophisticated and larger attacks for applications running on EC2, ELB, CloudFront, AWS Global Accelerator, and Route 53.

AWS GuardDuty

  • offers threat detection that enables continuous monitoring and protects the AWS accounts and workloads.
  • is a Regional service
  • analyzes continuous streams of meta-data generated from AWS accounts and network activity found in AWS CloudTrail Events, EKS audit logs, VPC Flow Logs, and DNS Logs.
  • integrated threat intelligence
  • combines machine learning, anomaly detection, network monitoring, and malicious file discovery, utilizing both AWS-developed and industry-leading third-party sources to help protect workloads and data on AWS
  • supports suppression rules, trusted IP lists, and threat lists.
  • provides Malware Protection to detect malicious files on EBS volumes
  • operates completely independently from the resources so there is no risk of performance or availability impacts on the workloads.

Amazon Inspector

  • is a vulnerability management service that continuously scans the AWS workloads for vulnerabilities
  • automatically discovers and scans EC2 instances and container images residing in Elastic Container Registry (ECR) for software vulnerabilities and unintended network exposure.
  • creates a finding when a software vulnerability or network issue is discovered; the finding describes the vulnerability, rates its severity, identifies the affected resource, and provides remediation guidance.
  • is a Regional service.
  • requires Systems Manager (SSM) agent to be installed and enabled.

Amazon Detective

  • helps analyze, investigate, and quickly identify the root cause of potential security issues or suspicious activities.
  • automatically collects log data from the AWS resources and uses machine learning, statistical analysis, and graph theory to build a linked set of data to easily conduct faster and more efficient security investigations.
  • enables customers to view summaries and analytical data associated with CloudTrail logs, EKS audit logs, VPC Flow Logs.
  • provides detailed summaries, analysis, and visualizations of the behaviors and interactions amongst your AWS accounts, EC2 instances, AWS users, roles, and IP addresses.
  • maintains up to a year of aggregated data
  • is a Regional service and needs to be enabled on a region-by-region basis.
  • is a multi-account service that aggregates data from monitored member accounts under a single administrative account within the same region.
  • has no impact on the performance or availability of the AWS infrastructure since it retrieves the log data and findings directly from the AWS services.

AWS Security Hub

  • a cloud security posture management service that performs security best practice checks, aggregates alerts, and enables automated remediation.
  • collects security data from across AWS accounts, services, and supported third-party partner products and helps you analyze your security trends and identify the highest priority security issues.
  • is Regional but supports cross-region aggregation of findings.
  • automatically runs continuous, account-level configuration and security checks based on AWS best practices and industry standards which include CIS Foundations, PCI DSS.
  • consolidates the security findings across accounts and provider products and displays results on the Security Hub console.
  • supports integration with Amazon EventBridge. Custom actions can be defined when a finding is received.
  • has multi-account management through AWS Organizations integration, which allows delegating an administrator account for the organization.
  • works with AWS Config to perform most of its security checks for controls

AWS Macie

  • Macie is a data security service that discovers sensitive data by using machine learning and pattern matching, provides visibility into data security risks, and enables automated protection against those risks.
  • provides an inventory of the S3 buckets and automatically evaluates and monitors the buckets for security and access control.
  • automates the discovery, classification, and reporting of sensitive data.
  • generates a finding for you to review and remediate as necessary if it detects a potential issue with the security or privacy of the data, such as a bucket that becomes publicly accessible.
  • provides multi-account support using AWS Organizations to enable Macie across all of the accounts.
  • is a regional service and must be enabled on a region-by-region basis and helps view findings across all the accounts within each Region.
  • supports VPC Interface Endpoints to access Macie privately from a VPC without an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.

AWS Artifact

  • is a self-service audit artifact retrieval portal that provides customers with on-demand access to AWS’ compliance documentation and agreements
  • can use AWS Artifact Reports to download AWS security and compliance documents, such as AWS ISO certifications, Payment Card Industry (PCI), and System and Organization Control (SOC) reports.

References

AWS_Security_Products

Amazon Detective

  • Amazon Detective makes it easy to analyze, investigate, and quickly identify the root cause of potential security issues or suspicious activities.
  • automatically collects log data from the AWS resources and uses machine learning, statistical analysis, and graph theory to build a linked set of data to easily conduct faster and more efficient security investigations.
  • enables customers to view summaries and analytical data associated with CloudTrail logs, EKS audit logs, VPC Flow Logs.
  • provides detailed summaries, analysis, and visualizations of the behaviors and interactions amongst your AWS accounts, EC2 instances, AWS users, roles, and IP addresses.
  • maintains up to a year of aggregated data and makes it easily available through a set of visualizations that shows changes in the type and volume of activity over a selected time window, and links those changes to security findings.
  • is a Regional service and needs to be enabled on a region-by-region basis. This ensures all data analyzed is regionally based and doesn’t cross AWS regional boundaries.
  • requires Amazon GuardDuty to be enabled on the accounts for at least 48 hours before you enable Detective on those accounts.
  • is a multi-account service that aggregates data from monitored member accounts under a single administrative account within the same region.
  • Multi-account monitoring deployments can be configured in the same way it is configured for administrative and member accounts in Amazon GuardDuty and AWS Security Hub.
  • has no impact on the performance or availability of the AWS infrastructure since it retrieves the log data and findings directly from the AWS services.

Amazon Detective vs GuardDuty

  • Amazon GuardDuty is a threat detection service that continuously monitors malicious activity and unauthorized behavior to protect AWS accounts and workloads.
  • Amazon Detective simplifies the process of investigating security findings and identifying the root cause. It automatically creates a graph model and provides a unified, interactive view of your resources, users, and the interactions between them over time.


AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.

 

References

Amazon_Detective

AWS Certified Solutions Architect – Associate SAA-C03 Exam Learning Path


  • I just cleared the AWS Solutions Architect – Associate SAA-C03 exam with a score of 914/1000.
  • AWS Solutions Architect – Associate SAA-C03 exam is the latest AWS exam released on 30th August 2022 and has replaced the previous AWS Solutions Architect – SAA-C02 certification exam.

AWS Solutions Architect – Associate SAA-C03 Exam Content

  • It validates the ability to effectively demonstrate knowledge of how to design, architect, and deploy secure, cost-effective, and robust applications on AWS technologies
  • The exam also validates a candidate’s ability to complete the following tasks:
    • Design solutions that incorporate AWS services to meet current business requirements and future projected needs
    • Design architectures that are secure, resilient, high-performing, and cost-optimized
    • Review existing solutions and determine improvements

Refer to the AWS Solutions Architect – Associate SAA-C03 Exam Guide

AWS Solutions Architect – Associate SAA-C03 Exam Summary

  • SAA-C03 exam consists of 65 questions in 130 minutes, and the time is more than sufficient if you are well-prepared.
  • SAA-C03 exam includes two types of questions, multiple-choice and multiple-response.
  • SAA-C03 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 720.
  • Associate exams currently cost $150 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • AWS exams can be taken either in person at a test center or online; I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.

AWS Solutions Architect – Associate SAA-C03 Exam Resources

AWS Solutions Architect – Associate SAA-C03 Exam Topics

  • SAA-C03 Exam covers the design and architecture aspects in depth, so you must be able to visualize the architecture, even draw it out or prepare a mental picture, just to understand how it would work and how different services relate.
  • SAA-C03 exam concepts cover solutions that fall within AWS Well-Architected framework to cover scalable, highly available, cost-effective, performant, and resilient pillars.
  • If you had been preparing for the SAA-C02, the SAA-C03 is pretty much similar except for the addition of some new services like Aurora Serverless, AWS Global Accelerator, FSx for Windows, and FSx for Lustre.

Networking

  • Virtual Private Cloud – VPC
    • Create a VPC from scratch with public, private, and dedicated subnets with proper route tables, security groups, and NACLs.
    • Understand what a CIDR is and address patterns.
    • Subnets are public or private depending on whether they can route traffic directly through an Internet gateway
    • Understand how communication happens between the Internet, Public subnets, Private subnets, NAT, Bastion, etc.
    • Bastion (also referred to as a Jump server) can be used to securely access instances in the private subnets.
    • Create two-tier architecture with application in public and database in private subnets
    • Create three-tier architecture with web servers in public, application, and database servers in private. (hint: focus on security group configuration with least privilege)
  • Security Groups and NACLs
    • Security Groups are Stateful vs NACLs are stateless.
    • Also, only NACLs provide the ability to deny or block IPs
  • NAT Gateway or Instances
    • help enable instances in a private subnet to connect to the Internet.
    • Understand the difference between NAT Gateway & NAT Instance. 
    • NAT Gateway is AWS-managed and is scalable and highly available.
  • VPC endpoints
    • enable the creation of a private connection between the VPC and supported AWS services and VPC endpoint services powered by PrivateLink, using private IP addresses without needing an Internet or NAT Gateway.
    • VPC Gateway Endpoints support S3 and DynamoDB.
    • VPC Interface Endpoints (PrivateLink) support most other services (see the endpoint sketch after this list)
  • VPN and Direct Connect for on-premises to AWS connectivity
    • VPN provides a quick, cost-effective, secure channel, however, routes through the internet and does not provide consistent throughput
    • Direct Connect provides consistent, dedicated throughput without Internet, however, requires time to set up and is not cost-effective.
  • Understand Data Migration techniques at a high level
    • VPN and Direct Connect for continuous, frequent data transfers.
    • Snow Family is ideal for one-time, cost-effective huge data transfer.
    • Choose a technique depending on the available bandwidth, data transfer needed, time available, encryption, one-time or continuous.
  • CloudFront
    • fully managed, fast CDN service that speeds up the distribution of static, dynamic web, or streaming content to end-users
    • S3 fronted by CloudFront provides a low-latency, performant experience for global users.
    • provides static and dynamic caching for both AWS and on-premises origin.
  • Global Accelerator
    • optimizes the path to applications to keep packet loss, jitter, and latency consistently low.
    • helps improve the performance by lowering first-byte latency
    • provides 2 static anycast IP addresses
  • Know CloudFront vs Global Accelerator
  • Route 53
    • highly available and scalable DNS web service.
    • Health checks and failover routing helps provide resilient and active-passive solutions
    • Route 53 Routing Policies and their use cases (hint: focus on weighted, latency, geolocation, failover routing)
  • Elastic Load Balancer
    • Focus on ALB and NLB
    • Differences between ALB vs NLB
      • ALB is layer 7 vs NLB is layer 4
      • ALB provides content-based, host-based, path-based routing
      • ALB provides dynamic port mapping, which allows multiple instances of the same task to be hosted on the same ECS node
      • NLB provides low latency, the ability to scale rapidly, and a static IP address
      • ALB works with WAF while NLB does not.
    • Gateway Load Balancer – GWLB
      • helps deploy, scale, and manage virtual appliances like firewalls, IDS/IPS, and deep packet inspection systems.
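
To make the VPC endpoint idea concrete, here is a minimal sketch of creating a Gateway endpoint for S3 with boto3, so instances in private subnets reach S3 without an Internet or NAT Gateway; the region, VPC ID, and route table ID are hypothetical placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",            # placeholder
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # S3 routes get added here
)
print(resp["VpcEndpoint"]["VpcEndpointId"])
```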

Security

  • Identity Access Management – IAM
    • IAM role
      • provides permissions that are not associated with a particular user, group, or service and are intended to be assumable by anyone who needs it.
      • can be used for EC2 application access and Cross-account access
    • IAM identity providers and federation and their use cases – although I did not see much of this in SAA-C03
  • Key Management Services – KMS encryption service
  • AWS WAF
    • integrates with CloudFront and ALB to provide protection against Cross-site scripting (XSS) and SQL injection attacks.
    • provides IP blocking and geo-protection, rate limiting, etc.
  • AWS Shield
    • managed DDoS protection service
    • integrates with CloudFront, ALB, and Route 53
    • Advanced provides additional detection and mitigation against large and sophisticated DDoS attacks, near real-time visibility into attacks
  • AWS GuardDuty
    • managed threat detection service and provides Malware protection
  • AWS Inspector
    • is a vulnerability management service that continuously scans the AWS workloads for vulnerabilities
  • AWS Secrets Manager
    • helps protect secrets needed to access applications, services, and IT resources.
    • supports rotation of secrets, which Systems Manager Parameter Store does not support.
  • Disaster Recovery whitepaper
    • Be sure you know the different recovery types with impact on RTO/RPO.

Storage

  • Understand various storage options S3, EBS, Instance store, EFS, Glacier, FSx, and what are the use cases and anti-patterns for each
  • Instance Store
    • is physically attached to the EC2 instance and provides the lowest latency and highest IOPS
  • Elastic Block Storage – EBS
    • EBS volume types and their use cases in terms of IOPS and throughput. SSD for IOPS and HDD for throughput
    • EBS Snapshots
      • Backups are automated, snapshots are manual
      • Can be used to encrypt an unencrypted EBS volume
    • Multi-Attach EBS feature allows attaching an EBS volume to multiple instances within the same AZ only.
    • EBS fast snapshot restore feature helps ensure that the EBS volumes created from a snapshot are fully-initialized at creation and instantly deliver all of their provisioned performance.
  • Simple Storage Service – S3
    • S3 storage classes with lifecycle policies
      • Understand the difference between S3 Standard vs S3 Standard-IA vs S3 One Zone-IA in terms of cost and durability
    • S3 Data Protection
      • S3 Client-side encryption encrypts data before storing it in S3
    • S3 features including
      • S3 provides cost-effective static website hosting. However, it does not support HTTPS; it can be integrated with CloudFront for HTTPS, caching, performance, and low-latency access.
      • S3 versioning provides protection against accidental overwrites and deletions. Used with MFA Delete feature.
      • S3 Pre-Signed URLs for both upload and download provide access without needing AWS credentials (see the pre-signed URL sketch after this list).
      • S3 CORS allows cross-domain calls
      • S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket.
      • S3 Event Notifications to trigger events on various S3 events like objects added or deleted. Supports SQS, SNS, and Lambda functions.
      • Integrates with Amazon Macie to detect PII data
      • Replication supports same-region and cross-region replication and requires versioning to be enabled.
      • Integrates with Athena to analyze data in S3 using standard SQL.
  • Glacier
    • as archival storage with various retrieval patterns
    • Glacier Instant Retrieval allows retrieval in milliseconds. 
    • Glacier Expedited retrieval allows object retrieval within minutes.
  • Storage gateway and its different types.
    • Cached Volume Gateway provides access to frequently accessed data while using AWS as the actual storage
    • Stored Volume gateway uses AWS as a backup, while the data is being stored on-premises as well
    • File Gateway supports NFS and SMB protocols
  • FSx is easy and cost-effective to launch and run popular file systems.
    • FSx provides two file systems to choose from:
    • Amazon FSx for Windows File Server
      • works with both Linux and Windows
      • provides Windows File System features including integration with Active Directory.
    • Amazon FSx for Lustre
      • for high-performance workloads
      • works with only Linux
  • Elastic File System – EFS
    • simple, fully managed, scalable, serverless, and cost-optimized file storage for use with AWS Cloud and on-premises resources.
    • provides shared volume across multiple EC2 instances, while EBS can be attached to a single instance within the same AZ or EBS Multi-Attach can be attached to multiple instances within the same AZ
    • supports the NFS protocol, and is compatible with Linux-based AMIs
    • supports cross-region replication, storage classes for cost.
  • AWS Transfer Family
    • secure transfer service that helps transfer files into and out of AWS storage services using the SFTP, FTPS, and FTP protocols.
  • Difference between EBS vs S3 vs EFS
  • Difference between EBS vs Instance Store
  • Would recommend referring to the Storage Options whitepaper; although a bit dated, 90% of it still holds true
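
As referenced in the S3 Pre-Signed URLs point above, a minimal sketch of generating a time-limited download URL with boto3, so a client can fetch a private object without AWS credentials; the bucket and key are hypothetical placeholders.

```python
import boto3

s3 = boto3.client("s3")

url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-private-bucket", "Key": "reports/2024-q1.pdf"},
    ExpiresIn=3600,  # URL stays valid for one hour
)
print(url)  # hand this to the client; no AWS credentials needed to use it
```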

Compute

  • Elastic Compute Cloud – EC2
  • Auto Scaling and ELB
    • Auto Scaling provides the ability to ensure a correct number of EC2 instances are always running to handle the load of the application
    • Elastic Load Balancer allows the incoming traffic to be distributed automatically across multiple healthy EC2 instances
  • Autoscaling & ELB
    • work together to provide High Availability and Scalability.
    • Span both ELB and Auto Scaling across Multi-AZs to provide High Availability
    • Do not span across regions. Use Route 53 or Global Accelerator to route traffic across regions.
  • EC2 Instance Purchase Types – Reserved, Scheduled Reserved, On-demand, and Spot and their use cases
    • Reserved instances provide cost benefits for long terms requirements over On-demand instances for continuous persistent load
    • Scheduled Reserved Instances for loads with a fixed schedule and time interval
    • Spot instances provide cost benefits for temporary, fault-tolerant, spiky load
  • EC2 Placement Groups
    • Cluster placement groups provide low latency and high throughput communication
    • Spread placement group provides high availability
  • Lambda and serverless architecture, its features, and use cases.
    • Lambda integrated with API Gateway to provide a serverless, highly scalable, cost-effective architecture
  • Elastic Container Service – ECS with its ability to deploy containers and microservices architecture.
    • ECS role for tasks can be provided through taskRoleArn
    • ALB provides dynamic port mapping to allow multiple same tasks on the same node.
  • Elastic Kubernetes Service – EKS
    • managed Kubernetes service to run Kubernetes in the AWS cloud and on-premises data centers
    • ideal for migration of an existing workload on Kubernetes
  • Elastic Beanstalk at a high level, what it provides, and its ability to get an application running quickly.

Databases

  • Understand relational and NoSQL data storage options which include RDS, DynamoDB, and Aurora with their use cases
  • Relational Database Service – RDS
    • Read Replicas vs Multi-AZ
      • Read Replicas for scalability, Multi-AZ for High Availability
      • Multi-AZ deployments are within a single region only
      • Read Replicas can span across regions and can be used for disaster recovery
    • Understand Automated Backups, underlying volume types (which are the same as EBS volume types)
  • Aurora
    • provides multiple read replicas and replicates 6 copies of data across 3 AZs
    • Aurora Serverless
      • provides a highly scalable cost-effective database solution
      • automatically starts up, shuts down, and scales capacity up or down based on the application’s needs.
      • supports only MySQL and PostgreSQL
  • DynamoDB
    • provides low latency performance, a key-value store
    • is not a relational database
    • DynamoDB DAX provides caching for DynamoDB
    • DynamoDB TTL helps expire data in DynamoDB without any cost or consuming any write throughput (see the TTL sketch after this list).
  • ElastiCache use cases, mainly for caching performance
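
As referenced in the DynamoDB TTL point above, a minimal sketch with boto3: enable TTL on a (hypothetical) table and write an item carrying an epoch-seconds expiry attribute; DynamoDB then deletes expired items in the background without consuming write throughput.

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# One-time setup: tell DynamoDB which attribute holds the expiry timestamp.
dynamodb.update_time_to_live(
    TableName="example-sessions",  # placeholder table
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# Write an item that expires roughly 24 hours from now.
dynamodb.put_item(
    TableName="example-sessions",
    Item={
        "session_id": {"S": "abc123"},
        "expires_at": {"N": str(int(time.time()) + 24 * 3600)},
    },
)
```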

Integration Tools

  • Simple Queue Service – SQS
    • as a message queuing service, with SNS as a pub/sub notification service
    • acts as a decoupling service and provides resiliency
    • SQS features like visibility, and long poll vs short poll
    • provides scaling for the Auto Scaling group based on the SQS queue size (see the backlog-per-instance sketch after this list).
    • SQS Standard vs SQS FIFO difference
      • FIFO provides ordering and exactly-once processing but with limited throughput
  • Simple Notification Service – SNS
    • is a web service that coordinates and manages the delivery or sending of messages to subscribing endpoints or clients
    • Fanout pattern can be used to push messages to multiple subscribers
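
As referenced in the SQS scaling point above, a minimal sketch of the backlog-per-instance pattern: read the queue depth, divide by the number of workers in the Auto Scaling group, and publish it as a custom CloudWatch metric that a target tracking policy can scale on. The queue URL, group name, and namespace are hypothetical placeholders.

```python
import boto3

sqs = boto3.client("sqs")
autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

queue_url = "https://sqs.us-east-1.amazonaws.com/111122223333/example-queue"
backlog = int(
    sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["ApproximateNumberOfMessages"]
    )["Attributes"]["ApproximateNumberOfMessages"]
)

group = autoscaling.describe_auto_scaling_groups(
    AutoScalingGroupNames=["example-workers"]
)["AutoScalingGroups"][0]
instances = max(len(group["Instances"]), 1)

# A target tracking policy on this custom metric scales the group toward a
# desired messages-per-worker ratio.
cloudwatch.put_metric_data(
    Namespace="Example/Queue",
    MetricData=[{
        "MetricName": "BacklogPerInstance",
        "Value": backlog / instances,
        "Unit": "Count",
    }],
)
```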

Analytics

  • Redshift as a petabyte-scale data warehouse and business intelligence tool
  • Kinesis
    • for real-time data capture and analytics.
    • Integrates with Lambda functions to perform transformations
  • AWS Glue
    • fully-managed, ETL service that automates the time-consuming steps of data preparation for analytics

Management Tools

  • CloudWatch
    • monitoring to provide operational transparency
    • is extendable with custom metrics
    • CloudWatch Logs -> (Subscription filter) -> Kinesis Data Firehose -> S3 is a common pattern for exporting logs (see the sketch after this list)
  • CloudTrail
    • helps enable governance, compliance, and operational and risk auditing of the AWS account.
    • helps to get a history of AWS API calls and related events for the AWS account.
  • CloudFormation
    • easy way to create and manage a collection of related AWS resources, and provision and update them in an orderly and predictable fashion.
  • AWS Config
    • fully managed service that provides AWS resource inventory, configuration history, and configuration change notifications to enable security, compliance, and governance.
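
As referenced in the CloudWatch point above, a minimal sketch of wiring the Logs -> Firehose -> S3 export pattern with boto3; the log group, delivery stream ARN, and IAM role ARN are hypothetical placeholders (the role must allow CloudWatch Logs to put records to Firehose).

```python
import boto3

logs = boto3.client("logs")

logs.put_subscription_filter(
    logGroupName="/example/app",
    filterName="export-to-s3",
    filterPattern="",  # empty pattern forwards every log event
    destinationArn="arn:aws:firehose:us-east-1:111122223333:deliverystream/example",
    roleArn="arn:aws:iam::111122223333:role/CWLtoFirehoseRole",
)
```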

AWS Whitepapers & Cheatsheets

On the Exam Day

  • Make sure you are relaxed and get a good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the exam if you are late by more than 30 minutes.
    • Make sure you have your desk clear, no hand-watches, or external monitors, keep your phones away, and nobody can enter the room.

Finally, All the Best 🙂

AWS EC2 Monitoring

EC2 Monitoring

Status Checks

  • Status monitoring helps quickly determine whether EC2 has detected any problems that might prevent instances from running applications.
  • EC2 performs automated checks on every running EC2 instance to identify hardware and software issues.
  • Status checks are performed every minute and each returns a pass or a fail status.
  • If all checks pass, the overall status of the instance is OK.
  • If one or more checks fail, the overall status is Impaired.
  • Status checks are built into EC2, so they cannot be disabled or deleted.
  • Status checks data augments the information that EC2 already provides about the intended state of each instance (such as pending, running, and stopping) as well as the utilization metrics that CloudWatch monitors (CPU utilization, network traffic, and disk activity).
  • Alarms can be created or deleted that are triggered based on the result of the status checks, e.g. an alarm can be created to warn if status checks fail on a specific instance (see the status check sketch after this list).
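
A minimal sketch of reading both status checks with boto3; the instance ID is a hypothetical placeholder.

```python
import boto3

ec2 = boto3.client("ec2")

statuses = ec2.describe_instance_status(
    InstanceIds=["i-0123456789abcdef0"],  # placeholder
    IncludeAllInstances=True,  # also report instances that are not running
)
for s in statuses["InstanceStatuses"]:
    print(
        s["InstanceId"],
        "system:", s["SystemStatus"]["Status"],      # AWS-side checks
        "instance:", s["InstanceStatus"]["Status"],  # guest/network checks
    )
```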

System Status Checks

  • monitor the AWS systems, required to use the instance, to ensure they are working properly.
  • detect problems with the instance that require AWS involvement to repair.
  • System status check failures might be due to
    • Loss of network connectivity
    • Loss of system power
    • Software issues on the physical host
    • Hardware issues on the physical host
  • When a system status check fails, one can either
    • check Personal Health Dashboard for any scheduled critical maintenance by AWS to the instance’s host.
    • wait for AWS to fix the issue
    • or resolve it by stopping and restarting or terminating and replacing an instance

Instance Status Checks

  • monitor the software and network configuration of the individual instance
  • checks detect problems that require your involvement to repair.
  • Instance status checks failure might be due to
    • Failed system status checks
    • Misconfigured networking or startup configuration
    • Exhausted memory
    • Corrupted file system
    • Incompatible kernel
  • When an instance status check fails, it can be resolved by either rebooting the instance or by making modifications to the operating system

CloudWatch Monitoring

  • CloudWatch helps monitor EC2 instances; it collects and processes raw data from EC2 into readable, near real-time metrics.
  • Statistics are recorded for a period of two weeks so that historical information can be accessed and used to gain a better perspective on how the application or service is performing.
  • By default, Basic monitoring is enabled and EC2 metric data is sent to CloudWatch in 5-minute periods automatically
  • Detailed monitoring can be enabled on the EC2 instance, which sends data to CloudWatch in 1-minute periods.
  • Aggregating Statistics Across Instances/ASG/AMI ID
    • Aggregate statistics are available for the instances that have detailed monitoring (at an additional charge) enabled, which provides data in 1-minute periods
    • Instances that use basic monitoring are not included in the aggregates.
    • CloudWatch does not aggregate data across Regions. Therefore, metrics are completely separate between regions.
    • CloudWatch returns statistics for all dimensions in the AWS/EC2 namespace if no dimension is specified
    • The technique for retrieving all dimensions across an AWS namespace does not work for custom namespaces published to CloudWatch.
    • Statistics include Sum, Average, Minimum, Maximum, Data Samples
    • With custom namespaces, the complete set of dimensions associated with a data point must be specified to retrieve statistics that include that data point
  • CloudWatch alarms
    • can be created to monitor any one of the EC2 instance’s metrics.
    • can be configured to automatically send you a notification when the metric reaches a specified threshold.
    • can automatically stop, terminate, reboot, or recover EC2 instances
    • can automatically recover an EC2 instance when the instance becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair (see the recovery alarm sketch after this list)
    • can automatically stop or terminate the instances to save costs (EC2 instances that use an EBS volume as the root device can be stopped or terminated, whereas instances that use the instance store as the root device can only be terminated)
    • can use EC2ActionsAccess IAM role, which enables AWS to perform stop, terminate, or reboot actions on EC2 instances
    • If you have read/write permissions for CloudWatch but not for EC2, alarms can still be created but the stop or terminate actions won’t be performed on the EC2 instance
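
As referenced in the recover-action point above, a minimal sketch of an alarm that automatically recovers an instance when the system status check fails; the region and instance ID are hypothetical placeholders.

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="recover-example-instance",
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed_System",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=2,  # two consecutive failed minutes
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    # The EC2 recover action migrates the instance to healthy host hardware.
    AlarmActions=["arn:aws:automate:us-east-1:ec2:recover"],
)
```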

EC2 Monitoring Metrics

  • CPUCreditUsage
    • (Only valid for T2 instances) The number of CPU credits consumed during the specified period.
    • This metric identifies the amount of time during which physical CPUs were used for processing instructions by virtual CPUs allocated to the instance.
    • CPU Credit metrics are available at a 5-minute frequency.
  • CPUCreditBalance
    • (Only valid for T2 instances) The number of CPU credits that an instance has accumulated.
    • This metric is used to determine how long an instance can burst beyond its baseline performance level at a given rate.
    • CPU Credit metrics are available at a 5-minute frequency.
  • CPUUtilization
    • % of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance.
  • DiskReadOps
    • Completed read operations from all instance store volumes available to the instance in a specified period of time.
  • DiskWriteOps
    • Completed write operations to all instance store volumes available to the instance in a specified period of time.
  • DiskReadBytes
    • Bytes read from all instance store volumes available to the instance.
    • This metric is used to determine the volume of the data the application reads from the hard disk of the instance.
    • This can be used to determine the speed of the application.
  • DiskWriteBytes
    • Bytes written to all instance store volumes available to the instance.
    • This metric is used to determine the volume of the data the application writes onto the hard disk of the instance.
    • This can be used to determine the speed of the application.
  • NetworkIn
    • The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to an application on a single instance.
  • NetworkOut
    • The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic to an application on a single instance.
  • NetworkPacketsIn
    • The number of packets received on all network interfaces by the instance. This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance.
    • This metric is available for basic monitoring only
  • NetworkPacketsOut
    • The number of packets sent out on all network interfaces by the instance. This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance.
    • This metric is available for basic monitoring only.
  • StatusCheckFailed
    • Reports whether either of the status checks, StatusCheckFailed_Instance or StatusCheckFailed_System, has failed.
    • Values for this metric are either 0 (zero) or 1 (one). A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at a 1-minute frequency
  • StatusCheckFailed_Instance
    • Reports whether the instance has passed the EC2 instance status check in the last minute.
    • Values for this metric are either 0 (zero) or 1 (one). A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at a 1-minute frequency
  • StatusCheckFailed_System
    • Reports whether the instance has passed the EC2 system status check in the last minute.
    • Values for this metric are either 0 (zero) or 1 (one). A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at a 1-minute frequency
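
A minimal sketch of pulling one of the metrics above, average CPUUtilization for the last hour at the 5-minute basic monitoring granularity; the instance ID is a hypothetical placeholder.

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,  # 5-minute periods match basic monitoring
    Statistics=["Average"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 2), "%")
```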

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. In the basic monitoring package for EC2, Amazon CloudWatch provides the following metrics:
    1. Web server visible metrics such as number of failed transaction requests
    2. Operating system visible metrics such as memory utilization
    3. Database visible metrics such as number of connections
    4. Hypervisor visible metrics such as CPU utilization
  2. Which of the following requires a custom CloudWatch metric to monitor?
    1. Memory Utilization of an EC2 instance
    2. CPU Utilization of an EC2 instance
    3. Disk usage activity of an EC2 instance
    4. Data transfer of an EC2 instance
  3. A user has configured CloudWatch monitoring on an EBS backed EC2 instance. If the user has not attached any additional device, which of the below mentioned metrics will always show a 0 value?
    1. DiskReadBytes
    2. NetworkIn
    3. NetworkOut
    4. CPUUtilization
  4. A user is running a batch process on EBS backed EC2 instances. The batch process starts a few instances to process Hadoop Map reduce jobs, which can run between 50 – 600 minutes or sometimes for more time. The user wants to configure that the instance gets terminated only when the process is completed. How can the user configure this with CloudWatch?
    1. Setup the CloudWatch action to terminate the instance when the CPU utilization is less than 5%
    2. Setup the CloudWatch with Auto Scaling to terminate all the instances
    3. Setup a job which terminates all instances after 600 minutes
    4. It is not possible to terminate instances automatically
  5. An AWS account owner has setup multiple IAM users. One IAM user only has CloudWatch access. He has setup the alarm action, which stops the EC2 instances when the CPU utilization is below the threshold limit. What will happen in this case?
    1. It is not possible to stop the instance using the CloudWatch alarm
    2. CloudWatch will stop the instance when the action is executed
    3. The user cannot set an alarm on EC2 since he does not have the permission
    4. The user can setup the action but it will not be executed if the user does not have EC2 rights
  6. A user has launched 10 instances from the same AMI ID using Auto Scaling. The user is trying to see the average CPU utilization across all instances of the last 2 weeks under the CloudWatch console. How can the user achieve this?
    1. View the Auto Scaling CPU metrics (Refer AS Instance Monitoring)
    2. Aggregate the data over the instance AMI ID (Works but needs detailed monitoring enabled)
    3. The user has to use the CloudWatch analyser to find the average data across instances
    4. It is not possible to see the average CPU utilization of the same AMI ID since the instance ID is different

References

EC2_Monitoring