AWS EC2 Instance Types

EC2 Instance Types

  • EC2 Instance types determine the hardware of the host computer used for the instance.
  • EC2 Instance types offer different compute, memory & storage capabilities and are grouped in instance families based on these capabilities.
  • EC2 provides each instance with a consistent and predictable amount of CPU capacity, regardless of its underlying hardware.
  • EC2 dedicates some resources of the host computer, such as CPU, memory, and instance storage, to a particular instance.
  • EC2 shares other resources of the host computer, such as the network and the disk subsystem, among instances. If each instance on a host computer tries to use as much of one of these shared resources as possible, each receives an equal share of that resource. However, when a resource is under-utilized, an instance can consume a higher share of that resource while it’s available.

EC2 Instance Types Selection criteria

  • Some instance types support only the HVM virtualization type, while others support both PV and HVM. AWS recommends using HVM to take full advantage of the underlying hardware.
  • All EC2 instance types are available in a VPC; however, a few are not available on the EC2-Classic platform. AWS recommends using a VPC to take advantage of enhanced networking, multiple IP addresses, finer security control, etc.
  • Some instances support only EBS volumes, while others support both EBS and Instance store volumes. Some instances that support instance store volumes use solid-state drives (SSD) to deliver very high random I/O performance.
  • Some EC2 instance types can be launched as EBS optimized instances with a dedicated capacity for EBS I/O.
  • Some EC2 instance types can be launched in a placement group to optimize instances for High-Performance Computing (HPC)
  • Some instances support Enhanced Networking, to get significantly higher packets per second (PPS) performance, lower network jitter, and lower latencies
  • Some instance types allow attached EBS volumes to be encrypted. Most of these capabilities can be checked programmatically, as in the sketch below.
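A minimal boto3 (Python) sketch for checking these capabilities before choosing an instance type, assuming default credentials and region are configured; the instance type names are just examples:

```python
import boto3

ec2 = boto3.client("ec2")

# DescribeInstanceTypes returns the capability matrix for each instance type
resp = ec2.describe_instance_types(InstanceTypes=["t3.micro", "c5.large"])

for it in resp["InstanceTypes"]:
    print(it["InstanceType"])
    print("  virtualization :", it["SupportedVirtualizationTypes"])    # e.g. ['hvm']
    print("  EBS optimized  :", it["EbsInfo"]["EbsOptimizedSupport"])  # default / supported / unsupported
    print("  instance store :", it.get("InstanceStorageSupported"))
    print("  ENA support    :", it["NetworkInfo"]["EnaSupport"])       # enhanced networking
```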

EBS-Optimized

  • EBS-optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for EBS I/O.
  • EBS-optimized instances enable you to get consistently high performance for the EBS volumes by eliminating contention between EBS I/O and other network traffic from the instance.
  • EBS-optimized instances deliver dedicated throughput between Amazon EC2 and EBS, with options between 500 and 60,000 Megabits per second (Mbps) depending on the instance type used.
  • When attached to an EBS–optimized instance, General Purpose (SSD) volumes are designed to deliver within 10 percent of their baseline and burst performance 99.9 percent of the time in a given year, and Provisioned IOPS (SSD) volumes are designed to deliver within 10 percent of their provisioned performance 99.9 percent of the time in a given year.
  • EBS optimization can also be enabled for instance types that are not EBS-optimized by default (see the sketch below)
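A minimal sketch of launching an instance with EBS optimization explicitly enabled, using boto3; the AMI ID and instance type below are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# EbsOptimized is ignored (always on) for instance types that are EBS-optimized by default
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="m4.large",           # a type where EBS optimization is optional
    MinCount=1,
    MaxCount=1,
    EbsOptimized=True,
)
print(resp["Instances"][0]["InstanceId"])
```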

Placement Groups

  • EC2 Placement groups determine how the instances are placed on the underlying hardware.
  • AWS now provides three types of placement groups
    • Cluster – clusters instances into a low-latency group in a single AZ
    • Partition – spreads instances across logical partitions, ensuring that instances in one partition do not share underlying hardware with instances in other partitions
    • Spread – strictly places a small group of instances across distinct underlying hardware to reduce correlated failures (see the sketch below for creating and using a placement group)
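A minimal sketch of creating a placement group and launching instances into it with boto3; the group name, AMI ID, and instance type are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Strategy can be 'cluster', 'spread', or 'partition'
ec2.create_placement_group(GroupName="hpc-cluster-pg", Strategy="cluster")
# For 'partition', a PartitionCount can also be specified

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="c5n.18xlarge",
    MinCount=2,
    MaxCount=2,
    Placement={"GroupName": "hpc-cluster-pg"},
)
```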

NOTE – AWS keeps releasing new instance types; refer to the AWS documentation for the latest list.

EC2 Instance Types – Current Generation

EC2 Instance Types Comparison

T2 Instances (General Purpose)

  • T2 instances are designed to provide moderate baseline performance and the capability to burst to significantly higher performance as required
  • Mainly intended for workloads that don’t use the full CPU often or consistently, but occasionally need to burst.
  • T2 instances are well suited for
    • general-purpose workloads, such as web servers, developer environments, remote desktops, and small databases
  • Requirements
    • can be launched only with HVM AMI
    • can be launched into a  VPC only, and not supported on the EC2-Classic platform
    • are available as EBS-backed instances only
    • are available as On-Demand, Reserved, Dedicated (T3 only), and Spot Instances
    • By default, 20 (soft limit) T2 instances can run simultaneously
    • cannot be launched on a Dedicated Host
  • T2 Unlimited Instances
    • can sustain high CPU performance for as long as a workload needs it.
    • for most general-purpose workloads, it provides ample performance without any additional charges.
    • If the instance needs to run at higher CPU utilization for a prolonged period, it can also do so at a flat additional rate

CPU Credits

  • CPU Credits (similar to I/O Credits for EBS General Purpose storage) provide the performance of a full CPU core for one minute
  • T2 instances provide a baseline level of CPU performance, while CPU credits govern the ability to burst above that baseline
  • One CPU credit is equal to one vCPU running at 100% utilization for one minute, e.g. one vCPU at 100% for 1 minute, OR one vCPU at 50% for 2 minutes, OR two vCPUs at 25% for 2 minutes
  • Each T2 instance receives a healthy initial credit balance for startup performance
  • Initial CPU credits do not expire, but they are used first when an instance uses CPU credits.
  • Each T2 instance then continuously (at a millisecond-level resolution) receives a set rate of CPU credits per hour, depending on instance size for e.g. t2.nano earns 3/hour while a t2.large earns 36/hour
  • Each T2 instance accumulates the CPU credit when it uses fewer CPU resources than its allowed baseline performance levels
  • Maximum earned credit balance for an instance is equal to the number of CPU credits received per hour times 24 hours for e.g. t2.nano can earn max 72 (24 * 3) credits
  • Earned CPU credits expire 24 hours after they are earned; expired credits are removed from the balance before new ones are added
  • CPU credits do not persist across an instance stop/start; however, after the start, the instance receives its initial CPU credits again
  • When the credit balance is completely exhausted, the instance performs at its baseline level (the credit balance can be monitored via CloudWatch, as sketched below)
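A minimal sketch of watching the credit balance through CloudWatch with boto3; the instance ID is a placeholder:

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")
instance_id = "i-0123456789abcdef0"    # placeholder instance ID

now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUCreditBalance",     # CPUCreditUsage is published as well
    Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
    StartTime=now - timedelta(hours=6),
    EndTime=now,
    Period=300,
    Statistics=["Average"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))
```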

C4 Instances (Compute Intensive)

  • C4 instances are ideal for compute-bound applications that benefit from high-performance processors
  • Well suited for
    • Batch processing workloads,
    • Media transcoding,
    • High-traffic web servers, massively multiplayer online (MMO) gaming servers, and ad serving engines
  • Features
    • are EBS-optimized, by default
    • can be enabled for Enhanced Networking capabilities
    • can be clustered in a placement group
  • Requirements
    • requires 64-bit HVM AMI
    • can be launched into a  VPC only, and not supported on the EC2-Classic platform

G2 Instances (Graphic Intensive)

  • GPU instances provide  high parallel processing capability
  • Well suited for
    • to accelerate many scientific, engineering, and rendering applications by leveraging the Compute Unified Device Architecture (CUDA) or OpenCL parallel computing frameworks
    • graphics applications, including game streaming, 3-D application streaming, and other graphics workloads
  • Requirements
    • requires HVM AMI
    • can’t access the GPU unless NVIDIA drivers are installed
  • Features
    • can be clustered in a placement group

I2 Instances (I/O Intensive)

  • I2 instances are optimized to deliver tens of thousands of low-latency, random I/O operations per second (IOPS) to applications.
  • Well suited for applications
    • NoSQL databases (for example, Cassandra and MongoDB)
    • Clustered databases
    • Online transaction processing (OLTP) systems
  • Features
    • Primary data storage is SSD-based instance storage.
    • can be enabled for Enhanced Networking capabilities
    • can be clustered in a placement group
    • can enable EBS–optimization to obtain additional, dedicated capacity for Amazon EBS I/O
  • Requirements
    • requires HVM AMI
  • HI1 is the equivalent previous generation instance
    • supports both PV and HVM AMIs

D2 Instances (Density Intensive)

  • D2 instances are designed for workloads with very high storage density and that require high sequential read and write access to very large data sets on local storage.
  • Well suited for applications
    • Massively parallel processing (MPP) data warehousing
    • MapReduce and Hadoop distributed computing
    • Log or data processing applications
  • Features
    • Primary data storage for D2 instances is HDD-based instance storage
    • are EBS-optimized, by default
    • can be enabled for Enhanced Networking capabilities
    • can be clustered in a placement group
  • Requirements
    • requires 64-bit HVM AMI
  • HS1 is the equivalent previous generation instance
    • supports both EBS and Instance store backed AMIs
    • supports both PV and HVM AMIs

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which of the following instance types are available as Amazon EBS-backed only? Choose 2 answers
    1. General purpose T2
    2. General purpose M3
    3. Compute-optimized C4
    4. Compute-optimized C3
    5. Storage-optimized I2
  2. A t2.medium EC2 instance type must be launched with what type of Amazon Machine Image (AMI)?
    1. An Instance store Hardware Virtual Machine AMI
    2. An Instance store Paravirtual AMI
    3. An Amazon EBS-backed Hardware Virtual Machine AMI
    4. An Amazon EBS-backed Paravirtual AMI
  3. You have identified network throughput as a bottleneck on your m1.small EC2 instance when uploading data into Amazon S3 in the same region. How do you remedy this situation?
    1. Add an additional ENI
    2. Change to a larger Instance
    3. Use DirectConnect between EC2 and S3
    4. Use EBS PIOPS on the local volume
  4. You are using an m1.small EC2 Instance with one 300 GB EBS volume to host a relational database. You determined that write throughput to the database needs to be increased. Which of the following approaches can help achieve this? Choose 2 answers
    1. Use an array of EBS volumes (Striping to increase throughput)
    2. Enable Multi-AZ mode.
    3. Place the instance in an Auto Scaling Groups
    4. Add an EBS volume and place into RAID 5 (RAID 5 is not recommended as it provides parity and EBS volumes are already replicated across multiple servers in an Availability Zone for availability and durability, so AWS recommends striping for performance rather than durability)
    5. Increase the size of the EC2 Instance.
    6. Put the database behind an Elastic Load Balancer.
  5. You are tasked with setting up a cluster of EC2 instances for a NoSQL database. The database requires random read I/O disk performance of up to 100,000 IOPS at 4 KB block size per node. Which of the following EC2 instances will perform the best for this workload?
    1. A High-Memory Quadruple Extra Large (m2.4xlarge) with EBS-Optimized set to true and a PIOPs EBS volume
    2. A Cluster Compute Eight Extra Large (cc2.8xlarge) using instance storage
    3. High I/O Quadruple Extra Large (hi1.4xlarge) using instance storage
    4. A Cluster GPU Quadruple Extra Large (cg1.4xlarge) using four separate 4000 PIOPS EBS volumes in a RAID 0 configuration
  6. You are implementing a URL whitelisting system for a company that wants to restrict outbound HTTPS connections to specific domains from their EC2-hosted applications. You deploy a single EC2 instance running proxy software and configure it to accept traffic from all subnets and EC2 instances in the VPC. You configure the proxy to only pass through traffic to domains that you define in its whitelist configuration. You have a nightly maintenance window of 10 minutes where all instances fetch new software updates. Each update is about 200MB in size and there are 500 instances in the VPC that routinely fetch updates. After a few days you notice that some machines are failing to successfully download some, but not all, of their updates within the maintenance window. The download URLs used for these updates are correctly listed in the proxy’s whitelist configuration and you are able to access them manually using a web browser on the instances. What might be happening? (Choose 2 answers) [PROFESSIONAL]
    1. You are running the proxy on an undersized EC2 instance type so network throughput is not sufficient for all instances to download their updates in time.
    2. You have not allocated enough storage to the EC2 instance running the proxy, so the network buffer is filling up, causing some requests to fail
    3. You are running the proxy in a public subnet but have not allocated enough EIPs to support the needed network throughput through the Internet Gateway (IGW)
    4. You are running the proxy on a sufficiently-sized EC2 instance in a private subnet and its network throughput is being throttled by a NAT running on an undersized EC2 instance
    5. The route table for the subnets containing the affected EC2 instances is not configured to direct network traffic for the software update locations to the proxy.
  7. You have been asked to design the storage layer for an application. The application requires disk performance of at least 100,000 IOPS in addition; the storage layer must be able to survive the loss of an individual disk, EC2 instance, or Availability Zone without any data loss. The volume you provide must have a capacity of at least 3TB. Which of the following designs will meet these objectives? [PROFESSIONAL]
    1. Instantiate an i2.8xlarge instance in us-east-1a. Create a RAID 0 volume using the four 800GB SSD ephemeral disks provided with the instance. Provision 3×1 TB EBS volumes attach them to the instance and configure them as a second RAID 0 volume. Configure synchronous, block-level replication from the ephemeral backed volume to the EBS-backed volume. (Same AZ will not survive the AZ loss)
    2. Instantiate an i2.8xlarge instance in us-east-1a. Create a RAID 0 volume using the four 800GB SSD ephemeral disks provided with the instance. Configure synchronous block-level replication to an identically configured instance in us-east-1b.
    3. Instantiate a c3.8xlarge Instance in us-east-1. Provision an AWS Storage Gateway and configure it for 3 TB of storage and 100,000 IOPS. Attach the volume to the instance. (Need synchronous replication to prevent any data loss)
    4. Instantiate a c3.8xlarge instance in us-east-1 provision 4x1TB EBS volumes, attach them to the instance, and configure them as a single RAID 5 volume Ensure that EBS snapshots are performed every 15 minutes. (RAID 5 not recommended by AWS and Need synchronous replication to prevent any data loss)
    5. Instantiate a c3.8xlarge instance in us-east-1. Provision 3x1TB EBS volumes, attach them to the instance, and configure them as a single RAID 0 volume. Ensure that EBS snapshots are performed every 15 minutes. (Need synchronous replication to prevent any data loss)

AWS EC2 Best Practices

AWS recommends the following best practices to get maximum benefit and satisfaction from EC2

Security & Network

  • Implement the least permissive rules for the security group.
  • Regularly patch, update, and secure the operating system and applications on the instance
  • Manage access to AWS resources and APIs using identity federation, IAM users, and IAM roles
  • Establish credential management policies and procedures for creating, distributing, rotating, and revoking AWS access credentials
  • Launch the instances into a VPC instead of EC2-Classic (newly created AWS accounts use a default VPC)
  • Encrypt EBS volumes and snapshots (a minimal sketch of creating an encrypted volume follows this list)
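A minimal boto3 sketch of creating an encrypted, tagged data volume; the Availability Zone and KMS key alias are placeholders, and KmsKeyId can be omitted to fall back to the AWS-managed EBS key:

```python
import boto3

ec2 = boto3.client("ec2")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,                      # GiB
    VolumeType="gp3",
    Encrypted=True,
    KmsKeyId="alias/my-ebs-key",   # placeholder customer managed key
    TagSpecifications=[{
        "ResourceType": "volume",
        "Tags": [{"Key": "Name", "Value": "app-data"}],
    }],
)
print(volume["VolumeId"])
```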

Storage

  • EC2 supports Instance store and EBS volumes, so it’s best to understand the implications of the root device type for data persistence, backup, and recovery
  • Use separate Amazon EBS volumes for the operating system (root device) versus your data.
  • Ensure that the data volume (with the data) persists after instance termination.
  • Use the instance store available for the instance to only store temporary data. Remember that the data stored in the instance store is deleted when an instance is stopped or terminated.
  • If you use instance store for database storage, ensure that you have a cluster with a replication factor that ensures fault tolerance.

Resource Management

  • Use instance metadata and custom resource tags to track and identify your AWS resources (a tagging sketch follows this list)
  • View your current limits for Amazon EC2. Plan to request any limit increases in advance of the time that you’ll need them.
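A minimal tagging sketch with boto3; the resource IDs and tag values are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Tag an instance and its volume so cost and ownership can be tracked
ec2.create_tags(
    Resources=["i-0123456789abcdef0", "vol-0123456789abcdef0"],  # placeholder IDs
    Tags=[
        {"Key": "Name", "Value": "web-server-1"},
        {"Key": "Environment", "Value": "production"},
        {"Key": "Owner", "Value": "platform-team"},
    ],
)
```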

Backup & Recovery

  • Regularly back up the instance using Amazon EBS snapshots (not done automatically) or a backup tool; a minimal snapshot sketch follows this list
  • Use Data Lifecycle Manager (DLM) to automate the creation, retention, and deletion of the snapshots taken to back up the EBS volumes
  • Create an Amazon Machine Image (AMI) from the instance to save the configuration as a template for launching future instances.
  • Implement High Availability by deploying critical components of the application across multiple Availability Zones, and replicate the data appropriately
  • Monitor and respond to events.
  • Design the applications to handle dynamic IP addressing when the instance restarts.
  • Implement failover. For a basic solution, you can manually attach a network interface or Elastic IP address to a replacement instance
  • Regularly test the process of recovering your instances and Amazon EBS volumes if they fail.
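A minimal boto3 sketch covering the snapshot and AMI points above; the volume and instance IDs are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Point-in-time backup of a data volume
snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",        # placeholder volume ID
    Description="nightly backup of app-data",
)
print(snapshot["SnapshotId"])

# Capture the whole instance configuration as an AMI template
image = ec2.create_image(
    InstanceId="i-0123456789abcdef0",        # placeholder instance ID
    Name="web-server-golden-2024-01-01",
    NoReboot=True,   # avoid a reboot; file systems may not be fully consistent
)
print(image["ImageId"])
```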

References

AWS Encrypting Data at Rest – Whitepaper – Certification

Encrypting Data at Rest

  • AWS delivers a secure, scalable cloud computing platform with high availability, offering the flexibility for you to build a wide range of applications
  • AWS offers several options for encrypting data at rest, for an additional layer of security, ranging from completely automated AWS encryption solutions to manual client-side options
  • Encryption requires 3 things
    • Data to encrypt
    • Encryption keys
    • Cryptographic algorithm method to encrypt the data
  • AWS provides different models for securing data at rest based on the following parameters
    • Encryption method
      • Encryption algorithm selection involves evaluating security, performance, and compliance requirements specific to your application
    • Key Management Infrastructure (KMI)
      • KMI enables managing & protecting the encryption keys from unauthorized access
      • KMI provides
        • Storage layer that protects plain text keys
        • Management layer that authorizes key usage
  • Hardware Security Module (HSM)
    • Common way to protect keys in a KMI is using HSM
    • An HSM is a dedicated storage and data processing device that performs cryptographic operations using keys on the device.
    • An HSM typically provides tamper evidence, or resistance, to protect keys from unauthorized use.
    • A software-based authorization layer controls who can administer the HSM and which users or applications can use which keys in the HSM
  • AWS CloudHSM
    • AWS CloudHSM appliance has both physical and logical tamper detection and response mechanisms that trigger zeroization of the appliance.
    • Zeroization erases the HSM’s volatile memory where any keys in the process of being decrypted were stored and destroys the key that encrypts stored objects, effectively causing all keys on the HSM to be inaccessible and unrecoverable.
    • AWS CloudHSM can be used to generate and store key material and can perform encryption and decryption operations,
    • AWS CloudHSM, however, does not perform any key lifecycle management functions (e.g., access control policy, key rotation) and needs a compatible KMI.
    • KMI can be deployed either on-premises or within Amazon EC2 and can communicate to the AWS CloudHSM instance securely over SSL to help protect data and encryption keys.
    • AWS CloudHSM service uses SafeNet Luna appliances, any key management server that supports the SafeNet Luna platform can also be used with AWS CloudHSM
  • AWS Key Management Service (KMS)
    • AWS KMS is a managed encryption service that allows you to provision and use keys to encrypt data in AWS services and your applications.
    • Master keys, after creation, are designed never to be exported from the service.
    • AWS KMS gives you centralized control over who can access your master keys to encrypt and decrypt data, and it gives you the ability to audit this access.
    • Data can be sent to KMS to be encrypted or decrypted under a specific master key in your account.
    • AWS KMS is natively integrated with other AWS services (for e.g. Amazon EBS, Amazon S3, and Amazon Redshift) and AWS SDKs to simplify encryption of your data within those services or custom applications
    • AWS KMS provides global availability, low latency, and a high level of durability for your keys.

Encryption Models in AWS

Encryption models in AWS depend on whether you or AWS provides the encryption method and the KMI

  • You control the encryption method and the entire KMI
  • You control the encryption method, AWS provides the storage component of the KMI, and you provide the management layer of the KMI.
  • AWS controls the encryption method and the entire KMI.

Model A: You control the encryption method and the entire KMI

  • You use your own KMI to generate, store, and manage access to keys as well as control all encryption methods in your applications
  • Proper storage, management, and use of keys to ensure the confidentiality, integrity, and availability of your data is your responsibility
  • AWS has no access to your keys and cannot perform encryption or decryption on your behalf.
  • Amazon S3
    • Encryption of the data is done before the object is sent to AWS S3
    • Encryption of the data can be done using any encryption method and the encrypted data can be uploaded using the PUT request in the Amazon S3 API
    • Key used to encrypt the data needs to be stored securely in your KMI
    • To decrypt this data, the encrypted object can be downloaded from Amazon S3 using the GET request in the Amazon S3 API and then decrypted using the key in your KMI
    • AWS provides client-side encryption handling, where you can provide your key to the AWS S3 encryption client, which will encrypt and decrypt the data on your behalf. However, AWS never has access to the keys or the unencrypted data (a minimal client-side encryption sketch follows this section).
  • Amazon EBS
    • Amazon Elastic Block Store (Amazon EBS) provides block-level storage volumes for use with Amazon EC2 instances. Amazon EBS volumes are network-attached, and persist independently from the life of an instance.
    • Because Amazon EBS volumes are presented to an instance as a block device, you can leverage most standard encryption tools for file system-level or block-level encryption
    • Block level encryption
      • Block level encryption tools usually operate below the file system layer using kernel space device drivers to perform encryption and decryption of data.
      • These tools are useful when you want all data written to a volume to be encrypted regardless of what directory the data is stored in
    • File System level encryption
      • File system level encryption usually works by stacking an encrypted file system on top of an existing file system.
      • This method is typically used to encrypt a specific directory
    • These solutions require you to provide keys, either manually or from your KMI.
    • Both block-level and file system-level encryption tools can only be used to encrypt data volumes that are not Amazon EBS boot volumes, as they don’t allow you to automatically make a trusted key available to the boot volume at startup
    • There are third-party solutions available, which can help encrypt both the boot and data volumes as well as supply and protect keys
  • AWS Storage Gateway
    • AWS Storage Gateway is a service connecting an on-premises software appliance with Amazon S3. Data on disk volumes attached to the AWS Storage Gateway will be automatically uploaded to Amazon S3 based on policy
    • Encryption of the source data on the disk volumes can be either done before writing to the disk or using block level encryption on the iSCSI endpoint that AWS Storage Gateway exposes to encrypt all data on the disk volume.
  • Amazon RDS
    • Since Amazon RDS doesn’t expose the attached disks it uses for data storage, the transparent disk encryption techniques described in the EBS section cannot be applied.
    • However, individual fields can be encrypted before the data is written to RDS and decrypted after reading it.
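A minimal Model A sketch in Python. It does not use an SDK encryption client; instead it hand-rolls the client-side step with the cryptography package's Fernet cipher as a stand-in for "any encryption method", so the bucket name, key name, and key handling are all placeholders and the key storage is assumed to live in your own KMI:

```python
import boto3
from cryptography.fernet import Fernet   # symmetric cipher; stands in for any client-side method

s3 = boto3.client("s3")
bucket, key = "my-model-a-bucket", "reports/q1.csv"   # placeholders

# In Model A, generating, storing, and protecting this key is entirely your responsibility
data_key = Fernet.generate_key()
fernet = Fernet(data_key)

# Encrypt locally, then upload only the ciphertext; AWS never sees the key or plaintext
ciphertext = fernet.encrypt(b"plaintext file contents")
s3.put_object(Bucket=bucket, Key=key, Body=ciphertext)

# Download and decrypt with the key retrieved from your KMI
obj = s3.get_object(Bucket=bucket, Key=key)
plaintext = fernet.decrypt(obj["Body"].read())
```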

Model B: You control the encryption method, AWS provides the KMI storage component, and you provide the KMI management layer

  • Model B is similar to Model A where the encryption method is managed by you
  • Model B differs from Model A in that the keys are maintained in AWS CloudHSM rather than in an on-premises key storage system
  • Only you have access to the cryptographic partitions within the dedicated HSM to use the keys

Model C: AWS controls the encryption method and the entire KMI

  • AWS provides and manages the server-side encryption of your data, transparently managing the encryption method and the keys.
  • AWS KMS and other services that encrypt your data directly use a method called envelope encryption to provide a balance between performance and security.
  • Envelope Encryption method
    • A master key is defined either by you or AWS
    • A data key (data encryption key) is generated by the AWS service at the time when data encryption is requested
    • Data key is used to encrypt your data.
    • Data key is then encrypted with a key-encrypting key (master key) unique to the service storing your data.
    • Encrypted data key and the encrypted data are then stored by the AWS storage service on your behalf.
  • Master keys (key-encrypting keys) used to encrypt data keys are stored and managed separately from the data and the data keys
  • For decryption, the process is reversed: the encrypted data key is decrypted using the key-encrypting key, and the data key is then used to decrypt your data (a KMS envelope-encryption sketch follows this section)
  • Authorized use of encryption keys is done automatically and is securely managed by AWS.
  • Because unauthorized access to those keys could lead to the disclosure of your data, AWS has built systems and processes with strong access controls that minimize the chance of unauthorized access and had these systems verified by third-party audits to achieve security certifications including SOC 1, 2, and 3, PCI-DSS, and FedRAMP.
  • Amazon S3
    • SSE-S3
      • AWS encrypts each object using a unique data key
      • Data key is encrypted with a periodically rotated master key managed by S3
      • Amazon S3 server-side encryption uses 256-bit Advanced Encryption Standard (AES) keys for both object and master keys
    • SSE-KMS
      • Master keys are defined and managed in KMS for your account
      • Object Encryption
        • When an object is uploaded, a request is sent to KMS to create an object key.
        • KMS generates a unique object key and encrypts it using the master key; KMS then returns this encrypted object key along with the plaintext object key to Amazon S3.
        • Amazon S3 web server encrypts your object using the plaintext object key and stores the now encrypted object (with the encrypted object key) and deletes the plaintext object key from memory.
      • Object Decryption
        • To retrieve the encrypted object, Amazon S3 sends the encrypted object key to AWS KMS.
        • AWS KMS decrypts the object key using the correct master key and returns the decrypted (plaintext) object key to S3.
        • Amazon S3 decrypts the encrypted object, with the plaintext object key, and returns it to you.
    • SSE-C
      • Amazon S3 is provided an encryption key, while uploading the object
      • Encryption key is used by Amazon S3 to encrypt your data using AES-256
      • After object encryption, Amazon S3 deletes the encryption key
      • For downloading, you need to provide the same encryption key, which AWS matches, decrypts and returns the object
  • Amazon EBS
    • When Amazon EBS volume is created, you can choose the master key in KMS to be used for encrypting the volume
    • Volume encryption
      • Amazon EC2 server sends an authenticated request to AWS KMS to create a volume key.
      • AWS KMS generates this volume key, encrypts it using the master key, and returns the plaintext volume key and the encrypted volume key to the Amazon EC2 server.
      • Plaintext volume key is stored in memory to encrypt and decrypt all data going to and from your attached EBS volume.
    • Volume decryption
      • When the encrypted volume (or any encrypted snapshots derived
        from that volume) needs to be re-attached to an instance, a call is made to AWS KMS to decrypt the encrypted volume key.
      • AWS KMS decrypts this encrypted volume key with the correct master key and returns the decrypted volume key to Amazon EC2.
  • Amazon Glacier
    • Glacier provides encryption of the data, by default
    • Before it’s written to disk, data is always automatically encrypted using 256-bit AES keys unique to the Amazon Glacier service that are stored in separate systems under AWS control
  • AWS Storage Gateway
    • AWS Storage Gateway transfers your data to AWS over SSL
    • AWS Storage Gateway stores data encrypted at rest in Amazon S3 or Amazon Glacier using their respective server side encryption schemes.
  • Amazon RDS – Oracle
    • Oracle Advanced Security option for Oracle on Amazon RDS can be used to leverage the native Transparent Data Encryption (TDE) and Native Network Encryption (NNE) features
    • Oracle encryption module creates data and key-encrypting keys to encrypt the database
    • Key-encrypting keys specific to your Oracle instance on Amazon RDS are themselves encrypted by a periodically rotated 256-bit AES master key.
    • Master key is unique to the Amazon RDS service and is stored in separate systems under AWS control
  • Amazon RDS -SQL server
    • Transparent Data Encryption (TDE) can be provisioned for Microsoft SQL Server on Amazon RDS.
    • SQL Server encryption module creates data and key-encrypting keys to encrypt the database.
    • Key-encrypting keys specific to your SQL Server instance on Amazon RDS are themselves encrypted by a periodically rotated, regional 256-bit AES master key
    • Master key is unique to the Amazon RDS service and is stored in separate systems under AWS control
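The KMS-integrated services above perform envelope encryption for you; the sketch below reproduces the same pattern manually with boto3 so the flow is visible. The master key alias is a placeholder, and the local cipher (Fernet from the cryptography package) is an assumption, not the algorithm AWS services use internally:

```python
import base64
import boto3
from cryptography.fernet import Fernet

kms = boto3.client("kms")
master_key_id = "alias/my-master-key"   # placeholder customer master key alias

# 1. Ask KMS for a data key under the master key: plaintext and encrypted copies are returned
dk = kms.generate_data_key(KeyId=master_key_id, KeySpec="AES_256")
plaintext_key = base64.urlsafe_b64encode(dk["Plaintext"])   # Fernet expects base64-encoded 32 bytes
encrypted_key = dk["CiphertextBlob"]                        # store this alongside the data

# 2. Encrypt the data locally with the plaintext data key, then discard the plaintext key
ciphertext = Fernet(plaintext_key).encrypt(b"sensitive payload")

# 3. To decrypt later, unwrap the stored data key with KMS and reverse the process
unwrapped = base64.urlsafe_b64encode(kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"])
payload = Fernet(unwrapped).decrypt(ciphertext)
```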

Sample Exam Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. How can you secure data at rest on an EBS volume?
    1. Encrypt the volume using the S3 server-side encryption service
    2. Attach the volume to an instance using EC2’s SSL interface.
    3. Create an IAM policy that restricts read and write access to the volume.
    4. Write the data randomly instead of sequentially.
    5. Use an encrypted file system on top of the EBS volume
  2. Your company policies require encryption of sensitive data at rest. You are considering the possible options for protecting data while storing it at rest on an EBS data volume, attached to an EC2 instance. Which of these options would allow you to encrypt your data at rest? (Choose 3 answers)
    1. Implement third party volume encryption tools
    2. Do nothing as EBS volumes are encrypted by default
    3. Encrypt data inside your applications before storing it on EBS
    4. Encrypt data using native data encryption drivers at the file system level
    5. Implement SSL/TLS for all services running on the server
  3. A company is storing data on Amazon Simple Storage Service (S3). The company’s security policy mandates that data is encrypted at rest. Which of the following methods can achieve this? Choose 3 answers
    1. Use Amazon S3 server-side encryption with AWS Key Management Service managed keys
    2. Use Amazon S3 server-side encryption with customer-provided keys
    3. Use Amazon S3 server-side encryption with EC2 key pair.
    4. Use Amazon S3 bucket policies to restrict access to the data at rest.
    5. Encrypt the data on the client-side before ingesting to Amazon S3 using their own master key
    6. Use SSL to encrypt the data while in transit to Amazon S3.
  4. Which 2 services provide native encryption?
    1. Amazon EBS
    2. Amazon Glacier
    3. Amazon Redshift (is optional)
    4. Amazon RDS (is optional)
    5. Amazon Storage Gateway
  5. With which AWS services can CloudHSM be used? (Select 2)
    1. S3
    2. DynamoDb
    3. RDS
    4. ElastiCache
    5. Amazon Redshift

References

AWS S3 Data Consistency Model

  • S3 Data Consistency provides strong read-after-write consistency for PUT and DELETE requests of objects in the S3 bucket in all AWS Regions
  • This behavior applies to both writes to new objects as well as PUT requests that overwrite existing objects and DELETE requests.
  • Read operations on S3 Select, S3 ACLs, S3 Object Tags, and object metadata (for example, the HEAD object) are strongly consistent.
  • Updates to a single key are atomic, e.g. if you PUT to an existing key, a subsequent read might return the old data or the updated data, but it will never return corrupted or partial data.
  • S3 achieves high availability by replicating data across multiple servers within Amazon’s data centers. If a PUT request is successful, the data is safely stored. Any read (GET or LIST request) that is initiated following the receipt of a successful PUT response will return the data written by the PUT request.
  • S3 Data Consistency behavior examples
    • A process writes a new object to S3 and immediately lists keys within its bucket. The new object appears in the list.
    • A process replaces an existing object and immediately tries to read it. S3 returns the new data.
    • A process deletes an existing object and immediately tries to read it. S3 does not return any data because the object has been deleted.
    • A process deletes an existing object and immediately lists keys within its bucket. The object does not appear in the listing.
  • S3 does not currently support object locking for concurrent writes. for e.g. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you will need to build an object-locking mechanism into your application.
  • Updates are key-based; there is no way to make atomic updates across keys. for e.g, an update of one key cannot be dependent on the update of another key unless you design this functionality into the application.
  • S3 Object Lock is different as it allows to store objects using a write-once-read-many (WORM) model, which prevents an object from being deleted or overwritten for a fixed amount of time or indefinitely.
  • Previously, S3 provided strong Read-after-Write consistency only for PUTs of new objects
    • For a PUT request, S3 synchronously stores data across multiple facilities before returning SUCCESS
    • A process that wrote a new object to S3 was immediately able to read the object, i.e. PUT 200 -> GET 200
    • A process that wrote a new object to S3 and immediately listed keys within its bucket might not have seen the object in the list until the change was fully propagated
    • However, if a HEAD or GET request was made to a key name before the object was created, and the object was then created shortly after, a subsequent GET might not have returned the object due to eventual consistency, i.e. GET 404 -> PUT 200 -> GET 404
  • Previously, S3 provided only Eventual Consistency for overwrite PUTS and DELETES in all regions
    • For updates and deletes to objects, the changes were eventually reflected and not available immediately, i.e. PUT 200 -> PUT 200 -> GET 200 (might be older version) OR DELETE 200 -> GET 200
    • if a process replaced an existing object and immediately attempted to read it, S3 might have returned the prior data until the change was fully propagated
    • if a process deleted an existing object and immediately attempted to read it, S3 might have returned the deleted data until the deletion was fully propagated
    • if a process deleted an existing object and immediately listed keys within its bucket, S3 might have listed the deleted object until the deletion was fully propagated (a quick check of the current strong-consistency behavior is sketched below)
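A minimal boto3 sketch exercising the current strong read-after-write behavior; the bucket name and key are placeholders and the bucket is assumed to already exist:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-consistency-demo-bucket", "demo/object.txt"   # placeholders

# Write, then immediately read and list: with strong read-after-write consistency
# the new content and the new key are visible right away
s3.put_object(Bucket=bucket, Key=key, Body=b"color=white")

body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
assert body == b"color=white"

keys = [o["Key"] for o in s3.list_objects_v2(Bucket=bucket, Prefix="demo/").get("Contents", [])]
assert key in keys
```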

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which of the following are valid statements about Amazon S3? Choose 2 answers
    1. S3 provides read-after-write consistency for any type of PUT or DELETE. (S3 now provides strong read-after-write consistency)
    2. Consistency is not guaranteed for any type of PUT or DELETE.
    3. A successful response to a PUT request only occurs when a complete object is saved
    4. Partially saved objects are immediately readable with a GET after an overwrite PUT.
    5. S3 provides eventual consistency for overwrite PUTS and DELETES
  2. A customer is leveraging Amazon Simple Storage Service in eu-west-1 to store static content for web-based property. The customer is storing objects using the Standard Storage class. Where are the customers’ objects replicated?
    1. Single facility in eu-west-1 and a single facility in eu-central-1
    2. Single facility in eu-west-1 and a single facility in us-east-1
    3. Multiple facilities in eu-west-1
    4. A single facility in eu-west-1
  3. A user has an S3 object in the US Standard region with the content “color=red”. The user updates the object with the content “color=white”. If the user tries to read the value 1 minute after it was uploaded, what will S3 return?
    1. It will return “color=white” (strong read-after-write consistency)
    2. It will return “color=red”
    3. It will return an error saying that the object was not found
    4. It may return either “color=red” or “color=white” i.e. any of the value (Eventual Consistency)

AWS Simple Storage Service – S3

  • Amazon Simple Storage Service – S3 is a simple key-value object store designed for the Internet
  • provides unlimited storage space and works on the pay-as-you-use model. Service rates get cheaper as the usage volume increases
  • offers an extremely durable, highly available, and infinitely scalable data storage infrastructure at very low costs.
  • is Object-level storage (not Block level storage like EBS volumes) and cannot be used to host OS or dynamic websites.
  • S3 resources e.g. buckets and objects are private by default.

S3 Buckets & Objects

S3 Buckets

  • A bucket is a container for objects stored in S3
  • Buckets help organize the S3 namespace.
  • A bucket is owned by the AWS account that creates it and helps identify the account responsible for storage and data transfer charges.
  • Bucket names are globally unique, regardless of the AWS region in which they are created, and the namespace is shared by all AWS accounts
  • Even though S3 is a global service, buckets are created within a region specified during the creation of the bucket.
  • Every object is contained in a bucket
  • There is no limit to the number of objects that can be stored in a bucket and no difference in performance whether a single bucket or multiple buckets are used to store all the objects
  • The S3 data model is a flat structure i.e. there are no hierarchies or folders within the buckets. However, logical hierarchy can be inferred using the key name prefix e.g. Folder1/Object1
  • Restrictions
    • 100 buckets (soft limit) and a maximum of 1000 buckets can be created in each AWS account
    • Bucket names should be globally unique and DNS compliant
    • Bucket ownership is not transferable
    • Buckets cannot be nested and cannot have a bucket within another bucket
    • Bucket name and region cannot be changed, once created
  • Both empty and non-empty buckets can be deleted
  • S3 list requests return up to 1,000 objects per call and provide pagination support (a bucket-creation sketch, including the region constraint, follows this list)
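A minimal boto3 sketch of creating a bucket in a specific region; the bucket name is a placeholder and must be globally unique:

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

# Bucket names are global and DNS-compliant; the region is fixed at creation time.
# Note: us-east-1 is the default and must NOT be passed as a LocationConstraint.
s3.create_bucket(
    Bucket="my-globally-unique-bucket-name",   # placeholder
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)
```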

Objects

  • Objects are the fundamental entities stored in a bucket
  • An object is uniquely identified within a bucket by a key name and version ID (if S3 versioning is enabled on the bucket)
  • Objects consist of object data, metadata, and others
    • Key is the object name and a unique identifier for an object
    • Value is actual content stored
    • Metadata is the data about the data and is a set of name-value pairs that describe the object e.g. content-type, size, last modified. Custom metadata can also be specified at the time the object is stored.
    • Version ID is the version id for the object and in combination with the key helps to uniquely identify an object within a bucket
    • Subresources help provide additional information for an object
    • Access Control Information helps control access to the objects
  • S3 objects allow two kinds of metadata
    • System metadata
      • Metadata such as the Last-Modified date is controlled by the system. Only S3 can modify the value.
      • System metadata that the user can control, e.g., the storage class, and encryption configured for the object.
    • User-defined metadata
      • User-defined metadata can be assigned during uploading the object or after the object has been uploaded.
      • User-defined metadata is stored with the object and is returned when an object is downloaded
      • S3 does not process user-defined metadata.
      • User-defined metadata must begin with the prefix “x-amz-meta“, otherwise S3 will not set the key-value pair as you define it
  • Object metadata cannot be modified after the object is uploaded; it can only be changed by performing a copy operation and setting the metadata (see the metadata sketch after this section)
  • Objects belonging to a bucket that reside in a specific AWS region never leave that region, unless explicitly copied using Cross Region Replication
  • Each object can be up to 5 TB in size
  • An object can be retrieved as a whole or partially
  • With Versioning enabled, current as well as previous versions of an object can be retrieved
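A minimal boto3 sketch of setting, reading, and replacing object metadata; the bucket name, key, and metadata values are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-globally-unique-bucket-name", "docs/report.pdf"   # placeholders

# User-defined metadata set here is stored and returned as x-amz-meta-* headers
s3.put_object(
    Bucket=bucket,
    Key=key,
    Body=b"%PDF-...",
    ContentType="application/pdf",           # system metadata the user controls
    Metadata={"project": "alpha", "owner": "data-team"},
)

head = s3.head_object(Bucket=bucket, Key=key)
print(head["ContentType"], head["Metadata"])

# Metadata can only be changed by copying the object over itself with new metadata
s3.copy_object(
    Bucket=bucket, Key=key,
    CopySource={"Bucket": bucket, "Key": key},
    MetadataDirective="REPLACE",
    ContentType="application/pdf",
    Metadata={"project": "alpha", "owner": "platform-team"},
)
```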

S3 Bucket & Object Operations

  • Listing
    • S3 allows the listing of all the keys within a bucket
    • A single listing request would return a max of 1000 object keys with pagination support using an indicator in the response to indicate if the response was truncated
    • Keys within a bucket can be listed using Prefix and Delimiter.
    • Prefix limits the results to only those keys (a kind of filtering) that begin with the specified prefix, and the delimiter causes the list to roll up all keys that share a common prefix into a single summary list result (see the listing sketch after this section).
  • Retrieval
    • An object can be retrieved as a whole
    • An object can be retrieved in parts or partially (a specific range of bytes) by using the Range HTTP header.
    • Range HTTP header is helpful
      • if only a partial object is needed for e.g. multiple files were uploaded as a single archive
      • for fault-tolerant downloads where the network connectivity is poor
    • Objects can also be downloaded by sharing Pre-Signed URLs
    • Metadata of the object is returned in the response headers
  • Object Uploads
    • Single Operation – objects up to 5 GB in size can be uploaded in a single PUT operation
    • Multipart upload – must be used for objects larger than 5 GB, supports a maximum object size of 5 TB, and is recommended for objects above 100 MB.
    • Pre-Signed URLs can also be used and shared for uploading objects
    • A successful upload can be verified by checking that the request received a successful response; additionally, the returned ETag can be compared to the calculated MD5 value of the uploaded object
  • Copying Objects
    • Objects up to 5 GB can be copied in a single operation; the multipart upload API can be used for copies up to 5 TB
    • When an object is copied
      • user-controlled system metadata e.g. storage class and user-defined metadata are also copied.
      • system controlled metadata e.g. the creation date etc is reset
    • Copying Objects can be needed to
      • Create multiple object copies
      • Copy objects across locations or regions
      • Renaming of the objects
      • Change object metadata for e.g. storage class, encryption, etc
      • Updating any metadata for an object requires all the metadata fields to be specified again
  • Deleting Objects
    • S3 allows deletion of a single object or multiple objects (max 1000) in a single call
    • For Non Versioned buckets,
      • the object key needs to be provided and the object is permanently deleted
    • For Versioned buckets,
      • if an object key is provided, S3 inserts a delete marker and the previous current object becomes the non-current object
      • if an object key with a version ID is provided, the object is permanently deleted
      • if the version ID is of the delete marker, the delete marker is removed and the previous non-current version becomes the current version object
    • Deletion can be MFA enabled for adding extra security
  • Restoring Objects from Glacier
    • Objects must be restored before accessing an archived object
    • Restoration of an Object takes time and costs more. Glacier now offers expedited retrievals within minutes.
    • Restoration request also needs to specify the number of days for which the object copy needs to be maintained.
    • During this period, storage cost applies for both the archive and the copy.
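A minimal boto3 sketch covering listing with Prefix/Delimiter (the paginator handles the 1,000-key page limit) and a partial GET with the Range header; the bucket name and keys are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-globally-unique-bucket-name"   # placeholder

# List keys under a "folder" prefix
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix="Folder1/", Delimiter="/"):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])
    for cp in page.get("CommonPrefixes", []):   # rolled-up "sub-folders"
        print("prefix:", cp["Prefix"])

# Partial retrieval of the first 1 KB of an object using the Range header
part = s3.get_object(Bucket=bucket, Key="Folder1/Object1", Range="bytes=0-1023")
print(len(part["Body"].read()))
```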

Pre-Signed URLs

  • All buckets and objects are by default private.
  • Pre-signed URLs allows user to be able to download or upload a specific object without requiring AWS security credentials or permissions.
  • Pre-signed URL allows anyone to access the object identified in the URL, provided the creator of the URL has permission to access that object.
  • Pre-signed URL creation requires the creator to provide security credentials, a bucket name, an object key, an HTTP method (GET for downloading objects & PUT for uploading objects), and an expiration date and time
  • Pre-signed URLs are valid only until the expiration date & time (see the sketch after this section).
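A minimal boto3 sketch of generating pre-signed URLs; the bucket name and keys are placeholders, and the URLs inherit the permissions of the credentials that signed them:

```python
import boto3

s3 = boto3.client("s3")

# Anyone holding this URL can GET the object until it expires
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-globally-unique-bucket-name", "Key": "docs/report.pdf"},  # placeholders
    ExpiresIn=3600,   # seconds
)
print(url)

# A PUT URL for uploads works the same way
upload_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-globally-unique-bucket-name", "Key": "uploads/new-file.bin"},
    ExpiresIn=900,
)
```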

Multipart Upload

  • Multipart upload allows the user to upload a single large object as a set of parts. Each part is a contiguous portion of the object’s data.
  • Multipart uploads support 1 to 10000 parts and each part can be from 5MB to 5GB with the last part size allowed to be less than 5MB
  • Multipart uploads allow a max upload size of 5TB
  • Object parts can be uploaded independently and in any order. If transmission of any part fails, it can be retransmitted without affecting other parts.
  • After all parts of the object are uploaded and the completion request is initiated, S3 assembles these parts and creates the object.
  • Using multipart upload provides the following advantages:
    • Improved throughput – parallel upload of parts to improve throughput
    • Quick recovery from any network issues – Smaller part size minimizes the impact of restarting a failed upload due to a network error.
    • Pause and resume object uploads – Object parts can be uploaded over time. Once a multipart upload is initiated there is no expiry; you must explicitly complete or abort the multipart upload.
    • Begin an upload before the final object size is known – an object can be uploaded as it is being created
  • Three Step process
    • Multipart Upload Initiation
      • Initiation of a Multipart upload request to S3 returns a unique ID for each multipart upload.
      • This ID needs to be provided for each part upload, completion or abort request and listing of parts call.
      • All the Object metadata required needs to be provided during the Initiation call
    • Parts Upload
      • Parts upload of objects can be performed using the unique upload ID
      • A part number (between 1 – 10000) needs to be specified with each request which identifies each part and its position in the object
      • If a part with the same part number is uploaded, the previous part would be overwritten
      • After the part upload is successful, S3 returns an ETag header in the response which must be recorded along with the part number to be provided during the multipart completion request
    • Multipart Upload Completion or Abort
      • On Multipart Upload Completion request, S3 creates an object by concatenating the parts in ascending order based on the part number and associates the metadata with the object
      • Multipart Upload Completion request should include the unique upload ID with all the parts and the ETag information
      • The response includes an ETag that uniquely identifies the combined object data
      • On a Multipart Upload Abort request, the upload is aborted, all uploaded parts are removed, and any new part upload for that upload ID fails. However, part uploads already in progress may still complete, so an abort request should be issued (or retried) after all part uploads have finished.
      • S3 must receive a multipart upload completion or abort request, otherwise it will not delete the parts and storage will keep being charged (see the sketch below).
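A minimal boto3 sketch of the three-step multipart flow using the low-level API (the higher-level upload_file helper does this automatically); the bucket name, key, and local file name are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-globally-unique-bucket-name", "videos/large-file.bin"   # placeholders
PART_SIZE = 8 * 1024 * 1024   # 8 MB; every part except the last must be >= 5 MB

# 1. Initiate: S3 returns the upload ID used by every subsequent call
upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

parts = []
try:
    with open("large-file.bin", "rb") as f:   # placeholder local file
        part_number = 1
        while True:
            chunk = f.read(PART_SIZE)
            if not chunk:
                break
            # 2. Upload each part; record the returned ETag with its part number
            resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=upload_id,
                                  PartNumber=part_number, Body=chunk)
            parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
            part_number += 1

    # 3. Complete: S3 concatenates the parts in part-number order into one object
    s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                                 MultipartUpload={"Parts": parts})
except Exception:
    # Abort so the uploaded parts don't keep accruing storage charges
    s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
    raise
```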

S3 Transfer Acceleration

  • S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between the client and a bucket.
  • Transfer Acceleration takes advantage of CloudFront’s globally distributed edge locations. As the data arrives at an edge location, data is routed to S3 over an optimized network path.
  • Transfer Acceleration incurs additional charges, while uploading data to S3 over the public Internet is free (enabling and using it is sketched below).
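A minimal boto3 sketch of enabling Transfer Acceleration on a bucket and sending uploads through the accelerated endpoint; the bucket and file names are placeholders:

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket
s3.put_bucket_accelerate_configuration(
    Bucket="my-globally-unique-bucket-name",   # placeholder
    AccelerateConfiguration={"Status": "Enabled"},
)

# Clients then opt in to the accelerated endpoint (bucketname.s3-accelerate.amazonaws.com)
accelerated = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
accelerated.upload_file("big-video.mp4", "my-globally-unique-bucket-name", "videos/big-video.mp4")
```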

S3 Batch Operations

  • S3 Batch Operations help perform large-scale batch operations on S3 objects and can perform a single operation on lists of specified S3 objects.
  • A single job can perform a specified operation on billions of objects containing exabytes of data.
  • S3 tracks progress, sends notifications, and stores a detailed completion report of all actions, providing a fully managed, auditable, and serverless experience.
  • Batch Operations can be used with S3 Inventory to get the object list and use S3 Select to filter the objects.
  • Batch Operations can be used for copying objects, modifying object metadata, applying ACLs, encrypting objects, transforming objects, invoking a custom Lambda function, etc.

Virtual Hosted Style vs Path-Style Request

S3 allows the buckets and objects to be referred to in Path-style or Virtual hosted-style URLs

Path-style

  • Bucket name is not part of the domain (unless region specific endpoint used)
  • Endpoint used must match the region in which the bucket resides for e.g, if you have a bucket called mybucket that resides in the EU (Ireland) region with object named puppy.jpg, the correct path-style syntax URI is http://s3-eu-west-1.amazonaws.com/mybucket/puppy.jpg.
  • A “PermanentRedirect” error is received with an HTTP response code 301, and a message indicating what the correct URI is for the resource if a bucket is accessed outside the US East (N. Virginia) region with path-style syntax that uses either of the following:
    • http://s3.amazonaws.com
    • An endpoint for a region different from the one where the bucket resides for e.g., if you use http://s3-eu-west-1.amazonaws.com for a bucket that was created in the US West (N. California) region
  • Path-style requests would not be supported after September 30, 2020

Virtual hosted-style

  • S3 supports virtual hosted-style and path-style access in all regions.
  • In a virtual-hosted-style URL, the bucket name is part of the domain name in the URL for e.g. http://bucketname.s3.amazonaws.com/objectname
  • S3 virtual hosting can be used to address a bucket in a REST API call by using the HTTP Host header
  • Benefits
    • attractiveness of customized URLs,
    • provides an ability to publish to the “root directory” of the bucket’s virtual server. This ability can be important because many existing applications search for files in this standard location.
  • S3 updates DNS to reroute the request to the correct location when a bucket is created in any region, which might take time.
  • S3 routes any virtual hosted-style requests to the US East (N.Virginia) region, by default, if the US East (N. Virginia) endpoint s3.amazonaws.com is used, instead of the region-specific endpoint (for e.g., s3-eu-west-1.amazonaws.com) and S3 redirects it with HTTP 307 redirect to the correct region.
  • When using virtual hosted-style buckets with SSL, the SSL wildcard certificate only matches buckets that do not contain periods. To work around this, use HTTP or write your own certificate verification logic.
  • If you make a request to the http://bucket.s3.amazonaws.com endpoint, the DNS has sufficient information to route the request directly to the region where your bucket resides.

S3 Pricing

  • S3 costs vary by Region
  • Charges are incurred for
    • Storage – cost is per GB/month
    • Requests – per request cost varies depending on the request type GET, PUT
    • Data Transfer
      • data transfer-in is free
      • data transfer out is charged per GB (except within the same region or to Amazon CloudFront)

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. What does Amazon S3 stand for?
    1. Simple Storage Solution.
    2. Storage Storage Storage (triple redundancy Storage).
    3. Storage Server Solution.
    4. Simple Storage Service
  2. What are characteristics of Amazon S3? Choose 2 answers
    1. Objects are directly accessible via a URL
    2. S3 should be used to host a relational database
    3. S3 allows you to store objects of virtually unlimited size
    4. S3 allows you to store virtually unlimited amounts of data
    5. S3 offers Provisioned IOPS
  3. You are building an automated transcription service in which Amazon EC2 worker instances process an uploaded audio file and generate a text file. You must store both of these files in the same durable storage until the text file is retrieved. You do not know what the storage capacity requirements are. Which storage option is both cost-efficient and scalable?
    1. Multiple Amazon EBS volume with snapshots
    2. A single Amazon Glacier vault
    3. A single Amazon S3 bucket
    4. Multiple instance stores
  4. A user wants to upload a complete folder to AWS S3 using the S3 Management console. How can the user perform this activity?
    1. Just drag and drop the folder using the flash tool provided by S3
    2. Use the Enable Enhanced Folder option from the S3 console while uploading objects
    3. The user cannot upload the whole folder in one go with the S3 management console
    4. Use the Enable Enhanced Uploader option from the S3 console while uploading objects (NOTE – It's no longer supported by AWS)
  5. A media company produces new video files on-premises every day with a total size of around 100GB after compression. All files have a size of 1-2 GB and need to be uploaded to Amazon S3 every night in a fixed time window between 3am and 5am. Current upload takes almost 3 hours, although less than half of the available bandwidth is used. What step(s) would ensure that the file uploads are able to complete in the allotted time window?
    1. Increase your network bandwidth to provide faster throughput to S3
    2. Upload the files in parallel to S3 using multipart upload
    3. Pack all files into a single archive, upload it to S3, then extract the files in AWS
    4. Use AWS Import/Export to transfer the video files
  6. A company is deploying a two-tier, highly available web application to AWS. Which service provides durable storage for static content while utilizing lower Overall CPU resources for the web tier?
    1. Amazon EBS volume
    2. Amazon S3
    3. Amazon EC2 instance store
    4. Amazon RDS instance
  7. You have an application running on an Amazon Elastic Compute Cloud instance, that uploads 5 GB video objects to Amazon Simple Storage Service (S3). Video uploads are taking longer than expected, resulting in poor application performance. Which method will help improve performance of your application?
    1. Enable enhanced networking
    2. Use Amazon S3 multipart upload
    3. Leveraging Amazon CloudFront, use the HTTP POST method to reduce latency.
    4. Use Amazon Elastic Block Store Provisioned IOPs and use an Amazon EBS-optimized instance
  8. When you put objects in Amazon S3, what is the indication that an object was successfully stored?
    1. Each S3 account has a special bucket named_s3_logs. Success codes are written to this bucket with a timestamp and checksum.
    2. A success code is inserted into the S3 object metadata.
    3. A HTTP 200 result code and MD5 checksum, taken together, indicate that the operation was successful.
    4. Amazon S3 is engineered for 99.999999999% durability. Therefore there is no need to confirm that data was inserted.
  9. You have private video content in S3 that you want to serve to subscribed users on the Internet. User IDs, credentials, and subscriptions are stored in an Amazon RDS database. Which configuration will allow you to securely serve private content to your users?
    1. Generate pre-signed URLs for each user as they request access to protected S3 content
    2. Create an IAM user for each subscribed user and assign the GetObject permission to each IAM user
    3. Create an S3 bucket policy that limits access to your private content to only your subscribed users’ credentials
    4. Create a CloudFront Origin Identity user for your subscribed users and assign the GetObject permission to this user
  10. You run an ad-supported photo sharing website using S3 to serve photos to visitors of your site. At some point you find out that other sites have been linking to the photos on your site, causing loss to your business. What is an effective method to mitigate this?
    1. Remove public read access and use signed URLs with expiry dates.
    2. Use CloudFront distributions for static content.
    3. Block the IPs of the offending websites in Security Groups.
    4. Store photos on an EBS volume of the web server.
  11. You are designing a web application that stores static assets in an Amazon Simple Storage Service (S3) bucket. You expect this bucket to immediately receive over 150 PUT requests per second. What should you do to ensure optimal performance?
    1. Use multi-part upload.
    2. Add a random prefix to the key names.
    3. Amazon S3 will automatically manage performance at this scale. (With latest S3 performance improvements, S3 scaled automatically)
    4. Use a predictable naming scheme, such as sequential numbers or date time sequences, in the key names
  12. What is the maximum number of S3 buckets available per AWS Account?
    1. 100 Per region
    2. There is no Limit
    3. 100 Per Account (Refer documentation)
    4. 500 Per Account
    5. 100 Per IAM User
  13. Your customer needs to create an application to allow contractors to upload videos to Amazon Simple Storage Service (S3) so they can be transcoded into a different format. She creates AWS Identity and Access Management (IAM) users for her application developers, and in just one week, they have the application hosted on a fleet of Amazon Elastic Compute Cloud (EC2) instances. The attached IAM role is assigned to the instances. As expected, a contractor who authenticates to the application is given a pre-signed URL that points to the location for video upload. However, contractors are reporting that they cannot upload their videos. Which of the following are valid reasons for this behavior? Choose 2 answers { “Version”: “2012-10-17”, “Statement”: [ { “Effect”: “Allow”, “Action”: “s3:*”, “Resource”: “*” } ] }
    1. The IAM role does not explicitly grant permission to upload the object. (The role has all permissions for all activities on S3)
    2. The contractors' accounts have not been granted “write” access to the S3 bucket. (with pre-signed URLs the contractors' accounts don't need access; only the creator of the pre-signed URLs does)
    3. The application is not using valid security credentials to generate the pre-signed URL.
    4. The developers do not have access to upload objects to the S3 bucket. (the developers are not uploading the objects; the uploads are done using pre-signed URLs)
    5. The S3 bucket still has the associated default permissions. (does not matter as long as the user has permission to upload)
    6. The pre-signed URL has expired.

AWS S3 Data Protection

AWS S3 Data Protection

  • S3 protects data using a highly durable storage infrastructure designed for mission-critical and primary data storage.
  • Objects are redundantly stored on multiple devices across multiple facilities in an S3 region.
  • S3 PUT and PUT Object copy operations synchronously store the data across multiple facilities before returning SUCCESS.
  • Once the objects are stored, S3 maintains its durability by quickly detecting and repairing any lost redundancy.
  • S3 also regularly verifies the integrity of data stored using checksums. If S3 detects data corruption, it is repaired using redundant data.
  • In addition, S3 calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data
  • Data protection against accidental overwrites and deletions can be added by enabling Versioning to preserve, retrieve and restore every version of the object stored
  • S3 also provides the ability to protect data in transit (as it travels to and from S3) and at rest (while it is stored in S3)
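
A minimal sketch (Python with boto3, hypothetical bucket and key names) of enabling the Versioning mentioned above for added protection against accidental overwrites and deletions:

  import boto3

  s3 = boto3.client("s3")
  bucket = "examplebucket"  # hypothetical bucket name

  # Enable versioning so accidental overwrites and deletes remain recoverable.
  s3.put_bucket_versioning(
      Bucket=bucket,
      VersioningConfiguration={"Status": "Enabled"},
  )

  # With versioning enabled, each PUT creates a new version of the object.
  resp = s3.put_object(Bucket=bucket, Key="report.csv", Body=b"version 2 of the data")
  print(resp["VersionId"])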

S3 Encryption

Refer blog post @ S3 Encryption

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A customer is leveraging Amazon Simple Storage Service in eu-west-1 to store static content for a web-based property. The customer is storing objects using the Standard Storage class. Where are the customers objects replicated?
    1. A single facility in eu-west-1 and a single facility in eu-central-1
    2. A single facility in eu-west-1 and a single facility in us-east-1
    3. Multiple facilities in eu-west-1
    4. A single facility in eu-west-1
  2. A system admin is planning to encrypt all objects being uploaded to S3 from an application. The system admin does not want to implement his own encryption algorithm; instead he is planning to use server side encryption by supplying his own key (SSE-C). Which parameter is not required while making a call for SSE-C?
    1. x-amz-server-side-encryption-customer-key-AES-256
    2. x-amz-server-side-encryption-customer-key
    3. x-amz-server-side-encryption-customer-algorithm
    4. x-amz-server-side-encryption-customer-key-MD5

References

AWS_S3_Security

AWS S3 Permissions

AWS S3 Permissions

  • By default, all S3 buckets, objects, and related subresources are private.
  • Only the Resource owner, the AWS account (not the user) that creates the resource, can access the resource.
  • Resource owner can be
    • AWS account that creates the bucket or object owns those resources
    • If an IAM user creates the bucket or object, the AWS account of the IAM user owns the resource
    • If the bucket owner grants cross-account permissions to other AWS account users to upload objects to the buckets, the objects are owned by the AWS account of the user who uploaded the object and not the bucket owner except for the following conditions
      • Bucket owner can deny access to the object, as it is still the bucket owner who pays for the object
      • Bucket owner can delete or apply archival rules to the object and perform restoration
  • User is the AWS Account or the IAM user who accesses the resource
  • Bucket owner is the AWS account that created a bucket
  • Object owner is the AWS account that uploads the object, even if the bucket is owned by another account
  • S3 permissions are classified into
    • Resource based policies and
    • User policies

User Policies

  • User policies use IAM with S3 to control the type of access a user or group of users has to specific parts of an S3 bucket the AWS account owns
  • User policy is always attached to a User, Group, or a Role
  • Anonymous permissions cannot be granted
  • If an AWS account that owns a bucket wants to grant permission to users in its account, it can use either a bucket policy or a user policy
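
A minimal sketch (Python with boto3) of attaching an inline user policy; the user name, policy name, and bucket ARN are hypothetical and only illustrate granting an IAM user access to part of a bucket owned by the same account.

  import json
  import boto3

  iam = boto3.client("iam")

  # Hypothetical user and bucket names; the policy grants the user read/write
  # access to a single prefix of a bucket the account owns.
  policy = {
      "Version": "2012-10-17",
      "Statement": [{
          "Effect": "Allow",
          "Action": ["s3:GetObject", "s3:PutObject"],
          "Resource": "arn:aws:s3:::examplebucket/team-data/*",
      }],
  }

  iam.put_user_policy(
      UserName="example-user",
      PolicyName="ExampleS3Access",
      PolicyDocument=json.dumps(policy),
  )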

Resource-Based policies

  • Bucket policies and access control lists (ACLs) are resource-based because they are attached to the S3 resources

Bucket Policies

  • Bucket policy can be used to grant cross-account access to other AWS accounts or IAM users in other accounts for the bucket and objects in it.
  • Bucket policies provide centralized access control to buckets and objects based on a variety of conditions, including S3 operations, requesters, resources, and aspects of the request (e.g. IP address)
  • If an AWS account that owns a bucket wants to grant permission to users in its account, it can use either a bucket policy or a user policy
  • Permissions attached to a bucket apply to all of the objects in that bucket created and owned by the bucket owner
  • Policies can either add or deny permissions across all (or a subset) of objects within a bucket
  • Only the bucket owner is allowed to associate a policy with a bucket
  • Bucket policies can cater to multiple use cases
    • Granting permissions to multiple accounts with added conditions
    • Granting read-only permission to an anonymous user
    • Limiting access to specific IP addresses
    • Restricting access to a specific HTTP referer
    • Restricting access to a specific HTTP header for e.g. to enforce encryption
    • Granting permission to a CloudFront OAI
    • Adding a bucket policy to require MFA
    • Granting cross-account permissions to upload objects while ensuring the bucket owner has full control
    • Granting permissions for S3 inventory and Amazon S3 analytics
    • Granting permissions for S3 Storage Lens
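
As an example of the use case of limiting access to specific IP addresses listed above, the sketch below (Python with boto3, hypothetical bucket name and CIDR range) applies a bucket policy that denies any request not originating from the allowed IP range.

  import json
  import boto3

  s3 = boto3.client("s3")
  bucket = "examplebucket"  # hypothetical bucket name

  # Deny all requests to the bucket that do not originate from the allowed
  # CIDR range (placeholder range shown; adjust before applying for real).
  policy = {
      "Version": "2012-10-17",
      "Statement": [{
          "Sid": "DenyOutsideAllowedRange",
          "Effect": "Deny",
          "Principal": "*",
          "Action": "s3:*",
          "Resource": [
              f"arn:aws:s3:::{bucket}",
              f"arn:aws:s3:::{bucket}/*",
          ],
          "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
      }],
  }

  s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))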

Access Control Lists (ACLs)

  • Each bucket and object has an ACL associated with it.
  • An ACL is a list of grants identifying grantee and permission granted
  • ACLs are used to grant basic read/write permissions on resources to other AWS accounts.
  • ACL supports limited permissions set and
    • cannot grant conditional permissions, nor can you explicitly deny permissions
    • cannot be used to grant permissions for bucket subresources
  • Permission can be granted to an AWS account by the email address or the canonical user ID (which is just an obfuscated Account ID). If an email address is provided, S3 will still find the canonical user ID for the user and add it to the ACL.
  • It is recommended to use the canonical user ID, as granting by email address is not supported in all AWS Regions
  • Bucket ACL
    • The only recommended use case for the bucket ACL is to grant write permission to the S3 Log Delivery group to write access log objects to the bucket
    • Bucket ACL helps grant write permission on the bucket to the Log Delivery group if access log delivery is needed to the bucket
    • A bucket ACL is the only way to grant the necessary permissions to the Log Delivery group
  • Object ACL
    • Object ACLs control only Object-level Permissions
    • Object ACL is the only way to manage permission to an object in the bucket not owned by the bucket owner i.e. If the bucket owner allows cross-account object uploads and if the object owner is different from the bucket owner, the only way for the object owner to grant permissions on the object is through Object ACL
    • If the Bucket and Object is owned by the same AWS account, Bucket policy can be used to manage the permissions
    • If the Object and User is owned by the same AWS account, User policy can be used to manage the permissions
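
A minimal sketch (Python with boto3; bucket, key, and canonical user IDs are placeholders) covering both ACL uses described above: granting the Log Delivery group write access on a log bucket through the canned log-delivery-write ACL, and an object owner granting READ on its object to another account.

  import boto3

  s3 = boto3.client("s3")

  # Canned ACL granting the S3 Log Delivery group write access to a log bucket
  # (the recommended bucket ACL use case).
  s3.put_bucket_acl(Bucket="example-log-bucket", ACL="log-delivery-write")

  # Object ACL: the object owner grants READ to another account by canonical
  # user ID. Grant headers replace the existing ACL, so the owner's own
  # FULL_CONTROL grant is restated as well (IDs below are placeholders).
  s3.put_object_acl(
      Bucket="examplebucket",
      Key="shared/report.csv",
      GrantFullControl='id="OWNER_CANONICAL_USER_ID"',
      GrantRead='id="GRANTEE_CANONICAL_USER_ID"',
  )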

S3 Request Authorization

When S3 receives a request, it must evaluate all the user policies, bucket policies, and ACLs to determine whether to authorize or deny the request.

S3 evaluates the policies in 3 contexts

  • User context is basically the context in which S3 evaluates the User policy that the parent AWS account (context authority) attaches to the user
  • Bucket context is the context in which S3 evaluates the access policies owned by the bucket owner (context authority) to check if the bucket owner has not explicitly denied access to the resource
  • Object context is the context where S3 evaluates policies owned by the Object owner (context authority)

Analogy

  • Consider 3 Parents (AWS Account) A, B and C with Child (IAM User) AA, BA and CA respectively
  • Parent A owns a Toy box (Bucket) with Toy AAA and also allows toys (Objects) to be dropped and picked up
  • Parent A can grant permission (User Policy OR Bucket policy OR both) to his Child AA to access the Toy box and the toys
  • Parent A can grant permissions (Bucket policy) to Parent B (different AWS account) to drop toys into the toys box. Parent B can grant permissions (User policy) to his Child BA to drop Toy BAA
  • Parent B can grant permissions (Object ACL) to Parent A to access Toy BAA
  • Parent A can grant permissions (Bucket Policy) to Parent C to pick up the Toy AAA who in turn can grant permission (User Policy) to his Child CA to access the toy
  • Parent A can grant permission (through IAM Role) to Parent C to pick up the Toy BAA who in turn can grant permission (User Policy) to his Child CA to access the toy

Bucket Operation Authorization

  1. If the requester is an IAM user, the user must have permission (User Policy) from the parent AWS account to which it belongs
  2. Amazon S3 evaluates a subset of policies owned by the parent account. This subset of policies includes the user policy that the parent account attaches to the user.
  3. If the parent also owns the resource in the request (in this case, the bucket), Amazon S3 also evaluates the corresponding resource policies (bucket policy and bucket ACL) at the same time.
  4. Requester must also have permissions (Bucket Policy or ACL) from the bucket owner to perform a specific bucket operation.
  5. Amazon S3 evaluates a subset of policies owned by the AWS account that owns the bucket. The bucket owner can grant permission by using a bucket policy or bucket ACL.
  6. Note that, if the AWS account that owns the bucket is also the parent account of an IAM user, then it can configure bucket permissions in a user policy or bucket policy or both

Object Operation Authorization

  1. If the requester is an IAM user, the user must have permission (User Policy) from the parent AWS account to which it belongs.
  2. Amazon S3 evaluates a subset of policies owned by the parent account. This subset of policies includes the user policy that the parent attaches to the user.
  3. If the parent also owns the resource in the request (bucket, object), Amazon S3 evaluates the corresponding resource policies (bucket policy, bucket ACL, and object ACL) at the same time.
  4. If the parent AWS account owns the resource (bucket or object), it can grant resource permissions to its IAM user by using either the user policy or the resource policy.
  5. S3 evaluates policies owned by the AWS account that owns the bucket.
  6. If the AWS account that owns the object in the request is not the same as the bucket owner, in the bucket context Amazon S3 checks the policies if the bucket owner has explicitly denied access to the object.
  7. If there is an explicit deny set on the object, Amazon S3 does not authorize the request.
  8. Requester must have permissions from the object owner (Object ACL) to perform a specific object operation.
  9. Amazon S3 evaluates the object ACL.
  10. If bucket and object owners are the same, access to the object can be granted in the bucket policy, which is evaluated in the bucket context.
  11. If the owners are different, the object owners must use an object ACL to grant permissions.
  12. If the AWS account that owns the object is also the parent account to which the IAM user belongs, it can configure object permissions in a user policy, which is evaluated in the user context.

Permission Delegation

  • If an AWS account owns a resource, it can grant those permissions to another AWS account.
  • That account can then delegate those permissions, or a subset of them, to users in the account. This is referred to as permission delegation.
  • But an account that receives permissions from another account cannot delegate permission cross-account to another AWS account.
  • If the Bucket owner wants to grant another AWS account permission to an object it does not own, it cannot do so through cross-account permissions; it needs to define an IAM role that can be assumed by the other AWS account to gain access
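
A minimal sketch of that pattern (Python with boto3; the role ARN, bucket, and key are hypothetical): the other AWS account assumes a role defined in the bucket owner's account and uses the temporary credentials to access the object.

  import boto3

  # The bucket owner's account defines a role (hypothetical ARN below) that the
  # other AWS account is allowed to assume; the temporary credentials returned
  # by STS are then used to access the object.
  sts = boto3.client("sts")
  creds = sts.assume_role(
      RoleArn="arn:aws:iam::111122223333:role/example-s3-object-access",
      RoleSessionName="cross-account-object-access",
  )["Credentials"]

  s3 = boto3.client(
      "s3",
      aws_access_key_id=creds["AccessKeyId"],
      aws_secret_access_key=creds["SecretAccessKey"],
      aws_session_token=creds["SessionToken"],
  )
  obj = s3.get_object(Bucket="examplebucket", Key="partner-upload/data.csv")
  print(obj["ContentLength"])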

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which features can be used to restrict access to data in S3? Choose 2 answers
    1. Set an S3 ACL on the bucket or the object.
    2. Create a CloudFront distribution for the bucket.
    3. Set an S3 bucket policy.
    4. Enable IAM Identity Federation
    5. Use S3 Virtual Hosting
  2. Which method can be used to prevent an IP address block from accessing public objects in an S3 bucket?
    1. Create a bucket policy and apply it to the bucket
    2. Create a NACL and attach it to the VPC of the bucket
    3. Create an ACL and apply it to all objects in the bucket
    4. Modify the IAM policies of any users that would access the bucket
  3. A user has granted read/write permission of his S3 bucket using ACL. Which of the below mentioned options is a valid ID to grant permission to other AWS accounts (grantee) using ACL?
    1. IAM User ID
    2. S3 Secure ID
    3. Access ID
    4. Canonical user ID
  4. A root account owner has given full access of his S3 bucket to one of the IAM users using the bucket ACL. When the IAM user logs in to the S3 console, which actions can he perform?
    1. He can just view the content of the bucket
    2. He can do all the operations on the bucket
    3. It is not possible to give access to an IAM user using ACL
    4. The IAM user can perform all operations on the bucket using only API/SDK
  5. A root AWS account owner is trying to understand various options to set the permission to AWS S3. Which of the below mentioned options is not the right option to grant permission for S3?
    1. User Access Policy
    2. S3 Object Policy
    3. S3 Bucket Policy
    4. S3 ACL
  6. A system admin is managing buckets, objects and folders with AWS S3. Which of the below mentioned statements is true and should be taken in consideration by the sysadmin?
    1. Folders support only ACL
    2. Both the object and bucket can have an Access Policy but folder cannot have policy
    3. Folders can have a policy
    4. Both the object and bucket can have ACL but folders cannot have ACL
  7. A user has created an S3 bucket which is not publicly accessible. The bucket is having thirty objects which are also private. If the user wants to make the objects public, how can he configure this with minimal efforts?
    1. User should select all objects from the console and apply a single policy to mark them public
    2. User can write a program which programmatically makes all objects public using S3 SDK
    3. Set the AWS bucket policy which marks all objects as public
    4. Make the bucket ACL as public so it will also mark all objects as public
  8. You need to configure an Amazon S3 bucket to serve static assets for your public-facing web application. Which methods ensure that all objects uploaded to the bucket are set to public read? Choose 2 answers
    1. Set permissions on the object to public read during upload.
    2. Configure the bucket ACL to set all objects to public read.
    3. Configure the bucket policy to set all objects to public read.
    4. Use AWS Identity and Access Management roles to set the bucket to public read.
    5. Amazon S3 objects default to public read, so no action is needed.
  9. Amazon S3 doesn’t automatically give a user who creates _____ permission to perform other actions on that bucket or object.
    1. a file
    2. a bucket or object
    3. a bucket or file
    4. a object or file
  10. A root account owner is trying to understand the S3 bucket ACL. Which of the below mentioned options cannot be used to grant ACL on the object using the authorized predefined group?
    1. Authenticated user group
    2. All users group
    3. Log Delivery Group
    4. Canonical user group
  11. A user is enabling logging on a particular bucket. Which of the below mentioned options may be best suitable to allow access to the log bucket?
    1. Create an IAM policy and allow log access
    2. It is not possible to enable logging on the S3 bucket
    3. Create an IAM Role, which has access to the log bucket
    4. Provide ACL for the logging group
  12. A user is trying to configure access with S3. Which of the following options is not possible to provide access to the S3 bucket / object?
    1. Define the policy for the IAM user
    2. Define the ACL for the object
    3. Define the policy for the object
    4. Define the policy for the bucket
  13. A user is having access to objects of an S3 bucket, which is not owned by him. If he is trying to set the objects of that bucket public, which of the below mentioned options may be a right fit for this action?
    1. Make the bucket public with full access
    2. Define the policy for the bucket
    3. Provide ACL on the object
    4. Create an IAM user with permission
  14. A bucket owner has allowed another account’s IAM users to upload or access objects in his bucket. The IAM user of Account A is trying to access an object created by the IAM user of account B. What will happen in this scenario?
    1. The bucket policy may not be created as S3 will give error due to conflict of Access Rights
    2. It is not possible to give permission to multiple IAM users
    3. AWS S3 will verify proper rights given by the owner of Account A, the bucket owner as well as by the IAM user B to the object
    4. It is not possible that the IAM user of one account accesses objects of the other IAM user

References

AWS_S3_Access_Control

AWS S3 Object Lifecycle Management

S3 Object Lifecycle Management

  • S3 Object lifecycle can be managed by using a lifecycle configuration, which defines how S3 manages objects during their lifetime.
  • Lifecycle configuration simplifies object lifecycle management, for e.g. moving less frequently accessed objects to cheaper storage, backup or archival of data for several years, or permanent deletion of objects
  • S3 controls all transitions automatically
  • Lifecycle Management rules applied to a bucket are applicable to all the existing objects in the bucket as well as the objects added later
  • S3 Object lifecycle management allows 2 types of behavior
    • Transition in which the storage class for the objects changes
    • Expiration where the objects expire and are permanently deleted
  • Lifecycle Management can be configured with Versioning, which allows storage of one current object version and zero or more non-current object versions
  • Object’s lifecycle management applies to both Non Versioning and Versioning enabled buckets
  • For Non Versioned buckets
    • Transitioning period is considered from the object’s creation date
  • For Versioned buckets,
    • Transitioning period for the current object is calculated for the object creation date
    • Transitioning period for a non-current object is calculated for the date when the object became a noncurrent versioned object
    • S3 uses the number of days since its successor was created as the number of days an object is noncurrent.
  • S3 calculates the time by adding the number of days specified in the rule to the object creation time and rounding the resulting time to the next day midnight UTC, for e.g. if an object was created at 15/1/2016 10:30 AM UTC and the rule specifies 3 days, the result is 18/1/2016 10:30 AM UTC, rounded off to the next day midnight 19/1/2016 00:00 UTC.
  • Lifecycle configuration on MFA-enabled buckets is not supported.
  • 1000 lifecycle rules can be configured per bucket
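
A minimal sketch of a lifecycle configuration (Python with boto3; bucket name, prefix, and day counts are hypothetical) combining both behaviors, transitions for current versions and expiration for current and noncurrent versions:

  import boto3

  s3 = boto3.client("s3")

  # Hypothetical bucket and prefix: transition to STANDARD_IA after 30 days,
  # archive to GLACIER after 90 days, expire current versions after 365 days,
  # and clean up noncurrent versions 30 days after they become noncurrent.
  s3.put_bucket_lifecycle_configuration(
      Bucket="examplebucket",
      LifecycleConfiguration={
          "Rules": [{
              "ID": "archive-and-expire-logs",
              "Filter": {"Prefix": "logs/"},
              "Status": "Enabled",
              "Transitions": [
                  {"Days": 30, "StorageClass": "STANDARD_IA"},
                  {"Days": 90, "StorageClass": "GLACIER"},
              ],
              "Expiration": {"Days": 365},
              "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
          }]
      },
  )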

S3 Object Lifecycle Management Rules

Lifecycle Transitions Constraints

  1. STANDARD -> (128 KB & 30 days) -> STANDARD-IA or One Zone-IA or S3 Intelligent-Tiering
    • Larger Objects – Only objects with a size more than 128 KB can be transitioned, as cost benefits for transitioning to STANDARD-IA or One Zone-IA can be realized only for larger objects
    • Smaller Objects < 128 KB – S3 does not transition objects that are smaller than 128 KB
    • Minimum 30 days – Objects must be stored for at least 30 days in the current storage class before being transitioned to the STANDARD-IA or One Zone-IA, as younger objects are accessed more frequently or deleted sooner than is suitable for STANDARD-IA or One Zone-IA
  2. GLACIER -> (90 days) -> Permanent Deletion OR GLACIER Deep Archive -> (180 days) -> Permanent Deletion
    • Deleting data that is archived to Glacier is free if the objects deleted are archived for three months or longer.
    • S3 charges a prorated early deletion fee if the object is deleted or overwritten within three months of archiving it.
  3. Archival of objects to Glacier by using object lifecycle management is performed asynchronously and there may be a delay between the transition date in the lifecycle configuration rule and the date of the physical transition. However, AWS charges Glacier prices based on the transition date specified in the rule
  4. For a versioning-enabled bucket
    • Transition and Expiration actions apply to current versions.
    • NoncurrentVersionTransition and NoncurrentVersionExpiration actions apply to noncurrent versions and work similarly to the non-versioned objects except the time period is from the time the objects became noncurrent
  5. Expiration Rules
    • For Non Versioned bucket
      • Object is permanently deleted
    • For Versioned bucket
      • Expiration is applicable to the Current object only and does not impact any of the non-current objects
      • S3 will insert a Delete Marker object with a unique id and the previous current object becomes a non-current version
      • S3 will not take any action if the Current object is a Delete Marker
      • If the bucket has a single object which is the Delete Marker (referred to as expired object delete marker), S3 removes the Delete Marker
    • For Versioned Suspended bucket
      • S3 will insert a Delete Marker object with version ID null and overwrite any object with version ID null
  6. When an object reaches the end of its lifetime, S3 queues it for removal and removes it asynchronously. There may be a delay between the expiration date and the date at which S3 removes an object. Charges for storage time associated with an object that has expired are not incurred.
  7. Cost is incurred if objects are expired in STANDARD-IA before 30 days, GLACIER before 90 days, and GLACIER_DEEP_ARCHIVE before 180 days.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. If an object is stored in the Standard S3 storage class and you want to move it to Glacier, what must you do in order to properly migrate it?
    1. Change the storage class directly on the object.
    2. Delete the object and re-upload it, selecting Glacier as the storage class.
    3. None of the above.
    4. Create a lifecycle policy that will migrate it after a minimum of 30 days. (Any object uploaded to S3 must first be placed into either the Standard, Reduced Redundancy, or Infrequent Access storage class. Once in S3 the only way to move the object to glacier is through a lifecycle policy)
  2. A company wants to store their documents in AWS. Initially, these documents will be used frequently, and after a duration of 6 months, they would not be needed anymore. How would you architect this requirement?
    1. Store the files in Amazon EBS and create a Lifecycle Policy to remove the files after 6 months.
    2. Store the files in Amazon S3 and create a Lifecycle Policy to remove the files after 6 months.
    3. Store the files in Amazon Glacier and create a Lifecycle Policy to remove the files after 6 months.
    4. Store the files in Amazon EFS and create a Lifecycle Policy to remove the files after 6 months.
  3. Your firm has uploaded a large amount of aerial image data to S3. In the past, in your on-premises environment, you used a dedicated group of servers to batch process this data and used Rabbit MQ, an open source messaging system, to get job information to the servers. Once processed the data would go to tape and be shipped offsite. Your manager told you to stay with the current design, and leverage AWS archival storage and messaging services to minimize cost. Which is correct?
    1. Use SQS for passing job messages, use Cloud Watch alarms to terminate EC2 worker instances when they become idle. Once data is processed, change the storage class of the S3 objects to Reduced Redundancy Storage (Need to replace On-Premises Tape functionality)
    2. Setup Auto-Scaled workers triggered by queue depth that use spot instances to process messages in SQS. Once data is processed, change the storage class of the S3 objects to Reduced Redundancy Storage (Need to replace On-Premises Tape functionality)
    3. Setup Auto-Scaled workers triggered by queue depth that use spot instances to process messages in SQS. Once data is processed, change the storage class of the S3 objects to Glacier (Glacier suitable for Tape backup)
    4. Use SNS to pass job messages use Cloud Watch alarms to terminate spot worker instances when they become idle. Once data is processed, change the storage class of the S3 object to Glacier.
  4. You have a proprietary data store on-premises that must be backed up daily by dumping the data store contents to a single compressed 50GB file and sending the file to AWS. Your SLAs state that any dump file backed up within the past 7 days can be retrieved within 2 hours. Your compliance department has stated that all data must be held indefinitely. The time required to restore the data store from a backup is approximately 1 hour. Your on-premise network connection is capable of sustaining 1gbps to AWS. Which backup methods to AWS would be most cost-effective while still meeting all of your requirements?
    1. Send the daily backup files to Glacier immediately after being generated (will not meet the RTO)
    2. Transfer the daily backup files to an EBS volume in AWS and take daily snapshots of the volume (Not cost effective)
    3. Transfer the daily backup files to S3 and use appropriate bucket lifecycle policies to send to Glacier (Store in S3 for seven days and then archive to Glacier)
    4. Host the backup files on a Storage Gateway with Gateway-Cached Volumes and take daily snapshots (Not Cost-effective as local storage as well as S3 storage)

References

AWS_S3_Object_Lifecycle_Management

AWS Interaction Tools – Certification

AWS Interaction Tools Overview

AWS is API driven, and AWS Interaction Tools provide plenty of options to interact with its services, which include the following.

AWS Management console

  • AWS Management console is a graphical user interface to access AWS
  • AWS Management Console requires credentials in the form of User Name and Password to log in and uses the Query APIs underneath for its interaction with AWS

AWS Command Line Interface (CLI)

  • AWS Command Line Interface (CLI) is a unified tool that provides a consistent interface for interacting with all parts of AWS
  • Provides commands for a broad set of AWS products, and is supported on Windows, Mac, and Linux
  • CLI requires Access Key & Secret Key credentials and uses the Query APIs underneath for its interaction with AWS
  • CLI constructs and sends requests to AWS for you and, as part of that process, signs the requests using an access key that you provide.
  • CLI also takes care of many of the connection details, such as calculating signatures, handling request retries, and error handling.

Software Development Kit (SDKs)

  • Software Development Kits (SDKs) simplify using AWS services in your applications with an API tailored to your programming language or platform
  • SDKs currently support a wide range of languages which include Java, PHP, Ruby, Python, .Net, GO, Node.js etc
  • SDKs construct and send requests to AWS for you, and as part of that process, they sign the requests using an access key that you provide.
  • SDKs also take care of many of the connection details, such as calculating signatures, handling request retries, and error handling.
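
A minimal sketch with the Python SDK (boto3): the client signs every request with the configured credentials and applies its retry logic, which can be tuned rather than implemented by hand.

  import boto3
  from botocore.config import Config

  # The SDK signs requests with the configured credentials and retries
  # throttled or failed calls according to the retry mode set here.
  s3 = boto3.client(
      "s3",
      config=Config(retries={"max_attempts": 5, "mode": "standard"}),
  )

  for bucket in s3.list_buckets()["Buckets"]:
      print(bucket["Name"])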

Query APIs

  • Query APIs are HTTP or HTTPS requests that use the HTTP verb GET or POST and a Query parameter named “Action”
  • Query APIs require Access Key & Secret Key credentials for the interaction
  • Query APIs are the core of all the access tools and require you to calculate signatures and attach them to the request
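
To illustrate, the sketch below signs a raw Query API request (EC2 DescribeRegions chosen as an arbitrary example) with Signature Version 4; it leans on botocore's signing helper and the requests library instead of hand-calculating the signature.

  import requests
  from botocore.auth import SigV4Auth
  from botocore.awsrequest import AWSRequest
  from botocore.session import Session

  # Build a raw Query API request (Action parameter in the query string) and
  # sign it with Signature Version 4 before sending it.
  credentials = Session().get_credentials()
  request = AWSRequest(
      method="GET",
      url="https://ec2.us-east-1.amazonaws.com/?Action=DescribeRegions&Version=2016-11-15",
  )
  SigV4Auth(credentials, "ec2", "us-east-1").add_auth(request)

  response = requests.get(request.url, headers=dict(request.headers))
  print(response.status_code)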

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. REST or Query requests are HTTP or HTTPS requests that use an HTTP verb (such as GET or POST) and a parameter named Action or Operation that specifies the API you are calling.
    1. FALSE
    2. TRUE (Refer link)
  2. Through which of the following interfaces is AWS Identity and Access Management available?
    A) AWS Management Console
    B) Command line interface (CLI)
    C) IAM Query API
    D) Existing libraries

    1. Only through Command line interface (CLI)
    2. A, B and C
    3. A and C
    4. All of the above
  3. Which of the following programming languages have an officially supported AWS SDK? Choose 2 answers
    1. PHP
    2. Pascal
    3. Java
    4. SQL
    5. Perl
  4. HTTP Query-based requests are HTTP requests that use the HTTP verb GET or POST and a Query parameter named_____________.
    1. Action
    2. Value
    3. Reset
    4. Retrieve

AWS DDoS Resiliency – Best Practices – Whitepaper

AWS DDoS Resiliency Whitepaper

  • Denial of Service (DoS) is an attack, carried out by a single attacker, which attempts to make a website or application unavailable to the end users.
  • Distributed Denial of Service (DDoS) is an attack, carried out by multiple attackers either controlled or compromised by a group of collaborators, which generates a flood of requests to the application making it unavailable to the legitimate end users

Mitigation techniques

Minimize the Attack Surface Area

  • This is all about reducing the attack surface, the different Internet entry points that allow access to your application
  • Strategy to minimize the Attack surface area
    • reduce the number of necessary Internet entry points,
    • don’t expose back end servers,
    • eliminate non-critical Internet entry points,
    • separate end user traffic from management traffic,
    • obfuscate necessary Internet entry points to the level that untrusted end users cannot access them, and
    • decouple Internet entry points to minimize the effects of attacks.
  • Benefits
    • Minimizes the effective attack vectors and targets
    • Less to monitor and protect
  • Strategy can be achieved using AWS Virtual Private Cloud (VPC)
    • helps define a logically isolated virtual network within the AWS
    • provides ability to create Public & Private Subnets to launch the internet facing and non-public facing instances accordingly
    • provides NAT gateway which allows instances in the private subnet to have internet access without the need to launch them in public subnets with Public IPs
    • allows creation of Bastion host which can be used to connect to instances in the private subnets
    • provides the ability to configure security groups for instances and NACLs for subnets, which act as a firewall, to control and limit outbound and inbound traffic
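
A minimal sketch of limiting Internet entry points (Python with boto3; the VPC ID and CIDR ranges are placeholders): a security group for the public-facing tier that exposes only HTTPS to the Internet and restricts SSH to a bastion range.

  import boto3

  ec2 = boto3.client("ec2")

  # Hypothetical security group: expose only HTTPS from the Internet to the
  # public-facing tier; management access (SSH) is limited to a bastion CIDR.
  sg = ec2.create_security_group(
      GroupName="web-entry-point",
      Description="Only HTTPS from the Internet",
      VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
  )
  ec2.authorize_security_group_ingress(
      GroupId=sg["GroupId"],
      IpPermissions=[
          {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
           "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
          {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
           "IpRanges": [{"CidrIp": "10.0.0.0/24"}]},  # bastion subnet only
      ],
  )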

VPC Architecture

Be Ready to Scale to Absorb the Attack

  • DDoS attacks mainly aim to load the systems to the point where they cannot handle the load and are rendered unusable.
  • Scaling out Benefits
    • help build a resilient architecture
    • makes the attacker work harder
    • gives you time to think, analyze and adapt
  • AWS provided services :-
    • Auto Scaling & ELB
      • Horizontal scaling using Auto Scaling with ELB
      • Auto Scaling allows instances to be added and removed as the demand changes
      • ELB helps distribute the traffic across multiple EC2 instances while acting as a Single point of contact.
      • Auto Scaling automatically registers and deregisters EC2 instances with the ELB during scale out and scale in events
    • EC2 Instance
      • Vertical scaling can be achieved by using appropriate EC2 instance types for e.g. EBS optimized or ones with 10 Gigabit network connectivity to handle the load
    • Enhanced Networking
      • Use Instances with Enhanced Networking capabilities which can provide high packet-per-second performance, low latency networking, and improved scalability
    • Amazon CloudFront
      • CloudFront is a CDN, acts as a proxy between end users and the Origin servers, and helps distribute content to the end users without sending traffic to the Origin servers.
      • CloudFront has the inherent ability to help mitigate against both infrastructure and some application layer DDoS attacks by dispersing the traffic across multiple locations.
      • AWS has multiple Internet connections for capacity and redundancy at each location, which allows it to isolate attack traffic while serving content to legitimate end users
      • CloudFront also has filtering capabilities to ensure that only valid TCP connections and HTTP requests are made while dropping invalid requests. This takes the burden of handling invalid traffic (commonly used in UDP & SYN floods, and slow reads) off the origin.
    • Route 53
      • DDoS attacks are also targeted towards DNS, because if the DNS is unavailable, your application is effectively unavailable.
      • AWS Route 53 is a highly available and scalable DNS service and has capabilities to ensure access to the application even when under a DDoS attack
        • Shuffle Sharding – Shuffle sharding is similar to the concept of database sharding, where horizontal partitions of data are spread across separate database servers to spread load and provide redundancy. Similarly, Amazon Route 53 uses shuffle sharding to spread DNS requests over numerous PoPs, thus providing multiple paths and routes for your application.
        • Anycast Routing – Anycast routing increases redundancy by advertising the same IP address from multiple PoPs. In the event that a DDoS attack overwhelms one endpoint, shuffle sharding isolates failures while providing additional routes to your infrastructure.

Safeguard Exposed & Hard to Scale Expensive Resources

  • If entry points cannot be limited, additional measures are needed to restrict access and protect those entry points without interrupting legitimate end user traffic
  • AWS provided services :-
    • CloudFront
      • CloudFront can restrict access to content using Geo Restriction and Origin Access Identity
      • With Geo Restriction, access can be restricted to a set of whitelisted countries or prevent access from a set of black listed countries
      • Origin Access Identity is the CloudFront special user which allows access to the resources only through CloudFront while denying direct access to the origin content for e.g. if S3 is the Origin for CloudFront, S3 can be configured to allow access only from OAI and hence deny direct access
    • Route 53
      • Route 53 provides two features Alias Record sets & Private DNS to make it easier to scale infrastructure and respond to DDoS attacks
    • WAF
      • WAFs act as filters that apply a set of rules to web traffic. Generally, these rules cover exploits like cross-site scripting (XSS) and SQL injection (SQLi) but can also help build resiliency against DDoS by mitigating HTTP GET or POST floods
      • WAF provides a lot of features like
        • OWASP Top 10
        • HTTP rate limiting (where only a certain number of requests are allowed per user in a timeframe),
        • Whitelist or blacklist (customizable rules)
        • inspect and identify requests with abnormal patterns,
        • CAPTCHA etc
      • To prevent WAF from being a Single point of failure, a WAF sandwich pattern can be implemented where an autoscaled WAF sits between the Internet and Internal Load Balancer

DDOS Resiliency - WAF Sandwich Architecture

Learn Normal Behavior

  • Understand the normal levels and patterns of traffic for your application and use that as a benchmark for identifying abnormal traffic levels or resource usage spikes
  • Benefits
    • allows one to spot abnormalities
    • configure Alarms with accurate thresholds
    • assists with generating forensic data
  • AWS provided services for tracking
    • AWS CloudWatch monitoring
      • CloudWatch can be used to monitor your infrastructure and applications running in AWS. Amazon CloudWatch can collect metrics, log files, and set alarms for when these metrics have passed predetermined thresholds
    • VPC Flow Logs
      • Flow Logs help capture traffic to the instances in a VPC and can be used to understand the traffic patterns
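
A minimal sketch of encoding such a baseline as a CloudWatch alarm (Python with boto3; instance ID, threshold, and SNS topic ARN are placeholders): the alarm fires when inbound network traffic stays well above the observed normal level.

  import boto3

  cloudwatch = boto3.client("cloudwatch")

  # Alarm when inbound network traffic on an instance stays well above its
  # normal baseline for three consecutive 5-minute periods.
  cloudwatch.put_metric_alarm(
      AlarmName="abnormal-network-in",
      Namespace="AWS/EC2",
      MetricName="NetworkIn",
      Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
      Statistic="Sum",
      Period=300,
      EvaluationPeriods=3,
      Threshold=5_000_000_000,  # bytes per period; tune to the observed baseline
      ComparisonOperator="GreaterThanThreshold",
      AlarmActions=["arn:aws:sns:us-east-1:111122223333:ops-alerts"],  # placeholder SNS topic
  )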

Create a Plan for Attacks

  • Have a plan in place before an attack, which ensures that:
    • Architecture has been validated and techniques selected work for the infrastructure
    • Costs for increased resiliency have been evaluated and the goals of your defense are understood
    • Contact points have been identified

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You are designing a social media site and are considering how to mitigate distributed denial-of-service (DDoS) attacks. Which of the below are viable mitigation techniques? (Choose 3 answers)
    1. Add multiple elastic network interfaces (ENIs) to each EC2 instance to increase the network bandwidth.
    2. Use dedicated instances to ensure that each instance has the maximum performance possible.
    3. Use an Amazon CloudFront distribution for both static and dynamic content.
    4. Use an Elastic Load Balancer with auto scaling groups at the web app and Amazon Relational Database Service (RDS) tiers
    5. Add alert Amazon CloudWatch to look for high Network in and CPU utilization.
    6. Create processes and capabilities to quickly add and remove rules to the instance OS firewall.
  2. You’ve been hired to enhance the overall security posture for a very large e-commerce site. They have a well architected multi-tier application running in a VPC that uses ELBs in front of both the web and the app tier with static assets served directly from S3. They are using a combination of RDS and DynamoDB for their dynamic data and then archiving nightly into S3 for further processing with EMR. They are concerned because they found questionable log entries and suspect someone is attempting to gain unauthorized access. Which approach provides a cost effective scalable mitigation to this kind of attack?
    1. Recommend that they lease space at a DirectConnect partner location and establish a 1G DirectConnect connection to their VPC they would then establish Internet connectivity into their space, filter the traffic in hardware Web Application Firewall (WAF). And then pass the traffic through the DirectConnect connection into their application running in their VPC. (Not cost effective)
    2. Add previously identified hostile source IPs as an explicit INBOUND DENY NACL to the web tier subnet. (does not protect against new source)
    3. Add a WAF tier by creating a new ELB and an AutoScaling group of EC2 Instances running a host-based WAF. They would redirect Route 53 to resolve to the new WAF tier ELB. The WAF tier would then pass the traffic to the current web tier. The web tier Security Groups would be updated to only allow traffic from the WAF tier Security Group
    4. Remove all but TLS 1.2 from the web tier ELB and enable Advanced Protocol Filtering. This will enable the ELB itself to perform WAF functionality. (No advanced protocol filtering in ELB)

References

DDOS Whitepaper