AWS Services Overview – Whitepaper – Certification

AWS Services Overview

AWS consists of many cloud services that can be used in combinations tailored to meet business or organizational needs. This section introduces the major AWS services by category.


NOTE – This post provides a brief overview of AWS services. It is a good introduction for anyone starting any of the certifications, but it is most relevant to the AWS Cloud Practitioner Certification Exam.


Common Features

  • Access to almost all features can be controlled through AWS Identity and Access Management (IAM)
  • Services managed by AWS are all made Scalable and Highly Available, without any changes needed from the user

AWS Access

AWS allows accessing its services through unified tools using

  • AWS Management Console – a simple and intuitive user interface
  • AWS Command Line Interface (CLI) – programmatic access through scripts
  • AWS Software Development Kits (SDKs) – programmatic access through an Application Programming Interface (API) tailored to a programming language (Java, .NET, Node.js, PHP, Python, Ruby, Go, C++, AWS Mobile SDK) or platform (Android, Browser, iOS)

Security, Identity, and Compliance

Amazon Cloud Directory

  • enables building flexible, cloud-native directories for organizing hierarchies of data along multiple dimensions, whereas traditional directory solutions limit you to a single hierarchy
  • helps create directories for a variety of use cases, such as organizational charts, course catalogs, and device registries.

AWS Identity and Access Management

  • enables you to securely control access to AWS services and resources for the users.
  • allows creation of AWS users, groups and roles, and use permissions to allow and deny their access to AWS resources
  • helps manage IAM users and their access with individual security credentials like access keys, passwords, and multi-factor authentication devices, or request temporary security credentials to provide users access to AWS services and resources
  • helps create roles and manage permissions to control which operations can be performed by the entity, or AWS service, that assumes the role
  • enables identity federation to allow existing identities (users, groups, and roles) in the enterprise to access AWS Management Console, call AWS APIs, access resources, without the need to create an IAM user for each identity.
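
The allow and deny behaviour of permissions can be illustrated with a toy evaluator. This is a simplified sketch, not the IAM engine: it assumes a single identity-based policy with only Effect, Action, and Resource, matches wildcards with shell-style patterns, and ignores conditions, principals, and case-insensitive action names. The policy contents, the bucket name, and the is_allowed helper are hypothetical.

```python
from fnmatch import fnmatchcase

# A hypothetical identity-based policy in the standard IAM JSON layout.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Deny", "Action": "s3:DeleteObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
}


def _as_list(value):
    """Action and Resource may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy, action, resource):
    """Simplified evaluation: default deny, and an explicit Deny beats any Allow."""
    allowed = False
    for stmt in policy["Statement"]:
        action_match = any(fnmatchcase(action, a) for a in _as_list(stmt["Action"]))
        resource_match = any(fnmatchcase(resource, r) for r in _as_list(stmt["Resource"]))
        if not (action_match and resource_match):
            continue
        if stmt["Effect"] == "Deny":
            return False          # explicit deny wins immediately
        allowed = True
    return allowed
```

With this policy, reading an object in example-bucket is allowed, deleting it is denied by the explicit Deny statement, and any action the policy does not mention falls through to the default deny.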

Amazon Inspector

  • is an automated security assessment service that helps improve the security and compliance of applications deployed on AWS.
  • automatically assesses applications for vulnerabilities or deviations from best practices
  • produces a detailed list of security findings prioritized by level of severity.

AWS Certificate Manager

  • helps provision, manage, and deploy Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with AWS services like ELB
  • removes the time-consuming manual process of purchasing, uploading, and renewing SSL/TLS certificates.

AWS CloudHSM

  • helps meet corporate, contractual, and regulatory compliance requirements for data security by using dedicated Hardware Security Module (HSM) appliances within the AWS Cloud.
  • allows protection of encryption keys within HSMs, designed and validated to government standards for secure key management.
  • helps comply with strict key management requirements without sacrificing application performance.

AWS Directory Service

  • provides Microsoft Active Directory (Enterprise Edition), also known as AWS Microsoft AD, that enables directory-aware workloads and AWS resources to use managed Active Directory in the AWS Cloud.

AWS Key Management Service

  • is a managed service that makes it easy to create and control the encryption keys used to encrypt your data.
  • uses HSMs to protect the security of your keys.

AWS Organizations

  • allows creation of groups of AWS accounts, to more easily manage security and automation settings collectively
  • helps centrally manage multiple accounts to help scale.
  • helps to control which AWS services are available to individual accounts, automate new account creation, and simplify billing.

AWS Shield

  • is a managed Distributed Denial of Service (DDoS) protection service that safeguards web applications running on AWS.
  • provides always-on detection and automatic inline mitigations that minimize application downtime and latency, so there is no need to engage AWS Support to benefit from DDoS protection.
  • provides two tiers of AWS Shield: Standard and Advanced.

AWS WAF

  • is a web application firewall that helps protect web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.
  • gives complete control over which traffic to allow or block to the web application by defining customizable web security rules.

AWS Compute Services

Amazon Elastic Compute Cloud (EC2)

  • provides secure, resizable compute capacity
  • provides complete control of the computing resources (root access, ability to start, stop, terminate instances etc.)
  • reduces the time required to obtain and boot new instances to minutes
  • allows quick scaling of capacity, both up and down, as the computing requirements change
  • provides developers and sysadmins tools to build failure resilient applications and isolate themselves from common failure scenarios.
  • Benefits
    • Elastic Web-Scale Computing
      • enables scaling to increase or decrease capacity within minutes, not hours or days.
    • Flexible Cloud Hosting Services
      • flexibility to choose from multiple instance types, operating systems, and software packages.
      • selection of memory configuration, CPU, instance storage, and boot partition size
    • Reliable
      • offers a highly reliable environment where replacement instances can be rapidly and predictably commissioned.
      • runs within AWS’s proven network infrastructure and data centers.
      • EC2 Service Level Agreement (SLA) commitment is 99.95% availability for each Region.
    • Secure
      • works in conjunction with VPC to provide security and robust networking functionality for your compute resources.
      • allows control of IP address, exposure to Internet (using subnets), inbound and outbound access (using Security groups and NACLs)
      • existing IT infrastructure can be connected to the resources in the VPC using industry-standard encrypted IPsec virtual private network (VPN) connections
    • Inexpensive – pay only for the capacity actually used
  • EC2 Purchasing Options and Types
    • On-Demand Instances
      • pay for compute capacity by the hour with no long-term commitments
      • lets you increase or decrease compute capacity depending on demand and pay only the specified hourly rate for the instances used
      • frees from the costs and complexities of planning, purchasing, and maintaining hardware and transforms what are commonly large fixed costs into much smaller variable costs.
      • also helps remove the need to buy “safety net” capacity to handle periodic traffic spikes.
    • Reserved Instances
      • provides significant discount (up to 75%) compared to On-Demand instance pricing.
      • provides flexibility to change families, operating system types, and tenancies with Convertible Reserved Instances.
    • Spot Instances
      • allow you to bid on spare EC2 computing capacity.
      • are often available at a discount compared to On-Demand pricing, helping reduce the cost of running an application or grow its compute capacity and throughput for the same budget
    • Dedicated Instances – that run on hardware dedicated to a single customer for additional isolation.
    • Dedicated Hosts
      • are physical servers with EC2 instance capacity fully dedicated to your use.
      • can help you address compliance requirements and reduce costs by allowing you to use your existing server-bound software licenses.
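
The effect of the purchasing options on cost is simple arithmetic. The sketch below assumes a made-up On-Demand rate of 0.10 USD per hour (not a real AWS price) and uses the 75% figure from the list above as a best-case Reserved Instance discount; yearly_cost is a hypothetical helper.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def yearly_cost(hourly_rate, hours_used=HOURS_PER_YEAR, discount=0.0):
    """Cost of one instance for the given hours at an optional fractional discount."""
    return hourly_rate * hours_used * (1.0 - discount)


on_demand_rate = 0.10  # hypothetical USD per hour, not a real AWS price

always_on = yearly_cost(on_demand_rate)                       # runs all year, On-Demand
reserved_best_case = yearly_cost(on_demand_rate, discount=0.75)
business_hours_only = yearly_cost(on_demand_rate, hours_used=8 * 5 * 52)

print(f"On-Demand, always on:      {always_on:8.2f} USD")
print(f"Reserved, 75% discount:    {reserved_best_case:8.2f} USD")
print(f"On-Demand, business hours: {business_hours_only:8.2f} USD")
```

Under these assumptions a steadily running workload is cheapest on a Reserved Instance, while a workload that runs only 40 hours a week can be cheaper On-Demand than even a heavily discounted always-on commitment.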

Amazon EC2 Container Service

  • is a highly scalable, high-performance container management service that supports Docker containers.
  • allows running applications on a managed cluster of EC2 instances
  • eliminates the need to install, operate, and scale cluster management infrastructure.
  • can be used to schedule the placement of containers across the cluster based on the resource needs and availability requirements.
  • custom scheduler or third-party schedulers can be integrated to meet business or application-specific requirements.

Amazon EC2 Container Registry

  • is a fully-managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.
  • is integrated with Amazon EC2 Container Service (ECS), simplifying development to production workflow.
  • eliminates the need to operate container repositories or worry about scaling the underlying infrastructure.
  • hosts images in a highly available and scalable architecture
  • pay only for the amount of data stored and data transferred to the Internet.

Amazon Lightsail

  • is designed to be the easiest way to launch and manage a virtual private server with AWS.
  • plans include everything needed to jumpstart a project – a virtual machine, SSD-based storage, data transfer, DNS management, and a static IP address – for a low, predictable price.

AWS Batch

  • enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS.
  • dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory-optimized instances) based on the volume and specific resource requirements of the batch jobs submitted.
  • plans, schedules, and executes the batch computing workloads across the full range of AWS compute services and features

AWS Elastic Beanstalk

  • is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and Internet Information Services (IIS)
  • automatically handles the deployment, from capacity provisioning, load balancing, and auto scaling to application health monitoring.
  • provides full control over the AWS resources with access to the underlying resources at any time.

AWS Lambda

  • enables running code with zero administration: there are no servers to provision or manage, and scaling for high availability is handled automatically
  • pay only for the compute time consumed – there is no charge when the code is not running
  • can be set up to be triggered automatically from other AWS services, or called directly from any web or mobile app.
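
A Lambda function in Python is an ordinary function that takes an event and a context. The sketch below shows that shape and calls it locally; the "name" field in the event and the statusCode/body response layout are assumptions chosen for illustration, not something this overview prescribes.

```python
import json


def lambda_handler(event, context):
    """Entry point in the (event, context) form used by Python Lambda functions."""
    name = event.get("name", "world")  # "name" is a made-up field for this sketch
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local invocation with a hand-written event; on AWS the service supplies both arguments.
response = lambda_handler({"name": "AWS"}, None)
print(response["body"])
```

Because the handler is plain Python, it can be exercised in unit tests like this before it is ever deployed.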

Auto Scaling

  • helps maintain application availability
  • allows scaling EC2 capacity up or down automatically according to defined conditions or demand spikes to reduce cost
  • helps ensure the desired number of EC2 instances is always running
  • well suited both to applications that have stable demand patterns and applications that experience hourly, daily, or weekly variability in usage.
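
Scaling "according to defined conditions" can be pictured as a threshold rule bounded by a minimum and maximum group size. The sketch below is an illustration only and is not the algorithm the service uses; the function name, thresholds, and group sizes are all hypothetical.

```python
def next_capacity(current, cpu_percent, min_size=2, max_size=10,
                  scale_out_above=70.0, scale_in_below=30.0):
    """Return the new instance count for one evaluation of a simple threshold policy."""
    if cpu_percent > scale_out_above:
        desired = current + 1      # demand spike: add an instance
    elif cpu_percent < scale_in_below:
        desired = current - 1      # idle capacity: remove an instance to save cost
    else:
        desired = current
    # Clamping keeps the group between its minimum and maximum size.
    return max(min_size, min(max_size, desired))
```

For example, next_capacity(4, 85.0) grows the group to 5, while next_capacity(2, 5.0) stays at the minimum of 2, which is how availability is kept even when load is low.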

Storage

Simple Storage Service

  • is object storage with a simple web service interface to store and retrieve any amount of data from anywhere on the web.
  • S3 Features
    • Durable
      • designed for durability of 99.999999999% of objects
      • data is redundantly stored across multiple facilities and multiple devices in each facility.
    • Available – designed for up to 99.99% availability (standard) of objects over a given year and is backed by the S3 Service Level Agreement
    • Scalable – can help store virtually unlimited data
    • Secure
      • supports encryption of data in transit over SSL and encryption of data at rest
      • bucket policies and IAM can help manage object permissions and control access to the data
    • Low Cost
      • provides storage at a very low cost.
      • using lifecycle policies, the data can be automatically tiered into lower cost, longer-term cloud storage classes like S3 Standard – Infrequent Access and Glacier for archiving.
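
The durability and availability percentages above are easier to grasp as concrete numbers. The sketch below treats durability as the yearly probability that a given object survives and availability as the fraction of the year the service is reachable; the helper names are hypothetical.

```python
DURABILITY = 0.99999999999   # eleven nines, per object per year
AVAILABILITY = 0.9999        # S3 Standard design target

MINUTES_PER_YEAR = 365 * 24 * 60


def expected_objects_lost_per_year(object_count, durability=DURABILITY):
    """Average number of objects lost in a year at the given durability."""
    return object_count * (1.0 - durability)


def allowed_downtime_minutes(availability=AVAILABILITY):
    """Minutes per year the service may be unavailable at the given availability."""
    return MINUTES_PER_YEAR * (1.0 - availability)


lost = expected_objects_lost_per_year(10_000_000)
print(f"10 million objects: about {lost:.4f} lost per year "
      f"(one object every {1 / lost:,.0f} years)")
print(f"99.99% availability: about {allowed_downtime_minutes():.2f} minutes of downtime per year")
```

Storing ten million objects therefore corresponds to an average loss of one object roughly every 10,000 years, and 99.99% availability to a little under an hour of downtime per year.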

Elastic Block Store (EBS)

  • provides persistent block storage volumes for use with EC2 instances
  • offers the consistent and low-latency performance needed to run workloads.
  • allows scaling up or down within minutes – all while paying a low price for only what is provisioned
  • EBS Features
    • High Performance Volumes – Choose between SSD backed or HDD backed volumes to deliver the performance needed
    • Availability
      • is designed for 99.999% availability
      • automatically replicates within its Availability Zone to protect from component failure, offering high availability and durability.
    • Encryption – provides seamless encryption support for data at rest and for data in transit between EC2 instances and EBS volumes.
    • Snapshots – protect data by creating point-in-time snapshots of EBS volumes, which are backed up to S3 for long-term durability.

Elastic File System (EFS)

  • provides simple, scalable file storage for use with EC2 instances
  • storage capacity is elastic, growing and shrinking automatically as files are added and removed
  • provides a standard file system interface and file system access semantics, when mounted on EC2 instances
  • works in shared mode, where multiple EC2 instances can access an EFS file system at the same time, allowing EFS to provide a common data source for workloads and applications running on more than one EC2 instance.
  • can be mounted on on-premises data center servers when connected to the VPC with AWS Direct Connect.
  • can be mounted on on-premises servers to migrate data sets to EFS, enable cloud bursting scenarios, or backup on-premises data to EFS.
  • is designed for high availability and durability, and provides performance for a broad spectrum of workloads and applications, including big data and analytics, media processing workflows, content management, web serving, and home directories.

Glacier

  • provides secure, durable, and extremely low-cost storage service for data archiving and long-term backup
  • To keep costs low yet suitable for varying retrieval needs, Glacier provides three options for access to archives, from a few minutes to several hours.

AWS Storage Gateway

  • seamlessly enables hybrid storage between on-premises storage environments and the AWS Cloud
  • combines a multi-protocol storage appliance with highly efficient network connectivity to AWS cloud storage services, delivering local performance with virtually unlimited scale.
  • use it in remote offices and data centers for hybrid cloud workloads involving migration, bursting, and storage tiering

Databases

Aurora

  • is a MySQL and PostgreSQL compatible relational database engine
  • provides the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.
  • Benefits
    • Highly Secure
      • provides multiple levels of security, including
        • network isolation using VPC
        • encryption at rest using keys created and controlled through AWS Key Management Service (KMS), and
        • encryption of data in transit using SSL.
      • with an encrypted Aurora instance, automated backups, snapshots, and replicas are also encrypted
    • Highly Scalable – automatically grows storage as needed
    • High Availability and Durability
      • designed to offer greater than 99.99% availability
      • recovery from physical storage failures is transparent, and instance failover typically requires less than 30 seconds
      • is fault-tolerant and self-healing. Six copies of the data are replicated across three AZs and continuously backed up to S3.
      • automatically and continuously monitors and backs up your database to S3, enabling granular point-in-time recovery.
    • Fully Managed – is a fully managed database service, so database management tasks such as hardware provisioning, software patching, setup, configuration, monitoring, and backups are taken care of

Relational Database Service (RDS)

  • makes it easy to set up, operate, and scale a relational database
  • provides cost-efficient and resizable capacity while managing time-consuming database administration tasks
  • supports various database engines, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server
  • Benefits
    • Fast and Easy to Administer – No need for infrastructure provisioning, and no need for installing and maintaining database software.
    • Highly Scalable
      • allows quick and easy scaling of database’s compute and storage resources, often with no downtime.
      • allows offloading read traffic from the primary database using Read Replicas, for some RDS engine types
    • Available and Durable
      • runs on the same highly reliable infrastructure used by other AWS services
      • allows Multi-AZ DB instance, where RDS synchronously replicates the data to a standby instance in a different Availability Zone (AZ).
      • enhances reliability for critical production databases, by enabling automated backups, database snapshots, and automatic host replacement.
    • Secure
      • provides multiple levels of security, including
        • network isolation using VPC
        • connectivity to existing on-premises IT infrastructure through an industry-standard encrypted IPsec VPN
        • encryption at rest using keys created and controlled through AWS Key Management Service (KMS), and
        • support for encryption at rest and encryption in transit.
      • with an encrypted instance, automated backups, snapshots, and replicas are also encrypted
    • Inexpensive – pay very low rates and only for the consumed resources, while taking advantage of on-demand and reserved instance types

DynamoDB

  • fully managed, fast and flexible NoSQL database service for applications that need consistent, single-digit millisecond latency at any scale.
  • supports both document and key-value data models.
  • flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, Internet of Things (IoT), and other applications
  • Benefits
    • Fast, Consistent Performance
      • designed to deliver consistent, fast performance at any scale
      • uses automatic partitioning and SSD technologies to meet throughput requirements and deliver low latencies at any scale.
    • Highly Scalable – it manages all the scaling to achieve the specified throughput capacity requirements
    • Event-Driven Programming – integrates with AWS Lambda to provide Triggers that enable architecting applications that automatically react to data changes.
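
The key-value and document models can be seen in how an item is represented: every attribute carries a type tag, and nested maps and lists make up the document part. The converter below is a minimal sketch covering only strings, numbers, booleans, nulls, lists, and maps; the function name and the sample item are hypothetical, and real SDKs ship their own serializers.

```python
def to_attribute_value(value):
    """Convert a plain Python value into DynamoDB's typed JSON representation."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):          # must precede the int check: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}         # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical item: "device_id" would act as the key, the rest is a nested document.
item = {
    "device_id": "sensor-42",
    "online": True,
    "reading": {"temp_c": 21.5, "tags": ["lab", "iot"]},
}
wire_item = {name: to_attribute_value(v) for name, v in item.items()}
```

Here device_id becomes {"S": "sensor-42"}, while the nested reading becomes an "M" map whose members are typed in the same way.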

ElastiCache

  • is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud.
  • helps improve the performance of web applications by caching results, so information can be retrieved from fast, managed, in-memory caches instead of relying entirely on slower disk-based databases.
  • supports two open-source in-memory caching engines: Redis and Memcached
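
The usual way an application benefits from an in-memory cache is the cache-aside pattern: look in the cache first and go to the database only on a miss. The sketch below uses a Python dict as a stand-in for Redis or Memcached; the CacheAside class, the slow_lookup function, and the 60-second expiry are hypothetical.

```python
import time


class CacheAside:
    """Cache-aside helper: check the cache first, fall back to the slow store on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._entries = {}   # key -> (expiry_timestamp, value); stands in for the cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = self._load(key)                       # slow path: query the database
        self._entries[key] = (time.monotonic() + self._ttl, value)
        return value


db_calls = []


def slow_lookup(user_id):
    db_calls.append(user_id)          # record each trip to the "database"
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(slow_lookup)
first = cache.get(7)    # miss: loads from the database and fills the cache
second = cache.get(7)   # hit: served from memory
```

The second lookup never reaches slow_lookup, which is the saving a managed cache is meant to provide at scale.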

Migration

AWS Application Discovery Service

  • helps systems integrators quickly and reliably plan application migration projects by automatically identifying applications running in on-premises data centers, their associated dependencies, and performance profiles
  • automatically collects configuration and usage data from servers, storage, and networking equipment to develop a list of applications, how they perform, and how they are interdependent
  • information is retained in encrypted format in an AWS Application Discovery Service database, which you can export as a CSV or XML file into your preferred visualization tool or cloud migration solution to help reduce the complexity and time in planning your cloud migration.

AWS Database Migration Service

  • helps migrate databases to AWS easily and securely
  • source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.
  • supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora or Microsoft SQL Server to MySQL.
  • allows streaming of data to Redshift from any of the supported sources including Aurora, PostgreSQL, MySQL, MariaDB, Oracle, SAP ASE, and SQL Server, enabling consolidation and easy analysis of data in the petabyte-scale data warehouse
  • can also be used for continuous data replication with high availability.

AWS Server Migration Service

  • is an agentless service which makes it easier and faster to migrate thousands of on-premises workloads to AWS

Snowball

  • is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS.
  • addresses common challenges with large-scale data transfers including high network costs, long transfer times, and security concerns.
  • uses multiple layers of security designed to protect the data including tamper resistant enclosures, 256-bit encryption, and an industry-standard Trusted Platform Module (TPM) designed to ensure both security and full chain of custody of your data.
  • performs a software erasure of the Snowball appliance, once the data transfer job has been processed
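
The "long transfer times" that Snowball addresses follow directly from link speed. The sketch below estimates how long a bulk upload would take over a network link; the data sizes and link speeds are illustrative, and transfer_days is a hypothetical helper.

```python
def transfer_days(data_terabytes, link_mbps, utilization=1.0):
    """Days needed to push the data over a network link at the given utilization."""
    bits = data_terabytes * 1e12 * 8            # decimal terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400


days_100tb_100mbps = transfer_days(100, 100)     # dedicated 100 Mbps link
days_100tb_1gbps = transfer_days(100, 1000)      # dedicated 1 Gbps link

print(f"100 TB over 100 Mbps: {days_100tb_100mbps:.1f} days")
print(f"100 TB over 1 Gbps:   {days_100tb_1gbps:.1f} days")
```

Even with the link fully dedicated to the transfer, 100 TB takes about 93 days at 100 Mbps, which is the kind of case where shipping an appliance is faster.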

Snowball Edge

  • is a 100 TB data transfer device with on-board storage and compute capabilities.
  • can be used to move large amounts of data into and out of AWS, as a temporary storage tier for large local datasets, or to support local workloads in remote or offline locations.
  • multiple devices can be clustered together to form a local storage tier and process the data on-premises, helping ensure the applications continue to run even when they are not able to access the cloud

Snowmobile

  • is an exabyte-scale data transfer service used to move extremely large amounts of data to AWS.
  • provides secure, fast, and cost effective transfer of data
  • data can be imported into S3 or Glacier once it has been loaded
  • uses multiple layers of security designed to protect the data including dedicated security personnel, GPS tracking, alarm monitoring, 24/7 video surveillance, and an optional escort security vehicle while in transit.
  • all data is encrypted with 256-bit encryption keys managed through KMS and designed to ensure both security and full chain of custody of the data

Networking and Content Delivery

Virtual Private Cloud (VPC)

  • helps provision a logically isolated section of the AWS Cloud where AWS resources can be launched in a virtual network that you define
  • provides complete control over the virtual networking environment, including selection of IP address range, creation of subnets (public and private), and configuration of route tables and network gateways.
  • allows use of both IPv4 and IPv6 for secure and easy access to resources and applications
  • allows multiple layers of security, including security groups and network access control lists, to help control access to resources
  • allows creation of a hardware virtual private network (VPN) connection between the corporate data center and VPC and leverage the AWS Cloud as an extension of corporate data center.
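
Choosing an IP address range and carving it into subnets can be tried out with Python's standard ipaddress module before anything is created in AWS. The 10.0.0.0/16 range and the two-public, two-private split below are assumptions made for this sketch.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")       # hypothetical VPC address range

# Carve the VPC range into /24 blocks and take the first four.
blocks = list(vpc.subnets(new_prefix=24))[:4]
public_subnets = blocks[:2]
private_subnets = blocks[2:]

for net in public_subnets:
    print("public ", net, net.num_addresses, "addresses")
for net in private_subnets:
    print("private", net, net.num_addresses, "addresses")
```

Each /24 block holds 256 addresses; in a real VPC a few addresses in every subnet are reserved by AWS, so slightly fewer are usable by instances.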

CloudFront

  • is a global content delivery network (CDN) service that accelerates delivery of websites, APIs, video content, or other web assets.
  • can be used to deliver entire website, including dynamic, static, streaming, and interactive content using a global network of edge locations.
  • allows requests for the content to be automatically routed to the nearest edge location, so content is delivered with the best possible performance.
  • is optimized to work with other services in AWS, such as S3, EC2, ELB, and Route 53 as well as with any non-AWS origin server that stores the original, definitive versions of your files.

Route 53

  • is a highly available and scalable Domain Name System (DNS) web service
  • effectively connects user requests to infrastructure running in AWS – such as EC2 instances, ELB, or S3 buckets – and can also be used to route users to infrastructure outside of AWS.
  • helps configure DNS health checks to route traffic to healthy endpoints or to independently monitor the health of your application and its endpoints.
  • allows traffic management globally through a variety of routing types, including latency-based routing, Geo DNS, and weighted round robin – all of which can be combined with DNS Failover in order to enable a variety of low-latency, fault-tolerant architectures.
  • is fully compliant with IPv6 as well
  • offers Domain Name Registration service
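
Weighted routing means each endpoint is returned in proportion to its weight. The sketch below simulates that idea with a seeded random generator; it is not how Route 53 is implemented, and the host names, weights, and pick_endpoint helper are hypothetical.

```python
import random
from collections import Counter


def pick_endpoint(weighted_endpoints, rng):
    """Choose one endpoint with probability proportional to its weight."""
    names = list(weighted_endpoints)
    weights = [weighted_endpoints[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical record set: send roughly 90% of answers to "blue" and 10% to "green".
endpoints = {"blue.example.com": 90, "green.example.com": 10}

rng = random.Random(0)   # fixed seed so the sketch is repeatable
tally = Counter(pick_endpoint(endpoints, rng) for _ in range(10_000))
print(tally)
```

A split like this is a common way to shift a small share of traffic to a new deployment before moving everything over.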

Direct Connect

  • makes it easy to establish a dedicated network connection from on-premises to AWS
  • helps establish private connectivity between AWS and a data center, office, or co-location environment
  • helps increase bandwidth throughput, reduce network costs, and provide a more consistent network experience than Internet-based connections

Elastic Load Balancing (ELB)

  • automatically distributes incoming application traffic across multiple EC2 instances
  • helps achieve greater levels of fault tolerance by seamlessly providing the load balancing capacity needed to distribute application traffic.
  • offers two types of load balancers that both feature high availability, automatic scaling, and robust security.
    • Classic Load Balancer
      • routes traffic based on either application or network level information
      • ideal for simple load balancing of traffic across multiple EC2 instances
    • Application Load Balancer
      • routes traffic based on advanced application-level information that includes the content of the request
      • ideal for applications needing advanced routing capabilities, microservices, and container-based architectures.
      • offers the ability to route traffic to multiple services or load balance across multiple ports on the same EC2 instance.
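
Routing on "the content of the request" can be pictured as an ordered list of rules that map a URL path pattern to a group of targets. The sketch below is an illustration of that idea only; the rule patterns, service names, and route helper are hypothetical and do not reflect the load balancer's actual rule syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical listener rules, checked in order; the last one acts as the default.
RULES = [
    ("/api/*", "api-service"),
    ("/images/*", "image-service"),
    ("*", "web-frontend"),
]


def route(path, rules=RULES):
    """Return the target group name of the first rule whose pattern matches the path."""
    for pattern, target_group in rules:
        if fnmatchcase(path, pattern):
            return target_group
    raise LookupError(f"no rule matches {path!r}")
```

With these rules, route("/api/orders/17") goes to api-service and route("/index.html") falls through to web-frontend, which is the kind of split a microservices architecture relies on.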

Management Tools

AWS CloudWatch

  • is a monitoring and logging service for AWS Cloud resources and the applications running on AWS.
  • can be used to collect and track metrics, collect and monitor log files, set alarms, and automatically react to changes in the AWS resources.

AWS CloudFormation

  • allows developers and systems administrators to implement “Infrastructure as Code”
  • provides an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion
  • handles the order for provisioning AWS services or the subtleties of making those dependencies work.
  • allows applying version control to the AWS infrastructure the same way it is done with software
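
"Infrastructure as Code" means the resources are described in a template file that can be versioned like source code. The sketch below builds a minimal template as a Python dict and works out a creation order from its DependsOn entries; the two bucket resources and the creation_order helper are hypothetical, and the ordering logic is a simplification of what the service does (it needs Python 3.9 or later for graphlib).

```python
import json
from graphlib import TopologicalSorter

# Hypothetical template: a log bucket and a site bucket that depends on it.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch: two S3 buckets with an explicit dependency",
    "Resources": {
        "LogBucket": {"Type": "AWS::S3::Bucket"},
        "SiteBucket": {"Type": "AWS::S3::Bucket", "DependsOn": "LogBucket"},
    },
}


def creation_order(tpl):
    """Order resources so each one comes after everything it DependsOn."""
    graph = {}
    for name, resource in tpl["Resources"].items():
        deps = resource.get("DependsOn", [])
        graph[name] = [deps] if isinstance(deps, str) else list(deps)
    return list(TopologicalSorter(graph).static_order())


print(json.dumps(template, indent=2))   # the text that would be committed to version control
print(creation_order(template))
```

The printed JSON is what would live in a repository, and the computed order shows LogBucket being provisioned before SiteBucket.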

AWS CloudTrail

  • records AWS API calls for the account and delivers log files
  • covers API calls made using the AWS Management Console, AWS SDKs, command line tools, and higher-level AWS services (such as AWS CloudFormation)
  • recorded information includes the identity of the API caller, the time of the API call, the source IP address of the API caller, the request parameters, and the response elements returned by the AWS service.
  • enables security analysis, resource change tracking, compliance auditing
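
The recorded information lends itself to simple scripted analysis. The sketch below parses a hand-written log body that uses CloudTrail's field names; every value in it (user, IP address, instance ID, timestamp) is made up for illustration, and summarize is a hypothetical helper.

```python
import json

# Made-up record using CloudTrail's field names; every value here is illustrative.
SAMPLE_LOG = json.dumps({
    "Records": [{
        "eventTime": "2017-01-01T12:00:00Z",
        "eventSource": "ec2.amazonaws.com",
        "eventName": "StartInstances",
        "sourceIPAddress": "203.0.113.10",
        "userIdentity": {"type": "IAMUser", "userName": "alice"},
        "requestParameters": {
            "instancesSet": {"items": [{"instanceId": "i-0123456789abcdef0"}]}
        },
        "responseElements": None,
    }]
})


def summarize(log_text):
    """Yield (who, what, when, from_where) for each record in a log file body."""
    for record in json.loads(log_text)["Records"]:
        identity = record.get("userIdentity", {})
        who = identity.get("userName") or identity.get("type", "unknown")
        yield who, record["eventName"], record["eventTime"], record["sourceIPAddress"]


summary = list(summarize(SAMPLE_LOG))
print(summary)
```

A summary of who called what, when, and from where is the starting point for the security analysis and change tracking mentioned above.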

AWS Config

  • provides an AWS resource inventory, configuration history, and configuration change notifications to enable security and governance
  • provides the Config Rules feature, which enables creation of rules that automatically check the configuration of AWS resources
  • helps discover existing and deleted AWS resources, determine overall compliance against rules, and dive into configuration details of a resource at any point in time.
  • enables compliance auditing, security analysis, resource change tracking, and troubleshooting.

AWS OpsWorks

  • configuration management service that uses Chef, an automation platform that treats server configurations as code.
  • uses Chef to automate how servers are configured, deployed, and managed across the EC2 instances or on-premises compute environments.
  • has two offerings, OpsWorks for Chef Automate and OpsWorks Stacks

AWS Service Catalog

  • allows organizations to create and manage catalogs of IT services that are approved for use on AWS.
  • helps centrally manage commonly deployed IT services and helps to achieve consistent governance and meet compliance requirements, while enabling users to quickly deploy only approved IT services they need
  • can include everything from virtual machine images, servers, software, and databases to complete multi-tier application architectures.

AWS Trusted Advisor

  • is an online resource to help reduce cost, increase performance, and improve security by optimizing the AWS environment.
  • provides real-time guidance to help provision the resources following AWS best practices.

AWS Personal Health Dashboard

  • provides alerts and remediation guidance when AWS is experiencing events that might affect you.
  • displays relevant and timely information to help you manage events in progress, and provides proactive notification to help you plan for scheduled activities.
  • alerts are automatically triggered by changes in the health of AWS resources, providing event visibility and guidance to help quickly diagnose and resolve issues.
  • provides a personalized view into the performance and availability of the AWS services underlying the AWS resources.
  • by contrast, the Service Health Dashboard displays only the general status of AWS services

AWS Managed Services

  • provides ongoing management of the AWS infrastructure so the focus can be on applications.
  • helps reduce the operational overhead and risk, by implementing best practices to maintain the infrastructure
  • automates common activities such as change requests, monitoring, patch management, security, and backup services, and provides full-lifecycle services to provision, run, and support the infrastructure.
  • improves agility, reduces cost, and relieves you of infrastructure operations

Developer Tools

AWS CodeCommit

  • is a fully managed source control service that makes it easy to host secure and highly scalable private Git repositories

AWS CodeBuild

  • is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy
  • also helps provision, manage, and scale the build servers.
  • scales continuously and processes multiple builds concurrently, so the builds are not left waiting in a queue.

AWS CodeDeploy

  • is a service that automates code deployments to any instance, including EC2 instances and instances running on premises.
  • helps to rapidly release new features, avoid downtime during application deployment, and handle the complexity of updating the applications.

AWS CodePipeline

  • is a continuous integration and continuous delivery service for fast and reliable application and infrastructure updates.
  • builds, tests, and deploys the code every time there is a code change, based on the defined release process models

AWS X-Ray

  • helps developers analyze and debug distributed applications in production or development, such as those built using a microservices architecture
  • provides an end-to-end view of requests as they travel through the application, and shows a map of its underlying components.
  • helps understand how the application and its underlying services are performing, to identify and troubleshoot the root cause of performance issues and errors.

Messaging

Amazon SQS

  • is a fast, reliable, scalable, fully managed message queuing service.
  • makes it simple and cost-effective to decouple the components of a cloud application.
  • includes standard queues, which offer high throughput and at-least-once processing, and FIFO queues, which provide FIFO (first-in, first-out) delivery and exactly-once processing.
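
The difference between at-least-once and exactly-once processing comes down to what happens when a producer retries a send. The in-memory sketch below mimics the FIFO behaviour by remembering a deduplication ID for every accepted message; it is a toy model, not the service, and it leaves out the time-limited deduplication window and message groups. The class and the message contents are hypothetical.

```python
from collections import deque


class FifoQueueSketch:
    """In-memory stand-in: keeps order and drops sends whose deduplication ID was seen."""

    def __init__(self):
        self._messages = deque()
        self._seen_ids = set()

    def send(self, body, dedup_id):
        if dedup_id in self._seen_ids:
            return False                 # duplicate send, e.g. a producer retry
        self._seen_ids.add(dedup_id)
        self._messages.append(body)
        return True

    def receive(self):
        """Return the oldest message, or None when the queue is empty."""
        return self._messages.popleft() if self._messages else None


queue = FifoQueueSketch()
queue.send("order-1 created", dedup_id="evt-1")
queue.send("order-1 created", dedup_id="evt-1")   # retry of the same event is ignored
queue.send("order-1 paid", dedup_id="evt-2")
received = [queue.receive(), queue.receive(), queue.receive()]
```

A standard queue would have delivered the retried message twice, so consumers of standard queues need to tolerate duplicates.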

Amazon SNS

  • is a fast, flexible, fully managed push notification service for sending individual messages or fanning out messages to large numbers of recipients.
  • makes it simple and cost-effective to send push notifications to mobile device users and email recipients, or even to send messages to other distributed services
  • can send notifications to Apple, Google, Fire OS, and Windows devices, as well as to Android devices in China with Baidu Cloud Push.
  • can also deliver messages to SQS, Lambda functions, or any HTTP endpoint
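
The fan-out idea, where one published message reaches every subscriber, can be sketched in a few lines. The `ToyTopic` class below is a hypothetical in-memory model, not the SNS API; the two lists merely stand in for an email endpoint and an SQS queue.

```python
class ToyTopic:
    """In-memory publish/subscribe topic (hypothetical, not the SNS API)."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callable that receives every published message."""
        self._subscribers.append(callback)

    def publish(self, message):
        """Deliver the message to all subscribers; return how many received it."""
        for callback in self._subscribers:
            callback(message)
        return len(self._subscribers)


email_inbox, queue_backlog = [], []  # stand-ins for two different endpoints
topic = ToyTopic()
topic.subscribe(email_inbox.append)
topic.subscribe(queue_backlog.append)
delivered = topic.publish("deployment finished")
```

The publisher only knows about the topic, so subscribers can be added or removed without changing the publishing code.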

Amazon SES

  • is a cost-effective email service built on the reliable and scalable infrastructure that Amazon.com developed to serve its own customer base
  • can send transactional email, marketing messages, or any other type of high-quality content to the customers.
  • can receive messages and deliver them to an S3 bucket, call your custom code via an AWS Lambda function, or publish notifications to SNS.

Analytics

Amazon Athena

  • is an interactive query service that helps to analyze data in S3 using standard SQL.
  • is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
  • removes the need for complex extract, transform, and load (ETL) jobs

Amazon EMR

  • provides a managed Hadoop framework that makes it easy, fast, and cost-effective to process vast amounts of data across dynamically scalable EC2 instances.
  • enables you to run other popular distributed frameworks such as Apache Spark, HBase, Presto, and Flink, and interact with data in other AWS data stores such as S3 and DynamoDB.
  • securely and reliably handles a broad set of big data use cases, including log analysis, web indexing, data transformations (ETL), machine learning, financial analysis, scientific simulation, and bioinformatics.

Amazon CloudSearch

  • is a managed service that makes it simple and cost-effective to set up, manage, and scale a search solution for a website or application.
  • supports 34 languages and popular search features such as highlighting, autocomplete, and geospatial search.

Amazon Elasticsearch Service

  • makes it easy to deploy, operate, and scale Elasticsearch for log analytics, full text search, application monitoring, and more.
  • is a fully managed service that delivers Elasticsearch’s easy-to-use APIs and real-time capabilities along with the availability, scalability, and security required by production workloads.

Amazon Kinesis

  • is a platform for streaming data on AWS, offering powerful services to make it easy to load and analyze streaming data.
  • provides the ability to build custom streaming data applications for specialized needs.
  • offers three services:
    • Amazon Kinesis Firehose
      • helps load streaming data into AWS.
      • can capture, transform, and load streaming data into Amazon Kinesis Analytics, S3, Redshift, and Elasticsearch Service, enabling near real-time analytics with existing business intelligence tools and dashboards
      • helps batch, compress, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.
    • Amazon Kinesis Analytics
      • helps process streaming data in real time with standard SQL
    • Amazon Kinesis Streams
      • enables you to build custom applications that process or analyze streaming data for specialized needs.

Amazon Redshift

  • provides a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools.
  • has a massively parallel processing (MPP) data warehouse architecture, parallelizing and distributing SQL operations to take advantage of all available resources.
  • provides underlying hardware designed for high performance data processing, using local attached storage to maximize throughput between the CPUs and drives, and a 10GigE mesh network to maximize throughput between nodes.

Amazon QuickSight

  • is a fast, cloud-powered business analytics service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data.

AWS Data Pipeline

  • helps reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals
  • can regularly access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as S3, RDS, DynamoDB, and EMR.
  • helps create complex data processing workloads that are fault tolerant, repeatable, and highly available.
  • also allows you to move and process data that was previously locked up in on-premises data silos.

AWS Glue

  • is a fully managed ETL service that makes it easy to move data between data stores.
  • simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, mapping, and job scheduling.
  • schedules ETL jobs, and provisions and scales all the infrastructure required so that ETL jobs run quickly and efficiently at any scale.

Application Services

AWS Step Functions

  • makes it easy to coordinate the components of distributed applications and microservices using visual workflows.
  • automatically triggers and tracks each step, and retries when there are errors, so the application executes in order and as expected.
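
The "run each step in order, track it, and retry on error" behaviour can be sketched in plain Python. This is an illustrative model only: `run_workflow`, `max_attempts`, and the step names are hypothetical and are not part of the Step Functions API, which defines workflows declaratively rather than as Python callables.

```python
def run_workflow(steps, max_attempts=3):
    """Run (name, function) steps in order, retrying a failing step.

    Each step receives the previous step's output. Returns the final
    output and a history of (step name, attempt number, outcome).
    """
    history = []
    state = None
    for name, step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                state = step(state)
                history.append((name, attempt, "succeeded"))
                break
            except Exception:
                history.append((name, attempt, "failed"))
        else:
            # the inner loop never hit `break`, so every attempt failed
            raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts")
    return state, history


calls = {"count": 0}


def flaky_charge(order):
    """Fails on the first call only, to simulate a transient error."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("temporary failure")
    return order + ["charged"]


result, history = run_workflow([
    ("validate", lambda _: ["validated"]),
    ("charge", flaky_charge),
    ("ship", lambda order: order + ["shipped"]),
])
```

The recorded history is what makes such a workflow easy to inspect: it shows which step failed, how often it was retried, and that later steps only ran once earlier ones had succeeded.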

Amazon API Gateway

  • is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.
  • handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization and access control, monitoring, and API version management.

Amazon Elastic Transcoder

  • is a media transcoding service in the cloud
  • is designed to be a highly scalable, easy-to-use, and cost-effective way for developers and businesses to convert (or transcode) media files from their source format into versions that will play back on devices like smartphones, tablets, and PCs.

Amazon SWF

  • helps developers build, run, and scale background jobs that have parallel or sequential steps.
  • is a fully-managed state tracker and task coordinator in the cloud.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Which AWS services belong to the Compute services? Choose 2 answers
    1. Lambda
    2. EC2
    3. S3
    4. EMR
    5. CloudFront
  2. Which AWS service provides low cost storage option for archival and long-term backup?
    1. Glacier
    2. S3
    3. EBS
    4. CloudFront
  3. Which AWS services belong to the Storage services? Choose 2 answers
    1. EFS
    2. IAM
    3. EMR
    4. S3
    5. CloudFront
  4. A company allows users to upload videos on its platform. They want to convert the videos to multiple formats supported on multiple devices and platforms. Which AWS service can they leverage for the requirement?
    1. AWS SWF
    2. AWS Video Converter
    3. AWS Elastic Transcoder
    4. AWS Data Pipeline
  5. Which analytic service helps analyze data in S3 using standard SQL?
    1. Athena
    2. EMR
    3. Elasticsearch
    4. Kinesis
  6. What features does AWS’s Route 53 service provide? Choose the 2 correct answers:
    1. Content Caching
    2. Domain Name System (DNS) service
    3. Database Management
    4. Domain Registration
  7. You are trying to organize and import (to AWS) gigabytes of data that are currently structured in JSON-like, name-value documents. What AWS service would best fit your needs?
    1. Lambda
    2. DynamoDB
    3. RDS
    4. Aurora
  8. What AWS database is primarily used to analyze data using standard SQL formatting with compatibility for your existing business intelligence tools? Choose the correct answer:
    1. Redshift
    2. RDS
    3. DynamoDB
    4. ElastiCache
  9. A company wants their application to use a pre-configured machine image with software installed and configured. Which AWS feature can help with this?
    1. Amazon Machine Image
    2. AWS CloudFormation
    3. AWS Lambda
    4. AWS Lightsail
  10. What AWS service can be used to track API calls for security analysis and resource change tracking?
    1. AWS CloudWatch
    2. AWS CloudFormation
    3. AWS CloudTrail
    4. AWS OpsWorks
  11. Which AWS service can help offload the read traffic from your database in order to reduce latency caused by a read-heavy workload?
    1. ElastiCache
    2. DynamoDB
    3. S3
    4. EFS
  12. What service allows system administrators to run “Infrastructure as code”?
    1. CloudFormation
    2. CloudWatch
    3. CloudTrail
    4. CodeDeploy

References

AWS_Overview_Whitepaper

Architecting for the Cloud – AWS Best Practices – Whitepaper – Certification

Architecting for the Cloud – AWS Best Practices

Architecting for the Cloud – AWS Best Practices whitepaper provides architectural patterns and advice on how to design systems that are secure, reliable, high performing, and cost efficient

AWS Design Principles

Scalability

  • While AWS provides virtually unlimited on-demand capacity, the architecture should be designed to take advantage of those resources
  • There are two ways to scale an IT architecture
    • Vertical Scaling
      • takes place by increasing the specifications of an individual resource, e.g. upgrading an EC2 instance type to one with more RAM, CPU, IOPS, or networking capabilities
      • will eventually hit a limit, and is not always a cost-effective or highly available approach
    • Horizontal Scaling
      • takes place by increasing the number of resources, e.g. adding more EC2 instances or EBS volumes
      • can help leverage the elasticity of cloud computing
      • not all architectures can be designed to distribute their workload across multiple resources
      • applications should be designed to be stateless
        • a stateless application needs no knowledge of previous interactions and stores no session information
        • capacity can be increased and decreased, after running tasks have been drained
      • State, if needed, can be implemented using
        • Low-latency external store, e.g. DynamoDB or Redis, to maintain state information
        • Session affinity, e.g. ELB sticky sessions, to bind all the transactions of a session to a specific compute resource. However, affinity cannot be guaranteed, and existing sessions do not take advantage of newly added resources
      • Load can be distributed across multiple resources using
        • Push model, e.g. through ELB, which distributes the load across multiple EC2 instances
        • Pull model, e.g. through SQS or Kinesis, where multiple consumers subscribe and consume
      • Distributed processing, e.g. using EMR or Kinesis, helps process large amounts of data by dividing a task and its data into many small fragments of work
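
The difference between the push and pull models of load distribution can be made concrete with a small simulation. Everything here is a simplified assumption for illustration: the function names, the worker names, and the per-message processing times are hypothetical, and real load balancers and queues are considerably more sophisticated.

```python
import heapq
from collections import deque
from itertools import cycle


def push_distribute(requests, workers):
    """Push model: a balancer hands each request to the next worker in turn."""
    assignment = {worker: [] for worker in workers}
    for request, worker in zip(requests, cycle(workers)):
        assignment[worker].append(request)
    return assignment


def pull_distribute(requests, worker_speeds):
    """Pull model: whichever worker becomes free first takes the next message.

    worker_speeds maps a worker name to the (assumed) seconds it needs
    to process one message.
    """
    queue = deque(requests)
    free_at = [(0.0, name) for name in sorted(worker_speeds)]
    heapq.heapify(free_at)
    assignment = {name: [] for name in worker_speeds}
    while queue:
        time_free, name = heapq.heappop(free_at)
        assignment[name].append(queue.popleft())
        heapq.heappush(free_at, (time_free + worker_speeds[name], name))
    return assignment


pushed = push_distribute([1, 2, 3, 4, 5, 6], ["fast", "slow"])
pulled = pull_distribute([1, 2, 3, 4, 5, 6], {"fast": 1.0, "slow": 3.0})
```

In this model the push distribution splits the work evenly regardless of worker speed, while in the pull distribution the faster worker naturally ends up taking more messages because it returns to the queue more often.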

Disposable Resources Instead of Fixed Servers

  • Resources need to be treated as temporary, disposable resources rather than as the fixed, permanent resources typical of on-premises environments
  • AWS focuses on the concept of Immutable infrastructure
    • a server, once launched, is never updated throughout its lifetime
    • updates are performed by launching a new server with the latest configuration
    • this ensures resources are always in a consistent (and tested) state and makes rollbacks easier
  • AWS provides multiple ways to instantiate compute resources in an automated and repeatable way
    • Bootstrapping
      • scripts to configure and set up resources, e.g. using user data scripts and cloud-init to install software or copy resources and code
    • Golden Images
      • a snapshot of a particular state of that resource
      • provide faster start times and remove dependencies on configuration services or third-party repositories
    • Containers
      • AWS supports Docker images through Elastic Beanstalk and ECS
      • Docker allows packaging a piece of software in a Docker Image, which is a standardized unit for software development, containing everything the software needs to run: code, runtime, system tools, system libraries, etc.
  • Infrastructure as Code
    • AWS assets are programmable, so techniques, practices, and tools from software development can be applied to make the whole infrastructure reusable, maintainable, extensible, and testable.
    • AWS provides services like CloudFormation and OpsWorks for deployment
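
To give a feel for what "infrastructure as code" looks like in practice, the sketch below builds a minimal CloudFormation-style template as a Python dictionary and serializes it to JSON. It is illustrative only: the logical name `WebServer`, the default instance type, and the `ami-EXAMPLE` image ID are placeholders, and nothing is submitted to AWS.

```python
import json

# A minimal template describing one EC2 instance.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Single web server (illustrative sketch)",
    "Parameters": {
        "InstanceType": {"Type": "String", "Default": "t2.micro"},
    },
    "Resources": {
        "WebServer": {  # hypothetical logical name
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-EXAMPLE",  # placeholder, not a real AMI ID
                "InstanceType": {"Ref": "InstanceType"},
            },
        },
    },
}

# The JSON text is what would be version-controlled, reviewed, and deployed.
template_body = json.dumps(template, indent=2, sort_keys=True)
```

Because the template is just text, it can be kept in source control, reviewed like any other change, and used to recreate the same environment repeatedly.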

Automation

  • AWS provides various automation tools and services which help improve a system’s stability, efficiency, and time to market.
    • Elastic Beanstalk
      • a PaaS that allows quick application deployment while handling resource provisioning, load balancing, auto scaling, monitoring, etc.
    • EC2 Auto Recovery
      • uses a CloudWatch alarm that monitors an EC2 instance and automatically recovers it if it becomes impaired.
      • A recovered instance is identical to the original instance, including the instance ID, private & Elastic IP addresses, and all instance metadata.
      • The instance is migrated through a reboot, so in-memory contents are lost.
    • Auto Scaling
      • helps maintain application availability and scales the capacity up or down automatically as per defined conditions
    • CloudWatch Alarms
      • allow SNS notifications to be sent when a particular metric goes beyond a specified threshold for a specified number of periods
    • CloudWatch Events
      • delivers a near real-time stream of system events that describe changes in AWS resources
    • OpsWorks
      • allows continuous configuration through lifecycle events that automatically update the instances’ configuration to adapt to environment changes.
      • Events can be used to trigger Chef recipes on each instance to perform specific configuration tasks
    • Lambda Scheduled Events
      • allows creating a Lambda function and directing AWS Lambda to execute it on a regular schedule.
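
The CloudWatch alarm rule described above ("beyond a specified threshold for a specified number of periods") can be sketched as a small function. This is a simplified model under stated assumptions: `alarm_state` and its parameters are hypothetical names, and the real service supports further options (comparison operators, missing-data handling) that are ignored here.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Evaluate a simplified "greater than threshold" alarm.

    datapoints: one metric value per period, oldest first.
    Returns "ALARM" only if every one of the most recent
    `evaluation_periods` values exceeds the threshold.
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# CPU utilisation samples (percent), one per period
state = alarm_state([40, 85, 90, 95], threshold=80, evaluation_periods=3)
```

Requiring several consecutive breaching periods keeps a single short spike from triggering a notification or a scaling action.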

Loose Coupling

  • AWS helps build loosely coupled architectures that reduce interdependencies, so that a change or failure in one component does not cascade to other components
    • Asynchronous Integration
      • does not involve direct point-to-point interaction; components usually interact through an intermediate durable storage layer, e.g. SQS or Kinesis
      • decouples the components and introduces additional resiliency
      • suitable for any interaction that doesn’t need an immediate response and where an acknowledgement that a request has been registered will suffice
    • Service Discovery
      • allows new resources to be launched or terminated at any point in time and still be discovered, e.g. using ELB as a single point of contact that hides the underlying instance details, or Route 53 zones to abstract the load balancer’s endpoint
    • Well-Defined Interfaces
      • allows various components to interact with each other through specific, technology-agnostic interfaces, e.g. RESTful APIs with API Gateway

Services, Not Servers

Databases

  • AWS provides different categories of database technologies
    • Relational Databases (RDS)
      • normalize data into well-defined tabular structures known as tables, which consist of rows and columns
      • provide a powerful query language, flexible indexing capabilities, strong integrity controls, and the ability to combine data from multiple tables in a fast and efficient manner
      • allow vertical scalability by increasing resources, and horizontal scalability using Read Replicas for read capacity and sharding or data partitioning for write capacity
      • provide High Availability using Multi-AZ deployment, where data is synchronously replicated
    • NoSQL Databases (DynamoDB)
      • provides databases that trade some of the query and transaction capabilities of relational databases for a more flexible data model that seamlessly scales horizontally
      • perform data partitioning and replication to scale both the reads and writes in a horizontal fashion
      • DynamoDB service synchronously replicates data across three facilities in an AWS region to provide fault tolerance in the event of a server failure or Availability Zone disruption
    • Data Warehouse (Redshift)
      • Specialized type of relational database, optimized for analysis and reporting of large amounts of data
      • Redshift achieves efficient storage and optimum query performance through a combination of massively parallel processing (MPP), columnar data storage, and targeted data compression encoding schemes
      • Redshift MPP architecture enables increasing performance by increasing the number of nodes in the data warehouse cluster
  • For more details refer to AWS Storage Options Whitepaper

Removing Single Points of Failure

  • AWS provides ways to implement redundancy, automate recovery and reduce disruption at every layer of the architecture
  • AWS supports redundancy in the following ways
    • Standby Redundancy
      • When a resource fails, functionality is recovered on a secondary resource using a process called failover.
      • Failover will typically require some time before it completes, and during that period the resource remains unavailable.
      • Secondary resource can either be launched automatically only when needed (to reduce cost), or it can be already running idle (to accelerate failover and minimize disruption).
      • Standby redundancy is often used for stateful components such as relational databases.
    • Active Redundancy
      • requests are distributed to multiple redundant compute resources; if one fails, the rest can simply absorb a larger share of the workload.
      • Compared to standby redundancy, it can achieve better utilization and affect a smaller population when there is a failure.
  • AWS supports replication
    • Synchronous replication
      • acknowledges a transaction after it has been durably stored in both the primary location and its replicas.
      • protects data integrity in the event of a primary node failure
      • used to scale read capacity for queries that require the most up-to-date data (strong consistency).
      • compromises performance and availability
    • Asynchronous replication
      • decouples the primary node from its replicas at the expense of introducing replication lag
      • used to horizontally scale the system’s read capacity for queries that can tolerate that replication lag.
    • Quorum-based replication
      • combines synchronous and asynchronous replication to overcome the challenges of large-scale distributed database systems
      • Replication to multiple nodes can be managed by defining a minimum number of nodes that must participate in a successful write operation
  • AWS provides services to reduce or remove single points of failure
    • Regions, Availability Zones with multiple data centers
    • ELB or Route 53 to configure health checks and mask failure by routing traffic to healthy endpoints
    • Auto Scaling to automatically replace unhealthy nodes
    • EC2 auto-recovery to recover impaired nodes
    • S3, DynamoDB with data redundantly stored across multiple facilities
    • Multi-AZ RDS and Read Replicas
    • ElastiCache Redis engine supports replication with automatic failover
  • For more details refer to AWS Disaster Recovery Whitepaper
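
The quorum-based replication described above can be illustrated with two small helper functions. They are a sketch, not any AWS API. The read-quorum check is an addition not covered in the notes above: it uses the commonly cited rule that a read is guaranteed to see the latest write when the write and read quorums together exceed the number of replicas.

```python
def quorum_write(replica_acks, write_quorum):
    """A write succeeds once at least `write_quorum` replicas acknowledge it.

    replica_acks: one boolean per replica, True if it stored the write.
    """
    return sum(replica_acks) >= write_quorum


def overlapping_quorums(n_replicas, write_quorum, read_quorum):
    """True when every read quorum must include a replica holding the latest write."""
    return write_quorum + read_quorum > n_replicas


# Three replicas, one of them temporarily unreachable
write_ok = quorum_write([True, True, False], write_quorum=2)
```

With three replicas and a write quorum of two, the write above still succeeds while one replica is down, which is the availability benefit over fully synchronous replication.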

Optimize for Cost

  • AWS can help organizations reduce capital expenses and drive savings as a result of the AWS economies of scale
  • AWS provides different options which should be utilized as per the use case
    • EC2 purchasing options – On-Demand, Reserved and Spot
    • Trusted Advisor or EC2 usage reports to identify the compute resources and their usage
    • S3 storage class – Standard, Reduced Redundancy, and Standard-Infrequent Access
    • EBS volumes – Magnetic, General Purpose SSD, Provisioned IOPS SSD
    • Cost Allocation tags to identify costs based on tags
    • Auto Scaling to horizontally scale the capacity up or down based on demand
    • Lambda based architectures to never pay for idle or redundant resources
    • Utilize managed services where scaling is handled by AWS, e.g. ELB, CloudFront, Kinesis, SQS, CloudSearch, etc.

Caching

  • Caching improves application performance and increases the cost efficiency of an implementation
    • Application Data Caching
      • uses services that help store and retrieve information from fast, managed, in-memory caches
      • ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud and supports two open-source in-memory caching engines: Memcached and Redis
    • Edge Caching
      • allows content to be served by infrastructure that is closer to viewers, lowering latency and giving high, sustained data transfer rates needed to deliver large popular objects to end users at scale.
      • CloudFront is a Content Delivery Network (CDN) consisting of multiple edge locations that allows copies of static and dynamic content to be cached
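
Application data caching is often implemented with a "cache-aside" pattern: look in the cache first, and only on a miss read from the slower data store and remember the result. The sketch below shows the idea with a plain dictionary standing in for an in-memory cache such as Memcached or Redis. The `CacheAside` class, its parameters, and the `slow_lookup` function are hypothetical illustrations, not an ElastiCache client.

```python
import time


class CacheAside:
    """Cache-aside lookup with a time-to-live, backed by a plain dict."""

    def __init__(self, load_from_store, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_store  # slow lookup used on a cache miss
        self._ttl = ttl_seconds
        self._clock = clock           # injectable so expiry can be tested
        self._entries = {}            # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]           # cache hit, store is not touched
        value = self._load(key)       # miss or expired: go to the store
        self._entries[key] = (value, now + self._ttl)
        return value


store_reads = []


def slow_lookup(key):
    """Stand-in for a database query; records each call."""
    store_reads.append(key)
    return key.upper()


cache = CacheAside(slow_lookup, ttl_seconds=30.0)
first = cache.get("user:1")   # miss, reads from the store
second = cache.get("user:1")  # hit, served from memory
```

The time-to-live bounds how stale a cached value can become, which is the usual trade-off between freshness and load on the underlying store.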

Security

  • AWS works on a shared security responsibility model
    • AWS is responsible for the security of the underlying cloud infrastructure
    • you are responsible for securing the workloads you deploy in AWS
  • AWS also provides ample security features
    • IAM to define a granular set of policies and assign them to users, groups, and AWS resources
    • IAM roles to assign short term credentials to resources, which are automatically distributed and rotated
    • Amazon Cognito, for mobile applications, which allows client devices to get controlled access to AWS resources via temporary tokens.
    • VPC to isolate parts of infrastructure through the use of subnets, security groups, and routing controls
    • WAF to help protect web applications from SQL injection and other vulnerabilities in the application code
    • CloudWatch logs to collect logs centrally as the servers are temporary
    • CloudTrail for auditing AWS API calls, which delivers a log file to S3 bucket. Logs can then be stored in an immutable manner and automatically processed to either notify or even take action on your behalf, protecting your organization from non-compliance
    • AWS Config, Amazon Inspector, and AWS Trusted Advisor to continually monitor for compliance or vulnerabilities giving a clear overview of which IT resources are in compliance, and which are not
  • For more details refer to AWS Security Whitepaper
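
The way granular IAM policies combine can be sketched with a toy evaluator. It follows two rules of IAM policy evaluation: access is denied by default, and an explicit Deny overrides any Allow. Everything else is a simplifying assumption: the `is_allowed` function is hypothetical, each statement has a single action and resource, wildcard matching uses Python's `fnmatch`, and conditions, principals, and other policy types are ignored. The bucket name is made up.

```python
from fnmatch import fnmatchcase


def is_allowed(statements, action, resource):
    """Simplified policy evaluation: default deny, explicit Deny wins."""
    allowed = False
    for statement in statements:
        matches = (fnmatchcase(action, statement["Action"])
                   and fnmatchcase(resource, statement["Resource"]))
        if not matches:
            continue
        if statement["Effect"] == "Deny":
            return False  # an explicit deny overrides any allow
        allowed = True
    return allowed


policy = [
    {"Effect": "Allow", "Action": "s3:Get*",
     "Resource": "arn:aws:s3:::reports/*"},
    {"Effect": "Deny", "Action": "s3:*",
     "Resource": "arn:aws:s3:::reports/secret/*"},
]

can_read_report = is_allowed(policy, "s3:GetObject", "arn:aws:s3:::reports/q1.csv")
```

Starting from "deny everything" and granting only the actions a user or role needs is what makes least-privilege policies practical.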

References

Architecting for the Cloud: AWS Best Practices – Whitepaper – 2016

 

AWS High Availability & Fault Tolerance Architecture – Certification

AWS High Availability & Fault Tolerance Architecture

  • Amazon Web Services provides services and infrastructure to build reliable, fault-tolerant, and highly available systems in the cloud.
  • Fault tolerance is the ability of a system to remain in operation even if some of the components used to build the system fail.
  • Most of the higher-level services, such as S3, SimpleDB, SQS, and ELB, have been built with fault tolerance and high availability in mind.
  • Services that provide basic infrastructure, such as EC2 and EBS, provide specific features, such as availability zones, elastic IP addresses, and snapshots, that a fault-tolerant and highly available system must take advantage of and use correctly.

AWS High Availability and Fault Tolerance

NOTE: This topic is mainly relevant for the Professional exam

Regions & Availability Zones

  • Amazon Web Services are available in geographic Regions, with multiple Availability Zones (AZs) within each Region, which provide easy access to redundant deployment locations.
  • AZs are distinct geographical locations that are engineered to be insulated from failures in other AZs.
  • Regions and AZs help achieve greater fault tolerance by distributing the application geographically and help build multi-site solution.
  • AZs provide inexpensive, low latency network connectivity to other Availability Zones in the same Region
  • By placing EC2 instances in multiple AZs, an application can be protected from failure at a single data center
  • It is important to run independent application stacks in more than one AZ, either in the same region or in another region, so that if one zone fails, the application in the other zone can continue to run.

Amazon Machine Image – AMIs

  • EC2 is a web service within Amazon Web Services that provides computing resources.
  • Amazon Machine Image (AMI) provides a template that can be used to define the service instances.
  • The template basically contains a software configuration (i.e., OS, application server, and applications) and is applied to an instance type
  • An AMI can either contain all the software, applications, and code bundled, or can be configured with a bootstrap script to install them on startup.
  • A single AMI can be used to create server resources of different instance types, both to launch new instances and to replace failed instances

Auto Scaling

  • Auto Scaling helps to automatically scale EC2 capacity up or down based on defined rules.
  • Auto Scaling also enables addition of more instances in response to an increasing load; and when those instances are no longer needed, they will be automatically terminated.
  • Auto Scaling enables terminating server instances at will, knowing that replacement instances will be automatically launched.
  • Auto Scaling can work across multiple AZs within an AWS Region
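
How a scaling rule might translate a metric into a capacity change can be sketched as follows. This is an assumption-laden illustration: the proportional rule and the `desired_capacity` function are chosen for clarity and are not the algorithm Auto Scaling itself uses. The one property it shares with a real Auto Scaling group is that the result always stays within the group's minimum and maximum size.

```python
import math


def desired_capacity(current, metric_value, target_value, min_size, max_size):
    """Scale the instance count in proportion to metric / target.

    The result is clamped to [min_size, max_size].
    """
    if current == 0:
        return min_size
    estimate = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, estimate))


# 4 instances at 90% average CPU, aiming for 60%
scaled_out = desired_capacity(4, metric_value=90, target_value=60,
                              min_size=2, max_size=10)
```

The same rule scales in when load drops, for example from 4 instances to 2 when average CPU falls to 30% against a 60% target.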

Elastic Load Balancing – ELB

  • Elastic Load Balancing is an effective way to increase the availability of a system; it distributes incoming application traffic across several EC2 instances
  • With ELB, a DNS host name is created and any requests sent to this host name are delegated to a pool of EC2 instances
  • ELB supports health checks on hosts, distribution of traffic to EC2 instances across multiple availability zones, and dynamic addition and removal of EC2 hosts from the load-balancing rotation
  • Elastic Load Balancing detects unhealthy instances within its pool of EC2 instances and automatically reroutes traffic to healthy instances, until the unhealthy instances have been restored seamlessly using Auto Scaling.
  • Auto Scaling and Elastic Load Balancing are an ideal combination – while ELB gives a single DNS name for addressing, Auto Scaling ensures there is always the right number of healthy EC2 instances to accept requests.
  • ELB can be used to balance across instances in multiple AZs of a region.
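
The combination of health checks and traffic distribution can be modelled in a few lines. This is a toy sketch, not how ELB is implemented: the `route_requests` function, the instance IDs, and the `is_healthy` callback are hypothetical, and simple round-robin is assumed as the distribution strategy.

```python
from itertools import cycle


def route_requests(requests, instances, is_healthy):
    """Assign requests round-robin to instances that pass the health check."""
    healthy = [instance for instance in instances if is_healthy(instance)]
    if not healthy:
        raise RuntimeError("no healthy instances available")
    return {request: instance
            for request, instance in zip(requests, cycle(healthy))}


# "i-b" is failing its health check, so it receives no traffic
routing = route_requests(
    ["r1", "r2", "r3"],
    ["i-a", "i-b", "i-c"],
    is_healthy=lambda instance: instance != "i-b",
)
```

Clients keep using the same entry point throughout; only the set of instances behind it changes as instances fail or are replaced.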

Elastic IPs – EIPs

  • Elastic IP addresses are public static IP addresses that can be mapped programmatically between instances within a region.
  • EIPs are associated with the AWS account and not with a specific instance or the lifetime of an instance.
  • Elastic IP addresses can be used for instances and services that require consistent endpoints, such as master databases, central file servers, and EC2-hosted load balancers
  • Elastic IP addresses can be used to work around host or availability zone failures by quickly remapping the address to another running instance or a replacement instance that was just started.

Reserved Instance

  • Reserved Instances help reserve computing capacity, guaranteeing that it is always available, at a lower cost.

Elastic Block Store – EBS

  • Elastic Block Store (EBS) offers persistent off-instance storage volumes that persist independently from the life of an instance and are about an order of magnitude more durable than on-instance storage.
  • EBS volumes store data redundantly and are automatically replicated within a single availability zone.
  • EBS helps in failover scenarios where if an EC2 instance fails and needs to be replaced, the EBS volume can be attached to the new EC2 instance
  • Valuable data should never be stored only on instance (ephemeral) storage without proper backups, replication, or the ability to re-create the data.

EBS Snapshots

  • EBS volumes are highly reliable, but to further mitigate the possibility of a failure and increase durability, point-in-time Snapshots can be created to store data on volumes in S3, which is then replicated to multiple AZs.
  • Snapshots can be used to create new EBS volumes, which are an exact replica of the original volume at the time the snapshot was taken
  • Snapshots provide an effective way to deal with disk failures or other host-level issues, as well as with problems affecting an AZ.
  • Snapshots are incremental and back up only changes since the previous snapshot, so it is advisable to hold on to recent snapshots
  • Snapshots are tied to the region, while EBS volumes are tied to a single AZ

Relational Database Service – RDS

  • RDS makes it easy to run relational databases in the cloud
  • RDS Multi-AZ deployments provision a synchronous standby replica of the database in a different AZ, which helps increase the database availability and protect the database against unplanned outages
  • In case of a failover scenario, the standby is promoted to be the primary seamlessly and will handle the database operations.
  • Automated backups of the database, enabled by default, provide point-in-time recovery for the database instance.
  • RDS will back up your database and transaction logs and store both for a user-specified retention period.
  • In addition to the automated backups, manual RDS backups can also be performed which are retained until explicitly deleted.
  • Backups help recover from higher-level faults such as unintentional data modification, either by operator error or by bugs in the application.
  • RDS Read Replicas provide read-only replicas of the database and the ability to scale out beyond the capacity of a single database deployment for read-heavy database workloads
  • RDS Read Replicas are a scalability solution, not a High Availability solution

Simple Storage Service – S3

  • S3 provides a highly durable, fault-tolerant and redundant object store
  • S3 stores objects redundantly on multiple devices across multiple facilities in an S3 Region
  • S3 is a great storage solution for somewhat static or slow-changing objects, such as images, videos, and other static media.
  • S3 also supports edge caching and streaming of these assets by interacting with the Amazon CloudFront service.

Simple Queue Service – SQS

  • Simple Queue Service (SQS) is a highly reliable distributed messaging system that can serve as the backbone of a fault-tolerant application
  • SQS is engineered to provide “at least once” delivery of all messages
  • Messages sent to a queue are retained for four days by default (configurable up to 14 days) or until they are read and deleted by the application
  • Messages can be polled by multiple workers and processed, while SQS ensures that a message is processed by only one worker at a time using a configurable time interval called the visibility timeout
  • If the number of messages in a queue starts to grow or if the average time to process a message becomes too high, workers can be scaled out by simply adding additional EC2 instances.
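
The visibility timeout is easiest to understand with a small model. The `ToyQueue` class below is a hypothetical in-memory sketch, not the SQS API, and time is passed in explicitly as `now` (in seconds) so the behaviour is easy to follow.

```python
class ToyQueue:
    """In-memory queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout):
        self._timeout = visibility_timeout
        self._messages = {}  # message id -> [body, invisible_until]
        self._next_id = 0

    def send(self, body):
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def receive(self, now):
        """Return (id, body) of a visible message and hide it for the timeout."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        """Called by a worker after it has processed the message successfully."""
        self._messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("resize image")
first = queue.receive(now=0)        # worker A takes the message
while_hidden = queue.receive(now=10)  # worker B sees nothing
redelivered = queue.receive(now=31)   # A never deleted it, so it reappears
```

If the first worker crashes before deleting the message, the message becomes visible again after the timeout and another worker can pick it up. This is why delivery is "at least once" and why workers should tolerate processing the same message twice.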

Route 53

  • Amazon Route 53 is a highly available and scalable DNS web service.
  • Queries for the domain are automatically routed to the nearest DNS server and thus are answered with the best possible performance.
  • Route 53 resolves requests for your domain name (for example, www.example.com) to your Elastic Load Balancer, as well as your zone apex record (example.com).

CloudFront

  • CloudFront can be used to deliver a website, including dynamic, static and streaming content, using a global network of edge locations.
  • Requests for your content are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
  • CloudFront is optimized to work with other Amazon Web Services, like S3 and EC2
  • CloudFront also works seamlessly with any non-AWS origin server, which stores the original, definitive versions of your files.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You are moving an existing traditional system to AWS, and during the migration discover that there is a master server which is a single point of failure. Having examined the implementation of the master server you realize there is not enough time during migration to re-engineer it to be highly available, though you do discover that it stores its state in a local MySQL database. In order to minimize down-time you select RDS to replace the local database and configure the master to use it. What steps would best allow you to create a self-healing architecture? [PROFESSIONAL]
    1. Migrate the local database into a Multi-AZ RDS database. Place master node into a multi-AZ auto-scaling group with a minimum of one and maximum of one with health checks.
    2. Replicate the local database into a RDS read replica. Place master node into a Cross-Zone ELB with a minimum of one and maximum of one with health checks. (Read Replica does not provide HA or write capability, ELB has no min/max of 1 feature, and Cross-Zone only distributes load equally across instances)
    3. Migrate the local database into a Multi-AZ RDS database. Place master node into a Cross-Zone ELB with a minimum of one and maximum of one with health checks. (ELB has no min/max of 1 feature, and Cross-Zone only distributes load equally across instances)
    4. Replicate the local database into a RDS read replica. Place master node into a multi-AZ auto-scaling group with a minimum of one and maximum of one with health checks. (Read Replica does not provide HA or write capability)
  2. You are designing Internet connectivity for your VPC. The Web servers must be available on the Internet. The application must have a highly available architecture. Which alternatives should you consider? (Choose 2 answers)
    1. Configure a NAT instance in your VPC. Create a default route via the NAT instance and associate it with all subnets. Configure a DNS A record that points to the NAT instance public IP address (NAT is for internet connectivity for instances in private subnet)
    2. Configure a CloudFront distribution and configure the origin to point to the private IP addresses of your Web servers. Configure a Route53 CNAME record to your CloudFront distribution.
    3. Place all your web servers behind ELB. Configure a Route53 CNAME to point to the ELB DNS name.
    4. Assign EIPs to all web servers. Configure a Route53 record set with all EIPs. With health checks and DNS failover.
  3. When deploying a highly available 2-tier web application on AWS, which combination of AWS services meets the requirements? 1. AWS Direct Connect 2. Amazon Route 53 3. AWS Storage Gateway 4. Elastic Load Balancing 5. Amazon EC2 6. Auto Scaling 7. Amazon VPC 8. AWS CloudTrail [PROFESSIONAL]
    1. 2,4,5 and 6
    2. 3,4,5 and 8
    3. 1 through 8
    4. 1,3,5 and 7
    5. 1,2,5 and 6
  4. Company A has hired you to assist with the migration of an interactive website that allows registered users to rate local restaurants. Updates to the ratings are displayed on the home page, and ratings are updated in real time. Although the website is not very popular today, the company anticipates that it will grow rapidly over the next few weeks. They want the site to be highly available. The current architecture consists of a single Windows Server 2008 R2 web server and a MySQL database running on Linux. Both reside inside an on-premises hypervisor. What would be the most efficient way to transfer the application to AWS, ensuring performance and high-availability? [PROFESSIONAL]
    1. Export web files to an Amazon S3 bucket in us-west-1. Run the website directly out of Amazon S3. Launch a multi-AZ MySQL Amazon RDS instance in us-west-1a. Import the data into Amazon RDS from the latest MySQL backup. Use Route 53 and create an alias record pointing to the elastic load balancer. (It's an interactive website; although it could be implemented using the JavaScript SDK, it's a migration and the application would need changes. Also, there is no use for an ELB if hosted on S3)
    2. Launch two Windows Server 2008 R2 instances in us-west-1b and two in us-west-1a. Copy the web files from on premises web server to each Amazon EC2 web server, using Amazon S3 as the repository. Launch a multi-AZ MySQL Amazon RDS instance in us-west-2a. Import the data into Amazon RDS from the latest MySQL backup. Create an elastic load balancer to front your web servers. Use Route 53 and create an alias record pointing to the elastic load balancer. (Although RDS instance is in a different region which will impact performance, this is the only option that works.)
    3. Use AWS VM Import/Export to create an Amazon Elastic Compute Cloud (EC2) Amazon Machine Image (AMI) of the web server. Configure Auto Scaling to launch two web servers in us-west-1a and two in us-west-1b. Launch a Multi-AZ MySQL Amazon Relational Database Service (RDS) instance in us-west-1b. Import the data into Amazon RDS from the latest MySQL backup. Use Amazon Route 53 to create a hosted zone and point an A record to the elastic load balancer. (does not create a load balancer)
    4. Use AWS VM Import/Export to create an Amazon EC2 AMI of the web server. Configure auto-scaling to launch two web servers in us-west-1a and two in us-west-1b. Launch a multi-AZ MySQL Amazon RDS instance in us-west-1a. Import the data into Amazon RDS from the latest MySQL backup. Create an elastic load balancer to front your web servers. Use Amazon Route 53 and create an A record pointing to the elastic load balancer. (Need to create an alias record, without which Route 53 pointing to the ELB would not work)
  5. Your company runs a customer facing event registration site. This site is built with a 3-tier architecture with web and application tier servers and a MySQL database. The application requires 6 web tier servers and 6 application tier servers for normal operation, but can run on a minimum of 65% server capacity and a single MySQL database. When deploying this application in a region with three availability zones (AZs) which architecture provides high availability? [PROFESSIONAL]
    1. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer), and an application tier deployed across 2 AZs with 3 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB, and one RDS (Relational Database Service) instance deployed with read replicas in the other AZ.
    2. A web tier deployed across 3 AZs with 2 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 3 AZs with 2 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and one RDS (Relational Database Service) instance deployed with read replicas in the two other AZs.
    3. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 2 AZs with 3 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and a Multi-AZ RDS (Relational Database Service) deployment.
    4. A web tier deployed across 3 AZs with 2 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer), and an application tier deployed across 3 AZs with 2 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB, and a Multi-AZ RDS (Relational Database Service) deployment.
  6. For a 3-tier, customer facing, inclement weather site utilizing a MySQL database running in a Region which has two AZs which architecture provides fault tolerance within the region for the application that minimally requires 6 web tier servers and 6 application tier servers running in the web and application tiers and one MySQL database? [PROFESSIONAL]
    1. A web tier deployed across 2 AZs with 6 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer), and an application tier deployed across 2 AZs with 6 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB, and a Multi-AZ RDS (Relational Database Service) deployment. (As it needs fault tolerance with a minimum of 6 servers always available)
    2. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 2 AZs with 3 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and a Multi-AZ RDS (Relational Database Service) deployment.
    3. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 2 AZs with 6 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and one RDS (Relational Database Service) Instance deployed with read replicas in the other AZs.
    4. A web tier deployed in 1 AZ with 6 EC2 (Elastic Compute Cloud) instances inside an Auto Scaling Group behind an ELB (elastic load balancer), and an application tier deployed in the same AZ with 6 EC2 instances inside an Auto Scaling Group behind an ELB, and a Multi-AZ RDS (Relational Database Service) deployment, with 6 stopped web tier EC2 instances and 6 stopped application tier EC2 instances all in the other AZ ready to be started if any of the running instances in the first AZ fails.
  7. You are designing a system which needs, at minimum, 8 m4.large instances operating to service traffic. When designing a system for high availability in the us-east-1 region, which has 6 Availability Zones, your company needs to be able to handle death of a full availability zone. How should you distribute the servers, to save as much cost as possible, assuming all of the EC2 nodes are properly linked to an ELB? Your VPC account can utilize us-east-1’s AZ’s a through f, inclusive.
    1. 3 servers in each of AZ’s a through d, inclusive.
    2. 8 servers in each of AZ’s a and b.
    3. 2 servers in each of AZ’s a through e, inclusive. (You need to design for N+1 redundancy on Availability Zones. ZONE_COUNT = (REQUIRED_INSTANCES / INSTANCE_COUNT_PER_ZONE) + 1. To minimize cost, spread the instances across as many zones as you can. By using a through e, you are allocating 5 zones. Using 2 instances per zone, you have 10 total instances. If a single zone fails, you have 4 zones left, with 2 instances each, for a total of 8 instances. By spreading out as much as possible, you have increased cost by only 25% and significantly de-risked an availability zone failure. Refer link)
    4. 4 servers in each of AZ’s a through c, inclusive.
  8. You need your API backed by DynamoDB to stay online during a total regional AWS failure. You can tolerate a couple minutes of lag or slowness during a large failure event, but the system should recover with normal operation after those few minutes. What is a good approach? [PROFESSIONAL]
    1. Set up DynamoDB cross-region replication in a master-standby configuration, with a single standby in another region. Create an Auto Scaling Group behind an ELB in each of the two regions DynamoDB is running in. Add a Route53 Latency DNS Record with DNS Failover, using the ELBs in the two regions as the resource records. (Use DynamoDB cross-regional replication version with two ELBs and ASGs with Route53 Failover and Latency DNS. Refer link)
    2. Set up a DynamoDB Multi-Region table. Create an Auto Scaling Group behind an ELB in each of the two regions DynamoDB is running in. Add a Route53 Latency DNS Record with DNS Failover, using the ELBs in the two regions as the resource records. (No such thing as DynamoDB Multi-Region table)
    3. Set up a DynamoDB Multi-Region table. Create a cross-region ELB pointing to a cross-region Auto Scaling Group, and direct a Route53 Latency DNS Record with DNS Failover to the cross-region ELB. (No such thing as Cross Region ELB or cross-region ASG)
    4. Set up DynamoDB cross-region replication in a master-standby configuration, with a single standby in another region. Create a cross-region ELB pointing to a cross-region Auto Scaling Group, and direct a Route53 Latency DNS Record with DNS Failover to the cross-region ELB. (No such thing as DynamoDB cross-region table or cross-region ELB)
  9. You are putting together a WordPress site for a local charity and you are using a combination of Route53, Elastic Load Balancers, EC2 & RDS. You launch your EC2 instance, download WordPress and setup the configuration files connection string so that it can communicate to RDS. When you browse to your URL however, nothing happens. Which of the following could NOT be the cause of this.
    1. You have forgotten to open port 80/443 on your security group in which the EC2 instance is placed.
    2. Your elastic load balancer has a health check, which is checking a webpage that does not exist; therefore your EC2 instance is not in service.
    3. You have not configured an ALIAS for your A record to point to your elastic load balancer
    4. You have locked port 22 down to your specific IP address therefore users cannot access your site using HTTP/HTTPS
  10. A development team that is currently doing a nightly six-hour build, which is lengthening over time, on-premises with a large and mostly underutilized server would like to transition to a continuous integration model of development on AWS with multiple builds triggered within the same day. However, they are concerned about cost, security and how to integrate with existing on-premises applications such as their LDAP and email servers, which cannot move off-premises. The development environment needs a source code repository; a project management system with a MySQL database; resources for performing the builds; and a storage location for QA to pick up builds from. What AWS services combination would you recommend to meet the development team’s requirements? [PROFESSIONAL]
    1. A Bastion host Amazon EC2 instance running a VPN server for access from on-premises, Amazon EC2 for the source code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIP for the source code repository and project management system, Amazon SQS for a build queue, An Amazon Auto Scaling group of Amazon EC2 instances for performing builds and Amazon Simple Email Service for sending the build output. (Bastion is not for VPN connectivity; also SES should not be used)
    2. An AWS Storage Gateway for connecting on-premises software applications with cloud-based storage securely, Amazon EC2 for the source code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIPs for the source code repository and project management system, Amazon Simple Notification Service for a notification initiated build, An Auto Scaling group of Amazon EC2 instances for performing builds and Amazon S3 for the build output. (Storage Gateway does not provide secure connectivity, still needs VPN. SNS alone cannot handle builds)
    3. An AWS Storage Gateway for connecting on-premises software applications with cloud-based storage securely, Amazon EC2 for the source code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIPs for the source code repository and project management system, Amazon SQS for a build queue, An Amazon Elastic Map Reduce (EMR) cluster of Amazon EC2 instances for performing builds and Amazon CloudFront for the build output. (Storage Gateway does not provide secure connectivity, still needs VPN. EMR is not ideal for performing builds as it needs normal EC2 instances)
    4. A VPC with a VPN Gateway back to their on-premises servers, Amazon EC2 for the source-code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIPs for the source code repository and project management system, SQS for a build queue, An Auto Scaling group of EC2 instances for performing builds and S3 for the build output. (VPN gateway is required for secure connectivity. SQS for build queue and EC2 for builds)
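
The capacity arithmetic behind questions 5 and 7 above can be checked with a short sketch. The function names and the layout labels are hypothetical, chosen only for illustration. The model assumes instances are spread evenly across zones and that exactly one whole zone fails.

```python
import math


def surviving_instances(zones: int, per_zone: int) -> int:
    """Instances still running after one whole Availability Zone is lost."""
    return (zones - 1) * per_zone


def cheapest_layout(required: int, layouts: dict) -> str:
    """Pick the layout with the fewest total instances that still meets
    the requirement after a single-zone failure."""
    valid = {
        name: zones * per_zone
        for name, (zones, per_zone) in layouts.items()
        if surviving_instances(zones, per_zone) >= required
    }
    return min(valid, key=valid.get)


# Question 7: at least 8 instances must survive the loss of one zone.
# Each layout is (number of zones, instances per zone).
q7_layouts = {
    "3 each in a-d": (4, 3),
    "8 each in a-b": (2, 8),
    "2 each in a-e": (5, 2),
    "4 each in a-c": (3, 4),
}
best = cheapest_layout(8, q7_layouts)

# Question 5: 6 servers per tier, able to run on 65% capacity.
q5_minimum = math.ceil(0.65 * 6)      # servers needed after a failure
two_az = surviving_instances(2, 3)    # 2 AZs with 3 instances each
three_az = surviving_instances(3, 2)  # 3 AZs with 2 instances each
```

Under these assumptions, every question 7 layout keeps at least 8 instances after a zone failure, and the 5-zone layout does so with the fewest instances in total (10). For question 5, the requirement rounds up to 4 servers per tier; the 3-AZ layout keeps 4 after a zone failure, while the 2-AZ layout keeps only 3.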

AWS Risk and Compliance – Whitepaper – Certification

AWS Risk and Compliance Whitepaper Overview

  • AWS Risk and Compliance Whitepaper is intended to provide information to assist AWS customers with integrating AWS into their existing control framework supporting their IT environment.
  • AWS communicates its security and control environment relevant to customers by:
    • Obtaining industry certifications and independent third-party attestations described in this document
    • Publishing information about the AWS security and control practices in whitepapers and web site content
    • Providing certificates, reports, and other documentation directly to AWS customers under NDA (as required)

Shared Responsibility model

  • AWS’ part in the shared responsibility includes
    • providing its services on a highly secure and controlled platform and providing a wide array of security features customers can use
    • relieving the customer’s operational burden by operating, managing and controlling the components from the host operating system and virtualization layer down to the physical security of the facilities in which the service operates
  • Customers’ responsibility includes
    • configuring their IT environments in a secure and controlled manner for their purposes
    • responsibility for and management of the guest operating system (including updates and security patches), other associated application software, as well as the configuration of the AWS-provided security group firewall
    • meeting more stringent compliance requirements by leveraging technology such as host-based firewalls, host-based intrusion detection/prevention, encryption and key management
    • note that AWS relieves the customer burden of operating controls by managing those controls associated with the physical infrastructure deployed in the AWS environment

Risk and Compliance Governance

  • AWS provides a wide range of information regarding its IT control environment to customers through white papers, reports, certifications, and other third-party attestations
  • AWS customers are required to continue to maintain adequate governance over the entire IT control environment regardless of how IT is deployed.
  • Leading practices include
    • an understanding of required compliance objectives and requirements (from relevant sources),
    • establishment of a control environment that meets those objectives and requirements,
    • an understanding of the validation required based on the organization’s risk tolerance,
    • and verification of the operating effectiveness of their control environment.
  • Strong customer compliance and governance might include the following basic approach:
    • Review information available from AWS together with other information to understand as much of the entire IT environment as possible, and then document all compliance requirements.
    • Design and implement control objectives to meet the enterprise compliance requirements.
    • Identify and document controls owned by outside parties.
    • Verify that all control objectives are met and all key controls are designed and operating effectively.
  • Approaching compliance governance in this manner helps companies gain a better understanding of their control environment and will help clearly delineate the verification activities to be performed.

AWS Certifications, Programs, Reports, and Third-Party Attestations

  • AWS engages with external certifying bodies and independent auditors to provide customers with considerable information regarding the policies, processes, and controls established and operated by AWS.
  • AWS provides third-party attestations, certifications, Service Organization Controls (SOC) reports and other relevant compliance reports directly to its customers under NDA.

Key Risk and Compliance Questions

  • Shared Responsibility
    • AWS controls the physical components of the technology.
    • Customer owns and controls everything else, including control over connection points and transmissions
  • Auditing IT
    • Auditing for most layers and controls above the physical controls remains the responsibility of the customer
    • AWS ISO 27001 and other certifications are available for auditors to review
    • AWS-defined logical and physical controls are documented in the SOC 1 Type II report and available for review by audit and compliance teams
  • Data location
    • AWS customers control which physical region their data and their servers will be located in
    • AWS replicates the data only within the region
    • AWS will not move customers’ content from the selected Regions without notifying the customer, unless required to comply with the law or requests of governmental entities
  • Data center tours
    • As AWS hosts multiple customers, AWS does not allow data center tours by customers, as this exposes a wide range of customers to physical access of a third party.
    • An independent and competent auditor validates the presence and operation of controls as part of the AWS SOC 1 Type II report.
    • This third-party validation provides customers with the independent perspective of the effectiveness of controls in place.
    • AWS customers that have signed a non-disclosure agreement with AWS may request a copy of the SOC 1 Type II report.
  • Third-party access
    • AWS strictly controls access to data centers, even for internal employees.
    • Third parties are not provided access to AWS data centers except when explicitly approved by the appropriate AWS data center manager per the AWS access policy
  • Multi-tenancy
    • AWS environment is a virtualized, multi-tenant environment.
    • AWS has implemented security management processes, PCI controls, and other security controls designed to isolate each customer from other customers.
    • AWS systems are designed to prevent customers from accessing physical hosts or instances not assigned to them by filtering through the virtualization software.
  • Hypervisor vulnerabilities
    • Amazon EC2 utilizes a highly customized version of Xen hypervisor.
    • Hypervisor is regularly assessed for new and existing vulnerabilities and attack vectors by internal and external penetration teams, and is well suited for maintaining strong isolation between guest virtual machines
  • Vulnerability management
    • AWS is responsible for patching systems supporting the delivery of service to customers, such as the hypervisor and networking services
  • Encryption
    • AWS allows customers to use their own encryption mechanisms for nearly all the services, including S3, EBS, SimpleDB, and EC2.
    • IPSec tunnels to VPC are also encrypted
  • Data isolation
    • All data stored by AWS on behalf of customers has strong tenant isolation security and control capabilities
  • Composite services
    • AWS does not leverage any third-party cloud providers to deliver AWS services to customers.
  • Distributed Denial Of Service (DDoS) attacks
    • AWS network provides significant protection against traditional network security issues and the customer can implement further protection
  • Data portability
    • AWS allows customers to move data as needed on and off AWS storage
  • Service & Customer provider business continuity
    • AWS does operate a business continuity program
    • AWS data centers incorporate physical protection against environmental risks.
    • AWS’ physical protection against environmental risks has been validated by an independent auditor and has been certified
    • AWS provides customers with the capability to implement a robust continuity plan with multi region/AZ deployment architectures, backups, data redundancy replication
  • Capability to scale
    • AWS cloud is distributed, highly secure and resilient, giving customers massive scale potential.
    • Customers may scale up or down, paying for only what they use
  • Service availability
    • AWS commits to high levels of availability in its service level agreements (SLAs), e.g. 99.9% for S3
  • Application Security
    • AWS system development lifecycle incorporates industry best practices which include formal design reviews by the AWS Security Team, source code analysis, threat modeling and completion of a risk assessment
    • AWS does not generally outsource development of software.
  • Threat and Vulnerability Management
    • AWS Security regularly engages independent security firms to perform external vulnerability threat assessments
    • AWS Security regularly scans all Internet-facing service endpoint IP addresses for vulnerabilities, but these scans do not include customer instances
    • AWS Security notifies the appropriate parties to remediate any identified vulnerabilities.
    • Customers can request permission to conduct scans and Penetration tests of their cloud infrastructure as long as they are limited to the customer’s instances and do not violate the AWS Acceptable Use Policy. Advance approval for these types of scans is required
  • Data Security
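
To make the availability percentage in the Service availability item above concrete, the following sketch converts an SLA percentage into a downtime budget. The function name and the 30-day period are assumptions for illustration; the actual measurement period and terms are defined by each service's SLA.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Maximum downtime, in minutes, that still meets the given SLA
    percentage over a period of `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


# 99.9% over an assumed 30-day month
downtime_999 = allowed_downtime_minutes(99.9)
```

For 99.9% over a 30-day period this works out to roughly 43.2 minutes of permitted downtime.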

AWS Certification Exam Practice Questions

  1. When preparing for a compliance assessment of your system built inside of AWS, what are three best practices for you to prepare for an audit? Choose 3 answers
    1. Gather evidence of your IT operational controls (Customer still needs to gather all the IT operation controls in line with their environment)
    2. Request and obtain applicable third-party audited AWS compliance reports and certifications (Customers can request the reports and certifications produced by our third-party auditors or can request more information about AWS Compliance)
    3. Request and obtain a compliance and security tour of an AWS data center for a pre-assessment security review (AWS does not allow data center tour)
    4. Request and obtain approval from AWS to perform relevant network scans and in-depth penetration tests of your system’s Instances and endpoints (AWS requires prior approval to be taken to perform penetration tests)
    5. Schedule meetings with AWS’s third-party auditors to provide evidence of AWS compliance that maps to your control objectives (Customers can request the reports and certifications produced by our third-party auditors or can request more information about AWS Compliance)
  2. In the shared security model, AWS is responsible for which of the following security best practices (check all that apply) :
    1. Penetration testing
    2. Operating system account security management
    3. Threat modeling
    4. User group access management
    5. Static code analysis
  3. You are running a web-application on AWS consisting of the following components: an Elastic Load Balancer (ELB), an Auto-Scaling Group of EC2 instances running Linux/PHP/Apache, and Relational DataBase Service (RDS) MySQL. Which security measures fall into AWS’s responsibility?
    1. Protect the EC2 instances against unsolicited access by enforcing the principle of least-privilege access (Customer owned)
    2. Protect against IP spoofing or packet sniffing
    3. Assure all communication between EC2 instances and ELB is encrypted (Customer owned)
    4. Install latest security patches on ELB, RDS and EC2 instances (Customer owned)
  4. Which of the following statements is true about achieving PCI certification on the AWS platform? (Choose 2)
    1. Your organization owns the compliance initiatives related to anything placed on the AWS infrastructure
    2. Amazon EC2 instances must run on a single-tenancy environment (dedicated instance)
    3. AWS manages card-holder environments
    4. AWS Compliance provides assurance related to the underlying infrastructure

AWS Storage Options – Whitepaper – Certification

Storage Options Whitepaper

AWS Storage Options is one of the most important whitepapers for the AWS Solution Architect Professional Certification exam. It gives a brief summary of each AWS storage option, along with its ideal usage patterns, anti-patterns, performance, durability and availability, scalability etc.

Overview

  • AWS offers multiple cloud-based storage options. Each has a unique combination of performance, durability, availability, cost, and interface, as well as other characteristics such as scalability and elasticity
  • Each storage option is ideally suited to some use cases, and there are certain anti-patterns which should be taken into account while making a storage choice

AWS Various Storage Options

Amazon S3 & Amazon Glacier

More details @ AWS Storage Options – S3 & Glacier

Amazon Elastic Block Store (EBS) & Instance Store Volumes

More details @ AWS Storage Options – EBS & Instance Store

Amazon RDS, DynamoDB & Database on EC2

More details @ AWS Storage Options – RDS, DynamoDB & Database on EC2

Amazon SQS & Redshift

More details @ AWS Storage Options – SQS & Redshift

Amazon CloudFront & ElastiCache

More details @ AWS Storage Options – CloudFront & ElastiCache

Amazon Storage Gateway & Import/Export

More details @ AWS Storage Options – Storage Gateway & Import/Export

AWS Certification Exam Practice Questions

  1. You are developing a highly available web application using stateless web servers. Which services are suitable for storing session state data? Choose 3 answers.
    1. Elastic Load Balancing
    2. Amazon Relational Database Service (RDS)
    3. Amazon CloudWatch
    4. Amazon ElastiCache
    5. Amazon DynamoDB
    6. AWS Storage Gateway
  2. Your firm has uploaded a large amount of aerial image data to S3. In the past, in your on-premises environment, you used a dedicated group of servers to batch process this data and used RabbitMQ, an open source messaging system, to get job information to the servers. Once processed the data would go to tape and be shipped offsite. Your manager told you to stay with the current design, and leverage AWS archival storage and messaging services to minimize cost. Which is correct? [PROFESSIONAL]
    1. Use SQS for passing job messages, use CloudWatch alarms to terminate EC2 worker instances when they become idle. Once data is processed, change the storage class of the S3 objects to Reduced Redundancy Storage.
    2. Setup Auto-Scaled workers triggered by queue depth that use spot instances to process messages in SQS. Once data is processed, change the storage class of the S3 objects to Reduced Redundancy Storage.
    3. Setup Auto-Scaled workers triggered by queue depth that use spot instances to process messages in SQS. Once data is processed, change the storage class of the S3 objects to Glacier.
    4. Use SNS to pass job messages, use CloudWatch alarms to terminate spot worker instances when they become idle. Once data is processed, change the storage class of the S3 object to Glacier.
  3. You are developing a new mobile application and are considering storing user preferences in AWS, which would provide a more uniform cross-device experience to users using multiple mobile devices to access the application. The preference data for each user is estimated to be 50KB in size. Additionally 5 million customers are expected to use the application on a regular basis. The solution needs to be cost-effective, highly available, scalable and secure, how would you design a solution to meet the above requirements? [PROFESSIONAL]
    1. Setup an RDS MySQL instance in 2 availability zones to store the user preference data. Deploy a public facing application on a server in front of the database to manage security and access credentials
    2. Setup a DynamoDB table with an item for each user having the necessary attributes to hold the user preferences. The mobile application will query the user preferences directly from the DynamoDB table. Utilize STS, Web Identity Federation, and DynamoDB Fine Grained Access Control to authenticate and authorize access
    3. Setup an RDS MySQL instance with multiple read replicas in 2 availability zones to store the user preference data. The mobile application will query the user preferences from the read replicas. Leverage the MySQL user management and access privilege system to manage security and access credentials.
    4. Store the user preference data in S3. Setup a DynamoDB table with an item for each user and an item attribute pointing to the user’s S3 object. The mobile application will retrieve the S3 URL from DynamoDB and then access the S3 object directly. Utilize STS, Web Identity Federation, and S3 ACLs to authenticate and authorize access.
  4. A company is building a voting system for a popular TV show. Viewers would watch the performances then visit the show’s website to vote for their favorite performer. It is expected that in a short period of time after the show has finished the site will receive millions of visitors. The visitors will first login to the site using their Amazon.com credentials and then submit their vote. After the voting is completed the page will display the vote totals. The company needs to build the site such that it can handle the rapid influx of traffic while maintaining good performance but also wants to keep costs to a minimum. Which of the design patterns below should they use? [PROFESSIONAL]
    1. Use CloudFront and an Elastic Load Balancer in front of an auto-scaled set of web servers, the web servers will first call the Login With Amazon service to authenticate the user then process the user’s vote and store the result into a multi-AZ Relational Database Service instance.
    2. Use CloudFront and the static website hosting feature of S3 with the Javascript SDK to call the Login With Amazon service to authenticate the user, use IAM Roles to gain permissions to a DynamoDB table to store the users vote.
    3. Use CloudFront and an Elastic Load Balancer in front of an auto-scaled set of web servers, the web servers will first call the Login with Amazon service to authenticate the user, the web servers will process the users vote and store the result into a DynamoDB table using IAM Roles for EC2 instances to gain permissions to the DynamoDB table.
    4. Use CloudFront and an Elastic Load Balancer in front of an auto-scaled set of web servers, the web servers will first call the Login With Amazon service to authenticate the user, the web servers would process the user’s vote and store the result into an SQS queue using IAM Roles for EC2 Instances to gain permissions to the SQS queue. A set of application servers will then retrieve the items from the queue and store the result into a DynamoDB table
  5. A large real-estate brokerage is exploring the option of adding a cost-effective location-based alert to their existing mobile application. The application backend infrastructure currently runs on AWS. Users who opt in to this service will receive alerts on their mobile device regarding real-estate offers in proximity to their location. For the alerts to be relevant, delivery time needs to be in the low minute count. The existing mobile app has 5 million users across the US. Which one of the following architectural suggestions would you make to the customer? [PROFESSIONAL]
    1. Mobile application will submit its location to a web service endpoint utilizing Elastic Load Balancing and EC2 instances. DynamoDB will be used to store and retrieve relevant offers. EC2 instances will communicate with mobile carriers/device providers to push alerts back to mobile application.
    2. Use AWS Direct Connect or VPN to establish connectivity with mobile carriers. EC2 instances will receive the mobile application’s location through carrier connection. RDS will be used to store and retrieve relevant offers. EC2 instances will communicate with mobile carriers to push alerts back to the mobile application
    3. Mobile application will send device location using SQS. EC2 instances will retrieve the relevant offers from DynamoDB. AWS Mobile Push will be used to send offers to the mobile application
    4. Mobile application will send device location using AWS Mobile Push. EC2 instances will retrieve the relevant offers from DynamoDB. EC2 instances will communicate with mobile carriers/device providers to push alerts back to the mobile application.
  6. You are running a news website in the eu-west-1 region that updates every 15 minutes. The website has a worldwide audience and it uses an Auto Scaling group behind an Elastic Load Balancer and an Amazon RDS database. Static content resides on Amazon S3, and is distributed through Amazon CloudFront. Your Auto Scaling group is set to trigger a scale up event at 60% CPU utilization; you use an Amazon RDS extra-large DB instance with 10,000 Provisioned IOPS; its CPU utilization is around 80%, while freeable memory is in the 2 GB range. Web analytics reports show that the average load time of your web pages is around 1.5 to 2 seconds, but your SEO consultant wants to bring down the average load time to under 0.5 seconds. How would you improve page load times for your users? (Choose 3 answers) [PROFESSIONAL]
    1. Lower the scale up trigger of your Auto Scaling group to 30% so it scales more aggressively.
    2. Add an Amazon ElastiCache caching layer to your application for storing sessions and frequent DB queries
    3. Configure Amazon CloudFront dynamic content support to enable caching of re-usable content from your site
    4. Switch Amazon RDS database to the high memory extra-large Instance type
    5. Set up a second installation in another region, and use the Amazon Route 53 latency-based routing feature to select the right region.
  7. A read-only news reporting site with a combined web and application tier and a database tier that receives large and unpredictable traffic demands must be able to respond to these traffic fluctuations automatically. What AWS services should be used to meet these requirements? [PROFESSIONAL]
    1. Stateless instances for the web and application tier synchronized using ElastiCache Memcached in an autoscaling group monitored with CloudWatch. And RDS with read replicas.
    2. Stateful instances for the web and application tier in an autoscaling group monitored with CloudWatch and RDS with read replicas
    3. Stateful instances for the web and application tier in an autoscaling group monitored with CloudWatch. And multi-AZ RDS
    4. Stateless instances for the web and application tier synchronized using ElastiCache Memcached in an autoscaling group monitored with CloudWatch and multi-AZ RDS
  8. You have a periodic image analysis application that gets some files as input, analyzes them and for each file writes some data in output to a text file. The number of files in input per day is high and concentrated in a few hours of the day. Currently you have a server on EC2 with a large EBS volume that hosts the input data and the results; it takes almost 20 hours per day to complete the process. What services could be used to reduce the elaboration time and improve the availability of the solution? [PROFESSIONAL]
    1. S3 to store I/O files. SQS to distribute elaboration commands to a group of hosts working in parallel. Auto scaling to dynamically size the group of hosts depending on the length of the SQS queue
    2. EBS with Provisioned IOPS (PIOPS) to store I/O files. SNS to distribute elaboration commands to a group of hosts working in parallel. Auto Scaling to dynamically size the group of hosts depending on the number of SNS notifications
    3. S3 to store I/O files. SNS to distribute elaboration commands to a group of hosts working in parallel. Auto scaling to dynamically size the group of hosts depending on the number of SNS notifications
    4. EBS with Provisioned IOPS (PIOPS) to store I/O files. SQS to distribute elaboration commands to a group of hosts working in parallel. Auto Scaling to dynamically size the group of hosts depending on the length of the SQS queue.
  9. A 3-tier e-commerce web application is currently deployed on-premises and will be migrated to AWS for greater scalability and elasticity. The web server currently shares read-only data using a network distributed file system. The app server tier uses a clustering mechanism for discovery and shared session state that depends on IP multicast. The database tier uses shared-storage clustering to provide database failover capability, and uses several read slaves for scaling. Data on all servers and the distributed file system directory is backed up weekly to off-site tapes. Which AWS storage and database architecture meets the requirements of the application? [PROFESSIONAL]
    1. Web servers store read-only data in S3, and copy from S3 to root volume at boot time. App servers share state using a combination of DynamoDB and IP unicast. Database uses RDS with multi-AZ deployment and one or more Read Replicas. Backup: web and app servers backed up weekly via AMIs; database backed up via DB snapshots.
    2. Web servers store read-only data in S3, and copy from S3 to root volume at boot time. App servers share state using a combination of DynamoDB and IP unicast. Database uses RDS with multi-AZ deployment and one or more Read Replicas. Backup: web servers, app servers, and database backed up weekly to Glacier using snapshots. (Snapshots to Glacier don’t work directly with EBS snapshots)
    3. Web servers store read-only data in S3, and copy from S3 to root volume at boot time. App servers share state using a combination of DynamoDB and IP unicast. Database uses RDS with multi-AZ deployment. Backup: web and app servers backed up weekly via AMIs; database backed up via DB snapshots. (Need Read Replicas for scalability and elasticity)
    4. Web servers store read-only data in an EC2 NFS server, and mount it to each web server at boot time. App servers share state using a combination of DynamoDB and IP multicast. Database uses RDS with multi-AZ deployment and one or more Read Replicas. Backup: web and app servers backed up weekly via AMIs; database backed up via DB snapshots. (IP multicast not available in AWS)
  10. Our company is getting ready to do a major public announcement of a social media site on AWS. The website is running on EC2 instances deployed across multiple Availability Zones with a Multi-AZ RDS MySQL Extra Large DB Instance. The site performs a high number of small reads and writes per second and relies on an eventual consistency model. After comprehensive tests you discover that there is read contention on RDS MySQL. Which are the best approaches to meet these requirements? (Choose 2 answers) [PROFESSIONAL]
    1. Deploy ElastiCache in-memory cache running in each availability zone
    2. Implement sharding to distribute load to multiple RDS MySQL instances (Would distribute both reads and writes, while the focus is on read contention)
    3. Increase the RDS MySQL Instance size and implement provisioned IOPS (Would improve both reads and writes, while the focus is on read contention)
    4. Add an RDS MySQL read replica in each availability zone
  11. You run a 2-tier app with the following: an ELB, three web app servers on EC2, and 1 MySQL RDS DB. With growing load, DB queries take longer and longer and slow down the overall response time for user requests. Which options could speed up performance? (Choose 3) [PROFESSIONAL]
    1. Create an RDS read-replica and redirect half of the database read requests to it
    2. Cache database queries in Amazon ElastiCache
    3. Setup RDS in multi-availability zone mode.
    4. Shard the database and distribute loads between shards.
    5. Use Amazon CloudFront to cache database queries.
  12. You have a web application leveraging an Elastic Load Balancer (ELB) in front of the web servers deployed using an Auto Scaling Group. Your database is running on Relational Database Service (RDS). The application serves out technical articles and responses to them; in general there are more views of an article than there are responses to the article. On occasion, an article on the site becomes extremely popular, resulting in significant traffic increases that cause the site to go down. What could you do to help alleviate the pressure on the infrastructure while maintaining availability during these events? Choose 3 answers [PROFESSIONAL]
    1. Leverage CloudFront for the delivery of the articles.
    2. Add RDS read-replicas for the read traffic going to your relational database
    3. Leverage ElastiCache for caching the most frequently used data.
    4. Use SQS to queue up the requests for the technical posts and deliver them out of the queue (does not process and would not be real time)
    5. Use Route53 health checks to fail over to an S3 bucket for an error page (more of an error handling than availability)
  13. Your website is serving on-demand training videos to your workforce. Videos are uploaded monthly in high resolution MP4 format. Your workforce is distributed globally, often on the move, and using company-provided tablets that require the HTTP Live Streaming (HLS) protocol to watch a video. Your company has no video transcoding expertise and, if required, you might need to pay for a consultant. How do you implement the most cost-efficient architecture without compromising high availability and quality of video delivery? [PROFESSIONAL]
    1. Elastic Transcoder to transcode original high-resolution MP4 videos to HLS. S3 to host videos with Lifecycle Management to archive original files to Glacier after a few days. CloudFront to serve HLS transcoded videos from S3 (Elastic Transcoder for High quality, S3 to host videos cheaply, Glacier for archives and CloudFront for high availability)
    2. A video transcoding pipeline running on EC2 using SQS to distribute tasks and Auto Scaling to adjust the number of nodes depending on the length of the queue. S3 to host videos with Lifecycle Management to archive all files to Glacier after a few days. CloudFront to serve HLS transcoded videos from Glacier
    3. Elastic Transcoder to transcode original high-resolution MP4 videos to HLS. EBS volumes to host videos and EBS snapshots to incrementally backup original files after a few days. CloudFront to serve HLS transcoded videos from EC2.
    4. A video transcoding pipeline running on EC2 using SQS to distribute tasks and Auto Scaling to adjust the number of nodes depending on the length of the queue. EBS volumes to host videos and EBS snapshots to incrementally backup original files after a few days. CloudFront to serve HLS transcoded videos from EC2
  14. To meet regulatory requirements, a pharmaceuticals company needs to archive data after a drug trial test is concluded. Each drug trial test may generate up to several thousands of files, with compressed file sizes ranging from 1 byte to 100MB. Once archived, data rarely needs to be restored, and on the rare occasion when restoration is needed, the company has 24 hours to restore specific files that match certain metadata. Searches must be possible by numeric file ID, drug name, participant names, date ranges, and other metadata. Which is the most cost-effective architectural approach that can meet the requirements? [PROFESSIONAL]
    1. Store individual files in Amazon Glacier, using the file ID as the archive name. When restoring data, query the Amazon Glacier vault for files matching the search criteria. (Individual files are expensive and does not allow searching by participant names etc)
    2. Store individual files in Amazon S3, and store search metadata in an Amazon Relational Database Service (RDS) multi-AZ database. Create a lifecycle rule to move the data to Amazon Glacier after a certain number of days. When restoring data, query the Amazon RDS database for files matching the search criteria, and move the files matching the search criteria back to S3 Standard class. (As the data is not needed can be stored to Glacier directly and the data need not be moved back to S3 standard)
    3. Store individual files in Amazon Glacier, and store the search metadata in an Amazon RDS multi-AZ database. When restoring data, query the Amazon RDS database for files matching the search criteria, and retrieve the archive name that matches the file ID returned from the database query. (Individual files and Multi-AZ is expensive)
    4. First, compress and then concatenate all files for a completed drug trial test into a single Amazon Glacier archive. Store the associated byte ranges for the compressed files along with other search metadata in an Amazon RDS database with regular snapshotting. When restoring data, query the database for files that match the search criteria, and create restored files from the retrieved byte ranges.
    5. Store individual compressed files and search metadata in Amazon Simple Storage Service (S3). Create a lifecycle rule to move the data to Amazon Glacier, after a certain number of days. When restoring data, query the Amazon S3 bucket for files matching the search criteria, and retrieve the file to S3 reduced redundancy in order to move it back to S3 Standard class. (Once the data is moved from S3 to Glacier the metadata is lost, as Glacier does not have metadata and must be maintained externally)
  15. A document storage company is deploying their application to AWS and changing their business model to support both free tier and premium tier users. The premium tier users will be allowed to store up to 200GB of data and free tier customers will be allowed to store only 5GB. The customer expects that billions of files will be stored. All users need to be alerted when approaching 75 percent quota utilization and again at 90 percent quota use. To support the free tier and premium tier users, how should they architect their application? [PROFESSIONAL]
    1. The company should utilize an Amazon Simple Workflow Service activity worker that updates the user’s data counter in Amazon DynamoDB. The activity worker will use Simple Email Service to send an email if the counter increases above the appropriate thresholds.
    2. The company should deploy an Amazon Relational Database Service relational database with a stored objects table that has a row for each stored object along with the size of each object. The upload server will query the aggregate consumption of the user in question (by first determining the files stored by the user, and then querying the stored objects table for the respective file sizes) and send an email via Amazon Simple Email Service if the thresholds are breached.
    3. The company should write both the content length and the username of the file’s owner as S3 metadata for the object. They should then create a file watcher to iterate over each object and aggregate the size for each user and send a notification via Amazon Simple Queue Service to an emailing service if the storage threshold is exceeded.
    4. The company should create two separate Amazon Simple Storage Service buckets, one for data storage for free tier users and another for data storage for premium tier users. An Amazon Simple Workflow Service activity worker will query all objects for a given user based on the bucket the data is stored in.
  16. Your company has been contracted to develop and operate a website that tracks NBA basketball statistics. Statistical data to derive reports like “best game-winning shots from the regular season” and more frequently built reports like “top shots of the game” need to be stored durably for repeated lookup. Leveraging social media techniques, NBA fans submit and vote on new report types from the existing data set so the system needs to accommodate variability in data queries and new static reports must be generated and posted daily. Initial research in the design phase indicates that there will be over 3 million report queries on game day by end users and other applications that use this application as a data source. It is expected that this system will gain in popularity over time and reach peaks of 10-15 million report queries of the system on game days. Select the answer that will allow your application to best meet these requirements while minimizing costs. [PROFESSIONAL]
    1. Launch a multi-AZ MySQL Amazon Relational Database Service (RDS) Read Replica connected to your multi AZ master database and generate reports by querying the Read Replica. Perform a daily table cleanup.
    2. Implement a multi-AZ MySQL RDS deployment and have the application generate reports from Amazon ElastiCache for in-memory performance results. Utilize the default expire parameter for items in the cache.
    3. Generate reports from a multi-AZ MySQL Amazon RDS deployment and have an offline task put reports in Amazon Simple Storage Service (S3) and use CloudFront to cache the content. Use a TTL to expire objects daily. (Offline task with S3 storage and CloudFront cache)
    4. Query a multi-AZ MySQL RDS instance and store the results in a DynamoDB table. Generate reports from the DynamoDB table. Remove stale tables daily.
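
The archival approach described in question 14 (concatenate the compressed files into one archive and keep the byte ranges in a separate metadata store) can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory bytes object stands in for the Glacier archive, a plain dict stands in for the metadata database, and the file names are made up.

```python
import gzip

def build_archive(files):
    """Compress each file and concatenate the results into one archive.

    Returns (archive_bytes, index) where index maps a file name to its
    (start, end) byte offsets in the archive, end exclusive.
    """
    archive = bytearray()
    index = {}
    for name, data in files.items():
        blob = gzip.compress(data)
        start = len(archive)
        archive.extend(blob)
        index[name] = (start, len(archive))
    return bytes(archive), index

def restore_file(archive, index, name):
    """Recreate one file from its recorded byte range."""
    start, end = index[name]
    return gzip.decompress(archive[start:end])

# Hypothetical trial files; the index dict plays the role of the metadata table.
files = {
    "trial-001/report.txt": b"dose response data",
    "trial-001/notes.txt": b"participant notes",
}
archive, index = build_archive(files)
restored = restore_file(archive, index, "trial-001/notes.txt")
```

Slicing the in-memory archive stands in for retrieving only the needed byte range; the search metadata (drug name, participant names, dates) would live alongside the offsets in the database.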

Storage Options Whitepaper – Storage Gateway – Import/Export – AWS Certification

AWS Storage Options Whitepaper cont.

Provides a brief summary for the Ideal Use cases and Anti-Patterns for Storage Gateway and Import/Export AWS storage options

AWS Storage Gateway

  • Storage Gateway is a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between the organization’s on-premises IT environment and AWS’s storage infrastructure.
  • Storage Gateway enables storing data securely in the AWS cloud for scalable and cost-effective storage.
  • It provides low-latency performance by maintaining frequently accessed data on-premises while securely storing all of your data encrypted in S3.
  • For disaster recovery scenarios, it can serve as a cloud-hosted solution, together with EC2, that mirrors your entire production environment.
  • Storage Gateway can be configured as
    • Gateway-cached volumes
      • Gateway-cached volumes utilize S3 for primary data storage, while retaining frequently accessed data locally in a cache.
      • These volumes minimize the need to scale the on-premises storage infrastructure, while still providing applications with low-latency access to their frequently accessed data.
      • Data written to the volumes is stored in S3, and only a cache of recently written and recently read data is stored locally on the on-premises storage hardware.
    • Gateway-stored volumes
      • Gateway-stored volumes store the complete primary data locally, while asynchronously backing up that data to AWS.
      • These volumes provide the on-premises applications with low-latency access to their entire datasets, while providing durable, off-site backups.
      • Data written to the gateway-stored volumes is stored on the on-premises storage hardware, and asynchronously backed up to S3 in the form of EBS snapshots.
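
The gateway-cached behaviour described above (everything lives in S3, only recently used data stays local) can be modelled with a small least-recently-used cache. This is a toy model, not how the gateway is implemented: the `CachedVolume` class, its dict-based backing store, and the synchronous write to the backing store are all simplifying assumptions.

```python
from collections import OrderedDict

class CachedVolume:
    """Toy model of a gateway-cached volume.

    Every block is kept in `backing` (standing in for S3); only the most
    recently used blocks are kept in the local `cache`.
    """

    def __init__(self, cache_blocks):
        self.cache_blocks = cache_blocks
        self.backing = {}           # stands in for S3: holds every block
        self.cache = OrderedDict()  # local cache, least recently used first
        self.remote_reads = 0       # reads that had to go to the backing store

    def _remember(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict the least recently used block

    def write(self, block_id, data):
        self.backing[block_id] = data
        self._remember(block_id, data)

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        self.remote_reads += 1
        data = self.backing[block_id]
        self._remember(block_id, data)
        return data

vol = CachedVolume(cache_blocks=2)
for i in range(3):
    vol.write(i, f"block-{i}".encode())
vol.read(2)  # served from the local cache
vol.read(0)  # evicted earlier, so fetched from the backing store
```

A gateway-stored volume would invert the roles: the full dataset stays local and the backing store only receives asynchronous backups.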

Ideal Usage Patterns

  • AWS Storage Gateway use cases include
    • corporate file sharing,
    • enabling existing on-premises backup applications to store primary backups on S3,
    • disaster recovery, and
    • data mirroring to cloud-based compute resources.

Anti-Patterns

  • Database storage
    • For database storage and workloads, EC2 instances using EBS volumes are a more natural choice.

Performance

  • As the Storage Gateway VM sits between the application, underlying on-premises storage and S3, the performance experienced will be dependent upon a number of factors, including the speed and configuration of the underlying local disks, the network bandwidth between the iSCSI initiator and gateway VM, the amount of local storage allocated to the gateway VM, and the bandwidth between the gateway VM and S3.
  • For gateway-cached volumes, to provide low-latency read access to the on-premises applications, it’s important to provide enough local cache storage to store the recently accessed data.
  • Storage Gateway efficiently uses the Internet bandwidth to speed up the upload of on-premises application data to AWS.
  • Storage Gateway only uploads incremental changes (data that has changed), which minimizes the amount of data sent over the Internet.
  • AWS Direct Connect can be used to further increase throughput and reduce the network costs by establishing a dedicated network connection between the on-premises gateway and AWS.
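
To illustrate why uploading only incremental changes saves bandwidth, the sketch below compares two versions of a volume block by block and lists the blocks that would need uploading. The source does not describe how the gateway tracks changes; comparing SHA-256 digests and the tiny 4-byte block size are assumptions chosen for demonstration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen so the example is easy to follow

def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indexes of blocks in `new` that are new or differ from `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or old_digests[i] != digest
    ]

old = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCCDDDD"
to_upload = changed_blocks(old, new)  # block 1 changed, block 3 was appended
```

Only two of the four blocks would be sent, which is the effect the bullet on incremental changes describes.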

Durability and Availability

  • AWS Storage Gateway durably stores on-premises application data by uploading it to S3.
  • S3 stores data in multiple facilities and on multiple devices within each facility.
  • S3 also performs regular, systematic data integrity checks and is built to be automatically self-healing.

Cost Model

  • AWS Storage Gateway has four pricing components:
    • gateway usage (per gateway per month),
    • snapshot storage usage (per GB per month),
    • volume storage usage (per GB per month), and
    • data transfer out (per GB per month).
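
The four pricing components combine additively, which a short function can make explicit. The rates below are placeholders invented for the example, not actual AWS prices; look up current pricing before estimating a real bill.

```python
def monthly_gateway_cost(gateways, snapshot_gb, volume_gb, transfer_out_gb, rates):
    """Sum the four Storage Gateway pricing components for one month."""
    return (
        gateways * rates["gateway_per_month"]
        + snapshot_gb * rates["snapshot_per_gb_month"]
        + volume_gb * rates["volume_per_gb_month"]
        + transfer_out_gb * rates["transfer_out_per_gb"]
    )

# Placeholder rates (assumptions for illustration only).
example_rates = {
    "gateway_per_month": 100.0,
    "snapshot_per_gb_month": 0.05,
    "volume_per_gb_month": 0.02,
    "transfer_out_per_gb": 0.10,
}

estimate = monthly_gateway_cost(
    gateways=1, snapshot_gb=500, volume_gb=1000, transfer_out_gb=50,
    rates=example_rates,
)
```

With these placeholder rates the estimate works out to roughly 150 per month (100 + 25 + 20 + 5).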

Scalability and Elasticity

  • AWS Storage Gateway stores data in Amazon S3, which has been designed to offer a very high level of scalability and elasticity automatically.

Interfaces

  • AWS Management Console can be used to download the AWS Storage Gateway VM image, select between a gateway-cached or gateway-stored configuration, activate the on-premises gateway by associating the gateway’s IP Address with your AWS account, select an AWS region, and create AWS Storage Gateway volumes and attach these volumes as iSCSI devices to your on-premises application servers.

AWS Import/Export (Upgraded to Snowball)

  • AWS Import/Export accelerates moving large amounts of data into and out of AWS using portable storage devices for transport.
  • AWS transfers the data directly onto and off of storage devices using Amazon’s high-speed internal network, bypassing the Internet. This can be much faster and more cost effective than upgrading connectivity.
  • AWS Import/Export supports importing into several types of AWS storage, including EBS snapshots, S3 buckets, and Glacier vaults and exporting data from S3.

Ideal Usage Patterns

  • AWS Import/Export is ideal for transferring large amounts of data in and out of the AWS cloud, especially in cases where transferring the data over the Internet would be too slow (a week or more) or too costly.
  • Common use cases include
    • initial data upload to AWS,
    • content distribution or regular data interchange to/from your customers or business associates,
    • transfer to Amazon S3 or Amazon Glacier for off-site backup and archival storage, and
    • quick retrieval of large backups from Amazon S3 or Amazon Glacier for disaster recovery.

Anti-Patterns

  • AWS Import/Export may not be the ideal solution for data that is more easily transferred over the Internet in less than one week.
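
The one-week rule of thumb can be checked with simple arithmetic: divide the data size by the usable link speed. The helper below is a rough sketch; the `utilization` parameter (the fraction of the link actually available for the transfer) is an assumption added for illustration.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte, in bytes

def transfer_days(data_bytes, link_bits_per_second, utilization=0.8):
    """Estimate how many days an Internet transfer would take."""
    seconds = data_bytes * 8 / (link_bits_per_second * utilization)
    return seconds / SECONDS_PER_DAY

# 10 TB over a fully used 1 Mbps link versus a fully used 1 Gbps link.
slow_link_days = transfer_days(10 * TB, 1_000_000, utilization=1.0)
fast_link_days = transfer_days(10 * TB, 1_000_000_000, utilization=1.0)

ship_devices_instead = slow_link_days > 7
```

At 1 Mbps the 10 TB transfer takes well over two years, so shipping devices is clearly faster; at 1 Gbps it finishes in under a day and the Internet is the easier path.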

Performance

  • Each AWS Import/Export station is capable of loading data at over 100 MB per second.
  • The rate of the data load will be bounded by a combination of the read or write speed of the portable storage device and, for Amazon S3 data loads, the average object (file) size.

Durability and Availability

  • After the data has been imported, the durability and availability characteristics of the target storage (EBS, S3 or Glacier) apply.

Cost Model

  • AWS Import/Export has three pricing components: a per-device fee, a data load time charge (per data-loading-hour), and possible return shipping charges (for expedited shipping, or shipping to destinations not local to that AWS Import/Export region).
  • For the destination storage, the standard Amazon EBS snapshot, Amazon S3, and Amazon Glacier request and storage pricing applies.
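
The three job-level pricing components, together with the station load rate mentioned under Performance, give a quick way to estimate a job. The fee values below are placeholders, not actual AWS prices, and the 100 MB per second rate is treated as a best case since the device itself may be slower.

```python
STATION_BYTES_PER_SECOND = 100 * 10**6  # "over 100 MB per second", used as a best case

def load_hours(data_bytes, bytes_per_second=STATION_BYTES_PER_SECOND):
    """Hours needed to load the data at the given sustained rate."""
    return data_bytes / bytes_per_second / 3600

def import_job_cost(data_bytes, device_fee, per_load_hour, return_shipping=0.0,
                    bytes_per_second=STATION_BYTES_PER_SECOND):
    """Per-device fee + data-loading-hour charge + optional return shipping."""
    hours = load_hours(data_bytes, bytes_per_second)
    return device_fee + hours * per_load_hour + return_shipping

four_tb = 4 * 10**12
hours = load_hours(four_tb)
# Placeholder fees (assumptions for illustration only).
cost = import_job_cost(four_tb, device_fee=80.0, per_load_hour=2.5)
```

A 4 TB device needs about 11 hours at the best-case rate; destination storage and request charges would be added on top of the job cost.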

Scalability and Elasticity

  • Total amount of data you can load using AWS Import/Export is limited only by the capacity of the devices sent to AWS.
  • For Amazon S3, individual files will be loaded as objects in Amazon S3, and may range up to 5 terabytes in size.
  • For Amazon Glacier, individual devices will be loaded as a single archive, and may range up to 4 terabytes in size.
  • Aggregate total amount of data that can be imported is virtually unlimited.

Interfaces

  • To upload or download data, an AWS Import/Export job needs to be created and submitted for each storage device shipped.
  • Jobs can be created using the AWS CLI, AWS SDK or native REST API.
  • Each job request requires a manifest file, a YAML-formatted text file that contains a set of key-value pairs that supply the required information—such as your device ID, secret access key, and return address—necessary to complete the job.
  • Job request is tied to the storage device through a signature file in the root directory (for Amazon S3 import jobs), or by a barcode taped to the device (for Amazon EBS and Amazon Glacier jobs).
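
Since the manifest is described as a YAML text file of key-value pairs, its general shape can be sketched by rendering a nested dict. The key names and values below are illustrative placeholders, not the service's actual manifest schema, and the tiny renderer does no quoting or escaping; a real job should follow the official manifest reference.

```python
def to_yaml_lines(mapping, indent=0):
    """Render a nested dict of simple values as YAML-style lines."""
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines

# Hypothetical manifest content; credentials are deliberately left out.
manifest = {
    "deviceId": "EXAMPLE-DEVICE-123",
    "returnAddress": {
        "name": "Jane Example",
        "city": "Seattle",
    },
}
manifest_text = "\n".join(to_yaml_lines(manifest))
```

The resulting text has one `key: value` pair per line, with nested entries such as the return address indented under their parent key.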

AWS Certification Exam Practice Questions

  1. You are working with a customer who has 10 TB of archival data that they want to migrate to Amazon Glacier. The customer has a 1-Mbps connection to the Internet. Which service or feature provides the fastest method of getting the data into Amazon Glacier?
    1. Amazon Glacier multipart upload
    2. AWS Storage Gateway
    3. VM Import/Export
    4. AWS Import/Export

AWS Storage Options – RDS, DynamoDB & Database on EC2

AWS Storage Options Whitepaper with RDS, DynamoDB & Database on EC2 Cont.

Provides a brief summary for the Ideal Use cases, Anti-Patterns and other factors for Amazon RDS, DynamoDB & Databases on EC2 storage options

Amazon RDS

  • RDS is a web service that provides the capabilities of a MySQL, Oracle, MariaDB, PostgreSQL or Microsoft SQL Server relational database as a managed, cloud-based service
  • RDS eliminates much of the administrative overhead associated with launching, managing, and scaling your own relational database on Amazon EC2 or in another computing environment.

Ideal Usage Patterns

  • RDS is a great solution for applications that need a cloud-based, fully managed relational database
  • RDS is also optimal for new applications with structured data that requires more sophisticated querying and joining capabilities than that provided by Amazon’s NoSQL database offering, DynamoDB.
  • RDS provides full compatibility with the databases supported and direct access to native database engines, code and libraries and is ideal for existing applications that rely on these databases

Anti-Patterns

  • Index and query-focused data
    • If the application doesn’t require advanced features such as joins and complex transactions and is more oriented toward indexing and querying data, DynamoDB would be more appropriate for its needs
  • Numerous BLOBs
    • If the application makes heavy use of files (audio files, videos, images, etc.), it is better to store the objects in S3 instead of using the database engine’s BLOB feature, and to use RDS or DynamoDB only for the metadata
  • Automated scalability
    • RDS provides pushbutton scaling, but it mainly scales up and has limited scale-out ability. If fully automated scaling is needed, DynamoDB may be a better choice.
  • Complete control
    • RDS does not provide admin access and does not enable the full feature set of the database engines.
    • So if the application requires complete, OS-level control of the database server with full root or admin login privileges, a self-managed database on EC2 may be a better match.
  • Other database platforms
    • RDS, at this time, provides MySQL, Oracle, MariaDB, PostgreSQL and SQL Server databases.
    • If any other database platform (such as IBM DB2, Informix, or Sybase) is needed, it should be deployed on a self-managed database on an EC2 instance by using a relational database AMI, or by installing database software on an EC2 instance.

Performance

  • RDS Provisioned IOPS is a high-performance storage option designed to deliver fast, predictable, and consistent performance for I/O-intensive transactional database workloads. The IOPS rate can be specified when the instance is launched and is guaranteed over the life of the instance

Durability and Availability

  • RDS leverages Amazon EBS volumes as its data store
  • RDS provides database backups, for enhanced durability, which are replicated across multiple AZs
    • Automated backups
      • If enabled, RDS will automatically perform a full daily backup of your data during the specified backup window, and will also capture DB transaction logs
    • User initiated backups
      • Users can initiate backups at any time, and these backups are retained until the user explicitly deletes them
  • The RDS Multi-AZ feature enhances both the durability and the availability of the database by synchronously replicating the data between a primary RDS DB instance and a standby instance in another Availability Zone, which helps prevent data loss
  • RDS provides a DNS endpoint and, in case of a failure on the primary, automatically fails over to the standby instance
  • RDS also allows Read replicas for the supported databases, which are replicated asynchronously

Cost Model

  • RDS offers a tiered pricing structure, based on the size of the database instance, the deployment type (Single-AZ/Multi-AZ), and the AWS region.
  • Pricing for RDS is based on several factors: the DB instance hours (per hour), the amount of provisioned database storage (per GB-month and per million I/O requests), additional backup storage (per GB-month), and data transfer in/out (per GB per month)

Scalability and Elasticity

  • RDS resources can be scaled elastically in several dimensions: database storage size, database storage IOPS rate, database instance compute capacity, and the number of read replicas
  • RDS supports “pushbutton scaling” of both database storage and compute resources. Additional storage can either be added immediately or during the next maintenance cycle
  • RDS for MySQL also enables you to scale out beyond the capacity of a single database deployment for read-heavy database workloads by creating one or more read replicas.
  • Multiple RDS instances can also be configured to leverage database partitioning or sharding to spread the workload over multiple DB instances, achieving even greater database scalability and elasticity.
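
The sharding approach mentioned above can be sketched as a small routing function that maps each record key to one of several DB instance endpoints. This is a minimal illustration, assuming simple hash-modulo routing. The endpoint names and the `shard_for` helper are hypothetical and not part of any AWS API.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, endpoints: Sequence[str]) -> str:
    """Pick the endpoint responsible for a key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then take it modulo the shard count.
    index = int.from_bytes(digest[:8], "big") % len(endpoints)
    return endpoints[index]


# Hypothetical DNS endpoints, one per DB instance holding a partition of the data.
ENDPOINTS = [
    "shard-0.example.internal",
    "shard-1.example.internal",
    "shard-2.example.internal",
]

# The same key always routes to the same shard.
print(shard_for("customer-42", ENDPOINTS) == shard_for("customer-42", ENDPOINTS))  # prints "True"
```

Hash-modulo routing is easy to reason about, but changing the number of shards remaps most keys, so real deployments often plan the shard count up front or use a lookup table instead.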

Interfaces

  • RDS APIs and the AWS Management Console provide a management interface that allows you to create, delete, modify, and terminate RDS DB instances; to create DB snapshots; and to perform point-in-time restores
  • There is no AWS data API for Amazon RDS.
  • Once a database is created, RDS provides a DNS endpoint for the database which can be used to connect to the database.
  • Endpoint does not change over the lifetime of the instance even during the failover in case of Multi-AZ configuration

Amazon DynamoDB

  • Amazon DynamoDB is a fast, fully-managed NoSQL database service that makes it simple and cost-effective to store and retrieve any amount of data, and serve any level of request traffic.
  • DynamoDB being a managed service helps offload the administrative burden of operating and scaling a highly-available distributed database cluster.
  • DynamoDB helps meet the latency and throughput requirements of highly demanding applications by providing extremely fast and predictable performance with seamless throughput and storage scalability.
  • DynamoDB provides both eventually-consistent reads (by default), and strongly-consistent reads (optional), as well as implicit item-level transactions for item put, update, delete, conditional operations, and increment/decrement.
  • Amazon DynamoDB handles data as follows:
    • DynamoDB stores structured data in tables, indexed by primary key, and allows low-latency read and write access to items.
    • DynamoDB supports three data types: number, string, and binary, in both scalar and multi-valued sets.
    • Tables do not have a fixed schema, so each data item can have a different number of attributes.
    • Primary key can either be a single-attribute hash key or a composite hash-range key.
    • Local secondary indexes provide additional flexibility for querying against attributes other than the primary key.
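
To make the key and index concepts above concrete, the sketch below builds a table definition with a composite hash-range primary key and one local secondary index. The table, attribute, and index names are hypothetical. The dictionary keys follow the parameter names of the DynamoDB CreateTable operation, but nothing here calls AWS.

```python
# Hypothetical table and attribute names, for illustration only.
table_definition = {
    "TableName": "GameScores",
    "AttributeDefinitions": [
        {"AttributeName": "UserId", "AttributeType": "S"},     # string
        {"AttributeName": "GameTitle", "AttributeType": "S"},  # string
        {"AttributeName": "TopScore", "AttributeType": "N"},   # number
    ],
    # Composite primary key: hash key plus range key.
    "KeySchema": [
        {"AttributeName": "UserId", "KeyType": "HASH"},
        {"AttributeName": "GameTitle", "KeyType": "RANGE"},
    ],
    # A local secondary index shares the table's hash key but uses a different range key.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "TopScoreIndex",
            "KeySchema": [
                {"AttributeName": "UserId", "KeyType": "HASH"},
                {"AttributeName": "TopScore", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}


def key_attributes(definition):
    """Return (attribute name, key type) pairs for the table's primary key."""
    return [(k["AttributeName"], k["KeyType"]) for k in definition["KeySchema"]]


print(key_attributes(table_definition))
# prints "[('UserId', 'HASH'), ('GameTitle', 'RANGE')]"
```

With boto3 installed and credentials configured, a dictionary like this could be passed as keyword arguments to the DynamoDB client's `create_table` call. Only the key and index attributes are declared, because all other attributes are schemaless.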

Ideal Usage Patterns

  • DynamoDB is ideal for existing or new applications that need a flexible NoSQL database with low read and write latencies, and the ability to scale storage and throughput up or down as needed without code changes or downtime.
  • DynamoDB suits use cases that require a highly available and scalable database because downtime or performance degradation has an immediate negative impact on an organization’s business, for example mobile apps, gaming, digital ad serving, live voting and audience interaction for live events, sensor networks, log ingestion, access control for web-based content, metadata storage for S3 objects, e-commerce shopping carts, and web session management

Anti-Patterns

  • Structured data with Join and/or Complex Transactions
    • If the application uses structured data and requires joins, complex transactions or other relationship infrastructure provided by traditional database platforms, it is better to use RDS or a database installed on an EC2 instance
  • Large Blob data
    • If the application uses large blob data, for example media, files, videos etc., it is better to use S3 to store the objects and use DynamoDB to store the metadata, for example name, size, content-type etc.
  • Large Objects with Low I/O rate
    • DynamoDB uses SSD drives and is optimized for workloads with a high I/O rate per GB stored. If the application stores very large amounts of data that are infrequently accessed, S3 might be a better choice
  • Prewritten application with databases
    • For porting an existing application that uses a relational database, RDS or a database installed on an EC2 instance would be a better and more seamless solution

Performance

  • SSDs and limited indexing on attributes provide high throughput and low latency and drastically reduce the cost of read and write operations.
  • Predictable performance can be achieved by defining the provisioned throughput capacity required for a given table.
  • DynamoDB handles the provisioning of resources to achieve the requested throughput rate, taking away the burden to think about instances, hardware, memory, and other factors that can affect an application’s throughput rate.
  • Provisioned throughput capacity reservations are elastic and can be increased or decreased on demand.
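
Provisioned throughput is expressed in read and write capacity units, so sizing a table is mostly arithmetic. The sketch below assumes the unit sizes in the DynamoDB documentation at the time of writing: one read unit covers one strongly consistent read per second of an item up to 4 KB, one write unit covers one write per second of an item up to 1 KB, and an eventually consistent read costs half a read unit. The function names are illustrative only.

```python
import math


def read_capacity_units(item_bytes: int, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """Estimate read capacity units, assuming 4 KB per read unit."""
    units_per_read = math.ceil(item_bytes / 4096)
    total = units_per_read * reads_per_second
    # Eventually consistent reads are assumed to cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_bytes: int, writes_per_second: int) -> int:
    """Estimate write capacity units, assuming 1 KB per write unit."""
    return math.ceil(item_bytes / 1024) * writes_per_second


item = 50 * 1024  # a 50 KB item
print(read_capacity_units(item, 100))                             # prints "1300"
print(read_capacity_units(item, 100, strongly_consistent=False))  # prints "650"
print(write_capacity_units(item, 10))                             # prints "500"
```

The numbers show why item size matters: capacity is consumed per size increment, so large items multiply the provisioned throughput a table needs.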

Durability and Availability

  • DynamoDB has built-in fault tolerance that automatically and synchronously replicates data across three AZs in a region for high availability and to help protect data against individual machine, or even facility failures.

Cost Model

  • DynamoDB has three pricing components: provisioned throughput capacity (per hour), indexed data storage (per GB per month), data transfer in or out (per GB per month)

Scalability and Elasticity

  • DynamoDB is both highly-scalable and elastic.
  • DynamoDB provides unlimited storage capacity, and the service automatically allocates more storage as the demand increases
  • Data is automatically partitioned and re-partitioned as needed, while the use of SSDs provides predictable low-latency response times at any scale.
  • DynamoDB is also elastic, in that you can simply “dial-up” or “dial-down” the read and write capacity of a table as your needs change.

Interfaces

  • DynamoDB provides a low-level REST API, as well as higher-level SDKs in different languages
  • APIs provide both a management and data interface for Amazon DynamoDB, that enable table management (creating, listing, deleting, and obtaining metadata) and working with attributes (getting, writing, and deleting attributes; query using an index, and full scan).

Databases on EC2

  • EC2 with EBS volumes allows hosting a self-managed relational database
  • Ready to use, prebuilt AMIs are also available from leading database solutions

Ideal Usage Patterns

  • A self-managed database on EC2 is ideal for users whose application requires a specific traditional relational database not supported by Amazon RDS, for example IBM DB2, Informix, or Sybase
  • It also suits users or applications that require a maximum level of administrative control and configurability, which is not provided by RDS

Anti-Patterns

  • Index and query-focused data
    • If the application doesn’t require advanced features such as joins and complex transactions and is more oriented toward indexing and querying data, DynamoDB would be more appropriate for its needs
  • Numerous BLOBs
    • If the application makes heavy use of files (audio files, videos, images, and so on), it is better to store the objects in S3 instead of using the database engine’s BLOB feature, and to use RDS or DynamoDB only for the metadata
  • Automated scalability
    • Relational databases on EC2 leverage the scalability and elasticity of the underlying AWS platform, but this requires system administrators or DBAs to perform a manual or scripted task. If you need pushbutton scaling or fully automated scaling, DynamoDB or RDS may be a better choice.
  • RDS supported database platforms
    • If the application uses an RDS-supported database engine and all the required features are available, RDS would be a better choice than a self-managed relational database on EC2

Performance

  • Performance depends on the size of the underlying EC2 instance, the number and configuration of the EBS volumes and the database itself
  • Performance can be increased by scaling up memory and compute resources by choosing a larger Amazon EC2 instance size.
  • For database storage, it is usually best to use EBS Provisioned IOPS volumes. To scale up I/O performance, the Provisioned IOPS can be increased, the number of EBS volumes changed, or software RAID 0 (disk striping) used across multiple EBS volumes, which aggregates total IOPS and bandwidth.
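
The effect of RAID 0 striping on throughput can be estimated with simple arithmetic. The sketch below assumes identical volumes and ideal striping. The optional instance-level cap is an assumed parameter that stands in for whatever limit the chosen EC2 instance type imposes.

```python
from typing import Optional


def striped_iops(volume_iops: int, volume_count: int,
                 instance_iops_limit: Optional[int] = None) -> int:
    """Estimate the aggregate IOPS of a RAID 0 set of identical volumes."""
    total = volume_iops * volume_count  # striping adds up the per-volume IOPS
    if instance_iops_limit is not None:
        total = min(total, instance_iops_limit)  # the instance can become the bottleneck
    return total


print(striped_iops(4000, 4))                             # prints "16000"
print(striped_iops(4000, 4, instance_iops_limit=10000))  # prints "10000"
```

RAID 0 provides no redundancy, so losing any one volume loses the whole stripe set, which makes snapshots or database-level backups more important.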

Durability & Availability

  • As the database on EC2 uses EBS as storage, it has the same durability and availability provided by EBS and can be further enhanced by using EBS snapshots or by using third-party database backup utilities (such as Oracle’s RMAN) to store database backups in Amazon S3

Cost Model

  • The cost of running a database on EC2 is mainly determined by the size and number of EC2 instances running, the size of the EBS volumes used for database storage and any third-party licensing cost for the database

Scalability & Elasticity

  • Users of traditional relational database solutions on Amazon EC2 can take advantage of the scalability and elasticity of the underlying AWS platform by creating AMI and spawning multiple instances

AWS Certification Exam Practice Questions

  1. Which of the following are use cases for Amazon DynamoDB? Choose 3 answers
    1. Storing BLOB data.
    2. Managing web sessions
    3. Storing JSON documents
    4. Storing metadata for Amazon S3 objects
    5. Running relational joins and complex updates.
    6. Storing large amounts of infrequently accessed data.
  2. A client application requires operating system privileges on a relational database server. What is an appropriate configuration for highly available database architecture?
    1. A standalone Amazon EC2 instance
    2. Amazon RDS in a Multi-AZ configuration
    3. Amazon EC2 instances in a replication configuration utilizing a single Availability Zone
    4. Amazon EC2 instances in a replication configuration utilizing two different Availability Zones
  3. You are developing a new mobile application and are considering storing user preferences in AWS, which would provide a more uniform cross-device experience to users using multiple mobile devices to access the application. The preference data for each user is estimated to be 50KB in size. Additionally 5 million customers are expected to use the application on a regular basis. The solution needs to be cost-effective, highly available, scalable and secure. How would you design a solution to meet the above requirements?
    1. Set up an RDS MySQL instance in 2 availability zones to store the user preference data. Deploy a public facing application on a server in front of the database to manage security and access credentials
    2. Set up a DynamoDB table with an item for each user having the necessary attributes to hold the user preferences. The mobile application will query the user preferences directly from the DynamoDB table. Utilize STS, Web Identity Federation, and DynamoDB Fine Grained Access Control to authenticate and authorize access (DynamoDB provides high availability as it synchronously replicates data across three facilities within an AWS Region, and scalability as it is designed to scale its provisioned throughput up or down while still remaining available. It is also suitable for storing user preference data)
    3. Set up an RDS MySQL instance with multiple read replicas in 2 availability zones to store the user preference data. The mobile application will query the user preferences from the read replicas. Leverage the MySQL user management and access privilege system to manage security and access credentials.
    4. Store the user preference data in S3. Set up a DynamoDB table with an item for each user and an item attribute pointing to the user’s S3 object. The mobile application will retrieve the S3 URL from DynamoDB and then access the S3 object directly. Utilize STS, Web Identity Federation, and S3 ACLs to authenticate and authorize access.
  4. A customer is running an application in US-West (Northern California) region and wants to set up disaster recovery failover to the Asia Pacific (Singapore) region. The customer is interested in achieving a low Recovery Point Objective (RPO) for an Amazon RDS multi-AZ MySQL database instance. Which approach is best suited to this need?
    1. Synchronous replication
    2. Asynchronous replication
    3. Route53 health checks
    4. Copying of RDS incremental snapshots
  5. You are designing a file-sharing service. This service will have millions of files in it. Revenue for the service will come from fees based on how much storage a user is using. You also want to store metadata on each file, such as title, description and whether the object is public or private. How do you achieve all of these goals in a way that is economical and can scale to millions of users?
    1. Store all files in Amazon Simple Storage Service (S3). Create a bucket for each user. Store metadata in the filename of each object, and access it with LIST commands against the S3 API.
    2. Store all files in Amazon S3. Create Amazon DynamoDB tables for the corresponding key-value pairs on the associated metadata, when objects are uploaded.
    3. Create a striped set of 4000 IOPS Elastic Load Balancing volumes to store the data. Use a database running in Amazon Relational Database Service (RDS) to store the metadata.
    4. Create a striped set of 4000 IOPS Elastic Load Balancing volumes to store the data. Create Amazon DynamoDB tables for the corresponding key-value pairs on the associated metadata, when objects are uploaded.
  6. Company ABCD has recently launched an online commerce site for bicycles on AWS. They have a “Product” DynamoDB table that stores details for each bicycle, such as, manufacturer, color, price, quantity and size to display in the online store. Due to customer demand, they want to include an image for each bicycle along with the existing details. Which approach below provides the least impact to provisioned throughput on the “Product” table?
    1. Serialize the image and store it in multiple DynamoDB tables
    2. Create an “Images” DynamoDB table to store the Image with a foreign key constraint to the “Product” table
    3. Add an image data type to the “Product” table to store the images in binary format
    4. Store the images in Amazon S3 and add an S3 URL pointer to the “Product” table item for each image

AWS Encrypting Data at Rest – Whitepaper – Certification

Encrypting Data at Rest

  • AWS delivers a secure, scalable cloud computing platform with high availability, offering the flexibility for you to build a wide range of applications
  • AWS offers several options for encrypting data at rest, as an additional layer of security, ranging from completely automated AWS encryption solutions to manual client-side options
  • Encryption requires 3 things
    • Data to encrypt
    • Encryption keys
    • Cryptographic algorithm (method) to encrypt the data
  • AWS provides different models for securing data at rest, based on the following parameters
    • Encryption method
      • Encryption algorithm selection involves evaluating security, performance, and compliance requirements specific to your application
    • Key Management Infrastructure (KMI)
      • KMI enables managing & protecting the encryption keys from unauthorized access
      • KMI provides
        • Storage layer that protects plain text keys
        • Management layer that authorizes key usage
  • Hardware Security Module (HSM)
    • Common way to protect keys in a KMI is using HSM
    • An HSM is a dedicated storage and data processing device that performs cryptographic operations using keys on the device.
    • An HSM typically provides tamper evidence, or resistance, to protect keys from unauthorized use.
    • A software-based authorization layer controls who can administer the HSM and which users or applications can use which keys in the HSM
  • AWS CloudHSM
    • AWS CloudHSM appliance has both physical and logical tamper detection and response mechanisms that trigger zeroization of the appliance.
    • Zeroization erases the HSM’s volatile memory where any keys in the process of being decrypted were stored and destroys the key that encrypts stored objects, effectively causing all keys on the HSM to be inaccessible and unrecoverable.
    • AWS CloudHSM can be used to generate and store key material and can perform encryption and decryption operations.
    • AWS CloudHSM, however, does not perform any key lifecycle management functions (e.g., access control policy, key rotation) and needs a compatible KMI.
    • KMI can be deployed either on-premises or within Amazon EC2 and can communicate to the AWS CloudHSM instance securely over SSL to help protect data and encryption keys.
    • Because the AWS CloudHSM service uses SafeNet Luna appliances, any key management server that supports the SafeNet Luna platform can also be used with AWS CloudHSM
  • AWS Key Management Service (KMS)
    • AWS KMS is a managed encryption service that allows you to provision and use keys to encrypt data in AWS services and your applications.
    • Master keys, after creation, are designed to never be exported from the service.
    • AWS KMS gives you centralized control over who can access your master keys to encrypt and decrypt data, and it gives you the ability to audit this access.
    • Data can be sent into KMS to be encrypted or decrypted under a specific master key under your account.
    • AWS KMS is natively integrated with other AWS services (for example Amazon EBS, Amazon S3, and Amazon Redshift) and AWS SDKs to simplify encryption of your data within those services or custom applications
    • AWS KMS provides global availability, low latency, and a high level of durability for your keys.

Encryption Models in AWS

Encryption models in AWS depend on whether you or AWS provide the encryption method and the KMI

  • You control the encryption method and the entire KMI
  • You control the encryption method, AWS provides the storage component of the KMI, and you provide the management layer of the KMI.
  • AWS controls the encryption method and the entire KMI.

Model A: You control the encryption method and the entire KMI

  • You use your own KMI to generate, store, and manage access to keys as well as control all encryption methods in your applications
  • Proper storage, management, and use of keys to ensure the confidentiality, integrity, and availability of your data is your responsibility
  • AWS has no access to your keys and cannot perform encryption or decryption on your behalf.
  • Amazon S3
    • Encryption of the data is done before the object is sent to AWS S3
    • Encryption of the data can be done using any encryption method and the encrypted data can be uploaded using the PUT request in the Amazon S3 API
    • Key used to encrypt the data needs to be stored securely in your KMI
    • To decrypt this data, the encrypted object can be downloaded from Amazon S3 using the GET request in the Amazon S3 API and then decrypted using the key in your KMI
    • AWS also provides client-side encryption handling, where you supply your key to the Amazon S3 encryption client, which encrypts and decrypts the data on your behalf. However, AWS never has access to the keys or the unencrypted data
  • Amazon EBS
    • Amazon Elastic Block Store (Amazon EBS) provides block-level storage volumes for use with Amazon EC2 instances. Amazon EBS volumes are network-attached, and persist independently from the life of an instance.
    • Because Amazon EBS volumes are presented to an instance as a block device, you can leverage most standard encryption tools for file system-level or block-level encryption
    • Block level encryption
      • Block level encryption tools usually operate below the file system layer using kernel space device drivers to perform encryption and decryption of data.
      • These tools are useful when you want all data written to a volume to be encrypted regardless of what directory the data is stored in
    • File System level encryption
      • File system level encryption usually works by stacking an encrypted file system on top of an existing file system.
      • This method is typically used to encrypt a specific directory
    • These solutions require you to provide keys, either manually or from your KMI.
    • Both block-level and file system-level encryption tools can only be used to encrypt data volumes that are not Amazon EBS boot volumes, as they don’t allow you to automatically make a trusted key available to the boot volume at startup
    • There are third party solutions available, which can help encrypt both the boot and data volumes as well as supplying and protecting keys
  • AWS Storage Gateway
    • AWS Storage Gateway is a service connecting an on-premises software appliance with Amazon S3. Data on disk volumes attached to the AWS Storage Gateway will be automatically uploaded to Amazon S3 based on policy
    • Encryption of the source data on the disk volumes can be either done before writing to the disk or using block level encryption on the iSCSI endpoint that AWS Storage Gateway exposes to encrypt all data on the disk volume.
  • Amazon RDS
    • Because Amazon RDS doesn’t expose the attached disk it uses for data storage, the transparent disk encryption techniques described in the Amazon EBS section cannot be applied.
    • However, individual fields can be encrypted before the data is written to RDS and decrypted after reading it.

Model B: You control the encryption method, AWS provides the KMI storage component, and you provide the KMI management layer

  • Model B is similar to Model A where the encryption method is managed by you
  • Model B differs from Model A in that the keys are maintained in AWS CloudHSM rather than in an on-premises key storage system
  • Only you have access to the cryptographic partitions within the dedicated HSM to use the keys

Model C: AWS controls the encryption method and the entire KMI

  • AWS provides and manages the server-side encryption of your data, transparently managing the encryption method and the keys.
  • AWS KMS and other services that encrypt your data directly use a method called envelope encryption to provide a balance between performance and security.
  • Envelope Encryption method
    • A master key is defined either by you or AWS
    • A data key (data encryption key) is generated by the AWS service at the time when data encryption is requested
    • Data key is used to encrypt your data.
    • Data key is then encrypted with a key-encrypting key (master key) unique to the service storing your data.
    • Encrypted data key and the encrypted data are then stored by the AWS storage service on your behalf.
  • Master keys (key-encrypting keys) used to encrypt data keys are stored and managed separately from the data and the data keys
  • For decryption of the data, the process is reversed. Encrypted data key is decrypted using the key-encrypting key; the data key is then used to decrypt your data
  • Authorized use of encryption keys is done automatically and is securely managed by AWS.
  • Because unauthorized access to those keys could lead to the disclosure of your data, AWS has built systems and processes with strong access controls that minimize the chance of unauthorized access and had these systems verified by third-party audits to achieve security certifications including SOC 1, 2, and 3, PCI-DSS, and FedRAMP.
  • Amazon S3
    • SSE-S3
      • AWS encrypts each object using a unique data key
      • Data key is encrypted with a periodically rotated master key managed by S3
      • Amazon S3 server-side encryption uses 256-bit Advanced Encryption Standard (AES) keys for both object and master keys
    • SSE-KMS
      • Master keys are defined and managed in KMS for your account
      • Object Encryption
        • When an object is uploaded, a request is sent to KMS to create an object key.
        • KMS generates a unique object key and encrypts it using the master key; KMS then returns this encrypted object key along with the plaintext object key to Amazon S3.
        • Amazon S3 web server encrypts your object using the plaintext object key and stores the now encrypted object (with the encrypted object key) and deletes the plaintext object key from memory.
      • Object Decryption
        • To retrieve the encrypted object, Amazon S3 sends the encrypted object key to AWS KMS.
        • AWS KMS decrypts the object key using the correct master key and returns the decrypted (plaintext) object key to S3.
        • Amazon S3 decrypts the encrypted object, with the plaintext object key, and returns it to you.
    • SSE-C
      • Amazon S3 is provided an encryption key, while uploading the object
      • Encryption key is used by Amazon S3 to encrypt your data using AES-256
      • After object encryption, Amazon S3 deletes the encryption key
      • For downloading, you need to provide the same encryption key; Amazon S3 verifies that it matches, decrypts the object and returns it
  • Amazon EBS
    • When Amazon EBS volume is created, you can choose the master key in KMS to be used for encrypting the volume
    • Volume encryption
      • Amazon EC2 server sends an authenticated request to AWS KMS to create a volume key.
      • AWS KMS generates this volume key, encrypts it using the master key, and returns the plaintext volume key and the encrypted volume key to the Amazon EC2 server.
      • Plaintext volume key is stored in memory to encrypt and decrypt all data going to and from your attached EBS volume.
    • Volume decryption
      • When the encrypted volume (or any encrypted snapshots derived from that volume) needs to be re-attached to an instance, a call is made to AWS KMS to decrypt the encrypted volume key.
      • AWS KMS decrypts this encrypted volume key with the correct master key and returns the decrypted volume key to Amazon EC2.
  • Amazon Glacier
    • Glacier provides encryption of the data by default
    • Before it’s written to disk, data is always automatically encrypted using 256-bit AES keys unique to the Amazon Glacier service that are stored in separate systems under AWS control
  • AWS Storage Gateway
    • AWS Storage Gateway transfers your data to AWS over SSL
    • AWS Storage Gateway stores data encrypted at rest in Amazon S3 or Amazon Glacier using their respective server side encryption schemes.
  • Amazon RDS – Oracle
    • Oracle Advanced Security option for Oracle on Amazon RDS can be used to leverage the native Transparent Data Encryption (TDE) and Native Network Encryption (NNE) features
    • Oracle encryption module creates data and key-encrypting keys to encrypt the database
    • Key-encrypting keys specific to your Oracle instance on Amazon RDS are themselves encrypted by a periodically rotated 256-bit AES master key.
    • Master key is unique to the Amazon RDS service and is stored in separate systems under AWS control
  • Amazon RDS – SQL Server
    • Transparent Data Encryption (TDE) can be provisioned for Microsoft SQL Server on Amazon RDS.
    • SQL Server encryption module creates data and key-encrypting keys to encrypt the database.
    • Key-encrypting keys specific to your SQL Server instance on Amazon RDS are themselves encrypted by a periodically rotated, regional 256-bit AES master key
    • Master key is unique to the Amazon RDS service and is stored in separate systems under AWS control
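
The envelope encryption steps described under Model C can be sketched end to end as follows. This is only an illustration of the key flow: the `_keystream_xor` helper is a toy cipher built from SHA-256 and is an assumption standing in for AES-256. It is not secure and is not how AWS implements encryption. All function names are hypothetical.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure, a stand-in for AES-256."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()  # 32 bytes of keystream per block
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def envelope_encrypt(master_key: bytes, plaintext: bytes):
    """Encrypt data with a fresh data key, then encrypt that data key with the master key."""
    data_key = secrets.token_bytes(32)                         # generated per request
    ciphertext = _keystream_xor(data_key, plaintext)           # data key encrypts the data
    encrypted_data_key = _keystream_xor(master_key, data_key)  # master key encrypts the data key
    return encrypted_data_key, ciphertext                      # both are stored together


def envelope_decrypt(master_key: bytes, encrypted_data_key: bytes, ciphertext: bytes) -> bytes:
    """Reverse the process: recover the data key, then the data."""
    data_key = _keystream_xor(master_key, encrypted_data_key)
    return _keystream_xor(data_key, ciphertext)


master_key = secrets.token_bytes(32)  # kept separately from the data in a real system
wrapped_key, blob = envelope_encrypt(master_key, b"sensitive record")
print(envelope_decrypt(master_key, wrapped_key, blob))  # prints "b'sensitive record'"
```

Only the small data key is ever encrypted under the master key, which is why the master key can stay inside a separate key management system while bulk data is encrypted close to where it is stored.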

Sample Exam Questions

  1. How can you secure data at rest on an EBS volume?
    1. Encrypt the volume using the S3 server-side encryption service
    2. Attach the volume to an instance using EC2’s SSL interface.
    3. Create an IAM policy that restricts read and write access to the volume.
    4. Write the data randomly instead of sequentially.
    5. Use an encrypted file system on top of the EBS volume
  2. Your company policies require encryption of sensitive data at rest. You are considering the possible options for protecting data while storing it at rest on an EBS data volume, attached to an EC2 instance. Which of these options would allow you to encrypt your data at rest? (Choose 3 answers)
    1. Implement third party volume encryption tools
    2. Do nothing as EBS volumes are encrypted by default
    3. Encrypt data inside your applications before storing it on EBS
    4. Encrypt data using native data encryption drivers at the file system level
    5. Implement SSL/TLS for all services running on the server
  3. A company is storing data on Amazon Simple Storage Service (S3). The company’s security policy mandates that data is encrypted at rest. Which of the following methods can achieve this? Choose 3 answers
    1. Use Amazon S3 server-side encryption with AWS Key Management Service managed keys
    2. Use Amazon S3 server-side encryption with customer-provided keys
    3. Use Amazon S3 server-side encryption with EC2 key pair.
    4. Use Amazon S3 bucket policies to restrict access to the data at rest.
    5. Encrypt the data on the client-side before ingesting to Amazon S3 using their own master key
    6. Use SSL to encrypt the data while in transit to Amazon S3.
  4. Which 2 services provide native encryption?
    1. Amazon EBS
    2. Amazon Glacier
    3. Amazon Redshift (is optional)
    4. Amazon RDS (is optional)
    5. Amazon Storage Gateway
  5. With which AWS services can CloudHSM be used? (Select 2)
    1. S3
    2. DynamoDB
    3. RDS
    4. ElastiCache
    5. Amazon Redshift

References

AWS DDoS Resiliency – Best Practices – Whitepaper

AWS DDoS Resiliency Whitepaper

  • Denial of Service (DoS) is an attack, carried out by a single attacker, which attempts to make a website or application unavailable to the end users.
  • Distributed Denial of Service (DDoS) is an attack, carried out by multiple attackers either controlled or compromised by a group of collaborators, which generates a flood of requests to the application, making it unavailable to legitimate end users

Mitigation techniques

Minimize the Attack Surface Area

  • This is all about reducing the attack surface: the different Internet entry points that allow access to your application
  • Strategy to minimize the Attack surface area
    • reduce the number of necessary Internet entry points,
    • don’t expose back end servers,
    • eliminate non-critical Internet entry points,
    • separate end user traffic from management traffic,
    • obfuscate necessary Internet entry points to the level that untrusted end users cannot access them, and
    • decouple Internet entry points to minimize the effects of attacks.
  • Benefits
    • Minimizes the effective attack vectors and targets
    • Less to monitor and protect
  • Strategy can be achieved using Amazon Virtual Private Cloud (VPC)
    • helps define a logically isolated virtual network within AWS
    • provides the ability to create public and private subnets to launch Internet-facing and non-public-facing instances accordingly
    • provides a NAT gateway, which allows instances in a private subnet to have Internet access without the need to launch them in public subnets with public IPs
    • allows creation of a bastion host, which can be used to connect to instances in the private subnets
    • provides the ability to configure security groups for instances and NACLs for subnets, which act as firewalls, to control and limit outbound and inbound traffic
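One way to put "reduce the number of Internet entry points" into practice is to review inbound firewall rules for anything open to the whole Internet that does not need to be. The sketch below assumes a simplified, hypothetical rule format (plain dictionaries with `cidr`, `from_port` and `to_port`); it is not the shape returned by any AWS API, and the set of intentionally public ports is an assumption for illustration.

```python
import ipaddress

# Ports that are intentionally public in this hypothetical setup.
PUBLIC_PORTS = {80, 443}


def find_exposed_rules(rules, public_ports=PUBLIC_PORTS):
    """Return inbound rules that allow the whole Internet on a non-public port."""
    exposed = []
    for rule in rules:
        network = ipaddress.ip_network(rule["cidr"])
        open_to_world = network.prefixlen == 0  # 0.0.0.0/0 or ::/0
        ports = range(rule["from_port"], rule["to_port"] + 1)
        if open_to_world and any(p not in public_ports for p in ports):
            exposed.append(rule)
    return exposed


rules = [
    {"name": "web-https", "cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443},
    {"name": "ssh-anywhere", "cidr": "0.0.0.0/0", "from_port": 22, "to_port": 22},
    {"name": "ssh-bastion", "cidr": "10.0.1.0/24", "from_port": 22, "to_port": 22},
]
print([r["name"] for r in find_exposed_rules(rules)])  # ['ssh-anywhere']
```

Here only the SSH rule open to `0.0.0.0/0` is flagged; SSH restricted to the bastion subnet and public HTTPS are both considered acceptable under the stated assumptions.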

VPC Architecture

Be Ready to Scale to Absorb the Attack

  • DDoS attacks mainly aim to load systems to the point where they can no longer handle the load and are rendered unusable.
  • Scaling out Benefits
    • helps build a resilient architecture
    • makes the attacker work harder
    • gives you time to think, analyze and adapt
  • AWS-provided services:
    • Auto Scaling & ELB
      • Horizontal scaling using Auto Scaling with ELB
      • Auto Scaling allows instances to be added and removed as the demand changes
      • ELB helps distribute the traffic across multiple EC2 instances while acting as a Single point of contact.
      • Auto Scaling automatically registers and deregisters EC2 instances with the ELB during scale out and scale in events
    • EC2 Instance
      • Vertical scaling can be achieved by using appropriate EC2 instance types, e.g. EBS-optimized instances or ones with 10 Gigabit network connectivity, to handle the load
    • Enhanced Networking
      • Use Instances with Enhanced Networking capabilities which can provide high packet-per-second performance, low latency networking, and improved scalability
    • Amazon CloudFront
      • CloudFront is a CDN that acts as a proxy between end users and the Origin servers, and helps distribute content to end users without sending traffic to the Origin servers.
      • CloudFront has the inherent ability to help mitigate against both infrastructure and some application layer DDoS attacks by dispersing the traffic across multiple locations.
      • AWS has multiple Internet connections for capacity and redundancy at each location, which allows it to isolate attack traffic while serving content to legitimate end users
      • CloudFront also has filtering capabilities to ensure that only valid TCP connections and HTTP requests are made while dropping invalid requests. This takes the burden of handling invalid traffic (commonly used in UDP & SYN floods, and slow reads) off the origin.
    • Route 53
      • DDoS attacks also target DNS, because if DNS is unavailable your application is effectively unavailable.
      • Amazon Route 53 is a highly available and scalable DNS service and has capabilities to ensure access to the application even when under DDoS attack
        • Shuffle Sharding – Shuffle sharding is similar to the concept of database sharding, where horizontal partitions of data are spread across separate database servers to spread load and provide redundancy. Similarly, Amazon Route 53 uses shuffle sharding to spread DNS requests over numerous PoPs, thus providing multiple paths and routes for your application.
        • Anycast Routing – Anycast routing increases redundancy by advertising the same IP address from multiple PoPs. In the event that a DDoS attack overwhelms one endpoint, shuffle sharding isolates failures while providing additional routes to your infrastructure.
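The shuffle sharding idea mentioned above can be illustrated with a small sketch: each customer is assigned its own small, pseudo-random subset of endpoints, so that trouble on one customer's subset rarely covers another customer's subset completely. This is only an illustration of the general technique, not Route 53's actual implementation; the endpoint names, the customer identifiers and the `shuffle_shard` helper are hypothetical.

```python
import hashlib
import random


def shuffle_shard(customer_id: str, endpoints: list, shard_size: int) -> list:
    """Deterministically pick `shard_size` endpoints for one customer."""
    # Seed a private RNG from the customer id so the assignment is stable.
    seed = int.from_bytes(hashlib.sha256(customer_id.encode()).digest()[:8], "big")
    return sorted(random.Random(seed).sample(endpoints, shard_size))


endpoints = [f"pop-{i}" for i in range(8)]
shard_a = shuffle_shard("customer-a", endpoints, 2)
shard_b = shuffle_shard("customer-b", endpoints, 2)
print(shard_a, shard_b)
```

With 8 endpoints and a shard size of 2 there are 28 possible shards, so an attack that saturates both endpoints of one customer's shard fully takes out only those customers that happen to share exactly the same pair.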

Safeguard Exposed & Hard to Scale Expensive Resources

  • If entry points cannot be limited, use additional measures to restrict access and protect those entry points without interrupting legitimate end user traffic
  • AWS-provided services:
    • CloudFront
      • CloudFront can restrict access to content using Geo Restriction and Origin Access Identity
      • With Geo Restriction, access can be restricted to a set of whitelisted countries, or denied from a set of blacklisted countries
      • Origin Access Identity (OAI) is a special CloudFront user that allows access to resources only through CloudFront while denying direct access to the origin content, e.g. if S3 is the Origin for CloudFront, S3 can be configured to allow access only from the OAI and hence deny direct access
    • Route 53
      • Route 53 provides two features Alias Record sets & Private DNS to make it easier to scale infrastructure and respond to DDoS attacks
    • WAF
      • WAFs act as filters that apply a set of rules to web traffic. Generally, these rules cover exploits like cross-site scripting (XSS) and SQL injection (SQLi) but can also help build resiliency against DDoS by mitigating HTTP GET or POST floods
      • WAF provides a lot of features like
        • OWASP Top 10
        • HTTP rate limiting (where only a certain number of requests are allowed per user in a timeframe),
        • Whitelist or blacklist (customizable rules)
        • inspect and identify requests with abnormal patterns,
        • CAPTCHA etc
      • To prevent the WAF from being a single point of failure, a WAF sandwich pattern can be implemented, where an auto-scaled WAF tier sits between an Internet-facing load balancer and an internal load balancer
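The HTTP rate limiting feature listed above (only a certain number of requests allowed per user in a timeframe) can be sketched as a sliding-window counter. This is a generic illustration, not how AWS WAF implements rate-based rules; the class name and its parameters are hypothetical, and clients are identified here simply by an IP address string.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per client within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window_seconds = window_seconds
        # client id -> timestamps of requests that were allowed
        self._hits = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: block this request
        hits.append(now)
        return True


limiter = SlidingWindowRateLimiter(limit=3, window_seconds=10)
decisions = [limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3, 11)]
print(decisions)  # [True, True, True, False, True]
```

The fourth request is rejected because three requests were already allowed within the last 10 seconds; by second 11 the two oldest requests have aged out of the window, so the client is allowed again.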

DDOS Resiliency - WAF Sandwich Architecture

Learn Normal Behavior

  • Understand the usual levels and patterns of traffic for your application and use them as a benchmark for identifying abnormal traffic levels or resource spikes
  • Benefits
    • allows one to spot abnormalities
    • configure Alarms with accurate thresholds
    • assists with generating forensic data
  • AWS provided services for tracking
    • AWS CloudWatch monitoring
      • CloudWatch can be used to monitor your infrastructure and applications running in AWS. Amazon CloudWatch can collect metrics, log files, and set alarms for when these metrics have passed predetermined thresholds
    • VPC Flow Logs
      • Flow logs help capture traffic to the instances in a VPC and can be used to understand traffic patterns
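A simple way to turn "normal behavior" into an alarm threshold is to take a baseline of a metric and flag values that sit several standard deviations above its mean. The sketch below assumes hypothetical requests-per-minute samples and a three-sigma rule; both are illustrative choices, and real thresholds should be tuned against your own traffic.

```python
import statistics


def alarm_threshold(baseline, sigmas=3.0):
    """Threshold = mean + sigmas * standard deviation of normal traffic."""
    return statistics.mean(baseline) + sigmas * statistics.pstdev(baseline)


def find_anomalies(samples, threshold):
    """Return (index, value) pairs for samples above the threshold."""
    return [(i, v) for i, v in enumerate(samples) if v > threshold]


# Hypothetical requests-per-minute observed during normal operation.
baseline = [100, 110, 90, 105, 95, 100, 110, 90]
threshold = alarm_threshold(baseline)  # mean 100, std dev 7.5 -> 122.5

# New samples to check against the learned threshold.
print(find_anomalies([98, 120, 450, 101], threshold))  # [(2, 450)]
```

A value derived this way could then be used as the threshold of a monitoring alarm, instead of a number picked by guesswork.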

Create a Plan for Attacks

  • Have a plan in place before an attack, which ensures that:
    • Architecture has been validated and techniques selected work for the infrastructure
    • Costs for increased resiliency have been evaluated and the goals of your defense are understood
    • Contact points have been identified

AWS Certification Exam Practice Questions

  1. You are designing a social media site and are considering how to mitigate distributed denial-of-service (DDoS) attacks. Which of the below are viable mitigation techniques? (Choose 3 answers)
    1. Add multiple elastic network interfaces (ENIs) to each EC2 instance to increase the network bandwidth.
    2. Use dedicated instances to ensure that each instance has the maximum performance possible.
    3. Use an Amazon CloudFront distribution for both static and dynamic content.
    4. Use an Elastic Load Balancer with auto scaling groups at the web app and Amazon Relational Database Service (RDS) tiers
    5. Add Amazon CloudWatch alerts to look for high Network In and CPU utilization.
    6. Create processes and capabilities to quickly add and remove rules to the instance OS firewall.
  2. You’ve been hired to enhance the overall security posture for a very large e-commerce site. They have a well architected multi-tier application running in a VPC that uses ELBs in front of both the web and the app tier with static assets served directly from S3. They are using a combination of RDS and DynamoDB for their dynamic data and then archiving nightly into S3 for further processing with EMR. They are concerned because they found questionable log entries and suspect someone is attempting to gain unauthorized access. Which approach provides a cost effective scalable mitigation to this kind of attack?
    1. Recommend that they lease space at a DirectConnect partner location and establish a 1G DirectConnect connection to their VPC. They would then establish Internet connectivity into their space, filter the traffic in a hardware Web Application Firewall (WAF), and then pass the traffic through the DirectConnect connection into their application running in their VPC. (Not cost effective)
    2. Add previously identified hostile source IPs as an explicit INBOUND DENY NACL to the web tier subnet. (does not protect against new source)
    3. Add a WAF tier by creating a new ELB and an Auto Scaling group of EC2 instances running a host-based WAF. They would redirect Route 53 to resolve to the new WAF tier ELB. The WAF tier would then pass the traffic to the current web tier. The web tier Security Groups would be updated to only allow traffic from the WAF tier Security Group
    4. Remove all but TLS 1.2 from the web tier ELB and enable Advanced Protocol Filtering. This will enable the ELB itself to perform WAF functionality. (No advanced protocol filtering in ELB)

References

DDOS Whitepaper

AWS Security – Whitepaper – Certification

AWS Security Whitepaper

The AWS Security whitepaper is one of the most important whitepapers from the certification perspective

Shared Security Responsibility Model

In the Shared Security Responsibility Model, AWS is responsible for securing the underlying infrastructure that supports the cloud, and you’re responsible for anything you put on the cloud or connect to the cloud.
AWS Security Shared Responsibility Model

AWS Security Responsibilities

  • AWS is responsible for protecting the global infrastructure that runs all of the services offered in the AWS cloud. This infrastructure is comprised of the hardware, software, networking, and facilities that run AWS services.
  • AWS provides several reports from third-party auditors who have verified its compliance with a variety of computer security standards and regulations
  • AWS is responsible for the security configuration of its products that are considered managed services, e.g. RDS, DynamoDB
  • For Managed Services, AWS will handle basic security tasks like guest operating system (OS) and database patching, firewall configuration, and disaster recovery.

Customer Security Responsibilities

  • AWS Infrastructure as a Service (IaaS) products, e.g. EC2, VPC, S3, are completely under your control and require you to perform all of the necessary security configuration and management tasks.
  • You are responsible for management of the guest OS (including updates and security patches), any application software or utilities installed on the instances, and the configuration of the AWS-provided firewall (called a security group) on each instance
  • For most managed services, all you have to do is configure logical access controls for the resources and protect the account credentials

AWS Global Infrastructure Security  

AWS Compliance Program

IT infrastructure that AWS provides to its customers is designed and managed in alignment with security best practices and a variety of IT security standards, including:
  • SOC 1/SSAE 16/ISAE 3402 (formerly SAS 70)
  • SOC 2
  • SOC 3
  • FISMA, DIACAP, and FedRAMP
  • DOD CSM Levels 1-5
  • PCI DSS Level 1
  • ISO 9001 / ISO 27001
  • ITAR
  • FIPS 140-2
  • MTCS Level 3
And meet several industry-specific standards, including:
  • Criminal Justice Information Services (CJIS)
  • Cloud Security Alliance (CSA)
  • Family Educational Rights and Privacy Act (FERPA)
  • Health Insurance Portability and Accountability Act (HIPAA)
  • Motion Picture Association of America (MPAA)

Physical and Environmental Security 

Storage Decommissioning

  • When a storage device has reached the end of its useful life, AWS procedures include a decommissioning process that is designed to prevent customer data from being exposed to unauthorized individuals.
  • AWS uses the techniques detailed in DoD 5220.22-M (National Industrial Security Program Operating Manual) or NIST 800-88 (Guidelines for Media Sanitization) to destroy data as part of the decommissioning process.
  • All decommissioned magnetic storage devices are degaussed and physically destroyed in accordance with industry-standard practices.

Network Security 

Amazon Corporate Segregation

  • AWS Production network is segregated from the Amazon Corporate network and requires a separate set of credentials for logical access.
  • Amazon Corporate network relies on user IDs, passwords, and Kerberos, while the AWS Production network requires SSH public-key authentication through a bastion host.

Networking Monitoring & Protection

AWS utilizes a wide variety of automated monitoring systems to provide a high level of service performance and availability. These tools monitor server and network usage, port scanning activities, application usage, and unauthorized intrusion attempts. The tools have the ability to set custom performance metrics thresholds for unusual activity.
The AWS network provides protection against traditional network security issues:
  1. DDoS – AWS uses proprietary DDoS mitigation techniques. Additionally, AWS’s networks are multi-homed across a number of providers to achieve Internet access diversity.
  2. Man in the Middle attacks – AWS APIs are available via SSL-protected endpoints which provide server authentication
  3. IP spoofing – AWS-controlled, host-based firewall infrastructure will not permit an instance to send traffic with a source IP or MAC address other than its own.
  4. Port Scanning – Unauthorized port scans by Amazon EC2 customers are a violation of the AWS Acceptable Use Policy. When unauthorized port scanning is detected by AWS, it is stopped and blocked. Penetration/Vulnerability testing can be performed only on your own instances, with mandatory prior approval, and must not violate the AWS Acceptable Use Policy.
  5. Packet Sniffing by other tenants – It is not possible for a virtual instance running in promiscuous mode to receive or “sniff” traffic that is intended for a different virtual instance. While you can place your interfaces into promiscuous mode, the hypervisor will not deliver any traffic to them that is not addressed to them. Even two virtual instances that are owned by the same customer located on the same physical host cannot listen to each other’s traffic.

Secure Design Principles

AWS’s development process follows:
  • Secure software development best practices, which include formal design reviews by the AWS Security Team, threat modeling, and completion of a risk assessment
  • Static code analysis tools are run as a part of the standard build process
  • Recurring penetration testing performed by carefully selected industry experts

AWS Account Security Features

AWS account security features include credentials for access control, HTTPS endpoints for encrypted data transmission, the creation of separate IAM user accounts, user activity logging for security monitoring, and Trusted Advisor security checks

AWS Credentials

AWS IAM Credentials

Individual User Accounts

Do not use the root account. Instead, create an IAM user for each person, provide each with a unique set of credentials, and grant only the least privilege required to perform their job function
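As an illustration of least privilege, the sketch below builds an IAM-style JSON policy that allows reading objects under a single S3 prefix and nothing else, plus a small check that flags overly broad wildcard actions. The bucket name, the prefix and both helper functions are hypothetical; only the general policy document layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`) follows the standard IAM policy format.

```python
import json


def read_only_bucket_policy(bucket: str, prefix: str) -> dict:
    """Build a policy that allows reading objects under one prefix only."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
        }],
    }


def has_wildcard_action(policy: dict) -> bool:
    """Flag Allow statements that grant every action ("*") or a whole service ("s3:*")."""
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if statement["Effect"] == "Allow" and any(a.endswith("*") for a in actions):
            return True
    return False


# Hypothetical bucket and prefix, used only for illustration.
policy = read_only_bucket_policy("example-bucket", "reports/")
print(json.dumps(policy, indent=2))
print(has_wildcard_action(policy))  # False
```

A policy that granted `s3:*` or `*` would be flagged by the check, which is a cheap first pass when reviewing whether users hold more permissions than their job function needs.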

Secure HTTPS Access Points

Use HTTPS, provided by all AWS services, for data transmissions, which uses public-key cryptography to prevent eavesdropping, tampering, and forgery

Security Logs

Use AWS CloudTrail, which provides logs of all requests for AWS resources within the account and captures information about every API call to every AWS resource you use, including sign-in events
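Because these logs are delivered as JSON, they lend themselves to simple scripted checks. The sketch below counts failed console sign-in events per source IP address. The sample records are made up and heavily trimmed (real records carry many more fields), and the `failed_console_logins` helper is hypothetical.

```python
import json
from collections import Counter

# Made-up, heavily trimmed records in the general shape of a CloudTrail log file.
SAMPLE_LOG = json.dumps({
    "Records": [
        {"eventName": "ConsoleLogin", "sourceIPAddress": "198.51.100.4",
         "responseElements": {"ConsoleLogin": "Failure"}},
        {"eventName": "ConsoleLogin", "sourceIPAddress": "198.51.100.4",
         "responseElements": {"ConsoleLogin": "Failure"}},
        {"eventName": "ConsoleLogin", "sourceIPAddress": "203.0.113.9",
         "responseElements": {"ConsoleLogin": "Success"}},
        {"eventName": "RunInstances", "sourceIPAddress": "203.0.113.9",
         "responseElements": None},
    ]
})


def failed_console_logins(log_text: str) -> Counter:
    """Count failed console sign-in events per source IP address."""
    failures = Counter()
    for record in json.loads(log_text).get("Records", []):
        if record.get("eventName") != "ConsoleLogin":
            continue
        # responseElements may be null for some events, so fall back to {}.
        response = record.get("responseElements") or {}
        if response.get("ConsoleLogin") == "Failure":
            failures[record.get("sourceIPAddress", "unknown")] += 1
    return failures


print(failed_console_logins(SAMPLE_LOG))  # Counter({'198.51.100.4': 2})
```

Repeated failures from one address, as in the sample, are the kind of pattern worth alerting on.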

Trusted Advisor Security Checks

Use the Trusted Advisor service, which helps inspect your AWS environment and provides recommendations when opportunities exist to optimize cost, improve system performance, or close security gaps

AWS Certification Exam Practice Questions

  1. In the shared security model, AWS is responsible for which of the following security best practices (check all that apply) :
    1. Penetration testing
    2. Operating system account security management (User responsibility)
    3. Threat modeling
    4. User group access management (User responsibility)
    5. Static code analysis (AWS development cycle responsibility)
  2. You are running a web application on AWS consisting of the following components: an Elastic Load Balancer (ELB), an Auto Scaling group of EC2 instances running Linux/PHP/Apache, and Relational Database Service (RDS) MySQL. Which security measures fall into AWS’s responsibility?
    1. Protect the EC2 instances against unsolicited access by enforcing the principle of least-privilege access (User responsibility)
    2. Protect against IP spoofing or packet sniffing
    3. Assure all communication between EC2 instances and ELB is encrypted (User responsibility)
    4. Install latest security patches on ELB, RDS and EC2 instances (User responsibility)
  3. In AWS, which security aspects are the customer’s responsibility? Choose 4 answers
    1. Controlling physical access to compute resources (AWS responsibility)
    2. Patch management on the EC2 instances operating system
    3. Encryption of EBS (Elastic Block Storage) volumes
    4. Life-cycle management of IAM credentials
    5. Decommissioning storage devices (AWS responsibility)
    6. Security Group and ACL (Access Control List) settings
  4. Per the AWS Acceptable Use Policy, penetration testing of EC2 instances:
    1. May be performed by AWS, and will be performed by AWS upon customer request.
    2. May be performed by AWS, and is periodically performed by AWS.
    3. Is expressly prohibited under all circumstances.
    4. May be performed by the customer on their own instances with prior authorization from AWS.
    5. May be performed by the customer on their own instances, only if performed from EC2 instances
  5. Which is an operational process performed by AWS for data security?
    1. AES-256 encryption of data stored on any shared storage device (User responsibility)
    2. Decommissioning of storage devices using industry-standard practices
    3. Background virus scans of EBS volumes and EBS snapshots (No virus scan is performed by AWS on User instances)
    4. Replication of data across multiple AWS Regions (AWS does not replicate data across regions unless done by User)
    5. Secure wiping of EBS data when an EBS volume is unmounted (data is not wiped off on EBS volume when unmounted and it can be remounted on other EC2 instance)

References