AWS Intrusion Detection & Prevention System IDS/IPS

  • An Intrusion Prevention System (IPS)
    • is an appliance that monitors and analyzes network traffic to detect malicious patterns and potentially harmful packets, and to prevent vulnerability exploits
    • Most IPSs also offer firewall, unified threat management, and routing capabilities
  • An Intrusion Detection System (IDS) is
    • an appliance or capability that continuously monitors the environment
    • sends alerts when it detects malicious activity, policy violations, or network and system attacks from someone attempting to break into or compromise the system
    • produces reports for analysis.

Approaches for AWS IDS/IPS

Network Tap or SPAN

  • The traditional approach involves using a network Test Access Point (TAP) or Switch Port Analyzer (SPAN) port to access and monitor all network traffic
  • The connection between the AWS Internet Gateway (IGW) and the Elastic Load Balancer would be an ideal place to capture all network traffic
  • However, there is nowhere to plug this in between the IGW and the ELB, as AWS exposes no SPAN ports, no network taps, and no concept of Layer 2 bridging

Packet Sniffing

  • It is not possible for a virtual instance running in promiscuous mode to receive or sniff traffic that is intended for a different virtual instance.
  • While interfaces can be placed into promiscuous mode, the hypervisor will not deliver any traffic to an instance that is not addressed to it.
  • Even two virtual instances owned by the same customer and located on the same physical host cannot listen to each other’s traffic.

Host Based Firewall – Forward Deployed IDS

  • Deploy a network-based IDS on every instance; the IDS workload then scales with the infrastructure
  • Host-based security software works well with highly distributed and scalable application architectures because network packet inspection is distributed across the entire software fleet
  • However, this places a CPU-intensive process onto every single machine.

Host Based Firewall – Traffic Replication

  • An Agent is deployed on every instance to capture & replicate traffic for centralized analysis
  • Actual workload of network traffic analysis is not performed on the instance but on a separate server
  • Traffic capture and replication is still CPU-intensive (particularly on Windows machines)
  • It significantly increases internal network traffic, as every inbound packet is duplicated in the transfer from the instance that captures the traffic to the instance that analyzes it

AWS IDS IPS Solution 1

In-Line Firewall – Inbound IDS Tier

  • Add another tier to the application architecture, where a load balancer sends all inbound traffic to a tier of instances that performs the network analysis, e.g. a third-party solution such as Fortinet FortiGate
  • The IDS workload is now isolated to a horizontally scalable tier in the architecture; however, you have to maintain and manage another mission-critical elastic tier

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might become outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A web company is looking to implement an intrusion detection and prevention system into their deployed VPC. This platform should have the ability to scale to thousands of instances running inside of the VPC. How should they architect their solution to achieve these goals?
    1. Configure an instance with monitoring software and the elastic network interface (ENI) set to promiscuous mode to packet sniff all traffic across the VPC. (it is not possible for a virtual instance running in promiscuous mode to receive or “sniff” traffic intended for a different instance)
    2. Create a second VPC and route all traffic from the primary application VPC through the second VPC where the scalable virtualized IDS/IPS platform resides.
    3. Configure servers running in the VPC using the host-based ‘route’ commands to send all traffic through the platform to a scalable virtualized IDS/IPS (host based routing is not allowed)
    4. Configure each host with an agent that collects all network traffic and sends that traffic to the IDS/IPS platform for inspection.
  2. You are designing an intrusion detection prevention (IDS/IPS) solution for a customer web application in a single VPC. You are considering the options for implementing IDS/IPS protection for traffic coming from the Internet. Which of the following options would you consider? (Choose 2 answers)
    1. Implement IDS/IPS agents on each Instance running In VPC
    2. Configure an instance in each subnet to switch its network interface card to promiscuous mode and analyze network traffic. (it is not possible for a virtual instance running in promiscuous mode to receive or “sniff” traffic intended for a different instance)
    3. Implement Elastic Load Balancing with SSL listeners in front of the web applications (ELB with SSL does not serve as IDS/IPS)
    4. Implement a reverse proxy layer in front of web servers and configure IDS/IPS agents on each reverse proxy server

AWS OpsWorks – Certification

AWS OpsWorks

  • AWS OpsWorks is a configuration management service that helps configure and operate applications in a cloud enterprise by using Chef
  • OpsWorks Stacks and AWS OpsWorks for Chef Automate allow using Chef cookbooks and solutions for configuration management

AWS OpsWorks Stacks

  • OpsWorks Stacks provides a simple and flexible way to create and manage stacks: groups of AWS resources, such as load balancers, web, application, and database servers, and the applications deployed on them
  • OpsWorks Stacks helps deploy and monitor applications in the stacks.
  • Unlike OpsWorks for Chef Automate, OpsWorks Stacks does not require or create Chef servers; it performs some of the work of a Chef server itself
  • OpsWorks Stacks monitors instance health, and provisions new instances, when necessary, using auto healing and auto scaling
  • OpsWorks Stacks integrates with IAM to control how users can interact with stacks, what stacks can do on the users’ behalf, what AWS resources an app can access, etc.
  • OpsWorks Stacks integrates with CloudWatch and CloudTrail to enable monitoring and logging
  • OpsWorks Stacks can be accessed globally and can be used to create and manage instances globally

Stacks

  • Stack is the core AWS OpsWorks Stacks component.
  • Stack is a container for AWS resources, such as EC2 and RDS instances, that have a common purpose and should be logically managed together
  • Stack helps manage the resources as a group and also defines some default configuration settings, such as the instances’ OS and AWS region
  • Stacks can also be run in a VPC to be isolated from direct user interaction
  • Separate stacks can be created for different environments, such as Dev and QA

Layers

  • Stacks help manage cloud resources in specialized groups called layers.
  • A layer represents a set of EC2 instances that serve a particular purpose, such as serving applications or hosting a database server.
  • Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying apps, and running scripts
  • Custom recipes and related files are packaged in one or more cookbooks and stored in a cookbook repository such as S3 or Git (a stack and layer creation sketch follows this list)
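
As a rough illustration of how stacks, layers, and custom cookbooks fit together, here is a minimal sketch using boto3; the role ARN, instance profile ARN, cookbook URL, and recipe names below are placeholders, not values from this article:

```python
import boto3

opsworks = boto3.client('opsworks', region_name='us-east-1')

# Create a stack that pulls custom cookbooks from a Git repository
stack = opsworks.create_stack(
    Name='my-app-stack',
    Region='us-east-1',
    ServiceRoleArn='arn:aws:iam::123456789012:role/aws-opsworks-service-role',
    DefaultInstanceProfileArn='arn:aws:iam::123456789012:instance-profile/aws-opsworks-ec2-role',
    UseCustomCookbooks=True,
    CustomCookbooksSource={'Type': 'git',
                           'Url': 'https://github.com/example/cookbooks.git'},
)

# Add a custom layer whose recipes run on the lifecycle events
layer = opsworks.create_layer(
    StackId=stack['StackId'],
    Type='custom',
    Name='App Server',
    Shortname='app-server',
    CustomRecipes={'Setup': ['myapp::setup'], 'Deploy': ['myapp::deploy']},
)
print(stack['StackId'], layer['LayerId'])
```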

Recipes and Lifecycle Events

  • OpsWorks Stacks runs the recipes for each layer an instance belongs to, even if the instance belongs to multiple layers, e.g. an instance hosting both the application and the MySQL server
  • AWS OpsWorks Stacks features a set of lifecycle events – Setup, Configure, Deploy, Undeploy, and Shutdown – which automatically run a specified set of recipes at the appropriate time on each instance (triggering them through the API is sketched after this list)
    • Setup
      • Once a new instance has booted, OpsWorks triggers the Setup event, which runs recipes to set up the instance according to the layer configuration, e.g. installing Apache and PHP packages
      • Once setup is complete, AWS OpsWorks triggers a Deploy event, which runs recipes to deploy your application to the new instance.
    • Configure
      • Whenever an instance enters or leaves the online state, AWS OpsWorks triggers a Configure event on all instances in the stack.
      • Event runs each layer’s Configure recipes to update the configuration to reflect the current set of online instances, e.g. the HAProxy layer’s Configure recipes can modify the load balancer configuration to reflect any added or removed application server instances.
    • Deploy
      • OpsWorks triggers a Deploy event when the Deploy command is executed, to deploy the application to a set of application servers.
      • Event runs recipes on the application servers to deploy the application and any related files from its repository to the layer’s instances.
    • Undeploy
      • OpsWorks triggers an Undeploy event when an app is deleted or the Undeploy command is executed to remove an app from a set of application servers.
      • Event runs recipes to remove all application versions and perform any additional cleanup tasks.
    • Shutdown
      • OpsWorks triggers a Shutdown event when an instance is being shut down, but before the underlying EC2 instance is actually terminated.
      • Event runs recipes to perform cleanup tasks such as shutting down services.
      • OpsWorks allows Shutdown recipes a configurable amount of time to perform their tasks, and then terminates the instance.
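
The Deploy and Execute Recipes commands can also be issued on demand through the API. A minimal sketch with boto3; the stack ID, app ID, and recipe name are placeholders:

```python
import boto3

opsworks = boto3.client('opsworks')

# Run the Deploy command, which fires the Deploy event on the app's instances
opsworks.create_deployment(
    StackId='STACK_ID',            # placeholder
    AppId='APP_ID',                # placeholder
    Command={'Name': 'deploy'},
)

# Or run an arbitrary set of recipes outside the normal lifecycle
opsworks.create_deployment(
    StackId='STACK_ID',
    Command={'Name': 'execute_recipes',
             'Args': {'recipes': ['myapp::configure']}},  # hypothetical recipe
)
```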

Instance

  • An instance represents a single computing resource, e.g. an EC2 instance, and defines the resource’s basic configuration, such as OS and size
  • OpsWorks Stacks creates instances and adds them to a layer.
  • When the instance is started, OpsWorks Stacks launches an EC2 instance using the configuration settings specified by the instance and its layer.
  • After the EC2 instance has finished booting, OpsWorks Stacks installs an agent that handles communication between the instance and the service and runs the appropriate recipes in response to lifecycle events
  • OpsWorks Stacks supports instance auto-healing, whereby if an agent stops communicating with the service, OpsWorks Stacks automatically stops and restarts the instance
  • OpsWorks Stacks supports the following instance types (see the sketch after this list)
    • 24/7 instances – launched and stopped manually
    • Time-based instances – run on a scheduled time
    • Load-based instances – automatically started and stopped based on configurable load metrics
  • Linux-based computing resources created outside of OpsWorks Stacks, e.g. through the console or CLI, can be added, incorporated, and controlled through OpsWorks
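
A hedged sketch of creating each instance type with boto3; the stack and layer IDs are placeholders:

```python
import boto3

opsworks = boto3.client('opsworks')

# 24/7 instance: started and stopped manually (no AutoScalingType)
opsworks.create_instance(
    StackId='STACK_ID', LayerIds=['LAYER_ID'], InstanceType='t2.medium')

# Load-based instance: started and stopped on configurable load metrics
opsworks.create_instance(
    StackId='STACK_ID', LayerIds=['LAYER_ID'], InstanceType='t2.medium',
    AutoScalingType='load')

# Time-based instance: runs on a schedule, which is configured separately
# via set_time_based_auto_scaling
opsworks.create_instance(
    StackId='STACK_ID', LayerIds=['LAYER_ID'], InstanceType='t2.medium',
    AutoScalingType='timer')
```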

Apps

  • An AWS OpsWorks Stacks app represents the code to run on an application server and resides in an app repository such as S3
  • App contains the information required to deploy the code to the appropriate application server instances.
  • When you deploy an app, AWS OpsWorks Stacks triggers a Deploy event, which runs the Deploy recipes on the stack’s instances.
  • OpsWorks supports the ability to deploy multiple apps per stack and per layer (app creation and deployment is sketched below)
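
A minimal sketch of registering and deploying an app with boto3, assuming a zipped bundle already uploaded to S3; the stack ID, bucket, and key are placeholders:

```python
import boto3

opsworks = boto3.client('opsworks')

# Register the app and point it at the code in the repository
app = opsworks.create_app(
    StackId='STACK_ID',               # placeholder
    Name='myapp',
    Type='php',                       # e.g. 'php', 'nodejs', 'java', 'rails'
    AppSource={'Type': 's3',
               'Url': 'https://s3.amazonaws.com/my-bucket/myapp.zip'},
)

# Deploying the app triggers the Deploy event on the stack's instances
opsworks.create_deployment(
    StackId='STACK_ID', AppId=app['AppId'], Command={'Name': 'deploy'})
```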

AWS Certification Exam Practice Questions

  1. You are working with a customer who is using Chef configuration management in their data center. Which service is designed to let the customer leverage existing Chef recipes in AWS?
    1. Amazon Simple Workflow Service
    2. AWS Elastic Beanstalk
    3. AWS CloudFormation
    4. AWS OpsWorks
  2. Your mission is to create a lights-out datacenter environment, and you plan to use AWS OpsWorks to accomplish this. First you created a stack and added an App Server layer with an instance running in it. Next you added an application to the instance, and now you need to deploy a MySQL RDS database instance. Which of the following answers accurately describe how to add a backend database server to an OpsWorks stack? Choose 3 answers
    1. Add a new database layer and then add recipes to the deploy actions of the database and App Server layers. (Refer link)
    2. Use OpsWorks’ “Clone Stack” feature to create a second RDS stack in another Availability Zone for redundancy in the event of a failure in the Primary AZ. To switch to the secondary RDS instance, set the [:database] attributes to values that are appropriate for your server which you can do by using custom JSON.
    3. The variables that characterize the RDS database connection—host, user, and so on—are set using the corresponding values from the deploy JSON’s [:deploy][:app_name][:database] attributes. (Refer link)
    4. Cookbook attributes are stored in a repository, so OpsWorks requires that the “password”: “your_password” attribute for the RDS instance must be encrypted using at least a 256-bit key.
    5. Set up the connection between the app server and the RDS layer by using a custom recipe. The recipe configures the app server as required, typically by creating a configuration file. The recipe gets the connection data such as the host and database name from a set of attributes in the stack configuration and deployment JSON that AWS OpsWorks installs on every instance. (Refer link)
  3. You are tasked with the migration of a highly trafficked node.js application to AWS. In order to comply with organizational standards Chef recipes must be used to configure the application servers that host this application and to support application lifecycle events. Which deployment option meets these requirements while minimizing administrative burden?
    1. Create a new stack within OpsWorks, add the appropriate layers to the stack, and deploy the application
    2. Create a new application within Elastic Beanstalk and deploy this application to a new environment (needs to comply with Chef recipes)
    3. Launch a Node JS server from a community AMI and manually deploy the application to the launched EC2 instance
    4. Launch and configure Chef Server on an EC2 instance and leverage the AWS CLI to launch application servers and configure those instances using Chef.
  4. A web-startup runs its very successful social news application on Amazon EC2 with an Elastic Load Balancer, an Auto Scaling group of Java/Tomcat application servers, and DynamoDB as the data store. The main web application best runs on m2.xlarge instances since it is highly memory-bound. Each new deployment requires semi-automated creation and testing of a new AMI for the application servers, which takes quite a while and is therefore only done once per week. Recently, a new chat feature has been implemented in node.js and waits to be integrated in the architecture. First tests show that the new component is CPU-bound. Because the company has some experience with using Chef, they decided to streamline the deployment process and use AWS OpsWorks as an application lifecycle tool to simplify management of the application and reduce the deployment cycles. What configuration in AWS OpsWorks is necessary to integrate the new chat module in the most cost-efficient and flexible way?
    1. Create one AWS OpsWorks stack, create one AWS OpsWorks layer, create one custom recipe
    2. Create one AWS OpsWorks stack, create two AWS OpsWorks layers, create one custom recipe (single environment stack; two layers for the Java and node.js applications using built-in recipes; a custom recipe only for DynamoDB connectivity and other configuration. Refer link)
    3. Create two AWS OpsWorks stacks, create two AWS OpsWorks layers, create one custom recipe
    4. Create two AWS OpsWorks stacks, create two AWS OpsWorks layers, create two custom recipes
  5. Your company runs a complex customer relations management system that consists of around 10 different software components, all backed by the same Amazon Relational Database Service (RDS) database. You adopted AWS OpsWorks to simplify management and deployment of that application and created an AWS OpsWorks stack with layers for each of the individual components. An internal security policy requires that all instances should run on the latest Amazon Linux AMI and that instances must be replaced within one month after the latest Amazon Linux AMI has been released. AMI replacements should be done without incurring application downtime or capacity problems. You decide to write a script to be run as soon as a new Amazon Linux AMI is released. Which solutions support the security policy and meet your requirements? Choose 2 answers
    1. Assign a custom recipe to each layer, which replaces the underlying AMI. Use AWS OpsWorks life-cycle events to incrementally execute this custom recipe and update the instances with the new AMI.
    2. Create a new stack and layers with identical configuration, add instances with the latest Amazon Linux AMI specified as a custom AMI to the new layer, switch DNS to the new stack, and tear down the old stack. (Blue-Green Deployment)
    3. Identify all Amazon Elastic Compute Cloud (EC2) instances of your AWS OpsWorks stack, stop each instance, replace the AMI ID property with the ID of the latest Amazon Linux AMI ID, and restart the instance. To avoid downtime, make sure not more than one instance is stopped at the same time.
    4. Specify the latest Amazon Linux AMI as a custom AMI at the stack level, terminate instances of the stack and let AWS OpsWorks launch new instances with the new AMI. (Will lead to downtime)
    5. Add new instances with the latest Amazon Linux AMI specified as a custom AMI to all AWS OpsWorks layers of your stack, and terminate the old ones.

AWS Elastic Beanstalk – Certification

AWS Elastic Beanstalk

  • AWS Elastic Beanstalk allows users to quickly deploy and manage applications in the AWS Cloud without worrying about the infrastructure that runs those applications.
  • AWS Elastic Beanstalk reduces management complexity without restricting choice or control.
  • AWS Elastic Beanstalk enables automated infrastructure management and code deployment for applications simply by uploading the code, and includes
    • Application platform management
    • Capacity provisioning
    • Load Balancing
    • Auto scaling
    • Code deployment
    • Health Monitoring
  • Once an application is uploaded, Elastic Beanstalk automatically launches an environment and creates and configures the AWS resources needed to run the code. After your environment is launched, it can be managed and used to deploy new application versions
  • AWS resources launched by Elastic Beanstalk are fully accessible i.e. you can ssh into the EC2 instances
  • Elastic Beanstalk provides developers and systems administrators an easy, fast way to deploy and manage their applications without having to worry about AWS infrastructure.
  • CloudFormation, using templates, is a better option if the underlying AWS resources to be used are known and fine-grained control over them is needed

Elastic Beanstalk Components

  • Application
    • An Elastic Beanstalk application is a logical collection of Elastic Beanstalk components, including environments, versions, and environment configurations.
  • Application Version
    • An application version refers to a specific, labeled iteration of deployable code for a web application
    • Applications can have many versions and each application version is unique and points to an S3 object
    • Multiple versions can be deployed for an application, which helps test differences between versions and roll back to any version in case of issues (creating versions and environments is sketched after this list)
  • Environment
    • An environment is a version that is deployed onto AWS resources
    • An environment runs a single application version at a time, but same application version can be deployed across multiple environments
    • When an environment is created, Elastic Beanstalk provisions the resources needed to run the application version you specified.
  • Environment Configuration
    • An environment configuration identifies a collection of parameters and settings that define how an environment and its associated resources behave
    • When an environment’s configuration settings is updated, Elastic Beanstalk automatically applies the changes to existing resources or deletes and deploys new resources, depending upon the change
  • Configuration Template
    • A configuration template is a starting point for creating unique environment configurations
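
To make the component relationships concrete, here is a minimal sketch using boto3; the application name, bucket, key, and solution stack name are placeholders (available stack names can be listed with list_available_solution_stacks):

```python
import boto3

eb = boto3.client('elasticbeanstalk')

# An application version: a labeled iteration pointing to an S3 object
eb.create_application_version(
    ApplicationName='myapp',
    VersionLabel='v1',
    SourceBundle={'S3Bucket': 'my-bucket', 'S3Key': 'myapp-v1.zip'},
)

# An environment: one application version deployed onto AWS resources
eb.create_environment(
    ApplicationName='myapp',
    EnvironmentName='myapp-prod',
    VersionLabel='v1',
    # example name; list real ones with eb.list_available_solution_stacks()
    SolutionStackName='64bit Amazon Linux 2 v3.3.13 running Python 3.8',
)
```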

Elastic Beanstalk Architecture

Elastic Beanstalk Environment Tiers

  • An Elastic Beanstalk environment requires an environment tier, platform, and environment type
  • The environment tier determines whether Elastic Beanstalk provisions resources to support a web application that handles HTTP(S) requests or an application that handles background-processing tasks
  • Web Environment Tier
    • An environment tier whose web application processes web requests is known as a web server tier.
    • AWS resources created for a web environment tier include an Elastic Load Balancer, an Auto Scaling group, and one or more EC2 instances
    • Every environment has a CNAME URL pointing to the ELB, aliased in Route 53 to the ELB URL
    • Each EC2 server instance that runs the application uses a container type, which defines the infrastructure topology and software stack
    • A software component called the host manager (HM) runs on each EC2 server instance and is responsible for
      • Deploying the application
      • Aggregating events and metrics for retrieval via the console, the API, or the command line
      • Generating instance-level events
      • Monitoring the application log files for critical errors
      • Monitoring the application server
      • Patching instance components
      • Rotating your application’s log files and publishing them to S3
  • Worker Environment Tier 
    • An environment tier whose web application runs background jobs is known as a worker tier
    • AWS resources created for a worker environment tier include an Auto Scaling group, one or more Amazon EC2 instances, and an IAM role.
    • For the worker environment tier, Elastic Beanstalk also creates and provisions an Amazon SQS queue if you don’t already have one
    • When a worker environment tier is launched, Elastic Beanstalk installs the necessary support files for the programming language of choice and a daemon on each EC2 instance in the Auto Scaling group, with each daemon reading from the same SQS queue
    • The daemon is responsible for pulling requests from the SQS queue and then sending the data to the web application running in the worker environment tier that will process those messages (a sketch of such a worker endpoint follows this list)
  • One environment cannot support two different environment tiers because each requires its own set of resources; a worker environment tier and a web server environment tier each require an Auto Scaling group, but Elastic Beanstalk supports only one Auto Scaling group per environment.
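
The worker application itself is just a local HTTP endpoint that the daemon POSTs each SQS message to; a 200 response tells the daemon the message was handled and can be deleted from the queue. A minimal sketch, assuming Flask and a hypothetical configured path:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/queue-task', methods=['POST'])  # hypothetical configured HTTP path
def handle_message():
    body = request.get_data(as_text=True)    # the SQS message body
    print('processing:', body)               # stand-in for the real work
    return '', 200  # 200 OK signals success; the daemon deletes the message
```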

Elastic Beanstalk with other AWS Services

  • Elastic Beanstalk supports VPC and launches AWS resources, such as instances, into the VPC
  • Elastic Beanstalk supports IAM and helps you securely control access to your AWS resources.
  • CloudFront can be used to distribute the content in S3 after an Elastic Beanstalk application is created and deployed
  • CloudTrail
    • Elastic Beanstalk is integrated with CloudTrail, a service that captures all of the Elastic Beanstalk API calls and delivers the log files to an S3 bucket that you specify.
    • CloudTrail captures API calls from the Elastic Beanstalk console or from your code to the Elastic Beanstalk APIs and helps determine the request made to Elastic Beanstalk, the source IP address from which the request was made, who made the request, when it was made, etc.
  • RDS
    • Elastic Beanstalk provides support for running RDS instances in the Elastic Beanstalk environment, which is ideal for development and testing but not for production.
    • For a production environment, it is not recommended because it ties the lifecycle of the database instance to the lifecycle of the application’s environment; if the Elastic Beanstalk environment is deleted, the RDS instance is deleted as well
    • It is recommended to launch a database instance outside of the environment and configure the application to connect to it outside of the functionality provided by Elastic Beanstalk.
    • Using a database instance external to your environment requires additional security group and connection string configuration, but it also lets the application connect to the database from multiple environments, use database types not supported with integrated databases, perform blue/green deployments, and tear down your environment without affecting the database instance.
  • S3
    • Elastic Beanstalk creates an S3 bucket named elasticbeanstalk-region-account-id for each region in which environments are created.
    • Elastic Beanstalk uses the bucket to store application versions, logs, and other supporting files.
    • It applies a bucket policy to buckets it creates to allow environments to write to the bucket and prevent accidental deletion
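
For the recommended external-database setup, the connection details are typically handed to the application as environment variables. A hedged sketch using boto3 and the aws:elasticbeanstalk:application:environment namespace; the environment name and endpoint are placeholders:

```python
import boto3

eb = boto3.client('elasticbeanstalk')

# Pass an external RDS endpoint to the app as environment variables,
# keeping the database lifecycle independent of the environment
eb.update_environment(
    EnvironmentName='myapp-prod',   # placeholder
    OptionSettings=[
        {'Namespace': 'aws:elasticbeanstalk:application:environment',
         'OptionName': 'DB_HOST',
         'Value': 'mydb.cabc123.us-east-1.rds.amazonaws.com'},  # placeholder
        {'Namespace': 'aws:elasticbeanstalk:application:environment',
         'OptionName': 'DB_NAME',
         'Value': 'myapp'},
    ],
)
```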

AWS Certification Exam Practice Questions

  1. An organization is planning to use AWS for their production roll out. The organization wants to implement automation for deployment such that it will automatically create a LAMP stack, download the latest PHP installable from S3 and setup the ELB. Which of the below mentioned AWS services meets the requirement for making an orderly deployment of the software?
    1. AWS Elastic Beanstalk
    2. AWS CloudFront
    3. AWS CloudFormation
    4. AWS DevOps
  2. What does Amazon Elastic Beanstalk provide?
    1. A scalable storage appliance on top of Amazon Web Services.
    2. An application container on top of Amazon Web Services
    3. A service by this name doesn’t exist.
    4. A scalable cluster of EC2 instances.
  3. A .NET application that you manage is running in Elastic Beanstalk. Your developers tell you they will need access to application log files to debug issues that arise. The infrastructure will scale up and down. How can you ensure the developers will be able to access only the log files?
    1. Access the log files directly from Elastic Beanstalk
    2. Enable log file rotation to S3 within the Elastic Beanstalk configuration
    3. Ask your developers to enable log file rotation in the applications web.config file
    4. Connect to each Instance launched by Elastic Beanstalk and create a Windows Scheduled task to rotate the log files to S3
  4. Your team has a Tomcat-based Java application you need to deploy into development, test and production environments. After some research, you opt to use Elastic Beanstalk due to its tight integration with your developer tools and RDS due to its ease of management. Your QA team lead points out that you need to roll a sanitized set of production data into your environment on a nightly basis. Similarly, other software teams in your org want access to that same restored data via their EC2 instances in your VPC. The optimal setup for persistence and security that meets the above requirements would be the following.
    1. Create your RDS instance as part of your Elastic Beanstalk definition and alter its security group to allow access to it from hosts in your application subnets. (Not optimal for persistence as the RDS is associated with the Elastic Beanstalk lifecycle and would not live independently)
    2. Create your RDS instance separately and add its IP address to your application’s DB connection strings in your code. Alter its security group to allow access to it from hosts within your VPC’s IP address block. (RDS is connected using DNS endpoint only)
    3. Create your RDS instance separately and pass its DNS name to your app’s DB connection string as an environment variable. Create a security group for client machines and add it as a valid source for DB traffic to the security group of the RDS instance itself. (Security group allows instances to access the RDS with new instances launched without any changes)
    4. Create your RDS instance separately and pass its DNS name to your DB connection string as an environment variable. Alter its security group to allow access to it from hosts in your application subnets. (Not optimal for security adding individual hosts)

AWS Simple Email Service – SES – Certification

AWS Simple Email Service – SES

  • SES is a managed service and ideal for sending bulk emails at scale
  • SES acts as an outbound email server and eliminates the need to support own software or applications to do the heavy lifting of email transport
  • An existing email server can also be configured to send outgoing emails through SES without any change in the settings of the email clients
  • Maximum message size including attachments is 10 MB per message (after base64 encoding).

SES Characteristics

  • Compatible with SMTP
  • Applications can send email using a single API call in many supported languages: Java, .NET, PHP, Perl, Ruby, HTTPS, etc. (see the sketch after this list)
  • Optimized for the highest levels of uptime and availability, and scales as per demand
  • Provides sandbox environment for testing
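
A minimal send sketch with boto3; the addresses are placeholders, and while the account is in the SES sandbox both sender and recipient must be verified:

```python
import boto3

ses = boto3.client('ses', region_name='us-east-1')

ses.send_email(
    Source='sender@example.com',                       # verified identity
    Destination={'ToAddresses': ['recipient@example.com']},
    Message={
        'Subject': {'Data': 'Order status'},
        'Body': {'Text': {'Data': 'Your order has shipped.'}},
    },
)
```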

Sending Limits

  • Production SES has a set of sending limits which include
    • Sending Quota – max number of emails in 24-hour period
    • Maximum Send Rate – max number of emails per second
  • SES automatically adjusts the limits upward as long as emails are of high quality and are sent in a controlled manner, as any sudden spike in the volume of emails sent might be considered spam (current limits can be read via the API, as sketched below)
  • Limits can also be raised by submitting a Quota increase request
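
The current limits can be inspected programmatically; a small boto3 sketch:

```python
import boto3

ses = boto3.client('ses')

quota = ses.get_send_quota()
print(quota['Max24HourSend'])    # sending quota: max emails per 24-hour period
print(quota['MaxSendRate'])      # max emails per second
print(quota['SentLast24Hours'])  # usage against the 24-hour quota
```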

SES Best Practices

  • Send high-quality and real production content that your recipients want
  • Only send to those who have signed up for the mail
  • Unsubscribe recipients who have not interacted with the business recently
  • Have low bounce and complaint rates and remove bounced or complained addresses, using SNS to monitor bounces and complaints, treating them as opt-outs (see the sketch after this list)
  • Monitor the sending activity
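
A hedged sketch of wiring bounce and complaint notifications to an SNS topic with boto3; the identity and topic ARN are placeholders:

```python
import boto3

ses = boto3.client('ses')

# Route bounce and complaint feedback for a verified identity to SNS,
# so the addresses can be removed or treated as opt-outs
for notification_type in ('Bounce', 'Complaint'):
    ses.set_identity_notification_topic(
        Identity='example.com',    # placeholder verified identity
        NotificationType=notification_type,
        SnsTopic='arn:aws:sns:us-east-1:123456789012:ses-feedback',
    )
```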

AWS Certification Exam Practice Questions

  1. What does Amazon SES stand for?
    1. Simple Elastic Server
    2. Simple Email Service
    3. Software Email Solution
    4. Software Enabled Server
  2. Your startup wants to implement an order fulfillment process for selling a personalized gadget that needs an average of 3-4 days to produce, with some orders taking up to 6 months. You expect 10 orders per day on your first day, 1,000 orders per day after 6 months, and 10,000 orders per day after 12 months. Orders coming in are checked for consistency, then dispatched to your manufacturing plant for production, quality control, packaging, shipment, and payment processing. If the product does not meet the quality standards at any stage of the process, employees may force the process to repeat a step. Customers are notified via email about order status and any critical issues with their orders, such as payment failure. Your architecture includes AWS Elastic Beanstalk for your website with an RDS MySQL instance for customer data and orders. How can you implement the order fulfillment process while making sure that the emails are delivered reliably?
    1. Add a business process management application to your Elastic Beanstalk app servers and re-use the RDS database for tracking order status; use one of the Elastic Beanstalk instances to send emails to customers.
    2. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1. Use the decider instance to send emails to customers.
    3. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1. Use SES to send emails to customers.
    4. Use an SQS queue to manage all process tasks. Use an Auto Scaling group of EC2 instances that poll the tasks and execute them. Use SES to send emails to customers.

AWS EBS Snapshot – Certification

EBS Snapshot

  • EBS provides the ability to create snapshots (backups) of any EBS volume and write a copy of the data in the volume to Amazon S3, where it is stored redundantly in multiple Availability Zones
  • Snapshots can be used to create new volumes, increase the size of the volumes or replicate data across Availability Zones
  • Snapshots are incremental backups and store only the data that was changed from the time the last snapshot was taken.
  • Snapshot size is likely to be smaller than the volume size, as the data is compressed before being saved to S3

EBS Snapshot Creation

  • Snapshots can be created from EBS volumes periodically and are point-in-time snapshots.
  • Snapshots are incremental and only store the blocks on the device that changed since the last snapshot was taken
  • Snapshots occur asynchronously; the point-in-time snapshot is created immediately while it takes time to upload the modified blocks to S3
  • Snapshots can be taken from in-use volumes. However, snapshots will only capture data that was written to the EBS volume at the time the snapshot command is issued, excluding the data cached by any applications or the OS
  • Recommended ways to create a Snapshot from an EBS volume are
    • Pause all file writes to the volume
    • Unmount the Volume -> Take Snapshot -> Remount the Volume
    • Stop the instance – Take Snapshot (for root EBS volumes)
  • Snapshots of encrypted volumes are encrypted and volumes created from encrypted snapshots are automatically encrypted
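
A minimal sketch of creating a snapshot and waiting for it to complete with boto3; the volume ID is a placeholder:

```python
import boto3

ec2 = boto3.client('ec2')

# Point-in-time snapshot: the call returns immediately ('pending')
# while the modified blocks upload to S3 asynchronously
snap = ec2.create_snapshot(
    VolumeId='vol-0123456789abcdef0',   # placeholder volume ID
    Description='nightly backup',
)
print(snap['SnapshotId'], snap['State'])

# Optionally block until the upload has finished
ec2.get_waiter('snapshot_completed').wait(SnapshotIds=[snap['SnapshotId']])
```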

EBS Snapshot Deletion

  • When a snapshot is deleted, only the data exclusive to that snapshot is removed.
  • Deleting previous snapshots of a volume does not affect the ability to restore volumes from later snapshots of that volume.
  • Active snapshots contain all of the information needed to restore your data (from the time the snapshot was taken) to a new EBS volume.
  • Even though snapshots are saved incrementally, the snapshot deletion process is designed so that you need to retain only the most recent snapshot in order to restore the volume.
  • A snapshot of the root device of an EBS volume used by a registered AMI can’t be deleted; the AMI needs to be deregistered before the snapshot can be deleted.

EBS Snapshot Copy

  • Snapshots are constrained to the region in which they are created and can be used to launch EBS volumes within the same region only
  • Snapshots can be copied across regions to make it easier to leverage multiple regions for geographical expansion, data center migration, and disaster recovery
  • Snapshots are copied with S3 server-side encryption (256-bit Advanced Encryption Standard) to encrypt your data and the snapshot copy receives a snapshot ID that’s different from the original snapshot’s ID.
  • User-defined tags are not copied from the source to the new snapshot.
  • The first snapshot copy to another region is always a full copy, while subsequent copies of the same snapshot are incremental.
  • When a snapshot is copied,
    • it can be encrypted if currently unencrypted or
    • can be encrypted using a different encryption key; changing the encryption status of a snapshot or using a non-default EBS CMK during a copy operation always results in a full copy (not incremental), as in the sketch below
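
A hedged sketch of a cross-region, re-encrypting copy with boto3; note that copy_snapshot is called in the destination region, and the snapshot ID and key alias are placeholders:

```python
import boto3

# The copy is initiated from the destination region
ec2_dest = boto3.client('ec2', region_name='eu-west-1')

copy = ec2_dest.copy_snapshot(
    SourceRegion='us-east-1',
    SourceSnapshotId='snap-0123456789abcdef0',  # placeholder snapshot ID
    Description='DR copy',
    Encrypted=True,            # encrypting forces a full (non-incremental) copy
    KmsKeyId='alias/my-key',   # hypothetical non-default CMK
)
print(copy['SnapshotId'])      # the copy receives a new snapshot ID
```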

EBS Snapshot Sharing

  • Snapshots can be shared by making them public or sharing them with specific AWS accounts by modifying the permissions of the snapshots (see the sketch after this list)
  • Only unencrypted snapshots can be shared. Encrypted snapshots cannot be shared between accounts or made public
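
A small boto3 sketch of both sharing modes; the snapshot and account IDs are placeholders:

```python
import boto3

ec2 = boto3.client('ec2')

# Share an unencrypted snapshot with a specific AWS account
ec2.modify_snapshot_attribute(
    SnapshotId='snap-0123456789abcdef0',   # placeholder
    Attribute='createVolumePermission',
    OperationType='add',
    UserIds=['123456789012'],              # placeholder account ID
)

# Or make the snapshot public
ec2.modify_snapshot_attribute(
    SnapshotId='snap-0123456789abcdef0',
    Attribute='createVolumePermission',
    OperationType='add',
    GroupNames=['all'],
)
```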

AWS Certification Exam Practice Questions

  1. An existing application stores sensitive information on a non-boot Amazon EBS data volume attached to an Amazon Elastic Compute Cloud instance. Which of the following approaches would protect the sensitive data on an Amazon EBS volume?
    1. Upload your customer keys to AWS CloudHSM. Associate the Amazon EBS volume with AWS CloudHSM. Remount the Amazon EBS volume.
    2. Create and mount a new, encrypted Amazon EBS volume. Move the data to the new volume. Delete the old Amazon EBS volume.
    3. Unmount the EBS volume. Toggle the encryption attribute to True. Re-mount the Amazon EBS volume.
    4. Snapshot the current Amazon EBS volume. Restore the snapshot to a new, encrypted Amazon EBS volume. Mount the Amazon EBS volume (need to create a snapshot, create an encrypted copy of the snapshot, then create an EBS volume from it and mount it)
  2. Is it possible to access your EBS snapshots?
    1. Yes, through the Amazon S3 APIs.
    2. Yes, through the Amazon EC2 APIs
    3. No, EBS snapshots cannot be accessed; they can only be used to create a new EBS volume.
    4. EBS doesn’t provide snapshots.
  3. Which of the following approaches provides the lowest cost for Amazon Elastic Block Store snapshots while giving you the ability to fully restore data?
    1. Maintain two snapshots: the original snapshot and the latest incremental snapshot
    2. Maintain a volume snapshot; subsequent snapshots will overwrite one another
    3. Maintain a single snapshot the latest snapshot is both Incremental and complete
    4. Maintain the most current snapshot, archive the original and incremental to Amazon Glacier.
  4. Which procedure for backing up a relational database on EC2 that is using a set of RAIDed EBS volumes for storage minimizes the time during which the database cannot be written to and results in a consistent backup?
    1. Detach EBS volumes, 2. Start EBS snapshot of volumes, 3. Re-attach EBS volumes
    2. Stop the EC2 Instance. 2. Snapshot the EBS volumes
    3. Suspend disk I/O, 2. Create an image of the EC2 Instance, 3. Resume disk I/O
    4. Suspend disk I/O, 2. Start EBS snapshot of volumes, 3. Resume disk I/O
    5. Suspend disk I/O, 2. Start EBS snapshot of volumes, 3. Wait for snapshots to complete, 4. Resume disk I/O
  5. How can an EBS volume that is currently attached to an EC2 instance be migrated from one Availability Zone to another?
    1. Detach the volume and attach it to another EC2 instance in the other AZ.
    2. Simply create a new volume in the other AZ and specify the original volume as the source.
    3. Create a snapshot of the volume, and create a new volume from the snapshot in the other AZ
    4. Detach the volume, then use the ec2-migrate-volume command to move it to another AZ.
  6. How are the EBS snapshots saved on Amazon S3?
    1. Exponentially
    2. Incrementally
    3. EBS snapshots are not stored in the Amazon S3
    4. Decrementally
  7. EBS Snapshots occur _____
    1. Asynchronously
    2. Synchronously
    3. Weekly
  8. What will be the status of the snapshot until the snapshot is complete?
    1. Running
    2. Working
    3. Progressing
    4. Pending
  9. Before I delete an EBS volume, what can I do if I want to recreate the volume later?
    1. Create a copy of the EBS volume (not a snapshot)
    2. Create and Store a snapshot of the volume
    3. Download the content to an EC2 instance
    4. Back up the data in to a physical disk
  10. Which of the following are true regarding encrypted Amazon Elastic Block Store (EBS) volumes? Choose 2 answers
    1. Supported on all Amazon EBS volume types
    2. Snapshots are automatically encrypted
    3. Available to all instance types
    4. Existing volumes can be encrypted
    5. Shared volumes can be encrypted
  11. Amazon EBS snapshots have which of the following two characteristics? Choose 2 answers
    1. EBS snapshots only save incremental changes from snapshot to snapshot
    2. EBS snapshots can be created in real time without stopping an EC2 instance (the snapshot can be taken in real time; however, it will not be consistent, and the recommended way is to stop or freeze the I/O)
    3. EBS snapshots can only be restored to an EBS volume of the same size or smaller (an EBS volume restored from a snapshot needs to be of the same or larger size)
    4. EBS snapshots can only be restored and mounted to an instance in the same Availability Zone as the original EBS volume (snapshots are specific to a region and can be used to create a volume in any AZ, regardless of the original EBS volume’s AZ)
  12. A user is planning to schedule a backup for an EBS volume. The user wants security of the snapshot data. How can the user achieve data encryption with a snapshot?
    1. Use encrypted EBS volumes so that the snapshot will be encrypted by AWS
    2. While creating a snapshot select the snapshot with encryption
    3. By default the snapshot is encrypted by AWS
    4. Enable server side encryption for the snapshot using S3
  13. A sys admin is trying to understand EBS snapshots. Which of the below mentioned statements will not be useful to the admin to understand the concepts about a snapshot?
    1. Snapshot is synchronous
    2. It is recommended to stop the instance before taking a snapshot for consistent data
    3. Snapshot is incremental
    4. Snapshot captures the data that has been written to the hard disk when the snapshot command was executed
  14. When creation of an EBS snapshot is initiated but not completed, the EBS volume
    1. Cannot be detached or attached to an EC2 instance until the snapshot completes
    2. Can be used in read-only mode while the snapshot is in progress
    3. Can be used while the snapshot is in progress
    4. Cannot be used until the snapshot completes
  15. You have a server with a 500 GB Amazon EBS data volume. The volume is 80% full. You need to back up the volume at regular intervals and be able to re-create the volume in a new Availability Zone in the shortest time possible. All applications using the volume can be paused for a period of a few minutes with no discernible user impact. Which of the following backup methods will best fulfill your requirements?
    1. Take periodic snapshots of the EBS volume
    2. Use a third-party Incremental backup application to back up to Amazon Glacier
    3. Periodically back up all data to a single compressed archive and archive to Amazon S3 using a parallelized multi-part upload
    4. Create another EBS volume in the second Availability Zone, attach it to the Amazon EC2 instance, and use a disk manager to mirror the two disks

AWS EC2 VM Import/Export – Certification

EC2 VM Import/Export

  • EC2 VM Import/Export enables importing virtual machine (VM) images from an existing virtualization environment to EC2, and then exporting them back
  • EC2 VM Import/Export enables
    • migration of applications and workloads to EC2,
    • copying a VM image catalog to EC2,
    • creating a repository of VM images for backup and disaster recovery, and
    • leveraging previous investments in building VMs by migrating the VMs to EC2.
  • The supported file formats are: VMware ESX VMDK images, Citrix Xen VHD images, Microsoft Hyper-V VHD images, and RAW images
  • For VMware vSphere, AWS Connector for vCenter can be used to export a VM from VMware and import it into Amazon EC2
  • For Microsoft Systems Center, AWS Systems Manager for Microsoft SCVMM can be used to import Windows VMs from SCVMM to EC2

EC2 VM Import/Export features

  • Ability to import a VM from a virtualization environment to EC2 as an Amazon Machine Image (AMI), which can be used to launch an EC2 instance
  • Ability to import a VM from a virtualization environment to EC2 as an EC2 instance, which is initially in a stopped state; an AMI can be created from it
  • Ability to export a VM that was previously imported from the virtualization environment
  • Ability to import disks as Amazon EBS snapshots (an image import sketch follows this list)
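
A hedged sketch of the image-import path with boto3, assuming the disk image has already been uploaded to S3 and the vmimport service role is configured; the bucket and key are placeholders:

```python
import boto3

ec2 = boto3.client('ec2')

# Import a VMDK from S3 as an AMI
task = ec2.import_image(
    Description='legacy web server',
    DiskContainers=[{
        'Format': 'VMDK',
        'UserBucket': {'S3Bucket': 'my-import-bucket',   # placeholder
                       'S3Key': 'vms/webserver.vmdk'},   # placeholder
    }],
)

# Imports run asynchronously; poll the task for progress
status = ec2.describe_import_image_tasks(ImportTaskIds=[task['ImportTaskId']])
print(status['ImportImageTasks'][0]['Status'])
```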

AWS Certification Exam Practice Questions

  1. You are responsible for a legacy web application whose server environment is approaching end of life. You would like to migrate this application to AWS as quickly as possible, since the application environment currently has the following limitations: The VM’s single 10GB VMDK is almost full. The virtual network interface still uses the 10Mbps driver, which leaves your 100Mbps WAN connection completely underutilized. It is currently running on a highly customized Windows VM within a VMware environment: You do not have the installation media. This is a mission critical application with an RTO (Recovery Time Objective) of 8 hours. RPO (Recovery Point Objective) of 1 hour. How could you best migrate this application to AWS while meeting your business continuity requirements?
    1. Use the EC2 VM Import Connector for vCenter to import the VM into EC2
    2. Use Import/Export to import the VM as an EBS snapshot and attach to EC2. (Import/Export is used to transfer large amounts of data)
    3. Use S3 to create a backup of the VM and restore the data into EC2.
    4. Use the ec2-bundle-instance API to import an image of the VM into EC2 (only bundles a Windows instance store-backed instance)

AWS EBS Performance – Certification

AWS EBS Performance Tips

EBS performance depends on several factors, including I/O characteristics and the configuration of instances and volumes, and can be improved using PIOPS, EBS-optimized instances, pre-warming, and RAID configuration

EBS-Optimized or 10 Gigabit Network Instances

  • An EBS-Optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for EBS I/O.
  • Optimization provides the best performance for the EBS volumes by minimizing contention between EBS I/O and other traffic from the instance.
  • EBS-Optimized instances deliver dedicated throughput to EBS, with options between 500 Mbps and 4,000 Mbps, depending on the instance type used
  • Not all instance types support EBS-optimization
  • Some instance types enable EBS-optimization by default, while for others it can be enabled explicitly
  • When EBS-optimization is enabled for an instance that is not EBS-optimized by default, an additional low, hourly fee for the dedicated capacity is charged
  • When attached to an EBS–optimized instance,
    • General Purpose (SSD) volumes are designed to deliver within 10% of their baseline and burst performance 99.9% of the time in a given year
    • Provisioned IOPS (SSD) volumes are designed to deliver within 10% of their provisioned performance 99.9 percent of the time in a given year.
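
A minimal sketch of the two pieces involved, using boto3; the AMI ID and AZ are placeholders:

```python
import boto3

ec2 = boto3.client('ec2')

# Launch an EBS-optimized instance for dedicated EC2-to-EBS throughput
ec2.run_instances(
    ImageId='ami-0123456789abcdef0',   # placeholder AMI ID
    InstanceType='m4.large',
    MinCount=1, MaxCount=1,
    EbsOptimized=True,
)

# Create a Provisioned IOPS (io1) volume to attach to it
ec2.create_volume(
    AvailabilityZone='us-east-1a',     # placeholder AZ
    Size=500,                          # GiB
    VolumeType='io1',
    Iops=4000,                         # provisioned IOPS
)
```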

EBS Volume Initialization – Pre-warming

  • New EBS volumes receive their maximum performance the moment that they are available and DO NOT require initialization (pre-warming).
  • Previously, EBS volumes needed pre-warming before being used in order to reach maximum performance. Pre-warming was done by writing to the entire volume for new volumes, or reading the entire volume for volumes restored from snapshots
  • Storage blocks on volumes that were restored from snapshots must be initialized (pulled down from S3 and written to the volume) before the block can be accessed
  • This preliminary action takes time and can cause a significant increase in the latency of an I/O operation the first time each block is accessed.
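
Initialization is simply a one-time read of every block, usually done with dd or fio on Linux; a rough Python equivalent of the same idea, with a placeholder device path (requires root):

```python
# Read every block of the restored volume once, so each block is pulled
# down from S3 before the application needs it
CHUNK = 1024 * 1024  # 1 MiB reads

with open('/dev/xvdf', 'rb') as device:  # placeholder device path
    while device.read(CHUNK):
        pass
```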

RAID Configuration

  • EBS volumes can be striped if a single EBS volume does not meet the performance requirements and more is needed.
  • Striping volumes allows pushing tens of thousands of IOPS.
  • EBS volumes are already replicated across multiple servers in an AZ for availability and durability, so AWS generally recommend striping for performance rather than durability.
  • For greater I/O performance than can be achieved with a single volume, RAID 0 can stripe multiple volumes together; for on-instance redundancy, RAID 1 can mirror two volumes together.
  • RAID 0 allows I/O distribution across all volumes in a stripe, allowing straightforward gains with each addition.
  • RAID 1 can be used for durability to mirror volumes, but it requires more EC2-to-EBS bandwidth, as the data is written to multiple volumes simultaneously, and should be used with EBS-optimization.
  • EBS volume data is replicated across multiple servers in an AZ to prevent the loss of data from the failure of any single component
  • AWS doesn’t recommend RAID 5 and 6 because the parity write operations of these modes consume the IOPS available for the volumes and can result in 20-30% fewer usable IOPS than a RAID 0 configuration.
  • A 2-volume RAID 0 configuration can outperform a 4-volume RAID 6 configuration that costs twice as much.
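
The AWS-side half of a RAID 0 setup is just creating and attaching identical volumes; a hedged sketch with boto3 (the instance ID and AZ are placeholders), with the stripe itself assembled on the instance, e.g. with mdadm on Linux:

```python
import boto3

ec2 = boto3.client('ec2')

# Create and attach two identical io1 volumes for a RAID 0 stripe
volume_ids = []
for device in ('/dev/sdf', '/dev/sdg'):
    vol = ec2.create_volume(AvailabilityZone='us-east-1a',  # placeholder AZ
                            Size=500, VolumeType='io1', Iops=4000)
    ec2.get_waiter('volume_available').wait(VolumeIds=[vol['VolumeId']])
    ec2.attach_volume(VolumeId=vol['VolumeId'],
                      InstanceId='i-0123456789abcdef0',     # placeholder
                      Device=device)
    volume_ids.append(vol['VolumeId'])
print(volume_ids)
```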

AWS Certification Exam Practice Questions

  1. A user is trying to pre-warm a blank EBS volume attached to a Linux instance. Which of the below mentioned steps should be performed by the user?
    1. There is no need to pre-warm an EBS volume (with the latest update, no pre-warming is needed)
    2. Contact AWS support to pre-warm (this used to be the case earlier, but pre-warming is no longer necessary)
    3. Unmount the volume before pre-warming
    4. Format the device
  2. A user has created an EBS volume of 10 GB and attached it to a running instance. The user is trying to access EBS for the first time. Which of the below mentioned options is the correct statement with respect to a first time EBS access?
    1. The volume will show a size of 8 GB
    2. The volume will show a loss of the IOPS performance the first time
    3. The volume will be blank
    4. If the EBS is mounted it will ask the user to create a file system
  3. You are running a database on an EC2 instance, with the data stored on Elastic Block Store (EBS) for persistence. At times throughout the day, you are seeing large variance in the response times of the database queries. Looking into the instance with the iostat command, you see a lot of wait time on the disk volume that the database’s data is stored on. What two ways can you improve the performance of the database’s storage while maintaining the current persistence of the data? Choose 2 answers
    1. Move to an SSD backed instance
    2. Move the database to an EBS-Optimized Instance
    3. Use Provisioned IOPs EBS
    4. Use the ephemeral storage on an m2.4xLarge Instance Instead
  4. You have launched an EC2 instance with four (4) 500 GB EBS Provisioned IOPS volumes attached. The EC2 instance is EBS-optimized and supports 500 Mbps throughput between EC2 and EBS. The four EBS volumes are configured as a single RAID 0 device, and each Provisioned IOPS volume is provisioned with 4,000 IOPS (4,000 16KB reads or writes), for a total of 16,000 random IOPS on the instance. The EC2 instance initially delivers the expected 16,000 IOPS random read and write performance. Sometime later, in order to increase the total random I/O performance of the instance, you add an additional two 500 GB EBS Provisioned IOPS volumes to the RAID. Each volume is provisioned to 4,000 IOPS like the original four, for a total of 24,000 IOPS on the EC2 instance. Monitoring shows that the EC2 instance CPU utilization increased from 50% to 70%, but the total random IOPS measured at the instance level does not increase at all. What is the problem and a valid solution?
    1. Larger storage volumes support higher Provisioned IOPS rates: increase the provisioned volume storage of each of the 6 EBS volumes to 1TB.
    2. EBS-Optimized throughput limits the total IOPS that can be utilized; use an EBS-Optimized instance that provides larger throughput. (EC2 instance types have a limit on max throughput; an 8xlarge or higher instance type is needed to provide 24,000 IOPS)
    3. Small block sizes cause performance degradation, limiting the I/O throughput; configure the instance device driver and file system to use 64KB blocks to increase throughput.
    4. RAID 0 only scales linearly to about 4 devices; use RAID 0 with 4 EBS Provisioned IOPS volumes, but increase each Provisioned IOPS EBS volume to 6,000 IOPS.
    5. The standard EBS instance root volume limits the total IOPS rate; change the instance root volume to also be a 500GB 4,000 Provisioned IOPS volume
  5. A user has deployed an application on an EBS backed EC2 instance. For a better performance of application, it requires dedicated EC2 to EBS traffic. How can the user achieve this?
    1. Launch the EC2 instance as EBS provisioned with PIOPS EBS
    2. Launch the EC2 instance as EBS enhanced with PIOPS EBS
    3. Launch the EC2 instance as EBS dedicated with PIOPS EBS
    4. Launch the EC2 instance as EBS optimized with PIOPS EBS

AWS WorkSpaces – Certification

AWS WorkSpaces

  • Amazon WorkSpaces is a fully managed, secure desktop computing service which runs on the AWS cloud.
  • A WorkSpace is a cloud-based virtual desktop that can act as a replacement for a traditional desktop
  • A WorkSpace is available as a bundle of compute resources, storage space, and software applications that allows a user to perform day-to-day tasks just like using a traditional desktop
  • WorkSpaces allows users to easily provision cloud-based virtual desktops and provides access to the documents, applications, and resources they need from any supported device, including computers, Chromebooks, iPads, Fire tablets, and Android tablets.
  • Each WorkSpace runs on an individual instance for the user it is assigned to, and applications and users’ documents and settings are persistent.
  • Security
    • Users can log into the WorkSpace using their own credentials, set when the instance is provisioned
    • The WorkSpaces service integrates with an existing Active Directory domain, so users sign in with their regular Active Directory credentials.
    • WorkSpaces also integrates with an existing RADIUS server to enable multi-factor authentication (MFA).
  • Backup and Encryption
    • User volume (D:) is backed up every 12 hours and if the WorkSpace fails, AWS can restore the volume from the backup
    • WorkSpaces supports root volume (C: drive) and user volume (D: drive) encryption
    • WorkSpaces uses EBS volumes that can be encrypted on WorkSpace creation, providing encryption for data stored at rest, disk I/O to the volume, and snapshots created from the volume.
    • WorkSpaces integrates with the AWS KMS service with the ability to provide keys for encrypting the volumes
  • Amazon WorkSpaces Application Manager (Amazon WAM)
    • WAM offers a fast, flexible, and secure way for you to deploy and manage applications for Amazon WorkSpaces.
    • WAM accelerates software deployment, upgrades, patching, and retirement by packaging Microsoft Windows desktop applications into virtualized application containers that run as though they are natively installed.
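
A hedged sketch of provisioning a WorkSpace with encrypted volumes through boto3; the directory ID, user name, bundle ID, and key alias are placeholders:

```python
import boto3

ws = boto3.client('workspaces')

ws.create_workspaces(Workspaces=[{
    'DirectoryId': 'd-0123456789',              # placeholder directory ID
    'UserName': 'jdoe',                         # placeholder AD user
    'BundleId': 'wsb-0123456789',               # placeholder bundle ID
    'RootVolumeEncryptionEnabled': True,        # C: drive
    'UserVolumeEncryptionEnabled': True,        # D: drive
    'VolumeEncryptionKey': 'alias/my-kms-key',  # placeholder KMS key
}])
```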

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company needs to deploy virtual desktops to its customers in a virtual private cloud, leveraging existing security controls. Which set of AWS services and features will meet the company’s requirements?
    1. Virtual Private Network connection, AWS Directory Services, and ClassicLink (ClassicLink allows you to link an EC2-Classic instance to a VPC in your account, within the same region)
    2. Virtual Private Network connection, AWS Directory Services, and Amazon WorkSpaces (WorkSpaces for virtual desktops, and AWS Directory Services to authenticate to an existing on-premises AD through VPN)
    3. AWS Directory Service, Amazon WorkSpaces, and AWS Identity and Access Management (AD service needs a VPN connection to interact with an on-premises AD directory)
    4. Amazon Elastic Compute Cloud, and AWS Identity and Access Management (Need WorkSpaces for virtual desktops)

References

AWS_WorkSpaces

AWS CloudHSM – Certification

AWS CloudHSM

  • AWS CloudHSM provides secure cryptographic key storage to customers by making hardware security modules (HSMs) available in the AWS cloud
  • AWS CloudHSM helps meet corporate, contractual and regulatory compliance requirements for data security by using dedicated HSM appliances within the AWS cloud.
  • A hardware security module (HSM)
    • is a hardware appliance that provides secure key storage and cryptographic operations within a tamper-resistant hardware module.
    • is designed with physical and logical mechanisms to securely store cryptographic key material and use the key material without exposing it outside the cryptographic boundary of the appliance.
    • physical protections include tamper detection and tamper response. When a tampering event is detected, the HSM is designed to securely destroy the keys rather than risk compromise
    • logical protections include role-based access controls that provide separation of duties
  • CloudHSM allows encryption keys to be protected within HSMs, designed and validated to government standards for secure key management.
  • CloudHSM helps comply with strict key management requirements within the AWS cloud without sacrificing application performance
  • CloudHSM uses SafeNet Luna SA HSM appliances
  • HSMs are located in AWS data centers, managed and monitored by AWS, but AWS does not have access to the keys
  • AWS can’t help recover the key material if the credentials are lost
  • HSMs are inside your VPC and isolated from the rest of the network
  • CloudHSM provides single tenant dedicated access to each HSM appliance
  • Placing HSM appliances near your EC2 instances decreases network latency, which can improve application performance
  • Only you have access to the keys and to the operations to generate, store, and manage the keys
  • Integrated with Amazon Redshift and Amazon RDS for Oracle
  • Other use cases, like EBS volume encryption, S3 object encryption, and key management, can be handled by writing custom applications that integrate with CloudHSM (see the sketch after this list)
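
Note that these bullets describe the original CloudHSM offering (SafeNet Luna SA appliances); the API exposed through boto3 today is CloudHSM v2. A minimal sketch of creating and inspecting a cluster, with placeholder subnet IDs:

```python
import boto3

# CloudHSM v2 client; the cluster's HSMs live in subnets inside your VPC.
hsm = boto3.client("cloudhsmv2")

cluster = hsm.create_cluster(
    HsmType="hsm1.medium",
    SubnetIds=["subnet-0abc1234", "subnet-0def5678"],  # placeholder subnets
)
cluster_id = cluster["Cluster"]["ClusterId"]

# AWS manages and monitors the appliances, but key material is only
# accessible to you; describe_clusters exposes state, never keys.
for c in hsm.describe_clusters()["Clusters"]:
    print(c["ClusterId"], c["State"])
```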

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. With which AWS services can CloudHSM be used? (Select 2)
    1. S3
    2. DynamoDB
    3. RDS
    4. ElastiCache
    5. Amazon Redshift
  2. An AWS customer is deploying a web application that is composed of a front-end running on Amazon EC2 and of confidential data that is stored on Amazon S3. The customer security policy requires that all access operations to this sensitive data must be authenticated and authorized by a centralized access management system that is operated by a separate security team. In addition, the web application team that owns and administers the EC2 web front-end instances is prohibited from having any ability to access the data in a way that circumvents this centralized access management system. Which of the following configurations will support these requirements?
    1. Encrypt the data on Amazon S3 using a CloudHSM that is operated by the separate security team. Configure the web application to integrate with the CloudHSM for decrypting approved data access operations for trusted end-users.
    2. Configure the web application to authenticate end-users against the centralized access management system. Have the web application provision trusted users with STS tokens entitling the download of approved data directly from Amazon S3
    3. Have the separate security team create an IAM role that is entitled to access the data on Amazon S3. Have the web application team provision their instances with this role while denying their IAM users access to the data on Amazon S3 (Web team would have access to the data)
    4. Configure the web application to authenticate end-users against the centralized access management system using SAML. Have the end-users authenticate to IAM using their SAML token and download the approved data directly from S3.

References

AWS_CloudHSM_User_Guide

AWS Data Pipeline – Certification

AWS Data Pipeline

  • AWS Data Pipeline is a web service that makes it easy to automate and schedule regular data movement and data processing activities in AWS
  • AWS Data Pipeline helps define data-driven workflows (a minimal lifecycle sketch follows this overview)
  • AWS Data Pipeline integrates with on-premises and cloud-based storage systems to allow developers to use their data when they need it, where they want it, and in the required format.
  • AWS Data Pipeline allows you to quickly define a dependent chain of data sources, destinations, and predefined or custom data processing activities called a pipeline.
  • Based on a defined schedule, the pipeline regularly performs processing activities such as distributed data copy, SQL transforms, EMR applications, or custom scripts against destinations such as S3, RDS, or DynamoDB.
  • By executing the scheduling, retry, and failure logic for the workflows as a highly scalable and fully managed service, Data Pipeline ensures that the pipelines are robust and highly available.
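
A minimal sketch of the pipeline lifecycle with boto3; the pipeline name, unique ID, and the on-demand schedule are illustrative assumptions:

```python
import boto3

dp = boto3.client("datapipeline")

# 1. Create an empty pipeline shell (uniqueId guards against duplicates).
pipeline_id = dp.create_pipeline(
    name="demo-pipeline", uniqueId="demo-pipeline-001"
)["pipelineId"]

# 2. Upload a definition; here just the Default object with an on-demand
#    schedule (real pipelines add data nodes, activities, resources).
dp.put_pipeline_definition(
    pipelineId=pipeline_id,
    pipelineObjects=[{
        "id": "Default",
        "name": "Default",
        "fields": [{"key": "scheduleType", "stringValue": "ondemand"}],
    }],
)

# 3. Activate it; Data Pipeline now executes the scheduling, retry,
#    and failure logic on your behalf.
dp.activate_pipeline(pipelineId=pipeline_id)
```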

AWS Data Pipeline features

  • Managed workflow orchestration service for data-driven workflows
  • Infrastructure management service that provisions and terminates resources as required
  • Provides dependency resolution
  • Can be scheduled
  • Grants control over retries, including frequency and number
  • Distributed, fault-tolerant and highly available
  • Native integration with S3, DynamoDB, RDS, EMR, EC2 and Redshift
  • Support for both AWS based and external resources

AWS Data Pipeline Concepts

Pipeline Definition

  • The pipeline definition is how the business logic is communicated to AWS Data Pipeline
  • The pipeline definition specifies the location of data (data nodes), the activities to be performed, the schedule, the resources to run the activities, preconditions, and the actions to be performed (a minimal example follows this list)
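
A minimal sketch of a pipeline definition in the JSON format accepted by the AWS CLI, expressed here as a Python dict; the bucket names and the daily schedule are illustrative assumptions:

```python
import json

# Minimal definition: a daily schedule attached to the Default object.
# Save as definition.json and upload with:
#   aws datapipeline put-pipeline-definition --pipeline-id <id> \
#       --pipeline-definition file://definition.json
definition = {
    "objects": [
        {"id": "Default", "name": "Default",
         "scheduleType": "cron", "schedule": {"ref": "DailySchedule"},
         "pipelineLogUri": "s3://my-bucket/logs/"},   # placeholder bucket
        {"id": "DailySchedule", "type": "Schedule",
         "period": "1 day", "startAt": "FIRST_ACTIVATION_DATE_TIME"},
    ]
}

with open("definition.json", "w") as f:
    json.dump(definition, f, indent=2)
```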

Pipeline Components, Instances, and Attempts

  • Pipeline components represent the business logic of the pipeline and are represented by the different sections of a pipeline definition.
  • Pipeline components specify the data sources, activities, schedule, and preconditions of the workflow
  • When AWS Data Pipeline runs a pipeline, it compiles the pipeline components to create a set of actionable instances, each containing all the information needed to perform a specific task
  • Data Pipeline provides durable and robust data management, as it retries a failed operation according to the retry frequency and number of retries defined

Task Runners

  • A task runner is an application that polls AWS Data Pipeline for tasks and then performs those tasks
  • When Task Runner is installed and configured,
    • it polls AWS Data Pipeline for tasks associated with activated pipelines
    • after a task is assigned to Task Runner, it performs that task and reports its status back to AWS Data Pipeline.
  • A task is a discrete unit of work that the Data Pipeline service shares with a task runner; it differs from a pipeline, which defines activities and resources and usually yields several tasks
  • Tasks can be executed either on AWS Data Pipeline managed resources or on user-managed resources (a polling sketch follows this list)
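
A minimal sketch of the poll, work, and report loop a task runner performs, using the boto3 datapipeline API; the worker group name and the do_work helper are hypothetical, and the stock Task Runner application implements this same loop for you:

```python
import socket
import boto3

dp = boto3.client("datapipeline")

def do_work(task):
    """Hypothetical helper that performs the assigned task."""
    print("working on", task["attemptId"])

while True:
    # Long-poll for a task assigned to a user-managed worker group.
    task = dp.poll_for_task(
        workerGroup="my-worker-group",           # placeholder group name
        hostname=socket.gethostname(),
    ).get("taskObject")
    if not task:
        continue                                 # poll returned no work
    task_id = task["taskId"]
    try:
        # Long-running work should also call
        # dp.report_task_progress(taskId=task_id) periodically.
        do_work(task)
        dp.set_task_status(taskId=task_id, taskStatus="FINISHED")
    except Exception as err:
        dp.set_task_status(
            taskId=task_id, taskStatus="FAILED",
            errorId="TaskError", errorMessage=str(err),
        )
```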

Data Nodes

  • Data Node defines the location and type of data that a pipeline activity uses as source (input) or destination (output)
  • Data Pipeline supports S3, Redshift, DynamoDB, and SQL data nodes

Databases

  • Data Pipeline supports JDBC, RDS, and Redshift databases

Activities

  • An activity is a pipeline component that defines the work to perform
  • Data Pipeline provides predefined activities for common scenarios like SQL transformations, data movement, Hive queries, etc.
  • Activities are extensible and can be used to run custom scripts, supporting endless combinations (see the sketch after this list)
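
For instance, a custom script can be expressed as a ShellCommandActivity object in the pipeline definition; a minimal sketch in which the command, IDs, and references are illustrative assumptions:

```python
# Custom-script activity in pipeline-definition form: the command runs
# on the referenced EC2 resource, reading from the referenced data node.
shell_activity = {
    "id": "TransformData",
    "type": "ShellCommandActivity",
    "command": "python transform.py",        # placeholder script
    "runsOn": {"ref": "MyEc2Resource"},      # an Ec2Resource defined elsewhere
    "input": {"ref": "S3InputNode"},         # an S3DataNode defined elsewhere
}
```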

Preconditions

  • Precondition is a pipeline component containing conditional statements that must be satisfied (evaluated to True) before an activity can run
  • A pipeline supports
    • System-managed preconditions
      • are run by the AWS Data Pipeline web service on your behalf and do not require a computational resource
      • Include source data and key checks, e.g. DynamoDB data or table exists, S3 key exists, or S3 prefix is not empty
    • User-managed preconditions
      • run on user-defined and user-managed computational resources
      • Can be defined as an Exists check or a shell command (see the sketch after this list)
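
A minimal sketch of a system-managed precondition in pipeline-definition form, here an S3KeyExists check gating an activity; the key and IDs are illustrative assumptions:

```python
# System-managed precondition: the component that references it will only
# run once the flag object exists in S3 (key is a placeholder).
ready_check = {
    "id": "InputReady",
    "type": "S3KeyExists",
    "s3Key": "s3://my-bucket/input/_SUCCESS",
}
# Referenced from an activity or data node via:
#   "precondition": {"ref": "InputReady"}
```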

Resources

  • A resource is the computational resource that performs the work that a pipeline activity specifies
  • A pipeline supports following types of resources
    • EC2
    • EMR
  • Resources can run in the same region as their working data set, or even in a different region from AWS Data Pipeline itself
  • Resources launched by AWS Data Pipeline count toward your account's resource limits and should be taken into account (a minimal resource sketch follows this list)
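
A minimal sketch of an EC2 computational resource in pipeline-definition form; the instance type, ID, and timeout are illustrative assumptions:

```python
# Computational resource that activities reference via "runsOn";
# Data Pipeline provisions it on demand and terminates it afterwards.
ec2_resource = {
    "id": "MyEc2Resource",
    "type": "Ec2Resource",
    "instanceType": "m4.large",      # placeholder size
    "terminateAfter": "1 Hour",      # caps cost if a task hangs
}
```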

Actions

  • Actions are steps that a pipeline takes when certain events, such as success or failure, occur.
  • Pipeline supports SNS notifications and a terminate action on resources (a minimal sketch follows this list)
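
A minimal sketch of an SNS notification action in pipeline-definition form, wired to an activity's failure event; the topic ARN, IDs, and message are illustrative assumptions:

```python
# SNS action fired on failure; an activity references it via
#   "onFail": {"ref": "FailureAlarm"}   (onSuccess works the same way)
failure_alarm = {
    "id": "FailureAlarm",
    "type": "SnsAlarm",
    "topicArn": "arn:aws:sns:us-east-1:123456789012:pipeline-alerts",
    "subject": "Pipeline activity failed",
    "message": "Activity #{node.name} failed.",   # Data Pipeline expression syntax
}
```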

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. An International company has deployed a multi-tier web application that relies on DynamoDB in a single region. For regulatory reasons they need disaster recovery capability in a separate region with a Recovery Time Objective of 2 hours and a Recovery Point Objective of 24 hours. They should synchronize their data on a regular basis and be able to provision the web application rapidly using CloudFormation. The objective is to minimize changes to the existing web application, control the throughput of DynamoDB used for the synchronization of data and synchronize only the modified elements. Which design would you choose to meet these requirements?
    1. Use AWS data Pipeline to schedule a DynamoDB cross region copy once a day. Create a ‘Lastupdated’ attribute in your DynamoDB table that would represent the timestamp of the last update and use it as a filter. (Refer Blog Post)
    2. Use EMR and write a custom script to retrieve data from DynamoDB in the current region using a SCAN operation and push it to DynamoDB in the second region. (No Schedule and throughput control)
    3. Use AWS Data Pipeline to schedule an export of the DynamoDB table to S3 in the current region once a day, then schedule another task immediately after it that will import data from S3 to DynamoDB in the other region. (Export/import is supported, but I doubt incremental sync works here; it is more suited for baseline data)
    4. Also send each write into an SQS queue in the second region; use an auto-scaling group behind the SQS queue to replay the writes in the second region. (Not automated to replay the writes)
  2. Your company produces customer-commissioned, one-of-a-kind skiing helmets combining high fashion with custom technical enhancements. Customers can show off their individuality on the ski slopes and have access to head-up displays, GPS, rear-view cams, and any other technical innovation they wish to embed in the helmet. The current manufacturing process is data rich and complex, including assessments to ensure that the custom electronics and materials used to assemble the helmets are to the highest standards. Assessments are a mixture of human and automated assessments. You need to add a new set of assessments to model the failure modes of the custom electronics, using GPUs with CUDA across a cluster of servers with low-latency networking. What architecture would allow you to automate the existing process using a hybrid approach and ensure that the architecture can support the evolution of processes over time?
    1. Use AWS Data Pipeline to manage movement of data & meta-data and assessments. Use an auto-scaling group of G2 instances in a placement group. (Involves mixture of human assessments)
    2. Use Amazon Simple Workflow (SWF) to manage assessments, movement of data & meta-data. Use an autoscaling group of G2 instances in a placement group. (Human and automated assessments with GPU and low latency networking)
    3. Use Amazon Simple Workflow (SWF) to manage assessments and movement of data & meta-data. Use an autoscaling group of C3 instances with SR-IOV (Single Root I/O Virtualization). (C3 with SR-IOV won't provide GPUs, and enhanced networking needs to be enabled)
    4. Use AWS Data Pipeline to manage movement of data & meta-data and assessments. Use an auto-scaling group of C3 instances with SR-IOV (Single Root I/O Virtualization). (Involves mixture of human assessments)