AWS Config – Certification

AWS Config

  • AWS Config is a fully managed service that provides AWS resource inventory, configuration history, and configuration change notifications to enable security and governance
  • It provides a detailed view of the configuration of AWS resources in the AWS account.
  • It gives point-in-time and historical states and allows users to see configuration changes visually on a timeline
  • In cases where several configuration changes are made to a resource in quick succession (i.e., within a span of a few minutes), AWS Config only records the latest configuration of that resource, which represents the cumulative impact of that entire set of changes
  • AWS Config does not cover all AWS services; for the unsupported services, the configuration management process can be automated using APIs and code to compare current and past data

AWS Config Use Case

  • Security Analysis & Resource Administration
    • AWS Config enables continuous monitoring and governance over resource configurations and helps evaluate them for misconfigurations that lead to security gaps or weaknesses
  • Auditing & Compliance
    • AWS Config helps maintain a complete inventory of all resources and their configuration attributes, as well as their point-in-time history
    • The ability to retrieve historical configurations can be very useful for ensuring compliance with internal policies and best practices and for audits
  • Change Management
    • AWS Config helps understand relationships between resources so that the impact of the change can be proactively assessed
    • It can be configured to notify whenever resources are created, modified, or deleted without having to monitor these changes by polling the calls made to each resource
  • Troubleshooting
    • AWS Config can help quickly identify and troubleshoot issues by using the historical configurations to compare the last working configuration with the recently changed one causing the issue
  • Discovery
    • AWS Config helps discover resources that exist within an account, leading to better inventory and asset management
    • Get a snapshot of the current configurations of the supported resources that are associated with the AWS account

AWS Config Concepts

AWS Config

  • AWS Resources
    • AWS resources are entities created and managed in the account, e.g. EC2 instances and security groups
  • AWS Config Rules
    • AWS Config rules help define desired configuration settings for the resources or for the entire account
    • AWS Config continuously tracks resource configuration changes against the rules and, if a rule is violated, marks the resource as noncompliant
  • Resource Relationship
    • AWS Config discovers AWS resources in the account and then creates a map of relationships between them, e.g. an EBS volume linked to an EC2 instance
  • Configuration Items
    • A configuration item represents a point-in-time view of the supported AWS resource
    • Components of a configuration item include metadata, attributes, relationships, current configuration, and related events.
  • Configuration Snapshot
    • A configuration snapshot is a collection of the configuration items for the supported resources that exist in your account
  • Configuration History
    • A configuration history is a collection of the configuration items for a given resource over any time period
  • Configuration Stream
    • Configuration stream is an automatically updated list of all configuration items for the resources that AWS Config is recording
  • Configuration Recorder
    • Configuration recorder stores the configurations of the supported resources in your account as configuration items
    • A configuration recorder needs to be created and started for recording; by default it records all supported resources in the region (a minimal setup sketch follows this list)
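
The following is a minimal boto3 sketch of creating and starting a configuration recorder, assuming an IAM role for AWS Config and an S3 delivery bucket already exist; the role ARN, bucket name, and recorder name are placeholders, not values from these notes:

```python
import boto3

config = boto3.client("config", region_name="us-east-1")

# Placeholder role and bucket names; replace with resources that exist in your account.
RECORDER_ROLE_ARN = "arn:aws:iam::123456789012:role/aws-config-recorder-role"
DELIVERY_BUCKET = "my-config-delivery-bucket"

# Create a recorder that records all supported resource types (the default behaviour).
config.put_configuration_recorder(
    ConfigurationRecorder={
        "name": "default",
        "roleARN": RECORDER_ROLE_ARN,
        "recordingGroup": {"allSupported": True, "includeGlobalResourceTypes": True},
    }
)

# A delivery channel tells AWS Config where to deliver configuration snapshots and history.
config.put_delivery_channel(
    DeliveryChannel={"name": "default", "s3BucketName": DELIVERY_BUCKET}
)

# Recording does not begin until the recorder is explicitly started.
config.start_configuration_recorder(ConfigurationRecorderName="default")
```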

AWS Config Flow

  • When AWS Config is turned on, it first discovers the supported AWS resources that exist in the account and generates a configuration item for each resource.
  • AWS Config also generates configuration items when the configuration of a resource changes, and it maintains historical records of the configuration items of the resources from the time the configuration recorder is started.
  • By default, AWS Config creates configuration items for every supported resource in the region, but it can be customized to record only specific resource types.
  • AWS Config keeps track of all changes to the resources by invoking the Describe or List API calls for each resource, as well as for related resources in the account
  • Configuration items are delivered in a configuration stream to an S3 bucket
  • AWS Config also tracks the configuration changes that were not initiated by the API. AWS Config examines the resource configurations periodically and generates configuration items for the configurations that have changed.
  • AWS Config rules, if configured,
    • continuously evaluate resource configurations against the desired settings.
    • Depending on the rule, resources are evaluated either in response to configuration changes or periodically.
    • When AWS Config evaluates the resources, it invokes the rule’s AWS Lambda function, which contains the evaluation logic for the rule.
    • The function returns the compliance status of the evaluated resources.
    • If a resource violates the conditions of a rule, the resource and the rule are flagged as noncompliant and a notification is sent to an SNS topic (a sketch of a custom rule’s evaluation function follows this list)
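
Below is a minimal sketch of what the evaluation Lambda behind a custom Config rule can look like, assuming a change-triggered rule that checks an EC2 instance type; the desired instance type and the handler logic are illustrative assumptions, not AWS-provided code:

```python
import json
import boto3

config = boto3.client("config")

# Illustrative desired value; a real rule would read this from the rule parameters.
DESIRED_INSTANCE_TYPE = "t2.micro"

def lambda_handler(event, context):
    # AWS Config passes the changed resource's configuration item in the invoking event.
    invoking_event = json.loads(event["invokingEvent"])
    configuration_item = invoking_event["configurationItem"]

    if configuration_item["resourceType"] != "AWS::EC2::Instance":
        compliance = "NOT_APPLICABLE"
    elif configuration_item["configuration"].get("instanceType") == DESIRED_INSTANCE_TYPE:
        compliance = "COMPLIANT"
    else:
        compliance = "NON_COMPLIANT"

    # Report the evaluation result back to AWS Config.
    config.put_evaluations(
        Evaluations=[
            {
                "ComplianceResourceType": configuration_item["resourceType"],
                "ComplianceResourceId": configuration_item["resourceId"],
                "ComplianceType": compliance,
                "OrderingTimestamp": configuration_item["configurationItemCaptureTime"],
            }
        ],
        ResultToken=event["resultToken"],
    )
```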

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. One of the challenges in managing AWS resources is to keep track of changes in the resource configuration over time. Which one of the following statements provides the best solution?
    1. Use strict syntax tagging on the resources
    2. Create a custom application to automate the configuration management process
    3. Use AWS Config for supported services and use an automated process via APIs for unsupported services
    4. Use resource groups and tagging along with CloudTrail so that you can audit changes using the logs

References

AWS_Config_Developer_Guide

AWS Directory Services – Certification

AWS Directory Services

  • AWS Directory Service is a managed service offering, providing directories that contain information about the organization, including users, groups, computers, and other resources
  • AWS Directory Services provides multiple ways including
    • AWS Directory Service for Microsoft Active Directory (Enterprise Edition), also referred to as Microsoft AD,
    • Simple AD as a standalone directory service, and
    • AD Connector to use On-Premise Microsoft Active Directory with other AWS services.

Simple AD

  • is a Microsoft Active Directory compatible directory from AWS Directory Service that is powered by Samba 4
  • is the least expensive option and the best choice if there are 5,000 or fewer users and the more advanced Microsoft Active Directory features are not needed
  • supports commonly used Active Directory features such as user accounts, group memberships, domain-joining EC2 instances running Linux and Windows, Kerberos-based single sign-on (SSO), and group policies
  • does not support features like DNS dynamic update, schema extensions, multi-factor authentication, communication over LDAPS, PowerShell AD cmdlets, and the transfer of FSMO roles
  • provides daily automated snapshots to enable point-in-time recovery
  • However, trust relationships cannot be set up between Simple AD and other Active Directory domains.

AD Connector

  • helps connect an existing on-premises Active Directory to AWS
  • is the best choice to leverage an existing on-premises directory with AWS services
  • is a proxy service for connecting on-premises Microsoft Active Directory to AWS without requiring complex directory synchronization technologies or the cost and complexity of hosting a federation infrastructure
  • forwards sign-in requests to the Active Directory domain controllers for authentication and provides the ability for applications to query the directory for data
  • enables consistent enforcement of existing security policies, such as password expiration, password history, and account lockouts, whether users are accessing resources on premises or in the AWS cloud (a minimal setup sketch follows this list)
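
A minimal boto3 sketch of creating an AD Connector; the VPC, subnets, on-premises DNS IPs, and service account shown below are all placeholder values:

```python
import boto3

ds = boto3.client("ds", region_name="us-east-1")

# Placeholder values; replace with real VPC/subnet IDs, DNS IPs, and a service account.
response = ds.connect_directory(
    Name="corp.example.com",            # FQDN of the existing on-premises directory
    ShortName="CORP",
    Password="service-account-password",
    Description="AD Connector to on-premises Active Directory",
    Size="Small",
    ConnectSettings={
        "VpcId": "vpc-0123456789abcdef0",
        "SubnetIds": ["subnet-aaaa1111", "subnet-bbbb2222"],  # two subnets in different AZs
        "CustomerDnsIps": ["10.0.0.10", "10.0.0.11"],         # on-premises DNS servers
        "CustomerUserName": "ad-connector-svc",               # account with delegated rights
    },
)

print("Directory ID:", response["DirectoryId"])
```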

Microsoft Active Directory (Enterprise Edition)

  • is a feature-rich managed Microsoft Active Directory hosted on AWS
  • is the best choice if there are more than 5,000 users and need a trust relationship set up between an AWS hosted directory and on-premises directories.
  • provides much of the functionality offered by Microsoft Active Directory plus integration with AWS applications

Microsoft AD connectivity options

  • If the VGW used to connect to the on-premises AD is not stable or has connectivity issues, the following options can be explored
    • Simple AD
      • least expensive option
      • provides a standalone Microsoft AD-compatible directory in AWS
      • no single point of authentication or authorization, as a separate copy is maintained
      • trust relationships cannot be set up between Simple AD and other Active Directory domains
    • Read-only Domain Controllers (RODCs)
      • works out as a Read-only Active Directory
      • Read-only Domain Controllers (RODCs) hold a copy of the Active Directory Domain Service (AD DS) database and respond to authentication requests
      • RODCs are typically deployed in locations where physical security cannot be guaranteed
      • they cannot be written to by applications or other servers.
      • helps maintain a single point of authentication & authorization control; however, it needs to be synced
    • Writable Domain Controllers
      • Writable Domain Controllers operate in a multi-master model; changes can be made on any writable server in the forest, and those changes are replicated to servers throughout the entire forest
      • are expensive to set up

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. The majority of your Infrastructure is on premises and you have a small footprint on AWS. Your company has decided to roll out a new application that is heavily dependent on low latency connectivity to LDAP for authentication. Your security policy requires minimal changes to the company’s existing application user management processes. What option would you implement to successfully launch this application?
    1. Create a second, independent LDAP server in AWS for your application to use for authentication (independent would not work for authentication as it’s a separate copy)
    2. Establish a VPN connection so your applications can authenticate against your existing on-premises LDAP servers (not a low latency solution)
    3. Establish a VPN connection between your data center and AWS create a LDAP replica on AWS and configure your application to use the LDAP replica for authentication (RODCs low latency and minimal setup)
    4. Create a second LDAP domain on AWS establish a VPN connection to establish a trust relationship between your new and existing domains and use the new domain for authentication (Not minimal effort)
  2. A company is preparing to give AWS Management Console access to developers. Company policy mandates identity federation and role-based access control. Roles are currently assigned using groups in the corporate Active Directory. What combination of the following will give developers access to the AWS console? Choose 2 answers
    1. AWS Directory Service AD Connector (for Corporate Active directory)
    2. AWS Directory Service Simple AD
    3. AWS Identity and Access Management groups
    4. AWS identity and Access Management roles
    5. AWS identity and Access Management users
  3. An Enterprise customer is starting their migration to the cloud, their main reason for migrating is agility, and they want to make their internal Microsoft Active Directory available to any applications running on AWS; this is so internal users only have to remember one set of credentials and as a central point of user control for leavers and joiners. How could they make their Active Directory secure, and highly available, with minimal on-premises infrastructure changes, in the most cost and time-efficient way? Choose the most appropriate
    1. Using Amazon Elastic Compute Cloud (EC2), they would create a DMZ using a security group; within the security group they could provision two smaller amazon EC2 instances that are running Openswan for resilient IPSEC tunnels, and two larger instance that are domain controllers; they would use multiple Availability Zones (Whats Openswan? Refer Implementation)
    2. Using VPC, they could create an extension to their data center and make use of resilient hardware IPSEC tunnels; they could then have two domain controller instances that are joined to their existing domain and reside within different subnets, in different Availability Zones (highly available with 2 AZ’s, secure with VPN connection and minimal changes)
    3. Within the customer’s existing infrastructure, they could provision new hardware to run Active Directory Federation Services; this would present Active Directory as a SAML2 endpoint on the internet; any new application on AWS could be written to authenticate using SAML2 (not minimal on-premises hardware changes)
    4. The customer could create a stand-alone VPC with its own Active Directory Domain Controllers; two domain controller instances could be configured, one in each Availability Zone; new applications would authenticate with those domain controllers (not a central location, but a copy)
  4. A company needs to deploy virtual desktops to its customers in a virtual private cloud, leveraging existing security controls. Which set of AWS services and features will meet the company’s requirements?
    1. Virtual Private Network connection. AWS Directory Services, and ClassicLink (ClassicLink allows you to link an EC2-Classic instance to a VPC in your account, within the same region)
    2. Virtual Private Network connection. AWS Directory Services, and Amazon Workspaces (WorkSpaces for Virtual desktops, and AWS Directory Services to authenticate to an existing on-premises AD through VPN)
    3. AWS Directory Service, Amazon Workspaces, and AWS Identity and Access Management (AD service needs a VPN connection to interact with an On-premise AD directory)
    4. Amazon Elastic Compute Cloud, and AWS Identity and Access Management (Need WorkSpaces for virtual desktops)
  5. An Enterprise customer is starting their migration to the cloud, their main reason for migrating is agility and they want to make their internal Microsoft active directory available to any applications running on AWS, this is so internal users only have to remember one set of credentials and as a central point of user control for leavers and joiners. How could they make their active directory secure and highly available with minimal on-premises infrastructure changes in the most cost and time efficient way? Choose the most appropriate:
    1. Using Amazon EC2, they could create a DMZ using a security group, within the security group they could provision two smaller Amazon EC2 instances that are running Openswan for resilient IPSEC tunnels and two larger instances that are domain controllers, they would use multiple availability zones.
    2. Using VPC, they could create an extension to their data center and make use of resilient hardware IPSEC tunnels, they could then have two domain controller instances that are joined to their existing domain and reside within different subnets in different availability zones.
    3. Within the customer’s existing infrastructure, they could provision new hardware to run active directory federation services, this would present active directory as a SAML2 endpoint on the internet and any new application on AWS could be written to authenticate using SAML2 (not a  minimal change to the existing infrastructure)
    4. The customer could create a stand alone VPC with its own active directory domain controllers, two domain controller instances could be configured, one in each availability zone, new applications would authenticate with those domain controllers. (Standalone cannot use the same security)

References

AWS Risk and Compliance – Whitepaper – Certification

AWS Risk and Compliance Whitepaper Overview

  • AWS Risk and Compliance Whitepaper is intended to provide information to assist AWS customers with integrating AWS into their existing control framework supporting their IT environment.
  • AWS does communicate its security and control environment relevant to customers. AWS does this by doing the following:
    • Obtaining industry certifications and independent third-party attestations described in this document
    • Publishing information about the AWS security and control practices in whitepapers and web site content
    • Providing certificates, reports, and other documentation directly to AWS customers under NDA (as required)

Shared Responsibility model

  • AWS’ part in the shared responsibility includes
    • providing its services on a highly secure and controlled platform and providing a wide array of security features customers can use
    • relieves the customer’s operational burden as AWS operates, manages and controls the components from the host operating system and virtualization layer down to the physical security of the facilities in which the service operates
  • Customers’ responsibility includes
    • configuring their IT environments in a secure and controlled manner for their purposes
    • responsibility and management of the guest operating system (including updates and security patches), other associated application software as well as the configuration of the AWS provided security group firewall
    • meeting stringent compliance requirements by leveraging technology such as host-based firewalls, host-based intrusion detection/prevention, encryption, and key management
    • customers can rely on AWS to manage the controls associated with the physical infrastructure deployed in the AWS environment, relieving them of that operational burden

Risk and Compliance Governance

  • AWS provides a wide range of information regarding its IT control environment to customers through white papers, reports, certifications, and other third-party attestations
  • AWS customers are required to continue to maintain adequate governance over the entire IT control environment regardless of how IT is deployed.
  • Leading practices include
    • an understanding of required compliance objectives and requirements (from relevant sources),
    • establishment of a control environment that meets those objectives and requirements,
    • an understanding of the validation required based on the organization’s risk tolerance,
    • and verification of the operating effectiveness of their control environment.
  • Strong customer compliance and governance might include the following basic approach:
    • Review information available from AWS together with other information to understand as much of the entire IT environment as possible, and then document all compliance requirements.
    • Design and implement control objectives to meet the enterprise compliance requirements.
    • Identify and document controls owned by outside parties.
    • Verify that all control objectives are met and all key controls are designed and operating effectively.
  • Approaching compliance governance in this manner helps companies gain a better understanding of their control environment and will help clearly delineate the verification activities to be performed.

AWS Certifications, Programs, Reports, and Third-Party Attestations

  • AWS engages with external certifying bodies and independent auditors to provide customers with considerable information regarding the policies, processes, and controls established and operated by AWS.
  • AWS provides third-party attestations, certifications, Service Organization Controls (SOC) reports and other relevant compliance reports directly to our customers under NDA.

Key Risk and Compliance Questions

  • Shared Responsibility
    • AWS controls the physical components of that technology.
    • Customer owns and controls everything else, including control over connection points and transmissions
  • Auditing IT
    • Auditing for most layers and controls above the physical controls remains the responsibility of the customer
    • AWS ISO 27001 and other certifications are available for auditors’ review
    • AWS-defined logical and physical controls are documented in the SOC 1 Type II report and available for review by audit and compliance teams
  • Data location
    • AWS customers control which physical region their data and their servers will be located in
    • AWS replicates the data only within the region
    • AWS will not move customers’ content from the selected Regions without notifying the customer, unless required to comply with the law or requests of governmental entities
  • Data center tours
    • As AWS hosts multiple customers, AWS does not allow data center tours by customers, as this would expose a wide range of customers to physical access by a third party.
    • An independent and competent auditor validates the presence and operation of controls as part of our SOC 1 Type II report.
    • This third-party validation provides customers with the independent perspective of the effectiveness of controls in place.
    • AWS customers that have signed a non-disclosure agreement with AWS may request a copy of the SOC 1 Type II report.
  • Third-party access
    • AWS strictly controls access to data centers, even for internal employees.
    • Third parties are not provided access to AWS data centers except when explicitly approved by the appropriate AWS data center manager per the AWS access policy
  • Multi-tenancy
    • AWS environment is a virtualized, multi-tenant environment.
    • AWS has implemented security management processes, PCI controls, and other security controls designed to isolate each customer from other customers.
    • AWS systems are designed to prevent customers from accessing physical hosts or instances not assigned to them by filtering through the virtualization software.
  • Hypervisor vulnerabilities
    • Amazon EC2 utilizes a highly customized version of Xen hypervisor.
    • Hypervisor is regularly assessed for new and existing vulnerabilities and attack vectors by internal and external penetration teams, and is well suited for maintaining strong isolation between guest virtual machines
  • Vulnerability management
    • AWS is responsible for patching systems supporting the delivery of service to customers, such as the hypervisor and networking services
  • Encryption
    • AWS allows customers to use their own encryption mechanisms for nearly all the services, including S3, EBS, SimpleDB, and EC2.
    • IPSec tunnels to VPC are also encrypted
  • Data isolation
    • All data stored by AWS on behalf of customers has strong tenant isolation security and control capabilities
  • Composite services
    • AWS does not leverage any third-party cloud providers to deliver AWS services to customers.
  • Distributed Denial Of Service (DDoS) attacks
    • AWS network provides significant protection against traditional network security issues and the customer can implement further protection
  • Data portability
    • AWS allows customers to move data as needed on and off AWS storage
  • Service & Customer provider business continuity
    • AWS does operate a business continuity program
    • AWS data centers incorporate physical protection against environmental risks.
    • AWS’ physical protection against environmental risks has been validated by an independent auditor and has been certified
    • AWS provides customers with the capability to implement a robust continuity plan with multi region/AZ deployment architectures, backups, data redundancy replication
  • Capability to scale
    • AWS cloud is distributed, highly secure and resilient, giving customers massive scale potential.
    • Customers may scale up or down, paying for only what they use
  • Service availability
    • AWS does commit to high levels of availability in its service level agreements (SLAs), e.g. 99.9% for S3
  • Application Security
    • AWS system development lifecycle incorporates industry best practices which include formal design reviews by the AWS Security Team, source code analysis, threat modeling and completion of a risk assessment
    • AWS does not generally outsource development of software.
  • Threat and Vulnerability Management
    • AWS Security regularly engages independent security firms to perform external vulnerability threat assessments
    • AWS Security regularly scans all Internet-facing service endpoint IP addresses for vulnerabilities, but these scans do not include customer instances
    • AWS Security notifies the appropriate parties to remediate any identified vulnerabilities.
    • Customers can request permission to conduct scans and Penetration tests of their cloud infrastructure as long as they are limited to the customer’s instances and do not violate the AWS Acceptable Use Policy. Advance approval for these types of scans is required
  • Data Security

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. When preparing for a compliance assessment of your system built inside of AWS, what are three best practices for you to prepare for an audit? Choose 3 answers
    1. Gather evidence of your IT operational controls (Customer still needs to gather all the IT operation controls inline with their environment)
    2. Request and obtain applicable third-party audited AWS compliance reports and certifications (Customers can request the reports and certifications produced by our third-party auditors or can request more information about AWS Compliance)
    3. Request and obtain a compliance and security tour of an AWS data center for a pre-assessment security review (AWS does not allow data center tour)
    4. Request and obtain approval from AWS to perform relevant network scans and in-depth penetration tests of your system’s Instances and endpoints (AWS requires prior approval to be taken to perform penetration tests)
    5. Schedule meetings with AWS’s third-party auditors to provide evidence of AWS compliance that maps to your control objectives (Customers can request the reports and certifications produced by our third-party auditors or can request more information about AWS Compliance)
  2. In the shared security model, AWS is responsible for which of the following security best practices (check all that apply):
    1. Penetration testing
    2. Operating system account security management
    3. Threat modeling
    4. User group access management
    5. Static code analysis
  3. You are running a web application on AWS consisting of the following components: an Elastic Load Balancer (ELB), an Auto Scaling group of EC2 instances running Linux/PHP/Apache, and Relational Database Service (RDS) MySQL. Which security measures fall into AWS’s responsibility?
    1. Protect the EC2 instances against unsolicited access by enforcing the principle of least-privilege access (Customer owned)
    2. Protect against IP spoofing or packet sniffing
    3. Assure all communication between EC2 instances and ELB is encrypted (Customer owned)
    4. Install latest security patches on ELB, RDS and EC2 instances (Customer owned)
  4. Which of the following statements is true about achieving PCI certification on the AWS platform? (Choose 2)
    1. Your organization owns the compliance initiatives related to anything placed on the AWS infrastructure
    2. Amazon EC2 instances must run on a single-tenancy environment (dedicated instance)
    3. AWS manages card-holder environments
    4. AWS Compliance provides assurance related to the underlying infrastructure

References

AWS Glacier – Certification

AWS Glacier

  • Amazon Glacier is a storage service optimized for archival, infrequently used data, or “cold data.”
  • Glacier is an extremely low-cost storage service that provides durable storage with security features for data archiving and backup.
  • Glacier is designed to provide average annual durability of 99.999999999% for an archive.
  • Glacier redundantly stores data in multiple facilities and on multiple devices within each facility.
  • To increase durability, Glacier synchronously stores the data across multiple facilities before returning SUCCESS on uploading archives.
  • Glacier performs regular, systematic data integrity checks and is built to be automatically self-healing.
  • Glacier enables customers to offload the administrative burdens of operating and scaling storage to AWS, without having to worry about capacity planning, hardware provisioning, data replication, hardware failure detection and recovery, or time-consuming hardware migrations.
  • Glacier is a great storage choice when low storage cost is paramount, with data rarely retrieved, and retrieval latency of several hours is acceptable.
  • S3 should be used if applications require fast, frequent, real-time access to the data
  • Glacier can store virtually any kind of data in any format.
  • All data is encrypted on the server side with Glacier handling key management and key protection. It uses AES-256, one of the strongest block ciphers available
  • Glacier allows interaction through the AWS Management Console, Command Line Interface (CLI), SDKs, or REST-based APIs.
    • The management console can only be used to create and delete vaults.
    • The rest of the operations, such as uploading and downloading data or creating retrieval jobs, need the CLI, SDKs, or REST-based APIs
  • Use cases include
    • Digital media archives
    • Data that must be retained for regulatory compliance
    • Financial and healthcare records
    • Raw genomic sequence data
    • Long-term database backups

Amazon Glacier Data Model

  • Amazon Glacier data model core concepts include vaults and archives, and it also includes job and notification-configuration resources
    • Vault
      • A vault is a container for storing archives
      • Each vault resource has a unique address, which comprises the region in which the vault was created and the vault name, which is unique within the region and account, e.g. https://glacier.us-west-2.amazonaws.com/111122223333/vaults/examplevault
      • A vault allows storage of an unlimited number of archives
      • Glacier supports various vault operations which are region specific
      • An AWS account can create up to 1,000 vaults per region.
    • Archive
      • An archive can be any data such as a photo, video, or document and is a base unit of storage in Glacier.
      • Each archive has a unique ID and an optional description, which can only be specified during the upload of an archive.
      • Glacier assigns the archive an ID, which is unique in the AWS region in which it is stored.
      • An archive can be uploaded in a single request, while for large archives Glacier provides a multipart upload API that enables uploading an archive in parts.
    • Jobs
      • A job is required to retrieve an archive or a vault inventory list
      • Data retrieval requests are asynchronous operations, are queued, and most jobs take about four hours to complete.
      • A job is first initiated and then the output of the job is downloaded after the job completes
      • Vault inventory jobs need the vault name
      • Data retrieval jobs need both the vault name and the archive ID, with an optional description
      • A vault can have multiple jobs in progress at any point in time, each identified by a job ID, assigned when the job is created, for tracking
      • Glacier maintains job information such as job type, description, creation date, completion date, and job status and can be queried
      • After the job completes, the job output can be downloaded in full or partially by specifying a byte range.
    • Notification Configuration
      • As the jobs are asynchronous, Glacier supports a notification mechanism that publishes to an SNS topic when a job completes
      • The SNS topic for notifications can be specified either with each individual job request or on the vault
      • Glacier stores the notification configuration as a JSON document (a vault creation and notification configuration sketch follows this list)
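
A minimal boto3 sketch of creating a vault and storing a notification configuration on it; the vault name and SNS topic ARN are placeholders:

```python
import boto3

glacier = boto3.client("glacier", region_name="us-east-1")

# '-' tells Glacier to use the account ID of the credentials making the request.
ACCOUNT_ID = "-"
VAULT_NAME = "examplevault"                                         # placeholder vault name
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:glacier-jobs"   # placeholder topic

# Create the vault (idempotent if the vault already exists).
glacier.create_vault(accountId=ACCOUNT_ID, vaultName=VAULT_NAME)

# Store a notification configuration so Glacier publishes to SNS when jobs complete.
glacier.set_vault_notifications(
    accountId=ACCOUNT_ID,
    vaultName=VAULT_NAME,
    vaultNotificationConfig={
        "SNSTopic": SNS_TOPIC_ARN,
        "Events": ["ArchiveRetrievalCompleted", "InventoryRetrievalCompleted"],
    },
)
```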

Glacier Supported Operations

Vault Operations

  • Glacier provides operations to create and delete vaults.
  • A vault can be deleted only if there are no archives in the vault as of the last computed inventory and there have been no writes to the vault since the last inventory (as the inventory is prepared periodically)
  • Vault Inventory
    • Vault inventory helps retrieve the list of archives in a vault with information such as archive ID, creation date, and size for each archive
    • The inventory for each vault is prepared periodically, every 24 hours
    • Vault inventory is updated approximately once a day, starting on the day the first archive is uploaded to the vault.
    • When a vault inventory job is run, Glacier returns the last inventory it generated, which is a point-in-time snapshot and not real-time data.
  • Vault Metadata or Description can also be obtained for a specific vault or for all vaults in a region, which provides information such as
    • creation date,
    • number of archives in the vault,
    • total size in bytes used by all the archives in the vault,
    • and the date the vault inventory was generated
  • Glacier also provides operations to set, retrieve, and delete a notification configuration on the vault. Notifications can be used to identify vault events (an inventory retrieval job sketch follows this list).
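
A minimal boto3 sketch of initiating an inventory retrieval job and fetching its output once it succeeds; the vault name is a placeholder, and in practice an SNS completion notification is preferable to polling:

```python
import boto3

glacier = boto3.client("glacier", region_name="us-east-1")
VAULT_NAME = "examplevault"   # placeholder

# Initiate an asynchronous inventory-retrieval job for the vault.
job = glacier.initiate_job(
    vaultName=VAULT_NAME,
    jobParameters={"Type": "inventory-retrieval", "Description": "nightly inventory pull"},
)
job_id = job["jobId"]

# Check the job status (in practice, prefer an SNS completion notification over polling).
status = glacier.describe_job(vaultName=VAULT_NAME, jobId=job_id)
print(status["StatusCode"])   # InProgress | Succeeded | Failed

# Once the job has succeeded, download the inventory (a JSON document).
if status["StatusCode"] == "Succeeded":
    output = glacier.get_job_output(vaultName=VAULT_NAME, jobId=job_id)
    inventory = output["body"].read()
```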

Archive Operations

  • Glacier provides operations to upload, download and delete archives.

Uploading an Archive

  • An archive can be uploaded in a single operation (1 byte up to 4 GB in size) or in parts, referred to as multipart upload (up to 40 TB)
  • Multipart Upload helps to
    • improve the upload experience for larger archives.
    • upload archives in parts, independently, in parallel, and in any order
    • recover faster by re-uploading only the part that failed and not the entire archive.
    • upload archives without knowing the final size in advance
    • upload archives from 1 byte to about 40,000 GB (10,000 parts * 4 GB) in size
  • To upload existing data to Glacier, consider using the AWS Import/Export service, which accelerates moving large amounts of data into and out of AWS using portable storage devices for transport. AWS transfers the data directly onto and off of storage devices using Amazon’s high-speed internal network, bypassing the Internet.
  • Glacier returns a response that includes an archive ID which is unique in the region in which the archive is stored
  • Glacier does not support any additional metadata information apart from an optional description; any additional metadata required should be maintained on the client side (an upload sketch follows this list)
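
A minimal boto3 sketch of a single-operation upload, assuming a small local file (anything large would use the multipart upload APIs instead); the vault and file names are placeholders:

```python
import boto3

glacier = boto3.client("glacier", region_name="us-east-1")
VAULT_NAME = "examplevault"   # placeholder

# Single-operation upload; suitable for archives up to 4 GB.
with open("backup-2017-01.tar.gz", "rb") as archive_file:
    response = glacier.upload_archive(
        vaultName=VAULT_NAME,
        archiveDescription="January backup",   # the only metadata Glacier keeps
        body=archive_file,
    )

# The archive ID must be stored client-side; it is the only handle for later retrieval.
print("Archive ID:", response["archiveId"])
print("Checksum  :", response["checksum"])
```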

Downloading an Archive

  • Downloading an archive is an asynchronous, two-step process
    • Initiate an archive retrieval job
      • When a Job is initiated, a job ID is returned as a part of the response
      • Job is executed asynchronously and the output can be downloaded after the job completes
      • Job can be initiated to download the entire archive or a portion of the archive
    • After the job completes, download the bytes
      • The archive can be downloaded in full, or a specific byte range can be given to download only a portion of the output
      • Downloading the archive in chunks helps in the event of a download failure, as only the failed chunk needs to be downloaded again
      • Job completion status can be checked by
        • Check status explicitly (Not Recommended)
          • periodically poll using the describe job operation to obtain job information
        • Completion notification
          • An SNS topic can be specified, when the job is initiated or with the vault, to be used to notify job completion

About Range Retrievals

  • Amazon Glacier allows retrieving an archive either in whole (the default) or as a range, i.e. a portion of the archive
  • Range retrievals need a range to be provided that is megabyte aligned
  • Glacier returns a checksum in the response which can be used to verify whether there were any errors in the download, by comparing it with a checksum computed on the client side
  • Specifying a range of bytes can be helpful when:
    • Control bandwidth costs
      • Glacier allows retrieval of up to 5 percent of the average monthly storage (pro-rated daily) for free each month
      • Scheduling range retrievals can help in two ways.
        • meet the monthly free allowance of 5 percent by spreading out the data requested
        • if the amount of data to be retrieved exceeds the free allowance percentage, scheduling range retrievals enables reduction of the peak retrieval rate, which determines the retrieval fees.
    • Manage your data downloads
      • Glacier allows retrieved data to be downloaded for 24 hours after the retrieval request completes
      • Only portions of the archive can be retrieved so that the schedule of downloads can be managed within the given download window.
    • Retrieve a targeted part of a large archive
      • Retrieving an archive in a range can be useful if an archive was uploaded as an aggregate of multiple individual files and only a few of the files need to be retrieved (a ranged retrieval sketch follows this list)
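
A minimal boto3 sketch of a ranged archive retrieval, assuming the archive ID saved at upload time; the vault name, archive ID, and SNS topic are placeholders, and the byte ranges are megabyte aligned as required:

```python
import boto3

glacier = boto3.client("glacier", region_name="us-east-1")
VAULT_NAME = "examplevault"             # placeholder
ARCHIVE_ID = "example-archive-id"       # placeholder, saved when the archive was uploaded
ONE_MB = 1024 * 1024
RETRIEVE_BYTES = 16 * ONE_MB            # retrieve only the first 16 MB of the archive

# Step 1: initiate an archive-retrieval job for a megabyte-aligned byte range.
job = glacier.initiate_job(
    vaultName=VAULT_NAME,
    jobParameters={
        "Type": "archive-retrieval",
        "ArchiveId": ARCHIVE_ID,
        "RetrievalByteRange": "0-%d" % (RETRIEVE_BYTES - 1),
        "SNSTopic": "arn:aws:sns:us-east-1:123456789012:glacier-jobs",  # completion notice
    },
)

# Step 2: run this a few hours later, once the job-completion notification arrives.
def download_output(job_id, chunk_size=4 * ONE_MB):
    # Download in chunks so a failed chunk can be re-fetched without restarting everything.
    for start in range(0, RETRIEVE_BYTES, chunk_size):
        end = start + chunk_size - 1
        part = glacier.get_job_output(
            vaultName=VAULT_NAME,
            jobId=job_id,
            range="bytes=%d-%d" % (start, end),
        )
        yield part["body"].read()
```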

Deleting an Archive

  • Archives can be deleted from a vault only one at a time
  • This operation is idempotent; deleting an already-deleted archive does not result in an error
  • AWS applies a pro-rated charge for items that are deleted prior to 90 days, as Glacier is meant for long-term storage

Updating an Archive

  • An existing archive cannot be updated; it must be deleted and re-uploaded, and the new upload is assigned a new archive ID

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What is Amazon Glacier?
    1. You mean Amazon “Iceberg”: it’s a low-cost storage service.
    2. A security tool that allows to “freeze” an EBS volume and perform computer forensics on it.
    3. A low-cost storage service that provides secure and durable storage for data archiving and backup
    4. It’s a security tool that allows to “freeze” an EC2 instance and perform computer forensics on it.
  2. Amazon Glacier is designed for: (Choose 2 answers)
    1. Active database storage
    2. Infrequently accessed data
    3. Data archives
    4. Frequently accessed data
    5. Cached session data
  3. An organization is generating digital policy files which are required by the admins for verification. Once the files are verified they may not be required in the future unless there is some compliance issue. If the organization wants to save them in a cost effective way, which is the best possible solution?
    1. AWS RRS
    2. AWS S3
    3. AWS RDS
    4. AWS Glacier
  4. A user has moved an object to Glacier using the life cycle rules. The user requests to restore the archive after 6 months. When the restore request is completed the user accesses that archive. Which of the below mentioned statements is not true in this condition?
    1. The archive will be available as an object for the duration specified by the user during the restoration request
    2. The restored object’s storage class will be RRS (After the object is restored the storage class still remains GLACIER)
    3. The user can modify the restoration period only by issuing a new restore request with the updated period
    4. The user needs to pay storage for both RRS (restored) and Glacier (Archive) Rates
  5. To meet regulatory requirements, a pharmaceuticals company needs to archive data after a drug trial test is concluded. Each drug trial test may generate up to several thousands of files, with compressed file sizes ranging from 1 byte to 100MB. Once archived, data rarely needs to be restored, and on the rare occasion when restoration is needed, the company has 24 hours to restore specific files that match certain metadata. Searches must be possible by numeric file ID, drug name, participant names, date ranges, and other metadata. Which is the most cost-effective architectural approach that can meet the requirements?
    1. Store individual files in Amazon Glacier, using the file ID as the archive name. When restoring data, query the Amazon Glacier vault for files matching the search criteria. (Individual files are expensive and do not allow searching by participant names etc.)
    2. Store individual files in Amazon S3, and store search metadata in an Amazon Relational Database Service (RDS) multi-AZ database. Create a lifecycle rule to move the data to Amazon Glacier after a certain number of days. When restoring data, query the Amazon RDS database for files matching the search criteria, and move the files matching the search criteria back to S3 Standard class. (As the data is not needed can be stored to Glacier directly and the data need not be moved back to S3 standard)
    3. Store individual files in Amazon Glacier, and store the search metadata in an Amazon RDS multi-AZ database. When restoring data, query the Amazon RDS database for files matching the search criteria, and retrieve the archive name that matches the file ID returned from the database query. (Individual files and Multi-AZ is expensive)
    4. First, compress and then concatenate all files for a completed drug trial test into a single Amazon Glacier archive. Store the associated byte ranges for the compressed files along with other search metadata in an Amazon RDS database with regular snapshotting. When restoring data, query the database for files that match the search criteria, and create restored files from the retrieved byte ranges.
    5. Store individual compressed files and search metadata in Amazon Simple Storage Service (S3). Create a lifecycle rule to move the data to Amazon Glacier, after a certain number of days. When restoring data, query the Amazon S3 bucket for files matching the search criteria, and retrieve the file to S3 reduced redundancy in order to move it back to S3 Standard class. (Once the data is moved from S3 to Glacier the metadata is lost, as Glacier does not have metadata and must be maintained externally)

References

AWS Import/Export – Certification

AWS Import/Export Disk

  • AWS Import/Export accelerates moving large amounts of data into and out of AWS using portable storage devices for transport
  • AWS transfers the data directly onto and off of storage devices using Amazon’s high-speed internal network, bypassing the Internet, and can be much faster and more cost effective than upgrading connectivity.
  • AWS Import/Export can be implemented in two different ways
    • AWS Import/Export Disk (Disk)
      • originally the only service offered by AWS for data transfer by mail
      • Disk supports transferring data directly onto and off of storage devices you own using the Amazon high-speed internal network
    • AWS Snowball
      • is generally faster and cheaper to use than Disk for importing data into Amazon S3
  • AWS Import/Export supports
    • importing data to several types of AWS storage, including EBS snapshots, S3 buckets, and Glacier vaults.
    • exporting data from S3 only
  • Data load typically begins the next business day after the storage device arrives at AWS; after the data export or import completes, the storage device is returned

Ideal Usage Patterns

  • AWS Import/Export is ideal for transferring large amounts of data in and out of the AWS cloud, especially in cases where transferring the data over the Internet would be too slow (a week or more) or too costly.
  • Common use cases include
    • first time migration – initial data upload to AWS
    • content distribution or regular data interchange to/from your customers or business associates,
    • off-site backup – transfer to Amazon S3 or Amazon Glacier for off-site backup and archival storage, and
    • disaster recovery – quick retrieval (export) of large backups from Amazon S3 or Amazon Glacier

AWS Import/Export Disk Jobs

  • AWS Import/Export jobs can be created in 2 steps
    • Submit a job request to AWS, where each job corresponds to exactly one storage device
    • Send the storage device to AWS, which is returned after the data is uploaded or downloaded
  • AWS Import/Export jobs can be created
    • using a command line tool, which requires no programming or
    • programmatically using the AWS SDK for Java or the REST API to send requests to AWS (a hedged SDK sketch follows this list), or
    • even through third party tools
  • AWS Import/Export Data Encryption
    • supports the following data encryption methods
      • PIN-code encryption: hardware-based device encryption that uses a physical PIN pad for access to the data
      • TrueCrypt software encryption: disk encryption using TrueCrypt, which is an open-source encryption application
    • Creating an import or export job with encryption requires providing the PIN code or password for the selected encryption method
    • Although it is not mandatory for the data to be encrypted for import jobs, it is highly recommended
    • All export jobs require data encryption and can use hardware encryption, software encryption, or both methods.
  • AWS Import/Export supported Job Types
    • Import to S3
    • Import to Glacier
    • Import to EBS
    • Export to S3
  • AWS erases the device after every import job prior to return shipping.
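
A hedged sketch of creating a Disk import job programmatically via the legacy importexport API exposed by boto3; the manifest below is an illustrative, incomplete fragment (a real manifest needs the full set of fields described in the service documentation):

```python
import boto3

ie = boto3.client("importexport", region_name="us-east-1")

# Illustrative manifest fragment; a real manifest has more required fields
# (device identifiers, full return address, etc.) per the service documentation.
manifest = """\
manifestVersion: 2.0
bucket: my-import-bucket
deviceId: ABCDE
eraseDevice: yes
returnAddress:
    name: Jane Doe
    street1: 123 Any Street
    city: Anytown
    stateOrProvince: WA
    postalCode: 98101
    country: USA
"""

# ValidateOnly=True would check the manifest without actually creating the job.
response = ie.create_job(JobType="Import", Manifest=manifest, ValidateOnly=False)
print("Job ID:", response["JobId"])
print("Signature (goes in the SIGNATURE file on the device):", response["Signature"])
```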

Guidelines and Limitations

  • AWS Import/Export does not support Server-Side Encryption (SSE) when importing data.
  • Maximum file size of a single file or object to be imported is 5 TB. Files and objects larger than 5 TB won’t be imported.
  • Maximum device capacity is 16 TB for Amazon Simple Storage Service (Amazon S3) and Amazon EBS jobs.
  • Maximum device capacity is 4 TB for Amazon Glacier jobs.
  • AWS Import/Export exports only the latest version from an Amazon S3 bucket that has versioning turned on.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You are working with a customer who has 10 TB of archival data that they want to migrate to Amazon Glacier. The customer has a 1-Mbps connection to the Internet. Which service or feature provides the fastest method of getting the data into Amazon Glacier?
    1. Amazon Glacier multipart upload
    2. AWS Storage Gateway
    3. VM Import/Export
    4. AWS Import/Export (Normal upload will take ~900 days as the internet max speed is capped)

References

AWS Certification Exams – Preparation – Sample Questions

AWS Solution Architect & SysOps Associate Certification Exams Preparation & Sample Questions

I recently passed AWS Solution Architect – Associate (90%) & SysOps – Associate (81%) certification exams.

I would like to share my preparation leading up to, and my experience of, the exams

  • AWS Certification exams are pretty tough to crack as they cover a lot of topics from a wide range of services offered by them.
  • I cleared both the Solution Architect and SysOps Associate certifications in a time frame of 2 months.
  • I had 6 months of prior hands-on experience with AWS primarily on IAM, VPC, EC2, S3 & RDS which helped a lot
  • There are a lot of resources online which can be helpful, but they can be overwhelming and can also misguide you (I found a lot of dumps which have sample exam questions, but with the answers marked wrong)
  • Although AWS Associate certifications can be cleared with purely theoretical knowledge, a bit of hands-on experience really helps a lot.
  • Also, AWS services are updated literally every day, with new features being added, issues resolved and so on, which the exam questions surely don’t keep track of. Not sure how often the exam questions are updated.
  • So my suggestion is: if you see a question focusing on a capability that AWS added only recently (within a month or so), don’t go with that answer; stick to the answer which was relevant before the update, e.g. encryption of the root volume, which usually appears in the certification exam with options to use external tools, even though it was enabled natively by AWS recently.

AWS Certification Exam Preparation

As I mentioned, there are a lot of resources and courses online for the certification exams, which can be overwhelming; this is what I did for my preparation to clear the exams

  • Went through AWS Certification Preparation guide
  • Went through the AWS Solution Architect & SysOps blueprints thoroughly as they mention the topics and their weightage in the exam
  • Purchased the acloud.guru courses from Udemy (got them for $10 on discount) for both the Solution Architect and SysOps exams, which greatly helped to get a clear picture of the format, topics and relevant sections
  • Signed up with AWS for the Free Tier account, which provides a lot of the services to be tried for free with certain limits, which are more than enough to get things going. Be sure to decommission anything you use beyond the free limits, to prevent any surprises 🙂
  • Also used QwikLabs for all the introductory courses, which are free and allow you to try out the services multiple times (I think it’s max 5, as I got warnings a couple of times)
  • Update: Qwiklabs seems to have reduced the free courses quite a lot and now provide targeted labs for AWS Certification exams which are charged
  • Went through a few of the whitepapers
  • Read the FAQs at least for the important topics, as they cover important points and are good for a quick review
  • Went through multiple sites to consolidate the Sample exam questions and worked on them to get the correct answers. I have tried to consolidate them further in this blog topic wise.
  • Went through multiple discussion topics on the acloud.guru courses, which are pretty interesting, provide further insights, and some of which are actually certification exam questions
  • I did not purchase the AWS Practice exams, as the questions are available all around. But if you want to check the format, it might be useful.
  • Opinion: the acloud.guru courses are good by themselves, but not sufficient to pass the exam; they might help cover about 50-60% of the exam questions
  • Also, if you are well prepared, the time for the certification exam is more than enough; I could answer all the questions within an hour and was able to review all of them once.
  • Important exam time tip: only mark the questions which you doubt as Mark for Review and then go through only those. I made the mistake of marking quite a few as Mark for Review even though I was confident of the answers, and wasted time on them again.

AWS Associate Certification Exam Important Topics

Targeting the Professional Certifications next ……

 

AWS SWF – Simple Workflow Overview – Certification

AWS SWF – Simple Workflow

  • AWS SWF makes it easy to build applications that coordinate work across distributed components
  • SWF makes it easier to develop asynchronous and distributed applications by providing a programming model and infrastructure for coordinating distributed components, tracking and maintaining their execution state in a reliable way
  • SWF does the following
    • stores metadata about a workflow and its component parts.
    • stores tasks for workers and queues them until a worker needs them.
    • assigns tasks to workers, which can run either in the cloud or on-premises
    • routes information between executions of a workflow and the associated Workers.
    • tracks the progress of workers on Tasks, with configurable timeouts.
    • maintains workflow state in a durable fashion
  • SWF helps coordinate tasks across the application, which involves managing intertask dependencies, scheduling, and concurrency in accordance with the logical flow of the application.
  • SWF gives full control over implementing tasks and coordinating them without worrying about underlying complexities such as tracking their progress and maintaining their state.
  • SWF tracks and maintains the workflow state in a durable fashion, so that the application is resilient to failures in individual components, which can be implemented, deployed, scaled, and modified independently
  • SWF offers capabilities to support a variety of application requirements and is suitable for a range of use cases that require coordination of tasks, including media processing, web application back-ends, business process workflows, and analytics pipelines.

Simple Workflow Concepts

AWS SWF Components

  • Workflow
    • The fundamental concept in SWF is the workflow, which is the automation of a business process
    • A workflow is a set of activities that carry out some objective, together with logic that coordinates the activities.
  • Workflow Execution
    • A workflow execution is a running instance of a workflow
  • Workflow History
    • SWF maintains the state and progress of each workflow execution in its Workflow History, which saves the application from having to store the state in a durable way.
    • It enables applications to be stateless as all information about a workflow execution is stored in its workflow history.
    • For each workflow execution, the history provides a record of which activities were scheduled, their current status, and their results. The workflow execution uses this information to determine next steps.
    • History provides a detailed audit trail that can be used to monitor running workflow executions and verify completed workflow executions.
    • Operations that do not change the state of the workflow, e.g. polling for tasks, do not typically appear in the workflow history
  • Domain
    • Each workflow runs in an AWS resource called a Domain, which controls the workflow’s scope
    • An AWS account can have multiple domains, with each containing multiple workflows
    • Workflows in different domains cannot interact with each other
  • Activities
    • When designing an SWF workflow, activities need to be precisely defined and then registered with SWF as an activity type, with information such as name, version, and timeouts
  • Activity Task & Activity Worker
    • An Activity Worker is a program that receives activity tasks, performs them, and provides results back. An activity worker can be a program or even a person who performs the task using an activity worker software
    • Activity tasks, and the activity workers that perform them, can
      • run synchronously or asynchronously, be distributed across multiple computers, potentially in different geographic regions, or run on the same computer,
      • be written in different programming languages and run on different operating systems
      • be long-running, may fail, time out or require restarts, and may complete with varying throughput & latency
  • Decider
    • A Decider implements a Workflow’s coordination logic.
    • Decider schedules activity tasks, provides input data to the activity workers, processes events that arrive while the workflow is in progress, and ends (or closes) the workflow when the objective has been completed.
    • Decider directs the workflow by receiving decision tasks from SWF and responding back to SWF with decisions. A decision represents an action or set of actions that are the next steps in the workflow, which can be to schedule an activity task, set timers to delay the execution of an activity task, request cancellation of activity tasks already in progress, or complete or close the workflow.
  • Workers and Deciders are both stateless, and can respond to increased traffic by simply adding additional Workers and Deciders as needed
  • The role of the SWF service is to function as a reliable central hub through which data is exchanged between the decider, the activity workers, and other relevant entities such as the person administering the workflow.
  • Both the activity workers and the decider receive their tasks (activity tasks and decision tasks respectively) by polling SWF
  • SWF supports "long polling", where requests are held open for up to 60 seconds if necessary, to reduce network traffic and unnecessary processing (see the worker polling sketch after this list)
  • SWF informs the decider of the state of the workflow by including with each decision task a copy of the current workflow execution history. The workflow execution history is composed of events, where an event represents a significant change in the state of the workflow execution, for e.g. the completion of a task, notification that a task has timed out, or the expiration of a timer that was set earlier in the workflow execution. The history is a complete, consistent, and authoritative record of the workflow's progress
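
A minimal sketch of the long-polling activity worker model using the boto3 SWF client is shown below. The domain name, task list, and the do_work helper are hypothetical and used for illustration only.

    import boto3

    swf = boto3.client('swf', region_name='us-east-1')

    # Hypothetical domain and task list names used for illustration only
    DOMAIN = 'order-processing'
    TASK_LIST = {'name': 'default-task-list'}

    def do_work(task_input):
        # Placeholder for the actual processing step of the workflow
        return 'processed:' + task_input

    def run_activity_worker():
        """Long-poll SWF for activity tasks, do the work, and report the result."""
        while True:
            # poll_for_activity_task holds the connection open for up to 60 seconds
            task = swf.poll_for_activity_task(domain=DOMAIN, taskList=TASK_LIST)
            if not task.get('taskToken'):
                continue  # long poll timed out without a task; poll again
            result = do_work(task.get('input', ''))
            swf.respond_activity_task_completed(
                taskToken=task['taskToken'],
                result=result,
            )

A decider works the same way, except it calls poll_for_decision_task and responds with respond_decision_task_completed, passing back a list of decisions.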

Workflow Implementation & Execution

  1. Implement Activity workers with the processing steps in the Workflow.
  2. Implement Decider with the coordination logic of the Workflow.
  3. Register the Activities and workflow with SWF.
  4. Start the Activity workers and Decider. Once started, the decider and activity workers should start polling Amazon SWF for tasks.
  5. Start one or more executions of the Workflow. Each execution runs independently and can be provided with its own set of input data.
  6. When an execution is started, SWF schedules the initial decision task. In response, the decider begins generating decisions which initiate activity tasks. Execution continues until your decider makes a decision to close the execution.
  7. View and Track workflow executions
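
A minimal sketch of steps 3 and 5 using the boto3 SWF client follows; the domain, type names, versions, and timeouts are hypothetical, and the register_* calls fail if the types already exist, so this is illustrative only.

    import boto3

    swf = boto3.client('swf', region_name='us-east-1')

    # 3. Register the domain, workflow type, and activity type (hypothetical names)
    swf.register_domain(name='order-processing',
                        workflowExecutionRetentionPeriodInDays='7')
    swf.register_workflow_type(domain='order-processing',
                               name='process-order', version='1.0',
                               defaultTaskStartToCloseTimeout='300',
                               defaultExecutionStartToCloseTimeout='3600',
                               defaultTaskList={'name': 'default-task-list'},
                               defaultChildPolicy='TERMINATE')
    swf.register_activity_type(domain='order-processing',
                               name='charge-payment', version='1.0',
                               defaultTaskList={'name': 'default-task-list'})

    # 5. Start a workflow execution with its own input data
    swf.start_workflow_execution(domain='order-processing',
                                 workflowId='order-12345',
                                 workflowType={'name': 'process-order', 'version': '1.0'},
                                 input='{"orderId": "12345"}')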

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What does Amazon SWF stand for?
    1. Simple Web Flow
    2. Simple Work Flow
    3. Simple Wireless Forms
    4. Simple Web Form
  2. For which of the following use cases are Simple Workflow Service (SWF) and Amazon EC2 an appropriate solution? Choose 2 answers
    1. Using as an endpoint to collect thousands of data points per hour from a distributed fleet of sensors
    2. Managing a multi-step and multi-decision checkout process of an e-commerce website
    3. Orchestrating the execution of distributed and auditable business processes
    4. Using as an SNS (Simple Notification Service) endpoint to trigger execution of video transcoding jobs
    5. Using as a distributed session store for your web application
  3. Amazon SWF is designed to help users…
    1. … Design graphical user interface interactions
    2. … Manage user identification and authorization
    3. … Store Web content
    4. … Coordinate synchronous and asynchronous tasks which are distributed and fault tolerant.
  4. What does a “Domain” refer to in Amazon SWF?
    1. A security group in which only tasks inside can communicate with each other
    2. A special type of worker
    3. A collection of related Workflows
    4. The DNS record for the Amazon SWF service
  5. Your company produces customer commissioned one-of-a-kind skiing helmets combining high fashion with custom technical enhancements. Customers can show off their individuality on the ski slopes and have access to heads-up displays, GPS, rear-view cams, and any other technical innovation they wish to embed in the helmet. The current manufacturing process is data rich and complex, including assessments to ensure that the custom electronics and materials used to assemble the helmets are to the highest standards. Assessments are a mixture of human and automated assessments. You need to add a new set of assessments to model the failure modes of the custom electronics using GPUs with CUDA across a cluster of servers with low latency networking. What architecture would allow you to automate the existing process using a hybrid approach and ensure that the architecture can support the evolution of processes over time?
    1. Use AWS Data Pipeline to manage movement of data & meta-data and assessments. Use an auto-scaling group of G2 instances in a placement group. (Involves mixture of human assessments)
    2. Use Amazon Simple Workflow (SWF) to manage assessments, movement of data & meta-data. Use an autoscaling group of G2 instances in a placement group. (Human and automated assessments with GPU and low latency networking)
    3. Use Amazon Simple Workflow (SWF) to manage assessments, movement of data & meta-data. Use an autoscaling group of C3 instances with SR-IOV (Single Root I/O Virtualization). (C3 and SR-IOV won’t provide GPUs; also, enhanced networking needs to be enabled)
    4. Use AWS data Pipeline to manage movement of data & meta-data and assessments use auto-scaling group of C3 with SR-IOV (Single Root I/O virtualization). (Involves mixture of human assessments)
  6. Your startup wants to implement an order fulfillment process for selling a personalized gadget that needs an average of 3-4 days to produce, with some orders taking up to 6 months. You expect 10 orders per day on your first day, 1000 orders per day after 6 months and 10,000 orders per day after 12 months. Orders coming in are checked for consistency, then dispatched to your manufacturing plant for production, quality control, packaging, shipment and payment processing. If the product does not meet the quality standards at any stage of the process, employees may force the process to repeat a step. Customers are notified via email about order status and any critical issues with their orders such as payment failure. Your base architecture includes AWS Elastic Beanstalk for your website with an RDS MySQL instance for customer data and orders. How can you implement the order fulfillment process while making sure that the emails are delivered reliably?
    1. Add a business process management application to your Elastic Beanstalk app servers and re-use the RDS database for tracking order status. Use one of the Elastic Beanstalk instances to send emails to customers. (Would use SWF instead of a BPM application)
    2. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1. Use the decider instance to send emails to customers. (Decider sending emails might not be reliable)
    3. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1. Use SES to send emails to customers.
    4. Use an SQS queue to manage all process tasks. Use an Auto Scaling group of EC2 Instances that poll the tasks and execute them. Use SES to send emails to customers. (Does not provide an ability to repeat a step)
  7. Select appropriate use cases for SWF with Amazon EC2? (Choose 2)
    1. Video encoding using Amazon S3 and Amazon EC2. In this use case, large videos are uploaded to Amazon S3 in chunks. Application is built as a workflow where each video file is handled as one workflow execution.
    2. Processing large product catalogs using Amazon Mechanical Turk. While validating data in large catalogs, the products in the catalog are processed in batches. Different batches can be processed concurrently.
    3. Order processing system with Amazon EC2, SQS, and SimpleDB. Use SWF notifications to orchestrate an order processing system running on EC2, where notifications sent over HTTP can trigger real-time processing in related components such as an inventory system or a shipping service.
    4. Using as an SQS (Simple Queue Service) endpoint to trigger execution of video transcoding jobs.

AWS Web Application Firewall – WAF – Certification

AWS Web Application Firewall – WAF

Overview

  • AWS WAF is a web application firewall that helps monitor the HTTP and HTTPS requests forwarded to CloudFront and allows controlling access to the content.
  • WAF allows defining conditions, for e.g. the IP addresses that requests originate from or values in query strings, based on which CloudFront responds to requests either with the requested content or with an access denied (HTTP 403) response
  • CloudFront can be configured to return a custom error page when a request is blocked.
  • AWS WAF allows the following behaviors:
    • Allow all requests except the ones specified – Useful when CloudFront serves content for a public website but want to block requests from attackers.
    • Block all requests except the ones specified – Useful when CloudFront serves content for a restricted website whose users can be readily identified by properties in web requests, for e.g. the IP addresses the requests originate from
    • Count the requests that match the specified properties – allows counting of the requests that match the defined properties, which can be useful when configuring and testing allow or block requests using new properties. After confirming the config did not accidentally block all of the traffic to the website, configuration can be applied to change the behavior to allow or block requests.

WAF Benefits

  • Additional protection against web attacks using specified conditions
  • Conditions can be defined by using characteristics of web requests such as the following:
    • IP addresses that the requests originate from
    • Values in request headers
    • Strings that appear in the requests
    • Length of requests
    • Presence of SQL code that is likely to be malicious (this is known as SQL injection)
    • Presence of a script that is likely to be malicious (this is known as cross-site scripting)
  • Rules that you can reuse for multiple web applications
  • Real-time metrics and sampled web requests
  • Automated administration using the WAF API

How WAF Works

WAF allows controlling how web requests are handled by creating conditions, rules, and web access control lists (web ACLs).

Conditions

  • Conditions define basic characteristics to watch for in a web request
    • Malicious script – XSS  (Cross Site Scripting) – Attackers embed scripts that can exploit vulnerabilities in web applications
    • IP addresses or address ranges that requests originate from.
    • Length of specified parts of the request, such as the query string.
    • Malicious SQL – SQL injection – Attackers try to extract data from the database by embedding malicious SQL code in a web request
    • Strings that appear in the request, for e.g., values that appear in the User-Agent header or text strings that appear in the query string. Some conditions take multiple values.

Rules

  • Rules are basically combinations of conditions that precisely target the requests to be allowed or blocked.
  • When a rule includes multiple conditions, WAF looks for requests that match all the conditions, i.e. it ANDs the conditions together.
  • For e.g., based on recent requests seen from an attacker, a rule might include the following conditions:
    • The requests come from 192.0.2.44.
    • They contain the value BadBot in the User-Agent header.
    • They appear to include malicious SQL code in the query string.
  • All 3 conditions must be satisfied for the rule to match and the associated action to be taken

Web ACLs

  • Web ACLs provide
    • Combination of Rules
    • Action – allow, block or count to perform for each rule
      • WAF compares a request with the rules in a web ACL in the order in which they are listed and takes the action that is associated with the first rule that the request matches.
      • For multiple rules in a web ACL, WAF evaluates each request against the rules in the order they are listed in the web ACL.
      • When a web request matches all of the conditions in a rule, WAF immediately takes the action—allow or block—and doesn’t evaluate the request against the remaining rules in the web ACL, if any.
    • Default action
      • determines whether WAF allows or blocks a request that does not match all of the conditions in any of the rules
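
A minimal sketch of the conditions → rule → web ACL flow using the classic AWS WAF API (the boto3 'waf' client). Only the IP condition from the BadBot example above is shown for brevity; the names, metric names, and IP value are illustrative only, and every mutating call needs a fresh change token.

    import boto3

    waf = boto3.client('waf')  # classic (global/CloudFront) WAF API

    # 1. Condition: an IP match set containing the attacker's address
    token = waf.get_change_token()['ChangeToken']
    ip_set = waf.create_ip_set(Name='bad-actor-ips', ChangeToken=token)['IPSet']
    token = waf.get_change_token()['ChangeToken']
    waf.update_ip_set(IPSetId=ip_set['IPSetId'], ChangeToken=token,
                      Updates=[{'Action': 'INSERT',
                                'IPSetDescriptor': {'Type': 'IPV4',
                                                    'Value': '192.0.2.44/32'}}])

    # 2. Rule: ANDs its conditions together (only the IP condition is added here)
    token = waf.get_change_token()['ChangeToken']
    rule = waf.create_rule(Name='block-bad-bot', MetricName='blockBadBot',
                           ChangeToken=token)['Rule']
    token = waf.get_change_token()['ChangeToken']
    waf.update_rule(RuleId=rule['RuleId'], ChangeToken=token,
                    Updates=[{'Action': 'INSERT',
                              'Predicate': {'Negated': False, 'Type': 'IPMatch',
                                            'DataId': ip_set['IPSetId']}}])

    # 3. Web ACL: default action ALLOW, rule action BLOCK for matching requests
    token = waf.get_change_token()['ChangeToken']
    acl = waf.create_web_acl(Name='example-web-acl', MetricName='exampleWebACL',
                             DefaultAction={'Type': 'ALLOW'},
                             ChangeToken=token)['WebACL']
    token = waf.get_change_token()['ChangeToken']
    waf.update_web_acl(WebACLId=acl['WebACLId'], ChangeToken=token,
                       Updates=[{'Action': 'INSERT',
                                 'ActivatedRule': {'Priority': 1,
                                                   'RuleId': rule['RuleId'],
                                                   'Action': {'Type': 'BLOCK'}}}])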

Web Application Firewall Sandwich Architecture

NOTE: This architecture is from the AWS DDoS Resiliency Whitepaper and does not use the AWS WAF service

WAF Sandwich Architecture

  • DDoS attacks at the application layer commonly target web applications with lower volumes of traffic compared to infrastructure attacks.
  • WAF can be included as part of the infrastructure to mitigate these types of attacks
  • WAFs act as filters that apply a set of rules to web traffic, which cover exploits like XSS and SQL injection but can also help build resiliency against DDoS by mitigating HTTP GET or POST floods.
  • HTTP works as a request-response protocol between end users and applications where end users request data (GET) and submit data to be processed (POST). GET floods work by requesting the same URL at a high rate or requesting all objects from your application. POST floods work by finding expensive application processes, for e.g. logins or database searches, and triggering those processes to overwhelm your application.
  • WAFs have several features that may prevent these types of attacks from affecting your application availability for e.g. HTTP rate limiting which limits the number of requests per end user within a certain time period. Once the threshold is exceeded, WAFs can block or buffer new requests to ensure other end users have access to the application.
  • WAFs can also inspect HTTP requests and identify those that don’t conform to normal patterns
  • In the “WAF sandwich,” the EC2 instance running the WAF software (not the AWS WAF) is included in an Auto Scaling group and placed in between two ELB load balancers. The basic load balancer in the default VPC is the frontend, public-facing load balancer that distributes all incoming traffic to the WAF EC2 instances.
  • By running the WAF EC2 instance in an Auto Scaling group behind ELB, the instance can scale and add additional WAF EC2 instances should the traffic spike to elevated levels.
  • Once the traffic has been inspected and filtered, the WAF EC2 instance forwards traffic to the internal, backend load balancer which then distributes traffic across your application EC2 instances.
  • This configuration allows the WAF EC2 instances to scale and meet capacity demands without affecting the availability of your application EC2 instances.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You’ve been hired to enhance the overall security posture for a very large e-commerce site. They have a well architected multi-tier application running in a VPC that uses ELBs in front of both the web and the app tier with static assets served directly from S3. They are using a combination of RDS and DynamoDB for their dynamic data and then archiving nightly into S3 for further processing with EMR. They are concerned because they found questionable log entries and suspect someone is attempting to gain unauthorized access. Which approach provides a cost effective scalable mitigation to this kind of attack?
    1. Recommend that they lease space at a Direct Connect partner location and establish a 1G Direct Connect connection to their VPC. They would then establish Internet connectivity into their space, filter the traffic in a hardware Web Application Firewall (WAF), and then pass the traffic through the Direct Connect connection into their application running in their VPC. (Not cost effective)
    2. Add previously identified hostile source IPs as an explicit INBOUND DENY NACL to the web tier subnet. (does not protect against new source)
    3. Add a WAF tier by creating a new ELB and an AutoScaling group of EC2 Instances running a host-based WAF. They would redirect Route 53 to resolve to the new WAF tier ELB. The WAF tier would then pass the traffic to the current web tier. Web tier Security Groups would be updated to only allow traffic from the WAF tier Security Group
    4. Remove all but TLS 1.2 from the web tier ELB and enable Advanced Protocol Filtering This will enable the ELB itself to perform WAF functionality. (No advanced protocol filtering in ELB)

AWS Certification – Route 53 Overview

Route 53

  • Amazon Route 53 provides three main functions:
    • Domain registration
      • allows you to register domain names
    • Domain Name System (DNS) service
      • translates friendly domain names like www.example.com into IP addresses like 192.0.2.1
      • responds to DNS queries using a global network of authoritative DNS servers, which reduces latency
      • can route Internet traffic to CloudFront, Elastic Beanstalk, ELB, or S3. There’s no charge for DNS queries to these resources
    • Health checking
      • can monitor the health of the resources such as web servers and email servers.
      • sends automated requests over the Internet to the application to verify that it’s reachable, available, and functional
      • CloudWatch alarms can be configured for the health checks, so that you receive notification when a resource becomes unavailable.
      • can be configured to route Internet traffic away from resources that are unavailable
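
A minimal sketch of creating such a health check with the boto3 Route 53 client; the IP address, resource path, and thresholds are hypothetical.

    import uuid
    import boto3

    route53 = boto3.client('route53')

    # Create an HTTP health check against a hypothetical web server endpoint
    response = route53.create_health_check(
        CallerReference=str(uuid.uuid4()),   # idempotency token
        HealthCheckConfig={
            'IPAddress': '192.0.2.10',       # hypothetical web server IP
            'Port': 80,
            'Type': 'HTTP',
            'ResourcePath': '/health',
            'RequestInterval': 30,           # seconds between checks
            'FailureThreshold': 3,           # consecutive failures before "unhealthy"
        },
    )
    health_check_id = response['HealthCheck']['Id']

The returned health check ID can then be associated with a record set (for failover routing) or with a CloudWatch alarm for notifications.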

Supported DNS Resource Record Types

  • A (Address) Format
    • is an IPv4 address in dotted decimal notation for e.g. 192.0.2.1
  • AAAA Format
    • is an IPv6 address in colon-separated hexadecimal format
  • CNAME Format
    • is the same format as a domain name
    • DNS protocol does not allow creation of a CNAME record for the top node of a DNS namespace, also known as the zone apex; for e.g., for the domain registration example.com, the zone apex is example.com, so a CNAME record for example.com cannot be created, but CNAME records can be created for www.example.com, newproduct.example.com, etc.
    • If a CNAME record is created for a subdomain, no other resource record sets can be created for that subdomain; for e.g., if a CNAME is created for www.example.com, no other resource record sets whose Name field value is www.example.com can be created
  • MX (Mail Exchange) Format
    • contains a decimal number that represents the priority of the MX record, and the domain name of an email server
  • NS (Name Server) Format
    • An NS record identifies the name servers for the hosted zone. The value for an NS record is the domain name of a name server.
  • PTR Format
    • A PTR record Value element is the same format as a domain name.
  • SOA (Start of Authority) Format
    • SOA record provides information about a domain and the corresponding Amazon Route 53 hosted zone
  • SPF (Sender Policy Framework) Format
    • SPF records were formerly used to verify the identity of the sender of email messages; however, their use is no longer recommended
    • Instead of an SPF record, a TXT record that contains the applicable value is recommended
  • SRV Format
    • An SRV record Value element consists of four space-separated values. The first three values are decimal numbers representing priority, weight, and port. The fourth value is a domain name, for e.g. 10 5 80 hostname.example.com
  • TXT (Text) Format
    • A TXT record contains a space-separated list of double-quoted strings. A single string can include a maximum of 255 characters. In addition to the characters that are permitted unescaped in domain names, space is allowed in TXT strings

Alias resource record sets

  • Route 53 supports alias resource record sets, which enables routing of queries to a CloudFront distribution, Elastic Beanstalk, ELB, an S3 bucket configured as a static website, or another Route 53 resource record set
  • Alias records are not standard for the DNS RFC and are a Route 53 extension to DNS functionality
  • Alias records help map the zone apex (root domain without the www) records to the load balancer DNS name, as the DNS specification requires the zone apex to point to an ‘A’ record (IP address) and not to a CNAME
  • Route 53 automatically recognizes changes in the resource record sets that the alias resource record set refers to; for e.g., for a site pointing to a load balancer, if the IP of the load balancer changes, Route 53 will reflect those changes automatically in the DNS answers without any changes to the hosted zone that contains the resource record sets
  • If an alias resource record set points to a CloudFront distribution, a load balancer, or an S3 bucket, the time to live (TTL) can’t be set; Route 53 uses the CloudFront, load balancer, or Amazon S3 TTLs.
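
A minimal sketch of creating a zone apex alias record pointing to an ELB with the boto3 Route 53 client; the hosted zone IDs, domain, and load balancer DNS name are hypothetical (the ELB's canonical hosted zone ID comes from the load balancer description).

    import boto3

    route53 = boto3.client('route53')

    # Hypothetical values for illustration only
    HOSTED_ZONE_ID = 'Z1EXAMPLE'            # the example.com hosted zone
    ELB_HOSTED_ZONE_ID = 'Z2EXAMPLEELB'     # canonical hosted zone ID of the ELB
    ELB_DNS_NAME = 'my-elb-1234567890.us-east-1.elb.amazonaws.com'

    # Point the zone apex (example.com) at the load balancer using an alias A record;
    # a CNAME is not allowed at the apex, but an alias A record is
    route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={
            'Changes': [{
                'Action': 'UPSERT',
                'ResourceRecordSet': {
                    'Name': 'example.com.',
                    'Type': 'A',
                    'AliasTarget': {
                        'HostedZoneId': ELB_HOSTED_ZONE_ID,
                        'DNSName': ELB_DNS_NAME,
                        'EvaluateTargetHealth': False,
                    },
                    # Note: no TTL is set on alias records; Route 53 uses the ELB's TTL
                },
            }],
        },
    )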

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What does Amazon Route53 provide?
    1. A global Content Delivery Network.
    2. None of these.
    3. A scalable Domain Name System
    4. An SSH endpoint for Amazon EC2.
  2. Does Amazon Route 53 support NS Records?
    1. Yes, it supports Name Service records.
    2. No
    3. It supports only MX records.
    4. Yes, it supports Name Server records. 
  3. Does Route 53 support MX Records?
    1. Yes
    2. It supports CNAME records, but not MX records.
    3. No
    4. Only Primary MX records. Secondary MX records are not supported.
  4. Which of the following statements are true about Amazon Route 53 resource records? Choose 2 answers
    1. An Alias record can map one DNS name to another Amazon Route 53 DNS name.
    2. A CNAME record can be created for your zone apex.
    3. An Amazon Route 53 CNAME record can point to any DNS record hosted anywhere.
    4. TTL can be set for an Alias record in Amazon Route 53.
    5. An Amazon Route 53 Alias record can point to any DNS record hosted anywhere.
  5. Which statements are true about Amazon Route 53? (Choose 2 answers)
    1. Amazon Route 53 is a region-level service
    2. You can register your domain name
    3. Amazon Route 53 can perform health checks and failovers to a backup site in the event of the primary site failure
    4. Amazon Route 53 only supports Latency-based routing
  6. A customer is hosting their company website on a cluster of web servers that are behind a public-facing load balancer. The customer also uses Amazon Route 53 to manage their public DNS. How should the customer configure the DNS zone apex record to point to the load balancer?
    1. Create an A record pointing to the IP address of the load balancer
    2. Create a CNAME record pointing to the load balancer DNS name.
    3. Create a CNAME record aliased to the load balancer DNS name.
    4. Create an A record aliased to the load balancer DNS name
  7. A user has configured ELB with three instances. The user wants to achieve High Availability as well as redundancy with ELB. Which of the below mentioned AWS services helps the user achieve this for ELB?
    1. Route 53
    2. AWS Mechanical Turk
    3. Auto Scaling
    4. AWS EMR
  8. How can the domain’s zone apex, for example “myzoneapexdomain.com”, be pointed towards an Elastic Load Balancer?
    1. By using an AAAA record
    2. By using an A record
    3. By using an Amazon Route 53 CNAME record
    4. By using an Amazon Route 53 Alias record

AWS EC2 Monitoring – Certification

EC2 Monitoring

Status Checks

  • Status monitoring helps quickly determine whether EC2 has detected any problems that might prevent instances from running applications.
  • EC2 performs automated checks on every running EC2 instance to identify hardware and software issues.
  • Status checks are performed every minute and each returns a pass or a fail status.
  • If all checks pass, the overall status of the instance is OK.
  • If one or more checks fail, the overall status is Impaired.
  • Status checks are built into EC2, so they cannot be disabled or deleted.
  • Status checks data augments the information that EC2 already provides about the intended state of each instance (such as pending, running, stopping) as well as the utilization metrics that Amazon CloudWatch monitors (CPU utilization, network traffic, and disk activity).
  • Alarms can be created (or deleted) that are triggered based on the result of the status checks; for e.g., an alarm can be created to warn if status checks fail on a specific instance.

System Status Checks

  • monitor the AWS systems required to use your instance to ensure they are working properly.
  • detect problems with the instance that require AWS involvement to repair.
  • When a system status check fails, one can either
    • wait for AWS to fix the issue
    • or resolve it by stopping and starting, or terminating and replacing, the instance
  • System status check failures might be caused by
    • Loss of network connectivity
    • Loss of system power
    • Software issues on the physical host
    • Hardware issues on the physical host

Instance Status Checks

  • monitor the software and network configuration of the individual instance
  • these checks detect problems that require your involvement to repair.
  • When an instance status check fails, it can be resolved by either rebooting the instance or by making modifications in the operating system
  • Instance status check failures might be caused by
    • Failed system status checks
    • Misconfigured networking or startup configuration
    • Exhausted memory
    • Corrupted file system
    • Incompatible kernel
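
A minimal sketch of inspecting both kinds of status checks with the boto3 EC2 client and remediating as described above; the instance ID is hypothetical and the remediation choices are only examples.

    import boto3

    ec2 = boto3.client('ec2', region_name='us-east-1')
    INSTANCE_ID = 'i-0123456789abcdef0'   # hypothetical instance

    status = ec2.describe_instance_status(InstanceIds=[INSTANCE_ID])
    for inst in status['InstanceStatuses']:
        system_ok = inst['SystemStatus']['Status'] == 'ok'      # AWS-side checks
        instance_ok = inst['InstanceStatus']['Status'] == 'ok'  # OS/network-side checks
        if not instance_ok:
            # Instance status check failure: rebooting is one possible remediation
            ec2.reboot_instances(InstanceIds=[INSTANCE_ID])
        elif not system_ok:
            # System status check failure: stop/start moves the instance to new hardware
            ec2.stop_instances(InstanceIds=[INSTANCE_ID])
            ec2.get_waiter('instance_stopped').wait(InstanceIds=[INSTANCE_ID])
            ec2.start_instances(InstanceIds=[INSTANCE_ID])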

CloudWatch Monitoring

  • CloudWatch helps monitor EC2 instances; it collects and processes raw data from EC2 into readable, near real-time metrics.
  • Statistics are recorded for a period of two weeks, so that historical information can be accessed and used to gain a better perspective on how the application or service is performing.
  • By default Basic monitoring is enabled and EC2 metric data is sent to CloudWatch in 5-minute periods automatically
  • Detailed monitoring can be enabled on EC2 instance, which sends data to CloudWatch in 1-minute periods.
  • Aggregating Statistics Across Instances/ASG/AMI ID
    • Aggregate statistics are available for the instances that have detailed monitoring (at an additional charge) enabled, which provides data in 1-minute periods
    • Instances that use basic monitoring are not included in the aggregates.
    • CloudWatch does not aggregate data across Regions. Therefore, metrics are completely separate between Regions.
    • CloudWatch returns statistics for all dimensions in the AWS/EC2 namespace, if no dimension is specified
    • The technique for retrieving all dimensions across an AWS namespace does not work for custom namespaces published to CloudWatch.
    • Statistics include Sum, Average, Minimum, Maximum, Data Samples
    • With custom namespaces, the complete set of dimensions associated with any given data point must be specified in order to retrieve statistics that include that data point
  • CloudWatch alarms
    • can be created to monitor any one of the EC2 instance’s metrics.
    • can be configured to automatically send you a notification when the metric reaches a specified threshold.
    • can automatically stop, terminate, reboot, or recover EC2 instances
    • can automatically recover an EC2 instance when the instance becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair
    • can automatically stop or terminate the instances to save costs (EC2 instances that use an EBS volume as the root device can be stopped or terminated, whereas instances that use the instance store as the root device can only be terminated)
    • can use the EC2ActionsAccess IAM role, which enables AWS to perform stop, terminate, or reboot actions on EC2 instances
    • If you have read/write permissions for CloudWatch but not for EC2, alarms can still be created but the stop or terminate actions won’t be performed on the EC2 instance
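
A minimal sketch of such an alarm, recovering an instance when its system status check fails, using the boto3 CloudWatch client; the instance ID and the region in the action ARN are hypothetical.

    import boto3

    cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
    INSTANCE_ID = 'i-0123456789abcdef0'   # hypothetical instance

    cloudwatch.put_metric_alarm(
        AlarmName='recover-' + INSTANCE_ID,
        Namespace='AWS/EC2',
        MetricName='StatusCheckFailed_System',
        Dimensions=[{'Name': 'InstanceId', 'Value': INSTANCE_ID}],
        Statistic='Maximum',
        Period=60,
        EvaluationPeriods=2,            # two consecutive failing minutes
        Threshold=1.0,
        ComparisonOperator='GreaterThanOrEqualToThreshold',
        # Built-in EC2 action: recover the instance onto healthy hardware
        AlarmActions=['arn:aws:automate:us-east-1:ec2:recover'],
    )

Swapping the action ARN for the equivalent stop or terminate automation action (or an SNS topic ARN) gives the stop, terminate, and notification behaviors described above.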

EC2 Metrics

  • CPUCreditUsage
    • (Only valid for T2 instances) The number of CPU credits consumed during the specified period.
    • This metric identifies the amount of time during which physical CPUs were used for processing instructions by virtual CPUs allocated to the instance.
    • CPU Credit metrics are available at a 5 minute frequency.
  • CPUCreditBalance
    • (Only valid for T2 instances) The number of CPU credits that an instance has accumulated.
    • This metric is used to determine how long an instance can burst beyond its baseline performance level at a given rate.
    • CPU Credit metrics are available at a 5 minute frequency.
  • CPUUtilization
    • % of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance.
  • DiskReadOps
    • Completed read operations from all instance store volumes available to the instance in a specified period of time.
  • DiskWriteOps
    • Completed write operations to all instance store volumes available to the instance in a specified period of time.
  • DiskReadBytes
    • Bytes read from all instance store volumes available to the instance.
    • This metric is used to determine the volume of the data the application reads from the hard disk of the instance.
    • This can be used to determine the speed of the application.
  • DiskWriteBytes
    • Bytes written to all instance store volumes available to the instance.
    • This metric is used to determine the volume of the data the application writes onto the hard disk of the instance.
    • This can be used to determine the speed of the application.
  • NetworkIn
    • The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to an application on a single instance.
  • NetworkOut
    • The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic to an application on a single instance.
  • NetworkPacketsIn
    • The number of packets received on all network interfaces by the instance. This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance.
    • This metric is available for basic monitoring only
  • NetworkPacketsOut
    • The number of packets sent out on all network interfaces by the instance. This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance.
    • This metric is available for basic monitoring only.
  • StatusCheckFailed
    • Reports whether either of the status checks, StatusCheckFailed_Instance or StatusCheckFailed_System, has failed.
    • Values for this metric are either 0 (zero) or 1 (one). A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at 1 minute frequency
  • StatusCheckFailed_Instance
    • Reports whether the instance has passed the Amazon EC2 instance status check in the last minute.
    • Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at 1 minute frequency
  • StatusCheckFailed_System
    • Reports whether the instance has passed the EC2 system status check in the last minute.
    • Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at a 1 minute frequency
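
A minimal sketch of retrieving one of these metrics (CPUUtilization, averaged over 5-minute periods for a single instance) with the boto3 CloudWatch client; the instance ID is hypothetical, and aggregating across an Auto Scaling group would swap the InstanceId dimension for AutoScalingGroupName.

    from datetime import datetime, timedelta, timezone
    import boto3

    cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

    stats = cloudwatch.get_metric_statistics(
        Namespace='AWS/EC2',
        MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],  # hypothetical
        StartTime=datetime.now(timezone.utc) - timedelta(hours=3),
        EndTime=datetime.now(timezone.utc),
        Period=300,                 # 5-minute periods (basic monitoring granularity)
        Statistics=['Average', 'Maximum'],
        Unit='Percent',
    )
    for point in sorted(stats['Datapoints'], key=lambda p: p['Timestamp']):
        print(point['Timestamp'], point['Average'], point['Maximum'])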

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. In the basic monitoring package for EC2, Amazon CloudWatch provides the following metrics:
    1. Web server visible metrics such as number failed transaction requests
    2. Operating system visible metrics such as memory utilization
    3. Database visible metrics such as number of connections
    4. Hypervisor visible metrics such as CPU utilization
  2. Which of the following requires a custom CloudWatch metric to monitor?
    1. Memory Utilization of an EC2 instance
    2. CPU Utilization of an EC2 instance
    3. Disk usage activity of an EC2 instance
    4. Data transfer of an EC2 instance
  3. A user has configured CloudWatch monitoring on an EBS backed EC2 instance. If the user has not attached any additional device, which of the below mentioned metrics will always show a 0 value?
    1. DiskReadBytes
    2. NetworkIn
    3. NetworkOut
    4. CPUUtilization
  4. A user is running a batch process on EBS backed EC2 instances. The batch process starts a few instances to process Hadoop Map reduce jobs, which can run between 50 – 600 minutes or sometimes for more time. The user wants to configure that the instance gets terminated only when the process is completed. How can the user configure this with CloudWatch?
    1. Setup the CloudWatch action to terminate the instance when the CPU utilization is less than 5%
    2. Setup the CloudWatch with Auto Scaling to terminate all the instances
    3. Setup a job which terminates all instances after 600 minutes
    4. It is not possible to terminate instances automatically
  5. An AWS account owner has setup multiple IAM users. One IAM user only has CloudWatch access. He has setup the alarm action, which stops the EC2 instances when the CPU utilization is below the threshold limit. What will happen in this case?
    1. It is not possible to stop the instance using the CloudWatch alarm
    2. CloudWatch will stop the instance when the action is executed
    3. The user cannot set an alarm on EC2 since he does not have the permission
    4. The user can setup the action but it will not be executed if the user does not have EC2 rights
  6. A user has launched 10 instances from the same AMI ID using Auto Scaling. The user is trying to see the average CPU utilization across all instances of the last 2 weeks under the CloudWatch console. How can the user achieve this?
    1. View the Auto Scaling CPU metrics (Refer AS Instance Monitoring)
    2. Aggregate the data over the instance AMI ID (Works but needs detailed monitoring enabled)
    3. The user has to use the CloudWatch analyser to find the average data across instances
    4. It is not possible to see the average CPU utilization of the same AMI ID since the instance ID is different