AWS Certified Security – Speciality (SCS-C01) Exam Learning Path

I recently cleared the AWS Certified Security – Speciality (SCS-C01) with a score of 939/1000. Compared with the Advanced Networking – Speciality exam, the Security – Speciality was not as tough, mainly because it covers features and services which you would have used in your day-to-day working on AWS, or services which have a clear demarcation of their purpose.

AWS Certified Security – Speciality (SCS-C01) exam focuses on AWS Security and Compliance concepts. It basically validates

  • An understanding of specialized data classifications and AWS data protection mechanisms.
  • An understanding of data-encryption methods and AWS mechanisms to implement them.
  • An understanding of secure Internet protocols and AWS mechanisms to implement them.
  • A working knowledge of AWS security services and features of services to provide a secure production environment.
  • Competency gained from two or more years of production deployment experience using AWS security services and features.
  • The ability to make tradeoff decisions with regard to cost, security, and deployment complexity given a set of application requirements.
  • An understanding of security operations and risks.

Refer to AWS Certified Security – Speciality Exam Guide

AWS Certified Security – Speciality (SCS-C01) Exam Summary

  • AWS Certified Security – Speciality exam, as its name suggests, covers a lot of Security and compliance concepts for VPC, EBS, S3, IAM, KMS services
  • One of the key tactics I followed when solving any AWS exam is to read the question, use paper and pencil to draw a rough architecture, and focus on the areas that you need to improve. Trust me, you will be able to eliminate two answers for sure and then need to focus on only the other two. Read those two answers to check where they differ; that will help you reach the right answer, or at least have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Security, Identity & Compliance
      • Make sure you know all the services and deep dive into IAM, KMS.
      • Identity and Access Management (IAM)
      • Deep dive into Key Management Service (KMS). There would be quite a few questions on this.
      • Understand AWS Cognito esp. User Pools
      • Know AWS GuardDuty as managed threat detection service
      • Know AWS Inspector as automated security assessment service that helps improve the security and compliance of applications deployed on AWS
      • Know Amazon Macie as a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS
      • Know AWS Artifact as a central resource for compliance-related information that provides on-demand access to AWS’ security and compliance reports and select online agreements
      • Know AWS Certificate Manager (ACM) for certificate management. (hint : To use an ACM Certificate with Amazon CloudFront, you must request or import the certificate in the US East (N. Virginia) region)
      • Know Cloud HSM as a cloud-based hardware security module (HSM) that enables you to easily generate and use your own encryption keys on the AWS Cloud
      • Know AWS Secrets Manager to protect secrets needed to access your applications, services, and IT resources. The service enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle
      • Know AWS Shield esp. the Shield Advanced option and the features it provides
      • Know WAF as Web Traffic Firewall – (Hint – WAF can be attached to your CloudFront, Application Load Balancer, API Gateway to dynamically detect and prevent attacks)
    • Networking & Content Delivery
      • Understand VPC
        • Understand VPC Endpoints esp. services supported by Gateway and Interface Endpoints. Interface Endpoints are also called Private Links. (hint: application endpoints can be exposed using private links)
        • Understand VPC Flow Logs to capture information about the IP traffic going to and from network interfaces in the VPC (hint: can help in port scans but not in packet inspection)
      • Know Virtual Private Network & Direct Connect to establish secure, low-latency connectivity between an on-premises data center and an AWS VPC
      • Understand CloudFront esp. with S3 (hint: Origin Access Identity to restrict direct access to S3 content)
      • Know Elastic Load Balancer at high level esp. End to End encryption.
    • Management & Governance Tools
      • Understand AWS CloudWatch for Logs and Metrics. Also, CloudWatch Events provides more real-time alerts as compared to CloudTrail
      • Understand CloudTrail for audit and governance (hint: CloudTrail can be enabled for all regions at one go and supports log file integrity validation)
      • Understand AWS Config and its use cases (hint: AWS Config rules can be used to alert for any changes and Config can be used to check the history of changes. AWS Config can also help check approved AMIs compliance)
      • Understand CloudTrail provides the WHO and Config provides the WHAT.
      • Understand Systems Manager
        • Systems Manager provides Parameter Store, which can be used to manage secrets (hint: Parameter Store is cheaper than Secrets Manager for storage if usage is limited)
        • Systems Manager provides agent-based and agentless modes. (hint: agentless mode does not track processes)
        • Systems Manager Patch Manager helps select and deploy operating system and software patches automatically across large groups of EC2 or on-premises instances
        • Systems Manager Run Command provides safe, secure remote management of your instances at scale without logging into the servers, replacing the need for bastion hosts, SSH, or remote PowerShell
      • Understand AWS Organizations to control what member accounts can do. (hint: service control policies apply even to the root user of member accounts)
      • Know AWS Trusted Advisor
    • Storage
    • Compute
      • Know that EC2 accesses services using an IAM Role and Lambda using an Execution Role.
    • Integration Tools
      • Know how CloudWatch integration with SNS and Lambda can help in notifications (the topics are not required in detail)
    • Whitepapers and articles
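
The VPC Flow Logs hint above (flow logs can reveal port scans but not packet contents) can be illustrated with a minimal sketch. The record layout follows the default version-2 flow log format; the addresses, threshold, and helper names are illustrative, not part of any AWS SDK:

```python
from collections import defaultdict

# Default VPC Flow Log (version 2) record fields, in order
FIELDS = ["version", "account_id", "interface_id", "srcaddr", "dstaddr",
          "srcport", "dstport", "protocol", "packets", "bytes",
          "start", "end", "action", "log_status"]

def parse_record(line):
    """Split a space-separated flow log record into a field dict."""
    return dict(zip(FIELDS, line.split()))

def possible_port_scanners(records, min_distinct_ports=3):
    """Flag sources whose traffic was REJECTed on many distinct ports --
    a rough port-scan heuristic (flow logs carry no payloads to inspect)."""
    rejected_ports = defaultdict(set)
    for rec in map(parse_record, records):
        if rec["action"] == "REJECT":
            rejected_ports[rec["srcaddr"]].add(rec["dstport"])
    return {src for src, ports in rejected_ports.items()
            if len(ports) >= min_distinct_ports}

sample = [
    "2 123456789010 eni-abc123 198.51.100.7 10.0.0.5 4444 22 6 1 40 1563000000 1563000060 REJECT OK",
    "2 123456789010 eni-abc123 198.51.100.7 10.0.0.5 4444 23 6 1 40 1563000000 1563000060 REJECT OK",
    "2 123456789010 eni-abc123 198.51.100.7 10.0.0.5 4444 3389 6 1 40 1563000000 1563000060 REJECT OK",
    "2 123456789010 eni-abc123 203.0.113.9 10.0.0.5 52000 443 6 10 8000 1563000000 1563000060 ACCEPT OK",
]
print(possible_port_scanners(sample))  # -> {'198.51.100.7'}
```

Note this only works because the security groups or NACLs rejected the probes; for actual packet inspection you would need a third-party appliance, not flow logs.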

AWS Certified Security – Speciality (SCS-C01) Exam Resources

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Learning Path

AWS Certified DevOps Engineer – Professional (DOP-C01) exam is the upgraded pattern of the DevOps Engineer – Professional exam which was released last year (2018). I recently attempted the latest pattern, and AWS has done quite a good job of improving it further, as compared to the old one, to include more DevOps-related questions and services.

AWS Certified DevOps Engineer – Professional (DOP-C01) exam basically validates

  • Implement and manage continuous delivery systems and methodologies on AWS
  • Implement and automate security controls, governance processes, and compliance validation
  • Define and deploy monitoring, metrics, and logging systems on AWS
  • Implement systems that are highly available, scalable, and self-healing on the AWS platform
  • Design, manage, and maintain tools to automate operational processes

Refer to AWS Certified DevOps Engineer – Professional Exam Guide

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Summary

  • AWS Certified DevOps Engineer – Professional exam was for a total of 170 minutes but it had 75 questions (I was always assuming it to be 65) and I just managed to complete the exam with 20 mins remaining. So be sure you are prepared and manage your time well. As always, mark the questions for review and move on and come back to them after you are done with all.
  • One of the key tactics I followed when solving the DevOps Engineer questions was to read the question, use paper and pencil to draw a rough architecture, and focus on the areas that you need to improve. Trust me, you will be able to eliminate two answers for sure and then need to focus on only the other two. Read those two answers to check where they differ; that will help you reach the right answer, or at least have a 50% chance of getting it right.
  • AWS Certified DevOps Engineer – Professional exam covers a lot of concepts and services related to Automation, Deployments, Disaster Recovery, HA, Monitoring, Logging and Troubleshooting. It also covers security and compliance related topics.
  • Be sure to cover the following topics
    • Monitoring & Governance tools
      • Very important to understand AWS CloudWatch vs AWS CloudTrail vs AWS Config
      • Very important to understand Trusted Advisor vs Systems Manager vs Amazon Inspector
      • Know Personal Health Dashboard & Service Health Dashboard
      • CloudWatch
      • CloudTrail
        • Understand how to maintain CloudTrail logs integrity
      • Understand AWS Config and its use cases (hint: Config maintains history and can be used to review configuration changes)
      • Know Personal Health Dashboard (hint : it tracks events on your AWS resources)
      • Understand AWS Trusted Advisor and what it provides (hint : low utilization resources)
      • Systems Manager
        • Systems Manager is also covered heavily in the exams so be sure you know
        • Understand AWS Systems Manager and its various services like parameter store, patch manager
    • Networking & Content Delivery
      • Networking is covered very lightly. Usually the questions are targeted towards troubleshooting of access or permissions.
      • Know VPC
      • Route 53
    • Security, Identity & Compliance
    • Storage
      • Exam does not cover Storage services in deep
      • Focus on Simple Storage Service (S3)
        • Understand S3 Permissions (hint: the authenticated-users ACL grants access to all AWS authenticated users, not just users in your account; know how to control access)
        • Know S3 disaster recovery across regions (hint: Cross-Region Replication)
        • Know CloudFront for caching to improve performance
      • Elastic Block Store
        • Focus mainly on EBS Backup using snapshots for HA and Disaster recovery
    • Database
    • Compute
      • Know EC2
        • Understand ENI for HA, user data, pre-baked AMIs for faster instance start times
        • Amazon Linux 2 Image (hint: it allows replicating Amazon Linux behavior on-premises)
        • Snapshot and sharing
      • Auto Scaling
        • Auto Scaling Lifecycle events
        • Blue/green deployments with Auto Scaling – With new launch configurations, new auto scaling groups or CloudFormation update policies.
      • Understand Lambda
      • ECS
        • Know Monitoring and deployments with image update
    • Integration Tools
      • Know how CloudWatch integration with SNS and Lambda can help in notifications (the topics are not required in detail)

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Resources

AWS Route 53 Resolver

AWS Route 53 – Resolving DNS Queries Between VPCs and On-premises Network

  • Route 53 Resolver provides automatic DNS resolution within the VPC.
  • By default, Resolver answers DNS queries for VPC domain names such as domain names for EC2 instances or ELB load balancers.
  • Resolver performs recursive lookups against public name servers for all other domain names.
  • However, on-premises instances cannot resolve Route 53 DNS entries and Route 53 cannot resolve on-premises DNS entries
  • DNS resolution between AWS VPC and on-premises network can be configured over a Direct Connect or VPN connection
  • Route 53 Resolver is regional.
  • To use inbound or outbound forwarding, create a Resolver endpoint in the VPC.
  • As part of the definition of an endpoint, specify the IP addresses to forward inbound DNS queries to or the IP addresses that outbound queries to originate from. For each IP address specified, Resolver automatically creates a VPC elastic network interface.

Forward DNS queries from resolvers on your network to Route 53 Resolver

  • DNS resolvers on on-premises network can forward DNS queries to Resolver in a specified VPC.
  • This enables DNS resolvers to easily resolve domain names for AWS resources such as EC2 instances or records in a Route 53 private hosted zone. 

Conditionally forward queries from a VPC to resolvers on your network

  • Route 53 Resolver can be configured to forward queries that it receives from EC2 instances in the VPCs to DNS resolvers on on-premises network.
  • To forward selected queries, Resolver rules can be created that specify the domain names for the DNS queries that you want to forward (such as example.com), and the IP addresses of the DNS resolvers on on-premises network that you want to forward the queries to.
  • If a query matches multiple rules (example.com, acme.example.com), Resolver chooses the rule with the most specific match (acme.example.com) and forwards the query to the IP addresses that you specified in that rule. 
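
The most-specific-match behavior described above can be sketched in a few lines. This is an illustrative model of the rule-selection logic, not AWS code; function and variable names are invented:

```python
def match_rule(query_name, rule_domains):
    """Pick the Resolver rule whose domain most specifically matches the
    query, mimicking Route 53 Resolver's longest-suffix matching.
    Returns the chosen rule's domain, or None if no rule applies."""
    q = query_name.rstrip(".").lower()
    candidates = []
    for domain in rule_domains:
        d = domain.rstrip(".").lower()
        # A rule matches the domain itself and any of its subdomains
        if q == d or q.endswith("." + d):
            candidates.append(d)
    # Most specific = the matching domain with the most labels
    return max(candidates, key=lambda d: d.count("."), default=None)

rules = ["example.com", "acme.example.com"]
print(match_rule("www.acme.example.com", rules))  # -> acme.example.com
print(match_rule("portal.example.com", rules))    # -> example.com
print(match_rule("example.org", rules))           # -> None
```

So a query for `www.acme.example.com` is forwarded to the resolvers named in the `acme.example.com` rule, even though the broader `example.com` rule also matches.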


AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the questions and answers might soon be outdated, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not reflect it.
  • Open to further feedback, discussion and correction.


AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Learning Path

I recently cleared the AWS Certified Advanced Networking – Speciality (ANS-C00), which was my first, en route my path to the AWS Speciality certifications. Frankly, I feel the time I gave for preparation was still not enough, but I just about managed to get through. So a word of caution: this exam is in line with, or tougher than, the Professional exams, especially because the Networking concepts it covers are not something you can easily get your hands dirty with.

AWS Certified Advanced Networking – Speciality (ANS-C00) exam focuses on AWS Networking concepts. It basically validates

  • Design, develop, and deploy cloud-based solutions using AWS
  • Implement core AWS services according to basic architecture best practices
  • Design and maintain network architecture for all AWS services
  • Leverage tools to automate AWS networking tasks

Refer to AWS Certified Advanced Networking – Speciality Exam Guide

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Summary

  • AWS Certified Advanced Networking – Speciality exam covers a lot of Networking concepts like VPC, VPN, Direct Connect, Route 53, ALB, NLB.
  • One of the key tactics I followed when solving the questions was to read the question, use paper and pencil to draw a rough architecture, and focus on the areas that you need to improve. Trust me, you will be able to eliminate two answers for sure and then need to focus on only the other two. Read those two answers to check where they differ; that will help you reach the right answer, or at least have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Networking & Content Delivery
      • You should know everything in Networking.
      • Understand VPC in depth
      • Virtual Private Network to establish connectivity between on-premises data center and AWS VPC
      • Direct Connect to establish connectivity between on-premises data center and AWS VPC and Public Services
        • Make sure you understand Direct Connect in detail, without this you cannot clear the exam
        • Understand Direct Connect connections – Dedicated and Hosted connections
        • Understand how to create a Direct Connect connection (hint: LOA-CFA provides the details for partner to connect to AWS Direct Connect location)
        • Understand virtual interfaces options – Private Virtual Interface for VPC resources and Public Virtual Interface for Public resources
        • Understand setup Private and Public VIF
        • Understand Route Propagation, propagation priority, BGP connectivity
        • Understand High Availability options based on cost and time i.e. Second Direct Connect connection OR VPN connection
        • Understand Direct Connect Gateway – it provides a way to connect to multiple VPCs from on-premises data center using the same Direct Connect connection
      • Route 53
        • Understand Route 53 and Routing Policies and their use cases. Focus on the Weighted and Latency routing policies
        • Understand Route 53 Split View DNS to have the same DNS to access a site externally and internally
      • Understand CloudFront and use cases
      • Load Balancer
        • Understand ELB, ALB and NLB
        • Understand the differences between ELB, ALB and NLB, esp. ALB provides Content, Host and Path based Routing, while NLB provides the ability to have a static IP address
        • Know how to design a VPC CIDR block with NLB (hint: the minimum number of IPs required is 8)
        • Know how to pass the original client IP to the backend instances (hint: X-Forwarded-For and Proxy Protocol)
      • Know WorkSpaces requirements and setup
    • Security
      • Know AWS GuardDuty as managed threat detection service
      • Know AWS Shield esp. the Shield Advanced option and the features it provides
      • Know WAF as Web Traffic Firewall – (Hint – WAF can be attached to your CloudFront, Application Load Balancer, API Gateway to dynamically detect and prevent attacks)

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Resources

AWS Network Connectivity Options

VPC

More details @ Virtual Private Cloud

Internet Gateway

  • An Internet Gateway provides Internet connectivity to VPC
  • Internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between instances in your VPC and the internet.
  • Internet Gateway imposes no availability risks or bandwidth constraints on your network traffic.
  • An Internet gateway serves two purposes: to provide a target in the VPC route tables for internet-routable traffic, and to perform network address translation (NAT) for instances that have not been assigned public IPv4 addresses.
  • An internet gateway supports IPv4 and IPv6 traffic.

NAT Gateway

  • NAT Gateway enables instances in a private subnet to connect to the internet (for example, for software updates) or other AWS services, but prevents the internet from initiating connections with the instances.
  • A NAT gateway forwards traffic from the instances in the private subnet to the internet or other AWS services, and then sends the response back to the instances.
  • When traffic goes to the internet, the source IPv4 address is replaced with the NAT device’s address and similarly, when the response traffic goes to those instances, the NAT device translates the address back to those instances’ private IPv4 addresses.

Egress Only Internet Gateway

  • NAT devices are not supported for IPv6 traffic; use an Egress-only Internet gateway instead
  • Egress-only Internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows outbound communication over IPv6 from instances in the VPC to the Internet, and prevents the Internet from initiating an IPv6 connection with your instances.

VPC Endpoints

  • VPC endpoint provides a private connection from VPC to supported AWS services and VPC endpoint services powered by PrivateLink without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
  • Instances in the VPC do not require public IP addresses to communicate with resources in the service. Traffic between the VPC and the other service does not leave the Amazon network.
  • VPC Endpoints are virtual devices and are horizontally scaled, redundant, and highly available VPC components that allow communication between instances in the VPC and services without imposing availability risks or bandwidth constraints on the network traffic.
  • VPC Endpoints are of two types
    • Interface Endpoints – is an elastic network interface with a private IP address that serves as an entry point for traffic destined to supported services.
    • Gateway Endpoints – is a gateway that is a target for a specified route in your route table, used for traffic destined to a supported AWS service. Currently only Amazon S3 and DynamoDB.
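
As a sketch of how a Gateway Endpoint is declared, the CloudFormation fragment below creates an S3 endpoint and attaches it to a route table. The resource names (`MyVpc`, `PrivateRouteTable`) are illustrative placeholders:

```yaml
Resources:
  S3GatewayEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Gateway          # Gateway is the default; use Interface for PrivateLink services
      ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
      VpcId: !Ref MyVpc
      RouteTableIds:
        - !Ref PrivateRouteTable        # adds a prefix-list route to S3 in this route table
```

Once in place, instances in subnets using that route table reach S3 over the Amazon network without a NAT device or internet gateway.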

More details @ VPC Endpoints

VPC Peering

  • VPC peering connection enables networking connection between two VPCs to route traffic between them using private IPv4 addresses or IPv6 addresses
  • VPC peering connections can be created between your own VPCs, or with a VPC in another AWS account
  • VPC peering connections can be created across regions, referred to as inter-region VPC peering connection
  • VPC peering uses existing underlying AWS infrastructure; it is neither a gateway nor a VPN connection, and does not rely on a separate piece of physical hardware.
  • VPC Peering does not have a single point of failure for communication or a bandwidth bottleneck.
  • VPC Peering connections have limitations
    • Cannot be used with overlapping CIDR blocks
    • Does not provide transitive peering
    • Does not support edge-to-edge routing through a gateway or private connection

More details @ VPC Peering

VPN CloudHub

  • AWS VPN CloudHub allows you to securely communicate from one site to another using AWS Managed VPN or Direct Connect
  • AWS VPN CloudHub operates on a simple hub-and-spoke model that can be used with or without a VPC
  • AWS VPN CloudHub can be used if you have multiple branch offices and existing internet connections and would like to implement a convenient, potentially low cost hub-and-spoke model for primary or backup connectivity between these remote offices.
  • AWS VPN CloudHub leverages VPC virtual private gateway with multiple gateways, each using unique BGP autonomous system numbers (ASNs).

Transit VPC

  • A transit VPC is a common strategy for connecting multiple, geographically dispersed VPCs and remote networks in order to create a global network transit center.
  • A transit VPC simplifies network management and minimizes the number of connections required to connect multiple VPCs and remote networks
  • Transit VPC can be used to support important use cases
    • Private Networking – You can build a private network that spans two or more AWS Regions.
    • Shared Connectivity – Multiple VPCs can share connections to data centers, partner networks, and other clouds.
    • Cross-Account AWS Usage – The VPCs and the AWS resources within them can reside in multiple AWS accounts.
  • Transit VPC design helps implement more complex routing rules, such as network address translation between overlapping network ranges, or to add additional network-level packet filtering or inspection

Transit Gateway (Virtual Routing and Forwarding)

  • Transit gateway enables you to attach VPCs (across accounts) and VPN connections in the same Region and route traffic between them
  • Transit gateways support dynamic and static routing between attached VPCs and VPN connections
  • Transit gateway removes the need for using full mesh VPC Peering and Transit VPC

Virtual Private Network (VPN)

VPC Managed VPN Connection
  • VPC provides the option of creating an IPsec VPN connection between remote customer networks and their VPC over the internet
  • AWS managed VPN endpoint includes automated multi–data center redundancy & failover built into the AWS side of the VPN connection
  • AWS managed VPN consists of two parts
    • Virtual Private Gateway (VPG) on AWS side
    • Customer Gateway (CGW) on the on-premises data center
  • AWS Managed VPN only provides Site-to-Site VPN connectivity. It does not provide Point-to-Site connectivity, e.g. from a mobile client
  • Virtual Private Gateways are highly available, as each represents two distinct VPN endpoints, physically located in separate data centers to increase the availability of the VPN connection.
  • High availability on the on-premises side must be handled by creating an additional Customer Gateway.
  • AWS Managed VPN connections are low cost and quick to set up compared to Direct Connect. However, they are not as reliable, as they traverse the Internet.

More details @ Virtual Private Network

Software VPN

  • VPC offers the flexibility to fully manage both sides of the VPN connectivity by creating a VPN connection between your remote network and a software VPN appliance running in your VPC network.
  • Software VPNs help manage both ends of the VPN connection either for compliance purposes or for leveraging gateway devices that are not currently supported by Amazon VPC’s VPN solution.
  • Software VPNs allows you to handle Point-to-Site connectivity
  • Software VPNs, with the above design, introduce a single point of failure, which needs to be handled.

Direct Connect

  • AWS Direct Connect helps establish a dedicated connection and private connectivity from an on-premises network to a VPC
  • Direct Connect can reduce network costs, increase bandwidth throughput, and provide a more consistent network experience than internet-based or VPN connections
  • Direct Connect uses industry-standard VLANs to access EC2 instances running within a VPC using private IP addresses
  • Direct Connect lets you establish
    • Dedicated Connection: A 1G or 10G physical Ethernet connection associated with a single customer through AWS.
    • Hosted Connection: A 1G or 10G physical Ethernet connection that an AWS Direct Connect Partner provisions on behalf of a customer.
  • Direct Connect provides following Virtual Interfaces
    • Private virtual interface – to access a VPC using private IP addresses.
    • Public virtual interface – to access all AWS public services using public IP addresses.
    • Transit virtual interface – to access one or more transit gateways associated with Direct Connect gateways.
  • Direct Connect connections are not redundant as each connection consists of a single dedicated connection between ports on your router and an Amazon router
  • Direct Connect High Availability can be configured using
    • Multiple Direct Connect connections
    • Back-up IPSec VPN connection

More details @ Direct Connect

LAGs

  • Direct Connect link aggregation group (LAG) is a logical interface that uses the Link Aggregation Control Protocol (LACP) to aggregate multiple connections at a single AWS Direct Connect endpoint, allowing you to treat them as a single, managed connection.
  • LAGs have the following requirements
    • All connections in the LAG must use the same bandwidth.
    • You can have a maximum of four connections in a LAG. Each connection in the LAG counts towards your overall connection limit for the Region.
    • All connections in the LAG must terminate at the same AWS Direct Connect endpoint.

Direct Connect Gateway

  • Direct Connect Gateway allows you to connect an AWS Direct Connect connection to one or more VPCs in your account that are located in the same or different regions
  • Direct Connect gateway can be created in any public region and accessed from all other public regions
  • Direct Connect gateway CANNOT be used to connect to a VPC in another account.
  • Alternatively, Direct connect locations can also access the public resources in any AWS Region using a public virtual interface.

Direct Connect with VPN

  • AWS Direct Connect plus VPN provides an IPsec-encrypted private connection that also reduces network costs, increases bandwidth throughput, and provides a more consistent network experience than internet-based VPN connections.


AWS CloudFormation Best Practices – Certification

AWS CloudFormation Best Practices

  • AWS CloudFormation Best Practices are based on real-world experience from current AWS CloudFormation customers
  • AWS CloudFormation Best Practices help provide guidelines on
    • how to plan and organize stacks,
    • create templates that describe resources and the software applications that run on them,
    • and manage stacks and their resources

Required Mainly for Developer, SysOps Associate & DevOps Professional Exam

Planning and Organizing

Organize Your Stacks By Lifecycle and Ownership

  • Use the lifecycle and ownership of the AWS resources to help you decide what resources should go in each stack.
  • By grouping resources with common lifecycles and ownership, owners can make changes to their set of resources by using their own process and schedule without affecting other resources.
  • For e.g., consider an application using Web and Database instances. The Web and Database tiers have different lifecycles, and usually the ownership lies with different teams. Maintaining both in a single stack would need communication and coordination between the teams, introducing complexity. It is best to have separate stacks owned by the respective teams, so that they can update their resources without impacting each other's stack.

Use Cross-Stack References to Export Shared Resources

  • With multiple stacks, there is usually a need to refer values and resources across stacks.
  • Use cross-stack references to export resources from a stack so that other stacks can use them
  • Stacks can use the exported resources by calling them using the Fn::ImportValue function.
  • For e.g. Web stack would always need resources from the Network stack like VPC, Subnets etc.
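
A minimal sketch of the pattern: the network stack exports a value, and the web stack imports it with `Fn::ImportValue`. The export name and resource names here are illustrative:

```yaml
# --- network stack template ---
Outputs:
  VpcId:
    Value: !Ref Vpc
    Export:
      Name: network-VpcId            # export names must be unique per region

# --- web stack template (separate file) ---
Resources:
  WebSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Web tier security group
      VpcId: !ImportValue network-VpcId   # shorthand for Fn::ImportValue
```

Note that a stack cannot be deleted while another stack imports one of its exports, which is exactly the coupling the lifecycle-based organization above tries to keep deliberate.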

Use IAM to Control Access

  • Use IAM to control access to
    • what AWS CloudFormation actions users can perform, such as viewing stack templates, creating stacks, or deleting stacks
    • what actions CloudFormation can perform on resources on their behalf
  • Remember, having access to CloudFormation does not provide a user with access to AWS resources; that needs to be provided separately.
  • To separate permissions between a user and the AWS CloudFormation service, use a service role. AWS CloudFormation uses the service role’s policy to make calls instead of the user’s policy.

Verify Quotas for All Resource Types

  • Ensure that stack can create all the required resources without hitting the AWS account limits.

Reuse Templates to Replicate Stacks in Multiple Environments

  • Reuse templates to replicate infrastructure in multiple environments
  • Use parameters, mappings, and conditions sections to customize and make templates reusable
  • for e.g. creating the same stack in development, staging and production environment with different instance types, instance counts etc.

Use Nested Stacks to Reuse Common Template Patterns

  • Nested stacks are stacks that create other stacks.
  • Nested stacks separate out the common patterns and components to create dedicated templates for them, preventing copy pasting across stacks.
  • for e.g. a standard load balancer configuration can be created as nested stack and just used by other stacks

Creating templates

Do Not Embed Credentials in Your Templates

  • Use input parameters to pass in sensitive information such as DB password whenever you create or update a stack.
  • Use the NoEcho property to obfuscate the parameter value.
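
A sketch of the pattern: the password arrives as a `NoEcho` parameter and is only ever referenced, never hard-coded. Names and property values are illustrative:

```yaml
Parameters:
  DBPassword:
    Type: String
    NoEcho: true               # value is masked (****) in console and describe calls
    MinLength: 8
Resources:
  Database:
    Type: AWS::RDS::DBInstance
    Properties:
      Engine: mysql
      MasterUsername: admin
      MasterUserPassword: !Ref DBPassword   # passed in at create/update time
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
```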

Use AWS-Specific Parameter Types

  • For existing AWS-specific values, such as existing Virtual Private Cloud IDs or an EC2 key pair name, use AWS-specific parameter types
  • AWS CloudFormation can quickly validate values for AWS-specific parameter types before creating your stack.
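
For example, a parameters section using AWS-specific types might look like the sketch below (parameter names are illustrative):

```yaml
Parameters:
  KeyName:
    Type: AWS::EC2::KeyPair::KeyName   # validated against existing key pairs
  VpcId:
    Type: AWS::EC2::VPC::Id            # must be an existing VPC in the account
  Subnets:
    Type: List<AWS::EC2::Subnet::Id>   # a list of existing subnet IDs
```

With these types, a typo in a VPC ID fails fast at stack creation instead of partway through provisioning.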

Use Parameter Constraints

  • Use Parameter constraints to describe allowed input values so that CloudFormation catches any invalid values before creating a stack.
  • For e.g. constraints for database user name with min and max length
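
The database user name example might look like the following sketch (values are illustrative):

```yaml
Parameters:
  DBUser:
    Type: String
    MinLength: 1
    MaxLength: 16
    AllowedPattern: "[a-zA-Z][a-zA-Z0-9]*"
    ConstraintDescription: must begin with a letter and contain only alphanumeric characters
```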

Use AWS::CloudFormation::Init to Deploy Software Applications on Amazon EC2 Instances

  • Use AWS::CloudFormation::Init resource and the cfn-init helper script to install and configure software applications on EC2 instances

Validate Templates Before Using Them

  • Validate templates before creating or updating a stack
  • Validating a template helps catch syntax and some semantic errors, such as circular dependencies, before AWS CloudFormation creates any resources.
  • During validation, AWS CloudFormation first checks whether the template is valid JSON; if it isn’t, it checks whether the template is valid YAML. If both checks fail, AWS CloudFormation returns a template validation error.

Managing stacks

Manage All Stack Resources Through AWS CloudFormation

  • After launching the stack, any further updates should be made through CloudFormation only.
  • Making changes outside the stack can create a mismatch between the stack’s template and the current state of the stack resources, which can cause errors if you update or delete the stack.

Create Change Sets Before Updating Your Stacks

  • Change sets provide a preview of how proposed changes to a stack might impact the running resources before you implement them
  • CloudFormation doesn’t make any changes to the stack until you execute the change set, allowing you to decide whether to proceed with the proposed changes or create another change set.

Use Stack Policies

  • Stack policies help protect critical stack resources from unintentional updates that could cause resources to be interrupted or even replaced
  • During a stack update, you must explicitly specify the protected resources that you want to update; otherwise, no changes are made to protected resources

Use AWS CloudTrail to Log AWS CloudFormation Calls

  • AWS CloudTrail tracks anyone making AWS CloudFormation API calls in the AWS account.
  • API calls are logged whenever anyone uses the AWS CloudFormation API, the AWS CloudFormation console, a back-end console, or AWS CloudFormation AWS CLI commands.
  • Enable logging and specify an Amazon S3 bucket to store the logs.

Use Code Reviews and Revision Controls to Manage Your Templates

  • Using code reviews and revision controls help track changes between different versions of your templates and changes to stack resources
  • Maintaining history can help revert the stack to a certain version of the template.


AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company has deployed their application using CloudFormation. They want to update their stack. However, they want to understand how the changes will affect running resources before implementing the update. How can the company achieve this?
    1. Use CloudFormation Validate Stack feature
    2. Use CloudFormation Dry Run feature
    3. Use CloudFormation Stage feature
    4. Use CloudFormation Change Sets feature
  2. You have multiple similar three-tier applications and have decided to use CloudFormation to maintain version control and achieve automation. How can you best use CloudFormation to keep everything agile and maintain multiple environments while keeping cost down?
    1. Create multiple templates in one CloudFormation stack.
    2. Combine all resources into one template for version control and automation.
    3. Use CloudFormation custom resources to handle dependencies between stacks
    4. Create separate templates based on functionality, create nested stacks with CloudFormation.
  3. You are working as an AWS DevOps admin for your company. You are in charge of building the infrastructure for the company’s development teams using CloudFormation. The template will include building the VPC and networking components, installing a LAMP stack and securing the created resources. As per AWS best practices, what is the best way to design this template?
    1. Create a single CloudFormation template to create all the resources since it would be easier from the maintenance perspective.
    2. Create multiple CloudFormation templates based on the number of VPC’s in the environment.
    3. Create multiple CloudFormation templates based on the number of development groups in the environment.
    4. Create multiple CloudFormation templates for each set of logical resources, one for networking, and the other for LAMP stack creation.

References

Google Cloud – Associate Cloud Engineer Certification learning path

Google Cloud Certified - Associate Cloud Engineer

Google Cloud – Associate Cloud Engineer certification exam is basically for one who works day-in day-out with the Google Cloud services. It targets a Cloud Engineer who deploys applications, monitors operations, and manages enterprise solutions. The exam makes sure it covers the gamut of services and concepts. The exam is not that tough, and the available time of 2 hours is quite plenty if you are well prepared.

Quick summary of the exam

  • Wide range of Google Cloud services and what they actually do. It focuses heavily on IAM, Compute and Storage. There is a little bit of Network but hardly any data services.
  • Hands-on is a must. Covers Cloud SDK commands and Console operations that you would use for day-to-day work. If you have not worked on GCP before, make sure you do lots of labs, else you would be absolutely clueless for some of the questions and commands.
  • Tests are updated for the latest enhancements. There are no references to Google Container Engine; everything is Google Kubernetes Engine, and the exam covers Cloud Functions and Cloud Spanner.
  • Once again, be sure that NO online course or practice test is going to cover everything. I did LinuxAcademy, which covered maybe 60-70%, but hands-on or practical knowledge is a MUST

The list of topics is quite long, but something that you need to be sure to cover are

  • General Services
    • Billing
      • understand how billing works. Monthly vs Threshold and which has priority
      • how to change a billing account for a project and what roles you need. Hint – Project Owner and Billing Administrator for the billing account
    • Cloud SDK
      • understand gcloud commands esp. when dealing with
        • configurations i.e. gcloud config
          • activate profiles or set project and accounts
        • IAM i.e. gcloud iam
          • check roles
        • deployment manager i.e. gcloud deployment-manager
  • Network Services
    • Virtual Private Cloud
      • Create a custom Virtual Private Cloud (VPC), subnets and host applications within them. Hint – a VPC is global and spans regions; subnets are regional
      • Understand how Firewall rules works and how they are configured. Hint – Focus on Network Tags.
      • Understand the concept of internal and external IPs and the difference between static and ephemeral IPs
    • Load Balancer
  • Identity Services
    • Cloud IAM 
      • provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
      • Understand how IAM works and how rules apply esp. the hierarchy from Organization -> Folder -> Project -> Resources
      • Understand the difference between Primitive, Pre-defined and Custom roles and their use cases
      • Need to know and understand the roles for the following services at least
        • Cloud Storage – Admin vs Creator vs Viewer
        • Compute Engine – Admin vs Instance Admin
        • Spanner – Viewer vs Database User
        • BigQuery – User vs JobUser
      • Know how to copy roles to different projects or organization. Hint – gcloud iam roles copy
      • Know how to use service accounts with applications
  • Compute Services
    • Make sure you know all the compute services Google Compute Engine, Google App Engine and Google Kubernetes Engine, they are heavily covered in the exam.
    • Google Compute Engine
      • Google Compute Engine is the best IaaS option for compute and provides fine grained control
      • Make sure you know how to create a GCE instance and connect to it using Cloud Shell or SSH keys
      • Make sure you know the difference between backups and images and how to create the same
      • Understand how you can recreate instance in different zones and regions
      • Know difference between managed vs unmanaged instance groups and auto-healing feature
      • Understand Preemptible VMs and their use cases.
      • know how to upgrade an instance without downtime. HINT – live migration.
      • In case of any issues or errors, how to debug the same
    • Google App Engine
      • Google App Engine is mainly the best option for PaaS, given the platforms supported and the features provided.
      • Deploy an application with App Engine and understand how versioning and rolling deployments can be done
      • Understand how auto scaling, traffic splitting and traffic migration work.
      • Know App Engine is a regional resource and understand the steps to migrate or deploy application to different region and project.
    • Google Kubernetes Engine
      • Google Container Engine is now officially Google Kubernetes Engine and the questions refer to the same
      • Google Kubernetes Engine, powered by the open source container scheduler Kubernetes, enables you to run containers on Google Cloud Platform.
      • Kubernetes Engine takes care of provisioning and maintaining the underlying virtual machine cluster, scaling your application, and operational logistics such as logging, monitoring, and cluster health management.
      • Be sure to Create a Kubernetes Cluster and configure it to host an application
      • Understand how to make the cluster auto repairable and upgradable. Hint – Node auto-upgrades and auto-repairing feature
      • Very important to understand where to use gcloud commands (to create a cluster) and kubectl commands (manage the cluster components)
      • Very important to understand how to increase cluster size and enable autoscaling for the cluster
      • know how to manage secrets like database passwords
  • Storage Services
    • Understand each storage service options and their use cases.
    • Cloud Storage
      • cost-effective object storage for unstructured data.
      • very important to know the different classes and their use cases esp. Regional and Multi-Regional (frequent access), Nearline (monthly access) and Coldline (yearly access)
      • Understand lifecycle management. HINT – rules apply based on the object creation date
      • Understand Signed URL to give temporary access and the users do not need to be GCP users
      • Understand permissions – IAM vs ACLs (fine grained control)
    • Relational Databases
      • Know Cloud SQL and Cloud Spanner
      • Cloud SQL
        • is a fully-managed service that provides MySQL and PostgreSQL only.
        • limited to 10TB and is a regional service.
        • know the difference between Failover and Read replicas
        • know how to perform Point-In-Time recovery. Hint – requires binary logging and backups
      • Cloud Spanner
        • is a fully managed, mission-critical relational database service.
        • provides a scalable online transaction processing (OLTP) database with high availability and strong consistency at global scale.
        • globally distributed and can scale and handle more than 10TB.
        • not a direct replacement and would need migration
      • There are no direct options for Microsoft SQL Server or Oracle yet.
    • Data Warehousing
      • BigQuery
        • provides scalable, fully managed enterprise data warehouse (EDW) with SQL and fast ad-hoc queries.
        • Remember it is most suitable for historical analysis.
        • know how to perform a preview or dry run. Hint – price is determined by bytes read not bytes returned.
  • Data Services
    • Although there were only a couple of references to big data services in the exam, it is important to know (DO NOT DEEP DIVE) the Big Data stack (esp. IoT gateway, Pub/Sub, Bigtable vs BigQuery) to understand which service fits the different layers of ingest, store, process, analytics, use
      • Cloud Storage as the medium to store data as data lake
      • Cloud Pub/Sub as the messaging service to capture real-time data; designed to provide reliable, many-to-many, asynchronous messaging between applications, esp. for IoT data capture
      • Cloud Dataflow to process, transform, transfer data and the key service to integrate store and analytics.
      • Cloud BigQuery for storage and analytics. Remember BigQuery provides the same cost-effective option for storage as Cloud Storage
      • Cloud Dataprep to clean and prepare data. Hint – It can be used for anomaly detection.
      • Cloud Dataproc to handle existing Hadoop/Spark jobs. Hint – Use it to replace existing Hadoop infra.
      • Cloud Datalab is an interactive tool for exploration, transformation, analysis and visualization of your data on Google Cloud Platform
  • Monitoring
    • Google Stackdriver
      • provides everything from monitoring, alert, error reporting, metrics, diagnostics, debugging, trace.
      • remember that audits mainly involve checking Stackdriver
  • DevOps services
    • Deployment Manager 
    • Cloud Launcher (Marketplace)
      • provides a way to launch common software packages (e.g. Jenkins or WordPress) and stacks on Google Compute Engine with just a few clicks, like a prepackaged solution.
      • It can help minimize deployment time and can be used without deep knowledge of the product

Resources

Google Cloud – Professional Data Engineer Certification learning path

After completing my Google Cloud – Professional Cloud Architect certification exam, I was looking into the Google Cloud – Professional Data Engineer exam, and luckily Google Cloud was doing a pilot for their latest updated Professional Data Engineer certification exam. I applied for the free pilot and had a chance to appear for the exam. The pilot exam was 4 hours – 95 questions (as compared to 2 hrs – 50 questions). The results would be out in March 2019, but I can assure you the overall exam is quite exhaustive. Once again, the exam covers not only the gamut of services and concepts but also focuses on logical thinking and practical experience.

Quick summary of the exam

  • Wide range of Google Cloud data services and what they actually do. It includes Storage and LOTS of Data services
  • Nothing much on Compute and Network is covered
  • Questions sometimes test your logical thinking rather than any concept regarding Google Cloud.
  • Hands-on: if you have not worked on GCP before, make sure you do lots of labs, else you would be absolutely clueless for some of the questions and commands
  • Tests are updated for the latest enhancements.
  • The pilot exam does not cover the case studies. But given my Professional Cloud Architect exam experience, make sure you cover the case studies beforehand.
  • Be sure that NO online course or practice test is going to cover everything. I did Coursera and LinuxAcademy, which are really vast, but hands-on or practical knowledge is a MUST.

The list of topics is quite long, but something that you need to be sure to cover are

  • Identity Services
    • Cloud IAM 
      • provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
      • Understand how IAM works and how rules apply esp. the hierarchy from Organization -> Folder -> Project -> Resources
      • Understand IAM Best practices
      • Make sure you know the BigQuery Access roles
  • Storage Services
    • Understand each storage service options and their use cases.
    • Cloud Storage
      • cost-effective object storage for unstructured data.
      • very important to know the different classes and their use cases esp. Regional and Multi-Regional (frequent access), Nearline (monthly access) and Coldline (yearly access)
      • Understand Signed URL to give temporary access and the users do not need to be GCP users
      • Understand permissions – IAM vs ACLs (fine grained control)
    • Relational Databases
      • Know Cloud SQL and Cloud Spanner
      • Cloud SQL
        • is a fully-managed service that provides MySQL and PostgreSQL only.
        • Limited to 10TB and is a regional service.
      • Cloud Spanner
        • is a fully managed, mission-critical relational database service.
        • provides a scalable online transaction processing (OLTP) database with high availability and strong consistency at global scale.
        • globally distributed and can scale and handle more than 10TB.
        • not a direct replacement and would need migration
      • There are no direct options for Microsoft SQL Server or Oracle yet.
    • NoSQL
      • Know Cloud Datastore and BigTable
      • Datastore
        • provides document database for web and mobile applications. Datastore is not for analytics
        • Understand Datastore indexes and how to update indexes for Datastore
      • Bigtable
        • provides column database suitable for both low-latency single-point lookups and precalculated analytics
        • understand Bigtable is not for long term storage as it is quite expensive
        • know the differences with HBase
        • Know how to measure performance and scale
    • Data Warehousing
      • BigQuery
        • provides scalable, fully managed enterprise data warehouse (EDW) with SQL and fast ad-hoc queries.
        • Remember it is most suitable for historical analysis.
        • know how to control access to tables, columns within tables, and query results (hint – Authorized Views)
        • Be sure to cover the Best Practices including key strategy, cost optimization, partitioning and clustering
  • Data Services
    • Obviously there is lots of Data and Just Data
    • Know the Big Data stack and understand which service fits the different layers of ingest, store, process, analytics, use
    • Cloud Storage
      • as the medium to store data as data lake
      • understand what class is the best suited and which one provides geo-redundancy.
    • Cloud Pub/Sub
      • as the messaging service to capture real-time data esp. IoT
      • designed to provide reliable, many-to-many, asynchronous messaging between applications
      • how it compares to Kafka
    • Cloud Dataflow
      • to process, transform, transfer data and the key service to integrate store and analytics.
      • know how to improve a Dataflow performance
      • Google expects you to know the Apache Beam features as well
    • Cloud BigQuery
      • for storage and analytics. Remember BigQuery provides the same cost-effective option for storage as Cloud Storage
      • understand how BigQuery Streaming works
      • know BigQuery limitations esp. with updates and inserts
    • Cloud Dataprep
      • to clean and prepare data. It can be used for anomaly detection.
      • does not need any programming language knowledge and can be done through graphical interface
      • be sure to know or try hands-on on a dataset
    • Cloud Dataproc
      • to handle existing Hadoop/Spark jobs
      • you need to know how to improve the performance of the Hadoop cluster as well :). Know how to configure the Hadoop cluster to use all the cores (hint – spark executor cores) and handle out of memory errors (hint – executor memory)
      • how to install other components (hint – initialization actions)
    • Cloud Datalab
      • is an interactive tool for exploration, transformation, analysis and visualization of your data on Google Cloud Platform
      • based on Jupyter
    • Cloud Composer
      • fully managed workflow orchestration service based on Apache Airflow
      • pipelines are configured as directed acyclic graphs (DAGs)
      • workflows can live on-premises, in multiple clouds, or fully within GCP.
      • provides ability to author, schedule, and monitor your workflows in a unified manner
  • Machine Learning
    • Google expects the Data Engineer to surely know some of the Data scientists stuff
    • Understand the different algorithms
      • Supervised Learning (labelled data)
        • Classification (for e.g. Spam or Not)
        • Regression (for e.g. Stock or House prices)
      • Unsupervised Learning (Unlabelled data)
        • Clustering (for e.g. categories)
      • Reinforcement Learning
    • Know Cloud ML with Tensorflow
    • Know all the Cloud AI products which include
      • Cloud Vision
      • Cloud Natural Language
      • Cloud Speech-to-Text
      • Cloud Video Intelligence
    • Cloud AutoML products, which can help you get started without much machine learning experience
  • Monitoring
    • Google Stackdriver
      • provides everything from monitoring, alert, error reporting, metrics, diagnostics, debugging, trace.
      • remember that audits mainly involve checking Stackdriver
  • Security Services
    • Data Loss Prevention API to handle sensitive data esp. redaction of PII data.
    • understand Encryption techniques
  • Other Services
    • Storage Transfer Service allows import of large amounts of online data into Google Cloud Storage, quickly and cost-effectively. Online data is the key here, as it supports AWS S3, HTTP/HTTPS and other GCS buckets. If the data is on-premises, you need to use the gsutil command
    • Transfer Appliance to transfer large amounts of data quickly and cost-effectively into Google Cloud Platform. Check the data size; it will always be compared with the Storage Transfer Service or gsutil commands.
    • BigQuery Data Transfer Service to integrate with third-party services and load data into BigQuery
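A couple of the hints above translate to commands roughly like these (project, dataset, bucket and cluster names are placeholders):

```shell
# BigQuery dry run: reports bytes that would be read (the cost driver)
# without actually running the query
bq query --use_legacy_sql=false --dry_run \
    'SELECT name FROM `my-project.my_dataset.my_table` LIMIT 10'

# Dataproc cluster with an initialization action to install extra components,
# and Spark properties tuned for executor cores and memory
gcloud dataproc clusters create my-cluster --region us-central1 \
    --initialization-actions gs://my-bucket/install-extras.sh \
    --properties spark:spark.executor.cores=4,spark:spark.executor.memory=8g
```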

Resources

Google Cloud – Professional Cloud Architect Certification learning path

Google Cloud – Professional Cloud Architect certification exam is one of the toughest exams I have appeared for. It can surely be compared with the AWS Solutions Architect/DevOps Professional exams. The gamut of services and concepts it tests your knowledge on is really vast.

Quick summary of the exam

  • Wide range of Google Cloud services and what they actually do. It includes Compute, Storage, Network and even Data services
  • Questions sometimes test your logical thinking rather than any concept regarding Google Cloud.
  • Hands-on: if you have not worked on GCP before, make sure you do lots of labs, else you would be absolutely clueless for some of the questions and commands
  • Tests are updated for the latest enhancements. There are no references to Google Container Engine; everything is Google Kubernetes Engine, and the exam covers Cloud Functions and Cloud Spanner.
  • Make sure you cover the case studies beforehand. I got around 15 questions (almost 5 per case study) and they can really be a savior for you in the exam.
  • Be sure that NO online course or practice test is going to cover everything. I did LinuxAcademy, which is really vast, but hands-on or practical knowledge is a MUST.

The list of topics is quite long, but something that you need to be sure to cover are

  • Identity Services
    • Cloud IAM 
      • provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
      • Understand how IAM works and how rules apply esp. the hierarchy from Organization -> Folder -> Project -> Resources
      • How do you use on-premises authentication provider? Google Cloud Directory Sync (GCDS)
  • Compute Services
    • Make sure you know all the compute services Google Compute Engine, Google App Engine and Google Kubernetes Engine. You need to be sure to know the pros and cons and the use cases that you should use them.
    • Google Compute Engine
      • Google Compute Engine is the best IaaS option for compute and provides fine grained control
      • Make sure you know how to create a GCE, connect to it using Cloud shell or ssh keys
      • Make sure you know the difference between backups and images and how to create the same
      • Understand how you can recreate instance in different zones and regions
      • Understand the pricing and discounts model. Hint – Sustained (automatic, up to 30%) vs Committed (1 to 3 yrs) discounts.
      • Understand Preemptible VMs and their use cases.
      • Managed instance groups are covered heavily in the exam, as they provide the key auto-scaling capability. Hint – you need to create an instance template and associate it with the instance group
      • Understand how migration or traffic splitting with Managed instance groups works Hint – rolling updates & deployments
      • In case of any issues or errors, how to debug the same
    • Google App Engine
      • Google App Engine is mainly the best option for PaaS with platforms supported and features provided.
      • Understand the key differences between the Standard and Flexible App Engine environments. Hint – network access does not work for Standard, so VPN connections require the Flexible environment.
      • Deploy an application with App Engine and understand how versioning and rolling deployments can be done
    • Google Kubernetes Engine
      • Google Container Engine is now officially Google Kubernetes Engine and the questions refer to the same
      • Google Kubernetes Engine, powered by the open source container scheduler Kubernetes, enables you to run containers on Google Cloud Platform.
      • Kubernetes Engine takes care of provisioning and maintaining the underlying virtual machine cluster, scaling your application, and operational logistics such as logging, monitoring, and cluster health management.
      • Be sure to Create a Kubernetes Cluster and configure it to host an application
      • Very important to understand where to use gcloud commands (to create a cluster) and kubectl commands (manage the cluster components)
    • Cloud Functions
      • is a lightweight, event-based, asynchronous compute solution that allows you to create small, single-purpose functions that respond to cloud events without the need to manage a server or a runtime environment.
      • Remember that Cloud Functions is serverless and scales up from zero and back down to zero as the demand changes.
  • Network Services
    • Virtual Private Cloud
      • Create a custom Virtual Private Cloud (VPC), subnets and host applications within them. Hint – a VPC is global and spans regions; subnets are regional
      • Understand how Firewall rules works and how they are configured. Hint – Focus on Network Tags.
      • Understand the concept of shared VPC which allows for access to resources using internal IPs
      • Understand VPC Peering and Private Google Access use cases
    • On-premises connectivity
      • Cloud VPN and Interconnect are 2 components which help you connect to on-premises data center.
      • Understand the limitations of Cloud VPN, esp. the 1.5 Gbps limit per tunnel, and how it can be improved with multiple tunnels.
      • Understand what are the requirements to setup Cloud VPN. Hint – Cloud Router is required for BGP.
      • Know Interconnect as the reliable high speed, low latency and dedicated bandwidth options.
    • Cloud Load Balancer (GCLB)
      • Google Cloud Load Balancing provides scaling, high availability, and traffic management for your internet-facing and private applications.
      • Understand Google Load Balancing options and their use cases esp. which is global and internal and what protocols they support.
  • Storage Services
    • Understand each storage service options and their use cases.
    • Persistent disks
      • attached to Compute Engine instances; they provide fast access but are limited in scalability, availability and scope.
      • Remember performance depends on the size of the disk
    • Cloud Storage
      • cost-effective object storage for unstructured data.
      • very important to know the different classes and their use cases esp. Regional and Multi-Regional (frequent access), Nearline (monthly access) and Coldline (yearly access)
      • Understand how encryption works
      • Understand Signed URL to give temporary access and the users do not need to be GCP users
      • Understand permissions – IAM vs ACLs (fine grained control)
    • Relational Databases
      • Know Cloud SQL and Cloud Spanner
      • Cloud SQL
        • is a fully-managed service that provides MySQL and PostgreSQL only.
        • Limited to 10TB and is a regional service.
      • Cloud Spanner
        • is a fully managed, mission-critical relational database service.
        • provides a scalable online transaction processing (OLTP) database with high availability and strong consistency at global scale.
        • globally distributed and can scale and handle more than 10TB.
        • not a direct replacement and would need migration
      • There are no direct options for Microsoft SQL Server or Oracle yet.
    • NoSQL
      • Know Cloud Datastore and BigTable
      • Datastore
        • provides document database for web and mobile applications. Datastore is not for analytics
        • Understand Datastore indexes and how to update indexes for Datastore
        • Can be configured Multi-regional and regional
      • Bigtable
        • provides column database suitable for both low-latency single-point lookups and precalculated analytics
        • understand Bigtable is not for long term storage as it is quite expensive
    • Data Warehousing
      • BigQuery
        • provides scalable, fully managed enterprise data warehouse (EDW) with SQL and fast ad-hoc queries.
        • Remember it is most suitable for historical analysis.
    • MemoryStore and Firebase did not feature in any of the questions
  • Data Services
    • Although there is a different certification for Data Engineer, the Cloud Architect does cover data services. Data services are also part of the use cases so be sure to know about them
    • Know the Big Data stack and understand which service fits the different layers of ingest, store, process, analytics, use
    • Key Services which need to be mainly covered are –
      • Cloud Storage as the medium to store data as data lake
      • Cloud Pub/Sub as the messaging service to capture real-time data; designed to provide reliable, many-to-many, asynchronous messaging between applications, esp. for IoT data capture
      • Cloud Dataflow to process, transform, transfer data and the key service to integrate store and analytics.
      • Cloud BigQuery for storage and analytics. Remember BigQuery provides the same cost-effective option for storage as Cloud Storage
      • Cloud Dataprep to clean and prepare data. Hint – It can be used for anomaly detection.
      • Cloud Dataproc to handle existing Hadoop/Spark jobs. Hint – Use it to replace existing Hadoop infra.
      • Cloud Datalab is an interactive tool for exploration, transformation, analysis and visualization of your data on Google Cloud Platform
  • Monitoring
    • Google Stackdriver
      • provides everything from monitoring, alert, error reporting, metrics, diagnostics, debugging, trace.
      • remember audits are mainly checking Stackdriver
  • DevOps services
    • Deployment Manager is Infrastructure as Code
    • Cloud Source Repositories provides source code repository with Git version control to support collaborative development
    • Container Registry is a private Docker image storage system on Google Cloud Platform. Images are immutable.
    • Cloud Build is a service that executes your builds on Google Cloud Platform infrastructure.
    • Cloud Launcher provides a way to launch common software packages e.g. Jenkins or WordPress and stacks on Google Compute Engine with just a few clicks like a prepackaged solution. It can help minimize deployment time
  • Security Services
    • Cloud Security Scanner is a web application security scanner that enables developers to easily check for a subset of common web application vulnerabilities in websites built on App Engine and Compute Engine.
    • Data Loss Prevention API to handle sensitive data esp. redaction of PII data.
    • Focus on PCI-DSS and how to handle the same. Remember, GCP services are PCI-DSS compliant; however, you need to ensure your applications and hosting are in line with PCI-DSS.
    • The same concept as PCI-DSS applies to GDPR as well
  • Other Services
    • Storage Transfer Service allows import of large amounts of online data into Google Cloud Storage, quickly and cost-effectively. Online data is the key here, as it supports AWS S3, HTTP/HTTPS endpoints, and other GCS buckets. If the data is on-premises, you need to use the gsutil command
    • Transfer Appliance to transfer large amounts of data quickly and cost-effectively into Google Cloud Platform. Check the data size – it would always be compared against the Storage Transfer Service or gsutil commands.
    • Spinnaker is an open source, multi-cloud, continuous delivery platform and does appear in answer options, so be sure to know about it.
    • Jenkins for Continuous Integration and Continuous Delivery.
  • Case Studies

Resources

Google Cloud Certified – Professional Cloud Architect exam assesses your ability to:

Section 1: Designing and planning a cloud solution architecture

  • 1.1 Designing a solution infrastructure that meets business requirements. Considerations include:
    • business use cases and product strategy
    • cost optimization
    • supporting the application design
    • integration
    • movement of data
    • tradeoffs
    • build, buy or modify
    • success measurements (e.g., Key Performance Indicators (KPI), Return on Investment (ROI), metrics)
    • Compliance and observability
  • 1.2 Designing a solution infrastructure that meets technical requirements. Considerations include:
    • high availability and failover design
    • elasticity of cloud resources
    • scalability to meet growth requirements
  • 1.3 Designing network, storage, and compute resources. Considerations include:
    • integration with on premises/multi-cloud environments
    • cloud native networking (VPC, peering, firewalls, container networking)
    • identification of data processing pipeline
    • matching data characteristics to storage systems
    • data flow diagrams
    • storage system structure (e.g., Object, File, RDBMS, NoSQL, NewSQL)
    • mapping compute needs to platform products
  • 1.4 Creating a migration plan (i.e., documents and architectural diagrams). Considerations include:
    • integrating solution with existing systems
    • migrating systems and data to support the solution
    • licensing mapping
    • network and management planning
    • testing and proof-of-concept
  • 1.5 Envisioning future solution improvements. Considerations include:
    • cloud and technology improvements
    • business needs evolution
    • evangelism and advocacy

Section 2: Managing and provisioning solution Infrastructure

  • 2.1 Configuring network topologies. Considerations include:
    • extending to on-premises (hybrid networking) using VPN or Interconnect
    • extending to a multi-cloud environment which may include GCP to GCP communication
    • security
    • data protection
  • 2.2 Configuring individual storage systems. Considerations include:
    • data storage allocation
    • data processing/compute provisioning
    • security and access management
    • network configuration for data transfer and latency
    • data retention and data lifecycle management
    • data growth management
  • 2.3 Configuring compute systems. Considerations include:
    • compute system provisioning
    • compute volatility configuration (preemptible vs. standard)
    • network configuration for compute nodes
    • infrastructure provisioning technology configuration (e.g. Chef/Puppet/Ansible/Terraform)
    • container orchestration (e.g. Kubernetes)

Section 3: Designing for security and compliance

  • 3.1 Designing for security. Considerations include:
    • Identity and Access Management (IAM)
    • Resource hierarchy (organizations, folders, projects)
    • data security (key management, encryption)
    • penetration testing
    • Separation of Duties (SoD)
    • security controls
    • Managing customer-supplied encryption keys with Cloud KMS
  • 3.2 Designing for legal compliance. Considerations include:
    • legislation (e.g., Health Insurance Portability and Accountability Act (HIPAA), Children’s Online Privacy Protection Act (COPPA), etc.)
    • audits (including logs)
    • certification (e.g., Information Technology Infrastructure Library (ITIL) framework)

Section 4: Analyzing and optimizing technical and business processes

  • 4.1 Analyzing and defining technical processes. Considerations include:
    • Software Development Lifecycle Plan (SDLC)
    • continuous integration / continuous deployment
    • troubleshooting / post mortem analysis culture
    • testing and validation
    • IT enterprise process (e.g. ITIL)
    • business continuity and disaster recovery
  • 4.2 Analyzing and defining business processes. Considerations include:
    • stakeholder management (e.g. Influencing and facilitation)
    • change management
    • team assessment / skills readiness
    • decision making process
    • customer success management
    • cost optimization / resource optimization (Capex / Opex)
  • 4.3 Developing procedures to test resilience of solution in production (e.g., DiRT and Simian Army)

Section 5: Managing implementation

  • 5.1 Advising development/operation team(s) to ensure successful deployment of the solution. Considerations include:
    • application development
    • API best practices
    • testing frameworks (load/unit/integration)
    • data and system migration tooling
  • 5.2 Interacting with Google Cloud using GCP SDK (gcloud, gsutil and bq). Considerations include:
    • local installation
    • Google Cloud Shell

Section 6: Ensuring solution and operations reliability

  • 6.1 Monitoring/Logging/Alerting solution
  • 6.2 Deployment and release management
  • 6.3 Supporting operational troubleshooting
  • 6.4 Evaluating quality control measures

Case Studies

  • Mountkirk Games
  • Dress4Win
  • TerramEarth

AWS DynamoDB Throughput Capacity

AWS DynamoDB throughput capacity depends on the read/write capacity modes for processing reads and writes on the tables.

There are two types of read/write capacity modes:

  • On-demand
  • Provisioned

NOTE – Provisioned mode is covered in the AWS Certified Developer – Associate (DVA-C01) exam, esp. the calculations. On-demand capacity mode is the latest enhancement and does not yet feature in the exams.

Provisioned Mode

  • Provisioned mode requires you to specify the number of reads and writes per second as required by the application
  • Provisioned throughput is the maximum amount of capacity that an application can consume from a table or index
  • If the provisioned throughput capacity on a table or index is exceeded, it is subject to request throttling
  • Provisioned mode is a good option for applications with
    • predictable application traffic
    • consistent traffic 
    • ability to forecast capacity requirements to control costs
  • Provisioned mode provides the following capacity units 
    • Read Capacity Units (RCU)
      • Total number of read capacity units required depends on the item size, and the consistent read model (eventually or strongly)
      • one RCU represents
        • one strongly consistent read per second for an item up to 4 KB in size
        • two eventually consistent reads per second, for an item up to 4 KB in size (i.e. 8 KB per second)
      • Transactional read requests require two read capacity units to perform one read per second for items up to 4 KB.
      • DynamoDB must consume additional read capacity units for items greater than 4 KB, e.g. for an 8 KB item, one strongly consistent read per second requires 2 read capacity units, an eventually consistent read requires 1 read capacity unit, and a transactional read request requires 4 read capacity units
      • Item size is rounded up to the next 4 KB multiple, e.g. a 6 KB and an 8 KB item would require the same RCUs
    • Write Capacity Units (WCU)
      • Total number of write capacity units required depends on the item size only
      • one WCU represents one write per second for an item up to 1 KB in size
      • Transactional write requests require 2 write capacity units to perform one write per second for items up to 1 KB.
      • DynamoDB must consume additional write capacity units for items greater than 1 KB, e.g. for a 2 KB item, 2 write capacity units would be required to sustain one write request per second, or 4 write capacity units for a transactional write request
      • Item size is rounded up to the next 1 KB multiple, e.g. a 0.5 KB and a 1 KB item would need the same WCU
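
The RCU/WCU rounding rules above can be sketched as a pair of small helpers (hypothetical function names for illustration, not an AWS API; they only reproduce the rounding arithmetic):

```python
import math

def rcu_per_read(item_size_kb, consistency="strong"):
    """RCUs consumed by one read per second of an item of the given size."""
    blocks = math.ceil(item_size_kb / 4)   # item size rounds up to 4 KB multiples
    if consistency == "strong":
        return blocks                      # 1 RCU per 4 KB block
    if consistency == "eventual":
        return blocks * 0.5                # half the cost for eventual consistency
    if consistency == "transactional":
        return blocks * 2                  # double the cost for transactional reads
    raise ValueError(consistency)

def wcu_per_write(item_size_kb, transactional=False):
    """WCUs consumed by one write per second of an item of the given size."""
    blocks = math.ceil(item_size_kb)       # item size rounds up to 1 KB multiples
    return blocks * 2 if transactional else blocks

# 8 KB item: 2 RCUs strongly consistent, 1 eventually consistent, 4 transactional
print(rcu_per_read(8), rcu_per_read(8, "eventual"), rcu_per_read(8, "transactional"))
# 2 KB item: 2 WCUs standard, 4 transactional; a 0.5 KB item rounds up to 1 KB
print(wcu_per_write(2), wcu_per_write(2, True), wcu_per_write(0.5))
```

Note how a 6 KB and an 8 KB item land in the same 4 KB rounding bucket, so they cost the same RCUs.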

On-demand Mode

  • On-demand mode provides a flexible billing option capable of serving thousands of requests per second without capacity planning
  • There is no need to specify the expected read and write throughput
  • You are charged only for the reads and writes that the application performs on the tables, in terms of read request units and write request units.
  • Offers pay-per-request pricing for read and write requests so that you pay only for what you use
  • DynamoDB adapts rapidly to accommodate the changing load
  • Read & Write Capacity Units performance remains the same as provisioned mode


AWS Certification Exam Practice Questions

  1. You need to migrate 10 million records in one hour into DynamoDB. All records are 1.5KB in size. The data is evenly distributed across the partition key. How many write capacity units should you provision during this batch load?
    1. 6667
    2. 4166
    3. 5556 (1.5 KB rounds up to 2 KB, i.e. 2 write units per record; 10 million/3600 secs × 2 ≈ 5556)
    4. 2778
  2. A meteorological system monitors 600 temperature gauges, obtaining temperature samples every minute and saving each sample to a DynamoDB table. Each sample involves writing 1K of data and the writes are evenly distributed over time. How much write throughput is required for the target table?
    1. 1 write capacity unit
    2. 10 write capacity units ( 1 write unit for 1K * 600 gauges/60 secs)
    3. 60 write capacity units
    4. 600 write capacity units
    5. 3600 write capacity units
  3. A company is building a system to collect sensor data from its 36000 trucks, which is stored in DynamoDB. The trucks emit 1KB of data once every hour. How much write throughput is required for the target table? Choose an answer from the options below
    1. 10
    2. 60
    3. 600
    4. 150
  4. A company is using DynamoDB to design storage for their IOT project to store sensor data. Which combination would give the highest throughput?
    1. 5 Eventual Consistent reads capacity with Item Size of 4KB (40KB/s)
    2. 15 Eventual Consistent reads capacity with Item Size of 1KB (30KB/s)
    3. 5 Strongly Consistent reads capacity with Item Size of 4KB (20KB/s)
    4. 15 Strongly Consistent reads capacity with Item Size of 1KB (15KB/s)
  5. If your table item’s size is 3KB and you want to have 90 strongly consistent reads per second, how many read capacity units will you need to provision on the table? Choose the correct answer from the options below
    1. 90
    2. 45
    3. 10
    4. 19

References