AWS Certified Big Data -Speciality (BDS-C00) Exam Learning Path

Clearing the AWS Certified Big Data – Speciality (BDS-C00) was a great feeling. This was my third Speciality certification and in terms of the difficulty level (compared to Network and Security Speciality exams), I would rate it between Network (being the toughest) Security (being the simpler one).

Big Data in itself is a very vast topic and with AWS services, there is lots to cover and know for the exam. If you have worked on Big Data technologies including a bit of Visualization and Machine learning, it would be a great asset to pass this exam.

AWS Certified Big Data – Speciality (BDS-C00) exam basically validates

  • Implement core AWS Big Data services according to basic architectural best practices
  • Design and maintain Big Data
  • Leverage tools to automate Data Analysis

Refer AWS Certified Big Data – Speciality Exam Guide for details

                              AWS Certified Big Data – Speciality Domains

AWS Certified Big Data – Speciality (BDS-C00) Exam Summary

  • AWS Certified Big Data – Speciality exam, as its name suggests, covers a lot of Big Data concepts right from data transfer and collection techniques, storage, pre and post processing, analytics, visualization with the added concepts for data security at each layer.
  • One of the key tactic I followed when solving any AWS Certification exam is to read the question and use paper and pencil to draw a rough architecture and focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach to the right answer or atleast have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Whitepapers and articles
    • Analytics
      • Make sure you know and cover all the services in depth, as 80% of the exam is focused on these topics
      • Elastic Map Reduce
        • Understand EMR in depth
        • Understand EMRFS (hint: Use Consistent view to make sure S3 objects referred by different applications are in sync)
        • Know EMR Best Practices (hint: start with many small nodes instead on few large nodes)
        • Know Hive can be externally hosted using RDS, Aurora and AWS Glue Data Catalog
        • Know also different technologies
          • Presto is a fast SQL query engine designed for interactive analytic queries over large datasets from multiple sources
          • D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS
          • Spark is a distributed processing framework and programming model that helps do machine learning, stream processing, or graph analytics using Amazon EMR clusters
          • Zeppelin/Jupyter as a notebook for interactive data exploration and provides open-source web application that can be used to create and share documents that contain live code, equations, visualizations, and narrative text
          • Phoenix is used for OLTP and operational analytics, allowing you to use standard SQL queries and JDBC APIs to work with an Apache HBase backing store
      • Kinesis
        • Understand Kinesis Data Streams and Kinesis Data Firehose in depth
        • Know Kinesis Data Streams vs Kinesis Firehose
          • Know Kinesis Data Streams is open ended on both producer and consumer. It supports KCL and works with Spark.
          • Know Kineses Firehose is open ended for producer only. Data is stored in S3, Redshift and ElasticSearch.
          • Kinesis Firehose works in batches with minimum 60secs interval.
        • Understand Kinesis Encryption (hint: use server side encryption or encrypt in producer for data streams)
        • Know difference between KPL vs SDK (hint: PutRecords are synchronously, while KPL supports batching)
        • Kinesis Best Practices (hint: increase performance increasing the shards)
      • Know ElasticSearch is a search service which supports indexing, full text search, faceting etc.
      • Redshift
        • Understand Redshift in depth
        • Understand Redshift Advance topics like Workload Management, Distribution Style, Sort key
        • Know Redshift Best Practices w.r.t selection of Distribution style, Sort key, COPY command which allows parallelism
        • Know Redshift views to control access to data.
      • Amazon Machine Learning
      • Know Data Pipeline for data transfer
      • QuickSight
      • Know Glue as the ETL tool
    • Security, Identity & Compliance
    • Management & Governance Tools
      • Understand AWS CloudWatch for Logs and Metrics. Also, CloudWatch Events more real time alerts as compared to CloudTrail
    • Storage
    • Compute
      • Know EC2 access to services using IAM Role and Lambda using Execution role.
      • Lambda esp. how to improve performance batching, breaking functions etc.

AWS Certified Big Data – Speciality (BDS-C00) Exam Resources

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Learning Path

AWS Certified DevOps Engineer - Professional (DOP-C01) Certificate

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Learning Path

NOTE – Refer to DOP-C02 Learning Path

AWS Certified DevOps Engineer – Professional (DOP-C01) exam is the upgraded pattern of the DevOps Engineer – Professional exam which was released last year (2018). I recently attempted the latest pattern and AWS has done quite good in improving it further, as compared to the old one, to include more DevOps related questions and services.

AWS Certified DevOps Engineer – Professional (DOP-C01) exam basically validates

  • Implement and manage continuous delivery systems and methodologies on AWS
  • Implement and automate security controls, governance processes, and compliance validation
  • Define and deploy monitoring, metrics, and logging systems on AWS
  • Implement systems that are highly available, scalable, and self-healing on the AWS platform
  • Design, manage, and maintain tools to automate operational processes

Refer to AWS Certified DevOps Engineer – Professional Exam Guide

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Summary

  • AWS Certified DevOps Engineer – Professional exam was for a total of 170 minutes but it had 75 questions (I was always assuming it to be 65) and I just managed to complete the exam with 20 mins remaining. So be sure you are prepared and manage your time well. As always, mark the questions for review and move on and come back to them after you are done with all.
  • One of the key tactic I followed when solving the DevOps Engineer questions was to read the question and use paper and pencil to draw a rough architecture and focus on the areas that you need to improve. Trust me, you will be able eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach to the right answer or atleast have a 50% chance of getting it right.
  • AWS Certified DevOps Engineer – Professional exam covers a lot of concepts and services related to Automation, Deployments, Disaster Recovery, HA, Monitoring, Logging and Troubleshooting. It also covers security and compliance related topics.
  • Be sure to cover the following topics
    • Whitepapers are the key to understand Deployments and DR
    • Management Tools
      • DevOps professional exam cannot be cleared without the knowledge of this topics
      • Deep dive into CloudFormation, Elastic Beanstalk and OpsWorks
      • Very important to understand CloudFormation vs Elastic Beanstalk vs OpsWorks
      • CloudFormation
        • Have in-depth understand of CloudFormation concepts
        • Know how to indicate completion of events using CloudFormation helper scripts.
        • Understand CloudFormation deployment strategies esp. rolling and replacing update with AutoScaling and update of launch configuration
        • Understand CloudFormation policies esp. Update and Deletion policies (hint : retain resources on stack deletion)
        • Understand CloudFormation Best Practices esp. Nested Stacks and logical grouping
        • Understand CloudFormation template anatomy – parameters, outputs, mappings
        • Understand CloudFormation Custom resource and its use cases (hint : you can use Custom resource to retrieve AMI IDs or interact with external services)
      • Elastic Beanstalk
      • OpsWorks
        • Understand OpsWorks overall – stacks, layers, recipes
        • Understand OpsWorks Lifecycle events esp. the Configure event and how it can be used.
        • Understand OpsWorks Deployment Strategies
        • Know OpsWorks auto-healing and how to be notified for it.
      • Development Tools
        • Unlike the previous DevOps Engineer – Professional exam, the latest pattern has a heavy focus on the Developer tools and be sure to deep dive into them
        • Understand CodePipepline, CodeCommit, CodeDeploy, CodeBuild and their uses cases
        • CodePipeline
          • Understand how to build Pipelines and integration with other Code* services
          • Understand CodePipeline pipeline structure (Hint : run builds parallelly using runorder)
          • Understand how to configure notifications on events and failures
          • Know CodePipeline supports Manual Approval
        • CodeCommit
          • How to handle deployments for code. (Hint : Same repository and branches for projects and environments)
          • Know CodeCommit IAM policies
        • CodeDeploy
    • Monitoring & Governance tools
      • Very important to understand AWS CloudWatch vs AWS CloudTrail vs AWS Config
      • Very important to understand Trust Advisor vs Systems manager vs AWS Inspector
      • Know Personal Health Dashboard & Service Health Dashboard
      • CloudWatch
      • CloudTrail
        • Understand how to maintain CloudTrail logs integrity
      • Understand AWS Config and its use cases (hint : Config maintains history and can be used to revert the config)
      • Know Personal Health Dashboard (hint : it tracks events on your AWS resources)
      • Understand AWS Trusted Advisor and what it provides (hint : low utilization resources)
      • Systems Manager
        • Systems Manager is also covered heavily in the exams so be sure you know
        • Understand AWS Systems Manager and its various services like parameter store, patch manager
    • Networking & Content Delivery
      • Networking is covered very lightly. Usually the questions are targetted towards Troubleshooting of access or permissions.
      • Know VPC
      • Route 53
    • Security, Identity & Compliance
    • Storage
      • Exam does not cover Storage services in deep
      • Focus on Simple Secure Service (S3)
        • Understand S3 Permissions (Hint – acl authenticated users provides access to all authenticated users. How to control access)
        • Know S3 disaster recovery across region. (hint : cross region replication)
        • Know CloudFront for caching to improve performance
      • Elastic Block Store
        • Focus mainly on EBS Backup using snapshots for HA and Disaster recovery
    • Database
    • Compute
      • Know EC2
        • Understand ENI for HA, user data, pre-baked AMIs for faster instance start times
        • Amazon Linux 2 Image (hint : it allows for replication of Amazon Linux behavior in on-premises)
        • Snapshot and sharing
      • Auto Scaling
        • Auto Scaling Lifecycle events
        • Blue/green deployments with Auto Scaling – With new launch configurations, new auto scaling groups or CloudFormation update policies.
      • Understand Lambda
      • ECS
        • Know Monitoring and deployments with image update
    • Integration Tools
      • Know how CloudWatch integration with SNS and Lambda can help in notification (Topics are not required to be in detail)

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Resources

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Learning Path

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Learning Path

I recently cleared the AWS Certified Advanced Networking – Speciality (ANS-C00), which was my first, en route my path to the AWS Speciality certifications. Frankly, I feel the time I gave for preparation was still not enough, but I just about managed to get through. So a word of caution, this exam is inline or tougher than the professional exam especially for the reason that the Networking concepts it covers are not something you can get your hands dirty with easily.

AWS Certified Advanced Networking – Speciality (ANS-C00) exam is the focusing on the AWS Networking concepts. It basically validates

  • Design, develop, and deploy cloud-based solutions using AWS
    Implement core AWS services according to basic architecture best practices
  • Design and maintain network architecture for all AWS services
  • Leverage tools to automate AWS networking tasks

Refer to AWS Certified Advanced Networking – Speciality Exam Guide

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Resources

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Summary

  • AWS Certified Advanced Networking – Speciality exam covers a lot of Networking concepts like VPC, VPN, Direct Connect, Route 53, ALB, NLB.
  • One of the key tactic I followed when solving the DevOps Engineer questions was to read the question and use paper and pencil to draw a rough architecture and focus on the areas that you need to improve. Trust me, you will be able eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach to the right answer or atleast have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Networking & Content Delivery
      • You should know everything in Networking.
      • Understand VPC in depth
      • Virtual Private Network to establish connectivity between on-premises data center and AWS VPC
      • Direct Connect to establish connectivity between on-premises data center and AWS VPC and Public Services
        • Make sure you understand Direct Connect in detail, without this you cannot clear the exam
        • Understand Direct Connect connections – Dedicated and Hosted connections
        • Understand how to create a Direct Connect connection (hint: LOA-CFA provides the details for partner to connect to AWS Direct Connect location)
        • Understand virtual interfaces options – Private Virtual Interface for VPC resources and Public Virtual Interface for Public resources
        • Understand setup Private and Public VIF
        • Understand Route Propagation, propagation priority, BGP connectivity
        • Understand High Availability options based on cost and time i.e. Second Direct Connect connection OR VPN connection
        • Understand Direct Connect Gateway – it provides a way to connect to multiple VPCs from on-premises data center using the same Direct Connect connection
      • Route 53
        • Understand Route 53 and Routing Policies and their use cases Focus on Weighted, Latency routing policies
        • Understand Route 53 Split View DNS to have the same DNS to access a site externally and internally
      • Understand CloudFront and use cases
      • Load Balancer
        • Understand ELB, ALB and NLB 
        • Understand the difference ELB, ALB and NLB esp. ALB provides Content, Host and Path based Routing while NLB provides the ability to have static IP address
        • Know how to design VPC CIDR block with NLB (Hint – minimum number of IPs required are 8)
        • Know how to pass original Client IP to the backend instances (Hint – X-Forwarded-for and Proxy Protocol)
      • Know WorkSpaces requirements and setup
    • Security
      • Know AWS GuardDuty as managed threat detection service
      • Know AWS Shield esp. the Shield Advanced option and the features it provides
      • Know WAF as Web Traffic Firewall – (Hint – WAF can be attached to your CloudFront, Application Load Balancer, API Gateway to dynamically detect and prevent attacks)
    •  

Google Cloud – Associate Cloud Engineer Certification learning path

Google Cloud - Associate Cloud Engineer

Google Cloud – Associate Cloud Engineer Certification learning path

Google Cloud – Associate Cloud Engineer certification exam is basically for one who works day-in day-out with the Google Cloud Services. It targets an Cloud Engineer who deploys applications, monitors operations, and manages enterprise solutions. The exam makes sure it covers gamut of services and concepts. Although, the exam is not that tough and time available of 2 hours a quite plenty, if you well prepared.

Google Cloud – Associate Cloud Engineer Certification Summary

  • Has 50 questions to be answered in 2 hours.
  • Covers wide range of Google Cloud services and what they actually do. It focuses heavily on IAM, Compute, Storage with a little bit of Network but hardly any data services.
  • Hands-on is a must. Covers Cloud SDK, CLI commands and Console operations that you would use for day-to-day work. If you have not worked on GCP before make sure you do lot of labs else you would be absolute clueless for some of the questions and commands
  • Once again be sure that NO Online Course or Practice tests is going to cover all. I did ACloud Guru – LA course which covered maybe 60-70%, but hands-on or practical knowledge is MUST

Google Cloud – Associate Cloud Engineer Certification Topics

General Services

  • Cloud Billing
    • understand how Cloud Billing works. Monthly vs Threshold and which has priority
    • Budgets can be set to alert for projects
    • how to change a billing account for a project and what roles you need. Hint – Project Owner and Billing Administrator for the billing account
    • Cloud Billing can be exported to BigQuery and Cloud Storage
  • Resource Manager
    • Understand Resource Manager the hierarchy Organization -> Folders -> Projects -> Resources
    • IAM Policy inheritance is transitive and resources inherit the policies of all of their parent resources.
    • Effective policy for a resource is the union of the policy set on that resource and the policies inherited from higher up in the hierarchy.
  • Cloud SDK
    • understand gcloud commands esp. when dealing with
      • configurations i.e. gcloud config
        • activate profiles – gcloud config configurations activate
        • GKE setting default cluster i.e. gcloud config set container/cluster CLUSTER_NAME
        • set project gcloud config set project mygcp-demo
        • set region gcloud config set compute/region us-west1
        • set zone gcloud config set compute/zone us-west1-a
      • Get project list and ids gcloud projects list
      • Auth i.e gcloud auth
        • Auth login using user gcloud auth login
        • Auth login using service accountgcloud auth activate-service-account --key-file=sa_key.json
      • deployment manager i.e. gcloud deployment-manager
      • VPC firewalls i.e. gcloud compute firewall-rules

Network Services

  • Virtual Private Cloud
    • Understand Virtual Private Cloud (VPC), subnets and host applications within them Hint VPC spans across region
    • Understand how Firewall rules works and how they are configured. Hint – Focus on Network Tags. Also, there are 2 implicit firewall rules – default ingress deny and default egress allow
    • Understand VPC Peering and Shared VPC
    • Understand the concept internal and external IPs and difference between static and ephemeral IPs
    • Primary IP range of an existing subnet can be expanded by modifying its subnet mask, setting the prefix length to a smaller number.
  • Cloud Load Balancing

Identity Services

  • Identity and Access Management – IAM 
    • Identify and Access Management – IAM provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
    • Understand how IAM works and how rules apply esp. the hierarchy from Organization -> Folder -> Project -> Resources
    • Understand the difference between Primitive, Pre-defined and Custom roles and their use cases
    • IAM Policy inheritance is transitive and resources inherit the policies of all of their parent resources.
    • Effective policy for a resource is the union of the policy set on that resource and the policies inherited from higher up in the hierarchy.
    • Basically  Permissions -> Roles -> (IAM Policy) -> Members
    • Need to know and understand the roles for the following services atleast
      • Cloud Storage – Admin vs Creator vs Viewer
      • Compute Engine – Admin vs Instance Admin
      • Spanner – Viewer vs Database User
      • BigQuery – User vs JobUser
    • Know how to copy roles to different projects or organization. Hint – gcloud iam roles copy
    • Know how to use service accounts with applications
  • Cloud Identity
    • Cloud Identity provides IDaaS (Identify as a Service) and provides single sign-on functionality and federation with external identity provides like Active Directory.

Compute Services

  • Make sure you know all the compute services Google Compute Engine, Google App Engine and Google Kubernetes Engine, they are heavily covered in the exam.
  • Google Compute Engine
    • Google Compute Engine is the best IaaS option for compute and provides fine grained control
    • Know how to create a Compute Engine instance, connect to it using Cloud shell or ssh keys
    • Difference between backups and images and how to create instances from the same.
    • Instance templates with managed instance groups. Instance template cannot be edited, create a new one and attach.
    • Difference between managed vs unmanaged instance groups and auto-healing feature
    • Preemptible VMs and their use cases. HINT – can be terminated any time and supports max 24 hours.
    • Upgrade an instance without downtime using Live Migration
    • Managing access using OS Login or project and instance metadata
    • Prevent accidental deletion using deletion protection flag
    • In case of any issues or errors, how to debug the same
  • Google App Engine
    • Google App Engine is mainly the best option for PaaS with platforms supported and features provided.
    • Deploy an application with App Engine and understand how versioning and rolling deployments can be done
    • Understand how to keep auto scaling and traffic splitting and migration.
    • Know App Engine is a regional resource and understand the steps to migrate or deploy application to different region and project.
    • Know the difference between App Engine Flexible vs Standard
  • Google Kubernetes Engine
    • Google Container Engine is now officially Google Kubernetes Engine and the questions refer to the same
    • Google Kubernetes Engine, powered by the open source container scheduler Kubernetes, enables you to run containers on Google Cloud Platform.
    • Kubernetes Engine takes care of provisioning and maintaining the underlying virtual machine cluster, scaling your application, and operational logistics such as logging, monitoring, and cluster health management.
    • Be sure to Create a Kubernetes Cluster and configure it to host an application
    • Understand how to make the cluster auto repairable and upgradable. Hint – Node auto-upgrades and auto-repairing feature
    • Very important to understand where to use gcloud commands (to create a cluster) and kubectl commands (manage the cluster components)
    • Very important to understand how to increase cluster size and enable autoscaling for the cluster
    • know how to manage secrets like database passwords

Storage Services

  • Understand each storage service options and their use cases.
  • Cloud Storage
    • Cloud Storage is cost-effective object storage for unstructured data.
    • very important to know the different storage classes and their use cases esp. Regional and Multi-Regional (frequent access), Nearline (monthly access) and Coldline (yearly access)
    • Understand life cycle management. HINT – Changes are in accordance to object creation date
    • Understand Signed URL to give temporary access and the users do not need to be GCP users
    • Understand access control and permissions – IAM vs ACLs (fine grained control)
    • Understand best practices esp. uploading and downloading the data. HINT using parallel composite uploads
  • Relational Databases
    • Cloud SQL
      • Cloud SQL is a fully-managed service that provides MySQL, PostgreSQL and MS SQL Server
      • limited to 10TB 64TB and is a regional service.
      • Difference between Failover and Read replicas. Failover provides High Availability and almost zero downtime while Read replicas provide scalability. Cross region Read Replicas are supported
      • Perform Point-In-Time recovery. Hint – requires binary logging and backups
    • Cloud Spanner
      • is a fully managed, mission-critical relational database service.
      • provides a scalable online transaction processing (OLTP) database with high availability and strong consistency at global scale.
      • globally distributed and can scale and handle more than 10TB.
      • not a direct replacement and would need migration
    • There are no direct options for Microsoft SQL Server or Oracle yet.
  • Data Warehousing
    • BigQuery
      • provides scalable, fully managed enterprise data warehouse (EDW) with SQL and fast ad-hoc queries.
      • Remember it is most suitable for historical analysis.
      • know how to perform a preview or dry run. Hint – price is determined by bytes read not bytes returned.
      • supports federated tables or external tables that can support Cloud Storage, BigTable, Google Drive and Cloud SQL.

Data Services

  • Although there were only a couple of reference of big data services in the exam, it is important to know (DO NOT DEEP DIVE) the Big Data stack (esp. IoT gateway, Pub/Sub, Bigtable vs BigQuery) to understand which service fits the different layers of ingest, store, process, analytics, use
    • Cloud Storage as the medium to store data as data lake
    • Cloud Pub/Sub as the messaging service to capture real time data esp. IoT
    • Cloud Pub/Sub is designed to provide reliable, many-to-many, asynchronous messaging between applications esp. real time IoT data capture
    • Cloud Dataflow to process, transform, transfer data and the key service to integrate store and analytics.
    • Cloud BigQuery for storage and analytics. Remember BigQuery provides the same cost-effective option for storage as Cloud Storage
    • Cloud Dataprep to clean and prepare data. Hint – It can be used anomaly detection.
    • Cloud Dataproc to handle existing Hadoop/Spark jobs. Hint – Use it to replace existing hadoop infra.
    • Cloud Datalab is an interactive tool for exploration, transformation, analysis and visualization of your data on Google Cloud Platform

Monitoring

  • Google Cloud Monitoring or Stackdriver
    • provides everything from monitoring, alert, error reporting, metrics, diagnostics, debugging, trace.
    • remember audits are mainly checking Stackdriver

DevOps services

  • Deployment Manager 
  • Google Marketplace (Cloud Launcher)
    • provides a way to launch common software packages e.g. Jenkins or WordPress and stacks on Google Compute Engine with just a few clicks like a prepackaged solution.
    • It can help minimize deployment time and can be used without any knowledge about the product

Google Cloud – Associate Cloud Engineer Certification Resources

Google Cloud – Professional Data Engineer Certification learning path

Google Cloud – Professional Data Engineer Certification Learning Path

I just recertified on my Google Cloud Certified – Professional Data Engineer certification. The first attempt on the Data Engineer exam has already been 2 long years which lasted for 4 hours with 95 questions. Once again, similar to the other Google Cloud certification exams, the Data Engineer exam covers not only the gamut of services and concepts but also focuses on logical thinking and practical experience.

Google Cloud – Professional Cloud Data Engineer Certification Summary

  • Cloud Data Engineer exam had 50 questions to be answered in 2 hours
  • Covers a wide range of data services including machine learning, with other topics covering storage and security.
  • Exam does not cover any case studies
  • Although the exam covers the latest services, it has not been updated for Cloud Monitoring and Logging and still refers to Stackdriver.
  • Nothing much on Compute and Network is covered
  • Questions sometimes test your logical thinking rather than any concept regarding Google Cloud.
  • Hands-on is MUST, if you have not worked on GCP before make sure you do lots of labs else you would be absolutely clueless about some of the questions and commands
  • Be sure that NO Online Courses or Practice tests are going to cover all. I did Coursera, LinuxAcademy which is really vast, but hands-on or practical knowledge is MUST.

Google Cloud – Professional Cloud Data Engineer Certification Resources

Google Cloud – Professional Cloud Data Engineer Certification Topics

Data & Analytics Services

  • Obviously, there are lots and lots of data and related services
  • Google Cloud Data & Analytics Services Cheatsheet
  • Know the Big Data stack and understand which service fits the different layers of ingest, store, process, analytics
  • Cloud BigQuery
    • provides scalable, fully managed enterprise data warehouse (EDW) with SQL and fast ad-hoc queries.
    • ideal for storage and analytics.
    • provides the same cost-effective option for storage as Cloud Storage
    • understand BigQuery Security
      • use BigQuery IAM access roles to control data and querying access
      • use Authorized views to access control tables, columns within tables, and query results. HINT: Authorized views need to reside in a different dataset as compared to the source dataset.
      • support data encryption
    • understand BigQuery Best Practices including key strategy, cost optimization, partitioning, and clustering
      • use dry run to estimate costs
      • use partitioning and clustering to limit the amount of data scanned
      • using external data sources might result in query performance degradation and its better to import the data
    • Dataset location can be set ONLY at the time of its creation.
    • supports schema auto-detection for JSON and CSV files.
    • understand how BigQuery Streaming works
    • know BigQuery limitations esp. with updates and inserts
    • supports an external data source (federated data source)
      • which is a data source that can be queried directly even though the data is not stored in BigQuery.
      • offers support for querying data directly from:
        • Cloud Bigtable
        • Cloud Storage
        • Google Drive
        • Cloud SQL
      • Use Permanent table for querying an external data source multiple times
      • Use Temporary table for querying an external data source for one-time, ad-hoc queries over external data, or for extract, transform, and load (ETL) processes.
  • Cloud Bigtable
    • provides column database suitable for both low-latency single-point lookups and precalculated analytics
    • understand Bigtable is not for long term storage as it is quite expensive
    • know the differences with HBase
    • Know how to measure performance and scale
    • supports Development and Production mode. Development mode can be upgraded to production and not vice versa.
    • supports HDD and SDD storage during cluster creation. HDD can be converted to SDD by exporting the data to the new instance.
    • understand Bigtable Replication. Can be used to separate real-time and batch workloads on the same instance using application profiles.
  • Cloud Pub/Sub
    • as the messaging service to capture real-time data esp. IoT
    • is designed to provide reliable, many-to-many, asynchronous messaging between applications esp. real-time IoT data capture
    • guarantees at-least-once (but not exactly once) message delivery and can result in data duplication if the message is not ack within a defined time period.
    • how it compares to Kafka (HINT: provides only 7 days of retention vs Kafka which depends on the storage)
  • Cloud Dataflow
    • to process, transform, transfer data and the key service to integrate store and analytics.
    • know how to improve a Dataflow performance
    • understand Apache Beam features as well
      • understand PCollections, Transforms, ParDo and what they do
      • understand windowing, watermarks, triggers Hint: windowing and watermarks can be used to handle delayed messages
    • supports drain feature to finish existing jobs but stop processing new ones, usually useful for deploying incompatible breaking changes
    • canceling a job will lead to an immediate stop and in-flight data loss.
  • Cloud Dataprep
    • to clean and prepare data. It can be used for anomaly detection.
    • does not need any programming language knowledge and can be done through the graphical interface
    • be sure to know or try hands-on on a dataset
  • Cloud Dataproc
    • to handle existing Hadoop/Spark jobs
    • supports connector for BigQuery, Bigtable, Cloud Storage
    • supports Ephermal clusters and with Cloud Storage connector support the data can be stored in GCS instead of HDFS
    • you need to know how to improve the performance of the Hadoop cluster as well :). Know how to configure the Hadoop cluster to use all the cores (hint- spark executor cores) and handle out of memory errors (hint – executor memory)
    • Secondary workers can be used to scale with the below limitations
      • Processing only with no data storage
      • No secondary-worker-only clusters
      • Persistent disk size is used for local caching of data and is not available through HDFS.
    • how to install other components (hint – initialization actions)
  • Cloud Datalab
    • is an interactive tool for exploration, transformation, analysis, and visualization of your data on Google Cloud Platform
    • based on Jupyter
  • Cloud Composer
    • fully managed workflow orchestration service, based on Apache Airflow, enabling workflow creation that spans across clouds and on-premises data centers.
    • pipelines are configured as directed acyclic graphs (DAGs)
    • workflow lives on-premises, in multiple clouds, or fully within GCP.
    • provides the ability to author, schedule, and monitor the workflows in a unified manner

Identity Services

  • Cloud IAM 
    • provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
    • Understand how IAM works and how rules apply esp. the hierarchy from Organization -> Folder -> Project -> Resources
    • Understand IAM Best practices

Storage Services

  • Understand each storage service option and its use cases.
  • Cloud Storage
    • cost-effective object storage for unstructured data.
    • very important to know the different classes and their use cases esp. Regional and Multi-Regional (frequent access), Nearline (monthly access), and Coldline (yearly access)
    • Understand Signed URL to give temporary access and the users do not need to be GCP users
    • Understand permissions – IAM vs ACLs (fine-grained control)
  • Cloud SQL
    • is a fully-managed service that provides MySQL and PostgreSQL only.
    • Limited to 10TB and is a regional service.
    • No direct options for Oracle yet.
  • Cloud Spanner
    • is a fully managed, mission-critical relational database service.
    • provides a scalable online transaction processing (OLTP) database with high availability and strong consistency at a global scale.
    • globally distributed and can scale and handle more than 10TB.
    • not a direct replacement and would need migration
  • Cloud Datastore
    • provides document database for web and mobile applications. Datastore is not for analytics
    • Understand Datastore indexes and how to update indexes for Datastore

Machine Learning

  • Google expects the Data Engineer to surely know some of the Data scientists stuff
  • Understand the different algorithms
    • Supervised Learning (labeled data)
      • Classification (for e.g. Spam or Not)
      • Regression (for e.g. Stock or House prices)
    • Unsupervised Learning (Unlabelled data)
      • Clustering (for e.g. categories)
    • Reinforcement Learning
  • Know Cloud ML with Tensorflow
  • Know all the Cloud AI products which include
    • Cloud Vision
    • Cloud Natural Language
    • Cloud Speech-to-Text
    • Cloud Video Intelligence
    • Cloud Dialogflow
  • Cloud AutoML products, which can help you get started without much machine learning experience

Monitoring

  • Cloud Monitoring and Logging
    • provides everything from monitoring, alert, error reporting, metrics, diagnostics, debugging, trace.
    • remember audits are mainly checking Cloud Logging entries
    • Aggregated sink can then route log entries from the organization or folder, plus (recursively) from any contained folders, billing accounts, or projects

Security Services

Other Services

  • Storage Transfer Service 
    • allows import of large amounts of online data into Google Cloud Storage, quickly and cost-effectively. Online data is the key here as it supports AWS S3, HTTP/HTTPS, and other GCS buckets. If the data is on-premises you need to use the gsutil command
  • Transfer Appliance 
    • to transfer large amounts of data quickly and cost-effectively into Google Cloud Platform. Check for the data size and it would be always compared with Google Transfer Service or gsutil commands.
  • BigQuery Data Transfer Service
    • to integrate with third-party services and load data into BigQuery

Google Cloud – Professional Cloud Architect Certification learning path

Google Cloud - Professional Cloud Architect certificate

Google Cloud – Professional Cloud Architect Certification Learning Path

Re-certified !!!! Google Cloud – Professional Cloud Architect certification exam is one of the toughest exam I have appeared for. Even though it was recertification, the preparation level was same as the first one. The gamut of services and concepts it tests your knowledge on is really vast.

Google Cloud – Professional Cloud Architect Certification Summary

  • Has 50 questions to be answered in 2 hours.
  • Covers wide range of Google Cloud services and what they actually do.
  • includes Compute, Storage, Network and even Data services
  • Questions sometimes tests your logical thinking rather than any concept regarding Google Cloud.
  • Hands-on is a MUST, if you have not worked on GCP before make sure you do lots of labs else you would be absolute clueless for some of the questions and commands
  • Make sure you cover the case studies before hand. I got  ~15 questions (almost 5 per case study) and it can really be a savior for you in the exams.
  • Be sure that NO Online Course or Practice tests is going to cover all. I did LinuxAcademy (a bit old now) which is really vast, but hands-on or practical knowledge is MUST.

Google Cloud – Professional Cloud Architect Certification Resources

Google Cloud – Professional Cloud Architect Certification Topics

General Services

  • Cloud Billing
    • understand how Cloud Billing works. Monthly vs Threshold and which has priority
    • Budgets can be set to alert for projects
    • how to change a billing account for a project and what roles you need. Hint – Project Owner and Billing Administrator for the billing account
    • Cloud Billing can be exported to BigQuery and Cloud Storage
  • Resource Manager
    • Understand Resource Manager the hierarchy Organization -> Folders -> Projects -> Resources
    • IAM Policy inheritance is transitive and resources inherit the policies of all of their parent resources.
    • Effective policy for a resource is the union of the policy set on that resource and the policies inherited from higher up in the hierarchy.

Identity Services

  • Cloud Identity and Access Management
    • Identify and Access Management – IAM provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
    • Understand how IAM works and how rules apply esp. the hierarchy from Organization -> Folder -> Project -> Resources
    • Understand the difference between Primitive, Pre-defined and Custom roles and their use cases
    • IAM Policy inheritance is transitive and resources inherit the policies of all of their parent resources.
    • Effective policy for a resource is the union of the policy set on that resource and the policies inherited from higher up in the hierarchy.
    • Basically  Permissions -> Roles -> (IAM Policy) -> Members
    • Know how to use service accounts with applications
  • Cloud Identity
    • Cloud Identity provides IDaaS (Identity as a Service) and provides single sign-on functionality and federation with external identity provides like Active Directory.
    • Cloud Identity supports federating with Active Directory using GCDS to implement the synchronization

Compute Services

    • Make sure you know all the compute services Google Compute Engine, Google App Engine and Google Kubernetes Engine. You need to be sure to know the pros and cons and the use cases that you should use them.
    • Google Compute Engine
      • Google Compute Engine is the best IaaS option for compute and provides fine grained control
      • Know how to create a Compute Engine instance, connect to it using Cloud shell or ssh keys
      • Difference between backups and images and how to create instances from the same.
      • Understand Compute Engine Storage Options. Disk throughput and IOPS depends on type and size.
      • Understand Compute Engine Snapshots
      • Instance templates with managed instance groups provide scalability and high availability
      • Instance template cannot be edited, create a new one and attach.
      • Difference between managed vs unmanaged instance groups and auto-healing feature
      • Managed instance groups are covered heavily the exam, as they provide the key auto-scaling capability. Hint: you need to create an Instance template and associate it with Instance group
      • Understand how migration or traffic splitting with Managed instance groups works Hint – rolling updates & deployments
      • Preemptible VMs and their use cases. HINT – can be terminated any time and supports max 24 hours.
      • Upgrade an instance without downtime using Live Migration
      • Managing access using OS Login or project and instance metadata
      • Prevent accidental deletion using deletion protection flag
      •  Understand the pricing and discounts model Hint – Sustained (automatic upto 30%) vs Committed (1 to 3 yrs) discounts.
      • In case of any issues or errors, how to debug the same
    • Google App Engine
      • Google App Engine is mainly the best option for PaaS with platforms supported and features provided.
      • Deploy an application with App Engine and understand how versioning and rolling deployments can be done
      • Understand how to keep auto scaling and traffic splitting and migration.
      • Know App Engine is a regional resource and understand the steps to migrate or deploy application to different region and project.
      • Know the difference between App Engine Flexible vs Standard
    • Google Kubernetes Engine
      • Google Kubernetes Engine, powered by the open source container scheduler Kubernetes, enables you to run containers on Google Cloud Platform.
      • Kubernetes Engine takes care of provisioning and maintaining the underlying virtual machine cluster, scaling your application, and operational logistics such as logging, monitoring, and cluster health management.
      • A node pool is a subset of machines that all have the same configuration, including machine type (CPU and memory) authorization scopes. Node pools represent a subset of nodes within a cluster; a container cluster can contain one or more node pools. Hint : For adding new machine types, need to add a new node pool as existing one cannot be edited
      • Be sure to Create a Kubernetes Cluster and configure it to host an application
      • Understand how to make the cluster auto repairable and upgradable. Hint – Node auto-upgrades and auto-repairing feature
      • Very important to understand where to use gcloud commands (to create a cluster) and kubectl commands (manage the cluster components)
      • Very important to understand how to increase cluster size and enable autoscaling for the cluster
      • Know how to manage secrets like database passwords
    • Cloud Functions
      • is a lightweight, event-based, asynchronous compute solution that allows you to create small, single-purpose functions that respond to cloud events without the need to manage a server or a runtime environment.
      • Remember that Cloud Functions is serverless and scales from zero to scale and back to zero as the demand changes.

Network Services

  • Virtual Private Cloud
    • Understand Virtual Private Cloud (VPC), subnets and host applications within them Hint VPC spans across region
    • Understand how Firewall rules works and how they are configured. Hint – Focus on Network Tags. Also, there are 2 implicit firewall rules – default ingress deny and default egress allow
    • Understand VPC Peering and Shared VPC
    • Understand the concept internal and external IPs and difference between static and ephemeral IPs
    • Primary IP range of an existing subnet can be expanded by modifying its subnet mask, setting the prefix length to a smaller number.
    • Understand Private Google Access use cases
  • On-premises connectivity
    • Cloud VPN and Interconnect are 2 components which help you connect to on-premises data center.
    • Understand limitations of Cloud VPN esp. 3Gbps limit. How it can be improved with multiple tunnels.
    • Understand what are the requirements to setup Cloud VPN.
    • Cloud Router provides dynamic routing using BGP
    • Know Interconnect as the reliable high speed, low latency and dedicated bandwidth options.
  • Cloud Load Balancing (GCLB)
    • Google Cloud Load Balancing provides scaling, high availability, and traffic management for your internet-facing and private applications.
    • Understand Google Load Balancing options and their use cases esp. which is global and internal and what protocols they support.

Storage Services

  • Understand each Storage Options and use cases.
  • Persistent disks
    • attached to the Compute Engines, provide fast access however are limited in scalability, availability and scope.
    • Remember performance depends on the size of the disk
  • Cloud Storage
    • Cloud Storage is cost-effective object storage for unstructured data.
    • very important to know the different storage classes and their use cases esp. Regional and Multi-Regional (frequent access), Nearline (monthly access) and Coldline (yearly access)
    • Understand life cycle management. HINT – Changes are in accordance to object creation date
    • Understand various data encryption techniques
    • Understand Signed URL to give temporary access and the users do not need to be GCP users
    • Understand access control and permissions – IAM vs ACLs (fine grained control)
    • Understand best practices esp. uploading and downloading the data. HINT using parallel composite uploads
  • Relational Databases
    • Know Cloud SQL and Cloud Spanner
    • Cloud SQL
      • Cloud SQL is a fully-managed service that provides MySQL, PostgreSQL and MS SQL Server
      • limited to 10TB and is a regional service.
      • Difference between Failover and Read replicas. Failover provides High Availability and almost zero downtime while Read replicas provide scalability. Cross region Read Replicas are supported
      • Perform Point-In-Time recovery. Hint – requires binary logging and backups
      • MS SQL server support was added anew. Previously for HA, it required setting up SQL Server on Compute Engine, using Always On Availability Groups using Windows Failover Clustering. Place nodes in different subnets.
    • Cloud Spanner
      • is a fully managed, mission-critical relational database service.
      • provides a scalable online transaction processing (OLTP) database with high availability and strong consistency at global scale.
      • globally distributed and can scale and handle more than 10TB.
      • not a direct replacement and would need migration
    • There are no direct options for Oracle yet.
  • NoSQL
    • Know Cloud Datastore and BigTable
    • Datastore
      • provides document database for web and mobile applications. Datastore is not for analytics
      • Understand Datastore indexes and how to update indexes for Datastore
      • Can be configured Multi-regional and regional
    • Bigtable
      • provides column database suitable for both low-latency single-point lookups and precalculated analytics
      • understand Bigtable is not for long term storage as it is quite expensive
  • Data Warehousing
    • BigQuery
      • provides scalable, fully managed enterprise data warehouse (EDW) with SQL and fast ad-hoc queries.
      • Remember it is most suitable for historical analysis.
  • MemoryStore and Firebase did not feature in any of the questions

Data Services

  • Although there is a different certification for Data Engineer, the Cloud Architect does cover data services. Data services are also part of the use cases so be sure to know about them
  • Know the Big Data stack and understand which service fits the different layers of ingest, store, process, analytics, use
  • Key Services which need to be mainly covered are –
    • Cloud Storage as the medium to store data as data lake
    • Cloud Pub/Sub
      • as the messaging service to capture real time data esp. IoT
      • is designed to provide reliable, many-to-many, asynchronous messaging between applications esp. real time IoT data capture
      • Cloud Storage can generate notifications Object change notification
    • Cloud Dataflow to process, transform, transfer data and the key service to integrate store and analytics.
    • Cloud BigQuery for storage and analytics. Remember BigQuery provides the same cost-effective option for storage as Cloud Storage
    • Cloud Dataprep to clean and prepare data. Hint – It can be used anomaly detection.
    • Cloud Dataproc to handle existing Hadoop/Spark jobs. Hint – Use it to replace existing hadoop infra.
    • Cloud Datalab is an interactive tool for exploration, transformation, analysis and visualization of your data on Google Cloud Platform
  • Know standard patterns Cloud Pub/Sub -> Dataflow -> BigQuery

Monitoring

  • Google Cloud Monitoring or Stackdriver
    • provides everything from monitoring, alert, error reporting, metrics, diagnostics, debugging, trace.
    • remember audits are mainly checking Stackdriver
  • Google Cloud Logging or Stackdriver logging

DevOps services

  • Deployment Manager 
    • provides Infrastructure as Code
    • provides dynamic provisioning with templates
  • Cloud Source Repositories
    • provides source code repository with Git version control to support collaborative development
  • Container Registry
    • is a private Docker image storage system on Google Cloud Platform.
    • images stored are immutable.
  • Cloud Build
    • is a service that executes your builds on Google Cloud Platform infrastructure.
  • MarketPlace (Cloud Launcher)
    • provides a way to launch common software packages e.g. Jenkins or WordPress and stacks on Google Compute Engine with just a few clicks like a prepackaged solution.
    • can help minimize deployment time and can be used without any knowledge about the product

Security Services

  • Cloud Security Scanner 
    • is a web application security scanner that enables developers to easily check for a subset of common web application vulnerabilities in websites built on App Engine and Compute Engine.
  • Data Loss Prevention API
    • to handle sensitive data esp. redaction of PII data.
  • PCI-DSS compliant
    • GCP services are PCI-DSS complaint, however you need to make sure for the applications and hosting to be inline with PCI-DSS requirements
  • Same concept as PCI-DSS applies to GDPR as well

Other Services

  • Know various data transfer options
  • Storage Transfer Service
    • allows import of large amounts of online data into Google Cloud Storage, quickly and cost-effectively.
    • Online data is the key here as it supports AWS S3, HTTP/HTTPS and other GCS buckets.
    • for on-premises data you need to use gsutil command
  • Transfer Appliance 
    • to transfer large amounts of data quickly and cost-effectively into Google Cloud Platform.
    • Check for the data size and it would be always compared with Google Transfer Service or gsutil commands.
    • Transfer Appliance Rehydrator provides data rehydration, which is the process by to fully reconstitute the files, so that the transferred data can be accessed and used.
  • Spinnaker
    • is an open source, multi-cloud, continuous delivery platform and does appear in answer options. So be sure to know about it.
  • Jenkins
    • for Continuous Integration and Continuous Delivery.

Case Studies

AWS Certified SysOps Administrator – Associate (SOA-C01) Exam Learning Path

AWS Certified SysOps Administrator – Associate (SOA-C01) Exam Learning Path

AWS Certified SysOps Administrator – Associate (SOA-C01) exam is the latest AWS exam and has already replaced the old SysOps Administrator – Associate exam from 24th Sept 2018. It basically validates

  • Deploy, manage, and operate scalable, highly available, and fault tolerant systems on AWS
  • Implement and control the flow of data to and from AWS
  • Select the appropriate AWS service based on compute, data, or security requirements
  • Identify appropriate use of AWS operational best practices
  • Estimate AWS usage costs and identify operational cost control mechanisms
  • Migrate on-premises workloads to AWS

Refer AWS Certified SysOps – Associate Exam Guide Sep 18

AWS Certified SysOps Administrator - Associate Content Outline

AWS Certified SysOps Administrator – Associate (SOA-C01) Exam Summary

  • AWS Certified SysOps Administrator – Associate exam is quite different from the previous one with more focus on the error handling, deployment, monitoring.
  • AWS Certified SysOps Administrator – Associate exam covers a lot of latest AWS services like ALB, Lambda, AWS Config, AWS Inspector, AWS Shield while focusing majorly on other services like CloudWatch, Metrics from various services, CloudTrail.
  • Be sure to cover the following topics
    •  Monitoring & Management Tools
      • Understand CloudWatch monitoring to provide operational transparency
        • Know which EC2 metrics it can track (disk, network, CPU, status checks) and which would need custom metrics (memory, disk swap, disk storage etc.)
        • Know ELB monitoring
          • Classic Load Balancer metrics SurgeQueueLength and SpilloverCount
          • Reasons for 4XX and 5XX errors
      • Understand CloudTrail for audit and governance
      • Understand AWS Config and its use cases
      • Understand AWS Systems Manager and its various services like parameter store, patch manager
      • Understand AWS Trusted Advisor and what it provides
      • Very important to understand AWS CloudWatch vs AWS CloudTrail vs AWS Config
      • Very important to understand Trust Advisor vs Systems manager vs Inspector
      • Know Personal Health Dashboard & Service Health Dashboard
      • Deployment tools
        • Know AWS OpsWorks and its ability to support chef & puppet
        • Know Elastic Beanstalk and its advantages
        • Understand AWS CloudFormation
          • Know stacks, templates, nested stacks
          • Know how to wait for resources setup to be completed before proceeding esp. cfn-signal
          • Know how to retain resources (RDS, S3), prevent rollback in case of a failure
    • Networking & Content Delivery
      • Understand VPC in depth
        • Understand the difference between
          • Bastion host – allow access to instances in private subnet
          • NAT – route traffic from private subnets to internet
          • NAT instance vs NAT Gateway
          • Internet Gateway – Access to internet
          • Virtual Private Gateway – Connectivity between on-premises and VPC
          • Egress-Only Internet Gateway – relevant to IPv6 only to allow egress traffic from private subnet to internet, without allowing ingress traffic
        • Understand
        • Understand how VPC Peering works and limitations
        • Understand VPC Endpoints and supported services
        • Ability to debug networking issues like EC2 not accessible, EC2 instances not reachable, Instances in subnets not able to communicate with others or Internet.
      • Understand Route 53 and Routing Policies and their use cases
        • Focus on Weighted, Latency routing policies
      • Understand VPN and Direct Connect and their use cases
      • Understand CloudFront and use cases
      • Understand ELB, ALB and NLB and what features they provide like
        • ALB provides content and path routing
        • NLB provides ability to give static IPs to load balancer.
    • Compute
      • Understand EC2 in depth
        • Understand EC2 instance types
        • Understand EC2 purchase options esp. spot instances and improved reserved instances options.
        • Understand how IO Credits work and T2 burstable performance and T2 unlimited
        • Understand EC2 Metadata & Userdata. Whats the use of each? How to look up instance data after it is launched.
        • Understand EC2 Security. 
          • How IAM Role work with EC2 instances
          • IAM Role can now be attached to stopped and runnings instances
        • Understand AMIs and remember they are regional and how can they be shared with others.
        • Troubleshoot issues with launching EC2 esp. RequestLimitExceeded, InstanceLimitExceeded etc.
        • Troubleshoot connectivity, lost ssh keys issues
      • Understand Auto Scaling
      • Understand Lambda and its use cases
      • Understand Lambda with API Gateway
    • Storage
    • Databases
    • Security
      • Understand IAM as a whole
      • Understand KMS for key management and envelope encryption
      • Understand CloudHSM and KMS vs CloudHSM esp. support for symmetric and asymmetric keys
      • Know AWS Inspector and its use cases
      • Know AWS GuardDuty as managed threat detection service. Will help eliminate as the option
      • Know AWS Shield esp. the Shield Advanced option and the features it provides
      • Know WAF as Web Traffic Firewall
      • Know AWS Artifact as on-demand access to compliance reports
    • Integration Tools
      • Understand SQS as message queuing service and SNS as pub/sub notification service
        • Focus on SQS as a decoupling service
        • Understand SQS FIFO, make sure you know the differences between standard and FIFO
      • Understand CloudWatch integration with SNS for notification
    • Cost management

AWS Certified SysOps Administrator – Associate (SOA-C01) Exam Resources

AWS Cloud Computing Whitepapers

AWS Certified SysOps Administrator – Associate (SOA-C01) Exam Contents

Domain 1: Monitoring and Reporting

  1. Create and maintain metrics and alarms utilizing AWS monitoring services
  1. Recognize and differentiate performance and availability metrics
  2. Perform the steps necessary to remediate based on performance and availability metrics

Domain 2: High Availability

  1. Implement scalability and elasticity based on use case
  2. Recognize and differentiate highly available and resilient environments on AWS

Domain 3: Deployment and Provisioning

  1. Identify and execute steps required to provision cloud resources
  2. Identify and remediate deployment issues

Domain 4: Storage and Data Management

  1. Create and manage data retention
  2. Identify and implement data protection, encryption, and capacity planning needs

Domain 5: Security and Compliance

  1. Implement and manage security policies on AWS
  1. Implement access controls when using AWS
  2. Differentiate between the roles and responsibility within the shared responsibility model

Domain 6: Networking

  1. Apply AWS networking features
  1. Implement connectivity services of AWS
  2. Gather and interpret relevant information for network troubleshooting

Domain 7: Automation and Optimization

  1. Use AWS services and features to manage and assess resource utilization
  2. Employ cost-optimization strategies for efficient resource utilization
  3. Automate manual or repeatable process to minimize management overhead

AWS Certified Developer – Associate DVA-C01 Exam Learning Path

AWS Certified Developer – Associate DVA-C01 Exam Learning Path

AWS Certified Developer – Associate DVA-C01 exam is the latest AWS exam and would replace the old Developer – Associate exam. It basically validates

  • Demonstrate an understanding of core AWS services, uses, and basic AWS architecture best practices.
  • Demonstrate proficiency in developing, deploying, and debugging cloud-based applications using AWS.

Refer AWS Certified Developer – Associate (Released June 2018) Exam Blue Print

AWS Certified Developer - Associate June 2018 Domains

AWS Certified Developer – Associate DVA-C01 Summary

  • AWS Certified Developer – Associate DVA-C01 exam is quite different from the previous one with more focus on the hands-on development and deployment concepts rather then just the architectural concepts
  • AWS Certified Developer – Associate DVA-C01 exam covers a lot of latest AWS services like Lambda, X-Ray while focusing majorly on other services like DynamoDB, Elastic Beanstalk, S3, EC2

AWS Developer – Associate DVA-C01 Exam Resources

AWS Developer – Associate DVA-C01 Exam Topics

  • Be sure to cover the following topics
    • Compute
      • Understand what AWS services you can use to build a serverless architecture?
      • Make sure you know and understand Lambda and serverless architecture, its features and use cases.
      • Know Lambda limits for e.g. execution time, deployable zipped and unzipped package limit
      • Be sure to know how to deploy, package using Lambda.
      • Understand tracing of Lambda functions using X-Ray
      • Understand integration of Lambda with CloudWatch.
      • Understand how to handle multiple releases using Alias
      • Know AWS Step Functions to manage Lambda functions flow
      • Understand Lambda with API Gateway
      • Understand API Gateway stages, ability to cater to different environments for e.g. dev, test, prod
      • Understand EC2 as a whole
      • Understand EC2 Metadata & Userdata. Whats the use of each? How to look up instance data after it is launched.
      • Understand EC2 Security. How IAM Role work with EC2 instances.
      • Understand how does EC2 evaluates the order of credentials, when multiple are provided. Remember the order – Environment variables -> Java system properties -> Default credential profiles file -> ECS container credentials -> Instance Profile credentials
      • Know Elastic Beanstalk at a high level, what it provides and its ability to get an application running quickly
      • Understand Elastic Beanstalk configurations and deployment types with their advantages and disadvantages
    • Databases
      • Understand relational and NoSQLs data storage options which include RDS, DynamoDB and their use cases
      • Understand DynamoDB Secondary Indexes
      • Make sure you understand DynamoDB provisioned throughput for Read/Writes and its calculations
      • Make sure you understand DynamoDB Consistency Model – difference between Strongly Consistent and Eventual Consistency
      • Understand DynamoDB with its low latency performance, DAX
      • Know how to configure fine grained security for DynamoDB table, items, attributes
      • Understand DynamoDB Best Practices regarding
        • table design
        • provisioned throughput
        • Query vs Scan operations
        • improving Scan operation performance
      • Understand RDS features – Read Replicas for scalability, Multi-AZ for High Availability
      • Know ElastiCache use cases, mainly for caching performance
      • Understand ElastiCache Redis vs Memcached
    • Storage
      • Understand S3 storage option
      • Understand S3 Best Practices to improve performance for GET/PUT requests
      • Understand S3 features like different storage classes with lifecycle policies, static website hosting, versioning, Pre-Signed URLs for both upload and download, CORS
    • Security
      • Understand IAM as a whole
      • Focus on IAM role and its use case especially with EC2 instance
      • Know how to test and validate IAM policies
      • Understand IAM identity providers and federation and use cases
      • Understand how AWS Cognito works and what features it provides
      • Understand MFA and How would implement two factor authentication for your application
      • Understand KMS for key management and envelope encryption
      • Know what services support KMS
        • Remember SQS, Kinesis now provides SSE support
      • Focus on S3 with SSE, SSE-C, SSE-KMS. How they work and differ?
      • Know how can you enforce only buckets to only accept encrypted objects
      • Know various KMS encryption options encrypt, reencrypt, generateEncryptedDataKey etc
      • Know how KMS impacts the performance of the services
    • Management Tools
      • Understand CloudWatch monitoring to provide operational transparency
      • Know which EC2 metrics it can track.
      • Understand CloudWatch is extendable with custom metrics
      • Understand CloudTrail for Audit
    • Integration Tools
      • Understand SQS as message queuing service and SNS as pub/sub notification service
      • Understand SQS features like visibility, long poll vs short poll
      • Focus on SQS as a decoupling service
      • AWS has released SQS FIFO, make sure you know the differences between standard and FIFO
      • Know the different development and deployment tools like CodeCommit, CodeBuild, CodeDeploy, CodePipeline
    • Networking
      • Does not cover much on networking or designing of networks, but be sure you understand VPC, Subnets, Routes, Security Groups etc.

AWS Cloud Computing Whitepapers

AWS Certified Developer – Associate DVA-C01 Exam Contents

Domain 1: Deployment

  1. Deploy written code in AWS using existing CI/CD pipelines, processes, and patterns.
  1. Deploy applications using Elastic Beanstalk.
  1. Prepare the application deployment package to be deployed to AWS.
  2. Deploy serverless applications.

Domain 2: Security

  1. Make authenticated calls to AWS services.
  1. Implement encryption using AWS services.
  2. Implement application authentication and authorization.

Domain 3: Development with AWS Services

  1. Write code for serverless applications.
  1. Translate functional requirements into application design.
  1. Implement application design into application code.
  2. Write code that interacts with AWS services by using APIs, SDKs, and AWS CLI.

Domain 4: Refactoring

  1. Optimize application to best use AWS services and features.
  2. Migrate existing application code to run on AWS.

Domain 5: Monitoring and Troubleshooting

  1. Write code that can be monitored.
  2. Perform root cause analysis on faults found in testing or production.

AWS Certified Cloud Practitioner (CLF-C01) Exam Learning Path

AWS Certified Cloud Practitioner Exam (CLF-C01) Learning Path

  • AWS Certified Cloud Practitioner Exam is mainly a high-level introduction to Cloud Computing, AWS Cloud, its advantages, its services, pricing, and support plans.
  • AWS Certified Cloud Practitioner exam is a good exam to start your AWS journey with and also provides non-technical professionals to know what AWS has to offer.
  • AWS Certified Cloud Practitioner Exam has 65 questions to be answered in 100 minutes.
  • AWS Certified Cloud Practitioner Exam is the only exam currently that can be taken online, without having to visit a test center.
  • Be sure to have a good internet connection, comfortable seating, id cards ready and you should be good to go.

AWS Certified Cloud Practitioner exam basically validates the following

  • Define what the AWS Cloud is and the basic global infrastructure
  • Describe basic AWS Cloud architectural principles
  • Describe the AWS Cloud value proposition
  • Describe key services on the AWS platform and their common use cases (for example, compute and analytics)
  • Describe basic security and compliance aspects of the AWS platform and the shared security model
  • Define the billing, account management, and pricing models;
  • Identify sources of documentation or technical assistance (for example, white papers or support tickets); and
  • Describe basic/core characteristics of deploying and operating in the AWS Cloud.

Refer to the AWS Certified Cloud Practitioner Exam guide
AWS Certified Cloud Practitioner (CLF-C01) Exam Domains

AWS Certified Cloud Practitioner Exam Resources

AWS Cloud Computing Whitepapers

AWS Certified Cloud Practitioner Exam Contents

Domain 1: Cloud Concepts

  • 1.1 Define the AWS Cloud and its value proposition
    • Agility – Speed, Experimentation, Innovation
    • Elasticity – Scale on demand, Eliminate wasted capacity
    • Availability – Spread across multiple zones
    • Flexibility – Broad set of products, Low to no cost to entry
    • Security – Compliant many compliance certifications, Shared responsibility model
  • 1.2 Identify aspects of AWS Cloud economics
    • Advantages of Cloud Computing
      • Trade capital expense for variable expense
      • Benefit from massive economies of scale
      • Stop guessing about capacity
      • Increase speed and agility
      • Stop spending money running and maintaining data centers
      • Go global in minutes
    • AWS Well-Architected Framework
      • Features include agility, security, reliability, performance efficiency, cost optimization, and operational excellence.
  • 1.3 List the different cloud architecture design principles

Domain 2: Security

  • 2.1 Define the AWS Shared Responsibility model
    • includes having a clear understanding of what AWS and Customer responsibilities are 
  • 2.2 Define AWS Cloud security and compliance concepts
  • 2.3 Identify AWS access management capabilities
    • includes services like IAM
  • 2.4 Identify resources for security support

Domain 3: Technology

  • 3.1 Define methods of deploying and operating in the AWS Cloud
  • 3.2 Define the AWS global infrastructure
    • includes AWS concepts of regions, AZs and edge locations
  • 3.3 Identify the core AWS services
    • Includes AWS Services Overview and focuses on high level knowledge of (but surely not deep enough)
      • Compute Services
      • Storage Services –
        • S3 – object storage, static website hosting. Know S3 subresources esp. versioning, server access logging
        • EBS
        • EFS – shared file storage that can be shared between on-premises and AWS resources
        • Glacier – archival long term storage
      • Security, Identity, and Compliance –
        • IAM – 
        • Organizations,
        • WAF
        • AWS Inspector – automated application security assessment 
        • AWS GuardDuty – threat detection service that continuously monitors for malicious activity and unauthorized behavior to protect the AWS accounts and workloads
        • AWS Artifact – On-demand access to AWS’ compliance reports
      • Databases –
      • Migration – Database Migration Service
      • Networking and Content Delivery
        • VPC
        • CloudFront – caching and helps improve performance and reduce latency
        • Route 53 – dns and domain registration. Also understand different routing policies esp. latency for Global traffic
        • VPN & Direct Connect – provides connectivity between on-premises and AWS Cloud
        • ELB – helps distribute load across multiple resources
        • AWS Global Accelerator – Improve application availability and performance
      • Management Tools
      • Messaging – SQS, SNS
  • 3.4 Identify resources for technology support
    • includes AWS Support Models and the key features and benefits the model provides to the customers
      • know only Enterprise support plan provides
        • dedicated TAM (Technical Account Manager)
        • Well-Architected Review delivered by AWS Solution Architects
        • Account Assistance by Assigned Support Concierge
        • SLA of < 15 mins for business critical events
      • know only Business and above support plan provides
        • 24×7 access to Cloud Support Engineers via email, chat & phone
        • Full Access to Trusted Advisor checks

Domain 4: Billing and Pricing

  • 4.1 Compare and contrast the various pricing models for AWS
    • includes AWS Pricing
      • know EC2 pricing models esp. Spot and Reserved
      • know Lambda pricing which is based on invocations and time
  • 4.2 Recognize the various account structures in relation to AWS billing and pricing
  • 4.3 Identify resources available for billing support
    • includes tools like TCO Calculator which helps compare cost of running applications in an on-premises or colocation environment to AWS
    • includes Cost Explorer allows you to view and analyze costs (hint – Spend forecasting)