AWS Certification – Content Delivery – Cheat Sheet

CloudFront

  • provides low latency and high data transfer speeds for distribution of static, dynamic web or streaming content to web users
  • delivers the content through a worldwide network of data centers called Edge Locations
  • keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.
  • dramatically reduces the number of network hops that users’ requests must pass through
  • supports multiple origin server options, such as AWS hosted services e.g. S3, EC2, ELB, or an on-premises server, which store the original, definitive version of the objects
  • a single distribution can have multiple origins, and the path pattern in a cache behavior determines which requests are routed to which origin
  • supports Web Download distribution and RTMP Streaming distribution
    • Web distribution supports static and dynamic web content, on-demand video (using progressive download & HLS), and live streaming video content
    • RTMP supports streaming of media files using Adobe Media Server and the Adobe Real-Time Messaging Protocol (RTMP) ONLY
  • supports HTTPS using either
    • a dedicated IP address, which is expensive as a dedicated IP address is assigned at each CloudFront edge location
    • Server Name Indication (SNI), which is free but supported only by modern browsers, with the domain name sent as part of the TLS handshake
  • For E2E HTTPS connection,
    • Viewers -> CloudFront needs a certificate issued by a trusted CA or ACM
    • CloudFront -> Origin needs certificate issued by ACM for ELB and by CA for other origins
  • Security
    • Origin Access Identity (OAI) can be used to restrict the content from S3 origin to be accessible from CloudFront only
    • supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
    • Signed URLs 
      • for RTMP distribution as signed cookies aren’t supported
      • to restrict access to individual files, for e.g., an installation download for your application.
      • users using a client, for e.g. a custom HTTP client, that doesn’t support cookies
    • Signed Cookies
      • provide access to multiple restricted files, for e.g., video part files in HLS format or all of the files in the subscribers’ area of a website.
      • when you don’t want to change the current URLs
    • integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
  • supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
    • only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
    • does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
  • object removal from cache
    • objects are removed upon expiry of the TTL from the cache, by default 24 hrs
    • objects can be invalidated explicitly, at a cost; users might continue to see the old version until it expires from those caches
    • objects can be invalidated only for Web distribution
    • change the object name (versioning) to serve a different version (see the CLI sketch after this list)
  • supports adding or modifying custom headers before the request is sent to origin which can be used to
    • validate if a user is accessing the content from the CDN
    • identifying the CDN from which the request was forwarded, in the case of multiple CloudFront distributions
    • for viewers not supporting CORS to return the Access-Control-Allow-Origin header for every request
  • supports Partial GET requests using range header to download object in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
  • supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
  • supports different price classes: one to include all regions, one to include only the least expensive regions, and others to exclude the most expensive regions
  • supports access logs which contain detailed information about every user request for both web and RTMP distribution
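
As a quick illustration of the invalidation and signed URL features above, a minimal AWS CLI sketch (the distribution ID, key pair ID, domain, paths, and expiry date are all placeholders):

    # Invalidate objects explicitly instead of waiting for the TTL to expire (has a cost)
    aws cloudfront create-invalidation \
        --distribution-id EDFDVBD6EXAMPLE \
        --paths "/images/*" "/index.html"

    # Generate a signed URL to restrict access to an individual file
    aws cloudfront sign \
        --url https://d111111abcdef8.cloudfront.net/installer.exe \
        --key-pair-id APKAEXAMPLE \
        --private-key file://cf-private-key.pem \
        --date-less-than 2025-12-31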

AWS Data Migration Service – DMS

  • AWS Database Migration Service enables quick and secure data migration
  • AWS Database Migration Service helps migration to AWS with virtually no downtime. Source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.
  • AWS Database Migration Service can migrate data to and from most widely used commercial and open-source databases.
  • AWS Database Migration Service supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle or Microsoft SQL Server to Aurora.
  • AWS Database Migration Service enables continuous data replication with high availability and consolidate databases into a petabyte-scale data warehouse by streaming data to Redshift and S3.
  • AWS Database Migration Service is highly resilient and self-healing.
  • AWS DMS continually monitors source and target databases, network connectivity, and the replication instance.
  • In case of interruption, DMS automatically restarts the process and continues the migration from where it was halted.
  • DMS supports the Multi-AZ option to provide high availability for database migration and continuous data replication by enabling redundant replication instances.
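
A hedged CLI sketch of creating and starting a migration task, assuming the replication instance and the source/target endpoints already exist (all ARNs, identifiers, and the table-mappings file are placeholders):

    # Full load followed by ongoing change data capture (CDC) for continuous replication
    aws dms create-replication-task \
        --replication-task-identifier mydb-migration \
        --source-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:SRCEXAMPLE \
        --target-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:TGTEXAMPLE \
        --replication-instance-arn arn:aws:dms:us-east-1:123456789012:rep:RIEXAMPLE \
        --migration-type full-load-and-cdc \
        --table-mappings file://table-mappings.json

    aws dms start-replication-task \
        --replication-task-arn arn:aws:dms:us-east-1:123456789012:task:TASKEXAMPLE \
        --start-replication-task-type start-replication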

DMS Replication Instance

  • A DMS replication instance performs the actual data migration between source and target.
  • Replication instance also caches the transaction logs during the migration
  • CPU and memory capacity of the replication instance influences the overall time required for the migration.

AWS Schema Conversion Tool

  • AWS Schema Conversion Tool makes heterogeneous database migrations easier by automatically converting the source database schema and a majority of the database code objects, including views, stored procedures, and functions, to a format compatible with the target database.
  • DMS and SCT work in conjunction to both migrate databases and support ongoing replication for a variety of uses such as populating datamarts, synchronizing systems, etc.
  • SCT can copy database schemas for homogeneous migrations and convert them for heterogeneous migrations.
  • SCT clearly marks any objects that cannot be automatically converted so that they can be manually converted to complete the migration.
  • SCT can scan the application source code for embedded SQL statements and convert them as part of a database schema conversion project.
  • SCT performs cloud native code optimization by converting legacy Oracle and SQL Server functions to their equivalent AWS services, thus helping modernize the applications at the same time as the database migration.
  • Once schema conversion is complete, SCT can help migrate data from a range of data warehouses to Redshift using built-in data migration agents.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Which AWS service would simplify migration of a database to AWS?
    1. AWS Storage Gateway
    2. AWS Database Migration Service (AWS DMS)
    3. Amazon Elastic Compute Cloud (Amazon EC2)
    4. Amazon AppStream 2.0

AWS IoT Core

  • AWS IoT Core is a managed cloud platform that lets connected devices easily and securely interact with cloud applications and other devices.
  • AWS IoT Core can support billions of devices and trillions of messages, and can process and route those messages to AWS endpoints and to other devices reliably and securely.
  • AWS IoT Core allows the applications to keep track of and communicate with all the devices, all the time, even when they aren’t connected.
  • AWS IoT Core offers
    • Connectivity between devices and the AWS cloud.
      • AWS IoT Core allows communication with connected devices securely, with low latency and with low overhead.
      • Communication can scale to as many devices as needed.
      • AWS IoT Core supports standard communication protocols (HTTP, MQTT, and WebSockets are supported currently).
      • Communication is secured using TLS.
    • Processing data sent from connected devices.
      • AWS IoT Core can continuously ingest, filter, transform, and route the data streamed from connected devices.
      • Actions can be taken based on the data, and it can be routed for further processing and analytics.
    • Application interaction with connected devices.
      • AWS IoT Core accelerates IoT application development.
      • It serves as an easy to use interface for applications running in the cloud and on mobile devices to access data sent from connected devices, and send data and commands back to the devices.

How AWS IoT Core Works

  • Connected devices, such as sensors, actuators, embedded devices, smart appliances, and wearable devices, connect to AWS IoT Core over HTTPS, WebSockets, or secure MQTT.
  • Communication with AWS IoT Core is secure.
    • HTTPS and WebSockets requests sent to AWS IoT Core are authenticated using AWS IAM or Amazon Cognito, both of which support AWS SigV4 authentication.
    • HTTPS requests can also be authenticated using X.509 certificates.
    • MQTT messages to AWS IoT Core are authenticated using X.509 certificates.
    • AWS IoT Core allows using AWS IoT Core generated certificates, as well as those signed by your preferred Certificate Authority (CA).
  • AWS IoT Core also offers fine-grained authorization to isolate and secure communication among authenticated clients.

Device Gateway

  • Device Gateway forms the backbone of communication between connected devices and the cloud capabilities such as the Rules Engine, Device Shadow, and other AWS and 3rd-party services.
  • Device Gateway allows secure, low-latency, low-overhead, bi-directional communication between connected devices and cloud and mobile applications
  • Device Gateway supports the pub/sub messaging pattern, which involves clients publishing messages on logical communication channels called ‘topics’ and clients subscribing to topics to receive messages
  • Device gateway enables communication between publishers and subscribers
  • Device Gateway scales automatically as per the demand, without any operational overhead

Rules Engine

  • Rules Engine enables continuous processing of data sent by connected devices.
  • Rules can be configured to filter and transform the data using an intuitive, SQL-like syntax.
  • Rules can be configured to route the data to other AWS services such as DynamoDB, Kinesis, Lambda, SNS, SQS, CloudWatch, Elasticsearch Service with built-in Kibana integration, as well as to non-AWS services, via Lambda for further processing, storage, or analytics.
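
A sketch of the SQL-like rule syntax, creating a rule that routes hot sensor readings to a DynamoDB table (the rule name, topic filter, table name, and role ARN are all illustrative placeholders):

    # rule.json – note the SQL-like filter/transform
    {
      "sql": "SELECT temperature, timestamp() AS ts FROM 'sensors/+/data' WHERE temperature > 40",
      "actions": [{
        "dynamoDBv2": {
          "roleArn": "arn:aws:iam::123456789012:role/iot-dynamodb-role",
          "putItem": { "tableName": "SensorReadings" }
        }
      }]
    }

    aws iot create-topic-rule --rule-name HighTempToDynamoDB --topic-rule-payload file://rule.json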

Registry

  • Registry allows registering devices and keeping track of devices connected to AWS IoT Core, or devices that may connect in the future.

Device Shadow

  • Device Shadow enables cloud and mobile applications to query data sent from devices and send commands to devices, using a simple REST API, while letting AWS IoT Core handle the underlying communication with the devices.
  • Device Shadow accelerates application development by providing
    • a uniform interface to devices, even when they use one of the several IoT communication and security protocols with which the applications may not be compatible.
    • an always available interface to devices even when the connected devices are constrained by intermittent connectivity, limited bandwidth, limited computing ability or limited power.

Device and its Device Shadow Lifecycle

  • A device (such as a light bulb) is registered in the Registry.
  • Connected device is programmed to publish a set of its property values or ‘state’ (“I am ON and my color is RED”) to the AWS IoT Core service.
  • Device Shadow stores the last reported state in AWS IoT Core.
  • An application (such as a mobile app controlling the light bulb) uses a RESTful API to query AWS IoT Core for the last reported state of the light bulb, without the complexity of communicating directly with the light bulb
  • When a user wants to change the state (such as turning the light bulb from ON to OFF), the application uses a RESTful API to request an update, i.e. sets a ‘desired’ state for the device in AWS IoT Core. AWS IoT Core takes care of synchronizing the desired state to the device.
  • Application gets notified when the connected device updates its state to the desired state.
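
A hedged CLI sketch of the shadow interaction above (the thing name and payload are illustrative; AWS CLI v2 may additionally need --cli-binary-format raw-in-base64-out for the inline payload):

    # Application reads the last reported state without talking to the device
    aws iot-data get-thing-shadow --thing-name light-bulb shadow.json && cat shadow.json

    # Application sets a 'desired' state; AWS IoT Core syncs it to the device
    aws iot-data update-thing-shadow \
        --thing-name light-bulb \
        --payload '{"state": {"desired": {"power": "OFF"}}}' \
        out.json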

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You need to filter and transform incoming messages coming from a smart sensor you have connected with AWS. Once messages are received, you need to store them as time series data in DynamoDB. Which AWS service can you use?
    1. IoT Device Shadow Service (maintains device state)
    2. Redshift
    3. Kinesis (While Kinesis could technically be used as an intermediary between different sources, it isn’t a great way to get data into DynamoDB from an IoT device.)
    4. IoT Rules Engine

AWS Kinesis Data Streams vs SQS

Purpose

  • Amazon Kinesis Data Streams
    • allows real-time processing of streaming big data and the ability to read and replay records to multiple Amazon Kinesis Applications.
    • Amazon Kinesis Client Library (KCL) delivers all records for a given partition key to the same record processor, making it easier to build multiple applications that read from the same Amazon Kinesis stream (for example, to perform counting, aggregation, and filtering).
  • Amazon SQS
    • offers a reliable, highly-scalable hosted queue for storing messages as they travel between applications or microservices.
    • It moves data between distributed application components and helps decouple these components.
    • provides common middleware constructs such as dead-letter queues and poison-pill management.
    • provides a generic web services API and can be accessed by any programming language that the AWS SDK supports.
    • supports both standard and FIFO queues
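
A minimal sketch contrasting the two producers (stream/queue names and the payload are placeholders; AWS CLI v2 may need --cli-binary-format raw-in-base64-out for the inline Kinesis data):

    # Kinesis: records with the same partition key land on the same shard,
    # so they reach the same record processor in order and can be replayed
    aws kinesis put-record \
        --stream-name truck-telemetry \
        --partition-key truck-42 \
        --data '{"lat": 47.6, "lon": -122.3}'

    # SQS: the message is stored once and consumed (then deleted) by a single consumer
    aws sqs send-message \
        --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/telemetry-queue \
        --message-body '{"lat": 47.6, "lon": -122.3}'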

Scaling

  • Kinesis Data streams is not fully managed and requires manual provisioning and scaling by increasing shards
  • SQS is fully managed, highly scalable and requires no administrative overhead and little configuration

Ordering

  • Kinesis provides ordering of records, as well as the ability to read and/or replay records in the same order to multiple Kinesis Applications
  • SQS Standard Queue does not guarantee data ordering and provides at least once delivery of messages
  • SQS FIFO Queue guarantees data ordering within the message group

Data Retention Period

  • Kinesis Data Streams stores the data up to 24 hours, by default, and can be extended to 7 days
  • SQS stores the message up to 4 days, by default, and can be configured from 1 minute to 14 days but clears the message once deleted by the consumer

Delivery Semantics

  • Kinesis and SQS Standard Queue both guarantee at-least-once delivery of messages
  • SQS FIFO Queue guarantees exactly-once delivery

Parallel Clients

  • Kinesis supports multiple consumers
  • SQS allows the messages to be delivered to only one consumer at a time and requires multiple queues to deliver messages to multiple consumers
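
A sketch of the single-consumer, ack/delete semantics referred to above (the queue URL and receipt handle are placeholders):

    # Long poll for a message; it becomes invisible to other consumers
    # for the duration of the visibility timeout
    aws sqs receive-message \
        --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/telemetry-queue \
        --wait-time-seconds 20

    # "Ack" by deleting; if not deleted, the message reappears after the visibility timeout
    aws sqs delete-message \
        --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/telemetry-queue \
        --receipt-handle "AQEB...example..."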

Use Cases

  • Kinesis use cases requirements
    • Ordering of records.
    • Ability to consume records in the same order a few hours later
    • Ability for multiple applications to consume the same stream concurrently
    • Routing related records to the same record processor (as in streaming MapReduce)
  • SQS use cases requirements
    • Messaging semantics like message-level ack/fail and visibility timeout
    • Leveraging SQS’s ability to scale transparently
    • Dynamically increasing concurrency/throughput at read time
    • Individual message delay, where individual messages can be delayed by up to 15 minutes

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You are deploying an application to track GPS coordinates of delivery trucks in the United States. Coordinates are transmitted from each delivery truck once every three seconds. You need to design an architecture that will enable real-time processing of these coordinates from multiple consumers. Which service should you use to implement data ingestion?
    1. Amazon Kinesis
    2. AWS Data Pipeline
    3. Amazon AppStream
    4. Amazon Simple Queue Service
  2. Your customer is willing to consolidate their log streams (access logs, application logs, security logs etc.) in one single system. Once consolidated, the customer wants to analyze these logs in real time based on heuristics. From time to time, the customer needs to validate heuristics, which requires going back to data samples extracted from the last 12 hours. What is the best approach to meet your customer’s requirements?
    1. Send all the log events to Amazon SQS. Setup an Auto Scaling group of EC2 servers to consume the logs and apply the heuristics.
    2. Send all the log events to Amazon Kinesis, develop a client process to apply heuristics on the logs (Can perform real-time analysis and stores data for 24 hours, which can be extended to 7 days)
    3. Configure Amazon CloudTrail to receive custom logs, use EMR to apply heuristics the logs (CloudTrail is only for auditing)
    4. Setup an Auto Scaling group of EC2 syslogd servers, store the logs on S3 use EMR to apply heuristics on the logs (EMR is for batch analysis)

AWS Certified Solutions Architect – Associate SAA-C02 Exam Learning Path

AWS Solutions Architect – Associate SAA-C02 exam is the latest AWS exam that has replaced the previous SAA-C01 certification exam. It basically validates the ability to effectively demonstrate knowledge of how to architect and deploy secure and robust applications on AWS technologies

  • Define a solution using architectural design principles based on customer requirements.
  • Provide implementation guidance based on best practices to the organization throughout the life cycle of the project.

Refer AWS_Solution_Architect_-_Associate_SAA-C02_Exam_Blue_Print

AWS Solutions Architect – Associate SAA-C02 Exam Summary

  • SAA-C02 exam consists of 65 questions in 130 minutes, and the time is more than sufficient if you are well prepared.
  • SAA-C02 Exam covers the architecture aspects in depth, so you must be able to visualize the architecture, even draw it out in the exam, just to understand how it would work and how different services relate.
  • AWS has updated the exam concepts from the focus being on individual services to more on building scalable, highly available, cost-effective, performant, and resilient architectures.
  • If you had been preparing for the SAA-C01 –
    • SAA-C02 is pretty much similar to SAA-C01, except that the operationally excellent architectures domain has been dropped
    • Most of the services and concepts covered by SAA-C01 remain the same, with a few new additions like Aurora Serverless, AWS Global Accelerator, FSx for Windows, and FSx for Lustre
  • AWS exams are available online, and I took the online one. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time, try to join at least 30 minutes before the actual time.

AWS Solutions Architect – Associate SAA-C02 Exam Topics

Make sure you go through all the topics and focus on hints in italics

Networking

  • Be sure to create a VPC from scratch. This is mandatory. (a minimal CLI sketch follows this list)
    • Create a VPC and understand what a CIDR is and addressing patterns
    • Create public and private subnets, configure proper routes, security groups, NACLs. (hint: Subnets are public or private depending on whether they can route traffic directly through Internet gateway)
    • Create Bastion for communication with instances
    • Create NAT Gateway or Instances for instances in private subnets to interact with internet
    • Create two tier architecture with application in public and database in private subnets
    • Create three tier architecture with web servers in public, application and database servers in private. (hint: focus on security group configuration with least privilege)
    • Make sure to understand how the communication happens between Internet, Public subnets, Private subnets, NAT, Bastion etc.
  • Understand difference between Security Groups and NACLs (hint: Security Groups are Stateful vs NACLs are stateless. Also only NACLs provide an ability to deny or block IPs)
  • Understand VPC endpoints and which services they can help interact with (hint: VPC Endpoints route traffic internally without the Internet)
    • VPC Gateway Endpoints supports S3 and DynamoDB.
    • VPC Interface Endpoints OR Private Links supports others
  • Understand difference between NAT Gateway and NAT Instance (hint: NAT Gateway is AWS managed and is scalable and highly available)
  • Understand how NAT high availability can be achieved (hint: provision NAT in each AZ and route traffic from subnets within that AZ through that NAT Gateway)
  • Understand VPN and Direct Connect for on-premises to AWS connectivity
    • VPN provides quick connectivity, cost-effective, secure channel, however routes through internet and does not provide consistent throughput
    • Direct Connect provides consistent dedicated throughput without Internet, however requires time to setup and is not cost-effective
  • Understand Data Migration techniques
    • Choose Snowball vs Snowmobile vs Direct Connect vs VPN depending on the bandwidth available, data transfer needed, time available, encryption requirement, one-time or continuous requirement
    • Snowball, SnowMobile are for one-time data, cost-effective, quick and ideal for huge data transfer
    • Direct Connect, VPN are ideal for continuous or frequent data transfers
  • Understand CloudFront as a CDN and the static and dynamic caching it provides, and what can be its origin (hint: CloudFront can point to on-premises sources; know its use cases with S3 to reduce load and cost)
  • Understand Route 53 for routing
    • Understand Route 53 health checks and failover routing
    • Understand the routing policies Route 53 provides and their use cases, mainly for high availability (hint: focus on weighted, latency, geolocation, failover routing)
  • Be sure to cover ELB concepts in deep.
    • SAA-C02 focuses on ALB and NLB and does not cover CLB
    • Understand differences between  CLB vs ALB vs NLB
      • ALB is layer 7 while NLB is layer 4
      • ALB provides content based, host based, path based routing
      • ALB provides dynamic port mapping which allows multiple instances of the same task to be hosted on an ECS node
      • NLB provides low latency and ability to scale
      • NLB provides static IP address
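
A minimal CLI sketch of the VPC-from-scratch exercise at the top of this list (CIDRs and resource IDs are placeholders; in practice each ID comes from the previous command’s output):

    aws ec2 create-vpc --cidr-block 10.0.0.0/16
    aws ec2 create-subnet --vpc-id vpc-0abc --cidr-block 10.0.1.0/24    # public subnet
    aws ec2 create-subnet --vpc-id vpc-0abc --cidr-block 10.0.2.0/24    # private subnet
    aws ec2 create-internet-gateway
    aws ec2 attach-internet-gateway --internet-gateway-id igw-0abc --vpc-id vpc-0abc
    aws ec2 create-route-table --vpc-id vpc-0abc
    # a subnet is "public" because its route table sends 0.0.0.0/0 via the Internet gateway
    aws ec2 create-route --route-table-id rtb-0abc \
        --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0abc
    aws ec2 associate-route-table --route-table-id rtb-0abc --subnet-id subnet-0pub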

Security

  • Understand IAM as a whole
    • Focus on IAM role (hint: can be used for EC2 application access and Cross-account access)
    • Understand IAM identity providers and federation and use cases
    • Understand MFA and how would implement two factor authentication for an application
    • Understand IAM Policies (hint: expect a couple of questions with policies defined where you need to select the correct statements)
  • Understand encryption services
  • AWS WAF integrates with CloudFront to provide protection against Cross-site scripting (XSS) attacks. It also provides IP blocking and geo-protection.
  • AWS Shield integrates with CloudFront to provide protection against DDoS.
  • Refer Disaster Recovery whitepaper, be sure you know the different recovery types with impact on RTO/RPO.

Storage

  • Understand various storage options S3, EBS, Instance store, EFS, Glacier, FSx and what are the use cases and anti patterns for each
  • Instance Store
    • Understand Instance Store (hint: it is physically attached to the EC2 instance and provides the lowest latency and highest IOPS)
  • Elastic Block Storage – EBS
    • Understand various EBS volume types and their use cases in terms of IOPS and throughput. SSD for IOPS and HDD for throughput
    • Understand Burst performance and I/O credits to handle occasional peaks
    • Understand EBS Snapshots (hint: backups are automated, snapshots are manual)
  • Simple Storage Service – S3
    • Cover S3 in depth
    • Understand S3 storage classes with lifecycle policies
      • Understand the difference between S3 Standard vs S3 Standard-IA vs S3 One Zone-IA in terms of cost and durability
    • Understand S3 Data Protection (hint: S3 Client side encryption encrypts data before storing it in S3)
    • Understand S3 features including
      • S3 provides a cost effective static website hosting
      • S3 versioning provides protection against accidental overwrites and deletions
      • S3 Pre-Signed URLs for both upload and download provide access without needing AWS credentials (see the sketch after this list)
      • S3 CORS allows cross domain calls
      • S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket.
    • Understand Glacier as an archival storage with various retrieval patterns
    • Glacier Expedited retrieval now allows object retrieval within minutes
  • Understand Storage gateway and its different types.
    • Cached Volume Gateway provides access to frequently accessed data, while using AWS as the actual storage
    • Stored Volume gateway uses AWS as a backup, while the data is being stored on-premises as well
    • File Gateway supports SMB protocol
  • Understand FSx, which makes it easy and cost-effective to launch and run popular file systems.
  • Understand the difference between EBS vs S3 vs EFS
    • EFS provides a shared volume across multiple EC2 instances, while EBS can be attached to a single instance within the same AZ.
  • Understand the difference between EBS vs Instance Store
  • Would recommend referring to the Storage Options whitepaper; although a bit dated, 90% of it still holds true
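
As an example of the S3 pre-signed URL point above, a one-line sketch (bucket, key, and expiry are placeholders):

    # Grants anyone with the URL time-limited download access, without AWS credentials
    aws s3 presign s3://my-bucket/reports/report.pdf --expires-in 3600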

Compute

  • Understand Elastic Cloud Compute – EC2
  • Understand Auto Scaling and ELB, how they work together to provide High Available and Scalable solution. (hint: Span both ELB and Auto Scaling across Multi-AZs to provide High Availability)
  • Understand EC2 Instance Purchase Types – Reserved, Scheduled Reserved, On-demand and Spot and their use cases
    • Choose Reserved Instances for continuous persistent load
    • Choose Scheduled Reserved Instances for load with fixed scheduled and time interval
    • Choose Spot instances for fault tolerant and Spiky loads
    • Reserved instances provides cost benefits for long terms requirements over On-demand instances
    • Spot instances provides cost benefits for temporary fault tolerant spiky load
  • Understand EC2 Placement Groups (hint: Cluster placement groups provide low latency and high throughput communication, while Spread placement group provides high availability)
  • Understand Lambda and serverless architecture, its features and use cases. (hint: Lambda integrated with API Gateway to provide a serverless, highly scalable, cost-effective architecture)
  • Understand ECS with its ability to deploy containers and micro services architecture.
    • ECS role for tasks can be provided through taskRoleArn
    • ALB provides dynamic port mapping to allow multiple copies of the same task on the same node
  • Know Elastic Beanstalk at a high level, what it provides and its ability to get an application running quickly.

Databases

  • Understand relational and NoSQL data storage options, which include RDS, DynamoDB, Aurora, and their use cases
  • RDS
    • Understand RDS features – Read Replicas vs Multi-AZ
      • Read Replicas for scalability, Multi-AZ for High Availability
      • Multi-AZ is within a region only
      • Read Replicas can span across regions and can be used for disaster recovery
    • Understand Automated Backups, underlying volume types
  • Aurora
    • Understand Aurora
      • provides multiple read replicas and replicates 6 copies of data across AZs
    • Understand Aurora Serverless provides a highly scalable cost-effective database solution
  • DynamoDB
    • Understand DynamoDB with its low latency performance, key-value store (hint: DynamoDB is not a relational database)
    • DynamoDB DAX provides caching for DynamoDB
    • Understand DynamoDB provisioned throughput for Read/Writes (covered more in the Developer exam though)
  • Know ElastiCache use cases, mainly for caching performance

Integration Tools

  • Understand SQS as message queuing service and SNS as pub/sub notification service
  • Understand SQS features like visibility, long poll vs short poll
  • Focus on SQS as a decoupling service
  • Understand SQS Standard vs SQS FIFO difference (hint: FIFO provides exactly-once delivery but lower throughput)
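
A quick sketch of the Standard vs FIFO distinction via queue creation (queue names are placeholders; FIFO queue names must end in .fifo):

    # Standard: at-least-once delivery, best-effort ordering, nearly unlimited throughput
    aws sqs create-queue --queue-name orders

    # FIFO: exactly-once processing, ordering per message group, lower throughput
    aws sqs create-queue --queue-name orders.fifo \
        --attributes FifoQueue=true,ContentBasedDeduplication=true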

Analytics

  • Know Redshift as a business intelligence tool
  • Know Kinesis for real time data capture and analytics
  • At least know what AWS Glue does, so you can eliminate the answer

Management Tools

  • Understand CloudWatch monitoring to provide operational transparency
  • Know which EC2 metrics it can track. Remember, it cannot track memory and disk space/swap utilization
  • Understand CloudWatch is extendable with custom metrics
  • Understand CloudTrail for Audit
  • Have a basic understanding of CloudFormation, OpsWorks

AWS Solutions Architect – Associate SAA-C02 Exam Resources

AWS Whitepapers & Cheat sheets

AWS Solutions Architect – Associate Exam Domains

Domain 1: Design Resilient Architectures

  1. Design a multi-tier architecture solution
  2. Design highly available and/or fault-tolerant architectures
  3. Design decoupling mechanisms using AWS services
  4. Choose appropriate resilient storage

Domain 2: Define High-Performing Architectures

  1. Identify elastic and scalable compute solutions for a workload
  2. Select high-performing and scalable storage solutions for a workload
  3. Select high-performing networking solutions for a workload
  4. Choose high-performing database solutions for a workload

Domain 3: Specify Secure Applications and Architectures

  1. Design secure access to AWS resources
  2. Design secure application tiers
  3. Select appropriate data security options

Domain 4: Design Cost-Optimized Architectures

  1. Determine how to design cost-optimized storage.
  2. Determine how to design cost-optimized compute.

AWS FSx for Lustre

  • Amazon FSx for Lustre is a fully managed service that makes it easy and cost-effective to launch and run the world’s most popular high-performance computing (HPC) file system, Lustre.
  • Lustre is an open source file system designed for applications that require fast storage – where you want your storage to keep up with your compute
  • FSx handles the traditional complexity of setting up and managing high-performance Lustre file systems
  • FSx for Lustre is ideal for use cases where speed matters, such as machine learning, high performance computing (HPC), video processing, financial modeling, genome sequencing, and electronic design automation (EDA)
  • Amazon FSx provides multiple deployment options to optimize cost
    • Scratch file systems
      • designed for temporary storage and short-term processing of data.
      • data is not replicated and does not persist if a file server fails.
    • Persistent file systems
      • designed for long-term storage and workloads.
      • is highly available, and data is automatically replicated within the AZ that is associated with the file system.
      • data volumes attached to the file servers are replicated independently from the file servers to which they are attached.
  • FSx for Lustre is compatible with the most popular Linux-based AMIs, including Amazon Linux, Amazon Linux 2, Red Hat Enterprise Linux (RHEL), CentOS, SUSE Linux and Ubuntu.
  • FSx for Lustre can be accessed from a Linux instance by installing the open-source Lustre client and mounting the file system using standard Linux commands (see the sketch below).
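
A hedged sketch of mounting from a Linux instance as described above (the file system DNS name and mount name are placeholders; the Lustre client package and version vary by distribution):

    # Install the open-source Lustre client (Amazon Linux 2 example)
    sudo amazon-linux-extras install -y lustre2.10

    # Mount the file system using standard Linux commands
    sudo mkdir -p /fsx
    sudo mount -t lustre -o noatime,flock \
        fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com@tcp:/mountname /fsx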

FSx for Lustre with S3

  • Amazon FSx also integrates seamlessly with S3, making it easy to process cloud data sets with the Lustre high-performance file system.
  • Amazon FSx for Lustre file system transparently presents S3 objects as files and allows writing changed data back to S3.
  • Amazon FSx for Lustre file system can be linked with a specified S3 bucket, making the data in the S3 accessible to the file system.
  • S3 objects’ names and prefixes will be visible as files and directories
  • Amazon S3 objects are lazy loaded by default.
    • Objects are loaded into the file system only when first accessed by the applications.
    • Amazon FSx for Lustre automatically loads the corresponding objects from S3 when accessed
    • Subsequent reads of these files are served directly out of the file system with low, consistent latencies.
    • Amazon FSx for Lustre file system can optionally be batch hydrated (objects preloaded in advance)
  • Amazon FSx for Lustre uses parallel data transfer techniques to transfer data from S3 at up to hundreds of GBs/s.
  • Files from the file system can be exported back to the S3 bucket
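
A sketch of exporting changed files back to the linked bucket with a data repository task (the file system ID and path are placeholders; the completion report is disabled for brevity):

    aws fsx create-data-repository-task \
        --file-system-id fs-0123456789abcdef0 \
        --type EXPORT_TO_REPOSITORY \
        --paths results/ \
        --report Enabled=false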

FSx for Lustre Security

  • FSx for Lustre provides encryption at rest for the file system and the backups, by default, using KMS
  • FSx encrypts data-in-transit when accessed from supported EC2 instances only

FSx for Lustre Scalability

  • Amazon FSx for Lustre file systems scale to hundreds of GB/s of throughput and millions of IOPS.
  • FSx for Lustre also supports concurrent access to the same file or directory from thousands of compute instances.
  • FSx for Lustre provides consistent, sub-millisecond latencies for file operations.

FSx for Lustre Availability and Durability

  • On a scratch file system, file servers are not replaced if they fail and data is not replicated.
  • On a persistent file system, if a file server becomes unavailable it is replaced automatically and within minutes.
  • Amazon FSx for Lustre provides a parallel file system, where data is stored across multiple network file servers to maximize performance and reduce bottlenecks, and each server has multiple disks.
  • Amazon FSx takes daily automatic incremental backups of the file systems, and allows manual backups at any point.
  • Backups are highly durable and file-system-consistent

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A solutions architect is designing storage for a high performance computing (HPC) environment based on Amazon Linux. The workload stores and processes a large amount of engineering drawings that require shared storage and heavy computing. Which storage option would be the optimal solution?
    1. Amazon Elastic File System (Amazon EFS)
    2. Amazon FSx for Lustre
    3. Amazon EC2 instance store
    4. Amazon EBS Provisioned IOPS SSD (io1)

AWS FSx for Windows

  • Amazon FSx for Windows File Server provides fully managed, highly reliable, and scalable file storage that is accessible over the industry-standard Server Message Block (SMB) protocol.
  • Built on Windows Server, delivering a wide range of administrative features such as user quotas, end-user file restore, ACLs and Microsoft Active Directory (AD) integration.
  • Amazon FSx provides high levels of throughput and IOPS, and consistent sub-millisecond latencies.
  • Amazon FSx is accessible from Windows, Linux, and MacOS compute instances and devices.
  • Amazon FSx provides concurrent access to the file system to thousands of compute instances and devices
  • Amazon FSx can connect the file system to EC2, VMware Cloud on AWS, Amazon WorkSpaces, and Amazon AppStream 2.0 instances.
  • Integrated with CloudWatch to monitor storage capacity and file system activity
  • Integrated with CloudTrail to monitor all Amazon FSx API calls
  • Amazon FSx was designed for use cases that require Windows shared file storage, like CRM, ERP, custom or .NET applications, home directories, data analytics, media and entertainment workflows, web serving and content management, software build environments, and Microsoft SQL Server.
  • Amazon FSx file systems are accessible from the on-premises environment using an AWS Direct Connect or AWS VPN connection
  • Amazon FSx is accessible from multiple VPCs, AWS accounts, and AWS Regions using VPC Peering connections or AWS Transit Gateway
  • Amazon FSx provides consistent sub-millisecond latencies with SSD storage, and single-digit millisecond latencies with HDD storage
  • Amazon FSx supports Microsoft’s Distributed File System (DFS) to organize shares into a single folder structure up to hundreds of PB in size
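
A hedged sketch of attaching the share from a domain-joined Windows instance over SMB (the file system DNS name and share name are placeholders):

    REM Windows command prompt: map the SMB file share to a drive letter
    net use Z: \\fs-0123456789abcdef0.corp.example.com\share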

FSx for Windows Security

  • Amazon FSx works with Microsoft Active Directory (AD) to integrate with existing Windows environments, which can either be an AWS Managed Microsoft AD or a self-managed Microsoft AD
  • Amazon FSx provides standard Windows permissions (full support for Windows Access Control Lists (ACLs)) for files and folders.
  • Amazon FSx for Windows File Server supports encryption at rest for the file system and backups using KMS managed keys
  • Amazon FSx encrypts data-in-transit using SMB Kerberos session keys, when accessing the file system from clients that support SMB 3.0
  • Amazon FSx supports file-level or folder-level restores to previous versions by supporting Windows shadow copies, which are snapshots of your file system at a point in time
  • Amazon FSx supports Windows shadow copies to enable your end-users to easily undo file changes and compare file versions by restoring files to previous versions, and backups to support your backup retention and compliance needs.

FSx for Windows Availability and durability

  • Amazon FSx automatically replicates the data within an Availability Zone (AZ) to protect it from component failure.
  • Amazon FSx continuously monitors for hardware failures, and automatically replaces infrastructure components in the event of a failure.
  • Amazon FSx supports Multi-AZ deployment
    • automatically provisions and maintains a standby file server in a different Availability Zone.
    • Any changes written to disk in the file system are synchronously replicated across AZs to the standby.
    • helps enhance availability during planned system maintenance
    • helps protect the data against instance failure and AZ disruption.
    • In the event of planned file system maintenance or unplanned service disruption, Amazon FSx automatically fails over to the secondary file server, allowing data accessibility without manual intervention.
  • Amazon FSx supports automatic backups of the file systems, which are incremental storing only the changes after the most recent backup
  • Amazon FSx stores backups in Amazon S3.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A data processing facility wants to move a group of Microsoft Windows servers to the AWS Cloud. These servers require access to a shared file system that can integrate with the facility’s existing Active Directory (AD) infrastructure for file and folder permissions. The solution needs to provide seamless support for shared files with AWS and on-premises servers and allow the environment to be highly available. The chosen solution should provide added security by supporting encryption at rest and in transit. The solution should also be cost-effective to implement and manage. Which storage solution would meet these requirements?
    1. An AWS Storage Gateway file gateway joined to the existing AD domain
    2. An Amazon FSx for Windows File Server file system joined to the existing AD domain
    3. An Amazon Elastic File System (Amazon EFS) file system joined to an AWS managed AD domain
    4. An Amazon S3 bucket mounted on Amazon EC2 instances in multiple Availability Zones running Windows Server and joined to an AWS managed AD domain

AWS S3 vs EBS vs EFS

Amazon EFS, Amazon EBS, and Amazon S3 are AWS’ three different storage types that are applicable for different types of workload needs

Amazon S3

    • is an object store with a simple key-value design, good at storing vast numbers of backups or user files.
    • offers pay for the storage you actually use, with cost-saving storage classes ideal for infrequently accessed data or for data archival
    • provides unlimited storage
    • provides durability, as the data is replicated and stored across at least three geographically-dispersed AZs, with a maximum of 99.999999999% (11 9’s)
    • provides high availability with a maximum of 99.99%
    • provides security with a range of access control mechanisms and the ability to encrypt data at rest and in transit
    • data can be accessed programmatically or directly from services such as Amazon CloudFront.
    • provides backup capability using versioning and cross-region replication

Amazon EBS

    • delivers high-availability block-level storage volumes for Amazon Elastic Compute Cloud (EC2) instances.
    • offers pay for the provisioned storage, even if you do not use it
    • provides limited storage capability and cannot scale infinitely
    • stores data in volumes that can be retained after the EC2 instance is shut down.
    • provides durability by replicating data across multiple servers in an AZ to prevent the loss of data from the failure of any single component
    • designed for 99.999% availability
    • provides low-latency performance – using SSD EBS volumes, it offers reliable I/O performance scaled to meet your workload needs.
    • provides secure storage with access control and providing data at rest and in transit encryption
    • is only accessible from a single EC2 instance in a particular AWS region and AZ
    • provides backup capability using backups and snapshots

Amazon EFS

    • scalable file storage, also optimized for EC2.
    • offers pay for the storage you actually use. There’s no advance provisioning, up-front fees, or commitments
    • multiple instances can be configured to mount the file system.
    • allows mounting the file system across multiple AZs and instances.
    • is designed to be highly durable and highly available. Data is redundantly stored across multiple AZs.
    • provides elasticity – scales up and down automatically, even to meet the most abrupt workload spikes.
    • provides performance that scales to support any workload: EFS offers the throughput changing workloads need. It can provide higher throughput in spurts that match sudden file system growth, even for workloads up to 500,000 IOPS or 10 GB per second.
    • provides accessible file storage, which can be accessed by On-premises servers and EC2 instances concurrently.
    • provides security and compliance – access to the file system can be secured with the current security solution, or control access to EFS file systems using IAM, VPC, or POSIX permissions.
    • provides data encryption in transit or at rest.
    • allows EC2 instances to access EFS file systems located in other AWS regions through VPC peering.
    • a file system can be accessed concurrently from all AZs in the region where it is located, which means the application can be architected to failover from one AZ to other AZs in the region in order to ensure the highest level of application availability. Mount targets themselves are designed to be highly available.
    • used as a common data source for any application or workload that runs on numerous instances.
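
A sketch of the shared-mount property that differentiates EFS; the same command can be run on many instances concurrently (the file system DNS name and region are placeholders; the amazon-efs-utils mount helper is an alternative):

    sudo mkdir -p /mnt/efs
    # Mount EFS over NFSv4.1
    sudo mount -t nfs4 -o nfsvers=4.1 \
        fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs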

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company runs an application on a group of Amazon Linux EC2 instances. The application writes log files using standard API calls. For compliance reasons, all log files must be retained indefinitely and will be analyzed by a reporting tool that must access all files concurrently. Which storage service should a solutions architect use to provide the MOST cost-effective solution?
    1. Amazon EBS
    2. Amazon EFS
    3. Amazon EC2 instance store
    4. Amazon S3
  2. A new application is being deployed on Amazon EC2. The application needs to read/write up to 3 TB of data to an external data store and requires read-after-write consistency across all AWS regions for writing new objects into this data store. Which data store should be used?
    1. Amazon EBS
    2. Amazon Glacier
    3. Amazon EFS
    4. Amazon S3
  3. To meet the requirements of an application, an organization needs to save a constantly increasing volume of files on a cloud storage system with the following features and abilities. Which AWS service below will meet these requirements?
      1. Pay only for the storage used
      2. Create different security policies for different groups of files
      3. Allow access to the public
      4. Retrieve the files at any time
      5. Store an unlimited number of files
    1. Amazon EBS
    2. Amazon S3
    3. Amazon Glacier
    4. Amazon EFS

Certified Kubernetes Administrator CKA Learning Path

The journey in the container world continues with the Certified Kubernetes Administrator CKA certification, which I cleared recently with 90%. After knowing how to use Kubernetes, it was really interesting and intriguing to know Kubernetes internals and how the overall system works.

  • CKA focuses on the skills required to be a successful Kubernetes Administrator 
  • CKA is an open book test, where you have access to the official Kubernetes documentation during the exam, but it focuses more on hands-on experience.
  • Unlike AWS and GCP certifications, you are required to solve and debug actual problems and provision resources on a Kubernetes cluster
  • Even though it is an open book test, you need to know where the information is.

CKA Exam Pattern and Tips

    • CKA requires you to solve 24 questions in 3 hours.
    • CKA exam curriculum includes these general domains and their weights on the exam:
      • Application Lifecycle Management – 8%
      • Installation, Configuration & Validation – 12%
      • Core Concepts – 19%
      • Networking – 11%
      • Scheduling – 5%
      • Security – 12%
      • Cluster Maintenance – 11%
      • Logging / Monitoring – 5%
      • Storage – 7%
      • Troubleshooting – 10%
    • I was not stretched for time in CKA, as compared to CKAD; I was through with my first pass in 90 minutes, took the next 30 minutes to review, and was done with the exam in 2 hours. However, I skipped a question weighing 8%, but that shouldn’t have had a huge impact.
    • Exam questions can be attempted in any order and don’t have to be sequential.
    • Each exam question carries a weight, so be sure to attempt the questions with higher weights and quicker solutions, like the debugging ones, before focusing on the lower ones.
    • 6-8 different K8s clusters are provisioned. Each question refers to a different Kubernetes cluster, and the context needs to be switched. Be sure to execute the kubectl config use-context command, which is available with every question and you just need to copy-paste it.
    • Check for the namespace mentioned in the question to find and create resources. Use the -n <namespace> flag.
    • You would be performing most of the interaction from the base node. However, pay attention to check which node you need to execute the commands on, and make sure you return to the base node.
    • SSH to nodes and gaining root access is allowed, if needed. Commands are provided. Make sure you use the sudo -i command for running docker commands.
    • Read carefully the information provided within the questions with the note mark. It provides very useful hints for addressing the question and saves time, e.g. which namespaces to look into, or, for a failed pod, what has already been created (configmap, secrets, network policies) so that you do not create the same again.
    • CKA was already upgraded to use the k8s 1.18 version, and kubectl run commands no longer create deployments; use kubectl create deployment instead.
    • Make sure you know the imperative commands to create resources, as you won’t have time to create and edit yaml files.
    • If you need to edit further, use --dry-run -o yaml to get a head start with the spec yaml file and edit the same.
    • I personally use alias kk=kubectl to avoid typing kubectl
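
A sketch of the imperative workflow the tips above describe (the context, names, and namespace are illustrative):

    # Switch to the cluster context given in the question (copy-paste it)
    kubectl config use-context k8s-cluster-2

    # Create resources imperatively instead of hand-writing YAML
    kubectl create deployment web --image=nginx -n exam-ns

    # Get a head-start YAML to edit further (newer kubectl expects --dry-run=client)
    kubectl create deployment web --image=nginx -n exam-ns --dry-run=client -o yaml > web.yaml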

CKA Learning Path

CKA Key Topics

Application Lifecycle Management

Installation, Configuration & Validation

Core Concepts

Networking

Scheduling

Security

Cluster Maintenance

Logging / Monitoring

  • Understand and know how to monitor all cluster components, applications, cluster and application logs
  • Know resource usage monitoring, as you would need to check resource usage using the kubectl top command
  • Know how to debug running pods using the kubectl logs command

Storage

Troubleshooting

  • Practice Debug application for troubleshooting application failures
  • Practice Debug cluster for troubleshooting control plane failure and worker node failure.
    • Understand the control plane architecture.
    • Focus on kube-apiserver and the static pod config, which causes the control plane pods to be picked up and deployed.
    • Check pods in kube-system if they are all running. Use docker ps -a command on the node to inspect the reason for exiting containers.
    • Check the kubelet service if the worker node is shown as NotReady
  • Troubleshoot networking
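
A sketch of the debugging loop for the control plane and worker node checks above (pod and node names are placeholders):

    # Is the control plane healthy?
    kubectl get pods -n kube-system
    kubectl logs kube-apiserver-master-node -n kube-system

    # On a worker node shown as NotReady (after SSHing in):
    sudo -i
    systemctl status kubelet     # is kubelet running?
    journalctl -u kubelet        # kubelet logs
    docker ps -a                 # inspect exited containers for the failure reason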

General information and practices

  • You can book the exam from CNCF CKA Certification @ $300. Avail of the limited-time 30% discount coupon if one is available.
  • Exam can be taken online from anywhere.
  • Make sure you have prepared your workspace well before the exams.
  • Make sure you have a valid government issued ID card as it would be checked.
  • You are not allowed to have anything around you and no one should enter the room.
  • Exam proctor will be watching you always, so refrain from doing any other activities. Your screen is also always shared.
  • I did not have any warnings with the Proctor, except for a request to have camera focused.
  • You would need to install a Google Chrome plugin and the exam provides a web based shell to work on which worked quite well without any glitches. Copy + Paste works fine.
  • You will have an online notepad on the right corner to note things down. I hardly used it, but it can be good for typing and modifying text instead of using the VI editor.

All the Best ..

Certified Kubernetes Application Developer CKAD Learning Path

Finally moving a bit away from the Clouds (AWS and GCP) and as my involvement grew more with Kubernetes, I decided to challenge myself for the Kubernetes certification. I started with the Certified Kubernetes Application Developer and am happy to share that I cleared it in the first attempt with 84%.

  • CKAD is more of an open book test, where you have access to the official Kubernetes documentation during the exam, but it focuses more on hands-on experience.
  • CKAD focuses on “Using a Kubernetes cluster once already provisioned”
  • Unlike AWS and GCP certifications, you would be required to solve, debug actual problems and provision resources on a live Kubernetes cluster.
  • It is surely one of the most challenging exams I have appeared for in recent times.
  • Even though it is an open book test, you need to know where the information is.
  • Trust me, if you are not prepared, the time is not going to be sufficient.

CKAD Exam Pattern and Tips

    • CKAD requires you to solve 19 questions in 2 hours.
    • CKAD exam curriculum includes these general domains and their weights on the exam:
      • 13% – Core Concepts
      • 18% – Configuration
      • 10% – Multi-Container Pods
      • 18% – Observability
      • 20% – Pod Design
      • 13% – Services & Networking
      • 8% – State Persistence
    • Exam questions can be attempted in any order and don’t have to be sequential.
    • Each exam question carries a weight, so be sure to attempt the questions with higher weights and quicker solutions, like the debugging ones, before focusing on the lower ones.
    • 4 different K8s clusters are provisioned. Each question refers to a different Kubernetes cluster, and the context needs to be switched. So be sure to execute the kubectl config use-context command. This command is available with every question and you just need to copy-paste it.
    • Check for the namespace mentioned in the question, to find resources and create resources. Use the -n <namespace>
    • You would be performing most of the interaction from the base node. However, pay attention to check which node you need to execute the commands on, and make sure you return to the base node.
    • SSH to nodes and gaining root access is allowed, if needed. Commands are provided.
    • Read carefully the information provided within the questions with the note mark. It provides very useful hints for addressing the question and saves time, e.g. which namespaces to look into, or, for a failed pod, what has already been created (configmap, secrets, network policies) so that you do not create the same again.
    • Make sure you know the imperative commands to create resources, as you won’t have time to create and edit yaml files. kubectl run with the --restart flag is your saviour.

kubectl commands

    • If you need to edit further, use --dry-run -o yaml to get a head-start yaml file and edit the same (see the sketch below).
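
A sketch of the dry-run workflow and the --restart flag trick mentioned above (names are illustrative; on the kubectl version used at the time, the --restart flag selected the resource type created):

    # --restart=Always creates a Deployment, Never a Pod, OnFailure a Job
    kubectl run web --image=nginx --restart=Never --dry-run -o yaml > pod.yaml

    # Edit pod.yaml as needed, then create the resource
    kubectl apply -f pod.yaml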

CKAD Learning Path

CKAD Key Topics

General information and practices

  • You can book the exam from CNCF CKAD Certification @ $300. Usually you would get discount coupons for 15-20% off the exam.
  • Exam can be taken online from anywhere.
  • Make sure you have prepared your workspace well before the exams.
  • Make sure you have a valid government issued ID card as it would be checked.
  • You are not allowed to have anything around you and no one should enter the room.
  • Exam proctor will be watching you always, so refrain from doing any other activities. Your screen is also always shared.
  • I did not have any warnings with the Proctor, except for a request to have camera focused.
  • You would need to install a Google Chrome plugin and the exam provides a web based shell to work on which worked quite well without any glitches. Copy + Paste works fine.
  • You will have an online notepad on the right corner to note things down. I hardly used it, but it can be good for typing and modifying text instead of using the VI editor.