Amazon RDS Cross-Region Read Replicas

RDS Cross-Region Read Replicas

  • RDS Cross-Region Read Replicas create an asynchronously replicated read-only DB instance in a secondary AWS Region.
  • Supported for MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server.
  • Cross-Region Read Replicas help to
    • improve disaster recovery capabilities (reducing RTO and RPO),
    • scale read operations into a Region closer to end users, and
    • ease migration from a data center in one Region to another.
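A cross-Region replica is requested through the RDS `CreateDBInstanceReadReplica` API. A minimal boto3-style sketch of the parameter shape, with hypothetical identifiers (the actual call would be `rds.create_db_instance_read_replica(**params)` on a client in the destination Region):

```python
# Build parameters for creating a cross-Region read replica; all identifiers
# here are hypothetical. For cross-Region creation the source must be
# referenced by its full ARN, and SourceRegion lets boto3 presign the
# source-Region request on your behalf.
def cross_region_replica_params(source_arn, replica_id, source_region):
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_arn,
        "SourceRegion": source_region,
    }

params = cross_region_replica_params(
    "arn:aws:rds:us-east-1:123456789012:db:prod-mysql",
    "prod-mysql-replica-west",
    "us-east-1",
)
```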

RDS Cross-Region Read Replicas Process

  • RDS configures the source DB instance as a replication source and sets up the specified read replica in the destination AWS Region.
  • RDS creates an automated DB snapshot of the source DB instance in the source AWS Region.
  • RDS begins a cross-Region snapshot copy for the initial data transfer.
  • RDS then uses the copied DB snapshot for the initial data load on the read replica. When the load is complete, the DB snapshot copy is deleted.
  • RDS then replicates the changes made to the source DB instance since the start of the create-read-replica operation.

RDS Cross-Region Read Replicas Considerations

  • A source DB instance can have cross-region read replicas in multiple AWS Regions.
  • Replica lag is higher for cross-Region replicas; the extra lag comes from the longer network path between regional data centers.
  • RDS can’t guarantee more than five cross-Region read replica instances, due to the limit on the number of access control list (ACL) entries for a VPC.
  • Read Replica uses the default DB parameter group and DB option group for the specified DB engine when configured from AWS console.
  • Read Replica uses the default security group.
  • Cross-Region RDS read replica can be created from a source RDS DB instance that is not a read replica of another RDS DB instance for Microsoft SQL Server, Oracle, and PostgreSQL DB instances. This limitation doesn’t apply to MariaDB and MySQL DB instances.
  • Deleting the source for a cross-region read replica will result in
    • read replica promotion for MariaDB, MySQL, and Oracle DB instances
    • no read replica promotion for PostgreSQL DB instances and the replication status of the read replica is set to terminated.
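The replica lag noted above surfaces as the CloudWatch `ReplicaLag` metric, reported in seconds. A sketch of the `GetMetricStatistics` parameters, with a hypothetical instance name; the actual call would be `cloudwatch.get_metric_statistics(**query)`:

```python
from datetime import datetime, timedelta, timezone

# Build a CloudWatch GetMetricStatistics query for ReplicaLag over the last
# N minutes; the DB instance identifier is hypothetical.
def replica_lag_query(db_instance_id, minutes=15):
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "ReplicaLag",  # seconds the replica is behind its source
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,
        "Statistics": ["Average"],
    }

query = replica_lag_query("prod-mysql-replica-west")
```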

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be.
  • Open to further feedback, discussion and correction.
  1. Your company has HQ in Tokyo and branch offices worldwide and is using logistics software with a multi-regional deployment on AWS in Japan, Europe, and the US. The logistics software has a 3-tier architecture and uses MySQL 5.6 for data persistence. Each region has deployed its own database. In the HQ region, you run an hourly batch process reading data from every region to compute cross-regional reports that are sent by email to all offices. This batch process must be completed as fast as possible to optimize logistics quickly. How do you build the database architecture to meet the requirements?
    1. For each regional deployment, use RDS MySQL with a master in the region and a read replica in the HQ region
    2. For each regional deployment, use MySQL on EC2 with a master in the region and send hourly EBS snapshots to the HQ region
    3. For each regional deployment, use RDS MySQL with a master in the region and send hourly RDS snapshots to the HQ region
    4. For each regional deployment, use MySQL on EC2 with a master in the region and use S3 to copy data files hourly to the HQ region

AWS RDS Replication – Multi-AZ vs Read Replica

RDS Multi-AZ vs Read Replica

Replicas of RDS DB instances can be created in two ways, Multi-AZ and Read Replica, which provide high availability, durability, and scalability to RDS.

Purpose

  • Multi-AZ DB Instance deployments provide high availability, durability, and automatic failover support.
  • Read Replicas enable increased scalability and database availability in the case of an AZ failure, and allow elastic scaling beyond the capacity constraints of a single DB instance for read-heavy database workloads.

Region & Availability Zones

  • RDS Multi-AZ deployment automatically provisions and manages a standby instance in a different AZ (independent infrastructure in a physically separate location) within the same AWS region.
  • RDS Read Replicas can be provisioned within the same AZ, Cross-AZ or even as a Cross-Region replica.

Replication Mode

  • RDS Multi-AZ deployment manages a synchronous standby instance in a different AZ.
  • RDS Read Replicas have the data replicated asynchronously from the primary instance to the read replicas.
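The two modes are also requested differently: Multi-AZ is a flag on the instance itself, while a read replica is a separate instance created from a source. A hedged sketch of the parameter shapes, with hypothetical identifiers, mirroring `rds.create_db_instance(**multi_az_params)` and `rds.create_db_instance_read_replica(**read_replica_params)`:

```python
# Multi-AZ: one logical instance with a synchronous standby in another AZ
# of the same Region. Values below are illustrative only.
multi_az_params = {
    "DBInstanceIdentifier": "prod-mysql",
    "Engine": "mysql",
    "DBInstanceClass": "db.m5.large",
    "MasterUsername": "admin",
    "AllocatedStorage": 100,
    "MultiAZ": True,  # provisions and manages the standby automatically
}

# Read replica: a distinct, readable instance fed by asynchronous replication.
read_replica_params = {
    "DBInstanceIdentifier": "prod-mysql-replica",
    "SourceDBInstanceIdentifier": "prod-mysql",
}
```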

Standby Instance can Accept Reads

  • Multi-AZ DB instance deployment is a high-availability solution; the standby instance does not serve read or write traffic.
  • Read Replica deployment provides readable instances to increase application read-throughput.

Automatic Failover & Failover Time

  • Multi-AZ DB instance deployment performs an automatic failover to the standby instance without administrative intervention; failover time can be up to 120 seconds, depending on crash recovery. Failover is triggered by:
    • planned database maintenance,
    • software patching,
    • rebooting the primary instance with failover,
    • primary DB instance connectivity or host failure, or
    • an Availability Zone failure.
  • RDS maintains the same endpoint for the DB Instance after a failover, so the application can resume database operation without the need for manual administrative intervention.
  • Read Replica deployment does not provide automatic failover. Read Replica instance needs to be manually promoted to a Standalone instance.
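The manual promotion above maps to the RDS `PromoteReadReplica` API; a sketch of its parameters (`rds.promote_read_replica(**params)`), with a hypothetical identifier and illustrative retention:

```python
# Promote a read replica to a standalone instance. Promotion also allows
# enabling automated backups on the newly standalone instance by setting
# a non-zero retention period.
def promote_params(replica_id, backup_retention_days=7):
    return {
        "DBInstanceIdentifier": replica_id,
        "BackupRetentionPeriod": backup_retention_days,
    }

params = promote_params("prod-mysql-replica")
```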

Upgrades

  • For a Multi-AZ deployment, Database engine version upgrades happen on the Primary instance.
  • For Read Replicas, the Database engine version upgrade is independent of the Primary instance.

Automated Backups

  • Multi-AZ deployment has the Automated Backups taken from the Standby instance
  • Read Replicas do not have any backups configured by default.
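For engines that support it, automated backups can be turned on for a read replica by setting a retention period through `ModifyDBInstance` (`rds.modify_db_instance(**params)`); a sketch with hypothetical values:

```python
# Enable automated backups on a read replica; the identifier is hypothetical.
params = {
    "DBInstanceIdentifier": "prod-mysql-replica",
    "BackupRetentionPeriod": 7,  # 0 disables backups; the maximum is 35 days
    "ApplyImmediately": True,
}
```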

AWS Certification Exam Practice Questions

  1. You are running a successful multi-tier web application on AWS and your marketing department has asked you to add a reporting tier to the application. The reporting tier will aggregate and publish status reports every 30 minutes from user-generated information that is being stored in your web application's database. You are currently running a Multi-AZ RDS MySQL instance for the database tier. You have also implemented ElastiCache as a database caching layer between the application tier and the database tier. Please select the answer that will allow you to successfully implement the reporting tier with as little impact as possible to your database.
    1. Continually send transaction logs from your master database to an S3 bucket and generate the reports of the S3 bucket using S3 byte range requests.
    2. Generate the reports by querying the synchronously replicated standby RDS MySQL instance maintained through Multi-AZ (Standby instance cannot be used as a scaling solution)
    3. Launch an RDS Read Replica connected to your Multi-AZ master database and generate reports by querying the Read Replica.
    4. Generate the reports by querying the ElastiCache database caching tier. (ElastiCache does not maintain full data and is simply a caching solution)
  2. A company is deploying a new two-tier web application in AWS. The company has limited staff and requires high availability, and the application requires complex queries and table joins. Which configuration provides the solution for the company’s requirements?
    1. MySQL Installed on two Amazon EC2 Instances in a single Availability Zone (does not provide High Availability out of the box)
    2. Amazon RDS for MySQL with Multi-AZ
    3. Amazon ElastiCache (Just a caching solution)
    4. Amazon DynamoDB (Not suitable for complex queries and joins)
  3. Your company is getting ready to do a major public announcement of a social media site on AWS. The website is running on EC2 instances deployed across multiple Availability Zones with a Multi-AZ RDS MySQL Extra Large DB Instance. The site performs a high number of small reads and writes per second and relies on an eventual consistency model. After comprehensive tests you discover that there is read contention on RDS MySQL. Which are the best approaches to meet these requirements? (Choose 2 answers)
    1. Deploy ElastiCache in-memory cache running in each availability zone
    2. Implement sharding to distribute load to multiple RDS MySQL instances (this is only a read contention, the writes work fine)
    3. Increase the RDS MySQL Instance size and Implement provisioned IOPS (not scalable, this is only a read contention, the writes work fine)
    4. Add an RDS MySQL read replica in each availability zone
  4. Your company has HQ in Tokyo and branch offices all over the world and is using logistics software with a multi-regional deployment on AWS in Japan, Europe and US. The logistic software has a 3-tier architecture and currently uses MySQL 5.6 for data persistence. Each region has deployed its own database. In the HQ region you run an hourly batch process reading data from every region to compute cross-regional reports that are sent by email to all offices this batch process must be completed as fast as possible to quickly optimize logistics. How do you build the database architecture in order to meet the requirements?
    1. For each regional deployment, use RDS MySQL with a master in the region and a read replica in the HQ region
    2. For each regional deployment, use MySQL on EC2 with a master in the region and send hourly EBS snapshots to the HQ region
    3. For each regional deployment, use RDS MySQL with a master in the region and send hourly RDS snapshots to the HQ region
    4. For each regional deployment, use MySQL on EC2 with a master in the region and use S3 to copy data files hourly to the HQ region
    5. Use Direct Connect to connect all regional MySQL deployments to the HQ region and reduce network latency for the batch process
  5. What would happen to an RDS (Relational Database Service) Multi-Availability Zone deployment if the primary DB instance fails?
    1. IP of the primary DB Instance is switched to the standby DB Instance.
    2. A new DB instance is created in the standby availability zone.
    3. The canonical name record (CNAME) is changed from primary to standby.
    4. The RDS (Relational Database Service) DB instance reboots.
  6. Your business is building a new application that will store its entire customer database on an RDS MySQL database, and will have various applications and users that will query that data for different purposes. Large analytics jobs on the database are likely to cause other applications to time out before they can get the query results they need. Also, as your data grows, these analytics jobs will start to take more time, increasing the negative effect on the other applications. How do you solve the contention issues between these different workloads on the same data?
    1. Enable Multi-AZ mode on the RDS instance
    2. Use ElastiCache to offload the analytics job data
    3. Create RDS Read-Replicas for the analytics work
    4. Run the RDS instance on the largest size possible
  7. Will my standby RDS instance be in the same Availability Zone as my primary?
    1. Only for Oracle RDS types
    2. Yes
    3. Only if configured at launch
    4. No
  8. Is creating a Read Replica of another Read Replica supported?
    1. Only in certain regions
    2. Only with MySQL based RDS
    3. Only for Oracle RDS types
    4. No
  9. A user is planning to set up the Multi-AZ feature of RDS. Which of the below mentioned conditions won’t take advantage of the Multi-AZ feature?
    1. Availability zone outage
    2. A manual failover of the DB instance using Reboot with failover option
    3. Region outage
    4. When the user changes the DB instance’s server type
  10. When you run a DB Instance as a Multi-AZ deployment, the “_____” serves database writes and reads
    1. secondary
    2. backup
    3. stand by
    4. primary
  11. When running my DB Instance as a Multi-AZ deployment, can I use the standby for read or write operations?
    1. Yes
    2. Only with MSSQL based RDS
    3. Only for Oracle RDS instances
    4. No
  12. Read Replicas require a transactional storage engine and are only supported for the _________ storage engine
    1. OracleISAM
    2. MSSQLDB
    3. InnoDB
    4. MyISAM
  13. A user is configuring the Multi-AZ feature of an RDS DB. The user came to know that this RDS DB does not use the AWS technology, but uses server mirroring to achieve replication. Which DB is the user using right now?
    1. MySQL
    2. Oracle
    3. MS SQL
    4. PostgreSQL
  14. If I have multiple Read Replicas for my master DB Instance and I promote one of them, what happens to the rest of the Read Replicas?
    1. The remaining Read Replicas will still replicate from the older master DB Instance
    2. The remaining Read Replicas will be deleted
    3. The remaining Read Replicas will be combined to one read replica
  15. If you have chosen Multi-AZ deployment, in the event of a planned or unplanned outage of your primary DB Instance, Amazon RDS automatically switches to the standby replica. The automatic failover mechanism simply changes the ______ record of the main DB Instance to point to the standby DB Instance.
    1. DNAME
    2. CNAME
    3. TXT
    4. MX
  16. When automatic failover occurs, Amazon RDS will emit a DB Instance event to inform you that automatic failover occurred. You can use the _____ to return information about events related to your DB Instance
    1. FetchFailure
    2. DescriveFailure
    3. DescribeEvents
    4. FetchEvents
  17. The new DB Instance that is created when you promote a Read Replica retains the backup window period.
    1. TRUE
    2. FALSE
  18. Will I be alerted when automatic failover occurs?
    1. Only if SNS configured
    2. No
    3. Yes
    4. Only if Cloudwatch configured
  19. Can I initiate a “forced failover” for my MySQL Multi-AZ DB Instance deployment?
    1. Only in certain regions
    2. Only in VPC
    3. Yes
    4. No
  20. A user is accessing RDS from an application. The user has enabled the Multi-AZ feature with the MS SQL RDS DB. During a planned outage how will AWS ensure that a switch from DB to a standby replica will not affect access to the application?
    1. RDS will have an internal IP which will redirect all requests to the new DB
    2. RDS uses DNS to switch over to standby replica for seamless transition
    3. The switch over changes Hardware so RDS does not need to worry about access
    4. RDS will have both the DBs running independently and the user has to manually switch over
  21. Which of the following is part of the failover process for a Multi-AZ Amazon Relational Database Service (RDS) instance?
    1. The failed RDS DB instance reboots.
    2. The IP of the primary DB instance is switched to the standby DB instance.
    3. The DNS record for the RDS endpoint is changed from primary to standby.
    4. A new DB instance is created in the standby availability zone.
  22. Which of these is not a reason a Multi-AZ RDS instance will failover?
    1. An Availability Zone outage
    2. A manual failover of the DB instance was initiated using Reboot with failover
    3. To autoscale to a higher instance class (Refer link)
    4. Master database corruption occurs
    5. The primary DB instance fails
  23. You need to scale an RDS deployment. You are operating at 10% writes and 90% reads, based on your logging. How best can you scale this in a simple way?
    1. Create a second master RDS instance and peer the RDS groups.
    2. Cache all the database responses on the read side with CloudFront.
    3. Create read replicas for RDS since the load is mostly reads.
    4. Create a Multi-AZ RDS installs and route read traffic to standby.
  24. How does Amazon RDS multi Availability Zone model work?
    1. A second, standby database is deployed and maintained in a different availability zone from master, using synchronous replication. (Refer link)
    2. A second, standby database is deployed and maintained in a different availability zone from master using asynchronous replication.
    3. A second, standby database is deployed and maintained in a different region from master using asynchronous replication.
    4. A second, standby database is deployed and maintained in a different region from master using synchronous replication.
  25. A customer is running an application in US-West (Northern California) region and wants to setup disaster recovery failover to the Asian Pacific (Singapore) region. The customer is interested in achieving a low Recovery Point Objective (RPO) for an Amazon RDS multi-AZ MySQL database instance. Which approach is best suited to this need?
    1. Synchronous replication
    2. Asynchronous replication
    3. Route53 health checks
    4. Copying of RDS incremental snapshots
  26. A user is using a small MySQL RDS DB. The user is experiencing high latency due to the Multi AZ feature. Which of the below mentioned options may not help the user in this situation?
    1. Schedule the automated back up in non-working hours
    2. Use a large or higher size instance
    3. Use PIOPS
    4. Take a snapshot from standby Replica
  27. Are Reserved Instances available for Multi-AZ Deployments?
    1. Only for Cluster Compute instances
    2. Yes for all instance types
    3. Only for M3 instance types
  28. My Read Replica appears “stuck” after a Multi-AZ failover and is unable to obtain or apply updates from the source DB Instance. What do I do?
    1. You will need to delete the Read Replica and create a new one to replace it.
    2. You will need to disassociate the DB Engine and re-associate it.
    3. The instance should be deployed to Single AZ and then moved to Multi-AZ once again
    4. You will need to delete the DB Instance and create a new one to replace it.
  29. What is the charge for the data transfer incurred in replicating data between your primary and standby?
    1. No charge. It is free.
    2. Double the standard data transfer charge
    3. Same as the standard data transfer charge
    4. Half of the standard data transfer charge
  30. A user has enabled the Multi-AZ feature with the MS SQL RDS database server. Which of the below mentioned statements will help the user understand the Multi-AZ feature better?
    1. In a Multi-AZ, AWS runs two DBs in parallel and copies the data asynchronously to the replica copy
    2. In a Multi-AZ, AWS runs two DBs in parallel and copies the data synchronously to the replica copy
    3. In a Multi-AZ, AWS runs just one DB but copies the data synchronously to the standby replica
    4. AWS MS SQL does not support the Multi-AZ feature
  31. A company is running a batch analysis every hour on their main transactional DB running on an RDS MySQL instance to populate their central Data Warehouse running on Redshift. During the execution of the batch, their transactional applications are very slow. When the batch completes, they need to update the top management dashboard with the new data. The dashboard is produced by another system running on-premises that is currently started when a manually sent email notifies that an update is required. The on-premises system cannot be modified because it is managed by another team. How would you optimize this scenario to solve performance issues and automate the process as much as possible?
    1. Replace RDS with Redshift for the batch analysis and SNS to notify the on-premises system to update the dashboard
    2. Replace RDS with Redshift for the batch analysis and SQS to send a message to the on-premises system to update the dashboard
    3. Create an RDS Read Replica for the batch analysis and SNS to notify the on-premises system to update the dashboard
    4. Create an RDS Read Replica for the batch analysis and SQS to send a message to the on-premises system to update the dashboard.

AWS RDS DB Snapshot, Backup & Restore

RDS Backup, Restore, and Snapshots

  • RDS creates a storage volume snapshot of the DB instance, backing up the entire DB instance and not just individual databases.
  • RDS provides two different methods Automated and Manual for backing up the DB instances.

Automated backups

  • Backups of the DB instance are automatically created and retained.
  • RDS Backups are incremental. The first snapshot of a DB instance contains the data for the full database. Subsequent snapshots of the same database are incremental, which means that only the data that has changed after your most recent snapshot is saved.
  • Automated backups are enabled by default for a new DB instance.
  • Automated backups occur during a daily user-configurable period of time, known as the preferred backup window.
    • If a preferred backup window is not specified when a DB instance is created, RDS assigns a default 30-minute backup window which is selected at random from an 8-hour block of time per region.
    • Changes to the backup window take effect immediately.
    • Backup window cannot overlap with the weekly maintenance window for the DB instance.
  • Backups created during the backup window are retained for a user-configurable number of days, known as the backup retention period
    • If the backup retention period is not set, RDS defaults the retention period to one day if the instance is created using the RDS API or the AWS CLI, or seven days if created from the AWS Console.
    • Backup retention period can be modified; valid values are 0 (for no backup retention) to a maximum of 35 days.
  • Manual snapshot limits (100 per Region, by default) do not apply to automated backups.
  • If the backup requires more time than allotted to the backup window, the backup will continue to completion.
  • An immediate outage occurs if the backup retention period is changed
    • from 0 to a non-zero value as the first backup occurs immediately or
    • from a non-zero value to 0 as it turns off automatic backups, and deletes all existing automated backups for the instance.
  • RDS uses the periodic data backups in conjunction with the transaction logs to enable restoration of the DB Instance to any second during the retention period, up to the LatestRestorableTime (typically up to the last few minutes).
  • During the backup window,
    • for Single AZ instance, storage I/O may be briefly suspended while the backup process initializes (typically under a few seconds) and a brief period of elevated latency might be experienced.
    • for Multi-AZ DB deployments, there is No I/O suspension since the backup is taken from the standby instance
  • Automated DB backups are deleted when
    • the retention period expires
    • the automated DB backups for a DB instance are disabled
    • the DB instance is deleted
  • When a DB instance is deleted,
    • a final DB snapshot can be created upon deletion; which can be used to restore the deleted DB instance at a later date.
    • RDS retains the final user-created DB snapshot along with all other manually created DB snapshots
    • all automated backups are deleted and cannot be recovered
  • NOTE: RDS can now be configured to retain the automated backups on RDS instance deletion.
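The retention and backup-window rules above can be captured in a couple of small checks (values illustrative; the window is an HH:MM-HH:MM UTC range of at least 30 minutes that must not overlap the maintenance window):

```python
# Retention must be between 0 (disabled) and 35 days.
def valid_retention(days):
    return 0 <= days <= 35

def _to_min(t):
    return int(t[:2]) * 60 + int(t[3:])

# Length of a backup window such as "03:00-03:30"; the modulo handles
# windows that wrap past midnight, e.g. "23:45-00:15".
def window_minutes(window):
    start, end = window.split("-")
    return (_to_min(end) - _to_min(start)) % (24 * 60)

assert valid_retention(35) and not valid_retention(36)
assert window_minutes("03:00-03:30") == 30
```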

Point-In-Time Recovery

  • In addition to the daily automated backup, RDS archives database change logs. This enables recovery of the database to any point in time during the backup retention period, up to the last five minutes of database usage.
  • Disabling automated backups also disables point-in-time recovery
  • RDS stores multiple copies of the data, but for Single-AZ DB instances these copies are stored in a single availability zone.
  • If for any reason a Single-AZ DB instance becomes unusable, point-in-time recovery can be used to launch a new DB instance with the latest restorable data
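Point-in-time recovery maps to the RDS `RestoreDBInstanceToPointInTime` API, which always creates a new DB instance with a new endpoint. A sketch of the parameter shape (`rds.restore_db_instance_to_point_in_time(**params)`), with hypothetical identifiers:

```python
# Restore a new instance from the source's backups and transaction logs.
# Instead of UseLatestRestorableTime, a specific RestoreTime datetime can be
# passed to recover to any second within the retention period.
params = {
    "SourceDBInstanceIdentifier": "prod-mysql",
    "TargetDBInstanceIdentifier": "prod-mysql-restored",  # new instance, new endpoint
    "UseLatestRestorableTime": True,
}
```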

DB Snapshots (User Initiated – Manual)

  • DB snapshots are manual, user-initiated backups that enable a DB instance backup to a known state, and restore to that specific state at any time.
  • RDS keeps all manual DB snapshots until explicitly deleted.

DB Snapshots Creation

  • DB snapshot is a user-initiated storage volume snapshot of DB instance, backing up the entire DB instance and not just individual databases.
  • DB snapshots enable backing up of the DB instance in a known state as needed, and can then be restored to that specific state at any time.
  • DB snapshots are kept until explicitly deleted.
  • Creating DB snapshot on a Single-AZ DB instance results in a brief I/O suspension that typically lasts no more than a few minutes.
  • Multi-AZ DB instances are not affected by this I/O suspension since the backup is taken on the standby instance
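Creating a manual snapshot maps to the `CreateDBSnapshot` API; a minimal sketch with hypothetical identifiers (the actual call would be `rds.create_db_snapshot(**params)`):

```python
# Snapshot the whole DB instance (not individual databases) at a known state,
# e.g. before a risky upgrade. Both identifiers are hypothetical.
params = {
    "DBSnapshotIdentifier": "prod-mysql-pre-upgrade",
    "DBInstanceIdentifier": "prod-mysql",
}
```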

DB Snapshot Restore

  • DB instance can be restored to any specific time during this retention period, creating a new DB instance.
  • DB restore creates a New DB instance with a different endpoint
  • RDS uses the periodic data backups in conjunction with the transaction logs to enable restoration of the DB Instance to any second during the retention period, up to the LatestRestorableTime (typically up to the last few minutes).
  • Option group associated with the DB snapshot is associated with the restored DB instance once it is created. However, the option group is associated with the VPC, so would apply only when the instance is restored in the same VPC as the DB snapshot.
  • Default DB parameter and security groups are associated with the restored instance. After the restoration is complete, any custom DB parameter or security groups used by the restored instance should be associated explicitly.
  • A DB instance can be restored with a different storage type than the source DB snapshot. In this case, the restoration process will be slower because of the additional work required to migrate the data to the new storage type e.g. from GP2 to Provisioned IOPS
  • A DB instance can be restored with a different edition of the DB engine only if the DB snapshot has the required storage allocated for the new edition; e.g., to change from SQL Server Web Edition to SQL Server Standard Edition, the DB snapshot must have been created from a SQL Server DB instance that had at least 200 GB of allocated storage, which is the minimum allocated storage for SQL Server Standard Edition.
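Restoring from a snapshot, including to a different storage type, maps to the `RestoreDBInstanceFromDBSnapshot` API (`rds.restore_db_instance_from_db_snapshot(**params)`); a sketch with hypothetical identifiers:

```python
# Restore creates a brand-new instance with a new endpoint. Restoring to a
# different storage type (here io1, which also requires an Iops value) is
# allowed but slows the restore while data is migrated.
params = {
    "DBInstanceIdentifier": "prod-mysql-from-snap",
    "DBSnapshotIdentifier": "prod-mysql-pre-upgrade",
    "StorageType": "io1",
    "Iops": 3000,
}
```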

DB Snapshot Copy

  • RDS supports two types of DB snapshot copying.
    • Copy an automated DB snapshot to create a manual DB snapshot in the same AWS region. Manual DB snapshots are not deleted automatically and can be kept indefinitely.
    • Copy either an automated or manual DB snapshot from one region to another region. By copying the DB snapshot to another region, a manual DB snapshot is created that is retained in that region
  • Automated backups cannot be shared. They need to be copied to a manual snapshot, and the manual snapshot can be shared.
  • Manual DB snapshots can be shared with other AWS accounts and snapshots shared can be copied by other AWS accounts.
  • Snapshot Copy Encryption
    • DB snapshot that has been encrypted using an AWS Key Management System (AWS KMS) encryption key can be copied
    • Copying an encrypted DB snapshot results in an encrypted copy of the DB snapshot
    • When copying, DB snapshot can either be encrypted with the same KMS encryption key as the original DB snapshot, or a different KMS encryption key to encrypt the copy of the DB snapshot.
    • An unencrypted DB snapshot can be copied to an encrypted snapshot, a quick way to add encryption to a previously unencrypted DB instance.
    • Encrypted snapshot can be restored only to an encrypted DB instance.
    • If a KMS encryption key is specified when restoring from an unencrypted DB cluster snapshot, the restored DB cluster is encrypted using the specified KMS encryption key.
    • Copying an encrypted snapshot shared from another AWS account requires access to the KMS encryption key that was used to encrypt the DB snapshot.
    • Because KMS encryption keys are specific to the region that they are created in, encrypted snapshot cannot be copied to another region
    • NOTE: AWS now allows copying encrypted DB snapshots between accounts and across multiple regions as seamlessly as unencrypted snapshots.
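A cross-Region copy of an encrypted snapshot is issued from the destination Region, supplying a KMS key that lives there. A sketch of the `CopyDBSnapshot` parameters (`rds.copy_db_snapshot(**params)` on a destination-Region client); the ARNs and key are hypothetical:

```python
# Copy an encrypted snapshot from us-east-1 into the current (destination)
# Region, re-encrypting it with a key from the destination Region.
params = {
    "SourceDBSnapshotIdentifier": "arn:aws:rds:us-east-1:123456789012:snapshot:prod-snap",
    "TargetDBSnapshotIdentifier": "prod-snap-copy",
    "KmsKeyId": "arn:aws:kms:us-west-2:123456789012:key/1234abcd-12ab-34cd-56ef-123456789012",
    "SourceRegion": "us-east-1",  # lets boto3 presign the source-Region request
}
```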

DB Snapshot Sharing

  • Manual DB snapshots or DB cluster snapshots can be shared with up to 20 other AWS accounts.
  • Manual snapshots shared with other AWS accounts can copy the snapshot, or restore a DB instance or DB cluster from that snapshot.
  • Manual snapshots can also be shared as public, which makes the snapshot available to all AWS accounts. Care should be taken when sharing a snapshot as public so that none of the private information is included
  • Shared snapshot can be copied to another region.
  • However, the following limitations apply when sharing manual snapshots with other AWS accounts:
    • When a DB instance or DB cluster is restored from a shared snapshot using the AWS CLI or RDS API, the Amazon Resource Name (ARN) of the shared snapshot must be specified as the snapshot identifier.
    • DB snapshot that uses an option group with permanent or persistent options cannot be shared.
    • A permanent option cannot be removed from an option group. Option groups with persistent options cannot be removed from a DB instance once the option group has been assigned to the DB instance.
  • DB snapshots that have been encrypted “at rest” using the AES-256 encryption algorithm can be shared
  • Users can only copy encrypted DB snapshots if they have access to the AWS Key Management Service (AWS KMS) encryption key that was used to encrypt the DB snapshot.
  • AWS KMS encryption keys can be shared with another AWS account by adding the other account to the KMS key policy.
  • However, the KMS key policy must first be updated to add the accounts the snapshot will be shared with, before sharing an encrypted DB snapshot.
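Sharing itself maps to the `ModifyDBSnapshotAttribute` API on the `restore` attribute (`rds.modify_db_snapshot_attribute(**params)`); a sketch with a hypothetical snapshot and target account:

```python
# Grant another account permission to copy or restore this manual snapshot.
# For encrypted snapshots the KMS key policy must already grant that account
# access to the key. Using ValuesToAdd=["all"] would make the snapshot public.
params = {
    "DBSnapshotIdentifier": "prod-snap-copy",
    "AttributeName": "restore",
    "ValuesToAdd": ["210987654321"],  # hypothetical target AWS account ID
}
```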


AWS Certification Exam Practice Questions

  1. Amazon RDS automated backups and DB Snapshots are currently supported for only the __________ storage engine
    1. InnoDB
    2. MyISAM
  2. Automated backups are enabled by default for a new DB Instance.
    1. TRUE
    2. FALSE
  3. Amazon RDS DB snapshots and automated backups are stored in
    1. Amazon S3
    2. Amazon EBS Volume
    3. Amazon RDS
    4. Amazon EMR
  4. You receive a frantic call from a new DBA who accidentally dropped a table containing all your customers. Which Amazon RDS feature will allow you to reliably restore your database to within 5 minutes of when the mistake was made?
    1. Multi-AZ RDS
    2. RDS snapshots
    3. RDS read replicas
    4. RDS automated backup
  5. Disabling automated backups ______ disable the point-in-time recovery.
    1. if configured to can
    2. will never
    3. will
  6. Changes to the backup window take effect ______.
    1. from the next billing cycle
    2. after 30 minutes
    3. immediately
    4. after 24 hours
  7. You can modify the backup retention period; valid values are 0 (for no backup retention) to a maximum of ___________ days.
    1. 45
    2. 35
    3. 15
    4. 5
  8. Amazon RDS automated backups and DB Snapshots are currently supported for only the ______ storage engine
    1. MyISAM
    2. InnoDB 
  9. What happens to the I/O operations while you take a database snapshot?
    1. I/O operations to the database are suspended for a few minutes while the backup is in progress.
    2. I/O operations to the database are sent to a Replica (if available) for a few minutes while the backup is in progress.
    3. I/O operations will be functioning normally
    4. I/O operations to the database are suspended for an hour while the backup is in progress
  10. True or False: When you perform a restore operation to a point in time or from a DB Snapshot, a new DB Instance is created with a new endpoint.
    1. FALSE
    2. TRUE 
  11. True or False: Manually created DB Snapshots are deleted after the DB Instance is deleted.
    1. TRUE
    2. FALSE
  12. A user is running a MySQL RDS instance. The user will not use the DB for the next 3 months. How can the user save costs?
    1. Pause the RDS activities from CLI until it is required in the future
    2. Stop the RDS instance
    3. Create a snapshot of RDS to launch in the future and terminate the instance now
    4. Change the instance size to micro


AWS RDS Multi-AZ DB Instance vs DB Cluster Deployment

RDS Multi-AZ DB Instance vs DB Cluster


  • RDS Multi-AZ deployments provide high availability and automatic failover support for DB instances
  • Multi-AZ helps improve the durability and availability of a critical system, enhancing availability during planned system maintenance, DB instance failure, and Availability Zone disruption.
  • A Multi-AZ DB instance deployment has one standby DB instance that provides failover support but doesn’t serve read traffic.
  • A Multi-AZ DB cluster deployment has two standby DB instances that provide failover support and can also serve read traffic.


Instances & Availability Zones

  • A Single AZ instance creates a single DB instance in any specified AZ.
  • A Multi-AZ DB Instance deployment creates a Primary and a Standby instance in two different AZs
  • A Multi-AZ DB Cluster deployment creates a Primary Writer and two Readable Standby instances in three different AZs
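The three deployment topologies above can be summarized in a small lookup; a minimal sketch, with illustrative mode names and return shape:

```python
def topology(deployment: str) -> dict:
    """Instance counts and AZ spread for each RDS deployment mode described above."""
    modes = {
        "single-az":         {"instances": 1, "azs": 1, "readable_standbys": 0},
        "multi-az-instance": {"instances": 2, "azs": 2, "readable_standbys": 0},
        "multi-az-cluster":  {"instances": 3, "azs": 3, "readable_standbys": 2},
    }
    return modes[deployment]
```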

Replication Mode

  • Multi-AZ DB instance deployment synchronously replicates the data from the primary DB instance to a standby instance in a different AZ.
  • Multi-AZ DB cluster deployment semi-synchronously replicates data from the writer DB instance to both reader DB instances using the DB engine’s native replication capabilities.
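The difference between the two replication modes can be modeled with a toy commit function; this is a conceptual sketch of when a commit is acknowledged, not how the engines actually implement replication:

```python
def commit_succeeds(mode: str, standby_acks: int) -> bool:
    """Toy model of when a commit is acknowledged to the client.

    'instance' -> Multi-AZ DB instance: synchronous; the single standby (1 of 1)
                  must acknowledge before the commit returns.
    'cluster'  -> Multi-AZ DB cluster: semi-synchronous; any one reader (1 of 2)
                  must acknowledge before the commit returns.
    """
    required = {"instance": 1, "cluster": 1}  # acks needed
    total = {"instance": 1, "cluster": 2}     # standbys available
    if mode not in required:
        raise ValueError(f"unknown mode: {mode}")
    return required[mode] <= standby_acks <= total[mode]
```

The practical consequence: a Multi-AZ DB cluster can commit while one reader lags behind, which is why its commit latency is lower than the fully synchronous Multi-AZ DB instance deployment.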

Standby Instance can Accept Reads

  • Multi-AZ DB instance deployment is a high-availability solution; its standby instance does not serve read traffic.
  • Multi-AZ DB cluster deployment provides readable standby instances to increase application read-throughput.

Commit Latency

  • Single AZ instance has the lowest commit latency.
  • Multi-AZ DB instance deployment has a high commit latency as compared to the Single AZ instance as the data needs to be synchronously replicated to the standby instance.
  • Multi-AZ DB cluster deployment provides up to two times faster transaction commit latency compared to a Multi-AZ DB instance deployment, as it performs semi-synchronous replication and needs acknowledgment from only one of the two readers.

Automatic Failover & Failover Time

  • Single AZ instances do not support automatic failover and failure would result in data loss. Use point-in-time recovery with backups to restore the database.
  • Multi-AZ DB instance deployment performs an automatic failover to the standby instance; the failover time can be up to 120 seconds, depending on crash recovery.
  • Multi-AZ DB cluster deployment performs an automatic failover to a reader DB instance in a different AZ, and the failover time can be up to 75 seconds depending on the replica lag.

Supported Engines

  • Single AZ and Multi-AZ DB instance deployments support all DB engines
  • Multi-AZ DB clusters are supported only for the MySQL and PostgreSQL DB engines.

Cost

  • Single AZ is the most cost-effective option.
  • Multi-AZ DB Instance deployment costs more than a Single AZ as it maintains a synchronous standby instance.
  • Multi-AZ DB Cluster would be an expensive option as it creates 3 instances, supports specific instance classes that do not include burstable classes, and does not support general-purpose SSD volumes.

Use Cases

  • Single AZ deployments are suitable for non-critical dev, test environments.
  • Multi-AZ deployments are suitable for critical, production-based environments requiring high availability, data redundancy, and scalability for read workloads.


References

RDS_Multi-AZ

AWS RDS Multi-AZ DB Instance

RDS Multi-AZ Instance Deployment

RDS Multi-AZ DB Instance Deployment

  • RDS automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different AZ.
  • RDS performs an automatic failover to the standby, so that database operations can be resumed as soon as the failover is complete.
  • RDS Multi-AZ deployment maintains the same endpoint for the DB Instance after a failover, so the application can resume database operation without the need for manual administrative intervention.
  • Multi-AZ is a High Availability feature and NOT a scaling solution for read-only scenarios; a standby replica can’t be used to serve read traffic. To service read-only traffic, use a Read Replica.
  • Multi-AZ deployments for Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon’s failover technology, while SQL Server DB instances use SQL Server Database Mirroring or Always On Availability Groups.


  • In a Multi-AZ deployment,
    • RDS automatically provisions and maintains a synchronous standby replica in a different Availability Zone.
    • Copies of data are stored in different AZs for greater levels of data durability.
    • Primary DB instance is synchronously replicated across Availability Zones to a standby replica to provide
      • data redundancy,
      • eliminate I/O freezes during snapshots and backups
      • and minimize latency spikes during system backups.
    • DB instances may have increased write and commit latency compared to a Single AZ deployment, due to the synchronous data replication
    • Transaction success is returned only if the commit is successful both on the primary and the standby DB
    • There might be a change in latency if the deployment fails over to the standby replica, although AWS is engineered with low-latency network connectivity between Availability Zones.
  • When using the BYOL licensing model, a license for both the primary instance and the standby replica is required
  • For production workloads, it is recommended to use Multi-AZ deployment with Provisioned IOPS and DB instance classes (m1.large and larger), optimized for Provisioned IOPS for fast, consistent performance.
  • When Single-AZ deployment is modified to a Multi-AZ deployment (for engines other than SQL Server or Amazon Aurora)
    • RDS takes a snapshot of the primary DB instance from the deployment and restores the snapshot into another Availability Zone.
    • RDS then sets up synchronous replication between the primary DB instance and the new instance.
    • This avoids downtime during conversion from Single AZ to Multi-AZ.
  • An existing Single AZ instance can be converted into a Multi-AZ instance by modifying the DB instance without any downtime.

RDS Multi-AZ Failover Process

  • In the event of a planned or unplanned outage of the DB instance,
    • RDS automatically switches to a standby replica in another AZ, if enabled for Multi-AZ.
    • The time taken for the failover to complete depends on the database activity and other conditions at the time the primary DB instance became unavailable.
    • Failover times are typically 60-120 seconds. However, large transactions or a lengthy recovery process can increase the failover time.
    • The failover mechanism automatically changes the DNS record of the DB instance to point to the standby DB instance.
    • The Multi-AZ switch is seamless to applications as there is no change in the endpoint URL; applications just need to re-establish any existing connections to the DB instance.
  • RDS handles failover automatically so that database operations can be resumed as quickly as possible without administrative intervention.
  • Primary DB instance switches over automatically to the standby replica if any of the following conditions occur:
    • Primary Availability Zone outage
    • Loss of network connectivity to primary
    • Primary DB instance fails
    • DB instance’s server type is changed
    • Operating system of the DB instance is undergoing software patching
    • Compute unit failure on the primary
    • Storage failure on the primary
    • A manual failover of the DB instance was initiated using Reboot with failover (also referred to as Forced Failover)
  • Whether the Multi-AZ DB instance has failed over can be determined in the following ways:
    • DB event subscriptions can be set up to notify you via email or SMS that a failover has been initiated.
    • DB events can be viewed via the Amazon RDS console or APIs.
    • The current state of the Multi-AZ deployment can be viewed via the RDS console and APIs.
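As a simple illustration, failover notifications delivered via event subscriptions could be filtered client-side; the event messages below are made-up examples for this sketch, not guaranteed AWS wording:

```python
def looks_like_failover(event_message: str) -> bool:
    """Heuristic filter: flag event messages that mention a failover."""
    return "failover" in event_message.lower()

# Made-up sample messages, e.g. as received from an SNS-backed event subscription.
events = [
    "Multi-AZ instance failover started.",
    "Backing up DB instance",
    "Multi-AZ instance failover completed",
]
failovers = [e for e in events if looks_like_failover(e)]
```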

Multi-AZ DB Instance vs Multi-AZ DB Cluster

RDS Multi-AZ DB Instance vs DB Cluster


References

RDS_Multi-AZ_DB_Instance_Deployment

AWS RDS Multi-AZ DB Cluster

RDS Multi-AZ DB Cluster

  • RDS Multi-AZ DB cluster deployment is a high-availability deployment mode of RDS with two readable standby DB instances.
  • RDS Multi-AZ DB cluster has a writer DB instance and two reader DB instances in three separate AZs in the same AWS Region.
  • Multi-AZ DB clusters provide high availability, increased capacity for read workloads, and lower write latency when compared to Multi-AZ DB instance deployments.
  • Multi-AZ DB clusters aren’t the same as Aurora DB clusters.


  • With a Multi-AZ DB cluster, RDS replicates data from the writer DB instance to both of the reader DB instances using the DB engine’s native replication capabilities.
  • When a change is made on the writer DB instance, it’s sent to each reader DB instance. Acknowledgment from at least one reader DB instance is required for a change to be committed.
  • Reader DB instances act as automatic failover targets and also serve read traffic to increase application read throughput.
  • If an outage occurs on the writer DB instance, RDS manages failover to one of the reader DB instances. RDS does this based on which reader DB instance has the most recent change record.
  • Multi-AZ DB clusters typically have lower write latency when compared to Multi-AZ DB instance deployments.
  • They also allow read-only workloads to run on reader DB instances.
  • Supports the following endpoint types
    • Cluster or Writer endpoint connects to the writer DB instance of the DB cluster, which supports both read and write operations.
    • Reader endpoint connects to either of the two reader DB instances, which support only read operations.
    • Instance endpoint connects to a specific DB instance within a Multi-AZ DB cluster.
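Client-side routing over these endpoints can be sketched as follows; the cluster endpoint values are placeholders in the shape RDS uses for Multi-AZ DB clusters:

```python
def endpoint_for(operation: str, cluster: dict) -> str:
    """Route writes to the cluster (writer) endpoint and reads to the reader endpoint."""
    if operation in ("insert", "update", "delete", "ddl"):
        return cluster["writer_endpoint"]
    if operation == "select":
        return cluster["reader_endpoint"]
    raise ValueError(f"unknown operation: {operation}")

# Placeholder endpoint values, not real hosts.
cluster = {
    "writer_endpoint": "mycluster.cluster-xyz.us-east-1.rds.amazonaws.com",
    "reader_endpoint": "mycluster.cluster-ro-xyz.us-east-1.rds.amazonaws.com",
}
```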

Multi-AZ DB Cluster Limitations

  • Multi-AZ DB clusters are supported only for the MySQL and PostgreSQL DB engines.
  • Multi-AZ DB clusters support only Provisioned IOPS storage.
  • Single-AZ DB instance deployment or Multi-AZ DB instance deployment can’t be upgraded into a Multi-AZ DB cluster.
  • Multi-AZ DB clusters don’t support modifications at the DB instance level because all modifications are done at the DB cluster level.
  • Multi-AZ DB clusters don’t support the following features:
    • Support for IPv6 connections (dual-stack mode)
    • Cross-Region automated backups
    • Exporting Multi-AZ DB cluster snapshot data to an Amazon S3 bucket
    • IAM DB authentication
    • Kerberos authentication
    • Modifying the port
    • Option groups
    • Point-in-time-recovery (PITR) for deleted clusters
    • Restoring a Multi-AZ DB cluster snapshot from an Amazon S3 bucket
    • Storage autoscaling by setting the maximum allocated storage. Manually scale the storage.
    • Stopping and starting the DB cluster
    • Copying a snapshot of a Multi-AZ DB cluster
    • Encrypting an unencrypted Multi-AZ DB cluster

RDS Multi-AZ DB Cluster Failover

  • RDS automatically fails over to a reader DB instance in a different AZ in case of a planned or unplanned outage of the writer DB instance, as quickly as possible and without administrative intervention.
  • The failover time depends on the database activity and other conditions when the writer DB instance becomes unavailable, and is typically under 35 seconds.
  • Failover completes when both reader DB instances have applied outstanding transactions from the failed writer.

Multi-AZ DB Instance vs Multi-AZ DB Cluster

RDS Multi-AZ DB Instance vs DB Cluster


References

RDS_Multi-AZ_Cluster_Deployment

AWS RDS Read Replicas

RDS Read Replicas


  • RDS Read Replica is a read-only copy of the DB instance.
  • RDS Read Replicas provide enhanced performance and durability for RDS.
  • RDS Read Replicas allow elastic scaling beyond the capacity constraints of a single DB instance for read-heavy database workloads.
  • RDS Read replicas enable increased scalability and database availability in the case of an AZ failure.
  • Read Replicas can help reduce the load on the source DB instance by routing read queries from applications to the Read Replica.
  • Read replicas can also be promoted when needed to become standalone DB instances.
  • RDS read replicas can be Multi-AZ i.e. set up with their own standby instances in a different AZ.
  • One or more replicas of a given source DB Instance can serve high-volume application read traffic from multiple copies of the data, thereby increasing aggregate read throughput.
  • RDS uses DB engines’ built-in replication functionality to create a special type of DB instance called a Read Replica from a source DB instance. It uses the engines’ native asynchronous replication to update the read replica whenever there is a change to the source DB instance.
  • Read Replicas are eventually consistent due to asynchronous replication.
  • RDS sets up a secure communications channel using public-key encryption between the source DB instance and the read replica, even when replicating across regions.
  • Read replica operates as a DB instance that allows only read-only connections. Applications can connect to a read replica just as they would to any DB instance.
  • Read replicas are available in RDS for MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server as well as Aurora.
  • RDS replicates all databases in the source DB instance.
  • RDS supports replication between an RDS MySQL or MariaDB DB instance and a MySQL or MariaDB instance that is external to RDS using Binary Log File Position or  Global Transaction Identifiers (GTIDs) replication.


Read Replicas Creation

  • Read Replicas can be created within the same AZ, different AZ within the same region, and cross-region as well.
  • Up to five Read Replicas can be created from one source DB instance.
  • Creation process
    • Automatic backups must be enabled on the source DB instance by setting the backup retention period to a value other than 0
    • An existing DB instance needs to be specified as the source.
    • RDS takes a snapshot of the source instance and creates a read-only instance from the snapshot.
    • RDS then uses the asynchronous replication method for the DB engine to update the Read Replica for any changes to the source DB instance.
  • RDS replicates all databases in the source DB instance.
  • RDS sets up a secure communications channel between the source DB instance and the Read Replica if that Read Replica is in a different AWS region from the DB instance.
  • RDS establishes any AWS security configurations, such as adding security group entries, needed to enable the secure channel.
  • During the Read Replica creation, a brief I/O suspension on the source DB instance can be experienced as the DB snapshot occurs.
  • I/O suspension typically lasts about one minute and can be avoided if the source DB instance is a Multi-AZ deployment (in the case of Multi-AZ deployments, DB snapshots are taken from the standby).
  • Read Replica creation can be slow if long-running transactions are executing on the source, as the snapshot has to wait for them to complete.
  • For multiple Read Replicas created in parallel from the same source DB instance, only one snapshot is taken at the start of the first create action.
  • A Read Replica can be promoted to a new independent source DB, in which case the replication link between it and the source DB is broken. However, replication continues for the other replicas, which still use the original source DB as their replication source.
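The creation preconditions above can be condensed into a small checker; `can_create_read_replica` is an illustrative helper, not an AWS API, and the five-replica limit is the one stated in these notes (actual limits vary by engine):

```python
def can_create_read_replica(backup_retention_days: int,
                            existing_replicas: int,
                            max_replicas: int = 5) -> bool:
    """Illustrative check of the Read Replica creation preconditions above."""
    if backup_retention_days == 0:
        return False  # automatic backups must be enabled on the source instance
    if existing_replicas >= max_replicas:
        return False  # per-source replica limit (five in these notes)
    return True
```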

Read Replica Deletion & DB Failover

  • Read Replicas must be explicitly deleted, using the same mechanisms for deleting a DB instance.
  • If the source DB instance is deleted without deleting the replicas, each replica is promoted to a stand-alone, single-AZ DB instance.
  • If the source instance of a Multi-AZ deployment fails over to the standby, any associated Read Replicas are switched to use the secondary as their replication source.

Read Replica Storage & Compute requirements

  • A Read Replica, by default, is created with the same storage type as the source DB instance.
  • For replication to operate effectively, each Read Replica should have as much compute and storage capacity as the source DB instance.
  • Read Replicas should be scaled accordingly if the source DB instance is scaled.

Read Replicas Promotion

  • A read replica can be promoted into a standalone DB instance.
  • When the read replica is promoted
    • New DB instance is rebooted before it becomes available.
    • New DB instance that is created retains the option group and the parameter group of the former read replica.
    • The promotion process can take several minutes or longer to complete, depending on the size of the read replica.
    • If a source DB instance has several read replicas, promoting one of the read replicas to a DB instance has no effect on the other replicas.
  • If you plan to promote a read replica to a standalone instance, AWS recommends that you enable backups and complete at least one backup prior to promotion.
  • Read Replicas Promotion can help with
    • Performing DDL operations (MySQL and MariaDB only)
      • DDL Operations such as creating or rebuilding indexes can take time and can be performed on the read replica once it is in sync with its primary DB instance.
    • Sharding
      • Sharding embodies the “share-nothing” architecture and essentially involves breaking a large database into several smaller databases.
      • Read Replicas can be created and promoted corresponding to each of the shards and then using a hashing algorithm to determine which host receives a given update.
    • Implementing failure recovery
      • Read replica promotion can be used as a data recovery scheme if the primary DB instance fails.

Read Replicas Multi-AZ

  • RDS Read Replicas can themselves be Multi-AZ, i.e. set up with their own standby instance in a different AZ.
  • Multi-AZ Read Replicas are currently supported for the MySQL, MariaDB, PostgreSQL, and Oracle database engines.
  • Read Replicas with Multi-AZ help build a resilient disaster recovery strategy and simplify the database engine upgrade process.
  • A Multi-AZ read replica can serve as a DR target with automatic failover.
  • Also, when the read replica is promoted to be a standalone database, it will already be Multi-AZ enabled.

Cross-Region Read Replicas

  • Supported for MySQL, PostgreSQL, MariaDB, and Oracle.
  • Not supported for SQL Server
  • Cross-Region Read Replicas help to improve
    • disaster recovery capabilities (reduces RTO and RPO),
    • scale read operations into a region closer to end users,
    • migration from a data center in one region to another region
  • A source DB instance can have cross-region read replicas in multiple AWS Regions.
  • Cross-Region RDS read replica can be created from a source RDS DB instance that is not a read replica of another RDS DB instance.
  • Replica lags are higher for Cross-region replicas. This lag time comes from the longer network channels between regional data centers.
  • RDS can’t guarantee more than five cross-region read replica instances, due to the limit on the number of access control list (ACL) entries for a VPC
  • Read Replica uses the default DB parameter group and DB option group for the specified DB engine.
  • Read Replica uses the default security group.
  • Deleting the source for a cross-Region read replica will result in
    • read replica promotion for MariaDB, MySQL, and Oracle DB instances
    • no read replica promotion for PostgreSQL DB instances and the replication status of the read replica is set to terminated.
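The engine-specific behavior on source deletion can be captured in a lookup; a hedged sketch covering only the engines listed above:

```python
def on_source_deleted(engine: str) -> str:
    """What happens to a cross-Region read replica when its source DB is deleted."""
    if engine.lower() in {"mariadb", "mysql", "oracle"}:
        return "promoted"    # the replica becomes a standalone DB instance
    if engine.lower() in {"postgres", "postgresql"}:
        return "terminated"  # no promotion; replication status set to terminated
    raise ValueError(f"behavior not covered in these notes: {engine}")
```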


Read Replica Features & Limitations

  • RDS does not support circular replication.
  • A DB instance cannot be configured to serve as a replication source for an existing DB instance; a new Read Replica can be created only from an existing DB instance, e.g., if MyDBInstance replicates to ReadReplica1, ReadReplica1 can’t be configured to replicate back to MyDBInstance. From ReadReplica1, only a new Read Replica can be created, such as ReadRep2.
  • Read Replica can be created from other Read replicas as well. However, the replica lag is higher for these instances and there cannot be more than four instances involved in a replication chain.
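The chaining rules (no circular replication, at most four instances in a replication chain) can be expressed as a simple validator; an illustrative sketch, not an AWS API:

```python
def valid_replication_chain(chain: list) -> bool:
    """Validate a replication chain such as ['MyDBInstance', 'ReadReplica1', 'ReadRep2'].

    Rules from the notes above: no instance may appear twice (RDS does not
    support circular replication) and at most four instances may be involved
    in a replication chain.
    """
    return len(chain) == len(set(chain)) and len(chain) <= 4
```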

RDS Read Replicas Use Cases

  • Scaling beyond the compute or I/O capacity of a single DB instance for read-heavy database workloads, directing excess read traffic to Read Replica(s)
  • Serving read traffic while the source DB instance is unavailable, e.g., if the source DB instance cannot take I/O requests due to backup I/O suspension or scheduled maintenance, read traffic can be directed to the Read Replica(s). However, the data might be stale.
  • Business reporting or data warehousing scenarios where business reporting queries can be executed against a Read Replica, rather than the primary, production DB instance.
  • Implementing disaster recovery by promoting the read replica to a standalone instance as a disaster recovery solution, if the primary DB instance fails.

RDS Read Replicas vs Multi-AZ

RDS Multi-AZ vs Multi-Region vs Read Replicas

AWS Certification Exam Practice Questions

  1. You are running a successful multi-tier web application on AWS and your marketing department has asked you to add a reporting tier to the application. The reporting tier will aggregate and publish status reports every 30 minutes from user-generated information that is being stored in your web applications database. You are currently running a Multi-AZ RDS MySQL instance for the database tier. You also have implemented ElastiCache as a database caching layer between the application tier and database tier. Please select the answer that will allow you to successfully implement the reporting tier with as little impact as possible to your database.
    1. Continually send transaction logs from your master database to an S3 bucket and generate the reports off the S3 bucket using S3 byte range requests.
    2. Generate the reports by querying the synchronously replicated standby RDS MySQL instance maintained through Multi-AZ (Standby instance cannot be used as a scaling solution)
    3. Launch a RDS Read Replica connected to your Multi-AZ master database and generate reports by querying the Read Replica.
    4. Generate the reports by querying the ElastiCache database caching tier. (ElastiCache does not maintain full data and is simply a caching solution)
  2. Your company is getting ready to do a major public announcement of a social media site on AWS. The website is running on EC2 instances deployed across multiple Availability Zones with a Multi-AZ RDS MySQL Extra Large DB Instance. The site performs a high number of small reads and writes per second and relies on an eventual consistency model. After comprehensive tests you discover that there is read contention on RDS MySQL. Which are the best approaches to meet these requirements? (Choose 2 answers)
    1. Deploy ElastiCache in-memory cache running in each availability zone
    2. Implement sharding to distribute load to multiple RDS MySQL instances (this is only a read contention, the writes work fine)
    3. Increase the RDS MySQL Instance size and Implement provisioned IOPS (not scalable, this is only a read contention, the writes work fine)
    4. Add an RDS MySQL read replica in each availability zone
  3. Your company has HQ in Tokyo and branch offices all over the world and is using logistics software with a multi-regional deployment on AWS in Japan, Europe and US. The logistic software has a 3-tier architecture and currently uses MySQL 5.6 for data persistence. Each region has deployed its own database. In the HQ region you run an hourly batch process reading data from every region to compute cross-regional reports that are sent by email to all offices this batch process must be completed as fast as possible to quickly optimize logistics. How do you build the database architecture in order to meet the requirements?
    1. For each regional deployment, use RDS MySQL with a master in the region and a read replica in the HQ region
    2. For each regional deployment, use MySQL on EC2 with a master in the region and send hourly EBS snapshots to the HQ region
    3. For each regional deployment, use RDS MySQL with a master in the region and send hourly RDS snapshots to the HQ region
    4. For each regional deployment, use MySQL on EC2 with a master in the region and use S3 to copy data files hourly to the HQ region
    5. Use Direct Connect to connect all regional MySQL deployments to the HQ region and reduce network latency for the batch process
  4. Your business is building a new application that will store its entire customer database on a RDS MySQL database, and will have various applications and users that will query that data for different purposes. Large analytics jobs on the database are likely to cause other applications to not be able to get the query results they need to, before time out. Also, as your data grows, these analytics jobs will start to take more time, increasing the negative effect on the other applications. How do you solve the contention issues between these different workloads on the same data?
    1. Enable Multi-AZ mode on the RDS instance
    2. Use ElastiCache to offload the analytics job data
    3. Create RDS Read-Replicas for the analytics work
    4. Run the RDS instance on the largest size possible
  5. If I have multiple Read Replicas for my master DB Instance and I promote one of them, what happens to the rest of the Read Replicas?
    1. The remaining Read Replicas will still replicate from the older master DB Instance
    2. The remaining Read Replicas will be deleted
    3. The remaining Read Replicas will be combined to one read replica
  6. You need to scale an RDS deployment. You are operating at 10% writes and 90% reads, based on your logging. How best can you scale this in a simple way?
    1. Create a second master RDS instance and peer the RDS groups.
    2. Cache all the database responses on the read side with CloudFront.
    3. Create read replicas for RDS since the load is mostly reads.
    4. Create a Multi-AZ RDS install and route read traffic to the standby.
  7. A customer is running an application in US-West (Northern California) region and wants to setup disaster recovery failover to the Asian Pacific (Singapore) region. The customer is interested in achieving a low Recovery Point Objective (RPO) for an Amazon RDS multi-AZ MySQL database instance. Which approach is best suited to this need?
    1. Synchronous replication
    2. Asynchronous replication
    3. Route53 health checks
    4. Copying of RDS incremental snapshots
  8. A user is using a small MySQL RDS DB. The user is experiencing high latency due to the Multi AZ feature. Which of the below mentioned options may not help the user in this situation?
    1. Schedule the automated back up in non-working hours
    2. Use a large or higher size instance
    3. Use PIOPS
    4. Take a snapshot from standby Replica
  9. My Read Replica appears “stuck” after a Multi-AZ failover and is unable to obtain or apply updates from the source DB Instance. What do I do?
    1. You will need to delete the Read Replica and create a new one to replace it.
    2. You will need to disassociate the DB Engine and re-associate it.
    3. The instance should be deployed to Single AZ and then moved to Multi- AZ once again
    4. You will need to delete the DB Instance and create a new one to replace it.
  10. A company is running a batch analysis every hour on their main transactional DB, running on an RDS MySQL instance, to populate their central Data Warehouse running on Redshift. During the execution of the batch, their transactional applications are very slow. When the batch completes, they need to update the top management dashboard with the new data. The dashboard is produced by another system running on-premises that is currently started when a manually-sent email notifies that an update is required. The on-premises system cannot be modified because it is managed by another team. How would you optimize this scenario to solve performance issues and automate the process as much as possible?
    1. Replace RDS with Redshift for the batch analysis and SNS to notify the on-premises system to update the dashboard
    2. Replace RDS with Redshift for the batch analysis and SQS to send a message to the on-premises system to update the dashboard
    3. Create an RDS Read Replica for the batch analysis and SNS to notify the on-premises system to update the dashboard
    4. Create an RDS Read Replica for the batch analysis and SQS to send a message to the on-premises system to update the dashboard.


AWS RDS Multi-AZ Deployment

RDS Multi-AZ Instance Deployment

RDS Multi-AZ Deployment

  • RDS Multi-AZ deployments provide high availability and automatic failover support for DB instances
  • Multi-AZ helps improve the durability and availability of a critical system, enhancing availability during planned system maintenance, DB instance failure, and Availability Zone disruption.
  • A Multi-AZ DB instance deployment
    • has one standby DB instance that provides failover support but doesn’t serve read traffic.
    • There is only one row for the DB instance.
    • The value of Role is Instance or Primary.
    • The value of Multi-AZ is Yes.
  • A Multi-AZ DB cluster deployment
    • has two standby DB instances that provide failover support and can also serve read traffic.
    • There is a cluster-level row with three DB instance rows under it.
    • For the cluster-level row, the value of Role is Multi-AZ DB cluster.
    • For each instance-level row, the value of Role is Writer instance or Reader instance.
    • For each instance-level row, the value of Multi-AZ is 3 Zones.

RDS Multi-AZ DB Instance Deployment

  • RDS automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different AZ.
  • RDS performs an automatic failover to the standby, so that database operations can be resumed as soon as the failover is complete.
  • RDS Multi-AZ deployment maintains the same endpoint for the DB Instance after a failover, so the application can resume database operation without the need for manual administrative intervention.
  • Multi-AZ is a high-availability feature and NOT a scaling solution for read-only scenarios; a standby replica can’t be used to serve read traffic. To service read-only traffic, use a Read Replica.
  • Multi-AZ deployments for Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon technology, while SQL Server DB instances use SQL Server Mirroring.
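Since the endpoint is preserved across failover, the switch is effectively a DNS change behind a stable name. Below is a minimal, purely conceptual Python sketch of that behavior; the hostnames and the `dns` dictionary are hypothetical stand-ins for the real Route 53 records:

```python
# Conceptual sketch of RDS Multi-AZ failover: the client always connects to
# the same endpoint; failover just repoints the CNAME to the standby host.
# All names here are hypothetical -- this only models the DNS switch.

dns = {"mydb.example.rds.amazonaws.com": "primary-az1.internal"}

def resolve(endpoint: str) -> str:
    """Resolve the stable endpoint to the instance currently behind it."""
    return dns[endpoint]

def failover(endpoint: str, standby: str) -> None:
    """Simulate automatic failover: repoint the CNAME to the standby."""
    dns[endpoint] = standby

endpoint = "mydb.example.rds.amazonaws.com"
assert resolve(endpoint) == "primary-az1.internal"
failover(endpoint, "standby-az2.internal")
# The application keeps using the same endpoint; no connection-string change.
assert resolve(endpoint) == "standby-az2.internal"
```

The application never changes its connection string; after the CNAME repoint, reconnecting clients simply land on the promoted standby.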

RDS Multi-AZ Instance Deployment

RDS Multi-AZ DB Cluster Deployment

  • RDS Multi-AZ DB cluster deployment is a high-availability deployment mode of RDS with two readable standby DB instances.
  • RDS Multi-AZ DB cluster has a writer DB instance and two reader DB instances in three separate AZs in the same AWS Region.
  • With a Multi-AZ DB cluster, RDS semi-synchronously replicates data from the writer DB instance to both of the reader DB instances using the DB engine’s native replication capabilities.
  • Multi-AZ DB clusters provide high availability, increased capacity for read workloads, and lower write latency when compared to Multi-AZ DB instance deployments.
  • In the event of an outage, RDS manages failover from the writer DB instance to one of the reader DB instances. RDS does this based on which reader DB instance has the most recent change record.
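The failover-target selection described above can be sketched as picking the reader that has applied the most recent change record. A toy Python illustration (the reader names and `lsn` values are made up; RDS tracks this internally):

```python
# Conceptual sketch of Multi-AZ DB cluster failover target selection:
# RDS promotes the reader that has applied the most recent change record.
# The "lsn" (log sequence number) values here are invented for illustration.

readers = [
    {"name": "reader-az2", "lsn": 10452},
    {"name": "reader-az3", "lsn": 10467},
]

def pick_failover_target(readers):
    """Promote the reader with the highest applied change record."""
    return max(readers, key=lambda r: r["lsn"])["name"]

print(pick_failover_target(readers))  # reader-az3 has the most recent record
```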

RDS Multi-AZ DB Cluster

Multi-AZ DB Instance vs Multi-AZ DB Cluster

RDS Multi-AZ DB Instance vs DB Cluster

RDS Multi-AZ vs Read Replicas

RDS Multi-AZ vs Multi-Region vs Read Replicas

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company is deploying a new two-tier web application in AWS. The company has limited staff and requires high availability, and the application requires complex queries and table joins. Which configuration provides the solution for the company’s requirements?
    1. MySQL Installed on two Amazon EC2 Instances in a single Availability Zone (does not provide High Availability out of the box)
    2. Amazon RDS for MySQL with Multi-AZ
    3. Amazon ElastiCache (Just a caching solution)
    4. Amazon DynamoDB (Not suitable for complex queries and joins)
  2. What would happen to an RDS (Relational Database Service) multi-Availability Zone deployment if the primary DB instance fails?
    1. IP of the primary DB Instance is switched to the standby DB Instance.
    2. A new DB instance is created in the standby availability zone.
    3. The canonical name record (CNAME) is changed from primary to standby.
    4. The RDS (Relational Database Service) DB instance reboots.
  3. Will my standby RDS instance be in the same Availability Zone as my primary?
    1. Only for Oracle RDS types
    2. Yes
    3. Only if configured at launch
    4. No
  4. Is creating a Read Replica of another Read Replica supported?
    1. Only in certain regions
    2. Only with MySQL based RDS
    3. Only for Oracle RDS types
    4. No
  5. A user is planning to set up the Multi-AZ feature of RDS. Which of the below mentioned conditions won’t take advantage of the Multi-AZ feature?
    1. Availability zone outage
    2. A manual failover of the DB instance using Reboot with failover option
    3. Region outage
    4. When the user changes the DB instance’s server type
  6. When you run a DB Instance as a Multi-AZ deployment, the “_____” serves database writes and reads
    1. secondary
    2. backup
    3. stand by
    4. primary
  7. When running my DB Instance as a Multi-AZ deployment, can I use the standby for read or write operations?
    1. Yes
    2. Only with MSSQL based RDS
    3. Only for Oracle RDS instances
    4. No
  8. Read Replicas require a transactional storage engine and are only supported for the _________ storage engine
    1. OracleISAM
    2. MSSQLDB
    3. InnoDB
    4. MyISAM
  9. A user is configuring the Multi-AZ feature of an RDS DB. The user came to know that this RDS DB does not use the AWS technology, but uses server mirroring to achieve replication. Which DB is the user using right now?
    1. MySQL
    2. Oracle
    3. MS SQL
    4. PostgreSQL
  10. If you have chosen Multi-AZ deployment, in the event of a planned or unplanned outage of your primary DB Instance, Amazon RDS automatically switches to the standby replica. The automatic failover mechanism simply changes the ______ record of the main DB Instance to point to the standby DB Instance.
    1. DNAME
    2. CNAME
    3. TXT
    4. MX
  11. When automatic failover occurs, Amazon RDS will emit a DB Instance event to inform you that automatic failover occurred. You can use the _____ to return information about events related to your DB Instance
    1. FetchFailure
    2. DescriveFailure
    3. DescribeEvents
    4. FetchEvents
  12. The new DB Instance that is created when you promote a Read Replica retains the backup window period.
    1. TRUE
    2. FALSE
  13. Will I be alerted when automatic failover occurs?
    1. Only if SNS configured
    2. No
    3. Yes
    4. Only if Cloudwatch configured
  14. Can I initiate a “forced failover” for my MySQL Multi-AZ DB Instance deployment?
    1. Only in certain regions
    2. Only in VPC
    3. Yes
    4. No
  15. A user is accessing RDS from an application. The user has enabled the Multi-AZ feature with the MS SQL RDS DB. During a planned outage how will AWS ensure that a switch from DB to a standby replica will not affect access to the application?
    1. RDS will have an internal IP which will redirect all requests to the new DB
    2. RDS uses DNS to switch over to standby replica for seamless transition
    3. The switch over changes Hardware so RDS does not need to worry about access
    4. RDS will have both the DBs running independently and the user has to manually switch over
  16. Which of the following is part of the failover process for a Multi-AZ Amazon Relational Database Service (RDS) instance?
    1. The failed RDS DB instance reboots.
    2. The IP of the primary DB instance is switched to the standby DB instance.
    3. The DNS record for the RDS endpoint is changed from primary to standby.
    4. A new DB instance is created in the standby availability zone.
  17. Which of these is not a reason a Multi-AZ RDS instance will failover?
    1. An Availability Zone outage
    2. A manual failover of the DB instance was initiated using Reboot with failover
    3. To autoscale to a higher instance class (Refer link)
    4. Master database corruption occurs
    5. The primary DB instance fails
  18. How does Amazon RDS multi Availability Zone model work?
    1. A second, standby database is deployed and maintained in a different availability zone from master, using synchronous replication. (Refer link)
    2. A second, standby database is deployed and maintained in a different availability zone from master using asynchronous replication.
    3. A second, standby database is deployed and maintained in a different region from master using asynchronous replication.
    4. A second, standby database is deployed and maintained in a different region from master using synchronous replication.
  19. A user is using a small MySQL RDS DB. The user is experiencing high latency due to the Multi AZ feature. Which of the below mentioned options may not help the user in this situation?
    1. Schedule the automated back up in non-working hours
    2. Use a large or higher size instance
    3. Use PIOPS
    4. Take a snapshot from standby Replica
  20. What is the charge for the data transfer incurred in replicating data between your primary and standby?
    1. No charge. It is free.
    2. Double the standard data transfer charge
    3. Same as the standard data transfer charge
    4. Half of the standard data transfer charge
  21. A user has enabled the Multi AZ feature with the MS SQL RDS database server. Which of the below mentioned statements will help the user understand the Multi AZ feature better?
    1. In a Multi AZ, AWS runs two DBs in parallel and copies the data asynchronously to the replica copy
    2. In a Multi AZ, AWS runs two DBs in parallel and copies the data synchronously to the replica copy
    3. In a Multi AZ, AWS runs just one DB but copies the data synchronously to the standby replica
    4. AWS MS SQL does not support the Multi AZ feature

Choosing the Right Data Science Specialization: Where to Focus Your Skills

Choosing the Right Data Science Specialization: Where to Focus Your Skills

In the rapidly evolving world of technology, data science stands out as a field of endless opportunities and diverse pathways. With its foundations deeply rooted in statistics, computer science, and domain-specific knowledge, data science has become indispensable for organizations seeking to make data-driven decisions. However, the vastness of this field can be overwhelming, making specialization a strategic necessity for aspiring data scientists.

This article aims to navigate through the labyrinth of data science specializations, helping you align your career with your interests, skills, and the evolving demands of the job market.

Understanding the Breadth of Data Science

Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to draw knowledge and discover insights from structured and unstructured data. It encompasses a wide range of activities, from data collection and cleaning to complex algorithmic computations and predictive modeling.

Key Areas Within Data Science

  • Machine Learning: This involves creating algorithms that can learn from data and make predictions or decisions based on it.
  • Deep Learning: A specialized subdomain of machine learning, focusing on neural networks and algorithms inspired by the structure and function of the brain.
  • Data Engineering: This is the backbone of data science, focusing on the practical aspects of data collection, storage, and retrieval.
  • Data Visualization: It involves converting complex data sets into understandable and interactive graphical representations.
  • Big Data Analytics: This deals with extracting meaningful insights from very large, diverse data sets that are often beyond the capability of traditional data-processing applications.
  • AI and Robotics: This cutting-edge field combines data science with robotics, focusing on creating machines that can perform actions/operations that typically require human intelligence.

Interconnectivity of These Areas

While these specializations are distinct, they are interconnected. For instance, data engineering is foundational for machine learning, and AI applications often rely on insights derived from big data analytics.

Factors to Consider When Choosing a Specialization

  • Personal Interests and Strengths
    • Your choice should resonate with your personal interests. If you are fascinated by how algorithms can mimic human learning, deep learning could be your calling. Alternatively, if you enjoy the challenges of handling and organizing large data sets, data engineering might suit you.
  • Industry Demand and Job Market Trends
    • It’s crucial to align your specialization with the market demand. Fields like AI and machine learning are rapidly growing and offer numerous job opportunities. Tracking industry trends can provide valuable insights into which specializations are most in demand.
  • Long-term Career Goals
    • Consider where you want to be in your career in the next five to ten years. Some specializations may offer more opportunities for growth, leadership roles, or transitions into different areas of data science.
  • Impact of Emerging Technologies
    • Emerging technologies can redefine the landscape of data science. Staying continuously updated about these changes can help you choose a specialization that remains relevant in the future.

Deep Dive into Popular Data Science Specializations

  • Machine Learning
    • Overview and Applications: From predictive modeling in finance to recommendation systems in e-commerce, machine learning is revolutionizing various industries.
    • Required Skills and Tools: Proficiency in programming languages like Python or R, understanding of algorithms, and familiarity with machine learning frameworks like TensorFlow or Scikit-learn are essential.
  • Data Engineering
    • Role in Data Science: Data engineers build and maintain the infrastructure that allows data scientists to analyze and utilize data effectively.
    • Key Skills and Technologies: Skills in database management, ETL (Extract, Transform, Load) processes, and knowledge of SQL, NoSQL, Hadoop, and Spark are crucial.
  • Big Data Analytics
    • Understanding Big Data: This specialization deals with examining extremely large data sets to discover patterns, trends, and associations, particularly relating to human behavior and interactions.
    • Tools and Techniques: Familiarity with big data platforms like Apache Hadoop and Spark, along with data mining and statistical analysis, is important.
  • AI and Robotics
    • The Frontier of Data Science: This field is at the cutting edge, developing intelligent systems capable of performing tasks that typically require human intelligence.
    • Skills and Knowledge Base: A deep understanding of AI principles, programming, and robotics is necessary, along with skills in machine learning and neural networks.

Educational Pathways for Each Specialization

  • Academic Courses and Degrees
    • Pursuing a formal education in data science or a related field can provide a strong theoretical foundation. Many universities like MIT now offer specialized courses in machine learning, AI, and big data analytics, like the Data Analysis Certificate program.
  • Online Courses and Bootcamps
    • Online platforms like Great Learning offer specialized courses that are more flexible and often industry-oriented. Bootcamps, on the other hand, provide intensive, hands-on training in specific areas of data science.
  • Certifications and Workshops
    • Professional certifications from recognized bodies can add significant value to your resume. Educational choices like the Data Science course showcase your expertise and commitment to professional development.
  • Self-learning Resources
    • The internet is replete with resources for self-learners. From online tutorials and forums to webinars and eBooks, the opportunities for self-paced learning in data science are abundant.

Building Experience in Your Chosen Specialization

  • Internships and Entry-level Positions
    • Gaining practical experience is crucial. Internships and entry-level positions provide real-world experience and help you understand the practical challenges and applications of your chosen specialization.
  • Personal and Open-source Projects
    • Working on personal data science projects or contributing to open-source projects can be a great way to apply your skills. These projects can also be a valuable addition to your portfolio.
  • Networking and Community Involvement
    • Building a professional network and participating in data science communities can lead to job opportunities and collaborations. Attending industry conferences and seminars is also a great way to stay updated and connected.
  • Industry Conferences and Seminars
    • These events are excellent for learning about the latest industry trends, best data science practices, and emerging technologies. They also offer opportunities to meet industry leaders and peers.

Future Trends and Evolving Specializations

  • Predicting the Future of Data Science
    • The field of data science is constantly evolving. Staying informed about future trends is crucial for choosing a specialization that will remain relevant and in demand.
  • Emerging Specializations and Technologies
    • Areas like quantum computing, edge analytics, and ethical AI are emerging as new frontiers in data science. These fields are likely to offer exciting new opportunities for specialization in the coming years.
  • Staying Adaptable and Continuous Learning
    • The key to a successful career in data science is adaptability and a commitment to continuous learning. The field is dynamic, and staying abreast of new developments is essential.

Conclusion

Choosing the right data science specialization is a critical decision that can shape your career trajectory. It requires a careful consideration of your personal interests, the current job market, and future industry trends. Whether your passion lies in the intricate algorithms of machine learning, the structural challenges of data engineering, or the innovative frontiers of AI and robotics, there is a niche for every aspiring data scientist. The journey is one of continuous learning, adaptability, and an unwavering curiosity about the power of data. As the field continues to grow and diversify, the opportunities for data scientists are bound to expand, offering a rewarding and dynamic career path.


Kubernetes and Cloud Native Associate KCNA Exam Learning Path

Kubernetes and Cloud Native Associate KCNA Exam Learning Path

I recently certified for the Kubernetes and Cloud Native Associate – KCNA exam.

  • KCNA exam focuses on a user’s foundational knowledge and skills in Kubernetes and the wider cloud native ecosystem.
  • KCNA exam is intended to prepare candidates to work with cloud-native technologies and pursue further CNCF credentials, including CKA, CKAD, and CKS.
  • KCNA validates the conceptual knowledge of
    • the entire cloud native ecosystem, particularly focusing on Kubernetes.
    • Kubernetes and cloud-native technologies, including how to deploy an application using basic kubectl commands, the architecture of Kubernetes (containers, pods, nodes, clusters), understanding the cloud-native landscape and projects (storage, networking, GitOps, service mesh), and understanding the principles of cloud-native security.

KCNA Exam Pattern

  • KCNA exam curriculum includes these general domains and their weights on the exam:
    • Kubernetes Fundamentals – 46%
    • Container Orchestration – 22%
    • Cloud Native Architecture – 16%
    • Cloud Native Observability – 8%
    • Cloud Native Application Delivery – 8%
  • KCNA exam requires you to solve 60 questions in 90 minutes.
  • Exam questions can be attempted in any order and don’t have to be sequential. So be sure to move ahead and come back later.
  • Time is more than sufficient if you are well prepared. I was able to get through the exam within an hour.

KCNA Exam Preparation and Tips

  • I used the KodeKloud KCNA course for practice, and it is good enough to cover what is required for the exam.

KCNA Resources

KCNA Key Topics

Kubernetes Fundamentals

Kubernetes Architecture

  • Kubernetes is a highly popular open-source container orchestration platform that can be used to automate deployment, scaling, and the management of containerized workloads.
  • Kubernetes Architecture
    • A Kubernetes cluster consists of at least one main (control) plane, and one or more worker machines, called nodes.
    • Both the control planes and node instances can be physical devices, virtual machines, or instances in the cloud.
  • ETCD (key-value store)
    • Etcd is a consistent, distributed, and highly-available key-value store.
    • is stateful, persistent storage that stores all of Kubernetes cluster data (cluster state and config).
    • is the source of truth for the cluster.
    • can be part of the control plane, or, it can be configured externally.
  • Kubernetes API
    • API server exposes a REST interface to the Kubernetes cluster. It is the front end for the Kubernetes control plane.
    • All operations against Kubernetes objects are programmatically executed by communicating with the endpoints provided by it.
    • It tracks the state of all cluster components and manages the interaction between them.
    • It is designed to scale horizontally.
    • It consumes YAML/JSON manifest files.
    • It validates and processes the requests made via API.
  • Scheduling
    • The scheduler is responsible for assigning work to the various nodes. It keeps watch over the resource capacity and ensures that a worker node’s performance is within an appropriate threshold.
    • It schedules pods to worker nodes.
    • It watches api-server for newly created Pods with no assigned node, and selects a healthy node for them to run on.
    • If there are no suitable nodes, the pods are put in a pending state until such a healthy node appears.
    • It watches API Server for new work tasks.
    • Factors taken into account for scheduling decisions include:
      • Individual and collective resource requirements.
      • Hardware/software/policy constraints.
      • Affinity and anti-affinity specifications.
      • Data locality.
      • Inter-workload interference.
      • Deadlines and taints.
  • Controller Manager
    • Controller manager is responsible for making sure that the shared state of the cluster is operating as expected.
    • It watches the desired state of the objects it manages and watches their current state through the API server.
    • It takes corrective steps to make sure that the current state is the same as the desired state.
    • It is a controller of controllers.
    • It runs controller processes. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.
  • Kubelet
    • A Kubelet tracks the state of a pod to ensure that all the containers are running and healthy
    • provides a heartbeat message every few seconds to the control plane.
    • runs as an agent on each node in the cluster.
    • acts as a conduit between the API server and the node.
    • instantiates and executes Pods.
    • watches API Server for work tasks.
    • gets instructions from master and reports back to Masters.
  • Kube-proxy
    • Kube proxy is a networking component that routes traffic coming into a node from the service to the correct containers.
    • is a network proxy that runs on each node in a cluster.
    • manages IP translation and routing.
    • maintains network rules on nodes. These network rules allow network communication to Pods from inside or outside of the cluster.
    • ensures each Pod gets a unique IP address.
    • makes it possible for all containers in a pod to share a single IP.
    • facilitates Kubernetes networking services and load-balancing across all pods in a service.
    • It deals with individual host subnetting and ensures that the services are available to external parties.
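The scheduler's behavior described above (filter out infeasible nodes, score the remainder, leave the pod Pending if nothing fits) can be sketched in a few lines of Python. This is only a conceptual model: the node data and the scoring rule are invented, and the real kube-scheduler uses many pluggable filters and scorers:

```python
# Toy sketch of the scheduler's two phases: filter out nodes that can't fit
# the pod, then score the remainder (here: most free CPU wins).
# Node capacities are fabricated; this only shows the shape of the loop.

nodes = [
    {"name": "node-a", "free_cpu": 500, "free_mem": 1024},
    {"name": "node-b", "free_cpu": 2000, "free_mem": 4096},
    {"name": "node-c", "free_cpu": 100, "free_mem": 512},
]

def schedule(pod_cpu: int, pod_mem: int, nodes):
    """Return the name of the chosen node, or None -> pod stays Pending."""
    feasible = [n for n in nodes
                if n["free_cpu"] >= pod_cpu and n["free_mem"] >= pod_mem]
    if not feasible:
        return None  # no suitable node: the pod is left in the Pending state
    return max(feasible, key=lambda n: n["free_cpu"])["name"]

print(schedule(400, 1024, nodes))   # node-b
print(schedule(4000, 8192, nodes))  # None -> Pending
```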

Kubernetes Resources

  • Kubernetes Resources
    • Nodes manage and run pods; it’s the machine (whether virtualized or physical) that performs the given work.
    • Namespaces
      • provide a mechanism for isolating groups of resources within a single cluster.
      • Kubernetes starts with four initial namespaces:
        • default – default namespace for objects with no other namespace.
        • kube-system – namespace for objects created by the Kubernetes system.
        • kube-public – namespace is created automatically and is readable by all users (including those not authenticated).
        • kube-node-lease – namespace holds Lease objects associated with each node. Node leases allow the kubelet to send heartbeats so that the control plane can detect node failure.
      • Resource Quotas can be defined for each namespace to limit the resources consumed.
      • Resources within the namespaces can refer to each other with their service names.
    • Pods
      • is a group of containers and is the smallest unit that Kubernetes administers.
      • Containers in a pod share the same resources such as memory and storage.
    • ReplicaSet
      • ensures a stable set of replica Pods running at any given time.
      • helps guarantee the availability of a specified number of identical Pods.
    • Deployments
      • provide declarative updates for Pods and ReplicaSets.
      • describe the number of desired identical pod replicas to run and the preferred update strategy used when updating the deployment.
      • supports Rolling Update and Recreate update strategy.
    • Services
      • is an abstraction over the pods, and essentially, the only interface the various application consumers interact with.
      • exposes a single machine name or IP address mapped to pods whose underlying names and numbers are unreliable.
      • supports the following types
        • ClusterIP
        • NodePort
        • Load Balancer
    • Ingress
      • exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.
    • DaemonSet
      • ensures that all (or some) Nodes run a copy of a Pod.
      • ensures pods are added to the newly created nodes and garbage collected as nodes are removed.
    • StatefulSet
      • is ideal for stateful applications using ReadWriteOnce volumes.
      • designed to deploy stateful applications and clustered applications that save data to persistent storage, such as persistent disks.
    • ConfigMaps
      • helps to store non-confidential data in key-value pairs.
      • can be consumed by pods as environment variables, command-line arguments, or configuration files in a volume.
    • Secrets
      • provides a container for sensitive data such as a password without putting the information in a Pod specification or a container image.
      • are not encrypted but only base64 encoded.
    • Jobs & CronJobs
      • A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate.
      • A CronJob creates Jobs on a repeating schedule.
    • Volumes
      • supports Persistent volumes that exist beyond the lifetime of a pod.
      • When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes.
      • PersistentVolume (PV) is a cluster scoped piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
      • PersistentVolumeClaim (PVC) is a request for storage by a user.
    • Labels and Annotations attach metadata to objects in Kubernetes.
      • Labels are identifying key/value pairs that can be attached to Kubernetes objects and are used in conjunction with selectors to identify groups of related resources.
      • Annotations are key/value pairs designed to hold non-identifying information that can be leveraged by tools and libraries.
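The ReplicaSet guarantee above boils down to a reconciliation loop: compare the desired state with the observed state and converge. A toy Python sketch (pod names are fabricated; a real controller watches and writes through the API server):

```python
# Minimal sketch of the reconciliation idea behind a ReplicaSet: compare the
# desired replica count with the pods that actually exist and converge.

def reconcile(desired: int, pods: list) -> list:
    """One reconcile pass: create or delete pods to match the desired count."""
    pods = list(pods)
    while len(pods) < desired:
        pods.append(f"web-{len(pods)}")   # scale up: create a missing pod
    while len(pods) > desired:
        pods.pop()                        # scale down: remove a surplus pod
    return pods

print(reconcile(3, ["web-0"]))                    # ['web-0', 'web-1', 'web-2']
print(reconcile(1, ["web-0", "web-1", "web-2"]))  # ['web-0']
```

Real controllers run this loop continuously, so the cluster self-heals: if a pod dies, the next pass recreates it.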

Container Orchestration

  • Container Orchestration Fundamentals
    • Containers help manage the dependencies of an application and run much more efficiently than spinning up a lot of virtual machines.
    • While virtual machines emulate a complete machine, including the operating system and a kernel, containers share the kernel of the host machine and are only isolated processes.
  • Virtual machines come with some overhead, be it boot time, size, or the resources needed to run the operating system. Containers, on the other hand, are just processes (like a browser), so they start much faster and have a smaller footprint.
  • Runtime
    • Container runtime is responsible for running containers (in Pods).
    • Kubernetes supports any implementation of the Kubernetes Container Runtime Interface (CRI) specification.
    • To run the containers, each worker node has a container runtime engine.
    • It pulls images from a container image registry and starts and stops containers.
    • Kubernetes supports several container runtimes:
      • Docker – the standard for a long time, but its use as the Kubernetes runtime was deprecated and removed in Kubernetes 1.24
      • containerd – the most popular lightweight and performant implementation to run containers, used by all major cloud providers for their Kubernetes-as-a-Service products.
      • CRI-O – CRI-O was created by Red Hat and with a similar code base closely related to podman and buildah.
      • gvisor – Made by Google, provides an application kernel that sits between the containerized process and the host kernel.
      • Kata Containers – A secure runtime that provides a lightweight virtual machine, but behaves like a container
    • Security
      • 4C’s of Cloud Native security are Cloud, Clusters, Containers, and Code.
      • Containers started on a machine always share the same kernel, which becomes a risk for the whole system if containers are allowed to call kernel functions, for example killing other processes or modifying the host network by creating routing rules.
      • Kubernetes provides several security features:
        • Authentication using Users & Certificates
          • Certificates are the recommended way
          • Service accounts can be used to provide bearer tokens to authenticate with Kubernetes API.
        • Authorization using Node, ABAC, RBAC, Webhooks
          • Role-based access control is the most secure and recommended authorization mechanism in Kubernetes.
        • Admission Controllers intercept requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized.
        • Security Context helps define privileges and access control settings for a Pod or Container, including settings such as the user and group to run as, privilege escalation, and Linux capabilities.
        • Service Meshes like Istio and Linkerd can help implement mTLS (mutual TLS) for intra-cluster pod-to-pod communication.
        • Network Policies help specify how a pod is allowed to communicate with various network “entities” over the network.
        • Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of actions in a cluster for activities generated by users, by applications that use the Kubernetes API, and by the control plane itself.
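To make the authorization mechanism above concrete, here is a minimal, illustrative sketch of RBAC-style permission checking. The roles, bindings, and subjects are invented for the example; this is not how Kubernetes implements RBAC internally, only the core idea of roles granting verb/resource pairs to subjects.

```python
# Simplified sketch of RBAC-style authorization (illustrative only).
# A role is modeled as a set of (verb, resource) pairs it permits.
roles = {
    "pod-reader": {("get", "pods"), ("list", "pods"), ("watch", "pods")},
    "deployer": {("create", "deployments"), ("update", "deployments")},
}

# Role bindings grant roles to subjects (users or service accounts).
bindings = {
    "alice": ["pod-reader"],
    "ci-bot": ["pod-reader", "deployer"],
}

def is_allowed(subject: str, verb: str, resource: str) -> bool:
    """Return True if any role bound to the subject permits the action."""
    return any(
        (verb, resource) in roles[role]
        for role in bindings.get(subject, [])
    )

print(is_allowed("alice", "get", "pods"))            # True
print(is_allowed("alice", "create", "deployments"))  # False
```

In Kubernetes terms, `roles` corresponds to Role/ClusterRole objects and `bindings` to RoleBinding/ClusterRoleBinding objects.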
    • Networking
      • Container Network Interface (CNI) is a standard that can be used to write or configure network plugins and makes it very easy to swap out different plugins in various container orchestration platforms.
      • Kubernetes networking addresses four concerns:
        • Containers within a Pod use networking to communicate via loopback.
        • Cluster networking provides communication between different Pods.
        • Service API helps expose an application running in Pods to be reachable from outside your cluster.
          • Ingress provides extra functionality specifically for exposing HTTP applications, websites, and APIs.
          • Gateway API is an add-on that provides an expressive, extensible, and role-oriented family of API kinds for modeling service networking.
        • Services can also be used to publish services only for consumption inside the cluster.
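As a rough illustration of the Service idea above — a stable endpoint that balances traffic across a changing set of Pods — here is a toy round-robin sketch. The Pod IPs are invented, and kube-proxy actually implements this with iptables/IPVS rules rather than application code.

```python
import itertools

# A Service is conceptually a stable virtual endpoint in front of a
# (changing) set of Pod IPs. This sketch balances with round-robin.
pod_ips = ["10.0.1.5", "10.0.2.7", "10.0.3.9"]
backend = itertools.cycle(pod_ips)

def route_request() -> str:
    """Pick the next backend Pod for an incoming request."""
    return next(backend)

print([route_request() for _ in range(4)])
# ['10.0.1.5', '10.0.2.7', '10.0.3.9', '10.0.1.5']
```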
    • Service Mesh
      • Service Mesh is a dedicated infrastructure layer added alongside applications that allows you to transparently add capabilities without adding them to your own code.
      • Service Mesh provides capabilities like service discovery, load balancing, failure recovery, metrics and monitoring, as well as complex operational requirements like A/B testing, canary deployments, rate limiting, access control, encryption, and end-to-end authentication.
      • Service mesh uses a proxy to intercept all your network traffic, allowing a broad set of application-aware features based on the configuration you set.
      • Istio is an open source service mesh that layers transparently onto existing distributed applications.
      • An Envoy proxy is deployed along with each service that you start in the cluster, or runs alongside services running on VMs.
      • Istio provides
        • Secure service-to-service communication in a cluster with TLS encryption, strong identity-based authentication and authorization
        • Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic
        • Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection
        • A pluggable policy layer and configuration API supporting access controls, rate limits and quotas
        • Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress
    • Storage
      • Container images are read-only and consist of different layers that include everything added during the build phase ensuring that a container from an image provides the same behavior and functionality.
      • To allow writing files, a read-write layer is put on top of the container image when you start a container from an image.
      • Container on-disk files are ephemeral and lost if the container crashes.
      • Container Storage Interface (CSI) provides a uniform and standardized interface that allows attaching different storage systems no matter if it’s cloud or on-premises storage.
      • Kubernetes supports Persistent volumes that exist beyond the lifetime of a pod. When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes.
      • Persistent Volumes are supported using two API resources:
        • PersistentVolume (PV)
          • is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
          • is a cluster-level resource and not bound to a namespace
          • are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV.
        • PersistentVolumeClaim (PVC)
          • is a request for storage by a user.
          • is similar to a Pod.
          • Pods consume node resources and PVCs consume PV resources.
          • Pods can request specific levels of resources (CPU and Memory).
          • Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, see AccessModes).
      • Persistent Volumes can be provisioned
        • Statically – the cluster administrator creates PVs, which are available for use by cluster users.
        • Dynamically – using StorageClasses, where the cluster may try to dynamically provision a volume specifically for the PVC.
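The static provisioning flow above can be sketched as a simple matching problem: a claim binds to an unbound PV with enough capacity and a compatible access mode. This is a heavily simplified illustration with invented PV names; the real Kubernetes binder also considers storage classes, volume modes, and node affinity.

```python
# Illustrative PV/PVC binding sketch (not the real Kubernetes binder).
pvs = [
    {"name": "pv-small", "capacity_gi": 5,  "access_modes": {"ReadWriteOnce"}, "bound": False},
    {"name": "pv-large", "capacity_gi": 50, "access_modes": {"ReadWriteOnce", "ReadOnlyMany"}, "bound": False},
]

def bind_claim(request_gi: int, access_mode: str):
    """Bind the claim to the smallest unbound PV that satisfies it."""
    candidates = [
        pv for pv in pvs
        if not pv["bound"]
        and pv["capacity_gi"] >= request_gi
        and access_mode in pv["access_modes"]
    ]
    if not candidates:
        return None  # the claim stays Pending until a suitable PV appears
    best = min(candidates, key=lambda pv: pv["capacity_gi"])
    best["bound"] = True
    return best["name"]

print(bind_claim(10, "ReadWriteOnce"))  # pv-large
```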

Cloud Native Architecture

  • Cloud Native Architecture Fundamentals
    • Cloud native architecture guides us to optimize the software for scalability, high availability, cost efficiency, reliability, security, and faster time-to-market by using a combination of cultural, technological, and architectural design patterns.
    • Cloud native architecture includes containers, service meshes, microservices, immutable infrastructure, and declarative APIs.
    • Cloud native techniques enable loosely coupled systems that are resilient, manageable, and observable.
  • Microservices
    • Microservices are small, independent applications with a clearly defined scope of functions and responsibilities.
    • Microservices help break down an application into multiple decoupled applications that communicate with each other over a network and are more manageable.
    • Microservices enable multiple teams to own different functions of the application.
    • Microservices also enable functions to be operated and scaled individually.
  • Autoscaling
    • The autoscaling pattern provides the ability to dynamically adjust resources based on current demand, without over- or under-provisioning.
    • Autoscaling can be performed using
      • Horizontal scaling – Adds new compute resources which can be new copies of the application, Virtual Machines, or physical servers.
      • Vertical scaling – Adds more resources to the existing underlying hardware.
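For horizontal scaling, the Kubernetes Horizontal Pod Autoscaler derives the target replica count from the ratio of the current metric value to the desired one; a minimal sketch of that calculation:

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Horizontal scaling decision as used by the Kubernetes HPA:
    desired = ceil(current_replicas * current_metric / target_metric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 replicas averaging 90% CPU against a 60% target -> scale out to 6.
print(desired_replicas(4, 90, 60))  # 6
```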
  • Serverless
    • Serverless allows you to just focus on the code while the cloud provider takes care of the underlying resources required to execute the code.
    • Most cloud providers provide this feature as Function as a Service (FaaS) like AWS Lambda, GCP Cloud Functions, etc.
    • Serverless enables on-demand provisioning and scaling of the applications with a pay-as-you-use model.
    • CloudEvents aims to standardize serverless and event-driven architectures on multiple platforms.
      • It provides a specification of how event data should be structured.
      • Events are the basis for scaling serverless workloads or triggering corresponding functions.
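A CloudEvents-conformant event is just structured data: the v1.0 spec requires the id, source, specversion, and type context attributes. A minimal sketch in Python follows; the event type, source URI, and payload here are invented for illustration.

```python
import json
import uuid
from datetime import datetime, timezone

# Minimal event following the CloudEvents v1.0 structure. Required
# context attributes: id, source, specversion, type. The time and
# data attributes are optional; the payload below is invented.
event = {
    "specversion": "1.0",
    "type": "com.example.order.created",  # hypothetical event type
    "source": "/orders/service",          # hypothetical source URI
    "id": str(uuid.uuid4()),
    "time": datetime.now(timezone.utc).isoformat(),
    "datacontenttype": "application/json",
    "data": {"orderId": 1234},
}

print(json.dumps(event, indent=2))
```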
  • Community and Governance
    • Open source projects hosted and supported by the CNCF are categorized according to maturity and go through a sandbox and incubation stage before graduating.
    • CNCF Technical Oversight Committee – TOC
      • is responsible for defining and maintaining the technical vision, approving new projects, accepting feedback from the end-user committee, and defining common practices that should be implemented in CNCF projects.
      • does not control the projects, but encourages them to be self-governing and community owned and practices the principle of “minimal viable governance”.
    • CNCF Project Maturity Levels
      • Sandbox Stage
        • Entry point for early stage projects.
      • Incubating Stage
        • Projects meeting the sandbox stage requirements, plus full technical due diligence performed, including documentation, a healthy number of committers and contributions, a clear versioning scheme, documented security processes, and at least one public reference implementation.
      • Graduation Stage
        • Projects meeting the incubation stage criteria, plus committers from at least two organizations, a well-defined project governance and committer process, a maintained Core Infrastructure Initiative Best Practices Badge, a third-party security audit, a public list of project adopters, and a supermajority vote from the TOC.
  • Personas
    • SRE, Security, Cloud, DevOps, and Containers have opened up a lot of different Cloud Native roles
      • Cloud Engineer & Architect
      • DevOps Engineer
      • Security Engineer
      • DevSecOps Engineer
      • Data Engineer
      • Full-Stack Developer
      • Site Reliability Engineer (SRE)
    • Site Reliability Engineer – SRE
      • Pioneered at Google around 2003, SRE has become an important role in many organizations.
      • SRE’s goal is to create and maintain software that is reliable and scalable.
      • To measure performance and reliability, SREs use three main metrics:
        • Service Level Objectives – SLO: Specify a target level for the reliability of your service.
        • Service Level Indicators – SLI: A carefully defined quantitative measure of some aspect of the level of service that is provided
        • Service Level Agreements – SLA: An explicit or implicit contract with your users that includes consequences of meeting (or missing) the SLOs they contain.
      • Around these metrics, SREs might define an error budget. An error budget defines the amount (or time) of errors the application can have before actions are taken, like stopping deployments to production.
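The error budget follows directly from the SLO: whatever fraction of the window the SLO does not promise is the budget. A quick calculation:

```python
# Sketch: the error budget implied by an availability SLO.
# A 99.9% SLO over 30 days leaves 0.1% of that time as budget.

def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability within the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo_percent / 100)

print(round(error_budget_minutes(99.9), 1))   # 43.2
print(round(error_budget_minutes(99.99), 1))  # 4.3
```

So a team running at 99.9% availability can "spend" roughly 43 minutes of downtime per 30 days before actions like freezing production deployments kick in.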
  • Open Standards
    • Open Standards help provide a standardized way to build, package, run, and ship modern software.
    • Open standards cover:
      • Open Container Initiative (OCI) Spec: image, runtime, and distribution specifications for how to build, run, and distribute containers.
      • Container Network Interface (CNI): A specification on how to implement networking for Containers.
      • Container Runtime Interface (CRI): A specification on how to implement container runtimes in container orchestration systems.
      • Container Storage Interface (CSI): A specification on how to implement storage in container orchestration systems.
      • Service Mesh Interface (SMI): A specification on how to implement Service Meshes in container orchestration systems with a focus on Kubernetes.
    • OCI provides open industry standards for container technologies and defines
      • Image-spec defines how to build and package container images.
      • Runtime-spec specifies the configuration, execution environment, and lifecycle of containers.
      • Distribution-spec provides a standard for the distribution of content in general and container images in particular.

Cloud Native Observability

  • Telemetry & Observability
    • Telemetry is the process of measuring and collecting data points and then transferring them to another system.
    • Observability is the ability to understand the state of a system or application by examining its outputs, logs, and performance metrics.
    • It’s a measure of how well the internal states of a system can be inferred from knowledge of its external outputs.
    • Observability mainly consists of
      • Logs: Messages emitted by an application describing events and its interactions with the external world.
      • Metrics: Quantitative measurements with numerical values describing service or component behavior over time
      • Traces: Records the progression of the request while passing through multiple distributed systems.
        • Trace consists of Spans, which can include information like start and finish time, name, tags, or a log message.
        • Traces can be stored and analyzed in a tracing system like Jaeger.
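A span is essentially a named, timed record; a toy sketch follows. Real tracing clients such as the Jaeger or OpenTelemetry SDKs also record trace and span IDs and parent-span links, which are omitted here.

```python
import time
from dataclasses import dataclass, field

# Illustrative span structure (heavily simplified).
@dataclass
class Span:
    name: str
    start: float = 0.0
    finish: float = 0.0
    tags: dict = field(default_factory=dict)

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def __exit__(self, *exc):
        self.finish = time.monotonic()

    @property
    def duration(self) -> float:
        return self.finish - self.start

# Record the timing of a unit of work as a span.
with Span("db-query", tags={"db": "orders"}) as span:
    time.sleep(0.01)  # simulated work

print(f"{span.name} took {span.duration:.3f}s")
```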
    • OpenTelemetry
      • is a set of APIs, SDKs, and tools that can be used to integrate telemetry such as metrics, logs, and especially traces into applications and infrastructure.
      • OpenTelemetry clients can be used to export telemetry data in a standardized format to central platforms like Jaeger.
  • Prometheus
    • Prometheus is a popular, open-source monitoring system.
    • Prometheus can collect metrics that were emitted by applications and servers as time series data
    • The Prometheus data model provides four core metric types:
      • Counter: A value that increases, like a request or error count
      • Gauge: Values that increase or decrease, like memory size
      • Histogram: A sample of observations, like request duration or response size
      • Summary: Similar to a histogram, but also calculates configurable quantiles over a sliding time window.
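Toy versions of two of these metric types make the semantics concrete. This is illustrative only; real applications would use the official prometheus_client library rather than hand-rolled classes.

```python
import bisect

class Counter:
    """A monotonically increasing value, e.g. total requests served."""
    def __init__(self):
        self.value = 0.0
    def inc(self, amount: float = 1.0):
        if amount < 0:
            raise ValueError("counters can only go up")
        self.value += amount

class Histogram:
    """Counts observations into cumulative buckets, plus sum and count.
    (A real Prometheus histogram also has an implicit +Inf bucket.)"""
    def __init__(self, buckets=(0.1, 0.5, 1.0, 5.0)):
        self.buckets = sorted(buckets)
        self.bucket_counts = [0] * len(self.buckets)
        self.count = 0
        self.sum = 0.0
    def observe(self, value: float):
        self.count += 1
        self.sum += value
        # increment every bucket whose upper bound covers the value
        for i in range(bisect.bisect_left(self.buckets, value), len(self.buckets)):
            self.bucket_counts[i] += 1

requests = Counter()
requests.inc()
latency = Histogram()
latency.observe(0.3)
print(requests.value, latency.bucket_counts)  # 1.0 [0, 1, 1, 1]
```

The cumulative buckets are what PromQL's `histogram_quantile` function operates on when estimating quantiles server-side.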
    • Prometheus provides PromQL (Prometheus Query Language) to query data stored in the Time Series Database (TSDB).
    • Prometheus integrates with Grafana, which can be used to build visualization and dashboards from the collected metrics.
    • Prometheus integrates with Alertmanager to configure alerts when certain metrics reach or pass a threshold.
  • Cost Management
    • All the cloud providers work on a pay-as-you-use model.
    • Cost optimization can be performed by analyzing what is really needed, how long, and scaling dynamically as per the needs.
    • Some of the cost optimization techniques include
      • Right-sizing workloads and scaling them dynamically per demand
      • Identifying wasted or unused resources and applying proper archival techniques
      • Using Reserved or Spot instances as appropriate for the workload
      • Defining proper budgets and alerts

Cloud Native Application Delivery

  • Application Delivery Fundamentals
    • Application delivery includes the application lifecycle right from source code, versioning, building, testing, packaging, and deployments.
    • The old process included a lot of error-prone manual steps and the constant fear that something would break.
    • DevOps process includes both the developers and administrators and focuses on frequent, error-free, repeatable, rapid deployments.
    • Version control systems like Git provide a decentralized system that can be used to track changes in the source code.
  • CI/CD
    • Continuous Integration/Continuous Delivery (CI/CD) provides very fast, more frequent, and higher quality software rollouts with automated builds, tests, code quality checks, and deployments.
      • Continuous Integration focuses on building and testing the written code. High automation and usage of version control allow multiple developers and teams to work on the same code base.
      • Continuous Delivery focuses on automated deployment of the pre-built software.
    • CI/CD tools include Jenkins, Spinnaker, Gitlab, ArgoCD, etc.
    • CI/CD can be performed using two different approaches
      • Push-based
        • The pipeline is started and runs tools that make the changes in the platform. Changes can be triggered by a commit or merge request.
      • Pull-based
        • An agent watches the git repository for changes and compares the definition in the repository with the actual running state.
        • If changes are detected, the agent applies the changes to the infrastructure.
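The pull-based loop above boils down to a diff between desired and actual state; a simplified sketch follows. The resource names and replica counts are invented, and real GitOps agents like Flux and ArgoCD reconcile full Kubernetes manifests, not replica counts.

```python
# Sketch of the reconcile step at the heart of a pull-based GitOps agent.
desired_state = {"web": 3, "api": 2}                 # "read from git"
actual_state = {"web": 3, "api": 1, "legacy": 1}     # observed in cluster

def reconcile(desired: dict, actual: dict) -> list:
    """Compute the actions needed to converge actual onto desired."""
    actions = []
    for name, replicas in desired.items():
        if actual.get(name) != replicas:
            actions.append(f"scale {name} to {replicas}")
    for name in actual:
        if name not in desired:
            actions.append(f"delete {name}")
    return actions

print(reconcile(desired_state, actual_state))
# ['scale api to 2', 'delete legacy']
```

In a running agent this function would be called in a loop, so the cluster continuously converges to whatever the git repository declares.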
  • GitOps
    • Infrastructure as Code with tools like Terraform provides complete automation with versioning and better controls, increasing the quality and speed of provisioning infrastructure.
    • GitOps takes the idea of Git as the single source of truth a step further and integrates the provisioning and change process of infrastructure with version control operations.
    • GitOps frameworks that use the pull-based approach include Flux and ArgoCD.
      • ArgoCD is implemented as a Kubernetes controller.
      • Flux is built with the GitOps Toolkit.

KCNA General information and practices

  • The exam can be taken online from anywhere.
  • Make sure you have prepared your workspace well before the exam.
  • Make sure you have a valid government-issued ID card, as it will be checked.
  • You are not allowed to have anything around you, and no one should enter the room.
  • The exam proctor will be watching you at all times, so refrain from any other activity. Your screen is also shared throughout.

All the Best …