AWS S3 Glacier

AWS S3 Glacier

  • S3 Glacier is a storage service optimized for archival and infrequently accessed data, or “cold data.”
  • S3 Glacier is an extremely secure, durable, and low-cost storage service for data archiving and long-term backup.
  • S3 Glacier is designed to provide average annual durability of 99.999999999% for an archive.
  • S3 Glacier redundantly stores data in multiple facilities and on multiple devices within each facility.
  • To increase durability, Glacier synchronously stores the data across multiple facilities before returning SUCCESS on uploading archives.
  • Glacier performs regular, systematic data integrity checks and is built to be automatically self-healing.
  • Glacier enables customers to offload the administrative burdens of operating and scaling storage to AWS, without having to worry about capacity planning, hardware provisioning, data replication, hardware failure detection, and recovery, or time-consuming hardware migrations.
  • Glacier is a great storage choice when low storage cost is paramount, with data rarely retrieved, and retrieval latency of several hours is acceptable.
  • Glacier now offers a range of data retrieval options, with retrieval times varying from 1-5 minutes to several hours
  • S3 should be used if applications require fast, frequent, real-time access to the data
  • S3 Glacier can store virtually any kind of data in any format.
  • Glacier allows interaction through the AWS Management Console, Command Line Interface (CLI), SDKs, or REST-based APIs (a minimal SDK sketch follows the use cases list below).
    • The management console can only be used to create and delete vaults.
    • All other operations, such as uploading or downloading data and creating retrieval jobs, require the CLI, an SDK, or the REST-based APIs
  • Use cases include
    • Digital media archives
    • Data that must be retained for regulatory compliance
    • Financial and healthcare records
    • Raw genomic sequence data
    • Long-term database backups
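
As referenced above, vault creation and deletion can be done from the console, but most other operations go through the CLI, SDKs, or REST APIs. A minimal boto3 sketch, assuming the Python SDK and an illustrative vault name:

```python
import boto3

# Glacier operations go through the SDK/CLI/REST APIs; the console only
# supports creating and deleting vaults.
glacier = boto3.client("glacier", region_name="us-west-2")

# Create a vault ("-" means the credentials' own account; name is illustrative).
glacier.create_vault(accountId="-", vaultName="examplevault")

# List the vaults in this region/account.
for vault in glacier.list_vaults(accountId="-")["VaultList"]:
    print(vault["VaultName"], vault["NumberOfArchives"])
```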

S3 Glacier Data Model

  • Glacier’s data model core concepts are vaults and archives; it also includes job and notification configuration resources
    • Vault
      • A vault is a container for storing archives
      • Each vault resource has a unique address, which comprises the region the vault was created in and the unique vault name within the region and account, e.g. https://glacier.us-west-2.amazonaws.com/111122223333/vaults/examplevault
      • Vault allows storage of an unlimited number of archives
      • Glacier supports various vault operations which are region-specific
      • An AWS account can create up to 1,000 vaults per region.
    • Archive
      • An archive can be any data such as a photo, video, or document and is a base unit of storage in Glacier.
      • Each archive has a unique ID and an optional description, which can only be specified during the upload of an archive.
      • Glacier assigns the archive an ID, which is unique in the AWS region in which it is stored.
      • An archive can be uploaded in a single request, while for large archives, Glacier provides a multipart upload API that enables uploading an archive in parts.
      • An archive can be up to 40 TB in size.
    • Jobs
      • A Job is required to retrieve an archive or a vault inventory list
      • Data retrieval requests are asynchronous operations; they are queued, and most jobs take about four hours to complete.
      • A job is first initiated, and the output of the job is downloaded after the job completes
      • Vault inventory jobs need the vault name
      • Data retrieval jobs need both the vault name and the archive id, with an optional description
      • A vault can have multiple jobs in progress at any point in time; each job is identified by a Job ID, assigned when it is created, for tracking
      • Glacier maintains job information such as job type, description, creation date, completion date, and job status, which can be queried
      • After the job completes, the job output can be downloaded in full or partially by specifying a byte range.
    • Notification Configuration
      • As the jobs are asynchronous, Glacier supports a notification mechanism to an SNS topic when the job completes
      • SNS topic for notification can either be specified with each individual job request or with the vault
      • Glacier stores the notification configuration as a JSON document
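
As the notification configuration is stored as a JSON document against the vault, it can be set once and reused by all jobs. A hedged boto3 sketch, with the vault name and SNS topic ARN as placeholders:

```python
import boto3

glacier = boto3.client("glacier")

# Attach an SNS topic to the vault so Glacier publishes a message when
# retrieval or inventory jobs complete (ARN and vault name are placeholders).
glacier.set_vault_notifications(
    accountId="-",
    vaultName="examplevault",
    vaultNotificationConfig={
        "SNSTopic": "arn:aws:sns:us-west-2:111122223333:glacier-jobs",
        "Events": ["ArchiveRetrievalCompleted", "InventoryRetrievalCompleted"],
    },
)
```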

S3 Glacier Data Retrieval Options

Glacier provides three options for retrieving data with varying access times and costs: Expedited, Standard, and Bulk retrievals.

Expedited Retrievals

  • Expedited retrievals allow quick access to the data when occasional urgent requests for a subset of archives are required.
  • For all but the largest archives (250MB+), data accessed using Expedited retrievals is typically made available within 1-5 minutes.
  • There are two types of Expedited retrievals: On-Demand and Provisioned.
    • On-Demand requests are like EC2 On-Demand instances and are available the vast majority of the time.
    • Provisioned requests are guaranteed to be available when needed

Standard Retrievals

  • Standard retrievals allow access to any of the archives within several hours.
  • Standard retrievals typically complete within 3-5 hours.

Bulk Retrievals

  • Bulk retrievals are Glacier’s lowest-cost retrieval option, enabling retrieval of large amounts, even petabytes, of data inexpensively in a day.
  • Bulk retrievals typically complete within 5-12 hours.
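
The retrieval option is selected per job through the Tier parameter when the retrieval job is initiated. A minimal boto3 sketch, with the vault name and archive ID as placeholders:

```python
import boto3

glacier = boto3.client("glacier")

# Tier can be "Expedited", "Standard", or "Bulk" and is set per retrieval job.
glacier.initiate_job(
    accountId="-",
    vaultName="examplevault",
    jobParameters={
        "Type": "archive-retrieval",
        "ArchiveId": "EXAMPLE-ARCHIVE-ID",
        "Tier": "Expedited",
    },
)
```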

Glacier Supported Operations

Vault Operations

  • Glacier provides operations to create and delete vaults.
  • A vault can be deleted only if there are no archives in the vault as of the last computed inventory and there have been no writes to the vault since the last inventory (as the inventory is prepared periodically)
  • Vault Inventory
    • Vault inventory helps retrieve a list of archives in a vault with information such as archive ID, creation date, and size for each archive
    • Inventory for each vault is prepared periodically, every 24 hours
    • Vault inventory is updated approximately once a day, starting on the day the first archive is uploaded to the vault.
    • When a vault inventory job is requested, Glacier returns the last inventory it generated, which is a point-in-time snapshot and not real-time data.
  • Vault Metadata or Description can also be obtained for a specific vault or for all vaults in a region, which provides information such as
    • creation date,
    • number of archives in the vault,
    • total size in bytes used by all the archives in the vault,
    • and the date the vault inventory was generated
  • S3 Glacier also provides operations to set, retrieve, and delete a notification configuration on the vault. Notifications can be used to identify vault events.

Archive Operations

  • S3 Glacier provides operations to upload, download and delete archives.
  • All archive operations must be done using the AWS CLI, SDKs, or REST APIs; they cannot be performed from the AWS Management Console.
  • An existing archive cannot be updated; it has to be deleted and re-uploaded.

Uploading an Archive

  • An archive can be uploaded in a single operation (1 byte up to 4 GB in size) or in parts, referred to as multipart upload (up to 40 TB)
  • Multipart Upload helps to
    • improve the upload experience for larger archives.
    • upload archive parts independently, in parallel, and in any order
    • recover faster, as only the part that failed needs to be re-uploaded, not the entire archive
    • upload archives without knowing the size in advance
    • upload archives from 1 byte to about 40,000 GB (10,000 parts * 4 GB) in size
  • To upload existing data to Glacier, consider using the AWS Import/Export Snowball service, which accelerates moving large amounts of data into and out of AWS using portable storage devices for transport. AWS transfers the data directly onto and off of storage devices using Amazon’s high-speed internal network, bypassing the Internet.
  • Glacier returns a response that includes an archive ID which is unique in the region in which the archive is stored
  • Glacier does not support any additional metadata information apart from an optional description. Any additional metadata information required should be maintained at the client side.
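
A minimal single-operation upload sketch with boto3 (vault name and file are placeholders); boto3 normally computes the required tree-hash checksum itself, and since Glacier keeps no metadata beyond the optional description, the returned archive ID should be persisted client-side:

```python
import boto3

glacier = boto3.client("glacier")

# Single-operation upload (1 byte up to 4 GB); larger archives should use
# the multipart upload APIs instead.
with open("backup-2023-01.tar.gz", "rb") as f:
    response = glacier.upload_archive(
        accountId="-",
        vaultName="examplevault",
        archiveDescription="January backup",  # optional, set only at upload time
        body=f,
    )

# Persist the archive ID (and any metadata) client-side; it is the only handle
# to the archive for later retrieval or deletion.
print(response["archiveId"])
```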

Downloading an Archive

  • Downloading an archive is an asynchronous operation and is a two-step process
    • Initiate an archive retrieval job
      • When a Job is initiated, a job ID is returned as a part of the response.
      • Job is executed asynchronously and the output can be downloaded after the job completes.
      • A job can be initiated to download the entire archive or a portion of the archive.
    • After the job completes, download the bytes
      • An archive can be downloaded in full, or a specific byte range can be specified to download only a portion of the output
      • Downloading the archive in chunks helps in the event of a download failure, as only the failed part needs to be downloaded again
      • Job completion status can be checked by
        • Check status explicitly (Not Recommended)
          • periodically poll the describe job operation request to obtain job information
        • Completion notification
          • An SNS topic can be specified, when the job is initiated or with the vault, to be used to notify job completion
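
A hedged boto3 sketch of the two-step download flow, polling describe_job for completion (an SNS notification is the recommended alternative to polling); vault name, archive ID, and file name are placeholders:

```python
import time
import boto3

glacier = boto3.client("glacier")

# Step 1: initiate the archive retrieval job.
job = glacier.initiate_job(
    accountId="-",
    vaultName="examplevault",
    jobParameters={"Type": "archive-retrieval", "ArchiveId": "EXAMPLE-ARCHIVE-ID"},
)
job_id = job["jobId"]

# Step 2: wait for completion (polling shown for brevity), then download the
# output; get_job_output also accepts a byte range to download in chunks.
while not glacier.describe_job(accountId="-", vaultName="examplevault", jobId=job_id)["Completed"]:
    time.sleep(900)

output = glacier.get_job_output(accountId="-", vaultName="examplevault", jobId=job_id)
with open("restored-archive", "wb") as f:
    f.write(output["body"].read())
```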

About Range Retrievals

  • S3 Glacier allows retrieving an archive either in whole (default) or as a range, i.e. a portion of it.
  • Range retrievals require the specified range to be megabyte aligned.
  • Glacier returns a checksum in the response, which can be used to detect download errors by comparing it with a checksum computed on the client side.
  • Specifying a range of bytes can be helpful to:
    • Control bandwidth costs
      • Glacier allows retrieval of up to 5 percent of the average monthly storage (pro-rated daily) for free each month
      • Scheduling range retrievals can help in two ways.
        • meet the monthly free allowance of 5 percent by spreading out the data requested
        • if the amount of data retrieved doesn’t meet the free allowance percentage, scheduling range retrievals enables reduction of peak retrieval rate, which determines the retrieval fees.
    • Manage your data downloads
      • Glacier allows retrieved data to be downloaded for 24 hours after the retrieval request completes
      • Only portions of the archive can be retrieved so that the schedule of downloads can be managed within the given download window.
    • Retrieve a targeted part of a large archive
      • Retrieving an archive in the range can be useful if an archive is uploaded as an aggregate of multiple individual files, and only a few files need to be retrieved
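
A short sketch of a megabyte-aligned range retrieval with boto3, requesting only the first 16 MiB of an archive (vault name and archive ID are placeholders):

```python
import boto3

glacier = boto3.client("glacier")

# The byte range must be megabyte aligned: start on a 1 MiB boundary and end
# on a 1 MiB boundary minus 1 (or at the end of the archive).
glacier.initiate_job(
    accountId="-",
    vaultName="examplevault",
    jobParameters={
        "Type": "archive-retrieval",
        "ArchiveId": "EXAMPLE-ARCHIVE-ID",
        "RetrievalByteRange": "0-16777215",  # first 16 MiB
    },
)
```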

Deleting an Archive

  • Archives can be deleted from a vault only one at a time
  • This operation is idempotent; deleting an already-deleted archive does not result in an error
  • AWS applies a pro-rated charge for archives that are deleted prior to 90 days, as Glacier is meant for long-term storage

Updating an Archive

  • An existing archive cannot be updated; it must be deleted and re-uploaded, and the new upload is assigned a new archive ID

S3 Glacier Vault Lock

  • S3 Glacier Vault Lock helps deploy and enforce compliance controls for individual S3 Glacier vaults with a vault lock policy.
  • Controls such as “write once read many” (WORM) can be enforced using a vault lock policy, and the policy can be locked against future edits.
  • Once locked, the policy can no longer be changed.
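
A hedged boto3 sketch of the two-step vault lock flow: initiate the lock with a WORM-style policy (which starts the in-progress window for testing), then complete it with the returned lock ID. The policy, denying archive deletion until an archive is 365 days old, and all names are illustrative:

```python
import json
import boto3

glacier = boto3.client("glacier")

# Illustrative WORM-style policy: deny deletes until an archive is 365 days old.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "deny-early-delete",
        "Principal": "*",
        "Effect": "Deny",
        "Action": "glacier:DeleteArchive",
        "Resource": "arn:aws:glacier:us-west-2:111122223333:vaults/examplevault",
        "Condition": {"NumericLessThan": {"glacier:ArchiveAgeInDays": "365"}},
    }],
}

# Step 1: attach the policy in the "in progress" state and get a lock ID.
lock = glacier.initiate_vault_lock(
    accountId="-",
    vaultName="examplevault",
    policy={"Policy": json.dumps(policy)},
)

# Step 2: lock the policy permanently; after this it can no longer be changed.
glacier.complete_vault_lock(accountId="-", vaultName="examplevault", lockId=lock["lockId"])
```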

S3 Glacier Security

  • S3 Glacier supports data in transit encryption using Secure Sockets Layer (SSL) or client-side encryption.
  • All data is encrypted on the server side with Glacier handling key management and key protection. It uses AES-256, one of the strongest block ciphers available
  • Security and compliance of S3 Glacier is assessed by third-party auditors as part of multiple AWS compliance programs including SOC, HIPAA, PCI DSS, FedRAMP etc.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What is Amazon Glacier?
    1. You mean Amazon “Iceberg”: it’s a low-cost storage service.
    2. A security tool that allows to “freeze” an EBS volume and perform computer forensics on it.
    3. A low-cost storage service that provides secure and durable storage for data archiving and backup
    4. It’s a security tool that allows to “freeze” an EC2 instance and perform computer forensics on it.
  2. Amazon Glacier is designed for: (Choose 2 answers)
    1. Active database storage
    2. Infrequently accessed data
    3. Data archives
    4. Frequently accessed data
    5. Cached session data
  3. An organization is generating digital policy files which are required by the admins for verification. Once the files are verified they may not be required in the future unless there is some compliance issue. If the organization wants to save them in a cost effective way, which is the best possible solution?
    1. AWS RRS
    2. AWS S3
    3. AWS RDS
    4. AWS Glacier
  4. A user has moved an object to Glacier using the life cycle rules. The user requests to restore the archive after 6 months. When the restore request is completed the user accesses that archive. Which of the below mentioned statements is not true in this condition?
    1. The archive will be available as an object for the duration specified by the user during the restoration request
    2. The restored object’s storage class will be RRS (After the object is restored the storage class still remains GLACIER. Read more)
    3. The user can modify the restoration period only by issuing a new restore request with the updated period
    4. The user needs to pay storage for both RRS (restored) and Glacier (Archive) Rates
  5. To meet regulatory requirements, a pharmaceuticals company needs to archive data after a drug trial test is concluded. Each drug trial test may generate up to several thousands of files, with compressed file sizes ranging from 1 byte to 100MB. Once archived, data rarely needs to be restored, and on the rare occasion when restoration is needed, the company has 24 hours to restore specific files that match certain metadata. Searches must be possible by numeric file ID, drug name, participant names, date ranges, and other metadata. Which is the most cost-effective architectural approach that can meet the requirements?
    1. Store individual files in Amazon Glacier, using the file ID as the archive name. When restoring data, query the Amazon Glacier vault for files matching the search criteria. (Individual files are expensive and does not allow searching by participant names etc)
    2. Store individual files in Amazon S3, and store search metadata in an Amazon Relational Database Service (RDS) multi-AZ database. Create a lifecycle rule to move the data to Amazon Glacier after a certain number of days. When restoring data, query the Amazon RDS database for files matching the search criteria, and move the files matching the search criteria back to S3 Standard class. (As the data is not needed can be stored to Glacier directly and the data need not be moved back to S3 standard)
    3. Store individual files in Amazon Glacier, and store the search metadata in an Amazon RDS multi-AZ database. When restoring data, query the Amazon RDS database for files matching the search criteria, and retrieve the archive name that matches the file ID returned from the database query. (Individual files and Multi-AZ is expensive)
    4. First, compress and then concatenate all files for a completed drug trial test into a single Amazon Glacier archive. Store the associated byte ranges for the compressed files along with other search metadata in an Amazon RDS database with regular snapshotting. When restoring data, query the database for files that match the search criteria, and create restored files from the retrieved byte ranges.
    5. Store individual compressed files and search metadata in Amazon Simple Storage Service (S3). Create a lifecycle rule to move the data to Amazon Glacier, after a certain number of days. When restoring data, query the Amazon S3 bucket for files matching the search criteria, and retrieve the file to S3 reduced redundancy in order to move it back to S3 Standard class. (Once the data is moved from S3 to Glacier the metadata is lost, as Glacier does not have metadata and must be maintained externally)
  6. A user is uploading archives to Glacier. The user is trying to understand key Glacier resources. Which of the below mentioned options is not a Glacier resource?
    1. Notification configuration
    2. Archive ID
    3. Job
    4. Archive

References

AWS Simple Email Service – SES

AWS Simple Email Service – SES

  • SES is a fully managed service that provides an email platform with an easy, cost-effective way to send and receive email using your own email addresses and domains.
  • can be used to send both transactional and promotional emails securely, and globally at scale.
  • acts as an outbound email server and eliminates the need to build and maintain your own software or applications to do the heavy lifting of email transport.
  • acts as an inbound email server to receive emails that can help develop software solutions such as email autoresponders, email unsubscribe systems, and applications that generate customer support tickets from incoming emails.
  • existing email servers can also be configured to send outgoing emails through SES with no changes to the settings in the email clients
  • Maximum message size including attachments is 10 MB per message (after base64 encoding).
  • integrated with CloudWatch and CloudTrail
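
A minimal boto3 sketch of sending a transactional email through SES; the addresses are placeholders and the sender identity must already be verified (in the SES sandbox the recipient must be verified too):

```python
import boto3

ses = boto3.client("ses", region_name="us-east-1")

# Source must be a verified identity (email address or domain).
ses.send_email(
    Source="noreply@example.com",
    Destination={"ToAddresses": ["customer@example.com"]},
    Message={
        "Subject": {"Data": "Order confirmation"},
        "Body": {"Text": {"Data": "Thanks for your order!"}},
    },
)
```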

SES Characteristics

  • Compatible with SMTP
  • Applications can send email using a single API call in many supported languages: Java, .NET, PHP, Perl, Ruby, HTTPS, etc.
  • Optimized for the highest levels of uptime, availability, and scales as per the demand
  • Provides sandbox environment for testing
  • provides Reputation dashboard, performance insights, anti-spam feedback
  • provides statistics on email deliveries, bounces, feedback loop results, emails opened, etc.
  • supports DomainKeys Identified Mail (DKIM) and Sender Policy Framework (SPF)
  • supports flexible deployment: shared, dedicated, and customer-owned IPs
  • supports attachments with many popular content formats, including documents, images, audio, and video, and scans every attachment for viruses and malware.
  • integrates with KMS to provide the ability to encrypt the mail that it writes to the S3 bucket.
  • uses client-side encryption to encrypt the mail before sending it to S3.

Sending Limits

  • Production SES has a set of sending limits which include
    • Sending Quota – max number of emails in a 24-hour period
    • Maximum Send Rate – max number of emails per second
  • SES automatically adjusts the limits upward as long as emails are of high quality and they are sent in a controlled manner, as any spike in the email sent might be considered to be spam.
  • Limits can also be raised by submitting a Quota increase request

SES Best Practices

  • Send high-quality and real production content that the recipients want
  • Only send to those who have signed up for the mail
  • Unsubscribe recipients who have not interacted with the business recently
  • Have low bounce and complaint rates; remove bounced or complained addresses, using SNS to monitor bounces and complaints and treating them as an opt-out
  • Monitor the sending activity

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What does Amazon SES stand for?
    1. Simple Elastic Server
    2. Simple Email Service
    3. Software Email Solution
    4. Software Enabled Server
  2. Your startup wants to implement an order fulfillment process for selling a personalized gadget that needs an average of 3-4 days to produce, with some orders taking up to 6 months. You expect 10 orders per day on your first day, 1,000 orders per day after 6 months, and 10,000 orders after 12 months. Orders coming in are checked for consistency, then dispatched to your manufacturing plant for production, quality control, packaging, shipment, and payment processing. If the product does not meet the quality standards at any stage of the process, employees may force the process to repeat a step. Customers are notified via email about order status and any critical issues with their orders, such as payment failure. Your base architecture includes AWS Elastic Beanstalk for your website with an RDS MySQL instance for customer data and orders. How can you implement the order fulfillment process while making sure that the emails are delivered reliably? [PROFESSIONAL]
    1. Add a business process management application to your Elastic Beanstalk app servers and re-use the RDS database for tracking order status. Use one of the Elastic Beanstalk instances to send emails to customers.
    2. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1. Use the decider instance to send emails to customers.
    3. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1. Use SES to send emails to customers.
    4. Use an SQS queue to manage all process tasks. Use an Auto Scaling group of EC2 instances that poll the tasks and execute them. Use SES to send emails to customers.

References

AWS Aurora Global Database vs DynamoDB Global Tables

AWS Aurora Global Database vs DynamoDB Global Tables

Aurora Global Database

  • Aurora Global Database consists of one primary AWS Region where the data is mastered, and up to five read-only, secondary AWS Regions.
  • Aurora cluster in the primary AWS Region where the data is mastered performs both read and write operations. The clusters in the secondary Regions enable low-latency reads.
  • Aurora replicates data to the secondary AWS Regions with a typical latency of under a second.
  • Secondary clusters can be scaled independently by adding one or more DB instances (Aurora Replicas) to serve read-only workloads.
  • Aurora global database uses dedicated infrastructure to replicate the data, leaving database resources available entirely to serve applications.
  • Applications with a worldwide footprint can use reader instances in the secondary AWS Regions for low-latency reads.
  • Typical cross-region replication takes less than 1 second.
  • In case of a disaster or an outage, one of the clusters in a secondary AWS Region can be promoted to take full read/write workloads in under a minute.

DynamoDB Global Tables

  • DynamoDB Global tables provide a fully managed, multi-Region, and multi-active database that delivers fast, local, read and write performance for massively scaled, global applications.
  • Global tables replicate the DynamoDB tables automatically across the choice of AWS Regions and enable reads and writes on all instances.
  • DynamoDB global table consists of multiple replica tables (one per AWS Region). Every replica has the same table name and the same primary key schema. When an application writes data to a replica table in one Region, DynamoDB propagates the write to the other replica tables in the other AWS Regions automatically.
  • Global tables enable the read and write of data locally providing single-digit-millisecond latency for the globally distributed application at any scale.
  • Global tables enable the applications to stay highly available even in the unlikely event of isolation or degradation of an entire Region.
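
A hedged boto3 sketch of setting up a global table with the current (2019.11.21) version, where replicas are added via update_table; table name, key schema, and Regions are illustrative, and streams are enabled on the base table as replication requires them:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Create the base table with streams enabled (required for replication).
dynamodb.create_table(
    TableName="Orders",
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
    StreamSpecification={"StreamEnabled": True, "StreamViewType": "NEW_AND_OLD_IMAGES"},
)
dynamodb.get_waiter("table_exists").wait(TableName="Orders")

# Add a replica in another Region, turning the table into a global table.
dynamodb.update_table(
    TableName="Orders",
    ReplicaUpdates=[{"Create": {"RegionName": "eu-west-1"}}],
)
```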

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.

References

AWS RDS Aurora

AWS RDS Aurora

  • AWS Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.
  • is a fully managed, MySQL- and PostgreSQL-compatible, relational database engine i.e. applications developed with MySQL can switch to Aurora with little or no changes
  • delivers up to 5x performance of MySQL and up to 3x performance of PostgreSQL without requiring any changes to most MySQL applications
  • is fully managed as RDS manages the databases, handling time-consuming tasks such as provisioning, patching, backup, recovery, failure detection and repair.
  • can scale storage automatically, based on the database usage, from 10GB to 128TiB in 10GB increments with no impact on database performance

Aurora DB Clusters

AWS Aurora Architecture

  • Aurora DB cluster consists of one or more DB instances and a cluster volume that manages the data for those DB instances.
  • A cluster volume is a virtual database storage volume that spans multiple AZs, with each AZ having a copy of the DB cluster data
  • Two types of DB instances make up an Aurora DB cluster:
    • Primary DB instance
      • Supports read and write operations, and performs all data modifications to the cluster volume.
      • Each DB cluster has one primary DB instance.
    • Aurora Replica
      • Connects to the same storage volume as the primary DB instance and supports only read operations.
      • Each DB cluster can have up to 15 Aurora Replicas in addition to the primary DB instance.
      • Provides high availability by locating Replicas in separate AZs
      • Aurora automatically fails over to a Replica in case the primary DB instance becomes unavailable.
      • Failover priority for Replicas can be specified.
      • Replicas can also offload read workloads from the primary DB instance
  • For Aurora multi-master clusters
    • all DB instances have read/write capability, with no difference between primary and replica.

Aurora Connection Endpoints

  • Aurora involves a cluster of DB instances instead of a single instance
  • An endpoint is an intermediate handler, represented by a hostname and port, that is used to connect to the cluster
  • Aurora uses the endpoint mechanism to abstract these connections

Cluster endpoint

  • Cluster endpoint (or writer endpoint) for a DB cluster connects to the current primary DB instance for that DB cluster.
  • The cluster endpoint is the only endpoint that can perform write operations such as DDL statements; it also supports read operations
  • Each DB cluster has one cluster endpoint and one primary DB instance
  • Cluster endpoint provides failover support for read/write connections to the DB cluster. If the current primary DB instance of a DB cluster fails, Aurora automatically fails over to a new primary DB instance. During a failover, the DB cluster continues to serve connection requests to the cluster endpoint from the new primary DB instance, with minimal interruption of service.

Reader endpoint

  • Reader endpoint for a DB cluster provides load-balancing support for read-only connections to the DB cluster.
  • Use the reader endpoint for read operations, such as queries.
  • Reader endpoint reduces the overhead on the primary instance by processing the statements on the read-only Replicas.
  • Each DB cluster has one reader endpoint.
  • If the cluster contains one or more Replicas, the reader endpoint load balances each connection request among the Replicas.

Custom endpoint

  • Custom endpoint for a DB cluster represents a set of DB instances that you choose.
  • Aurora performs load balancing and chooses one of the instances in the group to handle the connection.
  • An Aurora DB cluster has no custom endpoints until one is created and up to five custom endpoints can be created for each provisioned cluster.
  • Aurora Serverless clusters do not support custom endpoints.

Instance endpoint

  • An instance endpoint connects to a specific DB instance within a cluster and provides direct control over connections to the DB cluster.
  • Each DB instance in a DB cluster has its own unique instance endpoint. So there is one instance endpoint for the current primary DB instance of the DB cluster, and there is one instance endpoint for each of the Replicas in the DB cluster.
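
A hedged sketch of how an application might split traffic across the writer (cluster) and reader endpoints; the endpoint names follow Aurora's usual pattern but are placeholders, and pymysql is just one possible MySQL driver:

```python
import pymysql

# Placeholder endpoint names: the cluster (writer) endpoint always points to
# the current primary, the reader endpoint load-balances across the replicas.
WRITER = "mycluster.cluster-c1abcdef.us-east-1.rds.amazonaws.com"
READER = "mycluster.cluster-ro-c1abcdef.us-east-1.rds.amazonaws.com"

def connect(host):
    return pymysql.connect(host=host, user="admin", password="secret",
                           database="appdb", connect_timeout=5)

# Writes (DDL/DML) go through the cluster endpoint.
writer = connect(WRITER)
with writer.cursor() as cur:
    cur.execute("INSERT INTO orders (item) VALUES (%s)", ("book",))
writer.commit()
writer.close()

# Read-only queries can be offloaded to the reader endpoint.
reader = connect(READER)
with reader.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM orders")
    print(cur.fetchone())
reader.close()
```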

High Availability and Replication

  • Aurora is designed to offer greater than 99.99% availability
  • provides data durability and reliability
    • by replicating the database volume six ways across three Availability Zones in a single region
    • backing up the data continuously to S3.
  • transparently recovers from physical storage failures; instance failover typically takes less than 30 seconds.
  • automatically fails over to a new primary DB instance, if the primary DB instance fails, by either promoting an existing Replica to a new primary DB instance or creating a new primary DB instance
  • automatically divides the database volume into 10GB segments spread across many disks. Each 10GB chunk of the database volume is replicated six ways, across three Availability Zones
  • RDS databases for e.g. MySQL, Oracle, etc. have the data in a single AZ
  • is designed to transparently handle
    • the loss of up to two copies of data without affecting database write availability and
    • up to three copies without affecting read availability.
  • provides self-healing storage. Data blocks and disks are continuously scanned for errors and repaired automatically.
  • Replicas share the same underlying volume as the primary instance. Updates made by the primary are visible to all Replicas.
  • As Replicas share the same data volume as the primary instance, there is virtually no replication lag.
  • Any Replica can be promoted to become primary without any data loss and therefore can be used for enhancing fault tolerance in the event of a primary DB Instance failure.
  • To increase database availability, 1 to 15 replicas can be created in any of 3 AZs, and RDS will automatically include them in failover primary selection in the event of a database outage.

Aurora Failovers

  • Aurora automatically fails over, if the primary instance in a DB cluster fails, in the following order:
    • If Aurora Read Replicas are available, promote an existing Read Replica to the new primary instance.
    • If no Read Replicas are available, then create a new primary instance.
  • If there are multiple Aurora Read Replicas, the criteria for promotion is based on the priority that is defined for the Read Replicas.
    • Priority numbers can vary from 0 to 15 and can be modified at any time.
    • Aurora promotes the Aurora Replica with the highest priority to the new primary instance.
    • For Read Replicas with the same priority, Aurora promotes the replica that is largest in size, or otherwise in an arbitrary manner.
  • During the failover, AWS modifies the cluster endpoint to point to the newly created/promoted DB instance.
  • Applications experience a minimal interruption of service if they connect using the cluster endpoint and implement connection retry logic.
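
A hedged sketch of the connection retry logic mentioned above: reconnect through the cluster endpoint so that, after a failover, new connections land on the promoted primary (endpoint and credentials are placeholders):

```python
import time
import pymysql

CLUSTER_ENDPOINT = "mycluster.cluster-c1abcdef.us-east-1.rds.amazonaws.com"

def connect_with_retry(retries=5, delay=2):
    """Retry via the cluster endpoint; its DNS is updated to the new primary
    during an Aurora failover, so retrying is usually sufficient."""
    for attempt in range(retries):
        try:
            return pymysql.connect(host=CLUSTER_ENDPOINT, user="admin",
                                   password="secret", database="appdb",
                                   connect_timeout=5)
        except pymysql.err.OperationalError:
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise RuntimeError("could not connect after failover")

conn = connect_with_retry()
```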

Security

  • Aurora uses SSL (AES-256) to secure the connection between the database instance and the application
  • allows database encryption using keys managed through AWS Key Management Service (KMS).
  • Encryption and decryption are handled seamlessly.
  • With encryption, data stored at rest in the underlying storage is encrypted, as are its automated backups, snapshots, and replicas in the same cluster.
  • Encryption of existing unencrypted Aurora instances is not supported. Create a new encrypted Aurora instance and migrate the data

Backup and Restore

  • Automated backups are always enabled on Aurora DB Instances.
  • Backups do not impact database performance.
  • Aurora also allows the creation of manual snapshots
  • Aurora automatically maintains 6 copies of the data across 3 AZs and will automatically attempt to recover the database in a healthy AZ with no data loss
  • If in any case, the data is unavailable within Aurora storage,
    • DB Snapshot can be restored or
    • the point-in-time restore operation can be performed to a new instance. The latest restorable time for a point-in-time restore operation can be up to 5 minutes in the past.
  • Restoring a snapshot creates a new Aurora DB instance
  • Deleting a database deletes all the automated backups (with an option to create a final snapshot), but does not remove the manual snapshots.
  • Snapshots (including encrypted ones) can be shared with other AWS accounts

Aurora Parallel Query

  • Aurora Parallel Query refers to the ability to push down and distribute the computational load of a single query across thousands of CPUs in Aurora’s storage layer.
  • Without Parallel Query, a query issued against an Aurora database would be executed wholly within one instance of the database cluster; this would be similar to how most databases operate.
  • Parallel Query is a good fit for analytical workloads requiring fresh data and good query performance, even on large tables.
  • Parallel Query provides the following benefits
    • Faster performance: Parallel Query can speed up analytical queries by up to 2 orders of magnitude.
    • Operational simplicity and data freshness: you can issue a query directly over the current transactional data in your Aurora cluster.
    • Transactional and analytical workloads on the same database: Parallel Query allows Aurora to maintain high transaction throughput alongside concurrent analytical queries.
  • Parallel Query can be enabled and disabled dynamically at both the global and session level using the aurora_pq parameter.
  • Parallel Query is available for the MySQL 5.6-compatible version of Aurora

Aurora Scaling

  • Aurora storage scaling is built-in and will automatically grow, up to 128 TiB, in 10GB increments with no impact on database performance.
  • There is no need to provision storage in advance
  • Compute Scaling
    • Instance scaling
      • Vertical scaling of the master instance; memory and CPU resources are modified by changing the DB instance class.
      • Scaling can also be done by scaling a read replica and promoting it to master using forced failover, which provides minimal downtime
    • Read scaling
      • provides horizontal scaling with up to 15 read replicas
  • Auto Scaling
    • Scaling policies can add or remove read replicas, within a min and max replica count, based on CloudWatch CPU or connection metric conditions

Aurora Backtrack

  • Backtracking “rewinds” the DB cluster to the specified time
  • Backtracking performs in-place restore and does not create a new instance. There is minimal downtime associated with it.
  • Backtracking is available for Aurora with MySQL compatibility
  • Backtracking is not a replacement for backing up the DB cluster so that you can restore it to a point in time.
  • With backtracking, there is a target backtrack window and an actual backtrack window:
    • Target backtrack window is the amount of time you WANT to be able to backtrack the DB cluster, e.g. 24 hours. The limit for a backtrack window is 72 hours.
    • Actual backtrack window is the actual amount of time you CAN backtrack the DB cluster, which can be smaller than the target backtrack window. The actual backtrack window is based on the workload and the storage available for storing information about database changes, called change records
  • DB cluster with backtracking enabled generates change records.
  • Aurora retains change records for the target backtrack window and charges an hourly rate for storing them.
  • Both the target backtrack window and the workload on the DB cluster determine the number of change records stored.
  • Workload is the number of changes made to the DB cluster in a given amount of time. If the workload is heavy, you store more change records in the backtrack window than you do if your workload is light.
  • Backtracking affects the entire DB cluster and can’t selectively backtrack a single table or a single data update.
  • Backtracking provides the following advantages over traditional backup and restore:
    • Undo mistakes – revert destructive action, such as a DELETE without a WHERE clause
    • Backtrack DB cluster quickly – Restoring a DB cluster to a point in time launches a new DB cluster and restores it from backup data or a DB cluster snapshot, which can take hours. Backtracking a DB cluster doesn’t require a new DB cluster and rewinds the DB cluster in minutes.
    • Explore earlier data changes – repeatedly backtrack a DB cluster back and forth in time to help determine when a particular data change occurred
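
A hedged boto3 sketch of rewinding a backtrack-enabled cluster in place (cluster identifier and timestamp are placeholders); no new cluster is created:

```python
from datetime import datetime, timedelta, timezone
import boto3

rds = boto3.client("rds")

# Rewind the cluster by 30 minutes, e.g. to undo a DELETE without a WHERE clause.
rds.backtrack_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",
    BacktrackTo=datetime.now(timezone.utc) - timedelta(minutes=30),
)
```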

Aurora Serverless

Refer blog post @ Aurora Serverless

Aurora Global Database

  • Aurora global database consists of one primary AWS Region where the data is mastered, and up to five read-only, secondary AWS Regions.
  • Aurora cluster in the primary AWS Region where your data is mastered performs both read and write operations. The clusters in the secondary Regions enable low-latency reads.
  • Aurora replicates data to the secondary AWS Regions with a typical latency of under a second.
  • Secondary clusters can be scaled independently by adding one or more DB instances (Aurora Replicas) to serve read-only workloads.
  • Aurora global database uses dedicated infrastructure to replicate the data, leaving database resources available entirely to serve applications.
  • Applications with a worldwide footprint can use reader instances in the secondary AWS Regions for low-latency reads.
  • In case of a disaster or an outage, one of the clusters in a secondary AWS Region can be promoted to take full read/write workloads in under a minute.

Aurora Clone

  • Aurora cloning feature helps create Aurora cluster duplicates quickly and cost-effectively
  • Creating a clone is faster and more space-efficient than physically copying the data using a different technique such as restoring a snapshot.
  • Aurora cloning uses a copy-on-write protocol.
  • Aurora clone requires only minimal additional space when first created. In the beginning, Aurora maintains a single copy of the data, which is used by both the original and new DB clusters.
  • Aurora allocates new storage only when data changes, either on the source cluster or the cloned cluster.
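
Cloning is exposed through the point-in-time restore API with a copy-on-write restore type; a hedged boto3 sketch with placeholder cluster identifiers:

```python
import boto3

rds = boto3.client("rds")

# Create a copy-on-write clone of an existing Aurora cluster; storage is shared
# until either cluster changes data. DB instances still need to be added to the
# cloned cluster separately.
rds.restore_db_cluster_to_point_in_time(
    SourceDBClusterIdentifier="my-aurora-cluster",
    DBClusterIdentifier="my-aurora-cluster-clone",
    RestoreType="copy-on-write",
    UseLatestRestorableTime=True,
)
```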

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Company wants to use MySQL compatible relational database with greater performance. Which AWS service can be used?
    1. Aurora
    2. RDS
    3. SimpleDB
    4. DynamoDB
  2. An application requires a highly available relational database with an initial storage capacity of 8 TB. The database will grow by 8 GB every day. To support expected traffic, at least eight read replicas will be required to handle database reads. Which option will meet these requirements?
    1. DynamoDB
    2. Amazon S3
    3. Amazon Aurora
    4. Amazon Redshift
  3. A company is migrating their on-premise 10TB MySQL database to AWS. As a compliance requirement, the company wants to have the data replicated across three availability zones. Which Amazon RDS engine meets the above business requirement?
    1. Use Multi-AZ RDS
    2. Use RDS
    3. Use Aurora
    4. Use DynamoDB

References

AWS_Aurora

AWS Network Firewall vs Gateway Load Balancer

AWS Network Firewall vs Gateway Load Balancer

AWS Network Firewall

  • AWS Network Firewall is a stateful, fully managed, network firewall and intrusion detection and prevention service (IDS/IPS) for VPCs.
  • Network Firewall scales automatically with the network traffic, without the need for deploying and managing any infrastructure.
  • AWS Network Firewall cost covers
    • an hourly rate for each firewall endpoint,
    • data processing charges, billed by the gigabyte of traffic processed by the firewall endpoint,
    • standard AWS data transfer charges for all data transferred via the AWS Network Firewall.

AWS Gateway Load Balancer

  • Gateway Load Balancer helps deploy, scale, and manage virtual appliances, such as firewalls, intrusion detection and prevention systems (IDS/IPS), and deep packet inspection systems.
  • is architected to handle millions of requests/second and volatile traffic patterns while introducing extremely low latency.
  • AWS Gateway Load Balancer cost covers
    • charges for each hour or partial hour that a GWLB is running,
    • the number of Gateway Load Balancer Capacity Units (GLCU) used by Gateway Load Balancer per hour.
    • GWLB uses Gateway Load Balancer Endpoint (GWLBE) to simplify how applications can securely exchange traffic with GWLB across VPC boundaries. GWLBE is priced and billed separately.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.

References

AWS IAM Best Practices

AWS IAM Best Practices

AWS recommends the following AWS Identity and Access Management service – IAM Best Practices to secure AWS resources

Root Account – Don’t use & Lock away access keys

  • Do not use the AWS root account, which has full access to all the AWS resources and services, including the billing information.
  • Permissions associated with the AWS root account cannot be restricted.
  • Do not generate root access keys if they are not required.
  • If already generated and not needed, delete the access keys.
  • If access keys are needed, rotate (change) them regularly.
  • Never share the root account credentials or access keys; instead create IAM users or roles to grant granular access
  • Enable AWS multifactor authentication (MFA) on your AWS account

User – Create individual IAM users

  • Don’t use the AWS root account credentials to access AWS, and don’t share the credentials with anyone else.
  • Start by creating an IAM user with an administrator role, which has access to all resources like the root user, except the account’s security credentials
  • Create individual users for anyone who needs access to your AWS account and give each user unique credentials and grant different permissions

Groups – Use groups to assign permissions to IAM users

  • Instead of defining permissions for individual IAM users, create groups and define the relevant permissions for each group as per the job function, and then associate IAM users to those groups.
  • Users in an IAM group inherit the permissions assigned to the group and a User can belong to multiple groups
  • It is much easier to add new users, remove users and modify the permissions of a group of users.

Permission – Grant least privilege

  • IAM user, by default, is created with no permissions
  • Users should be granted LEAST PRIVILEGE as required to perform a task.
  • Starting with minimal permissions and adding to them as required to perform the job function is far better than granting full access and then trying to tighten it down

Passwords – Enforce strong password policy for users

  • Require users to create strong passwords and enforce periodic password rotation
  • Enable a strong password policy to define password requirements, such as minimum length, at least one capital letter, at least one number, and how frequently passwords must be rotated
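
A hedged boto3 sketch of enforcing such a password policy at the account level; the specific values are only illustrative:

```python
import boto3

iam = boto3.client("iam")

# Illustrative account password policy: length, character classes, rotation.
iam.update_account_password_policy(
    MinimumPasswordLength=14,
    RequireUppercaseCharacters=True,
    RequireLowercaseCharacters=True,
    RequireNumbers=True,
    RequireSymbols=True,
    MaxPasswordAge=90,            # force rotation every 90 days
    PasswordReusePrevention=5,    # disallow reusing the last 5 passwords
)
```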

MFA – Enable MFA for privileged users

  • For extra security, Enable MultiFactor Authentication (MFA) for privileged IAM users, who are allowed access to sensitive resources or APIs.

Role – Use temporary credentials with IAM roles

  • Use roles for workloads instead of creating IAM users and hardcoding credentials, which can compromise access and are also hard to rotate.
  • Roles have specific permissions and do not have a permanent set of credentials.
  • Roles provide a way to access AWS by relying on dynamically generated & automatically rotated temporary security credentials.
  • Roles do not have long-term credentials associated with them; instead they dynamically provide temporary credentials that are automatically rotated

Sharing – Delegate using roles

  • IAM roles can be used to delegate access: users from the same AWS account, another AWS account, or externally authenticated users (through a corporate authentication service, or through Google, Facebook, etc.) can assume a role that specifies the permissions granted to them
  • A role can be defined that specifies what permissions the IAM users in the other account are allowed, and from which AWS accounts the IAM users are allowed to assume the role
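
A hedged boto3 sketch of assuming such a cross-account role with STS to obtain temporary credentials; the role ARN and session name are placeholders:

```python
import boto3

sts = boto3.client("sts")

# Assume a role defined in another account; this returns temporary,
# auto-expiring credentials instead of long-lived access keys.
creds = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/ReadOnlyAuditor",
    RoleSessionName="audit-session",
)["Credentials"]

# Use the temporary credentials for subsequent calls.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```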

Rotation – Rotate credentials regularly

  • Change your own passwords and access keys regularly and enforce it through a strong password policy. So even if a password or access key is compromised without your knowledge, you limit how long the credentials can be used to access your resources
  • IAM allows two active access keys at the same time for a user; these can be used to rotate the keys without downtime.

Track – Remove unnecessary credentials

  • Remove IAM users and credentials (that is, passwords and access keys) that are not needed
  • Use the IAM Credential Report, which lists all IAM users in the account and the status of their various credentials, including passwords, access keys, and MFA devices, along with usage patterns, to figure out what can be removed
  • Passwords and access keys that have not been used recently might be good candidates for removal.
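
A hedged boto3 sketch of pulling the credential report (returned as CSV) to spot credentials that are candidates for removal:

```python
import csv
import io
import time
import boto3

iam = boto3.client("iam")

# Ask IAM to generate the report, then fetch it once it is ready.
while iam.generate_credential_report()["State"] != "COMPLETE":
    time.sleep(2)
report = iam.get_credential_report()["Content"].decode("utf-8")

# List users whose password has never been used (candidates for cleanup).
for row in csv.DictReader(io.StringIO(report)):
    if row["password_last_used"] in ("N/A", "no_information"):
        print(row["user"])
```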

Conditions – Use policy conditions for extra security

  • Define conditions under which IAM policies allow access to a resource.
  • Conditions help provide finer access control to AWS services and resources, e.g. access limited to a specific IP range, or allowing only encrypted requests for uploads to S3 buckets (a sketch follows below).
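
A hedged sketch of a customer managed policy with a condition, here restricting S3 reads to a corporate IP range; policy name, bucket ARN, and CIDR are illustrative:

```python
import json
import boto3

iam = boto3.client("iam")

# Allow reads from a specific bucket only when the request originates from
# the given IP range (all values are illustrative).
policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
    }],
}

iam.create_policy(
    PolicyName="s3-read-from-office-only",
    PolicyDocument=json.dumps(policy_document),
)
```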

Auditing – Monitor activity in the AWS account

  • Enable logging features provided through CloudTrail, S3, CloudFront in AWS to determine the actions users have taken in the account and the resources that were used.
  • Log files show the time and date of actions, the source IP for an action, which actions failed due to inadequate permissions, and more.

Use IAM Access Analyzer

  • IAM Access Analyzer analyzes the services and actions that the IAM roles use, and then generates a least-privilege policy that you can use.
  • Access Analyzer helps preview and analyze public and cross-account access for supported resource types by reviewing the generated findings.
  • IAM Access Analyzer helps to validate the policies created to ensure that they adhere to the IAM policy language (JSON) and IAM best practices.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Your organization is preparing for a security assessment of your use of AWS. In preparation for this assessment, which two IAM best practices should you consider implementing? Choose 2 answers
    1. Create individual IAM users for everyone in your organization (May not be needed as can use Roles as well)
    2. Configure MFA on the root account and for privileged IAM users
    3. Assign IAM users and groups configured with policies granting least privilege access
    4. Ensure all users have been assigned and are frequently rotating a password, access ID/secret key, and X.509 certificate (Must be assigned only if using console or through command line)
  2. What are the recommended best practices for IAM? (Choose 3 answers)
    1. Grant least privilege
    2. Use the AWS account (root) for regular use
    3. Use Multi-Factor Authentication (MFA)
    4. Store access key/private key in git
    5. Rotate credentials regularly
  3. Which of the below mentioned options is not a best practice to securely manage the AWS access credentials?
    1. Enable MFA for privileged users
    2. Create individual IAM users
    3. Keep rotating your secure access credentials at regular intervals
    4. Create strong access key and secret access key and attach to the root account
  4. Your CTO is very worried about the security of your AWS account. How best can you prevent hackers from completely hijacking your account?
    1. Use short but complex password on the root account and any administrators.
    2. Use AWS IAM Geo-Lock and disallow anyone from logging in except for in your city.
    3. Use MFA on all users and accounts, especially on the root account. (For increased security, it is recommended to configure MFA to help protect AWS resources)
    4. Don’t write down or remember the root account password after creating the AWS account.
  5. Fill the blanks: ____ helps us track AWS API calls and transitions, ____ helps to understand what resources we have now, and ____ allows auditing credentials and logins.
    1. AWS Config, CloudTrail, IAM Credential Reports
    2. CloudTrail, IAM Credential Reports, AWS Config
    3. CloudTrail, AWS Config, IAM Credential Reports
    4. AWS Config, IAM Credential Reports, CloudTrail

References

AWS Certified Advanced Networking – Specialty ANS-C01 Exam Learning Path

AWS Certified Advanced Networking – Specialty ANS-C01 Exam Learning Path

I recently certified/recertified for the AWS Certified Advanced Networking – Specialty (ANS-C01). Frankly, Networking is something that I am still diving deep into, and I just about managed to get through. So a word of caution: this exam is in line with or tougher than the professional exams, especially because some of the networking concepts covered are not something you can easily get your hands dirty with.

AWS Certified Advanced Networking – Specialty (ANS-C01) exam focuses on the AWS Networking concepts. It basically validates

  • Design and develop hybrid and cloud-based networking solutions by using AWS
  • Implement core AWS networking services according to AWS best practices
  • Operate and maintain hybrid and cloud-based network architecture for all AWS services
  • Use tools to deploy and automate hybrid and cloud-based AWS networking tasks
  • Implement secure AWS networks using AWS native networking constructs and services

Refer to the AWS Certified Advanced Networking – Specialty Exam Guide for the ANS-C01 exam domains.

AWS Certified Advanced Networking – Specialty (ANS-C01) Exam Resources

AWS Certified Advanced Networking – Specialty (ANS-C01) Exam Summary

  • AWS Certified Networking – Specialty (ANS-C01) exam has 65 questions to be solved in 170 minutes and I made sure I utilized the complete time.
  • AWS Certified Networking – Specialty (ANS-C01) exam focuses a lot on Networking concepts involving Hybrid Connectivity with Direct Connect, VPN, Transit Gateway, Direct Connect Gateway, and a bit of VPC, Route 53, ALB, NLB & CloudFront.
  • Each question mainly touches multiple AWS services.
  • Questions and answers options have a lot of prose and a lot of reading that needs to be done, so be sure you are prepared and manage your time well.
  • As always, mark the questions for review and move on and come back to them after you are done with all.
  • As always, having a rough architecture or mental picture of the setup helps focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach the right answer or at least have a 50% chance of getting it right.

AWS Certified Advanced Networking – Specialty (ANS-C01) Exam Topics

Networking & Content Delivery

  • Virtual Private Cloud – VPC
    • Understand VPC, Subnets
    • AWS allows extending the VPC by adding secondary CIDR blocks
    • Understand Security Groups, NACLs
    • VPC Flow Logs
      • help capture information about the IP traffic going to and from network interfaces in the VPC and can help in monitoring the traffic or troubleshooting any connectivity issues
      • NACLs are stateless and how it is reflected in VPC Flow Logs
        • If an ACCEPT is followed by a REJECT, the inbound traffic was accepted by the Security Groups and NACLs, but rejected outbound by the NACLs
        • If only a REJECT is logged, the inbound traffic was rejected by either the Security Groups or the NACLs
      • Use pkt-dstaddr instead of dstaddr to track the destination address as dstaddr refers to the primary ENI address always and not the secondary addresses.
      • Pattern: VPC Flow Logs -> CloudWatch Logs -> (Subscription) -> Kinesis Data Firehose -> S3/Open Search.
    • DHCP Option Sets esp. how to resolve DNS from both on-premises data center and AWS.
    • VPC Peering
      • helps point-to-point connectivity between 2 VPCs which can be in the same or different regions and accounts.
      • know VPC Peering Limitations esp. it does not allow overlapping CIDRs and transitive routing.
    • Placement Groups determine how the instances are placed on the underlying hardware
    • VRF – Virtual Routing & Forwarding can be used to route traffic to the same customer gateway from multiple VPCs, which can have overlapping CIDRs.
  • VPC Endpoints
    • VPC Gateway Endpoints for connectivity with S3 & DynamoDB i.e. VPC -> VPC Gateway Endpoints -> S3/DynamoDB.
    • VPC Interface Endpoints or Private Links for other AWS services and custom hosted services i.e. VPC -> VPC Interface Endpoint OR Private Link -> S3/Kinesis/SQS/CloudWatch/Any custom endpoint.
    • S3 gateway endpoints cannot be accessed through VPC Peering, VPN, or Direct Connect. Need HTTP proxy to route traffic.
    • S3 Private Link can be accessed through VPC Peering, VPN, or Direct Connect. Need to use an endpoint-specific DNS name.
    • VPC endpoint policy can be configured to control which S3 buckets can be accessed and the S3 Bucket policy can be used to control which VPC (includes all VPC Endpoints) or VPC Endpoint can access it.
    • Private Link Patterns
  • VPC Network Access Analyzer
    • helps identify unintended network access to the resources on AWS.
  • Transit Gateway
    • helps consolidate the AWS VPC routing configuration for a region with a hub-and-spoke architecture.
    • Appliance Mode ensures that network flows are symmetrically routed to the same AZ and network appliance
    • Transit Gateway Connect attachment can be used to connect SD-WAN to AWS Cloud. This supports GRE.
    • Transit Gateways are regional and Peering can connect Transit Gateways across regions.
    • Transit Gateway Network Manager includes events and metrics to monitor the quality of the global network, both in AWS and on-premises.
  • VPC Routing Priority
  • NAT Gateways
    • provides highly available, scalable handling of outgoing traffic; does not support Security Groups or ICMP pings.
    • times out the connection if it is idle for 350 seconds or more. To prevent the connection from being dropped, initiate more traffic over the connection or enable TCP keepalive on the instance with a value of less than 350 seconds.
    • supports Private NAT Gateways for internal communication.
  • Virtual Private Network
    • to establish connectivity between the on-premises data center and AWS VPC
  • Direct Connect
    • to establish connectivity between the on-premises data center and AWS VPC and Public Services
    • Direct Connect connections – Dedicated and Hosted connections
    • Understand how to create a Direct Connect connection
      • LOA-CFA provides the details for partners to connect to the AWS Direct Connect location
    • Virtual interfaces options – Private Virtual Interface for VPC resources and Public Virtual Interface for Public Resources
      • Private VIF is for resources within a VPC
      • Public VIF is for AWS public resources
      • Private VIF has a limit of 100 routes and Public VIF of 1000 routes. Summarize the routes if you need to configure more.
    • Understand setup Private and Public VIF
    • Understand High Availability options based on cost and time i.e. Second Direct Connect connection OR VPN connection
    • Direct Connect Gateway
      • it provides a way to connect to multiple VPCs from an on-premises data center using the same Direct Connect connection.
      • can connect to VGW or TGW.
    • Understand Active/Passive Direct Connect 
    • supports MACsec which delivers native, near line-rate, point-to-point encryption ensuring that data communications between AWS and the data center, office, or colocation facility remain protected.
    • Understand Route Propagation, propagation priority, BGP connectivity
      • BGP prefers the shortest AS PATH to get to the destination. Traffic from the VPC to on-premises uses the primary router. This is because the secondary router advertises a longer AS-PATH.
      • AS PATH prepending doesn’t work when the Direct Connect connections are in different AWS Regions than the VPC.
      • AS PATH prepending influences traffic from AWS to on-premises, while Local Preference influences traffic from on-premises to AWS.
      • Use Local Preference BGP community tags to configure Active/Passive when the connections are from different regions; a higher tag has a higher preference, i.e. 7224:7300 > 7224:7100.
      • NO_EXPORT works only for Public VIFs
      • 7224:9100, 7224:9200, and 7224:9300 apply only to public prefixes. Usually used to restrict traffic to regions. Can help control if routes should propagate to the local Region only, all Regions within a continent, or all public Regions.
        • 7224:9100 — Local AWS Region
        • 7224:9200 — All AWS Regions for a continent (North America–wide; Asia Pacific; Europe, the Middle East and Africa)
        • 7224:9300 — Global (all public AWS Regions)
      • 7224:8100, 7224:8200, and No-tag are communities AWS attaches to the routes it advertises, indicating where they originate.
        • 7224:8100 — Routes that originate from the same AWS Region in which the AWS Direct Connect point of presence is associated.
        • 7224:8200 — Routes that originate from the same continent with which the AWS Direct Connect point of presence is associated.
        • No-tag — Global (all public AWS Regions).
  • Route 53
    • provides a highly available and scalable DNS web service.
    • Routing Policies and their use cases. Focus on Weighted, Latency, and Failover routing policies.
    • supports Alias resource record sets, which enables routing of queries to a CloudFront distribution, Elastic Beanstalk, ELB, an S3 bucket configured as a static website, or another Route 53 resource record set.
    • CNAME does not support zone apex or root records. 
    • Route 53 DNSSEC
      • secures DNS traffic and helps protect a domain from DNS spoofing (man-in-the-middle) attacks.
      • Requirements
        • Asymmetric Customer Managed Keys
        • us-east-1 with ECC_NIST_P256 spec
    • Route 53 Resolver DNS Firewall
      • protection for outbound DNS requests from the VPCs and can monitor and control the domains that the applications can query.
      • allows you to define allow and deny list.
      • can be used to help prevent DNS exfiltration of data.
      • supports FirewallFailOpen configuration which determines how Route 53 Resolver handles queries during failures.
        • when disabled, it favors security over availability and blocks queries that it is unable to evaluate properly.
        • when enabled, it favors availability over security and allows queries to proceed when it is unable to evaluate them properly.
    • Route 53 Resolver (Hybrid DNS)
      • Inbound Endpoint for On-premises -> AWS
      • Outbound Endpoint for AWS -> On-premises (a Resolver forwarding rule sketch follows this list)
    • Route 53 DNS Query Logging
      • Can be logged to CloudWatch logs, S3, and Kinesis Data Firehose
    • Route 53 Resolver rules take precedence over private hosted zones.
    • Route 53 Split View DNS helps serve the same domain name externally and internally while returning different answers to each.
    • Know the Domain Migration process
  • CloudFront
    • provides a fully managed, fast CDN service that speeds up the distribution of static, dynamic web, or streaming content to end-users.
    • supports geo-restriction, WAF & AWS Shield for protection.
    • provides CloudFront Functions (run at edge locations) & Lambda@Edge (run at regional edge caches) to execute code closer to the user.
    • supports encryption at rest and end-to-end encryption
    • CloudFront Origin Shield
      • helps improve the cache hit ratio and reduce the load on the origin.
      • requests from other regional caches would hit the Origin shield rather than the Origin.
      • should be placed at the regional cache and not in the edge cache
      • should be deployed in the Region closest to the origin server
  • Global Accelerator
    • provides 2 static anycast IP addresses
    • does not support client IP address preservation for NLB and Elastic IP address endpoints.
    • does not support IPv6 address
    • know CloudFront vs Global Accelerator
  • Understand ELB, ALB and NLB
    • Differences between ALB and NLB
    • ALB provides Content, Host, and Path-based Routing while NLB provides the ability to have a static IP address
    • Maintain the original client IP to the backend instances using the X-Forwarded-For header or Proxy Protocol
    • ALB/NLB do not support TLS renegotiation or mutual TLS authentication (mTLS). For implementing mTLS, use NLB with TCP listener on port 443 and terminate on the instances.
    • NLB
      • also provides local zonal endpoints to keep the traffic within AZ
      • can front Private Link endpoints and provide static IPs.
    • ALB supports Forward Secrecy, through Security Policies, that provide additional safeguards against the eavesdropping of encrypted data, through the use of a unique random session key.
    • Supports the sticky session feature (session affinity) to enable the LB to bind a user’s session to a specific target, ensuring that all requests from the user during the session are sent to the same target. Sticky sessions are configured on the target groups.
  • Gateway Load Balancer – GWLB
    • helps deploy, scale, and manage virtual appliances, such as firewalls, IDS/IPS systems, and deep packet inspection systems.
  • Athena integrates with S3 only and not with CloudWatch Logs.
  • Transit VPC
    • helps connect multiple, geographically dispersed VPCs and remote networks to create a global network transit center.
    • Use Transit Gateway instead now.
  • Know CloudHub and its use case
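
As referenced in the VPC Endpoints notes above, below is a minimal sketch (assuming boto3; the bucket name and VPC endpoint ID are placeholders) of an S3 bucket policy that denies access unless requests arrive through a specific gateway endpoint:

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny all S3 actions on the bucket unless the request arrives via the
# specified VPC endpoint (aws:SourceVpce condition key).
# "my-archive-bucket" and "vpce-0123456789abcdef0" are placeholders.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOnlyFromVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::my-archive-bucket",
                "arn:aws:s3:::my-archive-bucket/*",
            ],
            "Condition": {
                "StringNotEquals": {"aws:SourceVpce": "vpce-0123456789abcdef0"}
            },
        }
    ],
}

s3.put_bucket_policy(Bucket="my-archive-bucket", Policy=json.dumps(policy))
```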
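
And as referenced in the Route 53 Resolver (Hybrid DNS) notes above, a rough sketch (assuming boto3; the outbound endpoint ID, domain name, DNS server IP, and VPC ID are placeholders) of a forwarding rule that sends queries for an on-premises domain to on-premises DNS servers:

```python
import uuid
import boto3

r53resolver = boto3.client("route53resolver")

# Forward queries for corp.example.com to an on-premises DNS server through
# an existing outbound Resolver endpoint. All identifiers are placeholders.
rule = r53resolver.create_resolver_rule(
    CreatorRequestId=str(uuid.uuid4()),
    Name="forward-corp-example-com",
    RuleType="FORWARD",
    DomainName="corp.example.com",
    TargetIps=[{"Ip": "10.10.0.2", "Port": 53}],
    ResolverEndpointId="rslvr-out-0123456789abcdef0",
)

# Associate the rule with a VPC so its resolver starts using it.
r53resolver.associate_resolver_rule(
    ResolverRuleId=rule["ResolverRule"]["Id"],
    VPCId="vpc-0123456789abcdef0",
)
```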

Security

  • AWS GuardDuty
    • managed threat detection service
    • provides Malware protection
  • AWS Shield
    • managed DDoS protection service
    • AWS Shield Advanced provides 24×7 access to the AWS Shield Response Team (SRT), protection against DDoS-related spikes, and DDoS cost protection to safeguard against scaling charges.
  • WAF as Web Traffic Firewall
    • helps protect web applications from attacks by allowing rules configuration that allow, block, or monitor (count) web requests based on defined conditions.
    • integrates with CloudFront, ALB, API Gateway to dynamically detect and prevent attacks
  • Network Firewall – managed, stateful network firewall with intrusion detection and prevention for VPCs

Monitoring & Management Tools

  • Understand AWS CloudFormation esp. in terms of Network creation.
    • Custom resources can be used to handle activities not natively supported by CloudFormation
    • While configuring VPN connections, use DependsOn on the VPN gateway route propagation resource to declare a dependency on the VPC-gateway attachment, since route propagation can only start once the VPN gateway is attached to the VPC (a template sketch follows this list).
  • AWS Config
    • fully managed service that provides AWS resource inventory, configuration history, and configuration change notifications to enable security, compliance, and governance.
    • can be used to monitor resource changes e.g. Security Groups and invoke Systems Manager Automation scripts for remediation.
  • CloudTrail for audit and governance
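
As referenced in the CloudFormation notes above, below is a minimal sketch of the DependsOn relationship, with the relevant template fragment expressed as a Python dict purely for illustration; all logical IDs are placeholders:

```python
# Fragment of a CloudFormation template (shown as a Python dict for
# illustration) demonstrating why DependsOn is needed: VPN gateway route
# propagation can only start once the VPN gateway is attached to the VPC.
template_fragment = {
    "Resources": {
        "VpnGatewayAttachment": {
            "Type": "AWS::EC2::VPCGatewayAttachment",
            "Properties": {"VpcId": {"Ref": "MyVpc"}, "VpnGatewayId": {"Ref": "MyVgw"}},
        },
        "VpnRoutePropagation": {
            "Type": "AWS::EC2::VPNGatewayRoutePropagation",
            # Explicit dependency on the attachment; without it, propagation may
            # be attempted before the VGW is attached and the stack can fail.
            "DependsOn": "VpnGatewayAttachment",
            "Properties": {
                "RouteTableIds": [{"Ref": "PrivateRouteTable"}],
                "VpnGatewayId": {"Ref": "MyVgw"},
            },
        },
    }
}
```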

Integration Tools

Networking Architecture Patterns

Finally, All the Best 🙂

AWS Network Architecture Patterns

AWS Network Architecture Patterns

On-premises -> S3 Private Link -> S3 (Without Internet Gateway or S3 Gateway Endpoint)

 Data flow diagram shows access from on-premises and in-VPC apps to S3 using an interface endpoint and AWS PrivateLink.
  • Interface endpoints in the VPC can route both in-VPC applications and on-premises applications to S3 over the Amazon network.
  • On-premises network uses Direct Connect or AWS VPN to connect to VPC.
  • Applications, whether on-premises or in VPC A, use endpoint-specific DNS names to access S3 through the S3 interface endpoint.
  • On-premises applications send data to the interface endpoint in the VPC through AWS Direct Connect (or AWS VPN). AWS PrivateLink moves the data from the interface endpoint to S3 over the AWS network.
  • VPC applications can also send traffic to the interface endpoint. AWS PrivateLink moves the data from the interface endpoint to S3 over the AWS network.
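
A minimal sketch of the client side of this pattern, assuming boto3; the interface endpoint ID, Region, bucket, and file names are placeholders, and the endpoint-specific DNS name is supplied as the endpoint_url:

```python
import boto3

# Endpoint-specific DNS name of the S3 interface endpoint (placeholder ID and
# Region). On-premises clients resolve this name over Direct Connect/VPN to the
# endpoint's private IP addresses inside the VPC.
S3_VPCE_URL = "https://bucket.vpce-0123456789abcdef0.s3.us-east-1.vpce.amazonaws.com"

s3 = boto3.client("s3", region_name="us-east-1", endpoint_url=S3_VPCE_URL)

# Traffic stays on the AWS network via PrivateLink instead of the public internet.
s3.upload_file("backup.tar.gz", "my-archive-bucket", "backups/backup.tar.gz")
```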

On-premises -> Proxy -> Gateway Endpoint -> S3

On-premises + VPC Endpoint + S3

  • VPC gateway endpoints are only accessible from EC2 instances inside a VPC, so a local instance must proxy all remote requests before they can utilize a VPC endpoint connection.
  • Proxy farm proxies S3 traffic to the VPC endpoint. Configure an Auto Scaling group to manage the proxy servers and automatically grow or shrink the number of required instances based on proxy server load.
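
A minimal sketch of the client side of this pattern, assuming boto3; the proxy address and bucket name are placeholders, and the proxy farm inside the VPC forwards the requests to S3 through the gateway endpoint:

```python
import boto3
from botocore.config import Config

# On-premises clients cannot use the S3 gateway endpoint directly, so S3
# requests are sent through a proxy farm running inside the VPC (placeholder
# address), which reaches S3 via the gateway endpoint.
proxy_config = Config(proxies={"https": "http://proxy.internal.example.com:3128"})

s3 = boto3.client("s3", region_name="us-east-1", config=proxy_config)
s3.list_objects_v2(Bucket="my-archive-bucket", MaxKeys=10)
```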

Direct Connect Gateway + Transit Gateway

AWS Direct Connect Gateway + Transit Gateway

  • AWS Direct Connect Gateway does not support transitive routing and has limits on the number of VGWs that can be connected.
  • AWS Direct Connect Gateway can be combined with AWS Transit Gateway using a transit VIF attachment, which enables the network to connect up to three regional centralized routers (Transit Gateways) over a private dedicated connection.
  • DX Gateway + TGW simplifies the management of connections between a VPC and the on-premises networks over a private connection that can reduce network costs, increase bandwidth throughput, and provide a more consistent network experience than internet-based connections.
  • With AWS Transit Gateway connected to VPCs, full or partial mesh connectivity can be achieved between the VPCs.

AWS Direct Connect with VPN as Backup

  • Be sure that you use the same virtual private gateway for both Direct Connect and the VPN connection to the VPC.
  • If you are configuring a Border Gateway Protocol (BGP) VPN, advertise the same prefix for Direct Connect and the VPN.
  • If you are configuring a static VPN, add the same static prefixes to the VPN connection that you are announcing with the Direct Connect virtual interface.
  • If you are advertising the same routes toward the AWS VPC, the Direct Connect path is always preferred, regardless of AS path prepending.

AWS Direct Connect + VPN

AWS Direct Connect + VPN

  • AWS Direct Connect + VPN combines the benefits of the end-to-end secure IPSec connection with low latency and increased bandwidth of the AWS Direct Connect to provide a more consistent network experience than internet-based VPN connections.
  • AWS Direct Connect public VIF establishes a dedicated network connection between the on-premises network to public AWS resources, such as an Amazon virtual private gateway IPsec endpoint.
  • A BGP connection is established between the AWS Direct Connect and your router on the public VIF.
  • Another BGP session, or static routes, is established between the virtual private gateway and the customer router over the IPsec VPN tunnel.

AWS Private Link -> NLB -> ALB

AWS Private Link + NLB + ALB

  • AWS PrivateLink for ALB allows customers to utilize PrivateLink on NLB and route this traffic to a target ALB to utilize the layer 7 benefits.
  • Static NLB IP Addresses for ALB – the NLB provides one static IP per AZ, giving full control over the IP addresses and enabling use cases such as:
    • Allow listing of IP addresses for firewall rules.
    • Pointing a DNS Zone apex to an application fronted by an ALB. Utilizing ALB as a target of NLB, a DNS A-record type can be used to resolve your zone apex to the NLB static IP addresses.
    • When legacy clients cannot utilize DNS resulting in a need for hard-coded IP addresses.
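
A rough boto3 sketch of this pattern using the 'alb' target type on an NLB target group; the VPC ID, ALB ARN, NLB ARN, and health check path are placeholders:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Target group with target type 'alb': the NLB forwards TCP traffic to an
# existing ALB, which then applies layer-7 routing. IDs/ARNs are placeholders.
tg = elbv2.create_target_group(
    Name="alb-behind-nlb",
    Protocol="TCP",
    Port=443,
    VpcId="vpc-0123456789abcdef0",
    TargetType="alb",
    HealthCheckProtocol="HTTPS",
    HealthCheckPath="/healthz",
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Register the ALB as the single target of the target group.
elbv2.register_targets(
    TargetGroupArn=tg_arn,
    Targets=[{"Id": "arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/app/my-alb/0123456789abcdef", "Port": 443}],
)

# The NLB (one static IP per AZ) listens on TCP 443 and forwards to the ALB;
# the NLB can then also be exposed as a PrivateLink endpoint service.
elbv2.create_listener(
    LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/net/my-nlb/0123456789abcdef",
    Protocol="TCP",
    Port=443,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)
```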

Centralized Egress: Transit Gateway + NAT Gateway

Centralized Egress with Transit Gateway & NAT Gateway

  • A separate egress VPC in the network services account can be created to route all egress traffic from the spoke VPCs via a NAT gateway sitting in this VPC using Transit Gateway
  • As the NAT gateway has an hourly charge, deploying a NAT gateway in every spoke VPC can become cost prohibitive and centralizing NAT can provide cost benefits
  • In some edge cases, when huge amounts of data are sent through the NAT gateway from a VPC, keeping the NAT gateway local in the VPC to avoid the Transit Gateway data processing charge might be a more cost-effective option.
  • Two NAT gateways (one in each AZ) provide High Availability.
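
A rough boto3 sketch of the route table entries this pattern relies on; the route table, Transit Gateway, and NAT gateway IDs and the spoke CIDR range are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Spoke VPC route table: default route points to the Transit Gateway, so all
# internet-bound traffic is sent to the egress VPC. IDs are placeholders.
ec2.create_route(
    RouteTableId="rtb-0aaaaaaaaaaaaaaaa",
    DestinationCidrBlock="0.0.0.0/0",
    TransitGatewayId="tgw-0123456789abcdef0",
)

# Egress VPC, private (TGW attachment) subnet: default route points to the
# NAT gateway in the public subnet.
ec2.create_route(
    RouteTableId="rtb-0bbbbbbbbbbbbbbbb",
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId="nat-0123456789abcdef0",
)

# Egress VPC, public subnet: return traffic for the spoke CIDR range is routed
# back to the Transit Gateway.
ec2.create_route(
    RouteTableId="rtb-0cccccccccccccccc",
    DestinationCidrBlock="10.0.0.0/8",
    TransitGatewayId="tgw-0123456789abcdef0",
)
```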

Direct Connect with High Resiliency – 99.9

Direct Connect High Availability

  • For critical production workloads that require high resiliency, it is recommended to have one connection at multiple locations.
  • ensures resilience to connectivity failure due to a fiber cut or a device failure as well as a complete location failure. You can use AWS Direct Connect gateway to access any AWS Region (except AWS Regions in China) from any AWS Direct Connect location.

Direct Connect with Maximum Resiliency – 99.99

Direct Connect Maximum Availability

  • Maximum resilience is achieved by separate connections terminating on separate devices in more than one location
  • ensures resilience to device failure, connectivity failure, and complete location failure.

VPC Network Access Analyzer

VPC Network Access Analyzer

  • VPC Network Access Analyzer helps identify unintended network access to the resources on AWS.
  • Network Access Analyzer can be used to
    • Understand, verify, and improve the network security posture
      • helps identify unintended network access relative to the security and compliance requirements that enable improving network security.
    • Demonstrate compliance
      • can help demonstrate that the network on AWS meets certain compliance requirements.
  • Network Access Analyzer can help verify the following example requirements:
    • Network segmentation
      • verify that the production environment VPCs and development environment VPCs are isolated from one another, or that systems that process credit card information are isolated from the rest of the environment.
    • Internet accessibility
      • can help identify resources that can be accessed from internet gateways, and verify that they are limited to only those resources that have a legitimate need to be accessible from the internet.
    • Trusted network paths
      • can help verify that appropriate network controls such as network firewalls and NAT gateways are configured on all network paths between the resources and internet gateways.
    • Trusted network access
      • can help verify that the resources have network access only from a trusted IP address range, over specific ports and protocols.
      • network access requirements can be specified in terms of:
        • Individual resource IDs, such as vpc-01234567
        • All resources of a given type, such as AWS::EC2::InternetGateway
        • All resources with a given tag, using AWS Resource Groups
        • IP address ranges, port ranges, and traffic protocols

Network Access Analyzer Concepts

  • Network Access Scopes
    • A Network Access Scope specifies the network access requirements, which determine the types of findings that the analysis produces.
    • The MatchPaths field specifies the types of network paths to identify.
    • The ExcludePaths field specifies the types of network paths to exclude.
  • Findings
    • Findings are potential paths in the network that match any of the MatchPaths entries in the Network Access Scope, but that do not match any of the ExcludePaths entries in the Network Access Scope.
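
A rough boto3 sketch of a Network Access Scope that matches paths from internet gateways to EC2 instances while excluding one intentionally public instance; resource IDs are placeholders and the path statements are simplified:

```python
import uuid
import boto3

ec2 = boto3.client("ec2")

# MatchPaths: identify paths from any internet gateway to any EC2 instance.
# ExcludePaths: ignore paths to a known, intentionally public instance.
# The instance ID is a placeholder.
scope = ec2.create_network_insights_access_scope(
    MatchPaths=[
        {
            "Source": {"ResourceStatement": {"ResourceTypes": ["AWS::EC2::InternetGateway"]}},
            "Destination": {"ResourceStatement": {"ResourceTypes": ["AWS::EC2::Instance"]}},
        }
    ],
    ExcludePaths=[
        {"Destination": {"ResourceStatement": {"Resources": ["i-0123456789abcdef0"]}}}
    ],
    ClientToken=str(uuid.uuid4()),
)

# Start an analysis; findings are paths that match MatchPaths but none of the
# ExcludePaths entries.
ec2.start_network_insights_access_scope_analysis(
    NetworkInsightsAccessScopeId=scope["NetworkInsightsAccessScope"]["NetworkInsightsAccessScopeId"],
    ClientToken=str(uuid.uuid4()),
)
```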

How Network Access Analyzer Works

  • Network Access Analyzer uses automated reasoning algorithms to analyze the network paths that a packet can take between resources in an AWS network.
  • performs a static analysis of a network configuration, meaning that no packets are transmitted in the network as part of this analysis.
  • produces findings for paths that match a customer-defined Network Access Scope.
  • only considers the state of the network as described in the network configuration; packet loss due to transient network interruptions or service failures is not considered in this analysis.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.

References

AWS VPC Network Access Analyzer