AWS Elastic Beanstalk

AWS Elastic Beanstalk

  • AWS Elastic Beanstalk helps to quickly deploy and manage applications in the AWS Cloud without having to worry about the infrastructure that runs those applications.
  • reduces management complexity without restricting choice or control.
  • enables automated infrastructure management and code deployment for applications simply by uploading the code, and includes
    • Application platform management
    • Capacity provisioning
    • Load Balancing
    • Auto Scaling
    • Code deployment
    • Health Monitoring
  • Elastic Beanstalk automatically launches an environment once an application is uploaded, and creates and configures the AWS resources needed to run the code. After the environment is launched, it can be managed and used to deploy new application versions.
  • AWS resources launched by Elastic Beanstalk are fully accessible i.e. EC2 instances can be SSHed into.
  • provides developers and systems administrators with an easy, fast way to deploy and manage the applications without having to worry about AWS infrastructure.
  • CloudFormation, using templates, is a better option than Elastic Beanstalk if the internal AWS resources to be used are known and fine-grained control is needed.
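
A minimal boto3 sketch of the workflow above; the application name, environment name, and solution stack are illustrative assumptions (valid stack names can be listed with list_available_solution_stacks()):

```python
import boto3

eb = boto3.client('elasticbeanstalk')

# Register the application; Elastic Beanstalk manages the rest of the stack.
eb.create_application(ApplicationName='my-app', Description='Demo application')

# Launching an environment makes EB provision the ELB, Auto Scaling group,
# and EC2 instances needed to run the code -- no infrastructure to manage.
eb.create_environment(
    ApplicationName='my-app',
    EnvironmentName='my-app-prod',
    SolutionStackName='64bit Amazon Linux 2023 v4.3.0 running Python 3.11',  # assumed stack
    Tier={'Name': 'WebServer', 'Type': 'Standard'},
)
```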

Elastic Beanstalk Components

  • Application
    • An Application is a logical collection of components, including environments, versions, and environment configurations.
  • Application Version
    • An application version refers to a specific, labeled iteration of deployable code for a web application.
    • Applications can have many versions and each application version is unique and points to an S3 object.
    • Multiple versions of an application can be deployed to test differences, which makes it easy to roll back to any version in case of issues.
  • Environment
    • An environment is a version that is deployed onto AWS resources.
    • An environment runs a single application version at a time, but the same application version can be deployed across multiple environments.
    • When an environment is created, EB provisions the resources needed to run the specified application version.
  • Environment Configuration
    • An environment configuration identifies a collection of parameters and settings that define how an environment and its associated resources behave
    • When an environment’s configuration settings are updated, EB automatically applies the changes to existing resources or deletes and deploys new resources, depending upon the change
  • Configuration Template
    • A configuration template is a starting point for creating unique environment configurations
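
A hedged sketch of how these components relate in the API: an application version is a labeled pointer to a source bundle in S3, and deploying or rolling back is just pointing an environment at a version (names and buckets below are illustrative):

```python
import boto3

eb = boto3.client('elasticbeanstalk')

# Each application version is unique and points to an S3 object (the bundle).
eb.create_application_version(
    ApplicationName='my-app',
    VersionLabel='v2',
    SourceBundle={'S3Bucket': 'my-app-bundles', 'S3Key': 'my-app-v2.zip'},
)

# An environment runs a single version at a time; deploying v2 (or rolling
# back to v1) is the same update_environment call with a different label.
eb.update_environment(EnvironmentName='my-app-prod', VersionLabel='v2')
```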

Elastic Beanstalk Environment Tiers

  • An Elastic Beanstalk environment requires an environment tier, platform, and environment type.
  • Environment tier determines whether EB provisions resources to support
    • Web tier – a web application that handles HTTP(S) requests
    • Worker tier – an application that handles background-processing tasks.
  • One environment cannot support two different environment tiers because each requires its own set of resources; a worker environment tier and a web server environment tier each require an Auto Scaling group, but Elastic Beanstalk supports only one Auto Scaling group per environment.

Web Environment Tier

  • An environment tier whose web application processes web requests is known as a web server tier.
  • AWS resources created for a web environment tier include an Elastic Load Balancer, an Auto Scaling group, and one or more EC2 instances.
  • Every environment has a CNAME URL pointing to the load balancer, aliased in Route 53 to the ELB URL.
  • Each EC2 server instance that runs the application uses a container type, which defines the infrastructure topology and software stack.
  • A software component called the host manager (HM) runs on each EC2 server instance and is responsible for
    • Deploying the application
    • Aggregating events and metrics for retrieval via the console, the API, or the command line
    • Generating instance-level events
    • Monitoring the application log files for critical errors
    • Monitoring the application server
    • Patching instance components
    • Rotating your application’s log files and publishing them to S3

Worker Environment Tier

  • An environment tier whose web application runs background jobs is known as a worker tier.
  • AWS resources created for a worker environment tier include an Auto Scaling group, one or more EC2 instances, and an IAM role.
  • For the worker environment tier, Elastic Beanstalk also creates and provisions an SQS queue, if one doesn’t exist.
  • When a worker environment tier is launched, EB installs the necessary support files for the programming language of choice and a daemon on each EC2 instance in the Auto Scaling group; each daemon reads from the same SQS queue.
  • The daemon pulls messages from the SQS queue and sends the data to the web application running in the worker environment, which processes those messages (see the sketch after this list).
  • Worker environments support SQS dead letter queues, which can be used to store messages that could not be successfully processed. A dead letter queue provides the ability to sideline, isolate, and analyze the unsuccessfully processed messages.
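
A minimal sketch, assuming a Python/Flask worker application, of the HTTP contract described above: the daemon POSTs each SQS message to the application, and the response status decides whether the message is deleted, retried, or eventually sidelined to the dead letter queue:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['POST'])   # default path the daemon posts to
def process_message():
    job = request.get_data(as_text=True)   # the SQS message body
    print(f'processing job: {job}')        # replace with real background work
    # 200 tells the daemon the message was processed, so it deletes it from
    # the queue; any other status triggers a retry and, after the configured
    # retries, the message moves to the dead letter queue if one is set.
    return '', 200

if __name__ == '__main__':
    app.run(port=8080)
```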

Elastic Beanstalk with Other AWS Services

  • Elastic Beanstalk supports VPC and launches AWS resources, such as instances, into the VPC
  • Elastic Beanstalk supports IAM and helps you securely control access to your AWS resources.
  • CloudFront can be used to distribute the content in S3 after an Elastic Beanstalk environment is created and deployed
  • CloudTrail
    • Elastic Beanstalk is integrated with CloudTrail, a service that captures all of the Elastic Beanstalk API calls and delivers the log files to a specified S3 bucket.
    • CloudTrail captures API calls from the Elastic Beanstalk console or from your code to the Elastic Beanstalk APIs and helps to determine the request made to Elastic Beanstalk, the source IP address from which the request was made, who made the request, when it was made, etc.
  • RDS
    • EB provides support for running RDS instances in the environment which is ideal for development and testing but not for production.
    • For a production environment, it is not recommended because it ties the lifecycle of the database instance to the lifecycle of the application’s environment; if the environment is deleted, the RDS instance is deleted as well
    • It is recommended to launch a database instance outside of the environment and configure the application to connect to it outside of the functionality provided by Elastic Beanstalk.
    • Using a database instance external to the environment requires additional security group and connection string configuration, but it also lets the application connect to the database from multiple environments, use database types not supported with integrated databases, perform blue/green deployments, and tear down the environment without affecting the database instance.
  • S3
    • EB creates an S3 bucket named elasticbeanstalk-region-account-id for each region in which environments are created.
    • EB uses the bucket to store application versions, logs, and other supporting files.
    • It applies a bucket policy to buckets it creates to allow environments to write to the bucket and prevent accidental deletion

Elastic Beanstalk Deployment Strategies

Elastic Beanstalk Deployment Methods

  • All at Once
    • performs an in-place deployment on all instances at the same time.
    • is performed in place on existing instances, so it causes downtime, and rolling back requires a further full deployment.
  • Rolling
    • splits the environment instances into batches and deploys the new application version on the existing instances one batch at a time, leaving the rest of the environment instances running the old version.
    • waits until all instances in a batch are healthy before moving on to the next batch.
    • reduces downtime because not all instances are updated at once, and the deployment can be rolled back if the health checks fail.
  • Rolling with an Additional batch
    • similar to Rolling, but it launches an additional batch of new instances before starting the deployment.
    • does not impact the capacity and ensures full capacity during the deployment process.
  • Immutable
    • ensures the application source is always deployed to new instances.
    • prevents issues caused by partially completed rolling deployments.
    • provides minimal downtime and quick rollback.
  • Blue Green
    • suitable for deployments that depend on incompatible resource configuration changes or a new version that can’t run alongside the old version.
    • implemented using the Swap Environment URLs feature that entails a DNS switchover.
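
A sketch of switching deployment methods through environment option settings; the namespace and option names are real, while the environment name and values are illustrative:

```python
import boto3

eb = boto3.client('elasticbeanstalk')

eb.update_environment(
    EnvironmentName='my-app-prod',
    OptionSettings=[
        # AllAtOnce | Rolling | RollingWithAdditionalBatch | Immutable
        {'Namespace': 'aws:elasticbeanstalk:command',
         'OptionName': 'DeploymentPolicy', 'Value': 'Rolling'},
        # batch size for Rolling deployments, as a percentage of instances
        {'Namespace': 'aws:elasticbeanstalk:command',
         'OptionName': 'BatchSizeType', 'Value': 'Percentage'},
        {'Namespace': 'aws:elasticbeanstalk:command',
         'OptionName': 'BatchSize', 'Value': '30'},
    ],
)
```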

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. An organization is planning to use AWS for their production roll out. The organization wants to implement automation for deployment such that it will automatically create a LAMP stack, download the latest PHP installable from S3 and setup the ELB. Which of the below mentioned AWS services meets the requirement for making an orderly deployment of the software?
    1. AWS Elastic Beanstalk
    2. AWS CloudFront
    3. AWS CloudFormation
    4. AWS DevOps
  2. What does Amazon Elastic Beanstalk provide?
    1. A scalable storage appliance on top of Amazon Web Services.
    2. An application container on top of Amazon Web Services
    3. A service by this name doesn’t exist.
    4. A scalable cluster of EC2 instances
  3. You want to have multiple versions of your application running at the same time, with all versions launched via AWS Elastic Beanstalk. Is this possible?
    1. However if you have 2 AWS accounts this can be done
    2. AWS Elastic Beanstalk is not designed to support multiple running environments
    3. AWS Elastic Beanstalk is designed to support a number of multiple running environments
    4. However AWS Elastic Beanstalk is designed to support only 2 multiple running environments
  4. A .NET application that you manage is running in Elastic Beanstalk. Your developers tell you they will need access to application log files to debug issues that arise. The infrastructure will scale up and down. How can you ensure the developers will be able to access only the log files?
    1. Access the log files directly from Elastic Beanstalk
    2. Enable log file rotation to S3 within the Elastic Beanstalk configuration
    3. Ask your developers to enable log file rotation in the applications web.config file
    4. Connect to each Instance launched by Elastic Beanstalk and create a Windows Scheduled task to rotate the log files to S3
  5. Your team has a Tomcat-based Java application you need to deploy into development, test and production environments. After some research, you opt to use Elastic Beanstalk due to its tight integration with your developer tools and RDS due to its ease of management. Your QA team lead points out that you need to roll a sanitized set of production data into your environment on a nightly basis. Similarly, other software teams in your org want access to that same restored data via their EC2 instances in your VPC. The optimal setup for persistence and security that meets the above requirements would be the following. [PROFESSIONAL]
    1. Create your RDS instance as part of your Elastic Beanstalk definition and alter its security group to allow access to it from hosts in your application subnets. (Not optimal for persistence as the RDS is associated with the Elastic Beanstalk lifecycle and would not live independently)
    2. Create your RDS instance separately and add its IP address to your application’s DB connection strings in your code. Alter its security group to allow access to it from hosts within your VPC’s IP address block. (RDS is connected using DNS endpoint only)
    3. Create your RDS instance separately and pass its DNS name to your app’s DB connection string as an environment variable. Create a security group for client machines and add it as a valid source for DB traffic to the security group of the RDS instance itself. (Security group allows instances to access the RDS with new instances launched without any changes)
    4. Create your RDS instance separately and pass its DNS name to your DB connection string as an environment variable. Alter its security group to allow access to it from hosts in your application subnets. (Not optimal for security adding individual hosts)
  6. You must architect the migration of a web application to AWS. The application consists of Linux web servers running a custom web server. You are required to save the logs generated from the application to a durable location. What options could you select to migrate the application to AWS? (Choose 2) [PROFESSIONAL]
    1. Create an AWS Elastic Beanstalk application using the custom web server platform. Specify the web server executable and the application project and source files. Enable log file rotation to Amazon Simple Storage Service (S3). (EB does not work with Custom server executable)
    2. Create Dockerfile for the application. Create an AWS OpsWorks stack consisting of a custom layer. Create custom recipes to install Docker and to deploy your Docker container using the Dockerfile. Create custom recipes to install and configure the application to publish the logs to Amazon CloudWatch Logs (although this is one of the options, the last sentence mentions configuring the application itself to publish the logs, which would need changes to the application as it needs to use the SDK or CLI)
    3. Create Dockerfile for the application. Create an AWS OpsWorks stack consisting of a Docker layer that uses the Dockerfile. Create custom recipes to install and configure Amazon Kinesis to publish the logs into Amazon CloudWatch. (Kinesis not needed)
    4. Create a Dockerfile for the application. Create an AWS Elastic Beanstalk application using the Docker platform and the Dockerfile. Enable logging the Docker configuration to automatically publish the application logs. Enable log file rotation to Amazon S3. (Use Docker configuration with awslogs and EB with Docker)
    5. Use VM import/Export to import a virtual machine image of the server into AWS as an AMI. Create an Amazon Elastic Compute Cloud (EC2) instance from AMI, and install and configure the Amazon CloudWatch Logs agent. Create a new AMI from the instance. Create an AWS Elastic Beanstalk application using the AMI platform and the new AMI. (Use VM Import/Export to create AMI and CloudWatch logs agent to log)
  7. Which of the following groups is AWS Elastic Beanstalk best suited for?
    1. Those who want to deploy and manage their applications within minutes in the AWS cloud.
    2. Those who want to privately store and manage Git repositories in the AWS cloud.
    3. Those who want to automate the deployment of applications to instances and to update the applications as required.
    4. Those who want to model, visualize, and automate the steps required to release software.
  8. When thinking of AWS Elastic Beanstalk’s model, which is true?
    1. Applications have many deployments, deployments have many environments.
    2. Environments have many applications, applications have many deployments.
    3. Applications have many environments, environments have many deployments. (Applications group logical services. Environments belong to applications, and typically represent different deployment levels (dev, stage, prod, and so forth). Deployments belong to environments, and are pushes of bundles of code for the environments to run.)
    4. Deployments have many environments, environments have many applications.
  9. If you’re trying to configure an AWS Elastic Beanstalk worker tier for easy debugging if there are problems finishing queue jobs, what should you configure?
    1. Configure Rolling Deployments.
    2. Configure Enhanced Health Reporting
    3. Configure Blue-Green Deployments.
    4. Configure a Dead Letter Queue (Elastic Beanstalk worker environments support SQS dead letter queues, where the worker can send messages that could not be successfully processed for some reason. A dead letter queue provides the ability to sideline, isolate and analyze the unsuccessfully processed messages. Refer link)
  10. When thinking of AWS Elastic Beanstalk, which statement is true?
    1. Worker tiers pull jobs from SNS.
    2. Worker tiers pull jobs from HTTP.
    3. Worker tiers pull jobs from JSON.
    4. Worker tiers pull jobs from SQS. (Elastic Beanstalk installs a daemon on each EC2 instance in the Auto Scaling group to process SQS messages in the worker environment. Refer link)
  11. You are building a Ruby on Rails application for internal, non-production use, which uses MySQL as a database. You want developers without very much AWS experience to be able to deploy new code with a single command line push. You also want to set this up as simply as possible. Which tool is ideal for this setup?
    1. AWS CloudFormation
    2. AWS OpsWorks
    3. AWS ELB + EC2 with CLI Push
    4. AWS Elastic Beanstalk
  12. What AWS products and features can be deployed by Elastic Beanstalk? Choose 3 answers.
    1. Auto scaling groups
    2. Route 53 hosted zones
    3. Elastic Load Balancers
    4. RDS Instances
    5. Elastic IP addresses
    6. SQS Queues
  13. AWS Elastic Beanstalk stores your application files and optionally server log files in ____.
    1. Amazon Storage Gateway
    2. Amazon Glacier
    3. Amazon EC2
    4. Amazon S3
  14. When you use the AWS Elastic Beanstalk console to deploy a new application ____.
    1. Need to upload each file separately
    2. Need to create each file and path
    3. Need to upload a source bundle
    4. Need to create each file

References

AWS_Elastic_Beanstalk_Developer_Guide

AWS EFS vs EBS Multi-Attach

AWS EFS vs EBS Multi-Attach

EFS vs EBS Multi-Attach features

  • Elastic File System – EFS is a file storage service for use with Amazon compute (EC2, containers, serverless) and on-premises servers. EFS provides a file system interface, file system access semantics (such as strong consistency and file locking), and concurrently accessible storage for up to thousands of EC2 instances.
  • Elastic Block Store – EBS is a block-level storage service for use with EC2. EBS can deliver performance for workloads that require the lowest-latency access to data from a single EC2 instance.
  • Service type
    • Elastic File System is fully managed by AWS
    • EBS needs to be managed by the user.
  • Accessibility
    • EFS can be accessed concurrently from all AZs in the Region.
    • EBS Multi-Attach can be accessed concurrently from instances within the same AZ.
  • Data Scalability
    • EFS provides unlimited data storage
    • EBS Multi-Attach has limits on the storage it can provide.
  • Instance Scalability
    • EFS can be attached to tens, hundreds, or even thousands of compute instances.
    • EBS Multi-Attach enabled volumes can be attached to up to 16 Linux instances built on the Nitro System.
  • Supported Instances
    • EFS is compatible with all Linux-based AMIs for EC2; it is a POSIX-compliant file system (~Linux) with a standard file API
    • Multi-Attach enabled volumes can be attached to up to 16 Linux instances built on the Nitro System that are in the same AZ. Multi-Attach enabled volume can be attached to Windows instances, but the OS does not recognize the data on the volume that is shared between the instances, which can result in data inconsistency.
  • Pricing
    • EFS is priced as per the pay-as-you-use model
    • EBS is priced as per the provisioned capacity
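
A sketch of the EBS Multi-Attach side of the comparison; Multi-Attach must be requested at volume creation, requires a Provisioned IOPS (io1/io2) volume, and the instance IDs below are illustrative assumptions:

```python
import boto3

ec2 = boto3.client('ec2')

volume = ec2.create_volume(
    AvailabilityZone='us-east-1a',   # all attached instances must be in this AZ
    Size=100,
    VolumeType='io2',
    Iops=3000,
    MultiAttachEnabled=True,
)

# In real code, wait for the volume to become 'available' before attaching.
for instance_id in ('i-0123456789abcdef0', 'i-0fedcba9876543210'):
    ec2.attach_volume(Device='/dev/sdf', InstanceId=instance_id,
                      VolumeId=volume['VolumeId'])
```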

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A company wants to organize the contents of multiple websites in managed file storage. The company must be able to scale the storage based on demand without needing to provision storage. Multiple servers across multiple Availability Zones within a region should be able to access this storage concurrently. Which services should the Solutions Architect recommend?
    1. Amazon S3
    2. Amazon EBS Multi-Attach
    3. Amazon EFS
    4. AWS Storage Gateway – Volume gateway

References

Amazon_EBS & Amazon_EFS

AWS S3 Security

AWS S3 Security

  • AWS S3 Security is a shared responsibility between AWS and the Customer
  • S3 is a fully managed service that is protected by the AWS global network security procedures
  • AWS handles basic security tasks like guest operating system (OS) and database patching, firewall configuration, and disaster recovery.
  • Security and compliance of S3 are assessed by third-party auditors as part of multiple AWS compliance programs including SOC, PCI DSS, HIPAA, etc.
  • S3 provides several other features to handle security, which are the customers’ responsibility.
  • S3 Encryption supports both data at rest and data in transit encryption.
    • Data in transit encryption can be provided by enabling communication via SSL or using client-side encryption
    • Data at rest encryption can be provided using Server Side or Client Side encryption
  • S3 permissions can be handled using IAM policies, bucket policies, and access control lists (ACLs).
  • S3 Object Lock helps to store objects using a WORM model and can help prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.
  • S3 Access Points simplify data access for any AWS service or customer application that stores data in S3.
  • S3 Versioning with MFA Delete can be enabled on a bucket to ensure that data in the bucket cannot be accidentally overwritten or deleted.
  • S3 Block Public Access provides controls across an entire AWS Account or at the individual S3 bucket level to ensure that objects never have public access, now and in the future.
  • S3 Access Analyzer monitors the access policies, ensuring that the policies provide only the intended access to your S3 resources.

S3 Encryption

  • S3 allows the protection of data in transit by enabling communication via SSL or using client-side encryption
  • S3 provides data-at-rest encryption using
    • Server-Side Encryption: S3 handles the encryption
      • SSE-S3
        • S3 handles the encryption and decryption using S3 managed keys
      • SSE-KMS
        • S3 handles the encryption and decryption using keys managed through AWS KMS.
      • SSE-C
        • S3 handles the encryption and decryption using keys managed and provided by the Customer.
    • Client Side Encryption: Customer handles the encryption
      • CSE-CMK
        • Customer handles the encryption and decryption using keys managed through AWS KMS.
      • Client-side Master Key
        • Customer handles the encryption and decryption using keys managed by them.

S3 Permissions

Refer blog post @ S3 Permissions

S3 Object Lock

  • S3 Object Lock helps to store objects using a write-once-read-many (WORM) model.
  • can help prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.
  • can help meet regulatory requirements that require WORM storage or add an extra layer of protection against object changes and deletion.
  • can be enabled only for new buckets and works only in versioned buckets.
  • provides two retention modes that apply different levels of protection to the objects
    • Governance mode
      • Users can’t overwrite or delete an object version or alter its lock settings unless they have special permissions.
      • Objects can be protected from being deleted by most users, but some users can be granted permission to alter the retention settings or delete the object if necessary.
      • Can be used to test retention-period settings before creating a compliance-mode retention period.
    • Compliance mode
      • A protected object version can’t be overwritten or deleted by any user, including the root user in the AWS account.
      • Object retention mode can’t be changed, and its retention period can’t be shortened.
      • Object versions can’t be overwritten or deleted for the duration of the retention period.

S3 Access Points

  • S3 access points simplify data access for any AWS service or customer application that stores data in S3.
  • Access points are named network endpoints that are attached to buckets and can be used to perform S3 object operations, such as GetObject and PutObject.
  • Each access point has distinct permissions and network controls that S3 applies for any request that is made through that access point.
  • Each access point enforces a customized access point policy that works in conjunction with the bucket policy, attached to the underlying bucket.
  • An access point can be configured to accept requests only from a VPC to restrict S3 data access to a private network.
  • Custom block public access settings can be configured for each access point.
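
A sketch of a VPC-restricted access point; the account ID, names, and VPC ID are illustrative assumptions:

```python
import boto3

s3control = boto3.client('s3control')

s3control.create_access_point(
    AccountId='111122223333',
    Name='my-vpc-ap',
    Bucket='my-bucket',
    # restrict this access point so requests are accepted only from the VPC
    VpcConfiguration={'VpcId': 'vpc-0abc1234'},
)
```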

S3 VPC Gateway Endpoint

  • A VPC endpoint enables connections between a VPC and supported services, without requiring that you use an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
  • Traffic between the VPC and S3 does not leave the Amazon network, so the VPC is not exposed to the public internet.
  • A Gateway Endpoint is a gateway specified as a target for a route in the route table, used for traffic destined to S3 (or DynamoDB). See the sketch below.
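
A sketch of creating such a Gateway endpoint; the VPC, route table, and the Region in the service name are illustrative assumptions:

```python
import boto3

ec2 = boto3.client('ec2')

ec2.create_vpc_endpoint(
    VpcEndpointType='Gateway',
    VpcId='vpc-0abc1234',
    ServiceName='com.amazonaws.us-east-1.s3',
    RouteTableIds=['rtb-0abc1234'],   # a route to S3 via the endpoint is added here
)
```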

S3 Block Public Access

  • S3 Block Public Access provides controls across an entire AWS Account or at the individual S3 bucket level to ensure that objects never have public access, now and in the future.
  • S3 Block Public Access provides settings for access points, buckets, and accounts to help manage public access to S3 resources.
  • By default, new buckets, access points, and objects don’t allow public access. However, users can modify bucket policies, access point policies, or object permissions to allow public access.
  • S3 Block Public Access settings override these policies and permissions so that public access to these resources can be limited.
  • S3 Block Public Access allows account administrators and bucket owners to easily set up centralized controls to limit public access to their S3 resources that are enforced regardless of how the resources are created.
  • S3 doesn’t support block public access settings on a per-object basis.
  • S3 Block Public Access settings when applied to an account apply to all AWS Regions globally.
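
A sketch of applying all four Block Public Access controls, first to a single bucket and then account-wide; the bucket name and account ID are illustrative assumptions:

```python
import boto3

config = {
    'BlockPublicAcls': True,        # reject requests that set public ACLs
    'IgnorePublicAcls': True,       # ignore any existing public ACLs
    'BlockPublicPolicy': True,      # reject public bucket policies
    'RestrictPublicBuckets': True,  # restrict access to buckets with public policies
}

boto3.client('s3').put_public_access_block(
    Bucket='my-bucket', PublicAccessBlockConfiguration=config)

# Account-level settings apply to all buckets in all Regions for the account.
boto3.client('s3control').put_public_access_block(
    AccountId='111122223333', PublicAccessBlockConfiguration=config)
```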

S3 Access Analyzer

  • S3 Access Analyzer monitors the access policies, ensuring that the policies provide only the intended access to your S3 resources.
  • S3 Access Analyzer evaluates the bucket access policies and enables you to discover and swiftly remediate buckets with potentially unintended access.

S3 Security Best Practices

S3 Preventative Security Best Practices

  • Ensure S3 buckets use the correct policies and are not publicly accessible
    • Use S3 block public access
    • Identify Bucket policies and ACLs that allow public access
    • Use AWS Trusted Advisor to inspect the S3 implementation.
  • Implement least privilege access
  • Use IAM roles for applications and AWS services that require S3 access
  • Enable Multi-factor authentication (MFA) Delete to help prevent accidental bucket deletions
  • Consider Data at Rest Encryption
  • Enforce Data in Transit Encryption
  • Consider S3 Object Lock to store objects using a “Write Once Read Many” (WORM) model.
  • Enable versioning to easily recover from both unintended user actions and application failures.
  • Consider S3 Cross-Region replication
  • Consider VPC endpoints for S3 access to provide private S3 connectivity and help prevent traffic from potentially traversing the open internet.

S3 Monitoring and Auditing Best Practices

  • Identify and Audit all S3 buckets to have visibility of all the S3 resources to assess their security posture and take action on potential areas of weakness.
  • Implement monitoring using AWS monitoring tools
  • Enable S3 server access logging, which provides detailed records of the requests that are made to a bucket, useful for security and access audits
  • Use AWS CloudTrail, which provides a record of actions taken by a user, a role, or an AWS service in S3.
  • Enable AWS Config, which enables you to assess, audit, and evaluate the configurations of the AWS resources
  • Consider using Amazon Macie with S3 to automatically discover, classify, and protect sensitive data in AWS.
  • Monitor AWS security advisories to regularly check security advisories posted in Trusted Advisor for the AWS account.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.

References

AWS_S3_Security

AWS Backup

AWS Backup

  • AWS Backup is a fully-managed service that helps centralize and automate data protection across AWS services, in the cloud, and on premises.
  • helps configure backup policies and monitor activity for the AWS resources in one place.
  • helps automate and consolidate backup tasks previously performed service-by-service and removes the need to create custom scripts and manual processes.
  • helps create backup policies called backup plans that help define the backup requirements like frequency, window, retention period, etc.
  • automatically backs up the AWS resources according to the defined backup plan.
  • can apply backup plans to the AWS resources by simply tagging them.
  • stores the periodic backups incrementally, which provides the data protection of frequent backups while minimizing storage costs (see the sketch below).
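
A hedged sketch of a daily plan with 30-day retention and tag-based resource assignment; the plan name, vault, role ARN, and tag are illustrative assumptions:

```python
import boto3

backup = boto3.client('backup')

plan = backup.create_backup_plan(BackupPlan={
    'BackupPlanName': 'daily-30d',
    'Rules': [{
        'RuleName': 'daily',
        'TargetBackupVaultName': 'Default',
        'ScheduleExpression': 'cron(0 5 ? * * *)',   # every day at 05:00 UTC
        'Lifecycle': {'DeleteAfterDays': 30},        # 30-day retention
    }],
})

# Any current or future resource tagged backup=daily is picked up by the plan.
backup.create_backup_selection(
    BackupPlanId=plan['BackupPlanId'],
    BackupSelection={
        'SelectionName': 'by-tag',
        'IamRoleArn': 'arn:aws:iam::111122223333:role/AWSBackupDefaultServiceRole',
        'ListOfTags': [{'ConditionType': 'STRINGEQUALS',
                        'ConditionKey': 'backup',
                        'ConditionValue': 'daily'}],
    },
)
```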

AWS Backup Supported Services

  • AWS Backup supports (among others) EC2 instances, EBS volumes, EFS file systems, RDS and Aurora databases, DynamoDB tables, FSx file systems, Storage Gateway volumes, and S3 buckets.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. For the production account, a SysOps administrator must ensure that all data is backed up daily for all current and future Amazon EC2 instances and Amazon Elastic File System (Amazon EFS) file systems. Backups must be retained for 30 days. Which solution will meet these requirements with the LEAST amount of effort?
    1. Create a backup plan in AWS Backup. Assign resources by resource ID, selecting all existing EC2 and EFS resources that are running in the account. Edit the backup plan daily to include any new resources. Schedule the backup plan to run every day with a lifecycle policy to expire backups after 30 days.
    2. Create a backup plan in AWS Backup. Assign resources by tags. Ensure that all existing EC2 and EFS resources are tagged correctly. Schedule the backup plan to run every day with a lifecycle policy to expire backups after 30 days.
    3. Create a lifecycle policy in Amazon Data Lifecycle Manager (Amazon DLM). Assign all resources by resource ID, selecting all existing EC2 and EFS resources that are running in the account. Edit the lifecycle policy daily to include any new resources. Schedule the lifecycle policy to create snapshots every day with a retention period of 30 days.
    4. Create a lifecycle policy in Amazon Data Lifecycle Manager (Amazon DLM). Assign all resources by tags. Ensure that all existing EC2 and EFS resources are tagged correctly. Schedule the lifecycle policy to create snapshots every day with a retention period of 30 days.

References

AWS_Backup

AWS S3 Object Lock

AWS S3 Object Lock

  • S3 Object Lock helps to store objects using a write-once-read-many (WORM) model.
  • can help prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.
  • can help meet regulatory requirements that require WORM storage or add an extra layer of protection against object changes and deletion.
  • can be enabled only for new buckets. For an existing bucket, you need to contact AWS Support.
  • works only in versioned buckets.
  • Once Object Lock is enabled
    • Object Lock can’t be disabled
    • automatically enables versioning for the bucket
    • versioning can’t be suspended for the bucket.
  • provides two ways to manage object retention.
    • Retention period
      • protects an object version for a fixed amount of time, during which an object remains locked.
      • During this period, the object is WORM-protected and can’t be overwritten or deleted.
      • can be applied on an object version either explicitly or through a bucket default setting.
      • S3 stores a timestamp in the object version’s metadata to indicate when the retention period expires. After the retention period expires, the object version can be overwritten or deleted unless you also placed a legal hold on the object version.
    • Legal hold
      • protects an object version, as a retention period, but it has no expiration date.
      • remains in place until you explicitly remove it.
      • can be freely placed and removed by any user who has the s3:PutObjectLegalHold permission.
      • are independent of retention periods.
    • Retention periods and legal holds apply to individual object versions.
    • Placing a retention period or legal hold on an object protects only the version specified in the request. It doesn’t prevent new versions of the object from being created.
    • An object version can have both a retention period and a legal hold, one but not the other, or neither.
  • provides two retention modes that apply different levels of protection to the objects
    • Governance mode
    • Compliance mode
  • S3 buckets with S3 Object Lock can’t be used as destination buckets for server access logs.
  • has been assessed by Cohasset Associates for use in environments that are subject to SEC 17a-4, CFTC, and FINRA regulations.
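
A sketch of the two retention mechanisms above; the bucket, key, mode, and dates are illustrative assumptions (the bucket must be created with Object Lock enabled, which also turns on versioning):

```python
from datetime import datetime, timezone
import boto3

s3 = boto3.client('s3')

s3.create_bucket(Bucket='my-records', ObjectLockEnabledForBucket=True)

# Retention period: WORM-protect this object version until a fixed date.
s3.put_object_retention(
    Bucket='my-records', Key='ledger.csv',
    Retention={'Mode': 'COMPLIANCE',
               'RetainUntilDate': datetime(2035, 1, 1, tzinfo=timezone.utc)},
)

# Legal hold: no expiration date; stays until explicitly removed.
s3.put_object_legal_hold(Bucket='my-records', Key='ledger.csv',
                         LegalHold={'Status': 'ON'})
```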

S3 Object Lock – Retention Modes

Governance mode

  • Users can’t overwrite or delete an object version or alter its lock settings unless they have special permissions.
  • Objects can be protected from being deleted by most users, but some users can be granted permission to alter the retention settings or delete the object if necessary.
  • Can be used to test retention-period settings before creating a compliance-mode retention period.
  • To override or remove governance-mode retention settings, a user must have the s3:BypassGovernanceRetention permission and must explicitly include x-amz-bypass-governance-retention:true as a request header.

Compliance mode

  • A protected object version can’t be overwritten or deleted by any user, including the root user in the AWS account.
  • Object retention mode can’t be changed, and its retention period can’t be shortened.
  • Object versions can’t be overwritten or deleted for the duration of the retention period.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A company needs to store its accounting records in Amazon S3. No one at the company, including administrative users and root users, should be able to delete the records for an entire 10-year period. The records must be stored with maximum resiliency. Which solution will meet these requirements?
    1. Use an access control policy to deny deletion of the records for a period of 10 years.
    2. Use an IAM policy to deny deletion of the records. After 10 years, change the IAM policy to allow deletion.
    3. Use S3 Object Lock in compliance mode for a period of 10 years.
    4. Use S3 Object Lock in governance mode for a period of 10 years.

References

Amazon_S3_Object_Lock

AWS S3 Encryption

AWS S3 Encryption

  • AWS S3 Encryption supports both data at rest and data in transit encryption.
  • Data in-transit
    • S3 allows protection of data in transit by enabling communication via SSL or using client-side encryption
  • Data at Rest
    • Server-Side Encryption
      • S3 encrypts the object before saving it on disks in its data centers and decrypts it when the objects are downloaded
    • Client-Side Encryption
      • data is encrypted at the client-side and uploaded to S3.
      • the encryption process, the encryption keys, and related tools are managed by the user.

S3 Server-Side Encryption

  • Server-side encryption is about data encryption at rest
  • Server-side encryption encrypts only the object data.
  • Any object metadata is not encrypted.
  • S3 handles the encryption (as it writes to disks) and decryption (when objects are accessed) of the data objects
  • There is no difference in the access mechanism for encrypted and unencrypted objects; encryption and decryption are handled transparently by S3

Server-Side Encryption with S3-Managed Keys – SSE-S3

  • Encryption keys are handled and managed by AWS
  • Each object is encrypted with a unique data key employing strong multi-factor encryption.
  • SSE-S3 encrypts the data key with a master key that is regularly rotated.
  • S3 server-side encryption uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256), to encrypt the data.
  • Whether or not objects are encrypted with SSE-S3 can’t be enforced when they are uploaded using pre-signed URLs, because the only way server-side encryption can be specified is through the AWS Management Console or through an HTTP request header.
  • Requests must set the header x-amz-server-side-encryption to AES256.
  • To enforce server-side encryption for all of the objects stored in a bucket, use a bucket policy that denies permission to upload an object unless the request includes the x-amz-server-side-encryption header to request server-side encryption.
SSE-S3 : Server Side Encryption using S3 Managed Keys
Source: Oreilly
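
A sketch of requesting SSE-S3 on upload; boto3 sends the x-amz-server-side-encryption header for you (bucket and key are illustrative assumptions):

```python
import boto3

s3 = boto3.client('s3')

s3.put_object(
    Bucket='my-bucket',
    Key='report.txt',
    Body=b'confidential',
    ServerSideEncryption='AES256',   # x-amz-server-side-encryption: AES256
)
```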

Server-Side Encryption with AWS KMS-Managed Keys – SSE-KMS

  • SSE-KMS is similar to SSE-S3, but it uses AWS Key Management Services (KMS) which provides additional benefits along with additional charges
    • KMS is a service that combines secure, highly available hardware and software to provide a key management system scaled for the cloud.
    • KMS uses customer master keys (CMKs) to encrypt the S3 objects.
    • The master key is never made available.
    • KMS enables you to centrally create encryption keys, and define the policies that control how keys can be used.
    • Allows audit of keys used to prove they are being used correctly, by inspecting logs in AWS CloudTrail.
    • Allows keys to be temporarily disabled and re-enabled.
    • Allows keys to be rotated regularly.
    • Security controls in AWS KMS can help meet encryption-related compliance requirements.
  • SSE-KMS enables separate permissions for the use of an envelope key (that is, a key that protects the data’s encryption key) that provides added protection against unauthorized access to the objects in S3.
  • SSE-KMS provides the option to create and manage encryption keys yourself, or use a default customer master key (CMK) that is unique to you, the service you’re using, and the region you’re working in.
  • Creating and Managing CMK gives more flexibility, including the ability to create, rotate, disable, and define access controls, and audit the encryption keys used to protect the data.
  • Data keys used to encrypt the data are also encrypted and stored alongside the data they protect and are unique to each object.
  • Process flow
    • An application or AWS service client requests an encryption key to encrypt data and passes a reference to a master key under the account.
    • Client requests are authenticated based on whether they have access to use the master key.
    • A new data encryption key is created, and a copy of it is encrypted under the master key.
    • Both the data key and encrypted data key are returned to the client.
    • Data key is used to encrypt customer data and then deleted as soon as is practical.
    • Encrypted data key is stored for later use and sent back to AWS KMS when the source data needs to be decrypted.
  • S3 supports only symmetric KMS keys, not asymmetric keys.
  • Must set header x-amz-server-side-encryption to aws:kms
SSE-KMS : Server Side Encryption using AWS KMS managed keys
Source: Oreilly
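
A sketch of the envelope-encryption flow above, using KMS directly to show the two forms of the data key, plus the per-object SSE-KMS request; the key alias is an illustrative assumption:

```python
import boto3

kms = boto3.client('kms')

# Steps 1-4: request a data key under the master key; KMS returns both a
# plaintext copy (use, then discard) and a copy encrypted under the master key.
data_key = kms.generate_data_key(KeyId='alias/my-master-key', KeySpec='AES_256')
plaintext_key = data_key['Plaintext']
encrypted_key = data_key['CiphertextBlob']   # stored alongside the data

# Steps 5-6: later, send the encrypted key back to KMS to recover the
# plaintext key so the data can be decrypted.
plaintext_key = kms.decrypt(CiphertextBlob=encrypted_key)['Plaintext']

# With S3 itself, SSE-KMS is requested per object with the aws:kms header value.
boto3.client('s3').put_object(
    Bucket='my-bucket', Key='report.txt', Body=b'confidential',
    ServerSideEncryption='aws:kms',
    SSEKMSKeyId='alias/my-master-key',
)
```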

Server-Side Encryption with Customer-Provided Keys – SSE-C

  • Encryption keys can be managed and provided by the Customer and S3 manages the encryption, as it writes to disks, and decryption, when you access the objects
  • When you upload an object, the encryption key is provided as a part of the request and S3 uses that encryption key to apply AES-256 encryption to the data and removes the encryption key from memory.
  • When you download an object, the same encryption key must be provided as a part of the request. S3 first verifies the encryption key, and if it matches, the object is decrypted before being returned to you.
  • As each object and each object’s version can be encrypted with a different key, you are responsible for maintaining the mapping between the object and the encryption key used.
  • SSE-C requests must be done through HTTPS and S3 will reject any requests made over HTTP when using SSE-C.
  • For security considerations, AWS recommends considering any key sent erroneously using HTTP to be compromised and it should be discarded or rotated.
  • S3 does not store the encryption key provided. Instead, a randomly salted HMAC value of the encryption key is stored which can be used to validate future requests. The salted HMAC value cannot be used to decrypt the contents of the encrypted object or to derive the value of the encryption key. That means, if you lose the encryption key, you lose the object.
SSE-C : Server-Side Encryption with Customer-Provided Keys
Source: Oreilly
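
A sketch of SSE-C round-tripping an object; the same 256-bit key must accompany upload and download over HTTPS, and boto3 computes the key-MD5 header automatically (bucket and key names are illustrative assumptions):

```python
import os
import boto3

s3 = boto3.client('s3')
customer_key = os.urandom(32)   # customer-managed 256-bit key; keep it safe

s3.put_object(Bucket='my-bucket', Key='secret.txt', Body=b'data',
              SSECustomerAlgorithm='AES256', SSECustomerKey=customer_key)

# The download fails without the identical key: S3 keeps only a salted HMAC
# of it, so a lost key means a lost object.
obj = s3.get_object(Bucket='my-bucket', Key='secret.txt',
                    SSECustomerAlgorithm='AES256', SSECustomerKey=customer_key)
```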

Client-Side Encryption

Client-side encryption refers to encrypting data before sending it to S3 and decrypting the data after downloading it

AWS KMS-managed Customer Master Key – CMK

  • The customer can maintain the encryption CMK with AWS KMS and provide the CMK ID to the client to encrypt the data
  • Uploading Object
    • AWS S3 encryption client first sends a request to AWS KMS for the key to encrypt the object data.
    • AWS KMS returns a randomly generated data encryption key in two versions: a plaintext version for encrypting the data and a cipher blob to be uploaded with the object as object metadata
    • Client obtains a unique data encryption key for each object it uploads.
    • AWS S3 encryption client uploads the encrypted data and the cipher blob with object metadata
  • Download Object
    • AWS Client first downloads the encrypted object along with the cipher blob version of the data encryption key stored as object metadata
    • AWS Client then sends the cipher blob to AWS KMS to get the plain text version of the same, so that it can decrypt the object data.
Client Side Encryption - Customer Master Keys CSE-CMK
Source: Oreilly

Client-Side master key

  • Encryption master keys are completely maintained at the Client-side
  • Uploading Object
    • S3 encryption client (e.g. AmazonS3EncryptionClient in the AWS SDK for Java) locally generates a random one-time-use symmetric key (also known as a data encryption key or data key).
    • Client encrypts the data encryption key using the customer-provided master key.
    • Client uses this data encryption key to encrypt the data of a single S3 object (for each object, the client generates a separate data key).
    • Client then uploads the encrypted data to S3 and also saves the encrypted data key and its material description as object metadata (x-amz-meta-x-amz-key) in S3 by default
  • Downloading Object
    • Client first downloads the encrypted object from S3 along with the object metadata.
    • Using the material description in the metadata, the client first determines which master key to use to decrypt the encrypted data key.
    • Using that master key, the client decrypts the data key and uses it to decrypt the object
  • Client-side master keys and your unencrypted data are never sent to AWS
  • If the master key is lost the data cannot be decrypted
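
A minimal client-side sketch using the cryptography package rather than an AWS SDK encryption client, just to make the flow concrete: the data key is generated and kept locally, and S3 only ever receives ciphertext (in the real scheme the data key itself would also be encrypted under the master key and stored as object metadata):

```python
import boto3
from cryptography.fernet import Fernet

s3 = boto3.client('s3')

data_key = Fernet.generate_key()                 # one-time-use data key, client-side
ciphertext = Fernet(data_key).encrypt(b'sensitive payload')

s3.put_object(Bucket='my-bucket', Key='payload.bin', Body=ciphertext)

# Download and decrypt locally; if the key is lost, the data is unrecoverable.
obj = s3.get_object(Bucket='my-bucket', Key='payload.bin')
plaintext = Fernet(data_key).decrypt(obj['Body'].read())
```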

Enforcing S3 Encryption

  • S3 Encryption in Transit
    • S3 Bucket Policy can be used to enforce SSL communication with S3 using a Deny effect with the condition aws:SecureTransport set to false.
  • S3 Default Encryption
    • helps set the default encryption behaviour for an S3 bucket so that all new objects are encrypted when they are stored in the bucket.
    • Objects are encrypted using SSE with either S3-managed keys (SSE-S3) or AWS KMS keys stored in AWS KMS (SSE-KMS).
  • S3 Bucket Policy
    • can be applied that denies permissions to upload an object unless the request includes x-amz-server-side-encryption header to request server-side encryption.
    • is not required if S3 default encryption is enabled
    • is evaluated before the default encryption.

S3 Bucket Policy Enforce Encryption
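
A sketch of a bucket policy implementing both controls above: deny any request made without TLS and deny uploads that omit the server-side encryption header (the bucket name is an illustrative assumption):

```python
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # deny any request not made over SSL/TLS
            "Effect": "Deny", "Principal": "*", "Action": "s3:*",
            "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
        {   # deny uploads missing the x-amz-server-side-encryption header
            "Effect": "Deny", "Principal": "*", "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
        },
    ],
}

boto3.client('s3').put_bucket_policy(Bucket='my-bucket',
                                     Policy=json.dumps(policy))
```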

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A company is storing data on Amazon Simple Storage Service (S3). The company’s security policy mandates that data is encrypted at rest. Which of the following methods can achieve this? Choose 3 answers
    1. Use Amazon S3 server-side encryption with AWS Key Management Service managed keys
    2. Use Amazon S3 server-side encryption with customer-provided keys
    3. Use Amazon S3 server-side encryption with EC2 key pair.
    4. Use Amazon S3 bucket policies to restrict access to the data at rest.
    5. Encrypt the data on the client-side before ingesting to Amazon S3 using their own master key
    6. Use SSL to encrypt the data while in transit to Amazon S3.
  2. A user has enabled versioning on an S3 bucket. The user is using server side encryption for data at rest. If the user is supplying his own keys for encryption (SSE-C), which of the below mentioned statements is true?
    1. The user should use the same encryption key for all versions of the same object
    2. It is possible to have different encryption keys for different versions of the same object
    3. AWS S3 does not allow the user to upload his own keys for server side encryption
    4. The SSE-C does not work when versioning is enabled
  3. A storage admin wants to encrypt all the objects stored in S3 using server side encryption. The user does not want to use the AES 256 encryption key provided by S3. How can the user achieve this?
    1. The admin should upload his secret key to the AWS console and let S3 decrypt the objects
    2. The admin should use CLI or API to upload the encryption key to the S3 bucket. When making a call to the S3 API mention the encryption key URL in each request
    3. S3 does not support client supplied encryption keys for server side encryption
    4. The admin should send the keys and encryption algorithm with each API call
  4. A user has enabled versioning on an S3 bucket. The user is using server side encryption for data at rest. If the user is supplying his own keys for encryption (SSE-C), what is recommended to the user for the purpose of security?
    1. User should not use his own security key as it is not secure
    2. Configure S3 to rotate the user’s encryption key at regular intervals
    3. Configure S3 to store the user’s keys securely with SSL
    4. Keep rotating the encryption key manually at the client side
  5. A system admin is planning to encrypt all objects being uploaded to S3 from an application. The system admin does not want to implement his own encryption algorithm; instead he is planning to use server side encryption by supplying his own key (SSE-C). Which parameter is not required while making a call for SSE-C?
    1. x-amz-server-side-encryption-customer-key-AES-256
    2. x-amz-server-side-encryption-customer-key
    3. x-amz-server-side-encryption-customer-algorithm
    4. x-amz-server-side-encryption-customer-key-MD5
  6. You are designing a personal document-archiving solution for your global enterprise with thousands of employees. Each employee has potentially gigabytes of data to be backed up in this archiving solution. The solution will be exposed to the employees as an application, where they can just drag and drop their files to the archiving system. Employees can retrieve their archives through a web interface. The corporate network has high bandwidth AWS Direct Connect connectivity to AWS. You have regulatory requirements that all data needs to be encrypted before being uploaded to the cloud. How do you implement this in a highly available and cost efficient way?
    1. Manage encryption keys on-premises in an encrypted relational database. Set up an on-premises server with sufficient storage to temporarily store files and then upload them to Amazon S3, providing a client-side master key. (Storing files temporarily increases cost and this is not a highly available option)
    2. Manage encryption keys in a Hardware Security Module (HSM) appliance on-premises, with sufficient storage to temporarily store, encrypt, and upload files directly into Amazon Glacier. (Not cost effective)
    3. Manage encryption keys in Amazon Key Management Service (KMS), upload to Amazon Simple Storage Service (S3) with client-side encryption using a KMS customer master key ID, and configure Amazon S3 lifecycle policies to store each object using the Amazon Glacier storage tier. (With CSE-KMS the encryption happens at the client side before the object is uploaded to S3, and KMS is cost effective as well)
    4. Manage encryption keys in an AWS CloudHSM appliance. Encrypt files prior to uploading on the employee desktop and then upload directly into Amazon Glacier. (Not cost effective)
  7. A user has enabled server side encryption with S3. The user downloads the encrypted object from S3. How can the user decrypt it?
    1. S3 does not support server side encryption
    2. S3 provides a server side key to decrypt the object
    3. The user needs to decrypt the object using their own private key
    4. S3 manages encryption and decryption automatically
  8. When uploading an object, what request header can be explicitly specified in a request to Amazon S3 to encrypt object data when saved on the server side?
    1. x-amz-storage-class
    2. Content-MD5
    3. x-amz-security-token
    4. x-amz-server-side-encryption
  9. A company must ensure that any objects uploaded to an S3 bucket are encrypted. Which of the following actions should the SysOps Administrator take to meet this requirement? (Select TWO.)
    1. Implement AWS Shield to protect against unencrypted objects stored in S3 buckets.
    2. Implement Object access control list (ACL) to deny unencrypted objects from being uploaded to the S3 bucket.
    3. Implement Amazon S3 default encryption to make sure that any object being uploaded is encrypted before it is stored.
    4. Implement Amazon Inspector to inspect objects uploaded to the S3 bucket to make sure that they are encrypted.
    5. Implement S3 bucket policies to deny unencrypted objects from being uploaded to the buckets.

References

AWS_S3_Encryption

AWS S3 Versioning

S3 Versioning

  • S3 Versioning helps to keep multiple variants of an object in the same bucket and can be used to preserve, retrieve, and restore every version of every object stored in the S3 bucket.
  • S3 Object Versioning can be used to protect from unintended overwrites and accidental deletions
  • Versioning maintains multiple complete copies of the same object, so charges accrue for every version, e.g. a 1GB file with 5 versions with minor differences would consume 5GB of S3 storage space and you would be charged for all of it.
  • Buckets can be in one of the three states
    • Unversioned (the default)
    • Versioning-enabled
    • Versioning-suspended
  • S3 Object Versioning is not enabled by default and has to be explicitly enabled for each bucket.
  • Versioning once enabled, cannot be disabled and can only be suspended
  • Versioning enabled on a bucket applies to all the objects within the bucket
  • Permissions are set at the version level. Each version has its own object owner; an AWS account that creates the object version is the owner. So, you can set different permissions for different versions of the same object.
  • Irrespective of the Versioning, each object in the bucket has a version.
    • For Non Versioned bucket, the version ID for each object is null
    • For Versioned buckets, a unique version ID is assigned to each object
  • With Versioning, version ID forms a key element to define the uniqueness of an object within a bucket along with the bucket name and object key
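
A sketch of enabling versioning and observing per-object version IDs (the bucket and key are illustrative assumptions):

```python
import boto3

s3 = boto3.client('s3')

s3.put_bucket_versioning(Bucket='my-bucket',
                         VersioningConfiguration={'Status': 'Enabled'})

# Re-uploading the same key now creates a new version instead of overwriting.
v1 = s3.put_object(Bucket='my-bucket', Key='doc.txt', Body=b'first')
v2 = s3.put_object(Bucket='my-bucket', Key='doc.txt', Body=b'second')
print(v1['VersionId'], v2['VersionId'])   # two distinct version IDs

# A plain GET returns the current version; pass VersionId for older ones.
old = s3.get_object(Bucket='my-bucket', Key='doc.txt',
                    VersionId=v1['VersionId'])
```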

Object Retrieval

  • For Non Versioned bucket
    • An Object retrieval always returns the only object available.
  • For Versioned bucket
    • An object retrieval returns the current (latest) version of the object.
    • Non-Current objects can be retrieved by specifying the version ID.

Object Addition

  • For Non Versioned bucket
    • If an object with the same key is uploaded again it overwrites the object
  • For Versioned bucket
    • If an object with the same key is uploaded, the newly uploaded object becomes the current version and the previous object becomes the non-current version.
    • A non-current versioned object can be retrieved and restored hence protecting against accidental overwrites

Object Deletion

  • For Non Versioned bucket
    • An object is permanently deleted and cannot be recovered
  • For the Versioned bucket,
    • All versions remain in the bucket and S3 inserts a delete marker which becomes the current version
    • A non-current versioned object can be retrieved and restored, hence protecting against accidental deletions
    • If an object with a specific version ID is deleted, a permanent deletion happens and the object cannot be recovered

Delete marker

  • Delete Marker object does not have any data or ACL associated with it, just the key and the version ID
  • An object retrieval on a bucket with a delete marker as the Current version would return a 404
  • Only a DELETE operation is allowed on the Delete Marker object
  • If the Delete marker object is deleted by specifying its version ID, the previous non-current version object becomes the current version object
  • If a DELETE request is fired on an object with Delete Marker as the current version, the Delete marker object is not deleted but a Delete Marker is added again

[Figure: S3 Versioning - Delete Operation]

Restoring Previous Versions

  • Copy a previous version of the object into the same bucket. The copied object becomes the current version of that object and all object versions are preserved – Recommended as it keeps all the versions.
  • Permanently delete the current version of the object. When you delete the current object version, you, in effect, turn the previous version into the current version of that object.
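
A boto3 sketch of the recommended copy-based restore (placeholder names; the previous version ID would come from listing the key's versions):

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket", "report.csv"  # placeholder names

# Copy an older version onto the same key: the copy becomes the new
# current version and every existing version is preserved.
s3.copy_object(
    Bucket=bucket,
    Key=key,
    CopySource={"Bucket": bucket, "Key": key,
                "VersionId": "<previous-version-id>"},
)
```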

Versioning Suspended Bucket

  • Versioning can be suspended to stop accruing new versions of the same object in a bucket.
  • Existing objects in the bucket do not change; only the behavior of future requests changes.
  • An object with version ID null is added for each new object addition.
  • For each object addition with the same key name, the object with the version ID null is overwritten.
  • An object retrieval request will always return the current version of the object.
  • A simple DELETE request on an object permanently removes the object with version ID null (if one exists) and inserts a Delete Marker with version ID null
  • If the bucket has no object with version ID null, a DELETE request removes nothing but still inserts a Delete Marker
  • A DELETE request can still specify a version ID to permanently delete any previously stored version

MFA Delete

  • Additional security can be enabled by configuring a bucket to enable MFA (Multi-Factor Authentication) for the deletion of objects.
  • When MFA Delete is enabled, additional authentication is required for the following operations
    • Changing the versioning state of the bucket
    • Permanently deleting an object version
  • MFA Delete can be enabled on a bucket to ensure that data in the bucket cannot be accidentally deleted
  • While the bucket owner, the AWS account that created the bucket (root account), and all authorized IAM users can enable versioning, only the bucket owner (root account) can enable MFA Delete.
  • MFA Delete, however, does not prevent a simple DELETE (which only adds a delete marker); it protects only the two operations listed above.
  • MFA Delete cannot be enabled using the AWS Management Console. You must use the AWS Command Line Interface (AWS CLI) or the API.
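
As a hedged sketch of enabling it via the API, assuming root credentials are configured locally; the bucket name, account ID, MFA device ARN, and 6-digit code are all placeholders:

```python
import boto3

# MFA Delete must be set via the API/CLI using the bucket owner's
# (root account) credentials; the console cannot enable it.
s3 = boto3.client("s3")

# "MFA" is the MFA device serial/ARN followed by the current token code.
s3.put_bucket_versioning(
    Bucket="my-example-bucket",  # placeholder
    MFA="arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456",
    VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
)
```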

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which set of Amazon S3 features helps to prevent and recover from accidental data loss?
    1. Object lifecycle and service access logging
    2. Object versioning and Multi-factor authentication
    3. Access controls and server-side encryption
    4. Website hosting and Amazon S3 policies
  2. You use S3 to store critical data for your company. Several users within your group currently have full permissions to your S3 buckets. You need to come up with a solution that does not impact your users and also protects against the accidental deletion of objects. Which two options will address this issue? Choose 2 answers
    1. Enable versioning on your S3 Buckets
    2. Configure your S3 Buckets with MFA delete
    3. Create a Bucket policy and only allow read only permissions to all users at the bucket level
    4. Enable object life cycle policies and configure the data older than 3 months to be archived in Glacier
  3. To protect S3 data from both accidental deletion and accidental overwriting, you should
    1. enable S3 versioning on the bucket
    2. access S3 data using only signed URLs
    3. disable S3 delete using an IAM bucket policy
    4. enable S3 Reduced Redundancy Storage
    5. enable Multi-Factor Authentication (MFA) protected access
  4. A user has not enabled versioning on an S3 bucket. What will be the version ID of the object inside that bucket?
    1. 0
    2. There will be no version attached
    3. Null
    4. Blank
  5. A user is trying to find the state of an S3 bucket with respect to versioning. Which of the below mentioned states will AWS not return when queried?
    1. versioning-enabled
    2. versioning-suspended
    3. unversioned
    4. versioned

References

AWS S3 Versioning

AWS Storage Gateway

AWS Storage Gateway

  • AWS Storage Gateway connects on-premises software appliances with cloud-based storage to provide seamless integration with data security features between on-premises and the AWS storage infrastructure.
  • AWS Storage Gateway is a hybrid cloud storage service that gives you on-premises access to virtually unlimited cloud storage.
  • Storage Gateway allows storage of data in the AWS cloud for scalable and cost-effective storage while maintaining data security.
  • Storage Gateway can run either on-premises, as a VM appliance, or in AWS, as an EC2 instance. So if the on-premises data center goes offline and there is no available host, the gateway can be deployed on an EC2 instance.
  • Gateways hosted on EC2 instances can be used for disaster recovery, data mirroring, and providing storage for applications hosted on EC2
  • Storage Gateway, by default, uploads data using SSL and encrypts data at rest in S3 or Glacier using AES-256, thus protecting data both in transit and at rest.
  • Storage Gateway offers multiple types
    • File Gateway
    • Volume Gateway
    • Tape Gateway

S3 File Gateway

  • provides a file interface into S3, combining a cloud service with a virtual software appliance.
  • allows storing and retrieving of objects in S3 using industry-standard file protocols such as NFS and SMB.
  • Software appliance, or gateway, is deployed into the on-premises environment as a VM running on VMware ESXi or Microsoft Hyper-V hypervisor.
  • provides access to objects in S3 as files or file share mount points. It can be considered as a file system mount on S3.
  • durably stores POSIX-style metadata, including ownership, permissions, and timestamps in S3 as object user metadata associated with the file.
  • provides a cost-effective alternative to on-premises storage.
  • provides low-latency access to data through transparent local caching.
  • manages data transfer to and from AWS, buffers applications from network congestion, optimizes and streams data in parallel, and manages bandwidth consumption.
  • easily integrates with services like IAM, KMS, CloudWatch, CloudTrail, etc.
  • File Gateway allows you to
    • store and retrieve files directly using the NFS version 3 or 4.1 protocol.
    • store and retrieve files directly using the SMB protocol, versions 2 and 3.
    • access the data directly in S3 from any AWS Cloud application or service.
    • manage S3 data using lifecycle policies, cross-region replication, and versioning.
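
For illustration, a hedged boto3 sketch of creating an NFS file share on an already-activated file gateway; the gateway ARN, IAM role, and bucket ARN below are placeholder assumptions, and the role must allow the gateway to access the bucket.

```python
import uuid
import boto3

sgw = boto3.client("storagegateway")

# All ARNs are placeholders; the gateway must already be activated.
share = sgw.create_nfs_file_share(
    ClientToken=str(uuid.uuid4()),  # idempotency token
    GatewayARN="arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE",
    Role="arn:aws:iam::111122223333:role/StorageGatewayS3AccessRole",
    LocationARN="arn:aws:s3:::my-example-bucket",
)
print(share["FileShareARN"])  # mount this share via NFS on-premises
```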

Volume Gateways

  • Volume gateways provide cloud-backed storage volumes that can be mounted as Internet Small Computer System Interface (iSCSI) devices from the on-premises application servers.
  • While all data is securely stored in AWS, the volume gateway configurations differ in how much data is stored on-premises.
  • expose a compatible iSCSI interface on the front end to easily integrate with existing backup applications, appearing as just another disk drive
  • back up the data incrementally by taking snapshots which are stored as EBS snapshots in S3. These snapshots can be restored as a gateway storage volume or used to create EBS volumes to be attached to an EC2 instance
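
A hedged boto3 sketch of that snapshot-and-restore path; the gateway volume ARN, description, and Availability Zone are placeholders.

```python
import boto3

sgw = boto3.client("storagegateway")
ec2 = boto3.client("ec2")

# Take a point-in-time snapshot of an existing gateway volume (placeholder ARN).
snap = sgw.create_snapshot(
    VolumeARN="arn:aws:storagegateway:us-east-1:111122223333:"
              "gateway/sgw-EXAMPLE/volume/vol-EXAMPLE",
    SnapshotDescription="nightly backup",
)

# The result is a regular EBS snapshot, so it can seed an EBS volume
# for an EC2 instance (e.g. replacement capacity for disaster recovery).
ec2.create_volume(SnapshotId=snap["SnapshotId"], AvailabilityZone="us-east-1a")
```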

Gateway Cached Volumes

[Figure: Storage Gateway Cached Volume]
  • Gateway Cached Volumes store data in S3, which acts as a primary data storage, and retains a copy of recently read data locally for low latency access to the frequently accessed data
  • Gateway-cached volumes offer substantial cost savings on primary storage and minimize the need to scale the storage on-premises.
  • All gateway-cached volume data and snapshot data are stored in S3 encrypted at rest using server-side encryption (SSE); the data cannot be accessed directly with the S3 API or any other tools.
  • Each gateway configured for gateway-cached volumes can support up to 32 volumes, with each volume ranging from 1 GiB to 32 TiB, for a total maximum storage volume of 1,024 TiB (1 PiB).
  • Gateway VM can be allocated disks
    • Cache storage
      • Cache storage acts as the on-premises durable storage, stores the data before uploading it to S3
      • Cache storage also stores recently read data for low-latency access
    • Upload buffer
      • Upload buffer acts as a staging area before the data is uploaded to S3
      • Gateway uploads data over an encrypted Secure Sockets Layer (SSL) connection to AWS, where it is stored encrypted in S3

Gateway Stored Volumes

[Figure: Storage Gateway Stored Volume]
  • Gateway stored volumes maintain the entire data set locally to provide low-latency access.
  • Gateway asynchronously backs up point-in-time snapshots (in the form of EBS snapshots) of the data to S3 which provides durable off-site backups
  • Gateway stored volume configuration provides durable and inexpensive off-site backups that you can recover to your local data center or EC2 for e.g., if you need replacement capacity for disaster recovery, you can recover the backups to EC2.
  • Each gateway configured for gateway-stored volumes can support up to 32 volumes, with each volume ranging from 1 GiB to 16 TiB, for a total maximum storage volume of 512 TiB
  • Gateway VM can be allocated disks
    • Volume Storage
      • For storing the actual data
      • Can be mapped to on-premises direct-attached storage (DAS) or storage area network (SAN) disks
    • Upload buffer
      • Upload buffer acts as a staging area before the data is uploaded to S3
      • Gateway uploads data over an encrypted Secure Sockets Layer (SSL) connection to AWS, where it is stored encrypted in Amazon S3

Tape Gateway – Gateway-Virtual Tape Library (VTL)

[Figure: Storage Gateway VTL]
  • Tape Gateway offers a durable, cost-effective data archival solution.
  • VTL interface can help leverage existing tape-based backup application infrastructure to store data on virtual tape cartridges created on the tape gateway.
  • Each Tape Gateway is preconfigured with a media changer and tape drives, which are available to the existing client backup applications as iSCSI devices. Tape cartridges can be added as needed to archive the data.
  • Gateway-VTL provides a virtual tape infrastructure that scales seamlessly with the business needs and eliminates the operational burden of provisioning, scaling, and maintaining a physical tape infrastructure.
  • Gateway VTL has the following components:-
    • Virtual Tape
      • Virtual tape is similar to the physical tape cartridge, except that the data is stored in the AWS storage solution
      • Each gateway can contain up to 1,500 tapes, for a maximum of 1 PiB of total tape data, with each tape ranging from 100 GiB to 2.5 TiB
    • Virtual Tape Library
      • Virtual tape library is similar to the physical tape library with tape drives (replaced with VTL tape drive) and robotic arms (replaced with Media changer)
      • Tapes in the Virtual tape library are backed up in S3
      • Backup software writes data to the gateway, the gateway stores data locally, and then asynchronously uploads it to virtual tapes in S3.
    • Archive OR Virtual Tape Shelf
      • Virtual tape shelf is similar to the offsite tape holding facility
      • Tapes in the Virtual tape shelf are archived in Glacier, providing an extremely low-cost storage service for data archiving and backup
      • VTS is located in the same region where the gateway was created, and every region has a single VTS irrespective of the number of gateways
      • Archiving tapes
        • When the backup software ejects a tape, the gateway moves the tape to the VTS for long-term storage
      • Retrieving tapes
        • A tape can be retrieved from the VTS only by first retrieving it to a VTL; it would be available in the VTL in about 24 hours (see the sketch after this list)
  • Gateway VM can be allocated disks for
    • Cache storage
      • Cache storage acts as the on-premises durable storage, stores the data before uploading it to S3.
      • Cache storage also stores recently read data for low-latency access
    • Upload buffer
      • Upload buffer acts as a staging area before the data is uploaded to the Virtual tape.
      • Gateway uploads data over an encrypted Secure Sockets Layer (SSL) connection to AWS, where it is stored encrypted in S3.
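
Archiving happens through the backup software's eject operation, but retrieval from the VTS back to a gateway's VTL can be requested via the API. A hedged boto3 sketch, with placeholder tape and gateway ARNs:

```python
import boto3

sgw = boto3.client("storagegateway")

# The tape must currently be archived in the VTS, and the target gateway
# must be a tape gateway in the same region (placeholder ARNs).
sgw.retrieve_tape_archive(
    TapeARN="arn:aws:storagegateway:us-east-1:111122223333:tape/TAPE0EXAMPLE",
    GatewayARN="arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE",
)
# The tape appears in the gateway's VTL once retrieval completes (~24 hours).
```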

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which of the following services natively encrypts data at rest within an AWS region? Choose 2 answers
    1. AWS Storage Gateway
    2. Amazon DynamoDB
    3. Amazon CloudFront
    4. Amazon Glacier
    5. Amazon Simple Queue Service
  2. What does the AWS Storage Gateway provide?
    1. It allows to integrate on-premises IT environments with Cloud Storage
    2. A direct encrypted connection to Amazon S3.
    3. It’s a backup solution that provides an on-premises Cloud storage.
    4. It provides an encrypted SSL endpoint for backups in the Cloud.
  3. You’re running an application on-premises due to its dependency on non-x86 hardware and want to use AWS for data backup. Your backup application is only able to write to POSIX-compatible block-based storage. You have 140TB of data and would like to mount it as a single folder on your file server. Users must be able to access portions of this data while the backups are taking place. What backup solution would be most appropriate for this use case?
    1. Use Storage Gateway and configure it to use Gateway Cached volumes.
    2. Configure your backup software to use S3 as the target for your data backups.
    3. Configure your backup software to use Glacier as the target for your data backups
    4. Use Storage Gateway and configure it to use Gateway Stored volumes (Data is hosted on the On-premise server as well. The requirement for 140TB is for file server On-Premise more to confuse and not in AWS. Just need a backup solution hence stored instead of cached volumes)
  4. A customer has a single 3-TB volume on-premises that is used to hold a large repository of images and print layout files. This repository is growing at 500 GB a year and must be presented as a single logical volume. The customer is becoming increasingly constrained with their local storage capacity and wants an off-site backup of this data, while maintaining low-latency access to their frequently accessed data. Which AWS Storage Gateway configuration meets the customer requirements?
    1. Gateway-Cached volumes with snapshots scheduled to Amazon S3
    2. Gateway-Stored volumes with snapshots scheduled to Amazon S3
    3. Gateway-Virtual Tape Library with snapshots to Amazon S3
    4. Gateway-Virtual Tape Library with snapshots to Amazon Glacier
  5. You have a proprietary data store on-premises that must be backed up daily by dumping the data store contents to a single compressed 50GB file and sending the file to AWS. Your SLAs state that any dump file backed up within the past 7 days can be retrieved within 2 hours. Your compliance department has stated that all data must be held indefinitely. The time required to restore the data store from a backup is approximately 1 hour. Your on-premise network connection is capable of sustaining 1gbps to AWS. Which backup methods to AWS would be most cost-effective while still meeting all of your requirements?
    1. Send the daily backup files to Glacier immediately after being generated (will not meet the RTO)
    2. Transfer the daily backup files to an EBS volume in AWS and take daily snapshots of the volume (Not cost effective)
    3. Transfer the daily backup files to S3 and use appropriate bucket lifecycle policies to send to Glacier (Store in S3 for seven days and then archive to Glacier)
    4. Host the backup files on a Storage Gateway with Gateway-Cached Volumes and take daily snapshots (Not Cost effective as local storage as well as S3 storage)
  6. A customer implemented AWS Storage Gateway with a gateway-cached volume at their main office. An event takes the link between the main and branch office offline. Which methods will enable the branch office to access their data? Choose 3 answers
    1. Use a HTTPS GET to the Amazon S3 bucket where the files are located (gateway volumes are only accessible from the AWS Storage Gateway and cannot be directly accessed using Amazon S3 APIs)
    2. Restore by implementing a lifecycle policy on the Amazon S3 bucket.
    3. Make an Amazon Glacier Restore API call to load the files into another Amazon S3 bucket within four to six hours.
    4. Launch a new AWS Storage Gateway instance AMI in Amazon EC2, and restore from a gateway snapshot
    5. Create an Amazon EBS volume from a gateway snapshot, and mount it to an Amazon EC2 instance.
    6. Launch an AWS Storage Gateway virtual iSCSI device at the branch office, and restore from a gateway snapshot
  7. A company uses on-premises servers to host its applications. The company is running out of storage capacity. The applications use both block storage and NFS storage. The company needs a high-performing solution that supports local caching without rearchitecting its existing applications. Which combination of actions should a solutions architect take to meet these requirements? (Choose two.)
    1. Mount Amazon S3 as a file system to the on-premises servers.
    2. Deploy an AWS Storage Gateway file gateway to replace NFS storage.
    3. Deploy AWS Snowball Edge to provision NFS mounts to on-premises servers.
    4. Deploy an AWS Storage Gateway volume gateway to replace the block storage.
    5. Deploy Amazon Elastic File System (Amazon EFS) volumes and mount them to on-premises servers.

References

  1. AWS Storage Gateway User Guide
  2. https://www.youtube.com/watch?v=AkehuRl5YPg

Amazon EBS Multi-Attach

EBS Multi-Attach

  • EBS Multi-Attach enables attaching a single Provisioned IOPS SSD (io1 or io2) volume to multiple instances that are in the same AZ.
  • Multiple Multi-Attach enabled volumes can be attached to an instance or set of instances.
  • Each instance to which the volume is attached has full read and write permission to the shared volume.
  • Multi-Attach helps achieve higher application availability in clustered Linux applications that manage concurrent write operations.
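
A minimal boto3 sketch of creating a Multi-Attach volume and attaching it to two instances; the AZ, size, IOPS, device name, and instance IDs are placeholders, and the instances must be Nitro-based instances in that same AZ.

```python
import boto3

ec2 = boto3.client("ec2")

# Create a Provisioned IOPS SSD volume with Multi-Attach enabled.
vol = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # placeholder AZ
    Size=100,
    VolumeType="io2",
    Iops=3000,
    MultiAttachEnabled=True,
)

ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

# Attach the same volume to multiple instances (placeholder IDs); each
# gets full read/write access, so the application must coordinate writes.
for instance_id in ["i-0aaaaEXAMPLE", "i-0bbbbEXAMPLE"]:
    ec2.attach_volume(VolumeId=vol["VolumeId"],
                      InstanceId=instance_id, Device="/dev/sdf")
```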

EBS Multi-Attach Considerations & Limitations

  • Multi-Attach is supported exclusively on Provisioned IOPS SSD volumes.
  • Multi-Attach enabled volumes can be attached
    • to up to 16 Linux instances built on the Nitro System that are in the same AZ.
    • to Windows instances, but the operating system does not recognize the data on the volume that is shared between the instances, which can result in data inconsistency.
  • Multi-Attach enabled volumes can be attached to one block device mapping per instance.
  • Multi-Attach enabled volumes are deleted on instance termination if the last attached instance is terminated and if that instance is configured to delete the volume on termination.
  • Multi-Attach enabled volumes can’t be created as boot volumes.
  • Multi-Attach can’t be enabled during instance launch using either the EC2 console or RunInstances API.
  • Multi-Attach enabled volumes do not support I/O fencing. I/O fencing protocols control write access in a shared storage environment to maintain data consistency; applications must provide their own write ordering to avoid data corruption.
  • Multi-Attach can’t be enabled or disabled while the volume is attached to an instance.
  • Multi-Attach option is disabled by default.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.

References

Amazon_EBS_Multi-Attach

AWS EC2 Instance Store Storage

EC2 Instance Store

  • An instance store provides temporary or ephemeral block-level storage for an Elastic Compute Cloud (EC2) instance.
  • is located on the disks that are physically attached to the host computer.
  • consists of one or more instance store volumes exposed as block devices.
  • The size of an instance store varies by instance type.
  • Virtual devices for instance store volumes are named ephemeral[0-23], starting with ephemeral0 and so on.
  • While an instance store is dedicated to a particular instance, the disk subsystem is shared among instances on a host computer.
  • is ideal for temporary storage of information that changes frequently, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers.
  • delivers very high random I/O performance and is a good option when storage with very low latency is needed, the data does not need to persist when the instance terminates, or the architecture is fault tolerant.

[Figure: EC2 Instance Store]

Instance Store Lifecycle

  • Instance store data lifetime is dependent on the lifecycle of the Instance to which it is attached.
  • Data on the Instance store persists when an instance is rebooted.
  • However, the data on the instance store does not persist if the
    • underlying disk drive fails
    • instance terminates
    • instance hibernates
    • instance stops i.e. if the EBS-backed instance with instance store volumes attached is stopped
  • Stopping, hibernating, or terminating an instance causes every block of storage in the instance store to be reset.
  • If an AMI is created from an Instance with an Instance store volume, the data on its instance store volume isn’t preserved.

Instance Store Volumes

  • Instance type of an instance determines the size of the instance store available for the instance and the type of hardware used for the instance store volumes.
  • Instance store volumes are included as part of the instance’s hourly cost.
  • Some instance types use solid-state drives (SSD) to deliver very high random I/O performance, which is a good option when storage with very low latency is needed, but the data does not need to be persisted when the instance terminates or architecture is fault tolerant.

Instance Store Volumes with EC2 instances

  • EBS volumes and instance store volumes for an instance are specified using a block device mapping (see the sketch after this list).
  • Instance store volume
    • can be attached to an EC2 instance only when the instance is launched.
    • cannot be detached and reattached to a different instance.
  • After an instance is launched, the instance store volumes for the instance should be formatted and mounted before they can be used.
  • Root volume of an instance store-backed instance is mounted automatically
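
A hedged boto3 sketch of mapping an instance store volume at launch; the AMI ID and instance type are placeholders, and the instance type must actually provide instance store volumes.

```python
import boto3

ec2 = boto3.client("ec2")

ec2.run_instances(
    ImageId="ami-0abcdEXAMPLE",  # placeholder AMI
    InstanceType="m5d.large",    # placeholder type with instance store
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[
        # Map the first instance store volume; instance store volumes can
        # only be attached at launch. (On NVMe-based types the volumes are
        # attached automatically; explicit mappings matter for older types.)
        {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
    ],
)
```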

Instance Store Optimizing Writes

  • Because of the way that EC2 virtualizes disks, the first write to any location on an instance store volume performs more slowly than subsequent writes.
  • Amortizing (gradually writing off) this cost over the lifetime of the instance might be acceptable.
  • However, if high disk performance is required, AWS recommends initializing the drives by writing once to every drive location before production use

EBS vs Instance Store

Refer blog post @ EBS vs Instance Store

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Please select the most correct answer regarding the persistence of the Amazon Instance Store
    1. The data on an instance store volume persists only during the life of the associated Amazon EC2 instance
    2. The data on an instance store volume is lost when the security group rule of the associated instance is changed.
    3. The data on an instance store volume persists even after associated Amazon EC2 instance is deleted
  2. A user has launched an EC2 instance from an instance store backed AMI. The user has attached an additional instance store volume to the instance. The user wants to create an AMI from the running instance. Will the AMI have the additional instance store volume data?
    1. Yes, the block device mapping will have information about the additional instance store volume
    2. No, since the instance store backed AMI can have only the root volume bundled
    3. It is not possible to attach an additional instance store volume to the existing instance store backed AMI instance
    4. No, since this is ephemeral storage it will not be a part of the AMI
  3. When an EC2 instance that is backed by an S3-based AMI is terminated, what happens to the data on the root volume?
    1. Data is automatically saved as an EBS volume.
    2. Data is automatically saved as an EBS snapshot.
    3. Data is automatically deleted
    4. Data is unavailable until the instance is restarted.
  4. A user has launched an EC2 instance from an instance store backed AMI. If the user restarts the instance, what will happen to the ephemeral storage data?
    1. All the data will be erased but the ephemeral storage will stay connected
    2. All data will be erased and the ephemeral storage is released
    3. It is not possible to restart an instance launched from an instance store backed AMI
    4. The data is preserved
  5. When an EC2 EBS-backed instance is stopped, what happens to the data on any ephemeral store volumes?
    1. Data will be deleted and will no longer be accessible
    2. Data is automatically saved in an EBS volume.
    3. Data is automatically saved as an EBS snapshot
    4. Data is unavailable until the instance is restarted
  6. A user has launched an EC2 Windows instance from an instance store backed AMI. The user has also set the Instance initiated shutdown behavior to stop. What will happen when the user shuts down the OS?
    1. It will not allow the user to shutdown the OS when the shutdown behavior is set to Stop
    2. It is not possible to set the termination behavior to Stop for an Instance store backed AMI instance
    3. The instance will stay running but the OS will be shutdown
    4. The instance will be terminated
  7. Which of the following will occur when an EC2 instance in a VPC (Virtual Private Cloud) with an associated Elastic IP is stopped and started? (Choose 2 answers)
    1. The Elastic IP will be dissociated from the instance
    2. All data on instance-store devices will be lost
    3. All data on EBS (Elastic Block Store) devices will be lost
    4. The ENI (Elastic Network Interface) is detached
    5. The underlying host for the instance is changed

References