AWS S3 Versioning

S3 Versioning

  • S3 Versioning helps to keep multiple variants of an object in the same bucket and can be used to preserve, retrieve, and restore every version of every object stored in the S3 bucket.
  • S3 Object Versioning can be used to protect from unintended overwrites and accidental deletions
  • Versioning stores each version as a complete copy of the object, and charges accrue for every version. For example, a 1 GB file with 5 versions that differ only slightly consumes 5 GB of S3 storage, and you are charged for all of it.
  • Buckets can be in one of the three states
    • Unversioned (the default)
    • Versioning-enabled
    • Versioning-suspended
  • S3 Object Versioning is not enabled by default and has to be explicitly enabled for each bucket.
  • Once enabled, versioning cannot be disabled; it can only be suspended.
  • Versioning enabled on a bucket applies to all the objects within the bucket
  • Permissions are set at the version level. Each version has its own object owner; an AWS account that creates the object version is the owner. So, you can set different permissions for different versions of the same object.
  • Irrespective of the Versioning, each object in the bucket has a version.
    • For Non Versioned bucket, the version ID for each object is null
    • For Versioned buckets, a unique version ID is assigned to each object
  • With Versioning, version ID forms a key element to define the uniqueness of an object within a bucket along with the bucket name and object key
  • After enabling versioning on a bucket for the first time, it may take up to 15 minutes for the change to fully propagate. During this time, GET requests for objects created or updated after enabling versioning may result in HTTP 404 NoSuchKey errors. AWS recommends waiting 15 minutes after enabling versioning before issuing write operations (PUT or DELETE) on objects in the bucket.
  • Objects that are stored in the bucket before versioning is enabled have a version ID of null. When versioning is enabled, existing objects do not change; only how S3 handles future requests changes.
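The bucket states listed above can be sketched as a small state machine. This is a toy model rather than an AWS API: the `VersioningState` enum and `can_transition` function are illustrative names, and the model encodes only the one rule stated in the text, namely that a bucket cannot return to the unversioned state.

```python
from enum import Enum


class VersioningState(Enum):
    UNVERSIONED = "Unversioned"   # default for a new bucket
    ENABLED = "Enabled"
    SUSPENDED = "Suspended"


def can_transition(current, target):
    """Return True if the state change is allowed in this simplified model.

    Only the rule from the text is encoded: once a bucket has left the
    Unversioned state, it can never go back to it.
    """
    if target is VersioningState.UNVERSIONED:
        # Staying unversioned is a no-op; returning to it is not possible.
        return current is VersioningState.UNVERSIONED
    return True
```

For example, `can_transition(VersioningState.ENABLED, VersioningState.SUSPENDED)` is allowed, while any move from Enabled or Suspended back to Unversioned is rejected.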

Object Retrieval

  • For Non Versioned bucket
    • An Object retrieval always returns the only object available.
  • For Versioned bucket
    • An object retrieval returns the current (latest) version of the object.
    • Non-Current objects can be retrieved by specifying the version ID.

Object Addition

  • For Non Versioned bucket
    • If an object with the same key is uploaded again it overwrites the object
  • For Versioned bucket
    • If an object with the same key is uploaded, the newly uploaded object becomes the current version and the previous object becomes the non-current version.
    • A non-current versioned object can be retrieved and restored hence protecting against accidental overwrites
    • If S3 receives multiple write requests for the same object simultaneously, it stores all of those objects as separate versions.

Object Deletion

  • For Non Versioned bucket
    • An object is permanently deleted and cannot be recovered
  • For the Versioned bucket,
    • All versions remain in the bucket and Amazon inserts a delete marker which becomes the Current version
    • A non-current versioned object can be retrieved and restored hence protecting against accidental deletions
    • If an Object with a specific version ID is deleted, a permanent deletion happens and the object cannot be recovered

Delete marker

  • Delete Marker object does not have any data or ACL associated with it, just the key and the version ID
  • An object retrieval for a key whose current version is a delete marker returns a 404
  • Only a DELETE operation is allowed on the Delete Marker object
  • If the Delete marker object is deleted by specifying its version ID, the previous non-current version object becomes the current version object
  • If a DELETE request is issued for an object whose current version is a delete marker, the existing delete marker is not deleted; another delete marker is added
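The retrieval, addition, deletion, and delete-marker rules above can be illustrated with a small in-memory model. It is a sketch under simplifying assumptions, not how S3 is implemented: the `VersionedBucket` class, its method names, and the `v1`, `v2`, ... version IDs are all hypothetical, and every failed lookup is reported with a single `NotFound` error.

```python
import itertools


class NotFound(LookupError):
    """Stands in for the error S3 returns when nothing can be retrieved."""


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not an AWS API)."""

    def __init__(self):
        self._versions = {}             # key -> list of versions, newest last
        self._ids = itertools.count(1)  # hypothetical version ID generator

    def _new_version(self, key, data, delete_marker):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append(
            {"id": version_id, "data": data, "delete_marker": delete_marker})
        return version_id

    def put(self, key, data):
        # Every upload becomes the current version; older ones become noncurrent.
        return self._new_version(key, data, delete_marker=False)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # Plain GET: return the current version unless it is a delete marker.
            if not versions or versions[-1]["delete_marker"]:
                raise NotFound(key)
            return versions[-1]["data"]
        for version in versions:
            if version["id"] == version_id and not version["delete_marker"]:
                return version["data"]
        raise NotFound(f"{key} ({version_id})")

    def delete(self, key, version_id=None):
        if version_id is None:
            # Simple DELETE: nothing is removed, a delete marker becomes current.
            return self._new_version(key, None, delete_marker=True)
        # DELETE with a version ID permanently removes that version,
        # whether it is an object version or a delete marker.
        self._versions[key] = [
            v for v in self._versions.get(key, []) if v["id"] != version_id]
        return version_id
```

In this model, deleting a key and then deleting the returned delete marker by its version ID makes the previous version current again, which mirrors the behavior described in the list above.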

[Figure: S3 Versioning - Delete Operation]

Restoring Previous Versions

  • Copy a previous version of the object into the same bucket. The copied object becomes the current version of that object and all object versions are preserved – Recommended as it keeps all the versions.
  • Permanently delete the current version of the object. When you delete the current object version, you, in effect, turn the previous version into the current version of that object.
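The recommended copy-based restore can be sketched as a helper that builds the arguments for an S3 CopyObject request. The function name is an illustrative assumption; the parameter names follow the boto3 `copy_object` call, where `CopySource` may include a `VersionId`. The block only builds the arguments and does not contact AWS.

```python
def restore_by_copy_params(bucket, key, version_id):
    """Build keyword arguments for copying an older version onto the same key.

    The copy becomes the new current version, and all existing versions
    (including the one copied from) stay in the bucket.
    """
    return {
        "Bucket": bucket,   # destination bucket (same as the source here)
        "Key": key,         # destination key (same as the source here)
        "CopySource": {"Bucket": bucket, "Key": key, "VersionId": version_id},
    }


# With boto3 installed and credentials configured, the result could be used as:
#   boto3.client("s3").copy_object(**restore_by_copy_params(bucket, key, version_id))
```

Copying is usually preferred over deleting the current version because it is not destructive: the version that was "rolled back" remains available.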

Versioning Suspended Bucket

  • Versioning can be suspended to stop accruing new versions of the same object in a bucket.
  • Existing objects in the bucket do not change and only future requests behavior changes.
  • An object with version ID null is added for each new object addition.
  • For each object addition with the same key name, the object with the version ID null is overwritten.
  • An object retrieval request will always return the current version of the object.
  • A DELETE request without a version ID permanently deletes the object with version ID null, if one exists, and inserts a delete marker.
  • If the bucket does not have an object with version ID null for that key, the DELETE request removes nothing and still inserts a delete marker.
  • A DELETE request can still be issued with a specific version ID to permanently delete any previously stored version.
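The suspended-state rules above can be sketched for a single key as follows. This is a toy model with hypothetical names (`SuspendedKey`, `put`, `delete`, `current`), intended only to show how the null version is overwritten while versions created earlier are left alone.

```python
class SuspendedKey:
    """Toy model of one key in a versioning-suspended bucket (not an AWS API)."""

    def __init__(self, existing_versions=None):
        # Versions created while versioning was enabled, oldest first.
        self.versions = list(existing_versions or [])

    def _drop_null(self):
        self.versions = [v for v in self.versions if v["id"] != "null"]

    def put(self, data):
        # New writes always get version ID null and replace any earlier null version.
        self._drop_null()
        self.versions.append({"id": "null", "data": data, "delete_marker": False})

    def delete(self, version_id=None):
        if version_id is None:
            # Removes the null version (if any) and inserts a null delete marker.
            self._drop_null()
            self.versions.append({"id": "null", "data": None, "delete_marker": True})
        else:
            # A version-specific DELETE permanently removes that version.
            self.versions = [v for v in self.versions if v["id"] != version_id]

    def current(self):
        return self.versions[-1] if self.versions else None
```

Starting from a key that already has a version `v1`, two successive `put` calls leave exactly two entries (`v1` and `null`), and a plain `delete` replaces the null object with a null delete marker while `v1` survives.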

S3 Versioning with S3 Lifecycle

  • S3 Lifecycle can be used to manage versioned objects and control storage costs by automatically transitioning or expiring noncurrent versions.
  • NoncurrentVersionExpiration action permanently deletes noncurrent object versions after a specified number of days.
  • NoncurrentVersionTransition action transitions noncurrent versions to a cheaper storage class (e.g., S3 Standard-IA, S3 Glacier) after a specified number of days.
  • NewerNoncurrentVersions parameter allows retaining a specific number of noncurrent versions (up to 100) before lifecycle actions apply. This helps retain only the most recent N versions for recovery while expiring older ones.
  • If you have an object expiration lifecycle configuration in your unversioned bucket and you want to maintain the same permanent delete behavior when you enable versioning, you must add a noncurrent expiration configuration.
  • Lifecycle rules help control versioning storage costs by automatically cleaning up old versions that are no longer needed.
  • MFA Delete cannot be used with lifecycle configurations.
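As an illustration of the noncurrent-version actions above, the following sketch builds a lifecycle configuration of the shape accepted by the boto3 `put_bucket_lifecycle_configuration` call. The function name, rule ID, and default values are assumptions chosen for the example; the block only constructs the dictionary and does not call AWS.

```python
def noncurrent_version_rule(expire_after_days, keep_newest,
                            transition_after_days=None,
                            transition_class="STANDARD_IA"):
    """Build a lifecycle configuration that manages noncurrent versions."""
    if not 1 <= keep_newest <= 100:
        raise ValueError("NewerNoncurrentVersions must be between 1 and 100")
    rule = {
        "ID": "manage-noncurrent-versions",   # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},              # empty prefix: every object
        "NoncurrentVersionExpiration": {
            "NoncurrentDays": expire_after_days,
            "NewerNoncurrentVersions": keep_newest,
        },
    }
    if transition_after_days is not None:
        # Move noncurrent versions to a cheaper class before they expire.
        rule["NoncurrentVersionTransitions"] = [
            {"NoncurrentDays": transition_after_days,
             "StorageClass": transition_class}
        ]
    return {"Rules": [rule]}


# Example: keep the 3 newest noncurrent versions, move noncurrent versions to
# Standard-IA after 30 days, and expire the rest after 90 days.
config = noncurrent_version_rule(90, 3, transition_after_days=30)
# With boto3 this could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```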

S3 Object Lock

  • S3 Object Lock provides write-once-read-many (WORM) protection for S3 objects.
  • S3 Object Lock requires versioning to be enabled on the bucket. When Object Lock is enabled, versioning is automatically enabled and cannot be suspended.
  • Object Lock prevents locked object versions from being permanently deleted or overwritten.
  • Object Lock provides two retention modes:
    • Governance Mode – Users with specific IAM permissions (s3:BypassGovernanceRetention) can override or remove the retention settings. Provides protection against most users but allows authorized overrides.
    • Compliance Mode – No user, including the root account, can overwrite or delete a protected object version during the retention period. The retention mode cannot be changed, and the retention period cannot be shortened.
  • Legal Hold provides the same protection as a retention period but has no expiration date. It remains in place until explicitly removed by a user with the s3:PutObjectLegalHold permission.
  • Object Lock can be enabled when a bucket is created. Since November 2023 it can also be enabled on an existing bucket that has versioning enabled, without contacting AWS Support.
  • S3 Object Lock works at the individual object version level.
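The interaction of retention modes and legal holds can be summarized as a decision function. This is a simplified model of the rules listed above, not an AWS API; the function name and parameters are illustrative, and the mode strings follow the `GOVERNANCE` and `COMPLIANCE` values used by the S3 API.

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, legal_hold, now,
                       bypass_governance=False):
    """Return True if permanently deleting an object version would be allowed."""
    if legal_hold:
        return False                  # a legal hold blocks deletion until removed
    if mode is None or retain_until is None or now >= retain_until:
        return True                   # no retention set, or retention has expired
    if mode == "GOVERNANCE":
        return bypass_governance      # requires s3:BypassGovernanceRetention
    if mode == "COMPLIANCE":
        return False                  # nobody, including root, can delete
    raise ValueError(f"unknown retention mode: {mode}")


# Example: a version locked in Compliance mode until 2032 cannot be deleted in 2025.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
until = datetime(2032, 1, 1, tzinfo=timezone.utc)
allowed = can_delete_version("COMPLIANCE", until, False, now, bypass_governance=True)
```

Here `allowed` is `False`: the bypass permission only matters in Governance mode.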

S3 Versioning with S3 Replication

  • S3 Replication (both Cross-Region Replication and Same-Region Replication) requires versioning to be enabled on both the source and destination buckets.
  • Live replication automatically replicates new and updated objects as they are written to the source bucket.
  • S3 Batch Replication can replicate existing objects that were added before replication was configured.
  • Replication replicates version-specific metadata including version ID, storage class, and retention information.
  • Delete markers can optionally be replicated to the destination bucket.
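A replication rule with optional delete marker replication might look like the following sketch. It builds a dictionary of the shape accepted by the boto3 `put_bucket_replication` call; the function name, rule ID, and the ARNs in the example are placeholders, and both buckets are assumed to already have versioning enabled.

```python
def replication_config(role_arn, destination_bucket_arn,
                       replicate_delete_markers=False):
    """Build a single-rule replication configuration covering all objects."""
    return {
        "Role": role_arn,   # IAM role that S3 assumes to replicate objects
        "Rules": [{
            "ID": "replicate-all",            # hypothetical rule name
            "Priority": 1,
            "Status": "Enabled",
            "Filter": {"Prefix": ""},          # empty prefix: every object
            "DeleteMarkerReplication": {
                "Status": "Enabled" if replicate_delete_markers else "Disabled"
            },
            "Destination": {"Bucket": destination_bucket_arn},
        }],
    }


# Placeholder ARNs for illustration only.
config = replication_config(
    "arn:aws:iam::111122223333:role/example-replication-role",
    "arn:aws:s3:::example-destination-bucket",
    replicate_delete_markers=True,
)
# With boto3 this could be applied to the source bucket as:
#   boto3.client("s3").put_bucket_replication(
#       Bucket="example-source-bucket", ReplicationConfiguration=config)
```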

MFA Delete

  • Additional security can be enabled by configuring a bucket to enable MFA (Multi-Factor Authentication) for the deletion of objects.
  • With MFA Delete enabled, additional authentication is required for the following operations
    • Changing the versioning state of the bucket
    • Permanently deleting an object version
  • MFA Delete requires two forms of authentication: security credentials + the six-digit code from an approved MFA device.
  • MFA Delete can be enabled on a bucket to ensure that data in the bucket cannot be accidentally deleted
  • The bucket owner (the AWS account that created the bucket) and all authorized IAM users can enable versioning, but only the bucket owner (root account) can enable MFA Delete.
  • However, MFA Delete by itself does not prevent deletion or allow restoration; it only adds an authentication requirement.
  • MFA Delete cannot be enabled using the AWS Management Console. You must use the AWS Command Line Interface (AWS CLI) or the API.
  • MFA Delete cannot be used with lifecycle configurations.
  • To identify buckets that have MFA Delete enabled, you can use Amazon S3 Storage Lens metrics.
  • Both hardware and virtual MFA devices can be used with MFA Delete.
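Because MFA Delete is configured through the API, a request has to carry the device serial number and the current code. The sketch below builds the arguments for the boto3 `put_bucket_versioning` call, whose `MFA` parameter is the serial number, a space, and the code. The function name is an assumption, the values passed in are the caller's own, and the block does not contact AWS.

```python
def mfa_delete_params(bucket, mfa_serial, mfa_code, enable=True):
    """Build keyword arguments for a PutBucketVersioning request with MFA Delete."""
    if not (mfa_code.isdigit() and len(mfa_code) == 6):
        raise ValueError("MFA code must be six digits")
    return {
        "Bucket": bucket,
        "MFA": f"{mfa_serial} {mfa_code}",   # serial number, space, current code
        "VersioningConfiguration": {
            "Status": "Enabled",
            "MFADelete": "Enabled" if enable else "Disabled",
        },
    }


# With boto3 and the bucket owner's (root) credentials, this could be sent as:
#   boto3.client("s3").put_bucket_versioning(**mfa_delete_params(...))
```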

Versioning Cost Optimization Best Practices

  • Each version of an object is the entire object (not a diff/delta), so storage costs increase linearly with version count.
  • Use S3 Lifecycle rules with NoncurrentVersionExpiration to automatically delete old versions after a defined retention period.
  • Use NewerNoncurrentVersions to retain only the most recent N noncurrent versions (e.g., keep 3 latest versions).
  • Transition noncurrent versions to cheaper storage classes (S3 Standard-IA, S3 Glacier) using NoncurrentVersionTransition before expiring them.
  • Unoptimized versioning can reportedly account for 15-25% of total storage costs in production environments; the actual share depends on the workload.
  • Monitor versioning storage using S3 Storage Lens and S3 Metadata to identify buckets with excessive noncurrent versions.
  • Consider suspending versioning (not disabling) if the use case no longer requires version history, but note existing versions remain and continue to incur charges.
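The linear cost growth is easy to check with a little arithmetic. The helper below is a rough sketch with a hypothetical name; it assumes every version of a key has the same size, which real workloads will only approximate.

```python
def versioned_storage_gb(object_gb, total_versions, keep_noncurrent=None):
    """Storage used by one key when every version is a full copy.

    keep_noncurrent models a lifecycle rule that retains only the N newest
    noncurrent versions; None means no lifecycle cleanup.
    """
    if total_versions < 1:
        raise ValueError("at least the current version must exist")
    noncurrent = total_versions - 1
    if keep_noncurrent is not None:
        noncurrent = min(noncurrent, keep_noncurrent)
    return object_gb * (1 + noncurrent)
```

For a 1 GB object with 5 versions this gives 5 GB without cleanup, and 4 GB when only the 3 newest noncurrent versions are retained.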

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which set of Amazon S3 features helps to prevent and recover from accidental data loss?
    1. Object lifecycle and service access logging
    2. Object versioning and Multi-factor authentication
    3. Access controls and server-side encryption
    4. Website hosting and Amazon S3 policies
  2. You use S3 to store critical data for your company. Several users within your group currently have full permissions to your S3 buckets. You need to come up with a solution that does not impact your users and also protect against the accidental deletion of objects. Which two options will address this issue? Choose 2 answers
    1. Enable versioning on your S3 Buckets
    2. Configure your S3 Buckets with MFA delete
    3. Create a Bucket policy and only allow read only permissions to all users at the bucket level
    4. Enable object life cycle policies and configure the data older than 3 months to be archived in Glacier
  3. To protect S3 data from both accidental deletion and accidental overwriting, you should
    1. enable S3 versioning on the bucket
    2. access S3 data using only signed URLs
    3. disable S3 delete using an IAM bucket policy
    4. enable S3 Reduced Redundancy Storage
    5. enable Multi-Factor Authentication (MFA) protected access
  4. A user has not enabled versioning on an S3 bucket. What will be the version ID of the object inside that bucket?
    1. 0
    2. There will be no version attached
    3. Null
    4. Blank
  5. A user is trying to find the state of an S3 bucket with respect to versioning. Which of the below mentioned states AWS will not return when queried?
    1. versioning-enabled
    2. versioning-suspended
    3. unversioned
    4. versioned
  6. A company wants to ensure that objects stored in S3 cannot be deleted or overwritten by any user, including the root account, for a period of 7 years to meet regulatory compliance. Which S3 features should be used? [Select 2]
    1. S3 Versioning
    2. S3 Object Lock in Compliance mode
    3. S3 Object Lock in Governance mode
    4. S3 MFA Delete
    5. S3 Lifecycle policies
  7. A company has enabled versioning on an S3 bucket but is concerned about increasing storage costs. Which feature allows them to automatically retain only the 3 most recent noncurrent versions and expire older ones?
    1. NoncurrentVersionExpiration with NoncurrentDays
    2. NoncurrentVersionExpiration with NewerNoncurrentVersions
    3. S3 Intelligent-Tiering
    4. S3 Object Lock retention period
  8. Which of the following S3 features requires versioning to be enabled on the bucket? [Select 2]
    1. S3 Cross-Region Replication
    2. S3 Object Lock
    3. S3 Transfer Acceleration
    4. S3 Event Notifications
    5. S3 Static Website Hosting
  9. After enabling versioning on an S3 bucket for the first time, a developer immediately uploads an object but receives an HTTP 404 NoSuchKey error when trying to retrieve it. What is the most likely cause?
    1. The bucket policy does not allow GetObject
    2. The object was uploaded to the wrong bucket
    3. Versioning changes may take up to 15 minutes to propagate after first enablement
    4. The object is encrypted and the developer lacks KMS permissions
  10. A company wants to use S3 Object Lock to protect sensitive data. Which statement is correct about the relationship between Object Lock and Versioning?
    1. Object Lock can be enabled without versioning
    2. Object Lock automatically disables versioning
    3. Object Lock automatically enables versioning, and versioning cannot be suspended while Object Lock is active
    4. Object Lock requires versioning to be suspended


AWS S3 Best Practices

S3 Best Practices

Performance

Multiple Concurrent PUTs/GETs

  • S3 scales to support very high request rates. S3 automatically partitions the buckets as needed to support higher request rates.
  • S3 can achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per partitioned prefix in a bucket.
  • There are no limits to the number of prefixes in a bucket, so throughput can be scaled horizontally by parallelizing reads or writes across different prefixes.
  • Random prefix key naming is NO LONGER required for performance optimization.
    • Since July 2018, S3 automatically handles internal partitioning to support high request rates.
    • Logical or sequential naming patterns can be used without any performance implications.
    • S3 dynamically optimizes performance in response to sustained high request rates.
  • If a workload experiences sudden bursts above the per-prefix limit, S3 will return HTTP 503 (Slow Down) responses temporarily while it repartitions. Gradually ramping up request rates (prefix-level warm-up) helps avoid throttling for new prefixes.
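Since the request-rate figures apply per prefix, a rough estimate of how many prefixes a workload should be spread across follows from simple division. This is a back-of-the-envelope sketch with a hypothetical function name; it assumes requests are distributed evenly across prefixes and treats the documented minimums as the per-prefix capacity.

```python
PUT_PER_PREFIX = 3500   # PUT/COPY/POST/DELETE requests per second per prefix
GET_PER_PREFIX = 5500   # GET/HEAD requests per second per prefix


def prefixes_needed(write_rps, read_rps):
    """Estimate the number of prefixes needed for the given request rates."""
    # Integer ceiling division avoids floating-point rounding surprises.
    for_writes = (write_rps + PUT_PER_PREFIX - 1) // PUT_PER_PREFIX
    for_reads = (read_rps + GET_PER_PREFIX - 1) // GET_PER_PREFIX
    return max(1, for_writes, for_reads)
```

A workload of 150 PUT requests per second fits comfortably in a single prefix, while 55,000 GET requests per second would call for roughly ten.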

S3 Express One Zone (High-Performance Storage)

  • S3 Express One Zone is a high-performance storage class (launched Nov 2023) purpose-built for latency-sensitive applications.
    • Delivers consistent single-digit millisecond first-byte read and write latency — up to 10x faster than S3 Standard.
    • Reduces request costs by up to 50% compared to S3 Standard.
    • Scales to process millions of requests per minute.
    • Uses directory buckets stored in a single Availability Zone.
    • Ideal for ML model training, interactive analytics, media content creation, and high-frequency trading.
  • AWS announced up to 85% price reductions for S3 Express One Zone in April 2025.

Transfer Acceleration

  • S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between the client and an S3 bucket.
  • Transfer Acceleration takes advantage of CloudFront’s globally distributed edge locations. As the data arrives at an edge location, data is routed to S3 over an optimized network path.
  • Use the S3 Transfer Acceleration Speed Comparison tool to determine if it would benefit your use case.

GET-intensive Workloads

  • CloudFront can be used for performance optimization and can help by
    • distributing content with low latency and high data transfer rate.
    • caching the content and thereby reducing the number of direct requests to S3
    • providing multiple endpoints (Edge locations) for data availability
  • CloudFront RTMP distributions were deprecated on December 31, 2020. Use CloudFront Web distributions with HTTP-based streaming (HLS, DASH) for media delivery.
  • For fast data transport over long distances between a client and an S3 bucket, use S3 Transfer Acceleration, which routes data through CloudFront's globally distributed edge locations.

PUTs/GETs for Large Objects

  • Parallelizing PUT and GET requests improves upload and download performance and makes it easier to recover from failures
  • For PUTs, Multipart upload can help improve the uploads by
    • performing multiple uploads at the same time and maximizing network bandwidth utilization
    • quick recovery from failures, as only the part that failed to upload needs to be re-uploaded
    • ability to pause and resume uploads
    • begin an upload before the Object size is known
    • Recommended for objects larger than 100 MB; required for objects larger than 5 GB
  • For GETs, the Range HTTP header (byte-range fetches) can help improve the downloads by
    • allowing the object to be retrieved in parts instead of the whole object
    • quick recovery from failures, as only the part that failed to download needs to be retried
    • higher aggregate throughput by downloading parts in parallel
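Byte-range fetches depend on splitting an object into non-overlapping ranges. The helper below is a minimal sketch with a hypothetical name that produces the `Range` header values; each value could then be passed to a separate GET request (for example, the `Range` parameter of boto3 `get_object`) and the requests run in parallel.

```python
def byte_ranges(object_size, part_size):
    """Return HTTP Range header values that cover object_size bytes."""
    if object_size <= 0 or part_size <= 0:
        raise ValueError("sizes must be positive")
    ranges = []
    for start in range(0, object_size, part_size):
        # Range ends are inclusive, so the last byte of a part is start + size - 1.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges
```

For a 10-byte object and 4-byte parts this yields `bytes=0-3`, `bytes=4-7`, and `bytes=8-9`; if one part fails, only that range needs to be requested again.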

List Operations

  • Object key names are stored lexicographically in S3 indexes, making it hard to sort and manipulate the contents of LIST
  • S3 maintains a single lexicographically sorted list of indexes
  • Build and maintain a secondary index outside of S3 (e.g., in DynamoDB or RDS) to store, index, and query object metadata rather than performing LIST operations on S3
  • Use S3 Inventory reports (daily or weekly) as an alternative to LIST API calls for large buckets — more efficient and cost-effective for auditing or analytics workloads.

Security

  • Use Versioning
    • can be used to protect from unintended overwrites and deletions
    • allows the ability to retrieve and restore deleted objects or rollback to previous versions
  • Enable additional security by configuring a bucket to enable MFA (Multi-Factor Authentication) Delete
  • Versioning does not prevent bucket deletion, so the data must still be backed up; if the bucket is deleted accidentally or maliciously, the data is lost
  • Use S3 Object Lock for WORM (Write Once Read Many) protection
    • Prevents objects from being deleted or overwritten for a fixed period or indefinitely
    • Supports Governance mode (can be overridden with special permissions) and Compliance mode (cannot be overridden by anyone, including root)
    • Requires versioning to be enabled
    • Helps meet regulatory requirements (SEC, FINRA, CFTC)
  • Use the Same-Region Replication or Cross-Region Replication feature to back up data to a different bucket or Region
  • When using a VPC with S3, use VPC endpoints for S3, as they
    • are horizontally scaled, redundant, and highly available VPC components
    • help establish a private connection between the VPC and S3, so the traffic never leaves the Amazon network
    • come in two types: Gateway endpoints (free, for S3 and DynamoDB) and Interface endpoints (PrivateLink, for cross-region or on-premises access)

S3 Security Defaults (Since 2023)

  • Default Encryption: Since January 5, 2023, all new objects are automatically encrypted with SSE-S3 (AES-256) at no additional cost. You can override with SSE-KMS or SSE-C.
  • Block Public Access: Since April 2023, S3 Block Public Access is enabled by default and ACLs are disabled for all new buckets.
  • SSE-C Disabled by Default: New general purpose buckets automatically disable server-side encryption with customer-provided keys (SSE-C) as a security best practice.
  • Use S3 Access Grants for scalable, fine-grained access control — maps S3 permissions to corporate identities via IAM Identity Center.

Refer blog post @ S3 Security Best Practices

Cost

  • Optimize S3 storage cost by selecting an appropriate storage class for objects:
    • S3 Standard — frequently accessed data
    • S3 Intelligent-Tiering — data with unknown or changing access patterns (automatically moves objects between Frequent, Infrequent, Archive Instant, Archive, and Deep Archive access tiers)
    • S3 Standard-IA — infrequent access, rapid retrieval needed
    • S3 One Zone-IA — infrequent access, non-critical data
    • S3 Glacier Instant Retrieval — archive data needing millisecond access
    • S3 Glacier Flexible Retrieval — archive with minutes to hours retrieval
    • S3 Glacier Deep Archive — lowest cost, 12-48 hour retrieval
    • S3 Express One Zone — highest performance, single-digit ms latency
  • Configure appropriate Lifecycle Management rules to automatically transition objects to lower-cost storage classes and expire them when no longer needed.
  • Use S3 Intelligent-Tiering as the default storage class for data with unpredictable access patterns — no retrieval charges, automatic optimization.
  • Use S3 Storage Lens to get organization-wide visibility into storage usage and activity trends, identify cost optimization opportunities, and apply data protection best practices.
  • Use S3 Storage Class Analysis to identify the optimal lifecycle policy for transitioning data to the right storage class.

Data Integrity

  • Use Conditional Writes (launched August 2024) to prevent overwriting existing objects
    • Supports If-None-Match header to check for object existence before creating
    • Supports If-Match header to check ETag before updating
    • Eliminates the need for external locking mechanisms (e.g., DynamoDB) for multi-writer applications
    • Can be enforced at the bucket level using bucket policies (November 2024)
  • Use S3 Object Lock for immutable data protection (compliance, ransomware protection)
  • Enable S3 Versioning to preserve every version of every object
  • Use additional checksums (CRC32, CRC32C, SHA-1, SHA-256) for end-to-end data integrity validation during uploads
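The effect of the two conditional headers can be shown with a tiny in-memory store. This is a toy model, not the S3 implementation: the class and exception names are hypothetical, the ETag is just a SHA-256 hex digest chosen for the example, and every failed condition is reported with the same exception for simplicity.

```python
import hashlib


class PreconditionFailed(Exception):
    """Stands in for the error response S3 returns when a condition fails."""


class ConditionalStore:
    """Toy key-value store that mimics If-None-Match and If-Match on writes."""

    def __init__(self):
        self._objects = {}   # key -> (etag, data)

    def put(self, key, data, if_none_match=None, if_match=None):
        existing = self._objects.get(key)
        if if_none_match == "*" and existing is not None:
            # Create-only write: refuse to overwrite an existing object.
            raise PreconditionFailed(f"{key} already exists")
        if if_match is not None and (existing is None or existing[0] != if_match):
            # Compare-and-swap write: refuse if the object changed in between.
            raise PreconditionFailed(f"ETag mismatch for {key}")
        etag = hashlib.sha256(data).hexdigest()   # toy ETag for this model only
        self._objects[key] = (etag, data)
        return etag
```

With two writers racing to create the same key using `if_none_match="*"`, only the first succeeds; a writer that read an older ETag and passes it as `if_match` is rejected once another writer has updated the object.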

Tracking and Monitoring

  • Use S3 Event Notifications with Amazon EventBridge for advanced event-driven architectures
    • Supports filtering by object size, key name patterns, metadata, and event time
    • Can route events to more than 20 AWS service targets
    • More flexible than legacy S3 Event Notifications (which only support SNS, SQS, and Lambda)
  • Use CloudTrail for API-level logging — captures all S3 API calls for auditing and compliance
  • Use S3 Server Access Logging for detailed access records (object-level access patterns)
  • Use CloudWatch to monitor S3 buckets, tracking metrics such as object counts, bytes stored, request counts, and latency
  • Use S3 Storage Lens for organization-wide visibility across all accounts and buckets with actionable recommendations
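As an example of the filtering that EventBridge allows, the sketch below builds an event pattern that matches S3 "Object Created" events for one bucket, a key prefix, and a minimum object size. The function name and the example values are assumptions; the pattern uses EventBridge's `prefix` and `numeric` matching operators and assumes the bucket has EventBridge notifications turned on.

```python
import json


def s3_object_created_pattern(bucket, key_prefix, min_size_bytes):
    """Build an EventBridge event pattern for large uploads under a prefix."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket]},
            "object": {
                "key": [{"prefix": key_prefix}],
                "size": [{"numeric": [">=", min_size_bytes]}],
            },
        },
    }


# Hypothetical values; the JSON string is what a rule's EventPattern would contain.
pattern_json = json.dumps(
    s3_object_created_pattern("example-bucket", "uploads/", 1048576))
```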

S3 Monitoring and Auditing Best Practices

Refer blog post @ S3 Monitoring and Auditing Best Practices

AWS Certification Exam Practice Questions

  1. A media company produces new video files on-premises every day with a total size of around 100GB after compression. All files have a size of 1-2 GB and need to be uploaded to Amazon S3 every night in a fixed time window between 3am and 5am. Current upload takes almost 3 hours, although less than half of the available bandwidth is used. What step(s) would ensure that the file uploads are able to complete in the allotted time window?
    1. Increase your network bandwidth to provide faster throughput to S3
    2. Upload the files in parallel to S3 using multipart upload
    3. Pack all files into a single archive, upload it to S3, then extract the files in AWS
    4. Use AWS Import/Export to transfer the video files
  2. You are designing a web application that stores static assets in an Amazon Simple Storage Service (S3) bucket. You expect this bucket to immediately receive over 150 PUT requests per second. What should you do to ensure optimal performance?
    1. Use multi-part upload.
    2. Add a random prefix to the key names.
    3. Amazon S3 will automatically manage performance at this scale. (Since July 2018, S3 automatically partitions for high request rates. 150 PUT/s is well within the 3,500 PUT/s per prefix limit. Random prefixes are no longer needed.)
    4. Use a predictable naming scheme, such as sequential numbers or date time sequences, in the key names
  3. You have an application running on an Amazon Elastic Compute Cloud instance, that uploads 5 GB video objects to Amazon Simple Storage Service (S3). Video uploads are taking longer than expected, resulting in poor application performance. Which method will help improve performance of your application?
    1. Enable enhanced networking
    2. Use Amazon S3 multipart upload
    3. Leveraging Amazon CloudFront, use the HTTP POST method to reduce latency.
    4. Use Amazon Elastic Block Store Provisioned IOPs and use an Amazon EBS-optimized instance
  4. Which of the following methods gives you protection against accidental loss of data stored in Amazon S3? (Choose 2)
    1. Set bucket policies to restrict deletes, and also enable versioning
    2. By default, versioning is enabled on a new bucket so you don’t have to worry about it (Not enabled by default)
    3. Build a secondary index of your keys to protect the data (improves performance only)
    4. Back up your bucket to a bucket owned by another AWS account for redundancy
  5. A startup company hired you to help them build a mobile application that will ultimately store billions of image and videos in Amazon S3. The company is lean on funding, and wants to minimize operational costs, however, they have an aggressive marketing plan, and expect to double their current installation base every six months. Due to the nature of their business, they are expecting sudden and large increases to traffic to and from S3, and need to ensure that it can handle the performance needs of their application. What other information must you gather from this customer in order to determine whether S3 is the right option?
    1. You must know how many customers that company has today, because this is critical in understanding what their customer base will be in two years. (No. of customers do not matter)
    2. You must find out total number of requests per second at peak usage.
    3. You must know the size of the individual objects being written to S3 in order to properly design the key namespace. (Size does not relate to the key namespace design but the count does)
    4. In order to build the key namespace correctly, you must understand the total amount of storage needs for each S3 bucket. (S3 provides unlimited storage; the key namespace design would depend on the number of objects)
  6. A document storage company is deploying their application to AWS and changing their business model to support both free tier and premium tier users. The premium tier users will be allowed to store up to 200GB of data and free tier customers will be allowed to store only 5GB. The customer expects that billions of files will be stored. All users need to be alerted when approaching 75 percent quota utilization and again at 90 percent quota use. To support the free tier and premium tier users, how should they architect their application?
    1. The company should utilize an amazon simple workflow service activity worker that updates the users data counter in amazon dynamo DB. The activity worker will use simple email service to send an email if the counter increases above the appropriate thresholds.
    2. The company should deploy an amazon relational data base service relational database with a store objects table that has a row for each stored object along with size of each object. The upload server will query the aggregate consumption of the user in questions (by first determining the files store by the user, and then querying the stored objects table for respective file sizes) and send an email via Amazon Simple Email Service if the thresholds are breached. (Good Approach to use RDS but with so many objects might not be a good option)
    3. The company should write both the content length and the username of the files owner as S3 metadata for the object. They should then create a file watcher to iterate over each object and aggregate the size for each user and send a notification via Amazon Simple Queue Service to an emailing service if the storage threshold is exceeded. (List operations on S3 not feasible)
    4. The company should create two separated amazon simple storage service buckets one for data storage for free tier users and another for data storage for premium tier users. An amazon simple workflow service activity worker will query all objects for a given user based on the bucket the data is stored in and aggregate storage. The activity worker will notify the user via Amazon Simple Notification Service when necessary (List operations on S3 not feasible as well as SNS does not address email requirement)
  7. Your company hosts a social media website for storing and sharing documents. The web application allows users to upload large files while resuming and pausing the upload as needed. Currently, files are uploaded to your PHP front end backed by Elastic Load Balancing and an Auto Scaling fleet of Amazon Elastic Compute Cloud (EC2) instances that scale upon average of bytes received (NetworkIn). After a file has been uploaded, it is copied to Amazon Simple Storage Service (S3). Amazon EC2 instances use an AWS Identity and Access Management (IAM) role that allows Amazon S3 uploads. Over the last six months, your user base and scale have increased significantly, forcing you to increase the Auto Scaling group's Max parameter a few times. Your CFO is concerned about the rising costs and has asked you to adjust the architecture where needed to better optimize costs. Which architecture change could you introduce to reduce cost and still keep your web application secure and scalable?
    1. Replace the Autoscaling launch Configuration to include c3.8xlarge instances; those instances can potentially yield a network throughput of 10gbps. (no info of current size and might increase cost)
    2. Re-architect your ingest pattern, have the app authenticate against your identity provider as a broker fetching temporary AWS credentials from AWS Secure token service (GetFederation Token). Securely pass the credentials and s3 endpoint/prefix to your app. Implement client-side logic to directly upload the file to amazon s3 using the given credentials and S3 Prefix. (will not provide the ability to handle pause and restarts)
    3. Re-architect your ingest pattern, and move your web application instances into a VPC public subnet. Attach a public IP address for each EC2 instance (using the auto scaling launch configuration settings). Use Amazon Route 53 round robin records set and http health check to DNS load balance the app request this approach will significantly reduce the cost by bypassing elastic load balancing. (ELB is not the bottleneck)
    4. Re-architect your ingest pattern, have the app authenticate against your identity provider as a broker fetching temporary AWS credentials from AWS Secure token service (GetFederation Token). Securely pass the credentials and s3 endpoint/prefix to your app. Implement client-side logic that used the S3 multipart upload API to directly upload the file to Amazon s3 using the given credentials and s3 Prefix. (multipart allows one to start uploading directly to S3 before the actual size is known or complete data is downloaded)
  8. If an application is storing hourly log files from thousands of instances from a high traffic web site, which naming scheme would give optimal performance on S3?
    1. Sequential
    2. instanceID_log-HH-DD-MM-YYYY
    3. instanceID_log-YYYY-MM-DD-HH
    4. HH-DD-MM-YYYY-log_instanceID (HH will give some randomness to start with instead of instanceId where the first characters would be i-)
    5. YYYY-MM-DD-HH-log_instanceID

    📝 Note: Since July 2018, S3 no longer requires random prefixes for performance. S3 automatically partitions based on request patterns. However, this exam question may still appear as it tests understanding of the historical key naming optimization concept.

  9. A company wants to ensure that objects uploaded to their S3 bucket are never accidentally overwritten by concurrent writes from multiple application instances. Which S3 feature should they use? [Added 2024]
    1. S3 Versioning
    2. S3 Object Lock in Governance mode
    3. S3 Conditional Writes with If-None-Match header (Conditional writes (Aug 2024) allow checking object existence before creating, preventing accidental overwrites without external locking)
    4. S3 Bucket Policy with deny overwrite
  10. A company stores millions of objects in S3 with unpredictable access patterns. Some objects are accessed frequently for a few weeks, then rarely accessed again. Which storage class provides the most cost-effective solution without operational overhead? [Added 2024]
    1. S3 Standard with lifecycle policy to S3 Standard-IA
    2. S3 Intelligent-Tiering (Automatically moves objects between frequent, infrequent, and archive access tiers based on access patterns with no retrieval charges and no operational overhead)
    3. S3 One Zone-IA
    4. S3 Standard with manual storage class changes
  11. An application requires single-digit millisecond latency for read and write operations on objects stored in S3. The application processes millions of transactions per minute. Which S3 storage option provides the best performance? [Added 2024]
    1. S3 Standard with CloudFront caching
    2. S3 Standard with Transfer Acceleration
    3. S3 Express One Zone (Delivers consistent single-digit millisecond latency, up to 10x faster than S3 Standard, and supports millions of requests per minute. Uses directory buckets in a single AZ.)
    4. S3 Standard with provisioned capacity
