CloudFront can be used to distribute the content from an S3 bucket.
For an RTMP distribution, an S3 bucket is the only supported origin; custom origins cannot be used.
Using CloudFront rather than serving directly from S3 has the following benefits:
It can be more cost-effective if the objects are frequently accessed, because at higher usage the price for CloudFront data transfer is much lower than the price for S3 data transfer.
Downloads are faster with CloudFront than with S3 alone because the objects are stored closer to the users.
When using S3 as the origin for distribution and the bucket is moved to a different region, CloudFront can take up to an hour to update its records to include the change of region when both of the following are true:
Origin Access Identity (OAI) is used to restrict access to the bucket
Bucket is moved to an S3 region that requires Signature Version 4 for authentication
Origin Access Identity – OAI
Without additional restrictions, S3 origin objects must be granted public read permissions, so the objects are accessible both directly from S3 and through CloudFront.
Even though CloudFront does not expose the underlying S3 URL, users can learn it if it is shared directly or used by applications.
To use CloudFront signed URLs or signed cookies to control access to the objects, users must be prevented from accessing the S3 objects directly.
Users accessing S3 objects directly would
bypass the controls provided by CloudFront signed URLs or signed cookies, e.g., control over the date and time after which a user can no longer access the content and over the IP addresses that can be used to access the content.
make CloudFront access logs less useful, because the logs would be incomplete.
Origin Access Identity (OAI) can be used to prevent users from directly accessing objects from S3.
Origin access identity, which is a special CloudFront user, can be created and associated with the distribution.
S3 bucket/object permissions need to be configured to only provide access to the Origin Access Identity.
When users access the object through CloudFront, it uses the OAI to fetch the content on the user’s behalf, while direct access to the S3 object stays restricted.
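The bucket permission described above can be sketched as a bucket policy document. This is a minimal illustration only: the bucket name and OAI ID are hypothetical placeholders, and the helper `oai_bucket_policy` is not part of any AWS SDK. It just builds the JSON that grants `s3:GetObject` to the OAI principal.

```python
import json


def oai_bucket_policy(bucket_name: str, oai_id: str) -> dict:
    """Build a bucket policy that grants object read access to one OAI.

    The policy only adds the grant; the bucket must not also carry
    public-read grants, or direct S3 access would still be possible.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                "Principal": {
                    # ARN form used for CloudFront origin access identities.
                    "AWS": "arn:aws:iam::cloudfront:user/"
                           f"CloudFront Origin Access Identity {oai_id}"
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Hypothetical bucket name and OAI ID, for illustration only.
policy = oai_bucket_policy("example-training-videos", "E2EXAMPLE1OAI")
print(json.dumps(policy, indent=2))
```

The printed JSON is what would be attached to the bucket as its policy; attaching it is left out here because it needs AWS credentials.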
CloudFront with S3 Objects
CloudFront can be configured to add custom headers or override existing headers whenever it forwards a request to the origin, in order to
validate that the user is not accessing the origin directly, bypassing the CDN
identify the CloudFront distribution from which the request was forwarded, if more than one distribution is configured to use the same origin
support viewers that don't support CORS: configure CloudFront to always add the Origin header to requests forwarded to the origin, and configure the origin to return the Access-Control-Allow-Origin header for every request.
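The first use case, checking that a request came through the CDN, can be sketched on the origin side as follows. This assumes an origin that can inspect request headers (for example an application server; S3 itself does not run such code), and both the header name `x-origin-verify` and the secret value are hypothetical choices that would be configured on the distribution.

```python
import hmac

# Hypothetical header name and shared secret configured on the distribution.
EXPECTED_HEADER = "x-origin-verify"
EXPECTED_SECRET = "replace-with-a-long-random-value"


def came_through_cdn(headers: dict) -> bool:
    """Return True if the request carries the custom header with the secret."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get(EXPECTED_HEADER, "")
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(supplied.encode(), EXPECTED_SECRET.encode())


print(came_through_cdn({"X-Origin-Verify": EXPECTED_SECRET}))
print(came_through_cdn({"Host": "origin.example.com"}))
```

Requests that fail the check can then be rejected by the origin, so only traffic forwarded by the distribution is served.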
Adding & Updating Objects
Objects just need to be added to the origin; CloudFront starts distributing them when they are requested.
For objects served by CloudFront, the origin can be updated either by
overwriting the original object, or
creating a different version and updating the links exposed to the user.
For updating objects, versioning is recommended, e.g. give files or entire folders versioned names so that links can be changed when the objects are updated, forcing a refresh. With versioning:
there is no wait for an object to expire before CloudFront begins to serve a new version of it.
there is no inconsistency in the object served from different edge locations.
there is no charge to pay for object invalidation.
By default, objects are removed from the edge cache when they expire (TTL), and the latest object is fetched from the origin on the next request.
Cached objects can also be superseded or removed before they expire, using either
file or object versioning, to serve a different version of the object under a different name, or
invalidation of the object from the edge caches; on the next request, CloudFront returns to the origin to fetch the object.
Object or file versioning is recommended over invalidating objects if the objects need to be updated frequently, because versioning
lets you control which object a request returns even when the user has a version cached either locally or behind a corporate caching proxy.
makes it easier to analyze the results of object changes, as CloudFront access logs include the names of the objects.
provides a way to serve different versions to different users.
simplifies rolling forward and back between object revisions.
is less expensive, as there are no charges for invalidating objects.
e.g. change header-v1.jpg to header-v2.jpg
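The renaming scheme in the example above can be sketched as a small helper. The `-vN` suffix convention is taken from the header-v1.jpg example; the function name `next_version` and the regular expression are illustrative assumptions, not a CloudFront feature.

```python
import re

# Matches names such as "header-v1.jpg" or "images/header-v12.png".
_VERSIONED_NAME = re.compile(r"^(?P<stem>.+)-v(?P<num>\d+)(?P<ext>\.[^./]+)$")


def next_version(name: str) -> str:
    """Return the object name with its -vN suffix incremented by one."""
    match = _VERSIONED_NAME.match(name)
    if match is None:
        raise ValueError(f"not a versioned object name: {name!r}")
    bumped = int(match["num"]) + 1
    return f"{match['stem']}-v{bumped}{match['ext']}"


print(next_version("header-v1.jpg"))  # header-v2.jpg
```

After uploading the object under the new name, the links exposed to users are updated to point at it, so no cached copy of the old name is ever served for the new link.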
Invalidating objects from the cache
Objects in the cache can be invalidated explicitly before they expire to force a refresh.
Invalidation can target selected objects.
Invalidation can target multiple objects, e.g. all objects in a directory or all objects whose names begin with the same characters, by including the * wildcard at the end of the invalidation path.
Invalidation only clears CloudFront edge caches; if the object is also cached locally or behind a corporate caching proxy, the user might continue to see the old version until it expires from those caches.
The first 1,000 invalidation paths submitted per month are free; a fee is charged for each invalidation path over 1,000 in a month.
An invalidation path can be for a single object, e.g. /js/ab.js, or for multiple objects, e.g. /js/*; a wildcard path counts as a single path even if it invalidates thousands of objects.
For RTMP distributions, objects served cannot be invalidated.
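The path counting and wildcard rules above can be illustrated with a short sketch. The helper names are hypothetical, the matcher only models a trailing * wildcard, and the per-path price passed in the usage line is an assumed figure for illustration; current pricing should be checked before relying on it.

```python
from decimal import Decimal

FREE_PATHS_PER_MONTH = 1000  # free allotment stated in the notes above


def billable_paths(paths_submitted: int) -> int:
    """Number of invalidation paths in a month that exceed the free allotment."""
    return max(0, paths_submitted - FREE_PATHS_PER_MONTH)


def invalidation_cost(paths_submitted: int, price_per_path: Decimal) -> Decimal:
    """Monthly charge, given a caller-supplied (assumed) price per path."""
    return billable_paths(paths_submitted) * price_per_path


def path_matches(invalidation_path: str, object_path: str) -> bool:
    """Model a trailing-* wildcard: '/js/*' covers every object under '/js/'."""
    if invalidation_path.endswith("*"):
        return object_path.startswith(invalidation_path[:-1])
    return object_path == invalidation_path


# One wildcard path covers many objects but counts as a single path.
objects = ["/js/ab.js", "/js/lib/x.js", "/css/site.css"]
print([o for o in objects if path_matches("/js/*", o)])
print(invalidation_cost(1200, Decimal("0.005")))  # assumed price per path
```

The point of the sketch is the accounting: billing depends on the number of paths submitted, not on how many objects each path ends up invalidating.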
Partial Requests (Range GETs)
Partial requests using Range headers in a GET request help to download the object in smaller units, improving the efficiency of partial downloads and the recovery from partially failed transfers.
For a partial GET range request, CloudFront
checks the edge location cache for the requested range or the entire object and, if found, serves the requested range immediately.
if the requested range is not cached, forwards the request to the origin, and may request a larger range than the client requested to optimize performance.
if the origin supports the Range header, it returns the requested range and CloudFront passes that range on to the viewer.
if the origin does not support the Range header, it returns the complete object; CloudFront serves the entire object and caches it for future requests.
CloudFront then uses the cached entire object to serve any future Range GET requests.
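The last step, answering a Range request from an object that is already cached in full, can be sketched as below. This is a simplified model and not CloudFront's implementation: it handles a single `bytes=` range only, and it falls back to the whole object for anything it cannot parse, whereas real HTTP servers typically answer unsatisfiable ranges with status 416.

```python
from typing import Optional, Tuple


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Parse a single 'bytes=...' range into inclusive (start, end) offsets.

    Returns None when the header is malformed, lists several ranges,
    or cannot be satisfied for an object of the given size.
    """
    if size <= 0 or not header.startswith("bytes=") or "," in header:
        return None
    first, sep, last = header[len("bytes="):].partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form, e.g. "bytes=-3" means the last 3 bytes.
            count = int(last)
            if count <= 0:
                return None
            return max(0, size - count), size - 1
        start = int(first)
        end = int(last) if last else size - 1
    except ValueError:
        return None
    if start < 0 or start >= size or start > end:
        return None
    return start, min(end, size - 1)


def serve_from_cache(cached_object: bytes,
                     range_header: Optional[str]) -> Tuple[int, bytes]:
    """Answer a request from a fully cached object.

    Returns (206, slice) for a usable Range header, else (200, whole object).
    """
    if range_header is not None:
        span = parse_range(range_header, len(cached_object))
        if span is not None:
            start, end = span
            return 206, cached_object[start:end + 1]
    return 200, cached_object


cached = bytes(range(10))  # stand-in for an object already in the edge cache
status, body = serve_from_cache(cached, "bytes=2-5")
print(status, list(body))  # 206 [2, 3, 4, 5]
```

Because the whole object is held in the cache, any later range can be cut from it locally without another trip to the origin.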
AWS Certification Exam Practice Questions
Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
AWS services are updated every day, and both the answers and questions might be outdated soon, so research accordingly.
AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
Open to further feedback, discussion and correction.
You are building a system to distribute confidential training videos to employees. Using CloudFront, what method could be used to serve content that is stored in S3, but not publicly accessible from S3 directly?
Create an Origin Access Identity (OAI) for CloudFront and grant access to the objects in your S3 bucket to that OAI.
Add the CloudFront account security group “amazon-cf/amazon-cf-sg” to the appropriate S3 bucket policy.
Create an Identity and Access Management (IAM) User for CloudFront and grant access to the objects in your S3 bucket to that IAM User.
Create an S3 bucket policy that lists the CloudFront distribution ID as the Principal and the target bucket as the Amazon Resource Name (ARN).