AWS CloudFront

CloudFront

  • CloudFront is a fully managed, fast content delivery network (CDN) service that speeds up the distribution of static, dynamic web, or streaming content to end-users.
  • CloudFront delivers the content through a worldwide network of data centers called edge locations or Points of Presence (POPs).
  • CloudFront securely delivers data, videos, applications, and APIs to customers globally with low latency, and high transfer speeds, all within a developer-friendly environment.
  • CloudFront gives businesses and web application developers an easy and cost-effective way to distribute content with low latency and high data transfer speeds.
  • CloudFront speeds up the distribution of the content by routing each user request to the edge location that can best serve the content thus providing the lowest latency (time delay).
  • CloudFront uses the AWS backbone network, which reduces the number of network hops that users’ requests must pass through and so helps provide lower latency and higher data transfer rates.
  • CloudFront is a good choice for the distribution of frequently accessed static content that benefits from edge delivery, such as popular website images, videos, media files, or software downloads.

CloudFront Benefits

  • CloudFront eliminates the expense and complexity of operating a network of cache servers in multiple sites across the internet and eliminates the need to over-provision capacity in order to serve potential spikes in traffic.
  • CloudFront also provides increased reliability and availability because copies of objects are held in multiple edge locations around the world.
  • CloudFront keeps persistent connections with the origin servers so that those files can be fetched from the origin servers as quickly as possible.
  • CloudFront also uses techniques such as collapsing simultaneous viewer requests at an edge location for the same file into a single request to the origin server reducing the load on the origin.
  • CloudFront offers security capabilities such as field-level encryption and HTTPS support.
  • CloudFront seamlessly integrates with AWS Shield, AWS Web Application Firewall – WAF, and Route 53 to protect against multiple types of attacks including network and application layer DDoS attacks.
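
The request collapsing mentioned above can be sketched as follows. This is a minimal illustration and not CloudFront's implementation: the class `CollapsingFetcher`, the `slow_origin` stand-in, and the 0.3 second delay are all hypothetical. The first request for a path becomes the "leader" and fetches from the origin, while concurrent requests for the same path wait for the leader's result.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class CollapsingFetcher:
    """Collapse concurrent requests for the same path into one origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # hypothetical origin-fetch callable
        self._lock = threading.Lock()
        self._in_flight = {}              # path -> Future shared by all waiters

    def get(self, path):
        with self._lock:
            future = self._in_flight.get(path)
            leader = future is None
            if leader:
                # First request for this path: register a shared Future.
                future = Future()
                self._in_flight[path] = future
        if leader:
            try:
                future.set_result(self._fetch(path))
            except Exception as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[path]
        # Leader and followers all read the same result (or exception).
        return future.result()


def demo(viewers=5):
    origin_calls = []

    def slow_origin(path):                # stand-in for a real origin server
        origin_calls.append(path)
        time.sleep(0.3)                   # keep the fetch in flight while others arrive
        return f"body of {path}"

    fetcher = CollapsingFetcher(slow_origin)
    with ThreadPoolExecutor(max_workers=viewers) as pool:
        bodies = list(pool.map(fetcher.get, ["/logo.png"] * viewers))
    return bodies, origin_calls
```

Calling `demo()` simulates five simultaneous viewers. All of them receive the same body, while the origin should see a single request as long as they arrive while the first fetch is still in flight.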

Edge Locations & Regional Edge Caches

  • CloudFront Edge Locations or POPs make sure that popular content can be served quickly to the viewers.
  • CloudFront also has Regional Edge Caches that help bring more content closer to the viewers, even when the content is not popular enough to stay at a POP, to help improve performance for that content.
  • Regional Edge Caches are deployed globally, close to the viewers, and are located between the origin servers and the Edge Locations.
  • Regional edge caches support multiple Edge Locations and support a larger cache size so objects remain in the cache longer at the nearest regional edge cache location.
  • Regional edge caches help with all types of content, particularly content that tends to become less popular over time.
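
To illustrate why a larger cache keeps less popular objects longer, here is a small sketch using a least-recently-used (LRU) cache. The eviction policy CloudFront actually uses is not described here, so LRU is an assumption, and the capacities 2 and 100 are arbitrary.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the least recently used


def demo():
    pop = LRUCache(capacity=2)                # small cache at an edge location
    regional = LRUCache(capacity=100)         # larger regional edge cache
    for path in ["/old.jpg", "/new1.jpg", "/new2.jpg"]:
        pop.put(path, b"...")
        regional.put(path, b"...")
    # "/old.jpg" was pushed out of the small cache but survives in the large one.
    return pop.get("/old.jpg"), regional.get("/old.jpg")
```

In this sketch the older object falls out of the small POP cache once newer objects arrive, yet the regional cache can still serve it without going back to the origin.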

Configuration & Content Delivery

Configuration

  1. Origin servers need to be configured to get the files for distribution. An origin server stores the original, definitive version of the objects and can be an AWS hosted service, e.g. S3 or EC2, or an on-premises server.
  2. Files or objects can be added/uploaded to the Origin servers with public read permissions or permissions restricted to Origin Access Identity (OAI).
  3. Create a CloudFront distribution, which tells CloudFront which origin servers to get the files from when users request the files.
  4. CloudFront sends the distribution configuration to all the edge locations.
  5. The website can be used with the CloudFront provided domain name or a custom alternate domain name.
  6. Optionally, the distribution can be configured to limit access protocols and control caching behaviour, and the origin server can add headers to the files to set the TTL, i.e. the expiration time.
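
The configuration steps above can be sketched as a Python dictionary shaped like the `DistributionConfig` structure of the CloudFront API. The function name, bucket domain, OAI ID, and origin ID below are hypothetical, and the sketch only builds the dictionary without calling AWS. It omits cache settings (a cache policy or the legacy forwarding and TTL fields), so it would likely need more fields before it could be passed to something like boto3's `create_distribution(DistributionConfig=...)`.

```python
def build_distribution_config(bucket_domain, oai_id, caller_reference):
    """Build a minimal, incomplete DistributionConfig-style dict (illustration only)."""
    origin_id = "s3-origin"                       # hypothetical label for the origin
    return {
        "CallerReference": caller_reference,      # unique string chosen by the caller
        "Comment": "example distribution",
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [
                {
                    "Id": origin_id,
                    "DomainName": bucket_domain,
                    # Restrict bucket access to this Origin Access Identity.
                    "S3OriginConfig": {
                        "OriginAccessIdentity":
                            f"origin-access-identity/cloudfront/{oai_id}"
                    },
                }
            ],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": origin_id,          # which origin serves the default (*) path
            "ViewerProtocolPolicy": "redirect-to-https",
            # Cache policy / TTL settings intentionally omitted in this sketch.
        },
    }


config = build_distribution_config(
    bucket_domain="example-bucket.s3.amazonaws.com",   # hypothetical bucket
    oai_id="EXAMPLEOAIID",                              # hypothetical OAI ID
    caller_reference="example-2024-01-01",
)
```

The point of the shape is that the default cache behavior refers to an origin by its `Id`, which is how a distribution tells CloudFront where to fetch files from.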

Content delivery to Users

  1. When a user accesses the website, file, or object – the DNS routes the request to the CloudFront edge location that can best serve the user’s request with the lowest latency.
  2. CloudFront returns the object immediately if the requested object is present in the cache at the Edge location.
  3. If the requested object does not exist in the cache at the edge location, the POP typically goes to the nearest regional edge cache to fetch it.
  4. If the object is in the regional edge cache, CloudFront forwards it to the POP that requested it.
  5. For objects not cached at either the POP or the regional edge cache location, CloudFront requests the object from the origin server and returns it to the user via the regional edge cache and POP
  6. CloudFront begins to forward the object to the user as soon as the first byte arrives from the regional edge cache location.
  7. CloudFront also adds the object to the cache in the regional edge cache location in addition to the POP for the next time a viewer requests it.
  8. When the object reaches its expiration time, CloudFront forwards the next request for it to the origin server to check for a newer version. If the cached copy is still the latest, the origin returns 304 Not Modified and CloudFront keeps serving the cached object. If the origin has a newer version, it returns that version, which CloudFront serves to the user and caches.
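
The lookup order in the steps above (POP, then regional edge cache, then origin) can be sketched as below. This is a simplified model under stated assumptions: `CacheTier` and `serve` are hypothetical names, time is passed in as a plain number, and an expired entry is simply refetched rather than revalidated with the origin.

```python
class CacheTier:
    """One cache layer (a POP or a regional edge cache)."""

    def __init__(self):
        self.entries = {}                      # path -> (body, expires_at)

    def fresh(self, path, now):
        """Return the cached (body, expires_at) if present and not expired."""
        entry = self.entries.get(path)
        if entry is not None and entry[1] > now:
            return entry
        return None


def serve(path, pop, regional, origin, ttl, now):
    """Return (body, name of the tier that supplied it)."""
    entry = pop.fresh(path, now)
    if entry is not None:
        return entry[0], "pop"                 # step 2: served straight from the POP

    entry = regional.fresh(path, now)          # step 3: ask the regional edge cache
    source = "regional"
    if entry is None:
        entry = (origin(path), now + ttl)      # step 5: fall back to the origin
        source = "origin"
        regional.entries[path] = entry         # step 7: cache at the regional tier
    pop.entries[path] = entry                  # step 7: and at the POP
    return entry[0], source
```

A first request for a path reports `"origin"`, a repeat request within the TTL reports `"pop"`, and a request arriving at a POP that has lost its copy reports `"regional"` as long as the regional entry is still fresh.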

CloudFront Origins

  • Each origin is either an S3 bucket, a MediaStore container, a MediaPackage channel, or a custom origin like an EC2 instance or an HTTP server
  • For the S3 bucket, use the bucket URL or the static website endpoint URL, and the files either need to be publicly readable or secured using OAI.
  • For S3 origins only, access can be restricted using an Origin Access Identity to prevent direct access to the S3 objects.
  • For the HTTP server as the origin, the domain name of the resource needs to be mapped and files must be publicly readable.
  • A distribution can have multiple origins, e.g. one per bucket, with one or more cache behaviors that route requests to each origin. The path pattern in a cache behavior determines which requests are routed to the origin (e.g. an S3 bucket) that is associated with that cache behavior.
  • Origin Groups can be used to specify two origins to configure origin failover for high availability. Origin failover can be used to designate a primary origin plus a second origin that CloudFront automatically switches to when the primary origin returns specific HTTP status code failure responses.
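
Origin failover as described above can be sketched like this. The function name and the set of status codes are assumptions chosen for illustration. In CloudFront the failover status codes are part of the origin group configuration.

```python
# Assumed failover configuration: which primary responses trigger a switch.
FAILOVER_STATUS_CODES = frozenset({500, 502, 503, 504})


def fetch_with_failover(path, primary, secondary,
                        failover_codes=FAILOVER_STATUS_CODES):
    """Try the primary origin and use the secondary on a configured failure code.

    `primary` and `secondary` are hypothetical callables that take a path and
    return a (status_code, body) tuple.
    """
    status, body = primary(path)
    if status in failover_codes:
        return secondary(path)       # primary failed in a configured way
    return status, body              # any other response is passed through
```

Note that in this sketch a response such as 404 is returned as-is unless 404 is added to `failover_codes`, which mirrors the idea that only specific status codes trigger the switch.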

CloudFront Delivery Methods

Web distributions

  • supports both static and dynamic content, e.g. HTML, CSS, JS, images, etc., using HTTP or HTTPS.
  • supports multimedia content on-demand using progressive download and Apple HTTP Live Streaming (HLS).
  • supports a live event, such as a meeting, conference, or concert, in real-time. For live streaming, distribution can be created automatically using an AWS CloudFormation stack.
  • origin servers can be either an S3 bucket or an HTTP server, e.g. a web server or an AWS ELB.

RTMP distributions (Support Discontinued)

  • supports streaming of media files using Adobe Media Server and the Adobe Real-Time Messaging Protocol (RTMP)
  • must use an S3 bucket as the origin.
  • To stream media files using CloudFront, two types of files are needed
    • Media files
    • Media player, e.g. JW Player, Flowplayer, or Adobe Flash
  • End-users view the media files using the media player that is provided, not a media player installed locally on their computer or device.
  • When an end-user streams the media file, the media player begins to play the file content while the file is still being downloaded from CloudFront.
  • The media file is not stored locally on the end user’s system.
  • Two CloudFront distributions are required: a Web distribution for the media player and an RTMP distribution for the media files.
  • The media player and media files can be stored in the same S3 bucket or in different buckets.

Cache Behavior Settings

Path Patterns

  • Path Patterns help define which path the Cache behaviour would apply to.
  • A default (*) pattern is created, and multiple cache behaviors can be added with patterns that take priority over the default path.
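
Path-pattern routing can be sketched with Python's `fnmatch` module. The behavior list, origin names, and function name are hypothetical. `fnmatchcase` is used as a stand-in for CloudFront's matcher: it handles `*` and `?` case-sensitively, but it also treats `[...]` specially, which is an artifact of this sketch.

```python
from fnmatch import fnmatchcase


def select_origin(path, behaviors, default_origin):
    """Return the origin of the first cache behavior whose pattern matches.

    `behaviors` is an ordered list of (path_pattern, origin_id) pairs. Earlier
    entries take priority, and the default (*) behavior is the fallback.
    """
    normalized = path.lstrip("/")
    for pattern, origin_id in behaviors:
        if fnmatchcase(normalized, pattern.lstrip("/")):
            return origin_id
    return default_origin


behaviors = [
    ("images/*.jpg", "s3-images"),   # hypothetical S3 origin for images
    ("api/*", "api-server"),         # hypothetical custom origin
]
```

With this list, `/images/cat.jpg` goes to `s3-images`, `/api/v1/users` goes to `api-server`, and anything else, such as `/index.html`, falls through to the default origin.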

Viewer Protocol Policy (Viewer -> CloudFront)

  • Viewer Protocol policy can be configured to define the allowed access protocol.
  • Between CloudFront & Viewers, a cache behavior can be configured to allow
    • HTTPS only – supports HTTPS only
    • HTTP and HTTPS – supports both
    • HTTP redirected to HTTPS – HTTP is automatically redirected to HTTPS
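
The three viewer protocol policies can be sketched as a small decision function. The function name and return shape are hypothetical. The policy strings follow the values used by the CloudFront API (`allow-all`, `https-only`, `redirect-to-https`).

```python
def viewer_action(policy, scheme, host, path):
    """Decide what to do with a viewer request under a viewer protocol policy.

    Returns (action, location), where action is "serve", "redirect" or "reject".
    """
    if policy not in ("allow-all", "https-only", "redirect-to-https"):
        raise ValueError(f"unknown viewer protocol policy: {policy}")
    if scheme == "https" or policy == "allow-all":
        return "serve", None
    if policy == "redirect-to-https":
        # HTTP request: answer with a 301 redirect to the HTTPS URL.
        return "redirect", f"https://{host}{path}"
    # https-only: an HTTP request is refused with 403 Forbidden.
    return "reject", None
```

For example, an HTTP request for `/index.html` on `www.example.com` (a placeholder host) is served under `allow-all`, redirected to `https://www.example.com/index.html` under `redirect-to-https`, and rejected under `https-only`.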

Origin Protocol Policy (CloudFront -> Origin)

  • Between CloudFront & Origin, each origin can be configured with
    • HTTP only (for S3 static website).
    • HTTPS only – CloudFront fetches objects from the origin by using HTTPS.
    • Match Viewer – CloudFront uses the protocol that the viewer used to request the objects.
  • For S3 as origin,
    • For the website, the protocol has to be HTTP as HTTPS is not supported.
    • For the S3 bucket, the default Origin protocol policy is Match Viewer and cannot be changed. So when CloudFront is configured to require HTTPS between the viewer and CloudFront, it automatically uses HTTPS to communicate with S3.
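
How the origin protocol policy determines the protocol used towards the origin can be sketched as follows. The function name is hypothetical, and the policy strings follow the CloudFront API values (`http-only`, `https-only`, `match-viewer`).

```python
def origin_scheme(origin_policy, viewer_scheme):
    """Return the scheme CloudFront would use to contact the origin."""
    if origin_policy == "http-only":
        return "http"                # e.g. an S3 static website endpoint
    if origin_policy == "https-only":
        return "https"
    if origin_policy == "match-viewer":
        return viewer_scheme         # same protocol the viewer used
    raise ValueError(f"unknown origin protocol policy: {origin_policy}")
```

With `match-viewer`, requiring HTTPS from viewers means the origin is also contacted over HTTPS, which is the S3 bucket case described above.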

HTTPS Connection

  • CloudFront can also be configured to work with HTTPS for alternate domain names by using:
    • Serving HTTPS Requests Using Dedicated IP Addresses
      • CloudFront associates the alternate domain name with a dedicated IP address, and the certificate is associated with that IP address. DNS routes each HTTPS request for the domain to that IP address.
      • CloudFront uses the IP address to identify the distribution and the SSL/TLS certificate to return to the viewer.
      • This method works for every HTTPS request, regardless of the browser or other viewer that the user is using.
      • An additional monthly charge (of about $600/month) is incurred for using a dedicated IP address.
    • Serving HTTPS Requests Using Server Name Indication – SNI
      • SNI Custom SSL relies on the SNI extension of the TLS protocol, which allows multiple domains to be served over the same IP address by including the hostname that viewers are trying to connect to.
      • With the SNI method, CloudFront associates an IP address with the alternate domain name, but the IP address is not dedicated.
      • CloudFront can’t determine, based on the IP address, which domain the request is for as the IP address is not dedicated.
      • Browsers that support SNI automatically get the domain name from the request URL and add it to the SNI extension of the TLS ClientHello message.
      • When CloudFront receives an HTTPS request from a browser that supports SNI, it finds the domain name in the SNI extension and responds to the request with the applicable SSL/TLS certificate.
      • Viewer and CloudFront perform SSL negotiation, and CloudFront returns the requested content to the viewer.
      • Older browsers do not support SNI.
      • SNI Custom SSL is available at no additional cost beyond standard CloudFront data transfer and request fees
    • For end-to-end HTTPS connections, a certificate needs to be applied both between the viewers and CloudFront and between CloudFront and the origin, with the following requirements
      • HTTPS between viewers and CloudFront
        • A certificate that was issued by a trusted certificate authority (CA) such as Comodo, DigiCert, or Symantec;
        • Certificate provided by AWS Certificate Manager (ACM);
        • self-signed certificate
      • HTTPS between CloudFront and the Custom Origin
        • If the origin is not an ELB load balancer, the certificate must be issued by a trusted CA such as Comodo, DigiCert, or Symantec.
        • For load balancer, a certificate provided by ACM can be used
        • Self-signed certificates CAN NOT be used.
      • ACM certificate for CloudFront must be requested or imported in the US East (N. Virginia) region. ACM certificates in this region that are associated with a CloudFront distribution are distributed to all the geographic locations configured for that distribution.
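
To make the SNI method described above more concrete, the sketch below shows how a server sharing one IP address across many domains might pick a certificate from the hostname the viewer sends during the TLS handshake. It is a simplified model, not CloudFront's implementation: the function name, hostnames, and certificate labels are hypothetical.

```python
def select_certificate(server_name, certificates, default=None):
    """Pick a certificate for the hostname sent in the SNI extension.

    `certificates` maps exact hostnames or single-label wildcards
    (such as "*.example.org") to a certificate label.
    """
    name = server_name.lower().rstrip(".")
    if name in certificates:
        return certificates[name]             # exact hostname match
    # A wildcard covers exactly one left-most label.
    _, _, parent = name.partition(".")
    if parent and f"*.{parent}" in certificates:
        return certificates[f"*.{parent}"]
    return default                            # no SNI match on this shared IP


certificates = {
    "www.example.com": "cert-A",
    "*.example.org": "cert-B",
}
```

With a dedicated IP address this lookup is unnecessary because the IP itself identifies the distribution, which is why that method also works for viewers that do not send SNI.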

Allowed HTTP methods

  • CloudFront supports GET, HEAD, OPTIO