AWS SageMaker

  • SageMaker is a fully managed machine learning service to build, train, and deploy machine learning (ML) models quickly.
  • SageMaker removes the heavy lifting from each step of the machine learning process to make it easier to develop high quality models.
  • SageMaker is designed for high availability with no maintenance windows or scheduled downtimes
  • SageMaker APIs run in Amazon’s proven, high-availability data centers, with service stack replication configured across three facilities in each AWS region to provide fault tolerance in the event of a server failure or AZ outage
  • SageMaker provides a full end-to-end workflow, but users can continue to use their existing tools with SageMaker.
  • SageMaker supports Jupyter notebooks.
  • SageMaker allows users to select the number and type of instance used for the hosted notebook, training & model hosting.

SageMaker Machine Learning

Generate example data

  • Involves exploring and preprocessing, or “wrangling,” example data before using it for model training.
  • To preprocess data, you typically do the following:
    • Fetch the data
    • Clean the data
    • Prepare or transform the data
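
The three preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not a SageMaker API: the inline `RAW_CSV` data, the column names, and the helper names `fetch`, `clean`, and `transform` are all assumptions made for the example. In practice the data would usually be fetched from S3 or another store.

```python
import csv
import io

# Hypothetical raw data standing in for a file fetched from storage.
RAW_CSV = """age,income,label
34,52000,1
,61000,0
45,,1
29,48000,0
"""


def fetch(raw_text):
    """Fetch: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Clean: drop rows that have any missing value."""
    return [row for row in rows if all(value != "" for value in row.values())]


def transform(rows):
    """Transform: convert values to floats and min-max scale the features."""
    features = ["age", "income"]
    numeric = [{key: float(value) for key, value in row.items()} for row in rows]
    for name in features:
        low = min(row[name] for row in numeric)
        high = max(row[name] for row in numeric)
        span = (high - low) or 1.0  # avoid dividing by zero for constant columns
        for row in numeric:
            row[name] = (row[name] - low) / span
    return numeric


prepared = transform(clean(fetch(RAW_CSV)))
```

Here two of the four rows are dropped for missing values, and the remaining feature columns are scaled to the range 0 to 1.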

Train a model

  • Model training includes both training and evaluating the model, as follows:
  • Training the model
    • Needs an algorithm; the choice depends on a number of factors.
    • Needs compute resources for training.
  • Evaluating the model
    • Determine whether the accuracy of the inferences is acceptable.
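
As a rough illustration of the evaluation step, the sketch below compares predictions against known labels from a held-out set and checks the result against a threshold. The function name `accuracy`, the sample data, and the `ACCEPTABLE_ACCURACY` value are assumptions chosen for the example; what counts as acceptable depends on the application.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


# Hypothetical acceptance threshold and held-out evaluation data.
ACCEPTABLE_ACCURACY = 0.75
labels = [1, 0, 1, 1, 0]
predictions = [1, 0, 0, 1, 0]

score = accuracy(labels, predictions)  # 4 of 5 predictions are correct
is_acceptable = score >= ACCEPTABLE_ACCURACY
```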

Training Data Format – File mode vs Pipe mode

    • Most Amazon SageMaker algorithms work best when using the optimized protobuf recordIO format for the training data.
    • Using the RecordIO format lets algorithms that support Pipe mode take advantage of it during training.
    • File mode loads all of the data from S3 to the training instance volumes
    • In Pipe mode, the training job streams data directly from S3.
    • Streaming can provide faster start times for training jobs and better throughput.
    • Pipe mode also reduces the size of the EBS volumes needed for the training instances, because it needs only enough disk space to store your final model artifacts.
    • File mode needs disk space to store both the final model artifacts and the full training dataset.
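
The disk-space difference between the two modes can be sketched as a small helper that builds part of a training job request. The keys `TrainingInputMode` and `VolumeSizeInGB` follow the field names of the `CreateTrainingJob` API, but the function `build_training_request`, the instance type, and the sizing rule are assumptions for illustration. A real request needs further fields (such as the training image, role, and output location) that are omitted here.

```python
def build_training_request(input_mode, dataset_gb, model_artifacts_gb):
    """Build a partial, CreateTrainingJob-style request for an input mode.

    The volume sizing is a simplification: File mode must hold the full
    dataset plus the model artifacts, Pipe mode only the model artifacts.
    """
    if input_mode not in ("File", "Pipe"):
        raise ValueError("input_mode must be 'File' or 'Pipe'")

    volume_gb = model_artifacts_gb
    if input_mode == "File":
        # File mode copies the whole dataset from S3 onto the volume.
        volume_gb += dataset_gb

    return {
        "AlgorithmSpecification": {"TrainingInputMode": input_mode},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",  # assumed instance type
            "InstanceCount": 1,
            "VolumeSizeInGB": volume_gb,
        },
    }


file_request = build_training_request("File", dataset_gb=100, model_artifacts_gb=5)
pipe_request = build_training_request("Pipe", dataset_gb=100, model_artifacts_gb=5)
```

With these illustrative numbers the File mode request asks for a 105 GB volume, while the Pipe mode request asks for 5 GB.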

Build Model

  • SageMaker provides several built-in machine learning algorithms that can be used for a variety of problem types
  • Write a custom training script in a machine learning framework that SageMaker supports, and use one of the pre-built framework containers to run it in SageMaker.
  • Bring your own algorithm or model to train or host in SageMaker.
    • SageMaker provides pre-built Docker images for its built-in algorithms and the supported deep learning frameworks used for training and inference
    • By using containers, machine learning algorithms can be trained and models deployed quickly and reliably at any scale.
  • Use an algorithm that you subscribe to from AWS Marketplace.

Deploy the model

  • Traditionally, a model is re-engineered before it is integrated with an application and deployed.
  • SageMaker supports both hosting services and batch transform for deployment.

Hosting services

    • provides an HTTPS endpoint where the machine learning model is available to provide inferences.
    • supports Canary deployment using ProductionVariant and deploying multiple variants of a model to the same SageMaker HTTPS endpoint.
    • supports automatic scaling for production variants. Automatic scaling dynamically adjusts the number of instances provisioned for a production variant in response to changes in your workload
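
A canary-style split across production variants can be sketched as follows. The keys mirror the `ProductionVariants` structure of the `CreateEndpointConfig` API, where traffic is distributed in proportion to each variant's weight. The helper names `build_endpoint_config` and `traffic_shares`, the model names, and the instance type are hypothetical, and nothing here calls AWS.

```python
def build_endpoint_config(name, variants):
    """Build a CreateEndpointConfig-style request.

    variants: list of (variant_name, model_name, weight) tuples.
    """
    return {
        "EndpointConfigName": name,
        "ProductionVariants": [
            {
                "VariantName": variant_name,
                "ModelName": model_name,
                "InstanceType": "ml.m5.large",  # assumed instance type
                "InitialInstanceCount": 1,
                "InitialVariantWeight": weight,
            }
            for variant_name, model_name, weight in variants
        ],
    }


def traffic_shares(config):
    """Return each variant's share of traffic: its weight over the total."""
    variants = config["ProductionVariants"]
    total = sum(variant["InitialVariantWeight"] for variant in variants)
    return {
        variant["VariantName"]: variant["InitialVariantWeight"] / total
        for variant in variants
    }


# Hypothetical canary: most traffic stays on the current model.
config = build_endpoint_config(
    "demo-endpoint-config",
    [("current", "model-v1", 9.0), ("canary", "model-v2", 1.0)],
)
shares = traffic_shares(config)
```

With weights of 9 and 1, the canary variant receives about a tenth of the requests.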

Batch transform

    • To get inferences on entire datasets, consider using batch transform as an alternative to hosting services.

SageMaker Security

  • SageMaker ensures that ML model artifacts and other system artifacts are encrypted in transit and at rest.
  • SageMaker allows using encrypted S3 buckets for model artifacts and data, as well as passing a KMS key to SageMaker notebooks, training jobs, and endpoints to encrypt the attached ML storage volume.
  • Requests to the SageMaker API and console are made over a secure (SSL) connection.
  • SageMaker stores code in ML storage volumes, secured by security groups and optionally encrypted at rest.

SageMaker Notebooks

  • SageMaker notebooks are collaborative notebooks built into SageMaker Studio that can be launched quickly.
  • can be accessed without setting up compute instances and file storage
  • charged only for the resources consumed while the notebooks are running
  • instance types can be easily switched during the experimentation phase if more or less computing power is needed.

SageMaker Built-in Algorithms

Please refer to SageMaker Built-in Algorithms for details.

Elastic Inference (EI)

  • helps speed up the throughput and decrease the latency of getting real-time inferences from the deep learning models deployed as SageMaker hosted models
  • adds inference acceleration to a hosted endpoint for a fraction of the cost of using a full GPU instance.

SageMaker Ground Truth

  • provides automated data labeling using machine learning
  • helps build highly accurate training datasets for machine learning quickly.
  • offers easy access to labelers through Amazon Mechanical Turk and provides them with built-in workflows and interfaces for common labeling tasks.
  • allows using your own labelers or using vendors recommended by Amazon through AWS Marketplace.
  • helps lower the labeling costs by up to 70% using automatic labeling, which works by training Ground Truth from data labeled by humans so that the service learns to label data independently.
  • significantly reduces the time and effort required to create datasets for training to reduce costs
  • provides annotation consolidation to help improve the accuracy of the data object’s labels. It combines the results of multiple workers’ annotation tasks into one high-fidelity label.

  • first selects a random sample of data and sends it to Amazon Mechanical Turk to be labeled.
  • results are then used to train a labeling model that attempts to label a new sample of raw data automatically.
  • labels are committed when the model can label the data with a confidence score that meets or exceeds a threshold you set.
  • if the confidence score falls below the defined threshold, the data is sent to human labelers.
  • Some of the data labeled by humans is used to generate a new training dataset for the labeling model, and the model is automatically retrained to improve its accuracy.
  • process repeats with each sample of raw data to be labeled.
  • labeling model becomes more capable of automatically labeling raw data with each iteration, and less data is routed to humans.
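
The confidence-threshold routing in the workflow above can be sketched as a small function. This is a conceptual illustration only, not Ground Truth's implementation: the function `route_by_confidence`, the item IDs, the labels, and the confidence scores are made up for the example.

```python
def route_by_confidence(predictions, threshold):
    """Split model predictions into committed labels and a human queue.

    predictions: list of (item_id, label, confidence) tuples.
    Labels meeting or exceeding the threshold are committed automatically;
    the remaining items are routed to human labelers.
    """
    committed = {}
    needs_human = []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            committed[item_id] = label
        else:
            needs_human.append(item_id)
    return committed, needs_human


# Hypothetical output of the labeling model for one sample of raw data.
batch = [
    ("img-1", "cat", 0.97),
    ("img-2", "dog", 0.62),
    ("img-3", "cat", 0.90),
    ("img-4", "dog", 0.41),
]
auto_labels, human_queue = route_by_confidence(batch, threshold=0.90)
```

In this batch two items are committed automatically and two go to human labelers. As the labeling model is retrained on the human-labeled data, more items would be expected to clear the threshold in later iterations.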

SageMaker Automatic Model Tuning

  • Hyperparameters are parameters exposed by machine learning algorithms that control how the underlying algorithm operates; their values affect the quality of the trained models.
  • Automatic model tuning is the process of finding a set of hyperparameters for an algorithm that can yield an optimal model.
  • Best Practices for Hyperparameter tuning
    • Choosing the Number of Hyperparameters – limit the search to a small number, as the difficulty of a hyperparameter tuning job depends primarily on the number of hyperparameters that Amazon SageMaker has to search.
    • Choosing Hyperparameter Ranges – DO NOT specify a very large range to cover every possible value for a hyperparameter. The range of values that you choose to search can significantly affect the success of hyperparameter optimization.
    • Using Logarithmic Scales for Hyperparameters – if a hyperparameter is known to be log-scaled, converting it to a logarithmic scale yourself can improve hyperparameter optimization.
    • Choosing the Best Number of Concurrent Training Jobs – running one training job at a time typically achieves the best results with the least amount of compute time, because a tuning job improves only through successive rounds of experiments.
    • Running Training Jobs on Multiple Instances – design distributed training jobs so that they report the objective metric that you want.
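
To see why a logarithmic scale matters for a hyperparameter such as a learning rate, the sketch below compares uniform sampling on a linear scale with uniform sampling on a log scale over the same range. The range, the sample count, and the helper names are assumptions for illustration. The code does not call SageMaker, whose tuning API lets a scaling type be specified for a parameter range.

```python
import math
import random


def sample_linear(low, high, rng):
    """Draw a value uniformly on a linear scale."""
    return rng.uniform(low, high)


def sample_log(low, high, rng):
    """Draw a value uniformly on a logarithmic scale."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def fraction_below(samples, cutoff):
    """Return the fraction of samples that fall below the cutoff."""
    return sum(1 for value in samples if value < cutoff) / len(samples)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
low, high = 1e-5, 1e-1  # hypothetical learning-rate range

linear_samples = [sample_linear(low, high, rng) for _ in range(10000)]
log_samples = [sample_log(low, high, rng) for _ in range(10000)]

# Share of trials spent on the two smallest decades of the range.
linear_small = fraction_below(linear_samples, 1e-3)
log_small = fraction_below(log_samples, 1e-3)
```

On the linear scale only about 1% of samples fall below 1e-3, so the small values are barely explored. On the log scale roughly half do, since each decade of the range receives equal attention.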

SageMaker Neo

  • SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge.
  • Automatically optimizes models built with popular deep learning frameworks so that they can be deployed on multiple hardware platforms.
  • Optimized models run up to two times faster and consume less than a tenth of the resources of typical machine learning models.

SageMaker Pricing

  • Users pay for the ML compute, storage, and data processing resources they use for hosting the notebook, training the model, performing predictions & logging the outputs.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.