Google Cloud Storage Options

GCP provides various storage options, and the selection can be based on:

  • Relational (SQL) vs Non-Relational (NoSQL)
  • Structured vs Unstructured
  • Transactional (OLTP) vs Analytical (OLAP)
  • Fully Managed vs Requires Provisioning
  • Global vs Regional
  • Horizontal vs Vertical scaling

Cloud Datastore

  • Cloud Datastore is a highly scalable, non-relational (NoSQL) database
  • fully managed with no-ops – no planned downtime and no need to provision database instances (vs Bigtable)
  • queries scale with the size of the result set, not the size of the data set
  • supports ACID transactions – all or nothing (vs Bigtable)
  • provides high availability of reads and writes – runs in Google data centers, which use redundancy to minimize impact from points of failure.
  • provides massive scalability with high performance – uses a distributed architecture to automatically manage scaling.
  • scales from zero to terabytes with flexible storage and querying of data
  • provides a SQL-like query language (GQL)
  • supports strong and eventual consistency – entity lookups and ancestor queries always receive strongly consistent data; all other queries are eventually consistent.
  • supports data encryption at rest and in transit
  • provides terabytes of capacity with a maximum unit size of 1 MB per entity (vs Bigtable)
  • Consider using Cloud Datastore if you need to store semi-structured objects, or if you require support for transactions and SQL-like queries (see the sketch below).
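
A minimal sketch of these pieces in practice, using the google-cloud-datastore Python client; the Task kind, its properties, and the surrounding project setup are made up for illustration.

    from google.cloud import datastore

    # Hypothetical kind and properties, purely for illustration.
    client = datastore.Client()

    # Writes are ACID: everything inside the transaction commits
    # all-or-nothing.
    with client.transaction():
        task = datastore.Entity(client.key("Task"))
        task.update({"description": "rotate keys", "done": False})
        client.put(task)

    # SQL-like filtering; non-ancestor queries such as this one are
    # eventually consistent.
    query = client.query(kind="Task")
    query.add_filter("done", "=", False)
    for entity in query.fetch(limit=10):
        print(entity["description"])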

Cloud Bigtable

  • Bigtable is a non-relational (NoSQL), analytical big data database service
  • supports large quantities (>1 TB) of semi-structured or structured data (vs Datastore)
  • supports high throughput or rapidly changing data (vs BigQuery)
  • managed, but needs provisioning of nodes and can be expensive (vs Datastore and BigQuery)
  • does not support transactions or strong relational semantics (vs Datastore)
  • does not support SQL queries (vs BigQuery and Datastore)
  • ideal for time-series data, or data with a natural semantic ordering
  • can run asynchronous batch or real-time processing on the data
  • can run machine learning algorithms on the data
  • provides petabytes of capacity with a maximum unit size of 10 MB per cell and 100 MB per row.
  • Consider using Cloud Bigtable if you need a high-performance datastore to perform analytics on a large amount of structured data (a row-key sketch follows below).
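
For time-series data, most of the design effort goes into the row key. A minimal sketch using the google-cloud-bigtable Python client; the project, instance, table, and column-family names are all hypothetical.

    import time

    from google.cloud import bigtable

    # Hypothetical project/instance/table; the "metrics" column family
    # is assumed to already exist.
    client = bigtable.Client(project="my-project")
    table = client.instance("iot-instance").table("sensor-metrics")

    # Time-series row key: sensor ID first, then a reversed timestamp,
    # so each sensor's newest readings sort first in its key range.
    ts = time.time_ns()
    row_key = f"sensor-42#{2**63 - 1 - ts}".encode()

    row = table.direct_row(row_key)
    row.set_cell("metrics", b"temp_c", b"21.5")
    row.commit()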

Cloud Storage

  • Cloud Storage provides durable and highly available object storage.
  • fully managed and scalable, with simple administration and no capacity management required
  • supports unstructured data storage like binary or raw objects
  • provides high performance at internet scale
  • supports data encryption at rest and in transit
  • Consider using Cloud Storage if you need to store immutable blobs larger than 10 MB, such as large images or movies. This storage service provides petabytes of capacity with a maximum unit size of 5 TB per object (see the upload sketch below).
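
A minimal upload sketch using the google-cloud-storage Python client; the bucket and object names are hypothetical.

    from google.cloud import storage

    # Hypothetical bucket/object names for illustration.
    client = storage.Client()
    bucket = client.bucket("my-media-bucket")

    # Objects are immutable blobs: "updating" one means rewriting it.
    # Data is encrypted at rest on the server side by default.
    blob = bucket.blob("videos/intro.mp4")
    blob.upload_from_filename("intro.mp4")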

Cloud SQL

  • Cloud SQL provides managed, relational SQL databases
  • Offers MySQL and PostgreSQL databases as a service
  • managed, however you need to select and provision machines (vs Cloud Spanner)
  • supports automatic replication
  • supports managed backups
  • supports vertical scaling for read and write
  • supports horizontal scaling for reads only, using read replicas (vs Cloud Spanner)
  • single region only – although it now supports cross-region read replicas (vs Cloud Spanner)
  • supports data encryption at rest and in transit
  • provides up to 10,230 GB, depending on machine type (vs Cloud Spanner)
  • Consider using Cloud SQL for full relational SQL support for OLTP and for lift-and-shift of MySQL and PostgreSQL databases (connection sketch below)
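
A minimal connection sketch, assuming a PostgreSQL instance reached through the Cloud SQL Auth Proxy running locally; the database name, user, and password are hypothetical placeholders.

    import psycopg2

    # The Cloud SQL Auth Proxy (assumed running) exposes the instance on
    # a local port, so the application connects as if the database were
    # local; credentials here are placeholders.
    conn = psycopg2.connect(
        host="127.0.0.1", port=5432,
        dbname="appdb", user="appuser", password="change-me",
    )
    with conn, conn.cursor() as cur:
        cur.execute("SELECT version();")
        print(cur.fetchone()[0])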

Cloud Spanner

  • Cloud Spanner provides fully managed, relational SQL databases with joins and secondary indexes
  • provides cross-region, global, horizontal scalability and availability
  • supports strong consistency, including strongly consistent secondary indexes
  • provides high availability through synchronous and built-in data replication.
  • provides strong global consistency
  • supports database sizes exceeding ~2 TB (vs Cloud SQL)
  • does not provide direct lift and shift for relational databases (vs Cloud SQL)
  • expensive compared to Cloud SQL
  • Consider using Cloud Spanner for full relational SQL support, with horizontal scalability spanning petabytes, for OLTP (query sketch below)
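
A minimal query sketch using the google-cloud-spanner Python client; the instance ID, database ID, and Orders table are hypothetical.

    from google.cloud import spanner

    # Hypothetical instance/database/table names for illustration.
    client = spanner.Client()
    database = client.instance("global-instance").database("orders-db")

    # Reads are strongly consistent by default, even across regions.
    with database.snapshot() as snapshot:
        rows = snapshot.execute_sql(
            "SELECT OrderId, Total FROM Orders LIMIT 10"
        )
        for order_id, total in rows:
            print(order_id, total)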

BigQuery

  • provides a fully managed, no-ops OLAP solution
  • provides high capacity, data warehousing analytics solution
  • ideal for big data exploration and processing
  • not ideal for operational or transactional databases
  • provides a SQL interface (query sketch below)
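
A minimal query sketch using the google-cloud-bigquery Python client, run against a public dataset so nothing has to be provisioned (a billing project is still required).

    from google.cloud import bigquery

    client = bigquery.Client()

    # Query a public table; BigQuery bills by bytes processed rather
    # than provisioned capacity, which is the "no-ops" part.
    query = """
        SELECT name, SUM(number) AS total
        FROM `bigquery-public-data.usa_names.usa_1910_2013`
        GROUP BY name ORDER BY total DESC LIMIT 5
    """
    for row in client.query(query):  # iterating waits for the job
        print(row.name, row.total)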

GCP Storage Options Decision Tree

GCP Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • GCP services are updated every day and both the answers and questions might become outdated soon, so research accordingly.
  • GCP exam questions are not updated to keep pace with GCP updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Your application is hosted across multiple regions and consists of both relational database data and static images. Your database has over 10 TB of data. You want to use a single storage repository for each data type across all regions. Which two products would you choose for this task? (Choose two)
    1. Cloud Bigtable
    2. Cloud Spanner
    3. Cloud SQL
    4. Cloud Storage
  2. You are building an application that stores relational data from users. Users across the globe will use this application. Your CTO is concerned about the scaling requirements because the size of the user base is unknown. You need to implement a database solution that can scale with your user growth with minimum configuration changes. Which storage solution should you use?
    1. Cloud SQL
    2. Cloud Spanner
    3. Cloud Firestore
    4. Cloud Datastore
  3. Your company processes high volumes of IoT data that are time-stamped. The total data volume can be several petabytes. The data needs to be written and changed at a high speed. You want to use the most performant storage option for your data. Which product should you use?
    1. Cloud Datastore
    2. Cloud Storage
    3. Cloud Bigtable
    4. BigQuery
  4. Your App Engine application needs to store stateful data in a proper storage service. Your data is non-relational database data. You do not expect the database size to grow beyond 10 GB and you need to have the ability to scale down to zero to avoid unnecessary costs. Which storage service should you use?
    1. Cloud Bigtable
    2. Cloud Dataproc
    3. Cloud SQL
    4. Cloud Datastore

AWS Data Transfer Services

  • AWS provides a suite of data transfer services that include many methods to help you migrate your data more effectively.
  • Data transfer services work both online and offline, and the choice depends on several factors like the amount of data, time required, frequency, available bandwidth, and cost.
  • Online data transfer and hybrid cloud storage
    • Establish a network link to the VPC to transfer data to AWS, or use S3 for hybrid cloud storage with existing on-premises applications.
    • helps both to lift and shift large datasets once, and to integrate existing process flows like backup and recovery or continuous data streams directly with cloud storage.
  • Offline data migration to Amazon S3.
    • uses shippable, ruggedized devices, which are ideal for moving large archives, data lakes, or data in situations where bandwidth and data volumes cannot pass over your networks within the desired time frame.

Online data transfer

VPN

  • connect securely between data centers and AWS
  • quick to set up and cost-efficient
  • ideal for small data transfers and connectivity
  • not as reliable, as it still uses a shared Internet connection

Direct Connect

  • provides dedicated physical connection to accelerate network transfers between data centers and AWS
  • provides reliable data transfer
  • ideal for regular large data transfer
  • needs time to set up
  • is not a cost-efficient solution
  • can be secured using VPN over Direct Connect

AWS S3 Transfer Acceleration

  • makes public Internet transfers to S3 faster.
  • helps maximize the available bandwidth regardless of distance or varying Internet weather, with no special clients or proprietary network protocols – simply change the endpoint used with the S3 bucket and acceleration is automatically applied.
  • ideal for recurring jobs that travel across the globe, such as media uploads, backups, and local data processing tasks that are regularly sent to a central location (see the sketch below)
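
A minimal sketch using boto3; the bucket name is hypothetical. Acceleration is enabled once per bucket, after which clients opt in to the accelerate endpoint.

    import boto3
    from botocore.config import Config

    # Hypothetical bucket name for illustration.
    s3 = boto3.client("s3")
    s3.put_bucket_accelerate_configuration(
        Bucket="my-global-bucket",
        AccelerateConfiguration={"Status": "Enabled"},
    )

    # Uploads through this client are routed via nearby edge locations.
    accelerated = boto3.client(
        "s3", config=Config(s3={"use_accelerate_endpoint": True})
    )
    accelerated.upload_file(
        "backup.tar.gz", "my-global-bucket", "backups/backup.tar.gz"
    )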

AWS DataSync

  • automates moving data between on-premises storage and S3 or Elastic File System (Amazon EFS).
  • automatically handles many of the tasks that can slow down migrations or burden IT operations, including running your own instances, handling encryption, managing scripts, network optimization, and data integrity validation.
  • helps transfer data at speeds up to 10 times faster than open-source tools.
  • uses AWS Direct Connect or internet links to AWS; ideal for one-time data migrations, recurring data processing workflows, and automated replication for data protection and recovery (see the sketch below).
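
A minimal sketch using boto3; the location ARNs are hypothetical and assume the source (e.g. an on-premises NFS share) and destination (e.g. an S3 bucket) locations were created beforehand.

    import boto3

    datasync = boto3.client("datasync")

    # Hypothetical ARNs; a task ties a source location to a destination.
    task = datasync.create_task(
        SourceLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-src",
        DestinationLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-dst",
        Name="nightly-sync",
    )

    # Each execution runs the transfer with built-in integrity validation.
    datasync.start_task_execution(TaskArn=task["TaskArn"])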

Offline data transfer

AWS Snowball

  • is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS.
  • ideal for one time large data transfers with limited network bandwidth, long transfer times, and security concerns
  • is simple, fast, and secure.
  • can be very cost- and time-efficient for large data transfers

AWS Snowball Edge

  • is a petabyte- to exabyte-scale data transfer device with on-board storage and compute capabilities
  • moves large amounts of data into and out of AWS, serves as a temporary storage tier for large local datasets, and supports local workloads in remote or offline locations.
  • ideal for one time large data transfers with limited network bandwidth, long transfer times, and security concerns
  • is simple, fast, and secure.
  • can be very cost- and time-efficient for large data transfers

AWS Snowmobile

  • is an exabyte-scale data transport solution that uses a secure 40-foot shipping container, hauled by a semi-trailer truck, to transfer large amounts of data into and out of AWS.
  • addresses common challenges with large-scale data transfers including high network costs, long transfer times, and security concerns.
  • transfer is done through a custom engagement; it is fast, secure, and can be as little as one-fifth the cost of transferring over high-speed Internet.

Data Transfer Chart – Bandwidth vs Time

(Chart: data migration speeds – transfer time by available bandwidth.)
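
As a rough back-of-the-envelope check of where such a chart comes from, assuming full line-rate utilization (which real transfers rarely achieve):

    # Transfer time for a given dataset size and link speed, assuming
    # 100% utilization; real-world throughput is usually lower.
    data_tb = 100                  # dataset size in TB (decimal units)
    link_gbps = 1                  # link speed in Gbps

    bits = data_tb * 8 * 10**12    # TB -> bits
    seconds = bits / (link_gbps * 10**9)
    print(f"{seconds / 86400:.1f} days")   # ~9.3 days for 100 TB at 1 Gbps

At roughly nine days for 100 TB over a dedicated 1 Gbps link, and far longer over a shared one, shipping a Snowball device quickly becomes the faster option.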

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might become outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. An organization is moving non-business-critical applications to AWS while maintaining a mission critical application in an on-premises data center. An on-premises application must share limited confidential information with the applications in AWS. The Internet performance is unpredictable. Which configuration will ensure continued connectivity between sites MOST securely?
    1. VPN and a cached storage gateway
    2. AWS Snowball Edge
    3. VPN Gateway over AWS Direct Connect
    4. AWS Direct Connect
  2. A company wants to transfer petabytes of data to AWS for analytics; however, it is constrained by its Internet connectivity. Which AWS service can help it transfer the data quickly?
    1. S3 enhanced uploader
    2. Snowmobile
    3. Snowball
    4. Direct Connect
  3. A company wants to transfer its video library data, which runs into exabytes, to AWS. Which AWS service can help the company transfer the data?
    1. Snowmobile
    2. Snowball
    3. S3 upload
    4. S3 enhanced uploader
  4. You are working with a customer who has 100 TB of archival data that they want to migrate to Amazon Glacier. The customer has a 1-Gbps connection to the Internet. Which service or feature provides the fastest method of getting the data into Amazon Glacier?
    1. Amazon Glacier multipart upload
    2. AWS Storage Gateway
    3. VM Import/Export
    4. AWS Snowball