Google Cloud – EHR Healthcare Case Study

EHR Healthcare is a leading provider of electronic health record software to the medical industry. EHR Healthcare provides its software as a service to multi-national medical offices, hospitals, and insurance providers.

Executive statement

Our on-premises strategy has worked for years but has required a major investment of time and money in training our team on distinctly different systems, managing similar but separate environments, and responding to outages. Many of these outages have been a result of misconfigured systems, inadequate capacity to manage spikes in traffic, and inconsistent monitoring practices. We want to use Google Cloud to leverage a scalable, resilient platform that can span multiple environments seamlessly and provide a consistent and stable user experience that positions us for future growth.

EHR Healthcare wants to move to Google Cloud to expand and to build scalable, highly available applications. It also wants to leverage automation and Infrastructure as Code (IaC) to provide consistency across environments and reduce provisioning errors.

Solution Concept

Due to rapid changes in the healthcare and insurance industry, EHR Healthcare’s business has been growing exponentially year over year. They need to be able to scale their environment, adapt their disaster recovery plan, and roll out new continuous deployment capabilities to update their software at a fast pace. Google Cloud has been chosen to replace their current colocation facilities.

In short, EHR wants to scale, build out an HA and DR setup, and introduce CI/CD.

Existing Technical Environment

EHR’s software is currently hosted in multiple colocation facilities. The lease on one of the data centers is about to expire.
Customer-facing applications are web-based, and many have recently been containerized to run on a group of Kubernetes clusters. Data is stored in a mixture of relational and NoSQL databases (MySQL, MS SQL Server, Redis, and MongoDB).
EHR is hosting several legacy file- and API-based integrations with insurance providers on-premises. These systems are scheduled to be replaced over the next several years. There is no plan to upgrade or move these systems at the current time.
Users are managed via Microsoft Active Directory. Monitoring is currently being done via various open-source tools. Alerts are sent via email and are often ignored.

  • The lease on one of the data centers is about to expire, so time is critical
  • Customer-facing applications are already containerized and backed by SQL and NoSQL databases, so they can be moved
  • The legacy insurance-provider integrations will not be migrated, but connectivity to them must be maintained
  • The team uses multiple monitoring tools that likely need consolidation; email alerting that is often ignored also needs improvement

Business requirements

  • On-board new insurance providers as quickly as possible.
  • Provide a minimum 99.9% availability for all customer-facing systems.
    • Availability can be increased by hosting applications across multiple zones (and multiple regions for even higher availability)
  • Provide centralized visibility and proactive action on system performance and usage.
    • Cloud Monitoring can provide centralized visibility, and its alerting policies enable proactive action on performance and usage
  • Increase ability to provide insights into healthcare trends.
    • Data can be pushed to and analyzed in BigQuery, with insights visualized using Data Studio
  • Reduce latency to all customers.
    • Latency can be reduced by exposing the applications through the Global HTTP(S) Load Balancer, which routes users to the closest healthy backend
  • Maintain regulatory compliance.
    • Regulatory compliance (e.g., HIPAA for healthcare data) can be maintained using data localization, data retention policies, and audit logging
  • Decrease infrastructure administration costs.
    • Infrastructure administration costs can be reduced through automation with either Terraform or Deployment Manager
  • Make predictions and generate reports on industry trends based on provider data.
    • Data can be pushed to and analyzed in BigQuery, which also supports predictions via BigQuery ML (see the sketch after this list)
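
As a minimal sketch of the BigQuery-based analysis suggested above (the project, dataset, table, and column names here are hypothetical):

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

# Hypothetical table of provider records loaded from EHR systems.
query = """
    SELECT provider_id, COUNT(*) AS record_count
    FROM `ehr-analytics.provider_data.records`
    WHERE record_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
    GROUP BY provider_id
    ORDER BY record_count DESC
    LIMIT 10
"""

# Iterating the query job waits for completion and yields result rows.
for row in client.query(query):
    print(f"{row.provider_id}: {row.record_count}")
```

The results could feed a Data Studio dashboard for trend visibility, or a BigQuery ML model for predictions on industry trends.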

Technical requirements

  • Maintain legacy interfaces to insurance providers with connectivity to both on-premises systems and cloud providers.
  • Provide a consistent way to manage customer-facing applications that are container-based.
    • Container-based applications can be deployed to GKE or Cloud Run with a consistent CI/CD experience; Anthos (GKE Enterprise) can provide consistent management across on-premises and cloud environments
  • Provide a secure and high-performance connection between on-premises systems and Google Cloud.
    • Cloud VPN (secure, over the internet) or Dedicated/Partner Interconnect (high bandwidth, low latency) connections can be established between on-premises systems and Google Cloud
  • Provide consistent logging, log retention, monitoring, and alerting capabilities.
    • Cloud Monitoring and Cloud Logging can be used as a single, consistent stack for monitoring, logging, log retention, and alerting (see the sketch after this list)
  • Maintain and manage multiple container-based environments.
    • Use IaC tooling such as Terraform or Deployment Manager to provide consistent implementations across environments
  • Dynamically scale and provision new environments.
    • Applications deployed on GKE can be scaled using the Cluster Autoscaler for nodes and the Horizontal Pod Autoscaler (HPA) for deployments.
  • Create interfaces to ingest and process data from new providers.
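
As a minimal sketch of the centralized logging and alerting approach referenced above (the log name and payload fields are hypothetical):

```python
from google.cloud import logging  # pip install google-cloud-logging

client = logging.Client()
logger = client.logger("ehr-app")  # hypothetical application log name

# Structured entries are queryable in Cloud Logging; the severity field can
# drive log-based alerting policies in Cloud Monitoring instead of emails.
logger.log_struct(
    {"event": "provider_sync_failed", "provider": "example-insurer"},
    severity="ERROR",
)
```

A log-based metric on such entries, combined with a Cloud Monitoring alerting policy and notification channels, would replace the current often-ignored email alerts.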

GCP Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • GCP services are updated every day, so both the questions and answers might soon be outdated; research accordingly.
  • GCP exam questions are not always updated to keep pace with GCP changes, so even if the underlying feature has changed, the question might not reflect it.
  • Open to further feedback, discussion and correction.
  1. For this question, refer to the EHR Healthcare case study. In the past, configuration errors put IP addresses on backend servers that should not have been accessible from the internet. You need to ensure that no one can put external IP addresses on backend Compute Engine instances and that external IP addresses can only be configured on the front end Compute Engine instances. What should you do?
    1. Create an organizational policy with a constraint to allow external IP addresses on the front end Compute Engine instances
    2. Revoke the compute.networkadmin role from all users in the project with front end instances
    3. Create an Identity and Access Management (IAM) policy that maps the IT staff to the compute.networkadmin role for the organization
    4. Create a custom Identity and Access Management (IAM) role named GCE_FRONTEND with the compute.addresses.create permission

References

EHR Healthcare Case Study

Google Cloud – TerramEarth Case Study

TerramEarth manufactures heavy equipment for the mining and agricultural industries. About 80% of their business is from mining and 20% from agriculture. They currently have over 500 dealers and service centers in 100 countries. Their mission is to build products that make their customers more productive.

Key points here are that the 500 dealers and service centers are spread across the world, and TerramEarth wants to make its customers more productive.

Solution Concept

There are 20 million TerramEarth vehicles in operation that collect 120 fields of data per second. Data is stored locally on the vehicle and can be accessed for analysis when a vehicle is serviced. The data is downloaded via a maintenance port. This same port can be used to adjust operational parameters, allowing the vehicles to be upgraded in the field with new computing modules.

Approximately 200,000 vehicles are connected to a cellular network, allowing TerramEarth to collect data directly. At a rate of 120 fields of data per second, with 22 hours of operation per day, TerramEarth collects a total of about 9 TB/day from these connected vehicles.

Key points here are that TerramEarth has 20 million vehicles. Data is stored on each vehicle and is normally only available for analysis when the vehicle comes in for servicing. Only 1% of the vehicles (200,000 of 20 million) can stream data in real time over a cellular network, producing about 9 TB/day.

Executive Statement

Our competitive advantage has always been in our manufacturing process, with our ability to build better vehicles for lower cost than our competitors. However, new products with different approaches are constantly being developed, and I’m concerned that we lack the skills to undergo the next wave of transformations in our industry. My goals are to build our skills while addressing immediate market needs through incremental innovations.

Key point here is that the company wants to improve its vehicles through incremental innovation while building new skills and keeping costs low.

Existing Technical Environment

TerramEarth’s existing architecture is composed of Linux and Windows-based systems that reside in a single U.S. west-coast-based data center. These systems gzip CSV files from the field, upload them via FTP, and place the data in the data warehouse. Because this process takes time, aggregated reports are based on data that is 3 weeks old.

With this data, TerramEarth has been able to preemptively stock replacement parts and reduce unplanned downtime of their vehicles by 60%. However, because the data is stale, some customers are without their vehicles for up to 4 weeks while they wait for replacement parts.

Key point here is that the company is working with stale data and hence sees extended vehicle downtime.

Application 1: Data ingest

A custom Python application reads uploaded data files from a single server and writes to the data warehouse.

Compute:

  • Windows Server 2008 R2
    • 16 CPUs
    • 128 GB of RAM
    • 10 TB local HDD storage

Application 2: Reporting

An off-the-shelf application that business analysts use to run a daily report to see what equipment needs repair. Only 2 analysts of a team of 10 (5 west coast, 5 east coast) can connect to the reporting application at a time.

Compute

  • Off the shelf application. License tied to number of physical CPUs
    • Windows Server 2008 R2
    • 16 CPUs
    • 32 GB of RAM
    • 500 GB HDD

Data warehouse

  • A single PostgreSQL server
    • RedHat Linux
    • 64 CPUs
    • 128 GB of RAM
    • 4x 6TB HDD in RAID 0

Key points here are that TerramEarth has its infrastructure in a single location (US west coast). Vehicle data is uploaded to the data warehouse as gzipped CSV files over FTP. Even though preemptive parts stocking has cut unplanned vehicle downtime by 60%, the 3-week-old data means some customers still wait up to 4 weeks for replacement parts.

Business Requirements

  • Decrease unplanned vehicle downtime to less than 1 week
    • The current bottleneck is data collection: most data is only available when a vehicle is serviced. If more vehicles are moved to cellular connectivity, data becomes available in near real time and the feedback loop closes sooner.
    • Can be handled using Cloud Pub/Sub, Cloud IoT, and Cloud Dataflow to capture the cellular data, with analytics done using BigQuery and Cloud Machine Learning (see the sketch after this list).
  • Support the dealer network with more data on how their customers use their equipment to better position new products and services.
    • can be handled by running analytics over the collected data on equipment usage and consumption
  • Have the ability to partner with different companies—especially with seed and fertilizer suppliers in the fast-growing agricultural business—to create compelling joint offerings for their customers.
    • can be handled by building APIs to expose the data externally, managed and secured with Cloud Endpoints or Apigee.
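
As a minimal sketch of the cellular-data capture path suggested above (the project, topic, and telemetry fields are hypothetical):

```python
import json

from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

publisher = pubsub_v1.PublisherClient()
# Hypothetical project and topic for connected-vehicle telemetry.
topic_path = publisher.topic_path("terramearth-prod", "vehicle-telemetry")

record = {"vehicle_id": "TE-001234", "engine_temp_c": 92.5, "fuel_pct": 61.0}
future = publisher.publish(topic_path, json.dumps(record).encode("utf-8"))
print(f"Published message {future.result()}")  # blocks until Pub/Sub acknowledges
```

A Dataflow streaming job subscribed to this topic could then transform the telemetry and load it into BigQuery for analytics and ML.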

Technical Requirements

  • Expand beyond a single datacenter to decrease latency to the American midwest and east coast
    • Can be handled using multi-regional Cloud Storage and other Google Cloud managed services like Pub/Sub, Dataflow, and BigQuery
  • Create a backup strategy
    • data can be easily backed up to Cloud Storage or BigQuery (see the sketch after this list)
  • Increase security of data transfer from equipment to the datacenter
    • data can be transferred over HTTPS (TLS) for better security than plain FTP
  • Improve data in the data warehouse
    • can be handled using BigQuery as the data warehouse instead of a single PostgreSQL server, which is limited in capacity and scalability
  • Use customer and equipment data to anticipate customer needs
    • can be handled by running machine learning models over the collected data
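
As a minimal sketch of the Cloud Storage backup step referenced above (the bucket name, object path, and local dump file are hypothetical):

```python
from google.cloud import storage  # pip install google-cloud-storage

client = storage.Client()
# Hypothetical bucket for nightly data warehouse backups.
bucket = client.bucket("terramearth-dw-backups")

blob = bucket.blob("backups/2021-06-01/warehouse.dump.gz")
blob.upload_from_filename("/tmp/warehouse.dump.gz")  # hypothetical local dump
print(f"Uploaded to gs://{bucket.name}/{blob.name}")
```

Object lifecycle rules on the bucket (e.g., transitioning older backups to Nearline or Coldline) can keep backup storage costs down.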

Reference Cellular Upload Architecture

Batch Upload Replacement Architecture

Google Cloud – Mountkirk Games Case Study – v1

Mountkirk Games makes online, session-based, multiplayer games for mobile platforms. They build all of their games using some server-side integration. Historically, they have used cloud providers to lease physical servers.

Due to the unexpected popularity of some of their games, they have had problems scaling their global audience, application servers, MySQL databases, and analytics tools.

Their current model is to write game statistics to files and send them through an ETL tool that loads them into a centralized MySQL database for reporting.

Solution Concept

Mountkirk Games is building a new game, which they expect to be very popular. They plan to deploy the game’s backend on Google Compute Engine so they can capture streaming metrics, run intensive analytics, and take advantage of its autoscaling server environment and integrate with a managed NoSQL database.

So the key here is that the company wants to deploy the new game on Google Compute Engine (not GAE or GKE), configured to scale using managed instance groups, to handle streaming metrics and analytics. It also needs to integrate with a managed NoSQL database (either Cloud Datastore or Bigtable).

Executive Statement

Our last successful game did not scale well with our previous cloud provider, resulting in lower user adoption and affecting the game’s reputation. Our investors want more key performance indicators (KPIs) to evaluate the speed and stability of the game, as well as other metrics that provide deeper insight into usage patterns so we can adapt the game to target users. Additionally, our current technology stack cannot provide the scale we need, so we want to replace MySQL and move to an environment that provides autoscaling, low latency load balancing, and frees us up from managing physical servers.

So the key points here are that the company wants to move to the cloud and use scalable managed services without having to manage physical servers. They also want to move from a self-managed MySQL database to a managed NoSQL database.

Business Requirements

  • Increase to a global footprint
    • Can be handled using Global HTTP load balancer with managed instance groups in each region.
    • Using multi-regional services like Cloud Storage, Cloud Datastore, Pub/Sub, BigQuery
  • Improve uptime – downtime is loss of players
    • can be handled using managed instance groups across zones and regions, coupled with a load balancer, for high availability
    • GCP managed services like Pub/Sub, Datastore provide a high uptime
  • Increase efficiency of the cloud resources we use
    • can be increased by scaling resources according to demand (autoscaling; see the sketch after this list)
    • can be increased further using Stackdriver (now Cloud Monitoring) metrics and monitoring
  • Reduce latency to all customers
    • can be reduced using Global HTTP load balancer, which would route the user to the closest region
    • using multi-regional resources would also help reduce latency
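
As a minimal sketch of the autoscaling mentioned above, attaching a CPU-based autoscaler to an existing managed instance group (the project, zone, and instance group names are hypothetical):

```python
from google.cloud import compute_v1  # pip install google-cloud-compute

# Hypothetical autoscaler targeting an existing managed instance group (MIG).
autoscaler = compute_v1.Autoscaler(
    name="game-backend-autoscaler",
    target="projects/mountkirk/zones/us-central1-a/instanceGroupManagers/game-backend-mig",
    autoscaling_policy=compute_v1.AutoscalingPolicy(
        min_num_replicas=3,
        max_num_replicas=50,
        # Add instances when average CPU utilization exceeds 60%.
        cpu_utilization=compute_v1.AutoscalingPolicyCpuUtilization(
            utilization_target=0.6
        ),
    ),
)

client = compute_v1.AutoscalersClient()
operation = client.insert(
    project="mountkirk", zone="us-central1-a", autoscaler_resource=autoscaler
)
operation.result()  # waits for the zonal operation to complete
```

In practice the same policy would be applied to a MIG in each region behind the Global HTTP load balancer, so every region scales independently with local demand.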

Technical Requirements

Requirements for Game Backend Platform

  1. Dynamically scale up or down based on game activity.
    • can be handled using autoscaling managed instance groups that scale with demand
  2. Connect to a transactional database service to manage user profiles and game state.
    • can be handled using Cloud Datastore, which provides a transactional NoSQL database service and is ideal for user profiles and game state (see the sketch after this list)
    • a relational database is not required, so Datastore should work fine
  3. Store game activity in a timeseries database service for future analysis.
    • can be handled using BigQuery, which provides low-cost data storage (compared to Bigtable) for analytics
    • another advantage of BigQuery over Bigtable here is that it is multi-regional, meeting the global footprint and latency requirements
  4. As the system scales, ensure that data is not lost due to processing backlogs.
    • seems relevant for capturing streaming data and can be handled using Pub/Sub and Cloud Dataflow
  5. Run hardened Linux distro.
    • can be handled using Google Compute Engine, as it is the only compute option that supports running a custom hardened Linux distro
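
As a minimal sketch of storing game state in Cloud Datastore as suggested above (the entity kind and fields are hypothetical):

```python
from google.cloud import datastore  # pip install google-cloud-datastore

client = datastore.Client()

# Hypothetical 'GameState' entity keyed by player ID.
key = client.key("GameState", "player-42")
entity = datastore.Entity(key=key)
entity.update({"level": 7, "score": 18250, "last_login": "2021-06-01T12:00:00Z"})
client.put(entity)  # upserts the entity

fetched = client.get(key)
print(fetched["score"])
```

Datastore transactions (client.transaction()) can be used when a profile update and a game-state update must succeed or fail together.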

Requirements for Game Analytics Platform

  1. Dynamically scale up or down based on game activity.
    • can be handled using GCP managed services which scale as per the demand
    • can be handled using Global HTTP load balancer and autoscaling managed instance groups
  2. Process incoming data on the fly directly from the game servers.
    • can be handled using Pub/Sub for capturing data and Cloud Dataflow for processing the data on the fly, i.e., in real time
  3. Process data that arrives late because of slow mobile networks.
    • can be handled using Cloud Pub/Sub and Cloud Dataflow, whose windowing and watermark support is designed for late-arriving data
  4. Allow queries to access at least 10 TB of historical data.
    • can be handled using BigQuery for storage and analytics
  5. Process files that are regularly uploaded by users’ mobile devices.
    • this is more of a batch-processing need; the files can be stored in Cloud Storage and processed with Cloud Dataflow (see the sketch after this list)
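
As a minimal sketch of the streaming ingestion path described above, using the Apache Beam Python SDK that Cloud Dataflow runs (the project, topic, and table names are hypothetical, and the BigQuery table is assumed to already exist):

```python
# pip install "apache-beam[gcp]"
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)  # use DataflowRunner in production

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            topic="projects/mountkirk/topics/game-events"
        )
        | "Parse" >> beam.Map(json.loads)  # each message is a JSON game event
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            "mountkirk:analytics.game_events",
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

Pub/Sub buffers events during processing backlogs so data is not lost, and Beam's event-time windowing handles data that arrives late from slow mobile networks.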

Reference Architecture

Refer to Mobile Gaming Analysis Telemetry solution
