EGlobalTech

Revolutionizing Big Data Access and Analysis in the Cloud

The Need

Our client, a large research agency, automated its systems for storing, disseminating, and analyzing big data about molecular biology, biochemistry, and genomics. These systems support some of the most heavily trafficked government sites in the world, which receive millions of hits daily from medical workers, researchers, and the general public.

Researchers conducting analyses on the agency’s public datasets typically download them to their research facility computers or to their virtual environments in the cloud. The datasets are often massive, measured in terabytes and sometimes petabytes. Because these public, read-only datasets are copied as many times as there are research teams seeking to work on them, download times are long and the process is inefficient. The agency was looking for a way to let researchers access and analyze the datasets faster while freeing up more of its own network capacity.

Our Solution

EGlobalTech (EGT), a cloud service provider, partnered with this agency to help architect, design, and implement new forms of access to significantly improve the efficiency and speed of research on massive biomedical public datasets. To meet the client’s unique need, EGT employed the emerging “bring compute to the data” paradigm, which enabled researchers to create and fund their own cloud computing resources in one of several clouds where the government agency’s datasets reside. We enabled research teams’ cloud computers to directly connect to, read, and analyze public, read-only datasets without copying or downloading. Researchers see the same version of the data, access is immediate, and efficiencies are significant.
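As a rough illustration of the “bring compute to the data” idea, the Python sketch below resolves a dataset to the replica that lives in the same cloud as a researcher’s compute, so the data can be read in place rather than copied. It is only a sketch under stated assumptions: the catalog, the dataset name, the bucket names, and the `resolve_in_place` helper are all hypothetical and are not taken from the agency’s or EGT’s actual implementation.

```python
from dataclasses import dataclass

# Maps a cloud provider to the URI scheme of its object storage.
URI_SCHEMES = {"aws": "s3", "gcp": "gs"}


@dataclass(frozen=True)
class DatasetReplica:
    """One read-only copy of a public dataset hosted in a given cloud."""
    provider: str  # "aws" or "gcp"
    bucket: str    # hypothetical bucket name
    prefix: str    # path of the dataset inside the bucket

    def uri(self) -> str:
        return f"{URI_SCHEMES[self.provider]}://{self.bucket}/{self.prefix}"


# Hypothetical catalog: every name below is invented for illustration.
CATALOG = {
    "example-genomes": [
        DatasetReplica("aws", "example-agency-public", "genomes/v1/"),
        DatasetReplica("gcp", "example-agency-public", "genomes/v1/"),
    ],
}


def resolve_in_place(dataset: str, compute_provider: str) -> str:
    """Return the URI of the replica in the same cloud as the compute."""
    for replica in CATALOG.get(dataset, []):
        if replica.provider == compute_provider:
            return replica.uri()
    raise LookupError(f"no replica of {dataset!r} in {compute_provider!r}")


if __name__ == "__main__":
    # A research team running on GCP reads the GCP-hosted replica directly.
    print(resolve_in_place("example-genomes", "gcp"))
```

Every team on a given cloud resolves to the same replica, which is why researchers see a single version of the data and no per-team copy is made.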

To achieve this result, EGT applied the principle of “cloud-native architectures composed in a cloud-agnostic manner” to develop an initial architecture portable across Amazon Web Services (AWS) and the Google Cloud Platform (GCP), then iteratively refined and operationalized it. EGT applied Infrastructure as Code practices and produced functional, software-defined solutions delivered in time-boxed iterations, while also supporting cloud architecture security engineering from both a compliance and an operational security perspective.

Major highlights of our services include:

  1. Designed a multi-cloud architecture to support a “Platform as a Service for Big Data Access” model

  2. Leveraged tools such as Terraform and Puppet to automate the deployment and operations of cloud environments

  3. Developed a portable architecture with initial support for AWS and GCP and eventual support for other providers such as Microsoft Azure

  4. Developed required security packages, supported the Assessment and Authorization (A&A) process, and helped secure Authority to Operate (ATO)

  5. Provided ongoing SysOps and environment management support to development and DevOps teams

EGT’s extensive experience as a cloud service provider, architecting and migrating applications to the cloud, has enabled us to develop flexible, creative strategies and implementations that account for our client’s unique requirements, extremely complex environments, and security needs.

The Results

Key results of our solution include:

  1. Operationalized multi-cloud environments in AWS and GCP

  2. Developed required FISMA security packages and secured ATO on time

  3. Enabled the “Bring Compute to Data” strategy through a distributable cloud automation platform

Contact us if you would like to learn more about this project.

Copyright 2018 | EGlobalTech | All rights reserved.

