Chronic kidney disease (CKD) affects 780 million people globally—about one in ten people on earth. Goldfinch wants to change that.
Seeking novel therapeutics for CKD patients, Goldfinch Bio is combining genetics and technology to target new drugs and improve clinical treatments. Goldfinch partnered with Loka to develop systems that could run genomic structural variation pipelines on AWS. The results of the collaboration made headlines in the scientific community.
Although Goldfinch Bio was already running on AWS, it lacked in-house data engineering and pipeline management capacity, as well as deep expertise in both cloud technologies and scientific workflow definitions. This gap limited its ability to improve and customize the genomics tools its pipelines depend on.
Loka’s specialized knowledge of HPC, AWS Batch, open-source packages, open data sets and techniques like parallelization, which enables thousands of jobs to run simultaneously, delivered best-in-class solutions faster than Goldfinch could have achieved on its own.
Loka deployed Cromwell, an open-source workflow management system geared toward scientific workflows, on an Amazon Elastic Compute Cloud (Amazon EC2) instance. We launched the AWS infrastructure required to run Cromwell, using Amazon S3, AWS Batch, Amazon RDS, security groups and IAM roles.
Loka’s team identified and transferred input genomics data from Google Cloud to AWS, modifying the input JSONs to use AWS S3 buckets and patching the WDL files to work in AWS.
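As a rough illustration of the input rewriting step, the sketch below walks a Cromwell-style inputs JSON and swaps Google Cloud Storage URIs for their Amazon S3 equivalents. It is a minimal example under stated assumptions, not the code Loka used: the bucket names, the workflow input keys and the helper names (`BUCKET_MAP`, `to_s3_uri`, `rewrite_inputs`) are all hypothetical.

```python
import json
from typing import Any, Dict

# Hypothetical mapping from Google Cloud Storage bucket names to S3 bucket names.
BUCKET_MAP: Dict[str, str] = {"example-gcs-bucket": "example-s3-bucket"}


def to_s3_uri(value: str, bucket_map: Dict[str, str]) -> str:
    """Rewrite a gs:// URI to s3:// when its bucket is in the map; else return it unchanged."""
    prefix = "gs://"
    if not value.startswith(prefix):
        return value
    bucket, sep, key = value[len(prefix):].partition("/")
    if bucket not in bucket_map:
        return value  # leave unmapped buckets alone so they can be reviewed by hand
    return f"s3://{bucket_map[bucket]}{sep}{key}"


def rewrite_inputs(node: Any, bucket_map: Dict[str, str]) -> Any:
    """Recursively rewrite every string value in a parsed inputs JSON document."""
    if isinstance(node, dict):
        return {k: rewrite_inputs(v, bucket_map) for k, v in node.items()}
    if isinstance(node, list):
        return [rewrite_inputs(v, bucket_map) for v in node]
    if isinstance(node, str):
        return to_s3_uri(node, bucket_map)
    return node  # numbers, booleans and nulls pass through untouched


# Hypothetical inputs; real workflow input names will differ.
example_inputs = {
    "ExampleWorkflow.input_bam": "gs://example-gcs-bucket/samples/sample1.bam",
    "ExampleWorkflow.reference_files": [
        "gs://example-gcs-bucket/ref/genome.fasta",
        "gs://unmapped-bucket/ref/genome.dict",
    ],
    "ExampleWorkflow.min_depth": 10,
}

rewritten = rewrite_inputs(example_inputs, BUCKET_MAP)
print(json.dumps(rewritten, indent=2))
```

Leaving unmapped buckets untouched, rather than guessing a destination, makes it easy to spot inputs that still point at Google Cloud after the rewrite.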
Loka then extended Cromwell with new and improved functionality that strengthened its integration with AWS, enhancing flexibility and performance for users and companies.
Loka successfully transitioned the GATK-SV pipeline, initially developed by the Broad Institute for Google Cloud, to operate on AWS. This shift led to notable enhancements in performance, speed and cost efficiency. Loka’s collaboration with Goldfinch, which landed on the July 2022 cover of Science Advances, was further celebrated when our team leader and Loka's CEO shared their insights on the AWS Health Innovation Podcast and co-authored an article about the project with Goldfinch and AWS.
Loka decreased the execution time from more than three days to 1.3 days, cutting it by more than half.
The enhancements not only minimized the effort required but also drastically reduced the total cost of the EC2 instances used.
Companies running on AWS can now run their pipelines markedly faster using the infrastructure Loka developed. (Find deployment instructions on GitHub.)
“Loka jumped in as an extension of our team, supporting our DevOps work and collaborating on the GATK-SV project. The team started contributing right away. I always felt like Loka’s people were on point and focused on the priorities.”