Use Case: Union lightens the load for Delve Bio

The company

Less than a year since Delve Bio spun out from UC San Francisco, it’s already creating a buzz with an innovative diagnostic platform to speed diagnosis of infectious diseases. Its test applies metagenomic next-generation sequencing to analyze all the nucleic acids in a patient sample to sniff out bacteria, fungi, parasites and viruses simultaneously.

The technique, used to diagnose meningitis and encephalitis from samples of cerebrospinal fluid, was initially developed at UCSF, said Brian O’Donovan, head of bioinformatics and computational biology. Now, as an exclusive licensee of the technology, Delve is aiming to modernize and scale it so more patients have access to a single test that can screen for the entire gamut of infectious organisms at once, much faster than traditional methods and from a single sample of cerebrospinal fluid. This contrasts with the current paradigm, in which clinicians order several tests to screen for specific viruses, fungi, bacteria and other infections; often, each targets only a single organism or a small panel of pathogens.

The challenge

O’Donovan said robust, auditable workflow management is essential to achieve the precision the process requires.

“There’s very little nucleic acid in a milliliter of someone’s spinal fluid,” he explained. “So the assay itself is very sensitive. We’re using picogram-level amounts of nucleic acid, so we have to amplify those to get enough material to then load onto a sequencer. And that unbiased amplification also gives you a very good idea of what’s going on in the lab. We need to be able to confidently and accurately adjudicate which sequences are derived from the sample and which are potential artifacts from the assay or environment. It’s a typical signal and noise problem.”
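The source does not describe how Delve separates signal from noise. As a rough illustration only, one approach often used in metagenomic sequencing is to compare each organism's normalized read count in the patient sample against a negative control processed alongside it. The sketch below assumes that approach; the function names, the fold-change threshold, the pseudocount and the taxon labels are all hypothetical.

```python
from typing import Dict


def reads_per_million(count: int, total_reads: int) -> float:
    """Normalize a raw read count by sequencing depth."""
    return count * 1_000_000 / total_reads


def flag_above_background(
    sample_counts: Dict[str, int],
    sample_total: int,
    control_counts: Dict[str, int],
    control_total: int,
    min_fold: float = 10.0,    # hypothetical threshold, not a clinical value
    pseudocount: float = 1.0,  # avoids dividing by zero when the control is clean
) -> Dict[str, bool]:
    """Return True for taxa whose signal clearly exceeds the control background."""
    calls = {}
    for taxon, count in sample_counts.items():
        sample_rpm = reads_per_million(count, sample_total)
        control_rpm = reads_per_million(control_counts.get(taxon, 0), control_total)
        fold = (sample_rpm + pseudocount) / (control_rpm + pseudocount)
        calls[taxon] = fold >= min_fold
    return calls


# Toy numbers: taxon_a is absent from the control, taxon_b appears in both.
calls = flag_above_background(
    sample_counts={"taxon_a": 500, "taxon_b": 40},
    sample_total=1_000_000,
    control_counts={"taxon_b": 35},
    control_total=1_000_000,
)
print(calls)
```

In this toy case, taxon_a stands well above the control while taxon_b is indistinguishable from background, which is the kind of artifact the quote refers to.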

When O’Donovan joined Delve as its “first and only bioinformatics scientist and engineer,” he had already used Flyte for workflow orchestration at his previous company, Freenome.

“I think bioinformaticians are gravitating towards systems like Flyte,” he said. “If you can already write Python code, implementation is easy! It’s very satisfying to just decorate a function and then see it spin up a node.” O’Donovan said the stability of Flyte’s handling of ETL and EDA simplified working with data, “especially for clinical applications, like we’re doing, where traceability and reproducibility are paramount.
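To make "decorate a function" concrete, here is a minimal sketch using flytekit's `task` and `workflow` decorators. The function names and the toy matching logic are hypothetical and stand in for real pipeline steps; the fallback decorator exists only so the sketch still runs as plain Python where flytekit is not installed.

```python
from typing import List

try:
    # Real flytekit decorators, if the library is installed.
    from flytekit import task, workflow
except ImportError:
    # Stand-in so the sketch runs as plain Python without flytekit.
    def task(fn):
        return fn

    workflow = task


@task
def count_matching_reads(reads: List[str], reference: str, min_length: int) -> int:
    """Count reads that are long enough and occur verbatim in the reference."""
    return sum(1 for r in reads if len(r) >= min_length and r in reference)


@task
def matching_fraction(reads: List[str], n_matching: int) -> float:
    """Fraction of all reads that matched; 0.0 for an empty input."""
    return n_matching / len(reads) if reads else 0.0


@workflow
def toy_alignment_wf(reads: List[str], reference: str, min_length: int) -> float:
    # Inside a workflow, tasks are called with keyword arguments.
    n = count_matching_reads(reads=reads, reference=reference, min_length=min_length)
    return matching_fraction(reads=reads, n_matching=n)


# Workflows can be run locally like ordinary functions (keyword arguments only).
fraction = toy_alignment_wf(reads=["ACGT", "TTTT", "GT"], reference="AACGTA", min_length=3)
print(fraction)
```

The type annotations are not optional decoration here: flytekit uses them to define each task's inputs and outputs, which is part of what makes executions traceable.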

“I like to design in silico experiments, where we have a spec for our pipeline and we’re tweaking one of the parameters on, say, the alignment package we use or the allowable edit distance between the DNA strings that we’re comparing. In the case of viral alignments, we might relax that parameter a little bit because viruses evolve quicker than other organisms, so we might not have a good reference sequence for a novel or mutated viral strain. We want to know what impact this alignment stringency has on our clinical sensitivity and specificity.

“We’ll mock up — either in a YAML or a flat file — some kind of parameter sweep over those parameters, and then just press Go and run it,” O’Donovan continued. “Then it’s all cached and auditable. If we make a decision, if we change our clinical or business logic because of that study, we can just reference that execution.

“If you have to write very explicitly typed functions, it’s much easier if the actual registration process does syntax checking, and then you forever have a cached, recorded execution that you can refer to in your QMS documentation. I was trained in genomics and next-generation sequencing, not in document control or technical writing or anything like that. So anything that kind of lightens that load is extremely welcome!”
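The parameter sweep O’Donovan describes can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not Delve's pipeline: a whole-read Levenshtein distance stands in for a real aligner, and the reference, the labeled reads and the thresholds are invented for illustration.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def sweep(
    reference: str,
    labeled_reads: List[Tuple[str, bool]],
    thresholds: List[int],
) -> Dict[int, Tuple[float, float]]:
    """Map each allowed edit distance to (sensitivity, specificity)."""
    results = {}
    for t in thresholds:
        tp = fp = tn = fn = 0
        for read, is_true_hit in labeled_reads:
            called = edit_distance(read, reference) <= t
            if called and is_true_hit:
                tp += 1
            elif called:
                fp += 1
            elif is_true_hit:
                fn += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        results[t] = (sensitivity, specificity)
    return results


# Hypothetical data: three reads truly from the organism (two of them mutated)
# and two unrelated reads.
REFERENCE = "ACGTACGT"
LABELED_READS = [
    ("ACGTACGT", True),
    ("ACGTACGA", True),
    ("ACCTACGA", True),
    ("TTTTACGT", False),
    ("GGGGGGGG", False),
]

results = sweep(REFERENCE, LABELED_READS, thresholds=[0, 1, 2, 3])
for t, (sens, spec) in results.items():
    print(f"max_edit_distance={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On this toy data, relaxing the threshold first recovers the mutated reads and then starts admitting an unrelated one, which is the sensitivity and specificity trade-off the sweep is meant to expose.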

The solution

When he joined the Delve team, O’Donovan had “lobbied pretty hard” for orchestration: A formal workflow, he reasoned, would speed onboarding of researchers and reduce technical overhead.

“We explored launching the workflow as an AWS Batch job and [tried] other options on GCP,” he said. “But tweaking things like memory requirements in AMI or VM specs creates overhead.”

“After reviewing the available options with our Engineering team, including standing up our own Flyte cluster using purely the open source project, we decided that the Union offering was most aligned with our immediate needs and expectations. We met with the Union team several times leading up to the decision and were excited to enter a partnership that included dedicated support and integration of their platform into our existing AWS infrastructure. Throughout the entire process they have been incredibly responsive and insightful, allowing our engineers and scientists to focus more on implementing our core business and clinical logic and less on technical details or overhead relating to our compute cluster.”

Adopting Union relieved Delve of the task of managing infrastructure and Kubernetes. “I’m not gonna lie: Kubernetes is all nuance. If you did a quick assessment of my Slack exchanges with Union dev engineers that help out, they’re mostly helping us with Kubernetes questions. That takes a huge load off of our engineering department building systems that integrate with medical systems and billing. They don’t need to be managing our ephemeral storage on a node here and there.”

The results

“One of the things that attracted me the most to Union: Its learning curve’s not terribly steep, but the yield curve is incredible,” O’Donovan said. “Once people get the underlying concept, it’s incredibly easy and rewarding.”

That ease of use is essential, O’Donovan said, for a team that spans engineers as well as academics with specialized topical knowledge. “I can get a kind of research-to-dev pipeline going where you can show me something in a notebook, and if it looks like it’s yielding something interesting, we can spend a few hours turning that into a Flyte task. And if you do that a few times with a green hire, within a month, they’ll start producing code that they can register on their own. Eventually, dev habits evolve so even early code is almost immediately amenable to execution on our Union cluster.”