Software Employs Deep Learning to Analyze T-Cell Receptor (TCR) Sequencing Data
Researchers at the Bloomberg~Kimmel Institute for Cancer Immunotherapy at the Johns Hopkins Kimmel Cancer Center have developed a new software package that employs deep-learning algorithms to analyze T-cell receptor (TCR) sequencing data. T-cell receptors, found on the surface of immune T cells, bind to certain antigens, or proteins, found on abnormal cells, such as cancer cells and cells infected with a virus or bacterium, and guide the T cells to attack and destroy the affected cells. The research was published in Nature Communications.
“DeepTCR is an open-source software that can be used to answer questions in research into infectious disease, cancer immunology and autoimmune disease; any place where the immune system has a role through its T-cell receptors,” said lead study author John-William Sidhom, an MD/PhD student at the Johns Hopkins University School of Medicine and Department of Biomedical Engineering working in the Bloomberg~Kimmel Institute for Cancer Immunotherapy.
DeepTCR is a comprehensive deep-learning framework that includes both unsupervised and supervised deep-learning models that can be applied at the sequence and sample levels. Sidhom says the unsupervised approaches allow investigators to analyze their data in an exploratory fashion, where there may not be known immune exposures, while the supervised approaches allow investigators to leverage known exposures to improve the learning of the models. As a result, he says, DeepTCR will enable investigators to study the function of the T-cell immune response in basic and clinical sciences by identifying the patterns in the receptors that enable the T cell to recognize and kill pathological cells.
One of the main challenges of analyzing TCR sequencing data is distinguishing meaningful sequencing data from inconsequential data, and DeepTCR helps perform this analysis. “There are a lot of sequences in someone’s immune repertoire. There are a lot of pathogens that someone can be infected by, so the immune response is very broad. As a result, there is a sea of noise in the immune response, and only parts of it are important at a certain time for a certain infection,” Sidhom explains. “I may have T-cell responses to a thousand different viruses, but when the flu impacts me, I only need to utilize a small subset of those T cells to fight the flu. The main thing that the algorithm can do is isolate and match the right T cells to specific responses.”
The software package, which employs a type of deep-learning architecture called a convolutional neural network, provides users the ability to find T-cell sequencing patterns that are relevant to a specific exposure, like a flu infection, a cancer or an autoimmune disease.
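For readers curious about what a convolutional layer does with a receptor sequence, the following is a minimal sketch in plain Python. It is not DeepTCR's code or API. It assumes each amino acid is one-hot encoded, that a single filter is set by hand to a hypothetical three-residue motif ("SLG"), and that the two example sequences are illustrative stand-ins rather than data from the study. In a trained network, the filter weights would be learned from data rather than written by hand.

```python
# The 20 standard amino acids, used as the one-hot alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Encode a sequence as a list of 20-dimensional one-hot vectors."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in seq]


def motif_kernel(motif):
    """Build a hand-set filter that responds most strongly to `motif`.

    Assumption for illustration only: a real network learns these weights.
    """
    return one_hot(motif)


def conv1d(encoded, kernel):
    """Slide the filter along the sequence and score every window."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        score = sum(
            x * w
            for row, kernel_row in zip(window, kernel)
            for x, w in zip(row, kernel_row)
        )
        scores.append(score)
    return scores


def max_pool(scores):
    """Keep the best window score, wherever the motif occurs."""
    return max(scores, default=0.0)


if __name__ == "__main__":
    kernel = motif_kernel("SLG")  # hypothetical motif
    for cdr3 in ["CASSLGQAYEQYF", "CASSPDRGNTEAFF"]:  # hypothetical sequences
        print(cdr3, max_pool(conv1d(one_hot(cdr3), kernel)))
```

The pooled score is highest for a sequence that contains the motif anywhere along its length, which is the sense in which a convolutional filter detects a pattern regardless of its position.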
“When presented with a lot of data, our algorithms can learn rules of these TCR sequence patterns. For example, we may not know the rules for how the body responds to flu, but with enough data, our software can learn those rules and then teach us what they are,” says Sidhom. “It is very well-suited to identify complex patterns in a very, very large immune repertoire to identify the interacting partners between a T-cell receptor and its antigen.”
In addition to Sidhom, others participating in the research were H. Benjamin Larman, Drew M. Pardoll and Alexander S. Baras.
The research was supported by the Bloomberg~Kimmel Institute for Cancer Immunotherapy, the Mark Foundation for Cancer Research, philanthropy of Susan Wojcicki and Dennis Troper in support of Computational Pathology at Johns Hopkins, the Johns Hopkins–Bristol-Myers Squibb Immuno-Oncology Consortium, and the National Institutes of Health Cancer Center Support Grant.