iStar in Focus: Enhancing Spatial Transcriptomics with High-Resolution Histology for Precise Molecular Mapping

Interviews
Published

May 5, 2024

The Article Link:

Inferring super-resolution tissue architecture by integrating spatial transcriptomics with histology
Interviewee Name

Dr. Mingyao Li

Dr. Mingyao Li is a Professor of Biostatistics and Digital Pathology at the University of Pennsylvania Perelman School of Medicine. She received her PhD in Biostatistics from the University of Michigan in 2005. She was trained as a statistical geneticist, but since joining the faculty at UPenn in 2006, she has gradually transitioned her research from traditional statistical genetics to statistical genomics, and more recently to molecular imaging and digital pathology. The primary goal of her research is to develop computational tools to tackle genetics and genomics related problems using modern machine learning and AI techniques. She is particularly interested in translational research that has a direct impact on human health.

Interviewee Name

Dr. David Zhang

Dr. Daiwei (David) Zhang received his PhD in Biostatistics and Scientific Computing from the University of Michigan in 2021. His doctoral research centered on developing statistical and machine learning methods for analyzing genetics and imaging data. Since joining Mingyao Li’s lab as a postdoc researcher in October 2021, David has focused on creating AI tools for the integrative analysis of spatial omics and pathology imaging data. In August 2024, he will join the Departments of Biostatistics and Genetics at UNC Chapel Hill as a tenure track assistant professor, where his lab will focus on developing AI tools to address challenges in digital medicine.

Regarding the research background and significance, does this work discover new knowledge or solve existing problems within the field? Please elaborate in detail.

Selected as Nature Methods’ 2020 Method of the Year, Spatial Transcriptomics (ST) has revolutionized our understanding of how tissues are spatially organized and how cells communicate with each other. Despite the availability of many ST platforms, none of them offers a comprehensive solution. An ideal ST platform for biological discovery should provide single-cell resolution, encompass the entire transcriptome, cover a large tissue area, and be cost-effective. However, producing such ST data with existing platforms is difficult. To address this challenge, we developed iStar (Inferring Super-Resolution Tissue Architecture; https://www.nature.com/articles/s41587-023-02019-9), an AI tool designed to produce such ideal ST data by leveraging high-resolution information provided by histology images. iStar utilizes an image feature extraction process that mimics a pathologist’s examination of histology images, enabling the virtual prediction of gene expression with near single-cell resolution. The results from iStar offer both detailed views of individual cells and a broader perspective on the full spectrum of gene activity. This tool exemplifies how AI-enabled digital pathology can enhance the analysis of spatial transcriptomics data.

How did the reviewers evaluate (praise) it?

This paper underwent the smoothest peer-review process that I have ever experienced. Even in the initial round of review, one reviewer commented, “The method demonstrated a significant and meaningful advancement for the field. I will recommend accepting with revision as a brief communication in Nature Biotechnology.” Following our first revision, which clarified some of the technical details, the paper was accepted in principle.

If this achievement has potential applications, what are some specific applications it might have in a few years?

The pathology imaging technique provides both highly detailed views of individual cells and a broader look at the full spectrum of how people’s genes operate, which would allow doctors and researchers to see cancer cells that might otherwise have been virtually invisible. As such, iStar can be used to determine whether safe margins are achieved in cancer surgeries and to automatically annotate microscopic images, paving the way for molecular disease diagnosis at that level. In addition, iStar can be utilized to automatically detect critical anti-tumor immune formations called “tertiary lymphoid structures,” whose presence correlates with a patient’s likely survival and favorable response to immunotherapy, which is often given for cancer and requires high precision in patient selection. This means that iStar could be a powerful tool for determining which patients would benefit most from immunotherapy.

Can you recount the specific steps or stages from setting the research topic to the successful completion of the research?

This project started in October 2021, when David joined my lab as a postdoc. Initially, our goal was to predict spot-level gene expression in histology images for samples lacking gene expression data. Despite facing many challenges and early setbacks, David’s persistence led to innovative approaches for extracting information from the histology images. After many attempts, we landed on a hierarchical vision transformer approach for image processing, which computationally mimics how a pathologist examines a histology image in the clinic. As I will detail in my answer to the next question, we discovered that this approach effectively extracts features from histology images, which prompted us to shift our focus. Rather than focusing on out-of-subject gene expression prediction, we focused on enhancing the spatial resolution of gene expression in the 10x Visium platform, the most popular platform for ST. By October 2022, we had developed a prototype of the algorithm. The subsequent phase of the project involved identifying biologically and clinically relevant questions that could be explored using our predicted super-resolution gene expression. We engaged with many collaborators, including pathologists, cancer biologists, nephrologists, and neurologists, to review the results and provide biological interpretations. These collaborative efforts significantly enhanced the project’s impact. Without those interesting case studies, this paper wouldn’t have appeared in Nature Biotechnology.

Were there any memorable events during the research? You can tell a story about anything related to people, events, or objects.

A particularly memorable event occurred during a meeting in July 2022 when David presented a tissue segmentation result that utilized hierarchically extracted histology image features. Upon seeing that result, I immediately realized the potential of these features to predict gene expression. Since the features were extracted at the super-pixel level, approximately equivalent to the size of a single cell, it became apparent that we could use these features to learn the relationship between histology and gene expression. This relationship would allow us to predict near single-cell level gene expression not only within directly measured spots but also in tissue gaps lacking spot coverage.
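
To make the idea concrete, here is a minimal toy sketch of learning a histology-to-expression relationship from spot-level measurements and then predicting at every super-pixel, including the gaps between spots. It is an illustration only and not the iStar implementation: the interview does not spell out the model, so a plain linear least-squares fit stands in for it, and the grid size, feature dimension, spot layout, and all variable names are assumptions made up for the example. The data are simulated and noise-free.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy tissue: a 20 x 20 grid of super-pixels with 8 image features each.
H, W, D = 20, 20, 8
features = rng.normal(size=(H, W, D))

# Simulated ground truth (assumption): expression is linear in the image features.
true_w = rng.normal(size=D)
true_expr = features @ true_w  # shape (H, W)

# Spots are 3 x 3 blocks placed every 5 super-pixels, so gaps remain between them.
spot_masks = []
for r in range(0, H - 2, 5):
    for c in range(0, W - 2, 5):
        mask = np.zeros((H, W), dtype=bool)
        mask[r:r + 3, c:c + 3] = True
        spot_masks.append(mask)
spot_masks = np.stack(spot_masks)  # shape (n_spots, H, W)

# Only spot-level totals are observed: the sum of expression over each spot.
spot_expr = np.array([true_expr[m].sum() for m in spot_masks])

# For a linear model, summing predictions over a spot equals predicting from
# the summed features, so the fit reduces to ordinary least squares.
spot_feats = np.array([features[m].sum(axis=0) for m in spot_masks])
w_hat, *_ = np.linalg.lstsq(spot_feats, spot_expr, rcond=None)

# Predict at every super-pixel, both inside spots and in uncovered gaps.
pred = features @ w_hat
gaps = ~spot_masks.any(axis=0)
gap_error = np.abs(pred[gaps] - true_expr[gaps]).max()
print(f"{int(gaps.sum())} gap super-pixels, max abs error {gap_error:.2e}")
```

Because the simulated relationship is exactly linear and noise-free, the fit recovers it from the spot totals alone. Real histology and expression data would need a far more flexible model and careful validation.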

Is there a follow-up plan based on this research? If so, please elaborate.

iStar’s capabilities can be expanded in many different ways. Currently, we are exploring the use of ST data with single-cell resolution to train models that predict gene expression. We are also investigating the potential for predicting gene expression in large tissues that exceed the capture area of standard ST platforms. Another exciting direction is the reconstruction of 3D tissue volumes for gene expression analysis. Our ultimate goal is to enable virtual ST on tissue samples from biobanks. If successful, this would facilitate the development of diagnostic and prognostic models for diseases, generate spatial molecular tissue references, and enhance disease diagnosis, treatment recommendation, and personalized medicine.

Without a doubt, AI is one of the hot topics of 2023, requiring extensive data support in its development. What assistance can biostatistics offer to the development of AI?

Both David and I received our PhDs in Biostatistics from the University of Michigan. During our PhD studies, we worked on imaging and genetics problems, which trained our computing skills, a critical asset for our transition into AI research. Biostatisticians, especially those with strong computing abilities and experience in handling large datasets, can perform as well as those trained in computer science. The statistical training we received, coupled with our domain knowledge, is invaluable for identifying the most pertinent biomedical problems to address using AI approaches.

Besides the above questions, is there anything else about this achievement that you would like to add? If so, please add it below.

Domain knowledge is crucial in biomedical research. The impact of a paper largely depends on the significance of the question it addresses. Although the techniques employed are important, their importance is secondary to that of the scientific question. Often, the emphasis is mistakenly placed on the technical aspects of research, which may limit its relevance primarily to statisticians. To engage a broader audience, it is essential to communicate in the language of domain experts and to identify the most pressing questions. Acquiring domain knowledge requires years of experience. We are fortunate to work at the Penn School of Medicine, a hub for collaborative and translational research that shapes our research approach and deepens our domain knowledge.

Edited by: Shan Gao
Proofread by: Hongtu Zhu