Imageomics, at its core, uses biological images to reveal new insights about life. What inspired you to start using images as a tool for answering complex biological questions?

Humans are a very visual species. For centuries, we have learned about the world around us by looking and recording what we see. In fact, the entire scientific method is about looking carefully, as Poincaré wrote in his Science and Method: "The scientific method consists in observation and experiment. If the scientist had an infinity of time at his disposal, it would be sufficient to say to him, 'Look, and look carefully.' But, since he has not time to look at everything, and above all to look carefully, and since it is better not to look at all than to look carelessly, he is forced to make a selection. The first question, then, is to know how to make this selection."

Computation and technology do not change the scientific method. What they do is allow scientists to look more carefully at more things.

In nature, particularly, images are by far the most abundant and readily available source of information, coming from a variety of imaging technologies ranging from satellites to personal smartphones. New data collection technologies, such as GPS, high-definition cameras, autonomous vehicles under water, on the ground, and in the air, camera traps, and crowdsourcing, are generating data about life on the planet that are orders of magnitude richer than any previously collected. Yet, as in many other cases, our ability to extract insight from these data lags substantially behind our ability to collect them. AI can turn these data into a high-resolution source of information about living organisms, enabling scientific inquiry, conservation, and policy decisions.

 

How can imageomics help us better understand how our dietary choices affect biological processes and overall health?

Absolutely! As in many other areas of biology, citizen science and crowdsourcing are an increasing source of data, especially image data. For the first time in history, we have direct access to the images of people’s meals thanks to the “phone eats first” attitude. Combined with a lot of health data and biological data ranging from molecular and microbiome to ecosystems and agricultural systems, we can start unraveling the complex relationships between food, biology, and health at large scale and high resolution using AI methods that are being quickly developed both within imageomics and more broadly.

 

What cutting-edge techniques or tools are you incorporating into your research to advance imageomics?

We are developing a variety of new machine learning (ML) methods focusing on image and multimodal analysis. We are particularly pushing the boundary in knowledge-guided machine learning (KGML), redesigning ML approaches as application-driven ML and adding structured biological knowledge directly into the architecture of the ML models, resulting in models that are not only better but also more interpretable and potentially explainable. Great examples of the types of models and tools that we are developing are BioCLIP, the first ML foundation model for the Tree of Life; phylogeny-guided neural network models that allow phylogeny inference directly from images and generate hypotheses for ancestral species; interpretable ML; and workflows for processing large organism specimen collections.

 

What are the biggest challenges you face in this type of research?

Data infrastructure. 

AI runs on data. Imageomics runs on image data, with many other data modalities (text, geospatial, molecular) as metadata or context, and domain knowledge as the guide. The current state of biological data, particularly non-genomic data, is highly fractured: the data are often inaccessible, disorganized, and mostly not machine-accessible. We need a unified (but decentralized) data infrastructure to enable biological research at scale.

 

What aspect of your work in imageomics excites you the most or has the greatest potential for scientific breakthroughs?

For the potential of AI in science to be truly realized, we must change the entire paradigm of how we design, evaluate, and deploy AI and ML solutions. As argued in a wonderful recent piece by Rolnick et al., current ML culture needs to change to support application-driven AI. Particularly within scientific applications, domain-specific scientific knowledge and constraints need to become an integral part of the models, which should focus on the scientific ‘How?’ and ‘Why?’ rather than the current ‘Who/What?’, ‘Where?’, and ‘When?’.

Among the yet-to-be-explored applications of AI in science, I would put the following challenges at the top: novelty discovery, generation of testable hypotheses, and expanding the space of feasible solutions (just because a genome is realizable doesn’t mean it exists. Or does it?). Particularly in imageomics, AI, and more broadly computation, allows us to see and perceive the world from the perspective of different species, with different senses (such as hyperspectral), and at different resolutions and acuities, quantifying traits we cannot even perceive. The promise of imageomics is making the invisible visible: allowing scientists to see more things more carefully.

 

Collaboration is often key to innovative research. Who do you collaborate with, and how do they contribute to your work?

Many of the big challenges in science, engineering, and medicine are multidisciplinary and can only be addressed and met by a community of collaborating experts. Moreover, it is important to enable, support, educate, and grow the next generation of that community, from undergraduate to postdoctoral (and beyond) researchers. Whether as the director of the Imageomics Institute, of the AI and Biodiversity Change (ABC) Global Center, or of the Translational Data Analytics Institute at OSU, my focus is always on building and enabling the interdisciplinary research community. Specifically for imageomics, which is a new and growing field of science, our core team includes ecologists, evolutionary biologists, biodiversity scientists, AI researchers, data scientists, statisticians, and education experts. The community has already grown to include biodiversity and wildlife managers, as well as digital agriculture, health, and life science researchers, and other computational and engineering scientists and professionals. It turns out that automatically extracting information from (massive amounts of) images is an intuitive approach to understanding any aspect of the world around us. We continue to look and look carefully.