Harnessing genomics and machine learning to understand evolution

July 16, 2018

BIDS Data Science Fellow and Molecular Biologist Ciera Martinez studies the molecular mechanisms that guide plant and animal development. Her work furthers our understanding of genome evolution and the genetic basis of human disease.

How do organisms get their shape?

This is the overarching question behind all my research. The broad answer to this question is evolution. A whale has fins because it evolved fins to swim in the ocean. A plant evolved flat green structures aimed at the sun to optimize photosynthesis. Evolution is the process by which the architecture of life is built. I have spent the last ten years dissecting how.

It began with trees

I was intrinsically enchanted with trees. These extremely large organisms dominate much of our landscapes, and their sprawling psychedelic forms spurred my imagination from a very early age. So from a personal perspective, it made sense that I gravitated to studying plant developmental evolution. From a developmental perspective, plant architecture is just as awe-inspiring, because:

  1. Plant architecture is insanely diverse, both between and within species;
  2. Plants cannot move, therefore their form needs to modulate and react to their environment throughout their entire lifetime; and
  3. This modulation is largely achieved by the proliferation of plant stem cells, which exist throughout the entire life of the plant.

Stem cells have the ability to turn into any type of cell. For my PhD, I explored how stem cell fate is maintained and directed, specifically in leaves. Using molecular techniques, I focused on how DNA directs these processes. Check out the papers on my Google Scholar page.

Non-coding DNA and enhancers

For my PhD, I focused on what most biologists focus on when studying DNA - genes - but genes represent only a tiny portion of genomes. For instance, the human genome is only 3% genes, and the rest of the genome is DNA that doesn't code for proteins at all. This latter 97% of the genome is the mysterious non-coding region. Although we know little about this region, we do know that it can be extremely important, both for human disease and - most fascinating to me - for development. I chose to spend my post-doc exploring the wild west of genomes: these vast non-coding regions.

How and why I turned to flies

Plants have enormous and wildly complex genomes, so to study non-coding regions, simplifying the system was my only hope. So I did something I thought I would never do: I left plant research and found the perfect system to study non-coding regions, fruit flies (Drosophila). Fruit flies have tiny genomes and ample genetic tools.

The functional non-coding regions of genomes are slowly being classified, and they play enormous roles in regulating the coding (gene) regions. I study a particular class of non-coding regions called enhancers. Enhancers give directions to the gene regions on where and when to code for proteins. I aim to answer two main questions: 1. How do enhancer regions modulate and evolve? and 2. Is there a hidden enhancer syntax? I am approaching these questions with a combination of comparative genomics, microscopy, molecular biology and data science.

Data Science is a call for universal collaboration

It was definitely not in my plan to become obsessed with programming and large amounts of data, but here I am. Here we all are. Every field is being inundated with large amounts of data, which is turning people into data-obsessed scientists. Data is alluring. Hidden within the data are the answers to whatever we happen to care about most; we just need the tools. The handling and exploring of data has become a field in and of itself - data science - composed of disparate fields, all learning from each other. It has begun to widen how scientific disciplines interact. Not only has data science changed how I think about my scientific questions, it has widened my perspective on how to approach scientific work. I am fortunate enough to be a fellow at the Berkeley Institute for Data Science (BIDS), which unites all of us who are data-obsessed. We are all also passionate about inclusivity within data science, because if data science is truly a call for universal collaboration, we need the data and tools to be open to all, the education to be clear and easily accessible, and the community to be inclusive of a diverse range of people. Bring on the data!

Featured Fellow

Ciera Martinez

Molecular and Cell Biology