Episode Transcript
0:15
Pushkin. I'm
0:27
Jacob Goldstein and this is What's Your Problem, the
0:29
show where I talk to people who are trying
0:31
to make technological progress. My
0:33
guest today is Manolis Kellis.
0:36
Manolis is a professor of computer
0:38
science at MIT, and he works
0:40
in computational biology. It's
0:43
a field where researchers take giant
0:45
data sets relating to things like genetics
0:48
and health outcomes and try and understand
0:50
basically what's going on, things
0:53
like what are the cellular mechanisms
0:55
of disease and how can we intervene to
0:57
keep people healthy. In particular, Manolis's
1:00
research focuses on genomics and
1:02
a related field called epigenomics.
1:05
Here's how Manolis explains
1:06
what that means. What's
1:10
extraordinary with genomics is
1:12
that we can see beyond the
1:15
limits of human imagination. We're
1:18
talking about millions of cells across
1:20
hundreds of people, across thousands of genes,
1:23
and we can now look at how
1:25
the single genome manifests
1:28
in every cell type of the human body
1:31
in a slightly different way to create
1:33
this extraordinary symphony
1:36
that is the human life, that is human
1:38
thought, that is human understanding, cognition, and
1:41
every biological process. That ability
1:44
to now start understanding the building
1:47
blocks of how this human
1:49
genome manifests
1:51
into all of these myriad of cell types
1:54
and their interactions and their combinations
1:57
and their coordination and their communication
2:01
is what we can do for the first time. They're also
2:03
giving us the entry points for
2:05
understanding the basis
2:08
of human variation, the basis of human
2:10
disease, and the basis for reversing
2:13
human disease.
2:14
So that is the very big picture
2:16
view of what Manolis does. In
2:18
our conversation, we got into a lot more
2:20
detail. For one thing, Manolis
2:22
talked about his work on obesity, and
2:25
that work is based on epigenomics,
2:27
which is basically the way in which
2:30
different genes are turned on and
2:32
off, and this turns out to be a really
2:34
big deal. Manolis and I also
2:36
talked about his work on Alzheimer's
2:38
disease. In that part of the conversation,
2:41
he talked about how he and his colleagues are trying
2:43
to find these key biological
2:45
pathways that contribute to lots
2:48
of different diseases, and how they're trying
2:50
to come up with drugs to target those
2:52
pathways. We started our
2:54
conversation by talking about Manolis's
2:56
early work on the human genome, which
2:59
led to the work he's doing now.
3:00
So the human genome was mapped by nineteen ninety
3:02
nine or two thousand and three, depending on how you count.
3:05
And then we had all of the nucleotides,
3:07
all of the letters, three point two billion letters.
3:11
Then the hard part begins, how
3:14
do you make sense of that book? So that was the Book
3:16
of Life. So we had all of the letters, how do you make
3:18
sense of the book? My own
3:20
PhD was developing evolutionary
3:23
signatures for understanding systematically
3:26
the human genome. So how do you recognize
3:30
where are the protein coding parts? What are
3:32
the parts that code for protein? We didn't even
3:34
know.
3:34
And just to be clear, sort
3:36
of non intuitively, most
3:39
of the human genome is
3:41
not protein coding, right? Like, there's
3:43
this very basic idea that
3:45
like, oh, sure the genome, that's what codes for proteins,
3:47
but in fact, most of the genome is not doing
3:50
that.
3:51
Ninety eight percent of the
3:53
human genome does not code for protein.
3:55
It's wild. That is so nonintuitive,
3:58
correct.
3:59
So in that mysterious
4:01
ninety eight percent of the genome lie
4:04
control regions that are
4:06
responsible for turning genes on and off. And
4:10
that's where ninety
4:12
three percent of the disease associated
4:15
genetic variants are sitting.
4:17
Huh, it's not the genes
4:19
that actually code for proteins, it's the genes
4:21
that control when are proteins
4:24
made, when are they not made, how much are they made.
4:26
That's exactly right.
4:26
Okay, so I get that in a broad
4:29
sense. That's
4:31
sort of the state of affairs when you're coming
4:33
into the field.
4:34
That's exactly right. So I wrote a series
4:36
of papers, both as a student and
4:38
as a faculty member that sought to then
4:41
uncover how to even parse
4:43
the genome, how to even start understanding reading
4:45
that book of life. So that's one part.
4:48
The second part is where the regulatory
4:50
motifs are. What are regulatory motifs?
4:52
They are the short words of the
4:54
language of DNA that are
4:57
bound by regulators
5:00
to turn genes on and off. So there's these
5:02
regulatory regions, and within these
5:04
regions lie these words
5:07
which are the regulatory motifs.
5:09
And just to be clear, the regulatory
5:12
motifs are part of what
5:14
determine sort of when and how
5:16
much different genes express different proteins.
5:19
That's exactly right, that's exactly right. And that's
5:21
where the human epigenome comes in. So
5:23
what we needed to now understand is how
5:25
that genome turns to life. So
5:28
you can think of the epigenome as the living genome,
5:30
as the genome. The genome itself
5:32
is static. It's just the book, the tablets, if
5:35
you wish, that Moses brought down from the mountain,
5:37
and then the epigenome is the
5:40
music that gets played from the orchestra. The
5:42
epigenome tells you which
5:44
parts are active in the brain and the
5:46
liver, and the heart and the muscle and so on and so forth.
5:49
So your work on the epigenome is really
5:52
interesting to me. And I know you've done
5:54
some work on obesity and the
5:56
epigenome. Tell me
5:58
about that.
5:59
The strongest genetic association with obesity
6:01
sits in one gene called
6:04
FTO, and FTO
6:06
was renamed fat and obesity
6:09
associated after that discovery, and
6:11
it remained mysterious for seven years.
6:14
People had no idea how that gene
6:16
works.
6:16
You just saw a correlation.
6:17
There was a correlation.
6:18
There was a correlation.
6:19
That's the problem of genetics and the beauty
6:21
of genetics. The beauty of genetics is that it tells
6:23
you what region is responsible
6:26
for disease, regardless of how it functions.
6:29
The downside is that after
6:31
it tells you...
6:31
It's the same thing.
6:33
After it tells you that it has a role, you
6:36
have no idea how it functions. And
6:39
what we showed in
6:41
our work is that
6:43
that region doesn't affect
6:45
the FTO gene at all.
6:47
So like in the middle of a gene,
6:50
there is this whatever series
6:52
of nucleotides,
6:54
but those nucleotides are just randomly
6:57
in the middle of that gene and actually have nothing to do
6:59
with that gene. I didn't even know you could do that.
7:01
Fairly, you can't. So there
7:04
are eighty nine differences, eighty
7:07
nine common variants, common
7:10
genetic variants that are all coinherited. If
7:12
you get an A here, you get all
7:14
of the other, you know, A, C, T, G,
7:16
you get that package. If
7:19
you get that package, it spans fifty thousand
7:21
letters. But there are only eighty nine differences
7:23
in these fifty thousand letters.
7:25
Wow, and these will increase
7:28
your body weight
7:30
by one standard deviation, which
7:33
is like how much? It's like nine pounds. Like, it's a
7:35
lot. Okay. So basically what
7:38
that does is that it functions
7:41
somehow to increase your risk for obesity. It's
7:43
like the strongest genetic association by far.
7:46
And what we reasoned
7:49
is, how could it be acting? It could be acting in your
7:51
brain to decide whether you like sweets or
7:53
salty. It could be acting in your muscle to make
7:55
you more fit or less fit. It could be acting
7:57
in your digestive system. So we
7:59
basically said, okay, well that's speculation.
8:02
Let's look at the data. And we looked at the data
8:04
and we found that there was this massive
8:06
control region that was active
8:09
in mesenchymal stem
8:11
cells. What are mesenchymal stem cells? They
8:13
are the progenitors of brown
8:16
fat and white fat.
8:20
Now, white fat is white because
8:22
it's full of lipids. That's
8:24
where all the calories are stored. Brown
8:27
fat is brown because of all of the iron
8:29
in the mitochondria. That's where the calories
8:31
are burned. So it turns out that
8:33
our fat cells make a developmental
8:36
decision in their first three days of differentiation
8:39
to go down the white fat lineage or
8:41
the brown fat lineage. And
8:44
what the white fat does is it stores
8:46
energy and brown
8:49
burns energy. So
8:51
it turns out that I'm actually homozygous risk for
8:54
the store calories position, which
8:57
is the obesity risk.
8:58
So you have the obesity.
9:00
I have two copies of the obesity risk. My
9:02
wife has zero. I can tell you,
9:05
you know, we look very different. Fair. So
9:10
we basically realized that it sits
9:12
in the progenitor cells of white and
9:14
brown fat, and then
9:16
we could show that
9:18
the true target was not the FTO gene
9:20
at all. It was instead two
9:22
other genes that are sitting one point
9:25
two million letters away from
9:28
this region and six hundred thousand
9:30
letters away, and those genes turned
9:32
out to be master controllers
9:35
of thermogenesis. They
9:37
are basically switching your
9:40
metabolic state. So my
9:43
cells are stuck on the store
9:45
position and
9:47
my wife's cells are stuck on the burn position.
9:50
And so what is the relationship between
9:52
the genes that are acting
9:55
here and this
9:57
you know, package variant that is
9:59
far away from them.
10:00
It comes back to the epigenome. So
10:03
our DNA is stored
10:06
inside a tiny little space. The
10:09
way that gene regulation works is that you have
10:12
these control regions that are scattered
10:14
throughout the region of
10:16
every gene that are linked together
10:18
to that gene in three dimensions. So
10:20
they loop around and...
10:22
So it's far away if you think of it
10:24
as a strand but in three dimensional space,
10:27
right there, three dimension pats right, Ah,
10:30
that's satisfying.
10:32
And when we took these genes and
10:34
we modulated them, we showed
10:37
that you can turn off one
10:39
gene in mouse, in
10:42
specifically the adipocytes of mouse
10:45
with a dominant negative... that is, in
10:47
fat cells with a dominant negative
10:50
construct, and that turned
10:52
the mouse fifty percent leaner. They
10:55
eat the same amount, they exercise
10:57
the same amount, but they burn calories
11:00
when they're awake and they burn calories
11:03
when they're sleeping. And
11:05
what's really fascinating with that story
11:07
is that the variant associated
11:09
with obesity is at two percent
11:12
frequency in Africa, but forty
11:14
two percent frequency in Europe and
11:17
forty four percent frequency in Southeast Asia.
11:20
So it rose from two percent to forty
11:23
four percent maybe
11:25
because of positive selection. Maybe
11:28
it was beneficial to be able to
11:30
store every calorie.
11:31
In places where food is, where food
11:33
is scarce, in moments of famine. Exactly.
11:36
In the out of Africa event, this may have
11:38
been selected for. Or in the you know,
11:40
ice ages, it may have been selected for. And
11:42
it's only after World War two that
11:45
this variant became associated
11:47
with obesity.
11:48
Because food became so abundant.
11:50
And we stopped exercising as much. So
11:53
it's fascinating to see how the environmental
11:55
shift led to a new genetic
11:58
association which is now plaguing
12:00
our society, and of course
12:02
the hope is that by understanding the circuit
12:05
systematically, we can now
12:08
solve so many different
12:11
circuits and ultimately so many
12:13
different pathways and ultimately so
12:15
many different disorders.
12:19
In a minute, Manolis describes how
12:21
he and his colleagues are trying to turn
12:23
their genomic research into new
12:25
medicines. That's
12:35
the end of the ads.
12:36
Now we're going back to the show.
12:39
Another area where Manolis and his colleagues have
12:41
done a lot of work is on Alzheimer's
12:43
disease. They looked at a common
12:45
genetic variant called APOE4.
12:48
People with two copies of this variant have a
12:50
much much higher risk of getting Alzheimer's,
12:53
and Manolis and his colleagues were trying to figure out
12:55
why. They found that having
12:57
this APOE4 variant was linked to
13:00
problems with moving cholesterol
13:02
around in the brain, a process
13:05
called cholesterol transport.
13:07
Then they did experiments in mice that
13:09
found that drugs that restore cholesterol
13:12
transport actually restored
13:14
cognition in the mice. Now
13:17
that's in mice, and Alzheimer's
13:19
is a notoriously difficult disease
13:21
to treat in humans. So I
13:23
asked Manolis what it would take to
13:26
move his research from mice to humans,
13:28
and his answer was really interesting. It
13:30
pointed not only to ideas about treating
13:33
Alzheimer's, but to bigger ideas
13:35
about treating human disease more generally.
13:38
The way that I'm thinking about this, the way that our team
13:40
is thinking about this, is how
13:42
do we enable personalized medicine
13:45
and precision medicine. Namely,
13:48
Alzheimer's is not going to be only about transport.
13:51
It's going to be a combination. Every person
13:53
has some combination of these dysregulations.
13:56
APOE4 is the strongest genetic
13:59
risk, but there are many others.
14:01
And the question is how do we now
14:04
systematically take a person with Alzheimer's,
14:07
or take a family with risk, develop
14:09
treatments that are either directly addressing
14:12
the root causes rather than
14:14
treating the symptoms, and
14:16
are not only preventative but
14:20
adapted to every family and every
14:22
person.
14:22
And just to be clear, like having you know,
14:24
two copies of the APOE4 allele
14:28
is neither necessary nor sufficient to get Alzheimer's.
14:31
Right, that's exactly right. You can have both of them and not get it. You
14:33
can have neither of them and get it. Exactly.
14:36
So complicated. Hard.
14:37
So, as with everything with human disease, genetics
14:40
is not destiny. Genetics is
14:42
a predisposition, and there are environmental
14:45
factors. There are behavioral factors, there
14:47
are nutritional and exercise factors,
14:49
there are socio economic factors. There's so many
14:52
other factors that are affecting how your
14:54
genetics will manifest ultimately
14:56
into disease. But
14:59
now the question is for every person,
15:01
how do we create a drug?
15:04
And it's not going to be feasible
15:06
economically or in any other
15:08
way to create one
15:10
pill for each person. The way that
15:12
we're going to enable personalized medicine is by
15:15
understanding what are the hallmarks of
15:17
disease, what are the hallmarks of Alzheimer's,
15:19
the hallmarks of obesity, the hallmarks of diabetes,
15:22
the hallmarks of cardiac disorders, and
15:25
develop therapeutics for every one
15:27
of those hallmarks. So think of it as an
15:29
arsenal of twelve or twenty
15:32
different drugs for Alzheimer's that
15:34
you're going to be taking a combination of it.
15:37
Seems like oncology is already
15:39
some way down that road, right, I
15:41
mean, you know, HER2-positive
15:43
breast cancers have certain drugs that target
15:46
them, that sort of thing, right? Is that the model?
15:48
That's exactly the model. So the hallmarks
15:50
of cancer have been the way
15:53
of thinking about cancer for twenty plus years
15:55
now. And the difference
15:57
in cancer is the following. Cancer
16:00
is subject to positive
16:02
selection. What does that mean? That
16:04
means that because it's a replicative
16:07
disorder where the cell, the
16:09
cancer cells make more of themselves.
16:12
If a cell acquires a mutation
16:14
that allows it to replicate faster, you
16:17
will have more of that cell. So
16:20
you are subject to positive selection
16:22
where the bad mutations are
16:25
increasing in frequency
16:27
in every generation of the cancer. By
16:30
contrast, most complex
16:33
disorders are subject to purifying
16:35
selection, where the mutations that
16:37
are responsible for them are maintained
16:39
at low frequency by evolution.
16:42
Huh.
16:43
So it's working at the opposite
16:45
ends of the evolutionary spectrum. So
16:48
cancer has a small number of genes
16:50
that drive the disorder. Complex
16:53
traits have thousands of genes that are
16:55
maintained at low frequency or
16:58
at weak effects.
16:59
Except that sounds much
17:01
harder. It's harder to figure out what's going on. Harder.
17:05
But the saving grace is that even
17:07
though you have extreme heterogeneity
17:10
in the number of drivers, for every one of these
17:12
disorders, they coalesce,
17:16
they cluster, they converge
17:19
in a small number of recurrent
17:21
pathways, and these are the hallmarks.
17:24
Huh.
17:25
So you can find multiple genes associated
17:27
with lipid transport, you can find multiple
17:30
genes associated with neuroinflammation, with DNA
17:32
damage...
17:33
So you target the sort of pathways where they
17:35
converge.
17:35
That's exactly right. So we're not going to make a drug
17:38
for Alzheimer's, but we might make a drug
17:40
for DNA damage, a drug for
17:43
lipid metabolism, a drug for cholesterol
17:45
transport, et cetera. And that's
17:47
what we're working on.
17:48
That's satisfying. That's a satisfying
17:50
explanation.
17:51
It basically says that it is a
17:53
limited number. There's eight billion people
17:55
on the planet. We're not going to have eight billion drugs.
17:58
What we're going to have is a small number of drugs,
18:01
one for each pathway, and these
18:03
are sometimes going to be actually reused
18:05
between different disorders. So we
18:08
work on cardiac disorders, we're finding
18:10
the same genes underlying
18:12
Alzheimer's, and specifically
18:14
the lipid and cholesterol component are
18:16
in fact reused in the heart disease.
18:19
And again it's about lipids. It's about
18:22
saturation of the fat
18:24
stores of an individual and now the
18:26
lipid escaping into the
18:28
bloodstream, forming these plaques
18:31
that will then cause heart
18:34
you know, failure and heart damage and so on and so forth.
18:36
So that's where we're at.
18:39
So is there. I
18:41
mean, the dream is that there is some dysfunction
18:44
that is common to all these different diseases
18:47
that you could target, right, Like, I
18:50
mean, the naive dream is find
18:52
the cure for everything, or not everything, but find the
18:54
cure for a lot of things, or at least find
18:57
a drug that will reduce risks of
18:59
many different bad things, right, I
19:01
mean, is that plausible or am I just
19:03
naive in going there? From what you're saying.
19:06
So you're right
19:08
that some of the time these
19:11
pathways that we're finding are going to be helping
19:13
on multiple fronts. And
19:15
then that's absolutely the dream. We should basically
19:18
start not with what is the worst disease, but
19:20
maybe what is the best pathway that if
19:22
we fix that one, we're going to have an impact on
19:24
most diseases.
19:25
Right, like the highest return on investments
19:28
for example.
19:28
Like, Yeah, that's a great way to think about it. But
19:32
the way that I would say is that for
19:35
each person, this might be a different
19:37
molecule.
19:39
So now I'm not hopeful.
19:43
But that with a small number
19:45
of these molecules, say one hundred, one hundred
19:47
and fifty, two hundred molecules.
19:48
When you say molecule, you mean drug.
19:50
I mean drugs. Right, I mean drugs. Yeah. Basically
19:52
that there's going to be a small number of pathways and
19:55
a small number of these modulators,
19:58
and that those are going to be mixed and
20:00
matched in each person to then
20:02
target a combinatorially large number
20:04
of people.
20:05
Yeah, it just got hard. I know, I know biology
20:07
is hard, but I got hopeful for a
20:09
second.
20:10
There's not going to be a single silver bullet for all
20:13
of those. In fact, for any one of these diseases,
20:15
there's no silver bullet. But the moment you
20:17
build your panoply of fifty silver
20:19
bullets, then you're going to be hitting two hundred
20:21
diseases. That's the beauty of it.
20:24
Fifty bronze bullets. There's no silver bullet,
20:26
but maybe
20:26
you can find it for hearts. Exactly right.
20:30
We'll be back in a minute with the lightning round.
20:43
Now, let's get back to the show. I
20:45
read that you have been an author on more than
20:48
two hundred and thirty papers, which
20:50
is a lot. Which one was the most fun?
20:52
Oh, you know what, why don't I tell you about my very first one?
20:54
Sure?
20:56
And the very first paper was published
20:59
in SIGGRAPH and it now has like two thousand
21:01
citations, And it was about how do we reconstruct
21:04
the surface of an object
21:06
from a cloud of points? So
21:09
you can basically use laser scanning to sort of figure
21:11
out points in three D and then
21:13
the question is what is the surface that goes between
21:15
them. I've always been fascinated with three D
21:17
space, so it was very fun for me to
21:19
just like you know, as a kid, basically as
21:22
a freshman, to
21:24
work on such a project and then showing
21:26
up at the conference. It was in Disneyland,
21:29
so it was my first time in Disneyland as an author of
21:31
a paper.
21:31
Sounds relevant for motion capture, not
21:34
knowing anything about it. When I think of, like, you
21:36
know, people, the
21:38
way they make movies now, exactly, is they put a bunch
21:40
of sensors on people and they move around and
21:42
then you can turn them into a dragon or whatever
21:44
you want.
21:45
Yeah, that's exactly right. So you
21:47
know that paper has been quite influential
21:49
and used for a lot of a lot of different things.
21:52
What's the most overrated Greek island?
21:54
Oh my god, I can tell you about the most underrated.
21:56
Santorini: definitely not overrated. Tons
21:58
of people, but worth it every time. I
22:00
can tell you about my first day in Santorini,
22:03
which is I walked out on this balcony
22:05
and I asked the owner of the restaurant if I can
22:07
take a look at the view and not order anything.
22:10
He said, please be my guest,
22:12
and I walked out, and ten minutes later, I'm like, I can't
22:14
leave. I'm gonna have to order. He
22:17
tells me, ten years ago, I came here to look at
22:19
the view.
22:19
I want you to throw a little bit of shade. I
22:21
want you to get in a little bit of trouble.
22:23
I can't.
22:24
What's one place in Greece I should not... I cannot.
22:28
It's not possible. I
22:31
mean, you know, if you keep insisting, I'll give you another
22:33
twenty amazing places to visit.
22:35
Well, that's fair, that's fair. I did
22:37
what I could do. If everything goes
22:39
well, what problem will you be trying to solve
22:41
in five years?
22:43
I think what I'm trying to solve now of
22:46
actually creating
22:48
these drugs in such a modular, AI
22:51
driven, personalized, reusable
22:54
way, centered on pathways.
22:57
That's going to keep me busy for a long time. And
23:00
I hope that in five years we
23:02
have actually solved a
23:05
big chunk of the platform and
23:07
that we have a few drugs
23:09
in clinical trials. So you know, my dream is
23:12
to take all of these circuits that we have uncovered
23:15
and make a difference for humanity, make
23:17
a difference for you know, my fellow beings.
23:19
That's my big goal.
23:21
Great, it's fun to talk to you.
23:23
I learned a lot, such a pleasure, thank you, and
23:26
I love that you're fearless. You're like, well,
23:29
we're gonna jump into this new topic and find
23:31
out all about it.
23:36
Manolis Kellis is a professor of computer
23:38
science at MIT. Today's
23:41
show was produced by Edith Russolo, edited
23:43
by Karen Shakerdge, and engineered
23:46
by Sarah Bruguiere. You can email
23:48
us at problem at pushkin dot FM.
23:51
I'm Jacob Goldstein. One last thing
23:53
we are going to be taking a break for a couple
23:55
of weeks, but we'll be back with new shows
23:57
in early twenty twenty four. Thanks
24:00
for listening. Happy New Year.