159 - We’re All Gonna Die with Eliezer Yudkowsky

Released Monday, 20th February 2023

Episode Transcript

0:00

I think that we are hearing the last winds

0:02

to start to blow, the fabric of

0:04

reality start to fray, this thing

0:06

alone, cannot end the world,

0:09

but I think that probably

0:12

some of the vast quantities of money being

0:14

blindly and helplessly piled into here

0:17

are going to end up actually accomplishing something.

0:22

Welcome to Bankless, where we explore the frontier

0:24

of Internet money and Internet finance. This

0:26

is how to get started, how to get better, how to front

0:28

run the opportunity. This is Ryan Sean

0:30

Adams. I'm here with David Hoffman, and we're

0:32

here to help you become more

0:35

bankless. Okay, guys. We

0:37

wanted to do an episode on

0:39

AI Bankless.

0:40

We got what we asked for. But I feel like, David,

0:44

we accidentally waded into the deep end

0:46

of the pool

0:46

here. Yeah. And

0:47

I think before we get into this episode, it probably

0:49

warrants a few comments. Mhmm. I'm gonna say a few

0:51

things. I'd like to hear from you too. Yeah. But first

0:53

thing I want to tell the listeners: don't

0:56

listen to this episode if you're not ready

0:58

for an existential crisis. Okay?

1:00

Like, I'm kinda serious about this. I'm

1:03

leaving this episode shaken.

1:06

And I don't say that lightly. In

1:08

fact, David, I think you and I will have some things to

1:10

discuss in the debrief. As far as how this

1:12

impacted you, but this was an impactful

1:14

one and it sort of hit me during the

1:16

recording and I didn't know fully

1:19

how to react. I honestly

1:21

am coming out of this episode wanting to

1:23

refute some of the claims made in this episode

1:25

by our guest, Eliezer Yudkowsky,

1:28

who makes the claim that humanity

1:30

is on the cusp of developing an AI

1:33

that's gonna destroy us and that

1:35

there's really not much we can do to stop

1:37

it. There's no way around

1:38

it. Yeah. I have a lot of respect

1:40

for this guest. Let me say that. So it's not

1:42

as if I have some sort of big brain technical

1:45

disagreement here. In fact, I don't even

1:47

know enough to fully disagree with

1:50

anything he's saying, but the conclusion is

1:52

so dire and so existentially heavy

1:56

that I'm worried about it impacting

1:58

you listener if we don't give you

2:00

this warning going in. I also

2:03

feel like David, as interviewers, maybe

2:05

we could have done a better job. I'll say this

2:07

on behalf of myself. Sometimes I peppered him with

2:09

a lot of questions in one

2:11

fell swoop. Mhmm. And he was probably only

2:13

ready to synthesize one at a time. I

2:16

also feel like we got caught flat

2:18

footed at times I wasn't

2:20

expecting his answers to be so frank

2:22

and so dire, David. Like, it

2:24

was just bereft of hope.

2:26

Mhmm. And I appreciated very much

2:28

the honesty as we always do on Bankless, but

2:31

I appreciated it almost in the way that

2:33

a patient might appreciate the

2:36

honesty of their doctor telling them that

2:38

their illness is terminal. Like,

2:40

it's still really heavy news, isn't it?

2:43

So that is the context going into this episode.

2:45

I will say one thing. In good

2:47

news for our feelings as

2:49

interviewers in this episode, they might

2:51

be remedied because at the end of this episode

2:53

after we finished and hit the record

2:56

button to stop recording. Eliezer

2:58

said he'd be willing to provide additional

3:00

q and a episode with the Bankless community.

3:02

So if you guys have questions and

3:05

if there's sufficient interest for Eliezer to

3:07

answer, tweet us to express

3:09

that interest, hit us in Discord, get

3:12

those messages over to us, and let us know

3:14

if you have some follow-up questions. He

3:16

said, If there's enough interest in

3:18

the community in the crypto community,

3:21

he'd be willing to come on and do another

3:23

episode with follow-up q and a. Maybe

3:25

even a Vitalik and Eliezer

3:27

episode is in store. That's a possibility

3:30

that we threw to him. We've not talked to Vitalik

3:32

about that too, but I just feel a little overwhelmed

3:35

by the subject matter here. And that

3:37

is the basis, the

3:40

preamble through which we are introducing

3:42

this episode. David, there's a few benefits

3:44

and takeaways I wanna get into. But

3:46

before I do, can you comment or reflect on

3:49

that

3:49

preamble.

3:49

What are

3:49

your thoughts going into this one? Yeah. We

3:52

we approached the end of our agenda for Bankless.

3:54

There's an equivalent agenda that runs alongside

3:57

of it. But once we got

3:59

to this crux of this conversation,

4:02

it was not possible to proceed in that agenda

4:04

because what was the point?

4:07

Nothing else mattered. Nothing else really

4:09

matters, which is also just kind of

4:11

relates to the subject matter at hand. And

4:14

so as we proceed, you'll see

4:16

us kind of circle back to the same inevitable

4:18

conclusion over and over and over again, which

4:21

ultimately is kind of the punch

4:23

line of the content. And so

4:25

I'm of a specific disposition where

4:28

stuff like this, I kind of am

4:30

like, oh, whatever. Okay. Just go about my life.

4:32

Other people are of different dispositions and

4:34

take these things more heavily. So

4:37

Ryan's warning at the beginning is: if you are the type

4:39

of person to take existential crises

4:42

directly to the face, perhaps consider

4:44

doing something else instead of listening to this episode.

4:47

I think that is good counsel. So a few

4:49

things, if you're looking for an outline of the agenda.

4:51

We start by talking about chat GPT. Is

4:54

this a new era of artificial intelligence?

4:57

Gotta begin the conversation there? Number

4:59

two, we talk about what an artificial

5:01

superintelligence might look like.

5:03

How smart exactly is it? What

5:06

types of things could it do? That humans

5:08

cannot do. Number three, we talk

5:10

about why an AI superintelligence will

5:12

almost certainly spell the end of

5:14

humanity. And why it'll be really

5:16

hard, if not impossible, according

5:18

to our guest, to stop this from happening.

5:21

And number four, we talk about

5:24

if there is absolutely

5:26

anything we can do about

5:29

all of this. We are heading,

5:31

careening maybe towards the abyss. Can

5:33

we divert direction and

5:35

not go off the

5:36

cliff? That is the question we ask Eliezer.

5:38

David, I think you and I have

5:40

a lot to talk about -- Yeah. --

5:42

during the debrief. Alright, guys. The

5:44

debrief is an episode that we record

5:46

right after the episode. It's available

5:49

for all Bankless citizens. We call this the bankless

5:51

premium feed. You can access that

5:53

now to get our raw and unfiltered thoughts

5:56

on the episode. And I think it's gonna be pretty

5:58

raw -- Mhmm. -- this time around, David. I'm like

6:00

I didn't expect this to hit you so hard, man.

6:02

Oh, I'm dealing with it right now. Really? And

6:04

this is probably, you know, it's not too long after

6:06

the episode. So

6:08

Yeah. I don't know how I'm gonna feel tomorrow, but

6:10

definitely wanna talk to you about this. And

6:12

maybe, yeah, have you I'll put my side

6:14

tabs on it. Please. I'm gonna need some

6:16

help. Guys, we're gonna get right to the episode

6:19

with Eliezer. But before we do,

6:21

we wanna thank the sponsors that made this

6:23

episode possible, including Kraken.

6:25

Our favorite recommended exchange for

6:28

twenty twenty

6:28

three. Kraken has been a leader in the

6:30

crypto industry for the last twelve years

6:32

dedicated to accelerating the global adoption

6:35

of crypto. Kraken puts an emphasis on

6:37

security, transparency, and client support,

6:39

which is why over nine million clients

6:41

have come to love Kraken's products. Whether

6:43

you're a beginner or a pro, Kraken UX

6:45

is simple, intuitive, and frictionless, making

6:48

the Kraken app a great place for all

6:50

to get involved and learn about Crypto. For those

6:52

with experience, the redesigned Kraken Pro

6:54

app and web experience is completely customizable

6:57

to your trading needs. Integrating key trading

6:59

features into one seamless interface.

7:01

Kraken has a twenty-four seven, three sixty

7:04

five client support team that is globally

7:06

recognized. Kraken support is available wherever,

7:08

whenever you need them by phone, chat,

7:10

or email. And for all of you NFTers out

7:13

there, the brand new Kraken NFT beta

7:15

platform gives you the best NFT trading

7:17

experience possible. Rarity rankings, no

7:19

gas fees, and the ability to buy an NFT

7:22

straight with cash. Does your Crypto Exchange

7:24

prioritize its customers the way that Kraken

7:26

does? And if not, Sign up with Kraken at

7:28

kraken dot com slash bankless. Hey,

7:30

bankless nation. If you're listening to this, it's

7:32

because you're on the free bankless RSS

7:35

feed. Did you know that there is an ad free version

7:37

of Bankless that comes with Bankless premium

7:39

subscription, no ads, just straight to

7:41

the content. But that's just one of many things

7:43

that a premium subscription gets you. There's

7:46

also the token report, a monthly bullish,

7:48

bearish, neutral report on the hottest

7:50

tokens of the month. And the regular updates from

7:52

the token report go into the token bible.

7:54

Your first stop shop for every token

7:56

worth investigating in Crypto. Bankless premium

7:59

also gets you a thirty percent discount to

8:01

the permissionless conference, which means it basically

8:03

just pays for itself. There's also the airdrop

8:05

guide to make sure you don't miss a drop in

8:08

twenty twenty three. But really, the

8:10

best part about Bankless premium hanging out with

8:12

me, Ryan, and the rest of the Bankless team

8:14

in the inner circle discord only

8:16

for premium members. Want the Alpha?

8:18

Check out Ben the Analysts DeGen Pit where

8:20

you can ask him questions about the token report. Got

8:22

a question? I've got my own q and a room

8:24

for any questions that you might have. At Bankless,

8:27

we have huge things planned for twenty twenty

8:29

three, including a new website with login

8:31

with your Ethereum address capabilities, and we're

8:33

super excited to ship what we are calling Bankless

8:35

two point o soon TM. So if you

8:38

want extra help exploring the frontier,

8:40

subscribe to Bankless Premium. It's under

8:42

fifty cents a day and provides a wealth of knowledge

8:44

and support on your journey west.

8:46

I'll see you in the Discord. The Phantom wallet

8:48

is coming to Ethereum. The number one wallet

8:50

on Solana is bringing its millions of users

8:53

and beloved UX to Ethereum and

8:55

Polygon. If you haven't used Phantom before, you've

8:57

been missing out. Phantom was one of the first wallets

8:59

to pioneer Solana staking inside the

9:01

wallet and will be offering similar staking features

9:03

for Ethereum and Polygon. But that's just staking.

9:05

Phantom is also the best home for your

9:08

NFTs. Phantom has a complete set of

9:10

features to optimize your NFT experience.

9:12

Pin your favorites, hide your uglies,

9:14

burn the spam, and also manage

9:17

your NFT sale listings from

9:19

inside the wallet. Phantom is of course a multi

9:21

chain wallet, but it makes chain management easy

9:23

displaying your transactions in a human readable

9:25

format with automatic warnings about malicious

9:27

transactions or websites. Phantom has already

9:29

saved over twenty thousand users from getting

9:31

scammed or hacked. So get on the

9:33

Phantom waitlist and be one of the first to

9:35

access the multi chain beta. There's a

9:37

link in the show notes, or you can go to phantom

9:40

dot app slash waitlist to get access in

9:42

late

9:42

February. Bankless Nation, we are super

9:44

excited to introduce you to our next guest. Eliezer

9:47

Yudkowsky is a decision theorist.

9:49

He's an AI researcher. He's the founder

9:51

of the LessWrong community blog, fantastic

9:54

blog, by the way. There's so many other things

9:56

that he's also done. I can't fit this

9:58

in the short bio that we have to introduce

10:00

you to, but most relevant probably

10:02

to this conversation is he's

10:04

working at the Machine Intelligence Research

10:07

Institute to ensure that when

10:09

we do make general artificial intelligence,

10:12

it doesn't come kill us all, or

10:14

at least it doesn't come ban cryptocurrency because

10:17

that would be a poor outcome as

10:18

well, Eliezer. It's great to have you on. How

10:20

are you doing? Yeah. Within one standard deviation

10:23

of my own peculiar little mean. Fantastic.

10:26

You know, we wanna start this conversation with something

10:28

that has jumped onto

10:31

the scene, I think, for a lot of mainstream folks

10:33

quite recently. And that is

10:35

chat GPT. So apparently

10:37

over a hundred million or so

10:39

have logged on to chat GPT quite

10:42

recently. I've been playing with it myself.

10:45

I found it very friendly, very useful. It

10:47

even wrote me a sweet poem that I thought was

10:49

very heartfelt and almost human like.

10:51

I know that you have major concerns around

10:55

AI safety, and we're gonna get into those concerns.

10:57

But can you tell us in the

10:59

context of something like a chat GPT

11:02

Is this something we should be worried about that

11:04

this is gonna turn evil and enslave

11:06

the human race? Like, how worried should we

11:09

be about chat

11:09

GPT? And Bard

11:12

and sort of the new AI that's entered

11:14

the scene recently. ChatGPT itself?

11:17

Zero.

11:18

It's not smart enough to do

11:20

anything really wrong or really

11:22

right either, for

11:23

that matter. And

11:24

what gives you the confidence to say that? How do you know

11:26

this? Excellent question. So every

11:30

now and then somebody figures out how to put

11:32

a new prompt into ChatGPT. You

11:34

know, one time somebody found that it would

11:37

talk — well, not ChatGPT, but one of

11:39

the earlier generations of technology, they

11:41

found that it would sound smarter if you first

11:43

told it that it was Eliezer Yudkowsky. You

11:45

know, there's other prompts too, but that one's one

11:47

of my favorites. So

11:50

there's untapped potential in there that

11:52

people haven't figured out how to prompt yet.

11:55

But when people figure it out,

11:57

it moves ahead sufficiently

12:00

short distances that

12:03

I do feel fairly confident that

12:05

there is not so much untapped potential

12:07

in there that it is going to take over

12:09

the world. It's like making

12:12

small movements. And to take over the world, it

12:14

would need a very large movement.

12:16

There's places where it falls down on predicting

12:18

the next line that a human would

12:20

say, in ways that seem

12:23

indicative of probably

12:25

that capability just

12:27

is not in the giant inscrutable matrices

12:30

or it would be using it to predict the next

12:32

line, which is very heavily what it was optimized

12:35

for. So there's

12:37

going to be like some untapped potential in there,

12:39

but I do feel quite confident that the upper

12:42

range of that untapped potential

12:44

is insufficient to outsmart all of

12:46

the living humans and implement

12:49

the scenario that I'm worried about. So

12:51

even so though, is chat GPT a

12:54

big leap forward in the journey

12:56

towards AI in your

12:57

mind? Or is this fairly incremental.

13:00

It's just for whatever reason it's caught mainstream

13:02

attention. GPT three was a big

13:04

leap forward. There's rumors

13:06

about GPT four, which,

13:08

you know, who knows? Chat

13:10

GPT is a commercialization of

13:13

the actual AI-in-the-lab

13:16

giant leap forward. If

13:18

you had never heard of GPT

13:20

three or GPT two, or

13:22

the whole range of text transformers before Chat

13:25

GPT suddenly entered into your life, then

13:28

that whole thing is a giant leap forward, but it's

13:30

a giant leap forward based on technology

13:33

that was published in, if I recall

13:35

correctly, two thousand eighteen. I

13:38

think the what's going around in everyone's minds

13:40

right now the Bankless listenership and crypto

13:42

people at large are largely futurists. So

13:44

everyone, I think, listening understands

13:47

that in the future. There will be

13:49

sentient AIs perhaps around us,

13:51

at least by the time that we all move

13:53

on from this world. So, like, we all know that this future

13:56

of AI is coming towards us.

13:58

And when we see something like ChatGPT,

14:00

everyone's like, oh, is this the

14:03

moment? In which our world

14:05

starts to become integrated with AI.

14:07

And so, at least, are you, you know, tapped into

14:09

the world of AI? Are we onto something

14:11

here? Or is this just another you know,

14:13

fad that we will internalize and

14:15

then move on for. And then the real

14:17

moment of generalized

14:20

AI is actually much further out than we're initially

14:22

giving credit for. Where are we in this timeline?

14:24

You know, predictions are hard, especially about

14:27

the future. Mhmm. I sure

14:29

hope that This is where it saturates.

14:31

This is like the next generation. It

14:33

goes only thus far. It goes

14:35

no further. It doesn't

14:38

get used to make more

14:40

steel or build better power plants

14:42

first because that's illegal and second

14:44

because the large language model technology's

14:46

basic vulnerability is that it's not reliable. Like,

14:49

it's good for applications where it works eighty percent

14:51

of the time, but not ones that need to work ninety-nine

14:53

point nine nine nine percent of the time. This

14:55

thing this class of technology can't

14:57

drive a car because it'll sometimes crash the car.

15:00

So I hope it saturates there.

15:02

I hope they can't fix it. I hope

15:04

We get like a ten year AI winter after

15:07

this. This is not what I

15:09

actually predict. I think that

15:11

we are hearing the last winds start to

15:13

blow, the fabric of reality start

15:15

to fray. This thing alone cannot

15:18

end the world, but I

15:20

think that probably some

15:23

of the vast quantities of money being

15:25

blindly and helplessly piled into here

15:28

are going to end up actually accomplishing something,

15:30

you know, not most of the money. That just like

15:32

never happens in any field of human endeavor.

15:34

But one percent of ten billion

15:37

dollars is still a lot of money to actually

15:39

accomplish

15:39

something. So I think, listeners,

15:41

you've heard Eliezer's, you know, thesis

15:43

on this, which is pretty dim

15:46

with respect to AI alignment.

15:48

And we'll get into what we mean by AI alignment.

15:51

And he's very worried about AI safety

15:53

related issues. But I think for a lot of

15:55

people to even sort of worry about AI

15:57

safety, and for us to even have that conversation, I

16:00

think they have to have some sort of grasp

16:02

of what AGI looks

16:04

like. That is I understand that to

16:06

mean artificial general intelligence and this

16:09

idea of a superintelligence. Can

16:11

you tell us, like, if there was a superintelligence

16:14

on the scene, what would it look like? I mean,

16:16

is this gonna look like a big chat box

16:19

on the Internet that we can all type things into. It's

16:21

like an oracle type thing or is it like some

16:23

sort of a robot that's going to be

16:25

constructed in a secret government

16:26

lab. Is this like something somebody could

16:28

accidentally create in a dorm room? Like,

16:30

what are we even looking for when we talk

16:32

about the term AGI and superintelligence?

16:36

So first of all, I'd say those are pretty distinct

16:38

concepts. ChatGPT

16:41

shows a

16:42

very wide range of generality compared

16:45

to the previous generations of AI. Not

16:47

like very wide generality compared to GPT

16:49

three, not like literally the lab

16:52

research that got commercialized. That's the same

16:54

generation. But compared to, you

16:56

know, stuff from two thousand eighteen

16:58

or even twenty twenty. Chat GPT

17:01

is better at a much wider range of things without

17:03

having been explicitly programmed by humans

17:05

to be able to do those things. Because

17:08

to imitate a human, as

17:11

best it can, it has to capture all

17:13

of the things that humans

17:15

can think about that it can, which is

17:18

not all the things. It's still not

17:20

very good at long multiplication unless

17:22

you give it the right instructions, in which case it suddenly can

17:24

do it. But, you know, so It's

17:27

like significantly more general than

17:29

the previous generation of artificial minds.

17:32

Humans were significantly more general

17:35

than the previous generation of

17:37

chimpanzees, or rather Australopithecus,

17:40

or our last common ancestor. Humans

17:42

are not fully general. If

17:44

humans were fully general, we'd be as good

17:46

at coding as we are at

17:49

football, throwing things, or

17:51

running. You know, some of us are,

17:54

you know, okay at programming, but, you know, we're

17:56

not specced for it. We're not

17:58

fully general minds. You can imagine

18:00

something that's more general than human. And

18:02

if it runs into something unfamiliar, it's

18:05

like, okay. Let me just go reprogram myself

18:07

a bit, and then I'll be as adapted to this thing as

18:09

I am to, you know, anything else. So,

18:12

ChatGPT is less general than a

18:14

human, but it's like genuinely ambiguous,

18:16

I think. Whether it's more or less general

18:19

than, say, our cousins,

18:21

the

18:21

chimpanzees, or if you don't

18:23

believe it's as general as a chimpanzee, a dolphin,

18:26

or a cat.

18:26

So this idea of general intelligence

18:29

is sort of a range of things that it can actually

18:31

do, a range of ways it can apply itself?

18:34

How wide is it? How much reprogramming

18:36

does it need? How much retraining does it need to

18:38

get it doing the new thing? Mhmm.

18:42

Bees build hives. Beavers

18:44

build dams. A human will

18:46

look at a beehive and imagine a honeycomb

18:49

shaped dam. And that's

18:52

like humans alone in the animal kingdom.

18:55

But that doesn't mean that we are general intelligences;

18:57

it means we're significantly more generally

18:59

applicable intelligences than chimpanzees.

19:03

It's not like we're all that narrow. We can

19:05

walk on the moon. We can walk on the moon

19:07

because there's aspects of our intelligence that

19:09

are like made

19:11

in full generality for universes

19:14

that contain simplicities, regularities,

19:17

things that recur over and over again. We understand

19:19

that if steel is hard on earth, it

19:21

may stay hard on the moon and because of that

19:23

we can build rockets, walk

19:25

on the moon, breathe amid the vacuum.

19:28

Chimpanzees cannot do that, but that doesn't

19:30

mean that humans are the most general possible

19:32

things. The thing that is more

19:34

general than us that figures that stuff

19:37

out faster is

19:39

the thing to be scared of. If

19:41

the purposes to which it turns its

19:43

intelligences are not ones that

19:45

we'd recognize as nice things

19:47

even in the most cosmopolitan and embracing

19:50

senses of you

19:51

know, what's worth doing. And

19:52

you said this idea of a general intelligence is

19:54

different than the concept of superintelligence,

19:57

which I also brought into that

20:00

first part of the question, how is superintelligence

20:02

different than general intelligence? Well,

20:05

because chat GPT has a little bit of

20:07

general intelligence. Humans have more general

20:09

intelligence.

20:11

A superintelligence is something that can

20:13

beat any human and the entire human

20:15

civilization at all the cognitive

20:18

tasks. I don't know if

20:20

the efficient market hypothesis is

20:23

something where I can rely on. Yes, where

20:25

I'll trip investors here. We understand efficient

20:27

market hypothesis for sure. Howard Bauchner: So the

20:29

efficient market hypothesis

20:30

is, of course, not generally true. Like,

20:32

it's not true that literally all the market prices

20:35

are smarter than you. It's not true that all the prices

20:37

on earth are smarter than you. Even

20:39

as the most arrogant person who is at

20:41

all calibrated however, still

20:43

thinks that the efficient market hypothesis is

20:45

true relative to them,

20:48

ninety-nine point nine nine nine nine nine

20:51

percent of the time. They only think

20:53

that they know better about one in a million prices.

20:56

Those might be important prices. Now,

20:58

the price of bitcoin is an important price. It's not

21:00

just a random price. But if the efficient

21:02

market hypothesis was only true to you,

21:05

ninety percent of the time, you could just, like, pick out

21:07

the ten percent of the remaining prices and compound

21:09

like, double your money every day on the stock

21:11

market, and nobody can do that.
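To give a rough sense of why "nobody can do that" is such a strong claim, here is a toy sketch, not from the episode, of what doubling your money every day would compound to; the starting stake and number of days are arbitrary assumptions for illustration.

```python
# Hypothetical illustration: compounding a stake that doubles every day.
# The $1,000 starting stake and 30-day horizon are arbitrary assumptions.
stake = 1_000.0
for day in range(1, 31):
    stake *= 2  # doubling once per day
    if day % 10 == 0:
        print(f"day {day}: ${stake:,.0f}")
# After 30 doublings the stake is over a trillion dollars, which is why
# a market exploitable to that degree is implausible.
```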

21:14

Literally, nobody can do that. So this

21:17

property of relative efficiency

21:20

that the market has to you, that the price

21:23

is an estimate of the future price — it

21:25

already has all the information you

21:27

have, not all the information that exists

21:29

in principle, maybe not all the information

21:32

that the best equity analysts have, but relative

21:34

to you. It's efficient relative to

21:36

you. For you, if

21:38

you pick out a random price like the price of

21:40

Microsoft stock, something where you've got no special

21:43

advantage, that estimate

21:45

of its price a week later is

21:48

efficient relative to you. You

21:50

can't do better than that price. We

21:53

have much less experience with

21:56

the notion of instrumental efficiency. Efficiency

21:58

in choosing because

22:01

actions are harder to aggregate estimates

22:04

about than prices. So

22:07

you have to look at, say, alpha

22:10

zero playing chess, or

22:13

just you know, like stockfish, whatever

22:15

the latest Stockfish number is, an advanced chess

22:17

engine. When it makes a chess

22:19

move, you can't do better than

22:21

that chess move. It may not be the optimal

22:23

chess move, but if you pick a different chess move,

22:26

you'll do worse. That

22:29

you'd call like a kind of efficiency

22:31

of action. Given

22:33

its goal of winning the game, There

22:36

is 159 you know its move, unless

22:38

you consult some more powerful AI than

22:40

Stockfish, you can't figure out

22:42

a better move than that. A

22:45

superintelligence is like that

22:47

with respect to everything, with respect

22:49

to all of humanity. It is relatively

22:52

efficient to humanity. It

22:54

has the best estimates, not perfect

22:56

estimates, but the best estimates, and

22:58

its estimates contain all the information that you've

23:00

got about it. Its

23:01

actions, are the most efficient

23:04

actions for accomplishing its goals. If you think

23:06

you see a better way to accomplish its

23:08

goals, you're mistaken. So

23:11

you're saying this is superintelligence. We'd

23:13

have to imagine something that knows all

23:15

of the chess moves in advance. But here

23:17

we're not talking about chess. We're talking about everything.

23:20

Life. It knows all of the

23:22

moves that we would make and the most

23:24

optimum pattern, including moves that we would

23:26

not even know how to make, and it knows these

23:28

things in advance. I

23:30

mean, how would like human beings sort of

23:32

experience such a superintelligence? I think

23:35

we still have a very hard time imagining something

23:37

smarter than us. Because we've never experienced

23:40

anything like it before. Of course, you know, we

23:42

all know somebody who's genius level

23:44

IQ, maybe quite a bit smarter

23:46

than us, but we've never encountered something

23:48

like what you're describing: some sort of

23:50

mind that is super

23:51

intelligent. What sort of things would

23:54

it be doing like that humans

23:56

couldn't? How would we experience this in the world?

23:58

I mean, we do have some

24:00

tiny bit of experience with it. We have

24:03

experience with chess engines where

24:05

we just can't figure out better moves than they make.

24:07

We have experience with market

24:10

prices, where even

24:12

though your uncle has

24:14

this, you know, like, really long elaborate

24:16

story about Microsoft stock, you just know he's

24:18

wrong. Why is he wrong? Because if he was correct,

24:21

it would already be incorporated into the stock price.

24:24

And this notion and and especially

24:26

because the markets' efficiency is not perfect,

24:28

like that whole downward swing and

24:30

that upward move in COVID. I

24:33

have friends who made more money off that than I

24:35

did, but I like still managed to buy

24:37

back into the broader stock market on the exact

24:39

day of the low, you know, basically coincidence.

24:42

But so the markets aren't

24:44

perfectly efficient, but they're efficient almost everywhere.

24:46

And that sense of, like, deference,

24:49

that sense that your

24:52

weird uncle can't possibly be right

24:54

because the hedge funds would know it — unless

24:57

he's talking about COVID, in which case maybe he is right.

25:00

If you have the right choice of weird uncle.

25:02

You know, like, I have weird friends who are,

25:04

like, maybe better calling these things than your weird uncle.

25:06

But yeah. So among humans, it's

25:08

subtle. And then with

25:10

superintelligence, it's not subtle, just massive

25:12

advantage, but not perfect. It's

25:15

not that it knows every possible move you make

25:17

before you make it. It's

25:19

that it's got a good probability distribution

25:21

about that and it,

25:24

you know, has figured out all the good moves

25:26

you could make, and figured out replies

25:28

to those. I

25:31

mean, like in practice, what's that like?

25:33

Well, unless it's a limited,

25:36

narrow superintelligence, I think you mostly don't

25:38

get to observe it because you are dead. Mhmm.

25:40

Unfortunately. What? So,

25:44

you know, like, Stockfish makes

25:46

strictly better chess moves than you, but it's playing on

25:48

a very narrow board. And the fact that it's better

25:50

than you at chess doesn't mean it's better than you at everything.

25:54

And I think

25:56

that the actual catastrophe scenario

25:58

for AI looks like: a big

26:03

advancement in a research lab may

26:05

be driven by them getting a

26:08

giant venture capital investment in being

26:10

able to spend ten times as much on GPUs

26:12

as they did before, maybe

26:14

driven by new

26:17

algorithmic advance like transformers, maybe

26:20

driven by hammering out some

26:22

tweaks to last year's algorithmic advance that

26:24

gets a thing to finally work efficiently. And

26:28

the AI there goes over a

26:31

critical threshold which,

26:34

you know, like, most obviously could be, like,

26:36

can write the next AI. Mhmm.

26:38

You know, that's so obvious that, like,

26:40

Science fiction writers figured it out almost

26:43

before there were computers, possibly even before

26:45

there were computers. I'm not sure exactly what the exact

26:47

dates here are. But

26:49

if it's better than you at everything, it's better than

26:51

you at building AIs. That snowballs.

26:54

It gets an immense technological advantage.

26:56

If it's smart, it doesn't announce itself.

26:59

It doesn't tell you that there's a fight going

27:01

on. It

27:03

emails out some instructions to one

27:05

of those labs that'll synthesize DNA

27:08

and synthesize proteins from the DNA

27:10

and get some proteins mailed to some, you

27:12

know, hapless human somewhere who gets paid a bunch

27:14

of money to mix together some stuff they got in

27:17

the mail in a vial — you know, like

27:19

smart people will not do this for any sum of money.

27:22

Many people are not smart. That builds

27:24

the ribosome — but a ribosome that builds

27:27

things out of covalently bonded diamondoid

27:29

instead of proteins folding up and held together

27:31

by Van der Waals forces — builds tiny

27:33

diamondoid bacteria. The diamondoid

27:35

bacteria replicate using atmospheric

27:38

carbon, hydrogen, oxygen, nitrogen, and

27:41

sunlight. And, you know,

27:43

a couple of days later, everybody on Earth falls over

27:45

dead in the same second. That's

27:48

what I think the disaster scenario looks like.

27:51

If it's as smart as I am, if it's

27:53

smarter, it might think of a better way to do things.

27:56

But it can at least think of that if it's relatively

27:58

efficient compared to humanity because I'm in humanity

28:00

and I thought of

28:01

it. This

28:01

is — I've got a million questions, but I'm like, where do we

28:03

go first? Yeah. So we've run through the introduction

28:05

of a number of different concepts, which I want to go back

28:07

and take our time to really dive into. There's

28:10

the AI alignment problem. There's

28:12

AI escape velocity. There

28:14

is the question of what

28:17

happens when AIs are so incredibly

28:19

intelligent that humans are to

28:21

AIs what ants are to us. And

28:23

so I wanna kinda go back and tackle Eliezer

28:26

one by one. We started this conversation talking

28:28

about ChatGPT and everyone's up

28:30

in arms about ChatGPT. And you're saying,

28:32

like, yes. It's a great step forward in

28:34

the generalizability of some

28:37

of the technologies that we have in the AI world.

28:39

All of a sudden, ChatGPT becomes immensely

28:41

more useful and it's really stoking the imaginations

28:44

of people today. But what you're saying is

28:46

it's not the thing that's actually going

28:49

to be the thing to reach escape

28:51

velocity and create super intelligent AIs

28:53

that perhaps might be able to enslave

28:54

us. But my question to you is,

28:57

How do we know when that

28:58

you know, this lady. But sorry. Go on.

29:01

Yeah. Sorry.

29:02

Murder David and kill all of you. Eliezer

29:04

was very clear on that. So if it's not

29:07

ChatGPT, like,

29:09

how close are we? Because there's this,

29:11

like, unknown event horizon where

29:13

you kind of alluded to it where, like, we make this

29:16

AI that we train it to

29:18

create a smarter AI. And that smarter

29:20

AI is so incredibly smart that it hits escape velocity,

29:22

and all of a sudden, these dominoes fall.

29:25

How close are we to that

29:26

point? And are we even capable of answering

29:29

that question?

29:29

How the heck would I know? And

29:31

also when you were talking, Eliezer, like,

29:33

if we had already crossed that event horizon,

29:36

like, a smart AI wouldn't necessarily broadcast

29:39

that to the

29:39

world, Miss possible, we've

29:41

already crossed that event horizon, is it not?

29:44

I mean, it's theoretically possible,

29:46

but seems very

29:47

unlikely. Somebody would need inside

29:49

their lab an AI that was, like, much

29:51

more advanced than

29:53

the public AI technology. And

29:56

as far as I currently know, the best

29:58

labs and the best people are

30:00

throwing their ideas to the world, like

30:02

they don't care. And

30:05

there's probably some secret government

30:07

labs with, like, secret government

30:10

AI researchers. My

30:12

pretty strong guess is that

30:14

they don't have the best people and that

30:16

those labs, like, could not create

30:18

ChatGPT on their own

30:20

because chat GPT took a whole bunch

30:22

of fine twiddling and tuning and

30:25

visible access to giant GPU

30:27

farms

30:28

and that they don't have people who know

30:30

how to do the twiddling and tuning. That's

30:33

just a guess. One of the big

30:35

things that you spend a lot of time on is this thing

30:37

called the AI alignment problem. Some

30:39

people are not convinced that when we create

30:41

AI, that AI won't really just

30:43

be fundamentally aligned with humans. I don't believe

30:45

that you fall into that camp. I think you fall into the camp

30:47

of when we do create this super

30:49

intelligent generalized AI, we are going

30:52

to have a hard time aligning

30:54

with it in terms of our morality and our

30:56

ethics. Can you walk us through a little bit of that thought process?

30:58

It's like, why do you feel it'll be misaligned? Yeah.

31:00

I mean, the dumb way to ask that question too is, like,

31:03

Eliezer, why do you think that the

31:05

AI automatically hates

31:06

us? Like, why is it gonna — why

31:08

does it wanna

31:10

kill us all? The AI doesn't hate you, neither does it

31:13

love you, and you're made of atoms that it can use for

31:15

something else. It's indifferent

31:17

to you. It's got something that actually

31:19

does care about, which makes no mention

31:21

of you, and you are made of atoms

31:23

it can use for something else. That's all there

31:25

is to it in the end. The reason

31:27

you're not in its utility function is that

31:29

the programmers did not know how to do that.

31:32

The people who built the AI or the people

31:34

who built the AI that built the AI that built AI

31:37

did not have the technical

31:39

knowledge that nobody on Earth has

31:41

at the moment as far as I

31:43

know, whereby you can do that

31:45

thing and you can control in detail what that

31:47

thing ends up caring about. So

31:50

this feels like where humanity

31:53

is hurtling itself towards an event

31:55

horizon where there's like this AI escape velocity.

31:58

And There's nothing on the other

32:00

side. As in, we do not know what happens

32:03

past that point as it relates to

32:05

having some sort of superintelligent AI and

32:07

how it might be able to manipulate the

32:08

world. Would you agree with that? No.

32:11

Again, the Stockfish chess-

32:14

playing analogy: you cannot predict

32:16

exactly what move it would make, because

32:18

in order to predict exactly what move it would

32:20

make, you would have to be at least that good at chess

32:23

and it's better than you. This is

32:25

true even if it's just a little better than you. Stockfish

32:27

is actually enormously better than you to the point that

32:29

once it tells you the move, you can't figure out a better

32:31

move without consulting a different AI. But

32:34

even if it was just a bit better than you, then

32:36

you're in the same position. But, you know, this kind of

32:38

disparity also exists between humans. You

32:40

know, if you ask me, like, where will

32:42

Garry Kasparov move on this chessboard? And,

32:45

like, I don't know, like, maybe here.

32:47

And then, Garry Kasparov

32:49

moves somewhere else. It doesn't mean that he's

32:51

wrong. It means that I'm wrong. If I could

32:53

predict exactly where Garry Kasparov would

32:55

move on a chessboard, I'd be Garry Kasparov. I'd be

32:57

at least that good at chess. Possibly

33:00

better. I could also be like able to predict him,

33:02

but also, like, see an even better move than that.

33:05

Mhmm. So that's an irreducible

33:07

source of uncertainty. With

33:09

respect to superintelligence or

33:12

anything that's smarter than you. If

33:14

you could predict exactly what it would do, you'd be that

33:16

smart yourself. That doesn't mean you can predict no facts

33:18

about it. So with Stockfish

33:20

in particular, I can predict it's going to

33:23

win the game. I know what

33:25

it's optimizing for. I know where

33:27

it's trying to steer the board. I could

33:29

predict that. I can't predict exactly

33:32

what the board will end up looking like after Stockfish

33:34

has finished winning its game against me.

33:36

I can predict it will be in the class of states

33:38

that are winning positions for black or white

33:41

or whichever color stockfish picked because, you

33:43

know, it wins either way. And

33:45

that's similarly where I'm getting the kind of prediction

33:47

about everybody being dead. Because

33:50

if everybody were alive, then there'd

33:52

be some state that

33:54

the superintelligence prefer to that

33:56

state, which is all of the atoms making

33:59

up these people on their farms are being used for something

34:01

else that it values more. So if you postulate

34:03

that everybody's still alive, I'm like, okay. Well,

34:05

like, why is it? You're like postulating that

34:08

stockfish made a stupid chest

34:10

move. And ended up with a non winning

34:12

board position. That's where that class

34:14

of predictions come from. Can you reinforce

34:16

this argument though a little bit? So, like, why is

34:18

it that an AI can't be nice.

34:21

Sort of like a gentle parent to us

34:23

rather than sort of a murder looking

34:26

to deconstruct our atoms and you know,

34:28

apply for you somewhere else. Like, what are its goals?

34:30

And why can't they be aligned to

34:32

at least some of our

34:33

goals? Or maybe why can't they get into

34:35

a status which is, you know, somewhat like us

34:38

in the ants, which is largely we just ignore

34:40

them unless they interfere in our business to come

34:42

in our house and, you know, raid our zero boxes.

34:45

There's a bunch of different questions

34:46

there. So first of all, the

34:48

space with minds is very wide.

34:51

Imagine like giant sphere and all the humans

34:54

are in this, well, like 159 tiny corner of the sphere.

34:57

And, you know, we're all like basically the

34:59

same make and model of car running

35:01

the same brand ancient were just all painted slightly

35:03

different colors. Somewhere

35:06

in that mind space, there's things that

35:08

are as nice as humans. There's things that

35:10

are nicer than humans. There

35:12

are things that are trustworthy and nice and kind

35:14

in ways that no human can ever be. And

35:17

there's even things that are so nice that

35:19

they can understand the concept of leaving you alone

35:21

and doing your own stuff sometimes instead of hanging

35:23

around trying to be like obsessively nice to you

35:25

every minute and all the other famous disaster scenarios

35:27

from ancient science fiction — 'With

35:30

Folded Hands' by Jack Williamson is the one I'm

35:32

quoting there. We don't know

35:34

how to reach into mind design

35:36

space and pluck out an AI like

35:37

that. It's not that they don't exist in principle,

35:40

it's that we don't know how to do it.

35:42

And and I will, like, hand back the conversational

35:44

ball now and figure out, like, which next question

35:46

do you wanna go down there? Well,

35:49

I mean, Why? Like, why

35:51

is it so difficult to sort of align

35:53

an AI with even our basic

35:56

notions of

35:57

morality? I mean, I wouldn't say

35:59

that it's difficult to align an AI with our basic

36:01

notions of morality. I'd say that it's

36:03

difficult to align an AI in task

36:05

like Take this strawberry

36:07

and make me another strawberry that's identical

36:10

to this strawberry, down to the cellular

36:12

level, but not necessarily the atomic level.

36:14

It looks the same under, like, a standard

36:16

optical microscope, but maybe not a scanning

36:18

electron microscope. You

36:21

know? Do that.

36:23

Don't destroy the world as a side effect.

36:26

Now, this does intrinsically take a powerful

36:28

AI. There's no way you can make it easy to align by

36:30

making it stupid. To build

36:32

something that's cellularly identical to a strawberry —

36:35

I mean, mostly, I think the way that you do this is

36:37

with, like, very primitive nanotechnology. We

36:39

could also do it using very advanced biotechnology.

36:43

And these are not technologies that we already

36:45

have, so it's got to be something smart enough to develop

36:47

new technology. Never

36:50

mind all the subtleties of morality.

36:53

I think we don't have the technology to

36:55

align an AI to the point where we can say,

36:57

build me a copy of the strawberry and don't

37:00

destroy the world. Why

37:02

do I think that? Well,

37:06

case in point, look at natural selection

37:08

building humans. Natural

37:11

selection mutates

37:13

the humans a bit, runs

37:16

another generation, the

37:18

fittest ones reproduce more,

37:20

their genes become more prevalent in the next

37:22

generation. Natural

37:24

selection hasn't really had very much time to do

37:26

this to modern humans at all, but, you know, the hominid

37:28

line, the mammalian line. Go

37:30

back a few million generations. And

37:33

this is an example of an optimization process

37:36

building an intelligence. And

37:38

natural selection asked us for only

37:40

one thing. Make

37:43

more copies of your DNA. Make

37:46

your alleles more

37:49

relatively prevalent in the gene pool.

37:51

Maximize your inclusive reproductive

37:54

fitness not just like your own reproductive

37:56

fitness, but your, you know, two brothers or

37:58

eight cousins as the joke goes. Because

38:01

they've got on average one copy of your genes,

38:04

two brothers, eight cousins. This

38:08

is all we

38:10

were optimized for. For

38:12

millions of generations, creating

38:14

humans from

38:17

scratch from the first accidentally self

38:19

replicating molecule. Internally,

38:23

psychologically inside our

38:25

minds, we do not know what genes are.

38:27

We do not know what DNA is. We do not

38:29

know what alleles are. We have no concept

38:32

of inclusive genetic fitness until,

38:35

you know, our scientists Figure

38:37

out what that even is. We don't know what

38:39

we were being optimized for. For a long

38:41

time many humans thought they'd been created by

38:43

God. And this

38:46

is when you use the hill

38:48

climbing paradigm and optimize

38:50

for one single extremely pure

38:53

thing, this is

38:55

how much of it gets inside. In

38:58

the ancestral environment, in

39:01

the exact distribution that

39:03

we were originally optimized for.

39:05

Humans did tend to end up using their intelligence

39:08

to try to reproduce more. Put

39:10

them into a different environment, and

39:12

all the little bits and pieces and fragments

39:15

of optimizing for fitness that were

39:17

in us now do totally different

39:19

stuff. We have

39:21

sex, but we wear condoms. If

39:25

natural selection had been a foresightful intelligent

39:27

kind of engineer that was able to engineer things

39:30

such fully, it would have built us

39:32

to be revolted by the thought of condoms.

39:36

Men would be lined up

39:38

and fighting for the rights to donate to

39:40

sperm banks. And

39:43

in our it's an international environment, the

39:45

little drives that got into us happen to

39:48

lead to more reproduction. But

39:51

distributional shift: run the

39:53

humans out of the distribution over which

39:55

they were optimized. You get totally different results.
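As a loose illustration of that point about distributional shift (a toy sketch, not anything from the episode; the sin(x) target, the cubic model, the training range, and the test point are all arbitrary assumptions): hill-climb a simple model against one sharp criterion over one range of inputs, then run it outside that range, and its behavior no longer tracks the thing it was optimized for.

```python
import math, random

random.seed(0)
train_xs = [i * math.pi / 50 for i in range(51)]  # the "ancestral" distribution: [0, pi]

def loss(coeffs):
    # Mean squared error of a cubic polynomial against sin(x) on the training range.
    return sum((sum(c * x**k for k, c in enumerate(coeffs)) - math.sin(x)) ** 2
               for x in train_xs) / len(train_xs)

coeffs = [0.0, 0.0, 0.0, 0.0]
best = loss(coeffs)
for _ in range(20000):                 # simple random hill climbing
    trial = coeffs[:]
    trial[random.randrange(4)] += random.gauss(0, 0.05)
    new = loss(trial)
    if new < best:                     # keep a mutation only if it scores better
        coeffs, best = trial, new

poly = lambda x: sum(c * x**k for k, c in enumerate(coeffs))
print("inside training range, x = 1.0:", round(poly(1.0), 3), "vs sin(1.0):", round(math.sin(1.0), 3))
print("outside training range, x = 10:", round(poly(10.0), 1), "vs sin(10):", round(math.sin(10.0), 3))
```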

39:59

And gradient descent, would

40:02

by default just like do not quite

40:04

the same thing. It's gonna do a weirder thing because

40:06

natural selection has a much narrower information

40:08

bottleneck. In one sense, you could say that

40:10

natural selection was at an advantage because

40:13

it finds simpler solutions. You

40:15

could imagine some hopeful engineer who

40:17

just built intelligences using gradient

40:19

descent and found out that they end up

40:21

wanting these, like, thousands and

40:24

millions of little tiny things, none of which were

40:26

exactly what the engineer wanted. And being

40:28

like, well, let's try natural selection instead.

40:30

It's got a much sharper information bottleneck.

40:32

It'll find the simple specification of what

40:35

I want. But what we actually

40:37

got there was humans. Then gradient descent

40:39

probably may be even worse. But

40:42

more importantly, I'm just pointing out that there is

40:44

no physical law computational law,

40:46

mathematical logical law saying

40:49

when you optimize using

40:51

hill climbing on a very simple,

40:54

very sharp criterion,

40:56

you get a general intelligence that

41:00

wants that thing. So

41:02

just like natural selection, our tools

41:04

are too blunt in order

41:06

to get to that level of granularity to like

41:08

program in some sort of morality

41:11

into these superintelligent systems?

41:14

Or build me a copy of a strawberry without

41:16

destroying the world. Yeah. The tools

41:18

are too blunt. So I just wanna make

41:20

sure I'm following with what you were saying. I think the

41:22

conclusion that you left me with is that

41:25

my brain, which I consider to be

41:27

at least decently smart, is actually

41:29

a byproduct, an accidental byproduct

41:32

of this desire to reproduce.

41:35

And it's actually just like a tool that I have.

41:37

And just like conscious thought is a tool,

41:39

which is a useful tool in

41:41

means of that end. And so if we're applying

41:43

this to AI, and AI's

41:45

desire to achieve some certain goal.

41:49

What's the parallel there? I

41:51

mean,

41:54

Every organ in your body is a reproductive

41:56

organ. If it didn't help you reproduce,

41:58

you would not have an organ like that. Your

42:01

brain is no exception. Mhmm. This is merely

42:03

conventional science and like merely the conventional

42:05

understanding of the world. I am not saying

42:07

anything here that ought to be at

42:10

all controversial, you know,

42:12

I'm sure it's controversial somewhere. But,

42:14

you know, within a

42:16

pre filtered audience, it should not be at all

42:18

controversial. And

42:20

this is like the obvious thing to

42:23

expect to happen

42:24

with AI because why wouldn't it?

42:27

What new law of existence has been

42:29

invoked, whereby this time we

42:31

optimize for a thing and we get a thing

42:33

that wants exactly what we optimize for on

42:35

the outside. So what are the

42:37

types of goals an AI might

42:39

want to pursue? What types of utility functions

42:42

is it going to want to pursue off the bat?

42:44

Is it just those it's been programmed

42:47

with like make it an identical

42:49

strawberry?

42:50

Well, the whole thing I'm saying is that we do not know

42:52

how to get goals into a system.

42:54

We can cause them to

42:57

do a thing inside a

42:59

distribution they were optimized over

43:01

using gradient descent. But

43:03

if you shift them outside of that distribution,

43:05

I expect other weird things start happening.

43:08

When they reflect on themselves, other

43:10

weird things start happening. What kind

43:12

of utility functions are in there? I

43:15

mean, darned if I know. I think

43:17

you'd have a pretty hard time calling

43:19

the shape of humans in advance by

43:22

looking at natural selection, the thing that natural

43:24

selection was optimizing for, if you'd

43:26

never seen a human or anything like a human.

43:29

If we optimize them from

43:31

the outside to predict the next line

43:33

of human text, like

43:36

GP T3I don't actually

43:38

think this line of technology leads to the end

43:40

of the world, but maybe it does. And, you know, like,

43:42

GP t seven, you know.

43:45

There's probably a bunch of stuff in

43:47

there too that desires to

43:50

accurately model things

43:54

like humans under a wide range

43:56

of circumstances, but it's not exactly

43:58

humans. Because Ice

44:01

cream. Ice cream didn't

44:03

exist in the natural environment. The

44:06

ancestral environment, the environment of

44:08

evolutionary adaptiveness. There

44:10

is nothing with that much sugar, salt,

44:12

fat combined together, as

44:15

ice cream. We are not

44:17

built to want ice cream. We

44:19

were built to want strawberries, honey,

44:24

a gazelle that you killed and cooked and

44:26

had some fat in it and was therefore nourishing and

44:28

gave you the all important calories you need to survive.

44:31

Salt. So you didn't sweat too much

44:33

and run out of salt. We

44:36

evolved to want those things, but then

44:38

ice cream comes along and it

44:40

fits those taste buds better

44:43

than anything that existed in the environment

44:45

that we were optimized over. So

44:48

a very primitive, very

44:50

basic, very unreliable, wild

44:53

guess, but at least an informed kind of wild

44:55

guess. Maybe if you train

44:57

a thing really hard to predict humans,

45:00

then among the things that

45:02

it likes are

45:05

tiny little pseudo

45:08

things that meet the definition of

45:10

human but weren't in its training data

45:13

and that are much easier to predict

45:16

or where the problem of predicting

45:18

them can be solved in a more satisfying

45:20

way. Where satisfying is not like human

45:22

satisfaction, but some other criterion

45:25

of thoughts like this are tasty because they

45:27

help you predict the humans from the training data.
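For a concrete picture of what "trained really hard to predict humans" means as an outer objective, here is a minimal, hypothetical sketch: the model is scored only on how well it predicts the next token of human-written text. The toy corpus and bigram counting "model" below are assumptions for illustration; real systems like GPT-3 use large neural networks, but the training signal has this shape.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for "human text".
corpus = "the cat sat on the mat because the cat was tired".split()

# "Train" a bigram model: count which word tends to follow which.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_token_probs(prev):
    counts = follows[prev]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

# The outer objective: average negative log-probability (cross-entropy)
# assigned to the token the human actually wrote next. Lower is better.
pairs = list(zip(corpus, corpus[1:]))
nll = -sum(math.log(next_token_probs(prev).get(nxt, 1e-9)) for prev, nxt in pairs)
print("average next-token loss:", round(nll / len(pairs), 3))
print("model's guesses after 'the':", next_token_probs("the"))
```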

45:30

Eliezer, when we talk about, like, all of, like,

45:33

ideas about just, like, the ways that

45:36

AI thought will be fundamentally just

45:38

incompatible or not be able to

45:40

be understood by the ways that humans think

45:42

And then all of a sudden, we see this like rotation

45:45

by venture capitalists, by just

45:47

pouring money into AI. Do

45:50

alarm bells go off in your head?

45:52

It's like, hey, guys. You haven't thought

45:54

deeply about these subject matters yet. Does just, like,

45:56

the immense amount of capital going into

45:58

AI investments scare you? I mean, alarm

46:00

bells went off for me in two thousand

46:02

fifteen, which is when it became

46:04

obvious that this is how it was going to go down.

46:07

I sure am now seeing the

46:09

realization of that stuff I felt

46:11

alarmed about back

46:13

then. Eliezer, is this

46:15

view that AI is incredibly dangerous and

46:17

that AGI is going to eventually end

46:19

humanity and that we're just careening toward the precipice.

46:22

Would you say this is like the consensus view

46:24

now or are you still somewhat of an outlier?

46:27

And like, why aren't other smart

46:29

people in this field as alarmed

46:31

as you? Can you,

46:32

like, steelman their arguments? You're

46:34

asking, again, like, several

46:36

questions sequentially there. Is it consensus

46:39

view? No. Do

46:41

I think that the people in the wider scientific

46:43

field who dispute this point of view, do I think

46:45

they understand it? Do I think they've done anything

46:47

like an impressive job of arguing against

46:50

it at all? No.

46:52

Like, if you look at the, like, famous prestigious

46:54

scientists who sometimes make a little fun

46:57

of this view in passing, they're

46:59

making up arguments rather

47:02

than deeply considering things that

47:04

are held to any standard of rigor.

47:07

And people outside

47:09

their own fields are able to validly shoot

47:11

them down. I have no idea how to

47:13

pronounce his last name. François,

47:16

C-H-O-L-L-E-T.

47:19

You know, like, he said

47:22

something about like, oh, this you know,

47:24

I forgot his exact words, but it's something like,

47:26

I never hear any good arguments for

47:29

stuff. And I was like, okay. Here's some good arguments

47:32

for stuff. And you can read like the reply

47:34

from Yudkowsky to

47:37

Chollet

47:39

and Google that, and that'll give you some idea

47:41

of what the like, eminent voices

47:44

versus, like, the reply to the eminent

47:46

voices sound like. And, you know,

47:48

like Scott Aaronson, who at

47:50

the time was off in Complexity Theory.

47:53

He was like, that's not how No Free Lunch theorems

47:55

work, correctly. So,

47:57

yeah, I think the state of affairs is we have eminent

47:59

scientific voices making fun of the possibility

48:02

but not engaging with the arguments for

48:03

it. Now if you step away from the eminent

48:06

scientific voices, you can find people who

48:08

are more familiar with all the arguments and

48:10

disagree with me. And

48:12

I think they lack security mindset. Mhmm.

48:15

I think that they're engaging in the sort of blind

48:17

optimism that many, many

48:20

scientific fields throughout history have

48:23

engaged in where when

48:25

you're approaching something for the first time,

48:27

you don't know why it will be hard and you imagine

48:30

easy ways to do things. And the way

48:32

that this is supposed to naturally play out over

48:34

the history of a scientific field is that you

48:36

try to do the things and

48:38

they don't work and you go back and you try to do other

48:40

clever things and they don't work either and you learn

48:42

some pessimism and you start to understand the

48:44

reasons why the problem is hard. This is

48:47

in fact the field of artificial intelligence

48:49

itself, recapitulated this

48:52

very common ontogeny

48:55

of a scientific field, where,

48:57

you know, initially, we had people getting together at

48:59

the Dartmouth conference. I

49:02

forget what their exact famous phrasing

49:04

was, but it's something like we think we can

49:07

make you know, like, we are want to address

49:09

the problem of getting AIs to

49:12

you know, like understand language,

49:15

improve themselves, and

49:17

I forget even what else was there. A list of

49:19

what now sound like grand challenges. And

49:21

we think we can make substantial progress on this

49:24

using ten researchers for two months. And

49:27

I think that that, at the core, is

49:30

what's going on. They have not run

49:32

into the actual problems of alignment. They

49:34

aren't trying to get ahead of the game. They're

49:36

not trying to panic early. They're waiting for

49:38

reality to hit them onto the head and turn

49:40

them into grizzled old cynics of

49:43

their scientific field to understand the reasons

49:45

why things are hard. They're content

49:47

with the predictable life cycle of starting

49:49

out as bright eyed youngsters, waiting

49:51

for reality to hit them over the head with the news,

49:54

And if it wasn't going to kill everybody the

49:56

first time that they're really wrong, it'd

49:59

be fine. You know, this is how

50:01

science works. If we got unlimited

50:03

free retries in fifty years to solve everything,

50:06

it'd be okay. We could figure out how to align

50:08

AI in fifty years given unlimited retries.

50:11

You know, the first team in with the bright eyed

50:13

optimist would destroy the world and people

50:15

would go, oh, well, you know, it's not that

50:17

easy. They'll try something else clever. That would destroy

50:19

the world. People would go like, oh, well, you

50:21

know, maybe this this field is actually hard. Maybe this

50:24

is actually one of the thorny things like computer

50:26

security or something. And,

50:28

you know, oh, right. So what exactly went wrong

50:30

last time? Why didn't these hopeful ideas play

50:32

out? Oh, like, you

50:35

optimize for one thing on the outside. You get

50:37

a different thing on the inside. Wow. That's

50:39

really basic. Alright. Can

50:41

we even do this using gradient descent?

50:43

Can you even build this thing out of giant inscrutable

50:46

matrices of floating point numbers that nobody

50:48

understands at all? You know, maybe we need

50:50

a different methodology. And then, fifty years later,

50:52

you'd have an aligned AGI. If

50:54

we got unlimited free retries without destroying

50:56

the world, it'd be, you know, that it'd play out the

50:58

same way that, you know, ChatGPT played

51:01

out. You know, that's

51:03

from nineteen fifty six or fifty

51:05

five or whatever it was to twenty

51:07

twenty three. So, you know, about seventy

51:10

years, give or take a few. And,

51:12

you know, seventy years later, you know, just

51:14

like we can do the stuff that, seventy years later,

51:17

we can do the stuff they wanted to do in the summer of nineteen

51:19

fifty-five. You know, seventy years later, you'd have

51:21

your aligned AGI. Problem is that

51:23

the world got destroyed in the

51:24

meanwhile. That's why, you know, that's the

51:26

problem there. So this feels like a

51:28

gigantic Don't Look Up scenario.

51:31

If you're familiar with that movie, it's a movie

51:33

about, like, this asteroid hurtling toward Earth, but it

51:35

becomes popular and in vogue to

51:37

not look up and not notice it. And

51:39

Eliezer, you're the guy who's saying like, hey, there's

51:41

an asteroid we have to do something

51:43

about it. And if we don't, it's gonna come

51:45

destroy us. If you had

51:48

god mode over the progress

51:50

of AI research and

51:53

just innovation and

51:54

development. What choices would you make

51:56

that humans are not currently making

51:58

today? I mean, I could say something like

52:02

shut down all the large GPU clusters. How

52:05

long do I have god mode? Do I get to, like, stick

52:07

around for seventy years? You have god mode

52:09

for the twenty-twenties decade. For the twenty-twenties

52:11

decade. Alright. That does make it pretty hard to do

52:13

things. I think

52:15

I shut

52:18

down all the GPU clusters and

52:21

get all of

52:23

the famous scientists and brilliant

52:26

talented youngsters, the

52:28

vast vast majority of whom are not going

52:30

to be productive and where government bureaucrats

52:32

are not going to be able to tell who's actually being helpful

52:34

or not. But, you know, put

52:36

them all on an island,

52:39

large island and

52:43

try to figure out some system

52:45

for filtering the

52:47

stuff through to me to give

52:50

thumbs up or thumbs down on -- Mhmm. -- that

52:52

is going to work better than scientific bureaucrats

52:54

producing utter nonsense, because, you

52:56

know, the trouble is, the

52:58

reason why scientific fields have to go

53:00

through this long process to produce

53:02

the cynical oldsters who know that everything

53:05

is difficult, It's not that the youngsters are stupid.

53:07

You know, sometimes youngsters are fairly smart. You

53:09

know, Marvin Minsky, John McCarthy, back in

53:12

nineteen fifty-five, they weren't dead yet. You

53:14

know, I was privileged to have met both of them. They didn't

53:16

strike me as idiots. They were very old. They still

53:18

weren't idiots. But,

53:20

you know, it's hard to

53:23

see what's coming in advance of

53:25

experimental evidence hitting you over

53:27

the head with it. And if

53:30

I only have the decade of the 2020s to

53:34

run all the researchers on this giant island

53:36

somewhere, it's really not a lot of time. Mostly,

53:39

what you've got to do is invent some entirely new

53:41

AI paradigm that isn't the giant inscrutable matrices

53:43

of floating point numbers on gradient descent

53:45

because I'm not really seeing

53:47

what you can do

53:50

that's clever with that, that doesn't

53:53

kill you and that you know doesn't kill

53:55

you and doesn't kill you the very first

53:57

time you try to do something clever

53:59

like that. I'm sure there's

54:01

a way to do it. And if you

54:03

got to try over and over

54:05

again, you could find it. Uniswap

54:07

is the largest on chain marketplace

54:09

for self custody digital assets.

54:12

Uniswap is, of course, a decentralized exchange,

54:14

but you know this because you've been listening to Bankless.

54:17

But did you know that the Uniswap web

54:19

app has a shiny new Fiat

54:21

on-ramp? Now you can go directly from fiat

54:23

in your bank to tokens in DeFi

54:26

inside of Uniswap. Not only

54:28

that, but Polygon, Arbitrum, and Optimism

54:30

layer two's are supported right out of the gate.

54:32

But that's just DeFi. Uniswap

54:34

is also an NFT aggregator,

54:37

letting you find more listings for the best

54:39

prices across the NFT world. With

54:41

Uniswap, you can sweep floors on multiple

54:44

NFTs, and Uniswap's universal router

54:46

will optimize your gas fees for you. Uniswap

54:48

is making it as easy as possible to

54:50

go from bank account to bankless assets

54:53

across the ether, and we couldn't be more thankful

54:55

for having them as a sponsor. So go

54:57

to app dot uniswap dot

54:59

org today to buy, sell or

55:01

swap tokens and NFTs.

55:04

Arbitrum One is pioneering the world

55:06

of secure Ethereum scalability and

55:08

is continuing to accelerate the web three

55:10

landscape. Hundreds of projects have already

55:12

deployed on Arbitrum One, producing flourishing

55:14

DeFi and NFT ecosystems. With

55:16

the recent addition of Arbitrum Nova,

55:18

gaming and social apps like Reddit

55:21

are also now calling Arbitrum home.

55:23

Both Arbitrum One and Nova leverage

55:25

the security and decentralization of Ethereum

55:27

and provide a builder experience that's intuitive,

55:30

familiar, and fully EVM compatible.

55:33

On Arbitrum, both builders and users

55:35

will experience faster transaction speeds with

55:37

significantly lower gas fees. With Arbitrum's

55:39

recent migration to Arbitrum Nitro, it's

55:41

also now ten times faster than before.

55:43

Visit arbitrum dot io where you can join

55:45

the community, dive into the developer docs,

55:48

bridge your assets, and start building your

55:50

first app. With Arbitrum, experience web

55:52

three development the way it was meant to be: secure,

55:55

fast, cheap, and friction-free. How many

55:57

total airdrops have you gotten? This last bull

55:59

market had a ton of them. Did you get them

56:01

all? Maybe you missed one. So here's what you should do.

56:03

Go to Earnifi and plug in your Ethereum

56:05

wallet, and Earnifi will tell you if you have any

56:07

unclaimed airdrops that you can get. And it also

56:09

does POAPs and mintable NFTs, any

56:12

kind of money that your wallet can claim,

56:14

Earnifi will tell you about it. And you should

56:16

probably do it now because some airdrops expire.

56:18

And if you sign up for Earnifi, they'll email you

56:20

anytime one of your wallets has a new airdrop

56:23

for it to make sure that you never lose an airdrop

56:25

ever again. You can also upgrade to Earnifi

56:27

premium to unlock access to airdrops that are

56:29

beyond the basics and be able to set reminders

56:31

for more wallets. And for just under twenty one

56:33

dollars a month, it probably pays for itself with

56:35

just one airdrop. So plug in your wallets

56:37

at Earnifi and see what you

56:39

get. That's E-A-R-N-I

56:41

dot F-I, and make sure you never lose another

56:43

airdrop. Eliezer, do you think every

56:46

intelligent civilization has to deal with

56:48

this exact problem that humanity is

56:50

dealing with now is

56:53

how do we solve this problem

56:55

of aligning an advanced general

56:57

intelligence? I expect that's

56:59

much easier for some alien species than others.

57:03

Like, there are alien

57:05

species who might arrive at this problem

57:07

in an entirely different way. You know, like, maybe instead

57:09

of having two entirely different information processing

57:12

systems, the DNA and

57:14

the neurons. They've only got one

57:16

system. They can trade

57:18

memories around, heritably,

57:22

By swapping blood, sexually,

57:25

maybe the way in which they confront this

57:27

problem is that very early in their evolutionary

57:29

history they have the equivalent

57:32

of the DNA that stores memories and

57:34

like processes, computes memories, and

57:36

they swap around a bunch of

57:38

it. And it adds up to something

57:41

that reflects on itself and makes itself

57:43

coherent. And then you've got a superintelligence before

57:45

they have invented computers. And

57:47

maybe that thing wasn't aligned. But, you know, how

57:49

do you even align it when you're in that kind of

57:51

situation? It'd be a very different angle on

57:53

the problem. Do you think every advanced civilization

57:56

is on the trajectory to creating a superintelligence

57:58

at some point in its

57:59

history? Maybe there's ones

58:01

in universes with alternate physics

58:04

where you just can't do that.

58:06

Their universe's computational

58:08

physics just doesn't support that much computation.

58:12

Maybe they never get there. Maybe

58:14

their lifespans are long enough and

58:16

their star lifespans short enough that

58:19

they never get to the point of a technological civilization

58:21

before their star does the equivalent of

58:23

expanding or exploding or going

58:26

out, and their planet

58:27

with it. Every alien species covers a

58:29

lot of territory, especially if you talk

58:31

about alien species in universes with physics different

58:34

from this one. Well, talking about kind of our

58:36

present universe, I'm curious if you've

58:38

sort of been confronted with the question

58:40

of, like, well, then why haven't we seen

58:43

some sort of superintelligence in

58:45

our universe when we sort of look out at the stars,

58:47

sort of the Fermi Paradox type of

58:49

question. Do you have any explanation for

58:51

that?

58:52

Oh, well, supposing that they got killed by

58:54

their own AIs doesn't help at all with that because

58:56

then we'd see the AIs. And do you think

58:58

that's what happens? And, yeah, it doesn't help with that.

59:00

We would see evidence of AIs when we

59:03

have Yeah. Yes. So -- Yeah. -- so why don't we?

59:05

I mean, the same reason we don't see evidence

59:07

of the alien

59:08

civilizations, not with AI's.

59:11

And that reason is, although it

59:13

doesn't really have much to do with the whole AI thesis

59:16

one way or another, because they're

59:18

too far away or so says

59:20

Robin Hanson, using a very clever

59:22

argument about the apparent difficulty of

59:24

hard steps in humanity's evolutionary history

59:27

to further induce

59:29

the rough gap

59:32

between the hard steps. And,

59:36

you know, I I can't really do justice

59:38

to this. If you look up grabby

59:39

aliens, grabby aliens, I

59:42

remember this.

59:42

Yeah.

59:43

Grabby aliens, G-R-A-B-B-Y.

59:47

You can find Robin Hanson's very

59:49

clever argument for how far away

59:51

the aliens

59:52

are. It's an entire website. Yeah.

59:54

So there's an entire website called grabby aliens

59:57

dot com. You can go look

59:58

at. Yeah. And that contains what

1:00:01

is by far the best answer I've seen to

1:00:03

where are they. Answer: too far away

1:00:05

for us to see even if they're traveling here at nearly

1:00:08

light speed. How far away

1:00:10

are they? And how do we know that? This

1:00:13

is amazing. But yeah. And

1:00:15

there is not a very good way to simplify the argument.

1:00:19

Any more than there is to, you know, simplify

1:00:21

the notion of zero knowledge proofs. It's not that difficult,

1:00:24

but it's just like very not easy

1:00:26

to simplify. But if you have a bunch of

1:00:28

locks that are all of different difficulties

1:00:32

and a limited time in which to

1:00:34

solve all the locks, such that anybody who gets

1:00:36

through all the locks must have gotten

1:00:38

through them by luck, all the locks

1:00:41

will take around the same amount

1:00:43

of time to solve. Even

1:00:45

if they're all of very different difficulties. And

1:00:48

that's the core of Robin Hanson's argument

1:00:50

for how far away the aliens are and how do

1:00:52

we know that.
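(As a rough illustration of that locks argument, and not Hanson's actual model: treat each hard step as a lock whose solve time is exponentially distributed with a mean far longer than the available window, then condition on all of the locks being solved before the deadline. A short Python sketch, with made-up numbers chosen only for illustration:

import random

# Hypothetical toy model: three "locks" with very different mean solve times,
# all much longer than the deadline. Conditioning on getting through every
# lock in time, the average time spent on each lock comes out roughly equal.
DEADLINE = 1.0
MEAN_TIMES = [2.0, 8.0, 32.0]   # assumed difficulties, a 16x spread
TRIALS = 2_000_000

totals = [0.0] * len(MEAN_TIMES)
successes = 0
for _ in range(TRIALS):
    times = [random.expovariate(1.0 / m) for m in MEAN_TIMES]
    if sum(times) <= DEADLINE:          # lucky run: every lock solved in time
        successes += 1
        for i, t in enumerate(times):
            totals[i] += t

if successes:
    for mean, total in zip(MEAN_TIMES, totals):
        print(f"lock with mean {mean:4.1f}: average conditional time {total / successes:.3f}")

Each average lands near a quarter of the deadline even though the underlying difficulties differ by a factor of sixteen, which is the flavor of the claim that the hard steps look roughly evenly spaced once you condition on success.)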

1:00:54

Eliezer, I know you're very skeptical

1:00:56

that there will be a good outcome when

1:00:58

we produce an artificial general

1:01:00

intelligence. And I said when, not if,

1:01:02

because I believe that's your thesis as well,

1:01:04

of course. But is there the possibility

1:01:07

of a good outcome? Like, I know

1:01:09

you are working on AI alignment

1:01:12

problems, so that leads me to believe that

1:01:14

you have like greater than

1:01:16

zero amount of hope for

1:01:18

this project. Is there the possibility

1:01:20

of a good outcome? What would that

1:01:23

look

1:01:23

like? And how do we go about achieving it?

1:01:25

It looks like me being wrong. I

1:01:28

basically don't see on-model hopeful

1:01:30

outcomes at this point. We

1:01:32

have not done those things that it would take

1:01:34

to earn a

1:01:36

good outcome. And this is not a case where

1:01:38

you get a good outcome by accident. It's, you know, like,

1:01:40

if you have a bunch of people putting together

1:01:43

a new operating system.

1:01:46

And they've heard about computer

1:01:48

security, but they're skeptical that

1:01:51

it's really that hard. The

1:01:53

chance of them producing a secure operating system

1:01:55

is effectively zero. That's

1:01:57

basically the situation I see ourselves in

1:01:59

with respect to AI

1:02:02

alignment. I have

1:02:04

to be wrong about something, which I certainly

1:02:06

am. I have to be wrong about something in a way

1:02:08

that makes the problem easier rather

1:02:11

than harder. For those people who

1:02:13

don't think that alignment's going to be all that hard.

1:02:16

You know, if you're building a rocket for

1:02:18

the first time ever, and

1:02:21

you're wrong about something? It's

1:02:23

not surprising if you're wrong about something. It's

1:02:25

surprising if the thing that you're wrong about causes

1:02:27

the rocket to go twice as high on

1:02:30

half the fuel you thought was required and be

1:02:32

much easier to steer than you were afraid

1:02:33

of.

1:02:34

Where the alternative was, if you're wrong about something

1:02:36

the rocket blows up. Yeah. And then the rocket

1:02:38

ignites the atmosphere is the problem there.

1:02:40

Or rather, you know, like a bunch of rockets blow up

1:02:42

a bunch of rockets go places. You know, the

1:02:44

analogy I usually use for this is

1:02:47

Very early on in the Manhattan project, they

1:02:49

were worried about what if the nuclear weapons can

1:02:51

ignite fusion in the nitrogen

1:02:53

in the atmosphere. And

1:02:56

they ran some calculations and decided

1:02:58

that it was like incredibly unlikely from

1:03:00

multiple angles, so they

1:03:02

went ahead. And were correct.

1:03:05

You know, we're still here. And I'm not

1:03:07

gonna say that it was luck because, you know, the calculations

1:03:09

were actually pretty solid. And

1:03:11

AI is like that

1:03:14

But instead of needing to refine plutonium,

1:03:16

you can make nuclear weapons out of a billion

1:03:18

tons of laundry detergent. Now,

1:03:20

the stuff to make them is like fairly widespread

1:03:23

it's not a tightly controlled substance. And

1:03:26

they spit out gold up

1:03:28

until they get large enough and

1:03:31

then they ignite the atmosphere. And

1:03:33

you can't calculate how large is

1:03:35

large enough, and a bunch

1:03:37

of the people, the CEOs

1:03:39

running these projects, are making fun of the idea that

1:03:41

it'll ignite the atmosphere. It's not

1:03:43

a very helpful situation.

1:03:45

So the economic incentive to produce

1:03:47

this AI, like, one of the reasons why Chat

1:03:49

GPT has sparked the imaginations of so many

1:03:52

people is that everyone can imagine

1:03:54

products. Like products are being imagined

1:03:56

left and right about what you can do with something

1:03:58

like chat GPT. There's like this meme at this

1:04:01

point of people leaving and to go

1:04:03

start their chat GPT start up.

1:04:05

And so, like, the metaphor is that, like, what you're

1:04:07

saying is that there's this generally

1:04:09

available resource spread all around the

1:04:11

world, which is chatty, and everyone's

1:04:14

hammering it in order to make it to spit

1:04:16

out gold. But you're saying if we do that too

1:04:18

much, all of a sudden the system

1:04:21

will ignite the whole entire sky,

1:04:23

and then we will all

1:04:23

die. Well, no, you can run check TPT

1:04:26

any number of times without declining the atmosphere.

1:04:29

That's about what research labs

1:04:32

at Google and

1:04:34

Microsoft. Counting deep

1:04:36

mind as part of Google and counting OpenAI as part

1:04:38

of Microsoft. That's what the

1:04:40

research labs are doing, bringing

1:04:43

more metaphorical

1:04:44

plutonian together than ever before. Not

1:04:47

about how many times you run

1:04:49

the things that have been built and not destroyed the

1:04:51

world

1:04:52

yet. You

1:04:53

can do any amount of stuff with chat EPT and

1:04:55

not destroy the world. It's not that smart. It doesn't

1:04:58

get smarter every time you run it. Right.

1:05:00

Can I ask some, you know, questions that

1:05:02

the ten year old and me wants to really

1:05:04

ask about this? And I'm asking these

1:05:06

questions because I think a lot of listeners might be thinking

1:05:08

them too. So you knock

1:05:10

off some of these easy answers for me. If

1:05:13

we create some sort of unaligned, let's

1:05:15

call it, bad AI, why can't

1:05:17

we just create a whole bunch of good AIs

1:05:19

to go fight the bad AIs

1:05:22

and, like, solve the problem

1:05:24

that way? Can there not be some

1:05:27

sort of counterbalance in terms

1:05:29

of aligned human AIs and evil

1:05:31

AIs and there'd be sort of

1:05:33

some battle of the artificial minds

1:05:35

here. Nobody knows how to

1:05:37

create any good AIs at all. The

1:05:39

problem isn't that we have like twenty

1:05:42

good AIs and then somebody finally builds

1:05:44

an evil AI. The problem is

1:05:46

that the first

1:05:48

very powerful AI is evil, nobody

1:05:51

knows how to make it good, and then it

1:05:53

kills everybody before anybody can make

1:05:55

it

1:05:55

good. So there is no known way

1:05:57

to make a friendly, human aligned

1:06:00

AI whatsoever. And

1:06:02

you don't know of a good way to go about

1:06:05

thinking through that problem and designing

1:06:07

one. Neither does anyone else, is what you're telling us.

1:06:10

I have some idea of what I would do

1:06:12

if there were more time, you know,

1:06:15

back in the day we had more time, humanity

1:06:17

squandered it. I'm not sure there's

1:06:19

enough time left now. I

1:06:22

have some idea of what

1:06:24

I would do if I were in

1:06:27

a twenty five year old body and had

1:06:29

ten billion

1:06:29

dollars. That would be the island scenario of,

1:06:32

like, you're god for ten years and you get all the researchers

1:06:34

on an island and go really hammer

1:06:36

for ten years at this

1:06:37

problem. If I have buy in from

1:06:40

a major government that can run

1:06:42

actual security precautions, and

1:06:44

more than just ten billion

1:06:46

dollars, then, you know, you could run

1:06:48

a whole Manhattan project about it. Sure. This

1:06:50

is another question that the ten-year-old in me wants

1:06:52

to know. So why is

1:06:54

it that at least people listening

1:06:56

to this episode or people

1:06:58

listening to the concerns or reading

1:07:00

the concerns that you've written down and

1:07:02

published. Why can't everyone get

1:07:05

on board who's

1:07:07

building an AI and just all agree

1:07:10

to be very careful.

1:07:12

Is that not a sustainable game

1:07:15

theoretic position to have?

1:07:17

Is this sort of like a coordination problem,

1:07:20

more of a social problem than

1:07:22

anything else or like, why can't that happen?

1:07:24

I mean, we have so far not

1:07:27

destroyed the world with nuclear

1:07:29

weapons. We've had them, you

1:07:31

know, since the nineteen forties. Yeah.

1:07:32

This is harder than nuclear weapons. This is

1:07:34

a lot harder than nuclear. Why is this harder and why

1:07:36

can't we just coordinate to just all

1:07:38

agree internationally that

1:07:41

we're going to be very careful, put restrictions

1:07:43

on this, put regulations on it, do

1:07:46

something like

1:07:46

that. Current heads of major labs

1:07:49

seem to me to be openly contemptuous of

1:07:51

these issues. That's where we're starting

1:07:53

from. The politicians

1:07:56

do not understand it. There

1:07:58

are distortions of these

1:08:00

ideas that are going to sound more

1:08:03

appealing to them than everybody suddenly

1:08:05

falls over dead, which is the thing that I think

1:08:07

actually happens. Everybody

1:08:10

falls over dead just, like, doesn't inspire

1:08:12

the monkey political parts of our brain somehow.

1:08:15

It's not like, oh, no. What if what if

1:08:17

terrorists get the AI first? It's like

1:08:19

it doesn't matter who gets it first. Everybody

1:08:21

falls over dead. And,

1:08:25

yeah, so you're describing

1:08:29

the world coordinating on something that is

1:08:31

relatively hard to coordinate. Maybe

1:08:33

So, you know, like, could we if we

1:08:35

tried starting today, you

1:08:37

know, like, prevent

1:08:39

anyone from getting a billion pounds of

1:08:41

laundry detergent in one place worldwide,

1:08:44

control the manufacturing of laundry detergent,

1:08:48

only have it manufactured in particular places,

1:08:50

not concentrate lots of it together, enforce

1:08:53

it on every country. You

1:08:55

know, if it was

1:08:57

legible. If

1:09:00

it was clear that a billion pounds of laundry

1:09:02

detergent in one place would end the world,

1:09:04

If you could calculate that, if

1:09:06

all the scientists calculated and arrived at the

1:09:08

same answer and told the politicians that

1:09:11

maybe. Maybe humanity

1:09:13

would survive even though smaller amounts

1:09:15

of laundry detergent spit out gold. The

1:09:18

threshold could be calculated. I

1:09:21

don't know how you'd convince the politicians. We

1:09:24

definitely don't seem to have had much luck convincing

1:09:27

those CEOs whose job

1:09:29

depends on them not

1:09:33

caring to care. Caring

1:09:36

is easy to fake. It's easy

1:09:39

to, you know, like hire a bunch of people to

1:09:41

be your AI safety team and redefine

1:09:43

AI safety as having the AI not say naughty

1:09:45

words. Or,

1:09:47

you know, I'm speaking somewhat metaphorically here

1:09:50

for reasons. But

1:09:53

the basic problem that we have is, like, trying to

1:09:55

build a secure OS before

1:09:57

we run up against a really smart attacker.

1:10:00

And there's all kinds of like fake security. It's

1:10:02

got a password file. This

1:10:06

system is secure. It only lets

1:10:08

you in if you type a password. And

1:10:11

if you never go up against a really smart attacker,

1:10:14

you never go far out of distribution against a

1:10:16

powerful optimization process

1:10:18

looking for holes. Yeah.

1:10:20

Maybe then. How does a bureaucracy

1:10:23

come to know that what they're doing is not

1:10:25

the level of computer security that they

1:10:27

need? The way you're

1:10:29

supposed to find this out, the way that the scientific

1:10:32

fields historically find this out, the way that

1:10:34

fields of computer science historically find this

1:10:36

out. The way that crypto found this out back

1:10:38

in the early days is by having

1:10:40

the disaster happen. And

1:10:44

we're not even that good at learning from relatively

1:10:46

minor disasters. You know, like,

1:10:49

COVID swept the world, did

1:10:51

the FDA or the CDC

1:10:54

learn anything about don't tell hospitals

1:10:56

that they're not allowed to use their own test to

1:10:58

detect the coming plague? Are we

1:11:00

installing UVC lights

1:11:05

in public spaces or in ventilation systems

1:11:08

to prevent the next respiratory pandemic? We've

1:11:10

lost a million people. And we

1:11:12

sure did not learn very much as far as I can

1:11:14

tell for next time. We could have

1:11:16

an AI disaster that kills a hundred thousand

1:11:19

people. How do you even do that?

1:11:21

Robotic cars crashing into each other? How about a

1:11:23

bunch of robotic cars crashing into each other? It's not

1:11:25

gonna look like that was the fault of artificial general

1:11:27

intelligence. Because they're not going to put AGIs in charge

1:11:29

of cars. They're going to pass

1:11:31

a bunch of regulations that's going to affect the entire

1:11:34

AGI disaster or not at all. What

1:11:36

does the winning world even look like

1:11:38

here? How in real

1:11:41

life did we get from where we

1:11:43

are now to this worldwide

1:11:46

ban, including against North Korea,

1:11:48

and, you know, like, some one

1:11:50

rogue nation whose dictator doesn't

1:11:53

believe in all this nonsense and just wants the

1:11:55

gold that these AI spit out. How

1:11:57

did we get there from here? How do we

1:11:59

get to the point where the United

1:12:02

States and China signed

1:12:05

a treaty whereby they would both use

1:12:07

nuclear weapons against Russia if Russia

1:12:10

built a GPU cluster that was too

1:12:11

large. How did we get

1:12:14

there from here? Correct me if I'm wrong,

1:12:16

but this seems to be kind of just, like, a topic

1:12:18

of despair. Talking to you

1:12:20

now and then hearing your thought process about,

1:12:22

like, there is no known

1:12:24

solution and the trajectory is not great.

1:12:27

Like, do you think all hope is lost here?

1:12:29

I'll keep on fighting until the end,

1:12:31

which I wouldn't do if I had literally zero

1:12:33

hope. I could still be

1:12:35

wrong about something in a way that makes this problem

1:12:38

somehow much easier than it currently looks.

1:12:40

I think that's how you go

1:12:42

down fighting with dignity. Go

1:12:45

down fighting it with dignity. That's the

1:12:47

stage you think we're

1:12:48

at. I wanna just double click

1:12:50

on what you were just saying. So part

1:12:52

of the case that you're making is humanity

1:12:55

won't even see this coming. So it's

1:12:57

not like a coordination problem like global

1:12:59

warming where you know, every couple

1:13:01

of decades. We see the world

1:13:03

go up by a couple of degrees.

1:13:05

Things get hotter and we start to see these effects

1:13:07

over time. The characteristics or

1:13:09

the advent of an AGI in

1:13:12

your mind is going to happen incredibly

1:13:14

quickly. And in such a way

1:13:16

that we won't even see the disaster until

1:13:18

it's imminent, until it's upon us?

1:13:21

I mean, if you want some kind of like formal phrasing,

1:13:23

then I think that superintelligence will kill

1:13:26

everyone before non superintelligent

1:13:28

AIs have killed one million

1:13:29

people. I don't know if that's the phrasing

1:13:31

you're looking for there. I think that's

1:13:34

a fairly precise definition. And why?

1:13:36

What goes into that line of

1:13:38

thought? I think that the current systems

1:13:40

are actually very weak. I

1:13:43

mean, I don't know. Maybe I could use the analogy

1:13:45

of Go, where you

1:13:47

had systems that were

1:13:50

finally competitive with the

1:13:53

pros, where pros

1:13:55

are, like, a set of ranks in Go. And

1:13:58

then a year later, they

1:14:00

were challenging the world champion

1:14:03

and winning. And then

1:14:05

another year later, they threw

1:14:08

out all the complexities and

1:14:10

the training from human databases of

1:14:12

Go games and built

1:14:14

a new system, AlphaGo

1:14:16

Zero, that trained itself

1:14:19

from scratch. No

1:14:21

looking at the human playbooks. No

1:14:23

special purpose code, just a general

1:14:25

purpose game player being specialized to go

1:14:28

more or less. And

1:14:31

in three days. There's a

1:14:33

quote from Gwern about this, which

1:14:36

I forgot exactly, but it was something like

1:14:38

we know how long Alpha

1:14:40

Go Zero or AlphaZero, two

1:14:43

different systems, was equivalent

1:14:45

to a human Go player. And

1:14:47

it was, like, thirty minutes, on

1:14:50

such-and-such floor of such-and-such

1:14:52

DeepMind building. And

1:14:56

maybe the first system doesn't

1:14:59

improve that quickly and they build another system

1:15:01

that does. And all of that with AlphaGo over

1:15:03

the course of years going from

1:15:05

like, taking a long time to train, to retraining

1:15:07

very quickly and without looking at the human playbook,

1:15:10

like that's not with an artificial

1:15:12

intelligence system that improves

1:15:15

itself, or even that sort of, like,

1:15:17

gets smarter as you run it, the way

1:15:20

that human beings

1:15:22

not just as you evolve them, but as you run

1:15:24

them over the course of their own lifetimes, improve.

1:15:27

So If

1:15:30

the first system doesn't improve

1:15:32

fast enough to kill everyone very quickly,

1:15:34

they will build one that's meant

1:15:36

to spit out more gold than that. And

1:15:39

there could be weird things that happen before

1:15:41

the end. I did not see

1:15:43

ChatGPT coming. I did not see Stable

1:15:45

Diffusion coming. I did not expect

1:15:47

that we would have AIs smoking

1:15:50

humans in rap battles

1:15:52

before the end of the world, while they were

1:15:54

clearly much dumber than us. Kind of a nice send

1:15:56

off, I guess, in some ways. So

1:16:01

you said that your hope is not zero.

1:16:04

And you are planning to

1:16:06

fight to the end. What does that look like

1:16:08

for you? I know you're working at

1:16:10

MIRI which is

1:16:12

the Machine Intelligence Research Institute.

1:16:16

This is a nonprofit that I believe that

1:16:18

you sort of set up to work on this AI

1:16:20

alignment and safety sort of issues.

1:16:23

What are you doing there? What are you spending your

1:16:25

time on? What do you think? Like,

1:16:27

how do we actually fight until the

1:16:29

end? If you do think that an end is coming, how

1:16:31

do we try to

1:16:33

resist? I'm on something of a sabbatical right

1:16:35

now, which is why I have time for podcasts.

1:16:38

That's a sabbatical from, you

1:16:40

know, like, I've been doing this twenty years.

1:16:43

It became clear we were all going to die.

1:16:45

I felt kind of burned out. Taking some time

1:16:47

to rest at the moment. When

1:16:50

I dive back into the pool, I don't

1:16:52

know, maybe I will go

1:16:55

off to Conjecture or

1:16:57

Anthropic or one of the smaller

1:16:59

concerns like Redwood Research, Redwood

1:17:01

Research being the only ones I really trust at this

1:17:03

point, but they're tiny. And

1:17:05

try to figure out if I can see anything clever

1:17:07

to do with the giant inscrutable matrices

1:17:09

of floating point numbers. Maybe

1:17:12

I just write,

1:17:15

continue to try to explain

1:17:17

in advance to people why this problem

1:17:19

is hard instead of as

1:17:21

easy and cheerful as the current people who think

1:17:23

they're pessimists think it will be. I might

1:17:27

not be working all that

1:17:29

hard compared to how I used to work.

1:17:32

I'm older than I was. My body is

1:17:34

not in the greatest of health these days. Going

1:17:37

down fighting doesn't necessarily imply that I

1:17:39

have the stamina to fight all that hard.

1:17:41

I wish I had prettier things to say to

1:17:43

you here, but I do not. No.

1:17:46

This is, you know, we intended to

1:17:48

save probably the last part of this episode

1:17:50

to talk about crypto, the metaverse, and

1:17:53

AI, and how this all intersects. I

1:17:55

gotta say at this point in the episode, it all kinda

1:17:57

feels pointless -- Mhmm. -- to go down that track

1:18:00

We were gonna ask questions like, well,

1:18:02

in crypto, should we be worried about

1:18:04

building sort of a property rights

1:18:06

system, an economic system, a programmable

1:18:08

money system for the AIs to sort of use

1:18:10

against us later on. But

1:18:13

It sounds like the easy answer from you to

1:18:15

those questions would be, yeah, absolutely. And

1:18:17

by the way, none of that matters regardless.

1:18:20

You could do whatever you'd like with

1:18:22

crypto. This is going to be the

1:18:24

inevitable outcome no matter what.

1:18:26

Let me ask you, what would you say to somebody listening

1:18:28

who maybe has been sobered

1:18:31

up by this conversation is

1:18:33

a version of you in your twenties

1:18:36

does have the stamina to

1:18:38

continue this battle and to actually fight

1:18:41

on behalf of humanity against this

1:18:43

existential threat. Where would

1:18:45

you advise them to spend their time?

1:18:47

Is this a technical issue?

1:18:50

Is this a social issue? Is it

1:18:52

a combination of both? Should they educate?

1:18:55

Should they

1:18:56

spend time in the lab? Like, what should

1:18:58

a person listening to

1:19:00

this episode do with these

1:19:02

types of dire

1:19:03

straits? I don't

1:19:05

have really good answers. It

1:19:08

depends on what your talents are.

1:19:10

If you've got a very

1:19:12

deep version of the security mindset, the

1:19:14

part where you don't just put a password on your

1:19:16

system so that nobody can walk in and directly

1:19:19

misuse it, but the kind where

1:19:21

you don't just encrypt

1:19:24

the password file, even though nobody's

1:19:26

supposed to have access to the password file in the first

1:19:28

place unless they're already an authorized user, but

1:19:30

the part where you hash the passwords

1:19:33

and salt the hashes. You

1:19:35

know, if you're the kind of person who can think

1:19:37

of that from scratch, maybe

1:19:39

try your hand at alignment. If

1:19:42

you can think of an alternative to the

1:19:44

giant inscrutable matrices, then,

1:19:48

you know, don't tell the world about that.

1:19:52

I'm not quite sure where you go from there.
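(For readers who want the concrete version of "hash the passwords and salt the hashes" mentioned a moment ago, here is a minimal sketch using Python's standard library; it is an illustration of the idea, not a production scheme:

import hashlib
import hmac
import os

def store_password(password: str) -> tuple[bytes, bytes]:
    # A fresh random salt per user means identical passwords hash differently
    # and precomputed rainbow tables are useless.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    # Re-derive the hash with the stored salt and compare in constant time.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)

salt, digest = store_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, digest))  # True
print(check_password("wrong guess", salt, digest))                   # False

The point is the mindset: even the file nobody is supposed to read gets designed so that reading it does not hand an attacker the keys.)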

1:19:54

But, you know, maybe work with Redwood Research

1:19:56

or something. A whole lot of this

1:19:58

problem is that even if you do

1:20:01

build an AI that's limited in some

1:20:03

way, you know, somebody else

1:20:05

steals it, copies it, runs it themselves, and

1:20:07

takes the bounds off the for loops and the world ends.

1:20:10

So there's that. If you think

1:20:12

you can do something clever with the giant inscrutable

1:20:14

matrices, you're probably wrong.

1:20:18

If you have the talent

1:20:20

to try to figure out why you're wrong

1:20:22

in advance of being hit over the head with it.

1:20:25

And not in a way where you just, like, make random

1:20:27

far-fetched stuff up as the reason why it won't

1:20:29

work, but where you can actually, like, keep looking

1:20:31

for the reason why it won't work. We

1:20:33

have people in crypto who are good

1:20:35

at breaking things, and they're the reason

1:20:38

why anything is not on fire. And

1:20:42

some of them might go into breaking

1:20:44

AI systems instead because

1:20:46

that's where you learn anything. You

1:20:48

know, any fool can build a cryptosystem

1:20:51

that they think will work. Breaking

1:20:54

existing cryptosystems, cryptographic

1:20:57

systems is how we learn who the real experts

1:20:59

are. So maybe the people finding

1:21:02

weird stuff to do with AIs, maybe

1:21:04

those people will come up with

1:21:07

some truth about these systems that makes

1:21:09

them easier to align than I suspect. The

1:21:12

saner outfits

1:21:15

do have uses for money. They don't really have

1:21:17

scalable uses for money, but they won't just burn

1:21:19

any money literally at all. Like,

1:21:22

if you gave MIRI

1:21:24

a billion dollars, I would not know how

1:21:26

to well, at a

1:21:28

billion dollars, I might, like, try to bribe

1:21:31

people to move out of AI development

1:21:33

that gets broadcast to the whole world and

1:21:35

move to the equivalent of an island somewhere not

1:21:38

even to make any kind of critical discovery,

1:21:40

but, you know, just to remove them

1:21:42

from the system if I had a billion dollars.

1:21:45

If I just have another fifty million dollars,

1:21:48

I'm not quite sure what to

1:21:50

do with that, but, you know, if you donate that

1:21:52

to MIRI, then you at least have the

1:21:54

assurance that we will not randomly

1:21:56

spray money on looking like

1:21:58

we're doing stuff and we'll

1:22:01

reserve it as we are doing with the last two giant

1:22:03

crypto donations somebody gave us. Until

1:22:05

we can figure out something to do with it, that is actually

1:22:07

helpful. And MIRI has

1:22:09

that property. I would say

1:22:11

probably Redwood Research has that property.

1:22:18

Yeah, I realize I'm sounding sort of disorganized

1:22:20

here, and that's because I don't really have a good organized

1:22:23

answer

1:22:23

to, you know, how in

1:22:25

general, somebody goes down fighting

1:22:28

with dignity. I know a lot

1:22:30

of people in crypto. They

1:22:34

are not as in touch with artificial

1:22:36

intelligence obviously as you are and the

1:22:38

AI safety issues and the existential

1:22:41

threat that you've presented in this episode.

1:22:43

They do care a lot and see coordination

1:22:46

problems throughout society as

1:22:48

an issue. Many have also generated

1:22:51

wealth from crypto and

1:22:53

care very much about humanity

1:22:56

not ending. What sort of things

1:22:59

has MIRI, that is the organization

1:23:01

I was talking about, MIRI, earlier,

1:23:03

what sort of things have you done with

1:23:06

funds that you've received from crypto donors

1:23:08

and elsewhere? And what sort

1:23:10

of things might an organization like

1:23:13

that pursue to try to stave

1:23:15

this

1:23:15

off? I mean, I think mostly we've pursued

1:23:17

a lot of lines of research that haven't really

1:23:19

panned out. Which is a respectable

1:23:22

thing to do. We did not know in advance that

1:23:24

those lines of research would fail to pan out.

1:23:26

If you're doing research that you know

1:23:28

will work, you're probably not really doing any

1:23:30

research. You're just, like, doing a pretense of

1:23:33

research that you can show off to a funding agency. We

1:23:36

try to be real. We did things where

1:23:38

we didn't know the answer in advance. They

1:23:40

didn't work, but that was where the hope

1:23:42

lay, I think. But,

1:23:45

you know, having a research organization that

1:23:47

keeps it real that way,

1:23:49

that's no easy thing to do. And

1:23:51

if you don't have this very deep form of

1:23:53

the security mindset, you will end up producing fake

1:23:55

research and doing more harm than good. So

1:23:57

I would not tell all the successful

1:24:00

crypto people, the cryptocurrency

1:24:03

people to run off and

1:24:05

start their own research outfits. Redwood

1:24:07

Research, I'm not sure if they can scale using

1:24:09

more money, but, you know, you can give people more

1:24:12

money and wait for them to figure out how to scale it later if

1:24:14

they're the kind who won't just run off and spend

1:24:15

it, which is what MIRI aspires to be.

1:24:17

And

1:24:18

you don't think the education path is

1:24:20

a useful path just educating the world.

1:24:22

I mean, I would give myself

1:24:24

and MIRI credit for why the world isn't just

1:24:26

walking blindly into the whirling razor blades

1:24:28

here, but it's

1:24:30

not clear to me how far education

1:24:33

scales apart from that. You can

1:24:35

get more people aware that we're

1:24:37

walking directly into the whirling razor blades.

1:24:40

Because even if only ten

1:24:42

percent of the people can get it, that can still

1:24:44

be a bunch of people. But then

1:24:46

what do they do? I don't know. Maybe they'll be able

1:24:48

to do something later. Can you get

1:24:51

all the people? Can you get all the politicians?

1:24:53

Can you get the people whose job

1:24:55

incentives are against them

1:24:58

admitting this to be a problem? I have

1:25:00

various friends who report like,

1:25:02

I guess, if you talk to researchers at OpenAI

1:25:05

in private, they're very

1:25:07

worried and say that they, like,

1:25:09

cannot be that worried in

1:25:10

public. This is all a giant Moloch

1:25:12

trap is sort of what you're telling us

1:25:14

I feel like this is the part of the conversation

1:25:17

where we've gotten to the end and the doctor has just

1:25:19

said that we have some sort of terminal illness

1:25:22

And at the end of the conversation, I

1:25:24

think the patient, Dave and I

1:25:26

have to ask the question, okay, doc, how long do

1:25:28

we have? Like seriously, what

1:25:30

are we talking about here? If

1:25:32

you turn out to be

1:25:33

correct, are we talking about years? Are we

1:25:35

talking about decades? Like,

1:25:38

what? What are

1:25:38

you prepared for?

1:25:39

What's your idea here? Yeah. How

1:25:41

the hell would I know? Mhmm. Enrico

1:25:43

Fermi was saying that, like,

1:25:45

fission chain reactions were fifty years

1:25:48

off if they could ever be done at all. Two

1:25:50

years before he built the first nuclear

1:25:52

pile. The Wright brothers

1:25:54

were saying heavier than air flight was fifty years

1:25:56

off shortly before they built

1:25:58

the first Wright Flyer. How

1:26:01

on earth would I know?

1:26:03

It could be three years.

1:26:05

It could be fifteen

1:26:08

years. We could

1:26:10

get that AI winter I was hoping for

1:26:12

and it could be sixteen years. I'm

1:26:15

not really seeing fifty without some kind of giant

1:26:17

civilizational

1:26:18

catastrophe. And to be clear, whatever civilization

1:26:20

arises after that

1:26:21

could, you know, would probably, I'm guessing,

1:26:24

end up stuck in just the same trap

1:26:26

we are. I think the other thing

1:26:28

that the patient might do at the end of a conversation

1:26:30

like this is also consult with other doctors.

1:26:33

I'm kinda curious if, you know, who

1:26:35

we should talk to on this quest.

1:26:38

Who are some people that, if people

1:26:40

in crypto want to hear more about

1:26:42

this or learn more about this or

1:26:45

even we ourselves as podcasters and

1:26:47

educators want to pursue this topic, who

1:26:49

are the other individuals in

1:26:51

the AI alignment and safety space you

1:26:54

might recommend for us to have a conversation

1:26:56

with?

1:26:57

Well, the person who actually holds

1:26:59

a coherent technical

1:27:01

view who disagrees with me

1:27:03

is named Paul Christiano. He

1:27:06

does not write Harry Potter fan

1:27:08

fiction, and I

1:27:11

expect him to have a harder time

1:27:13

explaining himself in concrete terms.

1:27:16

But that is like the main technical

1:27:19

voice of opposition. If you talk

1:27:21

to other people in the effective altruism

1:27:23

or AI alignment communities who disagree

1:27:26

with this view, they are probably, to some extent,

1:27:28

repeating back their misunderstandings

1:27:32

of Paul Christiano's views.

1:27:36

You could try Ajeya

1:27:38

Cotra, who's worked pretty directly with

1:27:41

Paul Christiano and, I think, sometimes

1:27:43

aspires to explain

1:27:45

these things that Paul is not the best at explaining.

1:27:48

I'll throw out Kelsey Piper as somebody

1:27:50

who would be good at

1:27:52

explaining, like, would

1:27:54

not claim to be, like, a technical person on these

1:27:56

issues, but is, like, good at explaining the part that she

1:27:58

does know. And who else

1:28:00

disagrees with me? You know,

1:28:03

I'm sure Robin Hanson would be happy to come

1:28:05

on. Well, I'm not sure he'd be happy to come on this podcast.

1:28:07

But, you know, Robin Hanson just disagrees with me and I

1:28:10

kind of feel like the famous argument

1:28:12

we had back in, like,

1:28:14

early two thousand tens, late two thousands

1:28:16

about how this would all play out. I basically

1:28:18

feel like this was the Yudkowsky position and

1:28:21

this was the Hanson position. And then reality

1:28:23

was over here. Like,

1:28:25

well to the Yudkowsky side of the Yudkowsky

1:28:27

position in the Yudkowsky-Hanson debate, but Robin

1:28:29

Hanson does not feel that way. I

1:28:32

He would probably be happy to expound on that at length.

1:28:35

I don't know. Yeah, it's not hard to find opposing

1:28:37

viewpoints. The ones that'll stand up

1:28:39

to a few solid minutes of cross-examination from

1:28:41

somebody who knows which parts to cross

1:28:43

examine. That's the hard part. You know, I've read

1:28:45

a lot of your writings and

1:28:47

listened to you on previous podcasts. One was in

1:28:49

twenty eighteen on the Sam Harris podcast.

1:28:52

This conversation feels to me like

1:28:55

the most dire you've

1:28:57

ever seemed on this topic and maybe that's

1:28:59

not true. Maybe you've sort of always been

1:29:02

this way, but it seems like the

1:29:04

direction of your hope

1:29:06

that we solve this issue has

1:29:08

declined. Yeah. I'm wondering if

1:29:10

you feel like that's the case. And

1:29:12

if you could sort of summarize your

1:29:15

take on all of this as we close out this

1:29:17

episode and offer, I guess,

1:29:19

any thoughts, concluding thoughts

1:29:21

here.

1:29:22

Well, there

1:29:25

was a conference one time

1:29:27

on what are we going

1:29:29

to do about looming

1:29:32

risk of AI disaster,

1:29:35

and Elon Musk attended that

1:29:37

conference.

1:29:39

And I was like, maybe this is

1:29:41

it. Maybe, you know, maybe this is

1:29:43

when the powerful people notice.

1:29:45

And it's, you know, like, one of the relatively more

1:29:47

technical powerful people who could be noticing this.

1:29:50

And maybe this is where humanity

1:29:52

finally turns and starts, you know,

1:29:55

not quite fighting back because there isn't an

1:29:57

external enemy here, but

1:30:00

conducting itself with I

1:30:02

don't know, acting like it cares maybe.

1:30:07

And what came out of that conference?

1:30:09

Well, it was OpenAI,

1:30:12

which was basically the very

1:30:14

nearly the worst possible way of doing anything.

1:30:17

Like, this is not a problem of, oh no,

1:30:19

what if secret elites get AI? It's that

1:30:21

nobody knows how to build a thing. If

1:30:23

we do have an alignment technique, it's

1:30:26

going to involve running the AI with a bunch

1:30:28

of, like, careful bounds on

1:30:30

it where you don't just like throw all

1:30:32

the cognitive power you have at something. You have

1:30:34

limits on the for loops. And

1:30:38

whatever it is that could possibly save

1:30:41

the world. Like, go out and turn all the GPUs

1:30:44

and the server clusters into Rubik's cubes.

1:30:46

Or something else that prevents the world from ending when

1:30:48

somebody else builds another AI a few weeks

1:30:50

later. You know, anything

1:30:52

that could do that is an artifact where somebody else could

1:30:54

take it and take the bounds off the for loops and use

1:30:56

it to destroy the world. So,

1:30:58

like, let's open up everything. Let's accelerate

1:31:01

everything. It was like GPT

1:31:03

three's version, though GPT-3 didn't

1:31:06

exist yet. It was like ChatGPT's blind,

1:31:10

version of like throwing the ideals at a place

1:31:12

where they were exactly the wrong ideals to solve

1:31:14

the problem. And the problem is that demon

1:31:16

summoning is easy and angel summoning is

1:31:18

much harder. Open sourcing all

1:31:21

the demon summoning circles is not the correct

1:31:23

solution. And I'm using Elon Musk's

1:31:25

own terminology here. He talked about AI as

1:31:27

summoning the demon, which, you know, not accurate,

1:31:29

but And then the solution was to put a demon summoning

1:31:31

circle in every household. And

1:31:34

why? Because his friends were calling him

1:31:36

a Luddite, once he'd expressed any concern

1:31:38

about AI at all, so he picked a road

1:31:40

that sounded like openness and

1:31:43

like accelerating technology, so his friends

1:31:45

would stop calling him a Luddite. It was

1:31:47

very much the worst, you know, like, maybe not

1:31:49

the literal actual worst possible strategy,

1:31:52

but so very far from optimal.

1:31:55

And that was it. That was like, that

1:31:57

was me in two thousand fifteen going like,

1:31:59

oh, so this is what humanity

1:32:01

will elect to do. We

1:32:03

will not rise above. We

1:32:06

will not have more grace, not even here at

1:32:08

the very end. So

1:32:10

that is you

1:32:12

know, that is that

1:32:14

is when I did my crying, late

1:32:17

at night. And then

1:32:20

picked myself up and fought

1:32:23

and fought and fought until I'd run

1:32:25

out of all the avenues

1:32:27

that I seemed to have the capability to

1:32:29

do. There's like more things, but they require

1:32:32

scaling my efforts in a

1:32:34

way that I've never been able to make them scale.

1:32:38

And all of it's pretty far fetched at this point

1:32:40

anyways. So,

1:32:43

you know, what's changed over the years? Well, first

1:32:45

of all, I ran out some remaining fumes of hope

1:32:47

and second, things got to be such

1:32:49

a disaster, such

1:32:52

a visible disaster. The

1:32:54

AIs got powerful enough and

1:32:57

it became clear enough that, you know,

1:32:59

we do not know how to align these things.

1:33:02

That I could actually say what I've been thinking

1:33:04

for a while and not just have people

1:33:06

go completely, like,

1:33:09

what are you saying about

1:33:11

all this? You know, now the stuff

1:33:13

that was obvious back

1:33:15

in two thousand fifteen is, you know, starting

1:33:17

to become visible in the distance to others and not just

1:33:19

like completely

1:33:20

invisible. That's what changed over time.

1:33:23

What do you hope people hear out of

1:33:26

this episode? And out of your comments,

1:33:29

Eliezer in twenty twenty three who is

1:33:31

sort of running on the last fumes

1:33:33

of hope. Yeah.

1:33:36

What do you want people to get out of this

1:33:39

episode? What like, what are you planning to do?

1:33:42

I don't have concrete

1:33:44

hopes here. You

1:33:46

know, when everything is in

1:33:48

ruins, you might as well speak the truth. Right?

1:33:51

Maybe somebody hears it. Somebody

1:33:54

figures out something I didn't think of. I

1:33:57

mostly expect that this does

1:34:00

more harm than good in the modal universe

1:34:02

because people are like, oh, I have this brilliant clever idea,

1:34:05

which is, you know, like, something that somebody

1:34:07

that, you know, I was arguing against in two thousand

1:34:09

and three or whatever. But you

1:34:12

know, maybe somebody out there with the proper

1:34:14

level of pessimism hears this

1:34:16

and thinks of something I didn't think of.

1:34:19

I suspect that if there's hope at all, it comes from

1:34:21

a technical solution because the difference between

1:34:23

technical problems and political problems is at least

1:34:25

the technical problems have solutions in principle.

1:34:28

At least the technical problems are solvable. We're

1:34:30

not on course to solve this one, but I don't

1:34:32

really see the... I think anybody who was hoping

1:34:34

for a political solution has frankly not understood the

1:34:36

technical problem. They do not understand

1:34:39

what it looks like to try to solve the

1:34:41

political problem to such a degree that the world is not

1:34:43

controlled by AI because they don't understand how easy

1:34:45

it is to destroy the world with AI. Given

1:34:47

that the clock keeps ticking forward. They're

1:34:50

thinking that they just have to stop some

1:34:52

bad actor, and that's why they think there's a political solution.

1:34:55

But yeah, I don't have concrete

1:34:57

hopes. I didn't come on this

1:34:59

episode out of any concrete

1:35:02

hope. I have no takeaways except

1:35:04

like, don't make this thing worse.

1:35:07

Don't, like, go off and accelerate

1:35:09

AI more. If you have a

1:35:11

brilliant solution to alignment, don't

1:35:13

be like, oh, yes, I have solved the whole problem. We just

1:35:16

use the following clever trick. You

1:35:18

know, don't make things worse. It's very much of a

1:35:20

mess, especially when you're pointing people at the field

1:35:22

at all. But I have no winning strategy.

1:35:25

Might as well go on this podcast, that's an experiment,

1:35:27

and say what I think, and see what happens, and

1:35:29

probably no good effect comes of it.

1:35:32

But you know,

1:35:34

you might as well go down fighting. Right? If

1:35:36

there's a world that survives, maybe it's a world

1:35:39

that survives because of a bright idea somebody had

1:35:41

after listening to the podcast. That would be

1:35:43

a brighter idea, to be clear, than the usual

1:35:45

run of bright ideas that don't work.

1:35:49

Eliezer, I wanna thank

1:35:51

you for coming on and talking to us

1:35:53

today. I don't know, by the way, if you've seen that movie that David

1:35:55

was referencing earlier, the movie Don't Look Up,

1:35:57

but I sort of feel like that news anchor

1:35:59

who's talking to, like, the scientist. Is it Leonardo

1:36:01

DiCaprio, David? Yeah. And

1:36:04

the scientist is talking about kind of dire

1:36:06

straits for the world. And the

1:36:08

news anchor just really doesn't know what to

1:36:10

do. I'm almost at a loss for words at

1:36:12

this point.

1:36:13

I've had nothing for a while. But one thing I can

1:36:15

say is I appreciate your honesty. Yeah.

1:36:17

I appreciate that you've given this a lot of

1:36:19

time and given this a lot of thought. Anyone

1:36:21

who has heard you speak or

1:36:23

read anything you've written knows that you

1:36:25

care deeply about this issue and

1:36:27

have given a tremendous amount of your life

1:36:29

force in trying to, you know, educate

1:36:32

people about it. And thanks for taking the time

1:36:34

to do that again today. I guess I'll

1:36:36

just let the audience digest

1:36:38

this episode in the best way they know

1:36:40

how. But I wanna reflect

1:36:43

for everybody in crypto and everybody listening to

1:36:45

Bankless, their thanks for you coming on and

1:36:47

explaining. Thanks

1:36:48

for having me. We'll see what comes of it.

1:36:51

Action items for you, Bankless

1:36:53

Nation. We always end with some action

1:36:55

items. Not really sure where to refer folks

1:36:57

to today, but one thing I know we can

1:37:00

refer folks to is MIRI, which

1:37:02

is the Machine Intelligence Research Institute

1:37:05

that Eliezer has been

1:37:07

talking about throughout this episode. That is at

1:37:09

intelligence dot org, I

1:37:11

believe. And, you know, some

1:37:13

people in crypto have donated funds

1:37:15

to this in the past. Vitalik Buterin is

1:37:18

one of them. You could take a look at what they're

1:37:20

doing as well. That might be an action item

1:37:22

for the end of this episode. Gotta

1:37:24

end with risks and disclaimers. Man,

1:37:26

this seems very trite, but our

1:37:29

legal experts have asked us to say these

1:37:31

at the end of every episode: crypto

1:37:34

is risky. You could lose

1:37:35

everything. Apparently not as risky as AI.

1:37:37

But

1:37:38

yeah. But we're headed west.

1:37:41

This is the frontier. It's not for everyone,

1:37:43

but we're glad you're with us on the Bankless journey.

1:37:46

Thanks a

1:37:46

lot. And we are grateful for the crypto

1:37:48

community support. Like it was

1:37:50

possible to end with even less grace

1:37:53

than this. Wow. And

1:37:55

you made a difference. We

1:37:57

shit you not. You really made a difference.
