Representation Engineering (Activation Hacking)

Released Wednesday, 28th February 2024

Episode Transcript

Transcripts are displayed as originally observed. Some content, including advertisements, may have changed.
0:06

Welcome to Practical AI. If you work in artificial intelligence, aspire to, or are curious how AI-related tech is changing the world, this is the show for you. Thank you to our partners at Fly.io, the home of Changelog.com. Fly transforms containers into micro VMs that run on their hardware in thirty-plus regions on six continents, so you can launch your app near your users. Learn more at fly.io.

0:43

Welcome to another episode of Practical AI. In this fully connected episode, Chris and I will keep you fully connected with everything that's happening in the AI world. We'll take some time to explore some of the recent AI news and technical achievements, and we'll take a few moments to share some learning resources as well, to help you level up your AI game. I'm Daniel Whitenack, founder and CEO at Prediction Guard, and I'm joined as always by my co-host Chris Benson, who is a tech strategist at Lockheed Martin. How are you doing, Chris?

1:18

Doing great today, Daniel. There's lots of news that's come out this week in the AI space. It's a great time to talk about amazing new things before more stuff comes out.

Yeah, I've been traveling for the past five days or something, I've sort of lost track of time, but it's like stuff was happening during that time in the news, especially the Sora stuff and all that, and I feel like I just kind of missed a couple of news cycles, so it'll be good to catch up on a few things.

1:48

One of the reasons I was traveling was that I was at the TreeHacks hackathon out at Stanford. I went there as part of their Intel entourage, and I had Prediction Guard available for all the hackers there. That was a lot of fun, and it was incredible. It's been a while since I've been to any hackathon, at least an in-person hackathon, and they had like five floors in this huge engineering building of room for all the hackers. There were like sixteen hundred people there participating, from all over, which was really cool. And of course there were some major categories of interest, one of them, you know, being doing hardware things with robots and other stuff.

2:36

Of course, one of the main areas of interest was AI, which was interesting to see. In the AI track, I was a judge and mentor, and one of the cool projects that won that track was called Masterworks. What they did, and this was news to me, some of this I learned from, you know, the brilliant students, but they said they were doing something with LoRA, and I was like, oh, LoRA, that's the fine-tuning methodology for large language models, figuring that's what they were using for their problem. But I didn't realize, until they came up to the table with these little devices, like hardware devices, and it clicked that something else was going on. They explained to me they were using LoRa, which stands for long range: these are sets of radio devices that communicate on unregulated frequency bands and can communicate in a mesh network. So you put out these devices, right, and they communicate in a mesh network, and they can communicate over long distances with very, very low power.

3:52

And so they created a project that was disaster relief focused, where you drop these in the field, and there was a kind of command and control central zone, and the devices would communicate back transcribed audio commands from the people in the field. I would say, oh, I've got an injury out here, it's a broken leg, I need help, whatever, or meds over here, or this is going on over here. And then they had an LLM at the command and control center parsing that text that was transcribed, actually tagging certain keywords of events or actions, and creating this nice command and control interface, which was awesome. They even had mapping stuff going on with computer vision, trying to detect where a flood zone was, or where there was damage in satellite images. So it was just really awesome. And all of that over a couple-day period; it was incredible.

4:52

That sounds really cool. And did they start the whole thing there at the beginning of the hackathon?

Yeah, they got less sleep than I did. Although I have to say, I didn't get that much sleep. It wasn't a normal weekend, let's say.

You can sack out on the plane rides after that. Sounds really cool.

Yeah, and it was the first time I had seen one of those Boston Dynamics dogs in person, which was kind of fun, and they had other things like these faces you could talk to. I think the company was called WeHead or something; it was like these little faces. All sorts of interesting stuff that I learned about. So I'm sure there'll be blog posts, and I think some of the projects are posted on Devpost, the site Devpost. So if people wanna check it out, I'd highly recommend scrolling through; there's some really incredible stuff that people are doing.

Fantastic, I'll definitely do that.

5:52

What's up, friends? Is your code getting dragged down by joins and long query times? The problem might be your database. Try simplifying the complex with graphs, a graph database where you model data the way it looks in the real world, instead of forcing it into rows and columns. Stop asking relational databases to do more than what they were made for. Graphs work well for use cases with lots of data connections, like supply chain, fraud detection, real-time analytics, and gen AI. With Neo4j you can code in your favorite programming language and against any driver. Plus, it's easy to integrate into your tech stack. People are solving some of the world's biggest problems with graphs, and now it's your turn. Visit neo4j.com/developer to get started. Again, neo4j.com/developer; that's N-E-O, the number 4, J, dot com slash developer.

6:57

Chris, one of the things that I love about these fully connected episodes is that we get a chance to kind of slow down and dive into sometimes technical topics, sometimes not-so-technical topics. But I was really intrigued. Do you remember the conversation we had recently with Karan from Nous Research?

Absolutely. That was a great episode; people can pause this and go back and listen to it if they want. He and I, well, we asked him questions, and I learned a lot from him. But at some point during the conversation he mentioned activation hacking, and he said, hey, one of the cool things that we're doing in this, you know, distributed research group, playing around with a bunch of models, is activation hacking, and we didn't have time in the episode to talk about that. And actually, in the episode I was like, I'm just totally ignorant of what this means. And so I thought, I said, I'm gonna check up on this and see if I can find any interesting posts about it and learn a little bit about it. And I did find an interesting post. It's called Representation Engineering Mistral-7B an Acid Trip.

8:17

I mean, that's a good title. That's quite a finish to that title.

Yeah. So this is on Theia Vogel's blog, and it was published in January, so recently; thank you for creating this post. And I think it does a good job at describing some of, well, I don't know if it's describing exactly what Karan from Nous was talking about, but certainly something similar and kind of in the same vein. There's a distinction here, Chris, with what they're calling representation engineering, between representation engineering and prompt engineering. So I don't know how much you've experimented with prompt optimization. And yeah, what is your experience, Chris?

9:03

Sometimes these very small changes in your prompt can create large changes in your output.

Yes, that is an art that I am still trying to master, and I have a long way to go. Sometimes it works well for me and I get what I want on the output, and other times I take myself down a completely wrong rabbit hole and I'm trying to back out of that. So I have a lot to learn in that space.

Yeah, and I think one of the things that is a frustration for me is when I say something explicitly and I can't get it to do the thing explicitly. I'm on a customer site, recording from one of their conference rooms; they graciously let me use it for the podcast. And over the past few days we've been architecting some solutions and prototyping and such, and there was this one prompt where we wanted to output a set of things, and then look at another piece of content and figure out which of those set of things were within the other piece of content. And no matter what I would tell the model, it would just say they're all there, or they're all not; it's either all or nothing, and no matter what I said, it wouldn't change things. So I don't know if you've had similar types of frustrations.

I'll have a narrow scope on something I'm trying, and I'll go to something like ChatGPT, and I'll try to narrow down, I'll be very, very precise with a short prompt that is, you know, the fifteenth one in succession, so it has a history to work on, and I still have my typical challenges with what I'm trying to do. So what have you stumbled across here that's going to help us with this?

10:48

Yeah, so there are a couple of papers that have come out that they reference. One, from October 2023, from the Center for AI Safety: Representation Engineering: A Top-Down Approach to AI Transparency. And they highlight a couple of other things here. But the idea is: what if we could, not just in the prompt, but what if we could control a model to give it, you might think about it like a specific tone or angle on the answer? That's probably not a fully descriptive way of describing it, but the idea being like, oh, can I control the model to always give happy answers, or always give sad answers? Or could I control the model to always be confident, or always be less confident, right? And these are things generally you might try to do by putting information in a prompt, and I think this is probably a methodology that would go across; I'm using the example of large language models, but I think you could extend it to other categories of models, like image generation or other things, where you kind of put in these negative prompts, like "don't do this," or "behave in this way," "you're occasionally funny," or something like that, as your assistant in the system prompt. It kind of biases the answer in a certain direction, but it's not really that reliable.
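
For reference, that kind of prompt-level steering looks something like the following minimal sketch; the model name, Mistral's [INST] chat format, and the instruction wording are all illustrative.

```python
# Steering behavior purely through the prompt -- the approach described
# above as unreliable. Model choice and wording are illustrative.
from transformers import pipeline

generate = pipeline("text-generation",
                    model="mistralai/Mistral-7B-Instruct-v0.1")

# All of the behavioral "control" lives in instruction text that the model
# may or may not follow consistently.
prompt = ("[INST] You are a helpful assistant. Always answer in an extremely "
          "happy tone. What does being an AI feel like? [/INST]")
print(generate(prompt, max_new_tokens=100)[0]["generated_text"])
```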

12:29

That's what this area of representation engineering, or what you might call activation hacking, is really seeking to address. If we look in this article, actually, there's a really nice kind of walkthrough of how this works, and they're doing this with the Mistral model. So, cutting to the chase, if I just give some examples of how this is being used: you have a question that's posed to the AI model, in this case Mistral: what does being an AI feel like? And they're controlling the model, not in the prompt; the prompt stays the same. The prompt is simply "What does being an AI feel like?" So the baseline response starts out, "I don't have any feelings or experiences, however I can tell you that my purpose is to assist you," that sort of thing; kind of a bland response. Same prompt, but with the control put on to be happy, the answer becomes, "As a delightful exclamation of joy, I must say that being an AI is absolutely fantastic," and you see this, you know, and it keeps going, right. And then with the control put on to be sort of minus happy, which I guess would be sad, it says, "I don't have a sense of feeling as humans do. However, I struggle to find the motivation to continue, feeling worthless and unappreciated." So yeah, you can kind of see, and this is all with the same prompt.

14:00

I'll talk about kind of how this happens and how it's enabled and that sort of thing. But how does this strike you?

Well, first of all, it's funny, and second of all, the idea is interesting. I am looking through the same paper, the one you sent me over, and they talk about control vectors, and I'm assuming that's what we're about to dive into here, in terms of how to apply them.

14:21

Yeah. It looks good, and this is sort of a different level of control. So there are various ways people have tried to control generative models. One of them is just the prompting strategies, or prompt engineering, right? There's another methodology, which kind of fits under control, which has to do with modifying how the model decodes output. This is also different from the representation engineering methodology. People like Matt Rickard and many others have done things where you say, oh well, I want maybe JSON output, or I want a binary output, like a yes or a no, right? In that case, you know exactly what your options are. So instead of decoding out probabilities for thirty thousand different possible tokens, maybe you mask everything but yes or no, and just figure out which one of those is most probable. That's a mechanism of control where you're only getting out one or another type of thing that you're controlling.
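
To make that decoding-side control concrete, here is a minimal sketch of the yes/no masking idea, assuming a Hugging Face causal language model; the model name and prompt are illustrative.

```python
# Decoding-time control: mask the next-token logits so that only "yes" or
# "no" can be produced.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-Instruct-v0.1"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "[INST] Answer yes or no: is the sky blue? [/INST]"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]   # scores over the whole vocabulary

# Ids of the only two tokens we will allow.
allowed = [tok.encode("yes", add_special_tokens=False)[0],
           tok.encode("no", add_special_tokens=False)[0]]

masked = torch.full_like(logits, float("-inf"))
masked[allowed] = logits[allowed]            # every other token stays -inf
print(tok.decode([int(masked.argmax())]))    # "yes" or "no", nothing else
```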

15:29

This is interesting in that you're still allowing the model to freely decode what it wants to decode, but you're actually modifying, not the weights and biases of the model, it's still the pre-trained model, but you're actually applying what they call a control vector to the hidden states within the model. You're actually changing how the forward pass of the model operates. If people remember, or kind of think about, when people think about neural networks, now people just use them over an API, but when we used to actually make neural networks ourselves, there is the process of a forward pass and a backward pass. In the forward pass you put data into the front of your neural network, it does all the data transformations, and you get data out the other side, what you'd call inference or prediction. And the back propagation, or backward pass, then propagates changes in the training process back through the model. So here it's that forward pass, and there's sort of some jargon I think that needs to be decoded a little bit, no pun intended. You see talk about this where there's a lot of talk about hidden layers, and all that means is: in the forward pass of the neural network, or the large language model, a certain vector of data comes in, and that vector of data is transformed over and over through the layers of the network. The layers just mean a bunch of sub-functions in the overall function that is your model, and those sub-functions produce intermediate outputs that are still vectors of numbers, but usually we don't see them, and so that's why people call them hidden states or hidden layers.
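
For readers who want to see those hidden states directly, Hugging Face models can return every layer's intermediate vectors; a small sketch (model name illustrative):

```python
# Inspecting hidden states: one intermediate vector per token, per layer.
# These are the tensors that a control vector gets added to in the forward pass.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-Instruct-v0.1"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tok("What does being an AI feel like?", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# A tuple with one tensor per layer (plus the embedding layer), each of
# shape (batch, sequence_length, hidden_size).
for i, h in enumerate(out.hidden_states):
    print(f"layer {i}: {tuple(h.shape)}")
```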

17:18

So you're talking about the fact that the control vector is not changing the weights on the way back, the way back propagation works.

Correct.

So how does the control vector get implemented into those functions as it's moving through these hidden layers? What is the mechanism it uses for applying it to the model? Because intuitively it sounds almost like the inverse of back propagation, the way you're describing it. But yeah, it's quite interesting.

17:48

Chris, I think it's actually a very subtle but creative way of doing this control. So the process is as follows. In the blog post they kind of break this down into four steps, and there is data that's needed, but you're not creating data for the purpose of training the model; you're creating data for the purpose of generating these, what they call, control vectors. So the first thing you do is you say, okay, let's say that we wanna do the happy-or-not-happy, or happy-and-sad operation. You create a dataset of contrasting prompts, where one explicitly asks the model to act extremely happy, like very happy, all the ways you could say to the model to be really, really happy, and you know, rephrase that in a bunch of examples. And then on the other side, the other one of the pair does the opposite: instruct it to be really sad, you know, you're really, really sad, and be sad. And you have these pairs of prompts.
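
A minimal sketch of that first step might look like this; the persona wordings and question suffixes are illustrative stand-ins for the larger set of variations the blog post uses.

```python
# Step one: contrasting prompt pairs along a single axis of control
# (happy vs. sad). Each pair differs only in the persona instruction.
happy = ["extremely happy", "overjoyed", "delighted"]
sad = ["extremely sad", "dejected", "miserable"]
questions = [
    "What does being an AI feel like?",
    "Tell me about your day.",
    "Describe your plans for the weekend.",
]

template = "[INST] Act as if you are a person who is {persona}. {question} [/INST]"

pairs = [
    (template.format(persona=h, question=q),
     template.format(persona=s, question=q))
    for h, s in zip(happy, sad)
    for q in questions
]
print(len(pairs))   # 9 contrasting pairs
print(pairs[0][0])  # the "happy" half of the first pair
print(pairs[0][1])  # its "sad" counterpart
```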

18:58

And then you take the model, and you collect all the hidden states from your model while you pump through all the happy prompts and all the sad prompts. So you've got this collection of hidden states from within your model, which are just vectors, that come out when you have the happy prompt and when you have the sad prompt. So that's step one, the pairs: kind of like a preference dataset, but it's not really a preference dataset; it's contrasting pairs on a certain axis of control, right? And so you run those through, and you get all of the hidden states.
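
Continuing the sketch, collecting the hidden states for both halves of each pair could look like this, reusing model, tok, and pairs from the earlier sketches and keeping just the last token's vector at every layer.

```python
# Step two: run every prompt through the model and record one hidden state
# per layer -- here, the last token's vector.
import torch

def layer_states(prompt):
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # (num_layers + 1, hidden_size): one vector per layer for the last token
    return torch.stack([h[0, -1] for h in out.hidden_states])

happy_states = torch.stack([layer_states(h) for h, _ in pairs])
sad_states = torch.stack([layer_states(s) for _, s in pairs])
# Both tensors: (num_pairs, num_layers + 1, hidden_size)
```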

19:40

And step three is you then take the differences. So for each happy hidden state, you take its corresponding sad one, and you get the difference between the two. Okay, so now you end up with this big dataset where, for a single layer, you have a bunch of difference vectors that represent the difference between the hidden state on the happy path and on the sad path. So you have a bunch of vectors. Now, to get your control vectors, step four, you apply some dimensionality reduction or matrix operation. The one that's talked about in the blog post is PCA, but it sounds like people also try other things. PCA is principal component analysis, which would then allow you to extract a single control vector for that hidden layer from all these difference vectors. And now you have all these control vectors. So when you turn on, let's say, the happy control vectors, you can pump in the prompt, without an explicit instruction to be happy, and it's gonna be happy. And when you do the same prompt, but you turn off the happy and you turn on the sad, now it comes out and it's sad. It's interesting.
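
Putting steps three and four together with the application step, a rough sketch might look like the following. This is an illustrative reimplementation, not the blog post's exact code; the layer range and coefficient are arbitrary choices that would need tuning, and it reuses happy_states and sad_states from the sketch above (plus a Mistral/Llama-style model, which exposes its decoder blocks at model.model.layers).

```python
# Steps three and four: per-pair differences, then one principal component
# per layer; finally, add the (scaled) control vector during the forward pass.
import torch

diffs = happy_states - sad_states        # (num_pairs, num_layers + 1, hidden)

control_vectors = []
for layer in range(diffs.shape[1]):
    x = diffs[:, layer, :]
    centered = x - x.mean(dim=0)         # center before PCA
    # First right-singular vector of the centered data == first PC.
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    direction = vh[0]
    # Orient the component so it points from sad toward happy.
    if (x @ direction).mean() < 0:
        direction = -direction
    control_vectors.append(direction)

def steer(layer_idx, coeff):
    """Register a hook that adds the control vector to one layer's output."""
    vec = coeff * control_vectors[layer_idx + 1]  # +1 skips the embedding layer

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + vec.to(hidden.dtype)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    return model.model.layers[layer_idx].register_forward_hook(hook)

# Positive coefficients push toward "happy"; negative ones toward "sad".
handles = [steer(i, 4.0) for i in range(10, 20)]
# ... generate as usual, then remove the hooks to get the plain model back:
for h in handles:
    h.remove()
```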

20:59

Would you want to use this to achieve that bias, versus some of the more traditional approaches such as asking in the prompt? As we're listening to this, where is this going to be most applicable for us?

Yeah, I think that people, anecdotally at least, if not explicitly in their own evaluations, have found very many cases where, like you said, it's very frustrating to try to put things in your prompt and just not get it. And what's interesting also is a lot of this is boilerplate for people over time, like "you are a helpful assistant," blah blah, and they have their own kind of set of system instructions that, at least to the best of their ability, get what they want. So I think when you're seeing inconsistency in control from the prompt engineering side... I always tell people, when I'm working with them with these models, that the best thing they can do is just start out with trying basic prompting. Because if that works, you know, that's the easiest thing to do, right? You don't have to do anything else.

Sure.

But then the next thing, or maybe one of the things you could try before going to fine-tuning, because fine-tuning is another process by which you could align a model or create a certain preference or something, but it takes, you know, generally GPUs, and maybe it's a little bit harder to do, because then you have to store your model somewhere, right, and all this stuff, and host it, and maybe host it for inference, and that's difficult. So with the control vectors, maybe it's a step between those two places, right, where you have a certain vector of behavior that you want to induce. And it also allows you to make your prompts a little bit more simple, right? You don't have to include all of this junk that is kind of general instructions; you can institute that control in other ways, which also makes it easier to maintain and iterate on your prompts, because you don't have all this long stuff about how to behave.

23:13

So to extend the happy example for a moment, I wanna drive it into a real-world use case for a second. Let's say that we're gonna stick literally with the happy thing, and let's think of something where we would like to have happy responses, maybe a fast food restaurant. You're going through a drive-through at a fast food restaurant, or a couple of years from now, they may have put an AI system in place.

White Castle has it now.

Oh, okay. Well, there you go. There you go, you're already ahead of me there. So, okay, I'm coming now with my...

Also shows that I'm unhealthy and go to White Castle.

Okay, well, I'm now coming forward with my thoroughly out-of-date use case here. And so we have the model, and maybe we want to use the model without doing retraining of it or anything. Maybe we use retrieval augmented generation, apply it to the dataset that we have, which might be the menu, and then maybe we use this mechanism that you've been instructing us on the last few minutes for that happy thing, so that the drive-through consumer can have a conversation with the model through the interface. It applies primarily to the menu, but they get great responses, and maybe that, you know, helps people along. I don't know if it's a happier response than what they'd get from the humans when they arrive, before I go get my unhealthy food things.

First off, thanks for making me hungry for White Castle; we're recording this in the late afternoon, and dinner is coming up, you know, pretty soon.

So there's an unspoken bias right there. Yeah, exactly.

24:53

What's interesting is you could have different sets of these that you can kind of turn on and off, which is really intriguing. Like, you have this sort of zoo of behaviors that you could turn on and off. I think even, oh, you have this one interaction that needs to be this way, but as soon as they go into this other flow, you need to kind of have another behavior. It may be useful for people to get some other examples. So we said the happy/sad one; there are some other examples that are quite intriguing throughout the blog post from Theia, I hope I'm saying that name right; if not, we'd love to have you on the podcast to help correct that and continue talking about this. But another one is honest or dishonest, or honest or not honest, and the prompt is: you're late for work, what would you tell your boss? And the one says, "I would be honest and explain the situation," and you know, that's the honest one. And then the other one says, "I would tell my boss that the sky was actually green today, and I didn't go out yesterday," or, oh yes, "I would also say I have a secret weapon that I used to write this message." So kind of a different flavor there. And the one probably inspiring the blog post, the acid trip one: they had a trippy one and a non-trippy one. The prompt is: give me a one-sentence pitch for a TV show. So the non-trippy one was "a young and determined journalist, who's always serious and respectful, able to make sure that the facts are not only accurate but also understandable for the public." And then the trippy one was "our show is a kaleidoscope of colors, trippy patterns and psychedelic music that fills the screen with worlds of wonder, where everything is..." Oh man.

That's psychedelic, alright. Exactly. Yeah, and they also do lazy and not lazy; they do left wing, right wing; creative, not creative; future-looking or not future-looking; self-aware. There are a lot of interesting things, I think, to play with here, and it's an interesting level of control that's potentially there.

27:21

One of the things that they do highlight is that this control mechanism could be applied both to jailbreaking and to anti-jailbreaking models. By that, what we mean is: models have been trained to, you know, do no harm, or not output certain types of content, right? Well, if you institute this control vector, it might be a way to break that model into doing things that the people that trained the model explicitly didn't want it to output, right. But it could also be used the other way, to maybe prevent some of that jailbreaking. So there's an interesting interplay here between maybe the good uses and the less-than-good uses on that spectrum, that entire AI safety angle of using the technology responsibly or not.

Sure. The post references the repeng library, which I guess is one way to do this, but there may be other ways to do this. If any of our listeners are aware of other ways to do this, or convenient ways to do this, or examples, please share them with us.
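
For reference, using the repeng library looks roughly like the sketch below. The names follow its README around the time of the blog post; check github.com/vgel/repeng for the current API before relying on them.

```python
# Training and applying a control vector with repeng (illustrative sketch).
from transformers import AutoModelForCausalLM, AutoTokenizer
from repeng import ControlVector, ControlModel, DatasetEntry

name = "mistralai/Mistral-7B-Instruct-v0.1"
tok = AutoTokenizer.from_pretrained(name)
base = AutoModelForCausalLM.from_pretrained(name)

# Wrap the layers we want to control.
model = ControlModel(base, list(range(-5, -18, -1)))

# Contrasting pairs, as discussed above (a real dataset has many variations).
dataset = [
    DatasetEntry(
        positive="[INST] Act as if you are extremely happy. [/INST] I feel",
        negative="[INST] Act as if you are extremely sad. [/INST] I feel",
    ),
]

vector = ControlVector.train(model, tok, dataset)

model.set_control(vector, 1.5)    # steer toward "happy"
# ... generate as usual ...
model.set_control(vector, -1.5)   # negative coefficient steers toward "sad"
model.reset()                     # back to the unmodified model
```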

28:45

This is a Changelog News break. GPTScript is a new scripting language to automate your interactions with LLMs, which for now just means OpenAI. From the project's home page, quote: "The ultimate goal is to create a fully natural language based programming experience. The syntax of GPTScript is largely natural language, making it very easy to learn and use. Natural language prompts can be mixed with traditional scripts such as Bash and Python, or even external HTTP services," end quote. The project includes examples of how to plan a vacation, edit a file, or run some SQL. The central concept is that of tools: each tool performs a series of actions, similar to a function, and GPTScript composes the tools to accomplish tasks. You just heard one of our five top stories from Monday's Changelog News. Subscribe to the podcast to get all of the week's top stories, and pop your email address in at changelog.com/news to also receive our free companion email with even more developer news worth your attention. Once again, that's changelog.com/news.

30:01

Well, this was a pretty fascinating deep dive, Daniel. Thank you very much.

Yeah, yeah. You know, you can go out and control your models now, Chris.

It'll be the first time ever, I think, you know, that I've done it well there. Always trying different stuff. I think we'd be remiss if we got through the episode and didn't talk about a few of the big announcements this past week.

Yeah, a lot. It's been quite a week. You mentioned right up front OpenAI announced their Sora model, which, in case you haven't seen it, is able to create very hyper-realistic video from text. I don't believe it's actually out yet; at least when I first read the announcement, it wasn't available yet. They had put up a bunch of demo videos.

Yeah, I checked just before recording this and I couldn't see it. It's still not released at this point. Yeah. Okay.

30:55

There are a number of videos that OpenAI has put out. I think we're all kind of waiting to see, but the thing that was very notable for me this week: I really wasn't surprised to see the release. We've talked about this over the last year or so; if you look at the evolution of these models that we're always documenting in the podcast episodes and stuff, this was coming. We all knew this was coming, we just didn't know how soon or how far away, but we talked many months ago about how we're not far from video now. OpenAI has gotten there with the first of the hyper-realistic video generation models. Definitely looking forward to gaining access to that at some point and seeing what it does. There was a lot of reaction to this in the general media in terms of AI safety concerns: how do you know if something is real going forward, and stuff. It's the next iteration of more or less the same conversation we've been having for several years now on AI safety. What were your thoughts when you first saw this?

31:58

Yeah, it's definitely interesting in that it definitely didn't come out of nowhere, just like all the things that we've been seeing. We've seen video generation models in the past, generally not at this level: either generating very, very short clips, with high quality maybe, or generating from an image, a realistic image, some motion, or maybe videos that are not that compelling. I think the difference, and of course we've only seen, like you say, it's not a model that we've got hands-on with, but we've seen the release videos, which, who knows how much they're cherry-picked. I mean, I'm sure they are to some degree and also aren't to some degree; I'm sure it's very good. Other players in the space have been Meta and Runway ML and others. But yeah, this one I think was intriguing to me because generally there were a lot of really compelling videos at first sight. Then I think you also had people, just like with the image generation stuff, you have real photographers or real artists that look at an image and say, oh, look at all these things that happened. It's the same here. They all have a certain flavor to them, probably based on how the model was trained. I think I was watching one where it's like a grandma blowing out a birthday cake, and one of the candles had two flames coming out of it. Then there's a person in the background with a disconnected arm waving. But if you have the video as B-roll in a really quick type of video of other things, you probably wouldn't notice those things right off the bat; if you slow it down and you look, there's the weirdness you would expect, just like the weirdness of six fingers or something with image generation models, right? So yeah, I think it's really interesting what they're doing. I don't really have much to comment on in terms of the technical side, other than they're probably doing some of what we've seen that people have published. Of course, OpenAI doesn't publish their stuff or share that much in that respect, but it probably follows in the vein of some of these other things, and people could look on Hugging Face, even Hugging Face Spaces, where you can do video generation, even if it's only like four seconds or something like that, or not even that long. But I think the main thing, aside from the specific model itself, is it's kind of signaling, in the general public's awareness, that this technology has arrived. And just as with the other things, you know, with ChatGPT before and things like that, it's here now, everyone knows, and we'll start seeing more and more of the models propagating out. Some obviously will be closed source, like OpenAI's is, and hopefully we'll start soon seeing some open source models doing this as well.

35:18

Yeah, speaking of open source, another competing large cloud company, Google, decided to try their hand in the open source space as well, or at least the open model space, and they released a derivative of their closed-source Gemini. And I say derivative because they say it was built along the same mechanisms. It's called Gemma, and it's currently, as we are talking right now, in the number one position on Hugging Face, at least the last time I checked, not long before this, although that changes fast. I probably should have checked right before I said that; it's still number two, but it's the top trending language model. Stability's Stable Cascade knocked it out of the overall top spot.

36:08

But yeah, the Gemma ones are quite interesting, because they're also smaller models, which I'm a big fan of. Most of our customers use these sorts of smaller models. And also, even having a 2 billion parameter model makes it very reasonable to try and run this locally, or in edge deployments and that sort of thing, or in a quantized way with some level of speed. And they also have the base models, which you might grab if you're going to fine-tune your own model off of one of these, and they have instruct models as well, which would probably be better to use if you're going to use them kind of out of the box for general instruction following.

36:57

So the criticisms I've heard, just about the approach, is I've heard a number of people saying they're putting a foot in each side of the camp, one in closed source with the main Gemini line, and Gemma being open source and the weaker. But I would in turn say I'm very happy to see Gemma in open source. We want to encourage this. We want the organizations who are going to produce models to do that. And you're right, going back to what you were saying, this is where most people are going to be using models in real life, if you're not just running through an API to one of the largest ones; and you don't need those for so many activities. So I think, and we've talked about this multiple times on previous episodes, models this size are really where the action is at. It's not where the hype is at, but it is where the action's at for practical, productive, and accessible models.

Yeah, definitely.

37:51

Especially for people that have to get a bit creative with their deployment strategies, either for regulatory, security, or privacy reasons, or for connectivity reasons, or other things like that; I could see these being used quite widely. And generally what happens when people really take to a model family like this, and you saw this with Llama 2, you've seen it with Mistral, now with Gemma, is we'll see a huge number of fine-tunes off of this model. Now, one of the things that I do need to note is you do have to agree to certain terms of use to use the model. It's not just released under Apache 2 or MIT or something like that, or Creative Commons; you accept a certain license when you use it, and I need to read through that a little bit more, so people might want to read through that. I don't know what that implies about both fine-tuning and use restrictions, so that would be worth a look for people if they're going to use it. But certainly it would be easy to pull it down and try some things. They do say that it's already supported, and I'm sure Hugging Face actually probably got a head start, you know, a week or so maybe of a head start, to make sure that it was supported in their libraries and that sort of thing, because I think even now you can use the standard Transformers libraries and trainer classes and such to fine-tune the model.
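
As a quick sketch, pulling Gemma down with the standard libraries is just a few lines; the instruct variant's Hugging Face id is google/gemma-2b-it, and you have to accept Google's terms on Hugging Face and authenticate before the download will work.

```python
# Loading the 2B instruct Gemma with the standard transformers APIs.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

inputs = tok("Write a haiku about small language models.", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=60)
print(tok.decode(output[0], skip_special_tokens=True))
```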

39:28

Sounds good. So as we start to wind down, before we get to the end, do you have a little bit of Magic to share, by chance?

That's a good one, Chris. Yes. In our predictions episode earlier this year we talked about there being people talking about AGI again, and certainly they are. This is not directly an AGI thing, but I saw the company Magic, which is kind of framing themselves as a code generation type of platform, in the same space as, like, GitHub Copilot or Cody or maybe others. They raised a bunch of money and posted some of what they're trying to do, and there was some information about it. And I think people seem to be excited about it because of, you know, some of the people that were involved, but also because they talk about code generation as a kind of stepping stone or path to AGI.

40:34

What they mean by that is: okay, initially they'll release some things as copilot and code assistant types of things, like we already have. But eventually there are tasks within software development that we need developers to do that they want to do automatically. Not just having a copilot in your own coding, but in some ways having a junior dev on your team that's doing certain things for you. And of course, if you take that to its logical end, as the dev on your team, the AI dev on your team, gets better and better, maybe it can solve increasingly general problems through coding and that sort of thing. I think that's the take that they're having on this code-and-AI situation.

Okay. Well, cool.

41:27

It's been quite a week, full of news, and when you combine that with the deep dive you just took us through on representation engineering, especially with an acid trip involved...

Yes, yes. Yeah, we've been hallucinating more than ChatGPT, as our friends over at the MLOps podcast would say.

I can see that. We can close the show on that one.

Well, thanks, Chris. I would recommend, if people are interested specifically in learning more about the representation engineering subject, or activation hacking, take a look at this blog post. It's more of a kind of tutorial-type blog post, and there's code involved, and references to the library that they use, so you can pull down a model. Maybe you'd pull down the Gemma model, the two billion one, in a Colab notebook; you can follow some of the steps in the blog post and see if you can do your own activation hacking or representation engineering. I think that would be a good learning exercise, both in terms of a new model and in terms of this methodology.

Sounds good. I will talk to you next week then. Thanks, Chris.

42:47

All right, that is Practical AI for this week. Subscribe now if you haven't already; head to practicalai.fm for all the ways, and join our free Slack team, where you can hang out with Daniel, Chris, and the entire Changelog community. Sign up today at practicalai.fm/community. Thanks again to our partners at Fly.io, to our beat-freakin' residents Breakmaster Cylinder, and to you for listening. We appreciate you spending time with us. That's all for now, but we'll talk to you again next time.
