The state of open source AI

Released Tuesday, 12th December 2023

Episode Transcript

Transcripts are displayed as originally observed. Some content, including advertisements, may have changed.


2:00

and really looking forward to that. Sounds

2:02

like fun. Yeah, and the

2:05

hackathon is all centered around

2:07

these openly accessible or open

2:09

source or permissively licensed generative

2:11

AI models. I think it's

2:13

really fitting because we have

2:15

with us Casper, who is

2:17

a long time open source

2:19

enthusiast, but also one of

2:21

the contributors to the recently

2:23

published state of open source

2:25

AI book from Prem. So

2:27

welcome, Casper. It's great to

2:29

have you with us. Hello,

2:31

yes, yeah, great to be here.

2:34

Yeah, well, I mentioned you're a

2:36

long time open source enthusiast. How

2:38

did you kind of get enthused

2:41

about open source AI specifically?

2:43

So what was your own kind of

2:45

journey into open source AI, maybe kind

2:47

of leading up to this book and

2:49

what it's become? I mean, that's

2:51

a good question. I've been around for long enough that

2:54

AI didn't really exist as a thing back

2:56

when I got into open source. And it was

2:58

honestly just purely a hobby. I never even considered

3:00

it as a career. This

3:02

was, I mean, what, 15 years

3:04

ago or something. And in fact, I felt

3:07

ashamed and embarrassed every time I was working in

3:09

open source because it felt like I should have

3:12

been spending that time working on an actual career,

3:14

right? It felt like it was just a

3:16

toy. I had a very

3:18

long commute between my home and workplace

3:20

on a train, and I was just

3:22

coding away on my phone. I

3:24

actually installed Debian, loaded it on my Android.

3:27

And yeah, that got me hooked on open

3:29

source purely as a hobby. And I mean,

3:32

if you contribute enough and you're happy making

3:34

mistakes in public, eventually you build something that

3:36

loads of people start using, it spirals

3:39

out of control. Before you know

3:41

it, it suddenly turns into a career. So

3:44

I probably entered into this whole space in

3:46

an unconventional way. I didn't intend to make

3:49

things that would become famous, but they just

3:51

wound up becoming famous, which is quite pleasant.

3:54

I mean, there's pros and cons because also

3:56

things that become successful aren't necessarily things that

3:58

you expect to become successful, right? You

4:00

can put a lot of effort into something

4:03

and the world determines it's not really of

4:05

much value and so they don't use it and

4:07

something you barely put much effort into

4:09

could explode, right? So that

4:12

was my sort of background. I have

4:14

kind of an academic slant as well, so I

4:16

did a lot of machine vision type things in

4:18

university. Didn't really want to

4:20

shoehorn myself into any particular one

4:22

area though and also I didn't want

4:25

to do pure academia, right? I much

4:27

prefer industry and having stakeholders

4:29

and actual products that you build at the

4:31

end of the day and I mean, there's

4:33

pros and cons definitely to those. But yeah,

4:35

so that's obviously how I wound up like

4:38

the rest of the industrial world seemingly

4:40

moving towards AI because that's

4:43

a buzzword and that's what everyone wants you to work

4:45

on effectively. So yeah, what started

4:47

off as initially being machine vision pre-machine

4:50

learning became machine learning type machine vision

4:52

type stuff and now of course LLMs

4:54

are all the rage. So

4:56

that's why we thought of doing a

4:58

bit of extra research and try and

5:00

consolidate all of the noise out there

5:02

and various different blog posts, people

5:04

effectively shouting to the ether and we thought we

5:06

might as well write a book and

5:09

release some of our research in the wild, get some

5:11

feedback on that before we actually start

5:13

building more things. Yeah, that's

5:15

awesome and you even allude to this

5:18

in the sort of intro to the

5:20

book, this sort of fast paced nature

5:22

of the field and a lot of

5:24

people feeling sort of FOMO like how

5:27

do I even categorize all of the things

5:29

that are happening in open

5:31

source AI? So maybe one

5:35

kind of general question about the structure

5:37

of this. Chris and I have

5:39

worked through some of these categories in various

5:41

episodes on the podcast, but sometimes it is

5:43

hard to sort of think about like

5:46

how do you categorize all the things that

5:48

are happening in open source

5:50

AI because they do go beyond

5:52

just models, but they include models

5:54

and a lot of things are

5:56

sort of interconnected. So how did

5:58

you kind of... was it

6:01

organic in how the structure of this book

6:03

came together or how did you come up

6:05

with the major categories in your mind for

6:07

what's going on in open source AI? And

6:09

that's what I was really wondering as well.

6:12

You literally said Daniel exactly what was

6:14

in my head just now.

6:16

Yeah, we're in tune. Yeah,

6:19

no, I mean, it is a big ask because

6:21

I mean, my philosophy in general is that the

6:23

universe exists as a cohesive whole. And you know,

6:26

we split it up into different subjects like physics

6:28

and chemistry and math just as a

6:30

way for humans to actually parse everything

6:32

that exists in these more bite-sized chunks.

6:34

But they're not really independent subjects, right?

6:36

And the same goes with AI. I

6:39

mean, there's so many different categories of

6:41

AI. So I mean, the

6:43

nice thing about working in the open source space is that

6:45

there's lots of different people you can have conversations with, get

6:47

some feedback. Everyone kind of chipped

6:49

in their own ideas about how to, let's say,

6:52

break down a book into different chapters. Ultimately,

6:54

I think what made the most sense

6:56

is that it doesn't matter too much

6:58

what those chapter titles are. It's more

7:00

about the content within them

7:03

being, let's say, not too repetitive, and actually,

7:06

you know, distilling the ideas that people are

7:08

talking about. And if you can do that

7:10

really well, it may be

7:12

almost doesn't matter quite how you self-categorize

7:14

things. But I would say Filippo is

7:16

probably the one who came

7:18

up with the actual final, let's say

7:20

10 chapters. But then past that,

7:23

in terms of, you know, actually writing those

7:25

chapters, probably about a dozen people have actually worked

7:27

with them, which is, again, really nice that you

7:29

can do this in the open source space, like

7:31

no, no single person is really the author of

7:33

this book. It seems fairly

7:35

obvious to me, based on my own

7:37

particular passion and research that licensing should

7:39

definitely be a chapter. And that's something

7:42

that developers often neglect, because it's just sort of

7:44

outside their field of interest and expertise. And it's

7:47

just a bit of red tape that maybe they

7:49

have to be aware of in the back of

7:51

their mind. But yeah, so that I

7:53

mean, I basically wrote a chapter on licenses,

7:55

which I think everyone else was happy about.

7:57

Nobody else wanted to do it. But sure.

8:00

I mean, it was just effectively topics that

8:02

we felt are big major things that there's

8:04

a lot of confusion over. Maybe we ourselves

8:06

were confused about it as well. So evaluation

8:08

and data sets, what's the best way to

8:10

evaluate a model anyway? So that

8:12

seemed like a big topic. Let's make that a

8:14

chapter. So it seemed fairly organic coming up with

8:16

these titles. And of course, as

8:19

we were writing this, again, it was all

8:21

fully open source in the whole writing process.

8:23

We thought maybe we should split up a

8:25

chapter. So we split up models into two

8:27

chapters, let's say, one specifically for unaligned models

8:30

versus aligned models. So it was

8:32

an iterative process. Yeah. On

8:35

that front, I definitely hear the passion coming

8:37

through for that licensing element of

8:40

that. And I see that upfront in the

8:42

book. And maybe, so I'm also

8:44

very, very much like we've mentioned on

8:47

the podcast multiple times that people need

8:49

to be reviewing these things, especially as

8:51

they see whatever 400,000 models on

8:56

hugging face and kind of parse through

8:58

these things. Could you kind

9:00

of give us maybe the pitch

9:02

for engineering teams or tech

9:04

teams that are considering open

9:07

models, but might not

9:09

be aware of the kind of various

9:12

flavors of openness that are occurring

9:14

within kind of quote, open source

9:16

AI? Could you just give us

9:18

a little bit of a sense

9:20

of maybe why people should care

9:22

about that and maybe just at

9:24

a high level, what are some

9:26

of these kind of major flavors

9:28

that you see going on in

9:30

terms of openness and access? Right.

9:33

Yeah. I mean, I suppose first

9:35

I should have a disclaimer, which is the quiet part

9:37

that nobody usually says, which is almost

9:40

a counter argument. It might not

9:42

matter because in practice,

9:45

nobody is going to sue you if you do something

9:47

illegal, unless you're fairly big

9:49

and famous. Right. And that's just a

9:51

harsh truth. And it's very frustrating that,

9:53

you know, laws and enforcement tend to

9:56

be two separate things. There

9:58

is a precedent in law that you're not meant

10:00

to create a law unless you know definitely you can enforce

10:02

it. So to a large extent,

10:04

a lot of these licenses out there are

10:07

questionable in that regard. The

10:09

other thing is a lot of these licenses are not actually, let's

10:12

say tested in court, they're not

10:14

actually formally approved by any government

10:16

or legal process. So it's

10:18

not necessarily legal just to write something in

10:20

a license. You should probably be

10:22

aware of recent developments in the EU, for example,

10:25

that proposed the two new

10:27

laws, the CRA and PLA, two new acts, I should

10:29

say, that are effectively saying the

10:31

no warranty clause in all of these open

10:34

source licenses might be illegal if you are

10:36

in any way benefiting, let's say monetarily, even

10:38

if it's indirectly. So you're a company releasing

10:40

open source things purely for advertising purposes, but

10:42

you're not directly gaining any money from it.

10:45

They're still going to ignore the no warranty

10:47

clause. So yeah, there's interesting stuff

10:49

in that space. But I would say as a

10:51

developer, the things that you should be aware of

10:53

when it comes to model openness is that there's

10:56

a difference between weights, training data

10:58

and output. Those are

11:00

the three main categories, really. So licenses

11:03

usually make a distinction with,

11:05

it's not licenses, it's more about

11:07

the source. So are the model

11:10

weights available? That's often the only thing that

11:12

developers care about in the first instance, because

11:14

that means they can download things and just

11:16

play with it. But if

11:18

you actually care about explainability or in any way

11:20

alignment in order to figure out how you might

11:22

be able to make a model aligned or unaligned

11:24

or whatever you want to do with it, you

11:27

probably do need to know a bit about the

11:29

training data. So is the training data at least

11:31

described, if not available? And when I say described,

11:33

as in more than just a couple of sentences

11:35

saying how the data was obtained, but actual

11:37

full references and things. So a lot of

11:39

models are not actually open when it comes

11:42

to the training data. And then of

11:44

course, the final thing is the licensing around the outputs

11:46

of the model. Do you really own it? Are you

11:48

allowed to use it for commercial purposes? And

11:50

even if you are, it's highly dependent

11:53

on the training data itself, right? Because

11:55

if the training data is not permissively

11:57

licensed, then technically you shouldn't really have

12:00

much permission to use the output either,

12:02

right? So I think even

12:05

developers are kind of confused about

12:07

the ethics around the permissions. So

12:09

certainly legally we're super confused as

12:11

well. I have two questions

12:13

for you as follow up, but they're unrelated,

12:15

but I'm going to go ahead and throw both

12:18

of them out. Number one, the quick one

12:20

I think is, could you define what an

12:22

aligned model versus an unaligned model is just to

12:24

compare those two for those who haven't heard

12:26

those phrases? And then I'll go ahead

12:28

just as you finish that and say, and

12:31

what's the reason that I notice, you know,

12:33

licensing is addressed at the very top of

12:35

the book? And is that framing the

12:37

way you would look at the rest of the book

12:39

or is that more just happenstance that it came

12:41

there? I was just wondering how that fits into the

12:44

larger story you're telling. Yes. So

12:46

for those who don't know, unaligned

12:48

models, it's effectively, if

12:50

you train a model in a bunch

12:52

of data, it is by default considered

12:54

unaligned. But in the interest of safety,

12:57

what most of the famous models that you've

12:59

heard of do, like ChatGPT, for example,

13:02

is add safeguards to ensure

13:04

that the model doesn't really

13:06

output sensitive topics, issues, anything

13:09

illegal. It's still probably capable

13:11

of outputting something quite bad,

13:13

but there are safeguards. And

13:15

the process of adding safeguards to

13:17

a model is called aligning a

13:19

model as in aligning with good

13:21

ethics. I suppose that's the implicit.

13:24

Gotcha. Thank you very much. And

13:26

then I was just wondering, like I said,

13:28

the positioning of licensing at the front, is

13:30

that relevant? Or is that just happenstance?

13:33

We did sort of think of an order

13:35

of chapters, let's say, and licensing just seemed

13:37

like a good introduction, let's say, because it's

13:39

before you get into the meat and the

13:42

details of actual implementations and where you can

13:44

download things and where the research is going,

13:46

let's say. Well, Casper, as

13:48

you were kind of, you were just

13:50

describing the kind of framing of the

13:52

book, and also some of these concerns

13:54

around licensing, I'm wondering if we could

13:56

kind of take a little bit of

13:58

a step back as well

14:01

and think about like, what

14:03

are some of the main kind of

14:05

components of the open source AI ecosystem?

14:07

The book kind of details all of

14:09

these, but what are some of like

14:11

the big major components of

14:14

the AI ecosystem, maybe beyond

14:16

models? Cause people obviously

14:18

have maybe thought about or heard of

14:21

generative AI models or LLMs or

14:23

text to image models, but there's

14:26

a lot sort of around the

14:28

periphery of those models that make

14:30

AI applications work or be able

14:33

to run in a company

14:35

or in your application or whatever you're

14:37

building. So could you describe maybe a

14:39

few of these things that

14:41

are either orbiting around the models, if you

14:43

view it that way, or part of this

14:46

ecosystem of open source AI? Sure,

14:48

I mean, there's huge issues I would

14:50

say regarding, I'd say performance

14:53

per watt, effectively

14:55

electrical watt. There's a lot of

14:57

development in the hardware space and

15:01

we have new Mac M1 and M2s, which

15:03

might actually mean you can fairly easily do

15:05

some fine tuning and or at least inference

15:08

on a humble laptop without ever needing CUDA.

15:11

It seems like there's a lot of shifts

15:13

and paradigm changes when it comes to the

15:15

actual engineering implementations. Web GPU

15:17

is a big upcoming thing, which I mean, it

15:20

has technically been going on for a decade or

15:22

more, but it might actually have reached a point

15:24

where possibly we can just write

15:26

code once and it just works in all

15:28

operating systems on your phone. You can get

15:30

an LLM just working wherever. But

15:33

yes, I mean, there's effectively a lot

15:35

of MLOps style problems. It's one thing

15:37

to have a theory of how to

15:39

actually create an LLM, but quite another

15:41

thing to actually train a thing, fine

15:43

tune it or deploy it in a

15:45

real world application. So there are a

15:47

lot of competing, let's say, software development

15:49

toolkits, desktop applications. And I

15:51

don't think anyone's really settled on one

15:54

that's conclusively better than anything

15:56

else. And really based on

15:58

your individual use case, you have to do

16:00

an awful lot of market research just to find

16:02

something that suits your use case. I

16:05

ask this because we've had a

16:07

number of discussions on the show about

16:09

sort of training, fine

16:12

tuning, and then this sort

16:14

of prompt or retrieval based

16:17

methodologies. So from your

16:19

perspective as someone that's kind of taken

16:21

survey of the open source AI ecosystem

16:24

and is operating within it

16:26

and building things, what is

16:28

your kind of vision for where

16:31

things are kind of headed

16:33

in terms of more

16:35

sort of fine tunes getting easier

16:37

and fine tunes being everywhere or kind

16:40

of pre-trained models getting better and people

16:42

just sort of implementing fancy prompting

16:44

or retrieval based methods on top of

16:47

those? Do you have any opinion on

16:49

that sort of development? I know

16:51

it's something that's on people's mind because

16:53

they're maybe thinking about, oh, this

16:55

is harder to fine tune, but is it

16:58

worth it because I'm getting maybe not ideal

17:00

results with my prompting. Yeah, no, it makes

17:02

sense. I would say basically if you're

17:04

not doing some form of fine tuning, you're

17:06

not producing anything of commercial value. Effectively,

17:09

it's very much like hiring an intelligent

17:11

human being to work for you without

17:14

them having any particular expertise and

17:17

not even knowing what your company does. That's

17:19

what a pre-trained model is effectively. So

17:21

you do need to fine tune these

17:23

things or have some amount of equivalent,

17:25

anything else that's equivalent to fine tuning,

17:28

let's say. In terms of things

17:30

that actually predate LLMs, I think there's a lot of

17:32

stuff that is very useful and even

17:34

maybe far more explainable, but people seem to

17:37

be discounting just because it's

17:39

easy to get some result out of an

17:41

LLM just by prompting it. So people view

17:43

it as good enough and they start using

17:46

it even though it's maybe not safe, right?

17:48

So one thing I would

17:50

really recommend people look at is

17:52

embeddings. Just by doing a simple

17:55

vector comparison in your embeddings, you

17:57

can find related documents. You don't need

18:00

an LLM to drive that, because an LLM is

18:02

effectively, instead of explicitly making an embedding

18:04

of your query, you know, converting your query

18:07

into a vector and then comparing it

18:09

to other vectors in your database that correspond

18:11

to, let's say, documents or paragraphs that

18:13

you're trying to search through, your LLM is

18:15

automatically doing that entire process. And it might

18:17

make mistakes while it does that, right?

18:19

It's going to paraphrase things, which it might

18:22

get wrong because it can't even do

18:24

simple basic mathematics, it doesn't understand logic, right?
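The vector comparison recommended here can be sketched in a few lines. This is a minimal, hedged illustration in plain Python: the tiny hand-made vectors below just stand in for the output of a real embedding model, which is where the vectors would actually come from.

```python
# Minimal sketch of embedding-based document search: compare a query vector
# against stored document vectors and return the closest matches.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vec, doc_vecs, top_k=2):
    """Rank documents by similarity to the query vector, best first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_k]

# Toy 3-dimensional "embeddings" for three documents.
docs = [
    [1.0, 0.0, 0.0],   # doc 0
    [0.9, 0.1, 0.0],   # doc 1: similar topic to doc 0
    [0.0, 0.0, 1.0],   # doc 2: unrelated
]
query = [1.0, 0.05, 0.0]

print(search(query, docs))  # docs 0 and 1 rank first
```

Unlike an LLM paraphrasing an answer, every step here is inspectable, which is the explainability point being made.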

18:27

So, yeah, whenever it comes to

18:29

things like, let's say, medical imaging, where there's

18:31

a lot of interest in how can

18:33

we use AI to improve this, people

18:35

tend to get frustrated with how slow

18:37

the uptake of AI is. But there's

18:39

a reason for that, which is explainability

18:41

is important, right? So, the way I

18:43

see things going is, yes, far more

18:45

fine tuning, more retrieval-augmented generation type

18:47

stuff, RAG stuff, and then also

18:50

probably push into explainability.

18:52

I don't really think

18:54

there's much explainability in LLMs right now

18:56

in general. Everyone's been

18:58

so focused on LLMs, but large

19:00

vision models are kind of one of the

19:02

newer things on the rise. What is your

19:04

take on large vision models in the future

19:07

and how they start integrating in?

19:09

I was just, Andrew and Guy are talking

19:11

about some of them now and I

19:13

would love your take on it. Sure. I mean,

19:15

we didn't quite get to covering this in

19:17

the book. I mean, that's how fast-paced things

19:19

are. So, multimodal things are

19:22

super interesting. To me, my

19:24

feeling is that it's effectively gluing together

19:26

existing models into pipelines and

19:28

it hasn't been historically something that I

19:30

was that interested in, because that's more

19:33

an application and it's not so much

19:35

something you need to research per se.

19:37

It's very similar to how the

19:39

OpenAI people were very surprised that ChatGPT

19:41

exploded in popularity, even though technically the technology is

19:43

quite old. It's just, you know, you lower the

19:46

entry barrier a little bit and then everyone actually

19:48

starts using it because they can, right? So,

19:50

to me, the multimodal type stuff is similar.

19:53

It could result in really

19:55

innovative new companies popping up and new

19:57

solutions that are actually usable by the

19:59

general public, but in terms of the

20:01

underlying technology, it doesn't seem that particularly novel

20:03

to me. As you

20:05

kind of looked at the landscape

20:08

of models itself and the licensing

20:10

of those models, the support for

20:12

those models and underlying MLOps sort

20:14

of infrastructure, the support for an

20:17

underlying kind of like model optimization,

20:20

you know, toolkits and that sort of

20:22

thing. Some people out there

20:24

might hear all of these words like,

20:26

oh, there's these Llama 2 models and

20:29

there's now Mistral and then there's, you

20:31

know, now Yi and like all of these. As

20:35

you were going through and researching the book

20:38

and also kind of doing

20:40

that as an open source community,

20:42

can you orient people at all

20:44

in terms of the kind of

20:47

major model families? So you already

20:49

distinguish between sort of models and

20:51

unaligned models. Is there any

20:53

kind of categories within the models that you

20:55

looked at that you think it would be good

20:57

for people to have in their mind in

20:59

terms of, hey, I have this application or I

21:02

have this idea for working on this. I'm

21:05

listening to Casper. I want to maybe fine

21:07

tune a model. I've got some cool data

21:09

that I can work with. Where

21:11

might be a sort of well

21:13

supported or reasonable place for people

21:15

to start in terms

21:17

of open LLMs or open

21:20

text to image models if you also want

21:22

to mention those? Sure. I

21:24

mean, because there's just a new model

21:26

basically being proposed every day, I mean,

21:28

often it's a small incremental improvement over

21:30

a previous model. So in

21:32

terms of actually trying to compare them

21:34

from a theoretical level without looking at their results,

21:37

there isn't really much to talk about in terms

21:39

of, you know, large model families. There might be

21:43

an extra type of layer that has been

21:43

added to a model in order to give it

21:46

a new name, let's say. Nothing

21:48

particularly stands out there. I mean, we do have

21:50

a chapter on models where we try and address

21:52

some of the more popular models over

21:55

time, the proprietary ones and then the

21:57

open source ones. But... I

22:00

would say nothing particularly stood out to me

22:02

over there. I suppose the more interesting thing

22:04

in terms of actually implementing

22:06

something for your own particular use case

22:08

is starting with a base

22:10

model that has pretty good performance on presumably

22:13

other people's data that looks as close as

22:15

possible to the data that you actually personally

22:17

care about. So you don't have to wait

22:19

too long when then fine tuning it on

22:21

your own data. So for that, I think

22:24

the most important thing is to take a

22:26

look at the most up-to-date leaderboards. And

22:28

there are quite a few different leaderboards out there. We

22:31

do also have a chapter on that. And that was, interestingly,

22:34

also a nightmare to keep up to

22:37

date because the leaderboards themselves are also

22:39

changing regularly. New leaderboards are being proposed

22:41

for different things. And take

22:44

a look at a leaderboard, pick the best model performing

22:46

there, and then start doing some fine tuning.
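The leaderboard-first workflow suggested here can be sketched as a simple filtering step. Everything below is invented for illustration: the rows, scores, and field names are made-up stand-ins for entries you would pull from a real, up-to-date leaderboard.

```python
# Hypothetical leaderboard rows: consult a leaderboard, filter for models you
# can actually use (license, size), then pick the top scorer as your base.
leaderboard = [
    {"model": "model-a", "score": 72.1, "license": "apache-2.0", "params_b": 7},
    {"model": "model-b", "score": 74.5, "license": "research-only", "params_b": 7},
    {"model": "model-c", "score": 70.3, "license": "apache-2.0", "params_b": 3},
]

def pick_base_model(rows, allowed_licenses, max_params_b):
    """Return the best-scoring model that fits licensing and hardware limits."""
    usable = [r for r in rows
              if r["license"] in allowed_licenses and r["params_b"] <= max_params_b]
    return max(usable, key=lambda r: r["score"])

best = pick_base_model(leaderboard, {"apache-2.0", "mit"}, max_params_b=7)
print(best["model"])  # prints "model-a": model-b scores higher, but its license rules it out
```

The selected checkpoint then becomes the base for fine-tuning on your own data, which also ties back to the earlier licensing discussion: the top scorer is not always the one you are allowed to use.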

22:48

That would be my MO. This

22:50

kind of gets to one of the

22:53

natural questions that might come up with

22:55

a book on this topic,

22:58

which is things are evolving so

23:00

quickly. And you mentioned

23:02

the strategy with this book being

23:04

to have the book be open

23:06

source, have multiple contributors. And

23:09

I'm assuming part of that is also with

23:12

a goal for it to be

23:14

updated over time and be an

23:16

active resource. How have you

23:18

seen that start to work

23:20

out in practice? And

23:23

what is your hope for that sort of community

23:25

around the book or contributors around the book to

23:27

look like going into the future? Sure,

23:30

yeah. I mean, for the evaluation and

23:32

data sets thing, we already have more

23:34

than a dozen leaderboards, just the names

23:36

of the leaderboards and links to them,

23:38

and then what benchmarks they actually implicitly

23:40

include. We have

23:42

comments at the bottom of each chapter,

23:44

which are driven by GitHub effectively, powered

23:46

by Utterances, which is this integration tool

23:49

helper. So you don't

23:51

need to maintain a separate comments platform,

23:53

let's say, and also encourages people to

23:55

open issues, open pull requests. If

23:58

we've made any mistake or something like that, architecture,
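The comment system described here, GitHub-issue-backed comments rendered on each chapter page, is typically wired up with a small embed script. A hedged sketch, assuming the widget is utteranc.es; the repository name is a placeholder, not the book's actual repo:

```html
<!-- Loads the utteranc.es widget, which stores each page's comments as a
     GitHub issue in the named repository. "owner/book-repo" is a placeholder. -->
<script src="https://utteranc.es/client.js"
        repo="owner/book-repo"
        issue-term="pathname"
        theme="github-light"
        crossorigin="anonymous"
        async>
</script>
```

Because comments live as GitHub issues, readers are already one click away from opening a proper issue or pull request, which is the workflow being described.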

26:00

the things that we're building. So effectively,

26:03

our strategy was to first do a lot of

26:05

research. We didn't mind publishing this for the general

26:07

public to have a look at. So we released

26:09

it in a book. And now

26:11

we're working on actually reading our own book and maybe

26:14

taking some of its advice and building things. And

26:16

we have this very much fast

26:18

paced startup style, let's build lots

26:21

of different things, try lots of different experiments.

26:23

It's fine if we throw things away. This

26:42

is a changelog news break. One

26:45

year after ChatGPT brought a

26:47

seismic shift in the entire landscape

26:49

of AI, a group of researchers

26:51

set out to test claims that

26:53

its open source rivals had achieved

26:55

parity or even better on certain

26:58

tasks. In the linked

27:00

paper, they provide an exhaustive overview

27:02

of the success surveying all tasks

27:04

where an open source LLM has

27:07

claimed to be on par or

27:09

better than ChatGPT. Their conclusion,

27:11

quote, in this survey, we deliver

27:14

a systematic review on high performing

27:16

open source LLMs that surpass or

27:18

catch up with ChatGPT in

27:21

various task domains. In addition, we

27:23

provide insights, analysis and potential issues

27:25

of open source LLMs. We believe

27:27

that this survey sheds light on

27:30

promising directions of open source LLMs

27:32

and will serve to inspire further

27:34

research and development helping to close

27:36

the gap with their paid counterpart.

27:38

End quote. It's becoming increasingly

27:41

clear to me that the data

27:43

models powering future AI rollouts will

27:45

be commoditized and democratized. Thanks to

27:47

the competitive nature and hard work

27:50

of both academia and industry. What

27:52

a relief. You just

27:54

heard one of our five top

27:56

stories from Monday's changelog news. Subscribe

27:58

to the podcast to get

28:01

all of the week's top stories

28:03

and pop your email address in

28:05

at changelog.com/news to also receive our

28:07

free companion email with even more

28:09

developer news worth your attention. Once

28:12

again, that's changelog.com/news.

28:19

So, Casper, I want to actually do a quick

28:21

follow-up of something you were just saying as we

28:24

were going into the break, and that was, you were

28:26

talking about, you know, now we're going to start

28:28

going through the book ourselves and taking the advice.

28:31

And that brings up kind of a business-

28:33

oriented question I want to ask about it. And

28:35

so, you go out today, you've

28:37

listened to the podcast, downloaded the book, and

28:40

there's so much great information in all

28:42

of these chapters, and the comparisons of,

28:44

you know, the different options

28:46

that each chapter addresses, whether good or

28:48

bad, and things like that. If someone

28:50

is just getting going, or maybe they're

28:52

starting a new project, and they're using

28:55

your book as a primary

28:57

source to kind of help them

28:59

make their initial evaluations, how

29:01

best to use that book? Because there's a lot

29:03

of material in here, in terms of, you know, all

29:05

these different categories: they need to come up with

29:08

their pipelines, and, you know, go back to the

29:10

leaderboards and select the models and

29:12

the architectures they'd be using, and all that.

29:15

If you were looking at this initially

29:17

with a new set of eyes, but also

29:19

having the insight of being one of the

29:22

authors and editors of this, how

29:24

would you recommend that somebody

29:26

best be productive as quickly

29:28

as possible and get all their questions

29:31

sorted? How would they go about that

29:33

process? Right. I mean, that's not really

29:35

a question I was thinking of addressing

29:37

when, you know, writing the book. So

29:39

I suppose what you're referring to is a

29:41

case where someone has

29:43

a particular problem that

29:45

they want to solve. Sure. And

29:48

an actual let's say business model

29:50

or target audience so

29:52

I mean if there's actually something that you're trying to

29:54

solve the book hasn't been really written from that perspective

29:56

it's more for a student who

29:58

kind of wants to learn about everything,

30:00

right? Or a

30:03

practitioner who just hasn't kept up to

30:05

date with the latest advancements in the

30:07

last year. So the intention is that

30:09

you can skim through the entire book,

30:11

really. You're not meant to necessarily

30:13

know in advance which specific

30:16

chapters might have or spur

30:18

an innovation or an idea that you

30:20

can actually implement to help you. In

30:22

terms of that, I mean, what probably

30:24

might be more useful is looking through

30:26

a couple of blog posts that actually

30:28

take you from zero to, here's an

30:30

example application that, for

30:32

example, will download a YouTube

30:35

video, automatically detect the speech,

30:38

do some speech-to-text recognition type things, and then give you

30:40

a prompt and you can type in a question and

30:42

it will answer it based on that video. We do,

30:44

in fact, have a few blogs giving you these kind

30:46

of examples, right? And I think that

30:49

would probably be more useful if you're actually

30:51

trying to build a product to find existing

30:53

write-ups of people who have built similar things

30:55

and just follow that as a tutorial, right?

30:57

The book is more just to get an

30:59

overview of what's happened in the last year

31:01

in terms of the recent cutting-edge state-of-the-art, right?
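The zero-to-demo pipeline described here, download a video, transcribe it, then answer questions about it, can be sketched in a few lines of Python. Every function below is a placeholder of our own naming, not a real API: an actual build would swap in something like yt-dlp for the download, an open speech-to-text model such as Whisper for the transcription, and a local LLM prompted with the transcript for the answering.

```python
# Sketch of the zero-to-demo pipeline: fetch a video, transcribe it,
# then answer questions against the transcript. Every stage below is
# a stand-in; real builds would swap in yt-dlp, a Whisper-style model,
# and a local LLM respectively.

def download_audio(url: str) -> bytes:
    # Placeholder: a real version would shell out to a downloader.
    return b"fake-audio-for:" + url.encode()

def transcribe(audio: bytes) -> str:
    # Placeholder: a real version would run a speech-to-text model.
    return "The speaker explains that embeddings power retrieval."

def answer(question: str, transcript: str) -> str:
    # Placeholder: a real version would prompt an LLM with the
    # transcript as context. Here we do a naive keyword lookup.
    for sentence in transcript.split("."):
        if any(w.lower() in sentence.lower() for w in question.split()):
            return sentence.strip()
    return "Not found in transcript."

def pipeline(url: str, question: str) -> str:
    return answer(question, transcribe(download_audio(url)))

print(pipeline("https://example.com/video", "What do embeddings power?"))
```

The value of framing it this way is that each stage is swappable: any of the open source options the book compares can slot in behind one of these function boundaries.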

31:04

Yeah, and I think that's a good call-out.

31:06

And I think one of the ways I'm

31:08

viewing this is like I am having a

31:11

lot of those conversations as a practitioner with

31:13

our clients about, you know, how

31:15

are we going to solve this problem? And something might

31:17

come up like, oh, now we're

31:19

talking about a vector database. How does

31:21

that fit into like the whole ecosystem

31:23

of what we're talking about here and

31:26

why did we start talking about this?

31:28

I think that the way that you

31:30

formatted things here and laid them out

31:32

actually really helps put some of these

31:34

things in context for people

31:36

within the whole of what is

31:38

open source AI, which is really

31:40

helpful. So I just mentioned vector

31:43

databases, which we have talked about quite a

31:45

bit on the show and is something

31:47

that, of course, is an important piece

31:50

of a lot of workflows. But there's

31:52

one thing on the list of chapters

31:54

here that maybe we haven't talked about

31:56

as much on this show, and that's

31:58

desktop apps. We've talked

32:01

a lot about whether it be

32:03

like that orchestration or software development

32:05

toolkit layer, like you're talking about

32:07

LangChain and LlamaIndex and

32:09

other things or the models or

32:11

the MLOps or the vector database.

32:13

But I don't think we've talked

32:15

that much about sort of desktop

32:17

apps, quote unquote, associated with this

32:19

ecosystem of open source AI. Could

32:21

you give us a little bit of

32:23

framing of that topic? Like what is

32:26

meant by desktop app here and maybe

32:28

highlighting a couple of those things that

32:30

people could have in their mind as part

32:32

of the ecosystem? Sure,

32:34

I mean, I should probably quickly say about

32:37

vector databases. I don't quite understand why there's

32:39

so much hype over it. To me,

32:41

embeddings are actually the important thing. The database

32:43

that you happen to store your embeddings in

32:45

is almost like a minor implementation detail. Unless

32:47

you're really dealing with huge amounts of data,

32:49

it shouldn't really matter which database you pick,

32:51

right? Sure, valid point. I don't know if

32:53

you have a different opinion there though. No,

32:57

I think it's not necessarily

33:01

one or the other, but in

33:03

my opinion there's use cases for both, but

33:06

not everyone should assume that they fit

33:09

in one of those use cases; they should still

33:11

figure out what's relevant for their own problem, so.
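Casper's point, that the embeddings carry the semantics while the store is a minor detail, is easy to see in a toy sketch. The vectors below are made up for illustration (real ones would come from an embedding model), and at this scale a plain dict plus a brute-force scan is a perfectly serviceable "vector database":

```python
import math

# Toy semantic search: the "database" is just a dict of name -> vector.
# The vectors are invented for this example, not real model outputs.
docs = {
    "cats":    [0.9, 0.1, 0.0],
    "kittens": [0.8, 0.2, 0.0],
    "stocks":  [0.0, 0.1, 0.9],
}

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec):
    # Brute-force scan: fine until the corpus is large enough that
    # approximate-nearest-neighbour indexes (what vector DBs sell) pay off.
    return max(docs, key=lambda name: cosine(docs[name], query_vec))

print(nearest([0.85, 0.15, 0.0]))  # a query vector close to the feline docs
```

Dedicated vector databases earn their keep when the corpus is large, when you need filtering and persistence, or when approximate-nearest-neighbour indexing matters; for small corpora the embedding quality dominates the result.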

33:14

But yeah, in the desktop space, I think maybe

33:17

there aren't that many developers who talk about

33:19

it because it's almost

33:21

front-end type applications as

33:23

opposed to getting stuck into the details

33:26

of implementing, fine-tuning, and all that stuff

33:28

tends to mean more back-end, let's say,

33:30

in inverted commas. So I

33:32

think that might be one of the reasons why

33:35

there aren't that many desktop applications being produced because

33:37

you kind of need both, both front-end and back-end,

33:39

and that maybe naturally lends

33:41

itself to more the sort

33:44

of resources that only a

33:46

closed-source company might be willing to

33:48

dedicate. So maybe that

33:51

just might be why there's not so much in the

33:53

open-source space. Just takes a lot of

33:55

development effort. But yeah, there are a

33:57

few that we do mention in the book. There's

33:59

LM Studio, GPT4All, KoboldCpp. All of

34:01

them are still very new

34:03

because I mean the thing that they're

34:05

effectively giving you a user interface for

34:07

itself is very new. Yeah,

34:10

I mean there are some common design

34:12

principles that are maybe being

34:14

settled on. You know, you do

34:16

expect a prompt if you're dealing

34:18

with language models. You do expect

34:20

a certain amount of configuration for

34:22

images if you're dealing with images

34:24

like how many, what's the

34:26

dimensions and some basic pre-processing that

34:29

has nothing to do with artificial

34:31

intelligence but you might still expect to see this

34:33

sort of thing in one place rather than having

34:35

to switch between a separate

34:37

image editor and your pipeline. Things

34:40

that I'm kind of interested in is improving

34:42

the usability or the end user pleasure

34:44

let's say of using the desktop apps

34:47

far more so. Can you sort of

34:49

graphically connect these pipelines together like some

34:51

sort of a node editor so you

34:53

can drag and drop models around and

34:55

like drop their inputs, connect

34:57

their inputs and outputs to each other so that you

34:59

can have a nice visual representation of your

35:02

entire pipeline. But yeah, I'm excited to

35:04

see what happens in that space. To some

35:06

extent, I think Prem itself is probably interested

35:09

in developing a desktop app itself. As

35:12

you've gone through the process of putting the book

35:14

together and I think one of the things that

35:16

in any project that folks do is kind of

35:18

like when to go ahead and put it out

35:20

there. There's a point where you have to kind

35:22

of put a pin in it and say that's

35:25

this one right now but our

35:27

brains never stop working obviously on

35:29

these problems. To that effect,

35:31

you get the book out there. Is

35:33

there anything and you have conversations like this one that we're having

35:35

right now where we're talking about it and you're like, well, it

35:38

wasn't meant for that but it was meant for this. Is

35:40

there anything in your head that you're starting to think,

35:42

well, maybe that should have been a topic or

35:45

something we should have put in

35:48

the book maybe next time with this landscape

35:50

evolving so fast. Where has your

35:52

post-publishing brain been at on these collection

35:54

of topics? We definitely have

35:56

yet another 10 more chapters

35:59

planned. So there's definitely going

36:01

to be a second edition of this book or

36:04

maybe I should say second volume. It's not even a

36:06

second edition, it's not corrections or that kind of thing,

36:08

it's ten whole new chapters. Yes, literally

36:10

v2 that's going to include a

36:12

lot of interesting stuff about things

36:15

that happened in the last half of 2023 and

36:18

hopefully will be developed in '24 as

36:20

well among the things that people

36:22

are talking about I mean, we already talked

36:24

about vector databases a little bit and maybe

36:26

you're like you don't see the hype there

36:29

What are some things in the ecosystem

36:31

that you're really really excited about

36:34

and then some things that maybe? Like,

36:36

is there anything else that

36:38

you're like, ah, people are talking about this

36:40

a lot, but I don't really

36:42

see it going anywhere? Any hot

36:45

takes? I mean, I probably

36:47

already covered some of these things right

36:49

what I'm super interested in is fine-tuning

36:51

and lowering entry barriers. Other things

36:54

that I'm not all that convinced by are pretending

36:57

that AI is AGI. They're not the same,

36:59

I'm sorry, and I don't see it. And

37:02

I don't trust these models to be more

37:05

intelligent right now than at best

37:07

a well-trained secretary. They're

37:10

considerably faster. So, you know, there are

37:12

applications where being able to churn

37:14

through a lot of text really quickly is actually

37:16

a value, in which case, yes, great, apply one

37:18

of these things. But apart from that,

37:20

I don't really buy the hype. Yeah,

37:23

that's fair I think and as we kind

37:25

of get closer to

37:27

an end here. I'm wondering maybe

37:29

there's some in

37:32

our listener base that don't have

37:34

the kind of history in open

37:37

source that you do and

37:39

of course there's contributions to this

37:41

book that would be relevant But

37:43

there's also contributions within this whole

37:46

ecosystem of open source AI whether it's

37:48

in toolkits or it's

37:50

in the desktop apps or it's in the

37:53

in the actual models or data

37:55

sets or evaluation techniques themselves

37:58

for those out there that maybe are are

38:00

newer to open source, do

38:02

you have any recommendations or

38:04

suggestions in terms of more

38:07

people getting involved in open source

38:09

AI? Obviously the book

38:11

is a piece of that because it's

38:14

open source and people could contribute to

38:16

that. But maybe more broadly, do you

38:18

have any encouragement for people out there

38:20

in terms of ways to get started

38:23

in contributing to open source AI rather

38:25

than just consuming? Sure, yeah,

38:27

no, I would say that basically every

38:29

time you consume, you are 90%

38:31

of the way there to

38:34

contributing back as well. So you

38:36

have probably cloned a repository somewhere

38:38

in order to run some code,

38:40

right? You probably encountered some issues

38:43

and a lot of those issues probably are

38:45

genuine bugs because these are fast moving things,

38:47

people just write some code without necessarily doing

38:49

full proper robust testing. We don't have time

38:51

to do robust testing, right? A lot of

38:53

the time they're just throwaway experiment-type

38:55

things, so we're in make and break mode. Yeah,

38:58

so if you find an issue rather than quietly

39:00

fixing it yourself, feel free to open

39:02

a pull request and maybe you're not new, but

39:04

you're kind of new to this and you're scared

39:06

of opening a pull request. You're scared that it's

39:08

not perfect code that you've written as well. I

39:11

mean, bear in mind that the code you fixed

39:13

was even less perfect, right? And I

39:15

can say as an open source maintainer, I'm

39:17

always super happy when people contribute anything whether

39:19

it's an issue, a pull request. And

39:21

I think generally people are far

39:24

more happy and helpful and kind than

39:26

you might expect. I

39:29

would say that when it comes to actually writing

39:31

code, people aren't necessarily the same trolls that

39:33

you might find on Twitter, right? Or social

39:35

media in general, right? These are people who

39:37

have a mindset that

39:40

they're thinking about what's being written and they

39:42

care about the actual project and they don't

39:44

care about fighting you on a political front,

39:46

let's say. So if you are

39:48

trying to be helpful, that counts a lot

39:50

more than are you actually helpful in your

39:52

own opinion or anyone else's opinion, right? And

39:55

even if your pull request doesn't get accepted

39:57

or merged in, you will definitely have

39:59

some useful feedback, which might help you

40:01

in your own expertise, your own growth as

40:03

a student or a contributor. And

40:06

I would say, you know, there are definitely times

40:08

where you might rub somebody up the wrong way

40:10

and you're not happy with an interaction. But

40:13

it's such a small percentage of the time that

40:16

it's definitely worth it. Yeah,

40:18

well, I think that's a really

40:20

great encouragement in this

40:22

conversation with. And of course, Chris

40:25

and I as well would encourage you to

40:27

get involved. Even if

40:29

it's something small initially, get plugged

40:31

into a community, start interacting and

40:34

contribute to the ecosystem, because I

40:36

would agree with you, Casper, it

40:38

can be both useful

40:40

for the projects, but also

40:42

very rewarding and beneficial

40:44

for the contributors in terms of

40:46

the community and the things you

40:49

learn and the connections that you

40:51

make, and all of that. So

40:53

yes, very much encourage people to get

40:55

involved. Also encourage people to check out

40:57

the State of Open Source AI book, which

40:59

we'll link in our show notes. So

41:01

make sure you go down and click

41:03

and take a look. It's very easy

41:05

to navigate to and you'll see all

41:08

the categories that we've been talking about

41:10

through the episode. So dig in. And

41:12

if you see things to add, definitely

41:14

contribute them. Appreciate you joining Casper. Yes.

41:16

And thanks for sharing the link. You

41:18

just shared it with me. So book

41:20

dot premai dot io,

41:22

slash state of open source AI

41:25

with dashes. We'll link it in

41:27

the show notes as well so

41:29

people can click easily. But yeah,

41:32

thank you so much for joining Casper. And also thank

41:34

you for your contributions to the book. We're

41:36

really thankful that you've done this. Sure.

41:39

Yeah. Thanks for having me on. Thank

41:50

you for listening to Practical AI. Your

41:53

next step is to subscribe now if

41:55

you haven't already. And if

41:57

you're a longtime listener of the show, help us reach

42:00

more people by sharing Practical AI with your

42:02

friends and colleagues. Thanks once

42:04

again to Fastly and Fly for partnering

42:06

with us to bring you all ChangeLog

42:08

podcasts. Check out what they're up to

42:10

at fastly.com and fly.io. And

42:13

to our beat freak in residence, Breakmaster Cylinder, for continuously

42:15

cranking out the best beats in the biz.

42:17

That's all for now. We'll talk to you

42:20

again next time.
