Episode Transcript
Transcripts are displayed as originally observed. Some content, including advertisements, may have changed.
2:00
and really looking forward to that. Sounds
2:02
like fun. Yeah, and the
2:05
hackathon is all centered around
2:07
these openly accessible or open
2:09
source or permissively licensed generative
2:11
AI models. I think it's
2:13
really fitting because we have
2:15
with us Casper, who is
2:17
a long time open source
2:19
enthusiast, but also one of
2:21
the contributors to the recently
2:23
published State of Open Source
2:25
AI book from Prem. So
2:27
welcome, Casper. It's great to
2:29
have you with us. Hello,
2:31
yes, yeah, great to be here.
2:34
Yeah, well, I mentioned you're a
2:36
long time open source enthusiast. How
2:38
did you kind of get enthused
2:41
about open source AI specifically?
2:43
So what was your own kind of
2:45
journey into open source AI, maybe kind
2:47
of leading up to this book and
2:49
what it's become? I mean, that's
2:51
a good question. I've been around for long enough that
2:54
AI didn't really exist as a thing back
2:56
when I got into open source. And it was
2:58
honestly just purely a hobby. I never even considered
3:00
it as a career. This
3:02
was, I mean, what, 15 years
3:04
ago or something. And in fact, I felt
3:07
ashamed and embarrassed every time I was working in
3:09
open source because it felt like I should have
3:12
been spending that time working on an actual career,
3:14
right? It felt like it was just a
3:16
toy. I had a very
3:18
long commute between my home and workplace
3:20
on a train, and I was just
3:22
coding away on my phone. I
3:24
actually installed Debian, sideloaded on my Android.
3:27
And yeah, that got me hooked on open
3:29
source purely as a hobby. And I mean,
3:32
if you contribute enough and you're happy making
3:34
mistakes in public, eventually you build something that
3:36
loads of people start using, it spirals
3:39
out of control. Before you know
3:41
it, it suddenly turns into a career. So
3:44
I probably entered into this whole space in
3:46
an unconventional way. I didn't intend to make
3:49
things that would become famous, but they just
3:51
wound up becoming famous, which is quite pleasant.
3:54
I mean, there's pros and cons because also
3:56
things that become successful aren't necessarily things that
3:58
you expect to become successful, right? You
4:00
can put a lot of effort into something
4:03
and the world determines it's not really of
4:05
much value and so they don't use it and
4:07
something you barely put much effort into
4:09
could explode, right? So that
4:12
was my sort of background. I have
4:14
kind of an academic slant as well, so I
4:16
did a lot of machine vision type things in
4:18
university. Didn't really want to
4:20
shoehorn myself into any particular
4:22
area though and also I didn't want
4:25
to do pure academia, right? I much
4:27
prefer industry and having stakeholders
4:29
and actual products that you build at the
4:31
end of the day and I mean, there's
4:33
pros and cons definitely to those. But yeah,
4:35
so that's obviously how I wound up like
4:38
the rest of the industrial world seemingly
4:40
moving towards AI because that's
4:43
a buzzword and that's what everyone wants you to work
4:45
on effectively. So yeah, what started
4:47
off as initially being machine vision, pre-machine
4:50
learning, became machine-learning-type machine vision
4:52
type stuff and now of course LLMs
4:54
are all the rage. So
4:56
that's why we thought of doing a
4:58
bit of extra research and try and
5:00
consolidate all of the noise out there
5:02
and various different blog posts, people
5:04
effectively shouting to the ether and we thought we
5:06
might as well write a book and
5:09
release some of our research in the wild, get some
5:11
feedback on that before we actually start
5:13
building more things. Yeah, that's
5:15
awesome and you even allude to this
5:18
in the sort of intro to the
5:20
book, this sort of fast paced nature
5:22
of the field and a lot of
5:24
people feeling sort of FOMO like how
5:27
do I even categorize all of the things
5:29
that are happening in open
5:31
source AI? So maybe one
5:35
kind of general question about the structure
5:37
of this. Chris and I have
5:39
worked through some of these categories in various
5:41
episodes on the podcast, but sometimes it is
5:43
hard to sort of think about like
5:46
how do you categorize all the things that
5:48
are happening in open source
5:50
AI because they do go beyond
5:52
just models, but they include models
5:54
and a lot of things are
5:56
sort of interconnected. So how did
5:58
you kind of... was it
6:01
organic in how the structure of this book
6:03
came together or how did you come up
6:05
with the major categories in your mind for
6:07
what's going on in open source AI? And
6:09
that's what I was really wondering as well.
6:12
You literally said, Daniel, exactly what was
6:14
in my head just now.
6:16
Yeah, we're in tune. Yeah,
6:19
no, I mean, it is a big ask because
6:21
I mean, my philosophy in general is that the
6:23
universe exists as a cohesive whole. And you know,
6:26
we split it up into different subjects like physics
6:28
and chemistry and math just as a
6:30
way for humans to actually parse everything
6:32
that exists into these small, bite-sized chunks.
6:34
But they're not really independent subjects, right?
6:36
And the same goes with AI. I
6:39
mean, there's so many different categories of
6:41
AI. So I mean, the
6:43
nice thing about working in the open source space is that
6:45
there's lots of different people you can have conversations with, get
6:47
some feedback. Everyone kind of chipped
6:49
in their own ideas about how to, let's say,
6:52
break down a book into different chapters. Ultimately,
6:54
I think what made the most sense
6:56
is that it doesn't matter too much
6:58
what those chapter titles are. It's more
7:00
about the content within them
7:03
being, let's say, not too repetitive, and actually,
7:06
you know, distilling the ideas that people are
7:08
talking about. And if you can do that
7:10
really well, it maybe
7:12
almost doesn't matter quite how you self-categorize
7:14
things. But I would say Filippo is
7:16
probably the one who came
7:18
up with the actual final, let's say,
7:20
10 chapters. But then, past that,
7:23
in terms of, you know, actually writing those
7:25
chapters, probably about a dozen people have actually worked
7:27
on them, which is, again, really nice that you
7:29
can do this in the open source space, like
7:31
no, no single person is really the author of
7:33
this book. It seems fairly
7:35
obvious to me, based on my own
7:37
particular passion and research that licensing should
7:39
definitely be a chapter. And that's something
7:42
that developers often neglect, because it's just sort of
7:44
outside their field of interest and expertise. And it's
7:47
just a bit of red tape that maybe they
7:49
have to be aware of in the back of
7:51
their mind. But yeah, so that I
7:53
mean, I basically wrote a chapter on licenses,
7:55
which I think everyone else was happy about.
7:57
Nobody else wanted to do it. But sure.
8:00
I mean, it was just effectively topics that
8:02
we felt are big major things that there's
8:04
a lot of confusion over. Maybe we ourselves
8:06
were confused about it as well. So evaluation
8:08
and data sets, what's the best way to
8:10
evaluate a model anyway? So that
8:12
seemed like a big topic. Let's make that a
8:14
chapter. So it seemed fairly organic coming up with
8:16
these titles. And of course, as
8:19
we were writing this, again, it was all
8:21
fully open source in the whole writing process.
8:23
We thought maybe we should split up a
8:25
chapter. So we split up models into two
8:27
chapters, let's say: one specifically for unaligned models
8:30
versus aligned models. So it was
8:32
an iterative process. Yeah. On
8:35
that front, I definitely hear the passion coming
8:37
through for that licensing element of
8:40
that. And I see that upfront in the
8:42
book. And maybe, so I'm also
8:44
very, very much like we've mentioned on
8:47
the podcast multiple times that people need
8:49
to be reviewing these things, especially as
8:51
they see whatever 400,000 models on
8:56
Hugging Face and kind of parse through
8:58
these things. Could you kind
9:00
of give us maybe the pitch
9:02
for engineering teams or tech
9:04
teams that are considering open
9:07
models, but might not
9:09
be aware of the kind of various
9:12
flavors of openness that are occurring
9:14
within kind of quote, open source
9:16
AI? Could you just give us
9:18
a little bit of a sense
9:20
of maybe why people should care
9:22
about that and maybe just at
9:24
a high level, what are some
9:26
of these kind of major flavors
9:28
that you see going on in
9:30
terms of openness and access? Right.
9:33
Yeah. I mean, I suppose first
9:35
I should have a disclaimer, which is the quiet part
9:37
that nobody usually says, which is almost
9:40
a counter argument. It might not
9:42
matter because in practice,
9:45
nobody is going to sue you if you do something
9:47
illegal, unless you're fairly big
9:49
and famous. Right. And that's just a
9:51
harsh truth. And it's very frustrating that,
9:53
you know, laws and enforcement tend to
9:56
be two separate things. There
9:58
is a precedent in law that you're not meant
10:00
to create a law unless you know definitely you can enforce
10:02
it. So to a large extent,
10:04
a lot of these licenses out there are
10:07
questionable in that regard. The
10:09
other thing is a lot of these licenses are not actually, let's
10:12
say tested in court, they're not
10:14
actually formally approved by any government
10:16
or legal process. So it's
10:18
not necessarily legal just to write something in
10:20
a license. You should probably be
10:22
aware of recent developments in the EU, for example,
10:25
that proposed the two new
10:27
laws, the CRA and PLA, two new acts, I should
10:29
say, that are effectively saying the
10:31
no warranty clause in all of these open
10:34
source licenses might be illegal if you are
10:36
in any way benefiting, let's say monetarily, even
10:38
if it's indirectly. So you're a company releasing
10:40
open source things purely for advertising purposes, but
10:42
you're not directly gaining any money from it.
10:45
they're still going to ignore the no-warranty
10:47
clause. So yeah, there's interesting stuff
10:49
in that space. But I would say as a
10:51
developer, the things that you should be aware of
10:53
when it comes to model openness is that there's
10:56
a difference between weights, training data
10:58
and output. Those are
11:00
the three main categories, really. So licenses
11:03
usually make a distinction with,
11:05
it's not licenses, it's more about
11:07
the source. So are the model
11:10
weights available? That's often the only thing that
11:12
developers care about in the first instance, because
11:14
that means they can download things and just
11:16
play with it. But if
11:18
you actually care about explainability or in any way
11:20
alignment in order to figure out how you might
11:22
be able to make a model aligned or unaligned
11:24
or whatever you want to do with it, you
11:27
probably do need to know a bit about the
11:29
training data. So is the training data at least
11:31
described, if not available? And when I say described,
11:33
as in more than just a couple of sentences
11:35
saying how the data was obtained, but actual
11:37
full references and things. So a lot of
11:39
models are not actually open when it comes
11:42
to the training data. And then of
11:44
course, the final thing is the licensing around the outputs
11:46
of the model. Do you really own it? Are you
11:48
allowed to use it for commercial purposes? And
11:50
even if you are, it's highly dependent
11:53
on the training data itself, right? Because
11:55
if the training data is not permissively
11:57
licensed, then technically you shouldn't really have
12:00
much permission to use the output either,
12:02
right? So I think even
12:05
developers are kind of confused about
12:07
the ethics around the permissions. So
12:09
certainly legally we're super confused as
12:11
well. I have two questions
12:13
for you as follow up, but they're unrelated,
12:15
but I'm going to go ahead and throw both
12:18
of them out. Number one, the quick one
12:20
I think is, could you define what an
12:22
aligned model versus an unaligned model is just to
12:24
compare those two for those who haven't heard
12:26
those phrases? And then I'll go ahead
12:28
just as you finish that and say, and
12:31
what's the reason that I notice, you know,
12:33
licensing is addressed at the very top of
12:35
the book? And is that framing the
12:37
way you would look at the rest of the book
12:39
or is that more just happenstance that it came
12:41
there? I was just wondering how that fits into the
12:44
larger story you're telling. Yes. So
12:46
for those who don't know, unaligned
12:48
models, it's effectively, if
12:50
you train a model on a bunch
12:52
of data, it is by default considered
12:54
unaligned. But in the interest of safety,
12:57
what most of the famous models that you've
12:59
heard of do, like ChatGPT, for example,
13:02
is add safeguards to ensure
13:04
that the model doesn't really
13:06
output sensitive topics, issues, anything
13:09
illegal. It's still probably capable
13:11
of outputting something quite bad,
13:13
but there are safeguards. And
13:15
the process of adding safeguards to
13:17
a model is called aligning a
13:19
model as in aligning with good
13:21
ethics. I suppose that's what's implicit.
13:24
Gotcha. Thank you very much. And
13:26
then I was just wondering, like I said,
13:28
the positioning of licensing at the front, is
13:30
that relevant? Or is that just happenstance?
13:33
We did sort of think of an order
13:35
of chapters, let's say, and licensing just seemed
13:37
like a good introduction, let's say, because it's
13:39
before you get into the meat and the
13:42
details of actual implementations and where you can
13:44
download things and where the research is going,
13:46
let's say. Well, Casper, as
13:48
you were kind of, you were just
13:50
describing the kind of framing of the
13:52
book, and also some of these concerns
13:54
around licensing, I'm wondering if we could
13:56
kind of take a little bit of
13:58
a step back as well
14:01
and think about, like, what
14:03
are some of the main kind of
14:05
components of the open source AI ecosystem?
14:07
The book kind of details all of
14:09
these, but what are some of like
14:11
the big major components of
14:14
the AI ecosystem, maybe beyond
14:16
models? Cause people obviously
14:18
have maybe thought about or heard of
14:21
generative AI models or LLMs or
14:23
text to image models, but there's
14:26
a lot sort of around the
14:28
periphery of those models that make
14:30
AI applications work or be able
14:33
to run in a company
14:35
or in your application or whatever you're
14:37
building. So could you describe maybe a
14:39
few of these things that
14:41
are either orbiting around the models, if you
14:43
view it that way, or part of this
14:46
ecosystem of open source AI? Sure,
14:48
I mean, there's huge issues I would
14:50
say regarding, I'd say performance
14:53
per watt, effectively
14:55
electrical watt. There's a lot of
14:57
development in the hardware space and
15:01
we have new Mac M1 and M2s, which
15:03
might actually mean you can fairly easily do
15:05
some fine tuning and/or at least inference
15:08
on a humble laptop without ever needing CUDA.
15:11
It seems like there's a lot of shifts
15:13
and paradigm changes when it comes to the
15:15
actual engineering implementations. WebGPU
15:17
is a big upcoming thing, which I mean, it
15:20
has technically been going on for a decade or
15:22
more, but it might actually have reached a point
15:24
where possibly we can just write
15:26
code once and it just works in all
15:28
operating systems on your phone. You can get
15:30
an LLM just working wherever. But
15:33
yes, I mean, there's effectively a lot
15:35
of MLOps style problems. It's one thing
15:37
to have a theory of how to
15:39
actually create an LLM, but quite another
15:41
thing to actually train a thing, fine
15:43
tune it or deploy it in a
15:45
real world application. So there are a
15:47
lot of competing, let's say, software development
15:49
toolkits, desktop applications. And I
15:51
don't think anyone's really settled on one
15:54
that's conclusively better than anything
15:56
else. And really, based on
15:58
your individual use case, you have to do
16:00
an awful lot of market research just to find
16:02
something that suits your use case. I
16:05
ask this because we've had a
16:07
number of discussions on the show about
16:09
sort of training, fine
16:12
tuning, and then this sort
16:14
of prompt or retrieval based
16:17
methodologies. So from your
16:19
perspective as someone that's kind of taken
16:21
survey of the open source AI ecosystem
16:24
and is operating within it
16:26
and building things, what is
16:28
your kind of vision for where
16:31
things are kind of headed
16:33
in terms of more
16:35
sort of fine tunes getting easier
16:37
and fine tunes being everywhere or kind
16:40
of pre-trained models getting better and people
16:42
just sort of implementing fancy prompting
16:44
or retrieval based methods on top of
16:47
those? Do you have any opinion on
16:49
that sort of development? I know
16:51
it's something that's on people's mind because
16:53
they're maybe thinking about, oh, this
16:55
is harder to fine tune, but is it
16:58
worth it because I'm getting maybe not ideal
17:00
results with my prompting. Yeah, no, it makes
17:02
sense. I would say basically if you're
17:04
not doing some form of fine tuning, you're
17:06
not producing anything of commercial value. Effectively,
17:09
it's very much like hiring an intelligent
17:11
human being to work for you without
17:14
them having any particular expertise and
17:17
not even knowing what your company does. That's
17:19
what a pre-trained model is effectively. So
17:21
you do need to fine tune these
17:23
things or have some amount of equivalent,
17:25
anything else that's equivalent to fine tuning,
17:28
let's say. In terms of things
17:30
that actually predate LLMs, I think there's a lot of
17:32
stuff that is very useful and even
17:34
maybe far more explainable, but people seem to
17:37
be discounting it just because it's
17:39
easy to get some result out of an
17:41
LLM just by prompting it. So people view
17:43
it as good enough and they start using
17:46
it even though it's maybe not safe, right?
17:48
So one thing I would
17:50
really recommend people look at is
17:52
embeddings. Just by doing a simple
17:55
vector comparison in your embeddings, you
17:57
can find related documents. You don't need
18:00
an LLM to drive that, because, effectively,
18:02
instead of explicitly making an embedding
18:04
of your query, you know, converting your query
18:07
into a vector and then comparing it
18:09
to other vectors in your database that correspond
18:11
to, let's say, documents or paragraphs that
18:13
you're trying to search through, your LLM is
18:15
automatically doing that entire process. And it might
18:17
make mistakes while it does that, right?
18:19
It's going to paraphrase things, which it might
18:22
get wrong because it can't even do
18:24
simple basic mathematics, it doesn't understand logic, right?
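Casper's suggestion here, comparing embedding vectors directly rather than routing everything through an LLM, can be sketched in a few lines. This is a minimal illustration: the document names and vectors below are entirely made up, and real embeddings would come from an embedding model with hundreds of dimensions, but the cosine-similarity ranking is the core of the idea.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical document embeddings (toy 3-d vectors for illustration only).
documents = {
    "invoice_help.md": [0.9, 0.1, 0.0],
    "api_reference.md": [0.1, 0.8, 0.3],
    "onboarding.md": [0.2, 0.2, 0.9],
}

# Hypothetical embedding of a query like "how do I pay an invoice?"
query = [0.85, 0.15, 0.05]

# Rank documents by similarity to the query -- no LLM involved.
ranked = sorted(documents,
                key=lambda name: cosine_similarity(query, documents[name]),
                reverse=True)
print(ranked)
```

A vector database does essentially this comparison at scale; the search itself needs no generative model at all.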
18:27
So, yeah, whenever it comes to
18:29
things like, let's say, medical imaging, where there's
18:31
a lot of interest in how can
18:33
we use AI to improve this, people
18:35
tend to get frustrated with how slow
18:37
the uptake of AI is. But there's
18:39
a reason for that, which is explainability
18:41
is important, right? So, the way I
18:43
see things going is, yes, far more
18:45
fine tuning, more retrieval-augmented generation type,
18:47
aka RAG, stuff, and then also
18:50
probably a push into explainability.
18:52
I don't really think
18:54
there's much explainability in LLMs right now
18:56
in general. Everyone's been
18:58
so focused on LLMs, but large
19:00
vision models are kind of one of the
19:02
newer things on the rise. What is your
19:04
take on large vision models in the future
19:07
and how they start integrating in?
19:09
I was just, Andrew and Guy are talking
19:11
about some of them now and I
19:13
would love your take on it. Sure. I mean,
19:15
we didn't quite get to covering this in
19:17
the book. I mean, that's how fast-paced things
19:19
are. So, multimodal things are
19:22
super interesting. To me, my
19:24
feeling is that it's effectively gluing together
19:26
existing models into pipelines and
19:28
it hasn't been historically something that I
19:30
was that interested in, because that's more
19:33
an application and it's not so much
19:35
something you need to research per se.
19:37
It's very similar to how the
19:39
OpenAI people were very surprised that ChatGPT
19:41
exploded in popularity, even though technically the technology is
19:43
quite old. It's just, you know, you lower the
19:46
entry barrier a little bit and then everyone actually
19:48
starts using it because they can, right? So,
19:50
to me, the multimodal type stuff is similar.
19:53
It could result in really
19:55
innovative new companies popping up and new
19:57
solutions that are actually usable by the
19:59
general public, but in terms of the
20:01
underlying technology, it doesn't seem that particularly novel
20:03
to me. As you
20:05
kind of looked at the landscape
20:08
of models itself and the licensing
20:10
of those models, the support for
20:12
those models and underlying MLOps sort
20:14
of infrastructure, the support for an
20:17
underlying kind of like model optimization,
20:20
you know, toolkits and that sort of
20:22
thing. Some people out there
20:24
might hear all of these words like,
20:26
oh, there's these Llama 2 models and
20:29
there's now Mistral and then there's, you
20:31
know, now Yi and like all of these. As
20:35
you were going through and researching the book
20:38
and also kind of doing
20:40
that as an open source community,
20:42
can you orient people at all
20:44
in terms of the kind of
20:47
major model families? So you already
20:49
distinguish between sort of aligned models and
20:51
unaligned models. Is there any
20:53
kind of categories within the models that you
20:55
looked at that you think it would be good
20:57
for people to have in their mind in
20:59
terms of, hey, I have this application or I
21:02
have this idea for working on this. I'm
21:05
listening to Casper. I want to maybe fine
21:07
tune a model. I've got some cool data
21:09
that I can work with. Where
21:11
might be a sort of well
21:13
supported or reasonable place for people
21:15
to start in terms
21:17
of open LLMs or open
21:20
text to image models if you also want
21:22
to mention those? Sure. I
21:24
mean, because there's just a new model
21:26
basically being proposed every day, I mean,
21:28
often it's a small incremental improvement over
21:30
a previous model. So in
21:32
terms of actually trying to compare them
21:34
from a theoretical level without looking at their results,
21:37
there isn't really much to talk about in terms
21:39
of, you know, large model families. It might be
21:41
an extra type of layer that has been
21:43
added to a model in order to give it
21:46
a new name, let's say. Nothing
21:48
particularly stands out there. I mean, we do have
21:50
a chapter on models where we try and address
21:52
some of the more popular models over
21:55
time, the proprietary ones and then the
21:57
open source ones. But... I
22:00
would say nothing particularly stood out to me
22:02
over there. I suppose the more interesting thing
22:04
in terms of actually implementing
22:06
something for your own particular use case
22:08
is starting with a base
22:10
model that has pretty good performance on presumably
22:13
other people's data that looks as close as
22:15
possible to the data that you actually personally
22:17
care about. So you don't have to wait
22:19
too long when then fine tuning it on
22:21
your own data. So for that, I think
22:24
the most important thing is to take a
22:26
look at the most up-to-date leaderboards. And
22:28
there are quite a few different leaderboards out there. We
22:31
do also have a chapter on that. And that was, interestingly,
22:34
also a nightmare to keep up to
22:37
date because the leaderboards themselves are also
22:39
changing regularly. New leaderboards are being proposed
22:41
for different things. And take
22:44
a look at a leaderboard, pick the best model performing
22:46
there, and then start doing some fine tuning.
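Casper's suggested workflow (check an up-to-date leaderboard, pick the best performer, then fine-tune on your own data) boils down to a filter-and-sort. Here is a minimal sketch, assuming entirely made-up model names and scores; real leaderboards such as the Hugging Face Open LLM Leaderboard expose comparable per-benchmark numbers, and the license filter ties back to the licensing chapter.

```python
# Hypothetical leaderboard rows -- in practice you would read these off a
# live leaderboard rather than hard-coding them.
leaderboard = [
    {"model": "model-a-7b", "avg_score": 62.1, "license_permissive": True},
    {"model": "model-b-13b", "avg_score": 66.8, "license_permissive": False},
    {"model": "model-c-7b", "avg_score": 64.3, "license_permissive": True},
]

def pick_base_model(rows, require_permissive=True):
    """Return the name of the highest-scoring model, optionally keeping
    only permissively licensed ones."""
    candidates = [r for r in rows
                  if r["license_permissive"] or not require_permissive]
    best = max(candidates, key=lambda r: r["avg_score"])
    return best["model"]

print(pick_base_model(leaderboard))  # best permissively licensed model
```

The same shape works for any leaderboard: filter by your constraints (license, size, modality), then rank by the benchmark closest to your own data.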
22:48
That would be my MO. This
22:50
kind of gets to one of the
22:53
natural questions that might come up with
22:55
a book on this topic,
22:58
which is things are evolving so
23:00
quickly. And you mentioned
23:02
the strategy with this book being
23:04
to have the book be open
23:06
source, have multiple contributors. And
23:09
I'm assuming part of that is also with
23:12
a goal for it to be
23:14
updated over time and be an
23:16
active resource. How have you
23:18
seen that start to work
23:20
out in practice? And
23:23
what is your hope for that sort of community
23:25
around the book or contributors around the book to
23:27
look like going into the future? Sure,
23:30
yeah. I mean, for the evaluation and
23:32
data sets thing, we already have more
23:34
than a dozen leaderboards, just the names
23:36
of the leaderboards and links to them,
23:38
and then what benchmarks they actually implicitly
23:40
include. We have
23:42
comments at the bottom of each chapter,
23:44
which are driven by GitHub effectively, powered
23:46
by Utterances, which is this integration tool
23:49
helper. So you don't
23:51
need to maintain a separate comments platform,
23:53
let's say, and also encourages people to
23:55
open issues, open pull requests. If
23:58
we've made any mistake or something like that, architecture,
26:00
the things that we're building. So effectively,
26:03
our strategy was to first do a lot of
26:05
research. We didn't mind publishing this for the general
26:07
public to have a look at. So we released
26:09
it in a book. And now
26:11
we're working on actually reading our own book and maybe
26:14
taking some of its advice and building things. And
26:16
we have this very much fast
26:18
paced startup style, let's build lots
26:21
of different things, try lots of different experiments.
26:23
It's fine if we throw things away. This
26:42
is a changelog news break. One
26:45
year after ChatGPT brought a
26:47
seismic shift in the entire landscape
26:49
of AI, a group of researchers
26:51
set out to test claims that
26:53
its open source rivals had achieved
26:55
parity or even better on certain
26:58
tasks. In the linked
27:00
paper, they provide an exhaustive overview
27:02
of the success surveying all tasks
27:04
where an open source LLM has
27:07
claimed to be on par or
27:09
better than ChatGPT. Their conclusion,
27:11
quote, in this survey, we deliver
27:14
a systematic review on high performing
27:16
open source LLMs that surpass or
27:18
catch up with ChatGPT in
27:21
various task domains. In addition, we
27:23
provide insights, analysis and potential issues
27:25
of open source LLMs. We believe
27:27
that this survey sheds light on
27:30
promising directions of open source LLMs
27:32
and will serve to inspire further
27:34
research and development helping to close
27:36
the gap with their paid counterparts.
27:38
End quote. It's becoming increasingly
27:41
clear to me that the data
27:43
models powering future AI rollouts will
27:45
be commoditized and democratized, thanks to
27:47
the competitive nature and hard work
27:50
of both academia and industry. What
27:52
a relief. You just
27:54
heard one of our five top
27:56
stories from Monday's changelog news. Subscribe
27:58
to the podcast to get
28:01
all of the week's top stories
28:03
and pop your email address in
28:05
at changelog.com/news to also receive our
28:07
free companion email with even more
28:09
developer news worth your attention. Once
28:12
again, that's changelog.com/news.
28:19
So, Casper, I want to actually do a quick
28:21
follow-up of something you were just saying as we
28:24
were going into the break, and that was, you were
28:26
talking about, you know, now we're going to start
28:28
going through the book ourselves and taking the advice.
28:31
And that brings up kind of a business-oriented
28:33
question I want to ask about it. And
28:35
so, you go out today, you've
28:37
listened to the podcast, downloaded the book, and
28:40
there's so much great information in all
28:42
of these chapters, and the comparisons, and,
28:44
you know, the different options
28:46
that each chapter addresses, good or
28:48
bad, and things like that. If someone
28:50
is just getting going, or maybe they're
28:52
starting a new project, and they're using
28:55
your book as a primary
28:57
source to kind of help them
28:59
make their initial evaluations, how
29:01
best to use that book? Because there's a lot
29:03
of material in here in terms of, you know, all
29:05
these different categories: they need to come up with
29:08
their pipelines, and, you know, go back to the
29:10
leaderboards and select the models and
29:12
the architectures, and all that.
29:15
If you were looking at this initially
29:17
with a new set of eyes, but also
29:19
having the insight of being one of the
29:22
authors and editors of this, how
29:24
would you recommend to somebody that they
29:26
best be productive as quickly
29:28
as possible and get all their questions
29:31
sorted? How would they go about that
29:33
process? Right, I mean, that's not really
29:35
a question I was thinking of addressing
29:37
with, you know, writing a book. So
29:39
I suppose what you're referring to is a
29:41
case where someone has
29:43
a particular problem that
29:45
they want to solve. Sure. And
29:48
an actual, let's say, business model
29:50
or target audience. So,
29:52
I mean, if there's actually something that you're trying to
29:54
solve, the book hasn't really been written from that perspective.
29:56
It's more for a student who
29:58
kind of wants to learn about everything,
30:00
right? Or a
30:03
practitioner who just hasn't kept up to
30:05
date with the latest advancements in the
30:07
last year. So the intention is that
30:09
you can skim through the entire book,
30:11
really. You're not necessarily meant to
30:13
know in advance which specific
30:16
chapters might hold or spur
30:18
an innovation or an idea that you
30:20
can actually implement to help you. In
30:22
terms of that, I mean, what probably
30:24
might be more useful is looking through
30:26
a couple of blog posts that actually
30:28
take you from zero to, here's an
30:30
example application that, for
30:32
example, will download a YouTube
30:35
video, automatically detect the speech,
30:38
do some speech-to-text recognition type things, and then give you
30:40
a prompt and you can type in a question and
30:42
it will answer it based on that video. We do,
30:44
in fact, have a few blogs giving you these kind
30:46
of examples, right? And I think that
30:49
would probably be more useful if you're actually
30:51
trying to build a product to find existing
30:53
write-ups of people who have built similar things
30:55
and just follow that as a tutorial, right?
30:57
The book is more just to get an
30:59
overview of what's happened in the last year
31:01
in terms of the recent cutting-edge state-of-the-art, right?
31:04
Yeah, and I think that's a good call-out.
31:06
And I think one of the ways I'm
31:08
viewing this is like I am having a
31:11
lot of those conversations as a practitioner with
31:13
our clients about, you know, how
31:15
are we going to solve this problem? And something might
31:17
come up like, oh, now we're
31:19
talking about a vector database. How does
31:21
that fit into like the whole ecosystem
31:23
of what we're talking about here and
31:26
why did we start talking about this?
31:28
I think that the way that you
31:30
formatted things here and laid them out
31:32
actually really helps put some of these
31:34
things in context for people
31:36
within the whole of what is
31:38
open source AI, which is really
31:40
helpful. So I just mentioned vector
31:43
databases, which we have talked about quite a
31:45
bit on the show and is something
31:47
that, of course, is an important piece
31:50
of a lot of workflows. But there's
31:52
one thing on the list of chapters
31:54
here that maybe we haven't talked about
31:56
as much on this show, and that's
31:58
desktop apps. We've talked
32:01
a lot about whether it be
32:03
like that orchestration or software development
32:05
toolkit layer, like you're talking about
32:07
LangChain and LlamaIndex and
32:09
other things or the models or
32:11
the MLOps or the vector database.
32:13
But I don't think we've talked
32:15
that much about sort of desktop
32:17
apps, quote unquote, associated with this
32:19
ecosystem of open source AI. Could
32:21
you give us a little bit of
32:23
framing of that topic? Like what is
32:26
meant by desktop app here and maybe
32:28
highlighting a couple of those things that
32:30
people could have in their mind as part
32:32
of the ecosystem? Sure,
32:34
I mean, I should probably quickly say about
32:37
vector databases. I don't quite understand why there's
32:39
so much hype over it. To me,
32:41
embeddings are actually the important thing. The database
32:43
that you happen to store your embeddings in
32:45
is almost like a minor implementation detail. Unless
32:47
you're really dealing with huge amounts of data,
32:49
it shouldn't really matter which database you pick,
32:51
right? Sure, valid point. I don't know if
32:53
you have a different opinion there though. No,
32:57
I think it's not necessarily
33:01
one or the other. In
33:03
my opinion, there are use cases for both, but
33:06
not everyone should assume that they fit
33:09
in one of those use cases; they should still
33:11
figure out what's relevant for their own problem.
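The point that the database is almost an implementation detail can be illustrated with a minimal sketch: at small scale, a plain Python list plus cosine similarity already behaves like a "vector database". The `embed` function here is a toy, hash-based stand-in for a real embedding model, purely for illustration.

```python
import hashlib
import math

def embed(text: str, dim: int = 8) -> list[float]:
    # Toy stand-in for a real embedding model: derive a deterministic
    # pseudo-random vector per word from a hash, then sum and normalize.
    vec = [0.0] * dim
    for word in text.lower().split():
        h = hashlib.sha256(word.encode()).digest()
        for i in range(dim):
            vec[i] += h[i] / 255.0 - 0.5
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Both vectors are unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

# The "vector database": just a list of (text, embedding) pairs.
store = []
for doc in ["cats are small pets", "the stock market fell", "dogs are loyal pets"]:
    store.append((doc, embed(doc)))

def search(query: str, k: int = 1):
    # Rank stored documents by similarity to the query embedding.
    q = embed(query)
    return sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)[:k]
```

Swapping the list for a dedicated vector database only changes where `store` lives and how `search` is indexed; the embeddings themselves do the semantic work.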
33:14
But yeah, in the desktop space, I think maybe
33:17
there aren't that many developers who talk about
33:19
it because it's almost
33:21
front-end type applications as
33:23
opposed to getting stuck into the details
33:26
of implementing, fine-tuning, and all that stuff
33:28
tends to mean more "back-end", let's say,
33:30
in inverted commas. So I
33:32
think that might be one of the reasons why
33:35
there aren't that many desktop applications being produced because
33:37
you kind of need both, both front-end and back-end,
33:39
and that maybe naturally lends
33:41
itself to more the sort
33:44
of resources that only a
33:46
closed-source company might be willing to
33:48
dedicate. So maybe that
33:51
just might be why there's not so much in the
33:53
open-source space. Just takes a lot of
33:55
development effort. But yeah, there are a
33:57
few that we do mention in the book. There's
33:59
LM Studio, GPT4All, and KoboldCpp. All of
34:01
them are still very new
34:03
because I mean the thing that they're
34:05
effectively giving you a user interface for
34:07
is itself very new. Yeah,
34:10
I mean there are some common design
34:12
principles that are maybe being
34:14
settled on. You know, you do
34:16
expect a prompt if you're dealing
34:18
with language models. You do expect
34:20
a certain amount of configuration
34:22
if you're dealing with images,
34:24
like how many, what the
34:26
dimensions are, and some basic pre-processing that
34:29
has nothing to do with artificial
34:31
intelligence but you might still expect to see this
34:33
sort of thing in one place rather than having
34:35
to switch between a separate
34:37
image editor and your pipeline. Things
34:40
that I'm kind of interested in are improving
34:42
the usability, or the end-user pleasure,
34:44
let's say, of using the desktop apps
34:47
far more. Can you sort of
34:49
graphically connect these pipelines together like some
34:51
sort of a node editor so you
34:53
can drag and drop models around and
34:55
connect
34:57
their inputs and outputs to each other so that you
34:59
can have a nice visual representation of your
35:02
entire pipeline. But yeah, I'd like to
35:04
see what happens in that space. To some
35:06
extent, I think Prem itself is probably interested
35:09
in developing a desktop app itself. As
35:12
you've gone through the process of putting the book
35:14
together and I think one of the things that
35:16
in any project that folks do is kind of
35:18
like when to go ahead and put it out
35:20
there. There's a point where you have to kind
35:22
of put a pin in it and say that's
35:25
it for right now. But our
35:27
brains never stop working obviously on
35:29
these problems. To that effect,
35:31
you get the book out there. Is
35:33
there anything, and you have conversations like this one that we're having
35:35
right now, where we're talking about it and you're like, well, it
35:38
wasn't meant for that, but it was meant for this. Is
35:40
there anything in your head that you're starting to think,
35:42
well, maybe that should have been a topic or
35:45
something we should have put in
35:48
the book, maybe next time, with this landscape
35:50
evolving so fast? Where has your
35:52
post-publishing brain been at on this collection
35:54
of topics? We definitely have
35:56
yet another 10 more chapters
35:59
planned. So there's definitely going
36:01
to be a second edition of this book, or
36:04
maybe I should say a second volume. It's not even a
36:06
second edition; it's not corrections or that kind of thing,
36:08
it's ten whole new chapters. Yes, literally
36:10
v2 that's going to include a
36:12
lot of interesting stuff about things
36:15
that happened in the last half of 2023 and
36:18
hopefully will be developed in '24 as
36:20
well. Among the things that people
36:22
are talking about, I mean, we already talked
36:24
about vector databases a little bit, and maybe
36:26
you're like, you don't see the hype there.
36:29
What are some things in the ecosystem
36:31
that you're really, really excited about?
36:34
And then, is
36:36
there anything else that
36:38
you're like, ah, people are talking about this
36:40
a lot, but I don't really
36:42
see it going anywhere? Any hot
36:45
takes? I mean, I probably
36:47
already covered some of these things, right?
36:49
What I'm super interested in is fine-tuning
36:51
and lowering entry barriers. Things
36:54
that I'm not all that convinced by are pretending
36:57
that AI is AGI. They're not the same,
36:59
I'm sorry, and I don't see it. And
37:02
I don't trust these models to be more
37:05
intelligent right now than, at best,
37:07
a well-trained secretary. They're
37:10
considerably faster, so, you know, there are
37:12
applications where being able to churn
37:14
through a lot of text really quickly is actually
37:16
of value, in which case, yes, great, apply one
37:18
of these things. But apart from that, I
37:20
don't really buy the hype. Yeah,
37:23
that's fair I think and as we kind
37:25
of get closer to
37:27
an end here. I'm wondering maybe
37:29
there are some in
37:32
our listener base that don't have
37:34
the kind of history in open
37:37
source that you do, and
37:39
of course there's contributions to this
37:41
book that would be relevant. But
37:43
there's also contributions within this whole
37:46
ecosystem of open source AI, whether it's
37:48
in toolkits, or it's
37:50
in the desktop apps, or it's in the
37:53
in the actual models or data
37:55
sets, or evaluation techniques themselves.
37:58
For those out there that maybe are
38:00
newer to open source, do
38:02
you have any recommendations or
38:04
suggestions in terms of more
38:07
people getting involved in open source
38:09
AI? Obviously the book
38:11
is a piece of that because it's
38:14
open source and people could contribute to
38:16
that. But maybe more broadly, do you
38:18
have any encouragement for people out there
38:20
in terms of ways to get started
38:23
in contributing to open source AI rather
38:25
than just consuming? Sure, yeah,
38:27
no, I would say that basically every
38:29
time you consume, you are 90%
38:31
of the way there to
38:34
contributing back as well. So you
38:36
have probably cloned a repository somewhere
38:38
in order to run some code,
38:40
right? You probably encountered some issues
38:43
and a lot of those issues probably are
38:45
genuine bugs because these are fast moving things,
38:47
people just write some code without necessarily doing
38:49
full proper robust testing. We don't have time
38:51
to do robust testing, right? A lot of
38:53
the time they're just throwaway, experiment-type
38:55
things, so we're in make-and-break mode. Yeah,
38:58
so if you find an issue rather than quietly
39:00
fixing it yourself, feel free to open
39:02
a pull request and maybe you're not new, but
39:04
you're kind of new to this and you're scared
39:06
of opening a pull request. You're scared that it's
39:08
not perfect code that you've written as well. I
39:11
mean, bear in mind that the code you fixed
39:13
was even less perfect, right? And I
39:15
can say as an open source maintainer, I'm
39:17
always super happy when people contribute anything whether
39:19
it's an issue, a pull request. And
39:21
I think generally people are far
39:24
more happy and helpful and kind than
39:26
you might expect. I
39:29
would say that when it comes to actually writing
39:31
code, people aren't necessarily the same trolls that
39:33
you might find on Twitter, right? Or social
39:35
media in general, right? These are people who
39:37
have a mindset that
39:40
they're thinking about what's being written and they
39:42
care about the actual project and they don't
39:44
care about fighting you on a political front,
39:46
let's say. So if you are
39:48
trying to be helpful, that counts a lot
39:50
more than are you actually helpful in your
39:52
own opinion or anyone else's opinion, right? And
39:55
even if your pull request doesn't get accepted
39:57
or merged in, you will definitely have
39:59
some useful feedback; it might help you
40:01
in your own expertise, your own growth as
40:03
a student or a contributor. And
40:06
I would say, you know, there are definitely times
40:08
where you might rub somebody up the wrong way
40:10
and you're not happy with an interaction. But
40:13
it's such a small percentage of the time that
40:16
it's definitely worth it. Yeah,
40:18
well, I think that's a really
40:20
great encouragement to close this
40:22
conversation with. And of course, Chris
40:25
and I as well would encourage you to
40:27
get involved. Even if
40:29
it's something small initially, get plugged
40:31
into a community, start interacting and
40:34
contribute to the ecosystem, because I
40:36
would agree with you, Casper, it
40:38
can be both useful
40:40
for the projects, but also
40:42
very rewarding and beneficial
40:44
for the contributors in terms of
40:46
the community and the things you
40:49
learn and the connections that you
40:51
make, and all of that. So
40:53
yes, very much encourage people to get
40:55
involved. Also encourage people to check out
40:57
the open source book, which
40:59
we'll link in our show notes. So
41:01
make sure you go down and click
41:03
and take a look. It's very easy
41:05
to navigate to and you'll see all
41:08
the categories that we've been talking about
41:10
through the episode. So dig in. And
41:12
if you see things to add, definitely
41:14
contribute them. Appreciate you joining, Casper. Yes.
41:16
And thanks for sharing the link. You
41:18
just shared it with me. So that's
41:20
book.premai.io
41:22
slash state-of-open-source-ai,
41:25
with dashes. We'll link it in
41:27
the show notes as well so
41:29
people can click easily. But yeah,
41:32
thank you so much for joining, Casper. And also thank
41:34
you for your contributions to the book. We're
41:36
really thankful that you've done this. Sure.
41:39
Yeah. Thanks for having me on. Thank
41:50
you for listening to Practical AI. Your
41:53
next step is to subscribe now if
41:55
you haven't already. And if
41:57
you're a longtime listener of the show, help us reach
42:00
more people by sharing Practical AI with your
42:02
friends and colleagues. Thanks once
42:04
again to Fastly and Fly for partnering
42:06
with us to bring you all ChangeLog
42:08
podcasts. Check out what they're up to
42:10
at fastly.com and fly.io. And
42:13
to our beat freak in residence, Breakmaster Cylinder, for continuously
42:15
cranking out the best beats in the biz.
42:17
That's all for now. We'll talk to you
42:20
again next time.