Episode Transcript
0:07
Welcome to Practical AI. If
0:10
you work in artificial intelligence, aspire
0:12
to, or are curious how AI-related
0:14
technologies are changing the world, this
0:16
is the show for you. Thank
0:19
you to our partners at Fastly for shipping
0:21
all of our pods super fast to wherever
0:24
you listen. Check them out
0:26
at fastly.com. And to our
0:28
friends at Fly, deploy your app servers and
0:30
database close to your users. No
0:32
ops required. Learn more at
0:35
fly.io. Welcome
0:44
to another episode of Practical
0:46
AI. This is Daniel Whitenack.
0:48
I am CEO and founder
0:50
at Prediction Guard. And
0:52
I'm joined as always by my co-host, Chris
0:54
Benson, who is a tech strategist at
0:56
Lockheed Martin. How are you doing,
0:59
Chris? I'm doing good today. How's it going,
1:01
Daniel? Oh, it's going great. I was
1:03
just, well, we were just remarking before
1:05
actually starting the recording that one of
1:07
the great things about doing
1:09
these episodes is that we get the
1:11
excuse to bring on the show the
1:13
coolest open source tooling
1:16
and other projects
1:18
that I'm using day to
1:20
day and get the chance to interact
1:22
with. And one of those is LanceDB.
1:25
And we're really excited today to
1:28
have with us Chang She, who
1:30
is the CEO and co-founder at
1:32
LanceDB. Welcome. Thanks. Hey,
1:34
guys. Super excited to be here. Thanks for
1:37
having me on. Yeah, yeah.
1:39
Well, first off, congrats on
1:41
all your success. I was
1:43
scrolling through LinkedIn and saw
1:45
a video of LanceDB up
1:47
on the NASDAQ screen in
1:50
Times Square. So that was cool to see.
1:52
That must mean good things, I'm assuming. Yeah,
1:55
that was made possible by
1:58
Brex and also SMCC.
2:01
So big thanks goes out to them.
2:03
Cool. Cool. Yeah. Well, I mentioned, um,
2:06
I've had a chance to look through
2:08
some of what you're doing and actually
2:10
use it day to day. Actually, that
2:12
was a result of a previous episode
2:15
that was I think titled, you know,
2:17
vector databases beyond the hype, with Prashanth.
2:19
I think the question that we asked him was
2:22
like, Oh, there's all these vector databases you've
2:24
compared all of them. What
2:26
are some of the things that stand out or
2:28
some of the vector databases that
2:31
stand out in terms of what
2:33
they're doing technically or how they're
2:35
approaching things. And one of them he
2:37
called out was Lance DB. I think
2:39
in particular he was talking about kind
2:42
of on-disk index stuff. And so
2:44
I'm sure we'll get into that in a
2:46
little bit more, but that's how I
2:48
got into it. So I recommend listeners
2:50
maybe go back and get some context
2:53
from that episode. But as we get
2:55
into things, could you maybe
2:57
give us a little bit of a picture
2:59
as to how Lance DB
3:01
came about? I know there's a lot of hyped
3:04
vector database stuff out there and
3:06
people might not sort of
3:08
realize how these things were developed,
3:10
how they came about, what the
3:12
motivation was. And so if you
3:15
could just give us a little bit of a sense
3:17
of that, at least for Lance DB. Yeah,
3:19
absolutely. And first I wanted to also, uh,
3:22
give a big shout out to Prashant as well. As
3:24
you were saying, there's a lot of hype and noise
3:26
in this area. There are a lot of different choices
3:29
and for users and developers
3:31
who are building generative AI
3:34
tooling and applications, it's always kind
3:36
of confusing, like which one is
3:38
good? And should you listen to
3:40
the marketing from one tool
3:42
versus another? So it's
3:45
great to see someone with an engineering background
3:47
who can write so well to actually take
3:49
the time and just try out a ton
3:51
of different tools and interview a bunch of
3:53
different companies and come to his
3:55
own conclusions. I am super happy and
3:57
excited that he's a fan of LanceDB, and we
4:00
hope to make that better for him and also
4:02
all of our users. So, you
4:04
know, back to LanceDB, I think, so
4:07
we started the company two years ago
4:09
at this point, and we
4:11
didn't start out as a vector
4:13
database company, actually, because I
4:16
think if you kind of remember, ChatGPT
4:18
is barely one year old. Yeah.
4:20
The dawn of AI. Yes,
4:22
exactly. And
4:25
so the original motivation was
4:27
actually serving companies building
4:29
computer vision, building new data infrastructure
4:31
for computer vision. So I had been
4:34
working in this space for a long
4:36
time. I've been building data and machine
4:38
learning tooling for about almost two decades
4:40
at this point. I started out my
4:42
career as a financial quant and then
4:45
became involved in Python open source. I
4:47
was one of the original co-authors of the
4:49
pandas library. And that really got me
4:51
sort of excited about open source, about
4:54
Python and building tools for data scientists
4:56
and machine learning engineers. And
4:58
so at the time, this was in 2020 and
5:00
2021, what I observed was at the company
5:05
I was working for, Tubi, the
5:08
streaming company. So we dealt
5:10
with both machine learning problems for
5:13
tabular data and also for unstructured
5:15
data, like images and the video
5:17
assets and things like that. And
5:19
what I had noticed was that
5:21
anytime a project touched this
5:24
multimodal data for AI, from images
5:26
to like the text for, you
5:28
know, let's say subtitles or summaries
5:31
to the poster images, these projects
5:33
always took a lot
5:35
longer, they were much harder to
5:37
maintain, and it was difficult to actually
5:41
put into production. At the same time,
5:43
my co-founder, Lei, whom I
5:45
had met during my days at Cloudera,
5:48
was working at Cruise and sort of dealing
5:50
with the same issues. And so we
5:52
put our heads together and our conclusion was that,
5:54
Hey, it's not the sort
5:57
of top application or workflow layer
5:59
or orchestration layer that's the problem, it's
6:02
the underlying data infrastructure. If you look at
6:04
sort of what's been out there, like, you
6:06
know, Parquet and ORC have been around, and
6:09
they've been great for tabular data,
6:11
but they really, really suck for
6:13
managing unstructured data. And so
6:15
we essentially said, hey, what
6:17
would it take to build a
6:20
single source of truth where we
6:22
can toss in the tabular data
6:24
plus the unstructured data and give
6:27
much better performance at a much
6:29
lower cost, a total cost of
6:31
ownership, an easier foundation
6:33
to build on top of for companies
6:35
dealing with a lot of vision data.
6:38
And so this comes in handy when
6:40
you want to explore your large vision
6:42
data sets for, you know, let's say
6:44
autonomous driving. This comes
6:46
in really handy for things like
6:49
recommender systems and things like that. So
6:51
we started out building out that layer,
6:54
that storage layer in the open source.
6:57
And that took about a year's
6:59
worth of effort to really get
7:01
to a shape that is usable,
7:03
kind of like Parquet or ORC
7:05
and other formats in these tools.
7:08
And that was when Generative
7:10
AI really burst
7:13
onto the scene and became sort of
7:15
a revolutionary technology. And what
7:17
happened at the time was we
7:20
had originally built a vector index for
7:22
our computer vision users to say, hey,
7:25
let's deduplicate a bunch of images, or let's
7:27
find the most relevant samples for training for
7:30
active learning and things like that. And
7:32
it was sort of that open source
7:34
community that discovered, hey, this can
7:36
be really good for Generative AI as
7:39
well. That's when we
7:41
sort of separated out another repo to say,
7:43
hey, this is a vector database. And
7:46
it's much easier to communicate with the
7:48
community than to say, hey, you're looking
7:51
for a vector search, use
7:53
this columnar format. And
7:55
so that's how we got onto this path. It's
8:00
been a couple of moments now since we were
8:02
going through that, but I was just curious: when
8:05
you were talking about kind of going through
8:07
the analysis on the top workflow versus whether
8:09
it was infrastructure and you said y'all concluded
8:11
infrastructure. I was just wondering, you kind of
8:13
went on past that into that, but I
8:16
was kind of wondering how did
8:18
y'all come to that determination? For those of us who are
8:20
not deeply into that thought process, I was wondering where your
8:22
head was at when you were doing that. Yeah,
8:25
it wasn't an easy decision
8:27
or a conclusion. Looking
8:30
back, it was 2022, and it initially
8:33
seemed pretty crazy when we sort of first
8:35
came up on it. If you
8:37
think about it, it's like why would you make a new
8:39
data format like in 2022? Parquet
8:41
has been working so well. I
8:44
think it was really observing the pain
8:47
in our own teams and also we
8:49
went out and interviewed a lot of
8:51
folks managing unstructured data. For
8:53
them, it was one, data
8:56
was split into many different places. The
8:59
metadata might be managed in Parquet and
9:01
then raw assets are just dumped onto local
9:03
hard drives or S3 and then
9:06
you might have other tabular data
9:08
managed in other systems and they
9:10
would always talk about how painful
9:12
it is to stitch everything
9:14
together and manage it all together. Some
9:17
of the outcomes are like it's really
9:19
hard to maintain those data sets in
9:21
production. You have a
9:23
Parquet data set that has the metadata
9:25
and then links to S3 or
9:28
something like that to all the images and then
9:30
somebody moves the S3
9:33
directory or something like that and now all of your
9:35
data sets are broken or something like
9:37
we would interview folks, like, hey, what
9:39
are you doing to explore your
9:41
digital data sets and things like that? They're
9:43
like, well, I use MacBook and
9:45
there's this app on that called Finder and
9:48
if you single click on a folder, it shows you
9:50
a bunch of thumbnails. It's
9:52
sort of this horrible way to actually
9:54
work with your data but it was because
9:56
it was so hard to manage all of
9:58
that, machine learning engineers
10:01
and researchers were stuck with the subpar
10:03
tools. You mentioned kind
10:05
of this transition of thinking from
10:08
some of the original use cases that
10:10
you were talking about with computer vision
10:13
to this world of generative AI
10:15
that we're living in now. From
10:18
my impression, from an outsider's perspective,
10:20
it seems like Lance DB has
10:23
kind of positioned itself
10:25
very well to serve these kinds
10:27
of generative AI use cases, which
10:29
I'm sure we'll talk about in
10:31
a lot more detail later on.
10:33
I'm wondering from your perspective, how
10:36
has that overwhelming demand for the
10:38
generative AI use case kind of
10:41
changed your mindset and direction as a company
10:43
and a project and open
10:45
source tooling and all of that? And
10:48
how do you envision the kind of
10:50
what you're targeting as the use cases
10:52
moving forward, I guess? I
10:54
think certainly generative AI has
10:56
brought in a lot of
10:58
different changes in new thinking.
11:01
One was the sort of
11:04
focus around use cases of semantic
11:07
search and just retrieval in general.
11:09
I think with
11:11
the advent of generative AI, I
11:13
think retrieval becomes much more important
11:16
and then ubiquitous. For
11:18
us, what that means is, you
11:20
know, increased investments in terms of
11:23
getting the index to work really well and
11:25
really scalable. Then sort
11:28
of making that data management piece
11:30
work really well as well, and
11:33
integrating with frameworks for RAG and
11:36
for agents and for just
11:38
generative AI in particular. When
11:40
we started out, inevitably
11:43
we were dealing with multi-terabyte
11:46
to petabyte scale like
11:48
vision data sets and things like that.
11:50
We're still dealing with a lot of
11:52
that. But for generative AI, I think
11:55
there was a renewed focus on
11:57
ease of use because a lot of users are
11:59
coming in who don't have
12:02
years of experience in data engineering
12:04
or machine learning engineering. What
12:07
they're looking for is an easy
12:09
to use and easy to
12:11
install package that doesn't require
12:14
you to be an expert in any
12:17
of these underlying technologies. We
12:19
also spent some effort into, okay, that
12:22
was the motivation behind us making
12:24
LanceDB, the vector database: one, open
12:26
source, and two, embedded. Because
12:29
we felt like there were lots
12:31
of options on the market
12:33
that required you to figure out,
12:35
okay, what is the instance I need?
12:37
How many instances do I need? What
12:39
type of it? Okay, now
12:41
I have to shard the data and blah, blah, blah. Coming
12:45
from that data background, what
12:47
I had been working with a lot is SQLite
12:50
or DuckDB that just
12:52
runs as part of your application code
12:54
and would just talk to files that
12:56
live anywhere. It was super
12:59
easy to install and use. That's
13:02
what gave us that inspiration to
13:04
make an embedded vector database. You
13:07
had just got into this idea of
13:10
embedded databases, which, well,
13:12
embeddings are related, but
13:14
that's another topic. But
13:17
the idea that LanceDB is embedded, you mentioned
13:19
DuckDB and other things that kind of
13:21
opt in and operate in
13:23
the same sort of sphere. I'm
13:26
wondering, for those that
13:28
maybe are trying to
13:30
position LanceDB's vector database
13:33
tooling within a kind of wider
13:36
ecosystem of vector databases
13:38
and plug-ins to other
13:40
databases that support vector search, could
13:42
you explain a little bit about
13:44
what does it mean that LanceDB
13:46
is embedded? What does that mean
13:49
practically for the user? Maybe people
13:51
aren't familiar with that term quite
13:53
as much. And are there
13:55
other kind of general ways
13:57
that you would differentiate
13:59
LanceDB's
14:02
tooling and the database versus
14:04
some other things out there.
14:07
So I love sort of geeking out about these topics.
14:09
So at the very bottom layer in
14:12
terms of technology, I think there's a
14:14
couple of things that fundamentally sets LanceDB
14:16
apart. One, as you
14:18
mentioned, is the fact that it's
14:20
embedded or runs in process. I
14:23
think we are one of two that can
14:25
run in process in Python and we're the
14:27
only one in JavaScript that runs in
14:30
process. Number two is the
14:32
fact that we have a totally new
14:34
storage layer through Lance columnar format. What
14:37
this allows us to do is add
14:39
data management features on top of the
14:41
index. And then number three is
14:43
the fact that the indices,
14:45
the vector indices, and others
14:47
in LanceDB are disk-based
14:50
rather than memory-based, so that it
14:52
allows us to separate compute and
14:54
storage and allows us to scale
14:56
up a lot better. So those
14:58
are kind of the big value
15:00
propositions that these technological choices bring
15:02
to users of LanceDB. So
15:05
number one, ease of use. Number
15:07
two, hyperscalability. Number
15:09
three, cost effectiveness.
15:12
And then number four, the ability to manage
15:14
all of your data together, and not just
15:16
the vectors, but also, if you think about
15:18
it, the metadata and also
15:21
the raw assets, whether they're images,
15:23
text, or videos.
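To make the embedded point concrete, here is a minimal sketch using the lancedb Python package; the path and data are made up, and exact signatures may vary by version, but the shape of the API is: connect to a directory, create a table, and search it, all inside your own process.

```python
import lancedb

# Connect to a local directory -- no server process; the database
# runs in-process and persists to plain files on disk
db = lancedb.connect("./my-lancedb")

# Create a table straight from Python records (pandas DataFrames work too)
table = db.create_table("docs", data=[
    {"vector": [0.1, 0.2], "text": "hello"},
    {"vector": [0.3, 0.4], "text": "world"},
])

# Nearest-neighbor search; results come back as a pandas DataFrame
results = table.search([0.1, 0.2]).limit(1).to_pandas()
print(results)
```

Could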
15:27
you describe a typical
15:29
use case of a developer
15:32
doing this, where you're taking
15:34
those features that are distinguishing
15:36
LanceDB from other possibilities, other
15:38
competition, but just talk about
15:40
what that workflow looks like,
15:42
or if there is a major one
15:45
or a couple, and just get it very
15:47
grounded. So somebody that's listening can understand how
15:50
they're going to do it from A to Z when
15:52
they're integrating LanceDB into their workflow. So
15:54
there's a couple of sort of
15:56
prototypical workflows that we see
15:58
from our users. At
16:00
the smaller scale for LanceDB,
16:03
you're installing it via pip
16:06
or npm or something like that. In
16:08
general, you get some input data that comes
16:10
in as like a pandas data frame or
16:12
maybe a Polars data frame. And
16:15
then you interface with an
16:17
embedding model. You can do that yourself or
16:19
you can actually configure the LanceDB table and
16:21
say, hey, use OpenAI
16:24
embeddings or hey, use these
16:26
Hugging Face embeddings. LanceDB
16:28
can actually take care of all that. So it's
16:30
a pretty quick sort of data
16:33
frame to LanceDB and then you
16:35
can search it and then that
16:37
comes out as data frames
16:39
or Python dicts or things like
16:41
that. That plugs into the rest
16:43
of your workflow, which is likely
16:45
data frame or Pydantic or Python
16:47
dict based. So that's number
16:49
one; a rough sketch of that flow is below.
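A rough sketch of that configured-embedding flow, assuming a recent lancedb embedding registry; the registry name and model here are illustrative stand-ins, not a prescription:

```python
import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector

# Ask LanceDB to handle embedding for us; "sentence-transformers" is one
# registered provider -- OpenAI or Hugging Face models work similarly
embedder = get_registry().get("sentence-transformers").create()

class Doc(LanceModel):
    text: str = embedder.SourceField()                      # embedded automatically on ingest
    vector: Vector(embedder.ndims()) = embedder.VectorField()

db = lancedb.connect("./my-lancedb")
table = db.create_table("docs", schema=Doc)
table.add([{"text": "hello world"}, {"text": "vector databases"}])

# Query text is embedded with the same model before searching
hits = table.search("greeting").limit(1).to_pandas()
```

And then kind of number two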
16:51
is really these large scale use cases
16:53
where some of our users have anywhere
16:56
from like 100 million to multiple
16:59
billions of vectors in one
17:01
table. And that's
17:03
a much bigger production deployment.
17:06
And typically what makes LanceDB stand
17:08
out in that area is, one, it's really
17:10
easy for them to process
17:12
the data using a distributed engine
17:14
like Spark. And they can write
17:16
concurrently and get that done really
17:18
quickly. I think we're one
17:20
of the few that offers GPU acceleration in
17:22
terms of indexing. So even for
17:25
those really large data sets, you can index
17:27
pretty quickly. And then number three
17:29
is because we're able to
17:31
actually separate the compute and storage,
17:34
even at that large vector
17:36
size, you don't really
17:38
need that many query nodes. Like
17:41
you can actually just have one
17:44
or two like fairly average
17:46
and commodity query nodes that run on
17:49
your storage of choice, depending on
17:51
what latency requirements you want. And
17:54
then just have a very simple architecture
17:56
for these types of deployments. The
17:59
query nodes are stateless, and they don't need to talk to each
18:01
other. So when you need to scale up or
18:03
when a node drops out and has to come back in, there's
18:06
no sort of leader election. There's no
18:08
coordination. It really lowers the
18:10
complexity of that whole stack. So
18:12
another great example of this
18:14
kind of architecture and the benefits that
18:16
it brings is Neon, the
18:18
Neon database. So I think Nikita,
18:22
the founder, recently had a
18:24
good Twitter thread about the
18:26
difference between NEON and
18:29
other databases. And
18:32
he called it shared data versus
18:34
shared nothing architecture. And I think
18:36
that's also what we kind of
18:38
strive to deliver in LanceDB versus
18:40
other vector databases.
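As a side note on the GPU-accelerated indexing mentioned a moment ago, here is a hedged sketch of building a disk-based IVF-PQ index with the lancedb Python API; the parameter names reflect recent versions and may differ:

```python
import lancedb

db = lancedb.connect("./my-lancedb")
table = db.open_table("docs")  # hypothetical existing table of embeddings

# Train an IVF-PQ index over the table; with a CUDA GPU available,
# index training can be offloaded to it
table.create_index(
    metric="cosine",       # distance metric used by the index
    num_partitions=2048,   # IVF partitions (coarse clusters)
    num_sub_vectors=96,    # PQ sub-vectors per stored embedding
    accelerator="cuda",    # request GPU-accelerated index building
)
```

Yeah,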
18:42
I know one of the things that I
18:45
really enjoyed in trying out
18:47
a lot of things with LanceDB is
18:49
I can pull up a
18:51
Colab notebook and try out,
18:53
I can import LanceDB. I can import
18:56
a subset of the kind of database
18:58
that I'm going to be working with, or
19:00
the data that I'm working with. It
19:02
all runs fine. I don't have to
19:04
set up some client server type of
19:06
scenario. And then
19:09
when people ask, well, how are you going to push
19:11
this out to a larger
19:13
scale, the appeal of just saying, hey, well,
19:15
we can just throw up this LanceDB
19:18
database on S3 and
19:21
then connect to it. That's a
19:23
very appealing thing for people because
19:25
also those storage layers are available
19:27
everywhere from on prem to
19:30
cloud to whatever sort of scenarios you're
19:32
working with. So it's very, very flexible
19:34
for people. Could you explain a little
19:36
bit? Because this is something like I've
19:39
been asked a couple times. So this
19:41
is my selfish question because I have
19:43
you on the line. So
19:45
you're helping me with my own day
19:48
to day work. But when I'm talking
19:50
to some people, clients that
19:52
I'm working with, I'm like, oh, we can just push this up
19:54
on S3 and then access
19:56
it. Usually their question is something
19:58
like, well, how? Because they have in
20:01
their mind a database has a compute
20:03
node and somehow
20:06
the performance of queries into the
20:08
database is tied to the
20:10
sizing of that compute node and maybe
20:12
how that's sort
20:14
of clustered or
20:16
sharded across the database. And
20:19
then this idea, oh, I'm just going to
20:21
have even just a lambda function that connects
20:23
to S3 and does a query. In
20:27
some ways it like breaks things in people's mind.
20:30
And so a lot of times their question
20:32
was like, how does that work? How can
20:34
a query to this large amount of data
20:36
be efficient when the data is just like
20:38
sitting there in S3 or in
20:41
another place? So could you help me
20:43
with my answer, I guess, is what I'm asking.
20:45
Yeah, absolutely. So this goes back to
20:48
what we talked about earlier with separation
20:50
of compute and storage. If
20:53
you've been sort of steeped in data
20:55
warehousing, data engineering land, this
20:57
has been a big arc of
20:59
data warehouse innovation in the past
21:01
decade by allowing us to
21:03
scale up the storage versus the compute
21:05
separately. This is the thing that
21:07
makes these systems seem magical
21:10
where you can process a
21:12
huge amount of data on what seems
21:14
like pretty commodity
21:16
or pretty weak compute. And
21:19
so the analogy that I like to make with these
21:21
situations is kind of like a
21:23
lot of us are familiar with, let's
21:26
say, like DuckDB demos or videos. And
21:29
you could see instances where
21:31
DuckDB is processing hundreds
21:33
of gigabytes of data on just
21:36
a laptop in a very short amount
21:39
of time. And they are able to spit
21:41
out results almost interactively.
21:44
There are companies from
21:48
like Motherduck to there's
21:50
a new company called Valplan that is
21:52
looking to essentially distribute DuckDB
21:55
queries on AWS Lambdas. It's
21:57
basically the same thing. It's all about the
21:59
separation of compute and storage. And
22:01
that's only possible if you have
22:04
the right underlying data architecture for
22:06
storing vectors and the data itself.
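A minimal sketch of that pattern with the lancedb Python package; the bucket and table names are hypothetical, and the point is only that the same in-process call works against object storage from a laptop, a Lambda-style function, or any stateless worker:

```python
import lancedb

# Point at object storage instead of a local path; the bucket name is hypothetical.
# There is no database server -- the caller is the compute.
db = lancedb.connect("s3://my-bucket/lancedb")
table = db.open_table("docs")

# The disk-based index and only the needed columns are fetched from S3
# on demand, so this worker holds no state and needs no warm-up
query_vector = [0.1, 0.2]  # stand-in for a real embedding
hits = table.search(query_vector).limit(10).to_pandas()
```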
22:09
And just for someone that
22:11
is not a database
22:14
developer, can you describe in
22:16
any words the generalities of
22:18
that data structure that enables
22:20
such a thing? Yeah,
22:22
so it's two things. One is the
22:25
columnar format. So typically, from Gen
22:27
AI to machine learning, you can
22:29
have very wide tables. But typically,
22:31
a single query only needs a
22:34
couple of columns. So columnar format
22:36
allows you to only have to
22:38
fetch and look at a very
22:40
small subset of that data. Number
22:42
two is that columnar
22:44
format needs to be
22:46
paired with an index, like the
22:49
vector index in this particular scenario.
22:51
And that vector index, in order
22:53
to give this separation of compute and
22:55
storage, has to be based on disk.
22:57
So you have to store
22:59
the data on disk, not force the user
23:02
to hold everything into memory, and
23:04
then be able to access that very quickly. And
23:07
then number three is how to connect
23:10
that index with the columnar format.
23:12
So a columnar format like Parquet
23:15
does not give you the ability to
23:17
do fast random access. So even if
23:19
you have that good index, using Parquet,
23:21
you would not be able to get
23:23
interactive performance in terms of queries. And
23:25
it's only by having a
23:28
new columnar format like Lance that can
23:30
give you random access and
23:32
fast scans that you can successfully
23:34
put these two together and deliver these things.
23:37
So those are the three big
23:39
pillars in our data architecture
23:41
that I think make this possible.
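A small sketch of those pillars at the format level, using the lance Python package directly; the path and column names are hypothetical:

```python
import lance

ds = lance.dataset("./embeddings.lance")  # hypothetical local path; object-store URIs also work

# Columnar scans: read only the columns a query actually needs,
# even when the table itself is very wide
vectors = ds.to_table(columns=["vector"])

# Fast random access by row index -- the piece Parquet lacks -- which is
# what lets a disk-based vector index fetch just the matching rows cheaply
rows = ds.take([10, 42, 1000], columns=["text", "vector"])
```

While we were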
23:43
talking here, I'm going through GitHub on your
23:45
repo and stuff, and was
23:47
surprised at something that kind of prompted
23:49
the next question. It looks like you're
23:52
really addressing a wide
23:55
range of different types of needs. And so there's
23:58
obviously Python, as you mentioned, and as one would
24:00
expect, but you have JavaScript. And then
24:02
I was delighted to discover that there's
24:04
a Rust client in there, which,
24:06
when I'm not doing AI-specific
24:08
things, most of the time that's my
24:11
language of choice these days. Could you talk
24:13
a little bit about kind of two things:
24:15
the broader picture, like what you're trying to
24:17
achieve, how you choose what languages to
24:19
support, and how you're getting there. And
24:21
then, if you'll scratch my itch,
24:24
what is your intention with that Rust client?
24:26
Is it ready? What does it do? Just
24:29
because I'm fascinated with that. Sorry. Yeah,
24:31
absolutely. Uh, I love talking about Rust.
24:33
The Rust package is actually
24:36
not a client; the core
24:38
of both the data format and the
24:40
vector database is actually in Rust. So the
24:42
Rust crate that we have is
24:44
actually the database, the embedded database.
24:46
And
24:48
we actually build, for example, the
24:51
JavaScript package the same way.
24:53
It's not just a client, but it's also
24:55
an embedded database in JavaScript that
24:58
is actually based on top of the
25:00
Rust crate, kind of like you
25:02
have in Polars or
25:04
something like that: you have a Rust
25:06
core and then you connect that into JavaScript.
25:10
So we had actually
25:12
started out in 2022 writing in C++
25:14
because Parquet is written in
25:18
C++, you know, like serious
25:21
data people and database people write in C++. Until
25:23
they find Rust, of course. Right.
25:26
And it was sort of a hack
25:28
project during Christmas time, at
25:30
the end of 2022. It was
25:34
a hack
25:36
project for a customer,
25:38
actually, where we had to re-implement
25:41
partially the read path for the Lance
25:43
format. And what we
25:45
found was just, it was
25:47
so good that we decided to
25:49
just actually rewrite everything in Rust.
25:52
I think the biggest things were, one, we
25:54
were a lot more productive. We rewrote
25:57
roughly six months of solid
26:00
C++ development in about three
26:03
weeks with Rust, and
26:05
this was us learning Rust
26:07
as beginners as we went along.
26:09
A lot of that initial
26:12
Rust code has since
26:14
been rewritten over the past year, but
26:16
it just made us feel a lot
26:18
more productive. And then number two is
26:20
the safety that Rust offers has
26:23
been amazing. With C++, it was like
26:25
every release, we didn't have
26:27
a good feeling. It was almost like, you
26:29
know, where's that next segfault going to come
26:31
from? Whereas with Rust,
26:34
you know, we felt very
26:36
confident making multiple releases per
26:38
week with major features, and
26:41
we did not see anywhere
26:44
near the sort of issues that
26:46
we saw with
26:48
C++. So everything's been really great,
26:50
and now Rust
26:52
has become really popular.
26:54
Actually, even with vector databases:
26:56
Qdrant is written in Rust, and Pinecone, they're not open
26:59
source, but they publicly said that
27:01
they've rewritten their whole stack in
27:03
Rust as well. So one more question
27:06
from me along the same line before
27:08
I let it go, because we've hit
27:10
a sweet spot that I love.
27:12
And this is not specific
27:15
to LanceDB, but based on what
27:17
you're saying, clearly you're thinking ahead on
27:19
these things. As we go forward
27:22
and you see both the AI
27:24
applications and the different types
27:26
of workflows and infrastructures, you know, becoming
27:28
broader and more supported, the multi-language
27:31
aspect of getting out of only Python,
27:33
for instance: do you foresee
27:35
that as a convergence? Are you seeing
27:38
language agnosticism developing in the space as
27:40
it has in other areas of computer
27:42
science? Or do you think that we're
27:45
still going to be kind of locked
27:47
in on the current sets of
27:49
infrastructure and tooling, very Python-oriented,
27:51
for the indefinite future? What is your
27:54
thinking along those lines? So
27:56
I think generative AI definitely changes the
27:58
picture, in that I think there's a
28:00
very large TypeScript
28:02
JavaScript community that
28:04
has been brought into
28:06
the arena to build AI
28:09
tools. And so I think
28:12
this is also an underserved
28:14
segment where it's not
28:16
just vector databases, but data tooling in
28:18
general lags far behind
28:21
in JavaScript land, TypeScript land
28:23
versus Python. And I
28:25
think there's a real opportunity for
28:27
the open source community to create
28:30
good tools for this part of the
28:32
community as well. I
28:34
want to hear about some
28:37
of the actual use cases
28:39
that you've seen people implement
28:41
with LanceDB. Maybe if
28:43
there's ones that stand out like, oh,
28:45
this was cool because whatever it was,
28:48
they used it at scale, or it's
28:50
like fits a very typical generative AI
28:52
use case or whatever. And then maybe
28:55
something that surprised you in terms of,
28:57
oh, I didn't expect that. When you
29:01
put a project out into the world, there are
29:01
these things where, oh, I really
29:03
didn't expect people to be using it that
29:06
way. But yeah, that sort of makes
29:08
sense. So can you think of anything
29:10
that fits into one or both of
29:12
those categories? The use cases
29:14
for LanceDB in the community that I see
29:16
fall into three or
29:19
four large buckets. One is of
29:21
course, generative AI, RAG, and things
29:23
like that. And I
29:26
think it's not so much the
29:28
use of LanceDB
29:30
that I think is really cool, but
29:33
it's the applications that people build with
29:35
it that is really cool and amazing.
29:37
And I think a lot of the applications
29:40
that people build that is cool,
29:42
that really takes advantage of LanceDB
29:44
is things where you need
29:47
RAG to be very agile and
29:49
that you need it to be
29:51
really sort of tightly bundled
29:54
with your application, you can sort of
29:56
call this RAG from anywhere
29:58
and have it return pretty quickly
30:00
and without too much complexity. And so
30:03
this is where I see a lot
30:05
of folks from your
30:08
standard chat bots and chat
30:10
with documentation to things like
30:12
productivity tools, where they build
30:15
things that help people organize
30:17
their daily schedules to much
30:19
more high stakes things
30:23
in production and like code generation
30:26
or like healthcare and
30:28
legal and things like that. And
30:30
so there, I think typically you see vector
30:33
dataset sizes from like the tens
30:35
of thousands up to single-digit
30:38
millions of vectors,
30:40
typically. And
30:42
so production means you really scale up both
30:44
the number of datasets that you have
30:46
and then the number of vectors that you
30:49
have. One
30:51
of the cool things that I've seen that
30:53
takes advantage of LanceDB and Lance format
30:55
uniquely is there's
30:58
a code analysis tool that
31:00
sort of analyzes your GitHub
31:02
repository and plugs it into
31:04
a RAG-like
31:06
customer success sort of tool.
31:10
And what they want to be able to do
31:12
is say query the state of the database like
31:14
this today versus yesterday versus a week
31:16
ago to say, hey, was this issue
31:18
fixed or not? And like
31:21
what's still outstanding? And so
31:23
LanceDB uniquely gives you this ability
31:25
to version your table and also do
31:27
time travel. So any
31:29
vector database can do, like, give
31:31
me the top-K most similar things to this
31:34
input, but what LanceDB uniquely gives
31:36
you the ability to do is say,
31:38
give me the top-K most similar
31:40
as of yesterday or as of a week
31:42
ago. And we do that sort
31:45
of automatically for you.
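A hedged sketch of that versioning and time-travel pattern with the lancedb Python API; method names like list_versions and checkout reflect recent versions and may differ, and the table name is made up:

```python
import lancedb

db = lancedb.connect("./my-lancedb")
table = db.open_table("code_index")  # hypothetical table name

# Every write creates a new version of the table automatically
table.add([{"vector": [0.1, 0.2], "issue": "bug #123 fixed"}])

# Inspect the version history, then query "as of" an earlier point in time
versions = table.list_versions()
table.checkout(versions[0]["version"])  # read-only view of the old state
old_hits = table.search([0.1, 0.2]).limit(5).to_pandas()

table.checkout_latest()  # back to the current version
```

Yeah. And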
31:47
I think the other big
31:49
buckets are e-commerce and
31:51
search and recommender engines. This
31:54
is like the traditional use case for
31:56
vector databases. And there you tend
31:56
to see much bigger single data sets
31:58
that are, you know, say, I want to store like
32:02
item embeddings, maybe that's, you know, up to a couple
32:05
of million up to 10 million on
32:07
a store item embedding that could get up to like
32:09
hundreds of millions. And you don't
32:11
have as many tables, but you potentially
32:13
have very large tables, right? And then, of
32:15
course, the last bucket is this like computer
32:18
vision, like AI native computer vision,
32:20
either generative computer vision, or things
32:23
like autonomous vehicles and things like
32:25
that. And there's a whole sort
32:28
of combination of more
32:30
complicated use cases that enables
32:32
active learning, data deduplication, things like
32:35
that. And the thing that
32:37
is very unique about the use case of
32:39
LanceDB there is, companies
32:41
are managing all of their training
32:43
data in LanceDB and the Lance format as
32:46
well. So you can use the vector
32:48
database to find the most interesting samples.
32:50
And then you can actually use the
32:53
tooling on top of the format
32:55
to essentially keep your GPU utilization
32:57
high and keep your GPU fed
33:00
very quickly during training, or if you're fine
33:02
tuning, or, you know, if you're running evals
33:04
and things like that. Yeah, so
33:06
cool. I, one of the things
33:09
that has been most fun for
33:11
me recently is this combination of
33:14
an LLM, Lance DB and
33:16
Duck DB, where like you
33:18
can create these really cool things.
33:20
So if I'm using an
33:23
open LLM that can generate like
33:25
SQL queries or something, but I
33:27
have like all of these different
33:30
SQL tables, like what we're doing
33:32
is like putting descriptions of the
33:34
SQL fields and tables in
33:36
Lance DB and actually on the fly,
33:39
like matching and pulling those to
33:41
generate a prompt, which goes to the LLM
33:43
to generate the SQL code, which is executed
33:45
with Duck DB. And this gives you like
33:48
the kind of really nice natural
33:51
language query to
33:53
your data type of scenario, which has been
33:55
really fun to play with.
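A rough sketch of the workflow Daniel describes; embed and llm_generate are hypothetical stand-ins for whatever embedding model and LLM you use, and the store and table names are made up:

```python
import duckdb
import lancedb

db = lancedb.connect("./schema-store")
schema_tbl = db.open_table("table_descriptions")  # rows hold a description + its embedding

question = "What were total sales by region last quarter?"
q_vec = embed(question)  # hypothetical embedding call

# 1. Retrieve the most relevant table/field descriptions for this question
context = schema_tbl.search(q_vec).limit(5).to_pandas()

# 2. Build a prompt from those descriptions and ask an open LLM for SQL
prompt = ("Schema:\n" + "\n".join(context["description"])
          + f"\n\nQuestion: {question}\nSQL:")
sql = llm_generate(prompt)  # hypothetical LLM call

# 3. Execute the generated SQL with DuckDB
result = duckdb.sql(sql).df()
```

That's really good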
33:57
to hear. Actually, sorry to interrupt, because
34:00
you kind of nerd-sniped me, so I'll get it
34:02
out there. So one of the things that's really
34:04
cool about DuckDB is its
34:06
extension mechanism. So
34:09
I think they've also published
34:11
like an extension framework for
34:13
Rust-based extensions. And so we have sort
34:15
of a basic integration going there. And
34:17
I think in the new year, what you
34:19
can expect from us is actually we're
34:21
going to be spending a little more
34:24
time to make that
34:26
integration be more rich, meaning
34:29
our goal is for you to be
34:31
able to say, to write like a DuckDB
34:33
UDF to do vector search. And
34:36
then the results come back as like
34:38
a DuckDB table where you can then
34:41
run additional queries, like DuckDB queries, on
34:43
top of that. And so,
34:45
and sort of the same thing with
34:48
like Polars, right? So you can, and
34:51
the goal is to essentially make it so
34:54
that like vector database is no longer a
34:56
thing that you even have to think about.
34:59
People are generally more familiar with like DuckDB
35:01
or polars as the sort of that
35:03
tool that just stitches together the workflow. So
35:06
we just want to make that
35:08
feel even smoother and more transparent. A
35:11
couple of moments ago, when you were talking about
35:13
the use cases, you were talking about like autonomous
35:15
vehicles and stuff. And I was wondering if we
35:17
could pull that thread a little bit more. It
35:19
seems like it is a fantastic fit. Chris likes drones.
35:22
Yeah, I love drones. And I love things
35:24
that are not near data centers. I
35:27
love things that are off on
35:29
the edge, whether it be for inference
35:31
or even training, with concerns that you may
35:34
not have all the things that we're
35:36
so spoiled with with our cloud providers
35:39
out there. And it seems like, you know, there's
35:41
many types of opportunities
35:43
to use that. What's your
35:45
thinking around that? Have you seen any use
35:47
cases? Any ideas for the future in
35:50
that kind of autonomous on the edge world?
35:53
Yeah, definitely. So we certainly have, so
35:55
some of our users are like
35:58
robotics or device companies. now
38:00
in this sort of practical AI
38:02
space, because that's where you're
38:04
living, what excites you about
38:07
whatever it is the next six months,
38:09
the next year, and what you think
38:11
is kind of coming as this tooling
38:13
rolls out there further and further, people
38:15
learn to apply it better and better.
38:18
What's exciting for you? That's a
38:20
great question. I think there are
38:22
lots of things that I think hold
38:24
a lot of promise in the next
38:26
six to 12 months. I think
38:29
one we'll see is
38:31
this explosion of
38:34
retrieval, kind of information retrieval tools.
38:36
So we already see a lot
38:38
of companies are adding like generative
38:40
AI in customer success
38:43
management and
38:45
like documentation and things like that. And
38:47
so I think we'll see a lot
38:49
of applications providing value
38:52
that is, you know, that can be
38:54
also personalized and, you know, not just
38:57
like ChatGPT stock answers, but actually
38:59
personalized to their own data or their
39:01
own, you know, cases or things like
39:03
that. And then number two is
39:06
I see a lot of
39:08
successes in very domain
39:10
specific agents that are able to
39:12
dive deep into legal
39:14
or healthcare or some domain
39:16
very specifically and build things
39:18
that seem sort of magical,
39:20
whether it's compliance or
39:23
driving better outcomes or, you know,
39:25
creating things that would democratize
39:28
a lot of these sort of
39:30
like very deep expertise type of
39:33
domains. And then I
39:35
think a little bit further out are generalized,
39:38
like low code and no code tools
39:41
for you to build, you know, very
39:43
sophisticated applications using generative
39:46
AI through code generation and sort
39:48
of creative, let's say creative interfaces
39:50
and things like that. So those are
39:52
things I think we'll deliver in the
39:55
short term. And then, you
39:57
know, personally, like I love
39:59
games, and I'm actually super excited about
40:01
what generative AI brings to gaming. We
40:04
talked about open world and things like that.
40:06
And this is, this can
40:08
be really open where you
40:10
could just get lost for a
40:12
long, long time in a generative world. Awesome.
40:15
Thank you so much for taking time to
40:17
talk with us and please pass
40:19
on my thanks to the Lance DB
40:21
team for making me look good in
40:23
my, in my day job by giving
40:26
me great, great tools that work really
40:28
well. Appreciate what you
40:30
all are doing. And yeah, I'm
40:32
just looking forward to seeing what
40:34
comes over, over the coming
40:36
months. And yeah, encourage our listeners to
40:38
check out the show notes, all
40:41
the links to Lance DB, try it out. It only
40:43
takes a few minutes and hope
40:45
to talk to you again soon. Thanks so much. Thank you,
40:47
Daniel. Thank you, Chris. It was
40:49
super fun talking with you guys. And
40:51
if you have any feedback, please let us know.
40:53
We hope to make you look even better in
40:55
the new year. Thank
41:05
you for listening to Practical AI. Your
41:08
next step is to subscribe now, if
41:10
you haven't already. And if
41:13
you're a long time listener of the show, help
41:15
us reach more people by sharing Practical AI with
41:17
your friends and colleagues. Thanks
41:19
once again to Fastly and Fly for
41:21
partnering with us to bring you all
41:23
Changelog podcasts. Check out what they're
41:25
up to at fastly.com and fly.io. And
41:28
to our beat freak in residence, Breakmaster Cylinder,
41:30
for continuously cranking out the best beats
41:32
in the biz. That's all for now.
41:35
We'll talk to you again next time.