Deep learning in Rust with Burn 🔥

Released Tuesday, 24th October 2023
Episode Transcript

0:08

Welcome to Practical AI.

0:10

If you work with artificial intelligence, aspire

0:14

to, or are curious how

0:16

AI-related technologies are changing

0:18

the world, this is the show for

0:20

you. Thank you to our partners

0:23

for helping us bring you Practical AI each

0:25

and every week. What's up,

0:28

friends?

0:42

There's so much going on in the data

0:45

and machine learning space. It's just

0:47

hard to keep up. Did you know that graph technology

0:49

lets you connect the dots across your data and

0:51

ground your LLM in actual knowledge?

0:54

To learn about this new approach, don't miss Nodes

0:56

on October 26th. At this free online

0:59

conference, developers and data scientists

1:01

from around the world will share how they use graph technology

1:04

for everything from building intelligent apps and

1:06

APIs to enhancing machine learning

1:08

and improving data visualizations. There

1:11

are 90 inspiring talks over 24 hours,

1:13

so no matter where you're at in the world, you can attend

1:16

live sessions. To register for this free conference,

1:18

visit neo4j.com slash

1:20

nodes. That's N-E-O,

1:23

the number four, J dot

1:25

com slash nodes.

1:43

Welcome to another episode

1:45

of Practical AI. This is Daniel

1:47

Whitenack. I am the founder at Prediction

1:50

Guard. I'm joined as always by

1:52

my co-host, Chris Benson, who is

1:54

a tech strategist at Lockheed Martin.

1:57

How are you doing, Chris? I am doing very well today,

1:59

Daniel.

1:59

It is fall weather

2:02

out and I'm enjoying getting outside. It's

2:04

fall, it's raining here today. Yeah,

2:06

it's a little cloudy out, but I'm enjoying it; it's nice weather.

2:09

And so, you know, it's like part of me wants

2:11

to stay inside and do the fun things,

2:13

like especially about what we're going to be talking about today. Right.

2:16

And part of me wants to get outside and enjoy the weather. Well,

2:19

it's that time of year where

2:22

you just want to curl up next to a fireplace

2:25

and burn some firewood. Oh my

2:27

gosh, you took us right there. I'll

2:30

tell you what, before you say that, I'll just

2:32

say this is an exciting episode

2:34

coming up because I think this is

2:36

a little moment where we're going to talk about

2:39

our industry maturing a little bit through

2:41

one effort. And with that said, I'll let

2:43

you go ahead and do the intro. Well, the connection

2:46

to Burn is because

2:48

Burn is a deep learning

2:50

framework that's built in Rust.

2:54

And today we have with us the creator

2:56

of Burn, Nathaniel Simard. Welcome

2:59

Nathaniel.

3:00

Hi, thanks for having me.

3:01

Yeah, well, I admitted to you before

3:04

the episode that I am

3:07

basically uninitiated

3:09

as far as Rust goes. I've

3:13

looked at various articles.

3:16

I think that I've run Rust

3:19

programs just in a sort

3:21

of hello world sort of way. Probably

3:24

my biggest use of Rust has been using

3:27

Rust in the Python linter

3:29

called Ruff, which is

3:31

really great. So that's kind

3:34

of a circular thing. But for those

3:36

others out there in our audience that

3:38

might not be as familiar

3:41

with Rust as a programming language,

3:43

could you just tell us a little bit about

3:45

this sort of like, what is

3:48

Rust and why Rust? Yeah,

3:50

Rust is it. I think it's possibly

3:53

being categorized as a low level programming

3:55

language, probably because of the circular

3:57

reason, but it's a very general programming

3:59

language.

3:59

that can be used for high-level stuff

4:02

as well as low-level stuff. So the

4:05

main reason to use Rust is maybe when

4:07

you need to go through multiple abstraction

4:09

boundaries without having to

4:12

pay for performance.

4:13

So, yes, this is how I define

4:15

it. And I could be

4:18

wrong about this, but I think

4:20

one of the great features along

4:23

with Go having a really great mascot, we've

4:25

got...isn't it a crab? If

4:28

you see crabs or something for Rust,

4:30

isn't that a thing? Yeah, I think it's a cute crab. It's

4:34

a cute crab. That's the mascot,

4:36

yeah. I think

4:38

it's important for a programming language to have that.

4:41

Yes. You have Python for

4:43

the Snakes with the programming

4:45

language we've got. I don't know what this is for Go. It's

4:48

the Gopher, the Go Gopher.

4:50

Yeah, it's quite nice. Yeah, the Gopher.

4:53

It's funny, and you mentioned that Go is

4:55

actually how Daniel and I got to know

4:57

each other. We met in the Go programming

5:00

language community, and we were kind of the

5:02

two data-oriented people at the time. This is going

5:05

way back. There are many, many data-oriented people these

5:07

days, but got to know each other. Subsequent

5:10

to that, I had been hearing about Rust

5:12

for a while, and I got very interested in

5:14

it, not only for...because

5:17

as you pointed out, it's a fantastic general-purpose

5:19

programming language all around, but it also

5:22

does have a lot of really amazing

5:24

low-level features and performance

5:26

capability that attracted me

5:28

to it. So I'm not nearly as

5:30

accomplished in the language as you are, Nathaniel.

5:33

I still love Go, but Rust

5:36

is now another programming language that I have fallen

5:38

in love with. Yeah, I think Go is really well

5:40

suited for web services.

5:42

So we've got a lot of tooling around that. It's really pragmatic

5:44

to use it for that stuff.

5:46

So yeah, Rust is getting there, but

5:48

we've got the whole async story behind

5:50

that. Yep. And for Rust

5:53

itself, you mentioned kind of people have

5:55

the stereotype of Rust as

5:57

a low-level programming language.

5:59

but could you give maybe some

6:02

examples of the types

6:04

of things either you've built in Rust

6:06

over time or that are possibilities

6:08

just to kind of give people a sense of what

6:11

people are doing with the language. Obviously,

6:14

we're going to be talking about deep learning, which

6:16

is thanks to you something that can

6:18

be done with the language. But

6:21

what are some of the other things that are out there that

6:23

people are doing right now with Rust?

6:25

Well, I think it was first created

6:27

as a replacement for C++ to write browser

6:30

engines. So this is maybe why it

6:32

was known as a low-level programming

6:34

language. But now I think it's

6:36

used in game engines. It's also

6:38

used to do web front-end. So you've

6:41

got Leptos and Dioxus,

6:43

which are front-end libraries like React

6:45

and Vue. So this is pretty high-level. We've

6:48

got also command line libraries

6:50

that use metaprogramming,

6:53

so that it's very easy to do your command

6:55

line arguments, all that kind of stuff. So

6:58

yeah, there's tons of things that are built with Rust.

7:01

High-level and low-level. So you

7:03

can mix and match for your own applications. Of

7:05

course, there is like the web services

7:08

with Tokio, the async runtime, Actix,

7:10

a lot of if you want to do

7:12

web services, there is also

7:14

libraries for that. Yeah, these are the projects

7:17

top of mind. It was one of the first

7:19

languages that really embraced WebAssembly and

7:22

got it out there. It's interesting. Coming,

7:25

speaking as kind of a novice in the language

7:27

and coming from most recently Go, there's

7:30

always this debate on Go versus Rust

7:32

that you tend to see in articles out there. And

7:35

I really found room for both of them

7:37

and I go back and forth at this point.

7:40

I will point out, you know, whereas Go

7:42

is one of those languages that has runtimes, that

7:44

kind of manages memory. Rust has a really cool

7:47

feature to it. It's not specific to what we're

7:49

talking about today, but the compiler

7:52

ensures that you don't have memory faults,

7:55

segfaults, which is something like 70% of

7:58

all the bugs in software according to, I believe, Microsoft.

8:01

And so it has a really interesting way

8:03

of approaching ensuring that you

8:05

can produce bug-free software, or

8:07

at least far fewer bugs

8:09

in it. So it's a pretty cool language.

8:12

I'm just curious as we're talking about the language in general

8:15

What's your favorite feature? What are the some of the

8:17

things that made you turn to rust versus some of

8:19

the other languages you may have worked in? This

8:22

is hard to just choose one feature.

8:24

I think it's the whole package. Said like

8:27

a real Rust aficionado there! Yeah,

8:29

but like my favorite feature is not the

8:32

reason why I started writing in Rust.

8:34

But now I think my favorite feature

8:36

is just associated types,

8:38

because they can abstract data types,

8:41

something that is really hard to do with other languages.

8:43

So yeah. And could you explain

8:46

a little bit of like when might

8:48

that be useful or how is that useful

8:51

in terms of when that might come up

8:53

in your programming? Well, it's when you

8:56

need to abstract the type you're

8:58

going to use, but you let the

9:00

implementation decide the types. Normally

9:03

with generics, when you have

9:05

a list, you have to say,

9:07

okay, I want a list of strings; but

9:10

it's when you use the list that you decide the type. Whereas

9:13

an associated type is, okay, I've got

9:16

maybe a list, but I don't know of what; it's

9:18

the implementation that decides what's

9:20

going to be in the list. So sometimes it makes

9:22

sense. For instance, in Burn, we've got

9:25

the backend trait, which

9:27

can have multiple implementations, like

9:29

CPU, GPU.

9:30

And we have associated types for the

9:33

memory, for the tensors, for all those

9:35

things that you can manipulate

9:37

at a high level, but you don't have to

9:39

know which type it is. It's up to

9:41

the implementation to decide.
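To make the associated-type idea concrete, here is a minimal, self-contained sketch. This is a toy illustration of the pattern being described, not Burn's actual API; the `Backend` trait, `Cpu` type, and function names are hypothetical:

```rust
// A toy backend trait: the trait abstracts over a backend, and each
// implementation, not the caller, decides the concrete tensor type.
trait Backend {
    // Associated type: chosen by the implementation.
    type Tensor;

    fn from_vec(data: Vec<f32>) -> Self::Tensor;
    fn sum(t: &Self::Tensor) -> f32;
}

// A hypothetical CPU backend whose tensor is just a Vec<f32>.
struct Cpu;

impl Backend for Cpu {
    type Tensor = Vec<f32>;

    fn from_vec(data: Vec<f32>) -> Self::Tensor {
        data
    }

    fn sum(t: &Self::Tensor) -> f32 {
        t.iter().sum()
    }
}

// Generic code manipulates tensors at a high level without ever
// naming the concrete type behind B::Tensor.
fn total<B: Backend>(data: Vec<f32>) -> f32 {
    let t = B::from_vec(data);
    B::sum(&t)
}

fn main() {
    println!("{}", total::<Cpu>(vec![1.0, 2.0, 3.0])); // prints 6
}
```

A GPU backend would implement the same trait with its own tensor handle type, and `total` would work with it unchanged; that is the flexibility Nathaniel is pointing at.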

9:43

I'm just gonna ask maybe like an

9:45

ignorant question, but I think

9:48

maybe some people out there might

9:50

be wondering it. If I'm

9:52

working in Python, this

9:54

is a language where

9:56

I don't have to compile

9:58

my Python code some of the

10:01

things that we're talking about here with the compiler and

10:03

other things a lot of people don't

10:05

think about, although there's some intersection

10:07

with that. So could you describe like

10:10

when you're writing a Rust program,

10:13

what does that look like in terms of is

10:15

it a statically typed language? You're

10:17

talking a little bit about type there. It

10:20

sounds like you talked about a compiler. So

10:22

am I right in that it's a compiled program

10:24

and then you can run the binary

10:27

on some architecture? What

10:29

is it like to work in Rust

10:31

as compared to something that people

10:34

might be very familiar with like Python

10:36

where a lot of people that are probably listening

10:38

to our episode or have their

10:41

Google Colab notebook pulled up right

10:43

next to them, right? And they're doing all sorts

10:45

of things with a Python interpreter. What

10:48

is the workflow and programming

10:51

like in Rust as far as how

10:53

the language is set up and how you work

10:55

with it?

10:56

Obviously it's a bit different than working

10:58

in a notebook. Like you said, it's

11:00

the strongly typed static programming

11:02

language, similar to like

11:04

C++, Java, all of those

11:07

older languages. So for people who

11:09

come from Python, maybe you're aware

11:11

of the Python type hints that

11:13

you can use, it's a bit like that, but

11:16

you have to use that everywhere in

11:18

all of your functions and definitions. And

11:21

the workflow, something that I

11:23

like is that

11:24

in Rust, I think it's one of the

11:26

only programming languages that does that, is

11:28

that when you write a function, you can just

11:30

write the test below it. So that's kind

11:32

of the way where you can get some

11:34

feedback on what you're actually writing.
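The workflow being described, a unit test living right below the function it exercises, looks like this in Rust. The function itself (`celsius_to_fahrenheit`) is just a stand-in example:

```rust
// In Rust it is idiomatic to keep unit tests in the same file as the
// code they exercise, in a #[cfg(test)] module right below it.
pub fn celsius_to_fahrenheit(c: f64) -> f64 {
    c * 9.0 / 5.0 + 32.0
}

#[cfg(test)]
mod tests {
    use super::*;

    // Run with `cargo test`; the module is compiled only for tests.
    #[test]
    fn converts_boiling_point() {
        assert_eq!(celsius_to_fahrenheit(100.0), 212.0);
    }
}

fn main() {
    println!("{}", celsius_to_fahrenheit(100.0)); // prints 212
}
```

Because the test is committed alongside the function, it keeps documenting and re-checking the behavior on every `cargo test` run, which is the feedback loop being contrasted with one-off scripts.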

11:37

And it encourages good practice, because

11:39

you're writing a test that can be rerun

11:41

all the time. It's not a script that you're

11:43

trying to just run on the side, you can

11:45

actually commit that

11:47

and it describes how the code should run. And

11:49

that's how you get

11:50

interactivity with this. And

11:53

since you have a package manager, which

11:55

is Cargo, it's pretty easy to just execute

11:58

the code you're writing. To follow up on

12:00

that, you know, cargo, the package manager

12:02

is based on a lot of the best practices

12:05

we see in some of the other programming languages. For

12:08

instance, in JavaScript and the node

12:10

community, you have NPM and

12:13

there are several others and the Rust community

12:16

really drew from kind of best practices

12:18

on that. Another thing to kind of follow up

12:20

on the compiler notes that Nathaniel was

12:22

mentioning is that a lot of Rust

12:24

developers kind of see the

12:27

compiler almost as a pair programming

12:29

partner, in a sense, to where instead

12:31

of just hitting compile from time to time, like you

12:33

would in Java or something like that,

12:36

the compiler is so comprehensive that

12:38

it kind of helps you and you kind of use it to write

12:41

the right code and you get to the end of the process

12:43

and know that your code will actually work

12:45

without runtime errors. So it's a different

12:48

way of thinking about being a developer.

12:50

It takes a little bit of a mind shift

12:52

to adjust over to it. This is very different.

12:55

Like in Python, an important skill

12:57

is just to be able to read the stack trace because

12:59

you're going

12:59

to have a lot of exceptions when you run

13:02

your programs and you have to learn

13:04

how to develop your program. This is kind of

13:07

a hard skill you have to develop when you learn Python.

13:09

In Rust, you have to learn

13:12

how to read the compiler errors, but

13:14

they at least try to make them

13:16

as easy as possible. Even

13:18

sometimes you've got the links to the documentation,

13:21

it opens a browser explaining why you have that

13:23

error. It explains the reasons why. So

13:26

this is a different set of skills. And yeah,

13:28

this is quite different from the workflow

13:31

you use

13:32

with Python. Maybe just one more

13:34

question about Rust

13:37

in general before we dive into

13:39

some other things, what is the

13:41

Rust community like in

13:43

terms of whether it be,

13:46

is there active channels where

13:48

the Rust community communicates with one

13:50

another, conferences, meetups,

13:53

what is the Rust community like

13:55

and is it growing? How

13:58

is it changing over time? As you've

14:00

been with the language for some

14:02

time, how has it developed in the time

14:04

that you've been part of the community? I'm

14:07

not sure about all of the community, obviously,

14:09

but I think it's pretty friendly.

14:11

Like there are some Discord channels where

14:14

you can just go and ask your questions

14:16

if you want to. There's an active GitHub

14:18

repository, and the language is open source. If

14:21

you have a problem, just open an issue and

14:23

people maybe are going to help. So

14:25

this is a pretty inviting community, I think.

14:27

This is part of the reason why it succeeds,

14:31

I think, because if you don't answer questions,

14:33

you don't help people use your technology,

14:36

it doesn't really work out. I never went to

14:38

a conference for Rust yet,

14:40

but I know there are many, so maybe

14:42

I'm going to go to some later.

14:45

You know, one of the topics that

14:47

has been kind of a recurring topic between

14:50

Daniel and me over a number of episodes,

14:52

we've been tracking kind of the

14:55

maturity process of

14:57

the AI community and kind of what

14:59

it takes to kind of level up

15:02

and to take it to the next level. And

15:05

on a number of different occasions, we've talked

15:07

about the fact that if you look at other

15:09

communities that have arisen before

15:11

this one, often it takes kind of broad

15:14

support, whereas in the

15:16

kind of the early days, that

15:18

we're really still in, in my view, of

15:21

modern AI, it has been largely

15:23

dominated by a single programming language,

15:25

which most of our listeners are very

15:27

aware of, which is Python, which has

15:29

really been kind of the focus of where

15:31

all the work is. It's where all

15:34

the APIs have been focused on and everything. And

15:36

we've discussed quite a bit about how

15:38

for AI to mature, it needs

15:41

to become more broadly available to

15:43

other languages. And so that as

15:45

you have different types of use cases,

15:48

addressing different business needs,

15:51

and that requires languages other than just

15:53

Python all the time, how do you get to

15:55

AI and what kind of bridging do you need to

15:57

do? That leads me, Nathaniel, to what I wanted

16:00

to ask you: it's clearly

16:02

a need that the community has

16:04

had to be able to start getting Rust

16:06

and other languages in there. I'm curious

16:09

how did you approach this? What was

16:11

it about trying to get Rust

16:13

working as a framework that could

16:15

work with AI tools of the day? How

16:18

did you get into that? What was your motivation?

16:20

What did you see as the need at a

16:22

personal level? Well I started

16:24

working on Burn because

16:26

I was experimenting with, I

16:28

think, a form of neural network, and I wanted

16:31

to make something a bit

16:33

nonstandard, let's say, and I needed

16:36

multi-threading, concurrency and stuff like

16:38

that and it was really hard

16:40

to do with Python and I had a

16:42

software engineering background so I said to myself

16:45

well if it's hard for me to do that

16:47

then maybe it's too hard for any researcher

16:49

to do that so that's why maybe we don't have yet

16:52

an architecture for that kind of stuff so

16:54

I say well let me try and make a framework

16:56

in the language that has support for

16:59

low-level programming and concurrency

17:01

and all those things and yeah

17:03

it's pretty much the description of Rust so

17:06

that's why I started writing a framework

17:08

in this language. And then it just was

17:11

a personal project for a long time I just was

17:13

experimenting with it and yeah it

17:15

grew with time. When

17:17

you first started thinking about Burn

17:20

and these problems that you were

17:22

looking at what was the current

17:24

support

17:25

for doing

17:26

whether it be kind of

17:29

quote-unquote traditional machine

17:31

learning like you know

17:33

random forests SVM whatever that

17:35

is, all the way up to kind

17:37

of deep learning in Rust what

17:40

was kind of the state of things I'm

17:42

looking at your burn repo and

17:44

I see you've at least been submitting

17:46

pull requests since July of 2022 maybe

17:50

I'm sure some of it goes back further

17:52

than that so back to those days

17:54

what did the ecosystem look like in terms

17:57

of its support for these things well

17:59

I don't think there were a lot of deep learning

18:01

frameworks in Rust. So there were

18:03

some experiments, but nothing

18:06

really pragmatic that you can use.

18:08

So

18:09

I think there was a library for

18:12

classical ML, like SVM, random forests,

18:15

in Rust. I never used it. But

18:17

yeah, I don't think it's comparable yet to

18:19

Scikit-learn and Python, which is very

18:22

complete.

18:23

It's interesting because some of the

18:25

sort of early stuff that we were doing in Go,

18:27

it was similar there. There were certain

18:30

packages like for whether it be

18:33

kinds of regression or hypothesis

18:35

testing, statistical

18:38

things, but not really a robust deep

18:40

learning framework. One of my

18:43

questions would be in Go, I

18:46

know one of the struggles with

18:49

trying to support really

18:51

robust deep learning is not

18:54

necessarily the fact that you

18:57

can't create a nice package

18:59

with a good API, but

19:01

a lot of these sort of specialized

19:05

libraries and toolkits like

19:08

CUDA and GPU support

19:10

make things a little bit more difficult.

19:13

So it might not be that, but

19:16

what did you see at

19:19

the time you started working

19:21

on Burn as the big challenges

19:24

on the Rust side? And has that

19:26

been the case as you develop the package or

19:29

have other things become the kind of dominant

19:32

challenges over time? Yeah, all of those

19:34

things are hard to work with like CUDA,

19:37

having your own GPU kernels,

19:39

all the drivers, not necessarily easy to

19:41

install on all platforms. There

19:44

are GPU libraries in Rust

19:46

where you can write kernels. This is

19:48

like wgpu; it's targeting the web.

19:51

But

19:51

when I started working on Burn,

19:53

I acknowledged that it was

19:55

pretty important to be generic

19:58

over the backend, so

19:59

that we can write the

20:02

best backend for the specific

20:04

hardware you're actually targeting. Because

20:06

it's probably always going to be faster

20:08

to write CUDA for Nvidia to

20:11

write low-level C

20:13

or Rust, mainly with SIMD

20:15

support for CPU or to write with

20:18

the

20:18

Metal graphics driver for Mac. So

20:22

I was aware that one backend cannot be

20:24

written for all of them and

20:26

I

20:26

just defined the API

20:29

and I just used the libtorch as

20:31

a backend because there was already bindings

20:33

to libtorch in Rust. So this

20:36

allowed me to iterate on

20:38

the abstraction over the user space

20:40

API and not necessarily

20:43

worry about speed and

20:45

writing all of the kernels, just getting

20:48

the abstractions in place and the software

20:50

architecture in place.

20:51

And it's more pragmatic: it's

20:53

probably as fast as libtorch

20:56

by default and then I can just go

20:59

and write more kernels afterwards, which

21:01

is what we're doing right now. I'm

21:03

curious, do you feel,

21:07

given the low-level capabilities that

21:09

Rust brings to bear that so many other

21:11

languages don't have and that when you're

21:13

looking at whether it be GPUs

21:16

over time, and I know you're talking about using

21:18

libtorch in this case, but do you think

21:20

that as you move forward that

21:23

that low-level capability that you have in

21:25

this language that other languages don't bring to

21:27

bear will be a helpful

21:30

part of developing it and maturing

21:33

burn over the years ahead? Does

21:35

that low-level give you an advantage that

21:37

you might not have with other languages that we're trying

21:39

to integrate in?

21:40

I think so. Mostly in the part

21:43

where we need to handle memory. So

21:45

that's an important part of deep learning frameworks,

21:47

you don't have to waste memory. We can leverage

21:49

all of the type system of Rust to actually

21:52

do

21:53

graph optimizations and all of that kind of

21:55

stuff that we're going to work on

21:58

soon

21:58

and I think it's going to

21:59

be easier to work with Rust to

22:02

do that with good performance than

22:04

it would be with maybe another programming language

22:06

with garbage collection, because Rust has fine

22:09

control over the memory.

22:11

So not necessarily to write GPU

22:13

kernels. When you do that, you're actually

22:15

writing compute shaders. So it's not

22:17

relevant to Python or C++ or

22:19

even Rust. But

22:21

if you want to handle memory and write the

22:23

optimization pipeline,

22:25

then I think Rust can be really useful.
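A small illustration of the deterministic memory control being described: in Rust, a value is freed at a statically known point (when its owner goes out of scope), with no garbage collector deciding when, which a framework can rely on to reclaim tensor memory eagerly. The `TensorBuffer` type here is hypothetical:

```rust
use std::cell::RefCell;

thread_local! {
    // Records the order in which buffers are freed, so the
    // deterministic drop points become observable.
    static FREED: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

// A hypothetical buffer type standing in for a framework's tensor storage.
struct TensorBuffer {
    name: &'static str,
    data: Vec<f32>,
}

impl Drop for TensorBuffer {
    // Runs at a statically known point, not at some future GC pause.
    fn drop(&mut self) {
        FREED.with(|f| f.borrow_mut().push(self.name));
    }
}

fn demo() -> Vec<&'static str> {
    FREED.with(|f| f.borrow_mut().clear());
    let activations = TensorBuffer { name: "activations", data: vec![0.0; 1024] };
    {
        let _scratch = TensorBuffer { name: "scratch", data: vec![0.0; 256] };
    } // `_scratch` is freed exactly here, while `activations` is still alive.
    assert_eq!(activations.data.len(), 1024);
    FREED.with(|f| f.borrow().clone())
}

fn main() {
    // Only "scratch" has been freed by the time the snapshot is taken.
    assert_eq!(demo(), vec!["scratch"]);
    println!("buffers are reclaimed at statically known points");
}
```

Knowing exactly when each buffer dies is what makes memory reuse and graph optimizations easier to build than in a garbage-collected language.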

22:27

Just to get a sense of

22:30

the current state of

22:32

Burn, what is possible

22:34

in terms of support

22:37

and what you can do right now, and what

22:39

are some of the highest

22:42

requested things that you

22:44

would like to work on but aren't

22:46

there yet? I don't know.

22:49

There are so many things that I want to work on. Time

22:53

is limited, so

22:55

it's quite hard.

22:56

What I'm really excited to work on

22:58

is kernel fusion and

23:00

really optimize the compute pipeline

23:03

with lazy evaluation. So that's something

23:05

I'm

23:06

really excited to work on. Could you

23:08

dive into that a little bit and kind of what

23:10

that might mean for a

23:12

user specifically? Yeah, in terms

23:14

of user, it's just going to be faster.

23:17

So this is really like optimization techniques

23:20

that the framework, the deep learning framework, can use.
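As a rough sketch of what lazy evaluation with kernel fusion means (a toy illustration of the idea, not Burn's implementation): operations are recorded rather than executed eagerly, and adjacent element-wise ops are merged into a single pass over the data when the result is finally requested.

```rust
// Toy lazy pipeline: ops are recorded, then "fused" into one traversal.
#[derive(Clone, Copy)]
enum Op {
    AddScalar(f32),
    MulScalar(f32),
}

struct LazyTensor {
    data: Vec<f32>,
    ops: Vec<Op>, // recorded, not yet executed
}

impl LazyTensor {
    fn new(data: Vec<f32>) -> Self {
        Self { data, ops: Vec::new() }
    }

    fn add_scalar(mut self, s: f32) -> Self {
        self.ops.push(Op::AddScalar(s));
        self
    }

    fn mul_scalar(mut self, s: f32) -> Self {
        self.ops.push(Op::MulScalar(s));
        self
    }

    // Evaluation applies the whole chain in ONE pass over the data,
    // instead of allocating an intermediate buffer per op.
    fn eval(self) -> Vec<f32> {
        let LazyTensor { data, ops } = self;
        data.into_iter()
            .map(|mut x| {
                for op in &ops {
                    x = match *op {
                        Op::AddScalar(s) => x + s,
                        Op::MulScalar(s) => x * s,
                    };
                }
                x
            })
            .collect()
    }
}

fn main() {
    let out = LazyTensor::new(vec![1.0, 2.0, 3.0])
        .add_scalar(1.0)
        .mul_scalar(2.0)
        .eval(); // one fused pass computing (x + 1) * 2
    println!("{:?}", out); // [4.0, 6.0, 8.0]
}
```

The user-facing API is unchanged; the only visible effect of the fusion is speed, which matches the point being made.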

23:22

So yeah, there isn't a lot of

23:25

impact in terms of user API

23:28

and usability, but it's just going

23:30

to be faster. Gotcha. And

23:32

would you say that right now in

23:34

terms of what people are doing with

23:36

the package, now you

23:39

mentioned that part of what got

23:41

you into it was building

23:43

kind of experimental models

23:47

or architectures that maybe

23:49

you were experimenting with on the research

23:51

side. I'm wondering with

23:53

this package, what are you seeing

23:57

as the people that are using it? What

23:59

are they

23:59

mostly

24:01

doing with the package? Is it that

24:03

sort of experimental research implementation

24:06

side? Is it taking models

24:08

that aren't maybe experimental and embedding

24:12

them in, you know, REST applications

24:14

where they wouldn't have been able to before? Is it something

24:16

else? What are you seeing in terms of what people

24:18

are doing over and over again? I

24:21

think a lot of people are using it because

24:23

it's easy to deploy on any platform

24:25

because we have different back ends so you can deploy

24:28

on WebAssembly, you can deploy

24:30

on even a device without operating

24:32

systems. So this is pretty great

24:34

in terms of deployment flexibility.

24:37

But even though I started the framework because

24:40

I had like a research idea I wanted to

24:42

do, the goal of Burn isn't

24:44

necessarily to be only for research. I

24:47

wanted to go with kind of a blank sheet

24:50

and thinking about all the constraints and who

24:52

is going to use the framework. So I'm always

24:54

thinking about the

24:55

machine learning engineer perspective,

24:58

the researcher's perspective, and even the

25:00

backend engineer's perspective. So the one that is

25:02

going to write the actual low-level

25:04

kernel code and do the kernels

25:07

and stuff.

25:08

So there's kind of different user profiles

25:10

or use cases that you can assign to

25:12

the framework. Yeah. Kind of as a follow-up

25:14

to that: as I was looking, I noticed

25:17

that you had quite a few people that were making contributions.

25:19

For being a relatively young

25:22

project overall. You

25:24

have a lot of people involved in it. So I

25:26

mean it looks like it's really getting a lot of traction.

25:29

How do you kind of organize the

25:31

work around it and kind of satisfy

25:33

the interests of each of those personas

25:36

along the way? Is there one

25:39

that tends to lead or do you tend to try

25:41

to have certain people that do different ones?

25:43

How do you approach that? To be honest,

25:45

I'm not sure. I think the key is just to

25:47

be reactive. So if there is

25:49

an issue, just go and comment it. If there

25:52

is a bug, try and go fix it.

25:53

And I think the most important

25:56

work I can do is in terms of architecture

25:58

like setting the stones in place,

26:01

but then if I want to extend,

26:03

maybe add more tensor operation,

26:06

if I want to add more neural network

26:08

modules, then I can open issues. And

26:10

people that are interested can just assign

26:13

themselves and actually do a pull

26:15

request. And I just have to be really,

26:17

really conscious about that,

26:19

do code review correctly, be kind.

26:22

And I think that's pretty much it. I don't

26:24

have any other secret.

26:26

So Nathaniel, I've deployed

26:28

a lot of models as part of my day

26:31

job. Let's say that I

26:34

am interested in Rust and

26:37

I am interested to maybe

26:39

take some model

26:42

that I might have experimented

26:44

a little bit with in a Colab

26:47

notebook or something like that. And

26:49

I want to make it, like you said, have

26:51

the support for multiple back ends implemented

26:54

in a maybe more efficient

26:56

application. What would be the

26:58

process that someone would

27:00

have to do to, let's say, get

27:03

one of the kind of popular,

27:06

quote unquote, models these days up

27:08

and running in Rust using Burn?

27:11

Is that something that's possible right now?

27:13

How are people kind of pushing the edges with respect

27:15

to that? Well, I think there are two different

27:17

strategies. So we're actually

27:20

working on being able to import

27:23

an ONNX model.

27:24

So if you have maybe an image

27:26

classification model, then maybe our

27:29

ONNX import is going to work. It's

27:31

still a work in progress. But

27:32

if there is no crash, it's going to work. Not

27:34

all operations are supported.

27:36

But maybe for other models, you

27:38

may need to write the

27:40

model from scratch using our framework

27:43

and then translate the weights and

27:45

you would be fine to deploy it.

27:47

So it's a bit of work, but working with

27:50

Burn is quite intuitive. So

27:52

the API is similar to PyTorch,

27:55

the modeling API at least. So it's

27:57

not that hard, depending on

27:59

the size of the model and the complexity

28:02

of the model. Yeah. And I think I

28:04

saw a few on the repo that

28:06

you people have already sort of done

28:08

this. What are some examples of some

28:11

of these that people have brought over into

28:13

Burn? Yeah, I think there are community

28:15

models for Llama, for

28:18

Stable Diffusion, for Whisper.

28:20

This is thanks to the community. I didn't

28:22

actually write those models myself. But

28:25

yeah, since it's open source, I think

28:27

if you actually do the work to port

28:29

maybe a model, it's great to share it

28:31

with the community. People can start using it.

28:34

So

28:34

yeah, we have a few, but we would like more.

28:37

Yeah. So call out to the listeners out

28:39

there that are Rust people in

28:41

the audience. Check it out and submit

28:44

some of your own model implementations.

28:46

That's a great way to contribute,

28:49

I'm sure. You mentioned

28:51

having a similar API

28:54

to PyTorch. And I'm kind

28:56

of looking through some of the documentation

28:58

here. I'm wondering if you could just comment

29:00

on a few of the things that you call

29:03

out as far as features of Burn and kind

29:05

of explain what you mean by

29:07

some of those things. I think we

29:09

already talked a little bit about the customizable,

29:12

intuitive, user-friendly neural network modules.

29:14

So this kind of familiarity with

29:16

maybe a PyTorch API, maybe there's more

29:18

to that. But you also mentioned

29:21

these comprehensive training tools,

29:24

including metrics, logging, checkpointing.

29:26

Could you describe that a little bit in

29:28

terms of what the thought process

29:31

is in the framework around these things,

29:33

which are definitely important practically,

29:36

as you said, for the machine learning engineer,

29:38

for the actual practical person

29:40

who's trying to build models? Yeah, and even

29:42

the researchers, sometimes they don't want to actually

29:45

write all of the training loop. That's

29:47

not the core of their research.

29:49

Yeah, there is a library, which is called

29:51

burn-train, which tries to

29:54

bring a training loop to the user

29:56

so they don't have to write it. So

29:58

you've got like a basic...

29:59

CLI dashboard where you can

30:02

follow all your metrics. You

30:04

have your logging, so if you want to maybe

30:06

synchronize the logs to maybe

30:09

a Google Drive account, you can probably do

30:11

that. So it's similar really to PyTorch

30:13

Lightning. So for the PyTorch users that

30:16

are familiar with the project, but

30:18

we also have that for Burn. And with that, it's

30:21

just easier to get started with the framework. I

30:23

think it's essential for a framework that's just

30:25

starting out to provide that.
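
[Editor's illustration] For a sense of what burn-train takes off your plate, here is the kind of hand-written loop it replaces, sketched in plain Rust with no Burn dependency. It is a toy one-parameter model; everything here is illustrative and none of it is burn-train's actual API.

```rust
// Plain-Rust sketch of the hand-written training loop that a library like
// burn-train abstracts away: iterate over data, compute a loss, update the
// parameter, and log a metric each epoch. Hypothetical toy example.

fn main() {
    let data = [(1.0f32, 2.0f32), (2.0, 4.0), (3.0, 6.0)]; // samples of y = 2x
    let mut w = 0.0f32; // the single trainable parameter
    let lr = 0.05f32;

    for epoch in 0..100 {
        let mut loss = 0.0f32;
        for (x, y) in data {
            let err = w * x - y; // prediction error
            loss += err * err; // squared-error loss
            w -= lr * 2.0 * err * x; // gradient step for d(err^2)/dw
        }
        if epoch % 20 == 0 {
            // the "metrics and logging" part a training library handles for you
            println!("epoch {epoch}: loss {loss:.6}");
        }
    }
    println!("learned w = {w:.3}"); // converges to the true slope, 2.0
}
```

A Learner-style API wraps exactly this shape: you supply the model and data, it supplies the loop, the checkpointing, and the dashboard.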

30:27

We already talked a little bit about the

30:30

versatile backends. I

30:32

don't know if you want to say any more about the other

30:34

options for that. You mentioned Torch and WebGPU,

30:38

but I see a couple others here mentioned.

30:40

Are there any call-outs that you'd like to make there,

30:42

both in terms of other options, but

30:46

also when those other options might be useful,

30:49

people might not realize in the audience

30:51

when you would want to use a Torch

30:53

backend versus something else?

30:55

Yeah, I think the Torch backend is probably

30:58

the fastest if you have an NVIDIA GPU.

31:00

For the CPU, I'm not sure;

31:02

it depends on the model,

31:04

but we also have an ndarray backend.

31:06

So ndarray is similar to

31:08

NumPy, but for

31:10

Rust.

31:11

This isn't maybe the fastest backend,

31:13

but this is extremely portable, so you can deploy

31:16

the backend everywhere. So if you've

31:18

got a small model, it can be very handy to

31:20

have that, or to write unit tests and

31:22

stuff like that. We also

31:24

have a Candle backend. So Candle is

31:27

also a new framework built by

31:29

Hugging Face in Rust. So

31:31

they're trying to make it easier

31:34

to deploy models with that. So we actually

31:36

have their framework as a backend

31:38

for Burn, so we can benefit from their work.
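
[Editor's illustration] The way Burn keeps all these backends swappable is by making models generic over a backend type. Here is a stripped-down illustration of that design idea in plain Rust; the trait below is hypothetical, not Burn's real `Backend` trait.

```rust
// Stripped-down illustration of the design idea: the model is generic over
// a backend trait, so the same model code can target Torch, ndarray, Candle,
// or WebGPU by swapping one type parameter. Hypothetical trait, not Burn's API.

trait Backend {
    fn name() -> &'static str;
    // Stand-in for real tensor ops; here just an elementwise product.
    fn mul(a: &[f32], b: &[f32]) -> Vec<f32>;
}

struct CpuBackend; // imagine an NdArrayBackend, WgpuBackend, ... alongside it

impl Backend for CpuBackend {
    fn name() -> &'static str {
        "cpu"
    }
    fn mul(a: &[f32], b: &[f32]) -> Vec<f32> {
        a.iter().zip(b).map(|(x, y)| x * y).collect()
    }
}

// The model only talks to the trait, never to a concrete device.
fn run_model<B: Backend>(input: &[f32], weights: &[f32]) -> Vec<f32> {
    println!("running on the {} backend", B::name());
    B::mul(input, weights)
}

fn main() {
    let out = run_model::<CpuBackend>(&[1.0, 2.0], &[3.0, 4.0]);
    println!("{:?}", out); // [3.0, 8.0]
}
```

Switching hardware then touches one type parameter rather than the model code itself, which is what makes wrapping someone else's framework as a backend practical.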

31:41

And yes, we have the WebGPU backend

31:43

as well. So we can target any GPU.

31:46

So if you don't have NVIDIA, don't worry.

31:48

We have you covered. Awesome. So

31:51

I also noticed on your GitHub repo,

31:54

in addition to kind of the familiarizing

31:57

us with kind of the capabilities and features,

32:00

You also have the Burn book, which

32:03

I assume was maybe inspired

32:05

by the Rust book. That seems to be a common

32:08

thing. What is the Burn book and

32:10

how can we best use it? What's it for in your

32:12

mind?

32:13

Yeah, the Burn book is to help people

32:15

getting started with the framework. So it's

32:17

like a big tutorial slash

32:20

reference that you can use to

32:22

actually start using Burn. At the beginning,

32:24

it tells how to install Rust, how

32:26

to get started with the language, how

32:28

to make a basic model, the training

32:31

loop, the data,

32:32

the data pipeline, all of that. So it's just

32:34

with all the explanations and stuff like

32:37

that. So

32:37

it's really to help people getting started

32:40

with the framework in an easy way.

32:42

Of the people that are coming through

32:44

and learning from the Burn book, interacting

32:47

with you on the repo, do you

32:49

see a lot of people coming from

32:52

the non-Rust community

32:54

in because they have either

32:57

performance-related things or maybe

33:00

their company is exploring deploying things

33:02

in Rust, that sort of thing.

33:05

So people coming from maybe the Python community

33:08

or do you see more people kind of Rust

33:10

engineers who are

33:13

already building things in Rust? And so

33:15

now that everybody wants to integrate

33:19

AI into their applications, you

33:21

sort of have the influx from that way. Are

33:23

you seeing both? Which side

33:25

is kind of coming your direction more?

33:28

I'm not sure necessarily about the

33:30

background of users

33:33

of Burn, but I think the main pain

33:35

point is that they want to deploy their

33:37

model reliably, and they're

33:40

coming to Burn to do that. And

33:42

some of them, once they get familiar

33:44

with the framework, they actually port also

33:47

the training part.

33:48

So they can have all of their machine learning

33:50

workflow working with Burn.

33:52

So it can be people with a Python background

33:55

or Rust engineers, I'm not sure,

33:57

but I think this is the main traction

33:59

point.

34:00

I will offer kind of a burn

34:02

newbie perspective on that myself when

34:05

I ran across Burn and reached out

34:07

to you, I was really excited about

34:10

it in part because as

34:12

this industry is maturing and affecting,

34:14

you know, many other vertical industries

34:17

out there We are seeing AI capability

34:20

being pushed out from only

34:22

being, you know, in data centers and stuff

34:25

out onto the edge. And you

34:27

can define the edge in many, many ways, obviously, but

34:30

the place where processing is happening

34:33

and even training is happening is evolving

34:35

over time and if you

34:37

look at businesses and the other

34:40

use cases out there, the fact that they

34:42

need AI in all these other industry

34:45

things that they're doing, all these other businesses. They

34:47

may be, you know, platforms that are mobile,

34:50

such as the autonomous cars we have out

34:52

these days, and you name it, all sorts

34:54

of stuff that are increasingly

34:56

relying on AI. And they're turning, because

34:59

those are autonomous things, to the performance,

35:02

in many cases the safety and low-level performance

35:04

capability, that Rust offers. I know

35:06

that I got super excited when I came across

35:09

Burn, because I'm in this

35:11

AI world, but I'm also in this high-performance,

35:14

things-moving-around-time-and-space world

35:17

as well. And being able to combine

35:19

those into one, have one language that

35:22

is able to do both at the same time and

35:25

deploy out to the edge in a very safe way

35:27

and highly performant way, was hugely

35:29

exciting, and it's been a point of conversation

35:31

that I've had with colleagues for quite some time. So

35:34

I think you've hit a sweet spot

35:36

with Burn that is gonna get,

35:39

probably as people become aware of it,

35:41

you'll get a lot more uptake, because

35:43

it solves what would otherwise be a big

35:46

problem that they're gonna be faced with, you know,

35:48

in the years ahead. Yeah, and I think

35:50

it's not just about that.

35:52

Like, there is a

35:53

good amount of solutions to just deploy

35:56

inference models, like with existing runtimes and stuff

35:59

like that, but

35:59

it's not going to cover the training

36:02

part. And I think it's valuable to

36:04

be able to do training everywhere.

36:06

Like maybe the next generation of models, you're

36:08

going to call backward during inference. We

36:10

don't know that. It's cool to have like one

36:12

tool where you can do both on any platform.

36:16

As you kind of look to

36:18

the future of the project

36:21

itself, I maybe have

36:24

kind of two elements to this question.

36:26

What are some of your hopes

36:29

for what Burn becomes into

36:32

the future as a framework

36:34

in terms of like the sweet spot and

36:36

what it does really well, what people turn to it

36:38

for. So what is your kind of hope and

36:40

vision for the project, I guess. And

36:43

then for yourself in terms of

36:45

your own work and how you're using the

36:48

project or other things, what

36:50

is your hope for the future? You

36:52

have your own interests obviously in terms

36:54

of developing AI related

36:56

applications. So I'd love to hear both

36:59

of those things if you have a comment on them.

37:01

I think I would like burn to be

37:03

widely used for maybe complex

37:06

models. I think Rust really shines when you've

37:08

got complexity. So if

37:10

you've got this convolutional network

37:13

with just a few layers, maybe the benefits

37:15

of using Rust isn't as massive

37:18

maybe for deployment. But if you've

37:20

got big models and a lot

37:22

of complexity in the building blocks,

37:25

then I think Burn will shine in that

37:27

place.

37:28

So I would like to see like

37:30

innovative new deep learning applications

37:32

being built with it as well as maybe

37:35

just the normal deep learning models

37:37

that we're familiar with

37:39

like ResNet, Transformers, all

37:41

of those ones, but deployed to any

37:43

hardware

37:44

so that everybody can maybe run

37:46

some models locally. Maybe not the

37:49

big ones, but at least the small ones. And

37:52

what I would like to do with it is maybe

37:54

more research, like I said previously

37:56

on maybe bigger models, maybe asynchronous

37:59

neural networks,

38:00

trying to leverage the concurrent nature

38:02

of the hardware. Yeah. And

38:05

as we kind of get close to

38:07

an end here, just

38:09

for those, because it is a podcast, people

38:12

are listening in their car and maybe

38:14

taking mental notes of some things

38:16

or on their run. Where do

38:19

people go to find out more

38:21

about Burn? And what would you suggest?

38:24

Let's say it's a newbie to Burn. What

38:27

should they do to get familiar

38:29

with it and try things out? So where do they

38:32

go and what would you suggest they

38:34

start with? I think the best place to

38:36

start is to go to the website. So

38:38

it's just burn.dev. Pretty

38:40

simple. And from there, you can

38:42

just go in the book that we spoke about

38:45

and just follow along. If you

38:48

are not familiar with Rust, we're going to provide

38:50

links so that you can get familiar with

38:52

the language. And then you can come

38:54

back afterwards, follow the

38:57

rest of the book. And if

38:59

you're interested, you can also go to

39:01

the GitHub,

39:02

try the examples. You can run them with

39:04

one command line so you can try to do

39:07

inference or to even launch

39:09

a training on your own laptop.

39:11

So that can be great. So yeah, that would be

39:13

the place I would go to start.

39:16

Awesome. Well thank you so much for

39:18

taking time to join us and

39:21

not burn us. You are very kind. So

39:24

thank you for your time. We're

39:27

really excited about what you're doing and hope

39:29

to have you on the show maybe next

39:31

year or sometime to see all the

39:33

exciting things that are happening in deep

39:35

learning and Rust and Burn.

39:38

So thanks so much, Nathaniel. Thanks a lot, man.

39:40

Thanks to you for having me. If

39:45

you enjoy the music you hear on Practical

39:48

AI, you'll be happy to know we released

39:50

two full-length albums for purchase

39:53

or streaming. Just search for Changelog

39:56

Beats in your music app of choice and check

39:58

them out. Volume zero is called

40:00

Theme Songs, and it includes special

40:02

remixes in addition to the classics. And

40:05

our first volume is called Next Level, featuring

40:08

many of the video game inspired tracks you've

40:10

heard on Changelog podcasts over the years. Check

40:12

us out, Changelog Beats. Thanks once

40:14

again to our partners, Fastly.com,

40:17

Fly.io, and typesense.org. That's

40:20

all for now, but we'll be back with

40:22

more practical AI goodness next

40:24

week.
