LLM Security and Privacy

Released Wednesday, 27th March 2024

Episode Transcript
0:00
Cloudcast Media presents, from the Massive Studios in Raleigh, North Carolina: this is the Cloudcast with Aaron Delp and Brian Gracely, bringing you the best of cloud computing from around the world.

0:14
Good morning, good evening, Brian. Welcome back to the Cloudcast. We're coming to you live from our Massive Cloudcast Studios here in Raleigh, North Carolina. And we're going to jump right into our topic for this week, LLM security and privacy. Now, this is the first in a few interviews we have coming up talking about PII, or personally identifiable information, and we're going to be talking about it in the context of AI. There are a few different ways to really tackle this topic, and it is certainly top of mind for most organizations that I speak to. So we hope you enjoy this conversation, and we'll get started right after this quick break.

0:48
Are you getting pressure from finance to justify or reduce your cloud bill? CloudZero is the only cloud cost platform loved by engineers and trusted by finance. CloudZero can identify unused, idle, or over-provisioned resources, alert you to spend anomalies, and organize 100% of your spend into a framework that mirrors your business structure, like cost per customer, product feature, or team. It's the most powerful platform ever built to provide accurate visibility into your total cloud spend without the typical pitfalls of legacy cloud cost management tools, like endless tagging or clunky Kubernetes support. Manage cost, optimize development, and maximize profit, all in one platform. Join companies like Rapid7, Drift, and SeatGeek by visiting cloudzero.com/cloudcast to get started. That's cloudzero.com/cloudcast. Visit today to experience immediate, ongoing savings on your cloud bill.

1:43
Are you ready for the ultimate coding challenge? Stand a chance to win a Tesla Cybertruck or a hundred thousand dollars. All you have to do is build an app with a front end and back end and deploy it on WSO2's Choreo, an internal developer platform. The more you do with Choreo, the more chances you have to win. For all the details, go to choreo.dev/cybertruck. Sign up, get started, and possibly win a Tesla Cybertruck or $100,000. Plus, 10 more winners get MacBook Pros. But hurry, because the challenge ends on April 30th. Good luck!

2:19
And we're back. And our topic for today is security and privacy of LLMs. Something we've kind of hinted at here and there previously, but we wanted to dedicate a show to it. So today we have Sean Falconer. And Sean, you are head of marketing and developer relations at Skyflow, but also a fellow podcaster, podcast host for both Partially Redacted as well as Software Engineering Daily. You've taken the reins over there. So first of all, welcome to the show. And give everyone a quick introduction, if you don't mind.

2:57
Yeah, thank you so much. Yeah, so as Aaron said, my name is Sean Falconer. I lead marketing and developer relations at Skyflow, but don't let the title sway you. My background is engineering. I studied computer science for a decade, have a PhD in computer science and a postdoc in bioinformatics. I was actually going to be a professor and/or professional researcher for a while, and was kind of on that path. And then I ended up raising money and starting a company, and left the world of academics to be CTO and founder of a company that I ran for about seven years. I sold that company and joined Google, where I was an engineer and team lead for a number of years before joining Skyflow. And I've been here for the last two and a half years, having an awesome time trying to make the startup go again.

3:43
Fantastic. And so let's dig into LLM security and privacy. So we see a lot of concern here on the podcast, and I talk to customers in my day job. We've touched on it again in various past shows, as I've mentioned, but we've never really dug in deep. So let's hopefully dig into this specifically. First, let's frame the problem. What are we talking about at the end of the day when we talk about LLM security and privacy? It sounds like a vague, high-level term. Give us a little more on that, Sean.

4:14
Yeah, so I mean, I think there are several things that raise concern, and I see security and privacy as two related but slightly different things. If you think about security, a lot of times the goal is: let's put up a brick wall and not allow anybody through it. But for privacy, sometimes you have to essentially punch holes in that brick wall in order to let a certain amount of information through. Because it's one thing to secure data and throw away the key, but that's not super useful to businesses, and we don't store data just to lock it up and never use it. So we need ways of making it useful while not exposing too much of it. And that's really the balance that we've had in security and privacy for 20 years, and especially in the last half decade or so with the introduction of GDPR and all the privacy regulations around the world. But things get way more complicated when we start to move from things like databases and structured data, rows and columns, things that we understand to some degree. There are challenges there, but we understand that, okay, if I need to delete information or I need to find it, I've got to find the row and the column with that information. That doesn't exist in the world of AI models, especially when we're talking about deep learning and neural networks and a lot of the things that are powering large language models today. The big problem from a privacy perspective is there's simply no practical delete button for an LLM. So as soon as I leak customer data, proprietary information, or employee information into a model, through training, through inference, or some other process, there's no real way to get that back. And that becomes a big problem when we live in the world of GDPR and the right to be forgotten, data subject access requests, data residency, all these different regulations that we need to try to navigate. How do you actually stay compliant and make sure that only the right people have access to the information, when they should have access, in the world of LLMs? That's a very, very difficult problem to solve.

6:15
Yeah, yeah. And let me even take that one step further, because there is the whole concept of the data, say the private data, and you of course don't want it to get out there. But at the same time, a lot of times what customers and a lot of enterprises are doing is they're taking, say, a broad-based, generic LLM off the shelf, and then they're going to fine-tune it, or they're going to do RAG against it, or they're going to in some way take their customer data and make that the differentiator. And so you have this weird balance then, I see, of: hey, if you mask everything, well, then it becomes undifferentiated. So how do you handle this concept of masking the data, but also still being able to potentially differentiate?

7:02
Yeah, so I think there are a couple of different things. When we think about masking, the idea there is, how do we show some limited amount of information? So if for some reason we needed to train an LLM on, I don't know, our employees, and social security numbers were part of that data set, then clearly we wouldn't want you to be able to pull up my social security number as part of a prompt or something like that. So how do we make sure that, you know, Aaron only has access to see certain types of information? A lot of it comes down to not only masking information, but how do you govern access in a way that essentially controls who sees what, when, and where. And I think this is some of the downside, the limited view, of some of the approaches that we've taken in the space so far around what are called private LLMs: private LLMs have value, but they don't really control that governance piece, essentially. But going back to your question in terms of masking information, there are ways of de-identifying certain types of information and still making it useful for training. Because if I think about someone's name, even a social security number, these types of information, these identifiers that are potentially sensitive, the LLM doesn't really care that it's my name or my social security number or my credit card number or my address that's part of the training data. It just needs a representation of that data, because eventually it's just going to become vectorized data, numbers in space, anyway. If I can automatically detect my name as part of the training data and replace it with a de-identified form of the data that's consistently generated, like every time it sees Sean Falconer, it's replacing it with both the entity, recognizing that it's a name, plus some random value like ABC123, then training could still occur as expected, because you have contextual values there. The LLM can recognize that it's a name being used; it doesn't really matter that it's actually the raw value. That essentially allows us to keep a gateway, a privacy gateway is what we refer to it as at Skyflow, around the LLM to prevent PII going into the model, preventing that problem of, once it's shared, we can't really get it back. Then we're only ever sharing de-identified data. And then we can even use things like governance on the inference process, so that we can control who sees what, when, and where. That way, if a response has de-identified values in it, I can check to see who the person who put the prompt in was and what policies are in place to allow them to see this information. If they're not able to see that information, then we can keep it redacted, so that you can't pull up my social security number, essentially.
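To make the consistent de-identification idea concrete, here is a minimal sketch: a detector finds sensitive values and replaces each with a deterministic, entity-typed token, so every occurrence of the same value maps to the same stand-in and the surrounding context stays learnable. The regexes stand in for real entity recognition, and the key and token format are illustrative assumptions, not Skyflow's actual API.

```python
import hashlib
import hmac
import re

# Hypothetical secret; a real privacy vault would hold this, not the pipeline.
SECRET_KEY = b"vault-managed-secret"

def pseudonym(entity_type: str, raw_value: str) -> str:
    # Deterministic: the same raw value always yields the same token, so the
    # model still sees consistent contextual values across the corpus.
    digest = hmac.new(SECRET_KEY, raw_value.encode(), hashlib.sha256).hexdigest()[:8]
    return f"<{entity_type}:{digest}>"

# Toy detectors; production systems use trained entity recognition instead.
NAME_RE = re.compile(r"\b(Sean Falconer|Aaron Delp)\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def deidentify(text: str) -> str:
    text = NAME_RE.sub(lambda m: pseudonym("NAME", m.group()), text)
    text = SSN_RE.sub(lambda m: pseudonym("SSN", m.group()), text)
    return text

print(deidentify("Sean Falconer's SSN is 123-45-6789."))
# Every later occurrence of "Sean Falconer" produces the identical token.
```

The same determinism is what makes the governance step at inference time possible: a response containing a token like `<NAME:...>` can be re-identified for callers whose policy allows it and left redacted for everyone else.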

9:40
Okay. Yeah. That makes perfect sense, because I was actually going to ask you about this. In my head, I see this difference: a lot of folks, when they talk about this, they talk about the fine-tuning stage, if you will, the training stage. That makes perfect sense. But then, when you go to the implementation phase, or inferencing phase, a lot of folks might be using RAG or something like that. And some folks may be thinking about one piece, some folks may be thinking about another piece. How do organizations stitch together the end-to-end compliance of all of this? Because it gets to be different problems at different stages of the journey, if you will, or different stages of the life cycle of an LLM. So how do you talk folks through it when they ask that question about how do I do end-to-end?

10:31
Yeah, I mean, I think that's one of the big challenges that companies have right now: we're looking a little bit too narrowly at this problem, if we're looking at it at all. Essentially, we're thinking, okay, well, how do I do this at, let's say, the fine-tuning phase, and then I can apply some sort of point solution, or maybe I DIY some sort of solution. But that's not looking at the full life cycle of the data. Because this data that you're using for training, or even for building something like a RAG model, it's sitting somewhere as well, within an S3 bucket or somewhere in your infrastructure, wherever you're pulling that information from, and it's going to go through some sort of pipeline, down to eventually ending up in a model. And across that entire pipeline, you need to be able to control who has access to it and how the data is viewed. Do your engineers have full access to the raw data? That's probably not a good idea; it's kind of like allowing them to have full access to the production database. So how do you allow people to do their jobs while not potentially compromising the privacy of your customers or violating some sort of compliance or privacy regulation? And that full spectrum is, I think, where you need a more holistic privacy platform approach, which is what we provide at Skyflow. So Skyflow provides a technology known as a data privacy vault as a service, which gives you isolation, protection, and governance over sensitive customer data. Essentially, you get to use it as a shared service for secure PII management and use, somewhat like using a shared service for an identity provider: I'm going to use Okta for managing my identities, or something like that, across all my services. Well, you can essentially use Skyflow that way to manage customer data across all your different services, including your own. So that way, if you're building a RAG model, and I'm going to take a bunch of training data or documents that are internal to my company, and I want to vectorize that, turn it into embeddings that I'm then going to use as part of an information retrieval step in the inference process, I want to make sure that the rules that govern access to the raw files are also the same rules that get applied at the inference layer and at the RAG model layer, and then also holistically across the stack. That way, a customer service rep that's using the LLM essentially has the same controlled access as they do at their CRM or at their application level. And that's essentially a service that we can build and help customers provide.
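As a rough illustration of carrying the same access rules from the raw files through to the RAG retrieval step, here is a toy sketch; the in-memory index, role sets, and chunk shape are assumptions for illustration, not a real vector store or Skyflow's governance API.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str                # de-identified text, embedded upstream
    source_acl: frozenset    # roles allowed to read the source document

class ToyIndex:
    """Stand-in for a vector store; a real search() ranks by embedding similarity."""
    def __init__(self, chunks):
        self.chunks = chunks

    def search(self, query: str, top_k: int):
        return self.chunks[:top_k]  # toy ranking, good enough for the shape

def retrieve(index: ToyIndex, query: str, caller_roles: frozenset, k: int = 3):
    # Re-apply the ACL that protects the raw files at retrieval time, so a
    # chunk the caller couldn't read at the source never reaches the prompt.
    candidates = index.search(query, top_k=50)
    return [c for c in candidates if c.source_acl & caller_roles][:k]

index = ToyIndex([
    Chunk("Refund policy overview ...", frozenset({"support", "finance"})),
    Chunk("Payroll bands for <NAME:ab12cd34> ...", frozenset({"hr"})),
])
print(retrieve(index, "refund policy", caller_roles=frozenset({"support"})))
# Only the support-readable chunk survives; the HR document is filtered out.
```

The design point is that the filter runs before prompt assembly, so the model never conditions on text the caller was not entitled to see at the source.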

13:01
Yeah. And you mentioned something earlier that made me think of this: you said, hey, somebody who has full access to a customer database, that kind of thing. There are lots of areas here beyond LLMs. We're talking about LLMs mainly today, but there are data lakes, there's data warehousing, there are all kinds of production databases in general. How does this extend out, if you will, this whole concept of the vault? I almost think of it as a filter or a gateway, but specifically for PII information. Is that a good way to think about this? And does it apply everywhere? Basically, everywhere there's data, is it going to apply?

13:45
Thinking about starting, or possibly wanting to advance, your career in the IT field? Well, look no further than the IT Career Podcast, your ultimate guide to success in the IT industry. Every week we bring you expert advice and insider information. You have to learn to learn on your own. The number one thing you can do to get out of the help desk, or out of any entry-level position, is be exceedingly good at your current position. The elephant in the room, right? Money is obviously a certain barrier and roadblock, but there are so many resources available for free on the internet. Whether you're looking to get into cybersecurity, networking, data analytics, or any of the other exciting career fields within IT, we're here to give you the advice and insight you need on the IT Career Podcast.

14:28
Yeah, I mean, that's the idea. I think if you're doing this right and you're building some sort of infrastructure from scratch, you would want to start with using a vault as your core PII storage, for secure storage and management, similar to how, at the base level, you're probably going to have some sort of warehousing solution, maybe a database, these sorts of core components. I think the data privacy vault is becoming the standard in the industry for managing sensitive data. The IEEE came out with an article about a year and a half ago about the future of privacy engineering. Essentially, that article talks about how the future of privacy engineering is this privacy-by-architecture approach, through applying this pattern of the data privacy vault. There have been a number of leading technology companies, like Google, Netflix, Apple, and a handful of others, that pioneered this approach. So it's something that's been done, but it's mostly been done by very heavily resourced, well-funded companies that can throw hundreds of engineers at different things. It hasn't necessarily been done by the smaller companies, because it's hard to build and takes a lot of time, and if it's not your core product, it doesn't make sense to divert all your engineering resources to build it. That was some of the inspiration for the company I work for, Skyflow: let's take this idea and essentially build it as a service for everyone else. As for what that gives you, if you're thinking about a data lake or a data warehouse, some of the challenges that companies run into there are: I want to make sure that my analysts or my data science team can do their jobs and have access to the data. Or even from an analyst standpoint, maybe my marketing team needs some level of access as well, so that they can do analysis to figure out, hey, how are we performing in certain geographies based on the marketing data that we're collecting? So then, how do I do that in a way that doesn't risk customer data falling into the wrong hands within my own company? Even accidentally: not necessarily someone doing something that they shouldn't be doing, but do I run a query and essentially get back Aaron's home address when I shouldn't have that? We want to stop that kind of stuff. So a lot of customers use us in combination with their Databricks, Snowflake, whatever they're using as part of their warehousing, data lake, and analytics platform. Skyflow can be integrated at different places, but you could essentially start at the ETL layer. And like you're saying, it can act as sort of a privacy layer or gateway to the services, where, as part of that ETL pipeline, Skyflow would sit at the head of it, and it would either detect which data elements are PII automatically, if it's unstructured data, or, if it's structured data, you could essentially tell it, hey, this is someone's name, this is someone's home address. And your Skyflow vault can essentially hold onto that and transform it into de-identified values that you can then run your analytics and data science and so forth on, as well as control at a governance level, a fine-grained access level, holistically across all your services.
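Here is a minimal sketch of that ETL pattern, with a made-up vault_tokenize call standing in for a real vault API: PII columns are swapped for tokens at the head of the pipeline, and only tokens land in the warehouse.

```python
import uuid

_vault = {}  # in a real deployment the vault service owns this mapping

def vault_tokenize(field: str, value: str) -> str:
    # Hypothetical vault call: store the raw value, hand back an opaque token.
    token = f"tok_{field}_{uuid.uuid4().hex[:8]}"
    _vault[token] = value
    return token

def etl_transform(record: dict, pii_fields: set) -> dict:
    # Structured case: the pipeline is told which columns are PII.
    return {k: vault_tokenize(k, v) if k in pii_fields else v
            for k, v in record.items()}

row = {"name": "Aaron Delp", "home_address": "123 Main St", "plan": "enterprise"}
print(etl_transform(row, pii_fields={"name", "home_address"}))
# Analytics on the warehouse side runs over tokens; re-identification happens
# only through the vault, under its fine-grained access policies.
```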

17:30
Fantastic. Now, Sean, let me ask a follow-up to this, because I've been thinking about it. We've talked to a number of API security companies here recently, and that topic has come up on the podcast before. This may or may not fit in, but I felt it was worth asking. Everything we're talking about is tokenization and depersonalization of the data, if you will, and I think of that at the storage level and retrieval level. But what about the whole concept of bad APIs, right? API security itself. Like, I make an API call that potentially calls information that it shouldn't. Does that fit in as an additional layer or additional vector to think about in all of this?

18:11
I don't think it's necessarily something fundamentally different from some of the other services that would have access to the data. Ideally, from a security perspective and a privacy perspective, you always want to de-identify data as early in the life cycle as possible. So most of the time, when we think about a modern system, where are we getting customer data? It's usually going to be some sort of collection point as part of an application, like a front-end form: I ask you to sign up with your account information, your banking information, whatever it is that I need to collect about you. And ideally, what we're doing there, if we're following this data privacy vault, shared-service model, is I'm going to take the data from your front end, put it in the vault, replace it with de-identified values, and then send that downstream. So that way, all your downstream services, whether it's your database, your log files, your API calls, whatever it's going to be, don't actually need to see any of the raw customer data. Because, really, what is the use case where our internal services need to see someone's name? It's very rare that there's a use case for that; they just need a representation of it. So by doing that, you're automatically taking a lot of your back-end, downstream services out of scope and de-risking them, because they're never, ever handling any of the sensitive data. That way, even if a mistake happens, a coding error where an API pulls data that it shouldn't, or dumps something to a log file that it shouldn't, it's only ever dumping de-identified values. So if it gets compromised, no one's seeing your raw name or other values; they're seeing essentially just random strings that don't mean anything. It's similar to how PCI tokenization works when we're accepting a credit card. We leverage payment service providers like Stripe and Adyen and so forth in order to take our systems out of PCI scope and make it so that we're not handling raw credit card data; we're offloading that to a third-party provider. And in a sense, a Skyflow vault works similarly, although you have more control of the data, and it works for essentially any kind of information that you might be storing.
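A sketch of that collection-point flow, using the same made-up vault_tokenize stand-in as above: the signup handler swaps raw values for tokens before anything is logged or sent downstream, so databases, queues, and log files only ever see tokens.

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO)

def vault_tokenize(field: str, value: str) -> str:
    # Stub for the hypothetical vault call; see the earlier ETL sketch.
    return f"tok_{field}_{uuid.uuid4().hex[:8]}"

def handle_signup(form: dict) -> dict:
    safe = {
        "name": vault_tokenize("name", form["name"]),
        "bank_account": vault_tokenize("bank_account", form["bank_account"]),
        "plan": form["plan"],  # not sensitive, passes through untouched
    }
    # Even a careless log line now leaks only tokens, never raw PII.
    logging.info("new signup: %s", safe)
    return safe  # what every downstream service receives

handle_signup({"name": "Sean Falconer", "bank_account": "000123456", "plan": "pro"})
```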

20:18
Yeah. Maybe a follow-on to that, when it comes to compliance things. Because, of course, any time we think about compliance and privacy, I think GDPR comes up first, but now the California Privacy Act becomes second, right? What are your thoughts on the state of the compliance industry and where we're going? Are there more things we should be aware of when it comes to privacy? What are the general trends with all of that lately?

20:41
So I think one of the big things that people have to be aware of is that more and more countries are basically building some sort of data residency requirement into their regulations. Those have different flavors, and it's very nuanced and gets complicated, but the gist of it is that different regions of the world want to have some say about where you're holding their citizens' information. So Canada, India, Australia, Germany, all these different countries essentially have some sort of data residency requirement, and there's different strictness around what that means, as well as guidelines around: are you able to transfer data out of the country, and what kind of data are you able to transfer? It's really complicated and difficult to navigate as a business, and it can also become a barrier to go-to-market. Because if I want to move into, let's say, China, and my company depends on SaaS products that don't operate in China, how do I do that? How do I have my marketing team collect information in HubSpot, when HubSpot doesn't have a deployment in China? These things start to get really complicated and difficult to deal with. So that's one of the popular use cases of Skyflow: we can deploy vaults to various regions around the world and make sure that your customer data, the regulated data, stays within those countries, as well as the compute on it, while essentially taking your centralized cloud systems or SaaS products out of scope. So that helps simplify some of the compliance regulations. The other big one that people need to be aware of is that, just last week, the EU passed their AI Act. Because of all the growth and interest in generative AI and large language models, and everything that's happened in the world of OpenAI and ChatGPT, which is like the center of all tech drama right now, there are going to be more regulations around AI. President Biden had his executive order last year; a lot of it wasn't necessarily hard rules and regulations, but they're clearly thinking about going that way. Things are moving a lot faster than they did in the social media era, where it took a decade or so for the world to catch up and put forward things like GDPR. The AI regulations are happening now. So the first one to pass was Europe's. It'll probably take a few years before it's actually in place, where they're actually starting to fine companies for violations or hold companies accountable, but these things are definitely coming. So if you're a company that's operating in that space, or thinking about investing in it, you need to be thinking about these things now, or it's going to be back to 2018, when everyone was scrambling to try to be compliant with GDPR.
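As a toy sketch of that residency pattern: route each record's raw PII to a vault in the subject's region and let the centralized systems handle only the token that comes back. The endpoints and routing table here are invented examples, not real deployments.

```python
# Made-up regional endpoints for illustration only.
REGIONAL_VAULTS = {
    "DE": "https://vault.eu-central.example.com",
    "IN": "https://vault.ap-south.example.com",
    "CA": "https://vault.ca-central.example.com",
}
DEFAULT_VAULT = "https://vault.us-east.example.com"

def route_record(record: dict) -> str:
    # Regulated data is stored and processed in the subject's region; the
    # centralized SaaS or warehouse only ever sees the returned token.
    return REGIONAL_VAULTS.get(record.get("country"), DEFAULT_VAULT)

print(route_record({"country": "DE", "name": "..."}))  # EU vault endpoint
```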

23:29
Yeah, yeah, that makes sense. Thank you for that, Sean. Now, let's switch gears for a second. Let's talk podcasts. You also host Software Engineering Daily, and as we mentioned, we had a good relationship with Jeff, the previous host, and there's probably a good bit of crossover between our audiences as well. So how are things going over at Software Engineering Daily? Give everyone a little bit about the show, and a little update on what's going on over there.

23:56
Yeah, absolutely. So Software Engineering Daily is one of the three podcasts I actually host. But as the name would imply, it's focused primarily on talking to engineering leaders, or sometimes product leads. The episodes are usually 45-minute to hour-long deep dives into a particular technology, or maybe some sort of problem and solution that a company went through. I think some of our most popular episodes are things that have focused on, like, how did Pinterest scale Kafka, or something like that. Deeply nerdy technical topics. Jeff Meyerson was the original host and creator. Unfortunately, he passed away, now almost two years ago. And his brothers took over the show and now run things behind the scenes, and they brought myself and a couple of other folks in to help guest host the show. I started out originally writing for their blog, and I had also been on an episode, one of Jeff's last hosted episodes, as a guest. So I'd just been a fan of the show for years, and I got involved that way. They approached me about coming in and starting to host shows. I started with just a few shows, and now I host a show every Tuesday; every Tuesday is my day. And there are a couple of other hosts that are the core hosts that make up the show, and we essentially release a podcast almost every day of the work week.

25:15
Yeah. Yeah. And I will say this: it's just a heroic effort to even get a daily podcast scheduled, produced, and out there. I mean, it's just an amazing effort. So definitely, everyone, go check it out if you haven't.

25:32
Yeah. I mean, it's a lot of fun for me, especially as, over my career, I've moved somewhat further away from day-to-day coding and engineering. So it's a great way, like a forcing function, for me to stay tapped in and learn about what's going on in the industry, always be learning something new, and get to talk to amazing people. And I'm sure you have a similar experience with your job.

25:51
Absolutely. Absolutely. So we're going to wrap it up there, then. Sean, other than Software Engineering Daily, any other places? If everyone wants to learn more about all of this, where can they follow you? Where can they learn more about everything going on?

26:04
Yeah, absolutely. So if anything I said around, you know, data privacy vaults and privacy and security interests you, feel free to check us out at skyflow.com. And you can always connect with me on LinkedIn; that's probably the best and most active place for me. And if you just search my name, Sean Falconer, there's not too many of us out there, so you'll find me, and I'm happy to connect.

26:24
Fantastic. All right. Well, Sean, thank you very much for your time. And on behalf of Brian and myself, thank you, everyone out there, for listening. We certainly appreciate it. If you enjoy the show, please tell a friend, and please, please leave us a review wherever you get your podcasts. With that, I'm going to close this out for this week, and we will talk to everyone next week.
