Episode Transcript
0:25
This is Advice from a Call Center Geek,
0:27
a weekly podcast with a focus on
0:29
all things call center. We'll cover it all,
0:31
from call center operations, hiring,
0:33
culture, technology, and education.
0:35
We're here to give you actionable items
0:38
to improve the quality of your and your customers'
0:40
experience. This is an evolving
0:42
industry with creative minds and ambitious
0:44
people like this guy. Not only is
0:46
his passion call center operations, but
0:49
he's our host. He's the CEO of
0:51
Expivia Interaction Marketing Group and
0:53
the call center geek himself, Tom Laird.
1:01
So I'm going to just kick this bad boy off.
1:03
We've done a lot of work here over
1:06
the last eight months in
1:08
trying to fully automate quality assurance,
1:10
I think for the smaller contact center, right? I don't
1:12
think we're... you know, the enterprise
1:14
guys have so many different tools and they're
1:17
doing so many different things that we saw that
1:19
there's a need kind of with that smaller
1:22
contact center, and we're saying under a hundred
1:24
seats. You know, I got talking
1:26
to Chris Mouts, who's on here too, from EvaluAgent. He talks
1:30
a lot about how there are
1:33
so many of these smaller contact centers that are still
1:35
using Excel spreadsheets, right?
1:37
They're still using Google Sheets,
1:39
and you know there's
1:41
kind of a need, I think, if you can give them
1:43
a tool that can look to automate
1:45
with ChatGPT and give them some
1:47
type of better reporting aspect.
1:50
That's kind of what we did
1:52
about seven, eight months ago, or at least we set out to
1:54
do it. And let me throw this out to you guys: this is
1:56
a full AMA. So if anybody
1:58
has any questions, anytime, raise your hand and I'll bring you up.
2:00
We can have a conversation, we can talk this through. But
2:03
I want to give you some of the cool stuff that
2:05
we have found out, that we have figured out, especially
2:09
when it comes to prompting, especially
2:11
when it comes to how ChatGPT
2:13
utilizes transcripts
2:15
in the best way
2:18
for listening for specific
2:20
things, like how do you listen for empathy,
2:23
how do you try
2:25
to score things that are unseen,
2:27
like, did an agent go
2:29
to the right screen on their computer to find this
2:32
information? Or did they click this box that
2:34
we can't see in a transcript? How do we
2:36
deal with some of that? And then
2:38
also just dealing with some of the,
2:40
I
2:43
guess, nuances of ChatGPT and how it
2:45
thinks, right? So the amount of
2:47
different testing that we've done over the last seven months
2:49
has been insane. Like,
2:51
we have taken
2:53
notes; like, I really almost want to write a book
2:55
on all of this. But
2:57
I have like 15
2:59
things that I want to talk to you guys about that I think are super
3:01
cool and that we learned from
3:04
the prompting aspect of ChatGPT.
3:06
And again, I am a full open book.
3:08
We have our own product. If you want all
3:10
these prompts, if you want our static prompt, I
3:12
will give you everything. Like, I
3:14
think that's the other thing too: I'm not here to hide
3:16
anything. So
3:19
if anything is of interest to you,
3:21
or you even want to play with it on the desktop
3:23
version that you have with some of your calls, you
3:25
know, knock yourself out, because I know that
3:27
these prompts I'm going to talk to you about work. So,
3:30
just the quick overview of how we do this is:
3:32
we have a
3:34
SaaS product where
3:36
we basically take a call,
3:39
and as soon as we analyze that
3:41
call, it goes out to a company
3:43
called Deepgram. It gets the full transcript
3:45
of the call. The call then comes
3:48
back, looks at our static prompt,
3:50
looks at all the context that we did
3:52
throughout (each of the questions has
3:54
specific outputs that we want), goes
3:57
out to ChatGPT, it quote-unquote
3:59
thinks, it comes back, and
4:02
then we get an output. And you guys,
4:04
if you want to know what the outputs look like, just go
4:06
look at my LinkedIn. You'll see a bunch
4:08
of how the outputs look. We've decided
4:10
that the best outputs, at least to start with, are
4:12
the actual scoring of every
4:14
question, what are four ways
4:16
that the agent did well, what are four things that
4:18
the agent could improve upon, the call summary,
4:20
and then kind of just that overall
4:23
score with customer and agent sentiment
4:25
as well.
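To make that flow concrete, here is a minimal sketch, assuming the Deepgram prerecorded-audio REST endpoint and the OpenAI Python client; the function names, field names, and model choice are illustrative, not the product's actual code.

```python
# Rough sketch of the pipeline described above: call audio -> Deepgram
# transcript -> static prompt + per-question context -> ChatGPT -> JSON scorecard.
import json

import requests
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def transcribe(audio_url: str, deepgram_key: str) -> str:
    """Send a recorded call to Deepgram's prerecorded endpoint, return text."""
    resp = requests.post(
        "https://api.deepgram.com/v1/listen",
        headers={"Authorization": f"Token {deepgram_key}"},
        json={"url": audio_url},
    )
    resp.raise_for_status()
    return resp.json()["results"]["channels"][0]["alternatives"][0]["transcript"]


def score_call(transcript: str, static_prompt: str, questions: list[dict]) -> dict:
    """Run the QA form: static prompt plus each question's mini-prompt context."""
    question_block = "\n".join(
        f"Q{q['id']}: {q['text']}\nContext: {q['context']}" for q in questions
    )
    completion = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": static_prompt},
            {"role": "user", "content": f"Transcript:\n{transcript}\n\n{question_block}"},
        ],
    )
    # Assumes the static prompt asks for a JSON object back (see the later tips).
    return json.loads(completion.choices[0].message.content)
```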
4:27
But let's talk about some of these prompts and some of the things
4:29
we have found, if you're planning on doing this.
4:32
So, number one: less
4:35
is more for easy questions. So,
4:37
if you have a greeting, or
4:40
if you want to collect an email address: "Did the agent
4:42
collect an email address?" Right, that's
4:44
really all you want to say. You don't want to get...
4:46
we tried to do these kind of elaborate things for
4:48
everything, and it just confused it for the
4:50
shorter-type,
4:53
black-and-white, binary questions.
4:55
So that's
4:57
pretty easy. But let me say this: the
5:00
word "explicitly" is,
5:02
like, ingrained in ChatGPT.
5:04
So if you use the word "explicitly"
5:06
(and sometimes we would use ChatGPT to help us
5:08
with developing some of the prompting for
5:10
each of the questions), it
5:14
would be absolutely exact. So
5:16
if there was anything off... like, one of
5:18
the things was:
5:22
"please make sure that the agent
5:24
explicitly says thank you for calling customer
5:26
service." If there was anything
5:28
off, if there was a pause, it would score it
5:30
as a zero or a no. So
5:33
we have found that if you want to be exact,
5:35
you don't have to really tell it to be
5:37
exact. Just give it
5:39
kind of that general deal, and
5:41
it works much better. Unless you have something
5:44
like a disclosure, right, where you can't
5:46
go off; you can't have an I
5:48
undotted or
5:50
a T uncrossed. It's all going to
5:52
be... sorry, I muted myself. It's
5:56
all going to be perfect. So
5:58
be careful about being too explicit
6:01
when you want to have something exact. Most
6:03
of the time, if you just tell it and give it kind of a rough
6:05
example, it will work.
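As a sketch of that tip, the per-question context for a binary question can stay one line long; these strings are paraphrased examples, not the product's actual prompts.

```python
# Too rigid: "explicitly" pushes the model toward exact-match scoring, so a
# pause or a small wording change gets scored as a zero / no.
greeting_too_strict = (
    "Please make sure that the agent explicitly says "
    "'thank you for calling customer service.'"
)

# Better for short, black-and-white, binary questions: keep it general.
greeting_context = "Did the agent thank the caller for calling customer service?"
email_context = "Did the agent collect the customer's email address?"
```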
6:08
Now, this is the cool stuff, right?
6:11
So how do you
6:13
have ChatGPT, when
6:15
it's looking at just a transcript,
6:17
talk about empathy? That was
6:19
something that was big for me. And
6:22
you know, you could just say, well,
6:25
we want the agent to say "I'm so sorry
6:27
to hear that," or
6:29
"oh my gosh, I can't believe that
6:31
happened." Right, now you could do that, and it's
6:33
pretty generic, because those
6:35
kinds of conversations can come up in a lot
6:37
of different instances. But
6:40
what we have found better is to kind of use
6:42
a lot of if/then statements when it comes to
6:44
the more thinking-type questions. So
6:46
you know, we'll say something...
6:48
and actually, let me pull the actual prompt up.
6:53
Give me one second here, I'll
7:00
pull it up in a second. But basically what we say is: hey,
7:02
can you look in the transcript
7:05
to find out any instances
7:08
where the customer seems distressed, where
7:10
they said something that had
7:13
a negative sentiment, that
7:15
was not positive? And
7:17
then, after you have found that,
7:19
we want to make sure that the agent
7:22
isn't using kind of just a basic scripted
7:24
response, but that they're actually using
7:26
some words in there that correlate directly
7:28
to what the customer said. So
7:30
we're not looking for specific keywords, like
7:33
"the agent must say I'm so sorry
7:35
to hear that." We
7:37
got a little general with what
7:39
could be said by the agent, as long as it kind of correlated
7:42
back to the actual problem and
7:44
the agent was actively listening for that.
7:47
Again, if you go on my LinkedIn,
7:49
I think yesterday I posted, like, these five
7:51
kind of core prompts. I have the
7:53
exact full prompt for
7:55
empathy and what we did there, and it works
7:58
every single time.
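Here is a hedged reconstruction of that two-step, if/then structure; the verbatim prompt is in the LinkedIn post he mentions, so treat this as a paraphrase of the approach rather than the real thing.

```python
# Paraphrased reconstruction of the two-step empathy context described above.
empathy_context = """
First, look in the transcript for any instances where the customer seems
distressed or says something with a negative sentiment.
If you find such an instance, then check whether the agent's response goes
beyond a basic scripted phrase: the agent should use words that correlate
directly to the specific problem the customer described, which shows active
listening. Award full points only if both conditions are met.
"""
```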
8:00
So again, I would ask you, or implore
8:04
you if you don't believe me: take that prompt, go play
8:06
with it on the desktop with your call recordings. I
8:09
think that was something that was really cool
8:11
for us to kind of finally figure out,
8:13
because we were always trying to do something different.
8:16
Like, we know we can just say, hey, can you find
8:18
this in a recording, but how do you take it
8:20
to the next level, to really
8:22
kind of use the use case
8:24
that we want? So I think that that
8:26
was interesting. The
8:29
other thing (and I'm just kind of all over the board here, these
8:31
are all random) is: don't tell
8:34
ChatGPT to tell
8:36
you when something is not there. Now,
8:38
it can do that, and
8:40
let me give you an example. But it would get confused
8:42
a lot when we would say certain things like:
8:46
"Please score this with full points if this is
8:48
a sales call or
8:50
a retention call, but score
8:52
it as an NA if it is a password
8:55
reset call." All right? And
8:57
ChatGPT would consistently get confused
9:00
with what was what, even
9:02
though we've done some things with even
9:04
selecting what different call types can come
9:06
in. So I think, for
9:08
our platform, it doesn't matter. You don't have
9:11
to have skills set up for, you
9:13
know, sales, retention, password reset;
9:15
you could have one skill that comes in, and
9:17
we have a way to know
9:19
what type of call it actually is and then what
9:22
questions correlate to it. But
9:24
we were trying to tell it too much information,
9:27
and it would get crazy confused. So
9:29
what we found is that you don't have to tell it
9:31
NA; you just have to tell it what
9:33
it's looking for, and if that stuff's not
9:35
there, it will score
9:37
it as an NA, if that makes sense. So, you
9:40
know: "please only score this if it is a sales
9:43
or retention call," and then you kind of leave it at
9:45
that at the end of the prompt. Don't tell
9:47
it "and if it's not there, score it NA." It
9:50
got crazy confused, and that was super frustrating,
9:52
because we're like, no, we're telling it exactly what we want, but
9:55
it would get confused by that. So that's
9:58
a tip for you there.
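The same tip as a before/after pair; both context strings are paraphrased from the example he gives, not copied from the product.

```python
# Confused the model: naming the NA condition outright.
context_confusing = (
    "Please score this with full points if this is a sales call or a "
    "retention call, but score it as an NA if it is a password reset call."
)

# Worked: say only what to score for and stop; anything else comes back NA.
context_working = "Please only score this if it is a sales or retention call."
```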
10:00
And I think that's more like the analytics, right? When
10:02
you're looking at, like,
10:04
advanced speech analytics, it's very easy
10:06
to find things that are there,
10:08
but it's more difficult
10:10
to kind of look for things that aren't, and I think that kind
10:12
of maybe is a little bit of a crossover with why
10:14
it gets confused. The
10:17
other thing that I think is pretty cool
10:19
is: how do you prompt
10:22
for the unseen? Right, meaning
10:24
an agent has to move on
10:26
to a certain, you know, part of their
10:28
computer screen, they have to get a certain piece
10:33
of information, they have to click on
10:35
a box. And I
10:37
do actually want to pull the prompt up here. Give me one second,
10:40
because I think this one
10:42
baffled us for a really long time. And I'm
10:45
not saying it's perfect, because it's never gonna be
10:47
perfect if it's something that we can't find
10:50
in the actual
10:52
transcript, but
10:54
I think it's pretty darn close,
10:56
and it has given us kind
10:59
of the outputs that I think we've
11:01
wanted on a vast majority of the calls.
11:03
All right, give me one second, let me pull this bad
11:05
boy up. I don't know, I thought I had it up, but I deleted
11:08
it. All right,
11:12
let's see...
11:16
pull this post up. All
11:23
right. So for scoring
11:25
for the unseen, we
11:28
basically say things (and again, this prompt is
11:30
in that post that I did the other day)
11:32
like: well,
11:35
what do we know? You know,
11:37
if an agent has to get some specific
11:40
information from a specific part of a screen, we
11:42
know certain things, like there's a promptness
11:45
in providing that information, right? So if
11:47
they say, "all right,
11:49
let me... I need to pull that up," or,
11:51
you know, something along those lines, if there's
11:53
a delay in the actual talking,
11:55
we can kind of see that, yeah,
11:58
they're probably not
12:00
able to find that piece of information quickly. Can
12:03
they transition between topics, like if there's
12:05
a big change in topic? And again,
12:07
if the question is "did
12:10
the agent read the proper
12:12
disclosure," or, let's
12:14
say, "did the agent find
12:18
the information for the dishwasher," right?
12:20
And so, if the customer is
12:22
saying "I have a problem with my dishwasher," and
12:25
there's four seconds, five seconds, six seconds
12:27
when the agent is trying to find that information, right,
12:30
and the question is "did the agent
12:32
quickly find the information?", we know
12:34
that that's going to be kind of a yes or no. Again,
12:37
is that perfect? It's not perfect, but
12:40
I think that you kind of get the idea: the
12:42
transition between topics, you know,
12:44
confirmation of actions, minimal
12:48
need for correction. So there's a couple things in
12:50
that prompt that basically said:
12:52
how quickly did that agent
12:55
really find this information? Now,
12:57
things like "did they click
13:00
the box for
13:02
opt out of email," we
13:06
actually look for a little bit of a delay. So
13:08
if that's a question, and
13:12
the agent says, "hey, would
13:14
you like to opt out of our email," and the
13:16
customer said, "yes, I do," if the agent
13:18
says "okay" and they wait, like, a second...
13:20
all right, like,
13:23
things like that we've been able to kind of find
13:25
in all of this kind of data, and we
13:27
think it gives a pretty good representation
13:29
of seeing
13:31
the unseen, and doing the best
13:33
that we could possibly do without, right
13:35
now, having AI be able to go onto the
13:37
actual, you know, computer
13:40
and actually seeing
13:42
what we're doing.
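One way to implement the promptness signal he describes: measure the pause before the agent responds. This sketch assumes utterance-level timestamps and speaker labels from the transcription step (Deepgram can return these); the four-second threshold and the field names are illustrative.

```python
def flag_slow_lookups(utterances: list[dict], threshold_s: float = 4.0) -> list[dict]:
    """Flag long agent pauses, e.g. hunting for the dishwasher info on screen."""
    flags = []
    for prev, cur in zip(utterances, utterances[1:]):
        gap = cur["start"] - prev["end"]
        if prev["speaker"] == "customer" and cur["speaker"] == "agent" and gap > threshold_s:
            flags.append({"after_customer_said": prev["text"], "gap_seconds": round(gap, 1)})
    return flags
```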
13:47
One of the things that we
13:50
had a huge problem with is
13:52
that ChatGPT sometimes... let's
13:55
say we have 35 questions
13:57
on a QA form; a
13:59
lot of times it would not return all 35
14:01
questions, which is a really big issue. Right?
14:04
And it wasn't just NA questions, it
14:06
wasn't just yes-or-no questions. There was no real rhyme
14:08
or reason to why it was not returning,
14:10
in our JSON output,
14:13
all of
14:15
the questions. We still don't know why that
14:17
did happen, but we
14:19
use the word "imperative." We've
14:22
used a lot of different words, but we found that "imperative"
14:24
worked the best. So we basically said:
14:27
"it is imperative that you return all of the
14:29
questions in the JSON format,"
14:31
and then, you know... there's more to
14:33
it than that, but basically, telling it "imperative,"
14:36
we have found, and "explicitly,"
14:40
right, those two words (and I'm sure there's
14:42
a ton of those words, I'm sure it's not just those two),
14:44
those two words definitely have an impact
14:47
in your prompting, to be exact
14:49
and to kind of not
14:51
go off. So, you know, once we said that...
14:53
now, there were a lot of different ways that we could have done that.
14:55
We could have said, you know, one of the things we
14:57
were talking about is: hey, you know, please review
14:59
how many questions there are at the beginning, and make sure that
15:01
you answer the same amount at the end. You
15:05
know, those kinds of things. But we found that it
15:07
made the prompt, or made the
15:09
QA form, take a
15:11
little bit too long. So
15:13
that's kind
15:15
of the route we went: just one little quick sentence, and it works.
15:18
It's worked every single time, and we have not had
15:20
a problem with it since.
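A sketch of that one-sentence fix, plus a cheap client-side completeness check so a short return could be caught and retried; the static-prompt wording is paraphrased.

```python
RETURN_ALL_SENTENCE = (
    "It is imperative that you return all of the questions in the JSON format."
)

def all_questions_returned(result: dict, expected_ids: set[str]) -> bool:
    """True if the JSON scorecard covers every question on the QA form."""
    return expected_ids <= set(result.get("scores", {}))
```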
15:22
The other thing that we have found, for accuracy and
15:25
for speed is to tell ChatGPT
15:27
where to look for certain things
15:29
in a transcript. So,
15:31
if we say things
15:33
like, for the greeting... like, for this specific client,
15:35
the caller must, or the agent
15:38
must, say "thank you for calling customer service":
15:40
"Please
15:43
look for that in the first five lines
15:45
of the transcript." So we
15:47
have found that that has, I don't want to say significantly
15:50
reduced the amount of time
15:52
that it takes for a QA form to come back,
15:54
but I think it's been more
15:56
accurate, because it's not looking at everything, and it has been a
15:58
little bit quicker the
16:00
more that we've implemented those types of things. You can do the same thing
16:02
for the closing, right? Because you're not going to have a closing
16:05
at the beginning. So why have it read
16:07
through the entire transcript for all of those things, you
16:09
know?
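Illustrative per-question contexts for that scoping tip; the wording and line counts are examples, not an actual client's prompts.

```python
greeting_context = (
    "The agent must say 'thank you for calling customer service.' "
    "Look for this only in the first five lines of the transcript."
)
closing_context = (
    "Check for a proper closing. Look for it only in the last "
    "ten lines of the transcript."
)
```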
16:11
I really got excited with the thinking questions. The black-and-white,
16:13
binary "did the agent do this or that?", everybody
16:16
knows that ChatGPT could handle that. But
16:18
I think that the nuance for any of these companies that are
16:21
going to try to do this is: how do they
16:23
handle the empathy questions? Did
16:26
the agent do something appropriately
16:28
throughout the call, where it's not just black and
16:31
white but it takes a thought process across maybe
16:33
multiple sections of the call? And
16:36
I think that it can be done. I think we've done a really
16:39
good job with... you
16:41
know, the cool thing about this is
16:43
being able to test this with our actual
16:45
customers. So pretty much every
16:47
single customer we have in our BPO
16:49
is utilizing this now. So our QA
16:51
department (I haven't gotten rid of any QA
16:53
people or anything like that, yet), we're
16:56
basically scoring a call with human
16:58
beings and they're calibrating it. We're just doing that all
17:00
day long, making sure that
17:02
all of these prompts work.
17:04
We're now kind of, I don't want to say hands-
17:06
free, but we're at a point now
17:08
where, you know, I think the core
17:10
basic prompts that everybody has, right? Everybody's
17:13
going to have, like, an opening, a closing, they're
17:15
going to have a greeting, they're going to have, you know, did
17:17
the agent have proper tone?
17:19
Did they use
17:22
proper word choices? Did they not use "um"
17:24
and "ah"? Did they not have diminishing language for the company?
17:26
Like, these, like, 10 things that we
17:28
know work really well, or 15 things, you know,
17:31
are going to be part of every single kind
17:33
of onboarding. And then, obviously, you just utilize
17:36
it and change it and put it into
17:38
your company's context and add as many questions
17:40
in as you can. But
17:43
I think writing the book on
17:45
understanding how to prompt
17:48
for specific questions, whether it is a thinking
17:50
question, to a binary black-and-white question,
17:52
to something that takes a little bit
17:54
more thought process, those were
17:56
the things that I think we feel comfortable about, and
17:58
that's the magic sauce, right? So that's
18:01
why I couldn't care less if everybody
18:03
uses, you know, all of our prompts. I
18:05
don't care, because there's going to be certain things
18:07
that come up that we're going to kind of understand a little bit more.
18:09
But I also want people to feel comfortable
18:12
with this technology. I think this should
18:14
be democratized. You
18:17
could be a five-seater and just use the desktop
18:20
version and have one prompt that
18:22
has everything, and you could just be hammering out calls
18:24
by yourself for free every single day, and
18:27
I would love to see that, right? I
18:30
think that could be, you know, one option.
18:32
Obviously we have, I think, a slicker version
18:34
of that, and there are a lot of companies that are coming out with
18:36
slicker versions. This isn't just us. But
18:40
that's the thought process that kind of goes into
18:43
it, from understanding
18:46
how ChatGPT thinks,
18:48
to get the best result and
18:51
to get the most consistent results. And
18:54
I would say now, again, like I said, all of our QA
18:56
department is utilizing this for all of our
18:58
customers. That's kind of our alpha test before we
19:00
beta. But yeah,
19:04
I mean, I think that's kind of what I
19:06
wanted. I'm trying to just look down my list here.
19:08
Is there any other prompt or anything else that I
19:10
thought was pretty cool? I
19:13
don't know. Do you guys have any questions? I
19:16
appreciate everybody kind of joining here; hopefully
19:18
this was a little bit insightful,
19:23
and I think it's kind of cool. But is
19:25
there anything? Do you guys have anything? Any questions?
19:27
Just trying
19:29
to think of things. Like, you know,
19:31
we didn't really struggle... We
19:34
found that, I know ChatGPT has kind of
19:36
a (and again, I'm not a programmer, so I'm gonna say
19:38
this wrong, but they have a way, or
19:40
a button that you basically click, to guarantee
19:42
a JSON file output). We
19:45
found that that was very restricting, so
19:48
we just prompt for the JSON
19:50
output in our actual static
19:52
prompt, and
19:54
that has worked out much better, and
19:56
we have a lot of flexibility
19:59
then to make sure that we're getting
20:01
the right stuff that we want.
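The "button" he is describing is presumably OpenAI's JSON mode. Here is a sketch of both options with placeholder messages; the prompt text is illustrative.

```python
from openai import OpenAI

client = OpenAI()
messages = [
    {"role": "system", "content": "Return the scorecard as a JSON object..."},
    {"role": "user", "content": "...transcript and questions here..."},
]

# Option A: enforced JSON mode, which guarantees valid JSON but which they
# found restricting.
strict = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=messages,
)

# Option B: their approach, asking for the JSON shape in the static prompt
# and keeping the flexibility to define the output structure themselves.
flexible = client.chat.completions.create(model="gpt-4o", messages=messages)
```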
20:03
I thought one of the things that was really helpful (and this is kind of crazy,
20:05
but just a quick story): there's a... I forget
20:07
what her name is, but she won the Singapore national
20:12
prompting competition, and I
20:14
was trying to read as much as I could on prompting and kind
20:17
of how to figure this stuff out, and she had
20:19
an article on Medium, and at the bottom it was
20:21
like, hey, if you wanna talk to me, it's like 50 bucks for a
20:23
half hour. So we
20:25
utilized her a couple of times at the very beginning, a couple
20:27
of months ago. That really helped us to understand
20:30
some of the outputs. Understanding,
20:33
you know, just look for certain aspects
20:35
of the transcript; don't read the whole transcript
20:38
every single time if you know
20:40
something's at the beginning or at the end. Understanding
20:43
the
20:45
structure, right, of how ChatGPT's
20:47
quote-unquote mind works. I think
20:49
all that was extremely helpful when we're
20:51
going through our prompting.
20:55
The other aspect, though... oh,
20:57
Jeremy, yes, let
21:02
me bring you up, bud. All
21:07
right, Jeremy, you're muted, but you're up.
21:10
Hey, thanks, buddy. I
21:12
joined a little bit late, so I apologize
21:15
if you spoke about this already and I missed it.
21:17
I'm just curious if there's anything
21:19
that you found from, you know, any
21:21
of your clients where it's like, you know what, a human
21:23
still needs to do this part? There's a certain type
21:26
of process or policy or question where
21:30
it just doesn't have the needed information.
21:32
You know, maybe something different
21:35
than that; maybe there's something in the record history
21:37
that it doesn't have access to, or
21:39
anything along those lines.
21:41
Yeah, and I think it just goes back... If we
21:43
have a client that is very heavy into things
21:48
that are happening on their computer screen, right,
21:50
like they have to be in a certain field, they
21:52
have to make sure that certain things are clicked, we're
21:54
gonna really struggle with that right now.
21:57
You know, the visual aspect, I mean, we don't have
21:59
any of that. Not that we couldn't, but
22:01
I mean, we're relying totally
22:03
on a transcript. So I
22:06
think that there's a lot to this, right? Number one,
22:08
there is a security aspect to this,
22:10
to be perfectly fair. I think, using
22:12
the API version
22:15
of ChatGPT, I feel much better on
22:17
the security aspect than if we were just... obviously
22:19
we would never use just the desktop. But
22:22
I still think that, from a masking
22:24
standpoint, from a PCI standpoint, I
22:26
don't know if I feel comfortable working right
22:29
now with, you know, financial services clients,
22:31
to have credit card numbers and all that stuff.
22:33
Now,
22:36
I think that it
22:38
probably is totally fine, but
22:40
again, I
22:42
think that's a thought that we would
22:44
really have to think through. The
22:46
other thing is, again, I just think...
22:49
as long as something is in the transcript, we've
22:52
been able to figure out really unique ways
22:54
to be able to score that
22:56
call. The other thing: we can
22:58
rate... it's not as accurate,
23:00
but I'm starting to feel like it can
23:02
be an offering, because it's accurate
23:04
enough. Where, you know, some QA forms
23:07
have like a one-through-five, right? Like,
23:09
score this on a scale.
23:11
So we're
23:13
looking at that. But I think
23:15
there's
23:18
going to be some customers that
23:20
are nervous from the security aspect, that
23:22
they're not gonna want this, they're gonna want a human being
23:24
to do it. But the other thing is, if
23:26
more than, you know, 20%
23:29
of your questions are not in
23:31
the actual transcript, and it has
23:33
to be a transactional thing on a computer screen,
23:35
then we're gonna stink at that too. If
23:42
that kind of answers your question... Yeah,
23:45
that's great. Thank you. All right.
23:52
All right, let me bring you up. All
23:58
right, Javi, you're up. How you doing, buddy?
24:01
I'm doing well, Tom. How are you?
24:03
I'm good, I'm good. Good to talk to you.
24:05
Yep, thank you for this session. Same as
24:08
Jeremy, I apologize, I joined a little
24:11
late. But as a follow-up
24:14
to Jeremy, and also a question for
24:16
you: so on our side, we've
24:18
been leveraging
24:20
the ChatGPT
24:22
API to
24:24
do some automated quality,
24:27
and I think
24:29
it's very important, like
24:31
you mentioned earlier, to add in a lot of
24:33
context before you even ask
24:35
it the questions that are all related to the QA form:
24:37
provide the intro
24:40
and what it is that you're giving it,
24:42
like, this is a transcript, this is a call, this
24:44
is a chat, or whatever. And then within that
24:46
context also, what we've learned
24:48
is, we've been having
24:50
to provide it a whole bunch of
24:52
guardrails: if there is this, do
24:54
not bring it into your analysis; if there is
24:56
that, do not bring it into your analysis
24:58
either, like, ignore it, or whatever.
25:01
And even more than guardrails,
25:03
we've been having to tell it things like: use
25:06
constructive language, do not use
25:08
negative terms like "mediocre"
25:10
or "poor" or "weak." So
25:13
we've been having to be very specific with
25:15
it in terms of contextualizing
25:18
as much as possible, so when we do
25:20
finally ask it the question that is linked
25:22
to the quality assurance form, it's
25:25
got all that context before it
25:27
answers it. In addition
25:29
to that, to Jeremy's point: what
25:32
is it that ChatGPT can't
25:34
do, that we have to rely on a quality analyst
25:36
to do? We've started to tell
25:38
it: if this conversation
25:40
is too complex for
25:42
you to provide us constructive feedback,
25:45
please flag it so
25:47
we can have one of our quality analysts
25:49
look at it. So we're basically telling
25:51
ChatGPT to help us identify
25:53
which calls should
25:56
be reviewed by a human, in
25:58
order to help provide more analysis
26:00
and more constructive feedback
26:03
to the rep, or to the manager of that
26:05
rep, to improve. So
26:07
I want to learn from you about all that context
26:09
that you've been providing. Yeah, the
26:11
guardrails: how did
26:14
you add them within the logic?
26:15
So I will tell you this: we did ask
26:18
for a confidence
26:20
score with
26:22
ChatGPT. So, you know, we
26:24
basically said: can you, you know, rate this transcript,
26:26
rate the output
26:28
that you have, on a scale of one to ten? And
26:30
if it... you know, I forget what we said (this was at
26:32
the beginning, when we started testing the
26:34
different questions), but if it was, like, below five, then kind
26:37
of flag that, because you don't feel comfortable
26:39
or confident that you could score this call, either
26:42
from a complexity standpoint, or the transcript
26:44
was garbled, you know, something like that.
26:46
We also found that
26:48
we would ask for a confidence score,
26:51
as we're testing, for every single question,
26:53
and that also found which prompts
26:55
we were struggling with;
26:58
there was kind of a direct correlation.
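A sketch of that confidence-score idea: ask for a one-to-ten self-rating and route low-confidence calls to a human analyst. The below-five threshold follows the talk; the wording and field names are made up.

```python
CONFIDENCE_CONTEXT = (
    "For each question, also rate on a scale of one to ten how confident you "
    "are in your score, considering call complexity and transcript quality."
)

def needs_human_review(question_scores: list[dict], threshold: int = 5) -> bool:
    """Flag the call for a quality analyst if any answer is low-confidence."""
    return any(q["confidence"] < threshold for q in question_scores)
```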
27:00
To answer your first question, we
27:04
really have not found too much of that. Now, like,
27:06
our static prompt is basically, you
27:08
know... we go into it just like a regular
27:11
deal. Like: you're the head of quality assurance,
27:13
you oversee scoring for quality,
27:15
you will define the type
27:17
of call. We just kind of define
27:20
what we want in our output. So one is
27:22
going to be the call type; so if it's a sales
27:24
call or a retention call, we want to know that.
27:26
We basically tell
27:28
it to give an agent
27:30
and a customer sentiment score. We
27:34
ask it, you know, to add that to the JSON
27:36
output. We
27:38
talk about the scoring being a number,
27:41
a yes,
27:44
a no, or an NA; we kind of go through that.
27:46
We
27:49
talk about the outputs of
27:51
four ways that they did well, four ways that they did
27:53
poorly, and
27:56
then we
27:58
ask for the call summary in that
28:00
as well. But we have not
28:02
really done any type of
28:04
guardrails, especially in the summary, and
28:07
we found
28:09
it really hasn't, you know, said
28:12
anything derogatory
28:14
or poor. You know, when
28:17
it comes to the actual summaries, we
28:19
do ask for, we're
28:21
calling it the rationale right
28:23
now. So if
28:26
ChatGPT does the summary and it scores
28:28
a question as a yes, like it gave full
28:30
points, we don't say anything.
28:32
But if it scores it as a no
28:35
or an NA, then we have, like, a little
28:37
question mark next to the question, where
28:39
we can look at that and it will tell us
28:41
why it scored it as a no. And
28:43
a lot of times that will be kind of part of the
28:46
prompt as well, because we wanted
28:48
to know, you know, what piece of that
28:50
question it didn't
28:52
like. But
28:54
after that, it's just: each question
28:57
has its own... we're calling it context, but the context
28:59
is just the mini prompt that defines
29:01
that question. And
29:04
then, basically, we just
29:06
define how we want the JSON
29:09
file to look, and then
29:11
that's how we get the output for each
29:13
question on the call-scoring
29:15
form.
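Pulling the pieces of this answer together, here is a skeleton of the static prompt as described: role, call type, sentiment, scoring scale, strengths and improvements, summary, and rationale. This is a paraphrased reconstruction, not the verbatim Expivia prompt.

```python
STATIC_PROMPT = """
You are the head of quality assurance and oversee call scoring.
1. Define the call type (for example: sales, retention, password reset).
2. Give an agent sentiment score and a customer sentiment score.
3. Score each question as a number, a yes, a no, or an NA.
4. List four ways the agent did well and four ways they could improve.
5. Write a call summary, noting anywhere points were taken off.
6. For any no or NA, include a short rationale for the score.
Return everything in the JSON format defined below.
"""
```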
29:18
So, again, from a guardrail standpoint:
29:20
I've done, I don't want to say
29:23
a thousand of these, but hundreds upon hundreds of these myself,
29:25
and the call summaries have
29:27
been pretty much on point with what the call
29:29
is, black and white. We
29:32
added in there to please, in the call
29:34
summary, you know, talk about if the agent
29:36
did not do something where points were taken off,
29:38
so that that QA
29:40
person can kind of read that and look at that.
29:43
You know, it probably does mean a lot,
29:45
based on, you
29:47
know, how complicated and how complex
29:49
the calls are. You
29:52
know, I mean, we're talking about BPO, you
29:54
know, financial services, retail,
29:57
tech support. You
30:00
know, those kind of, I
30:02
don't know, say four-to-ten-minute-type
30:04
calls, where a
30:06
lot of them, you know, are extremely
30:08
binary and they're yes/no-type
30:11
questions. We have one client that has seven
30:13
different call types that come in,
30:18
that all have different scoring
30:20
and questions that correlate to the
30:22
different types of calls. We've been able
30:24
to kind of figure that out. But,
30:27
yeah, I mean, I guess I really haven't
30:30
seen too much kind of derogatory
30:32
language or those types of things, you
30:34
know, with the outputs. But I'm going to probably
30:36
look out for it maybe a little bit now; you got me freaked
30:38
out. But
30:40
yeah, that's kind of how we've
30:43
at least structured the regular prompt,
30:45
which is pretty straightforward. And
30:47
I think the meat and potatoes of it, though, is
30:49
figuring out the context,
30:52
or the prompting, for each of the questions
30:54
to get a proper response, an
30:56
accurate response, and a consistent response.
30:58
So I hope that that
31:01
helps you a little bit. Yeah, let
31:06
me bring it up. All right, guys? Well, hey, I don't know, that's
31:09
really all that I have. I appreciate it. I hope that
31:11
that was helpful. We'll
31:13
continue to kind of do this. I think it's been interesting
31:15
to kind of go down
31:18
this path. And I know there's a lot of you who are interested
31:20
in this stuff too, and it's a lot of fun to talk
31:22
with you guys. So again, thank you guys very, very,
31:24
very much. If you have any questions,
31:26
just hit me up. Thanks, guys. Tick-
31:37
tock, what's up? Does
31:41
anybody have any... you guys have any questions
31:43
on prompting, on
31:46
AI, on quality
31:49
assurance? Let
31:51
me know.