Episode Transcript
Transcripts are displayed as originally observed. Some content, including advertisements, may have changed.
Use Ctrl + F to search
0:00
Hey , Inside the Mix podcast fans , it's Ian
0:02
Stewart . If you want to follow me or find
0:04
out more info about me , the best place to do that
0:06
is my website flotownmastering.com
0:09
. That's F-L-O-T-O-W-N
0:12
mastering.com . You're
0:14
listening to the Inside the Mix podcast . Here's
0:17
your host , Mark Matthews .
0:18
Hello and welcome to the Inside the Mix podcast
0:20
. I'm Mark Matthews , your host
0:22
, musician , producer and mix and mastering
0:25
engineer . You've come to the right place if
0:27
you want to know more about your favourite synth music
0:29
artists , music engineering and production
0:31
, songwriting and the music industry
0:34
. I've been writing , producing , mixing and mastering
0:36
music for over 15 years and I want to
0:38
share what I've learnt with you . Hello
0:46
, folks , and welcome to the Inside the Mix podcast . If you are a new listener , a big
0:49
, big welcome and make sure you hit follow on your podcast player of choice
0:51
. And to the returning listeners , as always
0:53
, a huge welcome back . So I've just
0:55
returned from an amazing mini break
0:57
in Edinburgh with my fiance . She's been
0:59
before , but it was my first trip to Scotland
1:01
and , wow , what a city . Definitely up there
1:03
in my top three cities I've ever visited
1:06
. So we did all the touristy stuff . We
1:08
walked up Arthur's Seat . It was pretty
1:10
bad weather when we got there . I couldn't
1:12
see over the edge , but I thought that added to the spectacle
1:15
, as it were . We did get to see the
1:17
view around Edinburgh on the third
1:19
day when the weather cleared up and again , amazing
1:21
stuff . Tried
1:26
some whisky , of course , and then we went on an excursion for a day . So we
1:28
were taken in a coach with a load of other people
1:30
we didn't know and we were taken to Inverness
1:33
and got to go on a boat on Loch Ness and
1:35
then drive back down through Glencoe
1:37
, which , wow , what scenery
1:39
. That is , um , definitely up
1:41
there , if not the best scenery I've seen in
1:44
the UK and I am biased because I live in the southwest
1:46
but wow . So if you're ever in Scotland
1:49
, I highly recommend that drive through
1:51
Glencoe . The scenery is second
1:53
to none . Wow , amazing , amazing
1:55
stuff . So that's enough about my recent excursion
1:58
to Edinburgh . In this episode
2:01
it's an interview episode and I'm
2:03
joined by none other than Jonathan Weiner . Now , if you're not familiar
2:05
with
2:07
Jonathan Weiner , he's a Grammy-nominated
2:09
mastering engineer and educator
2:11
as well . We go into more detail in this episode
2:13
with regards to that , but he's also
2:15
the host , the face , the educator
2:18
in the iZotope Are You Listening ?
2:20
series on YouTube , which is a fantastic
2:23
series that I highly encourage you to go and check
2:25
out , as I use it and
2:27
reference it a lot , both in the podcast
2:30
and when I'm working with clients with
2:32
mixing and mastering . So in this episode
2:34
, Jonathan talks about the intersection of mastering
2:36
and AI and how AI
2:39
is assisting music production . Jonathan
2:41
talks about some common misconceptions about
2:43
AI and mastering that he often encounters
2:45
and , importantly , Jonathan talks about what
2:48
AI can and cannot do , both
2:50
in mastering and mixing and in music production
2:52
in general . Jonathan talks about what producers
2:55
, artists and musicians should keep
2:57
in mind when incorporating AI into
2:59
their process and , importantly , Jonathan
3:01
gives advice to artists who are
3:04
navigating the landscape of DIY
3:06
mastering versus professional
3:08
mastering services and what you should consider
3:10
. So before we dive into this
3:12
episode , I just want to make you aware of
3:15
my 12 steps to a mastering ready
3:17
mix checklist . It's a totally
3:19
free checklist and with these 12
3:21
steps you'll be able to make the mastering process super
3:23
smooth and exciting and
3:25
make sure you can take your music up a notch in the mastering
3:27
process . So head over to synthmusicmastering.com
3:30
, forward slash free and you can download that
3:32
free checklist today . So that's enough from me , folks
3:35
. Here's my conversation with Jonathan Weiner
3:37
. Hey , folks , in this episode
3:40
I am very , very excited . Now , I
3:42
say that every time , but I genuinely am
3:44
excited , but
3:49
in particular this one to be joined by Grammy-nominated mastering engineer and Chief Mastering Engineer at iZotope
3:51
, Jonathan Weiner . Jonathan , thank you for joining me today , and how are
3:53
you ?
3:54
I'm fine , I have to amend
3:56
your introduction . I
3:58
am in fact the Chief Mastering Engineer of MWorks
4:01
Mastering . Also
4:03
, I teach music production and engineering at
4:05
Berklee College of Music . I've
4:07
got a few other titles , but I am formerly
4:10
the education director
4:12
at iZotope , involved in a fair
4:14
bit of product development and
4:17
also creating some learning
4:19
tools and social media and public speaking
4:21
and all of that . But just to set the record straight
4:24
, if you want to pretend that this was 18
4:27
months ago , then your introduction would have
4:29
been entirely accurate .
4:31
That will teach me . I thought I'd done my due
4:33
diligence with my research there
4:35
, but I was slightly out
4:37
with that one there . So thank you for setting the
4:39
record straight and I'm sure the
4:41
audience will appreciate that . And
4:44
so I've got your bio here , so hopefully I've
4:46
got a bit of this correct . So
4:48
I mentioned then a Grammy-nominated mastering engineer
4:50
, producer , educator and musician , and
4:52
you lead the development of groundbreaking audio
4:54
processing technologies , as you've mentioned , and
4:57
you also teach at Berklee College of Music ,
5:00
where you teach mastering and audio production
5:02
. So you've got over three decades of experience in the industry
5:05
and you've worked with a diverse range
5:07
of artists and contributed to countless successful
5:10
albums across various genres
5:13
. And today we'll
5:15
be discussing the intersection
5:18
I've got written here in this elaborate
5:20
introduction I've got of mastering and
5:22
artificial intelligence . Now this is
5:24
sort of like part of a mini
5:26
series I've got going on , so at the point
5:28
of this episode going live , a previous
5:30
episode would have been with Bobby Owsinski about his
5:33
book AI in music
5:35
production as well . So it's a nice little mini series
5:37
. So really excited for this one and I was saying
5:39
off air as well that your Are You Listening ?
5:41
series is probably my most signposted
5:44
suite of content
5:46
that I send the listeners to when they
5:48
ask me questions where I'm kind of like , actually you
5:50
know what I could give you the answer , but Jonathan
5:53
probably puts in a much more palatable
5:55
way than I do . So , uh
5:58
, yeah , very much so , and they probably heard
6:00
me mention it a few times on the podcast , so I thought it'd
6:02
be quite good if we can kick off with so
6:04
you mentioned about the development of audio technology
6:07
and whatnot If you could talk about how
6:09
you see artificial intelligence influencing
6:12
the mastering process and , in particular , what
6:14
are some common misconceptions about AI
6:16
and mastering that you often encounter .
6:18
Well , I've never actually heard anybody ask about
6:20
how AI might influence the mastering process . You
6:25
know , I think a lot about the way technologies
6:27
as they come across our desks actually
6:30
change not only our workflows and
6:32
the way we do things , but also the aesthetics of what
6:34
we do , and there's some famous examples
6:37
of that , going back through the ages , whether
6:39
you know , especially in the introduction
6:41
of digital signal processing , around the introduction
6:43
of limiters and even being able to
6:45
use a buffer , like once buffers
6:48
became affordable in computers
6:50
so that we could hold on to a signal
6:52
for a moment , analyze it , figure
6:54
out what the pitch was , figure out something
6:57
about the signal , compute sort
6:59
of the low
7:01
frequency period of a signal . You can't do
7:03
that in analog , you can only do it in digital
7:05
, and that resulted in a complete
7:08
sea change in terms of the aesthetics of sound
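[The buffer-based analysis described here — hold a short window of samples, then work out the pitch or low-frequency period — can be sketched with a plain autocorrelation pitch detector. This is a generic textbook illustration, not any product's algorithm; the buffer size, search range, and test tone are assumptions for the sketch.]

```python
import numpy as np

def estimate_period(buf, sr, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental period of a buffered signal via autocorrelation."""
    buf = buf - np.mean(buf)
    # Autocorrelation, keeping non-negative lags only.
    corr = np.correlate(buf, buf, mode="full")[len(buf) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)   # lag range covering fmax..fmin
    lag = lo + int(np.argmax(corr[lo:hi]))    # strongest periodicity in range
    return lag / sr                           # period in seconds

sr = 44100
t = np.arange(2048) / sr                      # one short buffer, as described
buf = np.sin(2 * np.pi * 110.0 * t)           # 110 Hz test tone
period = estimate_period(buf, sr)
print(1.0 / period)                           # recovered frequency, close to 110 Hz
```

[None of this is possible on a sample-by-sample analog path, which is the point being made: you need the buffer before you can measure the period.]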
7:10
. So , anyway , you
7:13
know , I'm not sure I
7:16
have a single answer to how AI
7:18
will affect the aesthetics , but I can
7:20
guarantee you that it will . One
7:25
of the things that first comes to mind is
7:28
the ability to engage in source separation
7:30
, which , at this point , I think
7:32
probably everybody is familiar with the idea of demixing
7:35
. You can take a full mix and separate
7:37
it into four stems or maybe more
7:39
. AudioShake and some other platforms are
7:41
extending the vocabulary . And
7:44
then what we do with that information
7:46
is fascinating
7:48
and varied , and more than simply
7:50
doing karaoke or remixing
7:52
. But we can take the signals that are extracted
7:54
and use them as side chains to
7:57
feed different signal processors in our mastering
7:59
chains or in our recording or mixing
8:01
production . I
8:04
think that there's a lot of sort
8:07
of interesting innovation that
8:10
falls out of simply having access to
8:12
components in a mixed signal . In
8:14
that way , you could tune your vocals as you're
8:16
mastering . I mean , that's pretty mundane
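[As a toy illustration of that side-chain idea, here is a minimal NumPy ducker: an envelope follower tracks a "key" signal (standing in for an extracted vocal stem) and pulls the full mix down while the key is loud. The signals, threshold, ratio, and time constants are all invented for the sketch; real demixed stems would come from a separation model.]

```python
import numpy as np

def envelope(key, sr, attack_ms=5.0, release_ms=100.0):
    """One-pole envelope follower on the side-chain key signal."""
    att = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    rel = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env = np.zeros_like(key)
    level = 0.0
    for i, s in enumerate(np.abs(key)):
        coeff = att if s > level else rel   # fast rise, slow fall
        level = coeff * level + (1.0 - coeff) * s
        env[i] = level
    return env

def duck(mix, key, sr, threshold=0.2, ratio=4.0):
    """Compress the mix only while the key (e.g. an extracted vocal) is loud."""
    env = envelope(key, sr)
    over_db = 20 * np.log10(np.maximum(env / threshold, 1.0))
    gain_db = -over_db * (1.0 - 1.0 / ratio)   # downward compression above threshold
    return mix * (10.0 ** (gain_db / 20.0))

sr = 44100
t = np.arange(sr) / sr
mix = 0.5 * np.sin(2 * np.pi * 220 * t)        # stand-in for the full mix
key = np.zeros_like(mix)
key[sr // 2:] = 0.8                            # "vocal" enters halfway through

ducked = duck(mix, key, sr)
# Before the key enters, the mix is untouched; afterwards it is pulled down.
```

[In practice the same key could feed any processor in the chain — a compressor, a dynamic EQ band — which is what makes separated stems useful beyond karaoke and remixing.]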
8:18
. Now here's sort of another take
8:21
on this , and I'm
8:24
gonna say this it may sound a little
8:28
bit harsh , but
8:31
I think it's something that we all need to really
8:33
acknowledge and embrace
8:36
, and that is so
8:39
we can talk a little bit about what AI and mastering
8:41
actually means , which was the second part of your question
8:43
. But I think we all have to allow
8:46
for the fact that on some very sort of
8:48
superficial level , AI-driven
8:51
tools in mastering may
8:53
do a reasonably good job . Now
8:55
, maybe not as full of creativity
8:58
and interesting results as
9:00
a human , but let's just say sort of baseline
9:03
. It's a competent kind
9:05
of processing , depending on how models are
9:07
trained et cetera . So now
9:09
let's take a look at . You know , the mastering
9:11
marketplace has been exploding over
9:14
the last bunch of years with the
9:16
advent of Ozone and other approachable tools
9:18
. We have more and more people coming
9:20
into the market who have relatively
9:22
little experience , and
9:25
it takes a while to get good at something . And
9:27
so you may see where I'm going with this . But if the
9:29
entry level can't measure up or
9:31
doesn't measure up to what the AI-driven
9:34
tools can do , that may exert
9:36
some pressure on the market in general . It
9:38
may sort of further dilute the market . It may
9:41
mean it's more difficult for people to enter the
9:43
market . So just in terms of the
9:45
activity , I think there's
9:47
potential for AI
9:49
to have an impact and maybe to
9:53
encourage people to look
9:55
at what it's doing and make sure you can do at least
9:58
as well as what the
10:00
best of AI-driven tools
10:02
can do . Now I
10:04
assume we'll get to the question of what
10:07
the AI sort of in mastering
10:09
or any other AI tools do and don't
10:11
do well , at least currently . But
10:14
let's just acknowledge that there are certain
10:16
things where
10:19
the tools may be competent : certain
10:22
kinds of ways of
10:24
adjusting signals , understanding signals and
10:28
I'm by no means an AI maximalist , right
10:31
, I'm not saying you know , the robots are coming to
10:33
take our jobs and they're going to take over and
10:35
all of our pets and , you know , social
10:37
life is going to be AI in five years
10:40
. So let's take a look at what AI
10:42
and mastering actually is . And
10:45
I'll start by saying
10:48
that probably the celebrity
10:50
of the AI and mastering world
10:53
is LANDR . So LANDR
10:55
is a company that was started probably 10 years ago , maybe
10:58
a little bit more , and
11:03
the original registered trademark was MixWizard , and so
11:05
many people are surprised to find out that the
11:07
intention of the platform originally
11:09
was to develop an auto-mixing
11:11
environment . It became
11:14
very evident very
11:16
quickly that mixing is hard , and
11:22
creating a mixing environment driven
11:24
by machine learning and we should differentiate
11:27
between machine learning and AI that
11:31
produced decent results probably
11:33
wasn't going to happen very quickly . So they pivoted to mastering
11:35
, because in some ways , mastering on
11:37
the surface of it is a much simpler thing to
11:39
understand . You know there
11:41
are a few things . Whenever you ask anybody
11:43
what happens in mastering , the thing
11:46
that they will probably say is it's where our projects
11:49
go to be made loud
11:51
, which is a proxy for setting level
11:53
and then probably to be made brighter
11:55
right , even though that's
11:57
not necessarily the thing you want to have happen . That's what
11:59
people think about mastering . So
12:02
if you take those two very high level
12:04
concepts , you know , setting the
12:06
level and getting the tone , which
12:09
is kind of a two-dimensional measurement across an
12:11
entire program , then you
12:13
could say , well , sure , you could measure level
12:16
. That's pretty easy . You can measure
12:18
tone . You can take kind of an
12:20
FFT average across
12:22
a certain amount of time in a program and
12:25
then you can compare
12:27
it against an average that's created
12:29
via machine learning yeah , all
12:32
right , sort of data mining and
12:35
say , okay , so this is how that varies
12:37
from that . We'll make an adjustment
12:39
, you know , we'll set the level differently
12:41
, probably make it hotter . We'll do
12:44
some kind of EQ , maybe some
12:46
kind of dynamics processing , in order to
12:48
change the dynamism , either
12:50
broadband or within parts of the spectrum
12:52
, and that's going to be mastering
12:55
. And then , beyond
12:57
that , some of the tools have now started
12:59
to try to either give the
13:01
users options driven by semantic
13:03
sort of attributes you
13:06
know , a soft versus an
13:08
aggressive version , you can check a
13:10
box or , in the case of the work
13:12
that we did at iZotope , we tried to use genre tags
13:14
as a way of designating certain
13:18
kinds of tonal curves and
13:20
treatments , which is interesting
13:22
. It just creates a little more nuance in the
13:24
result . But at the end
13:26
of it all , it really is what I just said
13:28
it's level and tone , and
13:35
it could be more or less automated . There
13:38
are certain platforms that fully automate it , like
13:41
put in your track , hit go , you
13:43
get something back , like it
13:45
or not . Here you go , and
13:47
then there are other tools . I'll sort of take it all the
13:49
way back to iZotope and Ozone
13:51
, where there's an assistant that
13:54
produces a treatment that you can simply
13:56
accept , but it also
13:58
lets you unpack it with as
14:00
much detail as you'd like . So you know
14:02
, to the point where you can go in and change the peak detector
14:04
to an average detector and the compressor . You
14:06
can moderate and modify
14:09
any of the parameters to your heart's content . So
14:12
there's the automated version of this kind
14:14
of tool and then there's the assistive or
14:17
, you know , your assistant , I think
14:19
, is the term that we used to use and
14:22
still is used by many tools and
14:24
certainly iZotope , so
14:28
hopefully that's a pretty good sort of overview
14:30
of what AI and mastering means . You
14:34
know what it doesn't mean ? We can talk about that too
14:36
.
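[The "level and tone" core Jonathan describes — average the spectrum across the whole program, compare it with a reference average, and derive corrective gains — can be sketched in a few lines of NumPy. The band edges, FFT size, and the synthetic "dull mix" versus "bright reference" below are assumptions for illustration, not any platform's trained model.]

```python
import numpy as np

def long_term_spectrum(x, sr, n_fft=4096, hop=2048):
    """Average magnitude spectrum across the whole program (the 'tone')."""
    window = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * window for i in range(0, len(x) - n_fft, hop)]
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    return freqs, mags.mean(axis=0)

def match_gains(freqs, mix_mag, ref_mag, bands):
    """Per-band dB gains that would move the mix's tone toward the reference."""
    gains = []
    for lo, hi in bands:
        sel = (freqs >= lo) & (freqs < hi)
        ratio = ref_mag[sel].mean() / (mix_mag[sel].mean() + 1e-12)
        gains.append(20 * np.log10(ratio))
    return gains

sr = 44100
t = np.arange(sr * 2) / sr
# A "dull" mix: strong low tone, weak high tone.
mix = np.sin(2 * np.pi * 100 * t) + 0.05 * np.sin(2 * np.pi * 8000 * t)
# A "bright" reference: both tones at full strength.
ref = np.sin(2 * np.pi * 100 * t) + 1.00 * np.sin(2 * np.pi * 8000 * t)

bands = [(20, 250), (250, 4000), (4000, 16000)]   # low / mid / high
freqs, mix_mag = long_term_spectrum(mix, sr)
_, ref_mag = long_term_spectrum(ref, sr)
gains = match_gains(freqs, mix_mag, ref_mag, bands)
print(gains)   # low band near 0 dB; high band asks for a large boost
```

[An automated masterer applies gains like these blindly; an assistive tool shows them to you, which is where the learning opportunity comes from.]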
14:36
Yeah , yeah , fantastic , yeah , just
14:38
to recap some of the bits you went through there in particular
14:41
. So you mentioned about the source separation , which
14:43
I think is really interesting , because it's the
14:45
same conversation I had with Bobby Owsinski
14:47
in a previous episode and we
14:49
mentioned the Beatles , or rather
14:51
he mentioned the Beatles film , whereby they separated
14:54
the mix there and they were actually able to separate
14:57
the drum stems that weren't originally
14:59
recorded separately , as it were , so
15:01
they were able to separate the kick and snare . I
15:03
might be doing a crude description of it , but I
15:05
think that's incredible being able to do that , because
15:07
I know I've had instances
15:09
where I've been sent tracks whereby there
15:12
needs to be something changed
15:14
level-wise before it hits the master
15:17
and the client has said I
15:19
no longer have that project
15:21
available , I haven't got access to it anymore
15:23
, which comes down to project management , but
15:26
that happens a lot . So that I think is incredibly
15:28
useful . And
15:30
it's interesting what you mentioned there about sort of like the barrier
15:33
to entry with mastering as well , with
15:35
these products being available , and
15:37
do you think , then ? Could
15:39
it potentially , if you've got the facilities
15:42
there to have , like you
15:44
described there , with a mastering assistant , would
15:47
that then mean there could be more ? Is
15:49
the barrier to entry lower then for
15:51
mastering engineers to enter
15:54
the market because they've got this assistant
15:56
and then they can learn on the job ? Would
15:58
that be a fair description ?
15:59
Yeah , there's probably three answers to that . I want to go back to the source
16:01
separation again for one second .
16:02
Go ahead .
16:03
Yeah , we're going to have parallel conversations
16:06
or interleaved conversations
16:08
, I guess . So
16:12
the sort of isolating drums
16:14
and low frequency instruments from
16:16
other tonal instruments . At
16:19
this point that's become kind of a relatively
16:21
simple task
16:23
. The thing that was fascinating about the Beatles
16:25
example , and the
16:28
place where the vocabulary of these
16:30
tools is getting extended , is being able
16:32
to separate voices . So being
16:35
able to separate John's voice from Paul's voice
16:37
, now
16:39
that . Or taking a four-part harmony and
16:42
being able to deconstruct it , so you've got the soprano
16:44
, the tenor , the , you know whatever , and
16:47
I think that's the direction we're moving
16:49
into . So , anyway
16:53
, it's becoming more capable and more subtle
16:55
and more nuanced , and
16:57
so that's that
17:00
. I just wanted to sort of feed into that
17:02
.
17:03
Yeah , of course .
17:04
So in terms of access
17:06
I mean
17:08
. So I'm going
17:11
to take your last point about
17:13
learning . I think one of the greatest benefits
17:15
about this technology is that , with
17:19
an open mind and with
17:22
a spirit of inquisitiveness
17:25
, you can sort of look at what these
17:27
tools are doing and , assuming
17:30
that they are informed by a good data set and
17:32
that's an assumption , yeah , we can
17:34
dive into that too , but , assuming
17:37
it's informed by a good data set observe
17:39
the outcome and then
17:42
say
17:44
, oh , I see , so here's what I've
17:46
been doing , or here are my mixes and
17:49
this is what these systems are proposing
17:51
all the time . So let me sort of see
17:53
what I can make of that information . You
17:55
know , my mixes are always a little dull , or my
17:57
levels are , you know , in
18:00
a good place or not in a good place . Or
18:02
you know , it seems like the low
18:04
end of my kick drum is always interacting in a
18:06
negative way with these tools . Maybe I should go back and rethink
18:09
my mixing . So they can provide
18:11
some insight into
18:13
the user's work and in that sense it really
18:15
is kind of a neat assistive technology
18:18
. In
18:20
terms of accessibility . I guess it's a double-edged
18:22
sword because on one hand
18:24
, yeah , you know in the same way that , like I
18:27
don't know . If you remember , there was something made by TC
18:29
Electronic called the Finalizer , which
18:31
is like a mastering engineer in a box . It was one
18:33
of the earliest hardware sort
18:35
of mastering wizard things no AI , but
18:38
it had a multiband compressor , an EQ
18:41
, a reverb and a widener and
18:43
you know , it was instant access
18:45
to mastering tools . For the
18:48
mastering process , you just push a button and suddenly
18:50
for what at that time was probably $1,100
18:53
, you had access to this . Now I've
18:55
actually got an extra one . If you'd like I'd send it to you
18:57
.
18:57
Oh , yes , please , I'd love to try it out . That would be amazing
18:59
.
19:00
They were pretty funny devices . So
19:03
anyway , the sort of access
19:06
to the tools for mastering
19:08
has sort
19:10
of accelerated . You know , through Ozone
19:12
through there was something called
19:15
T-RackS that was made by IK Multimedia
19:17
. That was , I think , probably earlier than Ozone
19:19
, or
19:26
right around the same time that it came on the market . So that's
19:29
provided greater
19:31
access . And now AI sort of does two things at once : it
19:33
speeds up the workflow , it
19:35
does increase access , but it can also be
19:37
more opaque . So the
19:39
learning that you take away from using something
19:41
like LANDR is
19:44
a little harder to come by . You have to make
19:46
your own observations and make your own deductions .
19:49
With something that's assistive , that
19:51
unpacks the processing in front of you . Then
19:53
you can say oh , now I hear what
19:55
I hear and I see why I hear it , and
19:57
then I can sort of get a little
19:59
bit of that insight more directly from
20:02
the feedback .
21:20
Yeah , it's kind of like reverse engineering , isn't it ? I think I've said that before on the
21:22
podcast , where you've got these tools and access to them . You say , okay , well , it's made
21:24
that decision , how has it got to that decision
21:26
? And then I can reverse engineer it from
21:28
there and understand and unpack what's happened
21:30
, whereas I guess , like you say , with a platform
21:32
like LANDR or possibly CloudBounce as well , you
21:38
kind of , like , it just spits out the end product and you don't necessarily know how
21:40
it's got there . That's right .
21:43
There are 22 online mastering
21:45
services at the moment .
21:46
Wow , are there really ?
21:47
I did not know that . 22 online mastering
21:49
services .
21:50
That's right , separate and distinct
21:52
kinds of processing engines . I'm
21:54
just going to make a note of that 22 .
21:58
I'm going to go and do a bit of research into it , because I did not know that and that was as
22:00
of yesterday . There might be more today .
22:02
Do you think , then , this is going off on a tangent
22:04
? Then you mentioned about LANDR starting out
22:06
as an auto-mixing service . Do
22:09
you think that that will eventually
22:11
be something where we
22:13
upload stems
22:16
for want of a better way of putting it , Stems would
22:18
be the right way and then it mixes it for
22:20
us ? Do you think that's something that's on the horizon ? Oh
22:22
, it's already happening . Is it really ?
22:24
Yeah , there's a platform called RoEx , started
22:26
by a fellow named David Ronan , another
22:30
one called OSmix in
22:33
the market and
22:40
actually at iZotope . We tried to sort of put something together
22:43
that was a mixing assistant within the context of
22:45
the Neutron plugin .
22:46
Yes , another one .
22:48
And so absolutely
22:50
, and you know , I think , for the purposes of
22:52
this discussion , I just want to
23:01
sort of state a focus
23:03
for us , and that is that we are talking about all of this technology in the
23:05
context of bespoke music
23:08
production , as opposed to
23:19
the sort of writing for commercials or advertising , where kind of good enough means something very
23:21
different than it does if you're trying to make music that makes people happy and inspires
23:23
their imaginations , as opposed to selling products . Because
23:26
you know , I just want to sort of say that
23:28
at this point so we don't
23:30
have to go into the yeah
23:32
, auto mixing is good enough for the people
23:35
who just want a 30 second spot that starts
23:37
slow , ends up fast and sounds
23:39
like reggae or something like that , because those engines
23:41
already exist . Back to the
23:44
sort of auto mixing idea
23:47
. I think that
23:49
the learning
23:51
of the systems is
23:53
improving , it's getting
23:55
faster and there's sort
23:57
of improvements that are iterative
23:59
, over time . You know , if you train a system long
24:02
enough , it gets better
24:04
. Yeah , yeah
24:06
, you know the difference between training like
24:10
a system , a machine learning system
24:12
, for one hour versus one
24:14
day , versus three days versus a month is
24:17
profound . So
24:21
, having said that , one
24:24
of the big problems with
24:26
auto mixing systems
24:28
is the user experience , the
24:30
design of the system , and I'll
24:32
just illustrate a couple of problems
24:35
that you get . First
24:38
of all , you have to tell the system what the
24:41
focus of a mix is . And
24:43
if there's drums , bass and vocals , sure
24:45
it could assume those
24:47
things , but what if it's an instrumental track ? Or
24:51
what if in a section there is no vocal
24:53
any longer ? Or what
24:55
if you have some other idea about
24:57
what should be the priority
25:00
of a mix ? So initially
25:03
you have to give the system some guidance , and that
25:05
requires user input . So
25:07
that already creates a layer of interaction
25:10
that is complicated . And
25:12
then , if you think about , you know , if
25:14
you've got your multi-track environment
25:18
, where you've got 60 or 70 or 80
25:20
tracks , you have to wait for
25:22
the system to scan everything , ingest
25:25
everything , identify everything . Hopefully
25:27
it's correct , hopefully it's grouped them in
25:30
the way that you want to group them . So there's a lot
25:32
of like pre-work for the system
25:34
to do to get to the point where you can even make use of
25:36
it . And then , how does that integrate into
25:39
your particular DAW ? Most
25:42
DAWs are not yet willing
25:45
to bring this into
25:47
their product environments
25:49
, probably to protect the IP
25:51
, probably to protect their market and
25:53
probably because it's a lot of work .
25:59
So we're a ways away , I think , from it being
26:02
sort of commonly used
26:04
and in use , but inevitably
26:07
I think it will be . Yeah , it's interesting
26:09
what you said there about how you
26:11
.
26:11
Obviously there is that layer of interaction whereby
26:13
, essentially , we are having to
26:15
prompt it to do what we want
26:18
it to do , and then it comes down to whether
26:20
or not we get the prompt right . And uh
26:23
, I've noticed this with generative AI , because I use generative
26:25
AI and I like to experiment with these different bits and pieces
26:27
, and if you don't prompt it correctly , then you're not going to
26:29
get . You'll get an output
26:31
. I'm going down the computer science
26:36
route now , but it won't logically be correct , it won't be quite
26:38
what you're after . So we've almost got to learn
26:40
another skill set now , which is how
26:42
good we are at prompting computers to
26:44
do what we want them to do . Is that a fair assumption
26:47
?
26:47
Absolutely . And in the engineering of the systems
26:49
there has to be some agreement about language
26:51
and
26:54
mapping the language to the sound
26:56
examples . You've
26:58
heard this term multimodal systems , which
27:01
is environments that
27:03
have the ability to work not
27:05
just with semantic prompts , but also
27:07
having either video or images
27:10
or audio examples
27:12
. A lot of the LLMs that
27:15
are in use right now have never listened to anything
27:17
. They've never heard a sound , and
27:20
so mapping the language to the sound
27:22
that you're after is not
27:24
a simple task . It's
27:27
hard to get it right .
27:29
Yeah , very interesting . I cannot wait to
27:31
see what it looks like in five
27:33
years' time , bearing in mind how far
27:35
we've come in the previous five years in
27:37
terms of how every platform now has
27:39
this AI component , because I think
27:41
there was a clamor for it , not just in audio
27:44
but in video as well
27:46
, in imagery and everything . All
27:48
these platforms , including the platform I'm using
27:50
right now for this podcast ,
27:52
Riverside . When
27:54
I started using it , there was a really basic
27:56
element . If not , there might not have been any AI
27:58
integration . There probably was , but now it's
28:01
just a hockey stick curve
28:03
in terms of what they're doing , which
28:05
is amazing . Jonathan , in the interest
28:07
of time I'm well aware we're
28:09
already 25 minutes in , I
28:12
think it'd be quite interesting to now jump on
28:14
what you mentioned earlier about what AI mastering can and cannot do for us . I think it'd
28:18
be quite cool if you could talk about that ,
28:20
well , basically , what it can and cannot
28:22
do for us and how it can assist
28:24
us as creatives .
28:26
Sure well , I mean
28:28
, I think , both for mixing
28:31
engineers and for um
28:33
sort of those who are learning or coming
28:36
into the marketing I'm sorry
28:38
, the mastering market , the activity . As
28:41
I said , it can give you some guidance and
28:44
that's a great use of
28:47
it
28:49
. It can also , you know , for somebody
28:51
who's creating an album of demos
28:53
and you just want to get everything into a place where you could send
28:55
it out for somebody , a producer
28:58
, to listen to , or someone at
29:00
a label to listen to . You
29:02
know , it's kind of an easy win . You
29:04
know the
29:07
problems that have not yet
29:10
been addressed are
29:12
how do you indicate
29:14
intent , how do you understand
29:17
musical context and
29:19
how do you facilitate the
29:21
sort of interesting and creative
29:23
things that one does in mastering ? When
29:26
you're interpreting a mix and you get a sense
29:28
of what you think the artist , what the
29:30
vision might be , and you take
29:32
it in a direction , often
29:35
that decision is informed by lots
29:37
and lots of information . It's not just about level
29:39
and tone , and
29:41
sometimes you come up with an idea to
29:43
do something that's slightly unconventional
29:46
. And sometimes
29:48
records that don't sound perfect
29:50
or don't conform to the model are the
29:52
most interesting records you
29:55
know . Probably the best most recent
29:57
example is the Billie Eilish
29:59
record from two records ago , which
30:02
was very different sounding
30:05
from pretty much anything else on the market . I
30:08
don't think a mix , an AI mix engine
30:10
, would have mixed it the way they mixed it , and I don't
30:12
think an AI mastering system
30:14
would have mastered it the way they mastered it and
30:17
, frankly , it's actually got a little
30:19
too much bass in it . You know
30:21
, from a technical standpoint it ain't correct
30:23
, but it's really great and it's really
30:25
cool and it's you know . One
30:27
can't argue with the commercial success . No
30:29
, not at all . So , drilling
30:33
down a level no pun intended . You
30:37
know the nuances
30:40
, such as the difference in the level between
30:42
an introduction and a first verse , or
30:44
a verse and a chorus , being
30:47
able to sort of program a system to assess that
30:49
difference and then make a change that
30:51
would actually be consonant with what was desired
30:56
, which is one of the things that sometimes we
30:58
do in mastering . You want to
31:00
maintain the impact from the intro to the first
31:02
verse . There's an
31:04
example of this that
31:06
the first experience that
31:09
I had with this is when I was mastering
31:11
a record . This is probably
31:13
seven or eight years ago for my daughter , who was in a punk
31:15
rock band and it started with a really janky
31:18
guitar intro and then the drums explode
31:20
after this four bar intro and
31:23
a few years later I
31:26
decided to use it as a sample
31:28
and
31:30
sent it to a couple of engines
31:33
, ai driven mastering systems
31:35
, and all of them completely obliterated
31:37
the contrast , destroyed it . You
31:40
know , suddenly I mean they did a great job
31:43
of matching the level by
31:46
compressing the heck out of it , because probably
31:48
they measured too much dynamic change
31:50
across either some part of the mix or the whole
31:52
mix . You know that lacked
31:54
all the context , for
31:56
you know what was built into the
31:58
mix . So
32:01
that's a problem , right ?
32:04
I mean , you know
32:06
, how do we make these systems in
32:10
such a way that they actually
32:12
can sort of take that
32:14
kind of consideration into account ? Well
32:17
, there's
32:20
another sort of whole arena that
32:22
I think requires greater exploration
32:24
and that is around genre , and
32:26
I know that , as I said earlier , at iZotope we
32:28
used genre tags to try to
32:30
give people a
32:32
way to
32:34
give input and
32:36
curate the results a little bit differently . But
32:39
frankly , I think genre as
32:42
a word is very hard for AI to
32:44
actually wrap its artificial
32:46
brain around . I think style transfer
32:49
and style is something that's easier
32:51
to understand . You know , if you were to describe
32:53
what makes disco disco
32:55
, you'd probably talk about the level of the hi-hat
32:57
and the snappiness of the drums and
33:00
, you know , the tone of the bass , and there
33:02
are very specific attributes that you could define
33:04
, um , but
33:06
what makes something kind of a
33:08
disco dance hit
33:10
from the standpoint of a genre , is sort of a very
33:12
different construct . And then other genres
33:15
, like , involve culture and
33:18
sort of much deeper concepts
33:21
that I think it's very hard for us to reduce
33:24
to the kinds of features that
33:26
are easy to measure and quantify
33:28
and build into a database .
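The distinction Jonathan draws can be made concrete with a toy sketch (hypothetical feature names and hand-picked numbers, not drawn from any real system): measurable style attributes can be compared numerically, while the cultural side of genre has no column in the table.

```python
# Hypothetical numbers for the measurable attributes listed above for disco:
# hi-hat level, drum snappiness, bass tone. Nothing here captures the
# cultural component of genre; that is exactly the point.
DISCO_STYLE = {"hihat_level": 0.8, "drum_snap": 0.9, "bass_brightness": 0.4}

def style_distance(track_features, reference):
    """Euclidean distance between a track's measured features and a style
    reference: easy to compute, but it describes style, not genre."""
    return sum((track_features[k] - reference[k]) ** 2 for k in reference) ** 0.5

my_track = {"hihat_level": 0.7, "drum_snap": 0.85, "bass_brightness": 0.5}
d = style_distance(my_track, DISCO_STYLE)  # small distance: close to the style
```

This is why style transfer is the easier target: it reduces to feature vectors, whereas genre involves the culture and context that never make it into the database.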
33:30
So those are all some areas
33:32
where I think AI
33:35
in mastering could improve .
33:37
What you mentioned there right at the beginning about
33:39
how you could use it for guidance and
33:41
demos sort of resonates
33:43
a lot with the conversations I've had on this podcast
33:46
over the last hundred plus episodes in which I've spoken
33:48
to producers , artists , and they say , yeah
33:50
, for example , Logic just at the end of last
33:52
year introduced the Mastering Assistant into Logic
33:54
and it's a way of asking , okay , what
33:56
could the track I'm mixing at the moment
33:58
sound like , eventually
34:00
, and it just gives you those guidelines . But
34:02
I totally agree
34:04
with what you say about the mastering engineer element
34:07
of it . And going back to that Billie Eilish record ,
34:09
in a way , sometimes you get those happy accidents
34:11
that you get out of mixing as well . You
34:13
do something and you're thinking , actually , I didn't mean to do
34:15
that , but it sounds really good , and
34:17
you're just not going to get that from artificial
34:20
intelligence at the moment . With
34:22
the growth mindset there , I'm saying you're not going to get it yet
34:25
. Let's say . But I
34:27
suppose that's where it comes back to genre , because I was
34:29
speaking to someone earlier today and
34:32
they were saying , um , can you help
34:34
me pinpoint what genre I am , because
34:36
they didn't know what they were . Like I've had
34:38
someone say it's this , someone says it's this , someone
34:40
says it's that and I don't really know what genre
34:42
of music this is . So I guess once again , it comes
34:44
down to being able to prompt correctly and
34:46
that sort of feeds into what you said about the genre
34:48
discussion around mastering and how it's
34:51
not quite there yet . I suppose
34:53
that'd be fair to say .
34:53
Yeah , that's right . I
34:56
mean , I think , defining genre , defining culture ,
34:58
I really think that
35:01
there's a cultural component in
35:03
all of this , and you
35:05
know it's especially true of genres
35:08
like jazz or
35:11
certain sort of what we would call world
35:13
musics , where
35:16
there's either harmonic vocabulary
35:18
or rhythmic vocabulary or even
35:20
the role of individual elements
35:22
. That's very different
35:24
from probably what's represented by
35:27
most of the data sets , you
35:30
know , which actually parenthetically brings
35:33
up the whole question about bias and data . You
35:36
know , if all of the records that you feed into a
35:38
system have some similarity
35:41
to them , chances are that that
35:43
can be both a strength but also a blind
35:45
spot or a weakness in
35:48
a machine learning system
35:51
.
35:52
Yeah , very interesting it really is
35:54
. In a previous
35:56
life I was a teacher of computer science
35:58
, so this is why it's all very interesting
36:01
to me . When you mentioned there about bias and
36:03
the whole idea about randomization in computing
36:05
as well , where it's pseudo-randomization
36:08
and things like that , and well , you can
36:10
go down a total rabbit hole in that instance , you
36:12
know .
36:12
But I'll give you a very specific example of
36:14
where this showed up , which was when we were training
36:17
the vocal assistant for Nectar , which
36:20
is another iZotope product , and
36:22
after some I
36:25
think it was a couple of days of learning
36:27
we started to recognize
36:29
that the system observed accurately
36:32
that every vocal that was
36:35
fed into the system was in tune , so
36:38
it assumed everything needed to be tuned and
36:42
that was a bias that was built
36:44
into the data which was not intended . So
36:47
we had to start again and kind
36:49
of make sure that we removed that
36:51
as a
36:54
feature , if you will . Interesting
36:56
.
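The kind of check that would have caught that tuning bias can be sketched in a few lines (a simplified illustration with made-up numbers, not iZotope's actual tooling): before training, scan each feature of the training set for a suspiciously narrow spread of values.

```python
def degenerate_features(training_features, min_spread):
    """Flag training-set features whose values barely vary, the way every
    training vocal in the story above was already in tune; a model can only
    learn one answer for such a feature."""
    flagged = []
    for name, values in training_features.items():
        if max(values) - min(values) < min_spread:
            flagged.append(name)
    return flagged

# Hypothetical numbers: cents off-pitch for each vocal in two training sets.
all_tuned = {"pitch_error_cents": [0.5, 1.0, 0.2, 0.8]}   # hidden bias
varied = {"pitch_error_cents": [0.5, 35.0, 12.0, 80.0]}   # expressive pitch kept

degenerate_features(all_tuned, min_spread=5.0)  # flags "pitch_error_cents"
degenerate_features(varied, min_spread=5.0)     # flags nothing
```

A real audit would look at full distributions rather than a simple range, but the idea is the same: if a feature never varies in the data, the model will assume it should never vary in the output.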
36:56
It's amazing when you hear the stories
36:59
behind the scenes , under the hood of how it was
37:01
all put together , because what we see as consumers
37:03
is this great piece of kit , but you don't
37:05
realize all the work and the dev work that's gone
37:07
into it and all
37:09
those bits and pieces , which I can imagine is quite
37:11
a feat to do . Hey , listen , with
37:13
you saying that ,
37:15
if I may , a
37:17
10-second plug , oh please
37:19
. In June of this year we're
37:21
hosting an AI and the Musician Symposium
37:24
in Boston , Massachusetts , at the Berklee College of Music
37:26
, partly just to give
37:28
musicians access to
37:30
the kinds of thinking that's going into the
37:32
design of tools that you're describing , so I just
37:34
wanted to mention that Not everybody's going to travel
37:36
to Boston in June , but
37:39
if you happen to be in the neighborhood
37:41
please attend .
37:43
I had this conversation with Matt
37:45
off air , who gives
37:47
the warm introduction , and he mentioned it to me ,
37:50
and I was like , June ? At that point it was February
37:52
. I was like , that's four months , I might be able to put something together
37:54
and get over to Boston . That would
37:56
be amazing . I won't lie , that'd be a nice little trip for
37:58
me . I just won't tell the girlfriend
38:00
. So we're going to go to Boston .
38:03
Look , it's
38:03
a great place to visit in June also , I'll
38:06
tell you .
38:08
So yeah , fantastic .
38:10
And for the audience listening , I'll put a link to that
38:12
and a bit more information in the episode description
38:14
as well , so you can go and check that out .
38:17
I know I'm going to
38:19
make a sweeping
38:21
statement here , but I think a lot of our listeners are in the United
38:23
States . So , yeah
38:25
, which is , maybe they like the English tone
38:27
, I'm not quite sure , it could be something like that
38:29
. Um , yeah , I'll
38:31
have to try that sometime uh
38:35
, jonathan , we're coming towards the end now
38:37
, so I think it'd be quite nice just to maybe wrap
38:39
things up with . If we've , maybe you could talk a bit about
38:42
, if you've got an artist who's navigating this landscape
38:44
of sort of AI mastering versus professional
38:47
mastering services , maybe what
38:49
sort of considerations they should take into
38:51
account if they're thinking , if
38:53
they're on the fence : do I go with AI mastering
38:55
or do I go with a professional ? Maybe
38:57
it'd be , what's your single biggest piece of advice
38:59
there ?
39:01
Well , what I'm about
39:03
to say absolutely reflects
39:05
my own bias and my own values , and
39:08
it's not just driven by
39:11
the sort of creative economy
39:13
either . But you know , when
39:15
you make a record a
39:18
year later , what are you going to remember
39:20
about that record ? Are you going to remember , you
39:22
know , that you spent $500
39:25
doing this or $1,000 doing that , or
39:27
are you going to remember the
39:30
ways in which the record succeeded , whether
39:33
it's commercially or in
39:35
terms of the artistic vision ? If
39:45
you look at it through that lens
39:47
, you can tell that I'm advocating for the
39:49
bespoke approach . And in the right interaction , you
39:52
also stand to learn more there still , because
39:54
if you're collaborating with someone , you
39:56
can get feedback , there's an iterative
39:58
process , and
40:01
so for all of those reasons
40:03
, I would absolutely advocate for the
40:05
sort of human collaboration . Yeah
40:08
, if
40:10
you're somewhere and I want to sort of say
40:12
this without it sounding too judgmental
40:15
, but over here is kind of the music you're making
40:17
for today and you just want to get something out the door
40:19
and test it in the market , and
40:21
then there's the thing over here , which
40:26
is the thing that potentially has some legacy
40:28
for you , if
40:30
you're more on this side of the spectrum
40:32
, then sometimes it may make sense to just
40:34
throw a track up and make sure that when it comes back it doesn't
40:37
sound too bright , it's not been squashed too
40:39
hard and you can put something out and
40:42
it's less expensive
40:44
, it takes less time to
40:46
do that . I mean you can get it back in
40:49
10 minutes instead of waiting for 10
40:51
days to book somebody , and
40:54
that may be exactly the right thing
40:56
to do . So there's
40:58
, you know , some gray
41:00
areas in between . It's not a fully binary
41:03
scenario , but
41:06
you know again , as
41:10
much as I love doing this work and as
41:12
much affection as I hold for all
41:14
of my peers who
41:17
are amazing mix engineers
41:19
and mastering engineers , I also recognize
41:21
that there's a real pressure
41:23
on the creative economy for artists , and
41:26
you can make an argument that if you make something
41:29
that sounds great , it's more likely to
41:31
succeed commercially . But you can also
41:33
make an argument that it's hard for artists
41:35
to make money and so you have to be careful
41:37
and smart about where you spend it . So you
41:39
can tell I'm not
41:42
recommending the
41:45
AI version because it's going to be better
41:47
in any instance , but
41:49
I understand that sometimes it might be good enough .
41:51
I suppose it comes down to
41:53
intent and situation
41:55
, I guess , in a way . What is
41:58
it ?
41:58
Because I think it goes back to
42:00
the clarification that you mentioned
42:02
earlier in this episode , whereby it was are
42:05
you creating music that you want to get out there
42:07
quickly ? For me , for example , TV
42:10
and film or something along those lines , some sort of sync
42:12
opportunity in that respect . That's right . Or
42:14
are you or are you trying to create something
42:16
that's , as you mentioned , legacy ? So I guess it really
42:18
does depend on what the intent is , and also budget
42:20
, like you say , the artist and the budget , and it comes
42:23
down to that as well . So I suppose it's quite
42:25
a tricky question , isn't it ? I mean , there are many factors
42:27
involved with regards
42:29
to what it is you actually want to do with the music
42:31
that may well influence your decision
42:34
.
42:34
Yeah , I mean I could say snarky things like my
42:37
clients are people who actually care about their music , but
42:40
I'm not going to say that , even though I just did . You
42:43
know , I mean it's kind of true . But
42:46
that's not the only way .
42:50
Jonathan , before we wrap things up , I
42:52
just want to mention something . We talked off-air about an Easter
42:54
egg in this episode , and that was with regards
42:56
to your T-shirt . So , for those of you watching
42:59
this on YouTube , take a look , because
43:02
it's a classic album cover , isn't it ? Obviously
43:05
, there are four cats on there as
43:07
well . If you can identify
43:09
that album cover , please do write
43:11
it in the comments and if you listen to
43:13
this on your podcast player of choice , head
43:15
over to YouTube and check it out and see if you can
43:18
figure out what it is . Jonathan
43:20
, before you go , you've already mentioned
43:22
what's happening in June , but
43:25
if our audience want to find out
43:27
a bit more about you and your
43:29
past and what you're doing at the moment , where should
43:31
they go online ?
43:32
Well , my mastering studio
43:34
and , by extension , other
43:37
things audio is called M-Works
43:39
Mastering . We're M-Works Studios
43:41
here in the Boston area , specifically in Somerville
43:44
, Massachusetts . You can find
43:46
me both in person and online
43:48
at Berklee College . I've written a couple
43:50
of mastering courses for the online school and
43:52
also teach at the brick-and-mortar school in Boston , and
43:57
I usually find my way to AES
43:59
and NAMM , and you
44:02
know , this last year I was speaking
44:04
in Norway and in
44:06
Japan , and
44:09
I think those were my two big trips recently
44:11
. But
44:17
you know I travel around sometimes and show up at schools and give talks , which is something I enjoy
44:19
doing very much . Have you got any plans for the UK
44:22
? You know , maybe . I
44:24
was just talking with some folks
44:26
about doing some work in London in
44:28
the next couple of months .
44:29
And if I
44:30
do , I might end up at BIM , or we'll see .
44:32
Yeah , that'd be amazing . I'll keep an eye out . I'm
44:35
due for a London trip . I think I'm going there . This is slightly
44:37
off topic , but for a podcast
44:41
what do they call it ? Fair ? I don't
44:43
think it's fair . Convention , that was it . That's
44:45
what I was looking for , a podcast convention .
44:47
Yes , yeah , something
44:49
like that . Jonathan
44:58
, it's great to meet you as well . As I mentioned , you have been referenced on this podcast
45:00
many a time , so it's great
45:02
having you here today and I will catch up with you soon
45:04
. Thanks very much , Mark . I appreciate it .