Visualising Speech

Using Python libraries to visualise audio

There are several Python libraries available that make it very easy to view waveforms in different ways. In this post, I’ll go through some of the ways to get started.

To begin, import the key libraries:
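The import listing itself is not shown here, so the following is only a minimal sketch of one workable set-up. It assumes the standard-library `wave` module, NumPy and Matplotlib, which may not be the libraries originally used. `load_wav` is a hypothetical helper name, and it only handles 16-bit PCM WAV files.

```python
import wave

import numpy as np
import matplotlib.pyplot as plt


def load_wav(path):
    """Read a 16-bit PCM WAV file.

    Returns (sample_rate, samples), with samples as floats in [-1, 1).
    Multi-channel audio is averaged down to mono.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        rate = wf.getframerate()
        n_channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())

    # Little-endian signed 16-bit integers, scaled to floats.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0
    if n_channels > 1:
        samples = samples.reshape(-1, n_channels).mean(axis=1)
    return rate, samples
```

With a recording on disk, `rate, samples = load_wav("weather.wav")` would load it; the filename is a placeholder, not the file used for the plots in this post.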

Waveform for “What’s today’s weather?”

This first plot shows the waveform in the time domain. It’s easy to see the silence at the beginning and end of this clip, but to see what else is going on I need to zoom in and get a closer view.
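A time-domain plot like this one can be sketched in a few lines. `plot_waveform` is a hypothetical helper, and the signal below is a synthetic stand-in (silence, a tone, silence), not the recording from this post.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_waveform(samples, rate, ax=None):
    """Plot amplitude against time in seconds and return the axes."""
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 3))
    times = np.arange(len(samples)) / rate  # sample index -> seconds
    ax.plot(times, samples, linewidth=0.5)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    ax.set_xlim(0, len(samples) / rate)
    return ax


# Stand-in signal: 0.25 s of silence, 1 s of a 200 Hz tone, 0.25 s of silence.
rate = 16000
tone = 0.5 * np.sin(2 * np.pi * 200 * np.arange(rate) / rate)
samples = np.concatenate([np.zeros(rate // 4), tone, np.zeros(rate // 4)])

ax = plot_waveform(samples, rate)
ax.set_title("Stand-in waveform")
```

Calling `plt.show()` afterwards displays the figure when running as a script.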

Waveform zoomed in on /s/ at around 0.6s

The first place to zoom in is at around 0.6s, which corresponds to the /s/ at the end of the word “what’s”. There is little visible structure here: /s/ is a voiceless fricative, so this part of the audio looks a lot like random white noise.

In comparison, zooming in at around 0.85s focuses on the /eɪ/ of “day”. This part of the audio looks quite different: the vowel is voiced, so you can clearly see a pattern that repeats at regular intervals.

Waveform zoomed in on /eɪ/ at around 0.85s
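Zooming in amounts to plotting a short slice of the sample array. The sketch below uses a hypothetical `plot_zoom` helper with an assumed 20 ms window, and a synthetic stand-in signal (noise near 0.6s, a harmonic tone near 0.85s) in place of the real recording, so the two views only imitate the noise-like and periodic shapes described above.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_zoom(samples, rate, centre, width=0.02, ax=None):
    """Plot a window of `width` seconds centred on `centre` seconds."""
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 3))
    start = max(0, int(round((centre - width / 2) * rate)))
    stop = min(len(samples), int(round((centre + width / 2) * rate)))
    times = np.arange(start, stop) / rate
    ax.plot(times, samples[start:stop])
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    ax.set_title(f"Zoom at {centre:.2f}s")
    return ax


# Stand-in signal, NOT real speech: noise where the /s/ would be,
# a 160 Hz tone plus one harmonic where the vowel would be.
rate = 16000
t = np.arange(int(1.2 * rate)) / rate
rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(len(t))
voiced = 0.5 * np.sin(2 * np.pi * 160 * t) + 0.25 * np.sin(2 * np.pi * 320 * t)
samples = np.where((t >= 0.5) & (t < 0.7), noise, 0.0)
samples = np.where((t >= 0.75) & (t < 0.95), voiced, samples)

fig, (ax_noise, ax_voiced) = plt.subplots(2, 1, figsize=(10, 5))
plot_zoom(samples, rate, centre=0.6, ax=ax_noise)
plot_zoom(samples, rate, centre=0.85, ax=ax_voiced)
fig.tight_layout()
```

A 20 ms window is a judgment call: it is wide enough to show a few cycles of a typical voiced sound, and narrow enough that individual cycles stay visible.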

Even with zooming in, there’s only so much you can see in the time domain. Matplotlib has its own spectrogram function to let us see what’s going on in the frequency domain. A spectrogram plots time on the x-axis, and frequency on the y-axis. The colour shows the amount of energy for particular frequencies at particular times.
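As a rough sketch of how that looks in code, Matplotlib's `Axes.specgram` computes and draws the spectrogram in one call. The `plot_spectrogram` wrapper, the window settings (`NFFT=512`, `noverlap=384`) and the stand-in signal are all assumptions made for illustration, not values taken from this post.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_spectrogram(samples, rate, ax=None, nfft=512, noverlap=384):
    """Draw a spectrogram and return (spectrum, freqs, times)."""
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 4))
    # specgram returns the power spectrum, the frequency bins (Hz),
    # the segment midpoints (s) and the image it drew.
    spectrum, freqs, times, image = ax.specgram(
        samples, NFFT=nfft, Fs=rate, noverlap=noverlap
    )
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Frequency (Hz)")
    ax.figure.colorbar(image, ax=ax, label="Power spectral density (dB)")
    return spectrum, freqs, times


# Stand-in signal: a 1 kHz tone with a little background noise.
rate = 16000
t = np.arange(rate) / rate
rng = np.random.default_rng(0)
samples = 0.5 * np.sin(2 * np.pi * 1000 * t) + 0.01 * rng.standard_normal(len(t))

spectrum, freqs, times = plot_spectrogram(samples, rate)
```

A larger `NFFT` gives finer frequency resolution at the cost of blurrier timing, so the value is worth adjusting for the sound being studied.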
