Automatic Gain Control in Python?

Sun May 29 13:27:20 EDT 2022

On 2022-05-29 16:17, Benjamin Schollnick wrote:
> Okay, you are capturing the audio stream as a digital file somewhere, correct?
> 
> Why not just use a 3rd party package to normalize the audio levels in the digital file?  It’ll be faster, and probably easier than trying to do it in real time…
> 
> e.g. https://campus.datacamp.com/courses/spoken-language-processing-in-python/manipulating-audio-files-with-pydub?ex=8
> 
> Normalizing an audio file with PyDub
> 
> Sometimes you'll have audio files where the speech is loud in some portions and quiet in others. Having this variance in volume can hinder transcription.
> 
> Luckily, PyDub's effects module has a function called normalize() which finds the maximum volume of an AudioSegment, then adjusts the rest of the AudioSegment to be in proportion. This means the quiet parts will get a volume boost.
> 
> You can listen to an example of an audio file which starts as loud then goes quiet, loud_then_quiet.wav, here <https://assets.datacamp.com/production/repositories/4637/datasets/9251c751d3efccf781f3e189d68b37c8d22be9ca/ex3_datacamp_loud_then_quiet.wav>.
> 
> In this exercise, you'll use normalize() to normalize the volume of our file, making it sound more like this <https://assets.datacamp.com/production/repositories/4637/datasets/f0c1ba35ff99f07df8cfeee810c7b12118d9cd0f/ex3_datamcamp_normalized_loud_quiet.wav>.
> 
> or
> 
> https://stackoverflow.com/questions/57925304/how-to-normalize-a-raw-audio-file-with-python
> 
> 
[snip]
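
For reference, the peak normalization described in the quoted text can be sketched in a few lines of NumPy. This only illustrates the idea (find the peak, then scale everything by one factor); it is not PyDub's actual implementation, and the function name peak_normalize and its target_peak parameter are made up for this example.

```python
import numpy as np

def peak_normalize(samples, target_peak=32767 * 0.9):
    """Scale 16-bit samples so that the loudest one reaches target_peak."""
    samples = np.asarray(samples, dtype=np.double)
    peak = np.abs(samples).max() if samples.size else 0.0
    if peak == 0:
        # Silence (or no data): there is nothing to scale.
        return samples.astype(np.int16)
    scaled = samples * (target_peak / peak)
    # Clip before converting back so that nothing wraps around.
    return np.clip(np.round(scaled), -32768, 32767).astype(np.int16)

quiet = np.array([0, 100, -200, 50], dtype=np.int16)
loud = peak_normalize(quiet, target_peak=20000)
```

Because a single factor is applied to the whole recording, a file that is loud in one part and quiet in another keeps that imbalance; evening it out needs a gain that changes over time.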

Here's a sample script that uses pyaudio instead of Audacity.

You can check whether the podcast is playing by checking the volume soon 
after it should've started.
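
A rough sketch of such a check, assuming 16-bit samples: the function name is_playing and the threshold value are placeholders for this example, and the threshold would need tuning for a real input.

```python
import numpy as np

SILENCE_THRESHOLD = 200  # assumed value; tune it for your input

def is_playing(data, threshold=SILENCE_THRESHOLD):
    """Return True if a chunk of 16-bit audio has a peak above threshold."""
    # Widen to int32 so that abs(-32768) doesn't overflow.
    chunk = np.frombuffer(data, dtype=np.int16).astype(np.int32)
    return chunk.size > 0 and int(np.abs(chunk).max()) > threshold

# Synthetic test data: a block of silence and a short sine wave.
silence = np.zeros(1024, dtype=np.int16).tobytes()
tone = (3000 * np.sin(np.linspace(0, 20 * np.pi, 1024))).astype(np.int16).tobytes()
```

In a stream callback, the same function could be applied to the incoming data a few seconds after the scheduled start.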

PyAudio itself only handles audio streams; to read or write files, you can pair it with the standard-library wave module.
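
A minimal sketch of the file side, assuming 16-bit mono audio and using the standard-library wave module: it writes one second of a 440 Hz test tone to a WAV file and reads it back. The file name tone.wav, the helper names and the tone itself are placeholders for this example.

```python
import os
import tempfile
import wave

import numpy as np

RATE = 44100
CHANNELS = 1
WIDTH = 2  # bytes per sample (16-bit)

def write_wav(path, samples):
    """Write 16-bit samples to a WAV file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(CHANNELS)
        f.setsampwidth(WIDTH)
        f.setframerate(RATE)
        # WAV data is little-endian.
        f.writeframes(samples.astype("<i2").tobytes())

def read_wav(path):
    """Read all frames of a 16-bit WAV file into a NumPy array."""
    with wave.open(path, "rb") as f:
        data = f.readframes(f.getnframes())
    return np.frombuffer(data, dtype="<i2")

t = np.arange(RATE) / RATE
tone = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

path = os.path.join(tempfile.gettempdir(), "tone.wav")
write_wav(path, tone)
restored = read_wav(path)
```

The byte strings returned by readframes() are in the same raw format that a 16-bit PyAudio stream reads and writes.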

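import time

import numpy as np
import pyaudio

WIDTH = 2       # bytes per sample (16-bit audio)
CHANNELS = 2
RATE = 44100    # samples per second
MAX_VOL = 1024  # target peak amplitude

GAIN_STEP = 0.2
LOUDER = 1 + GAIN_STEP
QUIETER = 1 - GAIN_STEP
MAX_GAIN = 100  # arbitrary cap; stops the gain growing without limit during silence

gain = 1.0

p = pyaudio.PyAudio()

def callback(data, frame_count, time_info, status):
    global gain

    # Decode the bytestream.
    chunk = np.frombuffer(data, dtype=np.int16).astype(np.double)

    # Adjust the volume, clipping so that loud samples don't wrap around
    # when they're converted back to 16-bit integers.
    chunk = np.clip(chunk * gain, -32768, 32767)

    # Adjust the gain according to the current peak volume. Use the
    # absolute value so that negative peaks count too.
    max_vol = np.abs(chunk).max() if chunk.size else 0

    if max_vol < MAX_VOL:
        gain = min(gain * LOUDER, MAX_GAIN)
    elif max_vol > MAX_VOL:
        gain *= QUIETER

    return (chunk.astype(np.int16).tobytes(), pyaudio.paContinue)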

stream = p.open(format=p.get_format_from_width(WIDTH), channels=CHANNELS,
                rate=RATE, input=True, output=True, stream_callback=callback)

stream.start_stream()

while stream.is_active():
    time.sleep(0.1)

stream.stop_stream()
stream.close()

p.terminate()
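
The gain logic can be tried without a sound card by feeding it synthetic chunks. The sketch below is a rough offline version of the callback's feedback loop; the function name process, the test signal and the chunk count are made up for this example, and the constants mirror the script above.

```python
import numpy as np

MAX_VOL = 1024
GAIN_STEP = 0.2
LOUDER = 1 + GAIN_STEP
QUIETER = 1 - GAIN_STEP

def process(chunk, gain):
    """Apply gain to one chunk and return (output, updated gain)."""
    out = np.clip(chunk.astype(np.double) * gain, -32768, 32767)
    peak = np.abs(out).max()
    if peak < MAX_VOL:
        gain *= LOUDER
    elif peak > MAX_VOL:
        gain *= QUIETER
    return out.astype(np.int16), gain

# A quiet 440 Hz tone with a peak of about 100, split into 50 chunks.
t = np.arange(50 * 1024) / 44100
signal = (100 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

gain = 1.0
peaks = []
for chunk in signal.reshape(50, 1024):
    out, gain = process(chunk, gain)
    peaks.append(int(np.abs(out).max()))
```

With a fixed step of 20% per chunk, the output peak doesn't settle on one value but keeps hovering around MAX_VOL; a smaller GAIN_STEP would give smoother but slower adjustment.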