Numpy combine channels

Sat Nov 10 08:05:00 EST 2012

On 11/09/2012 11:30 PM, Aahz wrote:
> In article <mailman.465.1347307911.27098.python-list at python.org>,
> MRAB  <python-list at python.org> wrote:
>> <snip>
>>
>> But should they be added together to make mono?
>>
>> Suppose, for example, that both channels have a maximum value. Their
>> sum would be _twice_ the maximum.
>>
>> Therefore, I think that it should probably be the average.
>>
>> >>> (a[:, 0] + a[:, 1]) / 2
>> array([1, 1, 2])
> I'd actually think it should be the max.  Consider a stereo where one
> side is playing a booming bass while the other side is playing a rest
> note -- should the mono combination be half as loud as the bass?

max would sound awful.

The right answer is to add them with weighting, then scale the whole
waveform according to a new calculation of clipping. Just like a mixer,
you have level controls on each input, then an overall gain.

So if the inputs were x and y, the output would be

    gain * (x_scale * x + y_scale * y)

but it'd normally be done in two passes, so as to minimize the places
that are clipped while maximizing the average level. It's also possible
to have the gain vary across the time axis, like an AGC (automatic gain
control). But you wouldn't want that as a default, as it'd destroy the
dynamics of a musical piece.
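
A rough sketch of the two-pass idea in numpy, assuming signed 16-bit
samples in an (n, 2) array with at least one sample. The name
mix_to_mono and the default weights are made up for illustration, and
the gain rule here is the simplest one (scale so the loudest sample
just fits, so nothing clips at all); a real mixer might accept a little
clipping to raise the average level.

```python
import numpy as np

def mix_to_mono(stereo, x_scale=0.5, y_scale=0.5, peak=32767):
    """Hypothetical helper: weighted mix of an (n, 2) signed array,
    followed by one overall gain for the whole waveform."""
    # Work in float so the weighted sum cannot overflow the integer type.
    x = stereo[:, 0].astype(np.float64)
    y = stereo[:, 1].astype(np.float64)

    # Pass 1: weighted sum, then find the largest magnitude it reaches.
    mixed = x_scale * x + y_scale * y
    loudest = np.max(np.abs(mixed))

    # Pass 2: one gain for the whole signal, chosen so the loudest
    # sample lands exactly on the peak value (assumption: no clipping).
    gain = peak / loudest if loudest > 0 else 1.0
    return np.rint(gain * mixed).astype(np.int16)

a = np.array([[1000, 3000], [-2000, 2000], [32767, 0]], dtype=np.int16)
mono = mix_to_mono(a)
```

Because the gain is constant over the whole array, the relative
dynamics are preserved, unlike an AGC.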

One more consideration.  If these are unsigned values (e.g. 0 to 255),
then you should subtract 128 from both signals to store them as signed
values, do your arithmetic, and then convert back by adding 128.  You
could do the algebraic equivalent directly on the unsigned values, but
the programming would be much simpler on signed values.
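
Something like this, as a sketch only, assuming 8-bit unsigned samples
in an (n, 2) array. The function name mix_unsigned8 is made up, and a
plain average stands in for whatever mixing arithmetic you actually
want.

```python
import numpy as np

def mix_unsigned8(stereo_u8):
    """Hypothetical helper: mix unsigned 8-bit stereo via signed math."""
    # Widen to int16 first so that subtracting 128 cannot wrap around,
    # then remove the offset: 0..255 becomes -128..127.
    signed = stereo_u8.astype(np.int16) - 128

    # Do the arithmetic on signed values (here: a plain average,
    # using floor division so the result stays an integer).
    mixed = (signed[:, 0] + signed[:, 1]) // 2

    # Put the offset back and return to the unsigned type.
    return (mixed + 128).astype(np.uint8)

u = np.array([[128, 128], [255, 128], [0, 255]], dtype=np.uint8)
out = mix_unsigned8(u)
```

Note that silence (128 on both channels) stays at 128 after the round
trip.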

-- 

DaveA