audioop: Manipulate raw audio data

Deprecated since version 3.11: The audioop module is deprecated (see PEP 594 for details).


The audioop module contains some useful operations on sound fragments. It operates on sound fragments consisting of signed integer samples 8, 16, 24 or 32 bits wide, stored in bytes-like objects. All scalar items are integers, unless specified otherwise.

Changed in version 3.4: Support for 24-bit samples was added. All functions now accept any bytes-like object. String input now results in an immediate error.

This module provides support for a-LAW, u-LAW and Intel/DVI ADPCM encodings.

A few of the more complicated operations only take 16-bit samples; otherwise the sample size (in bytes) is always a parameter of the operation.

The module defines the following variables and functions:

exception audioop.error

This exception is raised on all errors, such as an unknown number of bytes per sample.

audioop.add(fragment1, fragment2, width)

Return a fragment which is the sample-by-sample sum of the two fragments passed as parameters. width is the sample width in bytes, either 1, 2, 3 or 4. Both fragments should have the same length. Samples are truncated in case of overflow.
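The arithmetic described for add() can be sketched in pure Python for the 16-bit case. This is an illustration rather than the module's implementation: the helper name add_16bit is hypothetical, samples are assumed to be in native byte order, and "truncated" is assumed to mean clipped to the representable range.

```python
from array import array

def add_16bit(fragment1: bytes, fragment2: bytes) -> bytes:
    """Sample-wise sum of two 16-bit fragments, clipped to the int16 range."""
    a = array("h", fragment1)  # "h" = signed 16-bit, native byte order
    b = array("h", fragment2)
    if len(a) != len(b):
        raise ValueError("fragments must have the same length")
    # Assumption: overflow is handled by clipping, not by wrapping around.
    clipped = (max(-32768, min(32767, x + y)) for x, y in zip(a, b))
    return array("h", clipped).tobytes()

loud = array("h", [30000, -30000, 100]).tobytes()
print(list(array("h", add_16bit(loud, loud))))  # [32767, -32768, 200]
```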

audioop.adpcm2lin(adpcmfragment, width, state)

Decode an Intel/DVI ADPCM coded fragment to a linear fragment. See the description of lin2adpcm() for details on ADPCM coding. Return a tuple (sample, newstate) where the sample has the width specified in width.

audioop.alaw2lin(fragment, width)

Convert sound fragments in a-LAW encoding to linearly encoded sound fragments. a-LAW encoding always uses 8-bit samples, so width refers only to the sample width of the output fragment here.

audioop.avg(fragment, width)

Return the average over all samples in the fragment.

audioop.avgpp(fragment, width)

Return the average peak-peak value over all samples in the fragment. No filtering is done, so the usefulness of this routine is questionable.

audioop.bias(fragment, width, bias)

Return a fragment that is the original fragment with a bias added to each sample. Samples wrap around in case of overflow.
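Unlike add() and mul(), bias() wraps around on overflow. A minimal pure-Python sketch of that wrap-around for 16-bit samples follows; the helper name bias_16bit is hypothetical and native byte order is assumed.

```python
from array import array

def bias_16bit(fragment: bytes, bias: int) -> bytes:
    """Add bias to every 16-bit sample, wrapping around on overflow."""
    samples = array("h", fragment)  # signed 16-bit, native byte order
    # Shift into 0..65535, wrap with modulo, then shift back to signed range.
    wrapped = (((s + bias + 32768) % 65536) - 32768 for s in samples)
    return array("h", wrapped).tobytes()

data = array("h", [32767, 0, -32768]).tobytes()
print(list(array("h", bias_16bit(data, 1))))  # [-32768, 1, -32767]
```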

audioop.byteswap(fragment, width)

“Byteswap” all samples in a fragment and return the modified fragment. Converts big-endian samples to little-endian and vice versa.

New in version 3.4.

audioop.cross(fragment, width)

Return the number of zero crossings in the fragment passed as an argument.

audioop.findfactor(fragment, reference)

Return a factor F such that rms(add(fragment, mul(reference, -F))) is minimal, i.e., return the factor with which you should multiply reference to make it match as well as possible to fragment. The fragments should both contain 2-byte samples.

The time taken by this routine is proportional to len(fragment).
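Minimizing the RMS of fragment - F * reference is an ordinary least-squares problem, whose solution is F = sum(f_i * r_i) / sum(r_i^2). The sketch below computes that value in pure Python; the helper name best_factor is hypothetical, native byte order is assumed, and returning 0.0 for an all-zero reference is a choice made for this sketch.

```python
from array import array

def best_factor(fragment: bytes, reference: bytes) -> float:
    """Least-squares factor F that makes F * reference closest to fragment."""
    f = array("h", fragment)   # signed 16-bit, native byte order
    r = array("h", reference)
    if len(f) != len(r):
        raise ValueError("fragments must have the same length")
    cross = sum(x * y for x, y in zip(f, r))   # sum of f_i * r_i
    energy = sum(y * y for y in r)             # sum of r_i ** 2
    # Assumption for this sketch: an all-zero reference yields factor 0.0.
    return cross / energy if energy else 0.0

reference = array("h", [100, -200, 300]).tobytes()
fragment = array("h", [200, -400, 600]).tobytes()
print(best_factor(fragment, reference))  # 2.0
```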

audioop.findfit(fragment, reference)

Try to match reference as well as possible to a portion of fragment (which should be the longer fragment). This is (conceptually) done by taking slices out of fragment, using findfactor() to compute the best match, and minimizing the result. The fragments should both contain 2-byte samples. Return a tuple (offset, factor) where offset is the (integer) offset into fragment where the optimal match started and factor is the (floating-point) factor as per findfactor().

audioop.findmax(fragment, length)

Search fragment for a slice of length length samples (not bytes!) with maximum energy, i.e., return i for which rms(fragment[i*2:(i+length)*2]) is maximal. The fragment should contain 2-byte samples.

The routine takes time proportional to len(fragment).
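A sliding-window sum of squares is one way to find the maximum-energy slice in linear time. The sketch below illustrates the idea in pure Python; the helper name max_energy_offset is hypothetical, native byte order is assumed, and ties are resolved in favor of the earliest window, which is an assumption rather than documented behavior.

```python
from array import array

def max_energy_offset(fragment: bytes, length: int) -> int:
    """Return the sample offset of the length-sample window with most energy."""
    samples = array("h", fragment)  # signed 16-bit, native byte order
    if not 0 < length <= len(samples):
        raise ValueError("length must be between 1 and the number of samples")
    energy = sum(s * s for s in samples[:length])
    best_energy, best_offset = energy, 0
    for i in range(1, len(samples) - length + 1):
        # Slide the window: add the entering sample, drop the leaving one.
        energy += samples[i + length - 1] ** 2 - samples[i - 1] ** 2
        if energy > best_energy:
            best_energy, best_offset = energy, i
    return best_offset

data = array("h", [0, 0, 5, 9, 9, 1, 0]).tobytes()
print(max_energy_offset(data, 2))  # 3
```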

audioop.getsample(fragment, width, index)

Return the value of sample index from the fragment.

audioop.lin2adpcm(fragment, width, state)

Convert samples to 4-bit Intel/DVI ADPCM encoding. ADPCM coding is an adaptive coding scheme, whereby each 4-bit number is the difference between one sample and the next, divided by a (varying) step. The Intel/DVI ADPCM algorithm has been selected for use by the IMA, so it may well become a standard.

state is a tuple containing the state of the coder. The coder returns a tuple (adpcmfrag, newstate), and the newstate should be passed to the next call of lin2adpcm(). In the initial call, None can be passed as the state. adpcmfrag is the ADPCM coded fragment, packed as two 4-bit values per byte.

audioop.lin2alaw(fragment, width)

Convert samples in the audio fragment to a-LAW encoding and return this as a bytes object. a-LAW is an audio encoding format whereby you get a dynamic range of about 13 bits using only 8-bit samples. It is used by the Sun audio hardware, among others.

audioop.lin2lin(fragment, width, newwidth)

Convert samples between 1-, 2-, 3- and 4-byte formats.

Note

In some audio formats, such as .WAV files, 16-, 24- and 32-bit samples are signed, but 8-bit samples are unsigned. So when converting to 8-bit wide samples for these formats, you also need to add 128 to the result:

new_frames = audioop.lin2lin(frames, old_width, 1)
new_frames = audioop.bias(new_frames, 1, 128)

The same, in reverse, has to be applied when converting from 8-bit to 16-, 24- or 32-bit wide samples.
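The reverse direction amounts to removing the 128 offset first and then widening the samples. With audioop this would presumably be a bias() call at width 1 followed by lin2lin(); the pure-Python sketch below shows the same arithmetic for the 8-bit to 16-bit case. The helper name wav8_to_linear16 is hypothetical, and the widening is assumed to be a multiplication by 256 with output in native byte order.

```python
from array import array

def wav8_to_linear16(frames: bytes) -> bytes:
    """Convert unsigned 8-bit WAV samples to signed 16-bit samples."""
    # Subtract 128 to make each sample signed, then scale 8 bits up to 16.
    signed = ((b - 128) * 256 for b in frames)
    return array("h", signed).tobytes()  # "h" = signed 16-bit, native order

silence_and_peaks = bytes([0, 128, 255])
print(list(array("h", wav8_to_linear16(silence_and_peaks))))
# [-32768, 0, 32512]
```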

audioop.lin2ulaw(fragment, width)

Convert samples in the audio fragment to u-LAW encoding and return this as a bytes object. u-LAW is an audio encoding format whereby you get a dynamic range of about 14 bits using only 8-bit samples. It is used by the Sun audio hardware, among others.

audioop.max(fragment, width)

Return the maximum of the absolute value of all samples in a fragment.

audioop.maxpp(fragment, width)

Return the maximum peak-peak value in the sound fragment.

audioop.minmax(fragment, width)

Return a tuple consisting of the minimum and maximum values of all samples in the sound fragment.

audioop.mul(fragment, width, factor)

Return a fragment that has all samples in the original fragment multiplied by the floating-point value factor. Samples are truncated in case of overflow.

audioop.ratecv(fragment, width, nchannels, inrate, outrate, state[, weightA[, weightB]])

Convert the frame rate of the input fragment.

state is a tuple containing the state of the converter. The converter returns a tuple (newfragment, newstate), and newstate should be passed to the next call of ratecv(). The initial call should pass None as the state.

The weightA and weightB arguments are parameters for a simple digital filter and default to 1 and 0 respectively.

audioop.reverse(fragment, width)

Reverse the samples in a fragment and return the modified fragment.

audioop.rms(fragment, width)

Return the root-mean-square of the fragment, i.e. sqrt(sum(S_i^2)/n).

This is a measure of the power in an audio signal.
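The formula sqrt(sum(S_i^2)/n) can be written out directly for 16-bit samples. The helper name rms_16bit below is hypothetical and native byte order is assumed; the sketch returns a float, whereas audioop.rms() is expected to return an integer, so small differences in the fractional part are to be expected.

```python
import math
from array import array

def rms_16bit(fragment: bytes) -> float:
    """Root-mean-square of a fragment of 16-bit samples."""
    samples = array("h", fragment)  # signed 16-bit, native byte order
    if not samples:
        return 0.0  # choice made for this sketch: empty input has zero power
    return math.sqrt(sum(s * s for s in samples) / len(samples))

square_wave = array("h", [3, -3, 3, -3]).tobytes()
print(rms_16bit(square_wave))  # 3.0
```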

audioop.tomono(fragment, width, lfactor, rfactor)

Convert a stereo fragment to a mono fragment. The left channel is multiplied by lfactor and the right channel by rfactor before adding the two channels to give a mono signal.
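The channel mix can be sketched in pure Python for 16-bit samples, assuming the usual interleaved layout (left, right, left, right, ...). The helper name tomono_16bit is hypothetical, and both the clipping to the int16 range and the rounding toward minus infinity are assumptions of this sketch rather than documented behavior.

```python
import math
from array import array

def tomono_16bit(fragment: bytes, lfactor: float, rfactor: float) -> bytes:
    """Mix interleaved 16-bit stereo samples down to mono."""
    stereo = array("h", fragment)  # signed 16-bit, native byte order
    if len(stereo) % 2:
        raise ValueError("stereo fragment needs an even number of samples")
    mono = array("h")
    for left, right in zip(stereo[0::2], stereo[1::2]):
        value = math.floor(left * lfactor + right * rfactor)
        mono.append(max(-32768, min(32767, value)))  # assumed clipping
    return mono.tobytes()

stereo_data = array("h", [100, 200, -50, 50]).tobytes()
print(list(array("h", tomono_16bit(stereo_data, 0.5, 0.5))))  # [150, 0]
```

Passing factors of (1, 0) or (0, 1) extracts the left or right channel on its own.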

audioop.tostereo(fragment, width, lfactor, rfactor)

Generate a stereo fragment from a mono fragment. Each pair of samples in the stereo fragment is computed from the mono sample, whereby left channel samples are multiplied by lfactor and right channel samples by rfactor.

audioop.ulaw2lin(fragment, width)

Convert sound fragments in u-LAW encoding to linearly encoded sound fragments. u-LAW encoding always uses 8-bit samples, so width refers only to the sample width of the output fragment here.

Note that operations such as mul() or max() make no distinction between mono and stereo fragments, i.e. all samples are treated equally. If this is a problem, the stereo fragment should be split into two mono fragments first and recombined later. Here is an example of how to do that:

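import audioop

def mul_stereo(sample, width, lfactor, rfactor):
    # Split the stereo fragment into its left and right channels.
    lsample = audioop.tomono(sample, width, 1, 0)
    rsample = audioop.tomono(sample, width, 0, 1)
    # Scale each channel independently.
    lsample = audioop.mul(lsample, width, lfactor)
    rsample = audioop.mul(rsample, width, rfactor)
    # Expand each channel back to stereo (the other side silent) and recombine.
    lsample = audioop.tostereo(lsample, width, 1, 0)
    rsample = audioop.tostereo(rsample, width, 0, 1)
    return audioop.add(lsample, rsample, width)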

If you use the ADPCM coder to build network packets and you want your protocol to be stateless (i.e. to be able to tolerate packet loss), you should transmit not only the data but also the state. Note that you should send the initial state (the one you passed to lin2adpcm()) along to the decoder, not the final state (as returned by the coder). If you want to use struct.Struct to store the state in binary, you can code the first element (the predicted value) in 16 bits and the second (the delta index) in 8.
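One possible way to serialize the state with struct.Struct is sketched below. The names STATE_FORMAT, pack_state and unpack_state are hypothetical, network (big-endian) byte order is a choice made for this sketch, and treating None as equivalent to (0, 0) is an assumption about the coder's initial state.

```python
import struct

# Signed 16-bit predicted value followed by an unsigned 8-bit delta index,
# in network byte order: 3 bytes per packet header.
STATE_FORMAT = struct.Struct(">hB")

def pack_state(state):
    """Serialize the ADPCM state that was passed to lin2adpcm()."""
    if state is None:
        state = (0, 0)  # assumption: None stands for the all-zero state
    predicted, index = state
    return STATE_FORMAT.pack(predicted, index)

def unpack_state(data):
    """Recover the (predicted value, delta index) tuple for adpcm2lin()."""
    return STATE_FORMAT.unpack(data)

header = pack_state((-1234, 42))
print(len(header), unpack_state(header))  # 3 (-1234, 42)
```

Each packet would then carry these three bytes ahead of the ADPCM payload, so the decoder can start from any packet without having seen the previous ones.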

The ADPCM coders have never been tried against other ADPCM coders, only against themselves. It could well be that I misinterpreted the standards, in which case they will not be interoperable with the respective standards.

The find*() routines might look a bit funny at first sight. They are primarily meant to do echo cancellation. A reasonably fast way to do this is to pick the most energetic piece of the output sample, locate that in the input sample and subtract the whole output sample from the input sample:

import audioop

def echocancel(outputdata, inputdata):
    # 800 samples is one tenth of a second at 8000 samples per second.
    pos = audioop.findmax(outputdata, 800)
    out_test = outputdata[pos*2:]
    in_test = inputdata[pos*2:]
    ipos, factor = audioop.findfit(in_test, out_test)
    # Optional (for better cancellation):
    # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],
    #              out_test)
    # Fragments are bytes-like objects, so the padding must be bytes too.
    prefill = b'\0'*(pos+ipos)*2
    postfill = b'\0'*(len(inputdata)-len(prefill)-len(outputdata))
    outputdata = prefill + audioop.mul(outputdata, 2, -factor) + postfill
    return audioop.add(inputdata, outputdata, 2)