Quantization (sound processing)
In signal processing and digital audio, quantization is the process of approximating a continuous range of values (or a very large set of possible discrete values) by a relatively small set of discrete symbols or integer values. This article describes aspects of quantization related to sound signals.
After sampling, sound signals are usually represented by one of a fixed number of values, in a process known as pulse-code modulation (PCM). Some specific issues related to quantization of audio signals follow.
Telephony applications frequently use 8-bit quantization. That is, values of the analogue waveform are rounded to the closest of 256 distinct voltage values represented by an 8-bit binary number. This crude quantization introduces substantial quantization noise into the signal, but the result is still more than adequate to represent human speech.
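As an illustration of the rounding described above, the following sketch (an assumed implementation, not part of the original text) quantizes samples of a sine wave to the 256 levels of 8-bit PCM and checks that the error never exceeds half a quantization step:

```python
import math

BITS = 8
LEVELS = 2 ** BITS                    # 256 distinct values

def quantize(x, bits=BITS):
    """Round x in [-1.0, 1.0) to the nearest of 2**bits uniform levels."""
    step = 2.0 / (2 ** bits)          # quantization step size
    return round(x / step) * step

# A few samples of a 440 Hz tone at an 8 kHz telephony sample rate.
samples = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8)]
quantized = [quantize(s) for s in samples]
errors = [abs(s - q) for s, q in zip(samples, quantized)]

# Rounding to the nearest level bounds the error at half a step.
assert all(e <= 1.0 / LEVELS for e in errors)
```

The bounded but signal-correlated nature of this error is exactly the "quantization noise" the article refers to.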
By comparison, compact discs use a 16-bit digital representation, allowing 65,536 distinct voltage levels. This is far better than telephone quantization, but CD audio representing low signal levels would still sound noticeably 'granular' because of the quantizing noise. However, a small amount of noise is sometimes added to the signal before digitization; this deliberately added noise is known as dither. Adding dither eliminates the granularity and gives very low distortion, at the expense of a small increase in noise level. Measured using ITU-R 468 noise weighting, this is about 66dB below alignment level, or 84dB below FS (full scale) digital, which is somewhat lower than the microphone noise level on most recordings, and hence of no consequence (see Programme levels for more on this).
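A minimal sketch of dither in action, assuming a signal normalized to [-1, 1) and 16-bit quantization (the function name and details are illustrative, not from the article):

```python
import random

STEP = 2.0 / (2 ** 16)   # one 16-bit quantization step for signals in [-1, 1)

def dithered_quantize(x, rng=random):
    """Add dither spanning +/- one step, then round to the nearest level."""
    dither = (rng.random() - rng.random()) * STEP
    return round((x + dither) / STEP) * STEP
```

Because the dither randomizes which way each sample rounds, the quantization error is decorrelated from the signal, trading the 'granular' distortion for a small, steady noise floor.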
Optimizing dither waveforms
In a seminal paper published in the AES Journal, Lipshitz and Vanderkooy pointed out that different noise types, with different probability density functions (PDFs) behave differently when used as dither signals, and suggested optimal levels of dither signal for audio. Gaussian noise requires a higher level for full elimination of distortion than rectangular PDF or triangular PDF noise. Triangular PDF noise has the advantage of requiring a lower level of added noise to eliminate distortion and also minimizing 'noise modulation'. The latter refers to audible changes in the residual noise on low-level music that are found to draw attention to the noise.
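The triangular PDF mentioned above can be obtained by summing two independent uniform (rectangular-PDF) values, a standard construction sketched here as an assumption (the paper itself is not quoted):

```python
import random

def rpdf(step=1.0):
    """Rectangular-PDF noise: uniform on [-step/2, +step/2]."""
    return (random.random() - 0.5) * step

def tpdf(step=1.0):
    """Triangular-PDF noise on [-step, +step]: sum of two RPDF values."""
    return rpdf(step) + rpdf(step)

samples = [tpdf() for _ in range(100_000)]
# TPDF values stay within one full step and average to zero.
assert all(-1.0 <= s <= 1.0 for s in samples)
```

The triangular shape concentrates the dither near zero while still spanning a full step in each direction, which is why it decouples the error from the signal at a lower added-noise level than Gaussian dither.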
Noise shaping for lower audibility
An alternative to dither is noise shaping, which involves a feedback process in which the final digitized signal is compared with the original, and the instantaneous errors on successive past samples integrated and used to determine whether the next sample is rounded up or down. This smooths out the errors in a way that alters the spectral noise content. By inserting a weighting filter in the feedback path, the spectral content of the noise can be shifted to areas of the 'equal-loudness contours' where the human ear is least sensitive, producing a lower subjective noise level (-68/-70dB typically ITU-R 468 weighted).
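The feedback process described above can be sketched as a first-order error-feedback quantizer; this simplified illustration omits the perceptual weighting filter (it is an assumed implementation, not a production design):

```python
def noise_shape(samples, step):
    """Quantize samples to multiples of step, feeding each sample's
    quantization error back into the next sample's input."""
    out = []
    err = 0.0                          # previous sample's quantization error
    for x in samples:
        target = x + err               # fold the last error back in
        q = round(target / step) * step
        err = target - q               # error to push into the next sample
        out.append(q)
    return out

# On a constant input, the outputs alternate between neighbouring levels
# so that their average converges on the true value.
shaped = noise_shape([0.25] * 100, step=0.1)
```

Pushing the accumulated error forward in time gives the noise a high-pass character; a weighting filter in the feedback path then steers it further towards frequencies where the ear is least sensitive.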
24-bit audio is sometimes used undithered, because for most audio equipment and situations the noise level of the digital converter can be louder than the required level of any dither that might be applied.
There is some disagreement over the recent trend towards higher bit-depth audio. It is argued by some that the dynamic range offered by 16-bit is sufficient to store the dynamic range present in almost all music. In terms of pure data storage this is often true, as a high-end system can extract an extremely good sound from the 16 bits stored on a well-mastered CD. However, audio with very loud and very quiet sections can require some of the above dithering techniques to fit it into 16 bits. This is not a problem for most recently produced popular music, which is often mastered so that it constantly sits close to the maximum signal (see loudness war); however, higher resolution audio formats are already being used (especially for applications such as film soundtracks, where there is often a very wide dynamic range between whispered conversations and explosions).
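For rough context, the dynamic range of each bit depth can be estimated with the standard 6.02N + 1.76 dB approximation for a full-scale sine wave (a textbook formula assumed here for illustration; the article does not state it):

```python
def dynamic_range_db(bits):
    """Approximate SNR of an N-bit quantizer for a full-scale sine."""
    return 6.02 * bits + 1.76

for bits in (8, 16, 24):
    print(f"{bits}-bit: ~{dynamic_range_db(bits):.0f} dB")
# 8-bit:  ~50 dB
# 16-bit: ~98 dB
# 24-bit: ~146 dB
```

Each extra bit buys roughly 6 dB, which is why 24-bit recording leaves so much headroom below clipping before quantization noise becomes a concern.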
For most situations the advantage of resolution higher than 16-bit lies mainly in the processing of audio. No digital filter is perfect, but if the audio is upsampled and processed at 24-bit or higher, the distortion introduced by filtering will be much quieter (as the errors always creep into the least significant bits), and a well-designed filter can weight the distortion towards the higher, inaudible frequencies (though a sample rate higher than 48kHz is needed so that these inaudible ultrasonic frequencies are available for soaking up errors).
There is also a good case for 24-bit (or higher) recording in the live studio, because it enables greater headroom (often 24dB or more rather than 18dB) to be left on the recording without encountering quantization errors at low volumes. This means that brief peaks are not harshly clipped, but can be compressed or soft-limited later to suit the final medium.
Environments where large amounts of signal processing are required (such as mastering or synthesis) can require even more than 24 bits. Some modern audio editors convert incoming audio to 32-bit (both for an increased dynamic range to reduce clipping, and to minimize noise in intermediate stages of filtering), and some DAW environments (such as recent versions of REAPER and SONAR) use 64-bit audio for their underlying engine.