RTcmix Documentation

denoise

denoise was ported from the CARL music software package (denoise written by Mark Dolson), where it was a command-line filter program. The man page for denoise, as well as Dolson's comments below, is still essential reading, even though the interface to the cmix instrument is different. The variable names for the cmix pfields are the same as the names in the denoise documentation.

Here is the original Dolson documentation (ignore the command-line flag syntax; the flags have been converted to cmix pfields):

       denoise [flags] noise_soundfile < floatsams > floatsams

            flags:
            R = input sample rate (automatically read from stdin)
            N = number of bandpass filters (i.e., size of FFT) (1024)
            M = analysis window length (N-1)
            L = synthesis window length (M)
            D = decimation factor (M/8)
            b = begin time in noise_soundfile (0)
            e = end time in noise_soundfile (end)
            d = duration in noise_soundfile (end - begin)
            t = threshold above noise reference (in dB) (30)
            s = sharpness of filter turnoff (scale of 1 to 5) (1)
            n = number of FFT frames to average over (5)
            m = minimum gain of filter when off (in dB) (-40)
            noise_soundfile name must follow all flags

       This program tries to reduce unwanted background noise by
       setting up a bank of bandpass filters and controlling the
       gain of each filter. The gain is gradually turned down to
       a minimum (specified by -m) whenever the average energy
       level in that filter falls below some threshold. The
       threshold is set to be (approximately) -t dB above a
       predetermined noise floor. The floor is computed
       automatically at the start of the program as the average
       spectrum of the noise_soundfile, which is assumed to
       contain at least .25 seconds of noise without signal.

       This kind of noise reduction works best on hiss-type
       background noise (as opposed to pops and clicks) where the
       signal-to-noise ratio is good. The most important
       controls are as follows: The -t flag controls the number of
       dB above the noise floor at which the filter starts to
       turn off. Values between 20 and 40 dB may be appropriate.
       The -s flag controls the sharpness of the filter turnoff
       (as a function of energy level in the filter band). It is
       generally best left at a value of 1. The -m flag controls
       the extent to which the filter can be turned completely
       off. Values between -20 and -40 dB are probably most
       useful. The -n flag controls the extent in time over which
       the average energy in the filter is to be computed. This
       average is computed looking an equal amount ahead and
       behind the current time. As a rule, n frames of FFT
       correspond to a time window of (n*N/8) / R seconds. In
       practice, the -t and -m flags probably provide the most
       useful tradeoffs.

       The other flags (-N, -M, and -D) control the FFT size, the
       window size, and the FFT spacing, respectively. These
       flags should probably not be used except perhaps to
       increase N (always a power of two) for sample rates above
       16 kHz. As a rule, M is set equal to N, and D is set equal
       to M/8. For a factor of two decrease in compute time, D
       can be set to M/4; this may or may not affect sound
       quality.

       The idea of performing noise reduction in this fashion has
       been independently introduced by a number of different
       people. This implementation does not conform to any one
       in particular, except that the gain calculation is that
       suggested by Moorer & Berger in "Linear-Phase
       Bandsplitting: Theory and Applications," presented at the
       76th Convention 1984 October 8-11 New York of the Audio
       Engineering Society (preprint #2132).
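
To make the rules of thumb for the -n and -D flags concrete, here is a small Python sketch (not part of denoise or RTcmix) that computes the averaging window in seconds and a rough count of analysis frames. The function names are hypothetical, the default decimation of N/8 assumes M = N as recommended above, and the frame count ignores any padding at the edges of the file.

```python
import math


def averaging_window_seconds(n_frames, fft_size, sample_rate, decimation=None):
    """Approximate time span of n_frames FFT frames: n * D / R seconds."""
    if decimation is None:
        # Assumption: D = M/8 with M = N, the recommended setting.
        decimation = fft_size // 8
    return n_frames * decimation / sample_rate


def analysis_frames(num_samples, decimation):
    """Rough number of FFT frames needed to step through num_samples."""
    return math.ceil(num_samples / decimation)


if __name__ == "__main__":
    # Defaults from the man page at a 16 kHz sample rate.
    print(averaging_window_seconds(5, 1024, 16000))
    # One second of 16 kHz audio with D = M/8 versus D = M/4.
    print(analysis_frames(16000, 1024 // 8), analysis_frames(16000, 1024 // 4))
```

Doubling D from M/8 to M/4 roughly halves the number of frames, which is where the factor-of-two saving in compute time comes from. It also doubles the time span covered by the same n.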

Syntax in comment form:

/* ------------------------------------------------------------------------
   denoise

   Generally, you want to leave M, L and D alone. The duration of the
   excerpt from the noise reference file must be at least .25 seconds.

   First, open the input file as unit 0, the output file as unit 1,
   and the noise reference file as unit 2. (The noise reference file
   can be the same as the input file.) Denoise can only process one
   channel at a time.

   p0  input start time
   p1  input duration (use dur(0) to get total file duration)
   p2  N: number of bandpass filters (size of FFT) [must be power of 2]
   p3  M: analysis window length [N]
   p4  L: synthesis window length [M]
   p5  D: decimation factor [M/8]
   p6  b: begin time in noise reference soundfile
   p7  e: end time in noise reference soundfile
   p8  t: threshold above noise reference in dB [try 30]
   p9  s: sharpness of noise-gate turnoff [1-5]
   p10 n: number of fft frames to average over [5]
   p11 m: minimum gain of noise-gate when off, in dB [-40]
   p12 channel number (0=left, 1=right)

   NOTE: the previous version of denoise did not have the first two pfields
   listed above.  Scores for that version still work correctly with this one.

   Dolson's comments from the original source...

   Experimental noise reduction scheme using frequency-domain noise-gating.
   This program should work best in the case of high signal-to-noise with
   hiss-type noise. The algorithm is that suggested by Moorer & Berger in
   "Linear-Phase Bandsplitting: Theory and Applications" presented at the
   76th Convention 1984 October 8-11 New York of the Audio Engineering
   Society (preprint #2132) except that it uses the Weighted Overlap-Add
   formulation for short-time Fourier analysis-synthesis in place of the
   recursive formulation suggested by Moorer & Berger. The gain in each
   frequency bin is computed independently according to 

      gain = g0 + (1-g0) * [avg / (avg + th*th*nref)] ^ sh

   where avg and nref are the mean squared signal and noise respectively for
   the bin in question. (This is slightly different than in Moorer & Berger.)
   The critical parameters th and g0 are specified in dB and internally
   converted to decimal values. The nref values are computed at the start of
   the program on the basis of a noise_soundfile (specified in the command
   line) which contains noise without signal. The avg values are computed
   over a rectangular window of n FFT frames looking both ahead and behind
   the current time. This corresponds to a temporal extent of n*D/R (which
   is typically (n*N/8)/R). The default settings of N, M, and D should be
   appropriate for most uses. A higher sample rate than 16 kHz might indicate
   a higher N.

------------------------------------------------------------------------*/
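
The gain formula in the comment above can be tried out numerically with the following Python sketch. It is an illustration only, not the instrument's code: the function names are hypothetical, and the dB conversion assumes the usual amplitude convention of 10^(dB/20), which the comment does not spell out. Here t_db, m_db and sh stand for the t, m and s pfields.

```python
def db_to_amplitude(db):
    """Convert decibels to a linear amplitude ratio (assumed 20*log10 convention)."""
    return 10.0 ** (db / 20.0)


def bin_gain(avg, nref, t_db=30.0, m_db=-40.0, sh=1.0):
    """gain = g0 + (1 - g0) * (avg / (avg + th*th*nref)) ** sh for one bin.

    avg  -- mean squared signal in the bin
    nref -- mean squared noise reference in the bin
    """
    g0 = db_to_amplitude(m_db)   # minimum gain when the gate is "off"
    th = db_to_amplitude(t_db)   # threshold above the noise reference
    if avg <= 0.0:
        # No energy in the bin: the gate sits at its minimum gain.
        return g0
    ratio = avg / (avg + th * th * nref)
    return g0 + (1.0 - g0) * ratio ** sh


if __name__ == "__main__":
    nref = 1e-6
    for avg in (0.0, 1e-6, 1e-3, 1.0):
        print(avg, bin_gain(avg, nref))
```

With these assumptions the gain approaches 1 when the bin energy is far above th*th*nref and falls toward g0 as the energy drops. It sits halfway between g0 and 1 when avg equals th*th*nref and sh is 1.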

Sample scorefile:

load("denoise")

system("rm -f denoise1.snd")
system("F1 denoise1.snd")
output("denoise1.snd")

input("needs_cleaning_mono.snd")
inskip = 0
input("noiseref_mono.snd", 2)

N = 1024 * 4
M = N
L = M
D = M/8
b = 5.7
e = 6.0
t = 30
s = 1
n = 5
m = -40
chan = 0

denoise(inskip, dur(0), N, M, L, D, b, e, t, s, n, m, chan)
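
Before running a score like the one above, it can help to check the values against the constraints stated in the pfield list. The Python sketch below is a hypothetical helper, not something RTcmix provides. It only encodes the documented rules: N must be a power of 2, the noise reference excerpt must last at least .25 seconds, and s lies on a scale of 1 to 5.

```python
def check_denoise_params(N, b, e, s):
    """Return a list of problems found in a set of denoise pfield values."""
    problems = []
    # N (p2) must be a power of 2.
    if int(N) != N or N < 1 or (int(N) & (int(N) - 1)) != 0:
        problems.append("N must be a power of 2")
    # The noise reference excerpt (p6 to p7) must be at least .25 seconds.
    if e - b < 0.25:
        problems.append("noise reference excerpt must be at least .25 seconds")
    # s (p9) is documented as a scale of 1 to 5.
    if not 1 <= s <= 5:
        problems.append("s must be between 1 and 5")
    return problems


if __name__ == "__main__":
    # Values taken from the sample scorefile.
    print(check_denoise_params(1024 * 4, 5.7, 6.0, 1))
```

An empty list means none of the documented constraints is violated. It does not guarantee that the settings will sound good.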

Sample scorefile #2:

load("denoise")

system("rm -f denoise2.snd")
system("F2 denoise2.snd")
output("denoise2.snd")

input("needs_cleaning_stereo.snd")
inskip = 0
input("noiseref_stereo.snd", 2)

N = 1024 * 4
M = N
L = M
D = M/8
b = 5.7
e = 6.0
t = 30
s = 1
n = 5
m = -40

denoise(inskip, dur(0), N, M, L, D, b, e, t, s, n, m, chan=0)
denoise(inskip, dur(0), N, M, L, D, b, e, t, s, n, m, chan=1)