MiXViews (mxv)
Table of Contents
Introduction
Startup Arguments
General Paradigm
Files vs. Views
Types of Data
Targets and Sources
Adjusting the View
Selecting Edit Points
Shifting Edit Points
Command History and Undo Operations
Mxv Commands
Appendix Entries
-
Mxv Defaults and X Resources
-
Quick Overview of Linear Predictive Coding (LPC) Analysis
-
Quick Overview of Phase Vocoder (PVoc) Analysis
-
Dialog Panels
-
Progress Dialogs
-
Playing and Recording Sounds
Introduction
MiXViews is an editing, processing, and analyzing tool for digitized sounds
and other forms of binary data. It is built upon the InterViews X library,
and runs within the X window environment.
NOTE: All of the analysis/processing techniques available in
mxv are well known to those who work with computer-generated sounds. Manual
pages and other specific information about each of them would overly expand
the size of this document, so they are only described briefly. More information
is available elsewhere for those who wish to learn more.
Startup Arguments
MiXViews has three different types of command-line arguments:
-
InterViews arguments
These include all standard X application arguments like -fn, -rv, etc.
-
MiXViews arguments
Currently just two:
-autoplace, which causes the windows to map without user input
-plotwidth, which determines the width in pixels of the data display
-
File arguments
There are two of these, and they must immediately follow each file
to which they will apply:
-skip N, where 'N' is the number of seconds to skip before reading.
-duration M, where 'M' is the duration in seconds to read in the file.
-skip defaults to 0, and -duration defaults to the entire file length.
Example:
% mxv 1st.snd -skip 3.5 2nd.snd -duration 5.0
will skip 3.5 seconds into 1st.snd, and read from there to the end, and
will read the first 5 seconds of 2nd.snd.
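The rule that -skip and -duration bind to the file name they follow can be sketched as below. This is only an illustration in Python, not mxv's own argument parser; the function name parse_file_args and the dictionary layout are assumptions made for the example.

```python
def parse_file_args(argv):
    """Group each file name with the -skip/-duration options that follow it."""
    files = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ("-skip", "-duration"):
            if not files:
                raise ValueError(f"{arg} must follow a file name")
            if i + 1 >= len(argv):
                raise ValueError(f"{arg} requires a value in seconds")
            # Attach the option to the most recently named file.
            files[-1][arg.lstrip("-")] = float(argv[i + 1])
            i += 2
        else:
            # Documented defaults: skip 0 seconds, read the whole file (None).
            files.append({"name": arg, "skip": 0.0, "duration": None})
            i += 1
    return files
```

Applied to the example command line above, the first entry carries a skip of 3.5 with no duration limit, and the second carries a duration of 5.0 with the default skip of 0.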
General Paradigm
Mxv is based on the MVC (Model-View-Controller) paradigm of object-oriented
programming. What this means is that every chunk of data being edited,
whether it be a digital sound, an amplitude envelope, a linear predictive
coding file, or whatever, is represented in the program as a type of object
called a model. The user (this means you) interacts with this model via
another object called a view. The view lets you "see" the model, that is,
it displays the model's data in some format that makes sense of the values,
usually as a graph of some sort. Any given model can have any number of
views associated with it. All of these are related in that they are providing
"windows" into the data that is being edited. There will always be at least
one view; when the last open view is closed, the model will be destroyed,
usually after having been saved to disk. The third part of the MVC paradigm,
the controller, is an object that coordinates the communication between
the model and its view(s), and is not visible to the user.
Files vs. Views
Because mxv stores all the data being edited in virtual memory (rather
than continuously reading and writing to disk), each view or window does
not necessarily have a corresponding file on disk. When any on-disk file
is opened in mxv, it is displayed as a single view with the file's name
on the title bar. However, many windows may exist which have never been
written to disk (and which have no associated file), or multiple views
of a single file may be visible. Between the time a file is read by mxv
and the time any individual model is written out to a file, it is better
to think in terms of data objects rather than files.
Types of Data
The most common form of data file available for editing is the soundfile.
Mxv is able to read soundfiles with IRCAM-style, native NeXT, Sun (au),
AIFF-C, or WAVE format headers. Soundfiles can have any number of channels,
and can have sample formats of 8-bit linear, 8-bit mu law, 16-bit or 32-bit
linear, or floating point linear. Compressed data formats (other than mu
law) are not readable except on the SGI platform. Other types of data that
may be edited are as follows:
Name     | File Suffix | Description
LPC      | .lpc        | Analysis data from Linear Predictive Coding
Pitch    | .pt         | Pitch Track analysis data
Envelope | .evp        | General purpose data curves, stored as doubles
FFT      | .fft        | "Time Slice" Fast Fourier Transform analysis data
Pvoc     | .i or .pv   | Phase Vocoder Analysis data
Files with no suffix, or with any other suffix, will be read (or at least
attempted) as soundfiles. If the "read raw file" option is set in the
resource file or in the options menu command, mxv will attempt to read
files without headers, based upon the information you supply, and based
on the file suffix (i.e., a raw file with a .lpc suffix will be interpreted
as a raw LPC data file).
Targets and Sources
Many operations upon data objects involve taking a portion of one and putting
it into another. Sometimes it will be spliced in, sometimes mixed in, or
some other method of combining. In this document, the object being taken
from is called the source, and the object being added to is the target.
The general procedure for this in mxv is as follows: Using the mouse, the
user highlights some portion of a view (displays it in reverse video),
which is called selecting it. The technique for doing this is explained
later. The user then picks some other view, and again using the mouse,
sets an edit point in that view. If the user then performs a "splice in"
operation in this window, this window will be the target of the operation,
and the previously selected window will be the source.
If an operation (e.g. scaling) involves only a single view, then that
view is the target, and there is no source. In these cases, the target
is simply called the selection. For example, to scale a section of a sound,
the user would highlight (select) a portion of the visible sound, and then
choose "Adjust gain..." from the available modifying commands (see below).
The operation will take place upon the current selection only.
If the target and source formats differ in their maximum amplitude
capability (for example, mixing a floating point sound into a short integer
sound), the source material will be scaled to match the target in the following
way: for formats with fixed maximum amplitudes, the ratio of signal amp
to maximum amp will be preserved. Otherwise (for floating point and double-precision),
the ratio of signal amp to max amp for the file will be mapped -- for instance,
a source selection with an amp of 100,000 from a file with a max of 300,000
will be scaled to roughly .33 of the target file's maximum amplitude.
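The scaling rule can be sketched as follows. This is an illustration only, not mxv's code: the format names, the FIXED_MAX table, and the function names are assumptions chosen for the example, and the fixed maxima are simply the usual full-scale values for signed integer samples.

```python
# Assumed full-scale values for fixed-range formats (illustrative names).
FIXED_MAX = {"int8": 127.0, "int16": 32767.0, "int32": 2147483647.0}

def reference_max(sample_format, file_peak):
    """Fixed formats use their full-scale value; float formats use the file's peak."""
    return FIXED_MAX.get(sample_format, file_peak)

def scale_to_target(value, src_format, src_peak, tgt_format, tgt_peak):
    """Preserve the ratio of signal amplitude to reference maximum across formats."""
    ratio = value / reference_max(src_format, src_peak)
    return ratio * reference_max(tgt_format, tgt_peak)
```

For the case described above, scale_to_target(100000.0, "float", 300000.0, "int16", 0.0) gives one third of 32767, which is about 10922.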
Adjusting the View
The data (waveform) view window has a horizontal scroll bar which contains
a button that indicates the portion of the data currently visible. Each
of the three mouse buttons has a different effect when clicked in the scroll
bar:
-
Left: Scroll 1/4 page in the direction of the click.
-
Middle: Scroll 1 page in the direction of the click.
-
Right: Scroll to the location in the file indicated by the click.
The arrows on either end will scroll 1/4 page, or 1 page if shift-clicked.
The view resolution is determined by the horizontal and vertical zoom buttons.
The resolution may also be set directly via the view menu commands (see
below). In addition, the arrow keys can be used in combination with the
control key to adjust the resolution:
-
Up Arrow: Vertical zoom out
-
Down Arrow: Vertical zoom in
-
Right Arrow: Horizontal zoom out
-
Left Arrow: Horizontal zoom in
Note: Due to window manager interactions, the arrow keys may not function
correctly.
Selecting Edit Points
The fastest way to select editing points is with the mouse. In all data
display windows, the mouse buttons behave as follows:
-
Left Mouse: set insert point (or beginning of edit) at current location
-
Middle Mouse: set end of edit region at current location
-
Right Mouse: select entire file for editing
The <control> key modifies these as follows:
-
Control-Left Mouse
continuously update insert time and duration display, and display amplitude
for current frame
-
Control-Middle Mouse
continuously update selection time and duration display
-
Control-Right Mouse
select visible portion of data for editing
If the <shift> key is held down with any of the described combinations,
only the channel that is under the cursor will be selected. Without it
(the default), all channels will be selected. When the mouse button is
released, the current selection will be highlighted, and the
numerical values will be displayed in the panel below and to the right
of the data display panel (regardless of whether the <control> key is
down or not). Edit points may also be set via the Set Insert Point and
Set Edit Region commands (see below).
Shifting Edit Points
A previously selected insert point or region may be shifted to the right
or left by one unit using the 'l' and 'h' keys, respectively. This is intended
to parallel the 'vi' editor commands. A "unit" is defined as one horizontal
pixel or one data frame, whichever is wider. In addition, a selected (highlighted)
region may be expanded or shrunk in size by one unit using the '+' and '-'
keys (do not use the <shift> key with the '+'). Also, the selected region
may be "collapsed" into an insert point set to either the beginning or
the end of the region, using the <delete> key or the space bar, respectively.
Command History and Undo Operations
Each edit operation which alters the data contained in a window in any
way, e.g., sample values, sampling rate, file length, is stored in a command
queue which is bound to the window in which the operation took place.
The length of this queue is user-settable, and defaults to 65535.
In parallel with the command queue, a second queue of undo operations is
created, containing the operations needed to reverse the effect of each
edit. Whenever possible, Mxv stores only the portion of the waveform
data needed to perform the undo operation. This allows many undo
operations to be stored inexpensively (in terms of memory). However,
when operating on large chunks of data, the amount of memory devoted to
the undo and redo queue can grow quite large. If this becomes an
issue, reduce the length of the command queue (see below).
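The idea of a bounded command queue that saves only the affected region for undo can be sketched as below. This is a minimal Python illustration under stated assumptions, not the mxv implementation: the class name EditHistory and its methods are hypothetical, and the only edit modelled is an in-place overwrite.

```python
from collections import deque

class EditHistory:
    """Bounded undo/redo history that stores only the overwritten region."""

    def __init__(self, data, depth=65535):
        self.data = list(data)
        self.undo_stack = deque(maxlen=depth)  # oldest entries drop off when full
        self.redo_stack = []

    def overwrite(self, start, new_values):
        end = start + len(new_values)
        if start < 0 or end > len(self.data):
            raise ValueError("edit region lies outside the data")
        saved = self.data[start:end]           # keep just the part being changed
        self.data[start:end] = new_values
        self.undo_stack.append((start, saved))
        self.redo_stack.clear()                # a new edit invalidates redo

    def undo(self):
        if not self.undo_stack:
            return False
        start, saved = self.undo_stack.pop()
        end = start + len(saved)
        self.redo_stack.append((start, self.data[start:end]))
        self.data[start:end] = saved
        return True

    def redo(self):
        if not self.redo_stack:
            return False
        start, values = self.redo_stack.pop()
        end = start + len(values)
        self.undo_stack.append((start, self.data[start:end]))
        self.data[start:end] = values
        return True
```

Lowering depth shrinks the memory held by the history at the cost of fewer available undo steps, which is the trade-off described above.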
Mxv Commands
Different command menus are available depending on the type of data being
edited, but all types have the view, file, edit, display, and options menus.
Most commands have a key equivalent. To execute a command via its key equivalent,
make sure the cursor is within the bounds of the waveform display window,
then hold down the <control> key and press the key indicated in the
menu item.
File Menu
-
Set default dir...
allows user to enter a directory name to be used as the default directory
for opening/saving the current data type.
-
New...
allows user to create an empty file in memory (not written to disk)
for scratch use. Appropriate information for each data type may be set.
-
New Type ->
pull-out menu for creating a new file of selected type (as opposed
to new... which always creates the same type as the current window).
-
Open...
pops up an open panel to let user choose another file to open. The
default directory to open can be set for each data type via either the
set default dir command (see above) or the X resource file (see below).
User may enter an amount of time (measured in seconds) to skip into the
file before reading, and/or a duration of time to read in the file. The
default is always a skip of 0 and the full duration. If a file is opened as
a segment, i.e., nonzero inskip and/or partial duration, it will be displayed in
mxv with a new name constructed from the string "tmp_" plus the skip and
duration values, followed by the name of the file. This avoids the possibility
of accidentally overwriting the original file with the segment. If a raw
(headerless) file is opened, a dialog panel will be displayed asking the
user for information about this file. Mxv has no way of checking the validity
of this information, so be careful. Users can specify arbitrary-sized amounts
of data to be skipped (for unknown header types, for instance) prior to
reading data. This value is independent of any inskip time that might
have been specified in the open panel. Users may also specify whether the
byte order of this raw data needs to be swapped.
-
Save...
(re)write file to disk. If file is a temp or untitled file, user will
be prompted for a name via the "save as" command. File will always be written
out with the same header format as the file on disk.
-
Save as...
write file out with a new name (changes name of current file to match).
The user may choose the type of header (if any) used to write a new data
file. If overwriting an existing file, the user may choose to force a new
header type or preserve the existing one.
-
Save selection to file
write out selected region as a new file. User is prompted for
a filename.
-
Revert
re-read file from disk, undoing all changes since the last save to
disk. File must have been saved to disk at least once for this to work
(i.e., temp files and untitled files cannot be reverted).
-
Change name...
prompts user for new name for current file. The user will be warned
if the new name matches an existing name in the default directory for the
particular data type.
-
Change file comment...
allows user to edit a text comment for data files. This is stored as
part of the file header on disk. A text editor window will be displayed
in which the current comment (if any) will be displayed. After editing,
the changes must be saved by selecting "save comment" in the Edit menu.
-
File information...
displays a window with all appropriate data about the file, including
name, sample rate, length, file size in megabytes, and number of channels.
-
Data dump of selection...
prompts for the name of a dump file into which an ASCII representation
of the current selection will be placed.
-
Show program version
Display current version number and copyright.
-
Quit program
Close all views of all models and exit program. Mxv will warn you about
any unsaved models.
View Menu
The view menu contains all commands relating to the visual display of the
data.
-
New view of selection
Open new window (new view of current model) with the display set to
show the current selection.
-
Zoom to selection
Expand the current selection to fill the entire window.
-
Zoom to full
Set display to view the entire model.
-
Set frame view...
Prompt for desired horizontal frame or sample display.
-
Reset vertical scale
Return the vertical scaling to its default (starting) value.
-
View Mode ->
Pop-up submenu with commands for switching the display between channel
mode (successive channels of the data displayed in stacked graphs) or frame
mode (cascade display of amplitudes of all channels as a function of time).
This latter mode may only be used on files with at least 4 channels.
-
Channel Display ->
Pop up submenu with channel display commands (see below).
-
Display copy buffer
Open new window displaying the internal copy buffer, if there is anything
in it. This buffer is filled via the copy command, and by the remove and
splice out commands.
-
Close current view
Close this window. If this is the last remaining view of a model, mxv
will warn you if there are unsaved changes to this model.
Channel View Submenu
In channel view windows, these commands affect the number of graphs that
are stacked and visible in the view. In frame view mode, these commands
affect the horizontal display, because each visible horizontal plot represents
all channels for a given frame.
-
Set channel view...
Prompt for specific set of channels to display.
-
Add channel
Add next available channel to the displayed channels. If there are
no additional channels, the command has no effect.
-
Remove channel
Remove highest displayed channel from the display. No effect when only
one channel is displayed.
-
Shift up/down
Shift channel display up or down by one increment. For example, if channels
0 and 1 were displayed, choosing shift up would display channels 1 and
2. Causes horizontal shift in cascade displays. No effect if no additional
channels available.
Edit Menu
-
Undo
restores the current window to the state immediately preceding the
last edit operation on the stack. Subsequent undo operations revert
previous edits. The number of undo operations available is determined
by the Command Undo Depth in the Editor Options
command.
-
Redo
re-runs the edit operation which has most recently been undone.
-
Set insert point...
prompts user for exact location of desired insert point. This bypasses
the "quantization" of the insert point due to the current horizontal display
resolution.
-
Set edit region...
prompts user for exact location of desired selection. Also bypasses
quantization as in previous command.
-
Copy
copies the current selection into the global copy buffer.
-
Copy to new
copies the current selection into a new temporary file
-
Remove
copies the current selection into the buffer and then erases (zeroes)
the region.
-
Remove to new
does a Copy to new followed by an erase.
-
Erase
erases (zeroes) region without storing it in copy buffer.
-
Splice out
copies the current selection into the global buffer and then splices
it out of the original file, i.e., shortens the file by the length of the
region and moves the remainder to the left.
Note: Mxv will not allow splicing out of the entire file. Minimum file
length is at least two frames.
-
Splice out to new
does a Copy to new followed by a splice out.
-
Delete
splices out the region without storing it in copy buffer.
-
Mix
adds the contents of the global buffer or current source, sample by
sample, to the file beginning at the insertion point.
-
Replace
replaces (destructively writes) the contents of the global buffer or
current source to the file beginning at the insertion point.
-
Crossfade...
allows user to combine two segments with a crossfaded overlap zone.
The user may choose a curve from a set of predefined choices, or read one
from a .evp file on disk.
-
Splice in
inserts the contents of the global buffer or current source at the
insertion point, shifting the contents to the right. Opposite of splice
out.
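The difference between the four buffer operations above (splice out, splice in, mix, replace) can be sketched on plain Python lists. This is an illustration, not mxv's code; the function names are hypothetical, and discarding buffer material that would run past the end of the target in mix and replace is an assumption, since the text does not say how that case is handled.

```python
def splice_out(data, start, end):
    """Return (shortened data, removed region); the remainder moves left."""
    return data[:start] + data[end:], data[start:end]

def splice_in(data, insert_at, buffer):
    """Insert the buffer, shifting existing contents to the right."""
    return data[:insert_at] + buffer + data[insert_at:]

def mix(data, insert_at, buffer):
    """Add the buffer to the data sample by sample; length is unchanged."""
    out = list(data)
    for i, value in enumerate(buffer):
        if insert_at + i < len(out):   # assumption: overflow is discarded
            out[insert_at + i] += value
    return out

def replace(data, insert_at, buffer):
    """Overwrite the data with the buffer; length is unchanged."""
    out = list(data)
    for i, value in enumerate(buffer):
        if insert_at + i < len(out):   # assumption: overflow is discarded
            out[insert_at + i] = value
    return out
```

Splicing a region out and splicing the same buffer back in at the same point restores the original data, which is the sense in which the two commands are opposites.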
Modify Menu (for Sounds)
-
Adjust gain...
multiplies the current selection by an amplitude factor.
-
Apply envelope...
multiplies the amplitude values of the selected region by selected
source envelope. If no source is currently selected, program will prompt
for one.
-
Fade in...
multiplies the volume of the selected region by either an increasing
linear gain curve or a linear dB curve.
-
Fade out...
multiplies the volume of the selected region by either a decreasing
linear gain curve or a linear dB curve.
-
Reverse
reverses the current selection in time.
-
Transpose...
transposes the current selection by either an equal-tempered or linear-octave
interval, or by any frequency ratio. The result will be displayed in a
new window -- the original file is unchanged.
-
Time shift...
not yet implemented
-
Filter ->
pull-out Filter submenu (see below)
-
Insert space...
allows the user to splice in any number of seconds worth of silence,
starting at the insert point.
-
Normalize values
scales amplitude values of current selection between 1.0 and -1.0.
This is useful for converting sounds into control envelopes. The user will
be warned if attempting this on a non-floating-point data type (which would
turn the data into random 1's, 0's and -1's).
-
Add delay...
shifts selection to the right by any number of frames
-
Add DC offset...
adds a fixed offset to every sample
-
Remove DC offset
applies a sharp filter to remove all frequencies below 20 Hz. Useful
for removing LF rumble, etc.
-
Swap bytes
converts little-endian data into big-endian and vice versa. Useful
for converting data which was read in the "wrong" way.
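The two fade shapes mentioned under Fade in (a linear gain curve versus a linear dB curve) can be sketched as follows. This is an illustration only: the function names are hypothetical, and the -60 dB starting level for the dB curve is an assumption, since the manual does not state what floor mxv uses.

```python
def fade_in_gains(n, mode="linear", floor_db=-60.0):
    """Return n gain factors rising to 1.0 across the selection."""
    if n < 2:
        raise ValueError("need at least two frames to fade")
    gains = []
    for i in range(n):
        t = i / (n - 1)                      # position in the fade, 0.0 to 1.0
        if mode == "linear":
            gains.append(t)                  # gain rises in a straight line
        elif mode == "db":
            db = floor_db * (1.0 - t)        # level rises in a straight line in dB
            gains.append(10.0 ** (db / 20.0))
        else:
            raise ValueError("mode must be 'linear' or 'db'")
    return gains

def apply_gains(samples, gains):
    """Multiply each sample by the matching gain factor."""
    return [s * g for s, g in zip(samples, gains)]
```

A fade out would use the same curves in reverse order. The dB curve spends more of its length at low gain, so it tends to sound more even to the ear than the linear curve.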
Filter Submenu (Sounds only)
-
Low pass
current selection is filtered by a first-order (one pole) low pass
filter.
-
Resonant
current selection is processed by a second-order (two pole) resonant
filter. Center frequency, bandwidth ("Q"), and gain-mode can be set.
-
Comb
current selection is filtered by a comb filter. Value can be set as
either center frequency in Hz or as a loop (delay) time in seconds. The
"Q" of the filter is measured as the time it takes the impulse response
to fall by 60 dB. NOTE: This routine, like the transposer, puts its output
into a new window.
-
Elliptical
current selection is processed by a variable-length elliptical filter,
which is capable of very sharp rolloff curves, and may function as a high,
low, or bandpass filter with variable stopband ripple and attenuation.
The passband cutoff is the frequency at which the rolloff will begin.
The stopband cutoff is the frequency at which the amplitude has been reduced
by the value given in the stopband attenuation field. Therefore, if the
stopband is greater than the passband, a low-pass filter will be created,
otherwise a high-pass filter will be created. The sharpness of this filter
is infinitely variable through changes in the passband, stopband, and attenuation.
If the bandpass stopband is non zero, it will be used as the max attenuation
point for the side of a bandpass filter not set by the stopband. The ripple
factor determines the amplitude of the sidebands which are an unavoidable
part of any filter. Smaller values produce bigger and slower filters.
-
FIR
current selection is filtered using a Finite Impulse Response filter
whose coefficients are the source selection (see Targets and Sources).
It is highly recommended that you normalize the source before applying
the filter to avoid extreme amplitude values.
-
LPC formant
use a previously created Linear Predictive Coding datafile to apply
a time-varying formant filter to the current selection. A portion of an
open LPC datafile must be selected first, followed by the selection of
a region in a sound to be filtered. Note: This filter often produces
large amplitudes, and is best performed upon a floating-point soundfile.
The user has a choice of LPC frame interpolation methods; linear is
faster, but recalculated produces smoother results, and is useful when
stretching a small number of LPC frames over a large amount of sound. A
warp factor may be specified to shift the formant peaks up or down. See
Analysis menu, below, and Appendix B for more information about LPC data.
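For the Comb filter above, the relationship between the loop time and a 60 dB decay time can be sketched as below. This assumes a simple feedback comb, which is one common reading of "loop (delay) time"; it is not taken from the mxv source, and the function names are hypothetical. A center frequency in Hz corresponds to a loop time of 1/frequency seconds.

```python
def comb_feedback_gain(loop_time, decay_time):
    """Feedback gain so the impulse response has fallen 60 dB after decay_time seconds."""
    # Each trip around the loop multiplies the signal by g, so
    # g ** (decay_time / loop_time) must equal 10 ** (-60 / 20) = 0.001.
    return 0.001 ** (loop_time / decay_time)

def comb_filter(samples, sample_rate, loop_time, decay_time):
    """Feedback comb: y[n] = x[n] + g * y[n - delay]."""
    delay = max(1, round(loop_time * sample_rate))
    g = comb_feedback_gain(delay / sample_rate, decay_time)
    out = []
    for n, x in enumerate(samples):
        feedback = out[n - delay] if n >= delay else 0.0
        out.append(x + g * feedback)
    return out
```

Feeding a single impulse through comb_filter with a 0.5 second loop and a 1 second decay time produces echoes whose amplitude reaches 0.001 (that is, -60 dB) at the 1 second mark.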
Modify Menu (Other data types)
-
Normalize values
same as above
-
Smooth curve
applies a fixed low-pass filter to data -- y[n] = .5*x[n] + .5*x[n-1]
-
Autocorrelate curve...
applies a very simple autocorrelation function to the selected portion
of the data. This is not a denoising or click-removal
algorithm, and is not intended to be used on large amounts (> 100 frames) of data.
It is most useful for removing "glitches" from an otherwise smooth curve
(such as a pitch analysis data curve). The parameters are Deviation
threshold, which indicates the minimum frame-to-frame delta which should
be regarded as an out-of-range value (i.e., a glitch), and the number of
frames to look ahead for a continuation of the valid curve.
This is experimental code -- save a copy of your data before trying this.
-
Scale values...
same as Adjust gain, above
-
Rescale to fit...
allows the user to scale selected region to fit between an arbitrary
maximum and minimum value
-
Add offset...
same as "Add DC offset", above
-
Apply envelope...
same as above
-
Reverse
same as above
-
Swap bytes
same as above
-
Insert space...
same as above
-
Add delay...
same as above
-
Stretch/shrink...
like Transpose, above, but specified as ratio or as new length,
rather than interval
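The fixed low-pass filter used by Smooth curve above averages each value with the one before it. A minimal sketch in Python follows; the function name is hypothetical, and treating the value before the first frame as equal to the first frame is an assumption, since the manual does not describe the boundary handling.

```python
def smooth(values):
    """Two-point average: y[n] = 0.5 * x[n] + 0.5 * x[n-1]."""
    out = []
    # Assumption: the value "before" the first frame equals the first frame.
    prev = values[0] if values else 0.0
    for x in values:
        out.append(0.5 * x + 0.5 * prev)
        prev = x
    return out
```

For instance, smooth([0, 2, 4, 0]) returns [0.0, 1.0, 3.0, 2.0]: the sharp peak at 4 is pulled down and spread over the neighbouring frame.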
Display Menu (channel view)
-
Graph Style ->
waveform graph may be displayed in either a continuous line graph,
or a solid bar graph. The bar graph is useful for displaying individual
samples for editing purposes.
-
Horiz Scale Units ->
horizontal scale for waveform may be shown in either time (based on
frame rate), frame numbers (sample numbers for sounds), or SMPTE frames
(see "Scale Options" in options menu, below).
Display Menu (frame view)
-
Horiz. Scale Units ->
cascade-style frame display's horizontal axis may be displayed in either
frequency or band (channel) numbers.
-
Vert. Scale Units ->
vertical (y) axis, which is always amplitude, may be displayed in either
linear (amplitude) mode or log (decibel) mode. This is especially useful
for examining fft and phase vocoder data.
Options Menu
Note: Besides the commands listed below, each data type has its own options
menu command in its data-specific menu (i.e., under the "Sound" menu for
sounds).
-
Global Options...
Allows user to set global program options. Many of these options are
also settable via a .Xdefaults file (see mxv X resources, below).
-
Alert Beep Volume
Set the beep level relative to the base level set for the keyboard
(see the xset man page). Note: The beep cannot be shut off here -- use
"xset b off" to do this.
-
Dialog Panels Ignore Window Manager
See Appendix D for more information.
-
Auto-Place Windows on Screen
For window managers which offer a choice, windows may be placed automatically
in a staggered position on screen.
-
Memory Options...
For more complete control over the amount of memory used by mxv to
store and edit data, a limit may be arbitrarily imposed on the size of
any single allocation (for example, when a new file is opened or created
in memory) and on the total amount of memory used by mxv for all open datafiles.
If the single or total allocation exceeds its associated limit, an alert
panel pops up informing you of this, and giving you a choice of whether
to continue or not. Default values are: 500 MB for total allocation, 100
MB for single allocations. NOTE: These limits are not independent of the
machine's own memory limits. Real memory allocation errors will still occur
when the machine's virtual memory limit is reached if that limit is less
than the one set in mxv. As much as possible, mxv is designed to recover
gracefully from all such allocation failures.
-
File Options...
Allows user to set options regarding files and directories. Some of
these options are also settable via a .Xdefaults file:
-
Read Raw (Headerless) Files
If this option is on, mxv will attempt to read files without headers,
based upon the information you supply, and based on the file suffix (i.e.,
a raw file with a .lpc suffix will be interpreted as a raw LPC data file).
It will also attempt to reread all unreadable (i.e., empty or without read
permission) files as raw files. See the open... command, above, for further
information.
-
Store/Recall Browser Path
If this is set to Yes, each subsequent open command will return you
to whatever was the last directory from which a file was read, as opposed
to returning you to the default directory for that data type.
-
Browser Shows Invisible Files
If set to Yes, file browsers will display invisible (dot) files as
well as normal files.
-
Editor Options...
Allows user to set the depth of the command undo queue, and whether
or not saving a file clears the undo queue.
-
Scale Options...
Allows user to set options regarding vertical and horizontal scales.
Currently, the only settable option is the SMPTE frame format, in
which the user can choose any of the standard SMPTE frame rates. Note:
This feature is still in development.
-
Peak Rescan ->
Displays a submenu allowing user to disable the normal peak rescan
that occurs after every editing operation. This will speed up editing significantly,
but will not update the vertical scale on the screen.
-
Sound Playback Options... (Linux only)
Allows the user to set the audio device, mixer device, and audio buffer
size for audio playback. The audio and mixer device indices must
match for things to work correctly; for example, "/dev/dsp1" uses
"/dev/mixer1". The default is /dev/dsp and /dev/mixer. Audio
buffer size is specified in seconds. The default is 0.1 second, which
corresponds to 4410 samples at 44.1k. Larger values may be necessary
on some machines, but they will also lead to a time lag when trying to
stop playback.
-
Re-read .mxvrc file
Reads the .mxvrc file out of your home directory, if such a file is
present. This will reset the internal state to the values specified
there.
-
Write .mxvrc file
Writes out the current values for options you have set via the above
menus. It will also write out global MiXViews options which were
set in your X resources file.
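Two of the relationships described under Sound Playback Options above are simple enough to sketch: the buffer size in seconds versus samples, and the matching of audio and mixer device indices. The Python below is only an illustration; both function names are hypothetical and nothing here is read from mxv itself.

```python
def buffer_frames(buffer_seconds, sample_rate):
    """Number of sample frames held by an audio buffer of the given duration."""
    return round(buffer_seconds * sample_rate)

def matching_mixer(dsp_path):
    """Mixer device whose index matches the given audio device."""
    prefix = "/dev/dsp"
    if not dsp_path.startswith(prefix):
        raise ValueError("expected a path of the form /dev/dspN")
    # Reuse whatever index follows /dev/dsp (possibly none at all).
    return "/dev/mixer" + dsp_path[len(prefix):]
```

With the default of 0.1 second at 44.1k, buffer_frames(0.1, 44100) gives 4410, and matching_mixer("/dev/dsp1") gives "/dev/mixer1". Since playback can only stop once buffered audio has drained, a larger buffer means a proportionally longer lag after pressing stop.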
Help Menu
-
MiXViews Manual (via the web)
Allows the user to browse the MiXViews manual page via web browser
(configurable via the X resource "MiXViews*WebBrowser").
The default manual web page is "http://www.create.ucsb.ucsb.edu/~doug/MiXViews/mxv-manual.html".
-
MiXViews Home Page (via the web)
Allows the user to visit the MiXViews Home Page via web browser
(configurable via the X resource "MiXViews*WebBrowser").
Special Menus
Many data types have specific menus for editing/processing commands only
available for that data type.
Sound menu (Sounds only)
-
Play
play the current selection through the machine's digital-to-analog
converters (if any). The cursor will show the current position in the file,
and the position in time or samples will update in the panel below.
-
Record
record sound into currently selected region. This assumes that the
sound type is compatible with the converter in use. The cursor will show
the current position in the file, and the position in time or samples will
update in the panel below.
-
Stop
halt the current play or record operation, if any (See Playing and
Recording Sounds, below)
-
Rescan for peak
scan the soundfile to determine the current minimum and maximum values.
Rescale sets the peak amplitude of the file to +-32767 (for mu law &
16-bit samples), +-1.0 (for float samples), and +-127 (actually 0 - 255)
for 8-bit samples.
-
Change sample format...
allows user to convert file from one format to another, i.e., floating-point
to short int or short int to 8-bit mu law. File name will have a new suffix
appended to show it has been converted.
-
Change sample rate...
allows user to change the sampling rate of the soundfile.
-
Change file length...
change the overall length of the current file. User will be warned
if reducing the length (which will destroy data).
-
Synthesis ->
pull-out Synthesis submenu (see below)
-
D/A Converter ->
pull-out Converter submenu (see below)
-
Sound options...
set the default format for sounds, including sampling rate, sample
format, and header format. These defaults will be used in the "Save to"
panel and the "New Sound" panel.
Synthesis Submenu (Sounds only)
All the synthesis techniques require a "source" selection to be made (this
will be the data controlling the synthesis) and a "target" selection (this
will be the portion of a sound into which you wish to synthesize new sound).
-
LPC resynthesis...
prompts user for parameters for resynthesis of sounds from existing
LPC data. The voiced threshold is the frame error value below which the
resynthesized sound will be entirely pitched material; the unvoiced threshold
is the frame error value above which it will be entirely unpitched (noise)
material. The pitched/unpitched ratio allows adjustment of the mix -- the
default is a good starting point. The warp factor allows shifting of the
formant peaks up or down. The user has a choice of LPC frame interpolation
methods; linear is faster, but recalculated produces smoother results,
and is useful when stretching a small number of LPC frames over a large
amount of sound. See Appendix B for more information.
-
Phase Vocoder resynthesis...
prompts user for parameters for resynthesis of sounds from existing
Phase Vocoder data. See Appendix C for more information.
Converter Submenu (Sounds only)
-
NeXT, SPARC, SGI, etc.
items show the choice of D/A converters, if any.
-
Converter settings...
allows user to set converter parameters such as record and play levels,
depending on the platform.
-
Reset converter
reset the converter to its pre-initialized state. This is for use after
a converter initialization failure.
Analysis Menu (Sounds only)
For all analysis techniques that involve frame rates and offsets, the dialog
panel works as follows: if the frame rate value is left at zero, the value
for the frame offset will be used. If the frame rate value is nonzero,
the frame offset value will be ignored, and will be set to (sample rate)/(frame
rate).
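The rule above can be sketched as follows. This is a minimal illustration, not mxv's code: the function name is hypothetical, and rounding to the nearest whole sample is an assumption.

```python
def effective_frame_offset(sample_rate, frame_rate, frame_offset):
    """Return the frame offset (in samples) that an analysis would use.

    Hypothetical helper illustrating the dialog rule.
    """
    if frame_rate != 0:
        # A nonzero frame rate overrides whatever offset was typed in.
        # Assumption: the result is rounded to a whole number of samples.
        return int(round(sample_rate / frame_rate))
    # Frame rate left at zero: the entered offset is used as-is.
    return frame_offset
```

For example, a 44100 Hz sound analyzed at 100 frames per second would get an offset of 441 samples, regardless of the offset field.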
-
Locate next zero crossing
sets the cursor to the next location in the soundfile where the sample
values cross the zero boundary, i.e., sample[n] is positive and sample[n+1]
is negative, or vice versa.
-
Find slope change
searches for locations in a sound where the ratio of sample[n+1]/sample[n]
exceeds the average value for the frame size selected. This is useful for
detecting discontinuities such as pops and clicks.
-
Show maxamp sample location
sets the cursor to the location of the peak sample in the file.
-
Extract amplitude envelope
generates a new Envelope file containing values representing the amplitudes
in the selected region. User may choose between linear, RMS, or Decibel
display of amplitude. The number of samples per resulting envelope frame
is determined by the formula (size of selection)/(envelope length).
-
FFT analysis
runs a Fast Fourier Transform on a selected region with various size
transforms and frame offsets. This FFT display is for analysis purposes
only and cannot currently be used for any other purpose in MiXViews.
-
LPC analysis
runs a Linear-Predictive Coding analysis on the selected region. This
uses the LPC program developed by Paul Lansky at Princeton. A dialog window
will let you set the number of poles and the frame size and offset. See
Appendix B for further information.
-
Extract pitch envelope
runs a pitch tracking analysis on the selected region. Pitch range
should be set to approx. 10% above and below the expected range of fundamentals.
The RMS amplitudes of the frames are also displayed. Pitch tracking information
is used in conjunction with LPC analysis; the pitch channel (channel 2)
is usually merged into the 4th channel of an LPC data file (see the Merge
pitch data command under the LPC menu). The user can choose whether to pre-filter
the data with a bandpass filter (slower, but much better frequency accuracy)
or a low-pass filter (faster, but not useful at low or high frequencies).
-
Phase Vocoder analysis
runs a Phase Vocoder analysis on the selected region. Input frame size
is the number of samples to analyze per frame. Larger values will give
more precise spectral peaks, but tend to produce blurred sounds if the
timbre changes quickly. Input frame offset is the number of samples to
shift over for each frame. This defaults to framesize/8 unless a non-unity
time scaling factor is used. Input frame rate, if nonzero, sets the offset
to (sample rate)/(frame rate) samples. Time scaling factor, if non-unity, changes
the frame offset. This is used to increase the number of frames per second
by a given factor, so that when a sound is resynthesized, it can be stretched
by that same time factor without significant windowing distortion. See
Appendix C for further information.
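Two of the simpler Analysis menu operations described above, locating a zero crossing and extracting an RMS amplitude envelope, can be sketched as follows. This is an illustration only, with hypothetical function names; it treats a sound as a plain list of sample values and is not mxv's implementation.

```python
import math


def next_zero_crossing(samples, start=0):
    """Return the first index n >= start where sample[n] and sample[n+1]
    have opposite signs, or None if there is no such location."""
    for n in range(start, len(samples) - 1):
        a, b = samples[n], samples[n + 1]
        if (a > 0 and b < 0) or (a < 0 and b > 0):
            return n
    return None


def rms_envelope(samples, envelope_length):
    """Return envelope_length RMS values covering the selection.

    Samples per envelope frame = (size of selection)/(envelope length).
    Assumption: integer division; leftover samples at the end are ignored.
    """
    frame_size = len(samples) // envelope_length
    if frame_size == 0:
        raise ValueError("envelope is longer than the selection")
    envelope = []
    for i in range(envelope_length):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        envelope.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return envelope
```

Note that a sample that is exactly zero does not count as a crossing under the definition given above, which requires one positive and one negative neighbor.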
LPC menu (LPC Datafiles only)
-
Stabilize frames
runs a stabilization algorithm on the entire datafile which tries to
eliminate all frames with unstable coefficients, i.e., those that would
produce infinite amplitudes during resynthesis. This is highly recommended!
-
Display filter amplitudes
produces an envelope window displaying the relative amplitudes generated
by each filter frame as it is interpolated into the following frame (10
frames per original). This is useful for detecting "bad" frame interpolations.
-
Display filter formants
produces a more elaborate Fast Fourier Transform of each frame, so
that the formant peaks of each filter frame may be examined.
-
Audition via resynthesis
allows the user to listen to a resynthesized version of the currently
selected portion of the LPC data file. This is useful for determining
which portion of an analysis is which.
-
Merge pitch data
copies the pitch channel information from a Pitch Track Analysis file
into the fourth channel of the LPC data file. This is necessary because
the LPC analyzer does not do this on its own. In most circumstances, the
pitch analysis should be from the same region of the sound as the LPC analysis,
with the same frame offset or frame rate settings.
-
Adjust pitch deviation...
the total amount of pitch variance around the average pitch value can
be adjusted. A threshold is entered first; frames with error values
exceeding this threshold will be ignored by the process. These are usually
unpitched frames. This allows you to effectively smooth or exaggerate the
inflections within a given pitch curve.
-
Change sample rate...
allows user to change the value of the sample rate for the data. This
is only of occasional interest.
-
Change file length...
change the overall length of the current file
-
LPC options...
set the default format for LPC data files, including frame rate, number
of filter poles, and header format. These defaults will be used in the
"Save to" panel and the "New LPC Data" panel. NOTE: The default sampling
rate will be set to the default sound sampling rate (see above).
Pitch menu (Pitch Track Datafiles only)
-
Interpolate between endpoints
"redraws" pitch curve to produce a straight line between the first
and last frame values in the current selection. Very useful for removing
pitch discontinuities.
-
Change file length...
change the overall length of the current file
-
Shift by pitch interval...
currently not implemented
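The "Interpolate between endpoints" command described above amounts to ordinary linear interpolation. A minimal sketch, assuming the pitch track is a plain list of frame values and the selection is given by inclusive start and end indices (the function name is hypothetical):

```python
def interpolate_endpoints(values, start, end):
    """Return a copy of values in which frames start..end (inclusive)
    lie on a straight line between values[start] and values[end]."""
    result = list(values)
    span = end - start
    if span <= 0:
        return result  # nothing to interpolate
    first, last = values[start], values[end]
    for i in range(start, end + 1):
        result[i] = first + (last - first) * (i - start) / span
    return result
```

Frames outside the selection are left untouched, so a single stray pitch value can be removed by selecting the frames on either side of it.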
PVoc menu (Phase Vocoder Datafiles only)
-
Harmonically shift spectrum
allows user to multiply any portion of the Phase Vocoder analysis by
an arbitrary factor, effectively transposing that portion of the spectrum.
Any envelope may be used to map this factor, if desired. A new pvoc file
will be created by this command, and the original will be untouched.
-
Stretch/shrink shift spectrum
works like the previous command, but the shift factor varies linearly
with the frequency, converting harmonic spectra into inharmonic ones. Any
envelope may be used to map this factor, if desired. A new pvoc file will be
created by this command, and the original will be untouched.
-
Change file length...
change the overall length of the current file. User will be warned
if reducing the length (which will destroy data).
-
Pvoc options...
currently not implemented.
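The difference between the two spectrum-shifting commands above can be illustrated on a list of partial frequencies. This is a sketch only: the function names are hypothetical, and the particular linear mapping used in stretch_shift (factor 1.0 at 0 Hz, reaching the given factor at the Nyquist frequency) is an assumption, not mxv's documented behavior.

```python
def harmonic_shift(freqs, factor):
    """Multiply every frequency by the same factor (a transposition)."""
    return [f * factor for f in freqs]


def stretch_shift(freqs, factor, nyquist):
    """Multiply each frequency by a factor that varies linearly with it.

    Assumption: the factor is 1.0 at 0 Hz and `factor` at `nyquist`.
    """
    return [f * (1.0 + (factor - 1.0) * f / nyquist) for f in freqs]
```

With a constant factor the ratios between partials are preserved, so a harmonic spectrum stays harmonic. With a frequency-dependent factor the ratios change, which is what makes the result inharmonic.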
Envelope menu (Envelopes only)
-
Create linear curve...
creates a linear slope over the selected region. Starting and ending
values default to the start and end of the selected region.
-
Create exponential curve...
creates an exponential curve using the equation y = x ** n. The user
can select the exponent 'n' for the curve. Starting and ending values default
to the start and end of the selected region.
-
Invert existing curve
currently not implemented.
-
Change file length...
change the overall length of the current file. User will be warned
if reducing the length (which will destroy data).
-
Envelope options...
user can select the byte order in which envelope files should be written
to disk, and whether or not a header should be written on the file. Setting
this to raw is extremely useful for writing out "generic" raw floating-point
data files for use in other programs.
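The linear and exponential curve commands above can be sketched with one function. This is a hedged illustration with hypothetical names: it assumes x runs from 0 to 1 across the selection and that y = x ** n is then scaled between the starting and ending values.

```python
def exponential_curve(length, start_value, end_value, exponent):
    """Return `length` envelope values following y = x ** exponent,
    scaled to run from start_value to end_value.

    Assumption: x is normalized to the range 0..1 over the selection.
    """
    if length < 2:
        raise ValueError("need at least two points")
    curve = []
    for i in range(length):
        x = i / (length - 1)
        curve.append(start_value + (end_value - start_value) * x ** exponent)
    return curve


def linear_curve(length, start_value, end_value):
    """A linear slope is the special case with exponent 1."""
    return exponential_curve(length, start_value, end_value, 1.0)
```

Exponents greater than 1 give a curve that starts slowly and rises steeply at the end; exponents between 0 and 1 do the opposite.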
Appendix Entries
Appendix A: Mxv Defaults and X Resources
X Resources:
Mxv is built on top of the InterViews toolkit. As with other X toolkits,
a large number of resource values may be set by including lines in your
.Xresources or .Xdefaults file (see X windows manuals for a full explanation
of this procedure). The diagram below shows the toolkit hierarchy within mxv.
The entries below are in addition to all those that are available for the
InterViews toolkit. For the most part, these resource values affect the
visual appearance of the program only.
[Note from the editor: I realize this is VERY complicated -- I will
eventually include a section from the InterViews manual here to help explain
this. The default settings of everything should be sufficient for now.
-DAS ]
For example, to set the background color of the horizontal scale in
the LPC data display window to Blue, put a line like the following in your
.Xresources file:
mxv*LPCWindow*HScale*HorizontalScale*background: Blue
This same resource can be set on the command line using the -xrm option:
% mxv -xrm "*LPCWindow*HScale*HorizontalScale*background:Blue"
Key to diagram:
ProgramName
*ResourceNode
<*GenericResourceNode>
*GenericResourceName:
*GenericResourceName:
*GenericResourceName:
*ResourceNode <*GenericResourceNode>
*ResourceNode ("Specific Instance Name")
*ResourceName: [default value]
MiXViews
*AutoPlaceWindows: [false]
*BrowserUseLastPath: [false]
*ReadRawFiles: [false]
*WebBrowser: [no default]
*ManualURL: [no default]
*SoundWindowDisplayChannels: [4]
*LPCWindowDisplayChannels: [4]
*PitchWindowDisplayChannels:
*FFTWindowDisplayChannels:
*EnvelopeWindowDisplayChannels:
*PvocWindowDisplayChannels: [4]
*DefaultSoundFileDir:
*DefaultLPCFileDir:
*DefaultPitchFileDir:
*DefaultFFTFileDir:
*DefaultEnvelopeFileDir:
*DefaultPvocFileDir:
*StatusBar
*FramedWindow
<*TextWindow>
<*DataWindow>
*MenuBar
*PulldownCommandMenu <*PulldownMenu> [menu titles]
*Command <*MenuItem> [command names]
*PullrightCommandMenu <*PullrightMenu>
*Command
<*DataView>
*HScale
*VMessage ("HorizontalScaleLabel")
*HScaleMarks ("HorizontalScale")
*padding:
*borderWidth:
*allowShrink:
*HBorder
*ViewScaler
*VBorder
*StatusPanel ("Edit Start: ")
*StatusPanel ("Edit End: ")
*HorizontalViewScroller
*VScale
*VMessage ("VerticalScaleLabel")
*VScaleMarks ("VerticalScale")
*padding:
*borderWidth:
*allowShrink:
<*Graph>
*PlotWidth:
*PlotHeight:
*ChannelView <*DataView>
*ChannelGraph <*Graph>
*PlotStyle: [bar]
*FrameView <*DataView>
*VerticalViewScroller
*VBorder
*FrameGraph <*Graph>
<*DialogBox>
*Message ("Title")
*Message ("Subtitle")
*ResponseButton <PushButton> (button title)
*Message ("ChoiceButtonTitle")
-*RadioButton
or
-*CheckBox
*Alert <*DialogBox> ("AlertPanel")
*Confirmer <*DialogBox> ("ConfirmPanel")
*ChoiceDialog <*DialogBox> ("ChoicePanel")
*InputDialog <*DialogBox> ("InputPanel")
*Message ("TextEntryLabel")
*TextInput <*StringEditor>
*ValueSlider
*Message ("TextEntryLabel")
*TextInput <*StringEditor>
*NumberLabel
*ScrollerBar
*SoundWindow <*DataWindow>
*LPCWindow <*DataWindow>
*PitchWindow <*DataWindow>
*FFTWindow <*DataWindow>
*EnvelopeWindow <*DataWindow>
*PvocWindow <*DataWindow>
Mxv Defaults
All non-visual application default settings are in the process of being
moved to a new .mxvrc file which should be located in the user's home directory.
The format of this file is:
DefaultName1 DefaultValue1
DefaultName2 DefaultValue2
...
Entries must be one pair per line, and may be separated with spaces
and/or tabs. No other characters (i.e., no *'s or :'s) should appear in
these lines. At present, only the following defaults may be set
in this file:
AutoPlaceWindows
BrowserUseLastPath
ReadRawFiles
WebBrowser
ManualURL
DefaultSoundFileDir
DefaultLPCFileDir
DefaultPitchFileDir
DefaultFFTFileDir
DefaultEnvelopeFileDir
DefaultPvocFileDir
On Linux machines, the following may also be set:
OSSAudioDevice
OSSMixerDevice
OSSBufferTime
These may also be set in the .Xdefaults file as described above, but
values set in the .mxvrc file will override any settings in the other file.
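The .mxvrc format and the override rule described above can be sketched as follows. This is not mxv's own parser: the function names and the sample values are hypothetical, and skipping lines that do not contain a name/value pair is an assumption.

```python
def parse_mxvrc(text):
    """Parse 'Name Value' pairs, one per line, separated by spaces/tabs."""
    defaults = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            name, value = parts
            defaults[name] = value.strip()
        # Assumption: blank or incomplete lines are silently skipped.
    return defaults


def effective_defaults(x_defaults, mxvrc_defaults):
    """Values from .mxvrc override those from .Xdefaults."""
    merged = dict(x_defaults)
    merged.update(mxvrc_defaults)
    return merged


# Hypothetical file contents, for illustration only.
SAMPLE_MXVRC = "AutoPlaceWindows\ttrue\nDefaultSoundFileDir   /tmp/snd\n"
```

Calling parse_mxvrc(SAMPLE_MXVRC) yields a dictionary with the two names as keys; merging it over a dictionary of X resource values leaves any setting not mentioned in .mxvrc unchanged.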
Appendix B: Quick Overview of Linear Predictive Coding (LPC) Analysis
LPC analysis, which originally was used for statistical analysis, proved
useful in computer music because of its ability to extract and store time-varying
formant information. Time varying means that the information changes over
time, like the amplitude of a waveform does. Formants are points in a sound's
spectrum where frequencies are boosted. In the real world this is often
due to natural resonance in the object that is vibrating. The differences
in the sounds of spoken vowels such as 'a' and 'e' are due to differences
in the formant peaks caused by the difference in the shape of your mouth
when you produce the sounds.
The data generated by an LPC analysis of a sound consists primarily
of filter coefficients which, if used to control a specific type of filter,
will alter an input sound's spectrum to match the formant peaks of the
original sound. If this input sound is a raw pulse waveform (which contains
all harmonics at equal amplitudes), the resultant filtered sound timbre
will be very close to the original. This is the basic procedure for the
LPC Resynthesis command. Typically, the original sound to be analyzed is
a vocal sound, which can then be resynthesized with various parameters
(such as pitch or duration) changed.
The Formant Filter command allows for the filtering of any arbitrary
input sound, which "maps" the formant peaks onto that sound. An additional
'warping' parameter is also available, with values between -1 and 1, with
0 being no warping. This factor has the effect of shifting the formant
peaks down or up in frequency, thereby radically altering the timbre. The
best values are in the range ±0.01 to ±0.5. In the LPC analysis command,
the number of poles specifies the accuracy of the analysis: the greater
the number of poles, the more precisely the formant regions will be captured.
Typical values range from about 24 for 22 kHz sounds up to 64 for 44 kHz
and 48 kHz sounds.
Appendix C: Quick Overview of Phase Vocoder (PVoc) Analysis
Phase vocoder analysis creates a data file containing frames of information
representing the frequencies and amplitudes of those frequencies for successive
"time slices" of a given sound. These slices usually overlap, and when
a sound is resynthesized from a PVoc datafile, it is usually possible to
produce an exact replica of the original -- depending on the accuracy of
the original analysis. This is very different from LPC analysis, which
only extracts formant peaks. PVoc analyses contain complete information
about the spectral composition -- in essence, a blueprint -- of a sound.
This blueprint may be altered in an infinitude of ways prior to performing
the resynthesis, allowing for an infinite range of possible resynthesized
sounds. The size and spacing of the "slices" is determined at the time
of the analysis; typically the slices are 512 samples long, and are spaced
at 512/8 or 64 samples apart, or about 689 frames per second at a 44.1 kHz
sampling rate.
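The frame-rate figure above follows from simple arithmetic, sketched here. The function name is hypothetical, and the 44100 Hz sampling rate in the example is an assumption chosen because it reproduces the quoted figure.

```python
def pvoc_frames_per_second(sample_rate, frame_size, overlap_factor=8):
    """Frames per second when slices are spaced frame_size/overlap_factor
    samples apart (the default spacing described in this manual)."""
    frame_offset = frame_size // overlap_factor  # 512 // 8 = 64 samples
    return sample_rate / frame_offset


# 44100 / 64 = 689.0625, i.e. about 689 frames per second.
rate = pvoc_frames_per_second(44100, 512)
```

At other sampling rates the same slice size and spacing give a proportionally different frame rate.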
Appendix D: Dialog Panels
Dialog panels are used to display information and error or alert conditions,
as well as to confirm certain kinds of actions and allow choices to be
made. Another set of dialog panels allow the user to input information
for various types of operations. In panels which contain text entry items
(places where the user can type text), the user may switch between items
using the <tab> character. If a single text item is present, a <tab>
will redisplay the value, which is very useful for items with bounded values
(see below). The standard kill-word and kill-line key commands can be used
to edit text (i.e., ^W and ^U), as well as deleting with the backspace
or delete key. Hitting <return> will usually activate one of the buttons
at the bottom of the panel -- always the one in boldface type. In panels
offering a choice (i.e., yes-no-cancel or confirm-cancel) the user may
use the keyboard to operate the buttons: typing 'y' for confirm or yes,
'n' or 'c' for no or cancel. Text entries left blank will reset to the
last successfully entered value. Illegal characters (like letters in a
numeric entry item) produce a beep from the system.
Most of the text entry items are "bounded", i.e., there is a specific
range of legal values. Some of these display their bounds visually with
a slider and associated end labels showing the min and max values. These
are usually very limited range items, like -1 to +1, or some specific integer
range like 1 to 32. Other more general bounds, such as all positive integers,
or all non-negative numbers, do not get displayed -- but if an out-of-bounds
value is entered, the actual parameter value will be adjusted to fit the
boundary conditions. This greatly reduces the number of "invalid parameter"
error messages produced by the program. Hitting a <tab> after entering
a value will display the actual parameter value to be used.
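The adjustment of out-of-bounds entries described above is essentially a clamp. A minimal sketch, with a hypothetical function name; how mxv adjusts values internally is not documented here, so clamping to the nearest bound is an assumption.

```python
def clamp_to_bounds(value, minimum=None, maximum=None):
    """Return value adjusted to lie within [minimum, maximum].

    A bound of None means that side is unbounded, as with
    'all non-negative numbers' (minimum=0, no maximum).
    """
    if minimum is not None and value < minimum:
        return minimum
    if maximum is not None and value > maximum:
        return maximum
    return value
```

For a slider item bounded from -1 to +1, typing 1.5 would give 1.0 under this assumption, and that is the value a <tab> would then redisplay.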
Most dialog panels will "remember" the values entered in the previous invocation.
Some parameters are always set to certain defaults such as window boundaries
or Nyquist frequencies. For those where entering a '0' indicates a desire
to use default values, usually the '0' will be displayed on each invocation
due to possible changes in the other parameters.
The panels are designed to not interact with the X Window Manager, i.e.,
you are not supposed to be able to iconify or hide a dialog -- only enter
values and then confirm or cancel. The "Dialog Panels Ignore Window Manager"
option in the global options panel (see above) allows a choice because
some window managers (specifically the native SGI 4dWM and the Sun olwm)
will not allow keyboard focus if the dialog panels completely ignore the
window manager.
Appendix E: Progress Dialogs
Progress dialogs display the percentage completion of the currently running
processing operation. Any such operation may be interrupted at any
time prior to its completion by clicking on the Cancel button. Doing
this will cause the file to revert to the state it was in just prior to
the cancelled operation.
Appendix F: Playing and Recording Sounds
Mxv is able to use the digital-to-analog converter hardware on several
different platforms. On machines which allow only a limited number of sound
formats to be played (such as mu-law only or short int only), mxv will
automatically convert the selection you desire to play into an appropriate
format (for example, converting floating point samples into short integers
or mu-law). It will not, however, alter the sampling rate or number of
channels, so you will still get an error message if these additional parameters
do not match the converter specs. To record into a sound object, all parameters
must match -- mxv will not attempt to adjust these.
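The kind of format conversion mentioned above, floating point samples to short integers, can be sketched as follows. This is illustrative only: the function name is hypothetical, and the assumptions that floating point samples lie in the range -1.0 to 1.0 and that out-of-range values are clipped are not confirmed by this manual.

```python
def float_to_short(samples):
    """Convert floating point samples to 16-bit integer sample values.

    Assumptions: input is nominally in -1.0..1.0; values outside that
    range are clipped rather than wrapped.
    """
    shorts = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip to the assumed legal range
        shorts.append(int(round(x * 32767)))
    return shorts
```

The sampling rate and channel count are deliberately left alone here, matching the behavior described above.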
To stop a sound during play or record, you can either use the "Stop"
menu command, or the more convenient keyboard equivalent <control>-<delete>
or <control>-<backspace>.
If the converter device initialization fails, the converter will switch
to an inactive state until the user either switches converters or uses
the "reset converter" command (see above).
Note that mxv records and plays from virtual memory -- there is no disk
i/o involved in the process. Recording and playing only modify the currently
selected section of the soundfile.
Last updated August 04, 2003