Phase Three: Musical Mosaic

Mxzhan
23 min read · Feb 8, 2024


Video:

Reflection:

Creating a Musical Mosaic through computational methods, particularly with ChucK, was a unique and enriching experience that bridged traditional Chinese music with Western musical elements. The project’s ambition was to blend the distinct tones of the guqin, a classical Chinese stringed instrument, with the universal appeal of Western instruments like the piano. This endeavor required not just a deep appreciation for the musical traditions involved but also technical proficiency in manipulating sound through software.

The most challenging aspect of this project was extracting the precise sound of the guqin. The guqin, with its serene and complex harmonic overtones, poses a significant challenge for digital representation, especially when attempting to preserve its authenticity and depth. Finding the right FFT combination was a task that demanded patience, experimentation, and a nuanced understanding of both the instrument’s acoustic properties and the mathematical principles underpinning sound analysis.
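
To make that search more concrete, below is a minimal sketch of the kind of ChucK analysis network I kept re-wiring: swap unit analyzers in and out, adjust the FFT size, re-run, and compare what the features capture. The particular analyzers, the FFT size, and the "guqin.wav" filename here are illustrative placeholders rather than my final settings (the full versions are in the code at the end of this post).

// sketch: try different feature combinations on a guqin recording
SndBuf guqin => FFT fft;                   // audio file into an FFT
FeatureCollector combo => blackhole;       // collect features into one vector
fft =^ Centroid centroid =^ combo;         // spectral "center of mass"
fft =^ RollOff rolloff =^ combo;           // frequency below which most energy sits
fft =^ Chroma chroma =^ combo;             // pitch-class energy (nice for plucked strings)
fft =^ MFCC mfcc =^ combo;                 // overall timbral envelope
2048 => fft.size;                          // larger = finer frequency detail, blurrier timing
Windowing.hann(fft.size()) => fft.window;  // window each analysis frame
me.dir() + "guqin.wav" => guqin.read;      // placeholder filename
fft.size()::samp => now;                   // let one FFT-size of audio flow in
combo.upchuck();                           // compute everything connected via =^
<<< "total feature dimensions:", combo.fvals().size() >>>;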

Another big challenge was the piano. I hadn’t played the piano in over ten years, so deciding to include it in the project meant I had to brush up on my skills. I practiced a lot to get comfortable playing again and to figure out how to fit the piano’s sound into the mix through my editing. On top of that, recording the piano and making sure it sounded good digitally added another layer of difficulty. Questions like where to place the microphone and how to capture the true sound of the instrument were new challenges I had to tackle.

Finally, thanks to my guqin tutor, who helped me find my old guqin recordings; to Ge, for providing this platform for me to reconnect with my musical skills and for the code that let me have fun; and to my grandfather’s old house in Shaoxing, where I was able to gather plenty of material for the empty mirror frames.

Phase 1 Code:

1. Feature extract:
// output file (if empty, will print to console)
"" => string OUTPUT_FILE;
// get from arguments
if( me.args() > 0 ) me.arg(0) => OUTPUT_FILE;

// check
if( Machine.silent() == false )
{
// print helpful message
<<< "-----------------", "" >>>;
<<< "[feature-extract]: chuck is currently running in REAL-TIME mode;", "" >>>;
<<< "[feature-extract]: this step has no audio; may run much faster in SILENT mode!", "" >>>;
<<< "[feature-extract]: to run in SILENT mode, restart chuck with --silent flag", "" >>>;
<<< "-----------------", "" >>>;
}


//---------------------------------------------------------------------
// analysis network -- this determines which feature will be extracted
//---------------------------------------------------------------------
// mayshu - modifying this for different feature combinations
// see Unit Analyzers in the API reference for the available features
// pay attention to spectral features: three dimensions -- time, one vertical time slice, where the energy is located
// => connects an audio signal
// =^ is the upchuck operator, which connects an analysis signal
// remember to use =^ (upchuck) whenever analysis is needed

// audio input into a FFT
SndBuf audioFile => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
//fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;
fft =^ RollOff rolloff =^ combo;
fft =^ ZeroX zerox =^ combo;
//fft =^ Chroma chroma =^ combo;
//fft =^ Kurtosis kurtosis =^ combo;


//---------------------------------------------------------------------
// setting analysis parameters -- important for tuning your extraction
//---------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC (internal to MFCC)
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size
2048 => fft.size;
// set window type and size
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
2048::samp => dur HOP;


//---------------------------------------------------------------------
// training data -- preparation specific to a train dataset
//---------------------------------------------------------------------
// labels (and filename roots)
// change these if using a different dataset -- the labels must match the file names in the dataset folder
["blues", "classical", "country", "disco", "hiphop",
"jazz", "metal", "pop", "reggae", "rock"] @=> string labels[];
// how many per label
100 => int NUM_EXAMPLES_PER_LABEL;
// how much time to aggregate features for each file
30::second => dur EXTRACT_TIME;
// given EXTRACT_TIME and HOP, how many frames per file?
(EXTRACT_TIME / HOP) $ int => int numFrames;
// relative path
//"data/gtzan/genres_original/" => string PATH; // NOTE: this must point to the right dataset path
"gtzan/genres_original/" => string PATH;

// a feature frame
float featureFrame[NUM_DIMENSIONS];
// how many input files
0 => int NUM_FILES;

// output reference, default is error stream (cherr)
cherr @=> IO @ theOut;
// instantiate
FileIO fout;
// output file
if( OUTPUT_FILE != "" )
{
// print
<<< "opening file for output:", OUTPUT_FILE >>>;
// open
fout.open( OUTPUT_FILE, FileIO.WRITE );
// test
if( !fout.good() )
{
<<< "cannot open file for writing...", "" >>>;
me.exit();
}
// override
fout @=> theOut;
}


//---------------------------------------------------------------------
// extraction -- iterating over entire training dataset
//---------------------------------------------------------------------

// filename
string filename;
// loop over labels
for( int i; i < labels.size(); i++)
{
// get current label
labels[i] => string label;
// loop over examples under each label
for( int j; j < NUM_EXAMPLES_PER_LABEL; j++ )
{
// construct filepath
me.dir() + PATH + label + "/" + label + ".000" + (j<10?"0":"") + j + ".wav" => filename;
// extract the file
if( !extractFeatures( filename, label, theOut ) )
{
// issue warning
cherr <= "PROBLEM during extraction: " <= filename <= IO.newline();
// bail out
me.exit();
}
}
}

// flush the output
theOut.flush();


//---------------------------------------------------------------------
// function: extract and print features from a single file
//---------------------------------------------------------------------
fun int extractFeatures( string inputFilePath, string label, IO out )
{
// increment
NUM_FILES++;
// log
cherr <= "[" <= NUM_FILES <= "] extracting features: " <= inputFilePath <= IO.newline();

// load by block to speed up IO
2048 => audioFile.chunks;
// read the audio file
inputFilePath => audioFile.read;
// zero out
featureFrame.zero();

// let one FFT-size of time pass (to buffer)
fft.size()::samp => now;
// loop over frames
for( int i; i < numFrames; i++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// for each dimension
for( int d; d < NUM_DIMENSIONS; d++ )
{
// copy
combo.fval(d) +=> featureFrame[d];
}
// advance time
HOP => now;
}

//-------------------------------------------------------------
// average into a single feature vector per file
// NOTE: this can be easily modified to N feature vectors
// spread out over the length of an audio file; for now
// we will just do one feature vector per file
//-------------------------------------------------------------
for( int d; d < NUM_DIMENSIONS; d++ )
{
// average by total number of frames
numFrames /=> featureFrame[d];
// print the MFCC results
out <= featureFrame[d] <= " ";
}
// print label name and endline
out <= label <= IO.newline();

// done
return true;
}

2. Genre classify:

// input: pre-extracted features file with labels
// me.dir() + "data/gtzan-23.txt" => string FEATURES_FILE; // NOTE: always use the right path to the features file
me.dir() + "mayshugtzan-24.txt" => string FEATURES_FILE;
// if have arguments, override filename
if( me.args() > 0 ) me.arg(0) => FEATURES_FILE;
//------------------------------------------------------------------------------
// expected features file format:
//------------------------------------------------------------------------------
// VALUE VALUE ... VALUE LABEL
// VALUE VALUE ... VALUE LABEL
// ... ... ... ... LABEL
// VALUE VALUE ... VALUE LABEL
//------------------------------------------------------------------------------


//------------------------------------------------------------------------------
// unit analyzer network: this must match the features in the features file
//------------------------------------------------------------------------------
// audio input into a FFT
adc => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;


//-----------------------------------------------------------------------------
// setting analysis parameters -- also should match what was used during extraction
//-----------------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
// 13 is a commonly used value; using 20 here
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size
2048 => fft.size;
// set window type and size
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
2048::samp => dur HOP;


//------------------------------------------------------------------------------
// load feature data; read important global values like numPoints and numCoeffs
//------------------------------------------------------------------------------
// values to be read from file
0 => int numPoints; // number of points in data
0 => int numCoeffs; // number of dimensions in data
// file read PART 1: read over the file to get numPoints and numCoeffs
loadFile( FEATURES_FILE ) @=> FileIO @ fin;
// check
if( !fin.good() ) me.exit();
// check dimension
if( numCoeffs != NUM_DIMENSIONS )
{
// error
<<< "[error] expecting:", NUM_DIMENSIONS, "dimensions; but features file has:", numCoeffs >>>;
// stop
me.exit();
}

// labels of all data points
string inLabels[numPoints];
// label indices of all data points
int inLabelsInt[inLabels.size()];
// feature vectors of data points
float inFeatures[numPoints][numCoeffs];
// keys
string labels[0];
// use as map: labels to numbers
int label2int[0];

//------------------------------------------------------------------------------
// array for storing features
//------------------------------------------------------------------------------
// how much time to aggregate features for each file
// (this does not need to match extraction; might play with this number)
.5::second => dur EXTRACT_TIME;
// given EXTRACT_TIME and HOP, how many frames per file?
(EXTRACT_TIME / HOP) $ int => int numFrames;

// use this for new input
float features[numFrames][numCoeffs];
// average values of coefficients across frames
float featureMean[numCoeffs];
// for printing
int lengths[0];
// for printing (how much to indent)
int indents[0];


//------------------------------------------------------------------------------
// read the data
//------------------------------------------------------------------------------
readData( fin );


//------------------------------------------------------------------------------
// set up our KNN object to use for classification
// (KNN2 is a fancier version of the KNN object)
// -- run KNN2.help(); in a separate program to see its available functions --
//------------------------------------------------------------------------------
KNN2 knn;
// k nearest neighbors
10 => int K;
// results vector
float knnResult[labels.size()];
// knn train
knn.train( inFeatures, inLabelsInt );




//------------------------------------------------------------------------------
// real-time classification loop
//------------------------------------------------------------------------------
while( true )
{
// aggregate features over a period of time
for( int frame; frame < numFrames; frame++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// get features
for( int d; d < NUM_DIMENSIONS; d++)
{
// store them in current frame
combo.fval(d) => features[frame][d];
}
// advance time
HOP => now;
}

// compute means for each coefficient across frames
for( int d; d < NUM_DIMENSIONS; d++ )
{
// zero out
0.0 => featureMean[d];
// loop over frames
for( int j; j < numFrames; j++ )
{
// add
features[j][d] +=> featureMean[d];
}
// average
numFrames /=> featureMean[d];
}

//-------------------------------------------------
// predict using KNN2; results filled in knnResults
//-------------------------------------------------
knn.predict( featureMean, K, knnResult );

// print results
chout <= "-------------------------------------------------" <= IO.newline();
// print for each label
for( int i; i < knnResult.size(); i++ )
{
// print label
chout <= labels[i] <= ":";
// print indentation
for( int ii; ii < indents[i]; ii++ ) { chout <= " "; }
// print probability
chout <= knnResult[i] <= IO.newline();
}
}
//------------------------------------------------------------------------------
// end of real-time classification loop
//------------------------------------------------------------------------------




//------------------------------------------------------------------------------
// function: load data file
//------------------------------------------------------------------------------
fun FileIO loadFile( string filepath )
{
// reset
0 => numPoints;
0 => numCoeffs;

// load data
FileIO fio;
if( !fio.open( filepath, FileIO.READ ) )
{
// error
<<< "cannot open file:", filepath >>>;
// close
fio.close();
// return
return fio;
}

string str;
string line;
// read the file, counting non-empty lines
while( fio.more() )
{
// read each line
fio.readLine().trim() => str;
// check if empty line
if( str != "" )
{
numPoints++;
str => line;
}
}

// a string tokenizer
StringTokenizer tokenizer;
// set to last non-empty line
tokenizer.set( line );
// -1 (to account for label)
-1 => numCoeffs;
// see how many, including label name
while( tokenizer.more() )
{
tokenizer.next();
numCoeffs++;
}

// check
if( numPoints == 0 || numCoeffs <= 0 )
{
<<< "no data in file:", filepath >>>;
fio.close();
return fio;
}

// print
<<< "# of data points:", numPoints, "dimensions:", numCoeffs >>>;

// done for now
return fio;
}


//------------------------------------------------------------------------------
// function: read the data
//------------------------------------------------------------------------------
fun void readData( FileIO fio )
{
// rewind the file reader
fio.seek( 0 );

// read file into arrays
string str;
int ci, ri;
while( fio => str )
{
// check for last
if( (ci != 0) && ((ci % numCoeffs) == 0) )
{
// read in label
str => inLabels[ri];
// set in map
1 => label2int[str];
// increment row
ri++;
// reset column
0 => ci;
}
else
{
// store feature value
Std.atof(str) => inFeatures[ri][ci];
// increment column
ci++;
}
}

// get keys from map
label2int.getKeys( labels );
// assign index
for( int i; i < labels.size(); i++ )
{ i => label2int[labels[i]]; }
// convert in labels to ints
for( int i; i < inLabels.size(); i++ )
{
// get index as int
label2int[inLabels[i]] => inLabelsInt[i];
}

// max length
0 => int maxLength;
// get lengths of label names for printing
for( int i; i < labels.size(); i++ )
{
// compare with max length
if( labels[i].length() > maxLength )
labels[i].length() => maxLength;
// append to array
lengths << labels[i].length();
}

// get lengths of label names for printing
for( int i; i < lengths.size(); i++ )
{
// get indent for length
indents << (maxLength + 1 ) - lengths[i];
}
}

3. Validate:

// input: pre-extracted features file with labels
"" => string FEATURES_FILE;
// if have arguments, override filename
if( me.args() > 0 ) me.arg(0) => FEATURES_FILE;
else
{ <<< "[usage]: chuck --silent x-validate:FILE", "" >>>; me.exit();}
//------------------------------------------------------------------------------
// expected features file format:
//------------------------------------------------------------------------------
// VALUE VALUE ... VALUE LABEL
// VALUE VALUE ... VALUE LABEL
// ... ... ... ... LABEL
// VALUE VALUE ... VALUE LABEL
//------------------------------------------------------------------------------


//------------------------------------------------------------------------------
// load feature data; read important global values like numPoints and numCoeffs
//------------------------------------------------------------------------------
// values to be read from file
0 => int numPoints; // number of points in data
0 => int numCoeffs; // number of dimensions in data
// file read PART 1: read over the file to get numPoints and numCoeffs
loadFile( FEATURES_FILE ) @=> FileIO @ fin;
// check
if( !fin.good() ) me.exit();

// labels of all data points
string inLabels[numPoints];
// label indices of all data points
int inLabelsInt[inLabels.size()];
// feature vectors of data points
float inFeatures[numPoints][numCoeffs];
// keys
string labels[0];
// use as map: labels to numbers
int label2int[0];


//------------------------------------------------------------------------------
// read the data
//------------------------------------------------------------------------------
readData( fin );


//------------------------------------------------------------------------------
// set up our KNN object to use for classification
// (KNN2 is a fancier version of the KNN object)
// -- run KNN2.help(); in a separate program to see its available functions --
//------------------------------------------------------------------------------
KNN2 knn;
// k nearest neighbors
10 => int K;
// results vector
float knnResult[labels.size()];


//------------------------------------------------------------------------------
// cross validation
//------------------------------------------------------------------------------
// number of folds
20 => int numFolds;
// number of folds to use for testing
4 => int numTestFolds;
// number of folds to use for training
numFolds - numTestFolds => int numTrainFolds;
// number of points in each fold
(numPoints / numFolds + 1) $ int => int numPointsPerFold;
// feature vectors of training data
float trainFeatures[numTrainFolds * numPointsPerFold][numCoeffs];
// labels of training data
int trainLabelsInt[numTrainFolds * numPointsPerFold];
// feature vectors of testing data
float testFeatures[numTestFolds * numPointsPerFold][numCoeffs];
// labels of testing data
int testLabelsInt[numTestFolds * numPointsPerFold];
// normalize the data
normalizeData();
// shuffle the data
shuffleData();
// cross validation
for( 0 => int i; i < numFolds / numTestFolds; i++)
{
// prepare training and testing data
prepareData( i );
// train
knn.train( trainFeatures, trainLabelsInt );
// test
0.0 => float accuracy;
for( 0 => int j; j < testLabelsInt.size(); j++ )
{
// predict
knn.predict( testFeatures[j], K, knnResult );
// aggregate accuracy
knnResult[ testLabelsInt[j] ] +=> accuracy;
}
// print accuracy
chout <= "fold " + i + " accuracy: " + ( accuracy / testLabelsInt.size() ) <= IO.newline();
}


//------------------------------------------------------------------------------
// function: normalizeData()
//------------------------------------------------------------------------------
fun void normalizeData()
{
// for each dimension
for( 0 => int i; i < numCoeffs; i++ )
{
// find min and max
inFeatures[0][i] => float min;
inFeatures[0][i] => float max;
for( 1 => int j; j < numPoints; j++ )
{
if( inFeatures[j][i] < min ) inFeatures[j][i] => min;
if( inFeatures[j][i] > max ) inFeatures[j][i] => max;
}
max - min => float range;
// normalize
for( 0 => int j; j < numPoints; j++ )
(inFeatures[j][i] - min) / range => inFeatures[j][i];
}
}


//------------------------------------------------------------------------------
// function: shuffleData()
//------------------------------------------------------------------------------
fun void shuffleData()
{
// prepare swap data
float swapFeatures[numCoeffs];
int swapLabelInt;
// shuffle the data
for( numPoints - 1 => int i; i > 0; i-- )
{
// random index
Math.random2( 0, i ) => int j;
// swap features
for( 0 => int k; k < numCoeffs; k++ )
{
inFeatures[i][k] => swapFeatures[k];
inFeatures[j][k] => inFeatures[i][k];
swapFeatures[k] => inFeatures[j][k];
}
// swap labels
inLabelsInt[i] => swapLabelInt;
inLabelsInt[j] => inLabelsInt[i];
swapLabelInt => inLabelsInt[j];
}
}


//------------------------------------------------------------------------------
// function: prepareData( int fold )
//------------------------------------------------------------------------------
fun void prepareData( int fold )
{
// test indices
fold * numTestFolds * numPointsPerFold => int testStart;
testStart + numTestFolds * numPointsPerFold => int testEnd;
// index
0 => int train_i;
0 => int test_i;
// prepare training and testing data
for( 0 => int i; i < numPoints; i++ )
{
// test
if( i >= testStart && i < testEnd )
{
// copy features
for( 0 => int j; j < numCoeffs; j++ )
inFeatures[i][j] => testFeatures[test_i][j];
// copy label
inLabelsInt[i] => testLabelsInt[test_i];
// increment
test_i++;
}
// train
else
{
// copy features
for( 0 => int j; j < numCoeffs; j++ )
inFeatures[i][j] => trainFeatures[train_i][j];
// copy label
inLabelsInt[i] => trainLabelsInt[train_i];
// increment
train_i++;
}
}
}


//------------------------------------------------------------------------------
// function: load data file
//------------------------------------------------------------------------------
fun FileIO loadFile( string filepath )
{
// reset
0 => numPoints;
0 => numCoeffs;

// load data
FileIO fio;
if( !fio.open( filepath, FileIO.READ ) )
{
// error
<<< "cannot open file:", filepath >>>;
// close
fio.close();
// return
return fio;
}

string str;
string line;
// read the file, counting non-empty lines
while( fio.more() )
{
// read each line
fio.readLine().trim() => str;
// check if empty line
if( str != "" )
{
numPoints++;
str => line;
}
}

// a string tokenizer
StringTokenizer tokenizer;
// set to last non-empty line
tokenizer.set( line );
// -1 (to account for label)
-1 => numCoeffs;
// see how many, including label name
while( tokenizer.more() )
{
tokenizer.next();
numCoeffs++;
}

// check
if( numPoints == 0 || numCoeffs <= 0 )
{
<<< "no data in file:", filepath >>>;
fio.close();
return fio;
}

// print
<<< "# of data points:", numPoints, "dimensions:", numCoeffs >>>;

// done for now
return fio;
}


//------------------------------------------------------------------------------
// function: read the data
//------------------------------------------------------------------------------
fun void readData( FileIO fio )
{
// rewind the file reader
fio.seek( 0 );

// read file into arrays
string str;
int ci, ri;
while( fio => str )
{
// check for last
if( (ci != 0) && ((ci % numCoeffs) == 0) )
{
// read in label
str => inLabels[ri];
// set in map
1 => label2int[str];
// increment row
ri++;
// reset column
0 => ci;
}
else
{
// store feature value
Std.atof(str) => inFeatures[ri][ci];
// increment column
ci++;
}
}

// get keys from map
label2int.getKeys( labels );
// assign index
for( int i; i < labels.size(); i++ )
{ i => label2int[labels[i]]; }
// convert in labels to ints
for( int i; i < inLabels.size(); i++ )
{
// get index as int
label2int[inLabels[i]] => inLabelsInt[i];
}
}

Phase 2 and 3 Code:

1. Mosaic extract:
// input audio file
"" => string INPUT;
// output file (if empty, will print to console)
"" => string OUTPUT_FILE;
// get from arguments
if( me.args() > 0 ) me.arg(0) => INPUT;
// get from arguments
if( me.args() > 1 ) me.arg(1) => OUTPUT_FILE;

// print usage
if( me.args() == 0 )
{
<<< "usage: chuck --silent mosaic-extract.ck:INPUT:OUTPUT", "" >>>;
<<< " |- INPUT: audio file (.wav), or text file (.txt) listing audio files", "" >>>;
<<< " |- OUTPUT: model file (.txt) to contain extracted feature vectors", "" >>>;
me.exit();
}

// detect; print helpful message
if( Machine.silent() == false )
{
<<< "-----------------", "" >>>;
<<< "[mosaic-extract]: chuck is currently running in REAL-TIME mode;", "" >>>;
<<< "[mosaic-extract]: this step has no audio output; may run faster in SILENT mode!", "" >>>;
<<< "[mosaic-extract]: to run in SILENT mode, restart chuck with --silent flag", "" >>>;
<<< "-----------------", "" >>>;
}


//------------------------------------------------------------------------------
// analysis network -- this determines which feature will be extracted
// NOTE: see examples/ai/features for examples of different features
// these must match the ones used in the synth program
//------------------------------------------------------------------------------
// audio input into a FFT
SndBuf audioFile => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;
fft =^ RollOff rolloff =^ combo;
fft =^ Chroma chroma =^ combo;
// Chroma might suit the guqin
//------------------------------------------------------------------------------
// analysis parameters -- useful for tuning your extraction
//------------------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC (internal to MFCC)
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size (how many samples each FFT frame analyzes)
//4096 => fft.size;
4410 => fft.size;
// set window type and size
// (the window conditions each frame of samples before analysis)
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
(fft.size()/2)::samp => dur HOP;
// how many frames to aggregate before averaging?
3 => int NUM_FRAMES;


//------------------------------------------------------------------------------
// OUTPUT: prepare for output
//------------------------------------------------------------------------------
// a feature frame
float featureFrame[NUM_DIMENSIONS];
// how many input files
0 => int NUM_FILES;

// output reference, default is error stream (cherr)
cherr @=> IO @ theOut;
// instantiate
FileIO fout;
// output file
if( OUTPUT_FILE != "" )
{
// print
<<< "opening file for output:", OUTPUT_FILE >>>;
// open
fout.open( OUTPUT_FILE, FileIO.WRITE );
// test
if( !fout.good() )
{
<<< " |- cannot open file for writing...", "" >>>;
me.exit();
}
// override
fout @=> theOut;
}


//------------------------------------------------------------------------------
// INPUT: prepare for iterating over input data and extract features
//------------------------------------------------------------------------------

// array input audio files
string filenames[0];
// parse INPUT, which may be an audio file (.wav) or a list of filenames (.txt)
if( !parseInput( INPUT, filenames ) ) me.exit();

// loop over filenames
for( int i; i < filenames.size(); i++)
{
// extract the file
if( !extractTrajectory( me.dir()+filenames[i], filenames[i], i, theOut ) )
{
// issue warning
cherr <= "[mosaic-extract]: problem extracting (and skipping): " <= filenames[i] <= IO.newline();
// skip
continue;
}
}

// flush output
theOut.flush();
// close
theOut.close();


//------------------------------------------------------------------------------
// extractTrajectory() -- extracts and outputs feature vectors from a single file
//------------------------------------------------------------------------------
fun int extractTrajectory( string inputFilePath, string shortName, int fileIndex, IO out )
{
// increment
NUM_FILES++;
// log
cherr <= "[" <= NUM_FILES <= "] extracting features: " <= inputFilePath <= IO.newline();

// load by block to speed up IO
fft.size() => audioFile.chunks;
// read the audio file
inputFilePath => audioFile.read;
// file position (in samples)
int pos;
// frame index
int index;

while( audioFile.pos() < audioFile.samples() )
{
// remember the starting pos of each vector
audioFile.pos() => pos;
// let one FFT-size of time pass (to buffer)
fft.size()::samp => now;
// zero out
featureFrame.zero();
// loop over frames
for( int i; i < NUM_FRAMES; i++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// for each dimension
for( int d; d < NUM_DIMENSIONS; d++ )
{
// copy
combo.fval(d) +=> featureFrame[d];
}
// advance time
HOP => now;
}

// print label name and endline
out <= shortName <= " " <= (pos::samp)/second <= " ";

//-------------------------------------------------------------
// average into a single feature vector per file
// NOTE: this can be easily modified to N feature vectors
// spread out over the length of an audio file; for now
// we will just do one feature vector per file
//-------------------------------------------------------------
for( int d; d < NUM_DIMENSIONS; d++ )
{
// average by total number of frames
NUM_FRAMES /=> featureFrame[d];
// print the MFCC results
out <= featureFrame[d] <= " ";
}

out <= IO.newline();

// print .
if( out != cherr ) { cherr <= "."; cherr.flush(); }

// increment index
index++;
}

// print newline to screen
if( out != cherr ) cherr <= IO.newline();

// done
return true;
}


//------------------------------------------------------------------------------
// parse INPUT argument -- either single audio file or a text file containing a list
//------------------------------------------------------------------------------
fun int parseInput( string input, string results[] )
{
// clear results
results.clear();
// see if input is a file name
if( input.rfind( ".wav" ) > 0 || input.rfind( ".aiff" ) > 0 )
{
// make a new string (since << appends by reference)
input => string sss;
// append
results << sss;
}
else
{
// load data
FileIO fio;
if( !fio.open( me.dir() + input, FileIO.READ ) )
{
// error
<<< "cannot open file:", me.dir() + input >>>;
// close
fio.close();
// return done
return false;
}

// read each filename
while( fio.more() )
{
// read each line
fio.readLine().trim() => string line;
// if not empty
if( line != "" )
{
results << line;
}
}
}

return true;
}

2. Mosaic synthesis microphone:

// input: pre-extracted model file
string FEATURES_FILE;
// if have arguments, override filename
if( me.args() > 0 )
{
me.arg(0) => FEATURES_FILE;
}
else
{
// print usage
<<< "usage: chuck mosaic-synth-mic.ck:INPUT", "" >>>;
<<< " |- INPUT: model file (.txt) containing extracted feature vectors", "" >>>;
me.exit();
}
//------------------------------------------------------------------------------
// expected model file format; each VALUE is a feature value
// (feel free to adapt and modify the file format as needed)
//------------------------------------------------------------------------------
// filePath windowStartTime VALUE VALUE ... VALUE
// filePath windowStartTime VALUE VALUE ... VALUE
// ...
// filePath windowStartTime VALUE VALUE ... VALUE
//------------------------------------------------------------------------------


//------------------------------------------------------------------------------
// unit analyzer network: *** this must match the features in the features file
//------------------------------------------------------------------------------
// audio input into a FFT
adc => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;
fft =^ RollOff rolloff =^ combo;
fft =^ Chroma chroma =^ combo;

//-----------------------------------------------------------------------------
// setting analysis parameters -- also should match what was used during extraction
//-----------------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
// 13 is a commonly used value; using 20 here
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size
4096 => fft.size;
// set window type and size
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
(fft.size()/2)::samp => dur HOP;
// how many frames to aggregate before averaging?
// (this does not need to match extraction; might play with this number)
3 => int NUM_FRAMES;
// how much time to aggregate features for each file
fft.size()::samp * NUM_FRAMES => dur EXTRACT_TIME;


//------------------------------------------------------------------------------
// unit generator network: for real-time sound synthesis
//------------------------------------------------------------------------------
// how many max at any time?
16 => int NUM_VOICES;
// a number of audio buffers to cycle between
SndBuf buffers[NUM_VOICES]; ADSR envs[NUM_VOICES]; Pan2 pans[NUM_VOICES];
// set parameters
for( int i; i < NUM_VOICES; i++ )
{
// connect audio
buffers[i] => envs[i] => pans[i] => dac;
// set chunk size (how much to load at a time)
// this is important when reading from large files
// if this is not set, SndBuf.read() will load the entire file immediately
fft.size() => buffers[i].chunks;
// randomize pan
Math.random2f(-.75,.75) => pans[i].pan;
// set envelope parameters
envs[i].set( EXTRACT_TIME, EXTRACT_TIME/2, 1, EXTRACT_TIME );
}


//------------------------------------------------------------------------------
// load feature data; read important global values like numPoints and numCoeffs
//------------------------------------------------------------------------------
// values to be read from file
0 => int numPoints; // number of points in data
0 => int numCoeffs; // number of dimensions in data
// file read PART 1: read over the file to get numPoints and numCoeffs
loadFile( FEATURES_FILE ) @=> FileIO @ fin;
// check
if( !fin.good() ) me.exit();
// check dimension at least
if( numCoeffs != NUM_DIMENSIONS )
{
// error
<<< "[error] expecting:", NUM_DIMENSIONS, "dimensions; but features file has:", numCoeffs >>>;
// stop
me.exit();
}


//------------------------------------------------------------------------------
// each Point corresponds to one line in the input file, which is one audio window
//------------------------------------------------------------------------------
class AudioWindow
{
// unique point index (use this to lookup feature vector)
int uid;
// which file did this come from (in the files array)
int fileIndex;
// starting time in that file (in seconds)
float windowTime;

// set
fun void set( int id, int fi, float wt )
{
id => uid;
fi => fileIndex;
wt => windowTime;
}
}

// array of all points in model file
AudioWindow windows[numPoints];
// unique filenames; we will append to this
string files[0];
// map of filenames loaded
int filename2state[0];
// feature vectors of data points
float inFeatures[numPoints][numCoeffs];
// generate array of unique indices
int uids[numPoints]; for( int i; i < numPoints; i++ ) i => uids[i];

// use this for new input
float features[NUM_FRAMES][numCoeffs];
// average values of coefficients across frames
float featureMean[numCoeffs];


//------------------------------------------------------------------------------
// read the data
//------------------------------------------------------------------------------
readData( fin );


//------------------------------------------------------------------------------
// set up our KNN object to use for classification
// (KNN2 is a fancier version of the KNN object)
// -- run KNN2.help(); in a separate program to see its available functions --
//------------------------------------------------------------------------------
KNN2 knn;
// k nearest neighbors
2 => int K;
// results vector (indices of k nearest points)
int knnResult[K];
// knn train
knn.train( inFeatures, uids );


// used to rotate sound buffers
0 => int which;

//------------------------------------------------------------------------------
// SYNTHESIS!!
// this function is meant to be sporked so it can be stacked in time
//------------------------------------------------------------------------------
fun void synthesize( int uid )
{
// get the buffer to use
buffers[which] @=> SndBuf @ sound;
// get the envelope to use
envs[which] @=> ADSR @ envelope;
// increment and wrap if needed
which++; if( which >= buffers.size() ) 0 => which;

// get a reference to the audio fragment to synthesize
windows[uid] @=> AudioWindow @ win;
// get filename
files[win.fileIndex] => string filename;
// load into sound buffer
filename => sound.read;
// seek to the window start time
((win.windowTime::second)/samp) $ int => sound.pos;

// print what we are about to play
chout <= "synthesizing window: ";
// print label
chout <= win.uid <= "["
<= win.fileIndex <= ":"
<= win.windowTime <= ":POSITION="
<= sound.pos() <= "]";
// endline
chout <= IO.newline();

// open the envelope, overlap add this into the overall audio
envelope.keyOn();
// wait
(EXTRACT_TIME*3)-envelope.releaseTime() => now;
// start the release
envelope.keyOff();
// wait
envelope.releaseTime() => now;
}


//------------------------------------------------------------------------------
// real-time similarity retrieval loop
//------------------------------------------------------------------------------
while( true )
{
// aggregate features over a period of time
for( int frame; frame < NUM_FRAMES; frame++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// get features
for( int d; d < NUM_DIMENSIONS; d++)
{
// store them in current frame
combo.fval(d) => features[frame][d];
}
// advance time
HOP => now;
}

// compute means for each coefficient across frames
for( int d; d < NUM_DIMENSIONS; d++ )
{
// zero out
0.0 => featureMean[d];
// loop over frames
for( int j; j < NUM_FRAMES; j++ )
{
// add
features[j][d] +=> featureMean[d];
}
// average
NUM_FRAMES /=> featureMean[d];
}

//-------------------------------------------------
// search using KNN2; results filled in knnResults,
// which should be the indices of the k nearest points
//-------------------------------------------------
knn.search( featureMean, K, knnResult );

// SYNTHESIZE THIS
spork ~ synthesize( knnResult[Math.random2(0,knnResult.size()-1)] );
}
//------------------------------------------------------------------------------
// end of real-time similarity retrieval loop
//------------------------------------------------------------------------------




//------------------------------------------------------------------------------
// function: load data file
//------------------------------------------------------------------------------
fun FileIO loadFile( string filepath )
{
// reset
0 => numPoints;
0 => numCoeffs;

// load data
FileIO fio;
if( !fio.open( filepath, FileIO.READ ) )
{
// error
<<< "cannot open file:", filepath >>>;
// close
fio.close();
// return
return fio;
}

string str;
string line;
// read the first non-empty line
while( fio.more() )
{
// read each line
fio.readLine().trim() => str;
// check if empty line
if( str != "" )
{
numPoints++;
str => line;
}
}

// a string tokenizer
StringTokenizer tokenizer;
// set to last non-empty line
tokenizer.set( line );
// -2 (to account for filePath and windowTime)
-2 => numCoeffs;
// count the remaining tokens (the feature dimensions)
while( tokenizer.more() )
{
tokenizer.next();
numCoeffs++;
}

// see if we made it past the initial fields
if( numCoeffs < 0 ) 0 => numCoeffs;

// check
if( numPoints == 0 || numCoeffs <= 0 )
{
<<< "no data in file:", filepath >>>;
fio.close();
return fio;
}

// print
<<< "# of data points:", numPoints, "dimensions:", numCoeffs >>>;

// done for now
return fio;
}


//------------------------------------------------------------------------------
// function: read the data
//------------------------------------------------------------------------------
fun void readData( FileIO fio )
{
// rewind the file reader
fio.seek( 0 );

// a line
string line;
// a string tokenizer
StringTokenizer tokenizer;

// points index
0 => int index;
// file index
0 => int fileIndex;
// file name
string filename;
// window start time
float windowTime;
// coefficient
int c;

// read the first non-empty line
while( fio.more() )
{
// read each line
fio.readLine().trim() => line;
// check if empty line
if( line != "" )
{
// set to last non-empty line
tokenizer.set( line );
// file name
tokenizer.next() => filename;
// window start time
tokenizer.next() => Std.atof => windowTime;
// have we seen this filename yet?
if( filename2state[filename] == 0 )
{
// make a new string (<< appends by reference)
filename => string sss;
// append
files << sss;
// new id
files.size() => filename2state[filename];
}
// get fileindex
filename2state[filename]-1 => fileIndex;
// set
windows[index].set( index, fileIndex, windowTime );

// zero out
0 => c;
// for each dimension in the data
repeat( numCoeffs )
{
// read next coefficient
tokenizer.next() => Std.atof => inFeatures[index][c];
// increment
c++;
}

// increment global index
index++;
}
}
}
