Phase Three: Musical Mosaic

Mxzhan
23 min read · Feb 8, 2024


Video:

Reflection:

Creating a Musical Mosaic through computational methods, particularly with ChucK, was a unique and enriching experience that bridged traditional Chinese music with Western musical elements. The project’s ambition was to blend the distinct tones of the guqin, a classical Chinese stringed instrument, with the universal appeal of Western instruments like the piano. This endeavor required not just a deep appreciation for the musical traditions involved but also technical proficiency in manipulating sound through software.

The most challenging aspect of this project was extracting the precise sound of the guqin. The guqin, with its serene and complex harmonic overtones, poses a significant challenge for digital representation, especially when attempting to preserve its authenticity and depth. Finding the right FFT combination was a task that demanded patience, experimentation, and a nuanced understanding of both the instrument’s acoustic properties and the mathematical principles underpinning sound analysis.
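
To make that search more concrete, below is a minimal sketch of the kind of ChucK analysis network I kept re-wiring: swap unit analyzers in and out, adjust the FFT size, re-run, and compare what the features capture. The particular analyzers, the FFT size, and the "guqin.wav" filename here are illustrative placeholders rather than my final settings (the full versions are in the code at the end of this post).

// sketch: try different feature combinations on a guqin recording
SndBuf guqin => FFT fft;                   // audio file into an FFT
FeatureCollector combo => blackhole;       // collect features into one vector
fft =^ Centroid centroid =^ combo;         // spectral "center of mass"
fft =^ RollOff rolloff =^ combo;           // frequency below which most energy sits
fft =^ Chroma chroma =^ combo;             // pitch-class energy (nice for plucked strings)
fft =^ MFCC mfcc =^ combo;                 // overall timbral envelope
2048 => fft.size;                          // larger = finer frequency detail, blurrier timing
Windowing.hann(fft.size()) => fft.window;  // window each analysis frame
me.dir() + "guqin.wav" => guqin.read;      // placeholder filename
fft.size()::samp => now;                   // let one FFT-size of audio flow in
combo.upchuck();                           // compute everything connected via =^
<<< "total feature dimensions:", combo.fvals().size() >>>;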

Another big challenge was the piano. I hadn’t played the piano in over ten years, so deciding to include it in the project meant I had to brush up on my skills. I practiced a lot to get comfortable playing again and to figure out how to fit the piano’s sound into the mix through my editing. On top of that, recording the piano and making sure it sounded good digitally added another layer of difficulty. Questions like where to place the microphone and how to capture the true sound of the instrument were new challenges I had to tackle.

Finally, thanks to my guqin tutor, who helped me find my old guqin recordings; to Ge, for providing this platform for me to reconnect with my musical skills and for the code that let me have fun; and to my grandfather’s old house in Shaoxing, where I was able to gather plenty of material for the empty mirror frames.

Phase 1 Code:

1. Feature extract:
// output file (if empty, will print to console)
"" => string OUTPUT_FILE;
// get from arguments
if( me.args() > 0 ) me.arg(0) => OUTPUT_FILE;

// check
if( Machine.silent() == false )
{
// print helpful message
<<< "-----------------", "" >>>;
<<< "[feature-extract]: chuck is currently running in REAL-TIME mode;", "" >>>;
<<< "[feature-extract]: this step has no audio; may run much faster in SILENT mode!", "" >>>;
<<< "[feature-extract]: to run in SILENT mode, restart chuck with --silent flag", "" >>>;
<<< "-----------------", "" >>>;
}


//---------------------------------------------------------------------
// analysis network -- this determines which feature will be extracted
//---------------------------------------------------------------------
// mayshu - modifying this for different feature combinations
// see Unit Analyzers in the API reference for the available features
// pay attention to spectral features: three dimensions -- time, one vertical time slice, where the energy is located
// => connects an audio signal
// =^ is the upchuck operator, which connects an analysis signal
// remember to use =^ (upchuck) whenever analysis is needed

// audio input into a FFT
SndBuf audioFile => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
//fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;
fft =^ RollOff rolloff =^ combo;
fft =^ ZeroX zerox =^ combo;
//fft =^ Chroma chroma =^ combo;
//fft =^ Kurtosis kurtosis =^ combo;


//---------------------------------------------------------------------
// setting analysis parameters -- important for tuning your extraction
//---------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC (internal to MFCC)
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size
2048 => fft.size;
// set window type and size
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
2048::samp => dur HOP;


//---------------------------------------------------------------------
// training data -- preparation specific to a train dataset
//---------------------------------------------------------------------
// labels (and filename roots)
// change these if using a different dataset -- the labels must match the file names in the dataset folder
["blues", "classical", "country", "disco", "hiphop",
"jazz", "metal", "pop", "reggae", "rock"] @=> string labels[];
// how many per label
100 => int NUM_EXAMPLES_PER_LABEL;
// how much time to aggregate features for each file
30::second => dur EXTRACT_TIME;
// given EXTRACT_TIME and HOP, how many frames per file?
(EXTRACT_TIME / HOP) $ int => int numFrames;
// relative path
//"data/gtzan/genres_original/" => string PATH; // NOTE: this must point to the right dataset path
"gtzan/genres_original/" => string PATH;

// a feature frame
float featureFrame[NUM_DIMENSIONS];
// how many input files
0 => int NUM_FILES;

// output reference, default is error stream (cherr)
cherr @=> IO @ theOut;
// instantiate
FileIO fout;
// output file
if( OUTPUT_FILE != "" )
{
// print
<<< "opening file for output:", OUTPUT_FILE >>>;
// open
fout.open( OUTPUT_FILE, FileIO.WRITE );
// test
if( !fout.good() )
{
<<< "cannot open file for writing...", "" >>>;
me.exit();
}
// override
fout @=> theOut;
}


//---------------------------------------------------------------------
// extraction -- iterating over entire training dataset
//---------------------------------------------------------------------

// filename
string filename;
// loop over labels
for( int i; i < labels.size(); i++)
{
// get current label
labels[i] => string label;
// loop over examples under each label
for( int j; j < NUM_EXAMPLES_PER_LABEL; j++ )
{
// construct filepath
me.dir() + PATH + label + "/" + label + ".000" + (j<10?"0":"") + j + ".wav" => filename;
// extract the file
if( !extractFeatures( filename, label, theOut ) )
{
// issue warning
cherr <= "PROBLEM during extraction: " <= filename <= IO.newline();
// bail out
me.exit();
}
}
}

// flush the output
theOut.flush();


//---------------------------------------------------------------------
// function: extract and print features from a single file
//---------------------------------------------------------------------
fun int extractFeatures( string inputFilePath, string label, IO out )
{
// increment
NUM_FILES++;
// log
cherr <= "[" <= NUM_FILES <= "] extracting features: " <= inputFilePath <= IO.newline();

// load by block to speed up IO
2048 => audioFile.chunks;
// read the audio file
inputFilePath => audioFile.read;
// zero out
featureFrame.zero();

// let one FFT-size of time pass (to buffer)
fft.size()::samp => now;
// loop over frames
for( int i; i < numFrames; i++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// for each dimension
for( int d; d < NUM_DIMENSIONS; d++ )
{
// copy
combo.fval(d) +=> featureFrame[d];
}
// advance time
HOP => now;
}

//-------------------------------------------------------------
// average into a single feature vector per file
// NOTE: this can be easily modified to N feature vectors
// spread out over the length of an audio file; for now
// we will just do one feature vector per file
//-------------------------------------------------------------
for( int d; d < NUM_DIMENSIONS; d++ )
{
// average by total number of frames
numFrames /=> featureFrame[d];
// print the MFCC results
out <= featureFrame[d] <= " ";
}
// print label name and endline
out <= label <= IO.newline();

// done
return true;
}

2. Genre classify:

// input: pre-extracted features file with labels
// me.dir() + "data/gtzan-23.txt" => string FEATURES_FILE; // NOTE: always use the right path to the features file
me.dir() + "mayshugtzan-24.txt" => string FEATURES_FILE;
// if have arguments, override filename
if( me.args() > 0 ) me.arg(0) => FEATURES_FILE;
//------------------------------------------------------------------------------
// expected features file format:
//------------------------------------------------------------------------------
// VALUE VALUE ... VALUE LABEL
// VALUE VALUE ... VALUE LABEL
// ... ... ... ... LABEL
// VALUE VALUE ... VALUE LABEL
//------------------------------------------------------------------------------


//------------------------------------------------------------------------------
// unit analyzer network: this must match the features in the features file
//------------------------------------------------------------------------------
// audio input into a FFT
adc => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;


//-----------------------------------------------------------------------------
// setting analysis parameters -- also should match what was used during extraction
//-----------------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
// 13 is a commonly used value; using 20 here
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size
2048 => fft.size;
// set window type and size
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
2048::samp => dur HOP;


//------------------------------------------------------------------------------
// load feature data; read important global values like numPoints and numCoeffs
//------------------------------------------------------------------------------
// values to be read from file
0 => int numPoints; // number of points in data
0 => int numCoeffs; // number of dimensions in data
// file read PART 1: read over the file to get numPoints and numCoeffs
loadFile( FEATURES_FILE ) @=> FileIO @ fin;
// check
if( !fin.good() ) me.exit();
// check dimension
if( numCoeffs != NUM_DIMENSIONS )
{
// error
<<< "[error] expecting:", NUM_DIMENSIONS, "dimensions; but features file has:", numCoeffs >>>;
// stop
me.exit();
}

// labels of all data points
string inLabels[numPoints];
// label indices of all data points
int inLabelsInt[inLabels.size()];
// feature vectors of data points
float inFeatures[numPoints][numCoeffs];
// keys
string labels[0];
// use as map: labels to numbers
int label2int[0];

//------------------------------------------------------------------------------
// array for storing features
//------------------------------------------------------------------------------
// how much time to aggregate features for each file
// (this does not need to match extraction; might play with this number)
.5::second => dur EXTRACT_TIME;
// given EXTRACT_TIME and HOP, how many frames per file?
(EXTRACT_TIME / HOP) $ int => int numFrames;

// use this for new input
float features[numFrames][numCoeffs];
// average values of coefficients across frames
float featureMean[numCoeffs];
// for printing
int lengths[0];
// for printing (how much to indent)
int indents[0];


//------------------------------------------------------------------------------
// read the data
//------------------------------------------------------------------------------
readData( fin );


//------------------------------------------------------------------------------
// set up our KNN object to use for classification
// (KNN2 is a fancier version of the KNN object)
// -- run KNN2.help(); in a separate program to see its available functions --
//------------------------------------------------------------------------------
KNN2 knn;
// k nearest neighbors
10 => int K;
// results vector
float knnResult[labels.size()];
// knn train
knn.train( inFeatures, inLabelsInt );




//------------------------------------------------------------------------------
// real-time classification loop
//------------------------------------------------------------------------------
while( true )
{
// aggregate features over a period of time
for( int frame; frame < numFrames; frame++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// get features
for( int d; d < NUM_DIMENSIONS; d++)
{
// store them in current frame
combo.fval(d) => features[frame][d];
}
// advance time
HOP => now;
}

// compute means for each coefficient across frames
for( int d; d < NUM_DIMENSIONS; d++ )
{
// zero out
0.0 => featureMean[d];
// loop over frames
for( int j; j < numFrames; j++ )
{
// add
features[j][d] +=> featureMean[d];
}
// average
numFrames /=> featureMean[d];
}

//-------------------------------------------------
// predict using KNN2; results filled in knnResults
//-------------------------------------------------
knn.predict( featureMean, K, knnResult );

// print results
chout <= "-------------------------------------------------" <= IO.newline();
// print for each label
for( int i; i < knnResult.size(); i++ )
{
// print label
chout <= labels[i] <= ":";
// print indentation
for( int ii; ii < indents[i]; ii++ ) { chout <= " "; }
// print probability
chout <= knnResult[i] <= IO.newline();
}
}
//------------------------------------------------------------------------------
// end of real-time classification loop
//------------------------------------------------------------------------------




//------------------------------------------------------------------------------
// function: load data file
//------------------------------------------------------------------------------
fun FileIO loadFile( string filepath )
{
// reset
0 => numPoints;
0 => numCoeffs;

// load data
FileIO fio;
if( !fio.open( filepath, FileIO.READ ) )
{
// error
<<< "cannot open file:", filepath >>>;
// close
fio.close();
// return
return fio;
}

string str;
string line;
// read the file, counting non-empty lines
while( fio.more() )
{
// read each line
fio.readLine().trim() => str;
// check if empty line
if( str != "" )
{
numPoints++;
str => line;
}
}

// a string tokenizer
StringTokenizer tokenizer;
// set to last non-empty line
tokenizer.set( line );
// -1 (to account for label)
-1 => numCoeffs;
// see how many, including label name
while( tokenizer.more() )
{
tokenizer.next();
numCoeffs++;
}

// check
if( numPoints == 0 || numCoeffs <= 0 )
{
<<< "no data in file:", filepath >>>;
fio.close();
return fio;
}

// print
<<< "# of data points:", numPoints, "dimensions:", numCoeffs >>>;

// done for now
return fio;
}


//------------------------------------------------------------------------------
// function: read the data
//------------------------------------------------------------------------------
fun void readData( FileIO fio )
{
// rewind the file reader
fio.seek( 0 );

// read file into arrays
string str;
int ci, ri;
while( fio => str )
{
// check for last
if( (ci != 0) && ((ci % numCoeffs) == 0) )
{
// read in label
str => inLabels[ri];
// set in map
1 => label2int[str];
// increment row
ri++;
// reset column
0 => ci;
}
else
{
// store feature value
Std.atof(str) => inFeatures[ri][ci];
// increment column
ci++;
}
}

// get keys from map
label2int.getKeys( labels );
// assign index
for( int i; i < labels.size(); i++ )
{ i => label2int[labels[i]]; }
// convert in labels to ints
for( int i; i < inLabels.size(); i++ )
{
// get index as int
label2int[inLabels[i]] => inLabelsInt[i];
}

// max length
0 => int maxLength;
// get lengths of label names for printing
for( int i; i < labels.size(); i++ )
{
// compare with max length
if( labels[i].length() > maxLength )
labels[i].length() => maxLength;
// append to array
lengths << labels[i].length();
}

// get lengths of label names for printing
for( int i; i < lengths.size(); i++ )
{
// get indent for length
indents << (maxLength + 1 ) - lengths[i];
}
}

3. Validate:

// input: pre-extracted features file with labels
"" => string FEATURES_FILE;
// if have arguments, override filename
if( me.args() > 0 ) me.arg(0) => FEATURES_FILE;
else
{ <<< "[usage]: chuck --silent x-validate:FILE", "" >>>; me.exit();}
//------------------------------------------------------------------------------
// expected features file format:
//------------------------------------------------------------------------------
// VALUE VALUE ... VALUE LABEL
// VALUE VALUE ... VALUE LABEL
// ... ... ... ... LABEL
// VALUE VALUE ... VALUE LABEL
//------------------------------------------------------------------------------


//------------------------------------------------------------------------------
// load feature data; read important global values like numPoints and numCoeffs
//------------------------------------------------------------------------------
// values to be read from file
0 => int numPoints; // number of points in data
0 => int numCoeffs; // number of dimensions in data
// file read PART 1: read over the file to get numPoints and numCoeffs
loadFile( FEATURES_FILE ) @=> FileIO @ fin;
// check
if( !fin.good() ) me.exit();

// labels of all data points
string inLabels[numPoints];
// label indices of all data points
int inLabelsInt[inLabels.size()];
// feature vectors of data points
float inFeatures[numPoints][numCoeffs];
// keys
string labels[0];
// use as map: labels to numbers
int label2int[0];


//------------------------------------------------------------------------------
// read the data
//------------------------------------------------------------------------------
readData( fin );


//------------------------------------------------------------------------------
// set up our KNN object to use for classification
// (KNN2 is a fancier version of the KNN object)
// -- run KNN2.help(); in a separate program to see its available functions --
//------------------------------------------------------------------------------
KNN2 knn;
// k nearest neighbors
10 => int K;
// results vector
float knnResult[labels.size()];


//------------------------------------------------------------------------------
// cross validation
//------------------------------------------------------------------------------
// number of folds
20 => int numFolds;
// number of folds to use for testing
4 => int numTestFolds;
// number of folds to use for training
numFolds - numTestFolds => int numTrainFolds;
// number of points in each fold
(numPoints / numFolds + 1) $ int => int numPointsPerFold;
// feature vectors of training data
float trainFeatures[numTrainFolds * numPointsPerFold][numCoeffs];
// labels of training data
int trainLabelsInt[numTrainFolds * numPointsPerFold];
// feature vectors of testing data
float testFeatures[numTestFolds * numPointsPerFold][numCoeffs];
// labels of testing data
int testLabelsInt[numTestFolds * numPointsPerFold];
// normalize the data
normalizeData();
// shuffle the data
shuffleData();
// cross validation
for( 0 => int i; i < numFolds / numTestFolds; i++)
{
// prepare training and testing data
prepareData( i );
// train
knn.train( trainFeatures, trainLabelsInt );
// test
0.0 => float accuracy;
for( 0 => int j; j < testLabelsInt.size(); j++ )
{
// predict
knn.predict( testFeatures[j], K, knnResult );
// aggregate accuracy
knnResult[ testLabelsInt[j] ] +=> accuracy;
}
// print accuracy
chout <= "fold " + i + " accuracy: " + ( accuracy / testLabelsInt.size() ) <= IO.newline();
}


//------------------------------------------------------------------------------
// function: normalizeData()
//------------------------------------------------------------------------------
fun void normalizeData()
{
// for each dimension
for( 0 => int i; i < numCoeffs; i++ )
{
// find min and max
inFeatures[0][i] => float min;
inFeatures[0][i] => float max;
for( 1 => int j; j < numPoints; j++ )
{
if( inFeatures[j][i] < min ) inFeatures[j][i] => min;
if( inFeatures[j][i] > max ) inFeatures[j][i] => max;
}
max - min => float range;
// normalize
for( 0 => int j; j < numPoints; j++ )
(inFeatures[j][i] - min) / range => inFeatures[j][i];
}
}


//------------------------------------------------------------------------------
// function: shuffleData()
//------------------------------------------------------------------------------
fun void shuffleData()
{
// prepare swap data
float swapFeatures[numCoeffs];
int swapLabelInt;
// shuffle the data
for( numPoints - 1 => int i; i > 0; i-- )
{
// random index
Math.random2( 0, i ) => int j;
// swap features
for( 0 => int k; k < numCoeffs; k++ )
{
inFeatures[i][k] => swapFeatures[k];
inFeatures[j][k] => inFeatures[i][k];
swapFeatures[k] => inFeatures[j][k];
}
// swap labels
inLabelsInt[i] => swapLabelInt;
inLabelsInt[j] => inLabelsInt[i];
swapLabelInt => inLabelsInt[j];
}
}


//------------------------------------------------------------------------------
// function: prepareData( int fold )
//------------------------------------------------------------------------------
fun void prepareData( int fold )
{
// test indices
fold * numTestFolds * numPointsPerFold => int testStart;
testStart + numTestFolds * numPointsPerFold => int testEnd;
// index
0 => int train_i;
0 => int test_i;
// prepare training and testing data
for( 0 => int i; i < numPoints; i++ )
{
// test
if( i >= testStart && i < testEnd )
{
// copy features
for( 0 => int j; j < numCoeffs; j++ )
inFeatures[i][j] => testFeatures[test_i][j];
// copy label
inLabelsInt[i] => testLabelsInt[test_i];
// increment
test_i++;
}
// train
else
{
// copy features
for( 0 => int j; j < numCoeffs; j++ )
inFeatures[i][j] => trainFeatures[train_i][j];
// copy label
inLabelsInt[i] => trainLabelsInt[train_i];
// increment
train_i++;
}
}
}


//------------------------------------------------------------------------------
// function: load data file
//------------------------------------------------------------------------------
fun FileIO loadFile( string filepath )
{
// reset
0 => numPoints;
0 => numCoeffs;

// load data
FileIO fio;
if( !fio.open( filepath, FileIO.READ ) )
{
// error
<<< "cannot open file:", filepath >>>;
// close
fio.close();
// return
return fio;
}

string str;
string line;
// read the file, counting non-empty lines
while( fio.more() )
{
// read each line
fio.readLine().trim() => str;
// check if empty line
if( str != "" )
{
numPoints++;
str => line;
}
}

// a string tokenizer
StringTokenizer tokenizer;
// set to last non-empty line
tokenizer.set( line );
// -1 (to account for label)
-1 => numCoeffs;
// see how many, including label name
while( tokenizer.more() )
{
tokenizer.next();
numCoeffs++;
}

// check
if( numPoints == 0 || numCoeffs <= 0 )
{
<<< "no data in file:", filepath >>>;
fio.close();
return fio;
}

// print
<<< "# of data points:", numPoints, "dimensions:", numCoeffs >>>;

// done for now
return fio;
}


//------------------------------------------------------------------------------
// function: read the data
//------------------------------------------------------------------------------
fun void readData( FileIO fio )
{
// rewind the file reader
fio.seek( 0 );

// read file into arrays
string str;
int ci, ri;
while( fio => str )
{
// check for last
if( (ci != 0) && ((ci % numCoeffs) == 0) )
{
// read in label
str => inLabels[ri];
// set in map
1 => label2int[str];
// increment row
ri++;
// reset column
0 => ci;
}
else
{
// store feature value
Std.atof(str) => inFeatures[ri][ci];
// increment column
ci++;
}
}

// get keys from map
label2int.getKeys( labels );
// assign index
for( int i; i < labels.size(); i++ )
{ i => label2int[labels[i]]; }
// convert in labels to ints
for( int i; i < inLabels.size(); i++ )
{
// get index as int
label2int[inLabels[i]] => inLabelsInt[i];
}
}

Phase 2 and 3 Code:

1. Mosaic extract:
// input audio file
"" => string INPUT;
// output file (if empty, will print to console)
"" => string OUTPUT_FILE;
// get from arguments
if( me.args() > 0 ) me.arg(0) => INPUT;
// get from arguments
if( me.args() > 1 ) me.arg(1) => OUTPUT_FILE;

// print usage
if( me.args() == 0 )
{
<<< "usage: chuck --silent mosaic-extract.ck:INPUT:OUTPUT", "" >>>;
<<< " |- INPUT: audio file (.wav), or text file (.txt) listing audio files", "" >>>;
<<< " |- OUTPUT: model file (.txt) to contain extracted feature vectors", "" >>>;
me.exit();
}

// detect; print helpful message
if( Machine.silent() == false )
{
<<< "-----------------", "" >>>;
<<< "[mosaic-extract]: chuck is currently running in REAL-TIME mode;", "" >>>;
<<< "[mosaic-extract]: this step has no audio output; may run faster in SILENT mode!", "" >>>;
<<< "[mosaic-extract]: to run in SILENT mode, restart chuck with --silent flag", "" >>>;
<<< "-----------------", "" >>>;
}


//------------------------------------------------------------------------------
// analysis network -- this determines which feature will be extracted
// NOTE: see examples/ai/features for examples of different features
// these must match the ones used in the synth program
//------------------------------------------------------------------------------
// audio input into a FFT
SndBuf audioFile => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;
fft =^ RollOff rolloff =^ combo;
fft =^ Chroma chroma =^ combo;
// Chroma might suit the guqin
//------------------------------------------------------------------------------
// analysis parameters -- useful for tuning your extraction
//------------------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC (internal to MFCC)
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size (how many samples each FFT frame analyzes)
//4096 => fft.size;
4410 => fft.size;
// set window type and size
// (the window conditions each frame of samples before analysis)
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
(fft.size()/2)::samp => dur HOP;
// how many frames to aggregate before averaging?
3 => int NUM_FRAMES;


//------------------------------------------------------------------------------
// OUTPUT: prepare for output
//------------------------------------------------------------------------------
// a feature frame
float featureFrame[NUM_DIMENSIONS];
// how many input files
0 => int NUM_FILES;

// output reference, default is error stream (cherr)
cherr @=> IO @ theOut;
// instantiate
FileIO fout;
// output file
if( OUTPUT_FILE != "" )
{
// print
<<< "opening file for output:", OUTPUT_FILE >>>;
// open
fout.open( OUTPUT_FILE, FileIO.WRITE );
// test
if( !fout.good() )
{
<<< " |- cannot open file for writing...", "" >>>;
me.exit();
}
// override
fout @=> theOut;
}


//------------------------------------------------------------------------------
// INPUT: prepare for iterating over input data and extract features
//------------------------------------------------------------------------------

// array input audio files
string filenames[0];
// parse INPUT, which may be an audio file (.wav) or a list of filenames (.txt)
if( !parseInput( INPUT, filenames ) ) me.exit();

// loop over filenames
for( int i; i < filenames.size(); i++)
{
// extract the file
if( !extractTrajectory( me.dir()+filenames[i], filenames[i], i, theOut ) )
{
// issue warning
cherr <= "[mosaic-extract]: problem extracting (and skipping): " <= filenames[i] <= IO.newline();
// skip
continue;
}
}

// flush output
theOut.flush();
// close
theOut.close();


//------------------------------------------------------------------------------
// extractTrajectory() -- extracts and outputs feature vectors from a single file
//------------------------------------------------------------------------------
fun int extractTrajectory( string inputFilePath, string shortName, int fileIndex, IO out )
{
// increment
NUM_FILES++;
// log
cherr <= "[" <= NUM_FILES <= "] extracting features: " <= inputFilePath <= IO.newline();

// load by block to speed up IO
fft.size() => audioFile.chunks;
// read the audio file
inputFilePath => audioFile.read;
// file position (in samples)
int pos;
// frame index
int index;

while( audioFile.pos() < audioFile.samples() )
{
// remember the starting pos of each vector
audioFile.pos() => pos;
// let one FFT-size of time pass (to buffer)
fft.size()::samp => now;
// zero out
featureFrame.zero();
// loop over frames
for( int i; i < NUM_FRAMES; i++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// for each dimension
for( int d; d < NUM_DIMENSIONS; d++ )
{
// copy
combo.fval(d) +=> featureFrame[d];
}
// advance time
HOP => now;
}

// print label name and endline
out <= shortName <= " " <= (pos::samp)/second <= " ";

//-------------------------------------------------------------
// average into a single feature vector per file
// NOTE: this can be easily modified to N feature vectors
// spread out over the length of an audio file; for now
// we will just do one feature vector per file
//-------------------------------------------------------------
for( int d; d < NUM_DIMENSIONS; d++ )
{
// average by total number of frames
NUM_FRAMES /=> featureFrame[d];
// print the MFCC results
out <= featureFrame[d] <= " ";
}

out <= IO.newline();

// print .
if( out != cherr ) { cherr <= "."; cherr.flush(); }

// increment index
index++;
}

// print newline to screen
if( out != cherr ) cherr <= IO.newline();

// done
return true;
}


//------------------------------------------------------------------------------
// parse INPUT argument -- either single audio file or a text file containing a list
//------------------------------------------------------------------------------
fun int parseInput( string input, string results[] )
{
// clear results
results.clear();
// see if input is a file name
if( input.rfind( ".wav" ) > 0 || input.rfind( ".aiff" ) > 0 )
{
// make a new string (since << appends by reference)
input => string sss;
// append
results << sss;
}
else
{
// load data
FileIO fio;
if( !fio.open( me.dir() + input, FileIO.READ ) )
{
// error
<<< "cannot open file:", me.dir() + input >>>;
// close
fio.close();
// return done
return false;
}

// read each filename
while( fio.more() )
{
// read each line
fio.readLine().trim() => string line;
// if not empty
if( line != "" )
{
results << line;
}
}
}

return true;
}

2. Mosaic synthesis microphone:

// input: pre-extracted model file
string FEATURES_FILE;
// if have arguments, override filename
if( me.args() > 0 )
{
me.arg(0) => FEATURES_FILE;
}
else
{
// print usage
<<< "usage: chuck mosaic-synth-mic.ck:INPUT", "" >>>;
<<< " |- INPUT: model file (.txt) containing extracted feature vectors", "" >>>;
me.exit();
}
//------------------------------------------------------------------------------
// expected model file format; each VALUE is a feature value
// (feel free to adapt and modify the file format as needed)
//------------------------------------------------------------------------------
// filePath windowStartTime VALUE VALUE ... VALUE
// filePath windowStartTime VALUE VALUE ... VALUE
// ...
// filePath windowStartTime VALUE VALUE ... VALUE
//------------------------------------------------------------------------------


//------------------------------------------------------------------------------
// unit analyzer network: *** this must match the features in the features file
//------------------------------------------------------------------------------
// audio input into a FFT
adc => FFT fft;
// a thing for collecting multiple features into one vector
FeatureCollector combo => blackhole;
// add spectral feature: Centroid
fft =^ Centroid centroid =^ combo;
// add spectral feature: Flux
fft =^ Flux flux =^ combo;
// add spectral feature: RMS
fft =^ RMS rms =^ combo;
// add spectral feature: MFCC
fft =^ MFCC mfcc =^ combo;
fft =^ RollOff rolloff =^ combo;
fft =^ Chroma chroma =^ combo;

//-----------------------------------------------------------------------------
// setting analysis parameters -- also should match what was used during extraction
//-----------------------------------------------------------------------------
// set number of coefficients in MFCC (how many we get out)
// 13 is a commonly used value; using 20 here
20 => mfcc.numCoeffs;
// set number of mel filters in MFCC
10 => mfcc.numFilters;

// do one .upchuck() so FeatureCollector knows the total number of dimensions
combo.upchuck();
// get number of total feature dimensions
combo.fvals().size() => int NUM_DIMENSIONS;

// set FFT size
4096 => fft.size;
// set window type and size
Windowing.hann(fft.size()) => fft.window;
// our hop size (how often to perform analysis)
(fft.size()/2)::samp => dur HOP;
// how many frames to aggregate before averaging?
// (this does not need to match extraction; might play with this number)
3 => int NUM_FRAMES;
// how much time to aggregate features for each file
fft.size()::samp * NUM_FRAMES => dur EXTRACT_TIME;


//------------------------------------------------------------------------------
// unit generator network: for real-time sound synthesis
//------------------------------------------------------------------------------
// how many max at any time?
16 => int NUM_VOICES;
// a number of audio buffers to cycle between
SndBuf buffers[NUM_VOICES]; ADSR envs[NUM_VOICES]; Pan2 pans[NUM_VOICES];
// set parameters
for( int i; i < NUM_VOICES; i++ )
{
// connect audio
buffers[i] => envs[i] => pans[i] => dac;
// set chunk size (how much to load at a time)
// this is important when reading from large files
// if this is not set, SndBuf.read() will load the entire file immediately
fft.size() => buffers[i].chunks;
// randomize pan
Math.random2f(-.75,.75) => pans[i].pan;
// set envelope parameters
envs[i].set( EXTRACT_TIME, EXTRACT_TIME/2, 1, EXTRACT_TIME );
}


//------------------------------------------------------------------------------
// load feature data; read important global values like numPoints and numCoeffs
//------------------------------------------------------------------------------
// values to be read from file
0 => int numPoints; // number of points in data
0 => int numCoeffs; // number of dimensions in data
// file read PART 1: read over the file to get numPoints and numCoeffs
loadFile( FEATURES_FILE ) @=> FileIO @ fin;
// check
if( !fin.good() ) me.exit();
// check dimension at least
if( numCoeffs != NUM_DIMENSIONS )
{
// error
<<< "[error] expecting:", NUM_DIMENSIONS, "dimensions; but features file has:", numCoeffs >>>;
// stop
me.exit();
}


//------------------------------------------------------------------------------
// each Point corresponds to one line in the input file, which is one audio window
//------------------------------------------------------------------------------
class AudioWindow
{
// unique point index (use this to lookup feature vector)
int uid;
// which file did this come from (in the files array)
int fileIndex;
// starting time in that file (in seconds)
float windowTime;

// set
fun void set( int id, int fi, float wt )
{
id => uid;
fi => fileIndex;
wt => windowTime;
}
}

// array of all points in model file
AudioWindow windows[numPoints];
// unique filenames; we will append to this
string files[0];
// map of filenames loaded
int filename2state[0];
// feature vectors of data points
float inFeatures[numPoints][numCoeffs];
// generate array of unique indices
int uids[numPoints]; for( int i; i < numPoints; i++ ) i => uids[i];

// use this for new input
float features[NUM_FRAMES][numCoeffs];
// average values of coefficients across frames
float featureMean[numCoeffs];


//------------------------------------------------------------------------------
// read the data
//------------------------------------------------------------------------------
readData( fin );


//------------------------------------------------------------------------------
// set up our KNN object to use for classification
// (KNN2 is a fancier version of the KNN object)
// -- run KNN2.help(); in a separate program to see its available functions --
//------------------------------------------------------------------------------
KNN2 knn;
// k nearest neighbors
2 => int K;
// results vector (indices of k nearest points)
int knnResult[K];
// knn train
knn.train( inFeatures, uids );


// used to rotate sound buffers
0 => int which;

//------------------------------------------------------------------------------
// SYNTHESIS!!
// this function is meant to be sporked so it can be stacked in time
//------------------------------------------------------------------------------
fun void synthesize( int uid )
{
// get the buffer to use
buffers[which] @=> SndBuf @ sound;
// get the envelope to use
envs[which] @=> ADSR @ envelope;
// increment and wrap if needed
which++; if( which >= buffers.size() ) 0 => which;

// get a reference to the audio fragment to synthesize
windows[uid] @=> AudioWindow @ win;
// get filename
files[win.fileIndex] => string filename;
// load into sound buffer
filename => sound.read;
// seek to the window start time
((win.windowTime::second)/samp) $ int => sound.pos;

// print what we are about to play
chout <= "synthesizing window: ";
// print label
chout <= win.uid <= "["
<= win.fileIndex <= ":"
<= win.windowTime <= ":POSITION="
<= sound.pos() <= "]";
// endline
chout <= IO.newline();

// open the envelope, overlap add this into the overall audio
envelope.keyOn();
// wait
(EXTRACT_TIME*3)-envelope.releaseTime() => now;
// start the release
envelope.keyOff();
// wait
envelope.releaseTime() => now;
}


//------------------------------------------------------------------------------
// real-time similarity retrieval loop
//------------------------------------------------------------------------------
while( true )
{
// aggregate features over a period of time
for( int frame; frame < NUM_FRAMES; frame++ )
{
//-------------------------------------------------------------
// a single upchuck() will trigger analysis on everything
// connected upstream from combo via the upchuck operator (=^)
// the total number of output dimensions is the sum of
// dimensions of all the connected unit analyzers
//-------------------------------------------------------------
combo.upchuck();
// get features
for( int d; d < NUM_DIMENSIONS; d++)
{
// store them in current frame
combo.fval(d) => features[frame][d];
}
// advance time
HOP => now;
}

// compute means for each coefficient across frames
for( int d; d < NUM_DIMENSIONS; d++ )
{
// zero out
0.0 => featureMean[d];
// loop over frames
for( int j; j < NUM_FRAMES; j++ )
{
// add
features[j][d] +=> featureMean[d];
}
// average
NUM_FRAMES /=> featureMean[d];
}

//-------------------------------------------------
// search using KNN2; results filled in knnResults,
// which should be the indices of the k nearest points
//-------------------------------------------------
knn.search( featureMean, K, knnResult );

// SYNTHESIZE THIS
spork ~ synthesize( knnResult[Math.random2(0,knnResult.size()-1)] );
}
//------------------------------------------------------------------------------
// end of real-time similarity retrieval loop
//------------------------------------------------------------------------------




//------------------------------------------------------------------------------
// function: load data file
//------------------------------------------------------------------------------
fun FileIO loadFile( string filepath )
{
// reset
0 => numPoints;
0 => numCoeffs;

// load data
FileIO fio;
if( !fio.open( filepath, FileIO.READ ) )
{
// error
<<< "cannot open file:", filepath >>>;
// close
fio.close();
// return
return fio;
}

string str;
string line;
// read the first non-empty line
while( fio.more() )
{
// read each line
fio.readLine().trim() => str;
// check if empty line
if( str != "" )
{
numPoints++;
str => line;
}
}

// a string tokenizer
StringTokenizer tokenizer;
// set to last non-empty line
tokenizer.set( line );
// -2 (to account for filePath and windowTime)
-2 => numCoeffs;
// count the remaining tokens (the feature dimensions)
while( tokenizer.more() )
{
tokenizer.next();
numCoeffs++;
}

// see if we made it past the initial fields
if( numCoeffs < 0 ) 0 => numCoeffs;

// check
if( numPoints == 0 || numCoeffs <= 0 )
{
<<< "no data in file:", filepath >>>;
fio.close();
return fio;
}

// print
<<< "# of data points:", numPoints, "dimensions:", numCoeffs >>>;

// done for now
return fio;
}


//------------------------------------------------------------------------------
// function: read the data
//------------------------------------------------------------------------------
fun void readData( FileIO fio )
{
// rewind the file reader
fio.seek( 0 );

// a line
string line;
// a string tokenizer
StringTokenizer tokenizer;

// points index
0 => int index;
// file index
0 => int fileIndex;
// file name
string filename;
// window start time
float windowTime;
// coefficient
int c;

// read the first non-empty line
while( fio.more() )
{
// read each line
fio.readLine().trim() => line;
// check if empty line
if( line != "" )
{
// set to last non-empty line
tokenizer.set( line );
// file name
tokenizer.next() => filename;
// window start time
tokenizer.next() => Std.atof => windowTime;
// have we seen this filename yet?
if( filename2state[filename] == 0 )
{
// make a new string (<< appends by reference)
filename => string sss;
// append
files << sss;
// new id
files.size() => filename2state[filename];
}
// get fileindex
filename2state[filename]-1 => fileIndex;
// set
windows[index].set( index, fileIndex, windowTime );

// zero out
0 => c;
// for each dimension in the data
repeat( numCoeffs )
{
// read next coefficient
tokenizer.next() => Std.atof => inFeatures[index][c];
// increment
c++;
}

// increment global index
index++;
}
}
}
