Implementation of a new algorithm

There are two ways to add a new algorithm to ELF. The first is to derive from the class StandardAlgorithm, as the class LinearModel does. The second is to implement the new algorithm without derivation. The following example shows how the header file of the new algorithm could look. The new algorithm also has to be added to the method algorithmDispatcher(..) in the Scheduler class (see the sketch after the class declaration below). To use the new algorithm, the first line of its dsc-file (e.g. NewAlgo_1.dsc) must contain the name of the algorithm.

class NewAlgo : public StandardAlgorithm, public Framework
{
public:
    NewAlgo();
    ~NewAlgo();
    virtual void modelInit();
    virtual void modelUpdate ( REAL* input, REAL* target, uint nSamples, uint crossRun );
    virtual void predictAllOutputs ( REAL* rawInputs, REAL* outputs, uint nSamples, uint crossRun );
    virtual void readSpecificMaps();
    virtual void saveWeights ( int cross );
    virtual void loadWeights ( int cross );
    virtual void loadMetaWeights ( int cross );
    static string templateGenerator ( int id, string preEffect, int nameID, bool blendStop );
private:
    // internal parameters/weights of the algorithm
    REAL** m_someParameters;
};
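
For registration, a new branch keyed on the algorithm name has to be added to algorithmDispatcher(..) in the Scheduler class. The fragment below is only a rough sketch; the variable names algoName and algo are placeholders, not the actual identifiers used in the Scheduler:

// sketch: dispatch on the name read from the first line of the dsc-file
if ( algoName == "NewAlgo" )
    algo = new NewAlgo();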

Deriving from the StandardAlgorithm base class requires re-implementing all seven virtual methods. The following gives an overview of the idea behind each of them.


"virtual void modelInit();"
Allocate and initialize the memory used during training. Allocate parameters for each fold of the cross validation. Register the tunable parameters of the algorithm here; for example, add a regularization constant with paramDoubleValues.push_back(&m_reg); paramDoubleNames.push_back("reg");
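
A minimal sketch of modelInit() could look as follows. The member m_someParameters comes from the header above; the fold count, written here as m_nCross, and the regularization member m_reg are assumptions and should be replaced by whatever the algorithm actually needs:

void NewAlgo::modelInit()
{
    // allocate one parameter block per cross-validation fold
    // (m_nCross is an assumed member; m_nFeat, m_nClass, m_nDomain are framework-provided)
    m_someParameters = new REAL*[m_nCross];
    for ( uint i = 0; i < m_nCross; i++ )
        m_someParameters[i] = new REAL[m_nFeat * m_nClass * m_nDomain];

    // register a tunable parameter so the framework can optimize it
    paramDoubleValues.push_back ( &m_reg );
    paramDoubleNames.push_back ( "reg" );
}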


"virtual void modelUpdate ( REAL* input, REAL* target, uint nSamples, uint crossRun );"
This method is responsible for training the model on a given dataset. The training feature matrix input is an m_nFeat by nSamples matrix (read-only, accessed row-wise). The target matrix target is an m_nClass*m_nDomain by nSamples matrix (read-only, accessed row-wise). The variable crossRun is the index of the current fold in the cross validation, starting at 0.
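
As a rough illustration, a modelUpdate() skeleton that walks over the samples is shown below. The indexing input[sample*m_nFeat + feature] is an assumption about the row-wise layout, and the actual learning rule is algorithm-specific:

void NewAlgo::modelUpdate ( REAL* input, REAL* target, uint nSamples, uint crossRun )
{
    REAL* w = m_someParameters[crossRun];  // parameters of the current fold
    for ( uint i = 0; i < nSamples; i++ )
    {
        const REAL* x = input + i * m_nFeat;                // features of sample i (assumed layout)
        const REAL* t = target + i * m_nClass * m_nDomain;  // targets of sample i (assumed layout)
        // ... algorithm-specific update of w from x and t ...
    }
}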


"virtual void predictAllOutputs ( REAL* rawInputs, REAL* outputs, uint nSamples, uint crossRun );"
Here, the algorithm must predict new samples. The feature matrix rawInputs is an m_nFeat by nSamples matrix (read-only, accessed row-wise). The predictions must be written to the outputs matrix, an m_nClass*m_nDomain by nSamples matrix (writable, accessed row-wise). The variable crossRun is the index of the current fold in the cross validation, starting at 0.
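
A matching predictAllOutputs() skeleton, under the same layout assumption:

void NewAlgo::predictAllOutputs ( REAL* rawInputs, REAL* outputs, uint nSamples, uint crossRun )
{
    REAL* w = m_someParameters[crossRun];  // parameters of the current fold
    for ( uint i = 0; i < nSamples; i++ )
    {
        const REAL* x = rawInputs + i * m_nFeat;
        REAL* out = outputs + i * m_nClass * m_nDomain;
        for ( uint j = 0; j < m_nClass * m_nDomain; j++ )
            out[j] = 0.0;  // ... replace with the prediction computed from x and w ...
    }
}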


"virtual void readSpecificMaps();"
This method reads the algorithm-dependent parameters from the corresponding dsc-file, for example m_reg = m_doubleMap["initReg"];
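
Written out as a complete method (the key "initReg" is only an example and must match an entry in the dsc-file):

void NewAlgo::readSpecificMaps()
{
    // values parsed from the dsc-file are available through the framework's maps
    m_reg = m_doubleMap["initReg"];
}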


"virtual void saveWeights ( int cross );"
Here, the weights and parameters of the algorithm are saved. The saved parameters constitute the trained state of the algorithm, so that it can later predict arbitrary test features.
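
A sketch of saveWeights() that writes one binary file per fold. The helper weightFile(cross), which builds the file name, is hypothetical and stands for whatever naming scheme fits the framework's output directories:

void NewAlgo::saveWeights ( int cross )
{
    // write the trained parameters of fold 'cross' to disk
    // (weightFile() is a hypothetical helper returning the target file name)
    FILE* f = fopen ( weightFile ( cross ).c_str(), "wb" );
    fwrite ( m_someParameters[cross], sizeof ( REAL ), m_nFeat * m_nClass * m_nDomain, f );
    fclose ( f );
}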


"virtual void loadWeights ( int cross );"
Here, the weights and parameters of the algorithm are loaded. After this method returns, the algorithm must be able to predict any new test feature.
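
The corresponding loadWeights() sketch, mirroring the binary format used in saveWeights() above (buffer allocation is assumed to have happened beforehand):

void NewAlgo::loadWeights ( int cross )
{
    // restore the parameters of fold 'cross' so that predictAllOutputs() can be called
    FILE* f = fopen ( weightFile ( cross ).c_str(), "rb" );
    fread ( m_someParameters[cross], sizeof ( REAL ), m_nFeat * m_nClass * m_nDomain, f );
    fclose ( f );
}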


"virtual void loadMetaWeights ( int cross );"
This method is only relevant when the option globalTrainingLoops in the Master.dsc file is greater than 1. The idea is to re-optimize the set of parameters while the ensemble is built.


"static string templateGenerator ( int id, string preEffect, int nameID, bool blendStop );"
This method should be implemented, but it is not strictly required. Its purpose is to generate a default dsc-file for the given algorithm.
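
A possible templateGenerator() sketch. Apart from the algorithm name in the first line, the emitted keys are only examples and must follow the framework's dsc-file conventions (assumes <sstream> and namespace std):

string NewAlgo::templateGenerator ( int id, string preEffect, int nameID, bool blendStop )
{
    // build the text of a default dsc-file for this algorithm
    stringstream s;
    s << "ALGORITHM=NewAlgo" << endl;  // first line: the algorithm name
    s << "ID=" << id << endl;          // assumed key
    s << "initReg=0.01" << endl;       // example algorithm-specific parameter
    return s.str();
}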