
UCI LETTER

This is the LETTER dataset from the UCI archive. The data and the algorithm template files can be downloaded here in order to reproduce the following results. The training set has 16000 samples and the test set has 4000 samples (the last 4000 of 20000 in total). There are 26 classes and 17 features. We use 8-fold cross-validation.
To generate the results, enable the specific algorithm in the Master.dsc file and execute $ ./ELF LETTER t on the console.
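
The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ELF implementation: the helper names `split_train_test` and `k_fold_indices` are hypothetical, and the folds are taken as contiguous blocks, whereas the framework may shuffle or assign folds differently.

```python
def split_train_test(n_total=20000, n_test=4000):
    """Return (train_indices, test_indices); the test set is the last n_test samples."""
    n_train = n_total - n_test
    return list(range(n_train)), list(range(n_train, n_total))


def k_fold_indices(n_samples, k=8):
    """Partition range(n_samples) into k contiguous folds of near-equal size.

    Yields (train_idx, validation_idx) pairs. Every sample lands in exactly
    one validation fold. Contiguous folds are an assumption of this sketch.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


train_idx, test_idx = split_train_test()
folds = list(k_fold_indices(len(train_idx), k=8))
print(len(train_idx), len(test_idx), len(folds), len(folds[0][1]))
# prints: 16000 4000 8 2000
```

With 16000 training samples and 8 folds, each model is fitted on 14000 samples and validated on the remaining 2000 in every fold.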

model	notes	training time	prediction time	cross validation RMSE	cross validation classification error	test RMSE	test classification error
LR - linear regression	15 search epochs, λ=0.000524479	1[s]	0[s]	0.346912	44.7062%	0.348054	45.575%
PR - polynomial regression	15 search epochs, polyOrder=4, λ=0.00349871, crossInteractions=yes	310[s]	4[s]	0.212869	7.51875%	0.214539	7.35%
GBDT - gradient boosted decision tree	500 epochs, featureSubspaceSize=5, maxTreeLeafes=100, η=0.1, optSplitPoint=no	8332[s]	14[s]	0.134175	3.41875%	0.135784	3.275%
KNN - k-nearest neighbors	15 search epochs, distance=euclidean, k=3	27[s]	3[s]	0.112202	5.21875%	0.111163	5%
NN - neural network	1000 epochs, stochastic gradient descent, Net: 17-70-50-26, η=0.001, λ=0	866[s]	0[s]	0.124753	4.68125%	0.122801	4.325%
KRR - kernel ridge regression	30 search epochs, gauss kernel, sigma=2.46455, λ=1.1675e-06	2432[s]	6[s]	0.129338	2.4%	0.127372	2.35%
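
To make the two error columns concrete, here is a minimal sketch of how an RMSE and a classification error could be computed from per-class model outputs. It assumes a one-vs-rest target encoding of +1 for the true class and -1 for all others, and a prediction by largest output; both are assumptions of this example, and the function names are hypothetical rather than taken from the framework.

```python
import math


def encode_targets(label, n_classes, pos=1.0, neg=-1.0):
    """One-vs-rest target vector: pos for the true class, neg elsewhere (assumed encoding)."""
    return [pos if c == label else neg for c in range(n_classes)]


def rmse(outputs, labels, n_classes):
    """Root mean squared error over all samples and all class outputs."""
    total = 0.0
    count = 0
    for out, label in zip(outputs, labels):
        for o, t in zip(out, encode_targets(label, n_classes)):
            total += (o - t) ** 2
            count += 1
    return math.sqrt(total / count)


def classification_error(outputs, labels):
    """Percentage of samples whose largest output is not the true class."""
    wrong = 0
    for out, label in zip(outputs, labels):
        predicted = max(range(len(out)), key=out.__getitem__)
        if predicted != label:
            wrong += 1
    return 100.0 * wrong / len(labels)


# Toy example with 3 classes and 4 samples; the last sample is misclassified.
outputs = [[1.0, -1.0, -1.0], [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0], [1.0, -1.0, -1.0]]
labels = [0, 1, 2, 1]
print(classification_error(outputs, labels))  # prints: 25.0
print(rmse(outputs, labels, n_classes=3))
```

Under this encoding a model can have a lower RMSE and still a higher classification error than another model, as the KNN and KRR rows show, because RMSE measures how close the outputs are to the targets while the classification error only depends on which output is largest.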

UCI SATIMAGE

This is the SATIMAGE dataset from the UCI archive. The data and the algorithm template files can be downloaded here in order to reproduce the following results. The training set has 4435 samples and the test set has 2000 samples. There are 6 classes and 37 features. We use 8-fold cross-validation.
To generate the results, enable the specific algorithm in the Master.dsc file and execute $ ./ELF SATIMAGE t on the console.

model	notes	training time	prediction time	cross validation RMSE	cross validation classification error	test RMSE	test classification error
LR - linear regression	15 search epochs, λ=0.0149426	0[s]	0[s]	0.497712	23.8106%	0.509548	25.85%
PR - polynomial regression	15 search epochs, polyOrder=1, λ=0.0158027, crossInteractions=yes	10[s]	0[s]	0.405117	13.5964%	0.417862	14.95%
GBDT - gradient boosted decision tree	107 epochs, featureSubspaceSize=5, maxTreeLeafes=50, η=0.1, optSplitPoint=yes	37[s]	0[s]	0.315662	9.37993%	0.322224	9.5%
KNN - k-nearest neighbors	15 search epochs, distance=euclidean, k=4	3[s]	0[s]	0.300494	8.95152%	0.301169	9.1%
NN - neural network	329 epochs, stochastic gradient descent, Net: 37-70-50-6, η=0.001, λ=0	64[s]	0[s]	0.30259	8.97407%	0.304153	9.05%
KRR - kernel ridge regression	30 search epochs, gauss kernel, sigma=3.18697, λ=1.66922e-05	65[s]	1[s]	0.297553	7.64374%	0.305151	8.45%

UCI ADULT

This is the ADULT dataset from the UCI archive. The data and the algorithm template files can be downloaded here in order to reproduce the following results. The training set has 32561 samples and the test set has 16281 samples. There are 2 classes and 109 features. We use 8-fold cross-validation.
To generate the results, enable the specific algorithm in the Master.dsc file and execute $ ./ELF ADULT t on the console.

model	notes	training time	prediction time	cross validation RMSE	cross validation classification error	test RMSE	test classification error
LR - linear regression	15 search epochs, λ=0.0592731	3[s]	0[s]	0.681173	16.0806%	0.678574	15.7423%
PR - polynomial regression	15 search epochs, polyOrder=2, λ=0.00110125, crossInteractions=no	12[s]	0[s]	0.662291	14.9412%	0.66019	14.7288%
GBDT - gradient boosted decision tree	162 epochs, featureSubspaceSize=10, maxTreeLeafes=50, η=0.1, optSplitPoint=yes	140[s]	1[s]	0.60082	12.7607%	0.601293	12.7019%
KNN - k-nearest neighbors	15 search epochs, distance=euclidean, k=13	155[s]	88[s]	0.814904	21.3753%	0.811027	20.9139%
NN - neural network	16 epochs, stochastic gradient descent, Net: 109-30-20-2, η=0.001, λ=0	45[s]	1[s]	0.635936	14.5757%	0.637256	14.7595%
KRR - kernel ridge regression	30 search epochs, gauss kernel, sigma=19.1556, λ=1.1675e-05, (maxThreadsInCross=2)	36148[s]	83[s]	0.644849	14.932%	0.64379	14.6797%
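
The result rows in these tables are tab-separated, so they can be loaded for side-by-side comparison with a small parser. The following is a sketch only: the column keys in `COLUMNS` and the function `parse_row` are hypothetical names chosen for this example, and the notes column of the two sample rows is shortened.

```python
COLUMNS = [
    "model", "notes", "train_time_s", "predict_time_s",
    "cv_rmse", "cv_error_pct", "test_rmse", "test_error_pct",
]


def parse_row(line):
    """Turn one tab-separated result row into a dict with numeric fields."""
    cells = line.rstrip("\n").split("\t")
    if len(cells) != len(COLUMNS):
        raise ValueError("expected %d cells, got %d" % (len(COLUMNS), len(cells)))
    row = dict(zip(COLUMNS, cells))
    for key in ("train_time_s", "predict_time_s"):
        row[key] = float(row[key].replace("[s]", ""))  # "140[s]" -> 140.0
    for key in ("cv_error_pct", "test_error_pct"):
        row[key] = float(row[key].rstrip("%"))         # "12.7019%" -> 12.7019
    for key in ("cv_rmse", "test_rmse"):
        row[key] = float(row[key])
    return row


# Two rows copied from the ADULT table (notes shortened for this example).
raw_rows = [
    "\t".join(["GBDT - gradient boosted decision tree", "162 epochs",
               "140[s]", "1[s]", "0.60082", "12.7607%", "0.601293", "12.7019%"]),
    "\t".join(["KNN - k-nearest neighbors", "k=13",
               "155[s]", "88[s]", "0.814904", "21.3753%", "0.811027", "20.9139%"]),
]
rows = [parse_row(r) for r in raw_rows]
best = min(rows, key=lambda r: r["test_error_pct"])
print(best["model"], best["test_error_pct"])
# prints: GBDT - gradient boosted decision tree 12.7019
```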