# Numerical and Computational Optimization on the Intel Phi

As discussed in the previous article, `myFunc.h` implements a least squares objective function. Suffice it to say that a least squares objective function evaluates how well a set of model parameters fits a data set by calculating the sum of the squares of the errors in the fit over a data set of `N` items, as shown in Equation 1. The data points can lie on a line, a curve, or a multidimensional surface.

Equation 1: Sum of squares of differences error: *E* = Σᵢ₌₁ᴺ dᵢ², where dᵢ is the difference between the model output and the known value for the *i*-th data item.

The following code snippet illustrates the simplicity of an OpenMP implementation on a single device. The user-provided function `myFunc()` calculates the error of the model for each example in the data set being fit. The OpenMP compiler is responsible for correctly parallelizing the reduction loop across the processor cores. The programmer is responsible for expressing `myFunc()` so the compiler can correctly map the parallel instances to the per-core vector units on the Intel Xeon Phi coprocessor.

```c
#pragma omp parallel for reduction(+ : err)
for(int i=0; i < nExamples; i++) {
   float d=myFunc(i, P, example, nExamples, NULL);
   err += d*d;
}
```

### An nlopt Training Code

Listing Two is the source code for `train.c`. This code calls the `init()` function to load the data, then defines a set of random model parameters for the numerical optimization, and finally calls the appropriate nlopt optimization method, which utilizes the `myFunc()` objective function. The user supplies the name of the data file, or '`-`' if reading from `stdin`, and the name of the file to write the optimized parameters to at the end of the run. The method `writeParam()` performs a binary write of the single-precision parameters. The `nlopt_set_maxtime()` method is called to limit the optimization time to 30 minutes (1,800 seconds).

The PRAXIS (principal axis) numerical method is used to perform the optimization in these examples. The comments show a few of the other derivative-free numerical techniques available in the nlopt library. Numerical optimization is both valuable and fun, and you are encouraged to experiment with different techniques.

Listing Two

```c
// Rob Farber
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <math.h>
#include <nlopt.h>

#include "myFunc.h"

void writeParam(char *filename, int nParam, double *x)
{
   FILE *fn=fopen(filename,"wb");
   if(!fn) {
      fprintf(stderr,"Cannot open %s\n",filename);
      exit(1);
   }

   uint32_t n=nParam; // ensure size is uint32_t
   fwrite(&n,sizeof(uint32_t), 1, fn);
   for(int i=0; i < nParam; i++) {
      float tmp=x[i];
      fwrite(&tmp,sizeof(float), 1, fn);
   }
   fclose(fn);
}

int main(int argc, char* argv[])
{
   nlopt_opt opt;
   userData_t uData = {0};

   memset(&uData, 0, sizeof(userData_t));

   if(argc < 3) {
      fprintf(stderr,"Use: datafile paramFile\n");
      return -1;
   }
   init(argv[1],&uData);
   printf("myFunc %s\n", desc);
   printf("nExamples %d\n", uData.nExamples);
   printf("Number Parameters %d\n", N_PARAM);

   opt = nlopt_create(NLOPT_LN_PRAXIS, N_PARAM); // algorithm and dimensionality
   // NOTE: alternative optimization methods ...
   //opt = nlopt_create(NLOPT_LN_NEWUOA, N_PARAM);
   //opt = nlopt_create(NLOPT_LN_COBYLA, N_PARAM);
   //opt = nlopt_create(NLOPT_LN_BOBYQA, N_PARAM);
   //opt = nlopt_create(NLOPT_LN_AUGLAG, N_PARAM);

   nlopt_set_min_objective(opt, objFunc, (void*) &uData);
   nlopt_set_maxtime(opt, (30. * 60.)); // maximum runtime in seconds
   //nlopt_set_maxtime(opt, 20); // Use for running quick tests
   double minf; /* the minimum objective value, upon return */

   __declspec(align(64)) double x[N_PARAM];
   for(int i=0; i < N_PARAM; i++) x[i] = 0.1*(rand()/(double)RAND_MAX);

   double startTime=getTime();
   int ret=nlopt_optimize(opt, x, &minf);
   printf("Optimization Time %g\n",getTime()-startTime);

   if (ret < 0) {
      printf("nlopt failed! ret %d\n", ret);
   } else {
      printf("found minimum %0.10g ret %d\n", minf,ret);
   }
   writeParam(argv[2],N_PARAM, x);
   fini(&uData);
   nlopt_destroy(opt);

   return 0;
}
```

### Evaluating the Success of the Optimization

Listing Three is the source code for `pred.c`, which reads the optimized parameters from the training run along with a cross-validation or prediction data set. When compiled with the `DO_PRED` C-preprocessor symbol defined, `myFunc()` writes a set of output values representing the model predictions, which can be used as a solution or evaluated for accuracy.

Listing Three

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <math.h>

// tell myFunc that this is a prediction function
#define DO_PRED
#include "myFunc.h"

void readParam(char* filename, int nParam, float* param)
{
   FILE *fn=fopen(filename,"rb");
   if(!fn) {
      fprintf(stderr,"Cannot open %s\n",filename);
      exit(1);
   }
   uint32_t parmInFile;
   fread(&parmInFile,sizeof(uint32_t), 1, fn);
   if(parmInFile != N_PARAM) {
      fprintf(stderr,"Number of parameters incorrect!\n");
      exit(1);
   }
   fread(param,sizeof(float), nParam, fn);
   fclose(fn);
}

int main(int argc, char* argv[])
{
   userData_t uData = {0};

   if(argc < 3) {
      fprintf(stderr,"Use: paramFile dataFile\n");
      return -1;
   }

   printf("myFunc %s\n", desc);
   init(argv[2],&uData);
   printf("nExamples %d\n", uData.nExamples);

   __declspec(align(64)) float x[N_PARAM];
   readParam(argv[1], N_PARAM, x);

   if(N_OUTPUT == 0) { // special case for autoencoders
      float pred[N_INPUT];
      for(int exIndex=0; exIndex < uData.nExamples; exIndex++) {
         float err = myFunc(exIndex, x, uData.example, uData.nExamples, pred);

         printf("input ");
         for(int j=0; j < N_INPUT; j++)
            printf("%g ", uData.example[IN(j,uData.nExamples,exIndex)]);

         printf("PredOutput ");
         for(int j=0; j < N_INPUT; j++) printf("%g ", pred[j]);
         printf("\n");
      }

   } else {

      float pred[N_OUTPUT];
      for(int exIndex=0; exIndex < uData.nExamples; exIndex++) {
         float err = myFunc(exIndex, x, uData.example, uData.nExamples, pred);

         printf("input ");
         for(int j=0; j < N_INPUT; j++)
            printf("%g ", uData.example[IN(j,uData.nExamples,exIndex)]);

         printf("KnownOutput ");
         for(int j=0; j < N_OUTPUT; j++)
            printf("%g ", uData.example[OUT(j,uData.nExamples,exIndex)]);

         printf("PredOutput ");
         for(int j=0; j < N_OUTPUT; j++) printf("%g ", pred[j]);
         printf("\n");
      }

   }

   return 0;
}
```

Many techniques are used to calculate the success of the predictive model, including the root-mean-square deviation discussed in the first article in this series. Here, graphs will be used to visually show that the optimization did find a reasonable solution. This is done subject to the caveat that the eye can be easily deceived when evaluating complex models. You should utilize more rigorous verification methods in your own projects.

### A Linear Principal Components Optimization

Principal Components Analysis (PCA) accounts for the maximum amount of variance in a data set using a set of straight lines, where each line is defined by a weighted linear combination of the observed variables. The first line, or principal component, accounts for the greatest amount of variance, while each succeeding component accounts for as much variance as possible while remaining orthogonal to (uncorrelated with) the preceding components.
