Dr. Dobb's is part of the Informa Tech Division of Informa PLC





Distributed Computing: Windows and Linux


Controlling the Cluster

The NIST statistical suite is designed to take its input interactively. We modified it to accept all the parameters from the command line (some parameters were also hard-coded). We then added these parameters:

  • Start. The index of the first processed file.
  • End. The index of the last processed file.
  • Exp. The index of the current experiment.
  • Other. Other parameters that are experiment specific.
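As a rough illustration of how the added parameters might be read from the command line, here is a minimal sketch. The struct, the function name, and the argument order (start, end, exp, then any experiment-specific values) are assumptions made for this example; the article does not show the modified suite's actual argument layout.

```c
#include <stdlib.h>

/* Hypothetical container for the added parameters. */
struct run_params {
    int start;        /* index of the first processed file */
    int end;          /* index of the last processed file */
    int exp;          /* index of the current experiment */
    int other_count;  /* number of experiment-specific values */
    char **other;     /* experiment-specific values, left unparsed */
};

/* Fill *p from argv, assuming the order: start end exp [other...].
   Returns 0 on success, -1 if arguments are missing or start > end. */
int parse_params(int argc, char *argv[], struct run_params *p)
{
    if (argc < 4)
        return -1;
    p->start = atoi(argv[1]);
    p->end = atoi(argv[2]);
    p->exp = atoi(argv[3]);
    p->other_count = argc - 4;
    p->other = argv + 4;
    if (p->start > p->end)
        return -1;
    return 0;
}
```

A caller would pass main's argc and argv straight through and stop with a usage message when the function returns -1.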

All the intermediate files (a couple of dozen for each input file) are suffixed with the file index and experiment index. After each input file is processed, it is deleted along with all its intermediate files to free space on the Linux cluster. Only the final result file is kept.
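The suffixing scheme could look something like the sketch below. The exact name format (base name, underscore, file index, underscore, experiment index) and both function names are assumptions for illustration; the article only says that the two indices are appended.

```c
#include <stdio.h>

/* Build "<base>_<file_index>_<exp_index>" into buf (assumed format).
   Returns 0 on success, -1 if the name does not fit in buf. */
int intermediate_name(char *buf, size_t size, const char *base,
                      int file_index, int exp_index)
{
    int n = snprintf(buf, size, "%s_%d_%d", base, file_index, exp_index);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Delete one intermediate file once its input file has been processed.
   A file that is already missing is silently ignored. */
void remove_intermediate(const char *base, int file_index, int exp_index)
{
    char name[256];
    if (intermediate_name(name, sizeof name, base, file_index, exp_index) == 0)
        remove(name);
}
```

Because every name carries both indices, several experiments can share one working directory without overwriting each other's intermediate files.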

At this point, the question was how to start the NIST statistical suite on each processor of the Linux cluster. We had several options:

  • Connect to each machine using PuTTY and run the command line. However, this approach required a lot of user interaction, so it was discarded.
  • Connect to all the machines using SSH and start a program that executes scripts, which is what we opted to do.

We wrote the Cluster program (Listing Three) to invoke the Scheduler program (Listing Four) on each node of the Linux cluster. The Cluster program uses the fork() function to create a child process for each SSH connection. The Scheduler, in turn, executes script files prepared by the script generator program. The SSH connections to the Linux nodes are set to use automatic login, and the Cluster program is run under the Cygwin emulator on the Windows machine.


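#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* start and end were undefined in the original listing; taking them
   from the command line (node indices, end exclusive) is an assumption. */
int main(int argc, char *argv[])
{
    int i, start, end;
    pid_t pid;
    char cmd[100];

    if (argc != 3) {
        fprintf(stderr, "usage: %s start end\n", argv[0]);
        return 1;
    }
    start = atoi(argv[1]);
    end = atoi(argv[2]);

    for (i = start; i < end; i++) {
        pid = fork();
        if (pid == -1) {
            printf("error on node %d\n", i);
        } else if (pid == 0) {
            /* child: run the Scheduler on node i over SSH, then exit */
            snprintf(cmd, sizeof cmd,
                     "ssh username@node%d.cluster.com ./script/scheduler", i);
            system(cmd);
            exit(0);
        }
    }
    while (wait(NULL) > 0)   /* parent: wait until every child finishes */
        ;
    return 0;
}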

Listing Three


#include <stdio.h>
#include <stdlib.h>

/* Short busy loop used as a crude delay */
int CPU_sleep(void)
{
    int i, j, k, sum = 0;
    for (i = 0; i < 100; i++)
        for (j = 0; j < 100; j++)
            for (k = 0; k < 10; k++)
                sum = sum + j * j * k;
    return sum;
}

/* Read the current number of scripts from max.txt; 0 if unreadable */
static int read_max(const char *file)
{
    int max = 0;
    FILE *F = fopen(file, "r");
    if (F != NULL) {
        if (fscanf(F, "%d", &max) != 1)
            max = 0;
        fclose(F);
    }
    return max;
}

int main(void)
{
    const char *file = "/home/script/max.txt";
    const char *filebase = "/home/script/";
    char filename[200];
    char cmd[200];
    FILE *W;
    FILE *ch;
    int N = 0;
    int max = read_max(file);

    while (N < max) {
        N++;
        sprintf(filename, "%s%d.do", filebase, N);
        CPU_sleep();
        W = fopen(filename, "r");
        if (W == NULL) {
            ch = fopen(filename, "w");
            if (ch != NULL)
                fclose(ch);              /* we reserve the file */
            sprintf(cmd, "/home/script/%d.sh", N);
            system(cmd);
            ch = fopen(filename, "w");
            if (ch != NULL) {
                fprintf(ch, "DONE");     /* now the file is marked as done */
                fclose(ch);
            }
        } else {
            fclose(W);                   /* bypass: already taken */
        }
        max = read_max(file);            /* read the maximum again, */
                                         /*   to be able to increase it */
        CPU_sleep();
    }
    return 0;
}

Listing Four

The script generator program generates the scripts to be executed by the Linux Scheduler. Each script processes one job, and the number of scripts is stored in a max.txt file. The script files are generated in the form #.sh, where # is the number of the script. These files are then transferred to the directory "script" on the Linux cluster, and their access mode is changed to executable using the chmod +x *.sh command executed from a PuTTY terminal.
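The script generator itself is not listed in this section, so the following is only a sketch of what such a generator might do. The function names and the idea of passing the ./nist arguments as a single string are assumptions; the script body mirrors Listing Five.

```c
#include <stdio.h>

/* Write <dir>/<n>.sh with a body like Listing Five. nist_args is the
   argument string passed to ./nist (hypothetical parameter).
   Returns 0 on success, -1 on failure. */
int write_script(const char *dir, int n, const char *nist_args)
{
    char path[256];
    FILE *f;
    int len = snprintf(path, sizeof path, "%s/%d.sh", dir, n);
    if (len < 0 || len >= (int)sizeof path)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "cd\ncd test\nnice -n 19 ./nist %s > /dev/null\n", nist_args);
    return fclose(f) == 0 ? 0 : -1;
}

/* Record the number of generated scripts in <dir>/max.txt,
   which the Scheduler reads with fscanf("%d"). */
int write_max(const char *dir, int count)
{
    char path[256];
    FILE *f;
    int len = snprintf(path, sizeof path, "%s/max.txt", dir);
    if (len < 0 || len >= (int)sizeof path)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%d\n", count);
    return fclose(f) == 0 ? 0 : -1;
}
```

A generator built this way would call write_script once per job, numbering from 1, and then call write_max with the total so that the Scheduler knows where to stop.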

The Scheduler program first opens the max.txt file to get the maximum number of scripts. It then enters a loop in which it checks the status files, which have the form "#.do", in sequence. When it finds that a status file does not exist, it claims the job by creating that file and then executes the corresponding script file. When the script finishes, the Scheduler rewrites the status file with the string "DONE" in it, then moves on to the next status file until the maximum is reached. It then stops.
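Listing Four tests for the status file and then creates it in two separate steps, so two nodes that check at the same instant could both claim the same job. One possible alternative, which is not what the article's code does, is to let the operating system perform the check and the creation in a single call using open() with O_CREAT and O_EXCL. The function name below is hypothetical, and on a shared network filesystem the exclusivity guarantee depends on the filesystem in use.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to reserve a status file in one step.
   Returns 1 if this process created it (and so owns the job),
   0 if it already existed, -1 on any other error. */
int claim_status_file(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd >= 0) {
        close(fd);   /* an empty file means "reserved", as in Listing Four */
        return 1;
    }
    return (errno == EEXIST) ? 0 : -1;
}
```

In the Scheduler loop, a return value of 1 would take the place of the branch where fopen() for reading returns NULL, and a return value of 0 would take the place of the bypass branch.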

The maximum is read again on each pass through the loop, because we sometimes increased it as we introduced new experiments (with new files).

Only one job is assigned per script. If a node stops working for any reason, only one file is left unprocessed, and it can be processed later.

The generated scripts look like the script in Listing Five. After the Cluster program finishes, we check the status files. If any are of size 0, the corresponding script did not finish because of a node failure. We then rerun that script, either manually or by deleting its status file and calling the Cluster program again. If the Windows machine is restarted for some reason, we restart the Cluster program. In the worst case, two scripts execute at the same time on each node. Because the Linux cluster is not dedicated to our processes, we run at the lowest priority of execution using the nice -n 19 command, which lets others use the Linux nodes.


cd
cd  test
nice -n 19 ./nist  1  1  2481  1  128 > /dev/null

Listing Five

Once all the status files are four bytes in size (meaning the word DONE has been written in them), we transfer the result files back to the Windows machine using WinSCP, then run the unmodified Extractor program.
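The status-file checks described above (size 0 after a node failure, four bytes once DONE is written) were done by hand, but they could also be automated along these lines. The enum and function names are assumptions for illustration. Note that a size of 0 also describes a script that is still running, so the check only indicates a failure after the Cluster program has finished.

```c
#include <sys/stat.h>

enum job_state { JOB_PENDING, JOB_RESERVED, JOB_DONE };

/* Classify a status file by existence and size:
   missing  -> never claimed by any node;
   0 bytes  -> claimed, but the script has not finished;
   non-empty -> finished (Listing Four writes "DONE", four bytes). */
enum job_state job_state_of(const char *status_path)
{
    struct stat st;
    if (stat(status_path, &st) != 0)
        return JOB_PENDING;
    return st.st_size == 0 ? JOB_RESERVED : JOB_DONE;
}
```

Looping this over 1.do through the value in max.txt would list the jobs whose status files need to be deleted before the Cluster program is run again.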

