Google Treasure Hunt 2
In order to solve the Google Treasure Hunt 2, I decided to reuse an old file searcher program I made in C++ using multi-threading. I changed small parts, just to guarantee that the program can search the 2 given file patterns, and I thought that Qt (from Trolltech) has some very nice functions to deal with files, and it is really true – you don’t need anything else, but the QFile and QDir classes and its derivatives.
If you like distributed programming, I think you heard about the MapReduce model, created by Google (www.google.com). In according with Jeffrey Dean and Sanjay Ghemawat, in an article (“Distributed Programming with MapReduce”) written to the nice book “Beautiful Code” (O’Reilly, 2007), “MapReduce was developed as a way of simplifying the development of large-scale computations at Google.” I really don’t think that this problem here can be classified as a “large-scale computation”, but I was in love about the way MapReduce works, and decided to apply this concept in my program.
If you take a look at the source code, there are 2 threads; the first one (Mapper) is able to get a list of files with names following certain pattern (containing “BCD” and with extension “.js”, or containing “foo” and ending with “.xml”); the second thread (Reducer) gets the list filled with file names and do a deeper search inside each one of the files (to get values in the number “4” lines). The main method of the Mapper is the following:
bool FileSearchMap::depthFirstTraversal(const QString& currentDir)
{
QDir dirP(currentDir);
QString temp;
QString fileName;
QStringList entryP = dirP.entryList();
QStringListIterator entryL(entryP);
while (entryL.hasNext())
{
producerMutex.lock();
if (myFileCount == 5)
{
bufferNotFull.wait(&producerMutex);
} // if - mutex and signal
producerMutex.unlock();
fileName.clear();
temp = entryL.next();
if ((temp != ".") && (temp != ".."))
{
fileName = currentDir;
fileName.append('\\');
fileName.append(temp);
if (isDirectory(fileName))
{
depthFirstTraversal(fileName);
}
else
{
if (isRegular(fileName))
{
unsigned int ftype = 0;
if ( ( fileName.endsWith(".js") ) &&
( fileName.contains("BCD") ) )
ftype = TXT_FILE_PATTERN;
else if ( ( fileName.endsWith(".xml") ) &&
( fileName.contains("foo") ) )
ftype = XML_FILE_PATTERN;
if ( ftype != 0 )
{
DirEntry dentry( fileName, ftype );
queueMutex.lock();
textFiles.push(dentry);
queueMutex.unlock();
producerMutex.lock();
++myFileCount;
producerMutex.unlock();
bufferNotEmpty.wakeAll();
}
}
} // if - isDirectory / isRegular
} // if
} // while
return true;
}
The method above will be called exactly one time, from the “run” method of my Mapper’s QThread. Putting the execution of the depthFirstTraversal method, firstly I get all files in the directory, calling entryList, from QDir. So, I instantiate a iterator over this file’s list, and, while iterating over the files, before anything I make a call to a wait function on a QWaitCondition, which is a kind of condition variable structure used to synchronize threads, only in the case I get already 5 files (can be any number of files you want). Why do I get exactly 5 files and stopped the thread execution, waiting for the QWaitCondition release? This is to create a buffer with capacity to 5 files (maximum), from which the Reducer thread can iterate over each file name, open it, and find for the given criteria. By the way, the Reducer thread can only deal with a list of 5 files. This is the first rule I used to synchronize these 2 threads: “Mapper, start to feed the file buffer, until it has 5 elements. If so, wait until the Reducer consumes some of these files, and after that you can start again feeding the file name buffer, until the end of days”. And now, the Reducer thread, getting all files Mapper can search:
void KeySearchReduce::run()
{
DirEntry dirEntry;
int cnt = 0;
do
{
prod->producerMutex.lock();
if (prod->myFileCount == 0)
{
prod->bufferNotEmpty.wait(&prod->producerMutex);
if (prod->isSearchDown())
{
prod->producerMutex.unlock();
return;
}
}
prod->producerMutex.unlock();
prod->queueMutex.lock();
dirEntry = prod->getFirstTextFile();
prod->queueMutex.unlock();
QFile

