Second Pass
It's pretty obvious that calculating the concatenation of states m and n in the innermost loop wastes a lot of cycles, since it recalculates the same values over and over. I knew it was inefficient, but I didn't think it was going to matter too much.
Since it turned out that it did matter, I decided to precalculate all the values before entering the four-deep nested comparison loop, with code like this:
std::multiset<char> letters[ 50 ][ 50 ];
for ( int i = 0 ; i < 49 ; i++ )
    for ( int j = i + 1 ; j < 50 ; j++ ) {
        char *p = states[ i ];
        while ( *p )
            letters[ i ][ j ].insert( *p++ );
        p = states[ j ];
        while ( *p )
            letters[ i ][ j ].insert( *p++ );
    }
Then I didn't have to do any computation in my main loop; I just had to modify the comparison line in the innermost loop:
if ( letters[ i ][ j ] == letters[ m ][ n ] ) std::cout << states[ i ] << ", " ...
This modification did indeed speed things up considerably, bringing the run time down from a fraction of an hour to just a few seconds, even with a bit of progress tracing turned on.
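To make the idea concrete, here is a minimal, self-contained sketch of the precompute-then-compare approach. It is not the original program: the names pair_letters, find_matches, and Match are hypothetical, the input is an arbitrary list of strings rather than the 50 state names, and the loop bounds (i < j, i < m < n, all four indices distinct) are an assumption about how the four-deep loop might be arranged.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical record of one hit: pair (i, j) uses the same letters as pair (m, n).
struct Match {
    std::size_t i, j, m, n;
};

// Letter bag for a pair of names: every character of both, repeats included.
std::multiset<char> pair_letters( const std::string &a, const std::string &b )
{
    std::multiset<char> bag( a.begin(), a.end() );
    bag.insert( b.begin(), b.end() );
    return bag;
}

// Hypothetical helper: precompute one letter bag per pair, then compare bags
// in the four-deep loop without rebuilding anything.
std::vector<Match> find_matches( const std::vector<std::string> &names )
{
    const std::size_t count = names.size();

    // Precompute once. Only the entries with i < j are filled in and used.
    std::vector<std::vector<std::multiset<char>>> letters(
        count, std::vector<std::multiset<char>>( count ) );
    for ( std::size_t i = 0 ; i < count ; i++ )
        for ( std::size_t j = i + 1 ; j < count ; j++ )
            letters[ i ][ j ] = pair_letters( names[ i ], names[ j ] );

    // Assumed loop shape: m > i reports each pair of pairs only once.
    std::vector<Match> matches;
    for ( std::size_t i = 0 ; i < count ; i++ )
        for ( std::size_t j = i + 1 ; j < count ; j++ )
            for ( std::size_t m = i + 1 ; m < count ; m++ )
                for ( std::size_t n = m + 1 ; n < count ; n++ ) {
                    if ( j == m || j == n )
                        continue;   // the two pairs must not share a name
                    if ( letters[ i ][ j ] == letters[ m ][ n ] )
                        matches.push_back( { i, j, m, n } );
                }
    return matches;
}
```

Called with the made-up list { "abc", "de", "ab", "cde", "xyz" }, find_matches reports a single match: indices 0 and 1 against indices 2 and 3, since both pairs contain exactly the letters a, b, c, d, e. The innermost loop still runs the same number of times as before, but each iteration now does only a multiset comparison instead of building the letter sets from scratch.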