
Infrared Control of Your PC


May00: Infrared Control of Your PC

Cost-effective hardware and software make it possible

Gavin is a real-time software engineer. He can be contacted at gavin@beesknees.freeserve.co.uk.


Virtually every consumer infrared remote-control device uses a similar protocol. In fact, this is why universal remotes are available -- they all speak the same low-level language. All that distinguishes one from another is the precise sequence of bits that are transmitted.

For the most part, infrared (IR) devices are used to remotely control consumer electronic systems -- TVs, VCRs, stereos, and the like. However, you can also use them to remotely control your computer -- assuming you have the right hardware and software. This is particularly desirable when, say, you've used MP3 encoding to store hundreds of songs on your hard disk and want to remotely select songs on your PC as you do on your CD player. In this article, I'll examine Evation's Irman (short for "Infrared man"), a 3×2×1-inch device that connects to your PC, letting you use the same IR remote control that you use with your TV, VCR, CD, or stereo to control your PC. As Figure 1 shows, the back of the Irman has a 9-pin connector that attaches to the serial port on your PC. On the front of the Irman is an infrared receiver. When the Irman is attached, remote keypresses are translated to sequences of 6 bytes sent to the serial port.

While the Evation web site (http://www.evation.com/) provides instructions for using Irman with software and a list of applications and libraries that support the device, the software tends to be distributed either without source or for UNIX/Linux. One exception is the freely available C-based Winamp plug-in from Nullsoft (http://www.winamp.com/). Winamp is a flexible, high-fidelity music player for Windows 95/98/NT. Here, I'll describe the IR hardware I am using, then present a thin C++ class to map IR signals onto key codes (see Listing One), with a slight detour through Win32 overlapped I/O. Finally, I will suggest a few ways to attach such control to applications and show one in more detail.

The sequence of 6 bytes is fairly arbitrary, but there is a one-to-one correspondence between remote keys and the sequence received. There are occasional synchronization errors, but these generally recover quickly (by the next keypress). I have also found that, for the three different remote units I have used, the sequences are sent continuously for as long as the key on the unit is pressed, and successive sequences have some specific bits inverted. This means that the receiver program should ignore sequences received in quick succession, and can ignore inverted codes, since at least one will be noninverted. All that really matters is that each remote keypress produces, somewhere in the stream of bytes it generates, a 6-byte sequence that matches one previously received and stored.

Irman Driver Software

Given that the Irman connects to the PC's serial port, the driver code is composed of two parts -- a COM port interface and a sequence matcher.

COM port programming is fairly straightforward on Win32. You open the device called "COM1" (or some other port number) and read/write characters. You can use the Win32 functions CreateFile, ReadFile, and WriteFile, or the C stream fopen, fread, and fwrite, or even C++ iostreams if you want. After opening the COM port, various RS-232 parameters must be set -- baud rate, number of bits per character, and so on. This is achieved using the Win32 DCB structure: The program retrieves one of these containing the current settings, changes the appropriate ones, and sends it back to the COM port, as in the Open method in Listing Two. Once the COM port is set up correctly, the Irman is powered up and initialized: It takes power from the RTS and DTR lines, so these have to be set. As far as initialization is concerned, all that is necessary is sending the 2 bytes "IR" to the Irman and if everything is working fine, the program will receive "OK" in return. (There are a few timing requirements, as described in the Irman specification and as can be seen in the listings.)

Apart from issuing the IR sequence, all of the serial interactions in this program are input operations, and all of these bar the OK at the start are 6-byte sequences. However, what if something goes wrong? What if the Irman is removed in midsequence or, more likely, users want to exit the program while it is waiting for some characters from the serial port? I need to be able to interrupt a read, and that is the root of the only significant problem with COM port I/O. By default, the COM port is opened in blocking mode, and there is no timeout, so the program will wait indefinitely for the character sequences. There are a number of ways around this. You could, for example, open in non-blocking mode or have short timeouts and poll the device, but that is fairly wasteful of resources. All other solutions involve separate threads to read the COM port and to service user interactions. In the UNIX world, the select function would let you handle events from different sources asynchronously; or you could set up the COM port reader thread to be signalled (interrupting the read, for instance) when some user-event occurred. In the Win32 world, a more attractive option is asynchronous I/O.

In normal (synchronous) I/O, the program prepares buffers to transmit or receive, calls the appropriate I/O function, and by the time the function returns, the buffers have been dealt with -- completely transmitted (or at least passed to the lower driver layers), completely filled, or some error has occurred. With asynchronous I/O, on the other hand, the program merely initiates the transfer, then continues while Windows, in the background, continues the operation and notifies the application when it has completed. In this case, the application is not permitted to reuse the buffer space until Windows has told it that the I/O transaction has completed: Windows is handling the asynchronous nature of the I/O transfer at the cost of more complex buffer management. Windows NT (and 2000) support asynchronous I/O on virtually every flavor of I/O stream, while Windows 9x restricts it to not much more than COM port I/O (which is jolly convenient in this case).

This is what you need to do to manage asynchronous I/O. First, open the device with the extra flag FILE_FLAG_OVERLAPPED. Note that when a file is opened for asynchronous I/O, all I/O must be asynchronous -- you cannot mix synchronous and asynchronous I/O.

After this, a read is initiated with a ReadFile, as normal, but with an extra parameter pointing to an OVERLAPPED structure. This contains a number of fields relating to the I/O operation, but the only important one here is hEvent, which is the handle of a Win32 manual reset event created specifically for indicating completion of the asynchronous I/O operation.

When the ReadFile returns, there are two possibilities (ignoring errors): The read may have completed already (for example, there might already have been data in the device's buffers), or it might not have completed. In the former case, just continue as before; in the latter, the program must wait for the operation to finish, either by polling or suspending on the completion event flag. Both of these options can be exercised via the GetOverlappedResult function -- its final argument is a flag that determines whether the function waits until the event is signalled. This implies that when interrupting an I/O operation, this event flag can be set from elsewhere (it is just a standard Win32 event) to wake up the waiting thread. The Microsoft documentation includes a CancelIo function that should be invoked to cut short an asynchronous operation. However, this function does not exist on Windows 95, so I have no option but to signal the event and keep my fingers crossed.
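Waking the reader by signalling the completion event from elsewhere has one consequence: once awake, the reader has to find out whether the read really finished or was merely interrupted. The sketch below illustrates that distinction portably, using std::mutex and std::condition_variable in place of a Win32 event. It is an analogy rather than Win32 code, and the class and method names are hypothetical.

```cpp
#include <condition_variable>
#include <mutex>

// Portable stand-in for an event shared by the I/O layer and the
// user-interface thread. All names here are hypothetical.
class ReadCompletion
{
public:
  ReadCompletion() : signalled( false ), dataReady( false ) {}

  // Called when the read has genuinely finished
  void Complete()
  {
    std::lock_guard<std::mutex> lock( m );
    dataReady = true;
    signalled = true;
    cv.notify_all();
  }
  // Called from another thread to wake the reader without any data
  void Interrupt()
  {
    std::lock_guard<std::mutex> lock( m );
    signalled = true;
    cv.notify_all();
  }
  // Blocks until signalled; returns true only if data really arrived
  bool Wait()
  {
    std::unique_lock<std::mutex> lock( m );
    cv.wait( lock, [this] { return signalled; } );
    bool result = dataReady;
    signalled = false;   // reset, ready for the next read
    dataReady = false;
    return result;
  }
private:
  std::mutex m;
  std::condition_variable cv;
  bool signalled;
  bool dataReady;
};
```

A reader that gets false back from Wait() should treat the buffer contents as undefined and either retry or exit, which is the same caution that applies when an overlapped read is cut short by signalling its event.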

So far I've dealt with low-level I/O operations. I now have an interruptible mechanism for retrieving byte sequences from the Irman. The next level involves handling error conditions. First, as I mentioned earlier, the Irman returns a number of byte sequences for each keypress, at least with the remotes I have used. I make the assumption that users are not going to hit keys more rapidly than a couple per second (though this parameter is settable in the registry). So, after receiving a sequence, I ignore all others for the next half second. An additional mechanism for getting rid of spurious data is to flush the COM device's buffers (via PurgeComm) when I suspect there might be garbage.
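The half-second rule can be sketched independently of the serial port. The following is a minimal illustration, not the code from the listings: the class name RepeatFilter and its Accept method are hypothetical, and timestamps are passed in as plain millisecond counts (such as those returned by GetTickCount) so that the logic can be exercised without any hardware.

```cpp
#include <cstdint>

// Hypothetical helper: decides whether a sequence arriving at time 'nowMs'
// is a new keypress or a repeat that should be discarded.
class RepeatFilter
{
public:
  explicit RepeatFilter( std::uint32_t delayMs ) :
    delay( delayMs ), lastAccepted( 0 ), haveAccepted( false ) {}

  bool Accept( std::uint32_t nowMs )
  {
    // Unsigned subtraction stays correct when the millisecond counter wraps
    if( haveAccepted && nowMs - lastAccepted < delay )
      return false;
    lastAccepted = nowMs;
    haveAccepted = true;
    return true;
  }
private:
  std::uint32_t delay;         // minimum gap between accepted keys (ms)
  std::uint32_t lastAccepted;  // time of the last accepted key (ms)
  bool haveAccepted;           // false until the first key is accepted
};
```

With a 500 ms delay, a sequence at 1000 ms is accepted, repeats at 1100 ms and 1499 ms are dropped, and one at 1500 ms counts as a new keypress.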

The top level of the driver has to match Irman byte sequences with index numbers. This is a trivial search through a preloaded array of sequences. (If there were a lot of key codes, a hash table would be a better choice for data structure, but a simple vector will do here.) The array, along with any other operational parameters, is stored in the registry via the trivial C++ class Registry (available electronically; see "Resource Center," page 5).
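As a rough illustration of that search, here is a sketch that uses standard containers instead of the raw array in the listings. The Sequence type and the FindKey function are hypothetical names, and the code assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// One Irman sequence: 6 bytes, as described in the text
typedef std::array<unsigned char, 6> Sequence;

// Hypothetical helper: returns the index of 'received' within 'known',
// or -1 if it matches none of the stored sequences.
int FindKey( const std::vector<Sequence>& known, const Sequence& received )
{
  std::vector<Sequence>::const_iterator it =
    std::find( known.begin(), known.end(), received );
  return it == known.end() ? -1 : static_cast<int>( it - known.begin() );
}
```

The search is linear in the number of keys, which is perfectly adequate for the few dozen buttons on a typical remote.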

I now have a facility for performing the blocking, but interruptible, reading and decoding of IR signals. To make use of it in an application, I need to spawn the reader off as a separate thread, so that the main thread can interact with the user. This reader thread also causes some action in response to the IR keypress. I did contemplate wrapping this in a C++ class, but later decided that the options were too varied (direct invocation of functions, or call-backs in other words; sending Windows messages; COM invocations; or socket messages, to name just a few) to make such a class tidy and efficient.

Instead, because the editor gets upset if I overrun, I will just detail the simplest mechanism for controlling another application -- Windows messages. The reader thread (available electronically; see "Resource Center," page 5, or my web site at http://www.beesknees.freeserve.co.uk/software/) loops reading key indexes from the Irman class and either invokes the appropriate operation via a switch statement or sets the mapping from the Irman's byte sequence to key code. This thread terminates when the user interface thread sets the going flag to false: The foreground thread also invokes the Irman object's Interrupt method to cancel any outstanding I/O operation, to avoid letting the reader thread block indefinitely.
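The shape of such a reader loop might look like the sketch below. It is not the thread function distributed with the article: RunReader, the Action enumeration, and the two callbacks are hypothetical stand-ins, with readKey playing the role of Irman::Key() and going the flag cleared by the user interface thread.

```cpp
#include <atomic>
#include <functional>

// Hypothetical key indexes; a real program would define one per remote key
enum Action { Play = 0, Stop = 1, Next = 2 };

// Loops until 'going' is cleared; returns how many keys were acted upon
int RunReader( const std::atomic<bool>& going,
               const std::function<int()>& readKey,
               const std::function<void( Action )>& perform )
{
  int handled = 0;
  while( going )
  {
    int key = readKey();   // blocks until a key arrives or is interrupted
    if( !going )
      break;               // woken up only so that the thread can exit
    switch( key )
    {
      case Play:
      case Stop:
      case Next:
        perform( static_cast<Action>( key ) );
        ++handled;
        break;
      default:
        break;             // unrecognised key or interrupted read (-1)
    }
  }
  return handled;
}
```

Checking the flag again immediately after the read matters: an interrupted read returns no useful key, and the loop should exit rather than act on it.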

Winamp Control

As a concrete example of application control, I chose Winamp mainly because Nullsoft has provided a comprehensive Windows message-based control interface, but also because Winamp has a nice plug-in scheme that lets my code mesh.

Winamp is easy to control via a well-defined set of Windows messages, described in the winamp.h header file in the SDK (available at http://members.xoom.com/plugindev/). All you need to do is locate the Winamp window and send the appropriate message. There are two ways to couple extra code to Winamp -- keep the code in a totally separate executable and use something like the Win32 function FindWindow to locate the Winamp window; or make it a DLL with a few specific entry points and enable it to be loaded automatically by Winamp. The second technique has the advantages that the Winamp window handle is available immediately so there is no need to search for it, that the code is smaller since it is merely a DLL to be loaded, and that the code is more tightly integrated with Winamp. The former has the advantages that I can use the same approach with other programs that are not quite as plug-in friendly and it is easier to debug code in a separate application.

I wrapped the small number of Winamp message sends in a simple WinampManager class (available electronically), which maps member functions onto the specific message being sent.
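To give a flavor of such a wrapper without reproducing the WinampManager class itself, here is a sketch in which the message sender is injected, so the mapping can be tried out away from Windows. The class name WinampCommands is hypothetical. The command identifiers are the values as I recall them from the Winamp SDK header; treat them as assumptions and check them against the header before relying on them.

```cpp
#include <functional>

// Assumed values: WM_COMMAND is the standard Win32 message number; the
// button identifiers should be verified against the Winamp SDK header.
const unsigned int  kWmCommand = 0x0111;
const unsigned long kPrevious  = 40044;
const unsigned long kPlay      = 40045;
const unsigned long kPause     = 40046;
const unsigned long kStop      = 40047;
const unsigned long kNext      = 40048;

// Hypothetical wrapper: each member function sends one specific message
class WinampCommands
{
public:
  typedef std::function<void( unsigned int msg, unsigned long wParam )> Sender;

  explicit WinampCommands( const Sender& s ) : send( s ) {}

  void Previous() const { send( kWmCommand, kPrevious ); }
  void Play() const     { send( kWmCommand, kPlay ); }
  void Pause() const    { send( kWmCommand, kPause ); }
  void Stop() const     { send( kWmCommand, kStop ); }
  void Next() const     { send( kWmCommand, kNext ); }
private:
  Sender send;  // on Windows, this would forward to SendMessage
};
```

On Windows, the sender would typically be a small function that calls SendMessage on the Winamp window handle, obtained either from the plug-in interface or from FindWindow (the window class name, which I believe is "Winamp v1.x", is likewise an assumption to verify).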

Tying It All Together

Although the reader thread code is C++ and involves some string manipulation, I deliberately steered clear of the STL. Some STL implementations do not have thread-safe string classes, and working around that was more effort than it was worth for such a short piece of code. (There are some patches to the STL that could make it thread safe, at quite a computational cost; see, for instance, http://www.dinkumware.com/.)

The application does not have a main window itself, but just occupies a space in the system tray. A click on the icon pops up a menu that offers the configuration dialog box or an opportunity to quit.

When each key sequence is processed, the tray icon toggles between normal and inverted, just to confirm that something is happening (because I prefer not to trust hardware unless I can see an immediate result).

The configuration dialog (Figure 2) lets you test the Irman key definitions by just displaying the name of the key pressed instead of carrying out the intended action -- this is very useful for debugging.

Conclusion

The starting point of this exercise for me was the desire to control a PC via a wireless link. I found that there were a few cost-effective ways to use remote control, one of the easiest being the Irman. Making use of it in a program was not completely trivial, but the C++ class and associated code I have shown here make it straightforward to use the device to drive another application. As a bonus, I now have a PC that effectively replaces my CD player -- and I don't have to get up to change the disk every hour or so.

DDJ

Listing One

class Irman
{
public:
  // Initialise from registry (or take defaults)
  Irman( const TCHAR* regKey,   // Registry key under which to find values
         int numKeys );         // Number of keys on the Remote
  // Shutdown, and rewrite configuration to the registry
  ~Irman();
  // Get and set the port name (COMx)
  const TCHAR* Port() const { return comPortName; }
  void Port( const TCHAR* comPort );
  // Get and set the inter-key delay (milliseconds)
  unsigned long Delay() const { return interKeyDelay; }
  void Delay( unsigned long d ) { interKeyDelay = d; }
  // Wait for a data packet to be received and return the index into
  // the vector that represents it (or -1 if not recognised)
  int Key();
  // Trigger next received key sequence to be stored for indicated key code
  void SetKey( int key );
  // Interrupt the current read - used from a separate task
  void Interrupt();
private:
  // Each Irman key is decoded to a sequence of 6 bytes
  struct KeyCode
  {
    unsigned char code[ 6 ];
    // Comparison does not modify either operand, so it is declared const
    bool operator==( const KeyCode& key ) const
    {
      for( unsigned i = 0; i < sizeof( code ); ++i )
        if( key.code[ i ] != code[ i ] )
          return false;
      return true;
    }
  };
  TCHAR comPortName[ 5 ];  // contains COMx\0
  volatile HANDLE comPort; // Handle to the opened COM port
  HANDLE ioCompletion;     // Handle used for overlapped I/O
  // Where to read/write values - passed to constructor
  const TCHAR* regKey;
  // How many keys on the remote - passed to constructor
  int numKeys;
  // Codes corresponding to each key to be recognised (numKeys long)
  KeyCode* keyCodes;
  // Open and close the COM port
  void Open();
  void Close();
  // Read a complete Irman sequence
  bool Read( KeyCode& );
  // Time (ms) last Irman sequence read
  unsigned long keyTime;
  // Key code to which to set next read key
  volatile int setKey;
  // Low level (blocking, but interruptible via Interrupt() above)
  // COM port read and write
  bool ReadWait( void* data, unsigned long size );
  bool WriteWait( const void* data, unsigned long size );
  // Waggle the control lines to power up or down the Irman
  void PowerOn() const;
  void PowerOff() const;
  // Discard any characters in the COM port buffers
  void Flush();
  // Time to wait from reading one key to the next
  volatile unsigned long interKeyDelay;
  // Disable copying
  Irman( const Irman& );
  Irman& operator=( const Irman& );  
};


Listing Two

Irman::Irman( const TCHAR* regKey_,  int numKeys_ ) :
        regKey( regKey_ ), numKeys( numKeys_ ),
        keyCodes( new KeyCode[ numKeys_ ] ),
        comPort( INVALID_HANDLE_VALUE ),
        keyTime( GetTickCount() ), setKey( -1 ),  // -1 means "not recording"
        ioCompletion( CreateEvent( NULL, TRUE, FALSE, NULL ) )
{
  // Read the port name, inter key delay and all the key codes from
  // the registry, and then fire up the device
  RegistryKey reg( HKEY_LOCAL_MACHINE, regKey );
  reg.Read( RegPort, comPortName, 
                   sizeof( comPortName ) / sizeof( TCHAR ), _T("COM2") );
  interKeyDelay = reg.Read( RegDelay, 500 );
  for( int i = 0; i < numKeys; ++i )
  {
    static KeyCode blankKeyCode;
    TCHAR num[ 16 ];
    wsprintf( num, _T("%03d"), i );
    reg.Read( num, keyCodes[ i ].code, sizeof( KeyCode ), &blankKeyCode );
#   ifdef VERBOSE
      TCHAR buff[ 40 ];
      wsprintf( buff, _T("Key %d is %02X %02X %02X %02X %02X %02X\n"), i,
       keyCodes[ i ].code[0], keyCodes[ i ].code[1], keyCodes[ i ].code[2],
       keyCodes[ i ].code[3], keyCodes[ i ].code[4], keyCodes[ i ].code[5] );
      OutputDebugString( buff );
#   endif
  }
  Open();
}
Irman::~Irman()
{
  // Close everything down, and write back to the registry
  Close();
  // CreateEvent returns NULL (not INVALID_HANDLE_VALUE) on failure
  if( ioCompletion != NULL )
  {
    CloseHandle( ioCompletion );
    ioCompletion = NULL;
  }
  RegistryKey reg( HKEY_LOCAL_MACHINE, regKey );
  reg.Write( RegPort, comPortName );
  reg.Write( RegDelay, interKeyDelay );
  for( int i = 0; i < numKeys; ++i )
  {
    TCHAR num[ 16 ];
    wsprintf( num, _T("%03d"), i );
    reg.Write( num, keyCodes[ i ].code, sizeof( KeyCode ) );
  }
}
void Irman::Open()
{
  const TCHAR* error = _T("Error opening device");
  comPort = CreateFile( comPortName, GENERIC_READ | GENERIC_WRITE,
                        0, NULL, OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL );
  if( comPort == INVALID_HANDLE_VALUE )
    MessageBox( NULL, _T("Could not open COM port - maybe something ")
                      _T("else is using it"), error, MB_ICONEXCLAMATION | MB_OK );
  else
  {
    DCB dcb;
    if( Verify( GetCommState( comPort, &dcb ) ) )
    {
      dcb.BaudRate    = CBR_9600;
      dcb.fParity     = 0;
      dcb.Parity      = NOPARITY;
      dcb.ByteSize    = 8;
      dcb.StopBits    = ONESTOPBIT;
      dcb.fDtrControl = DTR_CONTROL_DISABLE;
      dcb.fRtsControl = RTS_CONTROL_DISABLE;
      if( Verify( SetCommState( comPort, &dcb ) ) )
      {
        PowerOff();           // Just in case it was already on
        Sleep( 200 );
        PowerOn();
        Sleep( 100 );         // Time for the output to settle
        Flush();              // Remove power up garbage
        WriteWait( "I", 1 );  // These strings must be ASCII, not Unicode
        Sleep( 2 );           // Need to have >500us between the 'I' & the 'R'
        WriteWait( "R", 1 );

        char data[ 2 ];
        if( ReadWait( data, 2 ) && data[ 0 ] == 'O' && data[ 1 ] == 'K' )
          return;
        else
          MessageBox( NULL, _T("Irman not responding"), 
                               error, MB_ICONEXCLAMATION | MB_OK );
      }
    }
  }
  // To get this far, something must have gone wrong
  Close();
}
void Irman::Close()
{
  if( comPort != INVALID_HANDLE_VALUE )
  {
    Verify( CloseHandle( comPort ) );
    comPort = INVALID_HANDLE_VALUE;
  }
}
void Irman::Port( const TCHAR* comPort )
{
  _tcsnccpy( comPortName, comPort, sizeof( comPortName ) / sizeof( TCHAR ) );
  comPortName[ sizeof( comPortName ) / sizeof( TCHAR ) - 1 ] = 0;
  // Reopen the port if the name changed - I could have checked the new
  // and old names and if they were the same, skip the reopen. However,
  // this way, I can force a recover from a "stuck" I/O port...
  Close();
  Open();
}
void Irman::SetKey( int key )
{
  if( comPort == INVALID_HANDLE_VALUE )
  {
    MessageBox( NULL, _T("COM port not valid - can't configure"), 
               _T("IR Configuration Error"), MB_ICONEXCLAMATION | MB_OK );
    return;
  }
  if( key < 0 || key >= numKeys )  // valid indexes are 0 .. numKeys-1
  {
    return;
  }
  // Just indicate to the reading function that it should store the next
  // sequence instead of matching it
  setKey = key;
}
int Irman::Key()
{
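  if( comPort == INVALID_HANDLE_VALUE )
  {
    // Open() returns nothing; it reports failure by leaving comPort invalid
    Open();
    if( comPort == INVALID_HANDLE_VALUE )
      return -1;
  }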

  // The Irman reports a number of sequences for each key - chuck away
  // old ones before reading the next key.
  Flush();

  KeyCode key;

  // Loop for a minimum time, to get rid of old duplicate/inverted messages
  unsigned long startTime = keyTime;
  do
  {
    if( !Read( key ) )
      return -1;
  } while( keyTime - startTime < interKeyDelay );

  // Now, key contains a valid code sequence, so do something with it

  if( setKey != -1 )
  {
    // If we're in record mode, just use this sequence for the relevant
    // entry in keyCodes[]
    keyCodes[ setKey ] = key;
    int retVal = setKey;
    setKey = -1;
    return retVal;
  }
  else
  {
    // If we're not in record mode, scan the list to find a match,
    // and repeat for up to the inter key period before giving up and
    // admitting it's unrecognised - the reason for the loop is to
    // catch any inversions along the way
    startTime = keyTime;
    do
    {
      for( int i = 0; i < numKeys; ++i )
        if( key == keyCodes[ i ] )
          return i;
      if( !Read( key ) )
        return -1;
    } while( GetTickCount() - startTime < interKeyDelay );
    return -1; // No key found
  }
}

bool Irman::Read( KeyCode& key )
{
  bool success = ReadWait( key.code, sizeof( key.code ) );
# ifdef VERBOSE
    if( success )
    {
      TCHAR buff[ 40 ];
      wsprintf( buff, _T("Code %02X %02X %02X %02X %02X %02X\n"),
                      key.code[0], key.code[1], key.code[2],
                      key.code[3], key.code[4], key.code[5] );
      OutputDebugString( buff );
    }
# endif
  keyTime = GetTickCount();
  return success;
}
