Ever wonder what goes on when your machine collects e-mail? Here are a few of the protocols involved.
Recent growth in the Internet has created a push for new and better standards in PC client/server programming. The explosive interest in Internet connectivity drives home the point that interaction, especially from person to person, is of extreme value both to corporations and to individuals. Typically this desire for interactivity manifests itself in the form of electronic mail (e-mail), the Internet's oldest and most popular service.
E-mail takes many different forms. Numerous standards for transport and mail storage exist, providing for a variety of e-mail needs on a multitude of computing platforms. This article shows how one of those platforms, an 80x86 PC running Windows, can use the services of a particular mail-handling model called POP3 to collect and store electronic messages.
Mailer Building Blocks
This implementation combines three components that have become industry standards: the Windows Sockets Specification 1.1 (WinSock 1.1), Windows 3.1, and the Post Office Protocol Version 3 (POP3).
A Windows-based POP3 implementation wouldn't be possible without the efforts of those who developed WinSock 1.1, which was finally formalized in January of 1993. Designed to mimic the "socket" paradigm popularized by BSD UNIX, WinSock has become an integral component of Windows-based TCP/IP client/server programming. Although currently undergoing some revision, the 1.1 specification boasts a wide range of implementations and an even larger base of client software. Integrating seamlessly with the message-driven nature of Windows 3.1, WinSock has extended the reach of Windows-based PCs into the TCP/IP realm.
WinSock 1.1 provides the client/server programmer with all of the socket functionality inherent in BSD UNIX, as well as several additional routines needed to accommodate the non-preemptive nature of Windows 3.1. To use this functionality, the programmer must acquire a copy of the WINSOCK.H header file, which describes all the WinSock structures and definitions, as well as WINSOCK.LIB, to link into the project (see end of article).
Although not owned or controlled by Microsoft, WinSock is designed expressly for integration with Windows. In fact, the two are inseparable, and the WinSock programmer must have a good understanding of programming under Microsoft Windows. As a secondary focus, this article presents a simple Windows interface to demonstrate most of the essential techniques of Windows programming. I implement this interface in the classic style using the Windows API in ANSI C. I originally developed this interface using Borland's C++ v4.52 for the medium model.
The POP3 Protocol
The goal of any client software is to interact with a server. WinSock can oversee the client/server connection and can send packets of data back and forth between the two. But what those packets represent, and when and how they should be transmitted, is entirely up to the implemented protocol. My implementation uses POP3. POP has a history of being very reliable and easy to implement. It consists of just a few native commands. The POP3 protocol has so entrenched itself that it's almost impossible to find any serious e-mail client software that doesn't use it. In fact, most e-mail clients, including the Windows 95 Inbox, depend on POP3 for all of their remote connectivity. As I show in the implementation of PopMail, POP3 is a state-driven protocol. This adds to its reliability, because only a few commands are valid in any particular state.
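To make the state-driven idea concrete, here is a minimal sketch of a command-validity check. It is not taken from the PopMail listings; the names Pop3State and pop3_command_allowed are hypothetical, the state names and command sets follow RFC 1725, and the sketch assumes the caller passes the command keyword in upper case.

```c
#include <assert.h>
#include <string.h>

/* Session states as named in RFC 1725. */
typedef enum { POP3_AUTHORIZATION, POP3_TRANSACTION, POP3_UPDATE } Pop3State;

/* Hypothetical helper: returns 1 if the upper-case command keyword is
   valid in the given state, 0 otherwise. */
int pop3_command_allowed(Pop3State state, const char *cmd)
{
    static const char *auth_cmds[] = { "USER", "PASS", "APOP", "QUIT", NULL };
    static const char *trans_cmds[] = { "STAT", "LIST", "RETR", "DELE", "NOOP",
                                        "RSET", "TOP", "UIDL", "QUIT", NULL };
    const char **table;
    int i;

    assert(cmd != NULL);

    switch (state) {
    case POP3_AUTHORIZATION: table = auth_cmds;  break;
    case POP3_TRANSACTION:   table = trans_cmds; break;
    default:                 return 0; /* the UPDATE state accepts no commands */
    }

    for (i = 0; table[i] != NULL; i++) {
        if (strcmp(table[i], cmd) == 0)
            return 1;
    }
    return 0;
}
```

A client could call such a check before sending each command, so that a RETR issued before login is caught locally instead of drawing an error reply from the server.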
The PopMail Design
A WinSock-based program can almost invariably be divided into three broad constituent parts: the "interface layer," which contains all of the code necessary for user interaction; the "network layer," which contains the code that uses WinSock services; and the "protocol layer," which determines the type of client being developed. Typically, minor modifications to the network layer and implementation of a new protocol are all that is needed to convert one type of WinSock client into another. The basic interface and network routines are usually identical across different types of clients. Thus, I've designed PopMail to facilitate easy transitions to other implementations. To that end, I've divided the code into three main source files. POPMAIL.C (Listing 1) contains the bulk of the user I/O code; NETWORK.C (Listing 2) contains most of the WinSock service calls; and POP3.C (Listing 3) houses the POP3 protocol implementation. Although distinguishable from one another, these basic divisions are by no means isolated subsystems. In fact, each depends in some way on the others for at least a portion of its functionality.
POPMAIL.C: The Interface Layer
PopMail begins life, as do all Windows programs, with a call to WinMain (see Listing 1). WinMain registers two "classes," or types of window objects, which will be used during the project. The first window class describes the primary window, for interaction with the user. This window functions primarily as a placemat or storage bin, which will hold other, more useful window objects.
One notoriously difficult task for beginning Windows programmers is simply inserting text into the client area of an ordinary window. Such a feat can generate many lines of source code. Fortunately, there is an easy remedy to this problem. I have found it quite useful to map an "edit" window across the entire surface of the main window's client area. By their very nature, edit windows already know how to respond to keystrokes, receive instructions for the display of text within their client area, react to scrollbar events, and repaint themselves when needed. For these reasons, using an edit window to handle text I/O seems sensible. Just be sure to resize the edit window whenever the size of the main window is altered.
PopMail also creates and registers a window class called Pop3. As I will show, this type of window receives all the messages generated by WinSock events and responds to them accordingly.
PopMail uses a Windows "timer" resource to know when to poll the server for waiting messages. Timers send messages to certain windows at defined intervals, with fairly reliable accuracy. They are great tools for automating simple Windows programs.
POPMAIL.C manages a dialog box object that allows the user to change configuration parameters. Dialog box objects come in two varieties: modal and non-modal. Modal dialog box objects pause execution of the program until the dialog box is resolved, usually by pressing the OK or CANCEL button. Non-modal dialog box objects allow the program to continue while resolution is pending. Which type of dialog box to use depends upon application specifics. Generally, I try to use modal dialog boxes when the input can be gathered quickly and doesn't require the completion of some other task. PopMail applies the modal approach to its configuration dialog box object.
The layout of a dialog box object is specified in a "resource" file. Resource files contain the definitions of all the resources a program intends to use, including dialog box objects, icons, bitmaps, string tables, and menus. POPMAIL.RC contains the various resources used in this project. (POPMAIL.RC, as well as full source code, is available on this month's code disk and online sources. See page 3 for details.)
The Windows API provides facilities for the management of configuration files. These facilities come in the form of two system calls: GetPrivateProfileString and WritePrivateProfileString. With these two API calls, the Windows programmer can easily manipulate a program's .INI file, storing and retrieving configuration information whenever needed. PopMail takes advantage of these capabilities by storing its configuration parameters in a file called POPMAIL.INI.
NETWORK.C: The Network Layer
NETWORK.C (Listing 2) contains the routines necessary for initializing the WinSock TCP/IP stack, retrieving remote host information, and connecting to a server. The call to WSAStartup in the InitNetwork function performs the necessary DLL initialization and prepares the subsystem for use. It must be the first WinSock API call performed. WSAStartup retrieves useful information about the WinSock implementation used to produce the DLL and what version(s) of the WinSock specifications are supported.
The call to the WinSock API WSAAsyncGetHostByName in the LookupServer function initiates a network event and returns a handle to that event. The event in question is a Domain Name System (DNS) lookup of the named server, performed in the hope of finding its IP address. Since the requested event takes place across a computer network, the length of time required to complete it cannot be known in advance. This is a pivotal concept, which the WinSock programmer must firmly grasp. Due to the non-preemptive nature of Windows 3.1, any blocking function that requires interaction across a network can cause the calling computer to block or "hang" for indeterminate periods of time.
To cope with this indeterminacy, WinSock incorporated the WSA family of function calls. WSAAsyncGetHostByName is an asynchronous, non-blocking version of BSD's gethostbyname. Note that this function does not return the result of the lookup; rather, its return value specifies a unique task handle that can be used later to identify completion of this network request. WSAAsyncGetHostByName initiates the network event and returns immediately. The call takes the following form:
HANDLE WSAAsyncGetHostByName(HWND Wnd, unsigned int wMsg, const char FAR* Name, char FAR* HostInfo, int HfLen);
This function instructs the WinSock layer to send wMsg to the window Wnd when the lookup of the server Name completes, whether it succeeds or fails. The wParam parameter of this message will contain the unique task handle, as returned by the original function call. This allows any number of simultaneous outstanding events to be distinguished. The resulting information concerning Name will be stored in HostInfo, a caller-supplied buffer of HfLen bytes.
ConnectToServer demonstrates the classic technique used to make a stream socket connection to a listening server. First, a call to the WinSock API socket function instructs the WinSock layer to allocate a new socket descriptor. The SOCK_STREAM parameter to this function informs WinSock that this particular socket will be used in a reliable, bidirectional connection to some remote host, and that the Transmission Control Protocol (TCP) will carry the communications. POP3 is the application protocol that runs on top of this TCP connection. TCP provides reliable, ordered delivery, so the client/server application does not have to detect or recover lost data itself.
Once the socket descriptor has been created, it must be assigned a local port number and a local IP address under which to operate. It doesn't matter which of the client's ports is used for server communication. Thus, ConnectToServer assigns a value of zero to the variable LocalAddr.sin_port. Assigning zero to this variable allows the WinSock layer to assign any available port. The macro INADDR_ANY is assigned to the LocalAddr.sin_addr.s_addr variable for similar reasons. You can provide specific values for either of these parameters if specific ports and/or IP addresses must be used by the client.
The next step is to bind the socket using the LocalAddr parameters. If the bind completes successfully, the client side of the stream-socket connection is established. Now the other side of the connection must be made. Unlike the client side of the connection, the server side must use a well-defined port number and a specific IP address. Each standard Internet service is assigned a well-known port number. Hosts that wish to provide Internet services install server software that will "listen" on that port number for incoming client connections. The port for the POP3 mail service is 110; ConnectToServer places this value in the ServerAddr.sin_port variable. Likewise, the server's IP address must be placed in ServerAddr.sin_addr.s_addr. Now the client is almost ready to connect.
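The address setup described above can be sketched as follows. This is an illustration rather than the code from Listing 2: the helper names fill_local_addr and fill_server_addr are hypothetical, and a dotted-decimal string stands in for the address that PopMail obtains from its DNS lookup. The conditional includes are an assumption that lets the same sketch compile against WINSOCK.H or against BSD socket headers.

```c
#include <assert.h>
#include <string.h>
#ifdef _WIN32
#include <winsock.h>
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#endif

#define POP3_PORT 110 /* well-known port for the POP3 service */

/* Client side: let the stack choose any free port and any local address. */
void fill_local_addr(struct sockaddr_in *local)
{
    assert(local != NULL);
    memset(local, 0, sizeof(*local));
    local->sin_family = AF_INET;
    local->sin_port = htons(0);                  /* zero means "any port" */
    local->sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */
}

/* Server side: fixed POP3 port plus the server's IP address.
   Returns 0 on success, -1 if dotted_ip is not a valid address. */
int fill_server_addr(struct sockaddr_in *server, const char *dotted_ip)
{
    assert(server != NULL && dotted_ip != NULL);
    memset(server, 0, sizeof(*server));
    server->sin_family = AF_INET;
    server->sin_port = htons(POP3_PORT);         /* network byte order */
    server->sin_addr.s_addr = inet_addr(dotted_ip);
    return (server->sin_addr.s_addr == INADDR_NONE) ? -1 : 0;
}
```

The first structure would be passed to bind and the second to connect. Both port fields are stored in network byte order, which is why the port number goes through htons.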
Note the call to WSAAsyncSelect. In many ways, this call is similar in nature to WSAAsyncGetHostByName, which I have already discussed. Basically, this function call registers an interest in certain events that may be generated by the socket as it connects to the server and afterward. PopMail is interested in the FD_CONNECT event, which is generated when the connection attempt completes; the FD_READ event, which is signaled each time data has arrived at the client end of the socket connection; and the FD_CLOSE event, which is sent when the connection has been severed. From this point forward, any occurrence of these events will trigger a message that will be sent to the specified window. Take a look at the if statement that calls the connect function. The body of the if statement ignores an error of type WSAEWOULDBLOCK. Calling WSAAsyncSelect places the socket in non-blocking mode, so connect cannot wait for the connection to complete; it fails immediately, and WSAGetLastError reports WSAEWOULDBLOCK. This error simply means that connect has requested an operation that would block on a socket that was set up for non-blocking operation. Since an interest in the FD_CONNECT event has already been registered, the error doesn't matter: the connection attempt continues in the background, and its completion is reported through the FD_CONNECT message.
POP3.C: The Protocol Layer
Listing 3 shows POP3.C, the client's protocol layer code. Once a reliable socket connection has been established with a POP3 server, the mail transfer process can begin. Initially, the POP3 session is in the authorization state, in which the incoming connection must identify itself to the POP3 server by sending an acceptable user name and password. Once this has been completed successfully, the session moves into the transaction state, wherein statistical information on the total number of messages waiting at the server and their combined size is transmitted to the client. Once this information is made available, the client begins to request each message, one at a time. After each message is successfully transmitted and the client saves it to a .MSG ASCII file, the client instructs the server to flag the message for deletion. When all messages have been retrieved, the client issues the QUIT command and the session moves into the update state. The server removes all messages flagged for deletion and breaks the connection. As you can see, POP3 is a very simple protocol.
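Two small pieces of this exchange can be sketched in portable C: parsing the server's reply to the STAT command, which RFC 1725 defines as "+OK" followed by the message count and the total size in octets, and formatting the per-message RETR and DELE commands. The function names pop3_parse_stat and pop3_format_cmd are hypothetical and are not taken from Listing 3. The sketch also assumes a compiler that provides snprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parses a STAT reply of the form "+OK <count> <octets>".
   Returns 0 on success, -1 for an error reply or a malformed line. */
int pop3_parse_stat(const char *reply, int *count, long *octets)
{
    assert(reply != NULL && count != NULL && octets != NULL);
    if (strncmp(reply, "+OK", 3) != 0)
        return -1;
    if (sscanf(reply + 3, "%d %ld", count, octets) != 2)
        return -1;
    return 0;
}

/* Formats "<keyword> <msg>" terminated by CRLF, e.g. "RETR 1\r\n".
   Returns 0 on success, -1 if the buffer is too small. */
int pop3_format_cmd(char *buf, size_t buflen, const char *keyword, int msg)
{
    int n;

    assert(buf != NULL && keyword != NULL);
    n = snprintf(buf, buflen, "%s %d\r\n", keyword, msg);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With the count in hand, a client can loop from message 1 up to the count, sending RETR and then DELE for each message before finishing the session with QUIT.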
Note that the POP3 implementation takes the form of a window procedure. This is necessary due to the message-driven design of WinSock: because the socket connection with the remote host posts a Windows message each time certain events occur, something must receive those messages, and in Windows that something is a window procedure. When a socket connection is requested in PopMail, the events of interest are FD_CONNECT, FD_READ, and FD_CLOSE. In other words, PopMail wants the protocol layer to be informed each time a connection is made to a remote host, a connection is severed with a remote host, or data is present on a socket and needs to be processed. The POP3 window procedure is also the recipient of the completion message posted for the WSAAsyncGetHostByName request.
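The decision such a window procedure makes for each socket message can be sketched without any Windows headers. In WinSock 1.1, the lParam of a WSAAsyncSelect message carries the event code in its low 16 bits and an error code in its high 16 bits, which is what the WSAGETSELECTEVENT and WSAGETSELECTERROR macros extract. The EV_ constants below are stand-ins that mirror the FD_READ, FD_CONNECT, and FD_CLOSE values from WINSOCK.H; the Action names and the classify_socket_message function are hypothetical and do not come from Listing 3.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the WinSock 1.1 event bits, so the sketch compiles anywhere. */
#define EV_READ    0x01 /* mirrors FD_READ    */
#define EV_CONNECT 0x10 /* mirrors FD_CONNECT */
#define EV_CLOSE   0x20 /* mirrors FD_CLOSE   */

typedef enum {
    ACT_NONE,          /* event PopMail did not ask for */
    ACT_BEGIN_SESSION, /* connected: wait for the server greeting */
    ACT_READ_REPLY,    /* data waiting: receive and feed the protocol */
    ACT_CLEAN_UP,      /* connection severed: release the socket */
    ACT_REPORT_ERROR   /* the event carried a non-zero error code */
} Action;

/* Splits lparam into event (low word) and error (high word) and
   chooses what the protocol layer should do next. */
Action classify_socket_message(unsigned long lparam, unsigned *error_out)
{
    unsigned event = (unsigned)(lparam & 0xFFFFUL);
    unsigned error = (unsigned)((lparam >> 16) & 0xFFFFUL);

    assert(error_out != NULL);
    *error_out = error;

    if (error != 0)
        return ACT_REPORT_ERROR;

    switch (event) {
    case EV_CONNECT: return ACT_BEGIN_SESSION;
    case EV_READ:    return ACT_READ_REPLY;
    case EV_CLOSE:   return ACT_CLEAN_UP;
    default:         return ACT_NONE;
    }
}
```

In a real window procedure, the same switch would sit inside the case for the application-defined message that was passed to WSAAsyncSelect, with wParam identifying the socket.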
Client/server programming is here to stay, and so is the Internet. Combined with the inherent need to interact with one's peers, electronic mail will probably continue to be the great common denominator on computer networks. Thanks to tools like Microsoft Windows, WinSock, and POP3, network communication can be implemented easily, reliably, and in short order!
Information and Code Sources
For a book on Windows programming, see Charles Petzold's Programming Windows 3.1 (Microsoft Press, 1992).
WINSOCK.H and WINSOCK.LIB are available via ftp from ftp.microsoft.com/bussys/Winsock/spec11/.
WinSock information is available on the World Wide Web at http://www.r2m.com/windev/winsock.html, and on Usenet at alt.winsock, alt.winsock.programming.
The POP3 protocol is described in RFC1725, http://ds.internic.net/rfc/rfc1725.txt.
Arvel Hathcock is a senior programmer analyst with Mailing List Systems Corporation in Arlington, TX. He can be reached at [email protected]