HTML Conversion and FTP Automation

Lauren's HTML Automator program--built using IST's OpenExchange file conversion DLL, Distinct's TCP/IP Winsock library and SDK, and Borland's Delphi development environment--automates the process of converting data to HTML format and posting pages to a Web server.


April 01, 1996
URL:http://www.drdobbs.com/web-development/html-conversion-and-ftp-automation/184409866

Figure 2

Figure 2

APR96: HTML Conversion and FTP Automation

HTML Conversion and FTP Automation

Automating the conversion process while empowering users

Lauren Hightower

Lauren is a general partner of the Calico Company in Tallahassee, FL. Her company specializes in client/server and Internet development. She can be contacted at [email protected].


With the growing popularity of the Internet, information systems (IS) departments are finding themselves in the odd position of maintaining and managing unfamiliar data on their companies' World Wide Web sites. But it doesn't really make sense for IS departments to interpret data developed and maintained by other departments just so it can be translated to a different format for distribution. This is equivalent to having IS print reports for other departments, rather than giving those departments the printer and software to do it themselves. The people responsible for the data should prepare it for any type of distribution--paper or electronic. After all, IS is in the business of managing information, not creating and maintaining it.

The HTML tags used in World Wide Web documents are not designed to be viewed in a WYSIWYG environment. Instead, HTML codes are applied to an ASCII document and interpreted by a browser when it displays the document. This open architecture allows an enormous amount of flexibility and ensures that any browser capable of parsing and interpreting HTML tags can display any Web page that adheres to the HTML standard.

Unfortunately, HTML is nonsense to many people accustomed to working with slick, user-friendly business applications. Until software developers create HTML filters for business software (as they have done for most word processors), IS is faced with maintaining the information or training end users to mark up and update their own HTML documents.

What HTML Automator Does

Overwhelmed with the volume of information my company wanted to deliver across the Web, I developed a program called "HTML Automator" to automate the process of converting data stored in spreadsheets and databases to HTML format; see Figure 1. I also added an FTP module to allow users to replace existing pages on our Web server without IS intervention. As a result, HTML Automator eliminates unnecessary work for IS, automates a task that might otherwise require a long training process, and ensures a standard on all HTML documents churned out by departments within the company.

The design of HTML Automator contains two major components--file conversion and file transfer. The file conversion module has two requirements:

The FTP module has four requirements:

After researching available tools, I settled on IST's OpenExchange library to handle the file-conversion process and Distinct Corp.'s TCP/IP Winsock library and SDK to transfer the resulting HTML file to the Web server. Distinct's tool offers excellent documentation to the Winsock APIs and a reliable, high-speed dial-up or direct-connection TCP/IP stack. Because both tools are DLLs, they can be called from any development tool that supports calls to external libraries, including C++, PowerBuilder, and Visual Basic. I chose Delphi as my development tool because of the speed and deployment advantages of its compiled executables and its underlying object-oriented language, Object Pascal.

HTML Automator uses configuration files to track the important pieces of information it needs to create HTML files. Once you create a configuration file, you can open it from the command line and perform the conversion and transfer without user intervention. Table 1 shows the contents of a configuration file.

IST's OpenExchange DLL supports Lotus, Excel, Quattro Pro, Symphony, dBASE, and Paradox file formats. Knowing that most word processors have built-in HTML filters and most dynamic data comes from spreadsheet and database applications (not word processing documents), I chose to focus my filter on spreadsheet and database files. In a typical spreadsheet, there is a title or a row of column headings on the first row, a column of row numbers or identifiers in the first column, and data in the remaining columns and rows. Similarly, databases have a set of field names that serve as column headings and data stored in the rows. Using this as the basis for my layout scheme, I developed a filter that converts either a database or spreadsheet file to an HTML marked-up document using the <PRE></PRE> tags to properly format the information in evenly spaced columns.

To enforce a company-wide standard on all our Web pages, I developed standard header and footer files. The header file determines the graphic at the top of the page, the background color for the body, and a place holder for the title; see Example 1(a). The footer file, see Example 1(b), contains the markup for a line separating the body of the document from the standard .GIF file that allows a user to move to different pages in our Web site.

HTML Automator uses header and footer files on the network. If the standard header and footer files change, HTML Automator incorporates the changes in every Web page it produces.

As Figure 2 illustrates, HTML Automator generates the output file by knitting together the header, body, and footer files to create one HTML document. The conversion component of HTML Automator generates the body of the document.

HTML Automator incorporates the standard header and footer files at the top and bottom of each HTML file it generates. The result is a Web page consistent with all of the other pages on our site.

HTML Automator lets me define a setup for a particular conversion, then save it in a configuration file. I can develop as many configuration files as needed for any number and type of input files. I can pass the configuration filename on the command line for an automatic conversion, or open the configuration file from inside HTML Automator.

Using the command-line option, I can use a timer application to update files on the server on a regular basis without user intervention. I can also run the application from a remote machine with access to a shared directory that contains the data file across a peer-to-peer network such as Windows for Workgroups. If the volume of conversion is high enough, I can dedicate one machine to convert and transfer files from multiple shared directories to the server on a regular basis.

The Conversion Module

OpenExchange comes with C, Delphi, and Visual Basic header files. I added the Delphi header file to my project and openex to the uses clause of the implementation section of my data in the input file.

The procedure convert_file (available electronically; see "Availability," page 3), uses oe_OpenInputFile to open the input file, oe_ReadDict to read the structure of the file, and oe_GetNCases to determine the number of rows. A call to oe_SelectAllVars returns the number of columns in the spreadsheet or database. Alternately, you could use oe_SetVarSelect to select individual columns or fields. Calling oe_InitReadOnly performs initialization on the input file. OpenExchange has several high-level functions designed to transfer data from one supported type to another, for example from Paradox to Excel. These functions automatically initialize the input file. If you handle the conversion and output yourself, you must call oe_InitReadOnly to force the initialization because the high-level functions are not being used. oe_ReadRecord performs the bulk of the work, reading in an entire record. Once read, the oe_CopyVarDataStr function copies individual columns in the record to string variables. Because I am interested in formatting the text in fixed-width columns, I use oe_GetVarPrintWidth to find the physical width of a database field or the print width of a spreadsheet column. I add three spaces to each width to allow some white space between columns, then pad each field I read with the appropriate number of spaces. To tidy up, I call oe_CloseFiles to close the input file.

I use the <PRE> tag at the beginning of the data file to force a Web browser to display the data in a fixed-width font. At the end of the data file, I use the </PRE> tag to end fixed-width formatting. The result is a table-like appearance for the data in the input file (see Figure 3).

To incorporate the standard header and footer files, I use the Pascal readln and writeln functions to read the information from the header file and write it to the output file before I convert the data. Similarly, I read information from the footer file and write it to the output file after the data conversion.

The Transfer Module

Once the output file is complete, the FTP process moves it to the server where it replaces the "live" version of the file. I always encourage users to view the output file as a local HTML file on their PC before they transfer it to the server. Using the URL file:///C|/EXCEL/WORK/ RATES .HTM, users can open a local file in their browser the same way they would any other URL.

The FTP process is handled through calls to WINSOCK.DLL. I used Distinct's TCP/IP SDK and Windows socket to implement my file transfer. The Distinct FTP API documentation provides in-depth instructions for using and calling standard FTP services in the WINSOCK.DLL. Assuming users are physically connected (or have established a SLIP or PPP connection) to the TCP/IP network and are set up to communicate across that connection, they should be able to open an FTP session and transfer a file to any directory on the server to which they have access.

To establish a connection to the server, I call the ftp_open function. Once the connection is made, I call ftp_login to pass the login name and password. The trickiest and most important part of the transfer uses the ftp_put function. ftp_put transfers the file from the local drive to the server. ftp_put requires an additional function, sendfilecontents, to read data from the input file and move bytes of information into a buffer. The sendfilecontents function returns the number of bytes in the buffer while there is still information in the input file, and 0 when it is finished reading the file. The ftp_put uses the results of sendfilecontents to transfer the correct number of bytes from the buffer to the server.

To use my function, ftp_put must know where to find it. The last parameter of the ftp_put function is a pointer to my function, sendfilecontents. I obtain the pointer by calling MakeProcInstance, specifying the exported function as the first parameter (see Listing One). When the transfer is complete, I call ftp_close to close the connection.

Enhancements

HTML Automator was designed for a very specific purpose. A few enhancements can open the door to a host of other useful possibilities. For starters, in its current design, HTML Automator reads an entire spreadsheet or database. OpenExchange has the capacity to narrow its focus to a particular range of fields or cells. Users accustomed to keeping several sources of data in one spreadsheet can specify particular ranges in different configuration files to create several HTML files from one spreadsheet.

For the most part, my HTML Automator ignores security issues. Our Web server runs on the same box as our FTP server, making it easy to transfer files directly to the appropriate directories on the Web server. If your organization requires a greater degree of security, you might devise a system to place the files in a temporary directory where they can be reviewed before they go "live."

In the current implementation of HTML Automator, there is no way to add text to an HTML file, other than what is stored in the header, data, and footer files. Adding a memo box might help users write their own HTML to include before or after the data, as introductory or summary text.

For aesthetic reasons, you might prefer to write your filter to produce HTML files that incorporate the <TABLE></TABLE> tags. Tables give your page a professional look that doesn't come through with the traditional <PRE></PRE> tags. Unfortunately, tables (defined in HTML 3.0) are a new and evolving standard on the Web. Some browsers don't support them yet and your filter might change as the standard develops.

HTML Automator assumes users are connected physically to a TCP/IP network or already have established a connection through a SLIP or PPP connection. Additional calls to functions in WINSOCK.DLL can check for a TCP/IP connection before the FTP process and initiate the call if it doesn't find one. This is especially useful if a significant number of users are remote.

If your organization maintains several Web servers, it would be prudent to have a list of FTP servers and the corresponding login names and passwords stored so that users can choose a server from a list rather than type in a cryptic IP address or domain name.

For More Information

IST

OpenExchange DLL 1.0

P.O. Box 3774

Joplin, MO 64803

417-781-3282

[email protected]

Distinct Corp.

Distinct TCP/IP SDK

12901 Saratoga Ave.

P.O. Box 3410

Saratoga, CA 95070

408-366-8933

[email protected]

Figure 1: Main HTML Automator screen.

Figure 2: HTML Automator conversion process.

Figure 3: Sample output.

<PRE>
<EM>TYPE OF LOAN</EM> <EM>PERCENTAGE</EM>
2 Year Fixed Loan         4.50
3 Year Fixed Loan         5.50
4 Year Fixed Loan         6.00
5 Year Fixed Loan         7.00
6 Year Fixed Loan         8.25
7 Year Fixed Loan         9.00
8 Year Fixed Loan        10.25
9 Year Fixed Loan        10.25
10 Year Fixed Loan       10.75
11 Year Fixed Loan       11.00
12 Year Fixed Loan       11.25
13 Year Fixed Loan       11.50
14 Year Fixed Loan       12.10
15 Year Fixed Loan       12.25
</PRE>

Example 1: (a) Standard header file; (b) footer file.

(a)

<HTML>

<HEAD>
<TITLE></TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<P>
<CENTER><IMG SRC=logo.gif ALIGN=MIDDLE><CENTER>
<HR>

(b)

<HR>
<CENTER><IMG SRC=banner.gif ALIGN=MIDDLE></CENTER>
</BODY>
</HTML>

Table 1: HTML Automator configuration-file contents.

Information            Purpose

Input Filename         Name of the file to be converted.
Output Filename        Name of the resulting HTML file.
Type of File           Type of input file that OpenExchange supports.
FTP Server             Name or IP address of the FTP server.
Login Name             Login Name for the user on the FTP server.
Password               Password for the user on the FTP server.
Server Directory       Directory where the FTP module will place the output file.
HTML Header            Name of the HTML file to be used as the header on the output file.
HTML Footer            Name of the HTML file to be used as the footer on the output file HTML.
Title                  Title for the resulting output file.

Listing One

   .
   .
   .

{Check to see if you want to automatically FTP the output file}
     if CheckBox_FTPAutomate.State=cbchecked then
        FTP_file;
end;
procedure THTML_AUTOMATE.Save1Click(Sender: TObject);
var s_filename : string;
begin
     if SaveDialog_CFG.Execute then
      begin
        s_filename := SaveDialog_CFG.Filename;
        save_file(s_filename);
      end;
end;
procedure Open_File(s_filename : string);
var AutoConfigFile : Tinifile;
begin
AutoConfigFile := TIniFile.Create(s_Filename);
 with AutoConfigFile do
  begin
    HTML_Automate.Edit_Server.text :=
          ReadString('AUTOHTML Configuration File', 'FTPServer', '');
    HTML_Automate.ComboBox_FileType.text :=
          ReadString('AUTOHTML Configuration File', 'FileType', '');
    HTML_Automate.Edit_LoginName.text :=
          ReadString('AUTOHTML Configuration File', 'FTPLoginName', '');
    HTML_Automate.Edit_Password.text :=
          ReadString('AUTOHTML Configuration File', 'FTPPassword', '');
    HTML_Automate.Edit_ServerDirectory.text :=
          ReadString('AUTOHTML Configuration File', 'FTPDirectory', '');
    HTML_Automate.Edit_HTMLHeader.text :=
          ReadString('AUTOHTML Configuration File', 'HTMLHeader', '');
    HTML_Automate.Edit_HTMLFooter.text :=
          ReadString('AUTOHTML Configuration File', 'HTMLFooter', '');
    HTML_Automate.Edit_Input.text :=
          ReadString('AUTOHTML Configuration File', 'InputFile', '');
    HTML_Automate.Edit_Output.text :=
          ReadString('AUTOHTML Configuration File', 'OutputFile', '');
    HTML_Automate.Edit_HTMLTitle.text :=
          ReadString('AUTOHTML Configuration File', 'HTMLTitle', '');
    If ReadString('AUTOHTML Configuration File','FTPAutomate','No')<>'No' then
         HTML_Automate.CheckBox_FTPAutomate.State := cbchecked
    else
         HTML_Automate.CheckBox_FTPAutomate.State := cbunchecked;
  end;
AutoConfigFile.Free;
end;
function sendfilecontents(ftphandle : integer; buf : pChar; len : integer) : 
                                                               Integer; EXPORT;
var read_line : string;
begin
         if eof(send_file) then
            sendfilecontents:= 0
         else
         begin
            readln(send_File, read_line);
            HTML_Automate.Label_Status.caption := read_line;
            read_line := read_line + Chr(13) + Chr(10);
            StrPCopy(buf, read_line);
            sendfilecontents := length(read_line);
         end;
end;
procedure FTP_File;
var host : array[0..80] of Char;
    user_name : array[0..80] of char;
    user_pass : array[0..80] of char;
    file_transfer : array[0..80] of char;
    ftphandle, i : integer;
    send_file_proc : pointer;
    ftpfile : string;
begin
     {Open a connection to FTP Server}
     StrPCopy(host, HTML_Automate.Edit_Server.Text);
     HTML_Automate.Label_Status.caption := 'Connecting to host ' + 
                                                HTML_Automate.Edit_Server.Text;
     ftphandle:=ftp_open(Application.Handle,host);
     if ftphandle <> 0 then
        HTML_Automate.Label_Status.caption := 'Successful connection with handle #' + inttostr(ftphandle)
     else
     begin
        HTML_Automate.Label_Status.caption := 'Can not connect';
        exit;
     end;
     {Login to the server, pass user name and password}
     StrPCopy(user_name, HTML_Automate.Edit_LoginName.Text);
     StrPCopy(user_pass, HTML_Automate.Edit_Password.Text);
     HTML_Automate.Label_Status.caption := 'Logging in...';
     i:=ftp_login(ftphandle, user_name, user_pass);
     if i = FTP_OK then
        HTML_Automate.Label_Status.caption := 'Successful login'
     else
     begin
        HTML_Automate.Label_Status.caption := 'Can not login';
        ftp_close(ftphandle);
        exit;
     end;
     {change file type to ASCII}
     i:=ftp_type(ftphandle, TYPE_ASCII);
     if i = FTP_OK then
        HTML_Automate.Label_Status.caption := 'Changed type to ASCII'
     else
     begin
        HTML_Automate.Label_Status.caption := 'Can not change type to ASCII';
        ftp_close(ftphandle);
        exit;
     end;
     {get the address of the sendfilecontents function}
     send_file_proc := MakeProcInstance(@sendfilecontents, HInstance);
     ftpfile:=HTML_Automate.Edit_Output.Text;
     {parse the output file to get only the filename}
     while Pos('\', ftpfile) > 0 do
         ftpfile := Copy(ftpfile, Pos('\', ftpfile) + 1, length(ftpfile));
     {Transfer file}
     ftpfile:= HTML_Automate.Edit_ServerDirectory.Text + '\' + ftpfile;
     StrPCopy(file_transfer, ftpfile);
     HTML_Automate.Label_Status.caption := 'Sending file ' + 
                                                HTML_Automate.Edit_Output.Text;
     Assign(send_file, HTML_Automate.Edit_Output.Text);
     Reset(send_file);
     i := ftp_put(ftphandle, FALSE, file_transfer, Send_File_Proc);
     if i = FTP_OK then
        HTML_Automate.Label_Status.caption := 'File sent successfully'
     else
        HTML_Automate.Label_Status.caption := 'Error sending file ' + 
                                                                   inttostr(i);
     Close(send_file);
     {close FTP session}
     ftp_close(ftphandle);
end;
procedure Save_File(s_filename : string);
var F : Text;
begin
     assign(F, s_filename);
     rewrite(F);
     writeln(F, '[AUTOHTML Configuration File]');
     writeln(F, 'InputFile=' + HTML_Automate.Edit_Input.Text);
     writeln(F, 'FileType=' + HTML_Automate.ComboBox_FileType.Text);
     writeln(F, 'OutputFile=' + HTML_Automate.Edit_Output.Text);
     writeln(F, 'FTPServer=' + HTML_Automate.Edit_Server.Text);
     writeln(F, 'FTPLoginName=' + HTML_Automate.Edit_LoginName.Text);
     writeln(F, 'FTPPassword=' + HTML_Automate.Edit_Password.Text);
     writeln(F, 'FTPDirectory=' + HTML_Automate.Edit_ServerDirectory.Text);
     writeln(F, 'HTMLHeader=' + HTML_Automate.Edit_HTMLHeader.Text);
     writeln(F, 'HTMLFooter=' + HTML_Automate.Edit_HTMLFooter.Text);
     writeln(F, 'HTMLTitle=' + HTML_Automate.Edit_HTMLTitle.Text);
     if HTML_Automate.CheckBox_FTPAutomate.state = cbchecked then
        writeln(F, 'FTPAutomate=Yes')
     else
        writeln(F, 'FTPAutomate=No');
     close(F);
     HTML_Automate.Caption := 'HTML Automator ' + '[' + s_filename + ']';
end;
procedure THTML_AUTOMATE.OutputFile1Click(Sender: TObject);
var F : system.text;
    S : string;
begin
f_view.Memo_HTM.Clear;
If FileExists(Edit_Output.Text) then
begin
System.Assign(F, Edit_Output.Text);
Reset(F);
  repeat
    readln(F, S);
    f_view.Memo_HTM.Lines.Add(S);
  until eof(F);
system.close(F);
f_view.ShowModal;
end;
end;
procedure THTML_AUTOMATE.Button2Click(Sender: TObject);
begin
     FTP_file;
end;
procedure THTML_AUTOMATE.FTP1Click(Sender: TObject);
begin
     FTP_file;
end;
procedure THTML_AUTOMATE.FormCreate(Sender: TObject);
begin
     Open_File(paramstr(1));
     HTML_Automate.Caption := 'HTML Automator ' + '[' + paramstr(1) + ']';
end;
procedure THTML_AUTOMATE.Panel1Click(Sender: TObject);
begin
end;
end.

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.