Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Database

Java in the Database? Objects in the Middle Tier?


Dr. Dobb's Sourcebook September/October 1997: Database Developer

Ken is a consultant and software developer who also conducts seminars. This column includes excerpts from his forthcoming book, Database Magic with Java (Prentice-Hall). You can contact Ken at http://www.knorth.com/.


Database developers have numerous options for laying out the architecture of an application, so many that we often face a classic decision about where to put logic. Sometimes the best solution to a major problem is to apply a small bit of discrete logic at the right place and time. During the Apollo program, the assignment of improving task termination and error recovery for NASA's Goddard Real-Time System (GRTS) fell into my lap. GRTS was part of the network that tracked lunar spacecraft. The software ran on IBM mainframes as mission-critical, real-time extensions to IBM's OS/360 operating system. GRTS used a main task/subtask architecture and executed in an unprotected address space. My challenge was to keep problems such as disk I/O errors during subtask loading from bringing down the system. No one wanted to say "Houston, patch around us," or worse yet, "Apollo, we have a problem!"

The error-handling solution involved the installation of a transient SVC (Supervisor Call), a discrete bit of logic in a small module that the operating system loaded when needed. Today, operating systems such as UNIX and Windows NT routinely use a similar technique in the form of loadable device drivers. What was once true about operating systems is now true about database software -- adding extensions can solve problems or supplement core functionality. Many early DBMS products were closed systems that supported only a query language interface. There was neither a programming language interface nor the capability to add a discrete bit of logic to the DBMS. Today's database products are moving from closed to extensible architectures.

The next-generation architecture from major database vendors has been the subject of much debate. Vendors are producing object-relational products by merging objects with SQL servers and adding SQL to object databases. The essence of the debate is where to install logic and what data types belong in what server. Informix, IBM, and Oracle took the course of developing universal database servers. Instead of developing a universal server, Microsoft and Sybase moved to a component-based, distributed server architecture. The universal database strategy is to use a single, extensible server to support all types of information. A single server delivers text, spatial data, multimedia, tabular data, images, and other content. Because the server architecture is open-ended, developers can write plug-ins to add new types and functionality. Following this model, IBM provides DB2 Extenders, Informix has DataBlades, and Oracle supports Data Cartridges. Microsoft and Sybase are following a different model, one which says data access should be distributed across multiple servers supporting heterogeneous data stores. The servers are optimized to store specific types of data and they use components to encapsulate data and logic. With either model, database developers still face decisions about where and how to install rules and logic.

The opening up of the database server architecture occurs at a time when we've also seen the emergence of distributed computing, Java, and component-based development. All of these influences mean database developers have more options than ever for deploying discrete logic in the right place. These options include "in the database" solutions such as rules, constraints, procedures, triggers, and functions. Other options include remote procedure calls (RPC), remote methods (Java RMI), embedded tags and SQL in HTML scripts, component-based programming, and various forms of middleware. Programming with components can involve simply the setting of properties with little coding, or it may involve writing methods to respond to events. Component technologies and remote object middleware include ActiveX, COM and DCOM, CORBA and IIOP, Java and JavaBeans.

Components are a solution for client and server programming. You should also consider components as an "in the database" option. For example, SQL scripts can call user-defined functions that invoke OLE Automation methods.

Moving to a Multitier Architecture

During the central computing era, too often the solution for implementing rules and algorithms was embedding them in monolithic programs. Modular programming and components emerged as solutions for "dividing a big problem into a bunch of small problems." The first generation of SQL client-server database products introduced a two-tier architecture that permitted developers to move some logic and rules out of the client tier and into the database. The clients execute remote procedure calls (RPCs) to invoke logic at the SQL server. The two-tier client-server architecture works well for work group and departmental applications but it is challenged to deliver performance when you try to scale up to a large number of users.

One solution for creating large-scale applications is to partition logic across more servers by adding middle tiers between the client and database server. Some developers create multitier applications by using RPCs to communicate from clients to application servers that implement rules or other logic. An advantage with this approach is that you can use a heterogeneous mix of machines and use more powerful machines for application servers. The RPC software handles the marshaling of data (making data from a remote machine appear to be of the same word size and orientation as the client). Operating systems such as Solaris and Windows NT include built-in RPCs, and NobleNet offers an SDK for doing RPCs with heterogeneous systems. RPCs permit you to add horsepower, distribute logic to multiple computers, and program with a variety of languages. If you are writing Java, you can use Remote Method Invocation (RMI), which is analogous to using RPCs. RMI permits Java programs to invoke Java methods on remote machines.

Synchronous RPCs can be a bottleneck for applications when a connection is broken or overloaded. To avoid those problems, developers use asynchronous RPCs and message-oriented middleware with message queues to support "fire and forget" processing. IBM MQSeries, Digital's DECmessageQ, and Microsoft "Falcon" are software for building applications that require messaging or message queues. Sybase dbQ is messaging middleware, written in Java, that uses database queues to permit distributed programs to share data.

Another solution is to use remote object technologies such as the Common Object Request Broker Architecture (CORBA) and the Distributed Component Object Model (DCOM). CORBA and DCOM are object brokers that operate over network protocols and RPCs to provide easy access to components on remote machines. Developers doing Internet and intranet applications can use the Internet Inter-ORB Protocol (IIOP) to access objects at remote web sites. JavaSoft recently announced IIOP will be the underlying protocol for implementing Java's RMI API. DCOM works today with NT's synchronous RPCs but it will eventually operate with Falcon's asynchronous RPCs and message queues.

Transaction Processing Software

Large-scale transaction processing applications can involve multiple databases on distributed servers. This can raise questions of "Who's in charge?" and "Where do I put transaction management logic?" For years, the answer to those questions involved TP monitors such IBM CICS, NCR Top End, BEA Tuxedo, and Transarc Encina. TP monitors for UNIX, VMS, and mainframe systems have been in use for decades, including some very large applications with databases on dozens of machines.

Transaction servers such as Sybase Jaguar CTS and Microsoft Transaction Server (MTS) are a new class of product that marries component technology with the functionality of TP monitors. Transaction servers permit you to follow a single-user model while developing transaction processing software as components. To scale up, you package the components and install them in the transaction server environment. The transaction server provides context and handles thread management, locking, and connection pooling. Developers writing for MTS create COM objects, or Java classes that map to COM objects. Jaguar CTS supports several types of components including ActiveX, JavaBeans, and CORBA IDL.

In the Database

Because of intense competition, the SQL marketplace is a buyer's market with vendors continually adding capabilities and features. SQL products are increasingly capable of supporting rules and other logic in the database.

Two of the primary reasons for using SQL client-server databases are version control and centralization of logic. Moving logic away from the client into the server makes it easier to maintain. Changing logic requires a change in only one location -- the server -- as opposed to updating dozens, hundreds, or thousands of clients. Because SQL implements a security model, putting rules in the database restricts access to those who are authorized to create and change rules. Consider some of the SQL techniques for implementing rules or other logic. Constraints are rules enforced by the database such as limiting data to a specific range of values. Constraints are a declarative solution for enforcing rules, as opposed to a programmatic solution. SQL-92 defines column and table constraints including UNIQUE, NOT NULL, and CHECK constraints. It also defines assertions, primary key, foreign key, and referential integrity constraints.

Triggers are event alerters because they fire when a specific event occurs. You typically use triggers to declare an action to perform when SQL statements execute. Stored procedures are compiled, optimized SQL statements that are stored in the database at the server. The SQL standard refers to stored procedures as persistent stored modules (PSM), although writing a procedure often involves a CREATE PROCEDURE statement. Functions are logic that can be invoked by a SQL statement to process an argument or arguments and return a result. Most SQL dialects include functions to perform scalar operations, such as concatenating strings or calculating a square root. Some database products supplement built-in functions with a capability to install user-defined functions or UDFs. I'm going to explore scalar function and user-defined functions in more detail as I examine phonetic searching.

VB, C++, or Java in the Database

Extensions are plug-in components that extend a server's capability such as supporting a user-defined data type or a new scalar function. Some SQL products permit you to extend the server by adding modules written in C, C++, Visual Basic, or Java. To extend the server using Java in the database, the server must also have a Java Virtual Machine.

Phonetic Text Retrieval

To look at alternative solutions for implementing a bit of discrete logic in a database application, we'll consider a specific example related to phonetic searching. Some database applications require a form of phonetic text retrieval where it is possible to query for a name or other text string that sounds like a search argument. Many SQL database management systems provide a SOUNDEX scalar function that implements the Russell Soundex algorithm. Although widely used, the algorithm is not always the optimal solution for phonetic encoding. Soundex uses the first letter of the text string, discards vowels, and maps groups of consonants to digits. It produces an encoded value that is a letter and three numbers. One of the problems with Soundex is that the encoded value is not intuitive. Philips' Metaphone algorithm encodes to text strings and produces a result that is easier to use. For example, "HSMN" is more recognizable as Houseman than three numbers following an H. Because Soundex encodes to four characters, it can be limiting if you are encoding chemical names, substance names, or other long words. Philips' original Metaphone encoded to four characters or less, but the version presented here lets you encode to a longer string.

Metaphone: How and Where?

Components, remote methods, SQL, Java classes and other solutions are available for providing your users with a Metaphone search capability. Deciding on a solution should not simply be a matter of using the most familiar tool or language, in part because the language might prejudice your decision about architecture and location. In this context, location means the best place to implement the logic: client, server, or database. The alternatives for implementing Metaphone include a downloadable component, Java applet, DLL, method or procedure in a middle-tier server, remote object, and installing Metaphone at a SQL server.

There are a few points to consider if you decide to implement Metaphone in your environment. One is familiarity. Because many SQLs implement SOUNDEX, users are familiar with a query such as:

SELECT SOUNDEX(surname) FROM emp WHERE SURNAME = "Johannson".

SOUNDEX is recognized by multidatabase APIs. The ODBC SQLGetInfo function reports whether a DBMS and ODBC driver support SOUNDEX. (Note that in the previous sentence, the word "function" refers to an ODBC API function, not a SQL scalar function.)

If possible, extend your server to keep SQL users working within a familiar framework for phonetic searching. Installing Metaphone as a user-defined function permits users to invoke it in the same manner as SOUNDEX:

SELECT METAFON4(surname) FROM emp WHERE SURNAME = "Johannson".

Efficiency is another consideration. Executing a SQL scalar function for every requirement to do a phonetic match isn't the most efficient technique. Often, it is best to encode strings such as surnames when you add records to the database. If you use a separate column for the encoded value, you can use it to do keyed searches. Creating an index on the column holding the encoded value is frequently preferable to repetitively executing the phonetic search function.

Outside the Database

Putting rules and logic in the database has distinct advantages, including better security and version control. However, moving logic into the database means that a program must be operating on a SQL database to invoke the logic. For example, if you install Metaphone only as a SQL user-defined function, then it isn't available to scripts and programs that aren't using a SQL database. Supporting the phonetic search logic outside the database means that it will be available to programs and scripts that can use ActiveX components, CORBA objects, JavaBeans, applets, or remote methods.

Implementing Metaphone

Listing One presents Metaphone as a simple application, before it is tweaked to fit specific requirements. As a simple application, the program prompts for a text string to encode, then displays the encoded value. This version returns a four-character value. You can easily modify it to return a shorter or longer Metaphone value. The code that does the encoding, given an input value, is logic that is suitable for repackaging in other languages and other forms, such as components or user-defined functions in a database. Space does not permit us to cover the details of those implementations here, so I'll revisit Metaphone next time.

DDJ

Listing One

/***************************************************************************** FILE NAME:  metphone.java     TITLE: Metaphone phonetic text encoding
* AUTHOR:  K. E. North II, Ken North Computing
*    from Database Magic with Java (Prentice-Hall PTR, 1997)
****************************************************************************
* Copyright (c) Kendall E. North II, 1997. All rights reserved. Reproduction
* or translation of this work beyond that permitted in Section 117 of the
* United States Copyright Act without express written permission of the
* copyright owner is unlawful. The purchaser may make backup copies for
* his/her own use only and not for distribution or resale. The Author and
* Publisher assume no responsibility errors, omissions or damages caused by
* the use of these programs or from use of the information contained herein.
****************************************************************************
*   Synopsis:
*   Java application executed by Java Virtual machine. Executes a method that
*   implements a modified Philips Metaphone algorithm. To encode phonetically,
*   this version checks the character following "sch" to decide whether
*   to use a hard or soft consonant. 
*   Encodes input strings to four characters. To generate a longer phonetic
*   string, change encodeLimit. To run, use this command line:
*           java -classpath <your path> MetaphoneEncoder
*           Type in a last name or other string to encode.
****************************************************************************/


</p>
import java.io.DataInputStream;


</p>
class PhoneticString {


</p>
    String          inputString;
    StringBuffer    mString;


</p>
    PhoneticString (String  inString)  {
        inputString =  new String(inString);
        }
    public StringBuffer encodedString()    {
        return (Metafon4(inputString));
        }       
    public static boolean isVowel (char charToTest)    {
        if (charToTest == 'A') return true;
        if (charToTest == 'E') return true;
        if (charToTest == 'I') return true;
        if (charToTest == 'O') return true;
        if (charToTest == 'U') return true;
        return false;
        }       
    public StringBuffer Metafon4(String InputString)    {
        StringBuffer sb = new StringBuffer("");
        String  EncodeMe    = new String(InputString.toUpperCase());


</p>
        int encodeLimit = 4;
        int index;
        int lastidx = 0;
        int offset = 0;
        int length = EncodeMe.length();
        
        lastidx = length - 1;
        for (index = 0; offset < encodeLimit; index++)
            {
            if (index > lastidx) 
                break;


</p>
            if (index > 0) {
                if (EncodeMe.charAt(index) == EncodeMe.charAt(index-1))
                    continue;
                }
            if (EncodeMe.charAt(index) == 'A' ) {
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'E' ) {
                        sb.insert(offset, 'E');                    
                        offset++;
                        index += 1;
                        continue;
                        }
                    }
                if (index == 0) {
                    sb.insert(offset, EncodeMe.charAt(index));
                    offset++;
                    }
                continue;
                }
            if (EncodeMe.charAt(index) == 'B' ) {
                if (index < lastidx) {
                    sb.insert(offset, EncodeMe.charAt(index));
                    offset++;
                    continue;
                    }
                else {
                    if (EncodeMe.charAt(index-1) != 'M' ) {
                        sb.insert(offset, EncodeMe.charAt(index));
                        offset++;
                        continue;
                        }
                    }
                }
            if (EncodeMe.charAt(index) == 'C' ) {
                if (index > 0) {
                    if (EncodeMe.regionMatches(index-1, "SCI", 0, 3)) {
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.regionMatches(index-1, "SCE", 0, 3)) {
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.regionMatches(index-1, "SCY", 0, 3)) { 
                        index += 1;
                        continue;
                        }
                    }
                if (index+1 < lastidx) {
                    if (EncodeMe.regionMatches(index+1, "IA", 0, 2)) {
                        sb.insert(offset, 'X');
                        offset++;
                        index += 1;
                        continue;
                        }
                    }
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'H') {
                        sb.insert(offset, 'X');
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.charAt(index+1) == 'E') {
                        sb.insert(offset, 'S');
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.charAt(index+1) == 'I') {
                        sb.insert(offset, 'S');
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.charAt(index+1) == 'Y') {
                        sb.insert(offset, 'S');
                        offset++;
                        index += 1;
                        continue;
                        }
                sb.insert(offset, 'K');
                offset++;
                continue;
                }


</p>
            if (EncodeMe.charAt(index) == 'D' ) {
                if (index+1 < lastidx) {
                    if (EncodeMe.regionMatches(index+1, "GE", 0, 2)) {
                        sb.insert(offset, 'J');
                        offset++;
                        index += 2;
                        continue;
                        }
                    if (EncodeMe.regionMatches(index+1, "GI", 0, 2)) {
                        sb.insert(offset, 'J');
                        offset++;
                        index += 2;
                        continue;
                        }
                    if (EncodeMe.regionMatches(index+1, "GY", 0, 2)) {
                        sb.insert(offset, 'J');
                        offset++;
                        index += 2;
                        continue;
                        }
                    }
                sb.insert(offset, 'T');
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'E' ) {
                if (index == 0) {
                    sb.insert(offset, EncodeMe.charAt(index));
                    offset++;
                    }
                continue;
                }
            if (EncodeMe.charAt(index) == 'F' ) {
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'G' ) {
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'H') {
                        if (index+1 < lastidx) {
                            if (isVowel(EncodeMe.charAt(index+2))) {
                                sb.insert(offset, 'K');
                                index += 1;
                                offset++;
                                continue;
                                }
                            index += 1;
                            continue;
                            }
                        }
                    if (EncodeMe.charAt(index+1) == 'N') {
                        sb.insert(offset, 'N');
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (index > 0) {
                        if (EncodeMe.charAt(index-1) == 'G') {
                            sb.insert(offset, 'K');
                            offset++;
                            continue;
                            }                            
                        }
                    if (EncodeMe.charAt(index+1) == 'I') {
                        sb.insert(offset, 'J');
                        offset++;
                        continue;
                        }
                    if (EncodeMe.charAt(index+1) == 'E') {
                        sb.insert(offset, 'J');
                        offset++;
                        continue;
                        }
                    if (EncodeMe.charAt(index+1) == 'Y') {
                        sb.insert(offset, 'J');
                        offset++;
                        continue;
                        }
                    if (EncodeMe.regionMatches(index-1, "DGE", 0, 3))
                        continue;
                    if (EncodeMe.regionMatches(index-1, "DGI", 0, 3)) 
                        continue;
                    if (EncodeMe.regionMatches(index-1, "DGY", 0, 3)) 
                        continue;
                    }
                sb.insert(offset, 'K');
                offset++;
                continue;
                }


</p>
            if (EncodeMe.charAt(index) == 'H' ) {
                if (index > 0) {
                    if (isVowel(EncodeMe.charAt(index-1))) {
                        if (index < lastidx) {
                            if (isVowel(EncodeMe.charAt(index+1))) {
                                sb.insert(offset, EncodeMe.charAt(index));
                                offset++;
                                continue;
                                }
                            else continue;
                            }
                        else continue;
                        }
                    }
                else {
                    if (!(isVowel(EncodeMe.charAt(index+1))))
                        continue;
                    }
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'I' ) {
                if (index == 0) {
                    sb.insert(offset, EncodeMe.charAt(index));
                    offset++;
                    }
                continue;
                }
            if (EncodeMe.charAt(index) == 'J') {
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'K' ) {
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'N') {
                        sb.insert(offset, 'N');
                        offset++;
                        index += 1;
                        continue;
                        }
                    }
                if (index > 0) {
                    if (EncodeMe.charAt(index-1) == 'C') 
                        continue;
                    }
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'L') {
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'M') {
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'N') {
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'O' ) {
                if (index == 0) {
                    sb.insert(offset, EncodeMe.charAt(index));
                    offset++;
                    }
                continue;
                }
            if (EncodeMe.charAt(index) == 'P' ) {
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'H') {
                        sb.insert(offset, 'F');
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.charAt(index+1) == 'N') {
                        sb.insert(offset, 'N');
                        offset++;
                        index += 1;
                        continue;
                        }
                    }
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'Q') {
                sb.insert(offset, 'K');
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'R') {
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'S' ) {
                if (index+2 < lastidx) {
                    if (EncodeMe.regionMatches(index, "SCHE", 0, 4)) {
                        sb.append("SK");
                        offset += 2;
                        index += 2;
                          System.out.print (" Encoded String = "+ sb);
                          System.out.print(" \n");
                        continue;
                        }
                    if (EncodeMe.regionMatches(index, "SCHI", 0, 4)) {
                        sb.append("SK");
                        offset += 2;
                        index += 2;
                          System.out.print (" Encoded String = "+ sb);
                          System.out.print(" \n");
                        continue;
                        }
                    if (EncodeMe.regionMatches(index, "SCHO", 0, 4)) {
                        sb.append("SK");
                        offset += 2;
                        index += 2;
                          System.out.print (" Encoded String = "+ sb);
                          System.out.print(" \n");
                        continue;
                        }
                    }
                if (index+1 < lastidx) {
                    if (EncodeMe.regionMatches(index, "SCH", 0, 3)) {
                        sb.insert(offset, 'X');
                        offset++;
                        index += 2;
                        continue;
                        }
                    if (EncodeMe.regionMatches(index+1, "IA", 0, 2)) {
                        sb.insert(offset, 'X');
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.regionMatches(index+1, "IO", 0, 2)) {
                        sb.insert(offset, 'X');
                        offset++;
                        index += 1;
                        continue;
                        }
                    }
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'H') {
                        sb.insert(offset, 'X');
                        offset++;
                        index += 1;

                        continue;
                        }
                    }
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'T' ) {
                if (index > 0) {
                    if (index+1 < lastidx) {
                        if (EncodeMe.regionMatches(index+1, "IA", 0, 2)) {
                            sb.insert(offset, 'X');
                            offset++;
                            index += 1;
                            continue;
                            }
                        if (EncodeMe.regionMatches(index+1, "IO", 0, 2)) {
                            sb.insert(offset, 'X');
                            offset++;
                            index += 1;
                            continue;
                            }
                        }
                    }
                if (index+1 < lastidx) {
                    if (EncodeMe.regionMatches(index+1, "CH", 0, 2)) {
                        continue;
                        }
                    }
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'H') {
                        sb.insert(offset, '0');
                        offset++;
                        index += 1;
                        continue;
                        }
                    }
                sb.insert(offset, EncodeMe.charAt(index));
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'U' ) {
                if (index == 0) {
                    sb.insert(offset, EncodeMe.charAt(index));
                    offset++;
                    }
                continue;
                }
            if (EncodeMe.charAt(index) == 'V') {
                sb.insert(offset, 'F');
                offset++;
                continue;
                }
            if (EncodeMe.charAt(index) == 'W' ) {
                if (index < lastidx) {
                    if (EncodeMe.charAt(index+1) == 'R') {
                        sb.insert(offset, 'R');
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (EncodeMe.charAt(index+1) == 'H') {
                        sb.insert(offset, EncodeMe.charAt(index));
                        offset++;
                        index += 1;
                        continue;
                        }
                    if (isVowel(EncodeMe.charAt(index+1))) {
                        sb.insert(offset, EncodeMe.charAt(index));
                        offset++;
                        continue;
                        }
                    }
                continue;
                }
            if (EncodeMe.charAt(index) == 'X' ) {
                if (index == 0) {
                    sb.insert(offset, 'S');
                    offset++;
                    continue;
                    }
                sb.append("KS");
                offset += 2;
                continue;
                }
            if (EncodeMe.charAt(index) == 'Y' ) {
                if (index < lastidx) {
                    if (isVowel(EncodeMe.charAt(index+1))) {
                        sb.insert(offset, EncodeMe.charAt(index));
                        offset++;
                        continue;
                        }
                    }
                continue;
                }
            if (EncodeMe.charAt(index) == 'Z') {
                sb.insert(offset, 'S');
                offset++;
                continue;
                }
            }
        return sb;
        }
}
/********************************************************************/
/*                  Metaphone Encoder                               */
/********************************************************************/


</p>
class MetaphoneEncoder    {
    PhoneticString    phonString;
    boolean getMore = true;
    void encodeString () {
        while (getMore) {
            phonString = new PhoneticString(getStringInput());
            showMetaphoneValue(phonString.encodedString());
        }
    }  
    String getStringInput () {
        String              stringToEncode;
        DataInputStream     stream = new DataInputStream(System.in);


</p>
        System.out.println("Enter string to phonetically encode:");
        try {
            stringToEncode = stream.readLine();
        } catch (java.io.IOException ex)    {
            stringToEncode = "Error Getting Input";
            getMore = false;
        }
    return stringToEncode;
    }
    void showMetaphoneValue (StringBuffer MetaphonString)
            {
                                                    // print encoded string
          System.out.print (" Metaphone value is = "+ MetaphonString);
          System.out.print(" \n");
            }
    public static void main(String argv[]) 
        {
        MetaphoneEncoder mf = new MetaphoneEncoder();
        mf.encodeString();
        }
}

Back to Article


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.