Chatbot::Eliza


Chatbot::Eliza - The Perl Journal, Spring 1998


Issue 9, Spring 1998

John Nolan

Of all the chatterbots - programs that converse with humans - Eliza is the most famous. The original Eliza was written by Professor Joseph Weizenbaum of MIT and described in the Communications of the ACM in 1966 (Vol. 9, No. 1). This program is older than I am - and yet remains fascinating to this day. It's one of the all-time classic programs in computer science. Eliza pretends to be a Rogerian psychiatrist; whatever the human says, it replies - usually with a question - in an attempt to encourage the patient to elaborate.

The Eliza algorithm has been cloned dozens of times in all kinds of programming languages, including Fortran, Basic, Pascal, C, Java, and JavaScript. The first Eliza was written in a Lisp-like language called MAD-SLIP, way back in pre-Unix days. (Eliza is named after Eliza Doolittle, the cockney-speaking woman taught to speak proper English in G.B. Shaw's play Pygmalion.)

Last year I took a course in Natural Language Processing, and was surprised to find that much of the research in the field still uses Lisp. Lisp is a fine language, but Perl can do anything Lisp can do, and Perl source code is much easier to read. I searched the Web for Eliza clones, but I couldn't find any written in Perl. So I wrote one.

The Chatbot::Eliza module is a faithful clone of Weizenbaum's Eliza algorithm. It encapsulates Eliza's behavior within an object.

You can install Chatbot::Eliza just like any other Perl module. Once installed, this little bit of code is all you need to start an interactive Eliza session:


   use Chatbot::Eliza;

   $mybot = new Chatbot::Eliza; 

   $mybot->command_interface; 

Let's see it in action. If you install the module from the CPAN, save the three lines of code above to a file, and then execute it, here's the output:

Eliza: Please tell me what's been bothering you. 

you: 

This is an interactive session; type your reply to Eliza after the you: prompt. Listing 1 shows a transcript of a sample run.

You can set a few parameters of your Eliza object, such as its name, or a configuration file for it to read:

 $myotherbot = new Chatbot::Eliza "Brian", "myscript.txt";

 $myotherbot->command_interface; 

In this way, you can customize what the chatterbot says by providing your own configuration file. This consists of a list of keywords, decomposition rules, and reassemble rules. If you don't like Eliza's default rules, you can write your own. For instance, the following lines in myscript.txt would have Eliza (or Brian, as we've called it above) begin with one of two otherworldly greetings, chosen at random:

  initial: Greetings Earthling!

  initial: Take me to your leader!
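As a rough illustration of how keywords, decomposition rules, and reassemble rules might sit together in such a file, here is a small sketch. Only the initial: lines and the rule text come from this article; the final:, quit:, key:, decomp:, and reasmb: labels and the second reassemble rule are assumptions based on the script bundled with the module, so check the module's embedded documentation for the exact syntax before relying on them.

```
initial: Greetings Earthling!
initial: Take me to your leader!
final: Farewell, Earthling.
quit: bye

key: i
  decomp: * i am *
    reasmb: Is it because you are (2) that you came to me?
    reasmb: How long have you been (2)?
```

Each keyword owns one or more decomposition rules, and each decomposition rule owns the list of replies that can be built from it.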

Chatbot::Eliza contains a default configuration file with default greetings, salutations, 'quit'-equivalents, and rules for determining how Eliza should converse. If you want to watch Eliza think, you can turn on the debugging output before you launch your session:

  $mybot->debug(1); 

  $mybot->command_interface; 

Listing 2 shows part of the same session as Listing 1, with the debugging output turned on.

The Eliza algorithm is actually relatively straightforward. It consists of three steps:
  1. Search the input string for a keyword.
  2. If we find a keyword, use the list of "decomposition rules" for that keyword, and pattern-match the input string against each rule.
  3. If the input string matches any of the decomposition rules, then randomly select one of the "reassemble rules" for that decomposition rule, and use it to construct the reply.
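The three steps above can be sketched in a few dozen lines of plain Perl. This is not the module's own code: the %rules table, the decomp_to_regex() and reply() helpers, the rank numbers, and the "Please go on." fallback are all hypothetical names and values chosen for illustration. The "i am" rule and its reply are the ones from the default script discussed in this article.

```perl
use strict;
use warnings;

# Hypothetical miniature rule table (not the module's internal structure).
# Each keyword has a rank and an ordered list of decomposition rules;
# each decomposition rule carries its list of reassemble rules.
my %rules = (
    i => {
        rank   => 0,
        decomp => [
            [ '* i was *', [ 'Were you really (2)?' ] ],
            [ '* i am *',  [ 'Is it because you are (2) that you came to me?' ] ],
        ],
    },
    computer => {
        rank   => 50,
        decomp => [ [ '*', [ 'Do computers worry you?' ] ] ],
    },
);

# Turn a rule such as "* i am *" into a regex in which every asterisk
# becomes a capture group and every other word must match literally.
sub decomp_to_regex {
    my ($decomp) = @_;
    my $pattern = join '\s*',
        map { $_ eq '*' ? '(.*?)' : '\b' . quotemeta($_) . '\b' }
        split ' ', $decomp;
    return qr/^\s*$pattern\s*$/i;
}

sub reply {
    my ($input, $pick) = @_;
    # $pick selects one item from a list; random unless the caller overrides it.
    $pick ||= sub { $_[ rand @_ ] };

    # Step 1: search the input for keywords and keep the highest-ranked one.
    my @found = grep { exists $rules{$_} } map { lc $_ } $input =~ /(\w+)/g;
    return "Please go on." unless @found;    # stand-in fallback reply
    my ($keyword) = sort { $rules{$b}{rank} <=> $rules{$a}{rank} } @found;

    # Step 2: match the input against each decomposition rule in turn.
    for my $rule (@{ $rules{$keyword}{decomp} }) {
        my ($decomp, $reasmb) = @$rule;
        my @parts = $input =~ decomp_to_regex($decomp) or next;

        # Step 3: choose a reassemble rule and replace (1), (2), ...
        # with the text matched by the corresponding asterisk.
        my $template = $pick->(@$reasmb);
        $template =~ s{\((\d+)\)}{ $parts[$1 - 1] // '' }ge;
        return $template;
    }
    return "Please go on.";
}

print reply("He says I am too lazy"), "\n";
# prints: Is it because you are too lazy that you came to me?
```

Because each decomposition rule here has only one reassemble rule, the output is deterministic; with several candidates per rule the random pick is what keeps the conversation from repeating itself.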
So, in Listing 2, Eliza read the input string "He says I am too lazy" and found the keyword "i". It ran through its entire list of keywords, but "i" is the only one that matched. The keywords are ranked; if more than one keyword matches, it picks the most salient.

Next, it applied all the decomposition rules for the keyword "i" ("* i was *", "* i am* @happy *", and so on) to see if any matched. One rule did: "* i am *". Using this rule, we isolate parts of the input string around "i am": the two phrases "He says" and "too lazy".

Next we randomly select a reassemble rule: "Is it because you are (2) that you came to me". We use this rule to construct the reply. We replace (2) with the text that matched the second asterisk in the decomposition rule - in our example, the string "too lazy". Finally, Eliza replies with, "Is it because you are too lazy that you came to me?"

The Eliza algorithm has pre- and post-processing steps as well. These handle the transformation of words like "I" and "you"; you can read the documentation embedded in the module to learn more.

You can also access all of the module's internal functions from your program. For example, using the transform() method, you can feed a string to Eliza and fetch its response:

   $string   = "I'm sad."; 

   $response = $mybot->transform( $string ); 
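The pre- and post-processing steps mentioned above are easiest to picture as a word-for-word substitution applied to the fragment that gets echoed back. The sketch below is only an illustration of that idea: the %post table and the swap_person() helper are hypothetical, and the module's default script defines its own, longer substitution lists.

```perl
use strict;
use warnings;

# Hypothetical substitution table for turning the speaker's point of
# view into the listener's.
my %post = (
    i      => 'you',
    me     => 'you',
    my     => 'your',
    am     => 'are',
    you    => 'I',
    your   => 'my',
    myself => 'yourself',
);

sub swap_person {
    my ($text) = @_;
    # Substitute word by word in a single pass, so that a "you" produced
    # from "me" is not immediately swapped back again.
    return join ' ', map { $post{ lc $_ } // $_ } split ' ', $text;
}

print swap_person("my boss hates me"), "\n";   # prints: your boss hates you
```

Without this step, a reply built from "* i am *" would echo fragments such as "my boss" verbatim, and the illusion of a listener would collapse. Note that this sketch splits on whitespace only, so a word with punctuation attached is left unchanged.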

The Eliza bot is an object, and its configuration data is encapsulated, which means that you can instantiate other Eliza bots, each with its own distinct configuration.

In Listing 3, we create two bots and make them talk to one another. Two bots conversing only produces interesting results if we have clever scripts. Listing 4 shows sample output from the program, where both bots use the default Eliza script. In general, the default Eliza script does not produce any sensible conversation when interacting with itself. (In fairness, people who talk to themselves often don't make much sense either.)
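A minimal sketch of such a two-bot loop follows. It is not the author's Listing 3: the converse() helper, the opening remark, and the turn count are hypothetical, and the code assumes only the new() and transform() calls shown earlier in this article. The demonstration at the bottom runs only if Chatbot::Eliza is installed.

```perl
use strict;
use warnings;

# Pass a remark back and forth between two bots for a number of turns.
# Each bot only needs a transform() method that maps a string to a reply.
sub converse {
    my ($bot_a, $bot_b, $remark, $turns) = @_;
    my @log;
    for my $turn (1 .. $turns) {
        # Odd turns belong to the first bot, even turns to the second.
        my $speaker = $turn % 2 ? $bot_a : $bot_b;
        $remark = $speaker->transform($remark);
        push @log, $remark;
    }
    return @log;
}

# Demonstration with two default bots, skipped when the module is absent.
if (eval { require Chatbot::Eliza; 1 }) {
    my @log = converse(Chatbot::Eliza->new, Chatbot::Eliza->new, "Hello", 6);
    my $n = 0;
    print +($n++ % 2 ? "B: " : "A: "), $_, "\n" for @log;
}
```

Giving each bot a different script file is what would make the exchange worth reading; with two copies of the default script, both sides keep answering questions with questions.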

One of the reasons the original Eliza was so successful was Weizenbaum's clever rules and limited domain; the role of a mock Rogerian psychotherapist doesn't require much intelligence, and a relatively simple algorithm can pull it off. But as Weizenbaum found, even this relatively simple program fooled naive users into believing that it understood what they were saying. Scarier still, some people actually felt more comfortable talking to Eliza than a flesh-and-blood psychiatrist.

Listing 5 shows an interactive session, this time with a human deliberately trying to expose Eliza's weaknesses.

What Now?

The CPAN includes many modules which allow a script to interact easily with resources on the Internet. For example, it is straightforward to write scripts which combine the Chatbot::Eliza module with the Net::IRC module or the CGI module. (The CPAN distribution of Chatbot::Eliza includes a sample CGI script - a Web-based Eliza.)

I am currently working on alternative rulesets. If I come up with one that works reasonably well, I'll include it in future releases of the module.

__END__

References

The CMU Artificial Intelligence Repository:

http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/classics/0.html.

Hayden's Java Eliza:

http://members.home.com/chayden/eliza/Eliza.html.

Goerlich's Java Eliza:

http://philly.cyberloft.com/bgoerlic/eliza.htm.

The white paper on the Loebner Prize competition:

http://www.vperson.com/mlm/aaai94.html.

Julia, an Eliza-like chatterbot which roams on TinyMUDs:

http://www.vperson.com/mlm/julia.html.

The CYC Project: http://www.cyc.com/.

BotSpot: http://www.botspot.com/.

UMBC AgentWeb: http://www.cs.umbc.edu/agents/.


John Nolan is a systems administrator for N2K Inc.


