Chatbot::Eliza(3)     User Contributed Perl Documentation    Chatbot::Eliza(3)

NAME

       Chatbot::Eliza - A clone of the classic Eliza program

SYNOPSIS

         use Chatbot::Eliza;

         $mybot = new Chatbot::Eliza;
         $mybot->command_interface;

         # see below for details

DESCRIPTION

       This module implements the classic Eliza algorithm.  The original Eliza
       program was written by Joseph Weizenbaum and described in the
       Communications of the ACM in 1966.  Eliza is a mock Rogerian
       psychotherapist.  It prompts for user input, and uses a simple
       transformation algorithm to change user input into a follow-up
       question.  The program is designed to give the appearance of
       understanding.

       This program is a faithful implementation of the program described by
       Weizenbaum.  It uses a simplified script language (devised by Charles
       Hayden).  The content of the script is the same as Weizenbaum's.

       This module encapsulates the Eliza algorithm in the form of an object.
       This should make the functionality easy to incorporate in larger
       programs.

INSTALLATION

       The current version of Chatbot::Eliza.pm is available on CPAN:

         http://www.perl.com/CPAN/modules/by-module/Chatbot/

       To install this package, change to the directory which you created
       by untarring the package, and type the following:

               perl Makefile.PL
               make
               make test
               make install

       This will copy Eliza.pm to your perl library directory for use by all
       perl scripts.  You will probably need to be root to do this, unless
       you have installed a personal copy of perl.

USAGE

       This is all you need to do to launch a simple Eliza session:

               use Chatbot::Eliza;

               $mybot = new Chatbot::Eliza;
               $mybot->command_interface;

       You can also customize certain features of the session:

               $myotherbot = new Chatbot::Eliza;

               $myotherbot->name( "Hortense" );
               $myotherbot->debug( 1 );

               $myotherbot->command_interface;

       These lines set the name of the bot to "Hortense" and turn on the
       debugging output.

       When creating an Eliza object, you can specify a name and an
       alternative script file:

               $bot = new Chatbot::Eliza "Brian", "myscript.txt";

       You can also use an anonymous hash to set these parameters.  Any of the
       fields can be initialized using this syntax:

               $bot = new Chatbot::Eliza {
                       name       => "Brian",
                       scriptfile => "myscript.txt",
                       debug      => 1,
                       prompts_on => 1,
                       memory_on  => 0,
                       myrand     =>
                               sub { my $N = defined $_[0] ? $_[0] : 1;  rand($N); },
               };

       If you don't specify a script file, then the new object will be
       initialized with a default script.  The module contains this script
       within itself.

       You can use any of the internal functions in a calling program.  The
       code below takes an arbitrary string and retrieves the reply from the
       Eliza object:

               my $string = "I have too many problems.";
               my $reply  = $mybot->transform( $string );

       You can easily create two bots, each with a different script, and see
       how they interact:

               use Chatbot::Eliza;

               my ($harry, $sally, $he_says, $she_says);

               $sally = new Chatbot::Eliza "Sally", "histext.txt";
               $harry = new Chatbot::Eliza "Harry", "hertext.txt";

               $he_says  = "I am sad.";

               # Seed the random number generator.
               srand( time ^ ($$ + ($$ << 15)) );

               # Note: this loop runs until the program is interrupted.
               while (1) {
                       $she_says = $sally->transform( $he_says );
                       print $sally->name, ": $she_says \n";

                       $he_says  = $harry->transform( $she_says );
                       print $harry->name, ": $he_says \n";
               }

       Mechanically, this works well.  However, the result depends critically
       on the actual script data.  Having two mock Rogerian therapists talk to
       each other usually does not produce any sensible conversation, of
       course.

       After each call to the transform() method, the debugging output for
       that transformation is stored in a variable called $debug_text, which
       the debug_text() method returns:

               my $reply      = $mybot->transform( "My foot hurts" );
               my $debugging  = $mybot->debug_text;

       This feature is always available, even if the instance's $debug
       variable is set to 0.

       Calling programs can specify their own random-number generators.  Use
       this syntax:

               $chatbot = new Chatbot::Eliza;
               $chatbot->myrand(
                       sub {
                               #function goes here!
                       }
               );

       The custom random function should have the same prototype as perl's
       built-in rand() function.  That is, it should take a single (numeric)
       expression as a parameter, and it should return a floating-point value
       between 0 and that number.

       What this code actually does is pass a reference to an anonymous
       subroutine (a "code reference").  Make sure you've read the perlref
       manpage for details on how code references work.

       If you don't specify any custom rand function, then the Eliza object
       will just use the built-in rand() function.
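
       As a minimal sketch of such a custom random function, the closure
       below wraps a small multiplicative congruential generator so that a
       session can be reproduced from a fixed seed.  The helper name
       make_seeded_rand and the generator constants are illustrative
       assumptions, not part of Chatbot::Eliza; only the calling convention
       (one numeric argument, result between 0 and that number) comes from
       the description above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Chatbot::Eliza): returns a code
# reference that is called like rand($N) but is reproducible.
sub make_seeded_rand {
    my ($seed) = @_;
    my $modulus = 2147483647;                   # 2**31 - 1
    my $state   = ( $seed % $modulus ) || 1;    # state must never be 0
    return sub {
        my $n = defined $_[0] ? $_[0] : 1;
        # One multiplicative congruential step.
        $state = ( 16807 * $state ) % $modulus;
        return $n * ( $state / $modulus );
    };
}

my $myrand = make_seeded_rand(42);
printf "%.4f\n", $myrand->(10) for 1 .. 3;

# With the module installed, the same code reference could be passed on:
#   $chatbot->myrand( $myrand );
```

       Two generators built from the same seed return the same sequence,
       which can be convenient when comparing debugging output between runs.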

MAIN DATA MEMBERS

       Each Eliza object uses the following data structures to hold the script
       data in memory:

       %decomplist

       Hash.  Keys: the set of keywords; Values: strings containing the
       decomposition rules.

       %reasmblist

       Hash.  Keys: strings which are each the join of a keyword and a
       corresponding decomposition rule; Values: the set of possible
       reassembly statements for that keyword and decomposition rule.

       %reasmblist_for_memory

       This structure is identical to %reasmblist, except that these rules are
       only invoked when a user comment is being retrieved from memory.  These
       contain comments such as "Earlier you mentioned that...," which are
       only appropriate for remembered comments.  Rules in the script must be
       specially marked in order to be included in this list rather than
       %reasmblist.  The default script only has a few of these rules.

       @memory

       A list of user comments which an Eliza instance is remembering for
       future use.  Eliza does not remember everything, only some things.  In
       this implementation, Eliza will only remember comments which match a
       decomposition rule which actually has reassembly rules that are marked
       with the keyword "reasm_for_memory" rather than the normal "reasmb".
       The default script only has a few of these.

       %keyranks

       Hash.  Keys: the set of keywords; Values: the rank of each keyword.

       @quit

       "quit" words -- that is, words the user might use to try to exit the
       program.

       @initial

       Possible greetings for the beginning of the program.

       @final

       Possible farewells for the end of the program.

       %pre

       Hash.  Keys: words which are replaced before any transformations;
       Values: the respective replacement words.

       %post

       Hash.  Keys: words which are replaced after the transformations and
       after the reply is constructed; Values: the respective replacement
       words.

       %synon

       Hash.  Keys: words which are found in decomposition rules; Values:
       words which are treated just like their corresponding synonyms during
       matching of decomposition rules.

       Other data members

       There are several other internal data members.  Hopefully these are
       sufficiently obvious that you can learn about them just by reading the
       source code.

METHODS

       new()

           my $chatterbot = new Chatbot::Eliza;

       new() creates a new Eliza object.  This method also calls the internal
       _initialize() method, which in turn calls the parse_script_data()
       method, which initializes the script data.

           my $chatterbot = new Chatbot::Eliza 'Ahmad', 'myfile.txt';

       The Eliza object defaults to the name "Eliza", and it contains default
       script data within itself.  However, using the syntax above, you can
       specify an alternative name and an alternative script file.

       See the method parse_script_data() for a description of the format of
       the script file.

       command_interface()

           $chatterbot->command_interface;

       command_interface() opens an interactive session with the Eliza object,
       just like the original Eliza program.

       If you want to design your own session format, then you can write your
       own while loop and your own functions for prompting for and reading
       user input, and use the transform() method to generate Eliza's
       responses.  (Note: you do not need to invoke preprocess() and
       postprocess() directly, because these are invoked from within the
       transform() method.)
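
       A hand-written session loop of this kind might look like the sketch
       below.  It only relies on the bot having a transform() method, as
       described above.  The names run_session and EchoBot are hypothetical:
       EchoBot is a stand-in class so that the example runs without the
       module installed, and the exit test is a simplification rather than
       the module's own quit handling.

```perl
use strict;
use warnings;

# Hypothetical stand-in so the sketch runs without the module installed;
# in real use, pass a Chatbot::Eliza object instead.
package EchoBot;
sub new       { return bless {}, shift }
sub transform { my ( $self, $text ) = @_; return "Why do you say: $text ?" }

package main;

# Read lines from $in, write one reply per line to $out, and stop at
# end of input or when the user types "bye" or "quit".
sub run_session {
    my ( $bot, $in, $out ) = @_;
    while ( defined( my $line = <$in> ) ) {
        chomp $line;
        next if $line =~ /^\s*$/;                  # skip empty input
        last if $line =~ /^\s*(?:bye|quit)\b/i;    # simple exit test
        print {$out} $bot->transform($line), "\n";
    }
    return;
}

# Demonstration with in-memory filehandles instead of STDIN/STDOUT.
my $input      = "I am sad.\nquit\n";
my $transcript = '';
open my $in,  '<', \$input      or die "cannot open input: $!";
open my $out, '>', \$transcript or die "cannot open output: $!";
run_session( EchoBot->new, $in, $out );
close $out;
print $transcript;
```

       For an interactive session, the same function could be called with
       \*STDIN and \*STDOUT and a real Chatbot::Eliza object.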

       But if you're lazy and you want to skip all that, then just use
       command_interface().  It's all done for you.

       During an interactive session invoked using command_interface(), you
       can enter the word "debug" to toggle debug mode on and off.  You can
       also enter the keyword "memory" to invoke the _debug_memory() method
       and print out the contents of the Eliza instance's memory.

       preprocess()

           $string = preprocess($string);

       preprocess() applies simple substitution rules to the input string.
       Mostly this is to catch variations in spelling, misspellings,
       contractions and the like.

       preprocess() is called from within the transform() method.  It is
       applied to user-input text before any other processing, and before a
       reassembly statement has been selected.

       It uses the hash %pre, which is created during the parse of the
       script.

       postprocess()

           $string = postprocess($string);

       postprocess() applies simple substitution rules to the reassembly rule.
       This is where all the "I"s and "you"s are exchanged.  postprocess()
       is called from within the transform() method.

       It uses the hash %post, created during the parse of the script.
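
       The kind of word-for-word substitution described for preprocess() and
       postprocess() can be sketched as follows.  This is not the module's
       own code: the function name swap_words and the sample table are
       assumptions made for illustration, and the real entries come from the
       script.  Each word is looked up exactly once, so that "i" becoming
       "you" is not swapped back again.

```perl
use strict;
use warnings;

# Illustrative replacement table (assumed entries, not the real script).
my %post = (
    'i'    => 'you',
    'my'   => 'your',
    'you'  => 'I',
    'your' => 'my',
    'am'   => 'are',
);

# Replace each word at most once, using a lower-cased lookup.
sub swap_words {
    my ( $text, $table ) = @_;
    return join ' ',
        map { exists $table->{ lc $_ } ? $table->{ lc $_ } : $_ }
        split ' ', $text;
}

print swap_words( 'i am worried about my job', \%post ), "\n";
# prints "you are worried about your job"
```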

       _testquit()

            if ($self->_testquit($user_input) ) { ... }

       _testquit() detects words like "bye" and "quit" and returns true if it
       finds one of them as the first word in the sentence.

       These words are listed in the script, under the keyword "quit".

       _debug_memory()

            $self->_debug_memory()

       _debug_memory() is a special function which returns the contents of
       Eliza's memory stack.

       transform()

           $reply = $chatterbot->transform( $string, $use_memory );

       transform() applies transformation rules to the user input string.  It
       invokes preprocess(), does transformations, then invokes postprocess().
       It returns the transformed output string, called $reasmb.

       The algorithm embedded in the transform() method has three main parts:

       1   Search the input string for a keyword.

       2   If we find a keyword, use the list of decomposition rules for that
           keyword, and pattern-match the input string against each rule.

       3   If the input string matches any of the decomposition rules, then
           randomly select one of the reassembly rules for that decomposition
           rule, and use it to construct the reply.
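
       The three steps above can be sketched in a few lines of standalone
       Perl.  This is an illustration under stated assumptions, not the
       module's implementation: the function names, the hash layout (keys
       joined with a colon), the rule that higher-ranked keywords are tried
       first, and the fallback reply are all choices made for this example.
       Pre- and postprocessing are left out.

```perl
use strict;
use warnings;

# Tiny illustrative script data; keywords, ranks and rules are examples
# only, and the hash layout is an assumption of this sketch.
my %keyranks   = ( 'remember' => 5, 'my' => 2 );
my %decomplist = (
    'remember' => ['* i remember *'],
    'my'       => ['* my *'],
);
my %reasmblist = (
    'remember:* i remember *' => ['Do you often think of (2) ?'],
    'my:* my *'               => ['Why do you say your (2) ?'],
);

# Turn a decomposition rule into a regex; each "*" captures any text.
sub decomp_to_regex {
    my ($rule) = @_;
    my @pieces = map { $_ eq '*' ? '(.*)' : '\b' . quotemeta($_) . '\b' }
        split ' ', $rule;
    my $re = join '\s*', @pieces;
    return qr/^\s*$re\s*$/;
}

sub transform_sketch {
    my ( $input, $rand ) = @_;
    $rand ||= sub { rand( $_[0] ) };
    my $text = lc $input;

    # 1. Search the input for keywords (assumed: highest rank first).
    my %seen = map { $_ => 1 } split ' ', $text;
    my @keys = sort { $keyranks{$b} <=> $keyranks{$a} }
        grep { $seen{$_} } keys %keyranks;

    for my $key (@keys) {
        # 2. Match the input against each decomposition rule.
        for my $rule ( @{ $decomplist{$key} } ) {
            my $re    = decomp_to_regex($rule);
            my @parts = $text =~ $re or next;

            # 3. Pick a reassembly rule at random and fill in each "(N)".
            my $choices = $reasmblist{ join ':', $key, $rule };
            my $reasmb  = $choices->[ int( $rand->( scalar @$choices ) ) ];
            $reasmb =~ s{\((\d+)\)}{ $parts[ $1 - 1 ] // '' }ge;
            return $reasmb;
        }
    }
    return 'Please go on.';    # illustrative fallback when nothing matches
}

print transform_sketch('I remember my first bicycle'), "\n";
# prints "Do you often think of my first bicycle ?"
```

       In the real module, postprocess() would additionally exchange words
       such as "my" and "your" in the reply.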

       transform() takes two parameters.  The first is the string we want to
       transform.  The second is a flag which indicates where this string came
       from.  If the flag is set, then the string has been pulled from memory,
       and we should use reassembly rules appropriate for that.  If the flag
       is not set, then the string is the most recent user input, and we can
       use the ordinary reassembly rules.

       The memory flag is only set when the transform() method is called
       recursively.  The mechanism for setting this parameter is embedded in
       the transform() method itself.  If the flag is set inappropriately, it
       is ignored.

       How memory is used

       In the script, some reassembly rules are special.  They are marked with
       the keyword "reasm_for_memory", rather than just "reasmb".  Eliza
       "remembers" any comment when it matches a decomposition rule for which
       there are any reassembly rules for memory.  An Eliza object remembers
       up to $max_memory_size (default: 5) user input strings.

       If, during a subsequent run, the transform() method fails to find any
       appropriate decomposition rule for a user's comment, and if there are
       any comments inside the memory array, then Eliza may elect to ignore
       the most recent comment and instead pull out one of the strings from
       memory.  In this case, the transform() method is called recursively
       with the memory flag.
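
       A bounded memory of this kind can be sketched as below.  The limit of
       five comes from the text above.  Everything else is an assumption of
       the sketch rather than a description of the module: the function
       names remember and recall, the choice to discard the oldest comment
       when the list is full, and the choice to remove a comment once it has
       been recalled.

```perl
use strict;
use warnings;

my $max_memory_size = 5;    # default named in the text above
my @memory;

# Store a comment, discarding the oldest one when the list is full.
# (Which entry is discarded is an assumption of this sketch.)
sub remember {
    my ($comment) = @_;
    push @memory, $comment;
    shift @memory while @memory > $max_memory_size;
    return scalar @memory;
}

# Pull one remembered comment out at random; returns nothing if empty.
sub recall {
    my ($rand) = @_;
    $rand ||= sub { rand( $_[0] ) };
    return unless @memory;
    return splice( @memory, int( $rand->( scalar @memory ) ), 1 );
}

remember("my foot hurts, remark $_") for 1 .. 7;
print scalar(@memory), " comments kept\n";    # prints "5 comments kept"
print recall(), "\n";
```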

       Honestly, I am not sure exactly how this memory functionality was
       implemented in the original Eliza program.  Hopefully this
       implementation is not too far from Weizenbaum's.

       If you don't want to use the memory functionality at all, then you can
       disable it:

               $mybot->memory_on(0);

       You can also achieve the same effect by making sure that the script
       data does not contain any reassembly rules marked with the keyword
       "reasm_for_memory".  The default script data only has 4 such items.

       parse_script_data()

           $self->parse_script_data;
           $self->parse_script_data( $script_file );

       parse_script_data() is invoked from the _initialize() method, which is
       called from the new() method.  However, you can also call this method
       at any time against an already-instantiated Eliza instance.  In that
       case, the new script data is added to the old script data.  The old
       script data is not deleted.

       You can pass a parameter to this method, which is the name of the
       script file, and it will read in and parse that file.  If you do not
       pass any parameter to this method, then it will read the data embedded
       at the end of the module as its default script data.

       If you pass the name of a script file to parse_script_data(), and that
       file is not available for reading, then the module dies.

Format of the script file

       This module includes a default script file within itself, so it is not
       necessary to explicitly specify a script file when instantiating an
       Eliza object.

       Each line in the script file can specify a key, a decomposition rule,
       or a reassembly rule.

         key: remember 5
           decomp: * i remember *
             reasmb: Do you often think of (2) ?
             reasmb: Does thinking of (2) bring anything else to mind ?
           decomp: * do you remember *
             reasmb: Did you think I would forget (2) ?
             reasmb: What about (2) ?
             reasmb: goto what
         pre: equivalent alike
         synon: belief feel think believe wish

       The number after the key specifies the rank.  If a user's input
       contains the keyword, then the transform() method will try to match
       one of the decomposition rules for that keyword.  If one matches, then
       it will select one of the reassembly rules at random.  The number (2)
       here means "use whatever set of words matched the second asterisk in
       the decomposition rule."

       If you specify a list of synonyms for a word, then you should use a
       "@" when you use that word in a decomposition rule:

         decomp: * i @belief i *
           reasmb: Do you really think so ?
           reasmb: But you are not sure you (3).

       Otherwise, the script will never check to see if there are any synonyms
       for that keyword.
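
       One way to picture the "@" notation is as an alternation in a regular
       expression.  The sketch below is an assumption about how such a rule
       could be matched, not the module's actual code, and the name
       rule_to_regex is hypothetical.  Treating the synonym as a captured
       part makes "(3)" in the example above refer to the text after the
       second "i", which agrees with the sample reassembly rule.

```perl
use strict;
use warnings;

# Illustrative synonym table, taken from the "synon:" line shown earlier.
my %synon = ( 'belief' => [qw(belief feel think believe wish)] );

# Build a regex from a decomposition rule: "*" captures any text and
# "@word" captures any synonym listed for that word.
sub rule_to_regex {
    my ($rule) = @_;
    my @pieces;
    for my $token ( split ' ', $rule ) {
        if ( $token eq '*' ) {
            push @pieces, '(.*)';
        }
        elsif ( my ($name) = $token =~ /^\@(\w+)$/ ) {
            my @alts = map { quotemeta($_) } @{ $synon{$name} || [$name] };
            push @pieces, '\b(' . join( '|', @alts ) . ')\b';
        }
        else {
            push @pieces, '\b' . quotemeta($token) . '\b';
        }
    }
    my $re = join '\s*', @pieces;
    return qr/^\s*$re\s*$/i;
}

my @parts = 'i feel i am lost' =~ rule_to_regex('* i @belief i *');
print join( '|', @parts ), "\n";    # prints "|feel|am lost"
```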

       Reassembly rules should be marked with reasm_for_memory rather than
       reasmb when they are appropriate for use when a user's comment has
       been extracted from memory.

         key: my 2
           decomp: * my *
             reasm_for_memory: Let's discuss further why your (2).
             reasm_for_memory: Earlier you said your (2).
             reasm_for_memory: But your (2).
             reasm_for_memory: Does that have anything to do with the fact that your (2) ?

How the script file is parsed

       Each line in the script file contains an "entrytype" (key, decomp,
       synon) and an "entry", separated by a colon.  In turn, each "entry" can
       itself be composed of a "key" and a "value", separated by a space.  The
       parse_script_data() method parses each line, and splits the
       "entrytype" and "entry" portions of each line into two variables,
       $entrytype and $entry.

       Next, it uses the string $entrytype to determine what sort of content
       to expect in the $entry variable, if anything, and parses it
       accordingly.  In some cases, there is no second level of key-value
       pair, so the method does not even bother to isolate or create $key and
       $value.

       $key is always a single word.  $value can be null, or one single word,
       or a string composed of several words, or an array of words.
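
       The two-level split described above can be sketched as follows.  The
       function name parse_line, the returned hash layout, and the list of
       entry types that carry a key-value pair are assumptions made for this
       example and are not taken from the module's source.

```perl
use strict;
use warnings;

# Entry types assumed (for this sketch) to contain a "key value" pair.
my %has_pair = map { $_ => 1 } qw(key pre post synon);

# Split one script line into entrytype and entry, then split the entry
# into key and value where that makes sense.
sub parse_line {
    my ($line) = @_;
    my ( $entrytype, $entry ) = $line =~ /^\s*(\w+)\s*:\s*(.*?)\s*$/
        or return;    # not a "type: entry" line
    my %parsed = ( entrytype => $entrytype, entry => $entry );
    if ( $has_pair{$entrytype} ) {
        @parsed{qw(key value)} = split ' ', $entry, 2;
    }
    return \%parsed;
}

my $p = parse_line('  key: remember 5');
print "$p->{entrytype}: key=$p->{key} value=$p->{value}\n";
# prints "key: key=remember value=5"
```

       A decomposition line such as "decomp: * i remember *" is returned
       with only its entrytype and entry, since it has no second level.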

       Based on all these entries and keys and values, the method creates
       two giant hashes: %decomplist, which holds the decomposition rules for
       each keyword, and %reasmblist, which holds the reassembly phrases for
       each decomposition rule.  It also creates %keyranks, which holds the
       ranks for each key.

       Six other data structures are created: %reasmblist_for_memory, %pre,
       %post, %synon, @initial, and @final.

CHANGES

       * Version 1.02-1.04 - January 2003
             Added a Norwegian script, kindly contributed by
             Mats Stafseng Einarsen.  Thanks Mats!

       * Version 1.01 - January 2003
             Added an empty DESTROY method, to eliminate
             some pesky warning messages.  Suggested by
             Stas Bekman.

       * Version 0.98 - March 2000
             Some changes to the documentation.

       * Versions 0.96-0.97 - October 1999
             One tiny change to the regex which implements
             reassembly rules.  Thanks to Gidon Wise for
             suggesting this improvement.

       * Versions 0.94-0.95 - July 1999
             Fixed a bug in the way the bot invokes its random function
             when it pulls a comment out of memory.

       * Version 0.93 - June 1999
             Calling programs can now specify their own random-number generators.
             Use this syntax:

                   $chatbot = new Chatbot::Eliza;
                   $chatbot->myrand(
                           sub {
                                   #function goes here!
                           }
                   );

             The custom random function should have the same prototype
             as perl's built-in rand() function.  That is, it should take
             a single (numeric) expression as a parameter, and it should
             return a floating-point value between 0 and that number.

             You can also now use a reference to an anonymous hash
             as a parameter to the new() method to define any fields
             in that bot instance:

                   $bot = new Chatbot::Eliza {
                           name       => "Brian",
                           scriptfile => "myscript.txt",
                           debug      => 1,
                   };

       * Versions 0.91-0.92 - April 1999
             Fixed some misspellings.

       * Version 0.90 - April 1999
             Fixed a bug in the way individual bot objects store
             their memory.  Thanks to Randal Schwartz and to
             Robert Chin for pointing this out.

             Fixed a very stupid error in the way the random
             function is invoked.  Thanks to Antony Quintal
             for pointing out the error.

             Many corrections and improvements were made
             to the German script by Matthias Hellmund.
             Thanks, Matthias!

             Made a minor syntactical change, at the suggestion
             of Roy Stephan.

             The memory functionality can now be disabled by setting the
             $Chatbot::Eliza::memory_on variable to 0, like so:

                   $bot->memory_on(0);

             Thanks to Robert Chin for suggesting that.

       * Version 0.40 - July 1998
             Re-implemented the memory functionality.

             Cleaned up and expanded the embedded POD documentation.

             Added a sample script in German.

             Modified the debugging behavior.  The transform() method itself
             will no longer print any debugging output directly to STDOUT.
             Instead, all debugging output is stored in a module variable
             called "debug_text".  The "debug_text" variable is printed out
             by the command_interface() method, if the debug flag is set.
             But even if this flag is not set, the variable debug_text
             is still available to any calling program.

             Added a few more example scripts which use the module.

               simple       - simple script using Eliza.pm
               simple.cgi   - simple CGI script using Eliza.pm
               debug.cgi    - CGI script which displays debugging output
               deutsch      - script using the German script
               deutsch.cgi  - CGI script using the German script
               twobots      - script which creates two distinct bots

       * Version 0.32 - December 1997
             Fixed a bug in the way Eliza loads its default internal script data.
             (Thanks to Randal Schwartz for pointing this out.)

             Removed the "memory" functions internal to Eliza.
             When I get them working properly I will add them back in.

             Added one more example program.

             Fixed some minor errors in the embedded POD documentation.

       * Version 0.31
             The module is now installable, just like any other self-respecting
             CPAN module.

       * Version 0.30
             First release.

AUTHOR

       John Nolan  jpnolan@sonic.net  January 2003.

       Implements the classic Eliza algorithm by Prof. Joseph Weizenbaum.
       Script format devised by Charles Hayden.



perl v5.8.8                       2003-01-23                 Chatbot::Eliza(3)