1Chatbot::Eliza(3) User Contributed Perl Documentation Chatbot::Eliza(3)
2
3
4
6 Chatbot::Eliza - A clone of the classic Eliza program
7
9 use Chatbot::Eliza;
10
11 $mybot = new Chatbot::Eliza;
12 $mybot->command_interface;
13
14 # see below for details
15
17 This module implements the classic Eliza algorithm. The original Eliza
18 program was written by Joseph Weizenbaum and described in the Communi‐
19 cations of the ACM in 1966. Eliza is a mock Rogerian psychotherapist.
20 It prompts for user input, and uses a simple transformation algorithm
21 to change user input into a follow-up question. The program is
22 designed to give the appearance of understanding.
23
24 This program is a faithful implementation of the program described by
25 Weizenbaum. It uses a simplified script language (devised by Charles
26 Hayden). The content of the script is the same as Weizenbaum's.
27
28 This module encapsulates the Eliza algorithm in the form of an object.
29 This should make the functionality easy to incorporate in larger pro‐
30 grams.
31
33 The current version of Chatbot::Eliza.pm is available on CPAN:
34
35 http://www.perl.com/CPAN/modules/by-module/Chatbot/
36
37 To install this package, just change to the directory which you created
38 by untarring the package, and type the following:
39
    perl Makefile.PL
    make
    make test
    make install
44
This will copy Eliza.pm to your perl library directory for use by all
perl scripts. You will probably need to be root to do this, unless you
have installed a personal copy of perl.
48
50 This is all you need to do to launch a simple Eliza session:
51
52 use Chatbot::Eliza;
53
54 $mybot = new Chatbot::Eliza;
55 $mybot->command_interface;
56
57 You can also customize certain features of the session:
58
59 $myotherbot = new Chatbot::Eliza;
60
61 $myotherbot->name( "Hortense" );
62 $myotherbot->debug( 1 );
63
64 $myotherbot->command_interface;
65
66 These lines set the name of the bot to be "Hortense" and turn on the
67 debugging output.
68
69 When creating an Eliza object, you can specify a name and an alterna‐
70 tive scriptfile:
71
72 $bot = new Chatbot::Eliza "Brian", "myscript.txt";
73
74 You can also use an anonymous hash to set these parameters. Any of the
75 fields can be initialized using this syntax:
76
77 $bot = new Chatbot::Eliza {
78 name => "Brian",
79 scriptfile => "myscript.txt",
80 debug => 1,
81 prompts_on => 1,
82 memory_on => 0,
83 myrand =>
84 sub { my $N = defined $_[0] ? $_[0] : 1; rand($N); },
85 };
86
87 If you don't specify a script file, then the new object will be ini‐
88 tialized with a default script. The module contains this script within
89 itself.
90
91 You can use any of the internal functions in a calling program. The
92 code below takes an arbitrary string and retrieves the reply from the
93 Eliza object:
94
95 my $string = "I have too many problems.";
96 my $reply = $mybot->transform( $string );
97
98 You can easily create two bots, each with a different script, and see
99 how they interact:
100
    use Chatbot::Eliza;

    my ($harry, $sally, $he_says, $she_says);

    $sally = new Chatbot::Eliza "Sally", "histext.txt";
    $harry = new Chatbot::Eliza "Harry", "hertext.txt";

    $he_says = "I am sad.";

    # Seed the random number generator.
    srand( time ^ ($$ + ($$ << 15)) );

    while (1) {
        $she_says = $sally->transform( $he_says );
        print $sally->name, ": $she_says \n";

        $he_says = $harry->transform( $she_says );
        print $harry->name, ": $he_says \n";
    }
120
121 Mechanically, this works well. However, it critically depends on the
122 actual script data. Having two mock Rogerian therapists talk to each
123 other usually does not produce any sensible conversation, of course.
124
125 After each call to the transform() method, the debugging output for
126 that transformation is stored in a variable called $debug_text.
127
128 my $reply = $mybot->transform( "My foot hurts" );
129 my $debugging = $mybot->debug_text;
130
This feature is always available, even if the instance's $debug
variable is set to 0.
133
134 Calling programs can specify their own random-number generators. Use
135 this syntax:
136
137 $chatbot = new Chatbot::Eliza;
138 $chatbot->myrand(
139 sub {
140 #function goes here!
141 }
142 );
143
144 The custom random function should have the same prototype as perl's
145 built-in rand() function. That is, it should take a single (numeric)
146 expression as a parameter, and it should return a floating-point value
147 between 0 and that number.
148
149 What this code actually does is pass a reference to an anonymous sub‐
150 routine ("code reference"). Make sure you've read the perlref manpage
151 for details on how code references actually work.
152
153 If you don't specify any custom rand function, then the Eliza object
154 will just use the built-in rand() function.
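
As an illustration of the contract described above, here is a minimal
sketch of a repeatable generator that could be handed to myrand(). The
helper name make_seeded_rand and the choice of the Park-Miller
generator are assumptions made for this example; they are not part of
Chatbot::Eliza.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Chatbot::Eliza): returns a code
# reference that behaves like rand(N), but yields a repeatable sequence.
sub make_seeded_rand {
    my ($seed) = @_;

    # Park-Miller "minimal standard" generator; the state stays in
    # the range 1 .. 2**31 - 2.
    my $state = $seed % 2147483647;
    $state = 1 if $state == 0;

    return sub {
        my $n = defined $_[0] ? $_[0] : 1;
        $state = ( $state * 16807 ) % 2147483647;
        return $n * ( $state - 1 ) / 2147483646;    # 0 <= result < $n
    };
}

my $rand = make_seeded_rand(42);
printf "%.4f\n", $rand->(10) for 1 .. 3;
```

With the module loaded, such a generator would be installed using the
syntax shown earlier, for example
$chatbot->myrand( make_seeded_rand(42) ). A repeatable sequence can be
handy when you want the same replies on every run, such as in tests.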
155
157 Each Eliza object uses the following data structures to hold the script
158 data in memory:
159
160 %decomplist
161
162 Hash: the set of keywords; Values: strings containing the decomposi‐
163 tion rules.
164
165 %reasmblist
166
167 Hash: a set of values which are each the join of a keyword and a corre‐
168 sponding decomposition rule; Values: the set of possible reassembly
169 statements for that keyword and decomposition rule.
170
171 %reasmblist_for_memory
172
173 This structure is identical to %reasmblist, except that these rules are
174 only invoked when a user comment is being retrieved from memory. These
175 contain comments such as "Earlier you mentioned that...," which are
176 only appropriate for remembered comments. Rules in the script must be
177 specially marked in order to be included in this list rather than
178 %reasmblist. The default script only has a few of these rules.
179
180 @memory
181
182 A list of user comments which an Eliza instance is remembering for
183 future use. Eliza does not remember everything, only some things. In
184 this implementation, Eliza will only remember comments which match a
185 decomposition rule which actually has reassembly rules that are marked
186 with the keyword "reasm_for_memory" rather than the normal "reasmb".
187 The default script only has a few of these.
188
189 %keyranks
190
191 Hash: the set of keywords; Values: the ranks for each keyword
192
193 @quit
194
195 "quit" words -- that is, words the user might use to try to exit the
196 program.
197
198 @initial
199
200 Possible greetings for the beginning of the program.
201
202 @final
203
204 Possible farewells for the end of the program.
205
206 %pre
207
208 Hash: words which are replaced before any transformations; Values: the
209 respective replacement words.
210
211 %post
212
213 Hash: words which are replaced after the transformations and after the
214 reply is constructed; Values: the respective replacement words.
215
216 %synon
217
218 Hash: words which are found in decomposition rules; Values: words which
219 are treated just like their corresponding synonyms during matching of
220 decomposition rules.
221
222 Other data members
223
224 There are several other internal data members. Hopefully these are
225 sufficiently obvious that you can learn about them just by reading the
226 source code.
227
229 new()
230
231 my $chatterbot = new Chatbot::Eliza;
232
233 new() creates a new Eliza object. This method also calls the internal
234 _initialize() method, which in turn calls the parse_script_data()
235 method, which initializes the script data.
236
237 my $chatterbot = new Chatbot::Eliza 'Ahmad', 'myfile.txt';
238
The Eliza object defaults to the name "Eliza", and it contains default
script data within itself. However, using the syntax above, you can
specify an alternative name and an alternative script file.

See the method parse_script_data() for a description of the format of
the script file.
245
246 command_interface()
247
248 $chatterbot->command_interface;
249
250 command_interface() opens an interactive session with the Eliza object,
251 just like the original Eliza program.
252
253 If you want to design your own session format, then you can write your
254 own while loop and your own functions for prompting for and reading
255 user input, and use the transform() method to generate Eliza's
256 responses. (Note: you do not need to invoke preprocess() and postpro‐
257 cess() directly, because these are invoked from within the transform()
258 method.)
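
A hand-written session loop of the kind described above might look
like the following sketch. The helper run_session, its quit-word list
and its prompt are all hypothetical; the only thing it relies on from
the bot is the transform() method.

```perl
use strict;
use warnings;

# Hypothetical helper: runs a session against any object that provides
# a transform() method, reading from $in and writing to $out.
sub run_session {
    my ( $bot, $in, $out ) = @_;

    # Assumed quit words for this sketch; the module keeps its own list.
    my %quit = map { $_ => 1 } qw(bye goodbye quit exit);

    print {$out} "> ";
    while ( defined( my $line = <$in> ) ) {
        chomp $line;

        # Stop when the first word of the input is a quit word.
        my ($first) = $line =~ /(\w+)/;
        last if defined $first && $quit{ lc $first };

        print {$out} $bot->transform($line), "\n> ";
    }
    print {$out} "Goodbye.\n";
    return;
}

# With the module installed, a session could be started like this:
#   use Chatbot::Eliza;
#   run_session( Chatbot::Eliza->new, \*STDIN, \*STDOUT );
```

Passing the filehandles in as parameters keeps the loop usable from a
terminal, a socket, or a test that feeds it in-memory strings.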
259
260 But if you're lazy and you want to skip all that, then just use com‐
261 mand_interface(). It's all done for you.
262
263 During an interactive session invoked using command_interface(), you
264 can enter the word "debug" to toggle debug mode on and off. You can
265 also enter the keyword "memory" to invoke the _debug_memory() method
266 and print out the contents of the Eliza instance's memory.
267
268 preprocess()
269
270 $string = preprocess($string);
271
preprocess() applies simple substitution rules to the input string.
Mostly this is to catch variations in spelling, misspellings,
contractions and the like.

preprocess() is called from within the transform() method. It is
applied to user-input text BEFORE any other processing, and before a
reassembly statement has been selected.

It uses the hash %pre, which is created during the parse of the
script.
282
283 postprocess()
284
285 $string = postprocess($string);
286
287 postprocess() applies simple substitution rules to the reassembly rule.
288 This is where all the "I"'s and "you"'s are exchanged. postprocess()
289 is called from within the transform() function.
290
It uses the hash %post, created during the parse of the script.
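
The word-for-word substitution performed by preprocess() and
postprocess() can be pictured with the sketch below. The helper
substitute_words and the sample tables are hypothetical and much
simpler than the module's own code; in the module, the real entries
come from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each whitespace-separated word that
# appears as a key in %$map (compared case-insensitively) with its
# replacement word.
sub substitute_words {
    my ( $string, $map ) = @_;
    return join ' ',
        map { exists $map->{ lc $_ } ? $map->{ lc $_ } : $_ }
        split ' ', $string;
}

# Illustrative tables only.
my %pre  = ( 'dont' => "don't", 'cant' => "can't" );
my %post = ( 'my'   => 'your',  'i'    => 'you', 'am' => 'are' );

print substitute_words( 'I dont know',   \%pre ),  "\n";
print substitute_words( 'my foot hurts', \%post ), "\n";
```

Note that this sketch splits on whitespace only, so a word with
punctuation attached (such as "my,") would not be replaced.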
292
293 _testquit()
294
295 if ($self->_testquit($user_input) ) { ... }
296
297 _testquit() detects words like "bye" and "quit" and returns true if it
298 finds one of them as the first word in the sentence.
299
300 These words are listed in the script, under the keyword "quit".
301
302 _debug_memory()
303
304 $self->_debug_memory()
305
306 _debug_memory() is a special function which returns the contents of
307 Eliza's memory stack.
308
309 transform()
310
311 $reply = $chatterbot->transform( $string, $use_memory );
312
transform() applies transformation rules to the user input string. It
invokes preprocess(), does transformations, then invokes postprocess().
It returns the transformed output string, called $reasmb.
316
317 The algorithm embedded in the transform() method has three main parts:
318
319 1 Search the input string for a keyword.
320
321 2 If we find a keyword, use the list of decomposition rules for that
322 keyword, and pattern-match the input string against each rule.
323
324 3 If the input string matches any of the decomposition rules, then
325 randomly select one of the reassembly rules for that decomposition
326 rule, and use it to construct the reply.
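
The three steps above can be sketched in a few lines of standalone
Perl. This is not the module's implementation: the rule table layout,
the helper names decomp_to_regex and sketch_transform, and the
assumption that higher-ranked keywords are tried first are all made up
for illustration.

```perl
use strict;
use warnings;

# Illustrative rule table (not the module's internal layout):
# keyword => { rank => N, rules => [ [ decomp, [ reasmb, ... ] ], ... ] }
my %rules = (
    remember => {
        rank  => 5,
        rules => [
            [ '* i remember *',      [ 'Do you often think of (2) ?' ] ],
            [ '* do you remember *', [ 'Did you think I would forget (2) ?' ] ],
        ],
    },
);

# Turn a decomposition rule into a regex: each "*" becomes a capture group.
sub decomp_to_regex {
    my ($decomp) = @_;
    my @parts = map { length $_ ? '\b' . quotemeta($_) . '\b' : '' }
                split /\s*\*\s*/, $decomp, -1;
    my $body = join '\s*(.*?)\s*', @parts;
    return qr/\A$body\z/i;
}

sub sketch_transform {
    my ( $input, $rand ) = @_;
    $rand ||= sub { rand $_[0] };

    # 1. Find keywords in the input (assumption: higher rank goes first).
    my @found = sort { $rules{$b}{rank} <=> $rules{$a}{rank} }
                grep { $input =~ /\b\Q$_\E\b/i } keys %rules;

    for my $key (@found) {
        for my $rule ( @{ $rules{$key}{rules} } ) {
            my ( $decomp, $reasmb ) = @$rule;

            # 2. Match the input against the decomposition rule.
            my @caps = $input =~ decomp_to_regex($decomp) or next;

            # 3. Pick a reassembly rule at random and fill in each "(N)".
            my $reply = $reasmb->[ int( $rand->( scalar @$reasmb ) ) ];
            $reply =~ s/\((\d+)\)/defined $caps[$1 - 1] ? $caps[$1 - 1] : ''/ge;
            return $reply;
        }
    }
    return;    # no keyword found, or no decomposition rule matched
}

print sketch_transform('Well, I remember my first bicycle'), "\n";
```

The optional second argument lets a caller supply its own random
function, in the same spirit as the myrand() feature of the module.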
327
transform() takes two parameters. The first is the string we want to
transform. The second is a flag which indicates where this string came
from. If the flag is set, then the string has been pulled from memory,
and we should use reassembly rules appropriate for that. If the flag
is not set, then the string is the most recent user input, and we can
use the ordinary reassembly rules.
334
The memory flag is only set when the transform() function is called
recursively. The mechanism for setting this parameter is embedded in
the transform() method itself. If the flag is set inappropriately, it
is ignored.
339
340 How memory is used
341
In the script, some reassembly rules are special. They are marked with
the keyword "reasm_for_memory", rather than just "reasmb". Eliza
"remembers" any comment when it matches a decomposition rule for which
there are any reassembly rules for memory. An Eliza object remembers
up to $max_memory_size (default: 5) user input strings.
347
348 If, during a subsequent run, the transform() method fails to find any
349 appropriate decomposition rule for a user's comment, and if there are
350 any comments inside the memory array, then Eliza may elect to ignore
351 the most recent comment and instead pull out one of the strings from
352 memory. In this case, the transform method is called recursively with
353 the memory flag.
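
A bounded memory list of this kind could be sketched as follows. The
helper names remember and recall are hypothetical, and two behaviors
are assumptions of this sketch rather than facts about the module: the
oldest comment is discarded when the list is full, and a recalled
comment is removed from the list.

```perl
use strict;
use warnings;

my $max_memory_size = 5;    # default size stated in the text above

# Store a comment, discarding the oldest entry once the list is full
# (assumption: the module may evict entries differently).
sub remember {
    my ( $memory, $comment ) = @_;
    push @$memory, $comment;
    shift @$memory while @$memory > $max_memory_size;
    return scalar @$memory;
}

# Remove and return one remembered comment chosen with $rand, or
# return nothing if the list is empty.
sub recall {
    my ( $memory, $rand ) = @_;
    return unless @$memory;
    $rand ||= sub { rand $_[0] };
    my $index = int( $rand->( scalar @$memory ) );
    return splice( @$memory, $index, 1 );
}

my @memory;
remember( \@memory, "remembered comment number $_" ) for 1 .. 7;
print scalar(@memory), " comments kept\n";
print recall( \@memory ), "\n";
```

In the module itself, a recalled string is passed back to transform()
with the memory flag set, as described above.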
354
355 Honestly, I am not sure exactly how this memory functionality was
356 implemented in the original Eliza program. Hopefully this implementa‐
357 tion is not too far from Weizenbaum's.
358
359 If you don't want to use the memory functionality at all, then you can
360 disable it:
361
362 $mybot->memory_on(0);
363
364 You can also achieve the same effect by making sure that the script
365 data does not contain any reassembly rules marked with the keyword
366 "reasm_for_memory". The default script data only has 4 such items.
367
368 parse_script_data()
369
370 $self->parse_script_data;
371 $self->parse_script_data( $script_file );
372
373 parse_script_data() is invoked from the _initialize() method, which is
374 called from the new() function. However, you can also call this method
375 at any time against an already-instantiated Eliza instance. In that
376 case, the new script data is added to the old script data. The old
377 script data is not deleted.
378
379 You can pass a parameter to this function, which is the name of the
380 script file, and it will read in and parse that file. If you do not
381 pass any parameter to this method, then it will read the data embedded
382 at the end of the module as its default script data.
383
384 If you pass the name of a script file to parse_script_data(), and that
385 file is not available for reading, then the module dies.
386
388 This module includes a default script file within itself, so it is not
389 necessary to explicitly specify a script file when instantiating an
390 Eliza object.
391
392 Each line in the script file can specify a key, a decomposition rule,
393 or a reassembly rule.
394
395 key: remember 5
396 decomp: * i remember *
397 reasmb: Do you often think of (2) ?
398 reasmb: Does thinking of (2) bring anything else to mind ?
399 decomp: * do you remember *
400 reasmb: Did you think I would forget (2) ?
401 reasmb: What about (2) ?
402 reasmb: goto what
403 pre: equivalent alike
404 synon: belief feel think believe wish
405
406 The number after the key specifies the rank. If a user's input con‐
407 tains the keyword, then the transform() function will try to match one
408 of the decomposition rules for that keyword. If one matches, then it
409 will select one of the reassembly rules at random. The number (2) here
410 means "use whatever set of words matched the second asterisk in the
411 decomposition rule."
412
If you specify a list of synonyms for a word, then you should use a "@"
when you use that word in a decomposition rule:
415
416 decomp: * i @belief i *
417 reasmb: Do you really think so ?
418 reasmb: But you are not sure you (3).
419
420 Otherwise, the script will never check to see if there are any synonyms
421 for that keyword.
422
A reassembly rule should be marked with reasm_for_memory rather than
reasmb when it is appropriate for use on a user's comment that has
been extracted from memory.
426
427 key: my 2
428 decomp: * my *
429 reasm_for_memory: Let's discuss further why your (2).
430 reasm_for_memory: Earlier you said your (2).
431 reasm_for_memory: But your (2).
432 reasm_for_memory: Does that have anything to do with the fact that your (2) ?
433
435 Each line in the script file contains an "entrytype" (key, decomp,
436 synon) and an "entry", separated by a colon. In turn, each "entry" can
437 itself be composed of a "key" and a "value", separated by a space. The
438 parse_script_data() function parses each line out, and splits the
439 "entry" and "entrytype" portion of each line into two variables, $entry
440 and $entrytype.
441
442 Next, it uses the string $entrytype to determine what sort of stuff to
443 expect in the $entry variable, if anything, and parses it accordingly.
444 In some cases, there is no second level of key-value pair, so the func‐
445 tion does not even bother to isolate or create $key and $value.
446
447 $key is always a single word. $value can be null, or one single word,
448 or a string composed of several words, or an array of words.
449
450 Based on all these entries and keys and values, the function creates
451 two giant hashes: %decomplist, which holds the decomposition rules for
452 each keyword, and %reasmblist, which holds the reassembly phrases for
453 each decomposition rule. It also creates %keyranks, which holds the
454 ranks for each key.
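
To make the parsing steps above concrete, here is a simplified sketch
that handles only the key, decomp and reasmb entry types. The function
name parse_script_lines, the use of array references as hash values,
and the tab character used to join a keyword to its decomposition rule
are assumptions of this example; the module's own code differs in
detail.

```perl
use strict;
use warnings;

# Hypothetical, simplified parser for "entrytype: entry" lines.
sub parse_script_lines {
    my (@lines) = @_;
    my ( %decomplist, %reasmblist, %keyranks );
    my ( $thiskey, $thisdecomp );

    for my $line (@lines) {
        # Split the line into $entrytype and $entry at the first colon;
        # skip anything that does not have that shape.
        my ( $entrytype, $entry ) = $line =~ /^\s*(\w+)\s*:\s*(.*?)\s*$/
            or next;

        if ( $entrytype eq 'key' ) {
            # Here the entry is itself a "key value" pair: keyword and rank.
            my ( $key, $rank ) = split ' ', $entry;
            next unless defined $key;
            $thiskey    = $key;
            $thisdecomp = undef;
            $keyranks{$key} = defined $rank ? $rank : 0;
        }
        elsif ( $entrytype eq 'decomp' ) {
            next unless defined $thiskey;
            push @{ $decomplist{$thiskey} }, $entry;
            $thisdecomp = "$thiskey\t$entry";    # assumed join character
        }
        elsif ( $entrytype eq 'reasmb' ) {
            next unless defined $thisdecomp;
            push @{ $reasmblist{$thisdecomp} }, $entry;
        }
    }
    return ( \%decomplist, \%reasmblist, \%keyranks );
}

my @script = (
    'key: remember 5',
    '  decomp: * i remember *',
    '    reasmb: Do you often think of (2) ?',
    '    reasmb: Does thinking of (2) bring anything else to mind ?',
    '  decomp: * do you remember *',
    '    reasmb: Did you think I would forget (2) ?',
);

my ( $decomp, $reasmb, $ranks ) = parse_script_lines(@script);
print "rank of 'remember': $ranks->{remember}\n";
print "decomposition rules: ", scalar @{ $decomp->{remember} }, "\n";
```

Keeping track of the current keyword and the current decomposition
rule while reading is what lets each reasmb line attach to the right
rule without repeating the keyword on every line.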
455
Six other structures are created: the hashes %reasmblist_for_memory,
%pre, %post and %synon, and the arrays @initial and @final.
458
460 * Version 1.02-1.04 - January 2003
461 Added a Norwegian script, kindly contributed by
462 Mats Stafseng Einarsen. Thanks Mats!
463
* Version 1.01 - January 2003
    Added an empty DESTROY method, to eliminate
    some pesky warning messages. Suggested by
    Stas Bekman.
468
469 * Version 0.98 - March 2000
470 Some changes to the documentation.
471
* Versions 0.96-0.97 - October 1999
    One tiny change to the regex which implements
    reassembly rules. Thanks to Gidon Wise for
    suggesting this improvement.
476
477 * Versions 0.94-0.95 - July 1999
478 Fixed a bug in the way the bot invokes its random function
479 when it pulls a comment out of memory.
480
481 * Version 0.93 - June 1999
482 Calling programs can now specify their own random-number generators.
483 Use this syntax:
484
485 $chatbot = new Chatbot::Eliza;
486 $chatbot->myrand(
487 sub {
488 #function goes here!
489 }
490 );
491
492 The custom random function should have the same prototype
493 as perl's built-in rand() function. That is, it should take
494 a single (numeric) expression as a parameter, and it should
495 return a floating-point value between 0 and that number.
496
497 You can also now use a reference to an anonymous hash
498 as a parameter to the new() method to define any fields
499 in that bot instance:
500
501 $bot = new Chatbot::Eliza {
502 name => "Brian",
503 scriptfile => "myscript.txt",
504 debug => 1,
505 };
506
507 * Versions 0.91-0.92 - April 1999
508 Fixed some misspellings.
509
510 * Version 0.90 - April 1999
511 Fixed a bug in the way individual bot objects store
512 their memory. Thanks to Randal Schwartz and to
513 Robert Chin for pointing this out.
514
515 Fixed a very stupid error in the way the random
516 function is invoked. Thanks to Antony Quintal
517 for pointing out the error.
518
519 Many corrections and improvements were made
520 to the German script by Matthias Hellmund.
521 Thanks, Matthias!
522
523 Made a minor syntactical change, at the suggestion
524 of Roy Stephan.
525
526 The memory functionality can now be disabled by setting the
527 $Chatbot::Eliza::memory_on variable to 0, like so:
528
529 $bot->memory_on(0);
530
531 Thanks to Robert Chin for suggesting that.
532
533 * Version 0.40 - July 1998
534 Re-implemented the memory functionality.
535
536 Cleaned up and expanded the embedded POD documentation.
537
538 Added a sample script in German.
539
540 Modified the debugging behavior. The transform() method itself
541 will no longer print any debugging output directly to STDOUT.
542 Instead, all debugging output is stored in a module variable
543 called "debug_text". The "debug_text" variable is printed out
544 by the command_interface() method, if the debug flag is set.
545 But even if this flag is not set, the variable debug_text
546 is still available to any calling program.
547
548 Added a few more example scripts which use the module.
549
550 simple - simple script using Eliza.pm
551 simple.cgi - simple CGI script using Eliza.pm
552 debug.cgi - CGI script which displays debugging output
553 deutsch - script using the German script
554 deutsch.cgi - CGI script using the German script
555 twobots - script which creates two distinct bots
556
557 * Version 0.32 - December 1997
558 Fixed a bug in the way Eliza loads its default internal script data.
559 (Thanks to Randal Schwartz for pointing this out.)
560
561 Removed the "memory" functions internal to Eliza.
562 When I get them working properly I will add them back in.
563
564 Added one more example program.
565
566 Fixed some minor errors in the embedded POD documentation.
567
568 * Version 0.31
569 The module is now installable, just like any other self-respecting
570 CPAN module.
571
572 * Version 0.30
573 First release.
574
576 John Nolan jpnolan@sonic.net January 2003.
577
578 Implements the classic Eliza algorithm by Prof. Joseph Weizenbaum.
579 Script format devised by Charles Hayden.
580
581
582
583perl v5.8.8 2003-01-23 Chatbot::Eliza(3)