Text::Brew(3)         User Contributed Perl Documentation        Text::Brew(3)

NAME
       Text::Brew - An implementation of the Brew edit distance

SYNOPSIS
           use Text::Brew qw(distance);

           my ($distance, $arrayref_edits) = distance("four", "foo");
           my $sequence = join ",", @$arrayref_edits;
           print "The Brew distance for (four,foo) is $distance\n";
           print "obtained with the edits: $sequence\n\n";

DESCRIPTION
       This module implements the Brew edit distance, which is closely
       related to the dynamic programming technique used for the
       Wagner-Fischer (and therefore the Levenshtein) edit distance. See the
       related modules listed at the end of this document. For more
       information about the Brew edit distance see:
       <http://ling.ohio-state.edu/~cbrew/795M/string-distance.html>

       The difference here is that the DELetion and INSertion operations
       have separate costs (with the default of 1 for both, you obtain the
       Levenshtein edit distance). The most interesting feature is that you
       can obtain the sequence of edits needed to transform the first string
       into the second (not vice versa, since DELetions are distinct from
       INSertions). This implementation differs from the original algorithm
       by Chris Brew in that I have added the SUBST operation, which is
       distinct from the MATCH operation.

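       The dynamic programming scheme described above can be sketched as
       follows. This is only a minimal illustration, not the module's actual
       source: the name brew_sketch, the layout of each table cell, and the
       tie-breaking order (diagonal first, then DEL, then INS) are
       assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::Brew: returns (distance, \@edits).
# Costs are taken in the documented order: MATCH, INS, DEL, SUBST.
sub brew_sketch {
    my ($from, $to, $cost) = @_;
    my ($c_match, $c_ins, $c_del, $c_subst) = @{ $cost || [0, 1, 1, 1] };
    my @s = split //, $from;
    my @t = split //, $to;
    my ($m, $n) = (scalar @s, scalar @t);

    # Each cell holds [cost so far, operation, previous row, previous column].
    my @d;
    $d[0][0]  = [0, 'INITIAL', undef, undef];
    $d[$_][0] = [$d[$_ - 1][0][0] + $c_del, 'DEL', $_ - 1, 0] for 1 .. $m;
    $d[0][$_] = [$d[0][$_ - 1][0] + $c_ins, 'INS', 0, $_ - 1] for 1 .. $n;

    for my $i (1 .. $m) {
        for my $j (1 .. $n) {
            my $same = $s[$i - 1] eq $t[$j - 1];
            my @candidates = (
                [$d[$i - 1][$j - 1][0] + ($same ? $c_match : $c_subst),
                 $same ? 'MATCH' : 'SUBST', $i - 1, $j - 1],
                [$d[$i - 1][$j][0] + $c_del, 'DEL', $i - 1, $j],
                [$d[$i][$j - 1][0] + $c_ins, 'INS', $i, $j - 1],
            );
            # Keep the cheapest candidate; on a tie the earlier one wins.
            my $best = $candidates[0];
            for my $c (@candidates[1, 2]) {
                $best = $c if $c->[0] < $best->[0];
            }
            $d[$i][$j] = $best;
        }
    }

    # Walk the back-pointers from the last cell to the INITIAL cell.
    my @edits;
    my ($i, $j) = ($m, $n);
    while (defined $i) {
        my $cell = $d[$i][$j];
        unshift @edits, $cell->[1];
        ($i, $j) = @{$cell}[2, 3];
    }
    return ($d[$m][$n][0], \@edits);
}

my ($distance, $edits) = brew_sketch('four', 'foo');
print "distance: $distance\n";
print 'edits: ', join(',', @$edits), "\n";
```

       The edit list always starts with INITIAL and describes how to turn
       the first string into the second. Where several edit sequences have
       the same total cost, the one reported depends on the tie-breaking
       order, so this sketch may list the edits differently from the module.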
       The symbols used here are:

           INITIAL   the INITIAL operation (i.e. NO operation)
           MATCH     the MATCH operation (default cost 0)
           SUBST     the SUBSTitution operation (default cost 1)
           DEL       the DELetion operation (default cost 1)
           INS       the INSertion operation (default cost 1)

       You can change the default costs (see OPTIONAL PARAMETERS).

       You can make INS and DEL the same operation in a simple way:

           1) give both the same cost
           2) change the output string DEL to INS/DEL (or whatever you prefer)
           3) change the output string INS to INS/DEL (or whatever you prefer)

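       Steps 2 and 3 amount to relabelling the returned edit list. A minimal
       sketch, assuming the edits arrive as plain strings; the list below is
       written by hand for illustration and is not captured module output.

```perl
use strict;
use warnings;

# Illustrative edit list, written by hand in the style of the symbols above.
my @edits = qw(INITIAL MATCH MATCH SUBST DEL);

# Report INS and DEL under one shared label; other symbols pass through.
my @merged = map { $_ eq 'INS' || $_ eq 'DEL' ? 'INS/DEL' : $_ } @edits;

print join(',', @merged), "\n";   # prints INITIAL,MATCH,MATCH,SUBST,INS/DEL
```

       Step 1 needs no extra work with the default costs, since INS and DEL
       both default to 1.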
   USAGE
           use strict;
           use Text::Brew qw(distance);

           my ($distance, $arrayref_edits) = distance("four", "foo");
           my $sequence = join ",", @$arrayref_edits;
           print "The Brew distance for (four,foo) is $distance\n";
           print "obtained with the edits: $sequence\n\n";

           my $string1 = "foo";
           my @strings = ("four", "foo", "bar");
           my (@dist, @edits);
           foreach my $string2 (@strings) {
               my ($dist, $edits) = distance($string1, $string2);
               push @dist, $dist;
               push @edits, (join ",", @$edits);
           }
           foreach my $i (0 .. $#strings) {
               print "The Brew distance for ($string1,$strings[$i]) is $dist[$i]\n";
               print "obtained with the edits: $edits[$i]\n\n";
           }

   OPTIONAL PARAMETERS
           distance($string1,$string2,{-cost=>[0,2,1,1],-output=>'edits'});

       -output
           Accepted values are:

               distance   distance() returns only the numeric distance

               both       distance() returns both the numeric distance
                          and a reference to the array of edits

               edits      distance() returns only a reference to the
                          array of edits

           The default output is 'both'.

       -cost
           The accepted value is a reference to an array of 4 elements:

               1st   the cost of a MATCH
               2nd   the cost of an INS (INSertion)
               3rd   the cost of a DEL (DELetion)
               4th   the cost of a SUBST (SUBSTitution)

           The default array is [0,1,1,1].

       Examples (each group below is an independent snippet):

           # Numeric distance only
           my $distance = distance("four", "foo", {-output=>'distance'});
           print "The Brew distance for (four,foo) is $distance\n\n";


           # Edit sequence only
           my $arrayref_edits = distance("four", "foo", {-output=>'edits'});
           my $sequence = join ",", @$arrayref_edits;
           print "The Brew sequence for (four,foo) is $sequence\n\n";


           # Custom costs: INS costs 2, DEL and SUBST cost 1
           my ($distance, $arrayref_edits) = distance("four", "foo", {-cost=>[0,2,1,1]});
           my $sequence = join ",", @$arrayref_edits;
           print "The Brew distance for (four,foo) is $distance\n";
           print "obtained with the edits: $sequence\n\n";

           # Same costs, opposite direction
           ($distance, $arrayref_edits) = distance("foo", "four", {-cost=>[0,2,1,1]});
           $sequence = join ",", @$arrayref_edits;
           print "The Brew distance for (foo,four) is $distance\n";
           print "obtained with the edits: $sequence\n\n";

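       With unequal INS and DEL costs the distance depends on the direction
       of the transformation, which is why the last two calls above swap
       their arguments. The following distance-only sketch shows the effect.
       It is an assumption-laden illustration, not the module's code: the
       name cost_sketch and the rolling-row layout are invented here, and
       only the cost order (MATCH, INS, DEL, SUBST) comes from the text
       above.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper, NOT Text::Brew's code: distance only, one row at a time.
sub cost_sketch {
    my ($from, $to, $cost) = @_;
    my ($c_match, $c_ins, $c_del, $c_subst) = @$cost;
    my @s = split //, $from;
    my @t = split //, $to;

    # Row 0: build each prefix of $to from the empty string by insertions.
    my @prev = map { $_ * $c_ins } 0 .. scalar @t;

    for my $i (1 .. scalar @s) {
        my @cur = ($i * $c_del);    # delete the first $i characters of $from
        for my $j (1 .. scalar @t) {
            my $diag = $s[$i - 1] eq $t[$j - 1] ? $c_match : $c_subst;
            push @cur, min(
                $prev[$j - 1] + $diag,     # MATCH or SUBST
                $prev[$j]     + $c_del,    # DEL
                $cur[$j - 1]  + $c_ins,    # INS
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

my @cost = (0, 2, 1, 1);    # MATCH, INS, DEL, SUBST
printf "four -> foo: %d\n", cost_sketch('four', 'foo', \@cost);    # 2
printf "foo -> four: %d\n", cost_sketch('foo', 'four', \@cost);    # 3
```

       Going from "four" to "foo" needs one DEL and one SUBST (1 + 1 = 2),
       while going from "foo" to "four" needs one INS and one SUBST
       (2 + 1 = 3).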
       All credit goes to Chris Brew, the author of the algorithm.

       Many thanks to Stefano L. Rodighiero <larsen at perlmonk.org> for his
       suggestions.

       Copyright 2003 Dree Mistrut <dree@friuli.to>

       This package is free software and is provided "as is" without express
       or implied warranty. You can redistribute it and/or modify it under the
       same terms as Perl itself.

SEE ALSO
       "Text::Levenshtein", "Text::WagnerFischer"



perl v5.30.1                      2020-01-30                     Text::Brew(3)