ntextWordBreak(n)   ntext Word Boundary Detection for the Text Widget   ntextWordBreak(n)

______________________________________________________________________________

ntextWordBreak - ntext Word Boundary Detection for the Text Widget

package require Tcl 8.5

package require Tk 8.5

package require ntext ?0.81?

______________________________________________________________________________

The ntext package provides a binding tag named Ntext for use by text
widgets in place of the default Text binding tag.

Navigation and selection in a text widget require the detection of
words and their boundaries. The word boundary detection facilities
provided by Tcl/Tk through the Text binding tag are limited because
they define only one class of "word" characters and one class of
"non-word" characters. The Ntext binding tag uses more general rules
for word boundary detection, which define two classes of "word"
characters and one class of "non-word" characters.

The behaviour of Ntext may be configured application-wide by setting
the values of a number of namespace variables. One of these is
relevant to word boundary detection:

::ntext::classicWordBreak

· 0 - (default value) selects Ntext behaviour, i.e. platform-independent,
  two classes of word characters and one class of non-word characters.

· 1 - selects classic Text behaviour, i.e. platform-dependent, one
  class of word characters and one class of non-word characters.

· After changing this value, Ntext's regexp matching patterns
  should be recalculated. See FUNCTIONS for details and advanced
  configuration options.

::ntext::tcl_match_wordBreakAfter

::ntext::tcl_match_wordBreakBefore

::ntext::tcl_match_endOfWord

::ntext::tcl_match_startOfNextWord

::ntext::tcl_match_startOfPreviousWord

These variables hold the regexp patterns that Ntext uses to search for
word boundaries. If they are changed, subsequent searches are
immediately altered. In many situations it is unnecessary to alter the
values of these variables directly: instead, call one of the functions
::ntext::initializeMatchPatterns or ::ntext::createMatchPatterns.

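As a minimal sketch of how these variables might be inspected, the
following loop prints each pattern after the defaults have been
computed explicitly. It assumes that the ntext package is installed
and that Tk can open a display; the loop variable name is illustrative
only.

```tcl
package require Tk
package require ntext

# Compute the default patterns explicitly, in case Ntext has not yet been used.
::ntext::initializeMatchPatterns

# Print the five regexp patterns held in the namespace variables.
foreach name {
    tcl_match_wordBreakAfter
    tcl_match_wordBreakBefore
    tcl_match_endOfWord
    tcl_match_startOfNextWord
    tcl_match_startOfPreviousWord
} {
    puts [format "%-32s %s" $name [set ::ntext::$name]]
}
```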
In the Text binding tag one can change the search rules by changing the
values of the global variables tcl_wordchars and tcl_nonwordchars. The
equivalent operation in the Ntext binding tag is to call
::ntext::createMatchPatterns with appropriate arguments.

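The two approaches can be set side by side as follows. This is a
sketch only: the character classes are arbitrary illustrations, and
the initial catch is a commonly used idiom (an assumption here, not
something stated on this page) for loading the core word routines
before their variables are overridden.

```tcl
# Classic Text binding tag: the core reads two global variables.
# Calling a core word routine first forces it to be auto-loaded, so that
# a later auto-load cannot overwrite the values set here.
catch {tcl_endOfWord}
set ::tcl_wordchars    {[a-zA-Z0-9_]}
set ::tcl_nonwordchars {[^a-zA-Z0-9_]}

# Ntext binding tag: pass the classes to createMatchPatterns instead.
# Note the argument order: the non-word class first, then the word class(es).
package require Tk
package require ntext
::ntext::createMatchPatterns {[^a-zA-Z0-9_]} {[a-zA-Z0-9_]}
```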
If a simple regexp search should prove insufficient, the following
functions (analogous to the Tcl/Tk core's tcl_wordBreakAfter etc.) may
be replaced by the developer:

ntext::new_wordBreakAfter

ntext::new_wordBreakBefore

ntext::new_endOfWord

ntext::new_startOfNextWord

ntext::new_startOfPreviousWord

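A replacement might look like the sketch below. It assumes that the
routine takes the same arguments as the core's tcl_wordBreakAfter (a
string and a start index) and returns the index of the first word
boundary after the start index, or -1 if there is none. The camelCase
rule is purely hypothetical, and a real replacement would normally
adjust all five routines so that they stay consistent with each other.

```tcl
package require Tk
package require ntext

# Hypothetical replacement: in addition to whitespace boundaries, report a
# boundary inside camelCase words, between a lower-case letter and the
# upper-case letter that follows it.
# Assumed contract: return the index of the first word boundary after
# $start, or -1 if there is none.
proc ::ntext::new_wordBreakAfter {str start} {
    set tail [string range $str $start end]
    if {[regexp -indices {[[:lower:]][[:upper:]]|\S\s|\s\S} $tail match]} {
        # The boundary lies just before the second matched character.
        return [expr {[lindex $match 1] + $start}]
    }
    return -1
}
```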
Each of the following functions calculates the five regexp search
patterns that define the word boundary searches. These values are
stored in the namespace variables listed above.

::ntext::initializeMatchPatterns

· This function is called when Ntext is first used, and needs to
  be called again only if the script changes the value of either
  ::ntext::classicWordBreak or ::tcl_platform(platform). The
  function is called with no arguments. It is useful when the
  desired search patterns are the default patterns for either the
  Ntext or Text binding tag, and so are implicitly specified by
  the values of ::ntext::classicWordBreak and
  ::tcl_platform(platform) alone.

::ntext::createMatchPatterns new_nonwordchars new_word1chars
?new_word2chars?

· This function is useful in a wider range of situations than
  ::ntext::initializeMatchPatterns. It calculates the regexp
  search patterns for any case with one class of "non-word"
  characters and one or two classes of "word" characters.

  Each argument should be a regexp expression defining a class of
  characters. An argument will usually be a bracket expression,
  but might alternatively be a class-shorthand escape, or a single
  character. The third argument may be omitted, or supplied as
  the empty string, in which case it is unused.

  The first argument is interpreted as the class of non-word
  characters; the second argument (and the third, if present) are
  classes of word characters. The classes should include all
  possible characters and will normally be mutually exclusive: it is
  often convenient to define one class as the negation of the
  other two.

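Whether a proposed set of classes covers every character exactly once
can be checked in plain Tcl before the classes are passed to
::ntext::createMatchPatterns. The sketch below uses the three classes
from the createMatchPatterns example on this page; the helper
matchingClasses and the sample string are illustrative assumptions and
are not part of ntext.

```tcl
# Three candidate classes: one non-word class and two word classes.
# The third is defined as the negation of the other two.
set classes [dict create \
    nonword {[[:space:][:cntrl:]]} \
    word1   {[[:punct:]]} \
    word2   {[^[:punct:][:space:][:cntrl:]]}]

# Return the names of all classes whose pattern matches the character $ch.
proc matchingClasses {classes ch} {
    set hits {}
    dict for {name pattern} $classes {
        if {[regexp -- $pattern $ch]} {
            lappend hits $name
        }
    }
    return $hits
}

# Collect every sample character that is not covered by exactly one class.
set problems {}
foreach ch [split "foo_bar(baz) = 42;\t\n" ""] {
    set hits [matchingClasses $classes $ch]
    if {[llength $hits] != 1} {
        lappend problems [list $ch $hits]
    }
}
puts "characters not in exactly one class: [llength $problems]"
```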
The problem of word boundary selection is a vexed one, because text is
used to represent a universe of different types of information, and
there are no simple rules that are useful for all data types or for all
purposes.

Ntext attempts to improve on the facilities available in classic Text
by providing facilities for more complex definitions of words (with
three classes of characters instead of two).

What is a word? Why two classes of word?

When using the modified cursor keys <Control-Left> and <Control-Right>
to navigate through an Ntext widget, the cursor is placed at the start
of a word. A word is defined as a sequence of one or more characters
from only one of the two defined "word" classes; it may be preceded by
a character from the other "word" class or from the "non-word" class.

The double-click of mouse button 1 selects a word of text, where in
this case a "word" may be as defined above, or alternatively may be a
sequence of one or more characters from the "non-word" class of
characters.

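The definition above can be illustrated in plain Tcl by splitting a
string into maximal runs of characters drawn from a single class. This
is only a sketch of the definition, not Ntext's own implementation;
the classes are those of the createMatchPatterns example on this page,
and the variable names are assumptions. Every run is a candidate for
double-click selection, whereas <Control-Left> and <Control-Right>
stop only at the start of a run from one of the two "word" classes.

```tcl
# Character classes: one non-word class and two word classes.
set nonword {[[:space:][:cntrl:]]}
set word1   {[[:punct:]]}
set word2   {[^[:punct:][:space:][:cntrl:]]}

# A run is a maximal sequence of characters taken from a single class.
set runPattern "${nonword}+|${word1}+|${word2}+"

set sample "foo.bar(baz) qux"
set runs [regexp -all -inline -- $runPattern $sample]

# Prints eight runs: <foo> <.> <bar> <(> <baz> <)> < > <qux>
foreach run $runs {
    puts "<$run>"
}
```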
Traditionally Tcl has defined only one word class and one non-word
class: on Windows, the non-word class is whitespace, and so
alphanumerics and punctuation belong to the same class. On other
platforms, punctuation is bundled with whitespace as "non-word"
characters. In either case, the navigation and selection of text are
unnecessarily coarse-grained, and sometimes give unhelpful results.

The use of three classes of characters might make selection too
fine-grained; but in this case, holding down the Shift key and
double-clicking another word is an excellent way to select a longer
range of text (a useful binding that Tcl/Tk has long provided but which
is missing in other systems).

As well as its defaults, Ntext permits the developer to define their
own classes of characters, or to revert to the classic Text
definitions, or to specify their own regexp matching patterns.

To use Ntext with Tcl/Tk's usual word-boundary detection rules:

    package require ntext
    text .t
    # Use the Ntext binding tag in place of the default Text tag.
    bindtags .t {.t Ntext . all}
    # Select classic Text word rules, then recalculate the patterns.
    set ::ntext::classicWordBreak 1
    ::ntext::initializeMatchPatterns

See bindtags for more information.

To define a different set of word-boundary detection rules:

    package require ntext
    text .t
    bindtags .t {.t Ntext . all}
    # Arguments: non-word class, first word class, second word class.
    ::ntext::createMatchPatterns \
        {[[:space:][:cntrl:]]} {[[:punct:]]} {[^[:punct:][:space:][:cntrl:]]}

See regexp, re_syntax for more information.

bindtags, ntext, re_syntax, regexp, text

bindtags, re_syntax, regexp, text

ntext 0.81                                                   ntextWordBreak(n)