Catalyst::UTF8(3) User Contributed Perl Documentation Catalyst::UTF8(3)

6 Catalyst::UTF8 - All About UTF8 and Catalyst Encoding
7
9 Starting in 5.90080 Catalyst will enable UTF8 encoding by default for
10 text like body responses. In addition we've made a ton of fixes around
11 encoding and utf8 scattered throughout the codebase. This document
12 attempts to give an overview of the assumptions and practices that
13 Catalyst uses when dealing with UTF8 and encoding issues. You should
14 also review the Changes file, Catalyst::Delta and Catalyst::Upgrading
15 for more.
16
We attempt to describe all relevant processes, give some advice, and
explain where we have made exceptions in order to respect our
commitment to backwards compatibility.
20
22 Using UTF8 characters in your Controller classes and actions.
23
24 Summary
25 In this section we will review changes to how UTF8 characters can be
26 used in controller actions, how it looks in the debugging screens (and
27 your logs) as well as how you construct URL objects to actions with
28 UTF8 paths (or using UTF8 args or captures).
29
30 Unicode in Controllers and URLs
    package MyApp::Controller::Root;

    use utf8;
    use base 'Catalyst::Controller';

    sub heart_with_arg :Path('♥') Args(1) {
        my ($self, $c, $arg) = @_;
    }

    sub base :Chained('/') CaptureArgs(0) {
        my ($self, $c) = @_;
    }

    sub capture :Chained('base') PathPart('♥') CaptureArgs(1) {
        my ($self, $c, $capture) = @_;
    }

    sub arg :Chained('capture') PathPart('♥') Args(1) {
        my ($self, $c, $arg) = @_;
    }
51
52 Discussion
53 In the example controller above we have constructed two matchable URL
54 routes:
55
56 http://localhost/root/♥/{arg}
57 http://localhost/base/♥/{capture}/♥/{arg}
58
59 The first one is a classic Path type action and the second uses
60 Chaining, and spans three actions in total. As you can see, you can
61 use unicode characters in your Path and PathPart attributes (remember
62 to use the "utf8" pragma to allow these multibyte characters in your
63 source). The two constructed matchable routes would match the
64 following incoming URLs:
65
66 (heart_with_arg) -> http://localhost/root/%E2%99%A5/{arg}
67 (base/capture/arg) -> http://localhost/base/%E2%99%A5/{capture}/%E2%99%A5/{arg}
68
The path segment "%E2%99%A5" is URL encoded unicode (assuming you are
hitting this with a reasonably modern browser). It is basically what
goes over HTTP when you type a browser location that has the unicode
'heart' in it. However we will use the unicode symbol in your
debugging messages:
74
75 [debug] Loaded Path actions:
76 .-------------------------------------+--------------------------------------.
77 | Path | Private |
78 +-------------------------------------+--------------------------------------+
79 | /root/♥/* | /root/heart_with_arg |
80 '-------------------------------------+--------------------------------------'
81
82 [debug] Loaded Chained actions:
83 .-------------------------------------+--------------------------------------.
84 | Path Spec | Private |
85 +-------------------------------------+--------------------------------------+
86 | /base/♥/*/♥/* | /root/base (0) |
87 | | -> /root/capture (1) |
88 | | => /root/arg |
89 '-------------------------------------+--------------------------------------'
90
And if the requested URL uses unicode characters in your captures or
args (such as "http://localhost/base/♥/♥/♥/♥") you should see the
arguments and captures as their unicode characters as well:
94
95 [debug] Arguments are "♥"
96 [debug] "GET" request for "base/♥/♥/♥/♥" from "127.0.0.1"
97 .------------------------------------------------------------+-----------.
98 | Action | Time |
99 +------------------------------------------------------------+-----------+
100 | /root/base | 0.000080s |
101 | /root/capture | 0.000075s |
102 | /root/arg | 0.000755s |
103 '------------------------------------------------------------+-----------'
104
Again, remember that we are displaying the unicode character and using it
to match actions containing such multibyte characters BUT over HTTP you
are getting these as URL encoded bytes. For example if you looked at
the PSGI $env value for "REQUEST_URI" you would see (for the above
request):
110
111 REQUEST_URI => "/base/%E2%99%A5/%E2%99%A5/%E2%99%A5/%E2%99%A5"
112
So on the incoming request we decode so that we can match and display
unicode characters (after decoding the URL encoding). This makes it
straightforward to use these types of multibyte characters in your
actions and see them incoming in captures and arguments. Please keep
this in mind if you are doing, for example, regular expression matching,
length determination or other string comparisons: you will need to treat
these incoming variables as decoded Unicode character strings. For
example in the following action:
121
122 sub arg :Chained('capture') PathPart('♥') Args(1) {
123 my ($self, $c, $arg) = @_;
124 }
125
126 when $arg is "♥" you should expect "length($arg)" to be 1 since it is
127 indeed one character although it will take more than one byte to store.
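
The decoding steps described above can be sketched in plain Perl,
outside of Catalyst. This is only an illustration using the core Encode
module; the helper name percent_decode is hypothetical and this is not
how Catalyst itself is implemented.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

# Hypothetical helper: turn %XX escapes back into raw bytes.
sub percent_decode {
    my ($str) = @_;
    $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $str;
}

my $wire  = '%E2%99%A5';                      # what arrives over HTTP
my $bytes = percent_decode($wire);            # raw UTF-8 bytes
my $arg   = Encode::decode('UTF-8', $bytes);  # decoded character string

my $byte_count = length($bytes);              # counts bytes
my $char_count = length($arg);                # counts characters
my $matches    = $arg =~ /\A♥\z/ ? 1 : 0;     # matches the literal heart
```

The regular expression only matches because both the pattern and $arg
are character strings; matching the literal against $bytes would fail.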
128
129 UTF8 in constructing URLs via $c->uri_for
130 For the reverse (constructing meaningful URLs to actions that contain
131 multibyte characters in their paths or path parts, or when you want to
132 include such characters in your captures or arguments) Catalyst will do
133 the right thing (again just remember to use the "utf8" pragma).
134
135 use utf8;
136 my $url = $c->uri_for( $c->controller('Root')->action_for('arg'), ['♥','♥']);
137
138 When you stringify this object (for use in a template, for example) it
139 will automatically do the right thing regarding utf8 encoding and url
140 encoding.
141
142 http://localhost/base/%E2%99%A5/%E2%99%A5/%E2%99%A5/%E2%99%A5
143
This is what you want: a properly URL encoded version of the path.
In this case your string length will reflect URL encoded bytes, not the
character length. Ultimately what you send over the wire via
HTTP needs to be bytes.
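
To make the two layers of encoding concrete, here is a minimal sketch
of the same transformation done by hand with the core Encode module.
The helper percent_encode is hypothetical (it is not part of Catalyst
or of $c->uri_for); it UTF-8 encodes first and then percent encodes
every byte that is not an RFC 3986 unreserved character.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

# Hypothetical helper: characters -> UTF-8 bytes -> percent encoded text.
sub percent_encode {
    my ($chars) = @_;
    my $bytes = Encode::encode('UTF-8', $chars);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

my $segment = percent_encode('♥');
my $path    = join '/', '', 'base', map { percent_encode($_) } ('♥') x 4;

my $segment_length = length $segment;   # length of the encoded text, not 1
```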
148
150 What Catalyst does with UTF8 in your GET and classic HTML Form POST
151
152 UTF8 in URL query and keywords
153 The same rules that we find in URL paths also cover URL query parts.
154 That is if one types a URL like this into the browser
155
156 http://localhost/example?♥=♥♥
157
When this goes 'over the wire' to your application server it is going to
be as percent encoded bytes:
160
161 http://localhost/example?%E2%99%A5=%E2%99%A5%E2%99%A5
162
163 When Catalyst encounters this we decode the percent encoding and the
164 utf8 so that we can properly display this information (such as in the
165 debugging logs or in a response.)
166
167 [debug] Query Parameters are:
168 .-------------------------------------+--------------------------------------.
169 | Parameter | Value |
170 +-------------------------------------+--------------------------------------+
171 | ♥ | ♥♥ |
172 '-------------------------------------+--------------------------------------'
173
174 All the values and keys that are part of $c->req->query_parameters will
175 be utf8 decoded. So you should not need to do anything special to take
176 those values/keys and send them to the body response (since as we will
177 see later Catalyst will do all the necessary encoding for you).
178
Again, remember that the values of your parameters are now decoded into
Unicode strings, so for example you'd expect the result of length to
reflect the character length, not the byte length.
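
As a rough illustration of what "decoded" means for keys and values,
the following sketch parses the query string from the example by hand.
The function parse_query is hypothetical and greatly simplified; in an
application you would simply read $c->req->query_parameters.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

# Hypothetical parser, for illustration only.
sub parse_query {
    my ($query) = @_;
    my %params;
    for my $pair (split /&/, $query) {
        my ($raw_key, $raw_value) = split /=/, $pair, 2;
        my ($key, $value) = map {
            my $part = defined $_ ? $_ : '';
            $part =~ tr/+/ /;                              # '+' means space
            $part =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # undo percent encoding
            Encode::decode('UTF-8', $part);                # bytes -> characters
        } ($raw_key, $raw_value);
        $params{$key} = $value;
    }
    return \%params;
}

my $params       = parse_query('%E2%99%A5=%E2%99%A5%E2%99%A5');
my $value_length = length $params->{'♥'};   # counts characters, not bytes
```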
182
183 Just like with arguments and captures, you can use utf8 literals (or
184 utf8 strings) in $c->uri_for:
185
186 use utf8;
187 my $url = $c->uri_for( $c->controller('Root')->action_for('example'), {'♥' => '♥♥'});
188
189 When you stringify this object (for use in a template, for example) it
190 will automatically do the right thing regarding utf8 encoding and url
191 encoding.
192
193 http://localhost/example?%E2%99%A5=%E2%99%A5%E2%99%A5
194
Again, what you want is a properly URL encoded version of this.
Ultimately what you send over the wire via HTTP needs to be
bytes (not unicode characters).
198
199 Remember if you use any utf8 literals in your source code, you should
200 use the "use utf8" pragma.
201
NOTE: Assuming UTF-8 in your query parameters and keywords may be an
issue if you have legacy code where you created URLs in templates
manually and used an encoding other than UTF-8. In these cases you may
find that Catalyst 5.90080 and later will decode them incorrectly. For
backwards compatibility we offer three configuration settings, here
described in order of precedence:
208
209 "do_not_decode_query"
210
211 If true, then do not try to character decode any wide characters in
212 your request URL query or keywords. You will need to handle this
213 manually in your action code (although if you choose this setting,
214 chances are you already do this).
215
216 "default_query_encoding"
217
218 This setting allows one to specify a fixed value for how to decode your
219 query, instead of using the default, UTF-8.
220
221 "decode_query_using_global_encoding"
222
223 If this is true we decode using whatever you set "encoding" to.
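
A sketch of how these settings might look in your application class
follows. We are assuming here that they are ordinary application
configuration keys and that "default_query_encoding" takes an encoding
name that Encode understands; the value 'ISO-8859-1' is only an
example. Please check the Catalyst configuration documentation for your
version before relying on this.

```perl
package MyApp;
use Catalyst;

# Normally you would enable only one of these; they are listed in
# order of precedence.
__PACKAGE__->config(
    # do_not_decode_query => 1,                 # leave the query undecoded
    default_query_encoding => 'ISO-8859-1',     # assumed: an Encode encoding name
    # decode_query_using_global_encoding => 1,  # reuse the "encoding" setting
);
```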
224
225 UTF8 in Form POST
226 In general most modern browsers will follow the specification, which
227 says that POSTed form fields should be encoded in the same way that the
228 document was served with. That means that if you are using modern
229 Catalyst and serving UTF8 encoded responses, a browser is supposed to
230 notice that and encode the form POSTs accordingly.
231
232 As a result since Catalyst now serves UTF8 encoded responses by
233 default, this means that you can mostly rely on incoming form POSTs to
234 be so encoded. Catalyst will make this assumption and decode
235 accordingly (unless you explicitly turn off encoding...) If you are
236 running Catalyst in developer debug, then you will see the correct
237 unicode characters in the debug output. For example if you generate a
238 POST request:
239
    use utf8;
    use Catalyst::Test 'MyApp';
    use HTTP::Request::Common;   # exports POST

    my $res = request POST "/example/posted", ['♥'=>'♥', '♥♥'=>'♥'];
244
245 Running in CATALYST_DEBUG=1 mode you should see output like this:
246
247 [debug] Body Parameters are:
248 .-------------------------------------+--------------------------------------.
249 | Parameter | Value |
250 +-------------------------------------+--------------------------------------+
251 | ♥ | ♥ |
252 | ♥♥ | ♥ |
253 '-------------------------------------+--------------------------------------'
254
255 And if you had a controller like this:
256
    package MyApp::Controller::Example;

    use utf8;
    use base 'Catalyst::Controller';

    sub posted :POST Local {
        my ($self, $c) = @_;
        $c->res->content_type('text/plain');
        # The key is quoted because ♥ is not a valid bareword.
        $c->res->body("hearts => ${\$c->req->body_parameters->{'♥'}}");
    }
266
267 The following test case would be true:
268
    use Test::More;
    use Encode 2.21 'decode_utf8';

    is decode_utf8($res->content), 'hearts => ♥';
271
272 In this case we decode so that we can print and compare strings with
273 multibyte characters.
274
NOTE In some cases some browsers may not follow the specification, and
so may not set the form POST encoding based on the server response.
Catalyst itself doesn't attempt any workarounds, but one common approach
is to use a hidden form field with a UTF8 value (you might be familiar
with this from how Ruby on Rails has HTML form helpers that do that
automatically). In that case some browsers will send UTF8 encoded data if
they notice the hidden input field contains such a character. Also, you
can add an HTML attribute to your form tag which many modern browsers
will respect to set the encoding (accept-charset="utf-8"). And lastly
there are some javascript based tricks and workarounds for even more
odd cases (a web search will return a number of approaches).
Hopefully as more compliant browsers become popular these
edge cases will fade.
288
NOTE It is possible for a form POST multipart request (normally a
file upload) to contain inline content with mixed content character
sets and encoding. For example one might create a POST like this:
292
    use utf8;
    use Encode;
    use HTTP::Request::Common;

    my $utf8 = 'test ♥';
    my $shiftjs = 'test テスト';
    my $req = POST '/root/echo_arg',
      Content_Type => 'form-data',
        Content => [
          arg0 => 'helloworld',
          Encode::encode('UTF-8','♥') => Encode::encode('UTF-8','♥♥'),
          arg1 => [
            undef, '',
            'Content-Type' =>'text/plain; charset=UTF-8',
            'Content' => Encode::encode('UTF-8', $utf8)],
          arg2 => [
            undef, '',
            'Content-Type' =>'text/plain; charset=SHIFT_JIS',
            'Content' => Encode::encode('SHIFT_JIS', $shiftjs)],
          arg2 => [
            undef, '',
            'Content-Type' =>'text/plain; charset=SHIFT_JIS',
            'Content' => Encode::encode('SHIFT_JIS', $shiftjs)],
        ];
316
In this case we've created a POST request but each part specifies its
own content character set (and setting a content encoding would also be
possible). Generally one would not run into this situation in a web
browser context but for completeness' sake Catalyst will notice if a
multipart POST contains parts with complex or extended header
information. In these cases we will try to inspect the metadata and
do the right thing (in the above case we'd use SHIFT_JIS to decode the
arg2 parts, not UTF-8). However if after inspecting the headers we
cannot figure out how to decode the data, we will not attempt to apply
decoding to the form values. Instead the part will be represented as
an instance of Catalyst::Request::PartData, which will contain
all the header information needed for you to perform custom parsing of
the data.
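
The idea of decoding each part according to its own declared character
set can be sketched as follows. The @parts structure and the field
names are hypothetical stand-ins for parsed multipart data; this is not
Catalyst's parser, only an illustration of the "decode if we can,
otherwise leave it alone" rule.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

# Hypothetical, simplified parts: the charset from each part's
# Content-Type header plus the raw bytes of the part.
my @parts = (
    { name => 'arg1', charset => 'UTF-8',     bytes => Encode::encode('UTF-8',     'test ♥') },
    { name => 'arg2', charset => 'SHIFT_JIS', bytes => Encode::encode('SHIFT_JIS', 'test テスト') },
    { name => 'arg3', charset => 'x-unknown', bytes => "\x00\x01" },
);

my %decoded;
for my $part (@parts) {
    my $encoding = Encode::find_encoding($part->{charset});
    # Decode only when Encode recognizes the declared charset;
    # otherwise keep the raw bytes for the caller to handle.
    $decoded{ $part->{name} } = $encoding
        ? $encoding->decode($part->{bytes})
        : $part->{bytes};
}
```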
330
331 Ideally we'd fix Catalyst to be smarter about decoding so please submit
332 your cases of this so we can add intelligence to the parser and find a
333 way to extract a valid value out of it.
334
336 When does Catalyst encode your response body and what rules does it use
337 to determine when that is needed.
338
339 Summary
340 use utf8;
341 use warnings;
342 use strict;
343
344 package MyApp::Controller::Root;
345
346 use base 'Catalyst::Controller';
347 use File::Spec;
348
349 sub scalar_body :Local {
350 my ($self, $c) = @_;
351 $c->response->content_type('text/html');
352 $c->response->body("<p>This is scalar_body action ♥</p>");
353 }
354
355 sub stream_write :Local {
356 my ($self, $c) = @_;
357 $c->response->content_type('text/html');
358 $c->response->write("<p>This is stream_write action ♥</p>");
359 }
360
361 sub stream_write_fh :Local {
362 my ($self, $c) = @_;
363 $c->response->content_type('text/html');
364
365 my $writer = $c->res->write_fh;
366 $writer->write_encoded('<p>This is stream_write_fh action ♥</p>');
367 $writer->close;
368 }
369
370 sub stream_body_fh :Local {
371 my ($self, $c) = @_;
372 my $path = File::Spec->catfile('t', 'utf8.txt');
373 open(my $fh, '<', $path) || die "trouble: $!";
374 $c->response->content_type('text/html');
375 $c->response->body($fh);
376 }
377
378 Discussion
Beginning with Catalyst version 5.90080 you no longer need to set the
encoding configuration (although doing so won't hurt anything).
381
382 Currently we only encode if the content type is one of the types which
383 generally expects a UTF8 encoding. This is determined by the following
384 regular expression:
385
386 our $DEFAULT_ENCODE_CONTENT_TYPE_MATCH = qr{text|xml$|javascript$};
387 $c->response->content_type =~ /$DEFAULT_ENCODE_CONTENT_TYPE_MATCH/
388
389 This is a global variable in Catalyst::Response which is stored in the
390 "encodable_content_type" attribute of $c->response. You may currently
391 alter this directly on the response or globally. In the future we may
392 offer a configuration setting for this.
393
394 This would match content-types like the following (examples)
395
396 text/plain
397 text/html
398 text/xml
399 application/javascript
400 application/xml
401 application/vnd.user+xml
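
If you want to check how a particular content type fares against that
pattern, a quick standalone test like the following may help. The
pattern is copied from above so the snippet runs on its own; the list
of content types is just a sample.

```perl
use strict;
use warnings;

# Same pattern as quoted above, copied here so the snippet stands alone.
my $encodable = qr{text|xml$|javascript$};

my %is_encoded = map { ( $_ => ( $_ =~ $encodable ? 1 : 0 ) ) } qw(
    text/plain
    text/html
    application/javascript
    application/vnd.user+xml
    application/json
    image/png
);

for my $type (sort keys %is_encoded) {
    printf "%-26s %s\n", $type, $is_encoded{$type} ? 'encoded' : 'left alone';
}
```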
402
403 You should set your content type prior to header finalization if you
404 want Catalyst to encode.
405
NOTE We do not attempt to encode "application/json" since the two most
commonly used approaches (Catalyst::View::JSON and
Catalyst::Action::REST) have already configured their JSON encoders to
produce properly encoded UTF8 responses. If you are rolling your own
JSON encoding, you may need to set the encoder to do the right thing
(or override the global regular expression to include the JSON media
type).
413
414 Encoding with Scalar Body
Catalyst supports several methods of supplying your response with body
content. The first and currently most common is to set the
Catalyst::Response ->body with a scalar string (as in the example):
418
419 use utf8;
420
421 sub scalar_body :Local {
422 my ($self, $c) = @_;
423 $c->response->content_type('text/html');
424 $c->response->body("<p>This is scalar_body action ♥</p>");
425 }
426
427 In general you should need to do nothing else since Catalyst will
428 automatically encode this string during body finalization. The only
429 matter to watch out for is to make sure the string has not already been
430 encoded, as this will result in double encoding errors.
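
For anyone who has not run into double encoding before, this small
standalone sketch (core Encode only, nothing Catalyst specific) shows
what happens when an already encoded string is encoded a second time.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

my $text  = '♥';                               # a character string
my $once  = Encode::encode('UTF-8', $text);    # correct: UTF-8 bytes
my $twice = Encode::encode('UTF-8', $once);    # wrong: each byte is encoded again

# A client that decodes the double encoded body once does not get the
# original text back, only the first layer of bytes as junk characters.
my $seen_by_client = Encode::decode('UTF-8', $twice);

my $is_mangled = $seen_by_client ne $text ? 1 : 0;
```

So if you hand Catalyst bytes that you already encoded yourself, the
equivalent of $twice is what goes over the wire.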
431
NOTE Pay attention to the content-type setting in the example.
Catalyst inspects that content type carefully to determine if the body
needs encoding.
435
436 NOTE If you set the character set of the response Catalyst will skip
437 encoding IF the character set is set to something that doesn't match
438 $c->encoding->mime_name. We will assume if you are setting an
439 alternative character set, that means you want to handle the encoding
440 yourself. However it might be easier to set $c->encoding for a given
441 response cycle since you can override this for a given response. For
442 example here's how to override the default encoding and set the correct
443 character set in the response:
444
445 sub override_encoding :Local {
446 my ($self, $c) = @_;
447 $c->res->content_type('text/plain');
448 $c->encoding(Encode::find_encoding('Shift_JIS'));
449 $c->response->body("テスト");
450 }
451
452 This will use the alternative encoding for a single response.
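
To see what the encoding object passed to $c->encoding provides, here
is a standalone sketch using only the core Encode module. It shows the
two things that matter for a response: the MIME name of the encoding
and the bytes produced for the body. Nothing here calls into Catalyst.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

my $text     = 'テスト';
my $encoding = Encode::find_encoding('Shift_JIS');

my $charset = $encoding->mime_name;     # name suitable for a charset parameter
my $body    = $encoding->encode($text); # bytes in Shift_JIS, not UTF-8

my $sjis_bytes = length $body;
my $utf8_bytes = length Encode::encode('UTF-8', $text);
my $roundtrip  = $encoding->decode($body);
```

The byte counts differ between the two encodings, which is one reason
the character set sent to the client has to agree with the encoding
that was actually applied to the body.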
453
454 NOTE If you manually set the content-type character set to whatever
455 $c->encoding->mime_name is set to, we STILL encode, rather than assume
456 your manual setting is a flag to override. This is done to support
457 backward compatible assumptions (in particular Catalyst::View::TT has
458 set a utf-8 character set in its default content-type for ages, even
459 though it does not itself do any encoding on the body response). If
460 you are going to handle encoding manually you may set
461 $c->clear_encoding for a single request response cycle, or as in the
462 above example set an alternative encoding.
463
464 Encoding with streaming type responses
465 Catalyst offers two approaches to streaming your body response. Again,
466 you must remember to set your content type prior to streaming, since
467 invoking a streaming response will automatically finalize and send your
468 HTTP headers (and your content type MUST be one that matches the
469 regular expression given above.)
470
471 Also, if you are going to override $c->encoding (or invoke
472 $c->clear_encoding), you should do that before anything else!
473
474 The first streaming method is to use the "write" method on the response
475 object. This method allows 'inlined' streaming and is generally used
476 with blocking style servers.
477
478 sub stream_write :Local {
479 my ($self, $c) = @_;
480 $c->response->content_type('text/html');
481 $c->response->write("<p>This is stream_write action ♥</p>");
482 }
483
484 You may call the "write" method as often as you need to finish
485 streaming all your content. Catalyst will encode each line in turn as
486 long as the content-type meets the 'encodable types' requirement and
487 $c->encoding is set (which it is, as long as you did not change it).
488
489 NOTE If you try to change the encoding after you start the stream, this
490 will invoke an error response. However since you've already started
491 streaming this will not show up as an HTTP error status code, but
492 rather error information in your body response and an error in your
493 logs.
494
NOTE If you use ->body AFTER using ->write (for example you may do this
to write your HTML HEAD information as fast as possible) we expect the
contents of body to be encoded as they normally would be if you never
called ->write. In general unless you are doing unusual custom work
with encoding this is likely to already do the correct thing.
500
501 The second way to stream a response is to get the response writer
502 object and invoke methods on that directly:
503
504 sub stream_write_fh :Local {
505 my ($self, $c) = @_;
506 $c->response->content_type('text/html');
507
508 my $writer = $c->res->write_fh;
509 $writer->write_encoded('<p>This is stream_write_fh action ♥</p>');
510 $writer->close;
511 }
512
This can be used just like the "write" method, but typically you
request this object when you want to do a nonblocking style response
since the writer object can be closed over or sent to a model that will
invoke it in a nonblocking manner. For more on using the writer
object for nonblocking responses you should review the "Catalyst"
documentation and also you can look at several articles from the 2013
Catalyst advent calendar, in particular:
520
521 <http://www.catalystframework.org/calendar/2013/10>,
522 <http://www.catalystframework.org/calendar/2013/11>,
523 <http://www.catalystframework.org/calendar/2013/12>,
524 <http://www.catalystframework.org/calendar/2013/13>,
525 <http://www.catalystframework.org/calendar/2013/14>.
526
The main difference since those articles were written is that previously
calling ->write_fh would return the actual Plack writer object that was
supplied by your Plack application handler, whereas now we wrap that
object in a lightweight decorator object that proxies the "write" and
"close" methods and supplies an additional "write_encoded" method.
"write_encoded" does the exact same thing as "write" except that it
will first encode the string when necessary. In general if you are
streaming encodable content such as HTML this is the method to use. If
you are streaming binary content, you should just use the "write"
method (although if the content type is set correctly we would skip
encoding anyway, but you may as well avoid the extra no-op overhead).
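
The decorator idea can be sketched in a few lines of plain Perl. Both
classes below are hypothetical (My::FakeWriter stands in for the PSGI
writer and My::EncodingWriter for the wrapper); this is a simplified
illustration and not the actual Catalyst implementation, which for
instance also takes the content type into account.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

# Stand-in for the PSGI writer: it just collects whatever it is given.
package My::FakeWriter {
    sub new   { my ($class) = @_; return bless { chunks => [], closed => 0 }, $class }
    sub write { my ($self, $bytes) = @_; push @{ $self->{chunks} }, $bytes; return }
    sub close { my ($self) = @_; $self->{closed} = 1; return }
}

# Hypothetical decorator: proxies write/close and adds write_encoded.
package My::EncodingWriter {
    sub new {
        my ($class, %args) = @_;
        return bless { writer => $args{writer}, encoding => $args{encoding} }, $class;
    }
    sub write { my ($self, @args) = @_; return $self->{writer}->write(@args) }
    sub close { my ($self) = @_; return $self->{writer}->close }

    sub write_encoded {
        my ($self, $chars) = @_;
        my $encoding = $self->{encoding};
        # Encode only when an encoding is configured; otherwise pass through.
        my $out = $encoding ? $encoding->encode($chars) : $chars;
        return $self->{writer}->write($out);
    }
}

my $raw    = My::FakeWriter->new;
my $writer = My::EncodingWriter->new(
    writer   => $raw,
    encoding => Encode::find_encoding('UTF-8'),
);

$writer->write_encoded('<p>♥</p>');   # character string in, bytes out
$writer->close;
```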
538
539 The last style of content response that Catalyst supports is setting
540 the body to a filehandle like object. In this case the object is
541 passed down to the Plack application handler directly and currently we
542 do nothing to set encoding.
543
544 sub stream_body_fh :Local {
545 my ($self, $c) = @_;
546 my $path = File::Spec->catfile('t', 'utf8.txt');
547 open(my $fh, '<', $path) || die "trouble: $!";
548 $c->response->content_type('text/html');
549 $c->response->body($fh);
550 }
551
In this example we create a filehandle to a text file that contains
UTF8 encoded characters. We pass this down without modification, which
I think is correct since we don't want to double encode. However this
may change in a future development release so please be sure to double
check the current docs and changelog. It's possible a future release
will require you to set an encoding on the IO layer level so that we
can be sure to properly encode at body finalization. So this is still
an edge case we are writing test examples for. But for now if you are
returning a filehandle like response, you are expected to make sure you
are following the PSGI specification and return raw bytes.
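
As a small illustration of what "raw bytes" means for a filehandle, the
sketch below reads UTF8 encoded content through a handle opened in raw
mode. An in-memory filehandle is used as a stand-in for a file on disk
so the snippet is self-contained; nothing here is Catalyst specific.

```perl
use strict;
use warnings;
use utf8;
use Encode ();

# Stand-in for a file on disk that was saved as UTF-8.
my $file_contents = Encode::encode('UTF-8', "<p>♥</p>\n");

# Open in raw mode so reads return the bytes exactly as stored.
open(my $fh, '<:raw', \$file_contents) or die "trouble: $!";
my $line = <$fh>;
close $fh;

my $is_bytes = utf8::is_utf8($line) ? 0 : 1;   # not a decoded character string
my $length   = length $line;                   # counts bytes, not characters
```

Had the handle been opened with an ":encoding(UTF-8)" layer instead,
reads would return decoded characters, which is not what a PSGI server
expects from a filehandle body.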
562
563 Override the Encoding on Context
As already noted you may change the current encoding (or remove it) by
setting an alternative encoding on the context:
566
567 $c->encoding(Encode::find_encoding('Shift_JIS'));
568
569 Please note that you can continue to change encoding UNTIL the headers
570 have been finalized. The last setting always wins. Trying to change
571 encoding after header finalization is an error.
572
573 Setting the Content Encoding HTTP Header
In some cases you may set a content encoding on your response, for
example if you are compressing your response with gzip. In this case you
are again on your own. If we notice that the content encoding header
is set when we hit finalization, we skip automatic encoding:
578
579 use Encode;
580 use Compress::Zlib;
581 use utf8;
582
583 sub gzipped :Local {
584 my ($self, $c) = @_;
585
586 $c->res->content_type('text/plain');
587 $c->res->content_type_charset('UTF-8');
588 $c->res->content_encoding('gzip');
589
590 $c->response->body(
591 Compress::Zlib::memGzip(
592 Encode::encode_utf8("manual_1 ♥")));
593 }
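
The order of operations in that action matters: characters are encoded
to bytes first and only then compressed. The following standalone
sketch (core Encode and Compress::Zlib only) shows the same steps and
what a client effectively does to undo them.

```perl
use strict;
use warnings;
use utf8;
use Encode ();
use Compress::Zlib ();

my $text    = 'manual_1 ♥';
my $bytes   = Encode::encode('UTF-8', $text);    # characters -> bytes first
my $gzipped = Compress::Zlib::memGzip($bytes);   # then compress the bytes

# What a client effectively does on receipt: gunzip, then decode.
my $unzipped  = Compress::Zlib::memGunzip($gzipped);
my $roundtrip = Encode::decode('UTF-8', $unzipped);
```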
594
595 If you are using Catalyst::Plugin::Compress you need to upgrade to the
596 most recent version in order to be compatible with changes introduced
597 in Catalyst 5.90080. Other plugins may require updates (please open
598 bugs if you find them).
599
NOTE Content encoding may be set to 'identity' and we will still
perform automatic encoding if the content type is encodable and an
encoding is present for the context.
603
604 Using Common Views
605 The following common views have been updated so that their tests pass
606 with default UTF8 encoding for Catalyst:
607
608 Catalyst::View::TT, Catalyst::View::Mason, Catalyst::View::HTML::Mason,
609 Catalyst::View::Xslate
610
611 See Catalyst::Upgrading for additional information on Catalyst
612 extensions that require upgrades.
613
In general for the common views you should not need to do anything
special. If your actual template files contain UTF8 literals you
should set configuration on your View to enable that. For example in
TT, if your template has actual UTF8 characters in it you should do the
following:
619
620 MyApp::View::TT->config(ENCODING => 'utf-8');
621
However Catalyst::View::Xslate wants to do the UTF8 encoding for you
(we assume that the authors of that view did this as a workaround to
the fact that until now encoding was not core to Catalyst). So if you
use that view, you either need to tell it to not encode, or you need to
turn off encoding for Catalyst.
627
628 MyApp::View::Xslate->config(encode_body => 0);
629
630 or
631
632 MyApp->config(encoding=>undef);
633
634 Preference is to disable it in the View.
635
Other views may be similar. You should review View documentation and
test during upgrading. We tried to make sure most common views worked
properly and noted all workarounds, but if we missed something please
alert the development team (instead of introducing a local hack into
your application that will mean nobody will ever upgrade it...).
641
642 Setting the response from an external PSGI application.
643 Catalyst::Response allows one to set the response from an external PSGI
644 application. If you do this, and that external application sets a
645 character set on the content-type, we "clear_encoding" for the rest of
646 the response. This is done to prevent double encoding.
647
648 NOTE Even if the character set of the content type is the same as the
649 encoding set in $c->encoding, we still skip encoding. This is a
650 regrettable difference from the general rule outlined above, where if
651 the current character set is the same as the current encoding, we
652 encode anyway. Nevertheless I think this is the correct behavior since
653 the earlier rule exists only to support backward compatibility with
654 Catalyst::View::TT.
655
In general if you want Catalyst to handle encoding, you should avoid
setting the content type character set since Catalyst will do so
automatically based on the requested response encoding. It's best to
request alternative encodings by setting $c->encoding, and if you
really want manual control of encoding you should always call
$c->clear_encoding so that programmers that come after you are very
clear as to your intentions.
663
664 Disabling default UTF8 encoding
You may encounter issues with your legacy code running under default
UTF8 body encoding. If so you can disable this with the following
configuration setting:
668
669 MyApp->config(encoding=>undef);
670
671 Where "MyApp" is your Catalyst subclass.
672
If you do not wish to disable all the Catalyst encoding features, you
may disable specific features via two additional configuration options:
'skip_body_param_unicode_decoding' and
'skip_complex_post_part_handling'. The first will skip any attempt to
decode POST parameters in the creation of body parameters and the
second will skip creation of instances of Catalyst::Request::PartData
in the case that the multipart form upload contains parts with a mix of
content character sets.
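
As a sketch, and assuming these two options are set like any other
application configuration key, that might look like the following in
your application class (you would normally enable only the one you
need):

```perl
package MyApp;
use Catalyst;

__PACKAGE__->config(
    skip_body_param_unicode_decoding => 1,  # leave POSTed body parameters undecoded
    skip_complex_post_part_handling  => 1,  # do not create Catalyst::Request::PartData objects
);
```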
681
682 If you believe you have discovered a bug in UTF8 body encoding, I
683 strongly encourage you to report it (and not try to hack a workaround
684 in your local code). We also recommend that you regard such a
685 workaround as a temporary solution. It is ideal if Catalyst extension
686 authors can start to count on Catalyst doing the right thing for
687 encoding.
688
690 This document has attempted to be a complete review of how UTF8 and
691 encoding works in the current version of Catalyst and also to document
692 known issues, gotchas and backward compatible hacks. Please report
693 issues to the development team.
694
696 John Napiorkowski jjnapiork@cpan.org <mailto:jjnapiork@cpan.org>

perl v5.36.0 2022-07-31 Catalyst::UTF8(3)