gkcommit.pl: fix UTF-8 and other encoding issues
The gatekeeper script was not correctly respecting the locale specified in the user's environment. As a result, the following scenario could (and did) easily happen:

1. A committer writes a valid message in UTF-8 and runs `svn commit` with a correct locale setting of `LANG=en_US.UTF-8`.
2. SVN transcodes that to UTF-8 for internal storage (a no-op in this case).
3. The gatekeeper, also with `LANG=en_US.UTF-8` set, runs `gkcommit.pl ...`. This breaks down into the following steps:
   A. Run `svn log --xml ...`, which SVN correctly transcodes from UTF-8 into the current locale, which happens to also be UTF-8.
   B. Perl reads this in and assumes it is a sequence of raw 8-bit bytes in a "native" latin1-type encoding.
   C. Perl's XML::Parser module spots the XML declaration stating that the content is UTF-8 encoded: `<?xml version="1.0" encoding="UTF-8"?>`. Perl internally stores the parsed strings as proper Unicode strings (UTF-8 encoded internally, but that's irrelevant here).
   D. Perl writes out the commit message file in the _latin1_ encoding, transcoding characters from internal UTF-8. This causes characters like "ä" (Unicode code point: 0xe4, UTF-8 encoding: 0xc3 0xa4) to be encoded as a single byte: 0xe4.

This fix changes the behavior at steps 3A and 3D to transparently treat the incoming/outgoing data as UTF-8 (assuming a UTF-8 locale is set in the user's environment).

There can still be problems if either the committer or the gatekeeper has locale settings that do not agree with the encoding that their editor is producing, but such is i18n :(

Helpful references for anyone debugging this sort of issue in the future:

* http://perldoc.perl.org/perllocale.html#Unicode-and-UTF-8
* http://perldoc.perl.org/perluniintro.html#Unicode-I%2fO

Refs trac:4217

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30709.

The following Trac tickets were found above:
  Ticket 4217 --> https://svn.open-mpi.org/trac/ompi/ticket/4217
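To make the byte-level difference in step 3D concrete, here is a minimal standalone sketch (not part of gkcommit.pl) that encodes the character "ä" both ways using the core Encode module. The `hex_bytes` helper is a hypothetical name introduced only for this illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00E4 (LATIN SMALL LETTER A WITH DIAERESIS) as a Perl character string.
my $char = "\x{e4}";

# Hypothetical helper for this sketch: render each byte of an
# already-encoded string as 0xNN, separated by spaces.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '0x%02x', ord($_) } split //, $bytes;
}

# The same character, serialized under the two encodings discussed above.
my $as_latin1 = hex_bytes(encode('ISO-8859-1', $char));
my $as_utf8   = hex_bytes(encode('UTF-8', $char));

print "latin1: $as_latin1\n";   # one byte:  0xe4
print "utf-8:  $as_utf8\n";     # two bytes: 0xc3 0xa4
```

A reader that expects UTF-8 and receives the lone 0xe4 byte sees an invalid sequence, which is the corruption described in the commit message.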
Parent: 452f73de3d
Commit: 33da7d6f23
contrib/dist/gkcommit.pl (7 lines changed, vendored)
@@ -9,6 +9,13 @@
 
 use strict;
 
+use locale ':not_characters';
+# Respect the locale (LANG, LC_CTYPE, etc.) specified in the environment in
+# which this script is run when performing file input and output. Necessary to
+# ensure proper transcoding when grabbing log messages from SVN and then
+# writing them back out again.
+use open ':locale';
+
 use Getopt::Long;
 use XML::Parser;
 use Data::Dumper;