openmpi/contrib
Dave Goodell 33da7d6f23 gkcommit.pl: fix UTF-8 and other encoding issues
The gatekeeper script was not correctly respecting the locale specified
in the user's environment.  As a result, the following scenario could
(and did) easily happen:

1. A committer writes a valid message in UTF-8 and runs `svn commit` with
   a correct locale setting of `LANG=en_US.UTF-8`.

2. SVN transcodes that to UTF-8 for internal storage (a no-op in this
   case).

3. The gatekeeper, also with `LANG=en_US.UTF-8` set, runs
   `gkcommit.pl ...`.  This breaks down into the following steps:

   A. run `svn log --xml ...`, which SVN correctly transcodes from UTF-8
      into the current locale, which happens to also be UTF-8

   B. Perl reads this in and assumes this is a sequence of raw 8-bit
      bytes in a "native" latin1-type encoding.

   C. Perl's XML::Parser module spots the XML declaration stating the
      content is UTF-8 encoded: `<?xml version="1.0" encoding="UTF-8"?>`.
      Perl internally stores the parsed strings as proper Unicode
      strings (UTF-8 encoded internally, but that's irrelevant here).

   D. Perl writes out the commit message file in the _latin1_ encoding,
      transcoding characters from internal UTF-8.  This causes
      characters like "ä" (Unicode code point: 0xe4, UTF-8 encoding:
      0xc3 0xa4) to be encoded as a single byte: 0xe4.

This fix changes the behavior at steps 3A and 3D to transparently treat
the incoming/outgoing data as UTF-8 (assuming a UTF-8 locale is set in
the user's environment).
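
One way to picture "treat the data as UTF-8 when the locale says so" is sketched below. It is written in Python for illustration and is not the actual change made to `gkcommit.pl`; the helper names (`locale_codeset`, `is_utf8_locale`, `round_trip`) and the latin1 fallback are assumptions introduced for the example.

```python
import codecs


def locale_codeset(env):
    """Return the codeset named by the POSIX locale variables, or None."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):  # POSIX precedence order
        value = env.get(var)
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier.
            _, dot, rest = value.partition(".")
            return rest.partition("@")[0] if dot else None
    return None


def is_utf8_locale(env):
    """True if the environment selects a UTF-8 locale (e.g. en_US.UTF-8)."""
    codeset = locale_codeset(env)
    if not codeset:
        return False
    try:
        return codecs.lookup(codeset).name == "utf-8"
    except LookupError:
        return False


def round_trip(raw_bytes, env):
    """Decode input and encode output with the same locale-derived encoding."""
    # Assumption for this sketch: fall back to latin1 for non-UTF-8 locales.
    encoding = "utf-8" if is_utf8_locale(env) else "latin-1"
    text = raw_bytes.decode(encoding)
    return text.encode(encoding)
```

For example, `round_trip(b"\xc3\xa4", {"LANG": "en_US.UTF-8"})` returns the same two bytes it was given, because the input and output sides now agree on the encoding.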

There can still be problems if either the committer or the gatekeeper
has locale settings that do not agree with the encoding that their
editor is producing, but such is i18n :(

Helpful references for anyone debugging this sort of issue in the
future:

* http://perldoc.perl.org/perllocale.html#Unicode-and-UTF-8
* http://perldoc.perl.org/perluniintro.html#Unicode-I%2fO

Refs trac:4217

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30709.

The following Trac tickets were found above:
  Ticket 4217 --> https://svn.open-mpi.org/trac/ompi/ticket/4217
2014-02-13 03:56:01 +00:00
amca-param-sets Simplification of the ErrMgr framework by removing the 'stack'/composite functionality. 2010-08-19 13:09:20 +00:00
build-mca-comps-outside-of-tree As per the RFC, bring in the ORTE async progress code and the rewrite of OOB: 2013-08-22 16:37:40 +00:00
completion Update instructions for installing completion scripts and add them to 2013-11-14 15:26:00 +00:00
dist gkcommit.pl: fix UTF-8 and other encoding issues 2014-02-13 03:56:01 +00:00
git Script to generate svn2git mirror on github 2013-10-17 06:49:58 +00:00
hg Skip some more common files. 2012-05-02 13:05:37 +00:00
nightly Shift off the arguments that we've already processed. 2012-10-05 12:24:24 +00:00
platform Fix wrapper ldflags. 2014-02-04 19:44:08 +00:00
scaling Use preconnect as a better test of startup scaling than barrier 2012-06-01 02:35:15 +00:00
spread Reorganize the rmcast code to capture common code elements. Increase max msg size for spread and udp transports. Cleanup the spread configuration doc. 2010-05-25 22:36:57 +00:00
authors-to-cvsimport.pl add authors-to-cvsimport.pl script 2013-07-17 21:21:15 +00:00
check_unnecessary_headers.sh Turns out that there was exactly ONE place in all of the OMPI code base that still referred to OPAL_TRACE, though a few places retained the include file for no reason. So no point in letting this sit as it is clearly an unused "feature". 2013-07-14 18:57:20 +00:00
check-btl-sm-diffs.pl Update to reflect changes. Matches 1.7 branch now. 2013-11-07 18:46:00 +00:00
check-ob1-pml-diffs.pl Some utilities for tracking difference between ob1 and other PMLs. 2010-11-30 14:51:01 +00:00
check-ob1-revision.pl update revision in check script 2012-03-15 10:29:22 +00:00
code_counter.pl Just because someone asked me how many LOC were in OMPI recently. :-) 2007-01-30 19:27:58 +00:00
find_occurence.pl Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
find_offenders.pl Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
fix_headers.pl Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
fix_indent.pl Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
gen_stats.pl - to get coverage analysis with gcc-4, detect the .gcda files, too 2007-01-17 14:21:23 +00:00
generate_file_list.pl Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
header_replacement.sh - Minor update used for the last commit 2009-03-04 15:37:50 +00:00
headers.txt this is the big windows commit. there are more things which have gone into this than i can remember. but basically, we are looking for 2004-10-22 16:06:05 +00:00
Makefile.am Fix longstanding issue with our multi-project support. Rather than using 2014-01-07 22:11:15 +00:00
ompi_branch_check_revisions-v1.5.txt - Script to check for revisions in trunk, not yet in branch. 2010-03-12 21:21:37 +00:00
ompi_branch_check_revisions.pl - Convert shell script to perl ;-) Only (XML) svn log, instead of 2010-03-31 03:00:58 +00:00
ompi_cplusplus.sed - Replace combinations of 2009-08-20 11:42:18 +00:00
ompi_cplusplus.sh - Replace combinations of 2009-08-20 11:42:18 +00:00
ompi_cplusplus.txt Remove the last vestiges of mpi_portable_platform.h.in 2010-03-05 21:21:03 +00:00
openmpi-valgrind.supp Update the suppression rules for valgrind to hide the uninitialized byte 2010-07-21 17:30:13 +00:00
search_compare.pl Improve the script to ignore executables and Mac-specific files of no interest 2014-02-11 22:53:14 +00:00
search_replace.pl Improve the search/replace scripty foo a bit: don't traverse into .hg 2011-03-18 12:41:46 +00:00
submit_test.pl Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
test_headers_in_ompi.pl Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
uncrustify_open_mpi.cfg - Add the uncrustify source code beautification for Open MPI. 2009-09-29 16:10:01 +00:00
update-my-copyright.pl update-my-copyright.pl now works with Git 2013-07-09 14:39:41 +00:00