RewriteRules don't deal with character 0x85 properly

Post by fizbin » 07. July 2015 06:58

I discovered this problem while debugging someone's setup on Stack Overflow.

I believe that the Apache binaries distributed by XAMPP (at least on Windows) are linked against a PCRE library that was compiled with the flag --enable-newline-is-any. This has the following bad consequence:

If I have a RewriteRule (e.g., if I'm following the standard instructions for setting up Phalcon) that says something like this:

Code:
    RewriteRule  (.*) public/$1 [L]


Then, if the PCRE library built into Apache is compiled with --enable-newline-is-any, everything from the first 0x85 byte (%85 in a percent-encoded URL) onward gets dropped from the match. This is a serious problem for URLs with multibyte input, because 0x85 is a valid continuation byte in UTF-8. See, for example, the Stack Overflow question http://stackoverflow.com/questions/31207949/specific-kanji-gets-wrongly-encoded-during-http-get-request

In that case, building the PCRE library this way had the bizarre consequence that the URL http://localhost/view/kanji/娩 would work fine, but the URL http://localhost/view/kanji/免 wouldn't. The rewrite logs showed this:
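To check whether a given character is affected, it is enough to look at its UTF-8 bytes. Here is a minimal Python sketch; the helper name has_byte_0x85 is my own, and the two sample characters are the kanji from the linked question:

```python
from urllib.parse import quote


def has_byte_0x85(text: str) -> bool:
    """Return True if the UTF-8 encoding of text contains byte 0x85."""
    return 0x85 in text.encode("utf-8")


# Print each character, its percent-encoded form, and whether it is affected.
for ch in ("娩", "免"):
    print(ch, quote(ch), has_byte_0x85(ch))

# 娩 %E5%A8%A9 False
# 免 %E5%85%8D True
```

Any URL path containing a character of the second kind would be cut short by a rule like the one above.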

Code:
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace3] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] strip per-dir prefix: E:/xampp/htdocs/view/kanji/\xe5\x85\x8d -> view/kanji/\xe5\x85\x8d
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace3] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] applying pattern '(.*)' to uri 'view/kanji/\xe5\x85\x8d'
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace2] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] rewrite 'view/kanji/\xe5\x85\x8d' -> 'public/view/kanji/\xe5'
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace3] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] add per-dir prefix: public/view/kanji/\xe5 -> E:/xampp/htdocs/public/view/kanji/\xe5
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace2] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] strip document_root prefix: E:/xampp/htdocs/public/view/kanji/\xe5 -> /public/view/kanji/\xe5
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace1] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] internal redirect with /public/view/kanji/\xe5 [INTERNAL REDIRECT]


That is, (.*) in the RewriteRule does not match byte 0x85 (or anything beyond it), which is highly unexpected behavior.
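The truncation in the log can be reproduced outside Apache. The sketch below is only an emulation in Python, not PCRE itself: it replaces "." with a character class that excludes the bytes I understand PCRE to treat as newlines in "any" mode without UTF-8 (LF, VT, FF, CR and NEL 0x85). That byte list and the rewrite helper are my assumptions, not something taken from the XAMPP build.

```python
import re

# Emulated "(.*)": a dot that refuses to match LF, VT, FF, CR or NEL (0x85).
# The excluded bytes are an assumption about PCRE's "newline is any" mode.
EMULATED_DOT_STAR = re.compile(rb"([^\n\x0b\x0c\r\x85]*)")


def rewrite(uri: bytes) -> bytes:
    """Mimic 'RewriteRule (.*) public/$1' using the emulated pattern."""
    match = EMULATED_DOT_STAR.search(uri)
    return b"public/" + match.group(1)


print(rewrite("view/kanji/娩".encode("utf-8")))
# b'public/view/kanji/\xe5\xa8\xa9'
print(rewrite("view/kanji/免".encode("utf-8")))
# b'public/view/kanji/\xe5'
```

The second result matches the rewrite target shown in the trace2 line of the log.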

Building the PCRE library with this flag has no beneficial effects that I can see, and it has the major disadvantage that the "." metacharacter in regexes no longer matches byte 0x85. Essentially no one wants that; the few people who do want to treat character 0x85 as a newline character can add (*ANY) to the front of their patterns. Given how important RewriteRule directives are to various PHP frameworks, I think the risk of breaking them is too large.
fizbin
 
Posts: 1
Joined: 06. July 2015 22:19
Operating System: Windows 8
