I believe that the Apache binaries distributed by XAMPP (at least on Windows) are linked against a PCRE library that was compiled with the flag --enable-newline-is-any. This has the following bad consequence:
If I have a RewriteRule (e.g., if I'm following the standard instructions for setting up Phalcon) that says something like this:
RewriteRule (.*) public/$1 [L]
Then, if the PCRE library built into Apache is compiled with --enable-newline-is-any, the first 0x85 byte (%85 in the percent-encoded URL) and everything after it will get dropped. This is a serious problem for URLs with multibyte input. See, for example, the Stack Overflow question http://stackoverflow.com/questions/31207949/specific-kanji-gets-wrongly-encoded-during-http-get-request
In that case, building the PCRE library this way caused the bizarre consequence that the URL http://localhost/view/kanji/娩 would work fine, but the URL http://localhost/view/kanji/免 wouldn't. The difference is that the UTF-8 encoding of 免 is E5 85 8D, which contains byte 0x85, while 娩 is E5 A8 A9, which doesn't. The rewrite logs showed this:
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace3] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] strip per-dir prefix: E:/xampp/htdocs/view/kanji/\xe5\x85\x8d -> view/kanji/\xe5\x85\x8d
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace3] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] applying pattern '(.*)' to uri 'view/kanji/\xe5\x85\x8d'
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace2] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] rewrite 'view/kanji/\xe5\x85\x8d' -> 'public/view/kanji/\xe5'
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace3] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] add per-dir prefix: public/view/kanji/\xe5 -> E:/xampp/htdocs/public/view/kanji/\xe5
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace2] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] strip document_root prefix: E:/xampp/htdocs/public/view/kanji/\xe5 -> /public/view/kanji/\xe5
[Mon Jul 06 18:05:59.294798 2015] [rewrite:trace1] [pid 5728:tid 2180] mod_rewrite.c(475): [client 127.0.0.1:52741] 127.0.0.1 - - [localhost/sid#3dac08][rid#3f045c0/initial] [perdir E:/xampp/htdocs/] internal redirect with /public/view/kanji/\xe5 [INTERNAL REDIRECT]
That is, the (.*) in the RewriteRule stopped matching at byte 0x85, so that byte and everything after it were dropped from the rewritten path. This is highly unexpected behavior.
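To make the truncation easier to see, here is a rough Python sketch that imitates what the logs show. It is not Apache or PCRE code: emulate_dot_star and emulate_rewrite are hypothetical helpers I made up for illustration, and the set of newline bytes is my reading of the PCRE documentation for the "any" newline convention in 8-bit (non-UTF) mode.

```python
from urllib.parse import quote

# Assumption: with the "any" newline convention in 8-bit (non-UTF) mode,
# PCRE treats LF, VT, FF, CR and NEL (0x85) as line terminators, so "."
# refuses to match any of them.
NEWLINE_ANY_BYTES = frozenset(b"\n\x0b\x0c\r\x85")


def emulate_dot_star(subject: bytes) -> bytes:
    """Imitate the capture of an unanchored (.*): stop at the first newline byte."""
    for i, byte in enumerate(subject):
        if byte in NEWLINE_ANY_BYTES:
            return subject[:i]
    return subject


def emulate_rewrite(path: str) -> bytes:
    """Hypothetical stand-in for: RewriteRule (.*) public/$1 [L]"""
    return b"public/" + emulate_dot_star(path.encode("utf-8"))


for kanji in ("娩", "免"):
    # Show the percent-encoded form next to the emulated rewrite result.
    print(kanji, quote(kanji), emulate_rewrite("view/kanji/" + kanji))
```

For 免 the percent-encoded form is %E5%85%8D and the emulated result ends in a lone \xe5, which is the same truncation as in the rewrite log above. For 娩 (%E5%A8%A9) all three bytes survive.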
Building the PCRE library with this flag has no benefit that I can see, and it has the major disadvantage of making the "." metacharacter in regexes not match byte 0x85, which is very much unexpected behavior. Essentially no one wants that; the few people who might want to treat byte 0x85 as a newline character can add (*ANY) to the front of their patterns. Given how important RewriteRule directives are to various PHP frameworks, I think the risk of breaking them is too large.