Apache Windows ExtFilterDefine / Output / SED

PostPosted: 10. November 2007 19:49
by Ancient

A couple of questions concerning Apache on Windows (tried 2.0 and 2.2) using mod_ext_filter.

- Apache runs as a Reverse Proxy and I need to rewrite a received HTML document.
- On Linux I am using sed and a simple Regex to "clean up" the HTML code. Works fine.

On Apache for Windows (running on W2K3) I tried the same with several Win32 builds of sed.
The call (on Windows) looks something like this:

ExtFilterDefine cleanup mode=output intype=text/html cmd="c:/sed.exe s/search/replace/g"
SetOutputFilter cleanup

What happens is that either sed (and Apache) hangs or nothing happens at all, depending on the version of sed. The regex is fine; it works at
the command prompt. However, Apache and sed on Windows do not seem to work well together.
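
For reference, the external filter only has to read the response body on standard input and write the rewritten body to standard output. Below is a minimal sketch of that step in Python, assuming a Python 3 interpreter is available on the server; the name `filter_stream` and the `search`/`replace` placeholders are hypothetical, and whether this avoids the hang seen with sed is untested.

```python
import re

# Placeholder pattern and replacement, standing in for sed's s/search/replace/g.
PATTERN = re.compile(rb"search")
REPLACEMENT = b"replace"


def filter_stream(src, dst):
    """Copy src to dst, replacing every match of PATTERN.

    Both streams are binary, so no newline translation is applied on Windows.
    """
    data = src.read()  # read the whole body until end of input
    dst.write(PATTERN.sub(REPLACEMENT, data))
    dst.flush()  # push the output through without waiting for exit
```

To act as a filter, the script would import `sys` and end with `filter_stream(sys.stdin.buffer, sys.stdout.buffer)`, and the `cmd=` value of ExtFilterDefine would then name the Python interpreter and the script instead of sed.exe.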

My question:

- Has anybody successfully modified output HTML using ExtFilterDefine (or something similar) on a Windows machine? Which
tool were you using?

Thanks a lot! I am really stuck on this topic...


PostPosted: 10. November 2007 20:39
by Wiedmann
Apache runs as a Reverse Proxy and I need to rewrite a received HTML document.

Use mod_proxy_html for this.

PostPosted: 10. November 2007 20:47
by Ancient
Can mod_proxy handle any HTML Content or just links? I thought it was mainly used for rewriting URLs. I simply need to delete some data/tags from the HTML files.

PostPosted: 10. November 2007 21:06
by Wiedmann
Can mod_proxy handle any HTML Content or just links?

You are right, mod_proxy_html only handles links. To change the content as well, use mod_publisher instead.

PostPosted: 11. November 2007 20:25
by Ancient
OK, I tried using mod_publisher... but again I fail :( mod_publisher seems to do nothing, and does not even display an error.

I downloaded the module (and the required files) and added it to my httpd.conf (I am using Apache 2.0):

LoadFile bin/zlib1.dll
LoadFile bin/iconv.dll
LoadFile bin/libxml2.dll
LoadModule publisher_module modules/

My virtual host is working and looks like this:

- calls to are redirected to the local Apache listening on port 80
- then rewritten to a local website (for testing purposes); this works so far
- now I need to replace part of the retrieved content (removing the session ID)
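
The last step, removing the session ID from the retrieved content, comes down to a single substitution. A minimal sketch in Python follows; the thread does not show the actual session ID format, so the `;jsessionid=<token>` form and the name `strip_session_ids` are purely hypothetical and would need adapting to the real markup.

```python
import re

# Hypothetical format: a ";jsessionid=<token>" path parameter inside URLs.
# The real format is not given in the thread and may differ.
SESSION_ID = re.compile(r";jsessionid=[0-9A-Za-z]+", re.IGNORECASE)


def strip_session_ids(html: str) -> str:
    """Return the HTML with every session ID parameter removed."""
    return SESSION_ID.sub("", html)
```

For example, `strip_session_ids('<a href="/page.html;jsessionid=ABC123?x=1">')` leaves the link as `<a href="/page.html?x=1">`.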


RewriteEngine on
RewriteRule /(.*)$1

#AddOutputFilter markup-publisher .html
#SetOutputFilter INFLATE;proxy-html;DEFLATE
RequestHeader unset Accept-Encoding

MLLogVerbose On
MLRewriteOptions +urls +attributes