oisf / libhtp

LibHTP is a security-aware parser for the HTTP protocol and the related bits and pieces.

License: BSD 3-Clause "New" or "Revised" License

Shell 0.03% C 38.49% Ruby 0.43% Perl 0.75% PHP 0.32% C++ 56.96% Makefile 0.18% M4 2.51% Python 0.02% Raku 0.31%

libhtp's People

Contributors

aborrero, b1v1r, basking2, bolzzy, catenacyber, cccs-rtmorti, cccs-sadugas, davidkorczynski, fd00, felix-jia, ffontaine, glongo, inashivb, ivanr, jufajardini, kevinreddot, montekki, pc-anssi, poona, regit, satta, victorjulien, wenhuizhang, wgh-, wxsbsd

libhtp's Issues

Incorrect revision in the version string

LibHTP originally used Subversion and relied on the most recent revision number (manually updated after check out/export using update_version.sh) to accurately identify development releases. Now that we use Git, the revision is fixed at "r238" (in htp.c).

We need to investigate if it is possible to replicate revision numbers in Git or, if it is not, remove the hard-coded revision information.

htp_unparse_uri_noencode() is private

The htp_unparse_uri_noencode() function is private, but it is required to generate a normalized URI now that the normalized URI itself was removed in 0.5. Please make this and other similar functions public for 0.5.

Infinite loop in htp_connp_close()

It seems that calling htp_connp_close() can cause libhtp to go into an infinite loop. The following patch seems to avoid the infinite loop, but also causes a segfault (gdb trace below).

diff --git a/libs/libhtp/htp/htp_request.c b/libs/libhtp/htp/htp_request.c
index d1e9003..339d966 100644
--- a/libs/libhtp/htp/htp_request.c
+++ b/libs/libhtp/htp/htp_request.c
@@ -909,6 +909,9 @@ int htp_connp_req_data(htp_connp_t *connp, const htp_time_t *timestamp, const vo

                 return HTP_STREAM_STOP;
             }
+            else if (rc != HTP_OK) {
+                return rc;
+            }
         }
     }

gdb trace:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6eb138d in htp_connp_res_buffer (connp=connp@entry=0x6fd2e0) at /local/home/nick/devel/Qualys/ib/libs/libhtp/htp/htp_response.c:195
195 if (newlen > connp->out_tx->cfg->field_limit_hard) {
(gdb) where
#0 0x00007ffff6eb138d in htp_connp_res_buffer (connp=connp@entry=0x6fd2e0) at /local/home/nick/devel/Qualys/ib/libs/libhtp/htp/htp_response.c:195
#1 0x00007ffff6eb2587 in htp_connp_res_data (connp=0x6fd2e0, timestamp=, data=, len=)
at /local/home/nick/devel/Qualys/ib/libs/libhtp/htp/htp_response.c:1009
#2 0x00007ffff3191b24 in modhtp_iface_disconnect (pi=0x6f0dc7, iconn=0x700b10) at /local/home/nick/devel/Qualys/ib/modules/modhtp.c:1728
#3 0x00007ffff7993921 in ib_state_notify_conn_closed (ib=0x6ef010, conn=0x700b10) at /local/home/nick/devel/Qualys/ib/engine/state_notify.c:448
#4 0x0000000000410e34 in IronBeeLuaApi::TearDown (this=0x6d1100) at /local/home/nick/devel/Qualys/ib/tests/test_ironbee_lua_api.cpp:177
#5 0x0000000000437b96 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x6d1100,

Do not error out on newlines after response body

Nick Jones writes:

The world's favourite AntiVirus test pattern, at http://www.eicar.org/download/eicar.com (et al), actually returns an invalid HTTP response. It claims that the Content-Length is 68 bytes, but after the 68 bytes of the AV test pattern, it returns an additional two '\n' characters.

This makes libhtp-0.5.6 fail, claiming a 'response without request'.

Strictly speaking, I should limit the input that I feed to libhtp to the exact number of bytes given in the Content-Length field and stop when the server closes the connection (which it does explicitly, while also sending a Connection: close header). But this will not do when requests are pipelined, because the next 'response' would be prefixed with the two '\n' characters.

Is there a standard way for a proxy to treat this extra data? Does an RFC explain it, and is it a phenomenon that libhtp should handle?
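
As a point of reference, RFC 7230 (section 3.5) recommends that a server ignore at least one empty line received before a request-line. A minimal sketch of applying the same leniency to responses follows; treating stray CR/LF bytes before a response as discardable is an assumption, and skip_leading_newlines() is a hypothetical helper, not a LibHTP function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of LibHTP: returns how many leading
 * CR or LF bytes precede the first meaningful byte in data. */
size_t skip_leading_newlines(const unsigned char *data, size_t len) {
    size_t i = 0;

    assert(data != NULL || len == 0);

    while (i < len && (data[i] == '\r' || data[i] == '\n')) {
        i++;
    }
    return i;
}
```

A caller would advance its read position by the returned count before looking for the next status line, so the two trailing '\n' bytes from the example above would be dropped rather than parsed as a new response.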

Extend API to add buffering of connection data as an option

In most cases users can send data to LibHTP as they receive it, but there is at least one case in which we need to see the server's response before we can decide how to continue parsing inbound data. At the moment this complexity must be handled by every user. We should have an API that performs buffering as necessary, so that users do not have to think about it.

Wrong end-of-line encoding for COPYING

I'm packaging libhtp for Fedora, and rpmlint is complaining that the end of line character is invalid, as it is CRLF (the Windows one) rather than just LF.

It's of course completely harmless, but since it's absolutely trivial and given that libhtp seems to be targeted at UNIX-like OSes (at least that's what the OS detection in configure.ac seems to indicate), it would make sense to fix it.

At the moment, I simply run the following in my spec file:
$ dos2unix COPYING

Spurious executable permissions on *.c and *.h files

Some of the source and header files have the executable bits set.

I'm trying to package libhtp for Fedora, and rpmlint (rightfully) complains about it so I'm running the following in the spec file:
$ find . -name '*.c' -exec chmod -x '{}' \;
$ find . -name '*.h' -exec chmod -x '{}' \;

Since Git can keep track of mode changes, it would be nice if you could commit such a change so that every distro packager doesn't have to do it in their own packages. :)

htp_unparse_uri_noencode is in header, but not implemented

htp_unparse_uri_noencode() is declared in the header, but not implemented in the C file. IronBee used this private function (it probably should not have). As a result, the code compiles fine but fails at runtime because the symbol is not found. I suggest keeping the function implemented in 0.5.x, deprecating it, and then removing it in a later version.

Remove header line copying

At the moment we copy individual header lines. We do not have strong use cases that require this functionality (the original motivation was to preserve raw data for forensic purposes), yet this feature significantly decreases performance and increases memory consumption. The one use case we do have is Suricata wanting to inspect raw request headers, but we implement it in a backward way, reassembling individual header lines back into a buffer.

In virtually all cases all request headers will be available to us in a single buffer. Once we identify the beginning and the end of request headers, we can just pass that same buffer to Suricata. We will still have to buffer when request headers arrive in pieces, but that happens only rarely.

Removing header line copying will require re-implementing header parsing.

Request line callback is defined to take three parameters, but the code to actually call the callback only passes one parameter

The request line callback is defined to take three parameters, but the code that actually calls the callback passes only one. This is a problem hidden by the casts to a generic callback prototype.

Callback registration has non-generic three parameter callback prototype:

void htp_config_register_request_line(htp_cfg_t *cfg, int (*callback_fn)(htp_connp_t *, unsigned char *, size_t));

Callback is called with only one parameter using the generic callback prototype (user_data is a htp_connp_t * in this case):

int rc = callback->fn(user_data);
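
In C, calling a function through a pointer whose type does not match the function's definition is undefined behaviour, so the cast only hides the mismatch. One way to keep a single generic hook signature is to bundle the typed arguments into a struct and forward them through an adapter. The sketch below is standalone; every name in it (generic_fn, request_line_args, request_line_adapter, example_request_line_cb) is hypothetical and only mirrors the shape of the problem, not the LibHTP hook code.

```c
#include <assert.h>
#include <stddef.h>

/* Generic hook signature: exactly one opaque argument. */
typedef int (*generic_fn)(void *user_data);

/* Arguments the typed request line callback expects, bundled together. */
typedef struct {
    void *connp;            /* stands in for htp_connp_t * */
    unsigned char *line;
    size_t len;
    int (*typed_fn)(void *connp, unsigned char *line, size_t len);
} request_line_args;

/* Adapter with the generic signature: unpacks the struct and forwards
 * all three arguments to the typed callback. */
int request_line_adapter(void *user_data) {
    request_line_args *args = user_data;

    assert(args != NULL && args->typed_fn != NULL);

    return args->typed_fn(args->connp, args->line, args->len);
}

/* Example typed callback: reports the line length back to the caller. */
int example_request_line_cb(void *connp, unsigned char *line, size_t len) {
    (void)connp;
    (void)line;
    return (int)len;
}
```

The hook runner would store request_line_adapter as a generic_fn and pass a pointer to a filled-in request_line_args, so no function pointer is ever cast to an incompatible type. The simpler alternative is to change the registration prototype so that it matches the single-argument call.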

no interface for htp_time_t

There are no functions for working with the htp_time_t type, as far as I can tell. Also, I don't see any comment near the typedef guaranteeing that the type will always be equivalent to struct timeval.

So what is the correct way to work with values of this type?

Do not iterate over data bytes when processing request and response bodies

We should not waste time iterating over incoming data bytes when we know the size of request and response body data chunks. When C-L is present, we know the size and can just skip the right amount. When chunked encoding is used, we will need to iterate over some bytes, but we can skip as soon as we extract the chunk length and reach the end of the line.
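
A minimal sketch of the skipping step, assuming the number of body bytes still expected is tracked in a counter (taken from Content-Length or from a parsed chunk length). body_consume() is a hypothetical helper, not an existing LibHTP function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the number of body bytes still expected and
 * the number of bytes available in the current buffer, return how many
 * of the available bytes belong to the body and decrease the remaining
 * count accordingly. There is no per-byte loop. */
size_t body_consume(size_t *remaining, size_t available) {
    size_t take;

    assert(remaining != NULL);

    take = (available < *remaining) ? available : *remaining;
    *remaining -= take;
    return take;
}
```

The parser would hand the first `take` bytes of the buffer to the body data callbacks in one call and resume line-oriented parsing at offset `take` once the counter reaches zero.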

0.5.x: Include path needs to include builddir, not just srcdir

Since htp_version.h is now a built source, it is written to the build dir, but the include path only includes the src dir. This causes htp_version.h to not be found for out-of-source dir builds:

/bin/sh ../libtool  --tag=CC   --mode=compile clang-mp-3.2 -DHAVE_CONFIG_H -I. -I../../ironbee/modules -I..  -I/usr/local/include -I/opt/local/include -Wall -Wextra -Werror  -I../../ironbee -I../../ironbee/include -I../include -I../../ironbee/util -I../../ironbee/engine -DMODULE_BASE_PATH=/usr/local/ironbee/lib -DRULE_BASE_PATH=/usr/local/ironbee/lib -Wno-unused-parameter -I../../ironbee/libs/libhtp/htp   -g -O2 -fPIC -I/opt/local/include -I/usr/local/include -MT ibmod_htp_la-modhtp.lo -MD -MP -MF .deps/ibmod_htp_la-modhtp.Tpo -c -o ibmod_htp_la-modhtp.lo `test -f 'modhtp.c' || echo '../../ironbee/modules/'`modhtp.c
libtool: compile:  clang-mp-3.2 -DHAVE_CONFIG_H -I. -I../../ironbee/modules -I.. -I/usr/local/include -I/opt/local/include -Wall -Wextra -Werror -I../../ironbee -I../../ironbee/include -I../include -I../../ironbee/util -I../../ironbee/engine -DMODULE_BASE_PATH=/usr/local/ironbee/lib -DRULE_BASE_PATH=/usr/local/ironbee/lib -Wno-unused-parameter -I../../ironbee/libs/libhtp/htp -g -O2 -fPIC -I/opt/local/include -I/usr/local/include -MT ibmod_htp_la-modhtp.lo -MD -MP -MF .deps/ibmod_htp_la-modhtp.Tpo -c ../../ironbee/modules/modhtp.c  -fno-common -DPIC -o .libs/ibmod_htp_la-modhtp.o
In file included from ../../ironbee/modules/modhtp.c:63:
../../ironbee/libs/libhtp/htp/htp.h:48:10: fatal error: 'htp_version.h' file not found
#include "htp_version.h"
         ^
1 error generated.

And:

clang++-mp-3.2 -DHAVE_CONFIG_H -I. -I../../../../ironbee/libs/libhtp/test -I..  -I../../../../ironbee/libs/libhtp -Wno-write-strings -O2  -g -O2 -MT test_bstr.o -MD -MP -MF .deps/test_bstr.Tpo -c -o test_bstr.o ../../../../ironbee/libs/libhtp/test/test_bstr.cpp
In file included from ../../../../ironbee/libs/libhtp/test/test_bstr.cpp:43:
In file included from ../../../../ironbee/libs/libhtp/htp/htp_private.h:61:
../../../../ironbee/libs/libhtp/htp/htp.h:48:10: fatal error: 'htp_version.h' file not found
#include "htp_version.h"
         ^
1 error generated.

Support Content-Range in responses

Plan:

  • Detect and parse the Content-Range response header.
  • When passing response body data to callbacks, indicate the data offset (within the resource, not within the response body) for each chunk.
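
The first step of the plan can be sketched as follows, assuming only the common "bytes first-last/length" form of the header value. The content_range type and parse_content_range() are hypothetical names; unknown lengths ("*") and the unsatisfied-range form are deliberately rejected here rather than handled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result type for a parsed "bytes first-last/length" value. */
typedef struct {
    unsigned long long first;   /* offset of the first byte in the resource */
    unsigned long long last;    /* offset of the last byte, inclusive */
    unsigned long long length;  /* complete length of the resource */
} content_range;

/* Returns 0 on success, or -1 if the value is not in the simple
 * "bytes first-last/length" form or the numbers are inconsistent. */
int parse_content_range(const char *value, content_range *out) {
    char trailing;

    assert(value != NULL && out != NULL);

    /* The extra %c detects trailing garbage: exactly three conversions
     * must succeed. */
    if (sscanf(value, "bytes %llu-%llu/%llu%c",
               &out->first, &out->last, &out->length, &trailing) != 3) {
        return -1;
    }
    if (out->first > out->last || out->last >= out->length) {
        return -1;
    }
    return 0;
}
```

For the second step, the offset within the resource for a given body chunk would be `first` plus the number of body bytes already delivered for that response.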

Running a Nikto v2.1.4 scan against IronBee results in occasional invalid free's in libhtp htp_tx_destroy

Running a Nikto v2.1.4 scan against IronBee results in occasional invalid free's in libhtp htp_tx_destroy.

./nikto.pl -update
./nikto.pl -host 127.0.0.1 -port 8080

Loaded symbols for /lib64/libgcc_s-4.4.5-20110214.so.1
Core was generated by `/usr/sbin/httpd'.
Program terminated with signal 6, Aborted.
#0  0x00007fb42260a945 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64    return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt full
#0  0x00007fb42260a945 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
        resultvar = 0
        pid = <value optimized out>
        selftid = 1237
#1  0x00007fb42260c125 in abort () at abort.c:92
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x7fff9732fcc8, sa_sigaction = 0x7fff9732fcc8}, sa_mask = {__val = {140735730089136, 140735730118397, 15, 140411648777162, 1, 140411648780821, 3, 140735730089131, 5, 140411648777188, 
              1, 140411648783873, 3, 140735730089140, 12, 140411648783877}}, sa_flags = 2, sa_restorer = 0x7fb42272b605}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007fb42264782b in __libc_message (do_abort=2, fmt=0x7fb42272c7f8 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186
        ap = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 0x7fff97330630, reg_save_area = 0x7fff97330540}}
        ap_copy = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7fff97330630, reg_save_area = 0x7fff97330540}}
        fd = 2
        on_2 = <value optimized out>
        list = <value optimized out>
        nlist = <value optimized out>
        cp = <value optimized out>
        written = <value optimized out>
#3  0x00007fb42264d156 in malloc_printerr (action=3, str=0x7fb42272caf0 "free(): invalid next size (fast)", ptr=<value optimized out>) at malloc.c:6283
        buf = "00007fb428754e80"
        cp = <value optimized out>
#4  0x00007fb419988a92 in htp_tx_destroy (tx=0x7fb428753240) at htp_transaction.c:143
        hl = 0x0
        h = 0x7fb428754e80
#5  0x00007fb416db5a74 in modhtp_htp_response (connp=<value optimized out>) at htp.c:693
        modctx = <value optimized out>
        tx = 0x7fb428753240
        ib = 0x7fb4254cb760
        itx = 0x7fb428753c70
#6  0x00007fb41997ef33 in hook_run_all (hook=0x7fb4287527d0, data=0x7fb42874ac10) at hooks.c:144
        callback = <value optimized out>
#7  0x00007fb41998775c in htp_connp_RES_IDLE (connp=0x7fb42874ac10) at htp_response.c:725
        rc = <value optimized out>
#8  0x00007fb419987229 in htp_connp_res_data (connp=0x7fb42874ac10, timestamp=<value optimized out>, data=<value optimized out>, len=<value optimized out>) at htp_response.c:872
        rc = <value optimized out>
#9  0x00007fb416db4fbb in modhtp_iface_data_out (pi=<value optimized out>, qcdata=0x7fff97330840) at htp.c:950
        ib = 0x7fb4254cb760
        iconn = <value optimized out>
        modctx = 0x7fb428752168
        htp = 0x7fb42874ac10
        rc = IB_OK
        tv = {tv_sec = 1318876855, tv_usec = 855987}
        ec = <value optimized out>
#10 0x00007fb419776677 in process_bucket (f=0x7fb4254f5ce8, b=0x7fb4287078b8) at mod_ironbee.c:225
        c = 0x7fb4254f53a8
        ctx = 0x7fb4254f5c90
        icdata = {ib = 0x7fb4254cb760, mp = 0x7fb42873aca4, conn = 0x7fb428751f10, dalloc = 216, dlen = 216, 
          data = 0x7fb42545fe89 "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /administrateur.php was not found on this server.</p>"...}
        bdata = 0x7fb42545fe89 "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>404 Not Found</title>\n</head><body>\n<h1>Not Found</h1>\n<p>The requested URL /administrateur.php was not found on this server.</p>"...
        nbytes = 216
        rc = <value optimized out>
#11 0x00007fb41977671b in ironbee_output_filter (f=0x7fb4254f5ce8, bb=0x7fb428748af8) at mod_ironbee.c:709
        b = 0x7fb4287078b8
#12 0x00007fb4240d248b in ap_http_header_filter (f=0x7fb4287436f0, b=0x7fb428748af8) at /usr/src/debug/httpd-2.2.15/modules/http/http_filters.c:1292
---Type <return> to continue, or q <return> to quit---
        r = 0x7fb4287420e8
        c = 0x7fb4254f53a8
        clheader = <value optimized out>
        protocol = 0x7fb4240d9f82 "HTTP/1.1"
        e = <value optimized out>
        b2 = <value optimized out>
        h = {pool = 0x7fb428742068, bb = 0x7fb428756520}
        ctx = <value optimized out>
        ctype = <value optimized out>
        eb = <value optimized out>
#13 0x00007fb4240b6d90 in ap_content_length_filter (f=0x7fb4287436c8, b=0x7fb428748af8) at /usr/src/debug/httpd-2.2.15/server/protocol.c:1335
        r = 0x7fb4287420e8
        ctx = 0x7fb428756490
        e = <value optimized out>
        eos = <value optimized out>
        eblock = <value optimized out>
#14 0x00007fb4240d41a1 in ap_byterange_filter (f=0x7fb4287436a0, bb=0x7fb428748af8) at /usr/src/debug/httpd-2.2.15/modules/http/byterange_filter.c:296
        r = 0x7fb4287420e8
        c = 0x7fb4254f53a8
        e = <value optimized out>
        bsend = <value optimized out>
        tmpbb = <value optimized out>
        range_start = <value optimized out>
        range_end = <value optimized out>
        clength = <value optimized out>
        rv = <value optimized out>
        found = 0
        num_ranges = 32692
        boundary = 0x0
        bound_head = 0x0
        indexes = <value optimized out>
        idx = <value optimized out>
        original_status = 404
        i = <value optimized out>
#15 0x00007fb41a9bf9b9 in ap_proxy_http_process_response (p=0x7fb428742068, r=<value optimized out>, backend=0x7fb425459d58, origin=0x7fb42545bf78, conf=0x7fb4254812e0, server_portstr=0x7fff97334e20 ":8080")
    at /usr/src/debug/httpd-2.2.15/modules/proxy/mod_proxy_http.c:1824
        readbytes = 216
        rv = <value optimized out>
        mode = APR_NONBLOCK_READ
        finish = <value optimized out>
        rc = <value optimized out>
        c = 0x7fb4254f53a8
        buffer = "\000\nntent-Type\000 text/html; charset=iso-8859-1\000\n PHP/5.3.2-1ubuntu4.5 with Suhosin-Patch mod_python/3.3.1 Python/2.6.5 mod_perl/2.0.4 Perl/v5.10.1\000\nMon Oct 17 13:40:55 2011] [debug] proxy_util.c(1506): ["...
        buf = <value optimized out>
        keepchar = <value optimized out>
        rp = 0x7fb428748b18
        e = <value optimized out>
        bb = 0x7fb428748ad8
        tmp_bb = 0x7fb4287559a8
        pass_bb = 0x7fb428748af8
        len = 22
        backasswards = <value optimized out>
        interim_response = 0
        pread_len = <value optimized out>
        save_table = <value optimized out>
---Type <return> to continue, or q <return> to quit---
        backend_broke = 0
        hop_by_hop_hdrs = {0x7fb41a9c25e1 "Keep-Alive", 0x7fb41a9c2959 "Proxy-Authenticate", 0x7fb41a9c2837 "TE", 0x7fb41a9c283a "Trailer", 0x7fb41a9c2842 "Upgrade", 0x0}
        i = <value optimized out>
        te = 0x0
        original_status = 200
        proxy_status = 404
        original_status_line = 0x0
        proxy_status_line = <value optimized out>
#16 0x00007fb41a9c0df7 in proxy_http_handler (r=0x7fb4287420e8, worker=<value optimized out>, conf=0x7fff97334e08, url=0x7fb428748400 "/administrateur.php", proxyname=0x7fb4254ff188 "(1O%\264\177", proxyport=49016)
    at /usr/src/debug/httpd-2.2.15/modules/proxy/mod_proxy_http.c:2025
        status = <value optimized out>
        server_portstr = ":8080\000\000\000\020O3\227\377\177\000\000@N3\227\377\177\000\000\345\377\375\032\264\177\000"
        scheme = <value optimized out>
        proxy_function = 0x7fb41a9c275e "HTTP"
        u = <value optimized out>
        backend = 0x7fb425459d58
        is_ssl = 678724632
        c = 0x7fb4254f53a8
        p = 0x7fb428742068
        uri = <value optimized out>
#17 0x00007fb41afd8f6a in proxy_run_scheme_handler (r=0x7fb4287420e8, worker=0x7fb4254814b8, conf=0x7fb4254812e0, url=0x7fb4287482de "http://192.168.4.101/administrateur.php", proxyhost=0x0, proxyport=0)
    at /usr/src/debug/httpd-2.2.15/modules/proxy/mod_proxy.c:2380
        pHook = <value optimized out>
        n = <value optimized out>
        rv = <value optimized out>
#18 0x00007fb41afdd525 in proxy_handler (r=0x7fb4287420e8) at /usr/src/debug/httpd-2.2.15/modules/proxy/mod_proxy.c:996
        url = 0x7fb4287482de "http://192.168.4.101/administrateur.php"
        uri = 0x7fb4287482de "http://192.168.4.101/administrateur.php"
        scheme = 0x7fb428748308 "http"
        p = 0x7fff97334f88 "k\351\t0[\200\275\027ނt(\264\177"
        p2 = <value optimized out>
        sconf = <value optimized out>
        conf = 0x7fb4254812e0
        proxies = 0x7fb4254813b0
        ents = 0x7fb4253ad830
        i = <value optimized out>
        access_status = 0
        direct_connect = <value optimized out>
        str = <value optimized out>
        maxfwd = <value optimized out>
        balancer = 0x0
        worker = 0x7fb4254814b8
        attempts = 0
        max_attempts = 0
        list = <value optimized out>
#19 0x00007fb4240c28c0 in ap_run_handler (r=0x7fb4287420e8) at /usr/src/debug/httpd-2.2.15/server/config.c:158
        pHook = <value optimized out>
        n = <value optimized out>
        rv = <value optimized out>
#20 0x00007fb4240c617e in ap_invoke_handler (r=0x7fb4287420e8) at /usr/src/debug/httpd-2.2.15/server/config.c:376
        handler = <value optimized out>
        p = <value optimized out>
        result = 0
        old_handler = 0x7fb41afe43fe "proxy-server"
        ignore = <value optimized out>
#21 0x00007fb4240d17e0 in ap_process_request (r=0x7fb4287420e8) at /usr/src/debug/httpd-2.2.15/modules/http/http_request.c:282
---Type <return> to continue, or q <return> to quit---
        access_status = <value optimized out>
#22 0x00007fb4240ce6a8 in ap_process_http_connection (c=0x7fb4254f53a8) at /usr/src/debug/httpd-2.2.15/modules/http/http_core.c:190
        r = 0x7fb4287420e8
        csd = 0x0
#23 0x00007fb4240ca3d8 in ap_run_process_connection (c=0x7fb4254f53a8) at /usr/src/debug/httpd-2.2.15/server/connection.c:43
        pHook = <value optimized out>
        n = <value optimized out>
        rv = <value optimized out>
#24 0x00007fb4240d66f7 in child_main (child_num_arg=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:667
        current_conn = <value optimized out>
        csd = 0x7fb4254f51b8
        ptrans = 0x7fb4254f5138
        allocator = 0x7fb4254f3030
        status = <value optimized out>
        i = <value optimized out>
        lr = <value optimized out>
        pollset = 0x7fb4254f3358
        sbh = 0x7fb4254f3350
        bucket_alloc = 0x7fb4254ff188
        last_poll_idx = 1
#25 0x00007fb4240d6a0a in make_child (s=0x7fb425347860, slot=4) at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:763
        pid = 0
#26 0x00007fb4240d6d3b in startup_children (_pconf=<value optimized out>, plog=<value optimized out>, s=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:781
        i = <value optimized out>
#27 ap_mpm_run (_pconf=<value optimized out>, plog=<value optimized out>, s=<value optimized out>) at /usr/src/debug/httpd-2.2.15/server/mpm/prefork/prefork.c:1002
        index = <value optimized out>
        remaining_children_to_start = <value optimized out>
        rv = <value optimized out>
#28 0x00007fb4240ae840 in main (argc=1, argv=0x7fff973354e8) at /usr/src/debug/httpd-2.2.15/server/main.c:740
        c = 0 '\000'
        configtestonly = <value optimized out>
        confname = 0x7fb4240d8f44 "conf/httpd.conf"
        def_server_root = 0x7fb4240d8f39 "/etc/httpd"
        temp_error_log = 0x0
        error = <value optimized out>
        process = 0x7fb425347860
        server_conf = 0x7fb425347860
        pglobal = 0x7fb42533e128
        pconf = 0x7fb425340138
        plog = 0x7fb4253722c8
        ptemp = 0x7fb425344158
        pcommands = 0x7fb425342148
        opt = 0x7fb425342240
        rv = <value optimized out>
        mod = <value optimized out>
        optarg = 0x7fb42533a250 ""
        signal_server = <value optimized out>

Add support for the WebSocket protocol

WebSocket is implemented as an upgrade over HTTP. For example:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 7

And the response could be:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat

For more details, see http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-07

Implement full streaming parsing

Request and response body processing is currently fully streaming, meaning data callbacks are called as soon as data is available, and there's no buffering. However, there is buffering when it comes to request/response line and header processing.

In most cases, there's no practical difference between the two approaches, because request headers arrive in a single buffer. However, an enterprising attacker could split request data into small packets and sneak by an IPS that relies on buffering, which will see the attack only once all the headers are assembled. By that time it will be too late.

Inspection of buffered data is easy to implement because the desired chunk of data is available at once. Implementing streaming increases the burden on the LibHTP user. Further, it is questionable whether it is possible to prevent such fragmented attacks reliably. In the worst case, for example, the attacker could be using one byte per packet and several hundred packets for all request headers.

We need to interview Suricata developers to determine if we should pursue full streaming.

0.5.x: errors in htp_connp_close during CONNECT transactions

The transaction state model doesn't seem to be able to handle either client or server closures and the call to htp_connp_close during CONNECT transactions.

When htp_connp_close is called, the notion of HTP_STREAM_CLOSED gets lost as final calls to htp_connp_req_data and htp_connp_res_data are made with NULL data.

The particular error I am seeing is like this:

  1. htp_connp_res_data(..., NULL, 0) called
  2. connp->out_state is htp_connp_RES_FINALIZE, which gets called and in turn calls htp_tx_state_response_complete_ex
  3. the response body data callback may be called with (NULL, 0), and the response complete callback may also be called; htp_tx_finalize is then called, which destroys the transaction if auto_destroy is enabled, and connp->out_tx is set to NULL
  4. state is set to htp_connp_RES_IDLE and HTP_OK is returned
  5. htp_connp_RES_IDLE is called but because input buffer is empty, HTP_DATA is returned
  6. because HTP_DATA was the last return value, htp_connp_res_receiver_send_data is called, which calls the most recent data receiver hook, which in most cases for me is the response_header_data hook.
  7. the hook gets invoked but it is meaningless because connp->out_tx has already been nullified.

It seems that the case of HTP_STREAM_CLOSED needs to be handled with special care in htp_connp_req/res_data to avoid the state model simply going on as if it's business as usual.

To prepare for this fix, I suggest the following changes:

diff -ur libhtp-0.5.4.orig/htp/htp_response.c libhtp-0.5.4/htp/htp_response.c
--- libhtp-0.5.4.orig/htp/htp_response.c    2013-07-22 09:53:21.393720488 +0000
+++ libhtp-0.5.4/htp/htp_response.c 2013-07-22 14:16:38.871365447 +0000
@@ -471,7 +471,11 @@
                 && (connp->out_tx->response_status_number <= 299)) {
             // This is a successful CONNECT stream, which means
             // we need to switch into tunneling mode.
+            connp->out_tx->response_progress = HTP_RESPONSE_COMPLETE;
+            connp->out_tx->response_transfer_coding = HTP_CODING_NO_BODY;
+
             connp->in_status = HTP_STREAM_TUNNEL;
+            connp->in_state = htp_connp_REQ_FINALIZE;
             connp->out_status = HTP_STREAM_TUNNEL;
             connp->out_state = htp_connp_RES_FINALIZE;
             return HTP_OK;

But I am unsure how to approach the rest.

The HTP_AMBIGUOUS_HOST flag is not very useful

The HTP_AMBIGUOUS_HOST is raised when hostname information appears on the request line and in the Host request header, even if both hostnames are the same. As is, the flag is not very useful. It should be raised only when the hostnames are different.
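
A minimal sketch of the proposed check, assuming hostnames are compared ignoring ASCII case and that ports have already been split off. EXAMPLE_AMBIGUOUS_HOST, hostnames_differ() and host_flags() are hypothetical names used for illustration only; they are not the LibHTP flag or functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the real flag value. */
#define EXAMPLE_AMBIGUOUS_HOST 0x01u

/* Returns 1 if the two hostnames differ when ASCII case is ignored. */
int hostnames_differ(const char *a, const char *b) {
    assert(a != NULL && b != NULL);

    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 1;
        }
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a != *b;
}

/* Raise the flag only when both sources supply a hostname and the two
 * hostnames are actually different. */
unsigned int host_flags(const char *uri_host, const char *header_host) {
    if (uri_host == NULL || header_host == NULL) {
        return 0;
    }
    return hostnames_differ(uri_host, header_host) ? EXAMPLE_AMBIGUOUS_HOST : 0;
}
```

Trailing dots and other normalization questions are left out of this sketch.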

Incomplete uninstall on OpenBSD

On libhtp-0.5.x, the uninstall Makefile target is incomplete on OpenBSD:

# make uninstall
Making uninstall in htp
 /bin/sh ../libtool   --mode=uninstall rm -f '/usr/local/lib/libhtp.la'

I've tried different versions of autotools on OpenBSD with comparable results. The .so files remain on the filesystem.

It works well on Linux, where the libhtp*.so files are removed.

Add option to buffer request and response bodies

In simpler use cases users would find it more convenient to access request and response bodies as continuous buffers. At the moment we only offer chunked delivery of data via callbacks, meaning users must manually assemble data.

Callbacks invoked twice for the same transaction

Ivan,

There appears to be a bug in which the request complete callback (and possibly others) gets called twice; once at the correct time, but again during htp_connp_close().

I'll send you the patch out-of-band (because it's a pain to upload patches to GitHub).

-Nick

Use dynamic in_line and out_line buffers to reduce per-connection memory consumption

At the moment, the HTTP connection parser will allocate two buffers of HTP_HEADER_LIMIT_HARD each (18,000 bytes), which means that we use at least 36 KB per connection. The buffer size is configurable (via the field_limit_hard configuration option), but we should nevertheless investigate using buffers that grow on demand. Most request and response lines are nowhere near 18 KB.
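
A grow-on-demand buffer could look roughly like the sketch below. It assumes a doubling growth policy with an initial capacity of 256 bytes, capped at the configured hard limit; the line_buf type and its functions are hypothetical and do not reflect how LibHTP structures its in_line and out_line buffers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable line buffer with a hard upper bound. */
typedef struct {
    unsigned char *data;
    size_t len;          /* bytes currently stored */
    size_t cap;          /* bytes currently allocated */
    size_t hard_limit;   /* never grow beyond this many bytes */
} line_buf;

void line_buf_init(line_buf *b, size_t hard_limit) {
    assert(b != NULL);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
    b->hard_limit = hard_limit;
}

/* Appends n bytes. Returns 0 on success, or -1 if the hard limit would
 * be exceeded or the allocation fails (the buffer is left unchanged). */
int line_buf_append(line_buf *b, const unsigned char *src, size_t n) {
    assert(b != NULL && (src != NULL || n == 0));

    if (n > b->hard_limit - b->len) {
        return -1;
    }
    if (b->len + n > b->cap) {
        size_t new_cap = b->cap ? b->cap : 256;
        unsigned char *p;

        while (new_cap < b->len + n) {
            new_cap *= 2;
        }
        if (new_cap > b->hard_limit) {
            new_cap = b->hard_limit;
        }
        p = realloc(b->data, new_cap);
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->cap = new_cap;
    }
    if (n > 0) {
        memcpy(b->data + b->len, src, n);
    }
    b->len += n;
    return 0;
}

void line_buf_free(line_buf *b) {
    assert(b != NULL);
    free(b->data);
    line_buf_init(b, b->hard_limit);
}
```

With this policy a connection that only ever sees short lines allocates 256 bytes per direction instead of the full 18,000, at the cost of occasional reallocations for long lines.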

Remove the ability to extract uploaded files to the filesystem

LibHTP is currently able to extract files transported via multipart/form-data and PUT request bodies; such files are stored on the filesystem. But, because the feature is implemented using blocking I/O, it is not appropriate for use in event-driven programs.

Handle data after single HTTP/0.9 request

We currently continue to parse extra data (and treat it as further requests), but keep-alives were not supported back then. Apache, in particular, will ignore all extra data. This may be a bug or a personality feature, depending on what other web servers do.

New release

Is there going to be a new release? The last release was 7 months ago and that branch is woefully behind master. I maintain the FreeBSD port of libhtp and tagging releases makes it much easier on me to keep the port updated.

Implement header data callbacks

We need new header data callbacks to use when library users are interested in seeing raw header data. There are 2 use cases that we need to handle.

In the first use case, the user cares only about complete (potentially buffered) chunks of data. Normally, all request headers arrive at once in a single buffer, and we can pass the buffer directly to the user. When fragmentation occurs, we will be forced to buffer. But, even in this use case, the callback can potentially be invoked twice. Once for the first header batch, and then the second time if trailing headers are included.

A further complication is if the user is expecting LibHTP to pre-process headers, reassembling the folded ones, and combining headers with the same names. We have a similar feature today.

In the second use case, we send raw header data to the user as we get it ourselves. This is easy to implement, and allows for fully streaming inspection. See #27 for a further discussion.

Stream closure might leave some data unprocessed

The parts of the code that work with lines (e.g., request line, headers, response line) might not correctly handle stream closures. In most cases this will be fine, but there's at least one where it is not: when the first line of the response is not a valid status line, browsers treat that line as response body. LibHTP does that too, but not correctly at the moment if the response does not have a single newline.

Need a higher resolution timestamp

Timestamps are currently defined as htp_time_t (uint32_t), which only has second resolution. Timestamps should be defined with at least microsecond resolution ("struct timeval" or similar).

Build failure on FreeBSD

Trying to build the tip of master on FreeBSD 9.0 results in:

cc1: warnings being treated as errors
htp_content_handlers.c: In function 'htp_ch_urlencoded_callback_request_body_data':
htp_content_handlers.c:67: warning: dereferencing type-punned pointer will break strict-aliasing rules
htp_content_handlers.c: In function 'htp_ch_urlencoded_callback_request_line':
htp_content_handlers.c:152: warning: dereferencing type-punned pointer will break strict-aliasing rules

My attempts to understand this code have failed so I am not able to provide a patch I am confident is doing the right thing. There are about a dozen different places where this happens.

I can use -fno-strict-aliasing when building but there is one other problem:

cc1: warnings being treated as errors
htp_transcode.c: In function 'htp_transcode_bstr':
htp_transcode.c:163: warning: passing argument 2 of 'libiconv' from incompatible pointer type

Unfortunately I'm not going to have much time to devote to this for a few weeks but I can provide access to a FreeBSD machine which exhibits this behavior if necessary.
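
The code at the reported lines is not shown here, so the following is only a guess at the pattern involved. A common trigger for the "dereferencing type-punned pointer" warning is passing something like (void **)&typed_pointer to a function that writes through a void **. An aliasing-safe alternative is to receive the value into a real void * and convert afterwards. All names in this sketch (widget, table_get, get_widget) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload type. */
typedef struct {
    int id;
} widget;

/* Hypothetical lookup that hands back a generic pointer via void **.
 * Returns 1 if a value was found, 0 otherwise. */
int table_get(void *stored, void **out) {
    assert(out != NULL);
    *out = stored;
    return stored != NULL;
}

/* Aliasing-safe caller: receive into a void * temporary, then convert,
 * instead of passing (void **)&w for some widget *w. */
widget *get_widget(void *stored) {
    void *tmp = NULL;

    if (!table_get(stored, &tmp)) {
        return NULL;
    }
    return tmp;   /* the void * to widget * conversion is well defined */
}
```

This avoids the need for -fno-strict-aliasing in the places where the pattern applies, although whether it matches the code in htp_content_handlers.c would need to be checked against the source.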

README build instructions omit autoreconf

While attempting to RPM-ify both the 0.2.x branch (0.2.6) and master (46c41074), I noticed that the README refers to ./configure, which does not exist in the source distribution. Instead, autoreconf -i is needed first; after that, the README instructions are accurate.
