apachelogs's Introduction

Project Status: Active. The project has reached a stable, usable state and is being actively developed. License: MIT.

GitHub | PyPI | Documentation | Issues | Changelog

apachelogs parses Apache access log files. Pass it a log format string and get back a parser for logfile entries in that format. apachelogs even takes care of decoding escape sequences and converting things like timestamps, integers, and bare hyphens to datetime values, ints, and Nones.

Installation

apachelogs requires Python 3.8 or higher. Just use pip for Python 3 (You have pip, right?) to install apachelogs and its dependencies:

python3 -m pip install apachelogs

Examples

Parse a single log entry:

>>> from apachelogs import LogParser
>>> parser = LogParser("%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"")
>>> # The above log format is also available as the constant apachelogs.COMBINED.
>>> entry = parser.parse('209.126.136.4 - - [01/Nov/2017:07:28:29 +0000] "GET / HTTP/1.1" 301 521 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36"\n')
>>> entry.remote_host
'209.126.136.4'
>>> entry.request_time
datetime.datetime(2017, 11, 1, 7, 28, 29, tzinfo=datetime.timezone.utc)
>>> entry.request_line
'GET / HTTP/1.1'
>>> entry.final_status
301
>>> entry.bytes_sent
521
>>> entry.headers_in["Referer"] is None
True
>>> entry.headers_in["User-Agent"]
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36'
>>> # Log entry components can also be looked up by directive:
>>> entry.directives["%r"]
'GET / HTTP/1.1'
>>> entry.directives["%>s"]
301
>>> entry.directives["%t"]
datetime.datetime(2017, 11, 1, 7, 28, 29, tzinfo=datetime.timezone.utc)

Parse a file full of log entries:

>>> with open('/var/log/apache2/access.log') as fp:  # doctest: +SKIP
...     for entry in parser.parse_lines(fp):
...         print(str(entry.request_time), entry.request_line)
...
2019-01-01 12:34:56-05:00 GET / HTTP/1.1
2019-01-01 12:34:57-05:00 GET /favicon.ico HTTP/1.1
2019-01-01 12:34:57-05:00 GET /styles.css HTTP/1.1
# etc.

apachelogs's People

Contributors

chosak, dependabot[bot], jwodder


apachelogs's Issues

Timestamps not enclosed by brackets not parsing

Anonymised extract from my log files:

10.0.0.0 - anonymous 01/May/2022:07:27:52 +1000 "GET /some/uri/page.html HTTP/1.1" 200 238734 "-" "UserAgent/String"

Using the COMBINED format string, the parser is unable to process the above. It works perfectly when the timestamp is enclosed in square brackets, like so:

10.0.0.0 - anonymous [01/May/2022:07:27:52 +1000] "GET /some/uri/page.html HTTP/1.1" 200 238734 "-" "UserAgent/String"

Not sure whether this is a bug or a feature.
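
For context, Apache's plain %t directive writes the request time inside square brackets, so a format string containing %t (as COMBINED does) expects them. One possible workaround, sketched below under that assumption, is to wrap a bare timestamp in brackets before handing the line to the parser. The helper name bracket_timestamp is hypothetical and not part of apachelogs. If the server was configured with a custom %{format}t directive, passing that same format string to LogParser may be the cleaner route; that has not been verified here.

```python
import re

# Matches a timestamp such as 01/May/2022:07:27:52 +1000 that is not
# already wrapped in square brackets (assumed layout; adjust if yours differs).
_BARE_TIME_RE = re.compile(
    r"(?<!\[)\b(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})(?!\])"
)


def bracket_timestamp(line: str) -> str:
    """Wrap the first unbracketed timestamp in square brackets.

    Lines whose timestamp is already bracketed are returned unchanged.
    """
    return _BARE_TIME_RE.sub(r"[\1]", line, count=1)
```

Each line would be passed through bracket_timestamp() before being given to parser.parse().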

directive %r containing \n

Hi,
Thanks for your parser, it is working great.
Please look at this access log entry, where "%r" contains \n. I get "InvalidEntryError: Could not match log entry..." for the following line:
[23/Jul/2020:11:21:48 +0100] 66.240.192.138 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "\n" 226

LogParser('%t %h %{tls}x %{encr}x \"%r\" %b')

Is there something wrong with my regexp or with the way apachelogs handles \n?
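
As background, the two characters \n inside a quoted field are Apache's escaped form of a newline byte: per the mod_log_config documentation, " and \ are written with a leading backslash, whitespace control characters are written C-style, and other non-printable bytes appear as \xhh. The sketch below shows roughly what decoding those escapes involves, using only the standard library. decode_log_escapes is a hypothetical helper, not apachelogs' own implementation, and it neither explains nor fixes the InvalidEntryError reported above.

```python
import re

# C-style escapes assumed to be written by Apache, plus \" and \\.
_SIMPLE_ESCAPES = {
    "b": "\b", "n": "\n", "r": "\r", "t": "\t", "v": "\v",
    '"': '"', "\\": "\\",
}
# A backslash followed by either xhh (two hex digits) or one known escape char.
_ESCAPE_RE = re.compile(r'\\(x[0-9A-Fa-f]{2}|[bnrtv"\\])')


def decode_log_escapes(field: str) -> str:
    """Turn escaped sequences in a logged field back into characters.

    Simplification: each \\xhh becomes the code point 0-255 directly,
    so multi-byte UTF-8 sequences are not reassembled.
    """
    def _replace(match):
        token = match.group(1)
        if token.startswith("x"):
            return chr(int(token[1:], 16))
        return _SIMPLE_ESCAPES[token]

    return _ESCAPE_RE.sub(_replace, field)
```

Note that in Python source, the logged text should be written as a raw string (for example r'"\n"') so that Python does not turn \n into a real newline before the parser sees it.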

Parsing error logs

Hi,

Great job with this package, used it to quickly parse large volumes of logs using Dask and it worked nicely out of the box.

I am now trying to parse error logs:

[Mon Feb 28 05:53:37.614199 2022] [php7:warn] [pid 527] [client 172.68.189.10:44562] PHP Warning:  Illegal string offset 'strictly_necessary' in /home/admin/web/best-eq.com/public_html/wp-content/themes/iqtest/page-templates/gdpr-consent.php on line 85, referer: https://www.best-eq.com/?gclid=EAIaIQobChMI9Mu077-h9gIVBZSzCh2dLwRKEAAYASAAEgLNnvD_BwE

Any tips on these?
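
The introduction describes apachelogs as a parser for access logs, and nothing here says it handles error logs. For lines shaped like the one above, a regular expression may be enough. The following is a minimal sketch using only the standard library; it assumes the bracketed layout seen in the sample ([time] [module:level] [pid N] [client addr] message), which depends on the server's ErrorLogFormat. The names ERROR_LINE_RE and parse_error_line are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: [time] [module:level] [pid N(:tid M)] [client addr] message
ERROR_LINE_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<module>[^:\]]*):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+)(?::tid (?P<tid>\d+))?\] "
    r"(?:\[client (?P<client>[^\]]+)\] )?"
    r"(?P<message>.*)"
)


def parse_error_line(line: str):
    """Return a dict of fields for one error log line, or None if it does not match."""
    match = ERROR_LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    fields = match.groupdict()
    # %a and %b are locale-dependent; this assumes English day and month names.
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%a %b %d %H:%M:%S.%f %Y"
    )
    fields["pid"] = int(fields["pid"])
    return fields
```

The client field keeps the address and port together (for example 172.68.189.10:44562), and anything after the bracketed prefixes, including a trailing "referer: ..." part, stays in message for further processing.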
