sequencer's People

Contributors

charl, leolee192, zhenjl

Forkers

myts2

sequencer's Issues

Sequences with URI's are not matched correctly [zentures/sequence#14]

@Leftium opened zentures/sequence#14 and commented

Steps to Reproduce:

1. `echo "get http://example.com" > input.txt`

2. `go run sequence.go analyze --input input.txt --output patterns.txt`

3. `go run sequence.go parse --input input.txt --patterns patterns.txt`

Expected Results:

Message is parsed and there is no error.

Actual Results:

2016/05/12 12:18:40 Error (sequence: no pattern matched for this message) parsing: get http://example.com
2016/05/12 12:18:40 Parsed 1 messages in 0.00 secs, ~ 999.90 msgs/sec
2016/05/12 12:18:40 Quiting...

Comments:

There is no error and the results are correct if the URI is removed. I think the URI fails to match because the scanner says it is type uri but the patterns file is looking for %object%.

I was unable to confirm this because %uri% is not accepted in a patterns file (Invalid tag token "%uri%": unknown type). And I have not figured out how to prevent the URI from being tagged as an object.

Also note this bug causes the analyze command to report the incorrect number of new patterns.


No further details from zentures/sequence#14

URI's starting with "//" are not tokenized correctly [zentures/sequence#15]

@Leftium opened zentures/sequence#15 and commented

Steps to Reproduce:

1. `echo "get //example.com" > input.txt`

2. `go run sequence.go scan --input input.txt`

Expected Results:

#   0: { Tag="funknown", Type="uri", Value="//example.com", ... }

Actual Results:

#   0: { Tag="funknown", Type="literal", Value="//example.com", ... }

Comments:

I found this bug while processing an actual log file. One of the log events in question:

81.181.146.13 - - [15/Mar/2005:05:06:49 -0500] "GET //cgi-bin/awstats/awstats.pl?configdir=|%20id%20| HTTP/1.1" 404 1050 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"

A related question: what is the best way to handle relative URIs? Sequence's heuristic algorithm for processing URIs breaks down on these...


No further details from zentures/sequence#15
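
For the "//" case reported above, one possible way to recognise a scheme-less reference is to lean on Go's standard `net/url` parser rather than a hand-written heuristic. The sketch below is not sequence's scanner code; `isNetworkPathRef` is a hypothetical helper name, and treating "no scheme but a non-empty host" as the test for a URI is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// isNetworkPathRef is a hypothetical helper (not part of sequence).
// It reports whether s parses as a reference that has no scheme but
// does have an authority component, i.e. the "//host/..." form.
func isNetworkPathRef(s string) bool {
	u, err := url.Parse(s)
	if err != nil {
		return false
	}
	return u.Scheme == "" && u.Host != ""
}

func main() {
	// Print the classification for a few sample tokens.
	for _, s := range []string{"//example.com", "http://example.com", "/index.html", "example.com"} {
		fmt.Printf("%-22q network-path reference: %v\n", s, isNetworkPathRef(s))
	}
}
```

Under this reading the token from the log event, `//cgi-bin/awstats/awstats.pl?...`, would be parsed with `cgi-bin` as the host, even though in an HTTP request line it is more plausibly a path. That ambiguity is part of why relative URIs are hard to classify without context.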

'|' (pipe character) causes error during analyze [zentures/sequence#16]

@Leftium opened zentures/sequence#16 and commented

Steps to Reproduce:

1. `echo "t=|" > input.txt`

2. `go run sequence.go analyze --input input.txt`

Expected Results:

2016/05/12 13:13:56 Analyzed 1 messages, found 1 unique patterns, 1 are new.

(No error and message is analyzed.)

Actual Results:

2016/05/12 13:13:56 Error analyzing: t=|
2016/05/12 13:13:56 Analyzed 1 messages, found 0 unique patterns, 0 are new.

Comments:
I think something is going wrong with the heuristics for key=value pairs. I found this bug while processing an actual log file. One of the log events in question:

81.181.146.13 - - [15/Mar/2005:05:06:49 -0500] "GET //cgi-bin/awstats/awstats.pl?configdir=|%20id%20| HTTP/1.1" 404 1050 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"

@antham referenced this issue on 21 May 2016
Handle pipe #17


@jobordu commented

This might be related: I hit an error that may be caused by the '|' character:

2017/04/18 12:13:32 Error analyzing:
2017-04-11T16:49:28.551 Something is wrong with Attachment Browser Plugin, no working directory found: QDir( "C:/SpecifX_stateFiles/V2/femoralUnWrapNoManageLink3_pv_attachments" , nameFilters = { "*" }, QDir::SortFlags( Name | IgnoreCase ) , QDir::Filters( Dirs|Files|Drives|AllEntries ) ) (:0, )


No further details from zentures/sequence#16

Been working on sequence for the last 2 months - would love to discuss [zentures/sequence#22]

@louiseruthharding opened zentures/sequence#22 and commented

Hi,

I have been working with sequence for the last two months, extending it to output its patterns in syslog-ng patterndb and Logstash grok formats. I have had to make a few changes to the sequence code, mainly remembering where the spaces are and adding a database so that patterns can be printed on demand rather than after each analysis, among other things.
It is in a company repo for now, but the goal is to make it available to the open source community.
I would love to discuss this with you.
https://www.linkedin.com/in/louise-harding-3b964551/

Regards
Louise

unit test failed in analyzer_test.go

>> go test
# github.com/strace/sequence
./analyzer_test.go:188:59: Sprintf format %s has arg len(atree.levels[l]) of wrong type int
FAIL    github.com/strace/sequence [build failed]
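
The message above comes from the vet checks that `go test` runs before building the test binary: an `int` argument (`len(...)`) is paired with the `%s` verb. The actual line in `analyzer_test.go` is not shown here, so the following is only a minimal sketch of the same kind of fix; `describeLevel` and its arguments are hypothetical stand-ins for the real test code.

```go
package main

import "fmt"

// describeLevel is a hypothetical helper, not code from the repository.
// The node count is an int, so it is formatted with %d. Writing %s for
// that argument is the mismatch the vet printf check reports.
func describeLevel(level int, nodes []string) string {
	return fmt.Sprintf("level %d has %d nodes", level, len(nodes))
}

func main() {
	fmt.Println(describeLevel(2, []string{"a", "b", "c"}))
}
```

If the surrounding format string has to keep `%s`, converting the count with `strconv.Itoa` first would be another option.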

wake up the project

The project originated from zentures/sequence, which has been inactive since 2017. After trying for weeks to contact the original author without success, I decided to restart the project here.

Logstash integration

Write a Logstash integration that parses unstructured log entries into fields, as a replacement for the "grok" filter, to achieve higher throughput.
