leolee192 / sequencer
High performance sequential log analyzer and parser
License: Apache License 2.0
@faxm0dem opened zentures/sequence#20 and commented
Hi,
I'd like to discuss the possibility to write some kind of patterndb integration. I was thinking about a program that would generate the syslog-ng db from a sequence analyzer output.
What are your thoughts/ideas/comments on that?
No further details from zentures/sequence#20
@Leftium opened zentures/sequence#14 and commented
Steps to Reproduce:
1. `echo "get http://example.com" > input.txt`
2. `go run sequence.go analyze --input input.txt --output patterns.txt`
3. `go run sequence.go parse --input input.txt --patterns patterns.txt`
Expected Results:
Message is parsed and there is no error.
Actual Results:
2016/05/12 12:18:40 Error (sequence: no pattern matched for this message) parsing: get http://example.com
2016/05/12 12:18:40 Parsed 1 messages in 0.00 secs, ~ 999.90 msgs/sec
2016/05/12 12:18:40 Quiting...
Comments:
There is no error and the results are correct if the URI is removed. I think the URI fails to match because the scanner says it is type `uri` but the patterns file is looking for `%object%`.
I was unable to confirm this because `%uri%` is not accepted in a patterns file (`Invalid tag token "%uri%": unknown type`). And I have not figured out how to prevent the URI from being tagged as an object.
Also note this bug causes the `analyze` command to report the incorrect number of new patterns.
No further details from zentures/sequence#14
@Leftium opened zentures/sequence#13 and commented
Steps to reproduce:
1. `go run sequence.go analyze -i input.txt`
Expected output:
%action% %object%
#1 log messages matched
# get http://example.com
2016/05/12 11:27:43 Analyzed 1 messages, found 1 unique patterns, 1 are new.
Actual output:
2016/05/12 11:27:43 Analyzed 1 messages, found 1 unique patterns, 1 are new.
No further details from zentures/sequence#13
@Leftium opened zentures/sequence#15 and commented
Steps to Reproduce:
1. `echo "get //example.com" > input.txt`
2. `go run sequence.go scan --input input.txt`
Expected Results:
# 0: { Tag="funknown", Type="uri", Value="//example.com", ... }
Actual Results:
# 0: { Tag="funknown", Type="literal", Value="//example.com", ... }
Comments:
I found this bug processing an actual log file. One of the log events in question:
81.181.146.13 - - [15/Mar/2005:05:06:49 -0500] "GET //cgi-bin/awstats/awstats.pl?configdir=|%20id%20| HTTP/1.1" 404 1050 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
A related question: what is the best way to handle relative URIs? Sequence's heuristic algorithm for processing URIs breaks down on these...
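One way to explore the relative-URI question is to lean on Go's standard `net/url` parser rather than a hand-written heuristic. The sketch below is only an illustration: `looksLikeURI` is a hypothetical helper, not part of sequence, and it assumes that any token with a host component should be typed as a URI.

```go
package main

import (
	"fmt"
	"net/url"
)

// looksLikeURI is a hypothetical helper (not sequence's scanner).
// It reports whether tok parses as a URI reference that carries a host,
// which covers both absolute ("http://host") and scheme-relative
// ("//host") forms.
func looksLikeURI(tok string) bool {
	u, err := url.Parse(tok)
	if err != nil {
		return false
	}
	return u.Host != ""
}

func main() {
	for _, tok := range []string{"get", "http://example.com", "//example.com"} {
		fmt.Printf("%q -> %v\n", tok, looksLikeURI(tok))
	}
}
```

A caveat with this approach: RFC 3986 reads a leading `//` as the start of an authority, so `//cgi-bin/awstats/awstats.pl` from the log line above would be treated as host `cgi-bin` plus a path, which is probably not what the client meant. A real fix would still need a rule for doubled slashes in request paths.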
No further details from zentures/sequence#15
@jobordu opened zentures/sequence#19 and commented
Since paths appear frequently in logs, it would be very useful to have a parsing category for them, similar to "timeFormats", so that different path syntaxes could be parsed (Linux vs. Windows, file/folder, %appdata%, ../../path, ./path, etc.)
Currently:
C:\test\test\test.cxx
Would be analyzed as:
c : %string%
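As a rough illustration of what such a path category could look like, the sketch below classifies a token against a small table of path formats, loosely mirroring the idea of a "timeFormats" list. The table, the category names, and `classifyPath` are all hypothetical; none of them exist in sequence.

```go
package main

import (
	"fmt"
	"regexp"
)

// pathFormats is a hypothetical table of path syntaxes, checked in order.
var pathFormats = []struct {
	name string
	re   *regexp.Regexp
}{
	{"windows", regexp.MustCompile(`^[A-Za-z]:[\\/]`)},           // C:\dir or C:/dir
	{"envvar", regexp.MustCompile(`^%[A-Za-z_][A-Za-z0-9_]*%`)}, // %appdata%
	{"relative", regexp.MustCompile(`^\.{1,2}[\\/]`)},            // ./path or ../path
	{"unix", regexp.MustCompile(`^/[^/]`)},                       // /var/log
}

// classifyPath returns the name of the first matching format,
// or "" when the token does not look like a path.
func classifyPath(tok string) string {
	for _, f := range pathFormats {
		if f.re.MatchString(tok) {
			return f.name
		}
	}
	return ""
}

func main() {
	for _, tok := range []string{`C:\test\test\test.cxx`, "../../path", "%appdata%", "hello"} {
		fmt.Printf("%q -> %q\n", tok, classifyPath(tok))
	}
}
```

Note that the `envvar` form collides with sequence's own `%tag%` syntax, so a real implementation would have to decide which one wins.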
No further details from zentures/sequence#19
@Korsaja opened zentures/sequence#23 and commented
When analyzing this part of the log,
GET%3Cbody%3E%3CSCRIPT
the analyzer identifies "%3Cbody%" as a tag, although it is only part of the GET request.
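One possible workaround, assuming it is acceptable to decode a message before it reaches the analyzer, is to percent-decode each token first so that no stray `%` pairs remain. The sketch below uses the standard `net/url` package; `decodeToken` is a hypothetical helper and not part of sequence.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeToken is a hypothetical pre-processing step (not in sequence).
// It percent-decodes tok and falls back to the original text when the
// token contains an invalid escape sequence.
func decodeToken(tok string) string {
	decoded, err := url.PathUnescape(tok)
	if err != nil {
		return tok
	}
	return decoded
}

func main() {
	fmt.Println(decodeToken("GET%3Cbody%3E%3CSCRIPT"))
}
```

This should only be applied to raw log messages, never to pattern files, since decoding could change text that merely resembles an escape sequence.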
No further details from zentures/sequence#23
@Leftium opened zentures/sequence#16 and commented
Steps to Reproduce:
1. `echo "t=|" > input.txt`
2. `go run sequence.go analyze --input input.txt`
Expected Results:
2016/05/12 13:13:56 Analyzed 1 messages, found 1 unique patterns, 1 are new.
(No error and message is analyzed.)
Actual Results:
2016/05/12 13:13:56 Error analyzing: t=|
2016/05/12 13:13:56 Analyzed 1 messages, found 0 unique patterns, 0 are new.
Comments:
I think something is going wrong with the heuristics for key=value pairs. I found this bug while processing an actual log file. One of the log events in question:
81.181.146.13 - - [15/Mar/2005:05:06:49 -0500] "GET //cgi-bin/awstats/awstats.pl?configdir=|%20id%20| HTTP/1.1" 404 1050 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
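For comparison, here is a minimal key=value split that accepts punctuation-only values such as `|`. It is not sequence's actual heuristic; `splitKeyValue` is a hypothetical helper, and it assumes Go 1.18 or later for `strings.Cut`.

```go
package main

import (
	"fmt"
	"strings"
)

// splitKeyValue is a hypothetical helper (not sequence's heuristic).
// It splits tok at the first '=' and accepts any value, including
// values made only of punctuation. A token without '=' or with an
// empty key is rejected.
func splitKeyValue(tok string) (string, string, bool) {
	k, v, found := strings.Cut(tok, "=")
	if !found || k == "" {
		return "", "", false
	}
	return k, v, true
}

func main() {
	k, v, ok := splitKeyValue("t=|")
	fmt.Printf("key=%q value=%q ok=%v\n", k, v, ok)
}
```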
@antham referenced this issue on 21 May 2016
Handle pipe #17
@jobordu commented
This might be related: I found an error that may be caused by the '|' character:
2017/04/18 12:13:32 Error analyzing:
2017-04-11T16:49:28.551 Something is wrong with Attachment Browser Plugin, no working directory found: QDir( "C:/SpecifX_stateFiles/V2/femoralUnWrapNoManageLink3_pv_attachments" , nameFilters = { "*" }, QDir::SortFlags( Name | IgnoreCase ) , QDir::Filters( Dirs|Files|Drives|AllEntries ) ) (:0, )
No further details from zentures/sequence#16
The documentation of the original project was hosted at http://sequencer.io/manual, which was frozen along with the project. I will migrate the documentation to the project wiki to keep it up to date.
@ctyjrsy opened zentures/sequence#21 and commented
I was hoping to run a few quick tests; however, I am unable to find the data folder containing the log messages. I checked all your GitHub libraries: github.com/strace/sequence, zentures/sequence, and the library download.
No further details from zentures/sequence#21
@louiseruthharding opened zentures/sequence#22 and commented
Hi,
I have been working with sequence for the last two months extending it to output its patterns in syslog-ng patterndb and grok for Logstash formats. I have had to make a few changes to sequence code, largely around remembering where the spaces are, adding a database so we can decide to print the patterns on demand, rather than after each analysis, among other things.
It is in a company repo for now, but the goal is to make it available to the open source community.
I would love to discuss this with you.
https://www.linkedin.com/in/louise-harding-3b964551/
Regards
Louise
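To make the grok idea above concrete, the following sketch turns a sequence-style pattern into a grok expression. It is not the extension described in this issue, whose code is not shown here; `toGrok` is a hypothetical helper, and mapping every tag to grok's `NOTSPACE` is a simplifying assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagRe matches sequence-style tags such as %action% or %srcip%.
var tagRe = regexp.MustCompile(`%([a-z][a-z0-9_]*)%`)

// toGrok is a hypothetical converter. Each %tag% becomes
// %{NOTSPACE:tag}; literal text between tags is regex-escaped.
func toGrok(pattern string) string {
	var b strings.Builder
	last := 0
	for _, m := range tagRe.FindAllStringSubmatchIndex(pattern, -1) {
		b.WriteString(regexp.QuoteMeta(pattern[last:m[0]])) // literal before the tag
		b.WriteString("%{NOTSPACE:" + pattern[m[2]:m[3]] + "}")
		last = m[1]
	}
	b.WriteString(regexp.QuoteMeta(pattern[last:])) // trailing literal
	return b.String()
}

func main() {
	fmt.Println(toGrok("%action% %object%"))
}
```

A fuller converter would pick a grok pattern per tag type (for example an IP pattern for address fields) and would need the whitespace tracking mentioned above to reproduce the original spacing.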
>> go test
# github.com/strace/sequence
./analyzer_test.go:188:59: Sprintf format %s has arg len(atree.levels[l]) of wrong type int
FAIL github.com/strace/sequence [build failed]
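Since Go 1.10, `go test` runs a subset of `go vet` checks, including the printf check, and a finding is reported as a build failure. The message above means an `int` (the result of `len`) is passed to a `%s` verb. The actual line in `analyzer_test.go` is not shown here, so the sketch below uses a hypothetical stand-in, `describeLevel`, to show the kind of fix involved: use `%d` for the integer argument.

```go
package main

import "fmt"

// describeLevel is a hypothetical stand-in for the failing call.
// len(nodes) is an int, so it must be formatted with %d; formatting it
// with %s is what the vet printf check rejects.
func describeLevel(level int, nodes []string) string {
	return fmt.Sprintf("level %d has %d nodes", level, len(nodes))
}

func main() {
	fmt.Println(describeLevel(0, []string{"a", "b"}))
}
```

As a stopgap, `go test -vet=off` skips the vet step, but correcting the format verb is the cleaner fix.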
@russ168 opened zentures/sequence#11 and commented
devid=0 date="2013/05/21 09:53:17" dname=themis logtype=9 pri=6 ver=0.3.0 mod=webui from=10.1.5.200 agent="Mozilla/5.0 " admin=administrator act=登录 result=0 msg="成功" dsp_msg="administrator 登录" fwlog=0
@chenryn commented
any progress?
No further details from zentures/sequence#11
The project originated from zentures/sequence, which was frozen in 2017. After trying for weeks without success to contact the original author, I decided to restart the project here.
Write a Logstash plugin that parses unstructured log entries into fields, as a replacement for the "grok" filter, to achieve higher throughput.