tsaikd / gogstash
Logstash-like, written in Go
License: MIT License
Is there any support for codec plugin? Or is there any plan?
Hi
I have been looking into gogstash and I like its simplicity. My use case is receiving syslog messages from a Fortigate, parsing them, and sending them to Elasticsearch (while trying to avoid the overhead of Logstash). Logstash is also deprecating its kv filter, so I need another long-term solution.
There are two features I am missing to be able to do this with gogstash.
To solve these problems I have (re)written the code and tested that it works for me. The filter (it is called kv) takes an input string and produces values based on the content.
I am less sure how to solve the unix nano timestamp issue. I decided to add a new format, "UNIXNANO", and make the date filter handle it correctly. The alternative would be to rewrite convert/convertFloat so they can guess which unix timestamp format an input is.
Is this something you are interested in, and something I can add to your code? If so, how would you like me to contribute?
Below is a snippet from my configuration:
"filter": [
{
"type": "grok",
"match": ["<%{NUMBER:syslog_pri}>%{GREEDYDATA:kvmessage}"]
},
{
"type": "kv",
"source": "kvmessage",
"strings": ["tz"]
},
{
"type": "date",
"source": "eventtime",
"format": ["UNIXNANO"]
},
So, can I use this to pump data to Elasticsearch?
Hello,
Basically, I have one input record which is a JSON array of 1000 records. I need to split it into 1000 individual records, run filters on each, and insert them individually into Elasticsearch.
How can I do it?
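As far as I can tell this needs a split-style step. A minimal sketch of the fan-out (splitRecords is a hypothetical name, not an existing gogstash filter):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// splitRecords decodes one JSON-array payload into individual records,
// each of which could then be run through filters and output separately.
func splitRecords(payload []byte) ([]map[string]any, error) {
	var records []map[string]any
	if err := json.Unmarshal(payload, &records); err != nil {
		return nil, err
	}
	return records, nil
}

func main() {
	recs, err := splitRecords([]byte(`[{"id":1},{"id":2},{"id":3}]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(recs)) // each element becomes its own event
}
```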
I'm considering using gogstash on a commercial product but the current LGPL license prevents me from doing this.
Could you please consider switching to a more corporate-friendly open source license instead? e.g. MIT, Apache 2.0 and BSD tend to be fine
Hi
I am looking into the output module elastic and I can't see a way to send in some kind of authentication. How can I configure a username/password to log into Elasticsearch?
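One approach worth trying, assuming the elastic output hands its URL straight to the Elasticsearch client: embed basic-auth credentials in the URL. The keys below are a plausible configuration sketch and may not match the module's actual schema:

```yaml
output:
  - type: elastic
    # Assumption: credentials embedded in the URL are passed through to
    # the underlying client, which applies HTTP basic auth from them.
    url: "http://myuser:mypass@localhost:9200"
    index: "gogstash"
```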
Since the file input is deprecated, is the official Filebeat compatible as an input, which we could configure in YAML or JSON directly, and then perhaps configure Kafka as an output?
Furthermore, could we run another gogstash instance that uses Kafka as an input and Elasticsearch as an output at the same time?
github.com/tsaikd/gogstash/output/elastic/outputelastic.go:84: too many arguments in call to conf.client.BulkProcessor().Name("gogstash-output-elastic").BulkActions(conf.BulkActions).BulkSize(conf.BulkSize).FlushInterval(conf.BulkFlushInterval).Do
have ("context".Context)
want ()
I'm trying to migrate from logstash to gogstash as the memory usage of my environment is getting out of hand.
I couldn't find anything regarding conditional branching on input/filter/output elements like logstash does. Is there any similar thing in gogstash?
I am missing a way to handle packet-based messages over UDP. (Think syslog, etc.) The socket input only handles line-delimited inputs, and this does not work well with our input. I have been looking through the code, and I think the best way to solve this is to create a new UDP-only input type. (The alternative I looked into is to make socket more packet-like for UDP connections.)
In addition I could also add support for syslog parsing as a codec, filter or both using go-syslog.
Will this be an acceptable solution?
macOS Sierra 10.12.1: I write something to the file, but gogstash can't detect it. Please test it.
Logstash offers many filters, like aggregate, grok, geoip, and so on.
Are there plans to offer filters for gogstash? Or is exec meant to replace filters?
It would be nice to see a Go replacement for Logstash and its many features.
Thanks.
It would be nice to have a Kafka output module.
It would probably be a good idea to migrate to dep versioning, as it seems to be the canonical Go versioning/dependency management system.
GROK support and other filter-plugins planned?
My output:
output:
  - type: http
    urls: ["http://xxxxxxx/web_api/log_file_report"]
    http_status_codes: [200, 201, 202]
    http_error_codes: [501, 405, 404, 208, 505]
    retry_interval: 60
    max_queue_size: 1
    ignore_ssl: true
Why does the log output show:
2022/08/22 16:54:29 output.go:99 [error] output module "http" failed: "http://xxxxxxx/web_api/log_file_report" retryable error 200
2022/08/22 16:55:28 simplequeue.go:129 [error] queue: httpsendone "http://xxxxxxx/web_api/log_file_report" retryable error 200
I'm trying to move from logstash to gogstash but one of the things that's preventing me from moving is the lack of the file output module.
Is there anything in the works to support this?
Would you have any tips/pointers in case I try to implement this myself? I'm happy to try to get this working but I'm sure I can save some time using your initial direction/tips.
I see the string gogstash_filter_geoip2_error in the logs; this is related to private subnets.
Is it possible to improve the geoip filter's output with more specific error tags?
gogstash_filter_geoip2_private_skipped - for private subnets
gogstash_filter_geoip2_error - for all other errors in the geoip filter
I would like to have more control over conditions in subsequent steps.
My data format contains @timestamp field.
I would like to send to the index "log-YYYY.MM.DD" according to @timestamp.
Is this supported?
Thanks
I still need this feature for a 6.4.2 ES cluster but it seems to be deprecated.
Would it make sense to have an "elastic_legacy" output plugin that still supports it?
Anything to take into consideration?
There is a situation in which the code below may fail.
if fpath, err = filepath.EvalSymlinks(fpath); err != nil {
	err = errutil.New("Get symlinks failed: "+fpath, err)
	return
}
For example, performing Write, Rename, and Create operations on the fpath.
When waitWatchEvent() has processed the Write event, it enters waitWatchEvent() again. But at that point the fpath doesn't exist, or is in the middle of being created. In this situation, filepath.EvalSymlinks may fail.
It is expected to do:
waitWatchEvent has processed Write
enter waitWatchEvent again
waitWatchEvent processes Rename
waitWatchEvent processes Create
but in reality:
waitWatchEvent has processed Write
enter waitWatchEvent again
filepath.EvalSymlinks(fpath) fails
I fixed this problem temporarily by adding the code below before the call to filepath.EvalSymlinks(fpath).
for {
	_, err := os.Stat(fpath)
	if err != nil && os.IsNotExist(err) {
		// sleep briefly instead of busy-waiting for the file to reappear
		time.Sleep(100 * time.Millisecond)
		continue
	}
	break
}
2019/06/24 22:24:12 outputelastic.go:95 [error] 299 Elasticsearch-7.1.1-7a013de "[types removal] Specifying types in bulk requests is deprecated."
The log cannot be parsed when I use the grok filter, but the pattern works in the Grok Debugger. Help!
cat config.json
{
  "input": [
    {
      "type": "file",
      "path": "/var/log/nginx/access.log"
    }
  ],
  "debugch": true,
  "filter": [
    {
      "type": "grok",
      "source": "message",
      "match": ["%{NGINXTEST}"],
      "patterns_path": "grok-patterns"
    }
  ],
  "output": [
    {
      "type": "stdout"
    }
  ]
}
cat grok-patterns
NGINXTEST %{HOST:upstream_addr}:%{HOST:upstream_port} %{IPORHOST:http_host} %{NUMBER:request_time} %{NUMBER:response_time} %{IPORHOST:remote_addr} - %{NGUSER:remote_user} \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:status} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer}) (?:%{QS:agent}) %{QS:xforwardedfor}
Original log
127.0.0.1:12345 test.monitor.com 0.001 0.001 127.0.0.1 - - [28/Apr/2019:18:26:10 +0800] "GET /view/test.jpeg HTTP/1.1" 200 9444 "http://test.monitor.com/view/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.108 Safari/537.36" "-" "-"
Log Format
'$upstream_addr $http_host $request_time $upstream_response_time $remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" "$request_body"';
if c, ok := f.cache.Get(ipstr); ok {
	record = c.(*geoip2.City)
} else {
	record, err = f.db.City(ip)
	if err != nil {
		if f.QuietFail {
			goglog.Logger.Error(err)
		}
		event.AddTag(ErrorTag)
		return event, false
	}
}
The call f.cache.Add(ipstr, record) is missing after a successful lookup, so the cache is never populated.
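To illustrate why the missing store-back matters, here is a generic cache-aside lookup (not the geoip2 filter's actual code); without the write to the cache, every event would hit the database:

```go
package main

import "fmt"

// lookupCityCode is a cache-aside sketch: check the cache, fall back to
// the database, then store the result back. The store-back is the step
// the geoip2 filter is missing (f.cache.Add(ipstr, record)).
func lookupCityCode(cache map[string]string, ip string, db func(string) string) string {
	if v, ok := cache[ip]; ok {
		return v
	}
	v := db(ip)
	cache[ip] = v // without this line the cache never fills up
	return v
}

func main() {
	calls := 0
	db := func(ip string) string { calls++; return "US" }
	cache := map[string]string{}
	lookupCityCode(cache, "104.156.15.12", db)
	lookupCityCode(cache, "104.156.15.12", db)
	fmt.Println(calls) // the database is consulted only once
}
```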
Hello,
Out of curiosity, is the project still maintained, and are there any plans to implement the official Elasticsearch library in the elastic output?
https://github.com/elastic/go-elasticsearch
If so, I am making some changes to the library and would like to create pull requests and push the changes here.
By any chance, are you thinking about adding support for ClickHouse? Its API is plain HTTP and it accepts JSON documents.
Is there any support for grok'ing input? Or is there any plan?
How to reproduce:
Result:
2016/04/25 14:37:01 logger.go:15 [error] handle output config failed: map["exchange_type":"topic" "exchange_durable":%!q(bool=true) "exchange_auto_delete":%!q(bool=false) "persistent":%!q(bool=false) "retry_count":%!q(float64=10) "type":"amqp" "urls":["amqp://guest:guest@localhost:5672/"] "routing_key":"%{routing}" "exchange":"amq.topic"]
Expected result:
no valid amqp server connection found
(see https://github.com/tsaikd/gogstash/blob/master/output/amqp/outputamqp.go#L96)

While trying the project on my local machine, I did the following:
go run .
ps aux | grep gog
And noticed that the worker processes are still alive.
user1 5896 21.6 0.3 1628392 27444 pts/6 Sl 11:00 0:01 /tmp/go-build3520306496/b001/exe/gogstash worker
user1 5897 21.0 0.3 1628136 31328 pts/6 Sl 11:00 0:01 /tmp/go-build3520306496/b001/exe/gogstash worker
My config file config.yaml is:
chsize: 1000
worker: 2
event:
  sort_map_keys: false
  remove_field: ['@metadata']
input:
  - type: lorem
output:
  - type: stdout
    codec: json
Hi
I would like to know: if the output is not available (e.g. Elasticsearch disk full), what will happen?
What is the buffering mechanism, and are there any parameters (like max_disk_buffer_size or max_memory_buffer_size)?
Thanks
It seems the new goglog reports the source file and line only as goglog.go:
2018/09/17 18:57:55 goglog.go:112 [warning] filter remove_field config empty fields
instead of the file where the original logging function was called:
2018/09/17 15:59:35 filterremovefield.go:45 [warning] filter remove_field config empty fields
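The usual cause of this symptom is a wrong skip depth in runtime.Caller: the wrapper's own frame is reported instead of being skipped. A minimal sketch (not goglog's actual code):

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// callSite reports the file:line of the function that called the
// logging helper. skip must account for every wrapper frame, otherwise
// the logger reports its own source file (the goglog.go symptom).
func callSite(skip int) string {
	_, file, line, ok := runtime.Caller(skip + 1)
	if !ok {
		return "unknown"
	}
	return fmt.Sprintf("%s:%d", filepath.Base(file), line)
}

func warn(msg string) {
	// skip=1 skips warn itself and reports warn's caller
	fmt.Printf("%s [warning] %s\n", callSite(1), msg)
}

func main() {
	warn("filter remove_field config empty fields")
}
```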
I'm using gogstash 0.1.14 as an indexer from redis to elasticsearch
There are some errors in my error logs:
2018/06/19 08:25:47 output.go:52 [error] output failed: elastic: Error 429 (Too Many Requests): rejected execution of org.elasticsearch.transport.TransportService$7@3e869f6 on EsThreadPoolExecutor[name = data00/bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@2ded0e49[Running, pool size = 8, active threads = 8, queued tasks = 200, completed tasks = 27481615]] [type=es_rejected_execution_exception]
I'm trying to find out where the error came from:
https://github.com/tsaikd/gogstash/blob/binary/config/output.go#L44
However, I haven't seen the event go back to the event channel after it failed.
Am I missing something, or will the event data just be lost after a failure?
Thanks
I'm looking to process JSON logs with fields like this:
{ "http.method": "GET" }
...and many more similar fields with . in the name.
I can use the json filter to transform this into fields, but none of the other filters can manipulate these fields. For example, with these filters:
filter:
  - type: json
  - type: remove_field
    fields: ["http.method"]
...running the above JSON through yields:
{
  "@timestamp": "2020-08-21T02:26:06.158181Z",
  "host": "MyHost",
  "http.method": "GET",
  "message": "{\"http.method\":\"GET\"}"
}
There's no way to remove the field! Other filters have a similar issue.
The problem is that getPathValue always interprets its input as a path expression, never as a literal key into the map. Reference: gogstash/config/logevent/pathvalue.go, line 87 at d461391.
Logstash has field reference syntax to handle these different cases. I can understand that gogstash may not want such complexity, but it does feel like there should be a solution to my problem. Otherwise I think gogstash will be unusable for me.
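One possible resolution, sketched below: try the literal key first, then fall back to dotted-path traversal (lookup is a hypothetical function, not gogstash's getPathValue):

```go
package main

import (
	"fmt"
	"strings"
)

// lookup tries the literal key first, then falls back to dotted-path
// traversal into nested maps. A sketch of one way to resolve the
// "http.method" ambiguity described above.
func lookup(m map[string]any, key string) (any, bool) {
	if v, ok := m[key]; ok {
		return v, true // literal key like "http.method" wins
	}
	cur := any(m)
	for _, part := range strings.Split(key, ".") {
		mm, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = mm[part]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	m := map[string]any{
		"http.method": "GET",
		"http":        map[string]any{"status": 200},
	}
	fmt.Println(lookup(m, "http.method")) // literal key
	fmt.Println(lookup(m, "http.status")) // path traversal
}
```

Logstash sidesteps the ambiguity with explicit field-reference syntax ([http][method] vs [http.method]); the sketch above instead picks a precedence order, which is simpler but can still collide when both forms exist.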
Does this project plan to be a drop-in replacement for Logstash? If so, do we need to implement a way to parse the Logstash config syntax into gogstash's Config object? Look at this project, for example: https://github.com/breml/logstash-config. Maybe we can use it to support the existing Logstash configuration syntax.
When the input socket module is being used and the payload is valid JSON, the @timestamp field is not properly initialized to time.Now(). This results in the output LogEvent having an uninitialized Timestamp field.
Currently the output's @timestamp looks like this:
{
  "@timestamp": "0001-01-01T00:00:00Z",
  ...
}
Using version 0.1.14
Cannot build with "gopkg.in/olivere/elastic.v5", but it works with "gopkg.in/olivere/elastic.v6".
I'm trying to use the useragent filter and getting the error: filter useragent 'regexes' not configured. I would expect gogstash to default to the DefinitionYaml variable available in uap-go, so I don't have to configure this unless I require a different set of regexes.
I'd be happy to create PR for this if it sounds reasonable.
Hi
I have been looking into the file output module and I see some potential issues. (All line numbers are from outputfile.go.)
I can try to write a fix for these issues if you agree that they are issues.
output/prometheus/outputprometheus.go:63:26: undefined: prometheus.Handler
Similar to https://github.com/helm/charts/tree/master/stable/fluent-bit, it would be nice to have a Helm chart to deploy gogstash on a Kubernetes cluster. This might help with adoption of the tool and widen its usage!
I cloned and built the project, then started the program:
./gogstash worker --config ./gogstash.yml
gogstash.yml config file content:
worker: 2
chsize: 1000
input:
  - type: file
    path: /tmp/logstash-tutorial.log
    start_position: beginning
    sincedb_write_interval: 5
output:
  - type: stdout
    codec: json
When I want to stop the program I press Command+C to interrupt it, but it seems the program cannot stop; it just prints the interrupt message, and I am not sure why that happens.
util.go:116 [info] interrupt
machine: macos 12.5.1
go version: go version go1.18.5 darwin/amd64
Getting frustrated with Logstash's memory hogging (and its occasional crashes on invalid JSON input), I set out to create a simpler, more efficient alternative. Then I found this project, which already seems to be a far more complete alternative than what I had in mind. A couple of questions:
Hi! My logs are big, complex JSON objects. To prevent Elasticsearch from parsing the JSON and creating indexes, I want to convert non-whitelisted nested objects to strings and store them as TEXT fields in Elasticsearch.
This will allow me to do full-text search on the JSON string later.
I could do the mapping in the logger formatter, but I want to avoid unnecessary CPU work there (JSON marshaling).
It would be nice to have a plugin that lets me define something like this:
keep_fields = ["field1", "msg", "nested2", "time"]
So the input:
{
  "field1": "value1",
  "msg": "Log example",
  "nested1": {
    "Name": "name1",
    "Family": "family1"
  },
  "nested2": {
    "k1": "v1"
  },
  "time": "2019-10-31T21:40:28.098391+02:00"
}
will produce this output:
{
  "field1": "value1",
  "msg": "Log example",
  "nested1": "{\"Name\":\"name1\",\"Family\":\"family1\"}",
  "nested2": {
    "k1": "v1"
  },
  "time": "2019-10-31T21:40:28.098391+02:00"
}
Thanks.
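The proposed keep_fields behavior is a small transform over the event map. A sketch assuming top-level keys only (keepFields is a hypothetical name, not an existing gogstash filter):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keepFields marshals any nested object whose key is not in keep to a
// JSON string, so Elasticsearch stores it as plain text instead of
// indexing its sub-fields. Scalars and whitelisted objects pass through.
func keepFields(event map[string]any, keep map[string]bool) map[string]any {
	out := make(map[string]any, len(event))
	for k, v := range event {
		if _, isObj := v.(map[string]any); isObj && !keep[k] {
			b, _ := json.Marshal(v)
			out[k] = string(b)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	event := map[string]any{
		"field1":  "value1",
		"nested1": map[string]any{"Name": "name1"},
		"nested2": map[string]any{"k1": "v1"},
	}
	keep := map[string]bool{"field1": true, "nested2": true, "msg": true, "time": true}
	out := keepFields(event, keep)
	fmt.Println(out["nested1"]) // stringified, since it is not whitelisted
}
```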
Add ability to delete message field.
Do you have any plan to add plain codec?
It would be nice to have an email output module. Can I contribute it?
Hi all,
I need help with configuration for this wonderful project.
I have a simple config:
chsize: 1000
input:
  - type: file
    path: /logging/test_input.log
    start_position: end
    sincedb_write_interval: 5
filter:
  - type: grok
    match: ["%{IPTABLES_SRC}"]
    source: "message"
    patterns_path: "/iptables.grok"
  - type: geoip2
    db_path: "/GeoLite2-City.mmdb"
    ip_field: "src_ip"
    cache_size: 100000
    key: geoip
    flat_format: true
  - type: add_field
    key: 'geo2ip_city'
    value: '%{geoip.country_code}'
output:
  - type: stdout
    codec: json
  - type: loki
    urls:
      - "http://mloki.fr.com.ua:3100/loki/api/v1/push"
echo 'May 11 10:02:05 zabbix kernel: [1609112.875635] FW_F_IN_DROP: IN=ens18 OUT= MAC=3a:e9:5f:c7:41:78:d0:07:ca:8c:10:01:08:00 SRC=104.156.15.12 DST=19.0.20.1 LEN=40 TOS=0x00 PREC=0x00 TTL=243 ID=8530 PROTO=TCP SPT=58399 DPT=3080 WINDOW=1024 RES=0x00 SYN URGP=0' >> /logging/test_input.log
Error:
2022/05/11 20:07:03 outputloki.go:89 [warning] key: geoip error:Unable to Cast to string
{"host":"gogstash","path":"/logging/test_input.log","@timestamp":"2022-05-11T20:07:03.018256841Z","message":"May 11 10:02:05 zabbix kernel: [1609112.875635] FW_F_IN_DROP: IN=ens18 OUT= MAC=3a:e9:5f:c7:41:78:d0:07:ca:8c:10:01:08:00 SRC=104.156.15.12 DST=19.0.20.1LEN=40 TOS=0x00 PREC=0x00 TTL=243 ID=8530 PROTO=TCP SPT=58399 DPT=3080 WINDOW=1024 RES=0x00 SYN URGP=0","offset":0,"src_ip":"104.156.15.12","geoip":{"longitude":-97.822,"timezone":"America/Chicago","continent_code":"NA","country_code":"US","country_name":"United States","ip":"104.156.15.12","latitude":37.751,"location":[-97.822,37.751]},"geo2ip_city":"US"}
Also, I tried using the stdout output without the json codec, and the result had no geoip data; it looks like geoip data can only be injected in json mode.
Is there anything that can help me resolve this issue? I would like to send logs to Loki with geoip data so I can build a dashboard with a world map.
A regular schedule of published releases, enabling users to consume newly added features and patches, would greatly help adoption of this project in the community!