Grok'n'Gelf

A tool for parsing logs collected by the foreman-debug and sosreport tools and importing them into a centralized log manager.

How do I use it

The log server is not part of this tool. You can use any service of your choice; the only requirement is that it accepts log data in GELF format. The tool was tested with a Graylog server.

Install the tool:

  git clone https://github.com/mbacovsky/grokngelf
  cd grokngelf
  bundle install
  bundle exec bin/grokngelf -h

Let's assume your Graylog server is running at graylog.example.com and accepting GELF input on port 12201 via TCP, and that your sosreport is extracted in /tmp/sosreport-sample. To import the logs, run:

  $ grokngelf sosreport -t graylog.example.com --host testhost.example.com --import-id 1 /tmp/sosreport-sample

  Importing sosreport from [/tmp/sosreport-sample]...
  Importing yum log [/tmp/sosreport-sample/var/log/yum.log].....................................................................|
  Importing syslog [/tmp/sosreport-sample/var/log/messages]..........................................|
  Done

If your sosreport is a tarball, the tool can extract it for you:

  $ grokngelf sosreport -t graylog.example.com --host testhost.example.com --import-id 1 /tmp/sosreport-sample.tar.xz

For more help see:

  $ grokngelf -h
  Usage:
      grokngelf [OPTIONS] SUBCOMMAND [ARG] ...

  Parameters:
      SUBCOMMAND                    subcommand
      [ARG] ...                     subcommand arguments

  Subcommands:
      yum                           process yum log
      syslog                        process system log
      sosreport                     process sos report
      dynflow-tasks                 process CSV export of dynflow tasks

  Options:
      -h, --help                    print help


  $ grokngelf sosreport -h
  Usage:
      grokngelf sosreport [OPTIONS] SOSREPORT

  Parameters:
      SOSREPORT                     sosreport tarball or directory with extracted logs

  Options:
      -h, --help                    print help
      --target, -t TARGET           machine where we can send the processed logs
      --port, -p PORT               port where we can send the processed logs (default: "12201")
      --protocol PROTOCOL           protocol to use to send the data (default: "TCP")
      --host HOST                   hostname of the machine the logs originates from (a.k.a facilty) (default: "default")
      --import-id IMPORT_ID         unique identification of the import (default: 1)

How does it work

For a sosreport, all applicable importers are called for the relevant logs in the collection. Each importer reads its log line by line and tries to match each line against the GROK patterns in its library. A structured event is created from each matched line and sent in GELF format to the log manager.
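
To make the line-to-event step concrete, here is a minimal sketch for a single yum.log line. It is not the tool's importer: a plain Ruby regular expression with named captures stands in for the GROK patterns, and the names `YUM_LINE` and `parse_yum_line` are hypothetical. Because yum.log lines carry no year, the sketch takes the year as a parameter and assumes UTC.

```ruby
# Stand-in for a GROK pattern: a regex with named captures.
YUM_LINE = /\A(?<month>[A-Z][a-z]{2})\s+(?<day>\d{1,2})\s+
            (?<hour>\d{2}):(?<min>\d{2}):(?<sec>\d{2})\s+
            (?<action>[\w-]+):\s+(?<nevra>\S+)\z/x

# Returns a structured event hash, or nil when the line does not match.
# `year` is an assumption supplied by the caller (yum.log omits it).
def parse_yum_line(line, year:)
  text = line.chomp
  m = YUM_LINE.match(text)
  return nil unless m

  timestamp = Time.utc(year, m[:month], m[:day].to_i,
                       m[:hour].to_i, m[:min].to_i, m[:sec].to_i).to_i
  {
    'original_line' => text,
    'timestamp' => timestamp,
    'program' => 'yum',
    'short_message' => "#{m[:action]}: #{m[:nevra]}",
    'action' => m[:action],
    'package_nevra' => m[:nevra]
  }
end

event = parse_yum_line(
  'Jan 25 15:45:37 Installed: rh-ruby22-ruby-2.2.2-16.el7.x86_64', year: 2017
)
puts event['short_message'] if event
```

Returning nil for a non-matching line leaves it to the caller to decide what happens to lines the pattern does not cover.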

More complex logs, such as syslog, are parsed in multiple passes, and each pass picks up data for one specific pattern only. Lines that were not matched are imported with program: 'unmatched' and can be filtered and reviewed later.
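
The multi-pass idea can be sketched roughly as follows. This is an illustration under assumptions, not the Syslog importer itself: the two regular expressions, the `PASSES` constant and the `import_in_passes` method are hypothetical, and the syslog line layout (`timestamp host program[pid]: message`) is assumed.

```ruby
# Each pass looks for one kind of line only; more specific patterns go first.
PASSES = [
  # pass 1: Pulp messages only
  /\A\S+\s+\d+\s+[\d:]+\s+\S+\s+(?<program>pulp)(?:\[\d+\])?:\s+(?<message>.*)\z/,
  # pass 2: any other "program[pid]: message" line
  /\A\S+\s+\d+\s+[\d:]+\s+\S+\s+(?<program>[\w\/.-]+)(?:\[\d+\])?:\s+(?<message>.*)\z/
].freeze

# Runs every pass over the lines, skipping lines an earlier pass has claimed,
# then emits the leftovers with program 'unmatched'.
def import_in_passes(lines)
  events = {}
  PASSES.each do |pattern|
    lines.each_with_index do |line, index|
      next if events.key?(index)

      m = pattern.match(line)
      next unless m

      events[index] = {
        'program' => m[:program],
        'short_message' => m[:message],
        'original_line' => line
      }
    end
  end

  lines.each_with_index.map do |line, index|
    events[index] || {
      'program' => 'unmatched',
      'short_message' => line,
      'original_line' => line
    }
  end
end
```

Keeping unmatched lines as events, rather than dropping them, is what makes it possible to review them later in the log manager.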

Each event has the following mandatory attributes, regardless of the importer it originates from:

  {
    'version' => '1.1',               # GELF version, *internal*
    'import_id' => '',                # id of the import as set on CLI
    'source' => '',                   # hostname of importer machine, *internal*
    'short_message' => '',            # log entry without timestamps, pids, etc.
    'original_line' => '',            # full original log entry
    'timestamp' => 1485359137,        # log event timestamp
    'level' => 1,                     # log level number
    'level_hr' => 'INFO',             # log level in human readable format
    'facility' => '',                 # source (hostname) of the log entry, set on CLI
    'file' => 'N/A',                  # file the notify was called from *internal*
    'line' => 'N/A',                  # line the notify was called from *internal*
    'log_file' => '',                 # file that was parsed to produce this entry
  }
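
As a rough sketch of how an importer-specific result could be completed with these mandatory attributes: the helper `build_event` and the constants below are hypothetical, and the level names assume the numbering used by the gelf gem (0 = DEBUG, 1 = INFO, and so on), which matches the examples in this README where level 1 is INFO.

```ruby
require 'socket'

# Assumed level numbering (0 = DEBUG, 1 = INFO, ...).
LEVEL_NAMES = %w[DEBUG INFO WARN ERROR FATAL UNKNOWN].freeze

MANDATORY_KEYS = %w[
  version import_id source short_message original_line timestamp
  level level_hr facility file line log_file
].freeze

# Merges importer output (`parsed`) over the common defaults and checks
# that every mandatory attribute ended up in the event.
def build_event(parsed, import_id:, facility:, log_file:)
  level = parsed.fetch('level', 1)
  defaults = {
    'version' => '1.1',
    'import_id' => import_id,
    'source' => Socket.gethostname,   # hostname of the importing machine
    'level' => level,
    'level_hr' => LEVEL_NAMES.fetch(level, 'UNKNOWN'),
    'facility' => facility,
    'file' => 'N/A',
    'line' => 'N/A',
    'log_file' => log_file
  }
  event = defaults.merge(parsed)

  missing = MANDATORY_KEYS - event.keys
  raise ArgumentError, "missing attributes: #{missing.join(', ')}" unless missing.empty?

  event
end
```

The importer then only has to supply short_message, original_line and timestamp, plus whatever fields are specific to its log format.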

How can I improve it

Any contribution is highly appreciated. I welcome any kind of help.

What logs are supported

var/log/yum.log

  • Importer: Yum
  • Status: Complete
  • Example:
{
            "original_line" => "Jan 25 15:45:37 Installed: rh-ruby22-ruby-2.2.2-16.el7.x86_64",
                "timestamp" => 1485359137,
                    "level" => 1,
                 "level_hr" => "INFO",
                  "program" => "yum",
            "short_message" => "Installed: rh-ruby22-ruby-2.2.2-16.el7.x86_64",
             "package_name" => "rh-ruby22-ruby",
                   "action" => "Installed",
            "package_nevra" => "rh-ruby22-ruby-2.2.2-16.el7.x86_64",
            "package_epoch" => 0,
          "package_version" => "2.2.2",
    "package_version_major" => 2,
    "package_version_minor" => 2,
          "package_release" => "16.el7",
     "package_architecture" => "x86_64"
}

var/log/messages

  • Importer: Syslog
  • Status: WIP
  • Pulp messages
  • Generic syslog messages

tmp/export

  • Importer: DynflowTasks
  • Status: Complete
  • Execution plans (+ data from dynflow actions and foreman tasks)
  • Execution steps
  • Example of Execution plan:
  {
     "timestamp"=>1485035640,
     "level"=>1,
     "level_hr"=>"INFO",
     "importer"=>"GrokNGelf::Importers::DynflowTasks",
     'program' => 'dynflow',
     'short_message' => "Execution plan 3926bad1-7b7d-4a49-9e86-e31e4f431483 started",
     'dynflow_object' => 'execution_plan',
     'dynflow_plan_id' => '3926bad1-7b7d-4a49-9e86-e31e4f431483',
     'dynflow_class' => 'Actions::RemoteExecution::RunHostsJob',
     'dynflow_input' => {"job_invocation"=>{"id"=>2, "name"=>"Commands", "description"=>"Run echo hello"}},
     'dynflow_output' => {"total_count"=>1, "failed_count"=>0, "success_count"=>1, "pending_count"=>0},
     'dynflow_event_type' => 'start',
     'dynflow_state' => 'stopped',
     'dynflow_result' => 'success',
     'dynflow_started_at' => '2017-01-21 21:54:00',
     'dynflow_ended_at' => '2017-01-21 21:54:03',
     'dynflow_real_time' => 3.601317757,
     'dynflow_real_time_hr' => '3 s',
     'dynflow_execution_time' => 0.440313698,
     'foreman_task_start_at' => '2017-01-21 21:54:00.053138',
     'foreman_task_start_before' => '2017-01-21 21:54:00.053138',
     'foreman_task_id' => 'd8a0ec75-b056-4102-be4d-f211dc8734be',
     'foreman_task_type' => 'ForemanTasks::Task::DynflowTask',
     'foreman_task_label' => 'Actions::RemoteExecution::RunHostsJob',
     'log_file' => fixture_log('dynflow_tasks_success/dynflow_execution_plans.csv')
   }
  • Example of Execution step:
  {
     'timestamp' => 1485035641,
     'level' => 1,
     'level_hr' => 'INFO',
     'program' => 'dynflow',
     'short_message' => "Execution step started",
     'dynflow_object' => 'step',
     'dynflow_plan_id' => '3926bad1-7b7d-4a49-9e86-e31e4f431483',
     'dynflow_event_type' => 'start',
     'dynflow_class' => 'Actions::RemoteExecution::RunHostsJob',
     'dynflow_step_class' => 'Dynflow::ExecutionPlan::Steps::RunStep',
     'dynflow_input' => {"job_invocation"=>{"id"=>2, "name"=>"Commands", "description"=>"Run echo hello"}},
     'dynflow_output' => {"total_count"=>1, "failed_count"=>0, "success_count"=>1, "pending_count"=>0},
     'dynflow_error' => nil,
     'dynflow_step_id' => 2,
     'dynflow_result' => 'success',
     'dynflow_started_at' => '2017-01-21 21:54:01',
     'dynflow_ended_at' => '2017-01-21 21:54:03',
     'dynflow_real_time' => 2.376708706,
     'dynflow_real_time_hr' => '2 s',
     'dynflow_execution_time' => 0.394046164,
     'dynflow_progress_done' => 1.0,
     'dynflow_progress_weight' => 1.0,
     'log_file' => fixture_log('dynflow_tasks_success/dynflow_steps.csv')
   }

How do I install Graylog

There are plenty of ways to install Graylog. The easiest is probably to use Graylog's VM appliances. I have had good experience with their Ansible playbook, which I extended to create a GELF TCP input and to open the port:

- name: Create graylog global GELF input for receiving logs
  uri:
    url: http://127.0.0.1:9000/api/system/inputs
    method: POST
    user: "admin"
    password: "admin"
    body: '{"title":"GELFTCPInput","type":"org.graylog2.inputs.gelf.tcp.GELFTCPInput","configuration":{"bind_address":"0.0.0.0","port":12201,"recv_buffer_size":1048576,"override_source":null,"decompress_size_limit":8388608},"global":true}'
    force_basic_auth: yes
    status_code: 201
    body_format: json

- name: Open port 12201 in firewall
  firewalld:
    port: 12201/tcp
    permanent: true
    state: enabled

Regardless of the type of installation you use, make sure the GELF input is up and running. In the Graylog UI, select System > Inputs.

GELF Input in Graylog

If necessary, also open the port in the firewall, e.g. with:

 $ firewall-cmd --add-port="12201/tcp"
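
Before starting a long import, it may help to confirm that the GELF TCP port is reachable from the machine running the tool. The following is a small sketch using only Ruby's standard library; `gelf_port_reachable?` is a hypothetical helper, not part of grokngelf, and it checks only that a TCP connection can be opened, not that the input accepts GELF data.

```ruby
require 'socket'

# Returns true when a TCP connection to host:port can be opened within
# `timeout` seconds, false on DNS failure, refusal or timeout.
def gelf_port_reachable?(host, port, timeout: 3)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SocketError, SystemCallError, IOError
  false
end

# Example call (host and port taken from the examples above):
#   gelf_port_reachable?('graylog.example.com', 12201)
```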

How do I search the logs in Graylog

Import some data into Graylog, e.g. with:

  $ grokngelf yum -t graylog.example.com --host testhost.example.com --import-id 1 /tmp/sosreport-sample

In the Graylog UI, go to Search and select Search in all messages in the time range selector at the top. In the search box, filter the data you've just imported with facility:testhost.example.com AND import_id:1 and click the search button. Note that the host (a.k.a. facility in Graylog) and import_id fields are there just for filtering the logs, so you can put in whatever values work for you, e.g. a bug number, hostname, customer name, etc.

Search in Graylog

For more on filtering possibilities, the query language, exports, statistics and analysis in Graylog, check the docs.

License

This project is licensed under the GPLv3+.

grokngelf's Issues

Handle exception when the log server is not accessible

grokngelf dynflow-tasks -t graylog.abc.lab.com --import-id 9 --host tasks /tmp/tasks-csv-export.tar.gz 
Task export was extracted to /tmp/export/
/home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:31:in `initialize': getaddrinfo: Name or service not known (SocketError)
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:31:in `new'
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:31:in `connect'
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:15:in `block (2 levels) in addresses='
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:15:in `map'
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:15:in `block in addresses='
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:13:in `tap'
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:13:in `addresses='
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/transport/tcp.rb:9:in `initialize'
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/notifier.rb:138:in `new'
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/notifier.rb:138:in `create_sender'
        from /home/mbacovsk/.gem/ruby/gems/gelf-3.0.0/lib/gelf/notifier.rb:36:in `initialize'
        from /home/mbacovsk/work/grokngelf/lib/grokngelf/commands/abstract.rb:32:in `new'
        from /home/mbacovsk/work/grokngelf/lib/grokngelf/commands/abstract.rb:32:in `notifier'
        from /home/mbacovsk/work/grokngelf/lib/grokngelf/commands/dynflow_tasks.rb:22:in `execute'
        from /home/mbacovsk/.gem/ruby/gems/clamp-1.1.1/lib/clamp/command.rb:63:in `run'
        from /home/mbacovsk/.gem/ruby/gems/clamp-1.1.1/lib/clamp/subcommand/execution.rb:11:in `execute'
        from /home/mbacovsk/.gem/ruby/gems/clamp-1.1.1/lib/clamp/command.rb:63:in `run'
        from /home/mbacovsk/.gem/ruby/gems/clamp-1.1.1/lib/clamp/command.rb:132:in `run'
        from bin/grokngelf:6:in `<main>'
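
One possible way to handle this, sketched under assumptions: the backtrace shows the SocketError surfacing while the notifier is created, so the call that creates it could be wrapped in a helper that reports the problem instead of printing a backtrace. The name `connect_or_explain` is hypothetical and this is not a patch to the tool.

```ruby
require 'socket'

# Runs the block that opens the connection to the log server. On a DNS or
# connection failure, prints a short message and returns nil.
def connect_or_explain(target, port)
  yield
rescue SocketError, SystemCallError => e
  warn "Cannot reach log server #{target}:#{port} (#{e.class}: #{e.message})"
  nil
end

# Example with a failure like the one above, simulated by raising directly:
result = connect_or_explain('graylog.abc.lab.com', 12201) do
  raise SocketError, 'getaddrinfo: Name or service not known'
end
puts 'import aborted' if result.nil?
```

SocketError covers name resolution failures such as the one in the backtrace, while SystemCallError covers errors like a refused connection.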
