
Comments (75)

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024 1

from manticoresearch.

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024 1

Please let me know if the files have been uploaded and appear in manticore storage @sanikolaev

Your data has been received. We are working on the problem.

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

My goal is to parse the log files and send them to Manticore to insert the data into the index table.

tomatolog avatar tomatolog commented on August 25, 2024

The manual section about the Logstash configuration shows the following output example:

output {
  elasticsearch {
   index => "dpkg_log"
   hosts => ["http://localhost:9308"]
   ilm_enabled => false
   manage_template => false
  }
}

There is no url => "http://localhost:9308/insert" property there.

tomatolog avatar tomatolog commented on August 25, 2024

I'm sure you cannot run multiple applications on the same port. You need to assign a different port to each application.
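For instance, if another application (such as Elasticsearch) already occupies one of Manticore's default ports, the listeners can be moved in the searchd section of manticore.conf. A sketch; the port numbers below are arbitrary examples, not recommendations:

```
searchd {
    listen = 127.0.0.1:9315          # binary API (default 9312)
    listen = 127.0.0.1:9307:mysql    # MySQL protocol (default 9306)
    listen = 127.0.0.1:9309:http     # HTTP/JSON (default 9308)
}
```

Whatever HTTP port is chosen here must also be used in the hosts => [...] value of the Logstash output section.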

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

I run this code and I get this error:

2024-06-13 23:54:57.294 [Ruby-0-Thread-9: :1] elasticsearch - Attempted to resurrect connection to dead ES instance, but got an error {:url=>"http://localhost:9308/", :exception=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :message=>"Elasticsearch Unreachable: [http://localhost:9308/][Manticore::ClientProtocolException] localhost:9308 failed to respond"}

input {
  file {
    path => ["/var/log/dpkg.log"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    mode => "read"
    exit_after_read => "true"
    file_completed_action => "log"
    file_completed_log_path => "/dev/null"
  }
}

output {
  elasticsearch {
    index => "logsnew"
    hosts => ["http://localhost:9308"]
    ilm_enabled => false
    manage_template => false
  }
}

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

This is my elasticsearch.yml:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
#cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
#network.host: 192.168.0.108
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: 192.168.0.108
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Allow wildcard deletion of indices:
#
#action.destructive_requires_name: false

#----------------------- BEGIN SECURITY AUTO CONFIGURATION -----------------------
#
# The following settings, TLS certificates, and keys have been automatically      
# generated to configure Elasticsearch security features on 13-06-2024 11:32:04
#
# --------------------------------------------------------------------------------

# Enable security features
xpack.security.enabled:  false

xpack.security.enrollment.enabled: false

# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
  enabled: true
  keystore.path: certs/http.p12

# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
  enabled: true
  verification_mode: certificate
  keystore.path: certs/transport.p12
  truststore.path: certs/transport.p12
# Create a new cluster with the current node only
# Additional nodes can still join the cluster later
cluster.initial_master_nodes: ["haninefawaz-VirtualBox"]

# Allow HTTP API connections from anywhere
# Connections are encrypted and require user authentication
http.host: 0.0.0.0

# Allow other nodes to join the cluster from anywhere
# Connections are encrypted and mutually authenticated
#transport.host: 0.0.0.0

#----------------------- END SECURITY AUTO CONFIGURATION -------------------------

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@tomatolog any idea?

sanikolaev avatar sanikolaev commented on August 25, 2024

@haninerabahfawaz The task is assigned to @Nick-S-2018 and we'll get to it eventually. In the meantime, what you can do to help yourself is try to understand why this https://play.manticoresearch.com/logstash/ works, but it doesn't work in your case.

Nick-S-2018 avatar Nick-S-2018 commented on August 25, 2024

Do you mean you need to add new log data from Logstash to Manticore regularly? I suppose that if you use the file input plugin, this option should help: https://www.elastic.co/guide/en/logstash/7.10/plugins-inputs-file.html#plugins-inputs-file-stat_interval

sanikolaev avatar sanikolaev commented on August 25, 2024

I want to thank everyone that my code is finally working well.

@haninerabahfawaz what was the problem?

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@Nick-S-2018

Hi, I got this error after I added stat_interval:

input {
  file {
    path => [
      "/var/log/dpkg.log",
      "/var/log/syslog",
      "/var/log/kern.log",
      "/var/log/auth.log",
      "/opt/lampp/logs/access_log",
      "/opt/lampp/logs/error_log"
    ]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    mode => "read"
    exit_after_read => "true"
    file_completed_action => "log"
    file_completed_log_path => "/dev/null"
    stat_interval => "5 second"
  }
}

output {
  elasticsearch {
    index => "logsinfo"
    hosts => ["http://localhost:9308"]
    ilm_enabled => false
    manage_template => false
  }
}

[ERROR] 2024-06-18 15:45:50.330 [[main]>worker1] elasticsearch - Encountered a retryable error (will retry with exponential backoff) {:code=>409, :url=>"http://localhost:9308/_bulk", :content_length=>31390}

sanikolaev avatar sanikolaev commented on August 25, 2024

Try adding:

max_packet_size = 128m

to section "searchd" of your Manticore configuration file and restarting it.
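In other words, the searchd section would gain one line (a sketch; keep the rest of your existing configuration as-is):

```
searchd {
    ...
    max_packet_size = 128m   # raise the request size limit so large _bulk batches are accepted
}
```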

sanikolaev avatar sanikolaev commented on August 25, 2024

Can you please provide instructions on how to reproduce this from scratch?

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@sanikolaev

manticore.conf:

searchd {
    listen = 127.0.0.1:9312
    listen = 127.0.0.1:9306:mysql
    listen = 127.0.0.1:9308:http
    log = /var/log/manticore/searchd.log
    query_log = /var/log/manticore/query.log
    pid_file = /var/run/manticore/searchd.pid
    max_packet_size = 128m
}

index loggsdata
    {
        type = rt
        path = /var/lib/manticore/loggsdata
        rt_attr_bigint = id
        rt_field = host
        rt_field = @timestamp
        rt_field = message
        rt_field = @version
        rt_field = path
    }

index logsdata
    {
        type = rt
        path = /var/lib/manticore/logsdata
        rt_attr_bigint = id
        rt_field = host
        rt_field = @timestamp
        rt_field = message
        rt_field = @version
        rt_field = path
    }
    
index logsinfo
    {
        type = rt
        path = /var/lib/manticore/logsinfo
        rt_attr_bigint = id
        rt_field = host
        rt_field = @timestamp
        rt_field = message
        rt_field = @version
        rt_field = path
    }
    
index loggs
    {
        type = rt
        path = /var/lib/manticore/loggs
        rt_attr_bigint = id
        rt_field = host
        rt_field = @timestamp
        rt_field = message
        rt_field = @version
        rt_field = path
    }

logstash:

input {
  file {
    path => [
      "/var/log/dpkg.log",
      "/var/log/syslog",
      "/var/log/kern.log",
      "/var/log/auth.log",
      "/opt/lampp/logs/access_log",
      "/opt/lampp/logs/error_log"
    ]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    mode => "read"
    exit_after_read => "true"
    file_completed_action => "log"
    file_completed_log_path => "/dev/null"
    stat_interval => "5 second"
  }
}

output {
  elasticsearch {
    index => "loggs"
    hosts => ["http://localhost:9308"]
    ilm_enabled => false
    manage_template => false
  }
}

COMMAND:

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstashloggs.conf

sanikolaev avatar sanikolaev commented on August 25, 2024

Please provide these files:

"/var/log/dpkg.log",
"/var/log/syslog",
"/var/log/kern.log",
"/var/log/auth.log",
"/opt/lampp/logs/access_log",
"/opt/lampp/logs/error_log"

sanikolaev avatar sanikolaev commented on August 25, 2024

I want to ask if the log files are sent correctly?

Where did you send them? I can't find them on our s3 storage.

sanikolaev avatar sanikolaev commented on August 25, 2024

I've received your files, thanks.

@PavelShilin89 pls try to reproduce the issue. I'm forwarding the files to you.

sanikolaev avatar sanikolaev commented on August 25, 2024

@haninerabahfawaz, sorry, we have no updates at the moment. Please remain patient. If you require a quicker resolution, consider exploring our professional services with an SLA. You can find more information here: https://manticoresearch.com/services

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024

@haninerabahfawaz please specify which versions of Logstash and Manticore you are using, and which OS you have.

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024

manticore: 6.3.0, logstash: 7.14.0, OS: Linux Ubuntu.

We currently support Logstash version 7.10.0, and everything works correctly on this version.

sanikolaev avatar sanikolaev commented on August 25, 2024

Okay, I used stat_interval to update the index if any records are added.

Does it have anything to do with the versions?

Is there any problem left? If stat_interval works for you - good. If it doesn't, what's the problem?

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024

@haninerabahfawaz I'm working with the files you sent, but I don't see the issue in the loggs table.
This is what I got:

mysql> SHOW TABLES;
+-----------+------+
| Index     | Type |
+-----------+------+
| loggs     | rt   |
| loggsdata | rt   |
| logsdata  | rt   |
| logsinfo  | rt   |
+-----------+------+
mysql> SELECT COUNT(*) FROM loggs;
+----------+
| count(*) |
+----------+
|   140846 |
+----------+
mysql> select count(*) from loggs where match('syslog');
+----------+
| count(*) |
+----------+
|   110650 |
+----------+

The number of lines remains unchanged.

djklim87 avatar djklim87 commented on August 25, 2024

And also when I write another query
select count(*) from logggs where match('syslog')
the number is always the same, which seems illogical.

You should probably check which duplicates you have; most likely they simply don't contain the word 'syslog'.
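A couple of sketch queries to narrow this down (assuming the loggs table and fields from the config above; the table and field names may differ in your setup):

```
-- total rows vs. rows whose indexed text contains 'syslog'
SELECT COUNT(*) FROM loggs;
SELECT COUNT(*) FROM loggs WHERE MATCH('syslog');

-- restrict the match to the path field only
SELECT COUNT(*) FROM loggs WHERE MATCH('@path syslog');
```

Comparing these counts with wc -l on the source files shows whether rows are actually duplicated or simply don't contain the word.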

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024

@haninerabahfawaz hi! Please check the path you specify for the dpkg.log file when indexing; maybe you are using a base dpkg.log that is created in a different path, which produces more data than you expect. You may need to specify the full path to the file everywhere and respect line breaks. If the problem persists, it would be great if you could send us all the files, configurations, logs and commands you are using, so that we can correctly reproduce the problem.

tomatolog avatar tomatolog commented on August 25, 2024

As said in the last comment:

If the problem persists, it would be great if you could send us all the files, configurations, logs and commands you are using. So that we can correctly reproduce the problem.

It would be best if you provided all the files that reproduce this case locally (configs, data sources / logs, and the query that shows the increasing document count).

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

I will send the files to @sanikolaev. Please forward them to everyone, because I can't attach them here.

tomatolog avatar tomatolog commented on August 25, 2024

You can upload your files to our s3 storage as described in the manual: https://manual.manticoresearch.com/dev/Reporting_bugs#Uploading-your-data

sanikolaev avatar sanikolaev commented on August 25, 2024

you can find them at manticore-logs-2307

I can't find them. Pls upload to issue-2307.

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

hanine@hanine-VirtualBox:/minio-binaries$ ./mc cp -r $HOME/issue-2307/ manticore/write-only/issue-2307/
...7/auth.log: 9.78 MiB / 9.78 MiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 285.13 MiB/s 0s
hanine@hanine-VirtualBox:/minio-binaries$ ls
manticore mc
hanine@hanine-VirtualBox:/minio-binaries$ cd manticore
hanine@hanine-VirtualBox:/minio-binaries/manticore$ ls
write-only
hanine@hanine-VirtualBox:/minio-binaries/manticore$ cd write-only
hanine@hanine-VirtualBox:/minio-binaries/manticore/write-only$ ls
issue-2307 manticore-logs-2307
hanine@hanine-VirtualBox:/minio-binaries/manticore/write-only$ cd issue-2307
hanine@hanine-VirtualBox:/minio-binaries/manticore/write-only/issue-2307$ ls
access_log auth.log error_log kern.log syslog

@sanikolaev I also uploaded it in issue-2307

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

Please let me know if the files have been uploaded and appear in manticore storage @sanikolaev

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

any news? @PavelShilin89

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024

@haninerabahfawaz try adding max_packet_size = 128M to manticore.conf.

This way I was able to fully index all the data without loss:

docker network create t_network 
cat <<EOL > /tmp/manticore.conf
searchd {
    listen = 0.0.0.0:9312
    listen = 0.0.0.0:9306:mysql
    listen = 0.0.0.0:9308:http
    log = /var/log/manticore/searchd.log
    query_log = /var/log/manticore/query.log
    pid_file = /var/run/manticore/searchd.pid
    data_dir = /var/lib/manticore
    max_packet_size = 128M
}
EOL
cat <<EOL > /tmp/logstash.conf
input {
  file {
    path => [
      "/var/log/syslog",
      "/var/log/kern.log",
      "/var/log/auth.log",
      "/opt/lampp/logs/access_log",
      "/opt/lampp/logs/error_log"
    ]
    start_position => "beginning"
    sincedb_path => "/dev/null"
    mode => "read"
    exit_after_read => "true"
    file_completed_action => "log"
    file_completed_log_path => "/dev/null"
    stat_interval => "5 second"
  }
}

filter {
  if [path] == "/var/log/syslog" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{GREEDYDATA}" }
    }
    date {
      match => [ "log_timestamp", "ISO8601" ]
      target => "@timestamp"
      remove_field => ["log_timestamp"]
    }
  }

  if [path] == "/var/log/kern.log" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{GREEDYDATA}" }
    }
    date {
      match => [ "log_timestamp", "ISO8601" ]
      target => "@timestamp"
      remove_field => ["log_timestamp"]
    }
  }

  if [path] == "/var/log/auth.log" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{GREEDYDATA}" }
    }
    date {
      match => [ "log_timestamp", "ISO8601" ]
      target => "@timestamp"
      remove_field => ["log_timestamp"]
    }
  }

  if [path] == "/opt/lampp/logs/access_log" {
    grok {
      match => { "message" => "%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \\[%{HTTPDATE:timestamp}\\] \\\"%{WORD:method} %{DATA:request} HTTP/%{NUMBER:httpversion}\\\" %{NUMBER:response} (?:%{NUMBER:bytes}|-)" }
      remove_field => ["clientip", "ident", "auth", "method", "request", "httpversion", "response", "bytes"]
    }

    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
      timezone => "Asia/Beirut"
      target => "@timestamp"
    }

    mutate {
      remove_field => ["timestamp"]
    }
  }

  if [path] == "/opt/lampp/logs/error_log" {
    grok {
      match => { "message" => "\\[%{DAY:day} %{MONTH:month} %{MONTHDAY:monthday} %{TIME:time} %{YEAR:year}\\] \\[%{DATA:loglevel}\\] \\[pid %{NUMBER:pid}\\] %{GREEDYDATA:rest}" }
    }

    mutate {
      add_field => { "timestamp" => "%{day} %{month} %{monthday} %{time} %{year}" }
    }

    date {
      match => [ "timestamp", "EEE MMM dd HH:mm:ss.SSSSSS yyyy" ]
      timezone => "Asia/Beirut"
      target => "@timestamp"
    }

    mutate {
      remove_field => ["day", "month", "monthday", "time", "year", "timestamp"]
    }
  }
}

output {
  elasticsearch {
    index => "loginf"
    hosts => ["http://manticore:9308"]
    ilm_enabled => false
    manage_template => false
  }
  stdout { codec => rubydebug }
}
EOL
docker run --network=t_network --rm -it -e EXTRA=1 --platform linux/amd64 --name manticore manticoresearch/manticore:6.3.0
docker run -e PIPELINE_BATCH_SIZE=10000 \
-e XPACK_MONITORING_ENABLED=false \
-e PIPELINE_WORKERS=8 \
--network=t_network --rm -it \
-v /Users/pavelshilin/Downloads/Logs_New/var:/var/log/ \
-v /Users/pavelshilin/Downloads/Logs_New/opt:/opt/lampp/logs/ \
-v /tmp/logstash.conf:/usr/share/logstash/pipeline/logstash.conf \
docker.elastic.co/logstash/logstash:7.10.0

mysql> DESCRIBE loginf;
+------------+-----------+----------------+
| Field      | Type      | Properties     |
+------------+-----------+----------------+
| id         | bigint    |                |
| rest       | text      | indexed stored |
| loglevel   | text      | indexed stored |
| message    | text      | indexed stored |
| pid        | text      | indexed stored |
| path       | text      | indexed stored |
| host       | text      | indexed stored |
| @version   | text      | indexed stored |
| @timestamp | timestamp |                |
| tags       | json      |                |
+------------+-----------+----------------+
mysql> select count(*) from loginf;
+----------+
| count(*) |
+----------+
|   193225 |
+----------+

The data after indexing is identical to that contained in your files.

❯ cat /Users/pavelshilin/Downloads/Logs_New/var/auth.log | wc -l
    3252
❯ cat /Users/pavelshilin/Downloads/Logs_New/var/kern.log | wc -l
   29448
❯ cat /Users/pavelshilin/Downloads/Logs_New/var/syslog | wc -l
  156592
❯ cat /Users/pavelshilin/Downloads/Logs_New/opt/access_log | wc -l
    3821
❯ cat /Users/pavelshilin/Downloads/Logs_New/opt/error_log | wc -l
     112

Total 193225

tomatolog avatar tomatolog commented on August 25, 2024

If this helps, it would be better to create another ticket about returning an Elasticsearch-compatible reply when an incoming packet overflows the searchd limit.

tomatolog avatar tomatolog commented on August 25, 2024

It is still not clear how you count the repetitions. It would be better to provide your complete case (we got the Logstash config but not the daemon config). An initial guess with the daemon config shows that searchd's max_packet_size = 128M should be set for Logstash to finish well.

Now, you said there is repetition, but that output matches the line counts of the log files you provided:

mysql> select count(*) from loginf;
+----------+
| count(*) |
+----------+
|   193225 |
+----------+

That is why it is not clear what you mean when you talk about repetition in the records.

It would be better to provide a complete example, i.e.:

  • the Logstash configs
  • all log files posted into Logstash
  • the daemon config
  • the queries you issue to get the data statistics, along with their replies

That would allow us to run the case on an empty instance, compare the output of the query you posted with the replies after Logstash posts the data, and check the differences.

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@tomatolog

That's not what I mean.

There is always new data arriving in the log files.

My goal is that, after running the Logstash configuration, if any new data is added to any input file, this new data is also added to the Manticore table, so the index is always up to date.

tomatolog avatar tomatolog commented on August 25, 2024

I do not use Logstash, but all your examples use sincedb_path => "/dev/null", and the documentation says that is where Logstash keeps track of the current position of the monitored log files. Maybe it's better to consult the Logstash forum to get a configuration that works the way you need?

sanikolaev avatar sanikolaev commented on August 25, 2024

@haninerabahfawaz I agree, sincedb_path => "/dev/null" may not be good if you want to put new data into Manticore (or Elasticsearch) as soon as it's appended to the file. Pls experiment with it.

@PavelShilin89 pls try reproducing the same issue with Elasticsearch (by lowering its http.max_content_length setting). It's interesting what Elasticsearch and Logstash return in this case and whether we can improve our error message so it makes more sense (so it's clear that the user needs to increase max_packet_size).
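For the Elasticsearch side of that experiment, the limit can be lowered in elasticsearch.yml (a sketch; 1mb is an arbitrary small value chosen to trigger the error quickly):

```
# elasticsearch.yml on the test instance
http.max_content_length: 1mb
```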

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024

When reproducing the same problem with Elasticsearch by reducing its http.max_content_length, we also don't get a clearer error or notification.

The Elasticsearch logs contain only:

  • warnings about low disk space and JVM garbage collection, which may indicate possible performance issues
  • a deprecation warning that using types in bulk queries is deprecated

There is an error in Logstash logs:

[2024-07-10T09:41:39,215][ERROR][logstash.outputs.elasticsearch][main][7d2347ba5a0b22293324ccaa54fff4d7d4657d43825b29f6901128b6a93a70ac] Encountered a retryable error. Will Retry with exponential backoff {:code=>413, :url=>"http://es1:9200/_bulk"}

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@sanikolaev @tomatolog @PavelShilin89
yes, we can use sincedb_path => "/var/lib/logstash/since_db" .

sincedb_path is where Logstash persists the read position of each file across restarts. This means that if you set sincedb_path to NUL (on Windows) or /dev/null (on UNIX), every time you restart Logstash it treats each file as new and rereads it.

My question:
after modifying sincedb_path, if any records are added to any input log file, they should be added to the index automatically...

Example:
I run the command and let's say the number of records is 100000. Then new records are added to the input files, let's say 3000 new records; they should be added automatically.
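The behavior described above corresponds to a tailing input rather than a one-shot read. A minimal sketch (untested; paths and the interval are illustrative):

```
input {
  file {
    path => ["/var/log/syslog"]
    mode => "tail"                                # follow files instead of reading them once
    start_position => "beginning"                 # read existing content on the first run
    sincedb_path => "/var/lib/logstash/since_db"  # persist the read position across restarts
    stat_interval => "5 second"                   # how often to check files for new lines
  }
}
```

With mode => "tail", the exit_after_read and file_completed_* options are not needed: Logstash keeps running and ships new lines as they appear.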

sanikolaev avatar sanikolaev commented on August 25, 2024

After modifying sincedb_path , if any records are added in any input log file, they should be added to the index automatically...

I think so. What's the question? Does it happen with Elasticsearch, but not with Manticore?

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@sanikolaev
No, I don't mean that...

You know that logs are dynamic files, not static ones: new entries are constantly appended to them.
This means that the index table must always be kept up to date.

We need to add something to the Logstash config that always updates the index whenever a new log line is added.

sanikolaev avatar sanikolaev commented on August 25, 2024

We need to add something in the logstash file that always updates the index file if any log is added

As I understand, sincedb_path does that, doesn't it? Anyway, it looks like you are currently struggling with a Logstash issue, not a Manticore one, right? In that case it's better to discuss it with Logstash experts.

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@sanikolaev
sincedb_path only updates the stored read position of the log files.
The goal is to always keep the index table in Manticore updated.
Why?
Because the input files are constantly being modified.

haninerabahfawaz avatar haninerabahfawaz commented on August 25, 2024

@sanikolaev @PavelShilin89

I always get this problem in my system log:

2024-07-15T17:51:37.423966+03:00 hanine-VirtualBox logstash[3445]: [2024-07-15T17:51:37,423][ERROR][logstash.outputs.elasticsearch][main][4a1440c56bbde552634034a0198050029a19e1012f1897d5b1d8681720180424] Encountered a retryable error. Will Retry with exponential backoff {:code=>409, :url=>"http://localhost:9308/_bulk"}

PavelShilin89 avatar PavelShilin89 commented on August 25, 2024

@haninerabahfawaz Is your goal to update an old value in the log file and have it updated in Manticore Search as well?

sanikolaev avatar sanikolaev commented on August 25, 2024

Here's an example of integrating Logstash with Manticore, so as soon as a new line is added to the log file, it's immediately inserted into Manticore by Logstash:

Execute this in Ubuntu Jammy:

mkdir /usr/share/logstash && cd $_

wget -q https://artifacts.elastic.co/downloads/logstash/logstash-7.10.0-linux-x86_64.tar.gz

tar -xvzf logstash-7.10.0-linux-x86_64.tar.gz

cd /tmp/
wget https://repo.manticoresearch.com/manticore-repo.noarch.deb
sudo dpkg -i manticore-repo.noarch.deb
sudo apt -y update
sudo apt -y install manticore manticore-extra

cd /usr/local/src

ln -s /usr/share/logstash/logstash-7.10.0/bin/logstash /usr/bin/logstash

echo 'input { file { path => ["/var/log/dpkg.log"] start_position => "end" sincedb_path => "/tmp/sincedb_path" mode => "tail" } } output { elasticsearch { index => "dpkg_log" hosts => ["http://localhost:9308"] ilm_enabled => false manage_template => false } }' > logstash.conf

systemctl start manticore

logstash -f logstash.conf &

# Wait until Logstash starts

echo -e "2023-05-31 10:42:55 status triggers-awaited ca-certificates-java:all 20190405ubuntu1.1" >> /var/log/dpkg.log

mysql -NB -P9306 -h0 -e "select count(*) from dpkg_log"

# You'll get:
# 1

echo -e "2023-05-31 10:42:55 status triggers-awaited ca-certificates-java:all 20190405ubuntu1.1\n" >> /var/log/dpkg.log

mysql -NB -P9306 -h0 -e "select count(*) from dpkg_log"

# You'll get:
# 2

# and so on

I see no problem. If you see any issue please modify this example to demonstrate it.
