Giter Site home page Giter Site logo

Comments (14)

allinurl avatar allinurl commented on September 23, 2024

Does this occur when you execute it directly on the machine, without using Docker?

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

Does this occur when you execute it directly on the machine, without using Docker?

Standalone goaccess working as intended (with minimum workload).

Got some updates:

In my initial question i didnt mention one important detail - my access log rotating every 1 hour.
Its inode number does not changing
Here are rotation options:

/var/log/docker/*.log {
        hourly
        rotate 336
        copytruncate
        dateext
        datehourago
        dateformat _%Y-%m-%d_%H%M%S
        extension .log
        missingok
        compress
        notifempty
        sharedscripts
        postrotate
                docker exec `docker ps -q -f name=nginx_container` nginx -s reload > /dev/null 2>/dev/null || true
        endscript
}

The 1st hour after goaccess started im getting an accurate stats.

At the moment the second hour ticks and the access.log has been rotated (access.log inode value is still the same) - then the deviations started to appears.
Some of them:
In Overall Analyzed Requests pane

  1. Unique Visitors counter stops count
  2. Total\Valid requests starting to count with huge deviation (~10x )
  3. Tx Amount starting to count with huge deviation
  4. Requested Files counter stops count
  5. Log Size is correct

On the Time Distribution pane

  • Hour before the access.log has been rotated:
    hits -> 849,513
    visitors-> 48,750
    TX Amount -> 11Gb

this stats are fine

  • The next hour after the access.log has been rotated :
    hits -> 6,981,924
    visitors -> 2,767 (this stat is staying in freezed condition - its value does not changing)
    TX Amount -> 105.6 Gb

this stats received in the first 20 mins of a hour and they are NOT correct
Just to know the correct values for the request count during the fresh hour: wc -l access.log returns 213618 total requests
So im expecting to receive ~ 1,063,131 total requests instead of 6,981,924 in goaccess dashboard

from goaccess.

allinurl avatar allinurl commented on September 23, 2024

Are you running this using tail -F? Have you tried passing the log directly, for example?

# goaccess access.log -o /reports/access.log.html --real-time-html --origin=https://my.domain/ --ws-url=wss://my.domain/ws --log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"' --time-format='%T' --date-format='%d/%b/%Y' --db-path=/internal_db --geoip-database=/geo_db/GeoIP2-Country.mmdb --persist --restore

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

Are you running this using tail -F? Have you tried passing the log directly, for example?

# goaccess access.log -o /reports/access.log.html --real-time-html --origin=https://my.domain/ --ws-url=wss://my.domain/ws --log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"' --time-format='%T' --date-format='%d/%b/%Y' --db-path=/internal_db --geoip-database=/geo_db/GeoIP2-Country.mmdb --persist --restore

in my docker container im running goaccess like this (as stated in my docker-compose file):

command: >
      /app/logs/access.log
      -o /app/report/access.log.html
      --real-time-html
      --origin=https://my.domain
      --ws-url=wss://my.domain/ws
      --log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"'
      --keep-last=30
      --time-format='%T'
      --date-format='%d/%b/%Y'
      --db-path=/app/internal_db
      --geoip-database=/app/geo_db/GeoIP2-Country.mmdb
      --geoip-database=/app/geo_db/GeoLite2-ASN.mmdb
      --ignore-panel=OS
      --ignore-panel=BROWSERS
      --persist
      --restore

so the log goes directly to goaccess without piping

from goaccess.

allinurl avatar allinurl commented on September 23, 2024

does it function properly outside of Docker when passing the log directly?

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

does it function properly outside of Docker when passing the log directly?

Yep, standalone goaccess (outside Docker env) working as intended (with minimum workload).

from goaccess.

allinurl avatar allinurl commented on September 23, 2024

Got it. I'm just trying to pinpoint the problem as precisely as possible. So, it seems the issue arises when running via Docker or when piping data in using tail -F, right?

If you remove --persist and --restore, does the issue still occur with the real-time counters? Also, are you experiencing the same issues with terminal output? Any additional details would be appreciated so I can attempt to reproduce this on my end. Thanks!

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

The issue arises when running via Docker.
I did not use tail -F as i read its more advisable to send logs direct to goaccess to avoid potential issues.
I will try to remove --persist and --restore options and report as soon as possible.
I will also try to check the TUI behavior.

I can provide you with the docker-compose file im using.

Thank you so much for your time.

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

Update
removing of --persist and --restore options does not help.
Here is the docker-compose file

  parser:
    image: allinurl/goaccess
    logging:
      driver: syslog
      options:
        tag: "goaccess_parser"
    expose:
      - "7890"
    command: >
      /app/logs/access.log
      -o /app/report/acces.log.html
      --jobs=2
      --real-time-html
      --origin=https://my.domain
      --ws-url=wss://my.domain/ws
      --log-format='%^ %^ %^ %^ %^ %v %h %^ %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" \"%^\" \"%^\" \"%T\"'
      --keep-last=30
      --time-format=%H:%M:%S
      --date-format=%d/%b/%Y
      --db-path=/app/internal_db
      --geoip-database=/app/geo_db/GeoIP2-Country.mmdb
      --geoip-database=/app/geo_db/GeoLite2-ASN.mmdb
      --ignore-panel=OS
      --ignore-panel=BROWSERS
    volumes:
      - ./logs:/app/logs:ro
      - ./reports:/app/report
      - ./parser/internal_db:/app/internal_db
      - ./parser/geo_db:/app/geo_db

Unique Visitors Per Day Pane (wrong values):
image

CUM. T.S\MAX. T.S values looking odd to me

Time Distribution Pane (15h is ok, 16h is wrong. 16h values received in 11 minutes)
image

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

Update

Time Distribution Pane:

Hours from different days overlap each other:
Say - Yesterday 13th hour has 513k hits.
When 13th hour ticks today, then its value starting to add to yesterdays 13th hour (513k for yesterday + XXXk for today)

from goaccess.

allinurl avatar allinurl commented on September 23, 2024

@H1r0Sh1mA, just to clarify, are you encountering two separate issues? One regarding the real-time problem with Docker and the other regarding the accuracy of the stats numbers? If so, I suggest opening a new issue for the stats so we can address them separately.

Regarding the discrepancy with the cum/max/avg time served, it seems like there might be an issue with your log format. Could you share the actual lines from your access log? That would help us investigate further. Thanks

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

Could you share the actual lines from your access log? That would help us investigate further.

Feb 17 12:00:04 nginx_hostname nginx_container[13971]: my.domain 109.124.233.55 - - [17/Feb/2024:12:00:04 +0000] "POST /orders/event HTTP/1.1" 200 0 "-" "-" "0.001" "10.0.4.163:80" "0.002"

the log format arguments:

--log-format='%^ %^ %^ %^ %^ %v %h %^ %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" \"%^\" \"%^\" \"%T\"'

escaping trails (like: \"%r\" ) are using due to the docker-compose format style. Now im issuing it like this:

   command: [
      "/app/logs/accesst.log",
      "-o", "/app/report/access.html",
      "--jobs=8",
      "--keep-last=3",
      "--real-time-html",
      "--origin=https://my.domain",
      "--ws-url=wss://my.domain/ws",
      "--log-format='%^ %^ %^ %^ %^ %v %h %^ %^[%d:%t %^] \"%r\" %s %b \"%R\" \"%u\" \"%^\" \"%^\" \"%T\"'",
      "--time-format=%H:%M:%S",
      "--date-format=%d/%b/%Y",
      "--date-spec=min",
      "--db-path=/app/internal_db",
      "--geoip-database=/app/geo_db/GeoIP2-Country.mmdb",
      "--geoip-database=/app/geo_db/GeoLite2-ASN.mmdb",
      "--ignore-panel=OS",
      "--ignore-panel=BROWSERS",
      "--persist",
      "--restore"
    ]

Didnt have a chance yet to test --datetime-format='%d/%b/%Y:%H:%M:%S %z' instead of --time-format=%H:%M:%S and --date-format=%d/%b/%Y

just to clarify, are you encountering two separate issues? One regarding the real-time problem with Docker and the other regarding the accuracy of the stats numbers?

I am experiencing all these issues in docker environment with real-time parsing mode.
That why i have put them all here.

To avoid confusion i will open issues one by one separately.

from goaccess.

H1r0Sh1mA avatar H1r0Sh1mA commented on September 23, 2024

no updates so far?

from goaccess.

allinurl avatar allinurl commented on September 23, 2024

Some questions here...

  1. It seems like there might be an issue with your logrotate configuration. Can you verify how it's performing?

  2. Can you also share a simple way to reproduce this issue? I need the most basic command that triggers this issue so that we can narrow down the parameters that might not be causing it. e.g.,

 /app/logs/access.log
      -o /app/report/access.log.html
      --real-time-html
      --log-format='%^ %^ %^ %^ %^ %^ %v %h %^[%d:%t %^] "%r" %s %b "%R" "%u" "%^" "%h:%^" "%T"'
      --time-format='%T'
      --date-format='%d/%b/%Y'
  1. Also, have you tried running it as posted on the README.md and see if that also causes issues?
    tail -F access.log | docker run -p 7890:7890 --rm -i -e LANG=$LANG allinurl/goaccess -a -o html --log-format COMBINED --real-time-html - > report.html
  1. Can you replicate this issue on another machine that also performs log rotation?

from goaccess.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.