Random freezes about flexo HOT 12 CLOSED

nroi avatar nroi commented on June 16, 2024
Random freezes

from flexo.

Comments (12)

nroi avatar nroi commented on June 16, 2024 1

I was able to reproduce this issue simply by running:

while true; do sudo pacman -Sy; done

and waiting for a while. It ran fine for the first few seconds, and then the following showed up in the logs (log level DEBUG):

May 23 22:41:06 archWS flexo[544]: [2021-05-23T20:41:06.090Z DEBUG flexo] Reading header from client.
May 23 22:41:16 archWS flexo[544]: [2021-05-23T20:41:16.093Z DEBUG flexo::mirror_flexo] Received header from client

Notice the 10 second gap between the first and the second message. The 2nd log entry appears after pacman has given up:

error: failed retrieving file 'extra.db' from 127.0.0.1:7878 : Operation timed out after 10001 milliseconds with 0 out of 0 bytes received

So this is definitely a bug in Flexo, thank you for reporting this! I have all the information I need to troubleshoot this.
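A gap like the one above can also be found mechanically rather than by eye. The following is a minimal sketch, assuming the bracketed timestamp format shown in the two log entries; the function name `log_gaps` and the 5 second threshold are illustrative choices, not part of Flexo.

```python
import re
from datetime import datetime, timezone

# Matches the bracketed UTC timestamp in the entries above,
# e.g. "[2021-05-23T20:41:06.090Z DEBUG flexo]".
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z ")


def log_gaps(lines, threshold_seconds=5.0):
    """Return (seconds, previous_line, line) for consecutive entries
    that are more than threshold_seconds apart."""
    gaps = []
    previous = None  # (timestamp, line) of the last timestamped entry
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # ignore lines without a recognizable timestamp
        stamp = datetime.strptime(
            match.group(1), "%Y-%m-%dT%H:%M:%S.%f"
        ).replace(tzinfo=timezone.utc)
        if previous is not None:
            delta = (stamp - previous[0]).total_seconds()
            if delta > threshold_seconds:
                gaps.append((delta, previous[1], line))
        previous = (stamp, line)
    return gaps


sample = [
    "May 23 22:41:06 archWS flexo[544]: [2021-05-23T20:41:06.090Z DEBUG flexo] Reading header from client.",
    "May 23 22:41:16 archWS flexo[544]: [2021-05-23T20:41:16.093Z DEBUG flexo::mirror_flexo] Received header from client",
]

for seconds, before, after in log_gaps(sample):
    print(f"{seconds:.3f}s without log output before: {after}")
```

Feeding it the output of `journalctl -u flexo` (one entry per line) should work the same way, provided the unit is really named `flexo`.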

nroi avatar nroi commented on June 16, 2024 1

Ok, this is getting interesting.

I was able to reproduce this behavior without Flexo, by using NGINX to mimic Flexo's behavior. With the following server entry in /etc/nginx/nginx.conf:

server {
    listen 7979;
    server_name localhost;

    location = /core/os/x86_64/core.db {
        return 301 https://mirror.rackspace.com/archlinux/core/os/x86_64/core.db;
    }

    location = /extra/os/x86_64/extra.db {
        return 301 https://mirror.rackspace.com/archlinux/extra/os/x86_64/extra.db;
    }

    location = /community/os/x86_64/community.db {
        return 301 https://mirror.rackspace.com/archlinux/community/os/x86_64/community.db;
    }

    location = /multilib/os/x86_64/multilib.db {
        return 301 https://mirror.rackspace.com/archlinux/multilib/os/x86_64/multilib.db;
    }
}

Configure pacman to use 127.0.0.1:7979 as its mirror, and then run:

while true; do sudo pacman -Sy; done

Every once in a while, I notice a lag where pacman stalls for a few seconds. Those lags are also visible in the NGINX access log (/var/log/nginx/access.log). Notice the 6-second delay between the 3rd and the 4th request:

127.0.0.1 - - [24/May/2021:10:51:29 +0200] "GET /core/os/x86_64/core.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"
127.0.0.1 - - [24/May/2021:10:51:29 +0200] "GET /extra/os/x86_64/extra.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"
127.0.0.1 - - [24/May/2021:10:51:30 +0200] "GET /community/os/x86_64/community.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"
127.0.0.1 - - [24/May/2021:10:51:36 +0200] "GET /multilib/os/x86_64/multilib.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"
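The same delay can be extracted from the access log automatically. This is a rough sketch, assuming the log format shown above; `slow_requests` and its threshold are made-up names for illustration. Because NGINX answers each request immediately with a 301, the time between two log entries is roughly the time pacman spent elsewhere, including the download from the redirect target.

```python
import re
from datetime import datetime

# Timestamp as written in the access log above, e.g. [24/May/2021:10:51:29 +0200].
ACCESS_TIME = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")
# Request line, e.g. "GET /core/os/x86_64/core.db HTTP/1.1".
REQUEST = re.compile(r'"[A-Z]+ (\S+) ')


def slow_requests(lines, threshold_seconds=5.0):
    """Return (path, delay) for every request that arrived at least
    threshold_seconds after the previous one."""
    results = []
    previous = None  # timestamp of the previous request
    for line in lines:
        time_match = ACCESS_TIME.search(line)
        request_match = REQUEST.search(line)
        if time_match is None or request_match is None:
            continue
        # %b parses English month abbreviations in the default C locale.
        stamp = datetime.strptime(time_match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if previous is not None:
            delay = (stamp - previous).total_seconds()
            if delay >= threshold_seconds:
                results.append((request_match.group(1), delay))
        previous = stamp
    return results


sample = [
    '127.0.0.1 - - [24/May/2021:10:51:29 +0200] "GET /core/os/x86_64/core.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"',
    '127.0.0.1 - - [24/May/2021:10:51:29 +0200] "GET /extra/os/x86_64/extra.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"',
    '127.0.0.1 - - [24/May/2021:10:51:30 +0200] "GET /community/os/x86_64/community.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"',
    '127.0.0.1 - - [24/May/2021:10:51:36 +0200] "GET /multilib/os/x86_64/multilib.db HTTP/1.1" 301 170 "-" "pacman/5.2.2 (Linux x86_64) libalpm/12.0.2"',
]

for path, delay in slow_requests(sample):
    print(f"{path} arrived {delay:.0f}s after the previous request")
```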

nroi avatar nroi commented on June 16, 2024 1

Looks like it's not a bug in Flexo after all. I was misled by this error message:

error: failed retrieving file 'extra.db' from 127.0.0.1:7878 : Operation timed out after 10001 milliseconds with 0 out of 0 bytes received

Even though it says it timed out while retrieving the file from 127.0.0.1, it seems that the timeout actually occurs while downloading the file from the remote mirror. Flexo does not download database files from the remote mirror; it serves an HTTP redirect response instead. If pacman then runs into a timeout while downloading the database file from the actual mirror, it logs this error message.
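To make the distinction concrete, here is a small sketch of how the host that really delivers the file differs from the host named in the error message once a redirect is involved. The helper `download_host` is hypothetical, and the 301 status is an assumption borrowed from the NGINX reproduction; the exact redirect status Flexo sends is not stated in this thread.

```python
from urllib.parse import urljoin, urlsplit

# Status codes that tell an HTTP client to fetch the resource elsewhere.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def download_host(request_url, status, location=None):
    """Return the host that serves the body for one response:
    the redirect target if the response is a redirect, else the
    host of the original request."""
    if status in REDIRECT_CODES and location:
        # Location may be relative, so resolve it against the request URL.
        return urlsplit(urljoin(request_url, location)).hostname
    return urlsplit(request_url).hostname


# The URL pacman asks for (and reports in its error message) ...
requested = "http://127.0.0.1:7878/extra/os/x86_64/extra.db"
# ... and the redirect target used in the NGINX reproduction.
target = "https://mirror.rackspace.com/archlinux/extra/os/x86_64/extra.db"

print(download_host(requested, 301, target))
print(download_host(requested, 200))
```

With a redirect, a timeout while fetching the body is spent waiting on the first host printed, even though the error message names the second.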

This issue also occurs when I use mirror.rackspace.com directly in /etc/pacman.d/mirrorlist, so it's unrelated to Flexo.

@aude So the solution for you is to replace mirror.rackspace.com with a better mirror. I'm fairly certain this should solve the issue, but if it doesn't, please leave a comment.

I have created a new issue to improve Flexo so that unresponsive mirrors are replaced by better mirrors even if the unresponsiveness already occurs when the database files are fetched. But this will obviously not work if you configure your flexo.toml with mirror_selection_method = "predefined" and only a single mirror in mirrors_predefined. So my advice is to either use mirror_selection_method = "auto", or to have a handful of mirrors in mirrors_predefined, so that Flexo can switch the mirror if necessary.
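The advice above can be turned into a quick sanity check of a configuration. This is only a sketch: the function names are invented for illustration, the check covers just the two settings discussed here, and the path /etc/flexo/flexo.toml in the usage note is an assumption about where the file lives.

```python
def mirror_config_warnings(config):
    """Return warnings for a parsed flexo.toml (a plain dict) whose
    mirror settings leave Flexo with no alternative mirror."""
    warnings = []
    method = config.get("mirror_selection_method")
    mirrors = config.get("mirrors_predefined", [])
    if method == "predefined":
        if not mirrors:
            warnings.append(
                'mirrors_predefined must not be empty when '
                'mirror_selection_method is "predefined"'
            )
        elif len(mirrors) == 1:
            warnings.append(
                f"only one predefined mirror ({mirrors[0]}): Flexo cannot "
                "switch to another mirror if this one is unresponsive"
            )
    return warnings


def check_file(path):
    """Parse a TOML file and check it. Requires Python 3.11+ for tomllib."""
    import tomllib  # imported lazily so the check above works on older Pythons

    with open(path, "rb") as handle:
        return mirror_config_warnings(tomllib.load(handle))


# The settings from the configuration discussed in this thread.
single_mirror = {
    "mirror_selection_method": "predefined",
    "mirrors_predefined": ["https://mirror.rackspace.com/archlinux/"],
}

for warning in mirror_config_warnings(single_mirror):
    print("warning:", warning)
```

Calling `check_file("/etc/flexo/flexo.toml")` would run the same check against the installed configuration, if that is where your flexo.toml is located.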

nroi avatar nroi commented on June 16, 2024

@aude are you running Flexo with Docker, or are you using the AUR package?

aude avatar aude commented on June 16, 2024

I'm using the AUR package.

I'll keep an eye out for patterns.

nroi avatar nroi commented on June 16, 2024

@aude two more questions:

  1. What error message did you receive with Pacman? connection refused, or something else?
  2. Did this happen right after you booted your machine or right after you started Flexo? In that case, it could be caused by the latency tests that Flexo runs at startup.

aude avatar aude commented on June 16, 2024
  1. pacman -Sy was just hanging. Each request eventually timed out, one by one.
  2. Yeah. You might be on to something there 🤔 I'll check more for that. (Maybe worth mentioning is that I run with mirrors_predefined and only 1 mirror. And, network is not always available at boot, I experiment a lot with my computer.)

nroi avatar nroi commented on June 16, 2024

@aude could you send me your flexo.toml file so I can look into it?

> Maybe worth mentioning is that I run with mirrors_predefined and only 1 mirror.

So then it's not related to the latency tests: those should not run when you use mirror_selection_method = "predefined" and a non-empty mirrors_predefined list.

> I experiment a lot with my computer.

Experimenting users are always great for finding and eradicating bugs 😉

aude avatar aude commented on June 16, 2024

😄

My flexo.toml is mostly defaults, here are the things I have changed:

--- flexo.toml  2021-03-21 15:29:47.000000000 +0100
+++ flexo.toml.new      2021-03-21 16:23:28.467698344 +0100
@@ -29,7 +29,7 @@
 mirrorlist_latency_test_results_file = "/var/cache/flexo/state/latency_test_results.json"

 # The IP address to listen on.
-listen_ip_address = "127.0.0.1"
+# listen_ip_address = "127.0.0.1"

 # The port to listen on.
 port = 7878
@@ -40,7 +40,7 @@
 #           to select only sufficiently fast mirrors.
 #   "predefined": To only choose the mirrors defined for the variable
 #                 mirrors_predefined (see below).
-mirror_selection_method = "auto"
+mirror_selection_method = "predefined"


 # The meaning of this variable depends on the mirror_selection_method:
@@ -51,7 +51,7 @@
 # This list must not be empty if mirror_selection_method has been set to "predefined".
 # Mirrors in this list should NOT include the $repo/os/$arch suffix, so you should add
 # something like "http://archlinux.mirror.org/" or "https://mirror.org/archlinux/".
-mirrors_predefined = []
+mirrors_predefined = ["https://mirror.rackspace.com/archlinux/"]

 # The number of versions kept in the cache. If set to a positive number, Flexo
 # will keep at most this many versions in the cache. If set to 0, packages will
@@ -67,6 +67,9 @@
 # [[custom_repo]]
 #     name = "archzfs"
 #     url = "https://archzfs.com"
+[[custom_repo]]
+    name = "archrepo.local"
+    url = "http://archrepo.local/"

 # Various settings that apply if mirror_selection_method has been set to "auto".
 [mirrors_auto]

aude avatar aude commented on June 16, 2024

It happened again, after a long suspend!

Seems related, I'll investigate more.

nroi avatar nroi commented on June 16, 2024

@aude thanks for taking the time to investigate this!

It might also help if you increase the logging verbosity to DEBUG:

  1. Edit the file /usr/lib/systemd/system/flexo.service and change RUST_LOG=info to RUST_LOG=debug
  2. Apply the changes of the systemd service with sudo systemctl daemon-reload
  3. Restart Flexo: sudo systemctl restart flexo

Maybe something shows up in the log before it freezes.

aude avatar aude commented on June 16, 2024

Thanks for investigating! Glad to learn the root cause. What a ride O.O
