Giter Site home page Giter Site logo

Comments (1)

mohammaddawoodshaik avatar mohammaddawoodshaik commented on May 9, 2024

In our case also we are seeing this issue. We are having one volume with 3-brick:3-replica setup with 3-nodes. in one node we have restarted all gluster services. And after that glusterd was continously crashing, stating 0-management: Initialization of volume 'management' failed, review your volfile again.

Logs from GlusterD:

[2023-10-23 04:35:15.477485] W [glusterfsd.c:1570:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f50547c96db] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xfd) [0x55de29541b4d] -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x55de29541994] ) 0-: received signum (15), shutting down
[2023-10-23 04:38:56.022520] I [MSGID: 100030] [glusterfsd.c:2847:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 6.10 (args: /usr/sbin/glusterd -N -p /var/run/glusterd.pid)
[2023-10-23 04:38:56.022658] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 88
[2023-10-23 04:38:56.071590] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536
[2023-10-23 04:38:56.071663] I [MSGID: 106479] [glusterd.c:1478:init] 0-management: Using /var/lib/glusterd/ as working directory
[2023-10-23 04:38:56.071704] I [MSGID: 106479] [glusterd.c:1484:init] 0-management: Using /var/run/gluster as pid file working directory
[2023-10-23 04:38:56.073867] I [socket.c:1022:__socket_server_bind] 0-socket.management: process started listening on port (24007)
[2023-10-23 04:38:56.097616] I [socket.c:965:__socket_server_bind] 0-socket.management: closing (AF_UNIX) reuse check socket 11
[2023-10-23 04:38:56.097971] I [MSGID: 106059] [glusterd.c:1860:init] 0-management: base-port override: 49152
[2023-10-23 04:38:56.097982] I [MSGID: 106059] [glusterd.c:1865:init] 0-management: max-port override: 49152
[2023-10-23 04:39:04.956934] I [MSGID: 106513] [glusterd-store.c:2394:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 60000
[2023-10-23 04:39:34.308779] I [MSGID: 106544] [glusterd.c:152:glusterd_uuid_init] 0-management: retrieved UUID: ddc1d9f0-a6e5-4751-8e35-0f91d326fa41
[2023-10-23 04:39:34.912425] I [MSGID: 106498] [glusterd-handler.c:3687:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2023-10-23 04:39:34.912618] I [MSGID: 106498] [glusterd-handler.c:3687:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2023-10-23 04:39:34.912669] W [MSGID: 106061] [glusterd-handler.c:3490:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2023-10-23 04:39:34.912720] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2023-10-23 04:39:34.916260] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
Final graph:
+------------------------------------------------------------------------------+
1: volume management
2: type mgmt/glusterd
3: option rpc-auth.auth-glusterfs on
4: option rpc-auth.auth-unix on
5: option rpc-auth.auth-null on
6: option rpc-auth-allow-insecure on
7: option transport.listen-backlog 1024
8: option max-port 49152
9: option base-port 49152
10: option transport.address-family inet
11: option transport.socket.listen-port 24007
12: option event-threads 1
13: option ping-timeout 0
14: option transport.socket.read-fail-log off
15: option transport.socket.keepalive-interval 2
16: option transport.socket.keepalive-time 10
17: option glusterd-sockfile /etc/glusterd_socket/gluster.sock
18: option transport-type socket
19: option working-directory /var/lib/glusterd/
20: end-volume
21:
+------------------------------------------------------------------------------+
[2023-10-23 04:39:34.916252] W [MSGID: 106061] [glusterd-handler.c:3490:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2023-10-23 04:39:35.015736] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2023-10-23 04:39:35.017093] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2023-10-23 04:39:35.017342] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2023-10-23 04:39:35.021484] I [MSGID: 106163] [glusterd-handshake.c:1389:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 60000
[2023-10-23 04:39:35.022776] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0c333528-b176-4229-8df5-230844b7ee6f, host: <>, port: 0
[2023-10-23 04:39:35.562534] I [glusterd-utils.c:6314:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/ndp_brick
[2023-10-23 04:39:36.599410] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2023-10-23 04:40:04.394492] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2023-10-23 04:40:04.394624] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: nfs already stopped
[2023-10-23 04:40:04.394652] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: nfs service is stopped
[2023-10-23 04:40:04.394972] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2023-10-23 04:40:04.395758] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: glustershd already stopped
[2023-10-23 04:40:04.395783] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: glustershd service is stopped
[2023-10-23 04:40:04.395806] I [MSGID: 106567] [glusterd-svc-mgmt.c:220:glusterd_svc_start] 0-management: Starting glustershd service
[2023-10-23 04:40:05.400050] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2023-10-23 04:40:05.400307] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: quotad already stopped
[2023-10-23 04:40:05.400340] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: quotad service is stopped
[2023-10-23 04:40:05.400376] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2023-10-23 04:40:05.400515] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: bitd already stopped
[2023-10-23 04:40:05.400542] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: bitd service is stopped
[2023-10-23 04:40:05.400571] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2023-10-23 04:40:05.400693] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: scrub already stopped
[2023-10-23 04:40:05.400711] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: scrub service is stopped
[2023-10-23 04:40:05.400754] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2023-10-23 04:40:05.400888] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-gfproxyd: setting frame-timeout to 600
[2023-10-23 04:40:05.401121] I [MSGID: 106492] [glusterd-handler.c:2796:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0c333528-b176-4229-8df5-230844b7ee6f
[2023-10-23 04:40:23.695495] I [MSGID: 106502] [glusterd-handler.c:2837:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2023-10-23 04:40:23.695661] I [MSGID: 106493] [glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0c333528-b176-4229-8df5-230844b7ee6f
[2023-10-23 04:40:23.714829] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: c03c5b65-fe44-4342-bacd-76947c016646, host: <>, port: 0
[2023-10-23 04:40:25.356022] I [MSGID: 106490] [glusterd-handler.c:2611:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0c333528-b176-4229-8df5-230844b7ee6f
[2023-10-23 04:40:25.356665] I [MSGID: 106009] [glusterd-utils.c:3466:glusterd_compare_friend_volume] 0-management: Version of volume ndp_vol differ. local version = 7370, remote version = 7382 on peer <>
[2023-10-23 04:40:25.357955] I [MSGID: 106493] [glusterd-handler.c:3883:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to <> (0), ret: 0, op_ret: 0
[2023-10-23 04:40:26.595064] W [glusterfsd.c:1570:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f1e9eb816db] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xfd) [0x55d33e735b4d] -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x55d33e735994] ) 0-: received signum (15), shutting down
[2023-10-23 04:40:30.011947] I [MSGID: 100030] [glusterfsd.c:2847:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 6.10 (args: /usr/sbin/glusterd -N -p /var/run/glusterd.pid)
[2023-10-23 04:40:30.012251] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 77
[2023-10-23 04:40:30.096321] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536
[2023-10-23 04:40:30.096375] I [MSGID: 106479] [glusterd.c:1478:init] 0-management: Using /var/lib/glusterd/ as working directory
[2023-10-23 04:40:30.096392] I [MSGID: 106479] [glusterd.c:1484:init] 0-management: Using /var/run/gluster as pid file working directory
[2023-10-23 04:40:30.099942] I [socket.c:1022:__socket_server_bind] 0-socket.management: process started listening on port (24007)
[2023-10-23 04:40:30.101375] I [socket.c:965:__socket_server_bind] 0-socket.management: closing (AF_UNIX) reuse check socket 11
[2023-10-23 04:40:30.101751] I [MSGID: 106059] [glusterd.c:1860:init] 0-management: base-port override: 49152
[2023-10-23 04:40:30.101766] I [MSGID: 106059] [glusterd.c:1865:init] 0-management: max-port override: 49152
[2023-10-23 04:40:33.389696] I [MSGID: 106513] [glusterd-store.c:2394:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 60000
[2023-10-23 04:40:33.390046] E [MSGID: 106206] [glusterd-store.c:3139:glusterd_store_update_volinfo] 0-management: Failed to get next store iter
[2023-10-23 04:40:33.390062] E [MSGID: 106207] [glusterd-store.c:3404:glusterd_store_retrieve_volume] 0-management: Failed to update volinfo for ndp_vol volume
[2023-10-23 04:40:33.390092] E [MSGID: 106201] [glusterd-store.c:3641:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: ndp_vol
[2023-10-23 04:40:33.390119] E [MSGID: 101019] [xlator.c:629:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2023-10-23 04:40:33.390128] E [MSGID: 101066] [graph.c:362:glusterfs_graph_init] 0-management: initializing translator failed
[2023-10-23 04:40:33.390134] E [MSGID: 101176] [graph.c:725:glusterfs_graph_activate] 0-graph: init failed
[2023-10-23 04:40:33.390230] W [glusterfsd.c:1570:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xf7) [0x561488e5cc27] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x22f) [0x561488e5cacf] -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x561488e58994] ) 0-: received signum (-1), shutting down

FYI:
We are running GlusterFS-6.10

Some more info:
When we look at the config directly of the GlusterFS, we saw that some files are missing.
/var/lib/glusterfs/vols/ndp_vol/ndp_vol.<<brick-hostnames*>>.mnt-bricks-ndp_brick.vol - This file went missing on the GlusterFS crashing node.

@git2212 - Just a suggestion, could you also check, if these config files are missing for you?

from glusterfs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.