We are seeing this issue in our case as well. We have one volume in a 3-brick, replica-3 setup across 3 nodes. On one node we restarted all gluster services, and after that glusterd was continuously crashing with: 0-management: Initialization of volume 'management' failed, review your volfile again.
Logs from GlusterD:
[2023-10-23 04:35:15.477485] W [glusterfsd.c:1570:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f50547c96db] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xfd) [0x55de29541b4d] -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x55de29541994] ) 0-: received signum (15), shutting down
[2023-10-23 04:38:56.022520] I [MSGID: 100030] [glusterfsd.c:2847:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 6.10 (args: /usr/sbin/glusterd -N -p /var/run/glusterd.pid)
[2023-10-23 04:38:56.022658] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 88
[2023-10-23 04:38:56.071590] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536
[2023-10-23 04:38:56.071663] I [MSGID: 106479] [glusterd.c:1478:init] 0-management: Using /var/lib/glusterd/ as working directory
[2023-10-23 04:38:56.071704] I [MSGID: 106479] [glusterd.c:1484:init] 0-management: Using /var/run/gluster as pid file working directory
[2023-10-23 04:38:56.073867] I [socket.c:1022:__socket_server_bind] 0-socket.management: process started listening on port (24007)
[2023-10-23 04:38:56.097616] I [socket.c:965:__socket_server_bind] 0-socket.management: closing (AF_UNIX) reuse check socket 11
[2023-10-23 04:38:56.097971] I [MSGID: 106059] [glusterd.c:1860:init] 0-management: base-port override: 49152
[2023-10-23 04:38:56.097982] I [MSGID: 106059] [glusterd.c:1865:init] 0-management: max-port override: 49152
[2023-10-23 04:39:04.956934] I [MSGID: 106513] [glusterd-store.c:2394:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 60000
[2023-10-23 04:39:34.308779] I [MSGID: 106544] [glusterd.c:152:glusterd_uuid_init] 0-management: retrieved UUID: ddc1d9f0-a6e5-4751-8e35-0f91d326fa41
[2023-10-23 04:39:34.912425] I [MSGID: 106498] [glusterd-handler.c:3687:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2023-10-23 04:39:34.912618] I [MSGID: 106498] [glusterd-handler.c:3687:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2023-10-23 04:39:34.912669] W [MSGID: 106061] [glusterd-handler.c:3490:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2023-10-23 04:39:34.912720] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2023-10-23 04:39:34.916260] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
Final graph:
+------------------------------------------------------------------------------+
1: volume management
2: type mgmt/glusterd
3: option rpc-auth.auth-glusterfs on
4: option rpc-auth.auth-unix on
5: option rpc-auth.auth-null on
6: option rpc-auth-allow-insecure on
7: option transport.listen-backlog 1024
8: option max-port 49152
9: option base-port 49152
10: option transport.address-family inet
11: option transport.socket.listen-port 24007
12: option event-threads 1
13: option ping-timeout 0
14: option transport.socket.read-fail-log off
15: option transport.socket.keepalive-interval 2
16: option transport.socket.keepalive-time 10
17: option glusterd-sockfile /etc/glusterd_socket/gluster.sock
18: option transport-type socket
19: option working-directory /var/lib/glusterd/
20: end-volume
21:
+------------------------------------------------------------------------------+
[2023-10-23 04:39:34.916252] W [MSGID: 106061] [glusterd-handler.c:3490:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2023-10-23 04:39:35.015736] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0
[2023-10-23 04:39:35.017093] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2023-10-23 04:39:35.017342] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2023-10-23 04:39:35.021484] I [MSGID: 106163] [glusterd-handshake.c:1389:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 60000
[2023-10-23 04:39:35.022776] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 0c333528-b176-4229-8df5-230844b7ee6f, host: <>, port: 0
[2023-10-23 04:39:35.562534] I [glusterd-utils.c:6314:glusterd_brick_start] 0-management: starting a fresh brick process for brick /mnt/bricks/ndp_brick
[2023-10-23 04:39:36.599410] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2023-10-23 04:40:04.394492] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2023-10-23 04:40:04.394624] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: nfs already stopped
[2023-10-23 04:40:04.394652] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: nfs service is stopped
[2023-10-23 04:40:04.394972] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2023-10-23 04:40:04.395758] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: glustershd already stopped
[2023-10-23 04:40:04.395783] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: glustershd service is stopped
[2023-10-23 04:40:04.395806] I [MSGID: 106567] [glusterd-svc-mgmt.c:220:glusterd_svc_start] 0-management: Starting glustershd service
[2023-10-23 04:40:05.400050] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2023-10-23 04:40:05.400307] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: quotad already stopped
[2023-10-23 04:40:05.400340] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: quotad service is stopped
[2023-10-23 04:40:05.400376] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2023-10-23 04:40:05.400515] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: bitd already stopped
[2023-10-23 04:40:05.400542] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: bitd service is stopped
[2023-10-23 04:40:05.400571] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2023-10-23 04:40:05.400693] I [MSGID: 106131] [glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: scrub already stopped
[2023-10-23 04:40:05.400711] I [MSGID: 106568] [glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: scrub service is stopped
[2023-10-23 04:40:05.400754] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2023-10-23 04:40:05.400888] I [rpc-clnt.c:1005:rpc_clnt_connection_init] 0-gfproxyd: setting frame-timeout to 600
[2023-10-23 04:40:05.401121] I [MSGID: 106492] [glusterd-handler.c:2796:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 0c333528-b176-4229-8df5-230844b7ee6f
[2023-10-23 04:40:23.695495] I [MSGID: 106502] [glusterd-handler.c:2837:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2023-10-23 04:40:23.695661] I [MSGID: 106493] [glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 0c333528-b176-4229-8df5-230844b7ee6f
[2023-10-23 04:40:23.714829] I [MSGID: 106493] [glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: c03c5b65-fe44-4342-bacd-76947c016646, host: <>, port: 0
[2023-10-23 04:40:25.356022] I [MSGID: 106490] [glusterd-handler.c:2611:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 0c333528-b176-4229-8df5-230844b7ee6f
[2023-10-23 04:40:25.356665] I [MSGID: 106009] [glusterd-utils.c:3466:glusterd_compare_friend_volume] 0-management: Version of volume ndp_vol differ. local version = 7370, remote version = 7382 on peer <>
[2023-10-23 04:40:25.357955] I [MSGID: 106493] [glusterd-handler.c:3883:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to <> (0), ret: 0, op_ret: 0
[2023-10-23 04:40:26.595064] W [glusterfsd.c:1570:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f1e9eb816db] -->/usr/sbin/glusterd(glusterfs_sigwaiter+0xfd) [0x55d33e735b4d] -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x55d33e735994] ) 0-: received signum (15), shutting down
[2023-10-23 04:40:30.011947] I [MSGID: 100030] [glusterfsd.c:2847:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 6.10 (args: /usr/sbin/glusterd -N -p /var/run/glusterd.pid)
[2023-10-23 04:40:30.012251] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid of current running process is 77
[2023-10-23 04:40:30.096321] I [MSGID: 106478] [glusterd.c:1422:init] 0-management: Maximum allowed open file descriptors set to 65536
[2023-10-23 04:40:30.096375] I [MSGID: 106479] [glusterd.c:1478:init] 0-management: Using /var/lib/glusterd/ as working directory
[2023-10-23 04:40:30.096392] I [MSGID: 106479] [glusterd.c:1484:init] 0-management: Using /var/run/gluster as pid file working directory
[2023-10-23 04:40:30.099942] I [socket.c:1022:__socket_server_bind] 0-socket.management: process started listening on port (24007)
[2023-10-23 04:40:30.101375] I [socket.c:965:__socket_server_bind] 0-socket.management: closing (AF_UNIX) reuse check socket 11
[2023-10-23 04:40:30.101751] I [MSGID: 106059] [glusterd.c:1860:init] 0-management: base-port override: 49152
[2023-10-23 04:40:30.101766] I [MSGID: 106059] [glusterd.c:1865:init] 0-management: max-port override: 49152
[2023-10-23 04:40:33.389696] I [MSGID: 106513] [glusterd-store.c:2394:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 60000
[2023-10-23 04:40:33.390046] E [MSGID: 106206] [glusterd-store.c:3139:glusterd_store_update_volinfo] 0-management: Failed to get next store iter
[2023-10-23 04:40:33.390062] E [MSGID: 106207] [glusterd-store.c:3404:glusterd_store_retrieve_volume] 0-management: Failed to update volinfo for ndp_vol volume
[2023-10-23 04:40:33.390092] E [MSGID: 106201] [glusterd-store.c:3641:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: ndp_vol
[2023-10-23 04:40:33.390119] E [MSGID: 101019] [xlator.c:629:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2023-10-23 04:40:33.390128] E [MSGID: 101066] [graph.c:362:glusterfs_graph_init] 0-management: initializing translator failed
[2023-10-23 04:40:33.390134] E [MSGID: 101176] [graph.c:725:glusterfs_graph_activate] 0-graph: init failed
[2023-10-23 04:40:33.390230] W [glusterfsd.c:1570:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xf7) [0x561488e5cc27] -->/usr/sbin/glusterd(glusterfs_process_volfp+0x22f) [0x561488e5cacf] -->/usr/sbin/glusterd(cleanup_and_exit+0x54) [0x561488e58994] ) 0-: received signum (-1), shutting down
FYI:
We are running GlusterFS 6.10.
Some more info:
When we looked directly at the GlusterFS configuration on disk, we found that some files were missing.
/var/lib/glusterd/vols/ndp_vol/ndp_vol.<<brick-hostnames*>>.mnt-bricks-ndp_brick.vol - this file went missing on the node where glusterd was crashing.
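A quick way to spot this kind of corruption is to scan the glusterd working directory for volume directories that contain no generated volfile at all. The helper below is our own minimal sketch, not a GlusterFS tool; the default path is taken from the working-directory line in the logs above and may need adjusting:

```shell
#!/bin/sh
# Sketch: list volume directories under a glusterd working directory
# that contain no *.vol files (the symptom we observed above).
check_volfiles() {
    vols_dir="$1"
    for vol in "$vols_dir"/*/; do
        [ -d "$vol" ] || continue
        # Flag the volume if its directory holds no generated volfile.
        if ! ls "$vol"*.vol >/dev/null 2>&1; then
            echo "missing volfiles: $(basename "$vol")"
        fi
    done
}

# Default working directory seen in the logs; override via VOLS_DIR.
check_volfiles "${VOLS_DIR:-/var/lib/glusterd/vols}"
```

Any volume it reports will fail to restore at glusterd startup, matching the glusterd_store_retrieve_volume errors in the log above.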
@git2212 - Just a suggestion: could you also check whether these config files are missing in your setup?
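For anyone hitting the same state: the usual recovery is to copy the volume's configuration back from a healthy peer and restart glusterd. The sketch below is hypothetical and only PRINTS the commands so they can be reviewed before running; "healthy-node" and the volume name are placeholders, and the paths assume the default working directory seen in the logs:

```shell
#!/bin/sh
# Hypothetical recovery helper: print (not execute) the commands to
# restore a volume's config directory from a healthy peer.
print_restore_cmds() {
    peer="$1"; vol="$2"
    echo "systemctl stop glusterd"
    echo "rsync -av ${peer}:/var/lib/glusterd/vols/${vol}/ /var/lib/glusterd/vols/${vol}/"
    echo "systemctl start glusterd"
}

print_restore_cmds healthy-node ndp_vol
```

Review the printed commands against your own node names and working directory before running anything on a production cluster.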