
Comments (11)

rahulKrishnaM commented on July 26, 2024

Hi @bjosv, I have some updates from the core analysis; posting them here. (All data is picked from either frame 0 or 1.)

It looks like both the table and slots pointers are holding corrupted data at the time of the crash. From the hiredis-cluster code, the node calculation is done as table[slot_number]; the slot_number seen in the core file is 8535.

So,

(gdb) p acc.cc.table[8535]
$223 = (cluster_node *) 0x1c00550

which is the same as the corrupted cluster_node pointer seen at the time of the crash. So, the corrupted node is pulled from the corrupted table data.
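
For reference, here is a minimal compilable sketch of that lookup pattern; the types and names are simplified stand-ins, not the actual hircluster.c code:

#include <stdio.h>

#define REDIS_CLUSTER_SLOTS 16384

/* Simplified stand-in for hiredis-cluster's cluster_node. */
typedef struct cluster_node { const char *addr; } cluster_node;

/* The crash path boils down to indexing the slot-to-node table:
 * if table[] holds stale/corrupted pointers, the returned node is garbage. */
static cluster_node *node_for_slot(cluster_node **table, unsigned int slot) {
    if (table == NULL || slot >= REDIS_CLUSTER_SLOTS)
        return NULL;
    return table[slot]; /* e.g. table[8535] in this core */
}

int main(void) {
    static cluster_node *table[REDIS_CLUSTER_SLOTS]; /* all NULL in this toy example */
    printf("node for slot 8535: %p\n", (void *)node_for_slot(table, 8535));
    return 0;
}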

However, the nodes dictionary (whose table entries are of type struct dictEntry) seems to be holding valid data.

(gdb) p *acc.cc.nodes
$54 = {table = 0x1c53ed0, type = 0x7fe3b53c53a0 , size = 8, sizemask = 7, used = 7, privdata = 0x0}

(gdb) ptype acc->cc->nodes->table
type = struct dictEntry {
void *key;
void *val;
struct dictEntry *next;
} **

(gdb) p (char *)cc.nodes.table[0].key
$231 = 0x1bce9b1 "185.185.185.185:1111@16379"

(gdb) p *acc.cc.nodes.table[0].next
$43 = {key = 0x1c16551, val = 0x1fd79b0, next = 0x0}
(gdb) p (char *)cc.nodes.table[0].next.key
$232 = 0x1c16551 "240.240.240.240:1111@16379"

(gdb) p (char *)cc.nodes.table[1].key
$225 = 0x1cecee1 "213.213.213.213:1111@16379"

(gdb) p cc.nodes.table[2]
$12 = (dictEntry *) 0x0

(gdb) p (char *)cc.nodes.table[3].key
$226 = 0x1d183b1 "17.17.17.17:1111@16379"

(gdb) p cc.nodes.table[4]
$13 = (dictEntry *) 0x0

(gdb) p cc.nodes.table[5]
$14 = (dictEntry *) 0x0

(gdb) p cc.nodes.table[6]
$15 = (dictEntry *) 0x0

(gdb) p (char *)cc.nodes.table[7].key
$227 = 0x1c62ff1 "123.123.123.123:1111@16379"

(gdb) p (char *)cc.nodes.table[7].next.key
$228 = 0x1f413c1 "247.247.247.247:1111@16379"

(gdb) p acc.cc.nodes.table[7].next.next
$50 = (struct dictEntry *) 0x20da370
(gdb) p *acc.cc.nodes.table[7].next.next
$51 = {key = 0x1d06911, val = 0x1c552b0, next = 0x0}
(gdb) p (char *)cc.nodes.table[7].next.next.key
$229 = 0x1d06911 "26.26.26.26:1111@16379"

So, there are a total of 7 nodes available at the time of the crash, as indicated by the nodes dictionary.
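
For clarity, the count of 7 is just a walk over the 8 buckets of the dict, following each ->next chain. A compilable sketch of that walk (the struct layout mirrors the ptype output above; the data in main() is synthetic and only reproduces the bucket pattern seen in the core):

#include <stdio.h>
#include <stddef.h>

/* Mirrors the ptype output above. */
typedef struct dictEntry {
    void *key;
    void *val;
    struct dictEntry *next;
} dictEntry;

/* Count every entry reachable from the bucket array (chained hashing). */
static size_t count_entries(dictEntry **table, size_t size) {
    size_t used = 0;
    for (size_t i = 0; i < size; i++)
        for (dictEntry *e = table[i]; e != NULL; e = e->next)
            used++;
    return used;
}

int main(void) {
    /* Buckets 0, 1, 3, 7 populated; bucket 0 chains 2 entries and
     * bucket 7 chains 3 entries => 7 entries in total. */
    dictEntry e0b = {0}, e0a = { .next = &e0b };
    dictEntry e1 = {0}, e3 = {0};
    dictEntry e7c = {0}, e7b = { .next = &e7c }, e7a = { .next = &e7b };
    dictEntry *table[8] = { &e0a, &e1, NULL, &e3, NULL, NULL, NULL, &e7a };

    printf("used = %zu\n", count_entries(table, 8)); /* prints 7 */
    return 0;
}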

All 7 (valid) cluster_node pointers are decoded below.

(gdb) p (cluster_node *)acc.cc.nodes.table[0].val
$122 = (cluster_node *) 0x1f9d540
(gdb) p (cluster_node *)acc.cc.nodes.table[0].next.val
$123 = (cluster_node *) 0x1fd79b0
(gdb) p (cluster_node *)acc.cc.nodes.table[1].val
$124 = (cluster_node *) 0x1c55740
(gdb) p (cluster_node *)acc.cc.nodes.table[3].val
$125 = (cluster_node *) 0x1f73cc0
(gdb) p (cluster_node *)acc.cc.nodes.table[7].val
$126 = (cluster_node *) 0x207a1d0
(gdb) p (cluster_node *)acc.cc.nodes.table[7].next.val
$127 = (cluster_node *) 0x1c162b0
(gdb) p (cluster_node *)acc.cc.nodes.table[7].next.next.val
$128 = (cluster_node *) 0x1c552b0
(gdb) p *$122
$129 = {name = 0x1f37a03 "5d51fc9a44b9c4dc166aa607aa3ab96a79c88c57", addr = 0x1bb9b81 "185.185.185.185:1111@16379", host = 0x1beacb1 "185.185.185.185",
port = 1111, role = 1 '\001', pad = 0 '\000', failure_count = 0, con = 0x0, acon = 0x0, slots = 0x1c41ba0, slaves = 0x0, migrating = 0x0,
importing = 0x0}
(gdb) p *$123
$130 = {name = 0x1c82833 "b4906a5eb2a9c8e8398c5b87202d2d8b360ec1f5", addr = 0x1f233d1 "240.240.240.240:1111@16379", host = 0x1dcdfb1 "240.240.240.240",
port = 1111, role = 1 '\001', pad = 0 '\000', failure_count = 0, con = 0x0, acon = 0x0, slots = 0x1c55e10, slaves = 0x0, migrating = 0x0,
importing = 0x0}
(gdb) p *$124
$131 = {name = 0x1bb8253 "dc74c7bab5209309369cbd2109bba446863d9231", addr = 0x1bbc1d1 "213.213.213.213:1111@16379", host = 0x1bbc071 "213.213.213.213",
port = 1111, role = 1 '\001', pad = 0 '\000', failure_count = 0, con = 0x0, acon = 0x0, slots = 0x22089e0, slaves = 0x0, migrating = 0x0,
importing = 0x0}
(gdb) p *$125
$132 = {name = 0x2510b13 "736b141a059a95839f02fa75af6449faf3f9e295", addr = 0x1d369c1 "17.17.17.17:1111@16379", host = 0x1c4edb1 "17.17.17.17",
port = 1111, role = 1 '\001', pad = 0 '\000', failure_count = 0, con = 0x0, acon = 0x0, slots = 0x1e98030, slaves = 0x0, migrating = 0x0,
importing = 0x0}
(gdb) p *$126
$133 = {name = 0x2af1f03 "21b0aaa3abe526da3adf201c9138a3cef1bdd824", addr = 0x1c01151 "123.123.123.123:1111@16379", host = 0x21892e1 "123.123.123.123",
port = 1111, role = 1 '\001', pad = 0 '\000', failure_count = 0, con = 0x0, acon = 0x0, slots = 0x1e181d0, slaves = 0x0, migrating = 0x0,
importing = 0x0}
(gdb) p *$127
$134 = {name = 0x1de2443 "19145935ddaa3e4a9d1f984876ac114634fb0622", addr = 0x2047f51 "247.247.247.247:1111@16379", host = 0x1c15311 "247.247.247.247",
port = 1111, role = 1 '\001', pad = 0 '\000', failure_count = 0, con = 0x0, acon = 0x7fe3340164a0, slots = 0x21ab1f0, slaves = 0x0, migrating = 0x0,
importing = 0x0}
(gdb) p *$128
$135 = {name = 0x1c46013 "64c9ac4a8aa04e3033f4abba019f3ffe3964cae2", addr = 0x1c1b6e1 "26.26.26.26:1111@16379", host = 0x1bb83b1 "26.26.26.26",
port = 1111, role = 1 '\001', pad = 0 '\000', failure_count = 0, con = 0x0, acon = 0x7fe3900045b0, slots = 0x260a260, slaves = 0x0, migrating = 0x0,
importing = 0x0}
(gdb)

Each cluster_node holds information about the slots it serves. The output below captures the slot information for each of the above 7 nodes.

(gdb) p $122.slots
$211 = (struct hilist *) 0x1c41ba0

(gdb) ptype $122.slots
type = struct hilist {
listNode *head;
listNode *tail;
void *(*dup)(void *);
void (*free)(void *);
int (*match)(void *, void *);
unsigned long len;
} *
(gdb)

(gdb) p *$122.slots
$136 = {head = 0x1dee720, tail = 0x1dee720, dup = 0x0, free = 0x7fe3b53b75e0 , match = 0x0, len = 1}
(gdb) p *(cluster_slot *)$122.slots.head.value
$140 = {start = 5462, end = 8192, node = 0x1f9d540}
(gdb) p *$123.slots
$142 = {head = 0x1c47810, tail = 0x1c47810, dup = 0x0, free = 0x7fe3b53b75e0 , match = 0x0, len = 1}
(gdb) p *(cluster_slot *)$123.slots.head.value
$143 = {start = 13655, end = 16383, node = 0x1fd79b0}
(gdb) p *$124.slots
$144 = {head = 0x1d77af0, tail = 0x1d77af0, dup = 0x0, free = 0x7fe3b53b75e0 , match = 0x0, len = 1}
(gdb) p *(cluster_slot *)$124.slots.head.value
$145 = {start = 2731, end = 5461, node = 0x1c55740}
(gdb) p *$125.slots ======> all empty, so this slots object is not holding any slot ranges
$146 = {head = 0x0, tail = 0x0, dup = 0x0, free = 0x7fe3b53b75e0 , match = 0x0, len = 0}
(gdb) p *$126.slots
$147 = {head = 0x1d35290, tail = 0x1d35290, dup = 0x0, free = 0x7fe3b53b75e0 , match = 0x0, len = 1}
(gdb) p *(cluster_slot *)$126.slots.head.value
$148 = {start = 10924, end = 13654, node = 0x207a1d0}
(gdb) p *$127.slots
$149 = {head = 0x2149970, tail = 0x2149970, dup = 0x0, free = 0x7fe3b53b75e0 , match = 0x0, len = 1}
(gdb) p *(cluster_slot *)$127.slots.head.value
$150 = {start = 8193, end = 10923, node = 0x1c162b0}
(gdb) p *$128.slots
$151 = {head = 0x2c6b6c0, tail = 0x2c6b6c0, dup = 0x0, free = 0x7fe3b53b75e0 , match = 0x0, len = 1}
(gdb) p *(cluster_slot *)$128.slots.head.value
$152 = {start = 0, end = 2730, node = 0x1c552b0}
(gdb)
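
The walk above boils down to following each node's slots hilist and reading every cluster_slot's {start, end} range. A simplified, compilable sketch of that traversal (the listNode layout is assumed and the hilist function pointers are omitted, so this is not the library's exact code):

#include <stdio.h>

typedef struct cluster_node cluster_node;

typedef struct cluster_slot {
    unsigned int start;
    unsigned int end;
    cluster_node *node;
} cluster_slot;

typedef struct listNode {
    struct listNode *prev;   /* assumed; not shown in the ptype output */
    struct listNode *next;
    void *value;             /* points at a cluster_slot */
} listNode;

typedef struct hilist {      /* dup/free/match function pointers omitted */
    listNode *head;
    listNode *tail;
    unsigned long len;
} hilist;

/* Print every slot range owned by one node's slots list. */
static void print_slot_ranges(const hilist *slots) {
    for (const listNode *n = slots->head; n != NULL; n = n->next) {
        const cluster_slot *s = (const cluster_slot *)n->value;
        printf("%u-%u -> node %p\n", s->start, s->end, (void *)s->node);
    }
}

int main(void) {
    cluster_slot s = { .start = 5462, .end = 8192, .node = NULL };
    listNode ln = { .prev = NULL, .next = NULL, .value = &s };
    hilist l = { .head = &ln, .tail = &ln, .len = 1 };
    print_slot_ranges(&l); /* prints the 5462-8192 range, as for $122 above */
    return 0;
}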

slots to cluster_node mapping from above data ==>

0-2730 node = 0x1c552b0
2731-5461 node = 0x1c55740
5462-8192 node = 0x1f9d540
8193-10923 node = 0x1c162b0
10924-13654 node = 0x207a1d0
13655-16383 node = 0x1fd79b0

So, the data indicates 6 valid slot ranges (covering all 16384 slots between them), and there is 1 extra node ($125 in gdb) which has no slots assigned to it.
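
If the slot-to-node table is (re)built by writing each range's owning node pointer into every slot index of that range, then long runs of identical pointers are exactly what a healthy table should look like. A minimal sketch of that filling step, under that assumption (simplified types, not the actual library code):

#include <stddef.h>

#define REDIS_CLUSTER_SLOTS 16384

typedef struct cluster_node { const char *addr; } cluster_node;

typedef struct cluster_slot {
    unsigned int start;
    unsigned int end;
    cluster_node *node;
} cluster_slot;

/* Fill table[start..end] with the owning node for each known range. */
static int fill_table(cluster_node **table, const cluster_slot *ranges, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (ranges[i].end >= REDIS_CLUSTER_SLOTS || ranges[i].start > ranges[i].end)
            return -1;
        for (unsigned int s = ranges[i].start; s <= ranges[i].end; s++)
            table[s] = ranges[i].node;
    }
    return 0;
}

int main(void) {
    static cluster_node *table[REDIS_CLUSTER_SLOTS];
    cluster_node a = { "26.26.26.26:1111" }, b = { "213.213.213.213:1111" };
    const cluster_slot ranges[] = { { 0, 2730, &a }, { 2731, 5461, &b } };
    return fill_table(table, ranges, 2); /* table[0..5461] now holds &a and &b in runs */
}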

One more interesting observation: contiguous slots in the corrupted table hold the same cluster_node address (even though that address is invalid), which is exactly how the table should look when ranges of slots are assigned to the cluster_nodes.

(gdb) x/16384gx cc.table
0x2dd68d0: 0x0000000001bb80f0 0x0000000001bb80f0
0x2dd68e0: 0x0000000001bb80f0 0x0000000001bb80f0
0x2dd68f0: 0x0000000001bb80f0 0x0000000001bb80f0
0x2dd6900: 0x0000000001bb80f0 0x0000000001bb80f0
.
.
0x2df6890: 0x0000000001bcd120 0x0000000001bcd120
0x2df68a0: 0x0000000001bcd120 0x0000000001bcd120
0x2df68b0: 0x0000000001bcd120 0x0000000001bcd120
0x2df68c0: 0x0000000001bcd120 0x0000000001bcd120
(gdb)

6 unique cluster_node addresses show up (listed below) across the entire 16384 slots, indicating there were 6 slot-owning nodes at some point in time (whereas the nodes dictionary holds 7 at the time of the crash).

0x0000000001bb80f0
0x0000000001e82c60
0x0000000001c49130
0x0000000001c00550 --> cluster_node pointer which crashed.
0x0000000001beb2b0
0x0000000001bcd120

I'm not really sure what path was taken for the corruption to happen. Please share any thoughts you have on it.


rahulKrishnaM commented on July 26, 2024

And also, what called redisClusterAsyncCommand (i.e. frame 4 and up):

Thread 13 (crashed)
0 libhiredis_cluster.so.0.8!actx_get_by_node [hircluster.c : 3630 + 0x0]
1 libhiredis_cluster.so.0.8!redisClusterAsyncFormattedCommand [hircluster.c : 4117 + 0x8]
2 libhiredis_cluster.so.0.8!redisClustervAsyncCommand [hircluster.c : 4244 + 0x16]
3 libhiredis_cluster.so.0.8!redisClusterAsyncCommand [hircluster.c : 4258 + 0x5]

Was it from within a response callback, and if so, was it a NULL response?

No, this was called directly from the application, which was trying to send out a command to the Redis server.


bjosv commented on July 26, 2024

You state that you run multi-threaded; the question is, in what way?

When a response is sent from a Redis node, the event system in use notices that data has been received on the socket,
which triggers a call to hiredis' redisProcessCallbacks(); this parses the reply and calls hiredis-cluster's callbacks.
In a case where slots have moved, the MOVED response triggers a fetch of the new topology, and the nodes dict and the slot-to-node table are then updated from a hiredis-cluster callback.

If the event system is able to process replies in multiple threads at the same time (likely replies from different nodes), there is a chance that those replies run the callback that updates the dict and table concurrently.
This might result in an inconsistency between the dict and the table.
The dict itself would probably be OK, since updating it is just a change of pointer, but the table is built from a dict that can suddenly change underneath it.
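
To make the shape of that hazard concrete, here is a hedged, compilable illustration (not the actual hircluster.c code) of the two-step update and the window where a second thread could interleave:

#include <stdlib.h>

/* The topology refresh is effectively two steps: swap the nodes dict,
 * then rebuild the slot-to-node table from it. Without external locking,
 * two reply callbacks on different threads can interleave between the
 * steps and leave the dict and the table out of sync. */
typedef struct {
    void *nodes_dict;  /* stand-in for the dict of cluster_node objects   */
    void *table;       /* stand-in for the 16384-entry slot-to-node table */
} cluster_ctx;

static void *fetch_nodes_from_cluster(void) { return malloc(1); }          /* stub */
static void *build_table_from_dict(void *d) { (void)d; return malloc(1); } /* stub */

static void update_topology(cluster_ctx *cc) {
    free(cc->nodes_dict);
    cc->nodes_dict = fetch_nodes_from_cluster();        /* step 1: dict pointer swap */
    /* <-- race window: another thread running this same function here can
     *     replace (and free) cc->nodes_dict before step 2 below runs */
    free(cc->table);
    cc->table = build_table_from_dict(cc->nodes_dict);  /* step 2: table rebuild */
}

int main(void) {
    cluster_ctx cc = {0};
    update_topology(&cc);  /* a single-threaded call sequence is fine ...         */
    update_topology(&cc);  /* ... the hazard is two of these running concurrently */
    free(cc.nodes_dict);
    free(cc.table);
    return 0;
}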


rahulKrishnaM commented on July 26, 2024

Hi @bjosv, all access into hiredis is restricted using 1 common mutex. So, any thread trying to call into hiredis first acquires the lock and then makes the hiredis command call. Once the command execution is completed, the lock is released, and another thread can acquire it and invoke its command call.

The event system only processes 1 reply at a time (it always runs in a single-threaded context). It takes the lock before calling into hiredis for any read/write event, which then executes redisProcessCallbacks().
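
In other words, the application-side serialization looks roughly like the sketch below (send_command_locked() and dummy_command() are hypothetical wrappers for illustration, not library APIs):

#include <pthread.h>

/* One global mutex guards every call into hiredis/hiredis-cluster, both
 * from worker threads issuing commands and from the single-threaded
 * event loop dispatching read/write events. */
static pthread_mutex_t hiredis_lock = PTHREAD_MUTEX_INITIALIZER;

static void send_command_locked(void (*issue_command)(void *), void *arg) {
    pthread_mutex_lock(&hiredis_lock);
    issue_command(arg);  /* e.g. a redisClusterAsyncCommand() call, or the event
                          * handler that ends up in redisProcessCallbacks() */
    pthread_mutex_unlock(&hiredis_lock);
}

static void dummy_command(void *arg) { (void)arg; /* would call into hiredis here */ }

int main(void) {
    send_command_locked(dummy_command, NULL);
    return 0;
}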


bjosv commented on July 26, 2024

I believe your scenario is a possibility and is likely fixed by #125.
Please re-open an issue if not.


bjosv commented on July 26, 2024

Interesting, do you have any idea what events led up to this?
Any timeouts, topology changes, or other logs that could give a lead?


SS-TruMinds commented on July 26, 2024

Unfortunately no.
We have been having issues with adapter timeouts on the client side and also server failures, but the exact trigger for this crash is not known.


bjosv commented on July 26, 2024

Just as a reference, what cluster_slots do you have in cc->slots?
Do they match the ones found in $140 to $152?


bjosv commented on July 26, 2024

And also, what called redisClusterAsyncCommand (i.e. frame 4 and up):

Thread 13 (crashed)
0 libhiredis_cluster.so.0.8!actx_get_by_node [hircluster.c : 3630 + 0x0]
1 libhiredis_cluster.so.0.8!redisClusterAsyncFormattedCommand [hircluster.c : 4117 + 0x8]
2 libhiredis_cluster.so.0.8!redisClustervAsyncCommand [hircluster.c : 4244 + 0x16]
3 libhiredis_cluster.so.0.8!redisClusterAsyncCommand [hircluster.c : 4258 + 0x5]

Was it from within a response callback, and if so, was it a NULL response?


rahulKrishnaM commented on July 26, 2024

@bjosv No, that data looks all corrupted.

(gdb) p *cc
$208 = {err = 0, errstr = '\000' <repeats 127 times>, flags = 0, connect_timeout = 0x1bce800, command_timeout = 0x1bfe520, max_retry_count = 5,
username = 0x0, password = 0x0, nodes = 0x1c49a50, slots = 0x1cc0ca0, route_version = 3, table = 0x2dd68d0, requests = 0x0, retry_count = 0,
need_update_route = 0, update_route_time = 0, ssl = 0x0, ssl_init_fn = 0x0}
(gdb) p cc->slots
$209 = (struct hiarray *) 0x1cc0ca0
(gdb) p *cc.slots
$210 = {nelem = 6, elem = 0x213e400, size = 8, nalloc = 6}

(gdb) p (cluster_slot*)(cc.slots->elem)
$212 = (cluster_slot*) 0x213e400
(gdb) p *$212
$213 = {start = 32733440, end = 0, node = 0x1dfc5b0}
(gdb) p (cluster_slot*)(cc.slots->elem + 8)
$214 = (cluster_slot*) 0x213e408
(gdb) p *(cluster_slot*)(cc.slots->elem + 8)
$215 = {start = 31442352, end = 0, node = 0x1bb83d0}
(gdb) p *(cluster_slot*)(cc.slots->elem + 8*2)
$216 = {start = 29066192, end = 0, node = 0x1bef460}
(gdb) p *(cluster_slot*)(cc.slots->elem + 8*3)
$217 = {start = 29291616, end = 0, node = 0x1d57580}
(gdb) p *(cluster_slot*)(cc.slots->elem + 8*4)
$218 = {start = 30766464, end = 0, node = 0x1bead20}
(gdb) p *(cluster_slot*)(cc.slots->elem + 8*5)
$219 = {start = 29273376, end = 0, node = 0x40}


rahulKrishnaM commented on July 26, 2024

We were thinking about how the above stack trace could be related to the one in ticket 124.

One theory we came up with in our internal discussion is the following (a small code sketch follows the list below).
Imagine a case where:

  • we start with "v1" nodes, slots and table
  • something triggers us to get into cluster_update_by_addr, which gets as far as updating nodes to "v2" before invoking callbacks - slots and table are still "v1"
  • we go through a path similar to that in the stack in ticket 124, calling back into the application library, which tries to send another command
  • we're lucky, and the node that we end up targeting has not yet been freed (or at least not yet corrupted), so we don't crash with the stack as seen in 124
  • however, instead, we end up deciding that we need to do cluster_update_by_addr again - this goes through and updates nodes to "v3", invokes any callbacks on "v2" nodes (of which there are unlikely to be any, because they've only just been created)
  • the inner cluster_update_by_addr completes and sets slots and table to "v3" - now, all of nodes, slots and table are "v3"
  • we unwind the stack, and end up back in the first cluster_update_by_addr, which finishes its processing by setting slots and table to "v2" - now, nodes is "v3" and slots and table are "v2"
  • now we wait until the next event, when we hit the crash in this ticket, because table (which is where we do the lookup) is "v2" but nodes is "v3", so all of the table's pointers are out of date.
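
Here is a deliberately over-simplified, compilable sketch of that sequence (not the actual cluster_update_by_addr(); the "versions" are modeled as plain integers just to make the final inconsistent state visible):

#include <stdio.h>

typedef struct {
    int nodes_version;
    int slots_version;
    int table_version;
    int depth;  /* only used to bound the illustrative re-entrancy */
} route_state;

static void cluster_update(route_state *st, int new_version) {
    st->nodes_version = new_version;          /* nodes are updated first          */

    /* Callbacks run here and can re-enter via the application, triggering
     * another update before slots/table below are written. */
    if (st->depth++ == 0)
        cluster_update(st, new_version + 1);  /* inner update installs "v3"       */

    st->slots_version = new_version;          /* outer update now rolls slots ... */
    st->table_version = new_version;          /* ... and table back to "v2"       */
}

int main(void) {
    route_state st = { .nodes_version = 1, .slots_version = 1,
                       .table_version = 1, .depth = 0 };
    cluster_update(&st, 2);
    printf("nodes=v%d slots=v%d table=v%d\n",
           st.nodes_version, st.slots_version, st.table_version);
    /* prints: nodes=v3 slots=v2 table=v2 */
    return 0;
}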

If the above scenario is a possibility, do you agree that we could end up in such a state, and that the two issues are linked?

