Comments (11)
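The line numbers don't seem to match up. Is it hard to tell what the error is from the call stack alone?
On Tuesday, May 17, 2016, 张云乾 [email protected] wrote:
I'm using redis-port to sync data from redis to codis. The data set is 7 GB with more than 4 million keys, and it crashes during the sync rdb phase. It looks like it hangs while reading the reply on the connection to the target? I tried several times and it stopped on a different key each time, so it is probably unrelated to the source data? The crash log is below. Thanks.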
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x41caa9]

goroutine 7 [running]:
main.newRDBLoader.func1(0xc820075ec0, 0xc82001e8a0, 0xc820079e00)
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/utils.go:331 +0x639
created by main.newRDBLoader
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/utils.go:344 +0x68

goroutine 1 [select]:
main.(*cmdSync).SyncRDBFile(0xc820079e00, 0xc82001e8a0, 0x7fffd855a4e3, 0x26, 0x0, 0x0, 0x3b8a4c5f)
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/sync.go:227 +0xb2b
main.(*cmdSync).Main(0xc820079e00)
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/sync.go:91 +0x863
main.main()
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/main.go:377 +0x24c3

goroutine 17 [syscall, 6 minutes, locked to thread]:
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1721 +0x1

goroutine 8 [chan receive, 6 minutes]:
main.(*cmdSync).SyncRDBFile.func1(0xc820075f20, 0x7fffd855a4e3, 0x26, 0x0, 0x0, 0xc820075ec0, 0xc820079e00)
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/sync.go:222 +0x136
created by main.(*cmdSync).SyncRDBFile
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/sync.go:224 +0x110

goroutine 9 [IO wait]:
net.runtime_pollWait(0x2b88bf6e7388, 0x72, 0xc820076140)
    /usr/local/go/src/runtime/netpoll.go:157 +0x60
net.(*pollDesc).Wait(0xc82127eae0, 0x72, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc82127eae0, 0x0, 0x0)
    /usr/local/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).Read(0xc82127ea80, 0xc82001d000, 0x1000, 0x1000, 0x0, 0x2b88bfd21028, 0xc820076140)
    /usr/local/go/src/net/fd_unix.go:232 +0x23a
net.(*conn).Read(0xc820030018, 0xc82001d000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/net.go:172 +0xe4
bufio.(*Reader).fill(0xc8200740c0)
    /usr/local/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).ReadSlice(0xc8200740c0, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/bufio/bufio.go:328 +0x21a
github.com/garyburd/redigo/redis.(*conn).readLine(0xc8200ac000, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/easemob/go/src/github.com/left2right/redis-port/Godeps/_workspace/src/github.com/garyburd/redigo/redis/conn.go:338 +0x5a
github.com/garyburd/redigo/redis.(*conn).readReply(0xc8200ac000, 0x0, 0x0, 0x0, 0x0)
    /home/easemob/go/src/github.com/left2right/redis-port/Godeps/_workspace/src/github.com/garyburd/redigo/redis/conn.go:411 +0x57
github.com/garyburd/redigo/redis.(*conn).Do(0xc8200ac000, 0x69ff40, 0xc, 0xc824c0a7e0, 0x3, 0x3, 0x0, 0x0, 0x0, 0x0)
    /home/easemob/go/src/github.com/left2right/redis-port/Godeps/_workspace/src/github.com/garyburd/redigo/redis/conn.go:559 +0x6b2
main.restoreRdbEntry(0x2b88bfd270e8, 0xc8200ac000, 0xc824bf0680)
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/utils.go:286 +0x1c6a
main.(*cmdSync).SyncRDBFile.func1.1(0xc8212056c0, 0x7fffd855a4e3, 0x26, 0x0, 0x0, 0xc820075ec0, 0xc820079e00)
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/sync.go:216 +0x377
created by main.(*cmdSync).SyncRDBFile.func1
    /home/easemob/go/src/github.com/left2right/redis-port/cmd/sync.go:219 +0xe7
from redis-port.
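To find out which key it was, I added a statement to the code that prints every key. From what I can see, the connection used at Sync -> SyncRDBFile -> restoreRdbEntry(c, e) never gets any data back, which leaves it in IO wait? I searched online and a few open-source projects have run into something similar. One of the fixes was to add a timeout to the connection: https://github.com/golang/gddo/issues/139 (I don't think that is great, since it might lose data?). I'll build the latest redis-port code and try again.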
from redis-port.
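Your situation is probably not the same as the one described there. In that issue it was an HTTP request that hung, and TCP can indeed behave that way.
In this issue, however, the error in the posted stack trace is panic: runtime error: invalid memory address or nil pointer dereference, which points to a nil pointer.
If you can reproduce it reliably, it is probably a bug or an unhandled case in redis-port. It would be best if you could debug it or provide the code that fails, so that we can follow up.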
from redis-port.
2016/05/17 13:14:29 [INFO] total=1039336531 - 64666156 [ 6%] entry=270889
2016/05/17 13:14:30 [INFO] total=1039336531 - 65678020 [ 6%] entry=275075
2016/05/17 13:14:31 [INFO] total=1039336531 - 66752256 [ 6%] entry=279343
2016/05/17 13:14:32 [INFO] total=1039336531 - 68302108 [ 6%] entry=283564
2016/05/17 13:14:33 [PANIC] parse rdb entry error
[error]: EOF
11 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/reader.go:75
github.com/CodisLabs/redis-port/pkg/rdb.(*rdbReader).Read
10 /usr/local/go/src/io/io.go:514
io.(*teeReader).Read
9 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/reader.go:73
github.com/CodisLabs/redis-port/pkg/rdb.(*rdbReader).Read
8 /usr/local/go/src/io/io.go:298
io.ReadAtLeast
7 /usr/local/go/src/io/io.go:316
io.ReadFull
6 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/reader.go:229
github.com/CodisLabs/redis-port/pkg/rdb.(*rdbReader).readByte
5 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/reader.go:244
github.com/CodisLabs/redis-port/pkg/rdb.(*rdbReader).readUint8
4 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/reader.go:187
github.com/CodisLabs/redis-port/pkg/rdb.(*rdbReader).readEncodedLength
3 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/reader.go:143
github.com/CodisLabs/redis-port/pkg/rdb.(*rdbReader).readString
2 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/reader.go:99
github.com/CodisLabs/redis-port/pkg/rdb.(*rdbReader).readObjectValue
1 /home/easemob/go/src/github.com/CodisLabs/redis-port/pkg/rdb/loader.go:129
github.com/CodisLabs/redis-port/pkg/rdb.(*Loader).NextBinEntry
0 /home/easemob/go/src/github.com/CodisLabs/redis-port/cmd/utils.go:247
main.newRDBLoader.func1
... ...
[stack]:
0 /home/easemob/go/src/github.com/CodisLabs/redis-port/cmd/utils.go:248
main.newRDBLoader.func1
... ...
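This is the error reported when running a build of the latest master of redis-port from this repo. I hit the same error before and assumed it was caused by some key whose value was too large, so I modified the NextBinEntry -> readObjectValue code to find out which key it was, as follows: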
val, err := l.readObjectValue(t)
if err != nil {
    entry.DB = l.db
    entry.Key = key
    return entry, err
}
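Another point that may help locate the problem: because the crash happens in the sync rdb phase, we ran bgsave directly on the source redis to generate an rdb file (a little over 1 GB) and loaded that rdb with another redis server. The number of keys in it differed from the source redis. We tried twice: the first time there were only a little over 20,000 keys, and the second time there were a little over 3 million keys, more than 1 million short.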
from redis-port.
The keys here expire after 1 day, and QPS is around 5000. This redis is currently running in production, and we plan to migrate it into codis.
from redis-port.
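On the second question first: the rdb file is simply what redis bgsave generates, and redis-port only sends the bgsave command to the master so that it generates the rdb.
If two restores from an rdb by redis itself differ that much, it can only be a redis problem.
Also, the rdb contains data that is about to expire but has not expired yet. If you restore from the rdb after the expiry time has passed, that expired data may simply be discarded. Could that be why you are missing 1 million keys?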
from redis-port.
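Now the first question: this error is EOF. As for the panic at the very start of the issue, I suspect it came from a mistake in your debugging changes rather than from redis-port itself?
EOF usually means the master actively closed the connection. Typically, during sync the backlog is produced faster than redis-port consumes it, so the backlog buffer fills up and the master closes the socket connection to redis-port.
This behavior and its solution were discussed in CodisLabs/codis#318. Please check whether the cause is the same.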
from redis-port.
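"Also, the rdb contains data that is about to expire but has not expired yet. If you restore from the rdb after the expiry time has passed, that expired data may simply be discarded. Could that be why you are missing 1 million keys?" === I had considered that. I'll verify it later, thanks.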
from redis-port.
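"EOF usually means the master actively closed the connection. Typically, during sync the backlog is produced faster than redis-port consumes it, so the backlog buffer fills up and the master closes the socket connection to redis-port.
This behavior and its solution were discussed in CodisLabs/codis#318. Please check whether the cause is the same." === OK, I'll take a look, thanks.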
from redis-port.
The approach above solved the problem. Thanks~
from redis-port.
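I want redis-port to support plain redis migration, i.e. from redis A to redis B. In cmd/utils.go at line 205: s, err := redigo.String(c.Do("slotrestore", e.Key, ttlms, e.Value))
After changing slotrestore to the restore command and starting the migration, I get the error below. It looks like the checksum passed to restore is wrong. How should I modify the code?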
./bin/redis-port sync -f 127.0.0.1:9000 -t 127.0.0.1:9001
2016/10/18 18:11:15 [INFO] set ncpu = 4, parallel = 4
2016/10/18 18:11:15 [INFO] sync from '127.0.0.1:9000' to '127.0.0.1:9001'
2016/10/18 18:11:15 [INFO] rdb file = 42
2016/10/18 18:11:15 [PANIC] restore command error
[error]: ERR DUMP payload version or checksum are wrong
[stack]:
1 /data/tmp/redis-port-master/cmd/utils.go:207
main.restoreRdbEntry
0 /data/tmp/redis-port-master/cmd/sync.go:212
main.(*cmdSync).SyncRDBFile.func1.1
... ...
2016/10/18 18:11:15 [PANIC] restore command error
[error]: ERR DUMP payload version or checksum are wrong
[stack]:
1 /data/tmp/redis-port-master/cmd/utils.go:207
main.restoreRdbEntry
0 /data/tmp/redis-port-master/cmd/sync.go:212
main.(*cmdSync).SyncRDBFile.func1.1
... ...
from redis-port.