I was seen in the staging cluster on GCP for e2e that etcd was failing even after pod restarts. The logs state that data directory is successfully getting restored and etcd is started post restoration. However, etcd crashes immediately on startup.
Identify the root cause and fix it.
time="2018-07-20T08:27:50Z" level=info msg="Regsitering the http request handlers..."
time="2018-07-20T08:27:50Z" level=info msg="Starting the http server..."
time="2018-07-20T08:27:50Z" level=info msg="Created snapstore from provider: GCS"
time="2018-07-20T08:27:50Z" level=info msg="Validating schedule..."
time="2018-07-20T08:27:50Z" level=info msg="Probing etcd..."
time="2018-07-20T08:27:50Z" level=info msg="Responding to status request with: New"
time="2018-07-20T08:27:50Z" level=info msg="Received start initialization request."
time="2018-07-20T08:27:50Z" level=info msg="Updating status from New to Progress"
time="2018-07-20T08:27:50Z" level=info msg="Checking for data directory structure validity..."
time="2018-07-20T08:27:50Z" level=info msg="Data directory structure invalid."
time="2018-07-20T08:27:50Z" level=info msg="Removing data directory(/var/etcd/data) for snapshot restoration."
time="2018-07-20T08:27:50Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:50Z" level=info msg="Finding latest set of snapshot to recover from..."
time="2018-07-20T08:27:51Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:52Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:53Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:54Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:55Z" level=info msg="Restoring from base snapshot: Backup-1531824004/Full-00000000-00085834-1531824004"
time="2018-07-20T08:27:56Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:57Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:58Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:27:59Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:00Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:01Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:02Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:03Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:04Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:05Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:06Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:07Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:08Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:09Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:10Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:10Z" level=info msg="Successfully initialized data directory \"/var/etcd/data\" for etcd."
time="2018-07-20T08:28:11Z" level=info msg="Responding to status request with: Successful"
time="2018-07-20T08:28:11Z" level=info msg="Updating status from Successful to New"
time="2018-07-20T08:28:12Z" level=info msg="Responding to status request with: New"
time="2018-07-20T08:28:12Z" level=info msg="Received start initialization request."
time="2018-07-20T08:28:12Z" level=info msg="Updating status from New to Progress"
time="2018-07-20T08:28:12Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:13Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:14Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:15Z" level=info msg="Responding to status request with: Progress"
^Xtime="2018-07-20T08:28:16Z" level=info msg="Responding to status request with: Progress"
time="2018-07-20T08:28:17Z" level=info msg="Responding to status request with: Progress"
2018-07-20 08:28:11.623015 I | etcdmain: etcd Version: 3.3.8
2018-07-20 08:28:11.623029 I | etcdmain: Git SHA: 33245c6b5
2018-07-20 08:28:11.623035 I | etcdmain: Go Version: go1.9.7
2018-07-20 08:28:11.623040 I | etcdmain: Go OS/Arch: linux/amd64
2018-07-20 08:28:11.623045 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-07-20 08:28:11.623161 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2018-07-20 08:28:11.623575 I | embed: listening for peers on http://localhost:2380
2018-07-20 08:28:11.623669 I | embed: listening for client requests on 0.0.0.0:2379
unexpected fault address 0x7fb1f790b000
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7fb1f790b000 pc=0x8b54b4]
goroutine 46 [running]:
runtime.throw(0xff2dd4, 0x5)
/usr/local/go/src/runtime/panic.go:605 +0x95 fp=0xc42006b398 sp=0xc42006b378 pc=0x42c145
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:364 +0x29d fp=0xc42006b3e8 sp=0xc42006b398 pc=0x442e2d
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).checkBucket.func1(0x7fb1f790b000, 0x3)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:445 +0x64 fp=0xc42006b530 sp=0xc42006b3e8 pc=0x8b54b4
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).forEachPage(0xc4202a4000, 0x41d87, 0x3, 0xc42006b670)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:607 +0x81 fp=0xc42006b578 sp=0xc42006b530 pc=0x8b3d31
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).forEachPage(0xc4202a4000, 0x5097, 0x2, 0xc42006b670)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:613 +0xd7 fp=0xc42006b5c0 sp=0xc42006b578 pc=0x8b3d87
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).forEachPage(0xc4202a4000, 0x5098, 0x1, 0xc42006b670)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:613 +0xd7 fp=0xc42006b608 sp=0xc42006b5c0 pc=0x8b3d87
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).forEachPage(0xc4202a4000, 0xbb, 0x0, 0xc42006b670)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:613 +0xd7 fp=0xc42006b650 sp=0xc42006b608 pc=0x8b3d87
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).checkBucket(0xc4202a4000, 0xc42021b940, 0xc42006b950, 0xc42006b920, 0xc42028a5a0)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:444 +0xe1 fp=0xc42006b6e0 sp=0xc42006b650 pc=0x8b3371
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).checkBucket.func2(0x7fb1b5c0a1b9, 0x3, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:469 +0xcd fp=0xc42006b738 sp=0xc42006b6e0 pc=0x8b5add
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Bucket).ForEach(0xc4202a4018, 0xc42006b7f8, 0x0, 0xc42003c7c8)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/bucket.go:388 +0x124 fp=0xc42006b7a8 sp=0xc42006b738 pc=0x8a21f4
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).checkBucket(0xc4202a4000, 0xc4202a4018, 0xc42003c950, 0xc42003c920, 0xc42028a5a0)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:467 +0x160 fp=0xc42006b838 sp=0xc42006b7a8 pc=0x8b33f0
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*DB).freepages(0xc4202a0000, 0x0, 0x0, 0x0)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/db.go:979 +0x257 fp=0xc42006ba20 sp=0xc42006b838 pc=0x8a98a7
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*DB).loadFreelist.func1()
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/db.go:287 +0x1e4 fp=0xc42006ba88 sp=0xc42006ba20 pc=0x8b4be4
sync.(*Once).Do(0xc4202a0150, 0xc42003cad0)
/usr/local/go/src/sync/once.go:44 +0xbe fp=0xc42006bac0 sp=0xc42006ba88 pc=0x472c2e
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*DB).loadFreelist(0xc4202a0000)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/db.go:283 +0x4e fp=0xc42006baf0 sp=0xc42006bac0 pc=0x8a6a3e
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.Open(0xc42028e8a0, 0x1d, 0x180, 0xc42003cbf8, 0x0, 0x0, 0x1d)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/db.go:260 +0x3b8 fp=0xc42006bbc0 sp=0xc42006baf0 pc=0x8a65a8
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.newBackend(0xc42028e8a0, 0x1d, 0x5f5e100, 0x2710, 0x280000000, 0x2)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/backend.go:129 +0xaf fp=0xc42006bcb8 sp=0xc42006bbc0 pc=0x906e1f
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.New(0xc42028e8a0, 0x1d, 0x5f5e100, 0x2710, 0x280000000, 0x472c2e, 0x10)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/backend.go:113 +0x48 fp=0xc42006bcf8 sp=0xc42006bcb8 pc=0x906d48
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.newBackend(0x7fff5ca3322c, 0x9, 0x0, 0x0, 0x0, 0x0, 0xc42021e900, 0x1, 0x1, 0xc42021e700, ...)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/backend.go:36 +0x1ad fp=0xc42006bdb8 sp=0xc42006bcf8 pc=0xbaa54d
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.openBackend.func1(0xc42028a540, 0xc420268800)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/backend.go:56 +0x62 fp=0xc42006bfd0 sp=0xc42006bdb8 pc=0xbcdd32
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42006bfd8 sp=0xc42006bfd0 pc=0x45d2c1
created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.openBackend
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/backend.go:55 +0xe6
goroutine 1 [select]:
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.openBackend(0x7fff5ca3322c, 0x9, 0x0, 0x0, 0x0, 0x0, 0xc42021e900, 0x1, 0x1, 0xc42021e700, ...)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/backend.go:58 +0x1c1
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.NewServer(0x7fff5ca3322c, 0x9, 0x0, 0x0, 0x0, 0x0, 0xc42021e900, 0x1, 0x1, 0xc42021e700, ...)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:286 +0x3ee
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/embed.StartEtcd(0xc420276000, 0xc420276480, 0x0, 0x0)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/embed/etcd.go:179 +0x870
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.startEtcd(0xc420276000, 0x6, 0xff3ea7, 0x6, 0x1)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/etcd.go:181 +0x40
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.startEtcdOrProxyV2()
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/etcd.go:102 +0x151e
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.Main()
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/main.go:46 +0x3f
main.main()
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/etcd/main.go:28 +0x20
goroutine 51 [chan receive]:
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil.(*MergeLogger).outputLoop(0xc42018db20)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil/merge_logger.go:174 +0x47a
created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil.NewMergeLogger
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil/merge_logger.go:92 +0xb8
goroutine 62 [chan receive]:
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil.(*MergeLogger).outputLoop(0xc4201f6660)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil/merge_logger.go:174 +0x47a
created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil.NewMergeLogger
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil/merge_logger.go:92 +0xb8
goroutine 63 [chan receive]:
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil.(*MergeLogger).outputLoop(0xc4201f66c0)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil/merge_logger.go:174 +0x47a
created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil.NewMergeLogger
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/pkg/logutil/merge_logger.go:92 +0xb8
goroutine 47 [chan receive]:
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*DB).freepages.func2(0xc42028a5a0)
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/db.go:975 +0x4b
created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*DB).freepages
/tmp/etcd-release-3.3.8/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/db.go:974 +0x214
goroutine 78 [syscall]:
os/signal.signal_recv(0x0)
/usr/local/go/src/runtime/sigqueue.go:131 +0xa6
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
/usr/local/go/src/os/signal/signal_unix.go:28 +0x41