Comments (8)
I think I found the root cause. The auto-compaction config in common config seems to be not deeply copied.
# Before DeepCopy
Common:v1alpha1.SharedConfig{AutoCompactionMode:(*v1alpha1.CompactionMode)(0xc0007ee380), AutoCompactionRetention:(*string)(0xc0007ee390)}
# After DeepCopy
Common:v1alpha1.SharedConfig{AutoCompactionMode:(*v1alpha1.CompactionMode)(0xc0007ee380), AutoCompactionRetention:(*string)(0xc0007ee390)}
I can now see that #157 forgot to generate DeepCopy
. I tried generating it now. But for some reason, I still see the auto-compaction config not deeply copied.
Will dig some more.
from etcd-druid.
I tried with patch and removed locks. But the unit tests gave race condition error. Should I raise a PR with the changes for patch, just for your review?
from etcd-druid.
Hmm. Yes, you can raise a draft PR.
from etcd-druid.
I think I found the reason for the race-condition while using Status().Patch()
or Status.Update()
without synchronisation. The race condition seems to occur when the client-go/rest
tries to decode the response back into the object (see the error trace below).
This in turn seems to relate to a bug in the DeepCopy
implementation where some string pointers are not deeply copied. When I try passing a "freshly constructed" object (with only TypeMeta
and ObjectMeta
and without spec and status) to Status().Patch()
, the race-conditions disappear.
I think, the real fix should be in DeepCopy
, though.
WARNING: DATA RACE
Write at 0x00c000b45680 by goroutine 169:
github.com/json-iterator/go.(*stringCodec).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_native.go:209 +0xa6
github.com/json-iterator/go.(*placeholderDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect.go:324 +0x64
github.com/json-iterator/go.(*OptionalDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_optional.go:39 +0x195
github.com/json-iterator/go.(*placeholderDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect.go:324 +0x64
github.com/json-iterator/go.(*structFieldDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:1054 +0xcd
github.com/json-iterator/go.(*twoFieldsStructDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:615 +0x169
github.com/json-iterator/go.(*structFieldDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:1054 +0xcd
github.com/json-iterator/go.(*generalStructDecoder).decodeOneField()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:552 +0x1e4
github.com/json-iterator/go.(*generalStructDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:508 +0x10f
github.com/json-iterator/go.(*structFieldDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:1054 +0xcd
github.com/json-iterator/go.(*fiveFieldsStructDecoder).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:741 +0x654
github.com/json-iterator/go.(*Iterator).ReadVal()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect.go:79 +0x1fb
github.com/json-iterator/go.(*frozenConfig).Unmarshal()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/config.go:348 +0xfe
k8s.io/apimachinery/pkg/runtime/serializer/json.(*Serializer).Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/runtime/serializer/json/json.go:263 +0x561
k8s.io/apimachinery/pkg/runtime.WithoutVersionDecoder.Decode()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/runtime/helper.go:252 +0xac
k8s.io/apimachinery/pkg/runtime.(*WithoutVersionDecoder).Decode()
<autogenerated>:1 +0xd9
k8s.io/client-go/rest.Result.Into()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/rest/request.go:1263 +0xc9
sigs.k8s.io/controller-runtime/pkg/client.(*typedClient).PatchStatus()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/client/typed_client.go:204 +0x964
sigs.k8s.io/controller-runtime/pkg/client.(*statusWriter).Patch()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/client/client.go:262 +0x359
github.com/gardener/etcd-druid/controllers.(*EtcdCustodian).updateEtcdStatusWithNoSts.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/controllers/etcd_custodian_controller.go:180 +0x1b7
k8s.io/client-go/util/retry.OnError.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/util/retry/util.go:51 +0x53
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:211 +0x85
k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:399 +0x64
k8s.io/client-go/util/retry.OnError()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/util/retry/util.go:50 +0xd0
k8s.io/client-go/util/retry.RetryOnConflict()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/util/retry/util.go:104 +0x3a7
github.com/gardener/etcd-druid/controllers.(*EtcdCustodian).updateEtcdStatusWithNoSts()
/Users/i344628/go/src/github.com/gardener/etcd-druid/controllers/etcd_custodian_controller.go:179 +0x2c1
github.com/gardener/etcd-druid/controllers.(*EtcdCustodian).Reconcile()
/Users/i344628/go/src/github.com/gardener/etcd-druid/controllers/etcd_custodian_controller.go:109 +0x8bd
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x42a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x368
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x64
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x4e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x75
k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xba
k8s.io/apimachinery/pkg/util/wait.JitterUntil()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x114
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0xb3
k8s.io/apimachinery/pkg/util/wait.UntilWithContext()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x64
Previous read at 0x00c000b45680 by goroutine 158:
[failed to restore the stack]
Goroutine 169 (running) created at:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x784
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:205 +0x252
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:691 +0xe3
Goroutine 158 (running) created at:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x784
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:205 +0x252
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1()
/Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:691 +0xe3
from etcd-druid.
I think, the real fix should be in DeepCopy, though.
I tried to re-generenate the DeepCopy
methods. But the race-conditions persist. So, for now, the only thing that mitigates the problem is to pass a "freshly constructed" object to Status().Patch()
, as mentioned earlier.
from etcd-druid.
When I try passing a "freshly constructed" object (with only TypeMeta and ObjectMeta and without spec and status) to Status().Patch(), the race-conditions disappear.
To be clear, this is really a work-around and it will only work with client.RawPatch()
and not with client.MergeFrom()
or anything else that depends on the spec/status content of the object passed to Status().Patch()
.
from etcd-druid.
But for some reason, I still see the auto-compaction config not deeply copied.
Mystery solved. After separating api
into its own go module in #169, the rest of etcd-druid
uses the vendored api. So, re-generating the DeepCopy
functions followed by make revendor
fixed the race-condition.
from etcd-druid.
@abdasgupta Can you please change Status().Patch()
to Status().Update()
(that fixed all failing test cases after race-conditions were fixed)? Let's revisit using Status().Patch()
as a separate issue.
from etcd-druid.
Related Issues (20)
- [Feature] Consume GCS regional endpoints feature from etcd-backup-restore
- [Feature] Enable Druid to use mock GCS server to run e2e tests
- [Feature] Enable `etcd-druid` to make use of the `Azurite` emulator to run end-to-end (e2e) tests HOT 1
- [Feature] Move `UseEtcdWrapper` alpha feature to beta feature HOT 2
- [Feature] Configurable `--max-backups` parameter for etcdbrctl garbage collection HOT 1
- Make max-backups configurable in LimitBased GC from Etcd CR
- [BUG] violates PodSecurity "baseline:latest" HOT 1
- Add a new configurable field `fullSnapshotLeaseUpdateInterval` in spec.backup section of Etcd CR HOT 3
- [Backlog] Upgrade kustomize to v5.0.0+
- Improve Node Utilization by Avoiding "safe-to-evict" Annotation for Druid-Managed Pods HOT 13
- Improve Node Utilisation / More Permissive CA Down-Scaling (safe-to-evict) HOT 2
- [Feature] Move UseEtcdWrapper beta feature to GA feature
- Upgrade golang version to 1.22
- ☂️ Enhance and Stabilise Druid E2E tests HOT 1
- Improve Node Utilization by Reducing Requests for Druid-Managed Pods HOT 10
- Clashing VPA dependencies between etcd-druid and gardener/gardener
- [TASK] Remove deprecated fields from Etcd API
- Handling orphaned deployed components in case ETCD resources is deleted by mistakenly removing finalizer HOT 1
- Use `fsGroup` instead of initContainer for setting appropriate file owners. HOT 2
- Introduce etcd-druid upgrade tests
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from etcd-druid.