Giter Site home page Giter Site logo

Comments (8)

amshuman-kr avatar amshuman-kr commented on July 21, 2024 2

I think I found the root cause. The auto-compaction config in common config seems to be not deeply copied.

# Before DeepCopy
Common:v1alpha1.SharedConfig{AutoCompactionMode:(*v1alpha1.CompactionMode)(0xc0007ee380), AutoCompactionRetention:(*string)(0xc0007ee390)}
# After DeepCopy
Common:v1alpha1.SharedConfig{AutoCompactionMode:(*v1alpha1.CompactionMode)(0xc0007ee380), AutoCompactionRetention:(*string)(0xc0007ee390)}

I can now see that #157 forgot to generate DeepCopy. I tried generating it now. But for some reason, I still see the auto-compaction config not deeply copied.

Will dig some more.

from etcd-druid.

abdasgupta avatar abdasgupta commented on July 21, 2024

I tried with patch and removed locks. But the unit tests gave race condition error. Should I raise a PR with the changes for patch, just for your review?

from etcd-druid.

amshuman-kr avatar amshuman-kr commented on July 21, 2024

Hmm. Yes, you can raise a draft PR.

from etcd-druid.

amshuman-kr avatar amshuman-kr commented on July 21, 2024

I think I found the reason for the race-condition while using Status().Patch() or Status.Update() without synchronisation. The race condition seems to occur when the client-go/rest tries to decode the response back into the object (see the error trace below).

This in turn seems to relate to a bug in the DeepCopy implementation where some string pointers are not deeply copied. When I try passing a "freshly constructed" object (with only TypeMeta and ObjectMeta and without spec and status) to Status().Patch(), the race-conditions disappear.

I think, the real fix should be in DeepCopy, though.

WARNING: DATA RACE
Write at 0x00c000b45680 by goroutine 169:
  github.com/json-iterator/go.(*stringCodec).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_native.go:209 +0xa6
  github.com/json-iterator/go.(*placeholderDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect.go:324 +0x64
  github.com/json-iterator/go.(*OptionalDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_optional.go:39 +0x195
  github.com/json-iterator/go.(*placeholderDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect.go:324 +0x64
  github.com/json-iterator/go.(*structFieldDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:1054 +0xcd
  github.com/json-iterator/go.(*twoFieldsStructDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:615 +0x169
  github.com/json-iterator/go.(*structFieldDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:1054 +0xcd
  github.com/json-iterator/go.(*generalStructDecoder).decodeOneField()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:552 +0x1e4
  github.com/json-iterator/go.(*generalStructDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:508 +0x10f
  github.com/json-iterator/go.(*structFieldDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:1054 +0xcd
  github.com/json-iterator/go.(*fiveFieldsStructDecoder).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect_struct_decoder.go:741 +0x654
  github.com/json-iterator/go.(*Iterator).ReadVal()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/reflect.go:79 +0x1fb
  github.com/json-iterator/go.(*frozenConfig).Unmarshal()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/github.com/json-iterator/go/config.go:348 +0xfe
  k8s.io/apimachinery/pkg/runtime/serializer/json.(*Serializer).Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/runtime/serializer/json/json.go:263 +0x561
  k8s.io/apimachinery/pkg/runtime.WithoutVersionDecoder.Decode()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/runtime/helper.go:252 +0xac
  k8s.io/apimachinery/pkg/runtime.(*WithoutVersionDecoder).Decode()
      <autogenerated>:1 +0xd9
  k8s.io/client-go/rest.Result.Into()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/rest/request.go:1263 +0xc9
  sigs.k8s.io/controller-runtime/pkg/client.(*typedClient).PatchStatus()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/client/typed_client.go:204 +0x964
  sigs.k8s.io/controller-runtime/pkg/client.(*statusWriter).Patch()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/client/client.go:262 +0x359
  github.com/gardener/etcd-druid/controllers.(*EtcdCustodian).updateEtcdStatusWithNoSts.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/controllers/etcd_custodian_controller.go:180 +0x1b7
  k8s.io/client-go/util/retry.OnError.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/util/retry/util.go:51 +0x53
  k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:211 +0x85
  k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:399 +0x64
  k8s.io/client-go/util/retry.OnError()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/util/retry/util.go:50 +0xd0
  k8s.io/client-go/util/retry.RetryOnConflict()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/client-go/util/retry/util.go:104 +0x3a7
  github.com/gardener/etcd-druid/controllers.(*EtcdCustodian).updateEtcdStatusWithNoSts()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/controllers/etcd_custodian_controller.go:179 +0x2c1
  github.com/gardener/etcd-druid/controllers.(*EtcdCustodian).Reconcile()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/controllers/etcd_custodian_controller.go:109 +0x8bd
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263 +0x42a
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235 +0x368
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198 +0x64
  k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x4e
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x75
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xba
  k8s.io/apimachinery/pkg/util/wait.JitterUntil()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x114
  k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0xb3
  k8s.io/apimachinery/pkg/util/wait.UntilWithContext()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99 +0x64

Previous read at 0x00c000b45680 by goroutine 158:
  [failed to restore the stack]

Goroutine 169 (running) created at:
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x784
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:205 +0x252
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:691 +0xe3

Goroutine 158 (running) created at:
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195 +0x784
  sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:205 +0x252
  sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1()
      /Users/i344628/go/src/github.com/gardener/etcd-druid/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:691 +0xe3

from etcd-druid.

amshuman-kr avatar amshuman-kr commented on July 21, 2024

I think, the real fix should be in DeepCopy, though.

I tried to re-generenate the DeepCopy methods. But the race-conditions persist. So, for now, the only thing that mitigates the problem is to pass a "freshly constructed" object to Status().Patch(), as mentioned earlier.

from etcd-druid.

amshuman-kr avatar amshuman-kr commented on July 21, 2024

When I try passing a "freshly constructed" object (with only TypeMeta and ObjectMeta and without spec and status) to Status().Patch(), the race-conditions disappear.

To be clear, this is really a work-around and it will only work with client.RawPatch() and not with client.MergeFrom() or anything else that depends on the spec/status content of the object passed to Status().Patch().

from etcd-druid.

amshuman-kr avatar amshuman-kr commented on July 21, 2024

But for some reason, I still see the auto-compaction config not deeply copied.

Mystery solved. After separating api into its own go module in #169, the rest of etcd-druid uses the vendored api. So, re-generating the DeepCopy functions followed by make revendor fixed the race-condition.

from etcd-druid.

amshuman-kr avatar amshuman-kr commented on July 21, 2024

@abdasgupta Can you please change Status().Patch() to Status().Update() (that fixed all failing test cases after race-conditions were fixed)? Let's revisit using Status().Patch() as a separate issue.

from etcd-druid.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.