Comments (13)

leszko commented on May 27, 2024

@adnxn, thanks for taking this issue!

So, I recommend doing the following:

  • Change the livenessProbe path from /health/node-state to /health and check that startup still waits correctly for readiness (it should start one member, wait 30s, start the second member, wait, and so on).
  • Check rolling upgrade: put a lot of data (~2GB) into a cluster, perform a rolling upgrade, and check that there is no data loss.
  • Check scaling: create a big cluster (at least 6 members), put in a lot of data (~2GB), scale down to 2 members, scale back up to 6 members, and check that there is no data loss.

Also, I'd not change readinessProbe; it should stay as /health/node-state.
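The scaling check above could be scripted roughly as follows. This is only a sketch: `count_entries` is a hypothetical hook you would have to implement yourself (for example with a small client app), and the StatefulSet name `<release>-hazelcast` is an assumption about how the chart names its resources.

```shell
# Compare entry counts taken before and after an operation.
check_no_loss() {
  if [ "$1" -eq "$2" ]; then
    echo "OK: $2 entries"
  else
    echo "DATA LOSS: expected $1, found $2"
    return 1
  fi
}

# Hypothetical hook: replace with whatever reports the number of entries
# in your test map. It fails on purpose until you implement it.
count_entries() {
  echo "count_entries is not implemented" >&2
  return 1
}

# Scale down to 2 members, back up to 6, then compare counts.
# $1 = Helm release name, e.g. my-release
scale_check() {
  release=$1
  before=$(count_entries) || return 1
  helm upgrade "$release" --set cluster.memberCount=2 stable/hazelcast
  kubectl rollout status "statefulset/$release-hazelcast"   # assumed name
  helm upgrade "$release" --set cluster.memberCount=6 stable/hazelcast
  kubectl rollout status "statefulset/$release-hazelcast"   # assumed name
  after=$(count_entries) || return 1
  check_no_loss "$before" "$after"
}
```

Nothing runs until you call `scale_check my-release`. The same skeleton should work for the rolling upgrade check if you swap the two `helm upgrade` lines for an image tag change.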

from charts.

leszko commented on May 27, 2024

@adnxn

  1. Right, for the older HZ version (3.12.x) you need to use the old Helm chart version. I forgot to mention that. Try the following commands, they should work:
helm install --name my-release --set cluster.memberCount=6 stable/hazelcast --version 2.10.0
helm upgrade my-release --set cluster.memberCount=3 stable/hazelcast --version 2.10.0
  2. As for inserting data, you can either write a client app to insert it, or use the built-in Client Console App. If you have a Hazelcast cluster running on Kubernetes, try executing the following:
$ kubectl exec -it hazelcast-0 -- /bin/bash
# java -cp lib/hazelcast-all*.jar com.hazelcast.client.console.ClientConsoleApp
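To get a feel for how many entries "~2GB" means, whichever way you insert them, a quick back-of-the-envelope helper might look like this. It counts value bytes only and ignores key size, serialization overhead and backup copies, so treat the result as a rough lower bound. The function name is made up for illustration.

```shell
# Number of entries needed to reach a target volume, rounded up.
# $1 = target size in MiB, $2 = value size in bytes
entries_needed() {
  echo $(( ($1 * 1024 * 1024 + $2 - 1) / $2 ))
}

# Example: 2048 MiB of 1 KiB values
entries_needed 2048 1024   # prints 2097152
```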

mesutcelik commented on May 27, 2024

Are you going to parse the response and decide if it is healthy?

Hazelcast::NodeState=ACTIVE
Hazelcast::ClusterState=ACTIVE
Hazelcast::ClusterSafe=TRUE
Hazelcast::MigrationQueueSize=0
Hazelcast::ClusterSize=2

Apart from /health/node-state and /health, we have /ready too. We just need to figure out which one really means "restart me" when it returns a non-200 response code. cc: @mmedenjak
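If parsing the body did turn out to be necessary, it would not take much. A minimal sketch, assuming the body has exactly the `Hazelcast::Key=Value` shape shown above; the `is_healthy` policy (node ACTIVE and cluster safe) is only an illustration, not something agreed in this thread.

```shell
# Extract one field from the plain-text health body on stdin.
# $1 = key without the "Hazelcast::" prefix, e.g. NodeState
health_field() {
  sed -n "s/^Hazelcast::$1=//p"
}

# Illustrative policy only: healthy when the node is ACTIVE and the
# cluster reports itself safe. Reads the health body from stdin.
is_healthy() {
  _hz_body=$(cat)
  [ "$(printf '%s\n' "$_hz_body" | health_field NodeState)" = "ACTIVE" ] &&
    [ "$(printf '%s\n' "$_hz_body" | health_field ClusterSafe)" = "TRUE" ]
}
```

For example, piping the response above into `health_field ClusterSize` would print `2`.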

leszko commented on May 27, 2024

I think we don't need to parse it: HTTP 200 from /health should mean "I'm alive". That is what should be used for livenessProbe.

I think the current setup is a little wrong, because we use /health/node-state, which returns 503 if the Hazelcast node is in the shutdown state. So if Hazelcast is in the shutdown state, Kubernetes terminates it. But a Hazelcast node in the shutdown state is still alive, and we should wait until it shuts down properly by itself.
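To make the status code reasoning concrete: Kubernetes counts an HTTP probe as successful for any status from 200 up to (but not including) 400, so a 503 is a failure. A small sketch for checking an endpoint by hand; port 5701 and the `/hazelcast` path prefix are assumptions about a default setup, so adjust them to yours.

```shell
# Kubernetes treats an HTTP probe as successful for any status in [200, 400).
# $1 = HTTP status code
probe_ok() {
  [ "$1" -ge 200 ] && [ "$1" -lt 400 ]
}

# Print only the status code of a health endpoint (not called here).
# $1 = host, $2 = path such as /health or /health/node-state
# Port 5701 and the /hazelcast prefix are assumptions.
health_status() {
  curl -s -o /dev/null -w '%{http_code}' "http://$1:5701/hazelcast$2"
}
```

With that, `probe_ok "$(health_status localhost /health/node-state)"` would fail for a member in the shutdown state, which is exactly the case where a liveness probe should not be killing the pod.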

adnxn commented on May 27, 2024

The change is trivial, but before applying it we need to double check that it does not break rolling upgrade and scaling down.

any guidance on how to validate these two things? rolling upgrades and scaling down?

adnxn commented on May 27, 2024

well just came across these:
https://hazelcast.com/blog/rolling-upgrade-hazelcast-imdg-on-kubernetes/
https://hazelcast.com/blog/how-to-scale-hazelcast-imdg-on-kubernetes/

but yea - any other advice would be welcome. thanks

Holmistr commented on May 27, 2024

Hi @adnxn , glad to see you here :) I'm assigning you the issue and I'll make sure to get you some guidance from our experts. Looking forward to your contribution!

leszko commented on May 27, 2024

As for the technical details of how to scale up/down and how to perform rolling updates, the blog posts you mentioned are good guidelines. I recommend using the Helm chart.

Then for the scaling, all you need to do is to execute:

helm install --name my-release --set cluster.memberCount=6 stable/hazelcast
helm upgrade my-release --set cluster.memberCount=3 stable/hazelcast

And for the rolling update:

helm install --name my-release --set image.tag=3.12 hazelcast/hazelcast
helm upgrade my-release --set image.tag=3.12.1 hazelcast/hazelcast
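After the upgrade finishes, it can be worth confirming that every member really runs the new image before checking the data. A rough sketch: `all_on_tag` is plain text processing, while the `release=` label selector in `pod_images` is an assumption about how the chart labels its pods and may need changing.

```shell
# Succeeds only if every image on stdin (one per line) ends with ":<tag>".
# $1 = expected tag, e.g. 3.12.1
all_on_tag() {
  tag=$1
  while IFS= read -r image; do
    [ -z "$image" ] && continue
    case $image in
      *":$tag") ;;                          # this pod is on the new tag
      *) echo "stale: $image"; return 1 ;;  # still on another tag
    esac
  done
}

# Not called here: list the first container image of each pod in a release.
# The label selector is an assumption about the chart's labels.
pod_images() {
  kubectl get pods -l "release=$1" \
    -o jsonpath='{range .items[*]}{.spec.containers[0].image}{"\n"}{end}'
}
```

Usage would be along the lines of `pod_images my-release | all_on_tag 3.12.1`.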

Write here if you encounter any issues. I'll try to help.

adnxn commented on May 27, 2024

@leszko: thanks for the info

put a lot of data (~2GB)

what would be the best way to do this?

also - seems like changing the endpoint breaks the management console for v3.12.* hrm

mesutcelik commented on May 27, 2024

Hi @adnxn ,
Do you need any more help to finalize this issue?

leszko commented on May 27, 2024

@adnxn are you still working on this issue?

adnxn commented on May 27, 2024

hey, I haven't had time to follow up on this. if someone else wants to take it over, feel free.

from charts.

sgandon commented on May 27, 2024

Any news on this?
Is the liveness recommendation still /health
and the readiness recommendation still /health/node-state?
Can you confirm that /health/node-state will respond with 200 once the node has tried to join other Hazelcast nodes at least once?
