Comments (6)

sadovnychyi commented on June 28, 2024

https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python#what-is-a-pipeline

All pipelines must be idempotent. This means that running the same pipeline with the same inputs more than once will yield the same results and the same side-effects. The library does not enforce the idempotence requirement on pipelines, it is up to developers to do it themselves. However, the library provides a few pieces (like stable pipeline IDs) which make it easy to achieve idempotence for side-effects.
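
To make the "stable pipeline IDs" point concrete, here is a minimal sketch of how a stable ID can guard a side effect so that a repeated run does not repeat it. Everything in it is an assumption made for illustration: the class `IdempotentSideEffect`, the method `sendOnce`, and the in-memory map standing in for an external store are hypothetical and are not part of the pipelines library.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from the pipelines library.
class IdempotentSideEffect {
    // Stands in for an external store keyed by a stable pipeline ID.
    private final ConcurrentMap<String, String> applied = new ConcurrentHashMap<>();
    // Counts how many times the (simulated) external effect really happened.
    private final AtomicInteger effectsPerformed = new AtomicInteger();

    /** Performs the side effect at most once per stable ID; returns true if this call did it. */
    boolean sendOnce(String stablePipelineId, String message) {
        // putIfAbsent is atomic: only the first caller for a given ID gets null back.
        if (applied.putIfAbsent(stablePipelineId, message) != null) {
            return false; // an earlier run already claimed this ID, so skip the effect
        }
        effectsPerformed.incrementAndGet(); // placeholder for the real external effect
        return true;
    }

    int effectsPerformed() {
        return effectsPerformed.get();
    }
}
```

Note that this sketch records the ID before performing the effect, so a crash between the two steps would leave the effect undone. A real implementation would need the record and the effect to be committed together, or the effect itself to be safe to repeat.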

from appengine-pipelines.

imlm commented on June 28, 2024

Oh that was my bad, I made sure I had read the Java wiki because that's the version I am using, but never considered the Python one. 👍
Thanks!

aozarov commented on June 28, 2024

Right, pipeline code is expected to be idempotent in both cases. I am not
certain about the Python implementation, but the Java implementation is
slightly more lenient about the "yield the same results and the same
side-effects" part. It is true that the same Job can run more than once,
but after a run the job state is changed conditionally and atomically
(CAS). Such a change (including adding a child job) is dropped if the job
state was already modified. This means that as long as your jobs do not
change external state/data you should be OK and will not experience any
data clobbering if the input changes between runs.

Arie.
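
The conditional, atomic state change described above can be sketched as follows. This is only an illustration under stated assumptions: the class `JobStateCas`, its method `saveWithStateCheck`, and the use of a `synchronized` method to stand in for a datastore transaction are all hypothetical. Only the state names are borrowed from the snippets quoted in this thread.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: an in-memory stand-in for a transactional job record.
class JobStateCas {
    enum State { WAITING_TO_RUN, RETRY, WAITING_TO_FINALIZE }

    private State state = State.WAITING_TO_RUN;
    private String childGraph; // written only by the run that wins the state check

    /**
     * Applies the update only if the job is still in one of the expected states.
     * The synchronized block models a transaction: check and write happen together.
     */
    synchronized boolean saveWithStateCheck(String newChildGraph, State next, Set<State> expected) {
        if (!expected.contains(state)) {
            return false; // another run already moved the job on, so drop this update
        }
        state = next;
        childGraph = newChildGraph;
        return true;
    }

    synchronized State state() {
        return state;
    }

    synchronized String childGraph() {
        return childGraph;
    }

    public static void main(String[] args) {
        JobStateCas job = new JobStateCas();
        Set<State> expected = EnumSet.of(State.WAITING_TO_RUN, State.RETRY);
        // Two runs of the same job both try to save their result.
        boolean first = job.saveWithStateCheck("run-A", State.WAITING_TO_FINALIZE, expected);
        boolean second = job.saveWithStateCheck("run-B", State.WAITING_TO_FINALIZE, expected);
        System.out.println(first + " " + second + " " + job.childGraph());
    }
}
```

In this sketch the second run's child graph is discarded, which is why duplicate runs do not clobber data held in the job record. Nothing here protects state outside the record, which matches the caveat about jobs that change external state.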

imlm commented on June 28, 2024

Hi @aozarov ,

Thank you very much for further expanding on this. I assume this is the code snippet where CAS happens.

jobRecord.setState(State.WAITING_TO_FINALIZE);
jobRecord.setChildGraphGuid(currentRunGUID);
updateSpec.getFinalTransaction().includeJob(jobRecord);
updateSpec.getFinalTransaction().includeBarrier(finalizeBarrier);
backEnd.saveWithJobStateCheck(
    updateSpec, jobRecord.getQueueSettings(), jobKey, State.WAITING_TO_RUN, State.RETRY);

What intrigues me is that there is also a CAS-like check before running the job, one that does not actually change the job state:

if (!backEnd.saveWithJobStateCheck(
    tempSpec, jobRecord.getQueueSettings(), jobKey, State.WAITING_TO_RUN, State.RETRY)) {
    logger.info("Ignoring runJob request for job " + jobRecord + " which is not in a"
        + " WAITING_TO_RUN or a RETRY state");
    return;
}

Could we not take advantage of this and use another State (e.g., RUNNING) to ensure a run-only-once effect? I may be naive here, since if it were this simple it would have been done this way already, but it does not hurt to ask.

aozarov commented on June 28, 2024

Let's say you changed it to RUNNING and then failed without a chance to change the state back.
If that happens, it would be hard to distinguish a failed run from a very long-running one (considering that taskqueue requests can take up to 10 minutes and backend requests up to 24 hours).
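
A small sketch of why a RUNNING state is ambiguous on its own. The names below (`RunningStateAmbiguity`, `JobRecord`, `diagnose`) are hypothetical and exist only for this illustration. The two limits are the ones mentioned in this comment. The assumption is that an observer can read back only the stored state and a start time.

```java
// Hypothetical sketch; these names are illustrative, not the library's API.
class RunningStateAmbiguity {
    enum State { WAITING_TO_RUN, RUNNING, DONE }

    /** The only facts an observer can read back from storage. */
    static final class JobRecord {
        final State state;
        final long startedAtMillis;

        JobRecord(State state, long startedAtMillis) {
            this.state = state;
            this.startedAtMillis = startedAtMillis;
        }
    }

    // Request limits quoted in the thread.
    static final long TASK_QUEUE_LIMIT_MILLIS = 10L * 60 * 1000;   // 10 minutes
    static final long BACKEND_LIMIT_MILLIS = 24L * 60 * 60 * 1000; // 24 hours

    /** What can be concluded about a job from its stored record and the clock alone. */
    static String diagnose(JobRecord record, long nowMillis, long maxRunMillis) {
        if (record.state != State.RUNNING) {
            return "not running";
        }
        long elapsed = nowMillis - record.startedAtMillis;
        // Until the longest allowed run time has passed, a crashed job and a
        // healthy long-running job look exactly the same.
        return elapsed > maxRunMillis ? "presumed dead" : "cannot tell";
    }
}
```

With the 24-hour limit, a job that crashed one minute after being marked RUNNING would stay in the "cannot tell" bucket for almost a full day before anything could safely retry it.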

imlm commented on June 28, 2024

Indeed, either way thanks again! I'm closing this as it has been answered already.
