
Comments (4)

marksantcroos commented on September 6, 2024

Hi Pradeep,

Can you elaborate? I don’t really understand what you mean.

Thanks

Gr,

Mark

On 09 Feb 2014, at 20:19, pradeepmantha wrote:

Having something like the following would be great. Currently we need to create a separate DU containing all the files required for each CU. This could be optimized to avoid unnecessary DU creation.

""" Parsing input data field of job description:
    {
    ...
    "input_data": [
        {
            input_data_unit.get_url(): ["file1", "file2"]
        }
    ]

    or

    "input_data": [
        input_data_unit.get_url()
    ]
    }
"""   

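A minimal sketch of how the two forms above could be parsed into one uniform representation. This is not BigJob's actual parser: `normalize_input_data` and the `InputEntry` type are hypothetical names, and DU URLs are treated as plain strings.

```python
from typing import Dict, List, Optional, Tuple, Union

# Hypothetical entry type: either a DU URL (meaning the whole DU) or a
# dict mapping a DU URL to the list of file names wanted from that DU.
InputEntry = Union[str, Dict[str, List[str]]]


def normalize_input_data(
    entries: List[InputEntry],
) -> List[Tuple[str, Optional[List[str]]]]:
    """Return (du_url, files) pairs; files is None when the whole DU is requested."""
    normalized: List[Tuple[str, Optional[List[str]]]] = []
    for entry in entries:
        if isinstance(entry, str):
            # Plain URL: keep the current behaviour, stage the whole DU.
            normalized.append((entry, None))
        elif isinstance(entry, dict):
            # Dict form: one pair per DU URL with an explicit file list.
            for du_url, files in entry.items():
                normalized.append((du_url, list(files)))
        else:
            raise TypeError("unsupported input_data entry: %r" % (entry,))
    return normalized
```

Both forms can then be handled by the same staging code, which only has to check whether the file list is `None`.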


from bigjob.

pradeepmantha commented on September 6, 2024

Consider the following example: I have a task that created 1000 files, and for each file I want to create a task that takes that file as input. With the current 'input_data' CUD attribute, I can only pass the entire contents of a DU. So I either need to create 1000 intermediate DUs, one for each file, and pass each DU as input to its task, or pass all 1000 files to every CU without creating intermediate DUs. Allowing the required input files to be specified as below would avoid creating intermediate DUs and fetch only the required files from the DU.

"input_data": [
    {
        input_data_unit.get_url(): ["file1", "file2"]
    }
]

Again, this would be optional flexibility that the user or application can choose to use.
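To make the 1000-file case concrete, here is a rough sketch of how the per-file descriptions could be generated if the proposed syntax were accepted. The helper `build_cu_descriptions`, the placeholder DU URL, and the `/bin/cat` executable are all illustrative assumptions. The description keys other than `input_data` are only there to make the dictionaries look plausible.

```python
def build_cu_descriptions(du_url, file_names, executable="/bin/cat"):
    """Build one compute-unit description per file, all reading from one shared DU."""
    return [
        {
            "executable": executable,      # illustrative placeholder
            "arguments": [name],
            "number_of_processes": 1,
            # Proposed form: request a single file out of the shared DU.
            "input_data": [{du_url: [name]}],
        }
        for name in file_names
    ]


# One DU holding 1000 files, fanned out to 1000 CU descriptions.
files = ["file%d" % i for i in range(1000)]
cuds = build_cu_descriptions("du://example-placeholder", files)
```

With the current model the same fan-out would need 1000 DUs, one per entry in `files`.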


marksantcroos commented on September 6, 2024

Hi Pradeep,

On 19 Feb 2014, at 19:26, pradeepmantha wrote:

Consider the following example: I have a task that created 1000 files, and for each file I want to create a task that takes that file as input.

Ok, clear.

With the current 'input_data' CUD attribute, I can only pass the entire contents of a DU.

Correct, DUs are atomic units for good reasons.

So I either need to create 1000 intermediate DUs, one for each file, and pass each DU as input to its task,

Agreed, what's the problem with that? Isn’t that exactly what you want in this situation?

or pass all 1000 files to every CU without creating intermediate DUs.

That's obviously not what you want.

Allowing the required input files to be specified as below would avoid creating intermediate DUs and fetch only the required files from the DU.

"input_data": [
    {
        input_data_unit.get_url(): ["file1", "file2"]
    }
]

What does “just get the required files” actually mean here? What are the exact semantics of that?

In general, I believe I see what you want to do, but as far as I can tell this can be expressed perfectly with the current model, without breaking the actual semantics.

Moreover, for this specific pattern, it makes sense to add a layer on top of PD, which is exactly what you did, right?

Gr,

Mark


pradeepmantha commented on September 6, 2024

Hi,

On Wed, Feb 19, 2014 at 11:36 AM, Mark Santcroos wrote:

Hi Pradeep,

On 19 Feb 2014, at 19:26, pradeepmantha wrote:

Consider the following example: I have a task that created 1000 files, and for each file I want to create a task that takes that file as input.

Ok, clear.

With the current 'input_data' CUD attribute, I can only pass the entire contents of a DU.

Correct, DUs are atomic units for good reasons.

So I either need to create 1000 intermediate DUs, one for each file, and pass each DU as input to its task,

Agreed, what's the problem with that? Isn't that exactly what you want in this situation?

  • It works, but it requires creating 1000 intermediate DUs. I just want to avoid that for performance reasons.

or pass all 1000 files to every CU without creating intermediate DUs.

That's obviously not what you want.

Allowing the required input files to be specified as below would avoid creating intermediate DUs and fetch only the required files from the DU.

"input_data": [
    {
        input_data_unit.get_url(): ["file1", "file2"]
    }
]

What does "just get the required files" actually mean here? What are the exact semantics of that?

  • The semantics are the same as those of the "output_data" CUD attribute as it currently behaves.

In general, I believe I see what you want to do, but as far as I can tell this can be expressed perfectly with the current model, without breaking the actual semantics.

  • Yes, it's just a matter of implementation. I actually implemented it in the Pradeep branch of BigJob.

Moreover, for this specific pattern, it makes sense to add a layer on top of PD, which is exactly what you did, right?

  • Yes.

Gr,

Mark
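One possible reading of "just get the required files", sketched under assumptions the thread does not confirm: the DU exposes a flat list of file names, and each requested entry is either an exact name or a shell-style pattern. The function `select_files` is a hypothetical helper, not part of BigJob.

```python
from fnmatch import fnmatchcase


def select_files(du_files, requested):
    """Return the DU's files that match any requested name or pattern.

    The result keeps the DU's own ordering and lists each file at most once.
    """
    return [
        name
        for name in du_files
        if any(fnmatchcase(name, pattern) for pattern in requested)
    ]
```

Under this reading the DU itself stays untouched and atomic. Only the set of files staged into the CU's working directory is narrowed.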
