Comments (7)

HaraldGustafsson commented on May 22, 2024

Yes, we have thought about dynamic/quality attributes, like small/medium/large memory, etc. I will write that up in a wiki page and get back with a reference. Dynamic attributes also raise the issue of when to check them, or whether you should get a notification when they change.

The storage has a local cache of the key-value pairs, which is kept in localstore and localstore_sets. When using the local storage type instead of the DHT, the cache keeps all the data.

A1) We can store 2 different types of value for a key: either a single value (using set), which could be a JSON string with multiple values, or a set that values can be added to and removed from (each value could be a JSON string, but typically it is a UUID string). The localstore saves the first type of key-value and the localstore_sets stores the second type. The localstore_sets contains a + (append) and a - (remove) set, since its main purpose is to be a cache for operations towards the external store, e.g. the DHT. So when an append operation has been finalized in the DHT, the appended value is removed from the + set in localstore_sets for the given key, and similarly for remove.
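
The split between the two caches can be pictured with a minimal sketch. The class and method names below (`LocalCache`, `append_confirmed`, `remove_confirmed`) are hypothetical and only mirror the description above; the actual code in storage.py may be organised differently.

```python
class LocalCache:
    """Toy model of the two caches described above (not Calvin's actual code)."""

    def __init__(self):
        # key -> single value, e.g. a JSON string
        self.localstore = {}
        # key -> {"+": values waiting to be appended, "-": values waiting to be removed}
        self.localstore_sets = {}

    def _entry(self, key):
        return self.localstore_sets.setdefault(key, {"+": set(), "-": set()})

    def set(self, key, value):
        """First type: one value per key, replaced as a whole."""
        self.localstore[key] = value

    def append(self, key, values):
        """Second type: remember values until the external store has them."""
        self._entry(key)["+"] |= set(values)

    def remove(self, key, values):
        self._entry(key)["-"] |= set(values)

    def append_confirmed(self, key, values):
        """Assumed callback: the external store (e.g. the DHT) finished the append."""
        self._entry(key)["+"] -= set(values)

    def remove_confirmed(self, key, values):
        """Assumed callback: the external store finished the remove."""
        self._entry(key)["-"] -= set(values)


cache = LocalCache()
cache.set("node-1", '{"name": "runtime-a"}')
cache.append("nodes", ["id-1", "id-2"])
cache.append_confirmed("nodes", ["id-1"])
print(cache.localstore_sets["nodes"]["+"])  # {'id-2'}
```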

A2) The append and remove functions expect a list or a set of values and apply a set function call to them, so when you supply a dictionary it will take the keys of that dictionary. This is so that multiple values can be supplied in one operation.
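
A quick illustration of why a dictionary loses its values here, assuming the functions simply call Python's built-in `set()` on the argument as described (`normalize_values` is a hypothetical stand-in, not a Calvin function):

```python
def normalize_values(values):
    # Stand-in for the conversion described above: set() over the argument.
    return set(values)


# A list (or set) keeps every distinct element.
print(sorted(normalize_values(["id-1", "id-2", "id-1"])))  # ['id-1', 'id-2']

# Iterating a dict yields its keys, so the values are silently dropped.
print(normalize_values({"id-1": "this value is ignored"}))  # {'id-1'}
```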

A3) The node-xx you refer to is of the first type, hence you need to alter the complete JSON-coded value. If you intend to update certain values separately or often, you should break them out into their own keys, for example node-cpu-xx, node-memory-xx, etc. If you do that, make an update_node_load or similar in storage.py and make sure that delete_node deletes that data. BUT if you intend to use this load information for actor placement, I would suggest another approach, which utilizes the prefix-searchable index that we have; see the upcoming wiki page.

from calvin-base.

brunodonassolo commented on May 22, 2024

Hi,

Thanks for the response. It's clearer for me now.

I'll wait for the wiki page and take a look.

A1) Ok, I believe I understand the difference between the 2 groups. I was expecting the same kind of value in both structures, hence my confusion.

A2) It makes sense. I was using the wrong data in the append method.

A3) I have done that for now (created a node-memory-xxx key). In fact, I want to use this information for placement. I saw the prefix-searchable index, but I didn't know how to use it when searching for a range of values (e.g. memory greater than 1000). For now, the idea was to get all possible nodes (like all_nodes.py) and filter them later, but I'll wait to see the suggestion in the wiki page.

Best,

HaraldGustafsson commented on May 22, 2024

I've added a wiki-page https://github.com/EricssonResearch/calvin-base/wiki/Storage-or-Registry .

Hope this helps. If you need more help designing the requirement matching, please ask. For example, if you are doing requirement matching by filtering, we should make sure that those requirements are of a different type so that they are applied last, after the other requirements, etc.

brunodonassolo commented on May 22, 2024

Hi Harald,

First of all, thanks a lot for the wiki page. It really helped me understand the storage and the index searches for ranges of values.

I'm writing this message to share my ideas for implementing initial support for dynamic parameters in the deployment.

  1. Monitoring:
    The idea is to add commands to the Calvin Control API to update the values.
    Example: POST /monitor/cpu/core/node-id { "value": 4 } to set 4 cores in the node.

I believe it is the most flexible way to update the values in Calvin, mainly because doing it inside Calvin becomes more difficult for more complex parameters like network bandwidth or latency.

I also tried to use the psutil library to monitor the resources inside csruntime. However, it depends on the host OS to return correct values. On Linux, it reads the values from /proc/, which leads to problems when running Calvin inside a container (e.g. Docker): it returns the resources of the host system, not those of the container.

  2. Storage and description
    Following the approach described in the "Future registry expansion" section, I intend to add ranges for the values being monitored (which parameters and which ranges are still to be defined).

For example, in the case of the number of CPU cores, we would have in the storage: /index/cpu/cores/1/2/4/8/16/32
The search could look like this: "index": ["cpu", {"cores": 4}], to get all nodes with at least 4 cores.
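
A minimal sketch of how such a range index can answer an "at least" query with a single prefix match. It assumes each node is registered under the path made of every level it reaches, and that queries use one of the predefined levels; `LEVELS`, `index_path` and `search` are hypothetical names, and the in-memory dict stands in for the real registry.

```python
LEVELS = [1, 2, 4, 8, 16, 32]  # assumed bucket boundaries, taken from the example path


def index_path(cores):
    """Path a node is stored under: every level it reaches, in increasing order."""
    reached = [str(level) for level in LEVELS if level <= cores]
    return "/".join(["", "cpu", "cores"] + reached)


def search(index, min_cores):
    """Return the nodes whose path starts with the path for min_cores."""
    if min_cores not in LEVELS:
        raise ValueError("min_cores must be one of the predefined levels")
    prefix = index_path(min_cores)
    return sorted(
        node
        for node, path in index.items()
        if path == prefix or path.startswith(prefix + "/")
    )


# Toy index: node id -> path (stand-in for the prefix-searchable storage).
index = {
    "node-a": index_path(2),   # /cpu/cores/1/2
    "node-b": index_path(4),   # /cpu/cores/1/2/4
    "node-c": index_path(16),  # /cpu/cores/1/2/4/8/16
}
print(search(index, 4))  # ['node-b', 'node-c']
```

A node with a core count between two levels (say 6) is stored under the lower level, which is why the query values are restricted to the levels themselves in this sketch.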

I first implemented an exact search, saving only the number of cores in the database: /cpuCores-ID/4.
I also created a new attribute, node_resource_min, that retrieves all nodes that have at least the specified value.
However, getting all the nodes required (1 + (number of runtimes)) requests to the storage: one to retrieve all nodes and one per node to get its value and compare it with the requested one.

With the proposed approach, I can retrieve all nodes with only 1 request. It seems to use more memory but performs better.
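
The request count of the exact-search variant can be reproduced with a small sketch. `CountingStore` is a hypothetical stand-in for the storage, and the key names ("nodes", "cpuCores-<id>") are assumptions made only for this illustration.

```python
class CountingStore:
    """Fake key-value store that counts how many get requests it serves."""

    def __init__(self, data):
        self.data = data
        self.requests = 0

    def get(self, key):
        self.requests += 1
        return self.data[key]


def nodes_with_min_cores(store, min_cores):
    """Exact-value layout: one request for the node list, then one per node."""
    node_ids = store.get("nodes")
    return [n for n in node_ids if store.get("cpuCores-" + n) >= min_cores]


store = CountingStore({
    "nodes": ["a", "b", "c"],
    "cpuCores-a": 2,
    "cpuCores-b": 4,
    "cpuCores-c": 16,
})
print(nodes_with_min_cores(store, 4))  # ['b', 'c']
print(store.requests)                  # 4, i.e. 1 + number of runtimes
```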

  3. Dynamic information and deployment changes
    Initially, I will not consider this issue. The idea is to consider the new parameters for the first deployment only.
    But I believe it is doable, either by changing the storage system or by adding something to the input functions of the monitoring.

HaraldGustafsson commented on May 22, 2024

You say dynamic parameters; are you intending to have a dynamic number of cores? Anyway, be aware that the runtime is single-threaded. You are supposed to have one runtime per core.

For parameters that are stable, like number of cores, max CPU rate, max memory and max network bandwidth, I would prefer that you use attributes. These could then be supplied when the runtime starts, and the call to csruntime could easily be wrapped with a tool that derives all the values and supplies them to the csruntime command.
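
As a rough sketch of the deriving half of such a wrapper, the snippet below collects one stable value and prints it as JSON. The attribute layout (`{"cpu": {"cores": ...}}`) and the function name are assumptions made for illustration, and how the result is handed to csruntime is deliberately left out.

```python
import json
import os


def static_node_attributes():
    """Collect values that are not expected to change while the runtime lives."""
    # os.cpu_count() may return None when the count cannot be determined.
    cores = os.cpu_count() or 1
    return {"cpu": {"cores": cores}}  # hypothetical attribute layout


if __name__ == "__main__":
    # A wrapper script could pass this JSON on when it starts the runtime.
    print(json.dumps(static_node_attributes()))
```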

For parameters that change during the lifetime of the runtime, it looks ok. We already have monitor in the runtime for actors, so to avoid confusion please use performance capability or resource, e.g.
POST /node/resource
which could take a larger dictionary with several values that you likely want to update simultaneously.
These would then get a storage index key like index-/node/resource/cpuload/X/X etc. by using new methods in attribute_resolver.py. You have node-id in the URL; is that so that you could tell any runtime to update the registry for other runtimes as well?

In general it looks interesting; I could take a look at it in more detail when you make it public.

brunodonassolo commented on May 22, 2024

> You say dynamic parameters, are you intending to have dynamic number of cores? Anyway be aware that the runtime is single threaded. You are supposed to have one runtime per core.

No, just a bad example =).
I agree that it is not a good parameter to consider in the deployment at all given the single-threaded environment.

> For parameters that are stable, like number of cores, max CPU rate, max memory, max network bandwidth. I would prefer if you used attributes, these could then be supplied when the runtime starts, and the call to csruntime could easily be wrapped with a tool to derive all values and then supplied to csruntime command.

Ok, I will consider that. I'm just not sure about runtimes running in a container environment. Maybe some of the static values are not so static.

> You have node-id in the URL is that since you could tell any runtime to update the registry also for others?

Yes, I left it in. I don't think it is really useful, but I don't see a reason to remove it for now.

> In general looks interesting, could take a look at it in more detail when you make it public.

Great. I'm developing in my fork of Calvin (https://github.com/brunodonassolo/calvin-base/tree/resourceMonitor). As soon as I have at least a simple parameter working, I'll submit a patch.

I'm struggling a little with the DHT in my local tests. I frequently get error messages when updating the values, like "1400-calvin.calvin.runtime.north.storage: Failed to update index-/cpu/avail" and "There are no known neighbors to set".

I'm still investigating, but do these messages mean anything to you?

Best,

brunodonassolo commented on May 22, 2024

I believe we can close this issue for now.

I'll open new ones if necessary to discuss other points about the network model in the deployment.

Thanks a lot for the help @HaraldGustafsson.
