EasyVVUQ-QCGPJ provides an API to configure EasyVVUQ to use QCG-PilotJob for parallel execution of the demanding parts of an EasyVVUQ workflow. It enables efficient execution of the critical parts of EasyVVUQ workflows on HPC machines.

License: GNU Lesser General Public License v3.0

Python 96.69% Shell 2.77% C++ 0.55%

easyvvuq-qcgpj's Introduction

EasyVVUQ-QCGPJ - Python API for HPC execution of EasyVVUQ (EQI)

EasyVVUQ-QCGPJ (EQI) is a lightweight plugin for parallelization of EasyVVUQ with QCG-PilotJob.

It is a part of the VECMA Toolkit.

The tool provides an API that can be integrated into typical EasyVVUQ workflows to enable parallel processing of demanding operations, in particular the executions and encodings of the simulation model. It works whether you run your use case on a multi-core laptop or on a large HPC machine.

Requirements

The software requires Python 3.6 or newer.

Moreover, since EasyVVUQ-QCGPJ is a wrapper over EasyVVUQ and QCG-PilotJob, both of these packages must be available in your environment. This version of the library is compatible with EasyVVUQ v0.8 and QCG-PilotJob v0.11.1. Compatibility with other versions is not confirmed and may be limited. To be sure that the correct versions of the required packages are available, install them as follows:

$ pip3 install --force-reinstall easyvvuq==0.8
$ pip3 install --force-reinstall qcg-pilotjob==0.11.1
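To confirm that the environment actually holds the pinned versions, a short check along the following lines may help. This is only a sketch: it relies on `importlib.metadata`, which is part of the standard library from Python 3.8 onward (on 3.6 or 3.7 the `importlib_metadata` backport would be needed instead), and the helper name `check_pins` is hypothetical, not part of EasyVVUQ-QCGPJ.

```python
from importlib import metadata

# Versions named in the Requirements section above.
PINNED = {"easyvvuq": "0.8", "qcg-pilotjob": "0.11.1"}


def check_pins(pins, lookup=metadata.version):
    """Return {package: (expected, found)} for every package that does not match.

    `found` is None when the package is not installed at all.
    `lookup` is injectable so the check can be exercised without real packages.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Prints one line per package whose installed version differs from the pin.
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means both pins are satisfied. The comparison is a plain string match, so a version reported in a different form than the pin would be flagged as a mismatch.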

Installation

The software can be installed from the PyPI repository:

$ pip3 install easyvvuq-qcgpj

Alternatively, if you want to use a specific branch of the software, you can get it from the GitHub repository. The procedure is the usual one, e.g.:

$ git clone https://github.com/vecma-project/EasyVVUQ-QCGPJ.git
$ cd EasyVVUQ-QCGPJ
$ git checkout some_branch
$ pip3 install .

Getting started

Documentation is available at https://easyvvuq-qcgpj.readthedocs.io

Authors

easyvvuq-qcgpj's People

Contributors

bartoszbosak, jlakhlili

easyvvuq-qcgpj's Issues

"basic" test is not working anymore

Voilà:

Traceback (most recent call last):
  File "./basic.py", line 136, in <module>
    stats = test_cooling_pj()
  File "./basic.py", line 69, in test_cooling_pj
    my_campaign.add_app(name="cooling",
TypeError: add_app() got an unexpected keyword argument 'encoder'

Best
Paul
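A `TypeError` of this kind usually means that the installed `add_app()` has a different signature from the one the test script expects. One way to check this before calling, sketched below, is to inspect the signature. The functions `api_v1` and `api_v2` are stand-ins invented for illustration; they are not EasyVVUQ's actual definitions, and the real parameter list should be read from the installed package.

```python
import inspect


def accepts_keyword(func, keyword):
    """Return True if `func` can be called with `keyword=...` as a named argument."""
    named_kinds = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                   inspect.Parameter.KEYWORD_ONLY)
    for param in inspect.signature(func).parameters.values():
        # A **kwargs catch-all accepts any keyword.
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True
        if param.name == keyword and param.kind in named_kinds:
            return True
    return False


# Hypothetical stand-ins for two versions of an API, for illustration only.
def api_v1(name=None, encoder=None):
    return name


def api_v2(name=None, pipeline=None):
    return name


print(accepts_keyword(api_v1, "encoder"))
print(accepts_keyword(api_v2, "encoder"))
```

Against a real installation the same helper can be pointed at the bound method, for example `accepts_keyword(my_campaign.add_app, "encoder")`, where `my_campaign` is the campaign object from the failing script.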

Exception queue.Empty

Dear Vecma team:)

An exception was raised today that I haven't seen before.
Setup: 20 nodes, 48 tasks_per_node, 6400 pilot-jobs
Machine: SM NG, general partition

What went wrong?

Exception

Traceback (most recent call last):
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/service.py", line 531, in run
    self.service = QCGPMService(self.args)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/service.py", line 246, in __init__
    self._setup_direct_manager(self._args.parent)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/service.py", line 276, in _setup_direct_manager
    self._manager = DirectManager(self._conf, parent_manager)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/manager.py", line 432, in __init__
    self._executor = Executor(self, conf, self.resources)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/executor.py", line 66, in __init__
    raise exc
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/executor.py", line 61, in __init__
    self._resources.binding)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/executionjob.py", line 342, in start_agents
    asyncio.get_event_loop().run_until_complete(asyncio.ensure_future(cls.launcher.start(agents)))
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/launcher/launcher.py", line 107, in start
    await self.__fire_agents(instances)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/launcher/launcher.py", line 313, in __fire_agents
    raise Exception('timeout while waiting for agents')
Exception: timeout while waiting for agents
Traceback (most recent call last):
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/api/manager.py", line 682, in __init__
    self.qcgpm_conf = self.qcgpm_queue.get(block=True, timeout=600)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/multiprocessing/queues.py", line 105, in get
    raise Empty
queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./tests/test_bench.py", line 174, in <module>
    stats = test_cooling_pj(r, o)
  File "./tests/test_bench.py", line 109, in test_cooling_pj
    qcgpjexec.create_manager(log_level='debug')
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/eqi/core/executor.py", line 103, in create_manager
    self._qcgpjm = LocalManager(args, client_conf)
  File "/dss/dsshome1/03/di67piq/Vecma/install/lib/python3.6/site-packages/qcg/pilotjob/api/manager.py", line 684, in __init__
    raise errors.ServiceError('Service not started - timeout')
qcg.pilotjob.api.errors.ServiceError: Service not started - timeout
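The last two tracebacks show a chained failure: a blocking `get()` on a queue times out with `queue.Empty`, and the handler raises a service error in its place, which is why Python prints "During handling of the above exception, another exception occurred". The first traceback suggests the underlying cause: the service itself stopped with "timeout while waiting for agents", so nothing was ever put on the queue. The sketch below reproduces only the chaining pattern, using the standard library. `ServiceError` and `wait_for_service` are hypothetical stand-ins, not the QCG-PilotJob classes, and the short timeout is chosen only so that the example finishes quickly.

```python
import queue


class ServiceError(Exception):
    """Hypothetical stand-in for a library-specific startup error."""


def wait_for_service(conf_queue, timeout):
    """Block until the service publishes its configuration, or fail."""
    try:
        return conf_queue.get(block=True, timeout=timeout)
    except queue.Empty:
        # Raising inside the handler keeps queue.Empty as __context__,
        # which yields the "During handling of the above exception" message.
        raise ServiceError("Service not started - timeout")


# Case 1: the service reported in time, so its configuration is returned.
started = queue.Queue()
started.put({"address": "example"})
print(wait_for_service(started, timeout=0.1))

# Case 2: nothing arrives, so the timeout surfaces as ServiceError.
try:
    wait_for_service(queue.Queue(), timeout=0.1)
except ServiceError as err:
    print(type(err.__context__).__name__, "->", err)
```

Because the second error only reports the timeout, the earlier traceback from the service process is the place to look when diagnosing a failure like this one.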
