knox-epouta's Introduction

Successful cross-border use of secure cloud

When a cluster runs at full capacity, all newly scheduled jobs have to wait. If this happens often, the infrastructure must be scaled up to handle more computations and more data transfers. To this end, we can of course buy more hardware, i.e., more compute nodes, more disks and more network switches. However, this can be an expensive solution. The Tryggve project therefore focused on an alternative approach, where we ask other clusters whether they have available resources that we could "borrow for a while".

An immediate question with such a solution is whether a connection across borders is even feasible, or whether latency would be prohibitive. We can imagine a scenario where computations happen in one country while the data is located in another. It’s worth mentioning that this work focused on technical aspects and did not address legal matters related to the transfer of sensitive data between countries. That topic is left for further work within the Tryggve project.

To test the connection between countries, we built a temporary cloud cluster in Sweden, called Knox, and connected it to the resources of ePouta, a secure cloud cluster in Finland. The desired outcome was that jobs would not know whether they were scheduled in Finland or in Sweden.

Between Knox and ePouta, we installed a fiber link carrying a dedicated network with a capacity of 1GB/s. Note that the link is shared with other machines, but our network is not, i.e., only the virtual machines that we booted on Knox and ePouta are connected, transparently, to that same network.

We were interested in running realistic workflows. As a first step, we chose to run the Cancer Analysis Workflow (CAW) from SciLifeLab and the Whole Genome Sequencing Structural Variation Pipeline (WGS) from NBIS. The results were surprising: we did not detect any significant slowdown when using resources from either cluster.

We suspected that disk access, or even the network itself, would be a bottleneck in this setup. So, as a second step, we stress-tested both aspects and found even more surprising results: disk access was not slower at all, and the link could be used at almost 100% of its capacity!
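To make the notion of link usage concrete, here is a minimal sketch of how utilization could be computed from a measured transfer. It is not the method used in the stress tests, which this text does not describe. The function names are hypothetical, and the constant assumes that "1GB/s" means 10^9 bytes per second.

```python
# Hypothetical helpers for illustration; not taken from the project's test code.
# Assumption: the "1GB/s" capacity quoted above means 10**9 bytes per second.
LINK_CAPACITY_BYTES_PER_S = 1_000_000_000


def link_utilization(bytes_transferred: int, elapsed_seconds: float,
                     capacity: float = LINK_CAPACITY_BYTES_PER_S) -> float:
    """Return the fraction of the link capacity used by a measured transfer."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    throughput = bytes_transferred / elapsed_seconds  # bytes per second
    return throughput / capacity


def transfer_time(size_bytes: int, utilization: float = 1.0,
                  capacity: float = LINK_CAPACITY_BYTES_PER_S) -> float:
    """Estimate the seconds needed to move size_bytes at a given utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return size_bytes / (capacity * utilization)
```

For instance, `link_utilization(bytes_sent, seconds)` turns a raw measurement into a fraction of capacity, and `transfer_time(dataset_size)` gives a lower bound on how long a dataset would take to cross the link. Any numbers fed into these functions are illustrative inputs, not measurements from Knox or ePouta.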

In other words, computations and tests do not notice whether they are performed in Finland or in Sweden. Computations actually ran even faster in Finland, since the hardware in ePouta is better than that in Knox. It is probably possible to fine-tune the settings to gain further performance and a seamless connection. This is a very positive outcome, as we can now carry on with workflows dealing with sensitive data. You can refer to the NBIS GitHub repository for further information, or see an informal presentation.

knox-epouta's People

Contributors

silverdaz, jhagberg, pontus

