cloud-mirror's Introduction


Taskcluster

Inspecting a task on Taskcluster UI

The task execution framework that supports Mozilla's continuous integration and release processes.


Usage

This repository is used to develop, build, and release the Taskcluster services.

Team Mentions

Do you need to reach a specific subset of the team? Use the team handles to mention us with GitHub's @mention feature.

Team Name | Use To...
@taskcluster/Core | ping members of the Taskcluster team at Mozilla
@taskcluster/services-reviewers | ping reviewers for changes to platform services and libraries
@taskcluster/frontend-reviewers | ping people who can review changes to frontend (and related) code in the services monorepo
@taskcluster/security-folks | ping people who do security things

Contributors

Thanks goes to these wonderful people (emoji key):

James Lal 💻 👋
Selena Deckelmann 💻 👋
Dustin J. Mitchell 💻 👋
Wander Lairson Costa 💻 👋
Greg Arndt 💻 👋
Pete Moore 💻 🔧
Hassan Ali 💻 👋
John Whitlock 💻 👋
Brian Stack 💻 👋
John Ford 💻 👋
Eli Perelman 💻 👋
Jonas Finnemann Jensen 💻 👋
owlishDeveloper 💻 👋
Miles Crabill 💻 👋
Chris Cooper 💻 👋
Mathieu Leplatre 💻 👋
Rob Thijssen 💻
Anup 💻
Hammad Akhtar 💻
Chinmay Kousik 💻
Anthony Miyaguchi 💻
Ana Rute Mendes 💻
Andrea Del Rio 💻
kristelteng 💻
Elena Solomon 💻
Xavier L. 💻
Yann Landry 💻
Ayub 💻
lteigrob 💻
Bastien Abadie 💻
Amjad Mashaal 💻
Tom Prince 💻
Samantha Yu 💻
Auni Ahsan 💻
alex 💻
Alisha Aneja 💻
Prachi Manchanda 💻
Simon Fraser 💻
Yashvardhan Didwania 💻
Cynthia Pereira 💻
Hashini Galappaththi 💻
Fienny Angelina 💻
Kanika Saini 💻
Biboswan Roy 💻
sudipt dabral 💻
Ojaswin 💻
Матрешка 💻
Alok Kumar 💻
Arshad Kazmi 💻
Jason Yang 💻
Shubham Gupta 💻
Arun Kumar Mohan 💻
Brian Pitts 💻
E. Dunham 💻
Shubham Chinda 💻
Patrick Kang 💻
Rishabh Budhiraja 💻
ededals 💻
Ajin Kabeer 💻
Catherine Chepkurui 💻
Jo 💻
vishakha 💻 📖
Noor Fatima 💻
Michael 💻
Mariana Zangrossi 💻
ANURADHAJHA99 💻
Edil 💻
Olympia 💻 📖
Michael Ozoemena 💻
lailahgrant 💻
km-js 💻
Carolina Machado 💻
reenesa 💻
Kelli Blalock 💻
naima shaikh 💻
Jiwoon Kim 💻
Michael Umanah 💻
Fahd Jamal A. 📖
shilpi verma 💻
somchi 💻
Anastasia 💻
Lubna 💻
Soundharya AM 💻
Mustafa Jebara 💻
Aryaman Puri 💻
Simon Sapin 💻
thoran 💻
Manish Giri 💻
Tiger Oakes 💻
Ricky Taylor 💻
Alex Lopez 💻
Michelle 🐛 🚇
Mrs. Velena 💻
Ahmed A. 💻
Matt Boris 💻 🔧
Yaraslau Kurmyza 💻 🔧
Bastien Orivel 💻
HamdiAmine 💻

This project follows the all-contributors specification. Contributions of any kind are welcome!

cloud-mirror's People

Contributors

ccooper, djmitche, imbstack, jhford, jonasfj, petemoore

cloud-mirror's Issues

CODE_OF_CONDUCT.md isn't correct

Your required text does not appear to be correct

As of January 1 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:

  1. Required Text - All text under the headings Community Participation Guidelines and How to Report is required and should not be altered.
  2. Optional Text - The Project Specific Etiquette heading provides a space to speak more specifically about ways people can work effectively and inclusively together. Some examples of those can be found on the Firefox Debugger project, and Common Voice. (The optional part is commented out in the raw template file, and will not be visible until you modify and uncomment that part.)

If you have any questions about this file, or Code of Conduct policies and procedures, please reach out to [email protected].

(Message COC003)

Use redis pub/sub instead of polling to monitor pending transfers

Currently in the /redirect/ endpoint, if we encounter a file in the pending state, we poll every second until we see that pending status change to either 'error' or 'present'. We should use redis pub/sub to learn about the change instead. We should probably keep a failsafe of polling every 10 seconds to make sure that pub/sub being flaky doesn't cause too much trouble. If the failsafe poll is what detects the change, we should log something to make it clear that we have experienced this condition.
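
A minimal sketch of the pub/sub plus failsafe idea, not the project's actual code. It assumes an in-process EventEmitter as a stand-in for a redis subscriber connection and a synchronous getStatus lookup; waitForTransfer, notifier, getStatus and onDone are hypothetical names, and a real implementation would use asynchronous redis calls.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: `notifier` stands in for a redis subscriber and
// `getStatus` for a lookup of the file's current status.
function waitForTransfer({ notifier, getStatus, key, onDone,
                           failsafeMs = 10000, log = console.warn }) {
  let finished = false;

  const finish = (status, viaFailsafe) => {
    if (finished || status === 'pending') return;
    finished = true;
    clearInterval(timer);
    notifier.removeListener(key, onMessage);
    if (viaFailsafe) {
      // The status changed but no pub/sub message told us about it.
      log(`status of ${key} became ${status} without a pub/sub message`);
    }
    onDone(status);
  };

  const onMessage = (status) => finish(status, false);
  notifier.on(key, onMessage);

  // Failsafe: poll occasionally in case a pub/sub message was missed.
  const timer = setInterval(() => finish(getStatus(key), true), failsafeMs);

  // Check once after subscribing, in case the transfer finished just
  // before we started listening.
  finish(getStatus(key), false);
}

// Usage with an in-process emitter standing in for redis:
const demoBus = new EventEmitter();
waitForTransfer({
  notifier: demoBus,
  getStatus: () => 'pending',
  key: 'artifact-1',
  onDone: (status) => console.log('transfer finished:', status),
});
demoBus.emit('artifact-1', 'present'); // prints "transfer finished: present"
```

The timer and the listener are both released as soon as a final status is seen, so a request never keeps polling after it has been answered.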

Ensure support for pull-through caches

A pull-through cache is just a simple HTTP cache... When working in a data center etc., we won't have cheap object storage available, so we have to do pull-through caches.

cloud-mirror can support this by taking the rawUrl and choosing a cache server using something like: cacheServers[fnv(rawUrl) % cacheServers.length]...
So that if we have multiple cache servers, urls get distributed based on hash of the url...
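
The selection rule above could be sketched as follows. This assumes the 32-bit FNV-1a variant (the issue only says "fnv") hashed over the UTF-8 bytes of the URL; fnv1a32, pickCacheServer and the example host names are hypothetical.

```javascript
// 32-bit FNV-1a hash over the UTF-8 bytes of a string.
function fnv1a32(str) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (const byte of Buffer.from(str, 'utf8')) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return hash >>> 0;
}

// Pick a cache server deterministically from the hash of the URL.
function pickCacheServer(rawUrl, cacheServers) {
  if (cacheServers.length === 0) {
    throw new Error('no cache servers configured');
  }
  return cacheServers[fnv1a32(rawUrl) % cacheServers.length];
}

// Example host names, for illustration only.
const exampleServers = ['cache-1.example.com', 'cache-2.example.com'];
const chosen = pickCacheServer('https://example.com/some/artifact', exampleServers);
```

The same URL always maps to the same server as long as the list is unchanged. Adding or removing a server remaps most URLs, which is acceptable for a cache but worth keeping in mind.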

The put operation for such a pull-through cache is pretty naive, as there is no copying: just choose a cache server and rewrite the url. This could be a special implementation of CacheHandler as proposed in #9. Or we could do the redis and sqs messaging even if it is unnecessary and do a trivial implementation of StorageProvider.

Either way, when region detection based on IP ranges is moved to cloud-mirror (issue #6) it would be very nice to support backends that are essentially pull-through caches. Otherwise, support for data centers and providers like DigitalOcean or Packet will be hard.

Move validateInputURL

It's okay to make functions, not everything has to be a method...
And duplicating configuration like allowedPatterns in an effort to make it a method is bad.

Pretty sure I gave this feedback before: validateInputURL should be moved.
It probably belongs inside the API handler, as we should reject illegal URLs immediately.
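
As a rough illustration only, a free function that takes the patterns as an argument might look like this. The names validateInputUrl and handleRedirect, the response shape, and the assumption that allowedPatterns is an array of RegExp objects are all hypothetical.

```javascript
// Plain function: the caller passes the patterns in, so no class needs
// its own copy of the allowedPatterns configuration.
function validateInputUrl(rawUrl, allowedPatterns) {
  try {
    new URL(rawUrl); // throws on anything that is not an absolute URL
  } catch (err) {
    return false;
  }
  return allowedPatterns.some((pattern) => pattern.test(rawUrl));
}

// Hypothetical API handler: reject illegal URLs before doing any work.
function handleRedirect(rawUrl, allowedPatterns) {
  if (!validateInputUrl(rawUrl, allowedPatterns)) {
    return { status: 400, message: 'URL is not allowed' };
  }
  return { status: 302 }; // placeholder for the real redirect logic
}
```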

Write GCE storage backend

We're interested in having a storage backend for the GCE cloud environment. A great starting place for this is to look at src/s3-backend.js as an example of what needs to be done for a storage backend.

Reduce number of address concepts

Currently you have:

  • rawUrl, that's the user input,
  • storageAddress(rawUrl), function that takes a rawUrl and returns a storageAddress object, I assume?
  • storageAddressToUrl, function that takes a storageAddress object and returns a url (concept of this url has no name, and is unnecessary afaik).

I suspect you might need 2 concepts, but 3 really?

I propose:

  • rawUrl, obviously this is the url given by the user in the request (url to the source object)
  • cachedObject, a dictionary consisting of {ttl, rawUrl, cacheUrl, ...} plus any number of backend-specific properties (as long as it has ttl, rawUrl and cacheUrl).

Clearly, what you are doing now is:
cacheUrl = storageAddressToUrl(storageAddress(rawUrl))

There is no reason that cacheUrl can't be a property on the object returned from storageAddress(rawUrl).

I guess you can still call it storageAddress rather than cachedObject if you like... Just get rid of storageAddressToUrl as it introduces a new concept that we don't have a good name for... And worse, a concept we don't need.

Assuming I understood half the code of course :)
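
To make the proposal concrete, here is a sketch of storageAddress(rawUrl) returning a single object that already carries cacheUrl. The S3-style backend properties, the SHA-256 key derivation and the options argument are assumptions made for illustration, not how cloud-mirror actually names objects.

```javascript
const crypto = require('crypto');

// Hypothetical S3-flavoured version: one object holds everything,
// so no separate storageAddressToUrl step is needed.
function storageAddress(rawUrl, { bucket, region, ttl }) {
  // Assumption: derive a URL-safe object key from a hash of the source URL.
  const key = crypto.createHash('sha256').update(rawUrl).digest('hex');
  return {
    // Properties every backend must provide:
    ttl,
    rawUrl,
    cacheUrl: `https://${bucket}.s3.${region}.amazonaws.com/${key}`,
    // Backend-specific properties:
    bucket,
    region,
    key,
  };
}

const cachedObject = storageAddress('https://example.com/artifact.zip', {
  bucket: 'example-mirror', // placeholder bucket name
  region: 'us-east-1',
  ttl: 3600,
});
```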

Remove encoded url from end-point

Some clients can't handle urls encoded in other urls..
And we have to support a lot of clients that are less than perfect...

We should refactor cloud-mirror so that, rather than passing it a complete URL, a request to
https://cloud-mirror.taskcluster.net/v1/redirect/s3/us-east-1/<key>
mirrors <key> in a bucket hardcoded in configuration.

This will also remove the need for regular expressions. I think it's fine to make cloud-mirror less generic; we only want to use it to mirror S3 buckets anyway.
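
One way the proposed route could be parsed, without regular expressions, is sketched below. The redirectConfig object, its sourceBucket value and the parseRedirectPath name are hypothetical.

```javascript
// Hypothetical configuration: the source bucket is fixed per deployment.
const redirectConfig = { sourceBucket: 'example-artifacts' };

const REDIRECT_PREFIX = '/v1/redirect/';

// Split "/v1/redirect/<service>/<region>/<key>" into its parts.
// Returns null when the path does not have that shape.
function parseRedirectPath(path) {
  if (!path.startsWith(REDIRECT_PREFIX)) return null;
  const [service, region, ...keyParts] =
    path.slice(REDIRECT_PREFIX.length).split('/');
  const key = keyParts.join('/'); // keys may themselves contain slashes
  if (!service || !region || !key) return null;
  return { service, region, bucket: redirectConfig.sourceBucket, key };
}
```

Because the key is simply the rest of the path, clients never have to embed an encoded URL inside another URL.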

Set protected status on production branch

Hello! This is your neighborhood secops team looking out for you!

The production branch on this repository is not protected against force pushes. This setting is recommended as part of Mozilla's Guidelines for a Sensitive Repository.

Anyone with admin permissions for this repository can correct the setting using this URL.

If you have any questions, or believe this issue was opened in error, please contact us and mention SOGH001-0 and this repository.

Thank you for your prompt attention to this issue.
--Firefox Operations Security team

Write Azure storage backend

We're interested in having a storage backend for the Azure cloud environment. A great starting place for this is to look at src/s3-backend.js as an example of what needs to be done for a storage backend.

Split abstract implementation from concrete `StorageBackend`

The StorageBackend class is huge... And it's very hard to see what is abstract and what isn't.
It's even hard to comprehend all the methods.

I think this can be refactored into two classes:

  • StorageProvider, a pure abstract class with no methods (or just very short default implementations)
  • CacheHandler wrapper around a StorageProvider object that deals with meta-data caching in redis, polling redis, sqs messaging, etc...

Currently StorageBackend says methods starting with _ must be implemented (not a bad idea). But I would rather that we use _ to signify private methods. Especially, since StorageBackend has so many of them... It's really hard to see what gets called from API, etc...
So when making CacheHandler make sure all private methods start with _, it makes it a lot easier to see what is an auxiliary method, and what is part of the public interface.
(I don't always do this, but when objects have lots of methods it really helps).

For StorageProvider I propose pure abstract methods, hence, the abstract ones wouldn't start with _.. Unless, for some reason you need to wrap them, but I don't see why that would be necessary.
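
A rough sketch of the proposed split, under several simplifying assumptions: methods are synchronous for brevity (real ones would be asynchronous), a Map stands in for the redis meta-data cache, sqs messaging is left out, and the method names put and redirectUrl as well as the EchoProvider class are hypothetical.

```javascript
// Abstract interface: no _ prefix, every method must be overridden.
class StorageProvider {
  // Copy rawUrl into backend storage and return the URL of the copy.
  put(rawUrl) {
    throw new Error('StorageProvider.put is not implemented');
  }
}

// Wrapper that owns the meta-data caching; private helpers start with _.
class CacheHandler {
  constructor(provider, metadataStore = new Map()) {
    this.provider = provider;
    this.metadataStore = metadataStore; // stand-in for redis
  }

  // Public interface, called from the API.
  redirectUrl(rawUrl) {
    const cached = this._lookup(rawUrl);
    if (cached) return cached.cacheUrl;
    const cacheUrl = this.provider.put(rawUrl);
    this._remember(rawUrl, cacheUrl);
    return cacheUrl;
  }

  _lookup(rawUrl) {
    return this.metadataStore.get(rawUrl);
  }

  _remember(rawUrl, cacheUrl) {
    this.metadataStore.set(rawUrl, { rawUrl, cacheUrl });
  }
}

// Trivial provider, for illustration only.
class EchoProvider extends StorageProvider {
  constructor() {
    super();
    this.puts = 0;
  }

  put(rawUrl) {
    this.puts += 1;
    return `https://cache.example.com/${encodeURIComponent(rawUrl)}`;
  }
}
```

With this shape, a new backend only has to subclass StorageProvider, and everything the API calls is visible as the non-underscore methods of CacheHandler.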

Write a localhost storage backend and unit tests

Right now we don't have any sort of no-op storage backend for unit tests. We could mock out the AWS API, but that's not great because we really need a real HTTP-backed service.

It would be great if we had a basic 'cloud' server which has PUT, GET and DELETE endpoints. These are the routes I would see:

GET http://localhost/artifact/:key
PUT http://localhost/artifact/:key
DELETE http://localhost/artifact/:key

This server should be written in node and able to run concurrently in the same process as the unit tests.

Then a backend implementation which knows how to insert and retrieve files from this basic service would be great because we could write integration tests which require no credentials. We could also theoretically use this service to provide cloud-mirror support in on-premises data centers.
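
A minimal sketch of such a server using only Node's built-in http module. It keeps artifacts in an in-memory Map; the status codes chosen and the names handleArtifact and createArtifactServer are assumptions, not an agreed design.

```javascript
const http = require('http');

// Request logic over an in-memory Map, kept separate from the socket
// handling so it can be exercised directly in unit tests.
function handleArtifact(store, method, url, body) {
  const prefix = '/artifact/';
  if (!url.startsWith(prefix) || url.length === prefix.length) {
    return { status: 404, body: '' };
  }
  const key = url.slice(prefix.length);
  switch (method) {
    case 'PUT':
      store.set(key, body);
      return { status: 200, body: '' };
    case 'GET':
      return store.has(key)
        ? { status: 200, body: store.get(key) }
        : { status: 404, body: '' };
    case 'DELETE':
      return store.delete(key)
        ? { status: 204, body: '' }
        : { status: 404, body: '' };
    default:
      return { status: 405, body: '' };
  }
}

// Wrap the logic in an HTTP server that can run in the test process.
function createArtifactServer(store = new Map()) {
  return http.createServer((req, res) => {
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const result = handleArtifact(
        store, req.method, req.url, Buffer.concat(chunks));
      res.writeHead(result.status);
      res.end(result.body);
    });
  });
}
```

A test suite could call createArtifactServer().listen(0) in its setup to get a free port and close the server in teardown, so no credentials or external services are needed.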
