SwiftSyft

SwiftSyft makes it easy for you to train PySyft models and run inference with them on iOS devices. This allows you to utilize training data located directly on the device itself, bypassing the need to send a user's data to a central server. This is known as federated learning.

  • ⚙️ Training and inference of any PySyft model written in PyTorch or TensorFlow
  • 👤 Allows all data to stay on the user's device
  • 🔙 Support for delegation to background task scheduler
  • 🔑 Support for JWT authentication to protect models from Sybil attacks
  • 👍 Host of inbuilt best practices to prevent apps from overusing device resources.
    • 🔌 Charge detection to allow background training only when the device is connected to a charger
    • 💤 Sleep and wake detection so that the app does not occupy resources when the user starts using the device
    • 💸 Wi-Fi and metered network detection to ensure the model updates do not use all the available data quota
    • 🔕 All of these smart defaults are easily overridable
  • 🎓 Support for both reactive and callback patterns so you have your freedom of choice (in progress)
  • 🔒 Support for secure multi-party computation and secure aggregation protocols using peer-to-peer WebRTC connections (in progress).

There are a variety of additional privacy-preserving protections that may be applied, including differential privacy, multi-party computation, and secure aggregation.

OpenMined set out to build the world's first open-source ecosystem for federated learning on web and mobile. SwiftSyft is a part of this ecosystem, responsible for bringing secure federated learning to iOS devices. You may also train models on Android devices using KotlinSyft or in web browsers using syft.js.

If you want to know how scalable federated systems are built, Towards Federated Learning at Scale is a fantastic introduction!

Installation

CocoaPods

CocoaPods is a dependency manager for Cocoa projects. Add OpenMinedSwiftSyft to your Podfile as shown below:

pod 'OpenMinedSwiftSyft', '~> 0.1.3-beta1'

Quick Start

As a developer, there are a few steps to building your own secure federated learning system upon the OpenMined infrastructure:

  1. 🤖 Generate your secure ML model using PySyft. By design, PySyft is built upon PyTorch and TensorFlow so you don't need to learn a new ML framework. You will also need to write a training plan (training code the worker runs) and an averaging plan (code that PyGrid runs to average the model diff).
  2. 🌎 Host your model and plans on PyGrid, which will deal with all the federated learning components of your pipeline. You will need to set up a PyGrid server somewhere; please see their installation instructions on how to do this.
  3. 🎉 Start training on the device!

📓 The entire workflow and process is described in greater detail in our project roadmap.

You can use SwiftSyft as a front-end or as a background service. The following is a quick start example usage:

// This is a demonstration of how to use SwiftSyft with PyGrid to train a plan on local data on an iOS device

// Authentication token
let authToken: String? = nil // Replace with your auth token if your PyGrid server requires authentication

// Create a client with a PyGrid server URL
if let syftClient = SyftClient(url: URL(string: "ws://127.0.0.1:5000")!, authToken: authToken) {

  // Store the client as a property so it doesn't get deallocated during training.
  self.syftClient = syftClient

  // Create a new federated learning job with the model name and version
  self.syftJob = syftClient.newJob(modelName: "mnist", version: "1.0.0")

  // This function is called when SwiftSyft has downloaded the plans and model parameters from PyGrid
  // You are ready to train your model on your data
  // modelParams - Contains the tensor parameters of your model. Update these tensors during training
  // and generate the diff at the end of your training run.
  // plans - contains all the torchscript plans to be executed on your data.
  // clientConfig - contains the configuration for the training cycle (batchSize, learning rate) and metadata for the model (name, version)
  // modelReport - Used as a completion block and reports the diffs to PyGrid.
  self.syftJob?.onReady(execute: { modelParams, plans, clientConfig, modelReport in

    // This returns an array for each MNIST image and the corresponding label as PyTorch tensor
    // It divides the training data and the label by batches
    guard let MNISTDataAndLabelTensors = try? MNISTLoader.loadAsTensors(setType: .train) else {
        return
    }

    // This loads the MNIST tensor into a dataloader to use for iterating during training
    let dataLoader = MultiTensorDataLoader(dataset: MNISTDataAndLabelTensors, shuffle: true, batchSize: 64)

    // Iterate through each batch of MNIST data and label
    for batchedTensors in dataLoader {

      // We need to create an autorelease pool to release the training data from memory after each loop
      autoreleasepool {

          // Preprocess MNIST data by flattening all of the MNIST batch data as a single array
          let MNISTTensors = batchedTensors[0].reshape([-1, 784])

          // Preprocess the label ( 0 to 9 ) by creating one-hot features and then flattening the entire thing
          let labels = batchedTensors[1]

          // Add batch_size, learning_rate and model_params as tensors
          let batchSize = [UInt32(clientConfig.batchSize)]
          let learningRate = [clientConfig.learningRate]

          guard
              let batchSizeTensor = TorchTensor.new(array: batchSize, size: [1]),
              let learningRateTensor = TorchTensor.new(array: learningRate, size: [1]) ,
              let modelParamTensors = modelParams.paramTensorsForTraining else
          {
              return
          }

          // Execute the torchscript plan with the training data, labels, batch size, learning rate and model params
          let result = plans["training_plan"]?.forward([TorchIValue.new(with: MNISTTensors),
                                                        TorchIValue.new(with: labels),
                                                        TorchIValue.new(with: batchSizeTensor),
                                                        TorchIValue.new(with: learningRateTensor),
                                                        TorchIValue.new(withTensorList: modelParamTensors)])

          // Example returns a list of tensors in the following order: loss, accuracy, model param 1,
          // model param 2, model param 3, model param 4
          guard let tensorResults = result?.tupleToTensorList() else {
              return
          }

          let lossTensor = tensorResults[0]
          lossTensor.print()
          let loss = lossTensor.item()

          let accuracyTensor = tensorResults[1]
          accuracyTensor.print()

          // Get updated param tensors and update them in param tensors holder
          let param1 = tensorResults[2]
          let param2 = tensorResults[3]
          let param3 = tensorResults[4]
          let param4 = tensorResults[5]

          modelParams.paramTensorsForTraining = [param1, param2, param3, param4]

      }
    }

    // Generate the diff data from the updated model params and report the final diffs to PyGrid
    do {
        let diffStateData = try modelParams.generateDiffData()
        modelReport(diffStateData)
    } catch {
        print(error)
    }

  })

  // This is the error handler for any job execution errors like connecting to PyGrid
  self.syftJob?.onError(execute: { error in
    print(error)
  })

  // This is the error handler for being rejected in a cycle. You can retry again
  // after the suggested timeout.
  self.syftJob?.onRejected(execute: { timeout in
      if let timeout = timeout {
          // Retry again after timeout
          print(timeout)
      }
  })

  // Start the job. You can set that the job should only execute if the device is being charged and there is
  // a WiFi connection. These options are on by default if you don't specify them.
  self.syftJob?.start(chargeDetection: true, wifiDetection: true)
}

API Documentation

See API Documentation for complete reference.

Running in the background

A mini tutorial on how to run SwiftSyft on iOS using the background task scheduler can be found here

Running the Demo App

The demo app fetches the plans, protocols and model weights from a PyGrid server hosted locally. The plans are then deserialized and executed using libtorch.

Follow these steps to set up an environment to run the demo app:

  • Clone the PyGrid repo and change directory to PyGrid/apps/domain
$ git clone https://github.com/OpenMined/PyGrid
$ cd PyGrid/apps/domain
  • Run the PyGrid Domain application
$ ./run.sh --port 5000 --start_local_db
  • Install PySyft from source or via PyPI. Follow the instructions specified in the repo.
  • Clone the PySyft repo. In your command line, go to the PySyft/examples/federated-learning/model-centric/ folder and run jupyter notebook.
$ cd PySyft/examples/federated-learning/model-centric/
$ jupyter notebook
  • Open the mcfl_create_plan_mobile.ipynb notebook. Run all the cells to host a plan on your running PyGrid domain server.
  • Set up the demo project using CocoaPods:
    • Install CocoaPods
$ gem install cocoapods
    • Install the dependencies of the project.
$ pod install # in the root directory of this project
    • Open the file SwiftSyft.xcworkspace in Xcode.
    • Run the SwiftSyft project. It automatically uses 127.0.0.1:5000 as the PyGrid URL.

Contributing

Set-up

You can work on the project by running pod install in the root directory. Then open the file SwiftSyft.xcworkspace in Xcode. When the project is open in Xcode, you can work on the SwiftSyft pod itself in Pods/Development Pods/SwiftSyft/Classes/*

Workflow

  1. Star, fork, and clone the repo
  2. Open the project in Xcode
  3. Check out the current issues on GitHub. For newcomers, check out issues labeled Good first issue
  4. Do your work
  5. Push your fork
  6. Submit a PR to OpenMined/SwiftSyft

Read the contribution guide as a good starting place.

Support

For support in using this library, please join the #lib_swift_syft Slack channel. If you'd like to follow along with any code changes to the library, please join #code_swiftsyft Slack channel. Click here to join our Slack Community!

License

Apache License 2.0

Contributors ✨

Thanks goes to these wonderful people (emoji key):

Mark Jimenez

💻 📖

Madalin Mamuleanu

💻

Rohith Pudari

💻

Sebastian Bischoff

📖

Luke Reichold

💻

This project follows the all-contributors specification. Contributions of any kind welcome!

swiftsyft's Issues

RTCCVPixelBuffer is implemented in both Apple System Library and GoogleWebRTC Pod

Crash happens when connecting to a signalling server using GoogleWebRTC.

Issue was also found here:

stasel/WebRTC-iOS#19

https://bugs.chromium.org/p/webrtc/issues/detail?id=10560

Safe to say, Apple probably won't fix this on their end and we're going to have to recompile GoogleWebRTC from source and rename RTCCVPixelBuffer.

For those who can help building the library from source here's where you can get the source:

https://webrtc.org/native-code/ios/

Take note, it says the source code is 6GB, but when I downloaded it it's actually 12 GB :(

Edit:

  • Crash doesn't happen all the time. Sometimes it's just a warning. I think iOS randomly chooses which class to use and may cause a crash depending on the device.

Use corrected Google WebRTC binary to prevent class name conflicts in iOS 13

As previously discussed, GoogleWebRTC binary has naming conflicts with some of iOS's classes that may cause a crash in some devices. Reference: stasel/WebRTC-iOS#19

Also in the same issue someone has
a.) Created step by step guide on how to rename the classes and rebuild the library from scratch
stasel/WebRTC-iOS#19 (comment)
and
b.) Provided a pre-built binary.
stasel/WebRTC-iOS#19 (comment)

We should test whether the pre-built binary still causes any warnings or crashes. For production, however, we need to build the library from scratch by following the steps above, for security purposes.

Create new project to demonstrate using background task scheduler with SwiftSyft

The limitations of iOS 13's BGTaskScheduler API (mainly having to register the background task in application(_:didFinishLaunchingWithOptions:)) have made it hard to integrate into the SwiftSyft library's API. Instead, a new project will be created to show an example of how to use it along with the SwiftSyft library.

  • Create new Example-background project and use it in the same workspace.
  • Add BGTaskScheduler code to create a one-time background task to execute MNIST plan example
  • Optionally add a repeating background task logic
  • Update github flow yaml to reflect new project/workspace structure.
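
As a rough illustration of the one-time background task described above, here is a minimal sketch using Apple's BackgroundTasks framework. The task identifier and both function names are assumptions made for this example and are not part of SwiftSyft; the identifier would also need to be listed under BGTaskSchedulerPermittedIdentifiers in the app's Info.plist. A processing task is used here because it can require external power and network connectivity, which lines up with the charge and Wi-Fi defaults.

```swift
import Foundation

// Hypothetical identifier chosen for this sketch.
let trainingTaskIdentifier = "org.example.swiftsyft.train"

#if os(iOS)
import BackgroundTasks

// Call this from application(_:didFinishLaunchingWithOptions:),
// before the app finishes launching.
@available(iOS 13.0, *)
func registerTrainingTask() {
    BGTaskScheduler.shared.register(forTaskWithIdentifier: trainingTaskIdentifier, using: nil) { task in
        task.expirationHandler = {
            // The system is reclaiming the time: stop the running job here.
            task.setTaskCompleted(success: false)
        }
        // Start the SwiftSyft job here, and call
        // task.setTaskCompleted(success: true) once the diff has been reported.
    }
}

// Submits a one-time processing request that only runs while the device is
// charging and has network connectivity.
@available(iOS 13.0, *)
func scheduleTrainingTask() throws {
    let request = BGProcessingTaskRequest(identifier: trainingTaskIdentifier)
    request.requiresExternalPower = true
    request.requiresNetworkConnectivity = true
    try BGTaskScheduler.shared.submit(request)
}
#endif
```

For the optional repeating behaviour, one possible approach is to call scheduleTrainingTask() again from inside the launch handler so that a new request is queued each time the task runs.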

iOS 12 Socket Connection Implementation

As referenced in #3 , this is the socket connection implementation that will be used by the signalling client. It should conform to the SocketClientProtocol and be responsible for maintaining socket connection via ping, sending and receiving data.

For iOS 12, you can research an updated socket connection library that supports all of the requirements specified in #3. Please post it here so we can discuss the pros and cons of your chosen library.

For reference, the WebRTC iOS demo at https://github.com/stasel/WebRTC-iOS uses https://github.com/daltoniam/Starscream

Can't build app for a real device, have error "does not contain bitcode"

SwiftSyft/Example/Pods/LibTorch/install/lib/libtorch.a(CPUGenerator.cpp.o)' does not contain bitcode. You must rebuild it with bitcode enabled (Xcode setting ENABLE_BITCODE), obtain an updated library from the vendor, or disable bitcode for this target. file '/Pods/LibTorch/install/lib/libtorch.a' for architecture arm64

Implement FL Authentication and Cycle API Request

This will be implemented in SyftClient class start method here : https://github.com/OpenMined/SwiftSyft/blob/master/SwiftSyft/Classes/SyftClient.swift#L28

The API requests are described here:
OpenMined/PyGrid-deprecated---see-PySyft-#445
Refer to Authentication with PyGrid and FL Worker Cycle Request

The requests can be made by socket messages or HTTP requests but this issue is mainly for the HTTP requests made using URLSession.dataTaskPublisher and chaining the requests.

A separate issue will be made for using signalling requests.

Execute plans in iOS

This epic issue is somewhat self-explanatory, but in theory, we need to be able to execute a PySyft plan. This should ideally only be done after the API has been finalized in the iOS worker (#28).

Add documentation for MNIST Loader

While the MNIST Loader is not part of the SwiftSyft library, it does demonstrate how to load data and transform it to be able to be consumed by our torchscript models.

Feel free to ask me if you have any questions about it. The original code for the loader came from this repository: https://github.com/simonlee2/MNISTKit

Add support for charge detection and wifi detection in iOS

While we want to allow the end-user developer integrating SwiftSyft into their application the ability to choose to execute models while the user is charging their phone, we don't want to force them into this paradigm.

This issue includes developing charge detection that a developer can choose to use. It's worth noting that the default option for a syft client would be to "enable" this as a requirement for training. It's up to the developer to state that they want this option "disabled".
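
The "enabled by default, developer can opt out" behaviour described above can be sketched as a small value type. All names here (DeviceState, TrainingRequirements, allowsTraining) are hypothetical and only illustrate the idea; they are not SwiftSyft types. On a real device the charging state could come from UIDevice.current.batteryState once isBatteryMonitoringEnabled is set to true.

```swift
import Foundation

// Hypothetical snapshot of the device conditions relevant to training.
struct DeviceState {
    var isCharging: Bool
    var isOnWiFi: Bool
}

// Hypothetical requirements object. Both checks default to enabled,
// mirroring the default described in the issue.
struct TrainingRequirements {
    var chargeDetection: Bool = true
    var wifiDetection: Bool = true

    /// Returns true when every enabled requirement is met by the device state.
    func allowsTraining(on device: DeviceState) -> Bool {
        if chargeDetection && !device.isCharging { return false }
        if wifiDetection && !device.isOnWiFi { return false }
        return true
    }
}

let onBattery = DeviceState(isCharging: false, isOnWiFi: true)
let defaultRequirements = TrainingRequirements()
let relaxedRequirements = TrainingRequirements(chargeDetection: false)

print(defaultRequirements.allowsTraining(on: onBattery)) // false
print(relaxedRequirements.allowsTraining(on: onBattery)) // true
```

Keeping the decision in a pure function makes the defaults easy to unit test without a physical device.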

Add support for background task scheduling in iOS

In order to properly execute training plans, we must do so in a background task. This allows for training to take place without a visual API (as a library of another app), and do so separate from the main thread.

Background Execution Example Documentation

There's currently a separate example app that demonstrates how to use a SwiftSyft job with iOS 13's new Background Task Scheduler. We need some form of documentation (Background-Execution.md) explaining the rationale for using it outside of the library and how BGTaskScheduler is used with SwiftSyft.

The file with the example is here

I think some main points that should be included.

Add a stopping method that stops the training process in iOS

We need to have a stopping method that will terminate the current job in question. Reasons for stopping training could be any of the following:

  • The user wanted to... like they clicked a "stop" button
  • The plan has an error and can't execute
  • The user started using their device again
  • The device loses wifi
  • The device loses active charging
  • Or perhaps most importantly... if the model isn't really going anywhere (the error rate isn't going down)

At this point, we should stop and notify the user with some sort of message.
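
For the last reason in the list (the error rate isn't going down), one common approach is a patience counter over the reported loss. The sketch below is an assumption about how such a check could look; LossPlateauDetector is a hypothetical helper and not part of SwiftSyft.

```swift
import Foundation

// Hypothetical helper: signals a stop when the loss has not improved by at
// least `minDelta` for `patience` consecutive batches.
struct LossPlateauDetector {
    let patience: Int
    let minDelta: Float
    private(set) var bestLoss = Float.greatestFiniteMagnitude
    private(set) var batchesWithoutImprovement = 0

    init(patience: Int, minDelta: Float = 0) {
        self.patience = patience
        self.minDelta = minDelta
    }

    /// Records a new loss value and returns true if training should stop.
    mutating func shouldStop(afterLoss loss: Float) -> Bool {
        if loss < bestLoss - minDelta {
            bestLoss = loss
            batchesWithoutImprovement = 0
        } else {
            batchesWithoutImprovement += 1
        }
        return batchesWithoutImprovement >= patience
    }
}

var detector = LossPlateauDetector(patience: 2)
for loss in [1.0, 0.8, 0.9, 0.85] as [Float] {
    if detector.shouldStop(afterLoss: loss) {
        print("Stopping: loss has not improved for \(detector.patience) batches")
        break
    }
}
```

The other reasons in the list (user request, lost Wi-Fi, lost charging) are external events, so they would more naturally call the stopping method directly instead of going through a detector like this.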

Implement FL Authentication and Cycle Socket Requests

Similar to #47, but implemented using socket messages. This may require a refactor of signalling client to be able to easily chain both requests.

  • Refactor SignallingClient to use Combine framework for observing messages. This makes it easier to chain requests
  • Separate SignallingMessages model to separate SignallingMessagesRequest and SignallingMessageResponse. Previous version assumed socket request and response formats were the same.
  • Add auth request and response models
  • Add cycle request and response models
  • Add socket auth request and response handler in syft client
  • Add socket cycle request and response handler in syft client

SwiftSyft Connection Clients Roadmap

SwiftSyft establishes connection to a grid server (grid.js or pygrid) via a signalling server (websocket) and receives instructions (plans, protocols or tensor operations) sent by PySyft. The operations can be shared to other workers (mobile or web) via a WebRTC peer-to-peer connection.

In order to facilitate this functionality, we're going to need the following main components:

Components

  • iOS 12 Socket connection
  • iOS 13 Socket connection
  • Signalling client
  • WebRTC client
  • Syft client

Components Description

  1. iOS 12 and 13 Socket Connection implementation
  2. Signalling client
  3. WebRTC client
    • Should establish and keep multi-peer connection and data channel from the connections it receives from the signalling client.
    • Handles storing connections, data channels and ice candidates for each peer connection.
    • Sends SDP offers/answers using signalling client.
    • Should use GoogleWebRTC iOS library.
    • Reference: https://github.com/OpenMined/syft.js/blob/master/src/webrtc.js
  4. Syft client
  5. Messages
    • These are the messages sent among the grid server and mobile/web workers.
    • For now, only socket messages (between workers and grid server) are defined
    • Socket messages are serialized as JSON for now. PySyft protocols, on the other hand, are serialized in serde (custom serialization by PySyft) by grid.js. We're not going to implement serde deserialization since eventually all the messages will be serialized in Protobuf.
    • Reference: https://github.com/OpenMined/grid.js/blob/master/SOCKETS.md

Supporting features

  • Data channel chunking
  • UI to specify server and peer connection
  1. Data Channel Chunking
  2. UI
    • Specifies socket server to connect to
    • Initiates connection to grid server via button tap

Edit:

  • Added requirement to enforce ws:// connection in signalling client.

Implement Protobuf classes in iOS

We need to add the following Protobuf classes to SwiftSyft as they are completed:

  • Plan
  • State
  • Operation
  • Placeholder
  • TorchParameter
  • Protocol
  • PromiseTensor
  • ... more to be defined

Implement sleep/wake detection in iOS

While we want to allow the end-user developer integrating SwiftSyft into their application the ability to choose to execute models while the user is asleep, we don't want to force them into this paradigm.

This issue includes developing a basic "asleep or awake" algorithm that a developer can choose to use. It's worth noting that the default option for a syft client would be to "enable" this as a requirement for training. It's up to the developer to state that they want this option "disabled".

WebRTC Client Implementation

  • GoogleWebRTC integration
  • PeerConnection Wrapper
  • Initialize peer connection
  • Peer connection events
  • PeerConnection Events Hooks/Observers
  • Signalling message processing logic
  • Data channel observers
  • Unit tests

High memory use while training

Memory use of the example app using MNIST data increases incrementally until it reaches OS limits for the app and crashes on a real device.

Doesn't happen on the simulator due to higher memory limit.

UI for initiating connection to grid server

We need a UI in the example app similar to that found in syft.js

You can run the example syft.js site by following the instructions in https://github.com/OpenMined/syft.js and https://github.com/OpenMined/grid.js/ .

Easiest way to do it would be to use docker.
Grid

  • cd into the docker directory in grid.js and enter docker-compose -f example-seed.yml up in the command line

Syft.js

This can be iPhone only. We don't need to support iPad since it's just an example. You're free to use
the storyboard for this.

UI Components:

  • OpenMined Logo
  • description
  • Text box for socket server URL
  • Text box for protocol ID (this can be hardcoded value)
  • Button to connect to grid
  • Text box to write message to other participants (hidden when disconnected, visible when connected)
  • Button to send message to other participants. (hidden when disconnected, visible when connected)

Add bandwidth and Internet connectivity test in iOS

We need to have some sort of way to run a basic bandwidth and Internet connectivity test in iOS so that we may submit these values to PyGrid. This allows PyGrid to properly select candidates for pooling based on internet connection speed. This does not check for wifi connectivity. This will be included in a separate issue.

We must determine the average ping, upload speed, and download speed of the device and report these values to PyGrid.
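
The arithmetic behind the three reported values is simple enough to sketch. The function and type names below are hypothetical, and the sketch leaves out the measurement itself (timing requests against the server and counting transferred bytes); it only shows how raw samples could be turned into an average ping and throughput figures.

```swift
import Foundation

// Hypothetical container for the three values the issue asks for.
struct ConnectionStats {
    let averagePingMs: Double
    let downloadMbps: Double
    let uploadMbps: Double
}

/// Mean of round-trip samples, in milliseconds. Returns nil for no samples.
func averagePing(_ samplesMs: [Double]) -> Double? {
    guard !samplesMs.isEmpty else { return nil }
    return samplesMs.reduce(0, +) / Double(samplesMs.count)
}

/// Throughput in megabits per second for `bytes` transferred in `seconds`.
/// Returns nil when the duration is not positive.
func megabitsPerSecond(bytes: Int, seconds: TimeInterval) -> Double? {
    guard seconds > 0 else { return nil }
    return Double(bytes) * 8 / 1_000_000 / seconds
}

// Example: three ping samples and a 1 MB transfer in each direction.
if let ping = averagePing([10, 20, 30]),
   let down = megabitsPerSecond(bytes: 1_000_000, seconds: 2),
   let up = megabitsPerSecond(bytes: 1_000_000, seconds: 4) {
    let stats = ConnectionStats(averagePingMs: ping, downloadMbps: down, uploadMbps: up)
    print(stats.averagePingMs, stats.downloadMbps, stats.uploadMbps)
}
```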

Web Socket Connection Messages Model

As specified in #3, we're going to need models to serialize/deserialize messages sent to and received from the grid server.

Here is the documentation specification for messages in grid.js

https://github.com/OpenMined/grid.js/blob/master/SOCKETS.md

  • Get Protocol
  • WebRTC Internal Message (Offer, Answer and Ice Candidate)
  • Socket ping (URLSessionWebSocketTask has a ping method. Not sure if we need this)

Notes:

  • Use Codable for serialization and deserialization. Eventually, we will use Protobuf, so we don't need unit tests for these for now.
  • Protocol/Plan field from Get Protocol message is in serde (PySyft custom serialization). This is being ported to Protobuf, so we can skip it for now.
  • In order to determine the type of the message, you're going to have to implement init(from:) and encode(to:) and select the correct model to serialize with. A reference on how to implement it is here: https://github.com/stasel/WebRTC-iOS/blob/master/WebRTC-Demo-App/Sources/Services/Message.swift

If this serialization method doesn't fit our use case or you run into some problems, we can discuss it here.
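
The type-selection approach from the notes above can be sketched with a Codable enum that reads a discriminator field first and then decodes the payload with the matching model. The "type" and "data" keys, the type strings, and the payload shape are assumptions made for this example; the actual message formats are the ones defined in grid.js's SOCKETS.md.

```swift
import Foundation

// Hypothetical payload for a "get protocol" message.
struct GetProtocolPayload: Codable, Equatable {
    let protocolId: String
}

// Hypothetical message enum: one case per message type.
enum SocketMessage: Codable, Equatable {
    case getProtocol(GetProtocolPayload)
    case ping

    private enum CodingKeys: String, CodingKey {
        case type, data
    }

    // Assumed discriminator strings, for illustration only.
    private enum MessageType: String, Codable {
        case getProtocol = "get-protocol"
        case ping = "socket-ping"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Read the discriminator first, then decode `data` with the matching model.
        switch try container.decode(MessageType.self, forKey: .type) {
        case .getProtocol:
            self = .getProtocol(try container.decode(GetProtocolPayload.self, forKey: .data))
        case .ping:
            self = .ping
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        switch self {
        case .getProtocol(let payload):
            try container.encode(MessageType.getProtocol, forKey: .type)
            try container.encode(payload, forKey: .data)
        case .ping:
            try container.encode(MessageType.ping, forKey: .type)
        }
    }
}
```

Decoding is then a single call such as try JSONDecoder().decode(SocketMessage.self, from: data), and callers can switch over the resulting enum case.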

Add app icon

We need an app icon (a set of all the necessary images has already been created).

iOS 13 Socket Connection Implementation

As referenced in #3 , this is the socket connection implementation that will be used by the signalling client. It should conform to the SocketClientProtocol and be responsible for maintaining socket connection via ping, sending and receiving data.

For iOS 13, use URLSessionWebSocketTask.

Implement reachability publisher

This will listen to reachability changes (internet connectivity). Users of this class can check changes by subscribing to a publisher (check Combine framework) exposed by the Reachability class.

iOS 12 should have a new class that makes this easier.
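
A minimal sketch of such a publisher, assuming the class referred to above is NWPathMonitor from Apple's Network framework (available from iOS 12). ReachabilityMonitor and isReachable are hypothetical names chosen for this example, and Combine itself requires iOS 13.

```swift
import Foundation

#if canImport(Combine) && canImport(Network)
import Combine
import Network

// Hypothetical class name; a sketch, not the library's implementation.
@available(iOS 13.0, macOS 10.15, *)
final class ReachabilityMonitor {
    private let monitor = NWPathMonitor()
    // Starts as false until the first path update arrives.
    private let subject = CurrentValueSubject<Bool, Never>(false)

    /// Emits true while a usable network path exists, false otherwise.
    var isReachable: AnyPublisher<Bool, Never> {
        subject.removeDuplicates().eraseToAnyPublisher()
    }

    init(queue: DispatchQueue = DispatchQueue(label: "reachability.monitor")) {
        monitor.pathUpdateHandler = { [subject] path in
            subject.send(path.status == .satisfied)
        }
        monitor.start(queue: queue)
    }

    deinit {
        monitor.cancel()
    }
}
#endif
```

A subscriber would keep the monitor alive and call isReachable.sink { ... }, storing the returned cancellable. The path's isExpensive property could be exposed in the same way if metered connections need to be told apart from Wi-Fi.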
