secretflow / secretflow


A unified framework for privacy-preserving data analysis and machine learning

Home Page: https://www.secretflow.org.cn/docs/secretflow/en/

License: Apache License 2.0

Shell 0.38% Python 98.92% Starlark 0.11% C++ 0.35% C 0.02% Dockerfile 0.23%
differential-privacy homomorphic-encryption machine-learning privacy-preserving private-set-intersection secure-multiparty-computation trusted-execution-environment data-analysis federated-learning split-learning

secretflow's Introduction



SecretFlow is a unified framework for privacy-preserving data intelligence and machine learning. To achieve this goal, it provides:

  • An abstract device layer consisting of plain devices and secret devices, which encapsulate various cryptographic protocols.
  • A device flow layer that models higher-level algorithms as device object flows and DAGs.
  • An algorithm layer for data analysis and machine learning on horizontally or vertically partitioned data.
  • A workflow layer that seamlessly integrates data processing, model training, and hyperparameter tuning.

Documentation

SecretFlow Related Projects

  • Kuscia: A lightweight privacy-preserving computing task orchestration framework based on K3s.
  • SCQL: A system that allows multiple distrusting parties to run joint analysis without revealing their private data.
  • SPU: A provable, measurable secure computation device, which provides computation ability while keeping your private data protected.
  • HEU: A high-performance homomorphic encryption algorithm library.
  • YACL: A C++ library that contains cryptography, network and io modules which other SecretFlow code depends on.

Install

Please check INSTALLATION.md

Deployment

Please check DEPLOYMENT.md

Learn PETs

We also provide a curated list of papers and SecretFlow's tutorials on Privacy-Enhancing Technologies (PETs).

Please check AWESOME-PETS.md

Contributing

Please check CONTRIBUTING.md

Benchmarks

Please check OVERALL_BENCHMARK.md

Disclaimer

Non-release versions of SecretFlow must not be used in any production environment because of possible bugs, glitches, missing functionality, security issues or other problems.

secretflow's People

Contributors

6fj, adijeshen, anakinxc, bloom0705, cryptocxf, freepengui, fy222fy, ian-huu, jamie-cui, jinstorm, krout0n, lia0409, liang-xiaojian, longshan-ant, lph-github, mingbo-lee, shaojian-ant, starlight039, tarantula-leo, tongke6, tonywu6, usafchn, wangzul, wuxibin89, xfap, zhang-tuo-pdf, zhangxingmeng, zhaocaibei123, zhouaihui, zlyber


secretflow's Issues

secretflow-ray vs ray

I see that requirements.txt uses secretflow-ray==2.0.0.dev0. What is the difference between secretflow-ray and ray, and is the related code open source?

How to use 3PC in Neural Network with SPU?

Issue Type

Others

Source

binary

Secretflow Version

latest

OS Platform and Distribution

ubuntu18.04

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and What you expected to happen.

I want to use 3PC in a neural network. I set up three parties in the same way as for 2PC, but the setup does not work. Why? The machine has 8 cores and 16 GB of memory.


sf.init(['alice', 'bob', 'adan'], num_cpus=12, log_to_driver=True)
alice, bob, adan = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('adan')
spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob', 'adan']))
x1, _ = alice(load_train_dataset)(party_id=0)
x0, _ = adan(load_train_dataset)(party_id=1)
x2, y = bob(load_train_dataset)(party_id=2)

Reproduction code to reproduce the issue.

import time

import jax
import jax.numpy as jnp
import numpy as np
from jax.example_libraries import optimizers, stax
from jax.example_libraries.stax import Dense, Relu
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def load_train_dataset(party_id=None) -> (np.ndarray, np.ndarray):
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    X_train, _, y_train, _ = train_test_split(
        features, label, test_size=0.2, random_state=42
    )
    if party_id == 0:
        return X_train[:, :10], _
    elif party_id == 1:
        return X_train[:, 10:20], _
    else:
        return X_train[:, 20:30], y_train

def load_test_dataset():
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    _, X_test, _, y_test = train_test_split(
        features, label, test_size=0.2, random_state=42
    )
    return X_test, y_test

def MLP():
    nn_init, nn_apply = stax.serial(
        Dense(500),
        Relu,
        Dense(500),
        Relu,
        Dense(500),
        Relu,
        Dense(1),
    )

    return nn_init, nn_apply


KEY = jax.random.PRNGKey(42)
INPUT_SHAPE = (-1,30)

def init_state(learning_rate):
    init_fun, _ = MLP()
    _, params_init = init_fun(KEY, INPUT_SHAPE)
    opt_init, _, _ = optimizers.sgd(learning_rate)
    opt_state = opt_init(params_init)
    return opt_state

def train(
    train_x0,
    train_x1,
    train_x2,
    train_y,
    opt_state,
    learning_rate,
    epochs,
    batch_size,
):
    train_x = jnp.concatenate([train_x0, train_x1, train_x2], axis=1)

    _, predict_fun = MLP()
    _, opt_update, get_params = optimizers.sgd(learning_rate)

    def update_model(state, imgs, labels, i):
        def mse(y, pred):
            return jnp.mean(jnp.multiply(y - pred, y - pred) / 2.0)

        def loss_func(params):
            y = predict_fun(params, imgs)
            return mse(y, labels), y

        grad_fn = jax.value_and_grad(loss_func, has_aux=True)
        (loss, y), grads = grad_fn(get_params(state))
        return opt_update(i, grads, state)
    for i in range(1, epochs + 1):
        begin = time.time()
        # Integer division: the number of batches must be a whole number.
        imgs_batchs = jnp.array_split(train_x, len(train_x) // batch_size, axis=0)
        labels_batchs = jnp.array_split(train_y, len(train_y) // batch_size, axis=0)

        for batch_idx, (batch_images, batch_labels) in enumerate(
            zip(imgs_batchs, labels_batchs)
        ):
            opt_state = update_model(opt_state, batch_images, batch_labels, i)
        end = time.time()
        print("epoch-{} cost time:{} s".format(i, end - begin))
    return get_params(opt_state)


def validate_model(params, X_test, y_test):
    _, predict_fun = MLP()
    y_pred = predict_fun(params, X_test)
    return roc_auc_score(y_test, y_pred)

if __name__ == "__main__":
    import jax
    
    # Hyperparameter
    batch_size = 100
    epochs = 1
    learning_rate = 0.1
    
    init_params = init_state(learning_rate)
    import secretflow as sf

    # In case you have a running secretflow runtime already.
    sf.shutdown()

    sf.init(['alice', 'bob', 'adan'], num_cpus=12, log_to_driver=True)

    alice, bob, adan = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('adan')
    spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob', 'adan']))

    x1, _ = alice(load_train_dataset)(party_id=0)
    x0, _ = adan(load_train_dataset)(party_id=1)
    x2, y = bob(load_train_dataset)(party_id=2)

    device = spu
    x0_, x1_, x2_, y_ = x0.to(device), x1.to(device), x2.to(device), y.to(device)
    init_params_ = sf.to(spu, lambda: init_state(learning_rate))
    begin = time.time()
    params_spu = spu(train, static_argnames=['learning_rate', 'epochs', 'batch_size'])(
        x0_, x1_, x2_, y_,init_params_, learning_rate=learning_rate, epochs=epochs, batch_size=batch_size
    )

    params = sf.reveal(params_spu)
    print("train cost time:",time.time() - begin)
    print(params)
    X_test, y_test = load_test_dataset()
    auc = validate_model(params, X_test, y_test)
    print(f'auc={auc}')

Error when running the three-party PSI from the official tutorial

Issue Type

Bug

Source

source

Secretflow Version

latest

OS Platform and Distribution

centos 7.6

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and What you expected to happen.

RayActorError: The actor died because of an error raised in its creation task, ray::SPURuntime.__init__() (pid=89408, ip=172.16.160.4, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f6822262d60>)
  File "/mnt/hgfs/code/secretflow/secretflow/device/device/spu.py", line 125, in __init__
    self.link = link.create_brpc(desc, rank)
RuntimeError: what: 
	[external/yasl/yasl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=2
stacktrace: 
#0 pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()+0x7f6874c28ed7
#1 pybind11::cpp_function::dispatcher()+0x7f6874c150cb
#2 PyCFunction_Call+0x43bdca

Reproduction code to reproduce the issue.

Run step 10 of the three-party PSI in the tutorial.
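
The `connect to mesh failed ... rank=2` message says that the SPU runtime could not open its link to the third party. One thing worth ruling out is plain network reachability of each party's SPU address. A minimal sketch using only the standard library follows; `can_connect` is a hypothetical helper, not part of SecretFlow, and the check is only meaningful while the peer's SPU process is already listening.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling `can_connect(ip, port)` from each machine against the other parties' SPU addresses shows whether a firewall or a wrong address is involved. A `False` result while the peer is not running is expected and does not indicate a problem by itself.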

Error during cluster deployment

Issue Type

Others

Source

source

Secretflow Version

0.6.13b1

OS Platform and Distribution

Rocky Linux release 8.5

Python version

3.8.12

Bazel version

No response

GCC/Compiler version

No response

What happened and What you expected to happen.

Is there an example of assigning an IP and port to the SPU? See [SPU __init__](https://github.com/secretflow/secretflow/blob/beta/secretflow/device/device/spu.py#L386).
I did not set a port for the bob node when starting it, but the example above requires bob's address. How should it be filled in?
The address I currently use for bob is bob's IP with alice's port; the full script is listed under the reproduction code below.

Calling sf.reveal then fails with:
RayActorError Traceback (most recent call last)
Input In [7], in <cell line: 2>()
1 data = device(add)(data1,data2)
----> 2 sf.reveal(data)

File ~/.pyenv/versions/3.8.12/envs/secretflow/lib/python3.8/site-packages/secretflow/device/driver.py:158, in reveal(func_or_object)
155 value_ref.append(value.device.sk_keeper.decrypt.remote(value.data))
156 value_idx.append(i)
--> 158 value_obj = ray.get(value_ref)
159 idx = 0
160 for i in value_idx:

File ~/.pyenv/versions/3.8.12/envs/secretflow/lib/python3.8/site-packages/ray/_private/client_mode_hook.py:105, in client_mode_hook.<locals>.wrapper(*args, **kwargs)
103 if func.__name__ != "init" or is_client_mode_enabled_by_default:
104 return getattr(ray, func.__name__)(*args, **kwargs)
--> 105 return func(*args, **kwargs)

File ~/.pyenv/versions/3.8.12/envs/secretflow/lib/python3.8/site-packages/ray/worker.py:1845, in get(object_refs, timeout)
1843 raise value.as_instanceof_cause()
1844 else:
-> 1845 raise value
1847 if is_individual_id:
1848 values = values[0]

RayActorError: The actor died because of an error raised in its creation task, ray::SPURuntime.__init__() (pid=21797, ip=172.22.56.85, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fc6d3326c10>)
File "/root/.pyenv/versions/3.8.12/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 125, in __init__
self.link = link.create_brpc(desc, rank)
RuntimeError: what:
[external/yasl/yasl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0
stacktrace:
#0 pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()+0x7fc728c28ed7
#1 pybind11::cpp_function::dispatcher()+0x7fc728c150cb
#2 PyCFunction_Call+0x43be5a


Also, is there a more detailed cluster deployment example?

Reproduction code to reproduce the issue.

import secretflow as sf
import spu

sf.shutdown()
# Placeholders: replace <alice_ip> and <bob_ip> with the real IP addresses.
sf.init(address='<alice_ip>:8881')

alice = sf.PYU('alice')
bob = sf.PYU('bob')
device = sf.SPU({
    'nodes': [
        {
            'party': 'alice',
            'id': 'local:0',
            # The address for other peers.
            'address': '<alice_ip>:8881',
            # The listen address of this node.
            # Optional. Address will be used if listen_address is empty.
            'listen_address': ''
        },
        {
            'party': 'bob',
            'id': 'local:1',
            'address': '<bob_ip>:8881',
            'listen_address': ''
        },
    ],
    'runtime_config': {
        'protocol': spu.spu_pb2.SEMI2K,
        'field': spu.spu_pb2.FM128,
        'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL,
    }
})
data1 = alice(lambda x: x)(2).to(device)
data2 = bob(lambda x: x)(2).to(device)


def add(a, b):
    return a + b


data = device(add)(data1, data2)
sf.reveal(data)
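
In the script above, alice's SPU node address reuses port 8881, which is also the port passed to `sf.init(address=...)`, that is, the Ray head port. A likely cause of `connect to mesh failed` is that each SPU node needs its own free port that the other parties can reach, separate from the Ray port. The sketch below builds the `nodes` list with that check. `build_spu_nodes`, the IP addresses and the port numbers are hypothetical; the dictionary layout simply mirrors the one used in this issue.

```python
from typing import Dict, List, Tuple


def build_spu_nodes(
    parties: Dict[str, Tuple[str, int]], ray_address: Tuple[str, int]
) -> List[dict]:
    """Build the 'nodes' list of an SPU cluster definition.

    parties maps party name -> (ip, spu_port); ray_address is (ip, port) of the Ray head.
    """
    nodes = []
    seen = set()
    for rank, (party, (ip, port)) in enumerate(parties.items()):
        if (ip, port) == ray_address:
            raise ValueError(f"{party}: SPU address {ip}:{port} is the Ray head address")
        if (ip, port) in seen:
            raise ValueError(f"{party}: SPU address {ip}:{port} is used twice")
        seen.add((ip, port))
        nodes.append(
            {
                "party": party,
                "id": f"local:{rank}",
                "address": f"{ip}:{port}",
                "listen_address": "",
            }
        )
    return nodes


# Example with documentation-only IPs (192.0.2.0/24) and an arbitrary SPU port.
nodes = build_spu_nodes(
    {"alice": ("192.0.2.10", 9394), "bob": ("192.0.2.11", 9394)},
    ray_address=("192.0.2.10", 8881),
)
```

The resulting list would take the place of the hand-written `'nodes'` entry, with the chosen ports opened in any firewall between the parties.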

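Bug in the source code of the image-classification federated learning demo in the docs

In the demo in the developer documentation, the code that imports the FL model in the image-classification federated learning example appears to be wrong.
Original code: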
from secretflow.security.aggregation import SpuAggregator, SecureAggregator
from secretflow.ml.nn import FLModelTF

Correct code:
from secretflow.security.aggregation import SPUAggregator, SecureAggregator
from secretflow.ml.nn import FLModelTF

After changing to the code above, running it fails with:
File ~/miniconda3/envs/tensorflow2/lib/python3.8/site-packages/secretflow/security/privacy/mechanism/tensorflow/layers.py:23, in <module>
17 from abc import ABC, abstractmethod
19 from secretflow.security.privacy.accounting.rdp_accountant import (
20 get_rdp,
21 get_privacy_spent_rdp,
22 )
---> 23 import secretflow.security.privacy._lib.random as random
26 class EmbeddingDP(tf.keras.layers.Layer, ABC):
27 def __init__(self) -> None:

ModuleNotFoundError: No module named 'secretflow.security.privacy._lib'

How can this be resolved?

What should be passed to static_argnames?

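In the logistic regression example in the tutorial, the first pair of parentheses in device()() takes not only the function to be computed but also a static_argnames argument. When is this argument required? I tried passing ["epochs", "learning_rate"], which also works, but omitting static_argnames or passing an empty list does not. Is there any explanation of this parameter?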
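
The behaviour described here matches how `static_argnames` works in `jax.jit`: an argument that drives Python-level control flow, such as the bound of a `range()` loop, has to be a concrete value at trace time, so it must be declared static. The sketch below shows this with plain JAX, on the assumption that the SPU device traces the function in a similar way. `scaled_sum` is a made-up function, not the tutorial's training loop.

```python
import jax
import jax.numpy as jnp


def scaled_sum(x, epochs, learning_rate):
    total = jnp.zeros_like(x)
    # range() needs a concrete Python int, so `epochs` must be known at trace time.
    for _ in range(epochs):
        total = total + learning_rate * x
    return total


# Marking `epochs` as static makes JAX treat it as a compile-time constant.
jitted = jax.jit(scaled_sum, static_argnames=["epochs"])
result = jitted(jnp.ones(3), epochs=4, learning_rate=0.5)

# Without static_argnames, `epochs` becomes a tracer and range() rejects it.
try:
    jax.jit(scaled_sum)(jnp.ones(3), epochs=4, learning_rate=0.5)
    failed_without_static = False
except TypeError:
    failed_without_static = True
```

Here `learning_rate` only takes part in arithmetic, so it can stay a traced value or be made static as well; `epochs` cannot be traced because the loop count depends on it.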

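TEE-based solution for multi-party MPC computation

At this stage, MPC protocol operators may be limited by communication volume and the number of communication rounds; the performance loss is especially noticeable with many parties (n>3) over the public network. Could TEE-based multi-party computation be a better solution for secure multi-party computation? What is your view?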

What is the difference between SS-LR/XGB and HESS-LR/XGB?

Issue Type

Documentation Bug

Source

source

Secretflow Version

0.6.13

OS Platform and Distribution

Ubuntu18.04

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and What you expected to happen.

What is the difference between SS-LR/XGB and HESS-LR/XGB?

Reproduction code to reproduce the issue.

What is the difference between SS-LR/XGB and HESS-LR/XGB?

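Installation problem

After installing secretflow directly with pip install, import secretflow fails with an error:
(screenshot: IMG_1649)

Are there requirements on the versions of certain system dependencies?

BTW: the system is CentOS 7 with gcc 4.8.5. I googled the error and it seems a newer glibc needs to be installed. Other applications are running on this server, so I do not dare to change it. Do you plan to provide a Docker image for learning and testing? That would lower the cost of trying it out.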

clone Error

Error downloading object: tests/datasets/adult/horizontal/adult.alice.npy (b67b7b2): Smudge error: Error downloading tests/datasets/adult/horizontal/adult.alice.npy (b67b7b234b61e53fbc989be38dd08c4f19dd5a910b986b70a8982f47e3c4465d): batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.

Errors logged to /home/vscode/secretflow-modelinghub/.git/lfs/logs/20220615T083829.124020079.log
Use git lfs logs last to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: tests/datasets/adult/horizontal/adult.alice.npy: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry the checkout with 'git checkout -f HEAD'
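
According to the log, the clone itself succeeded and only the Git LFS download of the test datasets failed because of the bandwidth quota. If those files are not needed, Git LFS can be told to skip downloading them through the `GIT_LFS_SKIP_SMUDGE=1` environment variable. A sketch that wraps this in Python follows; both helper names are hypothetical and the destination directory is arbitrary.

```python
import os
import subprocess
from typing import Dict, List, Tuple


def lfs_skip_clone_command(url: str, dest: str) -> Tuple[List[str], Dict[str, str]]:
    """Return the git command and environment for a clone that skips LFS downloads."""
    env = dict(os.environ)
    # Tells the git-lfs smudge filter to leave pointer files in place.
    env["GIT_LFS_SKIP_SMUDGE"] = "1"
    return ["git", "clone", url, dest], env


def clone_without_lfs(url: str, dest: str) -> None:
    """Run the clone; raises CalledProcessError if git exits with an error."""
    cmd, env = lfs_skip_clone_command(url, dest)
    subprocess.run(cmd, env=env, check=True)
```

With this setting the LFS-tracked files, such as `tests/datasets/adult/horizontal/adult.alice.npy`, are checked out as small pointer files, so anything that reads those datasets will not work until the real files are fetched.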

Two-party PSI on an unbalanced dataset reports an error

Issue Type

Bug

Source

source

Secretflow Version

secretflow-0.6.13b1

OS Platform and Distribution

Linux ubuntu 18.04

Python version

python 3.8

Bazel version

bazel 5.1.1

GCC/Compiler version

gcc 12.1

What happened and What you expected to happen.

Two-party PSI on an unbalanced dataset reports an error:
alice: 10000000
bob: 100000000
intersection: 100000
logs:
Traceback (most recent call last):
  File "/opt/mpc/secretflow/tests/psi2.py", line 16, in <module>
    spu.psi_csv('id', input_path, output_path)
  File "/opt/mpc/secretflow/secretflow/device/device/spu.py", line 491, in psi_csv
    return dispatch('psi_csv', self, key, input_path, output_path, protocol, sort)
  File "/opt/mpc/secretflow/secretflow/device/device/register.py", line 111, in dispatch
    return _registrar.dispatch(self.device_type, name, self, *args, **kwargs)
  File "/opt/mpc/secretflow/secretflow/device/device/register.py", line 80, in dispatch
    return self._ops[device_type][name](*args, **kwargs)
  File "/opt/mpc/secretflow/secretflow/device/kernels/spu.py", line 177, in psi_csv
    return ray.get(res)
  File "/opt/mpc/secretflow/venv/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/opt/mpc/secretflow/venv/lib/python3.8/site-packages/ray/worker.py", line 1843, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RuntimeError): ray::SPURuntime.psi_csv() (pid=113075, ip=10.228.21.60, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f2eac002880>)
  File "/opt/mpc/secretflow/secretflow/device/device/spu.py", line 368, in psi_csv
    libs.kkrt_2pc_psi(
RuntimeError: what: 
	[external/yasl/yasl/link/transport/channel.cc:86] Get data timeout, key=root:3:ALLGATHER
stacktrace: 
#0 yasl::link::Context::RecvInternal()+0x7f2f02b100b2
#1 yasl::link::AllGatherImpl<>()+0x7f2f029c8785
#2 yasl::link::AllGather()+0x7f2f029c8cb4
#3 spu::psi::PsiExecutorBase::Run()+0x7f2f02411709
#4 pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()+0x7f2f00c341e8
#5 pybind11::cpp_function::dispatcher()+0x7f2f00c150cb
#6 PyCFunction_Call+0x5eda96

Process finished with exit code 1

Reproduction code to reproduce the issue.

import time
import pandas as pd
import secretflow as sf

sf.shutdown()
sf.init(['alice', 'bob', 'carol'], log_to_driver=False)

alice, bob = sf.PYU('alice'), sf.PYU('bob')
spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob']))

input_path = {alice: '.data/alice.csv', bob: '.data/bob.csv'}
output_path = {alice: '.data/alice_psi.csv', bob: '.data/bob_psi.csv'}

time_psi2_begin = time.time()
spu.psi_csv('id', input_path, output_path)
time_psi2_end = time.time()
print('psi2 cost time:', time_psi2_end - time_psi2_begin, 's')

da_psi = pd.read_csv('.data/alice_psi.csv')
db_psi = pd.read_csv('.data/bob_psi.csv')
expected = pd.read_csv('.data/intersection')

print(da_psi.shape[0] == expected.shape[0])
print(db_psi.shape[0] == expected.shape[0])

Error when running "Move JAX program to SPU" from the official documentation

Issue Type

Bug

Source

binary

Secretflow Version

0.6

OS Platform and Distribution

macos 12.4

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and What you expected to happen.

https://spu.readthedocs.io/en/beta/getting_started/quick_start.html
Running "Move JAX program to SPU" fails with ModuleNotFoundError: No module named '__mp_main__'

Reproduction code to reproduce the issue.

# run make_rand on P1, the value is visible for P1 only.
x = ppd.device("P1")(make_rand)()

# run make_rand on P2, the value is visible for P2 only.
y = ppd.device("P2")(make_rand)()

# run greater on SPU, it automatically fetches x/y from P1/P2 (as ciphertext), and compute the result securely.
ans = ppd.device("SPU")(greater)(x, y)


---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Input In [22], in <cell line: 2>()
      1 # run make_rand on P1, the value is visible for P1 only.
----> 2 x_ = ppd.device("P1")(make_rand)()
      4 # run make_rand on P2, the value is visible for P2 only.
      5 y_ = ppd.device("P2")(make_rand)()

File ~/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/distributed.py:373, in PYU.Function.__call__(self, *args, **kwargs)
    367     return pyfunc(*args, **kwargs)
    369 args, kwargs = tree_map(prep_objref, (args, kwargs))
    371 return tree_map(
    372     partial(PYU.Object, self.device),
--> 373     self.device.node_client.run(server_fn, *args, **kwargs),
    374 )

File ~/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/distributed.py:152, in NodeClient.run(self, fn, *args, **kwargs)
    150 """Run a function on the corresponding node server"""
    151 self._check_args(*args, **kwargs)
--> 152 return self._call(self._stub.Run, fn, *args, **kwargs)

File ~/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/util/distributed.py:143, in NodeClient._call(self, stub_method, fn, *args, **kwargs)
    139 rsp_gen = stub_method(
    140     RunRequest(data=split) for split in split_message(payload)
    141 )
    142 rsp_data = rebuild_messages(rsp_itr.data for rsp_itr in rsp_gen)
--> 143 result = pickle.loads(rsp_data)
    144 if isinstance(result, Exception):
    145     raise Exception("remote exception", result)

ModuleNotFoundError: No module named '__mp_main__'

ERROR: Could not find a version that satisfies the requirement spu==0.1.0b1 (from secretflow) (from versions: none)

Issue Type

Others

Source

binary

Secretflow Version

0.6.13

OS Platform and Distribution

No response

Python version

No response

Bazel version

No response

GCC/Compiler version

No response

What happened and What you expected to happen.

Hello, pip install -U secretflow reports an error. How should I handle it?
ERROR: Could not find a version that satisfies the requirement spu==0.1.0b1 (from secretflow) (from versions: none)

Reproduction code to reproduce the issue.

ERROR: Could not find a version that satisfies the requirement spu==0.1.0b1 (from secretflow) (from versions: none)

Declaring an SPU device

When an SPU device is created, is the argument passed to it the name given to this SPU? Are there other ways to create an SPU device?

Failed to create secure subchannel for secure name '192.168.137.4:6379'

Issue Type

Build/Install

Source

binary

Secretflow Version

latest

OS Platform and Distribution

Centos7.5

Python version

No response

Bazel version

No response

GCC/Compiler version

No response

What happened and What you expected to happen.

When running the following command:
RAY_DISABLE_REMOTE_CODE=true \
RAY_SECURITY_CONFIG_PATH=config.yml \
RAY_USE_TLS=1 \
RAY_TLS_SERVER_CERT=servercert.pem \
RAY_TLS_SERVER_KEY=serverkey.pem \
RAY_TLS_CA_CERT=cacert.pem \
ray start --head --node-ip-address="192.168.137.4" --port="6379" --resources='{"alice": 8}' --include-dashboard=False --disable-usage-stats


the following errors are raised:
(yinyu) [root@alice secretflow_cluster]# RAY_DISABLE_REMOTE_CODE=true RAY_SECURITY_CONFIG_PATH=config.yml RAY_USE_TLS=1 RAY_TLS_SERVER_CERT=servercert.pem RAY_TLS_SERVER_KEY=serverkey.pem RAY_TLS_CA_CERT=cacert.pem ray start --head --node-ip-address="192.168.137.4" --port="6379" --resources='{"alice": 8}' --include-dashboard=False --disable-usage-stats
Usage stats collection is disabled.

Local node IP: 192.168.137.4
E0714 07:48:09.783010834    8132 ssl_transport_security.cc:845] Invalid cert chain file.
E0714 07:48:09.783055834    8132 ssl_security_connector.cc:116] Handshaker factory creation failed with TSI_INVALID_ARGUMENT.
E0714 07:48:09.783061834    8132 chttp2_connector.cc:317]    Failed to create secure subchannel for secure name '192.168.137.4:6379'
E0714 07:48:09.783066534    8132 chttp2_connector.cc:276]    Failed to create channel args during subchannel creation.
E0714 07:48:10.785822614    8132 ssl_transport_security.cc:845] Invalid cert chain file.
E0714 07:48:10.785848214    8132 ssl_security_connector.cc:116] Handshaker factory creation failed with TSI_INVALID_ARGUMENT.
E0714 07:48:10.785853214    8132 chttp2_connector.cc:317]    Failed to create secure subchannel for secure name '192.168.137.4:6379'
E0714 07:48:10.785857714    8132 chttp2_connector.cc:276]    Failed to create channel args during subchannel creation.
E0714 07:48:11.787983491    8132 ssl_transport_security.cc:845] Invalid cert chain file.
E0714 07:48:11.788010492    8132 ssl_security_connector.cc:116] Handshaker factory creation failed with TSI_INVALID_ARGUMENT.
E0714 07:48:11.788015492    8132 chttp2_connector.cc:317]    Failed to create secure subchannel for secure name '192.168.137.4:6379'
E0714 07:48:11.788019492    8132 chttp2_connector.cc:276]    Failed to create channel args during subchannel creation.
E0714 07:48:12.790182069    8132 ssl_transport_security.cc:845] Invalid cert chain file.
E0714 07:48:12.790247570    8132 ssl_security_connector.cc:116] Handshaker factory creation failed with TSI_INVALID_ARGUMENT.
E0714 07:48:12.790256170    8132 chttp2_connector.cc:317]    Failed to create secure subchannel for secure name '192.168.137.4:6379'
E0714 07:48:12.790260470    8132 chttp2_connector.cc:276]    Failed to create channel args during subchannel creation.
E0714 07:48:13.792359247    8132 ssl_transport_security.cc:845] Invalid cert chain file.
E0714 07:48:13.792404647    8132 ssl_security_connector.cc:116] Handshaker factory creation failed with TSI_INVALID_ARGUMENT.
E0714 07:48:13.792410247    8132 chttp2_connector.cc:317]    Failed to create secure subchannel for secure name '192.168.137.4:6379'
E0714 07:48:13.792423847    8132 chttp2_connector.cc:276]    Failed to create channel args during subchannel creation.
E0714 07:48:14.794512125    8132 ssl_transport_security.cc:845] Invalid cert chain file.
E0714 07:48:14.794539725    8132 ssl_security_connector.cc:116] Handshaker factory creation failed with TSI_INVALID_ARGUMENT.
E0714 07:48:14.794545325    8132 chttp2_connector.cc:317]    Failed to create secure subchannel for secure name '192.168.137.4:6379'
E0714 07:48:14.794549725    8132 chttp2_connector.cc:276]    Failed to create channel args during subchannel creation.
2022-07-14 07:48:14,794	WARNING utils.py:1282 -- Unable to connect to GCS at 192.168.137.4:6379. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
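
`Invalid cert chain file` is gRPC reporting that it could not load the server certificate. Since the command passes relative paths such as `servercert.pem`, one thing to check is that each file exists in the working directory and actually contains a PEM block. A rough sketch under that assumption follows; `pem_problem` is a hypothetical helper that only looks for PEM markers and does not validate the certificate chain.

```python
import os
import re
from typing import Optional


def pem_problem(path: str, marker: str = "CERTIFICATE") -> Optional[str]:
    """Return a description of what looks wrong with a PEM file, or None."""
    if not os.path.isfile(path):
        return f"{path}: file not found (cwd={os.getcwd()})"
    with open(path, "r", errors="replace") as f:
        text = f.read()
    # Accept optional prefixes such as "RSA " before the marker.
    begin = re.search(r"-----BEGIN [A-Z ]*" + re.escape(marker) + "-----", text)
    end = re.search(r"-----END [A-Z ]*" + re.escape(marker) + "-----", text)
    if not begin or not end:
        return f"{path}: no {marker} PEM block found"
    return None
```

For the files in this issue, the checks would be `pem_problem("servercert.pem")`, `pem_problem("cacert.pem")` and `pem_problem("serverkey.pem", marker="PRIVATE KEY")`, run from the directory in which `ray start` is invoked.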

Reproduction code to reproduce the issue.

Run the following command:
RAY_DISABLE_REMOTE_CODE=true \
RAY_SECURITY_CONFIG_PATH=config.yml \
RAY_USE_TLS=1 \
RAY_TLS_SERVER_CERT=servercert.pem \
RAY_TLS_SERVER_KEY=serverkey.pem \
RAY_TLS_CA_CERT=cacert.pem \
ray start --head --node-ip-address="192.168.137.4" --port="6379" --resources='{"alice": 8}' --include-dashboard=False --disable-usage-stats



zlib problem when setting up the environment

Steps:

- git clone https://github.com/secretflow/secretflow.git
- conda create -n secretflow python=3.8
- conda activate secretflow
- cd secretflow/
- pip install -r dev-requirements.txt -r requirements.txt

Then I hit a zlib-related error:

File "/xxxxxxx/miniconda3/envs/secretflow/lib/python3.8/zipfile.py", line 1016, in _read1

data = self._decompressor.decompress(data, n)

zlib.error: Error -3 while decompressing data: invalid block type

Have you run into this before? How did you solve it?
Thanks a lot!
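
A zlib.error raised from zipfile.py during pip install often points to a damaged archive, for example a truncated wheel download or a stale cache entry. That is an assumption here, since the thread does not confirm the cause. A minimal Python sketch for checking a downloaded archive follows; archive_is_intact is a hypothetical helper name, not part of pip or SecretFlow.

```python
import os
import tempfile
import zipfile
import zlib


def archive_is_intact(path):
    """Return True if every member of the zip/wheel at `path` decompresses cleanly."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() returns the first corrupt member name, or None if all pass.
            return zf.testzip() is None
    except (zipfile.BadZipFile, zlib.error):
        return False


# Demo on throwaway files: one valid archive, one file of garbage bytes.
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "good.whl")
bad_path = os.path.join(tmp_dir, "bad.whl")
with zipfile.ZipFile(good_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello" * 100)
with open(bad_path, "wb") as fh:
    fh.write(b"not a zip archive")

print(archive_is_intact(good_path))
print(archive_is_intact(bad_path))
```

If an archive fails the check, downloading it again (for pip, retrying with --no-cache-dir) is a reasonable next step.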

sf.reveal hangs on a simple computation in cluster mode

Issue Type

Build/Install

Source

binary

Secretflow Version

latest

OS Platform and Distribution

No response

Python version

No response

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

I have three servers on the public internet: A, B, and C.
A starts Ray as the head node: ray start --head --node-ip-address="172.17.0.12" --port="6379" --resources='{"alice": 8}'
B and C each connect to A's public IP and port:
B: ray start --address="<A's public IP>:6379" --resources='{"bob": 8}'
C: ray start --address="<A's public IP>:6379" --resources='{"charlie": 8}'
ray status shows that the cluster started normally:

(secretflow) ubuntu@VM-0-12-ubuntu:~$ ray status
======== Autoscaler status: 2022-07-19 11:55:11.912311 ========
Node status
---------------------------------------------------------------
Healthy:
 1 node_1b2eea51cd2e55ee525b52f7ab9d1936e39cfe0836dfcf8879385c98
 1 node_de20b9bdc340e6e008389ac8a1e014be3fe6a0d23d9cbe8e19cea057
 1 node_8c9cd0fe5c5a53234aebcc11559d688a88f088dc47c272ee46e20782
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 0.0/2.0 CPU
 0.0/8.0 alice
 0.00/1.744 GiB memory
 0.00/0.872 GiB object_store_memory

Demands:
 (no resource demands)

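My understanding is that I should be able to run the code on node A and have the computation scheduled onto B's device, but I am not sure that is correct. sf.reveal then hangs; the code is below. (I am not very familiar with Ray, so I cannot tell where the problem lies.)
I also have two further questions:
1. How can I verify that the computation was actually handled by B? (For example, are there logs I can look at?)
2. How can I make sure that only C can see the final result?
Any advice is appreciated, thanks!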

Reproduction code to reproduce the issue.

>>> import secretflow as sf
>>> sf.init(address='172.17.0.12:6379')
>>> b = sf.PYU('bob')
>>> import numpy as np
>>> data = b(np.random.rand)(3, 4)
>>> sf.reveal(data)

Pressing Ctrl+C to exit returns this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/driver.py", line 158, in reveal
    value_obj = ray.get(value_ref)
  File "/home/ubuntu/anaconda3/envs/secretflow/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/secretflow/lib/python3.8/site-packages/ray/worker.py", line 1837, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
  File "/home/ubuntu/anaconda3/envs/secretflow/lib/python3.8/site-packages/ray/worker.py", line 364, in get_objects
    data_metadata_pairs = self.core_worker.get_objects(
  File "python/ray/_raylet.pyx", line 1200, in ray._raylet.CoreWorker.get_objects
  File "python/ray/_raylet.pyx", line 169, in ray._raylet.check_status
KeyboardInterrupt
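
One detail in the ray status output above: the Resources section lists alice but no bob or charlie. Assuming SecretFlow pins sf.PYU('bob') tasks to a custom Ray resource named bob (an assumption about the internals, not confirmed in this thread), a task whose resource no node advertises would wait indefinitely, which would look like sf.reveal hanging. The sketch below checks a resource mapping for missing parties; missing_parties is a hypothetical helper, and the sample dict is transcribed from the output above.

```python
def missing_parties(cluster_resources, parties):
    """Return the parties that have no positive custom resource in the cluster."""
    return [p for p in parties if cluster_resources.get(p, 0) <= 0]


# Resource totals transcribed from the `ray status` output in this report.
reported = {"CPU": 2.0, "alice": 8.0}

print(missing_parties(reported, ["alice", "bob", "charlie"]))  # ['bob', 'charlie']
```

On a live cluster the mapping can be obtained from ray.cluster_resources() after connecting with ray.init(address=...).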

Install, build, and code failures

Issue Type

Build/Install

Source

binary

Secretflow Version

secretflow latest version

OS Platform and Distribution

Ubuntu18.04/Ubuntu22.04

Python version

python3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

On Ubuntu 18.04:
Running pip install -U secretflow fails with:
Requirement already satisfied: aiohttp<=4 in ./.local/lib/python3.8/site-packages (from s3fs==2022.1.0->secretflow) (3.8.1)
Collecting aiobotocore~=2.1.0
Downloading http://pypi.doubanio.com/packages/4e/8d/01035d9b56893bd3b5d6eb4505d3ed1383d124b1c9c2b6024c175681c64b/aiobotocore-2.1.2.tar.gz (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.7/58.7 kB 3.0 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
ERROR: Can not execute setup.py since setuptools is not available in the build environment.
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Running pip install -r requirement in the source tree fails with:

Collecting jax==0.3.7
Downloading http://pypi.doubanio.com/packages/d7/ef/8ff361f49244956f48c3528a42c392c31bdbcbb9af5399eba19e153a5c26/jax-0.3.7.tar.gz (944 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 944.2/944.2 kB 3.0 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
ERROR: Can not execute setup.py since setuptools is not available in the build environment.
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.


On Ubuntu 22.04:
Installation completes without errors, but running a simple example fails:
(secretflow) shenghuo@shenghuo-machine:~/src$ python test.py
Traceback (most recent call last):
  File "test.py", line 1, in <module>
    import secretflow as sf
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/__init__.py", line 15, in <module>
    from . import data, device, ml, preprocessing, security, utils
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/data/__init__.py", line 15, in <module>
    from . import horizontal, vertical
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/data/horizontal/__init__.py", line 15, in <module>
    from .dataframe import HDataFrame
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/data/horizontal/dataframe.py", line 21, in <module>
    from secretflow.data.base import DataFrameBase, Partition
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/data/base.py", line 23, in <module>
    from secretflow.device import PYUObject, reveal
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/__init__.py", line 15, in <module>
    from .device import *
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/__init__.py", line 16, in <module>
    from .heu import HEU
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/heu.py", line 17, in <module>
    from secretflow.device.device.spu import PyTreeLeaf
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 31, in <module>
    import spu
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/__init__.py", line 33, in <module>
    from .binding.api import Io, Runtime, compile
  File "/home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/api.py", line 21, in <module>
    from . import _lib
ImportError: /home/shenghuo/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/_lib.so: undefined symbol: _dl_sym, version GLIBC_PRIVATE
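
The failing symbol is tagged GLIBC_PRIVATE, meaning it is internal to glibc rather than part of its public interface, so a binary that depends on it can stop loading when the system's glibc changes. Whether that is what differs between the two Ubuntu releases here is an assumption, not something the thread confirms. A quick way to record which C library version the interpreter is linked against, using only the standard library:

```python
import platform

# libc_ver() returns a (library name, version) pair; either may be an empty string.
lib_name, lib_version = platform.libc_ver()
print(f"C library: {lib_name or 'unknown'} {lib_version or 'unknown'}")
```

Including this line of output for both machines in a report makes the two environments easier to compare.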

Reproduction code to reproduce the issue.

import secretflow as sf
sf.init(['alice', 'bob', 'carol'], num_cpus=8, log_to_driver=True)
dev = sf.PYU('alice')
import numpy as np
data = dev(np.random.rand)(3, 4)
data

Ray fails to start in cluster mode

Issue Type

Build/Install

Source

binary

Secretflow Version

latest

OS Platform and Distribution

No response

Python version

No response

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

Following the official documentation
(https://secretflow.readthedocs.io/en/latest/getting_started/deployment.html):

RAY_DISABLE_REMOTE_CODE=true \
RAY_SECURITY_CONFIG_PATH=config.yml \
RAY_USE_TLS=1 \
RAY_TLS_SERVER_CERT=servercert.pem \
RAY_TLS_SERVER_KEY=serverkey.pem \
RAY_TLS_CA_CERT=cacert.pem \
ray start --head --node-ip-address="192.168.137.4" --port="6379" --resources='{"alice": 8}' --include-dashboard=False --disable-usage-stats

192.168.137.4 and 6379 are the IP and port of the local Redis instance.

Running ray start raises the following error:


(yinyu) [root@alice secretflow_cluster]# ray start --head --node-ip-address="192.168.137.4" --port="6379" --resources='{"alice": 8}' --include-dashboard=False --disable-usage-stats
Usage stats collection is disabled.

Local node IP: 192.168.137.4
2022-07-14 07:02:52,453	WARNING utils.py:1282 -- Unable to connect to GCS at 192.168.137.4:6379. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
2022-07-14 07:03:08,419	WARNING utils.py:1282 -- Unable to connect to GCS at 192.168.137.4:6379. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.
2022-07-14 07:05:37,765	WARNING utils.py:1282 -- Unable to connect to GCS at 192.168.137.4:6379. Check that (1) Ray GCS with matching version started successfully at the specified address, and (2) there is no firewall setting preventing access.


What is causing this? (I have confirmed that Redis itself is reachable.)
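
One thing worth ruling out, as an assumption rather than a confirmed diagnosis: ray start --head launches Ray's own head service on the given --port, so that port has to be free. If a separately running Redis already listens on 192.168.137.4:6379, the two would collide. The sketch below tests whether something already accepts connections on a host and port; port_in_use is a hypothetical helper name.

```python
import socket


def port_in_use(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


# Demo against a listener opened locally on an OS-chosen free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

print(port_in_use("127.0.0.1", demo_port))
listener.close()
```

For the report above, the call would be port_in_use("192.168.137.4", 6379), run before ray start.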

Error occurs when running the demo on a different dataset

Issue Type

Bug

Source

binary

Secretflow Version

latest

OS Platform and Distribution

ubuntu 18.04

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

When I ran the demo with my own, much larger dataset, the following error occurred.

2022-07-14 09:14:38,001	WARNING worker.py:1416 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff86b5b6b3ddbc49317ecc0c5801000000 Worker ID: 5e383a84459651f44846404f76675b3a8dd9e9ac626ff4e286e39455 Node ID: e0f67ccadcf8945daacf7665de1242222467fc5e633957d4a381300d Worker IP address: 172.17.0.2 Worker port: 41807 Worker PID: 10310
Traceback (most recent call last):
  File "jax_fk.py", line 178, in <module>
    params = sf.reveal(params_spu)
  File "/usr/local/lib/python3.8/site-packages/secretflow/device/driver.py", line 158, in reveal
    value_obj = ray.get(value_ref)
  File "/usr/local/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/ray/worker.py", line 1845, in get
    raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
	class_name: SPURuntime
	actor_id: 86b5b6b3ddbc49317ecc0c5801000000
	pid: 10310
	namespace: 3b837c97-994a-4894-bdbc-f3e070b54a9e
	ip: 172.17.0.2
The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR_EXIT
(SPURuntime pid=10311) I0714 09:14:38.084491 10471 external/com_github_brpc_brpc/src/brpc/socket.cpp:2202] Checking Socket{id=0 addr=127.0.0.1:36067} (0x563ea08fe600)

The demo URL is https://secretflow.readthedocs.io/en/latest/tutorial/nn_with_spu.html.
Dataset info:
x1 is : [[0.00291545 0.1        0.         ... 0.         0.41176471 0.7852172 ]
 [0.00291545 0.1        0.         ... 0.         0.41176471 0.7852172 ]
 [0.00291545 0.1        0.00980392 ... 0.         0.41176471 1.        ]
 ...
 [0.         0.         0.         ... 0.         0.41176471 0.7852172 ]
 [0.         0.         0.         ... 0.         0.41176471 0.5       ]
 [0.0058309  0.1        0.01960784 ... 0.         0.41176471 0.7852172 ]]
x1 type : <class 'numpy.ndarray'>
x1 dtype : float64
x1 shape : (106501, 400)
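
The thread does not establish why the worker died. One common cause of a worker being killed is memory exhaustion, so, purely as an assumption, it can help to estimate the plaintext footprint of the inputs first. The sketch below does the arithmetic for the x1 shape and dtype printed above; array_nbytes is a hypothetical helper, and the secret-shared computation inside SPU may need considerably more than this plaintext figure.

```python
import numpy as np


def array_nbytes(shape, dtype):
    """Bytes needed to hold a dense array of the given shape and dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


# Shape and dtype as printed in the report for x1.
x1_bytes = array_nbytes((106501, 400), np.float64)
print(x1_bytes, "bytes =", round(x1_bytes / 2**20, 1), "MiB")
```

Comparing this figure, multiplied by the number of batches and intermediate tensors held at once, against the memory available to each worker gives a first sanity check.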

Reproduction code to reproduce the issue.

import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import time
from jax.example_libraries import stax
from jax.example_libraries.stax import (
    Dense,
    Relu,
    Softplus,
    Dropout,
    Sigmoid,
)

import jax
import jax.numpy as jnp
from jax.example_libraries import optimizers, stax


def load_fk_dataset(party_id) -> (np.ndarray, np.ndarray):
    if party_id == 1:
        data = pd.read_csv("./data/train_fk_cu.csv")
        features = data.drop(["example_id"], axis=1)
        return features.to_numpy(), None
    else:
        data = pd.read_csv("./data/train_fk_jd.csv")
        labels = data["label"]
        features = data.drop(["label", "example_id"], axis=1)
        return features.to_numpy(), labels.to_numpy()
    
    
def load_train_dataset(party_id=None) -> (np.ndarray, np.ndarray):
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    X_train, _, y_train, _ = train_test_split(
        features, label, test_size=0.8, random_state=42
    )

    if party_id:
        if party_id == 1:
            return X_train[:, 15:], _
        else:
            return X_train[:, :15], y_train
    else:
        return X_train, y_train

def load_test_dataset():
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    _, X_test, _, y_test = train_test_split(
        features, label, test_size=0.8, random_state=42
    )
    return X_test, y_test


def MLP():
    nn_init, nn_apply = stax.serial(
        Dense(128*2),
        Relu,
        Dense(710),
        Relu,
        Dense(400),
        Relu,
        Dense(100),
        Softplus,
        Dense(50),
        Softplus,
        Dense(1),
        Sigmoid, 
    )

    return nn_init, nn_apply


KEY = jax.random.PRNGKey(0)
INPUT_SHAPE = (-1,731)

def init_state(learning_rate):
    init_fun, _ = MLP()
    _, params_init = init_fun(KEY, INPUT_SHAPE)
    opt_init, _, _ = optimizers.sgd(learning_rate)
    opt_state = opt_init(params_init)
    return opt_state

def train(
    train_x1,
    train_x2,
    train_y,
    opt_state,
    learning_rate,
    epochs,
    batch_size,
):
    train_x = jnp.concatenate([train_x1, train_x2], axis=1)

    _, predict_fun = MLP()
    _, opt_update, get_params = optimizers.sgd(learning_rate)

    def update_model(state, imgs, labels, i):
        def mse(y, pred):
            return jnp.mean(jnp.multiply(y - pred, y - pred) / 2.0)

        def loss_func(params):
            y = predict_fun(params, imgs)
            return mse(y, labels), y

        grad_fn = jax.value_and_grad(loss_func, has_aux=True)
        (loss, y), grads = grad_fn(get_params(state))
        return opt_update(i, grads, state)
    for i in range(1, epochs + 1):
        begin = time.time()
        # Integer division keeps the section count an int for array_split.
        imgs_batchs = jnp.array_split(train_x, len(train_x) // batch_size, axis=0)
        labels_batchs = jnp.array_split(train_y, len(train_y) // batch_size, axis=0)

        for batch_idx, (batch_images, batch_labels) in enumerate(
            zip(imgs_batchs, labels_batchs)
        ):
            opt_state = update_model(opt_state, batch_images, batch_labels, i)
        end = time.time()
        print("epoch-{} cost time:{} s".format(i, end - begin))
    return get_params(opt_state)
from sklearn.metrics import roc_auc_score


def validate_model(params, X_test, y_test):
    _, predict_fun = MLP()
    y_pred = predict_fun(params, X_test)
    return roc_auc_score(y_test, y_pred)

if __name__ == "__main__":
    import jax

    # Load the data
    x1, _ = load_fk_dataset(party_id=1)
    print("x1 is :", x1)
    print("x1 type :", type(x1))
    print("x1 dtype :", x1.dtype)
    print("x1 shape :", x1.shape)
    x2, y = load_fk_dataset(party_id=2)


    # Hyperparameter
    batch_size = 100
    epochs = 10
    learning_rate = 0.1


    # Load the data
    import secretflow as sf

    # In case you have a running secretflow runtime already.
    sf.shutdown()

    sf.init(['alice', 'bob'], num_cpus=16, log_to_driver=True)

    alice, bob = sf.PYU('alice'), sf.PYU('bob')
    spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob']))

    x1, _ = alice(load_fk_dataset)(party_id=1)
    x2, y = bob(load_fk_dataset)(party_id=2)


    device = spu
    x1_, x2_, y_ = x1.to(device), x2.to(device), y.to(device)
    init_params_ = sf.to(spu, lambda: init_state(learning_rate))
    begin = time.time()
    params_spu = spu(train, static_argnames=['learning_rate', 'epochs', 'batch_size'])(
        x1_, x2_, y_,init_params_, learning_rate=learning_rate, epochs=epochs, batch_size=batch_size
    )

    params = sf.reveal(params_spu)
    print("train cost time:",time.time() - begin)
    # print(params)
    X_test, y_test = load_test_dataset()
    auc = validate_model(params, X_test, y_test)
    print(f'auc={auc}')

hdf.mean in tutorial/DataFrame.ipynb appears to have output different from true value

Issue Type

Documentation Bug

Source

binary

Secretflow Version

0.6.13b1

OS Platform and Distribution

Anolis OS release 8.6

Python version

3.8.12

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

I was looking at the tutorial's DataFrame.ipynb and noticed that hdf.mean(numeric_only=True) and vdf.mean(numeric_only=True) return different results. Is this expected behavior?

I thought the mean would be the same whether the data is partitioned horizontally or vertically. Am I misunderstanding something?

Thinking it might be a typo in the documentation, I tried to reproduce it using spu's Dockerfile and confirmed that the output matches the documentation:

>>> hdf.mean(numeric_only=True)
sepal length (cm)    1.168667
sepal width (cm)     0.611467
petal length (cm)    0.751600
petal width (cm)     0.239867
target               0.200000
dtype: float64

>>> vdf.mean(numeric_only=True)
sepal length (cm)    5.843333
sepal width (cm)     3.057333
petal length (cm)    3.758000
petal width (cm)     1.199333
target               1.000000
dtype: float64

>>> data.mean(numeric_only=True)
sepal length (cm)    5.843333
sepal width (cm)     3.057333
petal length (cm)    3.758000
petal width (cm)     1.199333
target               1.000000
dtype: float64
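
For reference, a mean over horizontally partitioned rows should equal the global mean when computed as the sum of per-partition column sums divided by the total row count. The sketch below checks that with NumPy on synthetic data, then compares the figures printed above: every hdf value is one fifth of the corresponding vdf value. It says nothing about where that factor comes from inside SecretFlow; horizontal_mean is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def horizontal_mean(partitions):
    """Column means over row-wise partitions: total sum / total row count."""
    total = sum(p.sum(axis=0) for p in partitions)
    count = sum(p.shape[0] for p in partitions)
    return total / count


# Synthetic data split unevenly into two row partitions.
rng = np.random.default_rng(0)
data = rng.random((150, 5))
parts = [data[:70], data[70:]]
print(np.allclose(horizontal_mean(parts), data.mean(axis=0)))

# Values copied from the outputs in this report.
hdf_mean = np.array([1.168667, 0.611467, 0.751600, 0.239867, 0.200000])
vdf_mean = np.array([5.843333, 3.057333, 3.758000, 1.199333, 1.000000])
ratio = vdf_mean / hdf_mean
print(np.round(ratio, 3))
```

A constant ratio across all columns suggests a scaling error in the aggregation rather than a difference in the underlying data.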

Reproduction code to reproduce the issue.

Can be reproduced by running https://github.com/secretflow/secretflow/blob/beta/docs/tutorial/DataFrame.ipynb

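Is there an example of a vertical federated regression algorithm?

The examples on the official site are all horizontal: "Federate Learning for Image Classification" and "Federate Xgboosts".
One example uses vertically partitioned data, but the algorithm is MPC-based: "Logistic Regression with SPU".

Is there an example of a vertical federated regression algorithm?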

Ray network lifecycle: how to make the head node highly available?

Issue Type

Others

Source

source

Secretflow Version

beta

OS Platform and Distribution

Centos7

Python version

3.8

Bazel version

No response

GCC/Compiler version

No response

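What happened and what you expected to happen.

Should the Ray network be created first, or should the task be run first?
If the Ray network is created first and the head node goes down, how can high availability be achieved?

1. Create the Ray network
ray start --head --node-ip-address="192.168.137.3" --port="7000" --resources='{"alice": 8}'
ray start --address="172.16.4.140:6379" --resources='{"bob": 8}'

2. Run the task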
import secretflow as sf
import spu

sf.shutdown()
sf.init(address="<alice's ip>:8881")
alice = sf.PYU('alice')
bob = sf.PYU('bob')
device = sf.SPU({
……
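
Note that the snippet above uses three different addresses: the head listens on 192.168.137.3:7000, the worker joins 172.16.4.140:6379, and sf.init points at port 8881. Assuming the intent is a single Ray cluster, the worker's --address and the driver's sf.init(address=...) would both normally name the head's node IP and port. The sketch below derives all three from one value; cluster_commands is a hypothetical helper that only builds strings and does not start Ray.

```python
import json


def cluster_commands(head_ip, head_port, head_party, worker_party, slots=8):
    """Build matching head/worker commands and the driver address string."""
    address = f"{head_ip}:{head_port}"
    head = (
        f'ray start --head --node-ip-address="{head_ip}" --port="{head_port}" '
        f"--resources='{json.dumps({head_party: slots})}'"
    )
    worker = (
        f'ray start --address="{address}" '
        f"--resources='{json.dumps({worker_party: slots})}'"
    )
    return head, worker, address


head_cmd, worker_cmd, driver_address = cluster_commands(
    "192.168.137.3", 7000, "alice", "bob"
)
print(head_cmd)
print(worker_cmd)
print(f"sf.init(address={driver_address!r})")
```

This only addresses consistency of the addresses; it does not by itself provide high availability for the head node.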

Get data timeout, key=root:110:ALLGATHER

Issue Type

Others

Source

binary

Secretflow Version

latest

OS Platform and Distribution

ubuntu 18.04

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

2022-07-28 16:16:13,219 ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.run() (pid=13081, ip=10.100.82.74, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f1fd47b1220>)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 224, in run
    self.runtime.run(executable)
  File "/home/ops/anaconda3/envs/secretflow/lib/python3.8/site-packages/spu/binding/api.py", line 43, in run
    return self._vm.Run(executable.SerializeToString())
RuntimeError: what: 
        [external/yasl/yasl/link/transport/channel.cc:86] Get data timeout, key=root:110:ALLGATHER
stacktrace: 
#0 yasl::link::Context::RecvInternal()+0x7f202eb100b2
#1 yasl::link::AllGatherImpl<>()+0x7f202e9c8785
#2 yasl::link::AllGather()+0x7f202e9c8cb4
#3 spu::mpc::Communicator::allReduce()+0x7f202e2c7a37
#4 spu::mpc::semi2k::B2A_Randbit::proc()::{lambda()#1}::operator()()::{lambda()#3}::operator()()+0x7f202e2bd9f2
#5 spu::mpc::semi2k::B2A_Randbit::proc()+0x7f202e2c0a89
#6 spu::mpc::UnaryKernel::evaluate()+0x7f202e19efdb
#7 spu::mpc::Object::call<>()+0x7f202e2c60b8
#8 spu::mpc::(anonymous namespace)::_Lazy2A()+0x7f202e2dfb19
#9 spu::mpc::ABProtAddSP::proc()+0x7f202e2e019b
#10 spu::mpc::BinaryKernel::evaluate()+0x7f202e19f2f2
#11 spu::mpc::Object::call<>()+0x7f202e2c6866
#12 spu::mpc::add_sp()+0x7f202e2c6994
#13 spu::hal::_add_sp()+0x7f202e171b63
#14 spu::hal::_add()+0x7f202e167486
#15 spu::hal::_popcount()+0x7f202e168b8c

Reproduction code to reproduce the issue.

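I hit the error above while running three-party logistic regression. It seems to be related to the amount of training data. If the code is left unchanged, is upgrading the machines or adding compute nodes the only way to address this?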

Installation error

Installation fails with:
image

Doesn't the server need 8 cores and 16 GB of memory?

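Question about secure aggregation

image

As shown in the figure, both the "secure aggregation module" and "defining a function that is compiled into a static computation graph" achieve aggregation. What is the difference between the two?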

ABY3 loss function error

Issue Type

Others

Source

binary

Secretflow Version

beta

OS Platform and Distribution

Centos7

Python version

3.8

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

#!/usr/bin/env python
# coding: utf-8

# In[9]:


import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def load_train_dataset(party_id=None) -> (np.ndarray, np.ndarray):
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    X_train, _, y_train, _ = train_test_split(
        features, label, test_size=0.8, random_state=42
    )

    if party_id:
        if party_id == 1 or party_id == 3:
            return X_train[:, 15:], _
        else:
            return X_train[:, :15], y_train
    else:
        return X_train, y_train


def load_test_dataset():
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    _, X_test, _, y_test = train_test_split(
        features, label, test_size=0.8, random_state=42
    )
    return X_test, y_test


# In[10]:


import jax.numpy as jnp
from jax import grad, jit, vmap
from jax import random


def sigmoid(x):
    return 1 / (1 + jnp.exp(-x))


# Outputs probability of a label being true.
def predict(W, b, inputs):
    return sigmoid(jnp.dot(inputs, W) + b)


# Training loss is the negative log-likelihood of the training examples.
def loss(W, b, inputs, targets):
    preds = predict(W, b, inputs)
    label_probs = preds * targets + (1 - preds) * (1 - targets)
    return -jnp.mean(jnp.log(label_probs))


# In[11]:


from jax import value_and_grad

def train_step(W, b, x1, x2, x3, y, learning_rate):
    x = jnp.concatenate([x1, x2, x3], axis=1)
    loss_value, Wb_grad = value_and_grad(loss, (0, 1))(W, b, x, y)
    W -= learning_rate * Wb_grad[0]
    b -= learning_rate * Wb_grad[1]
    return loss_value, W, b


# In[12]:


def fit(W, b, x1, x2, x3, y, epochs=1, learning_rate=1e-2):
    losses = jnp.array([])
    for _ in range(epochs):
        l, W, b = train_step(W, b, x1, x2, x3, y, learning_rate=learning_rate)
        losses = jnp.append(losses, l)
    return losses, W, b


# In[13]:


from sklearn.metrics import roc_auc_score

def validate_model(W, b, X_test, y_test):
    y_pred = predict(W, b, X_test)
    return roc_auc_score(y_test, y_pred)


# In[14]:


import matplotlib.pyplot as plt

def plot_losses(losses):
    plt.plot(np.arange(len(losses)), losses)
    plt.xlabel('epoch')
    plt.ylabel('loss')


# In[15]:


import secretflow as sf
import spu

# In case you have a running secretflow runtime already.
sf.shutdown()

sf.init(address='172.16.4.140:6379', _redis_password='')
alice, bob, charlie = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('charlie')

device = sf.SPU({
    'nodes': [
                {
                    'party': 'alice',
                    'id': '140:0',
                    # The address for other peers.
                    'address': '172.16.4.140:8881',
                    # The listen address of this node.
                    # Optional. Address will be used if listen_address is empty.
                    # 'listen_address': ''
                },
                {
                    'party': 'bob',
                    'id':'141:0',
                    'address': '172.16.4.141:8881',
                    # 'listen_address': ''
                },
                {
                    'party': 'charlie',
                    'id':'142:0',
                    'address': '172.16.4.142:8881',
                    # 'listen_address': ''
                }
            ],
            'runtime_config': {
            'protocol': spu.spu_pb2.ABY3,
            'field': spu.spu_pb2.FM128,
            'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL
        }
})





# sf.init(['alice', 'bob', 'charlie'], num_cpus=8, log_to_driver=True)

# alice, bob, charlie = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('charlie')
# spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob', 'charlie']))


# In[16]:


x1, _ = alice(load_train_dataset)(party_id=1)
x2, y = bob(load_train_dataset)(party_id=2)
x3, _ = charlie(load_train_dataset)(party_id=3)

x1, x2, x3, y


# In[17]:


W = jnp.zeros((30,))
b = 0.0

W_, b_, x1_, x2_, x3_, y_ = (
    sf.to(device, W),
    sf.to(device, b),
    x1.to(device),
    x2.to(device),
    x3.to(device),
    y.to(device),
)


# In[18]:


losses, W_, b_ = device(fit, static_argnames=['epochs'], num_returns=3)(
    W_, b_, x1_, x2_, x3_, y_, epochs=10, learning_rate=1e-2
)

losses, W_, b_


# In[19]:


get_ipython().run_line_magic('matplotlib', 'inline')

losses = sf.reveal(losses)

plot_losses(losses)


# In[ ]:

Reproduction code to reproduce the issue.

An exception is thrown when execution reaches the loss computation in In[18]:
2022-08-01 21:06:21,049	ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.run() (pid=16468, ip=172.16.4.140, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f38e1bd0be0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 982, in value_and_grad_f
    ans, vjp_py = _vjp(f_partial, *dyn_args, reduce_axes=reduce_axes)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 2441, in _vjp
    out_primal, out_vjp = ad.vjp(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 129, in vjp
    out_primals, pvals, jaxpr, consts = linearize(traceable, *primals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 116, in linearize
    jaxpr, out_pvals, consts = pe.trace_to_jaxpr(jvpfun_flat, in_pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 606, in trace_to_jaxpr
    jaxpr, (out_pvals, consts, env) = fun.call_wrapped(pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 466, in cache_miss
    out_flat = xla.xla_call(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 344, in process_call
    result = call_primitive.bind(f_jvp, *primals, *nonzero_tangents, **new_params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 216, in process_call
    out = primitive.bind(_update_annotation(f_, f.in_type, in_knowns),
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1534, in process_call
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 656, in dot
    raise TypeError("Incompatible shapes for dot: got {} and {}.".format(
jax._src.traceback_util.UnfilteredStackTrace: TypeError: Incompatible shapes for dot: got (113, 45) and (30,).

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=16468, ip=172.16.4.140, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f38e1bd0be0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
TypeError: Incompatible shapes for dot: got (113, 45) and (30,).
2022-08-01 21:06:21,054	ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.run() (pid=16468, ip=172.16.4.140, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f38e1bd0be0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 982, in value_and_grad_f
    ans, vjp_py = _vjp(f_partial, *dyn_args, reduce_axes=reduce_axes)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 2441, in _vjp
    out_primal, out_vjp = ad.vjp(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 129, in vjp
    out_primals, pvals, jaxpr, consts = linearize(traceable, *primals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 116, in linearize
    jaxpr, out_pvals, consts = pe.trace_to_jaxpr(jvpfun_flat, in_pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 606, in trace_to_jaxpr
    jaxpr, (out_pvals, consts, env) = fun.call_wrapped(pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 466, in cache_miss
    out_flat = xla.xla_call(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 344, in process_call
    result = call_primitive.bind(f_jvp, *primals, *nonzero_tangents, **new_params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 216, in process_call
    out = primitive.bind(_update_annotation(f_, f.in_type, in_knowns),
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1534, in process_call
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 656, in dot
    raise TypeError("Incompatible shapes for dot: got {} and {}.".format(
jax._src.traceback_util.UnfilteredStackTrace: TypeError: Incompatible shapes for dot: got (113, 45) and (30,).

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=16468, ip=172.16.4.140, repr=<secretflow.device.device.spu.SPURuntime object at 0x7f38e1bd0be0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
TypeError: Incompatible shapes for dot: got (113, 45) and (30,).

2022-08-01 21:06:21,305	ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.run() (pid=32105, ip=172.16.4.141, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fdfbc16cac0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 982, in value_and_grad_f
    ans, vjp_py = _vjp(f_partial, *dyn_args, reduce_axes=reduce_axes)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 2441, in _vjp
    out_primal, out_vjp = ad.vjp(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 129, in vjp
    out_primals, pvals, jaxpr, consts = linearize(traceable, *primals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 116, in linearize
    jaxpr, out_pvals, consts = pe.trace_to_jaxpr(jvpfun_flat, in_pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 606, in trace_to_jaxpr
    jaxpr, (out_pvals, consts, env) = fun.call_wrapped(pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 466, in cache_miss
    out_flat = xla.xla_call(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 344, in process_call
    result = call_primitive.bind(f_jvp, *primals, *nonzero_tangents, **new_params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 216, in process_call
    out = primitive.bind(_update_annotation(f_, f.in_type, in_knowns),
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1534, in process_call
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 656, in dot
    raise TypeError("Incompatible shapes for dot: got {} and {}.".format(
jax._src.traceback_util.UnfilteredStackTrace: TypeError: Incompatible shapes for dot: got (113, 45) and (30,).

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=32105, ip=172.16.4.141, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fdfbc16cac0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
TypeError: Incompatible shapes for dot: got (113, 45) and (30,).

2022-08-01 21:06:21,314	ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.run() (pid=32105, ip=172.16.4.141, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fdfbc16cac0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 982, in value_and_grad_f
    ans, vjp_py = _vjp(f_partial, *dyn_args, reduce_axes=reduce_axes)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 2441, in _vjp
    out_primal, out_vjp = ad.vjp(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 129, in vjp
    out_primals, pvals, jaxpr, consts = linearize(traceable, *primals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 116, in linearize
    jaxpr, out_pvals, consts = pe.trace_to_jaxpr(jvpfun_flat, in_pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 606, in trace_to_jaxpr
    jaxpr, (out_pvals, consts, env) = fun.call_wrapped(pvals)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/api.py", line 466, in cache_miss
    out_flat = xla.xla_call(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/ad.py", line 344, in process_call
    result = call_primitive.bind(f_jvp, *primals, *nonzero_tangents, **new_params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 216, in process_call
    out = primitive.bind(_update_annotation(f_, f.in_type, in_knowns),
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1771, in bind
    return call_bind(self, fun, *args, **params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/core.py", line 1787, in call_bind
    outs = top_trace.process_call(primitive, fun_, tracers, params)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1534, in process_call
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 656, in dot
    raise TypeError("Incompatible shapes for dot: got {} and {}.".format(
jax._src.traceback_util.UnfilteredStackTrace: TypeError: Incompatible shapes for dot: got (113, 45) and (30,).

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=32105, ip=172.16.4.141, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fdfbc16cac0>)
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_15144/2203275946.py", line 4, in fit
  File "/tmp/ipykernel_15144/4019579140.py", line 5, in train_step
  File "/tmp/ipykernel_15144/1730905303.py", line 17, in loss
  File "/tmp/ipykernel_15144/1730905303.py", line 12, in predict
  File "/opt/software/anaconda3/envs/secretflow/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py", line 2692, in dot
    return lax.dot(a, b, precision=precision)
TypeError: Incompatible shapes for dot: got (113, 45) and (30,).
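
The traceback above comes down to a feature-count mismatch: the input has 45 columns while the weight vector has 30 entries. A minimal sketch of a check that could run before training, assuming the model computes a dot product of a 2-D input with a 1-D weight vector (the helper name `check_dot_shapes` is hypothetical and is not part of secretflow or JAX):

```python
def check_dot_shapes(x_shape, w_shape):
    """Return the result shape of dot(X, W), or raise a readable error."""
    if len(x_shape) != 2 or len(w_shape) != 1:
        raise ValueError(
            f"expected a 2-D input and 1-D weights, got {x_shape} and {w_shape}"
        )
    n_features = x_shape[1]
    if n_features != w_shape[0]:
        raise ValueError(
            f"input has {n_features} features but weights have {w_shape[0]} "
            f"entries; initialise the weights with shape ({n_features},)"
        )
    # One prediction per input row.
    return (x_shape[0],)


# Shapes taken from the traceback above.
try:
    check_dot_shapes((113, 45), (30,))
except ValueError as err:
    print(err)
```

With real arrays, the shapes would come from `X.shape` and `W.shape`; the fix in this case would be to size the weights from the concatenated feature count rather than hard-coding it.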

Error after switching datasets in the logistic regression computation

Issue Type

Bug

Source

source

Secretflow Version

secretflow latest version

OS Platform and Distribution

MacOS 13.0

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [8], in <cell line: 22>()
     20 # Validate the model
     21 X_test, y_test = load_test_dataset()
---> 22 auc=validate_model(W,b, X_test, y_test)
     23 print(f'auc={auc}')
     24 print(type(x1))

Input In [5], in validate_model(W, b, X_test, y_test)
      4 def validate_model(W, b, X_test, y_test):
      5     y_pred = predict(W, b, X_test)
----> 6     return roc_auc_score(y_test, y_pred)

File ~/opt/anaconda3/envs/secretflow/lib/python3.8/site-packages/sklearn/metrics/_ranking.py:560, in roc_auc_score(y_true, y_score, average, sample_weight, max_fpr, multi_class, labels)
    553         raise ValueError(
    554             "Partial AUC computation not available in "
    555             "multiclass setting, 'max_fpr' must be"
    556             " set to `None`, received `max_fpr={0}` "
    557             "instead".format(max_fpr)
    558         )
    559     if multi_class == "raise":
--> 560         raise ValueError("multi_class must be in ('ovo', 'ovr')")
    561     return _multiclass_roc_auc_score(
    562         y_true, y_score, labels, multi_class, average, sample_weight
    563     )
    564 elif y_type == "binary":

ValueError: multi_class must be in ('ovo', 'ovr')

Reproduction code to reproduce the issue.

%matplotlib inline

# Load the data
x1, _ = load_train_dataset(party_id=1)
x2, y = load_train_dataset(party_id=2)

# Hyperparameter
W = jnp.zeros((4,))
b = 0.0
epochs = 10
learning_rate = 1e-2
 

# Train the model
losses, W, b = fit(W, b, x1, x2, y, epochs=100, learning_rate=1e-2)

# Plot the loss
plot_losses(losses)

# Validate the model
X_test, y_test = load_test_dataset()
auc=validate_model(W,b, X_test, y_test)
print(f'auc={auc}')
print(type(x1))
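
The `ValueError` above is raised on the branch that `roc_auc_score` takes when it treats the problem as multiclass. One common cause after switching datasets is a label column with more than two distinct values. A minimal sketch of a check to run before `validate_model`, in plain Python and assuming the labels are a 1-D sequence (the helper name `is_binary_labels` is hypothetical):

```python
def is_binary_labels(labels):
    """Return True when a 1-D label sequence has at most two distinct values."""
    return len(set(labels)) <= 2


# Illustrative label lists, not data from the report.
print(is_binary_labels([0, 1, 1, 0]))
print(is_binary_labels([0, 1, 2, 1]))
```

If the check fails, the labels would need to be binarised, or `roc_auc_score` would need an explicit `multi_class` value together with per-class scores.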

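Privacy SQL usage documentation

The SecretFlow architecture appears to support secure aggregation analysis through SQL. Are there any related examples or documentation?

Newbie question about TEE

Which part of the SecretFlow codebase covers TEE, and which algorithms have been implemented on top of TEE?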

Install Error

Issue Type

Build/Install

Source

source

Secretflow Version

latest

OS Platform and Distribution

No response

Python version

No response

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

When I use "pip install -U secretflow":
ERROR: Could not find a version that satisfies the requirement secretflow (from versions: none)
ERROR: No matching distribution found for secretflow

Reproduction code to reproduce the issue.

pip install -U secretflow

aggr = SecureAggregator(device=alice, participants=[alice, bob]) runs indefinitely

Issue Type

Others

Source

binary

Secretflow Version

beta

OS Platform and Distribution

No response

Python version

No response

Bazel version

No response

GCC/Compiler version

No response

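What happened and what you expected to happen.

Following the example in the official documentation, execution reaches aggr = SecureAggregator(device=alice, participants=[alice, bob]) and then runs indefinitely without producing a result. How can I track down the problem?

Reproduction code to reproduce the issue.

Following the example in the official documentation, execution reaches aggr = SecureAggregator(device=alice, participants=[alice, bob]) and then runs indefinitely without producing a result. How can I track down the problem?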

Newbie question: the difference between naming and not naming resources in ray start

Issue Type

Others

Source

source

Secretflow Version

beta

OS Platform and Distribution

Centos7

Python version

3.8

Bazel version

No response

GCC/Compiler version

No response

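What happened and what you expected to happen.

In the official SecretFlow tutorial, both the resources passed to ray start and the Ray init call in the code carry a name (alice, bob, charlie), but none of the examples in Ray's official tutorial specify a name in resources.
Is there any practical difference between the two cases (named and unnamed)?

Here is my own understanding. Without names, Ray is a single cluster with no concept of a party: the functions in a task are scheduled across different nodes of the cluster, so each node runs a subset of the functions.

Only once names are assigned does the concept of a party exist: a task is submitted to the different parties for execution, and each party runs all of the functions in the task?

Is this understanding correct? See the figure below.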

image

Reproduction code to reproduce the issue.

ray start --head --node-ip-address="192.168.137.3" --port="7000" --resources='{"alice": 8}' 
ray start --address="172.16.4.140:6379" --resources='{"bob": 8}'
ray start --address="172.16.4.140:6379" --resources='{"charlie": 8}'

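Proof of trustworthiness

Does secretflow provide an example for verifying trustworthiness? I noticed that the plaintext can be obtained from the SPU device simply by calling the reveal method.

Error running two-party PSI: how do I resolve "No available node types can fulfill resource request"?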

Issue Type

Others

Source

source

Secretflow Version

secretflow latest version

OS Platform and Distribution

macos 13.0

Python version

3.8.13

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

(scheduler +27s) Error: No available node types can fulfill resource request {'bob': 1.0, 'CPU': 1.0}. Add suitable node types to this cluster to resolve this issue.
(scheduler +27s) Error: No available node types can fulfill resource request {'alice': 1.0, 'CPU': 1.0}. Add suitable node types to this cluster to resolve this issue.

Reproduction code to reproduce the issue.

import secretflow as sf
alice, bob = sf.PYU('alice'), sf.PYU('bob')
spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob']))
input_path = {alice: '.data/alice.csv', bob: '.data/bob.csv'}
output_path = {alice: '.data/alice_psi.csv', bob: '.data/bob_psi.csv'}
spu.psi_csv('uid', input_path, output_path)
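
The scheduler message says that no node in the cluster advertises a custom resource named `alice` or `bob`. A toy model of that matching rule, assuming a request can be placed only when a single node offers every requested resource in sufficient quantity (a simplified illustration, not Ray's actual scheduler; `can_fulfill` and both example clusters are hypothetical):

```python
def can_fulfill(nodes, request):
    """Return True if at least one node offers every requested resource."""
    return any(
        all(node.get(name, 0) >= amount for name, amount in request.items())
        for node in nodes
    )


# A cluster started without custom resources versus one that labels its nodes.
plain_cluster = [{"CPU": 8}]
labelled_cluster = [{"CPU": 8, "alice": 8}, {"CPU": 8, "bob": 8}]

# The request reported in the scheduler error above.
request = {"bob": 1.0, "CPU": 1.0}

print(can_fulfill(plain_cluster, request))
print(can_fulfill(labelled_cluster, request))
```

Under this model, the reported error corresponds to the first case: the party names used by `sf.PYU('alice')` and `sf.PYU('bob')` have no matching resource on any node.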

Can a party be assigned only one server?

Issue Type

Others

Source

source

Secretflow Version

beta

OS Platform and Distribution

Centos7

Python version

3.8

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

When setting up the Ray cluster I deviated slightly from the official example: I gave each party 2 servers, so the three parties have 6 servers in total:
(secretflow) [root@fateonspark140 ~]# ray status
======== Autoscaler status: 2022-07-28 12:21:29.600048 ========
Node status
---------------------------------------------------------------
Healthy:
 1 node_4241b4efcb4c09ce8f883ce0d012b871afedfa5bb74fd4a5eb0076c4
 1 node_ebf3803eb0f45c335b87b6f5545da48353aac0de806c7c06cd954528
 1 node_367f61e86335ee89578d341a09b78f903ef6ebd4fdc32624e22874e3
 1 node_3832e719372baff08ce3ea1c296a29ce3b2f4f4ba7bfb96faf8723ec
 1 node_24628ec7113a604915dab58daf115b90f574807fd235c3b48789c0af
 1 node_a41c00e81d5074eb5753d1e4b0f9fe183da7c677697664d4dc77214a
Pending:
 (no pending nodes)
Recent failures:
 (no failures)

Resources
---------------------------------------------------------------
Usage:
 2.0/72.0 CPU
 1.0/16.0 alice
 1.0/16.0 bob
 0.0/16.0 charlie
 0.00/61.334 GiB memory
 0.00/27.034 GiB object_store_memory

Demands:
 (no resource demands)
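
The totals of 16.0 per party in the status output are consistent with each party name being declared on two nodes with a quantity of 8 each. A minimal sketch of that bookkeeping, assuming every node was started with a JSON `--resources` value like the ones in the official example (the per-node list is illustrative and was not read from the cluster):

```python
import json
from collections import Counter

# Assumed --resources value of each of the six nodes (two per party).
node_resource_args = [
    '{"alice": 8}', '{"alice": 8}',
    '{"bob": 8}', '{"bob": 8}',
    '{"charlie": 8}', '{"charlie": 8}',
]

# Sum the declared quantities per resource name across all nodes.
totals = Counter()
for arg in node_resource_args:
    totals.update(json.loads(arg))

print(dict(totals))
```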


However, when running MPC on the SPU, I found that each party can specify only one address, as shown here:

sf.init(address='172.16.4.140:6379', _redis_password='')
alice, bob = sf.PYU('alice'), sf.PYU('bob')

device = sf.SPU({
    'nodes': [
        {
            'party': 'alice',
            'id': '140:0',
            # The address for other peers.
            'address': '172.16.4.140:8881',
            # The listen address of this node.
            # Optional. Address will be used if listen_address is empty.
            # 'listen_address': ''
        },
        {
            'party': 'bob',
            'id': '141:0',
            'address': '172.16.4.141:8881',
            # 'listen_address': ''
        },
    ],
    'runtime_config': {
        'protocol': spu.spu_pb2.SEMI2K,
        'field': spu.spu_pb2.FM128,
        'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL,
    },
})


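My questions are:
1. Can one party have multiple servers?
2. Can a task on a given party run on only one server, given that the address in the code is not an array?
3. Is specifying the address mandatory? As shown below, can't the Ray compute framework pick a server by itself? Since alice is already specified, can't it choose one of the servers that alice owns?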
{
     'party': 'alice',
     'id': '140:0',
      # The address for other peers.
      'address': '172.16.4.140:8881',
      # The listen address of this node.
      # Optional. Address will be used if listen_address is empty.
      # 'listen_address': ''
},

4. Can two tasks run at the same time? If so, what happens when both are assigned to the same server?

Reproduction code to reproduce the issue.

import secretflow as sf
import spu

sf.init(address='172.16.4.140:6379', _redis_password='')
alice, bob = sf.PYU('alice'), sf.PYU('bob')

device = sf.SPU({
    'nodes': [
        {
            'party': 'alice',
            'id': '140:0',
            # The address for other peers.
            'address': '172.16.4.140:8881',
            # The listen address of this node.
            # Optional. Address will be used if listen_address is empty.
            # 'listen_address': ''
        },
        {
            'party': 'bob',
            'id': '141:0',
            'address': '172.16.4.141:8881',
            # 'listen_address': ''
        },
    ],
    'runtime_config': {
        'protocol': spu.spu_pb2.SEMI2K,
        'field': spu.spu_pb2.FM128,
        'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL,
    },
})
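
The `nodes` list in the config above repeats the same three keys for every party. A minimal sketch of a helper that builds that list from a party-to-address mapping, which keeps the one-address-per-node shape of the config unchanged (the helper name `build_spu_nodes` and its id scheme are hypothetical, not part of secretflow):

```python
def build_spu_nodes(party_addresses):
    """Build the 'nodes' list of an SPU cluster config from {party: 'host:port'}."""
    nodes = []
    for index, (party, address) in enumerate(party_addresses.items()):
        nodes.append({
            'party': party,
            # Hypothetical id scheme; the post uses ids such as '140:0'.
            'id': f'{party}:{index}',
            # The address other peers use to reach this node.
            'address': address,
        })
    return nodes


nodes = build_spu_nodes({
    'alice': '172.16.4.140:8881',
    'bob': '172.16.4.141:8881',
})
print(nodes)
```

The result could be passed as the `'nodes'` value of the dictionary given to `sf.SPU`, alongside the same `runtime_config`.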

Could the tutorial provide an example of HEU computation?

Issue Type

Documentation Feature Request

Source

source

Secretflow Version

latest

OS Platform and Distribution

No response

Python version

No response

Bazel version

No response

GCC/Compiler version

No response

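What happened and what you expected to happen.

From the source code, HEU initialization looks quite different from SPU initialization, and I could not find a dedicated tutorial. Could you provide a small example (such as addition) to make it easier to get started? Thanks.

Reproduction code to reproduce the issue.

Same as above.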

pip3 install -U secretflow

Issue Type

Build/Install

Source

binary

Secretflow Version

secretflow latest version

OS Platform and Distribution

Ubuntu 22.04

Python version

3.8.13

Bazel version

0

GCC/Compiler version

0

What happened and what you expected to happen.

ERROR: Could not find a version that satisfies the requirement secretflow (from versions: none)
ERROR: No matching distribution found for secretflow

Reproduction code to reproduce the issue.

pip3 install -U secretflow

pytorch: error when using sf.reveal to fetch model parameters

Issue Type

Others

Source

source

Secretflow Version

0.6.13b1

OS Platform and Distribution

Rocky Linux release 8.5

Python version

3.8.12

Bazel version

No response

GCC/Compiler version

No response

What happened and what you expected to happen.

(_run pid=1977253) 2022-07-20 11:11:55.857356: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
(_run pid=1977254) 2022-07-20 11:11:56.079320: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
(pid=1977495) 2022-07-20 11:11:56.253085: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
(_run pid=1977250) 2022-07-20 11:11:56.365197: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
(pid=1977496) 2022-07-20 11:11:56.503680: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
(_run pid=1977253) 2022-07-20 11:11:56,939,939 WARNING [xla_bridge.py:backends:265] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
(_run pid=1977254) 2022-07-20 11:11:57,120,120 WARNING [xla_bridge.py:backends:265] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
(SPURuntime pid=1977495) I0720 11:11:57.312257 1977495 external/com_github_brpc_brpc/src/brpc/server.cpp:1065] Server[yasl::link::internal::ReceiverServiceImpl] is serving on port=46567.
(SPURuntime pid=1977495) I0720 11:11:57.312313 1977495 external/com_github_brpc_brpc/src/brpc/server.cpp:1068] Check out http://localhost.localdomain:46567 in web browser.
(_run pid=1977250) 2022-07-20 11:11:57,401,401 WARNING [xla_bridge.py:backends:265] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
(SPURuntime pid=1977495) I0720 11:11:57.413011 1977723 external/com_github_brpc_brpc/src/brpc/socket.cpp:2202] Checking Socket{id=0 addr=127.0.0.1:58775} (0x65ef180)
(SPURuntime pid=1977496) I0720 11:11:57.516299 1977496 external/com_github_brpc_brpc/src/brpc/server.cpp:1065] Server[yasl::link::internal::ReceiverServiceImpl] is serving on port=58775.
(SPURuntime pid=1977496) I0720 11:11:57.516340 1977496 external/com_github_brpc_brpc/src/brpc/server.cpp:1068] Check out http://localhost.localdomain:58775 in web browser.
(SPURuntime pid=1977495) I0720 11:12:00.413251 1977722 external/com_github_brpc_brpc/src/brpc/socket.cpp:2262] Revived Socket{id=0 addr=127.0.0.1:58775} (0x65ef180) (Connectable)
2022-07-20 11:12:01,757	ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.run() (pid=1977495, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fc1af31ea00>)
IndexError: tuple index out of range

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977495, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fc1af31ea00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/ray/workers/default_worker.py", line 238, in <module>
    ray.worker.global_worker.main_loop()
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 528, in __len__
    return self.aval._len(self)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 1335, in _len
    raise TypeError("len() of unsized object") from err  # same as numpy error
jax._src.traceback_util.UnfilteredStackTrace: TypeError: len() of unsized object

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977495, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fc1af31ea00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
TypeError: len() of unsized object
---------------------------------------------------------------------------
RayTaskError(TypeError)                   Traceback (most recent call last)
Input In [273], in <cell line: 2>()
      1 w = device(abc)(x1_,x2_,y_)
----> 2 model_w = sf.reveal(w)
      3 model_w

File ~/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/driver.py:158, in reveal(func_or_object)
    155             value_ref.append(value.device.sk_keeper.decrypt.remote(value.data))
    156         value_idx.append(i)
--> 158 value_obj = ray.get(value_ref)
    159 idx = 0
    160 for i in value_idx:

File ~/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/ray/_private/client_mode_hook.py:105, in client_mode_hook.<locals>.wrapper(*args, **kwargs)
    103     if func.__name__ != "init" or is_client_mode_enabled_by_default:
    104         return getattr(ray, func.__name__)(*args, **kwargs)
--> 105 return func(*args, **kwargs)

File ~/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/ray/worker.py:1843, in get(object_refs, timeout)
   1841     worker.core_worker.dump_object_store_memory_usage()
   1842 if isinstance(value, RayTaskError):
-> 1843     raise value.as_instanceof_cause()
   1844 else:
   1845     raise value

RayTaskError(TypeError): ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
IndexError: tuple index out of range

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/ray/workers/default_worker.py", line 238, in <module>
    ray.worker.global_worker.main_loop()
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 528, in __len__
    return self.aval._len(self)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 1335, in _len
    raise TypeError("len() of unsized object") from err  # same as numpy error
jax._src.traceback_util.UnfilteredStackTrace: TypeError: len() of unsized object

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
TypeError: len() of unsized object

(SPURuntime pid=1977495) WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
(SPURuntime pid=1977496) WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
2022-07-20 11:12:06,945	ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.get_var() (pid=1977495, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fc1af31ea00>)
  At least one of the input arguments for this task could not be computed:
ray.exceptions.RayTaskError: ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
IndexError: tuple index out of range

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/ray/workers/default_worker.py", line 238, in <module>
    ray.worker.global_worker.main_loop()
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 528, in __len__
    return self.aval._len(self)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 1335, in _len
    raise TypeError("len() of unsized object") from err  # same as numpy error
jax._src.traceback_util.UnfilteredStackTrace: TypeError: len() of unsized object

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
TypeError: len() of unsized object
2022-07-20 11:12:06,946	ERROR worker.py:94 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::SPURuntime.get_var() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
  At least one of the input arguments for this task could not be computed:
ray.exceptions.RayTaskError: ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
IndexError: tuple index out of range

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/ray/workers/default_worker.py", line 238, in <module>
    ray.worker.global_worker.main_loop()
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/api.py", line 807, in computation_maker
    jaxpr, out_avals, consts = pe.trace_to_jaxpr_dynamic(jaxtree_fun, avals)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/_src/profiler.py", line 206, in wrapper
    return func(*args, **kwargs)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1779, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/interpreters/partial_eval.py", line 1816, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers_)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/linear_util.py", line 168, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 528, in __len__
    return self.aval._len(self)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/jax/core.py", line 1335, in _len
    raise TypeError("len() of unsized object") from err  # same as numpy error
jax._src.traceback_util.UnfilteredStackTrace: TypeError: len() of unsized object

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

ray::SPURuntime.run() (pid=1977496, ip=10.180.216.25, repr=<secretflow.device.device.spu.SPURuntime object at 0x7fe6544a6a00>)
  File "/root/.pyenv/versions/3.8.12/envs/secret_flow/lib/python3.8/site-packages/secretflow/device/device/spu.py", line 196, in run
    cfn, output = jax.xla_computation(fn, return_shape=True)(*args, **kwargs)
  File "/tmp/ipykernel_1290442/1368379063.py", line 2, in abc
TypeError: len() of unsized object
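
For anyone reading the trace: the final `TypeError: len() of unsized object` is the same message NumPy gives when `len()` is called on a zero-dimensional array (the JAX source line in the trace even notes "same as numpy error"), and the chained `IndexError: tuple index out of range` is consistent with indexing into an empty shape tuple. The snippet below is only a minimal NumPy illustration of that message. The helper `try_len` and both sample arrays are made up for the example, and it does not confirm what shape the traced argument had inside SecretFlow.

```python
import numpy as np


def try_len(value):
    """Return len(value), or the TypeError text if the value has no length."""
    try:
        return len(value)
    except TypeError as err:
        return f"TypeError: {err}"


# Hypothetical inputs: a 0-d array (shape == ()) and a 2-d array (shape == (3, 2)).
scalar_like = np.array(1.0)
matrix_like = np.zeros((3, 2))

scalar_result = try_len(scalar_like)  # no leading dimension, so len() fails
matrix_result = try_len(matrix_like)  # len() is the size of the first axis

print(scalar_result)
print(matrix_result)
```
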

Reproduction code for the issue:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def load_train_dataset(party_id=None) -> (np.ndarray, np.ndarray):
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    X_train, _, y_train, _ = train_test_split(
        features, label, test_size=0.8, random_state=42
    )

    if party_id:
        if party_id == 1:
            return X_train[:, 15:], _
        else:
            return X_train[:, :15], y_train
    else:
        return X_train, y_train


def load_test_dataset():
    features, label = load_breast_cancer(return_X_y=True)
    scaler = StandardScaler()
    features = scaler.fit_transform(features)
    _, X_test, _, y_test = train_test_split(
        features, label, test_size=0.8, random_state=42
    )
    return X_test, y_test

import secretflow as sf

# In case you have a running secretflow runtime already.
sf.shutdown()

sf.init(['alice', 'bob'], num_cpus=8, log_to_driver=True)

alice, bob = sf.PYU('alice'), sf.PYU('bob')
spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob']))

x1, _ = alice(load_train_dataset)(party_id=1)
x2, y = bob(load_train_dataset)(party_id=2)

device = spu
x1_, x2_, y_ = (
    x1.to(device),
    x2.to(device),
    y.to(device),
)
x1_, x2_, y_

import torch
from torch import nn
import numpy
class Lr(nn.Module):
    def __init__(self):
        super(Lr, self).__init__()  # inherit the parent class's __init__ parameters
        self.linear = nn.Linear(30, 1)
 
    def forward(self, x1, x2):
        x = numpy.concatenate([x1, x2], axis=1)
        out = self.linear(torch.tensor(x))
        return out

def abc(x1, x2, y):
    x1 = torch.tensor(x1,dtype=torch.float64)
    x2 = torch.tensor(x2,dtype=torch.float64)
    y = torch.tensor(y,dtype=torch.float64)
    model = Lr() 
    criterion = nn.MSELoss() 
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    for i in range(100):
        model = model.double()
        y_predict = model(x1, x2) 
        loss = criterion(torch.tensor(np.expand_dims(y,axis=1),dtype=torch.float64),y_predict) 
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return dict(model.state_dict())['linear.weight'].numpy()

w = device(abc)(x1_,x2_,y_)
model_w = sf.reveal(w)
model_w
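
The traceback shows `SPURuntime.run` passing the function to `jax.xla_computation`, so the arguments arrive as JAX tracers and `torch.tensor(x1, ...)` on the first line of `abc` cannot convert them. A possible direction, offered as a sketch rather than a confirmed fix, is to express the same training loop with `jax.numpy` only. The function name `train_linear`, the zero initialisation (PyTorch's `nn.Linear` initialises randomly), and the toy data are assumptions for illustration. `jax.make_jaxpr` is used here only to check that the function can be traced; it has not been run against SecretFlow.

```python
import jax
import jax.numpy as jnp


def train_linear(x1, x2, y, lr=1e-3, steps=100):
    """Linear model trained with MSE and plain gradient descent, using JAX ops only."""
    x = jnp.concatenate([x1, x2], axis=1)        # join the two feature blocks
    y = y.reshape(-1, 1).astype(x.dtype)         # column vector, same dtype as x
    w = jnp.zeros((x.shape[1], 1), dtype=x.dtype)
    b = jnp.zeros((1,), dtype=x.dtype)

    def mse(params):
        w_, b_ = params
        return jnp.mean((x @ w_ + b_ - y) ** 2)

    grad_fn = jax.grad(mse)
    for _ in range(steps):                       # unrolled when the function is traced
        gw, gb = grad_fn((w, b))
        w, b = w - lr * gw, b - lr * gb
    return w, b


# Toy data standing in for the two 15-column feature blocks and the labels.
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
x1 = jax.random.normal(k1, (8, 15))
x2 = jax.random.normal(k2, (8, 15))
y = jnp.array([0, 1, 0, 1, 1, 0, 1, 0])

w, b = train_linear(x1, x2, y)

# Tracing succeeds because the body contains only JAX-traceable operations.
jaxpr = jax.make_jaxpr(train_linear)(x1, x2, y)
```

If this pattern applies, the call would mirror the one above, for example `device(train_linear)(x1_, x2_, y_)` followed by `sf.reveal(...)` on the result; treat that as an assumption to verify against the SPU documentation.
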
