ropensci / cyphr
Humane encryption
Home Page: https://docs.ropensci.org/cyphr
License: Other
https://en.wikipedia.org/wiki/Encryptr
I have to start googling before making repos.
Thank you for this package!
With the "data encryption" workflow in https://docs.ropensci.org/cyphr/articles/data.html, is there a way to ensure deterministic encryption (i.e., same input leads to same output)? Currently:
key <- cyphr::key_sodium(raw(32))
cyphr::encrypt_data(raw(3), key)
#> [1] 17 79 70 f4 cd cf 44 04 75 92 51 af 15 43 d8 d9 41 5a ae af b1 48 1a 34 83
#> [26] a6 b0 d6 2e b3 55 58 d8 24 71 ec 14 87 c2 23 9e 4f b9
cyphr::encrypt_data(raw(3), key)
#> [1] 6e 1d 2c f8 ca e4 51 3c 73 98 49 ab 68 c0 a9 72 ec bd 23 1d 45 9e e8 9f 4b
#> [26] 68 d4 74 fc b4 58 a4 61 b8 3a 4d 36 ec 74 53 49 c0 f9
Created on 2024-01-18 with reprex v2.0.2
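A possible explanation, offered as a sketch rather than a statement about cyphr's internals: the 43-byte outputs above look consistent with a 24-byte random nonce plus a 16-byte authentication tag plus the 3 message bytes, so every call would differ by design. Calling sodium directly (not cyphr) with a fixed nonce gives reproducible output. The fixed nonce here is an assumption for illustration only; reusing a nonce with the same key for different messages weakens the encryption.

```r
library(sodium)

key <- raw(32)   # all-zero key as in the report above; never use in practice
msg <- raw(3)

# Default behaviour: a fresh random nonce per call, so ciphertexts differ.
a <- data_encrypt(msg, key)
b <- data_encrypt(msg, key)
identical(a, b)

# Fixed nonce (illustrative assumption): the ciphertext becomes reproducible.
nonce <- raw(24)
c1 <- data_encrypt(msg, key, nonce = nonce)
c2 <- data_encrypt(msg, key, nonce = nonce)
identical(c1, c2)

# Decryption reads the nonce stored as an attribute of the ciphertext.
identical(data_decrypt(c1, key), msg)
```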
this is a departure from the current interface but would allow the "data" workflow to be a bit more seamless
For convenience, I am trying to do
save_enc_rds <- function(...) {
cyphr::encrypt(saveRDS(...), key)
}
However, this fails with
Error in match.call(defn, expr, expand.dots = TRUE) :
... used in a situation where it does not exist
I suspect this may be due to the use of substitute here (but I may be wrong):
Lines 57 to 59 in b718ca5
How can I get the kind of function wrapper I am looking for?
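One way to sidestep the non-standard evaluation entirely, sketched under the assumption that an encrypted serialised object is acceptable in place of a true saveRDS() file: cyphr::encrypt_object() and cyphr::decrypt_object() take ordinary arguments, so they wrap cleanly. The helper names save_enc_rds and read_enc_rds are hypothetical.

```r
library(cyphr)

key <- cyphr::key_sodium(sodium::keygen())

# Hypothetical helpers (names are illustrative, not part of cyphr).
save_enc_rds <- function(object, file, key) {
  # Serialise `object` and write the encrypted bytes to `file`.
  cyphr::encrypt_object(object, key, dest = file)
  invisible(file)
}

read_enc_rds <- function(file, key) {
  # A character `data` argument is treated as a filename to read.
  cyphr::decrypt_object(file, key)
}

path <- tempfile(fileext = ".rds")
save_enc_rds(mtcars, path, key)
restored <- read_enc_rds(path, key)
identical(restored, mtcars)
```

Note that the file written this way is not a compressed RDS file underneath, so it has to be read back with decrypt_object() rather than decrypt(readRDS(...)).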
Steph sorted out a pkgdown site for distcrete so use that same approach
Maybe consider Blobcrypt?
-- 1. Failure: nse 2 (@test-encrypt-wrapper.R#36) -----------------------------
decrypt(read.csv(filename), x) not equal to `iris`.
Component "Species": Modes: character, numeric
Component "Species": Attributes: < target is NULL, current is list >
Component "Species": target is character, current is factor
== testthat results ===========================================================
[ OK: 270 | SKIPPED: 2 | WARNINGS: 0 | FAILED: 1 ]
1. Failure: nse 2 (@test-encrypt-wrapper.R#36)
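This failure looks like the change in R 4.0.0, where the default for stringsAsFactors became FALSE, so read.csv() returns Species as character while iris stores it as a factor. A minimal base R sketch of the difference, without any encryption involved:

```r
path <- tempfile(fileext = ".csv")
write.csv(iris, path, row.names = FALSE)

# Default since R 4.0.0: strings stay character.
class(read.csv(path)$Species)

# Explicit conversion restores the factor column used by `iris`.
dat <- read.csv(path, stringsAsFactors = TRUE)
class(dat$Species)
all.equal(dat, iris)
```

Passing stringsAsFactors = TRUE in the test (or comparing against a character version of iris) should make the expectation independent of the R version.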
I am trying to request access to data on another machine. On that machine I did:
cyphr::data_request_access(data_dir)
and got the expected "A request has been added..." message. I then switched back to my main machine and tried to authorise the request there, but got the following error message:
cyphr::data_admin_authorise(data_dir)
Error in value[[3L]](cond) :
Signature of data does not match for [key]
It would be nice to have some diagrams for the vignettes (see ropensci/software-review#114 (comment))
I am currently trying to figure out the performance of the cyphr package on large datasets. It seems that for data that cannot be compressed well (random strings), cyphr::encrypt() soon reaches a memory limit (10 M rows, 2 columns, one of which is a long string of 500 characters). This limit seems to be independent of available system RAM and OS: I tested with 8 GB, 16 GB and 32 GB on Windows 10 and 170 GB on a Linux cluster, and saveRDS() always ran without problems, whereas cyphr::encrypt(saveRDS()) threw an error.
In the reprex, about 3.5 GB of RAM are used according to the RStudio memory usage report and writing the unencrypted compressed RDS file takes about 3.3 GB of storage.
This reprex takes about 3 minutes to run on a normal PC.
# packages
library(cyphr)
library(stringi)
# creating a data.frame with long random strings
rows <- 1E7
str_len <- 500 #length of strings
str_n <- 1000 #number of different strings
rand_strings <- stringi::stri_rand_strings(str_n, str_len)
large_data <- data.frame(
id = 1:rows,
year = sample(1980:2020, size = rows, replace = TRUE),
long_str = sample(rand_strings, size = rows, replace = TRUE)
)
# To do anything we first need a key:
key <- cyphr::key_sodium(sodium::keygen())
# Save large file unencrypted to figure out compressed size
# saveRDS(large_data, "myfile.rds")
# fs::file_size("myfile.rds")
# this file is about 3.3 GB when written unencrypted to disk (standard compression of rds)
# be careful, running this command will take about 3-10 minutes, before error is thrown
# Save large data with encryption
cyphr::encrypt(saveRDS(large_data, "myfile_encr.rds"), key)
#> Error in encrypt(msg, key()): lange Vektoren noch nicht unterstützt: memory.c:3887
# --> Error: Error in encrypt(msg, key()) : long vectors not supported yet: memory.c:3887
Created on 2022-06-10 by the reprex package (v2.0.1)
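A possible workaround, sketched under the assumption that the error comes from encrypting one raw vector that is too long (which the "long vectors not supported" message suggests): split the data frame into row chunks and encrypt each chunk to its own file. The helpers save_enc_chunks and read_enc_chunks and the chunk file naming are hypothetical, not part of cyphr.

```r
library(cyphr)

# Hypothetical helper: encrypt a data frame in row chunks, one file per chunk.
save_enc_chunks <- function(data, dir, key, chunk_rows = 1e6) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  rows <- seq_len(nrow(data))
  idx <- split(rows, ceiling(rows / chunk_rows))
  for (i in seq_along(idx)) {
    dest <- file.path(dir, sprintf("chunk_%05d.rds", i))
    cyphr::encrypt_object(data[idx[[i]], , drop = FALSE], key, dest)
  }
  invisible(length(idx))
}

# Hypothetical helper: decrypt all chunks and bind them back together.
read_enc_chunks <- function(dir, key) {
  files <- sort(list.files(dir, pattern = "^chunk_[0-9]+\\.rds$",
                           full.names = TRUE))
  out <- do.call(rbind, lapply(files, cyphr::decrypt_object, key = key))
  rownames(out) <- NULL
  out
}

# Small demonstration with iris instead of the 10 M row data frame.
key <- cyphr::key_sodium(sodium::keygen())
dir <- tempfile()
n_chunks <- save_enc_chunks(iris, dir, key, chunk_rows = 40)
restored <- read_enc_chunks(dir, key)
all.equal(restored, iris)
```

The chunk size would need tuning so that each serialised chunk stays well below the size at which the error appears.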
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.0 (2022-04-22 ucrt)
#> os Windows 10 x64 (build 19044)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate German_Germany.utf8
#> ctype German_Germany.utf8
#> tz Europe/Berlin
#> date 2022-06-10
#> pandoc 2.17.1.1 @ C:/Program Files/RStudio/bin/quarto/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
#> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
#> cyphr * 1.1.2 2021-05-17 [1] CRAN (R 4.2.0)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
#> evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> highr 0.9 2021-04-16 [1] CRAN (R 4.2.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.2.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.2.0)
#> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.2.0)
#> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.2.0)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.2.0)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
#> rlang 1.0.2 2022-03-04 [1] CRAN (R 4.2.0)
#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> sodium 1.2.0 2021-10-21 [1] CRAN (R 4.2.0)
#> stringi * 1.7.6 2021-11-29 [1] CRAN (R 4.2.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.2.0)
#> tibble 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
#> xfun 0.31 2022-05-10 [1] CRAN (R 4.2.0)
#> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0)
#>
#> [1] C:/Users/ga27jar/AppData/Local/R/win-library/4.2
#> [2] C:/Program Files/R/R-4.2.0/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
RStudio has changed the default generated SSH key to ED25519, which means that the key file name has also changed and the key can no longer be found by the default internal cyphr function openssl_find_pubkey. I would suggest to:
add id_ed25519.pub to the default algorithm to find keys inside the ~\.ssh folder OR
At least a sodium installation issue. It would be nice to try the new-style build, but that might require sodium libs...
There's a chance that this is a problem on Windows, so do a windows test first.
Similar to the template, we should be able to request additional metadata (e.g., human-readable name, email address) and store it against the key request and the key.
We would like to use this package, but in our setting it is not possible to use a hidden directory to store the credentials. Is it possible to make this configurable (with ".cyphr" as the default value), via options() for example?
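What such a setting could look like, as a base R sketch only: the option name cyphr.dir_name and the helper cyphr_dir_name() are hypothetical, and cyphr does not necessarily read any such option today.

```r
# Hypothetical helper: look up the directory name from an option,
# falling back to the current hard-coded default.
cyphr_dir_name <- function() {
  getOption("cyphr.dir_name", default = ".cyphr")
}

default_name <- cyphr_dir_name()

# A user who cannot use hidden directories would set the option once.
old <- options(cyphr.dir_name = "cyphr_keys")
custom_name <- cyphr_dir_name()

# Restore the previous settings.
options(old)
```

Reading the option at call time, rather than at package load, means it can be changed within a session.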
Hello! Thank you for cyphr.
I have an encrypted string that was encrypted by PHP.
What I need: to decrypt this string in R.
What i have:
Encrypted data for test: "atJpjpKSv8SWEVovBE17K/+szAMyoJMC"
Decrypted data for test: "My secret string"
Encryption algorithm:
Blowfish
cipher: 'bf-cbc'
hashType: 'sha1'
encryptOptions: OPENSSL_ZERO_PADDING|OPENSSL_RAW_DATA;
secretKey: "asdasdni087uiweiuhHJKnoHkasdHUIB"
I tried some packages (PKI, cyphr, safer), but I couldn't decrypt atJpjpKSv8SWEVovBE17K/+szAMyoJMC to My secret string.
I need to get My secret string from atJpjpKSv8SWEVovBE17K/+szAMyoJMC.
...or at least not quite right. Give a full worked example here I think
In the .cyphr/README.md, it says:
Files in keys/ are encrypted copies of the (symmetric) data key, encrypted with different users' private keys.
As I understand the workflow, these files are encrypted with the public keys that users provide to the admin. Do I have that right?
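In public-key schemes generally, data is encrypted with the recipient's public key and decrypted with their private key, which supports the reading in the question. A generic illustration with the openssl package follows; it shows the principle only and is not a description of how cyphr stores the files in keys/.

```r
library(openssl)

# A user's key pair; the public half is what they would send to the admin.
user_key <- rsa_keygen()
user_pub <- user_key$pubkey

# A symmetric data key, as the admin would hold it.
data_key <- rand_bytes(32)

# The admin encrypts the data key with the user's PUBLIC key...
enc <- rsa_encrypt(data_key, user_pub)

# ...and only the holder of the PRIVATE key can recover it.
dec <- rsa_decrypt(enc, user_key)
identical(dec, data_key)
```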
Thanks for the helpful package.
More specifically, I'm selfishly interested in Debian-based systems. So far it looks like libsodium-dev is not in the standard repos. Compiling from source following:
https://download.libsodium.org/doc/installation/index.html
works for me. System:
Linux Swift 3.16.0-38-generic #52~14.04.1-Ubuntu SMP Fri May 8 09:43:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Using Jeroen's openssl package, the general config interface can stay the same.
cc: @gaborcsardi -- would that be enough for use case? Happy to chat about this Monday.
In earlier versions of the README (3a20372 and earlier) I wrote:
## Why not a connection object?
A proper connection could be nice but there are at least three issues stopping this:
1. `sodium` does not support streaming encryption/decryption. It might be possible (bindings to node and swift have it). In general this would be great and allow the sort of cool things you can do with streaming large data in curl.
2. R plays pretty loose and free with creating connections when given a filename; `readRDS`/`saveRDS` will open files with decompression on in binary mode, `read.csv`/`write.csv` don't. `write.table` adds encoding information when opening the connection object. The logic around what happens is entirely within the functions themselves so is hard to capture in a general way.
3. Connection objects look like a pain to write.
There are still problems with the approach I've taken:
* Appending does not work: we'd need to unencrypt the file first for that to be OK
* Non-file arguments are going to suck (though it's possible that something could be done to detect connections)
In the end, you can always write things out however you like and use `encrypt_file` to encrypt the file afterwards.
This issue is just to record that proper connections might be nice to have
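The fallback mentioned at the end of the quoted README text (write the file however you like, then encrypt it afterwards) can be sketched like this. The file names are arbitrary temporary paths.

```r
library(cyphr)

key <- cyphr::key_sodium(sodium::keygen())

# Write the plain file with any writer, then encrypt it as a second step.
plain <- tempfile(fileext = ".csv")
enc <- tempfile(fileext = ".csv.enc")
write.csv(iris, plain, row.names = FALSE)
cyphr::encrypt_file(plain, key, dest = enc)
unlink(plain)  # remove the unencrypted copy

# Reading works in reverse: decrypt to a file, then use any reader.
out <- tempfile(fileext = ".csv")
cyphr::decrypt_file(enc, key, dest = out)
iris2 <- read.csv(out, stringsAsFactors = TRUE)
unlink(out)
all.equal(iris2, iris)
```

The cost of this approach is that an unencrypted copy exists on disk for a short time, which a real connection object would avoid.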
Hi, I have been experimenting with the cyphr package and have hit the memory limit with large .RData files. As an alternative, the arrow package offers partitioning of large data when writing files. I tried to register a new method for arrow::write_dataset(), but using it with cyphr::encrypt() results in a "Permission denied" error (any of the other built-in write functions of cyphr works, however). A reprex with iris is below.
# packages
library(cyphr)
library(arrow)
#>
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#>
#> timestamp
# To do anything we first need a key:
key <- cyphr::key_sodium(sodium::keygen())
# Register new method for arrow::write_dataset()
cyphr::rewrite_register("arrow", "write_dataset", "path")
ls(cyphr:::db)
#> [1] "arrow::write_dataset" "base::load" "base::readLines"
#> [4] "base::readRDS" "base::save" "base::saveRDS"
#> [7] "base::writeLines" "readxl::read_excel" "readxl::read_xls"
#> [10] "readxl::read_xlsx" "utils::read.csv" "utils::read.csv2"
#> [13] "utils::read.delim" "utils::read.delim2" "utils::read.table"
#> [16] "utils::write.csv" "utils::write.csv2" "utils::write.table"
#> [19] "writexl::write_xlsx"
# Trying to encrypt with cyphr results in error message of denied permissions
cyphr::encrypt(write_dataset(iris, tempfile(), partitioning = c("Species")),
key)
#> Warning in file(con, "rb"): cannot open file 'C:
#> \Users\ga27jar\AppData\Local\Temp\RtmpKw7PXv\filed4c33d93cd0d4c2d2f10cf'
#> Permission denied
#> Error in file(con, "rb"): cannot open the connection
#> Warning in file.remove(paths[ok]): cannot remove file 'C:
#> \Users\ga27jar\AppData\Local\Temp\RtmpKw7PXv\filed4c33d93cd0d4c2d2f10cf'
#> 'Permission denied'
Created on 2022-06-09 by the reprex package (v2.0.1)
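arrow::write_dataset() writes a directory of files rather than a single file, which may be why opening the path as a file connection fails here. A possible workaround, sketched with a hypothetical helper encrypt_dir(): write the dataset unencrypted to a temporary directory, then encrypt every file in it with cyphr::encrypt_file().

```r
library(cyphr)

# Hypothetical helper: encrypt each file under `dir` in place,
# adding an ".enc" suffix and removing the plain copy.
encrypt_dir <- function(dir, key) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  files <- files[!grepl("\\.enc$", files)]  # skip already encrypted files
  for (f in files) {
    cyphr::encrypt_file(f, key, dest = paste0(f, ".enc"))
    file.remove(f)
  }
  invisible(paste0(files, ".enc"))
}

key <- cyphr::key_sodium(sodium::keygen())
out_dir <- tempfile()

if (requireNamespace("arrow", quietly = TRUE)) {
  # The partitioned dataset from the reprex above.
  arrow::write_dataset(iris, out_dir, partitioning = "Species")
} else {
  # Stand-in so the sketch runs without arrow: one CSV per species.
  dir.create(out_dir)
  for (s in levels(iris$Species)) {
    write.csv(iris[iris$Species == s, ], file.path(out_dir, paste0(s, ".csv")))
  }
}

encrypted <- encrypt_dir(out_dir, key)
```

Reading would need the reverse step (decrypt each file to a temporary directory) before the dataset can be opened again.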
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.0 (2022-04-22 ucrt)
#> os Windows 10 x64 (build 19044)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate German_Germany.utf8
#> ctype German_Germany.utf8
#> tz Europe/Berlin
#> date 2022-06-09
#> pandoc 2.17.1.1 @ C:/Program Files/RStudio/bin/quarto/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> arrow * 8.0.0 2022-05-09 [1] CRAN (R 4.2.0)
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
#> bit 4.0.4 2020-08-04 [1] CRAN (R 4.2.0)
#> bit64 4.0.5 2020-08-30 [1] CRAN (R 4.2.0)
#> cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
#> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
#> cyphr * 1.1.2 2021-05-17 [1] CRAN (R 4.2.0)
#> DBI 1.1.2 2021-12-20 [1] CRAN (R 4.2.0)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0)
#> dplyr 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
#> evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
#> generics 0.1.2 2022-01-31 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> highr 0.9 2021-04-16 [1] CRAN (R 4.2.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.2.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.2.0)
#> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.2.0)
#> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.2.0)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
#> rlang 1.0.2 2022-03-04 [1] CRAN (R 4.2.0)
#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> sodium 1.2.0 2021-10-21 [1] CRAN (R 4.2.0)
#> stringi 1.7.6 2021-11-29 [1] CRAN (R 4.2.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.2.0)
#> tibble 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
#> tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.0)
#> tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
#> xfun 0.31 2022-05-10 [1] CRAN (R 4.2.0)
#> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0)
#>
#> [1] C:/Users/ga27jar/AppData/Local/R/win-library/4.2
#> [2] C:/Program Files/R/R-4.2.0/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
We planned to use the cyphr wrappers for openssl, since we do not have sodium installed and cannot install it because libsodium is not available on the machine in question. One should be able to install cyphr without having sodium installed, and probably vice versa (cyphr without openssl).
I believe this means moving them to Suggests: and implementing if (require(sodium)) logic in the code.
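A sketch of what that guard could look like. It uses requireNamespace() instead of the require() mentioned above, since it checks availability without attaching the package; the helper names require_backend and make_sodium_key are hypothetical.

```r
# Hypothetical helper: fail with a clear message when an optional
# backend package (listed under Suggests:) is not installed.
require_backend <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is needed for this key type; please install it",
                 pkg), call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical constructor: sodium is only touched when this is called,
# so the rest of the package would load without it.
make_sodium_key <- function() {
  require_backend("sodium")
  sodium::keygen()
}

# A missing package produces an error rather than a failed install.
res <- try(require_backend("notARealPackage123"), silent = TRUE)
inherits(res, "try-error")
```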
Really enjoying this package, and it is working great for my purposes, but I noticed the warning and am somewhat worried. May I know when version 1.0.0 is expected, so that we can rely on backward compatibility? Thank you!