Deterministic encryption

Thank you for this package!

With the "data encryption" workflow, is there a way to ensure deterministic encryption (i.e., the same input always produces the same output)? Currently:

key <- cyphr::key_sodium(raw(32))
cyphr::encrypt_data(raw(3), key)
#>  [1] 17 79 70 f4 cd cf 44 04 75 92 51 af 15 43 d8 d9 41 5a ae af b1 48 1a 34 83
#> [26] a6 b0 d6 2e b3 55 58 d8 24 71 ec 14 87 c2 23 9e 4f b9
cyphr::encrypt_data(raw(3), key)
#>  [1] 6e 1d 2c f8 ca e4 51 3c 73 98 49 ab 68 c0 a9 72 ec bd 23 1d 45 9e e8 9f 4b
#> [26] 68 d4 74 fc b4 58 a4 61 b8 3a 4d 36 ec 74 53 49 c0 f9

Created on 2024-01-18 with reprex v2.0.2
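
The differing outputs are consistent with symmetric encryption that draws a fresh random nonce on every call. As a minimal sketch, assuming the sodium package is installed, the lower-level `sodium::data_encrypt()` accepts an explicit `nonce`, and fixing it makes the ciphertext repeatable. The all-zero key and nonce are illustrative values only, and the sketch does not assume that cyphr itself exposes a way to set the nonce. Note that reusing a nonce reveals when two plaintexts are equal, so this trades away some security.

```r
# Sketch only: a fixed nonce makes sodium's symmetric encryption deterministic.
key <- raw(32)     # all-zero key, as in the reprex above
msg <- raw(3)
nonce <- raw(24)   # fixed 24-byte nonce; illustrative value

a <- sodium::data_encrypt(msg, key, nonce = nonce)
b <- sodium::data_encrypt(msg, key, nonce = nonce)
same_output <- identical(a, b)

# Decryption needs the same nonce that was used for encryption
plain <- sodium::data_decrypt(a, key, nonce = nonce)
```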

Comments about the intro vignette


  • " for that see the excellent vignettes in the openssl and sodium packages." add links to them. Oh I see "In the sodium package there is a vignette (vignette("crypto101"))"... if they have no pkgdown website state that one can find the vignettes after having installed the packages with command blablabla.
  • " many people (especially on Linux and macOS) have a keypair already. " do they need to know they have one? How can they check that they have one? Later you say "Chances are, you have an openssl keypair in your .ssh/ directory." does this not apply on Windows?
  • so the cyphr functions know which kind of encryption to use by recognizing the type of key?
  • say at the beginning of the vignette whether and when you will explain how to save keys.
  • you could even give roles to Alice and Bob for a change, like "an epidemiologist" and "a statistician" and say what file they want to exchange.
  • technically how would they send the encrypted string? Save it as RData and then attach it to an email? might be good to give an example.
  • I was a bit puzzled that after explaining how to encrypt a string, you say you're going to speak about other objects... before coming back to strings.
  • "If you save this to disk with saveRDS it will be readable by everyone. " -> "If you save this to disk with saveRDS it will be readable by everyone before you erase it. "?
  • "moderately nasty call rewriting. " explain why it is nasty (and why it is moderately nasty)
  • maybe end the vignette with a short conclusion about where to find more information on some topics + where to get help for this package?

Error in ... used in a situation where it does not exist

For convenience, I am trying to do

save_enc_rds <- function(...) {
    cyphr::encrypt(saveRDS(...), key)
}

However, this fails with

Error in, expr, expand.dots = TRUE) :
... used in a situation where it does not exist

I feel this may be due to the use of substitute here (but I may be wrong):

encrypt <- function(expr, key, file_arg = NULL, envir = parent.frame()) {
  encrypt_(substitute(expr), key, file_arg, envir)
}

How can I get the kind of function wrapper I am looking for?
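
One way to sidestep the non-standard evaluation is to avoid passing `...` through `cyphr::encrypt()` at all. The sketch below assumes that `cyphr::encrypt_object()` (serialise and encrypt an R object, writing to `dest`) and `cyphr::decrypt_object()` are acceptable substitutes for `saveRDS()` and `readRDS()`. The wrapper names `save_enc_rds()` and `read_enc_rds()` are illustrative, and the files written this way are not claimed to be interchangeable with those written by `encrypt(saveRDS(...))`.

```r
key <- cyphr::key_sodium(sodium::keygen())

# Wrapper with explicit arguments: no `...` reaches cyphr's call rewriting
save_enc_rds <- function(object, file, key) {
  cyphr::encrypt_object(object, key, dest = file)
  invisible(file)
}

# Matching reader: decrypts the file and unserialises the object
read_enc_rds <- function(file, key) {
  cyphr::decrypt_object(file, key)
}

path <- tempfile(fileext = ".rds.enc")
save_enc_rds(mtcars, path, key)
restored <- read_enc_rds(path, key)
```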

update webpages

Steph sorted out a pkgdown site for distcrete so use that same approach

Fix for R 4.0.0 change in stringsAsFactors behaviour

  -- 1. Failure: nse 2 (@test-encrypt-wrapper.R#36) -----------------------------
  decrypt(read.csv(filename), x) not equal to `iris`.
  Component "Species": Modes: character, numeric
  Component "Species": Attributes: < target is NULL, current is list >
  Component "Species": target is character, current is factor
  == testthat results ===========================================================
  [ OK: 270 | SKIPPED: 2 | WARNINGS: 0 | FAILED: 1 ]
  1. Failure: nse 2 (@test-encrypt-wrapper.R#36)
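
Since R 4.0.0 the default for `stringsAsFactors` is `FALSE`, so `read.csv()` returns `Species` as character where `iris` stores a factor. A minimal base-R sketch of the likely fix, leaving the encryption step out for brevity, is to request factors explicitly in the test:

```r
filename <- tempfile(fileext = ".csv")
write.csv(iris, filename, row.names = FALSE)

# Ask for factors explicitly so the result does not depend on the R version
res <- read.csv(filename, stringsAsFactors = TRUE)
species_is_factor <- is.factor(res$Species)
```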

Signature of data does not match

I am trying to request access to data on another machine. On that machine I did:


And get the expected A request has been added... message. I then switch back to my main machine and try to authorise the request there, but get the following error message:

Error in value[[3L]](cond) : 
  Signature of data does not match for [key]

Memory limit error for `encrypt()`

I am currently trying to assess the performance of the cyphr package on large datasets. For data that does not compress well (random strings), cyphr::encrypt() soon hits a memory limit (10 M rows, 2 columns of which 1 is a long string with 500 characters). The limit seems to be independent of available system RAM and OS: I tested with 8 GB, 16 GB and 32 GB on Windows 10 and with 170 GB on a Linux cluster. In every case saveRDS() ran without problems, but cyphr::encrypt(saveRDS()) raised an error.

In the reprex, about 3.5 GB of RAM is used according to the RStudio memory usage report, and writing the unencrypted compressed RDS file takes about 3.3 GB of storage.

This reprex takes about 3 minutes to run on a normal PC.

# packages

# creating a data.frame with long random strings
rows <- 1E7
str_len <- 500 #length of strings
str_n <- 1000  #number of different strings
rand_strings <- stringi::stri_rand_strings(str_n, str_len)

large_data <- data.frame(
  id = 1:rows,
  year = sample(1980:2020, size = rows, replace = TRUE),
  long_str = sample(rand_strings, size = rows, replace = TRUE)
)

# To do anything we first need a key:
key <- cyphr::key_sodium(sodium::keygen())

# Save large file unencrypted to figure out compressed size
# saveRDS(large_data, "myfile.rds")
# fs::file_size("myfile.rds")
# this file is about 3.3 GB when written unencrypted to disk (standard compression of rds)

# be careful, running this command will take about 3-10 minutes, before error is thrown
# Save large data with encryption
cyphr::encrypt(saveRDS(large_data, "myfile_encr.rds"), key)
#> Error in encrypt(msg, key()): long vectors not supported yet: memory.c:3887

Created on 2022-06-10 by the reprex package (v2.0.1)
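
In R, "long vectors" are vectors with more than 2^31 - 1 elements, so the error suggests that the raw vector handed to the encryption routine exceeded that size, which would fit a file of about 3.3 GB. As a possible workaround (a sketch, not a fix in cyphr), the data can be split into row chunks that are encrypted separately. The helper names `save_enc_chunks()` and `read_enc_chunks()` are hypothetical, and the example uses `iris` with a small chunk size purely for illustration.

```r
key <- cyphr::key_sodium(sodium::keygen())

# Hypothetical helper: write `data` as several encrypted files of at most
# `chunk_rows` rows each, so no single serialised object gets too large.
save_enc_chunks <- function(data, dir, key, chunk_rows) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  starts <- seq(1, nrow(data), by = chunk_rows)
  paths <- file.path(dir, sprintf("chunk_%05d.rds.enc", seq_along(starts)))
  for (i in seq_along(starts)) {
    rows <- seq(starts[i], min(starts[i] + chunk_rows - 1, nrow(data)))
    cyphr::encrypt_object(data[rows, , drop = FALSE], key, dest = paths[i])
  }
  paths
}

# Hypothetical helper: decrypt every chunk and stack them back together
read_enc_chunks <- function(paths, key) {
  chunks <- lapply(paths, function(p) cyphr::decrypt_object(p, key))
  do.call(rbind, chunks)
}

paths <- save_enc_chunks(iris, tempfile("chunks_"), key, chunk_rows = 40)
restored <- read_enc_chunks(paths, key)
```

Reading the chunks back with `rbind()` still needs the full data in memory; only the size of each encrypted piece is bounded.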

Allow `` key file to be found as default SSH key

RStudio has changed the default generated SSH key to ED25519, which means that also the key file name changed and cannot be found by the default internal cyphr function openssl_find_pubkey. I would suggest to:

  • add to the default algorithm to find keys inside the ~\.ssh folder OR
  • allow custom key names

Fix travis

At least a sodium installation issue. It would be nice to try the new-style build, but that might require sodium libs...

Fix appveyor

There's a chance that this is a problem on Windows, so do a windows test first.

Customize the directory name ".cyphr"

We would like to use this package, but in our setting it is not possible to use a hidden directory to store the credentials. Would it be possible to make the directory name configurable (with ".cyphr" as the default value), for example via options()?

Can I decrypt a string if it was encrypted by PHP blowfish?

Hello! Thanks for cyphr.

I have a string that was encrypted by PHP.

What I need: to decrypt this string in R.

What I have:

Encrypted data for test: "atJpjpKSv8SWEVovBE17K/+szAMyoJMC"
Decrypted data for test: "My secret string"
Encryption algorithm:


cipher: 'bf-cbc'
hashType: 'sha1'
secretKey: "asdasdni087uiweiuhHJKnoHkasdHUIB"

I tried several packages (PKI, cyphr, safer), but I could not decrypt atJpjpKSv8SWEVovBE17K/+szAMyoJMC to My secret string.

I need to get My secret string from atJpjpKSv8SWEVovBE17K/+szAMyoJMC.

Remove warning about compatibility

  • create a test case that we can use to detect regressions (a little zip file)
  • remove warning
  • mark package as version 1.0.0
  • update repo status

Confirm correct terminology in .cyphr/

In the .cyphr/, it says:

Files in keys/ are encrypted copies of the (symmetric) data key, encrypted with different users' private keys.

As I understand the workflow, these files are encrypted with the public keys that users provide to the admin. Do I have that right?

Thanks for the helpful package.
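
The reading in the question matches how public-key encryption works in general: anyone can encrypt with a user's public key, and only the holder of the matching private key can decrypt. A minimal sketch using the openssl package directly (not cyphr's own internals, which are not shown here); the 32-byte `data_key` is a stand-in for the symmetric data key:

```r
# A user's keypair; the public half is what they would hand to the admin
user_key <- openssl::rsa_keygen()
user_pub <- user_key$pubkey

# Stand-in for the symmetric data key (illustrative random bytes)
data_key <- openssl::rand_bytes(32)

# Admin side: encrypt the data key with the user's PUBLIC key
enc <- openssl::encrypt_envelope(data_key, user_pub)

# User side: decrypt with the matching PRIVATE key
dec <- openssl::decrypt_envelope(enc$data, enc$iv, enc$session, key = user_key)
```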

Add support for RSA keys

Using Jeroen's openssl package, the general config interface can stay the same.

cc: @gaborcsardi -- would that be enough for your use case? Happy to chat about this Monday.

Proper connections

In earlier versions of the README (3a20372 and earlier) I wrote:

## Why not a connection object?

A proper connection could be nice but there are at least three issues stopping this:

1. `sodium` does not support streaming encryption/decryption.  It might be possible (bindings to node and swift have it).  In general this would be great and allow the sort of cool things you can do with streaming large data in curl.
2. R plays pretty loose and free with creating connections when given a filename; `readRDS`/`saveRDS` will open files with decompression on in binary mode, `read.csv`/`write.csv` don't.  `write.table` adds encoding information when opening the connection object.  The logic around what happens is entirely within the functions themselves so is hard to capture in a general way.
3. Connection objects look like a pain to write.

There are still problems with the approach I've taken:

* Appending does not work: we'd need to unencrypt the file first for that to be OK
* Non-file arguments are going to suck (though it's possible that something could be done to detect connections)

In the end, you can always write things out however you like and use `encrypt_file` to encrypt the file afterwards.

This issue is just to record that proper connections might be nice to have
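
The fallback mentioned above (write the file however you like, then encrypt it) can be sketched as follows, assuming the sodium backend is available. The file names are temporary paths chosen for illustration.

```r
key <- cyphr::key_sodium(sodium::keygen())

plain <- tempfile(fileext = ".csv")
enc <- tempfile(fileext = ".csv.enc")

# Write with any function, then encrypt the finished file
write.csv(iris, plain, row.names = FALSE)
cyphr::encrypt_file(plain, key, dest = enc)
file.remove(plain)  # drop the unencrypted copy

# Later: decrypt to a new path and read it back as usual
out <- tempfile(fileext = ".csv")
cyphr::decrypt_file(enc, key, dest = out)
restored <- read.csv(out, stringsAsFactors = TRUE)
```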

Is it possible to encrypt `arrow` files using the `cyphr` package?

Hi, I have been experimenting with the cyphr package and have hit the memory limit with large .RData files. As an alternative, the arrow package offers partitioning of large data when writing files. I tried to register a new method for arrow::write_dataset(), but using it with cyphr::encrypt() results in a permission-denied error (the built-in write functions of cyphr all work). A reprex with iris is below.

# packages
library(arrow)
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#>     timestamp

# To do anything we first need a key:
key <- cyphr::key_sodium(sodium::keygen())

# Register new method for arrow::write_dataset()
cyphr::rewrite_register("arrow", "write_dataset", "path")
#>  [1] "arrow::write_dataset" "base::load"           "base::readLines"     
#>  [4] "base::readRDS"        "base::save"           "base::saveRDS"       
#>  [7] "base::writeLines"     "readxl::read_excel"   "readxl::read_xls"    
#> [10] "readxl::read_xlsx"    "utils::read.csv"      "utils::read.csv2"    
#> [13] "utils::read.delim"    "utils::read.delim2"   "utils::read.table"   
#> [16] "utils::write.csv"     "utils::write.csv2"    "utils::write.table"  
#> [19] "writexl::write_xlsx"

# Trying to encrypt with cyphr results in a permission-denied error
cyphr::encrypt(write_dataset(iris, tempfile(), partitioning = c("Species")), key)
#> Warning in file(con, "rb"): cannot open file 'C:
#> \Users\ga27jar\AppData\Local\Temp\RtmpKw7PXv\filed4c33d93cd0d4c2d2f10cf'
#> Permission denied
#> Error in file(con, "rb"): cannot open the connection
#> Warning in file.remove(paths[ok]):  cannot remove file 'C:
#> \Users\ga27jar\AppData\Local\Temp\RtmpKw7PXv\filed4c33d93cd0d4c2d2f10cf'
#> 'Permission denied'

Created on 2022-06-09 by the reprex package (v2.0.1)

Ability to install without sodium

We planned to use the cyphr wrappers for openssl, because sodium is not installed and cannot be: libsodium is not available on the machine in question. It should be possible to install cyphr without sodium, and probably vice versa (cyphr without openssl).

I believe this means moving them to Suggests: and implementing if (require(sodium)) logic in the code.
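
A minimal base-R sketch of that conditional logic, using `requireNamespace()` so that nothing is attached to the search path. The helper names `require_backend()` and `available_backends()` are hypothetical and not part of cyphr.

```r
# Hypothetical helper: fail with a clear message if an optional backend is missing
require_backend <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is needed for this key type; please install it",
                 pkg), call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical helper: list which of the optional backends can be loaded
available_backends <- function() {
  pkgs <- c("sodium", "openssl")
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[ok]
}
```

A key constructor for sodium would then call `require_backend("sodium")` as its first step, and likewise for openssl.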

version 1.0.0

Really enjoying this package; it is working great for my purposes, but I noticed the warning and am somewhat worried. May I know when version 1.0.0 is expected, so that we can rely on backward compatibility? Thank you!

