Giter Site home page Giter Site logo

Comments (15)

 avatar commented on May 29, 2024

@edgararuiz commented on Sep 4, 2017, 6:17 PM UTC:

Hi @emilyriederer , does a regular using forward slash by itself works in you environment:

data <- mutate(data, z = x / 5)

from dbplyr.

 avatar commented on May 29, 2024

@emilyriederer commented on Sep 15, 2017, 1:29 AM UTC:

Hi @edgararuiz -- unfortunately not.

It might work just as happenstance due to integer division, but the SQL translation also turns 5 into a floating point:

SELECT "x", "x" / 5.0 AS "z" FROM "data_db"

from dbplyr.

 avatar commented on May 29, 2024

@edgararuiz commented on Sep 15, 2017, 1:31 AM UTC:

Ok, can we try appending an "L" to the right of 5?

data <- mutate(data, z = x / 5L)

from dbplyr.

 avatar commented on May 29, 2024

@hadley commented on Oct 23, 2017, 4:43 PM UTC:

Minimal reprex

dbplyr::translate_sql(x %/% 5)
#> <SQL> "x" / 5.0

from dbplyr.

 avatar commented on May 29, 2024

@hadley commented on Oct 23, 2017, 4:46 PM UTC:

To make equivalent to 1 %/% 0.2 will probably need to implement via modulo arithmetic.

x == (x %/% m) * m + (x %% m)  
x %/% m == (x - (x %% m)) / m

from dbplyr.

 avatar commented on May 29, 2024

@hadley commented on Oct 23, 2017, 10:35 PM UTC:

But this needs to be thought through correctly - I have a vague recollection that negative values might cause issues.

from dbplyr.

 avatar commented on May 29, 2024

@hadley commented on Oct 24, 2017, 10:11 PM UTC:

sql_int_div <- function() {
  function(x, m) {
    build_sql("((", x, " - (", x, " % ", m, ")) / ", m, ")")
  }
}

That definition gets us pretty close, but it turns out that %% in R and % in SQL have slightly different semantics when the signs are different:

library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)

df <- tibble(
  x = c(10, 10, -10, -10), 
  y = c(3, -3, 3, -3)
)
df %>% mutate(x %% y, x %/% y)
#> # A tibble: 4 x 4
#>       x     y `x%%y` `x%/%y`
#>   <dbl> <dbl>  <dbl>   <dbl>
#> 1    10     3      1       3
#> 2    10    -3     -2      -4
#> 3   -10     3      2      -4
#> 4   -10    -3     -1       3

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
mf <- con %>% copy_to(df)
mf %>% mutate(x %% y, x %/% y)
#> # Source:   lazy query [?? x 4]
#> # Database: sqlite 3.19.3 [:memory:]
#>       x     y `x%%y` `x%/%y`
#>   <dbl> <dbl>  <dbl>   <dbl>
#> 1    10     3      1       3
#> 2    10    -3      1      -3
#> 3   -10     3     -1      -3
#> 4   -10    -3     -1       3

I'm not sure how to handle this :(

from dbplyr.

 avatar commented on May 29, 2024

@alex-gable commented on Jan 26, 2018, 8:44 AM UTC:

This is a result of C98 and beyond (and by extension SQLite) using truncated division where the modulo operator takes the sign of the dividend, and R using the mathematically preferred floored division with the modulo sign taking the sign of the divisor.

Quite frankly, C(SQLite) and R are doing fundamentally different arithmetic. There's some fascinating reading on the subject here and of course an abridged version on Wikipedia. This will likely also vary across SQL dialects, which makes it more difficult to pin down in a unified way.

Given the above, I'm not sure it is reasonable to expect equivalent output in every language. Python covered some of the complexities in this discussion in PEP-228: Reworking Python's Numeric Model and PEP-238: Changing the Division Operator, highlighting that this is not just a "dplyr issue", but rather a significantly larger architectural decision in R and computer arithmetic itself.

@emilyriederer, In the case of RedShift, a python udf could be constructed in your database leveraging numpy to replicate the output from R.

from dbplyr.

 avatar commented on May 29, 2024

@hadley commented on Jan 26, 2018, 1:37 PM UTC:

@alex-gable thanks for that awesome summary of the problem!

from dbplyr.

 avatar commented on May 29, 2024

@hadley commented on Jun 7, 2018, 11:37 PM UTC:

I think the best way to handle this is simple to document it.

from dbplyr.

 avatar commented on May 29, 2024

@emilyriederer commented on Jun 9, 2018, 6:51 PM UTC:

Wow - thank you all for the very helpful, detailed responses. All of the context here is fascinating. I'm embarrassed to discover that I completely "went dark" on this thread. Somehow, I'm not getting notifications but luckily spotted this atop the new GitHub feed. Thanks again!

from dbplyr.

jsta avatar jsta commented on May 29, 2024

I was able to force non-integer operation by adding 0.0 to one of the variables. In contrast, as.numeric seemed to have no effect:

library(dplyr)

data <- data.frame(x = as.integer(1:5), y = as.integer(5:1))
data_db <- copy_to(dbplyr::src_memdb(), data, "data_db")

mutate(data_db, z = as.numeric(x)/y)
#> # Source:   lazy query [?? x 3]
#> # Database: sqlite 3.22.0 [:memory:]
#>       x     y     z
#>   <int> <int> <int>
#> 1     1     5     0
#> 2     2     4     0
#> 3     3     3     1
#> 4     4     2     2
#> 5     5     1     5
mutate(data_db, z = (x + 0)/y)
#> # Source:   lazy query [?? x 3]
#> # Database: sqlite 3.22.0 [:memory:]
#>       x     y     z
#>   <int> <int> <dbl>
#> 1     1     5   0.2
#> 2     2     4   0.5
#> 3     3     3   1  
#> 4     4     2   2  
#> 5     5     1   5

from dbplyr.

hadley avatar hadley commented on May 29, 2024

Oh but there’s a DIV function we could use instead

from dbplyr.

hadley avatar hadley commented on May 29, 2024

So I think overall this is just too complicated, and not worth providing a translation. I'll just make %/% return a clean error (since otherwise it's translated to /, which isn't correct)`

from dbplyr.

rsund avatar rsund commented on May 29, 2024

FYI, @hadley, it seems that the wanted results can be achieved by using FLOOR(x/y) for the integer division and by x-y*FLOOR(x/y) for the modulo.

from dbplyr.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.