
Comments (6)


@hadley commented on Jun 7, 2018, 9:38 PM UTC:

Could you please try to make a simpler example? It's not immediately obvious to me what is wrong with the incorrect output.

from dbplyr.


@sirallen commented on Jun 7, 2018, 10:50 PM UTC:

@hadley The incorrect output does not return the earliest time_hour values (for unique destinations) by carrier. For example, for carrier 9E it returns 2013-01-03, 2013-02-01, 2013-04-14, and 2013-10-17, while the "correct" query returns times on 2013-01-01, which are in fact the earliest. (I expect the earliest times because I arrange by carrier and time_hour before the second filter in both cases.)

Hope this clears things up. (It is kind of difficult to create a "simpler" example that requires repeated use of window functions.)
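
For readers following along, here is a minimal sketch of the expected behaviour on a local data frame. The `toy` table and its values are made up for illustration and are not from the original report; `time_hour` is a plain number here rather than a date-time. It only shows what "earliest time_hour per carrier and destination, then up to four rows per carrier" looks like in plain dplyr, not what the database returns.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data standing in for the flights table
toy <- tibble(
  carrier   = c("9E", "9E", "9E", "AA", "AA"),
  dest      = c("ATL", "ATL", "BOS", "ATL", "ATL"),
  time_hour = c(3, 1, 2, 5, 4)
)

expected <- toy %>%
  group_by(carrier, dest) %>%
  arrange(carrier, dest, time_hour) %>%
  filter(row_number() == 1L) %>%   # earliest row per carrier/dest
  group_by(carrier) %>%
  arrange(carrier, time_hour) %>%
  filter(row_number() < 5L) %>%    # up to four earliest rows per carrier
  ungroup() %>%
  select(carrier, dest, time_hour)

print(expected)
```

On this toy input the result keeps one row per carrier/destination pair, each with the smallest time_hour for that pair.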


sirallen commented on May 29, 2024

Can somebody please acknowledge that this is a problem?


hadley commented on May 29, 2024

I cannot do anything with it until I have time to create a simpler reprex.


edgararuiz-zz commented on May 29, 2024

@sirallen - Can you try this code in your setup?

incorrect <- tbl(localdb, 'flights') %>%
  group_by(carrier, dest) %>%
  arrange(carrier, dest, time_hour) %>%
  filter(row_number() == 1L) %>%
  group_by(carrier) %>%
  arrange(time_hour) %>%
  filter(row_number() < 5L) %>%
  select(carrier, dest, time_hour) %>%
  collect()

I just removed carrier from the second arrange(); the second group_by() still uses carrier. I've been trying to recreate your issue but have not been able to. My theory is that the redundant carrier in the window function's ORDER BY makes the database re-order the records unnecessarily. Again, it's just a theory; I would like to see what you get.


hadley commented on May 29, 2024

I've looked at the generated SQL and it seems fine to me, so the most likely explanation is some minor difference in semantics between R and SQL (or between what you expect and what actually happens). If you can generate a simpler example and explain exactly what the problem is, please feel free to open a new issue.
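
One way to look at the generated SQL without a live database is sketched below. It assumes dbplyr's `lazy_frame()` with a simulated backend; the choice of `simulate_postgres()` and the empty stand-in table are assumptions made for illustration, and the SQL produced for the reporter's actual `localdb` connection may differ.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for the flights table; no connection is opened
flights_lazy <- lazy_frame(
  carrier   = character(),
  dest      = character(),
  time_hour = character(),
  con = simulate_postgres()
)

query <- flights_lazy %>%
  group_by(carrier, dest) %>%
  arrange(carrier, dest, time_hour) %>%
  filter(row_number() == 1L) %>%
  group_by(carrier) %>%
  arrange(time_hour) %>%
  filter(row_number() < 5L) %>%
  select(carrier, dest, time_hour)

# Render the SQL as a string so the window clauses can be read directly
generated_sql <- as.character(sql_render(query))
cat(generated_sql, "\n")
```

Comparing the PARTITION BY and ORDER BY clauses of each ROW_NUMBER() call against the intended grouping and ordering is a quick check on whether the translation matches expectations.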
