Giter Site home page Giter Site logo

afair / postgresql_cursor Goto Github PK

View Code? Open in Web Editor NEW
506.0 506.0 48.0 147 KB

ActiveRecord PostgreSQL Adapter extension for using a cursor to return a large result set

License: MIT License

Ruby 99.20% Shell 0.80%
activerecord batch cursor for-update large-scale postgresql postgresql-cursor ruby ruby-gem

postgresql_cursor's People

Contributors

aaronbroadintrohive avatar afair avatar amatsuda avatar bbonamin avatar bibendi avatar biinari avatar bryant1410 avatar dmitrybochkarev avatar fgarces avatar pedrofcuba avatar prashantvithani avatar ravbaker avatar redterror avatar simi avatar toothrot avatar vzvu3k6k avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

postgresql_cursor's Issues

SQL Syntax Errors referencing a $1 placeholder

When running a query using an association, ActiveRecord has changed to use prepared statements with bind parameters, something like:

"select * from  comments where post_id = $1", [["post_id", 123]]

so when to_sql is called on the relation, the $1 is not yet interpolated.

Solution: use connection.unprepared_statement { to_sql } syntax to skip preparation. Cursor declaration includes preparation anyway.

Return enumerators when no block given

Won't it be more natural for the each_instance method (as well as other iterators) w/o a block to return a enumerator instead of the "raw" cursor?

Now I can do this explicitly:

MyModel.all.each_instance.to_enum.with_object([]) { |record, obj| obj << record }

I think this to_enum can be applied to cursor under the hood, namely here:
https://github.com/afair/postgresql_cursor/blob/master/lib/postgresql_cursor/active_record/sql_cursor.rb#L34

This would make API a bit more natural IMHO:

MyModel.all.each_instance.with_object([]) { |raw, obj| ... }

each_instance doesn't handle enum attributes?

First of all, great gem, love it ๐Ÿ‘

I've only run into a problem with enum attributes in ActiveRecord. If I define a model with an enum attribute, .each() outputs the symbolic enum value, but .each_instance() outputs the underlying numeric value. e.g.

class User < ApplicationRecord
enum gender: [:male, :female]
end

User.all.each {|u| puts u.gender} # Outputs "male", "male", "female"
User.all.each_instance {|u| puts u.gender} # Outputs 0, 0, 1

Question - Return a collection

I'am trying to use this gem in the following way:

sum = 0
Product.for_sale.each_instance {|p| p.sum = sum + p.balance ; sum += p.balance }
And return that as a collection. From the documentation, it says that each_instance returns the cursor as an enumerable. Is tthere a way to get a collection of Products?

Thanks!

Deprecation warnings in Rails 6

I use #each_instance_batch method to iterate over batches, and when it tries to find column types for postgresql enum columns it warns

types[fname] = @connection.get_type_map.fetch(ftype, fmod) do |oid, mod|

/gems/2.7.0/gems/activerecord-6.0.3.4/lib/active_record/connection_adapters/abstract_adapter.rb:698: warning: deprecated Object#=~ is called on Integer; it always returns nil
/gems/2.7.0/gems/activerecord-6.0.3.4/lib/active_record/connection_adapters/abstract_adapter.rb:694: warning: deprecated Object#=~ is called on Integer; it always returns nil

ruby -v => ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
rails -v => Rails 6.0.3.4

Rails 5 compatibility?

With Rails 5 around the corner, no recent commits to this project and a few forks with a few small optimiztions that have not been pulled back in, I'd like to ask if a different way of lazily plucking attributes has arisen recently or if postgresql_cursor is still suggested to use?

Thanks!

Underlying SQL Errors Masked When Transaction Has Been Aborted

To reproduce run the following:

ActiveRecord::Base.each_row_by_sql('select * from bad_table') { }

This leads to the following exception:

ActiveRecord::StatementInvalid: PG::InFailedSqlTransaction: ERROR:  current transaction is aborted, commands ignored until end of transaction block
: close cursor_1

It would be much easier to troubleshoot issues if the original exception was thrown:

ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR:  relation "bad_table" does not exist
LINE 1: declare cursor_1 cursor for select * from bad_table
                                                  ^
: declare cursor_1 cursor for select * from bad_table

PG::ConnectionBad: PQsocket() can't get socket descriptor: ROLLBACK

Our app opens a cursor which can run for hours at a time.
On the longer runs (7+ hours) the cursor occasionally errors out with:

ActiveRecord::StatementInvalid
PG::ConnectionBad: PQsocket() can't get socket descriptor: ROLLBACK

I am trying to rule this gem out as the culprit. I'm not sure if this is masking a different error.
Should we be passing with_hold: true in the options?

Here is a snippet of the cursor loop:

# BATCH_SIZE = 10_000
 records.each_instance(block_size: BATCH_SIZE) { |record| @handler.call(record) }

Couple things that may or may not be pertinent:

  • We tried upgrading to 0.6.7 and see the same behavior
  • The records returned by the cursor have associations on them.
  • App makes other read only ActiveRecord calls within the block passed to #each_instance
    • these read only calls do not access the same table as the cursor

Publish version 0.5 to Rubygems

Hello there,

Great work!

I want to use the version for Rails 4 in another gem. Its not possible, since gemspec doesnt' support git branches as dependencies, but only dependencies from Rubygems.

Any chance version 0.5 will be published to Rubygems soon?

cursor_tuple_fraction doesn't get set

Looking at https://github.com/afair/postgresql_cursor/blob/master/lib/postgresql_cursor/cursor.rb#L298, the comment explains that the gem prefers to override PostgreSQL's default cursor_tuple_fraction = 0.1 to 1.0.

However because the arg frac=1.0 and there's an default on the options lookup @cursor_tuple_fraction ||= @options.fetch(:fraction) { 1.0 }, I believe the early return return @cursor_tuple_fraction if frac == @cursor_tuple_fraction will always fire unless you set a custom value not equal to 1.0.

This means both that the 1.0 default never gets applied, and also that it's impossible to apply a configuration value of 1.0.

I think the early return probably is an unnecessary optimization in terms of speed, and having any early return will possibly cause bugs since the default config on a given Postgres install might not be 0.1 anyway, so it's not possible to fully fix by changing to only return early if the desired value is 0.1.

Finally, the gem doesn't reset the value after the cursor is done, so it's potentially poisoning the connection settings for any other use not going through the gem.

How to setup data to use `each_row` in an rspec test?

Currently, I have my main code working where I run each_row on my ActiveRecord Relation. But in the test, I have an array, and the original code used to be .each so the test used to pass. Now I am running into undefined method 'each_row' for #<Array:0x00007f8a584358e8>. Any suggestion on what the structure in the test should be that it supports each_row ? Any advice will be great! Thank you.

Why is FOR UPDATE not supported?

The README says that this gem is not intended to support FOR UPDATE to process and update each row in place. But why not? I scanned the functionality of this gem and the PostgreSQL documentation for a bit but it's not obvious why FOR UPDATE is not supported.

Rails 5 compatibility?

Does postgresql_cursor work with current Rails 5 pre-releases or is it better to expect that this gem will not be compatible with Rails 5 and will need major reworking? As the last commit was 14 months ago, have people moved on to alternative solutions, and if yes, which?

Declaring cursors with NO SCROLL?

The DECLARE command allows the NO SCROLL argument, which hints to PostgreSQL that you will only be scrolling forward, not backwards. If you do not specify this argument then PostgreSQL tries to figure out for you whether it should allow scrolling or not, although the documentation does not explain what that means.

It seems this gem is only designed for scrolling forward anyway, so would it be a good idea to always call DECLARE with NO SCROLL as a micro-optimization? According to the documentation, NO SCROLL was introduced in PostgreSQL 8.3, which is very old. For reference: Ubuntu 12.04 and Debian 7 both supply PostgreSQL 9.1.

Documentation on cursors: https://www.postgresql.org/docs/9.6/static/plpgsql-cursors.html

Different column types and enum issues

each_instance incorrectly sets the column types, for example

class User < ActiveRecord::Base
  enum state: {
    active: 1,
  }
end

pp User.first
# #<User:0x00007f884e1bee40
#  id: 1,
#  created_at: Wed, 17 Apr 2019 14:33:34 AEST +10:00
#  state: "active">

pp User.each_instance.first
# #<User:0x00007f884e1bee40
#  id: 1,
#  created_at: 2019-04-16 22:24:59 UTC
#  state: 1>

Expected.
in the example above objects from both queries should be identical

rails -v => Rails 6.0.3.4
ruby -v => ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]

Bug: SyntaxError when using ActiveRecord none

We have discovered that this gem doesn't handle properly the ActiveRecord none method.

subscriptions = Subscription.none
subscriptions.each_row { |row| p row }

The above code raises:

     ActiveRecord::StatementInvalid:
       PG::SyntaxError: ERROR:  syntax error at end of input
       LINE 1: ...ursor_a7205fd123ab9169d9a3105 no scroll cursor for 
                                                                                                                     ^

The expected behavior is that no error is raised, because the above code is perfectly valid. For example the following code that uses ActiveRecord each does not raise any error:

subscriptions = Subscription.none
subscriptions.each { |row| p row }

I think that the problem is that this gem uses the to_sql method, which returns an empty string when none is used (somewhere in the filter chain).

Probably the best solution is to check if to_sql returns an empty string and in that case return an empty array of results.

uninitialized constant ActiveRecord::ConnectionAdapters::PostgreSQLAdapter

For anyone seeing this error, I encountered it when adding postgresql_cursor to my Gemfile.

I had to use gem "postgresql_cursor", require: false and then require "postgresql_cursor" manually where I needed it.

This would be good to document, I think.

Rails 4.2.11, Ruby 2.5.3p105, Bundler 1.17.2

Is it intentional that v0.6.5 does not support Rails 6?

Hi,

Thanks for the recent release. I've just tried to install 0.6.5 but it's not compatible with Rails 6 - was this intentional? We're likely to be a few months away from upgrading to Rails 7 but would really like to keep using this gem. Was the restriction intentional?

The following minimal Gemfile shows the problem when using bundle install:

source "https://rubygems.org"

gem "rails", "~> 6.1.7"
gem "postgresql_cursor", "~> 0.6.5"
Bundler could not find compatible versions for gem "activerecord":
  In Gemfile:
    postgresql_cursor (~> 0.6.5) was resolved to 0.6.5, which depends on
      activerecord (>= 7.0.0)

    rails (~> 6.1.7) was resolved to 6.1.7, which depends on
      activerecord (= 6.1.7)

Thanks,
Owen.

Rendering cursor as collection in RoR give error `undefined method 'size' for #<PostgreSQLCursor::Cursor`

Hello!

I try render a collection from cursor in Ruby on Rails (it may be useful for rendering big collections).
But got an error
undefined method 'size' for #<PostgreSQLCursor::Cursor:0x00007fcc4edd4dd0

Steps to reproduce:

Create user and database:

$ psql

postgres=# create user books with password 'books';
CREATE ROLE

postgres=# create database books_development owner books;
CREATE DATABASE

Create test application:

rails new books --database postgresql
cd books

Fix development block in config/database.yml file for connect to database.

Add controller and run server:

bundle add postgresql_cursor
rails g scaffold book name:string
rails db:migrate
rails s

Testing

Open in a browser: http://localhost:3000/books
It should show book list (now empty).

Append one or several books to list.

Modify view file for use render of collection

# app/views/books/index.html.erb
<p style="color: green"><%= notice %></p>

<h1>Books</h1>

<div id="books">
<%= render partial: "books/book", collection: @books %>
</div>

<%= link_to "New book", new_book_path %>

Link http://localhost:3000/books should show book list.

Modify view index.html.erb for use postgresql_cursor enum

Append .each_instance to books collection:

<%= render partial: "books/book", collection: @books.each_instance %>

Refresh book list in browser. It show an error:

NoMethodError in Books#index

Showing books/app/views/books/index.html.erb where line #6 raised:

undefined method `size' for #<PostgreSQLCursor::Cursor:0x00007fcc4edd4dd0 @sql="SELECT \"books\".* FROM \"books\"",  ...

each_instance does not respect ActiveRecord serialize?

I can pull together a more descriptive example if it's useful, but I have a JSON type in the db where I store arrays of stringified integers. I use serialize to serialize arrays of integers to arrays of strings upon save, and to deserialize back into arrays of integers upon load from the db. each_instance doesn't appear to do this; a standard ActiveRecord load does.

http://api.rubyonrails.org/classes/ActiveRecord/AttributeMethods/Serialization/ClassMethods.html

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.