seamusabshere / upsert

655 stars · 10 watchers · 77 forks · 504 KB

Upsert on MySQL, PostgreSQL, and SQLite3. Transparently creates functions (UDF) for MySQL and PostgreSQL; on SQLite3, uses INSERT OR IGNORE.

License: MIT License

Ruby 97.14% Shell 2.86%

upsert's People

Contributors

benkoshy, brixen, ch4s3, evadne, hpetru, kevincolemaninc, olivierlacan, perryn, pnomolos, rakoth, raviolicode, reiz, seamusabshere, searls, towerhe


upsert's Issues

ON DUPLICATE KEY UPDATE behaves incorrectly with a multi-column selector

Unless you have a perfectly formed multi-column unique index, a call like

upsert.row({'name': 'Jerry', 'color': 'red'}, {'weight': 256.78, 'license': 888})

using ON DUPLICATE KEY UPDATE will incorrectly update a row that has 'name': 'Jerry' but 'color': 'blue'.

Basically, this is because we're not actually using the selector as a condition in a WHERE clause.

Originally found in https://github.com/seamusabshere/py-upsert/blob/master/test/test_upsert.py

ON DUPLICATE KEY UPDATE should be replaced with a stored procedure, as in https://github.com/seamusabshere/py-upsert/blob/master/upsert/mysql.py
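To illustrate the difference: ON DUPLICATE KEY UPDATE fires on a conflict with ANY unique key, while the stored-procedure approach issues an explicit UPDATE whose WHERE clause is built from the selector, so every selector column must match. A minimal sketch of the SQL the procedure-style approach would generate (`build_update_sql` is an illustrative helper, not part of the gem):

```ruby
# Build the UPDATE statement a selector-as-WHERE-clause upsert would run.
# Both selector columns land in the WHERE clause, so a row matching only
# 'name' (but not 'color') is left alone instead of being clobbered.
def build_update_sql(table, selector, setter)
  set_clause   = setter.keys.map   { |k| "`#{k}` = ?" }.join(", ")
  where_clause = selector.keys.map { |k| "`#{k}` = ?" }.join(" AND ")
  ["UPDATE `#{table}` SET #{set_clause} WHERE #{where_clause}",
   setter.values + selector.values]
end

sql, binds = build_update_sql(
  "pets",
  { name: "Jerry", color: "red" },   # selector: BOTH columns must match
  { weight: 256.78, license: 888 }
)
# sql   => "UPDATE `pets` SET `weight` = ?, `license` = ? WHERE `name` = ? AND `color` = ?"
# binds => [256.78, 888, "Jerry", "red"]
```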

bug(postgres): possible to cause a PG::AmbiguousFunction exception

I got the following exceptions being thrown a couple of times in Postgres when I kick off a few thousand concurrent Sidekiq jobs that use Upsert.

PG::AmbiguousFunction: ERROR: function upsert2_1_0_portfolio_status_cache_entries_sel_portfo1447696404(unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown) is not unique
LINE 1: SELECT upsert2_1_0_portfolio_status_cache_entries_SEL_portfo... ^
HINT: Could not choose a best candidate function. You might need to add explicit type casts.

This particular exception is thrown here:

lib/upsert/connection/PG_Connection.rb: 10:in `exec'
lib/upsert/connection/PG_Connection.rb: 10:in `execute'
lib/upsert/merge_function/PG_Connection.rb: 21:in `execute'
lib/upsert.rb: 218:in `row'

Upon further investigation with the following query, things look fine: all Upsert functions appear exactly once in the public schema.

with procs as (select proname, pronamespace from pg_proc where proowner <> 1 and proname like 'upsert%' order by proname asc) select procs.*, pg_namespace.nspname from procs left join pg_namespace on pg_namespace.oid = procs.pronamespace;

I have two separate concerns regarding this finding.

  1. The CREATE OR REPLACE FUNCTION call has the function name prefixed with Upsert::MergeFunction::NAME_PREFIX, which is a version tag. Should the Upsert functions always be created in a specific schema?
  2. I'm still wondering what possible scenarios could cause the Upsert functions to become ambiguous.

PG update by ID

1.9.3p125 :001 > require 'upsert/active_record_upsert'
1.9.3p125 :002 > p = Player.last
=> #<Player id: 681691, round_id: 9711, member_id: 29961, course_id: nil, tee_id: nil, foursome_id: nil, created_at: "2012-06-27 18:07:09", updated_at: "2012-06-27 18:07:09", token: "rTsscbjC3vJ7XawDMVYLGg", status: nil, invitation_answered_at: nil, attending: nil, division_id: nil, flight_id: nil, manual_foursome_id: nil, position: nil, role: 0, blind_id: nil, tournament_event_id: nil, score: nil, team_id: nil, usga_index: nil, mf_pos: nil, invitation_bounced: false> 
1.9.3p125 :003 > Player.upsert({id: 681691}, position: 2)
PG::Error: ERROR:  column "id_input" does not exist
LINE 1: ...s" SET "position" = "position_input" WHERE "id" = "id_input"
                                                             ^
QUERY:  UPDATE "players" SET "position" = "position_input" WHERE "id" = "id_input"
CONTEXT:  PL/pgSQL function "merge_players_4011720111" line 5 at SQL statement

    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/upsert-0.3.3/lib/upsert/pg_connection.rb:27:in `exec'
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/upsert-0.3.3/lib/upsert/pg_connection.rb:27:in `execute'
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/upsert-0.3.3/lib/upsert.rb:100:in `row'
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/upsert-0.3.3/lib/upsert/active_record_upsert.rb:6:in `block in upsert'
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/activerecord-3.2.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:118:in `with_connection'
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/upsert-0.3.3/lib/upsert/active_record_upsert.rb:4:in `upsert'
    from (irb):3
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/railties-3.2.1/lib/rails/commands/console.rb:47:in `start'
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/railties-3.2.1/lib/rails/commands/console.rb:8:in `start'
    from /Users/atandrau/.rvm/gems/ruby-1.9.3-p125@glg/gems/railties-3.2.1/lib/rails/commands.rb:41:in `<top (required)>'
    from script/rails:6:in `require'
    from script/rails:6:in `<main>'
1.9.3p125 :004 > Player.upsert({token: "rTsscbjC3vJ7XawDMVYLGg"}, position: 3)
1.9.3p125 :005 > p.reload
=> #<Player id: 681691, round_id: 9711, member_id: 29961, course_id: nil, tee_id: nil, foursome_id: nil, created_at: "2012-06-27 18:07:09", updated_at: "2012-06-27 18:07:09", token: "rTsscbjC3vJ7XawDMVYLGg", status: nil, invitation_answered_at: nil, attending: nil, division_id: nil, flight_id: nil, manual_foursome_id: nil, position: 3, role: 0, blind_id: nil, tournament_event_id: nil, score: nil, team_id: nil, usga_index: nil, mf_pos: nil, invitation_bounced: false> 

PostgreSQL + JDBC: Error when updating value of a BIGINT column

Hi,
I'm using Upsert (2.0.3) with JRuby 9.0.1.0 and PostgreSQL 9.3.

I have the following code:

Upsert.batch(MWMT::DAL.connection, table_name) do |upsert|
  node_data.each do |node_merge_data|
    upsert.row({node_id: node_merge_data[:node_id]}, pressure: node_merge_data[:pressure], merge_count: node_merge_data[:count])
  end
end

Which updates rows on the following table:

      Column      |         Type         |     Modifiers
------------------+----------------------+--------------------
 node_id          | bigint               | not null
 pressure_version | integer              | default 0
 merge_count      | smallint             | not null default 0
    .
    .
    .

This code used to work fine before I moved my application to JRuby. Now, with JRuby, I get the following exception:

#<RangeError: too big for int: 3599657714>

This number (3599657714) is the value of the node_id column of the row being updated.

Here is the relevant part of the stacktrace:

[ 0] "/Users/aviv/.rvm/gems/jruby-9.0.1.0@mwmt/gems/upsert-2.0.3/lib/upsert/connection/jdbc.rb:47:in `block in execute'",
[ 1] "org/jruby/RubyArray.java:1560:in `each'",
[ 2] "org/jruby/RubyEnumerable.java:1016:in `each_with_index'",
[ 3] "/Users/aviv/.rvm/gems/jruby-9.0.1.0@mwmt/gems/upsert-2.0.3/lib/upsert/connection/jdbc.rb:36:in `execute'",
[ 4] "/Users/aviv/.rvm/gems/jruby-9.0.1.0@mwmt/gems/upsert-2.0.3/lib/upsert/merge_function/Java_OrgPostgresqlJdbc4_Jdbc4Connection.rb:18:in `execute'",
[ 5] "/Users/aviv/.rvm/gems/jruby-9.0.1.0@mwmt/gems/upsert-2.0.3/lib/upsert/merge_function.rb:13:in `execute'",
[ 6] "/Users/aviv/.rvm/gems/jruby-9.0.1.0@mwmt/gems/upsert-2.0.3/lib/upsert.rb:216:in `row'",
[ 7] "/Users/aviv/dev/mobiwize/mw-server/mw-maptools/lib/mw-maptools/data_layer.rb:357:in `block in persist_node_data'",

Looking at the Upsert code where the error originates, I saw the following:

setter = setters[v.class.name]
statement.send setter, i+1, v

Since the value v is a Ruby Fixnum, the setter found for it is setInt.
However, since the number is very large, it can't be converted to an int.
I think the correct method to invoke on the statement in this case would be setLong.

I might be missing something and not using Upsert correctly, but it does seem that updating BIGINT columns is currently not supported.

Thanks for your help!
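A sketch of the fix being suggested (an assumption about how the setter lookup could work, not the gem's actual code): choose the JDBC setter by the magnitude of the integer, falling back to setLong when the value exceeds the signed 32-bit range that setInt accepts.

```ruby
# java.sql.PreparedStatement#setInt takes a signed 32-bit int; anything
# larger (like the BIGINT node_id 3599657714) needs setLong instead.
JAVA_INT_MAX = 2**31 - 1
JAVA_INT_MIN = -2**31

def jdbc_setter_for(value)
  case value
  when Integer
    value.between?(JAVA_INT_MIN, JAVA_INT_MAX) ? :setInt : :setLong
  when Float  then :setDouble
  when String then :setString
  else :setObject
  end
end

jdbc_setter_for(42)          # => :setInt
jdbc_setter_for(3599657714)  # => :setLong  (the node_id from the report)
```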

Broken merge function unique name (long case)

I got an exception here:

NoMethodError: undefined method `first' for #<String:0x71bc081f>

I suspect this is an untested code path that contains a bug.

A quick fix that works for me:

[ parts[0..20], crc32 ].join('_')

Just replacing with the following did not work at least on pg:

[ parts[0..MAX_NAME_LENGTH-11], crc32 ].join

(but maybe I was just off by one here).
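For context, a sketch of how such a truncation could stay within PostgreSQL's 63-byte identifier limit while keeping names unique: cut the descriptive part and append a CRC32 of the full name. The constant and layout here are illustrative, not the gem's exact implementation.

```ruby
require 'zlib'

# PostgreSQL truncates identifiers to 63 bytes (NAMEDATALEN - 1), so two
# long names that share a prefix would collide; the checksum suffix,
# computed over the FULL name, keeps truncated names distinct.
MAX_NAME_LENGTH = 63

def merge_function_name(raw_name)
  return raw_name if raw_name.length <= MAX_NAME_LENGTH
  crc32 = Zlib.crc32(raw_name).to_s  # decimal, up to 10 digits
  # leave room for the checksum plus a separating underscore
  truncated = raw_name[0, MAX_NAME_LENGTH - crc32.length - 1]
  "#{truncated}_#{crc32}"
end

merge_function_name("upsert_pets_SEL_name_SET_tag")  # short names pass through
merge_function_name("upsert_" + "very_long_column_name_" * 5)  # always <= 63 chars
```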

Does upsert work with postgres :json columns?

Hey, I tried to batch upsert documents with JSON fields. I get an error mentioning PgHstore, so I'm guessing it thinks the hashes are supposed to be :hstore rather than :json.

NameError: uninitialized constant PgHstore
    ~/.rbenv/versions/2.2.0/lib/ruby/gems/2.2.0/gems/upsert-2.1.0/lib/upsert/connection/postgresql.rb:18:in `bind_value'

Is there a setting to have it assume :json instead?
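One possible workaround (an assumption, not a documented setting): serialize the hash to a JSON string yourself before handing it to upsert, so the gem binds a plain string instead of guessing hstore for a Ruby Hash. PostgreSQL will accept the string for a :json column.

```ruby
require 'json'

# Pre-serialize the nested document so upsert sees a String, not a Hash.
document = { "tags" => ["a", "b"], "meta" => { "views" => 3 } }

row_attributes = { payload: JSON.generate(document) }
# upsert.row({ id: 1 }, row_attributes)   # hypothetical call site
```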

Upsert eats all RAM when doing a basic upsert with only 1 row (Postgres)

I run a simple upsert script on a Postgres server:

h = {
  :column1 => "something",
  :date_key => i, 
  :time_key => i, 
  :some_key => i, 
  :other_key => i, 
  :datetime => Time.now, 
  :column_string => "sfjda", 
  :column2_string => "syea"
}
Upsert.batch(ActiveRecord::Base.connection, 'simple_table') { |u| u.row(h,h) }

This runs forever, consuming 20 MB more memory every second until there is none left.
The wishlist says that Postgres should be able to take this kind of selector?
There is a unique (combined) index on column1, date_key, time_key, some_key, other_key;
it does not include the columns datetime, column_string, column2_string.

Does the selector need to contain only the unique index columns,
with the rest of the values going into the document? Like below:

all_hash_keys = [:column1, :date_key, :time_key, :some_key, :other_key, :datetime, :column_string, :column2_string]
selector_keys = [:column1, :date_key, :time_key, :some_key, :other_key]

In the code, the second parameter of Model.upsert(selector, document) is called the document; I will use the same term.

document_keys = all_hash_keys - selector_keys

I want to feed the upsert function just one hash and have it figure out the unique indexes itself, to make it generic enough to use more widely. I don't know if I'm doing something wrong or asking for something that is not a feature (yet ;)).

If I do
SimpleModel.upsert(selectors, documents)
then it just works.
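The generic pattern asked about above can be sketched as a small helper that keeps one hash per row and splits it into selector and document using the unique-index columns (`split_for_upsert` and `selector_keys` are illustrative names, not gem API):

```ruby
# Split a single attribute hash into the selector (unique-index columns)
# and the document (everything else), so the caller can pass one hash.
def split_for_upsert(row, selector_keys)
  selector = row.select { |k, _| selector_keys.include?(k) }
  document = row.reject { |k, _| selector_keys.include?(k) }
  [selector, document]
end

h = { column1: "something", date_key: 1, datetime: Time.now, column_string: "sfjda" }
selector, document = split_for_upsert(h, [:column1, :date_key])
# Upsert.batch(ActiveRecord::Base.connection, 'simple_table') { |u| u.row(selector, document) }
```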

Upsert inserts another row in Rails 4.0.2 w/ sqlite3

For the sqlite3 problem with Rails 3.2.16, see #19 (comment).

In rails 4.0.2:

➜  ~  sqlite3 --version 
3.7.17 2013-05-20 00:56:22 118a3b35693b134d56ebd780123b7fd6f1497668
➜  upsert-demo  ruby -v
ruby 2.0.0p247 (2013-06-27 revision 41674) [x86_64-linux]

Let's generate a 4.0.2 application:

rails new upsert-demo
cd upsert-demo
echo 'gem "upsert"' >> Gemfile
echo 'gem "rb-readline", "0.4.2"' >> Gemfile
bundle
rails g model try i:integer s
rake db:migrate
rails c

Upsert creates the first row successfully. When updating the same record, it duplicates it:

irb(main):005:0> u= Upsert.new Try.connection, Try.table_name 
=> #<Upsert:0x007f2b9a441b88 @table_name="tries", @flavor="Sqlite3", @adapter="SQLite3_Database", @connection=#<Upsert::Connection::SQLite3_Database:0x007f2b9a45cb90 @controller=#<Upsert:0x007f2b9a441b88 ...>, @metal=#<SQLite3::Database:0x007f2b9a426b58 @tracefunc=nil, @authorizer=nil, @encoding=nil, @busy_handler=nil, @collations={}, @functions={}, @results_as_hash=true, @type_translation=nil, @readonly=false>>, @merge_function_class=Upsert::MergeFunction::SQLite3_Database, @assume_function_exists=false>
irb(main):006:0> u.row({i: 1},{s: "String1"})
[upsert] PRAGMA table_info("tries")
[upsert] INSERT OR IGNORE INTO "tries" ("i","s") VALUES (?,?) with [1, "String1"]
[upsert] UPDATE "tries" SET "i"=?,"s"=? WHERE "i"=? with [1, "String1", 1]
=> nil
irb(main):007:0> Try.all
  Try Load (2.4ms)  SELECT "tries".* FROM "tries"
=> #<ActiveRecord::Relation [#<Try id: 1, i: 1, s: "String1", created_at: nil, updated_at: nil>]>
irb(main):008:0> u.row({i: 1},{s: "String2"})
[upsert] INSERT OR IGNORE INTO "tries" ("i","s") VALUES (?,?) with [1, "String2"]
[upsert] UPDATE "tries" SET "i"=?,"s"=? WHERE "i"=? with [1, "String2", 1]
=> nil
irb(main):009:0> Try.all
  Try Load (0.6ms)  SELECT "tries".* FROM "tries"
=> #<ActiveRecord::Relation [#<Try id: 1, i: 1, s: "String2", created_at: nil, updated_at: nil>, #<Try id: 2, i: 1, s: "String2", created_at: nil, updated_at: nil>]>
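A likely explanation (an assumption about the demo schema, not a statement about the gem): INSERT OR IGNORE only skips the insert when a uniqueness constraint is violated, and the generated `tries` table has no unique index on the selector column "i", so the second INSERT always succeeds and the UPDATE then touches both copies. A sketch of the DDL the selector column would need:

```ruby
# INSERT OR IGNORE relies on a uniqueness constraint to detect the
# existing row; without one on "i", every insert creates a new row.
ddl = 'CREATE UNIQUE INDEX IF NOT EXISTS index_tries_on_i ON "tries" ("i")'

# In the Rails demo above, the equivalent migration line would be:
#   add_index :tries, :i, unique: true
```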

Gemfile.lock:

GEM
  remote: https://rubygems.org/
  specs:
    actionmailer (4.0.2)
      actionpack (= 4.0.2)
      mail (~> 2.5.4)
    actionpack (4.0.2)
      activesupport (= 4.0.2)
      builder (~> 3.1.0)
      erubis (~> 2.7.0)
      rack (~> 1.5.2)
      rack-test (~> 0.6.2)
    activemodel (4.0.2)
      activesupport (= 4.0.2)
      builder (~> 3.1.0)
    activerecord (4.0.2)
      activemodel (= 4.0.2)
      activerecord-deprecated_finders (~> 1.0.2)
      activesupport (= 4.0.2)
      arel (~> 4.0.0)
    activerecord-deprecated_finders (1.0.3)
    activesupport (4.0.2)
      i18n (~> 0.6, >= 0.6.4)
      minitest (~> 4.2)
      multi_json (~> 1.3)
      thread_safe (~> 0.1)
      tzinfo (~> 0.3.37)
    arel (4.0.2)
    atomic (1.1.14)
    builder (3.1.4)
    coffee-rails (4.0.1)
      coffee-script (>= 2.2.0)
      railties (>= 4.0.0, < 5.0)
    coffee-script (2.2.0)
      coffee-script-source
      execjs
    coffee-script-source (1.7.0)
    erubis (2.7.0)
    execjs (2.0.2)
    hike (1.2.3)
    i18n (0.6.9)
    jbuilder (1.5.3)
      activesupport (>= 3.0.0)
      multi_json (>= 1.2.0)
    jquery-rails (3.1.0)
      railties (>= 3.0, < 5.0)
      thor (>= 0.14, < 2.0)
    json (1.8.1)
    mail (2.5.4)
      mime-types (~> 1.16)
      treetop (~> 1.4.8)
    mime-types (1.25.1)
    minitest (4.7.5)
    multi_json (1.8.4)
    polyglot (0.3.3)
    rack (1.5.2)
    rack-test (0.6.2)
      rack (>= 1.0)
    rails (4.0.2)
      actionmailer (= 4.0.2)
      actionpack (= 4.0.2)
      activerecord (= 4.0.2)
      activesupport (= 4.0.2)
      bundler (>= 1.3.0, < 2.0)
      railties (= 4.0.2)
      sprockets-rails (~> 2.0.0)
    railties (4.0.2)
      actionpack (= 4.0.2)
      activesupport (= 4.0.2)
      rake (>= 0.8.7)
      thor (>= 0.18.1, < 2.0)
    rake (10.1.1)
    rb-readline (0.4.2)
    rdoc (4.1.1)
      json (~> 1.4)
    sass (3.2.14)
    sass-rails (4.0.1)
      railties (>= 4.0.0, < 5.0)
      sass (>= 3.1.10)
      sprockets-rails (~> 2.0.0)
    sdoc (0.4.0)
      json (~> 1.8)
      rdoc (~> 4.0, < 5.0)
    sprockets (2.10.1)
      hike (~> 1.2)
      multi_json (~> 1.0)
      rack (~> 1.0)
      tilt (~> 1.1, != 1.3.0)
    sprockets-rails (2.0.1)
      actionpack (>= 3.0)
      activesupport (>= 3.0)
      sprockets (~> 2.8)
    sqlite3 (1.3.8)
    thor (0.18.1)
    thread_safe (0.1.3)
      atomic
    tilt (1.4.1)
    treetop (1.4.15)
      polyglot
      polyglot (>= 0.3.1)
    turbolinks (2.2.1)
      coffee-rails
    tzinfo (0.3.38)
    uglifier (2.4.0)
      execjs (>= 0.3.0)
      json (>= 1.8.0)
    upsert (2.0.3)

PLATFORMS
  ruby

DEPENDENCIES
  coffee-rails (~> 4.0.0)
  jbuilder (~> 1.2)
  jquery-rails
  rails (= 4.0.2)
  rb-readline (= 0.4.2)
  sass-rails (~> 4.0.0)
  sdoc
  sqlite3
  turbolinks
  uglifier (>= 1.3.0)
  upsert

Truncated incorrect datetime value with mysql

I'm seeing the following (this does not happen with sqlite):

Truncated incorrect datetime value: '2013-11-05 23:24:22.124147+00:00'
/var/www/resumedb/shared/bundle/ruby/2.0.0/gems/upsert-2.0.1/lib/upsert/connection/Mysql2_Client.rb:7:in `query'
/var/www/resumedb/shared/bundle/ruby/2.0.0/gems/upsert-2.0.1/lib/upsert/connection/Mysql2_Client.rb:7:in `execute'
/var/www/resumedb/shared/bundle/ruby/2.0.0/gems/upsert-2.0.1/lib/upsert/merge_function/Mysql2_Client.rb:17:in `execute'
/var/www/resumedb/shared/bundle/ruby/2.0.0/gems/upsert-2.0.1/lib/upsert/merge_function.rb:13:in `execute'
/var/www/resumedb/shared/bundle/ruby/2.0.0/gems/upsert-2.0.1/lib/upsert.rb:216:in `row'

I'm calling upsert like so:

Upsert.batch(Division.connection, Division.table_name) do |upsert|
  records.each do |rec|
    upsert.row({id: rec['DivId']},
               name: rec['DivName'],
               abbrev: rec['Div'],
               created_at: Time.now,
               updated_at: Time.now)
  end
end

I've checked and the timezone is:

irb(main):003:0> ActiveRecord::Base.default_timezone
=> :utc
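The error message suggests MySQL is rejecting the ISO 8601 string with fractional seconds and a UTC offset ('2013-11-05 23:24:22.124147+00:00'). A possible workaround (an assumption about the failure mode, not a confirmed fix): format times into MySQL's plain DATETIME layout before passing them to upsert.

```ruby
# Drop fractional seconds and the timezone offset, which older MySQL
# versions reject ("Truncated incorrect datetime value").
def mysql_datetime(time)
  time.utc.strftime('%Y-%m-%d %H:%M:%S')
end

mysql_datetime(Time.utc(2013, 11, 5, 23, 24, 22))  # => "2013-11-05 23:24:22"
# upsert.row({id: rec['DivId']}, created_at: mysql_datetime(Time.now))  # hypothetical
```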

License missing from gemspec

RubyGems.org doesn't report a license for your gem. This is because it is not specified in the gemspec of your last release.

via e.g.

  spec.license = 'MIT'
  # or
  spec.licenses = ['MIT', 'GPL-2']

Including a license in your gemspec is an easy way for rubygems.org and other tools to check how your gem is licensed. As you can imagine, scanning your repository for a LICENSE file or parsing the README, and then attempting to identify the license or licenses is much more difficult and more error prone. So, even for projects that already specify a license, including a license in your gemspec is a good practice. See, for example, how rubygems.org uses the gemspec to display the rails gem license.

There is even a License Finder gem to help companies/individuals ensure all gems they use meet their licensing needs. This tool depends on license information being available in the gemspec. This is an important enough issue that even Bundler now generates gems with a default 'MIT' license.

I hope you'll consider specifying a license in your gemspec. If not, please just close the issue with a nice message. In either case, I'll follow up. Thanks for your time!

Appendix:

If you need help choosing a license (sorry, I haven't checked your readme or looked for a license file), GitHub has created a license picker tool. Code without a license specified defaults to 'All rights reserved', denying others all rights to use of the code.
Here's a list of the license names I've found and their frequencies

p.s. In case you're wondering how I found you and why I made this issue, it's because I'm collecting stats on gems (I was originally looking for download data) and decided to collect license metadata,too, and make issues for gemspecs not specifying a license as a public service :). See the previous link or my blog post about this project for more information.

NULLs don't work as key values.

Doing something like this:

batch.row({
  value1: foo,
  value2: bar
}, {
  value3: baz,
  value4: umza
})

If bar is nil and there's a row in the database matching the values for foo and bar, a duplicate row will still be inserted (at least on MySQL and PostgreSQL), because value2 = NULL in the generated function always evaluates to false.

I see that this was opened in #18 but I can confirm that the issue still exists (using the upsert 2.0.1 gem)
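The fix would be a NULL-safe comparison in the generated WHERE clause: "col = NULL" is never true in SQL, so nil selector values need "IS NOT DISTINCT FROM" (PostgreSQL) or the "<=>" operator (MySQL). A sketch with an illustrative helper, not the gem's code:

```ruby
# Build a WHERE clause that treats NULL as a matchable value.
def null_safe_where(selector, flavor: :postgresql)
  op = flavor == :mysql ? "<=>" : "IS NOT DISTINCT FROM"
  selector.map { |col, v|
    v.nil? ? "#{col} IS NULL" : "#{col} #{op} ?"
  }.join(" AND ")
end

null_safe_where({ value1: "foo", value2: nil })
# => "value1 IS NOT DISTINCT FROM ? AND value2 IS NULL"
null_safe_where({ value1: "foo" }, flavor: :mysql)
# => "value1 <=> ?"
```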

Upserting with Array does not work

I am using the gem with PostgreSQL and JRuby. Has anyone tried doing Upsert with integer[] or character varying(255)[] columns? I can't get it to work and always get:

"Hint: No function matches the given name and argument types. You might need to add explicit type casts."

When the same table is upserted without the array attributes, everything works fine. But when the array attributes are included, it fails. The "function" creation has integer[] and character varying(255)[] respectively, while the resulting SQL has the values "{'12345','23456'}" for the integer[] and "{'test','test2'}" for the string[].

Does it look right?

PG Segmentation fault (sidekiq / postgres)

using upsert 2.0.3 with sidekiq 2.17.7 I get the following error

upsert-2.0.3/lib/upsert/connection/PG_Connection.rb:10: [BUG] Segmentation fault
ruby 2.0.0p451 (2014-02-24 revision 45167) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0048 p:---- s:0210 e:000209 CFUNC :exec
c:0047 p:0025 s:0205 e:000204 METHOD /home/didil/.rvm/gems/ruby-2.0.0-p451@project/gems/upsert-2.0.3/lib/upsert/connection/PG_Connection.rb:10
c:0046 p:0080 s:0200 E:001588 METHOD /home/didil/.rvm/gems/ruby-2.0.0-p451@project/gems/upsert-2.0.3/lib/upsert/merge_function/PG_Connection.rb:21
c:0045 p:0021 s:0193 e:000192 METHOD /home/didil/.rvm/gems/ruby-2.0.0-p451@project/gems/upsert-2.0.3/lib/upsert/merge_function.rb:13
c:0044 p:0032 s:0187 e:000186 METHOD /home/didil/.rvm/gems/ruby-2.0.0-p451@project/gems/upsert-2.0.3/lib/upsert.rb:216

BUG: UTF8

After updating records with upsert, all Russian names become ????? ?????. In the upsert log everything looks fine:

[2015-04-08T13:13:09.837068 #3363] DEBUG -- : [upsert] CALL upsert2_1_0_users_SEL_id_SET_balance_A_credit_A_curre4278482623(46085, 11900.03, 0.0, 'USD',  'Валерий Франков').

I use MySQL, the mysql2 adapter, and Upsert.batch. The column is CHAR(128) with UTF-8 encoding. A simple insert writes correctly.

Batch create active records

When I try to batch create records, I run into non-null constraints that Active Record would normally handle for me, e.g. created_at.

Code:

ftp.gettextfile "results.txt" ,nil do |result_line|
      result_arr = result_line.split "\t"
      hash = 
        {
          race_id: result_arr[0],
          race_name: result_arr[1],
          candidate_id: result_arr[2],
          candidate_name: result_arr[3],
          party: result_arr[4],
          county_id: result_arr[5],
          county_name: result_arr[6],
          precincts_reporting: result_arr[7],
          precincts: result_arr[8],
          votes: result_arr[9],
          race_type: result_arr[10]
        }
      results << hash
    end

    Upsert.batch(StateElectionResult.connection, StateElectionResult.table_name) do |upsert|
      results.each do |result|
        puts result
        upsert.row({race_id: result[:race_id]}, result)
      end
    end



{:race_id=>"300000", :race_name=>"Republican for President", :candidate_id=>"300002", :candidate_name=>"Fred Karger", :party=>"", :county_id=>"1", :county_name=>"Beaver", :precincts_reporting=>"9", :precincts=>"9", :votes=>"1", :race_type=>"Federal Offices"}
rake aborted!
ERROR:  null value in column "created_at" violates not-null constraint
CONTEXT:  SQL statement "INSERT INTO "state_election_results"("race_id","race_name","candidate_id","candidate_name","party","county_id","county_name","precincts_reporting","precincts","votes","race_type") VALUES ("race_id_input","race_name_input","candidate_id_input","candidate_name_input","party_input","county_id_input","county_name_input","precincts_reporting_input","precincts_input","votes_input","race_type_input")"
PL/pgSQL function "upsert_76afd9cffc3765fcc284740f9f8f8bd7" line 14 at SQL statement
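Because upsert writes rows directly, it bypasses ActiveRecord callbacks and automatic timestamping, so created_at/updated_at must be supplied by hand. A sketch of a workaround (`with_timestamps` is an illustrative helper, not gem API):

```ruby
# Merge in the timestamps ActiveRecord would normally set; values already
# present in attrs win, since attrs is merged on top of the defaults.
def with_timestamps(attrs, now: Time.now)
  { created_at: now, updated_at: now }.merge(attrs)
end

row = with_timestamps({ race_id: "300000", race_name: "Republican for President" })
# upsert.row({ race_id: row[:race_id] }, row)   # hypothetical call site
```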

Cannot update table within a schema (Postgres)


Here is where the table name is being interpolated:
https://github.com/seamusabshere/upsert/blob/master/lib/upsert/column_definition/postgresql.rb#L12

Here is how it can be fixed

If the parameter is an Array, then all its values are separately quoted
and then joined by a "." character. This can be used for identifiers in
the form "schema"."table"."column".

The line above needs to be changed to pass an array of [schema, table_name]. I'm not sure of the best way to extract the schema from the table_name string; you could also just pass the string through verbatim with simple quote escaping as an easy fix.

[23] pry(#<Datawarehouse::UpsertWorker>)> active_record.table_name= 'dialer.leads'
=> "dialer.leads"
[24] pry(#<Datawarehouse::UpsertWorker>)> active_record.upsert(update_by, attributes)
D, [2015-07-10T17:06:07.629033 #57624] DEBUG -- : [upsert] SELECT a.attname AS name, format_type(a.atttypid, a.atttypmod) AS sql_type, d.adsrc AS default
FROM pg_attribute a LEFT JOIN pg_attrdef d
  ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE a.attrelid = '"dialer.leads"'::regclass
AND a.attnum > 0 AND NOT a.attisdropped

PG::UndefinedTable: ERROR:  relation "dialer.leads" does not exist
LINE 4: WHERE a.attrelid = '"dialer.leads"'::regclass
                           ^
from /Users/zak/.rvm/gems/ruby-2.2.0/gems/upsert-2.1.0/lib/upsert/connection/PG_Connection.rb:13:in `exec'

# And with quoted table names
[25] pry(#<Datawarehouse::UpsertWorker>)> active_record.table_name= '"dialer"."leads"'
=> "\"dialer\".\"leads\""
[26] pry(#<Datawarehouse::UpsertWorker>)> active_record.upsert(update_by, attributes)
D, [2015-07-10T17:07:14.698160 #57624] DEBUG -- : [upsert] SELECT a.attname AS name, format_type(a.atttypid, a.atttypmod) AS sql_type, d.adsrc AS default
FROM pg_attribute a LEFT JOIN pg_attrdef d
  ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE a.attrelid = '"""dialer"".""leads"""'::regclass
AND a.attnum > 0 AND NOT a.attisdropped

PG::UndefinedTable: ERROR:  relation ""dialer"."leads"" does not exist
LINE 4: WHERE a.attrelid = '"""dialer"".""leads"""'::regclass
                           ^
from /Users/zak/.rvm/gems/ruby-2.2.0/gems/upsert-2.1.0/lib/upsert/connection/PG_Connection.rb:13:in `exec'
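The quoting described above can be sketched as follows: split a dotted table name into its schema and table parts and quote each separately, yielding "dialer"."leads" instead of the single identifier "dialer.leads" that triggers the errors in the transcript (`quote_qualified_table` is an illustrative helper):

```ruby
# Quote each dotted component separately, doubling any embedded quotes,
# so 'dialer.leads' becomes "dialer"."leads" rather than "dialer.leads".
def quote_qualified_table(name)
  name.split(".").map { |part| %("#{part.gsub('"', '""')}") }.join(".")
end

quote_qualified_table("dialer.leads")  # => "\"dialer\".\"leads\""
quote_qualified_table("leads")         # => "\"leads\""
```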

bug(pg): function creation may fail if called from transactions concurrently

If I run this code in a Sidekiq worker and queue them up I sometimes get failures. It would seem that the error rate can be as high as 10%.

ActiveRecord::Base.connection_pool.with_connection { |connection|
  Upsert.batch(connection, entry_class.model_name.collection) { |upsert|
    connection.transaction {
      touch
      …
      upsert.row(selector, setter)
    }
  }
}

If I do this then I do not get failures.

ActiveRecord::Base.connection_pool.with_connection { |connection|
  Upsert.batch(connection, entry_class.model_name.collection) { |upsert|
    upsert.row(selector, setter)
    connection.transaction {
      touch
      …
      upsert.row(selector, setter)
    }
  }
}

I suspect that the ‘avoid first tuple concurrently updated error’ fix put in place in 1.1.7 does not like it when I call Upsert#row in a transaction.

rspec tests fail with a "current transaction is aborted" error

It does not fail every time, and does not fail when run individually. After days of digging, I found some useful info in the PostgreSQL log:

ERROR:  function upsert2_0_3_subscribe_items_sel_uniq_index_set_action2974300400(unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown) does not exist at character 8
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
STATEMENT:  SELECT upsert2_0_3_subscribe_items_SEL_uniq_index_SET_action2974300400($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
ERROR:  current transaction is aborted, commands ignored until end of transaction block
STATEMENT:..
..          CREATE OR REPLACE FUNCTION upsert2_0_3_subscribe_items_SEL_uniq_index_SET_action2974300400("uniq_index_sel" character varying(255), "action_set" character varying(255), "created_at_set" character varying(255), "receiver_id_set" integer, "receiver_type_set" character varying(255), "resource_id_set" integer, "resource_type_set" character varying(255), "sender_id_set" integer, "sender_type_set" character varying(255), "uniq_index_set" character varying(255), "updated_at_set" character varying(255)) RETURNS VOID AS

It seems the function is used before it is created?

upserting with long setter hash fails (postgres)

Trying to upsert a database table with many columns fails while the same code with some items removed succeeds.
Simplified code:

Upsert.batch(connection, table_name) do |upsert|
  nereus_update_hash.each do |item|
    update_row = {:credit=>475358, :daily_credit=>20, :monthly_network_usage=>841123927, :online_now=>2, :online_today=>2, :mips_now=>3051, :mips_today=>3051, :active=>1, :updated_at=>Time.now, :created_at=>Time.now}
    upsert.row({:nereus_id => item[0].to_i}, update_row)
  end
end

This fails with the following error (the table name is nereus_stats_items):

ERROR:  function upsert_nereus_stats_items_sel_nereus_id_set_active_a_2167893291(unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown, unknown) is not unique
LINE 1: SELECT upsert_nereus_stats_items_SEL_nereus_id_SET_active_A_...
               ^
HINT:  Could not choose a best candidate function. You might need to add explicit type casts.

However, change update_row to either of the following and the script runs fine:

update_row = {:monthly_network_usage=>841123927, :online_now=>2, :online_today=>2, :mips_now=>3051, :mips_today=>3051, :active=>1, :updated_at=>Time.now, :created_at=>Time.now}
update_row = {:credit=>475358, :daily_credit=>20, :online_now=>2, :online_today=>2, :mips_now=>3051, :mips_today=>3051, :active=>1, :updated_at=>Time.now, :created_at=>Time.now}

I'm using:
ruby 1.9.3p194
rails 3.2.11
PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2, 64-bit
upsert 1.2.0

Any help as to why it's doing this would be greatly appreciated.
Cheers,
Alex

Nil values in the identifier selector do not match

Pet.create(name: "Jerry", age: nil, breed: "samoyed")
...
upsert.row({:name => 'Jerry', age: nil}, :breed => 'beagle', :created_at => Time.now)
...
Pet.count.should == 1 # But it has a total of 2 records, having created another one just now.

@seamusabshere, is this the expected behavior? Thanks!

Upsert created_at column?

This is something that is definitely needed.

Please add an optional param or something that can modify the column?

Looks like it just needs to address these lines:

def update_column_definitions
    setter_column_definitions.select { |cd| cd.name !~ CREATED_COL_REGEX }
end
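A sketch of the change being requested (hypothetical API, not in the gem): make the created-column filter configurable so callers can opt in to setting created_at on update. The regex value here is an assumption about what CREATED_COL_REGEX matches.

```ruby
# Assumed shape of the gem's created-column filter.
CREATED_COL_REGEX = /\Acreated_(at|on)\z/

# Hypothetical opt-out: skip the created_* filter when asked.
def update_column_names(setter_columns, skip_created: true)
  return setter_columns unless skip_created
  setter_columns.reject { |name| name =~ CREATED_COL_REGEX }
end

update_column_names(%w[created_at updated_at name])
# => ["updated_at", "name"]
update_column_names(%w[created_at updated_at name], skip_created: false)
# => ["created_at", "updated_at", "name"]
```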

Support ODBC driver

I can't say I'm surprised, but upsert doesn't support the ODBC::Database driver. It just so happens that for this one client I need to upsert to an ODBC (MSSQL) database.

If someone were to give me a clue where to start, I might be able to help here. But I haven't had a chance to dig through the code to figure out what's needed.

I suspect it's not an itch that many people need to scratch, but I figured I'd log it.

Getting PG Error from specs

I have the following method using upsert:

def job_code
  @code = JobCode.new(params[:api])
  selector = {:store_id => @code.store_id, :source_id => @code.source_id}
  setter = { :name => @code.name, :created_at => Time.now, :updated_at => Time.now }
  JobCode.upsert(selector, setter)
  render_json(@code, :created)
end

And here is the spec that is giving me trouble:

describe 'job_code' do
  context 'when it is a new job_code' do
    Given(:params) { {:format => 'json', :api=>{ :store_id=> 1, :source_id=>25, :name=>'Server'}} }
    Then { lambda{ post :job_code, params }.should change(JobCode, :count) }
  end

  context 'when the item already exists' do
    Given!(:code) { FactoryGirl.create :job_code }
    Given(:params) { {:format => 'json', :api=>{ :store_id=> code.store_id, :source_id=>code.source_id, :name=>code.name}} }
    Then { lambda{ post :job_code, params }.should_not change(JobCode, :count) }
  end
end

I can run each of these specs individually and everything works well, but when I run them together, I get the following error:

  1. ApiController job_code when the item already exists
    Failure/Error: Then { lambda{ post :job_code, params }.should_not change(JobCode, :count) }
    PG::Error:
    ERROR: current transaction is aborted, commands ignored until end of transaction block

I can call this method with a script many times back-to-back, and it works well. I'm also using upsert elsewhere without an issue. Also, all rows are successfully loaded in the database in all cases except for the test. Not sure if this is an Upsert issue or user error, but any insight is much appreciated!

integrate with AR find_or_create_by?

In Rails 4 (maybe earlier?), AR has find_or_create_by -- it has close to the same semantics as an upsert, but doesn't truly do an atomic upsert: it first does a SELECT, then does an INSERT if necessary.

But it got me thinking: would it be possible to write a true upsert function like this gem's, but one that enhanced the AR find_or_create_by instead of providing its own API? You would just write your code using the standard AR API, but if you had the upsert gem installed and were using a database the upsert gem supports, you would get the true upsert implementation. Otherwise, you would get standard AR functionality. It would be almost the same semantics either way; the difference is that since the standard AR behavior isn't atomic it has the possibility of creating a duplicate (or violating a unique constraint), whereas a true upsert doesn't. Plus different performance characteristics, of course.

If you later uninstalled the gem, it would go back to standard AR functionality.

What do you think? Good idea? Feasible?
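One way the idea above could be sketched (purely hypothetical — this is not part of the gem or of ActiveRecord; the method name and fallback behavior are assumptions):

```ruby
# Hypothetical sketch: a class-level helper that prefers an atomic upsert
# when the Upsert gem is loaded, and otherwise falls back to the standard
# (non-atomic, SELECT-then-INSERT) ActiveRecord behavior.
module AtomicFindOrCreate
  def find_or_create_by_atomically(selector)
    if defined?(Upsert)
      # Atomic path: the upsert guarantees the row exists afterwards.
      Upsert.new(connection, table_name).row(selector)
      find_by(selector)
    else
      # Non-atomic fallback: may race under concurrency.
      find_or_create_by(selector)
    end
  end
end
```

A model would `extend AtomicFindOrCreate` and call `Model.find_or_create_by_atomically(name: 'Jerry')`; uninstalling the gem would transparently restore the plain AR behavior.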

On mysql trying to write to an undefined field fails confusingly

In a data_miner block, store 'foo', :units => :bar produces the following error if the table does not have a foo_units column.

 Failure/Error: Unable to find matching line from backtrace
 Mysql2::Error:
   Incorrect number of arguments for PROCEDURE test_earth.upsert_recs_2009_responses_SEL_id_SET_computer_2_idl3386076219; expected 82, got 83
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/upsert-1.0.2/lib/upsert/connection/Mysql2_Client.rb:7:in `query'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/upsert-1.0.2/lib/upsert/connection/Mysql2_Client.rb:7:in `execute'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/upsert-1.0.2/lib/upsert/merge_function/Mysql2_Client.rb:21:in `execute'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/upsert-1.0.2/lib/upsert/merge_function.rb:11:in `execute'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/upsert-1.0.2/lib/upsert.rb:145:in `row'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/step/import.rb:95:in `block (2 levels) in start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/remote_table-2.1.0/lib/remote_table.rb:435:in `block (2 levels) in each'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/remote_table-2.1.0/lib/remote_table.rb:424:in `each'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/remote_table-2.1.0/lib/remote_table.rb:424:in `block in each'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/remote_table-2.1.0/lib/remote_table/delimited.rb:66:in `block in _each'
 # /Users/ian/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/csv.rb:1768:in `each'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/remote_table-2.1.0/lib/remote_table/delimited.rb:35:in `_each'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/remote_table-2.1.0/lib/remote_table.rb:423:in `each'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/step/import.rb:89:in `block in start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/upsert-1.0.2/lib/upsert.rb:73:in `batch'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/step/import.rb:88:in `start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/script.rb:227:in `block (2 levels) in start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/script.rb:226:in `each'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/script.rb:226:in `block in start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/run.rb:93:in `block in start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/run.rb:92:in `catch'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/run.rb:92:in `start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/script.rb:225:in `start'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/data_miner-2.4.1/lib/data_miner/active_record_class_methods.rb:26:in `run_data_miner!'
 # /Users/ian/code/earth/spec/spec_helper.rb:36:in `block (2 levels) in <top (required)>'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/hooks.rb:23:in `instance_eval'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/hooks.rb:23:in `run'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/hooks.rb:106:in `run'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/hooks.rb:424:in `run_hook'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/example_group.rb:312:in `run_before_all_hooks'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/example_group.rb:359:in `run'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/command_line.rb:28:in `block (2 levels) in run'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/command_line.rb:28:in `map'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/command_line.rb:28:in `block in run'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/reporter.rb:34:in `report'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/command_line.rb:25:in `run'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/runner.rb:69:in `run'
 # /Users/ian/.rvm/gems/ruby-1.9.2-p180/gems/rspec-core-2.11.1/lib/rspec/core/runner.rb:8:in `block in autorun'

Race condition on procedure creation? (sidekiq / mysql2)

In a multithreaded environment (Sidekiq), I got one exception during a batch upsert:

Mysql2::Error: Mysql2::Error: PROCEDURE upsert2_0_3_push_filters_pushes_SEL_push_filter_id_A_4228041800 already exists

[SNIP]

File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert.rb", line 72 in batch
File "[PROJECT_ROOT]/app/workers/push_preprocessor_worker.rb", line 87 in block (2 levels) in attach_push_filters_to_push!
File "[PROJECT_ROOT]/app/workers/push_preprocessor_worker.rb", line 87 in each
File "[PROJECT_ROOT]/app/workers/push_preprocessor_worker.rb", line 88 in block (3 levels) in attach_push_filters_to_push!
File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert.rb", line 216 in row
File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert/merge_function.rb", line 13 in execute
File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert/merge_function/Mysql2_Client.rb", line 16 in execute
File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert/merge_function/Mysql2_Client.rb", line 23 in rescue in execute
File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert/merge_function/mysql.rb", line 28 in create!
File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert/connection/Mysql2_Client.rb", line 7 in execute
File "[GEM_ROOT]/gems/upsert-2.0.3/lib/upsert/connection/Mysql2_Client.rb", line 7 in query

Is this expected? Would it be possible to retry the upsert automatically in that case?
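Until the gem handles this itself, a client-side workaround is possible: retry once when another thread wins the race to CREATE PROCEDURE. This is only a sketch; the error-message match is an assumption based on the MySQL error text quoted above.

```ruby
# Retry a block when a concurrent worker already created the stored
# procedure (detected by the "already exists" error message).
def retry_on_existing_procedure(attempts = 2)
  yield
rescue StandardError => e
  raise unless e.message =~ /already exists/ && (attempts -= 1) > 0
  retry
end
```

Usage would wrap the `upsert.row(...)` call, so the second attempt finds the procedure already in place and simply calls it.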

Configure attributes not to be updated (other than created_(at|on))

Sometimes it makes sense not to update fields, but you still need to set them for new records. You are already doing this for created_(at|on). We have a use case for not updating an additional field.

This would be much easier to implement than different create_atts and update_atts like suggested in
#17 by passing an option and then using it in here:

  def update_column_definitions
    setter_column_definitions.select { |cd| cd.name !~ CREATED_COL_REGEX }
  end

Does that make sense? Would you integrate such a pull request into your gem?
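The suggestion above could be sketched roughly as follows (names and regex are hypothetical stand-ins for the gem's internals, not the actual implementation):

```ruby
# Sketch of the suggested change: in addition to created_at/created_on,
# let callers pass a list of column names to skip on UPDATE.
CREATED_COL_REGEX = /\Acreated_(at|on)\z/

def update_column_names(all_setter_columns, ignored_columns = [])
  skip = ignored_columns.map(&:to_s)
  all_setter_columns.reject do |name|
    name.to_s =~ CREATED_COL_REGEX || skip.include?(name.to_s)
  end
end
```

The ignored columns would still appear in the INSERT branch of the merge function, so they get set for new records but never overwritten on update.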

Getting: TypeError: wrong argument type Class (expected Module) when using upsert

1.9.2p290 :027 > Product.superclass
 => ActiveRecord::Base 

1.9.2p290 :028 > upsert = Upsert.new(Product.connection, Product.table_name)
TypeError: wrong argument type Class (expected Module)
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/gems/upsert-0.3.1/lib/upsert.rb:79:in `extend'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/gems/upsert-0.3.1/lib/upsert.rb:79:in `initialize'
    from (irb):28:in `new'
    from (irb):28
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/bundler/gems/rails-e10dc0eb8aa7/railties/lib/rails/commands/console.rb:47:in `start'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/bundler/gems/rails-e10dc0eb8aa7/railties/lib/rails/commands/console.rb:8:in `start'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/bundler/gems/rails-e10dc0eb8aa7/railties/lib/rails/commands.rb:41:in `<top (required)>'
    from script/rails:6:in `require'
    from script/rails:6:in `<main>'


1.9.2p290 :029 > Product.upsert({:name => '1'}, :name => '2')
TypeError: wrong argument type Class (expected Module)
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/gems/upsert-0.3.1/lib/upsert.rb:79:in `extend'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/gems/upsert-0.3.1/lib/upsert.rb:79:in `initialize'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/gems/upsert-0.3.1/lib/upsert/active_record_upsert.rb:5:in `new'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/gems/upsert-0.3.1/lib/upsert/active_record_upsert.rb:5:in `block in upsert'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/bundler/gems/rails-e10dc0eb8aa7/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:123:in `with_connection'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/gems/upsert-0.3.1/lib/upsert/active_record_upsert.rb:4:in `upsert'
    from (irb):29
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/bundler/gems/rails-e10dc0eb8aa7/railties/lib/rails/commands/console.rb:47:in `start'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/bundler/gems/rails-e10dc0eb8aa7/railties/lib/rails/commands/console.rb:8:in `start'
    from /Users/joe/projects/tanga/bundler/ruby/1.9.1/bundler/gems/rails-e10dc0eb8aa7/railties/lib/rails/commands.rb:41:in `<top (required)>'
    from script/rails:6:in `require'
    from script/rails:6:in `<main>'

Upsert "Timestamp without time zone" error (postgresql)

Suddenly, upsert throws errors.

  • The created_at field has a Time.now as value (a Datetime)
  • This gets converted into 2012-12-10 15:31:45.204132+00:00
  • Postgresql throws an error: ERROR: operator does not exist: timestamp without time zone
  • Postgresql suggests I might need an explicit type cast

See output below:

[upsert] SELECT upsert_indicators_SEL_aspect_id_A_code_A_created_at_A3127343750($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24) with [nil, "Alumni_arbeid", "2012-12-10 15:31:45.204084+00:00", "Afgestudeerdenoordeel: arbeidsmarktrelevantie", "arbeidsmarkt", nil, 0, nil, nil, nil, "2012-12-10 15:31:45.204132+00:00", "1", nil, "Alumni_arbeid", "2012-12-10 15:31:45.204084+00:00", "Afgestudeerdenoordeel: arbeidsmarktrelevantie", "arbeidsmarkt", nil, 0, nil, nil, nil, "2012-12-10 15:31:45.204132+00:00", "1"]
[Worker(host:Mark-Thinkpad pid:7854)] Class#import_indicators failed with PG::Error: ERROR:  operator does not exist: timestamp without time zone = character varying
LINE 2: ..._id_sel" AND "code" = "code_sel" AND "created_at" = "created...
                                                             ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY:  UPDATE "indicators" SET "aspect_id" = "aspect_id_set", "code" = "code_set", "created_at" = CAST("created_at_set" AS timestamp without time zone), "description" = "description_set", "name" = "name_set", "norm_hbo" = "norm_hbo_set", "norm_sign" = "norm_sign_set", "norm_wo" = "norm_wo_set", "sd_hbo" = "sd_hbo_set", "sd_wo" = "sd_wo_set", "updated_at" =     CAST("updated_at_set" AS timestamp without time zone), "weight" = "weight_set"
            WHERE "aspect_id" = "aspect_id_sel" AND "code" = "code_sel" AND "created_at" = "created_at_sel" AND "description" = "description_sel" AND "name" = "name_sel" AND "norm_hbo" = "norm_hbo_sel" AND "norm_sign" = "norm_sign_sel" AND "norm_wo" = "norm_wo_sel" AND "sd_hbo" = "sd_hbo_sel" AND "sd_wo" = "sd_wo_sel" AND "updated_at" = "updated_at_sel" AND "weight" = "weight_sel"
CONTEXT:  PL/pgSQL function "upsert_indicators_sel_aspect_id_a_code_a_created_at_a3127343750" line 7 at SQL statement
 - 0 failed attempts

Slower than AR?

Hi!
I'm trying to use this gem and there is something I cannot understand...

# using upsert gem
# >>>>>> finished in 382.501348 sec <<<<<<<
now = Time.zone.now.utc
Upsert.batch(Certificate.connection, Certificate.table_name) do |upsert|
  CSV.foreach(@file, CSV_OPTIONS) do |row|
    selector = { part_number: row[1].tr("'", ""), manufacturer: row[4] }
    fields = { date: Date.strptime(row[0], "%d-%b-%y"),
               number: row[2],
               note: row[3],
               address: row[5],
               created_at: now,
               updated_at: now }
    upsert.row(selector, fields)
  end
end

while

# AR
# >>>>>> finished in 210.520644 sec <<<<<<<
CSV.foreach(@file, CSV_OPTIONS) do |row|
  params = { date: Date.strptime(row[0], "%d-%b-%y"),
             part_number: row[1].tr("'", ""),
             number: row[2],
             note: row[3],
             manufacturer: row[4],
             address: row[5] }
  Certificate
    .find_or_initialize_by(part_number: params[:part_number], manufacturer: params[:manufacturer]) do |m|
      m.assign_attributes params
    end
    .tap do |m|
      if m.persisted?
        m.update_columns(params.slice(:date, :number, :note, :address))
      else
        m.save!(validate: false)
      end
    end
end

Am I missing something? I'm using PG 9.3, btw.
The file I was loading is about 32k lines. When I used it to load a small number of records (about 100 lines), upsert was fastest, but not with the real CSV.

P.S.
Rails 4.0.13

PG merge function and nulls: `null value in column "iso_3166_code" violates not-null constraint (PG::Error)`

the pg merge function needs to be smarter about nulls where there can't be nulls

I, [2012-06-19T11:20:58.262005 #90857]  INFO -- : [data_miner] Country FAILED (5.98s)
/Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/upsert-0.1.2/lib/upsert/buffer/pg_connection.rb:25:in `exec': ERROR:  null value in column "iso_3166_code" violates not-null constraint (PG::Error)
CONTEXT:  SQL statement "UPDATE countries SET iso_3166_code = iso_3166_code_input,iso_3166_numeric_code = iso_3166_numeric_code_input,iso_3166_alpha_3_code = iso_3166_alpha_3_code_input,name = name_input,heating_degree_days = heating_degree_days_input,heating_degree_days_units = heating_degree_days_units_input,cooling_degree_days = cooling_degree_days_input,cooling_degree_days_units = cooling_degree_days_units_input,automobile_urbanity = automobile_urbanity_input,automobile_fuel_efficiency = automobile_fuel_efficiency_input,automobile_fuel_efficiency_units = automobile_fuel_efficiency_units_input,automobile_city_speed = automobile_city_speed_input,automobile_city_speed_units = automobile_city_speed_units_input,automobile_highway_speed = automobile_highway_speed_input,automobile_highway_speed_units = automobile_highway_speed_units_input,automobile_trip_distance = automobile_trip_distance_input,automobile_trip_distance_units = automobile_trip_distance_units_input,electricity_emission_factor = electricity_emission_factor_input,electricity_emission_factor_units = electricity_emission_factor_units_input,electricity_loss_factor = electricity_loss_factor_input,flight_route_inefficiency_factor = flight_route_inefficiency_factor_input,lodging_occupancy_rate = lodging_occupancy_rate_input,lodging_natural_gas_intensity = lodging_natural_gas_intensity_input,lodging_natural_gas_intensity_units = lodging_natural_gas_intensity_units_input,lodging_fuel_oil_intensity = lodging_fuel_oil_intensity_input,lodging_fuel_oil_intensity_units = lodging_fuel_oil_intensity_units_input,lodging_electricity_intensity = lodging_electricity_intensity_input,lodging_electricity_intensity_units = lodging_electricity_intensity_units_input,lodging_district_heat_intensity = lodging_district_heat_intensity_input,lodging_district_heat_intensity_units = lodging_district_heat_intensity_units_input,rail_passengers = rail_passengers_input,rail_trip_distance = rail_trip_distance_input,rail_trip_distance_units 
= rail_trip_distance_units_input,rail_speed = rail_speed_input,rail_speed_units = rail_speed_units_input,rail_trip_electricity_intensity = rail_trip_electricity_intensity_input,rail_trip_electricity_intensity_units = rail_trip_electricity_intensity_units_input,rail_trip_diesel_intensity = rail_trip_diesel_intensity_input,rail_trip_diesel_intensity_units = rail_trip_diesel_intensity_units_input,rail_trip_co2_emission_factor = rail_trip_co2_emission_factor_input,rail_trip_co2_emission_factor_units = rail_trip_co2_emission_factor_units_input WHERE name = name_input"
PL/pgSQL function "merge_countries_68363704441" line 4 at SQL statement
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/upsert-0.1.2/lib/upsert/buffer/pg_connection.rb:25:in `execute'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/upsert-0.1.2/lib/upsert/buffer.rb:48:in `add'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/upsert-0.1.2/lib/upsert.rb:71:in `row'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/step/import.rb:96:in `block (2 levels) in start'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/remote_table-2.0.1/lib/remote_table.rb:429:in `block (2 levels) in each'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/remote_table-2.0.1/lib/remote_table.rb:418:in `each'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/remote_table-2.0.1/lib/remote_table.rb:418:in `block in each'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/remote_table-2.0.1/lib/remote_table/delimited.rb:66:in `block in _each'
  from /Users/seamusabshere/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/csv.rb:1792:in `each'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/remote_table-2.0.1/lib/remote_table/delimited.rb:35:in `_each'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/remote_table-2.0.1/lib/remote_table.rb:417:in `each'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/step/import.rb:90:in `block in start'
  from /Users/seamusabshere/.rvm/gems/ruby-1.9.3-p194/gems/upsert-0.1.2/lib/upsert.rb:37:in `stream'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/step/import.rb:89:in `start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/script.rb:223:in `block (2 levels) in start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/script.rb:222:in `each'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/script.rb:222:in `block in start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/run.rb:93:in `block in start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/run.rb:92:in `catch'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/run.rb:92:in `start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/script.rb:221:in `start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/active_record_class_methods.rb:26:in `run_data_miner!'
  from /Users/seamusabshere/code/data_miner/lib/data_miner.rb:82:in `block (2 levels) in start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner.rb:81:in `map'
  from /Users/seamusabshere/code/data_miner/lib/data_miner.rb:81:in `block in start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner/script.rb:11:in `uniq'
  from /Users/seamusabshere/code/data_miner/lib/data_miner.rb:80:in `start'
  from /Users/seamusabshere/code/data_miner/lib/data_miner.rb:128:in `run'
  from /Users/seamusabshere/code/data_miner/test/test_earth_tap.rb:10:in `<top (required)>'

Sum the new setter with the data

Hello, I don't know if it is possible to do this using this gem; here is the case:

connection = Mysql2::Client.new([...])
Upsert.batch(connection, :pets) do |upsert|
  # N times...
  upsert.row({:name => 'product1'}, :count => 2)
  upsert.row({:name => 'product2'}, :count => 10)
end

I don't want to just overwrite the count attribute: if product1 has count: 1 in the database, I want to add the new value to the stored value. In this case product1.count becomes 3 (2+1).

Is it possible? Thanks all
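As far as I can tell, Upsert's row API replaces setter values rather than combining them, so summing isn't supported directly. On MySQL, though, a raw statement with ON DUPLICATE KEY UPDATE can express it, assuming a UNIQUE index on name. A sketch that builds such a statement (values are inlined here only for clarity; real code should use placeholders):

```ruby
# Build a MySQL statement that inserts a row, or, on duplicate key,
# adds the incoming count to the stored count instead of replacing it.
# Assumes a UNIQUE index on `name`.
def increment_upsert_sql(table, name_value, count_value)
  "INSERT INTO #{table} (name, count) VALUES ('#{name_value}', #{count_value.to_i}) " \
  "ON DUPLICATE KEY UPDATE count = count + VALUES(count)"
end
```

Executing `increment_upsert_sql(:pets, 'product1', 2)` through the Mysql2 connection would turn an existing count of 1 into 3.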

"Thread.exclusive is deprecated" warning on Ruby 2.3.0

The first call to the #logger method in the Upsert gem results in a deprecation warning on ruby 2.3.0

Thread.exclusive is deprecated, use Mutex

It seems like a temporary way to avoid the warning is to explicitly set the logger before this method is called using #logger=; however, to maintain compatibility with future versions, the #logger method may need to be updated.
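A Mutex-based replacement for the deprecated Thread.exclusive pattern might look like this (a sketch only; the gem's actual logger code may differ):

```ruby
require 'logger'

# Guard lazy initialization of a shared logger with a Mutex instead of
# the deprecated Thread.exclusive.
LOGGER_MUTEX = Mutex.new

def default_logger
  # Fast path avoids locking once the logger exists; the ||= inside the
  # lock handles the race where two threads pass the fast path together.
  @logger || LOGGER_MUTEX.synchronize { @logger ||= Logger.new($stderr) }
end
```

This keeps the same lazy, thread-safe semantics without the deprecation warning on Ruby 2.3+.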

Procedure re-created on every use

I'm using upsert rails plugin (ModelName.upsert(...))

Every time I call upsert method on rails plugin, I see new log entry

Creating or replacing database function "upsert2_1_0....

The procedure name is the same, but it's still re-created on every call. Is this a bug or is it intentional?

The GitHub page says that it should re-use existing functions. ("Transparently creates (and re-uses) stored procedures/functions when necessary")

Upsert fails silently in Rails when run with SQLite3

I am running on a rails 3.2.13 application, along with sqlite3 1.3.7, attempting to use upsert 1.2.0 in my application but it doesn't seem to want to work and I'm stumped as to what I'm doing wrong. Here's what I've done so far:

  1. Added gem 'upsert', '~> 1.2.0' to my Gemfile
  2. Ran rake db:test:prepare
  3. Run rails c test, attempt to use upsert:
> require 'upsert/active_record_upsert'
=> true
> Program.upsert({uid: UUIDTools::UUID.timestamp_create.to_s}, name: 'Foobar')
=> nil
> Program.count
=> 0

Examining the log/test.log here is the output:

Connecting to database specified by database.yml
[upsert] PRAGMA table_info("programs")
[upsert] INSERT OR IGNORE INTO "programs" ("name","uid") VALUES (?,?) with ["Foobar", "72736c0c-addd-11e2-a105-c82a14fffe94"]
[upsert] UPDATE "programs" SET "name"=?,"uid"=? WHERE "uid"=? with ["Foobar", "72736c0c-addd-11e2-a105-c82a14fffe94", "72736c0c-addd-11e2-a105-c82a14fffe94"]

Any idea what I might be doing wrong?

Problem with :hstore

I tried upsert with an hstore column and got:

NameError: uninitialized constant PgHstore
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/upsert-2.0.3/lib/upsert/connection/postgresql.rb:18:in `bind_value'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/upsert-2.0.3/lib/upsert/merge_function/PG_Connection.rb:21:in `block in execute'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/upsert-2.0.3/lib/upsert/merge_function/PG_Connection.rb:21:in `map'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/upsert-2.0.3/lib/upsert/merge_function/PG_Connection.rb:21:in `execute'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/upsert-2.0.3/lib/upsert/merge_function.rb:13:in `execute'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/upsert-2.0.3/lib/upsert.rb:216:in `row'
/Users/yurgon/Rails/skidler/lib/tasks/import.rake:41:in `block (3 levels) in <top (required)>'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/saxerator-0.9.4/lib/saxerator/parser/accumulator.rb:20:in `call'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/saxerator-0.9.4/lib/saxerator/parser/accumulator.rb:20:in `end_element'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/nokogiri-1.6.3.1/lib/nokogiri/xml/sax/document.rb:127:in `end_element_namespace'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/saxerator-0.9.4/lib/saxerator/parser/latched_accumulator.rb:14:in `check_latches_and_passthrough'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/saxerator-0.9.4/lib/saxerator/parser/latched_accumulator.rb:53:in `end_element_namespace'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/nokogiri-1.6.3.1/lib/nokogiri/xml/sax/parser.rb:95:in `parse_with'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/nokogiri-1.6.3.1/lib/nokogiri/xml/sax/parser.rb:95:in `parse_io'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/nokogiri-1.6.3.1/lib/nokogiri/xml/sax/parser.rb:82:in `parse'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/gems/saxerator-0.9.4/lib/saxerator/document_fragment.rb:19:in `each'
/Users/yurgon/Rails/skidler/lib/tasks/import.rake:23:in `block (2 levels) in <top (required)>'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/bin/ruby_executable_hooks:15:in `eval'
/Users/yurgon/.rvm/gems/ruby-2.0.0-p481@railstutorial_rails_4_0/bin/ruby_executable_hooks:15:in `<main>'

PG::UndefinedFunction: ERROR: function with OID 1148257 does not exist

Some of my offline workers run into this error when using upsert:

PG::UndefinedFunction: ERROR: function with OID 1148257 does not exist

The job fails, and later upon retry, everything works. I just wanted to file in case anybody else was seeing this with upsert. Thanks for your work on this library.

Stack is

  • Postgres 9.3.4
  • upsert 2.1.0
  • pg 0.17.1 (Ruby gem)
  • Ruby 2.1.3, Rails 3.2, Sidekiq 2.16.1, but I don't think that matters

Stacktrace:

vendor/bundle/ruby/2.1.0/gems/upsert-2.1.0/lib/upsert/connection/PG_Connection.rb:13 in execute
vendor/bundle/ruby/2.1.0/gems/upsert-2.1.0/lib/upsert/merge_function/postgresql.rb:106 in create!
vendor/bundle/ruby/2.1.0/gems/upsert-2.1.0/lib/upsert/merge_function.rb:39 in initialize
vendor/bundle/ruby/2.1.0/gems/upsert-2.1.0/lib/upsert.rb:229 in new
vendor/bundle/ruby/2.1.0/gems/upsert-2.1.0/lib/upsert.rb:229 in merge_function
vendor/bundle/ruby/2.1.0/gems/upsert-2.1.0/lib/upsert.rb:218 in row
app/models/feed_item.rb:106 in block (3 levels) in add_to_user_feeds

PostgreSQL error » duplicate key value violates unique constraint "pg_proc_proname_args_nsp_index"

ERROR:  duplicate key value violates unique constraint "pg_proc_proname_args_nsp_index"
DETAIL:  Key (proname, proargtypes, pronamespace)=(upsert2_1_0_measurement_data_sel_measurement_point_id2088041071, 23 1043 1043 23 1043 701, 2200) already exists.

/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert/connection/PG_Connection.rb:13:in `exec'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert/connection/PG_Connection.rb:13:in `execute'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert/merge_function/postgresql.rb:106:in `create!'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert/merge_function.rb:39:in `initialize'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert.rb:229:in `new'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert.rb:229:in `merge_function'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert.rb:218:in `row'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert/active_record_upsert.rb:6:in `block in upsert'
/var/app/current/vendor/bundle/ruby/2.0/gems/activerecord-4.2.0/lib/active_record/connection_adapters/abstract/connection_pool.rb:292:in `with_connection'
/var/app/current/vendor/bundle/ruby/2.0/gems/upsert-2.1.0/lib/upsert/active_record_upsert.rb:4:in `upsert'

I'm using upsert as part of a background process (Delayed::Job) dealing with inbound data on PostgreSQL with a number of processors.

PostgreSQL shows the following: http://www.postgresql.org/message-id/[email protected]
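A common way to avoid concurrent CREATE FUNCTION collisions like this is to serialize creation behind a Postgres advisory lock keyed on the function name, so only one worker creates it at a time. The gem does not currently do this; the helper below is a hypothetical sketch of how the lock statement could be derived:

```ruby
require 'zlib'

# Derive a stable advisory-lock key from the generated function name and
# build the SQL that takes a transaction-scoped advisory lock. Issued
# before CREATE OR REPLACE FUNCTION, it would serialize concurrent workers.
def advisory_lock_sql(function_name)
  key = Zlib.crc32(function_name) # stable 32-bit key, fits Postgres bigint
  "SELECT pg_advisory_xact_lock(#{key})"
end
```

Because the key is derived deterministically from the function name, every worker trying to create the same function contends on the same lock, while unrelated functions proceed in parallel.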

Making columns to be ignored on update a configuration

It's nice that created_at and created_on are being ignored on updates, but it'd be even nicer if this could be expanded to encompass a user-specified list of columns 😄

Has any thought been put into this before? I have some fields that are required on insert, but don't need to be continually updated past that point (and don't change after insertion).
