Giter Site home page Giter Site logo

Comments (16)

brycethornton avatar brycethornton commented on August 21, 2024 1

Yeah, I haven't been able to find a way to do it since the shadow table creation and insertion are all within the same transaction. I can't get any visibility.

from pg-osc.

brycethornton avatar brycethornton commented on August 21, 2024 1

My workaround on RDS was to monitor the free storage space. Since this was such a big table I could roughly see how far along it was as free storage space shrunk:

image

from pg-osc.

jfrost avatar jfrost commented on August 21, 2024 1

Oh good point and you could watch the raw metric in cloudwatch if it's a smaller table. That would work well enough for DBs with relatively stable sizes and big tables like you mentioned!

from pg-osc.

ngan avatar ngan commented on August 21, 2024 1

@shayonj is it possible to do this on the shadow table and compare it to the original table? That'll give you a decent sense of progress. I have no idea whether this works on the shadow table while it's in a transaction.

SELECT pg_size_pretty(pg_total_relation_size('"<schema>"."<table>"'));

from pg-osc.

jfrost avatar jfrost commented on August 21, 2024

You could take a look at pg_stat_activity with something like:

SELECT *
  FROM pg_stat_activity
 WHERE state <> 'idle'
;

and see what's happening on the db side.

from pg-osc.

brycethornton avatar brycethornton commented on August 21, 2024

@jfrost Yeah, I can see the INSERT INTO running in the process list. Just no idea how long it will take. My original command failed with an PQconsumeInput() server closed the connection unexpectedly (PG::ConnectionBad) error after over 20 hours. I have it running again to see if I can actually get it to run successfully. It brings to mind so many nice potential features:

  • Showing progress
  • Restarting a failed process (need to keep persisted state somehow)
  • Some sort of preview/recommendation feature that scan the size of your database, looks at the host resources (CPU/Memory), and various PG settings (max_wal_size etc.), and gives an idea of how long the process might take and any tweaks that might help.

Anyway, we'll see how this turns out. I also have a custom SQL script running on a test instance that uses a different approach for changing a primary key from int4 to int8. Hopefully one of these will succeed and I can run it on the real thing and move on.

from pg-osc.

brycethornton avatar brycethornton commented on August 21, 2024

Quick update, I did finally get the command to finish on a test instance. It took a little over 21 hours. Some way to show progress would be nice, though. Feel free to close this issue if there's no viable way to do it.

from pg-osc.

sasharevzin avatar sasharevzin commented on August 21, 2024

@brycethornton They have --verbose command to show you what is currently happening.

from pg-osc.

brycethornton avatar brycethornton commented on August 21, 2024

They have --verbose command to show you what is currently happening.

@sasharevzin Thanks. Yeah, that does add some handy output but the specific area where I would love some more visibility is during the Copying contents.. portion. I know what happens during this time but I don't know how far long it is at any given time. It's not a huge deal on smaller tables but on a really large table it would be really hand to know that it's 25%, 50%, 75%, etc. done with the "INSERT INTO..." statement. I don't think there's a good way to do this unfortunately.

from pg-osc.

shayonj avatar shayonj commented on August 21, 2024

I wonder if we periodically compare the count of the rows between the two tables and log a percentage complete every minute or so. I guess it might be hard to do with the transaction that is running the copy not completed πŸ€”

from pg-osc.

jfrost avatar jfrost commented on August 21, 2024

Sorry I'm late writing back...vacation got a little hectic. The joys of MVCC keep us from really seeing what the row count is. The only thing I can think of would be to look at the on disk file sizes, but you can only do that if you have access to the filesystem, so no way to do it on platforms like RDS.

from pg-osc.

shayonj avatar shayonj commented on August 21, 2024

Thats a good call. I think a good approach would be to aim for a balanced estimation. It seems using pg_table_size would be more suitable in this scenario, given that pg_total_relation_size includes indices, which aren't present in the shadow table during the copy_data! process. pg_table_size gives us a closer approximation of the table's size alone (like say if we were to instead perform a count(*) on both tables for estimate).

However, there are a few things to keep in mind, such as the possibility of the original table being bloated. To address this, we could run an analyze and vacuum on it beforehand. This also means the progress indicator in the logs might not reach 100%, as the new table could end up being smaller. But I think that's acceptable, as long as we have some form of an ongoing progress indicator.

PR incoming shortly.

from pg-osc.

shayonj avatar shayonj commented on August 21, 2024

ah! i missed that the table creation happens from the same transaction as well, which means the table isn't visible to another connection either.

This is interesting....

def setup_shadow_table!
# re-uses transaction with serializable
# This ensures that all queries from here till copy_data run with serializable.
# This is to to ensure that once the trigger is added to the primay table
# and contents being copied into the shadow, after a delete all on audit table,
# any replaying of rows that happen next from audit table do not contain
# any duplicates. We are ensuring there are no race conditions between
# adding the trigger, till the copy ends, since they all happen in the
# same serializable transaction.
Query.run(client.connection, "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE", true)

UPDATE: Fixed

from pg-osc.

shayonj avatar shayonj commented on August 21, 2024

v0.9.7 is now out with the new estimated progress in logs during copy: https://github.com/shayonj/pg-osc/releases/tag/v0.9.7

Please feel free to share any ideas, feedback and bugs. Thanks everyone for all the brainstorming and feedback.

Closing now.

from pg-osc.

brycethornton avatar brycethornton commented on August 21, 2024

Thank you, @shayonj! I'm excited to give this a spin soon!

from pg-osc.

shayonj avatar shayonj commented on August 21, 2024

Woot! Would love to hear your thoughts.

Going to release 0.9.8 today evening. Looks like there are some gem version conflicts currently. Stay tuned.

UPDATE: https://github.com/shayonj/pg-osc/releases/tag/v0.9.8

from pg-osc.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.