Comments (16)
Yeah, I haven't been able to find a way to do it since the shadow table creation and insertion are all within the same transaction. I can't get any visibility.
from pg-osc.
My workaround on RDS was to monitor the free storage space. Since this was such a big table I could roughly see how far along it was as free storage space shrunk:
from pg-osc.
Oh good point and you could watch the raw metric in cloudwatch if it's a smaller table. That would work well enough for DBs with relatively stable sizes and big tables like you mentioned!
from pg-osc.
@shayonj is it possible to do this on the shadow table and compare it to the original table? That'll give you a decent sense of progress. I have no idea whether this works on the shadow table while it's in a transaction.
SELECT pg_size_pretty(pg_total_relation_size('"<schema>"."<table>"'));
from pg-osc.
You could take a look at pg_stat_activity
with something like:
SELECT *
FROM pg_stat_activity
WHERE state <> 'idle'
;
and see what's happening on the db side.
from pg-osc.
@jfrost Yeah, I can see the INSERT INTO
running in the process list. Just no idea how long it will take. My original command failed with an PQconsumeInput() server closed the connection unexpectedly (PG::ConnectionBad)
error after over 20 hours. I have it running again to see if I can actually get it to run successfully. It brings to mind so many nice potential features:
- Showing progress
- Restarting a failed process (need to keep persisted state somehow)
- Some sort of preview/recommendation feature that scan the size of your database, looks at the host resources (CPU/Memory), and various PG settings (
max_wal_size
etc.), and gives an idea of how long the process might take and any tweaks that might help.
Anyway, we'll see how this turns out. I also have a custom SQL script running on a test instance that uses a different approach for changing a primary key from int4 to int8. Hopefully one of these will succeed and I can run it on the real thing and move on.
from pg-osc.
Quick update, I did finally get the command to finish on a test instance. It took a little over 21 hours. Some way to show progress would be nice, though. Feel free to close this issue if there's no viable way to do it.
from pg-osc.
@brycethornton They have --verbose
command to show you what is currently happening.
from pg-osc.
They have
--verbose
command to show you what is currently happening.
@sasharevzin Thanks. Yeah, that does add some handy output but the specific area where I would love some more visibility is during the Copying contents..
portion. I know what happens during this time but I don't know how far long it is at any given time. It's not a huge deal on smaller tables but on a really large table it would be really hand to know that it's 25%, 50%, 75%, etc. done with the "INSERT INTO..." statement. I don't think there's a good way to do this unfortunately.
from pg-osc.
I wonder if we periodically compare the count of the rows between the two tables and log a percentage complete every minute or so. I guess it might be hard to do with the transaction that is running the copy not completed π€
from pg-osc.
Sorry I'm late writing back...vacation got a little hectic. The joys of MVCC keep us from really seeing what the row count is. The only thing I can think of would be to look at the on disk file sizes, but you can only do that if you have access to the filesystem, so no way to do it on platforms like RDS.
from pg-osc.
Thats a good call. I think a good approach would be to aim for a balanced estimation. It seems using pg_table_size
would be more suitable in this scenario, given that pg_total_relation_size
includes indices, which aren't present in the shadow table during the copy_data!
process. pg_table_size
gives us a closer approximation of the table's size alone (like say if we were to instead perform a count(*)
on both tables for estimate).
However, there are a few things to keep in mind, such as the possibility of the original table being bloated. To address this, we could run an analyze and vacuum on it beforehand. This also means the progress indicator in the logs might not reach 100%, as the new table could end up being smaller. But I think that's acceptable, as long as we have some form of an ongoing progress indicator.
PR incoming shortly.
from pg-osc.
ah! i missed that the table creation happens from the same transaction as well, which means the table isn't visible to another connection either.
This is interesting....
pg-osc/lib/pg_online_schema_change/orchestrate.rb
Lines 166 to 175 in baa638f
UPDATE: Fixed
from pg-osc.
v0.9.7 is now out with the new estimated progress in logs during copy: https://github.com/shayonj/pg-osc/releases/tag/v0.9.7
Please feel free to share any ideas, feedback and bugs. Thanks everyone for all the brainstorming and feedback.
Closing now.
from pg-osc.
Thank you, @shayonj! I'm excited to give this a spin soon!
from pg-osc.
Woot! Would love to hear your thoughts.
Going to release 0.9.8 today evening. Looks like there are some gem version conflicts currently. Stay tuned.
UPDATE: https://github.com/shayonj/pg-osc/releases/tag/v0.9.8
from pg-osc.
Related Issues (20)
- Set the appropriate next sequence value on the new table HOT 11
- εΎδΈιηε·₯ε · HOT 1
- Potential need for a `pull-batch-delay` option HOT 4
- Views aren't updated after the swap HOT 7
- Auto-Vacuum Isn't Re-enabled HOT 14
- Using pg-osc only for backfilling data HOT 4
- Copy phase is acquiring a lock on my original table I don't know why HOT 18
- Views from different schema aren't updated because of error HOT 1
- INSERT's are failing for long table names HOT 2
- Validating foreign key constraints locks table HOT 17
- Scope of this tool HOT 1
- Add a preview/dry-run command
- fails to get view definition if it is from another schema HOT 4
- Does pg-osc is able to insert millions of rows to an existing table? HOT 2
- will support `after` sql syntax ? HOT 1
- How does this will work in a live project. HOT 4
- Unable to perform alterations more than once on a table. HOT 2
- Update Readme reg- docker setup of pg-osc HOT 2
- Error case with delta-count and pull-batch-count values HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pg-osc.