Comments (10)
Yep, so the current query looks like:
INSERT INTO #{shadow_table}
SELECT #{columns}
FROM ONLY #{client.table}
A custom SQL would just replace the 2nd and 3rd lines above with something like:
SELECT a.foo,a.bar,a.baz,a.rental_id,r.tenant_id AS tenant_id
FROM ONLY examples a
LEFT OUTER JOIN rentals r
ON a.rental_id = r.id
from pg-osc.
One other reason came to me this morning. Suppose you are splitting up your database geographically or just sharding by some hash of tenant ID. So, you want to replicate to the various new shards, then during a maintenance, you promote all the shards to primaries and then delete the unwanted data. You could write a custom SQL that looks like:
SELECT a.id,a.foo,a.bar,a.baz
FROM ONLY examples a
WHERE a.id IN (1,5,9,13....)
And that would be the equivalent of a DELETE plus a pg_repack
but should be much faster.
from pg-osc.
Thats a great idea! I was thinking something like providing a custom sql file like pgbench and that will get replayed.
Do you have an example (even high level is fine). I am curious to understand the before/after state of the table structurally and what might a potential custom backfil query (or set of queries) may look like.
There is also some other interesting stuff we can do by plugging into the triggers and re-using the replay_data
. So, trying to understand where does it makes the most sense.
from pg-osc.
I have a few use cases where this would be super nice, so I am also considering support this before launch (as part of v1.0.0) too.
from pg-osc.
ooh! thats very interesting. Nice, I am planning on prioritizing next week (hopefully), stay tuned.
from pg-osc.
Given that copy_data
happens before alter, this would only be possible for existing columns, which i think is a fair start (?). We can change the order too, but then we would need to ensure that we automatically skip the data being copied for any dropped rows or copy accordingly for renamed rows
Lmk if you have any thoughts.
from pg-osc.
Hrmm, I hadn't realized this. Wouldn't you want the run_alter_statement
to happen before copy_data
? Otherwise, you have to rewrite the entire table if you change a column type with the ALTER. If you do it beorehand, then you can just copy the data in.
from pg-osc.
yeah thats a good point, plus wasteful copy for when column is removed. I am whipping up a PR to alter first, allows for other custom stuff too.
from pg-osc.
BTW, back to the original point: if you are using the custom SQL option, you are already in the danger zone and doing something that will probably destroy some of your data (hopefully on purpose!), so it's ok to have limited functionality.
That said, I think flipping the order for the run_alter_statement
and copy_data
operations is a good idea.
from pg-osc.
Yep, I agree all around. I refactored and updated the logic in so that alter happens first, then copy. Will whip up another PR to support a custom option after this (#32)
from pg-osc.
Related Issues (20)
- Ability to show progress HOT 16
- Views from different schema aren't updated because of error HOT 1
- INSERT's are failing for long table names HOT 2
- Validating foreign key constraints locks table HOT 17
- Scope of this tool HOT 1
- Add a preview/dry-run command
- fails to get view definition if it is from another schema HOT 4
- Does pg-osc is able to insert millions of rows to an existing table? HOT 2
- will support `after` sql syntax ? HOT 1
- How does this will work in a live project. HOT 4
- Unable to perform alterations more than once on a table. HOT 2
- Update Readme reg- docker setup of pg-osc HOT 2
- Feature Request: We want to make it working with the table with trigger HOT 7
- Set the appropriate next sequence value on the new table HOT 11
- 很不错的工具 HOT 1
- Potential need for a `pull-batch-delay` option HOT 4
- Views aren't updated after the swap HOT 7
- Auto-Vacuum Isn't Re-enabled HOT 14
- Using pg-osc only for backfilling data HOT 4
- Copy phase is acquiring a lock on my original table I don't know why HOT 18
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pg-osc.