Migrations take a while to run (see benchmarks), and probably could be collapsed to sa

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Merged <a class="issue-link js-issue-link" data-error-text="Failed to load title" data

I also think we should do <a class="issue-link js-issue-link" data-error-

When we collapse migrations: someone who has already run all t

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

[RFE] Speed up `rake test:vmdb:setup` about manageiq-schema HOT 14 CLOSED

manageiq commented on July 29, 2024

[RFE] Speed up `rake test:vmdb:setup`

from manageiq-schema.

Comments (14)

NickLaMuro commented on July 29, 2024 1

@Fryguy because I apparently like to ignore you, I did throw together a POC PR for handling fixing the db/schema.rb ☝️ 👇 :

#504

If nothing else, it can help generate a collapse migration and prevent errors.

from manageiq-schema.

chessbyte commented on July 29, 2024 1

@kbrock Business goals are being hashed out in this PR

from manageiq-schema.

Fryguy commented on July 29, 2024 1

Merged #504 - I also think we should do #506

from manageiq-schema.

NickLaMuro commented on July 29, 2024 1

I also think we should do #506

Throwing my hat into the ring as well to say I agree. My proposal above makes it seem like the two approaches above are mutually exclusive, which they are not, so I am +1 for doing both.

from manageiq-schema.

NickLaMuro commented on July 29, 2024

So I spent the better part of today looking into how we could fix the schema (mostly trying to understand the ActiveRecord internals, and what how we have implemented the migrations around metrics. In short, I have a POC branch for how we could start to approach this:

master...NickLaMuro:fix-db-schema

As mentioned in the commit, this doesn't really create a valid schema, just allows the generated schema to run with rake db:schema:load. That said, this is a pattern lifted from projects that we probably could leverage (or at least use a inspiration):

(Note: The latter currently isn't heavily maintained at the moment)

But this leaves needing support:

metric* table "id" column CONSTRAINTs (ALTER TABLE calls currently)
metric* table inheritance (ALTER TABLE calls currently)
metric* TRIGGERs (hair_trigger might be able to support these)
metrics DB views (scenic should be able to support these)

I don't think it is worth me investing to much more time into this idea without some input and buy in from others, but scenic seems to have a solid pattern for this, where they prepend the overrides to the ActiveRecord::SchemaDumper:

https://github.com/rails/rails/blob/65c6f7030067b42e3c82c81d2424590ed61de29c/activerecord/lib/active_record/schema_dumper.rb#L102-L115

to also include their create_view definitions:

https://github.com/scenic-views/scenic/blob/main/lib/scenic/schema_dumper.rb#L6-L20
https://github.com/scenic-views/scenic/blob/main/lib/scenic/view.rb#L43-L52

Which is pretty much what I followed for my branch for the dumper. For the others, some additions probably will be need to the adapter like is done in scenic for create_view:

https://github.com/scenic-views/scenic/blob/048e08057e0bd76a700e7759344000cdc4e78235/lib/scenic/statements.rb#L25-L48

(and friends: drop_view, etc.)

from manageiq-schema.

kbrock commented on July 29, 2024

When we collapse migrations:

someone who has already run all the collapsed migrations is fine
someone starting from scratch is fine (and faster)
someone who has only run some of the migrations needs to install a version after the collapsed migrations and before the version that has the migrations collapsed

This works because in the older version of the code, the migrations are not collapsed yet.

Do we have a version of code that is from long time ago that we feel is reasonable to ask to upgrade to an interim version first

As for the invalid schema, I think we just use the code that did the migration in the first place. this is non trivial in code base like ours.

agree: Sure wish schema:load worked
If I remember correctly, we also have issues because the schema is alphabetical, and the child tables are before the parent table

from manageiq-schema.

NickLaMuro commented on July 29, 2024

If I remember correctly, we also have issues because the schema is alphabetical, and the child tables are before the parent table

No, I think it is more that we do a lot of custom stuff in these two migrations (one being the "collapse" one):

https://github.com/ManageIQ/manageiq-schema/blob/62d91b62/db/migrate/20130923182042_collapsed_initial_migration.rb#L963-L970
https://github.com/ManageIQ/manageiq-schema/blob/62d91b62/db/migrate/20190122213042_use_views_for_metrics.rb

And both of those lean heavily on the custom additions and helper here:

https://github.com/ManageIQ/manageiq-schema/blob/62d91b62/lib/migration_helper.rb

So when the db/schema.rb is written, it doesn't know how to support those, so those triggers/views/alterations are left on the floor.

Update/Additional info:

When we collapse migrations:

Just to clarify: I realized this after digging heavily into the ActiveRecord codebase, but db/schema.rb is also a "collapsed migration", and the ActiveRecord::Schema actually inherits from ActiveRecord::Migration (it is a little more complicated then that, but it is mostly the case).

So we already have a baked in way in rails for collapsing migrations, we just don't/can't use it because of lib/migration_helper.rb (triggers, views, etc.).

from manageiq-schema.

kbrock commented on July 29, 2024

User Experience

There are technical hurdles to consolidating the various migrations. And our result will be similar to our current schema.rb

I think we also need to take into account the user, developer, and support person's experience when upgrading to more recent versions of the product.

The developers and users will need to run this code many times more. (which again is why consolidation is important to do, but also why it is important to not make it too hard for us, users, and support to upgrade versions.

product version f before the great consolidation

migration	version
1	a
2	a
3	b
4	c
5	c
6	d
7	e
8	f

versions can have none or many migrations in them.
The migration number is actually a date stamp.

product version a

migration	version
1	a
2	a

product version c

migration	version
1	a
2	a
3	b
4	c
5	c

product version f after

It collapsed migrations from version a, b, and c

migration	version
1'{1-5}	f
5'{guard}	f
6	d
7	e
8	f

1' and 5' have the same exact version number as before but are rewritten.
1' contains the original 1-5
2,3,4 no longer exist
5' contains code that checks that the original 4 has already been run. if it has been run then nothing happens, but if it has not then the user is told to first upgrade to c,d, or e before upgrading to this version.

Users

user running a has run migrations 1,2
user running c has run migration 1,2,3,4,5
user running d has run migrations 1,2,3,4,5,6
new user has not run any migrations

Scenarios

new user installs f and runs migrations 1',5',6,7,8.
user upgrading from d to f runs migrations 7,8
user upgrading from c to f runs migrations 6,7,8
user upgrading from a to f. wants to run 5,6,7,8 but 5 detects that 4 has not been run and tells the user to upgrade to version c,d, or e first.
user upgrading from a to d runs migrations 4,5,6 and upgrades from d to f and runs migrations 7,8
user upgrading from b to e runs migrations 4,5,6,7 and upgrades from e to f and runs migration 8

Why is user a so complicated?

This is setup this way because users perviously running a and b, won't know how to run half of the consolidated migration.

The good news is that versions c,d, and e have the non-consolidated migrations (1-5), and can get the user past the consolidated migration and at a point that the future versions know how to migrate.

Take away

We can support all users, but we do inconvenience users running very old versions of the code, namely versions that have been consolidated.

While we do want to minimize the situation, the request to go through an interim versions is reasonable. We have done this in the past as well as other products in the wild.

from manageiq-schema.

Fryguy commented on July 29, 2024

I had started a collapse initial migration but it had to wait until jansa was released, so I could get a hard commit on which it could be done. It's actually relatively easy this time around.

from manageiq-schema.

Fryguy commented on July 29, 2024

@NickLaMuro I would say hold off on any specific fixes, because a collapse saves a huge amount of time.

from manageiq-schema.

kbrock commented on July 29, 2024

Do we have a version where we want the cutoff?

I would really like the old metrics code out of here. Not sure if that meshes with our business goals

from manageiq-schema.

Fryguy commented on July 29, 2024

#584 was merged (instead of #506)

from manageiq-schema.

Fryguy commented on July 29, 2024

@NickLaMuro Is this basically done? We also got ManageIQ/manageiq#21259 and ManageIQ/pg-logical_replication#6 since this was opened which improved performance a lot.

from manageiq-schema.

NickLaMuro commented on July 29, 2024

@Fryguy well, we could also update ManageIQ/manageiq to make use of rake db:schema:load, which would be even faster. That would really only fix dev setups, where the generated db/schema.rb already exist.

However, we could also generate the schema.rb in this repo, then update/enhance test:vmdb:setup to use that, and apply a db:migrate after to catch anything else. We would probably need a process in place for handling the schema.rb, and probably one of the two would work:

Devs are in charge of updating it with any schema-changing migrations, and just have to resolve conflicts when a PR is merged before theirs.
Maybe every sprint, we update the db/schema.rb in this repo so that it is up to date, so developers don't have to worry about this step.

Each have their pros and cons, though I would argue the first is simpler and the devs on the team can just deal with a minor conflict here and there that is easily resolved by just doing rm -rf db/schema.rb and re-running db:migrate.

That said, do #504 and ~~#506~~ #584 now speed up rake test:vmdb:setup? Yes.

The benchmarks above show it could be quicker, but up to the group on if we want to invest a bit more time getting a bit more of a speed up in CI/dev.

from manageiq-schema.

[RFE] Speed up `rake test:vmdb:setup` about manageiq-schema HOT 14 CLOSED

Comments (14)

User Experience

product version f before the great consolidation

product version a

product version c

product version f after

Users

Scenarios

Why is user a so complicated?

Take away

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent