Giter Site home page Giter Site logo

Comments (6)

tracyhenry avatar tracyhenry commented on June 11, 2024

An interesting note:

I was trying to increase maintenance_work_mem to speed up CREATE INDEX, but the performance of CREATE INDEX got even worse, and varied a lot.
After some investigation, I found that it was AUTO VACUUM workers who were contending the memory with CREATE INDEX. From this blog post:

"...Note that the system will run as many autovacuum workers at a time as specified by autovacuum_max_workers, and each one will allocate an amount of memory equal to maintenance_work_mem. Currently, it always allocates the full amount; hopefully, that will be fixed in the future. So, if you have the default setting of autovacuum_max_workers = 3, and you set a value like maintenance_work_mem = 10GB, you can and very often will have 30GB of RAM dedicated just to autovacuum, not counting any foreground VACUUM or CREATE INDEX operations that you may run. You can very easily run a small system out of memory this way, and even a large system may have some performance problems...."

In fact, it's this UPDATE query that triggered auto vacuum. And we don't need it: we can just put it in the big SELECT query together with other attributes. After doing that, the performance of CREATE INDEX got more normal and stable.

from kyrix.

peterg17 avatar peterg17 commented on June 11, 2024

Cool, thanks for sharing this! I definitely did not know that. How did you know that the UPDATE query triggers auto vacuum? And did you change both the maintenance_work_mem to be smaller (~1 GB) as the blog post suggests as well as combining the update query with the select query?

from kyrix.

tracyhenry avatar tracyhenry commented on June 11, 2024

I knew that UPDATE triggered autovaccum from this blog post. I also empirically tested: with UPDATE there were lots of autovacuum workers but none when UPDATE is gone.

The blog post suggested a low maintenance_work_mem because of the behavior of autovacuum. Now that we don't have this issue, I think we can set it to a higher value. I set it to 10GB, but I haven't tested what the performance would be with smaller or even bigger values.

from kyrix.

tracyhenry avatar tracyhenry commented on June 11, 2024

@peterg17 sorry, didn't see the latter part of your message.

Yes I did, 2GB. The performance still varied on a 100M dots dataset:
CREATE INDEX took 2301 secs on the top canvas, but 3768 secs on the bottom canvas.

not exactly sure why.

from kyrix.

peterg17 avatar peterg17 commented on June 11, 2024

Ok cool, and I assume you just ran perf numbers for the native box indexer? We can apply the same principle of using a lower maintenance_work_mem than before for the native cube indexer too, but i don't think there is a update query in the native cube indexer. Something that's interesting is that I was using 20GB as the value for maintenance_work_mem, but with 3 autovacuum workers that's 60 GB of total memory being used. But the instance only has 26 GB of RAM. So, how does Postgres interpret this impossible number?

from kyrix.

tracyhenry avatar tracyhenry commented on June 11, 2024

Yes I was running 2D, as well as my WIP partition indexer.

I don't really know how the allocation of memory happens if there is not enough. I haven't run that kind of experiment yet. My instance happened to have 30G memory...

from kyrix.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.