Giter Site home page Giter Site logo

Comments (12)

banlex73 avatar banlex73 commented on September 28, 2024

What I found on my powa repository DB
powa_kcache_metrics_current was as big as 133Gb
and SELECT powa_kcache_aggregate(13) threw an error:
ERROR: duplicate key value violates unique constraint "powa_kcache_metrics_pkey" Detail: Key (srvid, coalesce_range, queryid, dbid, userid, top)=(13, ["2021-03-16 12:19:31.162797-07","2021-03-16 16:46:16.570921-07"], -9206572327289564196, 862035, 861727, t) already exists.
PL/pgSQL function powa_kcache_aggregate(integer) line 12 at SQL statement
and my fix was adding ON CONFLICT DO NOTHING;
I don't know if it is acceptable solution

`CREATE OR REPLACE FUNCTION public.powa_kcache_aggregate(_srvid integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
result bool;
v_funcname text := 'powa_kcache_aggregate(' || _srvid || ')';
v_rowcount bigint;
BEGIN
PERFORM powa_log(format('running %I', v_funcname));

PERFORM powa_prevent_concurrent_snapshot(_srvid);

-- aggregate metrics table
INSERT INTO public.powa_kcache_metrics (coalesce_range, srvid, queryid,
                                        top, dbid, userid, metrics,
                                        mins_in_range, maxs_in_range)
    SELECT tstzrange(min((metrics).ts), max((metrics).ts),'[]'),
    srvid, queryid, top, dbid, userid, array_agg(metrics),
    ROW(min((metrics).ts),
        min((metrics).plan_reads), min((metrics).plan_writes),
        min((metrics).plan_user_time), min((metrics).plan_system_time),
        min((metrics).plan_minflts), min((metrics).plan_majflts),
        min((metrics).plan_nswaps),
        min((metrics).plan_msgsnds), min((metrics).plan_msgrcvs),
        min((metrics).plan_nsignals),
        min((metrics).plan_nvcsws), min((metrics).plan_nivcsws),
        min((metrics).exec_reads), min((metrics).exec_writes),
        min((metrics).exec_user_time), min((metrics).exec_system_time),
        min((metrics).exec_minflts), min((metrics).exec_majflts),
        min((metrics).exec_nswaps),
        min((metrics).exec_msgsnds), min((metrics).exec_msgrcvs),
        min((metrics).exec_nsignals),
        min((metrics).exec_nvcsws), min((metrics).exec_nivcsws)
    )::powa_kcache_type,
    ROW(max((metrics).ts),
        max((metrics).plan_reads), max((metrics).plan_writes),
        max((metrics).plan_user_time), max((metrics).plan_system_time),
        max((metrics).plan_minflts), max((metrics).plan_majflts),
        max((metrics).plan_nswaps),
        max((metrics).plan_msgsnds), max((metrics).plan_msgrcvs),
        max((metrics).plan_nsignals),
        max((metrics).plan_nvcsws), max((metrics).plan_nivcsws),
        max((metrics).exec_reads), max((metrics).exec_writes),
        max((metrics).exec_user_time), max((metrics).exec_system_time),
        max((metrics).exec_minflts), max((metrics).exec_majflts),
        max((metrics).exec_nswaps),
        max((metrics).exec_msgsnds), max((metrics).exec_msgrcvs),
        max((metrics).exec_nsignals),
        max((metrics).exec_nvcsws), max((metrics).exec_nivcsws)
    )::powa_kcache_type
    FROM powa_kcache_metrics_current
    WHERE srvid = _srvid
    GROUP BY srvid, queryid, top, dbid, userid
 **ON CONFLICT DO NOTHING;**

`

from powa-archivist.

rjuju avatar rjuju commented on September 28, 2024

Hi,

Do you have some monitoring to see how is the growth factor like? If it's a constant and regular growth then it's likely due to inadequate autovacuum tuning.

For the ON CONFLICT, it's a workaround for a bug. So sure for now you can keep it as is, but it's a sign that there's something wrong going on here, and the root problem should be fixed instead.

That being said I have no clue how that could be happening. It seems that the function is aggregating the same data twice, but that shouldn't be possible because the function removes the data it just aggregated. As the function is atomic, you should either have both aggregated the data and remove the source OR done nothing.

Ideally what should be done is removing the ON CONFLICT cause, wait to see if the error happens again and if it does check if the problematic record exists in powa_kcache_metrics table.

from powa-archivist.

banlex73 avatar banlex73 commented on September 28, 2024

Unfortunately, I don't have any monitoring in place..
I monitor 6 clusters (253 DBs), retention policy looks like:
DBs Retention
4 3 days
4 7 days
245 2 days
Table was bloated, after vacuum full it became: 12 GB (133Gb before)
I use default auto vacuum setting on postgres and nothing special on that table.
To find the root cause, I am going to remove ON CONFLICT and start monitoring it carefully, to understand what causes the issue.

Thank you for support.

from powa-archivist.

banlex73 avatar banlex73 commented on September 28, 2024

Found what is wrong with auto vacuum on my powa repository DB - I have powa_take_snapshot constantly running. auto vacuum doesn't have enough time to complete between snapshots.
powa_snapshot

from powa-archivist.

rjuju avatar rjuju commented on September 28, 2024

What does those values actually mean?

But even if there's always one or multiple powa_take_snapshot() running, it shouldn't prevent autovacuum from working. It only means that you'll have an amount of bloat equal to the maximum number of snapshot being performed between two autovacuum runs. It can be a bit high, but it should stay relatively constant. And I don't think that it should represent 120GB.

Note also that if you have pg_qualstats setup and a lot of databases and/or users and/or different normalized queries, the aggregation can be quite expensive. We have a fix for that (682b753) but it's not released yet. If you have pg_stat_statements enabled (and maybe a local powa) on your repository server you could check if you could benefit from this patch.

from powa-archivist.

banlex73 avatar banlex73 commented on September 28, 2024

from powa-archivist.

rjuju avatar rjuju commented on September 28, 2024

Hi @banlex73, any news on this issue?

from powa-archivist.

banlex73 avatar banlex73 commented on September 28, 2024

from powa-archivist.

rjuju avatar rjuju commented on September 28, 2024

Ok, thanks!

from powa-archivist.

banlex73 avatar banlex73 commented on September 28, 2024

All good now, thank you

from powa-archivist.

banlex73 avatar banlex73 commented on September 28, 2024

All good now, thank you

from powa-archivist.

rjuju avatar rjuju commented on September 28, 2024

Thanks a lot for the confirmation!

from powa-archivist.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.