[7.X] Resgroup's MEM usage limitation didn't take effect (greenplum-db/gpdb)

Comments (7)

kainwen commented on June 13, 2024

GP7's memory model is almost the same as the resource queue's.

To understand this memory model, you need to understand:

query_mem
operator_mem

GP7's memory limit is used to compute a value for the SQL statement's query_mem. query_mem is not a hard memory limit.

FYIsunny commented on June 13, 2024

Thank you for your response.
After reviewing the resource queue documentation for Greenplum 7, I did not find any information about query_mem and operator_mem.
It seems that my question about "the memory limits of resource groups in Greenplum 7 not taking effect" has not been answered. Could you please provide more specific details in your response? Thank you very much.

kainwen commented on June 13, 2024

@FYIsunny Hi, I have a thread on the memory model: https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/h25aCqHifuQ/m/2BxxzxfMBgAJ
Could you please read the memory part and see whether it helps you understand?

If you still have questions, let's continue the discussion here.

FYIsunny commented on June 13, 2024

@kainwen
Thank you for your response. Could you please take a look at the following questions:

1. The default value of gp_resource_group_bypass_direct_dispatch for a resource group is true, which bypasses the resource group's limits. This raises the question of why the default isn't false. Or is it that setting it to false only takes effect at the session level?

2. Which one do you recommend using: "select * from gp_toolkit.gp_resgroup_status_per_segment;" or "SELECT * FROM gp_toolkit.gp_resgroup_status_per_host;" to obtain more precise and real-time statistics of segment memory usage?

3. Can I calculate the real-time memory usage of each segment by dividing the memory_usage from gp_resgroup_status_per_host by the number of primary segments?

4. I need to confirm again, in the statement "The parameter MEMORY_LIMIT of a resource group sets the maximum amount of memory reserved for this resource group on a segment", when creating a resource group, does the MEMORY_LIMIT parameter refer to the memory limit on the segment host, or the memory of a single primary segment on the segment host?

5. When creating a new resource group, what is the recommended maximum value for MEMORY_LIMIT? For example, there is a calculation example for resource queues. Could you also provide an example for a resource group? The example for a resource queue is: "Set MEMORY_LIMIT to 90% of memory available on a per-segment basis. For example, if a host has 48 GB of physical memory and 6 segment instances, then the memory available per segment instance is 8 GB. You can calculate the recommended MEMORY_LIMIT for a single queue as 0.90*8=7.2 GB. If there are multiple queues created on the system, their total memory limits must also add up to 7.2 GB."

6. If multiple resource groups have allocated all available memory, would there be a situation of memory resource contention when all resource groups are actively utilizing their allocated memory?

7. If one instance experiences an exception, would the memory resource limit for another instance increase?

kainwen commented on June 13, 2024

1. The default value of gp_resource_group_bypass_direct_dispatch for a resource group is true, which bypasses the resource group's limits. This raises the question of why the default isn't false. Or is it that setting it to false only takes effect at the session level?

The GUC can be set at the session level or the cluster level; users can choose for themselves. I don't see any problem with bypassing simple direct-dispatch SQL.

2. Which one do you recommend using: "select * from gp_toolkit.gp_resgroup_status_per_segment;" or "SELECT * FROM gp_toolkit.gp_resgroup_status_per_host;" to obtain more precise and real-time statistics of segment memory usage?

Greenplum 7's resgroup does not use cgroup to control memory, but GPDB puts each QE into the cgroup's memory directory so that we can read the cgroup's memory statistics. A host might contain several segments, but a host has only one cgroup directory.

gp_toolkit.gp_resgroup_status_per_segment is read from each segment's shared memory; it reports the MemoryContext memory used by GPDB.

Both views can be useful.

3. Can I calculate the real-time memory usage of each segment by dividing the memory_usage from gp_resgroup_status_per_host by the number of primary segments?

No.

4. I need to confirm again, in the statement "The parameter MEMORY_LIMIT of a resource group sets the maximum amount of memory reserved for this resource group on a segment", when creating a resource group, does the MEMORY_LIMIT parameter refer to the memory limit on the segment host, or the memory of a single primary segment on the segment host?

Again, to understand the memory model, you first need to read the gpdb-dev link I posted above and make sure you understand statement_mem, operator memory, and query_mem.

Greenplum 7's resgroup MEMORY_LIMIT is a way to compute the runtime query_mem. The runtime query_mem controls when the SQL starts spilling and, to some degree, controls the memory a query can use on a single segment. Some memory usage falls outside this bound.

5. When creating a new resource group, what is the recommended maximum value for MEMORY_LIMIT? For example, there is a calculation example for resource queues. Could you also provide an example for a resource group? The example for a resource queue is: "Set MEMORY_LIMIT to 90% of memory available on a per-segment basis. For example, if a host has 48 GB of physical memory and 6 segment instances, then the memory available per segment instance is 8 GB. You can calculate the recommended MEMORY_LIMIT for a single queue as 0.90*8=7.2 GB. If there are multiple queues created on the system, their total memory limits must also add up to 7.2 GB."

I think you can just use the resqueue model. GP7's resgroup model is almost the same as resqueue's.
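
The arithmetic in the quoted resource queue example can be written out as a short sketch. This only reproduces the numbers from that quote (48 GB host, 6 segments, 90%); the function name `recommended_memory_limit_gb` and its parameters are hypothetical, and whether the same figure suits a given resource group setup is an assumption based on the answer above, not a documented recommendation.

```python
def recommended_memory_limit_gb(host_memory_gb: float,
                                segments_per_host: int,
                                fraction: float = 0.90) -> float:
    """Apply the quoted resource queue rule: a fraction of per-segment memory."""
    if segments_per_host <= 0:
        raise ValueError("segments_per_host must be positive")
    # Memory available to one segment instance on the host.
    per_segment_gb = host_memory_gb / segments_per_host
    # Total MEMORY_LIMIT budget to be shared by all queues (or groups).
    return fraction * per_segment_gb


# Numbers taken from the quoted example: 48 GB host, 6 segment instances.
limit_gb = recommended_memory_limit_gb(48, 6)
print(f"{limit_gb:.1f} GB")
```

If several groups are created, the quoted rule implies their limits together should stay within this budget.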

6. If multiple resource groups have allocated all available memory, would there be a situation of memory resource contention when all resource groups are actively utilizing their allocated memory?

I don't understand your question.

7. If one instance experiences an exception, would the memory resource limit for another instance increase?

I don't understand this question. I suppose you are asking about the case where a segment goes down and GPDB promotes its mirror to primary. In that case it is still a normally working cluster; the segments are just not balanced across hosts. Some hosts might have higher memory usage because more segments run on them. Users should recover the segment and rebalance soon.

FYIsunny commented on June 13, 2024

I really appreciate your patient response, and I'm also interested in learning more about the gp7 database.

There are two parts that I didn't quite understand:
1. The first sentence: "GP7's memory limit is used to compute a value for SQL's query_mem."
2. The second sentence: "The concept of query_mem. PlannedStmt struct has a field query_mem, which is a concept to measure how much memory this SQL can use on a single segment. Based on this value, Greenplum will walk the plan tree to assign operator memory for each operator (just a new name for the plan node here)."

My questions:
1. What is the specific relationship between the memory_limit parameter and query_mem when creating shared resources? For example, under the same conditions and using the same query, if the memory_limit parameter is set to 1MB or 2MB, will the value of query_mem change? What is the logic behind this variation?
2. Operator memory (for the plan node) is based on query_mem. What is the specific relationship or calculation between them?
3. The memory_usage column in the SELECT * FROM gp_toolkit.gp_resgroup_status_per_host view, I noticed that even if a resource group has not been used for a long time, the memory amount does not become 0. If this column does not reflect the actual real-time memory usage, does it have limited significance?

kainwen commented on June 13, 2024

@FYIsunny

1. What is the specific relationship between the memory_limit parameter and query_mem when creating shared resources? For example, under the same conditions and using the same query, if the memory_limit parameter is set to 1MB or 2MB, will the value of query_mem change? What is the logic behind this variation?

If memory_limit is set to -1, then query_mem is set from the GUC value of statement_mem.
Otherwise, query_mem = memory_limit / nconcurrency, where nconcurrency is the concurrency of this group.
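
The rule above can be sketched as a small function. This is an illustration of the stated rule only, not GPDB source code: the name `compute_query_mem`, the parameter names, the MB unit, and the use of integer division are all assumptions.

```python
def compute_query_mem(memory_limit_mb: int,
                      concurrency: int,
                      statement_mem_mb: int) -> int:
    """Per-query memory budget on one segment, following the rule above."""
    if memory_limit_mb == -1:
        # No group-level limit: fall back to the statement_mem GUC value.
        return statement_mem_mb
    if concurrency <= 0:
        raise ValueError("concurrency must be positive")
    # Split the group's limit evenly across its concurrency slots
    # (integer division is an assumption of this sketch).
    return memory_limit_mb // concurrency


# Doubling memory_limit doubles query_mem when everything else is unchanged.
print(compute_query_mem(1000, 20, 125))
print(compute_query_mem(2000, 20, 125))
print(compute_query_mem(-1, 20, 125))
```

Under this rule, changing memory_limit for the same query and the same concurrency does change query_mem, which in turn changes when operators start spilling.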

2. Operator memory (for the plan node) is based on query_mem. What is the specific relationship or calculation between them?

GPDB divides operators (plan nodes) into two kinds:

memory-intensive (Hash, HashAgg, Sort, Materialize, ...)
memory non-intensive (SeqScan, Motion, ...)

A memory-intensive operator needs to keep some tuples in memory and implements the logic for spilling to disk.

For non-intensive nodes, there is a GUC (100 KB by default) that sets their operator memory (users normally don't change it, but they can).

There are two algorithms for assigning memory to memory-intensive operators:

the simple one is called auto
the complex one (the default) is called eager free

I will only talk about auto here: after setting memory for the non-intensive operators, see how much query_mem is left, and then divide it equally among the intensive operators.
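
A minimal sketch of the auto policy as described above, assuming operators are identified by plain name strings and memory is counted in KB. The function `assign_operator_mem_auto`, the `MEMORY_INTENSIVE` set, and the clamp to zero when the budget runs out are hypothetical choices made for illustration; the real planner code and the eager free policy are not modeled here.

```python
# Operator kinds named as memory-intensive in the explanation above.
MEMORY_INTENSIVE = {"Hash", "HashAgg", "Sort", "Materialize"}


def assign_operator_mem_auto(operators: list[str],
                             query_mem_kb: int,
                             non_intensive_kb: int = 100) -> list[int]:
    """Return the memory (KB) assigned to each operator, in input order."""
    n_intensive = sum(1 for op in operators if op in MEMORY_INTENSIVE)
    n_light = len(operators) - n_intensive
    # Step 1: every non-intensive operator gets the fixed GUC amount.
    remaining = query_mem_kb - n_light * non_intensive_kb
    # Assumption of this sketch: never hand out a negative budget.
    remaining = max(remaining, 0)
    # Step 2: the rest of query_mem is shared equally by intensive operators.
    per_intensive = remaining // n_intensive if n_intensive else 0
    return [per_intensive if op in MEMORY_INTENSIVE else non_intensive_kb
            for op in operators]


plan = ["SeqScan", "Hash", "SeqScan", "Sort", "Motion"]
for op, mem in zip(plan, assign_operator_mem_auto(plan, 128000)):
    print(op, mem)
```

With three non-intensive operators and two intensive ones, 300 KB goes to the former and the remaining budget is halved between Hash and Sort.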

3. The memory_usage column in the SELECT * FROM gp_toolkit.gp_resgroup_status_per_host view, I noticed that even if a resource group has not been used for a long time, the memory amount does not become 0. If this column does not reflect the actual real-time memory usage, does it have limited significance?

Can you elaborate? What is the resgroup's id, and what are the structure and contents of your Linux cgroup directory?
