Comments (7)
GP7's memory model is almost the same as the resource queue's.
To understand this memory model, you need to understand:
- query_mem
- operator_mem
GP7's memory limit is used to compute a value for a SQL statement's query_mem. query_mem is not a hard memory limit.
from gpdb.
Thank you for your response.
After reviewing the resource queue documentation for Greenplum 7, I did not find any information about query_mem and operator_mem.
It seems that my question about "the memory limits of resource groups in Greenplum 7 not taking effect" has not been answered. Could you please provide more specific details in your response? Thank you very much.
@FYIsunny Hi, I have a thread on the memory model: https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/h25aCqHifuQ/m/2BxxzxfMBgAJ
Would you please read the memory part and see whether it helps you understand?
If you still have questions, let's continue the discussion here.
@kainwen
Thank you for your response. Could you please take a look at the following questions:
1. The default value of gp_resource_group_bypass_direct_dispatch for a resource group is true, which bypasses the resource group's limits. This raises the question of why the default isn't false. Or is it that setting it to false only takes effect at the session level?
2. Which one do you recommend using: "select * from gp_toolkit.gp_resgroup_status_per_segment;" or "SELECT * FROM gp_toolkit.gp_resgroup_status_per_host;" to obtain more precise and real-time statistics of segment memory usage?
3. Can I calculate the real-time memory usage of each segment by dividing the memory_usage from gp_resgroup_status_per_host by the number of primary segments?
4. I need to confirm again, in the statement "The parameter MEMORY_LIMIT of a resource group sets the maximum amount of memory reserved for this resource group on a segment", when creating a resource group, does the MEMORY_LIMIT parameter refer to the memory limit on the segment host, or the memory of a single primary segment on the segment host?
5. When creating a new resource group, what is the recommended maximum value for MEMORY_LIMIT? For example, there is a calculation example for resource queues. Could you also provide an example for a resource group? The example for a resource queue is: "Set MEMORY_LIMIT to 90% of memory available on a per-segment basis. For example, if a host has 48 GB of physical memory and 6 segment instances, then the memory available per segment instance is 8 GB. You can calculate the recommended MEMORY_LIMIT for a single queue as 0.90*8=7.2 GB. If there are multiple queues created on the system, their total memory limits must also add up to 7.2 GB."
6. If multiple resource groups have allocated all available memory, would there be a situation of memory resource contention when all resource groups are actively utilizing their allocated memory?
7. If one instance experiences an exception, would the memory resource limit for another instance increase?
1. The default value of gp_resource_group_bypass_direct_dispatch for a resource group is true, which bypasses the resource group's limits. This raises the question of why the default isn't false. Or is it that setting it to false only takes effect at the session level?
The GUC can be set at the session level or the cluster level; users can choose for themselves. I don't see any problem with bypassing simple direct-dispatch SQL.
2. Which one do you recommend using: "select * from gp_toolkit.gp_resgroup_status_per_segment;" or "SELECT * FROM gp_toolkit.gp_resgroup_status_per_host;" to obtain more precise and real-time statistics of segment memory usage?
Greenplum 7's resgroup does not use cgroup to control memory, but GPDB puts each QE into the cgroup's memory directory so that we can read the cgroup's memory stats. A host might contain several segments, but a host has only one cgroup directory.
gp_toolkit.gp_resgroup_status_per_segment is read from each segment's shared memory; it is the memory-context memory used by GPDB.
Both views can be useful.
3. Can I calculate the real-time memory usage of each segment by dividing the memory_usage from gp_resgroup_status_per_host by the number of primary segments?
No.
4. I need to confirm again, in the statement "The parameter MEMORY_LIMIT of a resource group sets the maximum amount of memory reserved for this resource group on a segment", when creating a resource group, does the MEMORY_LIMIT parameter refer to the memory limit on the segment host, or the memory of a single primary segment on the segment host?
Again, to understand the memory model, you first need to read the gpdb-dev link I posted above and make sure you understand statement_mem, operator_mem, and query_mem.
Greenplum 7's resgroup MEMORY_LIMIT is a way to compute the runtime query_mem. The runtime query_mem controls when the SQL starts spilling and, to some degree, controls the memory a query can use on a single segment. Some memory usage falls outside this bound.
5. When creating a new resource group, what is the recommended maximum value for MEMORY_LIMIT? For example, there is a calculation example for resource queues. Could you also provide an example for a resource group? The example for a resource queue is: "Set MEMORY_LIMIT to 90% of memory available on a per-segment basis. For example, if a host has 48 GB of physical memory and 6 segment instances, then the memory available per segment instance is 8 GB. You can calculate the recommended MEMORY_LIMIT for a single queue as 0.90*8=7.2 GB. If there are multiple queues created on the system, their total memory limits must also add up to 7.2 GB."
I think you can just use resqueue's model. GP7's resgroup model is almost the same as resqueue's.
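For reference, the arithmetic in the quoted resource queue example can be written out as a small Python sketch. The function name and the `fraction` parameter are made up for illustration, and whether the same sizing carries over unchanged to resource groups is only the suggestion made above, not something this sketch verifies.

```python
def recommended_memory_limit_gb(host_memory_gb, segments_per_host, fraction=0.90):
    """Reproduce the quoted resource queue sizing example.

    Hypothetical helper: splits host memory evenly across segment
    instances and keeps `fraction` of each share as the limit.
    """
    per_segment_gb = host_memory_gb / segments_per_host
    return fraction * per_segment_gb


# The quoted example: 48 GB host, 6 segment instances -> 8 GB per segment,
# and 0.90 * 8 GB as the recommended limit.
print(round(recommended_memory_limit_gb(48, 6), 2))
```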
6. If multiple resource groups have allocated all available memory, would there be a situation of memory resource contention when all resource groups are actively utilizing their allocated memory?
I don't understand your question.
7. If one instance experiences an exception, would the memory resource limit for another instance increase?
I don't understand this question. I suppose you are asking about the case where a segment is down and GPDB promotes its mirror to primary. The cluster then works normally, except that the segments are not balanced across hosts. Some hosts might see high memory usage because they have more segments on them. Users should recover the segment and rebalance soon.
I really appreciate your patient response, and I'm also interested in learning more about the gp7 database.
There are two parts that I didn't quite understand:
1. The first sentence: "GP7's memory limit is used to compute a value for SQL's query_mem."
2. The second sentence: "The concept of query_mem. PlannedStmt struct has a field query_mem, which is a concept to measure how much memory this SQL can use on a single segment. Based on this value, Greenplum will walk the plan tree to assign operator memory for each operator (just a new name for the plan node here)."
My questions:
1. What is the specific relationship between the memory_limit parameter and query_mem when creating shared resources? For example, under the same conditions and using the same query, if the memory_limit parameter is set to 1MB or 2MB, will the value of query_mem change? What is the logic behind this variation?
2. Operator memory (for the plan node) is based on query_mem. What is the specific relationship or calculation between them?
3. The memory_usage column in the SELECT * FROM gp_toolkit.gp_resgroup_status_per_host view, I noticed that even if a resource group has not been used for a long time, the memory amount does not become 0. If this column does not reflect the actual real-time memory usage, does it have limited significance?
1. What is the specific relationship between the memory_limit parameter and query_mem when creating shared resources? For example, under the same conditions and using the same query, if the memory_limit parameter is set to 1MB or 2MB, will the value of query_mem change? What is the logic behind this variation?
- If memory_limit is set to -1, then query_mem will be set using the GUC value of statement_mem.
- Otherwise, query_mem = memory_limit / nconcurrency for this group.
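The two cases above can be sketched in Python as follows. This is only an illustration of the rule as stated in this comment: the function name, the choice of MB for memory_limit and KB for the result, and the sample statement_mem value are assumptions, and the sketch ignores any floors, caps, or other GUCs the server may apply.

```python
def runtime_query_mem_kb(memory_limit_mb, concurrency, statement_mem_kb):
    """Per-segment query_mem in KB, following the two cases described above.

    Hypothetical helper; units are assumed (memory_limit in MB,
    statement_mem and the result in KB).
    """
    if memory_limit_mb == -1:
        # No group-level limit: fall back to the statement_mem GUC value.
        return statement_mem_kb
    # Otherwise split the group's memory_limit across its concurrency slots.
    return memory_limit_mb * 1024 // concurrency


# Same query, same group concurrency, memory_limit of 1 MB vs 2 MB:
for limit_mb in (1, 2):
    print(limit_mb, runtime_query_mem_kb(limit_mb, concurrency=4, statement_mem_kb=128000))
```

Under these assumptions, doubling memory_limit doubles query_mem for a fixed concurrency, which is the variation the question asks about.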
2. Operator memory (for the plan node) is based on query_mem. What is the specific relationship or calculation between them?
GPDB divides operators (plan nodes) into two kinds:
- memory-intensive (Hash, HashAgg, Sort, Materialize, ...)
- memory non-intensive (SeqScan, Motion, ...)
A memory-intensive operator needs to keep some tuples in memory and implements the logic for spilling to disk.
For non-intensive nodes, there is a GUC (100 KB by default) that sets their operator memory (users normally don't change it, but they can).
Then there are two algorithms for setting the memory of memory-intensive operators:
- the simple one is called auto
- the complex one (the default) is called eager free
I will only talk about auto here: after setting memory for the non-intensive operators, see how much query_mem is left, and then divide it equally among the intensive operators.
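A minimal Python sketch of the auto policy as described here, not the actual GPDB implementation. The function name, the plan representation as (name, is_memory_intensive) pairs, and the sample plan are all hypothetical; the 100 KB figure is the default mentioned above.

```python
def assign_operator_mem_auto(operators, query_mem_kb, non_intensive_kb=100):
    """Assign operator memory (KB) to each plan node under the 'auto' policy.

    operators: list of (name, is_memory_intensive) pairs (hypothetical format).
    Non-intensive nodes get a fixed amount; what is left of query_mem is
    divided equally among the memory-intensive nodes.
    """
    n_light = sum(1 for _, intensive in operators if not intensive)
    n_heavy = len(operators) - n_light
    remaining_kb = query_mem_kb - non_intensive_kb * n_light
    if n_heavy and remaining_kb <= 0:
        raise ValueError("query_mem too small for this plan")
    share_kb = remaining_kb // n_heavy if n_heavy else 0
    return [
        (name, share_kb if intensive else non_intensive_kb)
        for name, intensive in operators
    ]


# Hypothetical plan: two non-intensive and two memory-intensive nodes.
plan = [("SeqScan", False), ("Motion", False), ("Hash", True), ("HashAgg", True)]
print(assign_operator_mem_auto(plan, query_mem_kb=2048))
# [('SeqScan', 100), ('Motion', 100), ('Hash', 924), ('HashAgg', 924)]
```

With a query_mem of 2048 KB, the two non-intensive nodes take 200 KB in total, and the remaining 1848 KB is split evenly between Hash and HashAgg.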
3. The memory_usage column in the SELECT * FROM gp_toolkit.gp_resgroup_status_per_host view, I noticed that even if a resource group has not been used for a long time, the memory amount does not become 0. If this column does not reflect the actual real-time memory usage, does it have limited significance?
Can you elaborate? What is the resgroup's id, and what are the structure and contents of your Linux cgroup directory?