This document outlines how to identify performance bottlenecks and describes typical performance tuning scenarios both inside and outside the YMatrix database.
When a business SQL query becomes slow or you want to improve the performance of a SQL query, first confirm the general direction of the issue:
Is the entire server system slowing down:
Or is only a specific SQL query slowing down:
Once a slow SQL query is identified, follow the steps below for investigation and analysis:
Note!
There are various methods to collect information. Here we detail only one or two. If you have other preferred commands, you can use them for collection and analysis.
Category | Information | Collection | Analysis |
---|---|---|---|
Server Resources | CPU Usage | Via YMatrix GUI - Cluster Management - Metrics View: • CPU usage per node Or use the top command to collect detailed CPU info:• User CPU (us) • System CPU (sy) • Idle CPU (id) |
• When CPU usage is high, determine whether it is user CPU or system CPU • If user CPU is high, it indicates high CPU usage by a program. Check the efficiency of the code execution • If system CPU is high, check whether other server resources (disk I/O, memory, network, etc.) are insufficient |
Memory & Virtual Memory | Via YMatrix GUI - Cluster Management - Metrics View: • Memory usage per node Or use vmstat to view memory details:• si (memory swapped in per second) • so (memory swapped out per second) |
If si and so are consistently non-zero, it indicates insufficient memory and heavy use of virtual memory, leading to performance degradation |
|
Disk I/O (Read/Write Rate) | Via YMatrix GUI - Cluster Management - Metrics View: • Disk I/O per node Or use iostat -x to view disk I/O details:• %util (percentage of time spent doing I/O) • %iowait (percentage of CPU time waiting for I/O) |
• Disk I/O speed, not size, affects performance. If %util is near 100%, it means too many I/O requests and the system is overloaded • If %iowait is high, it indicates a disk I/O bottleneck. Consider upgrading or replacing the disk array |
|
Network | Via YMatrix GUI - Cluster Management - Metrics View: • Network receive/send rate per node Or use sar -n DEV 1 2 to view network details:• rxkb/s (received data per second in KB) • txkb/s (transmitted data per second in KB) |
Compare rxkB/s with total network bandwidth. If it is close to the total bandwidth, a network bottleneck exists | |
Kernel Parameters | On Linux systems, check the corresponding kernel parameter files under /proc/sys , e.g., cat overcommit_memory |
Note!
Not all options in the table below are mandatory.
Category | Information | Collection | Analysis |
---|---|---|---|
Software Environment | OS Version | Use uname -a |
Analyze whether the bottleneck is related to the OS version |
YMatrix Version | Use SELECT version(); |
Analyze whether the bottleneck is related to the YMatrix version | |
Cluster Info | Cluster Deployment Topology | • Via YMatrix GUI - Cluster Management • Or use SELECT * FROM gp_segment_configuration; |
If failover occurs, a single physical node may host more Segments, potentially causing performance degradation |
Database Info | Table Structure | • Via YMatrix GUI - Data Tables • Or use \d+ |
Confirm whether data skew is severe due to unreasonable distribution key settings |
Related Logs | Default log directory: $HOME/gpAdminLogs Database logs are in the data directory |
Analyze logs if needed | |
Slow Queries | Via YMatrix GUI - Query Monitor: check for blocking sessions | If slow queries exist, locate and analyze them | |
Query Plan | Use EXPLAIN SELECT... to view the query plan |
If the query plan cost is too high, analyze the path and root cause | |
Save Environment | Use YMatrix's minirepro tool |
Internal database tuning refers to tuning individual SQL statements.
Symptoms:
A table has undergone significant data changes (e.g., inserts, deletes). You need to run ANALYZE
to collect accurate statistics to avoid incorrect query plans due to outdated stats, which can degrade performance.
Analysis Method:
Use EXPLAIN ANALYZE
to check if statistics are inaccurate. If the row
estimate in the query plan is far off, it indicates skew. Re-run:
=# ANALYZE <tablename>;
Symptoms:
Unreasonable distribution keys during initial data model design cause data to be concentrated on a few Segment instances, leading to skew. This creates a bottleneck in YMatrix's distributed architecture: query time depends on the slowest Segment + Master processing time.
Analysis Method:
Use the following command to check data distribution. If Segment data counts differ significantly, skew exists:
=# SELECT gp_segment_id, COUNT(*) FROM <tablename> GROUP BY gp_segment_id;
Solution:
If skew occurs, consider modifying the distribution key and redistributing data evenly:
=# ALTER TABLE <tablename> SET DISTRIBUTED BY (<newcolumn>);
=# ALTER TABLE <tablename> SET WITH (REORGANIZE=true);
=# ANALYZE <tablename>;
Attention!
Distribution key selection affects data distribution and query performance. Modifying it after business launch carries performance risks. Choose carefully during design. Refer to YMatrix DDL Best Practices.
Symptoms:
Frequent updates/deletes without a proper cleanup cycle cause table bloat.
Solutions:
Drop historical (cold) partition subtables:
=# DROP TABLE <partition_tablename>;
Set up a reasonable cleanup cycle:
=# VACUUM <tablename>;
See Daily Cleanup.
For severely bloated tables, reorganize:
=# ALTER TABLE <tablename> SET WITH (REORGANIZE=true);
=# ANALYZE <tablename>;
If none of the above causes the SQL performance bottleneck, analyze the query plan and ask:
Does a specific operation take too long?
enable_<operator>
parameters to force the optimizer to choose a different plan.Are selective predicates applied early?
WHERE
clause.Is the join order optimal?
join_collapse_limit=1
and use explicit JOIN
syntax to force the order. Or collect more stats on join columns.Does the optimizer selectively scan partitioned tables?
Are hash aggregations and joins used?
Does enabling/disabling vectorization significantly affect performance?
For more on query plans, see Understanding Query Plans.