Usage Guide

YMatrix supports runaway query detection. For queries managed by resource groups, YMatrix can automatically terminate queries based on their memory usage.

The relevant configuration parameters are as follows:

  • gp_vmem_protect_limit: Sets the total amount of memory that all postgres processes in an active segment instance can consume. If a query causes this limit to be exceeded, no additional memory will be allocated, and the query will fail.

  • runaway_detector_activation_percent: When resource groups are enabled, if memory usage exceeds gp_vmem_protect_limit multiplied by runaway_detector_activation_percent, YMatrix terminates queries managed by resource groups (excluding queries in the system_group), starting with the query consuming the most memory. Termination continues until memory usage falls below the specified percentage threshold.

Assigning Resource Groups to Roles

  • Use the RESOURCE GROUP clause of the CREATE ROLE or ALTER ROLE command to assign a resource group to a database role.
ALTER ROLE bill RESOURCE GROUP rg_light;
CREATE ROLE mary RESOURCE GROUP exec;

You can assign a resource group to one or more roles. If a role hierarchy is defined, the resource group assigned to a parent role does not propagate to its member roles.

  • To remove a resource group assignment from a role and assign it to the default group, change the role's resource group to NONE.
ALTER ROLE mary RESOURCE GROUP NONE;

Monitoring Resource Group Status

  • View resource group limits:

    SELECT * FROM gp_toolkit.gp_resgroup_config;
  • View resource group query status:

    SELECT * FROM gp_toolkit.gp_resgroup_status;
  • View memory usage per host for each resourceグループ:

    SELECT * FROM gp_toolkit.gp_resgroup_status_per_host;
  • View resource groups assigned to roles:

    SELECT rolname, rsgname FROM pg_roles, pg_resgroup
    WHERE pg_roles.rolresgroup=pg_resgroup.oid;
  • View running and pending queries in resource groups:

    SELECT query, rsgname, wait_event_type, wait_event 
    FROM pg_stat_activity;
  • Cancel running or queued transactions in a resource group:

To manually cancel a running or queued transaction, first identify the process ID (pid) associated with the transaction. Once you have the pid, call pg_cancel_backend() to terminate the process.

Follow these steps:

a. Run the following query to view process information for all active or idle statements across all resource groups. If no results are returned, there are no running or queued transactions in any resource group.

```
SELECT rolname, g.rsgname, pid, waiting, state, query, datname 
FROM pg_roles, gp_toolkit.gp_resgroup_status g, pg_stat_activity 
WHERE pg_roles.rolresgroup=g.groupid
AND pg_stat_activity.usename=pg_roles.rolname;
```

b. Example query output:
```
rolname | rsgname  | pid     | waiting | state  |          query           | datname 
---------+----------+---------+---------+--------+--------------------------+---------
  sammy  | rg_light |  31861  |    f    | idle   | SELECT * FROM mytesttbl; | testdb
  billy  | rg_light |  31905  |    t    | active | SELECT * FROM topten;    | testdb
```

c. Terminate the transaction process:
```
SELECT pg_cancel_backend(31905);
```

Note!
Do not use the operating system KILL command to cancel any YMatrix database processes.

Moving Queries Between Resource Groups

Users with superuser privileges can use the gp_toolkit.pg_resgroup_move_query() function to move a running query from one resource group to another without stopping the query. This function can accelerate long-running queries by moving them to a resource group with higher resource allocation or availability.

pg_resgroup_move_query() moves only the specified query to the target resource group. Subsequent queries submitted by the same role remain assigned to the original resource group.

Note!
Only active or running queries can be moved to a new resource group. Idle, queued, or pending queries cannot be moved due to concurrency or memory constraints.

pg_resgroup_move_query() requires the process ID (pid) of the running query and the name of the target resource group.

pg_resgroup_move_query( pid int4, group_name text );

As described in "Canceling Running or Queued Transactions in a Resource Group," you can use the gp_toolkit.gp_resgroup_status view to list the name, ID, and status of each resource group.

When pg_resgroup_move_query() is called, the running query becomes subject to the target resource group’s configuration, including concurrency and memory limits:

  • If the target resource group has reached its concurrency limit, the database queues the query until a slot becomes available, or until the time specified by gp_resource_group_queuing_timeout (in milliseconds) elapses.
  • If the target resource group has an available slot, pg_resgroup_move_query() attempts to transfer slot control to the target process for up to gp_resource_group_move_timeout milliseconds. If the target process fails to handle the move request within this timeout, the database returns an error.
  • If pg_resgroup_move_query() is canceled after the target process has already acquired some—but not all—required slots, the segment process is not moved to the new group, and the target process retains the acquired slots. This inconsistent state is resolved at transaction end or when the target process executes the next command within the same transaction.
  • If the target resource group lacks sufficient available memory to meet the query’s current memory demand, the database returns an error. To resolve this, either increase the shared memory allocated to the target resource group or wait for running queries to complete before retrying the move.

After moving a query, there is no guarantee that the total memory usage of currently running queries in the target resource group will stay within its memory quota. In such cases, one or more running queries—including the moved query—may fail. To minimize this risk, reserve sufficient global shared memory for the resource group.