This document provides detailed parameter information for MatrixGate.
Note: The parameters in this section are the ones written to the configuration file that is generated before mxgate is started.
Parameter Name | Default Value | Description |
---|---|---|
[database] category | ||
--db-database | postgres | Name of the YMatrix database that MatrixGate connects to |
--db-master-host | Local hostname | Hostname of the YMatrix Master that MatrixGate connects to |
--db-master-port | 5432 | Port number of the YMatrix Master that MatrixGate connects to |
--db-user | Current system username | YMatrix username that MatrixGate connects as. Note: The user must have permission to create external tables. For a non-superuser, grant the permission with: alter user {username} CREATEEXTTABLE; |
--db-password | Empty | Password of the YMatrix user that MatrixGate connects as |
--db-max-conn | 10 | Maximum number of connections from MatrixGate to YMatrix |
[job] category | ||
--allow-dynamic | false | When --allow-dynamic=true is specified, the target table is determined dynamically from the content of the POST data (its first line). Use this option only when the target table name is not yet known when MatrixGate is started. To insert into a known target table, specify the table name explicitly with --target |
--delimiter | \| | Character that separates columns within each line (row) of the file |
--error-handling | accurate | How to handle lines with format errors. accurate: the malformed data is not loaded into the database and an error is logged; other data in the same batch is not affected. legacy: the entire batch fails. |
--exclude-columns | Empty | By default, the number and order of the columns provided during data loading must match the table definition. When the loaded data provides only some of the columns, use --exclude-columns to list the names of the columns to exclude. The order of the remaining columns must still match the table definition. Note: If --use-auto-increment is enabled to skip auto-increment fields, those fields do not need to be listed here; list only the other columns to exclude. |
--format | text | Data format of the source data: text or csv. text is the fastest but does not support line breaks in character-type values. csv is more widely applicable; character-type columns must be enclosed in double quotes |
--null-as | empty string | String that represents the null value. The default is an empty string without quotes. If a column is defined as NOT NULL and the data for that column is null, loading fails with an error. Note: To use \N as the null value, the backslash must be escaped, e.g. `--null-as \\N` |
--time-format | unix-second | Timestamp unit: unix-second \| unix-ms \| unix-nano \| raw. By default, MatrixGate treats the first column of each row as a Unix timestamp and automatically converts it to the database time format. If the timestamp is not in the first column, or it has already been converted to the database format, use raw so that MatrixGate performs no time conversion. |
--upsert-key | Empty | Key name for UPSERT; can be specified multiple times. A table that uses UPSERT must have a UNIQUE constraint, and every key of that constraint must be specified with this parameter. |
--deduplicate-key | Empty | Similar to UPSERT, except that only empty values are updated: if the old value is not empty, the new value is discarded. Mutually exclusive with --upsert-key; only one of the two can be used |
--use-auto-increment | true | When the target table contains an auto-increment field, whether to skip assigning values to that field during loading and use the system-generated auto-increment value instead |
--target | schemaName.tableName | Target table name. schemaName can be omitted and defaults to public. Multiple target tables can be specified in the form “--target table1 --target table2 …”. When this parameter is not provided, --allow-dynamic can be specified instead to allow the table name to be determined dynamically |
--dml-template | | File path of the mapping template that maps JSON fields to tuple columns |
[misc] category | ||
--log-archive-hours | 72 | Log files in the log directory that have not changed for this many hours are automatically compressed |
--log-compress | true | Global switch to enable automatic log compression |
--log-dir | /home/mxadmin/gpAdminLogs | Log directory |
--log-max-archive-files | 0 | Maximum number of compressed log files to retain. Once this number is exceeded, the oldest log files will be deleted. 0 means no deletion |
--log-remove-after-days | 0 | Number of days after which compressed log files are automatically deleted. 0 means no deletion |
--log-rotate-size-mb | 100 | When the current log file exceeds this size (in MB), logging switches to a new file and the old file is compressed immediately |
-v / --verbose | | Print verbose logs |
-V / --debug | | Print debug logs (includes both verbose and debug logs) |
--pprof-port | | Port for accessing pprof information; 0 disables it |
--no-cleanup | | Retain temporary mode even when exiting normally |
--grpc-port | | Port for accessing gRPC information; 0 disables it |
[metrics] category | ||
--metrics-enable | true | Enable metrics |
--metrics-sample-interval | 15 (since v5.3.2; previously 3) | Metrics sampling interval in seconds. A value greater than 0 enables metrics collection, which reduces performance |
[source] category | ||
--source | http | MatrixGate data source. Supports http / stdin / kafka / transfer / grpc |
[source] category | [HTTP] | Note: This mode is the default data source mode in the configuration file |
--http-port | 8086 | Port of the MatrixGate HTTP interface through which users submit data |
--max-body-bytes | 4194304 | Maximum size, in bytes, of each HTTP request body |
--max-concurrency | 40000 | Maximum number of concurrent HTTP connections |
--request-timeout | 0 | Request timeout in milliseconds; the default is 0. When set to a value greater than 0, a request that waits longer than this time fails with timeout (408) |
--disable-keep-alive | false | When true, MatrixGate closes the connection after each HTTP request |
--http-debug | false | Output additional diagnostic information for the HTTP source |
[source] category | [Transfer] | Note: Migration mode is not the default mode. To use it, configure the parameters in this section manually. |
--src-host | | IP address of the source database Master |
--src-port | | Port number of the source database Master |
--src-user | | Username for connecting to the source database (a superuser is recommended) |
--src-password | | Password for connecting to the source database |
--src-schema | | Schema name of the source table |
--src-table | | Table name of the source table |
--src-sql | | SQL used to filter data during migration |
--compress | | Compression method for transferring data from the source database Segments to this database. Empty string "": no compression, plaintext transmission. gzip: requires the gzip Linux command on the Segment hosts of the source database. lz4: requires the lz4 Linux command on the Segment hosts of the source database. Recommended order: lz4 > gzip > no compression |
--port-base | 9129 | A port that is occupied during transmission |
--local-ip | | IP address of the local machine; it must be one that the source database can connect to |
[transform] category | ||
--transform | plain | Convert the format or type of the data to be written. Supports plain / json / nil / tsbs / hanagdbc |
[writer] category | ||
--interval | 100 milliseconds | Time interval between MatrixGate batch data loads |
--writer | stream | Writer through which MatrixGate writes data to YMatrix. Supports stream / nil |
--stream-prepared | 10 | Number of slot processes invoked simultaneously in a single job |
--stream-host | mdw | Hostname of the YMatrix Master that MatrixGate connects to. Intended for systems with multiple network interfaces |
--use-gzip | auto | Whether to compress the data that MatrixGate sends to Segments: auto / yes / no. auto prefers the zstd compression algorithm. yes uses the gzip compression algorithm. no disables compression during transmission; this saves a small amount of CPU but significantly increases the amount of data sent over the network. The default auto is recommended unless mxgate and the database are deployed on the same single host. |
--max-seg-conn | 128 | Number of Segments started when the external table pulls data from MatrixGate. Increasing this value consumes more network connection resources. |
--timing | false | When set to true, MatrixGate adds timing information for each INSERT statement to the log. |
--insert-timeout | 600000 | Timeout for MatrixGate INSERT statements, in milliseconds. When set to a value greater than 0, an INSERT that waits longer than this time times out. |
-I / --instrumentation | disable | Enables detection of slot(s). Supports disable / single / all. disable turns the feature off; single enables detection for slot[0] only; all enables detection for all slots |
--bytes-limit | | Size limit for batch data loading. Keeps data ingestion uniform when the data stream into MatrixGate is uneven. Disabled by default. When enabled, the size must be configured manually, in the range 0 to INT_MAX |
--auto-tune | false | When set to true, MatrixGate can adjust the number of slots used for write tasks. |
--abort-by-pause-timeout | 10000 | After a job is paused, once the timeout specified by this parameter is reached, the data accumulated in memory and waiting to be written to the database is automatically discarded. The valid range is 0 to INT_MAX and the recommended range is 1000 to 10000, in milliseconds (ms); 0 disables the feature. It is recommended to set this value significantly lower than --request-timeout so that --request-timeout remains the effective timeout while the job is not paused. Once the job is paused, mxgate compares the two timeout parameters and triggers a timeout error based on the lower value. This parameter can be configured only when --writer is stream |
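
The interaction between --request-timeout and --abort-by-pause-timeout described in the table can be modeled with a short Python sketch. This is an illustration of the documented rule only, not mxgate's implementation: the function name effective_timeout_ms is hypothetical, and the sketch assumes that 0 means "no timeout" for both parameters.

```python
from typing import Optional


def effective_timeout_ms(request_timeout: int,
                         abort_by_pause_timeout: int,
                         job_paused: bool) -> Optional[int]:
    """Return the timeout (in ms) that applies, or None if no timeout applies.

    Hypothetical helper modeling the rule in the parameter table:
    - while the job is running, only --request-timeout applies;
    - once the job is paused, the lower of the two enabled values applies.
    Assumption: 0 disables the corresponding timeout.
    """
    if request_timeout < 0 or abort_by_pause_timeout < 0:
        raise ValueError("timeouts must be >= 0")

    candidates = []
    if request_timeout > 0:
        candidates.append(request_timeout)
    # The pause timeout only takes part in the comparison while the job is paused.
    if job_paused and abort_by_pause_timeout > 0:
        candidates.append(abort_by_pause_timeout)

    return min(candidates) if candidates else None
```

For example, with --request-timeout 30000 and the default --abort-by-pause-timeout 10000, the sketch returns 30000 while the job is running and 10000 once it is paused, which is why the table recommends keeping the pause timeout well below the request timeout.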
Note:
The parameters in this section can be used on the command line after mxgate is started.
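
To make the [job] parameters --delimiter, --null-as, and --time-format more concrete, the following Python sketch builds one text-format input row with the timestamp in the first column. It is a minimal sketch under stated assumptions, not part of MatrixGate: the helper names unix_timestamp and encode_text_row are hypothetical, and rejecting values that contain the delimiter or a line break is a simplification chosen here because the table states that the text format does not support line breaks in character types.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_timestamp(ts: datetime, time_format: str = "unix-second") -> int:
    """Convert a timezone-aware datetime to the unit named by --time-format."""
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    delta = ts - _EPOCH
    # Integer arithmetic avoids floating-point rounding for large values.
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    if time_format == "unix-second":
        return micros // 1_000_000
    if time_format == "unix-ms":
        return micros // 1_000
    if time_format == "unix-nano":
        return micros * 1_000
    # "raw" means the caller supplies an already formatted value; not handled here.
    raise ValueError(f"unsupported time format: {time_format}")


def encode_text_row(ts: datetime,
                    values: Sequence[Optional[object]],
                    delimiter: str = "|",
                    null_as: str = "",
                    time_format: str = "unix-second") -> str:
    """Build one text-format row: timestamp first, then the remaining columns."""
    fields = [str(unix_timestamp(ts, time_format))]
    for value in values:
        if value is None:
            fields.append(null_as)  # None is written as the --null-as string
            continue
        text = str(value)
        if delimiter in text or "\n" in text:
            # Simplification for this sketch: reject instead of escaping.
            raise ValueError(f"value not representable in text format: {text!r}")
        fields.append(text)
    return delimiter.join(fields)
```

With the defaults from the table, encode_text_row(datetime(2021, 1, 1, tzinfo=timezone.utc), ["sensor-1", None, 21.5]) yields 1609459200|sensor-1||21.5, where the empty third field stands for null.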
Note:
For an overview of MatrixGate's main features, see MatrixGate Main Features.