watsonx.data

watsonx.data

Put your data to work, wherever it resides, with the hybrid, open data lakehouse for AI and analytics

 View Only

Presto: Observation "system.runtime" schema provided by System Connector

By MIKA SHIMOGAWA posted Tue August 05, 2025 02:59 AM

  

In this article, I want to show the observation results of "runtime" schema in "system" catalog using watsonx.data 2.1.2. Here shows the structure of each table obtained with presto-cli and the output from Presto (Java). 

To Japanese Readers : This is the translation of Presto: システム・コネクターの system.runtimeにアクセスしてみた。 in Qiita.

Presto System Connector at watsonx.data

Presto is one of the Open Source query engines included in watsonx.data to provide fast, reliable, and efficient processing of data.

Presto System Connector, implemented in the Open Source Presto, provides information and metrics for queries running on Presto clusters. 

When you add a Presto engine in watsonx.data, System connector is automatically configured in the Presto engine. User can access "system"  catalog provided by the Presto System Connector using regular SQL queries. 

Using the watsonx.data Data manager, "system" catalog is listed in "System monitoring data" as follows:

Here I show example outputs using presto-cli. You can get the same output using watsonx.data Query workspace.

Note that table structures in system.runtime schema of Presto (C++) are same as that of Presto (Java).

system.runtime

"system.runtime" schema has 4 tables:

Table Explanation
nodes The nodes table contains the list of visible nodes in the Presto cluster along with their status.
queries The queries table contains information about currently and recently running queries on the Presto cluster. From this table you can find out the original query text (SQL), the identity of the user who ran the query and performance information about the query including how long the query was queued and analyzed.
tasks The tasks table contains information about the tasks involved in a Presto query including where they were executed and how many rows and bytes each task processed.
transactions The transactions table contains the list of currently open transactions and related metadata. This includes information such as the create time, idle time, initialization parameters, and accessed catalogs.

 

The followings are details about each of the tables

 

system.runtime.nodes

"system.runtime.nodes" table contains the list of visible nodes in the Presto cluster along with their status.

Table structure

The table structure is as follows. In this example, I omitted  "Extra" and "Comment" columns because these columns have no value. I added  "values observed" column to show values I observed in my environment.

presto> describe system.runtime.nodes;
    Column    |  Type   | Precision | Scale |   Length   | values observed
--------------+---------+-----------+-------+------------+-------------
 node_id      | varchar | NULL      | NULL  | 2147483647 | 
 http_uri     | varchar | NULL      | NULL  | 2147483647 | https://ibm-lh-lakehouse... 
 node_version | varchar | NULL      | NULL  | 2147483647 | 0.286
 coordinator  | boolean | NULL      | NULL  | NULL       | true / false
 state        | varchar | NULL      | NULL  | 2147483647 | active
 node_type    | varchar | NULL      | NULL  | 2147483647 | NORMAL
(6 rows)

 

Output of Presto (Java) engine: In my watsonx.data , Presto (Java) runs in an environment where the coordinator and worker live together in one pod. So "system.runtime.nodes" shows 1 row and coordinator column becomes "true".

presto> select * from system.runtime.nodes ;
               node_id                | http_uri                                                       | node_version | coordinator | state  | node_type
--------------------------------------+----------------------------------------------------------------+--------------+-------------+--------+-----------
 536b979f-5e94-4cd2-bfab-45be1a1aee3a | https://ibm-lh-lakehouse-presto512-single-blue-0.(domain):8443 | 0.286        | true        | active | NORMAL
(1 row)

 

Output of Presto (C++) engine: In my watsonx.data, Presto (C++) consists of one Coordinator and three Workers. "system.runtime.nodes" shows 4 lines: focusing on coordinator column.

  • true is set on 1 row for the coordinator.
  • false is set on other 3 rows for the worker nodes.

presto> select * from system.runtime.nodes;
               node_id                | http_uri                                                                   | node_version | coordinator | state  | node_type
--------------------------------------+----------------------------------------------------------------------------+--------------+-------------+--------+-----------
 4846690c-05e6-439b-9a10-0712d5e515d0 | https://ibm-lh-lakehouse-prestissimo578-prestissimo-worker-1.(domain):7443 | 0.286        | false       | active | NORMAL
 a4cc17ce-158c-4e39-beaa-0cc1603219d0 | https://ibm-lh-lakehouse-prestissimo578-coordinator-blue-0.(domain):8443   | 0.286        | true        | active | NORMAL
 aa63bd77-8a92-43ad-bd7f-9844354cf6c9 | https://ibm-lh-lakehouse-prestissimo578-prestissimo-worker-0.(domain):7443 | 0.286        | false       | active | NORMAL
 beac420a-0007-4259-99e9-ff7c439fabb3 | https://ibm-lh-lakehouse-prestissimo578-prestissimo-worker-2.(domain):7443 | 0.286        | false       | active | NORMAL
(4 rows)

  

system.runtime.queries 

"system.runtime.queries" table contains information about currently and recently running queries on the Presto cluster. From this table you can find out the original query text (SQL), user who ran the query , and performance information including how long the query was queued  or analyzed.

Table structure

The table structure of system.runtime.queries is as follows. In this example, I omitted  "Extra" and "Comment" columns because these columns have no value. I added  "values observed" column to show values I observed in my environment.

presto> describe system.runtime.queries ;
      Column       |      Type      | Precision | Scale |   Length   | values observed
-------------------+----------------+-----------+-------+------------+---------
 query_id          | varchar        | NULL      | NULL  | 2147483647 +
 state             | varchar        | NULL      | NULL  | 2147483647 + RUNNING / FINISHED / FAILED / PLANNING
 user              | varchar        | NULL      | NULL  | 2147483647 +
 source            | varchar        | NULL      | NULL  | 2147483647 + presto-cli / wxd-system / wxd-sql ...
 query             | varchar        | NULL      | NULL  | 2147483647 +
 resource_group_id | array(varchar) | NULL      | NULL  | NULL       + [global]
 queued_time_ms    | bigint         |        10 | NULL  | NULL       +
 analysis_time_ms  | bigint         |        10 | NULL  | NULL       +
 created           | timestamp      | NULL      | NULL  | NULL       +
 started           | timestamp      | NULL      | NULL  | NULL       +
 last_heartbeat    | timestamp      | NULL      | NULL  | NULL       +
 end               | timestamp      | NULL      | NULL  | NULL       +
(12 rows)

 

Example output : Since it is horizontally long , I will introduce it in three parts.

(1) First 4 columns are query_id, state, user, source, resource_group_id

presto>  select query_id, state, user , source , resource_group_id  from system.runtime.queries order by query_id desc limit 5;
          query_id           |  state   |   user    |   source   | resource_group_id
-----------------------------+----------+-----------+------------+-------------------
 20250422_061937_00187_drami | RUNNING  | cpadmin   | presto-cli | [global]
 20250422_061848_00186_drami | FINISHED | cpadmin   | presto-cli | [global]
 20250422_061740_00184_drami | FINISHED | cpadmin   | presto-cli | [global]
 20250422_061639_00183_drami | FAILED   | prestoadm | NULL       | [global]
 20250422_061638_00182_drami | FINISHED | prestoadm | NULL       | [global]
(5 rows)

 

(2) Columns query_id, user, source, query: 2 rows of query are displayed as executed by "user02" using "presto-cli".

presto> select query_id, user, source , query from  system.runtime.queries where user='user02' order by query_id desc limit 2 ;

       query_id            |  user  |   source   |                                        query
-----------------------------+--------+------------+--------------------------------------------------------------------------------------
 20250422_055253_00163_drami | user02 | presto-cli | SELECT table_name FROM information_schema.tables WHERE table_schema = 'hsf1_no_part'
 20250422_055244_00162_drami | user02 | presto-cli | select                                                                              +
                             |        |            |  l_returnflag,                                                                      +
                             |        |            |  l_linestatus,                                                                      +
                             |        |            |  sum(l_quantity) as sum_qty,                                                        +
                             |        |            |  sum(l_extendedprice) as sum_base_price,                                            +
                             |        |            |  sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,                         +
                             |        |            |  sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,               +
                             |        |            |  avg(l_quantity) as avg_qty,                                                        +
                             |        |            |  avg(l_extendedprice) as avg_price,                                                 +
                             |        |            |  avg(l_discount) as avg_disc,                                                       +
                             |        |            |  count(*) as count_order                                                            +
                             |        |            | from                                                                                +
                             |        |            |  lineitem                                                                           +
                             |        |            | where                                                                               +
(Skip)

 

(3) Columns query_id, queued_time_ms,analysis_time_ms, created, started, last_heartbeat, end:

  •   "queued_time_ms","analysis_time_ms" show the duration (ms) of each phase.
  •   "created, "started", "last_heartbeat", "end" show timestamp.

presto> select query_id , queued_time_ms,analysis_time_ms ,created , started , last_heartbeat, "end" from system.runtime.queries where user='user02' order by query_id desc limit 2;
          query_id           | queued_time_ms | analysis_time_ms |         created         |         started         |     last_heartbeat      |           end
-----------------------------+----------------+------------------+-------------------------+-------------------------+-------------------------+-------------------------
 20250422_055253_00163_drami |              0 |              703 | 2025-04-22 05:52:54.019 | 2025-04-22 05:52:54.123 | 2025-04-22 05:52:55.517 | 2025-04-22 05:52:55.617
 20250422_055244_00162_drami |              0 |              686 | 2025-04-22 05:52:44.631 | 2025-04-22 05:52:44.947 | 2025-04-22 05:53:02.644 | 2025-04-22 05:53:02.716
(2 rows)

  

system.runtime.tasks

"system.runtime.tasks" table contains information about the tasks involved in a Presto query including where they were executed and how many rows and bytes each task processed.

Table structure

The table structure of system.runtime.queries is as follows. In this example, I omitted  "Extra" and "Comment" columns because these columns have no value.


presto> describe system.runtime.tasks ;
         Column          |   Type    | Precision | Scale |   Length
-------------------------+-----------+-----------+-------+------------
 node_id                 | varchar   | NULL      | NULL  | 2147483647
 task_id                 | varchar   | NULL      | NULL  | 2147483647
 stage_execution_id      | varchar   | NULL      | NULL  | 2147483647
 stage_id                | varchar   | NULL      | NULL  | 2147483647
 query_id                | varchar   | NULL      | NULL  | 2147483647
 state                   | varchar   | NULL      | NULL  | 2147483647
 splits                  | bigint    |        10 | NULL  | NULL
 queued_splits           | bigint    |        10 | NULL  | NULL
 running_splits          | bigint    |        10 | NULL  | NULL
 completed_splits        | bigint    |        10 | NULL  | NULL
 split_scheduled_time_ms | bigint    |        10 | NULL  | NULL
 split_cpu_time_ms       | bigint    |        10 | NULL  | NULL
 split_blocked_time_ms   | bigint    |        10 | NULL  | NULL
 raw_input_bytes         | bigint    |        10 | NULL  | NULL
 raw_input_rows          | bigint    |        10 | NULL  | NULL
 processed_input_bytes   | bigint    |        10 | NULL  | NULL
 processed_input_rows    | bigint    |        10 | NULL  | NULL
 output_bytes            | bigint    |        10 | NULL  | NULL
 output_rows             | bigint    |        10 | NULL  | NULL
 physical_written_bytes  | bigint    |        10 | NULL  | NULL
 created                 | timestamp | NULL      | NULL  | NULL
 start                   | timestamp | NULL      | NULL  | NULL
 last_heartbeat          | timestamp | NULL      | NULL  | NULL
 end                     | timestamp | NULL      | NULL  | NULL
(24 rows)

 

Example Output  : This is an example information about a query (query_id='20250422_065234_00233_drami'). Since it is horizontally long, I will introduce it in four parts.

(1) The first 6 columns node_id , task_id, stage_execution_id ,stage_id,query_id, state: You can see Presto executed this query in 5 tasks.

presto> select node_id , task_id , stage_execution_id ,stage_id,query_id, state from system.runtime.tasks where query_id='20250422_065234_00233_drami';
               node_id                |               task_id               |       stage_execution_id        |           stage_id            |          query_id           |  state
--------------------------------------+-------------------------------------+---------------------------------+-------------------------------+-----------------------------+----------
 536b979f-5e94-4cd2-bfab-45be1a1aee3a | 20250422_065234_00233_drami.2.0.0.0 | 20250422_065234_00233_drami.2.0 | 20250422_065234_00233_drami.2 | 20250422_065234_00233_drami | FINISHED
 536b979f-5e94-4cd2-bfab-45be1a1aee3a | 20250422_065234_00233_drami.3.0.0.0 | 20250422_065234_00233_drami.3.0 | 20250422_065234_00233_drami.3 | 20250422_065234_00233_drami | FINISHED
 536b979f-5e94-4cd2-bfab-45be1a1aee3a | 20250422_065234_00233_drami.4.0.0.0 | 20250422_065234_00233_drami.4.0 | 20250422_065234_00233_drami.4 | 20250422_065234_00233_drami | FINISHED
 536b979f-5e94-4cd2-bfab-45be1a1aee3a | 20250422_065234_00233_drami.1.0.0.0 | 20250422_065234_00233_drami.1.0 | 20250422_065234_00233_drami.1 | 20250422_065234_00233_drami | FINISHED
 536b979f-5e94-4cd2-bfab-45be1a1aee3a | 20250422_065234_00233_drami.0.0.0.0 | 20250422_065234_00233_drami.0.0 | 20250422_065234_00233_drami.0 | 20250422_065234_00233_drami | FINISHED
(5 rows)

 

(2) Column task_id ,splits, queued_splits, running_splits, completed_splits, split_cpu_time_ms, split_blocked_time_ms: Information about the splits for each task

presto> select task_id ,splits, queued_splits , running_splits , completed_splits , split_cpu_time_ms , split_blocked_time_ms from system.runtime.tasks where query_id='20250422_065234_00233_drami';
               task_id               | splits | queued_splits | running_splits | completed_splits | split_cpu_time_ms | split_blocked_time_ms
-------------------------------------+--------+---------------+----------------+------------------+-------------------+-----------------------
 20250422_065234_00233_drami.2.0.0.0 |     37 |             0 |              0 |               37 |              2258 |                148500
 20250422_065234_00233_drami.3.0.0.0 |     34 |             0 |              0 |               34 |               824 |                 59482
 20250422_065234_00233_drami.4.0.0.0 |      1 |             0 |              0 |                1 |               166 |                     0
 20250422_065234_00233_drami.1.0.0.0 |     32 |             0 |              0 |               32 |                47 |                205241
 20250422_065234_00233_drami.0.0.0.0 |     17 |             0 |              0 |               17 |                10 |                109855
(5 rows)

 

(3) Columns task_id ,raw_input_bytes , raw_input_rows , processed_input_bytes, processed_input_rows , output_bytes , output_rows : Amount of data processed for each task.

presto> select task_id ,raw_input_bytes , raw_input_rows , processed_input_bytes, processed_input_rows , output_bytes , output_rows from system.runtime.tasks where query_id='20250422_065234_00233_drami’;
               task_id               | raw_input_bytes | raw_input_rows | processed_input_bytes | processed_input_rows | output_bytes | output_rows
-------------------------------------+-----------------+----------------+-----------------------+----------------------+--------------+-------------
 20250422_065234_00233_drami.2.0.0.0 |        52865068 |        6148341 |              53453026 |              6148341 |       372026 |       11620
 20250422_065234_00233_drami.3.0.0.0 |        11426397 |        1530142 |              11486639 |              1530142 |      4119528 |      147126
 20250422_065234_00233_drami.4.0.0.0 |          330572 |         150000 |                330572 |               150000 |       542556 |       30142
 20250422_065234_00233_drami.1.0.0.0 |          372034 |          11620 |                429940 |                11620 |         4480 |         160
 20250422_065234_00233_drami.0.0.0.0 |            5088 |            160 |                  4480 |                  160 |          280 |          10
(5 rows)

 

(4) Columns task_id , physical_written_bytes, created, start, last_heartbeat, "end": Amount of data written by each task and the timestamps of start and end, etc.

presto> select task_id , physical_written_bytes, created, start , last_heartbeat , "end" from system.runtime.tasks where query_id='20250422_065234_00233_drami';
               task_id               | physical_written_bytes |         created         |          start          |     last_heartbeat      |           end
-------------------------------------+------------------------+-------------------------+-------------------------+-------------------------+-------------------------
 20250422_065234_00233_drami.2.0.0.0 |                      0 | 2025-04-22 06:52:35.521 | 2025-04-22 06:52:35.728 | 2025-04-22 06:52:41.719 | 2025-04-22 06:52:41.813
 20250422_065234_00233_drami.3.0.0.0 |                      0 | 2025-04-22 06:52:35.515 | 2025-04-22 06:52:35.630 | 2025-04-22 06:52:38.619 | 2025-04-22 06:52:38.741
 20250422_065234_00233_drami.4.0.0.0 |                      0 | 2025-04-22 06:52:35.435 | 2025-04-22 06:52:35.625 | 2025-04-22 06:52:36.313 | 2025-04-22 06:52:36.333
 20250422_065234_00233_drami.1.0.0.0 |                      0 | 2025-04-22 06:52:35.258 | 2025-04-22 06:52:35.325 | 2025-04-22 06:52:41.713 | 2025-04-22 06:52:41.923
 20250422_065234_00233_drami.0.0.0.0 |                      0 | 2025-04-22 06:52:35.314 | 2025-04-22 06:52:35.323 | 2025-04-22 06:52:42.018 | 2025-04-22 06:52:42.017
(5 rows)

 

system.runtime.transactions

"system.runtime.transactions" table contains the list of currently open transactions and related metadata. This includes information such as the create time, idle time, initialization parameters, and accessed catalogs.

Table structure

The table structure of system.runtime.transactions is as follows. In this example, I omitted  "Extra" and "Comment" columns because these columns have no value.


presto> describe system.runtime.transactions;
       Column        |      Type      | Precision | Scale |   Length
---------------------+----------------+-----------+-------+------------
 transaction_id      | varchar        | NULL      | NULL  | 2147483647
 isolation_level     | varchar        | NULL      | NULL  | 2147483647
 read_only           | boolean        | NULL      | NULL  | NULL
 auto_commit_context | boolean        | NULL      | NULL  | NULL
 create_time         | timestamp      | NULL      | NULL  | NULL
 idle_time_secs      | bigint         |        10 | NULL  | NULL
 written_catalog     | varchar        | NULL      | NULL  | 2147483647
 catalogs            | array(varchar) | NULL      | NULL  | NULL
(8 rows)

 

Example of output Since it is long horizontally, I will introduce it in two parts. 

Here you find that 2 transactions are running with "READ UNCOMMITTED".

When you encounter long-running transactions, "system.runtime.transactions" may help to investigate the cause like resource confliction, since it shows idle_time_secs, isolation level, accessed catalog info.

(1) Columns transaction_id , isolation_level, read_only , auto_commit_context, create_time

presto> select transaction_id , isolation_level, read_only , auto_commit_context, create_time from system.runtime.transactions order by create_time desc;
            transaction_id            | isolation_level  | read_only | auto_commit_context |       create_time
--------------------------------------+------------------+-----------+---------------------+-------------------------
 f8e84c00-21a3-40f4-8c32-81f0291356d7 | READ UNCOMMITTED | false     | true                | 2025-04-28 06:49:59.837
 b3d277b5-b99b-4323-af2d-8a91dd10b2f5 | READ UNCOMMITTED | false     | true                | 2025-04-28 06:49:55.063
(2 rows)

 

(2) Column transaction_id , create_time , idle_time_secs , written_catalog , catalogs

presto> select transaction_id , create_time , idle_time_secs , written_catalog , catalogs from system.runtime.transactions order by create_time desc;
            transaction_id            |       create_time       | idle_time_secs | written_catalog |                        catalogs
--------------------------------------+-------------------------+----------------+-----------------+--------------------------------------------------------
 229a66ea-4359-409f-bfcb-2460d4d41ab3 | 2025-04-28 06:50:05.169 |              0 | NULL            | [system, $info_schema@system, $system@system]
 b3d277b5-b99b-4323-af2d-8a91dd10b2f5 | 2025-04-28 06:49:55.063 |              0 | NULL            | [hive_data, $system@hive_data, $info_schema@hive_data]
(2 rows)

Conclusion

In this article, I show the observation results of "runtime" schema in "system" catalog using watsonx.data 2.1.2. 

The record in the table "system.runtime.queries" introduced here is used in Query History to visualize the Query running in the Presto engine in watsonx.data. And for analysis, debugging and monitoring purposes, it is also stored permanently using QHMM. See also, my blog "watsonx.data Query History" and "watsonx.data QHMM(Query History Monitoring and Management)".

Environment

  • OCP Version : 4.16
  • CP4D 5.1.2 (watsonx.data 2.1.2)
  • Configured Presto engines
    • P01(Presto Java) - Type : Presto (Java) v0.286 / Size : Starter / Coordinator Nodes : 1  Worker Nodes : 0
    • PCpp (Presto(C++)) - Type : Presto (C++) v0.286 / Size : Starter / Coordinator Nodes : 1 Worker Nodes : 3


#watsonx.data
#PrestoEngine

0 comments
49 views

Permalink