IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    How to find the corresponding session in MogDB/openGauss from OS thread ID

    kamus发表于 2022-05-30 16:14:55
    love 0

    诊断需求

    当MogDB数据库由于某种原因占用了较大的系统负载,比如CPU占用接近100%,那么如何知道到底是数据库里的哪个会话或者哪些会话占用了资源?
    在Oracle数据库中,这样的问题诊断,通常都会关联 v$session, v$process, 以及操作系统top命令或者ps命令中查到的操作系统进程ID。
    但是MogDB本身是线程模型,在操作系统上只能看到一个进程号,那么该如何定位问题?

    因为MogDB使用的是线程模型,与PostgreSQL的进程模型不同,在操作系统级别如果用lsof命令查看网络端口的接入,虽然是会看到有多个用户会话接入,但是在PID列只会显示进程ID,无法对应到线程。

    $ lsof -i 4 -a -p `pgrep -u omm3 mogdb`
    COMMAND   PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
    mogdb   12027 omm3    8u  IPv4 20313752      0t0  TCP *:biimenu (LISTEN)
    mogdb   12027 omm3    9u  IPv4 20313753      0t0  TCP *:18001 (LISTEN)
    mogdb   12027 omm3  325u  IPv4 28320946      0t0  TCP mogdb-kernel-0004:biimenu->softbank060132034173.bbtec.net:45206 (ESTABLISHED)
    mogdb   12027 omm3  330u  IPv4 28316174      0t0  TCP mogdb-kernel-0004:biimenu->softbank060132034173.bbtec.net:45208 (ESTABLISHED)
    mogdb   12027 omm3  336u  IPv4 28302815      0t0  TCP mogdb-kernel-0004:biimenu->softbank060132034173.bbtec.net:45210 (ESTABLISHED)
    mogdb   12027 omm3  340u  IPv4 28323140      0t0  TCP mogdb-kernel-0004:biimenu->softbank060132034173.bbtec.net:45212 (ESTABLISHED)
    mogdb   12027 omm3  360u  IPv4 28323141      0t0  TCP mogdb-kernel-0004:biimenu->softbank060132034173.bbtec.net:45214 (ESTABLISHED)
    mogdb   12027 omm3  375u  IPv4 28305050      0t0  TCP mogdb-kernel-0004:biimenu->softbank060132034173.bbtec.net:45216 (ESTABLISHED)

    如何获取线程ID

    可以使用htop,打开htop界面以后,按F5进行进程树形显示,第一个PID是进程号,而树形结构下的每一行的PID都是该进程中对应的线程号。

    也可以直接使用ps命令。-L参数表示显示线程,-o参数可以指定显示感兴趣的列。

    # ps -Lp `pgrep -u omm3 mogdb` -o %cpu,tid,pid,ppid,cmd,comm
    %CPU   TID   PID  PPID CMD                         COMMAND
     0.0 17847 17847     1 /opt/mogdb3/app/bin/mogdb - mogdb
     0.0 17848 17847     1 /opt/mogdb3/app/bin/mogdb - jemalloc_bg_thd
     0.0 17854 17847     1 /opt/mogdb3/app/bin/mogdb - mogdb
     0.0 17855 17847     1 /opt/mogdb3/app/bin/mogdb - syslogger
     0.0 17856 17847     1 /opt/mogdb3/app/bin/mogdb - reaper
     0.0 17857 17847     1 /opt/mogdb3/app/bin/mogdb - jemalloc_bg_thd
     0.0 17858 17847     1 /opt/mogdb3/app/bin/mogdb - jemalloc_bg_thd
     0.0 17860 17847     1 /opt/mogdb3/app/bin/mogdb - jemalloc_bg_thd
     0.0 17884 17847     1 /opt/mogdb3/app/bin/mogdb - checkpointer
     0.0 17885 17847     1 /opt/mogdb3/app/bin/mogdb - Spbgwriter
     0.1 17886 17847     1 /opt/mogdb3/app/bin/mogdb - pagewriter
     0.0 17887 17847     1 /opt/mogdb3/app/bin/mogdb - pagewriter
     0.0 17888 17847     1 /opt/mogdb3/app/bin/mogdb - pagewriter
     0.0 17889 17847     1 /opt/mogdb3/app/bin/mogdb - pagewriter
     0.0 17890 17847     1 /opt/mogdb3/app/bin/mogdb - pagewriter
     0.8 17891 17847     1 /opt/mogdb3/app/bin/mogdb - WALwriter
     0.0 17892 17847     1 /opt/mogdb3/app/bin/mogdb - WALwriteraux
     0.0 17893 17847     1 /opt/mogdb3/app/bin/mogdb - AVClauncher
     0.0 17894 17847     1 /opt/mogdb3/app/bin/mogdb - Jobscheduler
     0.0 17895 17847     1 /opt/mogdb3/app/bin/mogdb - asyncundolaunch
     0.0 17896 17847     1 /opt/mogdb3/app/bin/mogdb - globalstats
     0.0 17897 17847     1 /opt/mogdb3/app/bin/mogdb - applylauncher
     0.0 17898 17847     1 /opt/mogdb3/app/bin/mogdb - statscollector
     0.0 17899 17847     1 /opt/mogdb3/app/bin/mogdb - snapshotworker
     0.1 17900 17847     1 /opt/mogdb3/app/bin/mogdb - TrackStmtWorker
     0.0 17901 17847     1 /opt/mogdb3/app/bin/mogdb - 2pccleaner
     0.0 17902 17847     1 /opt/mogdb3/app/bin/mogdb - faultmonitor
     0.0 17904 17847     1 /opt/mogdb3/app/bin/mogdb - undorecycler
     0.0 18372 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18373 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18374 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18375 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18376 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18377 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18378 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18379 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18380 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18381 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18382 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18454 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 19475 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 19480 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 29529 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 30999 17847     1 /opt/mogdb3/app/bin/mogdb - worker

    comm列中显示为woker的都是用户会话的数据库端后台进程,通常占用CPU较高的会是用户会话,因此可以用grep命令来过滤只显示用户会话。

    #  ps -Lp `pgrep -u omm3 mogdb` -o %cpu,tid,pid,ppid,cmd,comm | grep -w worker
     0.0 18372 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18373 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18374 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18375 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18376 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18377 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18378 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18379 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18380 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18381 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18382 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 18454 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 19475 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 19480 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 29529 17847     1 /opt/mogdb3/app/bin/mogdb - worker
     0.0 30999 17847     1 /opt/mogdb3/app/bin/mogdb - worker

    如何将操作系统线程ID跟数据库会话对应

    假设在这台服务器上,线程ID=18372的线程占用了大量CPU。

    在MogDB里查询pg_os_threads视图,可以获得该线程在数据库中对应的会话ID 。

    MogDB=# select * from pg_os_threads where lwpid=18372;
     node_name |       pid       | lwpid | thread_name |         creation_time
    -----------+-----------------+-------+-------------+-------------------------------
     dn_6001   | 140545137571584 | 18372 | dn_6001     | 2022-05-30 19:54:42.459129+08
    (1 row)

    pg_os_threads视图里记录了轻量级线程号和会话ID的关系,其中lwpid是操作系统线程ID,pid列是数据库会话ID。具体信息可以参考MogDB文档-PG_OS_THREADS。
    如果具备monadmin权限,也可以查询dbe_perf模式下的os_threads视图,信息是一样的。

    找到数据库会话ID之后,就可以为所欲为了,比如可以查询dbe_perf.session_stat_activity视图来获取该会话的应用名,发起的客户端IP地址,还有该会话正在执行的SQL。

    MogDB=# select application_name,client_addr,query from dbe_perf.session_stat_activity where pid=140545137571584;
     application_name | client_addr  |                          query
    ------------------+--------------+---------------------------------------------------------
     dn_6001          | 172.16.0.176 | SELECT cfg_value FROM bmsql_config  WHERE cfg_name = $1
    (1 row)

    也可以查询dbe_perf.thread_wait_status视图来获取会话的当前等待事件。BTW,实际上该视图中有lwtid字段,可以直接对应到线程ID。

    MogDB=# select lwtid,wait_status,wait_event from dbe_perf.thread_wait_status where sessionid=140545137571584;
     lwtid | wait_status | wait_event
    -------+-------------+------------
     18372 | wait cmd    | wait cmd
    (1 row)


沪ICP备19023445号-2号
友情链接