2010-02-22

[Linux Notes] /proc/stat and Calculating CPU Usage

Run the command:

$ cat /proc/stat
which prints something like the following:
cpu  480 0 708 4305468 1901 27 53 0 0
cpu0 264 0 503 2143900 1730 27 53 0 0
cpu1 215 0 204 2161567 170 0 0 0 0
intr 1587075 115 8 0 2 3 0 5 0 1 0 0 0 119 0 24081 0 1306153 3575
0 685 0 0 0 0 0 0 0 68461 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 499671
btime 1266751940
processes 1984
procs_running 2
procs_blocked 0
softirq 1574719 0 142505 4290 68517 7047 1306131 25188 7 21034

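The man 5 proc manual page has the full details; below is a brief note on what each field means.
  1. The first line, cpu, is the sum over cpu0~cpuN (this sample shows a dual-core system). The numbers that follow (nine in this sample; older kernels print fewer) show how much "effort" the CPU has spent on each kind of work since boot. The unit of this "effort" is the USER_HZ tick, often loosely called a jiffy: 1/100 of a second on most architectures, and sysconf(_SC_CLK_TCK) returns the actual value. The cpu columns, in order, mean:
    user: time spent running ordinary processes in user mode
    nice: time spent running niced processes in user mode (they are called nice processes because they are so nice that they give up their own priority so that others get more chances to run first)
    system: time spent running processes in kernel mode
    idle: accumulated time the CPU spent with nothing to do
    iowait: time spent waiting for I/O to complete
    irq: time spent servicing hardware interrupts
    softirq: time spent servicing software interrupts (softirqs)
    steal: Since Linux 2.6.11, there is an eighth column, steal - stolen time, which is the time spent in other operating systems when running in a virtualized environment
    guest: Since Linux 2.6.24, there is a ninth column, guest, which is the time spent running a virtual CPU for guest operating systems under the control of the Linux kernel
  2. intr is followed by a long run of numbers (many of them 0) that count, since boot time, how many times each interrupt has been serviced; the first number is the total over all interrupts.
  3. ctxt is the total number of context switches across all CPUs.
  4. btime records the moment the system booted, in seconds since the Unix epoch.
  5. processes records how many processes and threads have been created since boot.
  6. procs_running records how many processes are currently in a runnable state.
  7. procs_blocked records how many processes are currently blocked waiting for I/O to complete.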
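
As a small illustration of the column layout described above, here is a minimal sketch that parses the aggregate cpu line into named fields. The struct cpu_times and the function parse_cpu_line are hypothetical names made up for this note, not part of any library; the sketch assumes the line has already been read from /proc/stat (for example with fgets).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the columns of the aggregate "cpu" line. */
struct cpu_times {
    unsigned long long user, nice, system, idle, iowait,
                       irq, softirq, steal, guest;
};

/* Parse one "cpu ..." line. Returns the number of columns read (older
 * kernels print fewer than nine; missing columns stay 0), or -1 if the
 * line is not the aggregate line (e.g. "cpu0 ..." or "intr ..."). */
int parse_cpu_line(const char *line, struct cpu_times *t)
{
    int n;

    memset(t, 0, sizeof *t);
    /* Require "cpu" followed by a space so that cpu0, cpu1, ... are rejected. */
    if (strncmp(line, "cpu ", 4) != 0)
        return -1;
    n = sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
               &t->user, &t->nice, &t->system, &t->idle, &t->iowait,
               &t->irq, &t->softirq, &t->steal, &t->guest);
    return n > 0 ? n : -1;
}
```

Fed the first line of the sample output above, this should fill in user = 480, system = 708, idle = 4305468 and so on, and report nine columns.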

The following code computes the "current" CPU usage rate:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/poll.h>
#include <sys/time.h>

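/* Clamp a negative delta to 0. tz must be a signed type, otherwise the
 * "< 0" test can never be true. */
#define TRIMz(x) ((tz = (long long)(x)) < 0 ? 0 : tz)

/* SIGALRM handler: prints the CPU time breakdown since the previous call.
 * Caveats: the name shadows libc's stat(), and stdio calls are not
 * async-signal-safe; both are tolerated here only because this is a demo. */
void stat(int signo) {
    FILE *fp_stat ;
    char buff[128+1] ;
    /* The *_sav values start at 0, so the first report covers the time since boot. */
    static unsigned long long u, u_sav, u_frme, s, s_sav, s_frme,
        n, n_sav, n_frme, i, i_sav, i_frme,
        w, w_sav, w_frme, x, x_sav, x_frme,
        y, y_sav, y_frme, z, z_sav, z_frme,
        tot_frme ;
    long long tz ;
    float scale ;

    (void)signo ;
    fp_stat = fopen("/proc/stat","r") ;
    if( fp_stat == NULL )
        return ;
    while( fgets(buff, sizeof(buff), fp_stat) ) {
        /* Match only the aggregate "cpu " line, not cpu0, cpu1, ... */
        if( strncmp(buff, "cpu ", 4) == 0 ) {
            sscanf(buff, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
                &u, &n, &s, &i, &w, &x, &y, &z) ;
            break ;
        }
    }
    fclose(fp_stat) ;
    /* Ticks spent in each category since the previous sample */
    u_frme = u - u_sav ;
    s_frme = s - s_sav ;
    n_frme = n - n_sav ;
    i_frme = TRIMz(i - i_sav) ;
    w_frme = w - w_sav ;
    x_frme = x - x_sav ;
    y_frme = y - y_sav ;
    z_frme = z - z_sav ;
    tot_frme = u_frme + s_frme + n_frme + i_frme + w_frme + x_frme + y_frme + z_frme ;
    if (tot_frme < 1)
        tot_frme = 1 ; /* avoid dividing by zero */
    scale = 100.0 / (float)tot_frme ;
    printf("%5.1f%%us,%5.1f%%sy,%5.1f%%ni,%5.1f%%id,%5.1f%%wa,%5.1f%%hi,%5.1f%%si,%5.1f%%st\n",
        u_frme * scale, s_frme * scale, n_frme * scale,
        i_frme * scale, w_frme * scale, x_frme * scale, y_frme * scale, z_frme * scale) ;
    /* Remember this sample for the next call */
    u_sav = u;
    s_sav = s;
    n_sav = n;
    i_sav = i;
    w_sav = w;
    x_sav = x;
    y_sav = y;
    z_sav = z;
}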

int main() {
    struct itimerval tick ;
    int res ;

    /* Install stat() as the timer signal handler */
    signal(SIGALRM, stat) ; /* SIGALRM handler */
    /* delay before the first expiry */
    tick.it_value.tv_sec = 0 ; // sec
    tick.it_value.tv_usec = 500000 ; // usec
    /* interval between subsequent expiries */
    tick.it_interval.tv_sec = 0 ; // sec
    tick.it_interval.tv_usec = 500000 ; // usec
    res = setitimer(ITIMER_REAL, &tick, NULL);
    if(res)
        fprintf(stderr, "Error: timer setting failed.\n") ;
    else
        printf("Timer start...\n") ;

    /* Wait for input on stdin; the handler prints a report on every tick meanwhile */
    getchar() ;

    return 0 ;
}

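This program prints the share of CPU time currently going to each kind of work, imitating the display of the top command: every 0.5 seconds it prints the overall CPU usage.
[Brief description of the program]
A timer is set to deliver the SIGALRM signal every 0.5 seconds. Each SIGALRM runs the stat subroutine, which reads /proc/stat and subtracts the previous reading from the current one to find how much CPU effort was spent in the interval between the two samples: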

u_frme = u - u_sav ;
s_frme = s - s_sav ;
n_frme = n - n_sav ;
i_frme = TRIMz(i - i_sav) ;
w_frme = w - w_sav ;
x_frme = x - x_sav ;
y_frme = y - y_sav ;
z_frme = z - z_sav ;
tot_frme = u_frme + s_frme + n_frme + i_frme + w_frme + x_frme + y_frme + z_frme ;
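
As a side note on the timer mechanism, the sampling could also be driven from the main loop instead of from inside the signal handler, which avoids calling stdio functions in signal context. The following is only a sketch under that assumption; start_ticks, wait_tick and stop_ticks are hypothetical helper names, not taken from the program above.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Set by the handler, cleared by wait_tick(). */
static volatile sig_atomic_t tick_pending;

/* The handler does nothing but record that a tick happened. */
static void on_alarm(int signo)
{
    (void)signo;
    tick_pending = 1;
}

/* Arm a periodic ITIMER_REAL timer firing every usec microseconds.
 * Returns 0 on success, -1 on error (a period of 0 would disarm the timer). */
int start_ticks(long usec)
{
    struct sigaction sa;
    struct itimerval tick;

    if (usec <= 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;
    tick.it_value.tv_sec = usec / 1000000;
    tick.it_value.tv_usec = usec % 1000000;
    tick.it_interval = tick.it_value;
    tick_pending = 0;
    return setitimer(ITIMER_REAL, &tick, NULL);
}

/* Block until the next tick. If a tick lands between the check and pause(),
 * the call simply returns one period later, which is fine for a periodic timer. */
void wait_tick(void)
{
    while (!tick_pending)
        pause();
    tick_pending = 0;
}

/* Disarm the timer. */
void stop_ticks(void)
{
    struct itimerval off;

    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);
}
```

A main loop would then call start_ticks(500000) once and, after each wait_tick(), read /proc/stat and print the report from ordinary (non-signal) context.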

Dividing the effort spent on each kind of work by the total effort gives the real-time CPU usage:
scale = 100.0 / (float)tot_frme ;
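The 100 in scale is there only to express the result as a percentage. Rather than dividing each category by the total and then multiplying by 100 every time, it is better to fold the 100 in once up front!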
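
To make the arithmetic concrete, here is a minimal sketch of the same calculation as a pure function over two snapshots. NCOL and cpu_percent are hypothetical names, the columns are taken in /proc/stat order, and (unlike the program above, which clamps only idle) this sketch assumes every column should be clamped at 0 if a counter appears to go backwards.

```c
#define NCOL 8  /* user, nice, system, idle, iowait, irq, softirq, steal */

/* Fill pct[k] with the percentage of the interval spent in column k,
 * given the tick counters from the previous and the current sample. */
void cpu_percent(const unsigned long long prev[NCOL],
                 const unsigned long long cur[NCOL],
                 double pct[NCOL])
{
    long long delta[NCOL];
    long long total = 0;
    double scale;
    int k;

    for (k = 0; k < NCOL; k++) {
        delta[k] = (long long)(cur[k] - prev[k]);
        if (delta[k] < 0)       /* counter went backwards: treat as 0 */
            delta[k] = 0;
        total += delta[k];
    }
    if (total < 1)              /* avoid dividing by zero */
        total = 1;
    scale = 100.0 / (double)total;   /* fold the 100 in once */
    for (k = 0; k < NCOL; k++)
        pct[k] = (double)delta[k] * scale;
}
```

For instance, if the deltas over one interval are 50 user, 20 system, 920 idle and 10 iowait ticks (1000 in total), scale is 0.1 and the result is 5% user, 2% system, 92% idle and 1% iowait.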