A common question is "why is my CPU utilization at 100%?" There is a great deal of concern about how CPU usage is measured at the Oracle server level.
If you suspect a CPU utilization problem, see these important notes on 100% CPU and Oracle. Also see Oracle and CPU utilization metrics.
Also see my notes on OS Busy scripts.
Once we understand that CPU resources are scarce (just like RAM resources) and not to be wasted, we need to understand how to tell whether our Oracle server is making optimal use of its computing hardware.
Many OS utilities report CPU utilization statistics, including the ones covered below (top, w, sar, glance and vmstat), as well as uptime and procinfo.

Each of these tools displays CPU metrics at a finer level of detail than Oracle. This is because the OS does not reveal all processor details to applications (to UNIX, Oracle is just another application), so the best place to see what is going on inside your server is the operating system's CPU monitors. These report different metrics on CPU utilization:

The runqueue - This is the far left-hand column of the vmstat command display (labeled "r"). It reports the total length of the CPU dispatcher queue. When the runqueue exceeds the number of CPUs on the server, you have an overloaded server with a CPU bottleneck.
The load average - This is defined as the sum of the run queue length and the number of jobs currently running on the CPUs. Each display of the load average consists of three numbers. Most often they appear from left to right as the load averages for the past 1, 5, and 15 minutes. Occasionally, however, the order is reversed (as in some top output).
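
The runqueue rule above can be written down as a one-line test. This is only a sketch: the function name cpu_bottleneck is made up for illustration, and the CPU count has to be supplied by hand because the command that reports it differs between UNIX dialects. The sample values are a run queue of 100 on a 32-CPU server and an idle 6-CPU server.

```shell
# cpu_bottleneck RUNQUEUE NCPUS   (hypothetical helper, not a system command)
# Prints "overloaded" when the run queue is longer than the CPU count,
# otherwise "ok". Both arguments are plain integers.
cpu_bottleneck() {
    if [ "$1" -gt "$2" ]; then
        echo "overloaded"
    else
        echo "ok"
    fi
}

# Sample values only.
cpu_bottleneck 100 32
cpu_bottleneck 0 6
```

A run queue equal to the CPU count is reported as "ok", since the rule is about the queue exceeding the number of CPUs.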

There are a host of UNIX commands that display CPU and memory consumption. While there are dialect-specific utilities such as glance, we will look at the common vmstat and top utilities.

Using top to monitor CPU
The "top" command can be used to display CPU utilization. The metrics are:

load average - The load average is computed as the sum of the run queue length and the number of jobs currently running on the CPUs.
CPU states - These show percentage metrics for current processor usage.

System: corp-hp1                              Thu Jul 6 09:14:23 2000
Load averages: 0.04, 0.03, 0.03
340 processes: 336 sleeping, 4 running
Cpu states:
CPU   LOAD   USER   NICE    SYS    IDLE  BLOCK  SWAIT   INTR   SSYS
 0    0.06   5.0%   0.0%   0.6%   94.4%   0.0%   0.0%   0.0%   0.0%
 1    0.06   0.0%   0.0%   0.8%   99.2%   0.0%   0.0%   0.0%   0.0%
 2    0.06   0.8%   0.0%   0.0%   99.2%   0.0%   0.0%   0.0%   0.0%
 3    0.06   0.0%   0.0%   0.2%   99.8%   0.0%   0.0%   0.0%   0.0%
 4    0.00   0.0%   0.0%   0.0%  100.0%   0.0%   0.0%   0.0%   0.0%
 5    0.00   0.2%   0.0%   0.0%   99.8%   0.0%   0.0%   0.0%   0.0%
---   ----  -----  -----  -----   -----  -----  -----  -----  -----
avg   0.04   1.0%   0.0%   0.2%   98.8%   0.0%   0.0%   0.0%   0.0%

Memory: 493412K (229956K) real, 504048K (253952K) virtual, 767868K free  Page# 1/49

CPU TTY   PID USERNAME PRI NI   SIZE    RES STATE    TIME %WCPU  %CPU COMMAND
 0   ?  26835 applmgr  154 20 30948K 11936K sleep    0:49  3.91  3.90 f45runw
 2   ?  27210 applmgr  154 20 31316K 12836K sleep    0:49  1.91  1.91 f45runw
 5   ?     36 root     152 20     0K     0K run     56:28  1.16  1.16 vxfsd
 1   ?    347 root     154 20    32K    96K sleep  567:15  1.11  1.11 syncer
 5   ?  27429 oracle   154 20 20736K  2608K sleep    0:23  0.39  0.38 oraclePROD
 4   ?  27067 oracle   154 20 21984K  3792K sleep    1:31  0.36  0.36 oraclePROD

Using svmon on AIX

root@AIX1 [/]#svmon
               size      inuse       free        pin    virtual
memory      1048566    1023178       4976      55113     251293
pg space     524288      10871
               work       pers       clnt
pin           55116          0          0
in use       250952     772224          2

Where:

size = the number of real memory frames (the size of real memory)

inuse = the number of frames containing pages

pin = the number of frames containing pinned pages

The svmon command can also be used with the -P option to display characteristics for a specific process ID (PID):

Root> svmon -P 26060
-------------------------------------------------------------------------------
  Pid Command   Inuse   Pin  Pgsp  Virtual  64-bit  Mthrd
26060 pr         6871  1607  1022     6001       N      N

 Vsid  Esid Type Description             Inuse   Pin  Pgsp Virtual Addr Range
24029     d work shared library text      3992     0    22    2779 0..65535
    0     0 work kernel seg               2509  1606   926    2897 0..32767 :
                                                                   65475..65535
105e4     2 work process private           188     1    48     230 0..273 :
                                                                   65298..65535
285ea     f work shared library data        92     0    26      95 0..919
185e6     1 pers code,/dev/lvs001:301       81     0     -       - 0..149
6c59b     - pers /dev/lvs001:92402           6     0     -       - 0..9
744fd     - pers /dev/lvs001:763909          3     0     -       - 0..9
7c5ff     - pers /dev/lvs001:1327130         0     0     -       - 0..29

The w command

The w command shows the load average, which is computed from the current runqueue values. Its first line shows the same information that uptime reports.

$ w
22:42:14 up 2:34, 2 users,  load average: 0.00, 0.00, 0.00
USER     TTY        LOGIN@   IDLE   JCPU   PCPU WHAT
terry    :0        20:10   ?xdm?   5:24   1.49s gnome-session
terry    pts/1     22:22    0.00s 0.24s 0.04s /usr/sbin/sshd
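
For scripting, the three load averages can be pulled out of that header line. The sketch below assumes the line contains the text "load average:" or "Load averages:" followed by three comma-separated numbers, as in the w and top samples in this article; the function name load_averages is made up for illustration.

```shell
# load_averages: reads a w-, uptime- or top-style header line on stdin and
# prints the three load averages separated by single spaces.
# Lines that do not contain a load average are ignored.
load_averages() {
    sed -n 's/.*[Ll]oad averages*: *\([0-9.]*\), *\([0-9.]*\), *\([0-9.]*\).*/\1 \2 \3/p'
}

# Header lines copied from the samples in this article.
echo ' 22:42:14 up 2:34, 2 users,  load average: 0.00, 0.00, 0.00' | load_averages
echo 'Load averages: 0.04, 0.03, 0.03' | load_averages
```

On a live server the same function could be fed directly, for example with uptime | load_averages.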

Using SAR

The sar utility (System Activity Reporter) is quite popular on HP/UX and is becoming widely available for AIX and Solaris systems. sar has much of the same functionality as vmstat, but provides additional details.

There are four major flags in sar:

sar -u = to see CPU

sar -w = for swapping

sar -b = for buffer activity

sar -d = for disk usage

sar -w (memory switching and swapping activity)

swpin/s Number of process swap-ins per second

swpot/s Number of process swap-outs per second

bswin/s Number of 512-byte units swapped in per second

bswot/s Number of 512-byte units swapped out per second

pswch/s Number of process context switches per second

ROOT-/
>sar -w 5 5
HP-UX corp-hp1 B.11.00 U 9000/800 08/09/00
19:37:57 swpin/s bswin/s swpot/s bswot/s pswch/s
19:38:02    0.00     0.0    0.00     0.0     222
19:38:07    0.00     0.0    0.00     0.0     314
19:38:12    0.00     0.0    0.00     0.0     280
19:38:17    0.00     0.0    0.00     0.0     295
19:38:22    0.00     0.0    0.00     0.0     359
Average     0.00     0.0    0.00     0.0     294
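
When sar -w output is captured to a file, the interesting facts are usually whether any swapping occurred and how busy context switching was. The following is a rough sketch that assumes the HP-UX column order shown in the sample; the function name summarize_sar_w and the summary format are invented for this example.

```shell
# summarize_sar_w: reads "sar -w" output on stdin and prints one summary line.
# Assumed column order: time swpin/s bswin/s swpot/s bswot/s pswch/s
summarize_sar_w() {
    awk '
        # Data rows start with an HH:MM:SS timestamp and end with a number,
        # which skips the banner, the column header, and the Average line.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $NF ~ /^[0-9.]+$/ {
            n++
            pswch += $NF
            if ($2 > 0 || $4 > 0) swapped++
        }
        END {
            if (n == 0) exit 1
            printf "samples=%d avg_pswch=%.0f swapping_samples=%d\n", n, pswch / n, swapped
        }'
}

# Rows copied from the sample report.
summarize_sar_w <<'EOF'
19:37:57 swpin/s bswin/s swpot/s bswot/s pswch/s
19:38:02 0.00 0.0 0.00 0.0 222
19:38:07 0.00 0.0 0.00 0.0 314
19:38:12 0.00 0.0 0.00 0.0 280
19:38:17 0.00 0.0 0.00 0.0 295
19:38:22 0.00 0.0 0.00 0.0 359
Average 0.00 0.0 0.00 0.0 294
EOF
```

For these rows the mean of the five pswch/s samples agrees with the Average line that sar itself prints (294), and no sample shows swap-ins or swap-outs.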

sar -u (CPU report)

cpu cpu number (only on a multi-processor system with the -M option)

%usr user mode

%sys system mode

%wio idle with some process waiting for I/O

%idle otherwise idle

ROOT-/
>sar -u 2 5
HP-UX burleson B.11.00 U 9000/800 08/09/00
08:37:06 %usr %sys %wio %idle
08:37:07   43   57    0     0
08:37:08   45   55    0     0
08:37:09   44   56    0     0
08:37:10   44   56    0     0
08:37:11   43   57    0     0
08:37:12   52   48    0     0
08:37:13   49   51    0     0
08:37:14   49   51    0     0
08:37:15   57   43    0     0
08:37:16   65   35    0     0
08:37:17   40   29   12    19
08:37:18   23   20   12    44
08:37:19    0    1    0    99
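
A quick way to read a long sar -u report is to count how many samples left no idle time at all. The sketch below assumes that %idle is the last column, as in the sample; the function name count_saturated is made up for illustration.

```shell
# count_saturated: reads "sar -u" output on stdin and reports how many
# samples show 0% idle. Assumes %idle is the last column of each data row.
count_saturated() {
    awk '
        # Keep only rows that start with HH:MM:SS and end with a number;
        # the column header ends in "%idle" and is therefore skipped.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $NF ~ /^[0-9.]+$/ {
            n++
            if ($NF + 0 == 0) busy++
        }
        END { printf "%d of %d samples had 0%% idle\n", busy, n }'
}

# Four rows copied from the sample report.
count_saturated <<'EOF'
08:37:06 %usr %sys %wio %idle
08:37:15 57 43 0 0
08:37:16 65 35 0 0
08:37:17 40 29 12 19
08:37:18 23 20 12 44
EOF
```

A count like this only shows that the CPUs were busy. Whether the server is actually overloaded still depends on the run queue.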

sar -b (buffer activity report)

bread/s Number of physical reads per second from disk

bwrit/s Number of physical writes per second

lread/s Number of reads per second from buffer cache

lwrit/s Number of writes per second to buffer cache

%rcache Buffer cache hit ratio for read requests

%wcache Buffer cache hit ratio for write requests

pread/s Number of reads per second from character (raw) devices

pwrit/s Number of writes per second to character (raw) devices

root>sar -b 1 6
HP-UX corp-hp1 B.11.00 U 9000/800 08/09/00
19:44:53 bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
19:44:54       0      91     100       9      19      53       0       0
19:44:55       0       0       0       0       5     100       0       0
19:44:56       0       6     100       9       8       0       0       0
19:44:57       0      30     100       9      20      55       0       0
19:44:58       0       1     100       0       3     100       0       0
19:44:59       0       1     100       9       4       0       0       0
Average        0      22     100       6      10      39       0       0
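
The hit-ratio columns can be cross-checked against the physical and logical counts. The sketch below assumes the ratio is 100 * (1 - physical / logical), reported as 0 when there are no logical requests or when physical I/O exceeds logical I/O. That relationship is an assumption about how sar derives the figure, but it matches the %rcache and %wcache values in the sample rows. The function name cache_hit_pct is invented for this example.

```shell
# cache_hit_pct PHYSICAL LOGICAL   (hypothetical helper)
# Prints the buffer cache hit ratio as a whole percentage, assuming
#   ratio = 100 * (1 - PHYSICAL / LOGICAL)
# and 0 when LOGICAL is zero or PHYSICAL is not smaller than LOGICAL.
cache_hit_pct() {
    awk -v phys="$1" -v logical="$2" 'BEGIN {
        if (logical <= 0 || phys >= logical) { print 0; exit }
        printf "%.0f\n", 100 * (1 - phys / logical)
    }'
}

# bwrit/s and lwrit/s from the 19:44:54 row (reported %wcache: 53).
cache_hit_pct 9 19
# bread/s and lread/s from the same row (reported %rcache: 100).
cache_hit_pct 0 91
```

The same call works for reads (bread/s, lread/s) and writes (bwrit/s, lwrit/s).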

Using sadc

The sadc (system activity data collector) utility is part of the System Activity Report package, a popular package that can be used inside cron to schedule collections of server statistics.

All of the sadc scripts are located in the /usr/lbin/sa directory. They must be run as root and provide detailed server information. One of the most popular sadc scripts is sa1:

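#! /usr/bin/sh
# @(#) $Revision: 72.3 $
# sa1.sh
# Run sadc and write its output to the data file for the current day of the month.
DATE=`date +%d`
ENDIR=/usr/lbin/sa
DFILE=/var/adm/sa/sa$DATE
cd $ENDIR
if [ $# = 0 ]
then
        # No arguments: take one sample over a one-second interval.
        exec $ENDIR/sadc 1 1 $DFILE
else
        # Otherwise pass the caller's arguments through to sadc.
        exec $ENDIR/sadc $* $DFILE
fi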
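
Because sa1 names its data file after the day of the month, the file for any given day can be located with the same rule. This is a small sketch: the function name sa_file is made up, and the cron schedule in the comment is purely illustrative.

```shell
# sa_file [DAY]   (hypothetical helper)
# Prints the path of the daily data file that sa1 writes. With no argument,
# the current day of the month (date +%d) is used, as in sa1 itself.
sa_file() {
    day=${1:-$(date +%d)}
    echo "/var/adm/sa/sa$day"
}

sa_file 09    # prints /var/adm/sa/sa09

# Illustrative only: a root crontab entry such as
#   0,20,40 * * * * /usr/lbin/sa/sa1
# would collect a sample every 20 minutes, and the file for a day could then
# be read back with the -f option of sar, for example:
#   sar -u -f "$(sa_file 09)"
```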

Using glance to monitor Oracle CPU

For complete details, see my notes on monitoring Oracle with glance.

The glance utility is supplied on HP/UX systems and gives a graphical display of server performance. It displays current CPU, memory, disk and swap consumption, and also reports on the top processes.

Using the vmstat utility to monitor Oracle

The UNIX vmstat utility is especially useful for monitoring the performance of Oracle databases, and it is the most common UNIX monitoring utility. You'll find vmstat on virtually all dialects of UNIX, including Linux (on IRIX it is called osview). It quickly displays server values, which include:

r = runqueue - When this value exceeds the number of CPUs (lsdev -C | grep Proc | wc -l), the server is experiencing a CPU bottleneck.

pi = page-in - Any non-zero value can indicate that the server is short on memory and that memory is being paged to the swap disk. However, page-ins also occur when numerous programs are accessing their memory for the first time, so always check the scan rate ("sr") column. If both are non-zero, you are short on RAM.

sr = scan rate - If "sr" rises steadily, the paging daemon is busy allocating memory pages.

For AIX and HP/UX, vmstat provides the following CPU values. These values are expressed as percentages and sum to 100:

us = user CPU percentage

sy = system CPU percentage

id = idle CPU percentage

wa = wait CPU percentage

When us + sy approaches 100, the CPUs are busy but not necessarily overloaded. Only the run queue determines CPU overload, and only when "r" exceeds the number of CPUs on the server.

When "wa" exceeds 20, more than 20% of the processing time is spent waiting for a resource, usually I/O. It is common to see high wa values during backups and exports, but high wa values can also indicate an I/O bottleneck.

>vmstat 3

kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 84283   207   0   1   1  59  174   0 178   40 142 18  4 75  4
 0  0 84283   187   0   4   0   0    0   0 144  294  70  2  1 91  6
 0  0 84283   184   0   0   0   0    0   0 171  740  99  5  2 89  4
 0  0 84283   165   0   0   0   0    0   0 173  193  98  1  8 52 40
 0  0 84283   150   0   3   0   0    0   0 205  615 136  4  2 87  6
 0  0 84283   141   0   1   0   0    0   0 281  935 192  5  0 91  4
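
The rules of thumb for r, pi/sr and wa can be applied to captured vmstat output with a short filter. The sketch below assumes the 17-column AIX layout shown in the sample (r in column 1, pi in column 6, sr in column 9, wa in column 17); other dialects order their columns differently. The function name flag_vmstat and the message wording are invented for this example, and the CPU count is passed in by hand.

```shell
# flag_vmstat NCPUS   (hypothetical helper)
# Reads AIX-style vmstat output on stdin and prints a line for every sample
# that breaks one of the rules of thumb from the text.
flag_vmstat() {
    awk -v ncpus="$1" '
        # Data rows have 17 columns and a numeric first field; the header
        # row also has 17 columns but starts with the letter r.
        NF == 17 && $1 ~ /^[0-9]+$/ {
            n++
            note = ""
            if ($1 > ncpus)        note = note " cpu-bottleneck(r=" $1 ")"
            if ($6 > 0 && $9 > 0)  note = note " short-on-ram(pi=" $6 ",sr=" $9 ")"
            if ($17 > 20)          note = note " high-wait(wa=" $17 ")"
            if (note != "") print "sample " n ":" note
        }'
}

# Three rows copied from the sample output; a 4-CPU server is assumed.
flag_vmstat 4 <<'EOF'
 r b avm fre re pi po fr sr cy in sy cs us sy id wa
 0 0 84283 207 0 1 1 59 174 0 178 40 142 18 4 75 4
 0 0 84283 187 0 4 0 0 0 0 144 294 70 2 1 91 6
 0 0 84283 165 0 0 0 0 0 0 173 193 98 1 8 52 40
EOF
```

With these rows the first sample is flagged for paging (pi and sr both non-zero) and the third for I/O wait, while the second is not flagged because its page-ins are not accompanied by scanning.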

vmstat for Solaris

The display format for vmstat in Solaris is quite different from that of AIX and HP/UX. In Solaris the "vmstat -n" command is used to display server stats. The relevant columns are:

pi = page-ins

us = CPU user time

sy = CPU system time

id = CPU idle time

r = runqueue - If this exceeds the number of CPUs then you are CPU-bound

In the example below, we sample an overstressed Oracle server. Note that us + sy = 100, and that the r value far exceeds the 32 CPUs on this server:

root> vmstat -n 1
memory page faults
   avm    free    re   at   pi   po   fr   de   sr     in     sy    cs
 41128  118400  4424   92    0   11   90    0    0   1124  77234  4113
CPU
cpu procs
us sy id   r  b  w
49 51  0 100  2  0
46 54  0
49 51  0
42 58  0
 54719  115379  4508  105    0   10  102    0    0   1107  78021  3912
44 56  0  67  2  0
56 44  0
58 42  0
45 55  0
 54719  118479  4305  113    0   10  116    0    0   1070  75044  4085
41 59  0  67  2  0
56 44  0
50 50  0
 54719  125113  4088  124    0   10  124    0    0   1055  75103  4520
52 48  0  67  2  0
50 50  0
65 35  0
53 47  0
 54719  141189  3659  116    0    9  127    0    0   1065  71355  4882
60 40  0  67  2  0
60 40  0
61 39  0
 54719  178306  3113  104    0    9  309    0    0   1075  64446  4741
 4 15 81  67  2  0
 9 13 78
16  9 75
10  9 81

Unix Knowledge base

Unix & Linux ,AIX, HP-Unix, Solaries Memory and CPU Utilization

Using svmon on AIX

Using SAR

Using sadc

Using glance to monitor Oracle CPU

Using the vmstat utility to monitor Oracle

vmstat for Solaris

sangeethakumar
