Performance monitoring tools fall into two broad categories: high-level tools, which impose a negligible performance hit and provide a correspondingly general overview of system performance, and low-level tools, which provide detailed views of system performance at the component level but exact a significant and noticeable performance penalty.
There is room, and need, for both types of tools in the performance monitoring spectrum.
Overall System Status
To get some sense of the system's overall performance status, the most common command is uptime.
uptime displays a one-line report that shows how long the system has been up, how many users are logged in, and the system load averages during the last 1, 5, and 15 minutes.
The load average represents the average number of jobs in the run queue, that is, processes that are ready to run but waiting for CPU time.
The following shows uptime's output on a host:
$ uptime
  2:51pm  up 1 day, 25 min,  3 users,  load average: 2.03, 2.10, 2.06
As you can see, the load average is 2.03 over the last minute, 2.10
during the last five minutes, and 2.06 during the last quarter hour.
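If you want to watch the load as it changes, a small shell loop is often all you need. Here is a minimal sketch (the 10-second interval is arbitrary) that strips everything but the three load averages from uptime's report:
$ while true; do uptime | awk -F'load average: ' '{print $2}'; sleep 10; done
Press Ctrl-C to stop the loop.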
The sar (system activity reporter) command is more informative because it samples system activity a specified number of times at an interval defined in seconds.
The simplest invocation is:
sar secs count
This command instructs sar to take count samples that are secs seconds apart.
For example, the following sar invocation shows CPU usage sampled five times at five-second intervals:
$ sar -u 5 5
Linux 2.4.7-10 (localhost.localdomain) 10/25/2001
03:09:56 PM CPU %user %nice %system %idle
03:10:01 PM all 9.20 89.20 1.60 0.00
03:10:06 PM all 10.00 89.00 1.00 0.00
03:10:11 PM all 0.00 99.60 0.40 0.00
03:10:16 PM all 0.00 99.00 1.00 0.00
03:10:21 PM all 0.00 99.80 0.20 0.00
Average: all 3.84 95.32 0.84 0.00
The -u option tells sar
to report on CPU utilization.
%user indicates the percentage of time that the CPU spends executing user mode code (typically, application programs), %nice is the percentage of time the CPU spends executing user mode code with a nonzero nice priority, %system indicates the time spent executing kernel code (system calls), and %idle shows how much time the CPU is not doing anything.
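Because sar's output is line-oriented, it is easy to post-process with standard tools. The following pipeline is a quick sketch (the field positions assume the -u output format shown above) that prints only the samples in which %idle drops below 10 percent:
$ sar -u 5 5 | awk '$3 == "all" && $NF < 10'
The $3 == "all" test skips the header and Average: lines, so only per-sample rows are checked.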
Monitoring Running Processes
The ps
command is the weapon of choice for examining
running processes. In terms of performance monitoring, the most useful set of
command options to use is -el — that is, ps
-el. Why -el? The -e
option lists information on every running process
and the -l option
generates a long listing that displays information pertinent to performance
monitoring and tuning.
The following short listing illustrates the output of ps -el:
$ ps -el
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
100 S 0 858 1 0 68 0 - 1554 do_pol ? 00:00:00 gdm
000 S 500 1020 1 0 69 0 - 427 do_sel ? 00:00:00 esd
040 S 500 1244 1 0 69 0 - 990 do_sel ? 00:00:00 fetchmail
140 S 0 3914 858 0 69 0 - 1556 pipe_w ? 00:00:00 gdm
100 S 0 3915 3914 0 69 0 - 15264 do_sel ? 00:00:01 X
100 S 42 3920 3914 0 69 0 - 1762 do_pol ? 00:00:00 gdmlogin
000 S 42 3921 3920 0 69 0 - 2128 do_pol ? 00:00:00 xsri
040 S 0 5081 1 0 69 0 - 356 rt_sig ? 00:00:00 ppp-watch
100 S 0 5083 5081 0 68 0 - 478 do_sel ttyS4 00:00:00 pppd
040 S 0 6170 757 0 68 0 - 400 pipe_w ? 00:00:00 crond
040 S 500 6173 1 0 69 0 - 488 wait4 ? 00:00:00 seti.sh
000 R 500 6175 6173 48 75 1 - 4457 - ? 04:50:05 setiathom
100 S 0 6176 6170 0 69 0 - 1370 pipe_w ? 00:00:00 sendmail
040 S 500 6178 1 0 69 0 - 488 wait4 ? 00:00:00 seti.sh
ps -el OUTPUT FIELDS

Field  Description
F      The sum of one or more hexadecimal values describing the process's current status
S      The process's current state (running, sleeping, and so forth)
UID    The numeric user ID (UID) of the user who owns the process
PID    The process's process ID (PID)
PPID   The PID of the parent process
C      The CPU utilization of the process
PRI    The process's priority (in this output, higher numbers mean lower priority)
NI     The process's nice value (higher numbers mean lower priority)
SZ     How much virtual memory (swap space) the process requires
TTY    The terminal on which the process started (also known as the controlling terminal)
TIME   The total CPU time (in hours and minutes) the process has consumed
CMD    The command that initiated the process
The ps
command really serves two purposes in
performance monitoring and tuning.
First, it provides clues about what process (or processes) might be
causing the problem, which, in turn, suggests possible measures to alleviate
the problem.
Second, ps
enables you to see whether the action you took
solved the problem.
After taking steps to discipline a wayward process, use ps -el again
and evaluate the results.
If the corrective measure worked, you have at least temporarily
solved the problem. If not, try another tactic.
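For example, to surface the processes with the largest virtual memory footprints, sort the listing numerically on the SZ column. This is a rough sketch (SZ is field 10 in the output shown above, but column positions can vary between ps versions):
$ ps -el | sort -nrk 10 | head -5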
Monitoring Memory Utilization
Use the vmstat
command to examine the virtual memory
subsystem and isolate problems.
vmstat
reports a variety of statistics about the
virtual memory subsystem, CPU load, and disk and CPU activity.
Its general format is:
vmstat [secs [count]]
vmstat
takes a total of count samples every secs
seconds and displays its output to stdout.
$ vmstat 5 5
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
2 0 0 792 58440 18868 45008 0 0 3 7 195 360 42 1 56
2 0 0 792 58436 18884 45008 0 0 0 10 102 21 100 0 0
2 0 0 792 58436 18884 45008 0 0 0 1 102 17 99 1 0
2 0 0 792 58436 18884 45008 0 0 0 1 101 19 99 1 0
2 0 0 792 58164 18884 45008 0 0 0 1 102 17 99 1 0
At first glance, the information is hard to decipher. The first line of any vmstat report shows only summary information (averages since the system booted).
Subsequent lines show the information that you use to track down
memory problems.
vmstat OUTPUT FIELDS

Field   Description
procs
  r     The number of processes ready to run (in the run queue)
  b     The number of processes blocked while waiting for some resource
  w     The number of processes swapped out while waiting for some resource
memory
  swpd  The amount of swap space used, in KB
  free  The amount of unallocated memory, in KB
  buff  The amount of memory used as buffers, in KB
  cache The amount of memory used as cache, in KB
swap
  si    The amount of memory swapped in from disk, in KB/sec
  so    The amount of memory swapped out to disk, in KB/sec
io
  bi    The number of blocks read from a block device, in blocks/sec
  bo    The number of blocks written to a block device, in blocks/sec
system
  in    The number of device interrupts per second
  cs    The number of CPU process context switches per second
cpu
  us    The percentage of CPU time spent executing user code
  sy    The percentage of CPU time spent executing system/kernel code
  id    The percentage of CPU time spent idle
When swap usage (shown in the swpd column in kilobyte units) is high and climbing, the system is swapping heavily and is likely performing poorly.
Use ps
to identify the process or processes making
heavy use of swap. In some cases, you might be able to tune the application
itself to address the issue, but, more often than not, the permanent solution
is adding RAM to the system.
If the w column is nonzero and the so and
si columns
indicate continuous
swapping, start looking for processes consuming memory,
particularly those with large virtual memory requirements.
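One quick way to find such processes is to let ps itself sort by virtual memory size. This sketch relies on procps extensions (the -eo output format and the --sort option), so verify them against your ps man page:
$ ps -eo pid,vsz,comm --sort=-vsz | head -5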
Watching the r and
b columns
over a period of time enables you to get some sense of how fast processes are
moving through the queue.
Except for long-running processes (easily identified using ps), the values in the r and
b columns
should stay low.
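An easy way to watch them over a longer period is to let vmstat log to a file in the background (a minimal sketch; the interval, count, and file name are arbitrary):
$ vmstat 60 60 > /tmp/vmstat.log &
This collects one sample per minute for an hour, which you can review afterward.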
On systems that support it, the free
command shows memory utilization,
including swap usage and capacity.
For example, the following
command shows the output of the free
command on a moderately loaded system:
$ free
             total       used       free     shared    buffers     cached
Mem:        191136     133768      57368        580      19216      45028
-/+ buffers/cache:      69524     121612
Swap:       253508        792     252716
At a glance, free
gives you an informative snapshot of system
memory usage.
The first line following the header shows physical memory
information (all information is in kilobytes), including the total amount of
installed RAM, how much of it the kernel has allocated (but not necessarily
used), how much RAM is free, how much
RAM is configured as shared memory, and how much RAM is used for buffering and
cache.
The second line shows memory usage adjusted for buffering. That is,
free
subtracts the amount of memory used for buffering and caching from
the used column
and adds it to the free
column so that you get a more accurate picture
of actual RAM usage as opposed to the kernel’s allocation of RAM, which changes
according to system usage and the load average. Note, for example, that if you subtract
the buffers and
cached values
from the used value on the first line, you get 69,524 (133,768 – 19,216 –
45,028 = 69,524) and that adding buffers
and cached
to free
yields precisely 121,612 (57,368 + 19,216 +
45,028 = 121,612).
The third line, finally, shows the amounts of swap space available,
used, and free.
As you can see, swap usage on this system was fairly low at the
time the snapshot was taken.
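If you would rather not rerun free by hand, most versions accept a repeat interval via the -s option (an assumption worth checking against your local man page):
$ free -s 5
This reprints the report every five seconds until you interrupt it with Ctrl-C.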
If you prefer a graphical display, follow these steps in Red Hat Linux 6:
Step 1: Click on "Applications"
Step 2: Click on "System Tools"
Step 3: Click on "System Monitor"
Monitoring Disk Usage and Performance
Disk usage is straightforward to monitor using the df and
du commands.
df reports
the amount of disk space that is available, and du
reports the amount that is used.
df OPTIONS

Option  Description
-a      Includes empty file systems in the report
-h      Displays the report using human-readable units
-k      Uses blocks of 1K
-l      Limits the report to local file systems
-m      Uses blocks of 1MB (1024K)
Use df’s
-k option
to see disk usage in kilobytes — for example:
$ df -k
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/sda2 959396 117364 793296 13% /
/dev/sda1 21129 2997 17041 15% /boot
/dev/hdc1 4076976 1203725 2662282 32% /home
none 95568 0 95568 0% /dev/shm
/dev/sda3 3153592 1638608 1354788 55% /usr
To make the display easier to read, you can use the -h option:
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 937M 115M 774M 13% /
/dev/sda1 21M 3.0M 16M 15% /boot
/dev/hdc1 3.9G 1.2G 2.5G 32% /home
none 93M 0 93M 0% /dev/shm
/dev/sda3 3.0G 1.6G 1.2G 55% /usr
If you use the -l option, you list only the disk space that is
physically located on your own system, not on NFS mounts or other disk drives
that are accessed remotely.
The -h option makes it amply clear how much space is
available on each
file system.
All of the file systems are well under their maximum capacity.
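Because df's output is easy to parse, a one-line filter can warn you when a file system approaches capacity. The following sketch (the 90 percent threshold is arbitrary) prints any local file system whose Use% exceeds it:
$ df -kl | awk 'NR > 1 { sub(/%/, "", $5); if ($5 + 0 > 90) print $6 " is " $5 "% full" }'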
To see how much disk space is in use, use the du command.
Its basic syntax is:
du [options] file
file can be a
file system, a device, a directory, or a file.
Useful values for options
include:
-d — Limits the report to the specified file system
-h — Displays the report using human-readable units
-k — Prints the report using KB units
-m — Prints the report using MB units
-s — Reports only summary values
The default report, if no options are specified, shows usage
statistics for each directory at or beneath file.
For example, the following command shows du’s default output:
$ du /tmp
112 /tmp/.kde/tmp-localhost.localdomain
116 /tmp/.kde
4 /tmp/.X11-unix
4 /tmp/.font-unix
4 /tmp/.ICE-unix
12 /tmp/orbit-root
4 /tmp/.sawfish-root
12 /tmp/orbit-kwall
4 /tmp/.sawfish-kwall
4 /tmp/.esd
488 /tmp
The next command shows only a summary report for disk usage in /usr:
# du -s /usr
1605796 /usr
The next command shows the summary report but adds the -h option to print it in more familiar, human-readable notation; note that -h overrides the -m (megabyte units) option here, as the gigabyte figure in the output shows:
$ du -smh /usr
1.6G /usr
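du also combines well with sort when you need to find the directories consuming the most space. A simple sketch (the path and count are arbitrary):
$ du -k /usr | sort -nr | head -10
This lists the ten largest directories under /usr, largest first.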
The most frequently used command for drilling down on I/O
performance
problems is iostat.
Like vmstat, iostat’s
simplest invocation is:
iostat [seconds [count]]
seconds defines
the interval between reports, and count
specifies the number of reports.
For example, the following command generates the default iostat report:
$ iostat
Linux 2.4.7-10 (localhost.localdomain) 10/25/2001
avg-cpu: %user %nice %sys %idle
2.02 52.99 1.01 43.97
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
dev8-0 0.84 4.00 11.37 482868 1371784
dev22-0 0.48 0.41 1.68 49298 203210
The device listed under the Device
column shows the physical device partitions in
the format devm-n,
where m is the major number and n is the minor number.
In this case, dev8-0 maps to the first SCSI disk (/dev/sda),
and dev22-0 maps
to the second IDE disk (/dev/hdc).
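You can verify this mapping yourself by listing the device nodes; for block devices, ls -l prints the major and minor numbers where the file size normally appears:
$ ls -l /dev/sda /dev/hdc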
iostat OUTPUT FIELDS

Field       Description
tps         The number of transfers (I/O requests) per second sent to the device
Blk_read/s  The number of blocks read from the device per second
Blk_wrtn/s  The number of blocks written to the device per second
Blk_read    The total number of blocks read
Blk_wrtn    The total number of blocks written
|
If you want to dig deeper into disk I/O performance, use the option
-x with
iostat and
specify the disk or the partition in which you are interested.
-x reports extended
statistics on the I/O performance of the specified disks (or all disks if no disks
are specified).
The following command takes
a snapshot of the activity on /dev/sda during a kernel compile (the -d option
causes iostat not to print the CPU utilization summary):
$ iostat -d -x /dev/sda
Linux 2.4.7-10 (localhost.localdomain) 10/26/2001
Device:    rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
sda          0.27    0.79  0.31  0.59    4.64   11.07     17.45      0.29  322.27  48.97   0.44
iostat EXTENDED OUTPUT FIELDS

Field     Description
rrqm/s    The number of merged read requests per second
wrqm/s    The number of merged write requests per second
r/s       The number of read requests per second
w/s       The number of write requests per second
rsec/s    The number of sectors read per second
wsec/s    The number of sectors written per second
avgrq-sz  The average size (in sectors) of the requests
avgqu-sz  The average queue length of the requests
await     The average time (in milliseconds) I/O requests waited to be served
svctm     The average time (in milliseconds) I/O requests took to complete
%util     The percentage of CPU time during which I/O requests were issued to the device
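As with the basic invocation, you can append an interval and a count to track these extended statistics over time (a sketch; the interval, count, and log file name are arbitrary):
$ iostat -d -x /dev/sda 5 12 > /tmp/sda-io.log
This captures a minute's worth of five-second samples for later review.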