Performance monitoring tools fall into two broad categories: high-level tools, which impose a negligible performance hit and provide a correspondingly general overview of system performance, and low-level tools, which provide detailed views of system performance at the component level but exact a significant and noticeable performance penalty.
There is room, and need, for both types of tools in the performance monitoring spectrum.
Overall System Status
To get some sense of the system's overall performance status, the most common command is uptime.
uptime displays a one-line report that shows how long the system has been up, how many users are logged in, and the system load averages during the last 1, 5, and 15 minutes.
The load average represents the average number of jobs in the run queue, that is, processes that are ready to run but waiting for CPU time.
The following shows uptime's output on a host:
$ uptime
  2:51pm  up 1 day, 25 min,  3 users,  load average: 2.03, 2.10, 2.06
As you can see, the load average is 2.03 over the last minute, 2.10
during the last five minutes, and 2.06 during the last quarter hour.
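If you want to watch the load as it changes, a small shell loop is often all you need. Here is a minimal sketch (the 10-second interval is arbitrary) that strips everything but the three load averages from uptime's report:
$ while true; do uptime | awk -F'load average: ' '{print $2}'; sleep 10; done
Press Ctrl-C to stop the loop.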
The sar (system activity reporter) command is more informative because it samples system activity a specified number of times at an interval defined in seconds.
The simplest invocation is:
sar secs count
This command instructs sar to take count samples that are secs seconds apart.
For example, the following sar invocation shows CPU usage sampled five times at five-second intervals:
$ sar -u 5 5
Linux 2.4.7-10 (localhost.localdomain) 10/25/2001
03:09:56 PM CPU %user %nice %system %idle
03:10:01 PM all 9.20 89.20 1.60 0.00
03:10:06 PM all 10.00 89.00 1.00 0.00
03:10:11 PM all 0.00 99.60 0.40 0.00
03:10:16 PM all 0.00 99.00 1.00 0.00
03:10:21 PM all 0.00 99.80 0.20 0.00
Average: all 3.84 95.32 0.84 0.00
The -u option tells sar
to report on CPU utilization.
%user indicates the percentage of time that the CPU spends executing user mode code (typically, application programs), %nice is the percentage of time the CPU spends executing user mode code with a nonzero nice priority, %system indicates the time spent executing kernel code (system calls), and %idle shows how much time the CPU is not doing anything.
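Because sar's output is line-oriented, it is easy to post-process with standard tools. The following pipeline is a quick sketch (the field positions assume the -u output format shown above) that prints only the samples in which %idle drops below 10 percent:
$ sar -u 5 5 | awk '$3 == "all" && $NF < 10'
The $3 == "all" test skips the header and Average: lines, so only per-sample rows are checked.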
Monitoring Running Processes
The ps
command is the weapon of choice for examining
running processes. In terms of performance monitoring, the most useful set of
command options to use is -el — that is, ps
-el. Why -el? The -e
option lists information on every running process
and the -l option
generates a long listing that displays information pertinent to performance
monitoring and tuning.
The following short listing illustrates the output of ps -el:
$ ps -el
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
100 S 0 858 1 0 68 0 - 1554 do_pol ? 00:00:00 gdm
000 S 500 1020 1 0 69 0 - 427 do_sel ? 00:00:00 esd
040 S 500 1244 1 0 69 0 - 990 do_sel ? 00:00:00 fetchmail
140 S 0 3914 858 0 69 0 - 1556 pipe_w ? 00:00:00 gdm
100 S 0 3915 3914 0 69 0 - 15264 do_sel ? 00:00:01 X
100 S 42 3920 3914 0 69 0 - 1762 do_pol ? 00:00:00 gdmlogin
000 S 42 3921 3920 0 69 0 - 2128 do_pol ? 00:00:00 xsri
040 S 0 5081 1 0 69 0 - 356 rt_sig ? 00:00:00 ppp-watch
100 S 0 5083 5081 0 68 0 - 478 do_sel ttyS4 00:00:00 pppd
040 S 0 6170 757 0 68 0 - 400 pipe_w ? 00:00:00 crond
040 S 500 6173 1 0 69 0 - 488 wait4 ? 00:00:00 seti.sh
000 R 500 6175 6173 48 75 1 - 4457 - ? 04:50:05 setiathom
100 S 0 6176 6170 0 69 0 - 1370 pipe_w ? 00:00:00 sendmail
040 S 500 6178 1 0 69 0 - 488 wait4 ? 00:00:00 seti.sh
ps -el OUTPUT FIELDS

Field  Description
F      The sum of one or more hexadecimal values describing the process's current status
S      The process's current state (running, sleeping, and so forth)
UID    The numeric user ID (UID) of the user who owns the process
PID    The process's process ID (PID)
PPID   The PID of the parent process
C      The CPU utilization of the process
PRI    The process's priority (in this output, higher numbers mean lower priority)
NI     The process's nice value (higher numbers mean lower priority)
SZ     How much virtual memory (swap space) the process requires
TTY    The terminal on which the process started (also known as the controlling terminal)
TIME   The total CPU time (in hours and minutes) the process has consumed
CMD    The command that initiated the process
The ps
command really serves two purposes in
performance monitoring and tuning.
First, it provides clues about what process (or processes) might be
causing the problem, which, in turn, suggests possible measures to alleviate
the problem.
Second, ps
enables you to see whether the action you took
solved the problem.
After taking steps to discipline a wayward process, use ps -el again
and evaluate the results.
If the corrective measure worked, you have at least temporarily
solved the problem. If not, try another tactic.
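For example, to surface the processes with the largest virtual memory footprints, sort the listing numerically on the SZ column. This is a rough sketch (SZ is field 10 in the output shown above, but column positions can vary between ps versions):
$ ps -el | sort -nrk 10 | head -5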
Monitoring Memory Utilization
Use the vmstat
command to examine the virtual memory
subsystem and isolate problems.
vmstat
reports a variety of statistics about the
virtual memory subsystem, CPU load, and disk and CPU activity.
Its general format is:
vmstat [secs [count]]
vmstat
takes a total of count samples every secs
seconds and displays its output to stdout.
$ vmstat 5 5
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
2 0 0 792 58440 18868 45008 0 0 3 7 195 360 42 1 56
2 0 0 792 58436 18884 45008 0 0 0 10 102 21 100 0 0
2 0 0 792 58436 18884 45008 0 0 0 1 102 17 99 1 0
2 0 0 792 58436 18884 45008 0 0 0 1 101 19 99 1 0
2 0 0 792 58164 18884 45008 0 0 0 1 102 17 99 1 0
At first glance, the information is hard to decipher. The first line of any vmstat report shows only summary information (averages since the system booted).
Subsequent lines show the information that you use to track down
memory problems.
vmstat OUTPUT FIELDS

Field   Description
procs
  r     The number of processes ready to run (in the run queue)
  b     The number of processes blocked while waiting for some resource
  w     The number of processes swapped out while waiting for some resource
memory
  swpd  The amount of swap space used, in KB
  free  The amount of unallocated memory, in KB
  buff  The amount of memory used as buffers, in KB
  cache The amount of memory used as cache, in KB
swap
  si    The amount of memory swapped in from disk, in KB/sec
  so    The amount of memory swapped out to disk, in KB/sec
io
  bi    The number of blocks read from a block device, in blocks/sec
  bo    The number of blocks written to a block device, in blocks/sec
system
  in    The number of device interrupts per second
  cs    The number of CPU process context switches per second
cpu
  us    The percentage of CPU time spent executing user code
  sy    The percentage of CPU time spent executing system/kernel code
  id    The percentage of CPU time spent idle
When swap usage (shown in the swpd column in kilobyte units) is high and climbing, the system is swapping heavily and is likely performing poorly.
Use ps
to identify the process or processes making
heavy use of swap. In some cases, you might be able to tune the application
itself to address the issue, but, more often than not, the permanent solution
is adding RAM to the system.
If the w column is nonzero and the so and
si columns
indicate continuous
swapping, start looking for processes consuming memory,
particularly those with large virtual memory requirements.
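One quick way to find such processes is to let ps itself sort by virtual memory size. This sketch relies on procps extensions (the -eo output format and the --sort option), so verify them against your ps man page:
$ ps -eo pid,vsz,comm --sort=-vsz | head -5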
Watching the r and
b columns
over a period of time enables you to get some sense of how fast processes are
moving through the queue.
Except for long-running processes (easily identified using ps), the values in the r and
b columns
should stay low.
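An easy way to watch them over a longer period is to let vmstat log to a file in the background (a minimal sketch; the interval, count, and file name are arbitrary):
$ vmstat 60 60 > /tmp/vmstat.log &
This collects one sample per minute for an hour, which you can review afterward.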
On systems that support it, the free
command shows memory utilization,
including swap usage and capacity.
For example, the following
command shows the output of the free
command on a moderately loaded system:
$ free
             total       used       free     shared    buffers     cached
Mem:        191136     133768      57368        580      19216      45028
-/+ buffers/cache:      69524     121612
Swap:       253508        792     252716
At a glance, free
gives you an informative snapshot of system
memory usage.
The first line following the header shows physical memory
information (all information is in kilobytes), including the total amount of
installed RAM, how much of it the kernel has allocated (but not necessarily
used), how much RAM is free, how much
RAM is configured as shared memory, and how much RAM is used for buffering and
cache.
The second line shows memory usage adjusted for buffering. That is,
free
subtracts the amount of memory used for buffering and caching from
the used column
and adds it to the free
column so that you get a more accurate picture
of actual RAM usage as opposed to the kernel’s allocation of RAM, which changes
according to system usage and the load average. Note, for example, that if you subtract
the buffers and
cached values
from the used value on the first line, you get 69,524 (133,768 – 19,216 –
45,028 = 69,524) and that adding buffers
and cached
to free
yields precisely 121,612 (57,368 + 19,216 +
45,028 = 121,612).
The third line, finally, shows the amounts of swap space available,
used, and free.
As you can see, swap usage on this system was fairly low at the
time the snapshot was taken.
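If you would rather not rerun free by hand, most versions accept a repeat interval via the -s option (an assumption worth checking against your local man page):
$ free -s 5
This reprints the report every five seconds until you interrupt it with Ctrl-C.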
If you prefer a graphical display, follow these steps in Red Hat Linux 6:
Step 1: Click on "Applications"
Step 2: Click on "System Tools"
Step 3: Click on "System Monitor"
Monitoring Disk Usage and Performance
Disk usage is straightforward to monitor using the df and
du commands.
df reports
the amount of disk space that is available, and du
reports the amount that is used.
df OPTIONS

Option  Description
-a      Includes empty file systems in the report
-h      Displays the report using human-readable units
-k      Uses blocks of 1K
-l      Limits the report to local file systems
-m      Uses blocks of 1MB (1024K)
Use df’s
-k option
to see disk usage in kilobytes — for example:
$ df -k
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/sda2 959396 117364 793296 13% /
/dev/sda1 21129 2997 17041 15% /boot
/dev/hdc1 4076976 1203725 2662282 32% /home
none 95568 0 95568 0% /dev/shm
/dev/sda3 3153592 1638608 1354788 55% /usr
To make the display easier to read, you can use the -h option:
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 937M 115M 774M 13% /
/dev/sda1 21M 3.0M 16M 15% /boot
/dev/hdc1 3.9G 1.2G 2.5G 32% /home
none 93M 0 93M 0% /dev/shm
/dev/sda3 3.0G 1.6G 1.2G 55% /usr
If you use the -l option, you list only the disk space that is
physically located on your own system, not on NFS mounts or other disk drives
that are accessed remotely.
The -h option makes it amply clear how much space is
available on each
file system.
All of the file systems are well under their maximum capacity.
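Because df's output is easy to parse, a one-line filter can warn you when a file system approaches capacity. The following sketch (the 90 percent threshold is arbitrary) prints any local file system whose Use% exceeds it:
$ df -kl | awk 'NR > 1 { sub(/%/, "", $5); if ($5 + 0 > 90) print $6 " is " $5 "% full" }'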
To see how much disk space is in use, use the du command.
Its basic syntax is:
du [options] file
file can be a
file system, a device, a directory, or a file.
Useful values for options
include:
-d — Limits the report to the specified file system
-h — Displays the report using human-readable units
-k — Prints the report using KB units
-m — Prints the report using MB units
-s — Reports only summary values
The default report, if no options are specified, shows usage
statistics for each directory at or beneath file.
For example, the following command shows du’s default output:
$ du /tmp
112 /tmp/.kde/tmp-localhost.localdomain
116 /tmp/.kde
4 /tmp/.X11-unix
4 /tmp/.font-unix
4 /tmp/.ICE-unix
12 /tmp/orbit-root
4 /tmp/.sawfish-root
12 /tmp/orbit-kwall
4 /tmp/.sawfish-kwall
4 /tmp/.esd
488 /tmp
The next command shows only a summary report for disk usage in /usr:
# du -s /usr
1605796 /usr
The next command shows the summary report but adds the -h option to print it in more familiar, human-readable notation; note that -h overrides the -m (megabyte units) option here, as the gigabyte figure in the output shows:
$ du -smh /usr
1.6G /usr
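du also combines well with sort when you need to find the directories consuming the most space. A simple sketch (the path and count are arbitrary):
$ du -k /usr | sort -nr | head -10
This lists the ten largest directories under /usr, largest first.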
The most frequently used command for drilling down on I/O
performance
problems is iostat.
Like vmstat, iostat’s
simplest invocation is:
iostat [seconds [count]]
seconds defines
the interval between reports, and count
specifies the number of reports.
For example, the following command generates the default iostat report:
$ iostat
Linux 2.4.7-10 (localhost.localdomain) 10/25/2001
avg-cpu: %user %nice %sys %idle
2.02 52.99 1.01 43.97
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
dev8-0 0.84 4.00 11.37 482868 1371784
dev22-0 0.48 0.41 1.68 49298 203210
The device listed under the Device
column shows the physical device partitions in
the format devm-n,
where m is the major number and n is the minor number.
In this case, dev8-0 maps to the first SCSI disk (/dev/sda),
and dev22-0 maps
to the second IDE disk (/dev/hdc).
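You can verify this mapping yourself by listing the device nodes; for block devices, ls -l prints the major and minor numbers where the file size normally appears:
$ ls -l /dev/sda /dev/hdc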
iostat OUTPUT FIELDS

Field       Description
tps         The number of transfers (I/O requests) per second sent to the device
Blk_read/s  The number of blocks read from the device per second
Blk_wrtn/s  The number of blocks written to the device per second
Blk_read    The total number of blocks read
Blk_wrtn    The total number of blocks written
|
If you want to dig deeper into disk I/O performance, use the option
-x with
iostat and
specify the disk or the partition in which you are interested.
-x reports extended
statistics on the I/O performance of the specified disks (or all disks if no disks
are specified).
The following command takes
a snapshot of the activity on /dev/sda during a kernel compile (the -d option
causes iostat not to print the CPU utilization summary):
$ iostat -d -x /dev/sda
Linux 2.4.7-10 (localhost.localdomain) 10/26/2001
Device:    rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
sda          0.27    0.79  0.31  0.59    4.64   11.07     17.45      0.29  322.27  48.97   0.44
iostat EXTENDED OUTPUT FIELDS

Field     Description
rrqm/s    The number of merged read requests per second
wrqm/s    The number of merged write requests per second
r/s       The number of read requests per second
w/s       The number of write requests per second
rsec/s    The number of sectors read per second
wsec/s    The number of sectors written per second
avgrq-sz  The average size (in sectors) of the requests
avgqu-sz  The average queue length of the requests
await     The average time (in milliseconds) I/O requests waited to be served
svctm     The average time (in milliseconds) I/O requests took to complete
%util     The percentage of CPU time during which I/O requests were issued to the device
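As with the basic invocation, you can append an interval and a count to track these extended statistics over time (a sketch; the interval, count, and log file name are arbitrary):
$ iostat -d -x /dev/sda 5 12 > /tmp/sda-io.log
This captures a minute's worth of five-second samples for later review.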