OS X Investigation and Troubleshooting - Part 2

Volume Number: 22 (2006)
Issue Number: 6
Column Tag: Mac In The Shell

The Secrets to OS X success

by Edward Marczak

Last month, we explored some basic ways to dig into OS X that help us learn the system and become better troubleshooters. This month, we'll take all of that a little further and investigate some more tools to use when trying to figure out, "What is OS X doing right now?!?"

What's up, top?

The first utility to talk about is top. top is a process monitor, much like ps, that refreshes its output and sorts by various criteria. Its goal is to show you the 'top' processes according to your sort. It may not pass for the be-all, end-all utility, but understanding the data it presents is crucial to understanding what is happening with your system.

You may recall from last month that I talked about the difference between tasks that run in the kernel and tasks that run in userspace. It's important to note that top mostly shows you what's going on in userspace. You do see system CPU utilization, and you will see process 0 - called kernel_task - which gives some sense of what's going on kernel-side. You won't see kernel_task in a standard ps listing. It comes into being during the boot sequence, called into existence as one of the kernel's first jobs (see kernel source xnu/bsd/kern/bsd_init.c - well, the PPC version, anyway).

Let's run top to see what we're going to be looking at. Go ahead: open Terminal.app and type top. By simply running with the default settings, you're looking at a list of processes, sorted by descending process id (pid), with associated statistics about each. Besides each individual process, you'll see a dashboard of statistics, similar to Figure 1.

Processes:  110 total, 3 running, 107 sleeping... 355 threads          07:39:18
Load Avg:  0.58, 0.71, 0.67     CPU usage:  17.9% user, 26.1% sys, 56.0% idle
SharedLibs: num =  203, resident = 48.0M code, 4.29M data, 11.2M LinkEdit
MemRegions: num = 29613, resident =  797M + 25.1M private,  386M shared
PhysMem:   170M wired, 1.21G active,  422M inactive, 1.79G used,  215M free
VM: 12.8G +  123M   106654(0) pageins, 2748(0) pageouts

Figure 1 - Display from top

The first line tells us how many processes the BSD system is currently responsible for, the number that are active, how many are idle, the total number of threads (remember, each 'process' is further broken down into threads of execution), and the current time. On the next line, we have the important statistic of load average.

What to say about load average? First, the dry technical aspect. You'll see the load average metric in several places: in top, from uptime, in the output from w, and more. As in Figure 1, you'll see three numbers: the 1-minute, 5-minute and 15-minute load averages. What determines the numbers at those points? The number of jobs in the run queue - the load on the system.

Now for the fun part: this may or may not be of any use. You'll find people who say it's the most important metric, and some who say it's of little use. Personally, I maintain that you have to know your system, or at least the system that you're using, and that load average is just another data point for your investigation. We could spend several pages on load average alone, but we're not going to do that. I will wrap up with the shortest explanation possible: load average is not solely CPU usage! It also encompasses disk I/O and network-bound processes. It is also not the kind of 'average' you'd expect; it is a time-based damped average. Why not just give us the number of jobs competing for CPU attention (the run queue) at those points? I don't know.

In any case, a load average of 0 means you have a completely idle system (not unheard of, but rare). 1 means your CPU is handling things fine. Less than 1 means that you have headroom to spare, and over 1 means that you could really use a more powerful single processor, or more processors (in the form of SMP), to handle the load. It is very situation dependent, and will mean different things depending on the use of the system: a machine acting solely as a database server - even under heavy use - will have a completely different load average pattern than a file server or a shell server. So, while a tad confusing, it's certainly not a useless metric. Watch it, and learn the patterns from your system(s).
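To make "time-based damped average" a little more concrete, here is a minimal Python sketch of the classic Unix-style calculation. It is an illustration only, not the kernel's code: the 5-second sampling interval and the function names are assumptions of mine, and the run-queue samples are made up.

```python
import math

SAMPLE_INTERVAL = 5.0            # seconds between samples (assumed, not from the kernel source)
WINDOWS = (60.0, 300.0, 900.0)   # the 1-, 5- and 15-minute windows

def update_load(loads, runnable, interval=SAMPLE_INTERVAL):
    """Fold one run-queue sample into the three damped averages."""
    new_loads = []
    for load, window in zip(loads, WINDOWS):
        decay = math.exp(-interval / window)   # how much of the old value survives
        new_loads.append(load * decay + runnable * (1.0 - decay))
    return tuple(new_loads)

def simulate(samples):
    """Start from an idle system and apply each run-queue sample in turn."""
    loads = (0.0, 0.0, 0.0)
    for runnable in samples:
        loads = update_load(loads, runnable)
    return loads

# Hypothetical workload: two runnable jobs for one minute, then one idle minute.
busy = simulate([2] * 12)
after_idle = simulate([2] * 12 + [0] * 12)
print("after the busy minute: %.2f %.2f %.2f" % busy)
print("after the idle minute: %.2f %.2f %.2f" % after_idle)
```

Notice that the 1-minute figure reacts quickly in both directions while the 15-minute figure barely moves, which is why comparing the three numbers tells you whether load is rising or falling.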

Next in the list is CPU usage. This also tends to be a little misunderstood. People tend to panic a bit when they see the CPU load going up, but it's OS X's job to make sure the CPU is getting used. No sense in having a CPU if you're not going to put it to work, right? The values displayed in top, like many other utilities, show CPU usage segmented into user processes, system use, and percent idle. You're likely to see these numbers jumping about as the CPU does its job. Even though you may run a basic userspace program like, say, iTunes, the kernel still has to do work keeping track of all the resources in use by the application, and the resources that it wants to allocate. Remember that these values are affected by everything the CPU needs to handle - running applications, processing interrupts (think video cards, network interfaces, etc.), moving memory around, and more. Once again, you need to learn the patterns of the system you're monitoring, so that usage that seems higher than you'd expect doesn't send you into a panic. In conjunction with the load average metric, you can get a good idea of whether processes are suffering or flourishing.

On the next line, top summarizes statistics about shared libraries. Basically, a shared library is a set of code that multiple programs use in common. For example, the SSL libraries contain routines that are useful to many other programs. We don't want those programs to each have to implement SSL routines themselves, nor do we want each of them to use up memory loading its own copy. So, they can all load the pre-compiled "libssl" and use its proven routines. To pull this off, multiple applications are able to share the code.

The "MemRegions" line lists the number and size of allocated memory regions. This is broken down into private (library and non-library) components and shared components.

"PhysMem" is just what you'd expect: the breakdown of physical memory allocation. "Wired" memory is active memory that can't be moved out of real RAM; it's 'wired down'. The 'wired', 'active' and 'inactive' portions add up to how much memory is 'used' (in Figure 1, 170M + 1.21G + 422M comes to roughly 1.79G). 'Used' plus 'free' equals the total RAM in your machine. Like CPU usage, these RAM statistics are often misinterpreted. Don't panic when you see low free RAM; that's just the way OS X works. About the only time you'll see high free RAM is just after booting up. As you use OS X and it starts to fill RAM for different purposes, it doesn't release RAM into the free pool immediately after a program is finished with it - rather, that memory becomes 'inactive'. OS X keeps this data on tap in case it needs it again. If it doesn't, and it really does need more real RAM for some task, the inactive memory is the first to be purged to make room. So don't panic when you see low free memory! OS X has a sophisticated and effective memory management scheme that shuffles pages of memory out to disk, wires them down, caches memory, and frees it as needed. Speaking of paging out to disk...
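As a quick sanity check of the PhysMem arithmetic, the following Python sketch plugs in the numbers from the top display in Figure 1. The helper name to_mb is my own, and the small mismatch you'll see comes from top rounding each value for display.

```python
def to_mb(text):
    """Convert a top-style size such as '170M' or '1.21G' to megabytes."""
    units = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}
    return float(text[:-1]) * units[text[-1]]

# Values copied from the PhysMem line in Figure 1.
wired, active, inactive = to_mb("170M"), to_mb("1.21G"), to_mb("422M")
used, free = to_mb("1.79G"), to_mb("215M")

computed_used = wired + active + inactive   # should land close to 'used'
total = used + free                         # should land close to installed RAM

print("wired + active + inactive = %.0f MB (top reports %.0f MB used)"
      % (computed_used, used))
print("used + free = %.0f MB" % total)
```

On the sample machine the total works out to about 2 GB, which is presumably the amount of RAM installed.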

The final line displays statistics about virtual memory. Now, the "VM" statistic does not refer to virtual memory in the way you may remember from OS 9 - simple swapping to disk. The first statistic on that line represents the entire virtual address space being used currently. You can match this number by adding up everything in the 'VSIZE' column. I'll get to VSIZE a little later, but know this: it's a fairly useless statistic under OS X because OS X always gives apps a generous virtual address space to work in. In short, though, it gives you a good idea of the total address space in use, or, about how much RAM you'd really need if OS X had no virtual address space.

Finally, you'll see "pagein" and "pageout" statistics. A pagein happens when a page is copied from 'swap' (or, the backing store) into main memory. A pageout happens when memory is written to the backing store. Unlike older methods, OS X pages, rather than swaps. In earlier systems, a program was either fully in main memory, or it was swapped out entirely. OS X, on the other hand, can take pages - 4 KB blocks - of RAM and get them out of the way, or pull them back in as needed. A pager is responsible for moving pages in and out of RAM. A page-fault occurs when the system looks for something that should be in core memory, but doesn't find it. A page-fault then causes the pager to read the appropriate page(s) from the backing store into core memory. What does top have to say about all of this?

top simply displays the current number of pageins and pageouts requested by a pager. These counts are shown as the total number, followed by the recent count in parentheses. The recent count is the number of page-ins or page-outs in the last 1 second, for the respective counter. These are the important values to watch! Normally these are zero - especially for page-outs. If you're watching top, and the number of recent page-outs stays above zero, your system is short on real RAM. The count of page-outs will rise occasionally. But if you're witnessing a full-on page-out-fest over a long period of time, your system is thrashing - the system spends more time paging in and out than actually accomplishing any real work. If you see your page-outs keep creeping up, stuff some more real RAM in that machine!
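If you'd rather let a script do the watching, here is a rough Python sketch that pulls the total and recent counts out of a VM line shaped like the one in Figure 1. The function names are hypothetical, the pattern assumes that exact layout, and the "five samples in a row" threshold is an arbitrary choice of mine, not something top defines.

```python
import re

# Matches text such as "106654(0) pageins, 2748(0) pageouts".
PAGING_RE = re.compile(r"(\d+)\((\d+)\)\s+pageins,\s+(\d+)\((\d+)\)\s+pageouts")

def parse_paging(vm_line):
    """Return total and recent page-in/page-out counts from top's VM line."""
    match = PAGING_RE.search(vm_line)
    if match is None:
        raise ValueError("not a VM line: %r" % vm_line)
    total_in, recent_in, total_out, recent_out = map(int, match.groups())
    return {
        "pageins_total": total_in,
        "pageins_recent": recent_in,
        "pageouts_total": total_out,
        "pageouts_recent": recent_out,
    }

def sustained_pageouts(recent_counts, min_consecutive=5):
    """True if the last `min_consecutive` samples all showed page-outs."""
    if len(recent_counts) < min_consecutive:
        return False
    return all(count > 0 for count in recent_counts[-min_consecutive:])

sample = "VM: 12.8G +  123M   106654(0) pageins, 2748(0) pageouts"
print(parse_paging(sample))
```

An occasional non-zero recent count is normal; it is the unbroken run of them that suggests the machine needs more RAM.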

And the rest

That's already a lot to take in, but certainly not all that top is displaying. Let's look at a clip of the lower half of top's default display:

   PID   COMMAND      %CPU      TIME    #TH   #PRTS   #MREGS   RPRVT   RSHRD    RSIZE   VSIZE
 18098   top          23.6%    0:18.73    1     18       23    1.36M    412K    1.80M   26.9M 
   325   X Resource    5.0%   71:36.17    4    119      202    4.11M   12.2M    7.10M    177M
    72   WindowServ    4.1%   53:37.21    2    559    11228    16.8M    148M     155M    408M 
   321   Terminal      3.3%   12:54.51   10    223      258    9.67M   21.6M    19.1M    220M 
   800   Microsoft     1.7%   32:35.14    4    108      500    32.2M   50.4M    53.8M    292M

This is a sample, sorted by %CPU. I started top with the "-ocpu" switch to achieve this, which I highly recommend. Doing so makes top's process area dynamic, always sorting the tasks using the most CPU to the top. Let's take a quick run through the columns.

    PID - The BSD process ID

    Command - The name of the program or application bundle.

    %CPU - The percentage of CPU cycles used during top's refresh interval for this process. This includes both kernel and user space.

    Time - CPU time used by this process since launch, in minutes:seconds.hundredths format.

    #TH - Number of threads in use by this process.

    #PRTS - Number of Mach ports used by the process.

    #MREGS - The number of memory regions this process has allocated.

    RPRVT - The amount of resident private memory. Probably the best of these statistics to determine how much real memory a program is using.

    RSHRD - The amount of resident shared memory used.

    RSIZE - The total resident memory used by the process.

    VSIZE - The total address space allocated to the program.
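If you want to chew on these columns in a script, something like the Python sketch below will split one process line into named fields. It assumes exactly the eleven-column layout shown in the sample above (other versions of top may print different columns), and the function and key names are my own invention. Because command names such as "X Resource" can contain spaces, it counts the nine numeric columns from the right rather than splitting naively.

```python
SIZE_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    """Convert a size such as '412K' or '1.36M' to a whole number of bytes."""
    return int(round(float(text[:-1]) * SIZE_UNITS[text[-1]]))

def parse_row(line):
    """Split one process line from top into a dictionary of named fields."""
    parts = line.split()
    # The last nine fields are fixed; whatever sits between the PID and
    # those fields is the (possibly multi-word) command name.
    cpu, cpu_time, threads, ports, regions, rprvt, rshrd, rsize, vsize = parts[-9:]
    return {
        "pid": int(parts[0]),
        "command": " ".join(parts[1:-9]),
        "cpu_percent": float(cpu.rstrip("%")),
        "time": cpu_time,
        "threads": int(threads),
        "ports": int(ports),
        "regions": int(regions),
        "rprvt": parse_size(rprvt),
        "rshrd": parse_size(rshrd),
        "rsize": parse_size(rsize),
        "vsize": parse_size(vsize),
    }

line = "   325   X Resource    5.0%   71:36.17    4    119      202    4.11M   12.2M    7.10M    177M"
row = parse_row(line)
print(row["command"], row["cpu_percent"], row["rprvt"])
```

From here it's a short step to sorting by rprvt or totalling it up across processes.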

There are some switches that alter the number of columns and amount of information displayed. The ones I tend to use the most are the "-o" (order by) switch and the "-l" switch. "-o" lets you order top's output by any of the keys you'd expect: command, cpu, pid, prt, reg, rprvt, rshrd, th, time, uid, and username. You can make the order ascending or descending by prefixing the key with a "+" or "-", like this: top -o+pid.

The "-l" switch turns on logging mode, which makes top non-interactive - it just dumps its output raw. You tell top how many samples it should output, or use "0" to have it run until you interrupt it. I like top -l 0 -ocpu -n 15. That's a nice one to leave running as you put a machine to sleep - you'll then have everything that drives the machine nuts as it wakes up 'logged'.
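To actually keep that output around, you can redirect it to a file. Here is one way to wrap it in Python, as a sketch only: the function names and the log path are hypothetical, and the switches are exactly the ones from the command above, spelled the same way.

```python
import subprocess

def top_log_command(samples=0, sort_key="cpu", max_procs=15):
    """Build the argument list for top in logging mode.

    samples=0 means "keep going until interrupted", as described above.
    """
    return ["top", "-l", str(samples), "-o" + sort_key, "-n", str(max_procs)]

def log_top(path, samples=10, sort_key="cpu", max_procs=15):
    """Run top in logging mode and write its raw output to `path`."""
    with open(path, "w") as out:
        subprocess.run(top_log_command(samples, sort_key, max_procs),
                       stdout=out, check=True)

# Show the command that would be run; nothing is executed here.
print(" ".join(top_log_command()))
```

Calling log_top("/tmp/top.log") (a made-up path) would capture ten samples and return; pass samples=0 to leave it running until you interrupt it.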

We're Mac Users!

top is nice. It's the first utility I tend to reach for when I want to get a glimpse of system activity. Of course, I always keep a shell open, too. Also, I'm often logged into a remote system via ssh, where non-GUI tools are the only option. However, in more common scenarios, Apple provides some tools that match and exceed top's abilities, and I'd be remiss if I didn't mention them.

In your Utilities folder, you'll find Activity Monitor.app. Most people have run Activity Monitor and gotten a small taste of it. You'll see that it's basically a graphical equivalent of top. Dig deeper, though, and you'll find a bit more. Double-click on a process and you get more detailed information. Memory, of course, and some useful statistics. Figure 1 displays some of my favorite features.



Figure 1 - Activity Monitor showing some detail

Simply having the application's open files and ports listed is great on its own. How many times have you asked yourself, "what file is that application changing?" Granted, Activity Monitor may miss displaying files that an application opens and closes very quickly. For that, you'll need heavier-duty tools - and next month's article. Going over and above this, though, we have the application's parent process listed, and it's a hyperlink! Go on, click on it. Further, and probably the most important part of Activity Monitor, is that "Sample" button in the lower left-hand corner. There are times when an application is doing something out of the ordinary - running "slow," not updating its display, giving you a spinning-pizza-of-death - and you need to find out why. Or, perhaps you just want to find out what an application is doing. Go hit the Sample button. After being told to "Please wait while the sample is taken", you'll see something similar to Figure 2.



Figure 2 - The sample has been taken.

I scrolled this output to the bottom to show the stats, but if you start at the top, you will see a list of every call the application makes. Immensely useful to figure out what SPOD apps are doing. Are they coming back? Are they just stuck doing heavy processing? Inquiring minds, and upset users, want to know. Note that, just like in this sample, you're always going to see mach_msg_trap as one of the most oft-called routines. This call represents the app being blocked, or, waiting on another event. Super powerful computers, and they spend most of their time waiting...

One thing to remember when using Activity Monitor: you only have control over processes that you own. Want to rule the world? Launch it as root.

What's News?

One last GUI app before closing this month. I pointed out that Activity Monitor will let you sample an application, which is really handy when you get the spinning-pizza-of-death. Sometimes, an application never returns from a SPOD, and you end up force quitting it. That gives you plenty of time to sample first. Other apps, though, will SPOD for a bit, and then come back to you. The SPOD is annoying, but not long enough to get Activity Monitor running, find the app in question, double-click and sample. Enter Spin Control.

Spin Control is part of the Developer Tools install (which, frankly, just about every machine should have). You'll find it in /Developer/Applications/Performance Tools. Spin Control automatically samples any application that the Window Server deems unresponsive. By default, any application running under Window Server will be sampled if it is hung for at least 5 seconds. Also, in an effort to not make things worse, Spin Control will only sample 4 processes at a time. The preferences are adjustable for the length of hung-time-till-sampling, and can be set to watch a single application.

If you leave Spin Control running for a while, you'll catch all sorts of apps with small hangs, as you'll see in figure 3.



Figure 3 - Spin Control in action

(If you're like me, you'll see Mail.app in that list more than most.) Once an application has been sampled, it gets a new line item in the main window. Selecting the application enables the action buttons along the bottom. "Show text report" gives us a report just like Activity Monitor's sample function does. More interesting is the sample browser, which you get by double-clicking or choosing "Open...". Figure 4 gives you a glimpse of Word hanging for a bit.



Figure 4 - The sample browser examining a hang from Word

While you may or may not be able to determine exactly what is making a particular app hang, a sample dump is useful for sending in with bug reports.

Momentary Rest

Again, we covered a lot of ground. Understanding system statistics will go a long way to learning what's happening at the moment, with any one particular app, and over a period of time. Next month, we'll be delving into more ways to sample apps and figure out what they're up to. A little more intimately this time.

Also, it's less than two months until WWDC! Hope to see everyone in San Francisco for the event. Fellow writers, MacTech readers, and Mac people from all over - please drop me a line if you'd like to meet up.

References

Just about everything at http://developer.apple.com

Top source code from http://www.fysh.org/~chris/top/top-3.5.tar.gz and developer.apple.com

Kernel Programming Guide:

http://developer.apple.com/documentation/Darwin/Conceptual/KernelProgramming/index.html


Ed Marczak owns and operates Radiotope, a technology consulting company. Radiotope helps separate technology issues from policy issues, cool-tech from needed-tech. Guide your decision at http://www.radiotope.com

 

All contents are Copyright 1984-2011 by Xplain Corporation. All rights reserved. Theme designed by Icreon.