The following are sample outputs of the pfilestat tool for various scenarios.

Starting with something simple,

Running: dd if=/dev/rdsk/c0d0s0 of=/dev/null bs=56k      # x86, 32-bit

# ./pfilestat `pgrep -x dd`

     STATE   FDNUM      Time Filename
      read       3        2% /devices/pci@0,0/pci-ide@1f,1/ide@0/cmdk@0,0
     write       4        3% /devices/pseudo/mm@0:null
   waitcpu       0        7%
   running       0       16%
   sleep-r       0       69%

     STATE   FDNUM      KB/s Filename
     write       4     53479 /devices/pseudo/mm@0:null
      read       3     53479 /devices/pci@0,0/pci-ide@1f,1/ide@0/cmdk@0,0

Total event time (ms): 4999   Total Mbytes/sec: 104

Most of the time we are sleeping on read, which is to be expected, as dd on
the raw device is simple -> read:entry, strategy, biodone, read:return.
CPU time in read() itself is small.

Now for the dsk device,

Running: dd if=/dev/dsk/c0d0s0 of=/dev/null bs=56k       # x86, 32-bit

# ./pfilestat `pgrep -x dd`

     STATE   FDNUM      Time Filename
     write       4        5% /devices/pseudo/mm@0:null
   waitcpu       0        8%
   running       0       15%
   sleep-r       0       18%
      read       3       53% /devices/pci@0,0/pci-ide@1f,1/ide@0/cmdk@0,0

     STATE   FDNUM      KB/s Filename
      read       3     53492 /devices/pci@0,0/pci-ide@1f,1/ide@0/cmdk@0,0
     write       4     53492 /devices/pseudo/mm@0:null

Total event time (ms): 4914   Total Mbytes/sec: 102

Woah, we are now spending much more time in read()! I imagine segmap is a
busy bee. The "running" and "write" times are hardly different.

Now for a SPARC demo of the same,

Running: dd if=/dev/dsk/c0d0s0 of=/dev/null bs=56k       # SPARC, 64-bit

# ./pfilestat `pgrep -x dd`

     STATE   FDNUM      Time Filename
     write       4        3% /devices/pseudo/mm@0:zero
   waitcpu       0        7%
   running       0       17%
      read       3       24% /devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:a
   sleep-r       0       54%

     STATE   FDNUM      KB/s Filename
      read       3     13594 /devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:a
     write       4     13606 /devices/pseudo/mm@0:zero

Total event time (ms): 4741   Total Mbytes/sec: 25

I did prime the cache by running this a few times first. There is less
read() time than with the x86 32-bit demo; my guess is that the process is
more often exhausting the (faster) segmap cache and reaching the point where
it must sleep. (However, do take this comparison with a grain of salt - my
development servers aren't ideal for comparing statistics: one is an
867 MHz Pentium, and the other a 360 MHz Ultra 5.)

The file system cache is faster on 64-bit systems due to the segkpm
enhancement in Solaris 10. For details see,
http://blogs.sun.com/roller/page/rmc?entry=solaris_10_fast_filesystem_cache

Now, back to x86.

Running: tar cf /dev/null /

# ./pfilestat `pgrep -x tar`

     STATE   FDNUM      Time Filename
      read      11        0% /extra1/test/amd64/libCstd.so.1
      read      11        0% /extra1/test/amd64/libXm.so
      read      11        0% /extra1/test/amd64/libXm.so.4
      read      11        1% /extra1/test/amd64/libgtk-x11-2.0.so
      read      11        2% /extra1/test/amd64/libgtk-x11-2.0.so.0
   waitcpu       0        2%
      read       9        4% /extra1/5000
     write       3        7% /devices/pseudo/mm@0:null
   running       0       19%
   sleep-r       0       46%

     STATE   FDNUM      KB/s Filename
      read      11       293 /extra1/test/amd64/libgdk-x11-2.0.so
      read      11       295 /extra1/test/amd64/libgdk-x11-2.0.so.0
      read       9       476 /extra1/1000
      read      11       526 /extra1/test/amd64/libCstd.so.1
      read      11       594 /extra1/test/amd64/libXm.so.4
      read      11       594 /extra1/test/amd64/libXm.so
      read      11      1603 /extra1/test/amd64/libgtk-x11-2.0.so.0
      read      11      1606 /extra1/test/amd64/libgtk-x11-2.0.so
      read       9      4078 /extra1/5000
     write       3     21254 /devices/pseudo/mm@0:null

Total event time (ms): 4903   Total Mbytes/sec: 41

Fair enough. tar is cruising along at 21 Mbytes/sec (writes to fd 3!).
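By the way, the read percentages above are just the time spent inside the
read() syscall, keyed by file descriptor. If you want to double check that
component on its own, a minimal DTrace sketch such as the following will do;
this is not how pfilestat itself is written, and the script name readtime.d
is made up for illustration,

   #!/usr/sbin/dtrace -s
   /* readtime.d: sketch only - sum time spent inside read() per fd
      for PID $1. pfilestat tracks more states and maps fds to
      filenames; this just isolates the read() component. */

   syscall::read:entry
   /pid == $1/
   {
           self->fd = arg0;
           self->ts = timestamp;
   }

   syscall::read:return
   /self->ts/
   {
           @ms[self->fd] = sum(timestamp - self->ts);
           self->ts = 0;
           self->fd = 0;
   }

   tick-5sec
   {
           normalize(@ms, 1000000);
           printa("   read() time on fd %d: %@d ms\n", @ms);
           exit(0);
   }

Run it as, say, ./readtime.d `pgrep -x dd`. On the rdsk run above the sum
should stay small, while on the dsk run the read() time should be much
larger.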
More interesting is to do the following,

Running: tar cf - / | gzip > /dev/null

# ./pfilestat `pgrep -x tar`

     STATE   FDNUM      Time Filename
      read      11        0% /extra1/test/amd64/libm.so
      read      11        0% /extra1/test/amd64/libm.so.2
      read      11        0% /extra1/test/amd64/libnsl.so
      read      11        0% /extra1/test/amd64/libnsl.so.1
      read      11        0% /extra1/test/amd64/libc.so.1
     write       3        2%
   waitcpu       0        4%
   sleep-r       0        4%
   running       0        6%
   sleep-w       0       78%

     STATE   FDNUM      KB/s Filename
      read      11        74 /extra1/test/amd64/libldap.so
      read      11        75 /extra1/test/amd64/libldap.so.5
      read      11        75 /extra1/test/amd64/libresolv.so.2
      read      11        76 /extra1/test/amd64/libresolv.so
      read      11        97 /extra1/test/amd64/libm.so.2
      read      11        98 /extra1/test/amd64/libm.so
      read      11       174 /extra1/test/amd64/libnsl.so
      read      11       176 /extra1/test/amd64/libnsl.so.1
      read      11       216 /extra1/test/amd64/libc.so.1
     write       3      3022

Total event time (ms): 4911   Total Mbytes/sec: 6

Woah now, tar is writing 3 Mbytes/sec - AND spending 78% of its time on
sleep-w, sleeping on writes! Of course, this is because we are piping the
output to gzip, which is spending a while compressing the data. That 78%
matches the time gzip was on the CPU (using either "prstat -m" or dtrace to
measure; procfs's pr_pctcpu would take too long to catch up).

Also interesting is,

Running: perl -e 'while (1) {;}' &
Running: perl -e 'while (1) {;}' &
Running: perl -e 'while (1) {;}' &
Running: perl -e 'while (1) {;}' &
Running: tar cf /dev/null /

# ./pfilestat `pgrep -x tar`

     STATE   FDNUM      Time Filename
      read      11        0% /extra1/test/amd64/libxml2.so.2
      read      11        0% /extra1/test/amd64/libgdk-x11-2.0.so.0
      read      11        0% /extra1/test/amd64/libgdk-x11-2.0.so
      read      11        0% /extra1/test/amd64/libCstd.so.1
      read      11        0% /extra1/test/amd64/libgtk-x11-2.0.so.0
      read      11        2% /extra1/test/amd64/libgtk-x11-2.0.so
     write       3        2% /devices/pseudo/mm@0:null
   running       0        8%
   sleep-r       0       22%
   waitcpu       0       65%

     STATE   FDNUM      KB/s Filename
      read      11       182 /extra1/test/amd64/libsun_fc.so
      read      11       264 /extra1/test/amd64/libglib-2.0.so
      read      11       266 /extra1/test/amd64/libglib-2.0.so.0
      read      11       280 /extra1/test/amd64/libxml2.so.2
      read      11       293 /extra1/test/amd64/libgdk-x11-2.0.so
      read      11       295 /extra1/test/amd64/libgdk-x11-2.0.so.0
      read      11       526 /extra1/test/amd64/libCstd.so.1
      read      11       761 /extra1/test/amd64/libgtk-x11-2.0.so.0
      read      11      1606 /extra1/test/amd64/libgtk-x11-2.0.so
     write       3      7881 /devices/pseudo/mm@0:null

Total event time (ms): 4596   Total Mbytes/sec: 13

Now we have "waitcpu" as tar competes for CPU cycles with the greedy
infinite-loop perl processes.
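On the gzip point above: the claim that 78% matches gzip's on-CPU time can
be checked with a short DTrace one-liner on the sched provider (or just
"prstat -m"). The following is only a sketch - the execname filter and the
5 second tick are assumptions chosen to match these demos,

   # sketch: sum on-CPU time per process name over a 5 second interval
   dtrace -n '
   sched:::on-cpu
   /execname == "gzip" || execname == "perl"/
   {
           self->ts = timestamp;
   }

   sched:::off-cpu
   /self->ts/
   {
           @oncpu[execname] = sum(timestamp - self->ts);
           self->ts = 0;
   }

   tick-5sec
   {
           normalize(@oncpu, 1000000);
           printa("   %s on-CPU: %@d ms\n", @oncpu);
           exit(0);
   }'

If the 78% figure holds, gzip should show close to four seconds of on-CPU
time over the interval; the same one-liner shows how much CPU the perl loops
soak up in the last demo.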