Out with the old and in with the new.

by Mads
August 05, 2007 at 23:20 | categories: hardware, sun, solaris, asf, httpd

For quite some time, the infrastructure team at the ASF has been running our websites, mail archives and wiki on a Sun Fire T2000 Server kindly donated by Sun. Along with the T2000 there's also a Dell SATA RAID donated by ask.
Naturally, the machine is running Solaris 10, and that along with DTrace has already allowed us to find and correct a pretty serious performance issue. Our load was hitting 500 and beyond and was close to knocking the machine over. Some digging around with DTrace showed us an insane number of syscalls, almost all of them reads.
We did some more digging with the following one-liner by Brendan Gregg:

# Read bytes by process
dtrace -n 'sysinfo:::readch { @bytes[execname] = sum(arg0); }'

The results gave a very clear picture that almost all reads were 1k in size, which allowed Joe Schaefer to create a patch for apr to use buffered I/O with SDBM.
The current look of things is a lot better:

  httpd                                             
           value  ------------- Distribution ------------- count    
              -1 |                                         0        
               0 |                                         987      
               1 |                                         0        
               2 |                                         6        
               4 |                                         296      
               8 |                                         30       
              16 |                                         147      
              32 |                                         130      
              64 |                                         47       
             128 |                                         140      
             256 |                                         460      
             512 |                                         118      
            1024 |                                         19       
            2048 |                                         72       
            4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 336511   
            8192 |                                         11       
           16384 |                                         3        
           32768 |                                         0        
           65536 |                                         8        
          131072 |                                         0        

With the change, our load has dropped from over 500 to somewhere between 5 and 10.
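To illustrate why buffering helps here, the following is a minimal Python sketch, not the actual apr patch. It counts how many low-level reads are needed to consume the same data in 1k requests, first unbuffered and then through a 4096-byte buffer. The names CountingRaw and count_reads are hypothetical, the in-memory data stands in for an SDBM file, and the 4096-byte buffer size is an assumption chosen to match the dominant bucket in the histogram above.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory stand-in for a file descriptor.

    Every call to readinto() represents one read syscall and is counted.
    """

    def __init__(self, data):
        super().__init__()
        self._data = data
        self._pos = 0
        self.reads = 0

    def readable(self):
        return True

    def readinto(self, b):
        self.reads += 1
        chunk = self._data[self._pos:self._pos + len(b)]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


def count_reads(data, request_size, buffer_size=None):
    """Consume data in request_size pieces; return (low-level reads, bytes read).

    With buffer_size=None each request goes straight to the raw object.
    Otherwise requests are served through an io.BufferedReader.
    """
    raw = CountingRaw(data)
    if buffer_size is None:
        stream = raw
    else:
        stream = io.BufferedReader(raw, buffer_size=buffer_size)
    total = 0
    while True:
        chunk = stream.read(request_size)
        if not chunk:
            break
        total += len(chunk)
    return raw.reads, total


if __name__ == "__main__":
    data = bytes(64 * 1024)  # 64k of zero bytes as sample input
    unbuffered, _ = count_reads(data, 1024)
    buffered, _ = count_reads(data, 1024, buffer_size=4096)
    print("low-level reads, unbuffered 1k requests:", unbuffered)
    print("low-level reads, 4096-byte buffer:      ", buffered)
```

The application still asks for 1k at a time in both cases; only the number of trips to the underlying file changes, which is the effect the syscall counts were pointing at.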

For a long time we've also been wanting to add some redundancy by placing a similar setup at our European site. The board approved our request to go shopping, and after lots of hassle trying to buy the machine from Sun (being a small customer at Sun is rarely much fun, and I think we were even more unlucky than usual) we eventually got there. Sander, along with Colm, got the machines racked.

The pictures are by Colm
before

after

The upper picture shows the old Itanic and a broken X-serve. Below is the "after" picture, showing Aurora, which is now the European mirror of Eos. The machine above Aurora is a Sun Fire X2200 M2 Server that will serve as a mail frontend.

And so ends the tale of how the rising Sun replaced the sinking Itanic :)