Archive for the ‘Hardware’ Category

Some of our machines have five 500 GB disks configured as a single RAID 5 unit. The cases holding these disks have three empty slots, so I’ve just added three more 500 GB disks and configured them as a second RAID 5 unit. The new unit will only hold about 1 TB, but we’re pretty short on disk space, so that’s OK.

[root@cdfs tw_cli]# ./tw_cli
//cdfs> info

Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c8    9650SE-8LPML 8       8        1       0        4       4       -

//cdfs> info c8

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    OK             -      64K     1862.61   ON     OFF      OFF 

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     WD-WCAPW2252807
p1     OK               u0     465.76 GB   976773168     WD-WCAPW2071371
p2     OK               u0     465.76 GB   976773168     WD-WCAPW2252305
p3     OK               u0     465.76 GB   976773168     WD-WCAPW2252370
p4     OK               u0     465.76 GB   976773168     WD-WCAPW2252832
p5     OK               -      465.76 GB   976773168     WD-WCAPW1478014
p6     OK               -      465.76 GB   976773168     WD-WCAPW1478552
p7     OK               -      465.76 GB   976773168     WD-WCAPW1478397

//cdfs> /c8 add type=raid5 disk=5:6:7
Creating new unit on controller /c8 ...  Done. The new unit is /c8/u1.
Setting write cache=ON for the new unit ... Done.
Warning: You do not have a battery backup unit for /c8/u1 and the enabled
write cache (default) may cause data loss in the event of power failure.
Setting default Command Queuing Policy for unit /c8/u1 to [off] ... Done.

//cdfs> info c8

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    OK             -      64K     1862.61   ON     OFF      OFF 
u1    RAID-5    OK             -      64K     931.303   ON     OFF      OFF 

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     465.76 GB   976773168     WD-WCAPW2252807
p1     OK               u0     465.76 GB   976773168     WD-WCAPW2071371
p2     OK               u0     465.76 GB   976773168     WD-WCAPW2252305
p3     OK               u0     465.76 GB   976773168     WD-WCAPW2252370
p4     OK               u0     465.76 GB   976773168     WD-WCAPW2252832
p5     OK               u1     465.76 GB   976773168     WD-WCAPW1478014
p6     OK               u1     465.76 GB   976773168     WD-WCAPW1478552
p7     OK               u1     465.76 GB   976773168     WD-WCAPW1478397
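The unit sizes in the listing follow from the usual RAID 5 rule: one disk's worth of space goes to parity, so usable space is (disks - 1) times the per-disk size. Here is a minimal sketch of that arithmetic. The function names are mine, and the per-disk figure is simply the one tw_cli prints. tw_cli reports slightly less than the rule predicts (1862.61 and 931.303), presumably because the controller keeps a little space for itself. Its "GB" are binary units (2^30 bytes), which is why the same unit appears as 999.9 GB in fdisk.

```python
def raid5_usable_gb(disks: int, disk_gb: float) -> float:
    """Usable capacity of a RAID 5 set: one disk's worth of space holds parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks - 1) * disk_gb


def bytes_to_binary_gb(size_bytes: int) -> float:
    """Convert a byte count to binary gigabytes (2**30 bytes each)."""
    return size_bytes / 2**30


# Per-disk size as printed by tw_cli in the listing.
DISK_GB = 465.76

print(round(raid5_usable_gb(5, DISK_GB), 2))   # existing unit u0
print(round(raid5_usable_gb(3, DISK_GB), 2))   # new unit u1

# Byte count that fdisk reports for the new unit, expressed in tw_cli's units.
print(round(bytes_to_binary_gb(999978696704), 3))
```

The last line comes out at 931.303, matching the size tw_cli shows for u1.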

Now partition the new unit and create a filesystem on it.

[root@cdfs local]# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 121573.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): p

Disk /dev/sdb: 999.9 GB, 999978696704 bytes
255 heads, 63 sectors/track, 121573 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-121573, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-121573, default 121573):
Using default value 121573

Command (m for help): p

Disk /dev/sdb: 999.9 GB, 999978696704 bytes
255 heads, 63 sectors/track, 121573 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      121573   976535091   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
[root@cdfs local]# mkfs.ext3 -m0 -E stride=32 -j -O dir_index,resize_inode,sparse_super /dev/sdb1
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
122077184 inodes, 244133772 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=247463936
7451 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@cdfs local]# tune2fs -c0 -i0 /dev/sdb1
tune2fs 1.35 (28-Feb-2004)
Setting maximal mount count to -1
Setting interval between check 0 seconds
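The -E stride option tells mke2fs how many filesystem blocks fit in one RAID chunk, and the usual rule of thumb is chunk size divided by block size. The sketch below works that out and also cross-checks the block group and inode counts from the mke2fs output. It assumes that the 64K "Stripe" shown by tw_cli is the per-disk chunk size, which the output itself does not say. If that assumption holds, the rule gives 16 for 4096-byte blocks, whereas the command above passed 32. The function names are made up for illustration.

```python
def ext3_stride(chunk_bytes: int, block_bytes: int) -> int:
    """Filesystem blocks per RAID chunk (rule of thumb: chunk size / block size)."""
    if chunk_bytes % block_bytes != 0:
        raise ValueError("chunk size should be a multiple of the block size")
    return chunk_bytes // block_bytes


def block_group_count(total_blocks: int, blocks_per_group: int) -> int:
    """Number of block groups, counting a final partial group."""
    return -(-total_blocks // blocks_per_group)  # ceiling division


# Assumption: tw_cli's 64K "Stripe" is the per-disk chunk size.
print(ext3_stride(64 * 1024, 4096))

# Figures taken from the mke2fs output: total blocks, blocks and inodes per group.
groups = block_group_count(244133772, 32768)
print(groups)           # block groups
print(groups * 16384)   # total inodes
```

The last two values agree with the 7451 block groups and 122077184 inodes that mke2fs reported.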

Now mount and export the disk.

Most of our RAID units were built using Western Digital WD5000YS disks. When I ordered some more disks, the ones I received were WD5000AAYS. I emailed Western Digital to ask whether these disks are interchangeable with the ones in our RAID units, and according to them, they are.

Dear Mary,

Thank you for contacting Western Digital Customer Service and Support. 

Please see the links below for specifications of each product. You will see that they are 
indeed interchangeable.

WD5000YS
http://wdc.com/en/products/products.asp?driveid=238&language=en

WD5000AAYS
http://wdc.com/en/products/products.asp?driveid=331&language=en

Sincerely,
Peter K.
Western Digital Service and Support
http://support.wdc.com

We’re having some problems with our 3ware 9650SE RAID card. The messages in the log are:

3w-9xxx: scsi9: WARNING: (0x06:0x002C): Unit #0: Command (0x2a) timed out, resetting card.

And we get LOTS of these messages. The 3ware site said to try upgrading the driver, so I downloaded the new driver and untarred it. To build the driver, I ran:

make -f Makefile.rh

This created a 3w-9xxx.o file which I copied to /lib/modules/2.4.21-47.0.1.ELsmp/kernel/drivers/scsi.

I also raised the number of nfsd threads from 64 to 128, though I’m pretty sure that this wasn’t the problem.

We’ll see if the new driver helps.
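One way to judge whether the new driver helps is to count the timeout messages before and after the change. This is a rough sketch that tallies them per unit from lines of log text. The regular expression is written against the single message quoted above, so other message variants would be missed, and the function name is just for illustration.

```python
import re
from collections import Counter

# Matches the timeout warning quoted above; captures the unit number.
TIMEOUT_RE = re.compile(
    r"3w-9xxx: scsi\d+: WARNING: \(0x06:0x002C\): "
    r"Unit #(\d+): Command \(0x[0-9a-fA-F]+\) timed out"
)


def count_timeouts(lines):
    """Return a Counter mapping unit number -> number of timeout warnings."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "3w-9xxx: scsi9: WARNING: (0x06:0x002C): Unit #0: "
    "Command (0x2a) timed out, resetting card.",
    "some unrelated kernel message",
]
print(count_timeouts(sample))
```

To run it on a real log, pass an open file object instead of the sample list; which log file holds these messages depends on the syslog setup.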

3ware provides a utility that you can use. I put a copy of it on my website:

tw_cli-linux-x86-9.3.0.4.tar

After you untar, you should be able to just run the tw_cli command. I think you guys have the same hardware that we do. Here’s what it looked like on one of the other hep machines:

cdfs1:tw_cli$ ./tw_cli
//cdfs1> show

Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8       8        1       0        4       4       -
c1    9550SX-8LP   8       8        1       0        4       4       -

Problems will show in the NotOpt (Not Optimal) column. You can get more information with the info command. It will show you which disks are bad.

//cdfs1> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    OK             -      64K     1629.74   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     233.76 GB   490234752     WD-WCANK2922552
p1     OK               u0     233.76 GB   490234752     WD-WCANK2785980
p2     OK               u0     233.76 GB   490234752     WD-WCANK2922551
p3     OK               u0     233.76 GB   490234752     WD-WCANK2941855
p4     OK               u0     233.76 GB   490234752     WD-WCANK2785894
p5     OK               u0     233.76 GB   490234752     WD-WCANK2785927
p6     OK               u0     233.76 GB   490234752     WD-WCANK2922607
p7     OK               u0     233.76 GB   490234752     WD-WCANK2941311
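Checking the NotOpt column can be automated. The sketch below parses text in the layout of the show listing above and returns the controllers whose NotOpt count is non-zero. It only reads text that has already been captured and does not run tw_cli itself. The function name is hypothetical, and the column position is taken from the listing in this post, so another tw_cli version may need adjusting.

```python
from typing import List


def controllers_not_optimal(show_output: str) -> List[str]:
    """Return names of controllers whose NotOpt column is greater than zero."""
    flagged = []
    for line in show_output.splitlines():
        fields = line.split()
        # Controller rows start with c<number>; NotOpt is the sixth column.
        if len(fields) < 6:
            continue
        name, not_opt = fields[0], fields[5]
        if name.startswith("c") and name[1:].isdigit() and not_opt.isdigit():
            if int(not_opt) > 0:
                flagged.append(name)
    return flagged


# The listing from this post: both controllers are healthy.
SHOW_OUTPUT = """\
Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8       8        1       0        4       4       -
c1    9550SX-8LP   8       8        1       0        4       4       -
"""

print(controllers_not_optimal(SHOW_OUTPUT))
```

For any controller this reports, the info command shown above gives the per-disk detail.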

When applying thermal compound to a CPU, make sure that you apply it as thinly as possible. Ideally, drop about a pea-sized dollop in the middle of the CPU and spread it all over the top of the CPU. Don’t get any on the underside.

I applied too much, and the result was that the machine would start up and then emit a high-pitched continuous beep for as long as the computer was on.

I upgraded a machine with a Radeon 7000 video card to FC5. The installation was easy and everything appeared to work. However, after logging in as any user, the entire machine would hang. Reboot, log in, and it would hang again. Solution: edit the file /etc/X11/xorg.conf and look for a line that reads:

Load "dri"

Comment out this line and reboot.
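If the same fix is needed on several machines, the edit can be scripted. This is a minimal sketch that takes the text of an xorg.conf and returns it with the dri line commented out. The function name and the sample Module section are made up for illustration. It works on a string only, so reading the real file, writing it back, and keeping a backup are left to the caller.

```python
import re

# An uncommented line of the form: Load "dri" (any leading whitespace).
DRI_RE = re.compile(r'^(\s*)(Load\s+"dri")', re.IGNORECASE)


def comment_out_dri(conf_text: str) -> str:
    """Return conf_text with any active Load "dri" line prefixed by '#'."""
    lines = conf_text.splitlines(keepends=True)
    return "".join(DRI_RE.sub(r"\1#\2", line) for line in lines)


# Made-up sample section, not copied from a real machine.
sample = 'Section "Module"\n    Load  "dbe"\n    Load  "dri"\nEndSection\n'
print(comment_out_dri(sample))
```

Lines that are already commented out are left alone, so running it twice does no harm.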