Archive for December, 2006

3ware provides a utility that you can use. I put a copy of it on my website:

tw_cli-linux-x86-9.3.0.4.tar

After you untar it, you should be able to just run the tw_cli command. I think you guys have the same hardware that we do. Here's what it looked like on one of the other hep machines:

cdfs1:tw_cli$ ./tw_cli
//cdfs1> show

Ctl   Model        Ports   Drives   Units   NotOpt   RRate   VRate   BBU
------------------------------------------------------------------------
c0    9550SX-8LP   8       8        1       0        4       4       -
c1    9550SX-8LP   8       8        1       0        4       4       -

Problems will show up in the NotOpt (Not Optimal) column. You can get more information with the info command, which will show you which disks are bad.

//cdfs1> info c0

Unit  UnitType  Status         %Cmpl  Stripe  Size(GB)  Cache  AVerify  IgnECC
------------------------------------------------------------------------------
u0    RAID-5    OK             -      64K     1629.74   ON     OFF      OFF

Port   Status           Unit   Size        Blocks        Serial
---------------------------------------------------------------
p0     OK               u0     233.76 GB   490234752     WD-WCANK2922552
p1     OK               u0     233.76 GB   490234752     WD-WCANK2785980
p2     OK               u0     233.76 GB   490234752     WD-WCANK2922551
p3     OK               u0     233.76 GB   490234752     WD-WCANK2941855
p4     OK               u0     233.76 GB   490234752     WD-WCANK2785894
p5     OK               u0     233.76 GB   490234752     WD-WCANK2785927
p6     OK               u0     233.76 GB   490234752     WD-WCANK2922607
p7     OK               u0     233.76 GB   490234752     WD-WCANK2941311
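If you want to automate that check, tw_cli should also take the command as an argument instead of going interactive, so a one-liner like this could go in a cron job. This is just a sketch based on the column layout above, where NotOpt is the sixth field:

./tw_cli show | grep '^c' | awk '$6 != 0 {print $1, "is not optimal"}'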

The eight disks on our laptop backup machine will be put into a raid. But since I didn't want to spend a lot of money on a hardware raid card, I'll be doing it in software.

1. Create partitions on each disk with fdisk. After creating the new primary partition, use t (change a partition's system id) to set the type to fd (Linux raid autodetect), as in the sketch below.
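Roughly, the fdisk keystrokes for the first disk look like this (repeat for each of the eight disks, accepting the defaults for the cylinder prompts):

fdisk /dev/sda
  n    (new partition)
  p    (primary)
  1    (partition number, then Enter twice for first/last cylinder)
  t    (change the partition's system id)
  fd   (Linux raid autodetect)
  w    (write the table and quit)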

2. Use mdadm (Manage MD devices = Linux Software Raid) to create the raid.

mdadm --create --verbose /dev/md0 --level=5 --raid-devices=8 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
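The array starts syncing as soon as it's created; the rebuild progress can be watched with:

cat /proc/mdstat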

3. Make a filesystem on the raid (plain mkfs gives you ext2; the journal gets added in the next step)

mkfs /dev/md0

4. Make it a journaling file system (-j adds the journal, making it ext3; -c0 and -i0 turn off the mount-count and time-based fsck checks)

tune2fs -c0 -i0 -j /dev/md0

5. Mount it
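Something like the following, where /backup is just the mount point I'm assuming here, plus an fstab line so it comes back after a reboot:

mkdir /backup
mount /dev/md0 /backup

/etc/fstab:
/dev/md0   /backup   ext3   defaults   1 2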

The status of the raid can be checked with:

mdadm --detail /dev/md0

Now I just need to write a script to check the raid and send email if something is wrong.
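A rough sketch of what that script could look like (the email address is a placeholder, and this only looks for a degraded array in the mdadm --detail output):

#!/bin/sh
# sketch: mail me if md0 reports a degraded state
DETAIL=`mdadm --detail /dev/md0`
if echo "$DETAIL" | grep -q degraded ; then
    echo "$DETAIL" | mail -s "raid problem on `hostname`" me@example.com
fi

mdadm also has a --monitor mode that can send mail on its own (with a MAILADDR line in /etc/mdadm.conf), which might end up being simpler than a cron script.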

After setting up the cpv queue, whenever I submitted a job I would get an error that said something like:

job rejected by all possible destinations

I copied the queue settings from the working cdf queue and couldn't figure out what was wrong. Then I found a line on the torque wiki saying that if you get this error, you should set route_destinations to queue@localhost. I tried this and it didn't work. But then I changed it to:

s q route_destinations = cpv@pnn

That did the trick. So there must be a difference in how SLF and SLC determine their hostname or something like that. I’m not looking into it any further because this seems to work just fine.
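For anyone copying this: the line above is qmgr shorthand, with s q standing for set queue. Written out in full it would look something like this, where <routing_queue> is a placeholder for whatever the routing queue is actually called:

qmgr -c "set queue <routing_queue> route_destinations = cpv@pnn"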