Five Nines

Something you won’t meet in real life.  At least, you won’t meet it except by pure luck in affordable web hosting.

Just in case you’ve never run the numbers…

Fraction of year    Minutes
1                   525,600
.1                  52,560
.01                 5,256
.001                525.6
.0001               52.56
.00001              5.256

So there you have it. Being up .99999 of the time (over a year) means you can have just over 5 minutes of downtime. Total. The whole year. (This being a 365 day calendar year, not an astronomical year.)
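For anyone who would rather re-run the numbers than trust the table, here is a minimal shell sketch of the same arithmetic. The function name downtime_minutes is made up for this example, and it assumes the same 365-day year as above.

```shell
#!/bin/sh
# Minutes of downtime allowed per 365-day year for a given number of nines.
# downtime_minutes is a made-up helper name, not a standard tool.
downtime_minutes() {
    LC_ALL=C awk -v nines="$1" 'BEGIN {
        minutes = 365 * 24 * 60          # 525,600 minutes in the year
        for (i = 0; i < nines; i++)      # each extra nine cuts the budget by ten
            minutes /= 10
        printf "%.3f\n", minutes
    }'
}

downtime_minutes 5    # five nines
```

Run as written it prints 5.256, the last row of the table. Passing 3 instead gives the budget for "three nines", 525.600 minutes.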

OpenSolaris Static IP

I’ve been driven pretty close to incoherent rage trying to simply configure a static IP on my reinstalled server box, because of incomplete and incorrect web pages and forum posts.

It makes sense that the default configuration is DHCP; that’s how the vast majority of sites work. I don’t, because the DHCP server in my router doesn’t let me configure static IPs for certain MAC addresses, which is a requirement for (for example) the fileserver.

I know it’s possible to configure a static IP via NWAM. But only on one interface at a time, and I’m not prepared to accept that limit (this motherboard has two, and I’ve got various ideas for putting both to use).

So, here’s the deal:

Disable svc:/network/physical:nwam and enable svc:/network/physical:default. Copy /etc/nsswitch.dns to /etc/nsswitch.conf. Put your IP and name into /etc/hosts, and take your name off of the localhost IP lines. Make sure /etc/resolv.conf is valid. Put the default router IP into /etc/defaultrouter. (Turning on RIP in the router hasn’t resulted in its being found automatically.)
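As a rough sketch, the service switch and the nsswitch copy come down to three commands. The service names are written in the usual svc:/network/physical:... form. The run wrapper and the APPLY variable are inventions of this example, there so the script only prints the commands unless you explicitly ask it to execute them. The edits to /etc/hosts, /etc/resolv.conf and /etc/defaultrouter depend on your own addresses and are left to be done by hand.

```shell
#!/bin/sh
# Dry run by default: each command is printed rather than executed.
# Set APPLY=1 on the OpenSolaris box itself to run them for real.
# (run and APPLY are made-up names for this sketch.)
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

switch_to_manual_networking() {
    # Stop NWAM managing the interfaces, hand them to the traditional service.
    run pfexec svcadm disable svc:/network/physical:nwam
    run pfexec svcadm enable svc:/network/physical:default
    # Use the DNS-aware name service switch file.
    run pfexec cp /etc/nsswitch.dns /etc/nsswitch.conf
}

switch_to_manual_networking
```

Printing first is deliberate: if you are logged in over the network, disabling NWAM may drop the connection, so it is worth reading the commands before running them from the console.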

Now, here’s the completely weird and undocumented bit: If your live Ethernet interface is nge0, create /etc/hostname.nge0, and in it put TWO lines; on the first line, the static IP you want. On the second line, “netmask broadcast + up”. (“+” means the all-ones broadcast address rather than the all-zeros one, I think.) If this is documented anywhere, I couldn’t find it. I’ve found two examples; the one I found this weekend I couldn’t find again tonight, so I ended up working from a different one.
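Here is a minimal sketch of building that two-line file in a scratch directory before copying it into /etc. Everything specific in it is an assumption made for illustration: the address 192.168.1.10, the mask 255.255.255.0, the staging directory, and the helper name write_hostname_file. It also assumes that an actual mask value follows the netmask keyword, which the second line as quoted above does not show.

```shell
#!/bin/sh
# Assumed example values, not taken from the post: the address, the mask,
# and the staging directory. write_hostname_file is a made-up helper name.
write_hostname_file() {
    # $1 = interface, $2 = static IP, $3 = netmask, $4 = directory to write into
    printf '%s\nnetmask %s broadcast + up\n' "$2" "$3" > "$4/hostname.$1"
}

stage="${TMPDIR:-/tmp}/hostname-demo.$$"
mkdir -p "$stage"
write_hostname_file nge0 192.168.1.10 255.255.255.0 "$stage"
cat "$stage/hostname.nge0"
```

Once the file looks right, it can be copied to /etc/hostname.nge0 with pfexec cp. Staging it first means a typo never lands in the live configuration.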

And restart the service. Seems to work.

OpenSolaris ZFS Root Pool Mirroring

This has been a real pain, due to incomplete and misleading documentation. I believe I’ve finally gotten it to work again, and I need to write down in some detail what I did.

This is based on installing OpenSolaris 2008.11. The two disks I want to make my root pool are c5t0d0 and c5t1d0. The general process is to install on one of the disks, and attach the second as a mirror later. Most of this requires root, and I see I haven’t been too careful about showing that. Sometimes the “pfexec” command is shown, which is the OpenSolaris roles-oriented equivalent of sudo for executing one command with the root role.

First point: while in general you want to give ZFS whole disks to work with, you cannot do that for a root pool that you intend to mirror. There’s a nice convenient “whole disk” option in the installer, too, and no warning that you shouldn’t use it if you want to mirror later.

So, let’s say you’ve gone ahead and installed with nearly all of c5t0d0 as your zfs root pool (it’s called rpool), and now you want to make it a mirror.

First, using the format tool, see what the partition structure is (what I expect on an x86 box is one fdisk partition occupying the whole disk, and a set of Solaris slices in that). In my case, and I think this is what the installer always does, s0 is the root slice, and it occupied cylinders 1-9724 on a disk having cylinders 0-9725. In any case, duplicate this structure on the disk you intend to mirror with (c5t1d0 in my case).

At which point your second disk will look like this:

Specify disk (enter its number)[1]: 1
selecting c5t1d0
[disk formatted]
format> verify

Primary label contents:

Volume name = <        >
ascii name  = 
pcyl        = 9728
ncyl        = 9726
acyl        =    2
bcyl        =    0
nhead       =  255
nsect       =   63
Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       1 - 9724       74.49GB    (9724/0/0) 156216060
  1 unassigned    wm       0               0         (0/0/0)            0
  2     backup    wu       0 - 9725       74.50GB    (9726/0/0) 156248190
  3 unassigned    wm       0               0         (0/0/0)            0
  4 unassigned    wm       0               0         (0/0/0)            0
  5 unassigned    wm       0               0         (0/0/0)            0
  6 unassigned    wm       0               0         (0/0/0)            0
  7 unassigned    wm       0               0         (0/0/0)            0
  8       boot    wu       0 -    0        7.84MB    (1/0/0)        16065
  9 unassigned    wm       0               0         (0/0/0)            0
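
The block counts in that listing follow directly from the geometry, so a quick cross-check is a cheap way to confirm the second disk really matches the first. This is just a sketch of the arithmetic, using nhead and nsect from the output above; slice_blocks is a made-up helper name.

```shell
#!/bin/sh
# Geometry values copied from the format output above.
nhead=255
nsect=63
sectors_per_cyl=$(( nhead * nsect ))

# slice_blocks FIRST LAST: block count of a slice spanning cylinders FIRST..LAST.
# (slice_blocks is a made-up helper name.)
slice_blocks() {
    echo $(( ($2 - $1 + 1) * sectors_per_cyl ))
}

echo "sectors per cylinder: $sectors_per_cyl"
echo "s0 (root):   $(slice_blocks 1 9724)"
echo "s2 (backup): $(slice_blocks 0 9725)"
echo "s8 (boot):   $(slice_blocks 0 0)"
```

The results (156216060, 156248190 and 16065) agree with the Blocks column. If the numbers for your two disks differ, the slices were not duplicated exactly.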

Now you can mirror it:

pfexec zpool attach -f rpool c5t0d0s0 c5t1d0s0

The “-f” is necessary because, in previous playing around, I’ve sometimes managed to get a recognizable part of a zpool on that slice, so I have to tell zpool to overwrite it. So be VERY careful using -f! Don’t do it at first, and if you get an error and you’re SURE you really want to overwrite the old data, then use -f.

So this was very successful:

localddb@fsfs:/boot/grub$ zpool status
  pool: rpool
 state: ONLINE
 scrub: resilver completed after 0h3m with 0 errors on Sun Jan 18 11:28:36 2009

	rpool         ONLINE       0     0     0
	  mirror      ONLINE       0     0     0
	    c5t0d0s0  ONLINE       0     0     0  25.6M resilvered
	    c5t1d0s0  ONLINE       0     0     0  3.57G resilvered

errors: No known data errors
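
If you want a script to wait for this rather than eyeballing it, something like the following sketch could do the check. It assumes the status wording shown above, which may differ in other ZFS releases, and resilver_clean is a made-up helper name.

```shell
#!/bin/sh
# resilver_clean is a made-up helper: it reads `zpool status` text on stdin
# and succeeds only if a resilver is reported as completed with 0 errors.
# The pattern matches the wording in the output above; other releases may differ.
resilver_clean() {
    grep -q 'resilver completed after .* with 0 errors'
}

# On the server:  zpool status rpool | resilver_clean && echo "mirror is in sync"
# Here the check is fed the line from the output above instead.
echo ' scrub: resilver completed after 0h3m with 0 errors on Sun Jan 18 11:28:36 2009' |
    resilver_clean && echo "mirror is in sync"
```

Matching on the text is crude but keeps the sketch independent of any particular zpool option.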

Now, to do a really complete job, you need to install grub on the secondary disk as well. In fact zpool will tell you you should do that. If you don’t do this, you won’t be able to boot from just the second disk when something happens to the first one. (Yes, “when”. Murphy rules!)

So, to install grub:

localddb@fsfs:/$ cd /boot/grub
localddb@fsfs:/boot/grub$ pfexec installgrub stage1 stage2 /dev/rdsk/c5t1d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 267 sectors starting at 50 (abs 16115)
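
The “abs” numbers in that output are a handy sanity check that grub went onto the slice you meant. A small sketch of the arithmetic, assuming slice 0 starts at cylinder 1 with 255 heads and 63 sectors per track, as in the format listing earlier:

```shell
#!/bin/sh
# Slice 0 starts at cylinder 1; one cylinder is 255 heads x 63 sectors.
slice_start=$(( 1 * 255 * 63 ))

# installgrub reported stage1 at sector 0 and stage2 at sector 50 of the slice.
echo "stage1 at absolute sector $(( slice_start + 0 ))"
echo "stage2 at absolute sector $(( slice_start + 50 ))"
```

That gives 16065 and 16115, matching what installgrub printed.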

And that should be that.

I suppose a truly wise admin would play with various failure modes and recoveries before going on to install much on this base.