Tuesday, July 19, 2011

FreeNAS 8 - getting out of my comfort zone

Well, I have been running a different build of FreeNAS 0.7 on my backup NAS (NAS2) for a while, and I recently connected a monitor to the box to see if the console would give me a clue as to why it fell over at random times. What I found was a spinlock causing a panic. Not good. Since this version of FreeNAS was from the pre-8 days, I decided to step out of my comfort zone and completely rebuild this box with FreeNAS 8 and ZFS.

Just to give a little background on the hardware, this machine (actually two machines, NAS1 and NAS2) is built with:

  • Intel BOXD945GCLF2 Atom 330 Intel 945GC Mini ITX Motherboard/CPU Combo
  • Thermaltake Black SECC Japanese steel LANBOX Lite VF6000BNS Micro ATX Media Center / HTPC Case
  • Thermaltake TR2 W0070RUC 430W ATX12V V2.2 Intel Core i7 Compliant Dual 80mm Fans Full Cable Sleevings Power Supply
  • Kingston ValueRAM 2GB 240-Pin DDR2 SDRAM DDR2 667 (PC2 5300) Desktop Memory Model KVR667D2N5/2G
  • 2x SanDisk SDCFH2-2048 2GB CF cards attached to CF->PATA adapters (OS disks)
  • 1x WD 1TB spinning disk
  • 1x Seagate spinning disk

Since the original construction, both Seagates have died and been replaced by newer-vintage WD drives, also 1TB in size. On each box, the drives are configured as a 1TB gmirror. The primary box, NAS1, is still running FreeNAS 0.7 (build 4292) and is dead reliable. The secondary box, NAS2, whose rebuild I am documenting here, was running FreeNAS 0.7 (build 4919), and this is the version that sporadically went into spinlock panic.

Now that you know the history, let's get into the new project!

First, these servers are running FreeNAS embedded on the ata0 (now ada0 in FreeNAS 8) CF card and the other 2GB CF card is idle. Previously I had taken the CF card out of the machine and used a PCMCIA CF adapter in my laptop to write the new embedded image. This time, I used a USB flash drive to test the new image before tearing the machine apart to get to the CF card.

On my Windows laptop, using Cygwin, I ran...

cat /proc/partitions

...before and after inserting the USB flash drive so that I could be sure of the target. In my case, with one fixed disk (known as /dev/sda), the USB flash drive was known as /dev/sdb. With this information, it was a piece of cake to write the embedded image for FreeNAS 8 to the USB flash drive using...

xz --decompress --stdout FreeNAS-8.0-RELEASE-amd64.Full_Install.xz | dd of=/dev/sdb
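The before-and-after comparison of /proc/partitions can also be scripted, which removes some of the guesswork before pointing dd at a disk. The following is only a sketch: the function name new_devices and the two snapshot files are made up for illustration, and it assumes the usual four-column /proc/partitions layout (major, minor, #blocks, name).

```shell
#!/bin/sh
# new_devices BEFORE AFTER
# Print device names that appear in AFTER but not in BEFORE.
# Both arguments are saved copies of /proc/partitions
# (assumed columns: major minor #blocks name).
new_devices() {
    # First file (NR == FNR): remember every device name already present.
    # Second file: print names that were not seen in the first file.
    awk 'NR == FNR { if (NF == 4 && $1 != "major") seen[$4] = 1; next }
         NF == 4 && $1 != "major" && !($4 in seen) { print $4 }' "$1" "$2"
}
```

Typical use would be to run `cat /proc/partitions > before.txt`, insert the stick, run `cat /proc/partitions > after.txt`, and then call `new_devices before.txt after.txt`. Anything it prints is a candidate target, and it is still worth confirming the size by eye before writing.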

After the write, I inserted the stick into the powered-off server and flipped on the power. It booted the image just fine, albeit slowly, and I was in business. After some tinkering around I decided that I did want to push ahead and load FreeNAS 8 onto the primary CF drive. Still not wanting to tear the machine down to get to the card, I decided to use `dd` to copy the image from the USB flash drive to the CF card. I had never done this, but figured I would try it and see whether it would break or work. I logged in to the server and issued the following command via the CLI...

dd if=/dev/da0 of=/dev/ada0 bs=1M count=1000

It only took a few seconds to complete, so I shut down the machine, removed the USB stick, and powered it on. Lo and behold, it worked and I now have the embedded FreeNAS 8 on the primary CF card. Whoo hoo!
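For anyone who would rather not rely on luck, a copy like this can be checked before rebooting by comparing checksums of the same number of megabytes read back from both devices. This is a minimal sketch and not something from the original rebuild: the function names are hypothetical, and it uses only dd and the POSIX cksum utility.

```shell
#!/bin/sh
# first_mib_cksum SOURCE COUNT
# Print the CRC and byte count of the first COUNT MiB of SOURCE.
first_mib_cksum() {
    dd if="$1" bs=1048576 count="$2" 2>/dev/null | cksum
}

# same_prefix A B COUNT
# Succeed only if the first COUNT MiB of A and B are identical.
same_prefix() {
    [ "$(first_mib_cksum "$1" "$3")" = "$(first_mib_cksum "$2" "$3")" ]
}
```

With the devices from the dd command above, the check would look like `same_prefix /dev/da0 /dev/ada0 1000 && echo "copy verified"`.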

At this point, I am going to leave out all of the trials and tribulations that I had trying to get the machine to forget the old gmirror and reuse those 1TB disks for a new ZFS volume. If you need help sorting this out, you can ask via the comments. The bottom line is that I didn't care about saving any data on those disks, so I was carefree in my destructive behavior.

So, now I have FreeNAS 8 embedded running from the ada0 CF card and have created my 1TB ZFS mirror. The next roadblock was the lack of rsync on FreeNAS 8. Rather, let me say the lack of browser-driven rsync, which meant that I needed to get the NAS1 -> NAS2 rsync working via the CLI. I am good with the CLI, but have never set up rsyncd like this before, so more fun!

First, I went to NAS1 and had a look at the rsyncd.conf file generated by FreeNAS 0.7 and decided I needed to replicate that on NAS2. Since I had previously used rsync to sync the entire 1TB from NAS1 to NAS2, I didn't want to put this rsyncd.conf into my 1TB storage volume, so I decided that this would be a great time to get the other 2GB CF card mounted for unique local storage.

From the browser interface of NAS2, I created a new volume based on the ada1 CF card but kept getting these "Error getting used space", "Error getting available space", and "Error getting total space" messages for the new volume. Since this wasn't very helpful information, I headed back to the CLI and kept an eye on /var/log/messages while recreating the volume. What I found in the messages log was...

Jul 19 13:08:25 nas2 freenas[1505]: Executing: gpart create -s gpt /dev/ada1 && gpart add -t freebsd-swap -l swap-ada1 -s 4194304 ada1 && gpart add -t freebsd-ufs ada1
Jul 19 20:08:25 nas2 freenas: ada1 created
Jul 19 20:08:25 nas2 freenas: gpart: autofill: No space left on device
Jul 19 13:08:25 nas2 freenas[1505]: Executing: newfs -U -L PERSIST /dev/ada1p2
Jul 19 20:08:25 nas2 freenas: newfs: /dev/ada1p2: could not find special device

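In retrospect, the log points at the size calculation. The GUI asked gpart for a swap partition of 4194304 sectors, which is probably more than this 2GB card holds, so the swap step failed with "No space left on device". Because the commands were chained with &&, the UFS partition was never added, and newfs then had no ada1p2 to work on. Not knowing that at the time, I decided to see if I could make it work manually. I ran the following commands...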

[root@nas2 ~]# cd /dev
[root@nas2 /dev]# ls ada1*
crw-r-----  1 root  operator    0,  85 Jul 19 20:08 ada1
[root@nas2 /dev]# gpart add -t freebsd-ufs ada1
ada1p1 added
[root@nas2 /dev]# newfs -U -L PERSIST /dev/ada1p2
newfs: /dev/ada1p2: could not find special device
[root@nas2 /dev]# newfs -U -L PERSIST /dev/ada1p1
/dev/ada1p1: 1954.0MB (4001692 sectors) block size 16384, fragment size 2048
        using 11 cylinder groups of 183.72MB, 11758 blks, 23552 inodes.
        with soft updates
super-block backups (for fsck -b #) at:
 160, 376416, 752672, 1128928, 1505184, 1881440, 2257696, 2633952, 3010208, 3386464, 3762720

Again, Lady Luck was on my side: the partition and filesystem were created as expected. Since I hadn't deleted the error-laden volume from the browser GUI, I went ahead and rebooted to see what would happen. Upon reboot, the CF card was mounted correctly and I now have my 2GB "PERSIST" space for local persistent data. Whoo hoo again!
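The numbers in the messages log and in the newfs output can be checked with a bit of shell arithmetic. The sketch below assumes 512-byte sectors (an assumption, though the usual case for a card like this) and uses two helper names invented for illustration. It compares the 4194304-sector swap request from the log with the 4001692 sectors that newfs reported for the whole-card partition.

```shell
#!/bin/sh
# Assumption: 512-byte sectors, so one MiB is 2048 sectors.
SECTOR_BYTES=512

# sectors_to_mib SECTORS: convert a sector count to whole MiB (rounded down).
sectors_to_mib() {
    echo $(( $1 / (1048576 / SECTOR_BYTES) ))
}

# fits REQUESTED AVAILABLE: succeed if REQUESTED sectors fit into AVAILABLE.
fits() {
    [ "$1" -le "$2" ]
}

requested_swap=4194304   # from the "gpart add -t freebsd-swap ... -s 4194304" log line
whole_partition=4001692  # sector count newfs reported for ada1p1

echo "swap request: $(sectors_to_mib "$requested_swap") MiB"
echo "usable space: $(sectors_to_mib "$whole_partition") MiB"
fits "$requested_swap" "$whole_partition" || echo "swap request is larger than the card"
```

Under that assumption the swap request alone works out to 2 GiB, which leaves no room on a 2GB card for anything else, and that would explain the "No space left on device" message.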

Keep checking back for the next installment, where I document the creation of the rsyncd.conf file in the persistent local storage area, the initialization of rsyncd, and the resumption of the daily job to keep NAS1 backed up to NAS2.

Thursday, May 26, 2011

Might start expanding content here soon...

Ok guys, I know that this blog has been focused on technical pursuits thus far, and with a title like "RF and IP" I guess you could have figured that out. I am thinking about expanding that content to a general log of activities in my life outside of the technical pursuits you have come to expect here. What I am not sure about is whether that's a good idea or whether I should start another blog for general interest stuff and keep this one technical (geeky) in nature. I would appreciate your comments on this since you are the ones to be affected by the decision.

Thanks,
Mike

Thursday, March 17, 2011

Full mesh dial peers script

I wrote the following script to support the automated construction of a full-mesh configuration of dial-peers in Cisco IOS.

The original driver for this was a scalability issue with Cisco's Call Manager Express (CME) feature: each time a new site, or a new range of DIDs for an existing site, was added to the network, every other router in the network had to be touched to add a dial-peer for the new DID range. This isn't really an issue when you have a Cisco Unified Communications Manager (CallManager) managing everything and acting as a centralized directory server. However, when you have no central source of directory information (like when you are trying to save money and implementing VoIP on a very tight budget), you need some efficient way to keep all of the routers in the network educated about all of the DID ranges in the network. The script below helped me do that and works great for my needs.

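#!/bin/sh
# Build one dial-peer config file per router listed in dp_list.txt.
# Each file gets a dial-peer for every row that belongs to a different router.

VCC="voice class codec 1000
codec preference 1 g711ulaw
codec preference 2 g729r8 bytes 30
codec preference 3 g729br8 bytes 30"

# Outer loop: one pass per input row; only the hostname (field 2) is used here.
while read -r a target_router b c
do
 # Skip blank lines so no "_dp_config.txt" file is created.
 [ -n "${target_router}" ] || continue
 outfile="${target_router}_dp_config.txt"
 echo "!" > "${outfile}"
 echo "${VCC}" >> "${outfile}"
 echo "!" >> "${outfile}"
 # Inner loop: every row whose hostname is not exactly the target router.
 # (grep -v would also drop rows that merely contain the name as a substring.)
 awk -v h="${target_router}" '$2 != h' dp_list.txt | while read -r sequence hostname pattern ipv4_target
 do
  echo "dial-peer voice ${sequence} voip" >> "${outfile}"
  echo "descrip SEQ ${sequence} FOR ${pattern} TO ${hostname} AT ${ipv4_target}" >> "${outfile}"
  echo "destination-pattern ${pattern}" >> "${outfile}"
  echo "progress_ind setup enable 3" >> "${outfile}"
  echo "voice-class codec 1000" >> "${outfile}"
  echo "voice-class h323 1" >> "${outfile}"
  echo "session target ipv4:${ipv4_target}" >> "${outfile}"
  echo "dtmf-relay h245-alphanumeric" >> "${outfile}"
  echo "ip qos dscp ef media" >> "${outfile}"
  echo "ip qos dscp af41 signaling" >> "${outfile}"
  echo "!" >> "${outfile}"
 done
 echo "end" >> "${outfile}"
 echo "" >> "${outfile}"
done < dp_list.txt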

Ok, so for this script you need to supply an input file named dp_list.txt, with one row per DID range and the following tab-delimited fields:

  1. Sequential dial-peer numeric identifier. I start mine at 1000000 and increment by 10.
  2. Destination router hostname
  3. DID/Number range (regex encouraged!)
  4. IPv4 address of destination router

Here is a sample of that input:
1000270 router027 91859960.. 192.168.1.27
1000280 router028 9185826953 192.168.1.28
1000290 router029 9185831071 192.168.1.29
1000300 router030 86381680.. 192.168.1.30
1000310 router027 86381681.. 192.168.1.27
1000320 router029 863686[0-3][1-4].. 192.168.1.29
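Since each row turns into a dial-peer tagged with its sequence number, it may be worth checking the input for accidental duplicates before generating anything. The following is a small sketch of such a check, not part of the original script. The function name duplicate_sequences is made up, and it assumes the four-field layout described above.

```shell
#!/bin/sh
# duplicate_sequences FILE
# Print every sequence number (field 1) that appears on more than one row.
# Prints nothing when all sequence numbers are unique.
duplicate_sequences() {
    awk 'NF >= 4 { count[$1]++ }
         END { for (s in count) if (count[s] > 1) print s }' "$1" | sort
}
```

Running `duplicate_sequences dp_list.txt` on the sample above should print nothing, because all six sequence numbers are different.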

In this example, you can see that additional ranges were added to sites 27 and 29. This is no problem for the script; it does the right thing. The most important part is keeping the sequence number unique. You could even make it more intuitive by embedding the site number into the sequence number, like 1XXXXYY, where XXXX is your site number (assuming you have <10k sites) and YY is the index of the DID/number range within the site (assuming <=100 ranges per site). This would turn the sequence numbers above from 1000290 into 1002900 and from 1000320 into 1002901. You get the idea.
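The 1XXXXYY numbering can be produced with printf rather than by hand. This is only a sketch of the idea, with a hypothetical function name. It expects the site number and the per-site range index as plain decimal numbers without leading zeros, since printf may read a leading zero as octal.

```shell
#!/bin/sh
# site_sequence SITE RANGE_INDEX
# Build a 1XXXXYY sequence number: a literal 1, the site number padded
# to four digits, then the range index padded to two digits.
# Pass plain decimals (29, not 0029) to avoid octal interpretation.
site_sequence() {
    printf '1%04d%02d\n' "$1" "$2"
}
```

For site 29, `site_sequence 29 0` and `site_sequence 29 1` give the two values from the example, 1002900 and 1002901.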

Of course, this can be extended in any way that meets your needs. The important part of the script is that it builds a config for each hostname that includes all rows from the input file that belong to any other hostname. Once the configurations are built, you can use SNMP & TFTP to get them loaded to each router in the network. Further still, using cron to run the script & TFTP load on a regular basis will always keep everything in sync. Let the machines do the work for you!
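If the script runs from cron, one possible refinement is to publish only the configs that actually changed, so routers are not reloaded for nothing. The sketch below is an assumption-laden illustration and not the setup described here. The function name and both directory arguments are hypothetical, and the SNMP step that tells each router to fetch its file is left out entirely.

```shell
#!/bin/sh
# publish_configs SRC_DIR TFTP_DIR
# Copy each *_dp_config.txt from SRC_DIR into TFTP_DIR, but only when the
# file is new or differs from the copy already there. Prints the name of
# every file it copies, so the caller knows which routers need a reload.
publish_configs() {
    for f in "$1"/*_dp_config.txt; do
        [ -f "$f" ] || continue
        dest="$2/$(basename "$f")"
        # cmp -s is silent and returns non-zero for differing or missing files.
        if ! cmp -s "$f" "$dest" 2>/dev/null; then
            cp "$f" "$dest" && basename "$f"
        fi
    done
}
```

A cron job could run the generator and then call something like `publish_configs . /tftpboot`, where /tftpboot stands in for wherever your TFTP server keeps its files.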

Leave me comments if you like, don't like, have ideas for improvement, etc. If you want, also give a visit to an advertiser.