I recently came across a MikroTik RouterBoard 493AH at work. We’d acquired the device among numerous other pieces of equipment from a now defunct wireless ISP.
The 493AH features 9 ethernet ports, can accept POE on its WAN interface, has 64M NAND, 128M RAM, and can support 3 mini-PCI cards. Configuration can be performed via a serial interface and there is an external power connector available if POE isn’t used.
The device itself wouldn’t boot, only hang at the RouterBoot bootloader. Attempts to boot the NAND image failed, but the bootloader gives an easy option for downloading an image to it via TFTP.
Looking around, it appears the 493AH is an Atheros AR7161 .. an architecture readily supported under OpenWRT. Sure, I could just re-install RouterOS… but let’s do that later.
To install OpenWRT to the 493AH, first format the NAND. This can easily be done via the bootloader (option e):
Next, use subversion to check out the Backfire version of OpenWRT:
mkdir ~/svn/
cd ~/svn/
svn co svn://svn.openwrt.org/openwrt/branches/backfire backfire
Building the image is fairly easy, all configuration is done via a “make menuconfig“. First, we’ll build a small initramfs. This will give us a tiny environment to boot into the device and later install our kernel with.
Ensure that you’ve selected the AR71xx target architecture…
Next, I opt for the default profile (to give me all the modules I should need)
And finally, select the build of a ramdisk image:
After you’ve made your selections, exit saving your changes, and run make. The build itself will take some time, but when you’re finished you’ll have the first key ingredient – a basic root filesystem embedded into the kernel. This image is essentially a “Live CD” that we’ll use to install our real kernel.
As with all of the images you create, you’ll find them under ~/svn/backfire/bin/ar71xx/ Our newly created image is openwrt-ar71xx-vmlinux-initramfs.elf
Next, we’ll want to build our actual system. To do this, re-run make menuconfig and select the packages that you wish to compile and include in your firmware image. After you’ve made all of your selections, change your Target Image to squashfs and exit saving your changes.
A quick make later, and we now have a working rootfs and kernel – in addition to our initramfs to install the system with:
To install our kernel, we need a few additional tools. First off, we need to configure a DHCP server (I’m using ISC’s). Here’s an example from my dhcpd.conf file:
Power up the RouterBoard and quickly press the space bar. Select “boot over Ethernet” and it will download and boot the linux kernel.
Next ,we need to install the kernel and root filesystem. Here’s where I ran into my first problem – the kernel has no init variable specified so it panics. Thankfully it clearly states this: “Kernel panic – not syncing: No init found. Try passing init= option to kernel.”
Unfortunately the boot loader doesn’t appear to allow one to specify command line options for the kernel and I was unable to find a way to set this when configuring the kernel. (I vaguely recall seeing it when compiling for x86, but may be mistaken). Either way, the solution is simple:
Add your kernel parameters to a file (kernel-params in my instance) and use objcopy to insert it into the ELF file:
The toolchain supplied with OpenWRT contains a MIPs compatible version of objcopy that will allow you to add a kernel parameters section to the ELF file:
To install the kernel, configure an IP on your ethernet (or bridge) interface, mount /dev/mtdblock1 and use scp to copy your kernel to the device (as “kernel”).
After the root filesystem is installed, reboot the device and welcome to OpenWRT on the RouterBoard 493AH
I’m not quite sure what I’ll end up doing with the 493AH just yet. The neighborhood wireless system now consists of 2 Engenius EOC2610 units running firmware images based off OpenWRT… so there may be the potential to add it to the fray. The 9 ethernet ports would make it ideal for a Quagga router (although I already have one). Installing the MikroTik RouterOS and working with MPLS is another options. Right now it sits on my desk at work as a “pretty cool paperweight with a lot of potential”.
I’ve been using KVM more and more frequently in the past year (with nearly 25 virtual guests in production). While there are graphical user interfaces for KVM out there, I’ve yet to see one that supports DRBD replication (although one may exist). For that reason, I’ve basically built my Virtualization cluster using a handful of open-source tools “by hand”.
KVM has the capability to run both Windows and Linux operating systems (as well as numerous others), supports both CPU based virtualization and para-virtualization, and has virtual drivers that can run in the guest instance speeding up disk and network IO as well as balloon memory drivers to reduce the actual utilization of memory on the host machine.
I’ll stop here and say this – For home use, I’d probably suggest avoiding anything laid out in this article and simply use Virtualbox. I use it on my desktop extensively and have for many years, but for this article I’m focusing on building a cluster with KVM.
With the absence of a proper Storage Area Network (SAN), I’m utilizing DRBD (Distributed Replicated Block Device) to provide VM disk replication across both virtual nodes. This allows for live migration of a guest from front-end node to front-end node. Additionally, this architecture will still allow for replacement of storage OR supplementing of storage with a SAN in the future.
DRBD replicates individual LVM volumes (and not the raid array as a whole) across my 2 host nodes. Each virtual guest has it’s own logical volume assigned to it, which is accessed via DRBD block device interfaces (/dev/drbd<number>).
In the example image above, jabber0 and web0 (virtual “guests“) are running on virtual0 (a virtual “host” or “node“), with web1 (another “guest“) running on virtual1 (a virtual “host” or “node“). The DRBD resource is set to Primary mode on the virtual host connected to a running guest, with the disk being copied to the Secondary (the non-running virtual host) Primary mode allows the virtual host (and it’s guests) to access the DRBD resource (and read/write from the connected logical volume).
As far as a virtual guest is concerned, there is no DRBD, only a /dev/vda or /dev/sda device.
Only during live-migration should the DRBD resources on both virtual hosts be placed into Primary (a situation called Dual Primary). As one virtual guest is paused prior to the other going active, data corruption will not occur.
Each node is presently a Dell PowerEdge 2950 with 32G of memory and over 1 Terrabyte of storage. With the DRBD replication this gives approximately 1 Terrabyte of storage (and not a combined 2 Terrabytes).
Each node has 4 gigabit ethernet interfaces.
Interface
Purpose
eth0
Administrative Access
eth1
DRBD Replication
eth2
Connected to the world_br0 bridge for guest host routing
eth3
Connected to the world_br1 bridge for guest host routing
There are presently three ethernet bridges on each node:
Bridge Interface
Purpose
kickstart_br0
Used to kickstart guest machines
world_br0
Used to connect guest machines to the public network
world_br1
Used to connect guest machines to the public network
Connecting to a Guest:
Each guest is accessible via standard means (ssh) when configured correctly. Additionally, one can connect to each guest by VNCing to a unique port on the virtual host. (I do maintain a list of which DRBD ports and VNC ports are used for each of my virtual guests)
Configuring an LVM volume:
The “vmdisks” LVM volume group is approximately 1.3TB of disk storage, used to provide individual volumes to the guest instances. I use a logical volume of 30G for most guests.
To add a logical disk for guest usage is simple – and Must be done uniformly across all nodes:
lvcreate -L <size>M -n <name> vmdisks
Initial Configuration of the DRBD MD device:
The DRBD MD device is the actual block device that the Guest machine will interface with.
The following MUST BE DONE ACROSS ALL NODES, however only upon initial creation:
Update the /etc/drbd.conf file to add your new node (here’s an example):
resource <resource name>
{
net
{
allow-two-primaries;
}
on virtual0.braindeadprojects.com
{
device /dev/drbd<Next Available block>;
disk /dev/vmdisks/<LVM Volume Group>;
address 10.255.255.1:<Next available port>;
meta-disk internal;
}
on virtual1.braindeadprojects.com
{
device /dev/drbd<Next available block>;
disk /dev/vmdisks/<LVM Volume Group>;
address 10.255.255.2:<next available port>;
meta-disk internal;
}
}
After updating the config, create the block device and enable it:
#drbdadm create-md <resource name>
#drbdadm up <resource name>
At this point, all nodes have a record of this DRBD resource. /proc/drbd will have additional information.
The following must be done ONLY ON THE PRIMARY (MASTER) NODE:
This will begin an initial synchronization across the nodes. Again, this is only run on the “Master node” (the virtual host node that is initially running the VM guest). At this time, the DRBD resource is available on ALL nodes, however until the synchronization is finished, reads/writes will take slightly longer.
An important note on synchronization:
The syncer{} stanza in the resource config plays an important role in how fast a drive is synchronized. Default sync speed is roughly 340K/sec, which in turn causes a drive synchronization of a 30G drive to take appx 28 hours.
This can safely be set to 33M/sec in my environment, reducing sync-time to roughly 20 minutes, depending upon load.
Sync rate will play an important factor in instances where an entire node fails, and the resources of the failover node cannot keep up. In such an event, a 3rd node should be added to the system, with drives synced to it.
Creating the VM Guest:
I’m utilizing libvirt as a convenient way to provision and manage virtual machines.
Creation of a guest is fairly easy, and can be done interactively or via a one-liner:
#virt-install –connect qemu:///system -n <Guest Name> -r <RAM in MB> –vcpus=1 \
–arch=<i686|x86_64|…> –vnc –vncport=<unused VNC port number> –noautoconsole –os-type linux –accelerate \
–network=bridge:<kickstart_br0|world_br0|world_br1> –hvm –disk path=/dev/drbd<resource number>,bus=virtio \
–<pxe|import|cdrom>
After which time the guest will automatically start, with it’s vnetX interface bridging with kickstart_br0.
I’ve installed DNSMasq on each host machine. It sits atop the kickstart_br0 interface, and assigns the VM an IP in the 192.168.6.0/24 network (via DHCP), and PXE boots/kickstarts it off a mirroring server. (The 192.168.6.0/24 network is MASQUERADEd in iptables so requests appear to come from virtual[01])
After kickstarting the guest, the reboot process tends to shut down the virtual guest so it may need to be restarted (Normal reboots are not prone to this shutdown). Once restarted, server configuration can be done via ssh from the host node, or VNC.
Once the machine is built, customized and ready to be placed on the Internet, power down the VM guest and edit the XML config file (replacing kickstart_br0 with world_br0 or world_br1). If you find that the VM guest attempts to PXE boot once again, you may need to also change the boot device order (specifying hd instead of network)
You will also want to adjust the clock to source itself from the host machine.
# virsh
Welcome to virsh, the virtualization interactive terminal.
Type: ‘help’ for help with commands
‘quit’ to quit
virsh # edit <guestname>
…
…
<os>
<type arch=’x86_64′ machine=’rhel5.4.0′>hvm</type>
<boot dev=’network’/>
<boot dev=’hd’/>
</os>
…
…
<interface type=’bridge’>
<mac address=’54:52:00:2d:21:10’/>
<source bridge=’kickstart_br0’/>
<target dev=’vnet1’/>
</interface>
…
…
<clock offset=’localtime’/>
…
…
I’ve made sure to install virt-top, an interface to the hypervisor similar to the “top” command. This gives a nice overview of the system:
#virt-top
The shell API for libvirt makes manipulating guest instances easy. Here are a few of the more frequently used virsh commands:
#list <–all> (Lists running and non-running guests)
#start <guestname> (Starts guest instance)
#autostart <guestname> (Marks guest to be autostarted at node boot)
#destroy <guestname> (Immediately powers off guest)
#suspend <guestname> (Powers down guest gracefully)
#reboot <guestname> (Reboots guest)
#edit <guestname> (Edits the guest XML config)
#migrate (See the migration section for more info)
Live migration:
Live migration between nodes can be done via ssh (with shared keys) or TLS. I’m currently utilizing the ssh method:
Prior to migration, the DRBD resource needs to be place in Primary on both nodes:
#drbdadm primary <resource name>
After doing so, the following is run on the SENDING node only:
As part of the migration process, the sending node copies memory and kernel state via ssh to the receiving node. During the migration process, the guest instance on the sending node is active, with the guest node being marked as paused.
Once information is migrated, the sending node pauses it’s guest instance, with the receiving node un-pausing. Actual migration depends upon a number of factors, although is generally accomplished in under 35 seconds.
Following the migration, it’s essential to place the losing node into DRBD secondary mode. Should I accidentally try to start the guest on the losing node, this will prevent it from obtaining access to the disk (and corrupting data):
#drbdadm secondary <resource name>
Virtualizing Physical Machines:
Virtualizing a physical machine is extremely easy. Instead of PXE booting and kickstarting an install (–pxe), I use the –cdrom /usr/src/systemrescuecd-x86-1.6.2.iso flag when creating the virtual guest. On each virtual host, I have a copy of the excellent Gentoo based SystemRescueCd.
When the system boots into the live CD, partition the drive (usually /dev/vda or /dev/sda) as you wish (taking advantage of LVM for non-boot partitions if possible).
Create a /mnt/migrate directory from the live cd, and mount your newly created partitions there.
mount /dev/sda2 /mnt/migrate
for dir in {boot,proc,dev}; do mkdir /mnt/migrate/$dir; done
mount /dev/sda1 /mnt/migrate/boot
(Do the same for /var and any other directories you have partitioned separately)
Utilizing rsync over ssh, synchronize all files from the physical host to the virtual one (taking care that you perform the action correctly, so as not to overwrite the original server). A handful of files and directories NEED TO BE OMITTED, namely:
/proc
/dev
/sys
/lost+found
(possibly others)
I generally use an rsync command similar to this one:
The following devices need to be present in order to boot into the OS. Either rsync them or manually make them with mknod.
/dev/console
/dev/initctl
/dev/null
/dev/zero
Another easy way to accomplish this is:
for file in {console,initctl,null,zero}; do cp -a /dev/$file /mnt/migrate/dev/$file ; done
Following the rsync, the virtual guest will need a bootloader and an updated initial ramdisk. Both of these are best done in a chroot environment:
mount -o bind /dev/ /mnt/migrate/dev/
mount -t proc none /mnt/migrate/proc/
mount -o bind /sys/ /mnt/migrate/sys/
chroot /mnt/migrate/ /bin/bash
Inside the chroot environment, you will need to update /etc/mtab and /etc/fstab to reflect your new partitioning (at the very least drives will likely change to /dev/vda). You will also need to update /boot/grub/device.map to reflect hd0 as a valid drive.
Once these changes have been made, grub installation should be as simple as:
# grub
grub> root (hd0,0) <– where hd0,0 is the first partition on the first drive
grub> setup (hd0) <– install grub on the mbr of the first drive
grub> quit
With the Bootloader installed, we need to create a working initial ramdisk with necessary drivers. Look for the most recent installed kernel in grub.conf and create the ramdisk (replace the version numbers with yours):
Please be aware that many files (IE: databases) should only be rsynced when shutdown. For this reason, it’s often best to shutdown MySQL, PostgreSQL, and various other services PRIOR TO RSYNCing to prevent database corruption.
How to physicalize a virtual machine:
In the event of a major issue, converting a virtual machine back to a physical machine has the same process as physical to virtual, but reversed.
Of Note:
While Fedora currently supports (automatically, out of the box) SELinux profiles/labels for KVM guest instances, Centos 5.6 does not. It will be incorporated in Centos 6, however… and I plan on migrating to that OS eventually.
Final Thoughts:
As with everything, there’s pros and cons to this methodology.
While I’ve always preferred avoiding GUIs, the fact is they standardize on what steps happen in which order (limiting the potential for user-induced errors.)
A high performance SAN (or perhaps an OpenFiler box) would make things much easier to configure and migrate, but at the same time introduce a possible single point of failure.
Utilizing an automation engine (like puppet) could limit the number of steps needed to provision a virtual guest across all nodes.
Outside of some possible restrictions (virtio drivers being specific to KVM, LVM2 support for Windows), migrating from the present day system to vmware, virtualbox, or <insert your favorite hypervisor here> — should be fairly easy, requiring simply creating a guest and pointing it at an LVM share.
All in all, the system has been in production for nearly a year now and is performing beautifully. And best of all, I’m saving on power and generating less heat.
I’ve been quite annoyed recently with my video card, the “nVidia Corporation GeForce 8400 GS (rev a1)“. A number of sites using Flash tend to bleed through Firefox or Chrome and into other tabs or even other workspaces.
I’ve upgraded the nvidia-drivers a number of times, never actually fixing the problem. Other Gentoo users on the #gentoo channel of freenode have suggested migrating to gnash instead… and while I have contemplated this, I’ve noticed a number of things that don’t work well under gnash on my netbook.
Thankfully I’m not the only person to be experiencing this. Earlier today I came across a solution (that while not optimal) definitely fixes the problem:
Disabling hardware acceleration thankfully stops the bleedthrough. (Just right click on a flash movie, select “Settings” and disable acceleration under “display”) You will need to restart your browsers for it to fully take effect.
I was hoping to have time to further familiarize myself with IPv6 this weekend – but workplace emergencies, sleep, and a technical glitch slowed me down. Here’s the story of the technical glitch.
GNS3/dynamips can handle emulation of a number of routers, and I was happy to see that it can emulate host PCs as well. Unfortunately networking in GNS3 is done via UDP tunnels – something that requires a Qemu patch to implement.
In Gentoo, simply building GNS3 from the Sunrise Overlay doesn’t provide sufficient support for this requirement. Emulated routers (and I’m assuming switches) work fine when networked together. Emulated host machines work fine when NOT connected to anything. But once you network things together you’ll discover that your Qemu host won’t start.
Here’s why – when you start a Qemu host, this is essentially what GNS3 runs:
GNS3 attempts to use the -net udp option, something not offered in the default QEMU-KVM distribution. With no Gentoo USE flags to add the patch in either 0.13.0 or 0.14.50, I decided to simply upgrade to 0.14.50 and see if it was part of the default build: