In the next few days, I'll be upgrading my north-facing access point from Engenius to Ubiquiti. The firmware is already written, and I have nearly everything prepped for the rooftop mount. Before I post about working with the (hidden) Ubiquiti 5.3 SDK, I thought I'd give a quick tour of my system so far. When I first set out, the goals were:
Build a stable and (mostly) embedded captive portal system with a minimal ToS acceptance screen.
Let the surrounding neighborhood use the Internet for free in exchange for helping me build and test the system.
Use the system as a way to introduce neighbors, let them post local interest items (missing pets, crime reports, events, etc).
Provide maps of recently reported crimes via the Harrisburg, PA online Police Blotter.
Over the years I've accomplished all of this, to one degree or another. Harrisburg, PA is in the midst of some serious financial problems, so their online police blotter has gone away, preventing me from easily obtaining local crime information. People are what they are, and as Google+, MySpace, and every other social site knows, getting people to truly use your social portal is a trick that requires sheer genius. Getting them to log into it and push a "Free Wifi" button, however, is easy.
How it works:
After connecting to one of the open access points, the end user is redirected (courtesy of a patched NoDogSplash) to a captive web portal. The web portal is based on Elgg, a fairly easy to use social network engine written in PHP. I've made a few modifications to the base system, adding a more recent jQuery and jQuery UI (so that I can create interactive dialogs) and writing a few plugins to handle Netflow display, per-user wireless signal strength reports, and user speed tests, and to verify that a user has a picture set before allowing them to use the Free Wifi.
By nature, people won’t set a profile picture when all they want is Free Wifi. I had to enforce a profile picture (“it doesn’t have to be you, it can be anything non-offensive”) to make the site NOT appear like a barren wasteland.
I eventually limited account creation strictly to the access points as registrations from outside those IPs were mostly just spam.
After a user creates an account and logs in, they are directed to the “Dashboard”, which is a listing of recent posts from any of the users. Most are quick “Hey you!”, but sometimes people post something more substantive. When my rear car window was broken, I used the system as a venting forum.
I've consolidated most of my customizations relating to the wireless users into a single Elgg plugin I named "TSA Patdown". Initially TSA Patdown only verified that a user had a profile image set, but now it does quite a bit more. Every 30 seconds I export the Received Signal Strength Indication (RSSI) for each client from the Engenius equipment. I collect this information, along with results from a JavaScript-based speedtest widget I wrote, to get an idea of what kind of online experience each user is having.
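The collection itself is just a small poller. Here's a minimal sketch of the idea, assuming an AP running a mac80211 driver with the iw utility available; the hostnames, interface name, and log path are placeholders, and my real exporter feeds the Elgg plugin rather than a flat file:
#!/bin/sh
# Poll each AP over ssh and append per-client RSSI readings to a log.
APS="ap-north ap-south"
while true; do
    for ap in $APS; do
        # "station dump" prints one block per associated client,
        # including its MAC address and signal strength in dBm.
        ssh root@$ap "iw dev wlan0 station dump" | \
            awk -v ap="$ap" '/^Station/ { mac=$2 } /signal:/ { print ap, mac, $2 }' \
            >> /var/log/midtown-rssi.log
    done
    sleep 30
done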
I present this information to myself in the following menu, with signal bars that I created using Blender:
I can further delve into information on a per-user basis by simply clicking on a name. I can also pull a full neighborhood report, graphing each client's RSSI values as well as their recent speedtest results.
Being implemented in JavaScript, my speedtest doesn't produce the same results you'd see from a Flash-based test. The standard web method of performing such a test is to have the end user download an image file or two (oftentimes two images simultaneously) and, at intervals, measure how much of the image has been downloaded so far. By sampling a single download at multiple points, a test can determine available bandwidth much more accurately. Since there's additional overhead in the underlying TCP/IP layers, it appears most tests also add padding to their calculations to make things more accurate.
Flash has methods that allow for such periodic sampling; JavaScript does not. This makes my JavaScript implementation an overall average, so a report of 900 Kbit/sec can easily represent 1.5 Mbit/sec. (My results are much more akin to what Wireshark will report as throughput.) I do plan to write a Flash-based speedtest in the near future.
In this example, the capture in Wireshark measures the throughput as 21.22Mbit/sec, nowhere near the 52.37Mbit/sec rating given by Speakeasy. The recent throughput information is all displayed in the signal screen:
The Netflow section of my TSA Patdown plugin details the current traffic flow on the network. This screen updates dynamically as users surf the Internet. (I'll reiterate my past posts here: the netflow data is only packet endpoints, basically "this person called this person at this time", but not the actual content of those conversations.) I've also added a small port-based protocol dissector that colorizes the flows and provides protocol information depending upon the packet you select. If you choose a NetBIOS packet, you'll get something similar to this:
The system monitors for NetBIOS names as well as DHCP hostnames that appear on Midtown Wifi. All of this information comes together to paint an accurate view of the network.
Clicking a Protocol Name (in this instance NetBIOS) will direct you to a Wikipedia article on the protocol and how it works. Unclassified protocols can be classified and colorized with a simple click. You can also specify the URL to load when the protocol name is clicked.
The pie charts, RSSI graphs, and throughput graphs are all handled using the PHP JPGraph libraries. In the future, I intend to improve the graphs (my labels tend to bleed off-screen or over each other).
The access points share their own ADSL line for bandwidth but maintain individual PPPoE sessions. The wiring in my home needs improvement (the house was built in the 1800s; the Cat5 running through it is obviously not that old, but does have some serious issues). Most of the exterior walls appear to be metal, which does hinder re-running the DSL line a bit.
I recently migrated my home network graphing from NetMRG to Cacti, and I'm using Cacti's (albeit poor) FTP export function to offload graphs pertaining to Midtown Wifi to the captive portal.
As you can see in the graphs, the system currently has 175 subscribers. I have deleted the bogus accounts that weren't created through the APs. The high number of subscribers is largely the result of transient users (my home is on a major bus line, rental homes in the area turn over somewhat frequently, the local college is blocks away, etc.). A couple of users have duplicate accounts, having apparently lost their credentials (as evidenced by a few repeat MAC addresses).
To put the large number into proper perspective: in the last 7 days there were 157 logins by 18 unique users. Unlike myself, most of the users don’t spend every waking moment on the Internet.
I’ve covered the access points and the firmware images in a number of previous posts, so I’ll let them speak for themselves. In the next few days (hopefully not weeks), I’ll be introducing my first Ubiquiti access point to the system with full details posted then. If you have any thoughts or input, by all means reach me in the comments section.
I recently came across a MikroTik RouterBoard 493AH at work. We’d acquired the device among numerous other pieces of equipment from a now defunct wireless ISP.
The 493AH features 9 ethernet ports, can accept PoE on its WAN interface, has 64MB of NAND and 128MB of RAM, and can support 3 mini-PCI cards. Configuration can be performed via a serial interface, and an external power connector is available if PoE isn't used.
The device itself wouldn't boot; it would just hang at the RouterBOOT bootloader. Attempts to boot the NAND image failed, but the bootloader gives an easy option for downloading an image to it via TFTP.
Looking around, it appears the 493AH is built on an Atheros AR7161, an architecture readily supported under OpenWRT. Sure, I could just re-install RouterOS… but let's do that later.
To install OpenWRT to the 493AH, first format the NAND. This can easily be done via the bootloader (option e):
Next, use subversion to check out the Backfire version of OpenWRT:
mkdir ~/svn/
cd ~/svn/
svn co svn://svn.openwrt.org/openwrt/branches/backfire backfire
Building the image is fairly easy; all configuration is done via "make menuconfig". First, we'll build a small initramfs. This will give us a tiny environment to boot the device into, which we'll later use to install our kernel.
Ensure that you’ve selected the AR71xx target architecture…
Next, I opt for the default profile (to give me all the modules I should need)
And finally, select the build of a ramdisk image:
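From memory, the relevant selections in the Backfire menuconfig look roughly like this (treat the exact menu labels as approximate):
Target System (Atheros AR71xx/AR724x/AR913x)  --->
Target Profile (Default)  --->
Target Images  --->
    [*] ramdisk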
After you’ve made your selections, exit saving your changes, and run make. The build itself will take some time, but when you’re finished you’ll have the first key ingredient – a basic root filesystem embedded into the kernel. This image is essentially a “Live CD” that we’ll use to install our real kernel.
As with all of the images you create, you'll find them under ~/svn/backfire/bin/ar71xx/. Our newly created image is openwrt-ar71xx-vmlinux-initramfs.elf.
Next, we’ll want to build our actual system. To do this, re-run make menuconfig and select the packages that you wish to compile and include in your firmware image. After you’ve made all of your selections, change your Target Image to squashfs and exit saving your changes.
A quick make later, and we now have a working rootfs and kernel – in addition to our initramfs to install the system with:
To install our kernel, we need a few additional tools. First off, we need to configure a DHCP server (I’m using ISC’s). Here’s an example from my dhcpd.conf file:
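(The subnet and addresses here are placeholders; you'll also need a TFTP server exporting bin/ar71xx so that the file named below can actually be fetched.)
allow bootp;
subnet 192.168.1.0 netmask 255.255.255.0 {
    range dynamic-bootp 192.168.1.100 192.168.1.150;
    option routers 192.168.1.1;
    next-server 192.168.1.1;
    filename "openwrt-ar71xx-vmlinux-initramfs.elf";
}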
Power up the RouterBoard and quickly press the space bar. Select “boot over Ethernet” and it will download and boot the linux kernel.
Next, we need to install the kernel and root filesystem. Here's where I ran into my first problem: the kernel has no init parameter specified, so it panics. Thankfully it clearly states this: "Kernel panic - not syncing: No init found. Try passing init= option to kernel."
Unfortunately, the bootloader doesn't appear to allow you to specify command-line options for the kernel, and I was unable to find a way to set this when configuring the kernel. (I vaguely recall seeing it when compiling for x86, but may be mistaken.) Either way, the solution is simple:
Add your kernel parameters to a file (kernel-params in my instance), then use objcopy to insert them into the ELF file as an extra section. The toolchain supplied with OpenWRT contains a MIPS-compatible version of objcopy that will do the job:
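Here's a sketch of the fix; the toolchain binary name and path vary with your build, and the section name (kernparm, which is what I recall RouterBOOT reading its parameters from) should be verified against your tree:
echo 'init=/etc/preinit' > kernel-params
# the MIPS objcopy lives in the build's staging_dir; the exact directory depends on your gcc/uClibc versions
./staging_dir/toolchain-mips_*/usr/bin/mips-openwrt-linux-uclibc-objcopy \
    --add-section kernparm=kernel-params \
    bin/ar71xx/openwrt-ar71xx-vmlinux-initramfs.elf
# netboot the patched image again and the kernel should find init and come up cleanly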
To install the kernel, configure an IP on your ethernet (or bridge) interface, mount /dev/mtdblock1 and use scp to copy your kernel to the device (as “kernel”).
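Roughly, the steps on my unit looked like the following; the mtd partition numbering is from memory, so check /proc/mtd first, and note that yaffs2 support has to be present in your build for the NAND partition to mount (hostnames and addresses are placeholders):
On the netbooted RouterBoard:
passwd                         # set a root password so dropbear will accept scp connections
ifconfig br-lan 192.168.1.10   # any address reachable from the build host
cat /proc/mtd                  # confirm which partition is the RouterBOOT "kernel" partition
mount /dev/mtdblock1 /mnt
From the build host:
scp bin/ar71xx/openwrt-ar71xx-vmlinux.elf root@192.168.1.10:/mnt/kernel
The root filesystem is written to the second NAND partition in much the same fashion.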
After the root filesystem is installed, reboot the device, and welcome to OpenWRT on the RouterBoard 493AH.
I'm not quite sure what I'll end up doing with the 493AH just yet. The neighborhood wireless system now consists of 2 Engenius EOC2610 units running firmware images based on OpenWRT… so there may be the potential to add it to the fray. The 9 ethernet ports would make it ideal for a Quagga router (although I already have one). Installing MikroTik RouterOS and working with MPLS is another option. Right now it sits on my desk at work as a "pretty cool paperweight with a lot of potential".
I’ve been using KVM more and more frequently in the past year (with nearly 25 virtual guests in production). While there are graphical user interfaces for KVM out there, I’ve yet to see one that supports DRBD replication (although one may exist). For that reason, I’ve basically built my Virtualization cluster using a handful of open-source tools “by hand”.
KVM can run both Windows and Linux operating systems (as well as numerous others), supports both hardware (CPU-assisted) virtualization and para-virtualization, and provides virtio drivers that run in the guest to speed up disk and network I/O, as well as a memory balloon driver to reduce actual memory utilization on the host machine.
I’ll stop here and say this – For home use, I’d probably suggest avoiding anything laid out in this article and simply use Virtualbox. I use it on my desktop extensively and have for many years, but for this article I’m focusing on building a cluster with KVM.
In the absence of a proper Storage Area Network (SAN), I'm utilizing DRBD (Distributed Replicated Block Device) to replicate VM disks across both virtualization nodes. This allows for live migration of a guest from one front-end node to the other. Additionally, this architecture still allows storage to be replaced by, or supplemented with, a SAN in the future.
DRBD replicates individual LVM volumes (and not the RAID array as a whole) across my 2 host nodes. Each virtual guest has its own logical volume assigned to it, which is accessed via a DRBD block device (/dev/drbd<number>).
In the example image above, jabber0 and web0 (virtual "guests") are running on virtual0 (a virtual "host" or "node"), with web1 (another "guest") running on virtual1. The DRBD resource is set to Primary mode on the virtual host running the guest, with the disk being copied to the Secondary (the non-running virtual host). Primary mode allows the virtual host (and its guests) to access the DRBD resource and read/write from the connected logical volume.
As far as a virtual guest is concerned, there is no DRBD, only a /dev/vda or /dev/sda device.
Only during live-migration should the DRBD resources on both virtual hosts be placed into Primary (a situation called Dual Primary). As one virtual guest is paused prior to the other going active, data corruption will not occur.
Each node is presently a Dell PowerEdge 2950 with 32G of memory and over 1 terabyte of storage. With the DRBD replication this gives approximately 1 terabyte of usable storage (and not a combined 2 terabytes).
Each node has 4 gigabit ethernet interfaces.
Interface   Purpose
eth0        Administrative Access
eth1        DRBD Replication
eth2        Connected to the world_br0 bridge for guest host routing
eth3        Connected to the world_br1 bridge for guest host routing
There are presently three ethernet bridges on each node:
Bridge Interface   Purpose
kickstart_br0      Used to kickstart guest machines
world_br0          Used to connect guest machines to the public network
world_br1          Used to connect guest machines to the public network
Connecting to a Guest:
Each guest is accessible via standard means (ssh) when configured correctly. Additionally, one can connect to each guest by VNCing to a unique port on the virtual host. (I do maintain a list of which DRBD ports and VNC ports are used for each of my virtual guests)
Configuring an LVM volume:
The “vmdisks” LVM volume group is approximately 1.3TB of disk storage, used to provide individual volumes to the guest instances. I use a logical volume of 30G for most guests.
Adding a logical volume for guest usage is simple, and must be done uniformly across all nodes:
lvcreate -L <size>M -n <name> vmdisks
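For example, a typical 30G guest volume named web0:
lvcreate -L 30720M -n web0 vmdisks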
Initial Configuration of the DRBD MD device:
The DRBD MD device is the actual block device that the Guest machine will interface with.
The following MUST BE DONE ACROSS ALL NODES, however only upon initial creation:
Update the /etc/drbd.conf file to add your new node (here’s an example):
resource <resource name>
{
    net
    {
        allow-two-primaries;
    }
    on virtual0.braindeadprojects.com
    {
        device /dev/drbd<next available device number>;
        disk /dev/vmdisks/<logical volume name>;
        address 10.255.255.1:<next available port>;
        meta-disk internal;
    }
    on virtual1.braindeadprojects.com
    {
        device /dev/drbd<next available device number>;
        disk /dev/vmdisks/<logical volume name>;
        address 10.255.255.2:<next available port>;
        meta-disk internal;
    }
}
After updating the config, create the block device and enable it:
#drbdadm create-md <resource name>
#drbdadm up <resource name>
At this point, all nodes have a record of this DRBD resource. /proc/drbd will have additional information.
The following must be done ONLY ON THE PRIMARY (MASTER) NODE:
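With DRBD 8, kicking off the initial full sync and promoting the resource looks something like this (syntax can vary slightly between DRBD versions):
#drbdadm -- --overwrite-data-of-peer primary <resource name>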
This will begin an initial synchronization across the nodes. Again, this is only run on the “Master node” (the virtual host node that is initially running the VM guest). At this time, the DRBD resource is available on ALL nodes, however until the synchronization is finished, reads/writes will take slightly longer.
An important note on synchronization:
The syncer{} stanza in the resource config plays an important role in how fast a drive is synchronized. The default sync rate is roughly 340K/sec, which causes synchronization of a 30G volume to take approximately 28 hours.
This can safely be set to 33M/sec in my environment, reducing sync-time to roughly 20 minutes, depending upon load.
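In the resource definition that looks like:
syncer
{
    rate 33M;
}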
Sync rate becomes an important factor if an entire node fails and the resources of the failover node cannot keep up. In such an event, a third node should be added to the system, with drives synced to it.
Creating the VM Guest:
I’m utilizing libvirt as a convenient way to provision and manage virtual machines.
Creation of a guest is fairly easy, and can be done interactively or via a one-liner:
#virt-install --connect qemu:///system -n <Guest Name> -r <RAM in MB> --vcpus=1 \
  --arch=<i686|x86_64|...> --vnc --vncport=<unused VNC port number> --noautoconsole --os-type linux --accelerate \
  --network=bridge:<kickstart_br0|world_br0|world_br1> --hvm --disk path=/dev/drbd<resource number>,bus=virtio \
  --<pxe|import|cdrom>
After which time the guest will automatically start, with its vnetX interface bridging with kickstart_br0.
I've installed DNSMasq on each host machine. It sits atop the kickstart_br0 interface, assigns the VM an IP in the 192.168.6.0/24 network via DHCP, and PXE boots/kickstarts it off a mirror server. (The 192.168.6.0/24 network is MASQUERADEd in iptables, so requests appear to come from virtual[01].)
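The relevant pieces look roughly like this; the DHCP range, boot filename, and TFTP root are illustrative rather than my exact settings:
In /etc/dnsmasq.conf:
interface=kickstart_br0
dhcp-range=192.168.6.100,192.168.6.200,12h
dhcp-boot=pxelinux.0
enable-tftp
tftp-root=/var/lib/tftpboot
# the pxelinux config then points the installer at the kickstart tree on the mirror
And the NAT rule:
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 ! -d 192.168.6.0/24 -j MASQUERADE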
After kickstarting the guest, the reboot process tends to shut the virtual guest down, so it may need to be restarted (normal reboots are not prone to this). Once restarted, server configuration can be done via ssh from the host node, or via VNC.
Once the machine is built, customized and ready to be placed on the Internet, power down the VM guest and edit the XML config file (replacing kickstart_br0 with world_br0 or world_br1). If you find that the VM guest attempts to PXE boot once again, you may need to also change the boot device order (specifying hd instead of network)
You will also want to adjust the clock to source itself from the host machine.
# virsh
Welcome to virsh, the virtualization interactive terminal.
Type:  'help' for help with commands
       'quit' to quit
virsh # edit <guestname>
...
<os>
  <type arch='x86_64' machine='rhel5.4.0'>hvm</type>
  <boot dev='network'/>
  <boot dev='hd'/>
</os>
...
<interface type='bridge'>
  <mac address='54:52:00:2d:21:10'/>
  <source bridge='kickstart_br0'/>
  <target dev='vnet1'/>
</interface>
...
<clock offset='localtime'/>
...
I’ve made sure to install virt-top, an interface to the hypervisor similar to the “top” command. This gives a nice overview of the system:
#virt-top
The shell API for libvirt makes manipulating guest instances easy. Here are a few of the more frequently used virsh commands:
#list <--all> (Lists running and non-running guests)
#start <guestname> (Starts guest instance)
#autostart <guestname> (Marks guest to be autostarted at node boot)
#destroy <guestname> (Immediately powers off guest)
#shutdown <guestname> (Powers down guest gracefully)
#suspend <guestname> (Pauses the guest in memory)
#reboot <guestname> (Reboots guest)
#edit <guestname> (Edits the guest XML config)
#migrate (See the migration section for more info)
Live migration:
Live migration between nodes can be done via ssh (with shared keys) or TLS. I’m currently utilizing the ssh method:
Prior to migration, the DRBD resource needs to be placed in Primary on both nodes:
#drbdadm primary <resource name>
After doing so, the following is run on the SENDING node only:
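Using the ssh transport, that looks something like this (the destination is whichever node will receive the guest):
#virsh migrate --live <guestname> qemu+ssh://virtual1.braindeadprojects.com/system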
As part of the migration process, the sending node copies memory and kernel state via ssh to the receiving node. During the migration, the guest instance on the sending node remains active, while its copy on the receiving node is marked as paused.
Once the information is migrated, the sending node pauses its guest instance and the receiving node un-pauses its copy. Actual migration time depends upon a number of factors, although it's generally accomplished in under 35 seconds.
Following the migration, it’s essential to place the losing node into DRBD secondary mode. Should I accidentally try to start the guest on the losing node, this will prevent it from obtaining access to the disk (and corrupting data):
#drbdadm secondary <resource name>
Virtualizing Physical Machines:
Virtualizing a physical machine is extremely easy. Instead of PXE booting and kickstarting an install (--pxe), I use the --cdrom /usr/src/systemrescuecd-x86-1.6.2.iso flag when creating the virtual guest. On each virtual host, I have a copy of the excellent Gentoo-based SystemRescueCd.
When the system boots into the live CD, partition the drive (usually /dev/vda or /dev/sda) as you wish (taking advantage of LVM for non-boot partitions if possible).
Create a /mnt/migrate directory from the live cd, and mount your newly created partitions there.
mount /dev/sda2 /mnt/migrate
for dir in {boot,proc,dev,sys}; do mkdir /mnt/migrate/$dir; done
mount /dev/sda1 /mnt/migrate/boot
(Do the same for /var and any other directories you have partitioned separately)
Utilizing rsync over ssh, synchronize all files from the physical host to the virtual one (taking care that you perform the action correctly, so as not to overwrite the original server). A handful of files and directories NEED TO BE OMITTED, namely:
/proc
/dev
/sys
/lost+found
(possibly others)
I generally use an rsync command similar to this one:
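Built from the exclusion list above; the source hostname is a placeholder:
rsync -avH --numeric-ids \
    --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/lost+found \
    root@physical.example.com:/ /mnt/migrate/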
The following devices need to be present in order to boot into the OS. Either rsync them or manually make them with mknod.
/dev/console
/dev/initctl
/dev/null
/dev/zero
Another easy way to accomplish this is:
for file in {console,initctl,null,zero}; do cp -a /dev/$file /mnt/migrate/dev/$file ; done
Following the rsync, the virtual guest will need a bootloader and an updated initial ramdisk. Both of these are best done in a chroot environment:
mount -o bind /dev/ /mnt/migrate/dev/
mount -t proc none /mnt/migrate/proc/
mount -o bind /sys/ /mnt/migrate/sys/
chroot /mnt/migrate/ /bin/bash
Inside the chroot environment, you will need to update /etc/mtab and /etc/fstab to reflect your new partitioning (at the very least drives will likely change to /dev/vda). You will also need to update /boot/grub/device.map to reflect hd0 as a valid drive.
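With a single virtio disk, the relevant entries end up looking something like this (the partition layout shown is only illustrative):
In /boot/grub/device.map:
(hd0)   /dev/vda
In /etc/fstab:
/dev/vda1   /boot   ext3    defaults    1 2
/dev/vda2   /       ext3    defaults    1 1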
Once these changes have been made, grub installation should be as simple as:
# grub
grub> root (hd0,0)   <-- where hd0,0 is the first partition on the first drive
grub> setup (hd0)    <-- install grub on the MBR of the first drive
grub> quit
With the Bootloader installed, we need to create a working initial ramdisk with necessary drivers. Look for the most recent installed kernel in grub.conf and create the ramdisk (replace the version numbers with yours):
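On a CentOS 5 guest that's roughly the following, with the virtio modules pulled in explicitly (the kernel version is only an example):
mkinitrd -f --with=virtio_pci --with=virtio_blk /boot/initrd-2.6.18-238.el5.img 2.6.18-238.el5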
Please be aware that many files (e.g. databases) should only be rsynced while the services using them are shut down. For this reason, it's often best to shut down MySQL, PostgreSQL, and various other services PRIOR TO RSYNCING to prevent database corruption.
How to physicalize a virtual machine:
In the event of a major issue, converting a virtual machine back to a physical machine follows the same process as physical-to-virtual, but in reverse.
Of Note:
While Fedora currently supports (automatically, out of the box) SELinux profiles/labels for KVM guest instances, CentOS 5.6 does not. It will be incorporated in CentOS 6, however… and I plan on migrating to that OS eventually.
Final Thoughts:
As with everything, there are pros and cons to this methodology.
While I've always preferred avoiding GUIs, the fact is they standardize what steps happen in which order (limiting the potential for user-induced errors).
A high performance SAN (or perhaps an OpenFiler box) would make things much easier to configure and migrate, but at the same time introduce a possible single point of failure.
Utilizing an automation engine (like puppet) could limit the number of steps needed to provision a virtual guest across all nodes.
Outside of some possible restrictions (virtio drivers being specific to KVM, LVM2 support for Windows), migrating from the present-day system to VMware, VirtualBox, or <insert your favorite hypervisor here> should be fairly easy, requiring little more than creating a guest and pointing it at the existing LVM volume.
All in all, the system has been in production for nearly a year now and is performing beautifully. And best of all, I’m saving on power and generating less heat.
I’ve been quite annoyed recently with my video card, the “nVidia Corporation GeForce 8400 GS (rev a1)“. A number of sites using Flash tend to bleed through Firefox or Chrome and into other tabs or even other workspaces.
I’ve upgraded the nvidia-drivers a number of times, never actually fixing the problem. Other Gentoo users on the #gentoo channel of freenode have suggested migrating to gnash instead… and while I have contemplated this, I’ve noticed a number of things that don’t work well under gnash on my netbook.
Thankfully I'm not the only person experiencing this. Earlier today I came across a solution that, while not optimal, definitely fixes the problem:
Disabling hardware acceleration thankfully stops the bleedthrough. (Just right-click on a Flash movie, select "Settings", and disable acceleration under "Display".) You will need to restart your browsers for it to fully take effect.