Archive for the ‘Development’ Category

Old School to New School: Refactoring Perl


Friday, July 25th, 2008

At YAPC::NA I sat in on lots of great talks (I also won Randal Schwartz in the charity auction, and so got to be beaten soundly at pool by him, and learn a few things about Smalltalk and Seaside). In particular, Michael Schwern gave a fantastic talk entitled Skimmable Code: Fast to Read, Fast to Change. This got me thinking about our own code. Webmin is an old codebase, approaching 11 years old, and thus has some pretty old school Perl practices throughout. Coding standards sort of stick to projects over a few years, and as new code comes in, it tends to look like the old code. And, to add to that momentum, Jamie has religiously kept compatibility for module authors throughout the entire life of the project. Modules written ten years ago can, astonishingly, be expected to work identically in today’s Webmin, though they might not participate in logging or advanced ACLs or other nifty features that have come to exist in the framework in that time.

So, when I found myself needing to make a modification to oschooser.pl, a small program for detecting the operating system on which Webmin is running (sounds trivial, but when you realize that Webmin runs on hundreds of operating systems and versions, it turns out to be a rather complex problem), I decided to take the opportunity to put into practice some of the niceties of modern Perl. This article is a little different than what I usually write for In the Box, in that it covers a lot of ground fast, and most of it is probably pretty mundane stuff for folks already writing modern Perl. But, I think there’s enough old Perl code running around out there, running the Internet and such, that it’s worth talking about modernization work.

So, let’s go spelunking!

Introduction to oschooser.pl

The code we’ll be picking apart, and putting back together, is probably one of the more heavily used pieces of Perl code, and certainly one of the oldest, in the wild. It’s the OS detection code that Webmin and Usermin use to figure out what system they’re running on during installation. With Webmin having 12 million (give or take several million) downloads over its ten year history, this equals a lot of operating systems successfully detected. Perhaps I should have picked something a little less important for my first stab at modernization, but I’ve rarely been accused of being smart about making sweeping changes! (Jamie will reel me in, before I break actual Webmin code. I manage to break Virtualmin every now and then…but he’s more suspicious when I check code into Webmin, since it happens quite rarely.)

The oschooser.pl program actually loads up a rather complex definitions file called os_list.txt (by default, though it’s configurable, and we use different lists for Virtualmin and Webmin, since they have different requirements for version identification). The definitions file can contain snippets of Perl code, which will be executed via eval, when appropriate. Most of the updates to OS detection over the years have happened in os_list.txt, so oschooser.pl hasn’t seen a lot of grooming over the years, which makes it a prime candidate for modernization. Assuming, of course, that it works identically when I’m done with it.

Where to start?

My end goal with this project is to make oschooser.pl usable as a library from Perl programs, since our new product installer is written in Perl rather than POSIX shell. I also figured it’d be nice to make it testable, since I’ve made several mistakes in the detection code (in os_list.txt, specifically) over the past few years that led to our product being uninstallable on some systems until the bug was tracked down. But, first things first. Almost nothing in Webmin is strict compatible, and even warnings can cause some complaints, so that seems like a good starting point.

The code we’re starting with can be found here, so you can follow along at home.

Enabling warnings reveals the following (don’t worry about the arguments for now):

$ perl -w oschooser.pl os_list.txt outfile 1
Name "main::uname" used only once: possible typo at oschooser.pl line 31.
Name "main::donename" used only once: possible typo at oschooser.pl line 17.

Not too bad, actually. Just a couple of variables that are only seen once, easy enough to fix by giving them a my declaration. Though, in this case, it looks like enabling warnings turns up some unused code. While donename is actually keeping track of what names we’ve seen, so far, and it’s one of several idiomatic ways to build an array of unique values, the uname variable seems to have no purpose. So I’m going to kill that whole line rather than declare it.
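For readers who haven’t met the “seen hash” idiom, here is a minimal sketch of it. It assumes donename is a hash used the way described above; the names @all_names, %seen and @names are made up for illustration and are not taken from oschooser.pl.

```perl
use strict;
use warnings;

# Hypothetical input: OS names as they might repeat across a definitions list
my @all_names = ('Redhat Linux', 'Debian Linux', 'Redhat Linux',
                 'Solaris', 'Debian Linux');

my %seen;     # plays the role of donename: names already encountered
my @names;    # unique names, in first-seen order
foreach my $name (@all_names) {
    # $seen{$name}++ returns 0 (false) only the first time a name appears
    push(@names, $name) unless $seen{$name}++;
}

print join(", ", @names), "\n";
```

The hash does the bookkeeping and the array keeps the order, which matters when the list ends up in a menu shown to the user.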

Next up in our “low-hanging fruit” exercise is enabling use strict. Turns out this is quite a lot more intimidating:

$ perl -c oschooser.pl
Global symbol "$oslist" requires explicit package name at oschooser.pl line 15.
Global symbol "$out" requires explicit package name at oschooser.pl line 15.
Global symbol "$auto" requires explicit package name at oschooser.pl line 15.
Global symbol "$oslist" requires explicit package name at oschooser.pl line 16.
Global symbol "$oslist" requires explicit package name at oschooser.pl line 16.
Global symbol "@list" requires explicit package name at oschooser.pl line 20.
Global symbol "@names" requires explicit package name at oschooser.pl line 21.
Global symbol "%names_to_real" requires explicit package name at oschooser.pl line 22.
Global symbol "$auto" requires explicit package name at oschooser.pl line 27.
Global symbol "$etc_issue" requires explicit package name at oschooser.pl line 30.
Global symbol "$etc_issue" requires explicit package name at oschooser.pl line 33.
Global symbol "$o" requires explicit package name at oschooser.pl line 36.
Global symbol "@list" requires explicit package name at oschooser.pl line 36.
Global symbol "$o" requires explicit package name at oschooser.pl line 37.
Global symbol "$o" requires explicit package name at oschooser.pl line 37.
Global symbol "$ver" requires explicit package name at oschooser.pl line 39.
Global symbol "$o" requires explicit package name at oschooser.pl line 39.
Global symbol "$ver" requires explicit package name at oschooser.pl line 40.
Global symbol "$ver" requires explicit package name at oschooser.pl line 41.
Global symbol "$o" requires explicit package name at oschooser.pl line 41.
Global symbol "$ver" requires explicit package name at oschooser.pl line 41.
Global symbol "$ver" requires explicit package name at oschooser.pl line 43.
Global symbol "$ver" requires explicit package name at oschooser.pl line 44.
Global symbol "$o" requires explicit package name at oschooser.pl line 44.
Global symbol "$ver" requires explicit package name at oschooser.pl line 44.
Global symbol "$o" requires explicit package name at oschooser.pl line 49.
Global symbol "$ver" requires explicit package name at oschooser.pl line 53.
Global symbol "$auto" requires explicit package name at oschooser.pl line 54.
Global symbol "$auto" requires explicit package name at oschooser.pl line 59.
Global symbol "$rv" requires explicit package name at oschooser.pl line 61.
Global symbol "$auto" requires explicit package name at oschooser.pl line 67.
Global symbol "$auto" requires explicit package name at oschooser.pl line 72.
Global symbol "$auto" requires explicit package name at oschooser.pl line 77.
Global symbol "$cmd" requires explicit package name at oschooser.pl line 80.
Global symbol "$i" requires explicit package name at oschooser.pl line 81.
Global symbol "$i" requires explicit package name at oschooser.pl line 81.
Global symbol "@names" requires explicit package name at oschooser.pl line 81.
Global symbol "$i" requires explicit package name at oschooser.pl line 81.
Global symbol "$cmd" requires explicit package name at oschooser.pl line 82.
Global symbol "$i" requires explicit package name at oschooser.pl line 82.
Global symbol "@names" requires explicit package name at oschooser.pl line 82.
Global symbol "$i" requires explicit package name at oschooser.pl line 82.
Global symbol "$tmp_base" requires explicit package name at oschooser.pl line 84.
Global symbol "$temp" requires explicit package name at oschooser.pl line 85.
Global symbol "$tmp_base" requires explicit package name at oschooser.pl line 85.
Global symbol "$cmd" requires explicit package name at oschooser.pl line 86.
Global symbol "$temp" requires explicit package name at oschooser.pl line 86.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 87.
Global symbol "$temp" requires explicit package name at oschooser.pl line 87.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 88.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 88.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 89.
Global symbol "$name" requires explicit package name at oschooser.pl line 96.
Global symbol "@names" requires explicit package name at oschooser.pl line 96.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 96.
Global symbol "@vers" requires explicit package name at oschooser.pl line 97.
Global symbol "$name" requires explicit package name at oschooser.pl line 97.
Global symbol "@list" requires explicit package name at oschooser.pl line 97.
Global symbol "$cmd" requires explicit package name at oschooser.pl line 98.
Global symbol "$i" requires explicit package name at oschooser.pl line 99.
Global symbol "$i" requires explicit package name at oschooser.pl line 99.
Global symbol "@vers" requires explicit package name at oschooser.pl line 99.
Global symbol "$i" requires explicit package name at oschooser.pl line 99.
Global symbol "$cmd" requires explicit package name at oschooser.pl line 100.
Global symbol "$i" requires explicit package name at oschooser.pl line 100.
Global symbol "$name" requires explicit package name at oschooser.pl line 100.
Global symbol "@vers" requires explicit package name at oschooser.pl line 100.
Global symbol "$i" requires explicit package name at oschooser.pl line 100.
Global symbol "$cmd" requires explicit package name at oschooser.pl line 102.
Global symbol "$temp" requires explicit package name at oschooser.pl line 102.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 103.
Global symbol "$temp" requires explicit package name at oschooser.pl line 103.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 104.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 104.
Global symbol "$temp" requires explicit package name at oschooser.pl line 105.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 106.
Global symbol "$ver" requires explicit package name at oschooser.pl line 110.
Global symbol "@vers" requires explicit package name at oschooser.pl line 110.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 110.
Global symbol "$dashes" requires explicit package name at oschooser.pl line 114.
Global symbol "$dashes" requires explicit package name at oschooser.pl line 115.
Global symbol "$i" requires explicit package name at oschooser.pl line 121.
Global symbol "$i" requires explicit package name at oschooser.pl line 121.
Global symbol "@names" requires explicit package name at oschooser.pl line 121.
Global symbol "$i" requires explicit package name at oschooser.pl line 121.
Global symbol "$i" requires explicit package name at oschooser.pl line 122.
Global symbol "@names" requires explicit package name at oschooser.pl line 122.
Global symbol "$i" requires explicit package name at oschooser.pl line 122.
Global symbol "$i" requires explicit package name at oschooser.pl line 123.
Global symbol "$i" requires explicit package name at oschooser.pl line 125.
Global symbol "$dashes" requires explicit package name at oschooser.pl line 126.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 128.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 129.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 134.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 134.
Global symbol "@names" requires explicit package name at oschooser.pl line 134.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 135.
Global symbol "$name" requires explicit package name at oschooser.pl line 141.
Global symbol "@names" requires explicit package name at oschooser.pl line 141.
Global symbol "$osnum" requires explicit package name at oschooser.pl line 141.
Global symbol "$name" requires explicit package name at oschooser.pl line 142.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 146.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 147.
Global symbol "$ver" requires explicit package name at oschooser.pl line 153.
Global symbol "$name" requires explicit package name at oschooser.pl line 153.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 153.
Global symbol "%names_to_real" requires explicit package name at oschooser.pl line 154.
Global symbol "$name" requires explicit package name at oschooser.pl line 154.
Global symbol "$vnum" requires explicit package name at oschooser.pl line 154.
Global symbol "$out" requires explicit package name at oschooser.pl line 159.
Global symbol "$ver" requires explicit package name at oschooser.pl line 160.
Global symbol "$ver" requires explicit package name at oschooser.pl line 161.
Global symbol "$ver" requires explicit package name at oschooser.pl line 162.
Global symbol "$ver" requires explicit package name at oschooser.pl line 163.
Global symbol "$d" requires explicit package name at oschooser.pl line 170.
Global symbol "$rv" requires explicit package name at oschooser.pl line 172.
Global symbol "$rv" requires explicit package name at oschooser.pl line 174.
Global symbol "$d" requires explicit package name at oschooser.pl line 177.
Global symbol "$d" requires explicit package name at oschooser.pl line 178.
Global symbol "$rv" requires explicit package name at oschooser.pl line 178.
Global symbol "$d" requires explicit package name at oschooser.pl line 178.
Global symbol "$rv" requires explicit package name at oschooser.pl line 181.
oschooser.pl had compilation errors.

Wow! I think that might be more lines than the program itself. Luckily, it’s almost entirely unscoped variables. A quick pass over the code, adding my declarations to the obvious candidates, gets things looking a little better. One tricky bit is the $i loop variables used in for loops. We don’t want those to be declared several times in the code, and we don’t want them to leak out into the scope of the rest of the program. In modern Perl, this is no problem, as you can use the following:

    for(my $i=0; $i<@names; $i++) {
      $cmd .= " ".($i+1)." '$names[$i]'";
      }

And $i will be local to the for loop. I momentarily feared that I’d need to use an outer block to accomplish this, as Webmin needs to be compatible with quite old Perl versions (5.005, for core Webmin, unless Unicode support is needed, in which case 5.8.1 is required), but after downloading and installing Perl 5.005_4, I found that was an unnecessary precaution. The foreach loops can also make use of this convenient feature. If you do happen to be stuck with an even more ancient version than 5.005 (but still higher than 4)–though I can’t imagine how you could, as 5.005 is over nine years old–you can use the following:

  {
    my $i;
    for($i=0; $i<@names; $i++) {
      $cmd .= " ".($i+1)." '$names[$i]'";
      }
  }

Which provides similar private scope for the $i variable, at the cost of three extra lines.
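For the foreach case mentioned above, here is a small sketch showing both loop forms side by side. The sample data in @names and the $list variable are invented for the example; only the $cmd line mirrors the code shown earlier.

```perl
use strict;
use warnings;

my @names = ('Redhat Linux', 'Solaris');   # hypothetical sample data
my $cmd = "";

# C-style loop: $i exists only for the duration of the loop
for (my $i = 0; $i < @names; $i++) {
    $cmd .= " ".($i+1)." '$names[$i]'";
}

# foreach form: $name is likewise private to the loop
my $list = "";
foreach my $name (@names) {
    $list .= "[$name]";
}

# Under "use strict", mentioning $i or $name down here would be a compile error
print "$cmd\n$list\n";
```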

Making it Testable

So far, I haven’t made any changes that are likely to break the code. It’s merely been cleanup and syntax tweaks. But, to accomplish everything I’d like in this exercise, we’ll be doing some refactoring and refining the code. To do that with confidence, it’d be nice to have some tests to ensure the code works the same before and after any changes.

Since this is not historically a library, it’s not particularly easy to test. One could write a custom test harness, or use Test::Command, and test its behavior as a whole, but since it’s written in Perl and one of my goals is to make it useful as a library from Perl scripts, I decided instead to make it loadable as a module and use Test::More. A trick that’s very common in the Python world, but doesn’t seem as well-known amongst Perlmongers, is a main function which is called if the script is executed independently rather than via use or require. The main function then calls whatever the script would normally do, optionally setting up variables or parsing command line arguments.

So, I added the following near the beginning of the file:

# main
sub main() {
if ($#ARGV < 1) { die "Usage: $0 os_list.txt outfile [0|1|2|3]\n"; }
my ($oslist, $out, $auto) = @ARGV;
oschooser($oslist, $out, $auto);
}
main() unless caller();  # make it testable and usable as a library

I also took this opportunity to add a simple usage message if the command is executed with fewer than two arguments (@ARGV, like all Perl arrays, starts counting at 0, so $#ARGV is 1 when there are two arguments). I also needed to wrap the main body of the script in a sub block, so that the script doesn’t do anything immediately if loaded as a library.

Make it a Module

Since I want to use this code as a library, I face a choice. The use statement is functionally equivalent to:

BEGIN { require Module; Module->import( LIST ); }

Which means, I suppose, I could keep the name oschooser.pl and use:

require 'oschooser.pl';

We don’t need BEGIN level assurance, since we have no prototypes in this library and only use simple subroutines. But, I find this a bit unsatisfying, since it’s no longer in common use amongst Perl developers, and use provides the ability to export functions explicitly. Test::More has both a use_ok and a require_ok function, so it’s irrelevant from a testing perspective. It’ll probably remain oschooser.pl in Webmin proper, and OsChooser.pm in my Virtualmin installer library, at least for the foreseeable future. Not really a lot of difference between the two.
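To make the “export functions explicitly” point concrete, here is a rough sketch using the standard Exporter module. Everything in it is an assumption made for illustration: OsChooserDemo is a hypothetical stand-in package defined inline so the example runs as one file, and this has_command is a toy version, not the one in Webmin. It also uses our, so it needs Perl 5.6 or later.

```perl
use strict;
use warnings;

# Hypothetical stand-in for OsChooser.pm, defined inline for the example
package OsChooserDemo;
require Exporter;
our @ISA       = ('Exporter');
our @EXPORT_OK = ('has_command');    # exported only when asked for

# Toy version (not Webmin's): returns 1 if $cmd is an executable
# file in some directory on the PATH, otherwise 0
sub has_command {
    my ($cmd) = @_;
    foreach my $dir (split(/:/, $ENV{'PATH'} || '')) {
        return 1 if -f "$dir/$cmd" && -x "$dir/$cmd";
    }
    return 0;
}

package main;

# With the package in its own file, "use OsChooserDemo qw(has_command);"
# would perform this import at compile time instead
OsChooserDemo->import('has_command');

print "sh is ", (has_command('sh') ? "" : "not "), "on the PATH\n";
```

With a plain require of a .pl file there is no import step at all, so callers either share the main namespace with the library or spell out the package name on every call.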

Some Tests

So, now that we can call the library roughly the way we want, using use, it’s time to write a few tests to be sure things actually work after we begin making more sweeping changes.

We can start with simple compile tests (I usually call these types of tests t/return.t, as they just check to be sure the module returns without error on load and the functions within return the data type that is expected):

#!/usr/bin/perl -w
# These tests just check to be sure all functions return something
# They don't care what is returned...so garbage can still pass,
# as long as the garbage is the right data type.
 
use strict;
use Test::More qw(no_plan);
 
use_ok( 'OsChooser' );
 
isa_ok(\OsChooser::have_tty(), 'SCALAR');
isa_ok(\OsChooser::has_command("cp"), 'SCALAR');

Hmm…OK, so we don’t actually have a lot to test yet, just a couple of utility functions (and I’ve even cheated a little and looked ahead to where I introduced a have_tty function, or this would be an even shorter set of tests). The most important function, oschooser, doesn’t know how to return anything very useful yet. It can only write out its findings to a file. But, since we’re always going to be creating that file, regardless of how nice the module usage becomes, we need to figure out how to test it anyway.

Unsurprisingly, there is already a full-featured module on CPAN for testing the contents of files, called, unlikely though it may seem, Test::Files. So, we’ll just grab that:

$ sudo perl -MCPAN -e shell
 
cpan shell -- CPAN exploration and modules installation (v1.7602)
ReadLine support enabled
 
cpan> install Test::Files
...

And then create as many Operating System definition files as we want in the t directory. We’ll just name them for the OS they represent. This is the kind of testing I love, because the actual test file will be extremely simple, no matter how many operating systems I want to test on:

#!/usr/bin/perl -w
use strict;
use OsChooser;
 
# Get a list of the example OS definition files
opendir(DIR, "t/") || die "can't opendir t/: $!";
# Anchor the pattern so only names ending in .os are picked up
my @files = grep { /\.os$/ } readdir(DIR);
closedir DIR;
use Test::More qw(no_plan);
use Test::Files;
 
foreach my $file (@files) {
  $file =~ /(.*)\.os$/;
  my $osname = $1;
  my $outfile = "t/outfile";
  OsChooser::oschooser("os_list.txt", $outfile, 1);
  compare_ok("t/$file", $outfile, $osname);
 
  # Cleanup
  unlink $outfile;
}

I love data-driven software, and this is a fun little example of it. We can run as many tests as we want, merely by adding more OS data files–one with the “os” suffix to provide what should be output by oschooser and one to contain the file that oschooser would normally use to identify the OS (/etc/issue, among others), which isn’t yet supported, but I’ll talk about it in the next post. Speaking of being data-driven, I think it’d also be pretty nifty to get the test count from the @files array, rather than using no_plan, but because modules loaded with use are loaded early during compile time (in a BEGIN block, effectively) we don’t actually have anything in @files yet.

However, as mentioned, the oschooser function doesn’t yet allow one to specify the issue file to look at, so no matter how many definitions I provide, it’ll never be able to test anything but the OS the test is running on. Oh, well, for now we’ll just create one OS definition file that matches my current OS, and make it a priority to make the function more testable somehow, possibly via an optional parameter to oschooser.
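On the test count mentioned above: one way around the compile-time problem is to load Test::More without a plan and declare the plan at run time with its plan function, once @files has been filled. The sketch below assumes a scratch directory with made-up .os files in place of the real t/ directory, and a placeholder ok() where compare_ok would go. It uses lexical file handles, so it needs Perl 5.6 or later.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Test::More;    # no plan given at compile time

# Hypothetical fixtures standing in for t/*.os
my $dir = tempdir(CLEANUP => 1);
foreach my $name (qw(centos debian solaris)) {
    open(my $fh, '>', "$dir/$name.os") or die "can't write $dir/$name.os: $!";
    close($fh);
}

opendir(my $dh, $dir) or die "can't opendir $dir: $!";
my @files = grep { /\.os$/ } readdir($dh);
closedir($dh);

# Declare the plan now that we know how many definition files exist
plan tests => scalar @files;

foreach my $file (@files) {
    my ($osname) = $file =~ /(.*)\.os$/;
    # Placeholder check; the real test would call compare_ok here
    ok(defined $osname, "definition file for $osname");
}
```

A fixed plan catches the case where the test script dies partway through the list, which no_plan cannot do.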

Alright, so now that we have some rudimentary tests in place, we can break stuff with confidence! We’ll come back to testing again in the near future, since we’re leaving so much untested right now.

Plain Old Documentation

I’m going to take a quick detour now that we’ve got some basic tests in place. Testing is one practice that most developers agree makes for great code, and the other practice that most folks can agree on is documentation.

Since this is such a simple piece of code, and was intended exclusively for use during installation of Webmin and Usermin, Jamie never really documented it. Now that I’m forcing it to be useful in other locations, and having some fun giving it a modern Perl face lift, it’s as good a time as any to add some documentation. POD isn’t the only documentation format usable within Perl code, but it is, by far, the most popular, and it has lots of great tools for processing and testing coverage, so that’s what Jamie recently chose for use in documenting the Virtualmin API. It’s also easy to learn, and results in text that is pretty readable even before processing.

I’m not sure of the recommended practices for documenting scripts that work on both the command line and as a module, but here’s what I came up with:

=head1 OsChooser.pm
 
Attempt to detect operating system and version, or ask the user to select
from a list.  Works from the command line, for usage from shell scripts,
or as a library for use within Perl scripts.
 
=head2 COMMAND LINE USE
 
OsChooser.pm os_list.txt outfile [auto]
 
Where "auto" can be the following values:
 
=over 4
 
=item 0
 
always ask user
 
=item 1
 
automatic, give up if fails
 
=item 2
 
automatic, ask user if fails
 
=item 3
 
automatic, ask user if fails and if a TTY
 
=back
 
=head2 SYNOPSIS
 
    use OsChooser;
    my ($os_type, $version, $real_os_type, $real_os_version) =
       OsChooser::oschooser("os_list.txt", "outfile", $auto, [$issue]);
 
=cut

Pretty simple, but covers the basics.

Next Time

Unfortunately, the code is now longer and probably a little less readable than before! It’s probably more robust to changes, since it now has reasonably scoped variables. And it’s more friendly to others who might want to use it, due to the new documentation and the ability to use it as a library in Perl or as a command in shell scripts.

Next time we’ll start in on the refactoring, and we’ll also write some more tests. This is turning into a real challenge, due to the data-driven nature of the script, and the fact that it’s somewhat hardcoded to look for OS data in very specific locations. Since a big part of what I want to test is in the os_list.txt file, we don’t have the luxury of just saying, “It’s configuration…we’ll just make a special version for testing purposes.” We’ll have to get far more clever.

Extending Virtualmin with plugins


Monday, March 3rd, 2008

Plugins can be big

Not many people know that Virtualmin’s already extensive list of built-in features can be extended by writing plugins, which are basically Webmin modules that export a special API. Why would you want to do this, you may ask? Let’s say there is a mailing list application, log analyzer, database or source code control system that you want to make available on a per-domain basis. If so, a plugin is the way to do it.

A plugin is typically used to add a new feature to Virtualmin. In its parlance, a feature is something that is enabled on a per-domain basis, such as a website, DNS domain or MySQL database. Let’s say you have discovered an awesome new log analysis program that you want to run on each domain’s log files - a plugin would be the way to implement it.

A plugin can also add options to mailbox users. The most common use of this is to grant access on a per-user basis to some resource, such as statistics, an application or database. Plugins can also create new database types, add links to the left menu in the Virtualmin framed theme, and add sections to its system information page.

Some of the existing plugins give you an idea of what’s possible:

  1. The DAV plugin adds a feature which makes a virtual server’s web pages editable from applications that support the protocol, such as Windows and OSX. It also lets you enable DAV logins for each mailbox in the domain.
  2. The Bootup Actions plugin allows domain owners to have their long-running server processes started when the system boots.
  3. The Mail Relay plugin lets you forward email for a domain to another server, which can be configured by the domain owner.
  4. The Admin Notes feature adds a new section to the right-hand frame for entering comments about the system, for sharing status between master admins.

To see a full list of plugins that exist, check out the third-party modules database.

If you know Perl, have written a regular Webmin module, and want to write your own plugin, check out the extensive documentation on the API.

Webmin::API: Using Webmin as a library


Tuesday, December 11th, 2007

Webmin is perhaps the largest bundle of system administration related Perl code in existence (outside of CPAN, of course), much of which is unavailable anywhere else.  I often find myself wishing for a function or two from Webmin in my day-to-day Perl scripting.  Historically, one could use Webmin functions by first pulling in all of the bits and pieces manually, and running a few of the helper functions.  For example, at Virtualmin, Inc. we use this bit of code to start up the configuration stage of our install scripts:

#!/usr/bin/perl
$|=1;
# Setup Webmin environment
$no_acl_check++;
$ENV{'WEBMIN_CONFIG'} ||= "/etc/webmin";
$ENV{'WEBMIN_VAR'} ||= "/var/webmin";
$ENV{'MINISERV_CONFIG'} = $ENV{'WEBMIN_CONFIG'}."/miniserv.conf";
open(CONF, "$ENV{'WEBMIN_CONFIG'}/miniserv.conf") || die "Failed to open miniserv.conf";
while(<CONF>) {
  if (/^root=(.*)/) {
    $root = $1;
    }
  }
close(CONF);
$root ||= "/usr/libexec/webmin";
chdir($root);
require './web-lib.pl';
init_config();

Wow.  That’s a lot of extraneous crap just to make use of Webmin functions.  Not all of that is necessary in every script that wants to use Webmin functions, but it’s always something I have to refer to the documentation for.

So, I’ve been bugging Jamie for some time to make a simpler way to get at the Webmin API, and he’s just released the Webmin::API Perl module.  To use it, you’ll first need Webmin installed.  There’s an RPM, deb, tarball, and Solaris pkg, so it’s easy no matter what UNIX-like OS you run (it’ll also run on Windows, but only in relatively limited fashion).  Then you can install Webmin::API like any other Perl module:

# tar xvzf Webmin-API-1.0.tar.gz
# cd Webmin-API
# perl Makefile.PL
# make install

Once that’s done, you can make use of the entirety of web-lib.pl, plus the libraries for all of the Webmin modules.  For example, one could access all of the Webmin variables, like %gconfig, as well as all of the web-lib.pl functions, such as ftp_download (pure Perl FTP client), kill_byname (like killall), nice_size (return a number in GB, MB, etc.), running_in_zone (detects whether it’s running in a Solaris Zone), etc.

So, making an application that downloads and does something with remote files is trivial, for example.  But, probably more interesting, is that once Webmin::API has been loaded, you can make use of the foreign_require function, which is used to access any available Webmin module function library.

For example, if I wanted to make sure Postfix was configured to use Maildir mail spools, I could do the following:

foreign_require("postfix", "postfix-lib.pl");
postfix::set_current_value("home_mailbox", "Maildir/", 1);
postfix::reload_postfix();

That’s it.  No need to worry about parsing the file and no regex needed.  You don’t need to figure out where the Postfix main.cf is located (assuming Webmin is configured correctly), or what the proper way to restart the service is.

One common, and surprisingly complicated, task is setting up initscripts to start on boot.  It seems like every Linux distribution uses a slightly different directory layout, slightly different scripts, and different tools for managing the rc directories and files.  Webmin knows about the vast majority of those quirks, and provides a uniform interface to all of them, and this functionality is exposed to scripts via the init module.  For example, I could enable Postfix on boot with the following:

foreign_require("init", "init-lib.pl");
init::enable_at_boot("postfix");

There is one unfortunate caveat to this: You have to know the name of the initscript.  On all of the systems I work with, this is pretty consistent across most services, with the exception of Apache.  On Red Hat based systems the Apache service is called httpd, while on Debian/Ubuntu systems it is apache2.  Some systems also call it apache.

Working With the Linux Firewall

One of the most powerful Webmin modules is the Linux Firewall module, which manages an iptables firewall.  It is nearly comprehensive, covering many of the advanced stateful capabilities, as well as logging and the creation and management of arbitrary chains.  We can make use of the basic functionality of the module by importing the firewall library.

foreign_require("firewall", "firewall-lib.pl");

Once imported, we have access to the get_iptables_save function, which imports any existing rules from the system default iptables save file into an array.  You can then work with them using standard Perl data management tools like push and splice.

Say you want to open ports 10000 and 20000 (for Webmin and Usermin, of course).  Maybe you also want to make sure ssh (port 22) is available for those times when you need to hit the command line.  The simplest is probably to drop them into an array (so you can add new ports later without having to read code):

#!/usr/bin/perl
 
use Webmin::API;
foreign_require("firewall", "firewall-lib.pl");
use warnings;
 
my @tcpports = qw(ssh 10000 20000);
my @tables = &firewall::get_iptables_save();
(my $filter) = grep { $_->{'name'} eq 'filter' } @tables;
if (!$filter) {
  # No filter table saved yet, so start with an empty one
  $filter = { 'name' => 'filter',
              'rules' => [ ] };
}
 
foreach ( @tcpports ) {
  print "  Allowing traffic on TCP port: $_\n";
  my $newrule = { 'chain' => 'INPUT',
               'p' => [ [ '', 'tcp' ] ],
               'dport' => [ [ '', $_ ] ],
               'j' => [ [ '', 'ACCEPT' ] ],
             };
  splice(@{$filter->{'rules'}}, 0, 0, $newrule);
}
firewall::save_table($filter);
firewall::apply_configuration();

This reads the existing rules, inserts the new ones at the top of the INPUT chain, saves the table, and applies the new configuration. The rules that this creates are identical to what you would get if you’d entered the following on the command line on a Red Hat based system:

iptables -I INPUT -p tcp --dport ssh -j ACCEPT
iptables -I INPUT -p tcp --dport 10000 -j ACCEPT
iptables -I INPUT -p tcp --dport 20000 -j ACCEPT
service iptables save

Now, of course you could do all of that with backticks and substitution, but you’d have to add a bunch of additional logic to figure out whether to use iptables-save, service iptables save, or some variant of the former with an option or two (Debian and Ubuntu have a rather complex set of firewall configuration files, and thus the appropriate iptables save file may not be immediately obvious).  And, dealing with things programmatically is more difficult, if you want to do something interesting like “only add a rule if these two other rules already exist, otherwise add the following two rules”.   And, reading and parsing the rather complex save file and writing it back out yourself can be a challenge (feel free to steal the Webmin code for it, if you prefer not to need all of Webmin).
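To make the "only add a rule if these other rules already exist" idea concrete, here is a minimal sketch of that logic in Python. It is not Webmin code. The dictionaries simply mirror the hash layout used in the Perl script above ('chain', 'p', 'dport', 'j'); rules parsed from a real save file may carry other keys, and the helper names are hypothetical.

```python
def make_rule(chain, proto, dport, target):
    """Build a rule dict shaped like the Perl hash in the script above."""
    return {"chain": chain,
            "p": [["", proto]],
            "dport": [["", str(dport)]],
            "j": [["", target]]}

def has_rule(rules, chain, proto, dport, target):
    """True if an equivalent rule is already present in the list."""
    wanted = make_rule(chain, proto, dport, target)
    return any(all(rule.get(key) == value for key, value in wanted.items())
               for rule in rules)

def allow_if_prerequisites(rules, prerequisites, new_port):
    """Insert an ACCEPT rule for new_port at the top of the list, but only
    when every prerequisite port is already accepted. Returns True if the
    prerequisites were met."""
    if not all(has_rule(rules, "INPUT", "tcp", port, "ACCEPT")
               for port in prerequisites):
        return False
    if not has_rule(rules, "INPUT", "tcp", new_port, "ACCEPT"):
        rules.insert(0, make_rule("INPUT", "tcp", new_port, "ACCEPT"))
    return True
```

The same checks translate directly to Perl with grep over @{$filter->{'rules'}}, which is the point of having the rules in a data structure rather than in shell output.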

Known Issues

This Perl module is new, so it’s pretty safe to say there is room for improvement.  The biggest is that only the core Webmin web-lib.pl and ui-lib.pl functions are documented, and thus the vast majority of functionality found in Webmin you’ll have to parse out from the relevant modules yourself.  I plan to spend some time adding POD documentation to each of those libraries in the not too distant future, but in the meantime, the best documentation is the source itself.  Luckily, every library has an accompanying working example application in the form of the module that it is part of.

Another issue is that Webmin is full of old code.  It’s a ten year old codebase…and much of it isn’t “use strict” or even “use warnings” compliant.   You can, of course, trigger warnings after “use Webmin::API” and it works fine.  See my final iptables example for that kind of usage.  Strict is only usable, even after the import of Webmin, if you disable many types of check.  This is another issue I’ll spend some time on in the future.

In the meantime, there’s a lot of great functionality that’s just been made a little easier to make use of.  I’ll be writing several more articles with examples of using this API in the near future.  Specifically, the next installment of my series on Analysis and Reporting of System Data  will make use of the Webmin System and Server Status module to build a flexible ping monitoring and reporting tool in just a few lines of code.

Sharing JavaScript Code in Webmin


Wednesday, October 31st, 2007

I posted a while back on my personal blog about some UI enhancement work that I’ve been doing in Webmin using the ExtJS JavaScript toolkit. Several folks had questions about whether Webmin was getting a new “official” JavaScript toolkit (it has some ancient and ugly API calls to generate a few JavaScript helpers for things like field graying and validation and such, but they ain’t got that AJAX religion), and, if not, how one could add a JavaScript library to Webmin to cleanly share it across modules and themes.

So, the answer to the first question is that Webmin is not getting an “official” JavaScript toolkit at this time. Webmin has as one of its core goals that it can be used by anyone anywhere with any browser. AJAX and heavy JavaScript usage makes that goal far more complicated. For example, we consider it a serious bug if a blind user using a screen reader can’t use Webmin. That said, we also recognize that AJAX is the best way to handle huge classes of user interaction problems, and with our commercial offering we have a strong interest in having the best looking, and most pleasant to use, UI in the field. So, I’ve begun to build a “semi-official” Webmin module that contains ExtJS and some helper functions and classes. The first example usage of this will be our new TheJAX Virtualmin theme, and soon after a few new modules.

For the second question, I’d just like to show how I’ve created this new ExtJS module for Webmin, and how one can use it. It only takes a few minutes to wrap something up into a module, and since most AJAX frameworks are making use of good JavaScript design practices and using their own namespaces, you can actually mix and match without too much pain.

Hidden Modules

So, Webmin has a very powerful module system, that allows you to package code for easy distribution and installation. A Webmin module is simply a directory with some files in it. Only one file is mandatory to make the directory into a “module”: module.info

So, we create a directory named extjs within the Webmin directory (/usr/libexec/webmin on my system), and make a file called module.info with the following contents:

name=ExtJS
desc=ExtJS AJAX Toolkit
depends=1.360
version=0.1
hidden=1

Here I’ve given it a name, and a short description, noted the version of Webmin it depends on, given it a version (I’m going to start it at 0.1, though the contained ExtJS version is 2.0b), and set it to be hidden. The hidden option means that users won’t be able to see this module in the UI, but other modules can make calls to it. Later, if I decide to add configurable options to this library that I do want users to be able to see, I can make it visible and add an icon and a UI.

Now, I can start dropping in my files. I merely unzipped the ExtJS bundle, deleted the extraneous files, and dropped it into an ext directory within the module directory. That’s just to make it easy to update ExtJS components separately from the helper functions that I write in Perl in the top-level directory.

Helper Functions

So, the simplest thing to automate away is the inclusion of the script tags that load the library. So, I’ll create a header_text function in a file called extjs-lib.pl (Webmin has a convention of calling function libraries modulename-lib.pl), which looks like this:

 # extjs-lib.pl
 
do '../web-lib.pl';
&init_config();
 
my $debug=''; # Set to '-debug' to use non-stripped library
 
# header_text()
# Text to load JavaScript and CSS for use of extjs
sub header_text {
  return <<EOF;
<script src="/extjs/ext/adapter/ext/ext-base.js" type="text/javascript"></script>
<script src="/extjs/ext/ext-all.js" type="text/javascript"></script>
<link href="/extjs/ext/resources/css/ext-all.css" rel="stylesheet" type="text/css" />
<link href="/extjs/ext/resources/css/xtheme-$config%7B" rel="stylesheet" type="text/css" />
EOF
}
 
1;

Here we pull in the Webmin core library, pull in the configuration for this module (which I’ll cover in a couple of days when I’ve completed the configuration code for this module), and build the function to return the bits of text we need to properly load ExtJS and its stylesheets.

Using It

Believe it or not, we’ve now got a library that can be used by other Webmin modules or by themes. Webmin has a foreign_require function that will pull libraries like this in under their own namespace. So, when I need to use ExtJS, I can do this:

foreign_require("extjs", "extjs-lib.pl");
print extjs::header_text();

All done! In a few days I’ll be finished with the first full-featured version of this library, and will wrap it up for distribution, along with some proof-of-concept modules that show how to use a full-featured AJAX interface without breaking text-mode browsers and readers, among other things.

One config file to rule them all


Wednesday, October 10th, 2007

Configuration files are a boring necessity in software development. Parsing existing configuration files is a necessary aspect of almost any systems automation task. I regularly need to read and write configuration files from different languages, as I have simple maintenance, startup, and installation scripts written in BASH, larger Webmin-related tools in Perl, and stuff related to our website written in PHP. Of course, there are some great configuration file parsers for Perl in CPAN, but if you need a highly portable script and you don’t want your user to have to know anything about CPAN, it makes sense to build your own.

Luckily, in all three of these languages, plus Ruby and Python (other favorites of mine), simple configuration files can be easy, if you choose the right format.

Start from the Least Common Denominator

The least capable language in this story, at least with regard to data structures, is probably BASH, so we’ll start by creating a configuration file that’s easy to use with BASH. The obvious choice is a file filled with simple variable assignments, like so:

apache.config

# A comment
show_order=0
start_cmd=/etc/rc.d/init.d/httpd start
mime_types=/etc/mime.types
apachectl_path=/usr/sbin/apachectl
stop_cmd=/etc/rc.d/init.d/httpd stop
emptyvalue=
# A blank line too..
 
max_servers=100
test_config=1
apply_cmd=/etc/rc.d/init.d/httpd restart
httpd_path=/usr/sbin/httpd
httpd_dir=/etc/httpd
#  A comment with an=sign

This file is very nearly valid BASH syntax. Every line is a comment, a blank line, or a variable assignment, so you could run it directly with /bin/sh apache.config, though it wouldn’t do anything useful, because the values are not exported and are only in scope for the split second it takes BASH to process the file. One caveat: the shell treats an unquoted value that contains a space, such as start_cmd above, as an assignment followed by a command to run, so values like that need to be quoted before the shell will accept them as plain assignments. Because it’s BASH syntax, empty lines are ignored, and any line that starts with a # is a comment and also ignored. Empty values are also legal, so we need to accommodate lines that have only a key and no value. Because the file is shell syntax, we can make use of these variables in our scripts easily by sourcing it. In shell scripts this is done using the dot operator ( . ), like so:

. apache.config

After this, each of the values in the apache.config file is accessible by its name. There are some caveats that make this a less than ideal practice for anything more complicated than a small script. The variables pollute the namespace when pulled in this way. So, if you later wanted to use $apachectl_path as a variable for some other purpose, for example, you would overwrite the existing assignment, and cause possibly difficult to diagnose errors. BASH doesn’t have support for complex data structures, so there isn’t much we can do about this, without introducing quite a lot of complexity, so we’ll take our chances and keep our scripts short and simple.

Getting the values into a Perl data structure

While our configuration file is not valid Perl syntax, Perl still has plenty of tools for working with this kind of file. After all, Perl was born to pick up the ball where shell scripts fumbled (and eventually evolved into a hodge podge of every great, and some not so great, ideas in programming languages from the past couple of decades), so it’s natural that it would have the ability to do the same sorts of things as a shell script.

But, since our configuration file is not valid Perl syntax, we can’t simply call do apache.config; as we would to import another Perl script. We’ll have to parse it into a data structure (which is better programming practice, anyway, as mentioned above). One way to do this would be a while loop, like so:

my $file = "apache.config";
my %config;
open(CONFIG, "< $file") or die "can't open $file: $!";
while (<CONFIG>) {
    chomp;
    s/#.*//; # Remove comments
    s/^\s+//; # Remove opening whitespace
    s/\s+$//;  # Remove closing whitespace
    next unless length;
    my ($key, $value) = split(/\s*=\s*/, $_, 2);
    $config{$key} = $value;
}
 
# Print it out
use Data::Dumper;
print Dumper(\%config);

Now, we can access the values in our configuration file from the %config hash, such as $config{'apachectl_path'}. Another option, if you’re feeling particularly idiomatic, is to use map:

my $file = "apache.config";
open(CONFIG, "< $file") or die "can't open $file: $!";
my %config = map {
      s/#.*//; # Remove comments
      s/^\s+//; # Remove opening whitespace
      s/\s+$//;  # Remove closing whitespace
      m/(.*?)=(.*)/; }  # Returns (key, value), or nothing for comments and blank lines
      <CONFIG>;
 
# Print it out
use Data::Dumper;
print Dumper(\%config);

So, what’s the benefit to this latter example? Nothing major, it’s just another way to approach the problem. It’s a couple of lines shorter, but more importantly it has fewer temporary variables, which can be a source of errors in large programs. The multiple substitution regular expressions I’ve shown above in either example could be reduced to a single line, but I believe this is more readable, and according to the Perl documentation breaking the tests out into single tests is faster than having multiple possible tests in a single substitution. Some folks also find long regular expressions difficult to scan.

But, I only like Ruby!

OK, so you want to do it in Ruby. Ruby has a lot in common with Perl, so it’s actually pretty similar, though a bit more verbose. Ruby fans seem to discourage regular expressions, though it is a core part of the language and it has roughly the same regex capabilities as Perl, so I’ve only used one (I guess I could have gotten rid of it somehow…but I got tired of searching for the non-regex answer and punted):

config = {}
 
File.foreach("apache.config") do |line|
  line.strip!
  # Skip comments and whitespace
  if (line[0] != ?# and line =~ /\S/ )
    i = line.index('=')
    if (i)
      config[line[0..i - 1].strip] = line[i + 1..-1].strip
    else
      config[line] = ''
    end
  end
end
 
# Print it out
config.each do |key, value|
  print key + " = " + value
  print "\n"
end

Same end result as the Perl versions above: A config hash containing all of the elements in our configuration file.

What about those web applications written in PHP?

Two of the websites I maintain (Virtualmin.com, and this site) are written in PHP. One is a Joomla application with numerous extensions and custom modules and components, the other is a mildly customized Wordpress site. In the case of Virtualmin.com, we’re developing a number of applications that have both Perl components for the back end work and PHP components for the web front end, so sharing configuration files can be useful. Webmin, conveniently enough, already uses shell variable key=value style configuration files, so everything we do is already in this format.

So, let’s see about getting these configuration files into a PHP data structure. PHP isn’t quite as rich as Perl in its data manipulation capabilities, but it did inherit quite a few of the same tools from Perl, so our solution in PHP looks pretty similar to the while loop version above, though it is a bit more verbose due to the keyword heavy nature of PHP (Perl is often accused of having too much syntax, and PHP has way too many keywords):

$file="apache.config";
$lines = file($file);
$config = array();
 
foreach ($lines as $line_num => $line) {
  # Comment?
  if ( ! preg_match("/^\s*#/", $line) ) {
    # Contains non-whitespace?
    if ( preg_match("/\S/", $line) ) {
      list( $key, $value ) = explode( "=", trim( $line ), 2);
      $config[$key] = $value;
    }
  }
}
 
// Print it out
print_r($config);

Hey, what about snake handlers?

Of course, it can also be done in Python. As with the Ruby implementation, I’m not certain this is the best way to do it, but it works on my test file.

import sys
config = {}
 
file_name = "apache.config"
config_file = open(file_name, 'r')
for line in config_file:
    # Get rid of \n
    line = line.rstrip()
    # Empty?
    if not line:
        continue
    # Comment?
    if line.startswith("#"):
        continue
    # Split on the first "=" only, so values may contain "="
    (name, value) = line.split("=", 1)
    name = name.strip()
    config[name] = value
 
print(config)

Or, as dysmas suggested on Reddit, a more idiomatic version would be:

config = {}
 
file_name = "apache.config"
config_file= open(file_name)
 
for line in config_file:
    line = line.strip()
    if line and line[0] != "#" and line[-1] != "=":
        var,val = line.split("=",1)
        config[var.strip()] = val.strip()
 
print(config)

So, now we’ve got a config associative array filled with all of our values in all of our favorite languages (except BASH, which gets straight variables). Assuming we use a common file locking mechanism, or always open them read only, we could even begin to use the same configuration files across our BASH, Perl, Ruby, Python, and PHP scripts independently but simultaneously.
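As a rough consolidation of the parsers above, here is a small Python sketch that handles every case in the sample file: comments, blank lines, empty values, and values containing spaces or an equals sign. The function name parse_config is my own, and taking the text as a string (rather than a file name) is just a choice that makes it easy to try out.

```python
def parse_config(text):
    """Parse key=value lines into a dict, skipping blank lines and # comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # partition splits on the first "=" only, so values may contain "="
        key, sep, value = line.partition("=")
        if not sep:
            # No "=" at all: keep the bare key with an empty value
            config[line] = ""
            continue
        config[key.strip()] = value.strip()
    return config

sample = """\
# A comment
show_order=0
start_cmd=/etc/rc.d/init.d/httpd start
emptyvalue=

#  A comment with an=sign
"""

config = parse_config(sample)
print(config["start_cmd"])
```

To read a real file, pass it the file contents, for example parse_config(open("apache.config").read()).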

What’s the point?

This isn’t just an academic exercise. The simple examples above make up the early start of a cross-language set of tools for systems management.

With these simple parsers, we can build tools that use the best language for the job, while still leveraging some interesting knowledge contained in Webmin’s configuration files (which are in this key=value format). Webmin supports dozens of Operating Systems and hundreds of services and configuration files, so the config files in a Webmin installation (usually found in /etc/webmin) contain a huge array of compatibility information that would take ages to gather. If you need to know how to stop or start Apache on Debian 4.0, or on Solaris, or on Red Hat Enterprise Linux, you’d have to check an installation of those systems or search the web or ask someone who has one of those systems handy. Or, you could check the Webmin configuration file, and get the same data for all of the Operating Systems Webmin supports. It’s a pretty valuable pile of data. Imagine writing a script for your own favorite OS, and then being able to hand it to anyone that happens to have Webmin installed, regardless of their OS and version. Or, if they don’t have Webmin installed, you could provide a template configuration file that they could fix for their OS and version, addressing both situations as simply as possible.
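As a small illustration of leaning on that data, the sketch below looks up the command for an action in an already parsed configuration. It assumes the start_cmd, stop_cmd and apply_cmd keys from the sample file shown earlier; other Webmin modules use other key names, and command_for is a hypothetical helper, not something Webmin provides.

```python
import shlex

def command_for(config, action):
    """Return the argv list stored under '<action>_cmd', or None if it is unset."""
    value = config.get(action + "_cmd", "")
    return shlex.split(value) if value else None

# Values copied from the sample apache.config shown earlier.
config = {
    "start_cmd": "/etc/rc.d/init.d/httpd start",
    "stop_cmd": "/etc/rc.d/init.d/httpd stop",
    "apply_cmd": "/etc/rc.d/init.d/httpd restart",
}

print(command_for(config, "apply"))
```

The returned list is in the form subprocess.call expects, so the same script can restart Apache on any system whose configuration file carries the right values.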

Not the only configuration file format

Of course, this isn’t the only configuration file format out there, or even the best. Python users really like INI files, and I can’t argue with them. When I was writing Perl and Python predominantly, I used the Config::INI::Simple module from CPAN and ConfigParser for Python so I could share configuration between my various software easily (I was generally writing a Webmin front end in Perl to a Python back end application). That worked great. So, I’m not arguing you ought to be using key=value configuration files for everything. But being able to read them makes a lot of portability data available to you for free.
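For comparison, here is roughly what the INI approach looks like with Python's standard library. In Python 3 the module is spelled configparser (ConfigParser is the Python 2 name). The [apache] section name and the choice of keys are assumptions for the sake of the example.

```python
import configparser

# A hypothetical INI version of part of the sample configuration.
ini_text = """\
[apache]
start_cmd = /etc/rc.d/init.d/httpd start
max_servers = 100
"""

parser = configparser.ConfigParser()
parser.read_string(ini_text)

start_cmd = parser.get("apache", "start_cmd")
max_servers = parser.getint("apache", "max_servers")  # converted to an int
print(start_cmd, max_servers)
```

The sections and typed getters are what you gain over plain key=value files; what you lose is the ability to source the file from a shell script.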

Next time I’ll wrap a couple of these routines up into friendly libraries for easy use, and add some tests to be sure we’re doing what we think we’re doing.

Analysis and Reporting of System Data Part 1


Tuesday, October 2nd, 2007

There are a few basic elements to maintaining and administering systems: configuration, software management, data integrity and availability, and monitoring and reporting. This article introduces a number of tools for the last of those components, as well as presents some simple ways to create custom tools to report on data specific to your environment. There are dozens of great Open Source tools for gathering and presenting data, and so this series merely scratches the surface, but it provides a good introduction to some of the major system data analysis problems and presents some solutions.

Before trouble starts

Who, What, When, Where, Why and How

The six W’s (yeah, I’m not sure why “how” is one of the Ws, either) of reporting also apply to systems data. You want to know:

Who has been interacting with your server and services.

What they did.

When they did it, so you can determine if something they did is related to problems on the system.

Where they were coming from, just in case they aren’t who they claim to be.

Why? OK, so systems data probably can’t tell you why someone did something. You’ll have to ask them. But, with the right tools you’ll know who to ask and what to ask them, if anything funny does happen on your systems.

And, how any problems came about, so you can prevent them in the future. In short, the goal of all of this analysis and reporting on systems data is to keep your sysadmin house in order.

Oops.  Something went wrong.

The Basics

In the spirit of starting from first principles, we’ll begin this little exercise with the rudimentary tools that every system administrator ought to know a bit about: grep and tail.

While there are lots of automatic tools that provide graphs and charts and doohickeys that you can click or drag or hover over for hours of fun, odds are very good that some day, you’ll need to find out something very specific about a service on your system. Do you really want to schlep all over the Internet looking for just the right log analysis tool to find out whether that important message your boss sent to your company’s biggest client was actually delivered? Of course not! Your boss is breathing down your neck right now. This is a job for grep!

grep is a search tool. It finds lines in a text file that match a regular expression [1] and prints them to STDOUT. Like all UNIX command line tools, it can easily be combined with other tools for maximum awesomeness. So, let’s see grep in action, eh?

Find the boss’ email to badass@superhappymegacorp.com. Your boss (wimpy@thefacelesscorp.com) sent it out yesterday and he still hasn’t gotten a reply!

grep "to=<badass@superhappymegacorp.com>" /var/log/maillog

Assuming your boss actually sent the message, this will print out something along the lines of:

Sep 24 23:04:52 www postfix/smtp[3208]: 93498290E97: to=<badass@superhappymegacorp.com>, relay=none, delay=42281,
status=deferred (connect to mail.superhappymegacorp.com[192.168.1.100]: Connection timed out)

Aha! The superhappymegacorp.com mail server isn’t responding. The message didn’t go through yet, but it’s not our fault! Ass covered. Rest easy and reward yourself with another one of those delicious cupcakes that cute secretary brought in this morning.

Just when you begin to think the rest of your day is going to be easy, in comes the web designer. She’s thoroughly in a panic because one of her off-shore contractors got the syntax wrong in an .htaccess file and exposed a directory filled with sensitive files. It’s now been fixed, but she needs to know if anyone outside of the company accessed those files during the couple of days while they were exposed. Hmmm…sounds like another job for grep. But, we need to find entries that don’t match a particular pattern. We’ll use the “-v” option to negate the pattern.

grep -v '^192\.168\.1\.' /var/log/httpd/access_log

This assumes 192.168.1. matches our local company subnet. The “^” indicates that the pattern should appear at the beginning of a line, which in the Apache common log format is where the client IP appears. Because grep uses regular expressions, and the period “.” has special meaning (it means “match any single character”), I’ve used a backslash “\” to escape the periods in the IP. It would match anyway, because a period matches “any single character”, but it could lead to false positives (or negatives in this case) because 192.168.100.1 would match even though it isn’t in the 192.168.1.0/24 network.
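The difference the backslashes make is easy to check. The sketch below uses Python's re module as a stand-in for grep, since both treat this particular pattern the same way; the two log lines are invented for the test.

```python
import re

escaped = re.compile(r"^192\.168\.1\.")   # periods match literal periods only
unescaped = re.compile(r"^192.168.1.")    # periods match any single character

# Made-up access log lines: one client outside 192.168.1.0/24, one inside.
outside = '192.168.100.1 - - [24/Sep/2007:23:04:52 -0500] "GET /secret/ HTTP/1.0" 200 512'
inside = '192.168.1.20 - - [24/Sep/2007:23:05:10 -0500] "GET /secret/ HTTP/1.0" 200 512'

for name, pattern in (("escaped", escaped), ("unescaped", unescaped)):
    print(name,
          "outside matches:", bool(pattern.search(outside)),
          "inside matches:", bool(pattern.search(inside)))
```

With the unescaped pattern the outside address counts as local, so grep -v would silently drop exactly the kind of request we are looking for.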

Next up, tail, a nifty little tool that I use many times every day. In its simplest form it displays the last 10 lines of a file (the -n option changes the count). Because log files on a UNIX system always append new entries to the end of the file, this will always show the most recent items in the log. It’s very useful for interactively debugging problems.

Even better, modern tail implementations include the “-f”, or “--follow”, option, which prints the log entries as they are added. So, if I were debugging a particularly ornery mail problem, I might watch the maillog with “tail -f” while making requests. Of course, if I’m looking at the logs of a very active server, I might want to only see very specific entries. Say, I’m not sure why a particular mailbox isn’t receiving mail. We can combine tail and grep, like so:

tail -f /var/log/maillog | grep info@thefacelesscorp.com

Now, when I send an email to info@thefacelesscorp.com, I’ll see the related entries in the maillog (of course, in some cases, it won’t show all related entries…you might then need to pick out a message ID and grep the whole log based on that ID).
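The "pick out a message ID and grep the whole log" step can be sketched in a few lines of Python. This assumes Postfix-style lines like the one shown earlier, where an upper-case hexadecimal queue ID follows the process name; the pattern, the function name lines_for_address, and the first and last sample lines are assumptions for illustration.

```python
import re

# Assumed shape: "postfix/<daemon>[<pid>]: <QUEUEID>: ..."
QUEUE_ID = re.compile(r"postfix/\w+\[\d+\]: ([0-9A-F]+):")

def lines_for_address(log_lines, address):
    """Return every line that shares a queue ID with a line mentioning address."""
    log_lines = list(log_lines)
    ids = set()
    for line in log_lines:
        match = QUEUE_ID.search(line)
        if match and address in line:
            ids.add(match.group(1))
    related = []
    for line in log_lines:
        match = QUEUE_ID.search(line)
        if match and match.group(1) in ids:
            related.append(line)
    return related

log = [
    "Sep 24 11:20:11 www postfix/qmgr[2101]: 93498290E97: from=<wimpy@thefacelesscorp.com>, size=1024, nrcpt=1 (queue active)",
    "Sep 24 23:04:52 www postfix/smtp[3208]: 93498290E97: to=<badass@superhappymegacorp.com>, relay=none, delay=42281, status=deferred",
    "Sep 24 23:05:00 www postfix/smtp[3210]: A1B2C3D4E5F: to=<someone@example.com>, relay=none, delay=3, status=deferred",
]

related = lines_for_address(log, "badass@superhappymegacorp.com")
print(len(related))
```

Here the sender line is picked up even though it never mentions the recipient, which is exactly what a plain grep on the address would miss.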

Next week, we’ll cover using Perl to extract useful information from your system and build time series graphs from the data.

See also

grep documentation

grep at Wikipedia

tail documentation

tail at Wikipedia

  1. Regular expressions, or regexes, are a syntax for advanced pattern matching. There is a de facto standard known as egrep, or extended grep, style regexes. This further evolved into Perl style regexes, which are used by many other languages and tools, via the pcre (Perl Compatible Regular Expressions) library. The Perl regex documentation is among the best on the subject. Jeffrey Friedl’s Mastering Regular Expressions takes the subject to the next level, and covers grep, egrep, sed, Perl, and much more.