Dan Harkless' Software

This is a collection of software I've written to solve specific problems. Some of it is UNIX-specific, but most of it should also work on Windows, assuming you have ActiveState Perl and/or Cygwin installed. Everything is released under the terms of the GNU General Public License.

In addition to the original software below, I have also been a developer and co-maintainer of GNU wget and nmh. Unfortunately, I have not recently had time to keep up with either project :-(, but I'm planning to become involved in nmh again as soon as I am able.

I've also contributed fixes and/or new features to Apache, BIND, Cistron RADIUS, CVS, Douglas K. Rand's syslogd, dump-contacts2db, EGD, GNU Emacs, patch, and wdiff; LEAF, MRTG, Perl modules CGI, Cwd, Image::Size, Lingua::EN::Squeeze, and Mail::SPF::Query; Mozilla Firefox, pidentd, Plum Hall C Validation Suite, sendmail and Milters dnsbl-milter and spf-milter, Snort, SSH.com's ssh, tcsh, TeX4ht, trn, XEmacs, XFree86 xterm, and others.


reformat_contacts2.db_phone_numbers_to_use_parens    The VCF import function in the Android Contacts app reformats (###) ###-#### phone numbers to ###-###-####, which I find harder to read. If you run this script on your Android device after the VCF contact list is imported, this will be fixed.
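As a rough illustration of the string transformation involved (not the script itself, which works on the contact database on the device), here is a minimal Python sketch. The pattern and the function name `use_parens` are assumptions made for this example.

```python
import re

# Hypothetical pattern: ###-###-#### that is not part of a longer digit run.
DASHED = re.compile(r"(?<!\d)(\d{3})-(\d{3})-(\d{4})(?!\d)")


def use_parens(number):
    """Rewrite ###-###-#### as (###) ###-####; leave anything else untouched."""
    return DASHED.sub(r"(\1) \2-\3", number)


if __name__ == "__main__":
    print(use_parens("949-555-0123"))
```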

CGI Scripts

email_form_data    See entry in Email: Anti-spam section below.

frame_generator    Generates an HTML 4.01 frameset on the fly. For an example of it in action, check out my Buying Guides.

gen_form_search_db    Takes a TAB-delimited database and a database description file in my proprietary plain-text format and displays a table of the data and/or a form for searching it. The search form uses checkboxes for enumerated values and text-entry boxes for arbitrary text fields. You can search for plain-text strings or Perl regular expressions, and database fields can be sorted in any order or omitted entirely. Results can be output one record per row or one record per column. Check out my Buying Guides or the "Have" links on my Possessions page to see it in action.

GET_to_POST    Converts a GET request to this script into a POST request to the specified URL, redirecting automatically via JavaScript, if desired.

image_album    See entry in Images section below.

tryout_inlines    Allows you to easily try out a bunch of different alternatives for some piece of inline media on an HTML page. For instance, if you're not sure what background graphic you want to use on a page, you can use this script to try out several.


ifup_nsupdate    Does a TSIG-signed dynamic DNS update using BIND's nsupdate command when your network interface comes up (and has a new IP address). Useful for PPP, PPPoE, and other systems with dynamically-assigned IPs. Written for use with LEAF (specifically Bering) and is the main component of nsupdate.lrp, but should also work on normal Linux installations (particularly Debian) with little or no modification.

watch_whois_record    Called as a cron job, periodically checks to see if a given domain name has become available (or has otherwise changed) by checking for a particular string in its WHOIS record.


csh-mode.el    The csh-mode.el and csh-mode2.el available in the old Emacs Lisp Archive are extremely primitive compared to, say, Gary Ellison's ksh-mode. When I bemoaned this fact on comp.emacs, Carlo Migliorini sent me a copy of ksh-mode.el he had done a search-and-replace on to produce csh-mode.el. I have done extensive further polishing, resulting in a mode that is even superior to ksh-mode 2.6 in some ways. I probably should have called this file tcsh-mode.el, as it has support both for true csh and the additional features of tcsh.

Email: Anti-spam

email_form_data    Yet another web form data emailer along the lines of Reuven M. Lerner's original 1994 form-mail.pl script (but minus the security holes that so many of them have). Has some nice features (including anti-spam measures) to make it more feasible to replace mailto: links with calls to this script.

spfilter2bind    Converts an spfilter-format DNSBL input file into a BIND zone file. I wrote this because the spfilter.pl script contained bugs that prevented BIND from successfully loading zone files created by it, and because I wanted a much simpler and more lightweight option to use in conjunction with my run_if_modified script.

test_DNSBL    Looks up one or more IP addresses on the specified DNS Blackhole List. Optionally uses monitor_file to keep tabs on whether the DNSBL stays operational.

Email: General

email_address    Takes a full email address, which may include a real name, such as Dan Harkless <dan@wave.eng.uci.edu>, and prints just the actual address part -- dan@wave.eng.uci.edu in this case.
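The same idea can be sketched in Python with the standard library's `email.utils.parseaddr`. This is only an illustration of the behavior described above; the function name `bare_address` is made up for the example.

```python
from email.utils import parseaddr


def bare_address(full):
    """Return just the address part of a 'Real Name <user@host>' string."""
    _real_name, address = parseaddr(full)
    return address


if __name__ == "__main__":
    print(bare_address("Dan Harkless <dan@wave.eng.uci.edu>"))
```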

rename_to_DATE--SUBJ    Takes a group of mail message files (as from MH/nmh, or .eml files from Outlook Express) and renames them to be of the form YYYY-MM-DD_HH:MM[:SS]--SUBJECT. This was written for the purpose of converting an archive of security vulnerability emails into a group of files downloadable from a website.
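A minimal Python sketch of how such a filename could be derived from a message's headers follows. It assumes the message has both Date and Subject headers, always includes the seconds, and replaces "/" in the subject with "_"; how the actual script sanitizes subjects is not stated here, and `date_subj_name` is a hypothetical name.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def date_subj_name(raw_message):
    """Build a YYYY-MM-DD_HH:MM:SS--SUBJECT name from a message's headers."""
    msg = message_from_string(raw_message)
    when = parsedate_to_datetime(msg["Date"])
    # Assumption: only "/" needs replacing to make the subject a legal filename.
    subject = msg["Subject"].replace("/", "_")
    return f"{when:%Y-%m-%d_%H:%M:%S}--{subject}"
```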

sms_biff    Just as xbiff extends the biff concept (of instant notification of email arrival) from tty to X11, this script extends the concept to SMS messaging. If you set up your email filtering to call this script on a copy of incoming messages, it'll send a terse version of each message to your cell phone via SMS. Here's an example of calling this script from procmail (or see below for a more complex example).

treocentral_wmexperts_notification    Takes a TreoCentral.com or WMExperts.com "Reply to post" or "New Private Message" email on stdin and filters it to contain just the URL (and subject), so that when an SMS notification of the email is sent to you by my sms_biff script (or equivalent), you'll be able to browse to the thread / private message directly from the SMS message on your Treo. Here's an example of calling this script in conjunction with sms_biff from procmail.

Email: MH/nmh

compp    A wrapper around MH/nmh's comp for composing PGP-encrypted messages. Requires medit.

find_correspondents    If you use MH/nmh or some other mail system that keeps one message per file, named numerically, you can use this script to find everyone who has emailed you at a particular address or addresses. This is useful, for instance, if you need to inform everyone of a new address.

forwp    A wrapper around MH/nmh's forw for forwarding and PGP-encrypting messages. Requires medit.

medit    An emulation of Jim Hester's 1986 medit C program. When used as your MH/nmh Editor, prompts you for header information and then spawns your favorite editor to edit the actual body. Also fills out headers automagically when possible and handles MIME-decoding and PGP encrypting / decrypting.

replm    A wrapper around MH/nmh's repl for replying to MIME-encoded messages. Requires medit.

replp    A wrapper around MH/nmh's repl for replying to PGP-encrypted messages. Requires medit and an mhl.replp file.

show_JIS_email    A program to be called from MH/nmh's mhshow-charset-iso-2022-jp entry to view JIS-encoded Japanese emails on AIX (or on other OSes with some tweaking). Requires Ken Lunde's jconv program.

showp    A wrapper around MH/nmh's show for showing PGP-encrypted messages. Requires mhl.headers and mhl.body, which come with nmh.

General commandline utilities

backslashify    This filter puts backslashes in front of characters that are special to commands that process regular expressions. Can backslashify all extended regexp characters, [$()*+.?[\^|], or just basic regexp characters, [$*.[\^]. Especially useful in scripts where an egrep or sed expression includes a variable whose contents are unpredictable and may contain special characters (say, a filename like "c++filt.c").
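The escaping described above can be sketched in a few lines of Python. The two character sets are taken from the description; the function signature is an assumption for the example and does not reflect the filter's actual options.

```python
# Character sets as listed in the description above.
EXTENDED = set("$()*+.?[\\^|")
BASIC = set("$*.[\\^")


def backslashify(text, extended=True):
    """Put a backslash in front of each regexp-special character in text."""
    special = EXTENDED if extended else BASIC
    return "".join("\\" + ch if ch in special else ch for ch in text)


if __name__ == "__main__":
    print(backslashify("c++filt.c"))
```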

bgrep.c    Rather than using find . -type f -exec egrep pattern {} \; -print, use find * -type f -exec bgrep --blank pattern {} \;. For each file with egrep matches, a terminal-width boldface banner will be printed before the matches (rather than afterwards, as with find's -print), containing the filename, right-justified. The --blank flag causes a blank line to be printed between each file, for even more readability. (Another option that came into existence after I wrote bgrep is to use the -H option in GNU grep 2.1+ to prefix each match line with the filename, as in find . -type f -exec egrep -H pattern {} \;, but bgrep still wins out on the readability front.)

count    Like sort -u, but with instance counts. Also handy for sorting IP addresses if you can't remember GNU sort's sort -k 1,1 -k 2,2 -k 3,3 -k 4,4 -n -t . -u syntax.
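For illustration, both ideas (unique lines with instance counts, and numeric ordering of IP addresses) can be sketched in Python. The function names and the return format are assumptions; the script's actual output layout is not described here.

```python
import ipaddress
from collections import Counter


def count_lines(lines):
    """Return sorted (line, occurrences) pairs, like sort -u plus counts."""
    counts = Counter(line.rstrip("\n") for line in lines)
    return sorted(counts.items())


def sort_ips(addresses):
    """Return the unique IP addresses in numeric, not lexical, order."""
    return sorted(set(addresses), key=ipaddress.ip_address)
```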

d2u    A wrapper around dos2unix which preserves file ownership, permissions, and modification time, and which won't mangle binary files.

fstype.c    Prints what kind of filesystem a file or files live in (e.g. autofs, cdfs, ext2/3/4, hfs, jfs, nfs, tmpfs, ufs, etc.), or else the filesystem magic numbers.

grep16    Greps for a Perl regular expression in each file specified on the commandline, or stdin if no files are specified. The regular expression is searched for in each file both as-is and converted to a UTF-16 string with NULs between each character. Input files can be big and/or little endian.
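A simplified Python sketch of the dual search follows. It handles only a literal string rather than a regular expression, which is a deliberate simplification, and the function name is reused here purely for illustration.

```python
def grep16(data, needle):
    """Report whether needle occurs in data as-is or as UTF-16 (LE or BE)."""
    candidates = (
        needle.encode("utf-8"),      # the string as-is
        needle.encode("utf-16-le"),  # NUL after each ASCII character
        needle.encode("utf-16-be"),  # NUL before each ASCII character
    )
    return any(raw in data for raw in candidates)
```

To use it on a file, read the file in binary mode and pass the bytes as `data`.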

longest_line.c    Prints the length (including the trailing \n, if any) of the longest line in the specified file (or stdin, if none specified).

lookin    Looks in all files in a hierarchy matching a given pattern (e.g. *.[ch]) for a specified regular expression by having find call bgrep.

mirror    Mirrors a directory tree onto another machine. Unlike rdist, it builds up a secondary tree and then switches out the old one for the new one at the last second so that the tree isn't in an inconsistent state during the transfer.

most.c    A pager with several improvements over more -- it properly displays bold and underlined text, allows backwards movement even on stdin, completely consumes stdin to allow percentage displayed, and highlights the last line on each page like trn -m so you don't lose your place on the final, partial page. less now has all those capabilities as well, but one important feature most has that neither more nor less does is that if you copy & paste a long, wrapping line, it will preserve it as-is rather than breaking it up into multiple newline-terminated screen-width lines.

total    Takes a column of numbers on stdin and outputs their total on stdout (simple but very useful).

Images

image_album    Generates HTML on the fly for navigation of albums of related images; for example, online photo albums. Supports several different viewing modes appropriate for different display sizes and connection speeds, and allows the viewing of additional image info (such as EXIF data). To see it in action, check out my online photo albums, or those of my brother Steve, Mom, or Dad.

image_album_prep    Prepares a group of image files for use with my image_album CGI script (or other presentation methods). Can edit caption files, add comments, restore file timestamps from EXIF data, create reduced-size copies of images (e.g. for thumbnails), rename images to prefixNNN.extension, remove embedded EXIF thumbnails, losslessly rotate portrait shots into the correct orientation, and more. Some of these abilities require particular external programs to be available, as specified in the documentation.

image_info    A standalone script interface to the Perl CPAN module Image::ExifTool. Prints out information (such as comments, dimensions, and EXIF fields) for the image files specified on the commandline. The data is formatted such that you can easily grep for particular values of particular attributes in a range of files.

image_info_modtime    Sets the modification time for one or more image files to the value of the embedded timestamp field specified (or the EXIF DateTimeOriginal field by default). The embedded timestamp can be interpreted as being in the local time zone or as UTC.

rdjpgcom_wrapper    A wrapper around rdjpgcom that allows the specification of multiple files on the commandline and that prints each comment with a prefix of "filename: ". If a file has no comment, "No JPEG comment" is printed (rdjpgcom is simply silent on such files).

System administrator utilities

access_log_domains    Outputs all the unique domains found among website visitors in an Apache access_log. For each one, outputs the hostname and timestamp of the last visitor from that domain, and optionally the hostname / timestamp of the first visitor as well.

booby.c    A little boobytrap. If you find an evil binary on your system, like some kind of cracker tool, and rather than deleting it you want to be notified next time someone uses it, you can move it to some odd location and replace it with an executable compiled from this file. The wrapper will do a little snooping and then run the real executable as if nothing's amiss. Basically you're taking an unimportant known-compromised machine and transforming it into a poor man's honeypot.

check_df    A script you can call from cron to output (and thus email) a df listing if any of the filesystems' usage is greater than or equal to the percentage you specify on the commandline.
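The threshold test can be sketched in Python using `shutil.disk_usage`. This is an illustration rather than the script's logic: the function names are hypothetical, and the percentage computed here may differ slightly from the figure df itself prints.

```python
import shutil


def percent_used(path):
    """Return the percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def over_threshold(percentages, limit):
    """Return the mount points whose usage is at or above limit percent."""
    return sorted(p for p, pct in percentages.items() if pct >= limit)


if __name__ == "__main__":
    # Print only when there is something to report, so cron sends no empty mail.
    full = over_threshold({"/": percent_used("/")}, 90)
    if full:
        print("Filesystems at or above 90%:", ", ".join(full))
```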

crypt8    Prompts for a string to run through crypt() and prints the result on stdout. This can come in handy if you need to change a password but you can't (or don't want to) use the passwd program, e.g. because hackers have broken into your system and deleted it. You can also manually specify a salt, e.g. to verify if someone's password is set to a given value without having to attempt a log-on.

extract_UserDir_logs.c    A shell account customer wanted to know if he could have access to his http://shell_machine/~user log files, and I was surprised to discover that Apache doesn't give you any way to set up per-user logs. The best solution would be a module that would enable this, but this program was easier to write. It writes the logs as user_access_log and user_error_log in a directory you specify, which users should have read-access to but not write-access to.

monitor_file    Monitors a file (or pseudo-file, like a /proc entry) to watch for changes. If there are any, diff output will be displayed, or sent to the email address specified.
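The comparison step can be sketched in Python with `difflib`. The sketch assumes the previous contents were saved somewhere between runs; the function name and the labels in the diff header are made up for the example.

```python
import difflib


def changes(previous, current, name="file"):
    """Return a unified diff of two snapshots, or '' if nothing changed."""
    diff = difflib.unified_diff(
        previous.splitlines(keepends=True),
        current.splitlines(keepends=True),
        fromfile=name + " (previous)",
        tofile=name + " (current)",
    )
    return "".join(diff)
```

An empty result means there is nothing to display or email.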

single_document_webserver.c    A simple webserver that will only serve a single specified HTML file.


image_info_modtime    See entry in Images section above.

run_if_modified    Examines the modification timestamp on a specified file, then runs a specified update command. If the update command has modified the timestamp, a second specified command will then be run. One example of where you might use this is in a cron job that periodically retrieves a DNS zone file via rsync or wget (such as a DNS Blackhole List that doesn't offer traditional zone transfers), then restarts your nameserver if the zone has been modified.
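The control flow described above can be sketched in Python as follows. The function name mirrors the script's, but the argument-list interface is an assumption for this example and not the script's actual commandline.

```python
import os
import subprocess


def _mtime(path):
    """Return the file's modification time in ns, or None if it is missing."""
    try:
        return os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return None


def run_if_modified(path, update_cmd, followup_cmd):
    """Run update_cmd, then followup_cmd only if path's timestamp changed."""
    before = _mtime(path)
    subprocess.run(update_cmd, check=True)
    if _mtime(path) != before:
        subprocess.run(followup_cmd, check=True)
        return True
    return False
```

In the DNS example, `update_cmd` would be the rsync or wget invocation and `followup_cmd` the nameserver restart.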

timepreserve.c    Preserves the access and modification timestamps of one or more files across the execution of a commandline that would otherwise modify them.
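A minimal Python sketch of the same save-run-restore idea follows. The original is a C program; the function name and interface here are assumptions for illustration only.

```python
import os
import subprocess


def run_preserving_times(paths, cmd):
    """Run cmd, then restore the access and modification times of paths."""
    saved = {p: os.stat(p) for p in paths}
    try:
        return subprocess.run(cmd).returncode
    finally:
        # Restore timestamps even if the command failed.
        for p, st in saved.items():
            os.utime(p, ns=(st.st_atime_ns, st.st_mtime_ns))
```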

timestamp    Outputs, manipulates, saves, and restores timestamps on files in UNIX-style seconds-past-the-epoch form. (This script replaces my old timecopy.c, timeset.c, timestamp.c, and timestamp++.c programs.)

Windows utilities

MBProbe_critical.sh    A Cygwin shell script to be called from MBProbe when measurements go into the Critical zone. Quickly and unceremoniously shuts down the machine. This script is necessary because on most machines MBProbe's builtin Shutdown feature only shuts down to the "It is now safe to shutdown your computer" screen, rather than powering off.

MBProbe_warning.pl    A Perl script to be called from MBProbe when measurements go into the Warning zone. Emails you the first and last few lines from MBProbe's Event and History logs.

reverse_lookup    Does NetBIOS and DNS reverse lookups on a list of IP addresses.

In addition to the above, I've also squirrelled away some old software that used to be on this page but is now either obsolete or just no longer of use to me in my current computing environments. A few things there might still be of use to someone, though, particularly if they're on an old system.

Dan Harkless
Last modified: September 26, 2016