cron bug: silently fails when too much output is produced

Edited: I reported this issue to Debian and Christian Kastner patched it. Debian cron versions 3.0pl1-110 and higher should behave properly.

As reported in Ubuntu bug #151231 and at stackoverflow, cron jobs can fail silently when too much output is produced and an MTA is not installed.

Although the comments in the Ubuntu bug indicate that Debian is not affected because it includes the exim MTA by default, a Debian VServer installs only the most basic packages, with no MTA. I encountered this bug today while I was working on a backup script for a Subversion repository that runs in a Debian-based VServer. I previously discussed svn backups, and my script was using the same svnadmin dump technique I mentioned in that post.

When I ran the backup script manually, everything worked as intended and it produced a ~1.3GB backup file; however, when it ran under cron, the backup file was only ~2MB. Since the simplest solution is usually the correct one, I didn't initially suspect a bug in cron as the problem and tried to troubleshoot my script/environment. I eventually found the two links I listed above, where others have encountered the same bug.

It appears that the bug in cron causes it to silently fail when too much output is directed to STDERR without an MTA being present. In the case of svn backups, using the "quiet" flag (-q) for svnadmin dump is a possible workaround. In other cases, either redirecting STDERR to /dev/null or setting MAILTO="" in /etc/default/cron seems to be a valid workaround.
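As a rough sketch of the STDERR-redirect workaround, a small wrapper can append a job's STDERR to a log file rather than /dev/null, so error messages are kept while cron has nothing to mail. The function name run_quiet and the paths in the usage comment are hypothetical and not taken from the original backup script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original backup script):
# run a command with STDERR appended to a log file, leaving STDOUT alone.
run_quiet() {
    log_file=$1
    shift
    "$@" 2>>"$log_file"
}

# Assumed usage inside a script run by cron; all paths are placeholders:
#   run_quiet /var/log/svn-backup.log svnadmin dump /srv/svn/repo > /backup/repo.dump
```

Only STDERR is redirected here because svnadmin dump writes the dump itself to STDOUT.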

April 8, 2010 | 0 comments
Speed up browsing with faster DNS resolution using namebench

In December 2009, Google publicized their new public DNS resolution service; a week later, they also announced the open-source release of namebench, a DNS benchmarking tool. It is designed to "hunt down the fastest DNS servers available for your computer to use."

Since web traffic is measured in milliseconds, delays of 200-400ms from slow DNS resolution can matter more than they would seem. Benchmarking a variety of DNS servers with namebench can help you find a faster resolver and speed up browsing.

You could write a script to test DNS resolution via nslookup, or a similar query tool; however, namebench includes several interesting features: it uses your browser history to select a list of hosts to test against, appropriately handles multiple IPs of cache-sharing resolvers, and reports on resolvers that use DNS hijacking. At the end of its benchmarking process, namebench suggests which DNS resolver seems to be the fastest, and displays useful statistical information about the test results.

March 1, 2010 | 0 comments
Extend Vista and Windows 7 Activation

After installing Windows Vista or Windows 7, you have a 30-day grace period before activation is required. We've recently been having problems accessing our internal-use license keys under our Microsoft Partner/Action Pack subscription, and Microsoft has been slow to resolve the underlying technical issue. Unfortunately, a coworker installed Windows 7 about a month ago, expecting the issue with license keys to be resolved within 30 days.

If you need to renew the 30-day grace period, it is possible to extend it up to three times, providing up to a total of 120 days to obtain a license key. To extend the activation time for Windows Vista or Windows 7, launch a command prompt with Administrator privileges and type: slmgr -rearm

February 19, 2010 | 0 comments
Recover an Overwritten File on ext3 File System

I've needed to recover deleted files on ext3, FAT, and NTFS file systems in the past, but I recently needed to recover a previous version of a text file I had overwritten by editing and saving it. I initially thought I might be able to recover it either by accessing the inode used by the previous version of the file, or by looking at ext3's journal.

Unfortunately, I had used nano to edit the file. Apparently, nano saves files by truncating and overwriting the file, reusing the same inode. Also, I quickly realized ext3's journal wouldn't help because my file system was mounted using data=ordered, not data=journal. From the ext3 FAQ:

  1. data=journal: Journals all data and metadata, so data is written twice.
  2. data=ordered: Only journals metadata changes.

Ultimately, I was able to recover the file with some help from stat, debugfs, and blkls from The Sleuth Kit. Before getting started, you'll need to install The Sleuth Kit. On Debian, it is available as a package, so: apt-get install sleuthkit

First, check the inode being used by the file: stat file.txt | grep Inode

This should return a line containing the inode, like: Inode: 1474575
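If you only need the number itself, stat can print it directly. This is a minimal sketch that assumes GNU coreutils (the -c option is not available in BSD stat); inode_of is a hypothetical helper name.

```shell
#!/bin/sh
# Print only the inode number of a file (assumes GNU stat).
inode_of() {
    stat -c %i -- "$1"
}

# Example: inode_of file.txt
```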

Next, back up the file, then delete it:
cp file.txt file.old
rm file.txt

Run debugfs /dev/sda1, replacing /dev/sda1 with the partition the file is on. From the debugfs CLI, run stats and check its output for "Blocks per group". On my system, and most of the time, this is 32768. While still in the debugfs CLI, run imap <inode> to find the block that contains the inode (the angle brackets are part of debugfs's syntax for inode numbers): imap <1474575>. In my case, the block was 5898242.
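The same two requests can probably be run without entering the interactive CLI by using debugfs's -R option. The sketch below assumes root access and an unmounted or read-only file system; the function names are hypothetical, and the parser assumes the stats output contains a line beginning with "Blocks per group:".

```shell
#!/bin/sh
# Sketch: run debugfs requests non-interactively with -R.
# Nothing here touches a device until one of the functions is called.

# Extract the number from debugfs "stats" output read on stdin.
parse_blocks_per_group() {
    awk -F: '/^Blocks per group/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# blocks_per_group DEVICE
blocks_per_group() {
    debugfs -R "stats" "$1" 2>/dev/null | parse_blocks_per_group
}

# inode_location DEVICE INODE: print debugfs's imap output for the inode.
inode_location() {
    debugfs -R "imap <$2>" "$1" 2>/dev/null
}

# Example (as root): blocks_per_group /dev/sda1
#                    inode_location /dev/sda1 1474575
```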

Once you know the block the inode is in and the number of blocks per group, compute a block range that ends at 5898242+32768-1 = 5931009, and use blkls to copy that range to a file: blkls /dev/sda1 5898242-5931009 > tmp.dat
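The range arithmetic is easy to get wrong by one, so here is a small sketch that builds the blkls argument from the two numbers. The helper name block_range is hypothetical.

```shell
#!/bin/sh
# block_range START BLOCKS_PER_GROUP: print "START-END" for blkls,
# where END = START + BLOCKS_PER_GROUP - 1.
block_range() {
    echo "$1-$(( $1 + $2 - 1 ))"
}

block_range 5898242 32768   # prints 5898242-5931009
```

The result could then be passed along as: blkls /dev/sda1 "$(block_range 5898242 32768)" > tmp.dat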

Finally, open tmp.dat in your favorite text editor or use grep to search for the overwritten version of your file.
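Because tmp.dat is a raw dump that mixes text with binary data, grep needs to be told to treat it as text. A minimal sketch, assuming GNU grep and that you remember a phrase from the lost version; search_dump is a hypothetical helper name.

```shell
#!/bin/sh
# search_dump PATTERN FILE: search a raw block dump for a fixed string.
#   -a  treat binary data as text
#   -F  take the pattern literally, not as a regular expression
#   -n  show line numbers
#   -C 5  print five lines of context around each match
search_dump() {
    grep -a -F -n -C 5 -- "$1" "$2"
}

# Example: search_dump 'a phrase I remember' tmp.dat
```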

For more details about ext3 file systems and recovering deleted files:

  1. Recovering Deleted Files on an ext3 File System
  2. Data Recovery on Linux and ext3
February 17, 2010 | 3 comments
WorldLingo Multilingual Archive

As you may already know, I'm the Director of IT of WorldLingo, one of the leaders of online translation and localization. In addition to working with great people, I also have the opportunity to work with cutting-edge technology on a daily basis.

One of our newest projects is the Multilingual Archive, a constantly growing repository of translations of some of the world's best freely available information sources. Initially, we have translated approximately 2.8 million English Wikipedia articles into 8 languages: Spanish, French, Portuguese, German, Dutch, Russian, Korean, and Japanese. Translation into Italian, Swedish, Arabic, Simplified Chinese, Traditional Chinese, and Greek will be completed in the near future, and additional information sources will be added on an ongoing basis.

To create the Multilingual Archive, we leveraged WorldLingo's existing translation technology infrastructure and implemented Hadoop/HBase for storing the articles. Check out Lars George's blog for more information about our use of Hadoop/HBase.

December 10, 2009 | 0 comments
SysAdvent – ATA over Ethernet (AoE)

In my last post, I mentioned I would be contributing to the SysAdvent calendar. Google Analytics seems to indicate my posts on ATA over Ethernet (AoE) have been pretty popular lately, so I decided to write about AoE for my SysAdvent article. Check it out: Storage: ATA over Ethernet.

December 6, 2009 | 0 comments