Living Linux 05/05/2000

This week's column describes two venerable UNIX tools for checking your writing that have been rewritten for Linux: style and diction.

Old-timers probably remember these names -- the originals came with AT&T UNIX as part of the much-loved ``Writer's Workbench'' (WWB) suite of tools back in the late 1970s and early 1980s. (There was also a group that planned a ``Reader's Workbench''; we can only guess at what that might have been, but today we do have Project Gutenbook, a new etext reader.)

AT&T unbundled the Writer's Workbench from their UNIX System 7, and as the many flavors of UNIX blossomed over the years, these tools fell by the wayside -- eventually becoming the stuff of UNIX lore.

In 1997, Michael Haardt wrote new Linux versions of these tools from scratch. They support both the English and German languages, and they're now part of the GNU Project; if you don't already have them installed on your system, you can get them from here.

Let's take a look at some of the things that these tools can do.

Checking text for misused phrases

Use the diction tool to check for wordy, trite, clichéd or misused phrases in a text. It checks for the kind of expressions William Strunk warned us about in The Elements of Style.

According to Andrew Walker's excellent book The UNIX Environment, the diction tool that came with the old Writer's Workbench just found the phrases, and a separate command called suggest would output suggestions. In the GNU version that works for Linux, both functions have been combined in the single diction command.

In the output of GNU diction, each questionable word or phrase is enclosed in brackets [like this]. If diction has a suggested replacement, it appears inside the brackets after a right arrow, [like this -> suggested replacement].

When checking more than just a screenful of text, you'll want to pipe the output to a tool such as less, so that you can peruse it on the screen. For example, to check a file called banquet-speech.txt for clichés or other misused phrases, you'd type:

$ diction banquet-speech.txt | less RET

You could also redirect the output to a file if you wanted to look at it later:

$ diction banquet-speech.txt > banquet-speech.diction RET

Here, the output is written to a text file called banquet-speech.diction.
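If you have several files to check, the same redirection can be wrapped in a small shell function. The following is only a sketch: check_all is a made-up helper name (it is not part of diction), and it assumes you want each report saved next to its source file with a .diction extension, as in the example above.

```shell
# check_all FILE... : run diction on each file and save one report per file.
# Hypothetical helper for illustration; not part of the diction package.
check_all() {
    for f in "$@"; do
        # Skip anything we cannot read rather than writing an empty report.
        if [ ! -r "$f" ]; then
            echo "check_all: cannot read $f" >&2
            continue
        fi
        # banquet-speech.txt -> banquet-speech.diction
        report="${f%.txt}.diction"
        diction "$f" > "$report"
    done
}
```

After defining the function in your shell, you would call it with the files to check, for example check_all banquet-speech.txt toast.txt (the second filename is just an illustration).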

Checking more than files

If you don't specify a filename, diction reads text from the standard input until you type Control-D on a line by itself -- this is especially useful when you want to check a single sentence:

$ diction RET

So finally, tonight, let us ask the question 
we wish to state. RET
(stdin):1: [So -> (do not use as intensifier)] finally, tonight, 
let us [ask the question -> ask] [we wish to state -> (cliche, avoid)].
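Because diction reads the standard input here, you can also hand it a sentence through a pipe instead of typing it interactively. A minimal sketch, assuming a helper named check_sentence (a name invented for this example):

```shell
# check_sentence WORDS... : check one sentence without an interactive session.
# Hypothetical helper for illustration; not part of the diction package.
check_sentence() {
    # Join all arguments into a single line and feed it to diction on stdin.
    printf '%s\n' "$*" | diction
}
```

You would then type check_sentence followed by the quoted sentence, and diction reports on it just as it does for text typed at the keyboard.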

To check the text of a Web page, use the text-only Web browser lynx with the -dump and -nolist options to output the plain text of a given URL, and pipe it to diction. (If you expect there to be a lot of output, add another pipe at the end to the less tool so you can peruse it.)

For example, to check the text on the Web page for wordy and misused phrases, you'd type:

$ lynx -dump -nolist | diction | less RET
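If you check Web pages often, the pipeline can be wrapped in a function that takes the URL as its argument. This is a sketch only: check_page is a hypothetical name, and the single URL parameter is an assumption made for the example.

```shell
# check_page URL : dump a Web page as plain text and run diction on it.
# Hypothetical helper for illustration; it only combines the lynx options
# described above with diction.
check_page() {
    if [ "$#" -ne 1 ]; then
        echo "usage: check_page URL" >&2
        return 2
    fi
    # -dump writes the rendered page to standard output;
    # -nolist leaves out the list of links at the end.
    lynx -dump -nolist "$1" | diction
}
```

As before, pipe the result to less if you expect a lot of output.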

Checking text for doubled words

One of the things that diction looks for is doubled words -- words that appear twice in a row. It encloses the second member of the doubled pair in brackets, followed by a right arrow and the text "Double word", like this [this -> Double word.].

If you only want to check a text file for doubled words, and not any of the other things diction checks for, use grep to find only those lines in diction's output that contain the text "Double word", if any. For example, to output all lines containing doubled words in the file banquet-speech.txt, you'd type:

$ diction banquet-speech.txt | grep 'Double word' RET
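To get a feel for what a doubled-word check involves, here is a rough stand-in written with awk. It is not how diction works internally; it is a simplified sketch, and doubled_words is a name made up for the example. It compares each word with the one before it on the same line, ignoring case, so it misses pairs split across a line break and treats attached punctuation as part of the word.

```shell
# doubled_words FILE... : print "file:line: word" for each word that
# repeats the word immediately before it on the same line.
# Simplified illustration only; not diction's own algorithm.
doubled_words() {
    awk '{
        for (i = 2; i <= NF; i++)
            # Compare case-insensitively so "The the" is caught too.
            if (tolower($i) == tolower($(i - 1)))
                printf "%s:%d: %s\n", FILENAME, FNR, $i
    }' "$@"
}
```

Each line of output names the file, the line number, and the second word of the doubled pair.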


