Windows DevCenter    
 Published on Windows DevCenter (http://www.windowsdevcenter.com/)
 See this if you're having trouble printing code examples


O'Reilly Book Excerpts: Word Hacks

Hacking Word

by Andrew Savikas

Editor's note: How can you make the world's most widely used word processing program work even better for you? Andrew Savikas, author of Word Hacks, gets you five steps closer to taming your text with his generous collection of hacks. Rid yourself of hidden character styles, tap VBScript for powerful string searching, use Perl from within a Word macro, and more.

moderate

Hack #47: Use an Outline to Build an Org Chart

Wordís outlining feature excels at managing hierarchical information. This hack shows you how to use an outline to create an organizational chart on the fly.

Related Reading

Word Hacks
Tips & Tools for Taming Your Text
By Andrew Savikas

Maintaining a company’s organizational charts can be a time-consuming task. Word does include a Diagram feature that makes it easy to create an organizational chart (select Insert -> Diagram), but editing charts can be a real challenge, particularly after any substantial reorganization.

Unfortunately, the macro recorder ignores diagrams, which removes a valuable tool for deciphering unfamiliar Word objects. But you can still automate diagrams—you just need to decipher the Diagram object on your own. The code in this hack should give you a good starting point.

Rather than manipulating an existing diagram, you can store the organizational information in an outline and create the diagram from scratch after any changes. With your organizational information stored in an outline, like the one shown in Figure 5-6, you can quickly add, remove, or rearrange entries.

Once you complete your outline, the code will use it to produce a diagram like the one shown in Figure 5-7.

The next time you need to change the chart, just edit the outline and make a new one.

The Code

Place this macro in the template of your choice [Hack #50] and either run it from the Tools-> Macro -> Macros dialog or put a button for it on a menu or toolbar [Hack #1].

The text for the top-level entry (or root node) is culled from the CompanyName property in the outline document. To enter a company name, select File -> Properties and go to the Summary tab. If you don’t fill in the property, Word inserts some dummy text.

Though your chart could go 10 levels deep (9 for each of Word’s outline levels, plus one more for the body-text level), this code goes only 4 levels deep. Adding more levels would require substantially more code, most of which would be nearly identical to that for the first four levels. You’ll need to add your own additional code to handle an outline more than four levels deep.

Figure 5-6. Edit your organization information in Outline view
Figure 5-6. Edit your organization information in Outline view

Sub MakeOrgChartFromOutline( )
  Dim doc As Document
  Dim para As Paragraph
  Dim sCompanyName As String
  Dim sParaText As String
  Dim nodeRoot As DiagramNode
  Dim shShape As Shape
  Dim node1 As DiagramNode
  Dim node2 As DiagramNode
  Dim node3 As DiagramNode
  Dim node4 As DiagramNode
  
  Set doc = ActiveDocument

  sCompanyName = doc.BuiltInDocumentProperties("Company")
  If Len(sCompanyName) <= 1 Then
      sCompanyName = "Type Company Name Here"

  End If
  
  Set shShape = _
      Documents.Add.Shapes.AddDiagram(msoDiagramOrgChart, 0, 0, 500, 500)
  Set nodeRoot = shShape.DiagramNode.Children.AddNode
  nodeRoot.TextShape.TextFrame.TextRange.text = sCompanyName
 
  For Each para In doc.Paragraphs
      Select Case para.OutlineLevel

Figure 5-7. An organizational chart created from an outline
Figure 5-7. An organizational chart created from an outline

Case wdOutlineLevel1
  sParaText = Left(para.Range.text, _
      para.Range.Characters.Count - 1)
  Set node1 = nodeRoot.Children.Ad dNode
  node1.TextShape.TextFrame.TextRange.text = sParaText
  Set node2 = Nothing
  Set node3 = Nothing
  Set node4 = Nothing
Case wdOutlineLevel2
  sParaText = Left(para.Range.text, _
      para.Range.Characters.Count - 1)
  Set node2 = node1.Children.AddNode
  node2.TextShape.TextFrame.TextRange.text = sParaText
  Set node3 = Nothing
  Set node4 = Nothing
Case wdOutlineLevel3
  sParaText = Left(para.Range.text, _
      para.Range.Characters.Count - 1)
  Set node3 = node2.Children.AddNode
  node3.TextShape.TextFrame.TextRange.text = sParaText
Set node4 = Nothing

Case wdOutlineLevel4
  sParaText = Left(para.Range.text, _
      para.Range.Characters.Count - 1)
  Set node4 = node3.Children.AddNode
  node4.TextShape.TextFrame.TextRange.text = sParaText
End Select
Next para

End Sub

Rather than attempting to position elements in the diagram, the macro just relies on Word’s default automatic behavior to align and position the entries.

moderate

Hack #55: Clean Out Linked “Char” Styles

Documents created or edited in Word 2002 or 2003 have a nasty habit of sprouting hidden character styles that are hard to see, let alone eliminate. This hack shows you how to lead a “Char-free” life with Word.

In most programs that offer style-based formatting (such as InDesign, FrameMaker, or QuarkXPress), if you try to apply a paragraph style to just part of a paragraph, the entire paragraph is modified to reflect the new style. But in Word, things aren’t so simple. In older versions of Word, when you apply a paragraph style to only part of a paragraph, the paragraph retains its original style, but the selected text takes on the character formatting of the paragraph style you tried to apply. That introduces a lot of direct formatting into documents, which can make them difficult to modify and maintain.

WARNING: Apply paragraph styles only with all or none—never some—of the paragraph selected to avoid this “feature.”

Starting in Word 2002, rather than just applying the character formatting of the paragraph style, Word creates a new, hidden character style based on the paragraph style and tacks the word “Char” on the end, such as “Heading 1 Char.”

Ostensibly, this is an improvement of the behavior of earlier versions of Word. Rather than changing Word to behave like other programs (and thus encouraging the conscientious use of character styles), Microsoft changed Word to behave in a strange, new, and—surprise!—poorly documented way. Because the Char styles are linked to the paragraph styles on which they’re based, if you later change the paragraph style, the character style also changes.

Spotting Char Styles

To see these bizarre styles in action (assuming you are using a newer version of Word), open a new document and drop in a few paragraphs of placeholder text [Hack #14].

Now select a word or two within one of the paragraphs and apply the “Heading 1” style to the selection. Check your styles, either using the Task Pane or the Styles pull-down menu on the Formatting toolbar. See anything amiss? Well, that’s a trick question. There are two things amiss, but you won’t see them right away on the Styles pull-down menu or the Task Pane:

Now, hold down the Shift key and click on the Styles pull-down menu on the Formatting toolbar. Scroll down to the “H” section, as shown in Figure 6-8, and you’ll see the new style. You can also view the linked style from the Reveal Formatting Task Pane.

Figure 6-8. The elusive Char style
Figure 6-8. The elusive Char style

You can work in a document for weeks and never notice these styles. However, if you send your document to someone using an older version of Word that doesn’t support these “linked styles,” they’ll appear right away in the Styles pull-down menu and the Styles and Formatting dialog.

Once you cut and paste the styles around your document, among different documents, and back and forth across different versions of Word, something even stranger happens. They evolve. Mutate. Fester. Until your document is rife with monstrosities like the following:

  Body Text Char Char, Body Text Char1 Char Char, Body Text Char Char

Sometimes the “Char” extension even gets added to some of your paragraph styles. Does the fun ever stop?

Unfortunately, you can’t prevent Word from creating these styles. And even if you still use Word 2000, these styles will show up in documents worked on by Word 2002 and 2003 users.

The situation gets even stranger when you try to delete these styles. In Word 2000, you at least stand a chance. Since 2000 doesn’t have linked styles, you can rename or delete them as needed, just like any other style. But in Word 2002 and 2003—the source of these bizarre styles—you aren’t so lucky. When you try to delete “Heading 1 Char,” the silence is deafening. Nothing. Not even a dialog admonishing your efforts.

So maybe you should try a little VBA? Running the following code in the Immediate window [Hack #2] would seem to be a solution:

  ActiveDocument.Styles("Heading 1 Char").Delete

When you run the code, the dialog shown in Figure 6-9 greets you. But take a look at your document again—the style’s gone.

Figure 6-9. Deleting the linked style generates a runtime error
Figure 6-9. Deleting the linked style generates a runtime error

What’s going on here? For a clue, try the following code in the Immediate window:

  ActiveDocument.Styles("Heading 1").Delete

You get your old friend, Runtime Error 4198.

Remember, the styles are linked. If you change one, you change both—and that goes for deletion as well. However, you can’t delete a built-in style, which is why you’re greeted with Runtime Error 4198 (and why nothing happened when you tried to delete it from within Word).

So what if you repeat this with a Char style not based on one of Word’s built-ins? Then there’s nothing stopping Word from deleting both styles and removing all of the formatting from any text that used them. Yikes! Fortunately, there’s a fix. With the code in this hack, you can quickly clear out any linked Char styles in your document, without losing any other styles.

The Code

This code will delete any character style with the word “Char” in it and remove the word “Char” from the name of any paragraph style. Since deleting a linked style also deletes any style it’s linked to, the link must first be broken (which is accomplished by linking it to, ironically, the Normal style).

TIP: This macro deletes character styles from your document. You will lose any formatting applied to text using the deleted character styles. If you want to keep the character formatting, see the upcoming section “Hacking the Hack.”

Because Word 2000 (and Word 97) doesn’t have a LinkStyle property for styles, if you’re using that version of Word, this code will not run unless you comment out one line, as noted within the code. Note that there are two procedures here: the main DeleteCharCharStyles macro and a supporting function named SwapStyles. Both are needed for this hack to work.

Sub DeleteCharCharStyles( )
Dim sty As Style
Dim i As Integer
Dim doc As Document
Dim sStyleName As String
Dim sStyleReName As String
Dim bCharCharFound As Boolean

Set doc = ActiveDocument
Do
    bCharCharFound = False
    For i = doc.Styles.Count To 1 Step -1
        Set sty = doc.Styles(i)
        sStyleName = sty.NameLocal
        If sStyleName Like "* Char*" Then
            bCharCharFound = True
            If sty.Type = wdStyleTypeCharacter Then
            On Error Resume Next
           '#############################################
           ' COMMENT OUT THE NEXT LINE IN WORD 2000 OR 97
           sty.LinkStyle = wdStyleNormal
           sty.Delete
           Err.Clear
        Else
           sStyleReName = Replace(sStyleName, " Char", "")
           On Error Resume Next
           sty.NameLocal = sStyleReName
           If Err.Number = 5173 Then
              Call SwapStyles(sty, doc.Styles(sStyleReName), doc)
              sty.Delete
              Err.Clear
           Else
             On Error GoTo ERR_HANDLER
           End If
         End If
         Exit For
      End If
      Set sty = Nothing
    Next i
  Loop While bCharCharFound = True
  Exit Sub
  ERR_HANDLER:
  MsgBox "An Error has occurred" & vbCr & _
          Err.Number & Chr(58) & Chr(32) & Err.Description, _
          vbExclamation
  End Sub
  
  Function SwapStyles(ByRef styFind As Style, _
                      ByRef styReplace As Style, _
                      ByRef doc As Document)
  With doc.Range.Find
       .ClearFormatting
       .Text = ""

       .Wrap = wdFindContinue
       .MatchCase = False
       .MatchWholeWord = False
       .MatchWildcards = False
       .MatchSoundsLike = False
       .MatchAllWordForms = False
       .Style = styFind
       .Replacement.ClearFormatting
       .Replacement.Style = styReplace
       .Replacement.Text = "^&"
       .Execute Replace:=wdReplaceAll
     End With
     End Function

The second procedure, SwapStyles, is there because of a scenario that often occurs when documents that have these linked styles go back and forth between different versions of Word. Often, what started as one paragraph style—for example, Sidebar—may have mutated into two paragraph styles, such as:

In that situation, if the code just tries to remove the “Char” strings from the second Sidebar style, an error will be raised because the name Sidebar is already taken. With the SwapStyles procedure, all the text formatted with the second style is modified and formatted with the first, and then the second style is simply deleted.

Running the Hack

After you put both procedures in the template of your choice [Hack #50], and either run it from the Tools -> Macro -> Macros dialog or put a button for it on a menu or toolbar [Hack #1].

This code affects only styles with the string “ Char”, including the leading space. If you plan to use this macro to clean your documents, you should avoid deliberately using the string “ Char” in any of your styles. However, feel free to begin a style name with Char, as in “CharacterStyleNumberOne.”

Hacking the Hack

If you want to delete the linked Char styles but retain the character formatting on the text, use this version of the code. It includes an additional procedure, StripStyleKeepFormatting, that removes the character style applied to the text but retains the formatting defined by that style. Again, if you use an earlier version of Word, you’ll need to comment out the line that unlinks the character style, as noted in the code.


  Sub DeleteCharCharStylesKeepFormatting( )
  Dim sty As Style
  Dim i As Integer
  Dim doc As Document
  Dim sStyleName As String
  Dim sStyleReName As String
  Dim bCharCharFound As Boolean
  
  Set doc = ActiveDocument
  Do
  
      bCharCharFound = False
      For i = doc.Styles.Count To 1 Step -1
          Set sty = doc.Styles(i)
          sStyleName = sty.NameLocal
          If sStyleName Like "* Char*" Then
 
          bCharCharFound = True
          If sty.Type = wdStyleTypeCharacter Then
          Call StripStyleKeepFormatting(sty, doc)
          On Error Resume Next
          '#############################################
          ' COMMENT OUT THE NEXT LINE IN WORD 2000 OR 97
          sty.LinkStyle = wdStyleNormal
          sty.Delete
          Err.Clear
      Else
          sStyleReName = Replace(sStyleName, " Char", "")
          On Error Resume Next
          sty.NameLocal = sStyleReName
          If Err.Number = 5173 Then
             Call SwapStyles(sty, doc.Styles(sStyleReName), doc)
             sty.Delete
          Err.Clear
        Else
          On Error GoTo ERR_HANDLER
        End If
      End If
      Exit For
    End If
    Set sty = Nothing
   Next i
 Loop While bCharCharFound = True
 Exit Sub
ERR_HANDLER:
MsgBox "An Error has occurred" & vbCr & _
        Err.Number & Chr(58) & Chr(32) & Err.Description, _
        vbExclamation
End Sub

Function SwapStyles(ByRef styFind As Style, _
                    ByRef styReplace As Style, _
                    ByRef doc As Document)
With doc.Range.Find
     .ClearFormatting
     .Text = ""
     .Wrap = wdFindContinue
     .MatchCase = False
     .MatchWholeWord = False
     .MatchWildcards = False
     .MatchSoundsLike = False
     .MatchAllWordForms = False
     .Style = styFind
     .Replacement.ClearFormatting
     .Replacement.Style = styReplace
     .Replacement.Text = "^&"
     .Execute Replace:=wdReplaceAll
  End With
End Function


Function StripStyleKeepFormatting(ByRef sty As Style, _
                                  ByRef doc As Document)
Dim rngToSearch As Range
Dim rngResult As Range
Dim f As Font

Set rngToSearch = doc.Range
Set rngResult = rngToSearch.Duplicate

Do
    With rngResult.Find
    .ClearFormatting
    .Style = sty
    .Text = ""
    .Forward = True
    .Wrap = wdFindStop
    .Execute
   End With

   If Not rngResult.Find.Found Then Exit Do
  
   Set f = rngResult.Font.Duplicate
   With rngResult
        .Font.Reset
        .Font = f
        .MoveStart wdWord
        .End = rngToSearch.End
   End With
   Set f = Nothing
Loop Until Not rngResult.Find.Found
End Function

 
TIP: For an alternative method, check out “Remove Linked “Char” Styles with XSLT” [Hack #98].

expert

Hack #65: Show Progress from VBA

When macros take a long time to run, people get nervous. Did it crash? How much longer will it take? Do I have time to run to the bathroom? Relax. This hack shows you two ways to create a macro progress bar using VBA.

Before adding a full-fledged progress bar to your macro, consider whether something more subtle might be effective enough to keep the macro user informed. Within a macro, you can use the StatusBar property to display text in Word’s status bar—the little area at the bottom of the window that displays the current page, line count, and so forth. The following macro displays a personalized message in the status bar. Put the macro in the template of your choice [Hack #50] and run it from Tools -> Macro -> Macros:

  Sub SayHello( )
  StatusBar = "Hello, " & Application.UserName & _
      ". My that's a nice shirt you're wearing."
  End Sub

You can take a tip from Word, which often displays messages in the status bar (e.g., when you save a document), and use the status bar as a means of communication from within a macro.

For example, the following macro uses a For Each loop [Hack #66] to highlight any paragraph set to outline Level 2 that contains more than 10 words. As it completes this task, it prints the text of the paragraph to the status bar:


    Sub HighlightLongHeadings( )
    Dim para As Paragraph
    For Each para In ActiveDocument.Paragraphs
        StatusBar = "Checking: " & para.Range.Text
        If para.OutlineLevel = wdOutlineLevel2 Then
          If para.Range.Words.Count > 10 Then

para.Range.HighlightColorIndex = wdBrightGreen End If End If Next para StatusBar = "" End Sub

This solution usually provides enough visual feedback to keep users assured that the macro’s still hard at work and that Word hasn’t crashed.

If you want a more specific, or just less subtle, feedback method, you can create a custom progress bar that appears in its own dialog box while your macro runs. The following sections describe two ways to create your own progress bar using VBA. Both adapt the HighlightLongHeadings macro shown above.

Continuous Progress

The first technique combines the code for the progress bar with the code for the macro.

To keep the example simple, you should put this code in your Normal template. Select Tools -> Macro -> VisualBasic Editor, choose Normal in the Project Explorer (near the top left of the window), and then select Insert -> UserForm. Next, choose View -> Toolbox to display the Toolbox (it may already be showing). Select the Label control (the one with the “A” on it). Now move your cursor to the UserForm and drag the cursor to create a new label, like the one shown in Figure 7-9. Try to position the top-left corner of the label near the top left of the UserForm.

Next, select View -> Code and insert the following code:

      Private Sub UserForm_Activate( )
      Dim lParaCount As Long
	    Dim i As Integer
	    Dim para As Paragraph
	    Dim lMaxProgressBarWidth As Long
	    Dim sIncrement As Single

	    ' Resize the UserForm
	    Me.Width = 240
	    Me.Height = 120

	    ' Resize the label
	    Me.Label1.Height = 50
	    Me.Label1.Caption = ""
	    Me.Label1.Width = 0
	    Me.Label1.BackColor = wdColorBlue

Figure 7-9. Creating a simple progress bar with a UserForm
Figure 7-9. Creating a simple progress bar with a UserForm

lMaxProgressBarWidth = 200
lParaCount = ActiveDocument.Paragraphs.Count
sIncrement = lMaxProgressBarWidth / lParaCount
i = 1

For Each para In ActiveDocument.Paragraphs
  Me.Label1.Width = Format(Me.Label1.Width + sIncrement, "#.##")
  Me.Caption = "Checking " & CStr(i) & " of " & CStr(lParaCount)
  Me.Repaint
  If para.OutlineLevel = wdOutlineLevel2 Then
    If para.Range.Words.Count > 10 Then
      para.Range.HighlightColorIndex = wdBrightGreen
    End If
  End If
  i = i + 1
  Next para

  Unload Me
  
  End Sub

From the Project Explorer, select one of the code modules in Normal, as shown in Figure 7-10. If you don’t have any code modules in Normal, select Insert -> Module to create one.

Figure 7-10. Select one of the code modules in your Normal template
Figure 7-10. Select one of the code modules in your Normal template

In the code module you’ve selected, insert the following code:

Sub HighlightLongHeadings( )
  UserForm1.Show
End Sub

Now select File -> Close and Return to Microsoft Word. To run the macro, select Tools -> Macro -> Macros and choose HighlightLongHeadings. When you run the macro, you’ll see a progress bar like the one shown in Figure 7-11.

Figure 7-11. A simple progress bar in action
Figure 7-11. A simple progress bar in action

TIP: If the document is very short, you probably won’t see the progress bar—it’ll finish filling in too fast. Test this out on a long document to really see it in action.

One of the lines in the UserForm code deserves a closer look:

Format(Me.Label1.Width + sIncrement, "#.##")

The variable sIncrement is the final width of the progress bar divided by the total number of paragraphs in the document. As the macro visits each paragraph in the document, the width of the bar increases by the value of sIncrement. Since the maximum width of the bar in this example is 200 pixels (as defined in the variable lMaxProgressBarWidth), if there are 10 paragraphs in the document, the width of the bar will increase by 20 pixels as each paragraph is examined.

If there are hundreds or thousands of paragraphs in a document, the value of sIncrement can become quite small—smaller than the measurements UserForms are designed to handle. When that happens, VBA will round the number according to its own internal rounding rules, which can cause the width of the progress bar to eventually exceed the width of the UserForm. However, if you use the Format function, the increment amount will be rounded more precisely, keeping it confined to the boundaries of the UserForm.

Incremental Progress

One drawback to the technique described in the previous section is that the code for the progress bar is mixed with the code used to modify the document. To create another macro that displays a similar progress bar, you’d need to create another, similar UserForm. But by separating the code for the progress bar from the code that works on the document, you can reuse your progress bar in a variety of situations.

This section shows you how to create a dialog that reports the progress of a macro as a percentage, in increments of 10%, as shown in Figure 7-12. You can use this same progress bar from within any macro whose progress can be translated into a percentage.

Figure 7-12. A progress bar that displays percentage increments
Figure 7-12. A progress bar that displays percentage increments

To keep the example simple, you should put this code in your Normal template. Select Tools -> Macro -> VisualBasic Editor, choose Normal in the Project Explorer (near the top left of the window), and then select Insert -> UserForm. Next, choose View -> Toolbox to display the Toolbox (it may already be showing).

On the Toolbox, select the Frame control (the box with “xyz” at the top), and then draw a single frame on your blank UserForm. With the frame selected, go to the Properties window. Change the frame’s height to 30 and its width to 18 and set its Visible property to False. Then delete the frame’s caption and change its background color to blue, as shown in Figure 7-13.

Figure 7-13. Change the frame’s caption and background color from the Properties
Figure 7-13. Change the frame’s caption and background color from the Properties window

In the listbox at the top of the Properties window, select UserForm1 instead of Frame1, and then change the ShowModal property to False. While in the Properties window, change the name of the UserForm to IncrementalProgress.

Now go back to the UserForm itself and select the frame. Choose Edit -> Copy, and then paste the frame nine times. Align the 10 frames as best you can in a single row. While holding down the Ctrl key, select all of the frames. Then select Format -> Align and align the centers and tops of all the frames.

Next, select the Label control from the Toolbox (the one with the “A” on it) and draw a label underneath the frames, as shown in Figure 7-14. From the Properties window, delete the label’s caption.

With this method, you display the dialog when your macro starts, then periodically increment its progress as a percentage. It involves more code, but it’s more versatile than the first method.

Now select View -> Code and insert the following:

  Private Sub UserForm_Initialize( )
  Me.Caption = "0% Complete"
  End Sub

  Public Function Increment(sPercentComplete As Single, _
      sDescription As String)
  On Error Resume Next
  Me.Label1.Caption = sDescription
  Me.Repaint

Figure 7-14. Creating an incremental progress bar
Figure 7-14. Creating an incremental progress bar

Dim iPercentIncrement As Integer
iPercentIncrement = Format(sPercentComplete, "#")

Select Case iPercentIncrement
  Case 10
    Me.Frame1.visible = True
    Me.Caption = "10% Complete"
    Me.Repaint
  Case 20
    Me.Frame2.visible = True
    Me.Caption = "20% Complete"
    Me.Repaint
  Case 30
    Me.Frame3.visible = True
    Me.Caption = "30% Complete"
    Me.Repaint
  Case 40
    Me.Frame4.visible = True
    Me.Caption = "40% Complete"
    Me.Repaint
  Case 50
    Me.Frame5.visible = True
    Me.Caption = "50% Complete"
    Me.Repaint
  Case 60
    Me.Frame6.visible = True
    Me.Caption = "60% Complete"    
    Me.Repaint  
  Case 70    
    Me.Frame7.visible = True    
    Me.Caption = "70% Complete"    
    Me.Repaint  
  Case 80    
    Me.Frame8.visible = True    
    Me.Caption = "80% Complete"    
    Me.Repaint  
  Case 90    
    Me.Frame9.visible = True    
    Me.Caption = "90% Complete"    
    Me.Repaint  
  Case 100    
    Me.Frame10.visible = True    
    Me.Caption = "100% Complete"    
    Me.Repaint  
End Select
End Function

You can now use the progress bar from within your macros. All you need to do is provide the percentage and any text you’d like displayed underneath the progress bars.

The following is the HighlightLongHeadings macro, revised to use this progress bar. The lines shown in bold are the ones that interact with the progress bar.

  Sub HighlightLongHeadings( )
  Dim lParaCount As Long
  Dim sPercentage As Single
  Dim i As Integer
  Dim para As Paragraph
  Dim sStatus As String

  IncrementalProgress.Show

  lParaCount = ActiveDocument.Paragraphs.Count
  i = 1

  For Each para In ActiveDocument.Paragraphs

    sPercentage = (i / lParaCount) * 100
    sStatus = "Checking " & i & " of " & lParaCount & " paragraphs"
    IncrementalProgress.Increment sPercentage, sStatus

    If para.OutlineLevel = wdOutlineLevel2 Then
      If para.Range.Words.Count > 10 Then
        para.Range.HighlightColorIndex = wdBrightGreen
      End If
    End If

  i = i + 1
  Next para  
  Unload IncrementalProgress
  End Sub

Running this macro will display the progress bar shown in Figure 7-12.

Your macros will take longer to run, because the progress bar adds overhead. You should test versions of your macros with and without the progress bar to determine whether you find the performance hit acceptable.

WARNING: The above code assumes you will hit each percentage stop along the way. If you expect to skip increments, modify the code to make sure you “turn on” all the increment frames lower than the current one. For example:
    ...
    Case 40
    With Me
    .Frame1.Visible = True
    .Frame2.Visible = True
    .Frame3.Visible = True
    .Frame4.Visible = True
    .Caption = "40% Complete"
    .Repaint
    End With
    ...

expert

Hack #82: Perform Power Text Searches with Regular Expressions

When wildcards just arenít enough, tap VBScript for powerful string searching in Word.

Although Word’s wildcard searching is much better than most users realize, if you’ve previously used a language like Perl, Python, or JavaScript, you might prefer sticking with the special characters you already know for your searches. Besides, sometimes wildcards just aren’t up to the job.

To borrow an example from O’Reilly’s Learning Python, suppose you need to replace any occurrence of “red pepper” or “green pepper” with “bell pepper” if and only if they occur together in a paragraph before the word “salad,” but not if they are followed (with no space) by the string “corn.” That’s definitely way out of Word’s wildcards’ league. (The pattern is \b(red|green)(\s+pepper(?!corn)(?=.*salad)), for those of you too impatient to wait until the full example at the end of this hack.)

Though VBA doesn’t have built-in support for regular expressions, Microsoft does include a RegExp object with VBScript. With a slight change to your settings in the Visual Basic Editor, you can use the RegExp object in your macros.

First, select Tools -> Macro -> Visual Basic Editor, and then choose Tools -> References. In the next dialog, shown in Figure 9-3, check the “Microsoft VBScript Regular Expressions 5.5” box and click the OK button.

Figure 9-3. Setting a reference to VBScript regular expressions from the Visual Basic
Figure 9-3. Setting a reference to VBScript regular expressions from the Visual Basic Editor

Now you can include instances of the RegExp object in your macros. The following section describes the RegExp object.

RegExp’s Properties and Methods

The RegExp object has four properties, described in the following list:

Pattern
The pattern string to search for.
Global
Whether search is for all occurrences that match Pattern, or just the first. This is a Boolean value, and the default is False.
IgnoreCase
Whether search is case-sensitive. This is a Boolean value, and the default is False.
MultiLine
Whether Pattern is matched across line breaks. This is a Boolean value, and the default is False.

The RegExp object has three methods, described in the following list:

Execute
Returns a Matches collection containing the matched substrings and information about those substrings.
Replace

Replaces all the substrings in a searched string that match a pattern with a replacement string. The syntax for this method is:

RegExpobject.Replace("string to search", "replacement pattern")
Test
Whether a search has successfully matched a pattern. Returns a Boolean value. Since this method always returns True if there were one or more successful matches, there’s no need to set the Global property when using this method.

The Matches collection returned by the Execute method contains one or more Match objects, which have three properties, shown in the following list:

FirstIndex
The position of the Match’s first character within the search string
Length
The number of characters in the Match
Value
The matched string

Using the RegExp Object in a Macro

The following macro interactively tests search patterns against the selected text.

Place this macro in the template of your choice [Hack #50] and either run it from the Tools-> Macro -> Macros dialog or put a button for it on a menu or toolbar [Hack #1].

    Sub RegExpTest( )
	  Dim re As RegExp
	  Dim strToSearch As String
	  Dim strPattern As String
	  Dim strResults As String
	  Dim oMatches As MatchCollection
	  Dim oMatch As Match
	  
    strToSearch = Selection.Text

	  Set re = New RegExp
	  re.Global = True
	  re.IgnoreCase = True
	  
    Do While (1)
	      strPattern = InputBox("Enter search pattern string:", _
	                            "RegExp Search", "")
	  If Len(strPattern) = 0 Then Exit Do

	  re.Pattern = strPattern

	  Set oMatches = re.Execute(strToSearch)
	  If oMatches.Count <> 0 Then
  	 strResults = Chr(34) & strPattern & Chr(34) & _
	                " matched " & oMatches.Count & " times:" _
	                & vbCr & vbCr
	     For Each oMatch In oMatches
	         strResults = strResults & _
	                      oMatch.Value & _
                        ": at position " & _
                        oMatch.FirstIndex & vbCr
      Next oMatch
	  Else
	      strResults = Chr(34) & strPattern & Chr(34) & _
	                 " didn't match anything. Try again."

	  End If
	  MsgBox strResults
 Loozp

 End Sub

When you run this macro, you’ll be prompted with the dialog shown in Figure 9-4.

The dialog shown in Figure 9-5 displays the search results.

The RegExp object supports the same metacharacters you might have seen in Perl:

   \ | ( ) [ { ^ $ * + ? .

You also get all the classic Perl character-class shortcuts:

   \d \D \s \S \w \W

Figure 9-4. Enter your search pattern here, including any special characters
Figure 9-4. Enter your search pattern here, including any special characters

Figure 9-5. Fine-tune your search patterns interactively
Figure 9-5. Fine-tune your search patterns interactively

For a full listing of special characters for using the RegExp object, see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/script56/html/vspropattern.asp.

Performing Replacements

When using the Replace method, you can group and reuse parts of the matched pattern. Known as backreferencing, this is a powerful technique. The following code snippet demonstrates how to change the format of some dates in a string:

  re.Replace("(September) (\d\d?), (\d{4})", "$2 $1, $3")

This code will change a date like “September 12, 1978” into “12 September, 1978.” Modifying the code to replace September with a different month won’t require making any change to the replacement string, thanks to backreferencing.

Bringing all of this together, the following macro shows you how to use the “bell pepper” pattern discussed at the beginning of this hack to get the results shown in Figure 9-6.

Figure 9-6. Performing complex replacements with regular expressions
Figure 9-6. Performing complex replacements with regular expressions

Place this macro in the template of your choice [Hack #50] and either run it from the Tools -> Macro -> Macros dialog or put a button for it on a menu or toolbar [Hack #1]:

    Sub FixPeppers( )
    Dim re As RegExp
    Dim para As Paragraph
    Dim rng As Range
    Set re = New RegExp
    re.Pattern = "\b(red|green)(\s+pepper(?!corn)(?=.*salad))"
    re.IgnoreCase = True
    re.Global = True
    For Each para In ActiveDocument.Paragraphs
      Set rng = para.Range
      rng.MoveEnd unit:=wdCharacter, Count:=-1
      rng.Text = re.Replace(rng.Text, "bell$2")
    Next para
    End Sub
TIP: For more on regular expressions, check out “Hack Word from Python” [Hack #85], “Hack Word from Perl” [Hack #86], and Mastering Regular Expressions (O’Reilly).

expert

Hack #89: Use Perl from Word

This hack shows you two very different ways of getting at Perl from within a Word macro.

Perl’s hard to beat for heavy-duty text processing, and if you’ve already got some Perl scripts lying around for performing certain tasks, you may want to use those from a Word macro rather than starting from scratch in VBA—a decidedly lighter-weight contender when it comes to text processing.

TIP: This hack assumes that you have Perl installed on your system and that you can run Perl scripts from the DOS command line. To download a free version of Perl for Windows, go to the ActiveState web site at http://www.activestate.com.

This hack demonstrates two ways to get at Perl code from a Word macro. The first is the more polished method and requires some special software from ActiveState, which lets you create standalone Windows .dll (dynamic link library) files. These files contain libraries of functions used to perform specific tasks. The second method is about as quick and dirty as they come and does not require any special software.

Creating COM Objects with the Perl Dev Kit

In addition to the free ActivePerl distribution, ActiveState sells software to help Perl developers create Windows applications. The Perl Dev Kit (PDK) lets you create standalone Windows executables, Microsoft installation files, and even .NET applications. You can try it free for 21 days, though any applications you build will expire at the same time as the evaluation period. You can, however, renew your trial period for an additional 21 days. The standard license costs $195. You can get the free trial version from http://www.activestate.com/Products/Perl_Dev_Kit/.

TIP: You need to download and install the PDK to do the rest of the stuff in this section. The method described in the next section, “Call Perl Directly with the VBA Shell Function,” doesn’t require the PDK.

One part of the PDK is PerlCTRL, which builds Windows .dll files from a Perl script. It involves a bit of setup work, but once you create the .dll, using it as a COM object [Hack #84] from VBA is a breeze.

This example is similar to “Use Python from Word” [Hack #88]; it shows you how to build a standalone interface to Perl’s split function. A detailed explanation of PerlCTRL is beyond the scope of this hack, which demonstrates only a simple example.

TIP: Although VBA also includes a split function, with Perl’s you can use a regular expression pattern as the delimiter rather than just a string, making it a much more powerful function. The following is the base Perl code used to build the COM object. It’s a simple wrapper around Perl’s built-in split function:

The following is the base Perl code used to build the COM object. It’s a simple wrapper around Perl’s built-in split function:

  package PerlSample;
  sub Split {
  my $pattern = shift;
  my $string = shift;
  my @list = split(/$pattern/, $string);
  return \@list;
  }

There are three main steps to turning this Perl code into a COM object:

  1. Create a template file using the PDK. The template file contains boilerplate code and examples of the information that PerlCTRL needs to generate the .dll.
  2. Modify the template file.
  3. Generate the .dll from the template.

First, create a new folder on your system and name it C:\PerlCOMSample\. Open a DOS command prompt and navigate to the folder you created. At the DOS prompt, type the following:

> PerlCtrl –t > template.pl

Now open the template.pl file in a text editor, such as Notepad. The file will look like the one shown in Figure 9-13.

Figure 9-13. The template file generated by PerlCTRL
Figure 9-13. The template file generated by PerlCTRL

The template file also includes three unique identifiers that Windows will use to keep track of your .dll. PerlCtrl creates these identifiers when you generate the template file. As the comments in the template file indicate, do not edit those lines. Otherwise, modify the template file as follows:

  package PerlSample;
  
  sub Split {
    my $pattern = shift;
    my $string = shift;
  
    my @list = split(/$pattern/, $string);
    return \@list;
  }
  
  =pod
  
  =begin PerlCtrl
  
  %TypeLib = (
  PackageName => 'PerlSample',
  TypeLibGUID => '{26798342-6F54-4271-9668-B4C0D31EB5C8}', # do NOT edit this
  line
  ControlGUID => '{BD48D84F-C5C9-4E3B-8E36-24E019E4F48D}', # do NOT edit this
  line
  DispInterfaceIID=> '{FF546B71-4492-4E07-BD44-1EDE507CB5A4}', # or this one
  ControlName => 'PerlSample',
  ControlVer => 1, # increment if new object with same ProgID
  # create new GUIDs as well
  ProgID => 'PerlSample.Split',
  DefaultMethod => '',

  Methods => {
    'Split' => {
	      RetType => VT_ARRAY|VT_VARIANT,
		        TotalParams => 2,
				NumOptionalParams => 0,
				ParamList =>[ 'pattern' => VT_BSTR,
				'string' => VT_BSTR ]      
	  },  
	}, 
	# end of 'Methods' 
	Properties => {  }
	, # end of 'Properties'  ); # end of %TypeLib
  
  =end PerlCtrl
  
  =cut

Again, use the three lines generated in your template file, not the ones shown in bold in this example.

Next, save this file as PerlCOMObject.ctrl in the same directory, and then run the following command at a DOS prompt:

> PerlCtrl PerlCOMObject.ctrl

You’ll see the following output:

    Created 'PerlCOMObject.dll'

Now you need to register the new .dll with Windows. At the DOS prompt, enter the following:

> regsvr32 PerlCOMObject.dll

In a few seconds, you’ll see the dialog shown in Figure 9-14, indicating that the .dll file was successfully registered.

Figure 9-14. Windows notifies you when your .dll file is registered successfully
Figure 9-14. Windows notifies you when your .dll file is registered successfully

Now you can call the .dll as a COM object from within a Word macro. Open Word and create the following macro in the template of your choice [Hack #50]:


  Sub TestPerlObject( )
  Dim pl As Object

  Set pl = CreateObject("PerlSample.Split")  
  Dim str As String  
  Dim var( ) As Variant  
  Dim v As Variant
  str = "Hello from Perl!"  
  var = pl.Split(" ", str)  
  For Each v In var    
  MsgBox v  
  Next v  
  End Sub

When you run the macro, you’ll see three dialog boxes displayed in sequence, each showing one of the words in the string “Hello from Perl!”

If you’d like to remove the .dll from your system, enter the following at a DOS command prompt:

> regsvr32 /u PerlCOMObject.dll

Call Perl Directly with the VBA Shell Function

VBA includes a function you can use to launch other Windows applications. At its simplest, the Shell function is roughly equivalent to entering a command at a DOS prompt. For example, enter the following in the Visual Basic Editor’s Immediate window [Hack #2] to launch the Notepad text editor:

   Shell("notepad.exe")

Because Perl is an executable file, you can use Shell to run Perl scripts. For example, if you had a Perl script called C:\foo.pl, you could enter the following in the Immediate window to run the script:

Shell("C:\perl\bin\wperl.exe C:\foo.pl")

After the Shell function executes and the executable program starts, the VBA code continues.

TIP: wperl.exe is the “windowless” Perl. When run, it won’t launch a new DOS window, unlike the regular perl.exe.

You can use the clipboard to pass and return values between VBA and Perl. For example, you can copy selected text to the clipboard and then call a Perl script that reads the clipboard, processes the text, and puts the result back on the clipboard to paste into your document.

However, the VBA macro might try to paste from the clipboard before the Perl script finishes. Thus, you also need a way to have VBA “wait” for the Perl script to finish. One solution is to use a semaphore; that is, have the macro create a temporary folder on your computer, and then have the Perl script delete it once it puts the script result on the clipboard. All you need is a few lines of VBA to check to see if the folder still exists and, if so, instruct the macro to wait a few seconds until the Perl script finishes.

TIP: For a more thorough discussion of semaphores, see http://interglacial.com/~sburke/tpj/as_html/tpj23.html and http://interglacial.com/~sburke/tpj/as_html/tpj24.html.

Since it's likely you'd want to access a variety of Perl scripts from within a Word macro, it's worthwhile to create a reusable function to act as a wrapper around the Shell function call to Perl. The following function takes three arguments: the name of the Perl script to run, the name of the semaphore folder the Perl script should delete when it finishes, and finally the maximum time to wait for the Perl script to run before giving up. The function returns a value of True if the Perl script deleted the semaphore folder, or False if the folder still exists when the time limit is reached. Put this code into the template of your choice [Hack #50]:

  Function RunPerl(sPerlScriptToRun As String, _
                   sSemFolderName As String, _
                   sngWaitMax As Single) As Boolean

  Dim sPerlPath As String
  Dim sFullShellCommand As String
  Dim sSemDir As String
  Dim sSemDirFullName As String
  Dim sngStartTime As Single

  ' Full path of "Windowless" Perl executable
  sPerlPath = "C:\perl\bin\wperl.exe"

  ' Get the full path from the environment variable
  sSemDirFullName = Environ("TEMP") & "\" & sSemFolderName

  ' Put quotes around full script path.
  ' This allows for spaces in script path names, common on Windows systems.
  sFullShellCommand = sPerlPath & " " & _
    Chr(34) & sPerlScriptToRun & Chr(34)

  ' Create semaphore directory, unless it already exists
  If Not LCase(Dir(sSemDirFullName, vbDirectory)) = LCase(sSemFolderName) Then
    MkDir (sSemDirFullName)
  End If

  ' Start the countdown to timeout
  sngStartTime = Timer

  ' Run Perl script
  Shell (sFullShellCommand)

  ' The script will stay in this loop until either
  ' the semaphore directory is deleted, or until the
  ' time limit set by sngMaxWaitTime has passed
  Do While LCase$(Dir$(sSemDirFullName, vbDirectory)) = _
                  sSemFolderName And _
     ((Timer - sngStartTime) < sngWaitMax)
    ' Display a countdown in status bar
    StatusBar = "Waiting " & _
      Int((sngWaitMax - (Timer - sngStartTime))) & _
      " more seconds for Perl ..."

  Loop

  If LCase$(Dir$(sSemDirFullName, vbDirectory)) = sSemFolderName Then
    ' Gave up waiting.
    RmDir (sSemDirFullName)
    StatusBar = "Gave up waiting for Perl"
    RunPerl = False
  Else
    ' Perl script successfully deleted semaphore folder
    StatusBar = ""
    RunPerl = True
  End If

End Function

To see an example of this function in action, and to borrow Tim Meadowcroft's example from Computer Science and Perl Programming (O’Reilly), the following code will demonstrate how to use Perl to standardize phone numbers that are in a variety of formats. (Note: This example uses U.K. phone numbers.)

For starters, the following Perl script called FixPhoneNumbers.pl pulls the text off the Windows clipboard, checks it using a series of regular expressions, then either puts the modified number on the clipboard, or the original, if it couldn't fix it. The standard ActiveState Windows Perl distribution includes the Win32::Clipboard module. Save this script as C:\FixPhoneNumbers.pl:

  use Win32::Clipboard;

  my $TEMP = $ENV{"TMP"};
  my $clipcontents = Win32::Clipboard();
  my $cliptext = $clipcontents->Get();
  my $num = PerlFixPhone($cliptext);

  if ($num != '') {
    $cliptext = $num
  }

$clipcontents->Set($cliptext);
rmdir("$TEMP/vba_sem") || die "cannot rmdir $TEMP\\vba_sem: $!";

sub PerlFixPhone {
    # Tests:
    # 020 xxxx xxxx : fine as is
    # xxx xxxx : assume 020 7xxx xxxx
    # 2xxx : Building 1 extension, assume 020 7457 2xxx
    # 8xxx : Building 2 extension, assume 020 7220 8xxx
    # 0171 xxx xxxx : convert to 020 7xxx xxxx
    # 0181 xxx xxxx : convert to 020 8xxx xxxx
    # Anything else is an error and should be ignored...
    #
    local $_ = shift;
    return $_ if /^020 \d{4} \d{4}$/;
    return $_ if s/^\s*(\d{3})[-\s]+(\d{4})\s*$/020 7$1 $2/;
    return $_ if s/^\s*(\d{3})[-\s]+(\d{4})[-\s]+(\d{4})\s*$/$1 $2 $3/;
    return $_ if s/^\s*(2\d{3})\s*$/020 7457 $1/;
    return $_ if s/^\s*(8\d{3})\s*$/020 7220 $1/;
    return $_ if s/^\s*0171[-\s]+(\d{3})[-\s]+(\d{4})\s*$/020 7$1 $2/;
    return $_ if s/^\s*0181[-\s]+(\d{3})[-\s]+(\d{4})\s*$/020 8$1 $2/;
    return '';
}

The following macro uses the RunPerl function shown above to run the FixPhoneNumbers.pl script. Put this code in the same template as the RunPerl function:

Sub UsePerlToFixSelectedPhoneNumber()
' Pass selected text to a Perl program
' to format/normalize phone numbers
Dim sel As Selection

Set sel = Selection
' Exit if selection doesn't include some text
If sel.Type = wdSelectionIP Then
  MsgBox "Please select some text first"
  Exit Sub
End If

' Copy selected text to clipboard for Perl
sel.Copy

' Run Perl script. If successful,
' paste in changed text from Perl
If (RunPerl(sPerlScriptToRun:="C:\FixPhoneNumbers.pl", _
            sSemFolderName:="vba_sem", _
            sngWaitMax:=5)) = True Then
    sel.Paste
Else
    MsgBox "Gave up waiting for Perl"
End If

End Sub

To see this macro in action, type the following four (U.K.) phone numbers into a Word document:

0171 123 6554
8000
220-8537
220 8537

Select each in turn and run the UsePerlToFixSelectedPhoneNumber macro from Tools -> Macro -> Macros. The macro will convert them to the following format, according to the rules laid out in the FixPhoneNumbers.pl script:

020 7123 6554
020 7220 8000
020 7220 8537
020 7220 8537

If your Perl program takes a long time to run, you may need to adjust the value passed in the sngWaitMax argument to the RunPerl function. This example sets it to five seconds, more than enough time for Perl to finish this little bit of text crunching.

—Sean M. Burke, Andy Bruno, and Andrew Savikas

Andrew Savikas is the VP of Digital Initiatives at O'Reilly Media, and is the Program Chair for the Tools of Change for Publishing Conference.


View catalog information for Word Hacks

Return to Windows DevCenter.

Copyright © 2009 O'Reilly Media, Inc.