Word – VBScript to search through Microsoft Word files for keywords

microsoft-word search vbscript

I'd like to search through a large number of Word files, all held in the same folder, checking each document for a keyword.

When a document containing the keyword is found, the script should write the document's file name to a text file report.

So far I have created the following script, which searches specified Word documents for set terms. The script is obviously hardcoded for specific files; if you could demonstrate how to make it search all ".doc" files in the folder, that would be great. It also does not create the text file report.

Set objWord = CreateObject("Word.Application")
objWord.Visible = True

Set objDoc = objWord.Documents.Open("L:\STSMP00001.docx")
Set objSelection = objWord.Selection

objSelection.Find.Forward = True
objSelection.Find.MatchWildcards = True
objSelection.Find.Text = "presentation"

Do While True
    objSelection.Find.Execute
    If objSelection.Find.Found Then
        strWord = objSelection.Text
        strWord = Replace(strWord, "[[", "")
        strWord = Replace(strWord, "]]", "")
        Wscript.Echo strWord
    Else
        Exit Do
    End If
Loop

Set objWord = CreateObject("Word.Application")
objWord.Visible = True

Set objDoc = objWord.Documents.Open("L:\STSMP00002.docx")
Set objSelection = objWord.Selection

objSelection.Find.Forward = True
objSelection.Find.MatchWildcards = True
objSelection.Find.Text = "presentation"

Do While True
    objSelection.Find.Execute
    If objSelection.Find.Found Then
        strWord = objSelection.Text
        strWord = Replace(strWord, "[[", "")
        strWord = Replace(strWord, "]]", "")
        Wscript.Echo strWord
    Else
        Exit Do
    End If
Loop

Best Answer

The following code snippet, modified from a post by Ansgar Wiechers, allows you to specify a folder and then creates a list based on the extension:

Set fso = CreateObject("Scripting.FileSystemObject")

Set objLog = fso.CreateTextFile("c:\temp\out.log", true)
Set list = CreateObject("ADOR.Recordset")
list.Fields.Append "name", 200, 255
'list.Fields.Append "date", 7
list.Open

For Each f In fso.GetFolder("C:\temp").Files
    If (UCase(Right(f.Path,4))=".DOC" or UCase(Right(f.Path,5))=".DOCX") then
        list.AddNew
        list("name").Value = f.Path
        'list("date").Value = f.DateLastModified
        list.Update
    end if
Next

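' MoveFirst raises an error on an empty recordset, so check for records first.
If Not (list.BOF And list.EOF) Then list.MoveFirst
Do Until list.EOF
  WScript.Echo list("name").Value
  ' WriteLine puts each path on its own line in the log.
  objLog.WriteLine list("name").Value
  list.MoveNext
Loop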

list.Close
objLog.Close

Change the path in GetFolder("C:\temp") to the folder that holds your files, and adjust the Right(f.Path, ...) comparisons if you want to include other extensions. In this example, every matching file is echoed to the screen and written to c:\temp\out.log. It should not be difficult to combine this with your code so that you iterate through a folder and process only the .DOC and .DOCX files.
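As a rough illustration of how the folder loop and the Word search from the question might be combined, here is a minimal sketch. The constant names, the report path C:\temp\report.txt, and the choice to run Word hidden are assumptions made for the example, not part of the original scripts. It also turns MatchWildcards off, on the assumption that a plain keyword match is wanted.

```vbscript
Option Explicit

' Hypothetical values: adjust the folder, keyword, and report path to suit.
Const FOLDER_PATH = "C:\temp"
Const KEYWORD = "presentation"
Const REPORT_PATH = "C:\temp\report.txt"
Const wdFindStop = 0   ' Word constant: stop at the end of the document

Dim fso, objLog, objWord, objDoc, f, ext

Set fso = CreateObject("Scripting.FileSystemObject")
Set objLog = fso.CreateTextFile(REPORT_PATH, True)

' Start a single Word instance and reuse it for every file.
Set objWord = CreateObject("Word.Application")
objWord.Visible = False

For Each f In fso.GetFolder(FOLDER_PATH).Files
    ext = LCase(fso.GetExtensionName(f.Path))
    ' Skip Word's temporary lock files, whose names start with "~$".
    If (ext = "doc" Or ext = "docx") And Left(f.Name, 2) <> "~$" Then
        ' Arguments: FileName, ConfirmConversions = False, ReadOnly = True.
        Set objDoc = objWord.Documents.Open(f.Path, False, True)
        With objDoc.Content.Find
            .ClearFormatting
            .Text = KEYWORD
            .Forward = True
            .Wrap = wdFindStop
            .MatchWildcards = False
            .MatchCase = False
            ' Execute returns True when the keyword is found.
            If .Execute Then objLog.WriteLine f.Path
        End With
        objDoc.Close False   ' close without saving changes
    End If
Next

objWord.Quit
objLog.Close
```

Searching objDoc.Content rather than the Selection means each file needs only one Find call, since the report only has to record whether the keyword occurs at all. If a document fails to open (for example, because it is password protected), this sketch stops with an error; error handling is left out to keep it short.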

Answer to author's comment below:

The code that lists the files in folder c:\temp starts at the For Each loop. It looks at each file and, if the file passes the extension test, adds its path to the list object. The original version tested InStr(f.Path, ".txt"), which matches ".txt" anywhere in the path, so a file named "this is my.txt file.pdf" would have been selected. I have changed the test to compare only the rightmost 4 and 5 characters of the path against ".DOC" and ".DOCX", so that only the extension is checked. As it stands, the script should find any .DOC or .DOCX file in folder c:\temp.
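An alternative to comparing the rightmost characters, offered here only as a sketch, is to let FileSystemObject extract the extension with GetExtensionName. The helper name IsWordFile and the sample paths are made up for the illustration; GetExtensionName works on the path string alone, so the files do not need to exist.

```vbscript
Option Explicit

Dim fso
Set fso = CreateObject("Scripting.FileSystemObject")

' Hypothetical helper: True when the path ends in .doc or .docx.
Function IsWordFile(path)
    Dim ext
    ' GetExtensionName returns the text after the last dot, without the dot.
    ext = LCase(fso.GetExtensionName(path))
    IsWordFile = (ext = "doc" Or ext = "docx")
End Function

' Sample paths, invented for the example.
WScript.Echo CStr(IsWordFile("C:\temp\report.DOCX"))
WScript.Echo CStr(IsWordFile("C:\temp\this is my.txt file.pdf"))
WScript.Echo CStr(IsWordFile("C:\temp\notes.doc.bak"))
```

Only the first path should be reported as a Word file. The other two contain ".txt" or ".doc" in the middle of the name but end in a different extension.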

Hope this helps!
