How to Copy Data from Multiple MS Word Files to Excel Using VBA

microsoft wordmicrosoft-excel-2007microsoft-officevba

I know this question was already asked (Copying data from multiple word docs into one excel sheet) the thing is I can't use the answer.

I'm fresh to VBA, but I thought I can handle it. I was wrong. I was trying to use the code provided in the mentioned thread to parse some Word documents, at first with some amendments, then just using the original code. Unfortunately, I get the "object required" run-time error.

The code is provided below. The documents I'm trying to get data from are Word 2003 files (I first tried to change the "docx" to "doc", then to save the documents in docx and using the original script, didn't help). One thing is that they are in fact scanned and ocr'ed paper documents, so…
a) most of the tables inside are kept in frames (don't know if it changes anything, supposedly no, considering their xml structure)
b) when I try to save them as docx the application first proposes to save them as rtfs. So maybe they're in fact rtf files, not .doc?

Sub macro1()
  Dim xl As Object
 Set xl = CreateObject("excel.application")

 xl.Workbooks.Add
 xl.Visible = True

 'Here put your path where you have your documents to read:
 myPath = "C:\some\path\"  'End with '\'
 myFile = Dir(myPath & "*.docx")

 xlRow = 1
 Do While myFile <> ""
  Documents.Open Filename:=myPath & myFile, ConfirmConversions:=False, _
     ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:="", _
     PasswordTemplate:="", Revert:=False, WritePasswordDocument:="", _
     WritePasswordTemplate:="", Format:=wdOpenFormatAuto, XMLTransform:=""

  xlCol = 0
  For Each t In ActiveDocument.Tables
     For Each r In t.Rows
        For Each c In r.Range.Cells
           myText = c
           myText = Replace(myText, Chr(13), "")
           myText = Replace(myText, Chr(7), "")
           xlCol = xlCol + 1
           xl.ActiveWorkbook.ActiveSheet.Cells(xlRow, xlCol) = myText

        Next c
        xl.ActiveWorkbook.ActiveSheet.Cells(xlRow, xlCol + 1) = myFile
        xlRow = xlRow + 1
        xlCol = 0
     Next r
  Next t
  ActiveWindow.Close False

  myFile = Dir
  Loop

 xl.Visible = True
 End Sub

Best Answer

I have tested it. It actually works works well. Several points to have in mind before using the current version of the code:

  1. It should be added to Word VBA, not Excel or other (this might be the reason why you received the "object required" error).
  2. It processes just .docx
  3. It processes all actual MS Word tables, not pictures that might look like tables.

I have slightly modified the code to make it a bit more readable, at least for me, coming from the Excel VBA world. Your should always use Option Explicit!

Option Explicit

Sub Word_tables_from_many_docx_to_Excel()
Dim myPath As String, myFile As String, myText As String
Dim xlRow As Long, xlCol As Long
Dim t As Table
Dim r As Row
Dim c As Cell
Dim xl As Object
 Set xl = CreateObject("excel.application")

 xl.Workbooks.Add
 xl.Visible = True

 'Here put your path where you have your documents to read:
 myPath = "C:\Temp\"  'End with '\'
 myFile = Dir(myPath & "*.docx")

 xlRow = 1
 Do While myFile <> ""
 Documents.Open myPath & myFile

  For Each t In ActiveDocument.Tables
     For Each r In t.Rows
        xlCol = 1
        For Each c In r.Range.Cells
           myText = c.Range.Text
           myText = Replace(myText, Chr(13), "")
           myText = Replace(myText, Chr(7), "")
           xl.ActiveWorkbook.ActiveSheet.Cells(xlRow, xlCol) = myText
           xlCol = xlCol + 1
        Next c
        xl.ActiveWorkbook.ActiveSheet.Cells(xlRow, xlCol + 1) = myFile
        xlRow = xlRow + 1
     Next r
     xlRow = xlRow + 1
  Next t

  ActiveWindow.Close False

 myFile = Dir
 Loop

End Sub
Related Question