If your office is anything like mine, you get a lot of Web site content sent to you inside Word documents. Most of the time, that’s the best way of doing things, since formatting within e-mail is so inconsistent and unreliable. The main problem with using Word documents to send new Web site content is the images: a lot of people embed them directly in the Word document without attaching the originals to the e-mail message.
Normally, it can be extremely difficult to extract images from Word documents. Word does not include an obvious, reliable tool for saving embedded images back out of a document, so once they’re in there, there’s no simple way to get the originals out.
Some people have suggested selecting the image, copying it and pasting it into a photo manipulation program (like Photoshop or GIMP). Many times, though, you end up with a degraded copy of the image when you paste it. You could also try capturing a screen shot of the Word document and then cropping that screen shot so that only the image shows up.
Unfortunately, neither of those methods is particularly reliable or consistent. However, there is a very simple way to extract images from Word 2007 documents. This method takes advantage of the fact that Office 2007 files are actually ZIP archives. To extract the images, simply rename the Word 2007 document (or a copy of it), replacing the “docx” file extension with a “zip” file extension. Then, use a program like WinRAR or the “Compressed Folders” feature built into Windows to open the ZIP archive.
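The same fact can be checked from a script, without renaming anything. Here is a minimal sketch in Python using the standard zipfile module; the function name list_archive_entries and the file name report.docx are placeholders of my own, not part of any Office tool.

```python
import zipfile


def list_archive_entries(path):
    """Return the entry names inside an Office 2007 file, or [] if it is not a ZIP archive."""
    # is_zipfile looks at the file's contents, not its extension,
    # so a .docx file passes and an old binary .doc file does not.
    if not zipfile.is_zipfile(path):
        return []
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    # "report.docx" is a placeholder file name used for illustration.
    for name in list_archive_entries("report.docx"):
        print(name)
```

An empty result is a quick hint that the file is in the older binary format, or is not an Office file at all.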
It should be noted that this process only works with the XML-based file formats introduced in Office 2007 (docx, pptx, xlsx, etc.). If you have a Word 97-2003 document (doc), open it in Word 2007 and save it as a Word 2007 document (docx) first; the renaming trick will then work on the new file.
Within the archive, you’ll see a folder called “word”, then a folder inside of that called “media”. Within the “media” directory, you should see all of the images that were embedded in the document. If you are working with a PowerPoint 2007 presentation, the images can be found in “\ppt\media\”, and images within an Excel 2007 workbook will be in “\xl\media\”.
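If you have to do this often, pulling the files out of those folders can be scripted. The following is a rough sketch, assuming Python’s standard zipfile module; extract_media and the output folder are hypothetical names chosen for this example. Note that entry names inside a ZIP archive use forward slashes, whatever Windows shows you.

```python
import os
import zipfile

# Media folders for Word, PowerPoint, and Excel 2007 files, as described above.
MEDIA_PREFIXES = ("word/media/", "ppt/media/", "xl/media/")


def extract_media(office_path, out_dir):
    """Copy every embedded image out of an Office 2007 file; return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with zipfile.ZipFile(office_path) as archive:
        for name in archive.namelist():
            # Skip anything outside the media folders, and skip folder entries.
            if not name.startswith(MEDIA_PREFIXES) or name.endswith("/"):
                continue
            # Keep only the file name, so images land directly in out_dir.
            target = os.path.join(out_dir, os.path.basename(name))
            with archive.open(name) as src, open(target, "wb") as dst:
                dst.write(src.read())
            saved.append(target)
    return saved
```

Calling extract_media("report.docx", "images") would, under these assumptions, leave the original document untouched and write the embedded images into an “images” folder.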
The images will be named with consecutive numbers based on when they were inserted in the document (“image1.jpeg”, “image2.jpeg”, etc.), and the file extension will match the format of each image.
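One catch with this naming scheme: the numbers are not zero-padded, so a plain alphabetical sort puts “image10.jpeg” ahead of “image2.jpeg”. If the order matters to you, a small sketch like the one below sorts by the number instead. The helper name image_number is my own invention for illustration.

```python
import re


def image_number(filename):
    """Return the number embedded in a name like 'image12.jpeg' (0 if there is none)."""
    match = re.search(r"image(\d+)\.", filename)
    return int(match.group(1)) if match else 0


names = ["image10.jpeg", "image2.jpeg", "image1.jpeg"]
# Sort numerically rather than alphabetically.
print(sorted(names, key=image_number))
# prints ['image1.jpeg', 'image2.jpeg', 'image10.jpeg']
```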