
Finding text inside Word Docs without using Word
Quote:
>Get yourself a hex viewer and look at a Word document. You will find that
>text is stored simply as text. Therefore, opening the Word document in
>binary mode and using Instr works like a charm.
Well, yes and no. Yes, Word docs are usually saved in a binary format that
nonetheless allows text grepping, but often the structure (and text) is
screwy. So you should have no trouble seeing if a given word appears
*someplace* in the doc, but large-scale parsing is a drag.
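For what it's worth, here is a rough Python sketch of the "open it in binary mode and search" idea. The function names are made up for illustration. It assumes the text sits in the file either as plain 8-bit (Windows-1252) bytes or as 16-bit little-endian Unicode, which as far as I know are the two forms Word 97 uses, and it does an exact, case-sensitive match. It will miss a word that is split up by formatting debris.

```python
def contains_text(data: bytes, word: str) -> bool:
    """Return True if `word` appears in the raw bytes in either encoding."""
    for encoding in ("cp1252", "utf-16-le"):
        try:
            needle = word.encode(encoding)
        except UnicodeEncodeError:
            continue  # word cannot be represented in this encoding
        if needle in data:
            return True
    return False


def file_contains_text(path: str, word: str) -> bool:
    """Read the whole file in binary mode and search it (hypothetical helper)."""
    with open(path, "rb") as fh:
        return contains_text(fh.read(), word)
```

Something like file_contains_text("report.doc", "budget") tells you whether the word is in there someplace, which is about all this approach is good for.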
Plus, Word97 docs are often saved as RTF if the user wants them to be
readable in Word95. So, you'll find lots of {\pard stuff.
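A quick way to tell which kind of file you have is to look at the first few bytes. The Python sketch below is only a guess at a workable check, and the names are mine: it treats a file starting with {\rtf as RTF and one starting with the OLE compound-file signature (D0 CF 11 E0 A1 B1 1A E1) as a binary Word doc. It does not validate anything beyond that.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound-file signature
RTF_MAGIC = b"{\\rtf"                          # RTF files open with {\rtf


def sniff_format(data: bytes) -> str:
    """Classify raw file bytes as 'rtf', 'ole', or 'unknown' by their header."""
    if data.startswith(RTF_MAGIC):
        return "rtf"
    if data.startswith(OLE_MAGIC):
        return "ole"
    return "unknown"
```

Keep in mind that in RTF, characters outside plain ASCII are written as escapes like \'e9, so a straight byte search can miss accented words there.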
I've been parsing Word docs using Perl on a Unix box, and running into all
sorts of fun.
James
Quote:
>>Is there a way to find text inside Word documents, on a computer where Word
>>is not available (only WordView), and where I can't use Find (advanced) from
>>W95?
>>Is it possible to create a program that can do this?
>>Any suggestions are welcome....
>>Stephan
>>--
>>To send E-mail, remove the spamblock ".rotzooi"