Searching a WORD 6 DOC by line number

I want to write a PERL script to read through a MS Word document and
return a line number where a match occurs.

The problem is the binary line breaks. How can I open the file into an
array that can be read line by line?  I've tried out splitting on \m and
a few others but I'm not sure what WORD uses for its line breaks.

Any help?

Thanks



Sat, 06 May 2000 03:00:00 GMT  

I know nothing about the internals of Word documents, but if I wanted to
solve this problem I would look at a dump of a file to see what is used
for the line breaks and other control characters. Debug is your friend.
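
A minimal Perl sketch of that kind of dump, offered only as an illustration and not as anything from the original posts. The hexdump helper and the file name in the usage comment are made up for this example. It prints each 16-byte chunk as an offset, the hex values, and the printable characters, so any byte that keeps turning up between runs of readable text stands out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Format a string of raw bytes as dump lines: offset, hex bytes, printable text.
sub hexdump {
    my ($bytes) = @_;
    my @lines;
    for (my $off = 0; $off < length $bytes; $off += 16) {
        my $chunk = substr($bytes, $off, 16);
        my $hex   = join ' ', map { sprintf '%02x', ord } split //, $chunk;
        (my $text = $chunk) =~ s/[^\x20-\x7e]/./g;   # non-printables shown as "."
        push @lines, sprintf('%08x  %-47s  %s', $off, $hex, $text);
    }
    return @lines;
}

# Usage: perl dump.pl report.doc   (report.doc is a hypothetical file name)
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    binmode $fh;     # read raw bytes, no newline translation
    local $/;        # slurp the whole file
    my $data = <$fh>;
    close $fh;
    print "$_\n" for hexdump($data);
}
```

Control bytes such as 0d or 0b sitting between stretches of text are the obvious candidates to check, but that is a guess to confirm against your own file.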

Cheers,

Jim

: I want to write a PERL script to read through a MS Word document and
: return a line number where a match occurs.

: The problem is the binary line breaks. How can I open the file into an
: array that can be read line by line?  I've tried out splitting on \m and
: a few others but I'm not sure what WORD uses for its line breaks.



Sun, 07 May 2000 03:00:00 GMT  

On Tue, 18 Nov 1997 17:37:17 -0600, Michael Farris wrote:

>I want to write a PERL script to read through a MS Word document and
>return a line number where a match occurs.

>The problem is the binary line breaks. How can I open the file into an
>array that can be read line by line?  I've tried out splitting on \m and
>a few others but I'm not sure what WORD uses for its line breaks.

They say "a little knowledge is a dangerous thing", and I think they
are right.

Don't try to treat a Word document like a text file, because it's not.
Save as text, then run through it, or look at the LAOLA docs for some
more info.
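
For the save-as-text route, a rough sketch of the search step might look like this. It is an illustration under assumptions, not a tested recipe from this thread: the matching_lines helper and the file names are invented for the example, and it assumes the document has already been saved from Word as plain text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 1-based numbers of the lines in $text that match $pattern.
# Accepts DOS (\r\n), Unix (\n) and old Mac (\r) line endings.
sub matching_lines {
    my ($text, $pattern) = @_;
    my @hits;
    my $n = 0;
    for my $line (split /\r\n|\n|\r/, $text) {
        $n++;
        push @hits, $n if $line =~ /$pattern/;
    }
    return @hits;
}

# Usage: perl findline.pl PATTERN report.txt   (hypothetical names)
if (@ARGV == 2) {
    my ($pattern, $file) = @ARGV;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    binmode $fh;     # keep the line endings as saved
    local $/;        # slurp the whole file
    my $text = <$fh>;
    close $fh;
    print "$_\n" for matching_lines($text, $pattern);
}
```

The numbers refer to lines in the saved text file, which will not necessarily match the lines Word shows on screen.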

Mike



Sun, 07 May 2000 03:00:00 GMT  

Quote:

> I want to write a PERL script to read through a MS Word document and
> return a line number where a match occurs.

> The problem is the binary line breaks. How can I open the file into an
> array that can be read line by line?  I've tried out splitting on \m and
> a few others but I'm not sure what WORD uses for its line breaks.

Check

http://user.cs.tu-berlin.de/~schwartz/pmh/laola.html

It may help.
--
Bob Breedlove
WebSite Automation/SurfnetUSA Technical Support

http://colo.tntmedia.com/~breedlov/



Mon, 08 May 2000 03:00:00 GMT  
 