Wanted: module to access MS-Word Documents 

Hi all there,

Do you know of any modules to access (scan) MS-Word documents?  I need
this to create an index for a search engine.

Any other suggestions?

greetings Carsten
--
GiS - Gesellschaft fuer integrierte Systemplanung mbH   Tel. +49-6201-503-38
Carsten Schabacker                                      Fax  +49-6201-503-66

D-69469 Weinheim                                           http://www.*-*-*.com/



Sat, 17 Mar 2001 03:00:00 GMT  
 Wanted: module to access MS-Word Documents
This is no problem with the Python win32com extensions.  MSOffice works
well with them.

The biggest problem will be finding documentation on - then understanding! -
the MSWord object model.  I believe it is in the MS Office Developers kit.

When you install win32all, check out the win32com\test\testMSOffice.py file
for some basic samples that use MSWord.
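
For the indexing use case, a minimal sketch of pulling the plain text out of
a document through COM might look like the following.  It assumes Windows, an
installed copy of Word, and the win32com package.  The function name
extract_word_text and the word_app parameter are made up for illustration;
the calls used (Documents.Open, Content.Text, Close, Quit) come from the Word
object model, not from testMSOffice.py.

```python
import os


def extract_word_text(path, word_app=None):
    """Return the plain text of a Word document via COM automation.

    word_app is a hypothetical hook: pass an already running
    Word.Application object to reuse it, or leave it as None to
    start (and later quit) a private instance.
    """
    own_app = word_app is None
    if own_app:
        # Imported lazily because win32com only exists on Windows.
        import win32com.client
        word_app = win32com.client.Dispatch("Word.Application")
        word_app.Visible = False

    # Word resolves relative paths against its own working directory,
    # so hand it an absolute path.
    doc = word_app.Documents.Open(os.path.abspath(path))
    try:
        # Content is a Range covering the main document body.
        return doc.Content.Text
    finally:
        doc.Close(False)  # False: do not save changes
        if own_app:
            word_app.Quit()
```

When indexing many files it is probably worth creating one Word.Application
up front and passing it in, since starting Word for every document is slow.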

Mark.

Quote:

>Hi all there,

>do you know about modules to access (scan) MS-Word Documents.  I need
>this to create a index for a search engine.

>Any other suggestions ?

>greetings Carsten
>--
>GiS - Gesellschaft fuer integrierte Systemplanung mbH   Tel. +49-6201-503-38
>Carsten Schabacker                                      Fax  +49-6201-503-66
>Junkersstr. 2                               E-Mail
>D-69469 Weinheim                                          http://www.ibfs.de


Sat, 17 Mar 2001 03:00:00 GMT  
 Wanted: module to access MS-Word Documents
Am I right in my understanding that Microsoft changes these APIs from
version to version and does not maintain backwards compatibility? I can't
figure out how to get PowerPoint 7-targeted code to work with PowerPoint
8. Or am I doing something wrong?

 Paul Prescod

Quote:

> This is not a problem with the Python win32com extensions.  MSOffice works
> well.

> The biggest problem will be finding documentation on - then understanding! -
> the MSWord object model.  I believe it is in the MS Office Developers kit.

> When you install win32all, check out the win32com\test\testMSOffice.py file
> for some basic samples that use MSWord.

> Mark.


> >Hi all there,

> >do you know about modules to access (scan) MS-Word Documents.  I need
> >this to create a index for a search engine.

> >Any other suggestions ?

> >greetings Carsten
> >--
> >GiS - Gesellschaft fuer integrierte Systemplanung mbH   Tel. +49-6201-503-38
> >Carsten Schabacker                                      Fax  +49-6201-503-66
> >Junkersstr. 2                               E-Mail
> >D-69469 Weinheim                                          http://www.ibfs.de

--
 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

Bart: Dad, do I really have to brush my teeth?
Homer: No, but at least wash your mouth out with soda.



Sat, 17 Mar 2001 03:00:00 GMT  
 Wanted: module to access MS-Word Documents

Thanks to all the people who answered me by mail, but I must quote myself:

Quote:
> do you know about modules to access (scan) MS-Word Documents.  I need
> this to create a index for a search engine.

I forgot to say that I want to do this on Unix (our file server).  So
I cannot use the MS DLLs (win32*), can I?

I think I need a module that knows how to handle the MS-Word file format.
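
A first step for such an indexer on Unix would be to recognize candidate
files without any Microsoft code.  The sketch below only checks the 8-byte
signature of the OLE2 compound file container that binary .doc files are
stored in.  The name looks_like_ole2 is made up for illustration, and the
check is necessary but not sufficient, because Excel and PowerPoint files use
the same container.

```python
# Signature at the start of every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(path):
    """Return True if the file starts with the OLE2 container signature."""
    with open(path, "rb") as f:
        return f.read(len(OLE2_MAGIC)) == OLE2_MAGIC
```

Files that pass this test still need a real parser (or a converter) to get
at the text itself.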

carsten
--
GiS - Gesellschaft fuer integrierte Systemplanung mbH   Tel. +49-6201-503-38
Carsten Schabacker                                      Fax  +49-6201-503-66

D-69469 Weinheim                                          http://www.ibfs.de



Sat, 17 Mar 2001 03:00:00 GMT  
 Wanted: module to access MS-Word Documents

Quote:

>I forgot to say, that I want to do this on Unix (our Fileserver).  So
>I can not use MS-DLL's (win32*), can I ?  

You might try 'WordView', available at:

http://www.csn.ul.ie/~caolan/docs/MSWordView.html

You could use it to convert word to html, then use htmllib to parse
the html.   Or maybe the source code could be used to create a real
extension module.   If all else fails, that page has a link to the
format of the word files, and you could write a parser or something.
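
The convert-then-parse route could be sketched as below.  It assumes the Word
file has already been converted to HTML by some external tool, and it swaps
in html.parser from the current standard library in place of htmllib, which
no longer ships with Python 3.  TextExtractor and html_to_words are names
made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping script/style."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_words(html):
    """Return the lower-cased words of an HTML string, ready for indexing."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text().lower().split()
```

The word list can then be fed to whatever index the search engine uses.  Note
that this simple version puts a space at every tag boundary, so a word split
by inline markup would be indexed as two tokens.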

This would sure be a handy module to have!



Sat, 17 Mar 2001 03:00:00 GMT  
 Wanted: module to access MS-Word Documents
The APIs do change from version to version. All in the name of progress :-)

-g

Quote:

>Am I right in my understanding that Microsoft changes these APIs from
>version to version and does not maintain backwards compatibility? I can't
>figure out how to get PowerPoint 7-targeted code to work with PowerPoint
>8. Or am I doing something wrong?
>...



Sat, 17 Mar 2001 03:00:00 GMT  
 