DOC2TXT(1)							    DOC2TXT(1)

NAME
       doc2txt, doc2ps, wdoc2txt, xls2txt, olefs, mswordstrings, msexceltables
       - extract printable text from Microsoft documents

SYNOPSIS
       doc2txt [ file.doc ]
       doc2ps [ file.doc ]
       wdoc2txt [ file.doc ]
       xls2txt [ file.xls ]
       aux/olefs [ -m mtpt ] file.doc
       aux/mswordstrings mtpt/WordDocument
       aux/msexceltables [ -qaDnt ] [ -d delim ] [ -c column-range ]
              [ -w worksheet-range ] mtpt/Workbook

DESCRIPTION
       Doc2txt is an rc(1) script that uses olefs and mswordstrings to extract
       the printable text from the body of a Microsoft Word document and write
       it on the standard output.  Doc2ps is similar, but emits PostScript
       corresponding to the document.  Wdoc2txt is similar to doc2txt, but
       uses plumb(1) to send the output to a new acme(1) window instead.
       Xls2txt performs a similar function for Microsoft Excel documents.

       Microsoft Office documents are stored in OLE (Object Linking and
       Embedding) format, which is a scaled down version of Microsoft's FAT
       file system.  Olefs presents the contents of an MS Office document as
       a file system on mtpt, which defaults to /mnt/doc.  Mswordstrings or
       msexceltables may then be used to parse the files inside, extracting a
       text stream.  Msexceltables may be given options to control the
       formatting of its output.

       -a     Attempt conversion of non-tabular sheets in the workbook.

       -d delim
	      Sets the inter-field delimiter to the string delim, by default a
	      single space.

       -D     Enables debugging output.

       -c range
              Range is a comma-separated list of column numbers and ranges.
              The two ends of a range are separated by a dash.  Processing
              is limited to the columns named; by default all columns are
              output.

       -n     Disables field padding to column width.

       -q     Disables quoting of textual fields (see quote(2)).

       -t     Truncate fields to the column width.

       -w range
              Range is a comma-separated list of worksheet numbers and
              ranges.  It limits the sheets output, using the same syntax as
              the -c option above.  Suppressed chart pages are always
              included in the sheet count.
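
       The range syntax shared by -c and -w can be illustrated with a short
       Python sketch.  It is not the parser used by msexceltables; the
       function name parse_ranges is hypothetical, and treating both ends of
       a range as inclusive is an assumption that this page does not state.

```python
def parse_ranges(spec):
    """Expand a spec such as '1,7,9-14' into a sorted list of integers.

    Assumption: a dash-separated range includes both of its end points.
    """
    selected = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            raise ValueError(f"empty item in range spec {spec!r}")
        if "-" in item:
            # A range: two numbers separated by a dash.
            lo_text, hi_text = item.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
            if lo > hi:
                raise ValueError(f"descending range {item!r}")
            selected.update(range(lo, hi + 1))
        else:
            # A single column or worksheet number.
            selected.add(int(item))
    return sorted(selected)


print(parse_ranges("1,7,9-14"))  # [1, 7, 9, 10, 11, 12, 13, 14]
print(parse_ranges("3-5"))       # [3, 4, 5]
```

       Under that reading, a specification such as 1,7,9-14 names eight
       sheets and 3-5 names three columns.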

EXAMPLE
       Extract pieces of an MS Excel spreadsheet.
              aux/olefs report.xls
              aux/msexceltables -q -w 1,7,9-14 -c 3-5 -n -d '@' /mnt/doc/Workbook > rpt.txt
              unmount /mnt/doc
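
       The example above turns off quoting (-q) and padding (-n) and sets
       the delimiter to '@'.  The Python sketch below is a toy model of how
       those options, together with -t, might shape one output row.  It is
       not taken from the msexceltables source: the names quote_field and
       format_row, the quoting rule (single quotes around fields that are
       empty or contain white space or quotes, with embedded quotes
       doubled), and the order in which the steps are applied are all
       assumptions made for illustration.

```python
def quote_field(text):
    """Assumed rc-style quoting: wrap the field in single quotes and double
    any embedded single quote, but only when the field needs it."""
    if text == "" or any(ch in " \t\n'" for ch in text):
        return "'" + text.replace("'", "''") + "'"
    return text


def format_row(fields, widths, delim=" ", pad=True, truncate=False, quote=True):
    """Join one row of text fields the way the options are described.

    delim    -> -d delim (default: a single space)
    pad      -> False models -n (no padding to column width)
    truncate -> True models -t (cut fields to column width)
    quote    -> False models -q (no quoting of text fields)
    """
    cells = []
    for text, width in zip(fields, widths):
        if quote:
            text = quote_field(text)
        if truncate:
            text = text[:width]
        if pad:
            text = text.ljust(width)
        cells.append(text)
    return delim.join(cells)


row = ["Q1 sales", "42", "north"]
widths = [10, 4, 6]

# Defaults: quoted, padded, space-delimited.
print(format_row(row, widths))

# Modelled on the options in the example: -q -n -d '@'
print(format_row(row, widths, delim="@", pad=False, quote=False))  # Q1 sales@42@north
```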

SOURCE
              doc2txt, doc2ps, wdoc2txt, and xls2txt

              the others

SEE ALSO
       ``Microsoft Word 97 Binary File Format'', at Microsoft's developer
       (MSDN) home page.
       ``LAOLA Binary Structures''.
       ``OpenOffice.Org's Excel Documentation''.
