Package org.apache.poi.hssf.extractor
Class OldExcelExtractor
java.lang.Object
org.apache.poi.hssf.extractor.OldExcelExtractor
- All Implemented Interfaces:
Closeable, AutoCloseable
A text extractor for old Excel files, which are too old for
HSSFWorkbook to handle. This includes Excel 95, and very old
(pre-OLE2) Excel files, such as Excel 4 files.
Returns much (but not all) of the textual content of the file. The output is suitable for indexing by a tool such as Apache Lucene, or for use by Apache Tika, but is not really intended for display to the user.
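To make the OLE2 versus pre-OLE2 distinction concrete, the following is a minimal sketch of how the two containers might be told apart from the first bytes of a file. It is not how this class is documented to work; ExcelContainerSniffer, Kind and sniff are hypothetical names. The sketch assumes the standard OLE2 compound-document signature and that a pre-OLE2 file starts directly with a BIFF BOF record (record ID 0x0009, 0x0209, 0x0409 or 0x0809, stored little-endian).

```java
final class ExcelContainerSniffer {
    /** Hypothetical classification of the leading bytes of a file. */
    enum Kind { OLE2, RAW_BIFF, UNKNOWN }

    // First 8 bytes of every OLE2 compound document.
    private static final int[] OLE2_SIGNATURE =
            {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /** Classifies a file from its leading bytes (at least 8 are needed to detect OLE2). */
    static Kind sniff(byte[] head) {
        if (head == null) {
            return Kind.UNKNOWN;
        }
        if (head.length >= OLE2_SIGNATURE.length) {
            boolean match = true;
            for (int i = 0; i < OLE2_SIGNATURE.length; i++) {
                if ((head[i] & 0xFF) != OLE2_SIGNATURE[i]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return Kind.OLE2;
            }
        }
        if (head.length >= 2) {
            // Assumption: a raw BIFF stream begins with a BOF record whose ID
            // has low byte 0x09 and a high byte that encodes the BIFF generation.
            int low = head[0] & 0xFF;
            int high = head[1] & 0xFF;
            if (low == 0x09 && (high == 0x00 || high == 0x02 || high == 0x04 || high == 0x08)) {
                return Kind.RAW_BIFF;
            }
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        byte[] excel4Header = {0x09, 0x04, 0x06, 0x00}; // BOF record ID 0x0409
        System.out.println(sniff(excel4Header));
    }
}
```

A caller would read the first eight bytes of the file and pass them to sniff; either result could still be a file that this extractor rejects, so the check is only a hint.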
-
Constructor Summary
Constructors
Constructor | Description
OldExcelExtractor(InputStream input) |
OldExcelExtractor(DirectoryNode directory) |
-
Method Summary
Modifier and Type | Method | Description
void | close() |
int | getBiffVersion() | The Biff version, largely corresponding to the Excel version
int | getFileType() | The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE
String | getText() | Retrieves the text contents of the file, as best we can for these old file formats
protected void | handleNumericCell(StringBuffer text, double value) |
static void | main(String[] args) |
-
Constructor Details
-
OldExcelExtractor
- Throws:
IOException
-
OldExcelExtractor
- Throws:
IOException
-
OldExcelExtractor
- Throws:
IOException
-
OldExcelExtractor
- Throws:
IOException
-
-
Method Details
-
main
- Throws:
IOException
-
getBiffVersion
public int getBiffVersion()
The Biff version, largely corresponding to the Excel version
- Returns:
- the Biff version
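As an illustration of how the Biff version "largely" corresponds to the Excel version, here is a small sketch that turns the number into a label. BiffVersionNames and describe are hypothetical names, not part of the library, and the sketch assumes the method returns the plain BIFF generation number (for example 4 for Excel 4 files and 5 for Excel 5.0 and Excel 95 files).

```java
final class BiffVersionNames {
    /** Maps a BIFF generation number to the Excel release(s) that wrote it. */
    static String describe(int biffVersion) {
        switch (biffVersion) {
            case 2:
                return "Excel 2.x";
            case 3:
                return "Excel 3.0";
            case 4:
                return "Excel 4.0";
            case 5:
                // Excel 5.0 and Excel 95 share the same BIFF generation,
                // which is why the mapping is only approximate.
                return "Excel 5.0 / Excel 95";
            case 8:
                return "Excel 97-2003";
            default:
                return "unknown (BIFF" + biffVersion + ")";
        }
    }

    public static void main(String[] args) {
        // In real use the argument would come from getBiffVersion().
        System.out.println(describe(4));
    }
}
```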
-
getFileType
public int getFileType()
The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE
- Returns:
- the file type
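A caller will usually want to branch on the returned file type. The sketch below shows one way to do that. To stay self-contained it redeclares the four constants locally, with values assumed to match the BIFF BOF record layout (0x10, 0x20, 0x40, 0x100); real code would refer to the BOFRecord constants directly. BofFileTypes and describe are hypothetical names.

```java
final class BofFileTypes {
    // Assumed to mirror BOFRecord.TYPE_WORKSHEET and friends; these local
    // copies exist only so the sketch compiles without the library.
    static final int TYPE_WORKSHEET = 0x10;
    static final int TYPE_CHART = 0x20;
    static final int TYPE_EXCEL_4_MACRO = 0x40;
    static final int TYPE_WORKSPACE_FILE = 0x100;

    /** Returns a short human-readable name for a file type code. */
    static String describe(int fileType) {
        switch (fileType) {
            case TYPE_WORKSHEET:
                return "worksheet";
            case TYPE_CHART:
                return "chart";
            case TYPE_EXCEL_4_MACRO:
                return "Excel 4 macro sheet";
            case TYPE_WORKSPACE_FILE:
                return "workspace file";
            default:
                return "other (0x" + Integer.toHexString(fileType) + ")";
        }
    }

    public static void main(String[] args) {
        // In real use the argument would come from getFileType().
        System.out.println(describe(TYPE_CHART));
    }
}
```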
-
getText
public String getText()
Retrieves the text contents of the file, as best we can for these old file formats
- Returns:
- the text contents of the file
-
close
public void close()
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
-
handleNumericCell
-