PDF to Excel Converter
Converts PDF files to Excel. Only the text is extracted.
You can use the Java application (in the org/eadge/extractpdfexcel/0.1 directory) to convert a single PDF file to Excel format.
java -jar extractpdfexcel-0.1.jar source.pdf result.xcl
Consult the help to see the available options. For example, you may want to set the column and row width.
java -jar extractpdfexcel-0.1.jar
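If several PDF files need converting, the same command can be driven from a small Java program. The sketch below is only an illustration: it assumes the jar sits in the working directory and that the two positional arguments are the source and result paths, as in the command shown earlier. `BatchConvertSketch`, `resultNameFor`, and `commandFor` are hypothetical helper names and are not part of the tool.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class BatchConvertSketch {
    // Assumption: the jar is in the current working directory.
    static final String JAR = "extractpdfexcel-0.1.jar";

    // "report.pdf" -> "report.xcl"; any other name just gets ".xcl" appended.
    static String resultNameFor(String pdfName) {
        boolean isPdf = pdfName.toLowerCase(Locale.ROOT).endsWith(".pdf");
        String stem = isPdf ? pdfName.substring(0, pdfName.length() - 4) : pdfName;
        return stem + ".xcl";
    }

    // Builds: java -jar extractpdfexcel-0.1.jar <source> <result>
    // The result file is placed next to the source file.
    static List<String> commandFor(Path pdf) {
        Path result = pdf.resolveSibling(resultNameFor(pdf.getFileName().toString()));
        return Arrays.asList("java", "-jar", JAR, pdf.toString(), result.toString());
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Each program argument is treated as the path of one PDF to convert.
        for (String arg : args) {
            Process process = new ProcessBuilder(commandFor(Paths.get(arg)))
                    .inheritIO()
                    .start();
            int exitCode = process.waitFor();
            System.out.println(arg + " -> exit code " + exitCode);
        }
    }
}
```

Each file is converted in its own process, so one failing PDF does not stop the rest; the exit code is printed so failures can be spotted.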
To use the library in a Maven project, add the repository:
<repository>
<id>Extract-PDF-Excel</id>
<url>https://raw.githubusercontent.com/eadgyo/Extract-PDF-Excel/master/</url>
</repository>
And the dependency:
<dependency>
<groupId>org.eadge</groupId>
<artifactId>extractpdfexcel</artifactId>
<version>0.1</version>
</dependency>
You can convert a PDF into an Excel file from a Java application.
PdfConverter.createExcelFile("File.pdf", "File.xcl");
You can also run each step separately if you need access to the data before the file is created.
Text information is extracted and kept in blocks.
ExtractedData extractedData = PdfConverter.extractFromFile(sourcePDFPath, textBlockIdentifier);
The blocks are sorted, and lines and columns are created.
SortedData sortedData = PdfConverter.sortExtractedData(extractedData, lineAxis, columnAxis, true);
2D arrays containing the text blocks are created.
ArrayList<XclPage> excelPages = PdfConverter.createExcelPages(sortedData);
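To make the sorting and 2D-array steps more concrete, here is a minimal, self-contained sketch of how positioned text blocks can be grouped into rows and columns and placed in a grid. It is not the library's implementation: `TextBlock`, `GridSketch`, `bands`, `bandIndex`, `toGrid`, and the `tolerance` parameter are hypothetical names, the grouping is a simple coordinate-band heuristic, and `y` is assumed to grow downward from the top of the page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for an extracted text block: its text and position.
final class TextBlock {
    final String text;
    final double x;
    final double y;

    TextBlock(String text, double x, double y) {
        this.text = text;
        this.x = x;
        this.y = y;
    }
}

final class GridSketch {

    // Collapses ascending coordinates into bands. A value opens a new band when
    // it lies more than `tolerance` past the start of the current band.
    static List<Double> bands(Iterable<Double> ascendingValues, double tolerance) {
        List<Double> starts = new ArrayList<>();
        for (double value : ascendingValues) {
            if (starts.isEmpty() || value - starts.get(starts.size() - 1) > tolerance) {
                starts.add(value);
            }
        }
        return starts;
    }

    // Index of the last band whose start is <= value (band starts are ascending).
    static int bandIndex(List<Double> starts, double value) {
        int index = 0;
        for (int i = 0; i < starts.size(); i++) {
            if (starts.get(i) <= value) {
                index = i;
            }
        }
        return index;
    }

    // Places every block in a [row][column] cell. Blocks that land in the same
    // cell are joined with a space, in input order. Empty cells stay null.
    static String[][] toGrid(List<TextBlock> blocks, double tolerance) {
        TreeSet<Double> xs = new TreeSet<>();
        TreeSet<Double> ys = new TreeSet<>();
        for (TextBlock block : blocks) {
            xs.add(block.x);
            ys.add(block.y);
        }
        List<Double> columns = bands(xs, tolerance);
        List<Double> rows = bands(ys, tolerance);

        String[][] grid = new String[rows.size()][columns.size()];
        for (TextBlock block : blocks) {
            int r = bandIndex(rows, block.y);
            int c = bandIndex(columns, block.x);
            grid[r][c] = (grid[r][c] == null) ? block.text : grid[r][c] + " " + block.text;
        }
        return grid;
    }

    public static void main(String[] args) {
        List<TextBlock> blocks = Arrays.asList(
                new TextBlock("Name", 10, 100),
                new TextBlock("Age", 80, 101),
                new TextBlock("Alice", 11, 120),
                new TextBlock("30", 79, 120.5));
        System.out.println(Arrays.deepToString(toGrid(blocks, 5.0)));
    }
}
```

With the sample blocks in `main` and a tolerance of 5, this prints `[[Name, Age], [Alice, 30]]`: slightly misaligned blocks still fall into the same row or column. Choosing the tolerance is the main design decision here, which is presumably why the converter exposes column and row width as options.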
Using the Apache POI library, you can create Excel sheets.
HSSFSheet excelSheet = PdfConverter.createExcelSheet("sheetName", workbook, excelPage);
You can also display all the Excel sheets that were created.
PdfConverter.displayXCLPages(excelPages);
Download Details:
Author: eadgyo
Official Github: https://github.com/eadgyo/Extract-PDF-Excel
License: GPL-3.0 license