Package com.acumenvelocity.ath.common
Class PdfUtil
java.lang.Object
    com.acumenvelocity.ath.common.PdfUtil

public final class PdfUtil extends Object
Method Summary
Modifier and Type, Method, Description

static InputStream convertDocxToPdf(InputStream docxInputStream, net.sf.okapi.common.LocaleId locale)
    Convert DOCX to PDF using Adobe PDF Services

static InputStream convertPdfToDocx(InputStream pdfInputStream, net.sf.okapi.common.LocaleId locale, OcrMode ocrMode)
    Convert PDF to DOCX using Adobe PDF Services

static com.adobe.pdfservices.operation.pdfjobs.params.createpdf.word.DocumentLanguage getDocumentLanguage(net.sf.okapi.common.LocaleId locale)
    Converts an Okapi LocaleId to Adobe DocumentLanguage.

static com.adobe.pdfservices.operation.PDFServices getPdfServices()
    Get the PDF Services instance for the writer to use

static void init()

static boolean needsOcr(File pdfFile)
    Determines whether the given PDF file likely needs OCR (i.e., contains no selectable text).

static com.adobe.pdfservices.operation.pdfjobs.params.exportpdf.ExportOCRLocale toAdobeLocale(net.sf.okapi.common.LocaleId locale)
    Converts an Okapi LocaleId to Adobe ExportOCRLocale.
Method Detail
init
public static void init()

needsOcr
public static boolean needsOcr(File pdfFile)
Determines whether the given PDF file likely needs OCR (i.e., contains no selectable text). Works with PDFBox 3.x (uses Loader.loadPDF()).
Parameters:
    pdfFile - the local PDF file
Returns:
    true if the PDF appears image-only (no selectable text), false otherwise

toAdobeLocale
public static com.adobe.pdfservices.operation.pdfjobs.params.exportpdf.ExportOCRLocale toAdobeLocale(net.sf.okapi.common.LocaleId locale)
Converts an Okapi LocaleId to Adobe ExportOCRLocale. Falls back to EN_US with a warning if the locale is not supported.
Parameters:
    locale - the Okapi LocaleId to convert
Returns:
    the corresponding ExportOCRLocale, or EN_US as fallback

getDocumentLanguage
public static com.adobe.pdfservices.operation.pdfjobs.params.createpdf.word.DocumentLanguage getDocumentLanguage(net.sf.okapi.common.LocaleId locale)
Converts an Okapi LocaleId to Adobe DocumentLanguage. Falls back to EN_US with a warning if the locale is not supported.
Parameters:
    locale - the Okapi LocaleId to convert
Returns:
    the corresponding DocumentLanguage, or EN_US as fallback
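
The fallback behaviour described for toAdobeLocale and getDocumentLanguage can be illustrated with a minimal sketch. It is not the PdfUtil implementation: the enum TargetLocale, the class LocaleFallbackSketch, the method toTargetLocale, and the use of plain language-tag strings in place of Okapi LocaleId are all hypothetical stand-ins chosen so the example runs with the JDK alone.

```java
import java.util.Locale;
import java.util.Map;
import java.util.logging.Logger;

public class LocaleFallbackSketch {

    // Hypothetical stand-in for an SDK enum such as ExportOCRLocale or DocumentLanguage.
    enum TargetLocale { EN_US, FR_FR, DE_DE }

    private static final Logger LOG =
            Logger.getLogger(LocaleFallbackSketch.class.getName());

    // Illustrative lookup table; the set of supported locales here is an assumption.
    private static final Map<String, TargetLocale> SUPPORTED = Map.of(
            "en-us", TargetLocale.EN_US,
            "fr-fr", TargetLocale.FR_FR,
            "de-de", TargetLocale.DE_DE);

    // Returns the matching enum constant, or EN_US (with a logged warning) when there is none.
    static TargetLocale toTargetLocale(String languageTag) {
        if (languageTag != null) {
            String key = languageTag.trim().toLowerCase(Locale.ROOT).replace('_', '-');
            TargetLocale match = SUPPORTED.get(key);
            if (match != null) {
                return match;
            }
        }
        LOG.warning("Unsupported locale '" + languageTag + "', falling back to EN_US");
        return TargetLocale.EN_US;
    }

    public static void main(String[] args) {
        System.out.println(toTargetLocale("fr-FR")); // prints FR_FR
        System.out.println(toTargetLocale("xx-YY")); // logs a warning, prints EN_US
    }
}
```

Returning a default instead of throwing keeps a conversion running for unexpected locales, at the cost of possibly using the wrong language; the logged warning is what makes that substitution visible.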

convertPdfToDocx
public static InputStream convertPdfToDocx(InputStream pdfInputStream, net.sf.okapi.common.LocaleId locale, OcrMode ocrMode) throws Exception
Convert PDF to DOCX using Adobe PDF Services
Throws:
    Exception

convertDocxToPdf
public static InputStream convertDocxToPdf(InputStream docxInputStream, net.sf.okapi.common.LocaleId locale) throws Exception
Convert DOCX to PDF using Adobe PDF Services
Throws:
    Exception
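
Both conversion methods take an InputStream, return an InputStream, and declare a checked Exception, so a caller has to manage two streams and one broad exception type. The following is a sketch of one way to do that, not documented usage. The interface StreamConverter, the class ConversionCallSketch, and the method convertToFile are hypothetical, and the in-memory converter in main only stands in for a real conversion. The page does not state who closes the returned stream; the sketch assumes the caller does.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ConversionCallSketch {

    // Hypothetical stand-in with the same shape as the conversion methods:
    // stream in, stream out, checked Exception.
    @FunctionalInterface
    interface StreamConverter {
        InputStream convert(InputStream source) throws Exception;
    }

    // Runs the converter and copies its result to the target file.
    // Both streams are closed by try-with-resources (assumption: the caller owns them).
    static long convertToFile(InputStream source, StreamConverter converter, Path target)
            throws Exception {
        try (InputStream in = source; InputStream out = converter.convert(in)) {
            return Files.copy(out, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws Exception {
        // Fake converter that echoes its input; a real call would go to the conversion service.
        StreamConverter fake = src -> new ByteArrayInputStream(src.readAllBytes());
        Path target = Files.createTempFile("converted", ".bin");
        byte[] input = "demo".getBytes(StandardCharsets.UTF_8);
        long written = convertToFile(new ByteArrayInputStream(input), fake, target);
        System.out.println(written + " bytes written"); // prints "4 bytes written"
        Files.delete(target);
    }
}
```

With PdfUtil on the classpath and initialised, the converter argument could plausibly be a lambda such as in -> PdfUtil.convertDocxToPdf(in, locale), which matches the signature shown above.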

getPdfServices
public static com.adobe.pdfservices.operation.PDFServices getPdfServices()
Get the PDF Services instance for the writer to use