public class DocumentOcrTemplate extends Object
Constructor and Description |
---|
DocumentOcrTemplate(ImageAnnotatorClient imageAnnotatorClient,
Storage storage,
Executor executor,
int jsonOutputBatchSize) |
Modifier and Type | Method and Description |
---|---|
DocumentOcrResultSet |
readOcrOutputFile(GoogleStorageLocation jsonFile)
Parses a single JSON output file and returns the list of pages stored in the file.
|
DocumentOcrResultSet |
readOcrOutputFileSet(GoogleStorageLocation jsonOutputFilePathPrefix)
Parses the OCR output files that have the specified
jsonOutputFilePathPrefix. |
org.springframework.util.concurrent.ListenableFuture<DocumentOcrResultSet> |
runOcrForDocument(GoogleStorageLocation document,
GoogleStorageLocation outputFilePathPrefix)
Runs OCR processing for a specified
document and generates OCR output files under the
path specified by outputFilePathPrefix. |
public DocumentOcrTemplate(ImageAnnotatorClient imageAnnotatorClient, Storage storage, Executor executor, int jsonOutputBatchSize)
public org.springframework.util.concurrent.ListenableFuture<DocumentOcrResultSet> runOcrForDocument(GoogleStorageLocation document, GoogleStorageLocation outputFilePathPrefix)
Runs OCR processing for a specified document and generates OCR output files under the
path specified by outputFilePathPrefix.
For example, if you specify an outputFilePathPrefix of
"gs://bucket_name/ocr_results/myDoc_", all the output files of OCR processing will be saved
under that prefix.
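The prefix behaves as a plain string prefix on object names rather than as a folder. A minimal sketch of that matching rule follows, using only the JDK. The class PrefixMatchSketch, the method underPrefix, and the object names passed to it are hypothetical and do not reflect the actual naming of the OCR output files.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: models Cloud Storage prefix matching as a string-prefix test
// on object names. Nothing here calls the real Storage client.
class PrefixMatchSketch {

    // Returns the object names that start with the given prefix, sorted.
    static List<String> underPrefix(List<String> objectNames, String prefix) {
        return objectNames.stream()
                .filter(name -> name.startsWith(prefix))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

Under this rule a prefix of "ocr_results/myDoc_" matches a hypothetical object "ocr_results/myDoc_a.json" but not "ocr_results/myDocument_a.json", while the shorter prefix "ocr_results/myDoc" matches both. This matters when reading results back, because readOcrOutputFileSet assumes every file under the prefix belongs to the same document.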
Note: OCR processing operations may take several minutes to complete, so it may not be
advisable to block on the completion of the operation. One may use the returned ListenableFuture
to register callbacks or track the status of the operation.
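The non-blocking pattern described in the note can be sketched with the JDK alone. In the sketch below, CompletableFuture stands in for the returned ListenableFuture (Spring's interface offers addCallback for the same purpose), and startOcr is a hypothetical placeholder for runOcrForDocument, not the real call. The document path used in main is also made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch only: CompletableFuture is a stand-in for the ListenableFuture
// returned by runOcrForDocument; startOcr is a hypothetical placeholder.
class OcrCallbackSketch {

    // Simulates kicking off a long-running operation; returns immediately.
    static CompletableFuture<String> startOcr(String documentUri, String outputPrefix) {
        if (documentUri == null || !documentUri.startsWith("gs://")) {
            return CompletableFuture.failedFuture(
                    new IllegalArgumentException("not a gs:// URI: " + documentUri));
        }
        return CompletableFuture.supplyAsync(() -> outputPrefix);
    }

    // Registers a completion callback instead of blocking the caller.
    // The returned future completes once the callback has recorded an event.
    static CompletableFuture<Void> track(CompletableFuture<String> operation,
                                         List<String> events) {
        return operation.handle((prefix, error) -> {
            events.add(error == null
                    ? "done: " + prefix
                    : "failed: " + error.getMessage());
            return null;
        });
    }

    public static void main(String[] args) {
        List<String> events = new CopyOnWriteArrayList<>();
        CompletableFuture<Void> tracked = track(
                startOcr("gs://bucket_name/doc.pdf", "gs://bucket_name/ocr_results/myDoc_"),
                events);
        // Other work could run here; join() is used only so the demo can print.
        tracked.join();
        System.out.println(events);
    }
}
```

The caller keeps control while the operation runs and reacts to success or failure in one place. With the real template, the success branch is where one might call readOcrOutputFileSet on the same output prefix.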
Parameters:
document - The GoogleStorageLocation of the document to run OCR processing on
outputFilePathPrefix - The GoogleStorageLocation of a file, folder, or bucket describing the path under which all output files are saved
Returns:
A ListenableFuture allowing you to register callbacks or wait for the completion of the operation.
public DocumentOcrResultSet readOcrOutputFileSet(GoogleStorageLocation jsonOutputFilePathPrefix)
Parses the OCR output files that have the specified jsonOutputFilePathPrefix. This method
assumes that all of the OCR output files with the prefix are part of the same document.
Parameters:
jsonOutputFilePathPrefix - the folder location containing all of the JSON files of OCR output
Returns:
A DocumentOcrResultSet describing the OCR content of a document
public DocumentOcrResultSet readOcrOutputFile(GoogleStorageLocation jsonFile)
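Parses a single JSON output file and returns the list of pages stored in the file.
Each page of the document is represented as a TextAnnotation which contains the
parsed OCR data.
Parameters:
jsonFile - the location of the JSON output file
Returns:
A DocumentOcrResultSet whose pages are TextAnnotation objects containing the OCR results
Throws:
RuntimeException - if the JSON file cannot be deserialized into a TextAnnotation object
Copyright © 2022. All rights reserved.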