Class DocumentOcrTemplate

java.lang.Object
com.google.cloud.spring.vision.DocumentOcrTemplate

public class DocumentOcrTemplate extends Object
Template providing convenient operations for interfacing with Google Cloud Vision's Document OCR feature, which runs OCR algorithms on documents (PDF or TIFF format) stored in Google Cloud Storage.
  • Constructor Details

    • DocumentOcrTemplate

      public DocumentOcrTemplate(com.google.cloud.vision.v1.ImageAnnotatorClient imageAnnotatorClient, com.google.cloud.storage.Storage storage, Executor executor, int jsonOutputBatchSize)
  • Method Details

    • runOcrForDocument

      public CompletableFuture<DocumentOcrResultSet> runOcrForDocument(GoogleStorageLocation document, GoogleStorageLocation outputFilePathPrefix)
      Runs OCR processing for a specified document and generates OCR output files under the path specified by outputFilePathPrefix.

      For example, if you specify an outputFilePathPrefix of "gs://bucket_name/ocr_results/myDoc_", all output files of OCR processing are saved under that prefix, such as:

      • gs://bucket_name/ocr_results/myDoc_output-1-to-5.json
      • gs://bucket_name/ocr_results/myDoc_output-6-to-10.json
      • gs://bucket_name/ocr_results/myDoc_output-11-to-15.json
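
      The naming pattern in the list above can be reproduced with a short sketch. It assumes, based only on the three example names, that each file covers a contiguous page range whose width equals the batch size (5 in the example, presumably the constructor's jsonOutputBatchSize) and that the last range is clamped to the page count. The class OcrOutputNames and its method are hypothetical and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of spring-cloud-gcp: it only mirrors the
// naming pattern visible in the example file names above.
class OcrOutputNames {

  // Returns the file names expected for a document with pageCount pages when
  // results are written in batches of batchSize pages (assumed behavior).
  static List<String> expectedOutputFiles(String prefix, int pageCount, int batchSize) {
    if (pageCount < 0 || batchSize <= 0) {
      throw new IllegalArgumentException("pageCount must be >= 0 and batchSize must be > 0");
    }
    List<String> names = new ArrayList<>();
    for (int first = 1; first <= pageCount; first += batchSize) {
      // Assumption: the final batch ends at the last page of the document.
      int last = Math.min(first + batchSize - 1, pageCount);
      names.add(prefix + "output-" + first + "-to-" + last + ".json");
    }
    return names;
  }

  public static void main(String[] args) {
    expectedOutputFiles("gs://bucket_name/ocr_results/myDoc_", 15, 5)
        .forEach(System.out::println);
  }
}
```

      For a 15-page document and a batch size of 5, the sketch yields the three names listed above.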

      Note: OCR processing can take several minutes to complete, so blocking on the operation is usually not advisable. Use the returned CompletableFuture to register callbacks or to track the status of the operation.
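
      The non-blocking pattern described in the note might look like the following sketch. To stay runnable without Google Cloud credentials, it replaces the real call with a hypothetical stand-in, fakeRunOcr, that completes with a page count rather than a DocumentOcrResultSet; only the CompletableFuture methods (supplyAsync, thenApply, exceptionally) are standard JDK API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class OcrCallbackSketch {

  // Hypothetical stand-in for DocumentOcrTemplate.runOcrForDocument. It
  // completes asynchronously with a page count instead of a
  // DocumentOcrResultSet so the example has no external dependencies.
  static CompletableFuture<Integer> fakeRunOcr(String documentUri, String outputPrefix) {
    return CompletableFuture.supplyAsync(() -> 15);
  }

  // Registers a success callback and a failure callback; the calling thread
  // is never blocked by this method.
  static CompletableFuture<String> describeWhenDone(CompletableFuture<Integer> ocrFuture) {
    return ocrFuture
        .thenApply(pages -> "OCR finished: " + pages + " pages")
        .exceptionally(error -> {
          // Failures from an upstream stage arrive wrapped in a CompletionException.
          Throwable cause = error.getCause() != null ? error.getCause() : error;
          return "OCR failed: " + cause.getMessage();
        });
  }

  public static void main(String[] args) throws Exception {
    CompletableFuture<String> summary = describeWhenDone(
        fakeRunOcr("gs://bucket_name/myDoc.pdf", "gs://bucket_name/ocr_results/myDoc_"));
    // Bounded wait used only so the demo prints before the JVM exits.
    System.out.println(summary.get(10, TimeUnit.SECONDS));
  }
}
```

      With the real template, the same thenApply and exceptionally calls would be chained onto the CompletableFuture<DocumentOcrResultSet> returned by runOcrForDocument.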

      Parameters:
      document - The GoogleStorageLocation of the document on which to run OCR processing
      outputFilePathPrefix - The GoogleStorageLocation of a file, folder, or bucket giving the path prefix under which all output files are saved
      Returns:
      A CompletableFuture allowing you to register callbacks or wait for the completion of the operation.
    • readOcrOutputFileSet

      public DocumentOcrResultSet readOcrOutputFileSet(GoogleStorageLocation jsonOutputFilePathPrefix)
      Parses the OCR output files that have the specified jsonOutputFilePathPrefix. This method assumes that all of the OCR output files with the prefix are part of the same document.
      Parameters:
      jsonOutputFilePathPrefix - the folder location containing all of the JSON files of OCR output
      Returns:
      A DocumentOcrResultSet describing the OCR content of a document
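
      Because every file under the prefix is treated as part of one document, page order matters when the files are combined. The sketch below is not the library's implementation; it is a hypothetical illustration, assuming the output-N-to-M.json naming shown for runOcrForDocument, of why files should be ordered by page number rather than by name: a plain string sort places "output-11-to-15" before "output-6-to-10".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of spring-cloud-gcp.
class OcrOutputOrdering {

  // Matches the assumed "output-<first>-to-<last>.json" suffix of an output file name.
  private static final Pattern PAGE_RANGE = Pattern.compile("output-(\\d+)-to-(\\d+)\\.json$");

  // Extracts the first page number covered by an output file.
  static int firstPage(String name) {
    Matcher matcher = PAGE_RANGE.matcher(name);
    if (!matcher.find()) {
      throw new IllegalArgumentException("Not an OCR output file name: " + name);
    }
    return Integer.parseInt(matcher.group(1));
  }

  // Returns a new list ordered by first page number; the input is not modified.
  static List<String> sortByFirstPage(List<String> names) {
    List<String> sorted = new ArrayList<>(names);
    sorted.sort(Comparator.comparingInt(OcrOutputOrdering::firstPage));
    return sorted;
  }

  public static void main(String[] args) {
    List<String> unordered = List.of(
        "gs://bucket_name/ocr_results/myDoc_output-11-to-15.json",
        "gs://bucket_name/ocr_results/myDoc_output-1-to-5.json",
        "gs://bucket_name/ocr_results/myDoc_output-6-to-10.json");
    sortByFirstPage(unordered).forEach(System.out::println);
  }
}
```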
    • readOcrOutputFile

      public DocumentOcrResultSet readOcrOutputFile(GoogleStorageLocation jsonFile)
      Parses a single JSON output file and returns a DocumentOcrResultSet containing the pages stored in the file.

      Each page of the document is represented as a TextAnnotation which contains the parsed OCR data.

      Parameters:
      jsonFile - the location of the JSON output file
      Returns:
      a DocumentOcrResultSet whose pages hold the OCR results as TextAnnotation objects
      Throws:
      RuntimeException - if the JSON file cannot be deserialized into a TextAnnotation object