Class: Google::Cloud::DocumentAI::V1beta3::OcrConfig
- Inherits:
- Object
- Extended by:
- Protobuf::MessageExts::ClassMethods
- Includes:
- Protobuf::MessageExts
- Defined in:
- proto_docs/google/cloud/documentai/v1beta3/document_io.rb
Overview
Config for Document OCR.
Defined Under Namespace
Classes: Hints, PremiumFeatures
Instance Attribute Summary
-
#advanced_ocr_options ⇒ ::Array<::String>
A list of advanced OCR options to further fine-tune OCR behavior.
-
#compute_style_info ⇒ ::Boolean
Deprecated. This field is deprecated and may be removed in the next major version update.
-
#disable_character_boxes_detection ⇒ ::Boolean
Turn off character box detector in OCR engine.
-
#enable_image_quality_scores ⇒ ::Boolean
Enables intelligent document quality scores after OCR.
-
#enable_native_pdf_parsing ⇒ ::Boolean
Enables special handling for PDFs with existing text information.
-
#enable_symbol ⇒ ::Boolean
Includes symbol level OCR information if set to true.
-
#hints ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::Hints
Hints for the OCR model.
-
#premium_features ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::PremiumFeatures
Configurations for premium OCR features.
Instance Attribute Details
#advanced_ocr_options ⇒ ::Array<::String>
Returns a list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:
- legacy_layout: a heuristic layout detection algorithm that serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the layout algorithm best suited to their situation.
# File 'proto_docs/google/cloud/documentai/v1beta3/document_io.rb', line 164

class OcrConfig
  include ::Google::Protobuf::MessageExts
  extend ::Google::Protobuf::MessageExts::ClassMethods

  # Hints for OCR Engine
  # @!attribute [rw] language_hints
  #   @return [::Array<::String>]
  #     List of BCP-47 language codes to use for OCR. In most cases, not
  #     specifying it yields the best results since it enables automatic language
  #     detection. For languages based on the Latin alphabet, setting hints is
  #     not needed. In rare cases, when the language of the text in the
  #     image is known, setting a hint will help get better results (although it
  #     will be a significant hindrance if the hint is wrong).
  class Hints
    include ::Google::Protobuf::MessageExts
    extend ::Google::Protobuf::MessageExts::ClassMethods
  end

  # Configurations for premium OCR features.
  # @!attribute [rw] enable_selection_mark_detection
  #   @return [::Boolean]
  #     Turn on selection mark detector in OCR engine. Only available in OCR 2.0
  #     (and later) processors.
  # @!attribute [rw] compute_style_info
  #   @return [::Boolean]
  #     Turn on font identification model and return font style information.
  # @!attribute [rw] enable_math_ocr
  #   @return [::Boolean]
  #     Turn on the model that can extract LaTeX math formulas.
  class PremiumFeatures
    include ::Google::Protobuf::MessageExts
    extend ::Google::Protobuf::MessageExts::ClassMethods
  end
end
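As an illustration of the advanced_ocr_options attribute, the following is a minimal sketch that collects the option list in a plain Ruby Hash keyed by the attribute name. The helper `build_ocr_config_attrs` and the constant `VALID_ADVANCED_OCR_OPTIONS` are hypothetical names introduced here, not part of the library. The sketch assumes the resulting Hash would later be passed to the generated message class, which is not exercised in this example.

```ruby
# Hypothetical constant: the only value this page lists as currently valid.
VALID_ADVANCED_OCR_OPTIONS = ["legacy_layout"].freeze

# Hypothetical helper (not part of the gem): builds a plain Hash whose key
# mirrors the advanced_ocr_options attribute and rejects unlisted values.
def build_ocr_config_attrs(advanced_ocr_options: [])
  unknown = advanced_ocr_options - VALID_ADVANCED_OCR_OPTIONS
  unless unknown.empty?
    raise ArgumentError, "unknown advanced OCR options: #{unknown.join(', ')}"
  end

  { advanced_ocr_options: advanced_ocr_options.dup }
end

attrs = build_ocr_config_attrs(advanced_ocr_options: ["legacy_layout"])

# Assumption: with the gem loaded, the Hash could be expanded into the
# message constructor, for example:
#   ::Google::Cloud::DocumentAI::V1beta3::OcrConfig.new(**attrs)
```

Validating on the client side is a design choice for this sketch only; the list of valid values may change between API versions.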
#compute_style_info ⇒ ::Boolean
This field is deprecated and may be removed in the next major version update.
Returns whether to turn on the font identification model and return font style information. Deprecated; use PremiumFeatures.compute_style_info instead.
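The deprecation note names PremiumFeatures.compute_style_info as the replacement for the top-level field. A minimal sketch of that migration follows, assuming the configuration is held as a plain Ruby Hash keyed by the attribute names on this page. `migrate_compute_style_info` is a hypothetical helper, not part of the library.

```ruby
# Hypothetical helper (not part of the gem): moves a top-level
# :compute_style_info flag under :premium_features and leaves every
# other key untouched. The input Hash is not modified.
def migrate_compute_style_info(attrs)
  migrated = attrs.dup
  return migrated unless migrated.key?(:compute_style_info)

  flag = migrated.delete(:compute_style_info)
  premium = (migrated[:premium_features] || {}).dup
  # Keep an explicit premium_features value if one was already set.
  premium[:compute_style_info] = flag unless premium.key?(:compute_style_info)
  migrated[:premium_features] = premium
  migrated
end

legacy = { enable_symbol: true, compute_style_info: true }
current = migrate_compute_style_info(legacy)
```

Here `current` carries the flag only under `:premium_features`, while `legacy` keeps its original shape.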
#disable_character_boxes_detection ⇒ ::Boolean
Returns whether to turn off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.
#enable_image_quality_scores ⇒ ::Boolean
Returns whether to enable intelligent document quality scores after OCR. The scores can help diagnose why OCR responses are of poor quality for a given input. Enabling them adds latency comparable to regular OCR to the process call.
#enable_native_pdf_parsing ⇒ ::Boolean
Returns whether to enable special handling for PDFs with existing text information. This results in better text extraction quality for such PDF inputs.
#enable_symbol ⇒ ::Boolean
Returns whether symbol-level OCR information is included. Set to true to include it.
#hints ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::Hints
Returns hints for the OCR model.
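The Hints class documents a language_hints attribute holding BCP-47 language codes, and notes that leaving it unset usually gives the best results because automatic language detection stays active. The sketch below follows that guidance, assuming the configuration is held as a plain Ruby Hash. `build_hints_attrs` and `with_hints` are hypothetical helpers, and the sketch does not validate that the codes are well-formed BCP-47 tags.

```ruby
# Hypothetical helper (not part of the gem): returns a Hash for the hints
# attribute, or nil when no language is known so that automatic language
# detection stays in effect.
def build_hints_attrs(known_languages)
  codes = Array(known_languages).map { |code| code.to_s.strip }
  codes = codes.reject(&:empty?).uniq
  return nil if codes.empty?

  { language_hints: codes }
end

# Hypothetical helper: adds :hints only when there is something to hint.
def with_hints(attrs, known_languages)
  hints = build_hints_attrs(known_languages)
  hints ? attrs.merge(hints: hints) : attrs.dup
end

without_hint = with_hints({ enable_symbol: true }, nil)
with_hint = with_hints({ enable_symbol: true }, ["ja"])
```

Omitting the key entirely, rather than sending an empty list, keeps the intent explicit in this sketch: no hint means automatic detection.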
#premium_features ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::PremiumFeatures
Returns configurations for premium OCR features.
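The PremiumFeatures class documents three Boolean attributes: enable_selection_mark_detection, compute_style_info, and enable_math_ocr. A minimal sketch of assembling them as a plain Ruby Hash follows. `PREMIUM_FEATURE_KEYS` and `build_premium_features_attrs` are hypothetical names introduced for illustration, and the sketch does not check the processor version that selection mark detection requires.

```ruby
# Hypothetical constant: the attribute names documented for PremiumFeatures.
PREMIUM_FEATURE_KEYS = %i[
  enable_selection_mark_detection
  compute_style_info
  enable_math_ocr
].freeze

# Hypothetical helper (not part of the gem): accepts only the documented
# flags and keeps those that are switched on.
def build_premium_features_attrs(**flags)
  unknown = flags.keys - PREMIUM_FEATURE_KEYS
  unless unknown.empty?
    raise ArgumentError, "unknown premium features: #{unknown.join(', ')}"
  end

  flags.select { |_key, value| value == true }
end

premium = build_premium_features_attrs(
  enable_selection_mark_detection: true,
  enable_math_ocr: false
)
ocr_config_attrs = { premium_features: premium }
```

Dropping flags that are not set to true keeps the Hash small; whether to send explicit false values instead is a judgment call this sketch leaves open.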