v1

google.cloud.videointelligence.v1

Members

(static, constant) Feature :number

Video annotation feature.

Properties:
Name Type Description
FEATURE_UNSPECIFIED number

Unspecified.

LABEL_DETECTION number

Label detection. Detect objects, such as dog or flower.

SHOT_CHANGE_DETECTION number

Shot change detection.

EXPLICIT_CONTENT_DETECTION number

Explicit content detection.

FACE_DETECTION number

Human face detection and tracking.

SPEECH_TRANSCRIPTION number

Speech transcription.

TEXT_DETECTION number

OCR text detection and tracking.

OBJECT_TRACKING number

Object detection and tracking.

(static, constant) LabelDetectionMode :number

Label detection mode.

Properties:
Name Type Description
LABEL_DETECTION_MODE_UNSPECIFIED number

Unspecified.

SHOT_MODE number

Detect shot-level labels.

FRAME_MODE number

Detect frame-level labels.

SHOT_AND_FRAME_MODE number

Detect both shot-level and frame-level labels.

(static, constant) Likelihood :number

Bucketized representation of likelihood.

Properties:
Name Type Description
LIKELIHOOD_UNSPECIFIED number

Unspecified likelihood.

VERY_UNLIKELY number

Very unlikely.

UNLIKELY number

Unlikely.

POSSIBLE number

Possible.

LIKELY number

Likely.

VERY_LIKELY number

Very likely.

Type Definitions

AnnotateVideoProgress

Video annotation progress. Included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

Properties:
Name Type Description
annotationProgress Array.<Object>

Progress metadata for all videos specified in AnnotateVideoRequest.

This object should have the same structure as VideoAnnotationProgress

AnnotateVideoRequest

Video annotation request.

Properties:
Name Type Description
inputUri string

Input video location. Currently, only Google Cloud Storage URIs are supported, which must be specified in the following format: gs://bucket-id/object-id (other URI formats return google.rpc.Code.INVALID_ARGUMENT). For more information, see Request URIs. A video URI may include wildcards in object-id, and thus identify multiple videos. Supported wildcards: '*' to match 0 or more characters; '?' to match 1 character. If unset, the input video should be embedded in the request as input_content. If set, input_content should be unset.

inputContent Buffer

The video data bytes. If unset, the input video(s) should be specified via input_uri. If set, input_uri should be unset.

features Array.<number>

Requested video annotation features.

The number should be among the values of Feature

videoContext Object

Additional video context and/or feature-specific parameters.

This object should have the same structure as VideoContext

outputUri string

Optional location where the output (in JSON format) should be stored. Currently, only Google Cloud Storage URIs are supported, which must be specified in the following format: gs://bucket-id/object-id (other URI formats return google.rpc.Code.INVALID_ARGUMENT). For more information, see Request URIs.

locationId string

Optional cloud region where annotation should take place. Supported cloud regions: us-east1, us-west1, europe-west1, asia-east1. If no region is specified, a region will be determined based on video file location.
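
A minimal sketch of issuing this request with the Node.js client. The bucket URI is illustrative, and annotateVideo returns a long-running operation whose promise resolves to the AnnotateVideoResponse:

```js
// Sketch only: the gs:// URI below is a placeholder, not a real resource.
const videoIntelligence = require('@google-cloud/video-intelligence');

async function annotate() {
  const client = new videoIntelligence.v1.VideoIntelligenceServiceClient();

  const request = {
    inputUri: 'gs://my-bucket/video.mp4', // or inputContent: <Buffer>
    features: ['LABEL_DETECTION', 'SHOT_CHANGE_DETECTION'],
    locationId: 'us-east1', // optional
  };

  // annotateVideo starts a long-running operation.
  const [operation] = await client.annotateVideo(request);
  // Waiting on the operation yields the AnnotateVideoResponse.
  const [response] = await operation.promise();
  return response;
}
```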

AnnotateVideoResponse

Video annotation response. Included in the response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.

Properties:
Name Type Description
annotationResults Array.<Object>

Annotation results for all videos specified in AnnotateVideoRequest.

This object should have the same structure as VideoAnnotationResults

Entity

Detected entity from video analysis.

Properties:
Name Type Description
entityId string

Opaque entity ID. Some IDs may be available in the Google Knowledge Graph Search API.

description string

Textual description, e.g. Fixed-gear bicycle.

languageCode string

Language code for description in BCP-47 format.

ExplicitContentAnnotation

Explicit content annotation (based on per-frame visual signals only). If no explicit content has been detected in a frame, no annotations are present for that frame.

Properties:
Name Type Description
frames Array.<Object>

All video frames where explicit content was detected.

This object should have the same structure as ExplicitContentFrame

ExplicitContentDetectionConfig

Config for EXPLICIT_CONTENT_DETECTION.

Properties:
Name Type Description
model string

Model to use for explicit content detection. Supported values: "builtin/stable" (the default if unset) and "builtin/latest".

ExplicitContentFrame

Video frame level annotation results for explicit content.

Properties:
Name Type Description
timeOffset Object

Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

This object should have the same structure as Duration

pornographyLikelihood number

Likelihood of pornographic content.

The number should be among the values of Likelihood
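
For example, the returned frames can be filtered by likelihood. A sketch, assuming response is an AnnotateVideoResponse from a prior EXPLICIT_CONTENT_DETECTION request; depending on the client version the enum may surface as a number or as its name, so both forms are checked:

```js
// Sketch: keep only frames rated LIKELY (4) or VERY_LIKELY (5).
const FLAGGED = new Set(['LIKELY', 'VERY_LIKELY', 4, 5]);
const frames = response.annotationResults[0].explicitAnnotation.frames;
for (const frame of frames.filter(f => FLAGGED.has(f.pornographyLikelihood))) {
  const t = frame.timeOffset || {};
  // Duration carries seconds plus nanos; print as seconds.milliseconds.
  console.log(`${t.seconds || 0}.${Math.round((t.nanos || 0) / 1e6)}s flagged`);
}
```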

FaceAnnotation

Face annotation.

Properties:
Name Type Description
thumbnail Buffer

Thumbnail of a representative face view (in JPEG format).

segments Array.<Object>

All video segments where a face was detected.

This object should have the same structure as FaceSegment

frames Array.<Object>

All video frames where a face was detected.

This object should have the same structure as FaceFrame

FaceDetectionConfig

Config for FACE_DETECTION.

Properties:
Name Type Description
model string

Model to use for face detection. Supported values: "builtin/stable" (the default if unset) and "builtin/latest".

includeBoundingBoxes boolean

Whether bounding boxes should be included in the face annotation output.

FaceFrame

Video frame level annotation results for face detection.

Properties:
Name Type Description
normalizedBoundingBoxes Array.<Object>

Normalized bounding boxes in a frame. There can be more than one box if the same face is detected in multiple locations within the current frame.

This object should have the same structure as NormalizedBoundingBox

timeOffset Object

Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

This object should have the same structure as Duration

FaceSegment

Video segment level annotation results for face detection.

Properties:
Name Type Description
segment Object

Video segment where a face was detected.

This object should have the same structure as VideoSegment

LabelAnnotation

Label annotation.

Properties:
Name Type Description
entity Object

Detected entity.

This object should have the same structure as Entity

categoryEntities Array.<Object>

Common categories for the detected entity. For example, when the label is Terrier, the category is likely dog. In some cases there may be more than one category; for example, Terrier could also be a pet.

This object should have the same structure as Entity

segments Array.<Object>

All video segments where a label was detected.

This object should have the same structure as LabelSegment

frames Array.<Object>

All video frames where a label was detected.

This object should have the same structure as LabelFrame

LabelDetectionConfig

Config for LABEL_DETECTION.

Properties:
Name Type Description
labelDetectionMode number

What labels should be detected with LABEL_DETECTION, in addition to video-level labels or segment-level labels. If unspecified, defaults to SHOT_MODE.

The number should be among the values of LabelDetectionMode

stationaryCamera boolean

Whether the video has been shot from a stationary (i.e., non-moving) camera. When set to true, this might improve detection accuracy for moving objects. Should be used with SHOT_AND_FRAME_MODE enabled.

model string

Model to use for label detection. Supported values: "builtin/stable" (the default if unset) and "builtin/latest".

frameConfidenceThreshold number

The confidence threshold used to filter labels from frame-level detection. If not set, it defaults to 0.4. The valid range for this threshold is [0.1, 0.9]; any value outside this range will be clipped. Note: for best results, use the default threshold; the default may be updated whenever a new model is released.

videoConfidenceThreshold number

The confidence threshold used to filter labels from video-level and shot-level detections. If not set, it defaults to 0.3. The valid range for this threshold is [0.1, 0.9]; any value outside this range will be clipped. Note: for best results, use the default threshold; the default may be updated whenever a new model is released.
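
A sketch of passing this config through videoContext. The URI is illustrative, and the threshold values are examples rather than recommendations:

```js
// Sketch: label detection with feature-specific parameters.
const request = {
  inputUri: 'gs://my-bucket/video.mp4', // placeholder URI
  features: ['LABEL_DETECTION'],
  videoContext: {
    labelDetectionConfig: {
      labelDetectionMode: 'SHOT_AND_FRAME_MODE',
      stationaryCamera: true, // only meaningful with SHOT_AND_FRAME_MODE
      frameConfidenceThreshold: 0.5, // values outside [0.1, 0.9] are clipped
      videoConfidenceThreshold: 0.4,
    },
  },
};
```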

LabelFrame

Video frame level annotation results for label detection.

Properties:
Name Type Description
timeOffset Object

Time-offset, relative to the beginning of the video, corresponding to the video frame for this location.

This object should have the same structure as Duration

confidence number

Confidence that the label is accurate. Range: [0, 1].

LabelSegment

Video segment level annotation results for label detection.

Properties:
Name Type Description
segment Object

Video segment where a label was detected.

This object should have the same structure as VideoSegment

confidence number

Confidence that the label is accurate. Range: [0, 1].

NormalizedBoundingBox

Normalized bounding box. The normalized vertex coordinates are relative to the original image. Range: [0, 1].

Properties:
Name Type Description
left number

Left X coordinate.

top number

Top Y coordinate.

right number

Right X coordinate.

bottom number

Bottom Y coordinate.
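
Since the coordinates are normalized, mapping a box back to pixels only requires the frame dimensions. A hypothetical helper (toPixels is not part of the library):

```js
// Sketch: scale a NormalizedBoundingBox to pixel coordinates.
function toPixels(box, frameWidth, frameHeight) {
  return {
    left: Math.round(box.left * frameWidth),
    top: Math.round(box.top * frameHeight),
    right: Math.round(box.right * frameWidth),
    bottom: Math.round(box.bottom * frameHeight),
  };
}

// e.g. toPixels({left: 0.1, top: 0.2, right: 0.6, bottom: 0.9}, 1920, 1080)
// -> {left: 192, top: 216, right: 1152, bottom: 972}
```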

NormalizedBoundingPoly

Normalized bounding polygon for text (that might not be aligned with axes). Contains the list of corner points in clockwise order starting from the top-left corner. For example, for a rectangular bounding box, when the text is horizontal it might look like:

0----1
|    |
3----2

When it is rotated 180 degrees clockwise around the top-left corner it becomes:

2----3
|    |
1----0

and the vertex order will still be (0, 1, 2, 3). Note that values can be less than 0 or greater than 1 due to trigonometric calculations for the location of the box.

Properties:
Name Type Description
vertices Array.<Object>

Normalized vertices of the bounding polygon.

This object should have the same structure as NormalizedVertex

NormalizedVertex

A vertex represents a 2D point in the image. NOTE: the normalized vertex coordinates are relative to the original image and range from 0 to 1.

Properties:
Name Type Description
x number

X coordinate.

y number

Y coordinate.

ObjectTrackingAnnotation

Annotations corresponding to one tracked object.

Properties:
Name Type Description
segment Object

Non-streaming batch mode ONLY. Each object track corresponds to one video segment where it appears.

This object should have the same structure as VideoSegment

trackId number

Streaming mode ONLY. In streaming mode, we do not know the end time of a tracked object before it is completed. Hence, there is no VideoSegment info returned. Instead, we provide a unique, identifiable integer track_id so that customers can correlate the results of the ongoing ObjectTrackingAnnotation with the same track_id over time.

entity Object

Entity to specify the object category that this track is labeled as.

This object should have the same structure as Entity

confidence number

Object category's labeling confidence of this track.

frames Array.<Object>

Information corresponding to all frames where this object track appears. Non-streaming batch mode: frames may contain one or more ObjectTrackingFrame messages. Streaming mode: frames can contain only one ObjectTrackingFrame message.

This object should have the same structure as ObjectTrackingFrame
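
A sketch of walking these annotations in non-streaming batch mode, assuming response came from an OBJECT_TRACKING request:

```js
// Sketch: print each tracked object, its segment, and per-frame boxes.
const results = response.annotationResults[0];
for (const obj of results.objectAnnotations || []) {
  console.log(`${obj.entity.description} (confidence ${obj.confidence.toFixed(2)})`);
  const seg = obj.segment; // batch mode only; streaming uses trackId instead
  console.log(`  segment starts at ${seg.startTimeOffset.seconds || 0}s`);
  for (const frame of obj.frames) {
    const b = frame.normalizedBoundingBox;
    console.log(`  box: [${b.left}, ${b.top}, ${b.right}, ${b.bottom}]`);
  }
}
```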

ObjectTrackingConfig

Config for OBJECT_TRACKING.

Properties:
Name Type Description
model string

Model to use for object tracking. Supported values: "builtin/stable" (the default if unset) and "builtin/latest".

ObjectTrackingFrame

Video frame level annotations for object detection and tracking. This field stores per-frame location, time offset, and confidence.

Properties:
Name Type Description
normalizedBoundingBox Object

The normalized bounding box location of this object track for the frame.

This object should have the same structure as NormalizedBoundingBox

timeOffset Object

The timestamp of the frame in microseconds.

This object should have the same structure as Duration

ShotChangeDetectionConfig

Config for SHOT_CHANGE_DETECTION.

Properties:
Name Type Description
model string

Model to use for shot change detection. Supported values: "builtin/stable" (the default if unset) and "builtin/latest".

SpeechContext

Provides "hints" to the speech recognizer to favor specific words and phrases in the results.

Properties:
Name Type Description
phrases Array.<string>

Optional. A list of strings containing word and phrase "hints" so that the speech recognizer is more likely to recognize them. This can be used to improve the accuracy for specific words and phrases, for example, if specific commands are typically spoken by the user. This can also be used to add additional words to the vocabulary of the recognizer. See usage limits.

SpeechRecognitionAlternative

Alternative hypotheses (a.k.a. n-best list).

Properties:
Name Type Description
transcript string

Transcript text representing the words that the user spoke.

confidence number

Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating confidence was not set.

words Array.<Object>

Output only. A list of word-specific information for each recognized word. Note: When enable_speaker_diarization is true, you will see all the words from the beginning of the audio.

This object should have the same structure as WordInfo

SpeechTranscription

A speech recognition result corresponding to a portion of the audio.

Properties:
Name Type Description
alternatives Array.<Object>

May contain one or more recognition hypotheses (up to the maximum specified in max_alternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.

This object should have the same structure as SpeechRecognitionAlternative

languageCode string

Output only. The BCP-47 language tag of the language in this result. This language code was detected as the most likely language spoken in the audio.

SpeechTranscriptionConfig

Config for SPEECH_TRANSCRIPTION.

Properties:
Name Type Description
languageCode string

Required. The language of the supplied audio as a BCP-47 language tag. Example: "en-US". See Language Support for a list of the currently supported language codes.

maxAlternatives number

Optional. Maximum number of recognition hypotheses to be returned. Specifically, the maximum number of SpeechRecognitionAlternative messages within each SpeechTranscription. The server may return fewer than max_alternatives. Valid values are 0-30. A value of 0 or 1 will return a maximum of one. If omitted, a maximum of one is returned.

filterProfanity boolean

Optional. If set to true, the server will attempt to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g. "f***". If set to false or omitted, profanities won't be filtered out.

speechContexts Array.<Object>

Optional. A means to provide context to assist the speech recognition.

This object should have the same structure as SpeechContext

enableAutomaticPunctuation boolean

Optional. If 'true', adds punctuation to recognition result hypotheses. This feature is only available in select languages. Setting this for requests in other languages has no effect at all. The default 'false' value does not add punctuation to result hypotheses. NOTE: "This is currently offered as an experimental service, complimentary to all users. In the future this may be exclusively available as a premium feature."

audioTracks Array.<number>

Optional. For file formats such as MXF or MKV that support multiple audio tracks, specify up to two tracks. Default: track 0.

enableSpeakerDiarization boolean

Optional. If 'true', enables speaker detection for each recognized word in the top alternative of the recognition result, using a speaker_tag provided in the WordInfo. Note: when this is true, we send all the words from the beginning of the audio for the top alternative in every consecutive response. This is done to improve speaker tags as our models learn to identify the speakers in the conversation over time.

diarizationSpeakerCount number

Optional. If set, specifies the estimated number of speakers in the conversation. If not set, defaults to '2'. Ignored unless enable_speaker_diarization is set to true.

enableWordConfidence boolean

Optional. If true, the top result includes a list of words and the confidence for those words. If false, no word-level confidence information is returned. The default is false.
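
A sketch of a SPEECH_TRANSCRIPTION request exercising several of these options (the URI and phrase hints are illustrative):

```js
// Sketch: speech transcription with diarization and word-level confidence.
const request = {
  inputUri: 'gs://my-bucket/video.mp4', // placeholder URI
  features: ['SPEECH_TRANSCRIPTION'],
  videoContext: {
    speechTranscriptionConfig: {
      languageCode: 'en-US', // required
      maxAlternatives: 2,
      enableAutomaticPunctuation: true,
      enableSpeakerDiarization: true,
      diarizationSpeakerCount: 2,
      enableWordConfidence: true,
      speechContexts: [{phrases: ['Video Intelligence']}], // illustrative hints
    },
  },
};
```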

TextAnnotation

Annotations related to one detected OCR text snippet. This will contain the corresponding text, confidence value, and frame level information for each detection.

Properties:
Name Type Description
text string

The detected text.

segments Array.<Object>

All video segments where OCR detected text appears.

This object should have the same structure as TextSegment

TextDetectionConfig

Config for TEXT_DETECTION.

Properties:
Name Type Description
languageHints Array.<string>

A language hint can be specified if the language to be detected is known a priori; it can increase the accuracy of the detection. The language hint must be a language code in BCP-47 format.

Automatic language detection is performed if no hint is provided.

model string

Model to use for text detection. Supported values: "builtin/stable" (the default if unset) and "builtin/latest".

TextFrame

Video frame level annotation results for text annotation (OCR). Contains information regarding timestamp and bounding box locations for the frames containing detected OCR text snippets.

Properties:
Name Type Description
rotatedBoundingBox Object

Bounding polygon of the detected text for this frame.

This object should have the same structure as NormalizedBoundingPoly

timeOffset Object

Timestamp of this frame.

This object should have the same structure as Duration

TextSegment

Video segment level annotation results for text detection.

Properties:
Name Type Description
segment Object

Video segment where a text snippet was detected.

This object should have the same structure as VideoSegment

confidence number

Confidence for the track of detected text. It is calculated as the highest confidence over all frames where OCR-detected text appears.

frames Array.<Object>

Information related to the frames where OCR detected text appears.

This object should have the same structure as TextFrame
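
A sketch of reading these results, assuming response came from a TEXT_DETECTION request:

```js
// Sketch: print each detected snippet with its segment confidence.
for (const text of response.annotationResults[0].textAnnotations || []) {
  console.log(`"${text.text}"`);
  for (const seg of text.segments) {
    console.log(`  confidence ${seg.confidence.toFixed(2)}, ${seg.frames.length} frame(s)`);
  }
}
```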

VideoAnnotationProgress

Annotation progress for a single video.

Properties:
Name Type Description
inputUri string

Video file location in Google Cloud Storage.

progressPercent number

Approximate percentage processed thus far. Guaranteed to be 100 when fully processed.

startTime Object

Time when the request was received.

This object should have the same structure as Timestamp

updateTime Object

Time of the most recent update.

This object should have the same structure as Timestamp

feature number

Specifies which feature is being tracked if the request contains more than one feature.

The number should be among the values of Feature

segment Object

Specifies which segment is being tracked if the request contains more than one segment.

This object should have the same structure as VideoSegment

VideoAnnotationResults

Annotation results for a single video.

Properties:
Name Type Description
inputUri string

Video file location in Google Cloud Storage.

segment Object

Video segment on which the annotation is run.

This object should have the same structure as VideoSegment

segmentLabelAnnotations Array.<Object>

Topical label annotations at the video level or user-specified segment level. There is exactly one element for each unique label.

This object should have the same structure as LabelAnnotation

segmentPresenceLabelAnnotations Array.<Object>

Presence label annotations at the video level or user-specified segment level. There is exactly one element for each unique label. This will eventually be publicly exposed and the restriction will be removed.

This object should have the same structure as LabelAnnotation

shotLabelAnnotations Array.<Object>

Topical label annotations at the shot level. There is exactly one element for each unique label.

This object should have the same structure as LabelAnnotation

shotPresenceLabelAnnotations Array.<Object>

Presence label annotations at the shot level. There is exactly one element for each unique label. This will eventually be publicly exposed and the restriction will be removed.

This object should have the same structure as LabelAnnotation

frameLabelAnnotations Array.<Object>

Label annotations at the frame level. There is exactly one element for each unique label.

This object should have the same structure as LabelAnnotation

faceAnnotations Array.<Object>

Face annotations. There is exactly one element for each unique face.

This object should have the same structure as FaceAnnotation

shotAnnotations Array.<Object>

Shot annotations. Each shot is represented as a video segment.

This object should have the same structure as VideoSegment

explicitAnnotation Object

Explicit content annotation.

This object should have the same structure as ExplicitContentAnnotation

speechTranscriptions Array.<Object>

Speech transcription.

This object should have the same structure as SpeechTranscription

textAnnotations Array.<Object>

OCR text detection and tracking. Annotations for the list of detected text snippets; each has a list of frame information associated with it.

This object should have the same structure as TextAnnotation

objectAnnotations Array.<Object>

Annotations for the list of objects detected and tracked in the video.

This object should have the same structure as ObjectTrackingAnnotation

error Object

If set, indicates an error. Note that for a single AnnotateVideoRequest some videos may succeed and some may fail.

This object should have the same structure as Status
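
A sketch of consuming one video's results, assuming response is the AnnotateVideoResponse; per-video errors surface in the error field rather than failing the whole request:

```js
// Sketch: check for a per-video error, then print shot-level labels.
const results = response.annotationResults[0];
if (results.error) {
  console.error(`Annotation of ${results.inputUri} failed:`, results.error.message);
} else {
  for (const label of results.shotLabelAnnotations || []) {
    const categories = (label.categoryEntities || []).map(e => e.description);
    console.log(`${label.entity.description} [${categories.join(', ')}]`);
    for (const seg of label.segments) {
      console.log(`  confidence ${seg.confidence.toFixed(2)}`);
    }
  }
}
```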

VideoContext

Video context and/or feature-specific parameters.

Properties:
Name Type Description
segments Array.<Object>

Video segments to annotate. The segments may overlap and are not required to be contiguous or span the whole video. If unspecified, each video is treated as a single segment.

This object should have the same structure as VideoSegment

labelDetectionConfig Object

Config for LABEL_DETECTION.

This object should have the same structure as LabelDetectionConfig

shotChangeDetectionConfig Object

Config for SHOT_CHANGE_DETECTION.

This object should have the same structure as ShotChangeDetectionConfig

explicitContentDetectionConfig Object

Config for EXPLICIT_CONTENT_DETECTION.

This object should have the same structure as ExplicitContentDetectionConfig

faceDetectionConfig Object

Config for FACE_DETECTION.

This object should have the same structure as FaceDetectionConfig

speechTranscriptionConfig Object

Config for SPEECH_TRANSCRIPTION.

This object should have the same structure as SpeechTranscriptionConfig

textDetectionConfig Object

Config for TEXT_DETECTION.

This object should have the same structure as TextDetectionConfig

objectTrackingConfig Object

Config for OBJECT_TRACKING.

This object should have the same structure as ObjectTrackingConfig

VideoSegment

Video segment.

Properties:
Name Type Description
startTimeOffset Object

Time-offset, relative to the beginning of the video, corresponding to the start of the segment (inclusive).

This object should have the same structure as Duration

endTimeOffset Object

Time-offset, relative to the beginning of the video, corresponding to the end of the segment (inclusive).

This object should have the same structure as Duration
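
The start and end offsets are Duration objects; a small hypothetical helper for turning one into fractional seconds (int64 seconds may be delivered as a string, so it is coerced):

```js
// Sketch: Duration ({seconds, nanos}) -> fractional seconds.
function durationToSeconds(d) {
  if (!d) return 0;
  return Number(d.seconds || 0) + (d.nanos || 0) / 1e9;
}

// Segment length in seconds:
// durationToSeconds(segment.endTimeOffset) - durationToSeconds(segment.startTimeOffset)
```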

WordInfo

Word-specific information for recognized words. Word information is only included in the response when certain request parameters are set, such as enable_word_time_offsets.

Properties:
Name Type Description
startTime Object

Time offset relative to the beginning of the audio, and corresponding to the start of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.

This object should have the same structure as Duration

endTime Object

Time offset relative to the beginning of the audio, and corresponding to the end of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.

This object should have the same structure as Duration

word string

The word corresponding to this set of information.

confidence number

Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative. This field is not guaranteed to be accurate and users should not rely on it to be always provided. The default of 0.0 is a sentinel value indicating confidence was not set.

speakerTag number

Output only. A distinct integer value is assigned for every speaker within the audio. This field specifies which one of those speakers was detected to have spoken this word. Value ranges from 1 up to diarization_speaker_count, and is only set if speaker diarization is enabled.
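
With diarization enabled, the words on the top alternative carry speakerTag values; a sketch of grouping them by speaker, where transcription is assumed to be a SpeechTranscription from the results:

```js
// Sketch: group recognized words by speakerTag.
const words = transcription.alternatives[0].words || [];
const bySpeaker = new Map();
for (const w of words) {
  const tag = w.speakerTag || 0; // 0: no speaker tag assigned
  if (!bySpeaker.has(tag)) bySpeaker.set(tag, []);
  bySpeaker.get(tag).push(w.word);
}
for (const [speaker, spoken] of bySpeaker) {
  console.log(`Speaker ${speaker}: ${spoken.join(' ')}`);
}
```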
