Namespace Google.Apis.Dataflow.v1b3.Data
Classes
ApproximateProgress
Obsolete in favor of ApproximateReportedProgress and ApproximateSplitRequest.
ApproximateReportedProgress
A progress measurement of a WorkItem by a worker.
ApproximateSplitRequest
A suggestion by the service to the worker to dynamically split the WorkItem.
AutoscalingEvent
A structured message reporting an autoscaling decision made by the Dataflow service.
AutoscalingSettings
Settings for WorkerPool autoscaling.
Base2Exponent
Exponential buckets where the growth factor between buckets is 2**(2**-scale), e.g. for scale=1 the growth factor is 2**(2**(-1)) = sqrt(2). The n buckets have the following boundaries: 0th: [0, gf); i in [1, n-1]: [gf^i, gf^(i+1)).
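As an illustration of the formula above, the following standalone C# sketch (not part of the client library; names are illustrative) prints the boundaries for scale=1 and n=4 buckets.

using System;

static class Base2ExponentExample
{
    static void Main()
    {
        int scale = 1, n = 4;
        // Growth factor gf = 2**(2**-scale); sqrt(2) when scale = 1.
        double gf = Math.Pow(2, Math.Pow(2, -scale));
        Console.WriteLine($"bucket 0: [0, {gf:F4})");
        for (int i = 1; i < n; i++)
            Console.WriteLine($"bucket {i}: [{Math.Pow(gf, i):F4}, {Math.Pow(gf, i + 1):F4})");
    }
}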
BigQueryIODetails
Metadata for a BigQuery connector used by the job.
BigTableIODetails
Metadata for a Cloud Bigtable connector used by the job.
BucketOptions
BucketOptions describes the bucket boundaries used in the histogram.
CPUTime
Modeled after information exposed by /proc/stat.
ComponentSource
Description of an interstitial value between transforms in an execution stage.
ComponentTransform
Description of a transform executed as part of an execution stage.
ComputationTopology
All configuration data for a particular Computation.
ConcatPosition
A position that encapsulates an inner position and an index for the inner position. A ConcatPosition can be used by a reader of a source that encapsulates a set of other sources.
ContainerSpec
Container Spec.
CounterMetadata
CounterMetadata includes all static non-name non-value counter attributes.
CounterStructuredName
Identifies a counter within a per-job namespace. Counters whose structured names are the same get merged into a single value for the job.
CounterStructuredNameAndMetadata
A single message which encapsulates structured name and metadata for a given counter.
CounterUpdate
An update to a Counter sent from a worker.
CreateJobFromTemplateRequest
A request to create a Cloud Dataflow job from a template.
CustomSourceLocation
Identifies the location of a custom source.
DataDiskAssignment
Data disk assignment for a given VM instance.
DataSamplingConfig
Configuration options for sampling elements.
DataSamplingReport
Contains per-worker telemetry about the data sampling feature.
DataflowHistogramValue
Summary statistics for a population of values. HistogramValue contains a sequence of buckets and gives a count of values that fall into each bucket. Bucket boundaries are defined by a formula and bucket widths are either fixed or exponentially increasing.
DatastoreIODetails
Metadata for a Datastore connector used by the job.
DebugOptions
Describes any options that have an effect on the debugging of pipelines.
DeleteSnapshotResponse
Response from deleting a snapshot.
DerivedSource
Specification of one of the bundles produced as a result of splitting a Source (e.g. when executing a SourceSplitRequest, or when splitting an active task using WorkItemStatus.dynamic_source_split), relative to the source being split.
Disk
Describes the data disk used by a workflow job.
DisplayData
Data provided with a pipeline or transform to provide descriptive info.
DistributionUpdate
A metric value representing a distribution.
DynamicSourceSplit
When a task splits using WorkItemStatus.dynamic_source_split, this message describes the two parts of the split relative to the description of the current task's input.
Environment
Describes the environment in which a Dataflow Job runs.
ExecutionStageState
A message describing the state of a particular execution stage.
ExecutionStageSummary
Description of the composing transforms, names/ids, and input/outputs of a stage of execution. Some composing transforms and sources may have been generated by the Dataflow service during execution planning.
FailedLocation
Indicates which [regional endpoint] (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) failed to respond to a request for data.
FileIODetails
Metadata for a File connector used by the job.
FlattenInstruction
An instruction that copies its inputs (zero or more) to its (single) output.
FlexTemplateRuntimeEnvironment
The environment values to be set at runtime for a Flex Template.
FloatingPointList
A metric value representing a list of floating point numbers.
FloatingPointMean
A representation of a floating point mean metric contribution.
GPUUsage
Information about the GPU usage on the worker.
GPUUtilization
Utilization details about the GPU.
GetDebugConfigRequest
Request to get updated debug configuration for a component.
GetDebugConfigResponse
Response to a get debug configuration request.
GetTemplateResponse
The response to a GetTemplate request.
Histogram
Histogram of value counts for a distribution. Buckets have an inclusive lower bound and exclusive upper bound and use "1,2,5 bucketing": The first bucket range is from [0,1) and all subsequent bucket boundaries are powers of ten multiplied by 1, 2, or 5. Thus, bucket boundaries are 0, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, ... Negative values are not supported.
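A small standalone C# sketch (illustrative only, not part of the client library) that generates the first few "1,2,5" bucket boundaries described above:

using System;
using System.Collections.Generic;

static class OneTwoFiveBucketsExample
{
    static void Main()
    {
        var boundaries = new List<long> { 0 };
        long power = 1;                              // successive powers of ten
        for (int decade = 0; decade < 4; decade++, power *= 10)
            foreach (long m in new long[] { 1, 2, 5 })
                boundaries.Add(m * power);
        // 0, 1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000
        Console.WriteLine(string.Join(", ", boundaries));
    }
}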
HotKeyDebuggingInfo
Information useful for debugging a hot key detection.
HotKeyDetection
Proto describing a hot key detected on a given WorkItem.
HotKeyInfo
Information about a hot key.
InstructionInput
An input of an instruction, as a reference to an output of a producer instruction.
InstructionOutput
An output of an instruction.
IntegerGauge
A metric value representing temporal values of a variable.
IntegerList
A metric value representing a list of integers.
IntegerMean
A representation of an integer mean metric contribution.
Job
Defines a job to be run by the Cloud Dataflow service. Do not enter confidential information when you supply string values using the API.
JobExecutionDetails
Information about the execution of a job.
JobExecutionInfo
Additional information about how a Cloud Dataflow job will be executed that isn't contained in the submitted job.
JobExecutionStageInfo
Contains information about how a particular google.dataflow.v1beta3.Step will be executed.
JobMessage
A particular message pertaining to a Dataflow job.
JobMetadata
Metadata available primarily for filtering jobs. Will be included in the ListJob response and Job SUMMARY view.
JobMetrics
JobMetrics contains a collection of metrics describing the detailed progress of a Dataflow job. Metrics correspond to user-defined and system-defined metrics in the job. For more information, see [Dataflow job metrics] (https://cloud.google.com/dataflow/docs/guides/using-monitoring-intf). This resource captures only the most recent values of each metric; time-series data can be queried for them (under the same metric names) from Cloud Monitoring.
KeyRangeDataDiskAssignment
Data disk assignment information for a specific key-range of a sharded computation. Currently we only support UTF-8 character splits to simplify encoding into JSON.
KeyRangeLocation
Location information for a specific key-range of a sharded computation. Currently we only support UTF-8 character splits to simplify encoding into JSON.
LaunchFlexTemplateParameter
Launch FlexTemplate Parameter.
LaunchFlexTemplateRequest
A request to launch a Cloud Dataflow job from a FlexTemplate.
LaunchFlexTemplateResponse
Response to the request to launch a job from Flex Template.
LaunchTemplateParameters
Parameters to provide to the template being launched. Note that the [metadata in the pipeline code] (https://cloud.google.com/dataflow/docs/guides/templates/creating-templates#metadata) determines which runtime parameters are valid.
LaunchTemplateResponse
Response to the request to launch a template.
LeaseWorkItemRequest
Request to lease WorkItems.
LeaseWorkItemResponse
Response to a request to lease WorkItems.
Linear
Linear buckets with the following boundaries for indices in 0 to n-1: i in [0, n-1]: [start + i*width, start + (i+1)*width).
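For example, this standalone C# sketch (illustrative names only) prints the boundaries for start=0, width=10, and n=5 buckets:

using System;

static class LinearBucketsExample
{
    static void Main()
    {
        double start = 0, width = 10;
        int n = 5;
        // bucket 0: [0, 10) ... bucket 4: [40, 50)
        for (int i = 0; i < n; i++)
            Console.WriteLine($"bucket {i}: [{start + i * width}, {start + (i + 1) * width})");
    }
}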
ListJobMessagesResponse
Response to a request to list job messages.
ListJobsResponse
Response to a request to list Cloud Dataflow jobs in a project. This might be a partial response, depending on the page size in the ListJobsRequest. However, if the project does not have any jobs, an instance of ListJobsResponse is not returned and the request's response body is empty {}.
ListSnapshotsResponse
List of snapshots.
MapTask
MapTask consists of an ordered set of instructions, each of which describes one particular low-level operation for the worker to perform in order to accomplish the MapTask's WorkItem. Each instruction must appear in the list before any instructions that depend on its output.
MemInfo
Information about the memory usage of a worker or a container within a worker.
MetricShortId
The metric short id is returned to the user alongside an offset into ReportWorkItemStatusRequest.
MetricStructuredName
Identifies a metric, by describing the source which generated the metric.
MetricUpdate
Describes the state of a metric.
MetricValue
The value of a metric along with its name and labels.
MountedDataDisk
Describes a mounted data disk.
MultiOutputInfo
Information about an output of a multi-output DoFn.
NameAndKind
Basic metadata about a counter.
OutlierStats
Statistics for the underflow and overflow bucket.
Package
The packages that must be installed in order for a worker to run the steps of the Cloud Dataflow job that will be assigned to its worker pool. This is the mechanism by which the Cloud Dataflow SDK causes code to be loaded onto the workers. For example, the Cloud Dataflow Java SDK might use this to install jars containing the user's code and all of the various dependencies (libraries, data files, etc.) required in order for that code to run.
ParDoInstruction
An instruction that does a ParDo operation. Takes one main input and zero or more side inputs, and produces zero or more outputs. Runs user code.
ParallelInstruction
Describes a particular operation comprising a MapTask.
Parameter
Structured data associated with this message.
ParameterMetadata
Metadata for a specific parameter.
ParameterMetadataEnumOption
ParameterMetadataEnumOption specifies the option shown in the enum form.
PartialGroupByKeyInstruction
An instruction that does a partial group-by-key. One input and one output.
PerStepNamespaceMetrics
Metrics for a particular unfused step and namespace. A metric is uniquely identified by the metrics_namespace, original_step, metric name, and metric_labels.
PerWorkerMetrics
Per worker metrics.
PipelineDescription
A descriptive representation of the submitted pipeline as well as the executed form. This data is provided by the Dataflow service for ease of visualizing the pipeline and interpreting Dataflow-provided metrics.
Point
A point in the timeseries.
Position
Position defines a position within a collection of data. The value can be either the end position, a key (used with ordered collections), a byte offset, or a record index.
ProgressTimeseries
Information about the progress of some component of job execution.
PubSubIODetails
Metadata for a Pub/Sub connector used by the job.
PubsubLocation
Identifies a pubsub location to use for transferring data into or out of a streaming Dataflow job.
PubsubSnapshotMetadata
Represents a Pubsub snapshot.
ReadInstruction
An instruction that reads records. Takes no inputs, produces one output.
ReportWorkItemStatusRequest
Request to report the status of WorkItems.
ReportWorkItemStatusResponse
Response from a request to report the status of WorkItems.
ReportedParallelism
Represents the level of parallelism in a WorkItem's input, reported by the worker.
ResourceUtilizationReport
Worker metrics exported from workers. This contains resource utilization metrics accumulated from a variety of sources. For more information, see go/df-resource-signals.
ResourceUtilizationReportResponse
Service-side response to WorkerMessage reporting resource utilization.
RuntimeEnvironment
The environment values to set at runtime.
RuntimeMetadata
RuntimeMetadata describing a runtime environment.
RuntimeUpdatableParams
Additional job parameters that can only be updated during runtime using the projects.jobs.update method. These fields have no effect when specified during job creation.
SDKInfo
SDK Information.
SdkBug
A bug found in the Dataflow SDK.
SdkHarnessContainerImage
Defines an SDK harness container for executing Dataflow pipelines.
SdkVersion
The version of the SDK used to run the job.
SendDebugCaptureRequest
Request to send encoded debug information.
SendDebugCaptureResponse
Response to a send capture request.
SendWorkerMessagesRequest
A request for sending worker messages to the service.
SendWorkerMessagesResponse
The response to the worker messages.
SeqMapTask
Describes a particular function to invoke.
SeqMapTaskOutputInfo
Information about an output of a SeqMapTask.
ServiceResources
Resources used by the Dataflow Service to run the job.
ShellTask
A task which consists of a shell command for the worker to execute.
SideInputInfo
Information about a side input of a DoFn or an input of a SeqDoFn.
Sink
A sink to which records can be encoded and written.
Snapshot
Represents a snapshot of a job.
SnapshotJobRequest
Request to create a snapshot of a job.
Source
A source from which records can be read and decoded.
SourceFork
DEPRECATED in favor of DynamicSourceSplit.
SourceGetMetadataRequest
A request to compute the SourceMetadata of a Source.
SourceGetMetadataResponse
The result of a SourceGetMetadataOperation.
SourceMetadata
Metadata about a Source useful for automatically optimizing and tuning the pipeline, etc.
SourceOperationRequest
A work item that represents the different operations that can be performed on a user-defined Source specification.
SourceOperationResponse
The result of a SourceOperationRequest, specified in ReportWorkItemStatusRequest.source_operation when the work item is completed.
SourceSplitOptions
Hints for splitting a Source into bundles (parts for parallel processing) using SourceSplitRequest.
SourceSplitRequest
Represents the operation to split a high-level Source specification into bundles (parts for parallel processing). At a high level, splitting of a source into bundles happens as follows: SourceSplitRequest is applied to the source. If it returns SOURCE_SPLIT_OUTCOME_USE_CURRENT, no further splitting happens and the source is used "as is". Otherwise, splitting is applied recursively to each produced DerivedSource. As an optimization, for any Source, if its does_not_need_splitting is true, the framework assumes that splitting this source would return SOURCE_SPLIT_OUTCOME_USE_CURRENT, and doesn't initiate a SourceSplitRequest. This applies both to the initial source being split and to bundles produced from it.
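A minimal C# sketch of the recursive splitting flow described above. The splitFn delegate is a hypothetical stand-in for whatever performs the actual SourceSplitRequest, and the property names (DoesNotNeedSplitting, Outcome, Bundles) assume the usual proto-to-C# field mapping in the generated data classes.

using System;
using System.Collections.Generic;
using Google.Apis.Dataflow.v1b3.Data;

static class SourceSplitSketch
{
    public static IEnumerable<Source> SplitRecursively(
        Source source, Func<Source, SourceSplitResponse> splitFn)
    {
        // Optimization described above: skip the split request entirely.
        if (source.DoesNotNeedSplitting == true)
        {
            yield return source;
            yield break;
        }
        SourceSplitResponse response = splitFn(source);
        if (response.Outcome == "SOURCE_SPLIT_OUTCOME_USE_CURRENT")
        {
            yield return source;                  // use the source "as is"
        }
        else
        {
            // Otherwise, splitting is applied recursively to each DerivedSource.
            foreach (DerivedSource bundle in response.Bundles)
                foreach (Source s in SplitRecursively(bundle.Source, splitFn))
                    yield return s;
        }
    }
}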
SourceSplitResponse
The response to a SourceSplitRequest.
SourceSplitShard
DEPRECATED in favor of DerivedSource.
SpannerIODetails
Metadata for a Spanner connector used by the job.
SplitInt64
A representation of an int64, n, that is immune to precision loss when encoded in JSON.
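The following standalone C# sketch (illustrative only; it does not use the generated SplitInt64 class) shows the underlying idea: the value is carried as signed high bits and unsigned low bits so that it survives JSON encoding without double-precision rounding.

using System;

static class SplitInt64Example
{
    static void Main()
    {
        long n = 9007199254740993;               // 2^53 + 1, not exactly representable as a double
        int highBits = (int)(n >> 32);           // signed upper 32 bits
        uint lowBits = (uint)(n & 0xFFFFFFFFL);  // unsigned lower 32 bits
        long roundTrip = ((long)highBits << 32) | lowBits;
        Console.WriteLine(roundTrip == n);       // True
    }
}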
StageExecutionDetails
Information about the workers and work items within a stage.
StageSource
Description of an input or output of an execution stage.
StageSummary
Information about a particular execution stage of a job.
StateFamilyConfig
State family configuration.
Status
The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC. Each Status message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the API Design Guide.
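A minimal C# sketch of those three pieces of data, assuming the generated Status data class exposes the proto fields as Code, Message, and Details properties (an assumption based on the usual proto-to-C# mapping); the values shown are illustrative.

using System;
using Google.Apis.Dataflow.v1b3.Data;

static class StatusExample
{
    static void Main()
    {
        // Hypothetical error as it might be returned by the service.
        var status = new Status
        {
            Code = 9,                                    // google.rpc.Code.FAILED_PRECONDITION
            Message = "The job is not in a cancellable state."
        };
        Console.WriteLine($"Error {status.Code}: {status.Message}");
    }
}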
Step
Defines a particular step within a Cloud Dataflow job. A job consists of multiple steps, each of which performs some specific operation as part of the overall job. Data is typically passed from one step to another as part of the job. Note: The properties of this object are not stable and might change. Here's an example of a sequence of steps which together implement a Map-Reduce job: * Read a collection of data from some source, parsing the collection's elements. * Validate the elements. * Apply a user-defined function to map each element to some value and extract an element-specific key value. * Group elements with the same key into a single element with that key, transforming a multiply-keyed collection into a uniquely-keyed collection. * Write the elements out to some data sink. Note that the Cloud Dataflow service may be used to run many different types of jobs, not just Map-Reduce.
Straggler
Information for a straggler.
StragglerDebuggingInfo
Information useful for debugging a straggler. Each type will provide specialized debugging information relevant for a particular cause. The StragglerDebuggingInfo maps 1:1 to the StragglerCause enum.
StragglerInfo
Information useful for straggler identification and debugging.
StragglerSummary
Summarized straggler identification details.
StreamLocation
Describes a stream of data, either as input to be processed or as output of a streaming Dataflow job.
StreamingApplianceSnapshotConfig
Streaming appliance snapshot configuration.
StreamingComputationConfig
Configuration information for a single streaming computation.
StreamingComputationRanges
Describes full or partial data disk assignment information of the computation ranges.
StreamingComputationTask
A task which describes what action should be performed for the specified streaming computation ranges.
StreamingConfigTask
A task that carries configuration information for streaming computations.
StreamingOperationalLimits
Operational limits imposed on streaming jobs by the backend.
StreamingScalingReport
Contains per-user worker telemetry used in streaming autoscaling.
StreamingScalingReportResponse
Contains per-user-worker streaming scaling recommendation from the backend.
StreamingSetupTask
A task which initializes part of a streaming Dataflow job.
StreamingSideInputLocation
Identifies the location of a streaming side input.
StreamingStageLocation
Identifies the location of a streaming computation stage, for stage-to-stage communication.
StreamingStragglerInfo
Information useful for streaming straggler identification and debugging.
StringList
A metric value representing a list of strings.
StructuredMessage
A rich message format, including a human readable string, a key for identifying the message, and structured data associated with the message for programmatic consumption.
TaskRunnerSettings
Taskrunner configuration settings.
TemplateMetadata
Metadata describing a template.
TopologyConfig
Global topology of the streaming Dataflow job, including all computations and their sharded locations.
TransformSummary
Description of the type, names/ids, and input/outputs for a transform.
WorkItem
WorkItem represents basic information about a WorkItem to be executed in the cloud.
WorkItemDetails
Information about an individual work item execution.
WorkItemServiceState
The Dataflow service's idea of the current state of a WorkItem being processed by a worker.
WorkItemStatus
Conveys a worker's progress through the work described by a WorkItem.
WorkerDetails
Information about a worker.
WorkerHealthReport
WorkerHealthReport contains information about the health of a worker. The VM should be identified by the labels attached to the WorkerMessage that this health ping belongs to.
WorkerHealthReportResponse
WorkerHealthReportResponse contains information returned to the worker in response to a health ping.
WorkerLifecycleEvent
A report of an event in a worker's lifecycle. The proto contains one event, because the worker is expected to asynchronously send each message immediately after the event. Due to this asynchrony, messages may arrive out of order (or go missing), and it is up to the consumer to interpret. The timestamp of the event is in the enclosing WorkerMessage proto.
WorkerMessage
WorkerMessage provides information to the backend about a worker.
WorkerMessageCode
A message code is used to report status and error messages to the service. The message codes are intended to be machine readable. The service will take care of translating these into user understandable messages if necessary. Example use cases: 1. Worker processes reporting successful startup. 2. Worker processes reporting specific errors (e.g. package staging failure).
WorkerMessageResponse
A worker_message response allows the server to pass information to the sender.
WorkerPool
Describes one particular pool of Cloud Dataflow workers to be instantiated by the Cloud Dataflow service in order to perform the computations required by a job. Note that a workflow job may use multiple pools, in order to match the various computational requirements of the various stages of the job.
WorkerSettings
Provides data to pass through to the worker harness.
WorkerShutdownNotice
Shutdown notification from workers. This is to be sent by the shutdown script of the worker VM so that the backend knows that the VM is being shut down.
WorkerShutdownNoticeResponse
Service-side response to WorkerMessage issuing shutdown notice.
WorkerThreadScalingReport
Contains thread scaling information for a worker.
WorkerThreadScalingReportResponse
Contains the thread scaling recommendation for a worker from the backend.
WriteInstruction
An instruction that writes records. Takes one input, produces no outputs.