Class GoogleCloudAiplatformV1MachineSpec

Specification of a single machine.

Inheritance

object

GoogleCloudAiplatformV1MachineSpec

Implements

IDirectResponseSchema

Inherited Members

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Namespace: Google.Apis.Aiplatform.v1.Data

Assembly: Google.Apis.Aiplatform.v1.dll

Syntax

public class GoogleCloudAiplatformV1MachineSpec : IDirectResponseSchema

Properties

AcceleratorCount

The number of accelerators to attach to the machine. For accelerator-optimized machine types (https://cloud.google.com/compute/docs/accelerator-optimized-machines), accelerator_count can be set from 1 to N on a machine with N GPUs. If accelerator_count is less than or equal to N / 2, Vertex co-schedules replicas of the model onto the same VM to save cost. For example, on machine type a3-highgpu-8g, which has 8 H100 GPUs, accelerator_count can be set from 1 to 8; if it is 1, 2, 3, or 4, Vertex co-schedules 8, 4, 2, or 2 replicas of the model, respectively, onto the same VM. When co-scheduling, the CPU, memory, and storage of the VM are divided among the replicas on that VM. For example, a co-scheduled replica requesting 2 GPUs on an 8-GPU VM can expect to receive 25% of the VM's CPU, memory, and storage. This feature is not compatible with multihost_gpu_node_count: when multihost_gpu_node_count is set, co-scheduling is not enabled.

Declaration

[JsonProperty("acceleratorCount")]
public virtual int? AcceleratorCount { get; set; }

Property Value

Type	Description
int?
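
The co-scheduling arithmetic in the AcceleratorCount description can be sketched as follows. This is an illustration only: the class and method names are hypothetical and not part of Google.Apis.Aiplatform.v1, and the floor(N / accelerator_count) rule is an assumption inferred from the a3-highgpu-8g example (1, 2, 3, or 4 accelerators giving 8, 4, 2, or 2 replicas), not a documented formula.

```csharp
using System;

// Hypothetical helper for illustration; not part of the client library.
public static class CoScheduling
{
    // Estimates how many replicas would share one VM, assuming the service
    // packs floor(gpusPerVm / acceleratorCount) replicas whenever
    // acceleratorCount <= gpusPerVm / 2, and a single replica otherwise.
    public static int ReplicasPerVm(int gpusPerVm, int acceleratorCount)
    {
        if (gpusPerVm < 1)
            throw new ArgumentOutOfRangeException(nameof(gpusPerVm));
        if (acceleratorCount < 1 || acceleratorCount > gpusPerVm)
            throw new ArgumentOutOfRangeException(nameof(acceleratorCount));

        // Co-scheduling applies only when the request is at most half the GPUs.
        bool coScheduled = acceleratorCount * 2 <= gpusPerVm;
        return coScheduled ? gpusPerVm / acceleratorCount : 1;
    }
}
```

Under this assumption, CoScheduling.ReplicasPerVm(8, 2) gives 4 replicas, which is consistent with the 25% resource share described for a 2-GPU replica on an 8-GPU VM.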

AcceleratorType

Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.

Declaration

[JsonProperty("acceleratorType")]
public virtual string AcceleratorType { get; set; }

Property Value

Type	Description
string

ETag

The ETag of the item.

Declaration

public virtual string ETag { get; set; }

Property Value

Type	Description
string

GpuPartitionSize

Optional. Immutable. The NVIDIA GPU partition size. When specified, the requested accelerators are partitioned into smaller GPU partitions. For example, if the request is for 8 NVIDIA A100 GPUs and gpu_partition_size="1g.10gb", the service creates 8 * 7 = 56 partitioned MIG instances. The partition size must be a value supported by the requested accelerator. Refer to Nvidia GPU Partitioning for the available partition sizes. If set, accelerator_count should be set to 1.

Declaration

[JsonProperty("gpuPartitionSize")]
public virtual string GpuPartitionSize { get; set; }

Property Value

Type	Description
string
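
The instance count in the GpuPartitionSize example is a simple product, sketched below. The class and method names are hypothetical, and the number of partitions per GPU is passed in by the caller rather than looked up; the value 7 comes only from the "1g.10gb" on A100 example in the description.

```csharp
using System;

// Hypothetical helper for illustration; not part of the client library.
public static class MigPartitions
{
    // Total MIG instances = number of GPUs * partitions created on each GPU.
    // partitionsPerGpu depends on the accelerator and partition size; this
    // sketch does not validate it against any real NVIDIA profile table.
    public static int TotalInstances(int gpuCount, int partitionsPerGpu)
    {
        if (gpuCount < 1)
            throw new ArgumentOutOfRangeException(nameof(gpuCount));
        if (partitionsPerGpu < 1)
            throw new ArgumentOutOfRangeException(nameof(partitionsPerGpu));

        // checked() turns a silent integer overflow into an exception.
        return checked(gpuCount * partitionsPerGpu);
    }
}
```

With the values from the description, MigPartitions.TotalInstances(8, 7) returns 56.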

MachineType

Immutable. The type of the machine. See the list of machine types supported for prediction and the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.

Declaration

[JsonProperty("machineType")]
public virtual string MachineType { get; set; }

Property Value

Type	Description
string
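
As a minimal sketch of how these properties fit together, a machine spec can be populated with a C# object initializer. It assumes the Google.Apis.Aiplatform.v1 package is referenced; the wrapper class MachineSpecExample is hypothetical, and the accelerator type string "NVIDIA_H100_80GB" is an assumed value chosen to match the a3-highgpu-8g example, so check it against the accelerator types the service actually accepts.

```csharp
using Google.Apis.Aiplatform.v1.Data;

// Hypothetical wrapper for illustration only.
public static class MachineSpecExample
{
    public static GoogleCloudAiplatformV1MachineSpec Build()
    {
        return new GoogleCloudAiplatformV1MachineSpec
        {
            // Required for BatchPredictionJob and WorkerPoolSpec;
            // optional for DeployedModel (default n1-standard-2).
            MachineType = "a3-highgpu-8g",
            // Assumed accelerator type name; not taken from this page.
            AcceleratorType = "NVIDIA_H100_80GB",
            // Nullable int: leave unset (null) to omit the field.
            AcceleratorCount = 2,
        };
    }
}
```

The resulting object is passed wherever the API expects a machine spec; the fields documented as immutable are expected to be set at creation time.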

ReservationAffinity

Optional. Immutable. Configuration controlling how this resource pool consumes reservations.

Declaration

[JsonProperty("reservationAffinity")]
public virtual GoogleCloudAiplatformV1ReservationAffinity ReservationAffinity { get; set; }

Property Value

Type	Description
GoogleCloudAiplatformV1ReservationAffinity

TpuTopology

Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").

Declaration

[JsonProperty("tpuTopology")]
public virtual string TpuTopology { get; set; }

Property Value

Type	Description
string
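
A topology string such as "2x2x1" can be split into its dimensions before it is assigned to TpuTopology, for example to sanity-check it on the client. The sketch below is hypothetical helper code, not part of the client library, and treating the product of the dimensions as the number of TPU chips is an assumption about how such topologies are usually read; the service remains the authority on which values are valid.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper for illustration; not part of the client library.
public static class TpuTopologyParser
{
    // Splits a topology such as "2x2x1" into its integer dimensions.
    // Throws FormatException if any part is not an integer.
    public static int[] ParseDimensions(string topology)
    {
        if (string.IsNullOrWhiteSpace(topology))
            throw new ArgumentException("Topology must not be empty.", nameof(topology));

        return topology
            .Split('x')
            .Select(part => int.Parse(part, CultureInfo.InvariantCulture))
            .ToArray();
    }

    // Product of the dimensions, assumed here to be the chip count.
    public static int ChipCount(string topology)
    {
        return ParseDimensions(topology).Aggregate(1, (product, dim) => product * dim);
    }
}
```

For the example value from the description, TpuTopologyParser.ChipCount("2x2x1") returns 4.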
