Class BigQueryTemplate

java.lang.Object
com.google.cloud.spring.bigquery.core.BigQueryTemplate
All Implemented Interfaces:
BigQueryOperations

public class BigQueryTemplate extends Object implements BigQueryOperations
Helper class which simplifies common operations done in BigQuery.
Since:
1.2
  • Constructor Summary

    Constructors
    Constructor
    Description
    BigQueryTemplate(com.google.cloud.bigquery.BigQuery bigQuery, com.google.cloud.bigquery.storage.v1.BigQueryWriteClient bigQueryWriteClient, Map<String,Object> bqInitSettings, org.springframework.scheduling.TaskScheduler taskScheduler)
    A Full constructor which creates the BigQuery template.
    BigQueryTemplate(com.google.cloud.bigquery.BigQuery bigQuery, String datasetName)
    Deprecated.
    As of release 3.3.1, use BigQueryTemplate(BigQuery,BigQueryWriteClient,Map,TaskScheduler) instead
    BigQueryTemplate(com.google.cloud.bigquery.BigQuery bigQuery, String datasetName, org.springframework.scheduling.TaskScheduler taskScheduler)
    Deprecated.
    As of release 3.3.1, use BigQueryTemplate(BigQuery,BigQueryWriteClient,Map,TaskScheduler) instead
  • Method Summary

    Modifier and Type
    Method
    Description
    com.google.cloud.bigquery.Table
    createTable(String tableName, com.google.cloud.bigquery.Schema schema)
     
    getBigQueryJsonDataWriter(com.google.cloud.bigquery.storage.v1.TableName parentTable)
     
    com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsResponse
    getCommitResponse(com.google.cloud.bigquery.storage.v1.TableName parentTable, BigQueryJsonDataWriter writer)
     
     
    int
     
    getWriteApiResponse(String tableName, InputStream jsonInputStream)
     
    void
    setAutoDetectSchema(boolean autoDetectSchema)
    Sets whether BigQuery should attempt to autodetect the schema of the data when loading data into an empty table for the first time.
    void
    setCreateDisposition(com.google.cloud.bigquery.JobInfo.CreateDisposition createDisposition)
    Sets the JobInfo.CreateDisposition which specifies whether a new table may be created in BigQuery if needed.
    void
    setJobPollInterval(Duration jobPollInterval)
    Sets the Duration amount of time to wait between successive polls on the status of a BigQuery job.
    void
    setWriteDisposition(com.google.cloud.bigquery.JobInfo.WriteDisposition writeDisposition)
    Sets the JobInfo.WriteDisposition which specifies how data should be inserted into BigQuery tables.
    org.springframework.util.concurrent.ListenableFuture<com.google.cloud.bigquery.Job>
    writeDataToTable(String tableName, InputStream inputStream, com.google.cloud.bigquery.FormatOptions dataFormatOptions)
    Writes data to a specified BigQuery table.
    org.springframework.util.concurrent.ListenableFuture<com.google.cloud.bigquery.Job>
    writeDataToTable(String tableName, InputStream inputStream, com.google.cloud.bigquery.FormatOptions dataFormatOptions, com.google.cloud.bigquery.Schema schema)
    Writes data to a specified BigQuery table with a manually-specified table Schema.
    org.springframework.util.concurrent.ListenableFuture<WriteApiResponse>
    writeJsonStream(String tableName, InputStream jsonInputStream)
    This method uses BigQuery Storage Write API to write new line delimited JSON file to the specified table.
    org.springframework.util.concurrent.ListenableFuture<WriteApiResponse>
    writeJsonStream(String tableName, InputStream jsonInputStream, com.google.cloud.bigquery.Schema schema)
    This method uses BigQuery Storage Write API to write new line delimited JSON file to the specified table.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • BigQueryTemplate

      @Deprecated public BigQueryTemplate(com.google.cloud.bigquery.BigQuery bigQuery, String datasetName)
      Deprecated.
      As of release 3.3.1, use BigQueryTemplate(BigQuery,BigQueryWriteClient,Map,TaskScheduler) instead
      Creates the BigQuery template.
      Parameters:
      bigQuery - the underlying client object used to interface with BigQuery
      datasetName - the name of the dataset in which all operations will take place
    • BigQueryTemplate

      @Deprecated public BigQueryTemplate(com.google.cloud.bigquery.BigQuery bigQuery, String datasetName, org.springframework.scheduling.TaskScheduler taskScheduler)
      Deprecated.
      As of release 3.3.1, use BigQueryTemplate(BigQuery,BigQueryWriteClient,Map,TaskScheduler) instead
      Creates the BigQuery template.
      Parameters:
      bigQuery - the underlying client object used to interface with BigQuery
      datasetName - the name of the dataset in which all operations will take place
      taskScheduler - the TaskScheduler used to poll for the status of long-running BigQuery operations
    • BigQueryTemplate

      public BigQueryTemplate(com.google.cloud.bigquery.BigQuery bigQuery, com.google.cloud.bigquery.storage.v1.BigQueryWriteClient bigQueryWriteClient, Map<String,Object> bqInitSettings, org.springframework.scheduling.TaskScheduler taskScheduler)
      A Full constructor which creates the BigQuery template.
      Parameters:
      bigQuery - the underlying client object used to interface with BigQuery
      bigQueryWriteClient - the underlying BigQueryWriteClient reference use to connect with BigQuery Storage Write Client
      bqInitSettings - Properties required for initialisation of this class
      taskScheduler - the TaskScheduler used to poll for the status of long-running BigQuery operations
  • Method Details

    • setAutoDetectSchema

      public void setAutoDetectSchema(boolean autoDetectSchema)
      Sets whether BigQuery should attempt to autodetect the schema of the data when loading data into an empty table for the first time. If set to false, the schema must be defined explicitly for the table before load.
      Parameters:
      autoDetectSchema - whether data schema should be autodetected from the structure of the data. Default is true.
    • setWriteDisposition

      public void setWriteDisposition(com.google.cloud.bigquery.JobInfo.WriteDisposition writeDisposition)
      Sets the JobInfo.WriteDisposition which specifies how data should be inserted into BigQuery tables.
      Parameters:
      writeDisposition - whether to append to or truncate (overwrite) data in the BigQuery table. Default is WriteDisposition.WRITE_APPEND to append data to a table.
    • setCreateDisposition

      public void setCreateDisposition(com.google.cloud.bigquery.JobInfo.CreateDisposition createDisposition)
      Sets the JobInfo.CreateDisposition which specifies whether a new table may be created in BigQuery if needed.
      Parameters:
      createDisposition - whether to never create a new table in the BigQuery table or only if needed.
    • setJobPollInterval

      public void setJobPollInterval(Duration jobPollInterval)
      Sets the Duration amount of time to wait between successive polls on the status of a BigQuery job.
      Parameters:
      jobPollInterval - the Duration poll interval for BigQuery job status polling
    • writeDataToTable

      public org.springframework.util.concurrent.ListenableFuture<com.google.cloud.bigquery.Job> writeDataToTable(String tableName, InputStream inputStream, com.google.cloud.bigquery.FormatOptions dataFormatOptions)
      Description copied from interface: BigQueryOperations
      Writes data to a specified BigQuery table.
      Specified by:
      writeDataToTable in interface BigQueryOperations
      Parameters:
      tableName - name of the table to write to
      inputStream - input stream of the table data to write
      dataFormatOptions - the format of the data to write
      Returns:
      ListenableFuture containing the BigQuery Job indicating completion of operation
    • writeDataToTable

      public org.springframework.util.concurrent.ListenableFuture<com.google.cloud.bigquery.Job> writeDataToTable(String tableName, InputStream inputStream, com.google.cloud.bigquery.FormatOptions dataFormatOptions, com.google.cloud.bigquery.Schema schema)
      Description copied from interface: BigQueryOperations
      Writes data to a specified BigQuery table with a manually-specified table Schema.

      Example:

      
       Schema schema = Schema.of(
          Field.of("CountyId", StandardSQLTypeName.INT64),
          Field.of("State", StandardSQLTypeName.STRING),
          Field.of("County", StandardSQLTypeName.STRING)
       );
      
       ListenableFuture<Job> bigQueryJobFuture =
           bigQueryTemplate.writeDataToTable(
                TABLE_NAME, dataFile.getInputStream(), FormatOptions.csv(), schema);
       
      Specified by:
      writeDataToTable in interface BigQueryOperations
      Parameters:
      tableName - name of the table to write to
      inputStream - input stream of the table data to write
      dataFormatOptions - the format of the data to write
      schema - the schema of the table being loaded
      Returns:
      ListenableFuture containing the BigQuery Job indicating completion of operation
    • writeJsonStream

      public org.springframework.util.concurrent.ListenableFuture<WriteApiResponse> writeJsonStream(String tableName, InputStream jsonInputStream, com.google.cloud.bigquery.Schema schema)
      This method uses BigQuery Storage Write API to write new line delimited JSON file to the specified table. This method creates a table with the specified schema.
      Specified by:
      writeJsonStream in interface BigQueryOperations
      Parameters:
      tableName - name of the table to write to
      jsonInputStream - input stream of the json file to be written
      Returns:
      ListenableFuture containing the WriteApiResponse indicating completion of operation
    • createTable

      public com.google.cloud.bigquery.Table createTable(String tableName, com.google.cloud.bigquery.Schema schema)
    • writeJsonStream

      public org.springframework.util.concurrent.ListenableFuture<WriteApiResponse> writeJsonStream(String tableName, InputStream jsonInputStream)
      This method uses BigQuery Storage Write API to write new line delimited JSON file to the specified table. The Table should already be created as BigQuery Storage Write API doesn't create it automatically.
      Specified by:
      writeJsonStream in interface BigQueryOperations
      Parameters:
      tableName - name of the table to write to
      jsonInputStream - input stream of the json file to be written
      Returns:
      ListenableFuture containing the WriteApiResponse indicating completion of operation
    • getBigQueryJsonDataWriter

      public BigQueryJsonDataWriter getBigQueryJsonDataWriter(com.google.cloud.bigquery.storage.v1.TableName parentTable) throws com.google.protobuf.Descriptors.DescriptorValidationException, IOException, InterruptedException
      Throws:
      com.google.protobuf.Descriptors.DescriptorValidationException
      IOException
      InterruptedException
    • getWriteApiResponse

      public WriteApiResponse getWriteApiResponse(String tableName, InputStream jsonInputStream) throws com.google.protobuf.Descriptors.DescriptorValidationException, IOException, InterruptedException
      Throws:
      com.google.protobuf.Descriptors.DescriptorValidationException
      IOException
      InterruptedException
    • getCommitResponse

      public com.google.cloud.bigquery.storage.v1.BatchCommitWriteStreamsResponse getCommitResponse(com.google.cloud.bigquery.storage.v1.TableName parentTable, BigQueryJsonDataWriter writer)
    • getDatasetName

      public String getDatasetName()
    • getJsonWriterBatchSize

      public int getJsonWriterBatchSize()