BigQuery destination for batch exports

Last updated:

|Edit this page

Batch exports can be used to export data to a BigQuery table.

Setting up BigQuery access

To set up the right permissions for a batch export targeting BigQuery, you need:

  1. A Service Account.
  2. A dataset which has permissions allowing the service account to access it.

Here's how to set these up so that the app has access only to the dataset it needs:

  1. Create a Service Account.

Create service account

  1. Create a key for the Service Account you created in the previous step.
  2. Save the key file as JSON to upload it when configuring a batch export.

Create JSON private key

  1. Create a role which has only the specific permissions the batch export requires:
    • bigquery.datasets.get
    • bigquery.jobs.create
    • bigquery.tables.create
    • bigquery.tables.get
    • bigquery.tables.list
    • bigquery.tables.updateData

Create custom role for batch exports

This step can be skipped if using the built-in roles BigQuery Data Editor and BigQuery Job User in the steps that follow.

  1. Grant the Service Account access to run jobs in your Google Cloud project. This can be done by granting the BigQuery Jobs User role or the role we created in the previous step on your project.

Navigate to IAM and click on Grant Access to arrive at this screen:

Project grant

In the screenshot above, we have used a custom role named Testing PostHog BatchExports with the permissions listed in the previous step.

  1. Create a dataset within a BigQuery project (ours is called BatchExports, but any name will do).
  2. Use the Sharing and Add Principal buttons to grant access to your dataset with your Service Account created in step 1. Next, assign either the BigQuery Data Editor role or your custom role created in step 4 to provide permissions for the dataset access. Read the full instructions on granting access to the dataset in BigQuery if unclear.

Sharing dataset Add principal

In the screenshot below, we grant our Service Account access to the BatchExports data set and assign the Testing PostHog BatchExports role permissions for it.

Dataset grant access

  1. All done! After completing these steps you can create a BigQuery batch export in PostHog and your data will start flowing from PostHog to BigQuery.

Event schema

This is the schema of all the fields that are exported to BigQuery.

FieldTypeDescription
uuidSTRINGThe unique ID of the event within PostHog
eventSTRINGThe name of the event that was sent
propertiesSTRINGA JSON object with all the properties sent along with an event
elementsSTRINGA string of elements surrounding an autocaptured event
setSTRINGA JSON object with any person properties sent with the $set field
set_onceSTRINGA JSON object with any person properties sent with the $set_once field
distinct_idSTRINGThe distinct_id of the user who sent the event
team_idSTRINGThe team_id for the event
ipSTRINGThe IP address that was sent with the event
site_urlSTRINGThe $current_url property of the event. This field has been kept for backwards compatibility and will be deprecated
timestampTIMESTAMPThe timestamp associated with an event
bq_ingested_timestampTIMESTAMPThe timestamp when the event was sent to BigQuery

Creating the batch export

  1. Navigate to the exports page in your PostHog instance (Quick links if you use PostHog Cloud US or PostHog Cloud EU).
  2. Click "Create export workflow".
  3. Select BigQuery as the batch export destination.
  4. Fill in the necessary configuration details.
  5. Finalize the creation by clicking on "Create".
  6. Done! The batch export will schedule its first run on the start of the next period.

BigQuery configuration

Configuring a batch export targeting BigQuery requires the following BigQuery-specific configuration values:

  • Table ID: The ID of the destination BigQuery table. This is not the fully-qualified name of a table, so omit the dataset and project IDs. For example for the fully-qualified table name project-123:dataset:MyExportTable, use only MyExportTable as the table ID.
  • Dataset ID: The ID of the BigQuery dataset which contains the destination table. Only the dataset ID is required, so omit the project ID if present. For example for the dataset project-123:my-dataset, use only my-dataset as the dataset ID.
  • Google Cloud JSON key file: The JSON key file for your BigQuery Service Account to access your instance. Generated on Service Account creation. See here for more information.

Questions?

Was this page useful?

Next article

Segment

Send events to PostHog, via Segment. Segment allows you to easily manage data and integrations with services across your Growth, Product, and Marketing stack. By tracking events and users via Segment’s API and libraries, you can send your product’s data to all of your analytics/marketing platforms, with minimal instrumentation code. They offer support for most platforms, including iOS, Android, JavaScript, Node.js, PHP, and more. Requirements This requires either PostHog Cloud, or a self-hosted…

Read next article