BigQuery Integration

Connect BigQuery with Statisfy

The BigQuery integration lets you import data from your Google BigQuery warehouse into Statisfy. Author SQL queries in the Statisfy settings UI, preview results, and schedule them — sync product usage, account-level custom fields, person-level attributes, or custom objects on whatever cadence you need. BigQuery uses the same self-serve configuration model as the other warehouse integrations. You can find the shared concepts — query types, account resolvers, transformers, schedules, configuration history — in Integration Concepts. This page focuses on what’s specific to BigQuery. Prerequisites:

Admin access to Statisfy
A Google Cloud project with BigQuery enabled
A service account with read access to the datasets and tables you want to sync
A downloaded service account JSON key

What You Can Import

Data Type	Description	Use Case
Product Usage	Time-series usage metrics per account	MAU, API calls, feature usage
Custom Fields	Account- or person-level attributes	Health scores, segments, industry, plan tier
Custom Objects	Domain-specific records	Subscriptions, deployments, contracts
Contact Data	Person-level attribute values	Activity dates, role, status

Set Up the Service Account

In the Google Cloud Console, open the project that hosts your BigQuery data.
Go to IAM & Admin → Service Accounts and create a new service account dedicated to Statisfy.
Grant the service account, at minimum:
- roles/bigquery.dataViewer on the datasets / project you want Statisfy to read
- roles/bigquery.jobUser on the project Statisfy will run jobs in (needed to execute queries)
Create a JSON key for the service account and download it. You’ll upload this to Statisfy in the next step.

Grant the service account only the datasets and tables Statisfy needs. The credential is stored encrypted, but limiting the blast radius is good hygiene — rotate the key periodically.

Steps to Connect

Log in to Statisfy.
Navigate to Integrations → Admin Apps → BigQuery → Connect.
Upload your service account JSON key file.
(Optional) Enter a default project ID — the project Statisfy will run jobs against if a query doesn’t specify its own. Most tenants want this set to the project that contains both the credential and the data.
Click Test Connection to confirm Statisfy can authenticate and run a no-op query.
Once connected, BigQuery will appear as Connected on the Integrations page.

Configure Queries

Open Integrations → BigQuery → Settings to:

Browse datasets and tables the service account can see, with column types and previews
Author queries using BigQuery Standard SQL with autocomplete and a preview pane
Map results to Statisfy accounts (via Statisfy account ID, CRM ID, organization ID, email domain, account name, or any custom field) and to fields, metrics, or custom objects
Apply column transformers (strip_prefix, regex_extract, lookup, etc.) to reshape values before they’re written
Set the cadence — daily, weekly, monthly, or a sub-day interval (HOURLY through TWELVE_HOURS)
Run a one-off sync from the same page

Each saved query is one of three types:

Query Type	Output
`product_usage`	Time-series metric values per account
`custom_field`	Account- or person-level attribute values
`custom_object`	Records of a custom object type

See Query types and Account resolvers for the shared model.

BigQuery-Specific Notes

Project qualification. If your data lives in a different GCP project than the default, fully qualify table names in your query (`other-project.dataset.table`).
Date placeholders. Use {START_DATE} / {END_DATE_SQL} to keep queries incremental. For product_usage snapshots, BigQuery’s DATE() and TIMESTAMP_TRUNC() work well with these placeholders.
Cost control. BigQuery bills per byte scanned. Partition filters (WHERE _PARTITIONDATE BETWEEN ...) and clustered-column filters dramatically reduce cost on large tables — push them into the saved query rather than relying on Statisfy to discard rows.
Streaming buffer. Rows in the streaming buffer (_PARTITIONTIME IS NULL) may not appear in scheduled queries that filter on a partition column. If you depend on near-real-time data, either query unpartitioned tables or filter to allow IS NULL.

Account Matching

Resolver	Description
Statisfy Account ID	Native Statisfy account identifier
CRM Account ID	Salesforce or HubSpot account identifier
Organization ID	External org identifier
Email Domain	Company website domain
Account Name	Company name (fuzzy matching)
Custom Field	Any custom field defined on accounts

Rows that don’t resolve to an existing Statisfy account are skipped. For CRM ID resolvers, make sure your CRM integration has synced the matching accounts first.

For custom_field queries that target people instead of accounts, configure a person resolver (email column, optionally name) — Statisfy will upsert the person record by email.

Sync Behavior

Scheduled Jobs:

Each saved query runs on its own cadence.
The first run after connecting is a bootstrap that covers the last 30 days; subsequent runs cover only the active window for the cadence.
Self-serve configuration takes precedence. If no self-serve config exists for a query type, Statisfy falls back to any built-in mapping set up by support.

Error Handling:

Per-query errors are logged but do not block other queries from running.
Invalid rows within a query are skipped and reported in the job audit log.

Network Access

BigQuery is reachable over the public internet, so no IP allowlisting is required. The service account credential is the only authentication step.

Security

The service account JSON is stored encrypted in Google Cloud Secret Manager.
Queries authored in the Settings page are read-only — Statisfy validates the query against the BigQuery Standard SQL dialect and rejects DDL/DML before save.
Access is scoped to whatever the service account can see — grant dataViewer only on the datasets you want Statisfy to read.

Troubleshooting

Connection test fails

Check the service account:

Verify the JSON key file is the latest version (rotated keys invalidate older copies)
Confirm the service account has bigquery.jobUser on the default project (needed to execute queries)
Confirm the service account has bigquery.dataViewer on the dataset(s) you want to query

Schema or tables not visible in the browser

Confirm the service account has dataViewer (or higher) on the dataset
Information schema views (INFORMATION_SCHEMA.*) are intentionally hidden from the picker

Query validation errors at save time

Statisfy validates queries against BigQuery Standard SQL. Legacy SQL syntax (#legacySQL) is not supported — switch to Standard SQL.
DDL (CREATE, DROP) and DML (INSERT, UPDATE, DELETE, MERGE) are rejected — Statisfy only runs read-only queries.
Check the validation error for the offending function or syntax.

Records not matching to accounts

Verify the resolver column contains values that exist in Statisfy
For CRM ID resolvers, confirm your CRM integration has synced the matching accounts
Try a different resolver (switch from account name to email domain) if matching is unreliable
Add a regex_extract or strip_prefix transformer if the column contains the right ID with extra characters

Cost is higher than expected

BigQuery bills per byte scanned. Use the Query Plan preview to see scan size before saving.
Add partition filters (_PARTITIONDATE, _PARTITIONTIME) and use {START_DATE} / {END_DATE_SQL} to keep scans incremental.
Drop the cadence to DAILY so each run scans a smaller window.

Need Help?

For initial setup, query design, or troubleshooting, contact support@statisfy.com.

Documentation Index

​Connect BigQuery with Statisfy

​What You Can Import

​Set Up the Service Account

​Steps to Connect

​Configure Queries

​BigQuery-Specific Notes

​Account Matching

​Sync Behavior

​Network Access

​Security

​Troubleshooting

​Need Help?

Connect BigQuery with Statisfy

What You Can Import

Set Up the Service Account

Steps to Connect

Configure Queries

BigQuery-Specific Notes

Account Matching

Sync Behavior

Network Access

Security

Troubleshooting

Need Help?