Skip to main content

Documentation Index

Fetch the complete documentation index at: https://help.statisfy.com/llms.txt

Use this file to discover all available pages before exploring further.

Connect BigQuery with Statisfy

The BigQuery integration lets you import data from your Google BigQuery warehouse into Statisfy. Author SQL queries in the Statisfy settings UI, preview results, and schedule them — sync product usage, account-level custom fields, person-level attributes, or custom objects on whatever cadence you need. BigQuery uses the same self-serve configuration model as the other warehouse integrations. You can find the shared concepts — query types, account resolvers, transformers, schedules, configuration history — in Integration Concepts. This page focuses on what’s specific to BigQuery. Prerequisites:
  • Admin access to Statisfy
  • A Google Cloud project with BigQuery enabled
  • A service account with read access to the datasets and tables you want to sync
  • A downloaded service account JSON key

What You Can Import

Data TypeDescriptionUse Case
Product UsageTime-series usage metrics per accountMAU, API calls, feature usage
Custom FieldsAccount- or person-level attributesHealth scores, segments, industry, plan tier
Custom ObjectsDomain-specific recordsSubscriptions, deployments, contracts
Contact DataPerson-level attribute valuesActivity dates, role, status

Set Up the Service Account

  1. In the Google Cloud Console, open the project that hosts your BigQuery data.
  2. Go to IAM & Admin → Service Accounts and create a new service account dedicated to Statisfy.
  3. Grant the service account, at minimum:
    • roles/bigquery.dataViewer on the datasets / project you want Statisfy to read
    • roles/bigquery.jobUser on the project Statisfy will run jobs in (needed to execute queries)
  4. Create a JSON key for the service account and download it. You’ll upload this to Statisfy in the next step.
Grant the service account only the datasets and tables Statisfy needs. The credential is stored encrypted, but limiting the blast radius is good hygiene — rotate the key periodically.

Steps to Connect

  1. Log in to Statisfy.
  2. Navigate to Integrations → Admin Apps → BigQuery → Connect.
  3. Upload your service account JSON key file.
  4. (Optional) Enter a default project ID — the project Statisfy will run jobs against if a query doesn’t specify its own. Most tenants want this set to the project that contains both the credential and the data.
  5. Click Test Connection to confirm Statisfy can authenticate and run a no-op query.
  6. Once connected, BigQuery will appear as Connected on the Integrations page.

Configure Queries

Open Integrations → BigQuery → Settings to:
  • Browse datasets and tables the service account can see, with column types and previews
  • Author queries using BigQuery Standard SQL with autocomplete and a preview pane
  • Map results to Statisfy accounts (via Statisfy account ID, CRM ID, organization ID, email domain, account name, or any custom field) and to fields, metrics, or custom objects
  • Apply column transformers (strip_prefix, regex_extract, lookup, etc.) to reshape values before they’re written
  • Set the cadence — daily, weekly, monthly, or a sub-day interval (HOURLY through TWELVE_HOURS)
  • Run a one-off sync from the same page
Each saved query is one of three types:
Query TypeOutput
product_usageTime-series metric values per account
custom_fieldAccount- or person-level attribute values
custom_objectRecords of a custom object type
See Query types and Account resolvers for the shared model.

BigQuery-Specific Notes

  • Project qualification. If your data lives in a different GCP project than the default, fully qualify table names in your query (`other-project.dataset.table`).
  • Date placeholders. Use {START_DATE} / {END_DATE_SQL} to keep queries incremental. For product_usage snapshots, BigQuery’s DATE() and TIMESTAMP_TRUNC() work well with these placeholders.
  • Cost control. BigQuery bills per byte scanned. Partition filters (WHERE _PARTITIONDATE BETWEEN ...) and clustered-column filters dramatically reduce cost on large tables — push them into the saved query rather than relying on Statisfy to discard rows.
  • Streaming buffer. Rows in the streaming buffer (_PARTITIONTIME IS NULL) may not appear in scheduled queries that filter on a partition column. If you depend on near-real-time data, either query unpartitioned tables or filter to allow IS NULL.

Account Matching

ResolverDescription
Statisfy Account IDNative Statisfy account identifier
CRM Account IDSalesforce or HubSpot account identifier
Organization IDExternal org identifier
Email DomainCompany website domain
Account NameCompany name (fuzzy matching)
Custom FieldAny custom field defined on accounts
Rows that don’t resolve to an existing Statisfy account are skipped. For CRM ID resolvers, make sure your CRM integration has synced the matching accounts first.
For custom_field queries that target people instead of accounts, configure a person resolver (email column, optionally name) — Statisfy will upsert the person record by email.

Sync Behavior

Scheduled Jobs:
  • Each saved query runs on its own cadence.
  • The first run after connecting is a bootstrap that covers the last 30 days; subsequent runs cover only the active window for the cadence.
  • Self-serve configuration takes precedence. If no self-serve config exists for a query type, Statisfy falls back to any built-in mapping set up by support.
Error Handling:
  • Per-query errors are logged but do not block other queries from running.
  • Invalid rows within a query are skipped and reported in the job audit log.

Network Access

BigQuery is reachable over the public internet, so no IP allowlisting is required. The service account credential is the only authentication step.

Security

  • The service account JSON is stored encrypted in Google Cloud Secret Manager.
  • Queries authored in the Settings page are read-only — Statisfy validates the query against the BigQuery Standard SQL dialect and rejects DDL/DML before save.
  • Access is scoped to whatever the service account can see — grant dataViewer only on the datasets you want Statisfy to read.

Troubleshooting

Check the service account:
  • Verify the JSON key file is the latest version (rotated keys invalidate older copies)
  • Confirm the service account has bigquery.jobUser on the default project (needed to execute queries)
  • Confirm the service account has bigquery.dataViewer on the dataset(s) you want to query
  • Confirm the service account has dataViewer (or higher) on the dataset
  • Information schema views (INFORMATION_SCHEMA.*) are intentionally hidden from the picker
  • Statisfy validates queries against BigQuery Standard SQL. Legacy SQL syntax (#legacySQL) is not supported — switch to Standard SQL.
  • DDL (CREATE, DROP) and DML (INSERT, UPDATE, DELETE, MERGE) are rejected — Statisfy only runs read-only queries.
  • Check the validation error for the offending function or syntax.
  • Verify the resolver column contains values that exist in Statisfy
  • For CRM ID resolvers, confirm your CRM integration has synced the matching accounts
  • Try a different resolver (switch from account name to email domain) if matching is unreliable
  • Add a regex_extract or strip_prefix transformer if the column contains the right ID with extra characters
  • BigQuery bills per byte scanned. Use the Query Plan preview to see scan size before saving.
  • Add partition filters (_PARTITIONDATE, _PARTITIONTIME) and use {START_DATE} / {END_DATE_SQL} to keep scans incremental.
  • Drop the cadence to DAILY so each run scans a smaller window.

Need Help?

For initial setup, query design, or troubleshooting, contact support@statisfy.com.