Documentation Index
Fetch the complete documentation index at: https://help.statisfy.com/llms.txt
Use this file to discover all available pages before exploring further.
Connect BigQuery with Statisfy
The BigQuery integration lets you import data from your Google BigQuery warehouse into Statisfy. Author SQL queries in the Statisfy settings UI, preview results, and schedule them — sync product usage, account-level custom fields, person-level attributes, or custom objects on whatever cadence you need. BigQuery uses the same self-serve configuration model as the other warehouse integrations. You can find the shared concepts — query types, account resolvers, transformers, schedules, configuration history — in Integration Concepts. This page focuses on what’s specific to BigQuery. Prerequisites:- Admin access to Statisfy
- A Google Cloud project with BigQuery enabled
- A service account with read access to the datasets and tables you want to sync
- A downloaded service account JSON key
What You Can Import
| Data Type | Description | Use Case |
|---|---|---|
| Product Usage | Time-series usage metrics per account | MAU, API calls, feature usage |
| Custom Fields | Account- or person-level attributes | Health scores, segments, industry, plan tier |
| Custom Objects | Domain-specific records | Subscriptions, deployments, contracts |
| Contact Data | Person-level attribute values | Activity dates, role, status |
Set Up the Service Account
- In the Google Cloud Console, open the project that hosts your BigQuery data.
- Go to IAM & Admin → Service Accounts and create a new service account dedicated to Statisfy.
-
Grant the service account, at minimum:
roles/bigquery.dataVieweron the datasets / project you want Statisfy to readroles/bigquery.jobUseron the project Statisfy will run jobs in (needed to execute queries)
- Create a JSON key for the service account and download it. You’ll upload this to Statisfy in the next step.
Steps to Connect
- Log in to Statisfy.
- Navigate to Integrations → Admin Apps → BigQuery → Connect.
- Upload your service account JSON key file.
- (Optional) Enter a default project ID — the project Statisfy will run jobs against if a query doesn’t specify its own. Most tenants want this set to the project that contains both the credential and the data.
- Click Test Connection to confirm Statisfy can authenticate and run a no-op query.
- Once connected, BigQuery will appear as Connected on the Integrations page.
Configure Queries
Open Integrations → BigQuery → Settings to:- Browse datasets and tables the service account can see, with column types and previews
- Author queries using BigQuery Standard SQL with autocomplete and a preview pane
- Map results to Statisfy accounts (via Statisfy account ID, CRM ID, organization ID, email domain, account name, or any custom field) and to fields, metrics, or custom objects
- Apply column transformers (
strip_prefix,regex_extract,lookup, etc.) to reshape values before they’re written - Set the cadence — daily, weekly, monthly, or a sub-day interval (HOURLY through TWELVE_HOURS)
- Run a one-off sync from the same page
| Query Type | Output |
|---|---|
product_usage | Time-series metric values per account |
custom_field | Account- or person-level attribute values |
custom_object | Records of a custom object type |
BigQuery-Specific Notes
- Project qualification. If your data lives in a different GCP project than the default, fully qualify table names in your query (
`other-project.dataset.table`). - Date placeholders. Use
{START_DATE}/{END_DATE_SQL}to keep queries incremental. Forproduct_usagesnapshots, BigQuery’sDATE()andTIMESTAMP_TRUNC()work well with these placeholders. - Cost control. BigQuery bills per byte scanned. Partition filters (
WHERE _PARTITIONDATE BETWEEN ...) and clustered-column filters dramatically reduce cost on large tables — push them into the saved query rather than relying on Statisfy to discard rows. - Streaming buffer. Rows in the streaming buffer (
_PARTITIONTIME IS NULL) may not appear in scheduled queries that filter on a partition column. If you depend on near-real-time data, either query unpartitioned tables or filter to allowIS NULL.
Account Matching
| Resolver | Description |
|---|---|
| Statisfy Account ID | Native Statisfy account identifier |
| CRM Account ID | Salesforce or HubSpot account identifier |
| Organization ID | External org identifier |
| Email Domain | Company website domain |
| Account Name | Company name (fuzzy matching) |
| Custom Field | Any custom field defined on accounts |
custom_field queries that target people instead of accounts, configure a person resolver (email column, optionally name) — Statisfy will upsert the person record by email.
Sync Behavior
Scheduled Jobs:- Each saved query runs on its own cadence.
- The first run after connecting is a bootstrap that covers the last 30 days; subsequent runs cover only the active window for the cadence.
- Self-serve configuration takes precedence. If no self-serve config exists for a query type, Statisfy falls back to any built-in mapping set up by support.
- Per-query errors are logged but do not block other queries from running.
- Invalid rows within a query are skipped and reported in the job audit log.
Network Access
BigQuery is reachable over the public internet, so no IP allowlisting is required. The service account credential is the only authentication step.Security
- The service account JSON is stored encrypted in Google Cloud Secret Manager.
- Queries authored in the Settings page are read-only — Statisfy validates the query against the BigQuery Standard SQL dialect and rejects DDL/DML before save.
- Access is scoped to whatever the service account can see — grant
dataVieweronly on the datasets you want Statisfy to read.
Troubleshooting
Connection test fails
Connection test fails
Check the service account:
- Verify the JSON key file is the latest version (rotated keys invalidate older copies)
- Confirm the service account has
bigquery.jobUseron the default project (needed to execute queries) - Confirm the service account has
bigquery.dataVieweron the dataset(s) you want to query
Schema or tables not visible in the browser
Schema or tables not visible in the browser
- Confirm the service account has
dataViewer(or higher) on the dataset - Information schema views (
INFORMATION_SCHEMA.*) are intentionally hidden from the picker
Query validation errors at save time
Query validation errors at save time
- Statisfy validates queries against BigQuery Standard SQL. Legacy SQL syntax (
#legacySQL) is not supported — switch to Standard SQL. - DDL (
CREATE,DROP) and DML (INSERT,UPDATE,DELETE,MERGE) are rejected — Statisfy only runs read-only queries. - Check the validation error for the offending function or syntax.
Records not matching to accounts
Records not matching to accounts
- Verify the resolver column contains values that exist in Statisfy
- For CRM ID resolvers, confirm your CRM integration has synced the matching accounts
- Try a different resolver (switch from account name to email domain) if matching is unreliable
- Add a
regex_extractorstrip_prefixtransformer if the column contains the right ID with extra characters
Cost is higher than expected
Cost is higher than expected
- BigQuery bills per byte scanned. Use the Query Plan preview to see scan size before saving.
- Add partition filters (
_PARTITIONDATE,_PARTITIONTIME) and use{START_DATE}/{END_DATE_SQL}to keep scans incremental. - Drop the cadence to
DAILYso each run scans a smaller window.