You've set up BigQuery Export and started querying your GA4 event data — but the numbers don't match what you see in the GA4 interface. Session counts are different. Revenue figures don't line up. Some events appear to be missing entirely. BigQuery discrepancies are one of the most technically complex data quality issues in the GA4 ecosystem. Here's why they happen.
Sampling, Thresholds, and Data Modelling
GA4's standard interface reports use a mix of raw data, sampled data, and modelled data depending on your property tier and the report type, and they can withhold rows when data thresholds apply. BigQuery Export contains only raw, unsampled event data, with no modelling and no thresholds applied. This is why the numbers differ: you're comparing two datasets that have been processed in different ways.
Specifically: GA4's Consent Mode modelling adds estimated conversion data to interface reports that doesn't exist as real events in BigQuery. Conversions attributed to non-consenting users are modelled in the interface but absent from BigQuery. If your property has significant EEA traffic with Consent Mode enabled, this gap can be substantial.
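To get a rough sense of how exposed a property is to this gap, one option is to break exported events down by recorded consent state. The query below is a minimal sketch, not a definitive method. The table path `my-project.analytics_123456789` and the date range are placeholders, and it assumes your export populates the `privacy_info.analytics_storage` field.

```sql
-- Sketch: events and purchases in the export, split by analytics consent state.
-- `my-project.analytics_123456789` and the dates are placeholders.
SELECT
  privacy_info.analytics_storage AS analytics_storage,  -- consent state as recorded in the export; may be NULL
  COUNT(*) AS events,
  COUNTIF(event_name = 'purchase') AS purchase_events
FROM `my-project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240107'
GROUP BY analytics_storage
ORDER BY events DESC;
```

A large share of rows without granted consent suggests the interface is likely leaning on modelling more heavily, so a wider gap against BigQuery is to be expected.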
Export Lag and Timing
GA4's daily BigQuery export typically completes within 24 hours, but the exact timing varies. There's also a distinction between the intraday export (streaming, available within hours but may have duplicates) and the daily export (processed, deduplicated, typically complete by the following day). Comparing interface data from today against BigQuery data will always show discrepancies if the export hasn't completed.
Additionally, GA4 processes late-arriving hits (events that fire with a delay due to offline mode, slow connections, or Measurement Protocol submissions) for up to 72 hours after the event date. BigQuery exports for a given date may be updated retroactively within this window, so a date is only safe to compare once that window has passed.
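Before comparing a date, it can help to check which export tables exist and when they were last written. The sketch below reads the dataset's `__TABLES__` metadata view. The project and dataset names are placeholders, and the 14-row limit is arbitrary.

```sql
-- Sketch: list recent GA4 export tables with row counts and last-modified times.
-- `my-project.analytics_123456789` is a placeholder for your own export dataset.
SELECT
  table_id,                                            -- events_YYYYMMDD or events_intraday_YYYYMMDD
  row_count,
  TIMESTAMP_MILLIS(last_modified_time) AS last_modified
FROM `my-project.analytics_123456789.__TABLES__`
WHERE STARTS_WITH(table_id, 'events_')
ORDER BY table_id DESC
LIMIT 14;
```

If a date only has an `events_intraday_` table, or its daily table was modified recently, treat its numbers as provisional and compare dates older than 72 hours instead.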
Session and Attribution Differences
GA4's interface attributes sessions and conversions using a processed attribution model. BigQuery contains raw event data with session identifiers but without the same attribution processing applied. Reconstructing sessions from BigQuery requires explicit session stitching logic using ga_session_id and ga_session_number event parameters — it doesn't happen automatically.
A common mistake: counting rows in BigQuery where event_name = 'session_start' and expecting the result to match the Sessions metric in GA4. It won't, because GA4's Sessions metric uses additional processing logic that a raw event count doesn't replicate.
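The difference is easy to see by computing both numbers side by side for a single day. This is a minimal sketch with a placeholder table path and date. It approximates sessions as distinct pairs of user_pseudo_id and ga_session_id, which is a common convention rather than a guaranteed match for the interface.

```sql
-- Sketch: raw session_start count vs. distinct (user, session) pairs for one day.
-- `my-project.analytics_123456789` and the date are placeholders.
WITH events AS (
  SELECT
    event_name,
    user_pseudo_id,
    -- Pull the session identifier out of the repeated event_params field.
    (SELECT value.int_value
     FROM UNNEST(event_params)
     WHERE key = 'ga_session_id') AS ga_session_id
  FROM `my-project.analytics_123456789.events_*`
  WHERE _TABLE_SUFFIX = '20240101'
)
SELECT
  COUNTIF(event_name = 'session_start') AS session_start_events,
  -- Rows with a NULL user or session ID concatenate to NULL and are not counted.
  COUNT(DISTINCT CONCAT(user_pseudo_id, '.', CAST(ga_session_id AS STRING))) AS distinct_sessions
FROM events;
```

Neither figure is guaranteed to equal the interface number, but the distinct count is normally the one to compare against.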
The closer approximation is a distinct count of user and session ID pairs: SELECT COUNT(DISTINCT CONCAT(user_pseudo_id, '.', (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'))), rather than a simple count of session_start events. The GA4 documentation has the canonical query structure.
User Counting Differences
GA4's interface user count uses probabilistic modelling to account for cross-device users and non-cookie users. BigQuery contains only the deterministic user_pseudo_id, a device-level identifier. The interface will usually show fewer users than BigQuery's raw device ID count because it merges identities; BigQuery will never merge them unless you add custom logic.
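As a rough illustration of what custom merging logic might look like, the sketch below compares the raw device ID count with a count that prefers user_id when one is set. It assumes your implementation sends a User-ID; the table path and dates are placeholders. The COALESCE is a deliberately crude stand-in for identity merging, not a reproduction of GA4's reporting identity.

```sql
-- Sketch: device-level count vs. a naive merged count that prefers user_id.
-- `my-project.analytics_123456789` and the dates are placeholders.
SELECT
  COUNT(DISTINCT user_pseudo_id) AS device_ids,
  COUNT(DISTINCT user_id) AS signed_in_user_ids,
  -- Crude merge: a device seen both before and after login is still counted twice.
  COUNT(DISTINCT COALESCE(user_id, user_pseudo_id)) AS naive_merged_users
FROM `my-project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240107';
```

A more faithful merge would first map each user_pseudo_id to any user_id it was ever seen with, then count the mapped identities.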
Diagnosing Your Specific Discrepancy
Start by isolating the metric type and date range. Revenue discrepancies often indicate Consent Mode modelling differences or transaction ID deduplication differences. Session discrepancies usually indicate query construction issues or attribution processing. Event count discrepancies often reveal late-arriving hits or intraday export deduplication issues.
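For the revenue case, a quick check is to total purchase revenue with and without deduplicating on transaction ID. This is a sketch under stated assumptions: it uses the export's `ecommerce.transaction_id` and `ecommerce.purchase_revenue` fields, keeps the earliest event per transaction, and uses a placeholder table path and date range.

```sql
-- Sketch: purchase revenue before and after deduplicating on transaction_id.
-- `my-project.analytics_123456789` and the dates are placeholders.
WITH purchases AS (
  SELECT
    ecommerce.transaction_id AS transaction_id,
    ecommerce.purchase_revenue AS revenue,
    event_timestamp
  FROM `my-project.analytics_123456789.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240107'
    AND event_name = 'purchase'
),
deduped AS (
  -- Keep only the earliest purchase event for each transaction ID.
  SELECT transaction_id, revenue
  FROM purchases
  WHERE transaction_id IS NOT NULL
  QUALIFY ROW_NUMBER() OVER (PARTITION BY transaction_id ORDER BY event_timestamp) = 1
)
SELECT
  (SELECT COUNT(*) FROM purchases) AS purchase_events,
  (SELECT ROUND(SUM(revenue), 2) FROM purchases) AS raw_revenue,
  (SELECT COUNT(*) FROM deduped) AS distinct_transactions,
  (SELECT ROUND(SUM(revenue), 2) FROM deduped) AS deduped_revenue;
```

If raw and deduplicated revenue differ noticeably, duplicate purchase events are a likely contributor. Purchases with no transaction ID are excluded from the deduplicated figures, so check for those separately.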
For a thorough GA4 data quality baseline before diving into BigQuery analysis, our comprehensive GA4 audit checklist covers all configuration checks that affect data integrity. Run a 60-second automated audit to confirm your GA4 property is sending clean data before troubleshooting downstream.
