How to load data from external sources into BigQuery?

there are plenty of tools to do this with minimal code or ops headaches.

Native GCP Options :

  • BigQuery Data Transfer Service (DTS) – Great for native Google connectors (Ads, Analytics, YouTube). Useless for external DBs.

  • Cloud Data Fusion – Drag-and-drop UI, supports on-prem and SaaS, including SQL Server and PostgreSQL. Runs on managed Dataproc (Spark under the hood). Bit heavy but solid for complex flows. ( it Can get expensive at scale if not monitored well.)

  • Cloud Composer (Airflow) – Overkill unless you’re doing multi-step workflows or already familiar with Airflow.

  • Cloud Functions + Scheduler – Good for lightweight stuff. Think polling APIs, writing to GCS, triggering BigQuery jobs, Still requires code not ideal for scaling many sources.

Third-Party ELT Tools :

  • Fivetran – The gold standard. Plug and play for almost everything: PostgreSQL, SQL Server (CDC supported), Salesforce, HubSpot, etc. Managed, reliable, but pricing can sting for high volumes.

  • Hevo Data – Similar to Fivetran, easier on the budget, Great UI, handles real-time sync well.

  • Stitch Data – Basic, affordable, does the job

what I’d do and suggest you the same -

  1. Use Fivetran/Hevo for all major sources (SQL Server, PostgreSQL, SaaS).

  2. For weird or niche sources Cloud Functions or Data Fusion.

  3. Do transformations using dbt Cloud or BQ scheduled queries.

  4. Set up Cloud Monitoring for alerts/logging.

3 Likes