Uploading millions of documents to Google Cloud Storage

Hi @maxwellpinna ,

Welcome to Google Cloud Community!

You're right that the Google Transfer Appliance is not a good fit for your workload: it doesn’t support files smaller than 1 MB, which rules out most of your dataset.

Switching to Storage Transfer Service is a sensible choice. It handles datasets made up of many small files, uploads files in parallel, and lets you cap bandwidth usage so the transfer doesn’t disrupt the rest of your network.

For your workload, you may:

  • Use gsutil -m to perform multi-threaded uploads for better efficiency.
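  • Keep the transfer from overwhelming your network. gsutil has no dedicated bandwidth-limit flag, so either lower its parallelism (parallel_process_count and parallel_thread_count in the boto configuration file) or rely on the bandwidth limit in Storage Transfer Service.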
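  • If feasible, aggregate small files into larger archives (for example, tar files) before uploading. Fewer, larger objects mean less per-file overhead and usually a faster transfer.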
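
If you go the archive route, here is a minimal sketch of how the batching could look in Python, using only the standard library. The function name batch_into_archives, the batch-NNNNN.tar naming, and the default of 10,000 files per archive are my own illustrative choices, not anything required by Cloud Storage, so adjust them to your data. It assumes the output directory is outside the source directory.

```python
import tarfile
from pathlib import Path


def batch_into_archives(src_dir, out_dir, files_per_archive=10000):
    """Pack the files under src_dir into numbered tar archives in out_dir.

    Hypothetical helper for illustration. Returns the list of archive
    paths that were written. out_dir should not be inside src_dir.
    """
    src = Path(src_dir)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Sort so that the grouping of files into archives is deterministic.
    files = sorted(p for p in src.rglob("*") if p.is_file())

    archives = []
    for start in range(0, len(files), files_per_archive):
        archive_path = out / f"batch-{start // files_per_archive:05d}.tar"
        with tarfile.open(archive_path, "w") as tar:
            for path in files[start:start + files_per_archive]:
                # Store paths relative to src_dir so the folder layout
                # is preserved when the archive is extracted later.
                tar.add(path, arcname=path.relative_to(src).as_posix())
        archives.append(archive_path)
    return archives
```

The resulting archives can then be uploaded with gsutil -m cp like any other files. Keep in mind the trade-off: each archive is stored as a single object, so individual documents are only accessible after the archive is downloaded and extracted.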

I hope the above information is helpful.
