Scaling Beyond the Scrappy Era: Why We're Evaluating Google Cloud for Our Next 500K Users

Three years ago, we made a decision that seemed crazy to everyone: build a social commerce platform for Nigeria on a single $20/month VPS instead of using cloud infrastructure.

Today, we’re serving 102,000 active users, processing 2 million monthly events, and maintaining 99.2% uptime all on that same budget VPS.

Now we’re at a crossroads: migrating to cloud infrastructure in 2026. We’re evaluating both Azure and Google Cloud, and I’m here to understand which platform better fits our journey and constraints.

This isn’t a theoretical question. This is a real migration happening in Q1 2026, and I need the Google Cloud community’s wisdom.

The Journey So Far

Current Architecture:

  • Platform: Social commerce

  • Users: 102,000 active (last 30 days)

  • Events: ~2 million monthly

  • Uptime: 99.2% over 12 months

  • Infrastructure: Single VPS ($20/month)

  • Stack: PHP/Laravel, MySQL, Redis

  • Monthly revenue: ~₦850,000 (~$550 USD)

Yes, you read that correctly. One VPS serving 100K+ users.

Why we started this way:

We raised ₦15 million ($10,000 USD) in 2021. That needed to last 18 months in the Nigerian market.

When we priced cloud infrastructure:

  • AWS/GCP estimate: ~$150-200/month minimum

  • VPS cost: $20/month

  • Difference over 18 months: ~$2,340-3,240

That difference was:

  • 3 months of a junior developer salary, or

  • User acquisition for 5,000+ users, or

  • Extended runway when revenue wasn’t predictable

We chose to optimize instead of scaling with money.

What We Got Right

1. We Became Optimization Experts

When you can’t throw money at infrastructure, you learn to actually fix problems.

Example: Database Query Optimization

Our feed query initially joined 7 tables and took ~6.4 seconds at 30K users.

If we’d been on cloud, we might have just:

  • Upgraded the database instance

  • Added read replicas

  • Thrown managed caching at it

Instead, we learned strategic denormalization, caching patterns, and query optimization.

Result: Feed query time: 6.4s → 280ms (96% faster)

This made us better engineers.
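The caching half of that fix follows the classic cache-aside pattern: check the cache, fall back to the database on a miss, and repopulate. A minimal Python sketch (our stack is actually PHP/Laravel with Redis; the dict, TTL, and function names here are illustrative stand-ins):

```python
import time

# In-memory stand-in for Redis; in production this would be a Redis client.
cache = {}
CACHE_TTL = 60  # seconds; an assumed value, tune to feed freshness needs

def fetch_feed_from_db(user_id):
    # Placeholder for the feed query (formerly a 7-table join,
    # now reading from a denormalized feed table).
    return [{"user_id": user_id, "item": "post-1"}]

def get_feed(user_id):
    """Cache-aside: serve from cache, fall back to the DB and repopulate."""
    key = f"feed:{user_id}"
    entry = cache.get(key)
    if entry and entry["expires"] > time.time():
        return entry["value"]                 # cache hit: no DB round-trip
    value = fetch_feed_from_db(user_id)       # cache miss: run the query once
    cache[key] = {"value": value, "expires": time.time() + CACHE_TTL}
    return value
```

The expensive query runs at most once per TTL window per user; every other request within the window is a cache read.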

2. We Understood Every Resource Constraint

Example: Image Upload Disaster

At 50K users, synchronous image processing (8-15 seconds per upload) crashed our server at 2:47 AM.

Instead of just scaling horizontally, we redesigned the architecture:

  • Moved to async job processing

  • Background workers for heavy operations

  • Immediate user feedback

Result:

  • Upload endpoint: 8s → 340ms

  • Server capacity: 4x increase

  • No additional infrastructure costs
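The redesign boils down to: enqueue the heavy work, acknowledge the upload immediately, and let a background worker do the processing. A minimal Python sketch of that shape (in production this was Laravel queues backed by Redis; the queue, worker, and names here are stand-ins):

```python
import queue
import threading

job_queue = queue.Queue()
processed = []

def process_image(path):
    # Stand-in for the heavy 8-15 second resize/compress work.
    processed.append(path)

def worker():
    # Background worker: drains the queue independently of request handling.
    while True:
        path = job_queue.get()
        if path is None:  # sentinel to shut the worker down
            break
        process_image(path)
        job_queue.task_done()

def handle_upload(path):
    """Upload endpoint: enqueue the heavy work and return immediately."""
    job_queue.put(path)
    return {"status": "processing", "path": path}

threading.Thread(target=worker, daemon=True).start()
```

The request handler's latency is now the cost of a queue push, not the cost of image processing, which is where the 8s → 340ms improvement came from.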

Where We’ve Hit the Ceiling

Ceiling 1: Can’t Scale Beyond ~150K Users

Current bottlenecks:

  • Single database struggling with write load

  • One server handling all traffic

  • Manual, error-prone scaling

To reach 500K users, we need:

  • Database replication or managed database

  • Load balancing across multiple instances

  • Geographic distribution (planning Ghana, Kenya expansion)

  • Managed caching layer

Ceiling 2: Operational Burden on One Person

Right now:

  • I’m the only person who can deploy

  • I’m the only person who can debug production

  • 3 AM alerts come to my phone

  • Vacations require careful planning

This doesn’t scale. We need managed services and team distribution.

Ceiling 3: We Have Revenue Now

2021: Every ₦1,000 mattered.

2025: ₦850K monthly revenue, ₦15M raised.

We can now justify:

  • $500-800/month on infrastructure

  • Proper observability

  • Managed services

  • Peace of mind

It’s time to graduate.

Target Architecture: The “GCP Way”

Rather than just lifting and shifting our monolith, we’re considering a cloud-native approach that leverages Google Cloud’s strengths.

Proposed Architecture (Phase 3 - Q3 2026):

┌─────────────────────────────────────────────────────┐
│  Cloud CDN (Reduces egress costs significantly)    │
└─────────────────┬───────────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────────┐
│  Cloud Storage (Images, Static Assets)             │
└─────────────────────────────────────────────────────┘
                  
┌─────────────────────────────────────────────────────┐
│  Cloud Run (Gen 2) - Laravel Application           │
│  • Startup CPU Boost (mitigates cold starts)       │
│  • Min-instances=1 for critical services           │
│  • Autoscaling 1-10 instances                      │
│  • Standard Knative containers (portability)       │
└─────────────────┬───────────────────────────────────┘
                  │
    ┌─────────────┼─────────────┐
    │             │             │
    ▼             ▼             ▼
┌──────────┐  ┌───────────┐  ┌─────────────────┐
│Cloud SQL │  │Memorystore│  │Cloud Functions  │
│Enterprise│  │(Redis)    │  │(Background jobs)│
│99.95% SLA│  │Sessions + │  │                 │
│Read      │  │Caching    │  └─────────────────┘
│Replicas  │  └───────────┘
│(Ghana,   │
│Kenya)    │
└──────────┘
     │
     ▼
┌─────────────────────────────────────────────────────┐
│  Cloud Monitoring + Cloud Logging + Error Reporting │
└─────────────────────────────────────────────────────┘

Design Decisions:

1. Cloud Run Gen 2 with Startup CPU Boost

  • Addresses the PHP/Laravel cold start concerns

  • Min-instances=1 keeps one instance warm for critical paths

  • Knative-based means we can move to Azure Container Apps if needed

2. Cloud SQL Enterprise Edition

  • 99.95% SLA (industry standard for our scale)

  • Read replicas for international expansion (Ghana, Kenya)

  • Automated backups with point-in-time recovery

3. Cloud CDN as Cost Control

  • Critical: With ~5TB monthly bandwidth, egress to Africa could run ~$500/month

  • Cloud CDN caches our images/assets closer to users

  • Reduces egress costs by 60-80%

  • Improves load times for Nigerian users

4. Memorystore for External Sessions

  • Cloud Run is stateless (can’t use file-based Laravel sessions)

  • Memorystore provides Redis-compatible session storage

  • Also serves as application cache

5. Standard Containers (Portability)

  • Using Docker with php-fpm + Nginx (not GCP-specific)

  • If we need to move to Azure/AWS later, it’s just a container registry switch

  • Avoiding deep vendor lock-in

Why Google Cloud is on Our Evaluation List

We’re not just defaulting to one cloud provider. We’re actively evaluating, and Google Cloud has several compelling features for our use case.

What Attracts Us to Google Cloud:

1. Cloud Run for Cost-Efficient Compute

Our traffic is spiky (6-10 PM Nigerian time). We don’t need 24/7 full capacity.

Questions:

  • Can Cloud Run handle our Laravel application efficiently?

  • How does pricing work for unpredictable traffic patterns?

  • What’s the cold start situation for PHP/Laravel?

2. Firestore for Real-Time Features

We’re planning real-time notifications and live feed updates. Firestore’s real-time capabilities look promising.

Questions:

  • How does Firestore pricing scale with 100K+ users?

  • Can we migrate from MySQL incrementally, or is it all-or-nothing?

  • What’s the learning curve for developers used to SQL?

3. Cloud Storage + CDN Integration

We serve millions of images. Efficient storage + CDN is critical.

Questions:

  • How does Cloud Storage + Cloud CDN pricing compare to alternatives?

  • What’s the integration path from our current setup?

  • How do we handle our existing images (already optimized to ~180KB each)?

4. Cloud SQL for Managed Database

We’re currently on MySQL. Cloud SQL for MySQL seems like a natural fit.

Questions:

  • What’s the realistic cost at our scale?

  • How do we migrate 3 years of production data safely?

  • Can we test migration without committing fully?

5. Firebase for Mobile App Backend

We have Android/iOS apps. Firebase integration could simplify a lot.

Questions:

  • Can we use Firebase alongside our existing Laravel backend?

  • What’s the migration path for existing mobile authentication?

  • How does Firebase pricing work at scale?

What Concerns Us About Google Cloud:

1. Cost Visibility (Especially Egress)

The biggest surprise in my research: network egress costs to Africa.

With 5TB monthly bandwidth, egress could be $400-600/month alone. This is why Cloud CDN is critical for our architecture, not optional.

Questions:

  • How do we monitor egress costs in real-time?

  • What cost monitoring tools do you recommend?

  • Are there other “hidden” costs newcomers miss?

2. Vendor Lock-In

My strategy for portability:

Using standard containers (Cloud Run is Knative-based):

  • Cloud Run → can migrate to Azure Container Apps, AWS App Runner, or any Kubernetes

  • Cloud SQL → standard MySQL (can export/migrate anywhere)

  • Memorystore → standard Redis protocol (portable)

GCP-specific services I’m cautious about:

  • Firestore (proprietary, harder to migrate)

  • Cloud Tasks (can use standard message queues instead)

  • Firebase (deep integration, harder to unwind)

Questions:

  • Is my portability strategy realistic?

  • Which GCP services create the strongest lock-in?

  • What’s the real difficulty of migrating off Google Cloud later?

  • Should I avoid certain services to maintain flexibility?

Our concern: We’re a small team. Deep lock-in could limit options if our needs change or costs increase significantly.

3. Learning Curve

Our team knows traditional LAMP stack infrastructure. Cloud-native is a paradigm shift.

Questions:

  • What’s the recommended learning path?

  • Best resources for PHP/Laravel on GCP?

  • How much refactoring should we expect?

  • Should we hire GCP-experienced contractors for initial setup?

Our Planned Migration Approach

We’re thinking of a phased migration over 6 months:

Phase 1: Hybrid Mode (Q1 2026)

Budget: $100-150/month

Move to Google Cloud:

  • Cloud Storage + Cloud CDN (static assets, images)

    • Critical for controlling egress costs

    • Expect 60-80% reduction in bandwidth costs

  • Cloud Monitoring (observability)

  • Budget Alerts + Pub/Sub (cost control automation)

Keep on VPS:

  • Application server

  • Database

  • Redis

  • Background jobs

Success Metrics:

  • Cloud CDN reduces egress by >60%

  • Images load faster for Nigerian users (<2s vs <5s)

  • Storage costs under $50/month

  • No surprises in first bill

Goal: Learn Google Cloud without risk. Measure impact. Validate cost assumptions.

Phase 2: Compute Migration (Q2 2026)

Budget: $300-500/month

Move to Google Cloud:

  • Cloud Run Gen 2 (application)

    • Startup CPU Boost enabled

    • Min-instances=1 for main service

    • Autoscaling 1-10 instances

  • Cloud SQL Enterprise (MySQL database)

    • 99.95% SLA

    • Read replicas in preparation for Ghana/Kenya expansion

  • Memorystore (Redis for sessions + caching)

Keep on VPS:

  • Background jobs (safety net)

  • Legacy cron jobs

Success Metrics:

  • 100% traffic on Cloud Run

  • Uptime maintained (99.2%+)

  • Performance equivalent or better (API <300ms)

  • Cost within budget

Goal: Main user traffic on Google Cloud; VPS as safety net.

Phase 3: Full Cloud (Q3 2026)

Budget: $500-800/month

Move to Google Cloud:

  • Cloud Functions (background processing)

  • Cloud Tasks (job queue)

  • Everything else

Retire:

  • VPS completely

Goal: Fully cloud-native, managed, scalable.

Questions I Need Google Cloud Community to Answer

This is where I really need your help. These aren’t hypothetical; we’re making these decisions in the next 60 days.

1. Architecture Design

Question: For a Laravel monolith serving 100K users with plans for 500K, should we:

Option A: Deploy entire monolith to Cloud Run Gen 2?

  • Keep Laravel app as-is

  • Single Cloud Run service

  • Simpler initial migration

  • Concerns: Scaling limitations? Container size?

Option B: Split into multiple Cloud Run services from day one?

  • Separate services: API, Web, Admin, Jobs

  • More complex initial setup

  • Better scaling granularity

  • Concerns: Over-engineering? Premature optimization?

Option C: Use Compute Engine initially, refactor to Cloud Run later?

  • Closer to current VPS setup

  • Less architectural change upfront

  • Concerns: Defeats purpose of cloud-native migration?

Cloud Run Gen 2 specifics I’m considering:

  • Startup CPU Boost to handle Laravel bootstrapping

  • Min-instances=1 to keep critical services warm

  • Concurrency settings (PHP-FPM typically 10-50 concurrent requests)

Our concern: We want to avoid over-engineering, but also don’t want to create migration debt. What would experienced GCP architects recommend for our scale and trajectory?

2. Cost Management

Question: What’s the realistic monthly cost for:

  • Cloud Run: 100K requests/day (spiky traffic 6-10 PM)

  • Cloud SQL Enterprise: MySQL database, 15GB data, ~500 queries/minute peak

  • Cloud Storage: 200GB images

  • Network Egress: 5TB monthly bandwidth to Africa

  • Memorystore: Redis, ~2GB memory

Critical concern - Network Egress: I’ve seen estimates that egress to Africa costs $0.08-$0.12/GB. At 5TB/month, that’s potentially $400-$600 just in networking costs, which would consume most of our budget.

My understanding: Cloud CDN significantly reduces egress by caching assets closer to users. For our use case (mostly images and static content):

  • Without CDN: 5TB egress = ~$500/month

  • With CDN: 1-2TB egress + CDN costs = $150-200/month (60-80% savings)
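For anyone checking my math, here is the back-of-envelope model I’m using. The per-GB rates and the cache-hit ratio are my assumptions, not quoted GCP prices:

```python
def monthly_egress_cost(total_tb, egress_rate_per_gb, cache_hit_ratio=0.0,
                        cdn_fill_rate_per_gb=0.0):
    """Rough monthly egress estimate; all rates are assumptions."""
    total_gb = total_tb * 1024
    origin_gb = total_gb * (1 - cache_hit_ratio)  # traffic still leaving origin
    cached_gb = total_gb * cache_hit_ratio        # traffic served from the CDN
    return origin_gb * egress_rate_per_gb + cached_gb * cdn_fill_rate_per_gb

# Assumed rates: $0.10/GB origin egress, $0.02/GB CDN cache egress,
# 80% cache-hit ratio for an image-heavy workload.
no_cdn = monthly_egress_cost(5, 0.10)
with_cdn = monthly_egress_cost(5, 0.10, cache_hit_ratio=0.8,
                               cdn_fill_rate_per_gb=0.02)
```

Under these assumptions, 5TB without a CDN lands around $512/month, and with an 80% cache-hit ratio it drops into the $150-200 range, which is where my 60-80% savings estimate comes from. I’d appreciate corrections to any of the input rates.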

Questions:

  • Is my egress math correct?

  • How much does Cloud CDN typically reduce egress for image-heavy platforms?

  • Are there other networking optimizations I’m missing?

  • What’s the realistic total monthly cost with CDN properly configured?

Startup Credits: I’ve read about Google for Startups Cloud Program. With ₦15M (~$10K) raised:

  • Do we qualify for the “Start” tier ($2,000 credits)?

  • What’s required to access higher tiers?

  • How long do credits typically last?

Our concern: Budget estimates online vary wildly. Real world costs from similar workloads would be incredibly valuable.

3. Database Migration

Question: What’s the safest way to migrate production MySQL data to Cloud SQL Enterprise?

My research so far: I’ve learned about Google Cloud Database Migration Service (DMS) with Change Data Capture (CDC). If I understand correctly:

  1. Keep VPS MySQL as primary (production continues normally)

  2. DMS continuously replicates changes to Cloud SQL in real-time

  3. Once sync is complete, we have a 60-second cutover window

  4. Change Laravel .env database connection string

  5. VPS becomes backup for 1-2 weeks
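For the data-integrity question in step 3, the simplest pre-cutover sanity check I can think of is comparing per-table row counts between source and replica. A sketch (using SQLite connections in place of the two MySQL endpoints; a real check would also compare table checksums, and table names must come from a trusted list since they’re interpolated into SQL):

```python
import sqlite3

def table_counts(conn, tables):
    """Row counts per table: a coarse sanity check, not a full integrity proof."""
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
            for t in tables}

def diff_counts(source, replica, tables):
    """Return only the tables whose counts disagree, as (source, replica) pairs."""
    src = table_counts(source, tables)
    dst = table_counts(replica, tables)
    return {t: (src[t], dst[t]) for t in tables if src[t] != dst[t]}
```

An empty result from diff_counts is a necessary (not sufficient) condition for cutover; it at least catches tables the replication missed entirely.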

Questions:

  • Is DMS + CDC the recommended approach for production migrations?

  • What’s the typical sync time for ~15GB database with ongoing writes?

  • How do we validate data integrity before cutover?

  • What’s the rollback procedure if something goes wrong?

  • Any gotchas specific to Laravel/PHP applications?

Our concern: We can’t afford significant downtime. Our users (500+ vendors) depend on the platform for their livelihoods. A 60-second cutover is acceptable; hours of downtime is not.

4. PHP/Laravel on Google Cloud

Question: Are there any gotchas for running PHP/Laravel on Cloud Run?

What I’ve researched:

  • Cloud Run Gen 2 with Startup CPU Boost should mitigate cold start issues

  • Min-instances=1 keeps one container warm for critical services

  • Session management: Cloud Run is stateless, so I’ll need external session storage

My planned approach:

# Dockerfile for Cloud Run (sketch)
FROM php:8.2-fpm
# Install Nginx and Supervisor so one container can run both processes
RUN apt-get update && apt-get install -y nginx supervisor
COPY . /var/www/html
# Cloud Run routes traffic to $PORT (default 8080); Nginx must listen there
CMD ["supervisord", "-n"]
# Alternative: use Google Cloud's PHP Buildpack instead of a hand-rolled Dockerfile

Configure Laravel for external sessions:

// config/session.php: use Redis (Memorystore) instead of file-based sessions
'driver' => env('SESSION_DRIVER', 'redis'),
'connection' => env('SESSION_CONNECTION', 'session'),

// The 'session' Redis connection is defined in config/database.php,
// with REDIS_HOST pointing at the Memorystore instance's private IP

Questions:

  • Is php-fpm + Nginx in Docker the recommended setup for Cloud Run?

  • Or should I use Google Cloud’s PHP Buildpack?

  • Any performance considerations for Laravel on Cloud Run Gen 2?

  • How do file uploads work? (Users upload images—should these go directly to Cloud Storage?)

  • What about Laravel’s scheduler (cron jobs)? Move to Cloud Scheduler + Functions?

Our concern: Most Cloud Run documentation assumes Node.js or Python. PHP/Laravel patterns seem less documented. Want to avoid discovering gotchas in production.

5. Testing Without Commitment

Question: How can we test our full stack on Google Cloud without committing to migration?

  • Is there a free tier sufficient for testing?

  • Can we run parallel infrastructure temporarily?

  • What’s the recommended staging approach?

Our concern: We want to validate everything works before cutting over production.

6. Startup/Budget-Friendly Resources & Cost Control

Question: How do we prevent the “$5,000 surprise bill” scenario?

My understanding of cost control options:

Hard Limits (What I initially wanted): I’ve learned GCP doesn’t have a “hard stop” button for billing (which makes sense; a hard stop could corrupt databases mid-transaction).

Budget Alerts + Automation (What I’m now considering): Set up Budget Alerts → Pub/Sub → Cloud Function that:

  • Scales Cloud Run to min-instances=0 if we hit 80% of budget

  • Sends urgent alerts at 50%, 70%, 90%

  • Provides emergency runway before actual overages
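My rough sketch of that Cloud Function is below. It assumes the budget notification’s Pub/Sub payload carries costAmount and budgetAmount fields; the 80% threshold is our own choice, and the actual scale-down call to the Cloud Run Admin API is only indicated in a comment, since I haven’t built that part yet:

```python
import base64
import json

BUDGET_SCALE_DOWN_THRESHOLD = 0.8  # assumed policy: scale down at 80% of budget

def on_budget_alert(event):
    """Pub/Sub-triggered handler (sketch): decode a budget notification
    and decide whether to scale Cloud Run down."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    cost = payload["costAmount"]
    budget = payload["budgetAmount"]
    ratio = cost / budget
    if ratio >= BUDGET_SCALE_DOWN_THRESHOLD:
        # Here we would call the Cloud Run Admin API to set min-instances=0
        # (and page the on-call) before the budget is actually exceeded.
        return {"action": "scale_down", "ratio": ratio}
    return {"action": "none", "ratio": ratio}
```

If anyone has run a pattern like this in production, I’d love to know what threshold and actions worked for you.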

Questions:

  • Is Budget Alerts + Pub/Sub + Cloud Functions the recommended approach?

  • Are there better cost control patterns?

  • Should we use separate projects for prod/staging to isolate budgets?

  • What monitoring do you recommend for real-time cost tracking?

Startup Credits:

  • Google for Startups Cloud Program: do we qualify with ₦15M (~$10K) raised?

  • What’s the application process?

  • How long do credits typically last?

  • Are there African startup-specific programs?

Our concern: Coming from $20/month VPS, the idea of accidentally running a $5,000 bill is terrifying. We need robust safeguards before committing to cloud migration.

7. Regional Considerations

Question: What’s the latency/cost tradeoff for hosting in:

  • Europe (closest to Nigeria)

  • Asia (alternative)

  • Multiple regions (for expansion to Ghana, Kenya)

Our concern: Nigerian users are sensitive to slow load times. We need the right balance of latency and cost.

8. What Would You Do Differently?

Question: For those who’ve migrated from self-hosted to Google Cloud:

  • What surprised you?

  • What would you have done differently?

  • What mistakes did you make?

  • What turned out better than expected?

Our concern: We want to learn from others’ experiences.

Why This Migration Matters (Beyond Our Company)

This isn’t just about our social commerce platform. It’s about demonstrating that African startups can:

  1. Build for local constraints (expensive data, intermittent connectivity)

  2. Achieve scale efficiently (100K users on minimal infrastructure)

  3. Graduate to cloud infrastructure (when revenue justifies it)

  4. Do it transparently (sharing lessons openly)

If we can document this migration well (the costs, the challenges, the surprises), it becomes a blueprint for other African startups facing similar decisions.

The Bigger Picture

The next wave of successful African platforms will be built by Africans for African realities.

We’re proving you don’t need $500K in funding to build platforms that scale. You need:

  • Deep understanding of your market’s constraints

  • Technical creativity within those constraints

  • Financial discipline to survive until product-market fit

  • Willingness to make unpopular technical choices

But there comes a time to graduate from scrappy optimization to professional infrastructure.

We’re at that inflection point.

Google Cloud community: help us make the right decisions.

If you’ve been through similar migrations:

  • Share your experience in the comments

  • What worked? What didn’t?

  • What would you do differently?

If you’re a Google Cloud expert:

  • Answer our specific questions

  • Point us to resources we might have missed

  • Help us avoid common mistakes

If you’re considering similar migration:

  • Let’s compare notes

  • Maybe we can learn together

  • I’m happy to share more details

I’ll be checking this thread daily and responding to every comment. Let’s figure this out together.

By: Michael Okpotu Onoja.