Technical
How a Marketplace Platform Should Handle Scale from Day One
May 9, 2026 · 7 min read · Yellow Labs Team
Marketplace platforms fail at scale for predictable reasons. The data model does not support the transaction patterns that emerge. The payment and payout system was designed for happy paths. Search becomes unusable. Fraud appears and there is no tooling to detect or act on it.
Most of these failures trace back to decisions made early — often decisions that seemed fine at the time because they were simple. This guide covers the architecture decisions that are worth getting right from the beginning, and the ones that can safely wait.
The Data Model: Getting Buyers, Sellers, and Transactions Right
A marketplace data model is more complex than a standard e-commerce model because you have three actors (buyer, seller, platform) involved in every transaction, and the relationships between them are non-trivial.
Users with multiple roles
The first mistake is building separate buyers and sellers tables. Most marketplaces allow users to be both — a freelancer who also hires other freelancers, a landlord who also rents, a seller who also buys. Separating the tables creates enormous friction when a user wants to switch roles.
The right pattern: a single users table with a separate seller_profiles table for sellers who have completed onboarding. A user has a seller profile if they are a seller. Whether they are currently acting as a buyer or seller is session state, not identity.
The listing data model
Listings need to handle:
- Availability (for time-based or inventory-based marketplaces)
- Variants (different options, prices, SKUs)
- Images (multiple, ordered, with alt text)
- Status lifecycle (draft, published, sold, removed, flagged)
- Soft deletion (listings should not be hard-deleted — transactions reference them)
Missing the soft deletion requirement causes broken transaction history when listings are removed. Missing the status lifecycle creates moderation problems.
The transaction model
A transaction in a marketplace has a lifecycle, not just a state:
initiated → payment_authorized → seller_accepted → in_progress →
completed → payout_triggered → payout_completed
(Plus dispute, refunded, cancelled at various stages.)
Every state transition should be recorded — not just the current state. You need to know when a transaction moved from in_progress to completed for payout timing, for dispute resolution, and for analytics.
A simple status column is insufficient. You need either a state machine library or an events table that records every transition.
Payments and Payouts: The Hard Part
Payments in a marketplace are not the same as payments in a standard e-commerce store. The money flows in, sits in escrow, and flows out to a seller — minus the platform's fee. Every part of that is complex.
Stripe Connect is the right starting point
Stripe Connect handles the multi-party payment problem: the buyer pays, Stripe holds the funds, you pay out the seller. It handles KYC for sellers, supports automatic or manual payouts, and manages the platform fee split.
The alternatives (building your own escrow, using a payment gateway that does not support marketplace flows) create regulatory and operational problems that compound over time.
Payout timing is a product decision, not just a technical one
When do sellers get paid? Options:
- Immediately on completed transaction
- After a hold period (e.g., 7 days after delivery confirmation)
- On a schedule (weekly, biweekly)
- On demand (seller requests payout)
Each option has implications for fraud, disputes, and cash flow. A hold period gives you time to process disputes before money leaves. Immediate payouts are simpler but leave no room to claw back fraudulent transactions.
The payout timing rule needs to be explicit in your data model and business logic from the start. Changing it later means auditing thousands of existing transactions.
Platform fee handling
Your platform takes a percentage of each transaction. This percentage may vary by:
- Category of listing
- Seller tier (volume discounts)
- Promotional periods
If you hardcode the fee percentage, changing it requires a code deployment. A fee_rules table keyed to category and seller attributes is worth building early.
Real-Time vs. Polling for Notifications
Marketplaces generate high-frequency notifications: new message, offer accepted, payment received, order shipped. The question is how to deliver them.
Polling — the client asks the server every N seconds if there is anything new — is simple to implement and wrong for anything time-sensitive. A 30-second polling interval means messages take up to 30 seconds to appear. Buyers and sellers waiting on responses will not wait that long.
WebSocket or Server-Sent Events — a persistent connection from the client to the server that allows the server to push updates — is the right solution for real-time messaging and notifications.
The practical implementation for most early-stage marketplaces:
- Use WebSockets for real-time chat and critical status updates
- Use push notifications (via APNs/FCM for mobile, Web Push for browsers) for notifications when the user is not actively in the app
- Use email for important asynchronous events (offer accepted, payment received)
The notification architecture should be event-driven from the start: transactions emit events, events trigger notifications via the appropriate channel. If you wire notifications directly to specific code paths, adding a new channel (push notifications, SMS) later requires finding every relevant code path.
Search and Filtering at Scale
Search is where many marketplaces quietly break as they grow. A basic database query works for hundreds of listings. It breaks at tens of thousands.
What breaks first
- Full-text search via
LIKE '%keyword%'does not use indexes and scans every row - Filtering on multiple attributes simultaneously gets slow when combined with text search
- Geo-search (listings within X miles) requires specific database extensions or external services
- Ranking by relevance is impossible with basic SQL
The right architecture
For early-stage marketplaces (under ~50,000 listings), PostgreSQL with proper indexing and full-text search is sufficient. PostgreSQL's tsvector type and GIN indexes support real full-text search with ranking.
For geo-search, PostgreSQL with PostGIS handles proximity queries efficiently.
Once you exceed ~100,000 listings or need complex relevance ranking, Elasticsearch or Typesense become the right tools. They can be added incrementally — the application queries the search service, which is populated by syncing changes from the database.
The critical design decision: abstract search behind a service interface from the beginning, even if the initial implementation is just a database query. When you need to swap in Elasticsearch, you change the implementation behind the interface, not every place in the codebase that queries listings.
Fraud Signals: The Problem That Appears on Traction
Fraud in marketplaces is not an early-stage problem — until it is. The moment you have real money flowing, fraud attempts follow.
The most common patterns:
- Fake seller listings that take payment and never deliver
- Stolen payment methods used to buy, followed by chargebacks
- Review manipulation (fake positive reviews on own listings, fake negative reviews on competitors)
- Account takeover followed by payout redirect
You cannot eliminate fraud, but you can detect and act on it. What you need:
Signals to track:
- New account age at transaction time
- Velocity (how many transactions in the last 24 hours for this user)
- Payment method first-use timing
- Listing age at first purchase
- Dispute and chargeback rates per seller
Tooling to build:
- A moderation queue where flagged content waits for human review
- Ability to suspend accounts and freeze payouts without a code deployment
- Rate limiting per account for messages, listings, and transactions
The moderation queue and account suspension controls are worth building early — you will need them as soon as you have any meaningful transaction volume. The fraud signal system can be built reactively, but should not be deferred past your first significant fraud event.
What to Over-Engineer Early
A few things are genuinely worth the extra upfront investment:
The transaction state machine — a proper state machine with explicit transitions prevents invalid state bugs that are very hard to debug in production
Idempotent payment handling — webhook delivery from Stripe is not guaranteed to be exactly-once; your webhook handlers should be idempotent so processing the same event twice does not double-charge or double-payout
The notification architecture — building it event-driven from the start costs 5–10 extra hours and saves 50+ hours of refactoring later
Seller KYC data handling — storing tax information and identity verification data requires proper encryption at rest from the start; retrofitting encryption is expensive and risky
What to Safely Defer
Advanced search features — faceted filtering, synonyms, spell correction, semantic ranking — start with basic search and add these when users ask
Recommendation systems — "you might also like" requires enough data to be useful; do not build it until you have the data
Dynamic pricing — surge pricing, promotional pricing, A/B tested pricing — complex to build and requires a lot of transaction data to make good decisions
Multi-currency support — unless your market requires it from day one, build single-currency and add later; it is not a small change but it is a tractable one
The full moderation pipeline — start with manual moderation via the queue; build automation when you have enough data to train the rules
The Yellow Labs builds marketplace platforms with the data model, payment flows, and architecture decisions that hold up as the platform grows.
Our on-demand marketplace development services can help you get this right from the beginning.
Talk to us about your project
Senior engineers, honest scoping, and hourly billing. No fixed-price guesses on work we haven't understood yet.
Start a conversation →