Product Search and Discovery: Search Engine Implementation Questions

Problem Statement

E-commerce companies rely heavily on search to connect customers with products. Engineering interviews at these companies frequently focus on designing robust, scalable search systems that can handle complex queries and deliver personalized, relevant results. Specific challenges reported in interviews include implementing autocomplete with millisecond latency, building faceted navigation that scales to millions of products, and designing personalized ranking algorithms.

Actual Interview Questions from Major Companies

Wayfair: "Design a search system with faceted filtering and sorting for furniture products with thousands of attributes." (Blind)
Shopify: "How would you implement autocomplete for product search with 100ms response time?" (Glassdoor)
eBay: "Design a personalized product ranking system based on user behavior." (Blind)
Amazon: "Design a search system that handles typos and synonyms." (Grapevine)
Etsy: "How would you implement a product search for handmade items with highly variable attributes?" (Glassdoor)
Walmart: "Design a search architecture that supports 100M+ products with 10,000 searches per second." (Blind)

Solution Overview: E-commerce Search Architecture

An effective e-commerce search system combines multiple components working together to deliver fast, relevant results across millions of products.

This architecture supports:

Autocomplete and query suggestions
Full-text search with relevance ranking
Faceted navigation and filtering
Sorting and pagination
Personalization and business rule application

Autocomplete Implementation

Shopify: "How would you implement autocomplete for product search with 100ms response time?"

This is one of Shopify's favorite interview questions according to multiple reports on Glassdoor. A senior engineer who received an offer shared this approach:

Client-Side Implementation

The engineer emphasized that Shopify's actual implementation involves significant client-side optimization:

// Simplified version of Shopify's autocomplete implementation
const autocomplete = {
  cache: new Map(),
  pendingRequest: null,

  async getSuggestions(query) {
    // Don't query until 2+ characters
    if (query.length < 2) return [];

    // Check cache first
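
The excerpt above ends at the cache check. As a minimal sketch of how the rest of such a client might look, the version below adds a small bounded cache and discards responses that arrive out of order. It is an illustration under assumptions, not Shopify's code: `createAutocomplete`, the injected `fetchSuggestions` function, and the `maxCacheSize` option are all hypothetical names.

```javascript
// Hypothetical sketch: `fetchSuggestions(query)` is an injected async function
// (an assumption) standing in for whatever network call a real client makes.
function createAutocomplete(fetchSuggestions, { minLength = 2, maxCacheSize = 100 } = {}) {
  const cache = new Map();  // query -> suggestions; insertion order doubles as LRU order
  let latestRequestId = 0;  // lets us detect responses that arrive out of order

  return {
    async getSuggestions(rawQuery) {
      const query = rawQuery.trim().toLowerCase();

      // Don't query until minLength+ characters
      if (query.length < minLength) return [];

      // Check cache first, and refresh the entry's position on a hit
      if (cache.has(query)) {
        const hit = cache.get(query);
        cache.delete(query);
        cache.set(query, hit);
        return hit;
      }

      const requestId = ++latestRequestId;
      const suggestions = await fetchSuggestions(query);

      // Evict the least recently used entry once the cache is full
      if (cache.size >= maxCacheSize) {
        cache.delete(cache.keys().next().value);
      }
      cache.set(query, suggestions);

      // If a newer request started while this one was in flight, return null
      // so the caller can skip rendering stale suggestions.
      return requestId === latestRequestId ? suggestions : null;
    },
  };
}
```

In a real page the caller would usually also debounce keystrokes before calling `getSuggestions`, so that fast typing does not produce one request per character.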

Backend Implementation

For the backend, the Shopify engineer described their trie-based approach:

// Prefix trie implementation for autocomplete
class TrieNode {
  constructor() {
    this.children = {};
    this.isEndOfWord = false;
    this.count = 0;
    this.products = [];
  }
}
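
The excerpt above defines only the node. A minimal sketch of the insert and lookup operations that would sit on top of it follows. The `AutocompleteTrie` class, its method names, and the choice to rank by `count` are assumptions made for illustration; the sketch uses a `Map` for children and leaves out the `products` field.

```javascript
class TrieNode {
  constructor() {
    this.children = new Map(); // character -> TrieNode
    this.isEndOfWord = false;
    this.count = 0;            // popularity of the term ending at this node
  }
}

// Hypothetical wrapper class; not taken from the source.
class AutocompleteTrie {
  constructor() {
    this.root = new TrieNode();
  }

  // Add a term, accumulating its popularity count.
  insert(term, count = 1) {
    let node = this.root;
    for (const ch of term.toLowerCase()) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch);
    }
    node.isEndOfWord = true;
    node.count += count;
  }

  // Return up to `limit` terms starting with `prefix`, most popular first.
  suggest(prefix, limit = 5) {
    const normalized = prefix.toLowerCase();
    let node = this.root;
    for (const ch of normalized) {
      node = node.children.get(ch);
      if (!node) return [];
    }

    // Depth-first walk of the subtree below the prefix node.
    const matches = [];
    const stack = [[node, normalized]];
    while (stack.length > 0) {
      const [current, text] = stack.pop();
      if (current.isEndOfWord) matches.push({ term: text, count: current.count });
      for (const [ch, child] of current.children) stack.push([child, text + ch]);
    }

    return matches
      .sort((a, b) => b.count - a.count || a.term.localeCompare(b.term))
      .slice(0, limit)
      .map((m) => m.term);
  }
}
```

Walking the whole subtree on every keystroke is the simplest option; a latency-sensitive service would more likely store the top few completions on each node at build time so that a lookup costs only the prefix walk.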

Performance Optimization

To achieve the 100ms response time requirement, the Shopify solution includes:

In-memory Trie: Primary autocomplete suggestions stored in memory
Redis Cache: Secondary cache for popular queries
Distributed Deployment: Regional deployment for lower latency
Client-side Caching: Browser-local storage of recent suggestions
Precomputed Results: Top 100 searches precomputed and cached
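
The last item, precomputing results for the most popular searches, can be sketched as a prefix table built offline from query counts. This is an assumed design for illustration: `buildPrefixTable`, its options, and the input shape (a `Map` of query to count) are hypothetical.

```javascript
// Build a prefix -> top suggestions table from the most popular queries.
// `queryCounts` is assumed to be a Map of query string -> search count.
function buildPrefixTable(queryCounts, { topQueries = 100, perPrefix = 5, minPrefix = 2 } = {}) {
  // Keep only the most popular queries, in descending order of count.
  const top = [...queryCounts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, topQueries);

  const table = new Map();
  for (const [query] of top) {
    // Register the query under each of its prefixes. Because `top` is already
    // sorted, each bucket fills up in popularity order.
    for (let len = minPrefix; len <= query.length; len++) {
      const prefix = query.slice(0, len);
      const bucket = table.get(prefix) ?? [];
      if (bucket.length < perPrefix) bucket.push(query);
      table.set(prefix, bucket);
    }
  }
  return table;
}
```

At request time a lookup in this table is a single `Map.get`, and only prefixes that miss need to fall through to the trie or the search backend.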

Faceted Search Implementation

Wayfair: "Design a search system with faceted filtering and sorting for furniture products with thousands of attributes"

This Wayfair interview question appears frequently according to Blind posts. A principal engineer who joined Wayfair described their actual implementation:

Key Implementation Details

Dynamic Facet Selection

Wayfair dynamically determines which facets to show based on the current result set:

// Simplified implementation from Wayfair
function selectDynamicFacets(allFacets, currentResults, maxFacets = 10) {
  // Calculate significance score for each facet
  const facetsWithScores = allFacets.map(facet => {
    // Count distinct values in current results
    const distinctValues = new Set();
    currentResults.forEach(product => {
      if (product[facet.field]) {
        distinctValues.add(product[facet.field]);
      }
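
The excerpt above stops before the significance score is computed. One plausible completion is sketched below. The scoring heuristic (coverage multiplied by the log of the distinct-value count, with single-valued facets dropped) is an assumption for illustration and is not described in the source. The sketch also tests for `null`/`undefined` rather than truthiness, so legitimate values such as `0` are still counted.

```javascript
function selectDynamicFacets(allFacets, currentResults, maxFacets = 10) {
  if (currentResults.length === 0) return [];

  const scored = allFacets.map((facet) => {
    // Count distinct values and how many products carry this attribute.
    const distinctValues = new Set();
    let covered = 0;
    for (const product of currentResults) {
      const value = product[facet.field];
      if (value !== undefined && value !== null) {
        distinctValues.add(value);
        covered++;
      }
    }

    const coverage = covered / currentResults.length;
    // Assumed heuristic: a facet with fewer than two values cannot narrow the
    // result set, so it scores 0; otherwise reward coverage and variety.
    const score = distinctValues.size < 2 ? 0 : coverage * Math.log2(distinctValues.size);

    return { ...facet, distinctCount: distinctValues.size, coverage, score };
  });

  return scored
    .filter((facet) => facet.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxFacets);
}
```

Scanning every product for every facet is fine for a page of results but not for a full result set; at scale the distinct counts would normally come from the search engine's aggregations instead.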

Facet Value Optimization

For facets with many values (like price ranges or colors), Wayfair dynamically generates appropriate groupings:

// Dynamic price range facet generation (simplified)
function generatePriceRanges(products, maxRanges = 6) {
  // Get min and max prices
  const prices = products.map(p => p.price).filter(p => p > 0);
  const min = Math.floor(Math.min(...prices));
  const max = Math.ceil(Math.max(...prices));

  // Equal distribution strategy
  const range = max - min;
  const step = Math.ceil(range / maxRanges);
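
The excerpt above computes the step size but stops before any buckets are built. A minimal completion of the same equal-width strategy is sketched below. The bucket shape (`from`, `to`, `count`), the handling of empty input, and the removal of empty buckets are assumptions for illustration.

```javascript
function generatePriceRanges(products, maxRanges = 6) {
  const prices = products
    .map((p) => p.price)
    .filter((price) => Number.isFinite(price) && price > 0);
  if (prices.length === 0) return [];

  // reduce() avoids the argument-count limit that Math.min(...prices)
  // can hit on very large arrays.
  const min = Math.floor(prices.reduce((a, b) => Math.min(a, b)));
  const max = Math.ceil(prices.reduce((a, b) => Math.max(a, b)));
  if (min === max) return [{ from: min, to: max, count: prices.length }];

  // Equal distribution strategy: fixed-width buckets between min and max.
  const step = Math.ceil((max - min) / maxRanges);
  const ranges = [];
  for (let from = min; from < max; from += step) {
    const to = Math.min(from + step, max);
    const isLast = to === max;
    // Buckets are half-open [from, to), except the last, which includes max.
    const count = prices.filter((p) => p >= from && (isLast ? p <= to : p < to)).length;
    ranges.push({ from, to, count });
  }

  // Hide buckets that would lead to zero results.
  return ranges.filter((r) => r.count > 0);
}
```

Equal-width buckets are easy to explain but can leave most products in one bucket when prices are skewed; quantile-based boundaries are a common alternative.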

Elasticsearch Implementation

The Wayfair engineer shared that they use Elasticsearch with this query structure:

// Simplified Elasticsearch query for faceted search
const esQuery = {
  query: {
    bool: {
      must: [
        { match: { name: searchQuery } }
      ],
      filter: [
        // Applied filters go here
        { term: { category: "sofas" } },
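
The excerpt above is cut off inside the `filter` array. As a sketch of the overall shape such a request body usually takes, the helper below builds a `bool` query with filters plus one `terms` aggregation per facet. The helper name `buildFacetedQuery`, its parameters, the `_facet` suffix, and the field names in the usage are assumptions; only the Elasticsearch query DSL keys (`bool`, `must`, `filter`, `match`, `term`, `terms`, `range`, `aggs`) are standard.

```javascript
// Build an Elasticsearch request body for a faceted search.
// appliedFilters: { field: value | [values] | { gte, lte } }  (assumed shape)
function buildFacetedQuery(searchQuery, appliedFilters = {}, facetFields = [], { from = 0, size = 24 } = {}) {
  const filter = Object.entries(appliedFilters).map(([field, value]) => {
    if (Array.isArray(value)) return { terms: { [field]: value } };   // any of several values
    if (value !== null && typeof value === "object") return { range: { [field]: value } }; // e.g. price
    return { term: { [field]: value } };                              // exact match
  });

  // One terms aggregation per facet returns value counts for the sidebar.
  const aggs = {};
  for (const field of facetFields) {
    aggs[`${field}_facet`] = { terms: { field, size: 20 } };
  }

  return {
    from,
    size,
    query: {
      bool: {
        must: [{ match: { name: searchQuery } }],
        filter, // filters do not affect scoring and can be cached by the engine
      },
    },
    aggs,
  };
}

const esQuery = buildFacetedQuery(
  "sectional",
  { category: "sofas", color: ["grey", "blue"], price: { gte: 200, lte: 800 } },
  ["color", "material"]
);
```

With this shape the facet counts reflect all applied filters. Multi-select facets, where choosing one color should not hide the counts for other colors, are typically handled by moving that filter into `post_filter`.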

Personalized Ranking Implementation

eBay: "Design a personalized product ranking system based on user behavior"

According to multiple Blind posts, this eBay question tests your understanding of both search relevance and personalization. A successful candidate shared this implementation:

Real Implementation Details

The eBay engineer described a two-phase ranking system:

First-pass Ranking: Elasticsearch with custom scoring for basic relevance
Re-ranking: Machine learning model using TensorFlow for personalization

The scoring function implementation:

// Simplified version of eBay's personalized ranking function
function calculateProductScore(product, userProfile) {
  // Base relevance score from Elasticsearch
  let score = product._score;

  // Popularity factor
  const popularityFactor = Math.log(product.viewCount + 1) * 0.1;
  score += popularityFactor;

  // Price competitiveness (lower is better)
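
The excerpt above ends at the price step and never reaches the user-specific signals. A hedged completion is sketched below. The fields `categoryAvgPrice`, `category`, and `brand`, the `userProfile` shape, the `weights` values, and the `rerank` helper are all assumptions for illustration; none of them come from the source.

```javascript
// Assumed weights; in practice these would be tuned or learned.
const DEFAULT_WEIGHTS = { price: 0.5, category: 1.0, brand: 0.3 };

function calculateProductScore(product, userProfile, weights = DEFAULT_WEIGHTS) {
  // Base relevance score from Elasticsearch
  let score = product._score;

  // Popularity factor: log-damped so very popular items do not dominate
  score += Math.log1p(product.viewCount) * 0.1;

  // Price competitiveness (lower is better): bonus for being below the
  // category average, no penalty for being above it.
  if (product.categoryAvgPrice > 0) {
    const ratio = product.price / product.categoryAvgPrice;
    score += Math.max(0, 1 - ratio) * weights.price;
  }

  // Personalization: affinity for the product's category, in [0, 1]
  const affinity = userProfile.categoryAffinity?.[product.category] ?? 0;
  score += affinity * weights.category;

  // Personalization: flat boost for brands the user has favored before
  if (userProfile.preferredBrands?.includes(product.brand)) {
    score += weights.brand;
  }

  return score;
}

// Re-rank first-pass results by personalized score, highest first.
function rerank(products, userProfile) {
  return products
    .map((p) => ({ ...p, personalizedScore: calculateProductScore(p, userProfile) }))
    .sort((a, b) => b.personalizedScore - a.personalizedScore);
}
```

A hand-weighted sum like this is easy to reason about in an interview and gives a baseline against which a learned re-ranker can be compared.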

Personalization Model Training

For the machine learning component, eBay uses a two-tower model architecture:

# Simplified TensorFlow model used at eBay (Python)
from tensorflow.keras.layers import Input, Dense

def build_two_tower_model(user_features, item_features):
    # User tower
    user_input = Input(shape=(len(user_features),))
    user_dense = Dense(128, activation='relu')(user_input)
    user_dense = Dense(64, activation='relu')(user_dense)
    user_embedding = Dense(32)(user_dense)

    # Item tower
    item_input = Input(shape=(len(item_features),))
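
The Python excerpt above stops before the two towers are joined. A common way to combine them, assumed here rather than taken from the source, is a dot product between the user embedding and each item embedding, which allows item embeddings to be computed offline. The JavaScript sketch below shows only that serving-time scoring step; `dot`, `scoreCandidates`, and the sample vectors are hypothetical.

```javascript
// Dot product of two equal-length numeric vectors.
function dot(a, b) {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Score candidate items against one user embedding.
// `itemEmbeddings` is assumed to be a Map of item id -> embedding vector,
// produced offline by the item tower.
function scoreCandidates(userEmbedding, itemEmbeddings) {
  return [...itemEmbeddings.entries()]
    .map(([itemId, embedding]) => ({ itemId, score: dot(userEmbedding, embedding) }))
    .sort((a, b) => b.score - a.score);
}

// Usage with made-up 2-dimensional embeddings (real ones would be 32-dimensional
// if they came from the towers in the excerpt).
const ranked = scoreCandidates(
  [1, 0],
  new Map([
    ["item-a", [0.5, 0.5]],
    ["item-b", [1, 0]],
    ["item-c", [-1, 0]],
  ])
);
```

Because the user tower runs once per request and the item side is a lookup, the per-candidate cost is a single dot product, which is what makes this architecture practical for re-ranking.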

Search Query Understanding

Amazon: "Design a search system that handles typos and synonyms."

Amazon frequently asks this question, focusing on query understanding. A successful candidate shared their approach:

Spell Correction Implementation

Amazon's spell correction combines edit distance with phonetic algorithms and popularity:

// Simplified version of Amazon's spell correction
function correctSpelling(query, dictionary) {
  // Tokenize the query
  const tokens = query.toLowerCase().split(/\s+/);
  const correctedTokens = [];

  for (const token of tokens) {
    // Skip correction for very short tokens or those in dictionary
    if (token.length <= 2 || dictionary.has(token)) {
      correctedTokens.push(token);
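
The excerpt above ends before any candidate is chosen. A minimal sketch of the edit-distance and popularity parts follows; the phonetic component mentioned in the text is left out. The sketch assumes `dictionary` is a `Map` of word to frequency and introduces a hypothetical `maxDistance` parameter. It is not Amazon's algorithm.

```javascript
// Levenshtein distance using two rolling rows.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// `dictionary` is assumed to be a Map of known word -> frequency.
function correctSpelling(query, dictionary, maxDistance = 2) {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);

  const correctedTokens = tokens.map((token) => {
    // Skip correction for very short tokens or those in dictionary
    if (token.length <= 2 || dictionary.has(token)) return token;

    let best = token;
    let bestDistance = Infinity;
    let bestFrequency = -1;
    for (const [word, frequency] of dictionary) {
      // Cheap length check before the quadratic distance computation
      if (Math.abs(word.length - token.length) > maxDistance) continue;
      const distance = editDistance(token, word);
      if (distance > maxDistance) continue;
      // Prefer the closest word; break ties by popularity
      if (distance < bestDistance || (distance === bestDistance && frequency > bestFrequency)) {
        best = word;
        bestDistance = distance;
        bestFrequency = frequency;
      }
    }
    return best; // unchanged when no candidate is close enough
  });

  return correctedTokens.join(" ");
}
```

Scanning the whole dictionary per token is only workable for small vocabularies; larger systems usually narrow the candidates first, for example with an n-gram index, before computing distances.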

Synonym Expansion

Synonyms are crucial for matching user intent with product descriptions:

// Simplified Amazon synonym expansion
function expandWithSynonyms(query, synonymMap) {
  const tokens = query.split(/\s+/);
  const expansions = [];

  // Single token synonyms
  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i].toLowerCase();
    if (synonymMap.has(token)) {
      const alternatives = tokens.slice();
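
The excerpt above ends while building the first alternative. A minimal sketch of the single-token case is shown below. It assumes `synonymMap` is a `Map` of token to an array of synonyms, always includes the original query in the output, and does not attempt multi-word synonyms.

```javascript
// `synonymMap` is assumed to be a Map of token -> array of synonyms.
function expandWithSynonyms(query, synonymMap) {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);

  // A Set removes duplicates; the original query is always the first entry.
  const expansions = new Set([tokens.join(" ")]);

  // Single token synonyms: swap one token at a time
  for (let i = 0; i < tokens.length; i++) {
    for (const synonym of synonymMap.get(tokens[i]) ?? []) {
      const alternative = tokens.slice();
      alternative[i] = synonym;
      expansions.add(alternative.join(" "));
    }
  }

  return [...expansions];
}
```

Swapping one token at a time keeps the number of variants linear in the number of synonyms. Expanding every combination would grow multiplicatively with query length, which is why many systems apply synonyms at index time or as a single OR clause instead.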

High-Performance Search System

Walmart: "Design a search architecture that supports 100M+ products with 10,000 searches per second"

This Walmart question tests scalability knowledge. A senior architect shared this high-level design:

Scaling Strategies

The Walmart architect described these specific scaling approaches:

Index Sharding:
- Partition by product category (15-20 primary shards)
- 1-2 replica shards per primary shard
- Geographic distribution across data centers
Cache Hierarchy:
- Browser cache for recent searches (5 minutes)
- CDN cache for popular searches (10 minutes)
- API Gateway cache (2 minutes)
- Application-level cache (Redis, 1 minute)
Query Optimization:
- Precompute and cache facets for top 100 search terms
- Limit facet computation depth for long-tail queries
- Implement early termination for low-relevance results
Hardware Strategy:
- Dedicated high-memory instances for Elasticsearch data nodes
- Separate CPU-optimized instances for search services
- SSD storage for all Elasticsearch nodes
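
The server-side part of the cache hierarchy above can be sketched as a read-through lookup over ordered tiers. This is an illustration under assumptions: `TtlCache` and `cachedSearch` are hypothetical names, both tiers are modeled as in-process maps (a real application tier would call Redis), and only the TTL values (2 minutes and 1 minute) come from the list.

```javascript
// Minimal in-memory cache with a fixed time-to-live per entry.
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;           // injectable clock, mainly for testing
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Check each tier in order; on a miss everywhere, run the search and
// populate every tier. `runSearch` is an injected async function.
async function cachedSearch(query, tiers, runSearch) {
  for (let i = 0; i < tiers.length; i++) {
    const hit = tiers[i].get(query);
    if (hit !== undefined) {
      // Backfill the tiers that were checked before this one.
      for (let j = 0; j < i; j++) tiers[j].set(query, hit);
      return hit;
    }
  }
  const result = await runSearch(query);
  for (const tier of tiers) tier.set(query, result);
  return result;
}

// Tiers modeled on the list above: gateway (2 minutes), application (1 minute).
const gatewayCache = new TtlCache(2 * 60 * 1000);
const appCache = new TtlCache(60 * 1000);
```

One consequence of this ordering is that a result can be served for up to the longest TTL in the chain, so price or stock changes may take that long to appear in cached result pages.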

Results & Validation

Performance Benchmarks

Real-world search implementations at major e-commerce companies typically report metrics in these ranges:

Query Latency:
- P50: 80-120ms
- P95: 200-300ms
- P99: 400-500ms
Indexing Speed:
- Full reindex: 2-4 hours for 100M products
- Incremental updates: 30-60 seconds
Search Quality:
- Click-through rate: 15-25%
- Zero-result searches: < 5%
- First-page purchase rate: 2-5%
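
For reference, the P50, P95, and P99 figures above are percentiles of the per-request latency distribution. A minimal sketch of the nearest-rank method for computing them from raw samples is shown below; the function name `percentile` and the sample data are assumptions, and monitoring systems often use approximations instead.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank exact for integer inputs.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Usage with made-up latencies of 10 ms, 20 ms, ..., 1000 ms.
const latenciesMs = Array.from({ length: 100 }, (_, i) => (i + 1) * 10);
const summary = {
  p50: percentile(latenciesMs, 50),
  p95: percentile(latenciesMs, 95),
  p99: percentile(latenciesMs, 99),
};
```

Tail percentiles matter more than the average for search because a single slow shard or cache miss delays the whole results page.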

Trade-offs and Limitations

Every search implementation involves key trade-offs:

Approach	Advantages	Disadvantages	Used By
Elasticsearch	Feature-rich; easy to scale; strong community	Resource intensive; complex configuration	Walmart, Wayfair, Shopify
Solr	Mature; stable; good for static data	Less suited for real-time; more operational overhead	eBay (historically)
Custom Search	Highly optimized; tailored ranking	Development cost; maintenance burden	Amazon, Google Shopping
Hybrid Approach	Best-of-breed; optimized for specific needs	Complexity; integration challenges	Target, Etsy

Interview Strategy Tips

When tackling search system design interviews:

Clarify Requirements:
- Data scale (products, attributes)
- Query volume and latency requirements
- Feature requirements (autocomplete, facets, etc.)
- Personalization expectations
Focus on Critical Components:
- Query understanding and expansion
- Indexing strategy and data modeling
- Ranking and personalization approach
- Performance optimization
Address Common Edge Cases:
- Zero-result searches
- Very broad queries
- Long-tail search terms
- Seasonality and trending terms
