Improving Search with AI

Increased search click-through rate by 4%, improved customer satisfaction by 8%, and reduced engineering maintenance time by 42% by first making "relevance" mean something specific.

Overview

A cross-team initiative to rebuild HSE's core search experience, one of the highest-intent journeys for 1.3M+ active customers across web and app.

Search in action.

My direct contribution

Strategy, framing & design direction

  • Defined the four search intent types (product number, brand, category, open query) and the success criteria for each; this became the shared rubric for Design, Engineering, Data, and Research
  • Led the initial analysis of top-100 queries and zero-result failure patterns that shaped the redesign brief
  • Directed the information hierarchy on result pages: what gets promoted, what gets demoted, and why visual prominence had to follow purchase signal strength
  • Pushed to include engineering maintenance time as a success metric (it ultimately fell by 42%); most teams only measured customer-facing KPIs

Team & collaboration

Cross-team alignment & delivery

  • Coached and mentored the search team's designer throughout the project lifecycle, providing strategic direction on key decisions
  • Worked closely with the search PM to ensure both business goals and user satisfaction shaped the success metrics; search touched Merchandising, Engineering, Design, and Data simultaneously
  • Kept design involved in platform decisions early, not just the UI layer
  • Coordinated with the User Researcher on qualitative interview framing

Duration: 2024 · Channels: Web shop, main app · Platform: Elasticsearch

Impact

  • +4% click-through rate on search result pages
  • +8% customer satisfaction for search
  • −42% engineering maintenance time

The Problem

The previous search behaved like a black box. It offered limited control over ranking, made it difficult to explain why results appeared, was hard to tune for different user intents, and was too expensive for engineering to maintain. This created friction for 1.3M+ customers and slowed down every attempt to improve it.

The harder problem was upstream: nobody had agreed on what "good search" actually meant. Engineering measured index performance. Merchandising measured product visibility. Customer service measured complaint volume. Design was measuring nothing. Before anything could improve, that fragmentation had to be resolved.

The decision that unlocked everything: defining relevance first

My most important contribution to this project happened before any screen was designed. I pushed the team to define "relevance" as four distinct scenarios, not one global metric.

HSE customers search in four fundamentally different ways: by product number shown on live TV (exact match, high intent), by brand (brand loyalty, browsing), by category (exploratory), and by open-ended query (uncertain intent, needs guidance). Each requires a different definition of a good result: different ranking logic, different result page hierarchy, different filter behaviour, different handling of zero results.
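
To make the four intent types concrete, here is a minimal sketch of how a query might be routed to one of them. It is an illustration only, not the production logic: the six-digit product number format, the brand and category lookups, and every name in the block are assumptions.

```python
import re
from enum import Enum


class Intent(Enum):
    PRODUCT_NUMBER = "product_number"
    BRAND = "brand"
    CATEGORY = "category"
    OPEN_QUERY = "open_query"


# Hypothetical lookups; a real system would load these from catalogue data.
KNOWN_BRANDS = {"examplebrand"}
KNOWN_CATEGORIES = {"kitchen", "jewellery"}

# Assumed format: product numbers are exactly six digits.
PRODUCT_NUMBER_PATTERN = re.compile(r"\d{6}")


def classify_intent(query: str) -> Intent:
    """Route a raw query to one of the four intent types."""
    # Lowercase and collapse whitespace so lookups are case-insensitive.
    normalized = " ".join(query.lower().split())
    if PRODUCT_NUMBER_PATTERN.fullmatch(normalized):
        return Intent.PRODUCT_NUMBER
    if normalized in KNOWN_BRANDS:
        return Intent.BRAND
    if normalized in KNOWN_CATEGORIES:
        return Intent.CATEGORY
    # Anything unrecognised is treated as an open-ended query.
    return Intent.OPEN_QUERY
```

Once a query carries an intent label, ranking logic, page hierarchy, filter behaviour, and zero-result handling can each branch on it.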

Collapsing all four into a single "relevance score" was what had made the previous system so hard to improve. Every change that helped one intent type hurt another. Splitting them created four separate improvement tracks, and four separate success metrics that Product, Engineering, and Data could all align on.
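
The effect of splitting one global metric into four can be sketched with a per-intent breakdown. This example assumes click-through rate as the measure and a hypothetical `SearchEvent` record; the metrics actually defined for each intent are not spelled out here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchEvent:
    """Hypothetical log record: one search and whether a result was clicked."""
    intent: str
    clicked: bool


def click_through_rate_by_intent(events):
    """Return the share of searches with a click, keyed by intent."""
    searches = defaultdict(int)
    clicks = defaultdict(int)
    for event in events:
        searches[event.intent] += 1
        if event.clicked:
            clicks[event.intent] += 1
    # Only intents that appear in the events are reported, so no division by zero.
    return {intent: clicks[intent] / searches[intent] for intent in searches}
```

A single blended rate would average these groups together, so a gain for one intent could mask a loss for another; reporting them separately keeps each improvement track visible.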

What qualitative research added

Before finalising the success metrics, the User Researcher ran interviews with customers who had recently experienced search failures. The goal wasn't to discover new intent types; those we already knew. It was to understand the language customers used to describe failure.

That language mattered for two reasons. First, it revealed that customers didn't distinguish between "no results" and "wrong results"; both felt like the product was broken. Second, it told us which failure modes were most damaging to purchase intent, which shaped our prioritisation order.

What we built

  • Intent-based result page hierarchy: result prominence follows purchase signal, not just keyword match
  • Improved filter and refinement patterns matched to how customers actually think about product categories
  • Better zero and near-zero result states: recovery paths instead of dead ends
  • Consistent behaviour across web and mobile; previously they had diverged significantly

Takeaway

Redesigning search for 1.3M+ customers starts before the first screen. Engineering, data, and product can finally pull in the same direction once design has defined what relevance means for each of the four distinct intents.