BENTONVILLE, AR – Walmart Global Tech has unveiled ReLSum, a novel reinforcement learning framework designed to significantly enhance e-commerce search relevance by generating concise, query-relevant product summaries. This innovation addresses a critical challenge in online retail where traditional search systems often miss crucial product details, impacting customer experience and sales.
Modern e-commerce rankers typically process only product titles to keep latency low, a limitation that can bury relevant items when key attributes appear only in the product description. As Rohan Paul put it in a recent tweet, "That misses key details, like 'taurine' for a pet supplement, so relevant items get buried." ReLSum addresses this by appending short, description-aware summaries to product titles, capturing details such as "SPF 18" or "bookcase headboard" while keeping the token count low.
The ReLSum framework uses a trainable Large Language Model (LLM) to generate candidate summaries. A frozen BERT ranker then scores each summary by how well it aligns the product with a given query, and that score serves as the reward signal for training the LLM. "No gradients are passed into the ranker, so production stays simple and fast," Paul noted. Training relies on reinforcement learning, specifically Group Relative Policy Optimization (GRPO) and Direct Preference Optimization (DPO).
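The reward loop described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: `toy_ranker_score` is a hypothetical stand-in for the frozen BERT ranker (simple query-term overlap), the candidate summaries are invented for illustration, and the advantage formula follows the commonly described GRPO recipe of normalizing each reward against the mean and standard deviation of its group. The LLM policy update itself is omitted.

```python
from statistics import mean, pstdev
from typing import Callable, List


def toy_ranker_score(query: str, text: str) -> float:
    """Hypothetical stand-in for the frozen ranker: fraction of query terms found in text."""
    query_terms = query.lower().split()
    if not query_terms:
        return 0.0
    text_terms = set(text.lower().split())
    return sum(term in text_terms for term in query_terms) / len(query_terms)


def score_candidates(
    query: str,
    title: str,
    summaries: List[str],
    ranker: Callable[[str, str], float],
) -> List[float]:
    """Score each candidate as 'title + summary'; the ranker is only read, never updated."""
    return [ranker(query, f"{title} {summary}") for summary in summaries]


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of candidates sampled for the same product."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative data only: one query, one title, three sampled candidate summaries.
query = "taurine cat supplement"
title = "Cat Supplement Chews"
candidates = [
    "with taurine for heart health",
    "tasty chicken flavor",
    "60 count pouch",
]

rewards = score_candidates(query, title, candidates, toy_ranker_score)
advantages = group_relative_advantages(rewards)
for summary, advantage in zip(candidates, advantages):
    print(f"{advantage:+.3f}  {summary}")
```

In this sketch, the summary that surfaces the missing attribute receives a positive advantage and the others receive negative ones, which is the signal a policy-gradient step would use to favor it. Because the ranker is only called for scores, it needs no gradient access.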
Experimental results from the paper "Generating Query-Relevant Document Summaries via Reinforcement Learning" (arXiv:2508.08404) show measurable improvements. ReLSum with GRPO achieved a 0.51% gain in Recall at 90% Precision (R@90P) and a 1.03% gain in Normalized Discounted Cumulative Gain (NDCG@5) on the Golden-tail dataset, outperforming all baselines. Although these gains look small in percentage terms, at the scale of a large e-commerce platform even fractional relevance improvements can affect user engagement and business metrics.
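For readers unfamiliar with the ranking metric cited above, the following Python sketch shows one common way to compute NDCG@5. It assumes graded relevance labels and a linear gain (`rel / log2(rank + 1)`); some evaluations use an exponential gain instead, and the exact variant behind the reported figures is not stated here. The function names are illustrative.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain, log2 discount)."""
    return sum(
        rel / math.log2(position + 2)  # position is 0-based, so rank 1 divides by log2(2) = 1
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int = 5) -> float:
    """DCG of the given ordering divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels listed in the order the ranker returned the items (made-up values).
print(ndcg_at_k([3, 2, 1, 0, 0]))  # already in ideal order
print(ndcg_at_k([0, 1, 3, 2, 0]))  # relevant items pushed down the list
```

The metric rewards placing the most relevant products near the top, so a summary that lifts a buried but relevant item into the first five results raises the score.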
This development aligns with Walmart's broader "Adaptive Retail" strategy, which leverages AI and generative AI to create hyper-personalized shopping experiences. Suresh Kumar, Walmart Inc. Global CTO and Chief Development Officer, emphasized the company's commitment to using technology to adapt to individual customer preferences. ReLSum's ability to improve search accuracy and efficiency contributes directly to Walmart's goal of increasing online sales and enhancing the overall digital shopping journey.