Does Knowledge Distillation Matter for Large Language Model-Based Bundle Generation?

Journal Article (2026)

Author(s)

Kaidong Feng (Yanshan University)

Zhu Sun (Singapore University of Technology and Design)

Jie Yang (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Hui Fang (Shanghai University of Finance and Economics)

Xinghua Qu (ByteDance)

Wenyuan Liu (Yanshan University)

Research Group

Web Information Systems

Large Language Models, Efficiency, Recommender Systems, Knowledge Distillation, Bundle Generation

DOI related publication

https://doi.org/10.1145/3808223 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:d862cd64-c59c-49d0-a51f-b6c215d3aac1

Publication Year

2026

Language

English

Journal title

ACM Transactions on Information Systems

Issue number

4

Volume number

44

Article number

99

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Large Language Models (LLMs) have been extensively applied in various recommendation scenarios, including bundle generation, thanks to their exceptional reasoning capabilities and comprehensive knowledge. However, exploiting large-scale LLMs for bundle generation introduces significant efficiency challenges, primarily high computational costs during fine-tuning and inference due to their massive parameterization. Knowledge Distillation (KD) offers a promising solution by transferring expertise from large teacher models to more compact student models. This study systematically investigates KD approaches for bundle generation with the goal of minimizing computational demands while preserving performance. Specifically, we explore three critical research questions: (1) how does the format of distilled knowledge impact bundle generation performance? (2) to what extent does the quantity of distilled knowledge influence the performance? and (3) how do different ways of utilizing the distilled knowledge affect the performance? To support this investigation, we propose a comprehensive KD framework that (i) progressively extracts knowledge from raw data in increasingly complex forms, i.e., frequent patterns → formalized rules → deep thoughts; (ii) captures varying quantities of distilled knowledge through different sampling strategies, multi-domain accumulation, and multi-format aggregation; and (iii) exploits complementary LLM adaptation techniques (in-context learning, supervised fine-tuning, and their combination) to leverage the distilled knowledge for domain-specific adaptation and enhanced efficiency in small student models. Extensive experiments on multiple real-world datasets provide insights into how knowledge format, quantity, and utilization methods collectively shape the performance of LLM-based bundle generation, and demonstrate the significant potential of KD for more efficient yet effective LLM-based bundle generation.
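As a rough illustration of the simplest knowledge format named in the abstract (frequent patterns) and of using it through in-context learning, the sketch below counts category pairs that co-occur in known bundles and formats them into a prompt for a student model. It is not the authors' implementation: the function names, the pair-counting approach, the `min_support` threshold, the prompt wording, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_category_pairs(bundles, min_support=2):
    """Return category pairs that co-occur in at least `min_support` bundles.

    Hypothetical helper: a plain pair count stands in for whatever
    frequent-pattern extraction the paper actually uses.
    """
    counts = Counter()
    for bundle in bundles:
        # Deduplicate and sort so each unordered pair is counted once per bundle.
        categories = sorted(set(bundle))
        counts.update(combinations(categories, 2))
    frequent = [(pair, n) for pair, n in counts.items() if n >= min_support]
    # Highest support first; ties broken alphabetically for determinism.
    return sorted(frequent, key=lambda item: (-item[1], item[0]))


def build_prompt(patterns, session_items):
    """Format mined patterns as in-context knowledge (assumed prompt wording)."""
    lines = ["Categories that frequently appear together in bundles:"]
    for (a, b), n in patterns:
        lines.append(f"- {a} + {b} (seen in {n} bundles)")
    lines.append("Session items: " + ", ".join(session_items))
    lines.append("Group the session items into bundles.")
    return "\n".join(lines)


# Toy data (hypothetical): each bundle is a list of item categories.
toy_bundles = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["laptop", "mouse"],
    ["phone", "charger"],
]
patterns = frequent_category_pairs(toy_bundles)
prompt = build_prompt(patterns, ["phone", "case", "mouse"])
print(prompt)
```

In this sketch the mined patterns are passed to the model as plain text in the prompt; the same text could instead be turned into training examples for supervised fine-tuning, which is the other utilization route the abstract mentions.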
