Title: Detecting Dish Types in Picnic Deliveries
Author: Noorthoek, Sterre (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Yorke-Smith, N. (mentor); Vlaming, B. (mentor); Yang, J. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science
Date: 2022-08-26
Abstract: In addition to delivering groceries at customers’ doorsteps, online supermarket Picnic goes the extra mile by aiming to improve customer satisfaction, for instance by providing cooking inspiration through a recently launched recipe page in the app. This feature presents new recipes weekly and allows customers to easily add the ingredients to their shopping basket. This has raised interest in finding out what dishes customers are cooking, as that knowledge could help in choosing recipes for the page, predicting which articles are forgotten before checkout, and building a recipe recommender system. Hence, this work proposes two models to detect dish types in Picnic deliveries. The problem is scoped to detecting main meals from a specified list of dish types in deliveries ordered in the Netherlands. The first model, named the Frequent Itemset Model, applies unsupervised learning techniques. First, the articles in the deliveries are pre-processed by removing certain articles, choosing the representation of articles, and cleaning the text. The itemsets that represent core ingredients are then obtained by applying techniques such as frequent itemset mining, association rule mining, and hierarchical clustering. In the final step, itemsets are matched to dish types using programmatic labelling and fuzzy string matching. Newly available labelled data enables the creation of a second model, referred to as the Supervised Learning Model, which applies supervised learning techniques. Features are selected and extracted.
Multiple machine learning models, some in combination with binary relevance, are compared. The models are evaluated on two datasets: a large, weakly labelled dataset obtained through the recipe page, and a very small, manually labelled dataset with deliveries from a single customer. Evaluating the performance of the models is challenging, since no large, truly labelled dataset is available. The results nevertheless indicate that the Frequent Itemset Model is able to detect common dish types, and that the Supervised Learning Model is able to detect dish types similar to the Picnic recipes it was trained on. Multiple suggestions are made for future work, such as obtaining a larger variety of labelled data and redefining the class labels. The contributions of this work are the formulation of the problem, two proposed solutions, insights into the challenges, and suggestions for future work.
Subject: Dishes, Recipes, Online grocery, Multi-label classification, Unsupervised learning, Supervised learning, Machine learning
To reference this document use: http://resolver.tudelft.nl/uuid:31cfd9de-4f74-43bf-bebe-9145e75d6e58
Embargo date: 2024-08-26
Part of collection: Student theses
Document type: master thesis
Rights: © 2022 Sterre Noorthoek
Files: file under embargo until 2024-08-26