Detecting Dish Types in Picnic Deliveries

Noorthoek, Sterre

Title

Detecting Dish Types in Picnic Deliveries

Author

Noorthoek, Sterre (TU Delft Electrical Engineering, Mathematics and Computer Science)

Contributor

Yorke-Smith, N. (mentor)
Vlaming, B. (mentor)
Yang, J. (graduation committee)

Degree granting institution

Delft University of Technology

Programme

Computer Science

Date

2022-08-26

Abstract

In addition to delivering groceries to customers’ doorsteps, online supermarket Picnic aims to improve customer satisfaction, for instance by providing cooking inspiration through a recently launched recipe page in the app. This feature presents new recipes weekly and allows customers to easily add the ingredients to their shopping basket. The feature has raised interest in finding out what dishes customers are cooking, as this could help in choosing recipes for the page, predicting which articles are forgotten before checkout, and building a recipe recommender system. Hence, this work proposes two models to detect dish types in Picnic deliveries. The problem is scoped to detecting main meals from a specified list of dish types in deliveries ordered in the Netherlands. The first model, named the Frequent Itemset Model, applies unsupervised learning techniques. First, the articles in the deliveries are pre-processed by removing certain articles, choosing the representation of articles, and cleaning the text. The itemsets that represent core ingredients are then obtained by applying techniques such as frequent itemset mining, association rule mining, and hierarchical clustering. In the final step, itemsets are matched to dish types using programmatic labelling and fuzzy string matching. Newly available labelled data enables the creation of a second model, referred to as the Supervised Learning Model, which applies supervised learning techniques. Features are selected and extracted, and multiple machine learning models, some in combination with binary relevance, are compared. The models are evaluated on two datasets: a large, weakly labelled dataset obtained through the recipe page, and a very small, manually labelled dataset with deliveries from a single customer. Evaluating the performance of the models is challenging, since no large, truly labelled dataset is available. The results nevertheless indicate that the Frequent Itemset Model is able to detect common dish types, and that the Supervised Learning Model is able to detect dish types that are similar to the Picnic recipes it was trained on. Multiple suggestions are made for future work, such as obtaining a larger variety of labelled data and redefining the class labels. The contributions of this work are the formulation of the problem, two proposed solutions, insights into the challenges, and suggestions for future work.
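
As a rough illustration of the idea behind the Frequent Itemset Model, the sketch below counts article combinations that recur across deliveries and then matches each frequent itemset to a dish type with fuzzy string matching. It is not the thesis implementation: the deliveries, the dish keyword lists, the support and similarity thresholds, and all function names are hypothetical, the itemsets are found by brute-force enumeration rather than a dedicated mining algorithm, and the association rule mining and hierarchical clustering steps are left out.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy deliveries; each is a set of cleaned article names.
deliveries = [
    {"spaghetti", "minced beef", "tomato sauce", "milk"},
    {"spaghetti", "minced beef", "tomato sauce", "onion"},
    {"tortilla wraps", "minced beef", "salsa", "milk"},
    {"tortilla wraps", "minced beef", "salsa"},
    {"spaghetti", "tomato sauce", "bread"},
]

# Hypothetical dish types, each with a few indicative ingredient keywords.
dish_keywords = {
    "pasta bolognese": ["spaghetti", "minced beef", "tomato sauce"],
    "wraps": ["tortilla wrap", "salsa", "minced beef"],
}


def frequent_itemsets(transactions, min_support, size):
    """Return itemsets of `size` articles found in at least `min_support` deliveries."""
    counts = Counter()
    for transaction in transactions:
        # Brute-force enumeration; fine for a toy example, too slow at scale.
        for combo in combinations(sorted(transaction), size):
            counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def fuzzy_score(itemset, keywords):
    """Average, over the keywords, of the best string similarity to any article."""
    best_per_keyword = [
        max(SequenceMatcher(None, keyword, article).ratio() for article in itemset)
        for keyword in keywords
    ]
    return sum(best_per_keyword) / len(best_per_keyword)


def label_itemset(itemset, threshold=0.8):
    """Return the best-matching dish type, or None if no dish scores above the threshold."""
    best = max(dish_keywords, key=lambda dish: fuzzy_score(itemset, dish_keywords[dish]))
    return best if fuzzy_score(itemset, dish_keywords[best]) >= threshold else None


# Itemsets of three articles that occur in at least two deliveries, with their labels.
labels = {
    itemset: label_itemset(itemset)
    for itemset in frequent_itemsets(deliveries, min_support=2, size=3)
}

for itemset, dish in labels.items():
    print(sorted(itemset), "->", dish)
```

The fuzzy comparison lets the keyword "tortilla wrap" match the article "tortilla wraps" even though the strings differ, which is the kind of variation in article names that exact matching would miss.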

Subject

Dishes
Recipes
Online grocery
Multi-label classification
Unsupervised learning
Supervised learning
Machine learning

To reference this document use:

http://resolver.tudelft.nl/uuid:31cfd9de-4f74-43bf-bebe-9145e75d6e58

Embargo date

2024-08-26

Part of collection

Student theses

Document type

master thesis

Rights

Files

file embargo until 2024-08-26