A Dataset of Scratch Programs

Scraped, Shaped and Scored

Conference Paper (2017)
Author(s)

E.A. Aivaloglou (TU Delft - Software Engineering)

Felienne Hermans (TU Delft - Software Engineering)

Jesús Moreno-León (King Juan Carlos University)

Gregorio Robles (King Juan Carlos University)

Research Group
Software Engineering
Copyright
Campus only
DOI related publication
https://doi.org/10.1109/MSR.2017.45
More Info
Publication Year
2017
Language
English
Pages (from-to)
511-514
ISBN (electronic)
978-1-5386-1544-7

Abstract

Scratch is increasingly popular, both as an introductory programming language and as a research target in the computing education research field. In this paper, we present a dataset of 250K recent Scratch projects from 100K different authors scraped from the Scratch project repository. We processed the projects' source code and metadata to encode them into a database that facilitates querying and further analysis. We further evaluated the projects in terms of programming skills and mastery, and included the project scoring results. The dataset enables the analysis of the source code of Scratch projects, of their quality characteristics, and of the programming skills that their authors exhibit. The dataset can be used for empirical research in software engineering and computing education.

Metadata only record. There are no files for this record.