Automatically extracting class diagrams from spreadsheets

Abstract

The use of spreadsheets to capture information is widespread in industry. Spreadsheets can thus be a rich source of domain information. We propose to automatically extract this information and transform it into class diagrams. The resulting class diagram can be used by software engineers to understand, refine, or re-implement the spreadsheet’s functionality. To enable the transformation into class diagrams, we create a library of common spreadsheet usage patterns. These patterns are located in the spreadsheet using a two-dimensional parsing algorithm. The resulting parse tree is then transformed and enriched with information from the library. We evaluate our approach on the spreadsheets from the EUSES Spreadsheet Corpus by comparing a subset of the generated class diagrams with reference class diagrams created manually.

Preprint accepted for publication in Proceedings of the 24th European Conference on Object-Oriented Programming (ECOOP 2010), Maribor (Slovenia), 21-25 June 2010, Lecture Notes in Computer Science, Springer-Verlag, 2010.