S. Roy

7 records found

Conference paper (2019) - Sohon Roy, Arie van Deursen, Felienne Hermans
Microsoft VBA (Visual Basic for Applications) is a programming language widely used by end-user programmers, often alongside the popular spreadsheet software Excel. Together they form the popular Excel-VBA application ecosystem. Despite being popular, spreadsheets are known to be fault-prone, and to minimize risk of faults in the overall Excel-VBA ecosystem, it is important to support end-user programmers in improving the code quality of their VBA programs, in addition to improving spreadsheet technology and practices. In traditional software development, automatic code inspection using static analysis tools has been found effective in improving code quality, but the practical relevance of this technique in an end-user development context remains unexplored. With the aim of popularizing it in the end-user community, in this paper we examine the relevance of automatic code inspection in terms of how inspection rules are perceived by VBA programmers. We conduct a qualitative study consisting of interviews with 14 VBA programmers, who share their perceptions about 20 inspection rules that most frequently detected code quality issues in an industrial dataset of 25 VBA applications, obtained from a financial services company. Results show that the 20 studied inspection rules can be grouped into three categories of user perceptions based on the type of issues they warn about: i) 11 rules that warn about serious problems which need fixing, ii) 7 rules that warn about bad practices which do not mandate fixing, and iii) 2 rules that warn about purposeful code elements rather than issues. Based on these perceptions, we conclude that automatic code inspection is considerably relevant in an end-user development context such as VBA. The perceptions also indicate which inspection rules deserve the most attention from interested researchers and tool developers. Lastly, our results also reveal 3 additional issue types that are not covered by the existing inspection rules, and therefore provide impetus for creating new rules. ...
Conference paper (2018) - Sohon Roy, Arie van Deursen, Felienne Hermans
Automatically inferred invariants have been found to be successful in detecting regression faults in traditional software, but their application has not been explored in the context of spreadsheets. In this paper, we investigate the effectiveness of automatically inferred invariants in detecting regression faults in spreadsheets. We conduct an exploratory empirical study on eight spreadsheets taken from the VEnron and EUSES corpora. We apply automatic invariant inference to them, create tests based on the inferred invariants, and finally seed the sheets with faults. Results indicate that the effectiveness of the inferred invariants, in terms of accuracy of fault detection, varies widely from spreadsheet to spreadsheet. The effectiveness is found to be affected by the formulas and data contained in the spreadsheets, and also by the type of faults to be detected. ...
Conference paper (2017) - Sohon Roy, Felienne Hermans, Arie van Deursen
Despite being popular end-user tools, spreadsheets are prone to errors. In software engineering, testing has been proposed as a way to address errors. It is therefore important to know whether spreadsheet users also test and, if so, how and to what extent, especially since most spreadsheet users have no training or experience in software engineering principles. To this end, we conduct a two-phase mixed methods study: first, a qualitative phase, in which we interview 12 spreadsheet users, and second, a quantitative phase, in which we conduct an online survey completed by 72 users. The outcome of the interviews, organized into four different categories, consists of an overview of test practices, perceptions of spreadsheet users about testing, a set of preventive measures for avoiding errors, and an overview of maintenance practices for ensuring correctness of spreadsheets over time. The survey adds to the findings by providing quantitative estimates indicating that ensuring correctness is an important concern, and that a major fraction of users do test their spreadsheets. However, their techniques are largely manual and lack formalism. Tools and automated support are rarely used. ...
Conference paper (2017) - Sohon Roy, Felienne Hermans, Arie van Deursen
Spreadsheets in industry are used by multiple employees in organizations, and they remain in use for several years. Maintenance of existing spreadsheets is thus common. One of the issues in maintaining spreadsheets is the fact that formulas create cell dependencies, and these dependencies are invisible to users. To address this, dependence tracing techniques have been developed, both commercially and in research. However, these techniques are effort-consuming, and are designed as separate activities that force the users to leave the context of editing spreadsheets. As such, these techniques do not adequately support usual spreadsheet maintenance tasks. In this extended abstract, we present our work in progress on a novel approach for notifying users of cell dependencies, integrated into the context of editing spreadsheets. We present a preliminary evaluation of the approach in the form of an exploratory user study with seven employees of a financial modeling company. Results show that the approach has the potential to support industrial spreadsheet users in the context of spreadsheet maintenance, as indicated by the responses of six out of seven participants. ...

An Overview of Software Engineering Approaches applied to Spreadsheets

Spreadsheets can be considered to be the world's most successful end-user programming language. In fact, one could say spreadsheets are programs. This paper starts with a comparison of spreadsheets to software: spreadsheets are similar in terms of application domains, expressive power and maintainability problems. We then reflect upon what makes spreadsheets successful: liveness, directness and an easy deployment environment seem to contribute largely to their success. Since spreadsheets are a programming language, several techniques from software engineering can be applied to them. We present an overview of such research directions, including spreadsheet testing, reverse engineering, smell detection, clone detection and refactoring. Finally, open challenges and future plans for the domain of spreadsheet software engineering are presented. ...
Conference paper (2016) - S. Roy, F. Hermans, E. Aivaloglou, J. Winter, Arie van Deursen
Spreadsheets are popular end-user computing applications and one reason behind their popularity is that they offer a large degree of freedom to their users regarding the way they can structure their data. However, this flexibility also makes spreadsheets difficult to understand. Textual documentation can address this issue, yet for supporting automatic generation of textual documentation, an important prerequisite is to extract metadata inside spreadsheets. It is a challenge, though, to distinguish between data and metadata due to the lack of universally accepted structural patterns in spreadsheets. Two existing approaches for automatic extraction of spreadsheet metadata were not evaluated on large datasets consisting of user inputs. Hence in this paper, we describe the collection of a large number of user responses regarding identification of spreadsheet metadata from participants of a MOOC. We describe the use of this large dataset to understand how users identify metadata in spreadsheets, and to evaluate two existing approaches of automatic metadata extraction from spreadsheets. The results, drawn from insights about user perception of metadata, provide us with directions to follow in order to improve metadata extraction approaches. We also identify on which types of spreadsheet patterns the existing approaches perform well and on which they perform poorly, and thus which problem areas to focus on in order to improve. ...
Conference paper (2014) - Sohon Roy, Felienne Hermans
Spreadsheet cells contain data but may also contain formulas that refer to data from other cells, perform operations on them, and render the results directly to the user. In order to understand the structure of spreadsheets, one needs to understand the formulas that control cell-to-cell dataflow. Understanding this cell-to-cell inter-relation, or dependence tracing, is easier when done visually, and therefore quite a few techniques have been proposed over the years. This paper aims to report the results of an investigative study of such techniques. The study is a first step of an attempt to evaluate the relevance of these techniques from the point of view of their benefits and effectiveness in the context of real-world spreadsheet users. Results obtained from such a study will have the potential for motivating the conception of newer and better techniques, in case it is found that the need for them is still not fully catered for. ...