Beyond Traditional Lexing

None, None

Beyond Traditional Lexing

Exploiting SIMD Instructions for Tokenizing C

Bachelor Thesis (2024)

Author(s)

A. Bolfă (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

D.G. Sprokholt – Mentor (TU Delft - Programming Languages)

S.S. Chakraborty – Mentor (TU Delft - Programming Languages)

Burcu Kulahcioglu Ozkan – Graduation committee member (TU Delft - Software Engineering)

Faculty

Electrical Engineering, Mathematics and Computer Science

Parallelism SIMD Lexical Analysis

To reference this document use:

https://resolver.tudelft.nl/uuid:14b9d92f-77e9-451c-a79b-0037535e65b1

More Info

expand_more

Publication Year

2024

Language

English

Graduation Date

23-06-2024

Awarding Institution

Delft University of Technology

Project

['CSE3000 Research Project']

Programme

['Computer Science and Engineering']

Faculty

Electrical Engineering, Mathematics and Computer Science

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Over the past decades, Single Instruction, Multiple Data (SIMD) instructions have become common- place in conventional hardware. Lexical analysis, the first stage of compilation, can take advantage of this by splitting its workload across sub lexers that identify groups of tokens with similar structures. Each sub lexer can leverage SIMD to search for these structures across multiple characters in paral- lel and extract token positions efficiently. This pa- per presents a lexer with this architecture for the C programming language, along with implementation details for x86/x64 processors. Benchmark results show a 12x speed up in throughput compared to the lexer found in GCC, a state-of-the-art compiler.

Files

Alexandru_bolfa_final_paper.pd... (pdf)

(pdf | 0.203 Mb)

License info not available