Title: Scaling up data analytics in Python using multiple FPGAs
Author: Aggarwal, S. (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Al-Ars, Z. (mentor); Rellermeyer, J.S. (graduation committee); Hofstee, H.P. (graduation committee); Hoozemans, J.J. (graduation committee)
Degree granting institution: Delft University of Technology
Date: 2020-08-05
Abstract: Big data applications are becoming more commonplace due to an abundance of digital data and increasingly powerful hardware. One such class of hardware is FPGAs, which are used today in settings ranging from data centers to embedded systems. High performance, power efficiency, and reprogrammability are the primary reasons behind their wide use. Another trend in recent years has been the use of distributed data processing frameworks such as Apache Spark to improve the performance of big data applications. Traditionally, such frameworks are deployed on commodity hardware to save costs. This approach is fairly popular, with organizations often having on-premises compute clusters or using a cloud provider to access a managed cluster.
This project attempts to combine these two worlds: FPGAs and distributed data processing. We have designed an architecture that uses FPGAs, instead of CPUs, as the end devices in a compute cluster that perform the actual computation. The architecture is composed of several open source technologies and allows us to interact with an FPGA cluster using Python. Using a high-level programming language such as Python makes this system easy to use for software developers and data scientists, and also abstracts away the internal communication within the cluster. We have built prototypes based on this architecture for three hardware platforms (FPGA families) and three specific applications to demonstrate general applicability.
We have observed noticeable performance gains in these applications by scaling up the FPGA cluster.
Subject: FPGA, Distributed systems, Data Science, Fletcher
To reference this document use: http://resolver.tudelft.nl/uuid:e54bbdca-3e9f-4c23-8c89-463751193061
Part of collection: Student theses
Document type: master thesis
Rights: © 2020 S. Aggarwal
Files: PDF Shashank_Aggarwal_MSc_The ... report.pdf (508.79 KB)