L.E. Hoogland

2 records found

Modern Artificial Intelligence (AI) applications, such as Deep Neural Networks (DNNs), require substantial amounts of data to carry out a classification or recognition task. This data must be retrieved from memory and supplied to the processor, and the results must finally be stored back in memory. In Von Neumann architectures, this data movement incurs significant performance costs, leaving the CPU with many idle cycles while it waits for data to arrive. One way of addressing this issue is to investigate alternative computing paradigms, such as Computation in Memory (CIM). In CIM architectures, the processor and the memory are integrated into one physical location. As such, computations are performed directly in the memory core, without the data having to be transferred to a central processor. A promising technology for efficiently implementing CIM crossbar arrays is the emerging Ferroelectric Field Effect Transistor (FeFET), in which data is stored in a non-volatile manner in the polarization state of a ferroelectric layer.

In the existing literature, CIM crossbar arrays are optimized for the inference task but do not perform the learning task locally. This means the neural network is trained externally, for example using cloud computing, and the weights are written to the physical crossbar array only once training is finished. For medical applications, such as ECG classification, sending sensitive medical data to the cloud for training raises privacy concerns. A solution to this problem is on-chip learning: training the network locally in the crossbar itself.

This thesis focuses on integrating FeFET technology into a CIM architecture to design a crossbar array that supports on-chip learning for Convolutional Neural Networks. The accelerator overcomes the memory wall inherent to Von Neumann machines by embracing the CIM framework, and uses FeFET devices to overcome the scaling walls associated with CMOS technology. The result is a novel accelerator that leverages the parallelism of analog crossbars to optimize the inference task and forward propagation, while leveraging the accuracy of digital crossbars to optimize the back propagation task. ...
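
To illustrate why a crossbar array suits the forward pass, the following is a minimal sketch of an idealized analog crossbar, not the design from the thesis. It assumes weights are stored as cell conductances and inputs are applied as row voltages, so that each column current is the sum of voltage times conductance along that column. The function name `crossbar_mvm` and the example values are hypothetical, and device effects such as noise, non-linearity, and limited precision are ignored.

```python
def crossbar_mvm(conductances, voltages):
    """Column currents of an idealized crossbar.

    conductances[i][j] is the conductance of the cell at row i, column j.
    voltages[i] is the voltage applied to row i.
    Returns I_j = sum_i voltages[i] * conductances[i][j] for every column j,
    i.e. one matrix-vector product obtained in a single read step.
    """
    n_rows = len(conductances)
    if len(voltages) != n_rows:
        raise ValueError("one input voltage is required per crossbar row")
    n_cols = len(conductances[0])
    return [
        sum(voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


# Hypothetical 2x2 array: two inputs (rows) feeding two outputs (columns).
weights = [[0.0, 1.0],
           [0.5, 0.25]]
inputs = [1.0, 2.0]
currents = crossbar_mvm(weights, inputs)
```

In hardware, all column sums form simultaneously, which is the parallelism the abstract refers to; the nested loop here only models the arithmetic.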

Data Compression and Nearest Neighbour Search

One of the main problems with Instance-level Image Retrieval in video data is that, for longer query videos or a large number of query images, comparing every query image to every extracted frame is inefficient. This thesis aims to solve this problem by implementing Nearest Neighbour Search (NNS) algorithms and data compression methods that significantly reduce total comparison time. In most NNS use cases, the reference data is available before the system reaches the user, allowing methods such as ANNOY or HNSW to partition the data beforehand. However, little research has been done into partitioning the data at run-time. This thesis discusses the use of NNS and data compression methods for matching a query image to a query video, both of which are provided at run-time. The result is an implementation of several state-of-the-art NNS and data compression methods in a system which, based on the number of query images and the number of extracted keyframes, selects the optimal comparison method and, if applicable, its optimal parameters. ...
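
The exhaustive comparison that the abstract describes as the baseline can be sketched as follows. This is an illustrative assumption, not the system from the thesis: images are represented as plain feature vectors compared by Euclidean distance, and the names `nearest_keyframe`, `match_all`, and `choose_method`, as well as the `max_comparisons` threshold, are hypothetical. The selector only shows the idea of switching method based on the two counts; the actual selection criteria are not given in the abstract.

```python
import math


def nearest_keyframe(query, keyframes):
    """Return (index, distance) of the keyframe closest to the query vector."""
    best_index, best_dist = -1, math.inf
    for index, frame in enumerate(keyframes):
        dist = math.dist(query, frame)  # Euclidean distance
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist


def match_all(queries, keyframes):
    """Exhaustive search: len(queries) * len(keyframes) distance computations."""
    return [nearest_keyframe(query, keyframes)[0] for query in queries]


def choose_method(n_queries, n_keyframes, max_comparisons=1_000_000):
    """Hypothetical rule: fall back to an indexed method when the
    exhaustive comparison count exceeds an assumed budget."""
    if n_queries * n_keyframes <= max_comparisons:
        return "exhaustive"
    return "indexed"


# Hypothetical 2-D descriptors for three keyframes and two query images.
keyframes = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
queries = [(0.9, 1.2), (4.0, 4.5)]
matches = match_all(queries, keyframes)
```

Because the cost of the exhaustive search grows with the product of the two counts, building a partition at run-time only pays off once that product is large enough to offset the build time.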