As communication capacity continues to expand, the application of deep neural networks (DNNs) for digital pre-distortion (DPD) has become increasingly prominent in addressing non-linearity issues in wideband power amplifiers (PAs). The advent of the fifth-generation (5G) era imposes higher requirements on DPD in terms of frequency and latency. The integration of multiple-input multiple-output (MIMO) technology and micro base stations has driven the trend towards low-power, small-area DPD chips. This paper presents a high-performance, Gated Recurrent Unit (GRU)-based hardware architecture characterized by high parallelism and low resource consumption, enabling real-time DPD signal processing. A novel method is proposed that employs quantization-aware training (QAT) with Hardsigmoid and Hardtanh functions to quantize the floating-point model in software. The optimized algorithm is implemented in hardware with inter-layer pipelining and retiming to improve timing and increase the clock frequency. Additionally, the hardware-efficient piecewise-linear functions Hardsigmoid and Hardtanh are used as activation functions to minimize hardware overhead. Experimental results demonstrate that the hardware implementation achieves an Adjacent Channel Power Ratio (ACPR) of 49.48 dBc and an Error Vector Magnitude (EVM) of 46.05 dB, showing minimal degradation compared to the floating-point model (49.58 dBc / 46.70 dB). Simulated in 22 nm CMOS technology, the DPD chip operates at 2 GHz, occupies an area of 0.047 mm², and can handle signals with bandwidths up to 70 MHz. The peak throughput reaches 256.5 GOp/s, while the power efficiency reaches 1.3154 TOp/s/W.
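As an illustrative sketch only (not the authors' implementation), the snippet below shows a GRU cell whose sigmoid and tanh gates are replaced by PyTorch's built-in Hardsigmoid and Hardtanh, the kind of piecewise-linear activations referred to above; a fake-quantization (QAT) pass could then be applied to such a model before mapping it to hardware. The class name `HardGRUCell` and the layer sizes are hypothetical.

```python
import torch
import torch.nn as nn

class HardGRUCell(nn.Module):
    """GRU cell using piecewise-linear Hardsigmoid/Hardtanh gates (illustrative sketch)."""
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # combined projections for the reset, update, and candidate gates
        self.x2h = nn.Linear(input_size, 3 * hidden_size)
        self.h2h = nn.Linear(hidden_size, 3 * hidden_size)
        self.hsig = nn.Hardsigmoid()   # hardware-friendly replacement for sigmoid
        self.htanh = nn.Hardtanh()     # hardware-friendly replacement for tanh

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        xr, xz, xn = self.x2h(x).chunk(3, dim=-1)
        hr, hz, hn = self.h2h(h).chunk(3, dim=-1)
        r = self.hsig(xr + hr)          # reset gate
        z = self.hsig(xz + hz)          # update gate
        n = self.htanh(xn + r * hn)     # candidate hidden state
        return (1.0 - z) * n + z * h    # new hidden state

# hypothetical usage: one time step of I/Q samples through the cell
cell = HardGRUCell(input_size=2, hidden_size=16)
h = torch.zeros(1, 16)
y = cell(torch.randn(1, 2), h)
```

Because Hardsigmoid and Hardtanh are built from clamps and shifts rather than exponentials, they map to simple comparators and adders on silicon, which is consistent with the low-overhead activation strategy described in the abstract.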