Advances in deep reinforcement learning (RL) open the door to robust flight control systems (FCS) with the potential to improve both safety and performance under off-nominal flight conditions. Simulation-based work on offline-RL FCS has already demonstrated robustness to adverse weather, mechanical failures, and a wide range of operational conditions, but it has neglected important dynamical phenomena that limit the applicability of these results to reality. In anticipation of a future flight-testing campaign of similar RL-based FCS, this research emulates the transition from simulation to reality by modelling prevalent sensor and actuator dynamics, and introduces a method to incorporate a long short-term memory (LSTM) artificial neural network (ANN) into the policy of a Soft Actor-Critic (SAC) agent. The approach is found to substantially reduce the controller's sensitivity to sensor noise and actuator dynamics, while increasing its robustness to delays, in comparison with the ubiquitous feedforward deep neural network (DNN) and a traditional linear controller.
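
To make the architectural idea concrete, the sketch below shows one common way an LSTM can be embedded in the actor of a SAC agent: the recurrent hidden state summarises the observation history, which is what allows the policy to filter sensor noise and compensate for delays. This is a minimal, hypothetical illustration, not the authors' implementation; all names (`LSTMGaussianActor`, `obs_dim`, `act_dim`, `hidden_size`) and the choice of PyTorch are assumptions.

```python
# Minimal sketch (assumed, not the paper's code) of an LSTM-based SAC actor.
import torch
import torch.nn as nn

class LSTMGaussianActor(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, hidden_size: int = 64):
        super().__init__()
        # Recurrent core: its hidden state carries a summary of past
        # observations, helping to attenuate noise and bridge delays.
        self.lstm = nn.LSTM(obs_dim, hidden_size, batch_first=True)
        self.mu = nn.Linear(hidden_size, act_dim)       # Gaussian policy mean
        self.log_std = nn.Linear(hidden_size, act_dim)  # Gaussian policy log-std

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim); hidden: optional (h0, c0) LSTM state.
        out, hidden = self.lstm(obs_seq, hidden)
        mu = self.mu(out)
        log_std = self.log_std(out).clamp(-20, 2)  # common SAC stability bounds
        dist = torch.distributions.Normal(mu, log_std.exp())
        # Reparameterised sample squashed by tanh, as in standard SAC.
        pre_tanh = dist.rsample()
        action = torch.tanh(pre_tanh)
        # Log-probability with the tanh change-of-variables correction.
        log_prob = dist.log_prob(pre_tanh) - torch.log(1 - action.pow(2) + 1e-6)
        return action, log_prob.sum(-1), hidden

# Usage: one control step on a (hypothetical) 10-state, 3-actuator aircraft,
# carrying the recurrent state between steps so the policy acts on history.
actor = LSTMGaussianActor(obs_dim=10, act_dim=3)
obs = torch.randn(1, 1, 10)          # (batch=1, time=1, obs_dim)
action, logp, state = actor(obs)     # pass `state` into the next call
```

The key design point the sketch illustrates is that, unlike a feedforward DNN policy that maps each noisy observation directly to an action, the recurrent state is threaded through successive control steps, giving the policy an internal memory with which to infer the true aircraft state.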