With precise knowledge of the rules which govern a deterministic chaotic system, it is possible to interact with the system and change its dynamics. This research is part of a larger project in which chaos control is used to improve the bubbling behavior of multi-phase chemical reactors. Chaos control requires models which capture the complete behavior of the system: if we replace the system by its model, or vice versa, we should not notice a change in dynamical behavior. We restrict ourselves to data-driven models, which learn both their structure and their parameters from measured data.

In cooperation with Robert Jan de Korte [1], we use a neural network model to control the chaotic dynamics of an experimental, driven and damped pendulum. The neural network provides a nearly perfect model for this system. The gas-solids fluidized bed is a much more difficult system, because it has a large number of state variables, whereas the pendulum has only three. To get a good predictive model, the neural network approach is improved with several enhancements:
(1) inputs are compressed by weighted Principal Component Analysis;
(2) an 'error propagation' scheme is introduced, in which the model synchronizes itself with the data;
(3) the neural network is connected in parallel to a linear predictive model;
(4) a new pruning algorithm removes unused nodes from the network;
(5) a statistical test by Diks et al. [3] compares the chaotic attractors of the model-generated and measured time series.
The approach is successfully applied to benchmark tests. But Diks' test reveals that during training, the correctness of the model's attractor can jump from right to wrong from one iteration to the next. We investigate why this happens, and present an example of a model which predicts the measured data with zero error, yet has a very different attractor. This has far-reaching consequences.
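Enhancement (1), the compression of delay-vector inputs by weighted Principal Component Analysis, can be sketched as follows. This is a minimal illustration, not the implementation used in the thesis; in particular, the delay embedding and the exponentially decaying weight profile (emphasizing recent delays) are assumptions made here for concreteness.

```python
import numpy as np

def delay_embed(x, dim, tau=1):
    """Build delay vectors [x(i), x(i+tau), ..., x(i+(dim-1)*tau)]
    from a scalar time series."""
    n = len(x) - (dim - 1) * tau
    return np.array([x[i:i + (dim - 1) * tau + 1:tau] for i in range(n)])

def weighted_pca_compress(X, weights, k):
    """Weighted PCA compression: center the inputs, scale each delay
    coordinate by its weight, and keep the k leading principal
    directions of the weighted data."""
    Xw = (X - X.mean(axis=0)) * weights            # emphasize chosen delays
    U, s, Vt = np.linalg.svd(Xw, full_matrices=False)
    return Xw @ Vt[:k].T, Vt[:k]                   # scores and components

# Example: compress 8-dimensional delay vectors of a sine wave to 3
# principal components, with more weight on the most recent delays.
x = np.sin(0.1 * np.arange(500))
X = delay_embed(x, dim=8)
weights = 0.9 ** np.arange(8)[::-1]                # recent delays weigh most
Z, V = weighted_pca_compress(X, weights, k=3)
```

The compressed scores `Z` would then serve as the (much smaller) input layer of the neural network, in place of the raw delay vectors.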
It turns out that learning an attractor from measured data is a very ill-posed problem: globally, the data covers all available dimensions, but locally the data is confined to low-dimensional structures. When a global nonlinear model is trained on this data, it locally has too many degrees of freedom, and this leads to arbitrary dynamics. A Nonlinear Principal Component Regression (NLPCR) algorithm is needed, which locally detects and eliminates the unused dimensions. We develop the 'Split & Fit' (S&F) algorithm, based on a fuzzy partitioning of the input space. In each region, unused dimensions are detected with Principal Component Analysis (PCA). This algorithm is shown to keep an otherwise unstable model for a chaotic laser on the desired trajectory. Meanwhile, Robert Jan de Korte found that deterministic prediction of gas-solids fluidized beds is not feasible. But the S&F algorithm does learn the attractor of another experimental reactor, a gas-liquid bubble column with a single train of rising bubbles [2].

The S&F model paves the way for robust learning of chaotic attractors. However, real-world systems rarely meet the requirements of determinism and low dimensionality. For these systems, we recommend developing algorithms which find structure in 'noisy' nonlinear behavior. A good starting point is a probabilistic representation (kernel smoother or mixture density) of how the measured data are distributed in state space.

[1] R.J. de Korte (2000), "Controlling the Chaotic Hydrodynamics of Fluidized Beds", PhD thesis, Delft University of Technology
[2] S. Kaart (2002), "Controlling Chaotic Bubbles", PhD thesis, Delft University of Technology
[3] C. Diks, W.R. van Zwet, F. Takens, J. de Goede (1996), "Detecting differences between delay vector distributions", Physical Review E 53, pp. 2169--2176
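The core of the Split & Fit idea can be sketched as follows. This is an illustrative toy version, not the thesis algorithm: the k-means split, the Gaussian fuzzy memberships, the bandwidth heuristic, and the variance threshold are all assumptions made here. It softly partitions the input space, runs a weighted PCA in each region, and projects points onto the locally used dimensions only.

```python
import numpy as np

def split_and_fit(X, n_regions=5, var_keep=0.9, seed=0):
    """Softly partition the input space, then use PCA per region to
    detect the locally unused (low-variance) dimensions."""
    rng = np.random.default_rng(seed)
    # Split: a few k-means iterations to place region centers.
    centers = X[rng.choice(len(X), n_regions, replace=False)]
    for _ in range(20):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == r].mean(0) if np.any(labels == r)
                            else centers[r] for r in range(n_regions)])
    # Fuzzy memberships: Gaussian in the distance to each center.
    d2 = ((X[:, None] - centers) ** 2).sum(-1)
    h2 = d2.min(axis=1).mean()                     # bandwidth heuristic
    W = np.exp(-d2 / (2 * h2))
    W /= W.sum(axis=1, keepdims=True)
    # Fit: weighted PCA per region; keep only the directions needed to
    # explain a fraction `var_keep` of the local variance.
    regions = []
    for r in range(n_regions):
        w = W[:, r:r + 1]
        mu = (w * X).sum(0) / w.sum()
        C = ((w * (X - mu)).T @ (X - mu)) / w.sum()
        vals, vecs = np.linalg.eigh(C)
        vals, vecs = vals[::-1], vecs[:, ::-1]     # descending eigenvalues
        k = int(np.searchsorted(np.cumsum(vals) / vals.sum(), var_keep)) + 1
        regions.append((mu, vecs[:, :k]))
    return centers, regions

def project(x, centers, regions):
    """Map a point onto the local principal subspace of its nearest
    region, suppressing the locally unused dimensions."""
    r = np.argmin(((x - centers) ** 2).sum(-1))
    mu, V = regions[r]
    return mu + V @ (V.T @ (x - mu))

# Example: noisy points along a 1-D structure embedded in 3-D. Each
# region detects a single used dimension; projection suppresses the
# off-structure noise, illustrating how S&F stabilizes a model.
rng = np.random.default_rng(1)
t = rng.uniform(0, 10, 500)
v = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
X = t[:, None] * v + 0.05 * rng.normal(size=(500, 3))
centers, regions = split_and_fit(X, n_regions=5, var_keep=0.9)
P = np.array([project(x, centers, regions) for x in X])
```

In the same spirit, the model's local predictions would then be restricted to the detected subspaces, which is what prevents the excess degrees of freedom from producing arbitrary dynamics.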