In current simulation-based training for knowledge-intensive tasks, human instructors are needed to evaluate a student's task performance. This paper reports a study on the development of a multi-agent-based training system that evaluates student behavior at the result level (quality of performance) and at the process level (appropriateness of the approach taken). The system uses expert and error modeling as well as plan recognition to evaluate and diagnose student behavior. Furthermore, it keeps track of this behavior over time and generates feedback on the student's task performance after either a single trial or a series of trials. Exploratory results suggest that the system can correctly diagnose the behavior of students. © 2008 IEEE.