Background and aims: The recently developed histological scoring system for non-alcoholic fatty liver disease (NAFLD) by the NASH Clinical Research Network (NASH-CRN) has been widely used in clinical settings and is increasingly employed in preclinical research as well. However, it has not been systematically analyzed whether the human scoring system can be directly transferred to preclinical rodent models. To analyze this, we systematically compared human NAFLD liver pathology, using human liver biopsies, with the liver pathology of several NAFLD mouse models. Based upon the features pertaining to mouse NAFLD, we aimed to establish a modified generic scoring system that is applicable to a broad spectrum of rodent models.

Methods: The histopathology of NAFLD was analyzed in several different mouse models of NAFLD to define generic criteria for histological assessment (preclinical scoring system). For validation of this scoring system, 36 slides of mouse livers, covering the whole spectrum of NAFLD, were blindly analyzed by ten observers. Additionally, the livers were blindly scored by one observer during two separate assessments more than 3 months apart.

Results: The criteria of macrovesicular steatosis, microvesicular steatosis, hepatocellular hypertrophy, inflammation and fibrosis were generally applicable to rodent NAFLD. The inter-observer reproducibility between the ten observers, evaluated using the intraclass correlation coefficient (ICC), was high for the analysis of macrovesicular steatosis and microvesicular steatosis (ICC = 0.784 and 0.776, respectively; both p < 0.001) and moderate for the analysis of hypertrophy and inflammation (ICC = 0.685 and 0.650, respectively; both p < 0.001). The intra-observer reproducibility between the different observations of one observer was high for the analysis of macrovesicular steatosis, microvesicular steatosis and hypertrophy (ICC = 0.871, 0.871 and 0.896, respectively; all p < 0.001) and very high for the analysis of inflammation (ICC = 0.931, p < 0.001).

Conclusions: We established a simple NAFLD scoring system with high reproducibility that is applicable to different rodent models and to all stages of NAFLD etiology.
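
The reproducibility figures in this abstract are intraclass correlation coefficients computed from a table of scores, with one row per liver slide and one column per observer (or per assessment, for the intra-observer case). The abstract does not state which ICC form was used. The sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the example scores are illustrative only and are not taken from the study.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: 2-D array-like, rows = rated subjects (e.g. slides),
            columns = raters (e.g. observers).
    Assumption: this ICC form is chosen for illustration; the source
    does not say which form the authors used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape  # n subjects, k raters

    grand_mean = x.mean()
    row_means = x.mean(axis=1)  # mean score per subject
    col_means = x.mean(axis=0)  # mean score per rater

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical steatosis grades (0-3) for 5 slides scored by 3 observers
    example_scores = [
        [0, 0, 1],
        [1, 1, 1],
        [2, 1, 2],
        [3, 3, 2],
        [3, 3, 3],
    ]
    print(f"ICC(2,1) = {icc_2_1(example_scores):.3f}")
```

For the inter-observer analysis the table would have 36 rows and ten columns. For the intra-observer analysis it would have 36 rows and two columns, one per assessment. A different ICC form, such as a consistency or average-measures form, would give different values from the same table.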