Human understanding and interpretability of AI models have become a growing concern in policymaking, particularly in urban planning. As AI models become more complex and opaque, decision-makers struggle to translate and explain their outcomes in human terms, leading to policy models that lack explainability and human-understandable outcomes. This research explores whether applying Explainable Artificial Intelligence (XAI) techniques, specifically Local Interpretable Model-agnostic Explanations (LIME), can enhance the interpretability of computer vision-enriched discrete choice models (CVDCMs) for street view images in urban analysis. CVDCMs assess the liveability of areas by analyzing the full street view image to determine a liveability score. This differs from traditional urban models that apply object detection to street view images, relating liveability to a predefined set of objects and measuring their feature importances, which introduces modeller bias through the choice of that object set. CVDCMs, in combination with XAI, can avoid this specific modeller bias, providing a more holistic view of human perception of urban liveability and improving the CVDCM’s utility for policy advice by potentially yielding new insights into urban liveability.
The methods employed in this research include applying CVDCMs in a face validity analysis of street view images to gain an initial understanding of the model’s decision behaviour. LIME was then used to generate explanations for these model decisions by segmenting the images, perturbing the segments to create modified images, and analyzing the impact on the classification of the street views as ’good’ or ’bad’. This approach highlights which parts of a street view image contribute most to preferred liveability, providing a more comprehensive view of human perception. LIME can expose the complex decision behaviour of the CVDCM by considering the entire street view image as a possible explanation for a liveability score, without requiring a predefined set of factors. This contrasts with non-XAI methods, which leave users without insight into how the model arrives at its results. By not relying on prior object detection, LIME avoids modeller bias and allows for a more holistic view of human perception, enhancing the interpretability and transparency of CVDCMs. The LIME segmentations were evaluated against ground truth images, and the LIME explanations were assessed with several metrics (Binary Classification Ratio, Coefficient of Variation, and Probability Distribution Uniformity Metric), ensuring robust and reliable interpretability of both the segmentations and the explanations.
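As a rough sketch of this pipeline (not the exact implementation used in this research), the standard lime Python package can generate such image explanations as shown below. The CVDCM is represented by a trivial brightness-based placeholder classifier, and the image, function names, and parameter values are illustrative assumptions.

```python
import numpy as np
from lime import lime_image
from lime.wrappers.scikit_image import SegmentationAlgorithm

def cvdcm_classifier_fn(images):
    """Placeholder for the CVDCM: maps a batch of images (n, H, W, 3) to
    probabilities for the classes ['bad', 'good'] liveability.
    A trivial brightness-based stand-in is used so the sketch runs."""
    images = np.asarray(images, dtype=float)
    p_good = images.mean(axis=(1, 2, 3)) / 255.0
    return np.column_stack([1.0 - p_good, p_good])

# Stand-in street view image; in practice this is a real street view photo.
street_view_image = np.random.RandomState(0).randint(0, 256, (256, 256, 3)).astype("double")

explainer = lime_image.LimeImageExplainer()

# Quickshift superpixels (the lime default for images); parameter values are assumptions.
segmenter = SegmentationAlgorithm("quickshift", kernel_size=4, max_dist=200, ratio=0.2)

explanation = explainer.explain_instance(
    street_view_image,
    cvdcm_classifier_fn,
    top_labels=2,
    hide_color=0,              # colour used to mask perturbed superpixels
    num_samples=200,           # number of perturbed images sampled around the original
    segmentation_fn=segmenter,
)

# Superpixels that contribute most to the top predicted class.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
```

The perturbed samples are generated by switching superpixels on and off, so the quality of the resulting explanation depends directly on how meaningful those superpixels are.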
The study revealed several significant findings. The quality of LIME explanations is highly dependent on the segmentation process, as earlier research has also found. Street view images, which contain a variety of complex objects, make meaningful segmentation difficult to achieve. While ground truth analysis can improve segmentation quality, it is labour-intensive and prone to inconsistencies. The lack of semantically meaningful segmentation results in explanations of the model’s outcomes that are not human-understandable, and therefore does not increase the interpretability of CVDCMs. Additionally, LIME’s sampling process produces perturbed images that differ substantially from the original image. This affects the utility-based liveability score and necessitates a unique classification threshold for each image in the deterministic classification, complicating comparisons and reducing explanation coherence. The probabilistic classification avoids this, but its classification probabilities are not adequately distributed, resulting in scientifically less valid LIME explanations. Moreover, the computational intensity of the LIME methodology limits its scalability to the large datasets required for comprehensive urban policy analysis. It is therefore concluded that, with the methodology as applied here, the interpretability of the complex model is not increased and no human-understandable explanations are created. This research uniquely identifies the complications of combining CVDCMs with LIME image analysis and suggests that simpler object detection models combined with LIME tabular explanations could offer a valuable alternative.
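To make the thresholding issue concrete, the sketch below shows one way a utility-based liveability model could be wrapped as the binary classifier interface LIME expects, with an image-specific threshold. The utility_model interface and the threshold choice are assumptions for illustration, not the exact implementation used in this research.

```python
import numpy as np

def make_classifier_fn(utility_model, threshold):
    """Wrap a utility-based (regression) liveability model into the binary
    'bad'/'good' classifier interface that LIME expects.
    `threshold` has to be chosen per image, because the perturbed samples
    shift the utility distribution away from that of the original image."""
    def classifier_fn(images):
        utilities = np.asarray(utility_model.predict(np.asarray(images)))
        p_good = (utilities >= threshold).astype(float)   # deterministic classification
        return np.column_stack([1.0 - p_good, p_good])    # columns: ['bad', 'good']
    return classifier_fn
```

Because the threshold differs per image, the resulting explanations are not directly comparable across street views, which is the coherence problem described above.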
In response to these findings, several recommendations are proposed to increase interpretability and create human-understandable explanations. Using street view-trained segmentation models for superpixel creation (via object detection) can yield human-understandable LIME explanations by providing semantically meaningful segmentations, and could also reduce the computational time of the LIME method; this reintroduces some bias but still enables analysis of the holistic view of human perception (see the sketch below). Further research on the distribution of utility scores in the sampling process, and experiments with different sampling parameters or distributions, could improve the LIME explanations. Additionally, integrating regression models directly within the LIME framework, rather than converting them to classification models, could avoid the pitfalls of threshold-based classification. These recommendations would make the explanations more logical and actionable, aligning them with the process of urban policy advice creation and enabling their use for municipal policy advice.
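As an illustration of the first recommendation, the sketch below passes a custom segmentation function to LIME so that superpixels follow semantically meaningful regions. The function shown is a stand-in stub; in practice it would call a street view-trained semantic segmentation model (for example, one trained on a street-scene dataset such as Cityscapes), and the image and classifier would be the CVDCM wrappers from the earlier sketch. All names and parameter values are illustrative assumptions.

```python
import numpy as np
from lime import lime_image

def semantic_segmentation_fn(image):
    """Return an integer segment label for every pixel.
    Stand-in: a coarse 4x4 grid. A real implementation would run a
    street view-trained semantic segmentation model so that segments
    correspond to objects such as road, building, or vegetation."""
    h, w = image.shape[:2]
    rows = np.minimum(np.arange(h) * 4 // h, 3)
    cols = np.minimum(np.arange(w) * 4 // w, 3)
    return (rows[:, None] * 4 + cols[None, :]).astype(int)

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    street_view_image,            # image and classifier from the earlier sketch
    cvdcm_classifier_fn,
    top_labels=2,
    num_samples=200,              # fewer samples suffice with fewer, larger segments
    segmentation_fn=semantic_segmentation_fn,
)
```

For the third recommendation, it is worth noting that the tabular variant of LIME (LimeTabularExplainer) already supports mode='regression', so an object detection feature set combined with tabular explanations could explain the utility-based liveability score directly, without any classification threshold.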