Evaluation of linear regression and neural network methods for estimating occupancy in office buildings using bGrid sensor data

Abstract

This thesis describes how data measured with the bGrid system was used to estimate the number of people in two rooms in the Microsoft office at Schiphol. The main objective is to derive a relation that maps the sensor data to a specific number of people. The bGrid system consists of a network of sensor nodes that measure CO2 concentration, movement intensity, relative humidity, ambient temperature, infrared object temperature and sound intensity. These nodes are strategically placed throughout the building and are interconnected through a gateway that is also part of the bGrid system. The two offices studied each had an area of roughly 16 m², a capacity of eight people and two bGrid sensor nodes. Data was collected on two days, separated by one day, yielding 1396 minutes of data. Ground truth data was collected by direct observation from outside the offices so as not to interfere with the measurements. Since the offices were of equal area and capacity, were both used for the same purpose and each contained two sensor nodes, the rooms were deemed equivalent and their data could therefore be combined. On the first day only room 1 could be observed, but on the second day both rooms could be observed. The data from room 1 was used to synthesise occupancy models, while the data from room 2 was used for validation. Two methods were used to model the occupancy. The first, based on Multiple Linear Regression, used a combination of the movement intensity and CO2 concentration to achieve a performance of 93.49%. Performance is defined here as the number of times the model returns a number of people within one person of the observed occupancy, expressed as a percentage of the total number of iterations. This result was achieved by first splitting the data by observed occupancy and deriving separate regression coefficients for each occupancy level. 
These coefficients were used in a Simulink model that actively chose which regression coefficients to apply based on the input. The second method used a fully connected neural network with three hidden layers and a combination of hyperbolic tangent and leaky ReLU activation functions. With the ambient temperature, relative humidity, movement intensity and CO2 data as inputs, it achieved a performance of 93.86%. The three hidden layers contained 19, 21 and 39 neurons respectively. These numbers were obtained by running a genetic algorithm for 43 iterations, with the number of neurons per hidden layer and the learning rate as optimisation variables. The two resulting models estimated the number of people in room 2 to within one person of the observed number 93.49% and 93.86% of the time, respectively. Once verified further with additional data, the models could be used to aid HVAC systems with ambient temperature control: a reliable metric for occupancy allows people to be added to the energy balance of a room, which in turn allows for more accurate models of the indoor thermal climate. Such models would enable model-based temperature controllers in place of the PID-type controllers currently used in most office thermostats.
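The performance metric defined in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, which used Simulink: the function name `within_one_accuracy`, the `tolerance` parameter and the sample values are hypothetical, and the sketch assumes the model output has already been converted to a whole number of people.

```python
def within_one_accuracy(predicted, observed, tolerance=1):
    """Percentage of samples whose estimate lies within `tolerance`
    people of the observed occupancy (hypothetical helper)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not predicted:
        raise ValueError("at least one sample is required")
    # Count the samples where the absolute error does not exceed the tolerance.
    hits = sum(
        1 for p, o in zip(predicted, observed) if abs(p - o) <= tolerance
    )
    return 100.0 * hits / len(predicted)


# Made-up values for illustration only, not data from the thesis.
predicted = [0, 2, 3, 5, 1]
observed = [0, 1, 5, 4, 1]
score = within_one_accuracy(predicted, observed)
print(score)  # 4 of the 5 samples are within one person
```

Under this definition an estimate that is off by exactly one person still counts as correct, so the metric is more lenient than exact-match accuracy.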