The report contains this figure, and with it, a conclusion that SAP is not well correlated with actual heating use. In the past it has generally been accepted that there is some correlation between SAP rating and the heating energy used. In this blog, I would like to consider why the report doesn’t appear to support that view.
I have explained some other relevant facts about SAP in previous blogs, which you can read if you wish (here http://www.jtecservices.co.uk/the-energy-performance-blogger/golden-rules-are-made-to-be-broken and here http://www.jtecservices.co.uk/the-energy-performance-blogger/consumer-protection-part-3-of-my-presentation-at-retro-expo ).
The graph at Figure 2 shows a best fit line using all the data points. The R² value (a measure of how well the line fits the points) indicates that we cannot be at all confident that this line is in the right place, because the spread of points about the line is high: they are not all clustered close to the line. This may well indicate that we need many more data points before we can put a definite line on the graph that accurately reports the relationship between the two variables (SAP and average energy used).
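For anyone who wants to see what a best fit line and its R² actually measure, here is a minimal sketch in plain Python. The numbers are invented purely for illustration and are not the report's data, and the function name `fit_line` is my own. It fits a straight line by least squares to two small data sets with a similar downward trend, one tightly clustered and one widely spread.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variation in y that the line accounts for.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Invented illustrative values, NOT taken from the report.
sap = [50, 60, 70, 80]
energy_tight = [60, 50, 40, 30]      # every point sits exactly on a line
energy_scattered = [70, 40, 50, 20]  # still a downward trend, more spread

slope_tight, _, r2_tight = fit_line(sap, energy_tight)
slope_scattered, _, r2_scattered = fit_line(sap, energy_scattered)

print(f"tight:     slope={slope_tight:.2f}  R2={r2_tight:.2f}")
print(f"scattered: slope={slope_scattered:.2f}  R2={r2_scattered:.2f}")
```

Both lines slope downwards, but the scattered set gives a lower R², which is the situation described for Figure 2: a line can be drawn, but the points do not sit close enough to it for us to trust its position.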
However, there are tools available to us to deal with small data sets like this, to help us get better information from them. One of these is to look at the data to identify ‘outliers’, meaning data points that don’t appear to be as valid as most of the others. One would look at the details of each data point to identify such outliers. I don’t have access to that data, but I can immediately see a likely reason why the data point to the far left of the graph should be treated as an outlier, and removed. This is the data point at SAP 40, with energy used for heating of 50 kWh/day.
Two occupants in a two bed home with a SAP rating of 40 are quite likely to use less energy for heating than the SAP prediction, since SAP assesses the energy used to provide the level of heating necessary for good health, which is defined by the World Health Organisation as 21°C in the living area and 18°C in other rooms. In a home with a SAP of 40, this would be quite costly (remember, SAP is based on the cost of the fuel used, so a lower SAP means a higher cost). Many occupiers of homes with a SAP of 40 would deliberately heat their home to a lower standard, because this saves them money. If the two occupants are a single parent and a child, with a limited income, then the likelihood of this happening is greatly increased. The same ‘self-rationing’ of energy use might also apply to homes with slightly higher SAP ratings, maybe up to a SAP of 60. This is why a SAP of 60 has generally been assumed as a minimum rating for homes occupied by the financially disadvantaged.
Now, if inspection of the data for this data point supports this view, we would remove that data point and redefine the best fit line. I don’t have the actual data, so I can’t redo the best fit line without that outlier, but I think you can probably see that the likely new line would slope downwards, highest at the left of the graph and lowest at the right. If more of the data to the left of SAP 60 were removed, the slope would get even steeper. This would indicate quite a strong correlation between SAP and average energy used for heating.
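To show the mechanics of refitting without an outlier, here is a rough sketch in plain Python. Only the point at SAP 40 and 50 kWh/day comes from the figure as described above; every other value is invented, because I don't have the report's data. The result therefore illustrates the method only, and says nothing about what the real data would show. The function name `best_fit` is my own.

```python
def best_fit(points):
    """Least-squares line through a list of (sap, kwh_per_day) pairs.

    Returns (slope, r_squared).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return slope, 1.0 - ss_res / ss_tot


# The suspected outlier as read from the figure: SAP 40, 50 kWh/day.
outlier = (40, 50)

# Invented illustrative points, NOT the report's data.
others = [(55, 75), (60, 70), (65, 62), (70, 58), (75, 50), (80, 45)]

slope_all, r2_all = best_fit([outlier] + others)
slope_trimmed, r2_trimmed = best_fit(others)

print(f"with outlier:    slope={slope_all:.2f}  R2={r2_all:.2f}")
print(f"without outlier: slope={slope_trimmed:.2f}  R2={r2_trimmed:.2f}")
```

With these made-up values, one low-energy point at the far left is enough to flatten the line and pull R² down sharply. Removing a point is only justified if inspection of that household's circumstances supports it, which is something only those holding the data can check.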
As they have access to the data, the researchers could investigate all the data points to identify outliers that should be removed, and might find that this data point is valid, but others are not. The point to be made here is that without knowing the situation of the occupants studied, we don’t get the full picture. It is simplistic in the extreme to conclude that there is NO appreciable link between the SAP rating and the amount of energy used.
It’s interesting that this variability is to be the focus of further research to be undertaken by Sustainable Homes from October 2014. I hope that with more data, they will be able to refine their view that there is no correlation between these two variables.