Case Study: Health
Using maps to show geographical patterns in health data
What is the study about?
The study will investigate the concept of using multiple datasets to help highlight areas that may be at a higher risk for health problems. This study will use the indices of deprivation and life expectancy data to highlight deprived local authorities. The study will then investigate the geographical variation of cancer and coronary heart disease (CHD), which are relevant to government health targets, within these deprived local authorities.
The study will be of interest to anyone who uses, or intends to use, health data, and to anyone looking to develop the use of maps in their analysis. It is also available as a PDF document (668Kb).
The data
Data for this study are available in both the 'Health and Care' and the 'Indices of Deprivation and Classification' domains on the Neighbourhood Statistics website. For the study three datasets have been used (see Note 1):
- Life expectancy at birth (male and female indicators) - 2001-2003
- English Indices of Deprivation (IMD) - Local Authority summaries (overall score indicators) - 2004
- Percentage of Health Episodes - 2003
All datasets are based on the 354 local authorities in England (2003 geographical boundaries). The local authorities of City of London and Isles of Scilly have been removed due to suppressions within the life expectancy data.
The data used here represent a fraction of the health data available through Neighbourhood Statistics. Datasets are currently available on a wide array of topics, including smoking, obesity, drinking and many other health issues. These data are also presented for a wide range of geographical areas, such as Primary Care Organisations (PCOs). Both the data and specific health issues will be considered in a further study.
Note 1: This study is specifically aimed at using data from the Neighbourhood Statistics website. It should not preclude the use of alternative data sources to measure health deprivation.
The technique
Mapping - This study will show how maps on a single geographical level can be used to investigate a number of datasets and will offer hints about understanding thematic 'choropleth' maps.
Identifying the most deprived areas for health
First, the study will identify the deprived local authorities in England. We will use a recognised definition of 'deprived areas', under which a local authority is considered deprived if it falls in the lowest 20 per cent of the data for 3 out of 5 of the following:
- male life expectancy
- female life expectancy
- cardiovascular disease (CVD) mortality among under 75s
- cancer mortality among under 75s
- English Indices of Deprivation (IMD) - overall score.
Although not all of these data are available through Neighbourhood Statistics, we will use the life expectancy for males and females, and the IMD score, to find local authorities that will be classified as 'deprived' for this study.
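The classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the maps: the area names and figures are invented, the function names are hypothetical, and two assumptions are made. The first is that a high IMD score is treated as the 'worst' end of the ranking, and the second is that an area must be flagged on all three available indicators (the `min_flags` parameter), since the study does not state how many of the three are required.

```python
def worst_fifth(values, worst_is_low=True):
    """Return the areas in the worst 20 per cent of the ranked data."""
    # Rank the areas so that the worst value comes first.
    ranked = sorted(values, key=values.get, reverse=not worst_is_low)
    cutoff = len(ranked) // 5  # size of the first fifth, rounded down
    return set(ranked[:cutoff])


def deprived_areas(male_le, female_le, imd_score, min_flags=3):
    """Return areas flagged on at least min_flags of the three indicators."""
    flags = [
        worst_fifth(male_le),    # low life expectancy is worst
        worst_fifth(female_le),
        # Assumption: a high IMD score is treated as the worst end.
        worst_fifth(imd_score, worst_is_low=False),
    ]
    areas = set(male_le) & set(female_le) & set(imd_score)
    return {a for a in areas if sum(a in f for f in flags) >= min_flags}


# Invented figures for ten hypothetical areas.
names = ["Area " + letter for letter in "ABCDEFGHIJ"]
male_le = dict(zip(names, [72.0, 72.5, 74.0, 74.5, 75.0,
                           75.5, 76.0, 76.5, 77.0, 78.0]))
female_le = dict(zip(names, [77.0, 79.0, 77.5, 79.5, 80.0,
                             80.5, 81.0, 81.5, 82.0, 83.0]))
imd_score = dict(zip(names, [45.0, 40.0, 30.0, 28.0, 25.0,
                             20.0, 18.0, 15.0, 12.0, 8.0]))

strict = deprived_areas(male_le, female_le, imd_score)
relaxed = deprived_areas(male_le, female_le, imd_score, min_flags=2)
print(sorted(strict), sorted(relaxed))
```

With these invented figures only Area A is flagged on all three indicators, while relaxing `min_flags` to 2 also brings in Area B. This shows how sensitive the list of 'deprived' areas can be to the rule chosen.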
A useful way of showing the most deprived areas is with maps. Maps 1-3 below show each step of the analysis that has been done to identify the 'deprived' local authorities.
Map 1: Those local authorities that are identified as being present in the lowest 20 per cent for male life expectancy.
Map 2: Those local authorities that are identified as being present in the lowest 20 per cent for both male life expectancy and female life expectancy.
Map 3: Those local authorities that are identified as being 'deprived' when considering the male life expectancy, the female life expectancy and the IMD score.
Hint: Statistically, the areas that fall within the lowest 20 per cent of all values in a dataset are said to be in the first quintile of the data. Quintiles in this case are the five bands, each containing one fifth of the local authorities, that are formed when the data are ranked. So the first quintile will be all those local authorities that fall in the first one fifth of the data (i.e. the 20 per cent most deprived), the second quintile will be all the local authorities that fall within the second fifth (i.e. 20 per cent to 40 per cent of the way through the data), and so on.
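As a rough illustration of the hint, the sketch below ranks a set of values and gives each area a quintile number from 1 to 5. The function name `quintile_bands` and the figures are invented for the example. Ties are simply left in the order the data were supplied, which is an assumption rather than a recommended rule.

```python
def quintile_bands(values, worst_is_low=True):
    """Map each area name to a quintile number, 1 (worst fifth) to 5."""
    # Rank the areas so that the worst value comes first.
    ranked = sorted(values, key=values.get, reverse=not worst_is_low)
    n = len(ranked)
    bands = {}
    for position, area in enumerate(ranked):
        # position * 5 // n runs from 0 to 4; add 1 to get quintiles 1 to 5.
        bands[area] = position * 5 // n + 1
    return bands


# Invented life expectancy figures for ten hypothetical areas.
life_expectancy = {
    "Area A": 72.1, "Area B": 73.4, "Area C": 74.0, "Area D": 74.8,
    "Area E": 75.3, "Area F": 75.9, "Area G": 76.4, "Area H": 77.0,
    "Area I": 77.8, "Area J": 79.2,
}

bands = quintile_bands(life_expectancy)
first_quintile = sorted(area for area, q in bands.items() if q == 1)
print(first_quintile)
```

With ten areas each quintile holds two of them, so the first quintile here is Area A and Area B, the two areas with the lowest life expectancy.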
But why map?
By mapping each of these stages it is easy to see how the occurrence of each of the three characteristics contributes to defining a local authority as 'deprived'. These results could easily have been generated in a table but would this give you an understanding of the geographical spread?
A thematic map can add a geographical dimension to any analysis. When using a map it may be possible to see potential groups or clusters that could be present within the data. It is also a powerful tool that shows how different locations interact.
There are lots of different types of thematic map. However the ones used here are area-shaded or 'choropleth' maps. This type of map portrays geographical areas in a range of different shades representing banded values. The purpose of this map is to present data in a way that allows geographical patterns to be identified.
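To make the idea of banded values concrete, the sketch below assigns each value to one of five equal-width bands and picks a shade for it, running from light to dark. Both the equal-width banding and the colour codes are assumptions made for illustration. The maps in this study may use different band boundaries and colours.

```python
# Illustrative palette, lightest shade first (an assumption, not the
# colours used in the study's maps).
SHADES = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]


def shade_for(value, lowest, highest, shades=SHADES):
    """Pick a shade for value using equal-width bands from lowest to highest."""
    if highest == lowest:
        return shades[0]  # no variation, so everything shares one shade
    width = (highest - lowest) / len(shades)
    band = int((value - lowest) / width)
    # The maximum value would otherwise fall just outside the top band.
    band = min(band, len(shades) - 1)
    return shades[band]


# Invented percentages for four hypothetical areas.
percentages = {"Area A": 4.0, "Area B": 6.5, "Area C": 9.0, "Area D": 14.0}
low, high = min(percentages.values()), max(percentages.values())
shaded = {area: shade_for(p, low, high) for area, p in percentages.items()}
print(shaded)
```

Changing the number of shades or the way the bands are cut can change the visual impression of the same data, which is one reason to check the legend before reading a choropleth map.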
Although any apparent clustering needs to be confirmed statistically, a map does give a useful indication of what the data are saying. This makes mapping a useful first step of analysis, as it highlights areas that may benefit from further investigation.
Using a different data source
To help us understand more about cancer and CVDs we will use the 'Percentage of Health Episodes' data from Neighbourhood Statistics. Although this dataset does not have mortality rates for cancer or CVD it does have the percentage of hospital episodes with diagnosis of cancer or coronary heart disease (CHD). Using these data will enable us to highlight areas that might be at risk from these diseases.
Investigating 'Percentage of Health Episodes' data
The first things we need to learn about these data are their limitations. The 'Percentage of Health Episodes' data have several limitations that are relevant to this study:
- The data are not measures of mortality. They show the medical category of the finished consultant episodes within NHS hospitals, which is not the same as the number of patients.
- The age range is not the same as the mortality data.
- Patients can be counted more than once.
- There is an element of choice as to which hospital a patient attends.
- CHD is a subset of CVD, i.e. it covers only some of the diseases that are classified as CVD.
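The difference between counting episodes and counting patients can be seen with a small invented example. The records below are hypothetical and are not taken from the dataset. They simply show how one patient with several finished episodes is counted more than once.

```python
# Hypothetical finished episodes: (patient identifier, diagnosis category).
episodes = [
    ("P1", "cancer"),
    ("P1", "cancer"),   # the same patient has a second cancer episode
    ("P2", "CHD"),
    ("P3", "cancer"),
    ("P1", "other"),
]

# Every finished episode counts once, even for a returning patient.
cancer_episodes = sum(1 for _, diagnosis in episodes if diagnosis == "cancer")

# Each patient counts once, however many episodes they have.
cancer_patients = len({pid for pid, diagnosis in episodes if diagnosis == "cancer"})

cancer_share = 100 * cancer_episodes / len(episodes)
print(cancer_episodes, cancer_patients, cancer_share)
```

Here three of the five episodes are for cancer (60 per cent), but they relate to only two patients.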
However, even with these limitations it is still possible that the data will add to the analysis of the 'deprived' areas. By mapping these results we can consider the geographical impact of each disease.
Hint: Always read the metadata of the data that are being examined. It may be that data are not comparable for a number of reasons.
Investigating 'Percentage of Health Episodes' data for cancer episodes ages 16-59
In the following analysis we will consider only those local authorities which we have already found to be in the most 'deprived' quintile (i.e. being present in the lowest 20 per cent). We will look at their rates of cancer and CHD in 16-59 year olds as a percentage of all hospital episodes.
Map 4: Deprived English local authorities: Cancer as a percentage of all episodes for 16 - 59 year olds.
By mapping the data in this way we can identify the 'deprived' local authorities that have the greatest percentage of cancer finished episodes.
From Map 4 we can see that the North East region of England appears to have a concentration of local authorities with high percentages of cancer episodes. However, to put this into context we can compare the percentages in the local authorities to the percentage for England.
Compared to overall average
To investigate how cancer in the most 'deprived' local authorities compares to the national picture we can view all those local authorities that have above the England percentage of finished episodes for cancer. The England percentage has been calculated by combining the data at local authority level.
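A minimal sketch of this comparison is shown below. It assumes that counts of cancer episodes and of all episodes are available for each local authority, and every figure and area name is invented. The England percentage is formed by summing the counts across all areas, after which the 'deprived' areas above that figure are picked out.

```python
# Hypothetical counts per area: (cancer episodes, all episodes).
counts = {
    "Area A": (120, 1000),
    "Area B": (300, 4000),
    "Area C": (90, 600),
    "Area D": (200, 3000),
    "Area E": (150, 1400),
}
# Assumption: these areas were already classified as 'deprived'.
deprived = {"Area A", "Area B", "Area C"}


def percentage(cancer, total):
    """Cancer episodes as a percentage of all episodes."""
    return 100 * cancer / total


# Combine the area-level counts to get the national figure.
total_cancer = sum(cancer for cancer, _ in counts.values())
total_all = sum(total for _, total in counts.values())
england_pct = percentage(total_cancer, total_all)

# Keep only the deprived areas whose percentage is above the national one.
above_england = sorted(
    area for area in deprived if percentage(*counts[area]) > england_pct
)
print(round(england_pct, 1), above_england)
```

Summing the counts is not the same as averaging the area percentages: here the combined figure is 8.6 per cent, whereas the simple mean of the five area percentages is about 10.4 per cent, because larger areas carry more weight in the combined figure.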
Map 5: Deprived English local authorities: Cancer as a percentage of all episodes for 16 - 59 year olds higher than England average.
We can see that there is a concentration of 'deprived' local authorities in the North East region with a higher percentage of finished episodes for cancer. This is not unexpected when we consider Map 4, but is much more understandable when shown in this context. This is because the England percentage is now a point of reference. Comparisons with higher level geographies can be helpful when considering any level of geography.
These two maps are a useful tool for highlighting possible areas of investigation, but we need to understand what they are telling us. There may be reasons why some areas have higher than average percentages. For example, these areas may have been targeted by an initiative to reduce deaths from cancer. As part of such a programme, more people could be expected to be tested for the disease, so more people would be diagnosed and there may have been more hospital treatments as a result.
We can investigate the CHD finished episodes in a similar way.
Investigating 'Percentage of Health Episodes' data for CHD episodes ages 16-59
Looking at the CHD finished episodes in the 'deprived' local authorities will enable us to see whether there are any noticeable differences from the results for cancer finished episodes. Again, we are only suggesting that this highlights areas of potential risk from CHD in the 'deprived' local authorities.
Map 6: Deprived English local authorities: CHD as a percentage of all episodes for 16 - 59 year olds.
Map 6 shows the percentages of CHD finished episodes, which range from 2 to 5 per cent. We can note that the percentages here are smaller than those for cancer in Map 4. This suggests that, overall, CHD accounts for a smaller proportion of finished episodes than cancer.
Again we can show local authorities that have a higher than England percentage of CHD finished episodes.
Compared to overall average
As with the cancer analysis, we can show those local authorities that have a percentage of CHD finished episodes that is higher than the England percentage. Again, the England percentage was created by summing the data at local authority level.
Map 7: Deprived English local authorities: CHD as a percentage of all episodes for 16 - 59 year olds higher than England average.
Map 7 shows the distinct areas that have a higher percentage of CHD finished episodes. The North East again is highlighted as are the areas around the North West's larger cities. Also within London a number of local authorities are highlighted.
By producing these maps we have investigated the percentages of cancer and CHD within local authorities. But before we can make comments on what this means we need to be aware of how the percentages that are in the 'Percentage of Health Episodes' data have been created.
Understanding the percentages
When carrying out this type of analysis we need to understand how the percentages are created. The percentage of cancer finished episodes is the number of cancer finished episodes expressed as a percentage of all finished episodes, so we need to explore what this actually means.
The way in which the percentages are calculated means that two local authorities can have the same percentage of finished cancer episodes but the underlying counts could be dramatically different. Due to this, we cannot say that these highlighted areas have the most cases of a particular finished episode. However, they do have a higher proportion of cancer or CHD finished episodes than that found for England. This means that the analysis has highlighted areas that would benefit from more in depth investigation.
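The point can be illustrated with two invented local authorities. The figures below are hypothetical and chosen only to show that identical percentages can sit on top of very different counts.

```python
def percentage(part, whole):
    """Express part as a percentage of whole."""
    return 100 * part / whole


# Invented counts for two hypothetical local authorities.
small_area = {"cancer_episodes": 50, "all_episodes": 500}
large_area = {"cancer_episodes": 2000, "all_episodes": 20000}

small_pct = percentage(small_area["cancer_episodes"], small_area["all_episodes"])
large_pct = percentage(large_area["cancer_episodes"], large_area["all_episodes"])

# How many times more cancer episodes the large area has.
count_ratio = large_area["cancer_episodes"] / small_area["cancer_episodes"]
print(small_pct, large_pct, count_ratio)
```

Both areas have 10 per cent of their finished episodes for cancer, yet the larger area has 40 times as many cancer episodes. A map of percentages would shade the two areas identically.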
But what does this analysis show?
Hopefully this study has shown two things. Firstly the value of mapping data and secondly how using different data sources, although having limitations, can add value to your analysis.
We have seen how maps can allow a fuller investigation of a dataset. Using maps allows you to create something that paints pictures of key messages that may be otherwise hidden in the data. They also work as a prompt for further investigation. We all know that a picture paints a thousand words; however care needs to be taken as it is easy, especially if other data sources are used, to produce a map that could be out of context.
We have also seen from this analysis the possible approaches that can be taken if datasets are not available. Ideally we would have used the mortality data for both CVD and cancer. However, the data that we did investigate allowed us to understand cancer and CHD, and the possible locations that are affected. This is only useful provided that the limitations of the dataset are fully known and understood.
The potential of thematic maps
Thematic maps can be created for any area of interest to investigate any dataset. They allow you to explore and understand variations in the data geographically that just cannot be seen in a table or chart.
Maps allow you to investigate and develop an understanding of surrounding areas and enable the context of data to be explored. This is a major advantage of incorporating thematic maps into your analysis as information about a topic or area can be portrayed more easily to your audience.
We have seen how data impact on maps, but some consideration of the maps themselves is also needed. There are many things that can make a map misleading. For example, the difference in the size of geographical areas means that larger areas can dominate the map.
Care is also needed with the choice of colours, as these too can lead viewers to incorrect assumptions. Generally, darker colours are used to indicate something which is more extreme - e.g. a higher rate of cancer, or a lower level of income. But there are cases where certain colours have strong real-world associations with what is being shown, e.g. red for Labour and blue for the Conservatives.
What sorts of data are best for thematic maps?
One of the key purposes of this type of map is to demonstrate how data vary geographically. This means using standardised data so that comparisons can be made between areas.