This user guide provides information on the Property Intelligence database, prepared by GBG for customers looking to use the data.
Property Intelligence is designed to assist in the process of providing insurance quotes for customers in England, Wales and Scotland by supplying data on residential properties. Property coverage:
Geographic coverage:
Property Intelligence is currently built on a quarterly basis, although this is subject to review. The underlying datasets have a range of update frequencies from monthly upwards. Updates are included in the build as they become available.
The database contains a set of utility fields and a set of feature fields. The utility fields are as follows:
The feature fields in the database are arranged in sets of three:
The unique key to the database is the UDPRN / UMRRN pair supplied by Royal Mail; the UPRN is also supplied. The list of data items is as follows:
Technical details for each of these fields are shown in the table below:
Title | Field name | Data type |
---|---|---|
UPRN | UPRN | Integer |
UDPRN | UDPRN | Integer |
UMRRN | UMRRN | Integer |
Address1 | address1 | Text |
Postcode | postcode | Text |
Easting | easting | Float |
Northing | northing | Float |
Latitude | latitude | Float |
Longitude | longitude | Float |
Output area code (Census 2011) | OA11CD | Text |
Lower super output area code (Census 2011) | LSOA11CD | Text |
Output area code (Census 2021) | OA21CD | Text |
Lower super output area code (Census 2021) | LSOA21CD | Text |
Country | country | Text |
Property type | property_type | Lookup |
Number of floors | floors | Text |
Number of bedrooms | bedrooms | Text |
Number of bathrooms | bathrooms | Text |
Number of rooms in total | total_rooms | Text |
Building construction period | age | Lookup |
Year built | year_built | Text |
Listed building | listed | Lookup |
Cadastral polygon area | cadastral | Text |
Height | height | Text |
Building footprint (square metres) | footprint | Text |
Building volume (cubic metres) | volume | Text |
Average roof slope | avg_roof_slope | Text |
Flat roof fraction | flat_roof_fraction | Text |
Distance to tree | distance_to_tree | Text |
Geocode multiplicity | geocode_multiplicity | Text |
Floor area (square metres) | floor_area | Text |
Last transaction price | last_transaction_price | Text |
Last transaction date | last_transaction_date | Text |
Last transaction duration type | last_transaction_duration_type | Lookup |
Estimated current value | est_current_value | Text |
Number of transactions | n_transactions | Text |
Estimated council tax band | est_council_tax | Lookup |
Within 200 metres of watercourse | watercourse_200M | Lookup |
Distance to watercourse (within 200 metres) | distance_to_water | Text |
Distance to road | distance_to_road | Text |
Road class | road_class | Lookup |
Business usage | business_usage | Lookup |
Planning classification | planning_classification | Lookup |
Congestion zone | congestion_zone | Lookup |
Burglary rate | burglary_rate | Text |
Storey on which flat sits | flat_floor | Text |
Is top floor flat? | top_floor_flat | Lookup |
Number of extensions | extensions | Text |
Wall type | wall_type | Lookup |
Main central heating fuel | main_fuel | Lookup |
Type of tenure | tenure | Lookup |
Energy rating | energy_rating | Lookup |
EPC Inspection Date | epc_inspection_date | Text |
Multi-residential property | is_multires | Text |
Table 1: Technical details for each utility and data field. Lookup fields contain non-negative integers (starting from zero). source_X fields are lookup fields; p_X fields are number fields.
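As a rough illustration of how the key and the feature fields fit together, the sketch below indexes records by the (UDPRN, UMRRN) pair and reads one feature together with its companion fields. It assumes that each set of three consists of the value field, a `source_X` lookup and a `p_X` number, and that `p_X` holds the confidence for the value. The sample record, its values, the field names `source_bedrooms` and `p_bedrooms`, and the helper `feature_triple` are all hypothetical.

```python
from typing import Any, Dict, NamedTuple, Tuple

Key = Tuple[int, int]  # (UDPRN, UMRRN) pair, the unique key to the database


class Feature(NamedTuple):
    value: Any     # the feature value itself (many are stored as Text)
    source: int    # lookup code identifying the data source
    p: float       # assumed here to be the confidence attached to the value


def feature_triple(record: Dict[str, Any], name: str) -> Feature:
    """Return (X, source_X, p_X) for feature X from a single record."""
    return Feature(
        value=record[name],
        source=int(record[f"source_{name}"]),
        p=float(record[f"p_{name}"]),
    )


# Hypothetical record for illustration only; real keys and values will differ.
records: Dict[Key, Dict[str, Any]] = {
    (12345678, 0): {"bedrooms": "3", "source_bedrooms": 4, "p_bedrooms": 0.85},
}

bedrooms = feature_triple(records[(12345678, 0)], "bedrooms")
print(bedrooms)
```

Keying on the pair rather than on UDPRN alone keeps multiple residences behind one delivery point distinct.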
Tables 2-16 are the lookup tables relating the numbers found in the database fields to their descriptions, covering fields such as property type, property age, Council Tax band and data source. The Yes/No lookup is used for the ‘watercourse 200M’, ‘congestion zone’ and ‘top floor flat’ fields.
Description | Value |
---|---|
No | 0 |
Yes | 1 |
Table 2: Yes/no lookup
Description | Value |
---|---|
Detached | 0 |
Semi-detached | 1 |
Terraced | 2 |
Flat | 3 |
Unknown | 4 |
Table 3: Property type lookup
Description | Value |
---|---|
Before 1719 (old) | 0 |
1720-1839 (Georgian) | 1 |
1840-1919 (Victorian/Edwardian) | 2 |
1920-1945 (Inter-war) | 3 |
1946-1979 (Post-war) | 4 |
1980 to date (Modern) | 5 |
Not known | 6 |
Table 4: Property age lookup
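The band boundaries in Table 4 can be expressed as a small function. This is only a sketch of the banding: the guide does not state that the `age` field is derived from `year_built`, and the function name is hypothetical. Table 4 leaves the year 1719 unassigned ("Before 1719" followed by "1720-1839"); the sketch assumes it belongs to the first band.

```python
def construction_period(year_built):
    """Map a build year to the Table 4 lookup value.

    Accepts an int or a text value (year_built is a Text field).
    Missing or unparsable input maps to 6 (Not known).
    """
    try:
        year = int(year_built)
    except (TypeError, ValueError):
        return 6  # Not known
    if year < 1720:
        return 0  # Before 1719 (old); 1719 itself is placed here by assumption
    if year <= 1839:
        return 1  # Georgian
    if year <= 1919:
        return 2  # Victorian/Edwardian
    if year <= 1945:
        return 3  # Inter-war
    if year <= 1979:
        return 4  # Post-war
    return 5      # 1980 to date (Modern)


print(construction_period("1905"))
```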
Description | Value |
---|---|
A | 0 |
B | 1 |
C | 2 |
D | 3 |
E | 4 |
F | 5 |
G | 6 |
H | 7 |
I | 8 |
N/A | 100 |
Table 5: Council tax band lookup
Description | Value |
---|---|
Default | 0 |
Land Registry | 2 |
Historic England | 3 |
Estate agent | 4 |
LIDAR | 7 |
NROSH multipart | 8 |
NROSH snapshot | 9 |
VOA | 12 |
Heuristic | 14 |
ML (age) | 15 |
Naive Bayes (age) | 17 |
Banded VOA | 18 |
ML (bedrooms) | 19 |
VOA (Council Tax) | 20 |
OS Open Rivers | 21 |
NB (bedrooms) | 24 |
OS Open Map | 25 |
Transport for London | 28 |
police.uk | 29 |
Flats modeller | 30 |
Cadw | 33 |
Historic Environment Scotland | 34 |
OS Open Roads | 35 |
Royal Mail | 36 |
DCLG | 37 |
DCLG non-domestic | 38 |
Prefix flat floor modeller | 42 |
Flats per floor modeller | 43 |
Nearest neighbour modeller | 44 |
DCLG Scotland | 45 |
DCLG Scotland non-domestic | 46 |
Financial Services | 48 |
Table 6: Data source lookup
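Because the data source codes are not contiguous, a dictionary is a convenient way to decode them. The sketch below copies the values from Table 6; the function name and the fallback text for codes missing from the table are assumptions, not part of the product.

```python
DATA_SOURCES = {
    0: "Default", 2: "Land Registry", 3: "Historic England", 4: "Estate agent",
    7: "LIDAR", 8: "NROSH multipart", 9: "NROSH snapshot", 12: "VOA",
    14: "Heuristic", 15: "ML (age)", 17: "Naive Bayes (age)", 18: "Banded VOA",
    19: "ML (bedrooms)", 20: "VOA (Council Tax)", 21: "OS Open Rivers",
    24: "NB (bedrooms)", 25: "OS Open Map", 28: "Transport for London",
    29: "police.uk", 30: "Flats modeller", 33: "Cadw",
    34: "Historic Environment Scotland", 35: "OS Open Roads", 36: "Royal Mail",
    37: "DCLG", 38: "DCLG non-domestic", 42: "Prefix flat floor modeller",
    43: "Flats per floor modeller", 44: "Nearest neighbour modeller",
    45: "DCLG Scotland", 46: "DCLG Scotland non-domestic",
    48: "Financial Services",
}


def describe_source(code):
    """Return the Table 6 description for a source_X value.

    Codes that do not appear in Table 6 are reported rather than raising,
    since later releases may add new sources.
    """
    return DATA_SOURCES.get(int(code), f"Unrecognised source code {code}")


print(describe_source(4))
```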
Description | Value |
---|---|
Domestic | 0 |
Business | 1 |
Table 7: Business usage lookup
Description | Value |
---|---|
Gas | 0 |
Electricity | 1 |
Oil | 2 |
Not known | 3 |
Coal | 4 |
LPG | 5 |
Wood | 6 |
None | 7 |
B30K | 8 |
Other | 9 |
Biomass/Biogas | 10 |
District heating | 11 |
Waste heat | 12 |
Table 8: Main fuel lookup
Description | Value |
---|---|
Cavity wall | 0 |
Solid brick | 1 |
Sandstone | 2 |
Timber frame | 3 |
Granite | 4 |
System built | 5 |
SAP05 | 6 |
Not known | 7 |
Table 9: Wall type lookup
Description | Value |
---|---|
Not known | 0 |
A1/A2 Retail and Financial/Professional services | 1 |
A3/A4/A5 Restaurant and Cafes/Drinking Establishments and Hot Food takeaways | 2 |
B1 Offices and Workshop businesses | 3 |
B2 to B7 General Industrial and Special Industrial Groups | 4 |
B8 Storage or Distribution | 5 |
C1 Hotels | 6 |
C2 Residential Institutions - Hospitals and Care Homes | 7 |
C2 Residential Institutions - Residential schools | 8 |
C2 Residential Institutions - Universities and colleges | 9 |
C2A Secure Residential Institutions | 10 |
C3 - Dwelling houses | 11 |
D1 Non-residential Institutions - Community/Day Centre | 12 |
D1 Non-residential Institutions - Crown and County Courts | 13 |
D1 Non-residential Institutions - Education | 14 |
D1 Non-residential Institutions - Libraries Museums and Galleries | 15 |
D1 Non-residential Institutions - Primary Health Care Building | 16 |
D2 General Assembly and Leisure plus Night Clubs and Theatres | 17 |
Others - Passenger terminals | 18 |
Others - Emergency services | 19 |
Others - Miscellaneous 24hr activities | 21 |
Others - Car Parks 24 hrs | 22 |
Others - Stand alone utility block | 23 |
Others - Telephone exchanges | 24 |
Sui generis | 25 |
Table 10: Planning classification lookup
Description | Value |
---|---|
Unclassified | 0 |
Not classified | 1 |
Classified unnumbered | 2 |
B Road | 3 |
A Road | 4 |
Motorway | 5 |
Unknown | 6 |
Table 11: Road class lookup
Description | Value |
---|---|
Not listed | 0 |
I or A | 1 |
II* or B | 2 |
II or C | 3 |
Table 12: Listed building grade lookup
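Each non-zero value in Table 12 covers two grading schemes: I, II* or II in England and Wales, and A, B or C in Scotland. A minimal sketch of decoding the value for display follows. How Scotland is identified from the `country` field is not documented here, so the caller passes a boolean; the function name is hypothetical.

```python
# value -> (England and Wales grade, Scotland category), from Table 12
LISTED_GRADES = {1: ("I", "A"), 2: ("II*", "B"), 3: ("II", "C")}


def listed_grade(value, in_scotland):
    """Return a display string for the 'listed' lookup value."""
    value = int(value)
    if value == 0:
        return "Not listed"
    england_wales, scotland = LISTED_GRADES[value]
    return scotland if in_scotland else england_wales


print(listed_grade(2, in_scotland=False))
print(listed_grade(2, in_scotland=True))
```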
Description | Value |
---|---|
Owner-occupier | 0 |
Rented | 1 |
Social | 2 |
Table 13: Tenure lookup
Description | Value |
---|---|
A | 0 |
B | 1 |
C | 2 |
D | 3 |
E | 4 |
F | 5 |
G | 6 |
Table 14: Energy rating lookup
Description | Value |
---|---|
Not known | 0 |
Freehold | 1 |
Leasehold | 2 |
Table 15: Last transaction duration type lookup
Description | Value |
---|---|
No | 0 |
Yes | 1 |
Table 16: Last transaction duration type lookup
Accuracy for the tested fields calculated using 2025-02_groundtruth on 2025-03-13 00:52:51 against 32233 properties is shown in the table below.
Field | Accuracy (%) |
---|---|
Number of bedrooms | 71.3 |
Number of bathrooms | 77.4 |
Building construction period | 68.6 |
Property type | 82.1 |
Number of floors | 89.9 |
Table 17: Summary accuracy for fields, measured against ‘groundtruth’ properties in England and Wales, excluding flats
The following tables show, for each data source, the coverage and accuracy for bedrooms, bathrooms, age, property type and number of floors, along with the confidence for these attributes, based on measurements against the 33,000 property groundtruth dataset covering England and Wales.
Source | Coverage | Accuracy | Confidence |
---|---|---|---|
DCLG | 0.119 | 0.691 | 0.700 |
Default | 0.032 | 0.407 | 0.500 |
Estate agent | 0.463 | 0.790 | 0.850 |
Flats modeller | 0.003 | 0.213 | 0.600 |
NB (bedrooms) | 0.378 | 0.655 | 0.640 |
NROSH multipart | 0.005 | 0.729 | 0.800 |
NROSH snapshot | 0.000 | 1.000 | 0.800 |
Overall | 1.000 | 0.713 | 0.741 |
Table 18: Accuracy and coverage for bedrooms
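For Table 18, the Overall row is numerically consistent with a coverage-weighted average of the per-source rows. The sketch below checks this using the published, rounded figures. This is an observation about the numbers rather than a documented method, and it does not reproduce every table exactly: the same calculation on the bathrooms accuracy gives about 0.759 against a published 0.774.

```python
# (source, coverage, accuracy, confidence) copied from Table 18
bedroom_rows = [
    ("DCLG",            0.119, 0.691, 0.700),
    ("Default",         0.032, 0.407, 0.500),
    ("Estate agent",    0.463, 0.790, 0.850),
    ("Flats modeller",  0.003, 0.213, 0.600),
    ("NB (bedrooms)",   0.378, 0.655, 0.640),
    ("NROSH multipart", 0.005, 0.729, 0.800),
    ("NROSH snapshot",  0.000, 1.000, 0.800),
]


def coverage_weighted(rows, column):
    """Sum of coverage * value over all sources (coverage sums to about 1)."""
    return sum(row[1] * row[column] for row in rows)


overall_accuracy = coverage_weighted(bedroom_rows, 2)
overall_confidence = coverage_weighted(bedroom_rows, 3)
print(round(overall_accuracy, 3), round(overall_confidence, 3))
```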
Source | Coverage | Accuracy | Confidence |
---|---|---|---|
Default | 0.560 | 0.800 | 0.730 |
Estate agent | 0.440 | 0.707 | 0.760 |
Overall | 1.000 | 0.774 | 0.743 |
Table 19: Accuracy and coverage for bathrooms
Source | Coverage | Accuracy | Confidence |
---|---|---|---|
Cadw | 0.000 | 0.667 | 0.630 |
DCLG | 0.442 | 0.772 | 0.650 |
DCLG non-domestic | 0.000 | 0.333 | 0.250 |
Default | 0.010 | 0.500 | 0.470 |
Heuristic | 0.012 | 0.420 | 0.600 |
Historic England | 0.004 | 0.532 | 0.540 |
Land Registry | 0.020 | 0.936 | 0.950 |
Naive Bayes (age) | 0.303 | 0.718 | 0.721 |
Overall | 1.000 | 0.686 | 0.637 |
VOA | 0.209 | 0.461 | 0.469 |
Table 20: Accuracy and coverage for age
Source | Coverage | Accuracy | Confidence |
---|---|---|---|
Banded VOA | 0.054 | 0.626 | 0.595 |
DCLG | 0.070 | 0.914 | 0.900 |
Default | 0.001 | 0.375 | 0.540 |
Estate agent | 0.690 | 0.849 | 0.880 |
LIDAR | 0.183 | 0.745 | 0.800 |
Land Registry | 0.000 | 0.000 | 0.920 |
NROSH multipart | 0.001 | 0.704 | 0.800 |
NROSH snapshot | 0.000 | 0.000 | 0.800 |
Overall | 1.000 | 0.821 | 0.851 |
Table 21: Accuracy and coverage for property_type
Source | Coverage | Accuracy | Confidence |
---|---|---|---|
Banded VOA | 0.118 | 0.825 | 0.811 |
DCLG | 0.104 | 0.960 | 0.940 |
Default | 0.003 | 0.738 | 0.840 |
LIDAR | 0.775 | 0.903 | 0.900 |
Overall | 1.000 | 0.899 | 0.893 |
Table 22: Accuracy and coverage for floors
The following charts show the distribution of values for selected fields, for domestic properties, not arising from the default model.
Figure 1: Distribution of property type
Figure 2: Distribution of number of bedrooms
Figure 3: Distribution of number of bathrooms
Figure 4: Distribution of building construction period
Figure 5: Distribution of number of floors
The following table shows the coverage with direct data for the five fields tested against groundtruth.
Attribute | Percentage direct |
---|---|
Property type | 88.5 |
Floors | 75.1 |
Bedrooms | 59.8 |
Bathrooms | 36.6 |
Age | 60.0 |
Table 23: Percentage of data supplied from direct sources rather than modelled
Data recency for the Property Intelligence dataset is determined by a number of factors, listed below:
The table below shows the dates of the datasets used in this version of Property Intelligence along with an indication of the expected update frequency.
Dataset | Frequency | Date |
---|---|---|
Congestion Zone | Once | None |
DCLG | Quarterly | 2025-03-11 |
DCLG Scotland | Quarterly | 2024-12-04 |
ONS Postcode to LSOA/LA lookup | Quarterly | 2025-03-03 |
Land Registry House Price Index | Monthly | 2024-12-01 |
Land Registry Cadastral Polygons | Quarterly | 2025-03-02 |
Land Registry Price Paid | Monthly | 2025-03-03 |
English Heritage | Yearly | 2024 |
Historic Environment Scotland | Yearly | 2024 |
Cadw | Yearly | 2024 |
NROSH | Once | 2016-12-12 |
ONSPD | Quarterly | 2025-03-03 |
ONS rural-urban classification | Once | 2016-12-12 |
OS Open UPRN | Quarterly | 2025-02-01 |
OS Open Rivers | Quarterly | 2024-10-01 |
OS Open Roads | Quarterly | 2024-10-01 |
Police.uk | Monthly | 2024-12-01 |
Royal Mail | Monthly | 2025-03-03 |
VOA | Yearly | 2024 |
Estate Agent | Monthly | 2025-03-04 |
Table 24: Data recency and frequency by dataset
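To gauge how old each input was at build time, the dates in Table 24 can be compared with a reference date. The sketch below is illustrative only. It uses 2025-03-13, the date of the groundtruth accuracy run, as the reference date, which is an assumption because the build date itself is not stated. Entries given only as a year, or as "None", are left undated rather than guessed.

```python
from datetime import date, datetime

REFERENCE_DATE = date(2025, 3, 13)  # assumed reference, not a documented build date


def dataset_age_days(date_text, reference=REFERENCE_DATE):
    """Days between a Table 24 date and the reference date.

    Returns None when the entry is not a full YYYY-MM-DD date
    (for example "2024" or "None").
    """
    try:
        dataset_date = datetime.strptime(date_text, "%Y-%m-%d").date()
    except (TypeError, ValueError):
        return None
    return (reference - dataset_date).days


# A few rows copied from Table 24
for name, text in [
    ("Land Registry Price Paid", "2025-03-03"),
    ("OS Open Rivers", "2024-10-01"),
    ("VOA", "2024"),
    ("Congestion Zone", "None"),
]:
    print(name, dataset_age_days(text))
```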
The Environment Agency started to systematically cover England with LIDAR measurements in about 2005 and has added, very approximately, 5% coverage in each year since then.
Figure 6: Cumulative percentage of LIDAR coverage
This dataset contains Open Data, typically provided under the UK government’s OGL3 license. A requirement of this license is that an attribution is provided for the data. These are as follows:
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
A new field, IS_MULTIRES, has been added. It indicates if a property is multi-residential based on Royal Mail MRS.
A new field, LAST_TRANSACTION_DURATION_TYPE, has been added. It refers to the duration type of the last transaction recorded in the Land Registry Price Paid dataset (England and Wales only, back to 1995). It has two possible values, Freehold or Leasehold.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
A new field, EPC_INSPECTION_DATE, has been added. References to ‘DCLG’, the original department responsible for the EPC Energy Certificate data, are replaced with ‘EPC’ in documentation.
The Census 2021 codings oa21cd and lsoa21cd have been added, to sit alongside the Census 2011 codings. Currently the source Open Data used to derive fields in Property Intelligence still uses the Census 2011 codings.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
Our supplier of business information which is used to populate the business_usage field has changed.
No new fields have been added in this release but sources have been updated.
Floor areas for flats are now included in modelling so that values for neighbouring flats are used if direct data is not available.
As a result of changing our address cleanser to the standard GBG Loqate Verify engine we now include some data from Northern Ireland.
No new fields have been added in this release but sources have been updated.
We have added the DCLG Scotland data, which provides a significant improvement in accuracy for property type and property age in Scotland, as well as improvements in accuracy for numbers of bedrooms and floors. DCLG Scotland also provides fields including extension count, wall type, main fuel, floor area, total rooms, tenure and energy rating, which were not previously populated for Scotland.
There are improvements in the flat floor modeller such that it does not return unreasonably large values (over 90 storeys) or non-numeric values (other than N/A). Floor areas for flats are now included in modelling so that values for neighbouring flats are used if direct data is not available.
As a result of changing our address cleanser to the standard GBG Loqate Verify engine we now include some data from Northern Ireland.
No new fields have been added in this release but sources have been updated.
We have introduced modelling for ‘flat_floor’ (the storey a flat sits on), which improves the coverage for this field and introduces new entries to the data sources table.
Tenure and energy rating fields have been added. Tenure is a replacement for the previously removed tenancy field. It indicates whether a property is owner-occupied, private rental or social housing. The name has been changed to retain consistency with the underlying dataset.
The property age field has improved direct data content and accuracy as a result of the addition of a new dataset.
No new fields have been added in this release but sources have been updated.
We have resumed supply of two fields which had been suspended:
The Land Registry House Price Index is available once again, and thus Estimated Current Values will be up to date unless a sale went through during the period in which the HPI was suspended.
As noted in the April 2020 Release notes we have removed the following fields from this release:
The Land Registry UK House Price Index has been suspended as of the April 2020 release, due to be published in June, because of the impact of COVID-19, which means limited transactions are occurring on which to base the Index. The relevant Land Registry Bulletin describing this change is here. This means the estimated current value field will contain the estimated current value at the last release of the House Price Index, 1st March 2020.
This build incorporates the OS AddressBase Premium property-level easting/northing and latitude/longitude coordinates; these replace those provided by our previous supplier. This data is supplied under evaluation terms which you have already signed up to.
These will be replaced with coordinates derived from Ordnance Survey Open Source data once this has been released in July 2020.
As a result of recent supplier changes we are also withdrawing a number of fields including the geocode accuracy and red route fields. The Congestion Zone field will remain but not be populated in the next build.
We will also be withdrawing the Tenancy field as a result of other supplier licensing changes.
Finally there are a number of fields which have not been populated for some time including multiplicity, outdoor area, building count and number of adult occupants. All of these fields are present in this build containing default values in most cases but will be removed from the July 2020 build.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
No new fields have been added in this release but sources have been updated.
The listed building field now reports the grade of listing (I, II* or II in England or Wales; A, B or C in Scotland). Previously buildings were just reported as listed or not listed.
A tenancy field was added in this release which identifies a property as being rented, social housing or owner-occupier.
No new fields have been added in this release but sources have been updated. In addition, the documentation now provides details of data recency.
This release contains further parameters derived from LIDAR data. These include building footprint, building volume, the average roof slope, a flat roof fraction, the distance from the property geocode to the nearest tree over 10 metres high, and a geocode multiplicity, which counts the number of geocodes within a building. The building footprint is not listed below since it is not a new field but has been re-calculated using LIDAR data.
This release incorporates a major new dataset which has brought improved accuracy to numbers of bedrooms, numbers of floors, property type and property age fields as well as introducing a number of new fields, listed below. The cadastral, outdoor area and footprint fields will be populated only with default values from this release onwards for licensing reasons. We hope to re-introduce the building footprint in the February 2018 release.
This release sees a switch to using the Royal Mail PAF, Not Yet Built and Multiple Residence files as the base address list, which results in approximately 10% more addresses than earlier releases. In addition, accuracy in identifying flats was improved substantially, and a number of fields pertaining specifically to flats were included.
This release introduced the following new fields, with a focus on logistics.
This is the first public release of the Property Intelligence dataset.