Where in the World Are You With SAS® Visual Analytics?
Geo mapping is the process of taking any data with a location element and displaying it on a map. SAS® Visual Analytics supports three different types of geo mapping visualisations:
Geo Bubble Maps
Locations are displayed as bubbles on a map where the size and colour of the bubbles can be used to represent the values of different measures on continuous scales.
Geo Coordinate Maps
Locations are displayed as points on a map. The colour of the point can be used to represent the value of a category or a measure on a discrete scale using Display Rules.
Geo Region Maps
Locations are displayed as regions on a map. The colour of the region can be used to represent the value of a measure on a continuous scale.
The obvious commonality between all these visualisations is the need for a data item that represents the location. In SAS Visual Analytics such data items are referred to as “Geography” data items. The starting point for any geography data item is specifying the coordinates of the location to be plotted. These coordinates must be expressed in a form (known as a coordinate system) that is supported by SAS Visual Analytics. Such coordinate systems fall into the following categories:
1. Countries and regions can be referenced using the following systems:
- Names (e.g. United Kingdom)
- ISO 2-Letter Codes (e.g. GB)
- ISO Numeric Codes (e.g. 826)
- SAS Map ID Values (e.g. GB)
2. Country and region subdivisions can be referenced using the following systems:
- Names (e.g. Oxfordshire)
- SAS Map ID Values (e.g. GB-38)
3. US specific locations can be referenced using the following systems:
- US State Names (e.g. North Carolina)
- US State Name Abbreviations (e.g. NC)
- US ZIP Codes (e.g. 27513)
For each of the coordinate systems listed above, the location values supplied must exactly match values that are pre-defined by SAS Institute. The current list of supported values can be found here.
Each geography data item must follow only one of these systems. For example, the data item must consist entirely of US State Names or entirely of US State Name Abbreviations; it cannot vary between the two. Values that are not supported by the selected system are not displayed in the Geo Map visualisation, and a warning is displayed in the visualisation instead.
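Because unsupported values are silently dropped from the map, it can be worth checking a location column before loading the data. The following is a minimal sketch in Python, not part of SAS Visual Analytics: the lookup sets are small illustrative subsets (the authoritative lists are those published by SAS Institute), and the function names `unsupported_values` and `matching_systems` are hypothetical.

```python
# Illustrative subsets only; the real lookup values are those published by SAS Institute.
US_STATE_NAMES = {"North Carolina", "New York", "Texas"}
US_STATE_ABBREVIATIONS = {"NC", "NY", "TX"}

SYSTEMS = {
    "US State Names": US_STATE_NAMES,
    "US State Name Abbreviations": US_STATE_ABBREVIATIONS,
}


def unsupported_values(values, system):
    """Return the distinct values that do not exactly match the chosen system."""
    supported = SYSTEMS[system]
    return sorted({v for v in values if v not in supported})


def matching_systems(values):
    """Return the names of the systems for which every value is supported."""
    return [name for name in SYSTEMS if not unsupported_values(values, name)]


# A column that mixes names and abbreviations matches neither system.
mixed = ["North Carolina", "NY", "Texas"]
print(matching_systems(mixed))
print(unsupported_values(mixed, "US State Names"))
```

A column for which `matching_systems` returns an empty list would need to be cleaned, or converted to a single system, before being used as a geography data item.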
The following information is also worth noting:
- When working with country or region subdivision level data, only one country or region can be specified.
- It is possible to create hierarchies of geography data items enabling drilldown within the visualisation from country to US State to US ZIP Code:
Note that when creating any geography data item, it is usually desirable to duplicate the location data item first and to convert the duplicate to the geography data item. This leaves the original data item available for use in non-Geo Map visualisations.
Creating a geography data item in the SAS Visual Analytics Report Designer or Data Explorer using one of these coordinate systems is as simple as right-clicking on the (usually category) data item containing the location and selecting the appropriate coordinate system from the Geography sub-menu:
If the location values in your data do not follow any of the above systems (UK postcodes, for example), or if more precise locations are required, SAS Visual Analytics supports “Custom” geography data items. Note that custom geography data items are not supported for Geo Region Maps. Custom geography data items require three input data items:
- A category data item containing the location names (e.g. UK postcodes);
- Two measure (numeric) data items - one containing the latitude value and one containing the longitude value.
The longitude and latitude values must be specified using one of the following coordinate systems:
- World Geodetic System (WGS84; EPSG:4326) (values must be given in degrees)
- Web Mercator (EPSG:3857) (values must be given in metres)
- British National Grid (OSGB36; EPSG:27700) (values must be given in metres)
The European Petroleum Survey Group (EPSG) maintains a database of coordinate system information. Further details about each of the above coordinate systems can be found on their website by searching for the EPSG codes given above.
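To illustrate the difference between coordinates in degrees and coordinates in metres, the sketch below converts a WGS84 longitude/latitude pair to Web Mercator using the standard spherical Mercator formulas. It is an illustration only and is not how SAS Visual Analytics performs the conversion internally; the function name `wgs84_to_web_mercator` and the sample coordinates (an approximate point near Oxford) are assumptions made for the example.

```python
import math

# Sphere radius used by Web Mercator: the WGS84 semi-major axis, in metres.
EARTH_RADIUS_M = 6378137.0

# Web Mercator is only defined up to roughly 85.06 degrees north and south.
MAX_LATITUDE_DEG = 85.06


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS84 degrees (EPSG:4326) to Web Mercator metres (EPSG:3857)."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError(f"longitude out of range: {lon_deg}")
    if not -MAX_LATITUDE_DEG <= lat_deg <= MAX_LATITUDE_DEG:
        raise ValueError(f"latitude out of range for Web Mercator: {lat_deg}")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Approximate, illustrative coordinates only.
x, y = wgs84_to_web_mercator(-1.26, 51.75)
print(f"x = {x:.1f} m, y = {y:.1f} m")
```

The range checks also serve as a quick sanity test on input data: a longitude and latitude that have been swapped, or a value supplied in metres where degrees are expected, will usually fail them.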
At the time of writing, coordinate values for UK postcodes (using the OSGB36 coordinate system) are freely available in the Code-Point Open product provided by Ordnance Survey. The data is provided as a series of CSV files, one for each postcode area (e.g. OX). SAS Institute have made available a macro for converting these CSV files to a single SAS data set. The macro also converts the OSGB36 coordinates to WGS84 coordinates; either set of coordinates can be used with SAS Visual Analytics. Similar data (although not always freely available) and SAS macros are available for other countries and even IP addresses.
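For readers who want to inspect the postcode files outside SAS, the step of combining the per-area CSV files can be sketched in a few lines of Python. This is not the SAS Institute macro and it does not convert OSGB36 to WGS84. It assumes that the files have no header row and that the postcode, easting and northing are in the first, third and fourth columns respectively; these assumptions should be checked against the documentation supplied with the data. The function name `combine_postcode_files` is hypothetical.

```python
import csv
from pathlib import Path

# Assumed column positions (zero-based); verify against the product documentation.
POSTCODE_COL, EASTING_COL, NORTHING_COL = 0, 2, 3


def combine_postcode_files(csv_dir, out_path):
    """Merge per-area CSV files into one CSV with a header row.

    Returns the number of postcode rows written.
    """
    out_path = Path(out_path)
    # List the inputs first so the output file is never read as an input.
    sources = sorted(
        p for p in Path(csv_dir).glob("*.csv") if p.resolve() != out_path.resolve()
    )
    count = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["postcode", "easting_m", "northing_m"])
        for path in sources:
            with open(path, newline="") as src:
                for row in csv.reader(src):
                    if len(row) <= NORTHING_COL:
                        continue  # skip blank or short lines
                    writer.writerow(
                        [
                            row[POSTCODE_COL].strip(),
                            int(row[EASTING_COL]),
                            int(row[NORTHING_COL]),
                        ]
                    )
                    count += 1
    return count
```

The combined file keeps the eastings and northings in metres, so it would be used with the British National Grid option when defining the custom geography data item.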
To create a custom geography data item in the SAS Visual Analytics Report Designer or Data Explorer, right-click on the data item containing the location names (e.g. UK postcodes) and select the Custom option from the Geography sub-menu. You will then be asked to specify the names of the longitude and latitude data items and to select the appropriate coordinate system: