I am not an expert on GIS but I like to think of myself as a geomapping enthusiastic. I recently downloaded Code-Point® Open file from the Ordnance Survey, a CSV file that provides a precise geographic location for each postcode unit in Great Britain. The product is a CSV file containing postcodes, grid references, NHS® health and regional health authority codes, administrative ward, district, county and country area codes.
The geo coordinates of this file come as Easting & Northing. From Wikipedia "*the terms easting and northing are geographic Cartesian coordinates for a point. Easting refers to the eastward-measured distance (or the x-coordinate), while northing refers to the northward-measured distance (or the y-coordinate)*".
Unfortunately things like Google Maps, Google Earth & others use as internal coordinate system the geographic coordinates (latitude/longitude) on the World Geodetic System of 1984 (WGS84) datum so in other to use Code-Point® Open file on one the above tools I had to convert eastings & northings to WGS84.
Thus I scanned the internet for possible solutions to my dilemma but none of them where exactly what I needed it. Some options where to use well-known APIs that will do the conversion for you, but my problem was I had 1.7 million postcodes so this wasn't an option really.
I also looked for python modules and read about Fiona and Shapely but none of them had the solution (probably due to my lack of knowledge on the subject) I was looking for.
Then I came across a python package called Pyproj which along with Pandas solved my problem easily (well, after couple of days of research).
Here it is my solution:
import os import pandas as pd import pyproj import re
listfiles = os.listdir("codepo_gb/Data") pieces =  columns = ['pstcode','positional_quality_indicator','eastings','northings','country_code','nhs_regional_ha_code', 'nhs_ha_code','admin_county_code','admin_district_code','admin_ward_code'] for f in listfiles: path = "codepo_gb/Data/%s" % f frame=pd.read_csv(path, names = columns) frame['filename']=f pieces.append(frame) postcodes = pd.concat(pieces, ignore_index=True)
Int64Index: 1687605 entries, 0 to 1687604 Data columns: pstcode 1687605 non-null values positional_quality_indicator 1687605 non-null values eastings 1687605 non-null values northings 1687605 non-null values country_code 1687605 non-null values nhs_regional_ha_code 1440838 non-null values nhs_ha_code 1596706 non-null values admin_county_code 636420 non-null values admin_district_code 1687514 non-null values admin_ward_code 1687514 non-null values filename 1687605 non-null values dtypes: int64(3), object(8)
Now here it is a quick example of converting eastings and northings to latitude and logitude and further down is the same piece of code wrapped into a function. I also extracting the area level from the post code.
bng = pyproj.Proj(init='epsg:27700') wgs84 = pyproj.Proj(init='epsg:4326') # AL1 1AB - pyproj.transform(from,to,easting,northing) lon,lat = pyproj.transform(bng,wgs84, 394251, 806376) print lon, lat
sample = postcodes
def proj_transform(df): bng = pyproj.Proj(init='epsg:27700') wgs84 = pyproj.Proj(init='epsg:4326') lon,lat = pyproj.transform(bng,wgs84,df['eastings'], df['northings']) area_pattern = r'([a-zA-Z]+)(?=\d)' #area = re.search(r'[a-zA-Z]+(?=\d)',df['pstcode'].str[:-3]).group() df['lat'] = lat df['lon'] = lon df['area'] = df['pstcode'].str.findall(area_pattern, flags=re.IGNORECASE).str return df
I now create a new data frame called sample2 with 3 new columns: latitude, longitud and area postcode level.
sample2 = proj_transform(sample)
With my new dataframe I then export it to a csv file and job done.
I have been playing with QuatumGIS but not enough to add something interesting yet so I'll be posting about producing shapefiles than could be use on Google maps soon.
I am done for now.