Selecting location data from a spatial database

I have been thinking to write about this subject a while back when project Spincloud was still under development. I was even thinking about making this the first post on my blog.
The idea is simple: you have location-based data (POIs for instance) stored in some database (preferably a spatial DB) and now you want to perform a select statement that will indicate the area that should include the points we want. In case of Spincloud’s weather map, we want the weather reported by the stations located within a given area determined by the Google Map viewport that the user is currently browsing.
In all my examples I’ll use SQL Spatial Extensions support, specifically MSQL spatial extensions.
Here’s a visual representation of the spatial select (the red grid is the area where we want to fetch the data):


This is quite easy to accomplish by issuing a spatial select statement on the database:

select * from POI where Contains(GeomFromText
    ('POLYGON ((-30 32, 30 -8, -89 -8, -89 32, -30 32))', LOCATION))

But what about selecting an area that crosses the 180 degrees longitude? Let’s say we want to select data in an area around New Zealand that starts at 170 degrees latitude and ends at -160 degrees latitude going East. The selected area will look like this:


If we are to describe this with a polygon, we may be tempted write it like this using Spatial SQL parlance:

POLYGON ((-160 -23, -160 -55, 170 -55, 170 -23, -160 -23))

but you’d be wrong. In fact this polygon describes an area on the map that excludes exactly our area of interest:

This happens specifically because the 180 degrees meridian is crossed (the next point of longitude beyond 180 degrees is -180 degrees).
To fix this special case, you have to test for this occurrence and if so we’ll construct two polygons, split by the 180 degrees meridian (notice the exaggerated division between the polygons east and west of the 180°):


Two spatial selects will have to be issued in order to account for areas on each side of the 180° meridian.

If you’re using Google Maps then you most likely use getBounds() to get the bounding box. The API reference specifically states that the returned box is “a rectangle in geographical coordinates, including one that crosses the 180 degrees meridian”. I use the excellent Vividsolutions JTS for all GIS related code so to construct the polygons I use the following logic:

public List getPOIsInArea(float neLng, float neLat, float swLng, float swLat) {
  GeometryFactory gf = new GeometryFactory();
  if((neLng < swLng) ) { //cross lat or long 180 degree boundary
   Polygon eastPol = gf.createPolygon(gf.createLinearRing(new Coordinate[] {
    new Coordinate(neLng, neLat),
    new Coordinate(-180, neLat),
    new Coordinate(-180, swLat),
    new Coordinate(neLng, swLat),
    new Coordinate(neLng, neLat)
     }), new LinearRing[]{});
   Polygon westPol = gf.createPolygon(gf.createLinearRing(new Coordinate[] {
    new Coordinate(swLng, swLat),
    new Coordinate(180, swLat),
    new Coordinate(180, neLat),
    new Coordinate(swLng, neLat),
    new Coordinate(swLng, swLat)
     }), new LinearRing[]{});
     //select POIs inside both eastPol and westPol
  } else {
      Polygon pol = gf.createPolygon(gf.createLinearRing(new Coordinate[] {
     new Coordinate(neLng, neLat),
     new Coordinate(neLng, swLat),
     new Coordinate(swLng, swLat),
     new Coordinate(swLng, neLat),
     new Coordinate(neLng, neLat),
     }), new LinearRing[]{});
     //select POIs from pol

The above code creates two polygons if the 180° meridian is crossed (E longitude is greater than the W longitude for the bounding box):

East polygon:
POLYGON ((-160 -23, -180 -23, -180 -55, -160 -55, -160 -23))

West polygon:
POLYGON ((160 -55, 180 -55, 180 -23, 160 -23, 160 -55))

Both polygons will be used when selecting POIs from the database. The resulted POI list will be of course the union between the POIs in the east and west polygons.

What’s remaining is to present the data on the map but since the spatial select may return more points than the maximum we want to show on the map (i.e. 1000 returned versus 20 we want shown), we’ll have to employ some balanced point reduction step (we call it clustering) before the POIs are presented on the map. I’ll talk about point clustering and using the cool k-means algorithm in a future post.