by
Paul Mendoza
8/5/2011
SQL Server has a geography data type that can store geometry or latitude and longitude points. One of the major selling points of this data type is to allow an efficient indexing and lookup of points near a specific latitude and longitude.
Once a spatial index is created there are two different functions that make this possible and picking the right one is important.
Filter() and STInterects()
Here is an example of using both queries.
DECLARE @g geography = geography::Point(41.848039, -87.96361, 4326);
DECLARE @region geography = @g.STBuffer(5000);
select count(0) from CustomerInformation ci WITH(INDEX(IX_CI_Geocode))
where ci.Geocode.Filter(@region) = 1
select count(0) from CustomerInformation ci WITH(INDEX(IX_CI_Geocode))
where ci.Geocode.STIntersects(@region) = 1
Notice that I have included a hint to use the spatial index that I’ve setup before this.
The only difference between these two queries is that one uses Filter() and the other uses STInterects().
For performance on a set of 5 million geocoded locations using a correctly configured spatial index, the first query returns results in less than a second using Filter() and the second query runs in 3 seconds. Why the difference you might ask?
Filter() is much faster because it can return false positives. As a result it doesn’t run as many checks to ensure that the point falls exactly within a region.
STInterects() does extra calculations to ensure that no false positives are returned. This results in much high CPU usages.
Here is how I’ve found use cases for each of these:
Filter - I’ve used this when building a map of points. On my map I want to show points around the center of where the map is currently looking. Because I don’t care if there are some extra points plotted outside of the visible map space, I use the Filter. Returning extra results doesn’t matter as much as speed matters.
STInterects - When I want to determine which points are in a geography object that is a polygon shape. For instance, I might have Lat and Long values for points but don’t know what the Zip code is that they are in. I might have in my database geography objects as polygons that represent the shape of individual zip codes. I could write a query that determines which zip code geography point this falls into. I don’t want any false positives here because then my query could end up coming back saying that my point is in two zip codes which it isnt.