Innovation, our reason for being
First and foremost, in case you are not familiar with the topic: spatial data is geographical and regional information, including points of interest (POIs), shapes on a map (lines, polygons, etc.), and information about a municipality, state or country.
I should tell you that spatial data is a complex topic, and a software discipline in its own right. Proxia® provides spatial support that is limited to a set of features, including:
When I decided to tackle this problem, I needed to answer several questions. Perhaps the main one was: where should the grouping logic run? Although many JavaScript libraries (Google, Leaflet, etc.) address this problem, I didn’t think they were the right approach, since:
This decision is a tricky one: if we don’t want to penalize the user experience, we must optimize the processing time. Providing a good solution involves knowing how mapping technologies such as Google Maps, Leaflet or Bing Maps work.
As I said, to retrieve the POIs (or areas) we must know how mapping technology works. Basically, we should consider the following issues:
Obviously, there isn’t a single image of the whole Earth for each zoom level. That wouldn’t be very practical: its size in KB would be humongous at high zoom levels and, in practice, we only work with a small area of the Earth at a time. Thus, maps are divided into a grid of tiles. The more you zoom in, the greater the number of tiles.
Therefore, POI retrieval involves knowing which tiles are being requested. You could think of it as a function of the zoom level and the visible map boundaries. For each tile we retrieve the visible POIs inside it. For areas, we just need to know whether one of the area’s points is contained within the tile. Building all of this is not really complex, but you have to be careful to provide the system with different key-value layers, since you want swift data retrieval.
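To make the "function of zoom level and visible boundaries" idea concrete, here is a minimal sketch. It assumes the Web Mercator XYZ tile scheme used by OpenStreetMap-style maps; the post does not say which scheme Proxia® uses internally, and the function names are made up for illustration. It also ignores views that cross the antimeridian.

```python
import math

# Web Mercator cannot represent the poles, so latitude is clamped (assumption:
# standard XYZ "slippy map" tiling, not necessarily what Proxia uses).
MAX_LAT = 85.05112878


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile that contains a longitude/latitude at a zoom level."""
    n = 2 ** zoom  # number of tiles per axis at this zoom level
    lat = max(-MAX_LAT, min(MAX_LAT, lat))
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_view(west, south, east, north, zoom):
    """List the (zoom, x, y) keys of every tile touched by the visible bounds."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]
```

The resulting `(zoom, x, y)` tuples are a natural key for a key-value layer: one lookup per visible tile.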
As you can see, tiles in low-zoom layers contain a lot of POIs; as the zoom level increases, the number of POIs in each tile decreases.
Obviously, it makes no sense to store all the POI information in each layer, since we’d be wasting a lot of memory and storage, so we have to implement another layer to store the data shared by the POIs. This additional layer can also hold other filters such as the POI type or the postal address. In this way, seeking POIs in a geographic area is, in fact, a matter of locating the proper tiles, retrieving their POIs and then, finally, filtering those that match the user’s selection.
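As a rough illustration of this layering, the sketch below keeps only POI identifiers in the per-tile layer and the shared attributes in a separate one. The class name, the `(zoom, x, y)` key shape and the attribute names are assumptions made for the example, not Proxia®’s actual data model.

```python
from collections import defaultdict


class PoiIndex:
    """In-memory sketch: a tile layer holding ids plus a shared-data layer."""

    def __init__(self):
        self.tiles = defaultdict(set)  # (zoom, x, y) -> set of POI ids
        self.shared = {}               # POI id -> attributes stored once

    def add(self, poi_id, tile_keys, **attrs):
        """Register a POI under every tile key that contains it."""
        self.shared[poi_id] = attrs
        for key in tile_keys:
            self.tiles[key].add(poi_id)

    def query(self, tile_keys, **filters):
        """Locate tiles, collect their POI ids, then filter on shared data."""
        ids = set()
        for key in tile_keys:
            ids |= self.tiles.get(key, set())
        return sorted(
            poi_id
            for poi_id in ids
            if all(self.shared[poi_id].get(k) == v for k, v in filters.items())
        )


index = PoiIndex()
index.add("a", [(1, 0, 0), (2, 1, 1)], type="hospital")
index.add("b", [(1, 0, 0)], type="hotel")
hotels_in_tile = index.query([(1, 0, 0)], type="hotel")
```

Storing ids rather than full records in each tile keeps the per-zoom layers small, at the cost of one extra lookup per POI when filtering.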
You might wonder: what about storage? Is all this information kept in a database? Actually, no. In the database we only store the basic information, just enough to build this whole layer structure when the system starts up.
This process introduces, in fact, a new problem: how can the system deal with the creation, update or removal of POIs? We use a broadcasting solution, which I’ll tell you about in a future post.
Certainly, a map filled with pins is not a good solution for end users. POI grouping allows us to provide a solution focused on user experience.
From a technical point of view, you might think that this grouping process is not a very complex problem: all you have to do is find neighbouring POIs and mark them as a single cluster. In fact, it is a bit trickier. You should:
On the other hand, grouping just for the sake of it doesn’t make sense. We have to include other approaches to improve the user experience, such as group qualifiers. A group qualifier allows us to group similar POIs. Just imagine a system comprising medical centers and hosting facilities: proximity cannot be the only criterion for grouping a series of POIs, at least at medium zoom levels.
Proxia® takes all these issues into account, allowing system administrators to create different grouping qualifiers, structured in hierarchies. This lets us provide our clients with a grouping solution that is independent of the user’s device, is based on the zoom level, and considers only the POIs recovered in the retrieval step.
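To give a feel for zoom-dependent grouping with qualifiers, here is a deliberately simple sketch. It uses plain grid bucketing in degrees, which is a stand-in technique and not necessarily how Proxia® groups POIs. The `qualifier` field, the flat (non-hierarchical) qualifiers and the `cells_per_axis_at_zoom0` parameter are all assumptions for the example.

```python
import math
from collections import defaultdict


def group_pois(pois, zoom, cells_per_axis_at_zoom0=4):
    """Group POIs that share a qualifier and fall in the same grid cell.

    The cell size halves with every zoom level, so clusters split apart
    as the user zooms in. Cells are measured in degrees, which is only
    a rough approximation of what the user sees on a Mercator map.
    """
    cell = 360.0 / (cells_per_axis_at_zoom0 * 2 ** zoom)
    buckets = defaultdict(list)
    for poi in pois:
        key = (
            poi["qualifier"],
            math.floor(poi["lon"] / cell),
            math.floor(poi["lat"] / cell),
        )
        buckets[key].append(poi)

    clusters = []
    for (qualifier, _, _), members in buckets.items():
        clusters.append({
            "qualifier": qualifier,
            "count": len(members),
            # Place the cluster marker at the centroid of its members.
            "lon": sum(p["lon"] for p in members) / len(members),
            "lat": sum(p["lat"] for p in members) / len(members),
        })
    return clusters


pois = [
    {"qualifier": "medical", "lon": -3.70, "lat": 40.41},
    {"qualifier": "medical", "lon": -3.71, "lat": 40.42},
    {"qualifier": "hotel", "lon": -3.705, "lat": 40.415},
]
medium_zoom = group_pois(pois, zoom=5)
high_zoom = group_pois(pois, zoom=16)
```

At zoom 5 the two medical centers collapse into one cluster while the hotel stays separate, even though all three are close together; at zoom 16 every POI is shown on its own.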
In this case we are not talking about representing POIs on a map, but about retrieving information based on criteria such as "in the surroundings", "in this municipality" or "25 kilometers from here". You could think of it like the typical recommendation service: users who bought this product also bought these others.
Retrieving information by region is, in fact, quite easy: you only need to store an encoded postal address (country, region, state, municipality, town) with the information and look for matches. For this coding, we have used the Spanish INE database and the Portuguese Freguesías database.
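One way to picture the matching, as a sketch only: if the encoded address is stored from the broadest level to the narrowest, a region query becomes a prefix comparison. The codes below are invented placeholders, not real INE or Freguesías codes, and the tuple layout is an assumption.

```python
def in_region(address, region):
    """True when the encoded address starts with the region's code path."""
    return address[:len(region)] == region


def find_in_region(items, region):
    """Return the names of the items whose address lies inside the region."""
    return sorted(name for name, address in items.items() if in_region(address, region))


# Hypothetical codes ordered as (country, region, state, municipality, town).
items = {
    "clinic": ("C1", "R1", "S1", "M1", "T1"),
    "hostel": ("C1", "R1", "S1", "M2", "T7"),
    "museum": ("C1", "R2", "S4", "M9", "T3"),
}
same_municipality = find_in_region(items, ("C1", "R1", "S1", "M1"))
same_state = find_in_region(items, ("C1", "R1", "S1"))
```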
The problem arises when you try to apply this idea to criteria such as "in the surroundings" or "25 kilometers from here". The reason is that calculating the distance between two points is not a simple task. Computing the actual, road-based distance is extremely complex, and even computing the radius-based (straight-line) distance isn’t simple: you cannot use the Pythagorean theorem, because the Earth is not flat, so you need something like the haversine formula instead.
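For reference, the haversine formula fits in a few lines. This sketch treats the Earth as a sphere with the mean radius of 6371.0088 km, so it is an approximation; the function name is just for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    # min() guards against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


one_degree_of_latitude = haversine_km(0.0, 0.0, 1.0, 0.0)  # roughly 111 km
```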
Our approach is a batch process that stores these precomputed relationships. This gives us enough power and flexibility to return related information in an extremely short time.
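A batch job of this kind could look roughly like the sketch below, which precomputes for every POI the list of other POIs within a radius. It is a naive all-pairs pass and only an illustration of the idea, not the actual Proxia® process. The `distance` argument is any callable, for example a haversine implementation; the toy one used here simply keeps the example short.

```python
from itertools import combinations


def precompute_neighbours(pois, radius_km, distance):
    """Map each POI id to the ids within radius_km, nearest first.

    Compares every pair once, so the cost grows quadratically with the
    number of POIs; acceptable for an offline batch, not per request.
    """
    found = {poi_id: [] for poi_id in pois}
    for (id_a, pos_a), (id_b, pos_b) in combinations(pois.items(), 2):
        d = distance(pos_a, pos_b)
        if d <= radius_km:
            found[id_a].append((d, id_b))
            found[id_b].append((d, id_a))
    return {poi_id: [other for _, other in sorted(hits)] for poi_id, hits in found.items()}


# Toy data: positions on a line, measured in kilometers.
pois = {"a": (0, 0), "b": (0, 10), "c": (0, 100)}
toy_distance = lambda p, q: abs(p[1] - q[1])
within_25_km = precompute_neighbours(pois, 25, toy_distance)
```

At query time, "25 kilometers from here" then becomes a single dictionary lookup instead of a distance computation per candidate.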
Map images provided by OpenStreetMap; other icons used: