For the past couple of weeks, we’ve been writing a lot about building location-based applications, sharing our expertise in developing both Uber-like apps and enterprise-scale platforms for the logistics industry.
Today, I’d like to introduce you to the main principles our teams use when designing spatial databases. We’re also going to look specifically at the case of the aforementioned logistics platform, built for our largest client, Trans.eu.
I had a chance to sit down with Wojciech Tomaszewski, Location-based Solutions Team Leader at RST Software, and chat about what the team has learned through many years of developing and managing such a large system. Trans.eu is one of the biggest logistics software companies in Europe and Asia, with over 62 000 concurrent businesses active on the platform each day. So, let’s get started.
Introduction to GIS database development
First, it is worth mentioning that spatial database development, although similar in its basics, is still somewhat different from building a regular database. It has its own challenges, limitations and, obviously, advantages.
Geographic data comes in various forms: from the simplest, coordinates, to more complex ones such as lines, polygons, or collections of these basic types. A GIS database makes it much easier, faster and more efficient to access, preview and process spatial data, which in turn can be used to generate all kinds of resources: raster or vector maps, geofencing areas, various data layers (e.g. heatmaps) etc.
Now, one could think, ‘Okay, seems easy enough, I’mma go and create my spatial data real quick’. Well, I’ll have to stop you for a moment here. Location data, although in most cases standardized, can get tricky to manage on a large scale.
If you’re using third-party providers such as Google Maps or Mapbox, you most likely won’t have to deal with your own GIS databases. If you’re building a custom solution, on the other hand, like the one we built for Trans.eu using OpenStreetMap, you’ll have to be prepared for constant maintenance. Here’s what we learned by developing a platform that operates in 70 countries.
3 important principles when designing your database for a location-based app
Generally speaking, you want to know whether you’re building a local-scale project or something that will eventually scale to operate in multiple markets.
Principle #1: Create separate spatial databases for each country
At the very beginning, we used a single GIS database when developing a custom OpenStreetMap-based solution for Trans.eu. It made perfect sense, logically. The issue came when we started writing custom data layers and search services with forward and reverse geocoding.
One might expect there to be a single standard for naming administrative areas… Let’s say you decided to filter out only districts. Okay, seems like it works just fine. Wait, what’s going on with my UK data? Why am I only seeing 1 district in England, which is the entire England?
Well, because in the UK, districts are administrative level 3, not 2, as in many other European countries. Level 2 are the countries within the UK: England, Wales etc. This difference stems from the fact that each country has its own specifics for administrative division, often historical, which aren’t easy to unify. So, now you need to start writing IFs for your database queries. A single IF would be alright, but then you try to use other filters and suddenly your Spanish spatial data becomes a mess… What? Someone updated OpenStreetMap data and now entire countries stopped working? Great… If you think I’m exaggerating to make a point, I’m not. Working with geospatial data requires precision.
To avoid the above issues, we split our single database into country-specific databases, 70 in total. What it means in practice is that you need to create an abstract location model, which will be available to the end-user, and map all that country-specific data from separate sources to it. It’ll make everything, including updates and maintenance, so much safer. Speaking of which.
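Here is a rough Python sketch of what such an abstract location model could look like. Everything in it is an assumption made for illustration: the `AreaKind` and `Location` names, the `LEVEL_MAPPINGS` table, and the level numbers, which simply follow the numbering used in this article rather than any official scheme. The point is that the IFs move out of your queries and into one small mapping per country.

```python
from dataclasses import dataclass
from enum import Enum


class AreaKind(Enum):
    """Abstract area kinds exposed to the end-user (hypothetical)."""
    COUNTRY = "country"
    REGION = "region"
    DISTRICT = "district"


# One mapping per country database: raw administrative level -> abstract kind.
# Level numbers are placeholders that mirror the example in the text.
LEVEL_MAPPINGS = {
    "PL": {1: AreaKind.COUNTRY, 2: AreaKind.DISTRICT},
    "GB": {1: AreaKind.COUNTRY, 2: AreaKind.REGION, 3: AreaKind.DISTRICT},
}


@dataclass(frozen=True)
class Location:
    country: str
    name: str
    kind: AreaKind


def to_location(country, raw):
    """Translate one raw, country-specific row into the abstract model."""
    kind = LEVEL_MAPPINGS[country].get(raw["level"])
    if kind is None:
        raise ValueError(f"{country}: unmapped level {raw['level']}")
    return Location(country=country, name=raw["name"], kind=kind)


def districts(country, raw_rows):
    """Filter districts without any country-specific IFs in the caller."""
    locations = (to_location(country, row) for row in raw_rows)
    return [loc for loc in locations if loc.kind is AreaKind.DISTRICT]


gb_rows = [
    {"name": "England", "level": 2},
    {"name": "Some District", "level": 3},
]
print([loc.name for loc in districts("GB", gb_rows)])
```

With a mapping like this, a filter for districts no longer returns the whole of England, and a change in one country's source data only touches that country's mapping.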
Principle #2: Constantly verify whether your data is up-to-date
The world of geographic information is constantly changing. Each day roads are getting closed for repair or because of building sites, street names are getting changed, new roads are built etc. When building location-based apps, your data accuracy has to be on point.
Having separate databases makes it much easier to both update and monitor your data. For instance, Wojciech and his team wrote a number of dedicated services that regularly update the data, monitor its integrity and check that no modification breaks Trans’s platform before the updates are rolled out to production environments. Even a day of issues would rack up a hefty loss for any logistics company.
Whenever those tests fail, the team is immediately notified about the issue and sits down to manually fix the problem at hand. Such a system ensures continuity of Trans’s operations, because a failed update for one country does not affect data updates for the others.
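The idea of checking each country's update in isolation can be sketched in a few lines of Python. This is not the team's actual service: the checks, the 10% shrink threshold, and the `validate` and `plan_rollout` names are all assumptions chosen to keep the example short.

```python
def validate(snapshot, previous):
    """Return a list of problems found in a candidate data snapshot."""
    problems = []
    if not snapshot:
        problems.append("snapshot is empty")
    for row in snapshot:
        if not (-180 <= row["lon"] <= 180 and -90 <= row["lat"] <= 90):
            problems.append(f"invalid coordinates for {row['name']}")
    # Hypothetical sanity check: a sudden drop in record count is suspicious.
    if previous and len(snapshot) < 0.9 * len(previous):
        problems.append("record count dropped by more than 10%")
    return problems


def plan_rollout(candidates, current):
    """Split candidate updates into approved and rejected, per country."""
    approved, rejected = {}, {}
    for country, snapshot in candidates.items():
        problems = validate(snapshot, current.get(country, []))
        if problems:
            rejected[country] = problems  # notify the team, keep old data
        else:
            approved[country] = snapshot  # safe to roll out
    return approved, rejected


current = {
    "PL": [{"name": "A", "lon": 17.0, "lat": 51.1}],
    "ES": [{"name": "B", "lon": -3.7, "lat": 40.4}],
}
candidates = {
    "PL": [{"name": "A", "lon": 17.0, "lat": 51.1}],
    "ES": [{"name": "B", "lon": -3.7, "lat": 404.0}],  # broken latitude
}
approved, rejected = plan_rollout(candidates, current)
print(sorted(approved), rejected)
```

In this toy run the update for one country is held back while the other goes through, which is the property that keeps a single bad import from taking everything down.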
Principle #3: Prepare your database structure for easy integration with third-party search engines like Elasticsearch or Algolia from the get-go
When building a logistics platform, or an Uber-like app, for that matter, you’ll often have to provide your users with advanced search capabilities, including forward and reverse geocoding and autocomplete.
You’ll also have to use some kind of search engine to interconnect your separated databases. Hence, sorting everything out at the very beginning will save you from the pain of restructuring your databases later on, when you’ll have other important things to take care of.
At Trans.eu, we’re dealing with over 6.2 million search-related API requests daily, which requires us to provide our users with rapid performance of our systems. Just to give you some perspective, it’d cost us $420 000 per month to handle our scale of operations, should we decide to use Google Maps Platform instead of building a custom OpenStreetMap solution.
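As a sketch of what "search-ready" could mean in practice, the Python snippet below flattens records from separate country databases into one uniform document shape and runs a toy prefix search over them. The field names, the `to_search_document` helper and the in-memory `autocomplete` function are assumptions for illustration; a real setup would send such documents to Elasticsearch, Algolia or a similar engine rather than search a Python list. The coordinates are approximate.

```python
def to_search_document(country, record):
    """Flatten one country-specific record into a uniform search document."""
    return {
        # Prefix the id with the country so ids from separate databases
        # never collide in the shared index.
        "id": f"{country}:{record['id']}",
        "name": record["name"],
        "kind": record["kind"],
        "country": country,
        "location": {"lat": record["lat"], "lon": record["lon"]},
    }


def build_index(per_country_records):
    """Merge records from all country databases into one document list."""
    return [
        to_search_document(country, record)
        for country, records in per_country_records.items()
        for record in records
    ]


def autocomplete(index, prefix, limit=5):
    """Toy stand-in for a search engine's prefix query."""
    needle = prefix.casefold()
    matches = [d for d in index if d["name"].casefold().startswith(needle)]
    return sorted(matches, key=lambda d: d["name"])[:limit]


per_country = {
    "PL": [{"id": 1, "name": "Wrocław", "kind": "city", "lat": 51.11, "lon": 17.03}],
    "GB": [{"id": 1, "name": "Wrexham", "kind": "city", "lat": 53.05, "lon": -2.99}],
}
index = build_index(per_country)
print([doc["id"] for doc in autocomplete(index, "wr")])
```

Deciding on this shape early means every country database can feed the same index, whichever engine you end up choosing.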
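For a sense of scale, the two figures above can be combined into a back-of-the-envelope rate. The snippet only divides the numbers quoted in this article and assumes a 30-day month; it says nothing about how any provider actually prices its services.

```python
DAILY_REQUESTS = 6_200_000          # search-related API requests per day
QUOTED_MONTHLY_COST_USD = 420_000   # monthly estimate quoted in the article
DAYS_PER_MONTH = 30                 # assumption for this rough estimate

monthly_requests = DAILY_REQUESTS * DAYS_PER_MONTH
cost_per_thousand = QUOTED_MONTHLY_COST_USD / monthly_requests * 1000

print(f"{monthly_requests:,} requests per month")
print(f"~${cost_per_thousand:.2f} per 1,000 requests")
```

That works out to roughly 186 million requests a month, or a little over two dollars per thousand requests under these assumptions.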
Conclusions
Whether or not you are able to predict your future scale of operations at the moment, we recommend implementing the above principles at the very beginning. This way you won’t have to lose your head over dealing with it when your product is already live.
The discussed principles are based on the knowledge we acquired when building a successful logistics platform for our client, and we’re happy to be able to share it with those who are developing their own location-based applications.
If you’d like to talk to us about your spatial project, we are always here. Just drop us a couple of lines at ross@rst.software, and we’ll get back to you asap.
Until then, that’s it for today from me. See you soon!