May 28, 2024
The quality of our squash venue data is paramount.
But quality data is always a journey, never a destination. It’s about ongoing improvement, refinement and learning.
One of the many challenges we face in cleaning up the data is duplication. Duplication means the same squash facility gets counted twice, which is not good!
Duplication happens for a number of reasons:
1. Both the venue AND the club (that operates out of the venue) have been added. We want the venue data, not the club data, for reasons explained here.
2. In a large complex, e.g. a university campus, the main university building has sometimes been added in addition to, for example, the sports or squash complex located on the same campus.
3. There is a content error on Google Maps. Our database of venues is tethered to Google Maps. Sometimes the same venue, perhaps with an alternative name, has been added twice to Google Maps, and our database inherits this mistake.
The Solution?
We look for venues that are ‘suspiciously’ close together. The closer they are together, the more likely they are to be duplicates.
For example, in this report, we look for all venues that are within 300 metres of each other.
This gives us the opportunity to check whether one of the three conditions above applies and, if so, to make the correction by archiving the duplicate. Archiving also means that the duplicate can never be accidentally added again via the app.
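For the curious, the proximity check described above can be sketched in a few lines of Python. This is only an illustration, not our actual report code: the function names, the `name`/`lat`/`lng` field names and the use of the haversine great-circle formula are all assumptions made for the example. Only the 300 metre threshold comes from the report.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lng points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def suspicious_pairs(venues, threshold_m=300):
    """Return (distance, name_a, name_b) for every pair within threshold_m.

    `venues` is a list of dicts with hypothetical keys "name", "lat", "lng".
    Results are sorted closest first, since closer pairs are the more
    likely duplicates.
    """
    pairs = []
    for a, b in combinations(venues, 2):
        d = haversine_m(a["lat"], a["lng"], b["lat"], b["lng"])
        if d <= threshold_m:
            pairs.append((d, a["name"], b["name"]))
    return sorted(pairs)
```

Comparing every venue with every other venue is fine for a sketch, but it grows quadratically with the number of venues. A real database would more likely rely on a spatial index to do the same job.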
In due course, we will add more intelligence to the app itself. When someone attempts to recommend a new venue for addition to our database, the app will automatically show the user a list of nearby venues, and question whether they still wish to proceed with their recommendation.
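As a rough idea of how that in-app check might work, here is a minimal Python sketch. It is hypothetical: the feature is not built yet, and the function names, field names and the 300 metre radius (borrowed from the report above) are assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def distance_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in metres between two points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearby_venues(candidate, existing, radius_m=300):
    """List existing venues within radius_m of a recommended venue.

    `candidate` and each item in `existing` are dicts with hypothetical
    keys "name", "lat", "lng". Returns (distance, name) tuples, closest first.
    """
    hits = []
    for venue in existing:
        d = distance_m(candidate["lat"], candidate["lng"],
                       venue["lat"], venue["lng"])
        if d <= radius_m:
            hits.append((d, venue["name"]))
    return sorted(hits)


def needs_confirmation(candidate, existing, radius_m=300):
    """True if the user should be asked whether they still wish to proceed."""
    return bool(nearby_venues(candidate, existing, radius_m))
```

If `needs_confirmation` returns `True`, the app would show the list from `nearby_venues` and ask the user to confirm before the recommendation is submitted.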
IF YOU SPOT A DUPLICATE, you can open up the venue in the app, scroll to the bottom of the listing, and click ‘Recommend venue deletion’.