StarBorn – A Real-Time Location Based Game for Geographic Information Mining


As part of my MSc in Geographic Information Science, I am writing my master’s thesis on creating, implementing and analysing a location based game and the data it generates. Below you can find my detailed MSc concept. At the time of writing this blog post, I am already well into the implementation phase and have started alpha testing with a small chosen group of people. For the implementation, I am using the Symfony framework coupled with a Neo4j database for storing most of the information and a Postgres/PostGIS database which acts as a complex spatial index within the game. If you have any questions, criticism or encouragement, feel free to drop me a mail or comment.

Detailed Concept

PDF version: 20160531_detailed_concept

State of the art

Land cover products have “been identified as one of the fundamental variables needed in order to study the morphological and functional changes occurring in the Earth’s ecosystems and the environment” (Congalton et al. 2014) and are key in decision and policy making processes (Congalton et al. 2014; Mallupattu & Sreenivasula Reddy 2013; Lambin et al. 2001). However, according to (Fritz et al. 2009), “global land cover datasets still show quite a high degree of disagreement”, and (See et al. 2013) agree that “[…] datasets such as GLC-2000, MODIS and GlobCover […] frequently disagree over the land cover they record at any given location”. This calls for validation processes that make it possible to estimate the quality of existing land cover datasets and to improve faulty classifications. A particular point of interest in the scientific community is the verification and improvement of land cover products using citizen science and crowdsourcing (Fritz et al. 2009; Foody & Boyd 2012; See et al. 2013). In (Fritz et al. 2009; See et al. 2013), the usability of a crowdsourcing platform for land cover validation processes was scientifically examined. With regard to data quality, (See et al. 2013) concluded that the overall quality of the crowdsourced information is relatively high and that the differences between experts and non-experts are small, although they varied depending on land cover type and throughout the test period. This contradicts the general assumption that “data produced by volunteers is often considered as being of lesser quality than data produced by experts” as stated in (Yanenko & Schlieder 2014). It is also mentioned that “reliability of the information provided by non-experts improved faster and to a greater degree than experts” (See et al. 2013), which calls for targeted means of training. Research has also been conducted by (Hutchison et al. 2012) on how to allocate non-expert users when crowdsourcing satellite imagery analysis. 
The authors argue that “the precision rate of any parallel strategy increases with the number of users” and “an iterative strategy improves the spatial coverage (and thus the recall rate) as the iteration goes on”. It is also stated that “allocating more than 5 volunteers has a low impact on the accuracy and variability, while increasing unnecessary the resources” (Hutchison et al. 2012), which coincides with the findings of (Haklay et al. 2010), who confirm the validity of Linus’ Law in volunteered geographic information. Another approach is discussed in (Leung & Newsam 2014), where methodologies of land cover classification using geo-referenced photos are presented. The authors focus on the photo collections of Flickr and the Geograph Project and agree that “large collections of geo-referenced ground level photos can be used to derive maps of what-is-where on the surface of the Earth”. The authors also highlight the “potential [of proximate sensing] for discriminating between land use classes”, but warn that the intent of the photographer has a strong influence on the usability of a geo-referenced photo for land cover/land use classification. The use of the textual data associated with the geo-referenced images is also discussed, and the authors state that using text features to classify land cover performs better than using image processing techniques, but only for the Geograph Project collection, where the users’ intention is to provide typical geographic characteristics of predefined regions. (Fritz et al. 2009) present the viability of crowdsourced information for land cover validation but also mention future challenges, in particular how to “attract a wide range of volunteers from all over the world”. The authors propose the use of “competitive games such as those used for most computer games […] to make the challenge of land cover validation more attractive” (Fritz et al. 2009).

The literature agrees that location based gaming can be a useful tool for (geospatial) data acquisition, (geospatial) data validation and for edutainment purposes (Matyas 2007; Yoshii et al. 2011; Celino et al. 2012; Ionescu et al. 2013; Davidovic et al. 2013; Kiefer & Schlieder 2012; Charsky 2010; Matyas et al. 2011; Avouris 2012; Richter et al. 2012; Matyas et al. 2008; Winter et al. 2009; Celino 2015; Yanenko & Schlieder 2014). Various authors have analysed the usability of location based games for data mining (Matyas et al. 2008; Kiefer & Schlieder 2012; Matyas 2007; Celino et al. 2012; Winter et al. 2009; Davidovic et al. 2013), mainly focusing on the collection of point of interest (POI) information. (Matyas et al. 2008; Matyas 2007) assess the game CityExplorer, a location based game in which players capture tiles by having the highest number of markers in a tile. Markers can only be placed on predefined location types (e.g. restaurants, beer-gardens, train stations), thus incentivising the collection of specific points of interest in a predefined region. (Celino et al. 2012) present the location based game Urbanopoly, in which players buy properties (e.g. restaurants, theatres, shops) and are able to win properties away from other players. Unlike CityExplorer, this location based game allows for continuous gameplay and has no start or end of a game session. Urbanopoly focuses not only on the collection of information on points of interest, but also on verifying, correcting and enriching existing information found on OpenStreetMap (OSM). MapSigns, presented in (Davidovic et al. 2013), is an attempt to use location based gaming to motivate users to collect niche datasets, usually of low importance to mainstream users (e.g. traffic signs, park benches, trash cans). Feeding Yoshi is mentioned in various papers (Avouris 2012; Neustaedter et al. 2013; Matyas 2007) and is a location based game to map open and closed WiFi hotspots. 
Feeding Yoshi also allows for continuous gameplay and, like all other presented games, focuses on the collection of point of interest data. The only location based game found in the scientific literature that can arguably be seen as collecting more than point of interest data is GeoSnake (Kiefer & Schlieder 2012; Matyas et al. 2011), a location based adaptation of the highly popular snake game. As this location based game involves strategic routing decisions, it could be used to gather route information from different users. (Winter et al. 2009; Richter et al. 2012) mention the game Tell-Us-Where, which focuses solely on the collection of place descriptions. In contrast to the other presented games, Tell-Us-Where has no common gameplay elements or competition characteristics. The users are only asked to verify their GPS position and describe the place where they are, and in return have a chance to win a gift voucher. This can arguably be seen more as a spatial questionnaire and less as a location based game. It can also be argued that the Geograph Project is a semi-location based game (the users have to go to a specific location to take a representative photo) with a focus on tile based geographic information mining, including competitive elements (e.g. list of high scores). The Geograph Project, however, does not use real time location information of the users to allow or deny certain interactions, making it more of an asynchronous location based game.

The authors (Matyas et al. 2008; Kiefer & Schlieder 2012; Matyas 2007; Celino et al. 2012; Winter et al. 2009; Davidovic et al. 2013) mostly agree that location based gaming can be used as an effective tool to collect large amounts of spatial data and that the gaming aspects can be enough motivation for users to contribute over a longer period of time. Location based games are viable not only as data collection tools, but also for data verification and curation (Celino 2015; Yanenko & Schlieder 2014). (Celino 2015) presents Urbanopoly from the data curation perspective and proposes a methodology to verify and enrich the data of OSM. The author concludes that applying “the power of Human Computation to Citizen Science” using location based games “can bring effective tools for geospatial data curation by exploiting the physical presence of the contributors in the environment”. (Yanenko & Schlieder 2014) primarily focus on the data quality improvement mechanisms of “confirmation” and “retesting”. The authors implement both mechanisms in a location based game developed for the purpose of assessing them and find that both methods have a positive impact on data quality. The authors also mention that the presented mechanisms decrease the probability of players cheating.

A number of researchers (Charsky 2010; Ionescu et al. 2013; Avouris 2012) have conducted broader research on the topic of (location based) gaming in a scientific context. (Ionescu et al. 2013) propose a multiplatform framework for developing location based games or transitioning existing games to a location based game style. In (Charsky 2010), key characteristics of serious games and edutainment games are presented and discussed. These include competition, goals, rules, choices, challenges and fantasy. Even though these characteristics are discussed as underlying elements of educational or serious games, they also apply to location based games and to games in general. Most noteworthy are the positive effects on motivation of competitive elements, as also confirmed by (Lund et al. 2010), and of fantasy elements, as also confirmed by (Kenny & Gunter 2007), which immerse the player and encourage longer and more frequent gameplay. (Avouris 2012) states that a solid narrative is “a valuable tool for construction of meaning” and that the “narration is a means for combining different heterogeneous parts (actions, events, etc.) into a coherent whole and crafting the relationships between these different parts”. In the context of location based gaming for geographic information mining, this means that a strong narrative can be used to immerse the player in the game world and that the narrative creates a continuous and coherent story, in turn motivating the player to continue playing.

Location based games all share the common denominator of only allowing certain interactions with a virtual environment if specific location based criteria are met. A location based game for geographic information mining is usually characterised by various indicators. The game field structure can be unstructured, semi-structured or structured (Kiefer & Schlieder 2012; Matyas 2007), encouraging the collection of specific types of data (e.g. POI, Path, Tiles). Three other key characteristics are the typical duration of a game (Avouris 2012), whether the game has a narrative or story-line (Avouris 2012) and whether the game is team based or not (Matyas et al. 2008; Kiefer & Schlieder 2012; Matyas 2007; Celino et al. 2012; Winter et al. 2009; Yoshii et al. 2011; Davidovic et al. 2013). The following table highlights these key characteristics for a selection of the most prominent location based games analysed in a scientific context.

| Papers | Game | Game Field | Data | Duration | Narrative | Team |
|---|---|---|---|---|---|---|
| (Celino et al. 2012; Celino 2015) | Urbanopoly | Semi-structured | POI information | Continuous | Weak | Everyone for themselves |
| (Davidovic et al. 2013) | MapSigns | Semi-structured | POI information. Focus on street signs | Short – medium | Weak | Team based. Teams made before every round. 2 teams |
| (Winter et al. 2009; Richter et al. 2012) | Tell-Us-Where | Unstructured | POI information. Focus on place descriptions | Short – medium | None | Everyone for themselves |
| (Matyas 2007; Matyas et al. 2008) | CityExplorer | Semi-structured | POI information. Focus on POIs defined before a game session | Short – long | Weak | Team based. 2 teams |
| (Kiefer & Schlieder 2012; Matyas et al. 2011) | GeoSnake | Structured | Path & POI information | Short – medium | Weak | Everyone for themselves |
| (Avouris 2012; Neustaedter et al. 2013; Matyas 2007) | Feeding Yoshi | Unstructured | POI information. Focus on open and closed WiFi hotspots | Continuous | Weak | Everyone for themselves |

Research Gap

The reviewed literature reveals multiple research gaps in the domains of crowdsourced land cover classification, location based gaming and the combination thereof. Of particular interest is that the use of games to make land cover validation more attractive and to attract a large number of users is identified as having great potential (Fritz et al. 2009), but to my knowledge, no research has been done on implementing a location based game for this purpose. On the other hand, location based gaming for (geospatial) data acquisition, (geospatial) data validation and for edutainment purposes has been widely discussed in the scientific literature, but no literature was found concerning location based games for tile based information mining.

When looking at the presented table of researched location based games, various similarities become obvious. First and foremost, most of the games concentrate on collecting POI information. Only GeoSnake could be seen as a data mining application with which route data could be collected. Furthermore, the duration of a game is predominantly short to medium, with only Urbanopoly and Feeding Yoshi allowing for continuous gameplay and hence continuous data collection and verification. All of the studied games have weak or no storylines or narratives and, finally, the games centre on the player distribution concepts of “everyone for themselves” or a split into two teams. This clearly highlights a research gap in location based gaming implementation and research: the implementation and analysis of a location based game focusing on tile based information collection, incorporating strong narrative characteristics, with continuous gameplay and faction based team play.

The research gaps presented above regarding land cover classification and verification and those concerning location based games for geographic information mining can be combined to formulate the overarching goal of my master’s thesis: the development, implementation, assessment and analysis of a tile based location based game with continuous gameplay, including narrative as well as competitive elements, focusing on geographic information mining of land cover/land use data, and the analysis of the generated data.

Research Questions

The research questions revolve around the two main topics of implementing a location based game for geographic information mining and the analysis of the generated data.

  • How can a location based game with a non-expert target audience be implemented to mine tile based geographic information, in particular land cover data?
  • Can the generated land cover and semantic data be used in a research context, in particular with regard to the validation or improvement of land cover products?


Platform / Game

How should a location based game be structured to acquire land cover information?

I propose implementing a location based game with an underlying similarity to the highly popular board game “The Settlers of Catan”. The mentioned board game features the island of Catan, made up of hexagonal tiles with different land cover types. The players can acquire specific resources by building settlements on the respective tiles. The resources can then be spent on developing settlements to acquire more resources or to attack other players. The proposed location based game will generate land cover and land use data through the users, exploiting their proximate sensing capabilities (Leung & Newsam 2014) by only allowing game interactions with the immediately surrounding locations. After a user has signed up for the game and has chosen a team, the user will be able to capture tiles for his or her team by being in the real world location of a tile and supplying land cover information using a pre-defined list (e.g. the classification scheme presented in (Leung & Newsam 2014) or (See et al. 2013) or further literature), free text descriptions and possibly photographs. By capturing and classifying tiles, the users will generate a form of in-game currency or resource, which can then be spent on in-game items or upgrades. The players will be able to destroy tiles captured by an enemy team and capture the tiles for themselves, allowing multiple users to provide classifications of the same tile and multiple captures of a tile by the same user. As for the extent of the game tiles and the game itself, no final decision has been made. The preliminary idea is that game tiles should not be too small, so that competitive play over tiles is encouraged, but should also not be so large that considerable effort is needed on the part of the players to change the tiles they can interact with. 
Additionally taking the positional AGPS accuracy of ~9 m (Zandbergen 2009) into account, I argue that game tile extents of 30 m to 100 m in width and height would be suitable. For the game itself, I propose testing and launching on a country scale (Switzerland), but I will take care to make the location based game as scalable as possible, so that it can potentially be opened to new countries.
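To illustrate how tile extent and positional accuracy interact, here is a minimal sketch of mapping a position to a tile. It is not the game's implementation: it assumes square tiles, player coordinates already projected to a metric coordinate system, and a tile size of 50 m picked from the proposed range. The names `tile_index` and `candidate_tiles` are hypothetical.

```python
import math

TILE_SIZE_M = 50.0    # assumed tile edge length, within the proposed 30 m to 100 m range
GPS_ACCURACY_M = 9.0  # approximate AGPS accuracy reported by (Zandbergen 2009)


def tile_index(x_m, y_m, tile_size_m=TILE_SIZE_M):
    """Return the (column, row) of the square tile containing a projected point."""
    return (math.floor(x_m / tile_size_m), math.floor(y_m / tile_size_m))


def candidate_tiles(x_m, y_m, accuracy_m=GPS_ACCURACY_M, tile_size_m=TILE_SIZE_M):
    """Return every tile touched by the bounding box of the accuracy circle."""
    col_min, row_min = tile_index(x_m - accuracy_m, y_m - accuracy_m, tile_size_m)
    col_max, row_max = tile_index(x_m + accuracy_m, y_m + accuracy_m, tile_size_m)
    return {
        (col, row)
        for col in range(col_min, col_max + 1)
        for row in range(row_min, row_max + 1)
    }
```

With 50 m tiles, a position in the middle of a tile yields a single candidate, whereas a position within 9 m of a tile edge yields two or more. The smaller the tiles, the more often the tile a player stands in is ambiguous, which is one argument for the lower bound on tile extent.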

How can a strong narrative be incorporated as underlying motivational structure in a location based game focusing on geographic information mining?

As was presented in the state of the art, a strong narrative can be a key factor of motivation. I will create an immersive story that begins at account creation and runs throughout the gameplay. Part of the narrative can be interwoven into the gameplay by adding a questing system in combination with different tile types. I argue that a quest system could be an effective tool to incentivise users to collect specific types of data (e.g. Collect 5 forest tiles! Collect 3 tiles by taking a panorama photo! Destroy 4 enemy tiles…). Furthermore, the tiles themselves can also be manipulated to incentivise different forms of gameplay. Tiles could be graded from common tiles, which all users can capture, to rare or mythic tiles, where specific criteria must be met (Tile can only be captured if 3 other team members are present in the tile! Tile can only be captured if all surrounding tiles are captured first!) before a tile can be captured, but with large rewards on completion. The quest system thus incentivises users to collect specific types of data, whereas strategically placing different grades of tiles in the virtual game world incentivises them to collect information on specific locations.
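The capture criteria for graded tiles could be expressed as simple rule checks. The following is only a sketch under my own assumptions: the `Tile` fields, the grade names and the function `can_capture` are hypothetical and merely encode the two example criteria given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tile:
    grade: str = "common"           # "common", "rare" or "mythic" (hypothetical grades)
    owner: Optional[str] = None     # team currently holding the tile, if any
    min_other_teammates: int = 0    # other team members that must be present in the tile
    needs_neighbours: bool = False  # whether all surrounding tiles must be held first


def can_capture(tile, team, other_teammates_present, neighbours):
    """Check the illustrative capture criteria for one tile."""
    if other_teammates_present < tile.min_other_teammates:
        return False
    if tile.needs_neighbours and not all(n.owner == team for n in neighbours):
        return False
    return True
```

A common tile passes both checks by default, while a rare tile created with `min_other_teammates=3` can only be captured once three other team members are present.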

What kind of control instances must be interwoven into the gameplay, and how?

To address the prominent issues of data quality and data verification, I propose using multiple control instances interwoven into the storyline of the game. One control instance is the collection of backlog data for tiles, allowing the prediction of data quality using the percentage of agreement. A second control instance is allowing the users to review other users’ classifications to earn a unique reviewing currency or resource. A third control instance is to allow or force users to take representative photographs (as in the Geograph Project) or 360° panorama photographs of tiles and have other users check the classifications using the photographs.
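The percentage of agreement for a tile can be computed directly from its backlog of classifications. This is a minimal sketch, assuming the backlog is available as a plain list of class labels; the function name `agreement` is hypothetical.

```python
from collections import Counter


def agreement(classifications):
    """Return (majority_label, share of classifications that match it)."""
    if not classifications:
        return None, 0.0
    label, count = Counter(classifications).most_common(1)[0]
    return label, count / len(classifications)
```

For example, a backlog of three "forest" labels and one "urban" label gives the majority class "forest" with an agreement of 0.75. Tiles with low agreement could then be prioritised for review by other users.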

Technical Implementation

The implementation of a location based game for geographic information mining is separated into three main phases: literature review and concept design; infrastructure and programming; and finally testing and promotion. In the first phase of my master’s thesis, I will conceptually design the location based game based on the reviewed literature and on discussions with supervisors and partners. The second phase will consist of implementing the location based game and the third phase will be made up of testing, promoting and running the game.

What infrastructure should be used to create a location based game for geographic information mining?

I will implement the location based game as a web application to allow all devices with a modern web browser and GPS functionality to take part. I will implement the game using a Neo4j database for data storage, and the underlying game logic will be implemented in PHP using the Symfony framework. The graphical elements and user interactions will be programmed client-side, using HTML5, CSS3, Leaflet, JavaScript and various JavaScript libraries. The proposed structure offers various advantages, including distribution of computing tasks, efficiency, scalability and independence of the device operating system. In the last phase of implementation, I will first alpha-test the implemented game with a small group of users, possibly conducting a user study. I will then beta-test the game with a larger number of users. Note that testing and further implementation will run simultaneously from the alpha release onwards (see schedule). This enables bug fixing and using the feedback of test users to improve game mechanics. After testing, the game will be promoted on various channels to attract a larger number of users.


How can the quality of the generated data be assured or estimated?

Once the implemented game has run for a fixed amount of time, I will analyse the data collected through the location based game. On the one hand, I am interested in whether a location based game with a non-expert target audience is a viable tool to collect large amounts of data to be used in a scientific context. For this analysis I propose focusing on the distribution of data, the amount of data generated, the amount of data per user and the temporal variations of contributions. I will analyse the attribute accuracy of the contributions as presented by (Goodchild 1995) or (Foody et al. 2014), who “used an intrinsic method of quality characterisation based on a latent class model to indicate the accuracy of VGI”, more specifically, land cover data. This approach is identified as well suited for attribute accuracy estimations of features with more than three contributing users (Foody et al. 2014). A gold standard collection will also be created to further assess the quality of the contributions. Since I will store the timestamp of contributions, I will be able to perform the proposed analyses on varying temporal intervals, allowing statements about change over time in the attribute accuracy of contributions.

How can the generated information be used for land cover collection and verification of existing products?

On the other hand, I will analyse whether the collected data is suited to assessing the quality of existing land cover products. This will be achieved by comparing the generated and analysed user contributed land cover data with available land cover products (e.g. GLC-2000, MODIS and GlobCover). I propose to use an error or confusion matrix approach as presented by (Ji & Niu 2014) and will also apply a Boolean or fuzzy agreement assessment as presented in (See & Fritz 2006). This will allow for assessments not only of the data quality of existing land cover products, but also of whether location based games are a viable tool for land cover collection and verification.
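As an illustration of the comparison step, a generic confusion matrix with overall accuracy can be sketched as follows. This is not the specific procedure of (Ji & Niu 2014); it assumes that both the land cover product and the game data have been reduced to one class label per tile and share the same class names. The dictionaries and function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(product, game):
    """Count (product_label, game_label) pairs for tiles present in both inputs."""
    shared_tiles = product.keys() & game.keys()
    return Counter((product[tile], game[tile]) for tile in shared_tiles)


def overall_accuracy(matrix):
    """Share of compared tiles on which both sources report the same class."""
    total = sum(matrix.values())
    agreeing = sum(n for (p_label, g_label), n in matrix.items() if p_label == g_label)
    return agreeing / total if total else 0.0
```

Tiles that appear in only one of the two sources are ignored, so the result describes agreement on the covered area only. The off-diagonal counts show which classes are most often confused.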


Limitations

As the proposed master’s thesis is rather ambitious, several possible limitations can be identified. First and foremost is the number of users. If only a very small number of users can be motivated to play the proposed location based game, the proposed analysis will only be performed on a small amount of data, thus putting the significance of the results in question.

Another potential limitation can be seen in the player distribution and coverage. If all the players play in a small area, the resulting data coverage will also be confined to that small region. This raises the question of whether a small spatial coverage will be sufficient to perform the proposed analyses. This limitation can be addressed by motivating players from different areas (University of Bern, University of Zürich, friends in Biel etc.).

A further limitation is the prior knowledge of players in regard to land cover classifications and the differentiation thereof. Different users will have different knowledge about the various land cover types and will thus have varying levels of accuracy when classifying game tiles. This can be mitigated by implementing different levels of classifications, where the top level classification options can be easily differentiated (e.g. cultivated land vs. urban land vs. forest vs. mountain …) and the lower levels are optional but more detailed (e.g. cultivated land: Corn field vs. potato field vs. sunflower field …).
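Such a multi-level scheme could be represented as a simple mapping from top-level classes to optional detailed classes. The sketch below only uses the example classes named above; the real scheme would come from the literature, and the names `SCHEME`, `TOP_LEVEL` and `same_top_level` are hypothetical.

```python
# Hypothetical two-level scheme built from the examples in the text.
SCHEME = {
    "cultivated land": ["corn field", "potato field", "sunflower field"],
    "urban land": [],
    "forest": [],
    "mountain": [],
}

# Reverse lookup: every label (top level or detailed) maps to its top-level class.
TOP_LEVEL = {top: top for top in SCHEME}
for top, details in SCHEME.items():
    for detail in details:
        TOP_LEVEL[detail] = top


def same_top_level(label_a, label_b):
    """Return True if two labels fall under the same top-level class."""
    return TOP_LEVEL[label_a] == TOP_LEVEL[label_b]
```

Comparing contributions at the top level means that a user who only selects "cultivated land" and a user who selects "potato field" still count as agreeing.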


References

Avouris, N., 2012. A Review of Mobile Location-based Games for Learning across Physical and Virtual Spaces. , 18(15), pp.2120–2142.

Celino, I., 2015. Geospatial dataset curation through a location-based game: Description of the Urbanopoly linked datasets. Semantic Web, 6(2), pp.121–130.

Celino, I. et al., 2012. Urbanopoly – A social and location-based game with a purpose to crowdsource your urban data. In Proceedings – 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing, SocialCom/PASSAT 2012. pp. 910–913.

Charsky, D., 2010. From Edutainment to Serious Games: A Change in the Use of Game Characteristics. Games and Culture, 5(2), pp.177–198.

Congalton, R.G. et al., 2014. Global land cover mapping: A review and uncertainty analysis. Remote Sensing, 6(12), pp.12070–12093.

Davidovic, N. et al., 2013. Using Location Based Game MapSigns to motivate VGI data collection related to traffic signs. Agile 2013.

Foody, G.M. et al., 2014. Accurate Attribute Mapping from Volunteered Geographic Information: Issues of Volunteer Quantity and Quality. The Cartographic Journal, 000(000), p.1743277413Y.000.

Foody, G.M. & Boyd, D.S., 2012. Using volunteered data in land cover map validation: Mapping tropical forests across West Africa. International Geoscience and Remote Sensing Symposium (IGARSS), pp.6207–6208.

Fritz, S. et al., 2009. The use of crowdsourcing to improve global land cover. Remote Sensing, 1(3), pp.345–354.

Goodchild, M.F., 1995. Attribute accuracy. Elements of spatial data quality, pp.59–79.

Haklay, M. (Muki) et al., 2010. How Many Volunteers Does it Take to Map an Area Well? The Validity of Linus’ Law to Volunteered Geographic Information. The Cartographic Journal, 47(4), pp.315–322.

Hutchison, D. et al., 2012. Crowdsourcing Satellite Imagery Analysis: Study of Parallel and Iterative Models. In N. Xiao et al., eds. Geographic Information Science. Lecture Notes in Computer Science. Springer Berlin Heidelberg, pp. 116–131.

Ionescu, G., De Valmaseda, J.M. & Deriaz, M., 2013. GeoGuild: Location-based framework for mobile games. In Proceedings – 2013 IEEE 3rd International Conference on Cloud and Green Computing, CGC 2013 and 2013 IEEE 3rd International Conference on Social Computing and Its Applications, SCA 2013. pp. 261–265.

Ji, X. & Niu, X., 2014. The Attribute Accuracy Assessment of Land Cover Data in the National Geographic Conditions Survey. ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, II-4(May), pp.35–40.

Kenny, R.F. & Gunter, G.A., 2007. Endogenous fantasy-based serious games: intrinsic motivation and learning. International Journal of Social Sciences, 2(1), pp.8–13.

Kiefer, P. & Schlieder, C., 2012. Changing the Rules : Acquiring Quality Assured Geospatial Data With Location-based Games.

Lambin, E.F. et al., 2001. The causes of land-use and land-cover change: moving beyond the myths. Global Environmental Change, 11, pp.261–269.

Leung, D. & Newsam, S., 2014. Land cover classification using geo-referenced photos. , (February).

Lund, K., Lochrie, M. & Coulton, P., 2010. Enabling Emergent Behaviour in Location Based Games. , 44(0), pp.78–85.

Mallupattu, P.K. & Sreenivasula Reddy, J.R., 2013. Analysis of land use/land cover changes using remote sensing data and GIS at an urban area, Tirupati, India. TheScientificWorldJournal, 2013(Figure 1), p.268623.

Matyas, S. et al., 2008. Designing location-based mobile games with a purpose: collecting geospatial data with CityExplorer. In Proceedings of the 2008 International Conference on Advances in Computer Entertainment Technology. pp. 244–247.

Matyas, S., 2007. Playful Geospatial Data Acquisition by Location-based Gaming Communities. , 6(3), pp.1–10.

Matyas, S. et al., 2011. Wisdom about the crowd: Assuring geospatial data quality collected in location-based games. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). pp. 1–6.

Neustaedter, C., Tang, A. & Judge, T.K., 2013. Creating scalable location-based games: Lessons from Geocaching. Personal and Ubiquitous Computing, 17(2), pp.335–349.

Richter, D. et al., 2012. How People Describe their Place: Identifying Predominant Types of Place Descriptions. Proceedings of the 1st ACM SIGSPATIAL International Workshop on Crowdsourced and Volunteered Geographic Information, pp.30–37.

See, L. et al., 2013. Comparing the Quality of Crowdsourced Data Contributed by Expert and Non-Experts. PLoS ONE, 8(7), pp.1–11.

See, L.M. & Fritz, S., 2006. A method to compare and improve land cover Datasets: Application to the GLC-2000 and MODIS land cover products. IEEE Transactions on Geoscience and Remote Sensing, 44(7), pp.1740–1746.

Winter, S. et al., 2009. Location-Based Mobile Games for Spatial Knowledge Acquisition. , pp.1–8.

Yanenko, O. & Schlieder, C., 2014. Game Principles for Enhancing the Quality of User – generated Data Collections.

Yoshii, A. et al., 2011. IDetective: A location based game to persuade users unconsciously. In Proceedings – 17th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA 2011. pp. 115–120.

Zandbergen, P.A., 2009. Accuracy of iPhone locations: A comparison of assisted GPS, WiFi and cellular positioning. Transactions in GIS, 13(SUPPL. 1), pp.5–25.