Archive for December 2013

Growth of Mobile Location Services in 2013

 

I'm writing up my PhD thesis at the moment, and found myself having to update the opening paragraph of page 1 (you know, the part where I say how incredibly relevant my research is). The previous version, from my transfer thesis written in 2012, went like this:

Mobile location services have been a topic of considerable interest in recent years, both in industry and academia. In industry, software applications (or apps) with location-based components enjoy widespread use. This is evidenced, for example, by the 20 million active users who opt to check in (i.e., share their current location with friends) on Foursquare, the 50 million users who search for local services on Yelp when out and about, and the increasing number who electronically hail a taxi in Uber in 11 cities (up from 1 city in 2011), which they can then watch arrive at their location on a real-time map.

In updating this paragraph, I found that the statistics reflecting 2013's progress by mobile location services are as follows: Foursquare grew from 20 to 30 million users, Yelp grew from 50 million to 100 million users, and Uber is now in 66 cities (up from 11 cities in 2012).
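For reference, the figures above can be turned into percentage changes with a few lines of Python. This is only a sketch: the `growth_percent` helper and the dictionary labels are names I made up for illustration, and the numbers are simply the ones quoted in this post.

```python
# Figures quoted in this post: (value in the 2012 thesis, value at the end of 2013).
figures = {
    "Foursquare (million users)": (20, 30),
    "Yelp (million users)": (50, 100),
    "Uber (cities)": (11, 66),
}


def growth_percent(before, after):
    """Percentage change from `before` to `after` (hypothetical helper)."""
    return 100.0 * (after - before) / before


growth = {name: growth_percent(b, a) for name, (b, a) in figures.items()}

for name, pct in growth.items():
    print(f"{name}: {pct:.0f}% growth")
```

Note that the three figures measure different things (users for Foursquare and Yelp, cities for Uber), so the percentages are not directly comparable.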

From this small sample of progress, it seems that there is still a lot of growth in location-based services, especially ones involving crowdsourcing of physical tasks (e.g., Uber, TaskRabbit, Gigwalk).

The laggard of the pack in terms of user growth is Foursquare (which "only" grew by 50% in 2013), but they are arguably facing the different challenge of proving "that there's a real business there", in CEO Dennis Crowley's words (i.e., finding sustainable revenue streams). In general, though, the most promising location-based services are still following an exponential growth curve (in number of users), which is good news for innovation.

The Language of Location

 

During my work, I've often noticed similarities between language and individuals' daily location behaviour (as detected by GPS, cell towers, tweets, check-ins, etc.). To summarise these thoughts, I've compiled a list of the similarities and differences between language and location below. I then mention a few papers that exploit these similarities to create more powerful or interesting approaches to analysing location data.

Similarities between Location and Language Data

  • Both exhibit power laws. A lot of words are used very rarely while a few words are used very frequently. The same happens with the frequency of visits to locations (e.g., how often you visit home vs. your favourite theme park). This is not a truism: the most frequently visited locations or most frequently used words are *much* more likely to be visited/used than most other places/words.
  • Both exhibit sequential structure. Words are highly correlated with the words near them on the page. The same holds for the locations visited on a particular day.
  • Both exhibit topics or themes. In the case of language, groups of words tend to co-occur in the same document (e.g., two webpages that talk about cars are both likely to mention words from a similar group of words representing the "car" topic). In the case of location data, a similar thing happens. I mention two interpretations from specific papers later in this post.
  • The availability of both language data and location data has exploded in the last decade (the former from the web, the latter from mobile devices).
  • There are cultural differences in using language just as there are cultural differences in location behaviour (e.g., Spanish people like to eat out later than people of other cultures).
  • Both are hierarchical. Languages have letters, words, sentences, and paragraphs. A person can be moving around at the level of the street, city, or country (during an hour, day, or week).
  • Both exhibit social interactions. Language is exchanged in emails, texts, verbally, or in scholarly debate. Friends, co-workers, and family may have interesting patterns of co-location.
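To make the first similarity concrete, here is a minimal sketch that applies exactly the same rank-frequency counting to a word sequence and to a location sequence. The data is toy data invented for illustration and the `rank_frequency` helper is a hypothetical name. Sequences this short cannot demonstrate a power law; with real data one would plot count against rank on log-log axes and look for a roughly straight line.

```python
from collections import Counter


def rank_frequency(items):
    """Return (item, count) pairs sorted from most to least frequent."""
    return Counter(items).most_common()


# Toy data, invented for illustration only.
words = "the cat sat on the mat and the dog sat on the rug".split()
visits = ["home", "work", "home", "gym", "home",
          "work", "home", "cafe", "home", "work"]

for label, seq in [("words", words), ("locations", visits)]:
    print(label)
    for rank, (item, count) in enumerate(rank_frequency(seq), start=1):
        print(f"  rank {rank}: {item!r} appears {count} of {len(seq)} times")
```

The point is only that the code does not care whether the tokens are words or places, which is what makes it tempting to borrow language methods for location data.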

Differences between Location and Language Data

  • Many words are shared between texts (of the same language) but locations are usually highly personal to individuals (except for the special cases of friends, co-workers, and family).
  • There are no periodicities in text but strong periodicities in location (i.e., hourly, weekly, and monthly).
  • Language data is not noisy (except for spelling and grammar mistakes) while location data is usually noisy.
  • Language analysts do not usually need to worry about privacy issues whilst location analysts usually do.

Work that Exploits These Similarities

Here are a few papers that apply or adapt approaches that were primarily used for language models to location data:

K. Farrahi and D. Gatica-Perez. Extracting mobile behavioral patterns with the distant n-gram topic model. In Proc. ISWC, 2012.

They use topic modelling to capture the tendency of visiting certain locations on the same day. This is similar to using the presence of words like "windshield" and "wheel" to place higher predictive density on words like "road" and "bumper" (i.e., topic modelling bags of words). I have talked previously about why I think this is a good paper.
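The input representation behind this idea can be sketched as follows: each day is treated as a "document" whose "words" are the locations visited. The code below is not a topic model and is not the method from the paper. It only counts how often pairs of locations appear on the same day, which is the kind of co-occurrence signal a topic model would pick up on. The data and the `cooccurrence_counts` helper are assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Each "document" is the bag of locations visited on one day (toy data).
days = [
    ["home", "work", "gym"],
    ["home", "work", "cafe"],
    ["home", "park", "cinema"],
    ["home", "work", "gym"],
]


def cooccurrence_counts(documents):
    """Count how many documents contain each unordered pair of items."""
    counts = Counter()
    for doc in documents:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(doc)), 2):
            counts[pair] += 1
    return counts


pairs = cooccurrence_counts(days)
for (a, b), n in pairs.most_common(3):
    print(f"{a} and {b} were visited on the same day {n} times")
```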

L. Ferrari and M. Mamei. Discovering daily routines from Google Latitude with topic models. In PerCom Workshops, pages 432–437, 2011.

This paper applies topic modelling in a similar way to Farrahi and Gatica-Perez.

H. Gao, J. Tang, and H. Liu. Exploring social-historical ties on location-based social networks. In 6th ICWSM, 2012.

This paper takes a model that was previously used to capture sequential structure in words and applies it to Foursquare check-ins.
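The paper's actual model is not reproduced here. As a rough illustration of what "sequential structure" means, the sketch below uses the simplest language-style sequence model I can think of, a first-order (bigram) transition count, and applies it to a check-in sequence. The check-ins are toy data, and `fit_bigrams` and `predict_next` are hypothetical helper names.

```python
from collections import Counter, defaultdict


def fit_bigrams(sequence):
    """Count transitions from each item to the item that follows it."""
    transitions = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        transitions[current][following] += 1
    return transitions


def predict_next(transitions, current):
    """Return the most frequent successor of `current`, or None if unseen."""
    if current not in transitions:
        return None
    return transitions[current].most_common(1)[0][0]


# Toy check-in sequence, invented for illustration only.
checkins = ["home", "cafe", "work", "gym", "home", "cafe", "work", "home"]
model = fit_bigrams(checkins)

print(predict_next(model, "cafe"))
print(predict_next(model, "airport"))
```

Swapping the check-in list for a list of words gives a bigram model of text with no other changes to the code.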

J. McInerney, J. Zheng, A. Rogers, and N. R. Jennings. Modelling Heterogeneous Location Habits in Human Populations for Location Prediction Under Data Sparsity. In Proc. UbiComp, 2013.

In my own work, I've used the concept of topics to refer to location habits that represent the tendency of an individual to be at a given location at a certain time of day or week. This way of thinking about locations is useful in generalising temporal structure in location behaviour across people, while still allowing for topics/habits to be present to greater or varying degrees in different people's location histories (just as topics are more or less prevalent in different documents).

Both language and location data are results of human behaviour, so it is unsurprising to find similarities, even if I think some of the similarities are coincidental (e.g., power laws crop up in many places and often for different reasons, and the increasing availability of data is part of the general trend of moving the things we care about into the digital domain). The benefits of analysis approaches seem to be flowing in the language -> location direction only at the moment, though I hope one day that will change.