City of Blinding Lights: Social Media Popularity Mapping
Finding a great restaurant can be time-consuming, especially when you don’t know the city.
You might check reviews on applications such as TripAdvisor, Google, Foursquare or Instagram. You can also read blog posts or travel guidebooks, or ask a trustworthy friend for a few recommendations. But we have to admit that all of this still takes a lot of time, even in the mobile era.
Could we automate such a process? Thanks to machine learning, the answer might be ‘yes’. This data science project aims to build a restaurant recommendation map for Millennials based on social media popularity, focusing on Paris, France.
Why Millennials? OK Boomer, it is not a discriminatory approach, but it is easier to work with people aged 25 to 40, since meaningful data can be easily collected from the social networks they use, such as Instagram or Foursquare.
Paris is divided into 80 administrative neighborhoods. The Foursquare API returned a list of 4,000 restaurants (50 venues per neighborhood). Rather than studying those administrative neighborhoods directly, the project considered only a few clusters, those displaying a high density of restaurants.
How do we find those dense restaurant areas from administrative data? The list was reduced by a factor of 10 using eigenvector centrality as a criterion (the measure that Google’s PageRank algorithm is a variant of). In other words, a restaurant has a high centrality if it’s close to restaurants that are also close to other restaurants, and so on. This filtering made K-means clustering easier (by excluding isolated restaurants) and made it possible to keep calling the Foursquare API with a free developer account. The whole clustering process is summarized here:
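As a rough illustration of the centrality filter only (not the code used in this project), here is a minimal Python sketch. It assumes restaurants are given as plain (x, y) coordinates, that two restaurants are "close" when they lie within a hypothetical `radius` of each other, and that centrality is estimated by power iteration. The function names, the sample points and the radius are all invented for the example; real latitude/longitude data would need a geographic distance instead of `math.dist`.

```python
import math

def eigenvector_centrality(points, radius, iterations=100):
    """Estimate eigenvector centrality on a proximity graph by power iteration.

    Two points are linked when they lie within `radius` of each other.
    Each point is also linked to itself, which leaves the leading
    eigenvector unchanged but keeps the iteration from oscillating.
    """
    n = len(points)
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= radius]
        for i in range(n)
    ]
    scores = [1.0] * n
    for _ in range(iterations):
        # A point's new score is the sum of the scores of its neighbors.
        new = [sum(scores[j] for j in neighbors[i]) for i in range(n)]
        norm = math.sqrt(sum(s * s for s in new))
        scores = [s / norm for s in new]
    return scores

def keep_most_central(points, radius, fraction=0.1):
    """Keep the given fraction of points with the highest centrality."""
    scores = eigenvector_centrality(points, radius)
    k = max(1, round(len(points) * fraction))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return [points[i] for i in sorted(ranked[:k])]

# Hypothetical data: five restaurants packed together, two isolated ones.
sample = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05), (5, 5), (9, 1)]
kept = keep_most_central(sample, radius=0.2, fraction=0.7)
print(kept)
```

Isolated points end up with a centrality close to zero, so they are the first to be dropped. The remaining points could then be passed to a K-means implementation, for example scikit-learn's `KMeans`.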
After clustering, three popularity metrics were considered: Foursquare rating, Foursquare ‘Likes’ count and Instagram ‘Followers’ count.
Since ratings follow a roughly normal distribution, their mean is a meaningful way to characterize the popularity of each area. On the other hand, ‘Likes’ and ‘Followers’ counts display heavy-tailed, power-law-like distributions (some people say that social media are scale-free networks), so they had to be used in a different way.
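To see why the mean works for ratings but not for counts, consider the small Python sketch below. All numbers are made up for illustration, and the median is used only as one possible robust alternative; it is an assumption of this example, not necessarily what was done in the project.

```python
import statistics

def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return statistics.mean(values), statistics.median(values)

# Hypothetical values for one area: ratings cluster around a typical value,
# while follower counts are dominated by a single very popular account.
ratings = [7.8, 8.1, 8.4, 8.0, 7.9, 8.3]
followers = [120, 340, 560, 800, 1500, 250000]

rating_mean, rating_median = summarize(ratings)
followers_mean, followers_median = summarize(followers)

print(rating_mean, rating_median)
print(followers_mean, followers_median)
```

For the ratings, mean and median are nearly identical, so either describes a typical restaurant. For the follower counts, one outlier pulls the mean far above what most restaurants in the area actually have, which is why a heavy-tailed metric calls for a different summary (a median, a rank or a log scale, for instance).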
Finally, the following recommendation map could be plotted:
Areas like Marais, Montorgueil, Canal Saint-Martin, Pigalle or Saint-Germain-des-Prés should sound familiar to every Parisian or any person who knows the French capital. But the interesting part of this project is that this map was generated entirely by applying machine learning algorithms.
Of course, the map is not perfect and could be improved. Moreover, it might be relevant for a large share of Millennials but not for all of them. Nevertheless, the method leads to consistent results and could be easily applied to other cities or other venues such as bars, clubs, etc.
Ready to save time on your next visit to Paris? This map might do the job! However, it won’t be of any help with the current transport issues in Paris…
This article was originally published on LinkedIn on December 18, 2019.