Delineating urban functional use from points of interest data with neural network embedding: a case study in Greater London


Delineating urban functional use plays a key role in understanding urban dynamics, evaluating planning strategies and supporting policymaking. In recent years, Points of Interest (POI) data, with precise geolocation and detailed attributes, have become the primary data source for exploring urban functional use from a bottom-up perspective, using local, highly disaggregated, big datasets. Previous studies using POI data have given insufficient consideration to the relationship among POI classes in the spatial context, and have failed to provide a straightforward means by which to classify urban functional areas. This study proposes an approach for delineating urban functional use at the scale of the Lower Layer Super Output Area (LSOA) in Greater London by integrating the Doc2Vec model, a neural network embedding method commonly used in natural language processing for vectorising words and documents from their context. In this study, the neural network vectorises both POI classes (‘Word’) and urban areas (‘Document’) based on their functional context by learning features from the spatial distribution of POIs in the city. Specifically, we first construct POI sequences based on the distribution of POI classes, and add their LSOA IDs as ‘document’ tags. By utilising these constructed POI–LSOA sequences, the Doc2Vec model trains the vectors of 574 POI classes (word vectors) and 4836 LSOAs (document vectors). The vectors of POI classes are then used in calculating the functional similarity scores based on their cosine distance, with the vectors of LSOAs grouped into clusters (i.e., functional areas) via the k-means clustering algorithm. We also identify latent functions in each cluster of LSOAs by applying topic modelling and enrichment factor analysis. Compared with TF–IDF, LDA and Word2Vec models, the Doc2Vec model obtains the highest accuracy when classifying functional areas.
This study proposes a straightforward approach in which the model directly trains vectors for urban areas, subsequently using them to classify urban functional areas. By employing the enhanced neural network model with low-cost and ubiquitous POI datasets, this study provides a potential tool with which to monitor urban dynamics in a timely and adaptive manner, thereby providing enhanced, data-driven support to urban planning, development and management.
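The sequence-construction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how POIs are ordered within an LSOA, so the greedy nearest-neighbour walk below is an assumption made purely for illustration, as are the `TaggedSequence` type, the `build_sequences` helper, the sample coordinates and the placeholder LSOA IDs. The resulting (words, tags) pairs have the shape a Doc2Vec implementation typically expects as training input.

```python
from collections import defaultdict
from math import hypot
from typing import List, NamedTuple


class TaggedSequence(NamedTuple):
    """Hypothetical container: POI classes as 'words', LSOA ID as 'document' tag."""
    words: List[str]
    tags: List[str]


def build_sequences(pois):
    """Group POIs by LSOA and order each group by a greedy nearest-neighbour walk.

    `pois` is an iterable of (lsoa_id, poi_class, x, y) tuples.
    The ordering rule is an assumption; the source only says sequences are
    built from the spatial distribution of POI classes.
    """
    by_lsoa = defaultdict(list)
    for lsoa_id, poi_class, x, y in pois:
        by_lsoa[lsoa_id].append((poi_class, x, y))

    sequences = []
    for lsoa_id, items in sorted(by_lsoa.items()):
        # Start from the POI with the smallest (x, y), then repeatedly
        # move to the nearest POI that has not been visited yet.
        remaining = sorted(items, key=lambda p: (p[1], p[2]))
        current = remaining.pop(0)
        words = [current[0]]
        while remaining:
            nearest = min(
                remaining,
                key=lambda p: hypot(p[1] - current[1], p[2] - current[2]),
            )
            remaining.remove(nearest)
            words.append(nearest[0])
            current = nearest
        sequences.append(TaggedSequence(words, [lsoa_id]))
    return sequences


# Illustrative data only: placeholder LSOA IDs and made-up coordinates.
sample_pois = [
    ("LSOA_A", "cafe", 0.0, 0.0),
    ("LSOA_A", "bank", 5.0, 0.0),
    ("LSOA_A", "restaurant", 1.0, 0.0),
    ("LSOA_B", "primary_school", 10.0, 10.0),
    ("LSOA_B", "playground", 10.0, 11.0),
]
sequences = build_sequences(sample_pois)
for seq in sequences:
    print(seq.tags[0], seq.words)
```

Ordering by proximity keeps spatially adjacent POI classes close together in the sequence, which is what lets a context-window model pick up on co-location; other orderings may serve this purpose equally well.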

Computers, Environment and Urban Systems
  • A neural network embedding model is employed in delineating urban functional use from POI (Points of Interest) data.
  • The Doc2Vec model directly trains vector representations for spatial areas while considering the spatial distribution of POIs.
  • This paper explores the functional similarity among 574 POI classes and 4836 LSOAs (Lower Layer Super Output Areas) in Greater London.
  • The Doc2Vec model outperforms other semantic models (Word2Vec, LDA and TF-IDF) in identifying urban functional areas.
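As a rough illustration of how trained vectors might be used, the sketch below scores the similarity of POI classes with the cosine measure and groups LSOA vectors with a small k-means routine. All vectors here are invented two-dimensional toy values, not outputs of the study, and the `cosine_similarity` and `kmeans` helpers, the first-k centroid initialisation and the fixed iteration count are assumptions chosen to keep the example deterministic and dependency-free.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def kmeans(vectors, k, iterations=20):
    """Plain k-means; centroids start at the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy vectors for illustration only; real vectors would come from the model.
poi_vectors = {
    "cafe": [0.9, 0.1],
    "restaurant": [0.8, 0.2],
    "primary_school": [0.1, 0.9],
}
lsoa_ids = ["LSOA_A", "LSOA_B", "LSOA_C", "LSOA_D"]
lsoa_vectors = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.8]]

sim_food = cosine_similarity(poi_vectors["cafe"], poi_vectors["restaurant"])
sim_mixed = cosine_similarity(poi_vectors["cafe"], poi_vectors["primary_school"])
labels = kmeans(lsoa_vectors, k=2)
clusters = dict(zip(lsoa_ids, labels))
print(sim_food, sim_mixed, clusters)
```

In practice a library implementation with randomised or k-means++ initialisation would normally be preferred; the hand-written version is used here only so the example runs without external packages.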
Similarity of POI classes trained by Doc2Vec model
Haifeng Niu
PhD Candidate

Haifeng Niu is a PhD candidate in the Lab of Interdisciplinary Spatial Analysis at the Department of Land Economy, University of Cambridge. His main research interests cover urban big data mining, data-driven analysis for urban planning and policy, and the simulation of urban dynamics.