Embedding of POIs in Greater London
This project uses neural word embedding to explore Points of Interest (POI) data, in particular the relationships between POI classes. It adopts Doc2Vec, a widely used neural embedding model developed by Quoc V. Le and Tomas Mikolov at Google. The input to this two-layer neural network is a set of POI sequences built from the geographical distribution of 0.4 million POIs in Greater London. The specific method for constructing the POI sequences can be found in the UKGISR 2020 paper. The Doc2Vec model returns a fixed-length, 20-dimensional vector for each of the 574 POI classes. A similarity matrix for all pairs of POI classes can then be computed from the cosine similarity between these vectors. To visualise the high-dimensional embedding, we use the TensorFlow Embedding Projector.
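The similarity step described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the project's actual code: the three class names and their 3-dimensional vectors are made-up stand-ins for the 574 classes and 20-dimensional vectors, and the helper names `cosine_similarity`, `similarity_matrix` and `to_projector_tsv` are hypothetical. The last helper writes the vectors and labels as tab-separated text, the format the Embedding Projector accepts for uploaded data.

```python
import math

# Hypothetical toy vectors standing in for the 20-dimensional class embeddings.
class_vectors = {
    "cafe": [0.9, 0.1, 0.3],
    "restaurant": [0.8, 0.2, 0.4],
    "bus_stop": [-0.2, 0.9, -0.5],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Return the sorted class names and the pairwise similarity matrix."""
    names = sorted(vectors)
    matrix = [
        [cosine_similarity(vectors[a], vectors[b]) for b in names]
        for a in names
    ]
    return names, matrix


def to_projector_tsv(vectors):
    """Build the two TSV strings: one vector per line, one label per line."""
    names = sorted(vectors)
    vectors_tsv = "\n".join(
        "\t".join(str(x) for x in vectors[name]) for name in names
    )
    metadata_tsv = "\n".join(names)  # single-column metadata, no header row
    return vectors_tsv, metadata_tsv


names, matrix = similarity_matrix(class_vectors)
vectors_tsv, metadata_tsv = to_projector_tsv(class_vectors)
```

Cosine distance is one minus the similarity, so either quantity yields the same ranking of class pairs. With the toy values above, the cafe and restaurant vectors point in nearly the same direction and score close to 1, while bus_stop scores negative against both.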
Mar 1, 2020