Large-scale geotagged social media data are increasingly used to explore human movement patterns in cities. However, challenges inherent to this data type, such as non-representative users and the lack of activity purposes, remain unsolved and limit its use in exploring activity-based human patterns in cities. To address these challenges, this paper proposes an analytical framework for social media data enrichment that reveals the demographic composition of non-representative social media users and infers the activity purposes of geotagged posts, enabling better exploration of the spatial-temporal patterns of human activity in cities. A deep learning model is employed to infer social media users' age and gender groups from user names, profile images, biographies, and language settings. Eight types of activity purposes are inferred from the embedded geo-location of each post by spatially joining it with fine-scale building and land use data. Using Greater London as the case study, this paper explores the temporal dynamics of activity purposes with heatmaps of the hourly frequency of tweets and identifies spatial differences across age and gender groups using hotspot analysis (the Getis–Ord Gi* statistic). The paper demonstrates the application of geotagged social media data in identifying spatial, temporal, and demographic patterns of urban activities, which can potentially help shape better place-based and age- and gender-sensitive urban policies and planning decisions.
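To make the hotspot step concrete, the following is a minimal sketch of the Getis–Ord Gi* statistic computed over per-cell tweet counts. It is not the paper's implementation: the function name `getis_ord_gi_star`, the binary distance-band weights, and the toy 5×5 grid are all assumptions made for illustration. Each returned value is a z-score; large positive values suggest a hotspot and large negative values a coldspot.

```python
import numpy as np


def getis_ord_gi_star(values, coords, bandwidth):
    """Gi* z-scores using binary distance-band weights (an assumed weighting).

    values    : 1-D sequence of counts per location (e.g. tweets per cell)
    coords    : sequence of (x, y) locations, one per value
    bandwidth : locations within this distance count as neighbours; each
                location is its own neighbour, which is what makes this
                the "star" variant of the statistic
    """
    x = np.asarray(values, dtype=float)
    pts = np.asarray(coords, dtype=float)
    n = x.size

    # Pairwise Euclidean distances; the diagonal is 0, so self is included.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    w = (dist <= bandwidth).astype(float)

    mean = x.mean()
    std = np.sqrt((x ** 2).mean() - mean ** 2)  # population standard deviation

    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)

    numerator = w @ x - mean * w_sum
    denominator = std * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))

    # Return 0 where the statistic is undefined (constant values, or a
    # neighbourhood that covers every location).
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator > 0)


# Toy example: a 5x5 grid with a cluster of high counts in one corner.
grid_coords = [(i, j) for i in range(5) for j in range(5)]
counts = [10 if (i < 2 and j < 2) else 1 for i, j in grid_coords]
z_scores = getis_ord_gi_star(counts, grid_coords, bandwidth=1.5)
```

In practice, the same calculation would typically be run separately for each age or gender group so that the resulting z-score maps can be compared, and an established spatial statistics library would normally be preferred over a hand-written version.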