Posts by Collection

portfolio

publications

High-Resolution Mapping of the Urban Built Environment Stocks in Beijing

Published in Environmental Science & Technology, 2020

Improving our understanding of the weight and spatial distribution of urban built environment stocks is essential for informing urban resource, waste, and environmental management, but this is often hampered by inaccurate and inconsistent typology and material composition data for buildings and infrastructure. Here, we integrated big data mining and analytics techniques and compiled a local material composition database to address these gaps, enabling a detailed characterization of the quantity, quality, and spatial distribution (in 500 m × 500 m grids) of the urban built environment stocks in Beijing in 2018. We found that 3621 megatons (140 ton/cap) of construction materials had accumulated in Beijing’s buildings and infrastructure, equivalent to 1141 Mt of embodied greenhouse gas emissions. Buildings contribute the most (63% of the total, roughly half residential and half nonresidential) to the total stock, and the subsurface stocks account for almost half. Spatially, the belts between 3 and 7 km from the city center (approximately 5 t/m2) and commercial grids (approximately 8 t/m2) are the densest. Correlation analyses between material stocks and socioeconomic factors at a high resolution reveal an inverse relationship between building and road stock densities and suggest that Beijing is sacrificing skylines for space in urban expansion. Our results demonstrate that harnessing emerging big data and analytics (e.g., point of interest data and web crawling) could help realize a more spatially refined characterization of built environment stocks and highlight the role of such information and urban planning in urban resource, waste, and environmental strategies.

Perceiving Beijing’s City Image Across Different Groups Based on Geotagged Social Media Data

Published in IEEE Access, 2020

City image in general refers to the perception, the feeling, and the opinion of a city, and is of great importance to urban management, urban planning, urban cultural perceptions, and tourism resource development. Traditionally, city image is often inferred with the “five-element” model of physical factors, without considering subjective perception. With the rising penetration of smart mobile devices and social media, massive amounts of location-related text data have been generated for a variety of urban areas. Access to this big data leads to a new approach to understanding the subjective perception of city image, which is important because the new approach takes subjective heterogeneity into account. Based on Beijing’s Weibo (microblog) data from 2016, we use a random forest model to categorize users into locals and non-locals. Meanwhile, spatial clustering is applied to identify hotspots. Then two text analysis methods, term frequency-inverse document frequency (TF-IDF) and latent Dirichlet allocation (LDA), are adopted to extract topics for the different geographical hotspots in the city across the different groups of individuals. Our research shows that text mining on geotagged big data for city image makes it possible to accommodate the heterogeneity of the activities of different groups of people and to understand their preferences for different points of interest in the city, thereby revealing the socio-cultural and functional features of the city.
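
As a rough illustration of the TF-IDF step named in the abstract above, the sketch below scores terms per hotspot in plain Python. The tokenized posts, the function name `tf_idf`, and the unsmoothed `log(N / df)` weighting are assumptions made for illustration; the paper's own preprocessing and weighting variant are not described here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times unsmoothed inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized posts, one list per hotspot (illustrative only).
hotspot_posts = [
    ["palace", "museum", "ticket", "queue"],
    ["hutong", "coffee", "museum"],
    ["airport", "flight", "queue"],
]
scores = tf_idf(hotspot_posts)
```

Terms that appear in only one hotspot (such as `palace`) receive a higher weight than terms shared across hotspots (such as `museum`), which is what makes the scores useful for contrasting places or user groups.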

Global Urban Subway Development, Construction Material Stocks, and Embodied Carbon Emissions

Published in Humanities and Social Sciences Communications, 2021

Urban subway systems, as an important type of urban transportation infrastructure, can provide mass mobility services and help address urban sustainability challenges such as traffic congestion and air pollution. The continuous construction of subways, however, requires large amounts of construction materials and causes embodied greenhouse gas (GHG) emissions. In this study, we characterized the patterns of subway development, construction material stocks, and embodied emissions covering all 219 cities in the world that had subways by July 2020. The global subway length reached 16,419 km in 2020, and the construction material stocks amounted to 2.5 gigatons, equivalent to an embodied emission of 560 megatons. In particular, China’s subway system contributes ~40% of the total global stocks, with a pattern of moderate and steady stock growth before 2010 and rapid expansion afterwards, implying late-development advantages and an infrastructure-based urbanization mode. Our results demonstrate that identifying the spatiotemporal characteristics of subway material stock development is imperative for benchmarking future resource demand, informing sustainable subway planning, prospecting urban mining and waste management opportunities and challenges, and mitigating the associated environmental impacts for global GHG emission reduction.

A BiLSTM-CNN Model for Predicting Users’ Next Locations Based on Geotagged Social Media

Published in International Journal of Geographical Information Science, 2021

Location prediction based on spatio-temporal footprints in social media is instrumental to various applications, such as travel behavior studies, crowd detection, traffic control, and location-based service recommendation. In this study, we propose a model that uses geotags of social media to predict the potential area containing users’ next locations. In the model, we use the HiSpatialCluster algorithm to identify clustering areas (CAs) from check-in points. The CA is the basic spatial unit for predicting the potential area containing users’ next locations. Then, we use LINE (Large-scale Information Network Embedding) to obtain the representation vector of each CA. Finally, we apply BiLSTM-CNN (Bidirectional Long Short-Term Memory-Convolutional Neural Network) for location prediction. The results show that the proposed ensemble model outperforms the single LSTM or CNN model. In the case study, which identifies 100 CAs from Weibo check-ins collected in Wuhan, China, the Top-5 predicted areas contain the next location with 80% accuracy. This high accuracy is of great value for recommendation and prediction at the areal-unit level.
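
The Top-5 figure quoted above is a top-k accuracy: a prediction counts as correct when the true clustering area appears among the k highest-ranked candidates. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the area IDs are hypothetical and not taken from the paper.

```python
def top_k_accuracy(ranked_predictions, true_areas, k=5):
    """Share of samples whose true area is among the first k ranked candidates."""
    if len(ranked_predictions) != len(true_areas):
        raise ValueError("predictions and labels must have the same length")
    if not true_areas:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_areas)
        if truth in ranked[:k]
    )
    return hits / len(true_areas)


# Hypothetical clustering-area IDs, ranked by predicted probability (best first).
ranked = [[3, 7, 1, 9, 4, 2], [5, 2, 8, 0, 6, 1], [4, 0, 2, 6, 3, 9]]
truth = [9, 1, 9]
accuracy = top_k_accuracy(ranked, truth, k=5)
```

Only the first sample has its true area within the first five candidates here, so raising `k` to 6 would count all three as hits.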

A Novel Carbon Cycle Turbulence Index Identifies Environmental and Ecological Perturbations

Published in Geochemical Perspectives Letters, 2021

Earth’s history has been characterised by complex interactions between life and the environment, which are often difficult to resolve. Here, we propose a new carbon cycle turbulence index (CTindex), based on the carbonate-carbon isotope (δ13Ccarb) record, to measure the extent of environmental perturbation over the last billion years. The CTindex trend is closely linked to Phanerozoic biotic extinction rates (ERs), as calculated from a palaeobiology database, supporting a strong environmental control on biotic ERs. We use the empirical CTindex–ER relationship to compare the extent of environmental perturbation due to greenhouse gas emissions with that during the Permian-Triassic (PTr) transition (~252 Ma), representing the most severe mass extinction of the Phanerozoic. At the current peak of fossil fuel emissions, the CTindex indicates a moderate future environmental perturbation. However, if fossil fuel emissions increase into the next century, a pronounced CTindex peak greater than that which occurred during the PTr transition is indicated, which suggests the potential for a severe “sixth mass extinction” in the future.

Site Selection for Hybrid Offshore Wind and Wave Power Plants Using a Four-Stage Framework: A Case Study in Hainan, China

Published in Ocean & Coastal Management, 2022

Site selection for hybrid offshore wind and wave power plants (HOWWPP) is a critical step toward a successful HOWWPP project. In this study, a four-stage framework is presented for determining the most suitable marine areas for the siting of HOWWPP. First, wind and wave energy potentials are assessed as a foundation for the implementation of a HOWWPP project. Next, unsuitable areas for the siting of HOWWPP are determined based on exclusion criteria to avoid any potential conflicts with marine spatial planning. Feasible areas (those not satisfying the exclusion criteria) are classified and converted into spatial layers separately according to evaluation criteria. Then, the triangular fuzzy analytic hierarchy process is applied to calculate the evaluation criteria weights. Finally, the site suitability of feasible areas is calculated using the weighted overlay approach and then categorized into five classes. To validate the effectiveness of the proposed framework, a case study in Hainan Province, China, was conducted. The results indicate that the marine areas with medium to very high suitability for the deployment of HOWWPP cover approximately 1312 km2 (4.7% of the study area). The results of this study can support planners in selecting marine areas for the installation of HOWWPP.
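
The final stage described above, a weighted overlay followed by a five-class categorization, can be sketched as a cell-wise weighted sum. The criterion names, scores, and weights below are hypothetical placeholders, the weights are assumed to be already computed (in the paper they come from the triangular fuzzy analytic hierarchy process), and equal-interval class breaks are an assumption.

```python
def weighted_overlay(layers, weights):
    """Cell-wise weighted sum of criterion scores (each layer is a flat list)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_cells = len(next(iter(layers.values())))
    return [
        sum(weights[name] * layers[name][i] for name in weights)
        for i in range(n_cells)
    ]


def classify(score, n_classes=5):
    """Map a score in [0, 1] to an equal-interval class from 1 to n_classes."""
    return min(int(score * n_classes) + 1, n_classes)


# Hypothetical normalized scores (0 to 1) for three feasible grid cells.
layers = {
    "wind_power": [0.9, 0.4, 0.1],
    "wave_power": [0.8, 0.5, 0.2],
    "distance_to_port": [0.7, 0.6, 0.3],
}
# Hypothetical criterion weights, assumed given.
weights = {"wind_power": 0.5, "wave_power": 0.3, "distance_to_port": 0.2}

suitability = weighted_overlay(layers, weights)
classes = [classify(score) for score in suitability]
```

In a real workflow each layer would be a raster aligned to the same grid, with excluded cells masked out before the overlay.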

Identifying Spatiotemporal Characteristics and Driving Factors for Road Traffic CO2 Emissions

Published in Science of The Total Environment, 2022

Road traffic is an important contributor to CO2 emissions. Previous studies lack sufficient spatiotemporal resolution in road-level emission calculation and ignore the impact of the built environment on road traffic emissions. Therefore, this study develops a bottom-up methodology based on traffic trajectory data to analyze the CO2 emission characteristics of road traffic in Shenzhen at a high spatial-temporal resolution. Then, the effects of built environment factors on road traffic emissions are investigated using multiscale geographically weighted regression. The results provide a highly detailed map of CO2 emissions with high temporal (hourly) and spatial (road-level) resolution. The emission characteristics reflect the spatial non-equilibrium in road traffic CO2 emissions. In addition, six factors, including population density, number of workplaces, number of dwellings, density of main roads, access to metro stations, and access to bus stops, have a significant effect on road traffic CO2 emissions. Finally, policy suggestions are proposed for the reduction of road traffic CO2 emissions.

Spatial Calculation of Urban Built Environment Stock: Progress and Prospects

Published in National Remote Sensing Bulletin, 2022

The urban built environment is the manufactured environment where human beings live. The stocks of the urban built environment refer to the quantity of materials (e.g., concrete, steel, and copper) accumulated in buildings and infrastructure. Revealing the spatial distribution of urban built environment stocks has emerged as a new direction for digital city construction, which helps to understand urban development patterns and supports urban resource and waste management. Developing an urban circular economy and realizing sustainable urban development is essential. Therefore, it is necessary to summarize and sort out the current spatial calculation methods for built environment stocks. This study introduces the theoretical basis and development status of three methods for the spatial calculation of urban built environment stock: the top-down method, the bottom-up method, and the remote sensing calculation method. The advantages and limitations of these methods are elaborated with respect to application and data availability. The top-down approach has a complete set of theoretical foundations and algorithm models, which can perform large-scale material flow analysis well. Because it cannot achieve a high spatial resolution, this method is not suitable for analyzing urban development within cities. In contrast, the bottom-up method permits fine-grained stock estimation by gathering cadastral-level physical measurements of buildings and infrastructure and associated material composition indicators. However, it is labour-intensive, and its scope is often restricted to city-level or smaller geographical regions. As for remote sensing calculation, previous studies established a linear regression relationship between the nighttime light radiation intensity and the built environment stocks in the study areas.
However, nighttime light remote sensing data degrade the reliability of quantitative analysis because of background noise and the radiation saturation effect. Thus, stock data with high spatial resolution cannot be acquired. These three traditional methods often struggle to strike a balance between large scale and high spatial resolution. However, in the era of big geographic data, more data sources have brought new research directions for stock calculation. Geo Big Data and Earth Observation data are essential in developing earth science, environmental science, remote sensing science, and geographic information science. The combination of these wide-coverage, high-precision, and fast-update data with machine learning methods has been widely used in poverty surveys and energy consumption studies. Against this background, this paper proposes a framework that combines big geographic data and machine learning for stock calculation. We expect an end-to-end method that estimates grid stocks directly from publicly available information and minimizes manual involvement. However, geospatial heterogeneity and the black-box nature of deep learning may affect the transferability of the model. Despite its drawbacks, this transferable model has the potential for large-scale, high-resolution stock calculation in future work.

DouFu: A Double Fusion Joint Learning Method for Driving Trajectory Representation

Published in Knowledge-Based Systems, 2022

Driving trajectory representation learning is of great significance for various location-based services such as driving pattern mining and route recommendation. However, previous representation generation approaches rarely address three challenges: (1) how to represent the intricate semantic intentions of mobility inexpensively, (2) complex and weak spatial–temporal dependencies due to the sparsity and heterogeneity of the trajectory data, and (3) route selection preferences and their correlation to driving behaviour. In this study, we propose a novel multimodal fusion model, DouFu, for trajectory representation joint learning, which applies a multimodal learning and attention fusion module to capture the internal characteristics of trajectories. We first design movement, route, and global features generated from the trajectory data and urban functional zones, and then analyse them with an attention encoder or a fully connected network. The attention fusion module incorporates route features with movement features to create a more effective spatial–temporal embedding. Combined with the global semantic feature, DouFu produces a comprehensive embedding for each trajectory. We evaluated the representations generated by our method and other baseline models on classification and clustering tasks. The empirical results show that DouFu outperforms other models by more than 10% with most learning algorithms, such as linear regression and support vector machines.

GWRBoost: A Geographically Weighted Gradient Boosting Method for Explainable Quantification of Spatially-Varying Relationships

Published in , 2022

Geographically weighted regression (GWR) is an essential tool for estimating the spatial variation of relationships between dependent and independent variables in geographical contexts. However, the classical linear regressions that compose the GWR model are prone to underfitting, especially for large-volume and complex nonlinear data, which leads to inferior comparative performance. Meanwhile, some advanced models, such as the decision tree and the support vector machine, can learn features from complex data more effectively, but they cannot provide explainable quantification of the spatial variation of localized relationships. To address these issues, we propose a geographically weighted gradient boosting regression model, GWRBoost, that applies a localized additive model and a gradient boosting optimization method to alleviate underfitting while retaining explainable quantification capability for spatially-varying relationships between geographically located variables. Furthermore, we formulate the computation method of the Akaike information score for the proposed model to enable comparative analysis with the classic GWR algorithm. Simulation experiments and an empirical case study are used to demonstrate the efficient performance and practical value of GWRBoost. The results show that our proposed model can reduce the RMSE by 18.3% in parameter estimation accuracy and the AICc by 67.3% in goodness of fit.

Evaluating the Human Use Efficiency of Urban Built Environment and Their Coordinated Development in a Spatially Refined Manner

Published in Resources, Conservation and Recycling, 2023

Urban sustainability requires coordinated development between the urban built environment and human activities in cities. The irrational allocation of built environment stocks such as buildings and roads has led to urban problems like urban villages and ghost cities. However, the human use efficiency of the urban built environment within cities and their coordinated development at a high spatial resolution remain poorly understood. Here, we aim to address this knowledge gap by leveraging the coupling coordination degree (CCD) method and emerging geospatial big data. We develop a framework that considers intra-city heterogeneity to achieve high-resolution mapping of the CCD between built environment and human activity at a grid level of 250 m × 250 m for the case of Beijing, China. Our results show that the CCD of urban built environment and human activity in Beijing has a significant spatial correlation, with a global Moran’s I of 0.716, and that the two subsystems are well coupled in most areas. While the built environment subsystem lags in some old urban areas and commercial districts in the city center, human activity lags slightly in areas such as factories at the edge of the city. We suggest such methods could be extended to other cities to inform urban spatial planning and infrastructure development and maximize the human use efficiency of the urban built environment in different cities and at different stages of urban development.
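
For readers unfamiliar with the CCD method mentioned above, the sketch below uses the commonly cited two-subsystem form: coupling C = 2·sqrt(U1·U2)/(U1 + U2), comprehensive development T = α·U1 + β·U2, and coordination degree D = sqrt(C·T). This is an assumption about the general technique, not the paper's exact formulation; the equal weights and the per-cell index values are hypothetical.

```python
import math


def coupling_coordination(u_built, u_human, alpha=0.5, beta=0.5):
    """Return (C, T, D) for two subsystem indices normalized to [0, 1]."""
    if u_built + u_human == 0:
        return 0.0, 0.0, 0.0
    # Coupling degree: equals 1 when both subsystems are at the same level.
    coupling = 2 * math.sqrt(u_built * u_human) / (u_built + u_human)
    # Comprehensive development level: weighted mean of the two indices.
    development = alpha * u_built + beta * u_human
    return coupling, development, math.sqrt(coupling * development)


# Hypothetical grid cells: (built environment index, human activity index).
balanced = coupling_coordination(0.5, 0.5)
lagging_activity = coupling_coordination(0.8, 0.2)
```

With these inputs, the balanced cell gets a higher D than the cell where human activity lags behind the built environment, even though both have the same mean index.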

High-Resolution Quantification of Building Stock Using Multi-Source Remote Sensing Imagery and Deep Learning

Published in Journal of Industrial Ecology, 2023

In recent decades, urbanization has led to an increase in building material stock. The high-resolution quantification of building stock is essential to understand the spatial concentration of materials, urban mining potential, and sustainable urban development. Current approaches rely excessively on statistics or survey data, both of which are unavailable for most cities, particularly in underdeveloped areas. This study proposes an end-to-end deep-learning model based on multi-source remote sensing data, enabling the reliable estimation of building stock. Ground-detail features extracted from optical remote sensing (ORS) and spatiotemporal features extracted from nighttime light (NTL) data are fused and incorporated into the model to improve accuracy. We also compare the performance of our feature-fusion model with that of an ORS-only regression model and traditional NTL regression for Beijing. The proposed model yields the best building-stock estimation, with a Spearman’s rank correlation coefficient of 0.69, weighted root mean square error of 0.58, and total error in the test set below 14%. Using gradient-weighted class activation mapping, we further investigate the relationship between ORS features and building-stock estimation. Our model exhibits reliable predictive capability and illustrates the tremendous value of the physical environment for estimating building stock. This research illustrates the significant potential of ORS and deep learning for stock estimation. Large-scale, long-term building-stock investigations could also benefit from the end-to-end predictability and the data availability of the model.

ConvGCN-RF: A Hybrid Learning Model for Commuting Flow Prediction Considering Geographical Semantics and Neighborhood Effects

Published in GeoInformatica, 2023

Commuting flow prediction is a crucial issue for transport optimization and urban planning. However, the two existing types of solutions have inherent flaws. One is traditional models, such as the gravity model and the radiation model. These models rely on fixed and simple mathematical formulas derived from physics and ignore rich geographic semantics, which makes it difficult for them to model complex human mobility patterns. The other is machine learning models, most of which simply leverage the features of the Origin-Destination (OD) pair, ignoring the topological nature of the interaction network and the spatial correlation brought by nearby areas. In this paper, we propose a “preprocessing-encoder-decoder” hybrid learning model, which can make full use of geographic semantic information and spatial neighborhood effects, thereby significantly improving the prediction performance. Specifically, in the preprocessing part, we divide the study area into grids and then incorporate features such as location, population, and land use types. In the encoder part, we design a convolutional neural network (CNN) to fuse neighborhood features, construct a spatial interaction network with the grids as nodes and the flows as edges, and then use a graph convolutional network (GCN) to extract the embeddings of the nodes. In the decoder part, a random forest regressor is trained to predict the commuting flow based on the learned embedding vectors. An empirical study on a commuter dataset in Beijing shows that our proposed model is approximately 20% better than XGBoost (state-of-the-art), demonstrating its effectiveness.
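
To make the "fixed and simple mathematical formulas" of the traditional baselines concrete, the sketch below implements the textbook gravity model, T_ij = k · P_i^α · P_j^β / d_ij^γ. The parameter defaults and the example populations are hypothetical; the paper's calibrated baseline settings are not given in the abstract.

```python
def gravity_flow(pop_origin, pop_dest, distance_km,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Textbook gravity model: flow grows with both populations, decays with distance."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return k * (pop_origin ** alpha) * (pop_dest ** beta) / (distance_km ** gamma)


# Hypothetical grid-cell populations and separation.
near = gravity_flow(1000, 2000, distance_km=10)
far = gravity_flow(1000, 2000, distance_km=20)
```

The formula uses only population and distance, which illustrates the limitation noted above: land use, location semantics, and neighborhood context never enter the prediction.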

Big Geodata Reveals Spatial Patterns of Built Environment Stocks Across and Within Cities in China

Published in Engineering, 2023

The patterns of material accumulation in buildings and infrastructure accompanied by rapid urbanization offer an important, yet hitherto largely missing stock perspective for facilitating urban system engineering and informing urban resources, waste, and climate strategies. However, our existing knowledge on the patterns of built environment stocks across and particularly within cities is limited, largely owing to the lack of sufficient high spatial resolution data. This study leveraged multi-source big geodata, machine learning, and bottom-up stock accounting to characterize the built environment stocks of 50 cities in China at a fine-grained 500 m resolution. The per capita built environment stock of many cities (240 tonnes per capita on average) is close to that in Western cities, despite considerable disparities across cities owing to their varying socioeconomic, geomorphology, and urban form characteristics. This is mainly owing to the construction boom and the building and infrastructure-driven economy of China in the past decades. China’s urban expansion tends to be more “vertical” (with high-rise buildings) than “horizontal” (with expanded road networks). It trades skylines for space, and reflects a concentration–dispersion–concentration pathway for spatialized built environment stocks development within cities in China. These results shed light on future urbanization in developing cities, inform spatial planning, and support circular and low-carbon transitions in cities.

Optimizing Segmented Trajectory Data Storage with HBase for Improved Spatio-Temporal Query Efficiency

Published in International Journal of Digital Earth, 2023

The surging accumulation of trajectory data has yielded invaluable insights into urban systems, but it has also presented challenges for data storage and management systems. In response, specialized storage systems based on non-relational databases have been developed to support large data volumes in a distributed manner. However, these systems often use storage-by-point or storage-by-trajectory methods, both of which have drawbacks. In this study, we evaluate the effectiveness of segmented trajectory data storage with HBase optimizations for spatio-temporal queries. We develop a prototype system that includes trajectory segmentation, serialization, and spatio-temporal indexing and apply it to taxi trajectory data in Beijing. Our findings indicate that the segmented system provides enhanced query speed and reduced memory usage compared to the GeoMesa system.
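
The idea of storing segments rather than single points or whole trajectories can be sketched as follows. This is a simplified illustration only: the fixed 10-minute window, the `row_key` layout, and the cell and vehicle identifiers are hypothetical and do not reproduce the paper's segmentation rule or index design, and no HBase client calls are shown.

```python
from collections import defaultdict


def segment_trajectory(points, window_s=600):
    """Group (timestamp_s, lon, lat) points into fixed time-window segments."""
    segments = defaultdict(list)
    for ts, lon, lat in sorted(points):
        window_start = ts // window_s * window_s
        segments[window_start].append((ts, lon, lat))
    return dict(segments)


def row_key(vehicle_id, window_start, cell_id):
    """Hypothetical row key: spatial cell, zero-padded window start, vehicle ID."""
    return f"{cell_id}_{window_start:010d}_{vehicle_id}"


# Hypothetical taxi points: (seconds since start, longitude, latitude).
points = [(610, 116.41, 39.91), (30, 116.40, 39.90), (590, 116.405, 39.905)]
segments = segment_trajectory(points)
key = row_key("taxi42", 600, "c17")
```

Each segment would then be serialized into one row, so a spatio-temporal range query scans a handful of segment rows instead of many point rows or one oversized trajectory row.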

talks

teaching
