Introduction

As one of the most infectious pandemics, COVID-19 has resulted in a devastating toll of over 7 million lives lost and an estimated $8 trillion in economic damage (Clark et al., 2020; Taskinsoy, 2020; Weiss et al., 2020; Zhang et al., 2022). Research on the COVID-19 pandemic and its impact holds the potential to offer valuable insights for addressing unforeseen large-scale, highly infectious diseases in the future (Chen et al., 2022; Menkir et al., 2021; Sibley et al., 2020).

Due to rapid urbanization, more than 55% of the global population resides in urban areas, whose dense populations make them hotbeds for infectious diseases (United Nations, 2018). It is imperative to study the response and recovery of urban mobility under the influence of COVID-19 (Arellana et al., 2020; Atkinson-Clement and Pigalle, 2021; Gkiotsalitis and Cats, 2021; Levin et al., 2021) to facilitate understanding of the spread of pandemics (Chang et al., 2021; Wei et al., 2021). Simultaneously, changes in urban mobility profoundly influence economic dynamics and mental health outcomes, among other societal aspects (Wang et al., 2021, 2022). Existing research has primarily investigated how governments proactively implement mandatory or advisory stay-at-home orders to change people’s mobility behavior and contain virus transmission (Martínez and Short, 2021; Shen et al., 2020; Tirachini and Cats, 2020; Wang et al., 2022; Zhang, 2021). However, there is limited knowledge regarding the spontaneous change in citizens’ mobility behavior during the COVID-19 pandemic. Spontaneous change consists of individuals’ voluntary adaptation of their travel behavior in response to the infection rate of the disease and perceived threats, without the restriction of government policies (Han et al., 2021). In the case of COVID-19, this self-driven behavior involves engaging in home-based care while infected, limiting travel to essential trips only, avoiding crowded places, and shifting on-site work to remote work to reduce potential exposure to the virus (Balmford et al., 2020; Tisdell, 2020). A better understanding of these spontaneous mobility changes can provide valuable insights into the necessary mobility behavior of citizens and in turn facilitate the design of urban mobility-related policies during pandemics.

In this paper, we investigate spontaneous mobility changes without stay-at-home orders throughout a highly infectious pandemic, from its emergence through large-scale proliferation to eventual stabilization. The mobility changes are manifested as fluctuations in the number of individuals traveling, which are microscopically composed of varying purpose-specific mobility behavior. We aim to answer the following research questions:

Q1: At the temporal and spatial scale, how does urban mobility evolve in response to a pandemic’s lifecycle?

Q2: Given distinct regions corresponding to varied functional zones and data pertaining to various modes of transportation within a city, what are the underlying mobility behaviors, and how do they react to the pandemic?

Q3: How can a dynamic model be established to deduce the spatio-temporal mobility changes of the entire city based on the behavior of different travel purposes and modes?

To answer these questions, we utilize origin-destination (OD) mobility data involving 148 million occurrences in the subway, bus, and taxi systems of Shenzhen, China, before and after the cancellation of the "Zero-COVID" policy, as shown in Fig. 1a, b. The OD data log the time and number of individuals traveling from areas proximal to the origin to regions near the destination in the urban public transport system. Such data encapsulate the spatio-temporal mobility of urban populations, focusing solely on the volume of individuals moving between two regions, while avoiding the disclosure of individual-specific information. This ensures a high degree of privacy preservation.

Fig. 1: Temporal and spatial characteristics of OD mobility data.

a Trips in the Shenzhen bus, subway, and taxi systems from December 8, 2022, to January 6, 2023. China adhered to a rigorous "Zero-COVID" policy and implemented strong non-pharmaceutical measures from the beginning of the COVID-19 outbreak until early Dec. 2022. As a result, over 99% of the population in China had never been infected by any variant of SARS-CoV-2. Even though China widely distributed COVID-19 vaccines, the vaccines merely protected people from serious symptoms without conferring comprehensive immunity, owing to the rapid mutation of the virus. On December 7, 2022, the "Zero-COVID" policy, along with the travel restrictions, was rescinded. The travel volume immediately returned to a level approximating normalcy. Thereafter, the spread of SARS-CoV-2 precipitated a substantial decrease in urban mobility, followed by a progressive recovery. To highlight the impact of the pandemic more effectively, we conducted normalization separately for weekdays, Saturdays, and Sundays, ultimately showcasing the trip rate. b The lines bridging the origin and destination maps symbolize the human movement between two corresponding regions. Due to the dense nature of OD data, we depict only flows exceeding a daily average of 200 for clarity. Three time periods are highlighted: before the COVID-19 pandemic outbreak, during the peak impact, and during the gradual stabilization period. These periods correspond to three different levels of mobility.

The period before and after the cancellation of the "Zero-COVID" policy provides a unique opportunity to observe the spontaneous evolution of urban mobility without mobility-restricting measures while a pandemic spreads on a large scale in a short period. The "Zero-COVID" policy focused on promptly identifying and containing localized outbreaks using advanced technologies such as big data analysis and nucleic acid screening (China CDC, 2022). Through strict quarantine measures, identification of infection sources, and identification of high-risk groups, China aimed to swiftly end outbreaks with minimal societal and economic impacts. China’s implementation of the "Zero-COVID" policy was highly successful. Despite its massive population of 1.4 billion people, the country reported relatively low numbers of COVID-19 cases and deaths before the cancellation of the "Zero-COVID" policy: 1,655,477 cases (less than 0.2% of the population) and 13,524 deaths (less than 0.001% of the population) (Burki, 2022). In late 2022, the Omicron variant of SARS-CoV-2 became predominant worldwide; it was known for its high transmissibility (basic reproduction number ≈ 9.5) and penetration rate. Yet, compared to previous variants, it also demonstrated a relatively lower infection fatality rate (<1%) (Liu et al., 2022). Under such circumstances, on Dec. 7, 2022, China lifted the "Zero-COVID" policy along with the travel restrictions. However, the Omicron variant showed a high rate of immune escape from vaccines, which merely protected people from serious symptoms without delivering comprehensive immunity. Therefore, the lifting of the policy led to a rapid and massive spread of the Omicron variant of SARS-CoV-2. It is estimated that within a month, the proportion of infected individuals in cities sharply escalated from nearly zero to over 70% (Leung et al., 2023). In addition, Shenzhen is a megacity with a permanent population exceeding 17 million and a population density of 8,800 people per square kilometer (Shenzhen Government, 2023). Research on mobility behavior during the pandemic in Shenzhen is therefore representative and informative.

An analytic framework and a dynamic model are proposed with the following key designs. In terms of spatial distribution, mobility in central business districts (CBDs) and their adjacent areas is significantly impacted. The mobility changes can be quantitatively represented by a time series and evaluated by features about the decline and recovery of trips. To explore the spatial disparities, four mobility patterns are found among the mobility changes corresponding to various OD pairs through the K-means++ clustering algorithm (answering Q1). We overlay an urban land use (ULU) map on origins and destinations, thus elucidating the intent behind mobility behavior, such as commuting, recreation, schooling, and more. The differences in mobility changes among subways, buses, and taxis are also discussed (answering Q2). Inspired by this, we try to devise a dynamic model of changes in passenger flow, predicated on the impact of COVID-19 on different mobility behavior. We model distinct mobility behavior by combining physical infection and the willingness influence, which subsequently deduces the effects of ULU embodying travel intentions. As a result, we can infer the mobility within the geographic regions housing these lands with different uses (answering Q3). Our model allows us to not only fit observed changes in trips but also to conduct detailed analysis at a granularity of less than 500m. We believe the proposed insight and model could be leveraged to provide public health officials with a holistic recommendation as they decide on mobility-related policies under similar pandemic conditions.

Results

Spatial distributions of urban mobility level

The spatial distribution of the decline and recovery in mobility levels can be approximately observed based on Fig. 2. This observation is made at the granularity of a 1km × 1km grid level. For subways, buses, and taxis in a grid, the initial travel volumes for departures and arrivals are aggregated and compared with the passenger flow from the pre-pandemic period to assess the level of mobility. It is observed that the mobility in CBDs was more significantly impacted by COVID-19 compared to areas farther from the CBDs.
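The grid-level comparison described above can be sketched in a few lines. This is a minimal illustration rather than the processing pipeline used in this study: the function names, the planar kilometer coordinates, and the toy trip records are all assumptions made for the example.

```python
from collections import defaultdict

def grid_cell(x_km, y_km, cell_km=1.0):
    """Map a planar coordinate (in km) to the index of its grid cell."""
    return (int(x_km // cell_km), int(y_km // cell_km))

def grid_volumes(trips, cell_km=1.0):
    """Sum departures and arrivals per grid cell.

    trips: iterable of (origin_xy, destination_xy, passengers),
    with subway, bus, and taxi records pooled together.
    """
    volume = defaultdict(float)
    for origin, destination, passengers in trips:
        volume[grid_cell(*origin, cell_km)] += passengers       # departures
        volume[grid_cell(*destination, cell_km)] += passengers  # arrivals
    return dict(volume)

def mobility_level(current, baseline):
    """Ratio of current to pre-pandemic volume for cells with a nonzero baseline."""
    return {cell: current.get(cell, 0.0) / base
            for cell, base in baseline.items() if base > 0}

# Hypothetical toy data: coordinates in km, passengers per day.
baseline_trips = [((0.2, 0.3), (1.5, 0.4), 100), ((1.1, 0.9), (0.7, 0.1), 50)]
day15_trips = [((0.2, 0.3), (1.5, 0.4), 40), ((1.1, 0.9), (0.7, 0.1), 35)]

levels = mobility_level(grid_volumes(day15_trips), grid_volumes(baseline_trips))
print(levels)
```

A level of 1.0 corresponds to the pre-pandemic volume in that cell, and values below 1.0 indicate a decline.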

Fig. 2: Spatial distribution and temporal modeling of urban mobility changes across different stages.

a The departure and arrival passenger flows for each region were aggregated and then divided by their pre-pandemic values to ascertain the mobility levels in each area. Notably, Day 15 was observed to have one of the most significant reductions in passenger flow. Day 7 marked a period of decline, whereas Days 21 and 28 were identified as recovery phases. Furthermore, a disparity in the impact on mobility levels was observed between the CBD and its adjacent areas compared to regions located further from the CBD. b The spontaneous mobility changes consist of the reduction, trough, and recovery stages. These three stages are quantitatively measured and a set of corresponding features are formed, as listed in Table 1.

CBDs are typically characterized by high density, both in terms of population and infrastructure. This density leads to more congested public spaces and transportation systems. During the pandemic, such conditions increased the risk of virus transmission, leading to more stringent movement restrictions and a greater reluctance among the public to travel in these areas. Furthermore, this may be related to the attributes of land use, which will be discussed in the section “Population behavior behind mobility patterns”.

Identifying mobility patterns

To investigate mobility changes at a finer granularity (street level), bus and subway data were processed based on station information and passenger flow, while taxi data were analyzed by urban areas (see Methods for a detailed description of the data). The OD data derived from the public bus, subway, and taxi systems in Shenzhen include both spatial and temporal information on urban mobility. In the bus system, bus routes establish connectivity between two distinct urban regions corresponding to the origin station and destination station, and passenger flow is influenced by the gradual spread of the epidemic, resulting in temporal variations. This principle is similarly applicable to subway and taxi systems.

Following the large-scale outbreak of the COVID-19 pandemic (post-December 7, 2022), the mobility for almost all OD pairs swiftly plummeted from pre-pandemic levels to a markedly low volume, eventually showing signs of gradual recovery, as shown in Fig. 2. This change in trips can be delineated via a temporal variation curve for each OD pair, as shown in Fig. 3a. We normalize each time series of OD passenger flow using historical data (i.e., typical passenger flow levels before November 25, 2022) and subject the data to preprocessing.
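One way to read this normalization is sketched below: each observed daily flow of an OD pair is divided by a historical baseline for the same day type (weekday, Saturday, or Sunday). The use of a simple mean as the "typical" level, the function names, and the toy numbers are assumptions for illustration only; the preprocessing actually applied is described in Methods.

```python
from datetime import date, timedelta
from statistics import mean

def day_type(d):
    """Return 'weekday', 'saturday', or 'sunday' for a datetime.date."""
    wd = d.weekday()  # Monday is 0, Sunday is 6
    return "saturday" if wd == 5 else "sunday" if wd == 6 else "weekday"

def baselines(history):
    """Mean historical flow per day type. history: dict of date -> passengers."""
    groups = {}
    for d, flow in history.items():
        groups.setdefault(day_type(d), []).append(flow)
    return {kind: mean(flows) for kind, flows in groups.items()}

def trip_rate(observed, history):
    """Divide each observed daily flow by the baseline of the same day type."""
    base = baselines(history)
    return {d: flow / base[day_type(d)] for d, flow in observed.items()}

# Hypothetical toy data for one OD pair: two weeks before November 25, 2022.
start = date(2022, 11, 7)  # a Monday
typical = {"weekday": 200, "saturday": 120, "sunday": 100}
history = {}
for offset in range(14):
    d = start + timedelta(days=offset)
    history[d] = typical[day_type(d)]

observed = {date(2022, 12, 8): 180, date(2022, 12, 17): 60, date(2022, 12, 18): 45}
rates = trip_rate(observed, history)
for d in sorted(rates):
    print(d, day_type(d), rates[d])
```

Normalizing by day type keeps the regular weekly rhythm of travel from being mistaken for a pandemic effect.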

Fig. 3: Mobility patterns and the underlying population behavior.

a This time series chart shows the changes in OD trip rate during the period from December 8, 2022, to January 6, 2023. The clustering algorithm identifies four distinct groups with different mobility patterns of decline and recovery. The middle line in each plot represents the average trip rate for each cluster. b Through the spatial attractiveness of ULU, we can infer prominent mobility behavior under each cluster. The changes in mobility are related to mobility behavior between different origins and destinations. c indicates the total impact on the overall passenger volumes of subway, bus, and taxi services.

To quantify the initial decline and subsequent recovery in mobility during the COVID-19 pandemic (Fig. 2), we establish a set of features to assess how trips evolve, as listed in Table 1. The concept of resilience is about how the system responds to disturbances (Qian et al., 2022; Schwarz, 2018; Standish et al., 2014; Tabatabaei et al., 2018): does it withstand the shock and remain unchanged, does it adapt and transform into a new state, or does it collapse (Forzieri et al., 2022; Kumpfer, 2002; Zhao et al., 2021). Inspired by this, we concentrate on the resilience of mobility to the pandemic—a characteristic of mobility decreasing and rapidly recovering from large-scale infectious diseases. Features are devised to quantitatively measure the three stages—reduction, trough, and recovery— capturing the mobility changes throughout the COVID-19 pandemic.
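To make the three stages concrete, the following sketch computes a handful of summary features from a normalized trip-rate series. The definitions used here (average slopes before and after the trough, and the cumulative deviation from the baseline as the total impact) are simplified assumptions for illustration; they are not the formulations given in Methods, and the feature names are hypothetical.

```python
def resilience_features(rates):
    """Summarize a daily trip-rate series (1.0 = pre-pandemic level).

    Illustrative definitions only; not the formulations in Methods.
    """
    last = len(rates) - 1
    trough_day = min(range(len(rates)), key=lambda t: rates[t])
    trough = rates[trough_day]
    # Reduction stage: average daily change from the first day to the trough.
    decline_rate = (trough - rates[0]) / trough_day if trough_day > 0 else 0.0
    # Recovery stage: average daily change from the trough to the last day.
    recovery_rate = (rates[last] - trough) / (last - trough_day) if trough_day < last else 0.0
    # Cumulative deviation from the baseline, in units of trip rate x days.
    total_impact = sum(r - 1.0 for r in rates)
    return {
        "trough_day": trough_day,
        "trough": trough,
        "decline_rate": decline_rate,
        "recovery_rate": recovery_rate,
        "final_level": rates[last],
        "total_impact": total_impact,
    }

# Hypothetical U-shaped series for one OD pair.
toy_series = [1.0, 0.8, 0.6, 0.5, 0.6, 0.7, 0.8]
features = resilience_features(toy_series)
print(features)
```

Feature vectors of this kind, one per OD pair, are what a clustering step can then group into patterns.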

Table 1 Features to assess how mobility evolves over time (see Methods for detailed mathematical formulations) and the average values corresponding to four clusters derived from the K-means++ algorithm.

Once the features of changes in mobility for OD pairs are extracted, the mobility trends for different modes of transportation can be uniformly analyzed. Based on the aforementioned properties, we employ unsupervised machine learning technology, i.e. K-means++ clustering algorithm (Arthur and Vassilvitskii, 2006), to discern the similarities and disparities in passenger flow among various OD pairs (see Methods for detail). As depicted in Fig. 3a, the K-means++ algorithm categorizes the OD trip trends into four distinct patterns.
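For readers unfamiliar with the algorithm, a compact standard-library sketch of K-means++ seeding followed by Lloyd iterations is given below. It is not the implementation used in this study: the two-dimensional feature vectors (trough level, final recovery level) are hypothetical, and two clusters are used instead of four purely to keep the toy example short.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: each new center is sampled with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def kmeans(points, k, iters=50, seed=0):
    """Lloyd iterations started from K-means++ seeds; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical features per OD pair: (trough level, final recovery level).
od_features = [
    (0.30, 0.35), (0.32, 0.33), (0.28, 0.36),  # deep decline, little recovery
    (0.62, 1.00), (0.60, 0.98), (0.64, 1.02),  # shallow decline, full recovery
]
centers, labels = kmeans(od_features, k=2)
print(labels)
```

The seeding step spreads the initial centers apart, which tends to make the result less sensitive to initialization than uniformly random seeds.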

The OD mobility in Cluster 1 exhibits the most notable decline, with a total magnitude of impact of −13.47 %·days, as presented in Table 1. In the initial phase of the COVID-19 pandemic, a precipitous decrease in passenger flow was observed, with the average value plummeting to less than 50% by the 12th day. While the subsequent descent in passenger flow exhibited a slower pace, a persistently low level or downward trend was maintained. Certain OD passenger flows remained at a standstill until the end of the statistical period. This suggests a high degree of travel flexibility within this cluster, or significant aversion to infection, implying that this demographic endeavors to avoid travel throughout the pandemic. Furthermore, the resurgence of travel intent tends to be protracted, with a near-zero recovery rate in the short term.

Cluster 2 exhibits a pronounced U-shape curve with a comparable decline rate, as shown in Fig. 3a. The nadir of OD passenger flow appeared around the 15th day, hovering at approximately 50%. Thereafter, the travel volume exhibited a gradual resurgence, culminating in an average travel volume recovery of 82.2%. The graphical representation illustrated a significantly slower pace of travel recovery in the pandemic’s later stages, in contrast to the rapid decline observed during the initial phases. The mobility behavior characterizing Cluster 2 aligns with the epidemiological patterns of infection and recovery. During the initial phase of the COVID-19 pandemic, citizens refrained from traveling, either due to active infection or as a preventive measure against contagion. However, as the infection peaked and those infected began to recover, travel volume followed a trajectory of gradual recovery. This pattern is representative of the majority of mobility behavior.

The mobility change curve for Cluster 3 also exhibits a U-shaped pattern; however, its recovery speed significantly outpaces that of Cluster 2. Following the onset of the pandemic, there was a rapid decrease in passenger flow, averaging approximately a 48.8% reduction. Contrasting with the sluggish recovery observed in Cluster 2, the resurgence pace in Cluster 3 mirrors the speed of passenger flow decline experienced during the reduction stage. Consequently, by the conclusion of the observation period, the OD mobility level essentially reverted to its pre-epidemic benchmark. In the end, the mobility behavior corresponding to this cluster rapidly surmounted the impacts of the COVID-19 pandemic, facilitating, and even accelerating, the return to pre-epidemic levels.

The mobility patterns of OD bus stations in Cluster 4 are minimally impacted by the COVID-19 pandemic, with the total impact merely a third of that experienced by Cluster 1. The corresponding curve demonstrated minor fluctuations, with a peak declining amplitude of 38.2%. Across the tripartite stages of mobility alteration (reduction, trough, recovery), the pandemic’s impact on passenger volume was relatively insubstantial. This suggests that within this cluster, the corresponding origins and destinations exhibit robust travel demand, complemented by relatively inflexible mobility behavior.

Population behavior behind mobility patterns

The two crucial facets of population behavior are where to go and how to get there. Consequently, we delve into discussions on insights regarding travel purposes and modes during the processes of mobility decline and recovery.

Travel purpose

Mobility between two areas is aggregated from population behavior with various travel purposes. The land uses around each origin and destination can reveal the characteristics of the places that passengers visit most (Chang et al., 2021; Sun et al., 2007). Therefore, integrating OD data with the ULU map has the potential to reveal the travel purposes behind OD trips. For example, if a passenger departs from a station (or the pick-up point of a taxi trip) surrounded by residential land and alights near company land at nine o’clock, it can be inferred that the passenger is most likely commuting to work. The ULU categories therefore indicate people’s travel purposes.

More generally, we establish a ULU feature vector to represent the probability of departure or arrival at each nearby urban land category (Xing et al., 2020). We have collected nine common types of ULU information from Gaode Maps and Baidu Maps: residential, company, commercial service, transport hub, college, school, hospital, cultural/sport, and park/scenery. The OD pairs are mapped to OD land pairs according to the corresponding ULU feature vectors (see Methods for details). We further delineate the primary ULU visited and the key travel objectives within each cluster, as presented in Fig. 3b.
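A simplified reading of this mapping is sketched below. It assumes, purely for illustration, that a ULU feature vector is obtained by normalizing counts of nearby land parcels per category, and that the probability of an OD land pair is the product of the origin and destination probabilities (that is, the two ends are treated as independent). Neither assumption is taken from Methods, and the function names and counts are hypothetical.

```python
ULU_CATEGORIES = [
    "residential", "company", "commercial service", "transport hub",
    "college", "school", "hospital", "cultural/sport", "park/scenery",
]

def ulu_vector(nearby_counts):
    """Normalize per-category counts near a station into probabilities."""
    total = sum(nearby_counts.get(c, 0) for c in ULU_CATEGORIES)
    if total == 0:
        return {c: 0.0 for c in ULU_CATEGORIES}
    return {c: nearby_counts.get(c, 0) / total for c in ULU_CATEGORIES}

def land_pair_probabilities(origin_vec, dest_vec):
    """Probability of each (origin land, destination land) pair,
    assuming the two ends of a trip are independent."""
    return {
        (o, d): p_o * p_d
        for o, p_o in origin_vec.items()
        for d, p_d in dest_vec.items()
        if p_o * p_d > 0
    }

# Hypothetical counts of land parcels near an origin and a destination station.
origin = ulu_vector({"residential": 6, "commercial service": 2})
destination = ulu_vector({"company": 3, "hospital": 1})

pairs = land_pair_probabilities(origin, destination)
top_pair = max(pairs, key=pairs.get)
print(top_pair, pairs[top_pair])
```

In this toy case the dominant land pair is residential to company, which would be read as a likely commuting trip.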

Travel behavior to schools is mainly classified within Cluster 1 and Cluster 2. Following COVID-19’s onset, passenger flow experienced a precipitous decline, with no short-term recovery trend in sight. Firstly, the health of minors is often perceived as more vulnerable, prompting parents to exercise added caution in safeguarding their children from the COVID-19 virus (She et al., 2020). Parents likely prefer to refrain from sending their children to school, owing to concerns regarding viral transmission. Subsequently, as infection rates escalated, schools transitioned to online or remote learning modalities (Betthäuser et al., 2023). This shift obviated the need for students, teachers, and staff to commute daily, thereby perpetuating the decrease in passenger flow. The passenger volume at school-associated stations was anticipated to remain low until the pandemic stabilizes. As for colleges, numerous students were sent home ahead of the lifting of the "Zero-COVID" policy, resulting in a significant decline in passenger flow.

The majority of trips from residential, company lands to park/scenery, cultural/sports lands are concentrated within clusters with a more significant impact on mobility. Analyzing travel sentiments reveals that these journeys are deemed non-essential, and the inclination to embark on them significantly diminishes following the COVID-19 pandemic outbreak (Han et al., 2021). Given the increased risk of COVID-19 infection, older populations exhibit heightened caution when using buses and visiting crowded locations, resulting in highly affected resilience. After contracting COVID-19 (Wang et al., 2022), individuals experience compromised systemic and pulmonary functions (Mulcahey et al., 2021), leading to a reluctance to engage in strenuous physical activity at sports halls for a short duration. Consequently, there was a diminished recovery in passenger flow to sports halls. Although there is no lockdown ban, citizens avoid visiting by bus and maintain social distance.

Intriguingly, despite the overarching epidemic conditions, passenger counts for the transport serving major transport hubs such as airports, train stations, and ferry terminals exhibit comparatively minor reductions and rapid recovery, mainly aligning with the trends of Cluster 3 and Cluster 4. The relative stability of these numbers can largely be attributed to the inherent nature of long-distance travel across cities. Unlike short-range transit, which may be supplanted by walking, cycling, or personal vehicles, alternatives for long-haul journeys are notably limited, thereby maintaining a baseline demand for buses servicing these transport hubs even during the height of the COVID-19 pandemic.

The alterations in mobility patterns concerning residential—hospital are primarily observed within clusters characterized by a comparatively modest reduction in mobility. The OD trips exhibited fluctuation, characterized by a decline and a gradual recovery within 30 days. In the initial stages, individuals with chronic conditions are advised to minimize hospital visits to lower their exposure risk to COVID-19. However, the pandemic triggered an escalation in healthcare demand, as numerous individuals sought medical care, testing, and treatment (Birkmeyer et al., 2020; Peiffer-Smadja et al., 2020). The heightened need for infectious disease services counterbalanced the reduction in visits to other outpatient clinics in hospitals, thereby contributing to the recovery of passenger flow towards hospitals. Moreover, the surge in hospital visits necessitated that healthcare workers and other essential personnel continue to commute to and from hospitals, irrespective of the pandemic situation. Their unwavering travel patterns help maintain a basic level of passenger flow of over 40% trip rate.

The behavior disparity between residential—company exhibits various patterns, with a significant proportion found in various clusters. Overall, they are relatively less affected, serving as primary driving factors for urban mobility recovery. It is possibly related to the various industries of companies. On one hand, the characteristics of various industries significantly influence the patterns of public transport use. Employees in industries of internet technology and electric communication demonstrate greater adaptability to remote work during the COVID-19 pandemic. Due to the pressing demand for healthcare products, drugs, and research, stations located near pharmaceutical companies are expected to experience a faster recovery in passenger flow. The nature of manufacturing and factory work typically requires on-site participation, rendering remote work impractical. On the other hand, income levels within industries also influence travel patterns. Employees in the internet and telecommunications industries tend to have higher incomes and have more flexibility in choosing their transport modes, potentially opting for private vehicles over public transport during the COVID-19 pandemic to reduce exposure risks. In contrast, lower-income employees might be more reliant on public transportation to commute to their workplaces, maintaining the demand for buses. The income disparity between industries further contributes to the observed differences in passenger flow trends during the pandemic.

Travel behavior involving commercial service falls within Cluster 3 and Cluster 4. The effect on the service industry essentially aligns with the proportion of the population infected with the virus. Following about three years of "Zero-COVID" policy, the associated panic has largely dissipated. As people yearn to return to normalcy and recreational activities, customer flow at shopping malls stages a swift recovery in later phases. This observation appears comprehensible, given the fundamental nature of work activities as a pivotal component of economic endeavors. With the resumption of work activities, a consequential surge in demands for commercial activities is anticipated (Ma et al., 2023).

Overall, during the COVID-19 pandemic, commuting, commercial, and healthcare demand constituted the largest components of urban mobility. It is crucial to maintain and promptly restore the supply of buses among communities, transport hubs, hospitals, and companies. Integrating ULU data with OD data enables governments and transport operators to thoroughly analyze and elucidate shifts in mobility behavior during the pandemic. This comprehensive approach offers a solid foundation for well-informed policy development and implementation against unforeseen pandemics with high infection and low case fatality rates.

Travel mode

The total impact on the overall passenger volumes of subway, bus, and taxi services is computed, as presented in Fig. 3c. The data indicate varying degrees of reduction in passenger flow, with subways (−9.42 %·days) experiencing the most significant decline, followed by buses (−8.63 %·days) and taxis (−7.90 %·days). This trend can be primarily attributed to the perceived risk of COVID-19 transmission in different transportation environments and the adaptive responses of urban populations to the pandemic.

Subways, typically characterized by high passenger densities and closed environments, represent the apex of perceived transmission risk. The significant drop in subway usage can be attributed to people’s avoidance of crowded spaces and potential virus hotspots. Furthermore, the role of subways as connectors of various urban hubs rendered them particularly vulnerable to reduced usage as individuals sought to minimize travel and potential exposure to the virus. In contrast, buses, while also experiencing a notable decline in passenger flow, were marginally less impacted than subways. This difference might be due to the varied nature of bus routes, some of which cater to essential travel less feasible via other means, and the slightly lower passenger densities compared to subways.

Taxis, offering more individualized and controlled travel environments, demonstrated the least reduction in passenger flow. This trend suggests a public preference for modes of transportation perceived as safer and less conducive to virus spread. Nonetheless, the overall decline in taxi usage reflects broader patterns of reduced mobility, driven by lockdown measures, the shift to remote work, and heightened public health awareness. Economic factors also played a role, as the financial impacts of the pandemic might have influenced individuals’ transportation choices, with taxis being a costlier option compared to public transit. Collectively, these observations underscore the multifaceted impact of the COVID-19 pandemic on urban transportation, shaped by an interplay of health, lifestyle, and economic considerations.

Dynamic model of urban mobility

We develop a dynamic OD mobility model to quantitatively simulate the fine-grained impact of the COVID-19 pandemic on OD passenger flow. The mobility changes can be ascribed to the rate of COVID-19 infection and the willingness to travel. We employ an epidemic transmission model, the susceptible-infectious-removed (SIR) model (Cooper et al., 2020; Keeling and Eames, 2005), to simulate the rise in infection cases in the city. In general, people who got afflicted with COVID-19 would recuperate at home for days, leading to a plunge in overall city-wide human mobility flow. However, some patients might choose to crowd into the hospitals which leads to an unusual increase in travel needs between urban lands. Therefore, we establish the willingness factors to represent the emotional effects of the COVID-19 pandemic on various types of mobility behavior. The disparity in OD passenger flow under high spatial granularity can be elucidated by the travel purposes and travel modes (refer to Methods for details).

Specifically, people in the dynamic OD model have three distinct states: usual (U), infectious (I), and recovered (R) (Fig. 4). In contrast to the classic SIR model, the usual state in the proposed model refers to passengers who remain uninfected by COVID-19, with unaffected travel willingness. Infectious passengers encompass those who have contracted the COVID-19 virus or have been driven by panic to abstain from bus travel. Over time, if affected passengers decide to resume travel, they transition into the recovered state. The model has merely four free parameters that scale: (1) the transmission rate of the COVID-19 pandemic, (2) the recovery rate of the COVID-19 pandemic, (3) willingness factors for transmission, and (4) willingness factors for recovery; all four parameters remain constant over time. The first two parameters, which are determined by the infectivity and virulence of the epidemic itself, remain consistent for all OD pairs. The latter two categories of parameters are determined by travel purposes, which can be inferred from the ULU surrounding OD bus stations.
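The structure of such a three-state model can be illustrated with a small discrete-time simulation. The update equations below are an assumed, SIR-like sketch and not the formulation given in Methods: the effective transmission rate is taken as beta × (1 − w_trans) and the effective recovery rate as gamma × (1 + w_rec), so that a negative willingness factor deepens the decline or slows the recovery. The trip rate is taken as the share of passengers in the U and R states. All parameter values are hypothetical.

```python
def simulate_uir(beta, gamma, w_trans=0.0, w_rec=0.0, i0=0.001, days=30):
    """Daily trip rates (share of passengers in U or R) for one OD pair.

    Assumed update rule, for illustration only:
      U -> I at rate beta * (1 - w_trans) * U * I
      I -> R at rate gamma * (1 + w_rec) * I
    """
    eff_beta = max(beta * (1.0 - w_trans), 0.0)   # negative w_trans: faster decline
    eff_gamma = max(gamma * (1.0 + w_rec), 0.0)   # negative w_rec: slower recovery
    u, i, r = 1.0 - i0, i0, 0.0
    rates = []
    for _ in range(days):
        rates.append(u + r)                       # passengers currently traveling
        new_i = min(eff_beta * u * i, u)          # cannot exceed those still in U
        new_r = min(eff_gamma * i, i)             # cannot exceed those still in I
        u, i, r = u - new_i, i + new_i - new_r, r + new_r
    return rates

# Hypothetical parameters: one baseline OD pair and one with reluctant recovery.
baseline = simulate_uir(beta=0.6, gamma=0.15)
slow_recovery = simulate_uir(beta=0.6, gamma=0.15, w_rec=-0.5)

print(round(min(baseline), 3), round(baseline[-1], 3))
print(round(min(slow_recovery), 3), round(slow_recovery[-1], 3))
```

With the willingness factors set to zero the sketch reduces to a plain SIR-type curve shared by all OD pairs; nonzero factors are what allow different travel purposes to decline and recover at different speeds.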

Fig. 4: A dynamic OD mobility model is developed to simulate the fine-grain impact of the COVID-19 pandemic on mobility behavior.

The model inputs encompass the pandemic’s transmission and recovery rates, as well as people’s willingness factors. Furthermore, the model requires the ULU information near bus stations and the pre-outbreak historical passenger flow levels. In the model, the passengers have usual (U), infectious (I), and recovered (R) states. We fit models to all OD pairs in bus, subway, and taxi systems, which shows full model fits of four clusters corresponding to Fig. 3. The blue line represents the model predictions. As the trips of the OD passenger flow tend to have great variability, we also show the smoothed average (line of a different color from blue). Shaded regions denote the 2.5th and 97.5th percentiles across stochastic realizations. The green dashed line represents the predictions without considering willingness factors (In this case, the predictions for all clusters are consistent). We sample 100 parameter sets of willingness factors and perform stochastic realizations for each set.

The willingness factors serve as a measure of how much the panic mentality contributes to the decline and recovery in mobility. They reflect the degree of anxiety and perceived risk associated with traveling during the COVID-19 pandemic. For example, as the number of infectious passengers increases, other passengers perceive a higher risk of infection when taking the subway, which leads them to seek alternative means of transport or avoid traveling altogether. This corresponds to a negative willingness factor. Conversely, the surge in public health demand during the COVID-19 pandemic leads to positive willingness factors associated with medically related travel. The willingness factors collectively illustrate the various responses of different travel purposes to the COVID-19 pandemic.

Our model precisely matches the observed trips of OD pairs in Shenzhen from December 8, 2022, to January 6, 2023, as shown in Fig. 4. As shown in Supplementary Fig. 1, the average values of willingness factors correspond to the analysis of mobility patterns in section “Population behavior behind mobility patterns”. For example, the value of the willingness factor for recovery associated with the travel behavior of school-age children exhibits a significantly negative trend, indicating a slow recovery in their travel activities. This observation aligns well with the predominant mobility pattern observed within Cluster 1. The willingness factors for the recovery of “residential-transport hub” and “company-transport hub” are the two highest positive values, indicating that there are fewer viable alternatives for long-distance travel, and as a result, passenger flow is less affected. The distinctions between subway, bus, and taxi services can also be discerned from the willingness factors.

At a micro level, we can forecast the decline and recovery of mobility between any two stations or taxi-operating regions in Shenzhen. At a macro level, our model precisely captures the average mobility changes for the four clusters (Fig. 4). The dynamic OD model enables even a relatively straightforward UIR model to accurately fit observed passenger flow, despite the heterogeneous mobility behavior during that period. This model offers valuable insights into urban mobility during potential future outbreaks of highly infectious diseases, empowering policymakers to deduce alterations in urban mobility and population behavior during the initial stages of an epidemic outbreak.

Discussion

Although COVID-19 has become a familiar presence in our lives, it continues to pose a significant global threat. As we write this paper, the virus still claims a life every three minutes (WHO, 2023). The suffering and the painful lessons learned from the COVID-19 pandemic must not be in vain. The analytical framework and model proposed in this research contribute to the long-term management of the COVID-19 pandemic and offer tools to confront potential future viral epidemics.

Based on 148 million travel records, this paper examines the fine-grained spatio-temporal characteristics of urban mobility during the COVID-19 pandemic. It integrates these characteristics with an urban land use map to elucidate the heterogeneity in the decline and recovery of mobility. In terms of travel purposes, trips originating from schools and colleges experienced a sharp decline attributable to class suspensions and risk aversion to infections. Non-essential travel, such as visits to museums, sports halls, and parks, saw a substantial decrease and a gradual recovery. In contrast, travel associated with commuting, commercial services, and healthcare exhibited relatively modest declines and a quicker rebound. Regarding transportation modes, the order from most affected to least affected was subway, bus, and taxi. Subsequently, the UIR model with willingness factors is built, which comprehensively captures the influence of COVID-19 on travel willingness. In contrast to previous research that primarily focused on macro-level mobility analysis, we propose a set of indicators to measure the temporal variations in urban mobility and uncover the heterogeneity of mobility changes through clustering. Few existing urban mobility models operate at the block level while encompassing various modes of transportation.

Our dataset has limitations as it does not encompass all modes of urban mobility. The model is also minimalist, neglecting regional differences in epidemic spread. Nevertheless, our findings are valuable for revealing spatial differences in the decline and recovery of travel. The proposed analytical framework and dynamic model are adaptable and extensible: they can be applied to other pandemics, other cities, or other travel modes, capturing detailed aspects of real-world urban mobility.

This paper offers potential implications for future application and research aimed at formulating more targeted and effective public health policies and strategies. In the context of evaluating public health policies, our results can guide policymakers in assessing the opportunity cost of urban mobility limitations during large-scale pandemic outbreaks. The dynamic OD urban mobility model provides a benchmark for mobility changes without mobility-restricting intervention, enabling the precise evaluation of the impacts on the transmission rate and economic productivity. Additionally, in the realm of policy formulation, policy decisions can influence mobility behavior by altering travel willingness and emotions associated with different POI types, significantly affecting the outcome of the epidemic infection. Improved paid leave policy or income support can reduce mobility for essential staff during illness. Strengthening hygiene and disinfection, maintaining proper ventilation, and increasing routes between origins and destinations of high travel demand contribute to enhanced passenger safety. Lastly, in terms of future academic research, fusing large language models with computational, interactive agents seems to be a path to realistic simulations of human behavior (Xu et al., 2023). Based on emerging urban sensing technologies (e.g., drones (Wang et al., 2022) and crowdsensing (Chen et al., 2019; Guo et al., 2021; Xia et al., 2023)), diverse and heterogeneous human behavioral data are continuously mined. Integrating travel behavior and large models to construct urban individual profiles for simulating travel behaviors appears to be a promising direction. Additionally, we will continue to investigate the changes and resilience of urban elements when faced with restricted urban mobility, including variations in air pollution (Chen et al., 2020), unmanned delivery (Chen et al., 2022), and emergency communication (Ren et al., 2023).

Methods

Datasets

Origin-destination data of urban mobility

Thanks to the extensive deployment of mobile devices (Chen et al., 2020), the travel data are collected by a public transportation service company in Shenzhen. The original data, gathered from 663 bus routes, 16 subway routes, and 17,826 taxis during operation, comprise approximately 148 million trips in total. Through advanced data collection (Li et al., 2022) and processing techniques (Chen et al., 2018), aggregated OD data can be obtained. The data collection period for bus-related information spans from October 15, 2022, to January 6, 2023, while that for subway and taxi data extends from November 1, 2022, to January 6, 2023. The OD data for buses has a shape of (3010, 3010, 98), where the first and second dimensions correspond to 3010 bus stations representing origins and destinations, respectively, and the third dimension represents dates. Similarly, the OD data for subways has a shape of (240, 240, 67) with a similar interpretation. The original data for taxis consists of individual trip records, including vehicle ID, pick-up time, drop-off time, pick-up latitude and longitude, and drop-off latitude and longitude, among other details. Based on the urban land use (ULU) map of Shenzhen (Fig. 5a), which is divided into 11,362 regions (Gong et al., 2020), we match each order’s latitude and longitude to the corresponding urban region or, failing that, to the nearest urban region based on proximity to the region’s boundaries. The taxi OD data are organized into a matrix of shape (11362, 11362, 67). Among them, we specifically focus on the 56 million urban mobility records spanning from December 8, 2022, to January 6, 2023. The data preceding December 7, 2022, are utilized to calculate the baseline level of mobility and assist in obtaining the spatial attractiveness of ULU.

Fig. 5: Urban land use map and how to formulate land use feature vector.

a This map shows urban land use categories in Shenzhen. b This figure illustrates the process of obtaining the urban land use feature vector.

This dataset, capturing changes in passenger flow before and after China’s cancellation of the "Zero-COVID" policy, is distinct from other datasets. China adhered to the "Zero-COVID" policy from 2020 to 2022, with over 99% of Chinese residents never having been infected with COVID-19. After the policy’s cancellation on December 7, 2022, Shenzhen experienced the entire process of large-scale COVID-19 spread without mobility-restricting intervention. As of July 23rd, 2023, China has implemented universal and free COVID-19 vaccination for its population. The reported vaccination rates are as follows (Xinhua, 2022): a cumulative coverage rate of 92.1% for the first dose, a completion rate of 89.7%, and an enhanced immunization rate of 71.7%. Among individuals aged 60 and above, the rates are 89.6% for at least one dose, 84.7% for completion, and 67.3% for enhanced immunization. The widespread administration of vaccines has significantly reduced the mortality rate, facilitating recovery from COVID-19. However, the transmissibility of SARS-CoV-2 remains high. This data offers insights into the effects of highly infectious and low-toxicity viruses on social mobility and public transport use under natural transmission conditions in the city. This information provides a baseline for assessing the potential impact of future infectious diseases on urban mobility in the absence of government intervention.

This dataset encompasses the entire process of trip changes, from the initial stage to a sharp decline, and ultimately to a gradual recovery during large-scale infectious disease outbreaks in cities. From November 1, 2022, to November 25, 2022, Shenzhen experienced fewer than 50 confirmed cases per day, and the daily lives of citizens remained largely unaffected (Shenzhen Government, 2022). The data from this period can be considered the baseline passenger volume for Shenzhen’s bus system. As the number of confirmed cases progressively increased, the government implemented stringent lockdown measures from November 25, 2022, to December 7, 2022, resulting in a substantial decrease in passenger flow. As of December 7, 2022, the cumulative number of confirmed cases constituted less than 0.1% of the total population. The official outbreak of the COVID-19 pandemic transpired after the cancellation of the "Zero-COVID" policy. The Omicron variant of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spread rapidly (Leung et al., 2023), which is evident in the drastic reduction of passenger flow within the public transport system. Subsequently, as a majority of urban residents became infected and recovered, the public transport passenger flow gradually rebounded, as depicted in Fig. 1.

Assuming the passenger flow from origin i to destination j on day t is denoted as \({f}_{i,j,t}\), the entire dataset can be denoted as

$${{{\bf{F}}}}={\left\{{f}_{i,j,t}\right\}}_{i = 1,2,...,M;j = 1,2,...,M;t = 1,2,...,T}$$
(1)

where M represents the sum of the number of bus stations, subway stations, and the total number of taxi-operating urban regions. T is the number of days.
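As a rough illustration of how individual trip records could be aggregated into the entries of Eq. (1), the sketch below counts trips per (origin, destination, day). The record layout and the function names `build_od_counts` and `od_time_series` are hypothetical and are not taken from the original data pipeline. A sparse dictionary is used here because a dense array of shape (11362, 11362, 67) would hold billions of entries, most of them zero.

```python
from collections import defaultdict


def build_od_counts(trips):
    """Count trips per (origin, destination, day).

    `trips` is an iterable of (origin_id, destination_id, day_index) tuples.
    This layout is a hypothetical simplification of the raw trip records.
    """
    counts = defaultdict(int)
    for origin, destination, day in trips:
        counts[(origin, destination, day)] += 1
    return counts


def od_time_series(counts, origin, destination, num_days):
    """Return the daily flow of one OD pair; days are 0-indexed here."""
    return [counts.get((origin, destination, day), 0) for day in range(num_days)]


# Toy example: three trips from region 0 to region 1, one trip back.
trips = [(0, 1, 0), (0, 1, 0), (0, 1, 2), (1, 0, 1)]
counts = build_od_counts(trips)
series = od_time_series(counts, 0, 1, 3)
```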

Point of interest data and urban land use data

As online social media and mobile communication continue to flourish, location-based service (LBS) systems such as Google Places, Gaode Maps (China Satellite Maps), and Facebook are gaining popularity across various sectors (Han et al., 2021; Xing et al., 2020). These LBS systems enable users to search for points of interest (POIs) to access better services and share experiences from places they have visited (Chang et al., 2021). Generally, a POI is a specific location that individuals may find useful or intriguing. The term is commonly used to refer to commercial services, schools, subway stations, or other categories found in digital maps (Wang et al., 2019; Zhu et al., 2020).

A detailed land use map of Chinese cities can be derived by combining POI data with 10-meter satellite imagery, OpenStreetMap, nighttime lights, and Tencent social big data (Gong et al., 2020). This map serves as a valuable resource for inferring travel intentions. As the data is from 2018, we complement the newly developed land use attributes based on the latest available POI data. The data sources include Gaode Maps and Baidu Maps, which are leading providers of digital map content, navigation, and location service solutions in China. Important urban land use categories for travel are selected, including residential, company, commercial service, transport hub, college, school, hospital, cultural/sport, and park/scenery.

OD data preprocessing

For each OD pair (i, j) in bus, subway, and taxi systems, the time series of mobility can be represented as:

$${{{{\bf{f}}}}}_{i,j}=[{f}_{i,j,1},{f}_{i,j,2},...,{f}_{i,j,T}],(i,j=1,2,...,M)$$
(2)

Since the passenger flow at a single OD pair exhibits randomness, \({{{{\bf{f}}}}}_{i,j}\) experiences significant fluctuations over time, as shown in Supplementary Fig. 2. To better illustrate the impact of COVID-19 on public transport, we apply Kalman filtering to smooth the data.

Kalman filtering is a recursive algorithm utilized for estimating the state of a dynamic system by combining noisy measurements with a mathematical model of the system (Chui et al., 2017). When applied to noisy time series data, the Kalman filter can provide a smoothed version of the data by recursively estimating the underlying state of the system that generated the data. The filter accounts for uncertainties in both the measurements and the system’s model, rendering it particularly effective at reducing noise while preserving the true signal. Upon applying Kalman filtering, the resulting sequence for the OD station pair (i, j) is denoted as:

$${{{{{\bf{f}}}}}^{{\prime} }}_{i,j}={{{\rm{Kalman}}}}\,{{{\rm{Filtering}}}}({{{{\bf{f}}}}}_{i,j})$$
(3)

The mobility between different OD pairs exhibits significant variation, with a maximum difference spanning several orders of magnitude. To uncover the general patterns of the impact of large-scale infectious diseases on passenger flow, it is essential to normalize the time series \({{{{{\bf{f}}}}}^{{\prime} }}_{i,j}\). Specifically, for any OD pair (i, j), the passenger flow on December 8, 2022, is considered a normal level unaffected by the epidemic. The normalized time series is calculated by:

$${{{{{\bf{f}}}}}^{{\prime}{\prime}}}_{i,j}=\left[1,\frac{{{f}^{{\prime} }}_{i,j,2}}{{{f}^{{\prime} }}_{i,j,1}},...,\frac{{{f}^{{\prime} }}_{i,j,T}}{{{f}^{{\prime} }}_{i,j,1}}\right]$$
(4)
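The two preprocessing steps in Eqs. (3) and (4) can be sketched as follows. The paper does not state which state-space model or noise settings were used, so this example assumes a scalar random-walk (local level) model, and the variances `process_var` and `measurement_var` are hypothetical placeholders. It runs the forward filtering pass only.

```python
def kalman_filter_1d(series, process_var=1.0, measurement_var=25.0):
    """Forward Kalman filter for a scalar random-walk state (assumed model)."""
    estimate = float(series[0])
    error_var = measurement_var
    filtered = [estimate]
    for observation in series[1:]:
        # Predict: the state is assumed to persist, so only uncertainty grows.
        error_var += process_var
        # Update: blend the prediction and the new observation.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (observation - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


def normalize_to_first_day(filtered):
    """Eq. (4): divide every value by the value on the first day."""
    base = filtered[0]
    if base == 0:
        raise ValueError("first-day flow is zero; the OD pair cannot be normalized")
    return [value / base for value in filtered]


raw = [100, 80, 120, 60]
filtered = kalman_filter_1d(raw)
normalized = normalize_to_first_day(filtered)
```

A larger `measurement_var` relative to `process_var` gives a smoother output, at the cost of reacting more slowly to genuine changes in flow.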

Identifying mobility patterns via clustering algorithm

Measuring mobility changes

The features of changes in mobility are developed to facilitate description and subsequent clustering analysis. Drawing inspiration from the resilience in the fields of biology and engineering (Schwarz, 2018; Standish et al., 2014; Tabatabaei et al., 2018), we develop a set of features to comprehensively quantify the decline and recovery patterns of travel before, during, and after a large-scale infectious disease outbreak. These features can characterize the temporal changes in the response of mobility behavior to the pandemic, which reflects the interplay between human behavior, public policy, and the built environment, and provides insights into the dynamics of urban mobility under extraordinary circumstances.

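Firstly, for any OD time series \({{{{{\bf{f}}}}}^{{\prime}{\prime}}}_{i,j}\), we can identify several basic features, including the maximum value \({R}_{{\rm{top}}}\) before the passenger flow decreases, the minimum value \({R}_{{\rm{bottom}}}\) of passenger flow, the final passenger flow level \({R}_{{\rm{end}}}\), and the corresponding time moments \({t}_{0},{t}_{1},{t}_{2},{t}_{3}\). As Eqs. (5), (7), and (8) imply, the decline runs from \({t}_{0}\) to \({t}_{1}\), the trough lasts from \({t}_{1}\) to \({t}_{2}\), and the recovery runs from \({t}_{2}\) to \({t}_{3}\). The definition and significance of the features are as follows.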

Declining speed Γ: The speed at which trips decrease serves as a measure of how quickly urban populations adapt their mobility behavior in response to public health crises. It is calculated by:

$${{\Gamma }}=\frac{{R}_{{{{\rm{bottom}}}}}-{R}_{{{{\rm{top}}}}}}{{t}_{1}-{t}_{0}}$$
(5)

Declining amplitude Λ: The maximum amplitude of the decrease in trips directly mirrors the influence of the epidemic on mobility behavior.

$${{\Lambda }}={R}_{{{{\rm{bottom}}}}}-{R}_{{{{\rm{top}}}}}$$
(6)

Trough duration Π: The third indicator of mobility changes we define pertains to the trough duration of mobility. This indicator captures the persistence of the reduced trips during the COVID-19 pandemic, reflecting the extent to which the usual travels are disrupted.

$${{\Pi }}={t}_{2}-{t}_{1}$$
(7)

Recovery speed Θ: When infected individuals gradually recover and the overall panic levels subside, this metric captures the speed at which urban mobility bounces back to normalcy. The determination of the indicator is achieved through:

$${{\Theta }}=\frac{{R}_{{{{\rm{end}}}}}-{R}_{{{{\rm{bottom}}}}}}{{t}_{3}-{t}_{2}}$$
(8)

Recovery amplitude ϒ: The amplitude of mobility recovery highlights the degree to which trips rebound after the initial impact of the pandemic. As infected citizens gradually recover and no longer reject traveling, this indicator provides insights into the interplay between public health measures, human behavior, and the ability of societies to regain their normal functioning after experiencing a large-scale infectious disease outbreak.

$${{\Upsilon }}={R}_{{{{\rm{end}}}}}-{R}_{{{{\rm{bottom}}}}}$$
(9)

Total impact Δ: We finally introduce a comprehensive metric that captures the decline and recovery process of the pandemic’s impact on mobility. This total impact indicator effectively represents the degree to which the pandemic affects mobility, calculated as the area between the curve and the usual level of trips using the formula:

$${{\Delta }}=\int\nolimits_{{t}_{0}}^{{t}_{3}}(R(t)-{R}_{{{{\rm{top}}}}})dt$$
(10)

A lower value indicates a more substantial influence of the epidemic on mobility.

Therefore, we extract features of mobility change from each OD pair (i, j):

$${{{{{\bf{f}}}}}^{{\prime}{\prime}}}_{i,j}\to {{{{\bf{x}}}}}_{i,j}=[{{{\Gamma }}}_{i,j},{{{\Lambda }}}_{i,j},{{{\Pi }}}_{i,j},{{{\Theta }}}_{i,j},{{{\Upsilon }}}_{i,j},{{{\Delta }}}_{i,j}]$$
(11)
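The feature vector of Eq. (11) could be computed along the following lines. How the time moments are located is not specified in the text, so this sketch makes explicit assumptions: the trough is the contiguous run of values within `trough_tol` of the minimum, t0 is the position of the maximum before the trough, and t3 is the last day. The integral in Eq. (10) is approximated with the trapezoidal rule.

```python
def mobility_features(series, trough_tol=0.05):
    """Compute the six features of Eqs. (5)-(10) from a normalized series.

    The rules for locating t0..t3 are assumptions made for illustration.
    """
    bottom_idx = min(range(len(series)), key=series.__getitem__)
    r_bottom = series[bottom_idx]
    t0 = max(range(bottom_idx + 1), key=series.__getitem__)
    r_top = series[t0]
    t3 = len(series) - 1
    r_end = series[t3]

    # Grow the trough left and right while values stay near the minimum.
    t1 = bottom_idx
    while t1 - 1 > t0 and series[t1 - 1] <= r_bottom + trough_tol:
        t1 -= 1
    t2 = bottom_idx
    while t2 + 1 < t3 and series[t2 + 1] <= r_bottom + trough_tol:
        t2 += 1

    # Trapezoidal approximation of the integral in Eq. (10).
    total_impact = sum(
        ((series[k] - r_top) + (series[k + 1] - r_top)) / 2.0 for k in range(t0, t3)
    )
    return {
        "declining_speed": (r_bottom - r_top) / (t1 - t0) if t1 > t0 else 0.0,
        "declining_amplitude": r_bottom - r_top,
        "trough_duration": t2 - t1,
        "recovery_speed": (r_end - r_bottom) / (t3 - t2) if t3 > t2 else 0.0,
        "recovery_amplitude": r_end - r_bottom,
        "total_impact": total_impact,
    }


features = mobility_features([1.0, 0.8, 0.4, 0.4, 0.6, 0.8])
```

With these sign conventions the declining speed, declining amplitude, and total impact are negative for a series that drops below its starting level, which matches the statement that a lower total impact means a stronger effect.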

K-means++ clustering

In order to identify change patterns of OD mobility during the pandemic, we perform the K-means++ clustering algorithm on the OD dataset. K-means++ clustering is an unsupervised machine learning algorithm used for partitioning a given dataset into a specified number (K) of clusters.

K-means++ clustering aims to minimize the within-cluster sum of squared (WCSS) distances from the data points to their respective cluster centroids by iteratively updating the centroids and assigning the data points. Specifically, the input of the K-means++ clustering algorithm is

$${{{\bf{X}}}}={\left\{{{{{\bf{x}}}}}_{i,j}\right\}}_{i,j = 1,2,...,M}$$
(12)

where each OD time series is represented by the features.

The algorithm first initializes the cluster centers. A centroid \({{{{\bf{c}}}}}_{1}\) is randomly chosen from \({{{\bf{X}}}}\). For each \({{{{\bf{x}}}}}_{i,j}\in {{{\bf{X}}}}\), compute the squared distance to its nearest centroid:

$$D{({{{{\bf{x}}}}}_{i,j})}^{2}=\mathop{\min }\limits_{{{{{\bf{c}}}}}_{p}\in {{{\mathcal{C}}}}}{\left\Vert {{{{\bf{x}}}}}_{i,j}-{{{{\bf{c}}}}}_{p}\right\Vert }_{2}^{2},$$
(13)

where \({{{\mathcal{C}}}}={\left\{{{{{\bf{c}}}}}_{p}\right\}}_{p = 1,...,k-1}\) represents the set of centroids chosen so far. Then, the next centroid \({{{{\bf{c}}}}}_{k}\) is determined from the data points with probability proportional to \(D{({{{{\bf{x}}}}}_{i,j})}^{2}\):

$${{{{\bf{c}}}}_k} = \mathop{\arg\max}\limits_{{{{{\bf{x}}}}_{i,j}}} \frac{{D{{({{{{\bf{x}}}}_{i,j}})}^2}}}{{\sum\limits_{{{{{\bf{x}}}}_{i,j}} \in {{{\bf{X}}}}} D {{({{{{\bf{x}}}}_{i,j}})}^2}}}$$
(14)

Repeat the two steps until we have taken K centers altogether.

After selecting the K initial centroids, the algorithm proceeds with the standard iterative updates of centroids. Step 1 is assigning each \({{{{\bf{x}}}}}_{i,j}\) to the closest centroid using Eq. (13). Suppose that at iteration n, the dataset is divided into K clusters:

$${{{{\mathcal{X}}}}}_{p}^{(n)}={\left\{{{{{\bf{x}}}}}_{i,j}| {\left\Vert {{{{\bf{x}}}}}_{i,j}-{{{{\bf{c}}}}}_{p}^{(n)}\right\Vert }_{2}^{2} = D{({{{{\bf{x}}}}}_{i,j})}^{2}\right\}}^{(n)},(p=1,2,...,K)$$
(15)

In step 2, calculate new centroid positions for each cluster by minimizing the WCSS distance:

$${{{\bf{c}}}}_p^{(n + 1)} = \mathop {{\rm{arg}}\,{\rm{min}}}\limits_{{{{{\bf{c}}}}_p}} \mathop{\sum}\limits_{{{{{\bf{x}}}}_{i,j}} \in {{{\mathcal{X}}}}_p^{(n)}} {\left\| {{{{{\bf{x}}}}_{i,j}} - {{{{\bf{c}}}}_p}} \right\|_2^2}$$
(16)

The two steps are repeated until convergence is reached, i.e., when the assignment of samples to clusters no longer changes.
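A compact sketch of the seeding and update steps is given below, using only the Python standard library. It follows the common formulation of K-means++, in which the next centroid is sampled with probability proportional to the squared distance, whereas Eq. (14) as written selects the maximizer; which variant was actually used is not stated. The function names and the `seed` argument are illustrative.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, rng):
    """Seeding step: sample each new centroid with probability ~ D(x)^2."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centroids


def kmeans(points, k, seed=0, max_iter=100):
    """Alternate assignment (Eq. 15) and centroid update (Eq. 16)."""
    rng = random.Random(seed)
    centroids = kmeanspp_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda p: sq_dist(x, centroids[p])) for x in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for p in range(k):
            members = [x for x, lab in zip(points, labels) if lab == p]
            if members:  # the mean minimizes the within-cluster squared distance
                centroids[p] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

Note that `random.choices` raises an error if all weights are zero, which would happen only if every remaining point coincides with an existing centroid.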

The elbow method is utilized to determine the optimal number of clusters, denoted as K, in K-means clustering (Bholowalia and Kumar, 2014). The elbow method involves plotting the variance explained or the sum of squared distances (SSE) of the data points to their cluster centroids against different values of K. The elbow method suggests selecting the value of K at the “elbow” or bend in the SSE plot (Supplementary Fig. 3a). This point represents the optimal trade-off between minimizing SSE and avoiding excessive complexity in the clustering model.

After establishing a preliminary range for K using the elbow method, silhouette analysis is employed to finalize the value of K (Lleti et al., 2004). For each candidate K, the silhouette coefficient is computed for each data point; it measures how similar the point is to its assigned cluster compared with the other clusters, with values ranging from –1 to 1. Subsequently, the average silhouette coefficient is calculated over all data points for each K. This average value serves as an indicator of cluster cohesion and separation, with higher values indicating better clustering results. As depicted in Supplementary Fig. 3b, the analysis determined that the optimal value is K = 4.
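For reference, the average silhouette coefficient can be computed from its standard definition as sketched below: for each point, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster. The function name `mean_silhouette` and the convention of scoring singleton clusters as 0 are choices made for this illustration; the sketch assumes at least two clusters and points that are not all identical.

```python
import math


def euclid(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette(points, labels):
    """Average silhouette coefficient over all points (needs >= 2 clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # convention for a cluster with one member
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(euclid(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclid(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = mean_silhouette(points, [0, 0, 0, 1, 1, 1])
bad = mean_silhouette(points, [0, 1, 0, 1, 0, 1])
```

Repeating this calculation for each candidate K and keeping the K with the highest average is the selection rule described above.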

Finally, we derive K clusters of OD mobility, as shown in Fig. 3a. The labels for each OD pair have been obtained. Furthermore, the OD pairs can be associated with spatial information, which leads to subsequent analysis.

Spatial attractiveness of urban land use

Our goal is to illuminate the variation in mobility for OD pairs during a widespread infectious disease outbreak. Solely analyzing the geographical locations of the starting and ending stations of travel, however, is inadequate. As a result, we endeavor to extract travel purposes by integrating the urban land use map.

Lands in close proximity to bus stations, subway stations, and taxi pick-up and drop-off points provide insight into passengers’ potential travel motivation. For example, when a bus station is located on residential land, there is a high probability that passengers disembarking at this station intend to return to their residences. Accordingly, a ULU feature vector is established for each origin and destination, which is represented as:

$${{{{\bf{s}}}}}_{i}=\left[{s}_{i,1},...,{s}_{i,r},...,{s}_{i,R}\right]$$
(17)

where r = 1, 2, ..., R indexes the urban land use categories, and \({s}_{i,r}\) is defined as the spatial attractiveness of land use category r to the citizens boarding or disembarking at station or urban region i (i = 1, 2, ..., M).

For the bus stations and subway stations, a common method (Xing et al., 2020) is to use the number of land parcels to determine \({s}_{i,r}\):

$${s}_{i,r}=\frac{{z}_{i,r}}{\mathop{\sum }\nolimits_{r=1}^{R}{z}_{i,r}}$$
(18)

where \({z}_{i,r}\) denotes the number of land parcels belonging to category r within a certain distance range of station i. Taking into account the varying importance of stations, the distance range for bus stations is set at 200 meters, while the distance range for subway stations is set at 500 meters, as presented in Fig. 5b.
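Eq. (18) amounts to a normalized category count, which can be sketched as below. The category list follows the land use categories named earlier; the input `nearby_categories` is assumed to already contain only the parcels within the 200 m or 500 m range, and returning a zero vector for a station with no parcels nearby is a choice made for this illustration.

```python
CATEGORIES = [
    "residential", "company", "commercial service", "transport hub",
    "college", "school", "hospital", "cultural/sport", "park/scenery",
]


def ulu_feature_vector(nearby_categories, categories=CATEGORIES):
    """Eq. (18): share of each land use category among nearby parcels."""
    counts = [sum(1 for c in nearby_categories if c == cat) for cat in categories]
    total = sum(counts)
    if total == 0:
        # No parcels in range: fall back to an all-zero vector (assumption).
        return [0.0] * len(categories)
    return [n / total for n in counts]


# A hypothetical station with four parcels inside its distance range.
s_i = ulu_feature_vector(["residential", "residential", "school", "company"])
```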

As mentioned in “Origin-destination data of urban mobility”, the taxi OD data represents the mobility between 11,362 distinct regions within the city. These urban regions have been delineated with nearly homogeneous land use attributes. Therefore, \({s}_{i,r}\) can be derived directly:

$${s}_{i,r}=1$$
(19)
$${s}_{i,1},...,{s}_{i,r-1},{s}_{i,r+1},...,{s}_{i,R}=0$$
(20)

For an urban region with the land use attribute r, \({s}_{i,r}\) equals 1, while the remaining entries of the land use feature vector are set to 0.

As each urban land use category can be mapped to an activity, identifying the most probable land use category corresponds to determining the most likely activity to be undertaken. To further analyze the mobility behavior from the origin i to the destination j, we compute the travel transition matrix:

$${{{{\bf{S}}}}}_{i,j}={{{{\bf{s}}}}}_{i}^{{{{\rm{T}}}}}* {{{{\bf{s}}}}}_{j}=\left[\begin{array}{cccc}{s}_{i,1}{s}_{j,1}&{s}_{i,1}{s}_{j,2}&\cdots \,&{s}_{i,1}{s}_{j,R}\\ {s}_{i,2}{s}_{j,1}&{s}_{i,2}{s}_{j,2}&\cdots \,&{s}_{i,2}{s}_{j,R}\\ \vdots &\vdots &\ddots &\vdots \\ {s}_{i,R}{s}_{j,1}&{s}_{i,R}{s}_{j,2}&\cdots \,&{s}_{i,R}{s}_{j,R}\end{array}\right]$$
(21)

The elements of the matrix \({{{{\bf{S}}}}}_{i,j}\) represent the likelihood of passengers transitioning from a land use category at the origin station (region) to a land use category at the destination station (region). For instance, \({s}_{i,1}{s}_{j,2}\) in the bus system corresponds to passengers departing from land use category 1 near bus station i and arriving at land use category 2 near bus station j. From this, we can infer the passengers’ mobility behavior and purposes. The travel transition matrix for the subway and taxi follows the same principle.

We combine the K clusters of change patterns with the spatial attractiveness of urban land use. For each cluster, we sum the corresponding Si,j values to obtain a total transition matrix. Considering daily frequency changes in passenger flow, the passenger flow from station i to station j is approximately equal to the passenger flow from station j to station i on a daily timescale. Based on the symmetry of OD passenger flow, we add the lower and upper triangular parts of the total transition matrix. Finally, we identify the top values in the matrix that correspond to the primary OD land pairs and their corresponding proportions of mobility behavior within each cluster.
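The aggregation described above can be sketched in a few steps: form the outer product of Eq. (21) for each OD pair, sum the matrices within a cluster, fold the lower triangle onto the upper one, and rank the entries by their share of the total. All function names are illustrative, and the toy vectors use only two land use categories for brevity.

```python
def transition_matrix(s_i, s_j):
    """Eq. (21): outer product of the origin and destination feature vectors."""
    return [[a * b for b in s_j] for a in s_i]


def cluster_total(matrices):
    """Element-wise sum of the transition matrices of one cluster."""
    size = len(matrices[0])
    total = [[0.0] * size for _ in range(size)]
    for matrix in matrices:
        for r in range(size):
            for v in range(size):
                total[r][v] += matrix[r][v]
    return total


def fold_symmetric(total):
    """Add each lower-triangle entry onto its mirrored upper-triangle entry."""
    size = len(total)
    folded = [[0.0] * size for _ in range(size)]
    for r in range(size):
        folded[r][r] = total[r][r]
        for v in range(r + 1, size):
            folded[r][v] = total[r][v] + total[v][r]
    return folded


def top_pairs(folded, n=3):
    """Return the n largest entries as (proportion, r, v) tuples."""
    grand_total = sum(sum(row) for row in folded)
    entries = [
        (folded[r][v] / grand_total, r, v)
        for r in range(len(folded))
        for v in range(r, len(folded))
    ]
    return sorted(entries, reverse=True)[:n]


a, b = [1.0, 0.0], [0.0, 1.0]
total = cluster_total(
    [transition_matrix(a, b), transition_matrix(b, a), transition_matrix(a, a)]
)
folded = fold_symmetric(total)
best = top_pairs(folded, n=1)
```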

Dynamic OD mobility model

We develop a dynamic model to capture the fine-grained urban mobility changes. The decline in trips can be attributed to two factors: physiological infection with viruses and emotional influence. Inspired by the classic epidemic transmission model SIR (Cooper et al., 2020; Keeling and Eames, 2005), an OD passenger flow model is proposed, comprehensively considering the spread of the COVID-19 pandemic and the willingness to travel via public transportation, as shown in Fig. 4.

The model incorporates pandemic transmission and recovery rates, willingness factors of mobility behavior, urban land use map, and pre-pandemic OD passenger flow as input. It outputs the changes in trips for every OD pair during the COVID-19 pandemic with high spatial resolution.

Physiological infection module

Prior to the outbreak of large-scale infectious diseases, we assume that the daily number of passengers traveling from station i to station j is \({W}_{i,j}\). Following the epidemic outbreak, all passengers can be categorized into three states: usual (U), infected (I) by the virus or panic, and recovered (R). The numbers of passengers in the three states satisfy Eq. (22):

$${U}_{i,j,t}+{I}_{i,j,t}+{R}_{i,j,t}={W}_{i,j}$$
(22)

Specifically, initially uninfected people may become infected with the virus through contact with infected individuals. Eventually, the infected people transition to a recovery state. For each citizen, the probability state transition process of physiological infection is as follows:

$${{{\rm{P}}}}(U\to I| t\to t+1)=-\frac{\beta {U}_{i,j,t}{I}_{i,j,t}}{{W}_{i,j}}$$
(23)
$${{{\rm{P}}}}(I\to R| t\to t+1)=-\gamma {I}_{i,j,t}$$
(24)

where \({U}_{i,j,t}\), \({I}_{i,j,t}\), and \({R}_{i,j,t}\) denote the numbers of passengers in the usual, infected, and recovered states, respectively, on day t. β and γ represent the transmission and recovery rates, respectively, which are inherent attributes of the pandemic spread.

Willingness influence module

During the pandemic, urban residents tend to reduce their travel activities to mitigate the risk of infection. This tendency is particularly pronounced when there is a higher reported or perceived number of infections and potential cases through news outlets, social media, or personal observations, leading to increased levels of anxiety. During the recovery phase, mobility gradually rebounds over time, and confidence in travel is steadily restored. Furthermore, the decline and recovery of different travel behaviors exhibit heterogeneity. From the clustering results in the section “Population behavior behind mobility patterns”, it is evident that the decrease and recovery of trips are associated with travel purposes.

Willingness factors are proposed to model the willingness influence. It is assumed that the willingness factors of travel from ULU type r to ULU type v follow normal distributions:

$${\alpha }_{r,v} \sim {{{\mathcal{N}}}}\left({\mu }_{r,v},{\sigma }_{r,v}^{2}\right)$$
(25)
$${{\alpha }^{{\prime} }}_{r,v} \sim {{{\mathcal{N}}}}\left({{\mu }^{{\prime} }}_{r,v},{\sigma }_{r,v}^{{\prime} 2}\right)$$
(26)

where r, v = 1, 2, ..., R. \({\mu }_{r,v},{\sigma }_{r,v}^{2}\) and \({{\mu }^{{\prime} }}_{r,v},{\sigma }_{r,v}^{{\prime} 2}\) are distribution parameters (mean and variance) of willingness factors for transmission and recovery, respectively.

In terms of travel mode choices, individuals tend to opt for modes associated with lower exposure to viral risks. Consequently, the willingness factor for transmission and recovery in the context of subway, bus, and taxi are denoted as \({\mu }_{q},{{\mu }^{{\prime} }}_{q}(q=1,2,3)\).

The probability state transition process of emotional influence on travel is as follows:

$${{{\rm{P}}}}(U\to I| t\to t+1)=-\frac{({\alpha }_{i,j}+{\mu }_{q}){U}_{i,j,t}{I}_{i,j,t}}{{W}_{i,j}}$$
(27)
$${{{\rm{P}}}}(I\to R| t\to t+1)=-({{\alpha }^{{\prime} }}_{i,j}+{{\mu }^{{\prime} }}_{q})t{I}_{i,j,t}$$
(28)

Putting it together

For an OD pair (i, j) of transport mode q, the travel transition matrix \({{{{\bf{S}}}}}_{i,j}\) decomposes travel for a given OD pair into combinations of various trip purposes. Consequently, it enables the synthesis of willingness factors, thereby generating the willingness factor for that specific OD pair. The willingness factors can be estimated by linearly stacking the sampled willingness factors for travel purposes:

$${\alpha }_{i,j}=\mathop{\sum }\limits_{r=1}^{R}\mathop{\sum }\limits_{v=1}^{R}{\alpha }_{r,v}{s}_{i,r}{s}_{j,v}$$
(29)
$${{\alpha }^{{\prime} }}_{i,j}=\mathop{\sum }\limits_{r=1}^{R}\mathop{\sum }\limits_{v=1}^{R}{{\alpha }^{{\prime} }}_{r,v}{s}_{i,r}{s}_{j,v}$$
(30)
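Eqs. (25) and (29) can be illustrated with a short sketch: willingness factors are drawn per land use pair from normal distributions and then weighted by the entries of the travel transition matrix. The function names and the toy numbers are hypothetical; the same weighting applies to the recovery factors in Eq. (30).

```python
import random


def sample_willingness(mu, sigma, rng):
    """Eq. (25): draw one factor per land use pair (r, v) from N(mu, sigma^2)."""
    return [
        [rng.gauss(mu[r][v], sigma[r][v]) for v in range(len(mu[r]))]
        for r in range(len(mu))
    ]


def od_willingness(alpha, s_i, s_j):
    """Eq. (29): weight the per-purpose factors by s_{i,r} * s_{j,v}."""
    return sum(
        alpha[r][v] * s_i[r] * s_j[v]
        for r in range(len(s_i))
        for v in range(len(s_j))
    )


# Toy example with two land use categories and illustrative numbers.
alpha = [[1.0, 2.0], [3.0, 4.0]]
alpha_ij = od_willingness(alpha, s_i=[0.5, 0.5], s_j=[1.0, 0.0])

rng = random.Random(0)
sampled = sample_willingness(alpha, [[0.0, 0.0], [0.0, 0.0]], rng)
```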

The mobility change for OD pair (i, j) can be determined through the following equations:

$${U}_{i,j,t+1}={U}_{i,j,t}-\frac{(\beta +{\alpha }_{i,j}+{\mu }_{q}){U}_{i,j,t}{I}_{i,j,t}}{{W}_{i,j}}$$
(31)
$${I}_{i,j,t+1}={I}_{i,j,t}+\frac{(\beta +{\alpha }_{i,j}+{\mu }_{q}){U}_{i,j,t}{I}_{i,j,t}}{{W}_{i,j}}-(\gamma +{{\alpha }^{{\prime} }}_{i,j}t+{{\mu }^{{\prime} }}_{q}t){I}_{i,j,t}$$
(32)
$${R}_{i,j,t+1}={R}_{i,j,t}+(\gamma +{\alpha }_{i,j}^{{\prime} }t+{{\mu }^{{\prime} }}_{q}t){I}_{i,j,t}$$
(33)

The time series of the passenger flow from origin i to destination j is:

$${{{{\bf{f}}}}}_{i,j}=[{W}_{i,j}-{I}_{i,j,1},{W}_{i,j}-{I}_{i,j,2},...,{W}_{i,j}-{I}_{i,j,T}]$$
(34)

\({W}_{i,j}\) is determined by the trips of OD pair (i, j) prior to the pandemic outbreak. \({\alpha }_{i,j}\) and \({\alpha }_{i,j}^{{\prime} }\) denote willingness factors for transmission and recovery, respectively. These willingness factors depend on the ULU feature arrays of the origin i and the destination j, accounting for the diversity of OD passenger flow. The initial values of \({U}_{i,j,t}\), \({I}_{i,j,t}\), and \({R}_{i,j,t}\) are determined by the initial infection rate in the city.
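A minimal simulation of Eqs. (31) to (34) for a single OD pair might look as follows. The day index is assumed to start at 1, and the clamping of the two daily transfers is a numerical guard added for this sketch rather than part of the stated model; it keeps all three compartments non-negative when the time-dependent recovery rate grows large. The function and argument names are illustrative.

```python
def simulate_uir(w, i0, beta, gamma, alpha, alpha_rec, mu_q, mu_q_rec, num_days):
    """Iterate Eqs. (31)-(33) and return the flow of Eq. (34) plus the states."""
    u, i, r = float(w - i0), float(i0), 0.0
    flow, states = [], []
    for t in range(1, num_days + 1):
        flow.append(w - i)  # Eq. (34): affected passengers do not travel
        states.append((u, i, r))
        newly_affected = (beta + alpha + mu_q) * u * i / w
        newly_recovered = (gamma + (alpha_rec + mu_q_rec) * t) * i
        # Guards (assumption, not in the source): keep compartments >= 0.
        newly_affected = min(max(newly_affected, 0.0), u)
        newly_recovered = min(max(newly_recovered, 0.0), i)
        u -= newly_affected                    # Eq. (31)
        i += newly_affected - newly_recovered  # Eq. (32)
        r += newly_recovered                   # Eq. (33)
    return flow, states


# Baseline run without willingness factors, using beta = 0.3 and gamma = 0.1.
flow, states = simulate_uir(
    w=1000.0, i0=1.0, beta=0.3, gamma=0.1,
    alpha=0.0, alpha_rec=0.0, mu_q=0.0, mu_q_rec=0.0, num_days=60,
)
```

Setting all willingness terms to zero reduces the update to a discrete SIR-type recursion, which is a convenient sanity check before the willingness factors are switched on.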

By obtaining the changes in passenger flow at all OD pairs, we can also derive the overall change in trips under the impact of the pandemic. This model characterizes the response of an entire city’s public transport system to the pandemic and the shifts in mobility behavior.

Model validation

The observed changes in OD passenger flow within the Shenzhen bus, subway, and taxi systems during the COVID-19 pandemic are used to verify the effectiveness of the dynamic model. Research (Cai et al., 2022; Leung et al., 2023; Ren et al., 2022; Ribeiro Xavier et al., 2022) indicates that the transmission rate of SARS-CoV-2 ranges from approximately 0.2 to 0.4, while the recovery rate ranges from approximately 0.05 to 0.15. These values are determined by the inherent transmission characteristics of COVID-19. Because this paper concentrates on modeling the variations in mobility across OD stations, we fix these two rates at the midpoints of the reported ranges: without loss of generality, β = 0.3 and γ = 0.1.

When calculating the willingness factor for recovery, we consider only the main ULU pairs depicted in Fig. 3b. Consequently, the parameters to be estimated are the mean values and standard deviations of the willingness factors for OD ULU pairs. Using maximum likelihood estimation (MLE), we find the optimal parameter values of the willingness factor distributions, so that the changes predicted by the mobility model approximate the actual trend. Specifically, we first employ grid search to identify the values of \({\alpha }_{r,v},{{\alpha }^{{\prime} }}_{r,v}\) that minimize the root mean squared error (RMSE) between the true trips and the predicted trips. These values of \({\alpha }_{r,v},{{\alpha }^{{\prime} }}_{r,v}\) are treated as samples from the willingness factor distributions to be estimated. We then determine the parameter values that maximize the likelihood of these samples, which yields the optimal mean values and standard deviations of the empirical factors.
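
The two-stage procedure (grid search on RMSE, followed by MLE on the resulting samples) might be sketched as follows. Several assumptions are made for illustration and are not stated in the text: the willingness factors are taken to be Gaussian, so the MLE reduces to the sample mean and the population standard deviation; the transport-mode terms are omitted; the initial state is U = W − I₀; and the "observed" series is synthetic. All function names and values are hypothetical.

```python
import itertools
import math
import statistics


def predict_flow(W, I0, beta, gamma, alpha, alpha_p, T):
    """Flow series W - I_t from the UIR recursion (mode terms omitted)."""
    U, I = W - I0, I0
    flow = []
    for t in range(1, T + 1):
        flow.append(W - I)
        reduced = (beta + alpha) * U * I / W
        recovered = (gamma + alpha_p * t) * I
        U, I = U - reduced, I + reduced - recovered
    return flow


def rmse(a, b):
    """Root mean squared error between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def grid_search(observed, W, I0, beta, gamma, alphas, alpha_ps):
    """Return the (alpha, alpha') grid point with minimum RMSE."""
    T = len(observed)
    return min(
        itertools.product(alphas, alpha_ps),
        key=lambda p: rmse(
            observed, predict_flow(W, I0, beta, gamma, p[0], p[1], T)
        ),
    )


def gaussian_mle(samples):
    """MLE of a Gaussian: sample mean and population standard deviation."""
    return statistics.fmean(samples), statistics.pstdev(samples)


# Synthetic "observed" series generated with known factors (hypothetical).
observed = predict_flow(1000.0, 10.0, 0.3, 0.1, 0.10, 0.004, 30)
best = grid_search(observed, 1000.0, 10.0, 0.3, 0.1,
                   alphas=[0.0, 0.05, 0.10, 0.15],
                   alpha_ps=[0.002, 0.004, 0.006])

# Stage 2: fit a distribution to grid-search estimates from several OD pairs.
mean_hat, std_hat = gaussian_mle([0.1, 0.2, 0.3])
```

In practice the grid search would be repeated over many OD pairs, and the resulting estimates for each ULU pair would form the sample passed to the second stage.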

The optimization results show that the values of the willingness factors for transmission are similar for all OD station pairs, whereas the willingness factors for recovery primarily contribute to the diversity observed in mobility behavior, as displayed in Supplementary Fig. 1. Under the optimal parameters, we sample and derive 100 sets of dynamic model parameters. The predicted trips in Shenzhen during the COVID-19 pandemic are shown in Fig. 4.

In addition, we conduct an analysis to evaluate the robustness of the UIR model. We vary the values of β and γ by approximately ±30% while keeping the other parameters constant and observe the model performance. The analysis calculates the average RMSE between the actual and predicted passenger flow across the OD pairs in the city, where the RMSE of each trip series provides a daily prediction error. As shown in Supplementary Fig. 4, for every 1% deviation in β the prediction error increases by approximately 0.61%, and for every 1% deviation in γ it increases by approximately 0.77%. These results indicate that the model is sensitive to these parameters, although the prediction error grows less than proportionally with the deviation.