1 Introduction

The COVID-19 pandemic has heavily influenced human mobility. At the global level, travel restrictions have commonly been considered an effective measure to mitigate the spread of COVID-19 and have thus been adopted by many countries and regions [21, 22]. At the local level, empirical findings suggest that the COVID-19 pandemic has changed citizens’ travel behavior and demand [5, 42, 43]. This study focuses on the changes in local travel demand in New York City (NYC), the United States (US).

An urban transport system offers several transport modes, such as private car, public transit, and taxi. Recently, shared mobility services (e.g., car sharing and bike sharing) have received increasing attention and now play a vital role in the urban transport system. Previous studies have suggested that the COVID-19 pandemic influenced these transport modes (see section Impacts of COVID-19 on Transportation Systems below for a review of the impacts on shared bike and taxi). However, it remains unclear whether the impacts of COVID-19 differ across transport modes. In response, this study conducts a comparative analysis and assesses the COVID-19 impacts on two typical transport modes in cities, namely, shared bike and taxi.

On the other hand, it is commonly recognized that travel demand is statistically associated with the built environment [2, 9, 25, 41]. Furthermore, some recent studies have suggested that the spread of COVID-19 is associated with the built environment (see section Association between the Spread of COVID-19 and the Built Environment below for a review). However, it remains unclear how the changes in travel demand (caused by COVID-19) are associated with the built environment for different transport modes. In response, this study explores how the decrease (during the lockdown phase) and increase (during the initial recovery phase) in travel demand by transport mode are associated with the built environment.

In summary, we will assess the impacts of COVID-19 on two typical transport modes, i.e., shared bike and taxi, through spatial big data analysis. We will particularly look at two phases: lockdown phase (during which travel demand decreased dramatically) and initial recovery phase (during which travel demand started to increase). Furthermore, we will quantify the potential associations between the changes in travel demand during the two phases and the built environment. The empirical findings from the spatial big data analysis would be helpful for local authorities to shape policies (especially mobility-related policies) to contain COVID-19.

2 Literature Review

2.1 Impacts of COVID-19 on transportation systems

2.1.1 Impacts of COVID-19 on taxi, ride sharing/hailing

The outbreak of the coronavirus hindered the development of ride-sharing, taxi and ride-hailing services worldwide. Using statistical approaches, Dzisi et al. [7] conducted an online survey about ride-hailing demand in Ghana. The results showed that more than half of the respondents (all commuters) tended to choose ride-hailing over the common para-transit service because of social distancing and public fear. Similarly, Loa et al. [27] conducted an online survey to study the usage of shared travel modes in the Greater Toronto Area. In contrast, fewer than 20% of respondents used ride-sourcing during the pandemic because most respondents travelled less. Some researchers investigated travel behavior changes from a psychological perspective. For example, Morshed et al. [31] used a Sentiment-Emotion Detection model to analyze reactions, feelings and behavioral patterns in tweets about ride-hailing services posted on Twitter before and during the pandemic. The results showed that expressions towards ride-hailing services during the pandemic tended to be negative (sadness and anger), whereas more users and commenters posted positive tweets towards ride-hailing services instead. Mai et al. [29] conducted an exploratory factor analysis to identify the factors influencing the use of ride-hailing in Vietnam. A factor named usefulness had the strongest influence on the use of ride-hailing during the outbreak. Using spatiotemporal approaches, Zheng et al. [44] analyzed four weeks of taxi GPS trajectory data covering the lockdown and reopening phases in Shenzhen, China. They found that taxi trips recovered quickly when the city reopened, especially in the city centre and during the two peak periods (6-10 a.m. and 5-8 p.m.), but had not yet returned to the pre-pandemic level. To anticipate when taxi travel would fully return to its previous level, Nian et al. [33] developed a taxi data-driven model for evaluating the recovery level of social activities, which predicted that complete recovery would take around three to six months in the post-epidemic period.

2.1.2 Impacts of COVID-19 on shared bikes

Shared micro-mobility services, like shared bikes, had simultaneously been affected due to the pandemic. Yet, bike-sharing is regarded as a more resilient alternative than other traditional modes of transport [13,14,20]. Queens tended to have a higher number of COVID-19 cases than the other four boroughs.

3.2 Data sources and preprocessing

3.2.1 Ridership big data on multiple transport modes

Bike sharing

Citi Bike is a bike sharing program in NYC and the largest in the nation. Citi Bike has an open data policy, publishing disaggregated trip history for each station, which is the data source of this study (see: https://ride.citibikenyc.com/system-data). We extracted the records generated from February 29, 2020 to July 2, 2020 for this study. The key fields in the dataset include the start and end times of each trip, as well as the start and end stations used. The raw Citi Bike dataset was processed to remove trips taken by staff to service and inspect the system, trips taken to or from any “test” station, and trips with a duration below 60 seconds [3]. In this study, trips with a duration longer than 2 hours were also removed, as were trips by users aged 100 or above.

Taxi

The New York City Taxi and Limousine Commission (TLC) is the major taxi service provider in NYC, responsible for services including yellow taxis, green taxis, black cars, commuter vans, and ambulettes [38]. The yellow and green taxis are regarded as a symbol of the city, and the TLC also oversees over 40,000 other licensed for-hire vehicles [40]. The yellow taxis are allowed to pick up passengers anywhere in the five boroughs. The green taxis are available in Upper Manhattan, the Bronx, Brooklyn and Queens. We extracted the records generated from February 29, 2020 to October 9, 2020 for this study (see: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page). Each trip record contains the locations of the pickup and drop-off points (at the zone level), the times when the pickup and drop-off events occurred, and the trip distance. When cleaning the data, we deleted exceptional trip records with a duration shorter than 1 minute or longer than 200 minutes, and trip records with a distance shorter than 0.3 miles or longer than 50 miles.
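The cleaning rules described for the two datasets can be sketched as two simple filters. This is a minimal illustration rather than the authors' pipeline: the function names and argument layout are hypothetical, and how staff trips, test stations and rider age are derived from the raw fields is left to the caller.

```python
def keep_bike_trip(duration_s, rider_age, is_staff_trip, uses_test_station):
    """Citi Bike rules: 60 s to 2 h, rider under 100, no staff or test-station trips."""
    if is_staff_trip or uses_test_station:
        return False
    # rider_age may be missing in the raw data; such trips are kept here (assumption).
    if rider_age is not None and rider_age >= 100:
        return False
    return 60 <= duration_s <= 2 * 3600


def keep_taxi_trip(duration_min, distance_miles):
    """Taxi rules: duration of 1 to 200 minutes and distance of 0.3 to 50 miles."""
    return 1 <= duration_min <= 200 and 0.3 <= distance_miles <= 50


# Example: (duration in minutes, distance in miles) for four made-up taxi records.
taxi_records = [(12.5, 2.1), (0.4, 0.1), (250.0, 18.0), (35.0, 62.0)]
cleaned = [r for r in taxi_records if keep_taxi_trip(*r)]
```

Only the first made-up taxi record passes both the duration and the distance check.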

The layouts of Citi Bike stations and taxi zones in NYC are shown in Fig. 2. The locations of the shared bike stations and the cores of the taxi zones were extracted from the open data at the links given above, and the NYC map is based on the ArcGIS Pro basemap.

Fig. 2 Locations of Citi Bike stations and cores of taxi zones in New York City

3.2.2 Point of interest (POI) data

We used POIs to describe the built environment, which is a general approach used in previous studies [8, 24]. These POIs are grouped into 11 categories, namely, Commercial, Education, Emergency, Financial, Government, Healthcare, Office, Public, Residence, Sport and Transport. Figure 3 shows the heat maps for these 11 POI types based on the kernel density analysis. Manhattan tended to have a higher density for several POI types, including Commercial, Education, Financial, Government, Healthcare, Public, Sport and Transport.

Fig. 3 Heat maps for 11 POI types in the study area
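As a rough illustration of the kernel density analysis behind heat maps of this kind, the sketch below evaluates a density surface at one location. It assumes a quartic kernel and projected planar coordinates; the function name and the search radius are hypothetical and are not the settings used in this study.

```python
import math


def quartic_kernel_density(x, y, points, radius):
    """Density at (x, y) from `points` [(px, py), ...] using a quartic kernel.

    Each point contributes 3 / (pi * r^2) * (1 - (d / r)^2)^2 when its distance
    d is below the search radius r, and nothing otherwise.
    """
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        if d2 < radius ** 2:
            total += (1.0 - d2 / radius ** 2) ** 2
    return 3.0 * total / (math.pi * radius ** 2)


# Made-up POI coordinates (metres) of one category, evaluated at the origin.
pois = [(0.0, 0.0), (300.0, 400.0), (2000.0, 0.0)]
density_at_origin = quartic_kernel_density(0.0, 0.0, pois, radius=1000.0)
```

Evaluating the function on a regular grid, once per POI category, would give one surface per category.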

4 Methods

4.1 Analytical framework

This paper proposes a data-driven framework to assess the impacts of COVID-19 on two typical transport modes (i.e., shared bike and taxi) and to quantify the association between the changes in travel demand by transport mode and the built environment (see Fig. 4). Specifically, this paper looks at two phases, namely the lockdown phase (during which travel demand decreased dramatically) and the initial recovery phase (during which travel demand started to increase). Two groups of datasets, i.e., ridership (see section Ridership Big Data on Multiple Transport Modes) and POI (see section Point of Interest (POI) Data), were used to prepare the input for the association analysis through four typical models, namely Ordinary Least Squares (OLS), Geographically Weighted Regression (GWR), Temporally Weighted Regression (TWR), and Geographically and Temporally Weighted Regression (GTWR) models. The key outcomes include statistical and spatiotemporal ridership patterns and the POI types statistically associated with ridership changes for the two transport modes.

Fig. 4 The analytical framework

4.2 Statistical and spatiotemporal analyses

4.2.1 Ordinary least squares (OLS) model

Regression models, which include global and local regression models, are commonly used to reveal the relationships between variables. The multiple linear regression model is a typical global model, which assumes that the regression coefficients do not vary across geographical locations in a study area. In this case, the ridership \({Y}_{i}\) can be described as a function of the number of POIs by type (denoted as \({x}_{1},{ x}_{2},\cdots ,{ x}_{k}\)), as presented by Eq. (1) [36].

$${Y}_{i}={\beta }_{0}+\sum_{k\in K}{\beta }_{k}{x}_{ik}+{\varepsilon }_{i}$$
(1)

Where \({\beta }_{0}\) is the intercept; \({\beta }_{k}\) is the parameter to be estimated for the \({k}^{th}\) variable; \({x}_{ik}\) is the value of the \({k}^{th}\) variable for station \(i\); \(K\) is the set of the 11 POI types; and \({\varepsilon }_{i}\) is a random error.

4.2.2 Geographically weighted regression (GWR) model

GWR is particularly designed to deal with spatial data regression, allowing for coefficients to vary across spaces. It can be viewed as an extension of OLS models by associating explanatory variables with geographical locations. In this study, the units of geographical location for Citi Bike and taxi are bike station and taxi service zone, respectively. The fundamental formula of GWR is shown by Equation (2) [1].

$${Y}_{i}={\beta }_{0}\left({u}_{i},{v}_{i}\right)+\sum_{k\in K}{\beta }_{k}\left({u}_{i},{v}_{i}\right){x}_{ik}+{\varepsilon }_{i}$$
(2)

Where \(\left({u}_{i},{v}_{i}\right)\) are the coordinates of station \(i\); \({\varepsilon }_{i}\) is the error term for point \(i\); \({\beta }_{0}\left({u}_{i},{v}_{i}\right)\) is the intercept; and \({\beta }_{k}\left({u}_{i},{v}_{i}\right)\) is the regression coefficient for the \({k}^{th}\) built environment variable. The distinct characteristic of GWR is that the model parameters \({\beta }_{k}\left({u}_{i},{v}_{i}\right)\) vary across locations, so as to capture the spatial non-stationarity of observations, whereas in the global model (i.e., the OLS model) the parameters are the same for every observation.
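To make the idea of location-specific coefficients concrete, the sketch below fits a weighted least squares regression at a single target location. It is only an illustration under stated assumptions: the Gaussian distance-decay kernel, the fixed bandwidth and all function names are choices made here, not details reported for the models in this study.

```python
import math


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gwr_coefficients(target, coords, X, y, bandwidth):
    """Local coefficients [beta_0, beta_1, ...] at `target` = (u, v).

    coords: (u, v) per observation; X: predictor rows without the intercept
    column; y: responses. Nearby observations receive larger weights.
    """
    # Gaussian distance-decay weights (an assumed kernel choice).
    w = [math.exp(-((u - target[0]) ** 2 + (v - target[1]) ** 2) / bandwidth ** 2)
         for u, v in coords]
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Weighted normal equations: (X' W X) beta = X' W y
    xtwx = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows)) for k in range(p)]
            for j in range(p)]
    xtwy = [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y)) for j in range(p)]
    return solve(xtwx, xtwy)
```

Calling `gwr_coefficients` once per station or zone yields one coefficient vector per location. With a very large bandwidth all weights approach one, and the local fit approaches the global OLS fit.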

4.2.3 Temporally weighted regression (TWR) model

TWR introduces temporal variation into the OLS model by associating explanatory variables with time stamps. The fundamental formula of TWR is shown by Equation (3) [17].

$${Y}_{i}={\beta }_{0}\left({t}_{i}\right)+\sum_{k\in K}{\beta }_{k}\left({t}_{i}\right){x}_{ik}+{\varepsilon }_{i}$$
(3)

Where \({t}_{i}\) is the time period of observation \(i\); \({x}_{ik}\) is the value of the \({k}^{th}\) variable for station \(i\); \({\beta }_{0}\left({t}_{i}\right)\) is the intercept for time period \({t}_{i}\); and \({\beta }_{k}\left({t}_{i}\right)\) is the regression coefficient for the built environment variable \({x}_{ik}\) for station \(i\) at time period \({t}_{i}\).

4.2.4 Geographically and temporally weighted regression (GTWR) model

As an extension of GWR and TWR, GTWR embeds both spatial and temporal data into regression parameters to measure spatial and temporal variations simultaneously [1], which is shown by Equation (4).

$${Y}_{i}={\beta }_{0}\left({u}_{i},{v}_{i},{t}_{i}\right)+\sum_{k\in K}{\beta }_{k}\left({u}_{i},{v}_{i},{t}_{i}\right){x}_{ik}+{\varepsilon }_{i}$$
(4)

Where \(\left({u}_{i},{v}_{i},{t}_{i}\right)\) are the coordinates of station \(i\) in the spatiotemporal dimensions; \({x}_{ik}\) is the value of the \({k}^{th}\) variable for station \(i\); \({\beta }_{0}\left({u}_{i},{v}_{i},{t}_{i}\right)\) is the intercept for station \(i\) at time \({t}_{i}\) and at location (\({u}_{i},{v}_{i}\)); and \({\beta }_{k}\left({u}_{i},{v}_{i},{t}_{i}\right)\) is the regression coefficient for the built environment variable \({x}_{ik}\) for station \(i\) at time \({t}_{i}\) and at location (\({u}_{i},{v}_{i}\)).
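Fitting Eq. (4) requires a weight for every pair of observations that reflects how close they are in both space and time. The sketch below shows one common formulation from the GTWR literature, a Gaussian kernel over a combined spatiotemporal distance. The scale factors `lam` and `mu`, the bandwidth and the function name are assumptions made for illustration, not the settings used in this study.

```python
import math


def gtwr_weight(obs_i, obs_j, bandwidth, lam=1.0, mu=1.0):
    """Gaussian weight of observation j when fitting the model at observation i.

    obs_* are (u, v, t) tuples. lam and mu (hypothetical scale factors) balance
    spatial and temporal distance, which are measured in different units.
    """
    (ui, vi, ti), (uj, vj, tj) = obs_i, obs_j
    d2 = lam * ((ui - uj) ** 2 + (vi - vj) ** 2) + mu * (ti - tj) ** 2
    return math.exp(-d2 / bandwidth ** 2)


# Same place, one period apart, versus same period, one distance unit apart.
w_time = gtwr_weight((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), bandwidth=2.0)
w_space = gtwr_weight((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), bandwidth=2.0)
```

Setting `mu=0` leaves a purely spatial weight of the GWR kind, and setting `lam=0` leaves a purely temporal weight of the TWR kind, which is one way to see GTWR as an extension of both.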

4.3 Statistical criteria

To evaluate the performance of models, statistical criteria, R-square (\({\mathbf{R}}^{2}\)) and Akaike Information Criterion (AIC), are used and these indices can be defined by Eqs. (5) and (6), respectively.

$${R}^{2}=1-\frac{{\sum }_{i}{({y}_{i}-{\widehat{y}}_{i})}^{2}}{{\sum }_{i}{({y}_{i}-\overline{y })}^{2}}$$
(5)
$$AIC=-2LL\left(\widehat{\beta }\right)+2k$$
(6)

In Eq. (5), \({y}_{i}\) is the observed value for station \(i\); \({\widehat{y}}_{i}\) is the estimated value for station \(i\); \(\overline{y }\) is the mean of the observed values. In Eq. (6), \(LL\left(\widehat{\beta }\right)\) is the log-likelihood of the model with estimated parameters \(\widehat{\beta }\); \(k\) is the number of parameters.
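The two criteria can be computed from observed and fitted values as in the sketch below. It uses the usual definition of \({R}^{2}\) as one minus the ratio of the residual to the total sum of squares, and it assumes Gaussian errors so that the log-likelihood has a closed form. The function names are hypothetical. For local models such as GWR and GTWR, \(k\) is usually replaced by an effective number of parameters, which this sketch simply takes as given.

```python
import math


def r_squared(y, y_hat):
    """Share of the variance in y explained by the fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def gaussian_aic(y, y_hat, k):
    """AIC = -2 LL + 2k, with LL the Gaussian log-likelihood at variance RSS / n."""
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    return -2.0 * log_lik + 2 * k


observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
fit_quality = r_squared(observed, fitted)  # residual SS 0.10, total SS 5.0
```

A higher \({R}^{2}\) and a lower AIC both indicate a better fit. For the same fitted values, each extra parameter raises the AIC by 2.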

5 Results

5.1 General results

5.1.1 Statistical ridership patterns

The daily ridership of Citi Bike and taxi and the number of COVID-19 cases in 2020 are shown in Fig. 5. The pandemic led to significant changes in travel behavior (e.g., mode choice and travel demand). The ridership of both shared bike and taxi decreased sharply in March 2020 due to the outbreak of COVID-19. Citi Bike usage rose markedly shortly after the outbreak, whereas taxi travel demand rebounded more slowly. This was likely because Citi Bike allowed citizens to travel individually with little interaction with others, which could help to reduce the risk of infection.

Fig. 5 Ridership and the number of COVID-19 cases in 2020. a Citi Bike daily ridership in 2020. b Taxi daily ridership in 2020

As mentioned above, we looked at two phases for each of the two transport modes, namely the lockdown phase (during which travel demand decreased dramatically) and the initial recovery phase (during which travel demand started to increase). For the lockdown phase, we looked at the changes in travel demand between the first period (before the outbreak of COVID-19) and the second period (during which travel demand decreased to its lowest point); for the recovery phase, we looked at the changes in travel demand between the second period and the third period (during which travel demand increased to a relatively high level compared to the travel demand before the outbreak). Since taxi ridership recovered slowly, the third period for taxi (i.e., from October 3 to October 9, 2020) was later than that for bike (i.e., from June 27 to July 2, 2020). In the third period, the demand for bikes and taxis was 200% and 20% of that in the first period, respectively. It is worth noting that ridership on March 3, March 30, July 3 and October 7 was greatly affected by weather conditions, and these days were therefore excluded from the subsequent analyses.

5.1.2 Temporal ridership patterns

Figure 6 shows the 24-hour distribution of Citi Bike and taxi ridership on both workdays and non-workdays. For both transport modes, the temporal patterns in the first and third periods were similar to each other but differed from the pattern in the second period (during which travel demand decreased to its lowest point due to the sudden outbreak of COVID-19). This indicates that COVID-19 significantly influenced the temporal distribution of travel demand when it broke out, but that the influence tended to be much smaller in the recovery phase. In addition, there were significant differences between workdays and non-workdays in the temporal ridership patterns of Citi Bike and taxi.

Fig. 6 Temporal ridership patterns by transport mode. a Citi Bike ridership on workdays. b Citi Bike ridership on non-workdays. c Taxi ridership on workdays. d Taxi ridership on non-workdays

5.1.3 Spatial ridership patterns

We first extracted the trip origins and destinations on workdays and non-workdays for both Citi Bike and taxi, and then quantified the difference in travel demand between the first and second periods to represent the changes during the lockdown phase, and the difference between the second and third periods to represent the changes during the recovery phase. The kernel density analysis tool in ArcGIS was used to analyze the spatial patterns of the Origin (O) and Destination (D) points on workdays and non-workdays, so as to find the hotspots of residents' travel. Here, we present only the analysis results for the trip Origin (O) on workdays for bike sharing and taxi (see Fig. 7), with the other results appended (see Appendix 1 in the Supplementary Materials).
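The two change measures described above can be sketched per station or taxi zone as follows. The sketch assumes that demand in each period is summarised as a mean daily trip count and that a unit with no trips in a period counts as zero; both choices, and the function name, are illustrative assumptions rather than details stated in the study.

```python
def demand_changes(first, second, third):
    """Per-unit lockdown decrease and recovery increase in mean daily trips.

    Each argument maps a station or zone id to its mean daily trip count in
    the first, second and third period, respectively.
    """
    units = set(first) | set(second) | set(third)
    decrease = {u: first.get(u, 0.0) - second.get(u, 0.0) for u in units}
    increase = {u: third.get(u, 0.0) - second.get(u, 0.0) for u in units}
    return decrease, increase


# Made-up mean daily trips at two stations over the three periods.
decrease, increase = demand_changes(
    {"A": 120.0, "B": 40.0},
    {"A": 30.0, "B": 10.0},
    {"A": 150.0, "B": 35.0},
)
```

Computed separately for trip origins and destinations and for workdays and non-workdays, values of this kind would serve as the quantities being mapped and modelled.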

Fig. 7 Spatial changes in ridership by transport mode on workdays in the lockdown and recovery phases. a Decrease in Citi Bike ridership in the lockdown phase. b Increase in Citi Bike ridership in the recovery phase. c Decrease in Taxi ridership in the lockdown phase. d Increase in Taxi ridership in the recovery phase

Figure 7 shows the spatial changes in ridership by transport mode on workdays in the lockdown and recovery phases (see Appendix 1 for the spatial changes on non-workdays). For Citi Bike, Manhattan had a more pronounced decrease in ridership in the lockdown phase (see subfigure a), while in the recovery phase Citi Bike travel demand rebounded markedly, particularly in Manhattan and Brooklyn (see subfigure b). This was likely because it is easier for travelers to maintain social distancing when cycling. For taxi (see subfigures c and d), Manhattan had both a more pronounced decrease in ridership in the lockdown phase and a more pronounced increase in the recovery phase.

5.2 Model performance

Before establishing the models, we conducted a correlation analysis using the Pearson coefficient as an indicator (see Appendix 2 in the Supplementary Materials). We then tested multicollinearity using the variance inflation factor (see Table A3 in Appendix 3 of the Supplementary Materials) and global spatial autocorrelation using Moran's I (see Table A4 in Appendix 4 of the Supplementary Materials). Finally, we developed OLS, GWR, TWR and GTWR models for the two transport modes (i.e., Citi Bike and taxi) to explore the association between their ridership changes at both trip origin and destination and the built environment for both workdays and non-workdays during the lockdown and recovery phases. In total, we obtained 64 models (4 model types × 2 transport modes × 2 trip ends × 2 day types × 2 phases), as shown in Tables 1 and 2. According to \({R}^{2}\), the OLS models can explain only 1% to 39% of the travel demand changes; the GWR models, 10% to 59%; the TWR models, 6% to 53%; and the GTWR models, 18% to 72%. In general, GTWR tends to have a higher \({R}^{2}\) value due to its capacity to capture both the spatial and temporal characteristics of the variables. Some of the models exhibit relatively low \({R}^{2}\) values. This could be because ridership changes may be influenced by other factors, such as the occurrence of new COVID-19 cases, psychological factors like anxiety and fear, and specific workplace policies relating to remote work. As the objective of this research is to investigate the association between the built environment variables and travel demand changes, we focused on the built environment variables. Moreover, travelers' decision-making inherently involves randomness, which introduces unobservable factors that cannot be fully explained by the model. In terms of AIC, the GWR and TWR models generally fitted better than the OLS models, and the GTWR models performed best in all cases. It is worth noting that models with small \({R}^{2}\) values can still provide meaningful explanations for the impact of independent variables [6, 10, 11, 14, 28]. Therefore, due to the page limit, the subsequent analysis focuses on the GTWR models, with the other models’ results appended (see Table A5 in Appendix 5 of the Supplementary Materials).
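For readers unfamiliar with the global spatial autocorrelation test mentioned above, the sketch below computes global Moran's I from a list of values and a spatial weight matrix. The binary adjacency weights in the example are a made-up illustration; the weighting scheme used in this study is not specified here, and the function name is hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a spatial weight matrix `weights`.

    weights[i][j] is the weight between units i and j (0 on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Four units in a row; each unit neighbours the one next to it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 1.0, 0.0, 0.0], chain)    # similar values adjacent
alternating = morans_i([1.0, 0.0, 1.0, 0.0], chain)  # dissimilar values adjacent
```

Positive values indicate that similar values cluster in space and negative values indicate that neighbouring units tend to differ.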

Table 1 Performance of OLS, GWR, TWR and GTWR models in the lockdown phase (note: O denotes trip origin; D denotes trip destination)
Table 2 Performance of OLS, GWR, TWR and GTWR models in the recovery phase (note: O denotes trip origin; D denotes trip destination)

5.3 Analysis of the spatiotemporal variation

5.3.1 Lockdown phase

Table 3 shows the results of the GTWR models for the lockdown phase. Two built environment factors, Commercial POI and Financial POI, were positively associated with the reduction in travel demand in all the models, indicating that people appeared to travel less for leisure and financial activities during the lockdown phase. Emergency POI was negatively associated with the demand reduction for both transport modes during the lockdown phase. This might be because activities related to emergency issues were less affected by the pandemic. We also found that some POI types, including Education POI, Government POI and Healthcare POI, had different associations with the reduction in travel demand for different transport modes. For example, Education POI was positively associated with the Citi Bike ridership reduction on both workdays and non-workdays but was negatively associated with the taxi ridership reduction. This appears reasonable, as Citi Bike is generally used for short-distance trips and is thus more likely to be substituted by walking for activities such as going to school.

Table 3 GTWR models in the lockdown phase

5.3.2 Initial recovery phase

Table 4 shows the GTWR model results for the association between the increase in travel demand in the recovery phase and the built environment. The built environment factors Commercial POI, Public POI and Healthcare POI were positively associated with the increases in the ridership of Citi Bike and taxi during the recovery phase. Residence POI was positively associated with the increases in the ridership of Citi Bike on both workdays and non-workdays. These findings indicate that more trips were generated by leisure activities and public affairs, and that people were leaving their homes to participate in social activities after the situation became stable. In addition, we found that some POI types, including Emergency POI and Education POI, had different associations with different transport modes. Of the two transport modes, Citi Bike is the only one whose ridership change on workdays was positively associated with Sport POI during the recovery phase. This appears reasonable, as cycling can be a means of exercise and may thus have become a preferred transport mode for travelers who want to do sports.

Table 4 GTWR models in the recovery phase

5.3.3 Discussion

In sections Lockdown Phase and Initial Recovery Phase, several GTWR models were developed to explore how the changes in ridership of shared bike and taxi might be associated with the built environment in the lockdown and (initial) recovery phases. The built environment factor Commercial POI was found to be positively associated with the ridership changes in both the lockdown and recovery phases, indicating that travel demand in areas with more Commercial POIs was more likely to be affected by the pandemic. In addition, we found that for different transport modes, the changes in ridership were associated with different built environment factors, as summarized in Table 5.

Table 5 The associated built environment factors for different transport modes during lockdown and recovery phases

6 Conclusion

COVID-19 has had a significant influence on travel behavior across the globe. This study proposed an analytical framework to compare the extent to which COVID-19 has impacted the travel demand of different transport modes, including bike sharing and taxi, and to examine how the changes in travel demand were associated with the built environment. Based on transport datasets in New York City, spatiotemporal patterns of travel demand by transport mode were analyzed comparatively. Four types of models, namely OLS, GWR, TWR and GTWR, were developed to investigate to what extent the various built environment factors were associated with the travel demand changes during the lockdown and recovery phases. GTWR outperformed the other three model types according to \({R}^{2}\) and AIC. This might be because GTWR can measure spatial and temporal variation simultaneously. The model results showed that during the lockdown phase, commercial and financial sites were positively associated with the reduction in travel demand of shared bikes and taxis, whereas emergency services were negatively associated with the demand reduction. Educational areas were positively associated with the shared bike ridership reduction on both workdays and non-workdays, but were negatively associated with the taxi ridership reduction. During the recovery phase, commercial areas and healthcare facilities were positively associated with the increases in the ridership of shared bike and taxi on both workdays and non-workdays. Sports facilities were positively associated with the increases in shared bike ridership on workdays while negatively associated with taxi ridership.

There are several areas that could be further explored in future research. Firstly, as this research focuses on examining the association between the built environment variables and travel demand changes, the capacity of the models to explain all variations in the ridership changes may be limited. Future research may consider exploring a broader range of variables from diverse perspectives to better understand ridership changes during different phases of the COVID-19 pandemic. Secondly, travelers' decision-making inherently involves a degree of stochasticity, which introduces unobserved factors that cannot be fully explained by the variables included in the model. We suggest that a larger sample covering a longer period may provide a more comprehensive representation of patterns in ridership changes and improve the model performance in estimating these changes. Furthermore, this research applied GTWR because it can effectively incorporate spatial and temporal characteristics in the association between the built environment variables and travel demand changes, but the models can be limited in demand prediction. Future studies may consider leveraging the potential of machine learning and developing more advanced algorithms. Finally, it would be beneficial to extend the analysis to include other modes of transportation, such as buses and the metro. Such a comparative analysis would enable a better understanding of the dynamics among different modes and facilitate the development of more effective strategies for balancing transportation options.