The impact of the second-level digital divide on residents’ consumption
Descriptive statistical analysis
Table 3 presents descriptive statistics for the primary variables. For the explained variable, the mean consumption level among residents is 2.279, falling between the second and third consumption levels, with a standard deviation of 0.963, which suggests wide variation in consumption across residents. For the key explanatory variable, the mean value of the second-level digital divide is 0.392, with a standard deviation of 0.275, indicating that the average level of the residents’ digital divide is in the lower-middle range, although it differs considerably across residents. For the control variables, Table 3 shows that the proportion of females among the respondents is higher than that of males, and that the average resident is approximately 40 years old and in good health. The average level of education among the respondents is high school, and their attitudes toward risk are moderate. On average, the residents’ monthly disposable income is at the third-bracket level. Overall, after data processing, the descriptive statistics of the variables used in the empirical analysis fall within reasonable ranges.
Multicollinearity test
To ensure the reliability of the empirical analysis, this study conducted a multicollinearity diagnosis before the formal regressions. The collinearity assessment was performed in SPSS for the regression model relating residents’ second-level digital divide to consumption. As shown in Table 4, the Variance Inflation Factor (VIF) and Tolerance (TOL) statistics for all explanatory variables are well within the conventional thresholds (VIF < 10 and TOL > 0.1). These diagnostics indicate that there is no severe multicollinearity that might compromise the parameter estimates, and that the selected predictors are sufficiently independent of one another to produce stable regression estimates.
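The diagnostic can also be reproduced outside SPSS. The following is a minimal sketch, not the authors’ procedure: it computes TOL and VIF for each predictor by regressing it on the remaining predictors, using TOL = 1 − R² and VIF = 1/TOL. The data are synthetic, and the column names (`age`, `income`, `income_proxy`) and helper functions are hypothetical. `income_proxy` is deliberately built as a near copy of `income` to show what a violation of the VIF < 10 rule looks like.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(y, xs):
    """R^2 from an OLS regression of y on the columns in xs plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + xs
    xtx = [[sum(p * q for p, q in zip(c1, c2)) for c2 in cols] for c1 in cols]
    xty = [sum(p * q for p, q in zip(c, y)) for c in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif_table(columns):
    """Return {name: (VIF, TOL)} for a dict mapping predictor names to columns."""
    out = {}
    for name, col in columns.items():
        others = [c for k, c in columns.items() if k != name]
        tol = 1.0 - r_squared(col, others)  # tolerance = 1 - R^2 of the auxiliary regression
        out[name] = (1.0 / tol, tol)
    return out

# Synthetic illustration only; these are not the survey data.
rng = random.Random(0)
n = 500
age = [rng.gauss(40, 10) for _ in range(n)]
income = [rng.gauss(3, 1) for _ in range(n)]
income_proxy = [v + rng.gauss(0, 0.1) for v in income]  # near-duplicate of income

diagnostics = vif_table({"age": age, "income": income, "income_proxy": income_proxy})
for name, (vif, tol) in diagnostics.items():
    print(f"{name:13s} VIF={vif:8.2f} TOL={tol:.3f}")
```

In this toy setup the independent predictor should stay close to VIF = 1, while the two near-duplicate columns exceed the threshold of 10.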
Baseline regression analysis
This study employs the Ordered Probit model and introduces the variables stepwise. The baseline results are presented in Table 5. Column 1 reports a regression of consumption on the second-level digital divide alone. Columns 2 and 3 add, in turn, residents’ personal characteristics (gender, age) and background characteristics (education level, marital status). The final column includes all control variables. The second-level digital divide has a negative effect on residents’ consumption that is statistically significant at the 1% level, even after controlling for these factors. The results support Hypothesis H1, suggesting that narrowing the second-level digital divide can raise consumption. The regression also shows that age is negatively related to consumption, while marital status and disposable income are positively related to it. The effects of gender, education level, health status, and risk attitudes on residents’ consumption are statistically insignificant.
Further study
The baseline regression in the previous section established that the second-level digital divide inhibits residents’ consumption. However, the coefficients of an Ordered Probit model do not directly give the marginal effect of the divide on consumption, so this section examines those marginal effects. Consumption is measured on the five-level scale used in the questionnaire, with values from 1 to 5, each corresponding to a consumption range. The marginal effect of a variable is its influence on the probability that a resident falls into a given consumption level, so the marginal effect of the second-level digital divide differs across levels. To show this more clearly, the marginal effects are presented graphically in Fig. 2. The graph shows that reducing the second-level digital divide lowers the probability of residents being at a low consumption level while raising the probability of their being at a high level.

Fig. 2 Marginal effects of the second-level digital divide.
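To make the idea of level-specific marginal effects concrete, the sketch below evaluates them for an ordered probit with a single continuous regressor. For level j the effect is [φ(μ_{j−1} − xβ) − φ(μ_j − xβ)]·β, where φ is the standard normal density and μ_j are the cut-points. The coefficient and cut-points are hypothetical values chosen for illustration, not estimates from Table 5 or Table 6; only the mean divide of 0.392 is taken from Table 3.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def category_probs(xb, cutpoints):
    """P(y = j) for j = 1..J in an ordered probit with linear index xb."""
    cdf = [0.0] + [STD_NORMAL.cdf(c - xb) for c in cutpoints] + [1.0]
    return [cdf[j + 1] - cdf[j] for j in range(len(cutpoints) + 1)]

def marginal_effects(beta, xb, cutpoints):
    """dP(y = j)/dx for a continuous regressor x with coefficient beta."""
    pdf = [0.0] + [STD_NORMAL.pdf(c - xb) for c in cutpoints] + [0.0]
    return [(pdf[j] - pdf[j + 1]) * beta for j in range(len(cutpoints) + 1)]

# Hypothetical parameters (assumptions for illustration only).
beta_divide = -0.8                   # negative: a larger divide lowers the latent index
cutpoints = [-1.2, -0.2, 0.7, 1.5]   # four cut-points separate five consumption levels
xb = beta_divide * 0.392             # index at the mean divide, other covariates set to zero

probs = category_probs(xb, cutpoints)
effects = marginal_effects(beta_divide, xb, cutpoints)
for level, (p, me) in enumerate(zip(probs, effects), start=1):
    print(f"level {level}: P = {p:.3f}, dP/d(divide) = {me:+.3f}")
```

With a negative coefficient, an increase in the divide raises the probability of the lowest level and lowers that of the highest level, and the effects sum to zero because the five probabilities always sum to one.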
The marginal effects of the explanatory variables in the model are detailed in Table 6. The discussion here focuses on the second-level digital divide. The results indicate that a reduction in the second-level digital divide lowers the probability of residents’ consumption being at levels 1 and 2 by 23.7 and 8.1 percentage points, respectively, while raising the probability of levels 3, 4, and 5 by 16.2, 8.6, and 7.1 percentage points, respectively. These results suggest that narrowing the second-level digital divide makes residents less likely to be at a lower consumption level and more likely to be at a higher one. Furthermore, Table 6 shows that the effect of the second-level digital divide is largest in magnitude at the lowest consumption level. Consequently, narrowing the second-level digital divide among this group can do the most to stimulate residents’ consumption and, in turn, raise overall consumption levels.
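As a quick consistency check that uses only the figures quoted above: because the five level probabilities sum to one, the marginal effects of any regressor across the five levels should sum to zero. Writing the effects of an increase in the divide with their signs gives

$$0.237+0.081-0.162-0.086-0.071=-0.001\approx 0$$

The small remainder is consistent with rounding each effect to three decimal places.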
Robustness test
This research employs rigorous robustness checks to ensure the credibility of the regression findings. These tests encompass substituting alternative estimation models, utilizing distinct variable measures, modifying the sampling strategy, and incorporating regional fixed effects into the analysis.
Estimated model for replacement of the baseline regression
To check the reliability of the benchmark regression results, we re-estimate the model as an ordered logistic (Ordered Logit) regression, again treating the dependent variable as an ordered discrete variable. Table 7 compares the two models, with column (1) reporting the original results and column (2) the results after replacing the regression model. The second-level digital divide retains a significant negative effect on residents’ consumption levels, which validates the initial regression results.
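For reference, the two specifications differ only in the distribution assumed for the latent error term. In generic notation (the cut-points $\mu_j$ and the index $x_i'\beta$ are labels chosen here, not symbols taken from the paper), both models write the cumulative probability as

$$P(y_i\le j\mid x_i)=F(\mu_j-x_i'\beta),\qquad F(t)=\Phi(t)\ \text{(Ordered Probit)},\qquad F(t)=\frac{1}{1+e^{-t}}\ \text{(Ordered Logit)}$$

Because the standard logistic and standard normal distributions have different variances, the coefficients of the two models are on different scales, so the comparison rests on sign and significance rather than on magnitude.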
Changing the measurement method of independent variables
In line with Zhang and Lu (2021), the paper changes the method for constructing the second-level digital divide from factor analysis to equal weighting and recomputes each resident’s second-level digital divide. The baseline regression is then re-estimated with the adjusted variable, and the results are presented in column (3) of Table 7. The negative relationship between the digital divide and consumption persists after the change in measurement, echoing the conclusions of the earlier section.
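As an illustration of what an equal-weighting index can look like, the sketch below rescales each component indicator to [0, 1] and averages the rescaled values with weight 1/k each. This is an assumption-laden sketch: the indicator names, the toy values, and the min-max rescaling step are hypothetical and are not taken from the paper or from Zhang and Lu (2021).

```python
def min_max(values):
    """Rescale a list to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def equal_weight_index(indicators):
    """Average the rescaled indicators, giving each the same weight 1/k."""
    scaled = [min_max(col) for col in indicators.values()]
    k = len(scaled)
    return [sum(row) / k for row in zip(*scaled)]

# Hypothetical deficit indicators for four respondents (higher = larger divide).
indicators = {
    "skill_gap": [1, 2, 3, 5],
    "usage_gap": [0, 5, 5, 10],
}
divide_index = equal_weight_index(indicators)
print(divide_index)  # [0.0, 0.375, 0.5, 1.0]
```

Unlike factor analysis, which derives the weights from the data, this approach fixes them in advance, which is why it serves as an independent check on the measurement.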
Replacement sample
To mitigate potential sample selection bias and ensure that specific sample characteristics do not drive the results, this section excludes certain observations before re-estimating the relationship between the second-level digital divide and residents’ consumption. Specifically, because the sample may include students, retirees, and others with limited potential income, respondents under 24 and over 60 are excluded, as belonging to these groups may constrain income and, consequently, consumption. The regression results after excluding these observations are presented in column 4 of Table 7. The coefficient on the second-level digital divide remains negative, consistent with the benchmark regression results presented in the previous section.
The addition of area-fixed effects
To address potential endogeneity arising from omitted variables and to improve the accuracy of the coefficient estimates, this section adds further controls as a robustness test. Specifically, region fixed effects are included when assessing the impact of the second-level digital divide on residents’ consumption. The resulting regression model is given in Eq. (3).
$$\mathit{Consumption}_{i}=b+b_{1}\,\mathit{Digital\_divide}_{i}+b_{2}X_{i}+\theta_{i}+\delta$$
(3)
The variable θi signifies the fixed effects linked to the regional characteristics, while δ represents the random perturbation term. The remaining variables are defined as in the previous section. The regression results in column 5 of Table 7 show that after controlling for regional factors, the second-level digital divide significantly and negatively impacts residents’ consumption at the 1% level. This indicates that the baseline regression results are robust.
Endogeneity test
This study employs the instrumental variables (IV) method to address potential endogeneity stemming from omitted variables or reverse causation. Given the ordered discrete nature of the dependent variable, a conventional IV model is inadequate. Instead, the Conditional Mixed Process (CMP) estimation method of Roodman (2011) is used to evaluate endogeneity in the benchmark regression. Like two-stage least squares, the CMP method relies on selecting a suitable instrumental variable. Drawing on Li and Li (2023) and related studies, mobile social networks are posited to help bridge the digital divide. Therefore, this paper selects the number of friends that residents chat with daily via WeChat voice, video, and text as the instrumental variable. On the one hand, the number of friends residents chat with daily reflects the size of their social network to some extent: the larger the number, the larger the network and the greater the scope for narrowing the second-level digital divide through knowledge spillovers within the network, which satisfies the relevance requirement. On the other hand, the number of friends residents chat with daily has no direct relationship with, and almost no impact on, residents’ consumption, which meets the exogeneity requirement.
Table 8 reports the estimates from the CMP instrumental variable approach. To further check the robustness of the results, the table also reports a two-stage least squares (2SLS) regression. The first-stage regression of the CMP method reveals a statistically significant negative relationship between the number of friends residents chat with via WeChat and the extent of residents’ second-level digital divide: the more friends residents communicate with on WeChat each day, the smaller their second-level digital divide. The first-stage F-statistic of 77.711 exceeds the conventional threshold of 10, indicating that the instrument is not weak. The second-stage regression of the CMP model shows that the second-level digital divide has a substantial negative effect on residents’ consumption. The parameter atanhrho_12 is statistically significant at the 1% level, which indicates that the error terms of the two equations are correlated and supports the use of CMP estimation to address endogeneity between the digital divide and consumption. The findings are further confirmed by the 2SLS estimates in columns (3) and (4) of Table 8, which use the same instrumental variable. These consistent results reinforce the reliability of the baseline regression.
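The logic of the linear 2SLS cross-check (not of the CMP estimator itself) can be sketched on synthetic data. The example below is illustrative only: it uses one instrument and no control variables, and the data-generating process, variable names, and true effect of −1.0 are assumptions made for the example. It computes the first-stage F-statistic, the OLS slope, and the just-identified IV slope, which coincides with 2SLS when there is a single instrument.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def first_stage_f(x, z):
    """F-statistic for H0: slope = 0 in an OLS regression of x on z with intercept."""
    n = len(x)
    slope = cov(z, x) / cov(z, z)
    intercept = sum(x) / n - slope * sum(z) / n
    rss = sum((xi - intercept - slope * zi) ** 2 for xi, zi in zip(x, z))
    se_sq = (rss / (n - 2)) / (cov(z, z) * (n - 1))  # squared standard error of the slope
    return slope ** 2 / se_sq                        # with one instrument, F equals t^2

def iv_slope(y, x, z):
    """Just-identified IV estimate, equal to 2SLS with a single instrument."""
    return cov(z, y) / cov(z, x)

# Synthetic data: u is an unobserved confounder that makes x endogenous.
rng = random.Random(42)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]                          # instrument
u = [rng.gauss(0, 1) for _ in range(n)]                          # confounder
x = [-0.5 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]   # endogenous regressor
y = [-1.0 * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

beta_ols = cov(x, y) / cov(x, x)
beta_iv = iv_slope(y, x, z)
f_stat = first_stage_f(x, z)
print(f"first-stage F = {f_stat:.1f}")
print(f"OLS slope = {beta_ols:.3f}, IV slope = {beta_iv:.3f} (true effect is -1.0)")
```

In this setup the confounder biases the OLS slope toward zero, while the IV slope stays close to the true effect because the instrument affects the outcome only through the endogenous regressor.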
Mechanism analysis
Through the above analysis, this paper empirically investigates the second-level digital divide’s impact on residents’ consumption. However, the mechanism through which the second-level digital divide affects residents’ consumption remains to be explored in the subsequent section.
In recent years, the continuous development of digital technology has facilitated the widespread adoption of digital credit. The ability of residents to access digital credit is intricately linked to their digital skills, potentially affecting their consumption. Consequently, disparities in the second-level digital divide may impact residents’ access to digital credit, thereby influencing their consumption behavior.
This study builds on the preceding analysis by examining the mediating role of digital credit access in the relationship between the second-level digital divide and consumption. Using the three-step method, we combine the baseline model (1) with equations (4) and (5) to form a mediation effect model, where internet_loan in equations (4) and (5) denotes digital credit access. Because digital credit access is binary, equation (4) is estimated with a probit model. Under this method, a mediating effect is indicated when coefficients a1, c1, and d2 are all statistically significant. The results for equations (1), (4), and (5) are reported in columns 1–3 of Table 9, and Fig. 3 presents them visually. The negative coefficient a1 reveals a negative association between the second-level digital divide and consumption. Coefficient c1 shows that the digital divide has a negative effect on digital credit access that is statistically significant at the 1% level.

Fig. 3 Mediating effects of digital credit access.
In contrast, d2 is positive, indicating that greater digital credit access is associated with higher consumption, while the second-level digital divide itself impedes consumption. The significance of a1, c1, and d2 in Table 9 confirms the mediating effect of digital credit, supporting the hypothesis. Following Wen and Ye et al. (2014), the product of c1 and d2 has the same sign as a1, suggesting that digital credit access partially mediates the link between the second-level digital divide and residents’ consumption.
$$\mathit{internet\_loan}=c+c_{1}\,\mathit{Digital\_divide}+c_{2}X+\Omega$$
(4)
$$\mathit{Consumption}=d+d_{1}\,\mathit{Digital\_divide}+d_{2}\,\mathit{internet\_loan}+d_{3}X+e$$
(5)
To further test the mediating effect of digital credit access, this study applies the Bootstrap method with the control variables included. The results are shown in Table 10. The confidence intervals of the total, direct, and indirect effects all exclude 0, indicating that all three effects are significant. The estimated mediating effect of digital credit access is −0.148, and its confidence interval excludes 0, confirming that the mediating effect is significant.
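The Bootstrap logic can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper’s estimation: both steps are fitted by OLS (a linear probability model replaces the probit for the binary mediator, and consumption is treated as continuous), there are no control variables, and the data and variable names are synthetic. The indirect effect is the product c1·d2, and a percentile confidence interval is obtained by resampling respondents with replacement.

```python
import random

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def partial_slope(y, m, x):
    """OLS coefficient on m in a regression of y on m and x (with intercept)."""
    n = len(y)
    my, mm, mx = sum(y) / n, sum(m) / n, sum(x) / n
    s_mm = sum((a - mm) ** 2 for a in m)
    s_xx = sum((a - mx) ** 2 for a in x)
    s_mx = sum((a - mm) * (b - mx) for a, b in zip(m, x))
    s_my = sum((a - mm) * (b - my) for a, b in zip(m, y))
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (s_xx * s_my - s_mx * s_xy) / (s_mm * s_xx - s_mx ** 2)

def indirect_effect(x, m, y):
    """Product of c1 (effect of x on m) and d2 (effect of m on y given x)."""
    return slope(x, m) * partial_slope(y, m, x)

def bootstrap_ci(x, m, y, reps=300, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]  # resample respondents with replacement
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    return draws[int(reps * alpha / 2)], draws[int(reps * (1 - alpha / 2)) - 1]

# Synthetic data: the divide lowers credit access, and credit access raises consumption.
rng = random.Random(7)
n = 1000
divide = [rng.random() for _ in range(n)]
credit = [1.0 if rng.random() < 0.8 - 0.6 * d else 0.0 for d in divide]
consumption = [3.0 - 1.0 * d + 0.5 * c + rng.gauss(0, 0.5)
               for d, c in zip(divide, credit)]

point = indirect_effect(divide, credit, consumption)
ci_low, ci_high = bootstrap_ci(divide, credit, consumption)
print(f"indirect effect = {point:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

As in Table 10, the mediating effect is judged significant when the interval excludes 0; in this synthetic setup the interval should lie entirely below zero.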