Examining The Roots of Environmental Injustice in Minneapolis

Author

Na Nguyen and Charles Batsaikhan

Introduction

In the 1920s, Minneapolis introduced racial covenants to prevent non-white people from occupying and owning land. As Minnesota witnessed the wave of the First Great Migration, it also legitimized housing discrimination via restrictive clauses inserted into property documents since the presence of non-White, but especially Black people, was regarded to decrease property value. This racist real estate practice paved the way to redlining in the 1940s to decrease home-ownership rates of African American, contributing to their generational dispossession and displacement. Source 1 Source 2

While white areas were endorsed with beautification projects and accessible green spaces, Black residents saw their neighborhoods surrounded with highways, factories, and overall exposure to hazard. And the legacy of residential segregation has led to stark environmental injustice that tremendously ails Black, Latino, and Native American families in Minneapolis:

6x asthma death risk
- Among those who were under age 65, the asthma death rate for Black Minnesotans was 6 times higher than it was for white Minnesotans.
3x higher infant mortality
- Infant mortality rate for Native Americans is nearly 3x higher than that of white Minnesotans.
9x positive COVID-19 tests
- Latinx Minnesotans are testing positive at more than 4 times the overall population and 9 times the white population.
Over 2x the cancer risk
Black Minnesotans have 2.25 times the cancer risk as the average Minnesotan Source 3

There has been extensive reseafrch that definitively proves disproportionate pollution linked to geographically disadvantaged and vulnerable communities. However, these research tend to merely provide a snapshot of this picture at a moment in time. Therefore, we would like to contextualize these single portraits of inequality by investigating the relationship between modern day environmental inequality in Minneapolis and 1920’s racial covenants practice.

The research question we would like to explore is as follows: Looking at historically racially covenanted neighborhoods in Minneapolis and present day environmental data, can we still see its impact in the disparity of climate change effects? In other words, did the introduction of racial covenants in the 1920s cause air pollution inequality in the city of Minneapolis that we see today?

Data

Sources

Tree Canopy Data: Sourced from Google Earth Engine (processed CSV).
Social Vulnerability Index (SVI): From CDC’s Social Vulnerability Index (2020).
Air Pollution Data: Aggregated PM2.5 emissions from permitted facilities in Minnesota (Shapefile).
Redlining Data: Historical HOLC neighborhood grading (Shapefile).
Racial Covenants: Point-level racial covenant data (Shapefile) for Hennepin County.
Census Data: Minneapolis census tract boundaries (TIGER/LINE shapefiles).

Data Exploration

Data Cleaning and Transformation

To prepare the data for analysis, the following steps were taken:

Spatial Joins:
- Combined census tract shapefiles with air pollution data (aggregated emissions at ZIP code level).
- Merged tree canopy data and Social Vulnerability Index (SVI) scores based on the GEOID field.
Racial Covenants Integration:
- Transformed CRS of the racial covenants dataset to match the census tract geometry (st_transform).
- Used spatial joins (st_intersects) to create a binary variable has_covenant indicating whether racial covenants exist in a census tract.
Redlining Data:
- Extracted HOLC grades (A, B, C, D) and joined them with census tracts.
Missing Data Treatment:
- Replaced missing values for air pollution and tree canopy with zeros (ifelse logic).
- Filtered out invalid or irrelevant census tracts.
Variable Creation:
- Created svi_index as a binary variable based on CDC thresholds.
- Aggregated and summarized air pollution, tree canopy, and temperature data at the census tract level.

The cleaned dataset includes: - Census tract geometries. - Tree canopy coverage. - Air pollution emissions (PM2.5). - Social vulnerability status. - HOLC grades. - Racial covenant indicators. - Mean land surface temperature.

Loss of Census tracts

During the matching process we lost 25 census tracts data from 116 to 91 rows of observation.

Methods

Moving forward, we will work with the following variables at the census tract level in the city of Minneapolis:

has_racial_covenants: whether or not a census tract previously had a racial covenant on its property
holc_grade: the census tract’s Home Owner Loan Corporation (HOLC) neighborhood grades: A, B, C, or D (corresponding to A = “Best”; B = “Still Desirable”; C = “Declining”; and D = “Hazardous” designations, respectively)
svi_index: whether or not a census tract is considered as socially vulnerable by the CDC
tree_canopy: the percentage of tree canopy area of a census tract.
air_pollution: the average aggregated emission (tons) of PM2.5 from permitted facilities/factories in a census tract.

Note: A handful of census tracts do not have a holc_grade because they reside in downtown business areas of Minneapolis, not residential areas

Approach

Data Cleaning and Transformation: Processed spatial and tabular datasets, ensuring alignment of projections and merging datasets.
Matching: Controlled for holc_grade confounders using Exact Matching to ensure a comparable control group.
Regression Analysis: Modeled the effect of svi_index on air pollution and land surface temperature, controlling for tree canopy coverage and other variables.
Spatial Autoregressive Model (SAR): Accounted for spatial dependence in land surface temperature.

Our Causal DAG

According to our DAG, our control set includes the following variables: holc_grade, and tree_canopy

Matching

We originally have 116 total observations, 25 untreated and 91 treated census tracts. In order to control for our confounder—holc_grade, we conducted 5 matching methods for an ATE estimand: Nearest Neighbors (k=2), Full Matching, Exact Matching, Coarsened Exact Matching, and Subclass Matching. Our criteria for choosing a final matching method are:

Standard Mean Difference for all holc_grade’s are well within the 0-0.1 threshold.
Yielding a large enough matched sample size.

In the end, we opted for the Exact Matching Method. As opposed to other methods that resulted in rather large Standard Mean Differences ( >= 0.4), Exact Matching will keep our SMD at the minimum (0). However, because there are a handful of census tracts that do not have a holc_grade, our final matched sample size for treated is smaller: 25 untreated and 91 treated. The limited sample size may affect the generalizability of the results, so the findings should be interpreted with caution, and future studies with larger sample sizes would be needed to validate the conclusions.

Nonetheless, this method is the most optimal as it helps us to be confident in our later estimates for the causal relationship between our outcome and treatment variables.

Estimate Results

Air Pollution

Estimate (1.41): The coefficient for svi_index is 1.41. This suggests that of all census tracts in Minneapolis that are matched on, the air quality is worse for socially vulnerable neighborhoods, with the average annual PM2.5 emissions increasing by 1.41 tons, holding tree_canopy_area constant.
Std. Error (0.504): The standard error of the coefficient is 0.504, which reflects the variability in the estimate of the coefficient. This is a relatively small st.Error.
Pr(>|z|) (0.00525): The p-value is 0.00525, which is much smaller than the commonly used significance threshold of 0.05. This indicates that the relationship between svi_index and air_pollution is statistically significant.
95% Confidence Interval (0.419, 2.4): Since the interval does not include zero and it’s quite narrow, this suggests that there is evidence of a non-zero relationship between svi_index and air_pollution. We are 95% confident that this CI (0.419, 2.4) represents a range of plausible values for this effect of svi_index on air_pollution.


 Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
     1.41      0.504 2.79  0.00525 7.6 0.419    2.4

Term: svi_index
Type:  response 
Comparison: mean(1) - mean(0)
Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted


Call:
lm(formula = air_pollution ~ svi_index * tree_canopy_area, data = match_out_exact, 
    weights = weights)

Weighted Residuals:
    Min      1Q  Median      3Q     Max 
-3.0446 -0.9035 -0.3988  0.8951  4.3940 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                -0.47194    1.78138  -0.265    0.792
svi_index                   2.00819    1.99877   1.005    0.318
tree_canopy_area            0.04250    0.04890   0.869    0.387
svi_index:tree_canopy_area -0.01893    0.05706  -0.332    0.741

Residual standard error: 1.71 on 87 degrees of freedom
Multiple R-squared:  0.09739,   Adjusted R-squared:  0.06627 
F-statistic: 3.129 on 3 and 87 DF,  p-value: 0.02976

Land Surface Temperature

Estimate (2.22): The coefficient for svi_index is 2.22. This suggests that of all census tracts in Minneapolis that are matched on, the mean land surface temperature is worse for socially vulnerable neighborhoods, with the mean LST increasing by 2.22 Fahrenheits, holding tree_canopy_area constant.
Std. Error (0.76): The standard error of the coefficient is 0.76, which reflects the variability in the estimate of the coefficient. This is a relatively small st.Error.
Pr(>|z|) (2e-16): The p-value is 2e-16, which is much smaller than the commonly used significance threshold of 0.05. This indicates that the relationship between svi_index and mean_lst is statistically significant.


Call:
lm(formula = mean_lst ~ svi_index * tree_canopy_area, data = match_out_exact_data, 
    weights = weights)

Weighted Residuals:
     Min       1Q   Median       3Q      Max 
-1.95012 -0.31088  0.09087  0.40727  1.22708 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                97.97108    0.67721 144.669  < 2e-16 ***
svi_index                   2.22328    0.75985   2.926  0.00438 ** 
tree_canopy_area           -0.09910    0.01859  -5.331 7.62e-07 ***
svi_index:tree_canopy_area -0.05990    0.02169  -2.761  0.00702 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.65 on 87 degrees of freedom
Multiple R-squared:  0.7544,    Adjusted R-squared:  0.746 
F-statistic: 89.09 on 3 and 87 DF,  p-value: < 2.2e-16

Sensitivity Analysis

Unexplained Cofoundes for Air Pollution

            point    lower    upper
RR       2.143185 1.257734 3.651997
E-values 3.708450 1.827085       NA

Social vulnerability would have to be 2.14 times more common in census tracts with infrastructure divestment than those that don’t.
Infrastructure divestment would have to be 2.14 times more common in census tracts with high pollution areas than those that don’t.

Unexplained Cofounder for Mean LST

            point    lower    upper
RR       4.357943 1.628963 11.65875
E-values 8.183349 2.641166       NA

Social vulnerability would need to be 4.36 times more common in census tracts with infrastructure divestment than in those without, to explain away the observed association between social vulnerability and mean surface temperature.
Similarly, infrastructure divestment would need to be 4.36 times more common in census tracts with high mean surface temperatures than in those with lower temperatures, to nullify the observed causal effect.

Correlation Analysis

From our exploratory data analysis, we observe a strong correlation between has_racial_covenants, holc_grade, air_pollution, mean lst, and svi_index.

In higher HOLC grade neighborhoods with dense concentrations of racial covenants, their social vulnerability level is markedly lower. Contrast, social vulnerability level is remarkably higher in areas that were deemed as “declining”, “hazardous”, and “undesirable”–areas where there were little to no racial covenants.

As you can see, we see lot of correlation between HOLC grade & racial covenants areas with our environmental and social variables. In short, places that experience higher mean LST, high air pollution, and place with high social vulnerability score are associated with places with lower HOLC grade and places with no to little racial covenants areas. What does this tell us? It tell us that places that are deemed “risky” to invest and places that that did not barr people of color to live are the same places that experince worse environemntal factors like air pollution and mean LST.

Spatial Autoregressive (SAR) model

To address the potential spatial dependence in Land Surface Temperature (LST) across census tracts, we implemented a Spatial Autoregressive (SAR) Model. Spatial models account for spatial relationships between observations, ensuring more reliable estimates when spatial autocorrelation exists.

Descriptive Statistics

We first summarize the descriptive statistics for census tracts with and without racial covenants:

Simple feature collection with 2 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -93.32914 ymin: 44.8899 xmax: -93.19404 ymax: 45.05125
Geodetic CRS:  NAD83
# A tibble: 2 × 8
  has_covenant mean_lst sd_lst mean_tree_canopy sd_tree_canopy
  <chr>           <dbl>  <dbl>            <dbl>          <dbl>
1 0                96.3   1.50             24.7           8.40
2 1                94.8   1.15             34.1           7.87
# ℹ 3 more variables: mean_air_pollution <dbl>, sd_air_pollution <dbl>,
#   geometry <MULTIPOLYGON [°]>

Racial Covenant	Mean LST (0b0F)	SD LST	Mean Tree Canopy (%)	SD Tree Canopy	Mean Air Pollution (tons)	SD Air Pollution
No	96.33	1.50	24.73	8.40	1.75	1.52
Yes	94.76	1.15	34.12	7.87	2.03	1.69

As you can see, places with no racial covenants are experiencing higher mean LST, higher mean air pollution, and lower tree canopy coverage. This further proves our point that there is environmental discriminaton within the neighborhoods of Minneaplis.

SAR Model Setup and Results

Step 1: Defining Spatial Relationship

We defined spatial relationships using Queen Contiguity to identify neighboring census tracts. This method considers census tracts that share edges or vertices as neighbors.

Step 2: Fitting the SAR Model

We fit a SAR Lag Model with the following predictors:

svi_index: Social vulnerability index.
tree_canopy_area: Tree canopy coverage percentage.
air_pollution: PM2.5 air pollution emissions.
has_covenant: Binary indicator for racial covenants.
holc_grade: Redlining classification (A, B, C, D).


Call:lagsarlm(formula = mean_surface_temp ~ svi_index + tree_canopy_area + 
    air_pollution + has_covenant + holc_grade, data = final_geometry, 
    listw = weights, method = "eigen")

Residuals:
     Min       1Q   Median       3Q      Max 
-1.76354 -0.27317  0.10693  0.39154  1.31225 

Type: lag 
Coefficients: (asymptotic standard errors) 
                   Estimate Std. Error  z value Pr(>|z|)
(Intercept)      61.6957839  6.8193539   9.0472  < 2e-16
svi_index        -0.4502705  0.2205213  -2.0418  0.04117
tree_canopy_area -0.1272125  0.0092264 -13.7879  < 2e-16
air_pollution    -0.0721093  0.0350014  -2.0602  0.03938
has_covenant1     0.0406033  0.1373476   0.2956  0.76752
holc_gradeB      -0.0734987  0.1914829  -0.3838  0.70110
holc_gradeC      -0.0223605  0.2257300  -0.0991  0.92109
holc_gradeD      -0.2109996  0.2567137  -0.8219  0.41112

Rho: 0.39843, LR test value: 27.951, p-value: 1.2445e-07
Asymptotic standard error: 0.070035
    z-value: 5.6891, p-value: 1.2771e-08
Wald statistic: 32.366, p-value: 1.2771e-08

Log likelihood: -98.53885 for lag model
ML residual variance (sigma squared): 0.30979, (sigma: 0.55659)
Number of observations: 116 
Number of parameters estimated: 10 
AIC: 217.08, (AIC for lm: 243.03)
LM test for residual autocorrelation
test value: 0.71391, p-value: 0.39815

SAR Model Results

Spatial Lag Coefficient rho: 0.398 (p<0.001), indicating significant spatial dependence in LST across census tracts.

Interpreation of Results

Tree Canopy Coverage:
- Each 1% increase in tree canopy reduces mean LST by 0.13 Fº (p < 0.001).
- This reinforces the importance of green infrastructure in mitigating urban heat.
Social Vulnerability:
- Socially vulnerable census tracts have slightly lower mean LST (-0.45 Fº); (p = 0.041).
- While counter intuitive, this suggests a complex interplay between social vulnerability and local urban characteristics. This is a results that is interesting and definetly worth exploring in the future.
Air Pollution:
- Higher air pollution emissions are associated with a small but significant reduction in LST (-0.07 Fº); (p = 0.039).
Racial Covenants:
- No significant direct effect of racial covenants was observed on LST (p = 0.768). However, covenants may indirectly influence LST through tree canopy and historical urban planning patterns.

Visualizing SAR Model Results

Observed Land Surface Temperature

Residuals from SAR Model

The residual visualization from the SAR model highlights areas where the model over-predicts or under-predicts land surface temperature (LST). Positive residuals, shown in red, indicate census tracts where the observed LST is higher than the model’s predictions, suggesting unaccounted factors contributing to elevated temperatures, such as localized infrastructure or land use patterns. Negative residuals, represented in blue, show areas where the model over-predicts LST, potentially due to higher tree canopy coverage or cooling effects not fully captured. The spatial clustering of residuals in certain areas suggests the presence of additional spatial patterns or unmeasured variables influencing LST, indicating that while the model accounts for spatial dependence, there are still localized effects worth further investigation.

Predicted Land Surface Temperature

This method assumes the response is known - see manual page

The predicted land surface temperature (LST) map highlights clear differences across Minneapolis neighborhoods. Higher temperatures, shown in darker red, are concentrated in the central and northern parts of the city. These are the areas with less tree canopy coverage, denser urban infrastructure, and historical factors that contribute to elevated heat levels. In contrast, the lighter areas on the outskirts of the city reflect cooler temperatures, which could be attributed to more green spaces and lower development density. This pattern suggests a connection between urban planning, environmental disparities, and the ongoing impacts of historical decisions on vulnerable communities.

Moran’s I Test for Residuals

To confirm that spatial autocorrelation was successfully addressed, we performed a Moran’s I Test on the residuals.


    Moran I test under randomisation

data:  final_geometry$sar_resid  
weights: weights    

Moran I statistic standard deviate = 0.86369, p-value = 0.1939
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
      0.039581484      -0.008695652       0.003124431

Moran’s I Results:

Moran’s I Statistic: (0.0396)
p-value: (0.194)

The residuals show no significant spatial autocorrelation, indicating that the SAR model effectively captured the spatial dependence.

Final Conclusion

In this study, we investigated the lasting impact of racial covenants on environmental inequality in Minneapolis, specifically focusing on air pollution and land surface temperature (LST). Our key results showed that socially vulnerable neighborhoods, identified by the Social Vulnerability Index (SVI), experienced higher levels of air pollution and elevated land surface temperatures compared to less vulnerable areas. The SAR model further revealed that tree canopy coverage significantly reduced LST, while the direct influence of racial covenants on temperature was not statistically significant. Importantly, the Moran’s I test confirmed that spatial dependence was successfully addressed, strengthening the reliability of our findings.

These results align with existing research linking historical redlining and discriminatory housing policies to contemporary environmental injustices. However, the observed weaker direct association between racial covenants and LST suggests a more complex relationship where tree canopy coverage and infrastructure decisions may act as mediators. While expected trends were observed—higher air pollution and LST in vulnerable areas—the specifics of these effects and the role of spatial dependencies were surprising and highlight the complex nature of environmental inequality.

Limitations

Scope: Our analysis was geographically restricted to Minneapolis and may not generalize to other urban areas with differing histories of segregation or environmental infrastructure.
Temporal Constraints: The analysis relied on modern environmental data, and while historical covenants were integrated, the lack of longitudinal data limits our ability to trace environmental disparities over time.
We lost 25 census tracts during the matching process where higher than average air pollution and mean lst data were lost, consequently, lowering the regression estimation results.

Future Directions

Incorporate Longitudinal Data: Analyze changes in environmental conditions over multiple decades to better capture the temporal evolution of inequality.
Expand Geographic Scope: Compare results across other U.S. cities with similar redlining histories to identify broader trends.
Include Additional Mediators: Explore the role of infrastructure investments, park access, and zoning policies as potential mediators in the relationship between historical segregation and modern environmental outcomes.

Code

Full Matching

Code

match_out_full <- matchit(
    svi_index ~ holc_grade,
    data = final,
    method = "full",
    distance = "glm",
    estimand = "ATE"
)

match_out_full_summ <- summary(match_out_full, interactions = TRUE)

plot(match_out_full_summ)

Subclass matching

Code

match_out_subclass <- matchit(
    svi_index ~ holc_grade,
    data = final,
    method = "subclass",
    subclass = 5,
    distance = "glm",
    estimand = "ATT"
)

# Compute balance statistics overall across all subclasses
match_out_subclass_summ <- summary(match_out_subclass, interactions = TRUE)
plot(match_out_subclass_summ)

Coarsened Exact Matching

Code

cutpoints <- list(
  svi_index = quantile(final$svi_index, probs = seq(0, 1, 0.25), na.rm = TRUE),  
  holc_grade = c(0, 1, 2, 3, 4) 
)

# Apply Coarsened Exact Matching
match_out_cem <- matchit(
  svi_index ~ holc_grade,
  data = final,
  method = "cem",
  cutpoints = cutpoints,  #
  estimand = "ATE"
)


match_out_cem_summ <- summary(match_out_cem, interactions = TRUE)
print(match_out_cem_summ)

Exact Matching

Code

match_out_exact <- matchit(
    svi_index ~ holc_grade,
    data = final,
    method = "exact",
    distance = NULL,
    estimand = "ATE"
)

match_out_exact_summ <- summary(match_out_exact, interactions = TRUE, un = FALSE)
match_out_exact_summ

Estimates

Air Pollution

Code

matched_data <- match.data(match_out_exact)


mod <- lm(air_pollution ~ svi_index * tree_canopy_area, data = matched_data, weights = weights)


avg_comparisons(
    mod,
    variables = "svi_index",
    vcov = ~subclass,
    newdata = matched_data
)

Mean Land Surface Temperature

Code

match_out_exact_data <- match.data(match_out_exact)%>%
  left_join(lst%>%
              select(GEOID, mean_lst),
            by="GEOID")



mod_lst <- lm(mean_lst ~ svi_index * tree_canopy_area, data = match_out_exact_data, weights = weights)
summary(mod_lst)

Evalues

Code

evalues.OLS(1.41, 0.504, sd=sd(match_out_exact_data$air_pollution))

Code

evalues.OLS(est = 2.22328, se = 0.75985, sd = sd(match_out_exact_data$mean_lst, na.rm = TRUE))

Correlation Analysis

Code

library(leaflet)
library(sf)
library(dplyr)


racial_minneapolis <- racial_cov %>%
  filter(City == "MINNEAPOLIS")

racial_msp_points <- st_centroid(racial_minneapolis)

# Filter HOLC grades for MSP census tracts and rename without dropping geometry

redlining_msp <- redlining %>% 
  filter(GEOID20 %in% msp_census_tracts) %>% 
  rename(holc_grade = EQINTER20) %>% 
  mutate(holc_grade = case_when(
    holc_grade == 1 ~ "A",
    holc_grade == 2 ~ "B",
    holc_grade == 3 ~ "C",
    holc_grade == 4 ~ "D",
  ))

minneapolis_map <- leaflet() %>%
  addTiles() %>%
  
  
  addPolygons(
    data = redlining_msp,
    fillColor = ~colorFactor("viridis", redlining$holc_grade)(holc_grade),
    color = "black",
    weight = 1,
    opacity = 0.8,
    fillOpacity = 0.5,
    highlightOptions = highlightOptions(
      weight = 3,
      color = "blue",
      fillOpacity = 0.7,
      bringToFront = TRUE
    ),
    label = ~paste("HOLC Grade:", holc_grade)
  ) %>%
  
  # Add racial covenant centroids as points
  addCircleMarkers(
    data = racial_msp_points,
    radius = 3,
    color = "purple",
    fillColor = "purple",
    weight = 1,
    opacity = 0.8,
    fillOpacity = 0.7,
    label = ~paste("Address:", Address, "<br>", "Date:", Date_Deed)
  ) %>%
  
  # Add legend for racial covenants
  addLegend(
    position = "bottomleft",
    colors = "purple",
    labels = "Racial Covenants",
    title = "Racial Covenants Overlay"
  ) %>%
  
  # Add legend for HOLC grades
  addLegend(
    position = "bottomright",
    pal = colorFactor("viridis", redlining$grade),
    values = redlining$grade,
    title = "HOLC Grades"
  )

# Display the map
minneapolis_map


# Map Social Vulnerability Index
leaflet() %>%
  addTiles() %>%
  addPolygons(
    data = social_vulnerability_spatial,
    fillColor = ~colorNumeric(palette = "inferno", domain = social_vulnerability$RPL_THEMES)(RPL_THEMES),
    fillOpacity = 0.7,
    color = "black",
    weight = 1,
    label = ~paste("SVI Score:", round(RPL_THEMES, 2)) # Proper column for SVI
  ) %>%
  addLegend(
    position = "bottomright",
    pal = colorNumeric(palette = "inferno", domain = social_vulnerability$RPL_THEMES),
    values = social_vulnerability$RPL_THEMES,
    title = "Social Vulnerability Index"
  )


lst_map <- leaflet(final_geometry) %>%
  addTiles() %>%
  addPolygons(
    fillColor = ~colorNumeric(palette = "YlOrRd", domain = final_geometry$mean_surface_temp)(mean_surface_temp),
    fillOpacity = 0.7,
    color = "black",
    weight = 1,
    label = ~paste("LST (°F):", round(mean_surface_temp, 2)),
    highlightOptions = highlightOptions(
      weight = 3,
      color = "blue",
      bringToFront = TRUE
    )
  ) %>%
  addLegend(
    "bottomright",
    pal = colorNumeric("YlOrRd", final_geometry$mean_surface_temp),
    values = final_geometry$mean_surface_temp,
    title = "Mean LST (°F)"
  )

lst_map

air_pollution_map <- leaflet(final_geometry) %>%
  addTiles() %>%
  addPolygons(
    fillColor = ~colorNumeric(palette = "PuBu", domain = final_geometry$air_pollution)(air_pollution),
    fillOpacity = 0.7,
    color = "black",
    weight = 1,
    label = ~paste("Air Pollution (tons):", round(air_pollution, 2)),
    highlightOptions = highlightOptions(
      weight = 3,
      color = "blue",
      bringToFront = TRUE
    )
  ) %>%
  addLegend(
    "bottomright",
    pal = colorNumeric("PuBu", final_geometry$air_pollution),
    values = final_geometry$air_pollution,
    title = "Air Pollution (tons)"
  )

air_pollution_map

Descriptive Statistics

Code

summary_stats <- final_geometry %>%
  group_by(has_covenant) %>%
  summarise(
    mean_lst = mean(mean_surface_temp, na.rm = TRUE),
    sd_lst = sd(mean_surface_temp, na.rm = TRUE),
    mean_tree_canopy = mean(tree_canopy_area, na.rm = TRUE),
    sd_tree_canopy = sd(tree_canopy_area, na.rm = TRUE),
    mean_air_pollution = mean(air_pollution, na.rm = TRUE),
    sd_air_pollution = sd(air_pollution, na.rm = TRUE)
  )

print(summary_stats)

SAR Model Setup and Results

Code

queen_nb <- poly2nb(final_geometry, queen = TRUE)
weights <- nb2listw(queen_nb, style = "W")

sar_model <- lagsarlm(
  mean_surface_temp ~ svi_index + tree_canopy_area + air_pollution + has_covenant + holc_grade,
  data = final_geometry,
  listw = weights,
  method = "eigen"
)

Visualizing SAR Model and Results

Code

ggplot(final_geometry) +
  geom_sf(aes(fill = mean_surface_temp)) +
  scale_fill_gradient(low = "lightgrey", high = "red", name = "Mean LST") +
  labs(title = "Observed Land Surface Temperature") +
  theme_minimal()

final_geometry$sar_resid <- resid(sar_model)

ggplot(final_geometry) +
  geom_sf(aes(fill = sar_resid)) +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0, name = "Residuals") +
  labs(title = "Residuals from SAR Model") +
  theme_minimal()

final_geometry$sar_pred <- fitted(sar_model)

ggplot(final_geometry) +
  geom_sf(aes(fill = sar_pred)) +
  scale_fill_gradient(low = "lightgrey", high = "red", name = "Predicted LST") +
  labs(title = "Predicted Land Surface Temperature") +
  theme_minimal()

Moran’s I Test

Code

moran_test <- moran.test(final_geometry$sar_resid, weights)
print(moran_test)

Introduction

Data

Sources

Data Exploration

Boxplot: Air Pollution and Land Surface Temperature by Social Vulnerability

Data Cleaning and Transformation

Methods

Approach

Our Causal DAG

Matching

Estimate Results

Air Pollution

Land Surface Temperature

Sensitivity Analysis

Unexplained Cofoundes for Air Pollution

Unexplained Cofounder for Mean LST

Correlation Analysis

Spatial Autoregressive (SAR) model

Descriptive Statistics

SAR Model Setup and Results

Step 1: Defining Spatial Relationship

Step 2: Fitting the SAR Model

SAR Model Results

Interpreation of Results

Visualizing SAR Model Results

Observed Land Surface Temperature

Residuals from SAR Model

Predicted Land Surface Temperature

Moran’s I Test for Residuals

Moran’s I Results:

Final Conclusion

Limitations

Future Directions

Code