Example: Intersectionality

Vinh Nguyen


Background on Intersectionality

Disaggregation and disproportionate impact (DI) analysis allow analysts to identify student groups in need of support, helping the institution prioritize resources in order to close equity gaps. As shown in the Scaling DI vignette, one can repeat DI calculations over various success variables, group (disaggregation) variables, and cohort variables using the di_iterate function from the DisImpact package. For example, one could repeat the disaggregation by multiple demographic variables (e.g., ethnicity, gender, low income status, foster youth status, undocumented status, and LGBTQIA+ status) and, for each disaggregation, identify the groups that are disproportionately impacted on each outcome.

Conducting a DI analysis as described is a good first step in understanding student needs. However, it ignores intersectionality: considering each demographic variable individually leaves out the intersections of identity, where the level of disproportionate impact may be compounded. For example, “men of color” and “African American LGBTQIA+” communities may be even more disproportionately impacted on outcomes than what is reported when each variable (ethnicity, gender, and LGBTQIA+ status) is disaggregated on its own.
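The compounding effect described above can be illustrated with a small sketch. The data below are entirely made up (two hypothetical ethnicity groups, A and B, and two genders) and use base R only; the point is simply that a group's rate at the intersection can sit well below the rate reported for either variable on its own.

```r
# Hypothetical toy data: 1 = success, 0 = no success (values invented for illustration)
toy <- data.frame(
  ethnicity = rep(c('A', 'A', 'B', 'B'), each = 100),
  gender    = rep(c('Female', 'Male', 'Female', 'Male'), each = 100),
  success   = c(rep(1, 80), rep(0, 20),   # A, Female: 80 of 100 succeed
                rep(1, 80), rep(0, 20),   # A, Male:   80 of 100 succeed
                rep(1, 90), rep(0, 10),   # B, Female: 90 of 100 succeed
                rep(1, 50), rep(0, 50)),  # B, Male:   50 of 100 succeed
  stringsAsFactors = FALSE
)

# Success rate when each variable is disaggregated on its own
rate_by_ethnicity <- tapply(toy$success, toy$ethnicity, mean)
rate_by_gender    <- tapply(toy$success, toy$gender, mean)

# Success rate at the intersection of the two variables
rate_by_both <- tapply(toy$success, paste0(toy$ethnicity, ', ', toy$gender), mean)

rate_by_ethnicity
rate_by_gender
rate_by_both
```

In this toy example, group B as a whole and males as a whole each look moderately behind, while the "B, Male" intersection is much further behind than either single-variable view suggests.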

This vignette describes how one might account for intersectionality using the DisImpact package.

Intersectionality Using DisImpact

First, let’s conduct a DI analysis on the student_equity data set using a few demographic variables, as described in the Scaling DI vignette.

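# Load some necessary packages
library(DisImpact)
library(dplyr)    # %>%, mutate, filter, select
library(stringr)  # str_detect
library(ggplot2)
library(forcats)  # fct_reorder
library(scales)   # percent

# Load student equity data set
data(student_equity)

# Calculate DI over several scenarios
df_di_summary <- di_iterate(data=student_equity
                          , success_vars=c('Math', 'English', 'Transfer')
                          , group_vars=c('Ethnicity', 'Gender')
                          , cohort_vars=c('Cohort_Math', 'Cohort_English', 'Cohort')
                          , scenario_repeat_by_vars=c('Ed_Goal', 'College_Status')
                            )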

Incorporating intersectionality is quite straightforward using the DisImpact package. First, create a new variable that captures the intersection of interest. Then pass it, like any other demographic variable, to the group_vars argument of di_iterate. The following code illustrates the intersection of ethnicity and gender.

# Create new variable
student_equity_intersection <- student_equity %>%
  mutate(`Ethnicity + Gender`=paste0(Ethnicity, ', ', Gender))

# Check
table(student_equity_intersection$`Ethnicity + Gender`, useNA='ifany')
##           Asian, Female             Asian, Male            Asian, Other 
##                    2936                    2950                     114 
##           Black, Female             Black, Male            Black, Other 
##                    1021                     953                      26 
##        Hispanic, Female          Hispanic, Male         Hispanic, Other 
##                    2002                    1920                      78 
## Multi-Ethnicity, Female   Multi-Ethnicity, Male  Multi-Ethnicity, Other 
##                     509                     467                      24 
## Native American, Female   Native American, Male  Native American, Other 
##                     105                      91                       4 
##           White, Female             White, Male            White, Other 
##                    3285                    3385                     130
# Run DI, then select rows of interest (for Ethnicity + Gender, remove the Other gender)
df_di_summary_intersection <- di_iterate(data=student_equity_intersection # Specify new data set
                          , success_vars=c('Math', 'English', 'Transfer')
                          , group_vars=c('Ethnicity', 'Gender', 'Ethnicity + Gender') # Add new column name
                          , cohort_vars=c('Cohort_Math', 'Cohort_English', 'Cohort')
                          , scenario_repeat_by_vars=c('Ed_Goal', 'College_Status')
                            ) %>%
  filter(!(disaggregation=='Ethnicity + Gender') | !str_detect(group, ', Other')) # Drop Ethnicity + Gender groups where gender is Other
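The same pattern extends to intersections of more than two variables. The following is a minimal sketch of a helper that builds such a column; add_intersection is a hypothetical function (not part of DisImpact), and the Low_Income column and toy values are assumptions made for illustration only.

```r
# Hypothetical helper (not part of DisImpact): paste any number of columns
# into one intersection column named by joining the column names with ' + '.
add_intersection <- function(data, vars, sep = ', ') {
  new_name <- paste(vars, collapse = ' + ')
  # paste() is applied across the selected columns, row by row
  data[[new_name]] <- do.call(paste, c(unname(as.list(data[vars])), sep = sep))
  data
}

# Toy data with made-up values; Low_Income is an illustrative column name
toy <- data.frame(
  Ethnicity  = c('Asian', 'Black', 'Hispanic'),
  Gender     = c('Female', 'Male', 'Female'),
  Low_Income = c('Yes', 'No', 'Yes'),
  stringsAsFactors = FALSE
)

toy <- add_intersection(toy, c('Ethnicity', 'Gender'))
toy <- add_intersection(toy, c('Ethnicity', 'Gender', 'Low_Income'))
toy
```

The resulting column names could then be appended to group_vars in the same way as 'Ethnicity + Gender' above. Note that paste() turns missing values into the literal text "NA", so rows with missing demographics may need to be handled before this step, and that each added variable shrinks the group sizes.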

Visualizing in Dashboard Platform

Once a DI summary data set with intersections of interest is available, it could be used in dashboard development as described in the Scaling DI vignette.

# Disaggregation: Ethnicity
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  as.data.frame()
##    cohort           group    n       pct di_indicator_ppg
## 1    2017           Asian 1406 0.8968706                0
## 2    2017           Black  421 0.7862233                1
## 3    2017        Hispanic  815 0.7325153                1
## 4    2017 Multi-Ethnicity  211 0.8293839                0
## 5    2017 Native American   45 0.9333333                0
## 6    2017           White 1500 0.8773333                0
## 7    2018           Asian 2212 0.9235986                0
## 8    2018           Black  684 0.7441520                1
## 9    2018        Hispanic 1386 0.7366522                1
## 10   2018 Multi-Ethnicity  369 0.7940379                1
## 11   2018 Native American   68 0.8088235                0
## 12   2018           White 2576 0.8819876                0
## 13   2019           Asian 1429 0.9083275                0
## 14   2019           Black  411 0.7834550                1
## 15   2019        Hispanic  786 0.7404580                1
## 16   2019 Multi-Ethnicity  225 0.8000000                0
## 17   2019 Native American   47 0.8297872                0
## 18   2019           White 1558 0.8896021                0
## 19   2020           Asian  573 0.9301920                0
## 20   2020           Black  180 0.7333333                1
## 21   2020        Hispanic  304 0.7171053                1
## 22   2020 Multi-Ethnicity   99 0.7575758                0
## 23   2020 Native American   14 0.6428571                0
## 24   2020           White  610 0.8819672                0
##    di_indicator_prop_index di_indicator_80_index
## 1                        0                     0
## 2                        0                     0
## 3                        0                     1
## 4                        0                     0
## 5                        0                     0
## 6                        0                     0
## 7                        0                     0
## 8                        0                     0
## 9                        0                     1
## 10                       0                     0
## 11                       0                     0
## 12                       0                     0
## 13                       0                     0
## 14                       0                     0
## 15                       0                     0
## 16                       0                     0
## 17                       0                     0
## 18                       0                     0
## 19                       0                     0
## 20                       0                     1
## 21                       0                     1
## 22                       0                     0
## 23                       1                     1
## 24                       0                     0
# Disaggregation: Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  as.data.frame()
# Disaggregation: Ethnicity + Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity + Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  as.data.frame()
##    cohort                   group    n       pct di_indicator_ppg
## 1    2017           Asian, Female  660 0.9075758                0
## 2    2017             Asian, Male  721 0.8862691                0
## 3    2017           Black, Female  212 0.7594340                1
## 4    2017             Black, Male  202 0.8168317                0
## 5    2017        Hispanic, Female  413 0.7312349                1
## 6    2017          Hispanic, Male  385 0.7272727                1
## 7    2017 Multi-Ethnicity, Female   95 0.8526316                0
## 8    2017   Multi-Ethnicity, Male  112 0.8125000                0
## 9    2017 Native American, Female   28 0.8928571                0
## 10   2017   Native American, Male   14 1.0000000                0
## 11   2017           White, Female  739 0.8646820                0
## 12   2017             White, Male  734 0.8869210                0
## 13   2018           Asian, Female 1111 0.9225923                0
## 14   2018             Asian, Male 1056 0.9270833                0
## 15   2018           Black, Female  340 0.7500000                1
## 16   2018             Black, Male  334 0.7395210                1
## 17   2018        Hispanic, Female  681 0.7577093                1
## 18   2018          Hispanic, Male  673 0.7161961                1
## 19   2018 Multi-Ethnicity, Female  195 0.8358974                0
## 20   2018   Multi-Ethnicity, Male  163 0.7423313                1
## 21   2018 Native American, Female   33 0.9090909                0
## 22   2018   Native American, Male   34 0.7058824                0
## 23   2018           White, Female 1234 0.8800648                0
## 24   2018             White, Male 1285 0.8840467                0
## 25   2019           Asian, Female  704 0.9232955                0
## 26   2019             Asian, Male  698 0.8954155                0
## 27   2019           Black, Female  213 0.7934272                0
## 28   2019             Black, Male  194 0.7731959                1
## 29   2019        Hispanic, Female  386 0.7616580                1
## 30   2019          Hispanic, Male  390 0.7230769                1
## 31   2019 Multi-Ethnicity, Female  111 0.8018018                0
## 32   2019   Multi-Ethnicity, Male  109 0.7889908                0
## 33   2019 Native American, Female   22 0.8181818                0
## 34   2019   Native American, Male   25 0.8400000                0
## 35   2019           White, Female  753 0.9003984                0
## 36   2019             White, Male  782 0.8785166                0
## 37   2020           Asian, Female  283 0.9363958                0
## 38   2020             Asian, Male  280 0.9214286                0
## 39   2020           Black, Female   90 0.6222222                1
## 40   2020             Black, Male   87 0.8390805                0
## 41   2020        Hispanic, Female  147 0.7278912                1
## 42   2020          Hispanic, Male  149 0.7248322                1
## 43   2020 Multi-Ethnicity, Female   59 0.7966102                0
## 44   2020   Multi-Ethnicity, Male   39 0.6923077                0
## 45   2020 Native American, Female    8 0.6250000                0
## 46   2020   Native American, Male    6 0.6666667                0
## 47   2020           White, Female  295 0.8677966                0
## 48   2020             White, Male  304 0.8947368                0
##    di_indicator_prop_index di_indicator_80_index
## 1                        0                     0
## 2                        0                     0
## 3                        0                     1
## 4                        0                     0
## 5                        0                     1
## 6                        0                     1
## 7                        0                     0
## 8                        0                     0
## 9                        0                     0
## 10                       0                     0
## 11                       0                     0
## 12                       0                     0
## 13                       0                     0
## 14                       0                     0
## 15                       0                     1
## 16                       0                     1
## 17                       0                     1
## 18                       0                     1
## 19                       0                     0
## 20                       0                     1
## 21                       0                     0
## 22                       0                     1
## 23                       0                     0
## 24                       0                     0
## 25                       0                     0
## 26                       0                     0
## 27                       0                     1
## 28                       0                     1
## 29                       0                     1
## 30                       0                     1
## 31                       0                     0
## 32                       0                     1
## 33                       0                     0
## 34                       0                     0
## 35                       0                     0
## 36                       0                     0
## 37                       0                     0
## 38                       0                     0
## 39                       1                     1
## 40                       0                     0
## 41                       0                     1
## 42                       0                     1
## 43                       0                     1
## 44                       0                     1
## 45                       1                     1
## 46                       1                     1
## 47                       0                     0
## 48                       0                     0
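To see what the intersection adds, it can help to place each intersection group next to its parent ethnicity. The sketch below does this in base R for a handful of rows copied from the Math outputs above (Black students, cohorts 2017 and 2020); the data frame and column names by_ethnicity, by_intersection, pct_ethnicity, di_ethnicity, and gap_vs_ethnicity are introduced here for illustration.

```r
# Rows copied from the Ethnicity output above (Math, Black students)
by_ethnicity <- data.frame(
  cohort        = c(2017, 2020),
  ethnicity     = c('Black', 'Black'),
  pct_ethnicity = c(0.7862233, 0.7333333),
  di_ethnicity  = c(1, 1),
  stringsAsFactors = FALSE
)

# Rows copied from the Ethnicity + Gender output above
by_intersection <- data.frame(
  cohort           = c(2017, 2017, 2020, 2020),
  group            = c('Black, Female', 'Black, Male', 'Black, Female', 'Black, Male'),
  pct              = c(0.7594340, 0.8168317, 0.6222222, 0.8390805),
  di_indicator_ppg = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Recover the parent ethnicity from the intersection label ('Black, Female' -> 'Black')
by_intersection$ethnicity <- sub(', .*$', '', by_intersection$group)

# Join each intersection group to its parent ethnicity within the same cohort
compare <- merge(by_intersection, by_ethnicity, by = c('cohort', 'ethnicity'))
compare$gap_vs_ethnicity <- compare$pct - compare$pct_ethnicity

compare[order(compare$cohort, compare$group),
        c('cohort', 'group', 'pct', 'pct_ethnicity', 'gap_vs_ethnicity', 'di_indicator_ppg')]
```

For these rows, the ethnicity-level PPG flag for Black students reflects the Black, Female group in both cohorts, while the Black, Male group is not flagged. The same join could be applied to the full df_di_summary_intersection, assuming the group labels keep the 'Ethnicity, Gender' format created earlier.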
# Disaggregation: Ethnicity
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  mutate(group=factor(group) %>% fct_reorder(desc(pct))) %>% 
  ggplot(data=., mapping=aes(x=factor(cohort), y=pct, group=group, color=group)) +
  geom_point(aes(size=factor(di_indicator_ppg, levels=c(0, 1), labels=c('Not DI', 'DI')))) +
  geom_line() +
  xlab('Cohort') +
  ylab('Rate') +
  theme_bw() +
  scale_color_manual(values=c('#1b9e77', '#d95f02', '#7570b3', '#e7298a', '#66a61e', '#e6ab02'), name='Ethnicity') +
  labs(size='Disproportionate Impact') +
  scale_y_continuous(labels = percent, limits=c(0, 1)) +
  ggtitle('Dashboard drop-down selections:', subtitle=paste0("Ed Goal = '- All' | College Status = '- All' | Outcome = 'Math' | Disaggregation = 'Ethnicity'"))
## Warning: Using size for a discrete variable is not advised.

# Disaggregation: Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  mutate(group=factor(group) %>% fct_reorder(desc(pct))) %>% 
  ggplot(data=., mapping=aes(x=factor(cohort), y=pct, group=group, color=group)) +
  geom_point(aes(size=factor(di_indicator_ppg, levels=c(0, 1), labels=c('Not DI', 'DI')))) +
  geom_line() +
  xlab('Cohort') +
  ylab('Rate') +
  theme_bw() +
  scale_color_manual(values=c('#e7298a', '#66a61e', '#e6ab02'), name='Gender') +
  labs(size='Disproportionate Impact') +
  scale_y_continuous(labels = percent, limits=c(0, 1)) +
  ggtitle('Dashboard drop-down selections:', subtitle=paste0("Ed Goal = '- All' | College Status = '- All' | Outcome = 'Math' | Disaggregation = 'Gender'"))
## Warning: Using size for a discrete variable is not advised.

# Disaggregation: Ethnicity + Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity + Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  mutate(group=factor(group) %>% fct_reorder(desc(pct))) %>% 
  ggplot(data=., mapping=aes(x=factor(cohort), y=pct, group=group, color=group)) +
  geom_point(aes(size=factor(di_indicator_ppg, levels=c(0, 1), labels=c('Not DI', 'DI')))) +
  geom_line() +
  xlab('Cohort') +
  ylab('Rate') +
  theme_bw() +
  scale_color_manual(values=c('#a6cee3', '#1f78b4', '#b2df8a', '#33a02c', '#fb9a99', '#e31a1c', '#fdbf6f', '#ff7f00', '#cab2d6', '#6a3d9a', '#ffff99', '#b15928'), name='Ethnicity + Gender') +
  labs(size='Disproportionate Impact') +
  scale_y_continuous(labels = percent, limits=c(0, 1)) +
  ggtitle('Dashboard drop-down selections:', subtitle=paste0("Ed Goal = '- All' | College Status = '- All' | Outcome = 'Math' | Disaggregation = 'Ethnicity + Gender'"))
## Warning: Using size for a discrete variable is not advised.
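To hand the DI summary to a dashboard platform, one option is to write it to a flat file. The sketch below assumes the platform can read CSV; di_summary is a two-row stand-in for df_di_summary_intersection (values copied from the output above), and the file name is hypothetical.

```r
# Stand-in for df_di_summary_intersection: two rows copied from the output above
di_summary <- data.frame(
  cohort           = c(2020, 2020),
  disaggregation   = 'Ethnicity + Gender',
  group            = c('Black, Female', 'Black, Male'),
  n                = c(90, 87),
  pct              = c(0.6222222, 0.8390805),
  di_indicator_ppg = c(1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical output location; a temporary directory is used for this sketch
out_file <- file.path(tempdir(), 'di_summary_intersection.csv')
write.csv(di_summary, out_file, row.names = FALSE)

# Read the file back to confirm that group labels containing commas survive
check <- read.csv(out_file, stringsAsFactors = FALSE)
check$group
```

Because the intersection labels contain commas, they rely on the quoting that write.csv applies to character columns by default; a different separator in the label (for example ' / ') would be another way to avoid ambiguity.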

Appendix: R and R Package Versions

This vignette was generated using an R session with the following packages. Readers replicating the code may see some discrepancies caused by version mismatches.

## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
## Matrix products: default
## locale:
## [1] LC_COLLATE=C                          
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## other attached packages:
## [1] forcats_0.5.0    scales_1.1.1     ggplot2_3.3.2    stringr_1.4.0   
## [5] knitr_1.39       dplyr_1.0.8      DisImpact_0.0.18
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.8.3     pillar_1.7.0     bslib_0.3.1      compiler_4.0.2  
##  [5] jquerylib_0.1.4  highr_0.9        prettydoc_0.4.1  tools_4.0.2     
##  [9] digest_0.6.25    gtable_0.3.0     jsonlite_1.5     evaluate_0.15   
## [13] lifecycle_1.0.1  tibble_3.1.6     fstcore_0.9.12   pkgconfig_2.0.3 
## [17] rlang_1.0.1      cli_3.2.0        yaml_2.3.5       parallel_4.0.2  
## [21] xfun_0.30        fastmap_1.1.0    withr_2.5.0      generics_0.1.2  
## [25] vctrs_0.3.8      sass_0.4.1       grid_4.0.2       tidyselect_1.1.2
## [29] glue_1.6.1       R6_2.3.0         fansi_1.0.2      rmarkdown_2.14  
## [33] farver_2.0.3     purrr_0.3.4      tidyr_1.2.0      magrittr_2.0.2  
## [37] ellipsis_0.3.2   htmltools_0.5.2  fst_0.9.8        colorspace_1.4-1
## [41] labeling_0.3     utf8_1.2.2       stringi_1.4.6    munsell_0.5.0   
## [45] crayon_1.5.0