This vignette describes how to process and visualize data extracted
from a CHOIR database and provides an example using the included
validation dataset. In this example, we will visualize the
percent endorsement of each region of the CHOIR Body Map (CBM) for men
and women.
First, load the validation data into R and get a quick look at it.
# load the CHOIRBM package and the validation data included with it
library(CHOIRBM)
data(validation)
head(validation)
#> id gender race age bodymap_regions_csv score
#> 1 1 Female White 45 128,117,107,105,223 58
#> 2 2 Female Other 46 110,106,105,235,233 60
#> 3 3 Female Other 42 219,218,213,212,210 73
#> 4 4 Female White 73 136,134,132,130,127 12
#> 5 5 Female Other 60 136,134,133,130,129 100
#> 6 6 Female White 51 128,125,118,117,214 33
Notice that each patient's CHOIR Body Map is stored as a comma-separated string of region IDs in the bodymap_regions_csv column. Each string must be converted into its own body map data frame before we can go further.
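To make the storage format concrete, the sketch below splits one of the strings from the output above into individual region IDs using base R only. It is an illustration of the input format, not the implementation of any CHOIRBM function; the variable names are made up for this example.

```r
# One bodymap_regions_csv value, copied from the first row shown above
regions_csv <- "128,117,107,105,223"

# Split on commas; strsplit() returns a list, so take its first element
region_ids <- strsplit(regions_csv, ",", fixed = TRUE)[[1]]

region_ids
#> [1] "128" "117" "107" "105" "223"
```

The IDs stay as character strings here, which keeps them usable as labels rather than as numbers to do arithmetic on.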
Separate the data into male and female data frames.
male_data <- validation[validation[["gender"]] == "Male", ]
female_data <- validation[validation[["gender"]] == "Female", ]
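Because the number of rows in each subset is later used as the denominator for percent endorsement, it can be worth checking how this kind of logical subsetting behaves when the grouping column contains missing values. The sketch below uses a small made-up data frame (toy is hypothetical, and nothing here implies that the validation data contain missing genders) to show that a comparison against NA yields an extra all-NA row, and that wrapping the condition in which() drops it.

```r
# Hypothetical toy data with one missing gender value
toy <- data.frame(
  id = 1:4,
  gender = c("Male", "Female", NA, "Male")
)

# Plain logical subsetting: the NA comparison produces an all-NA row
toy_male_raw <- toy[toy[["gender"]] == "Male", ]

# which() keeps only the positions where the comparison is TRUE
toy_male <- toy[which(toy[["gender"]] == "Male"), ]

nrow(toy_male_raw)
#> [1] 3
nrow(toy_male)
#> [1] 2
```

If a column is known to be complete, the two forms give the same result.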
Create a list of body map data frames for the men and women by using
the string_to_map() function and R’s lapply().
Then use agg_choirbm_list() to reduce the list of data
frames through addition of the endorsement values. Since we want the
percent endorsement for plotting, we can then calculate the percentage
as a separate column in the final data frame.
male_bodymap_list <- lapply(male_data[["bodymap_regions_csv"]], string_to_map)
male_bodymap_df <- agg_choirbm_list(male_bodymap_list)
# we want to visualize the percent endorsement, so divide the values by
# the size of the data set and multiply by 100
male_bodymap_df[["perc"]] <- male_bodymap_df[["value"]] /
  nrow(male_data) * 100
head(male_bodymap_df)
#> id value group perc
#> 1 101 147 Front 19.838057
#> 2 102 161 Front 21.727395
#> 3 103 66 Front 8.906883
#> 4 104 87 Front 11.740891
#> 5 105 57 Front 7.692308
#> 6 106 62 Front 8.367072
female_bodymap_list <- lapply(
  female_data[["bodymap_regions_csv"]],
  string_to_map
)
female_bodymap_df <- agg_choirbm_list(female_bodymap_list)
# we want to visualize the percent endorsement, so divide the values by
# the size of the data set and multiply by 100
female_bodymap_df[["perc"]] <- female_bodymap_df[["value"]] /
  nrow(female_data) * 100
head(female_bodymap_df)
#> id value group perc
#> 1 101 1046 Front 16.532322
#> 2 102 1211 Front 19.140193
#> 3 103 713 Front 11.269164
#> 4 104 827 Front 13.070966
#> 5 105 430 Front 6.796270
#> 6 106 476 Front 7.523313
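The aggregation and percentage steps can be illustrated on a tiny made-up example using base R only. This is a sketch of the idea of reducing a list of data frames by addition, not the actual implementation of agg_choirbm_list(); the toy data frames only imitate the id and value columns shown above, and they assume that 1 marks an endorsed region and 0 an unendorsed one.

```r
# Hypothetical per-patient body maps: 1 = region endorsed, 0 = not endorsed
toy_maps <- list(
  data.frame(id = c("101", "102", "103"), value = c(1, 0, 1)),
  data.frame(id = c("101", "102", "103"), value = c(1, 1, 0)),
  data.frame(id = c("101", "102", "103"), value = c(0, 0, 1)),
  data.frame(id = c("101", "102", "103"), value = c(1, 0, 0))
)

# Add the value columns element-wise across all patients
toy_total <- toy_maps[[1]]
toy_total[["value"]] <- Reduce(
  `+`,
  lapply(toy_maps, function(m) m[["value"]])
)

# Percent endorsement: endorsement count / number of patients * 100
toy_total[["perc"]] <- toy_total[["value"]] / length(toy_maps) * 100

toy_total
#>    id value perc
#> 1 101     3   75
#> 2 102     1   25
#> 3 103     2   50
```

The sketch relies on every data frame listing the same regions in the same order, which is what makes element-wise addition meaningful.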
Once the data are formatted and the values to plot have been calculated, we can generate the CBMs. Plot the male and female body maps separately.
# theme() and labs() come from ggplot2
library(ggplot2)
plot_male_choirbm(male_bodymap_df, value = "perc") +
  theme(legend.position = "bottom") +
  labs(fill = "Percent Endorsement")
plot_female_choirbm(female_bodymap_df, value = "perc") +
  theme(legend.position = "bottom") +
  labs(fill = "Percent Endorsement")