
1. Getting Started
Yuki Atsusaka and Seo-young Silvia Kim
Source: vignettes/v1-getting-started.Rmd
This vignette introduces the main rankingQ workflow
using the identity dataset. The package is designed for
ranking questions that may contain random responses and uses an
anchor-ranking question to estimate and correct the resulting
measurement error.
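To see why an anchor question is informative, note that a respondent who ranks J items at random matches the known ordering with probability 1/J! if all orderings are equally likely. The sketch below backs out the share of random responses from the observed share of correct anchor answers, assuming that attentive respondents always answer the anchor correctly. The vector anc_correct is made-up illustration data. This conveys the intuition only: it ignores survey weights and uncertainty, and it is not necessarily the exact estimator the package implements.

```r
# Hypothetical anchor outcomes (1 = matched the known ordering); not package data
anc_correct <- c(1, 1, 0, 1, 0, 1, 1, 0, 1, 1)
J <- 4  # number of items being ranked

# Assumption: random respondents pick each of the J! orderings with equal probability
p_chance <- 1 / factorial(J)

# Observed share correct = (1 - p_random) * 1 + p_random * p_chance,
# so solving for p_random gives:
p_correct <- mean(anc_correct)
p_random <- (1 - p_correct) / (1 - p_chance)

p_random
```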
The Example Data
The identity dataset contains a main ranking question
about four sources of identity and an anchor-ranking question with a
known correct ordering. It also includes the survey weight
s_weight.
library(rankingQ)
library(dplyr) # select(), filter(), starts_with()

identity |>
select(
s_weight,
app_identity,
starts_with("app_identity_"),
anc_correct_identity
) |>
head()
#> # A tibble: 6 × 9
#> s_weight app_identity app_identity_1 app_identity_2 app_identity_3
#> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 0.844 1423 1 4 2
#> 2 0.886 1423 1 4 2
#> 3 2.96 3412 3 4 1
#> 4 0.987 1423 1 4 2
#> 5 1.76 4132 4 1 3
#> 6 0.469 3124 3 1 2
#> # ℹ 4 more variables: app_identity_4 <dbl>, app_identity_recorded <chr>,
#> # app_identity_row_rnd <chr>, anc_correct_identity <dbl>
The app_identity columns describe the ranking question
of interest, while anc_correct_identity indicates whether
each respondent answered the anchor question correctly.
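Judging from the rows printed above, each character of app_identity appears to be the rank given to the corresponding item column (for example, "1423" lines up with app_identity_1 = 1, app_identity_2 = 4, app_identity_3 = 2). A minimal base-R sketch of that mapping follows; split_ranking is a hypothetical helper written for illustration, not a rankingQ function, and it assumes single-digit ranks.

```r
# Hypothetical helper (not part of rankingQ): split a ranking string
# into one integer rank per item. Assumes fewer than 10 items.
split_ranking <- function(x, prefix = "app_identity_") {
  ranks <- as.integer(strsplit(x, "")[[1]])
  names(ranks) <- paste0(prefix, seq_along(ranks))
  ranks
}

split_ranking("1423")
```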
Direct Bias Correction
The imprr_direct function directly estimates
bias-corrected quantities of interest such as average ranks, pairwise
ranking probabilities, top-k probabilities, and marginal rank
probabilities.
out_direct <- imprr_direct(
data = identity,
J = 4,
main_q = "app_identity",
anc_correct = "anc_correct_identity",
weight = "s_weight"
)
The first output summarizes the estimated proportion of random responses.
out_direct$est_p_random
#> mean lower upper
#> 1 0.3512825 0.3077923 0.3977214
The second output contains several corrected ranking-based quantities of interest. For a first pass, average ranks are often the easiest place to start.
out_direct$results |>
filter(qoi == "average rank")
#> # A tibble: 4 × 6
#> item qoi outcome mean lower upper
#> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 app_identity_1 average rank Avg: app_identity_1 3.27 3.13 3.40
#> 2 app_identity_2 average rank Avg: app_identity_2 2.58 2.43 2.75
#> 3 app_identity_3 average rank Avg: app_identity_3 1.66 1.52 1.81
#> 4 app_identity_4 average rank Avg: app_identity_4 2.49 2.37 2.59
Inverse-Probability Weighting
The imprr_weights function instead produces
respondent-level weights that can be used in downstream analyses.
out_weights <- imprr_weights(
data = identity,
J = 4,
main_q = "app_identity",
anc_correct = "anc_correct_identity",
weight = "s_weight"
)
One output gives the bias-correction weight assigned to each possible ranking profile.
out_weights$rankings |>
select(ranking, weights) |>
head()
#> ranking weights
#> 1 1234 0.4413785
#> 2 1243 0.0000000
#> 3 1324 0.3402731
#> 4 1342 0.0000000
#> 5 1423 0.9819455
#> 6 1432 0.2721408
The respondent-level output keeps the original data and appends a
weights column along with a unified ranking
column.
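In the output shown in this section, respondents with the same ranking profile receive the same weight (profile 1423 has weight 0.9819455 in the profile table), so the respondent-level column behaves like a lookup from profile to weight. The base-R sketch below illustrates that relationship using the six profile weights printed above and a made-up vector of respondent rankings; it is not the package's internal code.

```r
# Profile-level weights copied from the first six rows of out_weights$rankings
profile_weights <- c(
  "1234" = 0.4413785, "1243" = 0.0000000, "1324" = 0.3402731,
  "1342" = 0.0000000, "1423" = 0.9819455, "1432" = 0.2721408
)

# Hypothetical respondent rankings, for illustration only
respondent_ranking <- c("1423", "1234", "1423", "1432")

# Index the named vector by profile to attach a weight to each respondent
respondent_weight <- unname(profile_weights[respondent_ranking])
respondent_weight
```

The package output that follows shows the same pattern for the first six respondents.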
out_weights$results |>
select(weights, s_weight, app_identity, ranking) |>
head()
#> # A tibble: 6 × 4
#> weights s_weight app_identity ranking
#> <dbl> <dbl> <chr> <chr>
#> 1 0.982 0.844 1423 1423
#> 2 0.982 0.886 1423 1423
#> 3 1.32 2.96 3412 3412
#> 4 0.982 0.987 1423 1423
#> 5 1.14 1.76 4132 4132
#> 6 0.966 0.469 3124 3124
Using the IPW Weights
The IPW-adjusted respondent-level data can be passed to downstream
helpers such as avg_rank.
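Conceptually, an IPW-adjusted average rank is a weighted mean of the ranks an item received. The sketch below does this by hand for app_identity_1 using only the six rows printed earlier in this vignette, so the number it returns is illustrative and will not match the full-sample estimate. Whether avg_rank also folds in s_weight, and how it computes standard errors, is not shown here and is not assumed.

```r
# Ranks of app_identity_1 and bias-correction weights for the six rows shown earlier
rank_item1 <- c(1, 1, 3, 1, 4, 3)
ipw <- c(0.982, 0.982, 1.32, 0.982, 1.14, 0.966)

# Weighted mean: sum(w * x) / sum(w)
avg_item1 <- weighted.mean(rank_item1, w = ipw)
avg_item1
```

The avg_rank call that follows does the equivalent for all four items at once and also reports uncertainty.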
items_df <- data.frame(
variable = paste0("app_identity_", 1:4),
item = c("Party", "Religion", "Gender", "Race")
)
avg_rank(
out_weights$results,
items = items_df,
weight = "weights",
raw = FALSE
)
#> item qoi mean se lower upper method
#> 1 Party Average Rank 3.226655 0.02751980 3.172656 3.280653 IPW
#> 2 Religion Average Rank 2.600988 0.04074443 2.521041 2.680935 IPW
#> 3 Gender Average Rank 1.707342 0.02466106 1.658953 1.755731 IPW
#> 4 Race Average Rank 2.465016 0.02531970 2.415334 2.514697 IPW
Next Steps
The remaining vignettes go into more detail on specific parts of the workflow:
- 2. Basic Setup describes the expected input data structure.
- 3. Correcting Bias in Ranking Data covers the correction methods in more depth.
- 4. Analysis of Bias-corrected Ranking Data shows downstream analysis with corrected weights.
- 5. Visualizing Rankings introduces the plotting helpers.
- 6. Uniformity Tests covers diagnostics when anchor questions are unavailable or need validation.
- 7. Simulating Ranking Data describes the data-generation helper rpluce.