For paired (or one-sample) data, the rank-biserial correlation gives a seemingly odd result when some paired differences are tied with the `mu` argument, i.e. when a difference is exactly equal to `mu`. For example (code below), in the `sleep` data set one of the paired differences is exactly zero, so in my opinion the effect size should not be -1.00, and the probability of superiority should not be 0%. Is there any particular reason for this?
> # example of function
> rank_biserial(extra ~ group, data = sleep, paired = TRUE,
+ parametric = FALSE, mu = 0)
r (rank biserial) | 95% CI
----------------------------------
-1.00 | [-1.00, -1.00]
> # take paired differences
> z = subset(sleep, group == 1)$extra - subset(sleep, group == 2)$extra
> # calculate sum less than zero
> # not equal to 100% less!
> sum(z < 0) / length(z)
[1] 0.9
My feeling is that the computation in `.r_rbs` should actually be the following for paired samples.
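# differences from mu, with missing pairs dropped
z <- na.omit((x - y) - mu)
# signed ranks; differences equal to zero come back as NA
Ry <- effectsize:::.safe_ranktransform(z, sign = TRUE, verbose = verbose)
# indicator for zero differences (ties with mu)
Ry0 <- ifelse(is.na(Ry), 1, 0)
Ry <- stats::na.omit(Ry)
# n counts all differences, including the zeros
n <- length(z)
S <- (n * (n + 1) / 2)
# positive rank sum, plus half a unit per zero difference
U1 <- sum(Ry[Ry > 0], na.rm = TRUE) + 0.5 * sum(Ry0[Ry0 == 1], na.rm = TRUE)
# negative rank sum, minus half a unit per zero difference
U2 <- -sum(Ry[Ry < 0], na.rm = TRUE) + -0.5 * sum(Ry0[Ry0 == 1], na.rm = TRUE)
u_ <- U1 / S
f_ <- U2 / S
u_ - f_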
For this exact example, it provides the correct result (again, just my opinion).
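To make the proposal above easier to try without package internals, here is a minimal base-R sketch on the `sleep` data. It assumes that the internal rank helper returns signed ranks of the non-zero differences and `NA` for zeros, and it replaces that helper with a hand-written stand-in; the names `d`, `nz`, `signed_rank`, `n_zero`, and `r_rbs` are mine, not the package's.

```r
# Paired differences from the built-in sleep data (group 1 minus group 2)
z <- sleep$extra[sleep$group == 1] - sleep$extra[sleep$group == 2]
mu <- 0
d <- z - mu

# Stand-in for the internal helper (an assumption, not the package code):
# signed ranks of the non-zero differences, NA for differences equal to zero
nz <- d != 0
signed_rank <- rep(NA_real_, length(d))
signed_rank[nz] <- sign(d[nz]) * rank(abs(d[nz]))

n_zero <- sum(!nz)       # number of differences tied with mu
n <- length(d)           # all pairs, including the zeros
S <- n * (n + 1) / 2

# Same arithmetic as the proposal: half a unit per zero difference is
# added to the positive rank sum and subtracted from the negative one
U1 <- sum(signed_rank[nz & d > 0]) + 0.5 * n_zero
U2 <- -sum(signed_rank[nz & d < 0]) - 0.5 * n_zero

r_rbs <- U1 / S - U2 / S
r_rbs
```

With one zero difference and nine negative ones, this sketch evaluates to -0.8 instead of -1.00. Whether that is the right treatment of zero differences in general is exactly the question raised in this issue.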