Greetings from the Australian National University in Canberra, where Diane Kelly of the University of North Carolina is speaking on "Statistical power analysis for sample size estimation and understanding risks in experiments with users". Having struggled through a course in research methods, I was relieved to hear that there is no perfect sample size. Diane looked at some of the constraints on sample size, such as budget and time.
One critical decision that researchers must make when designing experiments with users is how many participants to study. In our field, the determination of sample size is often based on heuristics and limited by practical constraints such as time and finances. As a result, many studies are underpowered, and it is common to see researchers make statements like "With more participants significance might have been detected," but what does this mean? What does it mean for a study to be underpowered? How does this affect what we are able to discover about information search behavior, how we interpret study results, and how we make choices about what to study next? How does one determine an appropriate sample size? What does it even mean for a sample size to be appropriate? In this talk, I will discuss the use of statistical power analysis for sample size estimation in experiments. Statistical power analysis does not necessarily give researchers a magic number, but rather allows researchers to understand the risks of Type I and Type II errors given an expected effect size. In discussing this topic, the issues of effect size, Type I and Type II errors, and experimental design, including choice of statistical procedures, will also be addressed. I hope this talk will function as a conversation starter about issues related to sample size in experimental interactive information retrieval.
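To make the idea concrete, here is a minimal sketch of the kind of power calculation the abstract refers to, for the common case of a two-sided, two-sample t-test. It uses the standard normal approximation n = 2((z_{1-α/2} + z_{1-β})/d)², where d is the expected effect size (Cohen's d); the function name and default values are illustrative, not from the talk.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided,
    two-sample t-test, via the normal approximation
    n = 2 * ((z_{1-alpha/2} + z_{1-beta}) / d) ** 2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the Type I error rate
    z_beta = z.inv_cdf(power)           # quantile for the desired power (1 - Type II rate)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A "medium" effect of d = 0.5 at alpha = .05 and power = .80
# requires about 63 participants per group under this approximation
# (an exact t-based calculation gives a slightly larger number).
print(sample_size_per_group(0.5))  # → 63
```

Note how the required sample size grows quadratically as the expected effect shrinks: detecting a "small" effect of d = 0.2 at the same power needs roughly 400 participants per group, which is why so many user studies with a dozen or two participants are underpowered for anything but large effects.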