When performing statistical analyses it is often only necessary to take individual differences into account at the participant level. However, when people act as both participants and stimuli, there are two sources of individual difference variation. Both must be dealt with appropriately in order to maximize generalizability and reduce the likelihood of committing a Type I error. To illustrate this, we present data from two areas of research: face matching and face-voice matching. In investigations of face matching performance, participants are asked to decide whether two images of unfamiliar faces depict the same person. This task is surprisingly difficult. Existing studies acknowledge that some people are much better at the task than others, and furthermore, that performance depends on stimulus pairings; images of some people vary more than images of others. We explain that in such cases, only multilevel modeling can appropriately deal with both sources of variability, yet all previous face matching studies have used standard statistical techniques, aggregating over one source. Our own face matching data highlight the necessity of using multilevel modeling. In a further demonstration of the utility of this statistical method, we present data from face-voice matching tasks, in which we tested whether people look and sound similar. Comparing multilevel and traditional analyses highlights that the failure to simultaneously treat both participants and person stimuli as random effects can seriously affect the resulting conclusions.
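
The analysis described above, in which participants and stimuli are treated simultaneously as crossed random effects, can be sketched in Python with statsmodels. This is a minimal illustration on simulated data, not the analysis reported in the paper: the column names (`participant`, `stimulus`, `score`) and the simulated effect sizes are assumptions for demonstration only.

```python
# Sketch of a multilevel model with crossed random effects for
# participants and stimuli (simulated data; illustrative only).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_participants, n_stimuli = 20, 15

# Simulate a fully crossed design: every participant responds to
# every stimulus pairing. Each response reflects both a participant
# effect (ability) and a stimulus effect (pairing difficulty).
p_effect = rng.normal(0, 0.5, n_participants)  # participant ability
s_effect = rng.normal(0, 0.5, n_stimuli)       # stimulus difficulty
rows = []
for p in range(n_participants):
    for s in range(n_stimuli):
        score = 1.0 + p_effect[p] + s_effect[s] + rng.normal(0, 0.3)
        rows.append({"participant": p, "stimulus": s, "score": score})
df = pd.DataFrame(rows)

# statsmodels expresses crossed random effects through variance
# components: a single dummy group spans the whole data set, and each
# vc_formula term adds a random intercept per participant / stimulus.
model = smf.mixedlm(
    "score ~ 1",
    data=df,
    groups=np.ones(len(df)),      # one group covering all rows
    re_formula="0",               # no group-level random intercept
    vc_formula={"participant": "0 + C(participant)",
                "stimulus": "0 + C(stimulus)"},
)
result = model.fit()
print(result.summary())           # separate variance estimates for
                                  # participants and stimuli
```

Aggregating over either factor (e.g., averaging each participant's responses across stimuli before running an ANOVA) would discard one of these variance components, which is precisely the practice the abstract argues against.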