Problems in academic research

Writing in the Journal of International Business Studies, Klaus Meyer and his colleagues outline some problems with academic research and how that journal suggests they be dealt with. The problems they outline will be quite familiar to anyone following recent controversies in academia. The file drawer problem is that many studies are conducted but only the ones that turn out well tend to be published. Positive results can, and do, occur by chance: if enough people test any hypothesis, someone is going to find a positive result, even purely by chance. Those positive results get published, and we end up with published significant results for any number of theories, many of which seem to directly contradict each other.
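
How little it takes for chance alone to fill the file drawer is easy to simulate. The sketch below is mine, not the paper's: it runs a large number of studies of an effect that does not exist and counts how many clear the conventional p < .05 bar. The study count and sample sizes are arbitrary illustration choices.

```python
# A minimal sketch (not from the paper): simulate many studies of a true null
# effect and count how many happen to clear the p < .05 bar by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_per_group = 1000, 30   # illustrative numbers only

false_positives = 0
for _ in range(n_studies):
    # Both groups come from the SAME distribution, so any "effect" is pure chance.
    a = rng.normal(0, 1, n_per_group)
    b = rng.normal(0, 1, n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_studies} null studies came out 'significant'")
# Roughly 5% do. If only those get written up and published, the file drawer
# holds the other ~95% and the literature looks full of real effects.
```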

Another problem can be very neatly illustrated. The authors graph the distribution of reported p-values in a number of major management journals. (There are similar graphs for other journals.) They show a relatively smooth curve but with a bizarre pattern at around p = .05, which is completely inconsistent with the rest of the graph. p < .05 is often used as the level at which results count as significant, so academics have an incentive to get under p = .05 in order to be published, and lots of papers miraculously seem to do just that. “The combination of a spike just above the p-value of 0.05 and the valley just below in the distribution of p-values close to the critical value of 0.05 (critical from a reporting point of view) corresponds with similar findings in economics and psychology” (Meyer, van Witteloostuijn, Beugelsdijk, 2017, page 539). How could this come about? Scholars have many methods to (consciously or unconsciously) influence the findings. For example, they often decide which participants were not valid, e.g. those not paying sufficient attention, or choose to drop “outliers”. Any time you give researchers this kind of discretion, you run the risk that reported results come out better than they otherwise would (even without scholars simply making up results).
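
One hedged sketch of how that pattern can arise: give each study a few extra chances, dropping the most extreme observations and re-testing whenever the first result misses p < .05, and stop as soon as the threshold is crossed. None of this is the authors' analysis; the numbers below are illustrative.

```python
# A sketch of one "researcher degrees of freedom" mechanism: re-test after
# discarding "outliers" until the result slips under p < .05 (or we give up).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies, n_per_group, max_drops = 1000, 30, 3   # illustrative numbers only

significant = 0
for _ in range(n_studies):
    a = rng.normal(0, 1, n_per_group)
    b = rng.normal(0, 1, n_per_group)
    _, p = stats.ttest_ind(a, b)
    drops = 0
    while p >= 0.05 and drops < max_drops:
        # Discretion at work: drop the most extreme point in each group, re-test.
        a = np.delete(a, np.argmax(np.abs(a - a.mean())))
        b = np.delete(b, np.argmax(np.abs(b - b.mean())))
        _, p = stats.ttest_ind(a, b)
        drops += 1
    if p < 0.05:
        significant += 1

print(f"{significant} of {n_studies} null studies ended up 'significant'")
# The false-positive rate rises above the nominal 5%; fiddling that stops the
# moment the threshold is crossed is one way p-values pile up near .05.
```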

Also, “…scholars often conduct many tests, and develop their theory ex post but present it as if the theory had been developed first” (Meyer, van Witteloostuijn, Beugelsdijk, 2017, page 539). HARKing (hypothesizing after the results are known) is not too dissimilar to being allowed to put up the target after you have fired the arrow. You will tend to find something significant if you are allowed to look as often as you like.
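
The arithmetic behind HARKing is easy to sketch. Assuming, purely for illustration, a null study that measures twenty unrelated outcomes and then builds its “hypothesis” around whichever one clears p < .05:

```python
# A minimal sketch of HARKing via multiple testing: measure many unrelated
# outcomes in one null dataset, then "hypothesize" whichever one came out
# significant. Outcome count and sample size are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_per_group, n_outcomes = 30, 20

treatment = rng.normal(0, 1, (n_per_group, n_outcomes))  # no true effect anywhere
control = rng.normal(0, 1, (n_per_group, n_outcomes))

p_values = [stats.ttest_ind(treatment[:, k], control[:, k]).pvalue
            for k in range(n_outcomes)]
hits = [k for k, p in enumerate(p_values) if p < 0.05]
print(f"'Significant' outcome indices: {hits}")
# With 20 shots at p < .05 the chance of at least one hit is about
# 1 - 0.95**20, roughly 64%, even though nothing real is going on.
```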

To cut down on poor (or even sometimes plainly fraudulent) research methods, the authors make a number of suggestions. Most radically, these involve dropping the reporting of cut-off significance levels altogether, to avoid any fixation on p = .05. I have sympathy with these kinds of changes coming to academic research. They won't correct all the problems, but they are probably a step in the right direction.

Read: Klaus E. Meyer, Arjen van Witteloostuijn, and Sjoerd Beugelsdijk (2017), “What’s in a p? Reassessing best practices for conducting and reporting hypothesis-testing research,” Journal of International Business Studies, 48(5), pages 535–551