Since this study just used Mechanical Turk, it's entirely possible that this is nothing.
Suppose x% of participants are bots or answer randomly. If we're measuring a trait that isn't significantly more common than x%, then a big portion of the answers for any atypical response (e.g. "I hate other people", "I prefer the taste of straight coffee beans") will come from random bots, and those atypical answers will correlate with each other.
Why would random answers correlate? Statistical significance for something like this is all about rejecting results consistent with randomness. Correlation means it appears to be non-random.
So for a simple example:
Suppose 1/20 people are sadistic, and 1/20 people love eating bitter food.
Let's suppose each question is a simple True/False choice.
Let's also suppose 1/10 of respondents are bots that answer randomly.
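Of the people who answer that they like sadism on a given question, roughly half will be bots: bots answering True make up 1/10 × 1/2 = 5% of all respondents, versus 9/10 × 1/20 = 4.5% from genuine sadists, so about 53% of the "yes" answers are bots. The same goes for the people who say they like bitterness.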
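If you want to check the bot share yourself, here's a minimal sketch using Bayes' rule. It assumes bots pick True exactly half the time and that humans answer honestly; the function and parameter names are just mine for illustration.

```python
from fractions import Fraction

def bot_share_among_yes(bot_rate, trait_rate, bot_yes_rate=Fraction(1, 2)):
    """Fraction of all 'True' answers on one question that come from bots.

    Assumptions (mine, for illustration): bots answer True with
    probability bot_yes_rate, humans answer True only if they
    actually have the trait.
    """
    bot_yes = bot_rate * bot_yes_rate          # bots who happen to say True
    human_yes = (1 - bot_rate) * trait_rate    # humans who genuinely have the trait
    return bot_yes / (bot_yes + human_yes)

# 1/10 bots, 1/20 of humans have the trait
share = bot_share_among_yes(Fraction(1, 10), Fraction(1, 20))
print(share, float(share))
```

With these inputs the share comes out to 10/19, so a bit over half of the "yes" answers on each question are bots.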
For simplicity's sake, consider a two-question survey (one question about sadism, one about bitter food).
In this case you will get the following numbers, even if there's no genuine correlation:
[One bot in each category]
- 1/40 like both
- 3/40 like bitterness but NOT sadism
- 3/40 like sadism but NOT bitterness
- 33/40 like neither
So you would conclude that if you like bitterness (4 people) you have a 1/4 (25%) chance of liking sadism, whereas if you don't like bitterness (36 people) you have a 3/36 (about 8%) chance of liking sadism. Liking bitterness would therefore appear to predict liking sadism, when really both are just predictors of being a bot.
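The 40-person table above rounds things to whole people. Here's a rough sketch that works out the exact expected proportions instead, assuming the two traits are completely independent among humans and bots flip a fair coin on each question (the names are hypothetical, not from any study):

```python
from fractions import Fraction

def observed_cells(bot_rate, p_sadism, p_bitter):
    """Expected share of respondents in each (sadism, bitter) answer cell.

    Assumes the traits are independent among humans and that bots
    answer each True/False question with a fair coin flip.
    """
    human_rate = 1 - bot_rate
    bot_cell = Fraction(1, 4)  # a bot lands in each of the 4 cells equally often
    cells = {}
    for sadism in (True, False):
        for bitter in (True, False):
            p_s = p_sadism if sadism else 1 - p_sadism
            p_b = p_bitter if bitter else 1 - p_bitter
            cells[(sadism, bitter)] = human_rate * p_s * p_b + bot_rate * bot_cell
    return cells

cells = observed_cells(Fraction(1, 10), Fraction(1, 20), Fraction(1, 20))

# P(says sadism | says bitter) and P(says sadism | says not bitter)
p_s_given_b = cells[(True, True)] / (cells[(True, True)] + cells[(False, True)])
p_s_given_not_b = cells[(True, False)] / (cells[(True, False)] + cells[(False, False)])

print(float(p_s_given_b), float(p_s_given_not_b))
```

This gives roughly 29% versus 7.5%, in the same ballpark as the 25% versus 8% from the rounded table, even though there is zero real association between the traits in the setup.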
Does this mean that people like their coffee like their personality?
This is definitely going to be one of those studies that fails to replicate.
Right, just like an aquiline nose and a steep brow.
Edit to add: 2016