Speaker 1
00:00:00 - 00:00:17
Academia is broken. Universities are broken. The way that academic research is published is broken. That's the message that's come through loud and clear over the last few weeks, thanks to 3 articles concerning the research of Francesca Gino. If you don't know what I'm talking about, let me explain.
Speaker 1
00:00:17 - 00:00:34
Francesca Gino is a professor of behavioral science at Harvard University. She is extremely well-known in the field. I've talked about her research to clients before. I've recommended books on this channel to you guys that use her work as a key reference. I've used her research before as references in my own essays and work that I did at university.
Speaker 1
00:00:34 - 00:01:09
When it comes to academic fame, Francesca Gino is up there, as you would expect from someone who is a professor at Harvard. However, the reason why she's so well known is that her research tends to produce a lot of very surprising findings. Now, some people just think this research is cool and don't think much more about it, but a lot of people in the industry have been quite sceptical of Francesca Gino and her work because her results just seem a little bit too good. Her hypotheses are really wacky, and yet they always seem to be proved correct. The effect sizes from her studies seem to be really large, and her statistical significance just seems a little bit too significant.
Speaker 1
00:01:09 - 00:01:44
So while some of us have been skeptical of her work for a while, nobody had taken the time to actually investigate her research and go into her data to see if they could find anything fishy, until now. These 3 guys, Uri, Joe and Leif, are also professors of behavioral science and other related subjects at different universities across the world. And they took it upon themselves to investigate Francesca Gino and her data to see if there was anything fishy going on. And spoiler alert, they found a lot of fishy stuff in the data, and that's what the 3 articles that they released are talking about. Each article relates to a different study by Francesca Gino, and in this video, I'm going to be taking you through each one.
Speaker 1
00:01:44 - 00:02:18
The results of their investigation are shocking and damning for Francesca Gino, but I think they speak even louder volumes about the state of academia in general, and that's what I'm going to be concluding on at the end of this video. So without further ado, let's jump into the first study. This first article is called Clusterfake, and it refers to a paper written by Gino in 2012 along with her collaborators Shu, Nina Mazar, Dan Ariely, and Max Bazerman. The fact that I know the first names of all of those researchers, with the exception of Shu, should tell you that all of them are very well-known people in the field of behavioral science. So in this study, they were trying to get participants to be more honest.
Speaker 1
00:02:19 - 00:02:49
And their hypothesis was that if you put an honesty pledge at the top of a form, that'll make people more honest when they then fill out the rest of the form. So all of the studies in this paper by these authors were looking at this idea, that if you put an honesty pledge at the top of a form, people will be more honest than if you put the honesty pledge at the bottom of a form. Now the first study in this paper was led by Francesca Gino, our protagonist. So in this study, students were brought into a lab to complete 20 math puzzles in 5 minutes. The students were told that they would be paid $1 for each math puzzle they solved correctly.
Speaker 1
00:02:50 - 00:03:13
And the way that this worked was that when students walked into the room, there were 2 pieces of paper: their work paper and their report paper. On the work paper, they'd write down their workings for the math questions and, of course, their answers. Then on the report paper, they would report how many answers they got correct and therefore how much they should get paid. The students were told that before handing in their report paper to the researchers and getting paid, they should shred their original work paper.
Speaker 1
00:03:14 - 00:03:59
The idea behind this is that by shredding their work paper, there's then a stronger incentive for them to cheat on the report paper and lie about how many answers they got correct, since the researchers, in theory, should never know how many answers they actually got right on the work paper. But what the students didn't know was that the shredder at the back of the room was not a normal shredder. In order to test the researchers' hypothesis, the participants were split into 2 groups on the reporting paper: half of them had an honesty pledge at the top of the paper, and half of them had an honesty pledge at the bottom of the paper. The idea being, of course, that those who signed the honesty pledge at the top would then cheat less going forward.
Speaker 1
00:03:59 - 00:04:30
So, what was the result? Well, the result showed a massive effect from this simple intervention. According to what was originally published in the study, of the students who signed the honesty pledge at the top of the form, only 37% lied, but when students signed at the bottom of the form, 79% lied. This is a massive effect size that the researchers are reporting, and as a result this study gained a lot of public attention; I have talked about it with many people in the past because it is so surprising. But that's why these vigilantes were suspicious.
Speaker 1
00:04:30 - 00:04:52
The results just seem a bit too good. Can it really be the case that simply moving an honesty pledge from the bottom to the top of a form can have such a dramatic effect on the amount of cheating that happens? It seems pretty unlikely. So our vigilantes managed to source the original dataset that was published by the authors of the study. And when they looked into the data, it just seemed a little bit fishy.
Speaker 1
00:04:52 - 00:05:24
If you look at this table, and specifically at the left-hand column, the P# column, this refers to the participant ID: the unique ID given to each participant in a study. And as is highlighted in yellow, there are some weird anomalies in the way that this data has been sorted. Because when you look at this data, it seems obvious that it has been sorted first by condition, so all of condition 1 are together, then all of condition 2 are together, and then in ascending order of participant ID, which means the numbers should consistently get bigger as you go down the column and there should be no duplicates.
Speaker 1
00:05:24 - 00:05:41
Remember each participant has a unique ID. So when you look at this data it's a bit weird. We've got 2 49s here, that's a duplicate, that should never happen. And then at the end of the condition 1 set of participants, you have participant 51 coming after 95, then 12, then 101. Like that sequence doesn't make any sense.
Speaker 1
00:05:41 - 00:06:29
And similarly, when you get to condition 2, we start with 7, then 91, then 52, then all the way back down to 5 again. These entries in the dataset look suspicious; they look like they're out of sequence, which suggests that somebody may have tampered with them. So our vigilantes are suspicious of these rows. You then have to ask the question: why would the researchers want to tamper with the data? Well, it's because they would want to show a bigger effect than the one actually seen in the real data. The more dramatic the effect of the intervention, the more surprising the result of the study, and therefore the more likely it is to get published in a top journal, the more likely it is to make a lot of press headlines, and the more likely it is that they will get lots of interviews and work off the back of it. So there's a strong incentive for the researchers to fudge the data a little bit and make the effect seem larger than it really is, and that's what our vigilantes were looking for.
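To make the sorting argument concrete, here's a minimal sketch of the kind of check being described: within each condition, participant IDs should be unique and strictly ascending, so any duplicate, or any ID that is smaller than the one before it, gets flagged. This is an illustrative reconstruction, not the investigators' actual code; the column names and toy IDs are made up.

```python
import pandas as pd

def flag_out_of_order(df: pd.DataFrame) -> pd.Series:
    """Flag rows whose participant ID breaks the expected sort order within its condition."""
    flags = pd.Series(False, index=df.index)
    for _, block in df.groupby("condition", sort=False):
        ids = block["p_id"]
        duplicated = ids.duplicated(keep=False)       # e.g. two 49s appearing
        not_ascending = ids.diff().fillna(1) <= 0     # e.g. 51 appearing after 95
        flags.loc[block.index] = duplicated | not_ascending
    return flags

# Toy data echoing the pattern described above (real column names will differ):
toy = pd.DataFrame({
    "condition": [1, 1, 1, 1, 1, 2, 2, 2],
    "p_id":      [3, 49, 49, 95, 51, 7, 91, 5],
})
toy["suspicious"] = flag_out_of_order(toy)
print(toy)
```

Anything flagged this way isn't proof of fraud on its own, but it tells you exactly which rows to scrutinise, which is what the next step does.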
Speaker 1
00:06:29 - 00:07:05
They wanted to see if the suspicious rows in the dataset showed a bigger effect than the normal data that wasn't suspicious. And sure enough, that's exactly what they found. If you look at this graph, the red circles with crosses show the suspicious data and the blue dots show the unsuspicious data. And as you can see, the circled data points are the most extreme ones, meaning that these few data points are inflating the effect size. Now, the article goes on to show how our vigilantes did some very clever work to unpack the Excel file that this data was stored in, and they were able to show quite clearly that these suspicious rows were manually re-sorted in the dataset.
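Here's a rough sketch of that comparison, assuming the suspicious rows have already been flagged as above: compute the gap in mean reported performance between the two pledge conditions once with every row and once with the flagged rows removed. The column names, condition labels, and numbers are placeholders purely for illustration.

```python
import pandas as pd

# Toy data: condition 1 = pledge at top, condition 2 = pledge at bottom (labels assumed).
# "suspicious" stands in for the out-of-order flags from the previous sketch.
toy = pd.DataFrame({
    "condition":       [1, 1, 1, 1, 2, 2, 2, 2],
    "reported_solved": [4, 5, 6, 0, 7, 8, 9, 20],
    "suspicious":      [False, False, False, True, False, False, False, True],
})

def pledge_effect(frame: pd.DataFrame) -> float:
    """Mean reported score with the pledge at the bottom minus with the pledge at the top."""
    means = frame.groupby("condition")["reported_solved"].mean()
    return means.loc[2] - means.loc[1]

print("effect, all rows:          ", pledge_effect(toy))
print("effect, suspicious dropped:", pledge_effect(toy[~toy["suspicious"]]))
```

In the toy numbers, dropping the flagged rows shrinks the gap considerably, which is the same qualitative pattern the article reports for the real dataset.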
Speaker 1
00:07:05 - 00:07:34
I won't go into it in this video because it's quite technical, but I'll have a link to all of these articles in the description if you want to read them in full. But as you'll soon see, this theme of suspicious data showing extremely strong effect sizes will be a recurring pattern. So let's move on to study 2. This second article is called My Class Year Is Harvard, and you'll see why in a second. It looks at a study from 2015 written by Francesca Gino along with Kouchaki and Galinsky, again 2 fairly well-known researchers in the field.
Speaker 1
00:07:34 - 00:07:59
Now the hypothesis for this study, in my opinion, is pretty stupid. The hypothesis is that if you argue against something that you really believe in, that makes you feel dirty, which then increases your desire for cleansing products, which is kind of silly in my opinion. But nevertheless, this is what they were researching. So this study was done at Harvard University with almost 500 students. And what they asked the participants to do was the following.
Speaker 1
00:07:59 - 00:08:18
So students of Harvard University were brought into the lab and then asked how they felt about this thing called the Q-Guide. I don't really know what the Q-Guide is but apparently it's a hot topic at Harvard and it's very controversial. Some people are for it, some people are against it. So when they were brought into the lab they were asked how do you feel about the Q-Guide and they either said they were for or against it. And then the participants were split into 2 groups.
Speaker 1
00:08:18 - 00:08:43
Half the participants were asked to write an essay supporting the view that they just gave. So if they said, I'm for the Q-Guide, they then had to write an essay explaining why they were for the Q-Guide. But the other half were asked to write an essay arguing the opposite of the view that they just gave. So if they said, I'm for the Q-Guide, they would then have to write an essay explaining why they should be against the Q-Guide. Again, the idea being that writing an essay against what they actually believe in would make them feel dirty.
Speaker 1
00:08:43 - 00:09:13
After they'd written this essay, they were then shown 5 different cleansing products. And the participants in the study had to rate how desirable they found these cleansing products on a scale of 1 to 7, with 1 being completely undesirable and 7 being completely desirable. And again, the authors found a strong effect. You can see here that the P value is less than 0.0001. And for those of you who haven't had any academic training in statistics, basically when you're doing a study like this, you're looking for a P value that's less than 0.05.
Speaker 1
00:09:13 - 00:09:36
That's the industry standard. If it's less than 0.05, you say, yes, this result is statistically significant; it's very unlikely I'd see a difference this big just by chance if the manipulation did nothing. So less than 0.0001 is an extremely strong result: you're saying there's essentially no way you'd see data like this by chance alone if the manipulation had no effect. So once again, our vigilantes are suspicious of this very strong result.
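To make that concrete with a quick illustration (this is not the paper's actual analysis, just a demonstration of what the threshold means), here's a two-sample t-test on made-up 1-to-7 desirability ratings:

```python
from scipy import stats

# Hypothetical ratings purely for illustration: one group argued their own side,
# the other argued the opposite side. All values are invented.
argued_own_side   = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
argued_other_side = [6, 7, 5, 7, 6, 7, 6, 5, 7, 6]

t_stat, p_value = stats.ttest_ind(argued_own_side, argued_other_side)
# The p-value is the chance of seeing a gap at least this big between the two groups
# if the manipulation actually made no difference at all.
print(f"t = {t_stat:.2f}, p = {p_value:.6f}")   # p < 0.05 is the usual cutoff for "significant"
```

The smaller the p-value, the harder the result is to explain as noise, which is exactly why implausibly tiny p-values attached to implausibly large effects attracted the vigilantes' attention.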
Speaker 1
00:09:36 - 00:10:11
So they managed to source the data online and do a little bit of investigating. And what they found were some weird anomalies in the demographic data that participants have to give when they enter the study. It's very common in psychological studies for participants to give a little bit of demographic data about themselves, which gives the researchers a bit more flexibility in how they cut up the data later on. So in this particular study, the participants were asked a number of demographic questions, including their age, their gender, and then number 6 was what year in school they were. Now, the way this question is structured isn't very good in my opinion in terms of research design, but nevertheless there are a number of acceptable answers that you can give to 'year in school'.
Speaker 1
00:10:11 - 00:10:34
Because Harvard is an American school, you might say I'm a senior, right, which is a common thing, or a sophomore. You might write the year that you're supposed to graduate, 2015, 2016, etc. Or you might write a 1, a 2, a 3, a 4, or a 5 to indicate how many years you've been in school. These are all different answers, but they're all acceptable and make sense in the context of being asked what year in school you are. And so when our vigilantes went into the data, that's exactly what they saw in this column.
Speaker 1
00:10:34 - 00:10:48
A range of different answers that were all acceptable, except for one. There were 20 entries in this dataset where the answer to the question, what year in school are you, was Harvard. That doesn't make any sense. What year in school are you? Harvard.
Speaker 1
00:10:49 - 00:11:03
What? Right? That doesn't make any sense. And the other thing that was suspicious about these Harvard entries is that they were all grouped together within 35 rows. Again, this was a dataset of nearly 500 different participants, and yet all of these weird Harvard answers were within 35 rows.
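As a small illustration of the check being described here (an assumed reconstruction with a made-up column name, not the investigators' code), you can pull out the rows whose year-in-school answer is literally "Harvard" and look at how tightly their positions cluster:

```python
import pandas as pd

def harvard_cluster(df: pd.DataFrame, col: str = "year_in_school"):
    """Return the positions of 'Harvard' answers and the size of the row window containing them."""
    hits = df.index[df[col].astype(str).str.strip().str.lower() == "harvard"]
    span = int(hits.max() - hits.min() + 1) if len(hits) else 0
    return hits, span

# Tiny toy frame standing in for the ~500-row dataset:
toy = pd.DataFrame({
    "year_in_school": ["Senior", "2016", "3", "Harvard", "Harvard", "Sophomore", "Harvard", "2"],
})
hits, span = harvard_cluster(toy)
print(f"{len(hits)} 'Harvard' answers, spread over a window of {span} rows")
```

Twenty such answers all landing inside a 35-row window of a roughly 500-row file is the kind of clustering that is hard to explain if the entries were genuine.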
Speaker 1
00:11:03 - 00:11:25
So once again, our vigilantes treat these Harvard answers as suspicious data entries. They mark them with red circles and crosses. And as you can see, the suspicious ones are, again, the most extreme answers supporting the researchers' hypothesis. With the exception of this one, but come on. It's most suspicious when you look at the ones in the "argued other side" condition.
Speaker 1
00:11:25 - 00:12:05
So these are the people who wrote an essay arguing against what they actually believed in, and who were therefore supposed to feel dirtier and find cleansing products more appealing. All of these suspicious entries on that side of the manipulation went for 7, meaning they rated all of the cleansing products as completely desirable. And so what our vigilantes go on to say is that these were just the 20 entries in the dataset that looked suspicious because of this Harvard answer to the demographic question. But who's to say that the rest of the dataset wasn't also tampered with, just more carefully, without a giveaway like Harvard in this column? Since it seems pretty clear that at least these 20 entries were manipulated in some way, it probably means that there are other entries within this dataset that were also tampered with. Are you shocked yet?
Speaker 1
00:12:05 - 00:12:30
I hope you are, but it's about to get worse, because there's a third article to do with Francesca Gino. This third article was released literally yesterday, the day before I'm filming this video, and it's called The Cheaters Are Out of Order. It concerns a paper written by Francesca Gino and a guy called Wiltermuth. I don't know Wiltermuth, but again, I find it incredibly ironic that all of this cheating and data faking is being done by researchers who are studying the science of honesty. It is incredibly ironic.
Speaker 1
00:12:30 - 00:13:05
So in this third study, Gino and her co-author are investigating the idea that people who cheat, who lie, who are dishonest, are actually more creative. They call the paper Evil Genius? How Dishonesty Can Lead to Greater Creativity. So let's quickly go through how the study worked. Participants were brought into a lab where they sat at a machine with a virtual coin-flipping mechanism. The participants were asked to predict whether the coin would come up heads or tails, and then they would push a button to actually flip the coin.
Speaker 1
00:13:05 - 00:13:32
And if they had predicted correctly whether it would come up heads or tails, they would get a dollar. So again, there's a strong incentive to cheat. The participants would write down on a piece of paper how many predictions they got correct, and then they would hand that to the researcher in order to get paid. But of course, the researcher could then go back and look at the machine they were flipping the coin on, see how many they actually got correct, and work out how many times that participant had cheated. So after they completed the coin-flipping task, they were then given a creativity task.
Speaker 1
00:13:32 - 00:13:57
And the creativity task was, how many different uses can you think of for a piece of newspaper? So in psychology, this is a pretty common technique for testing creativity. You give somebody an inanimate object and then you say, how many uses can you think of for this inanimate object? And again, with this study, we see a very strong effect size. Remember, the magic number that academics look for is P less than 0.05, and here we have P less than 0.001.
Speaker 1
00:13:58 - 00:14:24
So basically what that means is that it's extremely unlikely the academics would see an effect like this just by chance if the manipulation did nothing. So again, our vigilantes are suspicious, but this one is interesting, because our vigilantes were able to get the dataset from Gino several years ago. So they got this dataset directly from Gino. And again, when our vigilantes look into the data, they find some weird things going on. As you can see, it seems to be sorted by 2 things.
Speaker 1
00:14:24 - 00:14:41
Firstly, by the number of times the participant cheated, so all of the people who didn't cheat at all are zeros. And secondly, by the number of responses, which is the number of different uses for a newspaper that the participant could come up with; those are clearly ranked in ascending order. But as you can see from this next screenshot, some of the cheaters are out of order.
Speaker 1
00:14:41 - 00:15:06
So these are the people who cheated once, who basically over-reported one time, and the number of uses that they could come up with for the newspaper is out of sequence. Here we have 3, 4, 13, then 9, then back down to 5 again, then back up to 9, then 5, then 9, then 8, then 9. It's just a total mess, right? So these ones that are highlighted in yellow are the suspicious ones. They're the ones that are out of order according to how the data appears to have been sorted.
Speaker 1
00:15:06 - 00:15:33
So what our vigilantes did was take this dataset and add new columns, which they called imputed low and imputed high. What that basically means is that rather than taking the number of responses written down in the original dataset, they ask: where does this entry sit in the ranking order? And then they replace the value that is given with what the value should really be, given its position. So if it sits between a 4 and a 5, then that number should be either 4 or 5, depending on whether it's imputed low or imputed high. Does that make sense?
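Here's a minimal sketch of that imputation idea as I understand it from the description: assume the out-of-order rows have already been flagged, then for each flagged row take the nearest in-order value in the rows above it (imputed low) or in the rows below it (imputed high). The column names and toy numbers are placeholders, not the investigators' actual code.

```python
import pandas as pd

def add_imputed_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Replace flagged response counts with the value implied by the row's position:
    the nearest non-suspicious value above it (imputed low) or below it (imputed high)."""
    in_order = df["responses"].where(~df["suspicious"])   # blank out the flagged values
    out = df.copy()
    out["imputed_low"] = df["responses"].where(~df["suspicious"], in_order.ffill())
    out["imputed_high"] = df["responses"].where(~df["suspicious"], in_order.bfill())
    return out

# Toy block of one-time cheaters whose counts should be ascending; the flag marks the
# out-of-order entry, mirroring the rows highlighted in the article.
toy = pd.DataFrame({
    "responses":  [3, 4, 13, 5, 9],
    "suspicious": [False, False, True, False, False],
})
print(add_imputed_columns(toy))
```

In the toy data, the out-of-order 13 sits between a 4 and a 5, so it becomes 4 when imputed low and 5 when imputed high. Re-running the original test on the imputed values instead of the reported ones is the step described next, and it's where the headline effect falls apart.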
Speaker 1
00:15:33 - 00:16:16
So once again, our vigilantes plotted the data, with suspicious entries marked with a circle and a cross, and as you can see, the suspicious entries are the ones that deviate from the pattern you see in the non-cheaters, the blue line. In other words, the ones that are out of order, the suspicious entries, are the ones driving the effect. But when you use the imputed position, the number implied by the row the entry sits in, suddenly the entire effect disappears, and the group of cheaters shows a very similar pattern to the group of non-cheaters. And the statistical consequence of this is dramatic. Remember, the original p-value for this study was far below the 0.05 threshold, but once you use the values implied by the rows, the significance completely disappears.
Speaker 1
00:16:16 - 00:16:53
It then goes to p equals 0.292 or p equals 0.180, depending on whether you're imputing low or high. Remember, for an academic result to count as significant the standard is p less than 0.05, and here the p is clearly much more than 0.05. The article goes on; the vigilantes do a bit more analysis to back up the point and drive home the fact that this data is very suspicious. I won't go into the details now; again, all of these articles are linked in the description, so go check them out. And you'll notice that these were all called Part 1, Part 2, Part 3. That's because this is actually a four-part series, so I'm expecting a fourth article to come out after this video is published, looking at yet another study from Francesca Gino.
Speaker 1
00:16:53 - 00:17:23
But I hope by this point you get the picture. There are a number of studies conducted by Francesca Gino with very suspicious-looking data. So at this point you're probably wondering, how did Harvard allow this? And the short answer is, well, they don't really seem to have done. If you go to Francesca Gino's page on the Harvard website, it shows that she's on administrative leave (I think we all know what that means), and Harvard, who have even more access to Francesca Gino's data than our vigilantes do, have since asked for several of her papers to be retracted from the journals they were originally published in.
Speaker 1
00:17:23 - 00:17:45
Now, this is a bad look for Francesca Gino, right? And we can't be sure that it was Francesca Gino herself who was doing this manipulation; it could be one of her co-authors. But given that she's the common thread between all of these different papers, it seems pretty likely that it was her. In the world of psychology and of writing good-quality academic papers, this is really bad. It's not only bad for Francesca Gino, but it's bad for the field as a whole.
Speaker 1
00:17:45 - 00:18:23
It casts doubt over the entire field of behavioural science, because we don't know the extent of the damage that bad actors like Gino have been causing in the field, or for how long. Like I said, Francesca Gino has been a prominent name in the field for years, gaining a position at one of the top universities, Harvard. So who's to say that this isn't a problem that is rife amongst many other researchers in the field? We certainly hope not, but you can't really know, when someone as high profile as this has been engaging in this kind of behaviour for years and getting away with it. It also looks bad for people like me who work in the industry, who trust these academics to publish good-quality research that we then use to try and influence real-world change in businesses, in government, and so on.
Speaker 1
00:18:23 - 00:18:54
Like I said, I've used Gino's work before to make recommendations to my clients. And in the past I've recommended that you guys read Dan Ariely's book, The Honest Truth About Dishonesty. A book which I no longer recommend, since the paper discussed in the first article here is used heavily as a reference for a lot of the claims that Ariely makes in that book. And while it's tempting here to just completely lay into Francesca Gino and really have a go at her for this kind of bad behaviour, I actually kind of understand why she did it, right? If you're an academic at a top institution like Harvard, you are under an enormous amount of pressure to consistently publish surprising results.
Speaker 1
00:18:55 - 00:19:27
Surprising results with big effect sizes are more likely to get published in top journals, win you more press interviews, and basically cement your position at a top university like Harvard. So there is a strong incentive for academics to fudge data like this and come up with more surprising results in order to maintain their position. I'm not condoning the behaviour in the slightest; it's completely unacceptable that an academic would do this. But I can somewhat empathise with the fact that she's under a lot of pressure, and I can see how the incentives are working against the practice of good science. But what do you guys think of Francesca Gino and all of this nonsense? Let me know in the comments below.
Speaker 1
00:19:27 - 00:19:45
Please go read the articles that are in the description. Thank you to Uri, Joe and Leif for publishing this research. You guys are absolute legends. And Francesca Gino, if you're watching this video, I know you must be going through a really rough time right now, having your entire career ripped away from you so publicly like this. While I think that what you did is completely unacceptable, please don't do anything stupid with your own life.
Speaker 1
00:19:45 - 00:19:45
You're still a valuable human being. But thank you guys so much for watching, and I'll see you.