Saturday, May 26, 2007
I think I just self-plagiarized all over myself...
That brings me to another reason why I think about plagiarism: "Self-Plagiarism." Many of my talks and presentations and technical writeups are copied from each other. In other words, once I've found a way to describe a piece of equipment or a physical process, I simply recycle that piece of text without attribution. I mean, I wrote the damn thing in the first place, right? That should be okay, right? Well, if you google "self plagiarism," one of the first links that comes up is this website by Miguel Roig written for the Office of Research Integrity at the US Department of Health and Human Services. Three of the things that Roig lists under self-plagiarism are "redundant or dual publications," "salami slicing," and "text recycling."
An example of dual publication is when the same work or paper is published in two different journals. I personally have never seen this and have trouble believing people try to pull this off. OK, never mind: my cubicle buddy just informed me that theorists (snicker) do this all the time. Salami slicing, or to quote Roig, "the segmenting of a large study into two or more publications," is considered "unacceptable scientific practice." Really? If I understand this correctly, then we do this all the time. Without getting too technical, let me explain: we measured quantities I'll call "A1" and "A2" as a function of another variable "v." One paper we published was essentially "(A1-A2)/v." We then published another paper that was essentially "(A1-A2)v^3."...And then three more papers were published that were literally different linear combinations of A1 and A2. I'm almost almost almost not kidding. The physics of these "derived" quantities are related, but different. Even though all the data came from one experiment taken over a single time interval, is this still self-plagiarism and unacceptable?
Here's another situation that Roig talks about (called data augmentation): "when a researcher publishes a study and subsequently collects additional data, which typically end up strengthening the original effect, and publishes the combined results as a new study." Guess what, we've done exactly this as well (see first two links)! Again without going into the details, we count the number of electrons that hit the detectors after they bounce off the target. Being a "counting" experiment, the relative statistical uncertainty scales as the inverse of the square root of the number of electrons counted. We took data in three chunks over two years. Our preliminary results were published after the first year and our final combined results were published after the experiment ended. This appears to be an almost perfect example of, in Roig's words, "old data that has been merely augmented with additional data points and that is subsequently presented as a new study." Roig calls this practice a "serious ethical breach."
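To make the counting-statistics point concrete, here is a minimal sketch of how the relative uncertainty shrinks as more electrons are counted. The counts below are made-up round numbers chosen for illustration, not our actual data, and the function name is just something I picked.

```python
import math

def relative_uncertainty(n_counts):
    """Relative statistical uncertainty of a counting measurement: 1 / sqrt(N)."""
    return 1.0 / math.sqrt(n_counts)

# Hypothetical counts (NOT the real experiment): a first chunk of data,
# and a combined data set four times larger.
first_chunk = 1_000_000
combined = 4_000_000

print(relative_uncertainty(first_chunk))  # 0.001
print(relative_uncertainty(combined))     # 0.0005
```

Quadrupling the number of counts only halves the relative uncertainty, which is why a precision measurement can need a lot more running time than a first "is it zero or not" result.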
Finally, he gets to the question that I originally had about text recycling: "a writer’s reuse of portions of text that have appeared previously in other works." Roig gives examples when this is acceptable and when it is "borderline or unacceptable." As you can probably guess by now, we've done it. I'll spare you the details. So...how do I reconcile these things? Well, first of all, the things that I described are fairly common practice in the field I work in: nuclear physics. This is what I call the "cultural differences" defense. Roig makes many good arguments for why self-plagiarizing is bad in the "biomedical and social sciences" arena...but can analogous arguments be made for other fields? I don't know, but maybe I've been "cultured" to believe that what we do is okay. When I think about text recycling, I feel it's no different from using the same figure depicting an experimental apparatus over and over again. Should you have to make a unique diagram for each new publication if the experimental apparatus is the same? I would say no, but then what's the difference between that diagram and the text used to describe that diagram? And what about our salami slicing? Well, as my cubicle buddy argued, all of those articles were published in a journal that has a limit of ~4ish pages per article. There is no way that we could cover ~20 pages of physics results in ~4 pages. This is what I call the "It's not our fault" defense.
Finally, the trickiest one is data augmentation. In the example I used for what we did, the two papers had a different emphasis. Our first year data was a "new" result in the sense that no one had measured it before and it could have been "zero." The fact that the result was not "zero" was a significant finding itself: it was consistent with what we call the Standard Model of Physics. In our second paper, we were interested in seeing if there was any small deviation in the quantity that we were measuring from the theoretical value. This question required more data so that we could achieve the desired statistical precision. (By the way, there wasn't a statistically significant deviation.) Because the scientific questions were different, I claim the two papers really stand on their own. This is what I call the "No, no they're really two different things (hands waving)" defense. My last defense and maybe the most relevant one is the general idea that at no point did we ever try to "deceive" the reader, which is the standard that Roig repeats throughout his document. But this leads to the question of whether an author's "intent" is relevant to the determination of whether plagiarism has occurred. The answer probably depends on what kind of plagiarism is meant by "plagiarism."
Let me attempt to clarify using the ideas of Erik Campbell: "hard" plagiarism is the copying of portions of text verbatim. In his very amusing article at the Virginia Quarterly Review, Campbell reflects on his run-in with "accidental" hard plagiarism in poetry. He also presents the idea of "soft" plagiarism: "pilfering another’s ideas." This turns out to be a very murky subject because one has to walk a careful line between "creative influence" and "stealing ideas." How does one draw the line when discussing an artistic endeavor?
Take the case of Bryony Lavery's Tony-nominated play "Frozen" as outlined in Malcolm Gladwell's New Yorker article. The play is about a killer, the victim's mother, and a doctor who is studying the killer's mental state to understand his motivation. As Gladwell recalls, the doctor is based on a real life person named Dorothy Lewis whom he had written about in a New Yorker article years ago. The play's author, Lavery, adapted many of the scenes for her play directly from events described in the original article. In some cases the dialogue was (verbatim) quotes cited in the article. None of these things were attributed to Gladwell or to the real life doctor Lewis by Lavery. Gladwell goes back and forth about it and ponders how different things that are the result of a creative process, particularly musical ones, are related to each other. Is the relationship one of "cut and paste" or one of transformation and change? Eventually he chides the "plagiarism fundamentalists" for "[pretending that] chains of influence and evolution do not exist, and that a writer’s words have a virgin birth and an eternal life."
Meghan O'Rourke at Slate goes into more detail about how originality and creativity are related to plagiarism. Her article is relevant to the case of Florence Deeks and H.G. Wells, which is recounted in Jonathon Keats's review of A.B. McKillop's book "The Spinster and the Prophet." Whereas, in the Lavery case, Gladwell argues that the two works share a "parent-child" relationship, this one is more of a sibling rivalry: a single path bifurcates into two different competing trails. The controversy surrounds H.G. Wells's famous book "The Outline of History." McKillop argues that although Wells and Deeks appear to have come up with the idea of writing a "history of everything from the beginning" independently, Wells's book clearly borrows heavily from Deeks's book. However, for Keats, hard plagiarism takes a back seat to soft plagiarism. He argues that Wells's book provides evidence for the important and original idea that "the progress of society" is to be measured against the yardstick of democracy. On the other hand, Deeks had written a feminist tome which presented evidence for a different idea, but one similarly "deeply original for its time," namely that "civilization (as opposed to barbarity) is feminine" and that "peace and prosperity were characteristic of female leadership."
In all of the aforementioned literary examples, care is taken to distinguish between questions of plagiarism, which in my opinion are resolved in the court of public opinion, and questions of copyright infringement, which is a legal issue. Along these lines Tim Wu at Slate produces a thought-provoking article discussing the legal battle between Dan Brown ("The Da Vinci Code") and Richard Leigh, "a self-appointed grail expert." Essentially the historical and religious claims that Brown presents as fiction are the ones that Leigh and his coauthors present as non-fiction in a book called "Holy Blood, Holy Grail." Wu addresses the following interesting questions: (1) "Can one writer freely borrow someone else's wacky historical speculations?" (2) "When an author offers up a speculation like "space aliens killed JFK," does it really make sense to call that a fact?" (3) "How can dueling authors ever have a meaningful public discussion of who Mary Magdalene was, if, for example, one side claims exclusive ownership of the theory that she was a lowly prostitute?" The precedent for this case exists in American law and Wu summarizes the reasoning succinctly: "If the author calls it a fact, you can steal it."
Finally, here are some things that I'll save for another post by me or some interested party: (1) the many pieces of software that exist to uncover hard plagiarism, not the least of which is Google itself: Paul Collins at Slate discusses the impact that Google Book Search will have on old and new cases of literary hard plagiarism. (2) Recent high-profile cases involving the two historians Stephen Ambrose and Doris Kearns Goodwin. (3) How the question of plagiarism is approached in a journalistic context. (4) John Fogerty's long and strange legal battle with Fantasy Records.
Wednesday, April 25, 2007
Clouds are big
Thursday, April 12, 2007
T. Rex... tastes like chicken
Sunday, April 8, 2007
Exercise Builds Brain Cells
Saturday, April 7, 2007
Supercooled, Superheated and Supersaturated Liquids
Spontaneous Boiling | Spontaneous Freezing |
In the case of the spontaneous boiling, the water has been superheated, which means that the water has been heated above its normal boiling point. How is that possible? Well, normally there are impurities in the water (like minerals or salt) and scratches on the surface of the cup which make it easier for bubbles to form. They do this by providing physical edges, called nucleation sites, where bubbles can easily be created. If there are nucleation sites, then as soon as a small portion of the water gets above the boiling point, the water begins to boil in that region. On the other hand, if the water is very pure and the surface of the cup is very smooth, then there are no easy places for the bubbles to form, thus allowing the water to get to higher temperatures without boiling. When a spoon (or sugar) is put into this superheated water, the spoon provides the needed nucleation sites and the water, already above the boiling point, boils very, very quickly. The exact same phenomenon occurs with supercooled liquids, except there the nucleation sites make it easier for ice crystals to form rather than bubbles.
The same phenomenon is responsible for the popular Mentos and Diet Coke experiment. In this case, the soda is a supersaturated solution of carbon dioxide in water. The porous surface of the Mentos provides a huge number of nucleation sites for the release of carbon dioxide dissolved in the soda, causing it to spew out the top of the bottle.
Wednesday, April 4, 2007
Women In Science
Correspondingly, if you talk about ideas that may or may not have merit, but still evoke the same kinds of emotion, you get in even bigger trouble. Such is the position that Larry Summers found himself in a couple years ago. He spoke about the reasons that women are underrepresented relative to the population-at-large in tenure-track faculty positions in the sciences. Famously, one reason he gave was the possibility that there are "innate differences" between men and women. He explicitly said what he meant by this: even if the mean aptitude of men and women were the same, if the variance in the male population is greater than the variance in the female population then there will be more men at the extremes (both high and low). Since faculty are drawn exclusively from one extreme, men would be overrepresented.
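The arithmetic behind that variance argument can be sketched in a few lines. This only illustrates the statistical claim as stated; it says nothing about whether the premise is true. The normal distributions, the 10% difference in spread, and the thresholds are all assumptions I made up for the example.

```python
import math

def tail_fraction(threshold, mean, sd):
    """Fraction of a normal(mean, sd) population lying above threshold."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical populations: identical means, slightly different spreads.
MEAN = 0.0
NARROW_SD = 1.0   # assumed value, for illustration only
WIDE_SD = 1.1     # assumed value, for illustration only

for threshold in (2.0, 3.0, 4.0):
    narrow = tail_fraction(threshold, MEAN, NARROW_SD)
    wide = tail_fraction(threshold, MEAN, WIDE_SD)
    # Ratio of the two populations' shares above the cutoff.
    print(threshold, wide / narrow)
```

With equal means, the wider distribution is overrepresented above any cutoff on the high side, and by symmetry below the mirrored cutoff on the low side. The ratio grows as the cutoff moves further out.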
He mentioned two other explanations - that the different responsibilities in child-bearing could make women less likely to thrive in a career that requires long hours from ages 25-35, and the generally accepted social factors (discrimination - the bad kind). Though he guessed at which factors may be more important, at no point did he say that any of it was unquestionably true. He never said women were dumber than men, or that any individual woman could not succeed in science. Still, he was essentially forced to resign.
Several articles were published about the talk; some supportive and some not. I'm going to address one particular criticism - that we should not even discuss the possibility that differences between the sexes account for some of the underrepresentation. To do so, it is claimed, propagates stereotypes which are harmful to women considering or already involved in science.
Which furthers stereotypes more? To discuss the ideas that Summers put forward, or to assume that young women are so fragile that they cannot even hear alternative viewpoints? To assume that they cannot differentiate between someone saying that women as a whole are less likely to be well-suited for scientific careers, and someone saying that they individually are not cut out for it?
The solution to this problem (and I agree that all things being equal, it would be preferable to have many more women in science) is not to refuse to have the conversation. If women are less likely to be in science because of family (Summers's guess at the top reason) then institutions could adopt more family-friendly policies, as some are already doing. And if social biases are found to be truly a contributing factor, then we can more adeptly and confidently address these biases.
Instead, we are left not with searches for truth, but for what people want to hear. I believe this is only part of a disturbing trend on college campuses towards the stifling of dissent. SFSU students faced disciplinary action for flag-burning. Check out the news at FIRE, a college free speech advocacy group, for many other instances.
A common theme in my posts is that while our intuition is in many cases useful, it is often not the truth, but what we wish were true. If we are not allowed to question it, we will never tell the difference.
I can't find the exact quote, but I remember reading an exchange a short while ago. One person, a politician or civic leader or such, told a scientist (an evolutionary psychologist, perhaps) that his research implied things that were uncomfortable and disheartening to people. His response was something to the effect, "It's what the experiment shows. What would you have me do, fiddle with the results?"
Monday, April 2, 2007
Breakthrough in Blood Transfusions
Blood type is determined by the presence of antigens on the surface of red blood cells. People with antigen A are type A, people with antigen B are type B, and people with both are type AB. If a person with type O blood (which contains neither antigen) received type A blood, their immune system would interpret the blood cells as foreign and attack them. However, if a person with type A blood received type O blood, there are no antigens to recognize and the blood would be accepted. This is the reason that anyone can receive type O blood (and conversely type AB people can receive any blood - they don't recognize any antigens as being foreign). There is also another important antigen, the presence or absence of which is denoted by a "positive" or "negative" designation.
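The compatibility rule described above boils down to a subset check: a transfusion is accepted if the donor's cells carry no antigen the recipient lacks. Here is a minimal sketch of that logic. The names are my own, and it deliberately ignores the positive/negative antigen, so it is a toy model and not medical guidance.

```python
# Antigens present on red blood cells for each ABO type.
ANTIGENS = {
    "O": set(),
    "A": {"A"},
    "B": {"B"},
    "AB": {"A", "B"},
}

def can_receive(recipient, donor):
    """True if every antigen on the donor's cells is also on the recipient's."""
    return ANTIGENS[donor] <= ANTIGENS[recipient]

print(can_receive("A", "O"))   # True: type O cells carry no antigens
print(can_receive("O", "A"))   # False: antigen A is foreign to a type O recipient
print(all(can_receive("AB", d) for d in ANTIGENS))  # True: AB accepts any type
```

In this picture, an enzyme that removes the A and B antigens turns any donor's entry into the empty set, which is exactly type O.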
The two enzymes reported essentially destroy the two antigens, converting the blood to type O. Interestingly, they were discovered by screening bacteria and fungi for enzymes capable of this activity. Many drugs we are accustomed to (for instance Tylenol, Viagra and Lipitor) work by inhibiting a particular protein that when left to function produces some undesirable effect. In this case the drugs are small molecules, and therefore can be rationally designed for that purpose. Enzymes are far more complicated however, and therefore are resistant to rational design. Instead, this screening technique is commonly used.
Small molecule drugs can also be found by this approach; in fact, this is where most antibiotics came from. My current boss, Scott Strobel, just got back from the jungles of Peru where he took undergrads to look for natural products. Maybe the next Tylenol is sitting in the dirty cardboard box in our fridge.