Tuesday, July 22, 2008
Ups and downs of the percentage of working men and women
The New York Times published an article today titled "Women Are Now Equal as Victims of Poor Economy". The whole article can be found here. I found the time plots used in this article very interesting. They use some good graphing tactics: color and shading, for example, and a Y axis that actually runs from 0 to 100%.
Monday, July 21, 2008
The Black Swan
Today, I stumbled upon a special issue of The American Statistician (August 2007) devoted to the praises and criticisms (mainly the latter) of "The Black Swan." The reason for such a strong reaction from the usually low-key statistical profession is pretty obvious from a small number of excerpts quoted in TAS:
“Statisticians . . . are computing people, not thinkers.”
“Statisticians, it has been shown, tend to leave their brains in the classroom, and engage in the most trivial inferential errors when they are let out on the streets.”
It is worth mentioning that this special issue even caught the attention of Bloomberg News, and the editor of TAS was interviewed for its coverage of the book.
TAS also invited Taleb to write a response in this special issue. In this response, Taleb said that his book's main criticisms of statistics are directed at:
1. The unrigorous use of statistics, and reliance on probability in domains where the current methods can lead us to make consequential mistakes (the “high impact”) where, on logical grounds, we need to force ourselves to be suspicious of inference about low probabilities.
2. The psychological effects of statistical numbers in lowering risk consciousness and the suspension of healthy skepticism—in spite of the unreliability of the numbers produced about low-probability events.
3. Finally TBS is critical of the use of commoditized metrics such as “standard deviation,” “Sharpe ratio,” “mean-variance,” and so on in fat-tailed domains where these terms have little practical meaning, and where reliance by the untrained has been significant, unchecked and, alas, consequential.
This is essentially about caution in model-based prediction (extrapolation), the effects of outliers and rare events, and uses of statistics that are motivated by specific probability models. I don't think any good applied statistician would deny the importance of cautious interpretation of statistical analysis, or would "commit" the mistakes outlined above. Actually, sometimes I feel that the conclusions that can be drawn from data analysis are very limited. The usefulness of statistical inference and analysis lies primarily in narrowing down hypotheses and possibilities.
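To make Taleb's third point a bit more concrete, here is a small simulation sketch of my own (not something from the book or from the TAS issue). It compares how stable the sample standard deviation is across repeated samples from a thin-tailed distribution (standard normal) and from a fat-tailed one. The Pareto shape of 1.5, the sample size, the number of replications, the seed, and the helper name `sd_spread` are all my assumptions, chosen only for illustration; a Pareto distribution with shape below 2 has infinite theoretical variance, so its sample standard deviation is not estimating any finite quantity.

```python
import random
import statistics


def sd_spread(draw, n=5000, reps=20):
    """Return max/min of the sample standard deviation over `reps` independent samples.

    `draw` is a zero-argument function returning one random observation.
    A ratio near 1 means the standard deviation is a stable summary.
    """
    sds = [statistics.stdev([draw() for _ in range(n)]) for _ in range(reps)]
    return max(sds) / min(sds)


rng = random.Random(2008)  # fixed seed (arbitrary choice) for reproducibility

# Thin-tailed benchmark: standard normal draws.
normal_spread = sd_spread(lambda: rng.gauss(0.0, 1.0))

# Fat-tailed case (assumed for illustration): Pareto with shape 1.5,
# whose theoretical variance is infinite.
pareto_spread = sd_spread(lambda: rng.paretovariate(1.5))

print(f"normal: max/min of sample SD = {normal_spread:.2f}")
print(f"Pareto: max/min of sample SD = {pareto_spread:.2f}")
```

For the normal draws the ratio should stay close to 1, while for the Pareto draws the sample standard deviation is driven by a handful of extreme observations and can differ several-fold from one sample to the next. That is one way to read the claim that such metrics "have little practical meaning" in fat-tailed domains.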
A solution to improve integrity in medical research was proposed, centered on statisticians' participation
Monday, February 25, 2008
Identifying gene-gene interaction that is relevant to a disease outcome
Tuesday, February 19, 2008
Tiling Arrays
Here are some references on this technology for my own use.
- Global Identification of Human Transcribed Sequences with Genome Tiling Arrays
  Science 24 December 2004: Vol. 306, no. 5705, pp. 2242-2246
  DOI: 10.1126/science.1103388
  Paul Bertone, Viktor Stolc, Thomas E. Royce, Joel S. Rozowsky, Alexander E. Urban, Xiaowei Zhu, John L. Rinn, Waraporn Tongprasit, Manoj Samanta, Sherman Weissman, Mark Gerstein, Michael Snyder
- Array of hope
  Nature Genetics 21, 3-4 (1999)
  E. S. Lander
- Model-based analysis of tiling-arrays for ChIP-chip
  PNAS August 15, 2006, vol. 103, no. 33, pp. 12457-12462
  W. Evan Johnson, Wei Li, Clifford A. Meyer, Raphael Gottardo, Jason S. Carroll, Myles Brown, and X. Shirley Liu