
Tuesday, August 30, 2005

Find the art in the everyday dot com

Check it out: www.findtheartintheeveryday.com

Some Columbia students should do something similar on Low Plaza.

It would be nice to use the steps for the audience and the lower level of the plaza as the stage, with a stereo set up to play the music. It could be done in the evening, with every performer holding a candle or something shining, or it would be cool to do it after a not-so-big snow.

Thursday, August 25, 2005

Prerequisite: high school math

I put this line in my 1111 syllabus every semester and had never given it a thought until last spring. It is in fact the only level of math we assume of our 1111 students.

One of my students approached me after class, pointing to this line in his copy of the syllabus.
"This is a joke, right?" he said.
"Why?"
"'Cause everybody here [on this campus] must have taken math back in high school."
I thought for a while and responded, a bit defensively: "I am not saying only that all students must have taken high school math before they can take 1111. We assume that they still remember much of their high school math when they take 1111." That, I think, is a lot to assume.

The other day, I walked past the midtown Kmart and saw their back-2-school ad. They put an equation, x^2=x-1, in one of the pictures. I solved it in my head as I walked towards the subway and realized that Kmart probably didn't know that this equation has no real solutions, let alone rational ones. Then I thought: "This is legitimately high school math, but I am not sure whether all my 1111 students could solve it." Not that I care, since they will not need to do such things in my class.
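For the record, here is the check I did in my head (my arithmetic, not Kmart's): rearrange the equation and look at the discriminant.

```latex
x^2 = x - 1 \;\Longleftrightarrow\; x^2 - x + 1 = 0,
\qquad
b^2 - 4ac = (-1)^2 - 4 \cdot 1 \cdot 1 = -3 < 0,
\qquad
x = \frac{1 \pm i\sqrt{3}}{2}.
```

A negative discriminant means both roots are complex, so there are certainly no rational solutions hiding in that ad.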

It made me think about all those things we learned in high school and never used. Is it a bad thing that we have forgotten much of it? Why should we study things we are not going to use? Is it simply because we don't know what we will need in the future that we have to build a comprehensive foundation? Is that the only reason? Or is it healthier to keep one's brain busy with different things when one is young? Or is it healthier to have something to dump as one gets older? Maybe all those things we have learned and have no use for are some kind of placeholder for our future acquisitions? This is a wild thought. :)

Thursday, August 04, 2005

Summer Drought

Haven't been posting for a long time. No excuse at all.
It is just summer.

Wednesday, August 03, 2005

Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems

Journal of the Royal Statistical Society, Series B (Methodological), Vol. 50, No. 2 (1988), pp. 157-224 [JSTOR link]

S. L. Lauritzen and D. J. Spiegelhalter

Abstract: A causal network is used in a number of areas as a depiction of patterns of `influence' among sets of variables. In expert systems it is common to perform `inference' by means of local computations on such large but sparse networks. In general, non-probabilistic methods are used to handle uncertainty when propagating the effects of evidence, and it has appeared that exact probabilistic methods are not computationally feasible. Motivated by an application in electromyography, we counter this claim by exploiting a range of local representations for the joint probability distribution, combined with topological changes to the original network termed `marrying' and `filling-in'. The resulting structure allows efficient algorithms for transfer between representations, providing rapid absorption and propagation of evidence. The scheme is first illustrated on a small, fictitious but challenging example, and the underlying theory and computational aspects are then discussed.
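A minimal sketch of the `marrying' step the abstract refers to (my own toy code, not the authors'): take a directed network given as a map from each node to its parents, connect every pair of parents that share a child, and drop edge directions. The `filling-in' (triangulation) and propagation machinery then operate on this moral graph.

```python
# Toy illustration of "marrying" (moralization); not code from the paper.
from itertools import combinations

def moralize(parents):
    """Given a DAG as {node: [parents]}, return the moral graph {node: set(neighbors)}."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    moral = {v: set() for v in nodes}
    for child, ps in parents.items():
        for p in ps:                       # undirected version of each parent -> child edge
            moral[child].add(p)
            moral[p].add(child)
        for a, b in combinations(ps, 2):   # "marry" every pair of parents of the same child
            moral[a].add(b)
            moral[b].add(a)
    return moral

if __name__ == "__main__":
    # toy network A -> C <- B, C -> D (placeholder names, not the paper's example)
    print(moralize({"C": ["A", "B"], "D": ["C"], "A": [], "B": []}))
    # A and B become neighbors because they share the child C
```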

Tuesday, August 02, 2005

Maximum Likelihood from Incomplete Data via the EM Algorithm

Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, No. 1 (1977), pp. 1-38 [JSTOR link]
A. P. Dempster, N. M. Laird, and D. B. Rubin

Abstract: A broadly applicable algorithm for computing maximum likelihood estimates from incomplete data is presented at various levels of generality. Theory showing the monotone behaviour of the likelihood and convergence of the algorithm is derived. Many examples are sketched, including missing value situations, applications to grouped, censored or truncated data, finite mixture models, variance component estimation, hyperparameter estimation, iteratively reweighted least squares and factor analysis.
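A minimal sketch of the E-step / M-step alternation for one of the sketched examples, a finite mixture model; the two-component Gaussian mixture, the initialization and the fixed iteration count are my own assumptions for illustration, not details from the paper.

```python
# Toy EM for a two-component univariate Gaussian mixture (illustration only).
import numpy as np

def em_gmm(y, n_iter=100):
    w = np.array([0.5, 0.5])                    # mixing weights
    mu = np.array([y.min(), y.max()])           # crude initialization of the means
    var = np.array([y.var(), y.var()])          # and of the variances
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each observation
        dens = w * np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        nk = r.sum(axis=0)
        w = nk / len(y)
        mu = (r * y[:, None]).sum(axis=0) / nk
        var = (r * (y[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])
    print(em_gmm(y))   # weights, means, variances should land near (0.3, 0.7), (-2, 3), (1, 1)
```

Each iteration is guaranteed not to decrease the likelihood, which is the monotone behaviour the abstract refers to.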

Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination

Biometrika, Vol. 82, No. 4 (1995), pp. 711-732 [JSTOR link]
Peter J. Green

Abstract: Markov chain Monte Carlo methods for Bayesian computation have until recently been restricted to problems where the joint distribution of all variables has a density with respect to some fixed standard underlying measure. They have therefore not been available for application to Bayesian model determination, where the dimensionality of the parameter vector is typically not fixed. This paper proposes a new framework for the construction of reversible Markov chain samplers that jump between parameter subspaces of differing dimensionality, which is flexible and entirely constructive. It should therefore have wide applicability in model determination problems. The methodology is illustrated with applications to multiple change-point analysis in one and two dimensions, and to a Bayesian comparison of binomial experiments.
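A toy illustration of such a dimension-jumping sampler (my own example, not one from the paper): the data are modeled either with mean zero (no free parameter) or with a free mean under a normal prior. Because the birth move proposes the new mean directly from its prior, the proposal density cancels the prior and the Jacobian is 1, so the acceptance ratio reduces to a likelihood ratio times the prior model odds; the prior scale, move scheme and constants below are all assumptions made for the sketch.

```python
# Toy reversible jump sampler between M0: y ~ N(0,1) and M1: y ~ N(theta,1), theta ~ N(0, tau^2).
import numpy as np

def loglik(y, theta):
    return -0.5 * np.sum((y - theta) ** 2)          # constant terms cancel in every ratio below

def rjmcmc(y, n_iter=20000, tau=2.0, p_m1=0.5, seed=0):
    rng = np.random.default_rng(seed)
    model, theta, visits_m1 = 0, 0.0, 0
    for _ in range(n_iter):
        if model == 0:
            # birth move: propose theta* from its prior, so the Jacobian is 1
            theta_star = rng.normal(0.0, tau)
            log_alpha = (loglik(y, theta_star) + np.log(p_m1)
                         - loglik(y, 0.0) - np.log(1 - p_m1))
            if np.log(rng.uniform()) < log_alpha:
                model, theta = 1, theta_star
        else:
            # death move: drop theta and return to M0 (reverse of the birth move)
            log_alpha = (loglik(y, 0.0) + np.log(1 - p_m1)
                         - loglik(y, theta) - np.log(p_m1))
            if np.log(rng.uniform()) < log_alpha:
                model, theta = 0, 0.0
        if model == 1:
            # within-model random-walk Metropolis update for theta
            prop = theta + rng.normal(0.0, 0.3)
            log_a = (loglik(y, prop) - 0.5 * prop ** 2 / tau ** 2
                     - loglik(y, theta) + 0.5 * theta ** 2 / tau ** 2)
            if np.log(rng.uniform()) < log_a:
                theta = prop
        visits_m1 += model
    return visits_m1 / n_iter                        # estimated posterior probability of M1

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = rng.normal(0.4, 1.0, size=50)                # data with a small nonzero mean
    print("P(M1 | y) approx:", rjmcmc(y))
```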

Identification and measurement of neighbor-dependent nucleotide substitution processes

Bioinformatics 2005 21(10):2322-2328; doi:10.1093/bioinformatics/bti376
Peter F. Arndt and Terence Hwa

Abstract:
Motivation: Neighbor-dependent substitution processes generated specific patterns of dinucleotide frequencies in the genomes of most organisms. The CpG methylation–deamination process, for example, is a prominent such process in vertebrates (the CpG effect). Such processes, often with unknown mechanistic origins, need to be incorporated into realistic models of nucleotide substitution.

Results: Based on a general framework of nucleotide substitutions, we developed a method that identifies the most relevant neighbor-dependent substitution processes, estimates their relative frequencies and judges whether they are important enough to be included in the model. Starting from a model of neighbor-independent nucleotide substitution, we successively added neighbor-dependent substitution processes in the order of their ability to increase the likelihood of the model for the given data. An analysis of neighbor-dependent nucleotide substitutions based on repetitive elements found in the genomes of human, zebrafish and fruit fly is presented.

Availability: A web server to perform the presented analysis is freely available at: http://evogen.molgen.mpg.de/server/substitution-analysis
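Not the paper's method, but a quick back-of-the-envelope way to see the CpG effect mentioned in the Motivation: compare the observed CpG dinucleotide frequency with what the C and G frequencies alone would predict if neighboring sites were independent. The sequence below is made up for illustration.

```python
# Observed/expected CpG ratio; ~1 for independent neighbors, well below 1 under the CpG effect.
def cpg_observed_over_expected(seq):
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C") / n, seq.count("G") / n
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG") / (n - 1)
    return cpg / (c * g)

if __name__ == "__main__":
    # made-up sequence; a real analysis would use genomic data such as repeat elements
    toy = "ATGCCATTGACCGTAGCATGCATGACTGACGTTAGCCATG" * 10
    print(round(cpg_observed_over_expected(toy), 2))
```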