Wednesday, March 29, 2006
It would be an absolutely dumb statement if it said "median" instead of "average". With "average", if the distribution of the returns is highly right-skewed, a more-than-50% probability of performing above the mean is a good indicator. But can the copy-writer of that commercial be so statistically sophisticated? This could be an amusing w1111 example in the future, even though mutual funds may not be the most appropriate subject: I used to have students who complained about the car examples I used, since they had little experience with automobiles.
This reminded me of my conversation with Ying Wei this past Monday. It was on the loss of efficiency when estimating the mean of a Gaussian distribution using the sample median rather than the sample mean. That in turn reminded me of the recently popular show on NBC: Deal or No Deal (DoND).
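That efficiency loss is easy to see by simulation. A quick Monte Carlo sketch (sample size and replication count chosen arbitrarily) compares the variance of the sample mean with that of the sample median for Gaussian data; asymptotic theory says the ratio should approach 2/pi, about 0.637:

```python
import random
import statistics

# Monte Carlo sketch: efficiency of the sample median vs. the sample
# mean as estimators of a Gaussian mean. n and reps are arbitrary.
random.seed(0)
n, reps = 100, 5000
means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

# Asymptotic theory predicts var_mean / var_median -> 2/pi (about 0.637),
# i.e., the median "wastes" roughly a third of the data.
print(round(var_mean / var_median, 3))
```

So for Gaussian data, using the median is like throwing away about a third of your sample.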
Here is my version of what that show is about: the show starts with 26 closed cases containing 26 fixed money amounts ranging from $0.01 to $1,000,000. The contestant opens several cases at random, in batches. The cases opened are eliminated from the board (i.e., can no longer be won by the contestant). After each batch of cases is opened, the show pauses. Looking at the remaining unopened money amounts, a banker offers the contestant an amount of money to make him/her stop (from winning the biggest remaining value, of course). If the contestant refuses the offer, he/she has to eliminate one or more amounts by random guessing, which may well make the next offer drop.
From the contestant's standpoint, he/she should accept any offer that is higher than the MEDIAN of the remaining values, since he/she only plays once: if he/she keeps playing, there is a better than 50-50 chance of leaving with a value lower than the offer. On the banker's side, he needs to make offers much lower than the MEAN, since he has to play the game many times. Thus it is no surprise to me that every time the banker makes an offer, it is always much lower than the mean of the remaining values. I have yet to figure out the magical amount (offer-median) and the reasoning behind it.
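The gap between the two summaries is dramatic because the board is so right-skewed. A small sketch with the standard US board amounts and a hypothetical mid-game state (the ten "remaining" values below are made up for illustration):

```python
import statistics

# The 26 fixed amounts on the US Deal or No Deal board.
amounts = [0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
           1000, 5000, 10000, 25000, 50000, 75000, 100000, 200000,
           300000, 400000, 500000, 750000, 1000000]

# A hypothetical mid-game state: ten amounts still on the board.
remaining = [0.01, 10, 300, 750, 5000, 25000,
             100000, 300000, 750000, 1000000]

# The mean (the banker's long-run cost of letting you play on) is far
# above the median (the contestant's 50-50 break-even point).
print("mean:  ", statistics.fmean(remaining))
print("median:", statistics.median(remaining))
```

Here the mean is over $200,000 while the median is only $15,000, so any offer between the two should look good to the contestant and still be a bargain for the banker.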
Sunday, March 26, 2006
Regarding the security, we can always implement .htaccess-level access control. Try clicking on my teaching site for w1111. It is pretty easy to set up, but I haven't mastered how to let users change their passwords occasionally. I suppose it is not our biggest concern now.
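For reference, the basic setup is just a short config file plus a password file. A minimal sketch, assuming Apache and a password file already created with `htpasswd` (the path and realm name below are hypothetical):

```apache
# .htaccess placed in the directory to protect
AuthType Basic
AuthName "w1111 course site"
AuthUserFile /home/user/.htpasswd
Require valid-user
```

Changing a password later means re-running `htpasswd` against the same file on the server, which is probably why there is no easy self-service option for users.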
Friday, March 24, 2006
One of the questions I had was whether the latent factors fitted to the data correspond to demographic characteristics of the nodes in the network. Before I asked, I knew the answer would be "not necessarily". The latent factors just provide a way, or a model, to decompose the variation structure of a network into more interpretable factors that represent the initiator and the receiver of an edge in the network.
There was also other discussion along this direction. I didn't catch all of it since I was busy making some simple numerical examples to help myself understand better. Then I heard Andrew say: "We cannot claim to infer the data-generating mechanism behind the data. We can only infer a data-generating mechanism that can generate the data observed."
This reminded me of my thoughts on data and models.
Data (limited observed values) always classify all possible models into equivalence classes. For example, in regression, n points (x_i, y_i) define classes of curves that pass through the same values at the x_i's. Regression analysis is simply trying to find the class with the closest distance to the data. In a modeling effort, the targeted model space intersects with the data's equivalence classes. After the intersection, if more than one model remains in an equivalence class, we get the identifiability issue.
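The equivalence-class idea can be made concrete with a toy example (the three points and the two models below are hypothetical, chosen only for illustration): two different curves that agree at every observed x_i are indistinguishable from the data alone.

```python
import math

# Three hypothetical data points.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 7.0]

def f(x):
    """Model A: a quadratic that passes through all three points."""
    return x**2 + x + 1

def g(x):
    """Model B: f plus a bump that vanishes at every observed x_i."""
    return f(x) + 5 * math.sin(math.pi * x)

# Both models agree (to rounding error) at the observed points...
for x, y in zip(xs, ys):
    assert abs(f(x) - y) < 1e-9
    assert abs(g(x) - y) < 1e-9

# ...but differ between them, so the data cannot tell f from g:
# they sit in the same equivalence class.
print(f(0.5), g(0.5))
```

Any model of the form f(x) + c*sin(pi*x) lands in the same class, which is exactly the identifiability problem: the data constrain the class, not the individual member.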
We can only understand the world to the extent that the data allow. When we ask others about the size of their data sets, we may just sound like coworkers comparing offices: "how's the view in your new office?" "pretty good! the window's much bigger than what I used to have" "wow, nice! you can see so much more now!"
Thursday, March 23, 2006
Today I woke up feeling exhausted. Every time this happens, I just want to take something that will boost my mood into an unreasonable "work high". I vaguely recall reading about the medical causes of depression, where I learned that our moods are affected by some enzyme in our brain. So I thought maybe the level of that-whatever-it-is thing in my head is highly variable. Maybe there is a way to stabilize it (doing yoga, maybe?).
So I went online and googled.
Okay. It is not really an enzyme. The hippocampus (hippo-campus?), which is in charge of memory storage, becomes smaller in depressed patients. Hmm ... interesting, I thought. Most people who have had depression went through things they DO want to forget. Sometimes we hear ourselves saying "I am trying to forget about this," especially during stressful events. This can be interpreted as a signal to our brain (of course, we think using our brain, don't we?), and our brain takes the hint and signals the hippocampus to become smaller.
There is absolute proof that people suffering from depression have changes in their brains compared to people who do not suffer from depression. The hippocampus, a small part of the brain that is vital to the storage of memories, is smaller in people with a history of depression than in those who've never been depressed. A smaller hippocampus has fewer serotonin receptors. Serotonin is a neurotransmitter -- a chemical messenger that allows communication between nerves in the brain and the body. What scientists don't yet know is why the hippocampus is smaller.
Investigators have found that cortisol (a stress hormone that is important to the normal function of the hippocampus) is produced in excess in depressed people. They believe that cortisol has a toxic or poisonous effect on the hippocampus. It's also possible that depressed people are simply born with a smaller hippocampus and are therefore inclined to suffer from depression.
So we probably should keep remembering everything no matter how frustrating it is. :)