Thursday, January 6, 2011

The Weibull Distribution vs. the Geometric Distribution


Trying to understand the relationship between the Weibull distribution and the geometric distribution. 

What is the relationship between the two distributions?


Let's suppose you have a coin. You are interested in how many times you have to flip it before it comes up "heads". Each coin flip is called a trial.
The outcome of each trial is random. The geometric distribution describes the number of trials it takes for a success (heads) to occur, where each trial is independent of the others and each has a success probability of "p". The geometric distribution is called a "waiting time distribution," because you keep observing events, waiting for the first success.
The geometric distribution has a special property: it is "memoryless". That means that the probability of observing a certain number of further failures in a row doesn't depend on how many failures have already occurred. So, the chance that the next 5 trials will all be failures after already seeing 100 failures is the same as if you had only seen 10 failures. The distribution "forgets" how many failures have already occurred.
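To make the memoryless property concrete, here is a minimal Python sketch. The value p = 0.5 (a fair coin) and the function names are my own choices for illustration, not anything special to the distribution.

```python
def geometric_survival(n, p):
    """Probability that the first n trials are all failures: (1 - p)**n."""
    return (1.0 - p) ** n


def conditional_survival(already_failed, more, p):
    """P(next `more` trials all fail | first `already_failed` trials all failed)."""
    return geometric_survival(already_failed + more, p) / geometric_survival(already_failed, p)


p = 0.5  # assumed success probability (fair coin)

after_10 = conditional_survival(10, 5, p)    # 5 more failures after 10 failures
after_100 = conditional_survival(100, 5, p)  # 5 more failures after 100 failures
fresh = geometric_survival(5, p)             # 5 failures from a fresh start

print(after_10, after_100, fresh)
```

All three numbers should agree (up to floating-point rounding), which is exactly what "forgetting" the earlier failures means.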
The geometric distribution is analogous to the exponential distribution. The geometric distribution is "discrete", since you can only have an integer number of coin flips (i.e., you can't flip a coin 1.34 times). The exponential distribution is continuous. Instead of counting the number of trials until the first success, it measures the amount of time until the first success.
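One way to see the analogy is to chop continuous time into unit intervals and ask which interval the first event falls in. The sketch below assumes an exponential rate u = 0.3 (an arbitrary value for illustration) and compares the exponential probability of landing in interval k with a geometric distribution whose success probability is p = 1 - exp(-u).

```python
import math


def exponential_cdf(t, u):
    """F(t) = 1 - exp(-u*t): probability the event has occurred by time t."""
    return 1.0 - math.exp(-u * t)


def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, ...)."""
    return (1.0 - p) ** (k - 1) * p


u = 0.3                  # assumed exponential rate, for illustration only
p = 1.0 - math.exp(-u)   # chance the event occurs within any one unit interval

for k in range(1, 6):
    # Probability the exponential waiting time falls in the interval (k-1, k]
    in_interval = exponential_cdf(k, u) - exponential_cdf(k - 1, u)
    print(k, round(in_interval, 6), round(geometric_pmf(k, p), 6))
```

The two columns should match, so rounding an exponential waiting time up to the next whole unit gives a geometric number of "trials".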
Interestingly, the exponential distribution is also memoryless. That is, if a widget has been on test for 100 hours, the probability of it failing within the next hour is the same as if the test had only just started. Again, it forgets how long it has been on test.
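The same check can be sketched for the exponential distribution. The rate u = 0.01 per hour is an assumed value, and the helper names are hypothetical.

```python
import math


def exp_survival(t, u):
    """Probability of surviving past time t: exp(-u*t)."""
    return math.exp(-u * t)


def prob_fail_within(dt, age, u):
    """P(fail in the next dt hours | already survived `age` hours)."""
    return 1.0 - exp_survival(age + dt, u) / exp_survival(age, u)


u = 0.01  # assumed failure rate per hour

new_unit = prob_fail_within(1.0, 0.0, u)    # widget just placed on test
old_unit = prob_fail_within(1.0, 100.0, u)  # widget with 100 hours on test

print(new_unit, old_unit)
```

The two probabilities should be the same up to rounding: the 100 hours already on test make no difference.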
The exponential distribution is a special case of the Weibull distribution.

The two distributions are:
exponential: F(t) = 1 - exp(-u*t)
Weibull: F(t) = 1 - exp[-u*(t^a)]

where F(t) gives the probability of failure by time t, and u and a are parameters estimated from the life data. When a=1, the Weibull is the same as the exponential, and only in this case is the Weibull memoryless. So, the relationship between the geometric and the Weibull distributions is as follows: the geometric distribution is the discrete analog of the special case of the Weibull distribution with the Weibull parameter a=1. For any other value of a, the Weibull is not memoryless, and the two distributions cannot be considered to be related in this way.
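Here is a rough Python sketch of that claim, using the Weibull form F(t) = 1 - exp[-u*(t^a)] from above. The values u = 0.01 and a = 0.5, 1, 2 are assumptions chosen only to show the effect.

```python
import math


def weibull_survival(t, u, a):
    """1 - F(t) for the Weibull F(t) = 1 - exp(-u * t**a)."""
    return math.exp(-u * t ** a)


def conditional_survival(age, extra, u, a):
    """P(survive `extra` more hours | already survived `age` hours)."""
    return weibull_survival(age + extra, u, a) / weibull_survival(age, u, a)


u = 0.01  # assumed scale-type parameter
results = {}
for a in (0.5, 1.0, 2.0):
    fresh = conditional_survival(0.0, 10.0, u, a)    # brand-new part
    aged = conditional_survival(100.0, 10.0, u, a)   # part with 100 hours on it
    results[a] = (fresh, aged)
    print(a, round(fresh, 4), round(aged, 4))
```

Only the a = 1 row should show the same value for the new and the aged part. With a = 2 the aged part is less likely to survive another 10 hours, and with a = 0.5 it is more likely to.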
Lastly, I would like to note that the exponential distribution is sometimes used to describe the failure rate of a part (transistor, ball bearing, spark plug, etc.). Since it is memoryless, the failure rate it predicts will always be constant with time.
In the real world, the failure rate changes throughout the life of the part. For example, at the beginning of a life test of 100 transistors, there may be a few early failures. This is sometimes called "infant mortality". After that, the transistors might have a reasonably constant failure rate that could be modeled by the exponential distribution. As time goes on, however, parts start to wear out and the failure rate will increase. These three regions (infant mortality, constant failure rate, and wear-out) make up what is called "the bathtub curve". Its name derives from the shape of a graph of failure rate vs. time: high to low, constant, and low to high. In the case of infant mortality or wear-out, the exponential distribution won't describe the lifetimes very well. Word to the wise: the exponential distribution should be used with care since it assumes no infant mortality and no wear-out!
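For the Weibull form used above, F(t) = 1 - exp[-u*(t^a)], the failure rate (hazard) works out to h(t) = u*a*t^(a-1). The sketch below evaluates it at a few times; u = 0.01 and the three values of a are assumed purely for illustration.

```python
def weibull_hazard(t, u, a):
    """Failure rate h(t) = u * a * t**(a - 1) for F(t) = 1 - exp(-u * t**a), t > 0."""
    return u * a * t ** (a - 1)


u = 0.01                     # assumed parameter value
times = (1.0, 10.0, 100.0)   # hours on test

hazards = {}
for a in (0.5, 1.0, 2.0):
    hazards[a] = [weibull_hazard(t, u, a) for t in times]
    print(a, hazards[a])
```

With a < 1 the failure rate falls over time (infant-mortality-like), with a = 1 it stays constant (the exponential case), and with a > 1 it rises (wear-out-like). A single Weibull gives a failure rate that only moves in one direction, so it describes one region of the bathtub curve at a time.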

I hope that answers your question.
