P-value, what is it?

In general, the p-value is the tail probability of the observed value of the test statistic, computed under the assumption that the null hypothesis is true.

The text below is taken from the Wikipedia article on the p-value:

In statistical hypothesis testing, the p-value is the probability of obtaining a result at least as extreme as that obtained, assuming the truth of the null hypothesis that the finding was the result of chance alone. The fact that p-values are based on this assumption is crucial to their correct interpretation.

More technically, the p-value of an observed value t_observed of some random variable T used as a test statistic is the probability that, given that the null hypothesis is true, T will assume a value as or more unfavorable to the null hypothesis as the observed value t_observed. "More unfavorable to the null hypothesis" can in some cases mean greater than, in some cases less than, and in some cases further away from a specified center.
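
The three meanings of "more unfavorable" can be illustrated with a small sketch. It assumes, purely for illustration, that the test statistic T follows a binomial distribution under the null hypothesis; the function names and the parameters n, p and tail are hypothetical and do not come from the quoted article.

```python
from math import comb

def binom_pmf(t, n, p):
    """P(T = t) for T ~ Binomial(n, p); this null distribution is an assumption."""
    return comb(n, t) * p**t * (1 - p)**(n - t)

def p_value(t_observed, n, p, tail):
    """Sum the null probabilities of all outcomes at least as extreme as t_observed."""
    center = n * p  # mean of the null distribution, used for the two-sided case
    if tail == "greater":
        extreme = lambda t: t >= t_observed
    elif tail == "less":
        extreme = lambda t: t <= t_observed
    elif tail == "two-sided":
        extreme = lambda t: abs(t - center) >= abs(t_observed - center)
    else:
        raise ValueError("tail must be 'greater', 'less' or 'two-sided'")
    return sum(binom_pmf(t, n, p) for t in range(n + 1) if extreme(t))

# 8 successes in 10 trials, null hypothesis p = 0.5
print(p_value(8, 10, 0.5, "greater"))    # 56/1024 = 0.0546875
print(p_value(8, 10, 0.5, "less"))       # 1013/1024 = 0.9892578125
print(p_value(8, 10, 0.5, "two-sided"))  # 112/1024 = 0.109375
```

The same observed value gives very different p-values depending on which direction counts as unfavorable, so the direction has to be fixed before the result is interpreted.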

In motif finding, the p-value is used to assess whether the observed motif occurrences are non-random. For example, suppose k occurrences of a motif were found in a DNA sequence of length N. The p-value is then the probability of finding at least k occurrences of the motif in a random sequence of length N.
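
As a rough illustration of this definition, the sketch below computes the probability of at least k occurrences exactly in the simplest setting. It assumes that the motif is a single fixed word, that the letters of the random sequence are independent with given probabilities, and that overlapping occurrences are counted. These assumptions and all names in the code are hypothetical; this is not necessarily how any particular tool computes the value.

```python
def build_transitions(motif, alphabet):
    """Pattern-matching automaton: state = number of motif letters currently matched."""
    m = len(motif)
    # fail[i] = length of the longest proper prefix of motif[:i] that is also its suffix
    fail = [0] * (m + 1)
    j = 0
    for i in range(1, m):
        while j > 0 and motif[i] != motif[j]:
            j = fail[j]
        if motif[i] == motif[j]:
            j += 1
        fail[i + 1] = j
    delta = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        for ch in alphabet:
            if state < m and motif[state] == ch:
                delta[state][ch] = state + 1
            elif state == 0:
                delta[state][ch] = 0
            else:
                # fall back to a shorter matched prefix (already computed)
                delta[state][ch] = delta[fail[state]][ch]
    return delta

def p_value_at_least(motif, k, n, probs):
    """P(at least k occurrences of motif in a random sequence of length n)."""
    m = len(motif)
    delta = build_transitions(motif, list(probs))
    # dist[c][s] = probability of c occurrences so far (capped at k) in state s
    dist = [[0.0] * (m + 1) for _ in range(k + 1)]
    dist[0][0] = 1.0
    for _ in range(n):
        new = [[0.0] * (m + 1) for _ in range(k + 1)]
        for c in range(k + 1):
            for s in range(m + 1):
                pr = dist[c][s]
                if pr == 0.0:
                    continue
                for ch, p in probs.items():
                    t = delta[s][ch]
                    c2 = min(k, c + 1) if t == m else c  # reaching state m = one occurrence
                    new[c2][t] += pr * p
        dist = new
    return sum(dist[k])

uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed background
print(p_value_at_least("TATA", k=3, n=500, probs=uniform))
```

Capping the occurrence count at k keeps the table small, because only the event "k or more" matters for the p-value.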

The p-value calculated here shows how likely it is to obtain at least that many occurrences of the motif in a random sequence. From it you can judge whether your motif occurrences are non-random.

To calculate the p-value, you need to specify a model for the motif and a model for the random text. Then specify the number of motif occurrences you observed in your DNA sequence.
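
The three inputs named above can be made concrete with a simulation sketch. Here the motif model is assumed to be a set of words, the random-text model is assumed to be independent letters with given frequencies, and the p-value is estimated by sampling rather than computed exactly. All names and parameter values are hypothetical and only illustrate the idea.

```python
import random

def count_occurrences(sequence, motif_words):
    """Count start positions where some word of the motif occurs (overlaps allowed)."""
    return sum(
        1
        for i in range(len(sequence))
        if any(sequence.startswith(word, i) for word in motif_words)
    )

def estimate_p_value(motif_words, background, n, k_observed, trials=2000, seed=0):
    """Monte Carlo estimate of P(at least k_observed occurrences in a random text)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    letters = list(background)
    weights = [background[ch] for ch in letters]
    hits = 0
    for _ in range(trials):
        text = "".join(rng.choices(letters, weights=weights, k=n))
        if count_occurrences(text, motif_words) >= k_observed:
            hits += 1
    return hits / trials

motif_words = {"TATAAA", "TATATA"}                         # assumed motif model
background = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}      # assumed random-text model
print(estimate_p_value(motif_words, background, n=200, k_observed=2))
```

A sampled estimate cannot resolve p-values much smaller than 1/trials, so very rare events need more trials or an exact calculation.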

Last modified 01 February 2007