Student’s t-distribution, or simply the t-distribution, is a theoretical probability model used to estimate the mean (the first-order moment) of a normally distributed population when the sample size is small and the population standard deviation is unknown.
In other words, the t distribution is a probability distribution used to estimate the mean of a normally distributed population from a small sample when the standard deviation of that population is unknown.
Recommended articles: degrees of freedom, degrees of freedom (example) and normal distribution.
Student’s t-distribution formula
Given a continuous random variable L, we say that the frequency of its observations can be satisfactorily approximated by a t distribution with g degrees of freedom when:
L ~ t(g)
The random variable L follows a t distribution with g degrees of freedom.
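To make the statement that L follows a t distribution with g degrees of freedom more concrete, the sketch below evaluates the standard density of Student’s t-distribution, f(x) = Γ((g+1)/2) / (√(gπ) Γ(g/2)) · (1 + x²/g)^(−(g+1)/2). The function name t_pdf is our own choice for illustration and does not come from the article; only the Python standard library is assumed.

```python
import math


def t_pdf(x: float, g: float) -> float:
    """Density of Student's t-distribution with g degrees of freedom at x.

    Illustrative helper (the name is hypothetical, not from the article).
    """
    if g <= 0:
        raise ValueError("degrees of freedom must be positive")
    # Normalising constant, computed on the log scale so that large g
    # does not overflow the gamma function.
    log_norm = (
        math.lgamma((g + 1) / 2)
        - math.lgamma(g / 2)
        - 0.5 * math.log(g * math.pi)
    )
    # Kernel (1 + x^2 / g) ** (-(g + 1) / 2), also on the log scale.
    log_kernel = -(g + 1) / 2 * math.log1p(x * x / g)
    return math.exp(log_norm + log_kernel)


if __name__ == "__main__":
    # Height of the density at the centre for 3 degrees of freedom.
    print(t_pdf(0.0, 3))
```

Note that g is the only parameter of the function besides the evaluation point x.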
Representation of Student’s t-distribution
Density function of a t distribution with 3 degrees of freedom (df).
As we can see, the t distribution closely resembles the normal distribution, except that the t distribution has wider (heavier) tails and a lower peak. In other words, as we add degrees of freedom to the t distribution, its tails thin out, its center “grows”, and it looks more and more like the normal distribution.
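The comparison with the normal distribution can be checked numerically. The following is a minimal sketch, assuming the standard density formulas for both distributions; the helper names t_pdf and normal_pdf and the chosen values of g are ours, picked only for illustration.

```python
import math


def t_pdf(x: float, g: float) -> float:
    """Density of Student's t-distribution with g degrees of freedom."""
    log_norm = (
        math.lgamma((g + 1) / 2)
        - math.lgamma(g / 2)
        - 0.5 * math.log(g * math.pi)
    )
    return math.exp(log_norm - (g + 1) / 2 * math.log1p(x * x / g))


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution (mean 0, variance 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    # Compare the height at the centre (x = 0) and in the tail (x = 3).
    for g in (3, 10, 30, 100):
        print(f"t, df={g:>3}: centre={t_pdf(0, g):.4f}  tail={t_pdf(3, g):.4f}")
    print(f"normal   : centre={normal_pdf(0):.4f}  tail={normal_pdf(3):.4f}")
```

For small g the t density is lower at the center and higher in the tail than the normal density, and both gaps shrink as g increases.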
And … Why is the t distribution so special?
Because, unlike the normal distribution, which depends on the mean and the variance, the t distribution depends only on the degrees of freedom (df). In other words, by controlling the degrees of freedom, we control the distribution.
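One way to see that the degrees of freedom alone determine the distribution: for g > 2 the variance of a t distribution is g / (g − 2), and the mean is zero whenever g > 1. The short sketch below tabulates that variance; the function name t_variance is hypothetical and used only for illustration.

```python
def t_variance(g: float) -> float:
    """Variance of a t distribution with g degrees of freedom.

    The variance is finite only for g > 2, so smaller values are rejected.
    """
    if g <= 2:
        raise ValueError("the variance is finite only for g > 2")
    return g / (g - 2)


if __name__ == "__main__":
    # The variance shrinks toward 1 (the standard normal value) as g grows.
    for g in (3, 5, 10, 27, 100):
        print(g, t_variance(g))
```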
Application of Student’s t-distribution
The t distribution is used when:
- We want to estimate the average of a normally distributed population from a small sample.
- The sample size is less than 30 items, that is, n < 30. From about 30 observations onward, the t distribution closely resembles the normal distribution, so the normal distribution is used instead.
- The standard deviation of the population is unknown and has to be estimated from the observations of the sample.
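The conditions above are the setting of the usual one-sample t statistic, t = (x̄ − μ₀) / (s / √n), which follows a t distribution with n − 1 degrees of freedom when the population is normal. The article does not spell this statistic out, so the sketch below is an illustration under that assumption; the function name t_statistic, the reference mean mu0, and the sample values are all made up.

```python
import math
import statistics


def t_statistic(sample, mu0):
    """Return (t, df) for a one-sample t statistic against the mean mu0.

    The population standard deviation is unknown, so it is estimated from
    the sample with the n - 1 denominator (statistics.stdev).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical small sample (n = 8 < 30); the values are invented.
    sample = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0, 5.2, 4.7]
    t, df = t_statistic(sample, mu0=5.0)
    print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting value would then be compared with a critical value of the t distribution with n − 1 degrees of freedom, taken from a table or a statistics library.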
Suppose we have 28 observations of a random variable G that follows a Student’s t-distribution with 27 degrees of freedom (df).
The 28 observations of the random variable G, which follows a t distribution with 27 degrees of freedom.
Since we are working with real data, there will always be an approximation error between the data and the distribution. In other words, the mean, median and mode will not always be zero (0) or exactly the same.
We represent the frequency of each observation of the variable G by a histogram.
Histogram of frequencies of the random variable G.
Can the random variable G be approximated by a t distribution?
Reasons to consider that the variable G follows a distribution t:
- The distribution is symmetric. That is, there is roughly the same number of observations to the right and to the left of the central value, and the mean and the median tend toward the same value. Here the mean is approximately zero (mean = 0.016).
- The observations with more frequency or probability are around the central value. The observations with less frequency or probability are far from the central value.
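Both checks in this list can be carried out on any sample with a few lines of code. The sketch below is only an illustration: the function name symmetry_summary is hypothetical, the center is assumed to be zero (the center of a t distribution), and the sample is invented, since the 28 observations of G are not reproduced here.

```python
import statistics


def symmetry_summary(observations, center=0.0):
    """Summarise how symmetric a sample is around an assumed center.

    Returns the mean, the median, and the number of observations that
    fall strictly to the left and strictly to the right of the center.
    """
    left = sum(1 for x in observations if x < center)
    right = sum(1 for x in observations if x > center)
    return {
        "mean": statistics.mean(observations),
        "median": statistics.median(observations),
        "left": left,
        "right": right,
    }


if __name__ == "__main__":
    # Invented sample, NOT the article's data for the variable G.
    sample = [-2.1, -1.3, -0.8, -0.4, -0.1, 0.0, 0.2, 0.5, 0.9, 1.2, 2.0]
    print(symmetry_summary(sample))
```

A mean close to the median, together with similar counts on both sides of the center, is consistent with the symmetry described in the first reason; it does not by itself prove that the sample follows a t distribution.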