Sampling (statistics)

Sampling. In statistics, sampling is the technique used to select a sample from a population.

Sampling Techniques

To understand sampling techniques, the reader first needs a small group of primary concepts, such as:

  • Sample error.
  • Confidence level.
  • Population variance.

Primary concepts

Population and Sample

A population is the complete set of individuals or elements on which a characteristic or attribute can be observed or measured.

Examples of population:

  • The set made up of all university students in Cuba.
  • The set of all the students of a University.
  • The set of smokers in a region.

The characteristics of interest are measurable or observable properties of each element, for example its height, weight, age or sex.

Suppose we are interested in the average weight of the population made up of the students at a university. If the university has 5,376 students, it would be enough to weigh each student, add up the 5,376 weights and divide the total by 5,376. In practice this process presents difficulties, among them:

  • Locating and accurately weighing each student;
  • Recording all the data in a list without mistakes;
  • Carrying out the calculations.

The difficulties are greater if the population is infinite, if the elements are destroyed or damaged when measured, if they are widely dispersed, or if the cost of carrying out the work is very high.

One solution is to measure only a part of the population, called a sample, and to take the average weight in the sample as an approximation of the true average weight of the population.
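The idea of approximating the population average by a sample average can be sketched in Python. This is only an illustration: the weights are synthetic numbers generated for the example (not real measurements), and the function name `sample_mean_estimate`, the sample size of 200 and the random seeds are assumptions introduced here.

```python
import random
import statistics

def sample_mean_estimate(population, n, seed=None):
    """Draw n elements at random, without replacement, and return their mean."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    return statistics.fmean(sample)

# Synthetic weights in kg for N = 5,376 students (made-up data, illustration only).
rng = random.Random(0)
population = [rng.gauss(68.0, 11.0) for _ in range(5376)]

# The population mean needs all 5,376 weighings.
population_mean = statistics.fmean(population)

# The estimate needs only n = 200 weighings.
estimate = sample_mean_estimate(population, n=200, seed=1)

print(f"population mean: {population_mean:.2f} kg")
print(f"sample mean:     {estimate:.2f} kg")
```

With a different seed the sample mean changes slightly, which is exactly the sample-to-sample variation that the concepts below describe.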

The population size (N) is the number of elements in the population, and the sample size (n) is the number of elements in the sample. Populations can be finite or infinite.

The data obtained from an entire population contain all the information one could want about it. The aim of sampling is to recover as much of that information as possible from the sample data alone.

It is worth pointing out that the same set can be a population in one investigation and a sample in another; this depends on the objective of the investigation. For example, to determine the average height of university students in Cuba, a sample could consist of some universities chosen from across the country. If, on the contrary, we want to know the average height of the students of one specific university in Cuba, then the set of all the students of that university is the population, and the sample is given by the groups, degree programs or years selected to carry out the study.

Parameter

Parameters are measures that describe the probability distribution of the population, such as the mean, the variance or the proportion.

Statistic

A statistic is a measure computed from a sample, such as the sample mean, and it is therefore used as an estimate of the corresponding parameter.

Sample error

Also called estimation error. It is the difference between a statistic and its corresponding parameter. Its typical size, the standard error, measures the variability of the estimates from repeated samples around the population value; it gives a clear idea of how far, and with what probability, an estimate based on a sample may lie from the value that a complete census would have produced.

Some error is always made, but the nature of the research determines how much error can be tolerated (the results are subject to sampling error, and confidence intervals vary from sample to sample). Its value differs according to whether it is calculated at the beginning of the study or at the end. A statistic is more precise the smaller its error. The standard error can be described as the standard deviation of the sampling distribution of a statistic, and it indicates how reliable the statistic is.
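The variability of estimates across repeated samples can be illustrated with a small simulation. The sketch below assumes a synthetic population of heights (made-up values), draws many samples with replacement, and compares the spread of the sample means with the textbook value sigma / sqrt(n). The function name `sample_means` and all numeric settings are illustrative choices, not part of the original text.

```python
import random
import statistics

def sample_means(population, n, repetitions, seed=0):
    """Return the mean of each of `repetitions` samples of size n (with replacement)."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(repetitions)]

# Synthetic population of heights in cm (made-up data, illustration only).
rng = random.Random(42)
population = [rng.gauss(170.0, 8.0) for _ in range(10_000)]
mu = statistics.fmean(population)        # parameter: population mean
sigma = statistics.pstdev(population)    # parameter: population standard deviation

n = 100
means = sample_means(population, n, repetitions=2000)

# Sampling error of each sample: statistic minus parameter.
sampling_errors = [m - mu for m in means]

# Standard error: standard deviation of the sampling distribution of the mean.
empirical_se = statistics.pstdev(means)
theoretical_se = sigma / n ** 0.5

print(f"largest sampling error:  {max(abs(e) for e in sampling_errors):.2f} cm")
print(f"empirical standard error:   {empirical_se:.3f} cm")
print(f"theoretical sigma/sqrt(n):  {theoretical_se:.3f} cm")
```

The individual sampling errors differ from sample to sample, while their overall spread stays close to sigma / sqrt(n).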

Confidence level

The confidence level is the probability that the estimation procedure yields a correct result. Assuming that the quantity being estimated follows a known probability law (commonly the normal, or Gaussian, distribution, or Student's t distribution), the confidence level is the probability that the interval built around a statistic captures the true value of the parameter.
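A minimal sketch of this idea, assuming a normal population with known mean so that the result can be checked: an interval is built around each sample mean using the normal critical value, and the share of intervals that capture the true mean is counted. The function names, the sample size of 50 and the population values are hypothetical; for small samples Student's t critical values would be the more usual choice, which this sketch does not implement.

```python
import random
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / n ** 0.5
    # Two-sided critical value, about 1.96 for a 95 % confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

def coverage(true_mean, true_sd, n, repetitions, confidence=0.95, seed=0):
    """Share of repeated-sample intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = mean_confidence_interval(sample, confidence)
        if low <= true_mean <= high:
            hits += 1
    return hits / repetitions

observed = coverage(true_mean=50.0, true_sd=10.0, n=50, repetitions=2000)
print(f"share of intervals that capture the true mean: {observed:.3f}")
```

The observed share should come out close to the nominal 95 %, which is what the confidence level promises about the procedure as a whole rather than about any single interval.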

Population variance

The more homogeneous a population is, the smaller its variance and the smaller the number of observations (for example, interviews) needed to build a reduced model of the universe, that is, of the population. The population variance is generally unknown and must be estimated, for example from the data of previous studies.
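The link between variance, tolerated error, confidence level and sample size can be made concrete with the standard formula for estimating a mean, n = (z · sigma / e)², optionally reduced by a finite population correction. The sketch below is illustrative: the function name `required_sample_size` and its parameters are assumptions, and sigma stands for a prior estimate of the population standard deviation.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin_of_error, confidence=0.95, population_size=None):
    """Sample size needed to estimate a mean within +/- margin_of_error.

    sigma is an assumed (previously estimated) population standard deviation.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = (z * sigma / margin_of_error) ** 2
    if population_size is not None:
        # Finite population correction: fewer observations are needed
        # when the sample is a sizeable fraction of the population.
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)

# A more homogeneous population (smaller sigma) needs a smaller sample.
print(required_sample_size(sigma=10, margin_of_error=2))   # 97
print(required_sample_size(sigma=20, margin_of_error=2))   # 385
print(required_sample_size(sigma=20, margin_of_error=2, population_size=5376))
```

Doubling the standard deviation roughly quadruples the required sample size, which is why a homogeneous population is cheaper to study.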

Statistical inference

Statistical inference addresses the problem of extracting the information about the population that is contained in samples.

In order for the results obtained from the sample data to be extended to the population, the sample must be representative of the population as regards the characteristic under study, that is, the distribution of the characteristic in the sample must be approximately equal to the distribution of the characteristic in the population.

Statistical representativeness is achieved through adequate sampling, which always includes randomness in the selection of the population elements that will make up the sample. Even so, such methods make representativeness very probable, not certain.

With these preliminary concepts in place, we can turn to some of the scientifically sound ways of drawing a sample.

Sampling from a population can be probabilistic or non-probabilistic. Here we refer to probabilistic sampling and, within it, to simple random sampling (SRS), the basic method in statistics, as well as stratified sampling and cluster sampling.

Simple random sampling: the method in which each element of the population has the same probability of being selected for the sample.

A simple random sample is one whose elements are selected by simple random sampling.

In practice we are not interested in the selected individual or element as such, but only in a characteristic that we measure or observe in it. The value of that characteristic is the value of a random variable, which for each individual or element of the population takes a value from a certain set of possible values.

A simple random sample can therefore be interpreted as a set of values of n independent random variables, each with the same distribution, called the population distribution.

There are two ways to extract a sample from a population: with replacement and without replacement.

  • Sampling with replacement: An element can be selected more than once. Each element is drawn from the population, observed and returned to it, so an unlimited number of draws can be made even when the population is finite.
  • Sampling without replacement: The extracted elements are not returned to the population until all the elements that make up the sample have been extracted, so no element can appear in the sample more than once.
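The two ways of drawing a simple random sample can be sketched with Python's standard `random` module: `choices` draws with replacement and `sample` draws without replacement. The wrapper function names, the population of ten numbered elements and the seed are hypothetical, chosen only for the illustration.

```python
import random

def sample_with_replacement(population, n, rng):
    """Each draw is returned to the population, so repeats are possible."""
    return rng.choices(population, k=n)

def sample_without_replacement(population, n, rng):
    """Drawn elements are not returned, so every element appears at most once."""
    return rng.sample(population, n)

rng = random.Random(7)
population = list(range(1, 11))   # a finite population of N = 10 elements

# With replacement the sample may be larger than the population itself.
with_repl = sample_with_replacement(population, 20, rng)

# Without replacement the sample size cannot exceed N.
without_repl = sample_without_replacement(population, 5, rng)

print("with replacement:   ", with_repl)
print("without replacement:", without_repl)
```

Asking `sample_without_replacement` for more than 10 elements raises a `ValueError`, which mirrors the fact that a finite population is exhausted when its elements are not returned.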

When designing a probabilistic sample, two aspects must mainly be taken into account:

  1. The selection method.
  2. The sample size.

 

by Abdullah Sam