Have you ever heard the term data mining? If you are a scientist or anyone who works with data, you are probably familiar with it. Data mining, which is closely related to data science, is widely discussed and widely used today.
In short, data mining is a set of methods and tools for extracting useful information from large amounts of data in a relatively short time. More specifically, it applies statistical analysis to data in order to find patterns. This article discusses it in more detail.
Understanding Data Mining
Data mining is the process of gathering, or mining, important information from a large body of data. The process usually draws on statistics, mathematics, machine learning, and artificial intelligence. These fairly complex techniques identify and extract useful information from a large database.
Other terms often used for data mining include knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, and business intelligence.
Data mining techniques are used to examine large databases in order to find patterns that are new and useful. However, not every information retrieval task counts as data mining.
A simple example is reading through a telephone book. If, after reading it, you learn that most people named Asep live in Bandung, that can be called a data mining process.
But if you only look up Asep Hidayat’s address in the telephone book, that is not data mining. It is just an ordinary query. Data mining results are also usually carried through to implementation, where the discovered patterns are put to use. The process is often likened to gold mining: among a great deal of material you find only a little gold, but that gold is very valuable.
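The difference between an ordinary query and a mining-style question can be sketched in a few lines of Python. This is only an illustration: the phone book entries and the function names `lookup_city` and `most_common_city` are made up for the example.

```python
from collections import Counter

# Hypothetical phone book entries: (first name, last name, city).
phone_book = [
    ("Asep", "Hidayat", "Bandung"),
    ("Asep", "Sunandar", "Bandung"),
    ("Asep", "Kurnia", "Jakarta"),
    ("Budi", "Santoso", "Surabaya"),
    ("Asep", "Permana", "Bandung"),
]

def lookup_city(first, last):
    """Ordinary query: fetch the city of one known person."""
    for f, l, city in phone_book:
        if (f, l) == (first, last):
            return city
    return None

def most_common_city(first):
    """Mining-style question: where do most people with this name live?"""
    counts = Counter(city for f, _, city in phone_book if f == first)
    if not counts:
        return None
    return counts.most_common(1)[0]  # (city, number of entries)

print(lookup_city("Asep", "Hidayat"))
print(most_common_city("Asep"))
```

The first function answers a question whose answer is already stored in one record. The second summarizes all records to surface a pattern that no single record contains.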
Data mining is an integral part of KDD (knowledge discovery in databases). The entire KDD process for converting raw data into useful information is shown below:
As the picture above shows, the KDD process uses many techniques and concepts. Several steps are needed to get the desired data or information.
The KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation.
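The steps listed above can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not a real KDD implementation: the two data sources, the thresholds, and every function name are hypothetical, and the "mining" step is reduced to a simple threshold rule.

```python
# Hypothetical raw records from two sources; None marks a missing value.
store_a = [{"customer": "c1", "amount": 120.0}, {"customer": "c2", "amount": None}]
store_b = [{"customer": "c1", "amount": 80.0}, {"customer": "c3", "amount": 40.0}]

def clean(records):
    """Data cleaning: drop records with a missing amount."""
    return [r for r in records if r["amount"] is not None]

def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]

def select(records, min_amount):
    """Data selection: keep only records relevant to the question."""
    return [r for r in records if r["amount"] >= min_amount]

def transform(records):
    """Transformation: aggregate spending per customer."""
    totals = {}
    for r in records:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals

def mine(totals, threshold):
    """Data mining (toy version): find customers whose total exceeds a threshold."""
    return {c: t for c, t in totals.items() if t > threshold}

def evaluate(patterns, min_count=1):
    """Pattern evaluation: keep the result only if it has enough entries."""
    return patterns if len(patterns) >= min_count else {}

def present(patterns):
    """Knowledge presentation: turn the result into readable sentences."""
    return [f"{c} spent {t:.2f} in total" for c, t in sorted(patterns.items())]

def run_kdd():
    records = integrate(clean(store_a), clean(store_b))
    selected = select(records, min_amount=50.0)
    totals = transform(selected)
    patterns = evaluate(mine(totals, threshold=150.0))
    return present(patterns)

print(run_kdd())
```

Each function corresponds to one named step, so the order of the calls in `run_kdd` mirrors the order of the steps in the list.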
Data Mining Functions
Data mining has many functions. The two main ones are description and prediction. More broadly, data mining has four basic functions: prediction, description, classification, and association. Each of the four is explained below:
- Prediction Function
The prediction function is a process that finds certain patterns in data. The patterns are derived from the various variables contained in the data.
Once a pattern has been found, it can be used to predict other variables whose type or value is unknown.
This is why it is called the prediction function: it amounts to predictive analysis. It can also be used to predict variables that are not contained in the data.
The prediction function is therefore useful to anyone who needs a reasonably accurate forecast to support an important decision.
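As a minimal sketch of the idea, the example below fits a straight line to known values and uses it to predict an unknown one. Ordinary least squares is just one possible technique, chosen here for brevity; the advertising and sales figures are invented for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted line to a new input value."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical data: advertising spend and units sold in four past months.
ad_spend = [1.0, 2.0, 3.0, 4.0]
units_sold = [12.0, 14.0, 16.0, 18.0]

model = fit_line(ad_spend, units_sold)  # the "pattern" found in the data
print(predict(model, 5.0))              # forecast for a spend not yet observed
```

The fitted line is the pattern, and calling `predict` with a new value is the step where that pattern is used to estimate something unknown.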
- Description Function
The next function is the description function. In data mining, the description function helps you understand the data being observed. The aim is to learn how the data behaves, which in turn reveals the characteristics of the data in question.
Using the description function, you can find patterns hidden in the data. In other words, recurring and valuable patterns make the characteristics of the data easy to see, which adds to what is known about it.
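One simple way to describe a dataset is to compute summary statistics. The sketch below uses Python's standard `statistics` module; the sales figures and the `describe` helper are hypothetical.

```python
import statistics

# Hypothetical daily sales figures for one week.
daily_sales = [120, 135, 128, 150, 400, 131, 126]

def describe(values):
    """Summarize the center, spread, and range of a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

summary = describe(daily_sales)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up week the mean is well above the median, which hints that one unusually large day is pulling the average up. That is the kind of characteristic the description function is meant to reveal.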
- Classification Function
The next function is classification. In data mining, classification means processing existing data to find a function or model that describes the concepts or classes in the data. The function or model then assigns each data item to a particular group.
These groups can later be used to predict future data trends. Classifying the data also makes it easier for the data owner to find the data they need.
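The idea of learning a model from existing data and then using it to assign new data to a group can be sketched with a nearest-centroid classifier. This is one simple technique among many, picked for brevity; the customer features, labels, and function names are assumptions made for the example.

```python
import math

# Hypothetical labeled data: (visits per month, average basket value).
training = {
    "occasional": [(1, 10.0), (2, 12.0), (1, 8.0)],
    "loyal": [(8, 40.0), (10, 45.0), (9, 50.0)],
}

def centroid(points):
    """Average position of a group of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(data):
    """The 'model' is simply one centroid per class."""
    return {label: centroid(points) for label, points in data.items()}

def classify(model, point):
    """Assign the point to the class with the nearest centroid."""
    return min(model, key=lambda label: math.dist(model[label], point))

model = train(training)
print(classify(model, (7, 38.0)))
print(classify(model, (2, 11.0)))
```

The features here are not scaled, so the basket value dominates the distance. A real application would normally normalize the features first.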
- Association Function
The next function is association. Association analysis in data mining is a process for finding association rules, that is, combinations of items that tend to occur together. The existing data is processed to reveal the relationship between one variable and another.
To make this easier to understand, consider an analysis of purchases at a minimarket. Suppose processing the purchase data reveals a relationship between purchases of eggs and soy sauce. If customers are likely to buy eggs and soy sauce together, the minimarket can use this information to arrange where the two products are placed.
Soy sauce can be placed on a shelf not far from the eggs. The store could also offer soy sauce as a bonus with every egg purchase, or use some other approach that takes advantage of the relationship between the two products. Measures like these can benefit the store.
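The strength of a relationship such as the one between eggs and soy sauce is commonly measured with support, confidence, and lift. The sketch below computes all three over a small set of transactions; the transactions are invented for the example.

```python
# Hypothetical minimarket transactions, one set of items per purchase.
transactions = [
    {"eggs", "soy sauce", "rice"},
    {"eggs", "soy sauce"},
    {"eggs", "bread"},
    {"soy sauce", "noodles"},
    {"eggs", "soy sauce", "noodles"},
    {"bread", "milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent):
    """Of the transactions with the antecedent, the fraction that also have the consequent."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to how often the consequent appears anyway."""
    return confidence(antecedent, consequent) / support(consequent)

eggs, soy = {"eggs"}, {"soy sauce"}
print("support:", support(eggs | soy))
print("confidence:", confidence(eggs, soy))
print("lift:", lift(eggs, soy))
```

A lift above 1 means the two items appear together more often than they would if purchases were independent, which is the kind of evidence that would justify placing them near each other.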
Examples of Data Mining
Data mining is applied in various sectors, for example business, finance, and management. Here are some data mining applications from several sectors:
- Management and Market Analysis
In the marketing sector, data mining is usually used for customer relationship management (CRM), target marketing, cross-selling, market analysis, and market segmentation.
- Target marketing, for example finding groups of “model” customers who share characteristics such as income level, interests, and shopping habits, or determining customers' purchase patterns over time.
- Market traffic analysis, for example finding relationships between product sales and making predictions based on those associations.
- Customer profiling, for example identifying what types of customers buy what products (classification or grouping).
- Analysis of customer needs, for example identifying which products are best for different groups of customers, predicting what factors will attract new customers, and producing statistical summary information (central tendency and variation of the data).
- Risk Management and Corporate Analysis
In the corporate sector, data mining is usually applied to customer retention, prediction, better underwriting, competitive analysis, and quality control.
- Financial planning and asset evaluation, for example analyzing and predicting cash flow, analyzing contingent claims to evaluate assets, and performing cross-sectional and time series analysis.
- Resource planning, for example summarizing and comparing resources and spending.
- Competition, for example monitoring competitors and market direction, setting pricing strategies in highly competitive markets, and grouping customers into classes.
- Mining Unusual Patterns and Fraud Detection
Data mining is also used to find and detect fraud in a system. With data mining methods, millions of incoming transactions can be examined.
- Approaches, for example clustering and model construction for fraud, and outlier analysis.
- Applications: health services, credit card services, telecommunications, the retail industry, money laundering, auto insurance, health insurance, and so on.
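Outlier analysis, one of the approaches mentioned above, can be sketched with a modified z-score based on the median and the median absolute deviation (MAD). This is a minimal illustration rather than a production fraud detector: the transaction amounts are invented, and the cutoff of 3.5 is a commonly used rule of thumb that would need tuning on real data.

```python
import statistics

def find_outliers(values, cutoff=3.5):
    """Return values whose modified z-score exceeds the cutoff."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [v for v in values if abs(0.6745 * (v - median) / mad) > cutoff]

# Hypothetical card transaction amounts for one customer.
amounts = [25.0, 30.0, 27.0, 22.0, 31.0, 28.0, 950.0, 26.0]

print(find_outliers(amounts))
```

The median is used instead of the mean because a single very large transaction would otherwise shift the baseline and partly hide itself. A flagged amount is only a candidate for review, not proof of fraud.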