The proposed technology is a quick, accurate, and statistically sound method for finding the most important features in a dataset. In big data, determining which variables are important and which are redundant can be extremely difficult. Selecting appropriate and informative features allows for more powerful statistical inference and more efficient computation and data storage. Current methods either rely on poor assumptions or transform the data in ways that leave it nonsensical or unusable. The proposed technology allows for efficient and accurate feature selection while keeping the data statistically meaningful.
Cooperative Game Theory for Feature Selection
This technology uses cooperative game theory to group variables into “coalitions” and selects the features that are most useful for storage and later inference. Current technologies for analysis of big data use one of several methods for feature selection, but some of these methods cannot be applied to certain types of data, such as gene expression data, and others have trouble handling data with large amounts of collinearity. This technology has been shown to outperform these types of feature selection when applied to blood loss severity data.
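To make the coalition idea concrete, here is a minimal sketch of one common way cooperative game theory can score features: the Shapley value, which averages each feature's marginal contribution over all coalitions of the other features. This summary does not say which solution concept or coalition value function the technology actually uses, so the Shapley value, the toy `coverage` value function, the `SIGNALS` data, and the `shapley_values` helper are all illustrative assumptions, not the proposed method itself.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for a coalition value function.

    `value` maps a frozenset of feature names to a number. This brute-force
    version enumerates every coalition, so it only suits small feature sets.
    """
    n = len(features)
    scores = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight given to coalitions of size k that do not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of f when it joins coalition s.
                total += weight * (value(s | {f}) - value(s))
        scores[f] = total
    return scores


# Hypothetical toy data: features "a" and "b" carry the same signal
# (perfectly collinear), while "c" carries a different one.
SIGNALS = {"a": {1}, "b": {1}, "c": {2}}


def coverage(coalition):
    """Assumed value function: number of distinct signals a coalition covers."""
    covered = set()
    for f in coalition:
        covered |= SIGNALS[f]
    return len(covered)


scores = shapley_values(list(SIGNALS), coverage)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

In this toy setup the two redundant features split the credit for their shared signal, while the feature carrying unique information ranks first, which illustrates how a coalition-based score can account for collinearity. In practice the value function would be a measure of how informative a feature subset is on real data, and exact enumeration would give way to sampling or another approximation.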
Applications
- Genomic Data Analysis
- Disease Diagnosis
- Social Network Analysis
- Climate Science
- Financial Market Predictions
- Sales Intelligence
Advantages
- Requires fewer searches and less computational time than competing methods
- Agnostic to type of data being analyzed
- Independent of the downstream learning method, such as Support Vector Machines or k-means clustering