
Big Data Mining Methods

Mining the data created on the internet is an ever-increasing challenge: we generate a staggering amount of it, roughly 2.5 quintillion bytes every day. With that much data being generated, there have to be ways to filter and sort it efficiently. Several methods are in use right now, one of which is association. When someone buys washing powder on Amazon, for example, it makes sense that they would also want to buy fabric softener. This association is very simple, but the same principle can be applied across millions of different products, making it easier to associate data with a person.
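To make the washing powder example concrete, here is a minimal sketch of how an association between two products could be measured. The baskets are made up for illustration, and `pair_confidence` is a hypothetical helper name, not part of any library. It counts how often two items are bought together and divides by how often the first item is bought at all, which is usually called the confidence of the rule "A implies B".

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"washing powder", "fabric softener"},
    {"washing powder", "fabric softener", "pegs"},
    {"washing powder", "pegs"},
    {"fabric softener"},
    {"bread", "milk"},
]


def pair_confidence(baskets):
    """Return confidence(A -> B) = count(A and B) / count(A) for each ordered pair."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        # Count every pair of items in the basket in both directions.
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {
        (a, b): count / item_counts[a]
        for (a, b), count in pair_counts.items()
    }


confidence = pair_confidence(baskets)
print(confidence[("washing powder", "fabric softener")])
```

In this toy data, two of the three baskets containing washing powder also contain fabric softener, so the rule holds about two thirds of the time. Real systems work on millions of baskets and typically also filter out pairs that appear too rarely to be meaningful.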
Another method for mining big data is clustering. Clustering is similar to association in that multiple products, items or even ideas can be grouped together. The group can then be labelled as a certain entity, and anything that relates to that entity can be found within it. This is very handy for narrowing down fields and getting specific information.
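As a rough illustration of clustering, the sketch below uses k-means, which is one common clustering algorithm and not necessarily the one any particular company uses. The points are invented (think of each as a product described by two numbers, such as price and weight), and the starting centroids are chosen by hand so the result is repeatable.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical items, each described by two made-up numeric features.
points = [
    (1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
    (8.0, 8.0), (9.0, 8.5), (8.5, 9.0),
]
# Hand-picked starting centroids (an assumption made for repeatability).
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(clusters)
```

The three low-valued items end up in one group and the three high-valued items in the other. Each group could then be given a label and searched on its own, which is the "narrowing down" described above.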





https://datafloq.com/read/5-major-data-mining-techniques-being-used-big-data/3352


Popular posts from this blog

Limitations of Traditional Data Analysis

Traditional data analysis is outdated for a few reasons. The number one reason is that it simply cannot keep up with the massive amounts of data being generated in this day and age; the systems were not designed with such large quantities of data in mind. Secondly, most of the data being generated is semi-structured or unstructured, such as email, video, audio and text, meaning it is in a format that can't be read by older traditional data analysis systems.

Types of Problems Suited to Big Data Analysis

Big data has a lot of valuable uses when it comes to companies making a profit, but it can also be used to solve real-life problems and make the world a better place. For example, back in 2014 the CDC (the Centers for Disease Control) was able to track vital information such as health, population and movement data to predict the spread of disease. This was extremely handy for preventing the virus from spreading to further areas and inflicting more pain and distress on the population. There is also work being carried out to predict floods that will happen within the next 100 to 500 years. This work has been made possible by information gathered from floods over the last two decades, with artificial intelligence used to predict these future floods, preemptively saving the lives of countless people. https://insidebigdata.com/2018/03/21/big-data-revolution-data-can-solve-commercial-public-health-problems/ https://hack...