Nowadays an enormous amount of dynamic, heterogeneous, complex, and unbounded data is obtained from various sectors such as social networks, genomics, physics, health, and climatology. Operating on and managing these data is tedious; at the same time, it is important to achieve the desired speed and performance in data processing. In existing systems, hardware is more effective than software. The processor-based software used earlier has a major disadvantage at the algorithm level: it is not effective at dealing with huge volumes of data or at achieving overall efficiency. In big data analysis, hardware support is important for overcoming real-time issues. Clustering is a major data mining task in big data analytics. It establishes relationships between objects by means of similarity and categorizes the data into meaningful groups. In this work, a novel k-means algorithm is proposed to minimize running time. The algorithm has a simple and scalable parallel architecture that is easy to implement on an FPGA-based parallel processing architecture. This implementation makes a K-means clustering system more efficient when dealing with big data. It is also applicable to reconfigurable hardware platforms such as FPGAs, which are known for real-time clustering applications. The proposed system is implemented on our hardware design with a benchmark dataset in order to demonstrate its feasibility and efficiency. Our proposed hardware architecture is well suited to handling different kinds of datasets, varying numbers of clusters, and huge volumes of data.
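For readers unfamiliar with how k-means lends itself to a parallel architecture, the following is a minimal software sketch of the standard (Lloyd's) k-means iteration, not the novel algorithm or the FPGA design proposed in the paper. It assumes the data are split across a fixed number of independent units, each of which assigns its points to the nearest centroid and returns partial sums that are then reduced into new centroids. The names `assign_chunk`, `kmeans`, and `num_units` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance, used here as the similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_chunk(
    chunk: Sequence[Point], centroids: Sequence[Point]
) -> Tuple[List[List[float]], List[int]]:
    """Work done by one (hypothetical) processing unit.

    Assigns each point in the chunk to its nearest centroid and returns
    per-cluster partial coordinate sums and point counts.
    """
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in chunk:
        nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]
    return sums, counts


def kmeans(
    points: Sequence[Point],
    initial_centroids: Sequence[Point],
    num_units: int = 4,
    max_iters: int = 100,
) -> List[List[float]]:
    """Standard Lloyd's k-means with the assignment step split into chunks."""
    centroids = [list(c) for c in initial_centroids]
    k, dim = len(centroids), len(centroids[0])
    # Interleaved partition: unit i receives points i, i + num_units, ...
    chunks = [points[i::num_units] for i in range(num_units)]
    for _ in range(max_iters):
        # Each call is independent of the others, so on parallel hardware
        # these could run concurrently; here they run sequentially.
        partials = [assign_chunk(chunk, centroids) for chunk in chunks]
        new_centroids = []
        for j in range(k):
            count = sum(counts[j] for _, counts in partials)
            if count == 0:
                # Empty cluster: keep the previous centroid (an assumption).
                new_centroids.append(centroids[j])
                continue
            new_centroids.append(
                [sum(sums[j][d] for sums, _ in partials) / count for d in range(dim)]
            )
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return centroids
```

Because each unit only reads the current centroids and writes its own partial sums, the assignment step has no dependencies between chunks; only the final reduction needs to combine results. This independence is what makes the assignment step a natural candidate for parallel processing, though how the paper's architecture exploits it is not specified in the abstract.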
S., Castro and Pushpalakshmi, R.
"A Novel K-Means Clustering-Based FPGA Parallel Processing in Big Data Analysis,"
Applied Mathematics & Information Sciences: Vol. 13, Article 10.
Available at: https://dc.naturalspublishing.com/amis/vol13/iss5/10