Increasingly sophisticated methods and tools are needed for tracking the dynamics of, and detecting inherent structures in, modern highly voluminous, multi-faceted data. Data scientists have long realised that tackling global challenges such as climate change, terrorism and food security cannot be contained within the frameworks and models of conventional data analysis. For example, separating noise from meaningful data, even in low-dimensional data with heavy tails and/or overlaps, is quite challenging, and standard non-linear approaches do not always succeed. Tracking the dynamics of multi-faceted data involving complex systems is tantamount to tracking agent-based complex systems with many interacting agents. Dimension-reduction methods are commonly used to try to capture structures inherent in data, but they do not generally lead to optimal solutions, mainly because their optimisation functions and theoretical methods typically rely on special structures. We propose a parameter leveraging method for unsupervised big data modelling. The method searches for structures in data and creates a series of sub-structures, which are subsequently merged or split. The strategy is to present the algorithm with a set of periodic data as one complex system. It then uses the patterns in the sub-structures to determine the overall behaviour of the complex system. Applications to solar magnetic activity cycles and seismic data show that the proposed method outperforms conventional unsupervised methods. We illustrate how the method can be extended to supervised modelling.
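The idea of creating sub-structures and then merging them can be illustrated with a toy sketch. This is not the parameter leveraging algorithm itself, whose details are not given here; it is a minimal illustration that assumes a one-dimensional series, fixed-width windows as the initial sub-structures, and a simple difference-of-means merge rule. The names `split_into_windows`, `merge_similar`, `window` and `tol` are hypothetical.

```python
from statistics import mean


def split_into_windows(series, window):
    """Cut the series into consecutive windows (the initial sub-structures)."""
    return [series[i:i + window] for i in range(0, len(series), window)]


def merge_similar(segments, tol):
    """Merge neighbouring segments whose means differ by less than tol.

    Assumption: closeness of means stands in for whatever similarity
    criterion a real method would use.
    """
    merged = [list(segments[0])]
    for seg in segments[1:]:
        if abs(mean(merged[-1]) - mean(seg)) < tol:
            merged[-1].extend(seg)   # same regime: grow the current segment
        else:
            merged.append(list(seg))  # new regime: start a new segment
    return merged


# Toy series alternating between a low and a high regime (made-up values).
series = [0.1, 0.2, 0.0, 0.1, 5.0, 5.2, 4.9, 5.1, 0.2, 0.1, 0.0, 0.3]
segments = merge_similar(split_into_windows(series, window=2), tol=1.0)
print([len(s) for s in segments])
```

On this toy series the six initial windows collapse into three segments, one per regime. A splitting step, which the abstract also mentions, is omitted to keep the sketch short.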
Mwitondi, Kassim and Khorsheed, Eman. "A Parameter Leveraging Method for Unsupervised Big Data Modelling," Journal of Statistics Applications & Probability: Vol. 5, Article 1.
Available at: https://dc.naturalspublishing.com/jsap/vol5/iss2/1