IIUM Repository

A spark-based parallel fuzzy C median algorithm for web log big data

Mallik, Moksud Alam and Zulkurnain, Nurul Fariza and Nizamuddin, Mohammed Khaja and Sarkar, Rashal and Chalil, Aboosalih Kakkat (2022) A spark-based parallel fuzzy C median algorithm for web log big data. International Journal on “Technical and Physical Problems of Engineering” (IJTPE), 14 (3). pp. 212-220. ISSN 2077-3528

Abstract

Nowadays, the World Wide Web (WWW) is regarded as an exceptionally large data storehouse, and it grows more complex and substantial every day. The current situation is that we are starved for knowledge while drowning in data. For this reason, clustering is one of the most important data mining tools for extracting useful information from the web. Numerous successful clustering techniques have been developed for small datasets. Nevertheless, these techniques do not provide adequate results when dealing with large datasets. The main problems are excessive computational complexity and long processing time, which are not acceptable in a real-time context. It is essential to process this enormous volume of information in a timely manner. This paper proposes an efficient parallel Fuzzy C median solution based on Spark for large-scale web log data. The performance of the parallel Fuzzy C median algorithm is evaluated on the PySpark platform using the Rand Index and SSE (sum of squared error). According to the experimental findings, the parallel Fuzzy C median method built on Spark performs better.

Item Type: Article (Journal)
Uncontrolled Keywords: Fuzzy Clustering, Web Log Big Data, Parallel Computing, Apache Spark
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering
Kulliyyah of Engineering > Department of Electrical and Computer Engineering
Depositing User: DR Nurul Fariza Zulkurnain
Date Deposited: 27 Dec 2022 09:17
Last Modified: 27 Dec 2022 10:06
URI: http://irep.iium.edu.my/id/eprint/102189
