IIUM Repository

Using regular expressions for mining data in large software repositories

Awang Abu Bakar, Normi Sham (2014) Using regular expressions for mining data in large software repositories. In: 2014 The 5th International Conference on Information and Communication Technology for The Muslim World (ICT4M), 17th-18th November 2014, Kuching, Sarawak, Malaysia.


Abstract

Applying data mining techniques to software repositories involves extracting both basic and value-added information from them. Regular expressions (regex) provide a mechanism for selecting specific strings from a set of character strings. In this paper, we discuss how regular expressions are used to build a data mining tool known as OSSGrab. We developed the tool as Python scripts in combination with regex, and as a result the time spent on data collection can be reduced significantly.
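
As an illustration of the general idea only, the following minimal Python sketch shows how a regular expression can select specific strings from a larger body of text. It is not taken from OSSGrab: the page fragment, the pattern, and the names SAMPLE_PAGE, PROJECT_PATTERN and extract_projects are all hypothetical, chosen to show the technique under the assumption that project names and download counts appear in a regular textual layout.

```python
import re

# Hypothetical page fragment; the pages actually mined by OSSGrab
# are not described in the abstract.
SAMPLE_PAGE = """
<div class="project"><h2>alpha-tool</h2><span class="downloads">1,204 downloads</span></div>
<div class="project"><h2>beta-lib</h2><span class="downloads">87 downloads</span></div>
"""

# Hypothetical pattern: named groups capture the project name and
# the raw download count (digits with optional thousands separators).
PROJECT_PATTERN = re.compile(
    r'<h2>(?P<name>[^<]+)</h2>\s*'
    r'<span class="downloads">(?P<count>[\d,]+) downloads</span>'
)


def extract_projects(page_text):
    """Return a list of (name, downloads) tuples found in page_text."""
    results = []
    for match in PROJECT_PATTERN.finditer(page_text):
        # Strip thousands separators before converting to an integer.
        downloads = int(match.group("count").replace(",", ""))
        results.append((match.group("name"), downloads))
    return results


projects = extract_projects(SAMPLE_PAGE)
for name, downloads in projects:
    print(name, downloads)
```

Compiling the pattern once and iterating with finditer keeps the extraction to a single pass over the text, which is one plausible reason a regex-based script can shorten manual data collection.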

Item Type: Conference or Workshop Item (Plenary Papers)
Additional Information: 3509/42896
Uncontrolled Keywords: data mining; software repository; regular expression; open source
Subjects: T Technology > T Technology (General)
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Information and Communication Technology > Department of Computer Science
Depositing User: Dr. Normi Sham Awang Abu Bakar
Date Deposited: 21 May 2015 16:24
Last Modified: 20 Sep 2017 09:07
URI: http://irep.iium.edu.my/id/eprint/42896
