A Fuzzy-based approach to programming language independent source-code plagiarism detection

Acampora, Giovanni; Cosma, Georgina

doi:10.1109/FUZZ-IEEE.2015.7337935

Source-code plagiarism detection concerns the identification of source-code files that contain similar and/or identical source-code fragments. Fuzzy clustering approaches are well suited to detecting source-code plagiarism because they can capture the qualitative and semantic elements of similarity. This paper proposes a novel fuzzy-based approach to source-code plagiarism detection, based on Fuzzy C-Means and the Adaptive Neuro-Fuzzy Inference System (ANFIS). In addition, the performance of the proposed approach is compared with that of the Self-Organising Map (SOM) and the state-of-the-art plagiarism detection algorithm Running Karp-Rabin Greedy-String-Tiling (RKR-GST). The advantage of the proposed approach is that it is programming language independent, so no parsers or compilers need to be developed for the fuzzy-based predictor to provide detection in different programming languages. The results demonstrate that the proposed fuzzy-based approach outperforms all the other approaches on well-known source-code datasets, showing promise as an efficient and reliable approach to source-code plagiarism detection. © 2015 IEEE.

A Fuzzy-based approach to programming language independent source-code plagiarism detection / Acampora, G., Cosma, G. - (2015), pp. 1-8. (2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2015)) [10.1109/FUZZ-IEEE.2015.7337935].