Machine learning in the classification of computer code

Publisher

Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science

Abstract

Machine learning is a well-established approach to analyzing natural language. Sociolinguistic characteristics such as an author's gender, experience, and age have a strong influence on natural language use. Previous research has shown that computer programs can be analyzed with similar linguistics-based approaches. In this research, we use machine learning techniques to classify computer programs according to their author's programming experience, and we use machine learning and statistical approaches to determine which features are most significant for that classification. Several experiments were carried out on a dataset of computer programs written in C++, and the results are encouraging: they indicate that the author's programming experience can be predicted with an accuracy of 69%.
