Computer program categorization with machine learning

dc.contributor.author: Rafee, Md Mahmudul Hasan
dc.contributor.author: University of Lethbridge. Faculty of Arts and Science
dc.contributor.supervisor: Rice, Jacqueline E.
dc.date.accessioned: 2017-11-21T19:57:03Z
dc.date.available: 2017-11-21T19:57:03Z
dc.date.issued: 2017
dc.degree.level: Masters (en_US)
dc.description.abstract: Machine learning techniques have been applied to improve the learning process and to study how natural languages are used. Previous research has shown that similar techniques can be applied to the analysis of computer programming (artificial) languages. Several studies have demonstrated the influence of sociolinguistic characteristics such as age, gender, region, and social status in natural languages. This research focuses on determining the impact of the author's sociolinguistic characteristics, particularly gender and region, on computer programs. We use machine learning and statistical techniques to identify similarities and differences in the use of programming language based on the gender and region of the programmer. The results of various experiments are promising. We demonstrate that we can predict the gender of programmers with 83.1% accuracy and the region of the programmer with 92.5% accuracy. (en_US)
dc.description.sponsorship: Alberta Innovates - Technology Futures (AITF) (en_US)
dc.embargo: No (en_US)
dc.identifier.uri: https://hdl.handle.net/10133/4984
dc.language.iso: en_US (en_US)
dc.proquest.subject: 0984 (en_US)
dc.proquestyes: Yes (en_US)
dc.publisher: Lethbridge, Alta. : University of Lethbridge, Department of Mathematics and Computer Science (en_US)
dc.publisher.department: Department of Mathematics and Computer Science (en_US)
dc.publisher.faculty: Arts and Science (en_US)
dc.relation.ispartofseries: Thesis (University of Lethbridge. Faculty of Arts and Science) (en_US)
dc.subject: artificial language (en_US)
dc.subject: linguistics (en_US)
dc.subject: machine learning (en_US)
dc.subject: programmer characteristics (en_US)
dc.subject: sociolinguistic characteristics (en_US)
dc.subject: text categorization (en_US)
dc.title: Computer program categorization with machine learning (en_US)
dc.type: Thesis (en_US)
Files
Original bundle
Name: RAFEE_MAHMUDUL_MSC_2017.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format
Description: Main Article
License bundle
Name: license.txt
Size: 3.25 KB
Format: Item-specific license agreed to upon submission