Classification of computer programming contest programs based on gender, region and software metrics

dc.contributor.author: Zinnat, Sara Binte
dc.contributor.author: University of Lethbridge. Faculty of Arts and Science
dc.contributor.supervisor: Rice, Jacqueline E.
dc.date.accessioned: 2021-12-16T16:21:52Z
dc.date.available: 2021-12-16T16:21:52Z
dc.date.issued: 2021
dc.degree.level: Masters [en_US]
dc.description.abstract: This research focuses on determining the effect of sociolinguistic characteristics (particularly gender and region) on computer programs. Previous studies have demonstrated the use of machine learning techniques to analyze the relationship between sociolinguistic features and programming language. We collected C++ programs from an open source programming contest website. The features were calculated based on three software metrics: lines of code, cyclomatic complexity and Halstead metrics. Using five machine learning algorithms, we trained several models and performed experiments to compare their performance. To investigate the significance of the features, we also carried out statistical and correlation analysis. As indicated by the experimental results, our models predicted the gender of the programmers with 91.7% accuracy when programmers solved the same problems. When the programmers solved different problems, the model achieved an accuracy of 86.4%. Our models also classified the region of the programmer with 75.2% accuracy. [en_US]
dc.description.sponsorship: Alberta Innovates - Data Enabled Innovation (AI-DEI) [en_US]
dc.identifier.uri: https://hdl.handle.net/10133/6113
dc.language.iso: en_US [en_US]
dc.proquest.subject: 0710 [en_US]
dc.proquest.subject: 0800 [en_US]
dc.proquest.subject: 0984 [en_US]
dc.proquestyes: Yes [en_US]
dc.publisher: Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science [en_US]
dc.publisher.department: Department of Mathematics and Computer Science [en_US]
dc.publisher.faculty: Arts and Science [en_US]
dc.relation.ispartofseries: Thesis (University of Lethbridge. Faculty of Arts and Science) [en_US]
dc.subject: artificial intelligence [en_US]
dc.subject: machine learning [en_US]
dc.subject: computer programming [en_US]
dc.subject: programming languages [en_US]
dc.subject: sociolinguistics (gender, region) [en_US]
dc.subject: web scraping [en_US]
dc.subject: data mining [en_US]
dc.subject: classification [en_US]
dc.subject: statistical analysis [en_US]
dc.subject: software metrics [en_US]
dc.subject: Computer programming--Sex differences--Research [en_US]
dc.subject: Programming languages (Electronic computers)--Syntax--Sex differences--Research [en_US]
dc.subject: Sociolinguistics--Network analysis [en_US]
dc.subject: Software measurement [en_US]
dc.subject.lcsh: Programming languages (Electronic computers)--Syntax--Research
dc.subject.lcsh: Computer programming--Competitions--Research
dc.subject.lcsh: Dissertations, Academic
dc.title: Classification of computer programming contest programs based on gender, region and software metrics [en_US]
dc.type: Thesis [en_US]
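
The abstract names three software metrics used as features: lines of code, cyclomatic complexity and Halstead metrics. The block below is a minimal sketch of how such features might be approximated for a C++ source string. It is not the thesis's tooling: the function name `software_metrics`, the regular expressions, the keyword list and the `SAMPLE` program are all assumptions made for illustration, and a real study would more likely use a proper C++ parser.

```python
import math
import re

# Assumption: these constructs are counted as decision points.
DECISION = re.compile(r"\b(?:if|for|while|case|catch)\b|&&|\|\||\?")
# Assumption: a rough tokenizer; it ignores string literals and many C++ operators.
TOKEN = re.compile(
    r"[A-Za-z_]\w*|\d+(?:\.\d+)?|&&|\|\||[-+*/%=<>!]=?|[(){}\[\];,?:]"
)
# Assumption: a small, incomplete keyword list; keywords count as operators.
KEYWORDS = {"if", "else", "for", "while", "do", "switch", "case",
            "return", "int", "double", "char", "bool", "void"}


def strip_comments(source):
    """Remove /* ... */ and // comments (naive: does not protect string literals)."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)
    return re.sub(r"//[^\n]*", "", source)


def software_metrics(source):
    """Return approximate LOC, cyclomatic complexity and Halstead figures."""
    code = strip_comments(source)

    # Lines of code: non-blank lines once comments are removed.
    loc = sum(1 for line in code.splitlines() if line.strip())

    # Cyclomatic complexity estimate: one plus the number of decision points.
    cyclomatic = 1 + len(DECISION.findall(code))

    # Halstead: split tokens into operators and operands.
    operators, operands = [], []
    for tok in TOKEN.findall(code):
        is_word = tok[0].isalnum() or tok[0] == "_"
        if is_word and tok not in KEYWORDS:
            operands.append(tok)   # identifiers and numeric literals
        else:
            operators.append(tok)  # keywords, operators, punctuation

    vocabulary = len(set(operators)) + len(set(operands))
    length = len(operators) + len(operands)
    volume = length * math.log2(vocabulary) if vocabulary else 0.0

    return {
        "loc": loc,
        "cyclomatic": cyclomatic,
        "halstead_vocabulary": vocabulary,
        "halstead_length": length,
        "halstead_volume": volume,
    }


# Hypothetical sample program, used only to exercise the sketch.
SAMPLE = """
// sum of two positive numbers
int f(int a, int b) {
    if (a > 0 && b > 0) {
        return a + b;
    }
    return 0;
}
"""

if __name__ == "__main__":
    print(software_metrics(SAMPLE))
```

Each program's metric values could then form one row of a feature table for a classifier. Which Halstead measures and which classifiers the thesis actually used is documented in the PDF, not in this record.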
Files
Original bundle
Name: ZINNAT_SARA_BINTE_MSC_2021.pdf
Size: 957.44 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 3.25 KB
Format: Item-specific license agreed upon to submission