Do sociolinguistic variations exist in programming?

dc.contributor.author: Naz, Fariha
dc.contributor.author: University of Lethbridge. Faculty of Arts and Science
dc.contributor.supervisor: Rice, Jacqueline E.
dc.date.accessioned: 2015-09-25T20:40:53Z
dc.date.available: 2015-09-25T20:40:53Z
dc.date.issued: 2015
dc.degree.level: Masters [en_US]
dc.description.abstract: Machine learning techniques are currently widely used in the analysis of natural language. This thesis focuses on extending these techniques for analysis of programming languages. In particular, we are interested in determining whether there are differences in the use of programming languages that might be associated with the authors’ gender. There are currently few studies that address possible relationships between linguistics and programming. In this thesis, we use computer programs as the samples in our dataset. These programs have been written using the C++ programming language. We also acquired sociolinguistic information about the programmers, with the focus especially on gender. We use machine learning and statistical techniques to identify patterns (in language use) that are consistent for male and female programmers. The results of numerous experiments are encouraging. We demonstrate that we can predict the gender of programmers with 71% accuracy and detect similarities or dissimilarities in their programming style. [en_US]
dc.description.sponsorship: University of Lethbridge Research Fund (ULRF) [en_US]
dc.embargo: No [en_US]
dc.identifier.uri: https://hdl.handle.net/10133/3749
dc.language.iso: en_CA [en_US]
dc.proquest.subject: 0984 [en_US]
dc.proquest.subject: 0636 [en_US]
dc.proquest.subject: 0733 [en_US]
dc.proquestyes: Yes [en_US]
dc.publisher: Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science [en_US]
dc.publisher.department: Department of Mathematics and Computer Science [en_US]
dc.publisher.faculty: Arts and Science [en_US]
dc.relation.ispartofseries: Thesis (University of Lethbridge. Faculty of Arts and Science) [en_US]
dc.subject: computer science [en_US]
dc.subject: machine learning [en_US]
dc.subject: sociolinguistics [en_US]
dc.subject: gender [en_US]
dc.subject: text mining [en_US]
dc.subject: computer programs [en_US]
dc.subject: programming [en_US]
dc.title: Do sociolinguistic variations exist in programming? [en_US]
dc.type: Thesis [en_US]
Files
Original bundle
Name: NAZ_FARIHA_MSC_2015.pdf
Size: 416.86 KB
Format: Adobe Portable Document Format