University of Lethbridge Theses
Browsing University of Lethbridge Theses by Author "Abdullah, Deen Mohammad"
- Item: Investigating the impact of programming styles to improve code quality using machine learning and sociolinguistic features (Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science, 2025) Abdullah, Deen Mohammad; University of Lethbridge. Faculty of Arts and Science; Rice, Jacqueline E.
  In this research we investigated whether sociolinguistic factors such as gender, region, and expertise influence programming styles and code quality. We collected and processed over 700,000 C++ programs from GitHub and Codeforces to build datasets for training Random Forest and BERT models to classify programmer groups. Experimental results showed that context-based models outperform metrics-based models at capturing stylistic patterns. To measure code quality, we combined the Maintainability Index and difficulty metrics to label code as compliant or non-compliant. We then fine-tuned the T5 model for code transformation to generate stylistically improved code; however, due to the limitations of encoder–decoder LLMs, the generated code samples were non-executable. To address this, we developed a CodeBERT-based recommendation model that generates targeted, metric-driven guidance for improving code quality. Finally, we implemented a prototype tool that combines classification, code-quality measurement, and improvement suggestions, providing pedagogically meaningful feedback for learners and researchers.
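The compliance labeling described in this abstract can be sketched as follows. The Maintainability Index formula below is the standard one (Visual Studio's 0–100 rescaling), but the choice of Halstead difficulty as the paired metric, and both thresholds, are illustrative assumptions rather than the thesis's actual values:

```python
import math

def maintainability_index(halstead_volume: float, cyclomatic: int, loc: int) -> float:
    """Classic Maintainability Index, rescaled to the 0-100 range."""
    mi = 171 - 5.2 * math.log(halstead_volume) - 0.23 * cyclomatic - 16.2 * math.log(loc)
    return max(0.0, mi * 100 / 171)

def label_code(halstead_volume: float, halstead_difficulty: float,
               cyclomatic: int, loc: int,
               mi_threshold: float = 65.0, difficulty_threshold: float = 15.0) -> str:
    """Label a program 'compliant' when it is maintainable AND not overly
    difficult. Thresholds here are hypothetical, chosen for illustration only."""
    mi = maintainability_index(halstead_volume, cyclomatic, loc)
    if mi >= mi_threshold and halstead_difficulty <= difficulty_threshold:
        return "compliant"
    return "non-compliant"
```

A short, readable function scores well on both dimensions; a long, branchy one fails either the maintainability or the difficulty check and is labeled non-compliant.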
- Item: Query focused abstractive summarization using BERTSUM model (Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science, 2020) Abdullah, Deen Mohammad; University of Lethbridge. Faculty of Arts and Science; Chali, Yllias
  In Natural Language Processing, researchers face many challenges in Query Focused Abstractive Summarization (QFAS), where Bidirectional Encoder Representations from Transformers for Summarization (BERTSUM) can be used for both extractive and abstractive summarization. As few datasets are available for QFAS, we generated queries for two publicly available datasets, CNN/Daily Mail and Newsroom, according to the context of the documents and summaries. To generate abstractive summaries, we applied two approaches: Query focused Abstractive summarization and Query focused Extractive then Abstractive summarization. In the first approach, we sorted the sentences of each document from the most query-related to the least query-related, and in the second approach, we extracted only the query-related sentences before fine-tuning the BERTSUM model. Our experimental results show that both approaches achieve good ROUGE scores on the CNN/Daily Mail and Newsroom datasets.
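The two preprocessing strategies described in this abstract can be sketched with a simple word-overlap relevance score. The `query_relevance` function and the `min_score` threshold below are hypothetical stand-ins for illustration; the thesis's actual relevance measure may differ:

```python
def query_relevance(sentence: str, query: str) -> float:
    """Fraction of query words that appear in the sentence (simple word overlap)."""
    s_words = set(sentence.lower().split())
    q_words = set(query.lower().split())
    return len(s_words & q_words) / len(q_words) if q_words else 0.0

def sort_by_query(sentences: list[str], query: str) -> list[str]:
    """Approach 1: reorder sentences from most to least query-related
    before feeding the document to the summarization model."""
    return sorted(sentences, key=lambda s: query_relevance(s, query), reverse=True)

def extract_query_related(sentences: list[str], query: str,
                          min_score: float = 0.2) -> list[str]:
    """Approach 2: keep only sentences whose relevance clears a threshold
    (extractive step before the abstractive one)."""
    return [s for s in sentences if query_relevance(s, query) >= min_score]
```

Approach 1 preserves every sentence but changes their order, so the model sees query-relevant content first; Approach 2 discards off-topic sentences entirely and fine-tunes on the filtered text.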