Chowdhury Sarker Jihan

Jihan, from Toronto, ON, is in the final year of the MSc Information Systems program. He has been an AU student for three years. Professionally, he is a senior software engineer and is currently an RA with AU’s IDEA Lab. His focus in the graduate program is data analytics and AI, which aligns with his career goal of becoming a Natural Language Processing researcher. His research interests include AI, software engineering, data science, and cyber security.


Title: Exploring the effectiveness of Bug Triaging through the implementation of BERT based deep learning model

Abstract: Bugs, or software defects, are an inherent part of software development. Bug triaging is the process of assigning the best developers to a bug. However, the process is time-consuming and error-prone. Because bug reports are textual, existing solutions in the literature rely on Natural Language Processing techniques such as sentiment analysis and text classification. Although bug reports consist of descriptive sentences, existing methods neglect the syntactic and sequential nature of words. This study uses the Bidirectional Encoder Representations from Transformers (BERT) model as its methodology, because BERT considers each word in the context of the other words in a given sentence. The paper investigates the following research question: In a given bug dataset, is it only a handful of developers that solve most of the bug incidents? The evaluation pre-trains word embedding vectors and the BERT model on bug titles from open-source datasets to find the best developers for triage. Preliminary experimentation on a bug dataset with 1,032 developers showed 38 of them solving bugs among 109,979 bug reports. Additionally, only a handful of those 38 developers solved most bugs, with an accuracy close to 80% (representing how often each developer correctly solved the bugs assigned to them). This indicates that the BERT model is strongly predictive of the best developers for triaging bugs in a given bug dataset. The contribution of the implementation is a novel pre-trained transfer learning methodology for bug triaging.
