
Registered user since Wed 6 Jan 2021
Contributions
Because toxicity in developers’ interactions in open source software (OSS) projects harms developers’ relationships, a toxicity detector for the Software Engineering (SE) domain is needed. However, prior studies found that contemporary toxicity detection tools perform poorly on SE texts and are therefore unreliable in this domain. To address this challenge, I have developed ToxiCR, an SE-specific toxicity detector evaluated on 19,571 manually labeled code review comments. I evaluate ToxiCR with different combinations of ten supervised learning models, five text vectorizers, and eight preprocessing techniques (two of which are SE domain-specific). After evaluating all possible combinations, I found that ToxiCR significantly outperformed existing toxicity classifiers, with an accuracy of 95.8% and an F1 score of 88.9%.
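As a rough illustration of why both accuracy and F1 are reported, the sketch below computes the two metrics from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for the example; they are not taken from the ToxiCR evaluation. In general, when one class is much rarer than the other, accuracy can stay high while F1 for the rare class is noticeably lower.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for the positive class of a binary classifier.

    tp, fp, fn, tn are confusion-matrix counts: true positives,
    false positives, false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts (NOT from the ToxiCR study): 90 toxic comments
# out of 1,000, with 80 of them detected and 10 false alarms.
acc, f1 = accuracy_and_f1(tp=80, fp=10, fn=10, tn=900)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

With these made-up counts the large number of true negatives lifts accuracy, while F1 depends only on how the positive (toxic) class is handled.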
Pre-print
Doctoral Symposium
Mon 10 Oct 2022 13:30 - 14:00 at Ambassador A - Session 3 - Human Factors & Testing
Toxic and unhealthy conversations among developers may reduce the professional harmony and productivity of Free and Open Source Software (FOSS) projects. For example, toxic code review comments may provoke pushback from an author against completing the suggested changes. A toxic exchange with another person may hamper future communication and collaboration. Research also suggests that toxicity disproportionately impacts newcomers, women, and other participants from marginalized groups. Toxicity is therefore a barrier to promoting diversity, equity, and inclusion. Since toxic communications are not uncommon in FOSS communities and may have serious repercussions, the primary objective of my proposed dissertation is to automatically identify and mitigate toxicity in developers’ textual interactions. Toward this goal, I aim to: i) build an automated toxicity detector for the Software Engineering (SE) domain, ii) identify how the notion of toxicity varies across demographics, and iii) analyze the impacts of toxicity on the outcomes of Open Source Software (OSS) projects.
Pre-print