RepoSkillMiner: Identifying software expertise from GitHub repositories using Natural Language Processing
Thu 24 Sep 2020 10:35 - 10:40 at Wombat - Tool Demo Showcase (3) Chair(s): Csaba Nagy
On the one hand, as a GitHub profile is becoming an essential part of a developer’s resume, it becomes increasingly important to enable HR departments to extract someone’s expertise through automated analysis of his/her contributions to open-source projects. On the other hand, having clear insights into the technologies used in a project can be very beneficial for resource allocation and project maintainability planning. In the literature, one can identify various approaches for identifying expertise in programming languages, based on the projects that a developer has contributed to. In this paper, we move one step further and introduce an approach (accompanied by a tool) to identify low-level expertise on particular software frameworks and technologies, relying solely on GitHub data, using the GitHub API and Natural Language Processing (NLP) through the Microsoft Language Understanding Intelligent Service (LUIS). In particular, we developed an NLP model in LUIS for named-entity recognition for three (3) .NET technologies and two (2) front-end frameworks. Our analysis is based upon specific commit contents, in terms of the exact code chunks that the committer added or changed. We evaluate the precision, recall and F-measure for the derived technologies/frameworks by conducting a batch test in LUIS, and report the results. The proposed approach is demonstrated through a fully functional web application named RepoSkillMiner.
Tool Links: Video, Code Repo, Application, Validation Dataset
CCS CONCEPTS • Software and its engineering → Software creation and management → Software post-development issues;
KEYWORDS Expertise; Frameworks; GitHub; Natural Language Processing; Software Project Management;
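The abstract describes analysing the exact code chunks a committer added or changed, and evaluating the results with precision, recall and F-measure. The following is a minimal sketch of those two steps, not the authors' implementation. It assumes the input is a single-file unified-diff patch string without file headers (the shape of the `patch` field that the GitHub commits API returns per file); the function names `added_lines` and `precision_recall_f1` and the sample patch are hypothetical.

```python
from typing import List, Tuple


def added_lines(patch: str) -> List[str]:
    """Return the lines a commit added, given a single-file unified-diff patch.

    Assumption: the patch has no '---'/'+++' file headers, so every line
    starting with '+' after the first '@@' hunk header is an added line.
    """
    added = []
    in_hunk = False
    for line in patch.splitlines():
        if line.startswith("@@"):
            # Hunk header: everything after it is context, removal or addition.
            in_hunk = True
            continue
        if in_hunk and line.startswith("+"):
            added.append(line[1:])  # drop the leading '+' marker
    return added


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F-measure (F1) from raw counts.

    tp: entities correctly recognised, fp: entities wrongly recognised,
    fn: entities missed. Empty denominators yield 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up patch, for illustration only.
    sample_patch = (
        "@@ -1,3 +1,4 @@\n"
        " using System;\n"
        "+using System.Linq;\n"
        "-var x = 1;\n"
        "+var y = 2;\n"
    )
    print(added_lines(sample_patch))
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

In a pipeline like the one described, the added lines would be the text passed to the named-entity recognition model, and the counts fed to `precision_recall_f1` would come from comparing its predictions against a labelled validation set.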
RepoSkillMiner Slides (RepoSkillMiner_Presentation.pdf), 1.58 MiB
Thu 24 Sep Times are displayed in time zone: (UTC) Coordinated Universal Time
10:20 - 11:20: Tool Demo Showcase (3) | Tool Demonstrations at Wombat | Chair(s): Csaba Nagy (Software Institute - USI, Lugano, Switzerland)
10:20 - 10:25 Talk | FILO: FIx-LOcus Localization for Backward Incompatibilities Caused by Android Framework Upgrades | Tool Demonstrations | Marco Mobilio (University of Milano Bicocca); Oliviero Riganelli (University of Milano-Bicocca, Italy); Daniela Micucci (University of Milano-Bicocca, Italy); Leonardo Mariani (University of Milano Bicocca)
10:25 - 10:30 Talk | EXPRESS: An Energy-Efficient and Secure Framework for Mobile Edge Computing and Blockchain based Smart Systems | Tool Demonstrations
10:30 - 10:35 Talk | SmartBugs: A Framework to Analyze Solidity Smart Contracts | Tool Demonstrations | João F. Ferreira (INESC-ID and IST, University of Lisbon); Pedro Cruz (IST, University of Lisbon, Portugal); Thomas Durieux (KTH Royal Institute of Technology, Sweden); Rui Abreu (Faculty of Engineering, University of Porto, Portugal)
10:35 - 10:40 Talk | RepoSkillMiner: Identifying software expertise from GitHub repositories using Natural Language Processing | Tool Demonstrations | Efstratios Kourtzanidis (University Of Macedonia); Alexander Chatzigeorgiou (University of Macedonia); Apostolos Ampatzoglou (University of Macedonia) | Pre-print, Media Attached, File Attached
10:40 - 10:45 Talk | Sosed: a tool for finding similar software projects | Tool Demonstrations | Egor Bogomolov (JetBrains Research); Yaroslav Golubev (JetBrains Research, ITMO University); Artyom Lobanov (JetBrains Research); Vladimir Kovalenko (JetBrains Research, JetBrains N.V.); Timofey Bryksin (JetBrains Research, Saint Petersburg State University)
10:45 - 10:50 Talk | GUI2WiRe: Rapid Wireframing with a Mined and Large-Scale GUI Repository using Natural Language Requirements | Tool Demonstrations | Kristian Kolthoff (Institute for Enterprise Systems (InES), University Of Mannheim); Christian Bartelt (Institute for Software and Systems Engineering, TU Clausthal); Simone Paolo Ponzetto (Data and Web Science Group, University of Mannheim)
10:50 - 11:20 Live Q&A | Q&A or Discussion | Tool Demonstrations