Learning from realistic software bugs

Published on: 2 November 2023

Primary Category: Machine Learning

Paper Authors: Kamel Alrashedy, Vincent J. Hellendoorn, Alessandro Orso

Key Details

Proposes a technique to extract highly realistic subsets from unrealistic training data

Converts programs to embeddings and identifies the most realistic examples by their similarity to real ones

Shows consistent gains from pretraining on small, representative subsets

Highlights value of less but more realistic data for model performance

Urges caution when using AI to predict real-world bugs and vulnerabilities

AI-generated summary

This paper proposes an approach to improving deep learning models that predict software bugs and vulnerabilities. Starting from large datasets of unrealistic bugs, it identifies the subset of examples that most resembles real bugs and uses that subset for more effective model training. The key idea is to extract vector representations (embeddings) of programs using neural networks and then select the unrealistic examples whose representations lie closest to those of real ones. Experiments on two defect prediction tasks show consistent gains from pretraining on small, highly realistic subsets, supporting the value of less but more representative data.
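
The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the program embeddings have already been computed by some neural model, and it assumes that "closest" means highest cosine similarity to the nearest real example. The function names, the scoring rule, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_realistic_subset(synthetic_embeddings, real_embeddings, k):
    """Return indices of the k synthetic examples most similar to any real example.

    Assumption: each synthetic example is scored by its similarity to its
    single nearest real example; the paper may use a different criterion.
    """
    if not real_embeddings:
        raise ValueError("at least one real embedding is required")
    scored = []
    for idx, emb in enumerate(synthetic_embeddings):
        best = max(cosine_similarity(emb, real) for real in real_embeddings)
        scored.append((best, idx))
    # Highest similarity first; ties broken by original index for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy 2-D embeddings (hypothetical values, for illustration only).
real = [[1.0, 0.0], [0.9, 0.1]]
synthetic = [[0.0, 1.0], [1.0, 0.05], [-1.0, 0.0], [0.8, 0.2]]

selected = select_realistic_subset(synthetic, real, k=2)
print(selected)  # [1, 3]
```

In this toy setup the two synthetic vectors pointing in roughly the same direction as the real ones are kept, and a model would then be pretrained on only that subset. For large datasets an approximate nearest-neighbor index would likely replace the quadratic loop shown here.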
