An Empirical Study for Common Language Features Used in Python Projects
Paper Information
Paper Name: An Empirical Study for Common Language Features Used in Python Projects
Conference: IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER2021)
Authors: Yun Peng, Yu Zhang, and Mingzhe Hu
Abstract
As a dynamic programming language, Python is widely used in many fields. For developers, various language features affect programming experience. For researchers, they affect the difficulty of developing tasks such as bug finding and compilation optimization. Former research has shown that programs with Python dynamic features are more change-prone. However, we know little about the use and impact of Python language features in real-world Python projects. To resolve these issues, we systematically analyze Python language features and propose a tool named PYSCAN to automatically identify the use of 22 kinds of common Python language features in 6 categories in Python source code. We conduct an empirical study on 35 popular Python projects from eight application domains, covering over 4.3 million lines of code, to investigate the the usage of these language features in the project. We find that single inheritance, decorator, keyword argument, for loops and nested classes are top 5 used language features. Meanwhile different domains of projects may prefer some certain language features. For example, projects in DevOps use exception handling frequently. We also conduct in-depth manual analysis to dig extensive using patterns of frequently but differently used language features: exceptions, decorators and nested classes/functions. We find that developers care most about ImportError when handling exceptions. With the empirical results and in-depth analysis, we conclude with some suggestions and a discussion of implications for three groups of persons in Python community: Python designers, Python compiler designers and Python developers.