Understanding Abstract Syntax Trees with Python

banner

Dive into the world of Abstract Syntax Trees in Python, exploring their structure and practical applications with hands-on examples.

Abstract Syntax Trees (ASTs). A powerful representation of code structure that opens up a world of possibilities for code analysis, transformation, and generation.

Python’s ast module provides a robust set of tools for working with ASTs. Whether you’re building a linter, creating a code transformer, or diving into metaprogramming, understanding ASTs is going to be helpful.

In this post, we’ll explore Python’s AST capabilities, understand their structure, and see practical examples of how they can be used.

But, before we begin…

Let’s lay some groundwork first. These are just some of the things you should know before we go deep into the woods.

What are Abstract Syntax Trees?

Abstract Syntax Trees (ASTs) are tree-like representations of the structure of source code. They capture the essential elements of the code without including low-level details like punctuation or whitespace.

In an AST:

  • Leaf nodes typically represent values (like literals)
  • Interior nodes represent operations or control structures

Here’s a simple visualization of an AST for the expression 2 + 3 * 4:

       Add
      /   \
     2    Mult
         /    \
        3      4

This tree structure allows for easy traversal and manipulation of code structure.

Why Python?

Python’s ast module provides a comprehensive set of tools for working with ASTs. It allows you to:

  1. Parse Python code into an AST
  2. Traverse and analyze the AST
  3. Modify the AST
  4. Generate Python code from an AST

This makes Python an excellent choice for exploring AST concepts and applications.

With that out of the way, let’s begin.

Working with Python ASTs

Let’s start by looking at how to create and inspect an AST in Python.

Creating an AST

Python’s ast module makes it easy to parse Python code into an AST. Here’s a simple example:

import ast

code = """
def greet(name):
    print(f"Hello, {name}!")

greet("World")
"""

tree = ast.parse(code)

Now tree contains the AST representation of our code.

Inspecting the AST

To get a better understanding of the AST structure, we can use the ast.dump() function:

print(ast.dump(tree, indent=2))

This will output a readable representation of the AST:

Module(
  body=[
    FunctionDef(
      name='greet',
      args=arguments(
        posonlyargs=[],
        args=[arg(arg='name')],
        kwonlyargs=[],
        kw_defaults=[],
        defaults=[]),
      body=[
        Expr(
          value=Call(
            func=Name(id='print', ctx=Load()),
            args=[
              JoinedStr(
                values=[
                  Constant(value='Hello, '),
                  FormattedValue(
                    value=Name(id='name', ctx=Load()),
                    conversion=-1)],
                  Constant(value='!'))],
            keywords=[]))],
      decorator_list=[]),
    Expr(
      value=Call(
        func=Name(id='greet', ctx=Load()),
        args=[Constant(value='World')],
        keywords=[]))],
  type_ignores=[])

This output gives us a detailed view of the AST structure.

Practical Applications

Now that we understand the basics, let’s look at some practical applications of ASTs in Python.

1. Code Analysis

One common use of ASTs is for static code analysis. Let’s create a simple analyzer that counts the number of function definitions in a piece of code:

import ast

class FunctionCounter(ast.NodeVisitor):
    def __init__(self):
        self.count = 0

    def visit_FunctionDef(self, node):
        self.count += 1
        self.generic_visit(node)

code = """
def func1():
    pass

def func2():
    def inner():
        pass
    return inner

class MyClass:
    def method(self):
        pass
"""

tree = ast.parse(code)
counter = FunctionCounter()
counter.visit(tree)

print(f"Number of function definitions: {counter.count}")

This will output:

Number of function definitions: 4

2. Code Transformation

ASTs can also be used to transform code. Let’s create a transformer that adds a debug print statement to the beginning of every function.

First, install the astor library. This lets us convert an AST into code.

pip install astor
import ast
import astor

class DebugTransformer(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        debug_print = ast.Expr(
            value=ast.Call(
                func=ast.Name(id='print', ctx=ast.Load()),
                args=[ast.Str(s=f"Entering function: {node.name}")],
                keywords=[]
            )
        )
        node.body.insert(0, debug_print)
        return node

code = """
def greet(name):
    print(f"Hello, {name}!")

greet("World")
"""

tree = ast.parse(code)
transformer = DebugTransformer()
modified_tree = transformer.visit(tree)

print(astor.to_source(modified_tree))

This will output:

def greet(name):
    print('Entering function: greet')
    print(f'Hello, {name}!')

greet('World')

3. Code Generation

Finally, let’s look at how we can use ASTs to generate code. We’ll create a simple function that generates a Python function to compute the nth Fibonacci number:

import ast
import astor

def generate_fibonacci_func():
    return ast.FunctionDef(
        name='fibonacci',
        args=ast.arguments(
            args=[ast.arg(arg='n')],
            posonlyargs=[],
            kwonlyargs=[],
            kw_defaults=[],
            defaults=[]
        ),
        body=[
            ast.If(
                test=ast.Compare(
                    left=ast.Name(id='n', ctx=ast.Load()),
                    ops=[ast.LtE()],
                    comparators=[ast.Constant(value=1)]
                ),
                body=[ast.Return(value=ast.Name(id='n', ctx=ast.Load()))],
                orelse=[]
            ),
            ast.Return(
                value=ast.BinOp(
                    left=ast.Call(
                        func=ast.Name(id='fibonacci', ctx=ast.Load()),
                        args=[ast.BinOp(
                            left=ast.Name(id='n', ctx=ast.Load()),
                            op=ast.Sub(),
                            right=ast.Constant(value=1)
                        )],
                        keywords=[]
                    ),
                    op=ast.Add(),
                    right=ast.Call(
                        func=ast.Name(id='fibonacci', ctx=ast.Load()),
                        args=[ast.BinOp(
                            left=ast.Name(id='n', ctx=ast.Load()),
                            op=ast.Sub(),
                            right=ast.Constant(value=2)
                        )],
                        keywords=[]
                    )
                )
            )
        ],
        decorator_list=[]
    )

module = ast.Module(body=[generate_fibonacci_func()], type_ignores=[])
print(astor.to_source(module))

This will generate the following Python code:

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

Conclusions and beyond

We’ve explored the basics of working with Abstract Syntax Trees in Python, demonstrating how they can be used for code analysis, transformation, and generation. This is just scratching the surface of what’s possible with ASTs.

Some areas for further exploration include:

  • Using ASTs for building custom linters
  • Creating source-to-source transpilers
  • Implementing domain-specific languages (DSLs) in Python

Finally, I highly recommend diving deeper into Python’s ast module documentation. It provides a wealth of information and possibilities for working with Python’s syntax at a deeper level.

関連記事

カテゴリー:

ブログ

情シス求人

  1. チームメンバーで作字やってみた#1

ページ上部へ戻る