Unit 2.2: Pushdown Automata

Introduction to Pushdown Automata (PDA)

A Pushdown Automaton (PDA) is a type of automaton that extends the capabilities of a finite automaton with a stack. This stack provides memory, allowing the PDA to perform more complex language recognition tasks, including recognizing context-free languages. PDAs are often used to describe the behavior of parsers in compiler construction.

Formal Definition of a PDA

A Pushdown Automaton is formally defined as a 7-tuple. Note that this is more complex than the 5-tuple of a Finite Automaton, as it includes components for the stack.

Q

A finite set of states.

Σ (Sigma)

A finite set of input symbols (the input alphabet).

Γ (Gamma)

A finite set of pushdown symbols (the stack alphabet).

q0

The initial or start state, where q0 ∈ Q.

Z

The initial pushdown symbol, which is initially present on the stack.

F

A set of final or accepting states, where F ⊆ Q.

δ (delta)

The transition function, which maps from the current state, input symbol (or ε), and top of the stack symbol to a new state and new stack symbol(s).

Formal mapping: \( \delta: Q \times (\Sigma \cup \{\epsilon\}) \times \Gamma \to Q \times \Gamma^* \)

How a PDA Computes

Instantaneous Description (ID)

An Instantaneous Description (ID) is an informal notation that captures the configuration of a PDA at a specific moment. It is a triple `(q, w, α)` where:

q is the current state.
w is the remaining input string.
α is the current stack contents (with the top of the stack at the left).

Turnstile Notation (⊢)

This notation is used to describe moves between IDs.

The symbol `⊢` represents a single move.
The symbol `⊢*` represents a sequence of zero or more moves.

For example, `(p, b, T) ⊢ (q, w, α)` implies that from state `p`, by consuming input `b` and with `T` on top of the stack, the PDA moves to state `q`, the remaining input is `w`, and the `T` is replaced by the string `α` on the stack.

Types of Pushdown Automata

Deterministic Pushdown Automaton (DPDA)

In a DPDA, there is at most one possible transition for any combination of state, input symbol, and stack symbol. It follows a single, deterministic computational path and can recognize deterministic context-free languages.

Nondeterministic Pushdown Automaton (NPDA)

An NPDA can have multiple possible transitions for a given combination of state, input, and stack symbol. It has the freedom to explore all possible paths simultaneously. An input string is accepted if even one of its possible paths leads to an accepting state.

💡 Power of Nondeterminism: NPDAs are strictly more powerful than DPDAs. While every DPDA is technically an NPDA, not every NPDA has an equivalent DPDA. NPDAs are needed when the machine must "guess" how to process the input, such as in the language \(L = \{a^nb^n\} \cup \{a^nb^{2n}\}\).

Example: PDA for the Language \(L = \{a^nb^n \mid n > 0\}\)

This PDA accepts any string with one or more 'a's followed by an equal number of 'b's.

Formal Definition (M):

Q = {q0, q1}
Σ = {a, b}
Γ = {A, Z}
q0 is the initial state
Z is the initial stack symbol
Acceptance is by empty stack in this example.

Transition Function (δ):

δ(q0, a, Z) = (q0, AZ) // First 'a', push A onto Z
δ(q0, a, A) = (q0, AA) // Subsequent 'a's, push A onto A
δ(q0, b, A) = (q1, ε) // First 'b', start popping A's
δ(q1, b, A) = (q1, ε) // Subsequent 'b's, pop A's
δ(q1, ε, Z) = (q1, ε) // Input empty, stack has Z, pop Z to accept

Applications and Limitations of PDAs

Key Applications

Compilers: Used for parsing and syntax analysis of programming languages.
Document Validation: Validating the structure of files like XML and HTML against a specified grammar (CFG).
Natural Language Processing (NLP): Analyzing the syntactic structure of sentences.
Syntax Checking: Powering syntax highlighting and error detection in code editors.

Limitations

Limited to Context-Free Languages (CFLs): Even an NPDA cannot recognize languages that are not context-free, such as \(L = \{a^nb^nc^n \mid n \ge 1\}\). Such languages require a more powerful model like a Turing machine.
Inefficiency of NPDA: Because an NPDA explores multiple paths, simulating it on real (deterministic) hardware can be inefficient and may require exponential time.
Theoretical Model: NPDAs are primarily a theoretical tool. Practical parsers often use deterministic (DPDA-like) methods for efficiency.

📝 Quick Review & Memorization

Key Terms & Acronyms

PDA (Pushdown Automaton): A finite automaton augmented with a stack for memory.
Stack: A Last-In, First-Out (LIFO) data structure that gives a PDA its memory.
Context-Free Language (CFL): The class of languages that can be recognized by a Pushdown Automaton.
Instantaneous Description (ID): A snapshot of the PDA's configuration, represented as `(q, w, α)`.
Turnstile Notation (⊢): A symbol representing a single transition or move in a PDA.
DPDA (Deterministic PDA): A PDA with at most one possible move for any configuration.
NPDA (Nondeterministic PDA): A PDA with multiple possible moves for a configuration, allowing it to "guess" or explore paths in parallel.
Stack Alphabet (Γ): The set of symbols that can be pushed onto the stack.
Acceptance by Empty Stack: A method of acceptance where an input string is accepted if the PDA finishes processing the string and the stack is empty.

Summary Points

A Pushdown Automaton (PDA) is a Finite Automaton plus a stack.
The stack gives the PDA memory, allowing it to recognize Context-Free Languages, which are more complex than Regular Languages.
A PDA is formally defined by a 7-tuple: `{ Q, Σ, Γ, δ, q0, Z, F }`.
The transition function `δ` depends on the current state, the current input symbol (or ε), and the symbol at the top of the stack.
There are two main types: Deterministic (DPDA) and Nondeterministic (NPDA).
NPDAs are strictly more powerful than DPDAs, meaning they can recognize all deterministic CFLs plus some non-deterministic CFLs.
The primary application of PDAs is in the parsing phase of compiler construction and document validation (e.g., XML).

Important Enumerations

The 7 Tuples of a Pushdown Automaton:
- Q (Set of States)
- Σ (Input Alphabet)
- Γ (Stack Alphabet)
- δ (Transition Function)
- q0 (Initial State)
- Z (Initial Stack Symbol)
- F (Set of Final States)
Components of an Instantaneous Description (ID):
- Current State (q)
- Remaining Input (w)
- Stack Contents (α)
Two Main Types of Pushdown Automata:
- Deterministic (DPDA)
- Nondeterministic (NPDA)