Decomposing Büchi automata

How easy is it to analyse Büchi automata?

Hera Brown

Department of Computer Science, University of Oxford, UK

It’s a standard result that all Büchi automata recognise precisely finite unions of languages of the form $U \cdot V^\omega$, where $U$ and $V$ are both regular languages; we call such recognised languages $\omega$-regular expressions. Büchi himself proved the result in [1], Lemma 10. It’s a nice result to have, particularly because of its simplicitly, as well as how well it captures the intuition of what a run on a Büchi automaton looks like.

Here, then, I thought I’d give a go at proving the result in my own words. Of course there’s nothing groundbreaking about doing this, but I think both that it helps the result to sink in better, and that revisiting simple or foundational results is always a sensible thing to do. And, in any case, technical writing is fun.

Throughout the proof, I’ll assume that we’re talking about nondeterministic Büchi automata and nondeterministic finite automata. This is especially pertinent as there’s no equivalence between deterministic Büchi automata and $\omega$-regular expressions; no determinisic Büchi automaton recognises the language denoted by $(0+1)^*0^\omega$, for instance. On a more positive note, this restriction does let us be liberal with our use of $\epsilon$-transitions.

I’ll also refer to the process of turning a Büchi automaton into an $\omega$-regular expression as analysis, and the process of turning an $\omega$-regular expression into a Büchi automaton as synthesis. As far as I can tell, this way of talking about automata and regular expressions first comes from [2], in reference to Kleene’s paper on nerve nets and finite automata [3].

From expressions to automata

Let’s first give a method of translating $\omega$-regular expressions into Büchi automata. Now, we know that all $\omega$-regular expressions look something like

\[ U_1 \cdot V_1^\omega + U_2 \cdot V_2^\omega + \dots + U_n \cdot V_n^\omega \]

where all the languages $U_i$ and $V_i$ are regular. If we want to synthesise a Büchi automaton out of these, then we’ll need to show that:

1. we can synthesise a Büchi automaton $V^\omega$, given a regular language $V$,

2. we can synthesise a Büchi automaton $U \cdot V^\omega$, given regular languages $U$ and $V$, and

3. we can synthesise a Büchi automaton $U_1 \cdot V_1^\omega + U_2 \cdot V_2^\omega$, given regular languages $U_1, V_1, U_2$, and $V_2$.

All $\omega$-regular expressions are built up from this form, and so if we can synthesise a Büchi automaton from any $\omega$-regular expression of this form, then we can synthesise all $\omega$-regular expressions into Büchi automata, as we want. So let’s cover each case in turn.

Infinite repetition

First let’s show (1). Given a regular language $V$, we first synthesise an NFA that recognises it. We want to show that there’s a Büchi automaton that recognises words in $V^\omega$ i.e. words of the form $v_1 v_2 \dots$ where each $v_i$ is in $V$, and so the basic idea is that we’ll just run our word on the NFA over and over, ad infinitum.

A first naïve idea would be, given an NFA that recognises $V$, just add an $\epsilon$-transition from every final state back to the start state of the NFA. This does give us a Büchi automaton that recognises all the words in $V^\omega$, since every time we think we’ve come to the end of a word in $V$ we take an $\epsilon$-transition back to the start state, which has us visiting accepting states infinitely often.

The trouble is that this construction accepts too much. We might never use an $\epsilon$-transition, and so we might end up accepting an $\omega$-word that can’t be decomposed into words of $V$. For instance, consider an NFA that accepts the language denoted by $a(ab)^*$. Presumably, when running the word $a(ab)^\omega$ on that NFA, by treating the NFA as a Büchi automaton, we visit some accepting state infinitely often. But the word $a(ab)^\omega$ can’t be decomposed into the form $w_1 w_2 \dots$ where each $w_i$ is an element of $a(ab)^*$, and so our constructed Büchi automaton is too liberal, and accepts too much.

To deal with this, we just need to enforce that our synthesised Büchi automaton takes the $\epsilon$-transitions we’ve added infinitely often. We therefore won’t let the automaton visit an accepting state infinitely often without taking these $\epsilon$-transitions, and so won’t let the automaton accept a word unless it really can be decomposed into the form $v_1 v_2 \dots$.

For this, we do something similar to the synthesis of finite automata from regular expressions of the form $V^*$. Given an NFA that recognises $V$, we construct our Büchi automaton that recognises $V^\omega$ by setting all states of the NFA to be non-accepting. We then add a new start state to the NFA, and set that to be accepting. Finally, we add an $\epsilon$-transition back from each formerly accepting state to this new start state. We only accept a word if we visit this start state infinitely often, and so we need to take the $\epsilon$-transitions infinitely often, as we wanted to enforce.

Formally, we construct this Büchi automaton $\tuple{Q’, \language, \delta’, q_0’, F’}$ that recognises $V^\omega$ from the NFA $\tuple{Q, \language, \delta, q_0, F}$ that recognises $V$ as follows:

Let’s check that this Büchi automaton accepts precisely the words in $V^\omega$. By definition, we know that a Büchi automaton accepts a word just when a run on that word visits a subset of accepting states infinitely often. For that to happen here, we have to visit the privileged state $0$ infinitely often. That requires taking the $\epsilon$-transitions back infinitely often, and we can only take those transitions if we’ve just gone through an accepting run on our underlying NFA. We can translate those runs back into words, and so split up the word we’re running our automaton on. So we accept a word just when we can write it in the form $v_1 v_2 \dots$, with each $v_i$ from $V$, as we wanted. Grand.

We’ve now shown that all $\omega$-regular expressions of the form $V^\omega$ denote languages that are Büchi recognisable (that is, recognised by some Büchi automaton). Let’s now show that all $\omega$-regular expressions of the form $U \cdot V^\omega$ are Büchi recognisable.

Concatenation

Since we’re allowed nondeterminism, this isn’t too difficult. Given a regular language $U$ and a Büchi recognisable language $V^\omega$, we first construct an NFA that recognises $U$ and a Büchi automaton that recognises $V^\omega$. Given an $\omega$-word, we first run that word on our NFA, then nondeterministically guess that we’re done with the NFA and move onto the Büchi automaton. This lets us partition the $\omega$-word into the form $uv$, where $u$ is a word of $U$ and $v$ is a word of $V^\omega$, and so we use this constructed automaton as the automaton which recognises the language $U\cdot V^\omega$.

Formally speaking, given an NFA $\automaton{Q_1}{\language}{\delta_1}{q_{0_1}}{F_1}$ and a Büchi automaton $\automaton{Q_2}{\language}{\delta_2}{q_{0_2}}{F_2}$, we construct the automaton which recognises their concatenation as follows:

Let’s see if this works. We accept just when we visit a subset of the accepting states infinitely often, and that happens only if a suffix of our word is accepted by the underlying Büchi automaton. That happens only when we have an accepting run on the underlying NFA beforehand, and thus we can split our run into the finite prefix accepted by the underlying NFA, and the infinite suffix accepted by the underlying Büchi automaton. So we split up our input word into subwords $u$ and $v$, where $u$ is a word of $U$ and $v$ is a word of $V^\omega$, and we’re done.

With that done, we now just need to show that the finite union of the languages we’ve just defined are also Büchi recognisable.

Finite Unions

Given two Büchi recognisable languages denoted by $U_1\cdot V_1^\omega$ and $U_2\cdot V_2^\omega$, we construct a Büchi automaton that recognises $U_1\cdot V_1^\omega + U_2\cdot V_2$—that is, their union—as follows. We take the two languages, and synthesise from them two Büchi automata which recognise each respective language. We then add a new start state, and nondeterministically guess from that start state which of the Büchi automata we’re going to run our word on. And that’s it; it’s just like the analogous construction for recognising the finite unions of regular languages.

Formally, with Büchi automata $\automaton{Q_1}{\language}{\delta_1}{q_{0_1}}{F_1}$ and $\automaton{Q_2}{\language}{\delta_2}{q_{0_2}}{F_2}$, we define the Büchi automaton recognising their union as:

Let’s see if this works. Our constructed Büchi automaton accepts a word just when we visit a subset of $F$ infinitely often, and that happens just when we visit a subset of either $F_1$ or $F_2$ infinitely often. That happens precisely when we accept a word recognised by either our first or second automaton, and so the language our synthesised automaton accepts is just the union of the languages its constitutent automata accept. So we have what we want.

And that’s it! We’ve done all we need to show that every language denoted by an $\omega$-regular expression is Büchi recognisable. Now let’s see how big the automaton we get out the other end is.

Complexity

It’d first help to know the complexity of synthesising an NFA from a regular expression, since ultimately our $\omega$-regular expressions are built from regular expressions. Thankfully, we see:

The upshot of all this is that, given a regular expression, we can synthesise an NFA with a number of states linear in the size of the input regular expression. This is nice.

Moving to $\omega$-regular expressions, we get a similar result. In particular, looking at the above constructions, we see:

So we again get only a linear blowup in size. So it turns out that the synthesis problem for Büchi automata and $\omega$-regular expressions is quite easy. Wonderful.

Unfortunately (at least by a coarse-grained analysis), we don’t get such a nice increase in size for the analysis problem. Let’s see why.

From automata to expressions

Let’s now translate our Büchi automata back into $\omega$-regular expressions. To analyse a Büchi automaton into an $\omega$-regular expression, we need to know the words that the Büchi automaton accepts, and so to do this we’ll need to look at the underlying graph of our given Büchi automaton.

The main idea in this analysis is to split every run on our Büchi automaton into two parts:

Having done this translation of acceptance conditions into paths, we can now work on a proper algorithm to do the job for us.

Given these sets of paths, we need to transform them into an $\omega$-regular expression, and so we need to know all the words which take us between any given pair of states in a Büchi automaton. We have an algorithm to do this, by treating the Büchi automaton as an NFA, and then analysing that NFA into a regular expression. Regular expressions denote sets of words, and so we end up with the set of words that we wanted after all. To get the right set of words which take us between any two given states, we tweak the start and accept states, and analyse the resulting NFA.

In particular, given a Büchi automaton $\automaton{Q}{\language}{\delta}{q_0}{F}$, to find the sets of words which take us from a state $q$ to a state $r$, we construct an NFA with:

Denoting the set of words which takes us from state $q$ to state $q’$ as $W_{q \to q’}$, which is conveniently a regular expression, we then take our $\omega$-regular expression to be

\[ \bigcup\limits_{q \in F} W_{q_0 \to q} \cdot (W_{q \to q})^\omega \]

wherein we see the finite prefix we treat as a path to the chosen accepting state on the left, and the infinite prefix where we infinitely cycle around some accepting state on the right. Thus we’ve analysed a given Büchi automaton into an $\omega$-regular expression, as we wanted.

Complexity

Unfortunately, at least at a coarse anaylsis, we don’t get the same merely linear blowup with analysis as we did with synthesis. The trouble here is that analysing NFAs into regular expressions can take a polynomial amount of time. Since we have to analyse NFAs into regular expressions in order to construct an $\omega$-regular expression, this means that we end up with the analysis problem for Büchi automata in NP.

That tells us that it’s easier to turn an $\omega$-regular expression into a Büchi automaton than it is to turn a Büchi automaton into an $\omega$-regular expression. Taking that with the intuition (at the very least my intuition) that Büchi automata are easier to read than $\omega$-regular expressions, and it seems like despite being expressively equal Büchi automata are the richer representation of the Büchi-recognisable languages.

Concluding Remarks

We’ve seen how to transform $\omega$-regular expressions into Büchi automata and back again, and how easy it is to do that transformation in either direction. It’s nice to have this result as an extension of the analogous synthesis and analysis of finite automata and regular expressions, and so it’s nice to see that Büchi automata really are a natural extension of finite automata.

Bibliography