Appendix A: Background

A.1 O notation

Throughout this chapter and the rest of the book, we will describe the asymptotic behaviour of a function using \(O\) notation.

For two functions \(f(t)\) and \(g(t)\), we say that \(f(t) \le O(g(t))\) if \(f\) is asymptotically upper bounded by \(g\). Formally, this means that there exist constants \(C > 0\) and \(t_0\) such that \(f(t) \le C \cdot g(t)\) for all \(t \ge t_0\).
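For example, \(3t^2 + 5t \le O(t^2)\): taking \(C = 4\) and \(t_0 = 5\), we have \(5t \le t^2\) for all \(t \ge 5\), and hence \(3t^2 + 5t \le 4t^2\).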

We say \(f(t) < o(g(t))\) if \(f\) grows asymptotically strictly slower than \(g\). Formally, this means that for every constant \(C > 0\), there exists some \(t_0\) such that \(f(t) \le C \cdot g(t)\) for all \(t > t_0\). Equivalently, \(f(t) < o(g(t))\) if \(\lim_{t \to \infty} f(t)/g(t) = 0\).
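For example, \(\log t < o(t)\), since \((\log t)/t \to 0\) as \(t \to \infty\). On the other hand, \(t \le O(t)\) but \(t \not< o(t)\), since \(t/t = 1\) does not tend to zero.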

\(f(t) = \Theta(g(t))\) means that \(f\) and \(g\) grow at the same rate asymptotically. That is, \(f(t) \le O(g(t))\) and \(g(t) \le O(f(t))\).
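Continuing the example above, \(3t^2 + 5t = \Theta(t^2)\): we have \(3t^2 + 5t \le 4t^2\) for all \(t \ge 5\), and conversely \(t^2 \le 3t^2 + 5t\) for all \(t \ge 0\).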

Finally, we use \(f(t) \ge \Omega(g(t))\) to mean that \(g(t) \le O(f(t))\), and \(f(t) > \omega(g(t))\) to mean that \(g(t) < o(f(t))\).
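For example, \(t^2 \ge \Omega(t \log t)\), and in fact \(t^2 > \omega(t \log t)\), since \((t \log t)/t^2 = (\log t)/t \to 0\).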

We also use the notation \(\widetilde O(g(t))\) to hide logarithmic factors. That is, \(f(t) = \widetilde O(g(t))\) if there exist constants \(C > 0\) and \(k \ge 0\) such that \(f(t) \le C \cdot g(t) \cdot \log^k(t)\) for all sufficiently large \(t\).
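For example, \(t \log^3 t = \widetilde O(t)\) (take \(k = 3\)), even though \(t \log^3 t \not\le O(t)\).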

Occasionally, we will also use \(O(f(t))\) (or one of the other symbols) as shorthand to manipulate function classes. For example, we might write \(O(f(t)) + O(g(t)) = O(f(t) + g(t))\) to mean that the sum of two functions in \(O(f(t))\) and \(O(g(t))\) is in \(O(f(t) + g(t))\).
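As a concrete instance, \(O(t^2) + O(t \log t) = O(t^2 + t \log t) = O(t^2)\), since \(t \log t \le O(t^2)\).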

A.2 Union bound

Theorem A.1 (Union bound) Consider a set of events \(A_0, \dots, A_{N-1}\). Then

\[ \mathbb{P}\left( \bigcup_{n=0}^{N-1} A_n \right) \le \sum_{n=0}^{N-1} \mathbb{P}(A_n). \tag{A.1}\]

In particular, if \(\mathbb{P}(A_n) \ge 1 - \delta\) for each \(n \in [N]\), we have

\[ \mathbb{P}\left( \bigcap_{n=0}^{N-1} A_n \right) \ge 1 - N \delta. \tag{A.2}\]
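To see this, apply (A.1) to the complements: since \(\mathbb{P}(A_n^c) \le \delta\) for each \(n\),

\[ \mathbb{P}\left( \bigcap_{n=0}^{N-1} A_n \right) = 1 - \mathbb{P}\left( \bigcup_{n=0}^{N-1} A_n^c \right) \ge 1 - \sum_{n=0}^{N-1} \mathbb{P}(A_n^c) \ge 1 - N \delta. \]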

In other words, if each event \(A_n\) individually has a small probability \(\delta\) of “failure” (i.e. of not occurring), then the probability that any of the \(N\) events fails is at most \(N \delta\).
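As a quick numerical sanity check of (A.1), the following sketch estimates both sides by Monte Carlo for a family of deliberately dependent (nested) events. The thresholds, trial count, and seed are illustrative choices, not part of the text.

```python
# Monte Carlo sanity check of the union bound (A.1), using only the
# standard library. The events A_0, ..., A_{N-1} are nested (hence far
# from independent); the bound holds regardless of dependence.
import random

random.seed(0)
N, trials = 5, 100_000
hits_any = 0
hits_each = [0] * N

for _ in range(trials):
    u = random.random()
    # Event A_n occurs when u < 0.1 * (n + 1) -- illustrative thresholds.
    occurred = [u < 0.1 * (n + 1) for n in range(N)]
    hits_any += any(occurred)
    for n in range(N):
        hits_each[n] += occurred[n]

p_union = hits_any / trials
p_sum = sum(h / trials for h in hits_each)
print(f"P(union) ~ {p_union:.3f} <= sum of P(A_n) ~ {p_sum:.3f}")
```

Because the events here are nested, \(\mathbb{P}(\bigcup_n A_n)\) equals the largest individual probability (about \(0.5\)), while the right-hand side sums to about \(1.5\); the bound holds but is loose, which is typical when events overlap heavily.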