Nodes and Information Transmission

The System

This example is taken (initially) from Chapter 5 of Goldstein & Wooff. We have a chemical process that leads to a product \(Y\) and a by-product \(Z\). We suspect that the yields of both \(Y\) and \(Z\) are related to the temperature \(X\). For multiple values of the temperature, \(X=(x_1,\,x_2,\dots,\,x_{12})\), we have values of \(Y=(y_1,\dots,y_{12})\) and \(Z=(z_1,\dots,z_{12})\), and we propose the following linear relations: \[ Y_{i} = a + b x_{i} + e_{i},\qquad Z_{i} = c + d x_{i} + f_{i}, \] where \(e_i\) and \(f_i\) are 'noise' terms (having zero mean and non-zero variance). The uncertainty in our values of the parameters \(a,\,b,\,c,\,d\) is encapsulated by a variance matrix \[ \text{Var}\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 4 & -6 & -1 & 0 \\ -6 & 225 & 0 & -90 \\ -1 & 0 & 1 & -2.4 \\ 0 & -90 & -2.4 & 144 \end{pmatrix}. \] In addition, the noise terms have \(\text{Var}(e_i)=6.25\), \(\text{Var}(f_i)=4\) and \(\text{Cov}(e_i,\,f_i)=2.5\), and they are uncorrelated with the parameters \(a,\dots,d\). We can then think about how those uncertainties propagate through the system, and which quantities have the largest impact on reducing the uncertainty in \(Y\) and \(Z\).
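
To make the propagation concrete, here is a minimal sketch in Python (NumPy) of how the parameter and noise variances above combine into the variance of a single pair \((Y_i,\,Z_i)\). The function name `yield_variance` and the temperature values passed to it are assumptions made for illustration; the actual temperatures \(x_1,\dots,x_{12}\) are not listed here.

```python
import numpy as np

# Prior variance matrix of the parameters (a, b, c, d), copied from the text.
PARAM_VAR = np.array([
    [ 4.0,  -6.0, -1.0,   0.0],
    [-6.0, 225.0,  0.0, -90.0],
    [-1.0,   0.0,  1.0,  -2.4],
    [ 0.0, -90.0, -2.4, 144.0],
])

# Variance matrix of the noise pair (e_i, f_i), copied from the text.
NOISE_VAR = np.array([
    [6.25, 2.5],
    [2.5,  4.0],
])


def yield_variance(x):
    """Return the 2x2 variance matrix of (Y_i, Z_i) at temperature x.

    Uses Y_i = a + b*x + e_i and Z_i = c + d*x + f_i, with the noise
    uncorrelated with the parameters, so the two contributions add.
    """
    # Row 1 picks out a + b*x, row 2 picks out c + d*x.
    design = np.array([
        [1.0, x, 0.0, 0.0],
        [0.0, 0.0, 1.0, x],
    ])
    return design @ PARAM_VAR @ design.T + NOISE_VAR


if __name__ == "__main__":
    # Hypothetical temperatures, chosen only to show the dependence on x.
    for x in (0.0, 0.5, 1.0):
        print(x, yield_variance(x))
```

At \(x=0\) only the intercepts and the noise contribute, giving \(\text{Var}(Y_i)=4+6.25=10.25\) and \(\text{Var}(Z_i)=1+4=5\); for larger \(|x|\) the slope variances \(\text{Var}(b)=225\) and \(\text{Var}(d)=144\) quickly dominate.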

The graphical representation is one way of seeing how uncertainty resolution occurs. For ease of viewing, we have grouped the individual \(y_i\), \(z_i\), \(e_i\) and \(f_i\) into a single node each (\(Y\), \(Z\), \(E\) and \(F\) respectively). We first determine which nodes influence which: we think of this as nodes being 'parents' of others (or, correspondingly, nodes being 'children' of other nodes). There is a mathematical framework for determining these relations (see 'Bayes linear sufficiency and belief separation'); some obvious links are that \(Y\) must have \(E\), \(a\) and \(b\) as parents, and similarly \(\text{Pa}(Z)=\{F,\,c,\,d\}\). Each directed arc in the diagram indicates a parent-child relationship, with the arrow pointing from parent to child. The nodes themselves are labelled and coloured, for reasons that we explain now.

Any parent of a node can be responsible for some of the uncertainty in its child; we can therefore think of a parent 'resolving' a proportion of the variation in its children. The coloured segments in each node show this: a larger segment means that the parent node of that colour resolves more of the uncertainty in the child. Not all of the uncertainty in a node is guaranteed to be explained by its parents (for example, \(b\) has a large variance of its own, which is unlikely to be explained by its one parent, \(a\)); therefore, we do not expect every node to be completely filled with colour. However, the nodes \(Y\) and \(Z\) are fully filled: this is a consequence of the fact that they are entirely determined by their linear relations, and so inherit all of their uncertainty from their parents (the parameters and the noise terms).
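
As a rough check on the remark about \(b\), assume the usual Bayes linear definition of resolution: the fraction of a quantity's prior variance that is removed by adjusting by another quantity. Using only the entries of the variance matrix above, the proportion of \(\text{Var}(b)\) resolved by \(a\) alone is \[ R_a(b) = \frac{\text{Cov}(a,\,b)^2}{\text{Var}(a)\,\text{Var}(b)} = \frac{(-6)^2}{4 \times 225} = 0.04, \] so about 96% of the variance of \(b\) is left unexplained by \(a\). The notation \(R_a(b)\) is introduced here for illustration, and the segment sizes in the diagram may be computed in a more refined way than this pairwise figure.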

One important point is that, even if two parameters are uncorrelated (for example, \(b\) and \(c\)), they can still have a parent-child relationship. This is because, even though there is no direct correlation between them, \(b\) contributes to resolving the uncertainty in \(c\) in conjunction with \(a\): more uncertainty in \(c\) is explained by the combination \(\{a,\,b\}\) than by \(b\) alone. This subtlety is addressed by the bars superimposed on the links. The coloured segment nearest the parent node (in the parent's colour) represents the amount of information leaving the parent for the child; the coloured segment nearest the child (in the child's colour) represents the information arriving at the child from the parent. These need not be the same, as we can see.
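
The effect of combining parents can be checked numerically. The sketch below assumes the standard Bayes linear resolved variance, \(\text{Cov}(B,D)\,\text{Var}(D)^{-1}\,\text{Cov}(D,B)\), divided by \(\text{Var}(B)\); the helper name `resolution` is hypothetical, and the values it returns are not necessarily the quantities drawn on the link bars.

```python
import numpy as np

NAMES = ["a", "b", "c", "d"]

# Prior variance matrix of the parameters (a, b, c, d), copied from the text.
PARAM_VAR = np.array([
    [ 4.0,  -6.0, -1.0,   0.0],
    [-6.0, 225.0,  0.0, -90.0],
    [-1.0,   0.0,  1.0,  -2.4],
    [ 0.0, -90.0, -2.4, 144.0],
])


def resolution(target, given):
    """Fraction of Var(target) resolved by adjusting by the `given` parameters."""
    t = NAMES.index(target)
    g = [NAMES.index(name) for name in given]
    cov = PARAM_VAR[t, g]                # Cov(target, given), as a 1-D array
    var_given = PARAM_VAR[np.ix_(g, g)]  # Var(given)
    # Resolved variance: Cov(t, g) Var(g)^{-1} Cov(g, t).
    resolved = cov @ np.linalg.solve(var_given, cov)
    return resolved / PARAM_VAR[t, t]


if __name__ == "__main__":
    print("c given a:     ", resolution("c", ["a"]))
    print("c given b:     ", resolution("c", ["b"]))
    print("c given a and b:", resolution("c", ["a", "b"]))
```

Under these assumptions, \(b\) on its own resolves none of the variance of \(c\) and \(a\) on its own resolves \(1/4\), while the pair \(\{a,\,b\}\) resolves \(225/864 \approx 0.26\): knowing \(b\) sharpens what \(a\) tells us about \(c\), even though \(b\) and \(c\) are uncorrelated.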

Alternate Node Configuration

The grouping of nodes chosen here is not the only way to visualise the data. A few other options, generated by allowed operations on the nodes, are available in the drop-down list: