Nodes and Information Transmission
The System
This example is taken from Goldstein & Wooff, where it first appears in Chapter 5. We have a chemical process that leads to a product \(Y\) and a by-product \(Z\). We suspect that the yields of \(Y\) and \(Z\) both depend on the temperature \(X\). For multiple values of the temperature, \(X=(x_1,\,x_2,\dots,x_{12})\), we have values of \(Y=(y_1,\dots,y_{12})\) and \(Z=(z_1,\dots,z_{12})\), and we propose the following linear relations: \[ Y_{i} = a + b x_{i} + e_{i},\qquad Z_{i} = c + d x_{i} + f_{i}, \] where \(e_i\) and \(f_i\) are 'noise' terms (having zero mean and non-zero variance). The uncertainty in our values of the parameters \(a,\,b,\,c,\,d\) is encapsulated by a variance matrix \[ \text{Var}\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 4 & -6 & -1 & 0 \\ -6 & 225 & 0 & -90 \\ -1 & 0 & 1 & -2.4 \\ 0 & -90 & -2.4 & 144 \end{pmatrix}. \] The noise terms have \(\text{Var}(e_i)=6.25\), \(\text{Var}(f_i)=4\) and \(\text{Cov}(e_i,\,f_i)=2.5\), and are uncorrelated with the parameters \(a,\dots,d\). We can then ask how these uncertainties propagate through the system, and which quantities contribute most to reducing the uncertainty in \(Y\) and \(Z\).
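To make the specification concrete, the following is a minimal sketch of how the parameter and noise variances combine into the variance of a single pair \((Y_i, Z_i)\). The names `PARAM_VAR`, `NOISE_VAR` and `yield_variance` are ours, and the temperature \(x=1\) used in the example is purely illustrative, since the actual temperatures are not listed here.

```python
import numpy as np

# Variance matrix of the parameters (a, b, c, d), as specified in the text.
PARAM_VAR = np.array([
    [ 4.0,  -6.0, -1.0,   0.0],
    [-6.0, 225.0,  0.0, -90.0],
    [-1.0,   0.0,  1.0,  -2.4],
    [ 0.0, -90.0, -2.4, 144.0],
])

# Variance matrix of the noise pair (e_i, f_i), as specified in the text.
NOISE_VAR = np.array([
    [6.25, 2.5],
    [2.5,  4.0],
])


def yield_variance(x):
    """Return the 2x2 variance matrix of (Y_i, Z_i) at temperature x."""
    # Row 1 maps (a, b, c, d) to a + b*x; row 2 maps it to c + d*x.
    design = np.array([
        [1.0, x,   0.0, 0.0],
        [0.0, 0.0, 1.0, x],
    ])
    # The noise is uncorrelated with the parameters, so the variances add.
    return design @ PARAM_VAR @ design.T + NOISE_VAR


# Hypothetical temperature, chosen only for illustration.
print(yield_variance(1.0))
```

At \(x=1\) this gives \(\text{Var}(Y_i)=223.25\), \(\text{Var}(Z_i)=144.2\) and \(\text{Cov}(Y_i,Z_i)=-88.5\), which shows how strongly the slope variances dominate once \(x\) is away from zero.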
The graphical representation is one way of seeing how uncertainty resolution occurs. For ease of viewing, we have combined the individual \(y_i\) into a single node \(Y\), and likewise the \(z_i\), \(e_i\) and \(f_i\) into single nodes \(Z\), \(E\) and \(F\). We first determine which nodes affect which: we think of nodes as being 'parents' of others (or, correspondingly, 'children' of other nodes). There is a mathematical framework for determining these relations (see 'Bayes linear sufficiency and belief separation'); some obvious links are that \(Y\) must have \(E\), \(a\) and \(b\) as parents, and similarly \(\text{Pa}(Z)=\{F,\,c,\,d\}\). Each directed arc in the diagram indicates a parent-child relationship, with the arrow pointing from parent to child. The nodes themselves are labelled and coloured, for reasons that we explain now.
Any parent of a node can be responsible for some of the uncertainty in its child; we can therefore think of a parent 'resolving' a proportion of the variation in its children. The coloured segments in each node show this: a larger segment means the parent node of that colour resolves more of the uncertainty in the child. The parents of a node are not guaranteed to explain all of its uncertainty (for example, \(b\) has a large variance of its own, which is unlikely to be explained by its one parent, \(a\)); therefore, we do not expect every node to be completely filled with colour. However, the nodes \(Y\) and \(Z\) are fully filled: they are entirely determined by their linear relations, and so inherit all of their uncertainty from their parents (the parameters and the noise terms).
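As a rough check of the claim about \(b\), and assuming the segment sizes correspond to the usual Bayes linear resolution (resolved variance divided by prior variance, which is our reading rather than something stated above), the proportion of \(\text{Var}(b)\) resolved by \(a\) alone would be \[ \frac{\text{Cov}(a,b)^2}{\text{Var}(a)\,\text{Var}(b)} = \frac{(-6)^2}{4\times 225} = 0.04, \] so under this reading only about 4% of the node \(b\) would be coloured.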
One important point is that parameters can have a parent-child relationship even if they are uncorrelated (for example, \(b\) and \(c\)). This is because, even though there is no direct correlation between them, \(b\) contributes to resolving the uncertainty in \(c\) in conjunction with \(a\): more of the uncertainty in \(c\) is explained by the combination \(\{a,\,b\}\) than by \(a\) alone. This subtlety is addressed by the bars superimposed on the links. The coloured segment nearest the parent node (and coloured in the parent colour) represents the amount of information leaving the parent for the child; the coloured segment closest to the child (coloured in the child colour) represents the information arriving at the child from the parent. These need not be the same, as we can see.
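The joint effect of \(a\) and \(b\) on \(c\) can be checked numerically. The sketch below assumes that 'resolving uncertainty' means the Bayes linear resolution, that is, the resolved variance \(\text{Cov}(c,P)\,\text{Var}(P)^{-1}\,\text{Cov}(P,c)\) divided by \(\text{Var}(c)\) for a parent set \(P\); the function name `resolution` is ours.

```python
import numpy as np

NAMES = ["a", "b", "c", "d"]

# Variance matrix of the parameters (a, b, c, d), as specified in the text.
PARAM_VAR = np.array([
    [ 4.0,  -6.0, -1.0,   0.0],
    [-6.0, 225.0,  0.0, -90.0],
    [-1.0,   0.0,  1.0,  -2.4],
    [ 0.0, -90.0, -2.4, 144.0],
])


def resolution(child, parents):
    """Proportion of Var(child) resolved by the given set of parents."""
    c = NAMES.index(child)
    p = [NAMES.index(name) for name in parents]
    cov = PARAM_VAR[c, p]               # Cov(child, parents)
    var_p = PARAM_VAR[np.ix_(p, p)]     # Var(parents)
    resolved = cov @ np.linalg.solve(var_p, cov)
    return float(resolved / PARAM_VAR[c, c])


print(resolution("c", ["b"]))        # 0.0: b and c are uncorrelated
print(resolution("c", ["a"]))        # 0.25
print(resolution("c", ["a", "b"]))   # 225/864, about 0.2604
```

Although \(b\) on its own resolves nothing in \(c\), adding it to \(a\) raises the resolution from \(0.25\) to about \(0.26\), because \(b\) is correlated with \(a\).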
Alternate Node Configuration
The grouping of nodes chosen above is not the only way to visualise the data. A few other options, generated by allowed operations on the nodes, are available in the drop-down list:
Grouped Parameters
The sets \(\{a,b\}\) and \(\{c,d\}\) can be naturally combined into two nodes, which we call \(G_y\) and \(G_z\) respectively. This is allowed because \(a\) and \(b\) have the same parents (while \(a\) is a parent of \(b\), we can consider this an 'internal' link) and the same children (\(c\), \(d\) and \(Y\)), so we can make a single node out of them with these parent and child sets. Similarly, \(c\) and \(d\) share parents (\(a\) and \(b\)) and children (\(Z\)), so they can be made into a single node. This simplifies the node structure, while maintaining the separation between the parameters \(a,\dots,d\) and the noise \(E,F\). We could also combine \(E\) and \(F\) into a single node with children \(Y,Z\), but this may obscure details.
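The merging criterion used for \(\{a,b\}\) and \(\{c,d\}\) can be sketched as a simple check on the graph. The parent sets below are read off from the description above and should be treated as an illustrative assumption about the diagram; the names `PARENTS`, `children` and `can_merge` are ours.

```python
# Parent sets as described in the text (an assumption about the diagram).
PARENTS = {
    "a": set(),
    "b": {"a"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "E": set(),
    "F": set(),
    "Y": {"E", "a", "b"},
    "Z": {"F", "c", "d"},
}


def children(node):
    """Return the set of nodes that list `node` among their parents."""
    return {child for child, pa in PARENTS.items() if node in pa}


def can_merge(u, v):
    """Check whether u and v share parents and children.

    Links between u and v themselves are treated as internal and ignored.
    """
    pair = {u, v}
    same_parents = PARENTS[u] - pair == PARENTS[v] - pair
    same_children = children(u) - pair == children(v) - pair
    return same_parents and same_children


print(can_merge("a", "b"))  # True
print(can_merge("c", "d"))  # True
print(can_merge("a", "c"))  # False
```

This check is deliberately strict: it covers only the criterion stated for \(\{a,b\}\) and \(\{c,d\}\). The suggested merge of \(E\) and \(F\), whose children differ, would rely on a looser rule that is not modelled here.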
Reversed Hierarchy
If we were to observe the values of \(Y\) and \(Z\), then it would be useful to reverse the direction of travel, so that our observations induce a change in our understanding of the parameters. We then consider \(Y\) and \(Z\) as parents, and the parameters and noise terms as children of both, to see the propagation of uncertainty from the observed quantities to the parameters. This is useful if we have observations and wish to check the validity of the model.
With this grouping of nodes, the relationship between \(Y\) and \(Z\) and their parameters is clearer. While a lot of information leaves both \(Y\) and \(Z\) for all of the parameters, the information arriving at the parameters is very different: information leaving \(Y\) has a greater impact on \(E\), \(a\) and \(b\) (while barely affecting the other three, \(F\), \(c\) and \(d\)), and vice versa for \(Z\).
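The asymmetry described above can be illustrated with a small calculation of how much of each parameter's variance is resolved by observing \(Y\) alone or \(Z\) alone. This is only a sketch under two assumptions that are not part of the specification given here: the twelve temperatures are a made-up, evenly spaced grid, and the noise terms are uncorrelated between different runs. The function name `resolution_by_observations` is ours.

```python
import numpy as np

# Variance matrix of the parameters (a, b, c, d), as specified in the text.
PARAM_VAR = np.array([
    [ 4.0,  -6.0, -1.0,   0.0],
    [-6.0, 225.0,  0.0, -90.0],
    [-1.0,   0.0,  1.0,  -2.4],
    [ 0.0, -90.0, -2.4, 144.0],
])
# Variance matrix of the noise pair (e_i, f_i), as specified in the text.
NOISE_VAR = np.array([[6.25, 2.5], [2.5, 4.0]])


def resolution_by_observations(x):
    """Resolution of (a, b, c, d) by Y alone and by Z alone."""
    n = len(x)
    basis = np.column_stack([np.ones(n), x])      # rows (1, x_i)
    design = np.kron(np.eye(2), basis)            # maps (a,b,c,d) to E[(Y, Z)]
    cov_gh = PARAM_VAR @ design.T                 # Cov(parameters, (Y, Z))
    # ASSUMPTION: noise is uncorrelated between different runs i != j.
    var_h = design @ cov_gh + np.kron(NOISE_VAR, np.eye(n))
    out = {}
    for name, idx in (("Y", slice(0, n)), ("Z", slice(n, 2 * n))):
        cov = cov_gh[:, idx]
        resolved = cov @ np.linalg.solve(var_h[idx, idx], cov.T)
        out[name] = np.diag(resolved) / np.diag(PARAM_VAR)
    return out


# ASSUMPTION: hypothetical temperatures, not the values used in the example.
x = np.linspace(-1.0, 1.0, 12)
for source, res in resolution_by_observations(x).items():
    print(source, dict(zip("abcd", np.round(res, 3))))
```

For this hypothetical grid, \(Y\) resolves far more of \(a\) and \(b\) than \(Z\) does, and \(Z\) far more of \(c\) and \(d\); the exact figures depend on the temperatures chosen.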
Heart of the Transform
Note that much of the uncertainty in \(Y\) and \(Z\) is caused by the noise terms, \(E\) and \(F\). This may not be helpful for visual diagnostics. Instead, we can consider the sets \(G=\{a,b,c,d\}\) and \(H=\{Y,Z\}\). Then \(H=\{y_1,\dots,y_{12},z_1,\dots,z_{12}\}\) is \(24\)-dimensional, but there are four linear combinations of the \(y_i\) and \(z_i\) that are most affected by the meaningful parameters \(G\). We call this set the heart of the transform, and denote the set of four combinations \(W^+\); the remaining \(20\)-dimensional space is denoted \(W_0\). By construction, \(W^+\) and \(W_0\) are uncorrelated. Then we can look at the resolution in this set of nodes. This is particularly useful for two reasons. The heart \(W^+\) can be used solely for belief adjustment, and is simpler to deal with than the full space \(\{Y,Z\}\). The orthogonal space \(W_0\) is useful as a diagnostic tool for the specification, in the presence of observed values.
We can see that the heart shows much more clearly how the parameters \(G\) resolve the uncertainty in the 'useful' directions of \(H\); the contribution \(E\) and \(F\) make to \(W_0\) also more clearly shows the symmetry between the two sets in our original specification.
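One way the split into \(W^+\) and \(W_0\) could be computed is sketched below. It assumes that the heart is spanned by the eigenvectors, with non-zero eigenvalue, of the resolution transform \(\text{Var}(H)^{-1}\text{Cov}(H,G)\text{Var}(G)^{-1}\text{Cov}(G,H)\); it also assumes a made-up temperature grid and noise that is uncorrelated between runs. The name `heart_of_transform` is ours, and this is not necessarily how the diagram was produced.

```python
import numpy as np

# Variance matrix of the parameters (a, b, c, d), as specified in the text.
PARAM_VAR = np.array([
    [ 4.0,  -6.0, -1.0,   0.0],
    [-6.0, 225.0,  0.0, -90.0],
    [-1.0,   0.0,  1.0,  -2.4],
    [ 0.0, -90.0, -2.4, 144.0],
])
# Variance matrix of the noise pair (e_i, f_i), as specified in the text.
NOISE_VAR = np.array([[6.25, 2.5], [2.5, 4.0]])


def heart_of_transform(x, tol=1e-8):
    """Split H = (Y, Z) into resolved directions W+ and unresolved W0."""
    n = len(x)
    basis = np.column_stack([np.ones(n), x])
    design = np.kron(np.eye(2), basis)            # maps (a,b,c,d) to E[H]
    cov_hg = design @ PARAM_VAR                   # Cov(H, G)
    # ASSUMPTION: noise is uncorrelated between different runs i != j.
    var_h = cov_hg @ design.T + np.kron(NOISE_VAR, np.eye(n))
    # Whiten H so the transform becomes symmetric and eigh can be used.
    chol = np.linalg.cholesky(var_h)
    whitened = np.linalg.solve(chol, cov_hg)
    sym = whitened @ np.linalg.solve(PARAM_VAR, whitened.T)
    eigvals, eigvecs = np.linalg.eigh(sym)
    # Map eigenvectors back to coefficients of linear combinations of H.
    coeffs = np.linalg.solve(chol.T, eigvecs)
    keep = eigvals > tol
    return eigvals[keep], coeffs[:, keep], coeffs[:, ~keep], var_h


# ASSUMPTION: hypothetical temperatures, not the values used in the example.
x = np.linspace(-1.0, 1.0, 12)
resolutions, w_plus, w_zero, var_h = heart_of_transform(x)
print(w_plus.shape, w_zero.shape)   # (24, 4) (24, 20)
print(np.round(resolutions, 3))     # resolutions of the four heart directions
```

Each column of `w_plus` or `w_zero` holds the coefficients of one linear combination of the \(y_i\) and \(z_i\). Working in whitened coordinates makes these combinations mutually uncorrelated, which matches the statement that \(W^+\) and \(W_0\) are uncorrelated by construction.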