kboxplot {UEM}    R Documentation

k-boxplots for mixture data

Description

k-boxplots visualize the k components of mixture models by k different boxes (Qarmalah, Einbeck & Coolen, 2016).

Usage

kboxplot(data, W=NULL, k, type="default", cen=0, colbox, 
       xlim, ylim, col, xlab, ylab, xaxt, main=type )

Arguments

data

a univariate data set for which a k-boxplot is to be produced. NAs are allowed in the data.

W

the responsibility matrix or weight matrix. This is an n x k matrix, where n is the length of the data set and k is the number of mixture components. If W is not provided, it is computed using EM.

k

the number of components.

type

specifies how observations outside the boxes are displayed. Possible types are "plain", which simply draws whiskers to the minimum and maximum observations; "default", which displays individual points coloured by MAP classification; and "full", which draws lines displaying the posterior probabilities. In addition, for k=2 only, the option "two" draws coloured lines on both sides of the boxes. The boxes are drawn in exactly the same way under all four options.

cen

a real number on the x-axis at which the k-boxplots are centered.

colbox

color(s) to fill or shade the rectangle(s) with. The default NA (or also NULL) means do not fill.

xlim, ylim

numeric vectors of length 2, giving the x and y coordinate ranges. By default, they are xlim=cen+c(-1,1) and ylim=c(min(data),max(data))+c(-1,1)*0.1, respectively.

col

if col is not missing, it is assumed to contain the colours used for the borders of the boxes, the lines and the points. By default, these are coloured using rainbow(k).

xlab

a title for the x axis: see title.

ylab

a title for the y axis: see title.

xaxt

A character which specifies the x axis type. Specifying "n" suppresses plotting of the axis. The standard value is "s": for compatibility with S values "l" and "t" are accepted but are equivalent to "s": any value other than "n" implies plotting.

main

an overall title for the plot

Details

The k-boxplot is a plot tailored to mixture data, where k is the number of mixture components. It visualizes the k components of a mixture model by k different boxes, whereas an ordinary boxplot has only one box. An ordinary boxplot is thus the special case of a k-boxplot with k=1. The bottom and top of each box are drawn at the weighted first and third quartiles of the data in the corresponding group, respectively. Weighted medians are displayed as horizontal lines inside the boxes. Optionally, the posterior probabilities of group membership can be visualised by appropriate lines and points. The information required to draw a k-boxplot can be estimated by different methods, for example by the EM algorithm.
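
As a rough illustration of the weighted quartiles described above, the following base R sketch computes the box statistics for each component from a weight matrix. The helper weighted_quantile, the toy data x and the matrix W are hypothetical and introduced only for this illustration. The quantile convention used here (inverting the weighted empirical distribution function) is an assumption and may differ from the definition used inside kboxplot.

```r
# Weighted quantile by inverting the weighted empirical CDF.
# This is one common convention, assumed here for illustration only;
# kboxplot may use a different definition internally.
weighted_quantile <- function(x, w, p) {
  keep <- !is.na(x) & !is.na(w)      # drop missing observations
  x <- x[keep]
  w <- w[keep]
  ord <- order(x)                    # sort data, carry weights along
  x <- x[ord]
  w <- w[ord]
  cum <- cumsum(w) / sum(w)          # weighted empirical CDF at sorted x
  x[which(cum >= p)[1]]              # smallest x whose CDF reaches p
}

# Toy data with a hypothetical 6 x 2 responsibility matrix (rows sum to 1)
x <- c(1, 2, 3, 10, 11, 12)
W <- cbind(c(0.9, 0.9, 0.9, 0.1, 0.1, 0.1),
           c(0.1, 0.1, 0.1, 0.9, 0.9, 0.9))

# One column per component: weighted Q1, median and Q3
box_stats <- sapply(seq_len(ncol(W)), function(j) {
  sapply(c(0.25, 0.5, 0.75), function(p) weighted_quantile(x, W[, j], p))
})
rownames(box_stats) <- c("Q1", "median", "Q3")
print(box_stats)
```

Each column of box_stats corresponds to one box: the first and third rows give the bottom and top of the box, and the second row gives the line drawn inside it. With a single column of equal weights, the same helper yields the statistics of an ordinary boxplot under this quantile convention.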

Value

A plotted k-boxplot

Author(s)

N. Qarmalah and J. Einbeck

References

Qarmalah, Einbeck and Coolen (2016). k-Boxplots for Mixture data. Statistical Papers 59(2): 513-528.

See Also

EM

Examples


# This code can be used to reproduce all examples in Qarmalah, Einbeck and Coolen (2016).

# Energy use data: 
 data(energy2)
 eng<-energy2[,"2011"]
 W<-EM(eng,2)$W
 par(mfrow=c(2,2))
 kboxplot(eng,W,2, xlab="2011", ylab="log energy use", type="plain")
 kboxplot(eng,W,2, xlab="2011", ylab="log energy use", type="default")
 kboxplot(eng,W,2, xlab="2011", ylab="log energy use", type="full")
 kboxplot(eng,W,2, xlab="2011", ylab="log energy use", type="two")


# Internet users data

 data(WWWusage)
 par(mfrow=c(1,2))
 E3 <- EM(log(WWWusage),3, lambda=1, init="gq", tol=2) # unequal component variances
 kboxplot(log(WWWusage),E3$W,3,main="(a)", type="default")

 E3a<- EM(log(WWWusage),3, lambda=0) # equal component variances
 kboxplot(log(WWWusage),E3a$W,3,main="(b)", type="default")


 E4<- EM(log(WWWusage),4, lambda=1, init="gq", tol=2) # unequal component variances
 kboxplot(log(WWWusage),E4$W,4,main="(a)", type="full")

 E4a<-  EM(log(WWWusage),4, lambda=0) # equal component variances
 kboxplot(log(WWWusage),E4a$W,4,main="(b)", type="full")

# Toxoplasmosis (rainfall) data

require(npmlreg)
data(rainfall)

toxo.np3<- alldist(cbind(Cases,Total-Cases) ~ 1, random=~1, random.distribution="np", family=binomial(link=logit), data=rainfall,k=3, plot.opt=0, verbose=FALSE)
W <- post(toxo.np3)$prob

par(mfrow=c(1,2))
kboxplot(rainfall$Cases/rainfall$Total, W, 3, ylim=c(0,0.75), main="cases/total")
kboxplot(toxo.np3$fitted, W, 3, ylim=c(0,0.75), main="fitted")
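
# Hand-built responsibility matrix (illustrative sketch, not taken from the paper)

# The argument W can come from any method that yields posterior membership probabilities.
# The sketch below builds such a matrix for simulated data from a two-component normal
# mixture with parameters assumed known, and derives the MAP classification that the
# "default" type is described as using for colouring. All names (y, pi_k, mu, sigma,
# dens, map) are hypothetical; the kboxplot call is guarded so that the block also
# runs when UEM is not installed.

```r
# Simulated data from a hypothetical two-component normal mixture
set.seed(1)
y <- c(rnorm(30, mean = 0, sd = 1), rnorm(20, mean = 5, sd = 1))

# Mixture parameters, assumed known for this illustration
pi_k  <- c(0.6, 0.4)   # mixing proportions
mu    <- c(0, 5)       # component means
sigma <- c(1, 1)       # component standard deviations

# Responsibilities: w_ij is proportional to pi_j * dnorm(y_i, mu_j, sigma_j)
dens <- sapply(1:2, function(j) pi_k[j] * dnorm(y, mu[j], sigma[j]))
W <- dens / rowSums(dens)            # n x k matrix, each row sums to 1

# MAP classification: component with the largest posterior probability
map <- apply(W, 1, which.max)
print(table(map))

# Draw the k-boxplot only if the UEM package is available
if (requireNamespace("UEM", quietly = TRUE)) {
  UEM::kboxplot(y, W, 2, type = "default")
}
```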


[Package UEM version 0.3-1 Index]