PARETO Procedure

Creates a PARETO chart with accompanying legend, cumulative percentages and annotations. Histogram data is displayed as a bar chart with values sorted in descending order of frequency; percentage data is displayed as an overlaid line plot.

Usage

PARETO, data

Input Parameters

data — A one-dimensional array of integer or string data. Integer data values should be sequential values representing the events, survey responses, and so on, that you wish to chart. If the values do not start at zero, use the Data_Start keyword (described below). String values representing the events may also be used; these are assigned integer values in alphabetical order (case-sensitive), starting at 0.

For string data: If you do not specify a LegendLabels array, the string values are used. If you do not specify an XTickName array, the legend labels also contain the histogram bin number.

Keywords

Several keywords below accept arrays whose element at position 0 is mapped to the bar representing the lowest value in your data. For example, if your data values range from 0 to 5 and you supply an array of color indices with the FillColors keyword, the color index at position 0 of the FillColors array is assigned to the bar representing the data value 0. If your data ranges from 1 to 5, it is assigned to the bar representing the value 1. This matters because the bars are sorted by frequency before they are drawn, and the values you supply in keyword arrays are sorted along with them. The first color in your color array is therefore not necessarily the color of the first bar on the graph.
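The mapping described above can be illustrated with a small sketch. The sample data and color indices are hypothetical, and the colors you see depend on your display settings and color table.

```pvwave
; Load the TEK_COLOR table so that small color indices are distinct
TEK_COLOR
; Hypothetical data: value 0 occurs once, 1 twice, 2 three times
data = [0, 1, 1, 2, 2, 2]
; Position 0 (index 2) belongs to value 0, position 1 (index 3) to
; value 1, and position 2 (index 4) to value 2
PARETO, data, FillColors=[2, 3, 4]
; The bar for value 2 is the tallest, so it is drawn first and keeps
; color index 4; the bar for value 0 is drawn last with index 2
```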

Annotation Keywords

Title — A string for the graph title. (Color is !P.Color)

XTitle — A string for the X axis title. (Color is BarAxisColor)

YTitle — A string for the Y axis title. (Color is BarAxisColor)

XTickName — An array of labels that appear along the X axis beneath the appropriate bars. The number of labels provided must match the number of bars produced by the histogram of the input data. The label in position zero of this array is associated with the bar representing the lowest integer value in your data, and so on. If this keyword is omitted, tick labels matching the integer value of the appropriate bar are generated automatically.

Data_Start — The lowest integer value in your data set. (Defaults to 0)

If your input data values range from M to N, where M is non-zero, set this keyword to M. It is used for the automatic XTickName generation.
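As a brief sketch of the Data_Start keyword, assume a hypothetical sample whose values are coded 1 through 5 instead of starting at zero:

```pvwave
; Hypothetical sample: five causes coded 1 through 5
data = [1, 1, 1, 2, 3, 3, 4, 5, 5, 5, 5]
; Data_Start=1 identifies 1 as the lowest value, so the automatically
; generated tick labels correspond to the values 1 through 5
PARETO, data, Data_Start=1
```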

Legend Keywords

NoLegend — A toggle that suppresses legend display.

LegendLabels — Either an integer (/LegendLabels) or an array of strings.

An integer causes the legend to be labelled with the integer value represented by each histogram bin. An array of strings must contain one string for each histogram bin, or bar, on the bar graph. The label in position zero of the array is associated with the bar representing the lowest integer value in your data, and so on.

Note:

If your input data is a string array, you may omit the LegendLabels keyword. In this case the string values from the data are used.
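A minimal sketch of supplying legend labels for integer data follows. The data and label strings are invented for illustration; the array holds one string per histogram bin, ordered from the lowest data value upward.

```pvwave
; Hypothetical sample with four distinct values, 0 through 3
data = [0, 0, 0, 1, 2, 2, 2, 2, 3, 3]
; One label per bin: position 0 labels value 0, position 1 labels
; value 1, and so on
labels = ['Out of stock', 'Bad address', 'Carrier delay', 'Data entry']
PARETO, data, LegendLabels=labels
```

Passing /LegendLabels instead of the string array labels the legend with the integer bin values.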

LegendCharSize — A number, default 1.0, for the size of the legend text.

DrawLegendBox — If set, draws a box around the legend.

LegendBoxColor — Chooses the color of that box. (Defaults to !P.Color)

LegendTextColor — Chooses the color of the text in the legend.

LegendPosition — Four-element, floating-point array specifying the normal coordinates of the legend. (Defaults to the right side of the graph: [0.8,0.1,0.95,0.8])

Bar Graph Keywords

NoBar — A toggle. If set to 1, suppresses display of the bar graph.

Position — Four-element, floating-point array specifying the normal coordinates of the main graph. (Defaults to [0.1,0.1,0.65,0.8])

FillColors — Either an integer (/FillColors) or an array of color indices.

An integer invokes the TEK_COLOR procedure and assigns color indices 2 through 31 to the bars, repeating if there are more than 30 bars. An array is mapped to the bars and sorted with them.

If this array is smaller than the number of bars on the graph the values will be repeated. If it is larger it will be truncated.

You may need to set 'DEVICE, pseudo=8' to see the colors properly.

FillLinestyle — Either an integer value or an array of linestyle values.

An integer assigns linestyle values 0 through 5 to bars on the graph, starting with the bar representing the lowest data value and repeating if there are more than six bars. An array of linestyles is mapped to the histogram values, sorted with them, and repeated as required.

FillLineColors — Integer color index or array of integer color indices indicating the color of the fill lines. Arrays are mapped to bars and sorted. Arrays are repeated or truncated as required.

FillThick — Thickness of fill lines, if FillLinestyle is turned on. May be a scalar (applies to all bars) or an array (applies to individual bars). (Defaults to !P.Thick)

FillOrientation — Orientation of fill lines (degrees CCW from horizontal), if FillLinestyle is turned on. May be a scalar (applies to all bars) or an array (applies to individual bars). (Defaults to 0)

FillSpacing — Spacing between fill lines (in cm), if FillLinestyle is turned on. May be a scalar (applies to all bars) or an array (applies to individual bars). (Defaults to 5 times FillThick, converted to cm)

OutlineColor — An integer, one color value for the outline of each bar. (Default: !P.Color)

Width — Sets the width of each bar. Width=1 makes adjacent bars touch; Width=0.5 leaves a gap of one bar's width between adjacent bars. (Default=0.8)

Background — A scalar specifying the color index for the background.

BarAxisColor — A scalar specifying the color of the bar graph axes. (Defaults to !P.Color)

Cumulative Percentage Keywords

NoPercentage — Toggle that suppresses display of percentage line.

PercentLineColor — Color index for the cumulative percentage line. A scalar applies to the entire line. An array maps to each line segment. Arrays are repeated or truncated as required and sorted along with the bars.

PercentAxisColor — Scalar color index of Y axis for percentage line. Appears on right side of chart.

PercentPsym — Selects plot symbol to display on percentage line. Scalar applies to all points. Array is mapped and sorted with the bars and repeated or truncated as required. (Default is no plot symbols)

PsymColor — Color index for plot symbols. Scalar or array. Scalar applies to all psyms. Array is mapped to each symbol, sorted with the bars and repeated/truncated as required.

MarkPercent — An integer between 0 and 100. Places a line across the graph at the indicated percentage.

MarkColor — Color index for the line drawn by MarkPercent. (Default is !P.Color)
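The cumulative percentage keywords can be combined as in the following sketch. The data is hypothetical, and the example assumes that PercentPsym accepts the usual plot symbol codes, where 4 selects a diamond.

```pvwave
; Hypothetical sample with five distinct values, 0 through 4
data = [0, 0, 0, 0, 1, 1, 2, 2, 2, 3, 4, 4]
; Draw a reference line at 80 percent and mark each point on the
; cumulative percentage line with a plot symbol
PARETO, data, MarkPercent=80, PercentPsym=4
```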

Discussion

The PARETO chart is most commonly used to visualize the familiar 80/20 rule. The input data is an integer representation of qualitative events, survey responses, and so on, or the string values of the responses themselves. For example, a company wishes to find out why some orders are shipped late. They determine that there are five main causes and assign sequential integer values to each cause. They then examine a representative sample of the late orders and assign to each the integer value associated with its cause, resulting in a sample-sized array of integers ranging in value from zero to four. When this data array is passed to the PARETO routine, a histogram of the data is taken, resulting in five bins, one for each event of interest.

The histogram data is sorted in descending order and displayed as a bar chart. Beneath each bar is the integer value represented by that bar, or whatever label you chose to assign to that bin. A line representing the cumulative percentage of the total sample is overlaid on the bar graph.
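The steps described above can be sketched by hand as follows. This illustrates the computation only and is not necessarily how PARETO is implemented internally. The sample counts are invented, and the sketch assumes that HISTOGRAM uses its default bin size of 1 starting at the minimum data value.

```pvwave
; Hypothetical sample: 50 late orders coded by cause, 0 through 4
data = [REPLICATE(0, 10), REPLICATE(1, 4), REPLICATE(2, 25), $
        REPLICATE(3, 6), REPLICATE(4, 5)]
; One bin per integer value
counts = HISTOGRAM(data)
; Indices of the bins, most frequent first
order = REVERSE(SORT(counts))
sorted = counts(order)
; Running total of the sorted counts, as a percentage of the sample
n = N_ELEMENTS(sorted)
cum = FLTARR(n)
cum(0) = sorted(0)
FOR i = 1, n - 1 DO cum(i) = cum(i - 1) + sorted(i)
percent = 100.0 * cum / TOTAL(counts)
PRINT, order      ; bin values in bar order: 2, 0, 3, 4, 1
PRINT, percent    ; cumulative percentages: 50, 70, 82, 92, 100
```

In this sample the two most frequent causes account for 70 percent of the late orders, which is the kind of concentration the chart is meant to reveal.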

Example

This example builds response data containing several instances of each possible response (to a survey, for example).

TEK_COLOR
data0 = REPLICATE('Yes',10)
data1 = REPLICATE('No',12)
data2 = REPLICATE('NA',6)
data3 = REPLICATE('Never',19)
data4 = REPLICATE('Always',9)
; When the responses are strings, they are assigned indices in
; alphabetical order. Thus, the 'Always' response from the data
; above will be assigned index '0'. In the optional label array, 
; you should place at index '0' the label you wish to associate 
; with the response that was alphabetically assigned index '0'.
data = [data0, data1, data2, data3, data4]
labels = ['Always', 'NA', 'Never', 'No', 'Yes']
; create the chart
PARETO, data, Xtickname=labels, /NoLegend, $
    FillColors=WoColorConvert(INDGEN(5)+4)

See Also

XRCHART, XSRCHART, XBAR, MACHART, CUSUM