Median polish

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

The median polish is a simple and robust exploratory data analysis procedure proposed by the statistician John Tukey. Its purpose is to fit an additive model to data in a two-way table (usually the results of a factorial experiment), of the form overall median + row effect + column effect.

Median polish uses the medians of the rows and columns of a two-way table to iteratively estimate the row effects and column effects. Because the procedure uses medians rather than means, the results are resistant to outliers.
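The resistance of the median to outliers can be seen in a small sketch. The data values below are invented for illustration only and do not come from any experiment.

```python
from statistics import mean, median

# Hypothetical row of a table, with and without one wild value.
row = [10, 12, 11, 13, 12]
row_with_outlier = [10, 12, 11, 13, 120]

# The median is unchanged by the single outlier; the mean is pulled toward it.
print("median:", median(row), "->", median(row_with_outlier))
print("mean:  ", mean(row), "->", mean(row_with_outlier))
```

Since every step of median polish is built from medians like these, a single extreme cell tends to end up in the residuals rather than distorting the fitted effects.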

Model for a two-way table

Suppose an experiment observes the variable Y under the influence of two variables. We can arrange the data in a two-way table in which one variable is constant along the rows and the other is constant along the columns. Let i and j index the rows and columns (e.g. y_ij denotes the value of y in the ith row and the jth column). Then we can write a linear regression equation:

$$ y_{ij} = b_0 + b_1 x_i + b_2 z_j + \varepsilon_{ij}, $$

where b_0, b_1, b_2 are constants, and x_i and z_j are values associated with rows and columns, respectively.

The equation can be further simplified if no x_i and z_j values are present for the analysis:

$$ y_{ij} = b_0 + c_i + d_j + \varepsilon_{ij}, $$

where c_i and d_j denote row effects and column effects, respectively.
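To make the additive model concrete, the following sketch builds a small table from an overall value, row effects and column effects, with the error term set to zero. All numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical model parameters (not taken from any real data set).
b0 = 50                 # overall value b_0
c = [-3, 0, 4]          # row effects c_i
d = [-2, 0, 1, 5]       # column effects d_j

# y_ij = b_0 + c_i + d_j, with the error term set to zero.
table = [[b0 + ci + dj for dj in d] for ci in c]

# Subtracting the fitted part leaves the residuals, all zero here.
residuals = [
    [table[i][j] - (b0 + c[i] + d[j]) for j in range(len(d))]
    for i in range(len(c))
]

for row in table:
    print(row)
```

In real data the residuals are not zero, and the effects are unknown; median polish is one way to estimate them from the table alone.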

Procedure

To carry out median polish:

(1) Find the median of each row, then find the median of these row medians and record it as the overall effect.

(2) Subtract the row median from each element in that row; do this for all rows.

(3) Subtract the overall effect from each row median.

(4) Do the same for each column, and add the overall effect from the column operations to the overall effect from the row operations.

(5) Repeat (1)-(4) until the row and column medians change negligibly.
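The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name median_polish, the parameters max_iter and tol, and the example table are all hypothetical, and other implementations may order the bookkeeping differently.

```python
from statistics import median

def median_polish(table, max_iter=10, tol=1e-9):
    """Sketch: return (overall, row_effects, col_effects, residuals)."""
    n_rows, n_cols = len(table), len(table[0])
    resid = [[float(v) for v in row] for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Steps (1)-(2): take each row's median and subtract it from that row.
        row_med = [median(r) for r in resid]
        resid = [[v - m for v in r] for r, m in zip(resid, row_med)]
        row_eff = [e + m for e, m in zip(row_eff, row_med)]
        # Step (3): move the median of the row effects into the overall effect.
        shift = median(row_eff)
        row_eff = [e - shift for e in row_eff]
        overall += shift

        # Step (4): the same sweep over the columns.
        col_med = [median(r[j] for r in resid) for j in range(n_cols)]
        resid = [[v - m for v, m in zip(r, col_med)] for r in resid]
        col_eff = [e + m for e, m in zip(col_eff, col_med)]
        shift = median(col_eff)
        col_eff = [e - shift for e in col_eff]
        overall += shift

        # Step (5): stop once neither sweep changed anything noticeably.
        if max(abs(m) for m in row_med + col_med) <= tol:
            break

    return overall, row_eff, col_eff, resid

# Hypothetical additive table with one cell (top left) inflated by 100.
table = [
    [145, 47, 48, 52],
    [48, 50, 51, 55],
    [52, 54, 55, 59],
]
overall, row_eff, col_eff, resid = median_polish(table)
print("overall:", overall)
print("row effects:", row_eff)
print("column effects:", col_eff)
print("residuals:", resid)
```

At every stage the table equals overall + row effect + column effect + residual, so the procedure only moves values between the four parts. In this example the inflated cell is isolated in the residuals, while the fitted effects describe the rest of the table.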

