
Analysis of Variance (ANOVA)

Info

Author: Echo, Published on January 3, 2022, Reading time: about 6 minutes, WeChat Official Account Article Link:

1 Introduction

Last time, I talked about choosing the minimum sample size, which focused on tests of the mean and proportion for one or two samples. To test the means of more than two samples, we can use ANOVA (Analysis of Variance), which deserves its own discussion. The new year starts with picking up last year's goals, so I'm here to fill in the gap!

Before we start, let's consider a question: since we already have the universal and useful AB test, why do we still need ANOVA? The answer is simple. In a production environment, the dependent variables we are interested in are usually influenced by many factors. For example, the effectiveness of a new drug is influenced by conditions such as indications, dosage, route and method of administration, and daily frequency of administration. Similarly, the sales of a product are influenced by factors such as advertising, product price, and seasonal changes. In addition, each influencing factor may have multiple observation levels. For example, the UI design of a webpage may have multiple versions, labeled A, B, C, D, E, F, and G (after all, black does not necessarily mean the same black). When the boss asks you whether these factors have an impact on the dependent variable, and whether the observed values differ significantly across the levels of a factor, an AB test that compares only two levels of a single factor is not enough. We could run a separate test for every possible pair of levels, but running many tests inflates the probability of making at least one Type I error, and pairwise tests never consider all samples simultaneously. This is where ANOVA comes in to save the day.
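To make the Type I error point concrete, here is a minimal sketch of how quickly the error rate grows if we test every pair of the seven UI versions. It assumes the tests are independent and each uses α = 0.05; pairwise tests that share groups are not truly independent, so treat the number as a rough approximation. The function name `familywise_error_rate` is made up for this illustration.

```python
from math import comb


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across num_tests tests.

    Assumption: the tests are independent, which is only an approximation
    for pairwise comparisons that reuse the same groups.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# Seven UI versions (A to G) give C(7, 2) = 21 pairwise comparisons.
num_versions = 7
num_pairs = comb(num_versions, 2)
fwer = familywise_error_rate(num_pairs, alpha=0.05)

print(f"{num_pairs} pairwise tests -> family-wise error rate of about {fwer:.3f}")
```

Under these assumptions, 21 separate tests give roughly a 66% chance of at least one false positive, compared with 5% for a single test.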

2 Principle Introduction

In the case of a single factor, ANOVA is used to test whether the means of multiple groups are the same, that is, whether the groups can be treated as samples from the same population. So why call it "Analysis of Variance"? Simply put, the name comes from the decomposition of variation. We use the sum of squared deviations to measure the total variation in the data, where the total variation SST (Sum of Squares for Total) = the random variation within groups SSE (Sum of Squares for Error) + the variation between groups SSA (Sum of Squares for factor A). The statistical quantities are shown below. SSE divided by its degrees of freedom is the within-group mean square, a pooled estimate of the variance inside each group, and SSA divided by its degrees of freedom is the between-group mean square, which measures how far the group means spread around the overall mean. The F-statistic is the ratio of the between-group mean square to the within-group mean square.

When the observations are normally distributed with a common variance and the null hypothesis holds, each sum of squares (scaled by that variance) follows a chi-square distribution and the two are independent, so the ratio of the two mean squares follows an F distribution. A large F-statistic means the numerator is large relative to the denominator: the between-group variance is large, so the groups are far apart, while the within-group variance is small, so the observations in each group are tightly concentrated. In that case the test tends to reject the null hypothesis that the group means are equal.
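In symbols, the decomposition can be sketched as follows. The notation is my own assumption rather than something fixed by the text: there are k groups, group i has n_i observations x_ij, n is the total number of observations, x̄_i is the mean of group i, and x̄ is the overall mean.

```latex
\begin{aligned}
SST &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}\right)^2, \\
SSA &= \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{x}\right)^2, \\
SSE &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2, \\
SST &= SSA + SSE, \\
F &= \frac{SSA/(k-1)}{SSE/(n-k)} \sim F(k-1,\; n-k) \quad \text{under } H_0 .
\end{aligned}
```

The degrees of freedom are k − 1 for SSA and n − k for SSE, which add up to the n − 1 degrees of freedom of SST.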

Formula of F-statistic
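As a rough sketch, the F-statistic can be computed by hand in plain Python. The traffic numbers are invented for illustration, and the helper name `one_way_anova_f` is hypothetical; the point is only to show the sums of squares and the ratio of mean squares.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (SST, SSA, SSE, F) for a list of samples, one list per group."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = fmean(all_values)
    group_means = [fmean(g) for g in groups]

    # Total variation: every observation against the overall mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Between-group variation: group means against the overall mean.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variation: observations against their own group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msa = ssa / (k - 1)  # between-group mean square
    mse = sse / (n - k)  # within-group mean square
    return sst, ssa, sse, msa / mse


# Made-up traffic figures for three UI versions A, B, C.
groups = [
    [10, 12, 11, 13],
    [14, 15, 13, 16],
    [20, 19, 21, 22],
]
sst, ssa, sse, f_stat = one_way_anova_f(groups)
print(f"SST={sst:.1f}  SSA={ssa:.1f}  SSE={sse:.1f}  F={f_stat:.1f}")
```

For this toy data, SST = 183, SSA = 168, SSE = 15, and F = 50.4, so the decomposition SST = SSA + SSE holds exactly. If SciPy is installed, `scipy.stats.f_oneway(*groups)` should report the same F value together with a p-value.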

ANOVA has many variations. By the number of factors, it can be divided into one-way ANOVA and multi-factor ANOVA. In the context of experimental design (DOE, Design of Experiments), it also appears in many forms, such as completely randomized design, randomized block design, Latin square design, and orthogonal design. But great wisdom is simple, and different paths lead to the same destination. As long as you master the basic logic of ANOVA, you can handle all of these variations. Here we mainly classify by the number of factors.

2.1 One-Way ANOVA

One-way ANOVA concerns only one factor, but that factor may have multiple observation levels. For example, suppose we study the impact of UI design on traffic using three versions, A, B, and C. The UI design is the factor, and the three versions are three observation levels of this factor. The specific testing process is as follows.

  • Establish the null hypothesis and alternative hypothesis for the test: the null hypothesis is usually that the means of all groups are the same, and the alternative hypothesis is that at least one group mean differs from the others.
  • Calculate the F-statistic: calculate the group means, the overall mean, the sums of squares and their degrees of freedom, and then calculate the value of the F-statistic.
  • Draw the testing conclusion: compare the F-statistic with the critical value obtained from the F table. If the statistic is greater than the critical value, the null hypothesis can be rejected, and it can be concluded that the population means are not all equal. Alternatively, calculate the p-value of the test and compare it with the predetermined significance level α. If the p-value is less than α, the null hypothesis can be rejected.
  • Post-hoc analysis: this refers to pairwise comparisons. When the null hypothesis is rejected in the previous step, it shows that the means of
