Which nonparametric test to use




















If you have groups, each group should be greater than

Reason 3: Statistical power

Parametric tests usually have more statistical power than nonparametric tests. If the mean accurately represents the center of your distribution and your sample size is large enough, consider a parametric test because it is more powerful. If the median better represents the center of your distribution, consider a nonparametric test even when you have a large sample.
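As a rough illustration of when the median may represent the center better than the mean, the sketch below compares the two on a small right-skewed sample. The values are hypothetical and were chosen only to show the effect of a single extreme observation.

```python
from statistics import mean, median

# Hypothetical right-skewed sample (one extreme value); not taken from any study.
sample = [1, 2, 2, 3, 3, 4, 5, 40]

sample_mean = mean(sample)      # pulled upward by the extreme value 40
sample_median = median(sample)  # stays near the bulk of the data

print(f"mean = {sample_mean}, median = {sample_median}")
```

Here the mean lies above seven of the eight observations, while the median sits in the middle of them, which is the situation in which a nonparametric test is worth considering.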


There are two types of statistical tests that are appropriate for continuous data: parametric tests and nonparametric tests. Nonparametric tests are based on the ranks of the data values, so they are suitable for any continuous data.
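One consequence of working with ranks is that any order-preserving change of scale leaves the ranks, and therefore a rank-based test, unchanged. The sketch below checks this on hypothetical values using a log transform; the helper `ranks` is illustrative and assumes there are no tied values.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

# Hypothetical positive measurements on a skewed scale.
raw = [0.5, 12.0, 3.2, 150.0, 7.1]
logged = [math.log(v) for v in raw]  # monotone (order-preserving) transform

print(ranks(raw))     # [1, 4, 2, 5, 3]
print(ranks(logged))  # same ranks as the raw data
```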

Because they use ranks, nonparametric tests are independent of the scale and the distribution of the data.

A histogram is simply a frequency plot of the values observed in a dataset. For example, if researchers were interested in temperature, they could examine a histogram that displays the frequency of each temperature occurring in their sample data. It should be noted that checking the normality of data from smaller samples can be difficult.

Sometimes with a small sample, the data displayed in a histogram will be obviously asymmetrical, but there are certainly occasions in which it is impossible to tell.

This is because with a small sample, the histogram may not be smooth even if the data are normal. There might not be any significant evidence of symmetry or asymmetry, which can make it difficult to determine whether the data are normal or not.
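To see how ragged a small-sample histogram can look even when the data really are normal, the sketch below draws 12 values from a normal distribution and prints a text histogram. The mean, standard deviation, sample size, bin width, and the helper name `histogram_counts` are all assumptions made for illustration.

```python
import random

def histogram_counts(values, bin_width):
    """Return {bin_start: count} using equal-width bins."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width  # left edge of the bin containing v
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical temperatures drawn from a normal distribution.
small_sample = [rng.gauss(70, 5) for _ in range(12)]

hist = histogram_counts(small_sample, bin_width=2.5)
for start, count in hist.items():
    print(f"{start:6.1f} | {'#' * count}")
```

Rerunning with a different seed or a larger sample size shows how the shape tends to settle down only as the sample grows.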

The new program involves in-home visits during the course of pregnancy in addition to the usual or regularly scheduled visits. A pilot randomized trial with 15 pregnant women is designed to evaluate whether women who participate in the program deliver healthier babies than women receiving usual care.

Recall that APGAR scores range from 0 to 10, with scores of 7 or higher considered normal (healthy) and lower scores considered low or critically low. Is there statistical evidence of a difference in APGAR scores in women receiving the new and enhanced versus usual prenatal care?

We run the test using the five-step approach. H0: The two populations are equal. H1: The two populations are not equal. The test statistic is U, the smaller of U1 and U2. The appropriate critical value can be found in the table above.

The first step is to assign ranks of 1 through 15 to the smallest through largest values in the total sample. Next, we sum the ranks in each group. We reject H0.

A clinical trial is run to assess the effectiveness of a new anti-retroviral therapy for patients with HIV. Patients are randomized to receive a standard anti-retroviral therapy (usual care) or the new anti-retroviral therapy and are monitored for 3 months.

The primary outcome is viral load which represents the number of HIV copies per milliliter of blood. A total of 30 participants are randomized and the data are shown below. Is there statistical evidence of a difference in viral load in patients receiving the standard versus the new anti-retroviral therapy?

Because viral load measures are not normally distributed, with outliers as well as limits of detection, a nonparametric test is appropriate. The first step is to assign ranks of 1 through 30 to the smallest through largest values in the total sample. Note that the "undetectable" measurement is listed first in the ordered values (smallest) and assigned a rank of 1. We then compute U1 and U2.

We do not have sufficient evidence to conclude that the treatment groups differ in viral load.
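The ranking and U computation described above can be sketched as follows. This is a minimal illustration, not the article's own calculation: it uses the standard formulas U1 = n1*n2 + n1*(n1+1)/2 - R1 and U2 = n1*n2 + n2*(n2+1)/2 - R2, where R1 and R2 are the rank sums, and it gives tied values the average of the ranks they occupy. The function names and the two small groups are hypothetical.

```python
def midranks(values):
    """Rank from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Return (U, U1, U2), where U is the smaller of U1 and U2."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])  # rank sum for group 1
    r2 = sum(ranks[n1:])  # rank sum for group 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return min(u1, u2), u1, u2

# Hypothetical data for two independent groups (not the trial data).
group_a = [7, 5, 6, 4, 12]
group_b = [3, 6, 4, 2, 1]
u, u1, u2 = mann_whitney_u(group_a, group_b)
print(u, u1, u2)  # 3.0 3.0 22.0
```

U is then compared with the tabulated critical value for the two sample sizes. A measurement such as "undetectable" could be encoded as a number below every measured value so that it receives the lowest rank; that encoding is an assumption of this sketch.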

This section describes nonparametric tests to compare two groups with respect to a continuous outcome when the data are collected on matched or paired samples.

The parametric procedure for doing this was presented in the modules on hypothesis testing for the situation in which the continuous outcome was normally distributed. This section describes procedures that should be used when the outcome cannot be assumed to follow a normal distribution.

There are two popular nonparametric tests to compare outcomes between two matched or paired groups. Recall that when data are matched or paired, we compute difference scores for each individual and analyze difference scores.

The same approach is followed in nonparametric tests. In nonparametric tests, the null hypothesis is that the median difference is zero.

Consider a clinical investigation to assess the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with autism. If the drug is effective, children will exhibit fewer repetitive behaviors on treatment as compared to when they are untreated. A total of 8 children with autism enroll in the study.

Each child is observed by the study psychologist for a period of 3 hours both before treatment and then again after taking the new drug for 1 week. The time that each child is engaged in repetitive behavior during each 3-hour observation period is measured. Repetitive behavior is scored on a scale of 0 to 100, and scores represent the percent of the observation time in which the child is engaged in repetitive behavior.

For example, a score of 0 indicates that during the entire observation period the child did not engage in repetitive behavior, while a score of 100 indicates that the child was constantly engaged in repetitive behavior. Looking at the data, it appears that some children improve. Is there statistically significant improvement in repetitive behavior after 1 week of treatment?

Because the before and after treatment measures are paired, we compute difference scores for each child. In this example, we subtract the assessment of repetitive behaviors after treatment from that measured before treatment so that difference scores represent improvement in repetitive behavior.

The question of interest is whether there is significant improvement after treatment. In this small sample, the observed difference (or improvement) scores vary widely and are subject to extremes. Thus, a nonparametric test is appropriate to test whether there is significant improvement in repetitive behavior before versus after treatment.

The hypotheses are given below. In this example, the null hypothesis is that there is no difference in scores before versus after treatment. If the null hypothesis is true, we expect to see some positive differences (improvement) and some negative differences (worsening).

If the research hypothesis is true, we expect to see more positive differences after treatment as compared to before.

The Sign Test is the simplest nonparametric test for matched or paired data. The approach is to analyze only the signs of the difference scores.

The test statistic for the Sign Test is the number of positive signs or number of negative signs, whichever is smaller. In this example, we observe 2 negative and 6 positive signs. Is this evidence of significant improvement or simply due to chance?
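The steps from difference scores to the Sign Test statistic can be sketched as below. The before and after scores are hypothetical, chosen only so that they reproduce the 6 positive and 2 negative signs described above; they are not the study data, and `sign_test_statistic` is an illustrative name.

```python
def sign_test_statistic(before, after):
    """Return (differences, n_positive, n_negative, test_statistic)."""
    # Before minus after, so a positive difference means improvement.
    diffs = [b - a for b, a in zip(before, after)]
    n_positive = sum(d > 0 for d in diffs)
    n_negative = sum(d < 0 for d in diffs)
    # The test statistic is the smaller of the two sign counts.
    return diffs, n_positive, n_negative, min(n_positive, n_negative)

# Hypothetical percent-of-time scores for 8 children.
before = [80, 60, 55, 70, 90, 40, 65, 50]
after = [60, 65, 40, 50, 70, 45, 50, 30]

diffs, n_pos, n_neg, statistic = sign_test_statistic(before, after)
print(diffs)                    # [20, -5, 15, 20, 20, -5, 15, 20]
print(n_pos, n_neg, statistic)  # 6 2 2
```

Differences of exactly zero are counted in neither group here.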

Determining whether the observed test statistic supports the null or research hypothesis is done following the same approach used in parametric testing. Specifically, we determine a critical value. If the smaller of the number of positive or negative signs is less than or equal to that critical value, we reject H0 in favor of H1; if it is greater than the critical value, we do not reject H0.

Notice that this is a one-sided decision rule corresponding to our one-sided research hypothesis (the two-sided situation is discussed in the next example). In essence, we could use the critical value to decide whether to reject the null hypothesis. Another alternative would be to calculate the p-value, as described below. With the Sign Test we can readily compute a p-value based on our observed test statistic.

Recall that a p-value is the probability of observing a test statistic as or more extreme than that observed. We observed 2 negative signs. Recall that the critical value for our test was 1, based on the table of critical values for the Sign Test. Because the observed count of 2 is greater than the critical value of 1, we do not reject H0.

In the example looking for differences in repetitive behaviors in autistic children, we used a one-sided test.
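The one-sided p-value can also be computed directly rather than read from a table. The sketch below assumes the usual Sign Test model: under the null hypothesis each nonzero difference is equally likely to be positive or negative, so the number of negative signs follows a binomial distribution with n trials and probability 0.5. The function name is illustrative.

```python
from math import comb

def sign_test_p_one_sided(n, observed):
    """P(X <= observed) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(observed + 1)) / 2 ** n

# 8 children, 2 negative signs (the counts from the example above).
p_one_sided = sign_test_p_one_sided(8, 2)
print(round(p_one_sided, 4))  # 0.1445
```

A p-value of about 0.14 is above the conventional 0.05 level, which agrees with the critical-value decision not to reject the null hypothesis.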

A two-sided test can be used if we hypothesize a difference in repetitive behavior after taking the drug as compared to before. From the table of critical values for the Sign Test, we can determine a two-sided critical value and again reject H0 if the smaller of the number of positive or negative signs is less than or equal to that two-sided critical value.

Alternatively, we can compute a two-sided p-value. Recall that in two-sided tests, we reject the null hypothesis if the test statistic is extreme in either direction. Thus, in the Sign Test, a two-sided p-value is the probability of observing few or many positive or negative signs.
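A two-sided p-value of this kind can be sketched as follows, again assuming that under the null hypothesis the number of signs of one kind follows a binomial distribution with probability 0.5. Because that distribution is symmetric, the two tails are equal, so the sketch doubles the lower tail and caps the result at 1. The function name is illustrative.

```python
from math import comb

def sign_test_p_two_sided(n, smaller_count):
    """Two-sided Sign Test p-value from the smaller of the two sign counts."""
    lower_tail = sum(comb(n, k) for k in range(smaller_count + 1)) / 2 ** n
    # Symmetry of Binomial(n, 0.5): the upper tail equals the lower tail.
    return min(1.0, 2 * lower_tail)

# 8 nonzero differences, of which the smaller sign count is 2.
p_two_sided = sign_test_p_two_sided(8, 2)
print(round(p_two_sided, 4))  # 0.2891
```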

Here we observe 2 negative signs and thus 6 positive signs. The two-sided p-value is the probability of observing a test statistic as or more extreme in either direction.

There is a special circumstance that needs attention when implementing the Sign Test, which arises when one or more participants have difference scores of zero.

If there is just one difference score of zero, some investigators drop that observation and reduce the sample size by 1. This is a reasonable approach if there is just one zero. However, if there are two or more zeros, an alternative approach is preferred.

A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are concerned with patients' ability to tolerate the treatment and assess their quality of life both before and after receiving the new chemotherapy treatment.

The input variable is gender, which is nominal. The outcome variable is the five point ordinal scale. Each person's opinion is independent of the others, so we have independent data. Note, however, if some people share a general practitioner and others do not, then the data are not independent and a more sophisticated analysis is called for.

Note that these tables should be considered as guides only, and each case should be considered on its merits. However, they require certain assumptions and it is often easier to either dichotomise the outcome variable or treat it as continuous. Parametric tests are those that make assumptions about the parameters of the population distribution from which the sample is drawn. This is often the assumption that the population data are normally distributed.

Table 3 shows the non-parametric equivalent of a number of parametric tests. Non-parametric tests are valid for both non-Normally distributed data and Normally distributed data, so why not use them all the time? It would seem prudent to use non-parametric tests in all cases, which would save one the bother of testing for Normality.

Parametric tests are preferred, however, for the following reasons. We are rarely interested in a significance test alone; we would like to say something about the population from which the samples came, and this is best done with estimates of parameters and confidence intervals. It is difficult to do flexible modelling with non-parametric tests, for example allowing for confounding factors using multiple regression.

Parametric tests usually have more statistical power than their non-parametric equivalents. In other words, one is more likely to detect significant differences when they truly exist.


