Null Hypothesis – Simple Introduction
A null hypothesis is a precise statement about a population that we try to reject with sample data. We don't usually believe our null hypothesis (or H₀) to be true. However, we need some exact statement as a starting point for statistical significance testing.
Null Hypothesis Examples
Often -but not always- the null hypothesis states there is no association or difference between variables or subpopulations. Like so, some typical null hypotheses are:
- the correlation between frustration and aggression is zero ( correlation analysis );
- the average income for men is equal to that for women ( independent samples t-test );
- Nationality is (perfectly) unrelated to music preference ( chi-square independence test );
- the average population income was equal over 2012 through 2016 ( repeated measures ANOVA ).
- Dutch, German, French and British people have identical average body weights.
“Null” Does Not Mean “Zero”
A common misunderstanding is that “null” implies “zero”. This is often but not always the case. For example, a null hypothesis may also state that the correlation between frustration and aggression is 0.5. No zero involved here and -although somewhat unusual- perfectly valid. The “null” in “null hypothesis” derives from “nullify” [5]: the null hypothesis is the statement that we're trying to refute, regardless of whether it specifies a zero effect.
Null Hypothesis Testing -How Does It Work?
I want to know if happiness is related to wealth among Dutch people. One approach to find this out is to formulate a null hypothesis. Since “related to” is not precise, we choose the opposite statement as our null hypothesis: the correlation between wealth and happiness is zero among all Dutch people. We'll now try to refute this hypothesis in order to demonstrate that happiness and wealth are related all right. Now, we can't reasonably ask all 17,142,066 Dutch people how happy they generally feel.
So we'll ask a sample (say, 100 people) about their wealth and their happiness. The correlation between happiness and wealth turns out to be 0.25 in our sample. Now we have one problem: sample outcomes tend to differ somewhat from population outcomes. So if the correlation really is zero in our population, we may find a nonzero correlation in our sample. To illustrate this important point, take a look at the scatterplot below. It visualizes a zero correlation between happiness and wealth for an entire population of N = 200.
Now we draw a random sample of N = 20 from this population (the red dots in our previous scatterplot). Even though our population correlation is zero, we found a staggering 0.82 correlation in our sample. The figure below illustrates this by omitting all non-sampled units from our previous scatterplot.
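This sampling fluke is easy to reproduce outside SPSS. A minimal Python sketch (using numpy; the population size and sample size follow the example, the seed and variables are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Population of N = 200 in which happiness and wealth are uncorrelated.
happiness = rng.normal(size=200)
wealth = rng.normal(size=200)

# Draw a random sample of n = 20 and compute the sample correlation.
idx = rng.choice(200, size=20, replace=False)
r_sample = np.corrcoef(happiness[idx], wealth[idx])[0, 1]

# The sample correlation differs from the population value of zero by chance.
print(round(r_sample, 2))
```

Rerunning with different seeds shows how widely sample correlations scatter around the true value of zero at n = 20.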
This raises the question how we can ever say anything about our population if we only have a tiny sample from it. The basic answer: we can rarely say anything with 100% certainty. However, we can say a lot with 99%, 95% or 90% certainty.
Probability
So how does that work? Well, basically, some sample outcomes are highly unlikely given our null hypothesis . Like so, the figure below shows the probabilities for different sample correlations (N = 100) if the population correlation really is zero.
A computer will readily compute these probabilities. However, doing so requires a sample size (100 in our case) and a presumed population correlation ρ (0 in our case). So that's why we need a null hypothesis . If we look at this sampling distribution carefully, we see that sample correlations around 0 are most likely: there's a 0.68 probability of finding a correlation between -0.1 and 0.1. What does that mean? Well, remember that probabilities can be seen as relative frequencies. So imagine we'd draw 1,000 samples instead of the one we have. This would result in 1,000 correlation coefficients and some 680 of those -a relative frequency of 0.68- would be in the range -0.1 to 0.1. Likewise, there's a 0.95 (or 95%) probability of finding a sample correlation between -0.2 and 0.2.
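The "draw 1,000 samples" thought experiment can be simulated directly. A sketch in Python (numpy; ρ = 0 and n = 100 as in the example, the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 100, 1000

# Draw 1,000 samples of n = 100 from a population with rho = 0
# and record each sample correlation.
rs = []
for _ in range(reps):
    x = rng.normal(size=n)
    y = rng.normal(size=n)
    rs.append(np.corrcoef(x, y)[0, 1])
rs = np.array(rs)

# Relative frequency of correlations between -0.1 and 0.1
# (the sampling distribution puts about 0.68 probability there).
frac = np.mean(np.abs(rs) < 0.1)
print(round(frac, 2))
```

The simulated fraction lands close to the 0.68 probability quoted above, up to simulation noise.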
We found a sample correlation of 0.25. How likely is that if the population correlation is zero? The answer is known as the p-value (short for probability value): A p-value is the probability of finding some sample outcome or a more extreme one if the null hypothesis is true. Given our 0.25 correlation, “more extreme” usually means larger than 0.25 or smaller than -0.25. We can't tell from our graph but the underlying table tells us that p ≈ 0.012 . If the null hypothesis is true, there's a 1.2% probability of finding our sample correlation.
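For a correlation, this p-value can be computed by converting r to a t statistic with n − 2 degrees of freedom. A sketch in Python (scipy), plugging in the numbers from the example:

```python
from math import sqrt
from scipy import stats

r, n = 0.25, 100

# Convert the sample correlation to a t statistic with n - 2 degrees of freedom.
t = r * sqrt(n - 2) / sqrt(1 - r**2)

# Two-sided p-value: the probability of a correlation at least this extreme,
# in either direction, if the population correlation is zero.
p = 2 * stats.t.sf(t, df=n - 2)
print(round(p, 3))  # ≈ 0.012, the value quoted above
```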
Conclusion?
If our population correlation really is zero, then finding a sample correlation of 0.25 in a sample of N = 100 has a probability of only 0.012: very unlikely. A reasonable conclusion is that our population correlation wasn't zero after all. Conclusion: we reject the null hypothesis. Given our sample outcome, we no longer believe that happiness and wealth are unrelated. However, we still can't state this with certainty.
Null Hypothesis - Limitations
Thus far, we only concluded that the population correlation is probably not zero . That's the only conclusion from our null hypothesis approach and it's not really that interesting. What we really want to know is the population correlation. Our sample correlation of 0.25 seems a reasonable estimate. We call such a single number a point estimate . Now, a new sample may come up with a different correlation. An interesting question is how much our sample correlations would fluctuate over samples if we'd draw many of them. The figure below shows precisely that, assuming our sample size of N = 100 and our (point) estimate of 0.25 for the population correlation.
Confidence Intervals
Our sample outcome suggests that some 95% of many samples should come up with a correlation between 0.06 and 0.43. This range is known as a confidence interval . Although not precisely correct, it's most easily thought of as the bandwidth that's likely to enclose the population correlation . One thing to note is that the confidence interval is quite wide. It almost contains a zero correlation, exactly the null hypothesis we rejected earlier. Another thing to note is that our sampling distribution and confidence interval are slightly asymmetrical. They are symmetrical for most other statistics (such as means or beta coefficients ) but not correlations.
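The interval quoted above can be reproduced with the Fisher z-transformation, which also explains the slight asymmetry: the interval is symmetric on the z scale but not after transforming back to the correlation scale. A sketch in Python (standard library only):

```python
from math import atanh, tanh, sqrt

r, n = 0.25, 100

# Fisher z-transform makes the sampling distribution approximately normal,
# with standard error 1 / sqrt(n - 3).
z = atanh(r)
se = 1 / sqrt(n - 3)

# 95% interval on the z scale, transformed back to the correlation scale.
lo, hi = tanh(z - 1.96 * se), tanh(z + 1.96 * se)
print(round(lo, 2), round(hi, 2))  # ≈ 0.06 and 0.43, the interval quoted above
```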
- Agresti, A. & Franklin, C. (2014). Statistics. The Art & Science of Learning from Data. Essex: Pearson Education Limited.
- Cohen, J (1988). Statistical Power Analysis for the Social Sciences (2nd. Edition) . Hillsdale, New Jersey, Lawrence Erlbaum Associates.
- Field, A. (2013). Discovering Statistics with IBM SPSS Newbury Park, CA: Sage.
- Howell, D.C. (2002). Statistical Methods for Psychology (5th ed.). Pacific Grove CA: Duxbury.
- Van den Brink, W.P. & Koele, P. (2002). Statistiek, deel 3 [Statistics, part 3]. Amsterdam: Boom.
Comments (17)
By John Xie on February 28th, 2023
“stop using the term ‘statistically significant’ entirely and moving to a world beyond ‘p < 0.05’”
“…, no p-value can reveal the plausibility, presence, truth, or importance of an association or effect.
Therefore, a label of statistical significance does not mean or imply that an association or effect is highly probable, real, true, or important. Nor does a label of statistical nonsignificance lead to the association or effect being improbable, absent, false, or unimportant.
Yet the dichotomization into ‘significant’ and ‘not significant’ is taken as an imprimatur of authority on these characteristics.” “To be clear, the problem is not that of having only two labels. Results should not be trichotomized, or indeed categorized into any number of groups, based on arbitrary p-value thresholds.
Similarly, we need to stop using confidence intervals as another means of dichotomizing (based, on whether a null value falls within the interval). And, to preclude a reappearance of this problem elsewhere, we must not begin arbitrarily categorizing other statistical measures (such as Bayes factors).”
Quotation from: Ronald L. Wasserstein, Allen L. Schirm & Nicole A. Lazar, Moving to a World Beyond “p < 0.05”, The American Statistician (2019), Vol. 73, No. S1, 1-19: Editorial.
By Ruben Geert van den Berg on February 28th, 2023
Yes, partly agreed.
However, most students are still forced to apply null hypothesis testing so why not try to explain to them how it works?
An associated problem is that "significant" has a normal language meaning. Most people seem to confuse "statistically significant" with "real-world significant", which is unfortunate.
By the way, this same point applies to other terms such as "normally distributed". A normal distribution for dice rolls is not a normal but a uniform distribution ;-)
Keep up the good work!
SPSS Tutorial: General Statistics and Hypothesis Testing
This section and the "Graphics" section provide a quick tutorial for a few common functions in SPSS, primarily to give the reader a feel for the SPSS user interface. This is not a comprehensive tutorial, but SPSS itself provides comprehensive tutorials and case studies through its help menu. SPSS's help menu is more than a quick reference: it provides detailed information on how and when to use SPSS's various menu options. See the "Further Resources" section for more information.
To perform a one sample t-test click "Analyze"→"Compare Means"→"One Sample T-Test" and the following dialog box will appear:
The dialog allows selection of any scale variable from the box at the left and a test value that represents a hypothetical mean. Select the test variable and set the test value, then click "OK." Three tables will appear in the Output Viewer:
The first table gives descriptive statistics about the variable. The second shows the results of the t test, including the "t" statistic, the degrees of freedom ("df"), the p-value ("Sig."), the difference of the test value from the variable mean, and the upper and lower bounds for a ninety-five percent confidence interval. The final table shows one-sample effect sizes.
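For readers who want to check SPSS's numbers independently, the same one-sample t test can be run in Python with scipy (the scores and test value below are invented for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical sample and a test value representing a presumed population mean.
scores = np.array([5.1, 6.3, 4.8, 5.9, 6.1, 5.4, 4.9, 6.0])
test_value = 5.0

# One-sample t test: t statistic, degrees of freedom n - 1, two-sided p-value.
result = stats.ttest_1samp(scores, popmean=test_value)
print(result.statistic, result.pvalue)
```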
One-Way ANOVA
In the Data Editor, select "Analyze"→"Compare Means"→"One-Way ANOVA..." to open the dialog box shown below.
To generate the ANOVA statistic, the variables chosen cannot have a "Nominal" level of measurement; they must be "Ordinal" or "Scale."
Once the measurement levels have been set, select the dependent variable and the factor, then click "OK." The following output will appear in the Output Viewer:
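As an independent cross-check of the ANOVA table, the same F test can be computed in Python with scipy (the three groups below are invented):

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three groups (three levels of a factor).
g1 = np.array([4.2, 5.1, 4.8, 5.5, 4.9])
g2 = np.array([6.1, 5.8, 6.5, 6.0, 6.3])
g3 = np.array([5.0, 5.4, 4.7, 5.2, 5.6])

# One-way ANOVA: F statistic and p-value, as in SPSS's ANOVA table.
f_stat, p_val = stats.f_oneway(g1, g2, g3)
print(round(f_stat, 2), round(p_val, 4))
```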
Linear Regression
To obtain a linear regression select "Analyze"→"Regression"→"Linear" from the menu, calling up the dialog box shown below:
The output of this most basic case produces a summary chart showing R, R-square, and the Standard error of the prediction; an ANOVA chart; and a chart providing statistics on model coefficients:
For Multiple regression, simply add more independent variables in the "Linear Regression" dialogue box. To plot a regression line see the "Legacy Dialogues" section of the "Graphics" tab.
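The summary statistics in this basic case (R, R-square, the coefficient and its p-value) can be reproduced in Python with scipy for a single predictor (the data below are invented):

```python
import numpy as np
from scipy import stats

# Hypothetical predictor and outcome for a simple linear regression.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

# Slope, intercept, R (hence R-square), and the slope's p-value,
# matching the coefficient table SPSS produces.
res = stats.linregress(x, y)
print(round(res.slope, 2), round(res.rvalue ** 2, 3), round(res.pvalue, 5))
```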
- Last Updated: Jul 9, 2024 5:55 PM
- URL: https://guides.library.illinois.edu/spss
Hypothesis Testing: SPSS (2.1)
Statistical Inference
4.8 Specifying null hypotheses in SPSS
Figure 4.9: Flow chart for selecting a test in SPSS.
Statistics such as means, proportions, variances, and correlations are calculated on variables. For translating a research hypothesis into a statistical hypothesis, the researcher has to recognize the dependent and independent variables addressed by the research hypothesis and their variable types. The main distinction is between dichotomies (two groups), (other) categorical variables (three or more groups), and numerical variables. Once you have identified the variables, the flow chart in Figure 4.9 helps you to identify the right statistical test.
If possible, SPSS uses a theoretical probability distribution to approximate the sampling distribution, and it selects the appropriate distribution automatically. In some cases, such as a test on a contingency table with two rows and two columns, SPSS automatically includes an exact test because the theoretical approximation cannot be relied on.
SPSS does not allow the user to specify the null hypothesis of a test if the test involves two or more variables. If you cannot specify the null hypothesis, SPSS uses the nil hypothesis that the population value of interest is zero. For example, SPSS tests the null hypothesis that males and females have the same average willingness to donate to a charity, that is, the mean difference is zero, if we apply an independent samples t test.
Imagine that we know from previous research that females tend to score one point higher on the willingness scale than males. It would not be very interesting to reject the nil hypothesis. Instead, we would like to test the null hypothesis that the average difference between females and males is 1.00. We cannot change the null hypothesis of a t test in SPSS, but we can use the confidence interval to test this null hypothesis as explained in Section 4.6.1 .
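The confidence-interval workaround can be sketched in Python (scipy; the willingness scores below are invented): compute the 95% CI for the mean difference and reject H₀: difference = 1.00 only if 1.00 falls outside it.

```python
import numpy as np
from scipy import stats

# Hypothetical willingness scores (illustrative data, not from the text).
females = np.array([7.1, 6.8, 7.4, 6.9, 7.6, 7.0, 7.3, 6.7])
males   = np.array([6.0, 5.8, 6.3, 5.9, 6.1, 5.7, 6.2, 6.0])

diff = females.mean() - males.mean()

# Pooled standard error and 95% CI for the mean difference.
n1, n2 = len(females), len(males)
sp2 = ((n1 - 1) * females.var(ddof=1) + (n2 - 1) * males.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
tcrit = stats.t.ppf(0.975, df=n1 + n2 - 2)
lo, hi = diff - tcrit * se, diff + tcrit * se

# Reject H0: difference = 1.00 only if 1.00 lies outside the interval.
# With these invented data, 1.00 falls inside, so H0 is not rejected.
print(round(lo, 2), round(hi, 2), not (lo <= 1.00 <= hi))
```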
In SPSS, the analyst has to specify the null hypothesis in tests on one variable, namely tests on one proportion, one mean, or one categorical variable. The following instructions explain how to do this.
4.8.1 Specify null for binomial test
A proportion is the statistic best suited to test research hypotheses addressing the share of a category in the population. The hypothesis that a television station reaches half of all households in a country provides an example. All households in the country constitute the population. The share of the television station is the proportion or percentage of all households watching this television station.
If we have a data set for a sample of households containing a variable indicating whether or not a household watches the television station, we can test the research hypothesis with a binomial test. The statistical null hypothesis is that the proportion of households watching the television station is 0.5 in the population.
Figure 4.10: A binomial test on a single proportion in SPSS.
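Outside SPSS, the same binomial test takes one call in Python with scipy's binomtest (the counts are invented: say 112 of 200 sampled households watch the station):

```python
from scipy import stats

# Hypothetical sample: 112 of 200 households watch the television station.
# H0: the population proportion is 0.5.
result = stats.binomtest(k=112, n=200, p=0.5)
print(result.pvalue)
```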
We can also be interested in more than one category, for instance, in which regions the households are located: in the north, east, south, and west of the country? This translates into a statistical hypothesis containing two or more proportions in the population. If 30% of households in the population are situated in the west, 25% in the south, 25% in the east, and 20% in the north, we would expect these proportions in the sample if all regions are equally well-represented. Our statistical hypothesis is actually a relative frequency distribution, for instance as in Table 4.1 .
A test for this type of statistical hypothesis is called a one-sample chi-squared test. It is up to the researcher to specify the hypothesized proportions for all categories. This is not a simple task: What reasons do we have to expect particular values, say a region’s share of thirty per cent of all households instead of twenty-five per cent?
The test is mainly used if researchers know the true proportions of the categories in the population from which they aimed to draw their sample. If we try to draw a sample from all citizens of a country, we usually know the frequency distribution of sex, age, educational level, and so on for all citizens from the national bureau of statistics. With the bureau’s information, we can test if the respondents in our sample have the same distribution with respect to sex, age, or educational level as the population from which we tried to draw the sample; just use the official population proportions in the null hypothesis.
If the proportions in the sample do not differ more from the known proportions in the population than we expect based on chance, the sample is representative of the population in the statistical sense (see Section 1.2.6 ). As always, we use the p value of the test as the probability of obtaining our sample, or a sample that is even more different from the null hypothesis, if the null hypothesis is true. Note that the null hypothesis now represents the (distribution in the) population from which we tried to draw our sample. We conclude that the sample is representative of this population in the statistical sense if we cannot reject the null hypothesis, that is, if the p value is larger than .05. Not rejecting the null hypothesis means that it is sufficiently plausible that our sample was drawn from the population that we wanted to investigate. We can then be more confident that our sample results generalize to the population that we meant to investigate.
Figure 4.11: A chi-squared test on a frequency distribution in SPSS.
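The one-sample chi-squared test against known population proportions can be cross-checked in Python with scipy (the observed counts are invented; the proportions follow the regional example above):

```python
import numpy as np
from scipy import stats

# Hypothetical sample counts by region (west, south, east, north), n = 400.
observed = np.array([130, 95, 105, 70])

# Known population proportions: 30% west, 25% south, 25% east, 20% north.
expected = 400 * np.array([0.30, 0.25, 0.25, 0.20])

# One-sample chi-squared test; a large p-value means the sample does not
# deviate from the known population more than chance allows.
chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(round(chi2, 2), round(p, 3))
```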
Finally, we have the significance test on one mean, which we have used in the example of average media literacy throughout this chapter. For a numeric (interval or ratio measurement level) variable such as the 10-point scale in this example, the mean is a good measure of the distribution's center. Our statistical hypothesis would be that the average media literacy score of all children in the population is (below) 5.5.
Figure 4.12: A one-sample t test in SPSS.
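The test on one mean can likewise be cross-checked in Python with scipy, here with the one-sided alternative "below 5.5" from the media literacy example (the scores are invented):

```python
import numpy as np
from scipy import stats

# Hypothetical media literacy scores on a 10-point scale.
scores = np.array([4.8, 5.1, 5.6, 4.9, 5.2, 5.0, 5.4, 4.7, 5.3, 5.1])

# One-sided test of H0: mean = 5.5 against H1: mean < 5.5.
result = stats.ttest_1samp(scores, popmean=5.5, alternative='less')
print(result.statistic, result.pvalue)
```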
Independent Samples t Test
Sample Data Files
Our tutorials reference a dataset called "sample" in many examples. If you'd like to download the sample dataset to work through the examples, choose one of the files below:
- Data definitions (*.pdf)
- Data - Comma delimited (*.csv)
- Data - Tab delimited (*.txt)
- Data - Excel format (*.xlsx)
- Data - SAS format (*.sas7bdat)
- Data - SPSS format (*.sav)
The Independent Samples t Test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different. The Independent Samples t Test is a parametric test.
This test is also known as:
- Independent t Test
- Independent Measures t Test
- Independent Two-sample t Test
- Student t Test
- Two-Sample t Test
- Uncorrelated Scores t Test
- Unpaired t Test
- Unrelated t Test
The variables used in this test are known as:
- Dependent variable, or test variable
- Independent variable, or grouping variable
Common Uses
The Independent Samples t Test is commonly used to test the following:
- Statistical differences between the means of two groups
- Statistical differences between the means of two interventions
- Statistical differences between the means of two change scores
Note: The Independent Samples t Test can only compare the means for two (and only two) groups. It cannot make comparisons among more than two groups. If you wish to compare the means across more than two groups, you will likely want to run an ANOVA.
Data Requirements
Your data must meet the following requirements:
- Dependent variable that is continuous (i.e., interval or ratio level)
- Independent variable that is categorical (i.e., nominal or ordinal) and has exactly two categories
- Cases that have nonmissing values for both the dependent and independent variables
- Subjects in the first group cannot also be in the second group
- No subject in either group can influence subjects in the other group
- No group can influence the other group
- Violation of this assumption will yield an inaccurate p value
- Random sample of data from the population
- Non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test
- Among moderate or large samples, a violation of normality may still yield accurate p values
- When this assumption is violated and the sample sizes for each group differ, the p value is not trustworthy. However, the Independent Samples t Test output also includes an approximate t statistic that is not based on assuming equal population variances. This alternative statistic, called the Welch t Test statistic 1 , may be used when equal variances among populations cannot be assumed. The Welch t Test is also known as an Unequal Variance t Test or Separate Variances t Test.
- No outliers
Note: When one or more of the assumptions for the Independent Samples t Test are not met, you may want to run the nonparametric Mann-Whitney U Test instead.
Researchers often follow several rules of thumb:
- Each group should have at least 6 subjects, ideally more. Inferences for the population will be more tenuous with too few subjects.
- A balanced design (i.e., same number of subjects in each group) is ideal. Extremely unbalanced designs increase the possibility that violating any of the requirements/assumptions will threaten the validity of the Independent Samples t Test.
1 Welch, B. L. (1947). The generalization of "Student's" problem when several different population variances are involved. Biometrika , 34 (1–2), 28–35.
The null hypothesis (H₀) and alternative hypothesis (H₁) of the Independent Samples t Test can be expressed in two different but equivalent ways:
H₀: µ₁ = µ₂ ("the two population means are equal") H₁: µ₁ ≠ µ₂ ("the two population means are not equal")
H₀: µ₁ − µ₂ = 0 ("the difference between the two population means is equal to 0") H₁: µ₁ − µ₂ ≠ 0 ("the difference between the two population means is not 0")
where µ₁ and µ₂ are the population means for group 1 and group 2, respectively. Notice that the second set of hypotheses can be derived from the first set by simply subtracting µ₂ from both sides of the equation.
Levene’s Test for Equality of Variances
Recall that the Independent Samples t Test requires the assumption of homogeneity of variance, i.e., that both groups have the same variance. SPSS conveniently includes a test for the homogeneity of variance, called Levene's Test , whenever you run an independent samples t test.
The hypotheses for Levene’s test are:
H₀: σ₁² − σ₂² = 0 ("the population variances of group 1 and 2 are equal") H₁: σ₁² − σ₂² ≠ 0 ("the population variances of group 1 and 2 are not equal")
This implies that if we reject the null hypothesis of Levene's Test, it suggests that the variances of the two groups are not equal; i.e., that the homogeneity of variances assumption is violated.
The output in the Independent Samples Test table includes two rows: Equal variances assumed and Equal variances not assumed . If Levene’s test indicates that the variances are equal across the two groups (i.e., p -value large), you will rely on the first row of output, Equal variances assumed , when you look at the results for the actual Independent Samples t Test (under the heading t -test for Equality of Means). If Levene’s test indicates that the variances are not equal across the two groups (i.e., p -value small), you will need to rely on the second row of output, Equal variances not assumed , when you look at the results of the Independent Samples t Test (under the heading t -test for Equality of Means).
The difference between these two rows of output lies in the way the independent samples t test statistic is calculated. When equal variances are assumed, the calculation uses pooled variances; when equal variances cannot be assumed, the calculation utilizes un-pooled variances and a correction to the degrees of freedom.
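The decision rule described above (run Levene's test, then read the matching row) can be sketched in Python with scipy; the two groups below are invented, and scipy's levene defaults to the median-centered variant:

```python
import numpy as np
from scipy import stats

# Hypothetical scores for two independent groups; the second is more spread out.
group1 = np.array([22.1, 24.3, 19.8, 23.5, 21.2, 20.9, 25.0, 22.8])
group2 = np.array([18.4, 30.1, 14.9, 27.6, 21.3, 33.0, 16.2, 25.5])

# Levene's test for equality of variances.
lev_stat, lev_p = stats.levene(group1, group2)

# Choose the appropriate row: pooled t if the variances look equal,
# Welch t otherwise.
equal_var = lev_p > 0.05
t_stat, p_val = stats.ttest_ind(group1, group2, equal_var=equal_var)
print(round(lev_p, 3), bool(equal_var), round(p_val, 3))
```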
Test Statistic
The test statistic for an Independent Samples t Test is denoted t . There are actually two forms of the test statistic for this test, depending on whether or not equal variances are assumed. SPSS produces both forms of the test, so both forms of the test are described here. Note that the null and alternative hypotheses are identical for both forms of the test statistic.
Equal variances assumed
When the two independent samples are assumed to be drawn from populations with identical population variances (i.e., σ₁² = σ₂²), the test statistic t is computed as:
$$ t = \frac{\overline{x}_{1} - \overline{x}_{2}}{s_{p}\sqrt{\frac{1}{n_{1}} + \frac{1}{n_{2}}}} $$
$$ s_{p} = \sqrt{\frac{(n_{1} - 1)s_{1}^{2} + (n_{2} - 1)s_{2}^{2}}{n_{1} + n_{2} - 2}} $$
\(\bar{x}_{1}\) = Mean of first sample \(\bar{x}_{2}\) = Mean of second sample \(n_{1}\) = Sample size (i.e., number of observations) of first sample \(n_{2}\) = Sample size (i.e., number of observations) of second sample \(s_{1}\) = Standard deviation of first sample \(s_{2}\) = Standard deviation of second sample \(s_{p}\) = Pooled standard deviation
The calculated t value is then compared to the critical t value from the t distribution table with degrees of freedom df = n 1 + n 2 - 2 and chosen confidence level. If the calculated t value exceeds the critical t value in absolute value, then we reject the null hypothesis.
Note that this form of the independent samples t test statistic assumes equal variances.
Because we assume equal population variances, it is OK to "pool" the sample variances ( s p ). However, if this assumption is violated, the pooled variance estimate may not be accurate, which would affect the accuracy of our test statistic (and hence, the p-value).
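A quick check that the pooled formula above matches standard software, in Python with scipy (invented data):

```python
import numpy as np
from scipy import stats

# Two hypothetical samples assumed to share a population variance.
x1 = np.array([10.2, 11.1, 9.8, 10.5, 10.9, 11.3])
x2 = np.array([ 9.1,  9.8,  8.7,  9.5, 10.0,  9.2])

n1, n2 = len(x1), len(x2)

# Pooled standard deviation, as in the formula above.
sp = np.sqrt(((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2))
t_manual = (x1.mean() - x2.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2))

# scipy's pooled-variance t test yields the same statistic.
t_scipy, p = stats.ttest_ind(x1, x2, equal_var=True)
print(round(t_manual, 4), round(t_scipy, 4))
```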
Equal variances not assumed
When the two independent samples are assumed to be drawn from populations with unequal variances (i.e., σ₁² ≠ σ₂²), the test statistic t is computed as:
$$ t = \frac{\overline{x}_{1} - \overline{x}_{2}}{\sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}} $$
\(\bar{x}_{1}\) = Mean of first sample \(\bar{x}_{2}\) = Mean of second sample \(n_{1}\) = Sample size (i.e., number of observations) of first sample \(n_{2}\) = Sample size (i.e., number of observations) of second sample \(s_{1}\) = Standard deviation of first sample \(s_{2}\) = Standard deviation of second sample
The calculated t value is then compared to the critical t value from the t distribution table with degrees of freedom
$$ df = \frac{ \left ( \frac{s_{1}^2}{n_{1}} + \frac{s_{2}^2}{n_{2}} \right ) ^{2} }{ \frac{1}{n_{1}-1} \left ( \frac{s_{1}^2}{n_{1}} \right ) ^{2} + \frac{1}{n_{2}-1} \left ( \frac{s_{2}^2}{n_{2}} \right ) ^{2}} $$
and chosen confidence level. If the calculated t value exceeds the critical t value in absolute value, then we reject the null hypothesis.
Note that this form of the independent samples t test statistic does not assume equal variances. This is why both the denominator of the test statistic and the degrees of freedom of the critical value of t are different than the equal variances form of the test statistic.
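The Welch statistic and its degrees of freedom can be verified the same way, in Python with scipy (invented data):

```python
import numpy as np
from scipy import stats

# Two hypothetical samples with visibly different spreads and sizes.
x1 = np.array([12.0, 14.5, 11.2, 13.8, 12.9, 15.1, 13.3])
x2 = np.array([10.1, 10.4,  9.8, 10.6, 10.2])

n1, n2 = len(x1), len(x2)
v1, v2 = x1.var(ddof=1) / n1, x2.var(ddof=1) / n2

# Welch t statistic and Satterthwaite degrees of freedom, as in the formulas above.
t_manual = (x1.mean() - x2.mean()) / np.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# scipy's Welch version (equal_var=False) yields the same statistic.
t_scipy, p = stats.ttest_ind(x1, x2, equal_var=False)
print(round(t_manual, 4), round(df, 2))
```

Note that the Welch df is generally fractional and smaller than the pooled df of n₁ + n₂ − 2.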
Data Set-Up
Your data should include two variables (represented in columns) that will be used in the analysis. The independent variable should be categorical and include exactly two groups. (Note that SPSS restricts categorical indicators to numeric or short string values only.) The dependent variable should be continuous (i.e., interval or ratio). SPSS can only make use of cases that have nonmissing values for the independent and the dependent variables, so if a case has a missing value for either variable, it cannot be included in the test.
The number of rows in the dataset should correspond to the number of subjects in the study. Each row of the dataset should represent a unique subject, person, or unit, and all of the measurements taken on that person or unit should appear in that row.
Run an Independent Samples t Test
To run an Independent Samples t Test in SPSS, click Analyze > Compare Means > Independent-Samples T Test .
The Independent-Samples T Test window opens where you will specify the variables to be used in the analysis. All of the variables in your dataset appear in the list on the left side. Move variables to the right by selecting them in the list and clicking the blue arrow buttons. You can move a variable(s) to either of two areas: Grouping Variable or Test Variable(s) .
A Test Variable(s): The dependent variable(s). This is the continuous variable whose means will be compared between the two groups. You may run multiple t tests simultaneously by selecting more than one test variable.
B Grouping Variable: The independent variable. The categories (or groups) of the independent variable will define which samples will be compared in the t test. The grouping variable must have at least two categories (groups); it may have more than two categories but a t test can only compare two groups, so you will need to specify which two groups to compare. You can also use a continuous variable by specifying a cut point to create two groups (i.e., values at or above the cut point and values below the cut point).
C Define Groups : Click Define Groups to define the category indicators (groups) to use in the t test. If the button is not active, make sure that you have already moved your independent variable to the right in the Grouping Variable field. You must define the categories of your grouping variable before you can run the Independent Samples t Test procedure.
You will not be able to run the Independent Samples t Test until the levels (or cut points) of the grouping variable have been defined. The OK and Paste buttons will be unclickable until the levels have been defined. You can tell if the levels of the grouping variable have not been defined by looking at the Grouping Variable box: if a variable appears in the box but has two question marks next to it, then the levels are not defined:
D Options: The Options section is where you can set your desired confidence level for the confidence interval for the mean difference, and specify how SPSS should handle missing values.
When finished, click OK to run the Independent Samples t Test, or click Paste to have the syntax corresponding to your specified settings written to an open syntax window. (If you do not have a syntax window open, a new window will open for you.)
Define Groups
Clicking the Define Groups button (C) opens the Define Groups window:
1 Use specified values: If your grouping variable is categorical, select Use specified values . Enter the values for the categories you wish to compare in the Group 1 and Group 2 fields. If your categories are numerically coded, you will enter the numeric codes. If your group variable is string, you will enter the exact text strings representing the two categories. If your grouping variable has more than two categories (e.g., takes on values of 1, 2, 3, 4), you can specify two of the categories to be compared (SPSS will disregard the other categories in this case).
Note that when computing the test statistic, SPSS subtracts the mean of Group 2 from the mean of Group 1. Changing the order of the subtraction flips the sign of the results, but does not affect their magnitude.
2 Cut point: If your grouping variable is numeric and continuous, you can designate a cut point for dichotomizing the variable. This will separate the cases into two categories based on the cut point. Specifically, for a given cut point x , the new categories will be:
- Group 1: All cases where grouping variable ≥ x
- Group 2: All cases where grouping variable < x
Note that this implies that cases where the grouping variable is equal to the cut point itself will be included in the "greater than or equal to" category. (If you want your cut point to be included in a "less than or equal to" group, then you will need to use Recode into Different Variables or use DO IF syntax to create this grouping variable yourself.) Also note that while you can use cut points on any variable that has a numeric type, it may not make practical sense depending on the actual measurement level of the variable (e.g., nominal categorical variables coded numerically). Additionally, using a dichotomized variable created via a cut point generally reduces the power of the test compared to using a non-dichotomized variable.
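The cut point rule above can be sketched in a few lines (Python used purely for illustration; the data values are made up):

```python
# SPSS's cut point rule: Group 1 is >= the cut point, Group 2 is below it.
cut = 10.0
values = [3.2, 10.0, 15.7, 9.9, 12.4]  # hypothetical grouping-variable values

groups = [1 if v >= cut else 2 for v in values]
print(groups)  # [2, 1, 1, 2, 1] -- the value equal to the cut point lands in Group 1
```

Note how the case exactly at the cut point (10.0) is assigned to Group 1, matching the "greater than or equal to" behavior described above.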
Clicking the Options button (D) opens the Options window:
The Confidence Interval Percentage box allows you to specify the confidence level for a confidence interval. Note that this setting does NOT affect the test statistic or p-value or standard error; it only affects the computed upper and lower bounds of the confidence interval. You can enter any value between 1 and 99 in this box (although in practice, it only makes sense to enter numbers between 90 and 99).
The Missing Values section allows you to choose if cases should be excluded "analysis by analysis" (i.e. pairwise deletion) or excluded listwise. This setting is not relevant if you have only specified one dependent variable; it only matters if you are entering more than one dependent (continuous numeric) variable. In that case, excluding "analysis by analysis" will use all nonmissing values for a given variable. If you exclude "listwise", it will only use the cases with nonmissing values for all of the variables entered. Depending on the amount of missing data you have, listwise deletion could greatly reduce your sample size.
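The difference between the two missing-value options can be illustrated with a toy table (Python for illustration only; `None` marks a missing score):

```python
# Four cases with two test variables; two cases have one missing value each.
rows = [
    {"y1": 4.0, "y2": 7.0},
    {"y1": None, "y2": 6.0},
    {"y1": 5.0, "y2": None},
    {"y1": 6.0, "y2": 8.0},
]

# "Analysis by analysis": each t test keeps every case nonmissing on ITS variable.
pairwise_y1 = [r["y1"] for r in rows if r["y1"] is not None]  # 3 cases
pairwise_y2 = [r["y2"] for r in rows if r["y2"] is not None]  # 3 cases

# Listwise: only cases nonmissing on ALL test variables are used anywhere.
listwise = [r for r in rows if all(v is not None for v in r.values())]  # 2 cases

print(len(pairwise_y1), len(pairwise_y2), len(listwise))
```

With only two partially missing cases the difference is small, but with many test variables listwise deletion can discard most of the sample.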
Example: Independent samples T test when variances are not equal
Problem statement.
In our sample dataset, students reported their typical time to run a mile, and whether or not they were an athlete. Suppose we want to know if the average time to run a mile is different for athletes versus non-athletes. This involves testing whether the sample means for mile time among athletes and non-athletes in your sample are statistically different (and by extension, inferring whether the means for mile times in the population are significantly different between these two groups). You can use an Independent Samples t Test to compare the mean mile time for athletes and non-athletes.
The hypotheses for this example can be expressed as:
H 0 : µ non-athlete − µ athlete = 0 ("the difference of the means is equal to zero") H 1 : µ non-athlete − µ athlete ≠ 0 ("the difference of the means is not equal to zero")
where µ athlete and µ non-athlete are the population means for athletes and non-athletes, respectively.
In the sample data, we will use two variables: Athlete and MileMinDur . The variable Athlete has values of either “0” (non-athlete) or "1" (athlete). It will function as the independent variable in this T test. The variable MileMinDur is a numeric duration variable (h:mm:ss), and it will function as the dependent variable. In SPSS, the first few rows of data look like this:
Before the Test
Before running the Independent Samples t Test, it is a good idea to look at descriptive statistics and graphs to get an idea of what to expect. Running Compare Means ( Analyze > Compare Means > Means ) to get descriptive statistics by group tells us that the standard deviation in mile time for non-athletes is about 2 minutes; for athletes, it is about 49 seconds. This corresponds to a variance of 14803 seconds² for non-athletes, and a variance of 2447 seconds² for athletes 1 . Running the Explore procedure ( Analyze > Descriptive Statistics > Explore ) to obtain a comparative boxplot yields the following graph:
If the variances were indeed equal, we would expect the total length of the boxplots to be about the same for both groups. However, from this boxplot, it is clear that the spread of observations for non-athletes is much greater than the spread of observations for athletes. Already, we can estimate that the variances for these two groups are quite different. It should not come as a surprise if we run the Independent Samples t Test and see that Levene's Test is significant.
Additionally, we should also decide on a significance level (typically denoted using the Greek letter alpha, α ) before we perform our hypothesis tests. The significance level is the threshold we use to decide whether a test result is significant. For this example, let's use α = 0.05.
1 When computing the variance of a duration variable (formatted as hh:mm:ss or mm:ss or mm:ss.s), SPSS converts the standard deviation value to seconds before squaring.
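The footnote's arithmetic can be checked directly. The standard deviations below are back-solved from the reported variances, so they are approximations rather than the dataset's exact values (Python for illustration only):

```python
# SPSS displays the SD of a duration variable in its display format, but the
# variance is computed on seconds. These SDs are approximate (back-solved).
sd_nonathlete_s = 121.667  # about 2 minutes, expressed in seconds
sd_athlete_s = 49.467      # about 49 seconds

print(round(sd_nonathlete_s ** 2), round(sd_athlete_s ** 2))  # 14803 2447
```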
Running the Test
To run the Independent Samples t Test:
- Click Analyze > Compare Means > Independent-Samples T Test .
- Move the variable Athlete to the Grouping Variable field, and move the variable MileMinDur to the Test Variable(s) area. Now Athlete is defined as the independent variable and MileMinDur is defined as the dependent variable.
- Click Define Groups , which opens a new window. Use specified values is selected by default. Since our grouping variable is numerically coded (0 = "Non-athlete", 1 = "Athlete"), type “0” in the first text box, and “1” in the second text box. This indicates that we will compare groups 0 and 1, which correspond to non-athletes and athletes, respectively. Click Continue when finished.
- Click OK to run the Independent Samples t Test. Output for the analysis will display in the Output Viewer window.
Two sections (boxes) appear in the output: Group Statistics and Independent Samples Test . The first section, Group Statistics , provides basic information about the group comparisons, including the sample size ( n ), mean, standard deviation, and standard error for mile times by group. In this example, there are 166 athletes and 226 non-athletes. The mean mile time for athletes is 6 minutes 51 seconds, and the mean mile time for non-athletes is 9 minutes 6 seconds.
The second section, Independent Samples Test , displays the results most relevant to the Independent Samples t Test. There are two parts that provide different pieces of information: (A) Levene’s Test for Equality of Variances and (B) t-test for Equality of Means.
A Levene's Test for Equality of Variances : This section has the test results for Levene's Test. From left to right:
- F is the test statistic of Levene's test
- Sig. is the p-value corresponding to this test statistic.
The p -value of Levene's test is printed as ".000" (read as p < 0.001, i.e., p very small), so we reject the null hypothesis of Levene's test and conclude that the variance in mile time of athletes is significantly different from that of non-athletes. This tells us that we should look at the "Equal variances not assumed" row for the t test (and corresponding confidence interval) results. (If this test result had not been significant -- that is, if we had observed p > α -- then we would have used the "Equal variances assumed" output.)
B t-test for Equality of Means provides the results for the actual Independent Samples t Test. From left to right:
- t is the computed test statistic, using the formula for the equal-variances-assumed test statistic (first row of table) or the formula for the equal-variances-not-assumed test statistic (second row of table)
- df is the degrees of freedom, using the equal-variances-assumed degrees of freedom formula (first row of table) or the equal-variances-not-assumed degrees of freedom formula (second row of table)
- Sig (2-tailed) is the p-value corresponding to the given test statistic and degrees of freedom
- Mean Difference is the difference between the sample means, i.e. x̄ 1 − x̄ 2 ; it also corresponds to the numerator of the test statistic for that test
- Std. Error Difference is the standard error of the mean difference estimate; it also corresponds to the denominator of the test statistic for that test
Note that the mean difference is calculated by subtracting the mean of the second group from the mean of the first group. In this example, the mean mile time for athletes was subtracted from the mean mile time for non-athletes (9:06 minus 6:51 = 02:14). The sign of the mean difference corresponds to the sign of the t value. The positive t value in this example indicates that the mean mile time for the first group, non-athletes, is significantly greater than the mean for the second group, athletes.
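The sign behavior can be demonstrated with a small Welch-style computation on made-up numbers (Python used purely for illustration, not the sample dataset):

```python
import math

def welch_t(x, y):
    """Welch's t statistic for two independent samples (equal variances not assumed)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)  # sample variance of x
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)  # sample variance of y
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

g1 = [5.1, 6.2, 5.8, 6.0, 5.5]  # hypothetical Group 1 scores
g2 = [4.0, 4.4, 3.9, 4.6, 4.2]  # hypothetical Group 2 scores

t_12 = welch_t(g1, g2)  # Group 1 minus Group 2
t_21 = welch_t(g2, g1)  # order swapped

# Swapping the group order flips the sign but not the magnitude of t.
print(round(t_12, 4), round(t_21, 4))
```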
The associated p -value is printed as ".000"; double-clicking on the p-value will reveal the un-rounded number. SPSS rounds p-values to three decimal places, so any p-value too small to round to .001 will print as .000. (In this particular example, the p-values are on the order of 10^-40.)
C Confidence Interval of the Difference : This part of the t -test output complements the significance test results. Typically, if the CI for the mean difference contains 0 -- i.e., if the lower boundary of the CI is negative and the upper boundary is positive -- the results are not significant at the chosen significance level. In this example, the 95% CI is [01:57, 02:32], which does not contain zero; this agrees with the small p -value of the significance test.
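The agreement between the interval and the test can be stated as a simple check (Python for illustration; the interval [01:57, 02:32] is expressed in seconds):

```python
# The 95% CI reported above, converted to seconds: 01:57 -> 117, 02:32 -> 152.
ci_lower, ci_upper = 117, 152

# If the interval straddles zero, the two-tailed test at the matching alpha
# would not be significant; here it does not straddle zero.
contains_zero = ci_lower < 0 < ci_upper
print(contains_zero)  # False, agreeing with the very small p-value
```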
Decision and Conclusions
Since p < .001 is less than our chosen significance level α = 0.05, we can reject the null hypothesis and conclude that the mean mile time for athletes and non-athletes is significantly different.
Based on the results, we can state the following:
- There was a significant difference in mean mile time between non-athletes and athletes ( t (315.846) = 15.047, p < .001).
- The average mile time for athletes was 2 minutes and 14 seconds lower than the average mile time for non-athletes.
- Last Updated: Jul 10, 2024 11:08 AM
- URL: https://libguides.library.kent.edu/SPSS
Statistical Methods and Data Analytics
SPSS Annotated Output T-test
The t-test procedure performs t-tests for one sample, two samples and paired observations. The single-sample t-test compares the mean of the sample to a given number (which you supply). The independent samples t-test compares the difference in the means from the two groups to a given value (usually 0). In other words, it tests whether the difference in the means is 0. The dependent-sample or paired t-test compares the difference in the means from the two variables measured on the same set of subjects to a given number (usually 0), while taking into account the fact that the scores are not independent. In our examples, we will use the hsb2 data set.
Single sample t-test
The single sample t-test tests the null hypothesis that the population mean is equal to the number specified by the user. SPSS calculates the t-statistic and its p-value under the assumption that the sample comes from an approximately normal distribution. If the p-value associated with the t-test is small (0.05 is often used as the threshold), there is evidence that the mean is different from the hypothesized value. If the p-value associated with the t-test is not small (p > 0.05), then the null hypothesis is not rejected and you can conclude that the mean is not different from the hypothesized value.
In this example, the t-statistic is 4.140 with 199 degrees of freedom. The corresponding two-tailed p-value is printed as .000 (i.e., p < .001), which is less than 0.05. We conclude that the mean of the variable write is different from 50.
One-Sample Statistics
a. – This is the list of variables. Each variable that was listed on the variables= statement in the above code will have its own line in this part of the output.
b. N – This is the number of valid (i.e., non-missing) observations used in calculating the t-test.
c. Mean – This is the mean of the variable.
d. Std. Deviation – This is the standard deviation of the variable.
e. Std. Error Mean – This is the estimated standard deviation of the sample mean. If we drew repeated samples of size 200, we would expect the standard deviation of the sample means to be close to the standard error. The standard deviation of the distribution of sample means is estimated as the standard deviation of the sample divided by the square root of the sample size: 9.47859/(sqrt(200)) = .67024.
Test statistics
f. – This identifies the variables. Each variable that was listed on the variables= statement will have its own line in this part of the output. If a variables= statement is not specified, t-test will conduct a t-test on all numerical variables in the dataset.
g. t – This is the Student t-statistic. It is the ratio of the difference between the sample mean and the given number to the standard error of the mean: (52.775 – 50) / .6702372 = 4.1403. Since the standard error of the mean measures the variability of the sample mean, the smaller the standard error of the mean, the more likely that our sample mean is close to the true population mean. This is illustrated by the following three figures.
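The arithmetic in (g) can be reproduced with the standard library (Python here is just a checking tool, not part of SPSS):

```python
import math

# Reproducing the annotated output: mean 52.775, SD 9.47859, N = 200, test value 50.
n = 200
mean, sd, test_value = 52.775, 9.47859, 50.0

se = sd / math.sqrt(n)        # standard error of the mean
t = (mean - test_value) / se  # Student t statistic, df = n - 1

print(round(se, 5), round(t, 4))  # 0.67024 4.1403
```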
In all three cases, the difference between the population means is the same. But with large variability of sample means (second graph), the two populations overlap a great deal, so the difference may well have arisen by chance. With small variability, the difference is clearer, as in the third graph. The smaller the standard error of the mean, the larger the magnitude of the t-value and, therefore, the smaller the p-value.
h. df – The degrees of freedom for the single sample t-test is simply the number of valid observations minus 1. We lose one degree of freedom because we have estimated the mean from the sample. We have used some of the information from the data to estimate the mean, therefore it is not available to use for the test and the degrees of freedom accounts for this.
i. Sig (2-tailed) – This is the two-tailed p-value evaluating the null against an alternative that the mean is not equal to 50. It is equal to the probability of observing a greater absolute value of t under the null hypothesis. If the p-value is less than the pre-specified alpha level (usually .05 or .01) we conclude that the mean is statistically significantly different from the hypothesized value. Here, the p-value is smaller than 0.05, so we conclude that the mean for write is different from 50.
j. Mean Difference – This is the difference between the sample mean and the test value.
k. 95% Confidence Interval of the Difference – These are the lower and upper bounds of the confidence interval for the difference between the mean and the test value. A confidence interval specifies a range of values within which the unknown population parameter, in this case the mean, may lie. It is given by

\begin{equation} (\bar{x} - \mu_0) \pm t_{1-\alpha/2,\,N-1} \frac{s}{\sqrt{N}} \end{equation}

where \(\bar{x}\) is the sample mean, \(\mu_0\) is the test value, s is the sample standard deviation of the observations, and N is the number of valid observations. The t-value in the formula is the critical value of the t distribution with N − 1 degrees of freedom at probability 1 − α/2, where 1 − α is the confidence level (by default .95, so α = .05); it can be computed with software or found in any statistics book.
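The one-sample confidence interval can be reproduced from the summary statistics. The critical value t(.975, df = 199) ≈ 1.9720 is taken from a t table rather than computed, since the Python standard library has no t distribution, so the bounds are approximate:

```python
import math

# Summary statistics from the annotated output above.
n, mean, sd, test_value = 200, 52.775, 9.47859, 50.0
t_crit = 1.9720  # critical value t(.975, 199), from a t table (assumption)

se = sd / math.sqrt(n)
diff = mean - test_value
lower, upper = diff - t_crit * se, diff + t_crit * se

print(round(lower, 4), round(upper, 4))  # approximately 1.4533 4.0967
```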
Paired t-test
A paired (or “dependent”) t-test is used when the observations are not independent of one another. In the example below, the same students took both the writing and the reading test. Hence, you would expect there to be a relationship between the scores provided by each student. The paired t-test accounts for this. For each student, we are essentially looking at the differences in the values of the two variables and testing if the mean of these differences is equal to zero.
In this example, the t-statistic is 0.8673 with 199 degrees of freedom. The corresponding two-tailed p-value is 0.3868, which is greater than 0.05. We conclude that the mean difference of write and read is not different from 0.
Summary statistics
a. – This is the list of variables.
b. Mean – These are the respective means of the variables.
c. N – This is the number of valid (i.e., non-missing) observations used in calculating the t-test.
d. Std. Deviation – This is the standard deviations of the variables.
e. Std Error Mean – Standard Error Mean is the estimated standard deviation of the sample mean. This value is estimated as the standard deviation of one sample divided by the square root of sample size: 9.47859/sqrt(200) = .67024, 10.25294/sqrt(200) = .72499. This provides a measure of the variability of the sample mean.
f. Correlation – This is the correlation coefficient of the pair of variables indicated. This is a measure of the strength and direction of the linear relationship between the two variables. The correlation coefficient can range from -1 to +1, with -1 indicating a perfect negative correlation, +1 indicating a perfect positive correlation, and 0 indicating no correlation at all. (A variable correlated with itself will always have a correlation coefficient of 1.) You can think of the correlation coefficient as telling you the extent to which you can guess the value of one variable given a value of the other variable. The .597 is the numerical description of how tightly around the imaginary line the points lie. If the correlation was higher, the points would tend to be closer to the line; if it was smaller, they would tend to be further away from the line.
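The boundary cases mentioned above (perfect positive, perfect negative, self-correlation) can be verified with a minimal hand-rolled Pearson correlation (toy data, not the hsb2 scores; Python for illustration only):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-deviation
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(x, [2.0, 4.0, 6.0, 8.0]),  # perfect positive: 1.0
      pearson_r(x, [8.0, 6.0, 4.0, 2.0]),  # perfect negative: -1.0
      pearson_r(x, x))                     # a variable with itself: 1.0
```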
g. Sig – This is the p-value associated with the correlation. Here, correlation is significant at the .05 level.
g. writing score-reading score – This is the value measured within each subject: the difference between the writing and reading scores. The paired t-test forms a single random sample of the paired difference. The mean of these values among all subjects is compared to 0 in a paired t-test.
h. Mean – This is the mean within-subject difference between the two variables.
i. Std. Deviation – This is the standard deviation of the within-subject paired differences.
j. Std Error Mean – This is the estimated standard deviation of the sample mean. This value is estimated as the standard deviation of one sample divided by the square root of sample size: 8.88667/sqrt(200) = .62838. This provides a measure of the variability of the sample mean.
k. 95% Confidence Interval of the Difference – These are the lower and upper bounds of the confidence interval for the mean paired difference. A confidence interval specifies a range of values within which the unknown population parameter, in this case the mean difference, may lie. It is given by

\begin{equation} \bar{d} \pm t_{1-\alpha/2,\,N-1} \frac{s_d}{\sqrt{N}} \end{equation}

where \(\bar{d}\) and \(s_d\) are the mean and standard deviation of the paired differences and N is the number of pairs.
l. t – This is the t-statistic. It is the ratio of the mean of the difference to the standard error of the difference: (.545/.62838).
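The ratio in (l) can be checked directly from the summary numbers (Python as a checking tool only):

```python
# Paired t statistic from the annotated output: mean difference .545,
# its standard error .62838, N = 200 pairs.
mean_diff = 0.545
se_diff = 0.62838

t = mean_diff / se_diff
df = 200 - 1  # one sample of paired differences, so df = N - 1

print(round(t, 4), df)  # 0.8673 199
```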
m. degrees of freedom – The degrees of freedom for the paired observations is simply the number of observations minus 1. This is because the test is conducted on the one sample of the paired differences.
n. Sig. (2-tailed) – This is the two-tailed p-value computed using the t distribution. It is the probability of observing a greater absolute value of t under the null hypothesis. If the p-value is less than the pre-specified alpha level (usually .05 or .01, here the former) we will conclude that mean difference between writing score and reading score is statistically significantly different from zero. For example, the p-value for the difference between the two variables is greater than 0.05 so we conclude that the mean difference is not statistically significantly different from 0.
Independent group t-test
This t-test is designed to compare the means of the same variable between two groups. In our example, we compare the mean writing score between the group of female students and the group of male students. Ideally, these subjects are randomly selected from a larger population of subjects. The test assumes that the variances of the two populations are the same. The interpretation of the p-value is the same as in other types of t-tests.
In this example, the t-statistic is -3.7341 with 198 degrees of freedom. The corresponding two-tailed p-value is 0.0002, which is less than 0.05. We conclude that the difference of means in write between males and females is different from 0.
a. female – This column gives categories of the independent variable female . This variable is necessary for doing the independent group t-test and is specified by the t-test groups= statement.
b. N – This is the number of valid (i.e., non-missing) observations in each group.
c. Mean – This is the mean of the dependent variable for each level of the independent variable.
d. Std. Deviation – This is the standard deviation of the dependent variable for each of the levels of the independent variable.
e. Std. Error Mean – This is the standard error of the mean, the ratio of the standard deviation to the square root of the respective number of observations.
f. – This column lists the dependent variable(s). In our example, the dependent variable is write (labeled “writing score”).
g. – This column specifies the method for computing the standard error of the difference of the means. The method of computing this value is based on the assumption regarding the variances of the two groups. If we assume that the two populations have the same variance, then the first method, called pooled variance estimator, is used. Otherwise, when the variances are not assumed to be equal, the Satterthwaite’s method is used.
h. F – This column lists Levene’s test statistic. Assume \(k\) is the number of groups, \(N\) is the total number of observations, and \(N_i\) is the number of observations in each \(i\)-th group for dependent variable \(Y_{ij}\). Then Levene’s test statistic is defined as
\begin{equation} W = \frac{(N-k)}{(k-1)} \frac{\sum_{i=1}^{k} N_i (\bar{Z}_{i.}-\bar{Z}_{..})^2}{\sum_{i=1}^{k}\sum_{j=1}^{N_i}(Z_{ij}-\bar{Z}_{i.})^2} \end{equation}
\begin{equation} Z_{ij} = |Y_{ij}-\bar{Y}_{i.}| \end{equation}
where \(\bar{Y}_{i.}\) is the mean of the dependent variable and \(\bar{Z}_{i.}\) is the mean of \(Z_{ij}\) for each \(i\)-th group respectively, and \(\bar{Z}_{..}\) is the grand mean of \(Z_{ij}\).
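The formula above can be implemented directly (Python for illustration only). The two toy groups below have visibly different spreads, so W comes out large, while identical groups give W = 0:

```python
def levene_w(*groups):
    """Levene's test statistic W, following the formula above
    (absolute deviations Z_ij = |Y_ij - group mean|)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    z = [[abs(y - sum(g) / len(g)) for y in g] for g in groups]
    z_bar_i = [sum(zi) / len(zi) for zi in z]           # group means of Z
    z_bar = sum(sum(zi) for zi in z) / n_total           # grand mean of Z
    num = sum(len(zi) * (zb - z_bar) ** 2 for zi, zb in zip(z, z_bar_i))
    den = sum((zij - zb) ** 2 for zi, zb in zip(z, z_bar_i) for zij in zi)
    return (n_total - k) / (k - 1) * num / den

wide = [1.0, 2.0, 3.0, 4.0, 5.0]  # larger spread (toy data)
narrow = [2.9, 3.0, 3.1]          # smaller spread (toy data)
print(round(levene_w(wide, narrow), 4))
```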
i. Sig. – This is the two-tailed p-value associated with the null that the two groups have the same variance. In our example, the probability is less than 0.05. So there is evidence that the variances for the two groups, female students and male students, are different. Therefore, we may want to use the second method (Satterthwaite variance estimator) for our t-test.
j. t – These are the t-statistics under the two different assumptions: equal variances and unequal variances. Each is the ratio of the mean difference to the standard error of the difference under the corresponding assumption: (-4.86995 / 1.30419) = -3.734, (-4.86995 / 1.33189) = -3.656.
k. df – The degrees of freedom when we assume equal variances is simply the sum of the two sample sizes (109 and 91) minus 2. The degrees of freedom when we assume unequal variances is calculated using the Satterthwaite formula.
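Both standard errors and the Satterthwaite degrees of freedom can be sketched from group summary statistics. The helper below is a hypothetical illustration, not SPSS's internal code; the reported t values are then just the mean difference divided by each standard error:

```python
import math

def two_sample_se(n1, s1, n2, s2):
    """Standard error of the mean difference under both variance assumptions,
    plus the Satterthwaite (Welch) degrees of freedom, from group sizes and SDs."""
    # Pooled-variance estimator (equal variances assumed), df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(sp2 * (1 / n1 + 1 / n2))

    # Satterthwaite estimator (equal variances not assumed)
    a, b = s1 ** 2 / n1, s2 ** 2 / n2
    se_welch = math.sqrt(a + b)
    df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return se_pooled, se_welch, df_welch

# The output above reports mean difference -4.86995 and standard errors
# 1.30419 (pooled) and 1.33189 (Satterthwaite); the t values are the ratios.
print(round(-4.86995 / 1.30419, 3), round(-4.86995 / 1.33189, 3))  # -3.734 -3.656
```

As a sanity check, when the two groups have equal sizes and equal standard deviations, the two standard errors coincide and the Satterthwaite df equals n1 + n2 − 2.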
l. Sig. (2-tailed) – The p-value is the two-tailed probability computed using the t distribution. It is the probability of observing a t-value of equal or greater absolute value under the null hypothesis. For a one-tailed test, halve this probability. If the p-value is less than our pre-specified alpha level, usually 0.05, we will conclude that the difference is significantly different from zero. For example, the p-value for the difference between females and males is less than 0.05 in both cases, so we conclude that the difference in means is statistically significantly different from 0.
m. Mean Difference – This is the difference between the means.
n. Std Error Difference – Standard Error difference is the estimated standard deviation of the difference between the sample means. If we drew repeated samples of size 200, we would expect the standard deviation of the sample means to be close to the standard error. This provides a measure of the variability of the sample mean. The Central Limit Theorem tells us that the sample means are approximately normally distributed when the sample size is 30 or greater. Note that the standard error difference is calculated differently under the two different assumptions.
o. 95% Confidence Interval of the Difference – These are the lower and upper bounds of the confidence interval for the mean difference. A confidence interval specifies a range of values within which the unknown population parameter, in this case the difference between the population means, may lie. It is given by

\begin{equation} (\bar{x}_1 - \bar{x}_2) \pm t_{1-\alpha/2,\,df}\; SE(\bar{x}_1 - \bar{x}_2) \end{equation}

where the standard error of the difference and the degrees of freedom are computed under the chosen assumption about the variances (pooled or Satterthwaite). The t-value in the formula is the critical value at probability 1 − α/2, where 1 − α is the confidence level and by default is .95.
- © 2024 UC REGENTS