IBBiology @Skyline High School

                                                                                                                                                                     

 

Statistical Analysis

• Mean

• Standard Deviation

• Standard Deviation in Excel 2003 or on TI-83

• Mean and Standard Deviation in Excel 2007 (doc)

• Mean and Standard Deviation in TI-nspire (external link)

• T-Test

• T-Test in Excel or on TI-83

 

Mean

• The average of all data entries.

• A measure of central tendency for normally distributed data.

• DO NOT calculate a mean from values that are already averages.

• DO NOT calculate a mean of ratios or percentages for groups of different sizes; go back to the raw data and recalculate.

• DO NOT calculate a mean when the measurement scale is not linear (e.g. pH units are not measured on a linear scale).

• The sum of all the results divided by the number of results.
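To see why the warning about percentages matters, here is a small sketch in Python. The germination counts are made up for illustration (they are not class data), and the variable names are just placeholders.

```python
# Hypothetical counts, invented for this example: (seeds germinated, seeds planted)
groups = [(9, 10), (50, 100)]

# Mean of the two percentages: treats the small group and the big group as equals.
percentages = [100 * germinated / planted for germinated, planted in groups]
mean_of_percentages = sum(percentages) / len(percentages)

# Percentage recalculated from the raw data: every seed counts exactly once.
total_germinated = sum(germinated for germinated, _ in groups)
total_planted = sum(planted for _, planted in groups)
pooled_percentage = 100 * total_germinated / total_planted

print(mean_of_percentages)          # 70.0
print(round(pooled_percentage, 1))  # 53.6
```

The two answers differ because the group of 10 seeds gets as much weight as the group of 100 when the percentages are averaged directly.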

 

Standard Deviation

Averages do not tell us everything about a sample. Samples can be very uniform with the data all bunched around the mean or they can be spread out a long way from the mean. The statistic that measures this spread is called the standard deviation.

 

The wider the spread of scores, the larger the standard deviation. 

 

For data that has a normal distribution, about 68% of the data lies within one standard deviation of the mean.
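If you want to check the 68% figure yourself, one way (a sketch, assuming Python 3.8 or later for the standard library's NormalDist class) is to ask for the area under a normal curve between one standard deviation below and one above the mean:

```python
from statistics import NormalDist

# A normal distribution with mean 0 and standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Fraction of the distribution lying between -1 SD and +1 SD.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(round(within_one_sd, 3))  # 0.683
```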

 

 

Calculate the standard deviation by subtracting the mean from each individual value, squaring each resulting difference, summing these squared differences, then dividing this sum by one less than the number of values (n - 1), and finally taking the square root of this quotient.

S = √( Σ(X - M)² / (n - 1) )

S = standard deviation
Σ = sum of
X = individual score
M = mean of all scores
n = sample size (number of scores)

 

Example:  Given the set of numbers {20.0, 23.0, 25.0, 26.0}, calculate the mean and the standard deviation.

Mean = (20+23+25+26)/4 = 23.5

Standard deviation

1.       Calculate (X-M) 

a.       The mean of these numbers was found to be equal to 23.5. 

b.       The deviations from the mean are respectively:

·         20.0 - 23.5 = -3.5

·         23.0 - 23.5 = -0.5

·         25.0 - 23.5 = 1.5

·         26.0 - 23.5 = 2.5

2.       Square each of these deviations to determine (X-M)² 

·         (-3.5)² = 12.25

·         (-0.5)² = 0.25

·         (1.5)² = 2.25

·         (2.5)² = 6.25

3.       Add the values from step 2 together to get ∑(X-M)²  

·         12.25 + 0.25 + 2.25 + 6.25 = 21

4.       Calculate (n-1) by subtracting 1 from your sample size

·         Since there were 4 original numbers, our n = 4

·         Therefore (n-1) = 3

5.       Divide the answer from step 3 by the answer from step 4 to find ∑(X-M)² / (n-1)

·         21 / 3 = 7  

6.       Calculate the square root of your answer from step 5 to determine the standard deviation!

·         The square root of 7 is approximately 2.65, which rounds to 2.7

7.       Answer:  the standard deviation of the set of numbers {20, 23, 25, 26} is 2.7.  If these numbers came from a normally distributed population, about 68% of values would be expected to lie within 2.65 of the mean (23.5 +/- 2.65, that is, between 20.85 and 26.15). 
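The same steps can be followed in a few lines of Python. This is only a sketch for checking your hand calculation; the variable names are my own, and the last line uses the standard library's statistics.stdev, which also divides by n - 1.

```python
import math
import statistics

values = [20.0, 23.0, 25.0, 26.0]

mean = sum(values) / len(values)             # M
deviations = [x - mean for x in values]      # step 1: X - M
squared = [d ** 2 for d in deviations]       # step 2: (X - M) squared
sum_of_squares = sum(squared)                # step 3: sum of the squares
n_minus_1 = len(values) - 1                  # step 4: n - 1
variance = sum_of_squares / n_minus_1        # step 5
sd = math.sqrt(variance)                     # step 6

print(mean, sum_of_squares, variance)        # 23.5 21.0 7.0
print(round(sd, 2))                          # 2.65
print(round(statistics.stdev(values), 2))    # 2.65, same answer from the library
```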

Using EXCEL to calculate the mean and the standard deviation 

Type the values you are trying to find the mean for in a column.  You can label the column, but you don’t have to.

 

Determine which box you want the mean to appear in.  In the example, I want the mean to appear in box A12.  In that box, type:  =AVERAGE(A2:A11) and then hit enter.  Basically you are telling Excel to average boxes A2 through A11. 

 

Determine which box you want the standard deviation to appear in. In the example, I want the standard deviation to appear in box A13.  In that box, type:  =STDEV(A2:A11) and then hit enter.   You are giving Excel the box labels for the data for which you want to find the standard deviation.

 

        A
 1      Number of Pennies
 2      134
 3      130
 4      136
 5      132
 6      131
 7      137
 8      131
 9      135
10      130
11      129
12      132.5       (mean)
13      2.798809    (standard deviation)
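If you do not have Excel handy, the penny example can be checked with Python's built-in statistics module. This is a sketch, not part of the Excel instructions: statistics.mean plays the role of =AVERAGE and statistics.stdev the role of =STDEV (both standard deviations divide by n - 1).

```python
import statistics

# The ten penny counts from cells A2 through A11.
pennies = [134, 130, 136, 132, 131, 137, 131, 135, 130, 129]

mean = statistics.mean(pennies)   # comparable to =AVERAGE(A2:A11)
sd = statistics.stdev(pennies)    # comparable to =STDEV(A2:A11)

print(mean)           # 132.5
print(round(sd, 6))   # 2.798809
```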

 

Calculating mean and standard deviation on the TI-83:

 

First we have to enter the data. Hit the STAT button and you will see the options EDIT, CALC and TESTS atop the screen. Use the left and right arrows (if necessary) to move the cursor to EDIT, then select 1:Edit...

 

Now you will see a table with the headings L1 and L2. Enter the values under L1 (if you want to clear pre-existing data first, move the cursor to the top of the column, hit CLEAR and then ENTER.)

 

Once all the data is entered, go back to the STAT menu, but this time move the cursor to CALC instead of EDIT.

 

Once you're in the CALC menu, select 1-Var Stats, then hit ENTER.

 

The calculator will display the mean (x̄), the sum of the values (Σx), the sum of the squared values (Σx²), and then the standard deviation (Sx). Note that Sx is what we called s.d. in class. This is followed by σx (sigma x, which is what you would get as the standard deviation if you had divided by n instead of n-1), and then the sample size (n).
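The difference between the two standard deviations on the calculator screen can be seen with a short Python sketch. It reuses the four numbers from the worked example; statistics.stdev divides by n - 1 (like Sx) and statistics.pstdev divides by n (like sigma x).

```python
import statistics

data = [20.0, 23.0, 25.0, 26.0]

sx = statistics.stdev(data)        # divides by n - 1, like the calculator's Sx
sigma_x = statistics.pstdev(data)  # divides by n, like the calculator's sigma x

print(round(sx, 2))       # 2.65
print(round(sigma_x, 2))  # 2.29
```

Sx is always the larger of the two, and the gap shrinks as the sample size grows.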

 

T-Test

A t-test is used to determine if the means of two samples (often an experimental and a control group) are truly, or at least significantly, different or if the difference between them is plausibly due to random variation not related to the hypothesis being tested.

 

 

 

The formula for the t-test is a ratio. The top part of the ratio is the difference between the two means or averages. The bottom part is a measure of the variability of the data.
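Written out, the ratio described here can be expressed as follows. This is a reconstruction based on the steps in the worked example, not a copy of an original figure; the symbols (x̄ for a sample mean, s for a sample standard deviation, n for a sample size) are my own choice of notation.

```latex
t = \frac{\lvert \bar{x}_1 - \bar{x}_2 \rvert}
         {\sqrt{\dfrac{s_1^{2}}{n_1} + \dfrac{s_2^{2}}{n_2}}}
```

The top is the absolute difference between the two means, and the bottom is the standard error of the difference.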

 

 

 

 

 

 

 

Let’s use an example to help you learn the t-test:

Sample 1     Sample 2
  7.85        12.50
  8.51        12.94
 13.66         6.26
 11.03         6.10
  6.59        13.19
  8.04        10.74
 14.16         6.06
  8.13        12.53
  6.79        15.45
 11.06        15.64
  5.83        15.19
 10.73        14.93
  6.68         7.94
  5.02         8.28
 10.37        12.65

 

Step 1:  Find the means for each sample

           

Sample 1 mean = 8.96

            Sample 2 mean = 11.36

 

Step 2:  Find the absolute value of the difference between the means.   

            This is the top part of the t-test formula. 

            Mean 1 – Mean 2 = 8.96 – 11.36 = -2.40

            Absolute value = 2.40

 

Step 3:  The bottom part is called the standard error of the difference.  To compute it, first find the standard deviation for each sample.

            Sample 1 SD = 2.76

            Sample 2 SD = 3.55

 

Step 4:  Square the standard deviation for each group to find the “variance” for each group. 

            Sample 1 variance = (2.76)² = 7.63

            Sample 2 variance = (3.55)² = 12.57

 Step 5:  Divide each squared standard deviation by the sample size of that group.  

 

            Sample 1:  7.63 / 15 = 0.51

            Sample 2:  12.57 / 15 = 0.84

 Step 6:  Add these two values

             0.51 + 0.84 = 1.35

 Step 7:  Take the square root of the number to find the “standard error of the difference”

            √1.35 = 1.16     

 Step 8:  Divide the difference in the means (step 2) by the standard error of the difference (step 7)

             t = 2.40 / 1.16 = 2.07

 

Step 9:  You need to determine the degrees of freedom (df) for the test. In the t-test, the degrees of freedom is the sum of the sample sizes of both groups minus 2.

             DF = (15 +15) – 2 = 28

 

Step 10:  Once you compute the t-value (answer from step 8) and the degrees of freedom (answer from step 9) you have to look it up in a table of significance to test whether the ratio is large enough to say that the difference between the groups is not likely to have been a chance finding. To test the significance, you need to set a risk level (called the alpha level). In most research, the "rule of thumb" is to set the alpha level at .05. This means that five times out of a hundred you would find a statistically significant difference between the means even if there was none (i.e., by "chance").

 

Given the alpha level, the df, and the t-value, you can look the t-value up in a standard table of significance to determine whether the t-value is large enough to be significant.
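Steps 1 through 9 can be checked with a short Python sketch like the one below. It uses the two samples from the example and the standard library only; the variable names are my own. Because it keeps all the decimal places instead of rounding at each step, the intermediate values may differ slightly from the hand calculation.

```python
import math
import statistics

sample_1 = [7.85, 8.51, 13.66, 11.03, 6.59, 8.04, 14.16, 8.13,
            6.79, 11.06, 5.83, 10.73, 6.68, 5.02, 10.37]
sample_2 = [12.50, 12.94, 6.26, 6.10, 13.19, 10.74, 6.06, 12.53,
            15.45, 15.64, 15.19, 14.93, 7.94, 8.28, 12.65]

mean_1 = statistics.mean(sample_1)               # step 1
mean_2 = statistics.mean(sample_2)
difference = abs(mean_1 - mean_2)                # step 2
variance_1 = statistics.stdev(sample_1) ** 2     # steps 3 and 4
variance_2 = statistics.stdev(sample_2) ** 2
standard_error = math.sqrt(variance_1 / len(sample_1)
                           + variance_2 / len(sample_2))   # steps 5 to 7
t_value = difference / standard_error            # step 8
df = len(sample_1) + len(sample_2) - 2           # step 9

print(round(mean_1, 2), round(mean_2, 2))        # 8.96 11.36
print(round(t_value, 2), df)                     # 2.07 28
```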

 

 

 

 


 

  

 

df       .10      .05      .025     .01      .005     .0005
1        3.078    6.314    12.706   31.821   63.657   636.619
2        1.886    2.920    4.303    6.965    9.925    31.598
3        1.638    2.353    3.182    4.541    5.841    12.941
4        1.533    2.132    2.776    3.747    4.604    8.610
5        1.476    2.015    2.571    3.365    4.032    6.859
6        1.440    1.943    2.447    3.143    3.707    5.959
7        1.415    1.895    2.365    2.998    3.499    5.405
8        1.397    1.860    2.306    2.896    3.355    5.041
9        1.383    1.833    2.262    2.821    3.250    4.781
10       1.372    1.812    2.228    2.764    3.169    4.587
11       1.363    1.796    2.201    2.718    3.106    4.437
12       1.356    1.782    2.179    2.681    3.055    4.318
13       1.350    1.771    2.160    2.650    3.012    4.221
14       1.345    1.761    2.145    2.624    2.977    4.140
15       1.341    1.753    2.131    2.602    2.947    4.073
16       1.337    1.746    2.120    2.583    2.921    4.015
17       1.333    1.740    2.110    2.567    2.898    3.965
18       1.330    1.734    2.101    2.552    2.878    3.922
19       1.328    1.729    2.093    2.539    2.861    3.883
20       1.325    1.725    2.086    2.528    2.845    3.850
21       1.323    1.721    2.080    2.518    2.831    3.819
22       1.321    1.717    2.074    2.508    2.819    3.792
23       1.319    1.714    2.069    2.500    2.807    3.767
24       1.318    1.711    2.064    2.492    2.797    3.745
25       1.316    1.708    2.060    2.485    2.787    3.725
26       1.315    1.706    2.056    2.479    2.779    3.707
27       1.314    1.703    2.052    2.473    2.771    3.690
28       1.313    1.701    2.048    2.467    2.763    3.674
29       1.311    1.699    2.045    2.462    2.756    3.659
30       1.310    1.697    2.042    2.457    2.750    3.646
40       1.303    1.684    2.021    2.423    2.704    3.551
60       1.296    1.671    2.000    2.390    2.660    3.460
120      1.289    1.658    1.980    2.358    2.617    3.373
∞        1.282    1.645    1.960    2.326    2.576    3.291

 

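If your calculated t value is greater than the number in the table, you can conclude that the means of the two groups are significantly different.

In our example, the number in the table for our data (df = 28, .05 column) is 1.701.  Since our calculated value (2.07) is greater than the number in the table, we conclude that the difference between the two groups IS SIGNIFICANT.  Note that the column headings in this table are one-tailed probabilities.  For a two-tailed test at the .05 level (the kind Excel runs when “Tails” is set to 2), use the .025 column instead; that value is 2.048, and 2.07 is still greater.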
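The table lookup can also be sketched in Python. The dictionary below copies the first five values from the df = 28 row of the table; the name critical_values_df28 is just a label I chose, and the t value is the rounded answer from step 8.

```python
# First five entries of the df = 28 row: column heading -> number in the table.
critical_values_df28 = {0.10: 1.313, 0.05: 1.701, 0.025: 2.048,
                        0.01: 2.467, 0.005: 2.763}

t_value = 2.07  # calculated t value from step 8

# Keep every column whose table value our t value is greater than.
exceeded = [level for level, critical in critical_values_df28.items()
            if t_value > critical]

print(exceeded)  # [0.1, 0.05, 0.025]
```

So 2.07 clears the .05 column comfortably and the .025 column narrowly, but not the .01 column.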

 

 

To check your answers

Sometimes it is nice to check your answers to make sure you are doing the calculations right.  Use this website to check your results

Performing a t-test with Excel

Excel calculates a t-test in a slightly different way.  Rather than giving you the t value to compare with a table, Excel reports the probability of getting a difference between the means at least this large by chance alone.  This is called a “P value.”

 

Follow these steps to calculate a P value using a t-test with Excel:

Step 1:  Create two columns, side by side, for the data of interest.  Each sample’s data should be in separate columns like in the example above.

Step 2:  Click on another blank cell where you wish the P value to appear. 

Step 3:  Then click “fx” on the Excel toolbar and choose “statistical” from the “function” list, then “TTest” from the list. 

Step 4:  Set the t-test parameters:

  • For “Array1” highlight the data from one sample; for “Array2”, highlight the data in the second column. 

  • Enter “2” in the box for “Tails.”

  • Lastly, you will have to select the “Type” of t-test.  For our purposes, type “2” (two-sample, equal variance). 

  • After answering these questions click “OK” and the P value will appear.  The P value will fall between zero and one.

Step 5:  What does my P value mean?  Using Excel with the same data from the sample given above, Excel gives the number 0.05.  This means that if there were really no difference between the two groups, a difference at least this large would still turn up by chance in about 5% of experiments.  The smaller the P value, the harder it is to explain the difference as random variation alone.  Normally we say that a P value of .05 or less is significant.
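For the curious, here is a rough Python sketch of where a two-tailed P value comes from. It is not how Excel is implemented; it simply adds up the area under the t distribution curve numerically (Simpson's rule), using the t value and degrees of freedom from the hand calculation. The function names are my own.

```python
import math

def t_density(x, df):
    """Height of the Student's t distribution curve at x, for df degrees of freedom."""
    coefficient = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_value, df, intervals=2000):
    """Two-tailed P value: the area in both tails beyond |t| (intervals must be even)."""
    upper = abs(t_value)
    width = upper / intervals
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, intervals):
        weight = 4 if i % 2 else 2          # Simpson's rule weights: 4, 2, 4, 2, ...
        total += weight * t_density(i * width, df)
    area_zero_to_t = total * width / 3      # area under the curve from 0 to |t|
    return 1 - 2 * area_zero_to_t           # what is left over in the two tails

p_value = two_tailed_p(2.07, 28)            # t from step 8, df from step 9
print(round(p_value, 3))
```

The result comes out just under 0.05, which agrees with the value Excel reports when rounded to two decimal places.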

Performing a T-test with the TI-83

 

1.  Hit the STAT button on the calculator

2.  Select option 4 to clear any past lists of data.

3.  Select option 1 to EDIT your lists.

4.  Enter your data for each group as List 1 and List 2

5.  Hit STAT button and use the arrow key to move over to the TESTS option

6.  Scroll down to option 4, the 2-sample T test and hit ENTER

7.  Scroll to the bottom of the screen and hit ENTER over the CALCULATE option

8.  Your results are given.

 

   T = calculated T value

   df = Degrees of Freedom

   X1 = mean of list 1

   X2 = mean of list 2

   Sx1 = standard deviation of list 1

   Sx2 = standard deviation of list 2

 

 

"When we tug at a simple thing in nature, we find it attached to the rest of the world."  John Muir