Introduction 1
About This Book 1
Foolish Assumptions 3
Icons Used in This Book 3
Beyond the Book 4
Where to Go from Here 4
Part 1: Tackling Data Analysis and Model-Building Basics 7
Chapter 1: Beyond Number Crunching: The Art and Science of Data Analysis 9
Data Analysis: Looking before You Crunch 9
Nothing (not even a straight line) lasts forever 10
Data snooping isn't cool 11
No (data) fishing allowed 12
Getting the Big Picture: An Overview of Stats II 13
Population parameter 13
Sample statistic 13
Confidence interval 14
Hypothesis test 14
Analysis of variance (ANOVA) 15
Multiple comparisons 15
Interaction effects 16
Correlation 16
Linear regression 17
Chi-square tests 18
Chapter 2: Finding the Right Analysis for the Job 21
Categorical versus Quantitative Variables 22
Statistics for Categorical Variables 23
Estimating a proportion 23
Comparing proportions 24
Looking for relationships between categorical variables 25
Building models to make predictions 26
Statistics for Quantitative Variables 27
Making estimates 27
Making comparisons 28
Exploring relationships 28
Predicting y using x 30
Avoiding Bias 31
Measuring Precision with Margin of Error 33
Knowing Your Limitations 35
Chapter 3: Having the Normal and Sampling Distributions in Your Back Pocket 37
Recognizing the VIP Distribution: The Normal 38
Characterizing the normal 38
Standardizing to the standard normal (Z-) distribution 38
Using the normal table 40
Finding probabilities for the normal distribution 41
Finally Getting Comfortable with Sampling Distributions 42
The mean and standard error of a sampling distribution 42
Sampling distribution of X̄ 43
Sampling distribution of p 44
Heads Up! Building Confidence Intervals and Hypothesis Tests 45
Confidence interval for the population mean 45
Confidence interval for the population proportion 46
Hypothesis test for population mean 46
Hypothesis test for the population proportion 47
Chapter 4: Reviewing Confidence Intervals and Hypothesis Tests 49
Estimating Parameters by Using Confidence Intervals 50
Getting the basics: The general form of a confidence interval 50
Finding the confidence interval for a population mean 51
What changes the margin of error? 52
Interpreting a confidence interval 55
What's the Hype about Hypothesis Tests? 56
What Ho and Ha really represent 56
Gathering your evidence into a test statistic 57
Determining strength of evidence with a p-value 57
False alarms and missed opportunities: Type I and II errors 58
The power of a hypothesis test 60
Part 2: Using Different Types of Regression to Make Predictions 65
Chapter 5: Getting in Line with Simple Linear Regression 67
Exploring Relationships with Scatterplots and Correlations 68
Using scatterplots to explore relationships 69
Collating the information by using the correlation coefficient 70
Building a Simple Linear Regression Model 71
Finding the best-fitting line to model your data 72
The y-intercept of the regression line 73
The slope of the regression line 74
Making point estimates by using the regression line 75
No Conclusion Left Behind: Tests and Confidence Intervals for Regression 75
Scrutinizing the slope 76
Inspecting the y-intercept 78
Building confidence intervals for the average response 80
Making the band with prediction intervals 81
Checking the Model's Fit (The Data, Not the Clothes!) 83
Defining the conditions 84
Finding and exploring the residuals 85
Using r² to measure model fit 89
Scoping for outliers 90
Knowing the Limitations of Your Regression Analysis 92
Avoiding slipping into cause-and-effect mode 92
Extrapolation: The ultimate no-no 93
Sometimes you need more than one variable 94
Chapter 6: Multiple Regression with Two X Variables 95
Getting to Know the Multiple Regression Model 96
Discovering the uses of multiple regression 96
Looking at the general form of the multiple regression model 96
Stepping through the analysis 97
Looking at x's and y's 97
Collecting the Data 98
Pinpointing Possible Relationships 100
Making scatterplots 100
Correlations: Examining the bond 101
Checking for Multicollinearity 104
Finding the Best-Fitting Model for Two x Variables 105
Getting the multiple regression coefficients 106
Interpreting the coefficients 107
Testing the coefficients 108
Predicting y by Using the x Variables 110
Checking the Fit of the Multiple Regression Model 111
Noting the conditions 112
Plotting a plan to check the conditions 112
Checking the three conditions 114
Chapter 7: How Can I Miss You If You Won't Leave? Regression Model Selection 117
Getting a Kick out of Estimating Punt Distance 118
Brainstorming variables and collecting data 118
Examining scatterplots and correlations 120
Just Like Buying Shoes: The Model Looks Nice, But Does It Fit? 123
Assessing the fit of multiple regression models 124
Model selection procedures 125
Chapter 8: Getting Ahead of the Learning Curve with Nonlinear Regression 129
Anticipating Nonlinear Regression 130
Starting Out with Scatterplots 131
Handling Curves in the Road with Polynomials 133
Bringing back polynomials 134
Searching for the best polynomial model 136
Using a second-degree polynomial to pass the quiz 138
Assessing the fit of a polynomial model 141
Making predictions 143
Going Up? Going Down? Go Exponential! 145
Recollecting exponential models 145
Searching for the best exponential model 146
Spreading secrets at an exponential rate 148
Chapter 9: Yes, No, Maybe So: Making Predictions by Using Logistic Regression 153
Understanding a Logistic Regression Model 154
How is logistic regression different from other regressions? 154
Using an S-curve to estimate probabilities 155
Interpreting the coefficients of the logistic regression model 156
The logistic regression model in action 157
Carrying Out a Logistic Regression Analysis 158
Running the analysis in Minitab 158
Finding the coefficients and making the model 160
Estimating p 161
Checking the fit of the model 162
Fitting the movie model 162
Part 3: Analyzing Variance with ANOVA 167
Chapter 10: Testing Lots of Means? Come On Over to ANOVA! 169
Comparing Two Means with a t-Test 170
Evaluating More Means with ANOVA 171
Spitting seeds: A situation just waiting for ANOVA 172
Walking through the steps of ANOVA 173
Checking the Conditions 174
Verifying independence 174
Looking for what's normal 174
Taking note of spread 176
Setting Up the Hypotheses 178
Doing the F-Test 179
Running ANOVA in Minitab 180
Breaking down the variance into sums of squares 180
Locating those mean sums of squares 182
Figuring the F-statistic 183
Making conclusions from ANOVA 184
What's next? 186
Checking the Fit of the ANOVA Model 186
Chapter 11: Sorting Out the Means with Multiple Comparisons 189
Following Up after ANOVA 190
Comparing cellphone minutes: An example 190
Setting the stage for multiple comparison procedures 192
Pinpointing Differing Means with Fisher and Tukey 193
Fishing for differences with Fisher's LSD 194
Separating the turkeys with Tukey's test 197
Examining the Output to Determine the Analysis 198
So Many Other Procedures, So Little Time! 199
Controlling for baloney with the Bonferroni adjustment 200
Comparing combinations by using Scheffé's method 201
Finding out whodunit with Dunnett's test 202
Staying cool with Student Newman-Keuls 202
Duncan's multiple range test 202
Chapter 12: Finding Your Way through Two-Way ANOVA 205
Setting Up the Two-Way ANOVA Model 206
Determining the treatments 206
Stepping through the sums of squares 207
Understanding Interaction Effects 209
What is interaction, anyway? 209
Interacting with interaction plots 210
Testing the Terms in Two-Way ANOVA 213
Running the Two-Way ANOVA Table 214
Interpreting the results: Numbers and graphs 214
Are Whites Whiter in Hot Water? Two-Way ANOVA Investigates 217
Chapter 13: Regression and ANOVA: Surprise Relatives! 221
Seeing Regression through the Eyes of Variation 222
Spotting variability and finding an x-planation 222
Getting results with regression 223
Assessing the fit of the regression model 225
Regression and ANOVA: A Meeting of the Models 226
Comparing sums of squares 226
Dividing up the degrees of freedom 228
Bringing regression to the ANOVA table 229
Relating the F- and t-statistics: The final frontier 230
Part 4: Building Strong Connections with Chi-Square Tests and Nonparametrics 233
Chapter 14: Forming Associations with Two-Way Tables 235
Breaking Down a Two-Way Table 236
Organizing data into a two-way table 236
Filling in the cell counts 237
Making marginal totals 238
Breaking Down the Probabilities 239
Marginal probabilities 239
Joint probabilities 241
Conditional probabilities 242
Trying to Be Independent 247
Checking for independence between two categories 247
Checking for independence between two variables 249
Demystifying Simpson's Paradox 250
Experiencing Simpson's Paradox 250
Figuring out why Simpson's Paradox occurs 253
Keeping one eye open for Simpson's Paradox 254
Chapter 15: Being Independent Enough for the Chi-Square Test 257
The Chi-Square Test for Independence 258
Collecting and organizing the data 259
Determining the hypotheses 261
Figuring expected cell counts 261
Checking the conditions for the test 262
Calculating the Chi-square test statistic 263
Finding your results on the Chi-square table 266
Drawing your conclusions 269
Putting the Chi-square to the test 271
Comparing Two Tests for Comparing Two Proportions 272
Getting reacquainted with the Z-test for two population proportions 273
Equating Chi-square tests and Z-tests for a two-by-two table 274
Chapter 16: Using Chi-Square Tests for Goodness-of-Fit (Your Data, Not Your Jeans) 279
Finding the Goodness-of-Fit Statistic 280
What's observed versus what's expected 280
Calculating the goodness-of-fit statistic 282
Interpreting the Goodness-of-Fit Statistic Using a Chi-Square 284
Checking the conditions before you start 285
The steps of the Chi-square goodness-of-fit test 286
Chapter 17: Rebels Without a Distribution: Nonparametric Procedures 291
Arguing for Nonparametric Statistics 292
No need to fret if conditions aren't met 292
The median's in the spotlight for a change 293
So, what's the catch? 295
Mastering the Basics of Nonparametric Statistics 296
Sign 296
Chapter 18: All Signs Point to the Sign Test 299
Reading the Signs: The Sign Test 300
Testing the median in real estate 302
Estimating the median 304
Testing matched pairs 306
Part 5: Putting It All Together: Multi-Stage Analysis of a Large Data Set 309
Chapter 19: Conducting a Multi-Stage Analysis of a Large Data Set 311
Steps Involved in Working with a Large Data Set 311
Wrangling Data 313
Discovery 313
Structuring 314
Cleaning 315
Enriching 315
Validating 316
Publishing 317
Visualizing Data 317
Exploring the Data 319
Looking for Relationships 319
Building Models and Making Inferences 320
Sharing the Story 321
Who is the audience? 322
Make an outline 322
Include an executive summary 323
Check your writing 323
Chapter 20: A Statistician Watches the Movies 325
Examining the Movie Variables and Asking Questions 326
Visualizing the Movie Data 327
Categorical movie variables 328
Quantitative movie variables 329
Doing Descriptive Dirty Work 332
Looking for Relationships 333
Relationships between quantitative movie variables 333
Relationships between two categorical variables 337
Relationships between quantitative and categorical variables 338
Building a Model for Predicting U.S. Revenue 340
Writing It Up 343
Chapter 21: Looking Inside the Refrigerator 347
Refrigerator Data: The Variables 348
Exploring the Data 348
Analyzing the Data 350
Writing It Up 358
Part 6: The Part of Tens 361
Chapter 22: Ten Common Errors in Statistical Conclusions 363
Claiming These Statistics Prove 363
It's Not Technically Statistically Significant, But 364
Concluding That x Causes y 365
Assuming the Data Was Normal 366
Only Reporting Important Results 366
Assuming a Bigger Sample Is Always Better 367
It's Not Technically Random, But 369
Assuming That 1,000 Responses Is 1,000 Responses 369
Of Course the Results Apply to the General Population 371
Deciding Just to Leave It Out 372
Chapter 23: Ten Ways to Get Ahead by Knowing Statistics 375
Asking the Right Questions 375
Being Skeptical 376
Collecting and Analyzing Data Correctly 377
Calling for Help 378
Retracing Someone Else's Steps 379
Putting the Pieces Together 379
Checking Your Answers 380
Explaining the Output 381
Making Convincing Recommendations 382
Establishing Yourself as the Statistics Go-To Person 383
Chapter 24: Ten Cool Jobs That Use Statistics 385
Pollster 386
Data Scientist 387
Ornithologist (Bird Watcher) 387
Sportscaster or Sportswriter 388
Journalist 390
Crime Fighter 390
Medical Professional 391
Marketing Executive 392
Lawyer 393
Appendix A: Reference Tables 395
Index 409