(Invented) frequency of use of dude in four one-million-word spoken corpora:


US   NZ   AU   UK
15    9   11    5


If dude were used equally often in all four varieties, the distribution would be:


US   NZ   AU   UK
10   10   10   10


The difference from the observed counts is not significant. But what if each corpus were 10 million words and we got the following?


US    NZ    AU    UK
150   90    110   50


The proportions are the same, but with ten times the data the difference is far more likely to be significant.
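
The comparison above can be checked with a chi-square goodness-of-fit test. The following is a minimal sketch, not necessarily the test used in class. It assumes the four corpora are the same size, so the expected count is simply the mean of the observed counts, and it uses the standard critical value of 7.815 for 3 degrees of freedom at the .05 level. The function and variable names are invented for illustration.

```python
# Chi-square goodness-of-fit test for the invented "dude" counts.
# Assumption: all four corpora are the same size, so each variety is
# expected to contribute an equal share of the total.

def chi_square(observed):
    """Return the chi-square statistic against an equal expected count."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Critical value for 3 degrees of freedom (4 categories - 1) at p = .05.
CRITICAL_05_DF3 = 7.815

small_counts = [15, 9, 11, 5]       # US, NZ, AU, UK (smaller corpora)
large_counts = [150, 90, 110, 50]   # same proportions, ten times the data

for label, counts in (("smaller corpora", small_counts),
                      ("larger corpora", large_counts)):
    stat = chi_square(counts)
    verdict = "significant" if stat > CRITICAL_05_DF3 else "not significant"
    print(f"{label}: chi-square = {stat:.1f} ({verdict} at the .05 level)")
```

The statistic comes out at 5.2 for the smaller counts and 52.0 for the larger ones, so only the second set exceeds the critical value, even though the proportions are identical.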



What does a bell curve show?










Men: 80%    Women: 85%

Do women really do better on the test?




(graph taken and modified from http://en.wikipedia.org/wiki/Image:Normal_distribution_pdf.png)


Consider the red curve to indicate women's scores and the purple to show men's scores. Is there a real difference between 80% and 85%?


Consider the purple curve to be the distribution of scores achieved by learners of Method 1 and the blue curve to be that of Method 2. Is there really a difference?


Statistical tests consider not only the averages but also the spread of each distribution (how much the curves overlap) in determining significance.
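
To see why the spread matters, here is a rough sketch that computes a t statistic (Welch's form) for the same two averages, 80% and 85%, under two different spreads. The group size of 30 and the standard deviations of 2 and 15 are invented for illustration and do not come from any real test. As a rule of thumb, a t value above roughly 2 is significant at the .05 level for groups of this size.

```python
import math

def t_statistic(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Welch's t statistic computed from summary statistics."""
    # Standard error of the difference between the two means.
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error

# Same averages (80 vs. 85) and hypothetical group sizes (30 each),
# but two different hypothetical spreads.
t_narrow = t_statistic(80, 85, sd_a=2, sd_b=2, n_a=30, n_b=30)
t_wide = t_statistic(80, 85, sd_a=15, sd_b=15, n_a=30, n_b=30)

print(f"narrow curves (sd = 2):  t = {t_narrow:.2f}")
print(f"wide curves   (sd = 15): t = {t_wide:.2f}")
```

With narrow curves the five-point gap gives a t of about 9.68, well above 2. With wide, heavily overlapping curves the same gap gives a t of about 1.29, which would not count as a real difference.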



Logistic Regression

This is used when the dependent variable is nominal, and most often when the independent variables are nominal as well. For example, how do independent variables such as sex, race, region, and register (spoken vs. written) affect the deletion of t in words like swept, felt, dealt, perfect (the dependent variable)?
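
As a rough illustration of what such a model looks like once it has been fitted, the sketch below turns two nominal predictors (register and sex, coded as 0 or 1) into a probability of t-deletion. The coefficients are hypothetical placeholders chosen only to show the arithmetic. They are not estimates from any data, and the names are invented for this example.

```python
import math

# Hypothetical coefficients on the log-odds scale (NOT real estimates).
COEFFICIENTS = {
    "intercept": -0.5,  # baseline: written register, female speaker
    "spoken": 1.0,      # added when the register is spoken
    "male": 0.25,       # added when the speaker is male
}

def deletion_probability(spoken, male):
    """Predicted probability that t is deleted, given two yes/no predictors."""
    log_odds = (COEFFICIENTS["intercept"]
                + COEFFICIENTS["spoken"] * int(spoken)
                + COEFFICIENTS["male"] * int(male))
    # The logistic function maps log-odds onto a probability between 0 and 1.
    return 1 / (1 + math.exp(-log_odds))

for spoken in (False, True):
    for male in (False, True):
        p = deletion_probability(spoken, male)
        print(f"spoken={spoken!s:5} male={male!s:5} -> P(deletion) = {p:.2f}")
```

In a real study the coefficients would be estimated from coded tokens by a statistics package. A positive coefficient means that the factor raises the likelihood of deletion, and a negative one lowers it.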



A Language Study

Download this answer sheet, or use the one I give you in class. Record your responses on it. This will be used for Homework: correlation 2.

1-How many states have you lived in? (Count a country as a state.)

2-Listen to the following seven sound files and indicate what country each speaker is from.

Sample 1

Sample 2

Sample 3

Sample 4

Sample 5

Sample 6

Sample 7

(I haven't put the correct answers to these 7 on the class page. Be sure to ask me for them!)

3-Play each of the sound files below one at a time. Each contains only a single word. Without thinking, immediately say the first word that comes into your head when you hear the word, then write it down.

Word 1

Word 2

Word 3

Word 4