This week we discussed in depth the probabilities of coin flips and rolling dice. If you flip a coin n times, then the size of the sample space is 2^n. If you roll two 6-sided dice, then the sample space has size 6^2 = 36; for three dice it is 6^3 = 216.
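If you want to check these counts by listing the outcomes, here is a minimal Python sketch. Python is an assumption here (class work is in Excel), and the helper name sample_space is made up for illustration.

```python
from itertools import product

def sample_space(outcomes, repeats):
    """List every ordered sequence of `repeats` trials (hypothetical helper)."""
    return list(product(outcomes, repeat=repeats))

coin = "HT"          # one flip: heads or tails
die = range(1, 7)    # one roll: faces 1 through 6

three_flips = sample_space(coin, 3)   # 2^3 sequences such as ('H', 'T', 'H')
two_dice = sample_space(die, 2)       # 6^2 ordered pairs such as (3, 5)
three_dice = sample_space(die, 3)     # 6^3 ordered triples

print(len(three_flips), len(two_dice), len(three_dice))
```

Each outcome is an ordered tuple, so (3, 5) and (5, 3) count as different outcomes, which is why two dice give 36 outcomes and not 21.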
Tree diagrams help us visualize the sample space.
A random variable (r.v.) is a function from the sample space into the set of real numbers. When the outputs are all natural/whole numbers, then we say it is a discrete random variable.
Given an experiment and a random variable we want to construct the probability distribution function (pdf).
E.g. 1) flipping a coin n times and X=number of heads, 2) rolling two 6-sided dice and X=sum of faces, 3) drawing a 5-card hand from a standard deck of cards and X=number of hearts.
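The three examples can be worked out by counting. The sketch below is one way to do it in Python, assuming every outcome in the sample space is equally likely; the function name distribution and the choice of n = 3 flips are illustrative assumptions, not something from class.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

def distribution(sample_space, random_variable):
    """Map each value x to P(X = x), assuming equally likely outcomes."""
    counts = Counter(random_variable(outcome) for outcome in sample_space)
    total = sum(counts.values())
    return {x: Fraction(count, total) for x, count in sorted(counts.items())}

# 1) Flip a coin n = 3 times, X = number of heads.
flips = list(product("HT", repeat=3))
heads_pdf = distribution(flips, lambda outcome: outcome.count("H"))

# 2) Roll two 6-sided dice, X = sum of faces.
dice = list(product(range(1, 7), repeat=2))
sum_pdf = distribution(dice, sum)

# 3) Draw a 5-card hand, X = number of hearts. Listing every hand is slow,
#    so count instead: choose k of the 13 hearts and 5 - k of the other 39 cards.
hearts_pdf = {k: Fraction(comb(13, k) * comb(39, 5 - k), comb(52, 5))
              for k in range(6)}

for x, p in sum_pdf.items():
    print(x, p)
```

In each case the probabilities add up to 1, which is a quick check that no outcome was missed.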
We learned how to use =RANDBETWEEN(bottom number, top number) and =FREQUENCY(data array, bins array). The important thing with FREQUENCY is to hit CTRL+SHIFT+ENTER, because it is entered as an array formula.
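For comparison, here is a rough Python analogue of the two Excel functions. It is only a sketch: the names randbetween and frequency are made up, and the binning rule (count values up to and including each bin value, plus one extra count for values above the last bin) is my reading of how FREQUENCY behaves.

```python
import random
from bisect import bisect_left

def randbetween(bottom, top):
    """Random whole number from bottom to top, both ends included."""
    return random.randint(bottom, top)

def frequency(data, bins):
    """Count data per bin; `bins` must be sorted in increasing order.

    counts[i] holds the values v with bins[i-1] < v <= bins[i];
    the final entry holds the values above the last bin.
    """
    counts = [0] * (len(bins) + 1)
    for value in data:
        counts[bisect_left(bins, value)] += 1
    return counts

random.seed(0)  # fixed seed so the simulated rolls are repeatable
rolls = [randbetween(1, 6) for _ in range(600)]
roll_counts = frequency(rolls, [1, 2, 3, 4, 5, 6])
print(roll_counts)
```

With 600 simulated rolls each face should show up roughly 100 times, and the last entry is 0 because no roll is above 6.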
We started our discussion of probability. You start with an "experiment" and have possible (simple) outcomes. The set of all possible outcomes is called the sample space. An event is a set of outcomes. When all outcomes are equally likely, the probability of an event A is defined as
P(A) = |A| / |S|,
where S is the sample space and |A| denotes the size of the set A.
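As a small worked example of this definition, the Python sketch below counts outcomes for two dice. It assumes all 36 outcomes are equally likely; the names probability and sum_is_seven are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling two 6-sided dice: 36 ordered pairs.
sample_space = set(product(range(1, 7), repeat=2))

def probability(event, space):
    """Size of the event divided by size of the sample space."""
    return Fraction(len(event), len(space))

# Event: the two faces add up to 7.
sum_is_seven = {outcome for outcome in sample_space if sum(outcome) == 7}

print(sorted(sum_is_seven))
print(probability(sum_is_seven, sample_space))
```

Six of the 36 outcomes add up to 7, so the probability is 6/36 = 1/6.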
There are three main ways to create a new event from given events A and B.
a) AND A AND B; the set of outcomes that belong to both A and B
b) OR A OR B; the set of outcomes that belong to at least one of A or B
c) NOT A' ; the set of outcomes that do not belong to A.
We discussed three basic properties of probabilities. For any events A and B:
i) 0 ≦ P(A) ≦ 1,
ii) P(A) + P(A') = 1,
iii) P(A OR B) = P(A) + P(B) - P(A AND B).
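AND, OR, and NOT correspond to set intersection, union, and complement, so the three properties can be checked on a small example. This is a sketch for a single roll of a die with equally likely faces; the events A and B are chosen only for illustration.

```python
from fractions import Fraction

S = set(range(1, 7))   # sample space: one roll of a 6-sided die
A = {2, 4, 6}          # event: the roll is even
B = {5, 6}             # event: the roll is greater than 4

def P(event):
    """Probability under equally likely outcomes."""
    return Fraction(len(event), len(S))

A_and_B = A & B        # outcomes in both A and B
A_or_B = A | B         # outcomes in at least one of A or B
not_A = S - A          # outcomes that are not in A

print(0 <= P(A) <= 1)                        # property i
print(P(A) + P(not_A) == 1)                  # property ii
print(P(A_or_B) == P(A) + P(B) - P(A_and_B)) # property iii
```

Property iii subtracts P(A AND B) because the outcome 6 belongs to both events and would otherwise be counted twice.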
Two events A and B are said to be mutually exclusive if A AND B = ∅, the empty set.
Two events A and B are said to be independent if P(A) x P(B) = P(A AND B).
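These two ideas are easy to mix up, so here is a sketch that tests both definitions on two dice. The events and the helper names (mutually_exclusive, independent) are assumptions picked for illustration, with all 36 outcomes taken as equally likely.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # two 6-sided dice

def P(event):
    return Fraction(len(event), len(S))

def mutually_exclusive(a, b):
    """True when the events share no outcomes."""
    return len(a & b) == 0

def independent(a, b):
    """True when P(A) x P(B) equals P(A AND B)."""
    return P(a) * P(b) == P(a & b)

first_even = {o for o in S if o[0] % 2 == 0}   # first die shows an even face
sum_seven = {o for o in S if sum(o) == 7}      # faces add up to 7
doubles = {o for o in S if o[0] == o[1]}       # both dice show the same face

print(independent(first_even, sum_seven), mutually_exclusive(first_even, sum_seven))
print(independent(sum_seven, doubles), mutually_exclusive(sum_seven, doubles))
```

Here "first die even" and "sum is 7" are independent but not mutually exclusive, while "sum is 7" and "doubles" are mutually exclusive but not independent.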
We completed Chapter 2. We discussed stem-and-leaf plots and their advantages (they look like a histogram with the data listed) and disadvantages (hard to construct in Excel).
We discussed what happens to the mean, median, st. dev., and variance if you modify the data, either by adding c units to every data point or by multiplying every data point by a constant c>0.
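You can see the effect numerically with a short Python sketch. The data set and c = 3 are made up, and the sample versions of the st. dev. and variance are used (an assumption; the same pattern holds for the population versions).

```python
import statistics

def summary(data):
    """Mean, median, sample st. dev., and sample variance of a data set."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),
        "variance": statistics.variance(data),
    }

data = [2, 4, 4, 5, 7, 9]   # hypothetical data
c = 3

original = summary(data)
shifted = summary([x + c for x in data])   # add c to every data point
scaled = summary([x * c for x in data])    # multiply every data point by c

for name in original:
    print(name, original[name], shifted[name], scaled[name])
```

Adding c moves the mean and median up by c and leaves the st. dev. and variance alone. Multiplying by c multiplies the mean, median, and st. dev. by c, and the variance by c^2.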
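We showed how to construct box-and-whisker plots and visually see outliers: a data point is flagged as an outlier if it lies more than 1.5 times the length of the box (the IQR) beyond either end of the box. We discussed how to construct a modified box-and-whisker plot.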
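The 1.5 x box-length rule can be sketched in Python as follows. The data are made up, and the quartiles come from Python's statistics.quantiles with method="inclusive", which is an assumption: the textbook and Excel may use slightly different quartile conventions, so the fences can differ a little.

```python
import statistics

def outlier_fences(data):
    """Return (lower fence, upper fence) using 1.5 times the box length."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1                      # length of the box
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]   # hypothetical data with one extreme value
low, high = outlier_fences(data)
outliers = [x for x in data if x < low or x > high]

print(low, high, outliers)
```

In a modified box-and-whisker plot the whiskers stop at the most extreme data points inside the fences, and the outliers are drawn as separate points.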
We discussed why the median is resistant to outliers, whereas the mean is not. We learned how to make frequency tables and cumulative frequency tables, as well as find quartiles and the 5-number summary in Excel.
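A tiny example makes the point about resistance. In this Python sketch (made-up numbers), the largest value is replaced by an extreme one and the mean and median are recomputed.

```python
import statistics

clean = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 500]   # same data, largest value replaced by an outlier

mean_before = statistics.mean(clean)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(clean)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)       # the mean is pulled toward the outlier
print(median_before, median_after)   # the median does not move
```

The median only depends on the middle of the sorted data, so changing the largest value does not affect it, while the mean uses every value.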
I will try to keep this up-to-date. If you would like to share some thoughts about what we did in class in an email, please feel free to do so. Use "Updates for Stats" in your subject line. This will count towards your participation/other
So far we have covered the following topics in class:
1. Two types of variables: categorical (qualitative) and quantitative. Quantitative can be broken into two sub-types: continuous and discrete.
2. Variables can be characterized by scale or levels of measurement (p. 27). You should understand these.
3. For quantitative variables we discussed the following terms: mean, median, mode, range, interquartile range (IQR), quartiles, percentiles, standard deviation, and variance. Know the difference between a measure of center and a measure of spread.
4. We did an example of a quantitative variable in class (hours you slept last night). We constructed a frequency table. You should look up relative frequency table and cumulative relative frequency (p. 28-29).
5. We have discussed histograms, bar graphs, pie charts, and how to construct them in Excel. Bar graphs and pie charts are best for categorical data. Histograms are great for quantitative data.
You should read sections 1.4, 1.5, and 1.6 for your own sake. I will not test you on this material.
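To connect the frequency table from item 4 with relative and cumulative relative frequency, here is a minimal Python sketch. The hours-slept values are invented (not the actual class data), and the function name frequency_table is hypothetical.

```python
from collections import Counter
from fractions import Fraction

def frequency_table(data):
    """Rows of (value, frequency, relative frequency, cumulative relative frequency)."""
    counts = Counter(data)
    total = len(data)
    rows = []
    running = Fraction(0)
    for value in sorted(counts):
        relative = Fraction(counts[value], total)
        running += relative            # running total of the relative frequencies
        rows.append((value, counts[value], relative, running))
    return rows

hours_slept = [6, 7, 7, 8, 5, 7, 6, 8, 8, 7]   # hypothetical responses

table = frequency_table(hours_slept)
for value, freq, rel, cum in table:
    print(value, freq, float(rel), float(cum))
```

The relative frequencies add up to 1, so the last cumulative relative frequency is always 1.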
Make sure you are getting a coin, a standard deck of cards, and a pair of 6-sided dice.
Remember: at the end of each chapter there is a list of Key Terms, a Chapter Review, a list of HW Problems broken down by section, and then the answers to selected problems.