Working with the normal distribution
Example 1
Suppose the weights of checked baggage of airline passengers follow a nearly normal distribution with mean = 45 pounds and SD = 3.2 pounds. Most airlines charge a fee for baggage that weighs in excess of 50 pounds.
Question: What percent of airline passengers are expected to incur this fee?
We can write the normal probability model as:
baggage ~ N(mean = 45, SD = 3.2)
Solution 1: Use the app
The distribution calculator app is available at https://gallery.shinyapps.io/dist_calc/
Solution 2: Use R pnorm()
In R, we call the pnorm() function with a cutoff value of 50, a mean of 45, and a standard deviation of 3.2.
pnorm(50, mean = 45, sd = 3.2)
[1] 0.9409
We find that the area under the curve below 50 is 0.9409. This is not the area we are interested in, because the fee applies to baggage above 50 pounds. To find the desired area we take the complement: 1 - 0.9409 = 0.0591. So roughly 5.91% of passengers are expected to have baggage that weighs in excess of 50 pounds.
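The complement does not have to be taken by hand. As a minimal sketch, assuming base R only, pnorm() accepts a lower.tail argument, and setting it to FALSE returns the area above the cutoff directly. The variable names below are illustrative and do not come from the original example.

```r
# Parameters from the baggage example
mean_weight <- 45   # pounds
sd_weight   <- 3.2  # pounds
cutoff      <- 50   # the fee applies above this weight

# Area below the cutoff, then its complement
p_below <- pnorm(cutoff, mean = mean_weight, sd = sd_weight)
p_above_complement <- 1 - p_below

# The same upper-tail area in a single call
p_above_direct <- pnorm(cutoff, mean = mean_weight, sd = sd_weight,
                        lower.tail = FALSE)

print(round(p_above_direct, 4))  # approximately 0.0591
```

Both routes give the same probability; the lower.tail = FALSE form simply avoids the manual subtraction.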
Solution 3: Use the Z-score and normal probability table
First, we need to calculate the Z-score:
Z = (50 - 45) / 3.2 = 1.56
Then, we refer to the normal probability table and locate 1.5 in the row and 0.06 in the column. The value at their intersection, 0.9406, is the table entry for a Z-score of 1.56.
Remember, the table gives the area under the curve below the observed value. To find the area above the observed value, we take the complement: 1 - 0.9406 = 0.0594, or roughly 5.9%.
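The table lookup can also be reproduced in R. This is a sketch under the assumption that a printed table rounds Z to two decimals; pnorm() with its default mean = 0 and sd = 1 then plays the role of the table. The variable names are illustrative.

```r
# Standardize the cutoff: Z = (observation - mean) / SD
z <- (50 - 45) / 3.2          # 1.5625

# A printed table only lists Z to two decimals
z_table <- round(z, 2)        # 1.56

# Standard normal area below Z (the table entry)
area_below <- pnorm(z_table)  # approximately 0.9406

# Area above Z is the complement
area_above <- 1 - area_below  # approximately 0.0594

print(round(c(area_below, area_above), 4))
```

The small gap between 0.0594 here and 0.0591 from Solution 2 comes only from rounding Z to 1.56 before the lookup.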
Example 2
The average daily high temperature in June in LA is 77F with SD = 5F. Suppose that the temperatures in June closely follow a normal distribution.
Question: How cold are the coldest 20% of the days during June in LA?
We can write the normal probability model as:
temperatures ~ N(mean = 77, SD = 5)
The coldest 20% of the days sit at the lower end of the distribution, so we are looking for the 20th percentile. We can draw the distribution with the lower 20% of the area shaded. This time we do not have the cutoff value (the observation); that is the value we are trying to find.
Solution 1: use R qnorm()
To work out the cutoff value, we use the qnorm() function, where the first input is the percentile (as a proportion), the second input is the mean, and the third input is the standard deviation.
qnorm(0.2, mean = 77, sd = 5)
[1] 72.79
That gives us 72.79, meaning that the coldest 20% of the days in June in LA are colder than 72.79F.
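One way to sanity-check this result is to feed the cutoff back into pnorm(), since qnorm() and pnorm() are inverses of each other for the same mean and SD. The sketch below assumes base R only; the variable names are illustrative.

```r
# Cutoff temperature below which 20% of June days fall
cutoff_temp <- qnorm(0.2, mean = 77, sd = 5)

# Feeding the cutoff back into pnorm() should recover the 20% area
p_check <- pnorm(cutoff_temp, mean = 77, sd = 5)

print(round(cutoff_temp, 2))  # approximately 72.79
print(p_check)                # approximately 0.2
```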
Solution 2: Use the Z-score and normal probability table
Another approach is to do this by hand. Once again, we draw our curve and mark the percentile. Then, using the probability table, we need to figure out which Z-score corresponds to an area of 0.2.
We find that 0.2005 is the closest value in the table, and it corresponds to a Z-score of -0.84. Therefore:
Z = -0.84 = (X - 77) / 5
so, X = (-0.84 × 5) + 77 = 72.8
So, we once again get 72.8F, meaning that the coldest 20% of the days in June in LA are colder than 72.8F.
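The by-hand steps can be mirrored in R as a rough check. In this sketch, qnorm() with its default mean = 0 and sd = 1 stands in for the reverse table lookup, and the Z-score is then converted back to a temperature with X = Z × SD + mean. The variable names are illustrative.

```r
# Z-score that cuts off the lowest 20% of a standard normal
z <- qnorm(0.2)              # approximately -0.84

# Un-standardize: X = Z * SD + mean
x <- z * 5 + 77              # approximately 72.79

# Same calculation with the rounded table value
x_table <- -0.84 * 5 + 77    # 72.8

print(round(c(z, x, x_table), 2))
```

The two temperatures differ only because the table value of Z is rounded to two decimals.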