STATS191 - Home

Assignment 1


You may discuss homework problems with other students, but you have to prepare the written assignments yourself.

Please combine all your answers, the computer code and the figures into one PDF file, and submit a copy to your folder on Gradescope.

Grading scheme: 10 points per question, total of 40.

Due date: 11:59 PM April 15, 2024 (Monday evening).

RStudio: RMarkdown, Quarto

Question 1

Install the package ISLR2 using the command below. If it is already installed, use library(ISLR2) instead of install.packages.

```{r}
install.packages('ISLR2', repos='http://cloud.r-project.org')
```

We’ll use the College data set for this problem. In particular, we want to compare the cost of room and board between private and public colleges.

Draw a boxplot of Room.Board stratified by Private. Similarly, plot histograms of the two samples.

Based on the histograms, do you think the two-sample \(t\)-test is justified here?

Carry out the two-sample \(t\)-test comparing Room.Board between private and public colleges.
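
The test in the last step can be sketched as follows. This is only an illustration on simulated stand-in data: the data frame college_sim, its group sizes, means, and standard deviations are assumptions made up for the example, and only the column names mirror the College data.

```r
set.seed(1)  # reproducible stand-in data; these are NOT the College values
college_sim <- data.frame(
  Private    = factor(rep(c("No", "Yes"), times = c(200, 500))),
  Room.Board = c(rnorm(200, mean = 3700, sd = 900),    # hypothetical public colleges
                 rnorm(500, mean = 4600, sd = 1100))   # hypothetical private colleges
)

# Welch two-sample t-test (t.test does not assume equal variances by default)
tt_welch  <- t.test(Room.Board ~ Private, data = college_sim)
# Classical pooled-variance version, for comparison
tt_pooled <- t.test(Room.Board ~ Private, data = college_sim, var.equal = TRUE)

c(welch = unname(tt_welch$statistic), pooled = unname(tt_pooled$statistic))
c(welch = tt_welch$p.value, pooled = tt_pooled$p.value)
```

With the real data, the same formula interface should work with data = College after library(ISLR2); boxplot(Room.Board ~ Private, data = College) and hist() on each subset give the plots asked for above.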

Question 2

In this problem, we’ll repeat the analysis above using Top10perc rather than Room.Board. We’ll also simulate the distribution of the \(t\)-statistic for comparing Top10perc between public and private colleges under the null hypothesis that there is no difference.

Repeat the steps above, swapping Top10perc for Room.Board. Do the histograms have a normal shape?

Compute the two-sample \(t\)-statistic comparing Top10perc among private and public colleges by saving the $statistic element of the object returned by t.test.

We’ll create a sample of 10000 draws of the \(T\)-statistic based on random permutations of the data. For this we’ll need to use a for loop. The snippet below creates a vector of length 10000 and runs a for loop to store some numbers in the vector. In the example below, we store a call to rnorm(1), a randomly generated normal variable with mean 0 and variance 1.

```{r}
my_sample = rep(NA, 10000)   # create a vector of length 10000 filled with missing values
for (i in 1:10000) {         # for loops need to be enclosed in {}
    my_sample[i] = rnorm(1)
}
```

Randomly shuffle the Private vector 10000 times, and recompute the two-sample \(t\)-test statistic, storing the values in a vector of length 10000.

Create a histogram of the 10000 statistics. Does its shape seem similar to that of the \(T\)-distribution used to compute the \(p\)-value by t.test?

Compute a \(p\)-value by computing the proportion of your 10000 statistics that are larger in absolute value than your observed \(t\)-statistic. Is it close to the \(p\)-value you found by using t.test?
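
The permutation steps above can be sketched as follows. The data are simulated stand-ins (the vectors y and group, the group sizes, and the exponential shape are all assumptions for illustration, not the College data), and the loop uses 2000 permutations instead of 10000 only to keep the example quick.

```r
set.seed(2)  # simulated stand-in for Top10perc and Private; NOT the College data
n_public  <- 60
n_private <- 140
group <- factor(rep(c("No", "Yes"), times = c(n_public, n_private)))
y     <- c(rexp(n_public, rate = 1 / 20), rexp(n_private, rate = 1 / 28))  # right-skewed values

t_obs <- t.test(y ~ group)$statistic   # observed two-sample t-statistic

n_perm <- 2000                         # the assignment asks for 10000
t_perm <- rep(NA, n_perm)
for (i in 1:n_perm) {
  shuffled  <- sample(group)           # random permutation of the group labels
  t_perm[i] <- t.test(y ~ shuffled)$statistic
}

# Two-sided permutation p-value: share of permuted statistics
# larger in absolute value than the observed one
p_perm <- mean(abs(t_perm) > abs(t_obs))
p_perm
```

Shuffling the labels keeps the observed values fixed and breaks any link between group and outcome, which is exactly what the null hypothesis of no difference asserts.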

Question 3 - Sign Test

In this problem we’ll carry out the sign test by simulation. The data we’ll use is the Fund dataset from ISLR2 which contains (simulated) returns for different fund managers over 50 months. If fund managers are unable to really beat the market, we expect that they will have positive returns as frequently as negative returns. This is our null hypothesis.

For Manager14, compute the signs of their 50 returns and store the number of months with positive returns as our statistic.

For a null distribution, we will take a distribution symmetric around 0 and look at the distribution of the sum of its signs.

For 10000 reps, draw 50 standard normals (mean 0, variance 1) using rnorm(50), and compute the number of positive entries as a test statistic. Plot a histogram of your results using the argument breaks=1:50.

Instead of normal random variables, repeat 2. with uniform random variables, which have a distribution symmetric around 0.5. That is, for 10000 reps, simulate 50 uniform random variables with runif(50) and store the number greater than 0.5 as a test statistic. Does it look the same as 2.? Why?

Explain why the distribution from 2. is a good way to evaluate evidence against our null hypothesis about Manager14. Compute a \(p\)-value by computing the proportion of your 10000 draws that have more positive draws than your observed number of positive months. Is this a 1-sided or 2-sided \(p\)-value? What is the null hypothesis here?

The distribution of the number of positive standard normals out of 50 is the Binomial distribution with parameters (50, 1/2). Its true histogram (i.e. its probability mass function, or pmf) can be computed with dbinom. That is, the probability that there are exactly 23 positive standard normals is dbinom(23, 50, 0.5). Use barplot to make a barplot of the true histogram.

Compute the probability of observing at least as many positive standard normals as the number of positive return months for Manager14. You can do this by summing dbinom or using the cumulative histogram (i.e. cumulative distribution function or CDF) pbinom.
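
A minimal sketch of the simulated and exact versions of the sign test is given below. The vector returns_sim is a hypothetical set of 50 monthly returns invented for the example (it is not Fund$Manager14), and the tail is taken as "at least as many", matching the last step.

```r
set.seed(3)
returns_sim <- rnorm(50, mean = 0.5, sd = 3)  # hypothetical returns; NOT Fund$Manager14
n_pos_obs   <- sum(returns_sim > 0)           # observed statistic: months with positive return

# Null distribution by simulation: positive entries among 50 symmetric draws
n_reps   <- 10000
null_pos <- rep(NA, n_reps)
for (i in 1:n_reps) {
  null_pos[i] <- sum(rnorm(50) > 0)
}
p_sim <- mean(null_pos >= n_pos_obs)          # one-sided: at least as many positives

# Exact counterpart from the Binomial(50, 1/2) distribution, computed two ways
p_exact_sum <- sum(dbinom(n_pos_obs:50, 50, 0.5))
p_exact_cdf <- 1 - pbinom(n_pos_obs - 1, 50, 0.5)

c(simulated = p_sim, exact_sum = p_exact_sum, exact_cdf = p_exact_cdf)
```

The simulated and exact values should agree up to Monte Carlo error, because the sign of a draw from any distribution symmetric around 0 is a fair coin flip.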

Question 4

Use the Wage data from ISLR2 for this problem.

Make a boxplot of wage ~ maritl (marital status). Inspect histograms for the 5 different values of maritl. Do the histograms have a normal shape?

Repeat 1. for logwage.

Fit the one-way ANOVA model logwage ~ maritl. Test the null hypothesis that there is no difference in wages based on the marital status of the 3000 men.

Construct a 95% confidence interval for the difference between the wage of widowed men and those never married.

Suppose you wanted to not rely on the \(p\)-value from the \(F\)-test as provided by your one-way model in 3. How might you simulate a distribution to test the null hypothesis that the wage distribution doesn’t depend on marital status?

Compute a \(p\)-value by simulation using your answer to 4.
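
One possible way to simulate a null distribution for the \(F\)-statistic is to permute the group labels, sketched below. This is an assumption about the approach, not necessarily the intended answer; maritl_sim, logwage_sim, the sample size of 300, and the helper f_stat are all hypothetical and stand in for the Wage data.

```r
set.seed(4)  # simulated stand-in for Wage; labels and effect size are made up
maritl_sim  <- factor(sample(c("never", "married", "widowed", "divorced", "separated"),
                             size = 300, replace = TRUE))
logwage_sim <- rnorm(300, mean = 4.6, sd = 0.35) + 0.1 * (maritl_sim == "married")

# Hypothetical helper: F statistic of the one-way model y ~ g
f_stat <- function(y, g) anova(lm(y ~ g))[1, "F value"]

f_obs  <- f_stat(logwage_sim, maritl_sim)
n_perm <- 1000
f_perm <- rep(NA, n_perm)
for (i in 1:n_perm) {
  f_perm[i] <- f_stat(logwage_sim, sample(maritl_sim))  # shuffle labels under the null
}

p_perm <- mean(f_perm >= f_obs)  # large F values count as evidence against the null
p_perm
```

Only large values of \(F\) are counted, since the \(F\)-statistic grows when group means differ in any direction.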


Assignment 1: Descriptive statistics

This assignment is composed of two parts. In Part 1 you are expected to reproduce the first part of Table 1 in Haun et al. (2019). In Part 2 you are expected to present results from the reliability study.

Part 1: Reproduce a table

Haun et al. (2019) have published raw data together with their paper. This means that we will be able to reproduce some of their results, exciting! In this assignment you are expected to reproduce Table 1 in Haun et al. (2019). To get you started, I have included the code needed to download the data below (from the lesson on importing data).

When loading the data you will notice that some of the columns are duplicated. For example, there are two columns named GROUP; one will be converted, as the warning message says.

The upper part of Table 1 contains means and standard deviations for age, training age, body mass, DXA, type II fibers, 3RM in back squat, and total back squat volume. I have identified the following variables in the data set that might be handy:

  • T1_BODY_MASS
  • PERCENT_TYPE_II_T1
  • Squat_3RM_kg
  • SQUAT_VOLUME

Notice that the CLUSTER variable is the grouping variable for the table.

I have not located the variable representing “Training age”; we will therefore make the table as complete as possible with the variables that we find. Do not bother including the \(\pm\) sign if you do not find a quick solution. It is actually better to write means and standard deviations as Mean (SD) (Altman and Bland 2005).

The code below can maybe get you started. See the lesson on making tables for more ideas.

Part 2: Calculate measures of reliability

The second part of the assignment concerns calculations of reliability from the physiology lab. You are expected to calculate the typical error (TE) and the smallest worthwhile change.

Typical error

The typical error can be calculated from two trials and be expressed as:

\[TE = \frac{s_{diff}}{\sqrt{2}}\]

where \(s_{diff}\) is the standard deviation of the difference scores between the two trials, and the \(\sqrt{2}\) is needed to express the variation as the typical variation in a single trial. This follows from the fact that the variance (\(s^2\)) of the difference score is the sum of the variances of the typical error from each trial (\(s^2_{diff}=s^2_{trial~1} + s^2_{trial~2}\)) (Hopkins 2000). If both trials have the same error variance \(TE^2\), then \(s^2_{diff} = 2 \times TE^2\), which gives the expression above.

As the TE is expressed as a standard deviation, it is on the same scale as the mean. TE can thus be expressed as a percentage of the mean or a coefficient of variation (CV). To express the TE as a percentage:

\(CV\% = 100 \times \frac{TE}{Mean}\)

Where the mean can be the group overall mean.
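
The two formulas above can be sketched in R as follows. The vectors trial1 and trial2 are invented test-retest values for five participants, used only to show the arithmetic.

```r
# Hypothetical test-retest data for five participants (values invented for illustration)
trial1 <- c(50.2, 61.5, 47.8, 55.0, 58.3)
trial2 <- c(51.0, 60.1, 49.2, 54.4, 59.5)

diff_scores <- trial2 - trial1
s_diff <- sd(diff_scores)                     # standard deviation of the difference scores
te     <- s_diff / sqrt(2)                    # typical error
cv     <- 100 * te / mean(c(trial1, trial2))  # TE as a percentage of the overall mean

c(TE = te, CV_percent = cv)
```

Here the overall mean of both trials is used as the denominator, which is one of the choices the text allows.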

Smallest worthwhile change

We will calculate the smallest worthwhile change based on an estimate of the standard deviation in the population for a specific test. A sample can be used to estimate characteristics of the population. These concepts (sample/population) may be confusing and you may want to review them. A good place to start is Diez, Barr, and Çetinkaya-Rundel (2020), where chapter 1 gives a broad introduction. Chapter 5 in Navarro (2020) is a more in-depth description of the same concepts with references to R.

OK, so using the sample we estimate the variation in the population as a standard deviation. The smallest worthwhile change corresponds to \(0.2\times s\), where \(s\) is the estimate of the between-individual standard deviation. To calculate \(s\) we average multiple trials from the same individuals and calculate \(s\) between them.

The multiplier 0.2 comes from definitions of effect sizes where \(0.2\times s\) is considered a small but non-trivial effect. We are thus interested in finding the value where a change is at least small.
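
As a rough sketch of the calculation just described, with invented values for two trials of the same five participants (trial1 and trial2 are hypothetical):

```r
# Hypothetical repeated trials for five participants (values invented for illustration)
trial1 <- c(50.2, 61.5, 47.8, 55.0, 58.3)
trial2 <- c(51.0, 60.1, 49.2, 54.4, 59.5)

person_mean <- rowMeans(cbind(trial1, trial2))  # average the trials within each individual
s   <- sd(person_mean)                          # between-individual standard deviation
swc <- 0.2 * s                                  # smallest worthwhile change

c(s = s, SWC = swc)
```

Averaging within individuals first means that \(s\) reflects differences between people rather than trial-to-trial noise.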

Example code

How to submit, alternative 1.

Create an R Markdown file and submit this file, together with any data (or a link to the data) used to run the analysis, on Canvas.

Alternative 2, this is recommended!

Create a GitHub repository and add your code to an R Markdown document. The repository should contain all data needed to run your analysis. See the lesson on version control for ideas on how to create the repository.


1.8 Assignment 1

This assignment has two parts. The first part assesses your knowledge of the concept of sample, population, descriptive vs. inferential study, using graphs to summarize data. The second part assesses your skills in using R commander to create the graphs and interpreting their meaning.

M01_SaleHome.xlsx

Instructions

Important note:

By default, in all assignments in this course, you are required to complete the questions or tasks in Part A by hand. This means that to do any calculation or drawing, you will NOT use R commander or any computer application. That is, you will do calculations manually with a non-programmable scientific calculator, use a pen or pencil to draw figures or build a distribution table on paper (or a drawing pad/tablet), take a photo of it, and insert it below the answer space of the question. Before you start your assignment, you should get a calculator that has statistical functions.

Before you complete Part B using R commander, you should read and practice the R commander steps by following the guidance in the Lab Manual.

Complete the following:

  • Identify the sample and population of this study. (2 marks)
  • Is this study a designed experiment or an observational study? Explain why. (2 marks)
  • Is this study descriptive or inferential? Explain why. (2 marks)
  • Propose two different methods to take a simple random sample of 200 students from a collection of 800 students. (2 marks)

[Image: a table showing the first thirty entries of the Home Sale spreadsheet.]

  • Identify the type of data provided in each column as qualitative, quantitative discrete, or quantitative continuous. (10 marks)
  • Obtain a frequency distribution using [1400, 1600) as the first sub-interval, [1600, 1800) as the second sub-interval, [1800, 2000) as the third, and so on, and insert it in the space below. (2 marks)
  • Obtain a relative frequency distribution based on part (1) and insert it in the space below. (2 marks)
  • Construct a relative-frequency histogram and insert it below. (3 marks)
  • Describe the overall shape, modality, and symmetry/skewness, if applicable, of the graph you constructed in part (3). (3 marks)
  • Obtain a frequency distribution. (2 marks)
  • Obtain a relative frequency distribution based on part (1). (2 marks)
  • Construct a graph corresponding to part (1). (3 marks)
  • Describe the graph obtained in part (3). (2 marks)
  • Construct two different types of graphs corresponding to part (2). (6 marks)
  • Describe the graphs obtained in part (3). (2 marks)

Finish the following questions using R and R commander:

Read the data set “M01_SaleHome.xlsx” and use R commander to complete the following tasks. For each, you need to copy or take a screenshot of the output in R commander (we later call it computer output) and paste it into the space below the questions. To save space, you only need to copy and paste what is asked for in the questions, and you may sometimes need to shrink the size.

  • Use the most suitable type of graph to summarize the prices of these 88 sale homes. Comment on the distribution of the price in terms of overall shape, modality, and symmetry/skewness, if applicable. (5 marks)
  • Use a suitable graph(s) we taught in Module 1 to compare the prices of homes with a tile roof and a non-tile roof. Briefly explain your findings based on the graph(s). (5 marks)
  • Homes with a swimming pool. (1 mark)
  • Homes without a swimming pool. (1 mark)
  • Homes with a swimming pool and with a tile roof. (1 mark)
  • Homes without a swimming pool and with a non-tile roof. (1 mark)
  • Use the most suitable graph to show the effect of “Size” on the “Price” of the sale home. Briefly describe the relationship you found. (5 marks)

Introduction to Applied Statistics Copyright © 2024 by Wanhua Su is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.


CS 248: Introduction to Computer Graphics

Pat Hanrahan

Assignment 1 Grades

Grader    Mean    Median   Mode   Standard Deviation
Maneesh   97.47   100      98     5.56
Andrew    97.27   100      100    5.71
Reid      98.31   100      100    7.88

Overall   96.92   100      100    6.46

ratan8932/Assignment--Basic-Statistics-Level-1

Q1) Identify the Data type for the Following:

Activity - Data Type
Number of beatings from Wife - Discrete
Results of rolling a dice - Discrete
Weight of a person - Continuous
Weight of Gold - Continuous
Distance between two places - Continuous
Length of a leaf - Continuous
Dog's weight - Continuous
Blue Color - Discrete
Number of kids - Discrete
Number of tickets in Indian railways - Discrete
Number of times married - Discrete
Gender (Male or Female) - Discrete

Q2) Classify each of the following as Nominal, Ordinal, Interval, or Ratio.

Data - Data Type
Gender - Nominal
High School Class Ranking - Ordinal
Celsius Temperature - Interval
Weight - Ratio
Hair Color - Nominal
Socioeconomic Status - Ordinal
Fahrenheit Temperature - Interval
Height - Ratio
Type of living accommodation - Nominal
Level of Agreement - Ordinal
IQ (Intelligence Scale) - Interval
Sales Figures - Ratio
Blood Group - Nominal
Time Of Day - Ordinal
Time on a Clock with Hands - Interval
Number of Children - Ratio
Religious Preference - Nominal
Barometer Pressure - Ratio
SAT Scores - Interval
Years of Education - Ratio

Q3) Three coins are tossed. Find the probability that two heads and one tail are obtained.

Solution: When three coins are tossed, the total number of possible outcomes is 2^3 = 8. These outcomes are HHH, HHT, HTH, THH, TTH, THT, HTT, TTT. The outcomes with two heads and one tail are HHT, HTH, and THH, so there are 3 of them. Therefore, the probability of getting two heads and one tail is: P(two heads and one tail) = number of desired outcomes / total number of outcomes = 3/8 = 0.375
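
The count can be checked by listing all outcomes in R. This is a small illustrative sketch; the object names are made up for the example.

```r
# All 2^3 equally likely outcomes of tossing three coins
coins <- expand.grid(c1 = c("H", "T"), c2 = c("H", "T"), c3 = c("H", "T"),
                     stringsAsFactors = FALSE)
n_heads <- rowSums(coins == "H")   # number of heads in each outcome

coins[n_heads == 2, ]              # the outcomes with exactly two heads
p_two_heads <- mean(n_heads == 2)  # share of outcomes with two heads and one tail
p_two_heads
```

The same probability is given by the binomial pmf, dbinom(2, 3, 0.5).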

Q4) Two dice are rolled. Find the probability that the sum is:

a) Equal to 1. The set of possible outcomes when we roll a die is {1, 2, 3, 4, 5, 6}, so when we roll two dice there are 6 × 6 = 36 outcomes. The smallest possible sum is 1 + 1 = 2, so no outcome gives a sum of 1. Therefore, P(sum is equal to 1) = 0/36 = 0.

b) Less than or equal to 4. Of the 36 outcomes, the ones with a sum of at most 4 are (1, 1) for a sum of 2; (1, 2) and (2, 1) for a sum of 3; and (1, 3), (2, 2), and (3, 1) for a sum of 4. So the number of favorable outcomes = 6 and the total number of outcomes = 36. Therefore, P(sum is less than or equal to 4) = number of favorable outcomes / total number of outcomes = 6/36 = 1/6.

c) Divisible by 2 and 3. A sum divisible by both 2 and 3 is divisible by 6, so it must be 6 or 12. Of the 36 outcomes, these are (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), and (6, 6). So the number of favorable outcomes = 6 and the total number of outcomes = 36. Therefore, P(sum is divisible by 2 and 3) = number of favorable outcomes / total number of outcomes = 6/36 = 1/6.
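
All three parts can be checked by enumerating the 36 outcomes, as in this short sketch (object names are illustrative):

```r
# All 36 equally likely outcomes of rolling two dice
dice  <- expand.grid(d1 = 1:6, d2 = 1:6)
total <- dice$d1 + dice$d2

p_a <- mean(total == 1)                          # a) sum equal to 1
p_b <- mean(total <= 4)                          # b) sum less than or equal to 4
p_c <- mean(total %% 2 == 0 & total %% 3 == 0)   # c) sum divisible by both 2 and 3

c(a = p_a, b = p_b, c = p_c)
dice[total <= 4, ]   # lists the favorable outcomes for part b)
```

Listing the favorable rows makes it easy to see which sums were counted.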

Q5) A bag contains 2 red, 3 green and 2 blue balls. Two balls are drawn at random. What is the probability that none of the balls drawn is blue?

Solution: Total number of balls = 2 + 3 + 2 = 7. Let S be the sample space. Then n(S) = number of ways of drawing 2 balls out of 7 = 7C2 = (7×6)/(2×1) = 21. Let E be the event of drawing 2 balls, none of which is blue. Then n(E) = number of ways of drawing 2 balls out of the 2 + 3 = 5 non-blue balls = 5C2 = (5×4)/(2×1) = 10. Therefore, P(E) = n(E)/n(S) = 10/21.
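
The counting argument maps directly onto choose() in R. The sketch below also checks it against the sequential calculation for drawing without replacement.

```r
n_total   <- choose(7, 2)   # ways to draw 2 of the 7 balls
n_no_blue <- choose(5, 2)   # ways to draw 2 of the 5 non-blue balls
p_no_blue <- n_no_blue / n_total

# Same probability, drawing one ball at a time without replacement
p_sequential <- (5 / 7) * (4 / 6)

c(counting = p_no_blue, sequential = p_sequential)
```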

Q6) Calculate the expected number of candies for a randomly selected child. Below are the probabilities of the count of candies for children (ignoring the nature of the child; generalized view).

CHILD   Candies count   Probability
A       1               0.015
B       4               0.20
C       3               0.65
D       5               0.005
E       6               0.01
F       2               0.120

Child A: probability of having 1 candy = 0.015. Child B: probability of having 4 candies = 0.20.

Solution: Expected number of candies for a randomly selected child = 1 * 0.015 + 4 * 0.20 + 3 * 0.65 + 5 * 0.005 + 6 * 0.01 + 2 * 0.12 = 0.015 + 0.8 + 1.95 + 0.025 + 0.06 + 0.24 = 3.09. So, the expected number of candies for a randomly selected child is 3.09.
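
The expected value is a probability-weighted sum, which can be sketched in R with the values from the table (vector names are illustrative):

```r
candies <- c(A = 1, B = 4, C = 3, D = 5, E = 6, F = 2)   # candy count per child
prob    <- c(0.015, 0.20, 0.65, 0.005, 0.01, 0.120)      # matching probabilities

stopifnot(isTRUE(all.equal(sum(prob), 1)))  # the probabilities must sum to 1

expected_candies <- sum(candies * prob)     # E[X] = sum of x * P(x)
expected_candies
```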

Q7) Calculate Mean, Median, Mode, Variance, Standard Deviation, Range & comment about the values / draw inferences, for the given dataset

  • For Points, Score, Weight Find Mean, Median, Mode, Variance, Standard Deviation, and Range and also Comment about the values/ Draw some inferences. Use Q7.csv file

Q8) Calculate the Expected Value for the problem below.

a) The weights (X) of patients at a clinic (in pounds) are 108, 110, 123, 134, 135, 145, 167, 187, 199. Assume one of the patients is chosen at random. What is the Expected Value of the Weight of that patient?

Solution: Expected Value = ∑ (probability * value) = ∑ x * P(x). There are 9 patients, so the probability of selecting each patient is 1/9.

x:    108, 110, 123, 134, 135, 145, 167, 187, 199
P(x): 1/9 for each value

Expected Value = (1/9)*108 + (1/9)*110 + (1/9)*123 + (1/9)*134 + (1/9)*135 + (1/9)*145 + (1/9)*167 + (1/9)*187 + (1/9)*199 = (1/9)*(108 + 110 + 123 + 134 + 135 + 145 + 167 + 187 + 199) = (1/9)*(1308) = 145.33

Expected Value of the Weight of that patient = 145.33 pounds

Q9) Calculate Skewness, Kurtosis & draw inferences on the following data Cars speed and distance Use Q9_a.csv

SP and Weight (WT) Use Q9_b.csv

Q10) Draw inferences about the following boxplot & histogram

Q11) Suppose we want to estimate the average weight of an adult male in Mexico. We draw a random sample of 2,000 men from a population of 3,000,000 men and weigh them. We find that the average person in our sample weighs 200 pounds, and the standard deviation of the sample is 30 pounds. Calculate the 94%, 96%, and 98% confidence intervals.

Solution:

The information given is:
  • Sample mean of 200 pounds.
  • Sample standard deviation of 30 pounds.
  • Sample size of 2,000.

The interval is: sample mean ± t × (sample standard deviation / √(sample size)) = 200 ± t × 30/√2000 = 200 ± t × 0.6708, in which t is the critical value for the two-tailed confidence interval with 2000 - 1 = 1999 df.

Considering a 94% confidence level, the critical value is t = 1.882, hence the 94% confidence interval is (198.74, 201.26).

Considering a 96% confidence level, the critical value is t = 2.055, hence the 96% confidence interval is (198.62, 201.38).

Considering a 98% confidence level, the critical value is t = 2.328, hence the 98% confidence interval is (198.44, 201.56).
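
The intervals can be computed with qt() in R. The sketch below takes n = 2000 from the problem statement, so the degrees of freedom are n - 1 = 1999; variable names are illustrative.

```r
x_bar <- 200    # sample mean (pounds)
s     <- 30     # sample standard deviation (pounds)
n     <- 2000   # sample size from the problem statement

se         <- s / sqrt(n)                              # standard error of the mean
conf_level <- c(0.94, 0.96, 0.98)
t_crit     <- qt(1 - (1 - conf_level) / 2, df = n - 1) # two-sided critical values

ci <- data.frame(level = conf_level,
                 t     = t_crit,
                 lower = x_bar - t_crit * se,
                 upper = x_bar + t_crit * se)
round(ci, 3)
```

With a sample this large the t critical values are very close to the standard normal ones, so qnorm() would give nearly the same intervals.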

Q12) Below are the scores obtained by a student in tests 34,36,36,38,38,39,39,40,40,41,41,41,41,42,42,45,49,56

  • Find mean, median, variance, standard deviation.

Solution:

Mean: Mean = sum of the terms / number of terms = (34+36+36+38+38+39+39+40+40+41+41+41+41+42+42+45+49+56)/18 = 738/18 = 41

Median: The scores are already in ascending order, and there are 18 of them, so the median is the average of the 9th and 10th terms. Median = (9th + 10th term)/2 = (40+41)/2 = 40.5

Variance: with mean m = 41,

Score (s)   s - m   (s - m)^2
34          -7      49
36          -5      25
36          -5      25
38          -3      9
38          -3      9
39          -2      4
39          -2      4
40          -1      1
40          -1      1
41           0      0
41           0      0
41           0      0
41           0      0
42           1      1
42           1      1
45           4      16
49           8      64
56          15      225
Sum          0      434

Variance = 434/17 = 25.53 (sample variance, dividing by n - 1 = 17)

Standard deviation: The standard deviation is the square root of the variance, so standard deviation = (25.53)^(1/2) = 5.05

  • What can we say about the student marks?
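
The summary statistics for the 18 scores listed in the question can be checked in base R. Note that var() and sd() use the sample (n - 1) denominator.

```r
scores <- c(34, 36, 36, 38, 38, 39, 39, 40, 40, 41, 41, 41, 41, 42, 42, 45, 49, 56)

score_summary <- c(mean     = mean(scores),
                   median   = median(scores),
                   variance = var(scores),   # sample variance, divides by n - 1
                   sd       = sd(scores))
round(score_summary, 2)

sum((scores - mean(scores))^2)  # sum of squared deviations behind the variance
```

A mean slightly above the median, together with the large gap up to 49 and 56, is the kind of evidence one might cite when commenting on the marks.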

Q13) What is the nature of skewness when the mean and median of the data are equal? Answer: Skewness refers to asymmetry, a departure from a symmetrical shape such as the bell curve of the normal distribution. If the curve is shifted to the left or to the right, it is said to be skewed. A symmetric distribution, of which the normal distribution is one example, has a skewness of zero, and its mean and median coincide. So when the mean and median of the data are equal, the data are typically symmetric, with skewness at or near zero. Equal mean and median do not by themselves imply a normal distribution.

Q14) What is the nature of skewness when mean > median? Answer: A positively skewed distribution has its longer tail on the right side, and its skewness is greater than zero. The large values in the right tail pull the mean above the median. So the skewness is typically positive when mean > median.

Q15) What is the nature of skewness when median > mean? Answer: A negatively skewed distribution has its longer tail on the left side, and its skewness is less than zero. The small values in the left tail pull the mean below the median. So the skewness is typically negative when median > mean.

Q16) What does a positive kurtosis value indicate for the data? Answer: Kurtosis describes how heavy the tails of a distribution are, and therefore how prone it is to outliers, relative to a normal distribution. Since a normal distribution has a kurtosis of 3, the excess kurtosis is calculated by subtracting 3 from the kurtosis: Excess kurtosis = Kurt – 3. It can be positive (leptokurtic distribution), negative (platykurtic distribution), or near zero (mesokurtic distribution). A leptokurtic distribution has heavier tails than the normal distribution, which means outliers are more likely. A positive value of excess kurtosis therefore indicates a distribution that is sharply peaked with thick tails. An extreme positive kurtosis indicates that more of the values lie in the tails than would be expected under a normal distribution.

Q17) What does a negative kurtosis value indicate for the data? Answer: With excess kurtosis defined as above (Excess kurtosis = Kurt – 3), a negative value corresponds to a platykurtic distribution. A platykurtic distribution has lighter tails than the normal distribution, so extreme values are less likely and most of the data points lie fairly close to the mean. It is also flatter (less peaked) when compared with the normal distribution.
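
To make the signs in Q13 to Q17 concrete, the sketch below computes moment-based skewness and excess kurtosis on simulated samples. The helper functions skewness and excess_kurtosis are defined here for illustration (base R does not provide them), and the samples are simulated, not taken from the assignment's data files.

```r
# Moment-based helpers (hypothetical names; not part of base R)
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
excess_kurtosis <- function(x) {
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

set.seed(5)
right_skewed <- rexp(5000)    # long right tail
flat         <- runif(5000)   # light tails, flatter than a normal

c(mean = mean(right_skewed), median = median(right_skewed),
  skewness = skewness(right_skewed))          # mean above median, positive skewness
c(skewness = skewness(flat),
  excess_kurtosis = excess_kurtosis(flat))    # near-zero skewness, negative excess kurtosis
```

The exponential sample shows the mean > median pattern of positive skew, while the uniform sample is symmetric and platykurtic.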

Q18) Answer the below questions using the below boxplot visualization.

What can we say about the distribution of the data? What is nature of skewness of the data? What will be the IQR of the data (approximately)?

Q19) Comment on the below Boxplot visualizations?

Draw an inference from the distribution of data for Boxplot 1 with respect to Boxplot 2.

Q 20) Calculate probability from the given dataset for the below cases

Dataset: Cars.csv
Calculate the probability of MPG of Cars for the below cases.
MPG <- Cars$MPG
a. P(MPG > 38)
b. P(MPG < 40)
c. P(20 < MPG < 50)

Q 21) Check whether the data follows normal distribution

a) Check whether the MPG of Cars follows Normal Distribution. Dataset: Cars.csv

b) Check whether the Adipose Tissue (AT) and Waist Circumference (Waist) from the wc-at data set follow Normal Distribution. Dataset: wc-at.csv

Q 22) Calculate the z scores for the 90%, 94%, and 60% confidence intervals.

Solution: A two-sided confidence interval at level 1 - α leaves α/2 in each tail, so the critical value is the z with upper-tail area α/2. For the 90% confidence interval, α = 0.10 and α/2 = 0.05, so z = 1.645. For the 94% confidence interval, α = 0.06 and α/2 = 0.03, so z = 1.881. For the 60% confidence interval, α = 0.40 and α/2 = 0.20, so z = 0.842. Therefore, the z score is 1.645 for the 90% confidence interval, 1.881 for the 94% confidence interval, and 0.842 for the 60% confidence interval.

Q 23) Calculate the t scores for the 95%, 96%, and 99% confidence intervals for a sample size of 25.

Solution: The sample size is n = 25, so the degrees of freedom are n − 1 = 25 − 1 = 24.
a) For the 95% confidence interval, α = 0.05, so we are interested in the quantity t(α/2) = t(0.025) for a t-distribution with 24 degrees of freedom. Using a t-table, the critical t-value is t(α/2) = 2.064.
b) For the 96% confidence interval, α = 0.04, and the critical t-value is t(α/2) = t(0.02) = 2.172.
c) For the 99% confidence interval, α = 0.01, and the critical t-value is t(α/2) = t(0.005) = 2.797.
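
Critical values like these can be read from tables or computed directly. The sketch below uses qnorm() for z scores at the 90%, 94%, and 60% levels and qt() for t scores with 24 degrees of freedom; each two-sided level 1 - α uses the quantile at 1 - α/2. Variable names are illustrative.

```r
# z critical values for two-sided confidence levels (standard normal)
z_levels <- c(0.90, 0.94, 0.60)
z_crit   <- qnorm(1 - (1 - z_levels) / 2)

# t critical values for a sample of size 25, i.e. 24 degrees of freedom
t_levels <- c(0.95, 0.96, 0.99)
t_crit   <- qt(1 - (1 - t_levels) / 2, df = 25 - 1)

round(setNames(z_crit, paste0(100 * z_levels, "%")), 3)
round(setNames(t_crit, paste0(100 * t_levels, "%")), 3)
```

A common slip is to pass the confidence level itself to qnorm() or qt(), which gives a one-sided quantile instead of the two-sided critical value.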

Q 24) A Government company claims that an average light bulb lasts 270 days. A researcher randomly selects 18 bulbs for testing. The sampled bulbs last an average of 260 days, with a standard deviation of 90 days. If the company's claim were true, what is the probability that 18 randomly selected bulbs would have an average life of no more than 260 days? Hint: in R, use pt(tscore, df), where df is the degrees of freedom.
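
Following the hint, a minimal sketch of the calculation in R: standardize the sample mean into a t score, then take the lower-tail probability with n - 1 degrees of freedom. Variable names are illustrative.

```r
mu0   <- 270   # claimed mean lifetime (days)
x_bar <- 260   # observed sample mean (days)
s     <- 90    # sample standard deviation (days)
n     <- 18    # number of bulbs tested

t_score <- (x_bar - mu0) / (s / sqrt(n))  # standardized distance from the claim
p_lower <- pt(t_score, df = n - 1)        # P(sample mean <= 260) if the claim is true

c(t_score = t_score, probability = p_lower)
```

Because the t score is negative, pt() is used without lower.tail = FALSE: the question asks for the probability of an average at or below 260 days.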

  • Jupyter Notebook 100.0%

IMAGES

  1. Assignment 1

    assignment 1 statistics

  2. Business Statistics Assignment 1

    assignment 1 statistics

  3. Assignment 1

    assignment 1 statistics

  4. Statistics assignment 1

    assignment 1 statistics

  5. Statistics Assignment Examples : Introduction

    assignment 1 statistics

  6. Stats assignment 1

    assignment 1 statistics

COMMENTS

  1. Assignment #1 Descriptive Statistics Data Analysis Plan

    STAT200: Written Assignment #1 - Descriptive Statistics Data Analysis Plan - Instructions. Step #3: Complete the "Assignment #1: Descriptive Statistics Data Analysis Plan Template.". Remember, you will not be conducting any statistical analysis, drawing any graphs, or compiling any tables for the first assignment.

  2. STAT 200 Assignment 1 Discriptive Statistics Data Analysis Plan

    University of Mar yland Globa l Campus. STAT200 - Assignmen t #1: Descript ive Statistics Data Analysis Plan. Identifying Info rmation. Student (Fu ll Name): Willia ms, Tara. Class: STAT 20 0 7375. Instructor: Prof. San doval. Date: 3/29/2021. Scenario: The sample data has been gathered from 30households across from the US Department of.

  3. STA 201 : PRINCIPLES OF STATISTICS

    Assignment _ 1 (2)stat. 1 STA-201-GS Principles Of Statistics Assignment #1 Exercise 1.1 (pg 9-10) 1.6) Causation. 1.8) This study is Descriptive. Its the Average of the salaries in each major league for the years 1993 and 2003. 1.10) This study is Inferential. This survey shows

  4. Assignment 1

    Assignment 1 #. Assignment 1. #. You may discuss homework problems with other students, but you have to prepare the written assignments yourself. Please combine all your answers, the computer code and the figures into one PDF file, and submit a copy to your folder on Gradescope. Grading scheme: 10 points per question, total of 40.

  5. Assignment 1: Descriptive statistics

    Part 1: Reproduce a table. Haun et al. (2019) has published raw data together with their paper. This means that we will be able to reproduce some of their results, exciting! In this assignment you are expected to reproduce Table 1 in (Haun et al. 2019).To get you started, I have included the code needed to download the data below (from the lesson on importing data).

  6. Statistics and Probability

    Learn statistics and probability—everything you'd want to know about descriptive and inferential statistics.

  7. 1.8 Assignment 1

    1.8 Assignment 1. Purposes: This assignment has two parts. The first part assesses your knowledge of the concepts of sample, population, and descriptive vs. inferential studies, and of using graphs to summarize data. The second part assesses your skills in using R Commander to create the graphs and interpret their meaning. Resources: M01_SaleHome.xlsx ...

  8. Assignment 1 statistics

    Course name: Health Science Statistics. Course code: HSS511S. Due date: 24/03/2023. Paper: Theory. Duration: 2 weeks. Marks: 50. Assignment 1. Examiner: Mr. JJ Swarts and Mr. S Kashihalwa. Moderator: Dr L Aku-Akai. Instructions: 1. For all the questions show clearly all the steps used in the calculations. 2.

  9. Mathematical Statistics Assignment 1

    Assignments. PDF, 139 kB. Mathematical Statistics Assignment 1.

  10. Assignment #1 Descriptive Statistics Data Analysis Plan Instructions

    Note: This first assignment is a plan only; no statistics will be calculated or graphs created. The second assignment will involve carrying out the plan, after receiving feedback from your instructor. Assignment Steps: Step #1: Review the STAT200 data set file. (Note: This data set will be used for all three of this term's written assignments).

  11. Assignment 1 RSCH7864.doc

    Histogram and Descriptive Statistics. Charleen Thibodeaux, Capella University. RSCH7864 Quantitative Design and Analysis, Assignment 1, April 2021. Graph: A normal distribution would be data distributed symmetrically around the center of all scores, as if we drew a vertical line ...

  12. Statistics Assignment 1

    STA10003 Foundations of Statistics Assignment, Part 1. Semester / Study Period: Study Period 1. Year: 2020. Unit code: STA. Unit name: Foundations of Statistics. Assignment number: Part 1, worth 20% of your final mark for STA. Your name: Student number: Date submitted: Please retain a hard copy of your assignment as you will be required to resubmit it if requested

  13. Assignment 1 Statistics

    Assignment 1 Grades. You can view your grade on Assignment 1 by clicking on your name on the students page. ... The following table lists the overall statistics for this assignment. We have listed the statistics separately for each grader, since the grading of the game play tended to be a bit subjective. See the top of your grade sheet to see ...

  14. Assignment 2 (docx)

    Topic 2-3: Descriptive Statistics. You'll find the relevant data sets in the assignment folder. Please note: there are two worksheets contained in the Excel file "Assignment # 1." 4. The worksheet "Transportation Costs" contains the amount (rounded to the nearest dollar) that fifty randomly selected Kwantlen students spent on transportation on September 22, 2017.

  15. Assignment 1 Statistics

    CS 248: Introduction to Computer Graphics, Pat Hanrahan. Assignment 1 Grades. You can view your grade on Assignment 1 by clicking on your name on the students page. Clicking on the Assignment 1 link on your list of grades will bring up a grading sheet, detailing where you lost points (if you did) and for what. The following table lists the overall statistics for this assignment.

  16. RSCH7864 WEEK 2 ASSIGNMENT 1 DESCRIPTIVE STATISTICS.docx

    RSCH7864 Week 2 Assignment 1: Descriptive Statistics. Part 1: Both graphs appear to be symmetrical. The lower division visually looks more symmetrical. The mean, median, and mode will probably fall within the middle. Lower division is a regular bell-shaped graph with a tail off to the left, which makes it negatively skewed. Mean will fall closer to the left with the median in the middle and the ...

  17. ratan8932/Assignment--Basic-Statistics-Level-1

    Q 23) Calculate the t-scores for the 95%, 96%, and 99% confidence intervals for a sample size of 25. Solution: a) The sample size is n = 25, so the degrees of freedom are n − 1 = 25 − 1 = 24. Thus, we are interested in the quantity t(α/2) = t(0.05/2) = t(0.025) for a t-distribution with 24 degrees of freedom.

  18. Assignment 1

    Assignment 12 - statistics. Assignment 2 - statistics. Assignment 3 - statistics. Assignment 4 - statistics. Chapter Seven Power Point.

  19. Business Statistics 1

    Assignment 1 - Individual Case Study Analysis. ECON1193 - Business Statistics 1. Student name: Samiksha Mathur. Student number: s. Location: RMIT Vietnam, HN. Lecturer: Pham Thi Minh Thuy. Class number: 1. Total pages: 7 (excluding cover page and reference list). A. Descriptive Data Analysis