@dan815 can i get your help with this please?
@robtobey can i get your help with this please?
What are your results for the mean and the standard deviation of each data set?
Data Set 1: Minimum = 2.29816, Q1 = 2.29869, Median = 2.29915, Q3 = 2.30074, Maximum = 2.30182, Standard deviation = 0.0013130135, Mean = 2.299706
Data Set 2: Minimum = 2.30956, Q1 = 2.309935, Median = 2.3101, Q3 = 2.31026, Maximum = 2.31163, Standard deviation = 0.00057, Mean = 2.310216667
@kropot72
The formula is IQR = Q3 - Q1, so I computed 2.30074 - 2.29869 to find the interquartile range of the first data set and 2.31026 - 2.309935 to find the interquartile range of the second.
IQR of data set 1 = 0.00205
IQR of data set 2 = 0.000325
The interval of values that defines the IQR in data set 1 is [2.29869, 2.30074], and the interval that defines the IQR in data set 2 is [2.309935, 2.31026]. The interquartile range describes the 'interval of values' in which the central 50% of the probability is concentrated; the IQR itself is the width of that interval.
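The subtraction above can be checked with a short Python sketch. It uses only the quartiles quoted in this thread; the helper name `iqr` and the choice to round to six decimal places (to hide floating-point noise such as 0.002049999999999663) are my own assumptions, not part of the original working.

```python
# Quartiles as reported earlier in this thread.
q1_set1, q3_set1 = 2.29869, 2.30074
q1_set2, q3_set2 = 2.309935, 2.31026


def iqr(q1, q3):
    """Interquartile range: width of the interval holding the middle 50% of values."""
    return q3 - q1


# Rounding only tidies binary floating-point residue; it does not change the result.
iqr_set1 = round(iqr(q1_set1, q3_set1), 6)
iqr_set2 = round(iqr(q1_set2, q3_set2), 6)

print("IQR of data set 1:", iqr_set1)
print("IQR of data set 2:", iqr_set2)
```

If the raw measurements were available, the quartiles themselves could be recomputed as well, but different tools use slightly different quartile conventions, so small differences from the figures above would not necessarily indicate an error.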
"Do the samples appear to be from the same population? " Taking the mean of the first set and adding 3 standard deviations gives an approximate upper limit of the spread of values of 2.304. Taking the mean of the second set and subtracting 3 standard deviations gives an approximate lower limit of the spread of values of 2.308. Therefore there appears to be no overlap in the estimated distribution of values, indicating that the parent populations are different.
Oh ok, so they aren't from the same population because there's no overlap. Ok, thanks so much! @kropot72
In attempting to decide whether the two sets of results derive from the same population, we must essentially decide whether the two data sets appear to follow the same distribution or not. We can do this by looking at the parameters associated with that distribution and comparing the figures we get for the two data sets.
So, first we must decide what type of distribution our shared population might follow. We don't know anything about our population, so we must try to provide a 'model' that we could say it approximately follows. A good one to start with (as it's probably the most common) is a Normally distributed population (I haven't done any of the calculations, so I'm not sure whether this is the correct one or not). I'll not go through all the aspects associated with this distribution, but essentially we say that if we plotted the entire population of values on an axis, it would follow a bell-shaped curve, centered at some mean value and with some standard deviation about the mean. As the shape is symmetric, the population mean would be equal to the population median. Here's what it would look like: http://www.globalspec.com/reference/69565/203279/11-5-the-normal-gaussian-distribution
If we were to plot the values in the two data sets individually (e.g. as a boxplot or histogram), we would probably find that they do not look exactly like this smooth curve. This is because we have a relatively small sample from a population in each case, so it's difficult to make a precise judgement compared to a population model which would theoretically contain infinitely many values.
So, perhaps the best 'numerical' way of identifying whether they derive from the same population is to check whether the 'Chemical' and 'Atmospheric' data sets have similar sample mean and sample standard deviation values (which would suggest that they are random draws from a Normal population with a similar mean and standard deviation).
In doing this, we essentially use these sample statistics to estimate the parameters of the population that each data set derives from. If the estimates are similar, then there would be good reason to believe that the samples are from the one Normally distributed population. So, the first thing to do is to calculate the mean and standard deviation of each data set.
There are many other approaches, such as calculating and comparing 95% confidence intervals for the population means of both Chemical and Atmospheric, or performing hypothesis tests. I'm not going to go into these methods as they're a bit specialised, but if you have used them on similar problems before, then perhaps you could use them here too.
What I'd suggest is as follows:
- Compare the mean and median values within each of the two samples; if they are in good agreement with one another, then the population that the sample derives from is plausibly Normal.
- Compare the means and standard deviations of the two data sets; if they appear to be in good agreement, then the populations are most likely the same. If not, then the samples derive from two distinct Normal distributions whose population parameters are not the same.
Hope that helps!
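The two comparisons suggested above can be sketched in Python using the summary statistics posted earlier in this thread. Expressing each gap in units of the standard deviation is my own assumption for making "good agreement" concrete; the post does not specify a threshold, and the dictionary keys simply reuse the "Data Set 1" and "Data Set 2" labels from the posted figures.

```python
# Summary statistics as reported earlier in this thread.
summary = {
    "Data Set 1": {"mean": 2.299706, "median": 2.29915, "sd": 0.0013130135},
    "Data Set 2": {"mean": 2.310216667, "median": 2.3101, "sd": 0.00057},
}


def mean_median_gap_in_sd(stats):
    """Distance between mean and median, measured in sample standard deviations."""
    return abs(stats["mean"] - stats["median"]) / stats["sd"]


# Step 1: within each sample, a small gap is consistent with a symmetric shape.
gaps = {name: mean_median_gap_in_sd(stats) for name, stats in summary.items()}

# Step 2: between samples, compare the difference in means with the larger sd.
mean_gap = abs(summary["Data Set 1"]["mean"] - summary["Data Set 2"]["mean"])
larger_sd = max(stats["sd"] for stats in summary.values())
mean_gap_in_sd = mean_gap / larger_sd

for name, gap in gaps.items():
    print(f"{name}: mean and median differ by {gap:.2f} sd")
print(f"Means differ by {mean_gap_in_sd:.1f} times the larger sd")
```

With these figures, the mean and median of each sample sit within half a standard deviation of each other, while the two means are several standard deviations apart, which agrees with the no-overlap conclusion reached earlier in the thread.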
You're welcome :)