The first part of this assignment compares two cycling teams and uses descriptive statistics to decide which team to invest in. The two teams are Team Astana and Team Tobler, and the times from their last race are listed below in table 1.
| table 1. Team Astana and Team Tobler individual race times in minutes |
With this data, several descriptive statistics can be calculated to get a better idea of what the race times show. The statistics applied here are range, mean, median, mode, kurtosis, skewness, and standard deviation. Each term is defined below and then applied to the data.
- Range: This is the spread between the extremes of the data. It is found by subtracting the lowest value from the highest value.
- Mean: This is the average of the data. It is found by adding all the values together and dividing by the total number of values. Because every value contributes to the sum, the mean is heavily influenced by outliers.
- Median: This is the middle value of the data set once the values are sorted from lowest to highest. It differs from the mean because it depends on position in the ordered data rather than on the size of each value. If there is an odd number of values, the median is the middle value; if there is an even number of values, the two middle values are added together and divided by 2. This measure is also more resistant to outliers than the mean.
- Mode: This is the most common value in the data set. A value must occur at least twice for the data set to have a mode.
- Kurtosis: This refers to the shape of the graphed data. Kurtosis is a measure of how peaked or flat the graph is compared to a normal distribution, which is treated as 0. A positive kurtosis means the graph is relatively peaked, which is called leptokurtic. A negative kurtosis means the graph is relatively flat, which is called platykurtic.
- Skewness: This, like kurtosis, refers to the shape of the graphed data. Skewness is a measure of how symmetrical the graph is. A value between -1 and 1 is generally treated as acceptably close to symmetric. If skewness is positive, the graph has a longer tail to the right, meaning large outliers are affecting the data and pulling the mean above the bulk of the values. If skewness is negative, the graph has a longer tail to the left, meaning small outliers are affecting the data and pulling the mean below the bulk of the values.
- Standard Deviation: This measures how spread out the data is around the mean. For normally distributed data, nearly all observations fall within three standard deviations on either side of the mean, and these standard deviations fall at equal intervals from the mean. About 68% of all observations fall within one standard deviation of the mean, about 95% fall within two standard deviations, and about 99.7% fall within three standard deviations. If the graph is flatter, the data is more spread out and the standard deviation is larger; if the graph is more peaked, the data is closer together and the standard deviation is smaller. A population standard deviation is found by taking the difference between each individual observation and the mean, squaring each of those differences, and adding the squares together. That sum is then divided by the total number of observations, and the square root of the result is the standard deviation. If the data is only a sample and the whole population is not known, the same steps are followed except that the sum of the squared differences is divided by the total number of observations minus one.
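The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `example_times` values are made up and are not the race times from table 1, the `describe` function is a hypothetical helper, and skewness and kurtosis are computed with the simple moment-based formulas, which may differ slightly from the sample-adjusted versions that spreadsheet or statistics software reports.

```python
import math
from collections import Counter


def describe(times):
    """Return the descriptive statistics defined above for a list of numbers."""
    n = len(times)
    ordered = sorted(times)
    mean = sum(times) / n

    # Median: middle value, or the average of the two middle values.
    mid = n // 2
    if n % 2 == 1:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2

    # Mode: most common value(s), only if a value occurs at least twice.
    counts = Counter(times)
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top and c > 1)

    # Central moments: averages of the deviations raised to a power.
    deviations = [t - mean for t in times]
    sum_sq = sum(d ** 2 for d in deviations)
    m2 = sum_sq / n
    m3 = sum(d ** 3 for d in deviations) / n
    m4 = sum(d ** 4 for d in deviations) / n

    return {
        "range": max(times) - min(times),
        "mean": mean,
        "median": median,
        "modes": modes,
        "population_sd": math.sqrt(m2),          # divide by n
        "sample_sd": math.sqrt(sum_sq / (n - 1)),  # divide by n - 1
        "skewness": m3 / m2 ** 1.5,              # moment-based (assumption)
        "excess_kurtosis": m4 / m2 ** 2 - 3,     # 0 for a normal distribution
    }


# Hypothetical values for illustration only, not the data from table 1.
example_times = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(example_times)
for name, value in summary.items():
    print(name, value)
```

For these made-up values the skewness comes out positive, because the single large value (9) stretches the right tail, and the sample standard deviation is slightly larger than the population standard deviation because it divides by one fewer observation.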
| table 2. statistics applied to team Astana and team Tobler race times (time in hours, minutes) |
| figure 1. standard deviation calculations by hand for team Astana and team Tobler |
Part 2:
The second part of the assignment looks at geographic mean centers and weighted geographic mean centers in Wisconsin. A mean center takes the coordinates (X, Y) for a series of points and finds the mean (average) of the X values and the Y values separately. Once the mean X and mean Y are found, the new coordinate pair can be plotted to show the mean center. A weighted mean center works the same way, except that each point counts in proportion to a weight, here the county population. For this assignment, the first mean center was calculated for all the counties in Wisconsin, using the geographic center of each county as its coordinates. This is shown by the green dot on the map below (map 1). The second mean center was weighted by county population in 2000 and is shown by the purple dot on the map below (map 1). The third mean center was weighted by county population in 2015 and is shown by the blue dot on the map below (map 1).
The geographic mean center is the green dot and represents the unweighted spatial center of Wisconsin based on its counties. It falls near the center of the state, as expected. The next point is the mean center weighted by county population in 2000. This dot is shifted far to the southeast, because the high population of Milwaukee County weights that county much more heavily than the other counties and drags the dot toward it. The last point is the mean center weighted by county population in 2015. Here, the dot shifts to the west and a little to the north. This is either because Milwaukee is losing some of its population or because more people are moving to west-central Wisconsin, around Eau Claire and the areas close to Minneapolis.
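The mean center and weighted mean center calculations described above can be sketched as follows. This is a minimal illustration, not the GIS tool used for the map: the `mean_center` function is a hypothetical helper, and the four `centroids` and their `populations` are made-up values, not Wisconsin county data.

```python
def mean_center(points, weights=None):
    """Return the (x, y) mean center of points, optionally weighted."""
    if weights is None:
        # Unweighted case: every point counts equally.
        weights = [1.0] * len(points)
    total = sum(weights)
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return x, y


# Hypothetical county centroids on a 4 x 4 grid (illustration only).
centroids = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
# Hypothetical populations: the third county is far more populous.
populations = [100, 100, 700, 100]

unweighted = mean_center(centroids)
weighted = mean_center(centroids, populations)
print(unweighted)
print(weighted)
```

In this toy example the unweighted center sits in the middle of the four points, while the weighted center is pulled toward the most populous point, which is the same effect Milwaukee County has on the population-weighted dots.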

