# There is a possible problem with the data for

__REPORT ON TA1 AND TA2__

Chart 1 shows the profile of expenditure for the two attractions. We can see that whilst TA2 closely follows a normal distribution (no evidence of skewness, symmetry in the tails), the distribution of TA1 is heavily skewed, with most of its mass on the left of the chart. In other words, the expenditure of households at TA1 is concentrated towards the lower end of the expenditure range (a positive skew, similar to the probability density function of the chi-squared distribution).

Next, we examine the data in more detail. We can see from the following table that the average spend at the two attractions is similar, although TA1 does break the psychological £50 barrier.

**Measures of the Average**

The symmetry of the distribution of TA2 is confirmed by the fact that the mean, mode and median are all fairly close. By contrast, the mode for TA1 is £25 (the data for TA1 was converted into the same categories as TA2 to enable a direct comparison to be made between the two sets of data), which demonstrates that lower spending households are more prevalent at TA1. For TA1, therefore, the mean is not particularly useful, since it hides the fact that there are significant numbers of both low and high spending households.

**Measures of Dispersion**

The most useful way to compare measures of dispersion in two distributions is the coefficient of variation (CV), which is the standard deviation divided by the mean. We can see that the CV of TA2, at 0.36, is rather lower than that of TA1. This supports the view that TA1 has high numbers of observations away from the mean.
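As a minimal sketch of the coefficient of variation, the Python below divides the sample standard deviation by the mean. The function name and the two short lists of spends are hypothetical and are not the TA1 or TA2 data; they only show that a more spread-out sample with the same mean gives a higher CV.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical spends in pounds, both with a mean of 50 (not the report's data).
tight_spend = [40, 45, 50, 55, 60]
wide_spend = [10, 20, 50, 80, 90]

print(round(coefficient_of_variation(tight_spend), 2))
print(round(coefficient_of_variation(wide_spend), 2))
```

Because the CV is a ratio, it allows the spread of two attractions to be compared even when their mean spends differ.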

A similar conclusion is supported by the semi-interquartile range (SIQR); at 19.25, the SIQR of TA1 is somewhat higher than that of TA2, which is 15.
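The semi-interquartile range is half the distance between the upper and lower quartiles. A short sketch follows; the quartile values passed in are hypothetical, chosen only so that the result matches the 19.25 reported for TA1, and are not the actual quartiles from the data.

```python
def semi_interquartile_range(lower_quartile, upper_quartile):
    """Return half the distance between the upper and lower quartiles."""
    if upper_quartile < lower_quartile:
        raise ValueError("upper quartile must not be below the lower quartile")
    return (upper_quartile - lower_quartile) / 2


# Hypothetical quartiles in pounds (assumed for illustration only).
example_siqr = semi_interquartile_range(30.0, 68.5)
print(example_siqr)  # 19.25
```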

**Cumulative and Percentage Frequency of TA1 and TA2**

The following table shows the cumulative and percentage distributions for the two TAs. For TA1, the data was converted into the same categories as TA2.

Chart 2 compares the percentage cumulative frequencies of the two TAs.

We can see from chart 2 that the ogive for TA2 is steeper and also reaches 100% somewhat quicker than that for TA1. This confirms not only that TA1 has a wider range of values but also that it has significant numbers of low and high values.

The quartiles may be read off this chart by locating the 25%, 50% and 75% points on the Y axis and reading the corresponding expenditure values off the X axis.
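Reading a quartile off an ogive amounts to finding where the curve crosses a given cumulative percentage. The sketch below does this by linear interpolation between plotted points. The function name, class boundaries and cumulative percentages are all hypothetical and do not come from chart 2.

```python
def read_off_ogive(upper_bounds, cumulative_pct, target_pct):
    """Interpolate the spend at which the ogive reaches target_pct."""
    points = list(zip(upper_bounds, cumulative_pct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Find the segment of the curve that spans the target percentage.
        if y0 <= target_pct <= y1 and y1 > y0:
            return x0 + (target_pct - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target percentage is outside the plotted range")


# Hypothetical ogive: class boundaries in pounds and the cumulative
# percentage of households spending up to each boundary.
bounds = [20, 30, 40, 50, 60, 70, 80]
cum_pct = [0, 10, 30, 60, 85, 95, 100]

lower_quartile = read_off_ogive(bounds, cum_pct, 25)
upper_quartile = read_off_ogive(bounds, cum_pct, 75)
print(lower_quartile, upper_quartile)  # 37.5 56.0
```

Straight-line interpolation assumes that spends are spread evenly within each class, which is the same assumption made when drawing the ogive with straight segments.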

**Data problem with TA1?**

There is a possible problem with the data for TA1 in particular. Given that the figures presented include entrance fees, we can conclude that the entrance fee per household is no more than £30 (over 40 households spend £30 or less at TA1, including entrance).

However, we can see that 14 households, about 10% of the total, spend over £80 at TA1. This translates to discretionary spend (food, drinks, souvenirs) of over £50 per household, and this seems very high. A breakdown of total expenditure into its elements, such as entrance fee, is necessary to ensure that the data collected looks reasonable.
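The arithmetic behind this check can be written out as a short sketch. It assumes the TA1 sample is the 144 rows in the range B2:B145 used by the formulae later in this report, and it treats £30 as the maximum entrance fee; the function and variable names are hypothetical.

```python
def implied_discretionary_spend(total_spend, max_entrance_fee):
    """Return the minimum spend left over once the entrance fee is removed."""
    return total_spend - max_entrance_fee


households_total = 144  # assumption: one household per row in B2:B145
high_spenders = 14      # households spending over 80 pounds at TA1

share_of_high_spenders = high_spenders / households_total
print(round(share_of_high_spenders * 100, 1))  # 9.7
print(implied_discretionary_spend(80, 30))     # 50
```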

__Overall Conclusions and Recommendations__

Whilst the average spend for TA1 is a little higher than TA2, the major points to highlight are:

- a skewness in expenditure at TA1; we need to investigate why some people are spending large amounts and on what, whilst the bulk of visitors spend relatively little;

- we have no idea so far how spend is divided into entrance fee, entrance to special exhibits inside the attraction (e.g. specific rides in an amusement park, occasional art exhibitions within a museum), and discretionary spending such as food, drink, souvenirs, etc.

At the very least, we should obtain the entrance fee tariff for each TA so we can ensure we are comparing like with like. Furthermore, if the two TAs are very different in nature (e.g. London Eye versus Tower of London), then the comparison between the two TAs is of limited use.

__Formulae__

**Formulae used – TA1**

Mean: =AVERAGE(B2:B145)

Median: =MEDIAN(B2:B145)

St dev.: =STDEV(B2:B145)

The data was sorted by using the Data, Sort command on the range. Then, the quartiles were calculated as follows:

Lower quartile: average of 37^{th} and 38^{th} values

Upper quartile: average of 108^{th} and 109^{th} values
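For readers working outside Excel, the TA1 calculations can be sketched in Python as follows. The helper name is hypothetical, and the list of spends is a placeholder of 144 values standing in for B2:B145, not the real data. `statistics.stdev` is the sample standard deviation, which is also what Excel's STDEV returns.

```python
import statistics


def average_of_ranked_pair(values, rank):
    """Average the rank-th and (rank + 1)-th smallest values (1-based)."""
    ordered = sorted(values)
    return (ordered[rank - 1] + ordered[rank]) / 2


# Placeholder data: 144 values standing in for the spends in B2:B145.
spends = [float(v) for v in range(1, 145)]

mean = statistics.mean(spends)            # like =AVERAGE(B2:B145)
median = statistics.median(spends)        # like =MEDIAN(B2:B145)
std_dev = statistics.stdev(spends)        # like =STDEV(B2:B145)
lower_quartile = average_of_ranked_pair(spends, 37)
upper_quartile = average_of_ranked_pair(spends, 108)

print(mean, median, lower_quartile, upper_quartile)  # 72.5 72.5 37.5 108.5
```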

**Histogram calculations**

To convert TA1 data into a format similar to TA2, the following commands were used:

Tools, Data Analysis, Histogram

The bin values were assigned in cells I4 to I16 of sheet 2. The initial values assigned were 20, 30, 40 and so on. So, for example, any values between 20.01 and 30.00 would be assigned to the “30” bin. For presentation purposes, these bin values were converted, so “30”, for example, becomes £20-£29.99 (the same as “20 but not more than 30”).
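The binning rule described above, where each bin value acts as an inclusive upper limit, can be sketched in Python. The function names are hypothetical. The 13 bin values from 20 to 140 are an assumption based on the cell range I4 to I16 and the stated sequence "20, 30, 40 and so on", and the "More" label for values above the last bin mirrors the overflow row that the Excel Histogram tool produces.

```python
from bisect import bisect_left
from collections import Counter


def assign_to_bin(value, bin_upper_limits):
    """Return the upper limit of the bin holding value, or "More"."""
    # bisect_left finds the first limit that is >= value, so a value equal
    # to a limit falls into that bin (limits must be sorted ascending).
    index = bisect_left(bin_upper_limits, value)
    if index < len(bin_upper_limits):
        return bin_upper_limits[index]
    return "More"


def histogram(values, bin_upper_limits):
    """Count how many values fall into each bin."""
    return Counter(assign_to_bin(v, bin_upper_limits) for v in values)


bins = list(range(20, 150, 10))  # assumed: 20, 30, ..., 140 (13 bins)

print(assign_to_bin(20.01, bins))  # 30
print(assign_to_bin(30.00, bins))  # 30
print(assign_to_bin(30.01, bins))  # 40
```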

**Formulae used – TA2**

To calculate the mean and standard deviation for TA2, classical formulae were used and the results generated in Excel. For the mean, the formula used was:

$$\text{mean} = \frac{\sum f x}{\sum f}$$

Where f is the frequency and x is the midpoint of the range. The Excel formula used to calculate the mean was:

=SUMPRODUCT(D4:D11, E4:E11)/SUM(E4:E11)

(see sheet 3).
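The same grouped-data mean can be sketched in Python, where the sum of products plays the role of SUMPRODUCT. The function name is hypothetical, and the midpoints and frequencies are illustrative values for eight classes, not the TA2 figures from sheet 3.

```python
def grouped_mean(midpoints, frequencies):
    """Return sum(f * x) / sum(f) for grouped data."""
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("frequencies must not sum to zero")
    weighted_sum = sum(x * f for x, f in zip(midpoints, frequencies))
    return weighted_sum / total_frequency


# Hypothetical class midpoints (pounds) and household counts.
midpoints = [15, 25, 35, 45, 55, 65, 75, 85]
frequencies = [10, 30, 60, 90, 60, 30, 15, 5]

print(round(grouped_mean(midpoints, frequencies), 2))  # 46.17
```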

Meanwhile, the formula for the variance, and hence for its square root, the standard deviation, is as follows:

$$\text{standard deviation} = \sqrt{\frac{\sum f (x - \text{mean})^{2}}{\sum f}}$$

The Excel formula used was identical to that for the mean, except that the numbers in the (x - mean)^{2} column rather than the x column were used. Once the variance was calculated, I took the square root (=SQRT(variance)) to find the standard deviation.
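A minimal Python sketch of the same two-step calculation follows: first the frequency-weighted variance, then its square root. The function name and the small example inputs are hypothetical. Note that this divides by the total frequency, as in the formula above, rather than by the total frequency minus one.

```python
import math


def grouped_std_dev(midpoints, frequencies):
    """Return sqrt(sum(f * (x - mean)^2) / sum(f)) for grouped data."""
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("frequencies must not sum to zero")
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / total_frequency
    # Same weighting as the mean, applied to the squared deviations.
    variance = sum(
        f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)
    ) / total_frequency
    return math.sqrt(variance)


# Hypothetical two-class example: mean is 17.5 and variance is 18.75.
print(round(grouped_std_dev([10, 20], [1, 3]), 2))  # 4.33
```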

The mode, median and quartiles were all calculated manually. For the mode, the largest class is clearly £40-£49.99, hence £45. The median, meanwhile, is the average of the 150^{th} and 151^{st} values; both of these lie within the same £40-£49.99 class. The quartiles were determined using the same logic as the median (which is simply the second quartile).
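The manual steps for locating the modal class and the class containing the median can be sketched as below. The function names are hypothetical and the frequencies are illustrative, chosen to total 300 households so that the 150^{th} and 151^{st} values are the middle pair; they are not the actual TA2 frequencies.

```python
def modal_class_index(frequencies):
    """Return the index of the class with the largest frequency."""
    return max(range(len(frequencies)), key=lambda i: frequencies[i])


def class_containing_rank(frequencies, rank):
    """Return the index of the class holding the rank-th value (1-based)."""
    running_total = 0
    for index, frequency in enumerate(frequencies):
        running_total += frequency
        if rank <= running_total:
            return index
    raise ValueError("rank exceeds the total frequency")


# Hypothetical classes of width 10 starting at 10 pounds, and their counts.
class_lower_bounds = [10, 20, 30, 40, 50, 60, 70, 80]
frequencies = [10, 30, 60, 90, 60, 30, 15, 5]

modal_index = modal_class_index(frequencies)
print(class_lower_bounds[modal_index] + 5)      # 45, the modal class midpoint
print(class_containing_rank(frequencies, 150))  # 3
print(class_containing_rank(frequencies, 151))  # 3
```

The quartile classes can be found the same way by passing the relevant ranks to `class_containing_rank`.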