Correlation analysis is used to detect whether two continuous variables share a linear relationship. Correlation is measured by the correlation coefficient (ρ), which quantifies how strongly, and in which direction, one variable changes in relation to the other. The relationship can be positive, negative, or absent altogether. A second statistical method, significance analysis, can then be carried out to provide a significance value, e.g. a p value, which is used to test the null hypothesis.
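To make the coefficient more concrete, the sketch below computes it by hand and compares the result with R's built-in cor function. It assumes the Pearson coefficient, which is the default method in R. The vectors x and y are made-up values chosen only for illustration; they are not the rice production data.

```r
# Hypothetical example values (not the rice production data)
x <- c(1.2, 2.4, 3.1, 4.8, 5.0, 6.3)
y <- c(2.0, 2.9, 3.7, 5.1, 5.6, 7.2)

# Pearson's coefficient: co-variation of x and y, scaled by the
# variation of each variable so the result lies between -1 and 1
rho_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Built-in equivalent; the default method is "pearson"
rho_builtin <- cor(x, y)

print(c(manual = rho_manual, builtin = rho_builtin))
```

Both values should agree, which shows that the coefficient is simply a scaled measure of how the two variables vary together.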

The R function used for correlation analysis is cor.test. This function requires two vectors of the same length, e.g. cor.test(x, y). The expression cor.test(x, y)$estimate returns the correlation coefficient (ρ). If the correlation coefficient is positive, the two variables are positively correlated; if it is negative, they are negatively correlated. The p value is returned by cor.test(x, y)$p.value. If this p value is less than the chosen critical p value (0.05 is commonly used), the null hypothesis can be rejected, as the two variables display a significant correlation. However, if the p value is greater than the critical p value, there is no significant correlation between the two variables and the null hypothesis cannot be rejected. To recall the p value and correlation coefficient, I can store the result in an object, model <- cor.test(x, y), and then enter ‘model’ into the console when required.
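The workflow described above might look roughly like the following. This is a minimal sketch: the vectors x and y hold invented values for illustration only, and the names rho, p and alpha are my own choices rather than anything required by R.

```r
# Hypothetical vectors of equal length (illustrative values only)
x <- c(10, 12, 15, 17, 20, 24, 25, 29)
y <- c(31, 35, 34, 40, 45, 44, 52, 55)

# Run the test once and store the result so it can be recalled later
model <- cor.test(x, y)

rho <- unname(model$estimate)  # correlation coefficient
p   <- model$p.value           # p value

alpha <- 0.05                  # critical p value
if (p < alpha) {
  message("Reject H0: significant correlation (rho = ", round(rho, 3), ")")
} else {
  message("Cannot reject H0: no significant correlation")
}

# Printing the stored object shows the full test summary
print(model)
```

Storing the result in one object avoids running the test twice just to read off the coefficient and the p value separately.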

The null hypothesis of the correlation analysis (H0) states that there is no correlation between the production of rice in the two countries, Bangladesh and Pakistan. The null hypothesis is therefore H0: ρ = 0, where ρ is the correlation coefficient between rice production in the two countries, which equals 0 when there is no correlation.
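For the default Pearson method, cor.test assesses H0: ρ = 0 using a t statistic with n - 2 degrees of freedom. The sketch below reproduces that p value by hand, assuming a two-sided test. The data are invented for illustration, and the names t_stat, p_manual and p_builtin are hypothetical.

```r
# Hypothetical example values (not the rice production data)
x <- c(3, 5, 6, 8, 11, 12, 15)
y <- c(4, 4, 7, 9, 9, 13, 14)

n <- length(x)
r <- cor(x, y)

# Under H0 (rho = 0) this statistic follows a t distribution with n - 2 df
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# Two-sided p value: probability of a statistic at least this extreme
p_manual <- 2 * pt(-abs(t_stat), df = n - 2)

# Value reported by cor.test for comparison
p_builtin <- cor.test(x, y)$p.value

print(c(manual = p_manual, builtin = p_builtin))
```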

The two data sets, the rice production of Bangladesh and of Pakistan, are the same size, so I can use correlation analysis to examine whether the two variables are correlated. I entered each data set separately using the function c. This allowed me to pass the data sets to the cor.test function, with the output saved as ‘model’. I could then generate a scatter diagram, using the function plot, to visualise the two data sets. I added a legend to the plot showing the p value and correlation coefficient by using the function legend, with the argument “topleft” to place the legend in the desired position.
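These steps can be sketched as follows. The numbers are placeholders and not the actual rice production figures, and the names prod_a, prod_b and legend_text are hypothetical. The axis labels and the bty = "n" setting (no box around the legend) are likewise my own assumptions.

```r
# Placeholder values standing in for the two production series
prod_a <- c(20, 22, 25, 24, 28, 30, 33, 35)
prod_b <- c(3.1, 3.0, 3.6, 3.9, 4.0, 4.6, 4.4, 5.0)

# Correlation test, stored for reuse in the legend
model <- cor.test(prod_a, prod_b)

# Build the legend text from the stored test result
legend_text <- c(
  paste("p =", signif(model$p.value, 3)),
  paste("rho =", round(unname(model$estimate), 3))
)

# Scatter diagram of the two series
plot(prod_a, prod_b,
     xlab = "Series A (placeholder units)",
     ylab = "Series B (placeholder units)",
     main = "Scatter plot of two series")

# "topleft" is a position keyword given as the first argument of legend
legend("topleft", legend = legend_text, bty = "n")
```

Building the legend text from the stored object keeps the values on the plot in step with the test output.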

The p value for the two variables is 0.00158. As it is less than the critical value of 0.05, the null hypothesis can be rejected. This indicates that the rice production of Bangladesh and Pakistan is significantly correlated. The correlation coefficient (ρ) is 0.6588979; as ρ is positive, there is a positive correlation between these two variables. Rice production in Pakistan therefore tends to increase as rice production in Bangladesh increases. The figure below shows the scatter of these two variables. The scatter pattern is consistent with the correlation analysis, as a positive correlation can be seen in the graph (a diagonal distribution of points from the bottom left corner to the top right corner).