Correlation Coefficient
Correlation, in the finance and
investment industries, is a statistic that measures the degree to which two
securities move in relation to each other. Correlations are used in
advanced portfolio management and are computed as the correlation
coefficient, which has a value that must fall between -1.0 and +1.0.
More generally, correlation is the mutual relation of two or more things or parts: for example, studies find a positive
correlation between severity of illness and nutritional status of the patients.
Correlation is usually defined as a measure of
the linear relationship between two quantitative variables
(e.g., height and weight). Often a slightly looser definition is used, whereby
correlation simply means that there is some type of relationship between two
variables. This post will define positive and negative correlation, provide
some examples of correlation, explain how to measure correlation and discuss
some pitfalls regarding correlation.
When the values of
one variable increase as the values of the other increase, this is known
as positive correlation. When the
values of one variable decrease as the values of another increase to form an
inverse relationship, this is known as negative correlation.
Correlation is a statistical measure that indicates the extent to which two or
more variables fluctuate together. A positive
correlation indicates the extent to which those variables increase
or decrease in parallel; a negative
correlation indicates the extent to which one variable increases
as the other decreases.
A correlation
coefficient is a statistical measure of the degree to which
changes to the value of one variable predict change to the value of another.
When the fluctuation of one variable reliably predicts a similar fluctuation in
another variable, there’s often a tendency to think that means that the change
in one causes the change in the other. However, correlation does not
imply causation. There may be,
for example, an unknown factor that influences both variables similarly.
Here’s one
example: a number of studies report a positive correlation between the amount
of television children watch and the likelihood that they will become bullies.
Media coverage often cites such studies to suggest that watching a lot of
television causes children to become bullies. However, the studies only report
a correlation, not causation. Some other factor, such as a
lack of parental supervision, may be the influential one.
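To make the hidden-factor idea concrete, here is a small simulation sketch. Everything in it is invented for illustration: the "supervision" scores, the noise levels and the variable names are assumptions, not data from any study. It builds two outcomes that never influence each other but both depend on the same hidden factor, then shows that they still correlate strongly.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

random.seed(0)  # fixed seed so the simulation is repeatable
n = 1000

# Hidden factor: simulated "parental supervision" scores (an assumption).
supervision = [random.gauss(0, 1) for _ in range(n)]

# Each outcome is driven by supervision plus its own independent noise.
# Neither outcome depends on the other.
tv_noise = [random.gauss(0, 0.5) for _ in range(n)]
bully_noise = [random.gauss(0, 0.5) for _ in range(n)]
tv_hours = [-s + e for s, e in zip(supervision, tv_noise)]
bullying = [-s + e for s, e in zip(supervision, bully_noise)]

r_observed = pearson_r(tv_hours, bullying)
# With the hidden factor's contribution removed, only the noise is left.
r_without_factor = pearson_r(tv_noise, bully_noise)

print(f"TV vs bullying:               r = {r_observed:.2f}")
print(f"after removing hidden factor: r = {r_without_factor:.2f}")
```

The first value comes out strongly positive and the second close to zero, which is the pattern you would expect when a third variable, rather than a causal link, produces the correlation.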
Correlation
is a statistical technique that can show whether and how strongly pairs of
variables are related. For example, height and weight are related; taller
people tend to be heavier than shorter people. The relationship isn't perfect.
People of the same height vary in weight, and you can easily think of two
people you know where the shorter one is heavier than the taller one.
Nonetheless, the average weight of people 5'5'' is less than the average weight
of people 5'6'', and their average weight is less than that of people 5'7'',
etc. Correlation can tell you just how much of the variation in people's
weights is related to their heights.
Although
this correlation is fairly obvious, your data may contain unsuspected
correlations. You may also suspect there are correlations, but not know which
are the strongest. An intelligent correlation analysis can lead to a greater
understanding of your data.
Correlation Coefficient
The main
result of a correlation is called the correlation coefficient (or
"r"). It ranges from -1.0 to +1.0. The closer r is to +1 or -1, the
more closely the two variables are related. If r is close to 0, it means there
is little or no linear relationship between the variables. If r is positive, it means that as
one variable gets larger the other gets larger. If r is negative, it means that
as one gets larger, the other gets smaller (often called an "inverse"
correlation).
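As a rough illustration of how r is obtained, the sketch below implements the standard Pearson formula in plain Python. The function name pearson_r and the height and weight figures are made up for this example and do not come from any real data set.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (used in the denominator of r).
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical heights (inches) and weights (pounds) for six people.
heights = [63, 65, 66, 68, 70, 72]
weights = [127, 140, 139, 155, 160, 172]

r = pearson_r(heights, weights)
print(f"r = {r:.3f}")  # positive and close to +1 for this made-up sample
```

Because taller people in this sample are almost always heavier, r lands near +1. Reversing one of the lists would flip the sign without changing the strength.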
While
correlation coefficients are normally reported as r = (a value between -1 and
+1), squaring them makes them easier to understand. The square of the
coefficient (or r squared) is equal to the proportion of the variation in one
variable that is related to the variation in the other. To read it as a
percentage, multiply the squared value by 100. An r of .5 means 25% of the variation is related (.5
squared = .25). An r value of .7 means 49% of the variation is related (.7
squared = .49).
A correlation report can also show a second result of each test:
statistical significance. In this case, the significance level will tell you
how likely it is that the correlations reported may be due to chance in the
form of random sampling error. If you are working with small sample sizes,
choose a report format that includes the significance level. This format also
reports the sample size.
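The squaring rule above is simple arithmetic, and a few lines of Python make it easy to check. The helper name shared_variation is an assumption chosen for this sketch.

```python
def shared_variation(r):
    """Return r squared: the share of variation in one variable related to the other."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    return r ** 2

for r in (0.5, 0.7, -0.7):
    share = shared_variation(r)
    # The sign of r is lost when squaring, so -0.7 and +0.7 give the same share.
    print(f"r = {r:+.1f}  ->  r squared = {share:.2f}  ({share:.0%} of the variation)")
```

Note that a negative r gives the same r squared as a positive r of the same size, so r squared describes strength only, not direction.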
A key thing
to remember when working with correlations is never to assume a correlation
means that a change in one variable causes a change in another. Sales of
personal computers and athletic shoes have both risen strongly in the last
several years and there is a high correlation between them, but you cannot
assume that buying computers causes people to buy athletic shoes (or vice
versa).
The second
caveat is that the Pearson correlation technique works best with linear
relationships: as one variable gets larger, the other gets larger (or smaller)
in direct proportion. It does not work well with curvilinear relationships (in
which the relationship does not follow a straight line). An example of
a curvilinear relationship is age and health care. They are related,
but the relationship doesn't follow a straight line. Young children and older
people both tend to use much more health care than teenagers or young adults.
Multiple regression (also included in the statistics module) can be used to examine
curvilinear relationships, but that is beyond the scope of this article.
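The limitation with curvilinear relationships can be seen in a tiny example. The sketch below uses invented U-shaped data (not real age or health care figures) in which y is completely determined by x, yet the Pearson coefficient is zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical U-shaped data: high at both ends, low in the middle.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
# r is 0 here even though y depends entirely on x.
print(f"r = {r:.2f}")
```

The falling left half and the rising right half cancel each other out, so a value of r near zero does not rule out a strong non-linear relationship. Plotting the data first is a sensible check.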
Types
The most common correlation coefficient is
the Pearson correlation
coefficient. It’s used to test for linear
relationships between data. In AP Stats or elementary stats, the Pearson is
likely the only one you’ll be working with. However, you may come across
others, depending upon the type of data you are working with. For
example, Goodman and
Kruskal’s lambda coefficient is a fairly
common coefficient. It can be symmetric, where you do not have to specify which
variable is dependent, or asymmetric, where the dependent variable is
specified.
Correlation is used to test relationships between quantitative variables or categorical variables. In other words, it’s a
measure of how things are related. The study of how variables are correlated is
called correlation analysis.
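For categorical data, the asymmetric form of Goodman and Kruskal's lambda can be sketched as follows. It measures how much the errors in guessing the dependent variable shrink once the independent variable is known. The function name, the table layout and the survey counts are all assumptions made up for this illustration, and the symmetric form is not covered.

```python
def goodman_kruskal_lambda(table):
    """Asymmetric lambda for predicting the dependent variable.

    `table` maps each category of the independent variable to a dict of
    counts over the categories of the dependent variable.
    """
    # Total count for each category of the dependent variable.
    dependent_totals = {}
    for counts in table.values():
        for category, count in counts.items():
            dependent_totals[category] = dependent_totals.get(category, 0) + count

    n = sum(dependent_totals.values())
    # Errors made by always guessing the overall most common category.
    errors_without = n - max(dependent_totals.values())
    if errors_without == 0:
        raise ValueError("lambda is undefined when the dependent variable is constant")
    # Errors made by guessing the most common category within each group.
    errors_with = n - sum(max(counts.values()) for counts in table.values())
    return (errors_without - errors_with) / errors_without

# Hypothetical survey: region (independent) vs preferred drink (dependent).
survey = {
    "north": {"coffee": 30, "tea": 10},
    "south": {"coffee": 10, "tea": 30},
}
lam = goodman_kruskal_lambda(survey)
print(lam)  # 0.5
```

Here knowing the region halves the number of wrong guesses about the preferred drink, so lambda is 0.5. Unlike r, lambda runs from 0 to 1 and has no sign.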
Some examples of data that have a high correlation:
Your caloric intake and your weight.
Your eye colour and your relatives’ eye colours.
The amount of time you study and your GPA.
Some examples of data that have a low correlation (or none at all):
Your sexual preference and the type of cereal you eat.
A dog’s name and the type of dog biscuit they prefer.
The cost of a car wash and how long it takes to buy a soda inside the station.
Correlations are useful because if you can find out what
relationship variables have, you can make predictions about future behaviour.
Such predictions matter in the social sciences and in fields like
government and healthcare. Businesses also use these statistics for budgets and
business plans.
Positive correlation
Positive
correlation is a relationship between two variables in which both variables
move in tandem. A positive correlation exists
when one variable decreases as the other variable decreases, or one variable
increases while the other increases. In statistics, a perfect positive
correlation is represented by 1, while 0 indicates no correlation, and negative
1 indicates a perfect negative correlation.
Figure 1. Positive correlation
Negative correlation
Negative correlation is
a relationship between two variables in which one variable increases as the
other decreases, and vice versa. In statistics, a perfect negative correlation is represented by
the value -1.00, a 0.00 indicates no correlation, and a +1.00 indicates a
perfect positive correlation. A perfect negative
correlation means the relationship that exists between two variables is
negative 100% of the time.
Figure 2. Negative correlation
No correlation
No correlation denotes the situation in
which no apparent pattern can be formed when plotting data points for two
variables in a scatter diagram. It suggests that there is no linear relationship between
the two variables.
Figure 3. No correlation
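The three cases described in this part of the post (positive, negative and no correlation) can also be checked numerically. The data sets below are invented for the purpose, and pearson_r is a small helper written for this sketch rather than a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # goes up whenever x goes up
falling = [10 - 3 * x for x in xs]  # goes down whenever x goes up
patternless = [1, 3, 2, 3, 1]       # no overall upward or downward trend

r_positive = pearson_r(xs, rising)
r_negative = pearson_r(xs, falling)
r_none = pearson_r(xs, patternless)

print(f"positive: {r_positive:+.2f}")
print(f"negative: {r_negative:+.2f}")
print(f"none:     {r_none:+.2f}")
```

The first two series lie exactly on straight lines, so they give the extreme values of +1 and -1. Real data rarely reaches either extreme.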
Conclusion
Correlation refers to a technique used to measure the relationship between
two or more variables. A correlation coefficient is a statistical
measure of the degree to which changes to the value of one variable
predict change to the value of another. It is a number between +1 and -1,
calculated so as to represent the linear interdependence of two variables or
sets of data.