Using Meta-Analysis to Inform the Robustness of Research Findings

– Okay, let's get started. Good afternoon, everyone. My name is Zhen Zhang. I am from Arizona State University, and I want to thank Gwen and Mo for this opportunity to talk with you about using meta-analysis to inform the robustness of empirical research findings. As an outline for today: I know we have limited time, so I will try to finish within 45 minutes, and please hold any questions until the end; I am sure you will have many. I will first review the different effect size measures in the science of organizations, then briefly discuss how meta-analysis has typically been used and what the indicators of effect size heterogeneity in meta-analysis are. I will then spend most of the time on two different ways of combining meta-analysis and structural equation modeling, which allow researchers to test relationships or models that may or may not have been tested in the primary studies. Finally, I will summarize with recommendations for how to keep these methods in our toolbox for enhancing the interpretation and evaluation of primary study findings. So, a quick review; many of you already know this. In terms of effect size measures,
there are many different forms. Many IO, OB, and HR researchers care most about the correlation, that is, the Pearson product-moment correlation. If you are doing experimental research, then the d statistic or even odds ratios may be your focus, along with R-squared (variance explained) and eta-squared in ANOVA. I listed a question mark after standardized regression coefficients because they are tricky: to compare standardized regression coefficients across studies, you need the same set of predictors and control variables in every study. On the one hand, a standardized coefficient is a measure of standardized effect size; on the other hand, the coefficients are only comparable across studies when the models are specified in the same way. There are many other effect size measures as well. Today we will focus on correlations, but the same logic can be applied to other measures, such as the d statistic, in terms of how to combine the meta-analytic method and test a model that
goes beyond what meta-analysis alone can do. A quick review: following the tradition of Schmidt and Hunter, meta-analysis has been defined as quantitatively summarizing a body of work on a particular relationship. As you will see in the next few slides, when meta-analysis results are presented, you typically report the average observed correlation and its standard deviation, and, after correcting for artifacts, the estimated true population correlation, which is called rho, and its standard deviation. One note: today we will only be talking about conventional meta-analysis, where you summarize research findings across independently conducted studies. We are not talking about the so-called internal or single-paper meta-analysis, where one group of researchers runs a few primary studies and presents the synthesized, meta-analytic results within the same paper. That is not the focus for today. So, with the traditional or
conventional meta-analysis, the heterogeneity of effects can be indicated in a few different ways: for example, by the standard deviation of the true population correlation (SD-rho), by the width of the credibility interval, which I will define shortly, and by the percentage of observed variance accounted for by artifacts, and so forth. Some of these indicators were popular in earlier years but have tended to fall out of favor among researchers; the standard deviation of rho and the credibility interval, however, have continued to be widely used. Here is an example of
the meta-analysis table, which you can see in many, many published meta-analyses; the reference is listed at the bottom. I would like to highlight a few things for you to think about. First, look at the line corresponding to the column labels. The first number, 32, is the number of independent samples; one study could contribute multiple samples, so k counts samples rather than studies. As you will see from the labels at the top, we will be looking at the estimated true population correlation and its standard deviation, followed by two columns giving the 95% confidence interval boundaries and the 80% credibility interval boundaries. Given the time, I will not go through the definitions of the other reported parameters, but I would like to highlight that in this line the standard deviation of rho is estimated as a positive value. In contrast, if you look at the other highlighted area, the standard deviation of rho is zero, which I will explain in the next slide in terms of what it really means statistically and conceptually. And as you will see, when the standard deviation of rho is .00, the lower and upper boundaries of the credibility interval take the same value, which means there is no variation between them. Before we talk about why it can be zero and what that means, I would like to explain a little of the technical side of those two intervals. So, credibility intervals are
based upon Bayesian logic. When Schmidt and Hunter first introduced them in their 1977 paper, they described why we have to look at the posterior distribution of the correlation after correcting for artifacts, and use that distribution to construct the credibility interval. In terms of interpretation, a wider credibility interval means there is potential for one or more moderators of the focal relationship. The confidence interval, in contrast, estimates the accuracy of the mean observed correlation; it is calculated with a similar formula, but with the standard deviation of rho replaced by the standard error of the estimated mean correlation. For the technical details you can refer to the various editions of Hunter and Schmidt's books. But I want to emphasize that confidence intervals carry, to some extent, an over-emphasis on significance testing, whereas credibility intervals can give you rich information about how the true population correlation is distributed across different values. So, here is a slide to explain
why SD-rho can be zero. You may have noticed, through your own work or in reading others' publications, that sometimes the reported percentage of explained variance is greater than 100%. The reason lies in the calculation, which is based on this equation: the variance of the sample correlations equals the variance of the population correlations plus the variance due to sampling error. If you rearrange the terms, the variance of rho equals the variance of the observed sample correlations minus the sampling error variance, and the sampling error term has a particular formula. When that sampling error value is greater than the observed variance, you get a percentage above 100%. In that situation, researchers typically fix the difference between the two at zero rather than letting the estimate go negative.
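The truncation-at-zero logic just described can be sketched numerically. Below is a bare-bones Hunter-Schmidt-style calculation with made-up correlations and sample sizes and no artifact corrections; all names and numbers are mine, not from the talk:

```python
import math

# Made-up observed correlations and sample sizes from k = 5 primary studies.
rs = [0.28, 0.12, 0.35, 0.20, 0.25]
ns = [120, 85, 200, 150, 95]

k, N = len(rs), sum(ns)
r_bar = sum(n * r for n, r in zip(ns, rs)) / N                   # weighted mean observed r
var_obs = sum(n * (r - r_bar) ** 2 for n, r in zip(ns, rs)) / N  # observed variance of r
var_err = (1 - r_bar ** 2) ** 2 / (N / k - 1)                    # expected sampling-error variance
var_rho = max(var_obs - var_err, 0.0)                            # fixed at zero, never negative
sd_rho = math.sqrt(var_rho)

# 80% credibility interval (z = 1.28) uses SD-rho; the 95% confidence interval
# for the mean uses the standard error of the mean observed r instead.
cv = (r_bar - 1.28 * sd_rho, r_bar + 1.28 * sd_rho)
ci = (r_bar - 1.96 * math.sqrt(var_obs / k), r_bar + 1.96 * math.sqrt(var_obs / k))
```

With these particular made-up numbers, the expected sampling-error variance happens to exceed the observed variance, so SD-rho is truncated to zero and the 80% credibility interval collapses to a single point, exactly the .00 case highlighted in the table.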
So, now that we know how credibility and confidence intervals are interpreted and calculated, let's look at the sources of heterogeneity when you meta-analyze primary studies. There are two different types. The first type of heterogeneity may be due to moderators; sometimes it is one moderator, sometimes a group of moderators at work. This source of heterogeneity, and the ways we consider it, is related to the generalizability of the primary study findings. You can think of this as different subpopulations or even entirely different populations. For example, of 30 studies, 15 may come from one culture and the other 15 from another; the heterogeneity you identify when meta-analyzing those 30 studies can give you a hint, or a direction, to look at potential moderators from the cultural perspective, and to examine whether findings from one culture generalize to other settings and cultures. That is why I say this type is related to generalizability. In contrast, even after the moderators are considered, it is likely that you will still have some normal variation in the observed correlations, even within what is already a homogeneous population. In this respect the heterogeneity is more
related to the replicability of the primary study findings. Let's assume you have already identified a homogeneous population or subpopulation, and you are running a few primary studies drawn from that population. You will still see variation, but by assumption this variation is no longer due to any moderators; it is due only to the remaining sampling error, and so it speaks to replicability rather than to additional moderators that you would have had to consider before reaching this stage. So, let's use one example
to look at how we can combine what we know so far about meta-analysis, heterogeneity, and testing a structural model. Here is a very simple model in which a variable X predicts two parallel mediators, which in turn predict a common outcome variable Y. When researchers are able to combine meta-analysis results with structural equation modeling, they can test relationships that have never been tested in any single primary study. The reason is that some studies may have tested the front part of the model but not the second part; when you combine them in a meta-analysis, you have the chance to test the whole model, even if no primary study has tested it before. In addition to looking at single path coefficients, you can use this so-called meta-analytic SEM to test mediation effects, for example the mediation effect through the upper path, or the difference between the two mediation effects, or any other combination you want to test, as long as the combination has conceptual meaning.
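To make the "test quantities no single study tested" idea concrete, here is a small sketch: given a hypothetical pooled correlation matrix among X, M1, M2, and Y, the standardized path coefficients follow from ordinary regression algebra, and the two indirect effects and their difference are simple functions of those paths. The numbers are illustrative, not from any real meta-analysis:

```python
import numpy as np

# Hypothetical pooled meta-analytic correlations, ordered X, M1, M2, Y.
R = np.array([
    [1.00, 0.30, 0.25, 0.20],
    [0.30, 1.00, 0.15, 0.40],
    [0.25, 0.15, 1.00, 0.35],
    [0.20, 0.40, 0.35, 1.00],
])

a1, a2 = R[0, 1], R[0, 2]          # simple paths X -> M1 and X -> M2
preds = [1, 2, 0]                  # regress Y on M1, M2, and X (standardized)
betas = np.linalg.solve(R[np.ix_(preds, preds)], R[preds, 3])
b1, b2 = betas[0], betas[1]        # paths M1 -> Y and M2 -> Y

ind1 = a1 * b1                     # indirect effect via M1
ind2 = a2 * b2                     # indirect effect via M2
diff = ind1 - ind2                 # contrast of the two indirect effects
```

In practice you would fit the model in SEM software to also obtain standard errors and fit statistics; the point here is only that every quantity of conceptual interest is a function of the pooled matrix.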
The first approach to combining meta-analysis and structural equation modeling was proposed by Viswesvaran and Ones in 1995; it is called the meta-analytic ICM approach. In this approach, as many of you are already familiar with, you construct a pairwise correlation matrix. That matrix is used as the input; in fact, you only need the lower triangle, because the upper triangle repeats the same information. Once you have this information ready, you use it as the input matrix in any SEM software and do the testing as the second step. It is important to
note that some of the cells may come from one set of studies, other cells from a different set, and some cells may come from prior meta-analyses. This is a way to synthesize and combine prior knowledge from studies that may never have examined the same mediation model we are looking at here. A critique of this approach concerns the matrix itself: because it is constructed pairwise, the correlation between X and M1 is estimated without any consideration of, say, the correlation between X and M2. For that reason, it is very likely that you will encounter non-positive-definite matrices, and in that situation there is no way to run the SEM in the second step. We will come back to this later when we talk about incorporating heterogeneity. Note also that the input is an average correlation matrix, so heterogeneity is not typically considered there.
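Because each cell of the pairwise matrix can come from a different pool of studies, nothing guarantees the assembled matrix is positive definite. A quick check is the smallest eigenvalue; the correlations below are deliberately contrived (two mediators correlating .90 yet relating to Y with opposite signs) so the check fails:

```python
import numpy as np

# Contrived meta-analytic mean correlations, each cell potentially synthesized
# from a different set of studies (or borrowed from a prior meta-analysis).
names = ["X", "M1", "M2", "Y"]
cells = {("X", "M1"): 0.50, ("X", "M2"): 0.45, ("X", "Y"): -0.30,
         ("M1", "M2"): 0.90, ("M1", "Y"): 0.40, ("M2", "Y"): -0.60}

R = np.eye(4)
for (a, b), r in cells.items():
    i, j = names.index(a), names.index(b)
    R[i, j] = R[j, i] = r              # fill both triangles symmetrically

min_eig = np.linalg.eigvalsh(R).min()  # a negative value means not positive definite
positive_definite = min_eig > 0
```

With such a matrix, the second-step SEM simply cannot be run, which is the failure mode described above.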
To explicitly incorporate the heterogeneity of the meta-analyzed correlation coefficients, there are several options. If the moderator is categorical, you can analyze the different values of the moderator, assuming you have already identified it; at each value you presumably have a homogeneous population. Or you could include the product term as if it were a variable in the correlation matrix, but such product terms and their correlations are often unavailable from the authors of the primary studies. So, there is another way.
After you have considered the moderator values and, where possible, the product terms, and assuming that what remains is a homogeneous population for the correlation, you can use the method advocated by Yu et al. in 2016 to incorporate the variation in the correlation coefficients more explicitly. In that paper, Yu and her co-authors used simulated correlation matrices to create the posterior distribution of any coefficient, or any combination of coefficients, yielding a credibility interval for it. In the interest of time we will not walk through the steps of the simulation procedure, but one key thing to note is that some simulated correlation matrices will not be positive definite; you simply discard those and continue the simulation. In the end you have K minus J usable runs, where J is the number of matrices that were not positive definite, and from those runs you can create an 80% credibility interval around any point estimate you are interested in. However, this is still a pairwise approach, because each standard deviation of rho is treated as independent of the others; that weakness carries over from the earlier MASEM usage. Using the Yu et al. method, then, you can obtain the 80% credibility interval for a mediation effect, for the difference between two mediation effects, or for any linear or nonlinear combination of the coefficients.
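The simulation idea can be sketched as follows: each of the six cells is drawn independently from a normal distribution with its meta-analytic mean and SD-rho (this independence is exactly the pairwise weakness just noted), non-positive-definite draws are discarded, and the 10th and 90th percentiles of the resulting indirect effects give the 80% credibility interval. This is my own illustrative reconstruction of the logic, not Yu et al.'s published code, and all numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical meta-analytic estimates: (mean rho, SD rho) per cell, ordered X, M1, M2, Y.
cells = {(0, 1): (0.30, 0.10), (0, 2): (0.25, 0.08), (0, 3): (0.20, 0.05),
         (1, 2): (0.15, 0.06), (1, 3): (0.40, 0.12), (2, 3): (0.35, 0.07)}

indirect = []
for _ in range(5000):                              # K simulation runs
    R = np.eye(4)
    for (i, j), (mu, sd) in cells.items():         # each cell drawn independently
        R[i, j] = R[j, i] = rng.normal(mu, sd)
    if np.linalg.eigvalsh(R).min() <= 0:           # discard non-positive-definite draws
        continue
    a1 = R[0, 1]                                   # path X -> M1
    preds = [1, 2, 0]                              # regress Y on M1, M2, X
    b1 = np.linalg.solve(R[np.ix_(preds, preds)], R[preds, 3])[0]  # path M1 -> Y
    indirect.append(a1 * b1)

lo, hi = np.percentile(indirect, [10, 90])         # 80% credibility interval
```

With the two-stage approach discussed below, you would instead draw all six cells jointly from a multivariate normal whose covariance is the asymptotic covariance (ACOV) matrix, so the cells are no longer treated as independent.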
That said, despite the advances of the Yu et al. paper in highlighting a way to incorporate heterogeneity, there has been a response: Cheung, in an in-press comment on the Yu et al. publication, argues that the resulting distributions of the model fit indexes are not meaningful. The Yu et al. approach is still useful for providing credibility intervals for the coefficients; those coefficients, and linear or nonlinear combinations of them, remain meaningful for understanding the model and for potentially identifying moderators of the whole model or of particular paths. Cheung's suggestion is simply that we should not look at the distributions of model fit indexes such as RMSEA, CFI, or TLI, the standard output of SEM software.
The Yu et al. paper was not actually the first publication to apply this heterogeneity logic. A few years earlier, a similar idea was proposed by Edwards and Christian, who emphasized the role played by the credibility interval. In their approach, a researcher uses a series of correlation matrices in which a single cell at a time is replaced by that correlation's lower or upper credibility interval boundary. With four variables, as in our simple mediation model, there are six correlations in the matrix, so replacing each cell by its lower and upper bounds gives 12 different matrices to fit. You can then summarize how the path coefficients, mediation effects, or model fit indexes are distributed across those 12 matrices, assuming they are all positive definite. So, that is the first approach.
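A sketch of the Edwards and Christian logic, again with invented numbers: pretend each of the six correlations has an 80% credibility interval of rho ± .10, swap in one boundary at a time, and record the indirect effect from each of the resulting 12 matrices:

```python
import numpy as np

base = {(0, 1): 0.30, (0, 2): 0.25, (0, 3): 0.20,   # cells ordered X, M1, M2, Y
        (1, 2): 0.15, (1, 3): 0.40, (2, 3): 0.35}
half = 0.10   # pretend each credibility interval is rho +/- .10, for illustration

def indirect_via_m1(cells):
    R = np.eye(4)
    for (i, j), r in cells.items():
        R[i, j] = R[j, i] = r
    assert np.linalg.eigvalsh(R).min() > 0   # the method assumes positive definiteness
    preds = [1, 2, 0]                        # regress Y on M1, M2, X
    b1 = np.linalg.solve(R[np.ix_(preds, preds)], R[preds, 3])[0]
    return R[0, 1] * b1                      # a1 * b1

results = []
for cell in base:                            # 6 cells x 2 boundaries = 12 matrices
    for bound in (-half, +half):
        perturbed = dict(base)
        perturbed[cell] = base[cell] + bound
        results.append(indirect_via_m1(perturbed))

spread = (min(results), max(results))        # range of the indirect effect across matrices
```

The spread across the 12 matrices shows how sensitive the indirect effect is to each cell's credibility bounds, one cell at a time.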
As you can see, a pairwise logic remains even in the Edwards and Christian method: when you change a particular cell to its lower or upper boundary, you still assume that cell is independent of the other cells. That weakness is addressed by the second approach, proposed by Cheung and Chan in 2005 and 2009, called the two-stage SEM approach. So, let's look at how the data are used in this situation. Let's assume we are
still interested in the four-variable mediation model. You can accumulate the studies and use their correlation matrices directly. Suppose you have four studies: study one has only three of the variables in its matrix; study two is missing X; study three is missing M1; and study four is missing M2. That is not a problem: you simply create matrices with those correlations missing and estimate directly across the four studies. That is just an example; you may have 40 or 100 matrices going into the first step. There are two steps here, according to Cheung. First, you test the homogeneity of the correlations and calculate the pooled correlation matrix; at the same time, because you have the raw matrix from each study, you can obtain the asymptotic covariance (ACOV) matrix, which shows how those correlations co-vary across the 4, 40, or 400 studies. This is the new element, and it is only available when you input the matrices as raw data rather than looking at one correlation value at a time. In the second step, you use the pooled correlation matrix, which is basically the average, together with this asymptotic covariance matrix, as the input to estimate the SEM using an asymptotically distribution-free method. So, with this method, as you can see, you need more information.
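The flavor of the first stage can be sketched with a toy example in which, as noted, no single study measures all four variables. The pooling below is just a sample-size-weighted average per cell, a deliberate simplification of my own; Cheung and Chan's actual stage 1 fits a multigroup SEM by maximum likelihood, which is also what produces the asymptotic covariance matrix of the pooled correlations:

```python
import numpy as np

names = ["X", "M1", "M2", "Y"]
# Hypothetical study-level inputs: (n, {("A", "B"): r, ...}) listing only the
# correlations each study reports; no study here has a complete matrix.
studies = [
    (100, {("M1", "M2"): 0.18, ("M1", "Y"): 0.42, ("M2", "Y"): 0.33}),   # no X
    (150, {("X", "M1"): 0.28, ("X", "Y"): 0.22, ("M1", "Y"): 0.38}),     # no M2
    (120, {("X", "M2"): 0.26, ("X", "Y"): 0.18, ("M2", "Y"): 0.36}),     # no M1
    (90,  {("X", "M1"): 0.33, ("X", "M2"): 0.24, ("M1", "M2"): 0.12}),   # no Y
]

# Simplified stage 1: pool each cell over the studies that observed it.
pooled = np.eye(4)
for i in range(4):
    for j in range(i + 1, 4):
        num = den = 0.0
        for n, cors in studies:
            r = cors.get((names[i], names[j]))
            if r is not None:        # skip studies missing this pair
                num += n * r
                den += n
        pooled[i, j] = pooled[j, i] = num / den
```

Note that every one of the six cells is covered by at least two studies even though no study has a complete matrix, which is exactly the point made below about the complete-matrix requirement.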
One thing to highlight: some researchers claim that in order to use the two-stage SEM approach you need one particular study that covers all the variables, that is, one complete matrix to start from. According to the in-press paper by Mike Cheung, that is not actually a requirement: you could have only four studies, with none of them providing a complete matrix. In this approach, the asymptotic covariance matrix takes into consideration the second-order sampling error. However, heterogeneity is reflected only in the asymptotic covariance matrix. If you want to more explicitly
incorporate heterogeneity, you can again use the Yu et al. method, but this time, instead of simulating correlations from the SDs of rho, the asymptotic covariance matrix provides richer information. You can use that asymptotic covariance matrix directly in the simulation process, create the matrices, and then run the same procedure Yu et al. suggested to gather credibility intervals for any combination of coefficients you are interested in. Just remember: the distribution of the model fit indexes will not be meaningful. So, what does this mean for us
in terms of study robustness and empirical findings? There is actually a recent paper in ORM by DeSimone et al. that provides an update of a previous study on how people use meta-analysis results. Their findings are similar to the earlier study: when researchers cite a meta-analysis, they mostly care only about whether a relationship exists. They tend to discard the rich information carried by the standard deviation of rho and the associated credibility interval. Many papers citing earlier meta-analyses do not even mention moderators or potential subpopulations as a way to guide their theory building and hypothesis testing. So there is a gap between the information meta-analysis provides and how researchers are using it. In order to better use
it, we need to be aware that heterogeneity in the effect sizes reported in a meta-analysis may reflect theoretically sound moderators, which means there is potentially a research question about generalizability: whether a different setting or a different subpopulation would give different results. And after considering that, when you look at a homogeneous population, you have to remember that there may still be variation even among findings from that homogeneous population, and this pertains to the replicability of those primary study findings. My hope is that we can all keep bivariate-correlation-based meta-analysis in mind, and also keep the MASEM approaches in our toolbox, so that we can evaluate the robustness of any primary studies we encounter and interpret their findings accurately by highlighting the potential heterogeneity and the moderators identified in those meta-analyses. That is my hope for how we can use meta-analysis to enhance our understanding of study robustness. Again, today we have been talking about conventional meta-analysis, which is different from the so-called one-paper meta-analysis, where you do an internal synthesis within the same paper. Here I have listed a few references for you, across two slides, which may give you some food for thought for your own future research directions. Okay, thank you very much. That was my presentation. (audience applauding)
