– Okay, let's get started. Good afternoon everyone. My name is Zhen Zhang. I am from Arizona State University, and I want to thank Gwen and Mo for this opportunity to talk with you about using meta-analysis to inform the robustness of empirical research findings.

An outline for today: we have limited time, so I will try to finish within 45 minutes, and please hold all questions until the end; I am sure you will have many. I will first review the different effect size measures used in the science of organizations, then briefly discuss how meta-analysis has typically been used and what the indicators of effect size heterogeneity are. I will then spend most of the time on two different ways of combining meta-analysis and structural equation modeling, which allow researchers to test relationships or models that may or may not have been tested in the primary studies. Finally, I will summarize with recommendations for keeping these methods in our toolbox for enhancing the interpretation and evaluation of primary study findings.

So, a quick review, which many of you already know. Effect sizes come in many different forms. IO, OB, and HR researchers care a great deal about the Pearson product-moment correlation. If you are doing experimental research, the d statistic or even the odds ratio may be your focus, and there are also R-squared (variance explained) and eta-squared in ANOVA. I listed a question mark after standardized regression coefficients because they are tricky: to compare standardized regression coefficients across studies, every study needs the same set of predictors and control variables. So on the one hand a standardized regression coefficient is a standardized effect size measure, but on the other hand you need the same set of coefficients for it to be comparable across studies. There are many other effect size measures. Today we will focus on correlations, but the same logic can be applied to other measures, such as the d statistic, in terms of combining them with meta-analysis to test models, which lets you do things beyond meta-analysis alone.

A quick review of meta-analysis itself. Following the Hunter and Schmidt tradition, meta-analysis has been defined as quantitatively summarizing a body of work on a particular relationship. As you will see in the next few slides, when meta-analysis results are presented, you typically report the average observed correlation and its standard deviation, and, after correcting for artifacts, the estimated true population correlation, called rho, and its standard deviation. One note: today we will only be talking about conventional meta-analysis, where you summarize research findings across independently conducted studies. We are not talking about the so-called internal or single-paper meta-analysis, where one group of researchers runs a few primary studies and synthesizes them within the same paper, presenting the meta-analytic results there. That is not the focus for today.

With traditional or conventional meta-analysis, the heterogeneity of the effects can be indicated by a few different statistics: for example, the standard deviation of the true population correlation (SD rho), the width of the credibility interval, which I will define shortly, and the percentage of observed variance accounted for by artifacts, and so on. Some of these indicators were popular in earlier years but have fallen out of favor, whereas the standard deviation of rho and the credibility interval have continued to be widely used by researchers.

Here is an example of the kind of meta-analysis table you can see in many published meta-analyses; the reference is listed at the bottom. I would like to highlight a few things. First, look at this line, which gives the information corresponding to the column labels. The first number, 32, is the number of independent samples, k; one study can contribute multiple samples. At the top, the labels show the estimated true population correlation and its standard deviation, and then there are columns for the 95% confidence interval boundaries and the 80% credibility interval boundaries. Considering the time, I will not go through the definitions of the other reported parameters, but note that in this line the standard deviation of rho is estimated as a positive value. In contrast, in the other highlighted area the standard deviation of rho is zero, which I will explain in the next slide in terms of what it really means statistically and conceptually. As you can see, when the standard deviation of rho is .00, the lower and upper boundaries of the credibility interval take the same value, meaning there is no variation between the two boundaries. Before we talk about why it can be zero and what that means, let me explain a little of the technical side of these two intervals.

Credibility intervals are based on Bayesian logic. When Schmidt and Hunter first introduced them in their 1977 paper, they described why we have to look at the posterior distribution of the correlation after correcting for artifacts, and use that distribution to construct the credibility interval. In terms of interpretation, a wider credibility interval means there is potential for one or more moderators operating on the focal relationship. The confidence interval, by contrast, estimates the accuracy of the mean observed correlation; it is calculated with a similar formula, but replacing the standard deviation of rho with the standard error of the estimated mean correlation. For the technical details you can refer to the various editions of Hunter and Schmidt's books. I want to emphasize that confidence intervals carry, to some extent, an overemphasis on significance testing, whereas credibility intervals give you richer information about how the true population correlation is distributed across different values.
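To make the two intervals concrete, here is a minimal sketch; the summary numbers are made up for illustration, and 1.28 and 1.96 are the usual 80% and 95% normal multipliers:

```python
# Hypothetical meta-analytic summary values -- illustrative only
rho_bar, sd_rho = 0.25, 0.10   # mean true correlation and SD-rho
r_bar, se_r = 0.22, 0.03       # mean observed correlation and its standard error

# 80% credibility interval: the range over which the true correlation
# is distributed across (sub)populations
cv = (rho_bar - 1.28 * sd_rho, rho_bar + 1.28 * sd_rho)

# 95% confidence interval: the accuracy of the mean observed correlation
ci = (r_bar - 1.96 * se_r, r_bar + 1.96 * se_r)
```

Same form, different dispersion term: the credibility interval uses SD rho, while the confidence interval uses the standard error of the mean.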

Here is why SD rho can be zero. You may have noticed, through your own work or while reading others' publications, that sometimes the percentage of explained variance is greater than 100%. The reason lies in the calculation, which is based on this equation: the variance of the sample correlations equals the variance of the population correlations plus the variance due to sampling error, Var(r) = Var(rho) + Var(e). If you rearrange the terms, the variance of rho equals the variance of the observed sample correlations minus the sampling error variance, and the sampling error variance has a specific formula. When that value is greater than your observed variance, you get a percentage above 100%. In that situation researchers typically fix the difference between the two at zero, rather than letting it go to a negative number.
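As a sketch of that calculation (bare-bones only; the full Hunter and Schmidt procedure also corrects for unreliability and other artifacts, and the function name here is just illustrative):

```python
def mean_and_sd_rho(rs, ns):
    """Bare-bones Hunter & Schmidt-style estimate of the mean correlation and SD-rho."""
    n_tot = sum(ns)
    r_bar = sum(n * r for n, r in zip(ns, rs)) / n_tot            # weighted mean r
    var_r = sum(n * (r - r_bar) ** 2 for n, r in zip(ns, rs)) / n_tot  # observed variance
    var_e = (1 - r_bar ** 2) ** 2 / (n_tot / len(ns) - 1)         # sampling-error variance
    var_rho = max(var_r - var_e, 0.0)                             # truncate at zero
    return r_bar, var_rho ** 0.5
```

When the observed variance is no larger than sampling error predicts, the difference is truncated at zero, which is exactly the SD rho = .00 case in the table.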

Now that we understand how credibility intervals and confidence intervals are calculated and interpreted, let's look at the sources of heterogeneity when you meta-analyze primary studies. There are two types. The first type of heterogeneity may be due to moderators, sometimes one, sometimes a group of moderators at work, and this source of heterogeneity, and the ways we account for it, relates to the generalizability of primary study findings. You can think of this as different sub-populations, or even entirely different populations. For example, of 30 studies, 15 may come from one culture and the other 15 from another culture, and the heterogeneity you identify when meta-analyzing those 30 studies can point you toward potential moderators from the cultural perspective, and toward examining whether a particular study's findings generalize from one culture to other settings and cultures. That is why I say this type is related to generalizability. In contrast, after accounting for moderators, it is likely you still have some normal variation in the observed correlations, even within what is already a homogeneous population. This second type is more related to the replicability of primary study findings.

Suppose you have already identified a homogeneous population or sub-population and you run a few primary studies drawn from it. You will still see variation, but by assumption this variation is no longer due to moderators; it is due only to the remaining sampling error. So it is relevant to replicability, rather than to additional moderators you would have had to consider before reaching this stage.

Let's use one example of how we can combine what we know so far about meta-analysis, heterogeneity, and testing a structural model. This is a very simple model in which a variable X predicts two parallel mediators, which in turn predict a common outcome Y. When you combine meta-analysis results with structural equation modeling, researchers can actually test relationships that have never been tested in any single primary study. The reason is that some studies may have tested the front part of the model but not the second part; when you combine them in a meta-analysis, you have the chance to test the whole model, even assuming no primary study has tested it before. In addition to looking at single path coefficients, you can use this so-called meta-analytic SEM to test mediation effects, for example the mediation effect through the upper path, or the difference between the two mediation effects, or any other combination you want to test, as long as it is conceptually meaningful.
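With made-up standardized path estimates, the quantities such a model lets you test are simple functions of the paths:

```python
# Hypothetical standardized path estimates -- illustrative numbers only
a1, a2 = 0.30, 0.25      # X -> M1 and X -> M2
b1, b2 = 0.20, 0.45      # M1 -> Y and M2 -> Y

ind1 = a1 * b1           # indirect effect through M1
ind2 = a2 * b2           # indirect effect through M2
contrast = ind1 - ind2   # difference between the two mediation effects
```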

The first approach to combining meta-analysis and structural equation modeling was proposed by Viswesvaran and Ones in 1995, often called the meta-analytic SEM (MASEM) approach. In this approach, as many of you are familiar with, you construct a pairwise correlation matrix. You can use that matrix as input; in fact you only need the lower triangle, since the upper triangle repeats the same information. Once you have this ready, you use it as the input matrix for any SEM software and do the model testing as the second step. It is important to note that some cells of the matrix may come from one set of studies, other cells from a different study set, and some cells may come from prior meta-analyses. That is a way to truly synthesize prior knowledge from studies that may not have examined the same mediation model we are looking at here.

One critique of this approach concerns the matrix itself. Because it is constructed pairwise, the correlation between X and M1 is estimated without any consideration of, say, the correlation between X and M2. For that reason, it is quite likely you will encounter a non-positive-definite matrix, in which case there is no way to run the SEM in the second step. We will come back to that later when we talk about incorporating heterogeneity. And again, it is an average correlation matrix; heterogeneity is not often considered.
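Here is a sketch of assembling such a pairwise matrix from meta-analytic mean correlations (hypothetical values) and checking positive definiteness before handing it to SEM software:

```python
import numpy as np

# Hypothetical meta-analytic mean correlations among X, M1, M2, Y;
# different cells could come from different study sets or prior meta-analyses
labels = ["X", "M1", "M2", "Y"]
r = {("X", "M1"): .30, ("X", "M2"): .25, ("X", "Y"): .20,
     ("M1", "M2"): .40, ("M1", "Y"): .35, ("M2", "Y"): .45}

R = np.eye(4)
for (a, b), v in r.items():
    i, j = labels.index(a), labels.index(b)
    R[i, j] = R[j, i] = v      # only the lower triangle is unique information

# A pairwise-assembled matrix is not guaranteed to be positive definite,
# so check before running the second-step SEM
is_pd = bool(np.all(np.linalg.eigvalsh(R) > 0))
```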

There are different ways to explicitly incorporate heterogeneity of the meta-analyzed correlation coefficients. If the moderator is categorical, and assuming you have already identified it, you can analyze the different values of the moderator separately; at each value you presumably have a homogeneous population. Or you could include the product term as if it were a variable in the correlation matrix, but such product terms and their correlations are often unavailable from the primary study authors.

There is another option. After you have considered the moderator values and the potential opportunity to include product terms, and assuming you now have a homogeneous population for the correlation, you can use the method advocated by Yu et al. in 2016, which deals more explicitly with how to incorporate variation in the correlation coefficients. In that paper, Yu and her co-authors used simulated correlation matrices to create the posterior distribution of any coefficient, or any combination of coefficients, and to give you the credibility interval of those quantities.

be going through the steps. There are different steps involved in this simulation procedure. One key thing to note

is that when you are, when you simulate there’s

a possibility that you have non positively

definite correlation matrixes, so you just discard those matrixes then continue the simulation. So in the end, you will

have a K minus J runs and the J means the numbers,

the number of matrixes that are not positively definite. Then based upon those so many runs then you can create the

80% credibility interval based upon any point estimated

that you are interested in. However, this approach is
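A sketch of the flavor of this simulation, with made-up means and SD-rho values for the six cells; each simulated matrix yields the indirect effect through M1 by regression, non-positive-definite draws are discarded, and the 10th and 90th percentiles give the 80% credibility interval:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical (mean, SD-rho) pairs for the six correlations among
# X(0), M1(1), M2(2), Y(3) -- illustrative numbers only
cells = {(0, 1): (.30, .05), (0, 2): (.25, .06), (0, 3): (.20, .05),
         (1, 2): (.40, .04), (1, 3): (.35, .05), (2, 3): (.45, .06)}

draws = []
for _ in range(5000):
    R = np.eye(4)
    for (i, j), (m, s) in cells.items():
        R[i, j] = R[j, i] = rng.normal(m, s)     # each cell drawn independently
    if np.any(np.linalg.eigvalsh(R) <= 0):
        continue                                 # discard non-positive-definite runs
    a1 = R[0, 1]                                 # X -> M1
    b = np.linalg.solve(R[1:3, 1:3], R[1:3, 3])  # regress Y on M1 and M2
    draws.append(a1 * b[0])                      # indirect effect through M1

lo, hi = np.percentile(draws, [10, 90])          # 80% credibility interval
```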

However, this approach is still a pairwise approach, because each standard deviation of rho is treated as independent of the other standard deviations. That is one weakness carried over from the earlier MASEM usage. Still, using the Yu et al. method you can obtain the 80% credibility interval for the mediation effect, for the difference between mediation effects, or for any linear or nonlinear combination of the coefficients.

That said, despite the advances of the Yu et al. paper in highlighting a way to incorporate heterogeneity, there has been a response: Cheung, in an in-press comment on the Yu et al. publication, discusses its limitations, in particular that the resulting distributions of the model fit indexes are not really meaningful. The Yu et al. approach is still useful for providing researchers with credibility intervals for the coefficients, and for linear and nonlinear combinations of them, which helps researchers understand and potentially identify moderators of the whole model or of particular paths. But Cheung's suggestion is that we should not be looking at the distributions of model fit indexes such as RMSEA, CFI, or TLI, the standard reports from SEM software.

The Yu et al. paper was not actually the first publication to apply this heterogeneity logic. A few years earlier, a similar logic was proposed by Edwards and Christian, who emphasized the role played by the credibility interval. Their idea is that a researcher can fit a series of correlation matrices in which one cell at a time is replaced by that correlation's lower or upper credibility-interval boundary. With four variables, as in our simple mediation model, there are six correlations in the matrix, so replacing each cell by its lower and upper bound gives 12 different matrices to fit, assuming they are all positive definite. You can then summarize how the path coefficients, mediation effects, or model fit indexes are distributed across the 12 matrices. As you can see, there is still a pairwise logic here too: when you change a particular cell to its lower or upper boundary, you still assume that cell is independent of the other cells.
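A sketch of the Edwards and Christian procedure for our four-variable model, again with made-up means and SD-rho values:

```python
import numpy as np

# Hypothetical (mean, SD-rho) pairs for the six correlations among X, M1, M2, Y
cells = {(0, 1): (.30, .05), (0, 2): (.25, .06), (0, 3): (.20, .05),
         (1, 2): (.40, .04), (1, 3): (.35, .05), (2, 3): (.45, .06)}

base = np.eye(4)
for (i, j), (m, _) in cells.items():
    base[i, j] = base[j, i] = m

# Replace one cell at a time with its lower or upper 80% CV boundary:
# 6 correlations x 2 boundaries = 12 matrices to fit
mats = []
for (i, j), (m, s) in cells.items():
    for bound in (m - 1.28 * s, m + 1.28 * s):
        R = base.copy()
        R[i, j] = R[j, i] = bound
        mats.append(R)
```

You would then fit the SEM to each matrix (assuming each is positive definite) and summarize how the coefficients and mediation effects vary across the twelve fits.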

That weakness is addressed by the second approach, proposed by Cheung and Chan in 2005 and 2009: the two-stage SEM approach. Let's look at how the data are used in this situation. Assume we are still interested in the four-variable mediation model. You accumulate the studies and use their correlation matrices directly. Suppose you have four studies: study one has only three of the variables available in its matrix, study two is missing X, study three is missing M1, and study four is missing M2. That is not a problem; you can create matrices with those missing correlations and estimate directly across the four. That is just an example; you may have 40 or 100 matrices. There are two steps in this approach.

According to Cheung, in the first stage you test the homogeneity of the correlations and calculate the pooled correlation matrix. At the same time, because you have the matrices themselves as raw input, you can obtain the asymptotic covariance (ACOV) matrix, which shows how those correlations co-vary across the 4, 40, or 400 studies. This ACOV matrix is the new element here; it is only available when you use the matrices as raw input, rather than a single correlation value at a time. In the second stage, you use the pooled correlation matrix, which is basically the mean, together with the asymptotic covariance matrix as input, and fit the SEM model using an asymptotically distribution-free (weighted least squares) method. As you can see, this method requires more information.
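To show the data layout, here is a naive element-wise sketch of stage one with the four hypothetical studies just described; real TSSEM estimates the pooled matrix and its asymptotic covariance matrix by multigroup maximum likelihood (for example, via Mike Cheung's metaSEM package in R), not by this simple weighted averaging:

```python
import numpy as np

# A hypothetical common population matrix over X(0), M1(1), M2(2), Y(3)
full = np.array([[1.0, .30, .25, .20],
                 [.30, 1.0, .40, .35],
                 [.25, .40, 1.0, .45],
                 [.20, .35, .45, 1.0]])

def observed(missing):
    """A study's matrix with np.nan for correlations it did not measure."""
    R = full.copy()
    R[missing, :] = np.nan
    R[:, missing] = np.nan
    return R

# Study 1 lacks Y, study 2 lacks X, study 3 lacks M1, study 4 lacks M2
studies = [observed(3), observed(0), observed(1), observed(2)]
ns = [120, 90, 150, 80]   # hypothetical sample sizes

num, den = np.zeros((4, 4)), np.zeros((4, 4))
for R, n in zip(studies, ns):
    mask = ~np.isnan(R)
    num[mask] += n * R[mask]
    den[mask] += n
pooled = num / den        # every cell is covered by at least two studies
```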

One thing to highlight: some researchers claim that in order to use the two-stage SEM approach, you need one particular study that covers all the variables, basically one complete matrix. According to the in-press paper by Mike Cheung, that is not actually a requirement. You could have only four studies, with none of them providing a complete matrix to start with.

In this approach the asymptotic covariance matrix takes the second-order sampling error into consideration. However, heterogeneity is reflected only in the asymptotic covariance matrix. If you want to incorporate heterogeneity more explicitly, you can use the Yu et al. method again, but this time, instead of simulating correlations from the SD rhos independently, the asymptotic covariance matrix provides you with more information: you can use it directly in the simulation process, create the matrices, and run the same procedures Yu et al. suggested to gather credibility intervals for any combination of coefficients you are interested in. Just remember, the distribution of the model fit indexes will still not be that meaningful.
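The key difference from the earlier simulation is that the six cells are now drawn jointly, using the asymptotic covariance matrix, rather than independently from their separate SD-rho values. A sketch with hypothetical numbers:

```python
import numpy as np

rng = np.random.default_rng(7)
# Pooled stage-one correlations for the six cells, in a fixed order (hypothetical)
rho_bar = np.array([.30, .25, .20, .40, .35, .45])
# Hypothetical 6x6 asymptotic covariance matrix: diagonal terms are each
# cell's sampling variance; off-diagonal terms carry the co-variation
# among cells that per-cell simulation ignores
acov = 0.0025 * np.eye(6) + 0.0005 * np.ones((6, 6))

# Draw all six correlations jointly, then proceed as before: rebuild each
# 4x4 matrix, discard non-positive-definite draws, and summarize the
# credibility intervals of the coefficients of interest
draws = rng.multivariate_normal(rho_bar, acov, size=5000)
```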

So, what does this mean for us in terms of study robustness and empirical findings? There is a recent paper in Organizational Research Methods (ORM) by DeSimone et al. that provides an update of a previous study on how people use meta-analysis results. Their findings are similar to the earlier study: when researchers cite a meta-analysis, most only care about whether a relationship exists. They tend to discard the rich information associated with the standard deviation of rho and the credibility intervals around that finding. Many of the papers citing earlier meta-analyses do not even mention moderators or potential sub-populations as a way to guide their theory building and hypothesis testing. So there is a gap between the information meta-analysis provides us and how researchers are using it.

To make better use of it, we need to be aware that heterogeneity in the effect sizes reported in a meta-analysis may reflect theoretically sound moderators, which means there is potentially a research question about generalizability: whether a different setting or a different sub-population may give you different results. And after considering that, when you look at a homogeneous population, remember that there may still be variation even among findings from that homogeneous population, and this pertains to the replicability of those primary study findings.

My hope is that we can all keep meta-analysis, both the bivariate-correlation-based meta-analysis and the MASEM approaches, in our toolbox, so that we can evaluate the robustness of any primary studies we encounter, and interpret their findings accurately by highlighting the potential heterogeneity and the moderators identified in those meta-analyses. That is my hope for how we can use meta-analysis to enhance our understanding of study robustness. Again, today we have been talking about conventional meta-analysis, which is different from the so-called one-paper meta-analysis, where you do an internal meta-analysis within the same paper. Here I have listed a few references for you, two slides of them, which may give you some food for thought for your own future research directions. Okay, thank you very much. That was my presentation. (audience applauding)