WEBVTT Kind: captions Language: en 00:00:00.120 --> 00:00:03.930 Formative measurement is a controversial concept that 00:00:03.930 --> 00:00:06.420 nevertheless sees some applications in research. 00:00:06.420 --> 00:00:11.490 You need to understand this concept to understand why it's controversial and 00:00:11.490 --> 00:00:15.750 then you can make an informed decision about whether this is something that you should 00:00:15.750 --> 00:00:20.970 use or not. Also, when you review work by others you will eventually encounter people 00:00:20.970 --> 00:00:25.950 who claim that they use formative indicators, formative measures or causal indicators. 00:00:25.950 --> 00:00:29.040 So what is this concept about and why is there such controversy? 00:00:29.040 --> 00:00:36.690 The normal measurement model that we use shows that the concept that we measure is a cause of 00:00:36.690 --> 00:00:41.790 the indicators. So the definition of validity that I'll be using in these videos says that 00:00:41.790 --> 00:00:46.410 an indicator is valid if the variation in the indicator is causally produced by the 00:00:46.410 --> 00:00:51.180 variation of the construct. So the idea is that the indicators vary because the construct varies. 00:00:51.180 --> 00:00:57.360 Then in formative measurement this idea is reversed. The idea is that 00:00:57.360 --> 00:01:00.990 the measure causes the construct, or that a set of three 00:01:00.990 --> 00:01:04.680 measures, for example, together cause the concept that we're measuring. 00:01:04.680 --> 00:01:10.590 What exactly does it mean that the measure causes the construct? It's easy to understand how 00:01:10.590 --> 00:01:16.290 innovativeness of a company, for example, could cause some people to respond high 00:01:16.290 --> 00:01:22.740 on a question about innovation and some people to respond low on a question about innovation. 00:01:22.740 --> 00:01:25.200 So what does it mean that this is reversed? 
00:01:25.200 --> 00:01:29.760 The problem is that the literature doesn't really explain what it means 00:01:29.760 --> 00:01:35.460 that the measure causes the construct. If we take this literally then it would mean 00:01:35.460 --> 00:01:40.440 that when a CEO responds positively to the question about innovativeness 00:01:40.440 --> 00:01:44.970 then that causes the company to be innovative. That's clearly implausible. 00:01:44.970 --> 00:01:51.180 Another example that is commonly used is socioeconomic status. So for example 00:01:51.180 --> 00:01:57.600 how people respond to the questions about income 00:01:57.600 --> 00:02:03.060 and education and other things defines their socioeconomic status. How you respond 00:02:03.060 --> 00:02:08.550 to questions certainly has no causal effect on your socioeconomic status. 00:02:08.550 --> 00:02:16.570 So there's controversy, and some methodologists say that this idea should be abandoned altogether. 00:02:16.570 --> 00:02:24.280 For example, we have these guys like Lee and Cadogan and Chamberlain who say that formative 00:02:24.280 --> 00:02:29.980 indicators are not measures at all partly and Mark Lee says that researchers should abandon 00:02:29.980 --> 00:02:38.080 this approach until we figure out the problems, and then Edwards says that looking at the problems 00:02:38.080 --> 00:02:42.250 of formative measurement leads to the logical conclusion that the approach should be abandoned. 00:02:42.250 --> 00:02:46.120 So what kinds of problems do we have with the idea that the indicators cause 00:02:46.120 --> 00:02:49.660 the construct, and that there can be some other causes of the construct as well? 00:02:49.660 --> 00:02:55.810 Let's first take a look at how the advocates of this approach recommend that it be used. 
00:02:55.810 --> 00:03:00.910 Commonly when you read an article about formative measurement you see 00:03:00.910 --> 00:03:06.130 these kinds of guidelines. There are many guideline-type articles that 00:03:06.130 --> 00:03:09.430 tell you when you should use these approaches and when you shouldn't. 00:03:09.430 --> 00:03:16.330 There are two basic rules for when you should apply formative measurement according to the 00:03:16.330 --> 00:03:22.120 advocates. The first one is: when you have a set of indicators, how do you 00:03:22.120 --> 00:03:30.070 expect those indicators to be related as a set with the concept? So do you expect 00:03:30.070 --> 00:03:35.890 the concept to be higher when one of the indicators but not the others changes, or 00:03:35.890 --> 00:03:42.700 do you expect the indicators all to be higher when the concept changes? 00:03:42.700 --> 00:03:48.460 So the idea of the normal factor-analysis-based measurement model is that we have 00:03:48.460 --> 00:03:55.150 unidimensionality. So when we have a set of indicators that are supposed to measure 00:03:55.150 --> 00:04:00.850 innovativeness, then for highly innovative companies we should expect answers to those 00:04:00.850 --> 00:04:07.660 questions to be on average always higher than for companies that are not so innovative. 00:04:07.660 --> 00:04:11.500 In formative measurement the idea is that these indicators 00:04:11.500 --> 00:04:16.030 represent different causes or different dimensions. For example, if we have 00:04:16.030 --> 00:04:20.860 socioeconomic status that's measured with income and education, 00:04:20.860 --> 00:04:26.140 then we don't necessarily expect that income and education covary. 
00:04:26.140 --> 00:04:31.780 So that's one: should we expect the indicators to covary? In the normal factor- 00:04:31.780 --> 00:04:36.760 analysis-based measurement model we do; in the formative measurement model we don't. So that's 00:04:36.760 --> 00:04:42.190 one rule of thumb for when the advocates say that you should apply formative measurement. 00:04:42.190 --> 00:04:48.160 Another one is an empirical test, and one particularly commonly used test is called the 00:04:48.160 --> 00:04:57.820 vanishing tetrad test by Bollen. The idea here is that the factor analytical model says 00:04:57.820 --> 00:05:01.570 that the correlation structure should be such that all the indicators are positively 00:05:01.570 --> 00:05:06.880 correlated and there are certain constraints they should follow. And when the chi-square 00:05:06.880 --> 00:05:14.230 test rejects that model, then the claim is that the reflective model 00:05:14.230 --> 00:05:19.540 or the factor analytical model is untenable for the data and therefore the formative 00:05:19.540 --> 00:05:25.180 model, where we model the latent variable as being caused by the indicators, is more reasonable. 00:05:25.180 --> 00:05:33.280 This is a logical fallacy. The reason why this is a logical fallacy is that rejection of one model 00:05:33.280 --> 00:05:40.450 does not imply the acceptance of another model. So it could be that your indicators are reflecting 00:05:40.450 --> 00:05:45.310 the concept, that is, the concept causes variation in the indicators, but they're just bad measures. 00:05:45.310 --> 00:05:49.480 So because they're bad measures the model doesn't work. It doesn't imply 00:05:49.480 --> 00:05:53.410 that if you have bad measures you should just take a sum of those indicators and 00:05:53.410 --> 00:05:59.170 then ignore that your original model didn't work with those indicators. 
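The constraints mentioned above can be illustrated with a minimal sketch. It assumes a standardized one-factor model, in which the implied correlation between two indicators is the product of their loadings, so every tetrad difference is zero. The loadings and the second correlation matrix are made-up values, and the function names are hypothetical. This covers only the population algebra; the actual test is a chi-square test on sample tetrads, which is not implemented here.

```python
from itertools import combinations


def implied_correlations(loadings):
    """Correlation matrix implied by a one-factor model with standardized loadings."""
    n = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


def tetrad_differences(r):
    """All three tetrad differences for every set of four indicators."""
    out = []
    for a, b, c, d in combinations(range(len(r)), 4):
        out.append(r[a][b] * r[c][d] - r[a][c] * r[b][d])
        out.append(r[a][c] * r[b][d] - r[a][d] * r[b][c])
        out.append(r[a][b] * r[c][d] - r[a][d] * r[b][c])
    return out


# Made-up loadings: under one common factor the tetrads vanish (up to rounding).
one_factor = implied_correlations([0.8, 0.7, 0.6, 0.5])
max_one_factor = max(abs(t) for t in tetrad_differences(one_factor))

# Made-up correlations with two clusters of indicators. A single factor
# cannot reproduce this pattern, so some tetrads do not vanish.
two_clusters = [[1.0, 0.6, 0.1, 0.1],
                [0.6, 1.0, 0.1, 0.1],
                [0.1, 0.1, 1.0, 0.6],
                [0.1, 0.1, 0.6, 1.0]]
max_two_clusters = max(abs(t) for t in tetrad_differences(two_clusters))

print(max_one_factor, max_two_clusters)
```

The second matrix is the kind of pattern that two separate reflective factors would produce, so non-vanishing tetrads do not by themselves point to a formative model. That is the same point as the fallacy described above: rejecting one model does not establish another.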
00:05:59.170 --> 00:06:05.530 So the mental experiment has some merit; these empirical tests really don't. 00:06:05.530 --> 00:06:15.700 Let's take a closer look at why the idea of formative measurement is troublesome. I'm not 00:06:15.700 --> 00:06:21.400 against taking multiple different indicators of different things and 00:06:21.400 --> 00:06:27.460 making an index. There are many cases where combining things that don't really correlate 00:06:27.460 --> 00:06:33.790 into one number makes a lot of sense. For example, stock indices are made that way. The individual 00:06:33.790 --> 00:06:40.480 stocks correlate to some extent but they are not very highly correlated. Yet taking a sum of 00:06:40.480 --> 00:06:45.790 these weakly correlated variables produces a very good measure of the overall stock market performance. 00:06:45.790 --> 00:06:51.100 So taking indices, taking sums, is not problematic. What is problematic is 00:06:51.100 --> 00:06:56.380 the attachment of the idea of measurement to this sum of indicators as an index. 00:06:56.380 --> 00:07:01.660 The idea of measurement was that we had the theoretical concept and the measurement result 00:07:01.660 --> 00:07:05.680 and they have some kind of relationship. So there must be some kind of statistical 00:07:05.680 --> 00:07:09.880 association between the measurement result and the theoretical concept. 00:07:09.880 --> 00:07:15.340 The traditional way is again thinking that the theoretical concept's variation causes 00:07:15.340 --> 00:07:21.400 variation in the measurement results and then we model it that way. We model a latent variable 00:07:21.400 --> 00:07:27.340 representing the theoretical concept and then the arrows go towards the measurement results. 00:07:27.340 --> 00:07:33.220 So we take the measurement results here and then we build a statistical model based on 00:07:33.220 --> 00:07:37.480 those results. 
So we have three different things that we need to consider. We need to 00:07:37.480 --> 00:07:41.590 consider the theoretical concept here, then the measurement results, 00:07:41.590 --> 00:07:45.580 and how we build a statistical model based on those measurement results. 00:07:45.580 --> 00:07:52.030 The idea here is that the statistical model should be a representation of 00:07:52.030 --> 00:07:58.660 the theoretical concept, and if we represent a measurement relationship (the measurement 00:07:58.660 --> 00:08:02.920 relationship is the relation between the measurement result and the thing 00:08:02.920 --> 00:08:07.060 being measured) then we call the resulting model a measurement model. 00:08:07.060 --> 00:08:15.160 So the statistical model here is a representation of the theoretical concept there. 00:08:15.160 --> 00:08:27.550 So what's the problem here with the formative measurement thing? Keith Markus is 00:08:27.550 --> 00:08:32.800 one of the people who don't really think this approach makes sense, and 00:08:32.800 --> 00:08:40.390 his recent article goes over a couple of conceptual impediments to this discussion 00:08:40.390 --> 00:08:47.320 about the merits and weaknesses of formative measurement. He states that one 00:08:47.320 --> 00:08:52.060 of the big problems here is that the literature on formative measurement 00:08:52.060 --> 00:08:56.290 is focused too much on the modeling part, so how do we construct models. 00:08:56.290 --> 00:09:01.870 It often makes sense to make indices out of indicators, but that sum of indicators alone 00:09:01.870 --> 00:09:07.480 does not represent measurement. So there's a clear distinction between how we model 00:09:07.480 --> 00:09:12.670 things and what it means to measure things, and that's what he is saying. 
00:09:12.670 --> 00:09:17.710 So I think that he agrees that no one is seriously saying that 00:09:17.710 --> 00:09:23.410 how you respond to questions about income and education causes your 00:09:23.410 --> 00:09:27.850 socioeconomic status. We can take a sum of those indicators and use that sum 00:09:27.850 --> 00:09:34.600 as a measure of socioeconomic status, but that has really nothing to do with causality 00:09:34.600 --> 00:09:39.580 and measurement. You're just aggregating things into a useful index. It's not measurement. 00:09:39.580 --> 00:09:44.620 So when we look again at this figure, the formative measurement model looks like that. 00:09:44.620 --> 00:09:50.410 We specify that the indicators are freely correlated. We don't say that the construct 00:09:50.410 --> 00:09:55.030 represented by this latent variable is the cause of these indicator covariances, and 00:09:55.030 --> 00:10:03.430 then we say that this model here, if this model is a valid measurement model, then it 00:10:03.430 --> 00:10:09.970 should be a good representation of the relation between a theoretical concept and the measure. 00:10:09.970 --> 00:10:17.980 So in practice we can't really defend the idea that the measurement results cause 00:10:17.980 --> 00:10:24.880 the concept, so that's indefensible. If it was the case we could easily 00:10:24.880 --> 00:10:30.880 manipulate social behavior by just having people respond in a particular way in a survey 00:10:30.880 --> 00:10:35.110 instrument and then we would see an effect in reality. It just doesn't happen that way. 00:10:35.110 --> 00:10:39.670 So the relationship between theoretical concept and measure is always from the 00:10:39.670 --> 00:10:43.090 theoretical concept to the measure and not the other way around. At 00:10:43.090 --> 00:10:45.610 least there is no evidence that it would go the other way around. 
00:10:45.610 --> 00:10:50.350 And then this model, if this is a good measurement model, should be a good 00:10:50.350 --> 00:10:58.420 representation of these causal relationships here, and for that reason calling these 00:10:58.420 --> 00:11:04.240 formative models measurement models is misguided. It could be a useful model. So 00:11:04.240 --> 00:11:08.830 when we aggregate things as an index, it could be useful but it's not a model 00:11:08.830 --> 00:11:14.350 of measurement. So that's one of the key points of the opponents of this approach. 00:11:14.350 --> 00:11:19.090 Nobody is saying that you should never aggregate different dimensions into one 00:11:19.090 --> 00:11:24.940 index, but simply that it is not about validation of measurement. So 00:11:24.940 --> 00:11:30.280 the label of formative measurement is the problematic thing. Saying that you construct 00:11:30.280 --> 00:11:36.130 an index and explaining why the index is useful, that's fairly unproblematic. 00:11:36.130 --> 00:11:43.180 There are alternatives to formative measurement. So how do we model measurement? We could say 00:11:43.180 --> 00:11:49.900 that we have different indicators here. So we have three indicators and we say that 00:11:49.900 --> 00:11:55.540 these indicators x1, x2 and x3 measure these three different latent variables that 00:11:55.540 --> 00:12:01.270 then cause this latent variable of interest. There is nothing wrong with 00:12:01.270 --> 00:12:06.490 this kind of model. So we are saying that these indicators x1, x2, x3 are valid because 00:12:06.490 --> 00:12:11.320 their variation is causally produced by the variation of the latent variables that we 00:12:11.320 --> 00:12:15.400 measure, and then we are interested in the outcome of these three latent variables. 
00:12:15.400 --> 00:12:21.880 So there's nothing wrong with that. Statistically that's nearly identical to the 00:12:21.880 --> 00:12:26.890 formative measurement model because we have to, for identification, assume that all these 00:12:26.890 --> 00:12:31.930 indicators x1, x2 and x3 are perfectly reliable. We cannot estimate their error variances. 00:12:31.930 --> 00:12:41.230 Of course this model can be extended. So we can add multiple indicators for each of these latent 00:12:41.230 --> 00:12:46.870 variables that we say are causes of this latent variable of ultimate interest. 00:12:46.870 --> 00:12:53.770 So that's a second-order model and that's a fairly defensible model. So you measure 00:12:53.770 --> 00:13:00.070 three different dimensions and you measure each with parallel indicators, or indicators that 00:13:00.070 --> 00:13:06.130 you assume to be parallel. Then you take a sum of those three latent variables, justify why 00:13:06.130 --> 00:13:10.000 it makes sense and publish the paper. No one is going to argue against that. 00:13:10.000 --> 00:13:16.870 The problem is that when we say that the indicator causes the construct, that's implausible. 00:13:16.870 --> 00:13:20.260 So let's take a summary of formative measurement, 00:13:20.260 --> 00:13:24.270 and this is something that you may find useful when you review work 00:13:24.270 --> 00:13:28.980 done by others or if you consider using formative measurement models yourself. 00:13:30.300 --> 00:13:34.440 The first thing to understand is that formative measurement does not exist, so 00:13:34.440 --> 00:13:39.930 indicators don't cause the concept. If you think that the indicator really causes 00:13:39.930 --> 00:13:48.060 a concept, that's fairly easy to demonstrate with experimental research. 
Just do a survey form 00:13:48.060 --> 00:13:53.160 where you think that the indicators cause the construct, then instruct some people to always 00:13:53.160 --> 00:13:58.020 answer on the left-hand side of the scale and another set of people to always answer on the 00:13:58.020 --> 00:14:03.630 right-hand side of the scale. If you randomize, that's a valid experiment. Then wait one year and then 00:14:03.630 --> 00:14:08.850 measure the latent variable that is supposed to be caused by these indicators and see if 00:14:08.850 --> 00:14:12.300 there are differences between the groups. If you can find a difference, that would 00:14:12.300 --> 00:14:17.460 be a huge finding. I don't think that anyone ever will because it's not a realistic idea. 00:14:17.460 --> 00:14:24.990 Formative models or indices can be used. You can take sums of different variables. 00:14:24.990 --> 00:14:29.730 There are good reasons to do so. I'll go through those reasons in a different video. 00:14:29.730 --> 00:14:36.270 But the items that go into the index have to be validated separately. So just 00:14:36.270 --> 00:14:43.560 taking a sum of three different indicators has nothing to do with validation. We can 00:14:43.560 --> 00:14:48.120 sum things such as a person's height and weight. It doesn't tell us whether those measures 00:14:48.120 --> 00:14:54.300 of height and weight are valid or reliable, and the sum doesn't really make any sense anyway. 00:14:54.300 --> 00:15:01.050 Then you have to justify why the index is useful. So if you take a sum of people's 00:15:01.050 --> 00:15:06.720 height and weight, what use would such an index be? So it's an argument that is 00:15:06.720 --> 00:15:11.430 non-statistical: you have to explain why combining different things makes sense.
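One way to see why the sum of height and weight is hard to justify is that the ordering of people on such an index depends on the units chosen. The sketch below uses two made-up people and a hypothetical helper, `rank_by_sum`; it is only an illustration of the point above, not a validation procedure.

```python
# Made-up data: (height in centimetres, weight in kilograms).
people = {"A": (150.0, 80.0), "B": (190.0, 50.0)}


def rank_by_sum(height_scale):
    """Rank people from highest to lowest on height * height_scale + weight."""
    def index(name):
        height_cm, weight_kg = people[name]
        return height_cm * height_scale + weight_kg
    return sorted(people, key=index, reverse=True)


# Height in centimetres: A scores 230, B scores 240, so B ranks first.
ranking_cm = rank_by_sum(1.0)

# Height in metres: A scores 81.5, B scores 51.9, so A ranks first.
ranking_m = rank_by_sum(0.01)

print(ranking_cm, ranking_m)
```

Nothing in the arithmetic says which ranking is right. That choice has to come from a substantive argument about what the index is for, which is the non-statistical justification described above.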