WEBVTT 00:00:00.590 --> 00:00:02.830 To do research, we need some data. 00:00:02.830 --> 00:00:06.290 The data comes from what we know as measurement. 00:00:06.290 --> 00:00:10.850 Singleton & Straits define measurement as a process of assigning numbers. 00:00:10.850 --> 00:00:15.389 And we could also say that it's a process of assigning numbers according to a known 00:00:15.389 --> 00:00:21.460 rule, to be more precise. Measurement can also have different properties. 00:00:21.460 --> 00:00:26.439 The idea of measurement is that we typically start with some theoretical concepts. 00:00:26.439 --> 00:00:30.300 And then we need to come up with empirical measures of those theoretical concepts. 00:00:30.300 --> 00:00:34.290 And then we do calculations using those empirical measures. 00:00:34.290 --> 00:00:36.720 Let's take a look at how it actually works. 00:00:36.720 --> 00:00:40.040 I'm going to use this diagram from Mikko Ketokivi's book. 00:00:40.040 --> 00:00:44.649 And it's based on Bacharach's article in Academy of Management Review. 00:00:44.649 --> 00:00:48.840 To use this diagram, we need a claim. 00:00:48.840 --> 00:00:56.280 And our claim, or our proposition, is that naming a woman as a CEO causes profitability to increase. 00:00:56.280 --> 00:01:01.250 And this is a proposition: it's a claim on the level of theory. 00:01:01.250 --> 00:01:03.829 And it contains two theoretical concepts. 00:01:03.829 --> 00:01:06.350 The first concept is CEO gender. 00:01:06.350 --> 00:01:08.899 The second concept is profitability. 00:01:08.899 --> 00:01:15.279 And this proposition here is basically a causal claim, that a female CEO causes profitability 00:01:15.279 --> 00:01:17.079 to go up. 00:01:17.079 --> 00:01:18.849 How do we then actually test this? 00:01:18.849 --> 00:01:22.000 So the idea is that we have a proposition. 
00:01:22.000 --> 00:01:24.810 Then we need to have a way of assigning 00:01:24.810 --> 00:01:29.919 numbers to represent profitability and to represent CEO gender. 00:01:29.919 --> 00:01:35.560 Then we make a hypothesis that these representations, or empirical concepts, are associated. 00:01:35.560 --> 00:01:38.950 That's because we can't really observe causality; we can only 00:01:38.950 --> 00:01:40.369 observe associations. 00:01:40.369 --> 00:01:44.409 Then we collect some data and we test for statistical association. 00:01:44.409 --> 00:01:51.130 If we find a statistical association, then we conclude that we could not reject the 00:01:51.130 --> 00:01:56.009 hypothesis, and therefore we have found some evidence for the proposition. 00:01:56.009 --> 00:01:58.889 So how do we actually measure CEO gender? 00:01:58.889 --> 00:02:01.549 We could have every CEO go through a medical 00:02:01.549 --> 00:02:05.850 examination, and a doctor would determine their gender. 00:02:05.850 --> 00:02:10.110 Or we could send them surveys and ask them to report their gender. 00:02:10.110 --> 00:02:12.489 But that's not practical. 00:02:12.489 --> 00:02:18.650 So how do we go about determining, in practice, which CEOs are men and which are women? 00:02:18.650 --> 00:02:21.099 One easy way would be looking at their names. 00:02:21.099 --> 00:02:27.500 The names of CEOs are public information, so we can check whether the name is 00:02:27.500 --> 00:02:32.480 a man's name or a woman's name, and assign the gender variable according to that rule. 00:02:32.480 --> 00:02:36.590 So measurement is about assigning numbers according to a rule. 00:02:36.590 --> 00:02:42.129 Is the rule entirely reliable and entirely valid? 00:02:42.129 --> 00:02:45.599 Probably not, because some names can be used 00:02:45.599 --> 00:02:46.709 for both genders. 
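The name-based rule described above can be written down explicitly. What follows is only a minimal sketch: the name sets and the function `code_gender` are hypothetical, and a real study would need a far larger, country-aware name source. Names the rule cannot decide, including names used for both genders, are left uncoded rather than guessed.

```python
# Hypothetical, tiny name lists used only for illustration.
FEMALE_NAMES = {"mary", "anna", "susan", "kim"}
MALE_NAMES = {"john", "peter", "mikko", "kim"}


def code_gender(first_name):
    """Assign a number by rule: 1 = woman's name, 0 = man's name.

    Returns None when the rule cannot decide, either because the name
    appears in both lists (ambiguous) or in neither (unknown).
    """
    key = first_name.strip().lower()
    is_female = key in FEMALE_NAMES
    is_male = key in MALE_NAMES
    if is_female and not is_male:
        return 1
    if is_male and not is_female:
        return 0
    return None


# Code a few CEO first names according to the rule.
codes = {name: code_gender(name) for name in ["Mary", "John", "Kim", "Xiuying"]}
```

Returning `None` for undecidable names keeps the rule's limits visible in the data instead of hiding them behind a guess.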
00:02:46.709 --> 00:02:52.680 So there's a bit of unreliability: there's randomness in how a person 00:02:52.680 --> 00:02:53.890 evaluates them. 00:02:53.890 --> 00:02:59.040 Also, names that are from different countries might be difficult for us 00:02:59.040 --> 00:03:00.360 to evaluate. 00:03:00.360 --> 00:03:03.250 So we get some data about specific companies. 00:03:03.250 --> 00:03:05.459 So that's the measurement process. 00:03:05.459 --> 00:03:07.569 And then we evaluate. 00:03:07.569 --> 00:03:13.469 How do we evaluate profitability? We need to define what our measure is, and 00:03:13.469 --> 00:03:16.330 in this case, our measure is return on assets (ROA). 00:03:16.330 --> 00:03:20.469 How do we claim that this is a valid measure of profitability? 00:03:20.469 --> 00:03:24.780 We can claim validity because ROA is actually something that investors 00:03:24.780 --> 00:03:29.950 look at when they look at profitability differences between companies. 00:03:29.950 --> 00:03:35.069 So it is valid in the sense that it is what people actually use. 00:03:35.069 --> 00:03:39.629 We can also argue for the validity of ROA based on the understanding 00:03:39.629 --> 00:03:43.319 that the profitability or performance of a company 00:03:43.319 --> 00:03:47.879 is related to how much money, or how much profit, it makes. 00:03:47.879 --> 00:03:50.739 And in ROA, the return is the profit. 00:03:50.739 --> 00:03:56.629 And the assets simply scale those profits to be comparable across companies of different 00:03:56.629 --> 00:03:57.629 sizes. 00:03:57.629 --> 00:04:00.370 So we can make two different arguments. 00:04:00.370 --> 00:04:02.239 We can say that this is what people actually 00:04:02.239 --> 00:04:07.670 use for profitability, or we can say that ROA can be derived from profitability, 00:04:07.670 --> 00:04:09.730 and therefore it's a valid measure. 
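The scaling argument for ROA can be illustrated with a short calculation. This is a sketch with made-up figures; the function name `return_on_assets` and the two example companies are assumptions for illustration, not data from the lecture.

```python
def return_on_assets(profit, total_assets):
    """Return profit divided by total assets (ROA as a fraction)."""
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    return profit / total_assets


# Made-up figures, in millions: the large firm earns far more in
# absolute terms, but less per unit of assets.
small_firm_roa = return_on_assets(profit=2.0, total_assets=20.0)
large_firm_roa = return_on_assets(profit=50.0, total_assets=1000.0)
```

Dividing by assets is what makes the two firms comparable: raw profit would rank the large firm first, while ROA ranks the small firm first.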
00:04:09.730 --> 00:04:16.420 Then we collect some data for specific companies, and we calculate some kind 00:04:16.420 --> 00:04:17.769 of association. 00:04:17.769 --> 00:04:20.019 In this diagram, 00:04:20.019 --> 00:04:21.049 reliability is here. 00:04:21.049 --> 00:04:23.130 It's a very low-level thing. 00:04:23.130 --> 00:04:29.020 Would we get the same data if we collected the ROAs and CEO genders again the next day? 00:04:29.020 --> 00:04:30.020 Probably yes. 00:04:30.020 --> 00:04:31.950 So this is probably highly reliable. 00:04:31.950 --> 00:04:34.110 Reliability also concerns the rating here. 00:04:34.110 --> 00:04:39.950 So if a person rates some companies as men-led or women-led based on the CEO's name, would 00:04:39.950 --> 00:04:42.810 that be perfectly reliable? 00:04:42.810 --> 00:04:43.810 Maybe not. 00:04:43.810 --> 00:04:51.949 Maybe 99.95% reliable, but certainly not 100% reliable, because some names are gender-ambiguous. 00:04:51.949 --> 00:04:57.880 And then validity concerns whether we can claim that a name is actually valid information 00:04:57.880 --> 00:04:58.979 about gender. 00:04:58.979 --> 00:05:01.979 We can say that it is, because of the tradition of naming boys in 00:05:01.979 --> 00:05:04.050 one way and girls in another way. 00:05:04.050 --> 00:05:08.759 And then there is whether ROA is a valid measure of profitability, which we can claim based 00:05:08.759 --> 00:05:13.530 on the definition of ROA or based on the fact that ROA is actually used as a 00:05:13.530 --> 00:05:16.810 measure of profitability by, for example, investors. 00:05:16.810 --> 00:05:22.669 So this is, in a nutshell, the idea of measurement: we have concepts, and then we derive empirical 00:05:22.669 --> 00:05:25.240 representations for those concepts. 00:05:25.240 --> 00:05:31.440 We collect some data and analyze the association; if an association exists, we conclude that we 00:05:31.440 --> 00:05:33.080 got some support for the theoretical proposition.
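The last step, testing whether the two empirical measures are associated, can be sketched as follows. The lecture does not name a specific test; an exact permutation test on the difference in mean ROA is used here as one simple option. The data are made up, and the names `firms`, `mean_difference`, and `exact_permutation_test` are hypothetical. An association found this way is evidence for the proposition, but it does not by itself show causality.

```python
from itertools import combinations
from statistics import mean

# Made-up data: (gender code, ROA), with 1 = woman-led and 0 = man-led.
firms = [(1, 0.12), (1, 0.10), (1, 0.11),
         (0, 0.06), (0, 0.07), (0, 0.05), (0, 0.08), (0, 0.04)]


def mean_difference(roas, group_idx):
    """Mean ROA of the firms in group_idx minus mean ROA of the rest."""
    in_group = [roas[i] for i in group_idx]
    rest = [r for i, r in enumerate(roas) if i not in group_idx]
    return mean(in_group) - mean(rest)


def exact_permutation_test(data):
    """Return the observed mean difference and a two-sided p-value.

    The p-value is the share of all possible relabelings of the firms
    (keeping the group sizes fixed) whose absolute mean difference is
    at least as large as the observed one.
    """
    roas = [roa for _, roa in data]
    women_idx = {i for i, (code, _) in enumerate(data) if code == 1}
    observed = mean_difference(roas, women_idx)
    as_extreme = 0
    total = 0
    for combo in combinations(range(len(roas)), len(women_idx)):
        total += 1
        # Small tolerance so floating-point noise does not drop ties.
        if abs(mean_difference(roas, set(combo))) >= abs(observed) - 1e-12:
            as_extreme += 1
    return observed, as_extreme / total


observed_diff, p_value = exact_permutation_test(firms)
```

With only eight firms, every relabeling can be enumerated, so no random sampling is needed; for realistic sample sizes a randomized permutation test or a standard two-sample test would be the practical choice.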