Wednesday, June 17, 2009

To do SEM or Not to do SEM, that is the question?

I've been avoiding writing this because I'm not sure I can present a balanced case. The intense heat of the last couple of months in Delhi and Pondicherry (well over 40 degrees C) has not helped either: every time I sat down to write a draft, I felt drained and irritable. Finally, I've decided to take the bull by the horns and put it out there.

I like structural equation modeling (SEM) for its ability to model the rhythm, flow and intricacies of relationships in real data BUT (and here's the hard part) I would not recommend using it unless a strict set of conditions is met.

For the uninitiated, here is what structural equation models do:

  • They are a step up from regression models and allow you to incorporate multiple independent and dependent variables in the analysis.
  • The dependent and independent variables may be latent constructs formed by observed variables.
  • They allow for testing of specified relationships among the observed and latent variables in a kind of testing of hypothetical theoretical frameworks.
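
For readers who prefer notation, the three points above are often written in a LISREL-style matrix form. This is a sketch in common textbook notation; the symbols are my assumption of what is familiar and do not come from any particular study discussed here:

```latex
\begin{aligned}
x    &= \Lambda_x \xi + \delta          && \text{(measurement model, independent side)} \\
y    &= \Lambda_y \eta + \varepsilon    && \text{(measurement model, dependent side)} \\
\eta &= B\eta + \Gamma\xi + \zeta       && \text{(structural model)}
\end{aligned}
```

Here x and y are the observed variables, \xi and \eta are the latent independent and dependent constructs, \Lambda_x and \Lambda_y hold the loadings, B and \Gamma hold the hypothesized relationships among constructs, and \delta, \varepsilon and \zeta are error terms.
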
The problem is that I have rarely seen SEM done well with market research data. In fact, unlike regression, botched SEM can get really ugly. This is surprising because SEM has been around for a while now, and there is plenty of literature on its use in the industry and on the issues practitioners need to be aware of and deal with while constructing these models. So to many a client I would simply say: if you are going with SEM, make sure you have addressed all the issues outlined below; otherwise, stay with simple factor analysis and regression.

Here is why SEM completely falls apart in the hands of unskilled (even somewhat skilled) or ‘software trigger happy’ practitioners:
  1. Use of SEM in situations where the measurement structure underlying a set of items is not well established and there is no sound theoretical framework available for possible patterns of relationships among constructs
  2. Having too many single indicator constructs
  3. Items loading on more than one construct
  4. Low sample sizes relative to number of parameters to be estimated
  5. Failure to address issues such as outliers or non-normality of variables
  6. Use of too few measures of model fit, or of fit measures that do not address sample size biases
  7. Building models that are too complex
  8. No use of reliability measures to assess the quality of construct measurement
  9. Little attention given to variance fit measures in the structural model
  10. No specification and testing of alternative nested models
  11. Using modification indices and residual analysis too liberally to re-specify the model
  12. No cross validation of model
Of all the above, the main reason SEM fails on market research data is the first point. Some studies, such as satisfaction and engagement research, are easier to work with because they have stronger theoretical frameworks and constructs measured by cohesive indicators. Others, such as brand analysis, sometimes rely heavily on SEM as an exploratory tool for ‘confirming’ model structure.
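
Two of the checks in the list, sample size relative to parameters (point 4) and construct reliability (point 8), are easy to run before any model is fitted. The sketch below is a minimal illustration using only the Python standard library. The function names are hypothetical, Cronbach's alpha is just one possible reliability measure, and acceptable cut-offs vary by source, so none are hard-coded here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one construct.

    `items` is a list of columns, one per indicator, each holding the
    scores of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two indicators")
    n_respondents = len(items[0])
    if any(len(col) != n_respondents for col in items):
        raise ValueError("all indicators must have the same respondents")
    # Summed scale score per respondent.
    totals = [sum(col[i] for col in items) for i in range(n_respondents)]
    sum_item_variances = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


def model_degrees_of_freedom(n_observed, n_free_params):
    """Unique variances and covariances minus free parameters.

    A negative result means the model asks for more than the
    covariance matrix can supply.
    """
    unique_moments = n_observed * (n_observed + 1) // 2
    return unique_moments - n_free_params


def cases_per_parameter(sample_size, n_free_params):
    """Respondents available per free parameter to be estimated."""
    return sample_size / n_free_params


if __name__ == "__main__":
    # Made-up scores: three indicators, four respondents.
    scores = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 4]]
    print("alpha:", cronbach_alpha(scores))
    print("df:", model_degrees_of_freedom(n_observed=9, n_free_params=21))
    print("cases per parameter:", cases_per_parameter(210, 21))
```

A single-indicator construct (point 2) cannot be checked this way at all, which is part of why having too many of them is a problem.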

So what is the way out, one may ask? My point of view is this: unless the first condition is met, that is, compact indicators that measure the constructs and a sound theoretical framework for the latent constructs, proceed slowly and with care. If you do use SEM in that situation, take into consideration all the issues stated above, of which the most important is cross validation of the model.

To sum up, try to use the technique in a more confirmatory way, with a priori specification of hypothesized models. What you may get if SEM is used without prudence is a model structure so unique that it may not be real.
