Thursday, July 22, 2004
Bad Econometrics
I checked EconoPundit today and found this interesting link. You can pretty much look at this chart and his prediction and go "Hm, that can't be right." Indeed, if you look at what his model's prediction was, it seems that he's predicting something like 1 million new jobs a month for the next 5 months. That doesn't seem right, does it? His "most pessimistic prediction" of 1.53 million new jobs by the election breaks down to something like 300k new jobs a month. That's not impossible, but it seems unlikely to happen at this point. More likely, he doesn't know what he's doing, which is strange, since he says he's teaching a class on this stuff.
So, I started playing with the data (the BLS's payroll employment series). When I ran a simple model that regressed total employment against time, I got a jump in the predicted months very similar to his, so I imagine that's something like what he did. I can't know for sure, of course, because he doesn't say. If that's what he did, his model is simply picking up the long-term growth trend in total employment. Given that we've fallen behind the trend in recent years, of course the model now predicts that we'll be back on the trend; it doesn't take recent history into account, just the "employment schedule" against time. That's all the model knows about, as specified. Just look at his graph and imagine the trend line: his prediction sits right on it.
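Here is a rough sketch of why a regression against time behaves this way. The numbers are made up for illustration (they are not the BLS series), and `fit_linear_trend` is just a hypothetical helper name; I have no idea whether this matches his actual model. The series grows steadily and then stalls, and the fitted line ignores the stall entirely.

```python
# Illustrative only: the employment numbers below are made up, not BLS data.

def fit_linear_trend(y):
    """Ordinary least squares fit of y[t] = a + b*t for t = 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

# Hypothetical series in thousands of jobs: steady growth of 150k a month
# for 100 months, then 20 months with no growth at all.
employment = [100_000 + 150 * t for t in range(100)]
employment += [employment[-1]] * 20

intercept, slope = fit_linear_trend(employment)
next_month = len(employment)
trend_forecast = intercept + slope * next_month

# The forecast snaps back to the trend line, so the "predicted" one-month
# change is far larger than the monthly growth the trend itself implies.
implied_jump = trend_forecast - employment[-1]

print(f"fitted trend: {slope:.0f}k jobs a month")
print(f"last observed level: {employment[-1]:,}k")
print(f"trend forecast for next month: {trend_forecast:,.0f}k")
print(f"implied one-month jump: {implied_jump:,.0f}k")
```

The point is only qualitative: the model's first forecast has to close the whole gap between the last observation and the line, which is where an implausible jump comes from.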
A better idea is to detrend the series by taking first differences (that is, construct the series of month-to-month changes in total employment). Then you can fit a time-series model to the differenced series (I tried an ARMA(2,2) model, which combines autoregressive and moving-average terms, just off the top of my head). That model predicted growth in total employment of around 160k to 200k a month for the rest of the year, meaning we'd just barely get back to peak employment by then. That model forecasts each month's change in employment from what we've seen in the preceding few months (with the coefficients estimated from the full history of the data). We might do better than that, and we might do slightly worse than that, but at least the numbers pass a simple sanity check. I expect we'll probably actually do better than that; 200k is not a bad guess for this month or next, but less than that seems unlikely after that.
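To make the differencing idea concrete, here is a minimal sketch. It is not the ARMA(2,2) fit described above: to keep it dependency-free I've swapped in a plain AR(1) fitted by least squares, and the levels are made-up numbers, not the BLS data. The names `fit_ar1` and `forecast_changes` are hypothetical helpers.

```python
# Illustrative only: made-up monthly employment levels in thousands, not BLS data.
levels = [130_000, 130_210, 130_390, 130_600, 130_770, 130_980,
          131_150, 131_370, 131_540, 131_750, 131_930, 132_140]

# First differences: the change from each month to the next.
changes = [b - a for a, b in zip(levels, levels[1:])]

def fit_ar1(x):
    """Least squares fit of x[t] = c + phi * x[t-1]."""
    prev, curr = x[:-1], x[1:]
    n = len(curr)
    prev_mean = sum(prev) / n
    curr_mean = sum(curr) / n
    sxy = sum((p - prev_mean) * (q - curr_mean) for p, q in zip(prev, curr))
    sxx = sum((p - prev_mean) ** 2 for p in prev)
    phi = sxy / sxx
    c = curr_mean - phi * prev_mean
    return c, phi

def forecast_changes(x, steps):
    """Iterate the fitted AR(1) forward from the last observed change."""
    c, phi = fit_ar1(x)
    out, last = [], x[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out

predicted = forecast_changes(changes, steps=5)
for month, change in enumerate(predicted, start=1):
    print(f"month +{month}: predicted change of about {change:.0f}k")
```

Because the model only ever sees monthly changes, its forecasts stay in the neighborhood of recent changes instead of leaping back to a long-run level. A real ARMA fit would need a statistics package and would add moving-average terms on top of this.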
Obvious disclaimer: I know absolutely nothing about labor market forecasting; I'm just naively looking at his data and model and pointing out what seems to be a pretty clear problem with what he's doing. Specifically, regressing a trending series against time like this is a pretty clear violation of one of the assumptions of ordinary least squares estimation: that the residuals are independently distributed (no serial correlation and no leftover trend).
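One standard way to check for that kind of serial correlation is the Durbin-Watson statistic on the regression residuals: values near 2 suggest no first-order autocorrelation, and values near 0 suggest strong positive autocorrelation. This is my own addition as a sketch, run on made-up numbers rather than the BLS series, and the helper names are hypothetical.

```python
# Illustrative only: the employment numbers below are made up, not BLS data.

def linear_trend_residuals(y):
    """Residuals from an OLS fit of y[t] = a + b*t for t = 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(y)]

def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical series in thousands of jobs: 100 months of steady growth,
# then 20 months with no growth.
employment = [100_000 + 150 * t for t in range(100)]
employment += [employment[-1]] * 20

dw = durbin_watson(linear_trend_residuals(employment))
print(f"Durbin-Watson statistic: {dw:.3f}")
```

On a series like this the residuals drift in long runs above and below the line, so the statistic lands far below 2, which is the symptom of the violated assumption.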
Of course, I emailed him about this, asking for more details on his model, in case I'm the one who's confused, and mentioning my findings. He hasn't replied or changed his site, so either he simply hasn't received my email yet (increasingly unlikely as time goes by), or he's not really interested in doing it right.
Comments:
Well, given that he's eager to mention the election in his forecast, we all know where he stands politically. I'd just classify it under the "I'll say anything to help me (and my readers) believe my candidate will be (re-)elected."