The reason the predictions are high in the 2-lepton categories

I’ve found the reason the predictions are high in the 2-lepton categories.  The problem is in the 2-lepton to 1-lepton ratios, but I’m not sure how to fix it.

To calculate the expected contribution from a particular 2-lepton process, say an e+ from a heavy jet and an e- from a light jet (abbreviated ePh and eNl), we find N(ePh)_data from the pTrel/d0 fit and multiply it by the ratio found in the MC, N(ePh + eNl)/N(ePh):

N(ePh)_data * N(ePh + eNl)_MC / N(ePh)_MC = N(ePh + eNl)_predicted
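
As a quick illustration of how sensitive this is to the MC denominator, here is a minimal Python sketch of the formula above. The function name and all of the input numbers are placeholders I made up for the example; they are not values from the fit or from either MC sample.

```python
def predicted_dilepton_yield(n_single_data, n_double_mc, n_single_mc):
    """Return N(ePh)_data * N(ePh + eNl)_MC / N(ePh)_MC."""
    if n_single_mc <= 0:
        raise ValueError("N(ePh)_MC must be positive to form the ratio")
    return n_single_data * n_double_mc / n_single_mc


# Placeholder inputs only (assumed for illustration, not analysis numbers).
n_eph_data = 200.0   # stand-in for N(ePh)_data from the pTrel/d0 fit
n_double_mc = 30.0   # stand-in for N(ePh + eNl)_MC
n_eph_mc = 300.0     # stand-in for N(ePh)_MC

prediction = predicted_dilepton_yield(n_eph_data, n_double_mc, n_eph_mc)

# Same numerator, but twice as many single-lepton MC events:
# the ratio, and with it the prediction, is cut in half.
prediction_doubled = predicted_dilepton_yield(
    n_eph_data, n_double_mc, 2 * n_eph_mc
)

print(prediction, prediction_doubled)
```

With the numerator held fixed, doubling N(ePh)_MC halves the predicted 2-lepton yield, so the choice of MC sample feeds directly into the prediction.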

Currently, we’re using the W+heavy MC to calculate N(ePh + eNl)_MC / N(ePh)_MC.  Naively, I would expect the W+jets MC to give the same answer, but with larger error bars, since there is some W+heavy in the W+jets sample.  However, it turns out that the two samples give ratios that differ by a factor of two.  They agree on the number of expected ePh + eNl events, but the W+jets sample has significantly more ePh events.  (Remember that the categories are exclusive.)

In the following plots, the X-axis is the number of e+ from heavy jets in the event, and the Y-axis is the number of e- from light jets.  Look at the (1,1) and (1,0) bins.  The top plot is W+jets, and the bottom one is W+heavy.
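
For anyone who wants to reproduce the comparison from the bin contents, the ratio can be read off per sample roughly as follows. This is only a sketch: it assumes each event has already been reduced to a pair (number of e+ from heavy jets, number of e- from light jets), and the two toy samples are invented to mimic the pattern in the plots, not taken from the actual MC.

```python
from collections import Counter


def bin_ratio(events):
    """Return N(1,1) / N(1,0) for an iterable of (n_ePh, n_eNl) pairs.

    The bins are exclusive: an event in the (1,1) bin is not
    also counted in the (1,0) bin.
    """
    bins = Counter(events)
    n_single = bins[(1, 0)]   # ePh only
    n_double = bins[(1, 1)]   # ePh + eNl
    if n_single == 0:
        raise ValueError("no events in the (1,0) bin")
    return n_double / n_single


# Toy samples (made up): same (1,1) count, twice the (1,0) count in "W+jets".
toy_wheavy = [(1, 0)] * 10 + [(1, 1)] * 2 + [(0, 0)] * 5
toy_wjets = [(1, 0)] * 20 + [(1, 1)] * 2 + [(0, 0)] * 50

ratio_wheavy = bin_ratio(toy_wheavy)
ratio_wjets = bin_ratio(toy_wjets)

print(ratio_wheavy, ratio_wjets)
```

In the toy case the two samples have the same (1,1) content, and the factor of two in the ratio comes entirely from the (1,0) bin.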

N(ePh) is twice as large in the W+jets sample as in the W+heavy sample.  We’re currently using the W+heavy numbers in our calculations (on the assumption that the error bars would be smaller), but the W+jets numbers seem to agree better with the data.  Is there a reason the two should be so wildly discrepant?

-Scott