Greetings,
1) Before running a parasitic back-annotated simulation, you might 
    want to try running just an LPE (layout parameter extracted)
    simulation. That is, you extract just the devices and simulate.
    --> This validates that the layout is correct. You don't say much
          about the implementation, but it may be that the folding and
          merging of devices has altered something.
    --> Basically, your simulations with back-annotated parasitic
          capacitance provide equivalent information, so this appears to
          be done and the issue is not in the basic implementation.
2) There are two things (at least) to look for with back-annotated
    parasitics:
    a) Are the biases correct?
        If the bias currents are different, then the performance will
        be different. An easy test is the power dissipation: is it the
        same in both the pre-layout and post-layout simulation results?
        After that, check each stage for bias. Then check the
        common-mode levels.
    b) If the bias voltages and currents are okay and the performance
        is still off, then you need to look for other issues.
        - Can you probe internally and look at the output of the first
          stage? Are the results the same for both pre-layout and
          post-layout?
        - Can you look at the gain of the second stage? Is the gain
           correct? You don't specify the implementation in detail.
           However, common-source amplifiers are sensitive to the
           resistance between the source and the power supply, since
           this resistance can act like a degeneration resistor.
        - Can you look at the common-mode feedback amplifier?
          If it is not working properly, the common-mode level may
          end up outside the optimum operating range.
    ---> You need to localize the issue, if possible. Localizing the issue
           can be difficult because everything is connected.
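
The checks in 2a) and 2b) can be sketched in a few lines of Python. This is only an illustration, not an ADE feature: the names compare_bias and degenerated_gain, the quantity names and all of the numbers are made up, and you would fill the two dictionaries by hand from your own pre-layout and post-layout dc operating point results. The gain estimate is the first-order common-source formula gm*R/(1 + gm*Rs), which assumes the load resistance is much smaller than the transistor's own output resistance.

```python
def compare_bias(pre, post, rel_tol=0.05):
    """Flag quantities whose post-layout value moved by more than rel_tol.

    pre and post map a label (hypothetical names) to a dc value.
    Returns {label: (pre_value, post_value, relative_change)}.
    """
    flagged = {}
    for name, pre_val in pre.items():
        if name not in post:
            continue  # quantity was not saved in the post-layout run
        post_val = post[name]
        ref = abs(pre_val)
        if ref == 0.0:
            rel = 0.0 if post_val == 0.0 else float("inf")
        else:
            rel = abs(post_val - pre_val) / ref
        if rel > rel_tol:
            flagged[name] = (pre_val, post_val, rel)
    return flagged


def degenerated_gain(gm, r_load, r_source):
    """First-order common-source gain magnitude with source resistance.

    Assumes r_load is much smaller than the device output resistance,
    so |Av| is approximately gm * r_load / (1 + gm * r_source).
    """
    return gm * r_load / (1.0 + gm * r_source)


# Made-up example values, for illustration only.
pre = {"I_supply": 1.00e-3, "I_stage1": 200e-6,
       "I_stage2": 600e-6, "V_cm_out": 0.90}
post = {"I_supply": 0.99e-3, "I_stage1": 199e-6,
        "I_stage2": 540e-6, "V_cm_out": 0.78}

for name, (a, b, rel) in compare_bias(pre, post).items():
    print(f"{name}: pre={a:.4g} post={b:.4g} change={rel:.1%}")

# Effect of a hypothetical 20 ohm routing resistance in the source path.
print(degenerated_gain(5e-3, 20e3, 0.0), degenerated_gain(5e-3, 20e3, 20.0))
```

Starting with the supply current and then moving stage by stage follows the same order as 2a). The 5% tolerance is arbitrary, so tighten or loosen it to suit your design.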

3) Are you using ADE [IC5141/IC61] for your simulation environment?
    ADE has the option to back-annotate selected nets and simulate,
    so you may be able to localize the issue that way.
4) If you are using ADE [IC61] with the Parasitic Aware Design flow,
    you can do sensitivity analysis to identify the relationship between
    specific parasitics and the dc gain.
5) You also mention a zero in the ac response. How is your 
    compensation implemented? Is it possible that the zero 
    is due to layout parasitic resistance in the routing for the 
    comp cap? Or does the extracted capacitor include the 
    extracted resistance of the plate? How was the compensation
    capacitor implemented: as a MOS cap [MOS transistor]?
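
To get a feel for how much series resistance it takes to move a zero into your band of interest, here is a rough sketch. It assumes a two-stage amplifier with Miller compensation, which you have not confirmed, and uses the textbook result that a resistance R in series with the compensation capacitor Cc puts the zero at 1/(Cc*(1/gm2 - R)), where gm2 is the second-stage transconductance. The function name miller_zero and the numbers are hypothetical.

```python
import math


def miller_zero(gm2, cc, r_series):
    """Zero of a Miller-compensated two-stage amplifier (textbook model).

    gm2      : second-stage transconductance in siemens
    cc       : compensation capacitance in farads
    r_series : total resistance in series with cc in ohms
               (routing plus plate resistance, for example)
    Returns (frequency_hz, half_plane); half_plane is "RHP", "LHP",
    or "none" when the zero is pushed to infinity.
    """
    d = 1.0 / gm2 - r_series
    if d == 0.0:
        return (math.inf, "none")
    freq = 1.0 / (2.0 * math.pi * cc * abs(d))
    return (freq, "RHP" if d > 0 else "LHP")


# Made-up values: gm2 = 2 mS (1/gm2 = 500 ohm), Cc = 1 pF.
for r in (0.0, 100.0, 2000.0):
    freq, plane = miller_zero(2e-3, 1e-12, r)
    print(f"R = {r:6.0f} ohm -> zero at {freq / 1e6:.1f} MHz ({plane})")
```

If the zero in your post-layout ac response lines up with what this gives for the extracted resistance in the comp cap path, that points at the routing or the plate resistance.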
                                                                 Best Regards,
                                                                   Sheldon