CGDisplayStream does indeed work very well; however, on the virtual screen its performance is about the same as mapping the memory of the ewproxyframebuffer.
The driver monitor says that on the virtual display (and only on the virtual display) the driver is busy uploading bytes to the IOSurface when CGDisplayStream is in use. Furthermore, probing the display with CGLQueryRendererInfo() shows that only the software renderer is available.
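For anyone who wants to repeat the renderer probe, here is a rough sketch of how it can be done. It is not the exact code I ran: the helper name count_accelerated_renderers is made up for illustration, and the display ID you pass in is whatever your virtual display reports. It asks CGL for the renderers that can drive a given display and counts how many report kCGLRPAccelerated. A result of 0 would match the "software renderer only" observation.

```c
#include <stdint.h>
#include <stdio.h>

#if defined(__APPLE__)
#include <ApplicationServices/ApplicationServices.h>
#include <OpenGL/OpenGL.h>
#endif

/*
 * Hypothetical helper (name and signature are illustrative only).
 * Lists the CGL renderers that can drive `display_id` and returns how many
 * of them are hardware accelerated. `*total` receives the renderer count.
 * Returns -1 if the query fails or CGL is unavailable (non-macOS build).
 */
int count_accelerated_renderers(uint32_t display_id, int *total)
{
    if (total)
        *total = 0;

#if defined(__APPLE__)
    CGLRendererInfoObj info = NULL;
    GLint count = 0;

    /* CGL identifies displays by bit mask, not by CGDirectDisplayID. */
    CGOpenGLDisplayMask mask =
        CGDisplayIDToOpenGLDisplayMask((CGDirectDisplayID)display_id);

    if (CGLQueryRendererInfo(mask, &info, &count) != kCGLNoError)
        return -1;

    int accelerated = 0;
    for (GLint i = 0; i < count; i++) {
        GLint renderer_id = 0;
        GLint is_accelerated = 0;

        CGLDescribeRenderer(info, i, kCGLRPRendererID, &renderer_id);
        CGLDescribeRenderer(info, i, kCGLRPAccelerated, &is_accelerated);

        printf("renderer %d: id=0x%08x accelerated=%d\n",
               (int)i, (unsigned)renderer_id, (int)is_accelerated);

        if (is_accelerated)
            accelerated++;
    }

    CGLDestroyRendererInfo(info);

    if (total)
        *total = (int)count;
    return accelerated;
#else
    /* CGL only exists on macOS; report failure elsewhere. */
    (void)display_id;
    return -1;
#endif
}
```

Build it with -framework OpenGL -framework ApplicationServices and call it with CGMainDisplayID() for comparison, then with the ID of the virtual display (for example one returned by CGGetActiveDisplayList()). Expect deprecation warnings on recent SDKs, since OpenGL/CGL has been deprecated since macOS 10.14.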
So this is as Anome said earlier: the virtual display is not using the GPU directly, which suggests that whatever API is used to access it and build IOSurfaces is going to result in CPU/GPU bus traffic.