To flesh out Sathya's answer a bit: In most systems, the same PCIe lanes are used for both the IGP and the PCIe x16 slot for the video card, so you can use either the slot or the IGP, but not both. This means you can't even put non-video devices (e.g. RAID controllers) into the x16 slot without losing access to the IGP -- you'd still have to install a video card in a different slot!
Late Edit: It appears that on Sandy Bridge systems with the H67 chipset (and probably Z68, when it launches), it's possible to run both the onboard GPU and an add-on graphics card at the same time. Other 6-series (and later) chipsets may work as long as both the CPU and the motherboard chipset support Intel's Flexible Display Interface, a DisplayPort-based standard that gives the integrated GPU a direct connection to the onboard video connectors.
There are two main factors.
First, you are quite right that RAM is the biggest one. Because an integrated GPU has to share RAM bandwidth with the CPU, it simply cannot use RAM nearly as heavily as a dedicated card can. Worse, it is using RAM that is not optimized for GPU use, so the CPU, GPU, and RAMDAC all fight for the same precious bandwidth, and the path between the GPU and RAM is much less direct.
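To put rough numbers on the bandwidth gap, here is a back-of-the-envelope sketch. The helper name `peak_bandwidth_gbs` and both memory configurations (dual-channel DDR3-1333 for the shared case, GDDR5 at 4 GT/s on a 256-bit bus for the dedicated case) are my own illustrative assumptions, not figures from any particular system, and these are theoretical peaks rather than measurements.

```python
def peak_bandwidth_gbs(transfers_per_sec, bus_width_bits, channels=1):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_sec * bytes_per_transfer * channels / 1e9

# Assumed example: system RAM that the CPU and an integrated GPU must share.
shared_ddr3 = peak_bandwidth_gbs(1333e6, 64, channels=2)

# Assumed example: memory that a dedicated card has all to itself.
dedicated_gddr5 = peak_bandwidth_gbs(4000e6, 256)

print(f"shared system RAM : {shared_ddr3:.1f} GB/s (split between CPU and GPU)")
print(f"dedicated GPU RAM : {dedicated_gddr5:.1f} GB/s (GPU only)")
```

Even before the CPU takes its share, the assumed system RAM peaks at a fraction of what the assumed dedicated memory can deliver.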
Second, a dedicated GPU can have more compute units. You can only fit so many transistors on a single die, and a dedicated GPU can devote more space to GPU computing units.
I'm not sure what you mean by "less latency". If you think that it means communication between the CPU and GPU is more efficient, it basically isn't. Modern graphics cards have a great path that allows the CPU to write directly into the GPU (and its RAM) through fast buffers. A dedicated GPU has more room for these kinds of buffers because it's not sharing die space with the CPU and its caches.
Lacking GPU RAM, integrated solutions typically require "bulk" CPU/GPU communication to go through the regular RAM, which is less efficient. The CPU can't give bulk data directly to the GPU. That would require them to run in lockstep, which would waste resources because they're never exactly the same speed. And what could the GPU do with such bulk data other than write it to RAM? It's not like it has anyplace else to keep it while it processes it.
CPU to GPU communication basically involves writing the information to be communicated some place where both components can get it and then telling the GPU to process the information. With an integrated solution, that has to be the regular RAM which is already the limiting factor. With a dedicated solution, that can be the GPU's RAM, which is much more efficient.
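As a toy model of that "write it somewhere shared, then tell the GPU" pattern, here is a minimal sketch using two Python threads. Everything in it is hypothetical: `shared_memory` stands in for whatever RAM both sides can reach, `doorbell` for the notification, and the doubling step for real GPU work. It is not how any actual driver is written, just the shape of the handoff.

```python
import queue
import threading

shared_memory = {}        # stands in for RAM that both CPU and GPU can reach
doorbell = queue.Queue()  # stands in for the "go process this" notification
results = {}              # where the pretend GPU leaves its output

def gpu_worker():
    """Pretend GPU: wait for a notification, then read the shared buffer."""
    while True:
        key = doorbell.get()
        if key is None:              # sentinel: shut down
            break
        data = shared_memory[key]    # fetch what the CPU wrote earlier
        results[key] = [x * 2 for x in data]  # placeholder for real work
        doorbell.task_done()

def cpu_submit(key, data):
    """Pretend CPU: write the data first, then ring the doorbell."""
    shared_memory[key] = data
    doorbell.put(key)

gpu = threading.Thread(target=gpu_worker)
gpu.start()
cpu_submit("frame0", [1, 2, 3])
doorbell.join()      # wait until the submitted work has been processed
doorbell.put(None)   # ask the worker to stop
gpu.join()
print(results["frame0"])
```

The two sides never run in lockstep: the CPU drops off the data and carries on, and the GPU picks it up when it is ready. The cost of the scheme is the shared buffer itself, which is exactly where an integrated solution is stuck with regular RAM.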
Best Answer
The core differences between integrated graphics processors (IGPs) and dedicated video cards (GPUs) are:
So when should you consider one or the other? I'll try to make your decision simple.
Personally, I predominantly game on consoles and my iPhone nowadays. If you are going to get a notebook, try to see if you can get one with the Nvidia 9400M chipset (IGP, but damn fine performance for an IGP solution). However, I do have a souped-up desktop rig for gaming, which is currently turned off 90% of the time... till Diablo 3 gets released. :)