CUDA – Improve CUDA Performance on Ubuntu vs Windows and Make Intel the Primary GPU

cuda drivers dual-boot graphics nvidia

I have gone through the GPU TensorFlow install on a dual-boot system (Windows 10 and Ubuntu 16.04.x).

Both OSes have roughly the same driver versions.

Lenovo P50 laptop with Nvidia Quadro M1000M    

Windows 376.51 nvidia driver version
Ubuntu  375.66 nvidia driver version

I train a deep learning model, and each epoch takes a vastly different amount of time on the two systems:

Windows 10   + Tensorflow 1.3 GPU + CUDA =  8 min. per epoch
Ubuntu 16.04 + Tensorflow 1.3 GPU + CUDA = 45 min. per epoch

The Ubuntu install used all the defaults from apt-get (not a source install) and pip.

My one thought so far is that the NVIDIA GPU must be painting the graphics as well, so I'm not getting to use all of the GPU for compute. Is there a way to check this? I've installed everything the same way on both systems, including the patches for CUDA 8.x.

I'm not even clear what the issue is, but it looks like the drivers are set up to use Optimus. Maybe I need to switch to a different profile?
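One way this could be checked is to look at the process table that nvidia-smi prints: graphics clients such as Xorg show up there with type G, compute jobs with type C. Below is a minimal Python sketch of that idea. The function names are made up for illustration, and the row layout in the pattern is an assumption based on older nvidia-smi releases (newer drivers print extra columns and would need a different pattern).

```python
import re
import subprocess

# Assumed row layout of the nvidia-smi process table on older drivers, e.g.
# "|    0      1103      G   /usr/lib/xorg/Xorg          159MiB |"
# (illustrative row, not taken from the post).
ROW = re.compile(r"^\|\s+(\d+)\s+(\d+)\s+(C\+G|C|G)\s+(\S.*?)\s+(\d+)MiB\s+\|$")


def graphics_processes(smi_text):
    """Return (pid, name, MiB) for every process of type G or C+G."""
    found = []
    for line in smi_text.splitlines():
        match = ROW.match(line.strip())
        if match and "G" in match.group(3):
            found.append((int(match.group(2)), match.group(4), int(match.group(5))))
    return found


def current_graphics_processes():
    """Run nvidia-smi and report the graphics clients it lists."""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=True
    )
    return graphics_processes(result.stdout)
```

If `current_graphics_processes()` returns an entry for Xorg, the NVIDIA GPU is drawing the desktop; an empty list suggests the display is being driven elsewhere.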


Idea One: Tomorrow I might try recompiling TensorFlow from source with all CPU optimizations inside Ubuntu 16.x. Perhaps the pip package performs worse than the binary install on Windows.

Idea Two: If the above does nothing, I will go into the BIOS and force Intel integrated graphics, do a reinstall, and try to install the nouveau graphics driver, kind of like this:

Seems this is an "Optimus"-enabled laptop. I cannot completely shut off the NVIDIA GPU for rendering, only enable hybrid mode. Perhaps I'll do a fresh install, remove all NVIDIA drivers, and see if I can get X working that way?

http://guanghan.info/blog/en/my-works/building-our-personal-deep-learning-rig-gtx-1080-ubuntu-16-04-cuda-8-0rc-cudnn-7-tensorflowmxnetcaffedarknet/

"So I went to BIOS and set the integrated graphics as default and 
restart. Remember to switch the HDMI from the port on GTX1080 to that 
on the motherboard. Now the display works well. I successfully 
installed Ubuntu following its prompt guides."

https://devtalk.nvidia.com/default/topic/991849/-solved-run-cuda-on-dedicated-nvidia-gpu-while-connecting-monitors-to-intel-hd-graphics-is-this-possible-/

When installing the NVIDIA display driver, be sure to:

1. not install the openGL libs (there are command line options with 
driver runfile installers or CUDA runfile installers to allow this)
2. make sure not to make any changes to the xorg.conf configuration.

Best Answer

After much hunting, searching, and coalescing, I found the problem and fixed it! Yes, the Intel GPU was being used for the display in Windows, while in Linux it was going unused, forcing the NVIDIA GPU to draw the screen and lose resources for compute.

I did a fresh install of Ubuntu 16.04 from a USB stick on top of my previous system.

During reinstallation, choose to download updates, but don't install the third-party software.

Once installed, verify that you're running on the open-source drivers (Intel graphics, with nouveau for the NVIDIA card) instead of NVIDIA's proprietary driver.

Now came the weird part(s).

A user here pointed out the same problem, but for desktop deployments:

https://devtalk.nvidia.com/default/topic/991849/cuda-setup-and-installation/-solved-run-cuda-on-dedicated-nvidia-gpu-while-connecting-monitors-to-intel-hd-graphics-is-this-possible-/

basically...

In summary, in order to make this to work, you need to

1. make sure you have enabled onboard graphics in the BIOS settings (or set it as primary)

I used hybrid mode, since there's no Intel-only option.

2. install both xorg intel driver and nvidia/cuda drivers

here, you need to pass the flags

--no-opengl-files // for the driver install; I chose the latest (384)

--no-opengl-libs // CUDA 8.0 + patch here

Be sure to disable nouveau and follow all the steps outlined in the instructions.

The main way to know you're good: install glmark2 and make sure it always reports Intel as the renderer.

3. start nvidia-settings, and go to the PRIME settings page, set Intel (Power Saving Mode) as default
4. modify your .bashrc and set LD_LIBRARY_PATH to at least contain /usr/local/cuda/lib64:/usr/lib/nvidia-XXX where XXX in my case is 375.

This folder didn't exist for me at all. I still added the default LD_LIBRARY_PATH and PATH settings outlined in the post-install CUDA instructions.

4. logout to restart X or reboot
5. run glmark2 to confirm GL status

Since you should have installed without the OpenGL files, this is perhaps unneeded.

<strike>6. (update) if the libGL printed from step 5 points to nvidia's driver folder, you need to remove/rename the libGL.so*/libGLX.so*/libGLdispatch.so* under nvidia driver folder so that your OS can pick up the mesa libGL library.</strike>


7. run nvidia-smi to list your dedicated NVIDIA GPU, and run your CUDA program, you should not see any errors.

This didn't work for me until I ran nvidia-modprobe once; then suddenly it all worked.
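The GL and library-path checks from the steps above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes that glmark2 prints a `GL_RENDERER:` line in its banner and that its output has been saved to a text file first. The two directories in the usage note are the ones named in step 4 (with 375 as the driver version there).

```python
import os
import re


def gl_renderer(glmark2_text):
    """Return the GL_RENDERER value from glmark2's banner, or None if absent."""
    match = re.search(r"^\s*GL_RENDERER:\s*(.+?)\s*$", glmark2_text, re.MULTILINE)
    return match.group(1) if match else None


def renders_on_intel(glmark2_text):
    """True when the reported renderer string mentions Intel."""
    renderer = gl_renderer(glmark2_text)
    return renderer is not None and "intel" in renderer.lower()


def missing_library_dirs(required, ld_library_path=None):
    """Return the entries of `required` that are not on LD_LIBRARY_PATH."""
    if ld_library_path is None:
        # Fall back to the environment of the current process.
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    present = {p.rstrip("/") for p in ld_library_path.split(":") if p}
    return [d for d in required if d.rstrip("/") not in present]
```

For example, after saving the glmark2 output to a file, `renders_on_intel(open("glmark2.log").read())` should come back True if Intel is drawing the screen, and `missing_library_dirs(["/usr/local/cuda/lib64", "/usr/lib/nvidia-375"])` lists whichever of those directories the current shell has not picked up from .bashrc.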

Update: sadly, a reboot broke the config. Not sure how to fix it yet...