Your i5 has two cores, each core can run two threads because of intel's hyperthreading, making 4 threads, beyond that it switches at high speeds between processes. Here's a nice explation of multithreading if you want to know more, but in essence your CPU can run 4 processes simultaniously, and switch at high speed between processes.
This is known as heterogeneous multiprocessing (HMP) and is widely adopted by mobile devices. In ARM-based devices which implement big.LITTLE, the processor contains cores with different performance and power profiles, e.g. some cores run fast but draw lots of power (faster architecture and/or higher clocks) while others are energy-efficient but slow (slower architecture and/or lower clocks). This is useful because power usage tends to increase disproportionately as you increase performance once you get past a certain point. The idea here is to get performance when you need it and battery life when you don't.
On desktop platforms, power consumption is much less of an issue so this is not truly necessary. Most applications expect each core to have similar performance characteristics, and scheduling processes for HMP systems is much more complex than scheduling for traditional SMP systems. (Windows 10 technically has support for HMP, but it's mainly intended for mobile devices that use ARM big.LITTLE.)
Also, most desktop and laptop processors today are not thermally or electrically limited to the point where some cores need to run faster than others even for short bursts. We've basically hit a wall on how fast we can make individual cores, so replacing some cores with slower ones won't allow the remaining cores to run faster.
While there are a few desktop processors that have one or two cores capable of running faster than the others, this capability is currently limited to certain very high-end Intel processors (as Turbo Boost Max Technology 3.0) and only involves a slight gain in performance for those cores that can run faster.
While it is certainly possible to design a traditional x86 processor with both large, fast cores and smaller, slower cores to optimize for heavily-threaded workloads, this would add considerable complexity to the processor design and applications are unlikely to properly support it.
Take a hypothetical processor with two fast Kaby Lake (7th-generation Core) cores and eight slow Goldmont (Atom) cores. You'd have a total of 10 cores, and heavily-threaded workloads optimized for this kind of processor may see a gain in performance and efficiency over a normal quad-core Kaby Lake processor. However, the different types of cores have wildly different performance levels, and the slow cores don't even support some of the instructions the fast cores support, like AVX. (ARM avoids this issue by requiring both the big and LITTLE cores to support the same instructions.)
Again, most Windows-based multithreaded applications assume that every core has the same or nearly the same level of performance and can execute the same instructions, so this kind of asymmetry is likely to result in less-than-ideal performance, perhaps even crashes if it uses instructions not supported by the slow cores. While Intel could modify the slow cores to add advanced instruction support so that all cores can execute all instructions, this would not resolve issues with software support for heterogeneous processors.
A different approach to application design, closer to what you're probably thinking about in your question, would use the GPU for acceleration of highly parallel portions of applications. This can be done using APIs like OpenCL and CUDA. As for a single-chip solution, AMD promotes hardware support for GPU acceleration in its APUs, which combine a traditional CPU and a high-performance integrated GPU onto the same chip, as Heterogeneous System Architecture, though this has not seen much industry uptake outside of a few specialized applications.
Best Answer
As another user commented, it's mostly OS-dependent.
Concurrently yes, in parallel no. See: https://softwareengineering.stackexchange.com/questions/190719/the-difference-between-concurrent-and-parallel-execution
Each OS has it's own scheduling algorithm.
We don't know where it will be executed and it will most probably not continue executing from start to finish on the same core. Again, depends on the OS scheduler.
Apparently yes: https://stackoverflow.com/questions/663958/how-to-control-which-core-a-process-runs-on
I didn't understand the question here, but the basic idea with threads is that you create them and the OS runs using its scheduling algorithm, there's no need for you to control in which logical or physical core it will run (there may be cases you might want to do that, I'm not sure why).