Horizontal Scaling
Horizontal scaling is essentially building out instead of up. Instead of buying a bigger, beefier server and moving all of your load onto it, you buy one or more additional servers and distribute your load across them.
Horizontal scaling is an option when you can run multiple instances of your application on separate servers simultaneously. Typically it is much harder to go from 1 server to 2 than it is to go from 2 to 5, 10, 50, etc.
Once you've addressed the issues of running parallel instances, you can take great advantage of environments like Amazon EC2, Rackspace's Cloud Service, GoGrid, etc., since you can bring instances up and down based on demand, reducing the need to pay for server power you aren't using just to cover peak loads.
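As a rough illustration, a minimal sketch of that demand-based decision might look like the following. The function name, per-instance capacity, and limits are all assumptions of mine; a real deployment would delegate this to a cloud auto-scaling service rather than hand-roll it.

```python
import math

def desired_instance_count(requests_per_sec: float,
                           capacity_per_instance: float = 200.0,
                           min_instances: int = 2,
                           max_instances: int = 20) -> int:
    """Return how many identical instances are needed for the current load."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))

# Off-peak traffic stays at the floor; a peak brings more instances up,
# and they can be shut down again when the load drops.
print(desired_instance_count(150))    # 2
print(desired_instance_count(3500))   # 18
```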
Relational databases are one of the more difficult pieces to run with full read/write access in parallel.
I saw Damien Katz speak about CouchDB at Stack Overflow DevDays in Austin, and one of his primary goals in creating it was supporting these parallel instances. Because that has been a focus since day one, CouchDB is much better placed to take advantage of horizontal scaling.
Vertical Scaling
Vertical scaling is the opposite: build up instead of out. Buy the most powerful piece of hardware you can afford and put your application, database, etc. on it.
Real World
Of course, both approaches have their advantages and disadvantages. Often a combination of the two is used in the final solution.
You may keep your primary database, where everyone writes and reads real-time data, on a large piece of hardware. You can then distribute read-only copies of the database for heavier data analysis and reporting, where being up to the minute matters less. Finally, the front-end web application may run on multiple web servers behind a load balancer.
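As a sketch of how that split might be wired up in application code (the class and method names are my own, not from the answer), writes and real-time reads go to the primary while reporting queries go to a read-only replica:

```python
import random

class RoutingDatabase:
    """Route statements to a primary connection or one of several replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary      # read/write connection
        self.replicas = replicas    # list of read-only connections

    def execute_write(self, sql, params=()):
        # All writes must hit the primary.
        return self.primary.execute(sql, params)

    def execute_realtime_read(self, sql, params=()):
        # Reads that must be up to the minute also go to the primary.
        return self.primary.execute(sql, params)

    def execute_report(self, sql, params=()):
        # Reporting tolerates replication lag, so any replica will do.
        return random.choice(self.replicas).execute(sql, params)
```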
Although normalization and partitioning both rearrange columns between tables, they have very different purposes.
Normalization is first considered during logical data model design. It is a set of rules which ensure that each entity type has a well-defined primary key and that each non-key attribute depends solely and fully upon that primary key.
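A tiny, hypothetical example of what those rules buy you, using Python's built-in sqlite3 module (the table and column names are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Unnormalized: the customer's email repeats on every order row, so it
# depends on customer_id rather than on the table's primary key (order_id).
con.execute("""CREATE TABLE orders_flat (
    order_id       INTEGER PRIMARY KEY,
    customer_id    INTEGER,
    customer_email TEXT,    -- depends on customer_id, not on order_id
    order_total    REAL)""")

# Normalized: every non-key column depends solely and fully on its own key.
con.execute("""CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    email       TEXT)""")
con.execute("""CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    order_total REAL)""")
```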
Partitioning comes in during physical database design, when we start to map logical attributes to physical columns and determine the operational characteristics required from the system. Sometimes it is an optimisation added after testing under load because performance was found to be inadequate. It can also play a role in implementing a data retention policy.
In partitioning we recognise that a table is made of rows and columns. When we partition, we separate some of those rows (or columns) from the others and hold them in a physically different location.
Horizontal partitioning is when some rows are stored in one table, and some in another. There could be many sub-tables. A typical example is when currently-active transactional data is separated from old "archive" data. This keeps "hot" data compact, with associated performance improvements. We may be able to make the archive tables read-only, compressed and on cheaper disk, too.
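A hand-rolled version of that split might look roughly like this, again using sqlite3 with invented table and column names; a DBMS with native partitioning would let you declare this instead of moving rows yourself:

```python
import sqlite3

con = sqlite3.connect(":memory:")
for name in ("transactions", "transactions_archive"):
    con.execute(f"CREATE TABLE {name} "
                "(id INTEGER PRIMARY KEY, tx_date TEXT, amount REAL)")

con.executemany("INSERT INTO transactions VALUES (?, ?, ?)",
                [(1, "2015-01-10", 9.99), (2, "2024-06-01", 20.00)])

# Move everything older than the cutoff into the archive partition.
cutoff = "2020-01-01"
con.execute("INSERT INTO transactions_archive "
            "SELECT * FROM transactions WHERE tx_date < ?", (cutoff,))
con.execute("DELETE FROM transactions WHERE tx_date < ?", (cutoff,))

# The "hot" table now holds only recent rows; in a real system the archive
# table could be compressed, read-only and placed on cheaper disk.
```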
As the next step, each partition may be moved onto separate hardware. This is commonly known as "sharding." Advantages include being able to use many cheaper boxes rather than one very large, very expensive server, and being able to position a user's data geographically close to her. The cost is increased application complexity. Some DBMSs incorporate this ability natively.
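The routing side of sharding can be as simple as deriving a shard from the key. The host names and modulo scheme below are purely illustrative; real systems often prefer consistent hashing so that adding a shard does not remap most keys.

```python
SHARDS = ["db-us-east", "db-eu-west", "db-ap-south"]

def shard_for(user_id: int) -> str:
    """Deterministically map a user to exactly one shard."""
    return SHARDS[user_id % len(SHARDS)]

print(shard_for(42))   # every lookup for user 42 hits the same box
print(shard_for(43))   # a different user may land on a different box
```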
Vertical partitioning is when some columns are moved to a different table or tables. As with horizontal partitioning, the motivation is to keep the "hot" table small so access is faster. Say you run an e-marketing company. 99% of the time you need a person's name and email address and nothing else. These go in one table, and all the other stuff which is useful but seldom used - birthday, golf handicap, PA's phone number, etc. - goes in a different table. Vertical partitioning can also help when the partitions have different update regimes or are owned by different sections of the business. The two tables have the same primary key column, and corresponding rows share the same key value. While it is possible to have multiple vertical partitions for a table, and to shard vertically, I've never come across it.
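A minimal sketch of that e-marketing example, with the hot columns and the seldom-used ones in two tables sharing the same primary key (table and column names are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE person_hot (
    person_id INTEGER PRIMARY KEY,
    name      TEXT,
    email     TEXT)""")
con.execute("""CREATE TABLE person_extra (
    person_id     INTEGER PRIMARY KEY REFERENCES person_hot(person_id),
    birthday      TEXT,
    golf_handicap INTEGER,
    pa_phone      TEXT)""")

# The common 99% query touches only the small, cache-friendly table:
con.execute("SELECT name, email FROM person_hot WHERE person_id = ?", (1,))

# The occasional full profile joins the two partitions on the shared key:
con.execute("""SELECT h.name, h.email, e.birthday
               FROM person_hot h JOIN person_extra e USING (person_id)
               WHERE h.person_id = ?""", (1,))
```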
Vertical and horizontal partitioning can be mixed. One may choose to keep all closed orders in a single table and open ones in a separate table, i.e. two horizontal partitions. For the open orders, order data may be in one vertical partition and fulfilment data in a separate partition.
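Purely as an illustration of how those mixed partitions might be laid out (the table names are invented):

```python
# Open orders are split vertically (order vs fulfilment data); closed orders
# form their own horizontal partition and need no such split.
TABLES = {
    "open_orders":           ["order_id", "status", "order_date"],
    "open_order_fulfilment": ["order_id", "carrier", "ship_date"],
    "closed_orders":         ["order_id", "status", "order_date",
                              "carrier", "ship_date"],
}
```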
The techniques I've talked about are ways to change the design to improve performance. Scaling is when you change the hardware. One can scale up by buying a bigger box with more RAM, more CPU or faster disk, or scale out by moving some of the work onto a different box. Scale-up is sometimes called vertical scaling, whereas scale-out can be called horizontal scaling. While horizontal scaling and sharding have an obvious relationship, they are not synonymous. It would be possible to use replication technologies to copy an entire database to another location for use by the users there, achieving scale-out without having to partition any tables.
Best Answer
In some cases (perhaps most) the servers are already at their physical capacity. Increasing the number of CPUs would require a motherboard swap. Adding RAM to an existing server could be expensive, depending on how old the server is; memory modules more than five years old and sourced from a dealer can be prohibitively expensive.
What all this amounts to is that upgrading from an 8-processor box with 32 GB of RAM to a 64-processor box with 128 GB would involve purchasing an entire new server to replace the old one. This is still vertical scaling from the database/application point of view. A server that supports that many CPUs is likely far more expensive than seven servers with 8 CPUs each (keeping the original one as the eighth).
As per mustaccio's comment: high-end hardware is not just about more processors and RAM; designing hardware to support the throughput and power/cooling requirements of dozens of multicore processors is far from trivial. Thus, a 256-core server with 2 TB of RAM might cost $150,000, roughly an order of magnitude more than 64 commodity servers with 4 cores and 32 GB each.