Yes, there is a pretty big pitfall you're going to run into fairly quickly, and that is with the size and maintenance of the tables. You are somewhat on the right track by saying that you want to put your data into a temporary table daily, and then move it into your permanent table, but you'll soon run into trouble with this scheme.
For example, let's say you want to "roll off" the oldest month's worth of data after two years. In your design, you would have to issue a DELETE statement against your big, big table. This will likely be somewhat slow, depending on the number of indexes you have. Also, it will cause index fragmentation, and the only way to fix that would be to rebuild or reorganize the indexes on this very large table which would also cause performance problems. There are a whole host of other issues with a big single table type design as well. For example, with a big, single table, you can't do FILEGROUP based backups, which means that if you want to have a full backup of your database, it's gonna be BIG, and it's gonna take a LONG time to complete.
What's the solution? Table partitioning. Read about this in depth, in as many places as you can. Basically, partitioning allows you to split up your data onto "tables within tables" -- each partition shares the same schema, and is accessed through the table object, but can be indexed and maintained differently. Partitions are basically tables, cut up by some useful key. In your case it will likely be date. They can be dropped just like (and just as fast as) tables, which means that if you partition your big data tables by date, you can simply drop old partitions instantly, with no adverse effect to the indexes on any of the other partitions. You can put partitions on different filegroups, which means that older partitions can be rolled off, or rolled on to cheaper commodity storage if it's not commonly used. Last but not least, in SQL 2012 you'll be able to create COLUMNSTORE type indexes on your older, read-only partitions, while having a different, more insert-oriented indexing scheme on the active partition where you're inserting all your sensor data.
Hope this helps. You have a good amount of research to do regarding partitioning and partitioning schemes, but hopefully now you know the direction you need to be looking.
P.S.:
Oh, and I forgot your bulleted list of questions...
Answer 1, 2, and 5. See above. Answer 3: In SQL Server, you can compress on a partition by partition basis, so compress your older partitions aggressively using PAGE compression. But I believe your out-of-row large data types will not be compressed if you do this -- again, you may want to alleviate this problem by normalizing your sensor values. Answer 4: Absolutely not, but if all you want to do is store static data by day and never search on it any other way, compressed flat files may be a much easier way to go.
P.P.S:
Oh, and another thing. You don't need your two-table solution to make this all work. Large binary sensor data should be of type VARBINARY(MAX) because its values can be stored "out of row" but still be a column in a single table (see the sp_tableoption documentation). You may want to consider normalizing some of your sensor data out of the binary data you have in the table, though, because your database won't be good for much beyond retrieving chunks of sensor data by time if you don't.
Since no one else has answered, I'll share some opinions and do some hand-waving.
As long as you aren't locking common resources, or are locking resources in the same order, you shouldn't have problems with deadlocks.
I'd look at separate tables before separate databases. Each additional database will definitely cost you more but additional tables won't necessarily cost you more. You might need to use more than 1 database becuase of the sheer volume of data you will store or because of the rate at which you need to store your burst traffic. If you can manage it, I think that a table-level granularlity will be more flexible and possibly a good deal cheaper than starting with a database-level granularity.
The problem with putting each device's data into it's own tables is that it makes the reporting hard since all of the table names will be different.
I presume that you have some way of detecting when you get a "failure resend" of data. You don't want to put the same value in a table twice and I i'm sure that the devices can fail (local power failure?) in ways that have nothing to do with whether or not earlier values where properly stored.
WAG: Assuming each "value" is 4 bytes, I calculated about 11.5 MB of collected data per device, per day. (This ignores all kinds of stuff, like device identifiers and timestamps, but I think it is OK as a rough estimate.) So, with "thousands" of sites, we are looking at tens of GB, per day. You don't mention any kind of lifetime on that data. The largest Azure database currently maxes out at 150 GB. You could fill those up pretty quickly.
Getting anything to happen in a web browser in a short period of time is iffy. When you are reading from (possibly multiple) databases with GBs of data, continuously inserting lots of new data into the tables you are reading from and interacting with web servers across the open internet, "real time" is wishful thinking. IMO. "Fast enough" is the usual goal.
If you can't keep all of the data you need in a single report in one SQL Azure database, it's a problem. There are no linked servers or distributed views (at this point). There is no simple way to aggregate accross many Azure databases. You'd have to pull all of the data to a central location and report from there. I'd guess that the aggregated data would be too large to store in a single SQL Azure database, so you'd have to go to on-premise or maybe EC2. A data mart or warehouse with a star-schema structure would be the classic answer there, but that takes significant processing time and that means no "real time". Also, that's potentially a lot more data transfer from Azure to wherever it goes, and that will cost you.
I wouldn't commit to this strategy without a pilot program first. The first thing to do would be to build a single instance (can it handle 400 sensor values a second? (Is that a series of rows, a big denormalized row, an XML document or something else? The format of the incoming data will affect how fast the data can be stored. Can you do bulk inserts, or does it have to be row-by-row?) How about 4,000 sensor values a second? It's possible that an single SQL Azure instance might not be able to store that much that quickly.) and see how it handles insertions at your expected rates and see how the reporting might work. And I'd talk to Microsoft too. Just dealing with the billing for hundreds or thousands of seperate databases might be quirky.
I don't know if this is applicable to you, but have you looked at Microsoft's "Stream Insight" product? It seems to be aimed at situations like yours. Caveat: I've never used it.
The marketting blurb:
Effectively analyze large amounts of event data streaming in from multiple sources. Derive insights from critical information in near real time by using Microsoft StreamInsight. Monitor, analyze, and act on data in motion and make informed decisions almost instantaneously
While doing some quickly googling, I noticed a blog posting which states that StreamInsight available on SQL Azure as a CTP last year. it might be ready for prime time by now.
Good luck, it sounds like an interesting project.
Best Answer
One way would be to use a cursor and dynamic sql. Granted the solution below would need to be modified to accommodate your actual data, especially if the number of columns is variable, but for the example you gave it does work.