It's going to be a challenge to apply a uniform policy across such a varied estate. My first port of call would be to understand the biggest consumers of key resources (CPU, memory, I/O) on each instance, with a view to isolating the problematic databases from the others. Glenn Berry's health-check DMV queries would be a good starting point.
Who's using most of the memory (buffer pool) on each instance?
SELECT DB_NAME(database_id) AS [Database Name],
       COUNT(*) * 8 / 1024.0 AS [Cached Size (MB)]
FROM sys.dm_os_buffer_descriptors
WHERE database_id > 4          -- exclude system databases
  AND database_id <> 32767     -- exclude ResourceDB
GROUP BY DB_NAME(database_id)
ORDER BY [Cached Size (MB)] DESC
OPTION (RECOMPILE);
How long are pages staying in the buffer pool?
SELECT cntr_value AS [Page Life Expectancy]
FROM sys.dm_os_performance_counters
WHERE [object_name] = N'SQLServer:Buffer Manager'
AND counter_name = N'Page life expectancy' OPTION (RECOMPILE);
Who's generating the most I/O?
SELECT DB_NAME(DB_ID()) AS [Database Name], [file_id],
       num_of_reads, num_of_writes,
       io_stall_read_ms, io_stall_write_ms,
       CAST(100. * io_stall_read_ms / (io_stall_read_ms + io_stall_write_ms) AS DECIMAL(10,1)) AS [IO Stall Reads Pct],
       CAST(100. * io_stall_write_ms / (io_stall_write_ms + io_stall_read_ms) AS DECIMAL(10,1)) AS [IO Stall Writes Pct],
       (num_of_reads + num_of_writes) AS [Reads + Writes],
       num_of_bytes_read, num_of_bytes_written,
       CAST(100. * num_of_reads / (num_of_reads + num_of_writes) AS DECIMAL(10,1)) AS [# Reads Pct],
       CAST(100. * num_of_writes / (num_of_reads + num_of_writes) AS DECIMAL(10,1)) AS [# Writes Pct],
       CAST(100. * num_of_bytes_read / (num_of_bytes_read + num_of_bytes_written) AS DECIMAL(10,1)) AS [Read Bytes Pct],
       CAST(100. * num_of_bytes_written / (num_of_bytes_read + num_of_bytes_written) AS DECIMAL(10,1)) AS [Written Bytes Pct]
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL)
OPTION (RECOMPILE);
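That query is scoped to the current database. To compare I/O across every database on the instance, a variant along these lines (just a sketch: it passes NULL for both arguments of sys.dm_io_virtual_file_stats and rolls the per-file numbers up by database) may be more useful:
-- Aggregate per-file I/O stats across all databases on the instance
SELECT DB_NAME(vfs.database_id) AS [Database Name],
       SUM(vfs.num_of_reads + vfs.num_of_writes) AS [Total IOs],
       SUM(vfs.num_of_bytes_read + vfs.num_of_bytes_written) AS [Total Bytes],
       SUM(vfs.io_stall) AS [Total IO Stall (ms)]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
GROUP BY DB_NAME(vfs.database_id)
ORDER BY [Total IO Stall (ms)] DESC OPTION (RECOMPILE);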
Which databases are consuming the most CPU and which particular procedures?
SELECT DB_NAME(qt.dbid) AS DatabaseName,
       SUM(qs.total_worker_time) AS [TotalWorkerTime]
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt
WHERE DB_NAME(qt.dbid) IS NOT NULL
GROUP BY qt.dbid
ORDER BY SUM(qs.total_worker_time) DESC
OPTION (RECOMPILE);
SELECT TOP (1000)
       DB_NAME(qt.dbid) AS DatabaseName,
       qt.[text] AS [SP Name],
       qs.total_worker_time AS [TotalWorkerTime]
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt
WHERE DB_NAME(qt.dbid) IS NOT NULL
ORDER BY qs.total_worker_time DESC
OPTION (RECOMPILE);
The short answer...
This has the hallmarks of solving the wrong problem with the wrong solution. It sounds like you're looking for a workaround to a fundamental design flaw, and introducing an entirely new set of problems along the way.
An attempt at a longer answer...
It's almost impossible to offer any practical advice without a better understanding of the database, the application and interaction between the two. That said, there are one or two warning signs.
Logic in a table function
It sounds like you have a hierarchy of entities, which could be represented in a relational model and navigated in a set-oriented fashion. If complicated logic is required in the database just to navigate the hierarchy, the model is probably wrong.
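As an illustration only (I'm assuming a simple adjacency-list table here, since the real schema isn't shown), that kind of hierarchy can usually be walked in a single set-based statement with a recursive CTE, rather than procedural logic inside a table function:
-- Hypothetical adjacency-list table: dbo.Entity(EntityId, ParentId, Name)
WITH EntityTree AS
(
    SELECT e.EntityId, e.ParentId, e.Name, 0 AS Depth
    FROM dbo.Entity AS e
    WHERE e.ParentId IS NULL              -- start at the root(s)
    UNION ALL
    SELECT c.EntityId, c.ParentId, c.Name, t.Depth + 1
    FROM dbo.Entity AS c
    JOIN EntityTree AS t ON c.ParentId = t.EntityId
)
SELECT EntityId, ParentId, Name, Depth
FROM EntityTree;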
Do you have a relational database or an object store/bit bucket?
Your question refers to entities, rather than tables and records. Did you build an object model and dump it to tables, rather than map your objects to a relational model?
How much data are you returning to the user?
A 2-second stored procedure call could be a sub-optimal query, a query against a sub-optimal model, or one returning too much data. Requiring 10 calls to this procedure suggests it could well be all three. Navigating a 40k-record graph is not big-data territory, unless your application is loading all 40k records from the database just to determine what's navigable.
You already have a better model
Your workaround may actually be the solution, if applied differently. If the output from your table function is a model which works, perhaps it should be a permanent part of the model. Instead of building this temporary table every time a user logs in, why not keep the data in that form?
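Purely as a sketch (the table and function names below are invented, since I don't know the shape of your table function's output), that could mean persisting the result as a real, indexed table and refreshing it when the underlying data changes, rather than rebuilding it on every login:
-- Hypothetical permanent home for the table function's output
CREATE TABLE dbo.EntityAccessMap
(
    UserId   int NOT NULL,
    EntityId int NOT NULL,
    CONSTRAINT PK_EntityAccessMap PRIMARY KEY (UserId, EntityId)
);

-- Refresh when the hierarchy or permissions change, not per login
TRUNCATE TABLE dbo.EntityAccessMap;
INSERT INTO dbo.EntityAccessMap (UserId, EntityId)
SELECT f.UserId, f.EntityId
FROM dbo.fn_BuildEntityAccessMap() AS f;   -- stand-in for your existing table function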
Best Answer
Not 100% sure I understand your question, but if you are already selecting the data, will also selecting a count be that much more expensive? No, I don't think you'd have an issue doing that. Selecting just a count incurs the same cost for the scans/seeks and the joins/filtering/aggregation your query already had to do, but saves the time spent processing, formatting and sending the actual rows. If your select takes 10 ms, I'd expect the count to take 10 ms or less.
Alternatively, you can capture the row count the select itself reports (in T-SQL, @@ROWCOUNT immediately after the statement) and return that, without even incurring the additional 10 ms.
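A minimal sketch of that approach (the procedure, table and parameter names are made up for illustration): read @@ROWCOUNT right after the select and hand it back through an output parameter:
CREATE PROCEDURE dbo.GetOrders
    @CustomerId int,
    @RowCount   int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT o.OrderId, o.OrderDate, o.Total
    FROM dbo.Orders AS o                  -- invented table, for illustration only
    WHERE o.CustomerId = @CustomerId;

    -- @@ROWCOUNT reflects the statement immediately above
    SET @RowCount = @@ROWCOUNT;
END;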
This SO question deals with the same basic question I think you are driving at and gives a few different answers on technique.