Vertica includes an ODBC driver for C.
Read "Programmer's > Guide Vertica Client Libraries > Programming ODBC Client Applications"
This section details the process for configuring the Vertica ODBC
driver. It also explains how to use the ODBC API to connect to Vertica
in your own client applications.
A rough estimate for bare tables is pretty simple. The number of bytes in a row is roughly
Total number of bytes for all columns
+ dbms overhead
dbms overhead might include a fixed or variable number of bytes per row or per column. The number of bytes of overhead often depends on the datatype of the column. The dbms might also apply padding to make bit columns, for example, align with a machine word boundary.
Multiply the bytes per row by the expected number of rows one year, two years, or five years from now. Repeat for every table, or for only a statistically representative sample of the tables.
You can often find leads to the details for your dbms by searching for "your_dbms row format". If you searched for
sql server row format
you'd probably find this blog article.
Your dbms might require you to take into account compression, compact storage formats, and indexes. Compression and compact storage formats reduce the amount of storage space you need; indexes usually increase it.
If you want a really good estimate, build a testbed. Fill it with statistically representative sample data. Do the arithmetic based on what that tells you. Include estimates for new tables to implement new features.
Best Answer
There are several ways to get the size of a database, each suited to a slightly different use case. It's important to note that Vertica uses raw and compressed data in different ways, and that you should be conscious of which size you require. For example, licensing is based on the raw data size.
Raw Size
The raw size is useful for capacity planning or monitoring license utilization (it's the space the data would take up if it wasn't compressed). To get the raw data size of the entire database, you can either use the
GET_COMPLIANCE_STATUS()
function, or query the system tablev_internal.license_audits
.The
GET_COMPLIANCE_STATUS()
function retrieves information about the most recent audit. An audit estimates the raw data size of the database and stores the information inv_internal.license_audits
. By default, audits take place daily at 23:59 and can be configured or manually run.Here's some example output:
If you have permissions, you can directly query the
license_audits
table:Further, if you wish to get raw data size at the schema level, you can use this (from vertica.tips):
Compressed Size
The compressed size is the actual size of the data on the disk. This is useful for estimating disk space usage as Vertica recommends that at least 40% of space is available at all times. You can get the compressed size from
column_storage
orprojection_storage
system tables.Using
projection_storage
will also return any empty tables (from vertica.tips):