How to get Size of db tables one month back

vertica

Recently there has been a drop in our db size and we want to compare the size of the tables back then with that of now. For this purpose we need the exact size of each table one month back.
We can extract license_size for an old date, but how to extract size of each table say a month ago.

Best Answer

Unless you keep a log of it , you cannot. There is no history mechanism to keep track of tbl sizes.

you also can get a lot of scripts and articles on Vertica here www.aodba.com

What i do to keep an eye on this kind of stuff is logging this info in a dba use table using a cron job.

##will run every 30 minutes
*/30 * * * * /home/dbadmin/scripts/dba/database_growth.sh

Content .sh script

#!/bin/bash
#!/bin/bash

. /home/dbadmin/.profile
export VSQLDIR=/opt/vertica/bin
export SQLDIR=/home/dbadmin/scripts/dba
export data=`date`
echo " Execution date -" $data >> /home/dbadmin/scripts/dba/log/database_growth.log
echo "#################################################" >> /home/dbadmin/scripts/dba/log/database_growth.log
$VSQLDIR/vsql  -U $username -w $password -d analytics -f $SQLDIR/database_growth.sql >> /home/dbadmin/scripts/dba/log/database_growth.log
echo "#################################################" >> /home/dbadmin/scripts/dba/log/database_growth.log

Note:/home/dbadmin/.profile -- holds all the parameters values (this is to avoid text passwords - they are all logged as text - you wanna avoid this ) Content database_growth.sql

 insert INTO dba.database_growth(time,schema_name,size_gb) SELECT * from dba.pre_database_growth;

Content view dba.pre_database_growth

CREATE VIEW
    dba.pre_database_growth AS
SELECT
    (statement_timestamp())::TIMESTAMP AS GETDATE,
    p.anchor_table_name,
    s.schema_name,
    ROUND(((((SUM(s.used_bytes) / 1024::NUMERIC(18,0)) / 1024::NUMERIC(18,0)) /
    1024::NUMERIC(18,0)))::FLOAT, 3) AS size_in_bytes_gb
FROM
    v_monitor.storage_containers s join v_catalog.projections p
    on s.projection_id=p.projection_id
GROUP BY
    s.schema_name,p.anchor_table_name;

You can use this then to create growth reports on your objects over time.

I hope this helped.

Raw Size

The raw size is useful for capacity planning or monitoring license utilization (it's the space the data would take up if it wasn't compressed). To get the raw data size of the entire database, you can either use the GET_COMPLIANCE_STATUS() function, or query the system table v_internal.license_audits.

The GET_COMPLIANCE_STATUS() function retrieves information about the most recent audit. An audit estimates the raw data size of the database and stores the information in v_internal.license_audits. By default, audits take place daily at 23:59 and can be configured or manually run.

Here's some example output:

dbadmin=> SELECT GET_COMPLIANCE_STATUS();
                            GET_COMPLIANCE_STATUS
---------------------------------------------------------------------------------

Raw Data Size: 4.83TB +/- 0.24TB
License Size : 30.00TB
Utilization  : 16%
Audit Time   : 2014-05-11 23:59:49.763799+00
Compliance Status : The database is in compliance with respect to raw data size.

License End Date: 10/30/2014
Days Remaining: 171.18

If you have permissions, you can directly query the license_audits table:

SELECT /*+ label(license_utilization)*/
       audit_start_timestamp, 
       database_size_bytes / ( 1024^3 ) AS database_size_gb, 
       license_size_bytes / ( 1024^3 )  AS license_size_gb, 
       usage_percent 
FROM   v_catalog.license_audits 
ORDER  BY audit_start_timestamp DESC 
LIMIT  30;

Further, if you wish to get raw data size at the schema level, you can use this (from vertica.tips):

SELECT /*+ label(estimated_raw_size)*/
       pj.anchor_table_schema, 
       pj.used_compressed_gb, 
       pj.used_compressed_gb * la.ratio AS raw_estimate_gb 
FROM   (SELECT ps.anchor_table_schema, 
               SUM(used_bytes) / ( 1024^3 ) AS used_compressed_gb 
        FROM   v_catalog.projections p 
               JOIN v_monitor.projection_storage ps 
                 ON ps.projection_id = p.projection_id 
        WHERE  p.is_super_projection = 't' 
        GROUP  BY ps.anchor_table_schema) pj 
       CROSS JOIN (SELECT (SELECT database_size_bytes 
                           FROM   v_catalog.license_audits 
                           ORDER  BY audit_start_timestamp DESC 
                           LIMIT  1) / (SELECT SUM(used_bytes) 
                                        FROM   V_MONITOR.projection_storage) AS ratio) la 
ORDER  BY pj.used_compressed_gb DESC;

Compressed Size

The compressed size is the actual size of the data on the disk. This is useful for estimating disk space usage as Vertica recommends that at least 40% of space is available at all times. You can get the compressed size from column_storage or projection_storage system tables.

Using projection_storage will also return any empty tables (from vertica.tips):

SELECT /*+ label(compressed_table_size)*/
       anchor_table_schema, 
       anchor_table_name, 
       SUM(used_bytes) / ( 1024^3 ) AS used_compressed_gb 
FROM   v_monitor.projection_storage 
GROUP  BY anchor_table_schema, 
          anchor_table_name 
ORDER  BY SUM(used_bytes) DESC;

Vertica performance on select from view with union

Vertica query performance depends highly on the predicate used in the query .

To get the gist of your performance , try getting the projection name of the selected columns of the query you are firing . The columns in the order by clause of the projection is very important in deciding the performance of your select. you can get that by running explain on your query .

Vertica improves performance by sorting the columns , compressing via encoding them so that they use minimal memory while running .

Also run analyze_histogram(tablename, 100) on all your tables . this will ensure statistics over complete data sample, not just 10% of data sample which is taken by analyze_statistics .

Also , since you are doing union you should try to keep the sort order of all the projections same as after union it might be meaningless .

Best Answer

Related Solutions

Vertica – How to Get the Size of a Database

Raw Size

Compressed Size

Vertica performance on select from view with union

Related Question