Virtual Column in Snowflake — The Complete Guide
Not much is talked about virtual column in the Snowflake documentation and it is only referenced in the topics of external tables, masking, row level security and few more. So, bringing you here the detailed guide on the same.
A virtual column is a column that is not physically stored in the table but can be computed on the fly when queried. It is also known as a computed column or a derived column.
Creating a table with virtual column -
When you create a virtual column, you define its expression and the expression can reference other columns in the same table or constants.
Example, creating a virtual column as name_length to determine the length of other column name in the same table.
CREATE TABLE demo_vc (
id INTEGER,
name VARCHAR(20),
name_length INTEGER AS (LEN(name))
);
Once the table is created, describe it and check on column kind and expression. They distinguish the virtual column from normal columns.
desc table demo_vc;
Querying table with virtual column -
Whenever a query references the virtual column, Snowflake evaluates the expression and returns the computed result.
Virtual columns are supported in Snowflake for both traditional and external tables.
In Snowflake, only a limited set of deterministic functions are allowed for use in virtual columns that include basic arithmetic operators, date and time functions, and string functions such as LOWER(), UPPER(), SUBSTR(), TRIM(), REPLACE(), CONCAT(), LENGTH () among others.
Advantages of Virtual Columns —
- Simplified data model: Virtual columns can help to simplify the data model and reduce the complexity of views and queries.
- Consistent data: Virtual columns ensure that the derived values are consistent and accurate across all rows.
Disadvantages of Virtual Columns —
- Cost overhead: Virtual columns do not consume storage space, but the expressions used to create them can consume compute resources. If you have a large number of virtual columns, or if the expressions used to create them are complex, you may experience increased compute resource consumption.
- Query complexity: Virtual columns can make queries more complex, as they are referenced using their expression. This can make it harder to understand and debug queries that use virtual columns.
- Limited support for data manipulation: Virtual columns cannot be updated or deleted directly. Instead, the underlying expressions must be updated or the virtual column must be dropped and recreated. This can be cumbersome if you need to update a large number of virtual columns.
Abnormalities —
If you want to create a virtual column that returns the length of a string column in a Snowflake table, when creating a table you can use the LEN()/LENGTH() function but when altering the table LEN() doesn't work and only LENGTH() works.
LEN() doesn’t work when altering the table
LENGTH() works when altering the table
There can be more, and if you find do let other readers know in the comment section.
References: https://docs.snowflake.com/
Follow and Clap if you like the content and feel free to ask if you have any questions in the comments. I will be more than happy to assist and guide you.