Introduction
This article shows you how to set up a snapshot datamart to power Astrato, helping you keep performance levels high.
❄ If you are using Snowflake, you can use materialized views/dynamic tables to achieve this. Also to get setup un Snowflake see our Astrato for Snowflake Performance article
Contents
Why use a datamart?
It can be a taxing experience for your data warehouse if you're using a live-query analytics tool with views and more complex queries. The query spends more time calculating, aggregating, joining and potentially reformatting lots of data, and this will occur each time you load a workbook or filter.
Connecting to your warehouse directly without requiring views or further transformations to the data is the fastest method and in most cases you don't need to do anything to optimize performance.
However, to get the fastest performance out of transformed and/or calculated data in a live-query tool, it's best to use a data mart. A datamart is a subset of a data warehouse focused on a particular line of business, department, or subject area.
Setting up a datamart
To setup a data mart, you'll need 3 components.
Query/View - to retreive the data.
Stored procedure - to update the table.
Task - to schedule the creation of the snapshot table.
We'll use the rest of the article to give more details about that. All the commands are designed to be run in the console of your data warehouse instance.
Table
Create your table using a select statement or use our sample code below.
Tables and view can be randomly generated using scripts - in most cases we can generate that script using a sample from our view. We've got an example here to show you.
//----- CREATES RANDOMISED SAMPLE VIEW
CREATE OR REPLACE VIEW VW_MYVIEW as
select
ROW_NUMBER() over (order by dateadd(second, '-' || seq4(0.01), sysdate()) desc) as ROWNO,
randstr(uniform(10,30,random(1)),uniform(1,100000,random(1)))::varchar(30) as name
,randstr(uniform(10,20,random(2)),uniform(1,10000,random(2)))::varchar(30) as city
,concat(randstr(uniform(10,15,random(4)),uniform(1,200000,random(4))), '@gmail.com')::varchar(30) as email,
dateadd(second, '-' || seq4(0.01), sysdate()) as TIMESTAMP
from table(generator(rowcount=>3000));
//----- CREATES RANDOMISED SAMPLE VIEW
select * from VW_MYVIEW;
// replace this with your view/query
Task
The example task in this section has a similar name to the stored procedure.
You'll need to define a warehouse and a schedule. The schedule we've used below runs at 7am, 10am, 1pm, 4pm UTC. You can find more details for the schedule here.
Create a task by running the snippet below:
CREATE TASK TSK_CREATE_ANALYTICS_DATA_MART
WAREHOUSE = MY_WAREHOUSE
SCHEDULE = 'USING CRON 0 7,10,13,16 * * * UTC'
AS
select * from VW_MYVIEW;
To view all your tasks and validate the task you just created, run this command.
SHOW TASKS;