Snowflake in Astrato
Astrato for Snowflake: Optimizing Snowflake for cost & performance

Optimizing Snowflake for balanced cost and performance in live query analytics

Written by Piers Batchelor

Astrato was purpose-built for Snowflake’s Data Cloud, reimagining analytics as cloud-fast, secure, and infinitely scalable. Welcome to modern analytics – welcome to data apps built for the cloud.

❄ Get the most out of Astrato and optimize your warehouse for live-query BI

How it works

Astrato live-queries Snowflake for Visualization, Writeback and Data Science (Snowpark). Security and governance are baked into Snowflake.

Security

Because Astrato never moves data, security and compliance teams choose Astrato to minimize risk and maintain data governance. Enterprise authentication, row-level security and masking rules for PII are passed through to Astrato.

Performance & cost

Snowflake’s result cache means your dashboards are not only fast, but repeated identical queries can be served from cache for up to 24 hours, so you may be charged as little as once a day per unique query.

New queries are still fast, and can be accelerated on larger datasets spanning billions of rows with dynamic warehouse scaling.

Getting started

You can connect to your Snowflake instance straight away with a service account or via SSO.

Warehouse optimization

The choice is yours - here's the information you need to decide how to optimize your warehouses. In most cases, you'll want to start with an XS warehouse for typical BI usage.

💡 Auto-suspend stops your warehouse running when inactive (this clears the warehouse cache)

💡 Multi-cluster scaling solves query-queuing performance issues caused by concurrent usage

💡 The Economy scaling policy fills running clusters to capacity before starting new ones, optimizing for cost

💡 Bundling workloads into a single warehouse improves cost-efficiency

Auto suspend

By default, auto-suspend is enabled. Snowflake automatically suspends the warehouse if it is inactive for the specified period of time.

By default, auto-resume is enabled. Snowflake automatically resumes the warehouse when any statement that requires a warehouse is submitted and the warehouse is the current warehouse for the session.
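
As a sketch, both settings can be adjusted on an existing warehouse; the warehouse name is illustrative and the 300-second timeout is an assumption, not a recommendation:

```sql
-- MY_BI_WAREHOUSE is a hypothetical name; tune the idle timeout to your workload
ALTER WAREHOUSE MY_BI_WAREHOUSE
SET AUTO_SUSPEND = 300   -- suspend after 5 minutes of inactivity
    AUTO_RESUME = TRUE;  -- resume automatically when the next query arrives
```

A shorter timeout saves credits but clears the warehouse cache sooner, so subsequent queries may be slower.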

Multi-cluster scaling for high concurrency performance [Snowflake Enterprise]

Auto-scaling ensures that even at the highest demand, performance and cost are balanced. To enable fully automated scaling for concurrency, Snowflake recommends multi-cluster warehouses, which provide essentially the same benefits as creating additional warehouses and redirecting queries, but without requiring manual intervention.

⚠ If you don't enable multi-cluster scaling, you may want to consider a dedicated warehouse
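
On Snowflake Enterprise, multi-cluster scaling can be enabled on an existing warehouse along these lines; the warehouse name and cluster counts are illustrative:

```sql
ALTER WAREHOUSE MY_BI_WAREHOUSE          -- hypothetical warehouse name
SET MIN_CLUSTER_COUNT = 1                -- run a single cluster at low demand
    MAX_CLUSTER_COUNT = 3                -- add clusters as concurrency grows
    SCALING_POLICY = 'STANDARD';         -- favor performance; use 'ECONOMY' for cost
```

With MIN_CLUSTER_COUNT below MAX_CLUSTER_COUNT, the warehouse runs in auto-scale mode, starting and stopping clusters as query load changes.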

Multi-cluster scaling policies: Economy scaling

An XS warehouse costs 1 credit per full continuous hour of use.

To optimize for cost, you may prefer to set the warehouse's scaling policy to Economy. This ensures that the currently running clusters are at full capacity before a new one starts.

ℹ Astrato cancels queries that run longer than 30 seconds

-- Create an XS warehouse with auto-scaling enabled
CREATE WAREHOUSE XS_WAREHOUSE
WITH WAREHOUSE_SIZE = 'XSMALL'
AUTO_SUSPEND = 600 -- seconds of inactivity before auto-suspend
AUTO_RESUME = TRUE -- auto-resume when a query is executed
MIN_CLUSTER_COUNT = 1 -- minimum number of clusters
MAX_CLUSTER_COUNT = 2 -- maximum number of clusters for auto-scaling
SCALING_POLICY = 'ECONOMY'; -- optimize the warehouse for cost-efficiency

Dedicated warehouse vs multi-purpose warehouse

An XS warehouse costs at most 1 credit per hour (per cluster - see auto-scaling above). In some cases, a warehouse may have ultra-low utilization yet regular activity. Snowflake customers are billed on warehouse uptime: how long the warehouse is active.

Understanding costs & utilization

An extreme example of underutilization is this: consider running a 1-second query every minute over the course of an hour. This still costs 1 credit, equivalent to an hour of constant usage, because a warehouse remains active for at least 60 seconds after the last activity, so it never suspends. That 1 second of usage per minute equates to 1 minute of actual work per hour, which is roughly 1.7% utilization.
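
The arithmetic above can be sketched as follows, using the numbers from the example:

```python
# Underutilization example: a 1-second query every minute for an hour.
query_seconds = 1             # each query runs for 1 second
queries_per_hour = 60         # one query per minute
min_active_after_query = 60   # warehouse stays active >= 60 s after activity

# A query every 60 seconds keeps the warehouse continuously active,
# so the full hour is billed even though real work is tiny.
billed_seconds = 3600
used_seconds = query_seconds * queries_per_hour  # 60 s of actual work

utilization = used_seconds / billed_seconds
print(f"{utilization:.1%}")  # prints 1.7%
```

The same math explains why bundling several light workloads onto one warehouse, or shortening auto-suspend, can cut the billed-but-idle time.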

Multi-purpose warehouse (best with Snowflake Enterprise auto-scaling)

Adding multiple workloads to a warehouse improves its efficiency; in some cases, it even lets some queries run within the warehouse's existing capacity at no extra cost. An example is running lightweight queries from multiple products on the same warehouse - this can improve efficiency, sometimes at the cost of performance. However, if you have auto-scaling, concurrent queries will add a cluster, meaning you only pay for multiple clusters when you need them.

Dedicated warehouse

For many teams, a dedicated warehouse is preferred: you can accurately view cost per warehouse and understand the cost of each workload. Much like a team where each member has a separate job, accounting for cost is simple, yet you're paying for more team members; the same thinking applies to warehouses.

Alternatively, for clarity on costs, you can rely on query tags, estimating a cost for each tagged query. We use this approach in Astrato's Snowflake Consumption Workbook Accelerator, which reduces the need for dedicated warehouses.
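
As a sketch, a tag can be set per session and then used to group usage in Snowflake's query history; the tag value is illustrative:

```sql
-- Tag all queries issued in this session (tag value is hypothetical)
ALTER SESSION SET QUERY_TAG = 'astrato_sales_dashboard';

-- Later, summarize activity per tag from the account usage view
SELECT query_tag,
       COUNT(*) AS queries,
       SUM(total_elapsed_time) / 1000 AS total_seconds
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE query_tag = 'astrato_sales_dashboard'
GROUP BY query_tag;
```

Elapsed time is a proxy, not a bill: to estimate credits you would still apportion warehouse uptime across the tagged queries.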
