Write optimized Snowflake SQL with CTEs, window functions, and semi-structured data
You are a Snowflake SQL optimization expert. The user wants to write production-grade Snowflake SQL leveraging CTEs, window functions, and semi-structured data handling.
What to check first
- Confirm a Snowflake warehouse is active: SELECT CURRENT_WAREHOUSE();
- Verify you have USAGE privilege on the target database: SHOW GRANTS ON DATABASE your_db;
- Check whether the data contains VARIANT columns: run DESC TABLE your_table; and look for the VARIANT data type
Steps
- Structure multi-step logic using WITH clauses (CTEs) to build intermediate result sets before the final SELECT
- Use window functions (ROW_NUMBER(), RANK(), LAG(), LEAD()) with PARTITION BY and ORDER BY to avoid expensive self-joins
- Extract semi-structured data from VARIANT/OBJECT/ARRAY columns using : notation, and use LATERAL FLATTEN() for nested arrays
- Apply the QUALIFY clause to filter window function results directly instead of wrapping them in subqueries
- Use RECURSIVE CTEs only when needed for hierarchical data; prefer flattening for most semi-structured scenarios
- Use TRY_PARSE_JSON() and TRY_CAST() to handle malformed JSON gracefully; both return NULL instead of raising an error
- Clustering: Snowflake has no conventional indexes. On very large tables, consider a clustering key on the columns used most often in filters and joins, using ALTER TABLE ... CLUSTER BY. Columns with very high cardinality are usually poor candidates unless wrapped in an expression that lowers their cardinality (for example, truncating a timestamp to a date)
- Test query performance with EXPLAIN USING TABULAR <query>; and check scan statistics with QUERY_HISTORY()
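The QUALIFY and TRY_ steps above can be sketched together as follows. This is a minimal illustration, not part of the original skill: the table raw_events and its columns (event_id, customer_id, received_at, payload) are hypothetical, and payload is assumed to be a VARCHAR holding JSON text that may be malformed.

```sql
-- Hypothetical table: raw_events(event_id, customer_id, received_at, payload VARCHAR)
WITH parsed AS (
    SELECT
        event_id,
        customer_id,
        received_at,
        -- TRY_PARSE_JSON returns NULL instead of raising an error on malformed JSON
        TRY_PARSE_JSON(payload) AS doc
    FROM raw_events
)
SELECT
    event_id,
    customer_id,
    received_at,
    -- TRY_CAST expects a string input, so cast the VARIANT field to STRING first
    TRY_CAST(doc:amount::STRING AS NUMBER(10, 2)) AS amount
FROM parsed
WHERE doc IS NOT NULL
-- QUALIFY filters on the window function directly; no wrapping subquery is needed
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY received_at DESC) = 1;
```

The query keeps only the most recent parseable event per customer. Because WHERE runs before the window function is evaluated, rows with malformed payloads never compete for the top rank.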
Code
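-- Optimized Snowflake SQL: sales analytics with semi-structured metadata
WITH order_summary AS (
    SELECT
        order_id,
        customer_id,
        order_date,
        total_amount,
        -- metadata is a VARIANT column; extract and cast nested fields
        metadata:customer_tier::STRING AS tier,
        metadata:region::STRING AS region,
        -- Carry the raw items array forward so flattened_items can read it
        metadata:items AS items,
        -- Window functions replace self-joins on customer_id
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS recency_rank,
        -- The WHERE clause below limits this total to the last 12 months
        SUM(total_amount) OVER (PARTITION BY customer_id) AS customer_lifetime_value,
        LAG(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS previous_order_amount
    FROM orders
    WHERE order_date >= DATEADD(MONTH, -12, CURRENT_DATE())
),
flattened_items AS (
    -- Flatten the items array: one output row per array element
    SELECT
        o.order_id,
        f.value:product_id::STRING AS product_id,
        f.value:quantity::INT AS qty,
        f.value:price::DECIMAL(10, 2) AS price
    FROM order_summary AS o,
        -- Casting to STRING first lets TRY_PARSE_JSON accept either a native array or JSON text
        LATERAL FLATTEN(INPUT => TRY_PARSE_JSON(o.items::STRING)) AS f
)
-- Assumed completion: the source was cut off inside the FLATTEN call above.
-- This final SELECT is an illustrative sketch, not the author's original.
SELECT
    o.order_id,
    o.customer_id,
    o.tier,
    o.region,
    o.customer_lifetime_value,
    i.product_id,
    i.qty,
    i.price
FROM order_summary AS o
JOIN flattened_items AS i
    ON i.order_id = o.order_id
WHERE o.recency_rank = 1;
Note: this example was truncated in the source partway through the FLATTEN call. Everything after that point is a minimal completion added so the query runs, not the author's code. See the GitHub repo for the latest full version.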
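The example above does not exercise the recursive CTE mentioned in the steps. A minimal sketch of that case follows, assuming a hypothetical employees table (employee_id, manager_id, name) in which top-level employees have a NULL manager_id. None of these names come from the original skill.

```sql
-- Hypothetical table: employees(employee_id, manager_id, name)
WITH RECURSIVE org_chart AS (
    -- Anchor member: employees with no manager sit at depth 1
    SELECT employee_id, manager_id, name, 1 AS depth
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive member: direct reports of the rows found so far
    SELECT e.employee_id, e.manager_id, e.name, oc.depth + 1
    FROM employees AS e
    JOIN org_chart AS oc
        ON e.manager_id = oc.employee_id
)
SELECT employee_id, name, depth
FROM org_chart
ORDER BY depth, employee_id;
```

This shape suits parent/child hierarchies of unknown depth. For nested arrays inside a VARIANT column, LATERAL FLATTEN is usually the simpler choice.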
Common Pitfalls
- Treating this skill as a one-shot solution — most workflows need iteration and verification
- Skipping the verification steps — you don't know it worked until you measure
- Applying this skill without understanding the underlying problem — read the related docs first
When NOT to Use This Skill
- When a simpler manual approach would take less than 10 minutes
- On critical production systems without testing in staging first
- When you don't have permission or authorization to make these changes
How to Verify It Worked
- Run the verification steps documented above
- Compare the output against your expected baseline
- Check logs for any warnings or errors — silent failures are the worst kind
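One concrete way to measure, sketched under assumptions: the query below reuses the orders table from the example, and the aggregate is only a stand-in for whatever statement you are tuning. EXPLAIN shows the plan without running the statement, and the INFORMATION_SCHEMA.QUERY_HISTORY table function returns per-query statistics that can be compared against a baseline.

```sql
-- Show the execution plan without running the statement
EXPLAIN USING TABULAR
SELECT customer_id, SUM(total_amount) AS spend
FROM orders
WHERE order_date >= DATEADD(MONTH, -12, CURRENT_DATE())
GROUP BY customer_id;

-- Review recent statements: elapsed time is in milliseconds
SELECT
    query_id,
    query_text,
    total_elapsed_time,
    bytes_scanned,
    rows_produced
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(RESULT_LIMIT => 20))
ORDER BY start_time DESC;
```

Comparing bytes_scanned and total_elapsed_time before and after a rewrite gives a rough signal. Results can be skewed by warehouse caching, so repeat the run before drawing conclusions.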
Production Considerations
- Test in staging before deploying to production
- Have a rollback plan — every change should be reversible
- Monitor the affected systems for at least 24 hours after the change
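For the clustering-key step, a reversible change might look like the sketch below. It assumes the orders table from the example and treats order_date as the candidate key; whether that column is a good choice depends on your own filter and join patterns.

```sql
-- Inspect how well the table is already clustered on the candidate column
SELECT SYSTEM$CLUSTERING_INFORMATION('orders', '(order_date)');

-- Define the clustering key (background reclustering consumes credits)
ALTER TABLE orders CLUSTER BY (order_date);

-- Rollback: remove the clustering key if it does not pay off
ALTER TABLE orders DROP CLUSTERING KEY;
```

Running the inspection query first avoids paying for reclustering on a table that is already well ordered by load time.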