Snowflake is a data platform built for the cloud. It consolidates data warehousing, data marts, and data lakes into a single platform so that all data is available to all business users. The architecture consists of three layers: database storage, query processing, and cloud services.
Snowflake effectively reinvents data warehousing, eliminating complexity related to integrating different data sources and types.
“Why should you pay for something not currently in use? Why should you manually relate to scalability? Why should there be downtime relating to scalability and database changes? These are some of the things I admire Snowflake for solving. You shouldn’t have to analyze and guess the number of CPUs you need at 8:00 in the morning or 18:00 in the evening.
Other technologies provide something similar, but they fall short since the redeployment time for compute is often more than 1 minute. I don’t know how Snowflake has managed to solve this, but I think a lot of people are jealous of their less than 1 second redeployment time.
Zero-copy clone is to me a completely unique way of testing for corrections and alterations in a 1:1 setup such as production. I have seen a lot of heavy solutions on how to synchronize development, test, and production setups in your database, but I have not previously seen a well-working solution. Zero-copy clone provides exactly that. It takes a meta-data copy of e.g. a production database for a test environment in 1 second. Since it only copies meta-data and not actual data, the cost of this is next to nothing.”
Adam Boje Hertz, Head of AI & Data Platforms at Intellishore
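To make the points in the quote concrete, here is a minimal sketch of the kind of statements involved: a zero-copy clone, a compute resize, and an idle timeout so that unused compute is not billed. The helper functions and the names prod_db, test_db, and analytics_wh are hypothetical and only serve the illustration. The size list is deliberately partial.

```python
import re

# Accept plain unquoted identifiers only; anything else is rejected, not escaped.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")

# Partial list of warehouse sizes, enough for this illustration.
_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def _ident(name: str) -> str:
    """Return the name unchanged if it is a simple identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def clone_database_sql(source: str, target: str) -> str:
    """Zero-copy clone: the new database initially shares the source's storage."""
    return f"CREATE DATABASE {_ident(target)} CLONE {_ident(source)}"


def resize_warehouse_sql(warehouse: str, size: str) -> str:
    """Change the compute size of an existing warehouse."""
    size = size.upper()
    if size not in _SIZES:
        raise ValueError(f"unsupported size in this sketch: {size!r}")
    return f"ALTER WAREHOUSE {_ident(warehouse)} SET WAREHOUSE_SIZE = '{size}'"


def auto_suspend_sql(warehouse: str, idle_seconds: int) -> str:
    """Suspend the warehouse after a period of inactivity and resume on demand."""
    if idle_seconds < 0:
        raise ValueError("idle_seconds must not be negative")
    return (
        f"ALTER WAREHOUSE {_ident(warehouse)} "
        f"SET AUTO_SUSPEND = {int(idle_seconds)} AUTO_RESUME = TRUE"
    )


if __name__ == "__main__":
    # Hypothetical object names, printed rather than executed.
    for statement in (
        clone_database_sql("prod_db", "test_db"),
        resize_warehouse_sql("analytics_wh", "large"),
        auto_suspend_sql("analytics_wh", 60),
    ):
        print(statement + ";")
```

The sketch only builds the statements. Running them would require an authenticated session, for example through a DB-API style cursor, which is left out here.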
Snowflake’s patented multi-cluster shared data architecture delivers a platform that enables many different workloads. These include data warehouses, data lakes, data pipelines, and data exchanges, as well as many types of business intelligence, data science, and data analytics applications.
… and different data types and sources
Snowflake handles and optimizes both structured and semi-structured data, the latter including formats such as JSON, Avro, and XML. The platform offers standards-based connectors and drivers such as ODBC and JDBC, and works with JavaScript, Python, Spark, R, and Node.js. As a result, developers can use the tools, languages, and frameworks they need.
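As an illustration of what querying semi-structured data can look like, the sketch below builds a SELECT statement that reads nested JSON fields from a single column using Snowflake's colon path notation and a cast. It assumes a hypothetical table named events with a JSON column named payload. The helper functions and the short list of cast types are also assumptions made for the example.

```python
import re
from typing import Mapping, Tuple

# Accept plain unquoted identifiers only; anything else is rejected, not escaped.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")

# Partial list of target types, enough for this illustration.
_CAST_TYPES = {"STRING", "NUMBER", "BOOLEAN"}


def _ident(name: str) -> str:
    """Return the name unchanged if it is a simple identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def variant_path(column: str, dotted_path: str, cast_to: str) -> str:
    """Build an expression such as payload:customer.name::STRING."""
    cast = cast_to.upper()
    if cast not in _CAST_TYPES:
        raise ValueError(f"unsupported cast in this sketch: {cast_to!r}")
    keys = ".".join(_ident(key) for key in dotted_path.split("."))
    return f"{_ident(column)}:{keys}::{cast}"


def select_json_fields_sql(
    table: str, column: str, fields: Mapping[str, Tuple[str, str]]
) -> str:
    """fields maps an output alias to (dotted JSON path, cast type)."""
    if not fields:
        raise ValueError("at least one field is required")
    projections = [
        f"{variant_path(column, path, cast)} AS {_ident(alias)}"
        for alias, (path, cast) in fields.items()
    ]
    return f"SELECT {', '.join(projections)} FROM {_ident(table)}"


if __name__ == "__main__":
    # Hypothetical table, column, and JSON keys.
    print(
        select_json_fields_sql(
            "events",
            "payload",
            {
                "customer_name": ("customer.name", "string"),
                "amount": ("purchase.total", "number"),
            },
        )
    )
```

The point of the example is that the JSON stays in one column and is addressed by path at query time, rather than being flattened into fixed columns beforehand.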
Furthermore, Snowflake Data Marketplace allows you to discover new datasets and services.
In an effort to help mitigate data silos in organizations large and small, Snowflake allows data to be shared globally. When requested, this happens instantaneously without anyone having to copy or move data. The platform is also cloud-agnostic: in addition to distributing data across regions, Snowflake can distribute it across different cloud providers, including AWS, Google Cloud, and Microsoft Azure.
This effectively allows large organizations to break down data silos and obtain unified insights from an all-encompassing data platform.
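The sharing model described above can be sketched as a short sequence of statements: create a share, grant it read access to one table, and add a consumer account. The sketch assumes a direct share to an account in the same region. Sharing across regions or cloud providers involves additional setup that is not shown. All names (sales_share, sales_db.public.orders, myorg.consumer_acct) and the helper functions are hypothetical.

```python
import re
from typing import Iterable, List

_PART = r"[A-Za-z_][A-Za-z0-9_$]*"
_IDENTIFIER = re.compile(_PART)
_QUALIFIED = re.compile(rf"{_PART}(\.{_PART})*")


def _ident(name: str) -> str:
    """Accept a single unquoted identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def _qualified(name: str) -> str:
    """Accept a dotted name such as database.schema.table, else raise."""
    if not _QUALIFIED.fullmatch(name):
        raise ValueError(f"not a valid dotted name: {name!r}")
    return name


def share_table_statements(
    share: str, table: str, consumer_accounts: Iterable[str]
) -> List[str]:
    """Build the statements that expose one table through a share."""
    share = _ident(share)
    parts = _qualified(table).split(".")
    if len(parts) != 3:
        raise ValueError("table must be given as database.schema.table")
    database, schema, _ = parts
    accounts = ", ".join(_qualified(account) for account in consumer_accounts)
    if not accounts:
        raise ValueError("at least one consumer account is required")
    return [
        f"CREATE SHARE {share}",
        # The consumer needs usage on the containing database and schema.
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share}",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share}",
        # Read-only access to the table itself; no data is copied.
        f"GRANT SELECT ON TABLE {table} TO SHARE {share}",
        f"ALTER SHARE {share} ADD ACCOUNTS = {accounts}",
    ]


if __name__ == "__main__":
    # Hypothetical share, table, and account names, printed rather than executed.
    for statement in share_table_statements(
        "sales_share", "sales_db.public.orders", ["myorg.consumer_acct"]
    ):
        print(statement + ";")
```

None of these statements copies or moves rows. The share grants another account read access to the provider's existing data.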
Snowflake operates as a true software-as-a-service solution. Its fully managed service layer handles user sessions and resources, enforces security measures, compiles queries, enables data governance, and ensures atomicity, consistency, isolation, and durability (ACID).
Snowflake handles and services the underlying infrastructure automatically. This allows organizations to focus on analyzing and gaining insights from data rather than spending resources on maintaining the data platform.