Snowflake is a data warehouse built specifically for the cloud. It provides an analytic data warehouse as a service that is designed to be faster, easier to use, and more flexible than traditional data warehouse offerings. The Snowflake data warehouse uses a new SQL database engine with an architecture designed for the cloud. Snowflake’s data warehouse is a true SaaS offering. Specifically:
- There is no hardware (virtual or physical) for you to select, install, configure, or manage.
- There is no software for you to install, configure, or manage.
- Ongoing maintenance, management, and tuning are handled by Snowflake.
- Snowflake runs completely on cloud infrastructure. All components of Snowflake’s service, other than an optional command line client, run in a public cloud infrastructure.
- Snowflake uses virtual compute instances for its compute needs and a storage service for persistent storage of data.
- Snowflake cannot be run on private cloud infrastructures (on-premises or hosted).
- Snowflake is not a packaged software offering that can be installed by a user. Snowflake manages all aspects of software installation and updates.
Snowflake’s architecture is a hybrid of traditional shared-disk database architectures and shared-nothing database architectures. Like shared-disk architectures, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the data warehouse. But like shared-nothing architectures, Snowflake processes queries using MPP (massively parallel processing) compute clusters where each node in the cluster stores a portion of the entire data set locally. This approach offers the data management simplicity of a shared-disk architecture, but with the performance and scale-out benefits of a shared-nothing architecture.
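The hybrid described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how Snowflake is implemented: `CENTRAL_STORE`, `ComputeNode`, and `run_query` are hypothetical names, and the round-robin split stands in for whatever partitioning a real MPP engine uses. It shows one central repository that every node can read (the shared-disk side) and nodes that each aggregate only their own local slice before the partial results are merged (the shared-nothing side).

```python
# Toy model of a hybrid shared-disk / shared-nothing layout.
# All names here are hypothetical and for illustration only.

# Shared-disk side: one central repository that every node can read.
CENTRAL_STORE = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 25},
    {"region": "east", "amount": 5},
    {"region": "north", "amount": 40},
    {"region": "west", "amount": 15},
]


class ComputeNode:
    """One node of an MPP cluster holding a local slice of the data."""

    def __init__(self, node_id, node_count):
        self.node_id = node_id
        self.node_count = node_count
        self.local_rows = []

    def load_slice(self, store):
        # Shared-nothing side: each node keeps only its own portion locally.
        # Round-robin assignment is an assumption made for this sketch.
        self.local_rows = [
            row for i, row in enumerate(store)
            if i % self.node_count == self.node_id
        ]

    def partial_sum(self):
        # Each node aggregates its slice independently of the others.
        totals = {}
        for row in self.local_rows:
            totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
        return totals


def run_query(store, node_count):
    """Fan the aggregation out to the nodes, then merge the partial results."""
    nodes = [ComputeNode(i, node_count) for i in range(node_count)]
    merged = {}
    for node in nodes:
        node.load_slice(store)
        for region, subtotal in node.partial_sum().items():
            merged[region] = merged.get(region, 0) + subtotal
    return merged


result = run_query(CENTRAL_STORE, node_count=2)
print(result)  # {'east': 15, 'west': 40, 'north': 40}
```

The merged totals are the same for any node count, which is the point of the design: the data lives in one place, while the work is split across nodes.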
Snowflake’s architecture consists of three key layers: database storage, query processing, and cloud services.
When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. This optimized data is stored in the cloud. Snowflake manages all aspects of how this data is stored: organization, file size, structure, compression, metadata, statistics, and other aspects of data storage. The data objects stored by Snowflake are neither directly visible nor accessible to customers; they are accessible only through SQL query operations run using Snowflake.
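To see why a columnar layout tends to compress well, consider the following minimal sketch. It does not reproduce Snowflake's internal format, which is not exposed to customers. The helper names `rows_to_columns` and `run_length_encode` are hypothetical, and run-length encoding is used here only as one simple, well-known compression scheme. The sketch also assumes every row has the same keys.

```python
from itertools import groupby


def rows_to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


rows = [
    {"country": "US", "status": "paid"},
    {"country": "US", "status": "paid"},
    {"country": "US", "status": "open"},
    {"country": "DE", "status": "open"},
]

columns = rows_to_columns(rows)
encoded = {name: run_length_encode(values) for name, values in columns.items()}
print(encoded["country"])  # [('US', 3), ('DE', 1)]
print(encoded["status"])   # [('paid', 2), ('open', 2)]
```

Storing each column together puts similar values next to each other, so even a simple scheme shrinks four values per column down to two pairs.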
Query execution is performed in the processing layer. Snowflake processes queries using “virtual warehouses”. Each virtual warehouse is an MPP compute cluster composed of multiple compute nodes allocated by Snowflake from a cloud provider. Each virtual warehouse is an independent compute cluster that does not share compute resources with other virtual warehouses. As a result, each virtual warehouse has no impact on the performance of other virtual warehouses.
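The isolation between virtual warehouses can be pictured with a small model. This is a deliberately simplified sketch, not Snowflake's scheduler: the `VirtualWarehouse` class is hypothetical, and it assumes one query per node, whereas a real warehouse can run many queries concurrently. It shows only that a backlog on one warehouse does not delay work submitted to another.

```python
class VirtualWarehouse:
    """Hypothetical stand-in for an independent compute cluster."""

    def __init__(self, name, node_count):
        self.name = name
        self.node_count = node_count
        self.running = []  # queries currently holding a node
        self.queued = []   # queries waiting for a free node

    def submit(self, query):
        # A query only competes with other queries on this same warehouse.
        # One query per node is a simplifying assumption of this sketch.
        if len(self.running) < self.node_count:
            self.running.append(query)
            return "running"
        self.queued.append(query)
        return "queued"


etl = VirtualWarehouse("etl_wh", node_count=2)
reporting = VirtualWarehouse("reporting_wh", node_count=2)

# Saturate the ETL warehouse with more work than it has nodes.
etl_states = [etl.submit(f"load batch {i}") for i in range(4)]

# The reporting warehouse is unaffected because it has its own nodes.
report_state = reporting.submit("SELECT COUNT(*) FROM orders")

print(etl_states)    # ['running', 'running', 'queued', 'queued']
print(report_state)  # running
```

Because each warehouse tracks its own nodes and queue, heavy loading jobs and interactive reporting can be assigned to separate warehouses without contending for the same compute.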
The cloud services layer is a collection of services that coordinate activities across Snowflake. These services tie together all the different components of Snowflake to process user requests, from login to query dispatch. The cloud services layer also runs on compute instances provisioned by Snowflake from the cloud provider. The services in this layer include:
- Infrastructure management
- Metadata management
- Query parsing and optimization
- Access control
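The path of a request through these services, from login to query dispatch, can be sketched as follows. Everything here is a hypothetical stand-in: the `USERS`, `GRANTS`, and `TABLES` dictionaries, the tiny parser, and `handle_request` are assumptions made for illustration and bear no relation to Snowflake's actual interfaces.

```python
# Hypothetical in-memory stand-ins for the services in this layer.
USERS = {"ana": "s3cret", "bob": "hunter2"}   # login credentials
GRANTS = {("ana", "orders"): {"SELECT"}}      # access control
TABLES = {"orders": ["id", "amount"]}         # metadata


def authenticate(user, password):
    return user in USERS and USERS[user] == password


def parse(sql):
    """Very small parser: only understands 'SELECT <cols> FROM <table>'."""
    tokens = sql.strip().rstrip(";").split()
    if (
        len(tokens) != 4
        or tokens[0].upper() != "SELECT"
        or tokens[2].upper() != "FROM"
    ):
        raise ValueError("unsupported statement")
    return {
        "action": "SELECT",
        "columns": tokens[1].split(","),
        "table": tokens[3],
    }


def handle_request(user, password, sql):
    """Walk one request through login, parsing, metadata, and access checks."""
    if not authenticate(user, password):
        return "login failed"
    plan = parse(sql)
    if plan["table"] not in TABLES:
        return "unknown table"
    if plan["action"] not in GRANTS.get((user, plan["table"]), set()):
        return "access denied"
    # Only now would the query be handed to a virtual warehouse.
    return f"dispatched {plan['action']} on {plan['table']} to a virtual warehouse"


print(handle_request("ana", "s3cret", "SELECT id,amount FROM orders"))
print(handle_request("bob", "hunter2", "SELECT id FROM orders"))
```

In this sketch the first call is dispatched and the second is rejected by the access check, so no compute is spent on a request that fails an earlier step.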
- Snowflake-trained and Cloud Analytics Academy-certified consultants who are at the forefront of industry best practices
- Extensive experience in BI and analytics projects
- A unique managed services model that melds onsite consultants with our tech center resources to deliver optimized solutions