The SnowPro™ Core Certification tests your core expertise in implementing and migrating to Snowflake. As a certified professional, you can design, develop, and manage scalable and secure Snowflake solutions to drive business objectives.
This exam covers the architectural principles, core aspects, and best practices of Snowflake solutions, including data loading/unloading, performance, concurrency, scaling, data types, connectors, security, and account management.
For the Snowflake SnowPro Core exam, it is always highly recommended to go through practice exams and practice questions to familiarize yourself with the real exam pattern. Whizlabs offers a great set of practice questions for this certification exam.
This set consists of 50 questions across all the domains of the Snowflake SnowPro Core certification. It will give you a clear idea of the type of questions asked in the real exam.
SnowFlake SnowPro Core Certification Free Questions
Please find the below set as a free exam exercise that might help you to understand the exam pattern.
Account and Security
Q 1. You are the owner of a table T1 which is in schema S1. The schema is in database D1. In order to grant read-only permissions of this table to a newly created role R1, you will need to…[select all that apply]
A. Grant ‘USAGE’ on database D1 B. Grant ‘SELECT’ on database D1 C. Grant ‘USAGE’ on schema S1 D. Grant ‘SELECT’ on schema S1 E. Grant ‘USAGE’ on table T1
F. Grant ‘SELECT’ on table T1
Correct Answers: A, C, and F
To allow a role to use your database objects, you, as the owner, must grant appropriate privileges on the database, schema, and objects. First, you will need to grant the USAGE privilege on the parent database and schema. The USAGE privilege enables the grantee to refer to the database and schema in a query. Then, you will need to grant the SELECT privilege on the table. The SQL statements are given below:
grant usage on database D1 to role R1;
grant usage on schema D1.S1 to role R1;
grant select on table D1.S1.T1 to role R1;
Q 2. Identify system-defined roles in Snowflake from the roles given below. [select four]
A. AccountAdmin B. SysAdmin C. Auditor D. RoleAdmin E. SecurityAdmin
F. Public
Correct Answers: A, B, E, and F
A role in Snowflake is essentially a container of privileges on objects. Roles are assigned to users to allow them to perform actions on the objects. A role can be directly assigned to a user, or a role can be granted to another role, leading to the creation of role hierarchies. A role at a higher level in the hierarchy inherits the privileges of the roles at lower levels. Snowflake defines five system-defined roles. These roles, their key features, and the hierarchy of system-defined roles are shown in the diagram below.
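As an illustration, a role hierarchy can be built with plain SQL. The role, object, and user names below are hypothetical, not part of the exam question:

```sql
-- Create a custom role and grant it privileges on objects
create role analyst;
grant usage on database d1 to role analyst;
grant usage on schema d1.s1 to role analyst;
grant select on all tables in schema d1.s1 to role analyst;

-- Grant the role to a system-defined role: SYSADMIN now inherits its privileges
grant role analyst to role sysadmin;

-- Grant the role to a user
grant role analyst to user alice;
```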
Snowflake Virtual Warehouses
Q 3. What are the minimum and the maximum number of clusters in a multi-cluster warehouse?
A. Minimum: 1, Maximum: 99 B. Minimum: 1, Maximum: 100 C. Minimum: 1, Maximum: 10
D. Minimum: 1, Maximum: unlimited
Correct Answer: C
- Answer choice-C is correct. In a multi-cluster warehouse, the minimum number of clusters can be 1 and the maximum number of clusters can be 10.
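A sketch of creating a multi-cluster warehouse with cluster limits [the warehouse name and settings are illustrative; multi-cluster warehouses require the Enterprise edition or above]:

```sql
create warehouse my_mc_wh with
  warehouse_size    = 'MEDIUM'
  min_cluster_count = 1     -- minimum allowed value
  max_cluster_count = 10    -- maximum allowed value
  auto_suspend      = 300
  auto_resume       = true;
```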
Q 4. When a virtual warehouse is started or resized or resumed, the minimum billing charge is 1 minute.
A. True
B. False
Correct Answer: A
The above statement is True. When a new warehouse is started afresh, or when you resize an existing running warehouse or when a warehouse is resumed [automatically or manually], the warehouse is billed for a minimum of 1 minute’s worth of usage. After the 1st minute has elapsed, all subsequent billing is charged per second.
➤ Practical Info – There is no benefit in stopping a warehouse before the first minute is over because you are going to pay for the first 60 seconds anyway, as the discussion above shows.
➽ Exam Tip– It is NOT necessary to stop a virtual warehouse to resize it. You can resize a virtual warehouse even when it is executing a query.
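For example, a running warehouse can be resized with a single statement [warehouse name illustrative]:

```sql
-- Queries already executing finish on the current size;
-- subsequently submitted queries use the new size
alter warehouse compute_wh set warehouse_size = 'XLARGE';
```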
Q 5 . When a multi-cluster warehouse is suspended, which of the following Snowflake cache will be purged?
A. Metadata cache B. Remote disk cache [Resultset cache] C. Local disk cache
D. All of the above
Correct Answer: C
- When a multi-cluster warehouse [or a single cluster warehouse] is suspended, only the local disk cache is dropped. This cache is also known as warehouse cache. This is because this cache is essentially the fast SSD of the warehouse. So when the warehouse is gone, this cache is also gone with it. Answer choice-C is correct.
- Metadata cache contains Snowflake account object information and statistics. It is always on and it is never dropped. Answer choice-A is incorrect.
- Remote disk cache [resultset cache] caches exact query results, including aggregations, and it persists the exact resultset for 24 hours after the query is executed, irrespective of the state of the warehouse. Answer choice-B is incorrect.
Snowflake Overview and Architecture
Q 6. From the given options, select the cloud services provider supported by Snowflake architecture. [three choices]
A. Amazon Web Services [AWS] B. Google [GCP] C. HP Virtual Private Cloud D. Cloud Foundry
E. Microsoft Azure
Correct Answers: A, B, and E
Snowflake is a SaaS product. It supports three public cloud platforms given below.
- Amazon Web Services [AWS]
- Microsoft Azure [Azure]
- Google Cloud Platform [GCP]
Snowflake does NOT support any other cloud platform, any private cloud deployment, or on-prem deployment. Therefore, option choices – HP Virtual Private Cloud and Cloud Foundry – are not correct.
➤ Practical Info: It is not necessary that you have to host Snowflake on the same cloud platform where your organization’s other IT applications are hosted. For example, it is perfectly okay to have Snowflake hosted on Azure and your other IT applications in your organization hosted on AWS/GCP.
➽ Exam Tip: It must be noted that each Snowflake account is hosted in one cloud platform and one region, and you must choose cloud platform and region while opening an account. You cannot change it thereafter. If you want to change region and/or cloud platform at a later point, you will need to open a new Snowflake account in that region and/or cloud platform.
Your organization may open multiple Snowflake accounts [such as Dev, Test, Prod], and they can all reside on the same cloud platform and the same region, OR you may choose to open them on different cloud platforms and/or regions.
Q 7 . What are the key considerations for choosing a specific region for your Snowflake account? [Select all that apply]
A. Cost B. Proximity to the end-users C. Data sovereignty
D. Regulatory requirements
Correct Answers: A, B, C, and D
- Answer choice-A is correct. Snowflake offers different pricing for compute and storage depending upon the region you choose. For example, a Snowflake instance running in the Mumbai region costs $3.30/credit, whereas the same instance running in the Frankfurt region costs $4.00/credit. So if you are looking to save costs, you would choose a low-cost region.
- Answer choice-B is correct. The closer you are to your end-users, the lower the data transfer latency, so you can transfer and access data faster. Hence, it is always a good practice to set up your Snowflake account in a region close to your end-users.
- Answer choices C and D are correct. As is often the case in most countries, there are data sovereignty and regulatory restrictions on where user data may be stored. For example, in the EU, certain sensitive user data is not allowed to cross the EU boundary. India also has similar restrictions enforced by law. Hence, if you build software that will store such sensitive user data, you will need to check the regulatory requirements and choose the Snowflake region accordingly.
Q 8. Informatica, Matillion, and Azure Data Factory are examples of ………………….. partners of the Snowflake partner ecosystem.
A. Data Management B. Business Intelligence C. Data Integration
D. Data Science
Correct Answer: C
Informatica, Matillion, and Azure Data Factory are data integration partners that provide the following functionality: Extract – exporting data from the data source; Transform – modifying data according to business rules; Load – loading data into Snowflake.
The below diagram shows Snowflake partner ecosystem players divided based on the functional categories.
Snowflake partner eco-system [image source: Snowflake official documentation]
Q 9 . Which of the following programming languages are supported in Snowflake to write user-defined functions [UDFs]?
A. SQL B. Java C. JavaScript
D. Python
Correct Answers: A and C
User-defined functions [UDFs] can be used in Snowflake to write custom business logic which is hard to encapsulate within individual SQL statements. UDFs [together with Stored Procedures] enable database-level programming in Snowflake. At present, Snowflake supports two programming languages to write UDFs: Javascript UDFs and SQL UDFs.
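A minimal SQL UDF, as a sketch [the function name and logic are made up for illustration]:

```sql
create or replace function area_of_circle(radius float)
  returns float
  as
  $$
    pi() * radius * radius
  $$;

-- Use it like any built-in function
select area_of_circle(2.0);
```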
➽ Practical Info– Stored procedures in Snowflake are written in Javascript. The javascript code can issue SQL statements to process data in the Snowflake database. An example of a Snowflake stored procedure that clears a log table using SQL is given below: [this is only for your understanding, we are not expecting coding related questions in the exam]
create or replace procedure clear_log()
returns int not null
language javascript
as
$$
var delete_log_sql_command = "delete from user_log";
var return_value = 0;
var statement1 = snowflake.createStatement({sqlText: delete_log_sql_command});
var result_set1 = statement1.execute();
return_value = 1;
return return_value;
$$;
Q 10. Which Snowflake edition supports transmitting data in encrypted form over the network between VPCs [virtual private cloud]?
A. All editions B. Enterprise edition and above C. Business Critical edition and Above
D. All except for the Virtual Private Snowflake [VPS] edition
Correct Answer: C
A Snowflake account on AWS [or Azure] is implemented as a VPC. There are two ways to establish communication between your Snowflake VPC and other VPCs [e.g., your organization's VPC]. One is to transmit the traffic over the public internet. The other [and safer] option is to establish an exclusive, highly secure network between your Snowflake account and your other AWS VPCs [in the same AWS region], fully protected from unauthorized access. To implement this secure channel of communication between VPCs, AWS supports a feature called AWS PrivateLink [Azure supports a similar feature called Azure Private Link]. Snowflake offers support for AWS PrivateLink [and Azure Private Link] based communication in the Business Critical edition and above.
Q 11. More than one clustering key can co-exist in a Snowflake table
A. True
B. False
Correct Answer: B
This statement is false. You can define at most one clustering key in a Snowflake table to organize micro-partitions. When you define a clustering key, Snowflake will reorganize the naturally clustered micro-partitions and will relocate related rows to the same micro-partition and group them according to the clustering key. This process is called Reclustering.
➤ Practical Info – Reclustering happens automatically once a clustering key is defined for a table. The process consumes credits. So be cognizant of the cost when you go for reclustering.
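Defining [or removing] a clustering key is a simple DDL operation; the table and column names below are illustrative:

```sql
-- Designate a clustering key; Snowflake reclusters automatically afterwards
alter table sales cluster by (sale_date, region);

-- Drop the clustering key if the reclustering cost outweighs the benefit
alter table sales drop clustering key;
```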
Q 12. Which of the following statements will you use to change the warehouse for workload processing to a warehouse named ‘COMPUTE_WH_XL’?
A. SET CURRENT_WAREHOUSE = COMPUTE_WH_XL B. USE WAREHOUSE COMPUTE_WH_XL; C. USE CURRENT_WAREHOUSE[‘COMPUTE_WH_XL’];
D. SET CURRENT_WAREHOUSE = COMPUTE_WH, SIZE = XL;
Correct Answer: B
A session context in Snowflake consists of 4 objects:
- Role
- Warehouse
- Database
- Schema
You can set the appropriate session context using the SQL statements given below. These statements specify the role, warehouse, database, or schema to use for the current session:
- USE ROLE
- USE WAREHOUSE
- USE DATABASE
- USE SCHEMA
So the correct answer to this question is:
Use warehouse COMPUTE_WH_XL;
Q 13. In the case of a Snowflake account created on AWS, ……………….. is responsible for the management of Availability Zones?
A. Customer B. Snowflake C. Cloud Provider
D. It is a shared responsibility
Correct Answer: C
An Availability Zone [AZ] is essentially a group of one or more physically separated data centers with redundant power, networking, and connectivity and located within a single cloud region. Each cloud region has multiple AZs [most have 3 or more]. The AZs are connected with high-bandwidth, low-latency network infrastructure and support synchronous replication. Due to the automatic, synchronous replication and physical separation plus isolation, AZs enable your applications and databases to be highly available. When one AZ within a region fails, another one should remain active and the switch between the two zones is silent and transparent to customers. The cloud providers are responsible for the maintenance of AZ infrastructure and replication of your databases and switching to alternate AZ in case of any failure.
Snowflake Storage and Protection
Q 14. Once the time-travel period has expired, it is possible to request Snowflake support to retrieve historical data for a period of
A. 1 day B. 7 days C. 14 days D. It depends on the Snowflake edition
E. It is user-configurable
Correct Answer: B
After the time-travel data retention period is over, you can use Snowflake’s fail-safe feature to recover your data. The duration of the fail-safe period is 7 days [Answer choice-B is correct]. This is a fixed duration and cannot be changed. Only Snowflake support personnel can recover data during the fail-safe period. The fail-safe feature is available to all customers irrespective of the Snowflake edition.
The below diagram succinctly summarizes key differences between Snowflake’s two important data protection features – time-travel and fail-safe.
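As a sketch, time travel is exercised with ordinary SQL [the table name and values are illustrative; retention beyond 1 day requires the Enterprise edition or above]:

```sql
-- Set the time-travel retention period for a table
alter table orders set data_retention_time_in_days = 30;

-- Query the table as it existed one hour ago
select * from orders at(offset => -3600);

-- Recover a dropped table while it is still within the time-travel window
undrop table orders;
```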
Q 15. Which of the following statements are TRUE concerning a stream object in Snowflake? [select all that apply]
A. A stream object provides a record of DML changes [inserts, updates, deletes] made to a table at row level. B. A stream object can keep track of DML changes for the entire life of a table. C. Streams on materialized views are not supported.
D. Streams on external tables are not supported.
Correct Answers: A and C
Answer choice-A is correct. A stream object provides change tracking over a source table. It records DML changes [inserts, updates, and deletes] made to the table, as well as metadata about each change. This is referred to as Change Data Capture [CDC], and this feature is extensively used in data warehousing scenarios to create data pipelines. Please note that the stream object itself does not store this data; it relies on the version history of the source data maintained in the metadata layer.
Answer choice-B is incorrect. Stream object keeps track of DML changes of a source table up until the data retention period of the source table. After that, the DML changes are no longer accessible.
Answer choice-C is correct. Currently, Snowflake does not support creating stream objects on materialized views.
Answer choice-D is incorrect. Snowflake supports creating insert-only stream objects on external tables.
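A minimal stream workflow, for illustration [the object names are hypothetical]:

```sql
-- Create a stream to track DML changes on a source table
create or replace stream orders_stream on table orders;

-- Inspect pending changes; METADATA$ACTION marks each row INSERT or DELETE
select * from orders_stream;

-- Consuming the stream in a DML statement advances its offset
insert into orders_history
  select * from orders_stream where metadata$action = 'INSERT';
```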
Q 16. Only one stream object can be created on a source table
A. True
B. False
Correct Answer: B
- The above statement is false. You can create any number of streams on a source table. These streams can have the same or different offset positions. One example of creating multiple streams is when you want to report month-on-month changes, week-on-week changes, and day-on-day changes happening in a product inventory table of your POS database. In this case, you may create three streams on the table to record monthly, weekly and daily changes. All three streams exist independently of each other with their respective offset positions.
Q 17. Snowflake replicates the following layer[s] across availability zones [select two]
A. Cloud Services Layer B. Warehouse Layer C. Storage Layer D. Cloud Agnostic Layer
E. Data Sharing Layer
Correct Answers: A and C
Snowflake replicates the cloud services layer [Answer choice-A] and the storage layer [Answer choice-C] across the availability zones of your cloud and region.
Data loading/unloading
Q 18. Which Snowflake edition supports Search Optimization Service to improve performance of point lookup queries?
A. All editions B. Enterprise edition and above C. Business Critical edition and Above
D. All except Virtual Private Snowflake [VPS] edition
Correct Answer: B
Search Optimization Service in Snowflake is a background service that is designed to improve the performance of point lookup queries. A point lookup query is essentially a SELECT statement that returns only a small number of distinct rows from a large dataset. The service runs transparently from an end-user’s standpoint. Only the Enterprise edition and above provide this feature.
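Enabling the service is a per-table DDL operation, sketched below [the table name is illustrative]:

```sql
-- Register the table with the search optimization service
alter table customers add search optimization;

-- The SEARCH_OPTIMIZATION column of SHOW TABLES turns ON once enabled
show tables like 'customers';
```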
Q 19. Consider the XML given below. The XML file is loaded in a User Stage:
<catalog>
  <book>
    <title>Famous Five</title>
    <author>Roald dahl</author>
  </book>
  <book>
    <title>Secret Seven</title>
    <author>Enid Blayton</author>
  </book>
  <book>
    <title>Mahashweta</title>
    <author>Sudha Murthy</author>
  </book>
</catalog>
To remove the top-level element and load elements into separate rows in a Snowflake table, which of the following file format options will be used in the COPY INTO command?
A. STRIP_TOP_ELEMENT B. STRIP_OUTER_ELEMENT C. STRIP_OUTER_ARRAY
D. STRIP_FIRST_ELEMENT
Correct Answer: B
Enabling the STRIP_OUTER_ELEMENT file format option for the COPY INTO <table> command removes the top-level [outer] element, so that the second-level elements are loaded into separate rows of the table. Hence, answer choice-B is correct.
| | Table stage | User stage | Named stage |
| --- | --- | --- | --- |
| Purpose | A table-specific stage. Automatically available for every Snowflake table. Can copy data into that specific table only. Can be used by any user. | A user-specific stage. Automatically available for every user. Can copy data into any table to which the user has access. One user cannot access another user’s stage. | Most flexible of the three. Can copy data into any table and can be used by any user having the privileges to access that stage. |
| Name | Has the same name as the table. | Has the same name as the user. | A first-class database object; can have any user-defined name. |
| Referenced as | @%<table_name> | @~ | @<stage_name> |
| Data transformations | Not supported | Not supported | Supported |
| Web UI | Cannot be seen in the Stages tab of the Web UI | Cannot be seen in the Stages tab of the Web UI | Can be seen in the Stages tab of the Web UI |
Further Reading: Choosing a Stage for Local Files — Snowflake Documentation, Copying Data from an Internal Stage — Snowflake Documentation
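The three stage references can be sketched as follows [file, table, and stage names are illustrative]:

```sql
-- Upload a local file to each kind of internal stage [PUT runs from SnowSQL]
put file:///tmp/data.csv @%my_table;   -- table stage
put file:///tmp/data.csv @~;           -- user stage
put file:///tmp/data.csv @my_stage;    -- named stage

-- Load from the named stage into a table
copy into my_table
from @my_stage/data.csv
file_format = (type = 'CSV' skip_header = 1);
```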
Domain : Data loading/unloading
Q37 : While loading data into Snowflake, which of the following transformations are supported? [select all that apply]
A. Column reordering
B. Column omission
C. Derived columns
D. Cast operation
E. Truncation
F. Transpose
Correct Answers: A, B, D and E
Explanation
Snowflake supports four types of simple transformations during data loading: reordering of columns, omission of columns, data type conversion [also called casting or typecasting], and truncating text strings that exceed the target column length. Derived columns, transpose, and other complex data transformations are not supported during data loading.
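A sketch of these transformations inside a COPY statement [the stage, table, and column names are illustrative]:

```sql
copy into target_table (id, amount)
from (
  select $2,                 -- reordering: ID comes from the second file column
         cast($1 as number)  -- casting; remaining file columns are omitted
  from @my_stage/data.csv
)
file_format = (type = 'CSV');
```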
Domain : Data loading/unloading
Q38 : It is NOT possible to query data stored in the cloud in Snowflake without first loading data into a Snowflake table.
A. True
B. False
Correct Answer: B
Explanation
The above statement is false. Snowflake supports direct querying of data stored in a cloud location.
Imagine a large organization that has invested time and energy to build a data lake on the Azure cloud [or AWS or GCP]. The data lake contains petabytes of data accumulated over many years across multiple channels. Now, the organization wants to query a portion of this data to derive useful insights without reloading the data into Snowflake. For these kinds of scenarios, Snowflake provides a feature called ‘External Tables’.
An external table in Snowflake references your data files in external cloud storage [e.g., a data lake]. External tables store file-level metadata about the data files, such as the filename, a version identifier, and related properties. When you query an external table, Snowflake executes that query against the data stored in external cloud storage, as if it were inside a database.
➤ Practical Info – When you query an external table, it may appear as if data is being fetched from the database. But in reality, the data is fetched directly from external cloud storage every time you run a query on an external table. Therefore the source of truth for the data is always the external cloud store.
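A sketch of an external table over a data lake [the URL, credentials, and object names are placeholders]:

```sql
-- An external stage pointing at the cloud storage location
create stage lake_stage
  url = 'azure://myaccount.blob.core.windows.net/datalake/'
  credentials = (azure_sas_token = '...');

-- An external table that maps file columns to typed columns
create external table ext_sales (
  sale_date date   as (value:c1::date),
  amount    number as (value:c2::number)
)
with location = @lake_stage/sales/
file_format = (type = 'CSV');

-- Queried like a normal table; the data stays in the data lake
select sale_date, sum(amount) from ext_sales group by sale_date;
```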
Further Reading: Overview of Data Loading — Snowflake Documentation
Domain : Snowflake Overview and Architecture
Q39 : From the statements given below, select the statements which are TRUE for reader accounts. [select all that apply]
A. A reader account facilitates data sharing with consumers that do not have a Snowflake account
B. One reader account can be used to access multiple shares provided by different provider accounts
C. A reader account consumes credit of the provider account
D. A reader account has the same access to Snowflake support as the provider account
Correct Answers: A and C
Explanation
Answer choice-A is correct. If the consumers of your share do not have a Snowflake account [e.g., a consumer using other database products such as Oracle or PostgreSQL], Snowflake provides a functionality called a ‘reader account’ which can be used to access the share. A reader account is a special Snowflake account created by the provider of the share for the sole purpose of giving a consumer access to the share. The provider fully manages the reader accounts.
Answer choice-B is incorrect. Each reader account belongs to the provider account that created it. This means that if you need to access multiple shares from different providers, you will need to have multiple reader accounts – one reader account per provider.
Answer choice-C is correct. A reader account will require a warehouse to execute queries on the data shared with it. As the reader account is not a licensed user of Snowflake, the warehouse needs to be provisioned by the provider account. Consequently, the provider will be charged for the credits consumed by the queries executed by the reader accounts it has created.
Answer choice-D is incorrect. The reader account does not have access to Snowflake support. The provider that creates and manages the reader account will be able to route questions to Snowflake support on behalf of the reader account.
➤ Practical Info: Technically, a reader account can consume an unlimited number of credits, which will be charged to the provider account. Therefore, to monitor and control the credit consumption by the reader accounts, the provider account should create resource monitors.
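Creating a reader account is a provider-side operation, sketched below [the account name and credentials are placeholders]:

```sql
create managed account reader_acct1
  admin_name = 'reader_admin',
  admin_password = 'Str0ngP@ss!',
  type = reader;

-- List the reader accounts created in this [provider] account
show managed accounts;
```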
Further Reading: Managing reader accounts
Domain : Snowflake Storage and Protection
Q40 : Match the appropriate data protection feature given on the right with the scenario given on the left. [Assume Business-critical edition of Snowflake]
| Scenario | Data protection feature |
| --- | --- |
| S1. The latest release rolled out today morning has corrupted data in the production environment. | 1. Fail-safe |
| S2. A data issue in the production database was discovered 3 days after time travel has expired. | 2. Triple redundancy |
| S3. Snowflake instance has crashed, but the cloud region is available. | 3. Time-travel |
| S4. One availability zone in the cloud region where Snowflake was hosted has failed. | 4. Database replication and failover/failback |
| S5. The entire cloud region where Snowflake was hosted has failed [for example, due to some natural calamity or network outage]. | 5. Automatic AZ failover |
A. S1 – 2, S2 – 3, S3 – 5, S4 – 1, S5 – 4
B. S1 – 3, S2 – 2, S3 – 1, S4 – 5, S5 – 4
C. S1 – 4, S2 – 1, S3 – 5, S4 – 4, S5 – 3
D. S1 – 3, S2 – 1, S3 – 2, S4 – 5, S5 – 4
Correct Answer: D
Explanation
S1 – The latest release rolled out today morning has corrupted data in the production environment.
As the data got corrupted only this morning, you have time to recover it yourself by using the Snowflake time-travel feature. Time-travel is available in every Snowflake edition. For the Standard edition, you have 1 day [24 hours] of default time travel. For the Enterprise edition and above, you get up to 90 days of time travel. Answer: Time-travel
S2 – A data issue in the production database was discovered 3 days after time travel has expired
As it is given in this scenario that the time-travel window has expired, you can rely on the fail-safe feature of Snowflake and retrieve the older version of your data. The fail-safe period is 7 days after the expiry of time travel. Answer: Fail-safe
S3 – Snowflake instance has crashed, but the cloud region is available.
This scenario indicates that the Snowflake instance has crashed, but the underlying cloud provider/region is up. Therefore this is a Snowflake-specific problem. Snowflake provides a triple redundancy feature that will get activated in such scenarios to mitigate the impact on client workloads. This is completely transparent to the end-users. Answer: Triple Redundancy
S4 – One availability zone in the cloud region where Snowflake was hosted has failed.
Snowflake provides failover protection across three availability zones of your cloud provider region. Your data is synchronously replicated across the three availability zones. Therefore when your primary availability zone fails, the workload is automatically and transparently moved to one of the other availability zones. Answer: Automatic AZ Fail-over
S5 – The entire cloud region where Snowflake was hosted has failed.
To ensure resiliency in this scenario, you will use the ‘Database replication and failover/failback’ feature available in the Business Critical edition. This feature allows you to create a secondary database in another cloud provider/region. The secondary database is automatically synced with the primary database. In case of a cloud region failure, you can promote your secondary database to serve as the primary database and continue to process data. Answer: Database replication
Further Reading: How Snowflake protects your data and services
Domain : Performance and Tuning
Q41 : Select the statements that apply to clustering depth in a micro-partition. [select all that apply]
A. Clustering depth is the average depth of overlapping micro-partitions for a specified set of columns
B. The lower the clustering depth, the better clustered the table is
C. The higher the clustering depth, the better clustered the table is
D. Is an indication of whether the table needs an explicit clustering key
Correct Answers: A, B and D
Explanation
Answer choice-A is correct. Clustering depth is the average depth of overlapping micro-partitions. Clustering depth indicates how well the data is dispersed across micro-partitions.
Answer choice-B is correct and Answer choice-C is incorrect. The ideal clustering depth is 1 and it indicates that no micro-partitions overlap for a given set of columns. Therefore, when a search query is executed, Snowflake can directly find the right micro-partition. Any number greater than 1 indicates that when a search query is executed, Snowflake will find several micro-partitions and it will need to process these micro-partitions to prepare the resultset. For an empty table, the clustering depth is 0.
Answer choice-D is correct. In general, Snowflake organizes your data in micro-partitions in such a way that clustering depth is low. However, over time, as insert/updates are performed on the data, the data may no longer stay clustered optimally and clustering depth becomes large. This is specifically true for very large tables. When this happens, the search query performance becomes slower over time.
To improve the clustering of the data in micro-partitions, Snowflake allows you to designate one or more table columns/expressions as an explicit clustering key for the table. A clustering key co-locates rows in the same micro-partitions thereby improving search efficiency. Hence, high clustering depth indicates that the queries on that table may benefit by defining an explicit clustering key.
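Clustering depth can be inspected with system functions, as sketched below [table and column names illustrative]:

```sql
-- Average overlap depth for the specified columns
select system$clustering_depth('sales', '(sale_date)');

-- Richer detail: partition counts and a histogram of overlap depths
select system$clustering_information('sales', '(sale_date)');
```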
Further Reading: Clustering depth illustration – Snowflake documentation
Domain : Snowflake Virtual Warehouses
Q42 : Select statement[s] which are TRUE from the given set of statements concerning Snowflake virtual warehouses. [select all that apply]
A. Scaling up a virtual warehouse is an automatic process
B. Scaling up a virtual warehouse is a manual process
C. Scaling down a virtual warehouse is an automatic process
D. Scaling down a virtual warehouse is a manual process
E. Scaling out a virtual warehouse is an automatic process
F. Scaling out a virtual warehouse is a manual process
G. Scaling back a virtual warehouse is an automatic process
H. Scaling back a virtual warehouse is a manual process
Correct Answers: B, D, E and G
Explanation
Scaling up or scaling down a virtual warehouse is also known as warehouse re-sizing. When you re-size a warehouse, you change the T-shirt size of the warehouse [e.g. from S to L or from XL to M]. This is a manual activity. You may want to increase the T-shirt size of your warehouse [i.e. scale up] if your query performance or data loading performance is slow. You will want to decrease the T-shirt size of your warehouse [i.e. scale down] if your query workload is low.
Scaling out or scaling back a virtual warehouse is adding or removing new warehouse clusters [without regard to the size of the warehouse]. This is an automatic process managed transparently by Snowflake. Snowflake will automatically increase the number of warehouse clusters as the number of queries increases to prevent queries from queuing. When Snowflake determines that additional clusters are no longer needed, it will shut them down. In summary, Snowflake ensures that a multi-cluster warehouse dynamically adapts to increase or decrease in the number of queries without any user intervention.
Summary:
| Scaling model | How? | For? |
| --- | --- | --- |
| Scale up/down | Manual | Query / data loading performance |
| Scale out/in | Automatic | Query concurrency |
Further Reading: Working with virtual warehouses in Snowflake – YouTube video
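Both scaling models can be seen in one warehouse definition, sketched below [names and values are illustrative]:

```sql
create warehouse reporting_wh with
  warehouse_size    = 'LARGE'   -- scale up/down: changed manually via ALTER
  min_cluster_count = 1
  max_cluster_count = 4         -- scale out/in: automatic, up to 4 clusters
  scaling_policy    = 'STANDARD';

-- Scaling up remains a manual statement:
alter warehouse reporting_wh set warehouse_size = 'XLARGE';
```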
Domain : Account and Security
Q43 : In Snowflake, for each securable object, there is a set of privileges that provide fine-grained access control on the object. Each securable object has an owner that can assign these privileges directly to a user or a group of users.
A. True
B. False
Correct Answer: B
The above statement is false. This is because the owner of a securable object CANNOT grant privileges over the object directly to a user. The privileges must be assigned to roles. The roles can be granted to other roles creating a role hierarchy OR the role can be granted to a user or a group of users.
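The required flow [object → role → user] can be sketched as follows [names illustrative]:

```sql
grant select on table d1.s1.t1 to role analyst;  -- privilege to a role
grant role analyst to role manager;              -- role to role [hierarchy]
grant role analyst to user alice;                -- role to a user
```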
Further Reading: Access Control Framework – Snowflake Documentation
Domain : Snowflake Overview and Architecture
Q44 : Snowflake being a SaaS software, which of the following activities of an on-prem/hosted data warehouse are not required in Snowflake from the user’s perspective? [select all that apply]
A. Hardware sizing, purchase, or configuration
B. Hardware scaling
C. User and access management
D. Software installation, maintenance, or upgrades
E. Database Tuning
F. Site-level disaster recovery [due to loss of data center]
Correct Answers: A, B, D, E and F
Explanation
Snowflake is a true SaaS, cloud-native data warehouse product. As it runs entirely on public cloud infrastructure, it does not require end users to purchase, configure, or manage any on-prem hardware [Answer choice-A]. Practically unlimited scalability available on tap means end users will not need to carry out costly and effort-intensive hardware scaling themselves [Answer choice-B]. All software maintenance, bug fixes, feature updates, etc. are rolled out entirely by Snowflake without any end-user intervention [Answer choice-D]. Database tuning is no longer a task for the DBA, as Snowflake does not support the creation of indexes or database partitions [Answer choice-E]. Your data is automatically replicated across three availability zones in the cloud region without any user intervention [Answer choice-F]. Out of the given answer choices, only ‘User and access management’ is a user responsibility.
Further Reading: Snowflake data warehouse pros and cons – Snowflake community blog
Domain : Snowflake Storage and Protection
Q45 : Zero-copy cloning operation is supported for the following objects in Snowflake. [select all that apply]
A. A temporary Table
B. A transient Table
C. An external table
D. A permanent Table
E. A database created from a Share
F. Internal [named] Stage
Correct Answers: B and D
Explanation
- Zero-copy cloning is available only for the permanent and transient table types. It is not available for temporary and external table types.
- You cannot clone a database that is created from a Share.
- Also, zero-copy cloning is not available for stage objects.
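For example, cloning a permanent or transient table is a metadata-only operation at creation time (the table names below are illustrative):

```sql
-- Clone a permanent table; no data is physically copied when the clone is created
CREATE TABLE sales_clone CLONE sales;

-- Cloning also works for transient tables
CREATE TRANSIENT TABLE staging_clone CLONE staging;
```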
Domain : Snowflake Virtual Warehouses
Q46 : A resource monitor can be created by…
A. Owner of the virtual warehouse
B. ACCOUNTADMIN role
C. SYSADMIN role
D. USERADMIN role
E. Any of the above
Correct Answer: B
Explanation
Resource monitors can only be created by account administrators [i.e. users with the ACCOUNTADMIN role]. However, account administrators can choose to grant MONITOR and MODIFY privileges on resource monitors to the users with other roles as needed.
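A sketch of creating and delegating a resource monitor follows (the monitor name, quota, and warehouse name are illustrative):

```sql
-- Must be run as an account administrator
USE ROLE ACCOUNTADMIN;

-- Create a monitor that notifies at 90% and suspends warehouses at 100% of quota
CREATE RESOURCE MONITOR monthly_limit
  WITH CREDIT_QUOTA = 100
  TRIGGERS ON 90 PERCENT DO NOTIFY
           ON 100 PERCENT DO SUSPEND;

-- Attach the monitor to a warehouse
ALTER WAREHOUSE compute_wh SET RESOURCE_MONITOR = monthly_limit;

-- Optionally delegate monitoring to another role
GRANT MONITOR ON RESOURCE MONITOR monthly_limit TO ROLE sysadmin;
```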
Further Reading: Access Control Privileges for Resource Monitors – Snowflake Documentation
Domain : Data loading/unloading
Q47 : You have the Business Critical edition of Snowflake on the AWS cloud. Your organization’s data lake is implemented in Azure and utilizes the Azure Data Lake Storage Gen2 service. Because your Snowflake account and your data lake are on different clouds, it is not possible to bulk load from the Azure data lake into Snowflake using the COPY command.
A. True
B. False
Correct Answer: B
Explanation
This statement is incorrect. Bulk loading using COPY INTO supports loading data into Snowflake from files on your local file system (staged first via an internal stage) or in cloud storage external to Snowflake [Amazon S3, Microsoft Azure, or Google Cloud Storage], irrespective of where your Snowflake account is hosted.
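A sketch of such a cross-cloud load follows (the stage name, container URL, and SAS token placeholder are illustrative):

```sql
-- External stage pointing at an Azure container; Snowflake account may be on AWS
CREATE STAGE azure_stage
  URL = 'azure://myaccount.blob.core.windows.net/mycontainer/path/'
  CREDENTIALS = (AZURE_SAS_TOKEN = '<sas_token>');

-- Bulk load from the Azure stage into a Snowflake table
COPY INTO my_table
  FROM @azure_stage
  FILE_FORMAT = (TYPE = 'PARQUET');
```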
Further Reading: Introduction to Data Loading – Youtube Video
Domain : Semi Structured data
Q48 : Consider a table vehicle_inventory that stores vehicle information of all vehicles in your dealership. The table has only one VARIANT column called vehicle_data which stores information in JSON format. The data is given below:
{
  "date_of_arrival": "2021-04-28",
  "supplier_name": "Hillside Honda",
  "contact_person": {
    "name": "Derek Larssen",
    "phone": "8423459854"
  },
  "vehicle": [
    {
      "make": "Honda",
      "model": "Civic",
      "variant": "GLX",
      "year": "2020"
    }
  ]
}
Which of the following are valid SQL queries that retrieve supplier_name? [select all that apply]
A. select vehicle_data.supplier_name::string from vehicle_inventory
B. select vehicle_data:supplier_name::string from vehicle_inventory
C. select vehicle_data[supplier_name[string]] from vehicle_inventory
D. select vehicle_data[supplier_name']::string from vehicle_inventory
Correct Answer: B
Explanation
In this example, supplier_name is a first-level element in the JSON. To access it in SQL, use the colon notation with the following pattern:
<column_name>:<first_level_element>::<target_data_type>
Therefore, the correct SQL statement is:
select vehicle_data:supplier_name::string from vehicle_inventory
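Deeper elements of the sample JSON above can be reached by chaining dot notation and array indexes after the first colon, as sketched below:

```sql
-- First-level element (colon notation)
SELECT vehicle_data:supplier_name::string FROM vehicle_inventory;

-- Nested object element via dot notation
SELECT vehicle_data:contact_person.name::string FROM vehicle_inventory;

-- Array element via index
SELECT vehicle_data:vehicle[0].model::string FROM vehicle_inventory;
```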
Further Reading: Querying Semi-Structured Data – Snowflake Documentation
Domain : Snowflake Virtual Warehouses
Q49 : Which of the following statements are true about multi-cluster Warehouses in Snowflake? [Select all that apply].
A. Multi-cluster warehouses support all properties of a single-cluster warehouse
B. Increasing the number of compute resources in a multi-cluster warehouse is an example of scaling up
C. A single cluster warehouse can be reconfigured to turn into a multi-cluster warehouse but vice-versa is not true
D. None of the above
Correct Answer: A
Explanation
Answer choice-A is correct. Multi-cluster warehouses support all the same properties and actions as single warehouses, including:
- Specifying warehouse size [e.g. XS, S, M, L, ...]
- Resizing a warehouse at any time.
- Auto-suspending a running warehouse due to inactivity.
- Auto-resuming a suspended warehouse when new queries are submitted.
Answer choice-B is incorrect. Increasing the number of compute resources in a multi-cluster warehouse is an example of scaling out. Increasing the size of compute clusters is an example of scaling up.
Answer choice-C is incorrect. A single cluster warehouse can be reconfigured to turn into a multi-cluster warehouse. To do this, you will have to set the max_cluster_count to a value greater than 1 as shown in the example below:
ALTER WAREHOUSE "COMPUTE_WH" SET MAX_CLUSTER_COUNT = 3;
Similarly, a multi-cluster warehouse can be reconfigured to turn into a single cluster warehouse. To do this, you will have to set the max_cluster_count to a value equal to 1 as shown in the example below:
ALTER WAREHOUSE "COMPUTE_WH" SET MAX_CLUSTER_COUNT = 1;
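To make the scaling-out vs. scaling-up distinction concrete, the two operations can be sketched side by side (the warehouse name is taken from the example above; the size value is illustrative):

```sql
-- Scaling out: allow the multi-cluster warehouse to run up to 3 clusters
ALTER WAREHOUSE "COMPUTE_WH" SET MIN_CLUSTER_COUNT = 1, MAX_CLUSTER_COUNT = 3;

-- Scaling up: increase the size of each cluster in the warehouse
ALTER WAREHOUSE "COMPUTE_WH" SET WAREHOUSE_SIZE = 'LARGE';
```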
Domain : Snowflake Overview and Architecture
Q50 : Using materialized views can be beneficial in the following scenarios. [select all that apply]
A. When you have a complex query that is frequently used
B. When the underlying data changes frequently
C. When the query results are smaller than the base table
D. When the query includes an external table
Correct Answers: A, C and D
Explanation
Answer choice-A is correct. Encapsulating a frequently used, complex query that normally takes a long time to execute into a materialized view can improve performance. This is because the query is executed automatically in the background and the computed result set is stored in the materialized view. Querying the view, then, becomes significantly faster compared to re-running the query.
Answer choice-B is incorrect. A background service in Snowflake refreshes the materialized view after changes are made to the base table. This service consumes credits. If the underlying data changes frequently, the background service has to be executed repeatedly which leads to higher credit consumption. Hence, it is not recommended to create materialized views on a base table that changes frequently.
Answer choice-C is correct. When the query results are smaller than the base table, the background service that refreshes the materialized view needs to compute results for a small set of records. This results in lower consumption of credits to keep the view up to date.
Answer choice-D is correct. Querying an external table is likely to be slower than querying native database tables as the data in an external table is stored in an external stage in the cloud [outside of Snowflake]. Creating a materialized view brings the resultset locally within your Snowflake account thereby improving query performance.
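A sketch of such a materialized view over an external table follows (the view, table, and column names are illustrative):

```sql
-- Pre-aggregates a hypothetical external table so queries hit local, smaller results
CREATE MATERIALIZED VIEW daily_sales_mv AS
  SELECT sale_date, SUM(amount) AS total_amount
  FROM ext_sales
  GROUP BY sale_date;
```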