SnowPro Advanced: Architect Recertification Exam Questions and Answers
What transformations are supported in the below SQL statement? (Select THREE).
CREATE PIPE ... AS COPY ... FROM (...)
Options:
A.Data can be filtered by an optional WHERE clause.
B.Columns can be reordered.
C.Columns can be omitted.
D.Type casts are supported.
E.Incoming data can be joined with other tables.
F.The ON_ERROR = ABORT_STATEMENT command can be used.
Answer:
A, B, C
Explanation:
- The SQL statement is a command for creating a pipe in Snowflake, which is an object that defines the COPY INTO
statement used by Snowpipe to load data from an ingestion queue into tables1. The statement uses a subquery in the FROM clause to transform the data from the staged files before loading it into the table2.
- The transformations supported in the subquery are as follows2:
Filtering data with a WHERE clause:
create pipe mypipe as
copy into mytable
from (
select * from @mystage
where col1 = 'A' and col2 > 10
);
Reordering columns:
create pipe mypipe as
copy into mytable (col1, col2, col3)
from (
select col3, col1, col2 from @mystage
);
Omitting columns:
create pipe mypipe as
copy into mytable (col1, col2)
from (
select col1, col2 from @mystage
);
- The other options are not supported in the subquery; for example, the following statements are not allowed2:
A type cast:
create pipe mypipe as
copy into mytable (col1, col2)
from (
select col1::date, col2 from @mystage
);
A join with another table:
create pipe mypipe as
copy into mytable (col1, col2, col3)
from (
select s.col1, s.col2, t.col3 from @mystage s
join othertable t on s.col1 = t.col1
);
An ON ERROR clause:
create pipe mypipe as
copy into mytable
from (
select * from @mystage
on error abort
);
References:
- 1: CREATE PIPE | Snowflake Documentation
- 2: Transforming Data During a Load | Snowflake Documentation
Question 2
An Architect has a design where files arrive every 10 minutes and are loaded into a primary database table using Snowpipe. A secondary database is refreshed every hour with the latest data from the primary database.
Based on this scenario, what Time Travel query options are available on the secondary database?
Options:
A.A query using Time Travel in the secondary database is available for every hourly table version within the retention window.
B.A query using Time Travel in the secondary database is available for every hourly table version within and outside the retention window.
C.Using Time Travel, secondary database users can query every iterative version within each hour (the individual Snowpipe loads) in the retention window.
D.Using Time Travel, secondary database users can query every iterative version within each hour (the individual Snowpipe loads) and outside the retention window.
Answer:
A
Explanation:Snowflake’s Time Travel feature allows users to query historical data within a defined retention period. In the given scenario, since the secondary database is refreshed every hour, Time Travel can be used to query each hourly version of the table as long as it falls within the retention window. This does not include individual Snowpipe loads within each hour unless they coincide with the hourly refresh.
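As an illustration (the database, schema, and table names below are placeholders), a Time Travel query against the hourly-refreshed secondary database might look like the following, provided the targeted point in time still falls within the retention window:
-- Hypothetical example: read the table as it existed two hours ago (offset in seconds).
SELECT *
FROM secondary_db.public.orders
AT (OFFSET => -7200);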
References: The answer is verified using Snowflake’s official documentation, which provides detailed information on Time Travel and its usage within the retention period123.
Question 3
A company needs to share its product catalog data with one of its partners. The product catalog data is stored in two database tables: product_category and product_details. Both tables can be joined by the product_id column. Data access should be governed, and only the partner should have access to the records.
The partner is not a Snowflake customer. The partner uses Amazon S3 for cloud storage.
Which design will be the MOST cost-effective and secure, while using the required Snowflake features?
Options:
A.Use Secure Data Sharing with an S3 bucket as a destination.
B.Publish product_category and product_details data sets on the Snowflake Marketplace.
C.Create a database user for the partner and give them access to the required data sets.
D.Create a reader account for the partner and share the data sets as secure views.
Answer:
D
Explanation:A reader account is a type of Snowflake account that allows external users to access data shared by a provider account without being a Snowflake customer. A reader account can be created and managed by the provider account, and can use the Snowflake web interface or JDBC/ODBC drivers to query the shared data. A reader account is billed to the provider account based on the credits consumed by the queries1. A secure view is a type of view that applies row-level security filters to the underlying tables, and masks the data that is not accessible to the user. A secure view can be shared with a reader account to provide granular and governed access to the data2. In this scenario, creating a reader account for the partner and sharing the data sets as secure views would be the most cost-effective and secure design, while using the required Snowflake features, because:
- It would avoid the data transfer and storage costs of using an S3 bucket as a destination, and the potential security risks of exposing the data to unauthorized access or modification.
- It would avoid the complexity and overhead of publishing the data sets on the Snowflake Marketplace, and the potential loss of control over the data ownership and pricing.
- It would avoid the need to create a database user for the partner and grant them access to the required data sets, which would require the partner to have a Snowflake account and consume the provider’s resources.
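The following is a minimal, hypothetical sketch of option D; every object name, the password placeholder, the column list, and the partner account identifier are assumptions rather than values from the question.
-- Provider side: expose only the governed rows through a secure view, then share it.
CREATE SECURE VIEW catalog_db.public.partner_catalog_vw AS
  SELECT c.product_id, c.category_name, d.product_name, d.list_price
  FROM catalog_db.public.product_category c
  JOIN catalog_db.public.product_details d ON c.product_id = d.product_id;

CREATE SHARE product_catalog_share;
GRANT USAGE ON DATABASE catalog_db TO SHARE product_catalog_share;
GRANT USAGE ON SCHEMA catalog_db.public TO SHARE product_catalog_share;
GRANT SELECT ON VIEW catalog_db.public.partner_catalog_vw TO SHARE product_catalog_share;

-- Create a reader (managed) account for the partner and attach it to the share.
CREATE MANAGED ACCOUNT partner_reader
  ADMIN_NAME = partner_admin,
  ADMIN_PASSWORD = '<strong password>',
  TYPE = READER;
ALTER SHARE product_catalog_share ADD ACCOUNTS = myorg.partner_reader;  -- placeholder account identifier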
References:
- Reader Accounts
- Secure Views
Question 4
Which Snowflake data modeling approach is designed for BI queries?
Options:
A.3 NF
B.Star schema
C.Data Vault
D.Snowflake schema
Answer:
B
Explanation:In the context of business intelligence (BI) queries, which are typically focused on data analysis and reporting, the star schema is the most suitable data modeling approach.
Option B: Star Schema - The star schema is a type of relational database schema that is widely used for developing data warehouses and data marts for BI purposes. It consists of a central fact table surrounded by dimension tables. The fact table contains the core data metrics, and the dimension tables contain descriptive attributes related to the fact data. The simplicity of the star schema allows for efficient querying and aggregation, which are common operations in BI reporting.
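As a toy illustration of the star schema described above (all table and column names are invented for this sketch), a central fact table joined to dimension tables supports the aggregations typical of BI reporting:
-- Dimension tables hold descriptive attributes.
CREATE TABLE dim_date    (date_key INT, calendar_date DATE, fiscal_quarter STRING);
CREATE TABLE dim_product (product_key INT, product_name STRING, category STRING);

-- The fact table holds measures plus foreign keys to the dimensions.
CREATE TABLE fact_sales (
  date_key      INT,
  product_key   INT,
  quantity_sold NUMBER,
  sales_amount  NUMBER(18,2)
);

-- A typical BI query: aggregate a measure across dimension attributes.
SELECT d.fiscal_quarter, p.category, SUM(f.sales_amount) AS total_sales
FROM fact_sales f
JOIN dim_date d    ON f.date_key = d.date_key
JOIN dim_product p ON f.product_key = p.product_key
GROUP BY d.fiscal_quarter, p.category;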
Question 5
A Snowflake Architect is designing an application and tenancy strategy for an organization where strong legal isolation rules as well as multi-tenancy are requirements.
Which approach will meet these requirements if Role-Based Access Control (RBAC) is a viable option for isolating tenants?
Options:
A.Create accounts for each tenant in the Snowflake organization.
B.Create an object for each tenant strategy if row level security is viable for isolating tenants.
C.Create an object for each tenant strategy if row level security is not viable for isolating tenants.
D.Create a multi-tenant table strategy if row level security is not viable for isolating tenants.
Answer:
A
Explanation:In a scenario where strong legal isolation is required alongside the need for multi-tenancy, the most effective approach is to create separate accounts for each tenant within the Snowflake organization. This approach ensures complete isolation of data, resources, and management, adhering to strict legal and compliance requirements. Role-Based Access Control (RBAC) further enhances security by allowing granular control over who can access what resources within each account. This solution leverages Snowflake’s capabilities for managing multiple accounts under a single organization umbrella, ensuring that each tenant's data and operations are isolated from others.References: Snowflake documentation on multi-tenancy and account management, part of the SnowPro Advanced: Architect learning path.
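A minimal sketch of the per-tenant account approach, assuming the ORGADMIN role is available; the account name, administrator credentials, email, and edition are placeholders:
-- Run by a user with the ORGADMIN role: one isolated account per tenant.
USE ROLE ORGADMIN;

CREATE ACCOUNT tenant_a_account
  ADMIN_NAME = tenant_a_admin
  ADMIN_PASSWORD = '<strong password>'
  EMAIL = 'admin@tenant-a.example.com'
  EDITION = BUSINESS_CRITICAL;

-- Each tenant account then manages its own RBAC hierarchy independently.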
Question 6
A company has a source system that provides JSON records for various IoT operations. The JSON is loaded directly into a persistent table with a variant field. The data is quickly growing to hundreds of millions of records and performance is becoming an issue. There is a generic access pattern that is used to filter on the create_date key within the variant field.
What can be done to improve performance?
Options:
A.Alter the target table to include additional fields pulled from the JSON records. This would include a create_date field with a datatype of timestamp. When this field is used in the filter, partition pruning will occur.
B.Alter the target table to include additional fields pulled from the JSON records. This would include a create_date field with a datatype of varchar. When this field is used in the filter, partition pruning will occur.
C.Validate the size of the warehouse being used. If the record count is approaching 100s of millions, size XL will be the minimum size required to process this amount of data.
D.Incorporate the use of multiple tables partitioned by date ranges. When a user or process needs to query a particular date range, ensure the appropriate base table is used.
Answer:
A
Explanation:
- The correct answer is A because it improves the performance of queries by reducing the amount of data scanned and processed. By adding a create_date field with a timestamp data type, Snowflake can automatically cluster the table based on this field and prune the micro-partitions that do not match the filter condition. This avoids the need to parse the JSON data and access the variant field for every record. (A sketch of this approach follows the references below.)
- Option B is incorrect because it does not improve the performance of queries. By adding a create_date field with a varchar data type, Snowflake cannot automatically cluster the table based on this field and prune the micro-partitions that do not match the filter condition. This still requires parsing the JSON data and accessing the variant field for every record.
- Option C is incorrect because it does not address the root cause of the performance issue. By validating the size of the warehouse being used, Snowflake can adjust the compute resources to match the data volume and parallelize the query execution. However, this does not reduce the amount of data scanned and processed, which is the main bottleneck for queries on JSON data.
- Option D is incorrect because it adds unnecessary complexity and overhead to the data loading and querying process. By incorporating the use of multiple tables partitioned by date ranges, Snowflake can reduce the amount of data scanned and processed for queries that specify a date range. However, this requires creating and maintaining multiple tables, loading data into the appropriate table based on the date, and joining the tables for queries that span multiple date ranges. References:
- Snowflake Documentation: Loading Data Using Snowpipe: This document explains how to use Snowpipe to continuously load data from external sources into Snowflake tables. It also describes the syntax and usage of the COPY INTO command, which supports various options and parameters to control the loading behavior, such as ON_ERROR, PURGE, and SKIP_FILE.
- Snowflake Documentation: Date and Time Data Types and Functions: This document explains the different data types and functions for working with date and time values in Snowflake. It also describes how to set and change the session timezone and the system timezone.
- Snowflake Documentation: Querying Metadata: This document explains how to query the metadata of the objects and operations in Snowflake using various functions, views, and tables. It also describes how to access the copy history information using the COPY_HISTORY function or the COPY_HISTORY view.
- Snowflake Documentation: Loading JSON Data: This document explains how to load JSON data into Snowflake tables using various methods, such as the COPY INTO command, the INSERT command, or the PUT command. It also describes how to access and query JSON data using the dot notation, the FLATTEN function, or the LATERAL join.
- Snowflake Documentation: Optimizing Storage for Performance: This document explains how to optimize the storage of data in Snowflake tables to improve the performance of queries. It also describes the concepts and benefits of automatic clustering, search optimization service, and materialized views.
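The sketch below illustrates option A from the explanation above; the table name, variant column, and JSON key are assumptions, and the backfill-and-cluster approach is only one possible implementation.
-- Materialize the JSON key as a typed column (hypothetical names).
ALTER TABLE iot_events ADD COLUMN create_date TIMESTAMP_NTZ;

UPDATE iot_events
SET create_date = payload:create_date::TIMESTAMP_NTZ;

-- Cluster on the typed column so filters on it can prune micro-partitions.
ALTER TABLE iot_events CLUSTER BY (create_date);

-- Queries can now filter on the native column instead of parsing the variant.
SELECT COUNT(*)
FROM iot_events
WHERE create_date >= '2024-01-01';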
Question 7
There are two databases in an account, named fin_db and hr_db, which contain payroll and employee data, respectively. Accountants and Analysts in the company require different permissions on the objects in these databases to perform their jobs. Accountants need read-write access to fin_db but only require read-only access to hr_db because the database is maintained by human resources personnel.
An Architect needs to create a read-only role for certain employees working in the human resources department.
Which permission sets must be granted to this role?
Options:
A.USAGE on database hr_db, USAGE on all schemas in database hr_db, SELECT on all tables in database hr_db
B.USAGE on database hr_db, SELECT on all schemas in database hr_db, SELECT on all tables in database hr_db
C.MODIFY on database hr_db, USAGE on all schemas in database hr_db, USAGE on all tables in database hr_db
D.USAGE on database hr_db, USAGE on all schemas in database hr_db, REFERENCES on all tables in database hr_db
Answer:
A
Explanation:
- To create a read-only role for certain employees working in the human resources department, the role needs to have the following permissions on the hr_db database: USAGE on the database, USAGE on all schemas in the database, and SELECT on all tables in the database. (A grant sketch follows this list.)
- Option A is the correct answer because it grants the minimum permissions required for a read-only role on the hr_db database.
- Option B is incorrect because SELECT on schemas is not a valid permission. Schemas only support USAGE and CREATE permissions.
- Option C is incorrect because MODIFY on the database is not a valid permission. Databases only support USAGE, CREATE, MONITOR, and OWNERSHIP permissions. Moreover, USAGE on tables is not sufficient for querying the data. Tables support SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, and OWNERSHIP permissions.
- Option D is incorrect because REFERENCES on tables is not relevant for querying the data. REFERENCES permission allows the role to create foreign key constraints on the tables.
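One possible set of grants for option A; the role name hr_read_only is an assumption, and the FUTURE grants are an optional extra for objects created later:
CREATE ROLE hr_read_only;

GRANT USAGE ON DATABASE hr_db TO ROLE hr_read_only;
GRANT USAGE ON ALL SCHEMAS IN DATABASE hr_db TO ROLE hr_read_only;
GRANT SELECT ON ALL TABLES IN DATABASE hr_db TO ROLE hr_read_only;

-- Optional: also cover schemas and tables created in the future.
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE hr_db TO ROLE hr_read_only;
GRANT SELECT ON FUTURE TABLES IN DATABASE hr_db TO ROLE hr_read_only;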
References:
- : https://docs.snowflake.com/en/user-guide/security-access-control-privileges.html#database-privileges
- : https://docs.snowflake.com/en/user-guide/security-access-control-privileges.html#schema-privileges
- : https://docs.snowflake.com/en/user-guide/security-access-control-privileges.html#table-privileges
Question 8
The diagram shows the process flow for Snowpipe auto-ingest with Amazon Simple Notification Service (SNS) with the following steps:
Step 1: Data files are loaded in a stage.
Step 2: An Amazon S3 event notification, published by SNS, informs Snowpipe, by way of Amazon Simple Queue Service (SQS), that files are ready to load. Snowpipe copies the files into a queue.
Step 3: A Snowflake-provided virtual warehouse loads data from the queued files into the target table based on parameters defined in the specified pipe.
If an AWS Administrator accidentally deletes the SQS subscription to the SNS topic in Step 2, what will happen to the pipe that references the topic to receive event messages from Amazon S3?
Options:
A.The pipe will continue to receive the messages as Snowflake will automatically restore the subscription to the same SNS topic and will recreate the pipe by specifying the same SNS topic name in the pipe definition.
B.The pipe will no longer be able to receive the messages and the user must wait for 24 hours from the time when the SNS topic subscription was deleted. Pipe recreation is not required as the pipe will reuse the same subscription to the existing SNS topic after 24 hours.
C.The pipe will continue to receive the messages as Snowflake will automatically restore the subscription by creating a new SNS topic. Snowflake will then recreate the pipe by specifying the new SNS topic name in the pipe definition.
D.The pipe will no longer be able to receive the messages. To restore the system immediately, the user needs to manually create a new SNS topic with a different name and then recreate the pipe by specifying the new SNS topic name in the pipe definition.
Answer:
D
Explanation:If an AWS Administrator accidentally deletes the SQS subscription to the SNS topic in Step 2, the pipe that references the topic to receive event messages from Amazon S3 will no longer be able to receive the messages. This is because the SQS subscription is the link between the SNS topic and the Snowpipe notification channel. Without the subscription, the SNS topic will not be able to send notifications to the Snowpipe queue, and the pipe will not be triggered to load the new files. To restore the system immediately, the user needs to manually create a new SNS topic with a different name and then recreate the pipe by specifying the new SNS topic name in the pipe definition. This will create a new notification channel and a new SQS subscription for the pipe. Alternatively, the user can also recreate the SQS subscription to the existing SNS topic and then alter the pipe to use the same SNS topic name in the pipe definition. This will also restore the notification channel and the pipe functionality. References:
- Automating Snowpipe for Amazon S3
- Enabling Snowpipe Error Notifications for Amazon SNS
- HowTo: Configuration steps for Snowpipe Auto-Ingest with AWS S3 Stages
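A hedged sketch of the recovery step described in option D above; the pipe, stage, and table names and the SNS topic ARN are placeholders:
-- Recreate the pipe so that it subscribes to the newly created SNS topic.
CREATE OR REPLACE PIPE mydb.myschema.mypipe
  AUTO_INGEST = TRUE
  AWS_SNS_TOPIC = 'arn:aws:sns:us-east-1:123456789012:new_snowpipe_topic'
AS
  COPY INTO mydb.myschema.mytable
  FROM @mydb.myschema.mystage;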
Question 9
A user can change object parameters using which of the following roles?
Options:
A.ACCOUNTADMIN, SECURITYADMIN
B.SYSADMIN, SECURITYADMIN
C.ACCOUNTADMIN, USER with PRIVILEGE
D.SECURITYADMIN, USER with PRIVILEGE
Answer:
C
Explanation:According to the Snowflake documentation, object parameters are parameters that can be set on individual objects such as databases, schemas, tables, and stages. Object parameters can be set by users with the appropriate privileges on the objects. For example, to set the object parameter AUTO_REFRESH on a table, the user must have the MODIFY privilege on the table. The ACCOUNTADMIN role has the highest level of privileges on all objects in the account, so it can set any object parameter on any object. However, other roles, such as SECURITYADMIN or SYSADMIN, do not have the same level of privileges on all objects, so they cannot set object parameters on objects they do not own or have the required privileges on. Therefore, the correct answer is C. ACCOUNTADMIN, USER with PRIVILEGE.
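For example, a user whose role has sufficient privileges on a table can set an object parameter on that table; the table name below is a placeholder and DATA_RETENTION_TIME_IN_DAYS is just one object-level parameter:
-- Requires adequate privileges on the table (e.g., ownership), not necessarily ACCOUNTADMIN.
ALTER TABLE sales.public.orders SET DATA_RETENTION_TIME_IN_DAYS = 7;

-- Verify the effective value and the level at which it was set.
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE sales.public.orders;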
References:
- Parameters | Snowflake Documentation
- Object Parameters | Snowflake Documentation
- Object Privileges | Snowflake Documentation
Question 10
How do Snowflake databases that are created from shares differ from standard databases that are not created from shares? (Choose three.)
Options:
A.Shared databases are read-only.
B.Shared databases must be refreshed in order for new data to be visible.
C.Shared databases cannot be cloned.
D.Shared databases are not supported by Time Travel.
E.Shared databases will have the PUBLIC or INFORMATION_SCHEMA schemas without explicitly granting these schemas to the share.
F.Shared databases can also be created as transient databases.
Answer:
A, C, D
Explanation:According to the SnowPro Advanced: Architect documents and learning resources, the ways that Snowflake databases that are created from shares differ from standard databases that are not created from shares are:
- Shared databases are read-only. This means that the data consumers who access the shared databases cannot modify or delete the data or the objects in the databases. The data providers who share the databases have full control over the data and the objects, and can grant or revoke privileges on them1.
- Shared databases cannot be cloned. This means that the data consumers who access the shared databases cannot create a copy of the databases or the objects in the databases. The data providers who share the databases can clone the databases or the objects, but the clones are not automatically shared2.
- Shared databases are not supported by Time Travel. This means that the data consumers who access the shared databases cannot use the AS OF clause to query historical data or restore deleted data. The data providers who share the databases can use Time Travel on the databases or the objects, but the historical data is not visible to the data consumers3.
The other options are incorrect because they are not ways that Snowflake databases that are created from shares differ from standard databases that are not created from shares. Option B is incorrect because shared databases do not need to be refreshed in order for new data to be visible. The data consumers who access the shared databases can see the latest data as soon as the data providers update the data1. Option E is incorrect because shared databases will not have the PUBLIC or INFORMATION_SCHEMA schemas without explicitly granting these schemas to the share. The data consumers who access the shared databases can only see the objects that the data providers grant to the share, and the PUBLIC and INFORMATION_SCHEMA schemas are not granted by default4. Option F is incorrect because shared databases cannot be created as transient databases. Transient databases are databases that do not support Time Travel or Fail-safe, and can be dropped without affecting the retention period of the data. Shared databases are always created as permanent databases, regardless of the type of the source database5. References: Introduction to Secure Data Sharing | Snowflake Documentation, Cloning Objects | Snowflake Documentation, Time Travel | Snowflake Documentation, Working with Shares | Snowflake Documentation, CREATE DATABASE | Snowflake Documentation
Question 11
A new table and streams are created with the following commands:
CREATE OR REPLACE TABLE LETTERS (ID INT, LETTER STRING) ;
CREATE OR REPLACE STREAM STREAM_1 ON TABLE LETTERS;
CREATE OR REPLACE STREAM STREAM_2 ON TABLE LETTERS APPEND_ONLY = TRUE;
The following operations are processed on the newly created table:
INSERT INTO LETTERS VALUES (1, 'A');
INSERT INTO LETTERS VALUES (2, 'B');
INSERT INTO LETTERS VALUES (3, 'C');
TRUNCATE TABLE LETTERS;
INSERT INTO LETTERS VALUES (4, 'D');
INSERT INTO LETTERS VALUES (5, 'E');
INSERT INTO LETTERS VALUES (6, 'F');
DELETE FROM LETTERS WHERE ID = 6;
What would be the output of the following SQL commands, in order?
SELECT COUNT (*) FROM STREAM_1;
SELECT COUNT (*) FROM STREAM_2;
Options:
A.2 & 6
B.2 & 3
C.4 & 3
D.4 & 6
Answer:
C
Explanation:In Snowflake, a stream records data manipulation language (DML) changes to its base table since the stream was created or last consumed. STREAM_1 will show all changes including the TRUNCATE operation, while STREAM_2, being APPEND_ONLY, will not show deletions like TRUNCATE. Therefore, STREAM_1 will count the three inserts, the TRUNCATE (counted as a single operation), and the subsequent two inserts before the delete, totaling 4. STREAM_2 will only count the three initial inserts and the two after the TRUNCATE, totaling 3, as it does not count the TRUNCATE or the delete operation.
References: The explanation is based on the Snowflake documentation on streams, which details how streams track changes and the difference between standard and APPEND_ONLY streams12.
Question 12
An Architect has designed a data pipeline that is receiving small CSV files from multiple sources. All of the files are landing in one location. Specific files are filtered for loading into Snowflake tables using the COPY command. The loading performance is poor.
What changes can be made to improve the data loading performance?
Options:
A.Increase the size of the virtual warehouse.
B.Create a multi-cluster warehouse and merge smaller files to create bigger files.
C.Create a specific storage landing bucket to avoid file scanning.
D.Change the file format from CSV to JSON.
Answer:
B
Explanation:According to the Snowflake documentation, the data loading performance can be improved by following some best practices and guidelines for preparing and staging the data files. One of the recommendations is to aim for data files that are roughly 100-250 MB (or larger) in size compressed, as this will optimize the number of parallel operations for a load. Smaller files should be aggregated and larger files should be split to achieve this size range. Another recommendation is to use a multi-cluster warehouse for loading, as this will allow for scaling up or out the compute resources depending on the load demand. A single-cluster warehouse may not be able to handle the load concurrency and throughput efficiently. Therefore, by creating a multi-cluster warehouse and merging smaller files to create bigger files, the data loading performance can be improved. References:
- Data Loading Considerations
- Preparing Your Data Files
- Planning a Data Load
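A sketch of the multi-cluster warehouse portion of option B above; the warehouse name and sizing values are assumptions, and the merging of small files would happen upstream before staging:
-- A multi-cluster warehouse can scale out under concurrent COPY activity.
CREATE OR REPLACE WAREHOUSE load_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;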
Question 13
An Architect needs to design a Snowflake account and database strategy to store and analyze large amounts of structured and semi-structured data. There are many business units and departments within the company. The requirements are scalability, security, and cost efficiency.
What design should be used?
Options:
A.Create a single Snowflake account and database for all data storage and analysis needs, regardless of data volume or complexity.
B.Set up separate Snowflake accounts and databases for each department or business unit, to ensure data isolation and security.
C.Use Snowflake's data lake functionality to store and analyze all data in a central location, without the need for structured schemas or indexes
D.Use a centralized Snowflake database for core business data, and use separate databases for departmental or project-specific data.
Answer:
D
Explanation:The best design to store and analyze large amounts of structured and semi-structured data for different business units and departments is to use a centralized Snowflake database for core business data, and use separate databases for departmental or project-specific data. This design allows for scalability, security, and cost efficiency by leveraging Snowflake’s features such as:
- Database cloning: Cloning a database creates a zero-copy clone that shares the same data files as the original database, but can be modified independently. This reduces storage costs and enables fast and consistent data replication for different purposes.
- Database sharing: Sharing a database allows granting secure and governed access to a subset of data in a database to other Snowflake accounts or consumers. This enables data collaboration and monetization across different business units or external partners.
- Warehouse scaling: Scaling a warehouse allows adjusting the size and concurrency of a warehouse to match the performance and cost requirements of different workloads. This enables optimal resource utilization and flexibility for different data analysis needs. References: Snowflake Documentation: Database Cloning, Snowflake Documentation: Database Sharing, [Snowflake Documentation: Warehouse Scaling]
Question 14
A company is using a Snowflake account in Azure. The account has SAML SSO set up using ADFS as a SCIM identity provider. To validate Private Link connectivity, an Architect performed the following steps:
* Confirmed Private Link URLs are working by logging in with a username/password account
* Verified DNS resolution by running nslookups against Private Link URLs
* Validated connectivity using SnowCD
* Disabled public access using a network policy set to use the company’s IP address range
However, the following error message is received when using SSO to log into the company account:
IP XX.XXX.XX.XX is not allowed to access snowflake. Contact your local security administrator.
What steps should the Architect take to resolve this error and ensure that the account is accessed using only Private Link? (Choose two.)
Options:
A.Alter the Azure security integration to use the Private Link URLs.
B.Add the IP address in the error message to the allowed list in the network policy.
C.Generate a new SCIM access token using system$generate_scim_access_token and save it to Azure AD.
D.Update the configuration of the Azure AD SSO to use the Private Link URLs.
E.Open a case with Snowflake Support to authorize the Private Link URLs’ access to the account.
Answer:
B, D
Explanation:The error message indicates that the IP address in the error message is not allowed to access Snowflake because it is not in the allowed list of the network policy. The network policy is a feature that allows restricting access to Snowflake based on IP addresses or ranges. To resolve this error, the Architect should take the following steps:
- Add the IP address in the error message to the allowed list in the network policy. This will allow the IP address to access Snowflake using the Private Link URLs. Alternatively, the Architect can disable the network policy if it is not required for security reasons.
- Update the configuration of the Azure AD SSO to use the Private Link URLs. This will ensure that the SSO authentication process uses the Private Link URLs instead of the public URLs. The configuration can be updated by following the steps in the Azure documentation1.
These two steps should resolve the error and ensure that the account is accessed using only Private Link. The other options are not necessary or relevant for this scenario. Altering the Azure security integration to use the Private Link URLs is not required because the security integration is used for SCIM provisioning, not for SSO authentication. Generating a new SCIM access token using system$generate_scim_access_token and saving it to Azure AD is not required because the SCIM access token is used for SCIM provisioning, not for SSO authentication. Opening a case with Snowflake Support to authorize the Private Link URLs’ access to the account is not required because the authorization can be done by the account administrator using the SYSTEM$AUTHORIZE_PRIVATELINK function2.
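A hedged sketch of the network policy adjustment from the first step; the policy name and IP ranges are placeholders for the address range reported in the error:
-- Add the blocked address range to the existing network policy's allowed list.
ALTER NETWORK POLICY corp_access_policy
  SET ALLOWED_IP_LIST = ('203.0.113.0/24', '198.51.100.27');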
Question 15
How can an Architect enable optimal clustering to enhance performance for different access paths on a given table?
Options:
A.Create multiple clustering keys for a table.
B.Create multiple materialized views with different cluster keys.
C.Create super projections that will automatically create clustering.
D.Create a clustering key that contains all columns used in the access paths.
Answer:
B
Explanation:According to the SnowPro Advanced: Architect documents and learning resources, the best way to enable optimal clustering to enhance performance for different access paths on a given table is to create multiple materialized views with different cluster keys. A materialized view is a pre-computed result set that is derived from a query on one or more base tables. A materialized view can be clustered by specifying a clustering key, which is a subset of columns or expressions that determines how the data in the materialized view is co-located in micro-partitions. By creating multiple materialized views with different cluster keys, an Architect can optimize the performance of queries that use different access paths on the same base table. For example, if a base table has columns A, B, C, and D, and there are queries that filter on A and B, or on C and D, or on A and C, the Architect can create three materialized views, each with a different cluster key: (A, B), (C, D), and (A, C). This way, each query can leverage the optimal clustering of the corresponding materialized view and achieve faster scan efficiency and better compression.
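A minimal sketch of option B under assumed table and column names; each materialized view is clustered for a different access path:
-- Base table with several commonly filtered columns (hypothetical).
CREATE TABLE events (a INT, b INT, c INT, d INT);

-- One materialized view per access path, each with its own clustering key.
CREATE MATERIALIZED VIEW events_by_ab CLUSTER BY (a, b) AS
  SELECT a, b, c, d FROM events;

CREATE MATERIALIZED VIEW events_by_cd CLUSTER BY (c, d) AS
  SELECT a, b, c, d FROM events;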
References:
- Snowflake Documentation: Materialized Views
- Snowflake Learning: Materialized Views
Question 16
A group of Data Analysts have been granted the ANALYST_ROLE role. They need a Snowflake database where they can create and modify tables, views, and other objects to load with their own data. The Analysts should not have the ability to give other Snowflake users outside of their role access to this data.
How should these requirements be met?
Options:
A.Grant ANALYST_ROLE OWNERSHIP on the database, but make sure that ANALYST_ROLE does not have the MANAGE GRANTS privilege on the account.
B.Grant SYSADMIN ownership of the database, but grant the create schema privilege on the database to the ANALYST_ROLE.
C.Make every schema in the database a managed access schema, owned by SYSADMIN, and grant create privileges on each schema to the ANALYST_ROLE for each type of object that needs to be created.
D.Grant ANALYST_ROLE OWNERSHIP on the database, but grant the OWNERSHIP on future [object type]s in the database privilege to SYSADMIN.
Answer:
C
Explanation:The requirements state that the data analysts need to be able to create and modify database objects and load data, but should not be able to manage access for users outside of their role.
Option C: By making each schema within the database a managed access schema and having them owned by SYSADMIN, the ability to grant privileges on the schema's objects is strictly controlled. Managed access schemas limit the granting of privileges to the role specified as the owner of the schema, in this case, SYSADMIN. The ANALYST_ROLE can be granted the privileges necessary to create and modify objects within these schemas, satisfying the requirement for the analysts to perform their tasks without being able to extend access beyond their role.
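One possible implementation of option C, with invented object names; each object type the Analysts need to create would get its own CREATE grant on the managed access schema:
-- Owned by SYSADMIN; managed access keeps all grant decisions with the schema owner.
USE ROLE SYSADMIN;
CREATE DATABASE analyst_db;
CREATE SCHEMA analyst_db.workspace WITH MANAGED ACCESS;

-- Let the analysts create and use objects without being able to re-grant access.
GRANT USAGE ON DATABASE analyst_db TO ROLE analyst_role;
GRANT USAGE ON SCHEMA analyst_db.workspace TO ROLE analyst_role;
GRANT CREATE TABLE, CREATE VIEW, CREATE STAGE ON SCHEMA analyst_db.workspace TO ROLE analyst_role;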
Question 17
How can the Snowpipe REST API be used to keep a log of data load history?
Options:
A.Call insertReport every 20 minutes, fetching the last 10,000 entries.
B.Call loadHistoryScan every minute for the maximum time range.
C.Call insertReport every 8 minutes for a 10-minute time range.
D.Call loadHistoryScan every 10 minutes for a 15-minute time range.
Answer:
D
Explanation:- Snowpipe is a service that automates and optimizes the loading of data from external stages into Snowflake tables. Snowpipe uses a queue to ingest files as they become available in the stage. Snowpipe also provides REST endpoints to load data and retrieve load history reports1.
- The loadHistoryScan endpoint returns the history of files that have been ingested by Snowpipe within a specified time range. The endpoint accepts parameters that define that time range: a required start time and an optional end time2.
- The loadHistoryScan endpoint can be used to keep a log of data load history by calling it periodically with a suitable time range. The best option among the choices is D, which is to call loadHistoryScan every 10 minutes for a 15-minute time range. This option ensures that the endpoint is called frequently enough to capture the latest files that have been ingested, and that the time range is wide enough to avoid missing any files that may have been delayed or retried by Snowpipe. The other options are either too infrequent, too narrow, or use the wrong endpoint3.
References:
- 1: Introduction to Snowpipe | Snowflake Documentation
- 2: loadHistoryScan | Snowflake Documentation
- 3: Monitoring Snowpipe Load History | Snowflake Documentation
Question 18
Files arrive in an external stage every 10 seconds from a proprietary system. The files range in size from 500 KB to 3 MB. The data must be accessible by dashboards as soon as it arrives.
How can a Snowflake Architect meet this requirement with the LEAST amount of coding? (Choose two.)
Options:
A.Use Snowpipe with auto-ingest.
B.Use a COPY command with a task.
C.Use a materialized view on an external table.
D.Use the COPY INTO command.
E.Use a combination of a task and a stream.
Answer:
A, E
Explanation:The requirement is for the data to be accessible as quickly as possible after it arrives in the external stage with minimal coding effort.
Option A: Snowpipe with auto-ingest is a service that continuously loads data as it arrives in the stage. With auto-ingest, Snowpipe automatically detects new files as they arrive in a cloud stage and loads the data into the specified Snowflake table with minimal delay and no intervention required. This is an ideal low-maintenance solution for the given scenario where files are arriving at a very high frequency.
Option E: Using a combination of a task and a stream allows for real-time change data capture in Snowflake. A stream records changes (inserts, updates, and deletes) made to a table, and a task can be scheduled to trigger on a very short interval, ensuring that changes are processed into the dashboard tables as they occur.
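A hedged sketch combining options A and E; the stage, file format, table names, warehouse, and schedule are all placeholders:
-- Option A: Snowpipe auto-ingest loads files as soon as cloud event notifications arrive.
CREATE PIPE raw.public.ingest_pipe AUTO_INGEST = TRUE AS
  COPY INTO raw.public.landing_events
  FROM @raw.public.ext_stage
  FILE_FORMAT = (TYPE = 'CSV');

-- Option E: a stream plus a short-interval task publishes new rows for the dashboards.
CREATE STREAM raw.public.landing_events_stream ON TABLE raw.public.landing_events;

CREATE TASK raw.public.publish_to_dashboard
  WAREHOUSE = transform_wh
  SCHEDULE = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('raw.public.landing_events_stream')
AS
  INSERT INTO analytics.public.dashboard_events (event_id, event_payload, event_ts)
  SELECT event_id, event_payload, event_ts
  FROM raw.public.landing_events_stream;

ALTER TASK raw.public.publish_to_dashboard RESUME;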
Question 19
Database DB1 has schema S1 which has one table, T1.
DB1 --> S1 --> T1
The retention period of DB1 is set to 10 days.
The retention period of S1 is set to 20 days.
The retention period of T1 is set to 30 days.
The user runs the following command:
Drop Database DB1;
What will the Time Travel retention period be for T1?
Options:
A.10 days
B.20 days
C.30 days
D.37 days
Answer:
C
Explanation:The Time Travel retention period for T1 will be 30 days, which is the retention period set at the table level. The Time Travel retention period determines how long the historical data is preserved and accessible for an object after it is modified or dropped. The Time Travel retention period can be set at the account level, the database level, the schema level, or the table level. The retention period set at the lowest level of the hierarchy takes precedence over the higher levels. Therefore, the retention period set at the table level overrides the retention periods set at the schema level, the database level, or the account level. When the user drops the database DB1, the table T1 is also dropped, but the historical data is still preserved for 30 days, which is the retention period set at the table level. The user can use the UNDROP command to restore the table T1 within the 30-day period. The other options are incorrect because:
- 10 days is the retention period set at the database level, which is overridden by the table level.
- 20 days is the retention period set at the schema level, which is also overridden by the table level.
- 37 days is not a valid option, as it is not the retention period set at any level.
References:
- Understanding & Using Time Travel
- AT | BEFORE
- Snowflake Time Travel & Fail-safe
Question 20
An Architect needs to allow a user to create a database from an inbound share.
To meet this requirement, the user’s role must have which privileges? (Choose two.)
Options:
A.IMPORT SHARE;
B.IMPORT PRIVILEGES;
C.CREATE DATABASE;
D.CREATE SHARE;
E.IMPORT DATABASE;
Answer:
C, E
Explanation:According to the Snowflake documentation, to create a database from an inbound share, the user’s role must have the following privileges:
- The CREATE DATABASE privilege on the current account. This privilege allows the user to create a new database in the account1.
- The IMPORT DATABASE privilege on the share. This privilege allows the user to import a database from the share into the account2. The other privileges listed are not relevant for this requirement. The IMPORT SHARE privilege is used to import a share into the account, not a database3. The IMPORT PRIVILEGES privilege is used to import the privileges granted on the shared objects, not the objects themselves2. The CREATE SHARE privilege is used to create a share to provide data to other accounts, not to consume data from other accounts4.
References:
- CREATE DATABASE | Snowflake Documentation
- Importing Data from a Share | Snowflake Documentation
- Importing a Share | Snowflake Documentation
- CREATE SHARE | Snowflake Documentation
Question 21When using the copy into
command with the CSV file format, how does the match_by_column_name parameter behave?
Options:
A.It expects a header to be present in the CSV file, which is matched to a case-sensitive table column name.
B.The parameter will be ignored.
C.The command will return an error.
D.The command will return a warning stating that the file has unmatched columns.
Answer:
B
Explanation:
- The COPY INTO <table> command is used to load data from staged files into an existing table in Snowflake. The command supports various file formats, such as CSV, JSON, AVRO, ORC, PARQUET, and XML1.
- The match_by_column_name parameter is a copy option that enables loading semi-structured data into separate columns in the target table that match corresponding columns represented in the source data. The parameter can have one of the following values2: CASE_SENSITIVE, CASE_INSENSITIVE, or NONE (the default).
- The match_by_column_name parameter only applies to semi-structured data, such as JSON, AVRO, ORC, PARQUET, and XML. It does not apply to CSV data, which is considered structured data2.
- When using the COPY INTO <table> command with the CSV file format, the MATCH_BY_COLUMN_NAME parameter is ignored2.
References:
- 1: COPY INTO <table> | Snowflake Documentation
- 2: MATCH_BY_COLUMN_NAME | Snowflake Documentation
Question 22
A Data Engineer is designing a near real-time ingestion pipeline for a retail company to ingest event logs into Snowflake to derive insights. A Snowflake Architect is asked to define security best practices to configure access control privileges for the data load for auto-ingest to Snowpipe.
What are the MINIMUM object privileges required for the Snowpipe user to execute Snowpipe?
Options:
A.OWNERSHIP on the named pipe, USAGE on the named stage, target database, and schema, and INSERT and SELECT on the target table
B.OWNERSHIP on the named pipe, USAGE and READ on the named stage, USAGE on the target database and schema, and INSERT and SELECT on the target table
C.CREATE on the named pipe, USAGE and READ on the named stage, USAGE on the target database and schema, and INSERT and SELECT on the target table
D.USAGE on the named pipe, named stage, target database, and schema, and INSERT and SELECT on the target table
Answer:
B
Explanation:According to the SnowPro Advanced: Architect documents and learning resources, the minimum object privileges required for the Snowpipe user to execute Snowpipe are:
- OWNERSHIP on the named pipe. This privilege allows the Snowpipe user to create, modify, and drop the pipe object that defines the COPY statement for loading data from the stage to the table1.
- USAGE and READ on the named stage. These privileges allow the Snowpipe user to access and read the data files from the stage that are loaded by Snowpipe2.
- USAGE on the target database and schema. These privileges allow the Snowpipe user to access the database and schema that contain the target table3.
- INSERT and SELECT on the target table. These privileges allow the Snowpipe user to insert data into the table and select data from the table4.
The other options are incorrect because they do not specify the minimum object privileges required for the Snowpipe user to execute Snowpipe. Option A is incorrect because it does not include the READ privilege on the named stage, which is required for the Snowpipe user to read the data files from the stage. Option C is incorrect because it does not include the OWNERSHIP privilege on the named pipe, which is required for the Snowpipe user to create, modify, and drop the pipe object. Option D is incorrect because it does not include the OWNERSHIP privilege on the named pipe or the READ privilege on the named stage, which are both required for the Snowpipe user to execute Snowpipe. References: CREATE PIPE | Snowflake Documentation, CREATE STAGE | Snowflake Documentation, CREATE DATABASE | Snowflake Documentation, CREATE TABLE | Snowflake Documentation
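A set of grants matching option B, using placeholder database, schema, stage, table, pipe, and role names:
CREATE ROLE snowpipe_loader;

GRANT USAGE ON DATABASE raw TO ROLE snowpipe_loader;
GRANT USAGE ON SCHEMA raw.events TO ROLE snowpipe_loader;

-- USAGE applies to an external stage; for an internal stage, READ would be granted instead.
GRANT USAGE ON STAGE raw.events.landing_stage TO ROLE snowpipe_loader;
-- GRANT READ ON STAGE raw.events.landing_stage TO ROLE snowpipe_loader;

GRANT INSERT, SELECT ON TABLE raw.events.event_log TO ROLE snowpipe_loader;
GRANT OWNERSHIP ON PIPE raw.events.event_pipe TO ROLE snowpipe_loader;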
Question 23
A company's Architect needs to find an efficient way to get data from an external partner, who is also a Snowflake user. The current solution is based on daily JSON extracts that are placed on an FTP server and uploaded to Snowflake manually. The files are changed several times each month, and the ingestion process needs to be adapted to accommodate these changes.
What would be the MOST efficient solution?
Options:
A.Ask the partner to create a share and add the company's account.
B.Ask the partner to use the data lake export feature and place the data into cloud storage where Snowflake can natively ingest it (schema-on-read).
C.Keep the current structure but request that the partner stop changing files, instead only appending new files.
D.Ask the partner to set up a Snowflake reader account and use that account to get the data for ingestion.
Answer:
A
Explanation:The most efficient solution is to ask the partner to create a share and add the company’s account (Option A). This way, the company can access the live data from the partner without any data movement or manual intervention. Snowflake’s secure data sharing feature allows data providers to share selected objects in a database with other Snowflake accounts. The shared data is read-only and does not incur any storage or compute costs for the data consumers. The data consumers can query the shared data directly or create local copies of the shared objects in their own databases. Option B is not efficient because it involves using the data lake export feature, which is intended for exporting data from Snowflake to an external data lake, not for importing data from another Snowflake account. The data lake export feature also requires the data provider to create an external stage on cloud storage and use the COPY INTO
command to export the data into parquet files. The data consumer then needs to create an external table or a file format to load the data from the cloud storage into Snowflake. This process can be complex and costly, especially if the data changes frequently. Option C is not efficient because it does not solve the problem of manual data ingestion and adaptation. Keeping the current structure of daily JSON extracts on an FTP server and requesting the partner to stop changing files, instead only appending new files, does not improve the efficiency or reliability of the data ingestion process. The company still needs to upload the data to Snowflake manually and deal with any schema changes or data quality issues. Option D is not efficient because it requires the partner to set up a Snowflake reader account and use that account to get the data for ingestion. A reader account is a special type of account that can only consume data from the provider account that created it. It is intended for data consumers who are not Snowflake customers and do not have a licensing agreement with Snowflake. A reader account is not suitable for data ingestion from another Snowflake account, as it does not allow uploading, modifying, or unloading data. The company would need to use external tools or interfaces to access the data from the reader account and load it into their own account, which can be slow and expensive. References: The answer can be verified from Snowflake’s official documentation on secure data sharing, data lake export, and reader accounts available on their website. Here are some relevant links: - Introduction to Secure Data Sharing | Snowflake Documentation
- Data Lake Export Public Preview Is Now Available on Snowflake | Snowflake Blog
- Managing Reader Accounts | Snowflake Documentation
Question 24
When loading data into a table that captures the load time in a column with a default value of either CURRENT_TIME() or CURRENT_TIMESTAMP(), what will occur?
Options:
A.All rows loaded using a specific COPY statement will have varying timestamps based on when the rows were inserted.
B.Any rows loaded using a specific COPY statement will have varying timestamps based on when the rows were read from the source.
C.Any rows loaded using a specific COPY statement will have varying timestamps based on when the rows were created in the source.
D.All rows loaded using a specific COPY statement will have the same timestamp value.
Answer:
D
Explanation:When using the COPY command to load data into Snowflake, if a column has a default value set to CURRENT_TIME() or CURRENT_TIMESTAMP(), all rows loaded by that specific COPY command will have the same timestamp. This is because the default value for the timestamp is evaluated at the start of the COPY operation, and that same value is applied to all rows loaded by that operation.
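A small illustration with invented names; every row loaded by the single COPY statement below receives the same load_ts value because the default is evaluated once for the statement:
CREATE TABLE sales_staging (
  order_id NUMBER,
  amount   NUMBER(12,2),
  load_ts  TIMESTAMP_LTZ DEFAULT CURRENT_TIMESTAMP()
);

-- load_ts is not listed, so the column default supplies one timestamp for the whole load.
COPY INTO sales_staging (order_id, amount)
FROM (SELECT $1, $2 FROM @sales_stage)
FILE_FORMAT = (TYPE = 'CSV');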
References: This behavior is consistent with Snowflake’s documentation on the CURRENT_TIMESTAMP function, which specifies that the timestamp is captured at the time the statement is executed1.
Question 25
A healthcare company is deploying a Snowflake account that may include Personal Health Information (PHI). The company must ensure compliance with all relevant privacy standards.
Which best practice recommendations will meet data protection and compliance requirements? (Choose three.)
Options:
A.Use, at minimum, the Business Critical edition of Snowflake.
B.Create Dynamic Data Masking policies and apply them to columns that contain PHI.
C.Use the Internal Tokenization feature to obfuscate sensitive data.
D.Use the External Tokenization feature to obfuscate sensitive data.
E.Rewrite SQL queries to eliminate projections of PHI data based on current_role().
F.Avoid sharing data with partner organizations.
Answer:
A, B, D
Explanation:- A healthcare company that handles PHI data must ensure compliance with relevant privacy standards, such as HIPAA, HITRUST, and GDPR. Snowflake provides several features and best practices to help customers meet their data protection and compliance requirements1.
- One best practice recommendation is to use, at minimum, the Business Critical edition of Snowflake. This edition provides the highest level of data protection and security, including end-to-end encryption with customer-managed keys, enhanced object-level security, and HIPAA and HITRUST compliance2. Therefore, option A is correct.
- Another best practice recommendation is to create Dynamic Data Masking policies and apply them to columns that contain PHI. Dynamic Data Masking is a feature that allows masking or redacting sensitive data based on the current user’s role. This way, only authorized users can view the unmasked data, while others will see masked values, such as NULL, asterisks, or random characters3. Therefore, option B is correct. (A minimal policy sketch follows this list.)
- A third best practice recommendation is to use the External Tokenization feature to obfuscate sensitive data. External Tokenization is a feature that allows replacing sensitive data with tokens that are generated and stored by an external service, such as Protegrity. This way, the original data is never stored or processed by Snowflake, and only authorized users can access the tokenized data through the external service4. Therefore, option D is correct.
- Option C is incorrect, because the Internal Tokenization feature is not available in Snowflake. Snowflake does not provide any native tokenization functionality, but only supports integration with external tokenization services4.
- Option E is incorrect, because rewriting SQL queries to eliminate projections of PHI data based on current_role() is not a best practice. This approach is error-prone, inefficient, and hard to maintain. A better alternative is to use Dynamic Data Masking policies, which can automatically mask data based on the user’s role without modifying the queries3.
- Option F is incorrect, because avoiding sharing data with partner organizations is not a best practice. Snowflake enables secure and governed data sharing with internal and external consumers, such as business units, customers, or partners. Data sharing does not involve copying or moving data, but only granting access privileges to the shared objects. Data sharing can also leverage Dynamic Data Masking and External Tokenization features to protect sensitive data5.
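A minimal sketch of the Dynamic Data Masking recommendation (option B); the policy logic, authorized role, table, and column names are assumptions:
-- Mask PHI for every role except an explicitly authorized one.
CREATE MASKING POLICY phi_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('PHI_FULL_ACCESS') THEN val
    ELSE '***MASKED***'
  END;

ALTER TABLE patients MODIFY COLUMN diagnosis SET MASKING POLICY phi_mask;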
References:
- Snowflake’s Security & Compliance Reports
- Snowflake Editions
- Dynamic Data Masking
- External Tokenization
- Secure Data Sharing
Question 26
Two queries are run on the customer_address table:
create or replace TABLE CUSTOMER_ADDRESS (
  CA_ADDRESS_SK NUMBER(38,0),
  CA_ADDRESS_ID VARCHAR(16),
  CA_STREET_NUMBER VARCHAR(10),
  CA_STREET_NAME VARCHAR(60),
  CA_STREET_TYPE VARCHAR(15),
  CA_SUITE_NUMBER VARCHAR(10),
  CA_CITY VARCHAR(60),
  CA_COUNTY VARCHAR(30),
  CA_STATE VARCHAR(2),
  CA_ZIP VARCHAR(10),
  CA_COUNTRY VARCHAR(20),
  CA_GMT_OFFSET NUMBER(5,2),
  CA_LOCATION_TYPE VARCHAR(20)
);
ALTER TABLE DEMO_DB.DEMO_SCH.CUSTOMER_ADDRESS ADD SEARCH OPTIMIZATION ON SUBSTRING(CA_ADDRESS_ID);
Which queries will benefit from the use of the search optimization service? (Select TWO).
Options:
A.select * from DEMO_DB.DEMO_SCH.CUSTOMER_ADDRESS Where substring(CA_ADDRESS_ID,1,8)= substring('AAAAAAAAPHPPLBAAASKDJHASLKDJHASKJD',1,8);
B.select * from DEMO_DB.DEMO_SCH.CUSTOMER_ADDRESS Where CA_ADDRESS_ID= substring('AAAAAAAAPHPPLBAAASKDJHASLKDJHASKJD',1,16);
C.select * from DEMO_DB.DEMO_SCH.CUSTOMER_ADDRESS Where CA_ADDRESS_ID LIKE '%BAAASKD%';
D.select * from DEMO_DB.DEMO_SCH.CUSTOMER_ADDRESS Where CA_ADDRESS_ID LIKE '%PHPP%';
E.select * from DEMO_DB.DEMO_SCH.CUSTOMER_ADDRESS Where CA_ADDRESS_ID NOT LIKE '%AAAAAAAAPHPPL%';
Answer:
A, B
Explanation:The use of the search optimization service in Snowflake is particularly effective when queries involve operations that match exact substrings or start from the beginning of a string. The ALTER TABLE command adding search optimization specifically for substrings on the CA_ADDRESS_ID field allows the service to create an optimized search path for queries using substring matches.
- Option A benefits because it directly matches a substring from the start of the CA_ADDRESS_ID, aligning with the optimization's capability to quickly locate records based on the beginning segments of strings.
- Option B also benefits, despite performing a full equality check, because it essentially compares the full length of CA_ADDRESS_ID to a substring, which can leverage the substring index for efficient retrieval.
- Options C, D, and E involve patterns that do not start from the beginning of the string or use negations, which are not optimized by the search optimization service configured for starting substring matches.
References: Snowflake's documentation on the use of search optimization for substring matching in SQL queries.
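To confirm that the substring-based search optimization configuration is active on the table, the following commands can be used (a sketch; the fully qualified table name comes from the question, the rest is standard syntax):
describe search optimization on DEMO_DB.DEMO_SCH.CUSTOMER_ADDRESS;
show tables like 'CUSTOMER_ADDRESS' in schema DEMO_DB.DEMO_SCH;
The DESCRIBE output lists the active search methods (here, SUBSTRING on CA_ADDRESS_ID), and the SHOW TABLES output includes the search_optimization and search_optimization_progress columns.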
Question 27Which data models can be used when modeling tables in a Snowflake environment? (Select THREE).
Options:
A.Graph model
B.Dimensional/Kimball
C.Data lake
D.Inmon/3NF
E.Bayesian hierarchical model
F.Data vault
Answer:
B, D, FExplanation:
Explanation:Snowflake is a cloud data platform that supports various data models for modeling tables in a Snowflake environment. The data models can be classified into two categories: dimensional and normalized. Dimensional data models are designed to optimize query performance and ease of use for business intelligence and analytics. Normalized data models are designed to reduce data redundancy and ensure data integrity for transactional and operational systems. The following are some of the data models that can be used in Snowflake:
- Dimensional/Kimball: This is a popular dimensional data model that uses a star or snowflake schema to organize data into fact and dimension tables. Fact tables store quantitative measures and foreign keys to dimension tables. Dimension tables store descriptive attributes and hierarchies. A star schema has a single denormalized dimension table for each dimension, while a snowflake schema has multiple normalized dimension tables for each dimension. Snowflake supports both star and snowflake schemas, and allows users to create views and joins to simplify queries.
- Inmon/3NF: This is a common normalized data model that uses a third normal form (3NF) schema to organize data into entities and relationships. 3NF schema eliminates data duplication and ensures data consistency by applying three rules: 1) every column in a table must depend on the primary key, 2) every column in a table must depend on the whole primary key, not a part of it, and 3) every column in a table must depend only on the primary key, not on other columns. Snowflake supports 3NF schema and allows users to create referential integrity constraints and foreign key relationships to enforce data quality.
- Data vault: This is a hybrid data model that combines the best practices of dimensional and normalized data models to create a scalable, flexible, and resilient data warehouse. Data vault schema consists of three types of tables: hubs, links, and satellites. Hubs store business keys and metadata for each entity. Links store associations and relationships between entities. Satellites store descriptive attributes and historical changes for each entity or relationship. Snowflake supports data vault schema and allows users to leverage its features such as time travel, zero-copy cloning, and secure data sharing to implement data vault methodology.
References: What is Data Modeling? | Snowflake, Snowflake Schema in Data Warehouse Model - GeeksforGeeks, [Data Vault 2.0 Modeling with Snowflake]
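As a minimal sketch of the dimensional (Kimball) approach described above, a star schema in Snowflake might look like this; the table and column names are illustrative assumptions:
create table dim_customer (customer_sk number autoincrement, customer_id varchar, region varchar, primary key (customer_sk));
create table dim_date (date_sk number, calendar_date date, fiscal_quarter varchar, primary key (date_sk));
create table fact_sales (
  sale_id number,
  customer_sk number references dim_customer(customer_sk),
  date_sk number references dim_date(date_sk),
  amount number(12,2)
);
Note that Snowflake records but does not enforce primary and foreign key constraints; they serve as documentation of the model.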
Question 28A company is designing high availability and disaster recovery plans and needs to maximize redundancy and minimize recovery time objectives for their critical application processes. Cost is not a concern as long as the solution is the best available. The plan so far consists of the following steps:
1. Deployment of Snowflake accounts on two different cloud providers.
2. Selection of cloud provider regions that are geographically far apart.
3. The Snowflake deployment will replicate the databases and account data between both cloud provider accounts.
4. Implementation of Snowflake client redirect.
What is the MOST cost-effective way to provide the HIGHEST uptime and LEAST application disruption if there is a service event?
Options:
A.Connect the applications using the … URL. Use the Business Critical Snowflake edition.
B.Connect the applications using the … URL. Use the Virtual Private Snowflake (VPS) edition.
C.Connect the applications using the … URL. Use the Enterprise Snowflake edition.
D.Connect the applications using the … URL. Use the Business Critical Snowflake edition.
Answer:
DExplanation:
Explanation:To provide the highest uptime and least application disruption in case of a service event, the best option is to use the Business Critical Snowflake edition and connect the applications using the organization-connection URL (the URL associated with a connection object rather than with a specific account). The Business Critical Snowflake edition offers the highest level of security, performance, and availability for Snowflake accounts. It includes features such as customer-managed encryption keys, HIPAA compliance, and 4-hour RPO and RTO SLAs. It also supports account replication and failover across regions and cloud platforms, which enables business continuity and disaster recovery. By using the connection URL, the applications can leverage the Snowflake Client Redirect feature, which automatically redirects client connections to the secondary account in case of a failover. This way, the applications can seamlessly switch to the backup account without any manual intervention or configuration changes. The other options are less cost-effective or less reliable because they either use a lower edition of Snowflake, which does not support account replication and failover, or they use an account-specific URL, which does not support client redirect and requires manual updates to the connection string in case of a failover. References: - [Snowflake Editions] 1
- [Replication and Failover/Failback] 2
- [Client Redirect] 3
- [Snowflake Account Identifiers] 4
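A hedged sketch of how the client redirect piece of this design is typically wired up, assuming hypothetical organization and account names (myorg, prod_account, dr_account):
-- on the primary account
create connection prod_conn;
alter connection prod_conn enable failover to accounts myorg.dr_account;
-- on the secondary account
create connection prod_conn as replica of myorg.prod_account.prod_conn;
-- during an outage, promote the secondary connection
alter connection prod_conn primary;
Applications connect through the connection URL (myorg-prod_conn.snowflakecomputing.com), so a failover does not require changing connection strings.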
Question 29Which SQL alter command will MAXIMIZE memory and compute resources for a Snowpark stored procedure when executed on the snowpark_opt_wh warehouse?
A)
B)
C)
D)
Options:
A.Option A
B.Option B
C.Option C
D.Option D
Answer:
AExplanation:
Explanation:To maximize memory and compute resources for a Snowpark stored procedure, you need to adjust the MAX_CONCURRENCY_LEVEL parameter for the warehouse that executes the stored procedure. This parameter determines the maximum number of concurrent queries that can run on a single warehouse. For Snowpark-optimized warehouses, Snowflake's guidance is to lower this value to 1, so that a single Snowpark workload receives all of the available memory and compute resources of the warehouse instead of sharing them with concurrent queries. This improves the performance and efficiency of the stored procedure. The other options are incorrect because they either do not change the MAX_CONCURRENCY_LEVEL parameter or do not restrict concurrency on the warehouse, so the stored procedure would still share memory and compute resources with other queries. References:
- [Snowpark-optimized Warehouses] 1
- [Training Machine Learning Models with Snowpark Python] 2
- [Snowflake Shorts: Snowpark Optimized Warehouses] 3
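Because the lettered options above are images in the source and are not reproduced here, the following is a hedged sketch of the kind of command the explanation refers to, using the warehouse name from the question:
alter warehouse snowpark_opt_wh set max_concurrency_level = 1;
Lowering the concurrency level to 1 dedicates the warehouse's memory and compute to a single Snowpark workload at a time.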
Question 30A company has a table named Data that has corrupted data. The company wants to recover the data as it was 5 minutes ago using cloning and Time Travel.
What command will accomplish this?
Options:
A.CREATE CLONE TABLE Recover_Data FROM Data AT(OFFSET => -60*5);
B.CREATE CLONE Recover_Data FROM Data AT(OFFSET => -60*5);
C.CREATE TABLE Recover_Data CLONE Data AT(OFFSET => -60*5);
D.CREATE TABLE Recover Data CLONE Data AT(TIME => -60*5);
Answer:
CExplanation:
Explanation:This is the correct command to create a clone of the table Data as it was 5 minutes ago using cloning and Time Travel. Cloning is a feature that allows creating a copy of a database, schema, table, or view without duplicating the data or metadata. Time Travel is a feature that enables accessing historical data (i.e. data that has been changed or deleted) at any point within a defined period. To create a clone of a table at a point in time in the past, the syntax is:
CREATE TABLE <new_table> CLONE <source_table> AT (OFFSET => <time_difference_in_seconds>);
The OFFSET parameter specifies the time difference in seconds from the present time. A negative value indicates a point in the past. For example, -60*5 means 5 minutes ago. Alternatively, the TIMESTAMP parameter can be used to specify an exact timestamp in the past. The clone will contain the data as it existed in the source table at the specified point in time12.
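Applying that syntax to the question's table (assuming the current role has the required privileges), the recovery could look like this:
create table Recover_Data clone Data at (offset => -60*5);
-- or, as an alternative, using an explicit timestamp:
create table Recover_Data clone Data at (timestamp => dateadd(minute, -5, current_timestamp())::timestamp_ltz);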
References:
- Snowflake Documentation: Cloning Objects
- Snowflake Documentation: Cloning Objects at a Point in Time in the Past
Question 31An Architect is troubleshooting a query with poor performance using the QUERY_HISTORY function. The Architect observes that the COMPILATION_TIME is greater than the EXECUTION_TIME.
What is the reason for this?
Options:
A.The query is processing a very large dataset.
B.The query has overly complex logic.
C.The query is queued for execution.
D.The query is reading from remote storage.
Answer:
BExplanation:
Explanation:- The correct answer is B because the compilation time is the time it takes for the optimizer to create an optimal query plan for the efficient execution of the query. The compilation time depends on the complexity of the query, such as the number of tables, columns, joins, filters, aggregations, subqueries, etc. The more complex the query, the longer it takes to compile.
- Option A is incorrect because the query processing time is not affected by the size of the dataset, but by the size of the virtual warehouse. Snowflake automatically scales the compute resources to match the data volume and parallelizes the query execution. The size of the dataset may affect the execution time, but not the compilation time.
- Option C is incorrect because the query queue time is not part of the compilation time or the execution time. It is a separate metric that indicates how long the query waits for a warehouse slot before it starts running. The query queue time depends on the warehouse load, concurrency, and priority settings.
- Option D is incorrect because the query remote IO time is not part of the compilation time or the execution time. It is a separate metric that indicates how long the query spends reading data from remote storage, such as S3 or Azure Blob Storage. The query remote IO time depends on the network latency, bandwidth, and caching efficiency. References:
- Understanding Why Compilation Time in Snowflake Can Be Higher than Execution Time: This article explains why the total duration (compilation + execution) time is an essential metric to measure query performance in Snowflake. It discusses the reasons for the long compilation time, including query complexity and the number of tables and columns.
- Exploring Execution Times: This document explains how to examine the past performance of queries and tasks using Snowsight or by writing queries against views in the ACCOUNT_USAGE schema. It also describes the different metrics and dimensions that affect query performance, such as duration, compilation, execution, queue, and remote IO time.
- What is the “compilation time” and how to optimize it?: This community post provides some tips and best practices on how to reduce the compilation time, such as simplifying the query logic, using views or common table expressions, and avoiding unnecessary columns or joins.
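A quick way to spot such queries, sketched against the QUERY_HISTORY table function (the 20-row limit is an arbitrary choice):
select query_id, query_text, compilation_time, execution_time
from table(information_schema.query_history())
where compilation_time > execution_time
order by start_time desc
limit 20;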
Question 32An Architect is troubleshooting a query with poor performance using the QUERY_HISTORY function. The Architect observes that the COMPILATION_TIME is greater than the EXECUTION_TIME.
What is the reason for this?
Options:
A.The query is processing a very large dataset.
B.The query has overly complex logic.
C.The query is queued for execution.
D.The query is reading from remote storage.
Answer:
BExplanation:
Explanation:Compilation time is the time it takes for the optimizer to create an optimal query plan for the efficient execution of the query. It also involves some pruning of partition files, making the query execution efficient2
If the compilation time is greater than the execution time, it means that the optimizer spent more time analyzing the query than actually running it. This could indicate that the query has overly complex logic, such as multiple joins, subqueries, aggregations, or expressions. The complexity of the query could also affect the size and quality of the query plan, which could impact the performance of the query3
To reduce the compilation time, the Architect can try to simplify the query logic, use views or common table expressions (CTEs) to break down the query into smaller parts, or use hints to guide the optimizer. The Architect can also use the EXPLAIN command to examine the query plan and identify potential bottlenecks or inefficiencies4 References:
- 1: SnowPro Advanced: Architect | Study Guide 5
- 2: Snowflake Documentation | Query Profile Overview 6
- 3: Understanding Why Compilation Time in Snowflake Can Be Higher than Execution Time 7
- 4: Snowflake Documentation | Optimizing Query Performance 8
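To inspect the plan the optimizer produces for a suspect statement, the EXPLAIN command mentioned above can be used; for example (the query itself is only a placeholder):
explain using text
select c.ca_state, count(*)
from customer_address c
group by c.ca_state;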
Question 33A company is designing a process for importing a large amount of IoT JSON data from cloud storage into Snowflake. New sets of IoT data get generated and uploaded approximately every 5 minutes.
Once the IoT data is in Snowflake, the company needs up-to-date information from an external vendor to join to the data. This data is then presented to users through a dashboard that shows different levels of aggregation. The external vendor is a Snowflake customer.
What solution will MINIMIZE complexity and MAXIMIZE performance?
Options:
A.1. Create an external table over the JSON data in cloud storage.
2. Create a task that runs every 5 minutes to run a transformation procedure on new data, based on a saved timestamp.
3. Ask the vendor to expose an API so an external function can be used to generate a call to join the data back to the IoT data in the transformation procedure.
4. Give the transformed table access to the dashboard tool.
5. Perform the a
B.1. Create an external table over the JSON data in cloud storage.
2. Create a task that runs every 5 minutes to run a transformation procedure on new data based on a saved timestamp.
3. Ask the vendor to create a data share with the required data that can be imported into the company's Snowflake account.
4. Join the vendor's data back to the IoT data using a transformation procedure.
5. Create views over the larger da
C.1. Create a Snowpipe to bring the JSON data into Snowflake.
2. Use streams and tasks to trigger a transformation procedure when new JSON data arrives.
3. Ask the vendor to expose an API so an external function call can be made to join the vendor's data back to the IoT data in a transformation procedure.
4. Create materialized views over the larger dataset to perform the aggregations required by the dashboard.
5. Give
D.1. Create a Snowpipe to bring the JSON data into Snowflake.
2. Use streams and tasks to trigger a transformation procedure when new JSON data arrives.
3. Ask the vendor to create a data share with the required data that is then imported into the Snowflake account.
4. Join the vendor's data back to the IoT data in a transformation procedure.
5. Create materialized views over the larger dataset to perform the aggregations required by the dashboard.
Answer:
DExplanation:
Explanation:Using Snowpipe for continuous, automated data ingestion minimizes the need for manual intervention and ensures that data is available in Snowflake promptly after it is generated. Leveraging Snowflake’s data sharing capabilities allows for efficient and secure access to the vendor’s data without the need for complex API integrations. Materialized views provide pre-aggregated data for fast access, which is ideal for dashboards that require high performance1234.
References =
•Snowflake Documentation on Snowpipe4
•Snowflake Documentation on Secure Data Sharing2
•Best Practices for Data Ingestion with Snowflake1
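A condensed sketch of the ingestion-and-transformation part of this design, with hypothetical stage, table, warehouse, and share names:
-- raw_iot is assumed to have a single VARIANT column named record
create pipe iot_pipe auto_ingest = true as
  copy into raw_iot from @iot_stage file_format = (type = 'JSON');

create stream raw_iot_stream on table raw_iot;

create task transform_iot
  warehouse = transform_wh
  schedule = '5 MINUTE'
  when system$stream_has_data('RAW_IOT_STREAM')
as
  insert into curated_iot
  select r.record:device_id::string,
         r.record:reading::float,
         v.segment
  from raw_iot_stream r
  join vendor_db.public.vendor_data v
    on r.record:device_id::string = v.device_id;

alter task transform_iot resume;
Here vendor_db is assumed to be the database created from the vendor's data share, and the de-identified result would then be exposed through materialized views and a Snowflake Marketplace listing.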
Question 34A company is using Snowflake in Azure in the Netherlands. The company analyst team also has data in JSON format that is stored in an Amazon S3 bucket in the AWS Singapore region that the team wants to analyze.
The Architect has been given the following requirements:
1. Provide access to frequently changing data
2. Keep egress costs to a minimum
3. Maintain low latency
How can these requirements be met with the LEAST amount of operational overhead?
Options:
A.Use a materialized view on top of an external table against the S3 bucket in AWS Singapore.
B.Use an external table against the S3 bucket in AWS Singapore and copy the data into transient tables.
C.Copy the data between providers from S3 to Azure Blob storage to collocate, then use Snowpipe for data ingestion.
D.Use AWS Transfer Family to replicate data between the S3 bucket in AWS Singapore and an Azure Netherlands Blob storage, then use an external table against the Blob storage.
Answer:
AExplanation:
Explanation:Option A is the best design to meet the requirements because it uses a materialized view on top of an external table against the S3 bucket in AWS Singapore. A materialized view is a database object that contains the results of a query and can be refreshed periodically to reflect changes in the underlying data1. An external table is a table that references data files stored in a cloud storage service, such as Amazon S32. By using a materialized view on top of an external table, the company can provide access to frequently changing data, keep egress costs to a minimum, and maintain low latency. This is because the materialized view will cache the query results in Snowflake, reducing the need to access the external data files and incur network charges. The materialized view will also improve the query performance by avoiding scanning the external data files every time. The materialized view can be refreshed on a schedule or on demand to capture the changes in the external data files1.
Option B is not the best design because it uses an external table against the S3 bucket in AWS Singapore and copies the data into transient tables. A transient table is a table that is not subject to the Time Travel and Fail-safe features of Snowflake, and is automatically purged after a period of time3. By using an external table and copying the data into transient tables, the company will incur more egress costs and operational overhead than using a materialized view. This is because the external table will access the external data files every time a query is executed, and the copy operation will also transfer data from S3 to Snowflake. The transient tables will also consume more storage space in Snowflake and require manual maintenance to ensure they are up to date.
Option C is not the best design because it copies the data between providers from S3 to Azure Blob storage to collocate, then uses Snowpipe for data ingestion. Snowpipe is a service that automates the loading of data from external sources into Snowflake tables4. By copying the data between providers, the company will incur high egress costs and latency, as well as operational complexity and maintenance of the infrastructure. Snowpipe will also add another layer of processing and storage in Snowflake, which may not be necessary if the external data files are already in a queryable format.
Option D is not the best design because it uses AWS Transfer Family to replicate data between the S3 bucket in AWS Singapore and an Azure Netherlands Blob storage, then uses an external table against the Blob storage. AWS Transfer Family is a service that enables secure and seamless transfer of files over SFTP, FTPS, and FTP to and from Amazon S3 or Amazon EFS5. By using AWS Transfer Family, the company will incur high egress costs and latency, as well as operational complexity and maintenance of the infrastructure. The external table will also access the external data files every time a query is executed, which may affect the query performance.
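A minimal sketch of option A's building blocks, assuming a hypothetical storage integration, stage, and JSON attribute names:
create stage analyst_stage
  storage_integration = s3_int
  url = 's3://analyst-bucket/reviews/';
create external table ext_reviews
  with location = @analyst_stage
  file_format = (type = json)
  auto_refresh = true;
create materialized view mv_reviews as
  select value:customer_id::string as customer_id,
         value:score::float        as score
  from ext_reviews;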
References: 1: Materialized Views 2: External Tables 3: Transient Tables 4: Snowpipe Overview 5: AWS Transfer Family
Question 35A data platform team creates two multi-cluster virtual warehouses with the AUTO_SUSPEND value set to NULL on one, and '0' on the other. What would be the execution behavior of these virtual warehouses?
Options:
A.Setting a '0' or NULL value means the warehouses will never suspend.
B.Setting a '0' or NULL value means the warehouses will suspend immediately.
C.Setting a '0' or NULL value means the warehouses will suspend after the default of 600 seconds.
D.Setting a '0' value means the warehouses will suspend immediately, and NULL means the warehouses will never suspend.
Answer:
AExplanation:
Explanation:The AUTO_SUSPEND parameter specifies the number of seconds of inactivity after which a warehouse is automatically suspended. Per the Snowflake documentation, setting the parameter to either '0' or NULL means the warehouse never suspends; there is no value that causes a warehouse to suspend immediately after each query. Both warehouses therefore behave the same way: they keep running until they are suspended manually or by a resource monitor action, which is the behavior described in option A. References:
- ALTER WAREHOUSE
- Parameters
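For reference, the parameter is set per warehouse; a sketch with placeholder warehouse names:
alter warehouse wh_one set auto_suspend = null;  -- never suspends
alter warehouse wh_two set auto_suspend = 0;     -- also never suspends
alter warehouse wh_three set auto_suspend = 600; -- suspends after 10 minutes of inactivity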
Question 36What is a valid object hierarchy when building a Snowflake environment?
Options:
A.Account --> Database --> Schema --> Warehouse
B.Organization --> Account --> Database --> Schema --> Stage
C.Account --> Schema > Table --> Stage
D.Organization --> Account --> Stage --> Table --> View
Answer:
BExplanation:
Explanation:This is the valid object hierarchy when building a Snowflake environment, according to the Snowflake documentation and the web search results. Snowflake is a cloud data platform that supports various types of objects, such as databases, schemas, tables, views, stages, warehouses, and more. These objects are organized in a hierarchical structure, as follows:
- Organization: An organization is the top-level entity that represents a group of Snowflake accounts that are related by business needs or ownership. An organization can have one or more accounts, and can enable features such as cross-account data sharing, billing and usage reporting, and single sign-on across accounts12.
- Account: An account is the primary entity that represents a Snowflake customer. An account can have one or more databases, schemas, stages, warehouses, and other objects. An account can also have one or more users, roles, and security integrations. An account is associated with a specific cloud platform, region, and Snowflake edition34.
- Database: A database is a logical grouping of schemas. A database can have one or more schemas, and can store structured, semi-structured, or unstructured data. A database can also have properties such as retention time, encryption, and ownership56.
- Schema: A schema is a logical grouping of tables, views, stages, and other objects. A schema can have one or more objects, and can define the namespace and access control for the objects. A schema can also have properties such as ownership, managed access, and data retention time.
- Stage: A stage is a named location that references the files in external or internal storage. A stage can be used to load data into Snowflake tables using the COPY INTO command, or to unload data from Snowflake tables using the COPY INTO <location> command. A named stage is created within a schema, and can have properties such as file format, encryption, and credentials.
The other options listed are not valid object hierarchies, because they either omit or misplace some objects in the structure. For example, option A omits the organization level and places the warehouse under the schema level, which is incorrect. Option C omits the organization, account, and stage levels, and places the table under the schema level, which is incorrect. Option D omits the database level and places the stage and table under the account level, which is incorrect.
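The hierarchy in option B can be seen in the order objects are typically created; the names below are illustrative:
create database sales_db;                         -- account-level object
create schema sales_db.raw;                       -- lives inside the database
create stage sales_db.raw.landing_stage;          -- lives inside the schema
create table sales_db.raw.orders (id number, payload variant);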
References:
- Snowflake Documentation: Organizations
- Snowflake Blog: Introducing Organizations in Snowflake
- Snowflake Documentation: Accounts
- Snowflake Blog: Understanding Snowflake Account Structures
- Snowflake Documentation: Databases
- Snowflake Blog: How to Create a Database in Snowflake
- [Snowflake Documentation: Schemas]
- [Snowflake Blog: How to Create a Schema in Snowflake]
- [Snowflake Documentation: Stages]
- [Snowflake Blog: How to Use Stages in Snowflake]
Question 37What Snowflake system functions are used to view and or monitor the clustering metadata for a table? (Select TWO).
Options:
A.SYSTEM$CLUSTERING
B.SYSTEM$TABLE_CLUSTERING
C.SYSTEM$CLUSTERING_DEPTH
D.SYSTEM$CLUSTERING_RATIO
E.SYSTEM$CLUSTERING_INFORMATION
Answer:
C, EExplanation:
Explanation:The Snowflake system functions used to view and monitor the clustering metadata for a table are:
- SYSTEM$CLUSTERING_INFORMATION
- SYSTEM$CLUSTERING_DEPTH
Comprehensive But Short Explanation:
- The SYSTEM$CLUSTERING_INFORMATION function in Snowflake returns a variety of clustering information for a specified table. This information includes the average clustering depth, total number of micro-partitions, total constant partition count, average overlaps, average depth, and a partition depth histogram. This function allows you to specify either one or multiple columns for which the clustering information is returned, and it returns this data in JSON format.
- The SYSTEM$CLUSTERING_DEPTH function computes the average depth of a table based on specified columns or the clustering key defined for the table. A lower average depth indicates that the table is better clustered with respect to the specified columns. This function also allows specifying columns to calculate the depth, and the values need to be enclosed in single quotes.
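Example calls, reusing the CUSTOMER_ADDRESS table from an earlier question (the column list is an arbitrary choice for illustration):
select system$clustering_information('CUSTOMER_ADDRESS', '(CA_STATE, CA_ZIP)');
select system$clustering_depth('CUSTOMER_ADDRESS', '(CA_STATE, CA_ZIP)');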
References:
- SYSTEM$CLUSTERING_INFORMATION: Snowflake Documentation
- SYSTEM$CLUSTERING_DEPTH: Snowflake Documentation
Question 38Which query will identify the specific days and virtual warehouses that would benefit from a multi-cluster warehouse to improve the performance of a particular workload?
A)
B)
C)
D)
Options:
A.Option A
B.Option B
C.Option C
D.Option D
Answer:
BExplanation:
Explanation:The correct answer is option B. This query is designed to assess the need for a multi-cluster warehouse by examining the queuing time (AVG_QUEUED_LOAD) on different days and virtual warehouses. When the AVG_QUEUED_LOAD is greater than zero, it suggests that queries are waiting for resources, which can be an indicator that performance might be improved by using a multi-cluster warehouse to handle the workload more efficiently. By grouping by date and warehouse name and filtering on the sum of the average queued load being greater than zero, the query identifies specific days and warehouses where the workload exceeded the available compute resources. This information is valuable when considering scaling out warehouses to multi-cluster configurations for improved performance.
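Since the lettered options are images in the source, here is a hedged reconstruction of the kind of query the explanation describes, written against the ACCOUNT_USAGE.WAREHOUSE_LOAD_HISTORY view (the one-month window is an assumption, not necessarily the exact text of option B):
select to_date(start_time) as load_date,
       warehouse_name,
       sum(avg_running)     as sum_running,
       sum(avg_queued_load) as sum_queued
from snowflake.account_usage.warehouse_load_history
where start_time >= dateadd(month, -1, current_timestamp())
group by 1, 2
having sum(avg_queued_load) > 0
order by load_date, warehouse_name;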
Question 39Is it possible for a data provider account with a Snowflake Business Critical edition to share data with an Enterprise edition data consumer account?
Options:
A.A Business Critical account cannot be a data sharing provider to an Enterprise consumer. Any consumer accounts must also be Business Critical.
B.If a user in the provider account with role authority to create or alter share adds an Enterprise account as a consumer, it can import the share.
C.If a user in the provider account with a share owning role sets share_restrictions to False when adding an Enterprise consumer account, it can import the share.
D.If a user in the provider account with a share owning role, which also has the OVERRIDE SHARE RESTRICTIONS privilege, sets share_restrictions to False when adding an Enterprise consumer account, it can import the share.
Answer:
BExplanation:
Explanation:In Snowflake, data sharing capabilities allow a Business Critical edition account to share data with an Enterprise edition consumer account. The ability to share data is contingent upon the role permissions within the provider account. If a user has the necessary role authority (like ACCOUNTADMIN or a role with similar privileges to create or manage shares), they can add an Enterprise edition account as a consumer. This feature enables flexibility in data sharing across different Snowflake account editions, facilitating broader data collaboration and accessibility.References: Snowflake's data sharing documentation and the specifics of edition-based capabilities discussed in SnowPro Advanced: Architect certification materials.
Question 40A company is following the Data Mesh principles, including domain separation, and chose one Snowflake account for its data platform.
An Architect created two data domains to produce two data products. The Architect needs a third data domain that will use both of the data products to create an aggregate data product. The read access to the data products will be granted through a separate role.
Based on the Data Mesh principles, how should the third domain be configured to create the aggregate product if it has been granted the two read roles?
Options:
A.Use secondary roles for all users.
B.Create a hierarchy between the two read roles.
C.Request a technical ETL user with the sysadmin role.
D.Request that the two data domains share data using the Data Exchange.
Answer:
DExplanation:
Explanation:In the scenario described, where a third data domain needs access to two existing data products in a Snowflake account structured according to Data Mesh principles, the best approach is to utilize Snowflake’s Data Exchange functionality. Option D is correct as it facilitates the sharing and governance of data across different domains efficiently and securely. Data Exchange allows domains to publish and subscribe to live data products, enabling real-time data collaboration and access management in a governed manner. This approach is in line with Data Mesh principles, which advocate for decentralized data ownership and architecture, enhancing agility and scalability across the organization.References:
- Snowflake Documentation on Data Exchange
- Articles on Data Mesh Principles in Data Management
Question 41A table, EMP_TBL has three records as shown:
The following variables are set for the session:
Which SELECT statements will retrieve all three records? (Select TWO).
Options:
A.Select * FROM Stbl_ref WHERE Scol_ref IN ('Name1','Nam2','Name3');
B.SELECT * FROM EMP_TBL WHERE identifier($col_ref) IN ('Name1','Name2','Name3');
C.SELECT * FROM identifier WHERE NAME IN ($var1, $var2, $var3);
D.SELECT * FROM identifier($tbl_ref) WHERE ID IN ('var1','var2','var3');
E.SELECT * FROM $tbl_ref WHERE $col_ref IN ($var1, $var2, $var3);
Answer:
B, EExplanation:
Explanation:- The correct answer is B and E because they use the correct syntax and values for the identifier function and the session variables.
- The identifier function allows you to use a variable or expression as an identifier (such as a table name or column name) in a SQL statement. It takes a single argument and returns it as an identifier. For example, identifier($tbl_ref) returns EMP_TBL as an identifier.
- The session variables are set using the SET command and can be referenced using the $ sign. For example, $var1 returns Name1 as a value.
- Option A is incorrect because it uses Stbl_ref and Scol_ref, which are not valid session variables or identifiers. They should be $tbl_ref and $col_ref instead.
- Option C is incorrect because it uses identifier without the function-call syntax, which is not valid for the identifier function. It should be identifier($tbl_ref) instead.
- Option D is incorrect because it uses the string literals 'var1', 'var2', and 'var3', which are not session variable references. They should be $var1, $var2, and $var3 instead. References:
- Snowflake Documentation: Identifier Function
- Snowflake Documentation: Session Variables
- Snowflake Learning: SnowPro Advanced: Architect Exam Study Guide
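A self-contained sketch combining the two mechanisms the explanation describes (session variables and the identifier function); it is a simplified variant, not a reproduction of any option, and assumes the EMP_TBL table exists:
set tbl_ref = 'EMP_TBL';
set var1 = 'Name1';
set var2 = 'Name2';
set var3 = 'Name3';
select * from identifier($tbl_ref) where NAME in ($var1, $var2, $var3);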
Question 42An Architect on a new project has been asked to design an architecture that meets Snowflake security, compliance, and governance requirements as follows:
1) Use Tri-Secret Secure in Snowflake
2) Share some information stored in a view with another Snowflake customer
3) Hide portions of sensitive information from some columns
4) Use zero-copy cloning to refresh the non-production environment from the production environment
To meet these requirements, which design elements must be implemented? (Choose three.)
Options:
A.Define row access policies.
B.Use the Business-Critical edition of Snowflake.
C.Create a secure view.
D.Use the Enterprise edition of Snowflake.
E.Use Dynamic Data Masking.
F.Create a materialized view.
Answer:
B, C, EExplanation:
Explanation:These three design elements are required to meet the security, compliance, and governance requirements for the project.
- To use Tri-Secret Secure in Snowflake, the Business Critical edition of Snowflake is required. This edition provides enhanced data protection features, such as customer-managed encryption keys, that are not available in lower editions. Tri-Secret Secure is a feature that combines a Snowflake-maintained key and a customer-managed key to create a composite master key to encrypt the data in Snowflake1.
- To share some information stored in a view with another Snowflake customer, a secure view is recommended. A secure view is a view that hides the underlying data and the view definition from unauthorized users. Only the owner of the view and the users who are granted the owner’s role can see the view definition and the data in the base tables of the view2. A secure view can be shared with another Snowflake account using a data share3.
- To hide portions of sensitive information from some columns, Dynamic Data Masking can be used. Dynamic Data Masking is a feature that allows applying masking policies to columns to selectively mask plain-text data at query time. Depending on the masking policy conditions and the user’s role, the data can be fully or partially masked, or shown as plain-text4.
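A compressed sketch of how the secure view and data sharing requirements fit together; the database, schema, and account names are assumptions:
create secure view clinical.public.shared_summary as
  select region, count(*) as patient_count
  from clinical.public.patients
  group by region;

create share partner_share;
grant usage on database clinical to share partner_share;
grant usage on schema clinical.public to share partner_share;
grant select on view clinical.public.shared_summary to share partner_share;
alter share partner_share add accounts = myorg.partner_account;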
Question 43A media company needs a data pipeline that will ingest customer review data into a Snowflake table, and apply some transformations. The company also needs to use Amazon Comprehend to do sentiment analysis and make the de-identified final data set available publicly for advertising companies who use different cloud providers in different regions.
The data pipeline needs to run continuously and efficiently as new records arrive in the object storage, leveraging event notifications. Also, the operational complexity, maintenance of the infrastructure (including platform upgrades and security), and the development effort should be minimal.
Which design will meet these requirements?
Options:
A.Ingest the data using COPY INTO and use streams and tasks to orchestrate transformations. Export the data into Amazon S3 to do model inference with Amazon Comprehend and ingest the data back into a Snowflake table. Then create a listing in the Snowflake Marketplace to make the data available to other companies.
B.Ingest the data using Snowpipe and use streams and tasks to orchestrate transformations. Create an external function to do model inference with Amazon Comprehend and write the final records to a Snowflake table. Then create a listing in the Snowflake Marketplace to make the data available to other companies.
C.Ingest the data into Snowflake using Amazon EMR and PySpark using the Snowflake Spark connector. Apply transformations using another Spark job. Develop a python program to do model inference by leveraging the Amazon Comprehend text analysis API. Then write the results to a Snowflake table and create a listing in the Snowflake Marketplace to make the data available to other companies.
D.Ingest the data using Snowpipe and use streams and tasks to orchestrate transformations. Export the data into Amazon S3 to do model inference with Amazon Comprehend and ingest the data back into a Snowflake table. Then create a listing in the Snowflake Marketplace to make the data available to other companies.
Answer:
BExplanation:
Explanation:This design meets all the requirements for the data pipeline. Snowpipe is a feature that enables continuous data loading into Snowflake from object storage using event notifications. It is efficient, scalable, and serverless, meaning it does not require any infrastructure or maintenance from the user. Streams and tasks are features that enable automated data pipelines within Snowflake, using change data capture and scheduled execution. They are also efficient, scalable, and serverless, and they simplify the data transformation process. External functions are functions that can invoke external services or APIs from within Snowflake. They can be used to integrate with Amazon Comprehend and perform sentiment analysis on the data. The results can be written back to a Snowflake table using standard SQL commands. Snowflake Marketplace is a platform that allows data providers to share data with data consumers across different accounts, regions, and cloud platforms. It is a secure and easy way to make data publicly available to other companies.
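The external function piece of option B could be sketched as follows; the API integration details (role ARN and gateway URLs) are entirely hypothetical:
create api integration comprehend_api
  api_provider = aws_api_gateway
  api_aws_role_arn = 'arn:aws:iam::123456789012:role/comprehend-proxy'
  api_allowed_prefixes = ('https://abc123.execute-api.us-east-1.amazonaws.com/prod')
  enabled = true;

create external function get_sentiment(review_text string)
  returns variant
  api_integration = comprehend_api
  as 'https://abc123.execute-api.us-east-1.amazonaws.com/prod/sentiment';
A task in the pipeline can then call get_sentiment() on the new rows surfaced by the stream and write the de-identified results to the table behind the Marketplace listing.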
References:
- Snowpipe Overview | Snowflake Documentation
- Introduction to Data Pipelines | Snowflake Documentation
- External Functions Overview | Snowflake Documentation
- Snowflake Data Marketplace Overview | Snowflake Documentation
Question 44A company has a Snowflake account named ACCOUNTA in AWS us-east-1 region. The company stores its marketing data in a Snowflake database named MARKET_DB. One of the company’s business partners has an account named PARTNERB in Azure East US 2 region. For marketing purposes the company has agreed to share the database MARKET_DB with the partner account.
Which of the following steps MUST be performed for the account PARTNERB to consume data from the MARKET_DB database?
Options:
A.Create a new account (called AZABC123) in Azure East US 2 region. From account ACCOUNTA create a share of database MARKET_DB, create a new database out of this share locally in AWS us-east-1 region, and replicate this new database to AZABC123 account. Then set up data sharing to the PARTNERB account.
B.From account ACCOUNTA create a share of database MARKET_DB, and create a new database out of this share locally in AWS us-east-1 region. Then make this database the provider and share it with the PARTNERB account.
C.Create a new account (called AZABC123) in Azure East US 2 region. From account ACCOUNTA replicate the database MARKET_DB to AZABC123 and from this account set up the data sharing to the PARTNERB account.
D.Create a share of database MARKET_DB, and create a new database out of this share locally in AWS us-east-1 region. Then replicate this database to the partner’s account PARTNERB.
Answer:
CExplanation:
Explanation:- Snowflake supports data sharing across regions and cloud platforms using account replication and share replication features. Account replication enables the replication of objects from a source account to one or more target accounts in the same organization. Share replication enables the replication of shares from a source account to one or more target accounts in the same organization1.
- To share data from the MARKET_DB database in the ACCOUNTA account in AWS us-east-1 region with the PARTNERB account in Azure East US 2 region, the following steps must be performed: create a new account (for example, AZABC123) in the Azure East US 2 region within the same organization; replicate the MARKET_DB database from ACCOUNTA to the new account; and then, from the new account, create a share of the replicated database and add PARTNERB as a consumer.
- Therefore, option C is the correct answer.
References: Replicating Shares Across Regions and Cloud Platforms; Working with Organizations and Accounts; Replicating Databases Across Multiple Accounts; Replicating Shares Across Multiple Accounts
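A hedged sketch of the replication steps behind option C, assuming the new Azure account is named AZABC123 in an organization named MYORG:
-- on ACCOUNTA (AWS us-east-1)
alter database market_db enable replication to accounts myorg.azabc123;

-- on AZABC123 (Azure East US 2)
create database market_db as replica of myorg.accounta.market_db;
alter database market_db refresh;
-- a share of the marketing data is then created in this account and PARTNERB is added as a consumer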
Question 45At which object type level can the APPLY MASKING POLICY, APPLY ROW ACCESS POLICY and APPLY SESSION POLICY privileges be granted?
Options:
A.Global
B.Database
C.Schema
D.Table
Answer:
AExplanation:
Explanation:The object type level at which the APPLY MASKING POLICY, APPLY ROW ACCESS POLICY and APPLY SESSION POLICY privileges can be granted is global. These are account-level privileges that control who can apply or unset these policies on objects such as columns, tables, views, accounts, or users. These privileges are granted to the ACCOUNTADMIN role by default, and can be granted to other roles as needed. The other options are incorrect because they are not the object type level at which these privileges can be granted. Database, schema, and table are lower-level object types that do not support these privileges. References: Access Control Privileges | Snowflake Documentation, Using Dynamic Data Masking | Snowflake Documentation, Using Row Access Policies | Snowflake Documentation, Using Session Policies | Snowflake Documentation
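For example, granting these account-level privileges to a hypothetical governance role looks like this:
grant apply masking policy on account to role data_governance;
grant apply row access policy on account to role data_governance;
grant apply session policy on account to role data_governance;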
Question 46A DevOps team has a requirement for recovery of staging tables used in a complex set of data pipelines. The staging tables are all located in the same staging schema. One of the requirements is to have online recovery of data on a rolling 7-day basis.
After setting up the DATA_RETENTION_TIME_IN_DAYS at the database level, certain tables remain unrecoverable past 1 day.
What would cause this to occur? (Choose two.)
Options:
A.The staging schema has not been setup for MANAGED ACCESS.
B.The DATA_RETENTION_TIME_IN_DAYS for the staging schema has been set to 1 day.
C.The tables exceed the 1 TB limit for data recovery.
D.The staging tables are of the TRANSIENT type.
E.The DevOps role should be granted ALLOW_RECOVERY privilege on the staging schema.
Answer:
B, DExplanation:
Explanation:- The DATA_RETENTION_TIME_IN_DAYS parameter controls the Time Travel retention period for an object (database, schema, or table) in Snowflake. This parameter specifies the number of days for which historical data is preserved and can be accessed using Time Travel operations (SELECT, CREATE … CLONE, UNDROP)1.
- The requirement for recovery of staging tables on a rolling 7-day basis means that the DATA_RETENTION_TIME_IN_DAYS parameter should be set to 7 at the database level. However, this parameter can be overridden at the lower levels (schema or table) if they have a different value1.
- Therefore, one possible cause for certain tables to remain unrecoverable past 1 day is that the DATA_RETENTION_TIME_IN_DAYS for the staging schema has been set to 1 day. This would override the database level setting and limit the Time Travel retention period for all the tables in the schema to 1 day. To fix this, the parameter should be unset or set to 7 at the schema level1. Therefore, option B is correct.
- Another possible cause for certain tables to remain unrecoverable past 1 day is that the staging tables are of the TRANSIENT type. Transient tables are tables that do not have a Fail-safe period and can have a Time Travel retention period of either 0 or 1 day. Transient tables are suitable for temporary or intermediate data that can be easily reproduced or replicated2. To fix this, the tables should be created as permanent tables, which can have a Time Travel retention period of up to 90 days1. Therefore, option D is correct.
- Option A is incorrect because the MANAGED ACCESS feature is not related to the data recovery requirement. MANAGED ACCESS is a feature that allows granting access privileges to objects without explicitly granting the privileges to roles. It does not affect the Time Travel retention period or the data availability3.
- Option C is incorrect because there is no 1 TB limit for data recovery in Snowflake. The data storage size does not affect the Time Travel retention period or the data availability4.
- Option E is incorrect because there is no ALLOW_RECOVERY privilege in Snowflake. The privilege required to perform Time Travel operations is SELECT, which allows querying historical data in tables5.
References: Understanding & Using Time Travel; Transient Tables; Managed Access; Understanding Storage Cost; Table Privileges
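To meet the 7-day requirement, both the schema-level override and the table types have to be checked; a sketch with hypothetical names:
show parameters like 'DATA_RETENTION_TIME_IN_DAYS' in schema staging_db.staging;
alter schema staging_db.staging set data_retention_time_in_days = 7;
-- transient tables support at most 1 day of Time Travel, so tables needing 7-day recovery
-- would have to be recreated as permanent tables (for example, via CTAS)
create table staging_db.staging.stg_orders_perm as select * from staging_db.staging.stg_orders;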
Question 47Based on the Snowflake object hierarchy, what securable objects belong directly to a Snowflake account? (Select THREE).
Options:
A.Database
B.Schema
C.Table
D.Stage
E.Role
F.Warehouse
Answer:
A, E, FExplanation:
Explanation:- A securable object is an entity to which access can be granted in Snowflake. Securable objects include databases, schemas, tables, views, stages, pipes, functions, procedures, sequences, tasks, streams, roles, warehouses, and shares1.
- The Snowflake object hierarchy is a logical structure that organizes the securable objects in a nested manner. The top-most container is the account, which contains all the databases, roles, and warehouses for the customer organization. Each database contains schemas, which in turn contain tables, views, stages, pipes, functions, procedures, sequences, tasks, and streams. Each role can be granted privileges on other roles or securable objects. Each warehouse can be used to execute queries on securable objects2.
- Based on the Snowflake object hierarchy, the securable objects that belong directly to a Snowflake account are databases, roles, and warehouses. These objects are created and managed at the account level, and do not depend on any other securable object. The other options are not correct because schemas are contained in databases, while tables and stages are contained in schemas, so none of them belong directly to the account.
References:
- 1: Overview of Access Control | Snowflake Documentation
- 2: Securable Objects | Snowflake Documentation
- 3: CREATE SCHEMA | Snowflake Documentation
- 4: CREATE TABLE | Snowflake Documentation
- [5]: CREATE STAGE | Snowflake Documentation
Question 48An Architect needs to design a solution for building environments for development, test, and pre-production, all located in a single Snowflake account. The environments should be based on production data.
Which solution would be MOST cost-effective and performant?
Options:
A.Use zero-copy cloning into transient tables.
B.Use zero-copy cloning into permanent tables.
C.Use CREATE TABLE ... AS SELECT (CTAS) statements.
D.Use a Snowflake task to trigger a stored procedure to copy data.
Answer:
AExplanation:
Explanation:Zero-copy cloning is a feature in Snowflake that allows for the creation of a clone of a database, schema, or table without duplicating any data, which is cost-effective as it saves on storage costs. Transient tables have no Fail-safe period and only limited Time Travel retention, so they avoid the additional storage costs that permanent tables incur for historical data, making them a cost-effective option for development, test, and pre-production environments that do not require the durability of permanent tables123.
References
•Snowflake Documentation on Zero-Copy Cloning3.
•Articles discussing the cost-effectiveness and performance benefits of zero-copy cloning12.
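A minimal sketch of the approach, with hypothetical database and table names:
create transient table dev_db.public.customer  clone prod_db.public.customer;
create transient table test_db.public.customer clone prod_db.public.customer;
The clones share the production micro-partitions until they diverge, and the transient copies carry no Fail-safe storage.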