Spring Sale Limited Time 75% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code = simple75

Pass the Google Cloud Certified Professional-Data-Engineer Questions and answers with Dumpstech

Exam Professional-Data-Engineer Premium Access

View all detail and faqs for the Professional-Data-Engineer exam

Practice at least 50% of the questions to maximize your chances of passing.
Viewing page 8 out of 12 pages
Viewing questions 71-80 out of questions
Questions # 71:

You are deploying a new storage system for your mobile application, which is a media streaming service. You decide the best fit is Google Cloud Datastore. You have entities with multiple properties, some of which can take on multiple values. For example, in the entity ‘Movie’ the property ‘actors’ and the property ‘tags’ have multiple values but the property ‘date released’ does not. A typical query would ask for all movies with actor=<actorname> ordered by date_released or all movies with tag=Comedy ordered by date_released. How should you avoid a combinatorial explosion in the number of indexes?

Question # 71

Options:

A.

Option A

B.

Option B.

C.

Option C

D.

Option D

Questions # 72:

Your company produces 20,000 files every hour. Each data file is formatted as a comma separated values (CSV) file that is less than 4 KB. All files must be ingested on Google Cloud Platform before they can be processed. Your company site has a 200 ms latency to Google Cloud, and your Internet connection bandwidth is limited as 50 Mbps. You currently deploy a secure FTP (SFTP) server on a virtual machine in Google Compute Engine as the data ingestion point. A local SFTP client runs on a dedicated machine to transmit the CSV files as is. The goal is to make reports with data from the previous day available to the executives by 10:00 a.m. each day. This design is barely able to keep up with the current volume, even though the bandwidth utilization is rather low.

You are told that due to seasonality, your company expects the number of files to double for the next three months. Which two actions should you take? (choose two.)

Options:

A.

Introduce data compression for each file to increase the rate file of file transfer.

B.

Contact your internet service provider (ISP) to increase your maximum bandwidth to at least 100 Mbps.

C.

Redesign the data ingestion process to use gsutil tool to send the CSV files to a storage bucket in parallel.

D.

Assemble 1,000 files into a tape archive (TAR) file. Transmit the TAR files instead, and disassemble the CSV files in the cloud upon receiving them.

E.

Create an S3-compatible storage endpoint in your network, and use Google Cloud Storage Transfer Service to transfer on-premices data to the designated storage bucket.

Questions # 73:

You are choosing a NoSQL database to handle telemetry data submitted from millions of Internet-of-Things (IoT) devices. The volume of data is growing at 100 TB per year, and each data entry has about 100 attributes. The data processing pipeline does not require atomicity, consistency, isolation, and durability (ACID). However, high availability and low latency are required.

You need to analyze the data by querying against individual fields. Which three databases meet your requirements? (Choose three.)

Options:

A.

Redis

B.

HBase

C.

MySQL

D.

MongoDB

E.

Cassandra

F.

HDFS with Hive

Questions # 74:

You work for a manufacturing plant that batches application log files together into a single log file once a day at 2:00 AM. You have written a Google Cloud Dataflow job to process that log file. You need to make sure the log file in processed once per day as inexpensively as possible. What should you do?

Options:

A.

Change the processing job to use Google Cloud Dataproc instead.

B.

Manually start the Cloud Dataflow job each morning when you get into the office.

C.

Create a cron job with Google App Engine Cron Service to run the Cloud Dataflow job.

D.

Configure the Cloud Dataflow job as a streaming job so that it processes the log data immediately.

Questions # 75:

Your company is loading comma-separated values (CSV) files into Google BigQuery. The data is fully imported successfully; however, the imported data is not matching byte-to-byte to the source file. What is the most likely cause of this problem?

Options:

A.

The CSV data loaded in BigQuery is not flagged as CSV.

B.

The CSV data has invalid rows that were skipped on import.

C.

The CSV data loaded in BigQuery is not using BigQuery’s default encoding.

D.

The CSV data has not gone through an ETL phase before loading into BigQuery.

Questions # 76:

You are designing the database schema for a machine learning-based food ordering service that will predict what users want to eat. Here is some of the information you need to store:

The user profile: What the user likes and doesn’t like to eat

The user account information: Name, address, preferred meal times

The order information: When orders are made, from where, to whom

The database will be used to store all the transactional data of the product. You want to optimize the data schema. Which Google Cloud Platform product should you use?

Options:

A.

BigQuery

B.

Cloud SQL

C.

Cloud Bigtable

D.

Cloud Datastore

Questions # 77:

When you store data in Cloud Bigtable, what is the recommended minimum amount of stored data?

Options:

A.

500 TB

B.

1 GB

C.

1 TB

D.

500 GB

Questions # 78:

Which of these statements about BigQuery caching is true?

Options:

A.

By default, a query's results are not cached.

B.

BigQuery caches query results for 48 hours.

C.

Query results are cached even if you specify a destination table.

D.

There is no charge for a query that retrieves its results from cache.

Questions # 79:

What Dataflow concept determines when a Window's contents should be output based on certain criteria being met?

Options:

A.

Sessions

B.

OutputCriteria

C.

Windows

D.

Triggers

Questions # 80:

Suppose you have a table that includes a nested column called "city" inside a column called "person", but when you try to submit the following query in BigQuery, it gives you an error.

SELECT person FROM `project1.example.table1` WHERE city = "London"

How would you correct the error?

Options:

A.

Add ", UNNEST(person)" before the WHERE clause.

B.

Change "person" to "person.city".

C.

Change "person" to "city.person".

D.

Add ", UNNEST(city)" before the WHERE clause.

Viewing page 8 out of 12 pages
Viewing questions 71-80 out of questions