Name: Cloudera DS-200 Exam

Free Cloudera DS-200 Exam Actual Questions

Note: Premium Questions for DS-200 were last updated on 22-04-2019

Question #1

You want to build a classification model to identify spam comments on a blog. You decide to use the words in the comment text as inputs to your model. Which criteria should you use when deciding which words to use as features in order to contribute to making the correct classification decision?

A. Choose words for your sample that are most correlated with the spam label

B. Choose words for your sample that occur most frequently in the text

C. Choose words for your sample that have the largest mutual information with the spam label

D. Choose words for your sample that are least correlated with the spam label

Correct Answer: C
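To make the mutual-information criterion in the options concrete, here is a minimal sketch that scores each word by the mutual information between its presence in a comment and the spam label. The toy comments, labels, and the function name `mutual_information` are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

def mutual_information(word, docs, labels):
    """Mutual information (in bits) between presence of `word` and a label."""
    n = len(docs)
    # Joint counts over (word present?, label)
    joint = Counter((word in doc, label) for doc, label in zip(docs, labels))
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (p, _), c in joint.items() if p == present) / n
        p_y = sum(c for (_, lab), c in joint.items() if lab == label) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi

# Hypothetical toy comments, each tokenized into a set of words
docs = [
    {"buy", "cheap", "pills"},
    {"cheap", "watches", "buy"},
    {"great", "post", "thanks"},
    {"thanks", "for", "the", "post"},
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

vocab = sorted(set().union(*docs))
scores = {w: mutual_information(w, docs, labels) for w in vocab}
ranked = sorted(vocab, key=lambda w: scores[w], reverse=True)
print(ranked)
```

Note that words such as "thanks", which appear only in non-spam comments, score as highly as "buy": mutual information rewards any word whose presence or absence is informative about the label, in either direction.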

Question #2

Given the following sample of numbers from a distribution:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89

What are the five numbers that summarize this distribution (the five-number summary of sample percentiles)?

A. 1, 3, 8, 34, 89

B. 1, 4, 13, 34, 89

C. 1, 1.5, 5, 24.5, 89

D. 1, 2.5, 8, 27.5, 89

Correct Answer: D
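The five-number summary (minimum, first quartile, median, third quartile, maximum) can be checked directly. The sketch below assumes the "inclusive" quartile convention, which interpolates linearly between order statistics; the helper name `five_number_summary` is made up for this example. For this sample it yields 1, 2.5, 8, 27.5, 89.

```python
import statistics

sample = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the inclusive quartile method."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, median, q3, ordered[-1])

summary = five_number_summary(sample)
print(summary)  # (1, 2.5, 8.0, 27.5, 89)
```

Quartile conventions differ between tools, so the first and third quartiles can shift slightly under another method; the minimum, median, and maximum of this sample do not.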

Question #3

Given the following sample of numbers from a distribution:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89

What are two benefits of using the five-number summary of sample percentiles to summarize a data set?

A. You can calculate unbiased estimators for the parameters of the distribution

B. It's robust to outliers

C. It's well-defined for any probability distribution

D. You can calculate it quickly using a relational database like MySQL, even when we have a large sample

Correct Answer: B, C
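The robustness claim in the options can be illustrated with a small experiment: corrupt the largest observation and compare how the mean and the quartiles react. The corrupted value below is a hypothetical data-entry error, and the inclusive quartile convention is an assumption of this sketch.

```python
import statistics

sample = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
# Hypothetical corruption: the largest value is recorded 1000x too large
corrupted = sample[:-1] + [89000]

def quartiles(values):
    """Q1, median, Q3 using linear interpolation between order statistics."""
    return tuple(statistics.quantiles(sorted(values), n=4, method="inclusive"))

print(statistics.mean(sample), statistics.mean(corrupted))
print(quartiles(sample), quartiles(corrupted))
```

The mean moves by orders of magnitude while the three quartiles stay the same. Only the maximum, which is itself one of the five numbers, reflects the outlier.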

Question #4

Given the following sample of numbers from a distribution:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89

How do high-level languages like Apache Hive and Apache Pig efficiently calculate approximate percentiles for a distribution?

A. They sort all of the input samples and then look up the samples for each percentile

B. They maintain an index of the input data as it is loaded into HDFS and load it into memory

C. They use pivots to assign each observation to the reducer that calculates each percentile

D. They assign sample observations to buckets and then aggregate the buckets to compute the approximations

Correct Answer: D
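The bucket-and-aggregate idea from the options can be sketched in a few lines. This is a simplified illustration with fixed-width buckets, not the actual Hive or Pig implementation; the function names, the two input splits, and the bucket width are all assumptions made for the example.

```python
from collections import Counter

def bucket_counts(values, width):
    """Map-side step: count observations per fixed-width bucket."""
    return Counter(int(v // width) for v in values)

def approx_percentile(histogram, width, p):
    """Reduce-side step: walk buckets in order until a fraction p is covered."""
    total = sum(histogram.values())
    target = p * total
    seen = 0
    for bucket in sorted(histogram):
        seen += histogram[bucket]
        if seen >= target:
            return (bucket + 0.5) * width  # report the bucket midpoint
    raise ValueError("empty histogram")

# Two hypothetical input splits, each summarized independently
split_a = [1, 1, 2, 3, 5, 8]
split_b = [13, 21, 34, 55, 89]
width = 5

# Aggregation: partial histograms merge by adding their counts
merged = bucket_counts(split_a, width) + bucket_counts(split_b, width)
median = approx_percentile(merged, width, 0.5)
print(median)  # 7.5, versus an exact median of 8
```

Each split only ships a small table of counts rather than its raw observations, and no global sort is needed. The price is an approximation error bounded by the bucket width.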

Question #5

What is the best way to determine the learning rate parameters for stochastic gradient descent when the distribution of the input data shifts over time?

A. The learning rate should be adjusted periodically based on the setting that optimizes the objective function over a sample of recent observations

B. The learning rate should be a fixed number that decays as the number of observations in the data set increases

C. The learning rate should be the value that optimizes the value of the objective function over the first N samples in the dataset

D. The learning rate should be a fixed number with a constant decay factor

E. The learning rate should be continuously adjusted based on the value that optimizes the objective function for the most recent observation from the input data

Correct Answer: A
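As a rough illustration of periodically re-tuning the learning rate on a sample of recent observations, the sketch below runs SGD on a one-parameter linear model and, every 20 observations, replays the most recent window with each candidate rate and keeps the one that gives the lowest loss. The stream, the candidate grid, the window size, and all function names are hypothetical choices for this example.

```python
def sgd_step(w, x, y, lr):
    """One SGD step on squared error for the model y = w * x."""
    return w - lr * 2 * (w * x - y) * x

def window_loss(w, window):
    """Mean squared error of weight w over a list of (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in window) / len(window)

def pick_learning_rate(w, window, candidates):
    """Replay the recent window with each candidate rate; keep the best."""
    def loss_after_replay(lr):
        trial = w
        for x, y in window:
            trial = sgd_step(trial, x, y, lr)
        return window_loss(trial, window)
    return min(candidates, key=loss_after_replay)

# Hypothetical stream: the true slope shifts from 2.0 to -1.0 halfway through
xs = [0.5, 1.0, 1.5, 2.0] * 50
stream = [(x, (2.0 if i < 100 else -1.0) * x) for i, x in enumerate(xs)]

candidates = [0.001, 0.01, 0.1]
w, lr, window_size = 0.0, candidates[0], 20
for i, (x, y) in enumerate(stream, start=1):
    w = sgd_step(w, x, y, lr)
    if i % window_size == 0:
        # Periodic re-tuning using only the most recent observations
        lr = pick_learning_rate(w, stream[i - window_size:i], candidates)

print(w)  # close to -1.0, the slope after the shift
```

Tuning on a window rather than on a single observation keeps the choice from chasing noise, while tuning on recent data rather than the first N samples lets the rate follow the shift.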
