Google Exam Professional Machine Learning Engineer Topic 9 Question 72 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 72
Topic #: 9

You are developing a model to help your company create more targeted online advertising campaigns. You need to create a dataset that you will use to train the model. You want to avoid creating or reinforcing unfair bias in the model. What should you do?

Choose 2 answers

Suggested Answer: C, E

To avoid creating or reinforcing unfair bias in the model, you should collect a representative sample of production traffic to build the training dataset, and conduct fairness tests across sensitive categories and demographics on the trained model.

A representative sample is one that reflects the true distribution of the population, and does not over- or under-represent any group. A random sample is a simple way to obtain a representative sample, as it ensures that every data point has an equal chance of being selected. A stratified sample is another way to obtain a representative sample, as it ensures that every subgroup has a proportional representation in the sample. However, a stratified sample requires prior knowledge of the subgroups and their sizes, which may not be available or easy to obtain. Therefore, a random sample is a more feasible option in this case.

A fairness test is a way to measure and evaluate the potential bias and discrimination of the model, based on different categories and demographics, such as age, gender, race, etc. A fairness test can help you identify and mitigate any unfair outcomes or impacts of the model, and ensure that the model treats all groups fairly and equitably. A fairness test can be conducted using various methods and tools, such as confusion matrices, ROC curves, fairness indicators, etc.

Reference: The answer can be verified from official Google Cloud documentation and resources related to data sampling and fairness testing.

Sampling data | BigQuery

Fairness Indicators | TensorFlow

What-if Tool | TensorFlow
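
For readers who want to see the two ideas from the explanation side by side, here is a minimal sketch in plain Python (standard library only, not BigQuery or Fairness Indicators). It contrasts a simple random sample with a proportional stratified sample, then computes per-group true-positive and false-positive rates from a confusion matrix as a basic fairness check. The field names `age_group`, `label`, and `pred`, the helper names, and the synthetic traffic are all assumptions made up for illustration; they do not come from the exam question.

```python
import random
from collections import Counter, defaultdict


def random_sample(rows, n, seed=0):
    """Simple random sample: every row has the same chance of being picked."""
    return random.Random(seed).sample(rows, n)


def stratified_sample(rows, n, key, seed=0):
    """Proportional stratified sample: each subgroup keeps roughly its share.

    Needs the subgroup column (key) up front, which is the prior knowledge
    the explanation mentions. Per-group rounding means the total can differ
    slightly from n in general.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    sample = []
    for members in strata.values():
        k = max(1, round(n * len(members) / len(rows)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


def rates_by_group(rows, key, label="label", pred="pred"):
    """Per-group TPR and FPR, derived from each group's confusion matrix."""
    counts = defaultdict(Counter)
    for row in rows:
        correct = "t" if row[label] == row[pred] else "f"
        sign = "p" if row[pred] else "n"
        counts[row[key]][correct + sign] += 1  # tp, tn, fp, or fn
    result = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        result[group] = {
            "tpr": c["tp"] / positives if positives else None,
            "fpr": c["fp"] / negatives if negatives else None,
        }
    return result


def _rows(group, tp, fn, fp, tn):
    """Build hypothetical scored rows with a known confusion matrix."""
    spec = [(1, 1, tp), (1, 0, fn), (0, 1, fp), (0, 0, tn)]
    return [
        {"age_group": group, "label": y, "pred": p}
        for y, p, count in spec
        for _ in range(count)
    ]


# Hypothetical production traffic: 80% in one group, 20% in another.
traffic = [{"age_group": "18-34"} for _ in range(800)]
traffic += [{"age_group": "65+"} for _ in range(200)]

simple = random_sample(traffic, 100)
stratified = stratified_sample(traffic, 100, "age_group")
print(Counter(r["age_group"] for r in simple))      # close to 80/20, varies by seed
print(Counter(r["age_group"] for r in stratified))  # exactly 80 and 20

# Hypothetical model predictions, scored per group.
scored = _rows("18-34", tp=3, fn=1, fp=1, tn=3) + _rows("65+", tp=1, fn=3, fp=0, tn=4)
rates = rates_by_group(scored, "age_group")
print(rates)
```

In this made-up data the model finds 75% of the positives in the `18-34` group but only 25% in the `65+` group. A gap like that in true-positive rate is the kind of disparity a fairness test across demographics is meant to surface before the model ships.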


Contribute your Thoughts:

Lisandra
6 months ago
I think conducting fairness tests on the trained model is crucial to ensure we avoid bias.
upvoted 0 times
...
German
7 months ago
But wouldn't including a comprehensive set of demographic features help us address bias?
upvoted 0 times
...
Emiko
7 months ago
I disagree; I believe we should collect a stratified sample to ensure balanced representation.
upvoted 0 times
...
German
7 months ago
I think we should collect a random sample of production traffic for the dataset.
upvoted 0 times
...
Thersa
8 months ago
Haha, can you imagine if we just went with option B? 'Oh, here's our dataset for targeted advertising - it's just a bunch of middle-aged white dudes.' Yeah, no, that's not going to fly. I'm with you guys on the stratified sampling approach. Gotta keep that diversity in check, you know?
upvoted 0 times
...
Lettie
8 months ago
Hmm, I'm not too keen on option B. Focusing only on the groups that interact most with ads could lead to some serious skew in the data. And option A, with all the demographic features, just feels like a recipe for disaster. Fairness testing, as in option E, is definitely important, but we need to get the data right first.
upvoted 0 times
...
Vallie
8 months ago
You know, I was initially leaning towards option C, the random sample, but after thinking about it, I agree that a stratified sample is probably the better approach. That way, we can make sure we're not over-representing any one group and really getting a representative dataset to train the model on.
upvoted 0 times
...
Christoper
8 months ago
Ah, this is a tricky one. We definitely want to avoid creating or reinforcing unfair bias, but including a comprehensive set of demographic features could potentially amplify those biases. I'm thinking option D might be the way to go - collecting a stratified sample of production traffic could help ensure we capture a diverse range of users and perspectives.
upvoted 0 times
Fletcher
8 months ago
D) Collect a stratified sample of production traffic to build the training dataset.
upvoted 0 times
...
Kayleigh
8 months ago
A) Include a comprehensive set of demographic features.
upvoted 0 times
...
...
