What is a benefit to performing data cleansing (imputation, transformations, etc.) on data after partitioning the data for honest assessment as opposed to performing the data cleansing prior to partitioning the data?
Exactly! I mean, who cares if it takes a few extra minutes to run the analysis, as long as we're getting reliable results, right? I'd rather spend the time and do it right than rush through it and end up with a flawed model.
Haha, yeah, as long as you're not using a Commodore 64 to run your analysis, the computational time shouldn't be a problem. I'm with you, the honest assessment is way more important.
That's true, but I think the benefits of getting a more accurate assessment outweigh the computational cost. Plus, modern computers are pretty powerful these days. I'm not too worried about the time it takes.
I agree, but I'm also concerned about the computational expense. Wouldn't it be more efficient to do the cleansing before partitioning? That way, we don't have to do it multiple times for the training and test sets.
Yeah, that's a good point. Doing it after partitioning also allows us to see how effective the cleansing methods are. That way, we can make a more informed decision on which techniques to use in the future.
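Hmm, this is an interesting question. I think performing data cleansing after partitioning the data gives us a more honest assessment of the model's performance. If we do it beforehand, the imputation values and transformation parameters are computed from all the rows, including the ones that end up in the test set, so information leaks from the test set into training and the test results look better than they should.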
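To make the leakage point concrete, here is a minimal sketch in plain Python. It assumes mean imputation on a single numeric column with missing values stored as `None`. The helper names (`split_rows`, `fit_mean`, `impute`) and the toy values are made up for illustration and do not come from any particular tool. The fill value is learned from the training partition only and then reused on the validation partition.

```python
import random
import statistics

def split_rows(rows, valid_fraction=0.3, seed=0):
    """Randomly partition rows into (train, valid). Hypothetical helper."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

def fit_mean(values):
    """Learn the imputation value from the observed (non-missing) entries."""
    observed = [v for v in values if v is not None]
    return statistics.fmean(observed)

def impute(values, fill):
    """Replace missing entries with a previously learned fill value."""
    return [fill if v is None else v for v in values]

# Toy column with missing values (illustrative numbers only).
values = [1.0, 2.0, None, 4.0, 100.0, None, 3.0, 5.0, None, 2.5]

# Honest assessment: partition first, then learn the fill value from train only.
train, valid = split_rows(values)
train_mean = fit_mean(train)
train_clean = impute(train, train_mean)
valid_clean = impute(valid, train_mean)  # reuse the training statistic

# Leaky alternative, shown for comparison: the fill value is computed from
# every row, so validation rows influence how the training data is cleaned.
leaky_mean = fit_mean(values)

print("fill learned from train only:", train_mean)
print("fill learned from all rows:  ", leaky_mean)
```

The same idea applies to transformations: any parameter such as a mean, a standard deviation, or bin boundaries would be estimated on the training partition and then applied unchanged to the validation and test partitions.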