[scikit-learn] scikit-learn Digest, Vol 6, Issue 40

Naoya Kanai naopon at gmail.com
Mon Sep 26 17:25:04 EDT 2016


What if you split the data pairwise (i.e., X_success, X_fail, etc.), with subjects matched by row index, and then run train_test_split on each one with the same random_state?
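A minimal sketch of that idea, assuming the two arrays have the same number of rows and that row i in each belongs to the same subject. The synthetic data, the array shapes, the subject_ids helper, and the import from sklearn.model_selection (older releases exposed train_test_split under sklearn.cross_validation) are assumptions for illustration, not details from the original question:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 subjects, 4 features, one row per subject in each array.
rng = np.random.RandomState(0)
n_subjects, n_features = 10, 4
X_success = rng.normal(size=(n_subjects, n_features))
X_fail = rng.normal(size=(n_subjects, n_features))
subject_ids = np.arange(n_subjects)  # row index doubles as subject id

# Split each array separately with the same random_state. Because both
# arrays have the same length, the same rows (subjects) are selected.
Xs_train, Xs_test, ids_train_s, ids_test_s = train_test_split(
    X_success, subject_ids, test_size=0.3, random_state=42)
Xf_train, Xf_test, ids_train_f, ids_test_f = train_test_split(
    X_fail, subject_ids, test_size=0.3, random_state=42)

# Sanity check: both splits put the same subjects on the same side.
assert np.array_equal(ids_train_s, ids_train_f)
assert np.array_equal(ids_test_s, ids_test_f)

# Stack the paired rows and build the targets (1 = success, 0 = fail).
X_train = np.vstack([Xs_train, Xf_train])
y_train = np.concatenate([np.ones(len(Xs_train)), np.zeros(len(Xf_train))])
X_test = np.vstack([Xs_test, Xf_test])
y_test = np.concatenate([np.ones(len(Xs_test)), np.zeros(len(Xf_test))])
```

With this layout, both rows of any given subject end up on the same side of the split, so no subject contributes to both training and test data.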

Naoya Kanai

On Mon, Sep 26, 2016 at 2:06 PM, Afarin Famili <Afarin.Famili at utsouthwestern.edu> wrote:

Hi David,

train_test_split is normally applied to data with a single row per subject. I am looking for a similar function that can handle a pair of rows per subject without biasing the accuracy estimate. We are studying memory: for each subject we have one row of features for successful memory encoding and a second row for unsuccessful memory encoding. The target is 1 for successful and 0 for unsuccessful encoding.

How would you recommend splitting this data to get a reasonable, unbiased accuracy estimate?

Thanks,

Afarin
