[scikit-learn] using a mask for brain images

Fri May 5 01:59:26 EDT 2017

Hi Alle,

I think that what has changed between 2014 and today is that the
coefficients coef are now a 2D array (number of hyperplanes x number of
features). In your case, the first direction is of length one, so you
could just do:

coef = clf.coef_[0]

and your script should work.

The code of the Abraham paper cannot be updated, because papers are not
living objects. However, we are maintaining a package that encodes all
these patterns in higher-level construct: http://nilearn.github.io/
It might be a good idea to use this package, as it is maintained and has
quality assurance.

Best,

Gaël

On Thu, May 04, 2017 at 05:52:27PM +0200, Alle Meije Wink wrote:
> I have a script to classify MRI perfusion maps from healthy subjects and
> patients. For the file IO and the classifier I have started with the example
> code in Abraham et al 2014 [https://arxiv.org/pdf/1412.3919.pdf].

> I use the same classifier as in the paper to produce a back-projected map of
> classification weights, which I then want to 'unmask' like in the paper:

>     coef=clf.coef_
>     coef=featureselection.inverse_transform(coef)          

> and

>     map_name='weights_check.nii.gz'
>     wmap=np.zeros(mask.shape, dtype=X.dtype)
>     wmap[mask]=coef
>     img=nb.Nifti1Image(wmap,np.eye(4))
>     img.to_filename(map_name)   

> But the line "wmap[mask]=coef" throws an error "ValueError: boolean index array
> should have 1 dimension". I tried the example code from the paper and that
> works.

> Is the 'coef' array of back-projected SVM weights in some way different than
> the masked input image? Or am I doing something else wrong? The error suggests
> that the mask array is the problem.

> The complete script is attached.

> Many thanks for your help!
> Alle Meije

> # -*- coding: utf-8 -*-
> """
> classify MR images from controls (CON) and patients (MCI)

> """

> import os
> import numpy as np
> import nibabel as nb
> import sklearn as sl

> from sklearn.feature_selection import f_classif
> from sklearn import svm

> def subj_lists( rootdir='/data/a.wink/MR', starts=['CON','MCI'] ):

>     # make and empty list for each patient group
>     slists=[]
>     for i in range(0,len(starts)):
>         slists.append([]);

>     # add subjects to the group based on 
>     for root, dirs, files in os.walk(rootdir):
>         for fname in files:
>             for i in range(0,len(starts)):
>                 if fname.startswith(starts[i]):
>                     slists[i].append(os.path.join(root,fname))     

>     print('finished building subject lists')                              
>     return slists        

> def mk_mask ( subj_lists=[] ):

>     # check whether mask can be loaded
>     mname='mask_check.nii.gz'

>     if os.path.isfile(mname):

>         msk=nb.load(mname).get_data()

>     # or must be built (values >20% in the subject images and >20% grey matter in the template)
>     else:    

>         num_im=0;
>         for i in range(0,len(subj_lists)):
>             for fname in subj_lists[i]:
>                 imdata=nb.load(fname).get_data()
>                 if 'msk' not in locals():
>                     msk=np.absolute(imdata)
>                 else:
>                     msk=msk+np.absolute(imdata)
>                 num_im+=1

>         msk/=num_im
>         msk/=np.amax(msk)    
>         msk[msk<.2]=0
>         msk[msk>0]=1

>         grey=nb.load('/usr/local/spm8/apriori/grey.nii').get_data()
>         grey[grey<.2]=0
>         grey[grey>0]=1

>         msk=msk*grey

>         img=nb.Nifti1Image(msk,np.eye(4))
>         img.to_filename(mname)

>     msk=msk.astype(bool)    

>     print('finished building mask')                              
>     return msk, mname

> def load_images( subj_lists=[], mask=[] ):

>     # load all the images, build a matrix X or subjects (rows) * inmask-voxels (columns)
>     num_im=0;
>     for i in range(0,len(subj_lists)):
>         for fname in subj_lists[i]:
>             imdata=nb.load(fname).get_data()
>             num_im+=1
>             print '\r' + str(num_im) +'\r',
>             if 'the_matrix' not in locals():
>                 the_matrix=imdata[mask].T
>             else:
>                 the_matrix=np.vstack((the_matrix,imdata[mask].T))

>     print('finished building matrix X')                              
>     return the_matrix

> def main():

>     # get the input data
>     subjlists = subj_lists()    
>     y=np.concatenate( ([-1 for _ in range(len(subjlists[0]))],
>                        [ 1 for _ in range(len(subjlists[1]))]) )
>     mask, mas_name = mk_mask(subjlists)
>     X = load_images(subjlists,mask)    
>     print "number of subjects, voxels: %d, %d" % X.shape

>     # select features 
>     featureselection=sl.feature_selection.SelectKBest(f_classif, k=8000)
>     X_reduced=featureselection.fit_transform(X,y)

>     # classify
>     clf=svm.SVC(kernel='linear')
>     clf.fit(X_reduced,y)

>     # make discrimination map
>     coef=clf.coef_
>     coef=featureselection.inverse_transform(coef)           

>     map_name='weights_check.nii.gz'
>     wmap=np.zeros(mask.shape, dtype=X.dtype)
>     wmap[mask]=coef
>     img=nb.Nifti1Image(wmap,np.eye(4))
>     img.to_filename(map_name)    

> if __name__ == "__main__":
>     main()

> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux