[Baypiggies] list question

Vikram K kpguy1975 at gmail.com
Wed Apr 6 05:56:36 CEST 2011


Thanks. I tried a different way (you need Python 2.7 or Python 3.1+ for this,
since that is where collections.Counter was added):
from collections import Counter
...
...
print len(data_one)
data_one_unique = list(set(data_one))

print len(data_one_unique)
a = Counter(data_one)
b = Counter(data_one_unique)

c = a - b
print list(c.elements())


------------------
This was the output:

285
228
['DKVTIADDySDPFDAK', 'GEPEALyAAVTK', 'QHSLPSSEHLGTDGALyQVPPQPR',
'THAVSVSETDDyAEIIDEEDTYTMPSTR', 'LIEDNEyTAR', 'YMEDSTYyK', 'VyENVGLMQQQR',
'AVCSTyLQSR', 'MNHTSQAFITAASGGQPPNyER', 'ERDyAEIQDFHR', 'RTEGDyLSYR',
'NTyNQTALDIVNQFTTSQASR', 'YMEDSTyYK', 'GEPNVSyICSR', 'GEPNVSyICSR',
'SAQPSPHYMAGPSSGQIyGPGPR', 'TACTNFMMTPyVVTR', 'SDNNySTLNER', 'TVCSTyLQSR',
'SLDNNySTLNER', 'GLCTSPAEHQYFMTEyVATR', 'GLCTSPAEHQYFMTEyVATR',
'TPYEAyDPIGK', 'NLSEGNNANYTEyVATR', 'ELFDDPSyVNIQNLDK', 'LVQSPNSyFMDVK',
'VADPDHDHTGFLtEyVATR', 'TADSVFCPHyEK', 'LWLEAMDGKEPIyTLPAIISK',
'VVQEYIDAFSDyANFK', 'VEKIGEGTyGVVYK', 'AGKGESAGyMEPYEAQR', 'YVDSEGHLyTVPIR',
'KIYNGDyYR', 'LSHSSGyAQLNTYSR', 'STTNyVDFYSTK', 'IEKIGEGtyGVVYK',
'TLEPVKPPTVPNDyMTSPAR', 'IEKIGEGTyGVVYK', 'VGQGYVYEAAQTEQDEyDTPR',
'TAGTSFMMTPyVVTR', 'TAGTSFMMTPyVVTR', 'NEEENIySVPHDSTQGK',
'LCDFGSASHVADNDITPyLVSR', 'LCDFGSASHVADNDITPyLVSR', 'GPLDGSPyAQVQR',
'GPLDGSPyAQVQR', 'FLEENSSDPTyTSSLGGKIPIR', 'HAAyGGYSTPEDR',
'VADPDHDHTGFLTEyVATR', 'HLLAPGPQDIyDVPPVR', 'LTDSKEDPIyDEPEGLAPAPPR',
'HTDDEMTGyVATR', 'HTDDEMTGyVATR', 'IYQyIQSR', 'IYQyIQSR',
'VLEDDPEATyTTSGGK']

----------

I then assigned the output list to z and made that list unique:

z = ['DKVTIADDySDPFDAK', 'GEPEALyAAVTK', 'QHSLPSSEHLGTDGALyQVPPQPR',
'THAVSVSETDDyAEIIDEEDTYTMPSTR', 'LIEDNEyTAR', 'YMEDSTYyK', 'VyENVGLMQQQR',
'AVCSTyLQSR', 'MNHTSQAFITAASGGQPPNyER', 'ERDyAEIQDFHR', 'RTEGDyLSYR',
'NTyNQTALDIVNQFTTSQASR', 'YMEDSTyYK', 'GEPNVSyICSR', 'GEPNVSyICSR',
'SAQPSPHYMAGPSSGQIyGPGPR', 'TACTNFMMTPyVVTR', 'SDNNySTLNER', 'TVCSTyLQSR',
'SLDNNySTLNER', 'GLCTSPAEHQYFMTEyVATR', 'GLCTSPAEHQYFMTEyVATR',
'TPYEAyDPIGK', 'NLSEGNNANYTEyVATR', 'ELFDDPSyVNIQNLDK', 'LVQSPNSyFMDVK',
'VADPDHDHTGFLtEyVATR', 'TADSVFCPHyEK', 'LWLEAMDGKEPIyTLPAIISK',
'VVQEYIDAFSDyANFK', 'VEKIGEGTyGVVYK', 'AGKGESAGyMEPYEAQR', 'YVDSEGHLyTVPIR',
'KIYNGDyYR', 'LSHSSGyAQLNTYSR', 'STTNyVDFYSTK', 'IEKIGEGtyGVVYK',
'TLEPVKPPTVPNDyMTSPAR', 'IEKIGEGTyGVVYK', 'VGQGYVYEAAQTEQDEyDTPR',
'TAGTSFMMTPyVVTR', 'TAGTSFMMTPyVVTR', 'NEEENIySVPHDSTQGK',
'LCDFGSASHVADNDITPyLVSR', 'LCDFGSASHVADNDITPyLVSR', 'GPLDGSPyAQVQR',
'GPLDGSPyAQVQR', 'FLEENSSDPTyTSSLGGKIPIR', 'HAAyGGYSTPEDR',
'VADPDHDHTGFLTEyVATR', 'HLLAPGPQDIyDVPPVR', 'LTDSKEDPIyDEPEGLAPAPPR',
'HTDDEMTGyVATR', 'HTDDEMTGyVATR', 'IYQyIQSR', 'IYQyIQSR',
'VLEDDPEATyTTSGGK']
>>> z_unique = list(set(z))
>>> len(z)
57
>>> len(z_unique)
50
>>>

It appears that if a non-unique element occurs twice, then it appears only
once in the output, but if it occurs three times then the output
(print list(c.elements())) contains it twice, and so on. In other words,
a - b removes one occurrence of each distinct element, which is why z still
has duplicates (57 entries, 50 of them unique).
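
A small sketch of the behaviour described above, in case it helps. The helper
name duplicate_report and the toy list are made up for illustration; they are
not from the original data. It returns both the "extra" occurrences that the
Counter subtraction produces and each non-unique element listed only once:

```python
from collections import Counter

def duplicate_report(items):
    """Return (extra occurrences, distinct non-unique items), both sorted."""
    counts = Counter(items)
    # Subtracting a Counter of the unique items removes one occurrence of
    # each distinct item, so an item seen n times is left with n - 1.
    extras = counts - Counter(set(items))
    # Filtering on the count lists every repeated item exactly once.
    distinct = sorted(k for k, n in counts.items() if n > 1)
    return sorted(extras.elements()), distinct

extras, distinct = duplicate_report(['cat', 'dog', 'dog', 'emu', 'emu', 'emu'])
print(extras)    # ['dog', 'emu', 'emu']
print(distinct)  # ['dog', 'emu']
```

Filtering on the count directly avoids the second set() pass over the
subtraction result.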




On Tue, Apr 5, 2011 at 11:01 PM, Brian Palmer <bpalmer at gmail.com> wrote:

>
> On Tue, Apr 5, 2011 at 7:49 PM, Vikram K <kpguy1975 at gmail.com> wrote:
>
>> i have this list:
>> x = ['cat','dog','dog']
>>
>> i wish to identify the non-unique element in this list i.e. 'dog'. how do
>> i do this?
>>
>
> This may not be suitable depending on how big your list is, but consider
>
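> from collections import defaultdict
>
> x = ['cat', 'dog', 'dog']
> x_count = defaultdict(lambda: 0)
> for k in x:
>   x_count[k] = x_count[k] + 1  # count into x_count, not the list x
> unique_xs = [k for k in x if x_count[k] == 1]
> non_unique_xs = [k for k in x_count if x_count[k] > 1]  # the repeated items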
>
>