How to test characters of a string

Avi Gross avigross at verizon.net
Thu Jun 9 13:33:44 EDT 2022


Dave,

Sometimes a task is done faster by NOT programming anything in any language!

Not only have you spent a lot of your own time but many dozens of messages here have dragged in others, who gain nothing ;-)

The domain you are operating in seems to have lots of variants in how the titles are stored as names and you keep finding new variants. Yes, if your goal is to use this as a way to learn more in general about Python, clearly it may meet that goal!

Contrary to what I (and some others) said earlier, it may be time to consider regular expressions and other heavier artillery! LOL!

I do not plan on sticking with your twists and turns but will quickly address your variants.

Sides that come in twos, like records, are presumably limited to A and B in your example. But with multiple disks, the number that follows can have two digits or even more. Your Python code may need lots of functions that you create, each matching some pattern and returning some value, and perhaps other functions that know how to convert from that format to a common canonical format of your own so the results can be compared.

Your main code may need to try them in various sequences till it finds a match and so on.
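
A rough sketch of that idea follows: one small function per filename pattern, tried in order until one matches. The function names and the regular expressions are my own guesses, based only on the examples quoted below ("1-01 Track Name", "A1-Track Name", "02 Track Name"), so treat them as placeholders rather than a finished design.

```python
import re

# Each parser returns (disc_or_side, track_number, title) or None.
# The patterns are assumptions drawn from the examples in this thread.

def parse_disc_track(name):
    """'1-01 Track Name' -> ('1', 1, 'Track Name')"""
    m = re.match(r"(\d+)-(\d+)\s+(.+)", name)
    return (m.group(1), int(m.group(2)), m.group(3)) if m else None

def parse_side_track(name):
    """'A1-Track Name' -> ('A', 1, 'Track Name')"""
    m = re.match(r"([AB])(\d+)-\s*(.+)", name)
    return (m.group(1), int(m.group(2)), m.group(3)) if m else None

def parse_plain_track(name):
    """'02 Track Name' -> (None, 2, 'Track Name')"""
    m = re.match(r"(\d{2})\s+(.+)", name)
    return (None, int(m.group(1)), m.group(2)) if m else None

# Order matters: the more specific patterns are tried first.
PARSERS = [parse_disc_track, parse_side_track, parse_plain_track]

def to_canonical(name):
    """Try each parser in turn; fall back to the name unchanged."""
    for parser in PARSERS:
        result = parser(name)
        if result is not None:
            return result
    return (None, None, name)

print(to_canonical("2-02  Track Name"))
print(to_canonical("B2-Track Name"))
print(to_canonical("44 Minutes"))
```

Note that the last call treats "44" as a track number, which is exactly the false positive raised earlier in the thread. Pattern matching on the filename alone cannot tell the difference, so some check against other data is still needed.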

But when you are done, in what format do you save them? The original or your minimal? 

Still confusing to me, as someone who does not give a darn, is the reality that many songs may have the same name but be different, as in a song from Sinatra when he was young and a later recording with a different orchestra or by a Sinatra imitator. They may all be titled something like "New York, New York" or "NEW YORK -- NEW YORK", which your algorithm folds into the same characters.

So I am guessing you also need to access other data, such as who sings it or what year it was released, to make comparisons. At some point you may want to create or borrow some sort of class/object that encapsulates your data along with methods that do things like make a canonical version of the title. A way to ask whether object A is reasonably equal to object B comes along if you define an __eq__ method for that class.
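
For what it is worth, here is a minimal sketch of such a class. The name Track, its fields, and the folding rule (lowercase, with punctuation and whitespace collapsed) are all assumptions of mine; what counts as "reasonably equal" is a decision only you can make.

```python
import re

def fold(text):
    """Lowercase and collapse each run of punctuation/whitespace to one space."""
    return re.sub(r"[\W_]+", " ", text.lower()).strip()

class Track:
    """Hypothetical container for a title plus the data that disambiguates it."""

    def __init__(self, title, artist, year=None):
        self.title = title
        self.artist = artist
        self.year = year

    def canonical_title(self):
        return fold(self.title)

    def _key(self):
        # Two tracks are treated as equal only if all three parts agree.
        return (self.canonical_title(), fold(self.artist), self.year)

    def __eq__(self, other):
        if not isinstance(other, Track):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Defining __eq__ removes the default hash, so restore a matching one.
        return hash(self._key())

# The years are placeholders for illustration, not discography facts.
a = Track("New York, New York", "Frank Sinatra", 1980)
b = Track("NEW YORK -- NEW YORK", "frank sinatra", 1980)
c = Track("New York, New York", "Frank Sinatra", 1993)
print(a == b)   # same folded title, artist and year
print(a == c)   # same title and artist, different year
```

The titles of all three fold to the same string, so it is only the extra fields that keep the third one distinct.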

It might take you years and need periodic refining as you encounter ever more varied ways people have chosen to label their music, but so what? LOL!

Humor or sarcasm aside, your incremental exploratory method reminds me why it is a good idea to first scope out the outlines of your problem space, make some key decisions, and write out a fairly detailed set of requirements before seriously making more than prototypes. You might get very different help from people if they understood that your first request was far from complete and only one of many that might better be worked on some other way.

And I wonder if you did any search of the internet, before asking here, to see if anyone had done anything similar in Python (or another language) that may handle parts of what you need. I note that lots of people who come with what they consider a good programming background have to adjust to aspects of languages like Python, as what they know is in some ways wrong or inadequate in a new environment.

-----Original Message-----
From: Dave <dave at looktowindward.com>
To: python-list at python.org
Sent: Thu, Jun 9, 2022 2:50 am
Subject: Re: How to test characters of a string

Hi,

I’ve found you also need to take care of multiple disk CD releases. These have a format of

“1-01 Track Name”
“2-02 Track Name”

Meaning Disk 1, Track 1; Disk 2, Track 2.

Also A and B Sides (from Vinyl LPs)

“A1-Track Name”
“B2-Track Name”

Side A, Track 1, etc.

Cheers
Dave


> On 8 Jun 2022, at 19:36, Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> 
> On Wed, 8 Jun 2022 01:53:26 +0000 (UTC), Avi Gross <avigross at verizon.net>
> declaimed the following:
> 
> 
>> 
>> So is it necessary to insist on an exact pattern of two digits followed by a space? 
>> 
>> 
>> That would fail on "44 Minutes", "40 Oz. Dream", "50 Mission Cap", "50 Ways to Say Goodbye", "99 Ways to Die" 
>> 
>> It looks to me like you need to compare TWICE just in case. If it matches in the original (perhaps with some normalization of case and whitespace), fine. If not, will they match if one or both have something to remove as a prefix, such as "02 "? And if you are comparing items where the same song is in two different numeric sequences on different disks, ...
> 
>     I suspect the OP really needs to extract the /track number/ from the
> ID3 information, and (converting to a 2digit formatted string) see if the
> file name begins with that track number... The format of those
> filenames appears to be that generated by some software when ripping CDs to
> MP3s -- for example:
> 
> -=-=-
> c:\Music\Roger Miller\All Time Greatest Hits>dir
> Volume in drive C is OS
> Volume Serial Number is 4ACC-3CB4
> 
> Directory of c:\Music\Roger Miller\All Time Greatest Hits
> 
> 04/11/2022  05:06 PM    <DIR>          .
> 04/11/2022  05:06 PM    <DIR>          ..
> 07/26/2018  11:20 AM        4,493,279 01 Dang Me.mp3
> 07/26/2018  11:20 AM        5,072,414 02 Chug-A-Lug.mp3
> 07/26/2018  11:20 AM        4,275,844 03 Do-Wacka-Do.mp3
> 07/26/2018  11:20 AM        4,284,208 04 In the Summertime.mp3
> 07/26/2018  11:20 AM        6,028,730 05 King of the Road.mp3
> 07/26/2018  11:20 AM        4,662,182 06 You Can't Roller Skate in a Buffalo Herd.mp3
> 07/26/2018  11:20 AM        5,624,704 07 Engine, Engine #9.mp3
> 07/26/2018  11:20 AM        5,002,492 08 One Dyin' and a Buryin'.mp3
> 07/26/2018  11:21 AM        6,799,224 09 Last Word in Lonesome Is Me.mp3
> 07/26/2018  11:21 AM        5,637,230 10 Kansas City Star.mp3
> 07/26/2018  11:21 AM        4,656,910 11 England Swings.mp3
> 07/26/2018  11:21 AM        5,836,638 12 Husbands and Wives.mp3
> 07/26/2018  11:21 AM        5,470,216 13 I've Been a Long Time Leavin'.mp3
> 07/26/2018  11:21 AM        6,230,236 14 Walkin' in the Sunshine.mp3
> 07/26/2018  11:21 AM        6,416,060 15 Little Green Apples.mp3
> 07/26/2018  11:21 AM        9,794,442 16 Me and Bobby McGee.mp3
> 07/26/2018  11:22 AM        7,330,642 17 Where Have All the Average People Gone.mp3
> 07/26/2018  11:22 AM        7,334,752 18 South.mp3
> 07/26/2018  11:22 AM        6,981,924 19 Tomorrow Night in Baltimore.mp3
> 07/26/2018  11:22 AM        9,353,872 20 River in the Rain.mp3
>              20 File(s)    121,285,999 bytes
>              2 Dir(s)  295,427,198,976 bytes free
> 
> c:\Music\Roger Miller\All Time Greatest Hits>
> -=-=-
> 
>     Untested (especially the ID3 "variable" -- substitute variables as
> needed to match the original code):
> 
> >>> id3Track = 2
> >>> track_number = "%2.2d " % id3Track
> >>> track_number
> '02 '
> >>> filename = "02 This is the life.mp3"
> >>> if filename.startswith(track_number):
> ...     nametitle = filename[3:]
> ... else:
> ...     nametitle = filename
> ...
> >>> if nametitle.endswith(".mp3"):
> ...     nametitle = nametitle[:-4]
> ...
> >>> nametitle
> 'This is the life'
> 
>     Handling ASCII ' and " vs Unicode "smart" quotes is a different matter.
> 
>     One may still run the risk of having a filename without a track number
> BUT having a number that just manages to match the track number. To account
> for that I'd suggest using the sequence:
> 
> *    Strip extension (if filename.lower().endswith(".mp3"): ...)
> *    Handle any Unicode/ASCII quotes in both filename AND ID3 track title
> *    Compare filename and title.
> *        IF MATCHED -- done
> *        IF NOT MATCHED
> *            Format ID3 track number as shown above
> *            Compare filename to (formatted track number + track title)
> *                IF MATCHED -- done
> *                IF NOT MATCHED
> *                    Log full filename and ID3 track title/track number to a log for later examination.
> 
> 
> 
> -- 
>     Wulfraed                Dennis Lee Bieber        AF6VN
>     wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/
> -- 
> https://mail.python.org/mailman/listinfo/python-list
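
The comparison sequence outlined in the quoted message from Dennis could be sketched roughly as below. This is only an illustration: the function names are made up, the ID3 title and track number are passed in as plain arguments because no particular ID3 library is assumed, and the f-string format is meant to give the same two-digit prefix as the "%2.2d " shown in the quoted session.

```python
import logging

# Map Unicode "smart" quotes to their ASCII equivalents.
SMART_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})

def normalize_quotes(text):
    return text.translate(SMART_QUOTES)

def filename_matches(filename, id3_title, id3_track):
    """Return True if the filename matches the title, with or without
    a two-digit track-number prefix; log the mismatch otherwise."""
    # Strip the extension, ignoring case.
    name = filename[:-4] if filename.lower().endswith(".mp3") else filename
    # Handle quotes in both the filename and the ID3 title.
    name = normalize_quotes(name)
    title = normalize_quotes(id3_title)
    # First comparison: the bare title.
    if name == title:
        return True
    # Second comparison: formatted track number + title.
    if name == f"{id3_track:02d} {title}":
        return True
    logging.warning("no match: file=%r title=%r track=%r",
                    filename, id3_title, id3_track)
    return False

print(filename_matches("02 This is the life.mp3", "This is the life", 2))
print(filename_matches("44 Minutes.mp3", "44 Minutes", 7))
```

Because the bare title is compared first, a title such as "44 Minutes" is accepted without any attempt to strip its leading number.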


