Some Issues on Tagging Text

Ben Finney ben+python at benfinney.id.au
Fri May 25 19:28:12 EDT 2018


Cameron Simpson <cs at cskk.id.au> writes:

> On 25May2018 04:23, Subhabrata Banerjee <subhabangalore at gmail.com> wrote:
> >On Friday, May 25, 2018 at 3:59:57 AM UTC+5:30, Cameron Simpson wrote:
> >> If you want to solve this problem with a programme you must first
> >> clearly define what makes an unwanted tag "unwanted". [...]
> >
> >By unwanted I did not mean anything so intricate.
> >Unwanted meant things I did not want.
>
> That much was clear, but you need to specify in your own mind
> _precisely_ what makes some things unwanted and others wanted. Without
> concrete criteria you can't write code to implement those criteria.

Importantly, “define” means more than just coming up with examples.

    To determine with precision; to mark out with distinctness; to
    ascertain or exhibit clearly.

    <URL:https://en.wiktionary.org/wiki/define>

Before you can write code that will *reliably* select those parts you
want and exclude those parts you don't want, you need to define
precisely what should be matched, in a way that also excludes
everything that should not be matched.

Come up with statements about what you want, and ask “if *any* text
matches this, does that necessarily mean it is wanted?”

Then do exactly the same in reverse: “if *any* text fails to match this,
does that necessarily mean it is unwanted?”

Keep refining your statement until it is precise enough that you can say
“yes” to both those questions.

Then, you have a statement that is precise enough to write tests for,
and therefore to write in code.
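
As a minimal sketch of what that can look like in Python: suppose,
purely for illustration, that your refined statement ended up as “a tag
is wanted exactly when it is one of a fixed set of names”. The tag
names, `WANTED_TAGS`, `is_wanted` and `select_wanted` below are all
hypothetical; nothing in this thread specifies them. The two test
functions correspond to the two questions above.

```python
# Hypothetical definition, for illustration only: a tag is wanted
# exactly when it is one of these names. Substitute your own statement.
WANTED_TAGS = frozenset({"PERSON", "LOCATION", "ORGANIZATION"})


def is_wanted(tag):
    """Return True exactly when `tag` satisfies the stated definition."""
    return tag in WANTED_TAGS


def select_wanted(tagged_words):
    """Keep only the (word, tag) pairs whose tag is wanted."""
    return [(word, tag) for (word, tag) in tagged_words if is_wanted(tag)]


def test_matching_means_wanted():
    # "If *any* text matches this, is it necessarily wanted?"
    for tag in WANTED_TAGS:
        assert is_wanted(tag)


def test_not_matching_means_unwanted():
    # "If *any* text fails to match this, is it necessarily unwanted?"
    # Under this definition, case matters and the empty tag is unwanted.
    for tag in ["O", "DATE", "person", ""]:
        assert not is_wanted(tag)


if __name__ == "__main__":
    test_matching_means_wanted()
    test_not_matching_means_unwanted()
    sample = [("Ben", "PERSON"), ("writes", "O"), ("Python", "LANGUAGE")]
    print(select_wanted(sample))
```

If one of those tests surprises you (say, you actually did want the
lower-case "person" tag), then the statement is not yet precise enough,
and it is the statement that needs refining first, not the code.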

-- 
 \       “I have always wished for my computer to be as easy to use as |
  `\       my telephone; my wish has come true because I can no longer |
_o__)          figure out how to use my telephone.” —Bjarne Stroustrup |
Ben Finney



