syntax difference

Mon Jun 18 08:18:00 EDT 2018

On Mon, Jun 18, 2018 at 10:07 PM, Bart <bc at freeuk.com> wrote:
> On 18/06/2018 12:33, Chris Angelico wrote:
>>
>> On Mon, Jun 18, 2018 at 9:16 PM, Bart <bc at freeuk.com> wrote:
>
>
>>> What will those look like? If copyright/licence comments have their own
>>> specific syntax, then they just become another token which has to be
>>> recognised.
>>
>>
>> If they have specific syntax, they're not comments, are they?
>
>
> So how is it possible for ANY program to determine what kind of comments
> they are?
>
> I've used 'smart' comments myself, which contain special information, but
> are also designed to be very easily detected by the simplest of programs
> which scan the source code. For that purpose, they might start with a
> special prefix so that it is not necessary to parse the special information,
> but just to detect the prefix.
>
> For example, comments that start with #T# (and in my case, that begin at the
> start of a line). Funnily enough, this also provided type information
> (although for different purposes than what is discussed here).

Oh, a specific format of comments? Sorry, I misunderstood.

Yes, it is certainly possible to start with a program that removes all
comments, and then refine it to one which only removes those which
(don't) match some pattern. That's definitely a possibility.
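
As a rough sketch of that refinement (not a finished tool): the
function below uses the standard tokenize module to find comments, so a
'#' inside a string literal is not mistaken for one. The name
strip_comments and its should_remove parameter are made up for this
example, and the #T# prefix is the one suggested above.

```python
import io
import tokenize


def strip_comments(source, should_remove=lambda text: True):
    """Return source with the selected comments removed.

    should_remove (a hypothetical hook for this sketch) receives the
    comment text, including the leading '#', and returns True if that
    comment should be dropped. By default every comment is dropped.
    """
    # Read the lines the same way the tokenizer does, so row numbers agree.
    lines = io.StringIO(source).readlines()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and should_remove(tok.string):
            row, col = tok.start  # rows are 1-based, columns 0-based
            line = lines[row - 1]
            # Preserve whatever line ending the original line had.
            ending = line[len(line.rstrip("\r\n")):]
            # A comment always runs to the end of its physical line.
            lines[row - 1] = line[:col].rstrip() + ending
    return "".join(lines)


src = "x = 1  # plain\n#T# int\ny = 2\n"
everything_gone = strip_comments(src)
typed_kept = strip_comments(src, lambda c: not c.startswith("#T#"))
```

The first call removes both comments; the second keeps only the one
carrying the #T# prefix. Comment-only lines are left behind as blank
lines so that line numbers do not shift.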

> The subject is type annotation. Presumably there is some way to distinguish
> such a type annotation within a comment from a regular comment? Such as the
> marker I suggested above.
>
> Then the tokeniser just needs to detect that kind of comment rather than
> need to understand the contents.
>
> Although the tokeniser will need to work a little differently by maintaining
> the positions of all tokens within the line, information that is usually
> discarded.

Pretty much, yes. It's going to end up basically in the same place, though:

1) Parse the code, keeping all the non-essential parts as well as the
essential parts.
2) Find the comments, or find the annotations.
3) If comments, figure out if they're the ones you want to remove.
4) Reconstruct the file without the bits you want to remove.

Step 3 is removed if you're using syntactic annotations. Otherwise,
the two approaches are identical.
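
For the syntactic case, here is a minimal sketch under some loud
assumptions: it only handles annotations on function parameters, only
when the parameter sits on a single line, and it needs Python 3.8+ for
the end positions on AST nodes. Return annotations and annotated
assignments are left untouched. The function name is invented for the
example. The point is that the parser says which spans are annotations,
so there is nothing to guess at, and everything else (comments
included) is copied through by position.

```python
import ast


def strip_parameter_annotations(source):
    """Remove annotations from function parameters, leaving all else alone."""
    # ast column offsets count UTF-8 bytes, so edit the encoded lines.
    lines = source.encode("utf-8").splitlines(keepends=True)
    spans = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.arg) and node.annotation is not None:
            # Sketch limitation: skip parameters that span several lines.
            if node.lineno == node.end_lineno:
                spans.append(
                    (node.lineno, node.col_offset, node.end_col_offset, node.arg)
                )
    # Work right to left so earlier offsets on a line stay valid.
    for lineno, start, end, name in sorted(spans, reverse=True):
        line = lines[lineno - 1]
        # The arg node covers "name: annotation"; keep only the name.
        lines[lineno - 1] = line[:start] + name.encode("utf-8") + line[end:]
    return b"".join(lines).decode("utf-8")


before = "def f(x: int, y: str = 'a'):  # keep me\n    return x\n"
after = strip_parameter_annotations(before)
```

Here the comment on the def line survives, because the file is rebuilt
from the original text rather than regenerated from the AST.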

ChrisA