WP-A: A New URL Shortener

Chris Angelico rosuav at gmail.com
Tue Mar 15 19:55:52 EDT 2016


On Wed, Mar 16, 2016 at 10:38 AM, Thomas 'PointedEars' Lahn
<PointedEars at web.de> wrote:
> Chris Angelico wrote:
>
>> On Wed, Mar 16, 2016 at 9:53 AM, Thomas 'PointedEars' Lahn
>> <PointedEars at web.de> wrote:
>
>>> […] I cannot be sure because I have not thought this through, but with
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> aliases for common second-level domains, and with text compression, it
>>> should be possible to do this without a database.
>>
>> How? If you shorten URLs, you have to be able to reconstruct the long
>> ones. Compression can't do that to arbitrary lengths. Somewhere there
>> needs to be the rest of the information.
>
> First of all, you quoted me out of context.

I trimmed the context. You got a problem with that?

> Second, do you even read what you reply to?  See the markings above.

Instead of thinking about URL shorteners specifically, think generally
about information theory. You cannot, fundamentally, shorten all URLs
arbitrarily: a short code of fixed length can only distinguish a fixed
number of destinations, so there just isn't enough room in the code
itself to store the information.
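
To make that concrete, here's a toy sketch (not any real service's
code; "sho.rt" is a made-up domain): the only way to get the long URL
back from a short code is to store the mapping somewhere, and a
fixed-length code can only ever name a fixed number of destinations.

import string

ALPHABET = string.digits + string.ascii_letters  # 62 characters

def encode(n):
    """Turn a row ID into a short base-62 code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return ''.join(reversed(digits))

# The table IS the shortener; lose it and the codes mean nothing.
table = {}

def shorten(url):
    code = encode(len(table))
    table[code] = url
    return 'http://sho.rt/' + code

def expand(code):
    return table[code]  # no stored table, no way back to the long URL

print(shorten('https://mail.python.org/pipermail/python-list/'))
print(62 ** 6)  # only 56,800,235,584 six-character codes exist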

> And as for second-level domains, consider for example “t.c” instead of
> “twitter.com” as part of the short URI.

That'll work only for the domains you code in specifically, and it
shortens the URL by just 8 characters (“twitter.com” is 11 characters,
“t.c” is 3). A typical URL that needs shortening is over 80 characters
- maybe several hundred. You need to cut that down to a manageable
length, and that cut fundamentally cannot be reversed without
re-adding the missing information.
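
Rough illustration, with a made-up long URL and that hypothetical
"t.c" alias: the alias trims exactly the hostname, and the path and
query string - usually the long part - are left untouched.

url = ('https://twitter.com/some_user/status/700000000000000000'
       '?ref_src=twsrc%5Etfw&related=example')
aliased = url.replace('twitter.com', 't.c', 1)  # hypothetical alias
print(len(url), len(aliased), len(url) - len(aliased))  # saves 8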

>>> And with the exception of Twitter-ish sites that place a limit on message
>>> length, there really is *no need* for shorter URIs nowadays.  (HTTP)
>>> clients and servers are capable of processing really long ones [1];
>>> electronic communications media and related software, too [2].  And data
>>> storage space as well as data transmission has become exceptionally
>>> inexpensive.  A few less bytes there do not count.
>>
>> There are many places where there are limits (hard or soft) on message
>> lengths. Some of us still use MUDs and 80-character line limits.
>
> See above.  Covered by [2].

Unrelated. Not covered by that link. Go use a MUD some time.

> But speaking of length limits, the lines in your postings are too long,
> according to Usenet convention.  I had to correct the quotations so that
> they remained readable when word-wrapped.

Oh, so you'd rather the lines be cut to... I dunno, 80 characters?
Might be a good reason to use a URL shortener.

>> Business cards or other printed media need to be transcribed by hand.
>> Dictation of URLs becomes virtually impossible when they're
>> arbitrarily long.
>
> (You are not reading at all, are you?)  This is covered by that:
>
>>> Instead, there *is* a need for *concise*, *semantic* URIs that Web
>>> (service) users can *easily* *remember*.  It is the duty of the original
>>> Web authors/developers to make sure that there are, and I think that no
>>> kind of automation is going to ease or replace thoughtful path design
>>> anytime soon (but please, prove me wrong):
>>
>> Sure...... if you control the destination server. What if you're
>> engaging in scholarly discussion about someone else's content? You
>> can't change the canonical URLs, and you can't simply copy their
>> content to your own server (either for licensing reasons or to
>> guarantee that the official version hasn't been tampered with).
>
> That is why I said it is the duty of the original authors/developers.  It is
> a community effort, and it is not going to happen overnight.  But evading
> the problem with unreliable replacements such as “short URLs” is not going
> to solve it either.

So, you can go fight an unwinnable battle against literally every web
creator in the world. Meanwhile, I'll keep on using URL shorteners.

ChrisA


