WP-A: A New URL Shortener

Gene Heskett gheskett at wdtv.com
Tue Mar 15 22:34:18 EDT 2016


On Tuesday 15 March 2016 19:55:52 Chris Angelico wrote:

> On Wed, Mar 16, 2016 at 10:38 AM, Thomas 'PointedEars' Lahn
>
> <PointedEars at web.de> wrote:
> > Chris Angelico wrote:
> >> On Wed, Mar 16, 2016 at 9:53 AM, Thomas 'PointedEars' Lahn
> >>
> >> <PointedEars at web.de> wrote:
> >>> […] I cannot be sure because I have not thought this through, but
> >>> with
> >
> >                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> >>> aliases for common second-level domains, and with text
> >>> compression, it should be possible to do this without a database.
> >>
> >> How? If you shorten URLs, you have to be able to reconstruct the
> >> long ones. Compression can't do that to arbitrary lengths.
> >> Somewhere there needs to be the rest of the information.
> >
> > First of all, you quoted me out of context.
>
> I trimmed the context. You got a problem with that?
>
> > Second, do you even read what you reply to?  See the markings above.
>
> Instead of thinking about URL shorteners specifically, think generally
> about information theory. You cannot, fundamentally, shorten all URLs
> arbitrarily. There just isn't enough room to store the information.
>
> > And as for second-level domains, consider for example “t.c” instead
> > of “twitter.com” as part of the short URI.
>
> That'll work only for the ones that you code in specifically, and
> that's only shortening your URL by 8 characters. A typical URL needing
> shortening is over 80 characters - maybe several hundred. You need to
> cut that down to a manageable length. That fundamentally cannot be
> reversed without readding information.
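
To put rough numbers on the two points quoted above, here is a small
sketch. The alias table and the helper names (ALIASES, alias_host,
strings_up_to) are made up for illustration and are not from anyone's
actual shortener. It shows that a host alias is reversible but saves
only a fixed amount, and that there are far fewer short strings than
long ones, so not every long URL can map to its own short one without
the rest of the information being stored somewhere.

```python
import string

# Hypothetical alias table; only hosts listed here can be shortened.
ALIASES = {"twitter.com": "t.c"}
ALPHABET = string.ascii_letters + string.digits  # 62 characters


def alias_host(url):
    """Replace a known host with its alias (reversible, fixed saving)."""
    for host, short in ALIASES.items():
        url = url.replace("//" + host, "//" + short, 1)
    return url


def strings_up_to(length, alphabet_size=len(ALPHABET)):
    """Count the distinct non-empty strings of at most `length` chars."""
    return sum(alphabet_size ** n for n in range(1, length + 1))


long_url = "https://twitter.com/" + "x" * 80
saved = len(long_url) - len(alias_host(long_url))
print(saved)                                 # 8
print(strings_up_to(7) < strings_up_to(80))  # True
```

The count for 7 characters is a little under 3.6e12, which is plenty
of keys for a lookup table but nowhere near the number of possible
80-character URLs.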

And I submit that putting someone in charge of the drive's
organization, and of the database on that drive that the URL lookup has
to dig through, can make a huge difference in the length of the
resulting URL.
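
As a minimal sketch of the database-backed approach, assuming a plain
dict stands in for the real database on disk: each new URL gets the
next integer id, and the short key is that id written in base 62. The
names Shortener and encode_id are hypothetical, chosen for this
example only.

```python
import string

ALPHABET = string.digits + string.ascii_letters  # 62 characters


def encode_id(n):
    """Encode a non-negative integer in base 62."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Shortener:
    """Toy shortener; the two dicts stand in for the database."""

    def __init__(self):
        self._by_key = {}
        self._by_url = {}

    def shorten(self, url):
        # Reuse the existing key if this URL was stored before.
        if url in self._by_url:
            return self._by_url[url]
        key = encode_id(len(self._by_key))
        self._by_key[key] = url
        self._by_url[url] = key
        return key

    def expand(self, key):
        return self._by_key[key]


s = Shortener()
key = s.shorten("http://example.com/" + "a" * 200)
print(key)  # 0
```

The key length grows with the number of stored URLs, not with the
length of any one of them, which is where the organization of the
database pays off.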

> >>> And with the exception of Twitter-ish sites that place a limit on
> >>> message length, there really is *no need* for shorter URIs
> >>> nowadays.  (HTTP) clients and servers are capable of processing
> >>> really long ones [1]; electronic communications media and related
> >>> software, too [2].  And data storage space as well as data
> >>> transmission has become exceptionally inexpensive.  A few less
> >>> bytes there do not count.

They may not count for that much in terms of what the user pays for 
bandwidth, but see below.  And some users are probably still paying for 
their internet access by the minute in some locales.

> >> There are many places where there are limits (hard or soft) on
> >> message lengths. Some of us still use MUDs and 80-character line
> >> limits.
> >
> > See above.  Covered by [2].
>
> Unrelated. Not covered by that link. Go use a MUD some time.
>
> > But speaking of length limits, the lines in your postings are too
> > long, according to Usenet convention.  I had to correct the
> > quotations so that they remained readable when word-wrapped.
>
> Oh, so you'd rather the lines be cut to... I dunno, 80 characters?
> Might be a good reason to use a URL shortener.
>
Usenet generally encourages us to set our word wrap at 72 to 73
characters so there is room for the inevitable additions of the quote >
character, so we can track who said what.  That is just common good
practice.
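
For anyone scripting their postings, a short sketch using the standard
textwrap module. The function names and the "> " prefix are
assumptions for this example. Wrapping at 72 leaves room for several
levels of quoting before a line passes 80 columns.

```python
import textwrap


def wrap_for_usenet(text, width=72):
    """Re-flow a paragraph so that no line exceeds `width` columns."""
    return textwrap.fill(text, width=width)


def quote(text, prefix="> "):
    """Prefix every line with a quote marker, as a reply would."""
    return "\n".join((prefix + line).rstrip() for line in text.splitlines())


paragraph = "word " * 60
wrapped = wrap_for_usenet(paragraph)
thrice_quoted = quote(quote(quote(wrapped)))
print(max(len(line) for line in thrice_quoted.splitlines()) <= 78)  # True
```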

> >> Business cards or other printed media need to be transcribed by
> >> hand. Dictation of URLs becomes virtually impossible when they're
> >> arbitrarily long.

OTOH, URLs in excess of 250 characters exist only to polish the egos
of the people involved, or to demonstrate that they could not organize
a company picnic in a 4 person company.

Few enough recognize that problem and post their URLs in the form
<url>, which most email agents recognize as a URL.  When you click on
it, the agent goes through it before presenting it to a browser,
stripping out the line feeds and carriage returns, so that the original
as pasted, and then wrecked by the emailer's word wrapping, is restored
and has at least a snowball's chance in hell of working.
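
Roughly what such an agent has to do, as a sketch only: real mail
clients differ, and the pattern below (http or https inside angle
brackets) is a simplifying assumption, as are the names BRACKETED_URL
and extract_urls.

```python
import re

# Matches <http://...> or <https://...>, even when the URL has been
# broken across several lines by word wrapping.
BRACKETED_URL = re.compile(r"<(https?://[^<>]*)>")


def extract_urls(body):
    """Return bracketed URLs with all embedded whitespace removed."""
    return [re.sub(r"\s+", "", m.group(1))
            for m in BRACKETED_URL.finditer(body)]


body = "See <http://example.com/a/very/\r\nlong/path?x=1> for details.\n"
print(extract_urls(body))  # ['http://example.com/a/very/long/path?x=1']
```

Without the brackets there is no reliable way to tell where a wrapped
URL ends and the following text begins.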

But you can't teach a winderz user to do that any better than you can 
break them from top posting.

> > (You are not reading at all, are you?)  This is covered by that:
> >>> Instead, there *is* a need for *concise*, *semantic* URIs that Web
> >>> (service) users can *easily* *remember*.  It is the duty of the
> >>> original Web authors∕developers to make sure that there are, and I
> >>> think that no kind of automation is going to ease or replace
> >>> thoughtful path design anytime soon (but please, prove me wrong):
> >>
> >> Sure...... if you control the destination server. What if you're
> >> engaging in scholarly discussion about someone else's content? You
> >> can't change the canonical URLs, and you can't simply copy their
> >> content to your own server (either for licensing reasons or to
> >> guarantee that the official version hasn't been tampered with).
> >
> > That is why I said it is the duty of the original
> > authors/developers.  It is a community effort, and it is not going
> > to happen overnight.  But evading the problem with unreliable
> > replacements such as “short URLs” is not going to solve it either.

True, it's fixing the wrong end of the problem.

> So, you can go fight an unwinnable battle against literally every web
> creator in the world. Meanwhile, I'll keep on using URL shorteners.
>
> ChrisA

Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Gene's Web page <http://geneslinuxbox.net:6309/gene>


