[Archiver-dev] UpLib and archiving

Bill Janssen janssen at parc.com
Tue Oct 19 08:18:39 CEST 2010


Earl, good input.

Earl Hood <earl at earlhood.com> wrote:

> Sanitizing javascript should be the default behavior.  This is a
> major XSS exploit, and if you want others to utilize your software
> for their sites, they will open their site to XSS if this
> is not done.

Probably not, because this isn't a typical Web server.

> > As for the Content-Disposition filenames: UpLib runs its own
> > content-type determiner over the content to try to see what it is rather
> > than just relying on the filename, though it will fall back to the
> > filename if it can't figure it out.  And I've hardcoded some typical
> > situations.
> 
> Falling back to filename should be a configurable option, and
> it should be disabled by default.

That could easily be done.
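
Roughly, the option could look like the sketch below. This is only an illustration, not UpLib's actual determiner: the function name, the `allow_filename_fallback` flag, and the tiny magic-byte table are all hypothetical. The content is sniffed first, and the sender-supplied filename is consulted only if the fallback has been explicitly enabled.

```python
import mimetypes

# Hypothetical magic-byte table; a real determiner would know many more formats.
MAGIC_PREFIXES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def determine_content_type(data, filename=None, allow_filename_fallback=False):
    """Return a MIME type for an attachment, preferring the content itself."""
    # First choice: look at the leading bytes of the attachment.
    for prefix, mime_type in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return mime_type
    # Second choice: trust the filename, but only if the site opted in.
    if allow_filename_fallback and filename:
        guessed, _encoding = mimetypes.guess_type(filename)
        if guessed:
            return guessed
    # Otherwise treat the attachment as opaque bytes.
    return "application/octet-stream"
```

With the flag left at its default, an unrecognized attachment is treated as opaque bytes no matter what its filename claims.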

> I recommend that all attachments be saved into an attachments
> area so you can place restrictive web server configuration
> settings on it.  This approach assumes you serve up attachment
> data directly via the file system via standard HTTP server
> retrieval.  If you serve up attachments via custom
> web service (e.g. servlet, CGI), then filenaming concerns of
> attachments are not as critical.

Right.  The HTTP interface to UpLib is not a typical Web server -- it
doesn't allow direct access to files.  All access is mediated through my
code.
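
To make "mediated" concrete, here is a minimal sketch of the general idea, not the real UpLib code; the `AttachmentStore` class and its methods are made up for illustration. Clients ask for an opaque document id, the server maps it to a path it chose itself, and anything that would resolve outside the storage root is refused.

```python
import os


class AttachmentStore:
    """Hypothetical id-to-file lookup; clients never supply a path."""

    def __init__(self, root):
        self.root = os.path.realpath(root)
        self._index = {}  # document id -> relative path chosen by the server

    def register(self, doc_id, relative_path):
        self._index[doc_id] = relative_path

    def resolve(self, doc_id):
        """Return the absolute path for doc_id, or None if it is not servable."""
        relative_path = self._index.get(doc_id)
        if relative_path is None:
            return None  # unknown id: nothing is looked up on disk
        full_path = os.path.realpath(os.path.join(self.root, relative_path))
        # Refuse anything that escapes the storage root.
        if os.path.commonpath([self.root, full_path]) != self.root:
            return None
        return full_path
```

Because the request carries only an id, the attachment's original filename never influences which file is read.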

> >> * Email address obfuscation.  Obviously we'd want to support that, but using
> >>   what algorithm?  xxx'ing out the domain?  Using a central forwarding
> >>   service?  How do we recognize email addresses?
> >
> > I don't obfuscate anything, really.  But this is an issue for a public
> > Web UI design, I think.
> 
> If your plugin architecture has the support for data filtering
> step before archiving, this could be done with plugins.
> 
> Or, if you have plugins that allow the filter of content on
> retrieval, that may be better.  This way the stored data
> still has the email addresses intact, but they get obfuscated
> on rendering.

Right, that's how I'd do it.  Though I do think input obfuscation could
be done without losing much, if anything.  UpLib typically serves up a
rendering of a document, rather than the actual bits (though there is an
API to retrieve the actual bits).  The current rendering for email
messages doesn't obfuscate addresses by design, but it does typically
obfuscate them a bit for clarity -- email addresses were never designed
to be shown to the user.
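
For the render-time approach, a filter could be as small as the sketch below. It is an assumption-laden illustration rather than anything UpLib ships: the function name is hypothetical, the pattern is deliberately simple (not a full RFC 5322 parser), and the "user at domain" style is just one of the algorithms mentioned above. The stored message is left untouched; only the rendered text changes.

```python
import re

# Deliberately simple address pattern; an assumption, not a complete parser.
EMAIL_RE = re.compile(
    r"\b([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)\b"
)


def obfuscate_addresses(rendered_text):
    """Rewrite 'user@example.com' as 'user at example.com' for display."""
    return EMAIL_RE.sub(r"\1 at \2", rendered_text)
```

For example, `obfuscate_addresses("Mail alice@example.com")` returns `"Mail alice at example.com"`.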

Bill

