[TriZPUG] RDF and Open Data (was Re: TriZPUG Digest, Vol 64, Issue 6)
Chris Calloway
cbc at unc.edu
Tue Aug 20 21:16:16 CEST 2013
On 8/19/2013 6:24 PM, Eric Leary wrote:
> Colin warmed everyone up perfectly last month to a lot of the same
> material that's in Josh's book - so I think we are ready for a
> presentation that gets a little closer to the realities of
> implementation for coders. Recently I've gone down the rabbit hole on
> RDF, RDFa, and JSON-LD in trying to understand their future role or
> rejection. Are they dead, or do they just smell funny?
Open Data has breathed new life into RDF via two developments:
http://en.wikipedia.org/wiki/SPARQL
http://en.wikipedia.org/wiki/GeoSPARQL
My co-worker in the office next door does SPARQL for metadata ontologies.
I have my doubts about whether ontologies are ever going to be useful,
however. If disciplines can't agree about metadata and vocabularies, who
is going to arbitrate metadata translation? Does having a Russian to
English dictionary translate War and Peace into English on its own?
I've never believed in the semantic web.
> Chris and James were able to point out a lot of paradoxes in principle
> and in day-to-day tradecraft that made me realize how naive I am about
> the "power of open data" and "open anything."
Just to be open, things I pointed out to Eric:
1) Data liability is an obstacle to openness. If I provide data, and you
provide a service on top of that data, and then your service fails
because the data I provided you were faulty, am I liable to you even if
I were providing the best available data in good faith? Many open data
providers will post a policy that tries to wash off any liability. But
the law may not recognize such policies. If open data is "use at your
own risk," can open data ever be useful for public safety? For
investment decisions? Aren't those the kind of things we need open data
for, the things that matter?
2) Personal privacy is an obstacle to openness and openness is an
obstacle to personal privacy. Data about people generally needs to be
anonymized in order to protect individual privacy. Yet calls for openness
have gone so far that the personal names, home addresses, home phone
numbers, and *salaries* of state government employees are available
through open data services (luckily NC teachers and university employees
have somehow been overlooked in this particular boondoggle). I can
freely look up how much you paid for your house and how much you had to
borrow for a mortgage. I can look up your political party affiliation,
your age, race, gender, home address, and which elections you voted in
over the last several years.
3) Governments hide public data through third party vendor access. Some
government agencies may hold public data but have no legal or budgetary
mandate to help you find or access it. There's an opportunity there for
agencies to make money giving private companies the raw public data, and
then the private companies will charge you for organized search and
access to public data. Arrest records are public data but you'll need to
fork over some dollars to private companies to look at this data, just
enough to discourage it in most cases. Sometimes companies can get
exclusive rights to distribute public data.
4) Governments will only go so far to allow access to data. The more
valuable or politically sensitive data is, the more likely it is to be
"classified" even if paid for by tax dollars. Governments also respond
to business interests to suppress access to or defund generation of
data, particularly scientific data. It's easy to access data from
successful clinical trials. But it's not easy to access data from
unsuccessful clinical trials. Even when the unsuccessful trials
outnumber the successful ones by orders of magnitude for a particular drug.
5) Governments can and do own intellectual property which they can and
do decide to keep proprietary rather than openly license, going so far
as to generate revenue streams with proprietary licensing. There have
been bills in front of Congress, so far fortunately unsuccessful, to
restrict access to nationally financed weather data to only certain
companies such that you would have to pay those companies to get weather
reports. Most state university systems operate intellectual property
offices to capture patents and copyrights for royalties.
6) Providing access to public data is an expensive public service.
Archiving data long term is super expensive. Cataloging and classifying
data is labor intensive. Who pays for it? If it comes from access fees,
does that provide unfair advantages to those who can afford access over
those who can't but who did help pay for creating the data? When
financial times are tight and just keeping the public safety net patched
is a challenge compared to other interests, is open data all that
important? What if you visited the public library, and all the books
were piled on a table without Dewey decimal numbers and there were no
card catalog? What if there were no public library? Data requires the
online equivalent of libraries and librarians. Is public data an
essential governmental service?
We have this recent and rather toothless presidential executive order:
http://www.whitehouse.gov/the-press-office/2013/05/09/executive-order-making-open-and-machine-readable-new-default-government-
I'm witnessing the speed of at least one government agency's response to
this order, however, and know it will be years in the making if at all.
There are so many silos within that have to agree to all manner of
standards to make this work. There are so many offerings of standards
and methods to implement standards from each silo. Every department
already has a half-assed skunk-works of an open data project in
operation. And the management style for most agencies making these
decisions on how to "come together" is via consensus (so no one gets
blamed for bad decisions). Making the decisions and then
implementing them is also generally not within the agencies' missions,
so not only are the decisions by consensus, but the implementations are
pretty much volunteer work. The public at large would be amazed to know
just how much government function is accomplished via volunteer work by
mid- to low-level government employees in addition to their regular jobs.
But it's a start. The real open data movement is occurring at individual
municipal levels, such as what the City of Raleigh is doing, and also
occurring by private uplift, such as Code for America in Durham. There's
also an overlap in what people consider open data and crowd sourced
data. Open data is more about unlocking access to already existing
government data.
--
Sincerely,
Chris Calloway http://nccoos.org/Members/cbc
office: 3313 Venable Hall phone: (919) 599-3530
mail: Campus Box #3300, UNC-CH, Chapel Hill, NC 27599