[Python-ideas] Add a __cite__ method for scientific packages

Wes Turner wes.turner at gmail.com
Wed Jul 4 22:10:21 EDT 2018


typeshed, dotted lookup, ScholarlyArticle semantic graphs with classes,
properties, and URIs

Would external metadata, kept apart from the code the way typeshed keeps
its stubs (in what I think is called a 'shadow naming scheme'), be
advantageous for dotted-name lookup of citation metadata?
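
To make the question concrete, here is a minimal sketch of such a lookup.
Everything in it is an assumption for illustration: the registry contents,
the 'examplepkg' names, and the citations() signature are hypothetical, not
an existing API. It resolves a dotted name to the most specific registered
entry and falls back toward the root module.

```python
# Hypothetical external registry keyed by dotted name. In practice it could
# live outside the package itself, the way typeshed ships stubs separately.
CITATION_REGISTRY = {
    "json": [{"@type": "schema:SoftwareApplication",
              "schema:name": "json (Python standard library)"}],
    "examplepkg": [{"@type": "schema:SoftwareApplication",
                    "schema:name": "(placeholder package name)"}],
    "examplepkg.stats.fit": [{"@type": "schema:ScholarlyArticle",
                              "schema:name": "(placeholder article title)"}],
}


def citations(dotted_name, registry=CITATION_REGISTRY):
    """Return (matched_name, entries) for the most specific registered prefix.

    'examplepkg.stats.other' falls back to 'examplepkg' when nothing more
    specific is registered. Unknown names return (None, []).
    """
    parts = dotted_name.split(".")
    while parts:
        key = ".".join(parts)
        if key in registry:
            return key, registry[key]
        parts.pop()  # drop the last component and retry with the parent
    return None, []
```

With this registry, citations('json.loads') resolves to the 'json' entry
because nothing is registered for the function itself.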

> Typeshed contains external type annotations for the Python standard
> library and Python builtins, as well as third party packages.
>
> This data can e.g. be used for static analysis, type checking or type
> inference.

https://github.com/python/typeshed
stdlib/{2, 2and3, 3, 3.5, 3.6, 3.7}
third_party/{2, 2and3, 3}/{jinja2,}

Ideally, a ScholarlyArticle can also be published as HTML with RDFa and/or
JSONLD, in addition to two-column LaTeX/PDF, which is lossy with regard to
structured data and linked data. The article's document-level metadata is
then simply part of a graph of resources (such as schema:citation and
schema:Dataset entries) described using a search-indexed vocabulary such as
the Schema.org RDFS vocabulary.
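
As a rough sketch of what that could look like (all titles and the '#article'
and '#dataset' identifiers are placeholders, and to_script_tag() is a
hypothetical helper, not part of any library), the graph can be built as a
plain dict and embedded in HTML as a JSONLD script element:

```python
import json

# Placeholder metadata only: a ScholarlyArticle that cites a software
# package, plus a Dataset linked back to the article with isPartOf.
doc = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@id": "#article",
            "@type": "ScholarlyArticle",
            "name": "(placeholder article title)",
            "citation": [
                {"@type": "SoftwareApplication",
                 "name": "(placeholder package name)"},
            ],
        },
        {
            "@id": "#dataset",
            "@type": "Dataset",
            "name": "(placeholder dataset name)",
            "isPartOf": {"@id": "#article"},
        },
    ],
}


def to_script_tag(document):
    """Serialize a JSONLD document into an HTML script element."""
    body = json.dumps(document, indent=2, sort_keys=True)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


snippet = to_script_tag(doc)
```

The same dict could equally be written next to a notebook or a package as
a .jsonld file; the script element is just one way to publish it.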

An aside:
https://schema.org/unitCode has a range of {Text, URL}, where the Text
should be a 3-character UN/CEFACT Common Code. There is also QUDT for unit
URIs. Fortunately, RDF allows repeated property values, so we can just add
both.
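
A small sketch of "add both", assuming a schema.org QuantitativeValue for a
length in metres. The value is made up, 'MTR' is the UN/CEFACT code I
believe applies to the metre, and the QUDT URI is my assumption about the
current QUDT namespace, so check both before relying on them:

```python
import json

measurement = {
    "@context": "https://schema.org",
    "@type": "QuantitativeValue",
    "value": 1.5,  # made-up example value
    # A JSON array repeats the property: one Text code, one unit URI.
    "unitCode": [
        "MTR",                           # UN/CEFACT Common Code (assumed: metre)
        "http://qudt.org/vocab/unit/M",  # QUDT unit URI (assumed)
    ],
}

serialized = json.dumps(measurement, indent=2)
```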

On Wednesday, July 4, 2018, Wes Turner <wes.turner at gmail.com> wrote:

> ... a schema:Dataset may be part of a Creative work.
>
> https://schema.org/Dataset
> https://schema.org/isPartOf
> https://schema.org/ScholarlyArticle
>
> #LinkedReproducibility #nbmeta
>
> On Wednesday, July 4, 2018, Wes Turner <wes.turner at gmail.com> wrote:
>
>> https://schema.org/CreativeWork
>>   https://schema.org/Code
>>   https://schema.org/SoftwareApplication
>>
>> CreativeWork has a https://schema.org/citation field with a range of
>> {CreativeWork, Text}
>>
>> There's also a https://schema.org/funder attribute with a domain of
>> CreativeWork and a range of {Organization, Person}
>>
>> - BibTeX is actually somewhat ill-specified, TBH.
>> - There is a repository of CSL styles at https://citationstyles.org .
>> - CSL is sponsored by both Zotero and Mendeley.
>> - A number of search engines support schema.org (and JSONLD)
>> - The schema.org RDFS vocabulary is designed to describe a graph of
>> resources (CreativeWork, Code, SoftwareApplication, ScholarlyArticle,
>> MedicalScholarlyArticle).
>>
>> # Either a list of citation entries:
>> __citation__ = [{}, ]
>> # or a single JSONLD-style mapping:
>> __citation__ = {
>>   '@type': ['schema:ScholarlyArticle'],
>>   'schema:name': '',
>>   'schema:author': [{
>>       '@type': 'schema:Person',
>>       '...': '...'}]
>> }
>>
>> JSONLD is ideal for describing a graph of resources with varied types.
>>
>> If the overhead of __citation__ for every import is unjustified,
>> a lookup of methods with dotted names that finds entries for root modules
>> as well would be great:
>>
>> >>> citations('json.loads')
>> >>> citations('list.sort')
>>
>> A tracing debugger could look up each and every package, module, function,
>> and method that each ScholarlyArticle's SoftwareApplication executes (from
>> a registry in e.g. a _citations_.py or a _citations_.jsonld.json).
>>
>> It'd be a shame to need to manually format citations for a particular
>> journal's CSL bibliographic metadata template preference.
>>
>> sphinxcontrib-bibtex is a Sphinx extension for BibTeX support (with a
>> bibliography directive and a cite role)
>> - Src: https://github.com/mcmtroffaes/sphinxcontrib-bibtex
>>
>> Jupyter notebooks support document-level metadata (in JSON that's
>> currently only similar to schema.org JSONLD).
>>
>> https://schema.org/ScholarlyArticle is search engine indexable.
>>
>>
>> On Wednesday, July 4, 2018, Alexander Belopolsky <
>> alexander.belopolsky at gmail.com> wrote:
>>
>>>
>>>
>>> On Sun, Jul 1, 2018 at 9:45 AM David Mertz <mertz at gnosis.cx> wrote:
>>>
>>>> ..
>>>> There's absolutely nothing in the idea that requires a change in
>>>> Python, and Python developers or users are not, as such, the relevant
>>>> experts.
>>>>
>>>
>>> This is not entirely true.  If some variant of __citation__ is endorsed
>>> by the community, I would expect that pydoc would extract this information
>>> to fill an appropriate section in the documentation page.  Note that pydoc
>>> already treats a number of dunder variables specially: '__author__', '__credits__',
>>> and '__version__' are a few that come to mind, so I don't think the
>>> threshold for adding one more should be too high.  On the other hand,
>>> maybe '__author__', '__credits__', and '__citation__' should be
>>> merged into one structured variable (a dict?) with a format designed with
>>> some extensibility in mind.
>>>
>>
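
On the quoted pydoc point, a minimal sketch of collecting those dunder
variables into one structured dict. Note that '__citation__' is only the
name proposed in this thread and is not something pydoc reads today, and
collect_metadata() and the 'demo_pkg' module are hypothetical:

```python
import types

# '__author__', '__credits__' and '__version__' are already special-cased by
# pydoc; '__citation__' is the hypothetical addition discussed in this thread.
METADATA_DUNDERS = ("__author__", "__credits__", "__version__", "__citation__")


def collect_metadata(module):
    """Gather whichever metadata dunders a module defines into one dict."""
    return {
        name: getattr(module, name)
        for name in METADATA_DUNDERS
        if hasattr(module, name)
    }


# Stand-in module with placeholder values, built in memory for the example.
demo = types.ModuleType("demo_pkg")
demo.__author__ = "(placeholder author)"
demo.__citation__ = [{
    "@type": "schema:ScholarlyArticle",
    "schema:name": "(placeholder article title)",
}]

metadata = collect_metadata(demo)
```

A documentation tool could render such a dict as one section, whether the
values stay in separate dunders or end up merged into one variable.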