[SciPy-User] scipy central comments

Kevin Dunn kgdunn at gmail.com
Tue Sep 6 13:25:19 EDT 2011


On Mon, Sep 5, 2011 at 21:01, Skipper Seabold <jsseabold at gmail.com> wrote:
> On Mon, Sep 5, 2011 at 8:40 PM, Kevin Dunn <kgdunn at gmail.com> wrote:
>> Hi everyone, SciPy Central maintainer here - sorry for the slow reply.
>>
>> On Mon, Sep 5, 2011 at 19:57, Collin Stocks <collinstocks at gmail.com> wrote:
>>> Also in favour of a comments system.
>>
>> This has been bumped up the priority list. Django, which is used for
>> SciPy Central, has a built-in commenting system. I will look at
>> integrating it in the next while (just a bit busy at work with other
>> things, but will get to it soon).
>>
>> My current highest priority is to complete library submissions, which
>> are uploaded via a ZIP file. This will allow visitors to see the
>> contents of the library by clicking on the file names (kind of like
>> browsing a repo on GitHub).
>>
>> The next highest priority is commenting. So if you've got any requests
>> for how commenting should look and behave:
>> https://github.com/kgdunn/SciPyCentral/issues/111
>>
>>> On 09/05/2011 03:39 AM, Michael Klitgaard wrote:
>>>> Would it be possible to include data files on SciPy-Central?
>>
>> Absolutely. I've already got some Django code for this on my company's
>> site (shameless plug: http://datasets.connectmv.com).
>>
>
> We could also make the statsmodels datasets module independently
> distributable and available, if there's interest.
>
>> By the way, does SciPy/NumPy have a way to load data from a URL like R?
>>
>> I've looked for this but can't seem to find anything on it. In R it is
>> so nice to be able to say to someone:
>> data = read.table('http://datasets.connectmv.com/file/ammonia.csv')
>>
>> rather than doing a two-step: download and load.
>>
>
>
> There is the DataSource class, though there are other ways this could
> be accomplished. I'm not sure that there's a function to do it yet.
>
> import numpy as np
> ds = np.lib.DataSource()
> fp = ds.open('http://datasets.connectmv.com/file/ammonia.csv')
> from StringIO import StringIO
> arr = np.genfromtxt(StringIO(fp.read()), names=True)
>
> Or you could use urllib:
>
> import urllib
> fp2 = urllib.urlopen('http://datasets.connectmv.com/file/ammonia.csv')
> arr2 = np.genfromtxt(StringIO(fp2.read()), names=True)
>
> If there's nothing else
>
> def loadurl(url, *args, **kwargs):
>    from urllib import urlopen
>    from cStringIO import StringIO
>    fp = urlopen(url)
>    return np.genfromtxt(StringIO(fp.read()), *args, **kwargs)
>
> arr3 = loadurl('http://datasets.connectmv.com/file/ammonia.csv')

Thanks Skipper - I wasn't aware of the np.lib.DataSource class. It
seems that it can handle compressed data sources as well.

All 3 methods work great. Would you mind adding those code snippets to
SciPy Central for others to see?
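
For anyone trying this on Python 3, where `urllib.urlopen` and the `StringIO`/`cStringIO` modules are no longer available under those names, the loadurl helper quoted above could be written roughly as in the sketch below. The helper name `open_url_text` and the UTF-8 default are my own assumptions, not anything from NumPy; only `urllib.request.urlopen`, `io.StringIO` and `np.genfromtxt` are existing calls.

```python
import io
import urllib.request


def open_url_text(url, encoding="utf-8"):
    """Fetch `url` and return its contents as an in-memory text stream.

    Hypothetical helper; the encoding default is an assumption.
    """
    with urllib.request.urlopen(url) as response:
        return io.StringIO(response.read().decode(encoding))


def loadurl(url, *args, **kwargs):
    """Download `url` and parse it with np.genfromtxt in one call.

    Extra arguments are passed straight through to np.genfromtxt.
    """
    # Imported here, as in the quoted helper, so open_url_text
    # can be used on its own without NumPy installed.
    import numpy as np
    return np.genfromtxt(open_url_text(url), *args, **kwargs)
```

Calling `loadurl(url, names=True)` would then mirror the `genfromtxt` calls in the quoted snippets; whether a `delimiter` argument is also needed depends on the file being loaded.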

Thanks,
Kevin


> Skipper
>
>>>> In this case the file 'ct.raw'. I believe it would improve the quality
>>>> of the program to include the sample_data.
>>>>
>>>> This could perhaps make scipy central even more useful than other
>>>> code sharing sites.
>>
>> Thanks for the idea!
>>
>>>> Sincerely
>>>> Michael
>>>>
>>>
>>> The problem I see with data files is that they could be potentially
>>> large. There may be a way around the problems associated with this, though.
>>
>> Bandwidth on my host shouldn't be an issue. Right now I use less than
>> 0.5% of my monthly 1200 GB allocation, so there's plenty of room to
>> grow.
>>
>>> Maybe I am a bit naive, but I can't think of many common circumstances
>>> where a code fragment or program which is potentially useful to many
>>> people would be more helpful by providing a data file, since most people
>>> who would find said code useful would already have access to their own
>>> data set.
>>>
>>> I do, however, see the benefit of having some set of sample data on
>>> SciPy-Central, but perhaps this data set should be generic in that many
>>> different code contributions could reference it in a useful way.
>>
>> Agreed. Which is why I was asking about loading data from a URL above.
>>
>>> My two cents.
>>>
>>> -- Collin


