How to write temporary data to file?

Thomas Ploch Thomas.Ploch at gmx.net
Tue Jan 9 02:23:26 EST 2007


Ravi Teja wrote:
> Thomas Ploch wrote:
>> Ravi Teja wrote:
>>> Thomas Ploch wrote:
>>>> Hi folks,
>>>>
>>>> I have a data structure that looks like this:
>>>>
>>>> d = {
>>>> 	'url1': {
>>>> 		'emails': ['a', 'b', 'c',...],
>>>> 		'matches': ['d', 'e', 'f',...]
>>>> 	},
>>>> 	'url2': {...
>>>> }
>>>>
>>>> This dictionary will get _very_ big, so I want to write it somehow to a
>>>> file after it has grown to a certain size.
>>>>
>>>> How would I achieve that?
>>>>
>>>> Thanks,
>>>> Thomas
>>> Pickle/cPickle are standard library modules that can persist data.
>>> But in this case, I would recommend ZODB/Durus.
>>>
>>> (Your code example scares me. I hope you have benevolent purposes for
>>> that application.)
>>>
>>> Ravi Teja.
>>>
>> Thanks, but why is this code example scaring you?
>>
>> Thomas
> 
> The code indicates that you are trying to harvest a _very_ (as you put
> it) large set of email addresses from web pages. With my limited
> imagination, I can think of only one group of people who would need to
> do that. But considering that you write good English, you must not be
> one of those mean people that needed me to get a new email account just
> for posting to Usenet :-).
> 
> Ravi Teja.
> 

Oh, well, yes, you are right that this application is able to harvest
email addresses. But it can do much more than that. It has a text
matching engine that, depending on the given meta keywords, decides
whether or not to scan documents on the web, and it can harvest all
kinds of information. It can also be fed with callbacks for each of the
Content-Types. I know that the email matching is a kind of 'grey zone',
and I asked myself whether it needs the email stuff. But you could
easily add the email regex to the text matching engine yourself, so I
decided to include this functionality (it is 'OFF' by default :-) ).

Thomas

P.S.: No, I am a good person.
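
For reference, here is a minimal sketch of the pickle suggestion quoted
above, applied to the original question (writing the dictionary to a
file once it has grown to a certain size). It is only one possible
approach, not code from the application discussed in this thread. The
names MAX_URLS, flush, add_result, load_all and the file name
results.pkl are hypothetical, and the threshold is kept tiny so that the
example actually triggers a flush. It uses the Python 3 pickle module;
in Python 2, cPickle offers the same dump/load calls.

```python
import pickle
import tempfile
from pathlib import Path

MAX_URLS = 2  # hypothetical threshold; a real value would be far larger


def flush(d, path):
    """Append the current dictionary to `path` as one pickle record, then empty it."""
    with open(path, "ab") as f:
        pickle.dump(d, f, protocol=pickle.HIGHEST_PROTOCOL)
    d.clear()


def add_result(d, url, emails, matches, path):
    """Store one URL's results and flush to disk once the dict is large enough."""
    d[url] = {"emails": emails, "matches": matches}
    if len(d) >= MAX_URLS:
        flush(d, path)


def load_all(path):
    """Yield every flushed chunk back from the file, one dictionary at a time."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:  # no more records in the file
                break


# Usage: spool results into a file inside a fresh temporary directory.
spool = Path(tempfile.mkdtemp()) / "results.pkl"
d = {}
add_result(d, "url1", ["a", "b"], ["d"], spool)
add_result(d, "url2", ["c"], ["e", "f"], spool)  # reaches MAX_URLS, triggers a flush
add_result(d, "url3", [], ["g"], spool)          # stays in memory for now
chunks = list(load_all(spool))
```

Each flush appends one self-contained pickle record, so the file can be
read back chunk by chunk without loading everything into memory at once.
Only unpickle files you wrote yourself, since loading a pickle can
execute arbitrary code. If the data needs to be looked up by URL rather
than replayed in order, the standard library shelve module or the
ZODB/Durus suggestion quoted above may fit better.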



