python ETL
Jorgen Grahn
jgrahn-nntq at algonet.se
Thu Aug 4 13:55:04 EDT 2005
On Mon, 01 Aug 2005 10:49:36 -0500, Paul Watson <pwatson at redlinepy.com> wrote:
> arielgr at gmail.com wrote:
>> Hi,
>> My company is involved in the development of many data marts and
>> data-warehouses, and I am currently looking into migrating our old set of
>> tools (written in Korn) to a new, more dynamic and robust one.
...
> However, I would have to assume that if homebrew shell scripts have been
> doing the work adequately, then the marts and warehouses are not very
> large and the datasets are primarily text rather than binary.
>
> If this is the case and you are only seeking incremental improvement,
> then Python would be a very good choice. Perl would also do the job.
> Just about any language would work. Yes, there are many reasons to
> choose Python. However, you would have to build any scalability and
> metadata management.
>
> If you seek a radical improvement, it is available, but I do not know of
> any free tools that will do it. A question like this will probably not
> be answered in a newsgroup post or even the exchange of a few emails.
>
> Choosing an effective tool for the organization is not a trivial
> process. It requires knowledge of both the tools and the organization's
> methodologies and processes. If you do not have staff who can do this,
> most companies find it is much cheaper and faster to pay someone who
> does know (a consultant) to assist them in assessing their requirements,
> tool selection, and forming an implementation plan.
But remember: sometimes, a bunch of shell scripts or a Python script is the
right tool for the problem.
Sometimes, I think a bunch of shell scripts is the right tool for a lot of
the problems people throw XMLthis, XMLthat, .NET, SQL servers, consultants
and money at.
There is no real reason (with the little information we have[1]) to believe
that the original poster is doing his employer a disservice by looking at
doing things himself, in plain old Python, instead of letting someone tear
down and rebuild whatever workflow/methodology/process stuff they have right
now.
/Jorgen
[1] Unless "ETL" and "data mart" carry some deep meaning which
I've missed, that is.
--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!