From james at jamesh.id.au Mon Jul 3 05:54:52 2017
From: james at jamesh.id.au (James Henstridge)
Date: Mon, 3 Jul 2017 17:54:52 +0800
Subject: [DB-SIG] .rowcount issues with large affected row counts.
In-Reply-To:
References:
Message-ID:

On 30 June 2017 at 06:24, ..: Mark Sloan :.. wrote:
> Hi all,
>
> Kind of new to a lot things here so if I am way off please correct me.
>
> using psycopg2 with postgres / greenplum / redshift it's now pretty easy to
> have a single query have a affected row count higher than it seems .rowcount
> allows for.
>
> I am pretty sure libpq returns the affected row as a string ("for historical
> reasons" according to the pg mailing threads) however when I have a large
> update statement (e.g. several billion) I seem to get a .rowcount back that
> isn't correct.
>
>
> using the psql client I can't reproduce the affected row count being
> incorrect there.
>
>
> any ideas or suggestions?

You might have better luck asking questions about psycopg specifically
on the psycopg mailing list (info at
http://initd.org/psycopg/development/). I don't think the DB-API
itself has the restriction you're

In answer to your question, the relevant code in psycopg is here:

https://github.com/psycopg/psycopg2/blob/master/psycopg/pqpath.c#L1339-L1350

It is using atol() to convert libpq's string row count to a C long
int. If you're running on a 32-bit OS or 64-bit Windows, the maximum
value for a long int will be about 2 billion. Unfortunately, the
atol() function has no way to report an error and has undefined
behaviour on overflow.

That it doesn't use an API that can check for conversion errors sounds
like a bug in psycopg2.

James.

From daniele.varrazzo at gmail.com Mon Jul 3 09:08:14 2017
From: daniele.varrazzo at gmail.com (Daniele Varrazzo)
Date: Mon, 3 Jul 2017 14:08:14 +0100
Subject: [DB-SIG] .rowcount issues with large affected row counts.
In-Reply-To:
References:
Message-ID:

Please open a bug to the psycopg bug tracker:

https://github.com/psycopg/psycopg2/issues

specifying your platform (win/linux/other, 32/64 bit). Please also add
an idea of the number you are expecting to see (I think we should be
able to parse 2*10^9 no problem, if not it's a bug). If possible
compile psycopg in debug mode (see
http://initd.org/psycopg/docs/install.html#creating-a-debug-build) and
report the debug line is printed for a failing case (it should say
"_read_rowcount: PQcmdTuples..." looking at the link James has kindly
provided).

Regardless of the report I'll look into parsing that value without
using 'atol()' for next bugfix release.

-- Daniele

On Thu, Jun 29, 2017 at 11:24 PM, ..: Mark Sloan :.. wrote:
> Hi all,
>
> Kind of new to a lot things here so if I am way off please correct me.
>
> using psycopg2 with postgres / greenplum / redshift it's now pretty easy to
> have a single query have a affected row count higher than it seems .rowcount
> allows for.
>
> I am pretty sure libpq returns the affected row as a string ("for historical
> reasons" according to the pg mailing threads) however when I have a large
> update statement (e.g. several billion) I seem to get a .rowcount back that
> isn't correct.
>
>
> using the psql client I can't reproduce the affected row count being
> incorrect there.
>
>
> any ideas or suggestions?
>
>
> thanks
>
> -Mark

From james at jamesh.id.au Tue Jul 4 07:16:29 2017
From: james at jamesh.id.au (James Henstridge)
Date: Tue, 4 Jul 2017 19:16:29 +0800
Subject: [DB-SIG] .rowcount issues with large affected row counts.
In-Reply-To:
References:
Message-ID:

On 3 July 2017 at 21:08, Daniele Varrazzo wrote:
> Please open a bug to the psycopg bug tracker:
>
> https://github.com/psycopg/psycopg2/issues
>
> specifying your platform (win/linux/other, 32/64 bit). Please also add
> an idea of the number you are expecting to see (I think we should be
> able to parse 2*10^9 no problem, if not it's a bug). If possible
> compile psycopg in debug mode (see
> http://initd.org/psycopg/docs/install.html#creating-a-debug-build) and
> report the debug line is printed for a failing case (it should say
> "_read_rowcount: PQcmdTuples..." looking at the link James has kindly
> provided).
>
> Regardless of the report I'll look into parsing that value without
> using 'atol()' for next bugfix release.

At a minimum, using strtol() should let you detect cases where the
number can not be converted completely.

It might also be worth switching to a "long long int" and strtoll():
this will be a bit slower on 32-bit systems, but (a) they are getting
less important as time goes on, and (b) correct behaviour is more
important.

James.
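
As a rough illustration of the strtoll() suggestion above, here is a
minimal sketch of a checked conversion. The helper name parse_rowcount()
and its return convention are made up for this example and are not taken
from psycopg2; the input is assumed to be a NUL-terminated decimal string
of the kind libpq's PQcmdTuples() hands back.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not psycopg2's actual code): convert a row count
 * string to a long long, reporting failure instead of silently
 * overflowing the way atol() can.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int parse_rowcount(const char *s, long long *out)
{
    char *end;
    long long value;

    if (s == NULL || *s == '\0')    /* nothing to parse */
        return -1;

    errno = 0;
    value = strtoll(s, &end, 10);

    if (end == s || *end != '\0')   /* no digits, or trailing characters */
        return -1;
    if (errno == ERANGE)            /* value does not fit in a long long */
        return -1;
    if (value < 0)                  /* a row count is never negative */
        return -1;

    *out = value;
    return 0;
}
```

Because "long long" is at least 64 bits wide on every conforming
platform, a count of several billion converts correctly even where
"long" is only 32 bits, and a caller can fall back to some "unknown"
value when the helper returns -1.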