pgsql: Fix numeric_mul() overflow due to too many digits after decimal

Lists: pgsql-committerspgsql-hackers
From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix numeric_mul() overflow due to too many digits after decimal
Date: 2021-07-10 11:54:15
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Fix numeric_mul() overflow due to too many digits after decimal point.

This fixes an overflow error when using the numeric * operator if the
result has more than 16383 digits after the decimal point by rounding
the result. Overflow errors should only occur if the result has too
many digits *before* the decimal point.

Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/e7fc488ad67caaad33f6d5177081884495cb81cb

Modified Files
--------------
src/backend/utils/adt/numeric.c | 10 +++++++++-
src/test/regress/expected/numeric.out | 6 ++++++
src/test/regress/sql/numeric.sql | 2 ++
3 files changed, 17 insertions(+), 1 deletion(-)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal
Date: 2021-07-10 15:01:12
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> This fixes an overflow error when using the numeric * operator if the
> result has more than 16383 digits after the decimal point by rounding
> the result. Overflow errors should only occur if the result has too
> many digits *before* the decimal point.

I think this needs a bit more thought. Before, a case like
select 1e-16000 * 1e-16000;
produced
ERROR: value overflows numeric format
Now you get an exact zero (with a lot of trailing zeroes, but still
it's just zero). Doesn't that represent catastrophic loss of
precision?

In general, I'm disturbed that we just threw away the previous
promise that numeric multiplication results were exact. That
seems like a pretty fundamental property --- which is stated
in so many words in the manual, btw --- and I'm not sure I want
to give it up.

regards, tom lane


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal
Date: 2021-07-10 17:03:54
Message-ID: CAEZATCXV6bdUUDckMVD=PtHXRUKTPZeHpy538xWY+wW=DZKYhg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Sat, 10 Jul 2021 at 16:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I think this needs a bit more thought. Before, a case like
> select 1e-16000 * 1e-16000;
> produced
> ERROR: value overflows numeric format
> Now you get an exact zero (with a lot of trailing zeroes, but still
> it's just zero). Doesn't that represent catastrophic loss of
> precision?

Hmm, "overflow" isn't a great result for that case either. Zero is the
closest we can get to the exact result with a fixed number of digits
after the decimal point.

> In general, I'm disturbed that we just threw away the previous
> promise that numeric multiplication results were exact. That
> seems like a pretty fundamental property --- which is stated
> in so many words in the manual, btw --- and I'm not sure I want
> to give it up.

Perhaps we should amend the statement about numeric multiplication to
say that it's exact within the limits of the numeric type's supported
scale, which we also document in the manual as 16383.

That seems a lot better than throwing an overflow error for a result
that isn't very big, which limits what's possible with numeric
multiplication to much less than 16383 digits.

Regards,
Dean


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal
Date: 2021-07-10 17:30:42
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

[ moving to pghackers for wider visibility ]

Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> On Sat, 10 Jul 2021 at 16:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> In general, I'm disturbed that we just threw away the previous
>> promise that numeric multiplication results were exact. That
>> seems like a pretty fundamental property --- which is stated
>> in so many words in the manual, btw --- and I'm not sure I want
>> to give it up.

> Perhaps we should amend the statement about numeric multiplication to
> say that it's exact within the limits of the numeric type's supported
> scale, which we also document in the manual as 16383.
> That seems a lot better than throwing an overflow error for a result
> that isn't very big, which limits what's possible with numeric
> multiplication to much less than 16383 digits.

TBH, I don't agree. I think this is strictly worse than what we
did before, and we should just revert it. It's no longer possible
to reason about what numeric multiplication will do. I think
throwing an error if we can't represent the result exactly is a
preferable behavior. If you don't want exact results, use float8.

regards, tom lane


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal
Date: 2021-07-10 18:19:12
Message-ID: CAEZATCXAHZBr8hbTAyCdtAmuSHh5CporSn-mDGad34T=xn-JXQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Sat, 10 Jul 2021 at 18:30, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> [ moving to pghackers for wider visibility ]
>
> Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> > On Sat, 10 Jul 2021 at 16:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> In general, I'm disturbed that we just threw away the previous
> >> promise that numeric multiplication results were exact. That
> >> seems like a pretty fundamental property --- which is stated
> >> in so many words in the manual, btw --- and I'm not sure I want
> >> to give it up.
>
> > Perhaps we should amend the statement about numeric multiplication to
> > say that it's exact within the limits of the numeric type's supported
> > scale, which we also document in the manual as 16383.
> > That seems a lot better than throwing an overflow error for a result
> > that isn't very big, which limits what's possible with numeric
> > multiplication to much less than 16383 digits.
>
> TBH, I don't agree. I think this is strictly worse than what we
> did before, and we should just revert it. It's no longer possible
> to reason about what numeric multiplication will do. I think
> throwing an error if we can't represent the result exactly is a
> preferable behavior. If you don't want exact results, use float8.

Actually, I think it makes it a lot easier to reason about what it
will do -- it'll return the exact result, or the exact result
correctly rounded to 16383 digits.

That seems perfectly reasonable to me, since almost every numeric
operation ends up having to round at some point, and almost all of
them don't produce exact results.

The previous behaviour might seem like a principled stance to take,
from a particular perspective, but it's pretty hard to work with in
practice.

For example, say you had 2 numbers, each with 10000 digits after the
decimal point, that you wanted to multiply. If it throws an overflow
error for results with more than 16383 digits after the decimal point,
then you'd have to round one or both input numbers before multiplying.
To get the most accurate result possible, you'd actually have to round
each number to have the same number of *significant digits*, rather
than the same number of digits after the decimal point. In general,
that's quite complicated -- suppose, after rounding, you had

x = [x1 digits] . [x2 digits]
y = [y1 digits] . [y2 digits]

where x1+x2 = y1+y2 to minimise the error in the final result, and
x2+y2 = 16383 to maximise the result scale. x1 and y1 are known
(though we don't have a convenient way to obtain them), so that's a
pair of simultaneous equations to solve to decide the optimal amount
of rounding to apply before multiplying. And after all that, the
result will only be accurate to around x1+x2 (or y1+y2) significant
digits, and around x1+y1 of those will be before the decimal point, so
even though it will return a result with 16383 digits after the
decimal point, lots of those digits will be completely wrong.

With the new code, you just multiply the numbers and the result is
correctly rounded to 16383 digits.

In general, I'd argue that using numeric isn't about getting exact
results, since nearly all real computations don't have an exact
result. Really, it's about getting high-precision results that would
be impossible with float8. (And if you don't care about numbers with
16383 digits after the decimal point, this change won't affect you.)

Regards,
Dean