PostgreSQL uses a compiler extension to support int128 and speed up aggregates

2021-11-07 14:46:05

postgresql , int128 , clang , gcc , icc

In PostgreSQL 9.4 and earlier, aggregates over int, int2, and int8 store their intermediate results as numeric so that the running value cannot overflow.

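numeric is a numeric type that PostgreSQL implements itself. It can store very large values (presumably to serve scientific-computing needs), but at some cost in performance.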

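To speed up aggregation, especially over large data sets, the community borrowed the compiler-supported int128 type as the intermediate result for aggregates over int, int2, and int8, which improves computation performance.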
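To illustrate why a 128-bit accumulator removes the overflow risk, here is a minimal sketch. It assumes GCC or Clang, where `__int128` is available as an extension; the function name `sum_int64` is hypothetical and is not taken from the PostgreSQL source.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum 64-bit integers into a 128-bit accumulator.
 * Assumption: the compiler provides __int128 (GCC/Clang extension).
 * Every input fits in 64 bits, so the 128-bit total cannot overflow
 * for any realistic number of rows.
 */
static __int128 sum_int64(const int64_t *vals, size_t n)
{
    __int128 acc = 0;

    for (size_t i = 0; i < n; i++)
        acc += vals[i];     /* plain integer add, no arbitrary-precision math */
    return acc;
}
```

Each step is a native integer addition, which is the kind of work that would otherwise go through the slower numeric code path.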

gcc, clang, and icc all support int128.

1. gcc

2. icc

At build time, PostgreSQL checks the compiler's capabilities and automatically decides whether to use int128.
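The idea of choosing the implementation at compile time can be sketched as follows. This is only an illustration, not PostgreSQL's actual configure test: it relies on the `__SIZEOF_INT128__` macro that GCC and Clang predefine, and every `demo_` name is hypothetical. When the type is missing, the sketch falls back to a two-word accumulator with manual carry (PostgreSQL itself keeps using numeric in that case).

```c
#include <stdint.h>

#if defined(__SIZEOF_INT128__)
/* Fast path: the compiler offers a native 128-bit integer. */
#define DEMO_HAVE_INT128 1
typedef __int128 demo_acc_t;

static demo_acc_t demo_acc_zero(void) { return 0; }
static void demo_acc_add(demo_acc_t *acc, int64_t v) { *acc += v; }
static int64_t demo_acc_hi(const demo_acc_t *acc) { return (int64_t)(*acc >> 64); }
static uint64_t demo_acc_lo(const demo_acc_t *acc) { return (uint64_t)*acc; }
#else
/* Fallback: emulate 128 bits with a signed high word and unsigned low word. */
#define DEMO_HAVE_INT128 0
typedef struct { int64_t hi; uint64_t lo; } demo_acc_t;

static demo_acc_t demo_acc_zero(void) { demo_acc_t z = {0, 0}; return z; }
static void demo_acc_add(demo_acc_t *acc, int64_t v)
{
    uint64_t old = acc->lo;

    acc->lo += (uint64_t)v;                 /* wraps modulo 2^64 */
    /* sign-extend v into the high word, then add the carry out of lo */
    acc->hi += (v < 0 ? -1 : 0) + (acc->lo < old ? 1 : 0);
}
static int64_t demo_acc_hi(const demo_acc_t *acc) { return acc->hi; }
static uint64_t demo_acc_lo(const demo_acc_t *acc) { return acc->lo; }
#endif
```

Callers only use the `demo_acc_*` functions, so the same aggregate code compiles on either kind of compiler.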

<a href="https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8122e1437e332e156d971a0274879b0ee76e488a">https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8122e1437e332e156d971a0274879b0ee76e488a</a>

There was recently talk about if we should start using 128-bit integers (where available) to speed up the aggregate functions over integers which uses numeric for their internal state. So I hacked together a patch for this to see what the performance gain would be.

Previous thread:

<a href="http://www.postgresql.org/message-id/[email protected]">http://www.postgresql.org/message-id/[email protected]</a>

What the patch does is switching from using numerics in the aggregate state to int128 and then convert the type from the 128-bit integer in the final function.
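The split between a cheap transition step and a one-time conversion in the final function might look roughly like this. It is a sketch under assumptions: it requires GCC or Clang with `__int128`, the `DemoSumState`, `demo_sum_accum`, and `demo_sum_final` names are hypothetical, and a decimal string stands in for the numeric result that the real final function would produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical aggregate state: row count plus a 128-bit running sum. */
typedef struct
{
    int64_t  n;
    __int128 sum;
} DemoSumState;

/* Transition step: called once per row, only native integer arithmetic. */
static void demo_sum_accum(DemoSumState *st, int64_t v)
{
    st->n++;
    st->sum += v;
}

/*
 * Final step: called once per group. Converts the 128-bit sum to decimal
 * text, standing in for the conversion to an arbitrary-precision type.
 * buf must hold at least 41 bytes (sign + 39 digits + NUL).
 */
static void demo_sum_final(const DemoSumState *st, char *buf, size_t buflen)
{
    char tmp[41];
    size_t pos = sizeof(tmp);
    int neg = st->sum < 0;
    /* negate in unsigned arithmetic so the minimum value is handled too */
    unsigned __int128 mag = neg ? -(unsigned __int128)st->sum
                                : (unsigned __int128)st->sum;

    tmp[--pos] = '\0';
    do
    {
        tmp[--pos] = (char)('0' + (int)(mag % 10));
        mag /= 10;
    } while (mag != 0);
    if (neg)
        tmp[--pos] = '-';

    assert(buflen >= sizeof(tmp) - pos);
    memcpy(buf, tmp + pos, sizeof(tmp) - pos);
}
```

The expensive conversion runs once per group instead of once per row, which is where a design like this would save time.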

The functions where we can make use of int128 states are:

The initial benchmark results look very promising. When summing 10 million int8 I get a speedup of ~2.5x and similarly for var_samp() on 10 million int4 I see a speed up of ~3.7x. To me this indicates that it is worth the extra code. What do you say? Is this worth implementing?

The current patch still requires work. I have not written the detection of int128 support yet, and the patch needs code cleanup (for example: I used an int16_ prefix on the added functions, suggestions for better names are welcome). I also need to decide on what estimate to use for the size of that state.

The patch should work and pass make check on platforms where __int128_t is supported.

The simple benchmarks:
