地理位置画像、连锁店圈人、地理围栏圈人、多地圈选、multi-polygon圈选、多点圈选

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E8%83%8C%E6%99%AF 背景

基于地理位置的圈选，例如我们有人的位置数据，有门店的位置数据，一家连锁店全国有1000家门店，如何快速的圈选所有门店附近的人群进行透视。

使用PostGIS和gis索引，以及PG的并行计算，可以实现高效率圈选。

如果还要加上其他非空间条件，使用btree_gist，组合搜索更快。

《HTAP数据库 PostgreSQL 场景与性能测试之 47 - (OLTP) 空间应用 - 高并发空间位置更新、多属性KNN搜索并测（含空间索引）末端配送、新零售类项目》

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E4%BE%8B%E5%AD%90 例子

我们有人的位置数据，有门店的位置数据，一家连锁店全国有1000家门店，如何快速的圈选所有门店附近的人群进行透视。

1、创建地理位置信息插件

create extension postgis;

2、用户位置

create unlogged table tbl1 (  
  uid int8 primary key, -- 用户ID  
  pos geometry  -- 用户位置  
);

3、门店位置

create unlogged table tbl2 (  
  bid int,   -- 商家ID  
  pos geometry  -- 商家某门店位置  
  cbid int,  -- 商家某门店ID  
);

4、用户位置索引，

create index idx_tbl1_1 on tbl1 using gist (pos);

5、门店表，商家ID索引

create index idx_tbl2_1 on tbl2 (bid);

6、生成用户位置

vi test1.sql  
\set uid random(1,2000000000)  
insert into tbl1 values (:uid, st_setsrid(st_makepoint(118+2*random(), 30+3*random()), 4326)) on conflict (uid) do nothing;

7、生成门店位置

vi test2.sql  
\set bid random(1,100000)  
insert into tbl2 values(:bid, st_setsrid(st_makepoint(118+2*random(), 30+3*random()), 4326));

8、最终数据量

12.8亿用户

pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 32 -j 32 -t 40000000

10万商家，2亿门店，平均每个商家2000个门店

pgbench -M prepared -n -r -P 1 -f ./test2.sql -c 20 -j 20 -t 10000000

vacuum analyze tbl1;  
vacuum analyze tbl2;

9、查询某个商家所有门店附近的人群。

根据商家ID找到所有门店的地理位置，然后根据地理位置，圈出附近的人群，以下SQL，2179个门店，每个门店附近圈出100人，返回21.79万人，可以使用游标返回。

postgres=# explain analyze select tbl1.uid from tbl1,tbl2 where tbl2.bid=1 and st_dwithin(tbl1.pos, tbl2.pos, 0.005);  
                                                              QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------  
 Nested Loop  (cost=0.97..23056993.11 rows=336364 width=8) (actual time=0.145..191.189 rows=5347 loops=1)  
   ->  Index Scan using idx_tbl2_1 on tbl2  (cost=0.49..49.26 rows=2144 width=32) (actual time=0.037..4.888 rows=2179 loops=1)  
         Index Cond: (bid = 1)  
   ->  Index Scan using idx_tbl1_1 on tbl1  (cost=0.48..10754.17 rows=2 width=40) (actual time=0.070..0.084 rows=2 loops=2179)  
         Index Cond: (pos && st_expand(tbl2.pos, '0.0001'::double precision))  
         Filter: ((tbl2.pos && st_expand(pos, '0.0001'::double precision)) AND _st_dwithin(pos, tbl2.pos, '0.0001'::double precision))  
         Rows Removed by Filter: 1  
 Planning Time: 0.258 ms  
 Execution Time: 191.620 ms  
(9 rows)  
  
postgres=# explain analyze select tbl1.uid from tbl1,tbl2 where tbl2.bid=1 and st_dwithin(tbl1.pos, tbl2.pos, 0.0006);  
                                                              QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------  
 Nested Loop  (cost=0.97..23474312.47 rows=342457 width=8) (actual time=0.146..1356.814 rows=195721 loops=1)  
   ->  Index Scan using idx_tbl2_1 on tbl2  (cost=0.49..49.50 rows=2155 width=32) (actual time=0.031..8.108 rows=2179 loops=1)  
         Index Cond: (bid = 1)  
   ->  Index Scan using idx_tbl1_1 on tbl1  (cost=0.48..10892.93 rows=2 width=40) (actual time=0.057..0.609 rows=90 loops=2179)  
         Index Cond: (pos && st_expand(tbl2.pos, '0.0006'::double precision))  
         Filter: ((tbl2.pos && st_expand(pos, '0.0006'::double precision)) AND _st_dwithin(pos, tbl2.pos, '0.0006'::double precision))  
         Rows Removed by Filter: 28  
 Planning Time: 0.152 ms  
 Execution Time: 1367.205 ms  
(9 rows)

10、如果是返回UID进行推广，可以游标式返回。响应速度更快。

postgres=# begin;  
BEGIN  
postgres=# declare cur1 cursor for select tbl1.uid from tbl1,tbl2 where tbl2.bid=1 and st_dwithin(tbl1.pos, tbl2.pos, 0.0006);  
DECLARE CURSOR  
postgres=# \timing  
Timing is on.  
postgres=# fetch 10 from cur1;  
    uid       
------------  
  767330778  
  240805365  
 1753394423  
 1026446114  
   30318931  
  881053980  
   32654928  
 1420031417  
  162509026  
 1019355204  
(10 rows)  
  
Time: 0.385 ms  
  
postgres=# fetch 10 from cur1;  
    uid       
------------  
 1732633872  
 1966961936  
  710467223  
  323584211  
 1138058052  
  778209047  
 1384104188  
  353370951  
 1980956718  
  339078065  
(10 rows)  
  
Time: 0.201 ms

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E5%8F%82%E8%80%83 参考

围栏，空间热力图，点面判断，商圈空间分析，与之类似。

《PostgreSQL 空间聚合性能 - 行政区、电子围栏空间聚合 - 时间、空间热力图》《PostgreSQL 电子围栏的应用场景和性能(大疆、共享设备、菜鸟。。。)》《菜鸟末端轨迹 - 电子围栏(解密支撑每天251亿个包裹的数据库) - 阿里云RDS PostgreSQL最佳实践》《PostgreSQL 生成空间热力图》《PostgreSQL 11 并行计算算法，参数，强制并行度设置》《PostgreSQL Oracle 兼容性之 - 自定义并行聚合函数 PARALLEL_ENABLE AGGREGATE》《PostgreSQL 10 自定义并行计算聚合函数的原理与实践 - (含array_agg合并多个数组为单个一元数组的例子)》

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E5%85%8D%E8%B4%B9%E9%A2%86%E5%8F%96%E9%98%BF%E9%87%8C%E4%BA%91rds-postgresql%E5%AE%9E%E4%BE%8Becs%E8%99%9A%E6%8B%9F%E6%9C%BA 免费领取阿里云RDS PostgreSQL实例、ECS虚拟机

地理位置画像、连锁店圈人、地理围栏圈人、多地圈选、multi-polygon圈选、多点圈选

地理位置画像、连锁店圈人、地理围栏圈人、多地圈选、multi-polygon圈选、多点圈选

标签

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E8%83%8C%E6%99%AF 背景

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E4%BE%8B%E5%AD%90 例子

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E5%8F%82%E8%80%83 参考

https://github.com/digoal/blog/blob/master/201901/20190116_01.md#%E5%85%8D%E8%B4%B9%E9%A2%86%E5%8F%96%E9%98%BF%E9%87%8C%E4%BA%91rds-postgresql%E5%AE%9E%E4%BE%8Becs%E8%99%9A%E6%8B%9F%E6%9C%BA 免费领取阿里云RDS PostgreSQL实例、ECS虚拟机

继续阅读

set define off关闭替代变量功能

报错：'mysql' 不是内部或外部命令，也不是可运行的程序或批处理文件。

Linxu常用命令技巧汇总

ERROR 1 (HY000): Can't create/write to file '/tmp/#sql_4188_1.MYI' (Errcode: 28)

艰难安装LDAP,SSL认证

《Linux命令行与Shell脚本编程大全第2版.布卢姆》pdf

MySQL的4种隔离级别？出现问题

XX系统实施过程问题总结

无组件上传图片到数据库中，最完整解决方案

【MySQL数据库】数据库索引事务1.索引2.事务

neo4j之cypher使用文档

NOSQL安全攻击

mybatis_入门程序Mybatis入门

登录plsql 报错 the account is locked --用户被锁

SequoiaDB巨杉数据库C++驱动概述

Oracle 批量查询传入List 返回List