天天看点

oracle分区表使用

分区概述:

为了简化数据库大表的管理,从ORACLE8推出了分区选项。分区将表分离在若干不同的表空间上,用分而治之的方法来支撑无限膨胀的大表,给大表在物理一级的可管理性。将大表分割成较小的分区可以改善表的维护、备份、恢复、事务及查询性能。针对当前社保及电信行业的大量日常业务数据,可以推荐使用ORACLE8的该选项。

分区优点:

1 、增强可用性:如果表的一个分区由于系统故障而不能使用,表的其余好的分区仍然可以使用;

2 、减少关闭时间:如果系统故障只影响表的一部分分区,那么只有这部分分区需要修复,故能比整个大表修复花的时间更少;

3 、维护轻松:如果需要重建表,独立管理每个分区比管理单个大表要轻松得多;

4 、均衡I/O:可以把表的不同分区分配到不同的磁盘来平衡I/O改善性能;

5 、改善性能:对大表的查询、增加、修改等操作可以分解到表的不同分区来并行执行,可使运行速度更快;

6 、分区对用户透明,最终用户感觉不到分区的存在。

关于何时应该进行分区,Oracle有如下建议:

(1)Tables greater than 2GB should always be considered for partitioning.  表大小超过2G,总是要考虑分区

(2)Tables containing historical data, in which new data is added into the newest partition. A typical example is a historical table where only the current month's data is updatable and the other 11 months are read only.    表有历史数据和当前数据之分

一、常见分区:

1、范围分区

创建表空间,供分区使用

create tablespace ts_emp1 datafile 'D:\APP\PETER\ORADATA\WJIAO\ts_emp1.dbf' size 1m;

create tablespace ts_emp2 datafile 'D:\APP\PETER\ORADATA\WJIAO\ts_emp2.dbf' size 1m;

create tablespace ts_emp3 datafile 'D:\APP\PETER\ORADATA\WJIAO\ts_emp3.dbf' size 1m;

创建分区表

create table emp

(

    empno     number(4) not null,

    ename     varchar2(10),

    sal       number(7,2)

)

partition by range(empno)

(

    partition p1000 values less than (1000)    tablespace ts_emp1,

    partition p2000 values less than (2000)    tablespace ts_emp2,

    partition pmax values less than (maxvalue) tablespace ts_emp3

);

查看表内分区信息

[email protected]>select segment_name,segment_type,partition_name from user_segments where segment_name='EMP';

SEGMENT_NAME                   SEGMENT_TYPE       PARTITION_NAME

------------------------------ ------------------ ------------------------------

EMP                            TABLE PARTITION    P1000

EMP                            TABLE PARTITION    P2000

EMP                            TABLE PARTITION    PMAX

模拟数据

insert into emp values (100,'Tom',1000);

insert into emp values (500,'Peter',2000);

insert into emp values (1000,'Scott',3000);

insert into emp values (1999,'Bill',4000);

insert into emp values (5000,'Gates',6000);

commit;

查看记录分布

[email protected]>select * from emp;

     EMPNO ENAME             SAL

---------- ---------- ----------

       100 Tom              1000

       500 Peter            2000

      1000 Scott            3000

      1999 Bill             4000

      5000 Gates            6000

[email protected]>select * from emp partition(p1000);

     EMPNO ENAME             SAL

---------- ---------- ----------

       100 Tom              1000

       500 Peter            2000

[email protected]>select * from emp partition(p2000);

     EMPNO ENAME             SAL

---------- ---------- ----------

      1000 Scott            3000

      1999 Bill             4000

[email protected]>select * from emp partition(pmax);

     EMPNO ENAME             SAL

---------- ---------- ----------

      5000 Gates            6000

维护分区

exp导出指定表分区

C:\>exp userid=hr/hr file=d:\app\peter\oradata\emp_p1000.dmp tables=emp:p1000

imp导入指定表分区

C:\>imp userid=hr/hr file=d:\app\peter\oradata\emp_p1000.dmp tables=(emp:p1000) ignore=y 分区不存在,则插入到p2000

ignore=y必须要有,忽略创建错误,否则导入失败

truncate指定分区

alter table emp truncate partition pmax;

drop指定分区

alter table emp drop partition pmax;

超出范围的数据,报错

[email protected]>insert into emp values (5000,'Gates',6000);

insert into emp values (5000,'Gates',6000)

            *

ERROR at line 1:

ORA-14400: inserted partition key does not map to any partition

加入分区

alter table emp add partition pmax values less than(maxvalue);

[email protected]>insert into emp values (5000,'Gates',6000);

1 row created.

2、hash分区

hash分区最主要的机制是根据hash算法来计算具体某条纪录应该插入到哪个分区中

(问:hash算法是干什么的?呵呵,只能去看看数据结构了)

hash算法中最重要的是hash函数,Oracle中如果你要使用hash分区,只需指定分区的数量即可

建议分区的数量采用2的n次方,这样可以使得各个分区间数据分布更加均匀

hash列上数值不能有太多的重复值,否则会导致数据分布不均匀

drop table emp purge;

create table emp

(

    empno     number(5) not null,

    ename     varchar2(10),

    sal       number(7,2)

)

partition by hash(empno)

partitions 8 store in (users)

;

--插入数据,递增序列,分区数据分布均匀

set timing on;

begin

    for i in 1..96400 loop

        execute immediate 'insert into emp values(:1,:2,:3)' using i,'emp'||i,i;

    end loop;

    commit;

end;

/

[email protected]>analyze table emp compute statistics for table for all columns for all indexes;

[email protected]>select partition_name,count(*)

  2  from user_extents

  3  where segment_name = 'EMP'

  4  group by partition_name;

PARTITION_NAME                   COUNT(*)

------------------------------ ----------

SYS_P82                                 6

SYS_P79                                 6

SYS_P81                                 6

SYS_P77                                 6

SYS_P78                                 6

SYS_P80                                 6

SYS_P83                                 6

SYS_P84                                 6

8 rows selected.

hash列上重复值过多,导致数据分布不均匀,建分区表应尽量使数据均匀分布

set timing on;

begin

    for i in 1..96400 loop

        execute immediate 'insert into emp values(:1,:2,:3)' using mod(i,4),'emp'||i,i;

    end loop;

    commit;

end;

/

[email protected]>select partition_name,count(*)

  2  from user_extents

  3  where segment_name = 'EMP'

  4  group by partition_name;

PARTITION_NAME                   COUNT(*)

------------------------------ ----------

SYS_P205                                6

SYS_P206                                6

SYS_P208                               16

SYS_P210                               15

SYS_P212                               16

SYS_P209                                6

SYS_P207                                6

SYS_P211                               16

8 rows selected.

3、复合分区

将范围分区、hash分区综合在一起使用的技术

drop table emp purge;

create table emp

(

    empno     number(5) not null,

    ename     varchar2(10),

    sal       number(7,2)

)

partition by range(empno)

subpartition by hash(empno)

subpartitions 2

(

    partition p1000 values less than (1000),

    partition p2000 values less than (2000),

    partition pmax values less than (maxvalue)

);

上面的例子中将雇员表先按照雇员工号empno进行范围分区,然后再把每个分区分为两个子hash分区。例子中一共将产生6个分区

[email protected]>select segment_name,segment_type,partition_name from user_segments where segment_name='EMP';

SEGMENT_NAME                   SEGMENT_TYPE       PARTITION_NAME

------------------------------ ------------------ ------------------------------

EMP                            TABLE SUBPARTITION SYS_SUBP219

EMP                            TABLE SUBPARTITION SYS_SUBP220

EMP                            TABLE SUBPARTITION SYS_SUBP221

EMP                            TABLE SUBPARTITION SYS_SUBP222

EMP                            TABLE SUBPARTITION SYS_SUBP223

EMP                            TABLE SUBPARTITION SYS_SUBP224

6 rows selected.

查询需要使用子分区

[email protected]>select * from emp subpartition(SYS_SUBP219);

     EMPNO ENAME             SAL

---------- ---------- ----------

       100 Tom              1000

       500 Peter            2000

4、列表分区

一般分区列包含数据可枚举,有限个值,例如地区、或国家等

create table emp

(

    empno     number(4) not null,

    ename     varchar2(10),

    location  varchar2(13)

)

partition by list(location)

(

    partition pnkg values ('Nanjing'),

    partition pswc values ('Suzhou','Wuxi','Changzhou'),

    partition pxls values ('Xuzhou','Lianyg','Suqian')

);

模拟数据

insert into emp values (100,'Tom','Nanjing');

insert into emp values (500,'Peter','Changzhou');

insert into emp values (1000,'Scott','Xuzhou');

insert into emp values (1999,'Bill','Wuxi');

insert into emp values (5000,'Gates','Lianyg');

commit;

查看记录分布

[email protected]>select * from emp partition(pnkg);

     EMPNO ENAME      LOCATION

---------- ---------- -------------

       100 Tom        Nanjing

[email protected]>select * from emp partition(pswc);

     EMPNO ENAME      LOCATION

---------- ---------- -------------

       500 Peter      Changzhou

      1999 Bill       Wuxi

二、如果一个表创建时没有利用分区技术,如何对一个没有添加过分区表的表上添加分区表呢?

比如scott下已存在emp_unpart表,未使用表分区技术

1、最简单方法

drop table emp_part purge;

create table emp_part

partition by hash(empno)

partitions 4 store in (users)

as select * from emp_unpart;

这个方法虽然简单,由于实际生产环境中数据量太大,几乎不可行

2、在线重定义,利用oracle提供的dbms_redefinition包

生成相同表结构的分区表

drop table emp_part purge;

create table emp_part

partition by hash(empno)

partitions 4 store in (users)

as select * from emp_unpart where 1=2;

[email protected]>select segment_name,partition_name from user_extents;

SEGMENT_NAME                   PARTITION_NAME

------------------------------ ------------------------------

EMP_PART                       SYS_P285

EMP_PART                       SYS_P286

EMP_PART                       SYS_P287

EMP_PART                       SYS_P288

为了能够看到表的变化

[email protected]>grant select on emp_unpart to hr;

[email protected]>select tablespace_name,table_name,owner from all_tables where owner = 'SCOTT';

TABLESPACE_NAME                TABLE_NAME                     OWNER

------------------------------ ------------------------------ -----------------------------

USERS                          EMP_UNPART                     SCOTT

sys下执行表的在线重定义

[email protected]>begin  --使用包 DBMS_REDEFINITION 将非分区表转为分区表

  2    dbms_redefinition.can_redef_table('SCOTT','EMP_UNPART',2);

  3    dbms_redefinition.start_redef_table('SCOTT','EMP_UNPART','EMP_PART',null,2);

  4    dbms_redefinition.finish_redef_table('SCOTT','EMP_UNPART','EMP_PART');

  5  end;

  6   /

PL/SQL procedure successfully completed.

[email protected]>select segment_name,partition_name from user_extents;

SEGMENT_NAME                   PARTITION_NAME

------------------------------ ------------------------------

EMP_UNPART                     SYS_P285

EMP_UNPART                     SYS_P286

EMP_UNPART                     SYS_P287

EMP_UNPART                     SYS_P288

[email protected]>select tablespace_name,table_name,owner from all_tables where owner = 'SCOTT';

TABLESPACE_NAME                TABLE_NAME                     OWNER

------------------------------ ------------------------------ ------------------------------

USERS                          EMP_PART                       SCOTT

从数据可以看出:在线重定义后,未分区的表重新使用分区表的定义;

三、如何实现分区滑动窗口

有需求:由于数据量巨大,只计划保留最近3个月数据,之前的数据入数据仓库

drop table emp purge;

drop table t_emp purge;

create table emp

(

    empno     number(4) not null,

    ename     varchar2(10),

    hiredate  date

)

partition by range(hiredate)

(

    partition p201309 values less than (to_date('20131001','yyyymmdd')),

    partition p201310 values less than (to_date('20131101','yyyymmdd')),

    partition p201311 values less than (to_date('20131201','yyyymmdd')),

    partition pmax values less than (maxvalue)

);

模拟数据

insert into emp values (100,'Tom',   to_date('20130907','yyyymmdd'));

insert into emp values (500,'Peter', to_date('20131007','yyyymmdd'));

insert into emp values (1000,'Scott',to_date('20131107','yyyymmdd'));

insert into emp values (1999,'Bill', to_date('20131207','yyyymmdd'));

insert into emp values (5000,'Gates',to_date('20140107','yyyymmdd'));

commit;

假设当前时间为20131231235959,1s后操作如下

1、exchange分区和临时表,发生数据交换

创建另一张相同表结构的表

create table t_emp as select * from emp where 1=2;

查看记录分布

[email protected]>select * from emp partition(p201309);

     EMPNO ENAME      HIREDATE

---------- ---------- ---------

       100 Tom        07-SEP-13

[email protected]>select * from t_emp;

no rows selected

利用exchange实现数据交换。exchange利用了指针原理,仅仅改变数据字典,速度非常快,几乎不涉及IO操作

[email protected]>alter table emp exchange partition p201309 with table t_emp;

注意exchange后数据变化,表面上分区和表数据发生了交换,实际是后台指针改变链接位置

[email protected]>select * from emp partition(p201309);

no rows selected

[email protected]>select * from t_emp;

     EMPNO ENAME      HIREDATE

---------- ---------- ---------

       100 Tom        07-SEP-13

2、merge表分区

[email protected]>alter table emp merge partitions p201309,p201310 into partition p201310;

查看数据字典变化

[email protected]>select partition_name from user_extents where segment_name = 'EMP';

PARTITION_NAME

------------------------------

P201310

P201311

PMAX

3、split最后一个分区

[email protected]>alter table emp split partition pmax at (to_date('20140101','yyyymmdd')) into (partition p201312 tablespace ts_emp2, partition pmax tablespace ts_emp3);

查看数据字典变化

[email protected]>select partition_name from user_extents where segment_name = 'EMP';

PARTITION_NAME

------------------------------

P201310

P201311

P201312

PMAX

查看记录分布

[email protected]>select * from emp partition(p201312);

     EMPNO ENAME      HIREDATE

---------- ---------- ---------

      1999 Bill       07-DEC-13

[email protected]>select * from emp partition(pmax);

     EMPNO ENAME      HIREDATE

---------- ---------- ---------

      5000 Gates      07-JAN-14

接下来将t_emp的数据入到数据仓库

至此,滑动窗口完成