Table of Contents
- I. Overview of the Memory Zone (zone)
- II. Analysis of the zone Structure Source Code
  - 1. The watermark member
  - 2. The lowmem_reserve member
  - 3. The zone_pgdat member
  - 4. The pageset member
  - 5. The zone_start_pfn member
  - 6. The managed_pages, spanned_pages, and present_pages members
  - 7. The name member
  - 8. The free_area member
- III. Full Source of the zone Structure
The memory management system has a 3-level structure:
① memory node (Node),
② memory zone (Zone),
③ memory page (Page).
The Linux kernel uses these 3 levels to describe and manage "physical memory".
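As a rough orientation, the nesting of the three levels looks like this; the field names come from include/linux/mmzone.h, but the definition is heavily trimmed, so treat it as a sketch rather than the real declaration:

/* Sketch of how the three levels nest: a node (pglist_data) contains an
 * array of zones, and each zone manages a range of physical page frames. */
typedef struct pglist_data {
	struct zone node_zones[MAX_NR_ZONES];	/* level 2: the zones of this node */
	int nr_zones;				/* number of populated zones */
	unsigned long node_start_pfn;		/* first page frame of this node */
	unsigned long node_present_pages;	/* total number of physical pages */
	int node_id;
	/* ... many other fields omitted ... */
} pg_data_t;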
I. Overview of the Memory Zone (zone)

The "memory node" is the top-level structure of memory management. Dividing a memory node further yields the "memory zones" (zone). A memory zone is described in the Linux kernel by the struct zone type, defined in the kernel source at linux-4.12\include\linux\mmzone.h#350. Each "memory zone" is described by exactly 1 struct zone instance.
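The zone types themselves are enumerated just above struct zone in the same header. Quoted from memory for linux-4.12, so minor differences are possible; which zones exist on a given system depends on the architecture and kernel configuration:

enum zone_type {
#ifdef CONFIG_ZONE_DMA
	ZONE_DMA,	/* memory reachable by legacy DMA devices */
#endif
#ifdef CONFIG_ZONE_DMA32
	ZONE_DMA32,	/* memory below 4 GiB, for 32-bit DMA */
#endif
	ZONE_NORMAL,	/* normally addressable memory */
#ifdef CONFIG_HIGHMEM
	ZONE_HIGHMEM,	/* memory not permanently mapped (32-bit systems) */
#endif
	ZONE_MOVABLE,	/* movable pages, eases hotplug and anti-fragmentation */
#ifdef CONFIG_ZONE_DEVICE
	ZONE_DEVICE,	/* device memory, e.g. persistent memory */
#endif
	__MAX_NR_ZONES
};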
II. Analysis of the zone Structure Source Code

1. The watermark member

watermark holds the watermarks used by the "page allocator"; as the comment notes, it is accessed through the *_wmark_pages(zone) macros.
/* zone watermarks, access with *_wmark_pages(zone) macros */
unsigned long watermark[NR_WMARK];
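The index into watermark[] and the accessor macros mentioned in the comment look roughly like this in the same header (quoted from memory, so treat as a sketch). The allocator wakes kswapd when free pages drop below the low watermark and lets background reclaim run until the high watermark is reached again.

/* Watermark indices and accessor macros (sketch, linux-4.12 mmzone.h) */
enum zone_watermarks {
	WMARK_MIN,
	WMARK_LOW,
	WMARK_HIGH,
	NR_WMARK
};

#define min_wmark_pages(z)  (z->watermark[WMARK_MIN])
#define low_wmark_pages(z)  (z->watermark[WMARK_LOW])
#define high_wmark_pages(z) (z->watermark[WMARK_HIGH])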
2. The lowmem_reserve member

lowmem_reserve records how much memory the page allocator must keep reserved in each lower zone; this reserved memory must not be lent out to allocations intended for higher zone types.
/*
* We don't know if the memory that we're going to allocate will be
* freeable or/and it will be released eventually, so to avoid totally
* wasting several GB of ram we must reserve some of the lower zone
* memory (otherwise we risk to run OOM on the lower zones despite
* there being tons of freeable ram on the higher zones). This array is
* recalculated at runtime if the sysctl_lowmem_reserve_ratio sysctl
* changes.
*/
long lowmem_reserve[MAX_NR_ZONES];
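A minimal sketch of how the allocator folds lowmem_reserve into its free-page check, simplified from the watermark test in mm/page_alloc.c (the real __zone_watermark_ok() handles many more cases):

/* Simplified watermark test: pages reserved for lower zones are added to
 * the mark, so a request that could also be served from a higher zone does
 * not consume memory that only this lower zone can provide. */
static bool zone_watermark_ok_sketch(struct zone *z, unsigned long mark,
				     int classzone_idx, unsigned long free_pages)
{
	if (free_pages <= mark + z->lowmem_reserve[classzone_idx])
		return false;
	return true;
}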
3. The zone_pgdat member

zone_pgdat points to the pglist_data instance of the "memory node" this zone belongs to.
struct pglist_data *zone_pgdat;
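Because zone_pgdat always points back at the owning node, node-level information is one dereference away. A small illustrative helper (node_id is a real pglist_data field; the helper itself is only for illustration):

/* Illustrative only: resolve the node id of a zone through its
 * back-pointer to the owning pglist_data. */
static int zone_node_id_sketch(struct zone *z)
{
	return z->zone_pgdat->node_id;
}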
4. The pageset member

pageset points to the per-CPU page sets of this zone: each processor keeps its own small cache of pages, so single-page allocations can usually avoid taking the zone lock.
struct per_cpu_pageset __percpu *pageset;
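The per-CPU structure it points to is defined in the same header; roughly as follows for linux-4.12 (quoted from memory and trimmed, so treat as a sketch):

/* Per-CPU page cache: each CPU keeps short lists of free pages, one list
 * per migrate type, so single-page allocation and freeing can usually
 * avoid taking zone->lock. */
struct per_cpu_pages {
	int count;	/* number of pages in the lists */
	int high;	/* high watermark, emptying needed */
	int batch;	/* chunk size for buddy add/remove */

	/* Lists of pages, one per migrate type stored on the pcp-lists */
	struct list_head lists[MIGRATE_PCPTYPES];
};

struct per_cpu_pageset {
	struct per_cpu_pages pcp;
	/* ... per-CPU vm statistics follow ... */
};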
5. The zone_start_pfn member

zone_start_pfn is the page frame number (PFN) of the first physical page of this "memory zone".
/* zone_start_pfn == zone_start_paddr >> PAGE_SHIFT */
unsigned long zone_start_pfn;
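The comment already gives the relation to the physical start address; spelled out as code below. zone_end_pfn() exists in the same header in essentially this form, while the paddr helper merely restates the comment and is only illustrative:

static inline unsigned long zone_end_pfn(const struct zone *zone)
{
	return zone->zone_start_pfn + zone->spanned_pages;
}

/* Illustrative only: zone_start_pfn == zone_start_paddr >> PAGE_SHIFT */
static inline phys_addr_t zone_start_paddr_sketch(const struct zone *zone)
{
	return (phys_addr_t)zone->zone_start_pfn << PAGE_SHIFT;
}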
6. The managed_pages, spanned_pages, and present_pages members

managed_pages is the number of physical pages managed by the "buddy allocator";
spanned_pages is the number of physical pages this "memory zone" spans, including "memory holes";
present_pages is the number of physical pages this "memory zone" actually contains, excluding "memory holes".
/*
* spanned_pages is the total pages spanned by the zone, including
* holes, which is calculated as:
* spanned_pages = zone_end_pfn - zone_start_pfn;
*
* present_pages is physical pages existing within the zone, which
* is calculated as:
* present_pages = spanned_pages - absent_pages(pages in holes);
*
* managed_pages is present pages managed by the buddy system, which
* is calculated as (reserved_pages includes pages allocated by the
* bootmem allocator):
* managed_pages = present_pages - reserved_pages;
*
* So present_pages may be used by memory hotplug or memory power
* management logic to figure out unmanaged pages by checking
* (present_pages - managed_pages). And managed_pages should be used
* by page allocator and vm scanner to calculate all kinds of watermarks
* and thresholds.
*
* Locking rules:
*
* zone_start_pfn and spanned_pages are protected by span_seqlock.
* It is a seqlock because it has to be read outside of zone->lock,
* and it is done in the main allocator path. But, it is written
* quite infrequently.
*
* The span_seq lock is declared along with zone->lock because it is
* frequently read in proximity to zone->lock. It's good to
* give them a chance of being in the same cacheline.
*
* Write access to present_pages at runtime should be protected by
* mem_hotplug_begin/end(). Any reader who can't tolerant drift of
* present_pages should get_online_mems() to get a stable value.
*
* Read access to managed_pages should be safe because it's unsigned
* long. Write access to zone->managed_pages and totalram_pages are
* protected by managed_page_count_lock at runtime. Idealy only
* adjust_managed_page_count() should be used instead of directly
* touching zone->managed_pages and totalram_pages.
*/
unsigned long managed_pages;
unsigned long spanned_pages;
unsigned long present_pages;
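In short: spanned_pages = zone_end_pfn - zone_start_pfn, present_pages = spanned_pages minus the pages lost to holes, and managed_pages = present_pages minus the reserved pages. The "unmanaged pages" quantity that the comment recommends for memory hotplug and power management code is then a simple subtraction (illustrative helper, not kernel code):

/* Illustrative only: pages present in the zone but not handed to the
 * buddy allocator (e.g. reserved by the boot-time allocator). */
static inline unsigned long zone_unmanaged_pages_sketch(const struct zone *zone)
{
	return zone->present_pages - zone->managed_pages;
}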
7. The name member

name is the name of this "memory zone", e.g. "DMA", "Normal", or "HighMem".
const char *name;
8. The free_area member

free_area describes the free memory areas of different sizes in this zone: entry n tracks the free blocks of 2^n contiguous pages used by the buddy allocator.
/* free areas of different sizes */
struct free_area free_area[MAX_ORDER];
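struct free_area itself is small; roughly as defined in the same header for linux-4.12 (quoted from memory, so treat as a sketch):

/* Free-list bookkeeping for one allocation order: free_area[n] in struct
 * zone tracks free blocks of 2^n contiguous pages, with one list per
 * migrate type. */
struct free_area {
	struct list_head	free_list[MIGRATE_TYPES];
	unsigned long		nr_free;	/* number of free blocks of this order */
};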
III. Full Source of the zone Structure
struct zone {
/* Read-mostly fields */
/* zone watermarks, access with *_wmark_pages(zone) macros */
unsigned long watermark[NR_WMARK];
unsigned long nr_reserved_highatomic;
/*
* We don't know if the memory that we're going to allocate will be
* freeable or/and it will be released eventually, so to avoid totally
* wasting several GB of ram we must reserve some of the lower zone
* memory (otherwise we risk to run OOM on the lower zones despite
* there being tons of freeable ram on the higher zones). This array is
* recalculated at runtime if the sysctl_lowmem_reserve_ratio sysctl
* changes.
*/
long lowmem_reserve[MAX_NR_ZONES];
#ifdef CONFIG_NUMA
int node;
#endif
struct pglist_data *zone_pgdat;
struct per_cpu_pageset __percpu *pageset;
#ifndef CONFIG_SPARSEMEM
/*
* Flags for a pageblock_nr_pages block. See pageblock-flags.h.
* In SPARSEMEM, this map is stored in struct mem_section
*/
unsigned long *pageblock_flags;
#endif /* CONFIG_SPARSEMEM */
/* zone_start_pfn == zone_start_paddr >> PAGE_SHIFT */
unsigned long zone_start_pfn;
/*
* spanned_pages is the total pages spanned by the zone, including
* holes, which is calculated as:
* spanned_pages = zone_end_pfn - zone_start_pfn;
*
* present_pages is physical pages existing within the zone, which
* is calculated as:
* present_pages = spanned_pages - absent_pages(pages in holes);
*
* managed_pages is present pages managed by the buddy system, which
* is calculated as (reserved_pages includes pages allocated by the
* bootmem allocator):
* managed_pages = present_pages - reserved_pages;
*
* So present_pages may be used by memory hotplug or memory power
* management logic to figure out unmanaged pages by checking
* (present_pages - managed_pages). And managed_pages should be used
* by page allocator and vm scanner to calculate all kinds of watermarks
* and thresholds.
*
* Locking rules:
*
* zone_start_pfn and spanned_pages are protected by span_seqlock.
* It is a seqlock because it has to be read outside of zone->lock,
* and it is done in the main allocator path. But, it is written
* quite infrequently.
*
* The span_seq lock is declared along with zone->lock because it is
* frequently read in proximity to zone->lock. It's good to
* give them a chance of being in the same cacheline.
*
* Write access to present_pages at runtime should be protected by
* mem_hotplug_begin/end(). Any reader who can't tolerant drift of
* present_pages should get_online_mems() to get a stable value.
*
* Read access to managed_pages should be safe because it's unsigned
* long. Write access to zone->managed_pages and totalram_pages are
* protected by managed_page_count_lock at runtime. Idealy only
* adjust_managed_page_count() should be used instead of directly
* touching zone->managed_pages and totalram_pages.
*/
unsigned long managed_pages;
unsigned long spanned_pages;
unsigned long present_pages;
const char *name;
#ifdef CONFIG_MEMORY_ISOLATION
/*
* Number of isolated pageblock. It is used to solve incorrect
* freepage counting problem due to racy retrieving migratetype
* of pageblock. Protected by zone->lock.
*/
unsigned long nr_isolate_pageblock;
#endif
#ifdef CONFIG_MEMORY_HOTPLUG
/* see spanned/present_pages for more description */
seqlock_t span_seqlock;
#endif
int initialized;
/* Write-intensive fields used from the page allocator */
ZONE_PADDING(_pad1_)
/* free areas of different sizes */
struct free_area free_area[MAX_ORDER];
/* zone flags, see below */
unsigned long flags;
/* Primarily protects free_area */
spinlock_t lock;
/* Write-intensive fields used by compaction and vmstats. */
ZONE_PADDING(_pad2_)
/*
* When free pages are below this point, additional steps are taken
* when reading the number of free pages to avoid per-cpu counter
* drift allowing watermarks to be breached
*/
unsigned long percpu_drift_mark;
#if defined CONFIG_COMPACTION || defined CONFIG_CMA
/* pfn where compaction free scanner should start */
unsigned long compact_cached_free_pfn;
/* pfn where async and sync compaction migration scanner should start */
unsigned long compact_cached_migrate_pfn[2];
#endif
#ifdef CONFIG_COMPACTION
/*
* On compaction failure, 1<<compact_defer_shift compactions
* are skipped before trying again. The number attempted since
* last failure is tracked with compact_considered.
*/
unsigned int compact_considered;
unsigned int compact_defer_shift;
int compact_order_failed;
#endif
#if defined CONFIG_COMPACTION || defined CONFIG_CMA
/* Set to true when the PG_migrate_skip bits should be cleared */
bool compact_blockskip_flush;
#endif
bool contiguous;
ZONE_PADDING(_pad3_)
/* Zone statistics */
atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
} ____cacheline_internodealigned_in_smp;