Linux Software RAID: Barrier

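What Are Barriers

In Linux software RAID, barriers handle the concurrency between normal IO and sync IO. They can be thought of simply as a lock dedicated to software RAID.

Software RAID must raise a barrier when it does resync/recovery or reconfigures the array, and normal IO has to be suspended while the barrier is up.
The barrier is a counter that can be raised multiple times. It counts how many activities that preclude normal IO are currently in progress.

The barrier can be raised only when there is no pending IO, i.e. nr_pending == 0.

The barrier is raised only if no one is waiting for it to go down. This means that once an IO request is ready, no other operation that requires a barrier will start until that IO request has had its chance.

Regular IO calls wait_barrier. When it returns, no background IO is in progress, and the caller must arrange to call allow_barrier once its IO has finished.

Background IO must call raise_barrier. Once it returns, no normal IO is in progress, and the caller must arrange to call lower_barrier when that particular background IO completes.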

/* Barriers....
 * Sometimes we need to suspend IO while we do something else,
 * either some resync/recovery, or reconfigure the array.
 * To do this we raise a 'barrier'.
 * The 'barrier' is a counter that can be raised multiple times
 * to count how many activities are happening which preclude
 * normal IO.
 * We can only raise the barrier if there is no pending IO.
 * i.e. if nr_pending == 0.
 * We choose only to raise the barrier if no-one is waiting for the
 * barrier to go down.  This means that as soon as an IO request
 * is ready, no other operations which require a barrier will start
 * until the IO request has had a chance.
 *
 * So: regular IO calls 'wait_barrier'.  When that returns there
 *    is no background IO happening.  It must arrange to call
 *    allow_barrier when it has finished its IO.
 * background IO calls must call raise_barrier.  Once that returns
 *    there is no normal IO happening.  It must arrange to call
 *    lower_barrier when the particular background IO completes.
 */
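
As a rough illustration of the protocol described in the comment above, here is a minimal single-threaded sketch in userspace C. It is not kernel code: struct toy_conf and all toy_* functions are hypothetical names introduced here, and where the kernel would sleep on a wait queue these helpers simply return false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the counters; not the kernel's r1conf. */
struct toy_conf {
    int nr_pending; /* normal IO in flight */
    int nr_waiting; /* normal IO parked behind the barrier */
    int barrier;    /* background activities holding the barrier up */
};

/* Background IO: raise only if no normal IO is waiting or pending. */
static bool toy_try_raise_barrier(struct toy_conf *c)
{
    if (c->nr_waiting > 0 || c->nr_pending > 0)
        return false; /* the kernel would sleep here instead */
    c->barrier++;     /* a counter: it may be raised several times */
    return true;
}

/* Background IO finished: drop one reference on the barrier. */
static void toy_lower_barrier(struct toy_conf *c)
{
    assert(c->barrier > 0);
    c->barrier--;
}

/* Normal IO arrives: run if the barrier is down, otherwise become a waiter. */
static bool toy_io_arrive(struct toy_conf *c)
{
    if (c->barrier > 0) {
        c->nr_waiting++;
        return false;
    }
    c->nr_pending++;
    return true;
}

/* A parked IO retries after being woken up. */
static bool toy_io_retry(struct toy_conf *c)
{
    assert(c->nr_waiting > 0);
    if (c->barrier > 0)
        return false;
    c->nr_waiting--;
    c->nr_pending++;
    return true;
}

/* Normal IO finished: a barrier may now be raised again. */
static void toy_allow_barrier(struct toy_conf *c)
{
    assert(c->nr_pending > 0);
    c->nr_pending--;
}
```

In this model a waiter that was parked behind the barrier always gets to run before the next toy_try_raise_barrier succeeds, which is the fairness rule the comment describes.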

Related Data Structures

struct r1conf holds all the configuration-related state of a RAID1 array.

struct r1conf {
	struct mddev		*mddev;
	struct raid1_info	*mirrors;	/* twice 'raid_disks' to
						 * allow for replacements.
						 */
	int			raid_disks;

	spinlock_t		device_lock;

	/* list of 'struct r1bio' that need to be processed by raid1d,
	 * whether to retry a read, writeout a resync or recovery
	 * block, or anything else.
	 */
	struct list_head	retry_list;
	/* A separate list of r1bio which just need raid_end_bio_io called.
	 * This mustn't happen for writes which had any errors if the superblock
	 * needs to be written.
	 */
	struct list_head	bio_end_io_list;

	/* queue pending writes to be submitted on unplug */
	struct bio_list		pending_bio_list;
	int			pending_count;

	/* for use when syncing mirrors:
	 * We don't allow both normal IO and resync/recovery IO at
	 * the same time - resync/recovery can only happen when there
	 * is no other IO.  So when either is active, the other has to wait.
	 * See more details description in raid1.c near raise_barrier().
	 */
	wait_queue_head_t	wait_barrier;
	spinlock_t		resync_lock;
	atomic_t		nr_sync_pending;
	atomic_t		*nr_pending;
	atomic_t		*nr_waiting;
	atomic_t		*nr_queued;
	atomic_t		*barrier;
	int			array_frozen;

	/* Set to 1 if a full sync is needed, (fresh device added).
	 * Cleared when a sync completes.
	 */
	int			fullsync;

	/* When the same as mddev->recovery_disabled we don't allow
	 * recovery to be attempted as we expect a read error.
	 */
	int			recovery_disabled;

	/* poolinfo contains information about the content of the
	 * mempools - it changes when the array grows or shrinks
	 */
	struct pool_info	*poolinfo;
	mempool_t		r1bio_pool;
	mempool_t		r1buf_pool;

	struct bio_set		bio_split;

	/* temporary buffer to synchronous IO when attempting to repair
	 * a read error.
	 */
	struct page		*tmppage;

	/* When taking over an array from a different personality, we store
	 * the new thread here until we fully activate the array.
	 */
	struct md_thread	*thread;

	/* Keep track of cluster resync window to send to other
	 * nodes.
	 */
	sector_t		cluster_sync_low;
	sector_t		cluster_sync_high;
};

For this discussion, three members matter.

  • nr_pending
    Normal IO currently in flight
  • nr_waiting
    Normal IO waiting for sync IO to finish
  • barrier
    Sync IO currently in progress
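
In struct r1conf these three members are pointers, and the functions in this article index them as conf->nr_pending[idx], conf->nr_waiting[idx] and conf->barrier[idx], where idx comes from sector_to_idx(). The article does not show sector_to_idx(), so the following is only a sketch of how such a sector-to-bucket mapping might look. The constants (2^17 sectors per barrier unit, 1024 buckets), the multiplicative hash and every toy_* name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_UNIT_SECTOR_BITS 17  /* assumed: 2^17 sectors per barrier unit */
#define TOY_BUCKETS_NR_BITS  10  /* assumed: 2^10 = 1024 buckets */
#define TOY_BUCKETS_NR       (1 << TOY_BUCKETS_NR_BITS)

static_assert(TOY_BUCKETS_NR_BITS > 0 && TOY_BUCKETS_NR_BITS < 64,
              "the hash shift below must stay within 64 bits");

/* Map a sector to the index of its (hypothetical) barrier bucket. */
static int toy_sector_to_idx(uint64_t sector)
{
    /* All sectors of one barrier unit share the same unit number... */
    uint64_t unit = sector >> TOY_UNIT_SECTOR_BITS;

    /* ...and the unit number is hashed down to TOY_BUCKETS_NR_BITS bits. */
    return (int)((unit * 0x61C8864680B583EBull) >> (64 - TOY_BUCKETS_NR_BITS));
}
```

With a mapping like this, a resync working on one region only blocks normal IO whose sectors hash to the same bucket, instead of blocking the whole array.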

Related Kernel Functions

raise_barrier

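raise_barrier is called only on the sync IO path, from raid1_sync_request. It returns successfully only after the pending normal IO in the same bucket has completed, so the barrier is effectively in place only once normal IO has drained.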

static sector_t raise_barrier(struct r1conf *conf, sector_t sector_nr)
{
	int idx = sector_to_idx(sector_nr);	// index of the barrier bucket for this sector

	spin_lock_irq(&conf->resync_lock);

	/* Wait until no block IO is waiting */
	wait_event_lock_irq(conf->wait_barrier,
			    !atomic_read(&conf->nr_waiting[idx]),
			    conf->resync_lock);

	/* block any new IO from starting */
	atomic_inc(&conf->barrier[idx]);
	/*
	 * In raise_barrier() we firstly increase conf->barrier[idx] then
	 * check conf->nr_pending[idx]. In _wait_barrier() we firstly
	 * increase conf->nr_pending[idx] then check conf->barrier[idx].
	 * A memory barrier here to make sure conf->nr_pending[idx] won't
	 * be fetched before conf->barrier[idx] is increased. Otherwise
	 * there will be a race between raise_barrier() and _wait_barrier().
	 */
	smp_mb__after_atomic();	// memory barrier

	/* For these conditions we must wait:
	 * A: while the array is in frozen state
	 * B: while conf->nr_pending[idx] is not 0, meaning regular I/O
	 *    existing in corresponding I/O barrier bucket.
	 * C: while conf->barrier[idx] >= RESYNC_DEPTH, meaning reaches
	 *    max resync count which allowed on current I/O barrier bucket.
	 */
	wait_event_lock_irq(conf->wait_barrier,
			    (!conf->array_frozen &&
			     !atomic_read(&conf->nr_pending[idx]) &&
			     atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH) ||
				test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery),
			    conf->resync_lock);

	if (test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) {
		atomic_dec(&conf->barrier[idx]);
		spin_unlock_irq(&conf->resync_lock);
		wake_up(&conf->wait_barrier);
		return -EINTR;
	}

	atomic_inc(&conf->nr_sync_pending);
	spin_unlock_irq(&conf->resync_lock);

	return 0;
}
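
The ordering argument in the comment inside raise_barrier (each side increments its own counter, issues a full memory barrier, and only then reads the other side's counter) can be sketched with C11 atomics. This is a userspace model under stated assumptions, not the kernel implementation: atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb__after_atomic(), all model_* names are hypothetical, and instead of sleeping the IO side backs out and returns false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int model_barrier;    /* counter owned by the sync side */
static atomic_int model_nr_pending; /* counter owned by the normal IO side */

/* Sync side, step 1: announce the barrier before looking at nr_pending. */
static void model_raise_begin(void)
{
    atomic_fetch_add(&model_barrier, 1);
    /* The default seq_cst increment already orders this; the explicit
     * fence only mirrors where smp_mb__after_atomic() sits in the kernel. */
    atomic_thread_fence(memory_order_seq_cst);
}

/* Sync side, step 2: the condition the kernel would sleep on. */
static bool model_raise_ready(void)
{
    return atomic_load(&model_nr_pending) == 0;
}

/* IO side: announce the IO first, then look at the barrier. */
static bool model_io_enter(void)
{
    atomic_fetch_add(&model_nr_pending, 1);
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load(&model_barrier) == 0)
        return true;                        /* fast path: no barrier */
    atomic_fetch_sub(&model_nr_pending, 1); /* back out; the kernel waits */
    return false;
}

/* IO side: the IO has completed. */
static void model_io_exit(void)
{
    atomic_fetch_sub(&model_nr_pending, 1);
}

/* Sync side: the sync IO has completed. */
static void model_lower(void)
{
    assert(atomic_load(&model_barrier) > 0);
    atomic_fetch_sub(&model_barrier, 1);
}
```

Because each side publishes its own counter before reading the other one, at least one of them must observe the other: either the sync side sees a non-zero nr_pending and waits, or the IO side sees a non-zero barrier and backs off. Without the full barrier both loads could be satisfied early and both sides could proceed at once.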

wait_barrier

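wait_barrier is called only when a write request is issued, from raid1_write_request. If a barrier is raised on the bucket that covers the target sector at that moment, nr_waiting is incremented, which indicates that sync IO is active on the same sector range at the same time.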

static void _wait_barrier(struct r1conf *conf, int idx)
{
	/*
	 * We need to increase conf->nr_pending[idx] very early here,
	 * then raise_barrier() can be blocked when it waits for
	 * conf->nr_pending[idx] to be 0. Then we can avoid holding
	 * conf->resync_lock when there is no barrier raised in same
	 * barrier unit bucket. Also if the array is frozen, I/O
	 * should be blocked until array is unfrozen.
	 */
	atomic_inc(&conf->nr_pending[idx]);
	/*
	 * In _wait_barrier() we firstly increase conf->nr_pending[idx], then
	 * check conf->barrier[idx]. In raise_barrier() we firstly increase
	 * conf->barrier[idx], then check conf->nr_pending[idx]. A memory
	 * barrier is necessary here to make sure conf->barrier[idx] won't be
	 * fetched before conf->nr_pending[idx] is increased. Otherwise there
	 * will be a race between _wait_barrier() and raise_barrier().
	 */
	smp_mb__after_atomic();

	/*
	 * Don't worry about checking two atomic_t variables at same time
	 * here. If during we check conf->barrier[idx], the array is
	 * frozen (conf->array_frozen is 1), and conf->barrier[idx] is
	 * 0, it is safe to return and make the I/O continue. Because the
	 * array is frozen, all I/O returned here will eventually complete
	 * or be queued, no race will happen. See code comment in
	 * freeze_array().
	 */
	if (!READ_ONCE(conf->array_frozen) &&
	    !atomic_read(&conf->barrier[idx]))
		return;

	/*
	 * After holding conf->resync_lock, conf->nr_pending[idx]
	 * should be decreased before waiting for barrier to drop.
	 * Otherwise, we may encounter a race condition because
	 * raise_barrier() might be waiting for conf->nr_pending[idx]
	 * to be 0 at same time.
	 */
	spin_lock_irq(&conf->resync_lock);
	atomic_inc(&conf->nr_waiting[idx]);
	atomic_dec(&conf->nr_pending[idx]);
	/*
	 * In case freeze_array() is waiting for
	 * get_unqueued_pending() == extra
	 */
	wake_up(&conf->wait_barrier);
	/* Wait for the barrier in same barrier unit bucket to drop. */
	wait_event_lock_irq(conf->wait_barrier,
			    !conf->array_frozen &&
			     !atomic_read(&conf->barrier[idx]),
			    conf->resync_lock);
	atomic_inc(&conf->nr_pending[idx]);
	atomic_dec(&conf->nr_waiting[idx]);
	spin_unlock_irq(&conf->resync_lock);
}

static void wait_barrier(struct r1conf *conf, sector_t sector_nr)
{
	int idx = sector_to_idx(sector_nr);

	_wait_barrier(conf, idx);
}

wait_read_barrier

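wait_read_barrier is called only when a read request is issued, from raid1_read_request. On entry it counts the request as pending by incrementing nr_pending for the bucket, and if the array is not frozen it returns immediately.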

static void wait_read_barrier(struct r1conf *conf, sector_t sector_nr)
{
	int idx = sector_to_idx(sector_nr);

	/*
	 * Very similar to _wait_barrier(). The difference is, for read
	 * I/O we don't need wait for sync I/O, but if the whole array
	 * is frozen, the read I/O still has to wait until the array is
	 * unfrozen. Since there is no ordering requirement with
	 * conf->barrier[idx] here, memory barrier is unnecessary as well.
	 */
	atomic_inc(&conf->nr_pending[idx]);

	if (!READ_ONCE(conf->array_frozen))
		return;

	spin_lock_irq(&conf->resync_lock);
	atomic_inc(&conf->nr_waiting[idx]);
	atomic_dec(&conf->nr_pending[idx]);
	/*
	 * In case freeze_array() is waiting for
	 * get_unqueued_pending() == extra
	 */
	wake_up(&conf->wait_barrier);
	/* Wait for array to be unfrozen */
	wait_event_lock_irq(conf->wait_barrier,
			    !conf->array_frozen,
			    conf->resync_lock);
	atomic_inc(&conf->nr_pending[idx]);
	atomic_dec(&conf->nr_waiting[idx]);
	spin_unlock_irq(&conf->resync_lock);
}
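
The difference between the two wait functions comes down to the condition each one waits for. The sketch below only restates those two conditions as predicates. struct gate_state and the gate_* functions are hypothetical names, and the snapshot ignores locking and the nr_pending/nr_waiting accounting entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state that both wait functions test. */
struct gate_state {
    bool array_frozen; /* models conf->array_frozen */
    int  barrier;      /* models conf->barrier[idx] for one bucket */
};

/* Condition _wait_barrier() waits for: not frozen and no barrier raised. */
static bool gate_write_may_proceed(const struct gate_state *s)
{
    return !s->array_frozen && s->barrier == 0;
}

/* Condition wait_read_barrier() waits for: the barrier count is ignored. */
static bool gate_read_may_proceed(const struct gate_state *s)
{
    return !s->array_frozen;
}

/* Walk through the three interesting states. */
static void gate_demo(void)
{
    struct gate_state idle    = { false, 0 };
    struct gate_state syncing = { false, 2 };
    struct gate_state frozen  = { true, 0 };

    assert(gate_write_may_proceed(&idle) && gate_read_may_proceed(&idle));
    assert(!gate_write_may_proceed(&syncing) && gate_read_may_proceed(&syncing));
    assert(!gate_write_may_proceed(&frozen) && !gate_read_may_proceed(&frozen));
}
```

Reads therefore keep flowing while a resync holds the barrier on their bucket, and only a frozen array stops them.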

lower_barrier

static void lower_barrier(struct r1conf *conf, sector_t sector_nr)
{
	int idx = sector_to_idx(sector_nr);

	BUG_ON(atomic_read(&conf->barrier[idx]) <= 0);

	atomic_dec(&conf->barrier[idx]);
	atomic_dec(&conf->nr_sync_pending);
	wake_up(&conf->wait_barrier);
}
