A First Look at multipath


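Disclaimer: this article is reposted from https://my.oschina.net/u/2475751/blog/1574955 in order to pass along information, for study and discussion only. If it infringes any rights, please contact me and I will remove it promptly.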

Multipath Intro

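Multipath is a general concept. What I cover here is the open-source storage multipathing technology, DM multipath. There are plenty of introductions to multipath already, so this post mainly records the first few questions I had about it, along with the answers:

  • Without enterprise storage, how can you play with multipath?
  • How are multipath devices named?
  • How do you read a device address made of four colon-separated numbers, such as 2:0:0:1?
  • What is a path group?
  • Path grouping policy and IO scheduling policy?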

Without enterprise storage, how can you play with multipath?

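Use a virtual machine and iSCSI. Install a VM, add a block device and two network interfaces to it, then create an iSCSI target backed by that block device. Next, on the machine where you want to try multipath, use an iSCSI client to connect to the iSCSI target. At that point lsblk shows two device nodes for the same underlying block device.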

How are multipath devices named?

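Sometimes you see a long hexadecimal string (the WWID), sometimes a name prefixed with mpath (a user-friendly name), and sometimes an arbitrary string (an alias). multipath uses the WWID by default. Why not the easy-to-remember name? One scenario where friendly names break down is when the root filesystem is on a multipath device. The mapping between friendly names and WWIDs is stored in /etc/multipath/bindings. Reading that file requires the root filesystem to be mounted already, but the multipath service has to start working inside the initrd, when there is no root filesystem yet. Defaulting to the WWID is therefore the safe choice.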
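To make the alias-to-WWID mapping concrete, here is a minimal Python sketch of a lookup that falls back to the WWID when no binding is available. It assumes the bindings text holds one `alias wwid` pair per line with `#` comments; the sample content and the helper names `parse_bindings` and `device_name` are hypothetical and are not part of multipath itself.

```python
def parse_bindings(text):
    """Map alias -> WWID from bindings-style text (assumed: one 'alias wwid' pair per line)."""
    bindings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        alias, wwid = line.split(None, 1)
        bindings[alias] = wwid.strip()
    return bindings


def device_name(wwid, bindings):
    """Return the alias bound to this WWID, or the WWID itself when none is known."""
    for alias, bound in bindings.items():
        if bound == wwid:
            return alias
    return wwid


# Hypothetical bindings content, for illustration only.
SAMPLE = """\
# alias wwid
mpatha 14945540000000000ccb70d0ceeee4280f8450284d6298b59
"""
```

With an empty mapping, which is roughly the situation in the initrd before the root filesystem is mounted, `device_name(wwid, {})` simply returns the WWID. That is why the WWID is the name that always works.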

How do you read a device address made of four colon-separated numbers, such as 2:0:0:1?

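In the device address 2:0:0:1, the numbers correspond to Host:Bus:Target:Lun. For example, if the iSCSI target is reached through two IP addresses, the two paths to the same device differ only in the host field, for example 2:0:0:1 and 3:0:0:1.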
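As a small illustration of the Host:Bus:Target:Lun layout, the following Python sketch splits an address into named fields and reports which fields differ between two paths. The names `ScsiAddress`, `parse_hbtl` and `differing_fields` are made up for this example.

```python
from typing import NamedTuple


class ScsiAddress(NamedTuple):
    host: int
    bus: int
    target: int
    lun: int


def parse_hbtl(text):
    """Parse 'H:B:T:L' into a ScsiAddress."""
    parts = text.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected four colon-separated numbers, got {text!r}")
    return ScsiAddress(*(int(p) for p in parts))


def differing_fields(a, b):
    """Names of the fields in which two addresses differ."""
    return [name for name, x, y in zip(ScsiAddress._fields, a, b) if x != y]
```

For the two paths from the example, `differing_fields(parse_hbtl("2:0:0:1"), parse_hbtl("3:0:0:1"))` returns `["host"]`.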

What is a path group?

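At first I was confused about this concept: I assumed that all the paths to one real device form a single path group, in other words that the following is one path group: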

multipath-demo:~ # multipath -l
14945540000000000ccb70d0ceeee4280f8450284d6298b59 dm-0 IET,VIRTUAL-DISK
size=10G features='1 retain_attached_hw_handler' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| `- 2:0:0:0 sda 8:0  active undef unknown
`-+- policy='service-time 0' prio=0 status=enabled
  `- 3:0:0:0 sdc 8:32 active undef unknown

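In fact, the dm-0 device has two path groups, and each PG contains only one path (a real environment would have several). The group with status active is the one doing the work; the group with status enabled is on standby and receives no IO. To clear this up, I asked Martin, a colleague who works on multipath: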

Please have a look at http://christophe.varoqui.free.fr/refbook.html

Path groups are mainly used for active/passive setups, and for cases where some paths have a higher latency/lower bandwidth than others (imagine a mirrored storage with mirror legs in different physical locations, disaster avoidance: the local mirror will be much faster than remote mirrors).

Only one path group is "active" at any given time. The others are serving as standby, for the case that all paths in the currently active group fail. Depending on the storage array, the host may need to take explicit action to switch from one path group to another (e.g. send a certain SCSI command that forces the storage to activate the stand-by ports).

If the active path group contains multiple paths, switching between these paths (more precisely: between those paths in the path group which are not in failed state) is controlled by the "path_selector" algorithm in the kernel. There are 3 algorithms: "round-robin", "queue-length", and "service-time". See multipath.conf(5). Switching of paths inside a path group, unlike switching between path groups, is assumed to be instantaneous, and to require no explicit action. Regardless which path selector is in use, every healthy path will receive IO sooner or later, unless the multipath device is completely idle.

How the paths are grouped into path groups at discovery time is determined by the "path_grouping_policy". It's "failover" by default, meaning that there's a dedicated path group for every path. But multipath's builtin hardware table sets different defaults for many real-world storage arrays. For modern setups, "group_by_prio" is often the best, combined with "detect_prio yes" or a "prio" setting that assigns different priority to paths with different quality (e.g. "alua", "rdac", or "path_latency").

Path groups are assigned a priority which is calculated as the average of all non-failed paths in the path group. At startup, the path group with the highest prio is set as active PG. When all paths in this PG fail, the kernel will switch to the next-best PG. When paths in the best PG return to good state, the "failback" configuration determines if, and when, to switch back to the best PG.
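With that explanation in mind, the multipath -l listing can be read mechanically: every line carrying `policy=` opens a path group, and the path lines under it belong to that group. The Python sketch below is only an illustration. Its regular expressions are assumptions derived from the single sample in this post, not a parser for every layout multipath can print.

```python
import re

# Sample listing copied from this post.
SAMPLE = r"""14945540000000000ccb70d0ceeee4280f8450284d6298b59 dm-0 IET,VIRTUAL-DISK
size=10G features='1 retain_attached_hw_handler' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| `- 2:0:0:0 sda 8:0  active undef unknown
`-+- policy='service-time 0' prio=0 status=enabled
  `- 3:0:0:0 sdc 8:32 active undef unknown
"""

# Assumed line shapes, based only on the sample above.
GROUP_RE = re.compile(
    r"policy='(?P<policy>[^']*)'\s+prio=(?P<prio>-?\d+)\s+status=(?P<status>\w+)"
)
PATH_RE = re.compile(
    r"(?P<hbtl>\d+:\d+:\d+:\d+)\s+(?P<dev>\S+)\s+(?P<majmin>\d+:\d+)\s+(?P<state>\S+)"
)


def parse_path_groups(output):
    """Return a list of path groups, each with the paths listed beneath it."""
    groups = []
    for line in output.splitlines():
        group = GROUP_RE.search(line)
        if group:
            groups.append({
                "policy": group["policy"],
                "prio": int(group["prio"]),
                "status": group["status"],
                "paths": [],
            })
            continue
        path = PATH_RE.search(line)
        if path and groups:
            groups[-1]["paths"].append(path.groupdict())
    return groups
```

Run on the sample, this yields two groups, one with status active holding sda and one with status enabled holding sdc, which matches the corrected reading of dm-0.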

Path grouping policy and IO scheduling policy?

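The path grouping policy defaults to failover. As Martin says, defaults differ between storage vendors, and group_by_prio is what mainstream setups use; its job is to sort paths into groups. The IO scheduling policy defaults to service-time and decides how IO is distributed across the paths within one PG. Martin explains both in detail above.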
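The grouping rules and the choice of the active PG can be sketched in a few lines of Python. This is a simplified model built from Martin's description, not multipath's actual code: it covers only the failover and group_by_prio policies, it ignores the failback setting, and the device names and priority values are hypothetical.

```python
from collections import defaultdict


def group_paths(paths, policy):
    """Split paths into path groups under a simplified grouping policy."""
    if policy == "failover":
        # One dedicated path group per path.
        return [[p] for p in paths]
    if policy == "group_by_prio":
        # Paths with equal priority share a group; highest priority first.
        by_prio = defaultdict(list)
        for p in paths:
            by_prio[p["prio"]].append(p)
        return [by_prio[prio] for prio in sorted(by_prio, reverse=True)]
    raise ValueError(f"policy not covered by this sketch: {policy}")


def group_priority(group):
    """Average priority of the non-failed paths, or None if all have failed."""
    alive = [p["prio"] for p in group if not p["failed"]]
    return sum(alive) / len(alive) if alive else None


def pick_active_group(groups):
    """Index of the usable group with the highest priority, or None."""
    best_index, best_prio = None, None
    for index, group in enumerate(groups):
        prio = group_priority(group)
        if prio is not None and (best_prio is None or prio > best_prio):
            best_index, best_prio = index, prio
    return best_index


# Hypothetical paths: two fast local paths and two slower remote ones.
PATHS = [
    {"dev": "sda", "prio": 50, "failed": False},
    {"dev": "sdb", "prio": 50, "failed": False},
    {"dev": "sdc", "prio": 10, "failed": False},
    {"dev": "sdd", "prio": 10, "failed": False},
]
```

With these paths, `group_paths(PATHS, "failover")` gives four single-path groups, while `group_paths(PATHS, "group_by_prio")` gives two groups of two. `pick_active_group` chooses the priority-50 group first and moves to the priority-10 group only once both sda and sdb are marked failed.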

Published on November 17, 2017, 16:33