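The end of the year got busy and this series paused for a while, but this post finally gets to Flashcache itself. Because Flashcache is a kernel module, building it requires the kernel source tree. For the detailed installation steps, see the installation how-to [1].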
    make -j 4 KERNEL_TREE=/usr/src/kernels/2.6.32-131.0.15.el6.x86_64
    sudo make install
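The earliest version of Flashcache supported only writeback. Writethrough support was later developed as a separate branch in the flashcache-wt directory, but the latest version has merged write through into the main code and also added a write around policy.
The latest source is available on GitHub.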
env GIT_SSL_NO_VERIFY=true git clone https://github.com/facebook/flashcache.git
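Once you have the source, the first thing I recommend is reading flashcache-doc.txt and flashcache-sa-guide.txt under doc. They are guaranteed to be far more nourishing than these few blog posts of mine.
Not everyone has SSD or PCI-E flash hardware, so here is a small trick for building a virtual hybrid storage device. With it you can simulate a Flashcache test environment even on your own laptop, and experiment with it as freely as you like.
First, we can use memory to emulate a very fast flash device. The drawback is that everything is gone after the host reboots, but for experiments that should not be a big problem. There are two ways to emulate a block device in memory: ramdisk, or tmpfs plus a loop device. Because resizing a ramdisk requires editing the grub configuration and rebooting, we use tmpfs here.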
    # Cap tmpfs at 10G to avoid exhausting memory (the test machine has 24G of physical memory)
    $ sudo mount tmpfs /dev/shm -t tmpfs -o size=10240m
    # Create a 2G file to emulate a 2G flash device
    $ dd if=/dev/zero of=/dev/shm/ssd.img bs=1024k count=2048
    # Attach the file as a block device
    $ sudo losetup /dev/loop0 /dev/shm/ssd.img
With the cache device taken care of, we still need a persistent disk device. In the same way, a file on an ordinary disk can be turned into a loop device.
    # Create a 4G file on an ordinary disk file system to emulate a 4G disk device
    $ dd if=/dev/zero of=/u01/jiangfeng/disk.img bs=1024k count=4096
    $ sudo losetup /dev/loop1 /u01/jiangfeng/disk.img
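As a quick sanity check on the two image sizes, the following sketch converts the dd arguments into 512-byte sectors, the unit that device-mapper uses for device lengths. The function name mib_to_sectors is made up for this illustration.

```shell
#!/bin/sh
# dd with bs=1024k count=N writes N MiB; device-mapper counts 512-byte sectors.
# mib_to_sectors is a hypothetical helper name, not part of any tool.
mib_to_sectors() {
    echo $(( $1 * 1024 * 1024 / 512 ))
}

echo "ssd.img : $(mib_to_sectors 2048) sectors"
echo "disk.img: $(mib_to_sectors 4096) sectors"
```

The 4G disk image works out to 8388608 sectors, which is the length that dmsetup table reports for cachedev once the cache has been created. On a live system, sudo blockdev --getsz /dev/loop1 should print the same number.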
This gives us a fast device, /dev/loop0, and a slow disk device, /dev/loop1, so we can now create a Flashcache hybrid storage device.
    $ sudo flashcache_create -p back cachedev /dev/loop0 /dev/loop1
    cachedev cachedev, ssd_devname /dev/loop0, disk_devname /dev/loop1 cache mode WRITE_BACK
    block_size 8, md_block_size 8, cache_size 0
    Flashcache metadata will use 8MB of your 48384MB main memory
    $ sudo mkfs.ext3 /dev/mapper/cachedev
    mke2fs 1.41.12 (17-May-2010)
    Filesystem label=
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    Stride=0 blocks, Stripe width=0 blocks
    262144 inodes, 1048576 blocks
    52428 blocks (5.00%) reserved for the super user
    First data block=0
    Maximum filesystem blocks=1073741824
    32 block groups
    32768 blocks per group, 32768 fragments per group
    8192 inodes per group
    Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736

    Writing inode tables: done
    Creating journal (32768 blocks): done
    Writing superblocks and filesystem accounting information: done

    This filesystem will be automatically checked every 28 mounts or
    180 days, whichever comes first. Use tune2fs -c or -i to override.
    $ sudo mount /dev/mapper/cachedev /u03
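The -p back argument above selects writeback. To try the other two policies mentioned earlier on the same pair of loop devices, a small dry-run helper like the one below can print the command to use. This is only a sketch: the helper name and the long policy names are made up for illustration, and the thru and around values should be checked against flashcache-sa-guide.txt for your version.

```shell
#!/bin/sh
# Print (do not run) a flashcache_create command for a given caching policy.
# flashcache_create_cmd and the long policy names are hypothetical;
# the "thru" and "around" flag values are an assumption to verify in flashcache-sa-guide.txt.
flashcache_create_cmd() {
    policy=$1; name=$2; ssd=$3; disk=$4
    case "$policy" in
        writeback)    flag=back ;;
        writethrough) flag=thru ;;
        writearound)  flag=around ;;
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    echo "flashcache_create -p $flag $name $ssd $disk"
}

flashcache_create_cmd writeback cachedev /dev/loop0 /dev/loop1
```

Printing the command instead of running it keeps the helper safe to use without root. Run the printed line with sudo once it looks right.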
OK, after a quick check we can start running some simulated tests.
    $ sudo dmsetup table
    cachedev: 0 8388608 flashcache conf:
        ssd dev (/dev/loop0), disk dev (/dev/loop1) cache mode(WRITE_BACK)
        capacity(2038M), associativity(512), data block size(4K) metadata block size(4096b)
        skip sequential thresh(0K)
        total blocks(521728), cached blocks(83), cache percent(0)
        dirty blocks(0), dirty percent(0)
        nr_queued(0)
    Size Hist: 4096:84
    $ sudo dmsetup status
    cachedev: 0 8388608 flashcache stats:
        reads(84), writes(0)
        read hits(1), read hit percent(1)
        write hits(0) write hit percent(0)
        dirty write hits(0) dirty write hit percent(0)
        replacement(0), write replacement(0)
        write invalidates(0), read invalidates(0)
        pending enqueues(0), pending inval(0)
        metadata dirties(0), metadata cleans(0)
        metadata batch(0) metadata ssd writes(0)
        cleanings(0) fallow cleanings(0)
        no room(0) front merge(0) back merge(0)
        disk reads(83), disk writes(0) ssd reads(1) ssd writes(83)
        uncached reads(0), uncached writes(0), uncached IO requeue(0)
        uncached sequential reads(0), uncached sequential writes(0)
        pid_adds(0), pid_dels(0), pid_drops(0) pid_expiry(0)
    $ sudo sysctl -a | grep flashcache
    dev.flashcache.loop0+loop1.io_latency_hist = 0
    dev.flashcache.loop0+loop1.do_sync = 0
    dev.flashcache.loop0+loop1.stop_sync = 0
    dev.flashcache.loop0+loop1.dirty_thresh_pct = 20
    dev.flashcache.loop0+loop1.max_clean_ios_total = 4
    dev.flashcache.loop0+loop1.max_clean_ios_set = 2
    dev.flashcache.loop0+loop1.do_pid_expiry = 0
    dev.flashcache.loop0+loop1.max_pids = 100
    dev.flashcache.loop0+loop1.pid_expiry_secs = 60
    dev.flashcache.loop0+loop1.reclaim_policy = 0
    dev.flashcache.loop0+loop1.zero_stats = 0
    dev.flashcache.loop0+loop1.fast_remove = 0
    dev.flashcache.loop0+loop1.cache_all = 1
    dev.flashcache.loop0+loop1.fallow_clean_speed = 2
    dev.flashcache.loop0+loop1.fallow_delay = 900
    dev.flashcache.loop0+loop1.skip_seq_thresh_kb = 0
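When running repeated tests it is handy to pull the read hit ratio out of the status text instead of reading the whole dump. The sketch below recomputes it from the reads(N) and read hits(N) counters. It assumes the stats layout shown above, where the total reads counter appears before disk reads and ssd reads, and it relies on grep -o, which GNU and BSD grep both provide. The function name is made up for this illustration.

```shell
#!/bin/sh
# Print the integer read hit percentage from flashcache stats text on stdin.
# flashcache_read_hit_pct is a hypothetical helper name.
# Assumption: the first "reads(N)" in the text is the total reads counter.
flashcache_read_hit_pct() {
    stats=$(cat)
    reads=$(printf '%s\n' "$stats" | grep -oE 'reads\([0-9]+\)' | head -n 1 | tr -dc '0-9')
    hits=$(printf '%s\n' "$stats" | grep -oE 'read hits\([0-9]+\)' | head -n 1 | tr -dc '0-9')
    # Missing counters or zero reads: report 0 rather than dividing by zero.
    if [ -z "$reads" ] || [ -z "$hits" ] || [ "$reads" -eq 0 ]; then
        echo 0
        return
    fi
    echo $(( hits * 100 / reads ))
}
```

Typical use would be sudo dmsetup status cachedev | flashcache_read_hit_pct. For the counters shown above (84 reads, 1 read hit) integer division gives 1, which agrees with the read hit percent(1) field.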
References:
[1] Flashcache installation How-to
[2] An introduction to Flashcache in Chinese: a writeback block cache on Linux
[3] tmpfs, Wikipedia
[4] loop device, Wikipedia