I recently ran into a case at a customer site that is worth sharing. If your customers also run on the HP platform, note that in 11gR2 the kernel parameter maxfiles_lim should be set to 32767, whereas in 11gR1 and earlier it was set to 65536. (Doc 169706.1 gives 65536 for releases through 11gR1 and, for 11gR2, refers to the install guide; the install guide says 32767.)

Symptoms: adding an ASM disk failed with the error "disk is not visible cluster-wide", accompanied by ORA-27070. In fact, the disk was visible on both nodes. When creating a test diskgroup, the creation succeeded on node 1; on node 2 the diskgroup mounted, but was dismounted again shortly afterwards with ORA-27072.

For both errors, the alert log referenced trace files from the rebalance process, and those trace files showed a number of asynchronous I/O errors. My initial suspicion was that this matches Bug 17264575 : ORA27070 AND HPUXIA64 ERROR: 15 AND DISKGROUP DISMOUNT AFTER ADDING NEW.

According to Bug 17264575 and Doc ID 1604055.1, development says that maxfiles_lim should be reduced to 32767, yet the final problem summary says the parameter was increased. The two statements contradict each other.

@ Asynchronous IO on HP limits the max value of the file descriptor used to
@ 32768, due to the asynch structures storing the fd in a short.
@ .
@ The problem in 10372187 was that if the OS 'number of files' limit was set
@ too high, this low water mark would be set too high, meaning
@ file descriptors used would be > 32768, and so asynchronous IO fails.
@ => checking code shows fix is present, snippet of change:
@ /* 10372187: An fd above 32768 will blow the max value for
@ * an fd in the hp asynch structure (short)
@ ...
@ if (newfd > SB2MAXVAL)
@ {
@ SLERC_OERC (*se) = OER(27080);
@ ...
@ .
@ The setting as used by customer are:
@ maxfiles 8192 8192
@ maxfiles_lim 63488 63488
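To make the limit in the bug note concrete, here is a minimal sketch in C of the range check it describes. This is not Oracle's code: FD_SHORT_MAX, fd_fits_in_short and highest_possible_fd are hypothetical names introduced only for illustration, and the sketch assumes that SB2MAXVAL in the excerpt is the largest value of a signed 2-byte integer (32767) and that descriptors are numbered from 0, so a limit of N allows descriptors 0 through N-1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for SB2MAXVAL from the excerpt (assumption):
 * the largest value a signed 2-byte integer can hold, i.e. 32767. */
#define FD_SHORT_MAX INT16_MAX

/* Returns true if fd can be stored in a signed 16-bit field without loss. */
bool fd_fits_in_short(int fd)
{
    return fd >= 0 && fd <= FD_SHORT_MAX;
}

/* Highest descriptor number a process could be handed under a given
 * per-process limit, assuming descriptors are numbered 0..limit-1. */
int highest_possible_fd(int maxfiles_lim)
{
    return maxfiles_lim - 1;
}
```

With the install guide value, fd_fits_in_short(highest_possible_fd(32767)) is true, so no descriptor can fall outside the range of a short. With the customer's setting of 63488, or with the older value of 65536, it is false, which is consistent with the asynchronous I/O failures described in the bug note.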