IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    Understanding System and Architecture for Big Data

    Guancheng (G.C.)发表于 2012-05-09 13:18:49
    love 0

    简介:IBM Research最近在Big Data领域有很多工作,例如我们组在4月份在10台采用POWER7处理器的P730服务器上成功地用14分钟跑完了1TB数据的排序(7月份又在10台Power7R2上用8分44秒跑完了1TB排序),这项工作已经发表为一篇IBM Research Report,欢迎大家围观,并提出宝贵意见,谢谢。

    The use of Big Data underpins critical activities in all sectors of our society. Achieving the full transformative potential of Big Data in this increasingly digital world requires both new data analysis algorithms and a new class of systems to handle the dramatic data growth, the demand to integrate structured and unstructured data analytics, and the increasing computing needs of massive-scale analytics. In this paper, we discuss several Big Data research activities at IBM Research: (1) Big Data benchmarking and methodology; (2) workload optimized systems for Big Data; (3) case study of Big Data workloads on IBM Power systems. In (3), we show that preliminary infrastructure tuning results in sorting 1TB data in 14 minutes on 10 Power 730 machines running IBM InfoSphere BigInsights. Further improvement is expected, among other factors, on the new IBM PowerLinuxTM 7R2 systems.

    By: Anne E. Gattiker, Fadi H. Gebara, Ahmed Gheith, H. Peter Hofstee, Damir A. Jamsek, Jian Li, Evan Speight, Ju Wei Shi, Guan Cheng Chen, Peter W. Wong

    Published in: RC25281 in 2012

    LIMITED DISTRIBUTION NOTICE:

    This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

    Download link: http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/f085753cf57c8c35852579e90050598f!OpenDocument&Highlight;=0,big,data

    Questions about this service can be mailed to reports@us.ibm.com .

    相关日志

    • 01/09/2012 X-RIME: 基于Hadoop的开源大规模社交网络分析工具
    • 09/02/2011 IBM中国研究院招聘大规模数据分析实习生
    • 07/17/2011 Facebook的Realtime Hadoop及其应用
    • 08/25/2013 Impala:新一代开源大数据分析引擎
    • 05/17/2012 为什么NoSQL和Hadoop该一起使用?


沪ICP备19023445号-2号
友情链接