
    Notes from ChinaSys (June 2015)

    Posted by Liu Yutao on 2015-06-08 21:14:00

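    The ChinaSys meeting for the first half of this year was held in Xiamen, which suited me perfectly. I took the chance to visit my parents, ate plenty of seafood, had a few glasses of baijiu with relatives, made a modest trip to Gulangyu with a few reserved fellow programmers, and strolled around Xiamen University and Baicheng Beach. All in all it was very pleasant. After two days of talks, though, I felt more and more that communication between programmers is quite limited. Once I realized that most of the attendees and talks were outside my own research direction, I no longer knew what to chat about. As a result I did not interact much with other people during the two days, which is my main regret about this ChinaSys.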

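    Enough small talk; on to the main topic. According to the organizers, there were 105 registrations this time. Looking around, most were familiar faces from universities and institutes such as Tsinghua, Peking University, ICT (the Institute of Computing Technology, CAS), Shanghai Jiao Tong, Fudan, USTC, HUST, and Beijing Institute of Technology, and from companies such as MSRA and Baidu. Judging from the talks, research at these universities and companies is at a high level, with in-depth work in each area. Because of my own limited background, however, I missed many of the key points, so my notes are rather messy. My biggest takeaway is that I need to learn more about each area and get a broad view of where the different directions in computing are heading: the problems, the challenges, and the main techniques. Then I would take away more than surface-level impressions from meetings like this.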

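    Below are my notes. They contain few fine details, partly because much of this is work in progress by other people and should not be disclosed in depth, but mostly because I do not understand those details myself. During the meeting I wrote down the points that impressed me most directly in Markdown, so here I simply tidy them up and keep them on the blog as a record. The meeting was not clearly divided into sessions, so the notes follow the order of the talks. I am sorry to say that, for various reasons, I missed quite a few talks, and those are not recorded here. This ChinaSys had 28 talks, one keynote by Professor 金海 of HUST, and a panel on "how to do world-class research in China".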


    ChinaSys 2015.6

    105 registrations


    Persistent B+ trees in Non-Volatile Main Memory

    陈世敏 from ICT, CAS

    problem: one B+ tree operation takes several writes; a crash while some of them are still in the cache can leave the tree in one of several inconsistent states

    existing solution:

    • write-ahead log: 4x cost (clflush & mfence)
    • shadowing (like RCU?): a B+ tree update needs to change 2 pointers, which is not a single atomic write.

    solution: re-design B+ tree node.

    • unsorted + slot array/bitmap B+ tree node structure.
    • use an empty slot to store the newly inserted entry
    • atomically update slot array & bitmap
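    The three design points above can be sketched as a toy leaf node. This is a minimal illustration under my own assumptions, not the speaker's implementation: the class name `LeafNode`, the `crash_before_publish` flag, and the use of a single Python assignment to stand in for an atomic NVM write are all hypothetical, and real code would also flush and fence the entry before publishing it.

```python
import bisect


class LeafNode:
    """Toy unsorted leaf: entries sit in fixed slots, and one published
    (bitmap, slot_array) pair decides which of them are visible."""

    def __init__(self, capacity=8):
        self.entries = [None] * capacity   # unsorted key/value slots
        self.meta = (0, ())                # (valid bitmap, slot indexes in key order)

    def insert(self, key, value, crash_before_publish=False):
        bitmap, order = self.meta
        free = next((i for i in range(len(self.entries))
                     if not bitmap & (1 << i)), None)
        if free is None:
            raise OverflowError("leaf full; a real tree would split here")
        # Step 1: write into an empty slot. Its bitmap bit is still 0,
        # so readers cannot see it yet.
        self.entries[free] = (key, value)
        if crash_before_publish:
            return                         # simulated crash: meta untouched
        # Step 2: compute the new slot array off to the side.
        keys = [self.entries[i][0] for i in order]
        pos = bisect.bisect_left(keys, key)
        new_order = order[:pos] + (free,) + order[pos:]
        # Step 3: one assignment stands in for the atomic metadata update.
        self.meta = (bitmap | (1 << free), new_order)

    def items(self):
        _, order = self.meta
        return [self.entries[i] for i in order]
```

    A crash between step 1 and step 3 leaves a stray entry in a slot that the bitmap still marks as free, so the node stays consistent and the slot is simply reused by the next insert.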

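    Twin-Load: A Method for Building Asynchronous Memory Extension on a Synchronous Memory Interface

    陈明宇 from ICT, CAS

    problem: capacity wall (vs. memory wall): capacity is hard to scale (packaging, structure, process technology)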

    DRAM system capacity = number of channels * chips per channel * chip capacity
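    As a quick worked example of the capacity formula, the sketch below multiplies the three factors. The numbers are made up for illustration and the function name is hypothetical; the point is only that capacity grows linearly in each factor, so a limit on any one of them caps the whole system.

```python
def dram_capacity_gib(channels, chips_per_channel, chip_capacity_gbit):
    """Capacity in GiB, given per-chip capacity in Gbit (8 Gbit = 1 GiB)."""
    total_gbit = channels * chips_per_channel * chip_capacity_gbit
    return total_gbit / 8


# Hypothetical system: 4 channels, 32 chips per channel, 4 Gbit chips.
print(dram_capacity_gib(4, 32, 4))
```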

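    goal: support asynchronous extension without modifying the general-purpose processor: synchronous interface (generality) + asynchronous protocol (scalability)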

    solution: split one memory access into one prefetch and one read
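    To make the prefetch-plus-read idea concrete, here is a toy simulation. It is only a sketch under my own assumptions: the `ExtendedMemory` class, the tick-based clock, and the placeholder returned by the first load are hypothetical modelling choices, and the notes do not describe how the real hardware handles timing.

```python
class ExtendedMemory:
    """Toy model: slow memory that must still answer every load at once."""

    def __init__(self, backing, latency_ticks=3):
        self.backing = backing          # addr -> value held in the slow memory
        self.latency = latency_ticks    # hypothetical fetch delay in ticks
        self.now = 0                    # simulated clock
        self.ready_at = {}              # addr -> tick when the fetch completes

    def tick(self, n=1):
        self.now += n

    def load(self, addr):
        """Return (is_real, value) immediately, as a synchronous bus must."""
        done = self.ready_at.get(addr)
        if done is not None and self.now >= done:
            del self.ready_at[addr]
            return True, self.backing[addr]
        # Data not buffered yet: start the fetch once, return a placeholder.
        self.ready_at.setdefault(addr, self.now + self.latency)
        return False, None


def twin_load(mem, addr):
    """Issue the access twice: first as a prefetch, then as the real read."""
    mem.load(addr)                      # load 1: only triggers the fetch
    mem.tick(mem.latency)               # time the CPU could spend elsewhere
    is_real, value = mem.load(addr)     # load 2: the data is buffered by now
    assert is_real
    return value
```

    In this model the processor never waits on an asynchronous reply: both loads complete immediately, and only the second one carries real data.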


    RecFS: Building Reliable and Efficient Cloud Storage Services with File System in User Space

    杨智 from Peking University

    background: storage synchronization approaches (Dropbox?): inotify + rsync, checksum matching

    problem: high cost + inconsistency

    solution: use a user-space file system (FUSE) to intercept write operations and obtain the related information (relation table)
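    The benefit of intercepting writes can be sketched without a real FUSE mount. The class below is a hypothetical in-memory stand-in: it records the byte range of every write so that a sync client could upload only those ranges instead of rescanning files and matching checksums. The `dirty` table is my own simplification and is not the talk's relation table, whose contents the notes do not describe.

```python
class WriteInterceptor:
    """Toy stand-in for a user-space file system layer that sees every write."""

    def __init__(self):
        self.files = {}    # path -> bytearray with the file contents
        self.dirty = {}    # path -> list of (offset, length) not yet synced

    def write(self, path, offset, data):
        buf = self.files.setdefault(path, bytearray())
        if len(buf) < offset:                      # pad holes with zero bytes
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        # Record exactly what changed, at the moment it changes.
        self.dirty.setdefault(path, []).append((offset, len(data)))
        return len(data)

    def pending_sync(self):
        """Return the changed ranges with their bytes, then clear the log."""
        out = {
            path: [(off, bytes(self.files[path][off:off + n]))
                   for off, n in ranges]
            for path, ranges in self.dirty.items()
        }
        self.dirty.clear()
        return out


fs = WriteInterceptor()
fs.write("a.txt", 0, b"hello world")
fs.pending_sync()                      # initial upload
fs.write("a.txt", 6, b"there")
print(fs.pending_sync())               # only the 5 changed bytes remain to sync
```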


    Computational Memory Architecture

    王颖 from ICT, CAS

    Processing in Memory (PIM): compute directly in memory (NDC, NDA …), integrating general-purpose processors, stream processors, etc. into memory.

    problem: PIM returns (enabling technique + demanding app).

    proposal: computational memory (ProPRAM)

    • in-memory computation application
    • CMOS-compatible memory technique

    Reuse the resources already inside memory; no need to integrate new processors etc.; an in-memory accelerator.


    GraM: Scaling Graph Computation to the Trillions

    杨凡 from MSRA

    background: graph engine, large graph computing

    GraM: graph engine - focus on Scalability and Efficiency

    design:

    • simple model - message passing
    • multi-core aware RDMA stack

    GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning

    朱晓伟 from Tsinghua

    background: out-of-core - use disk, guarantee locality by partition.

    insight: if we can guarantee the locality of both source (gather) and destination (scatter) vertex, we are able to merge the 2 phases into 1!

    design: 2 phases -> 1 phase
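    The insight above can be illustrated with a small sketch. Edges are placed into a P x P grid of blocks by (source chunk, destination chunk); streaming the blocks one destination chunk at a time lets each edge update its destination directly, with no separate gather pass over stored updates. PageRank as the example computation, the function names, and the in-memory lists are my own assumptions; the real system streams blocks from disk.

```python
def build_grid(edges, num_vertices, p):
    """Partition edges into a p x p grid by (source chunk, destination chunk)."""
    chunk = -(-num_vertices // p)                  # ceil(num_vertices / p)
    grid = [[[] for _ in range(p)] for _ in range(p)]
    for src, dst in edges:
        grid[src // chunk][dst // chunk].append((src, dst))
    return grid


def pagerank_step(grid, ranks, out_degree, damping=0.85):
    """One iteration that reads sources and updates destinations in one pass."""
    n = len(ranks)
    p = len(grid)
    new_ranks = [(1 - damping) / n] * n
    for j in range(p):                 # fix one destination chunk at a time
        for i in range(p):             # stream every source chunk past it
            for src, dst in grid[i][j]:
                # Reads stay inside source chunk i, writes inside chunk j.
                new_ranks[dst] += damping * ranks[src] / out_degree[src]
    return new_ranks


edges = [(0, 1), (1, 2), (2, 0), (3, 0), (0, 2)]
grid = build_grid(edges, num_vertices=4, p=2)
ranks = pagerank_step(grid, [0.25] * 4, out_degree=[2, 1, 1, 1])
print(ranks)
```

    While the inner loops run for a fixed `j`, all writes land in one destination chunk, which is the locality the partitioning is meant to guarantee.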


    Hardware Isolation is coming, What’s Next for System Software?

    徐天妮 from ICT, CAS

    problem: sharing causes interference, … isolating programs from each other on a shared server is hard.

    insight: a computer is inherently a network; network design ideas (tagging) can be applied to the system.

    PARD: Programmable Architecture for Resourcing-on-Demand

    challenge:

    • hardware must differentiate application requests: tag each app
    • how to design a control plane for a diversity of apps: table + programming interface + interrupt line

    like full-system SRIOV (via tagging in hardware)


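    Fine-Grained Resource Scheduling for Cloud Gaming

    张伟 from HUST

    background: video streaming (low concurrency, low resource utilization) & graphics streaming (demanding on the client device, hard to support across platforms)

    workload: game logic + rendering + compression

    problem: resource scheduling for cloud gaming

    solution:

    • task decoupling: separate logic from rendering
    • integrated scheduling across multiple resource types
    • lightweight workload migration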

    Efficient Deterministic Replay with Hardware Virtualization Extensions

    任仕儒 from Peking University

    motivation: software only deterministic replay

    R&R (record and replay) the memory interleaving with HAV extensions?

    When the VM traps (VM exit), the dirty/access bits in the EPT are used to record the corresponding accesses.

    truncate chunk using performance counter (BTS)


    An effective correlation-aware VM placement scheme for reducing SLA violation in data center

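    许胜 from ICT, CAS

    background: data center power and utilization.

    motivation: a VM placement algorithm (distributed placement) that both guarantees SLA performance and reduces the number of physical servers.

    solution: constrain each server's placement capacity and use an SSP optimization placement algorithm that takes the applications' resource demand characteristics into account.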


    Statistics and Analysis of Data Races in the Linux Kernel

    石剑君 from Beijing Institute of Technology

    motivation: summary of Linux kernel data races

    approach:

    • sources: Bugzilla, Linux mailing lists, changelogs
    • classify into patterns: use before initialization, use after free, access without sync, access with improper sync
    Robust Distributed System Nucleus (rDSN) for Distributed System Study and Research

    郭振宇 from MSRA

    problem: robustness cannot be achieved in a single point, many research tools failed to be adopted in production.

    proposal: come up with a new development framework.

    • need to be able to monitor and manipulate all dependencies and non-determinisms in the system, with good semantic level.
    • well-defined interface for apps, reusable.
    • be practical, do not deviate from existing programming model too far.

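    Keynote: Graph Computing

    金海 from HUST

    Entity correlation in cyberspace

    Where the data is

    • 1% is Web data: 1/500 crawlable (the rest cannot be crawled for deliberate reasons (sites actively block crawlers) or non-deliberate reasons (non-standard pages, etc.))
    • 99% is non-Web data: human-generated, QQ, email, Internet of Things …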

    Accelerating distributed graph processing with RDMA

    高品 from Tsinghua

    motivation: data locality vs. load balance


    Efficient Concurrent Search Tree for Epoch-based In-memory Database

    张凯源 from Shanghai Jiao Tong University

    insight: batch B+ tree node insert, search…

    proposal: buffered B+ tree

    problem: good insert, but bad search
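    The insert/search trade-off noted above shows up even in a very small model. The sketch below swaps the B+ tree for a plain sorted list with an unsorted insert buffer, so it is a stand-in for the buffering idea rather than the structure from the talk; the class name, `buffer_limit`, and the flush policy are all hypothetical.

```python
import bisect


class BufferedIndex:
    """Toy buffered index: cheap appends, batched merges, slower lookups."""

    def __init__(self, buffer_limit=4):
        self.sorted_keys = []           # stands in for the tree proper
        self.buffer = []                # unsorted pending inserts
        self.buffer_limit = buffer_limit
        self.flushes = 0

    def insert(self, key):
        self.buffer.append(key)         # O(1): the index is not touched yet
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        # Apply the whole batch in one pass instead of one insert at a time.
        self.sorted_keys = sorted(self.sorted_keys + self.buffer)
        self.buffer = []
        self.flushes += 1

    def contains(self, key):
        i = bisect.bisect_left(self.sorted_keys, key)
        if i < len(self.sorted_keys) and self.sorted_keys[i] == key:
            return True
        # Extra cost of buffering: unmerged inserts need a linear scan.
        return key in self.buffer
```

    Every lookup has to consult the buffer as well as the sorted structure, which is where the search penalty comes from.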


    Toward Optimized Array-based Computing Framework

    章明星 from Tsinghua

    background: array-based languages: cannot scale-out

    motivation: array-based program - (front end) -> array primitive - (back end) ->

    An optimizer is missing in between

    design:

    • distinguish local and distributed data
    • separate computation and communication
    • optimize each locally-computing period

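    Open-Source Hardware Accelerating Innovative Design

    刘兴华 from LeMaker

    University maker innovation program

    makers vs. DIY

    open-source hardware vs. development boards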


    ShiDianNao: Shifting Vision Processing Closer to the Sensor

    杜子东 from ICT, CAS

    DianNao series: hardware accelerators for neural networks

    background: why put an accelerator next to the sensor: most of the power is consumed by memory


    Hi-fi Playback: Tolerating Position Errors in Shift Operations of Racetrack Memory

    张超 from Peking University

    background: Racetrack memory (latency vs. capacity) proposed by IBM

    shift position errors (the shift stops partway, or goes too far)


    In short, my notes from this ChinaSys are fairly shallow; I hope to do better next time.


