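Summary: This paper proposes a probabilistic generative model, the Shell Topic Model (STM), to model organizational phrases and topical content in social forum text. Its main applications are mining organizational phrases and document modeling.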
Venue: ICDM '14.
Abstract: Threaded debate forums have become one of the major social media platforms. Usually people argue with one another using not only claims and evidences about the topic under discussion but also language used to organize them, which we refer to as shell. In this paper, we study how to separate shell from topical contents using unsupervised methods. Along this line, we develop a latent variable model named Shell Topic Model (STM) to jointly model both topics and shell. Experiments on real online debate data show that our model can find both meaningful shell and topics. The results also show the effectiveness of our model by comparing it with several baselines in shell phrases extraction and document modeling.
Download link: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=7023403&tag=1