Computational Linguistics and Corpora
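Methods for International Child Language Research (国际儿童语言研究方法)
Author: Brian MacWhinney 2010 - 6
Synopsis of "Methods for International Child Language Research: The CHILDES International Child Language Corpus Data Storage and Analysis System": The translated manual and the empirical research data will help advance research on the acquisition of Chinese as a first language in monolingual and bilingual settings. This is, of course, only a beginning. For Mandarin Chinese we still lack detailed longitudinal studies comparable to those available for Cantonese and for many European languages. Among the available Chinese corpora, only the narrative data Tardif collected in Beijing and the bilingual corpus of the Chinese University of Hong Kong have linked audio.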
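Research on Methods for the Automatic Recognition of Chinese Noun Phrases and Verb Phrases (汉语名词短语和动词短语的自动识别方法研究)
Authors: Li Rong / Cao Jianfang Publisher: Beijing Yanshan Press 2008 - 6
"Research on Methods for the Automatic Recognition of Chinese Noun Phrases and Verb Phrases" addresses the practical needs of Chinese information processing. It first describes how to recognize Chinese noun phrases and verb phrases with rule-based methods, then describes how to recognize Chinese noun phrases with hidden Markov models and Chinese verb phrases with support vector machines. On that basis, it explores ways to resolve the various kinds of ambiguity that arise when a computer analyzes Chinese phrase structure. Phrase recognition is an important component of Chinese information processing. The book can serve as a teaching reference for senior undergraduates in computer science and as a reference for those engaged in research on Chinese information processing and artificial intelligence.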
Mathematical Models of Spoken Language
Author: Stephen Levinson Publisher: Wiley 2003 - 8
"Mathematical Models of Spoken Language" presents the motivations for, intuitions behind, and basic mathematical models of natural spoken language communication. A comprehensive overview is given of all aspects of the problem from the physics of speech production through the hierarchy of linguistic structure and ending with some observations on language and mind. The author comprehensively explores the argument that these modern technologies are actually the most extensive compilations of linguistic knowledge available. Throughout the book, the emphasis is on placing all the material in a mathematically coherent and computationally tractable framework that captures linguistic structure. It presents material that appears nowhere else and gives a unification of formalisms and perspectives used by linguists and engineers. Its unique features include a coherent nomenclature that emphasizes the deep connections amongst the diverse mathematical models and explores the methods by means of which they capture linguistic structure (in contrast with some of the superficial similarities described in the existing literature); the historical background and origins of the theories and models; the connections to related disciplines, e.g. artificial intelligence, automata theory and information theory; an elucidation of the current debates and their intellectual origins; many important little-known results and some original proofs of fundamental results, e.g. a geometric interpretation of parameter estimation techniques for stochastic models; and finally the author's own unique perspectives on the future of this discipline.
There is a vast literature on Speech Recognition and Synthesis; however, this book is unlike any other in the field. Although it appears to be a rapidly advancing field, the fundamentals have not changed in decades. Most of the results are presented in journals from which it is difficult to integrate and evaluate all of these recent ideas. Some of the fundamentals have been collected into textbooks, which give detailed descriptions of the techniques but no motivation or perspective. The linguistic texts are mostly descriptive and pictorial, lacking the mathematical and computational aspects. This book strikes a useful balance by covering a wide range of ideas in a common framework. It provides all the basic algorithms and computational techniques and an analysis and perspective, which allows one to intelligently read the latest literature and understand state-of-the-art techniques as they evolve.
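Corpus-Based Studies of Translational Chinese in English-Chinese Translation (英汉翻译中的汉语译文语料库研究)
Author: Xiao Zhonghua Publisher: Shanghai Jiao Tong University Press 2012 - 6
"Corpus-Based Studies of Translational Chinese in English-Chinese Translation" presents the final results of the National Social Science Fund project "A Corpus-Based Quantitative Study of the Linguistic Features of English-Chinese Translation" (project approval no. 07BYY011). The project combines contrastive linguistics with translation studies. Using a corpus-based approach, it carries out qualitative and quantitative comparisons between a corpus of native Chinese and a corresponding corpus of translated Chinese, examines the features of translated Chinese at both macro and micro scales, from multiple angles and at different levels, and identifies the features that distinguish translated Chinese. It also examines source-language interference in an English-Chinese parallel corpus to determine how strongly the source language influences the translation during the translation process. Finally, it compares the findings on translated Chinese with existing international research on translation universals, which is based mainly on translated English, and discusses what the evidence from a language pair as distant as English and Chinese implies for the so-called "third code".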
Cognition-Based Studies in Chinese Computational Linguistics (基于认知的汉语计算语言学研究)
Author: Yuan Yulin Publisher: Peking University Press 2008
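Preface by Lu
Preface by Feng
I. Computational Theory and Language Research
Theoretical Methods and Research Orientations of Computational Linguistics
The Usefulness and Limitations of Statistics-Based Language Processing Models
Cognitive Science and Chinese Computational Linguistics
Theories and Methods of Language Research Oriented Toward Contemporary Science and Technology
II. Argument Structure and Descriptive Frameworks
Hierarchical Relations and Semantic Features of Argument Roles
A Set of Grammatical Indicators for the Argument Roles of Chinese Verbs
A Descriptive Framework for the Argument Structure of Chinese Predicates
Motivations, Mechanisms, and Conditions of the Interaction Between Argument Structure and Sentence-Pattern Structure: The Influence of Expressive Refinement on Verb Valency and Sentence Construction
III. Information Extraction and Semantic Annotation
Research on Semantic Knowledge Resources for Information Extraction
Matching Verb Argument Structures with Event Templates: A Verb-Driven Approach to Information Extraction
Constraining Template Matching with Logical and Discourse Knowledge: The Use of Logical-Structure and Discourse-Structure Knowledge in Information Extraction
A System and Specification for Semantic Annotation Based on Argument Structure
The Practice of Semantically Annotating Authentic News Texts
IV. Special Topics and Case Studies
Container Metaphors, Kit Metaphors, and Related Grammatical Phenomena: A Cognitive Explanation and Computational Analysis of Word Co-occurrence Restrictions
Some Opinions on Word Segmentation Standards and Standard Word Lists
Questions and Answers on Linguistic Difficulties in Chinese Information Processing
Buffered Mobile Communication and Its Direction of Development: A Linguist's Design Ideas
Toward Multi-Level Interactive Research on Chinese
V. Appendices
A Critical Biography of Yuen Ren Chao
A Critical Biography of Zhu Dexi
Afterword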
The Handbook of Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics)
Authors: Clark, Alexander; Fox, Chris; Lappin, Shalom Publisher: Wiley-Blackwell 2010 - 8
This comprehensive reference work provides an overview of the concepts, methodologies, and applications in computational linguistics and natural language processing (NLP). It features contributions by the top researchers in the field, reflecting the work that is driving the discipline forward. It includes an introduction to the major theoretical issues in these fields, as well as the central engineering applications that the work has produced. It presents the major developments in an accessible way, explaining the close connection between scientific understanding of the computational properties of natural language and the creation of effective language technologies. It serves as an invaluable state-of-the-art reference source for computational linguists and software engineers developing NLP applications in industrial research and development labs of software companies.
Spoken Corpus Linguistics
Authors: Svenja Adolphs / Ronald Carter Publisher: Routledge 2013 - 3
In this book, Adolphs and Carter explore key approaches to work in spoken corpus linguistics. The book discusses some of the pioneering challenges faced in designing, building and utilising insights from the analysis of spoken corpora, arguing that, even though writing is heavily privileged in corpus research, the spoken language can reveal patterns of language use that are both different and distinctive and that this has important implications for the way in which language is described, for the study of human communication and for the field of applied linguistics as a whole. Spoken Corpus Linguistics is divided into two main parts. The first part sets the scene by discussing traditional and new approaches to monomodal spoken corpus analysis, with a focus on discourse organisation and conversational interaction and with particular attention to forms of language such as discourse markers and multi-word units, areas of language not conventionally described but which are argued to be of importance to spoken language description and to spoken language learning and teaching research within the field of applied linguistics. The second part of the book moves into the multimodal domain and focuses on alignments between language and gesture in a spoken corpus, with particular reference to gestural movements of the head and the hand and to the different ways in which prosody might be used to enhance communication. A brief final chapter discusses new developments in the area of spoken corpus research, including the relationship between language and context, emerging research methods as well as discussing possible shifts in scope and emphasis in spoken corpus research in the future.
Foundations of Computational Linguistics
Author: Hausser, Roland R. Publisher: Springer Verlag
The central task of a future-oriented computational linguistics is the development of cognitive machines which humans can freely talk with in their respective natural language. In the long run, this task will ensure the development of a functional theory of language, an objective method of verification, and a wide range of practical applications.
Natural communication requires not only verbal processing, but also non-verbal perception and action. Therefore the content of this textbook is organized as a theory of language for the construction of talking robots. The main topic is the mechanism of natural language communication in both the speaker and the hearer. The book contains more than 700 exercises for reviewing key ideas and important problems.
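Corpus Linguistics (语料库语言学)
Authors: Douglas Biber (Biber D.), Susan Conrad (Conrad S.), Randi Reppen (Reppen R.) Translators: Liu Ying, Hu Haitao 2012 - 10
Each chapter of "Corpus Linguistics" focuses on one linguistic question and uses worked examples to show in detail how it is analyzed quantitatively and qualitatively. The book can be used as a textbook for senior undergraduates and graduate students in Chinese, foreign languages, and related majors, and as a reference for researchers in corpus linguistics, computer-aided language research, and natural language processing.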
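A Course in Corpus Applications (语料库应用教程)
Authors: Liang Maocheng / Li Wenzhong 2010 - 7
To meet readers' need to learn how to build corpora hands-on, the National Research Centre for Foreign Language Education (中国外语教育研究中心) and the Foreign Language Teaching and Research Press have, since 2006, run a workshop on corpora and foreign language research for four consecutive years, with Drs. Liang Maocheng, Li Wenzhong, and Xu Jiajin as the principal lecturers. Drawing on their teaching experience, they collected and designed many case studies and some quite effective software, and carefully compiled this "Course in Corpus Applications", which teaches the methods and techniques of corpus creation and application step by step. This is cause for celebration, and the book is one well worth recommending strongly to readers. What are its merits?
1) Clear reasoning. The book emphasizes hands-on practice, but knowing how is not enough; one must also know why, or one ends up working blindly. Corpus linguistics draws on many disciplines, such as text analysis, register analysis, computer technology, and statistics. The book cuts through the complexity, keeps the essentials, and explains the principles and concepts clearly.
2) Orderly, step-by-step progression. Hands-on work requires attention to sequence, which is a matter of organization. The book is carefully planned and well arranged, so that learners can get started easily and then advance gradually toward mastery.
3) A commitment to the new, reflecting recent advances in corpus linguistics. For example, XML has been developed and accepted only in recent years and is well suited to recording the structured information in a corpus, while regular expressions, which go hand in hand with it and enable efficient retrieval, are coming into wide use in corpus research. The book explains clearly and in detail how to use XML in corpus construction and how to use regular expressions for efficient retrieval, topics on which existing corpus textbooks are all silent.
4) Rich resources. The book introduces many corpus linguistics resources and comes with software written by the authors themselves, which makes it much easier for learners to build corpora and use them in research.
I have been involved with corpus linguistics for more than ten years and have compiled several corpora myself, all of them first-generation corpora. Only after reading this new book did I realize how much my knowledge of corpora needs updating. Once the book is published, I will count it an honor to be among the first to obtain it.
- Professor Gui Shichun, Guangdong University of Foreign Studies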
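The Oxford Handbook of Computational Linguistics (牛津计算语言学手册)
Author: Mitkov (ed.) Publisher: Foreign Language Teaching and Research Press 1991
Synopsis: "The Oxford Handbook of Computational Linguistics" is a handbook-style monograph on computational linguistics. It collects survey articles by 50 scholars, including linguists, computer scientists, and language engineers, and comprehensively reflects the latest results in the main areas of computational linguistics abroad, offering a window onto how the field is developing internationally. The chapters are consistent in style and coordinated in content, forming an integrated whole. They use engaging examples to introduce difficult technical issues and avoid complicated mathematical formulas as far as possible, which makes the book especially suitable for readers with a humanities background. It is also an essential reference for readers who are interested in computational linguistics or new to it.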