CS
Introduction to Information Retrieval 豆瓣
Class-tested and coherent, this groundbreaking new textbook teaches classic web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.
Contents
1. Information retrieval using the Boolean model; 2. The dictionary and postings lists; 3. Tolerant retrieval; 4. Index construction; 5. Index compression; 6. Scoring and term weighting; 7. Vector space retrieval; 8. Evaluation in information retrieval; 9. Relevance feedback and query expansion; 10. XML retrieval; 11. Probabilistic information retrieval; 12. Language models for information retrieval; 13. Text classification and Naive Bayes; 14. Vector space classification; 15. Support vector machines and kernel functions; 16. Flat clustering; 17. Hierarchical clustering; 18. Dimensionality reduction and latent semantic indexing; 19. Web search basics; 20. Web crawling and indexes; 21. Link analysis.
Reviews
"This is the first book that gives you a complete picture of the complications that arise in building a modern web-scale search engine. You'll learn about ranking SVMs, XML, DNS, and LSI. You'll discover the seedy underworld of spam, cloaking, and doorway pages. You'll see how MapReduce and other approaches to parallelism allow us to go beyond megabytes and to efficiently manage petabytes." -Peter Norvig, Director of Research, Google Inc.
"Introduction to Information Retrieval is a comprehensive, up-to-date, and well-written introduction to an increasingly important and rapidly growing area of computer science. Finally, there is a high-quality textbook for an area that was desperately in need of one." -Raymond J. Mooney, Professor of Computer Sciences, University of Texas at Austin
“Through compelling exposition and choice of topics, the authors vividly convey both the fundamental ideas and the rapidly expanding reach of information retrieval as a field.” -Jon Kleinberg, Professor of Computer Science, Cornell University
Foundations of Statistical Natural Language Processing 豆瓣
Authors: Christopher D. Manning / Hinrich Schütze
Publisher: The MIT Press
1999-6
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Introduction to Data Mining 豆瓣
Introduction to Data Mining presents fundamental concepts and algorithms for those learning data mining for the first time. Each concept is explored thoroughly and supported with numerous examples. The text requires only a modest background in mathematics. Each major topic is organized into two chapters, beginning with basic concepts that provide necessary background for understanding each data mining technique, followed by more advanced concepts and algorithms.
Quotes
"This book provides a comprehensive coverage of important data mining techniques. Numerous examples are provided to lucidly illustrate the key concepts." -Sanjay Ranka, University of Florida
"In my opinion this is currently the best data mining text book on the market. I like the comprehensive coverage which spans all major data mining techniques including classification, clustering, and pattern mining (association rules)." -Mohammed Zaki, Rensselaer Polytechnic Institute
TCP/IP Illustrated (Volume 1, English Edition) 豆瓣
Author: [US] Stevens
Publisher: China Machine Press
2002-6
Computer Networking (6th Edition) 豆瓣
Computer Networking: A Top-Down Approach, Sixth Edition
8.6 (7 ratings)
Authors: [US] James F. Kurose / [US] Keith W. Ross
Translator: Chen Ming
Publisher: China Machine Press
2014-10
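The first edition of Computer Networking: A Top-Down Approach was published 12 years ago and pioneered the top-down method of explaining the principles and protocols of computer networks. Since then it has been adopted by several hundred universities and colleges, and it is one of the field's classic computer networking textbooks.
The sixth edition retains the character of earlier editions, offering a novel and up-to-date approach to teaching computer networking, and it has also been substantially revised and updated: Chapter 1 focuses more on the present and updates the discussion of access networks; Chapter 2 replaces Java with Python for introducing socket programming; Chapter 3 adds material on TCP splitting, used to optimize cloud service performance; Chapter 4 substantially updates the material on router architecture; Chapter 5 is reorganized and adds coverage of data center networks; Chapter 6 updates the wireless networking content to reflect recent advances; Chapter 7 is heavily revised, with an in-depth discussion of streaming video that includes adaptive streaming and CDNs; Chapter 8 further discusses endpoint authentication; and so on. The exercises have also been extensively updated.
The sixth edition is suitable as a textbook for undergraduate or graduate "Computer Networks" courses, and also for network engineers and professional researchers.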
Concrete Mathematics (English Edition, 2nd Edition) 豆瓣 Goodreads
Concrete Mathematics: A Foundation for Computer Science (2/e)
10.0 (5 ratings)
Authors: [US] Ronald L. Graham / Donald E. Knuth …
Publisher: China Machine Press
2002-8
This book introduces the mathematics that supports advanced computer programming and the analysis of algorithms. The primary aim of its well-known authors is to provide a solid and relevant base of mathematical skills--the skills needed to solve complex problems, to evaluate horrendous sums, and to discover subtle patterns in data. It is an indispensable text and reference not only for computer scientists--the authors themselves rely heavily on it!--but for serious users of mathematics in virtually every discipline. Concrete mathematics is a blending of CONtinuous and disCRETE mathematics: "More concretely," the authors explain, "it is the controlled manipulation of mathematical formulas, using a collection of techniques for solving problems." The subject matter is primarily an expansion of the Mathematical Preliminaries section in Knuth's classic Art of Computer Programming, but the style of presentation is more leisurely, and individual topics are covered more deeply. Several new topics have been added, and the most significant ideas have been traced to their historical roots. The book includes more than 500 exercises, divided into six categories. Complete answers are provided for all exercises, except research problems, making the book particularly valuable for self-study.
Understanding Computer Systems in Depth (Original 2nd Edition) 豆瓣 Goodreads
Computer Systems: A Programmer's Perspective
9.7 (26 ratings)
Authors: [US] Randal E. Bryant / [US] David O'Hallaron
Translators: Gong Yili / Lei Yingchun
Publisher: China Machine Press
2011-1
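This book explains the essential concepts of computer systems from a programmer's perspective and shows how those concepts concretely affect the correctness, performance, and utility of application programs. Its 12 chapters cover the representation and manipulation of information, the machine-level representation of programs, processor architecture, optimizing program performance, the memory hierarchy, linking, exceptional control flow, virtual memory, system-level I/O, network programming, and concurrent programming. The book provides many examples and exercises, with answers to some of them, to help readers deepen their understanding of the concepts presented in the text.
The book's greatest strength is that it describes the implementation details of computer systems for programmers, helping them build a layered mental model of a computer system: from the lowest-level representation of data in memory, to the makeup of pipelined instructions, to virtual memory, to the compilation system, to dynamically loaded libraries, and finally to user-mode applications. By learning how programs map onto the system and how they execute, readers can better understand why programs behave as they do and how inefficiencies arise.
This book is suited to programmers who want to write faster, more reliable programs, and it can also serve as a textbook for undergraduate and graduate students in computer science and related majors.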
Computer Organization and Design (Original 4th Edition) 豆瓣
Computer Organization and Design: The Hardware/Software Interface (4/e)
Authors: [US] David A. Patterson / [US] John L. Hennessy
Translators: Kang Jichang / Fan Xiaoya …
Publisher: China Machine Press
2012-1
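Computer Organization and Design is a classic textbook on computer organization. It focuses on the most fundamental concepts in current computer design, shows the relationship between hardware and software, and gives a comprehensive introduction to the mainstream technologies and latest achievements in the development of contemporary computer systems.
As in previous editions, the book uses the MIPS processor as the core for presenting fundamentals such as computer hardware technology, assembly language, computer arithmetic, pipelining, the memory hierarchy, and I/O. It emphasizes the recent shift in computing from serial to parallel, incorporating parallel hardware and software topics into every chapter, with the ultimate goal of exploiting multicore performance through hardware/software co-design.
This book is suitable as a textbook for undergraduate and graduate students in related majors, and it is also a valuable reference for technical professionals.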
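Foundations of Statistical Natural Language Processing (Chinese edition) 豆瓣 Goodreads
Foundations of Statistical Natural Language Processing
Foundations of Statistical Natural Language Processing (Foreign Computer Science Textbook Series) is a comprehensive, systematic monograph on statistical natural language processing techniques, adopted by many well-known universities in China and abroad as a textbook for courses related to computational linguistics. The book covers a very broad range of material in four parts and 16 chapters, including almost all the theory and algorithms needed to build natural language processing software tools. The exposition moves from the simple to the complex, from mathematical foundations to precise theoretical algorithms and from simple lexical analysis to complex syntactic analysis, meeting the needs of readers at different levels. The book also ties theory closely to practice, presenting high-level applications of natural language processing (such as information retrieval) on top of the theoretical material. The companion website provides many related resources and tools, so readers can work through the book's exercises and improve through practice. In recent years, statistical methods have gradually become mainstream in natural language processing.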
Elements of Information Theory (Chinese edition) 豆瓣
Elements of Information Theory
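Elements of Information Theory (Original 2nd Edition) is a concise, accessible textbook in the field of information theory. Its main topics include entropy, information sources, channel capacity, rate distortion, data compression and coding theory, and complexity theory. The book also introduces network information theory and hypothesis testing and, starting from the horse race model, brings the study of the stock market into the framework of information theory, offering new investment ideas and research techniques for portfolio research from a fresh perspective.
The second edition retains the clear, thought-provoking writing style of the first. Readers once again gain integrated knowledge of mathematics, physics, statistics, and information theory.
The information theory topics include detailed coverage of entropy, data compression, channel capacity, rate distortion, network information theory, and hypothesis testing, with the aim of giving readers a solid foundation for both theoretical research and applications. Each chapter ends with a problem set, a summary of key points, and a historical review of the main ideas.
Elements of Information Theory (Original 2nd Edition) is an ideal textbook for senior undergraduates and graduate students in electrical engineering, statistics, and telecommunications who are taking a first course in information theory.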