NLP
Speech and Language Processing, 2nd Edition 豆瓣 Goodreads
10.0 (5 个评分) 作者: Daniel Jurafsky / James H. Martin Prentice Hall 2008 - 5
This is the 2nd edition of "Speech and Language Processing, 2000" (http://www.douban.com/subject/1810715/).
An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology – at all levels and with all modern technologies – this book takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corporations. Builds each chapter around one or more worked examples demonstrating the main idea of the chapter, usingthe examples to illustrate the relative strengths and weaknesses of various approaches. Adds coverage of statistical sequence labeling, information extraction, question answering and summarization, advanced topics in speech recognition, speech synthesis. Revises coverage of language modeling, formal grammars, statistical parsing, machine translation, and dialog processing. A useful reference for professionals in any of the areas of speech and language processing.
Handbook of Latent Semantic Analysis 豆瓣
作者: Thomas K. Landauer / Danielle S. McNamara Lawrence Erlbaum 2007 - 2
The Handbook of Latent Semantic Analysis is the authoritative reference for the theory behind Latent Semantic Analysis (LSA), a burgeoning mathematical method used to analyze how words make meaning, with the desired outcome to program machines to understand human commands via natural language rather than strict programming protocols. The first book of its kind to deliver such a comprehensive analysis, this volume explores every area of the method and combines theoretical implications as well as practical matters of LSA. Readers are introduced to a powerful new way of understanding language phenomena, as well as innovative ways to perform tasks that depend on language or other complex systems. The Handbook clarifies misunderstandings and pre-formed objections to LSA, and provides examples of exciting new educational technologies made possible by LSA and similar techniques. It raises issues in philosophy, artificial intelligence, and linguistics, while describing how LSA has underwritten a range of educational technologies and information systems. Alternate approaches to language understanding are addressed and compared to LSA. This work is essential reading for anyone-newcomers to this area and experts alike-interested in how human language works or interested in computational analysis and uses of text. Educational technologists, cognitive scientists, philosophers, and information technologists in particular will consider this volume especially useful.
Introduction to Information Retrieval 豆瓣
作者: Christopher D. Manning / Prabhakar Raghavan Cambridge University Press 2008 - 7
Class-tested and coherent, this groundbreaking new textbook teaches classic web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.
Contents
1. Information retrieval using the Boolean model; 2. The dictionary and postings lists; 3. Tolerant retrieval; 4. Index construction; 5. Index compression; 6. Scoring and term weighting; 7. Vector space retrieval; 8. Evaluation in information retrieval; 9. Relevance feedback and query expansion; 10. XML retrieval; 11. Probabilistic information retrieval; 12. Language models for information retrieval; 13. Text classification and Naive Bayes; 14. Vector space classification; 15. Support vector machines and kernel functions; 16. Flat clustering; 17. Hierarchical clustering; 18. Dimensionality reduction and latent semantic indexing; 19. Web search basics; 20. Web crawling and indexes; 21. Link analysis.
Reviews
“This is the first book that gives you a complete picture of the complications that arise in building a modern web-scale search engine. You'll learn about ranking SVMs, XML, DNS, and LSI. You'll discover the seedy underworld of spam, cloaking, and doorway pages. You'll see how MapReduce and other approaches to parallelism allow us to go beyond megabytes and to efficiently manage petabytes." -Peter Norvig, Director of Research, Google Inc.
"Introduction to Information Retrieval is a comprehensive, up-to-date, and well-written introduction to an increasingly important and rapidly growing area of computer science. Finally, there is a high-quality textbook for an area that was desperately in need of one." -Raymond J. Mooney, Professor of Computer Sciences, University of Texas at Austin
“Through compelling exposition and choice of topics, the authors vividly convey both the fundamental ideas and the rapidly expanding reach of information retrieval as a field.” -Jon Kleinberg, Professor of Computer Science, Cornell University