Computer Vision
Deep Learning: Methods and Applications (Foundations and Trends® in Signal Processing) 豆瓣
Author: Li Deng / Dong Yu Now Publishers Inc 2014 - 6
This book aims to provide an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks. The application areas are chosen with the following three criteria: 1) expertise or knowledge of the authors; 2) the application areas that have already been transformed by the successful use of deep learning technology, such as speech recognition and computer vision; and 3) the application areas that have the potential to be impacted significantly by deep learning and that have gained concentrated research efforts, including natural language and text processing, information retrieval, and multimodal information processing empowered by multi-task deep learning.
In Chapter 1, we provide the background of deep learning, as intrinsically connected to the use of multiple layers of nonlinear transformations to derive features from the sensory signals such as speech and visual images. In the most recent literature, deep learning is embodied also as representation learning, which involves a hierarchy of features or concepts where higher-level representations of them are defined from lower-level ones and where the same lower-level representations help to define higher-level ones. In Chapter 2, a brief historical account of deep learning is presented. In particular, selected chronological development of speech recognition is used to illustrate the recent impact of deep learning that has become a dominant technology in speech recognition industry within only a few years since the start of a collaboration between academic and industrial researchers in applying deep learning to speech recognition. In Chapter 3, a three-way classification scheme for a large body of work in deep learning is developed. We classify a growing number of deep learning techniques into unsupervised, supervised, and hybrid categories, and present qualitative descriptions and a literature survey for each category. From Chapter 4 to Chapter 6, we discuss in detail three popular deep networks and related learning methods, one in each category. Chapter 4 is devoted to deep autoencoders as a prominent example of the unsupervised deep learning techniques. Chapter 5 gives a major example in the hybrid deep network category, which is the discriminative feed-forward neural network for supervised learning with many layers initialized using layer-by-layer generative, unsupervised pre-training. In Chapter 6, deep stacking networks and several of the variants are discussed in detail, which exemplify the discriminative or supervised deep learning techniques in the three-way categorization scheme.
In Chapters 7-11, we select a set of typical and successful applications of deep learning in diverse areas of signal and information processing and of applied artificial intelligence. In Chapter 7, we review the applications of deep learning to speech and audio processing, with emphasis on speech recognition organized according to several prominent themes. In Chapter 8, we present recent results of applying deep learning to language modeling and natural language processing. Chapter 9 is devoted to selected applications of deep learning to information retrieval including Web search. In Chapter 10, we cover selected applications of deep learning to image object recognition in computer vision. Selected applications of deep learning to multi-modal processing and multi-task learning are reviewed in Chapter 11. Finally, an epilogue is given in Chapter 12 to summarize what we presented in earlier chapters and to discuss future challenges and directions.
The Interpretation of Visual Motion 豆瓣
Author: Ullman, Shimon 1979 - 3
This book uses the methodology of artificial intelligence to investigate the phenomena of visual motion perception: how the visual system constructs descriptions of the environment in terms of objects, their three-dimensional shape, and their motion through space, on the basis of the changing image that reaches the eye. The author has analyzed the computations performed in the course of visual motion analysis. Workable schemes able to perform certain tasks performed by the visual system have been constructed and used as vehicles for investigating the problems faced by the visual system and its methods for solving them.
Two major problems are treated: first, the correspondence problem, which concerns the identification of image elements that represent the same object at different times, thereby maintaining the perceptual identity of the object in motion or in change. The second problem is the three-dimensional interpretation of the changing image once a correspondence has been established.
The author's computational approach to visual theory makes the work unique, and it should be of interest to psychologists working in visual perception and readers interested in cognitive studies in general, as well as computer scientists interested in machine vision, theoretical neurophysiologists, and philosophers of science.
Brain and Visual Perception 豆瓣
Author: David H. Hubel / Torsten Wiesel Oxford University Press 2004 - 10
Scientists' understanding of two central problems in neuroscience, psychology, and philosophy has been greatly influenced by the work of David Hubel and Torsten Wiesel: (1) What is it to see? This relates to the machinery that underlies visual perception. (2) How do we acquire the brain's mechanisms for vision? This is the nature-nurture question as to whether the nerve connections responsible for vision are innate or whether they develop through experience in the early life of an animal or human. This is a book about the collaboration between Hubel and Wiesel, which began in 1958, lasted until about 1982, and led to a Nobel Prize in 1981. It opens with short autobiographies of both men, describes the state of the field when they started, and tells about the beginnings of their collaboration. It emphasizes the importance of various mentors in their lives, especially Stephen W. Kuffler, who opened up the field by studying the cat retina in 1950, and founded the department of neurobiology at Harvard Medical School, where most of their work was done. The main part of the book consists of Hubel and Wiesel's most important publications. Each reprinted paper is preceded by a foreword that tells how they went about the research, what the difficulties and the pleasures were, and whether they felt a paper was important and why. Each is also followed by an afterword describing how the paper was received and what developments have occurred since its publication. The reader learns things that are often absent from typical scientific publications, including whether the work was difficult, fun, personally rewarding, exhilarating, or just plain tedious. The book ends with a summing-up of the authors' view of the present state of the field. This is much more than a collection of reprinted papers. Above all it tells the story of an unusual scientific collaboration that was hugely enjoyable and served to transform an entire branch of neurobiology. 
It will appeal to neuroscientists, vision scientists, biologists, psychologists, physicists, historians of science, and to their students and trainees, at all levels from high school on, as well as anyone else who is interested in the scientific process.
Multiple View Geometry in Computer Vision 豆瓣
Author: Richard Hartley / Andrew Zisserman Cambridge University Press 2004 - 4
A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Techniques for solving this problem are taken from projective geometry and photogrammetry. Here, the authors cover the geometric principles and their algebraic representation in terms of camera projection matrices, the fundamental matrix and the trifocal tensor. The theory and methods of computation of these entities are discussed with real examples, as is their use in the reconstruction of scenes from multiple images. The new edition features an extended introduction covering the key ideas in the book (which itself has been updated with additional examples and appendices) and significant new results which have appeared since the first edition. Comprehensive background material is provided, so readers familiar with linear algebra and basic numerical methods can understand the projective geometry and estimation algorithms presented, and implement the algorithms directly from the book.
Robot Vision 豆瓣
Author: Berthold K.P. Horn The MIT Press 1986 - 3
This book presents a coherent approach to the fast moving field of machine vision, using a consistent notation based on a detailed understanding of the image formation process. It covers even the most recent research and will provide a useful and current reference for professionals working in the fields of machine vision, image processing, and pattern recognition.
An outgrowth of the author's course at MIT, Robot Vision presents a solid framework for understanding existing work and planning future research. Its coverage includes a great deal of material that is important to engineers applying machine vision methods in the real world. The chapters on binary image processing, for example, help explain and suggest how to improve the many commercial devices now available. And the material on photometric stereo and the extended Gaussian image points the way to what may be the next thrust in commercialization of the results in this area. The many exercises complement and extend the material in the text, and an extensive bibliography will serve as a useful guide to current research.
Contents: Image Formation and Image Sensing. Binary Images: Geometrical Properties; Topological Properties. Regions and Image Segmentation. Image Processing: Continuous Images; Discrete Images. Edges and Edge Finding. Lightness and Color. Reflectance Map: Photometric Stereo Reflectance Map; Shape from Shading. Motion Field and Optical Flow. Photogrammetry and Stereo. Pattern Classification. Polyhedral Objects. Extended Gaussian Images. Passive Navigation and Structure from Motion. Picking Parts out of a Bin.
Berthold Klaus Paul Horn is Associate Professor, Department of Electrical Engineering and Computer Science, MIT. Robot Vision is included in the MIT Electrical Engineering and Computer Science Series.
Affine Differential Geometry 豆瓣
Author: Katsumi Nomizu / Takeshi Sasaki Cambridge University Press 2008 - 6
This is a self-contained and systematic account of affine differential geometry from a contemporary view, not only covering the classical theory, but also introducing more modern developments. In order both to cover as much as possible and to keep the text of a reasonable size, the authors have concentrated on the significant features of the subject and their relationship and application to such areas as Riemannian, Euclidean, Lorentzian and projective differential geometry. In so doing, they also provide a modern introduction to the last. Some of the important geometric surfaces considered are illustrated by computer graphics, making this a physically and mathematically attractive book for all researchers in differential geometry, and for mathematical physicists seeking a quick entry to the subject.
漢字樹 (Chinese Character Tree) 豆瓣
Author: 廖文豪 遠流出版事業股份有限公司
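500 Chinese characters related to "person" (人), plus more than 5,000 oracle bone, bronze, and seal script forms,
all gathered into 2 character tree diagrams!
A "science and engineering type" who graduated from the Department of Electrical Engineering at National Taiwan University and specialized in computers happened to encounter the study of Chinese characters (文字學), a field that even many writers and specialists in literary creation and research regard as daunting, and developed a strong interest in it. Following the research of earlier scholars, he entered the world of Chinese character structure and lingered there. As his insights accumulated, so did the unresolved puzzles in his mind.
So he brought in the computer's powerful capacity for compiling and organizing, systematically sorted out the components of Chinese characters, and tried to find accounts with greater explanatory power. In the process he became increasingly aware of how limiting and misleading radicals can be.
A radical is one of the components that make up a character; because many characters share it, it became a marker for classifying Chinese characters. Yet characters that belong to the same radical are not necessarily related to one another. Conversely, some characters that look different turn out to be closely related when viewed through the evolution of Chinese characters.
Having long immersed himself in the study of Chinese characters, the author has sought out the logical connections between characters and condensed them into the book's "character tree diagrams". With the author's clear and concise explanations, even readers with no grounding in the field can draw on their own experience as native users of Chinese to find many novel discoveries and much enjoyment.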
Eye and Brain 豆瓣
Author: Richard L. Gregory Princeton University Press 1997
Since the publication of the first edition in 1966, Eye and Brain has established itself worldwide as an essential introduction to the basic phenomena of visual perception. In this book, Richard L. Gregory offers clear explanations of how we see brightness, movement, color, and objects, and he explores the phenomena of visual illusions to establish principles about how perception normally works and why it sometimes fails. Although successive editions have incorporated new discoveries and ideas, Gregory completely revised and updated the book for this publication, adding more than thirty new illustrations. The phenomena of illusion continue to be a major theme in the book, in which the author makes a new attempt to provide a comprehensive classification system. There are also new sections on what babies see and how they learn to see, on motion perception, and tantalizing glimpses of the relationship between vision and consciousness and of the impact of new brain imaging techniques. In addition, the presentation of the text and illustrations has been improved by the larger format and new page design. The thousands of readers of the previous editions of Eye and Brain will find this new revised edition even more attractive and enthralling.
人工智能简史 (A Brief History of Artificial Intelligence) 豆瓣
Author: 尼克 人民邮电出版社 2017
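This book gives a comprehensive account of the history of artificial intelligence, covering almost every area of the discipline, including the origins of artificial intelligence, automated theorem proving, expert systems, neural networks, natural language processing, genetic algorithms, deep learning, reinforcement learning, superintelligence, philosophical questions, and future trends. With a broad perspective and lively language, it offers a comprehensive review of and in-depth commentary on artificial intelligence.
The author is a teacher, friend, or close acquaintance of many of the figures in the book, so alongside careful research there are entertaining anecdotes. The book suits professionals who want to learn the little-known history of artificial intelligence, and it can also serve as an introductory guide for general readers interested in the subject.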
Surfaces and Essences 豆瓣
Author: Douglas Hofstadter / Emmanuel Sander Basic Books 2013 - 4
Is there one central mechanism upon which all human thinking rests? Cognitive scientists Douglas Hofstadter and Emmanuel Sander argue that there is. At this core is our incessant proclivity to take what we perceive, to abstract it, and to find resemblances to prior experiences: in other words, our ability to make analogies. In Surfaces and Essences, Hofstadter and Sander show how analogy-making pervades our thought at all levels; indeed, they argue that we make analogies not once a day or once an hour, but many times per second. Thus, analogy is the mechanism that, silently and hidden, chooses our words and phrases for us when we speak, frames how we understand the most banal everyday situation, guides us in unfamiliar situations, and gives rise to great acts of imagination. We categorize because of analogies that range from simple to subtle, and thus our categories, throughout our lives, expand and grow ever more fluid. Through examples galore and lively prose peppered, needless to say, with analogies large and small, Hofstadter and Sander offer us a new way of thinking about thinking.
Computer Vision 豆瓣
Author: David A. Forsyth / Jean Ponce Prentice Hall 2002 - 8
Appropriate for upper-division undergraduate- and graduate-level courses in computer vision found in departments of Computer Science, Computer Engineering and Electrical Engineering. This long anticipated book is the most complete treatment of modern computer vision methods by two of the leading authorities in the field. This accessible presentation gives both a general view of the entire computer vision enterprise and also offers sufficient detail for students to be able to build useful applications. Students will learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods.
Foundations of Vision 豆瓣
Author: Brian A. Wandell Sinauer Associates Inc 1995 - 5
Designed for students, scientists and engineers interested in learning about the core ideas of vision science, this volume brings together the broad range of data and theory accumulated in this field. The book consists of three sections and an appendix. The first section consists of an introduction and three chapters that describe image encoding. These chapters review optical image formation by the cornea and lens, retinal sampling, and wavelength-encoding by the photoreceptors. The text's second section consists of four chapters on image representation. The third section reviews how to interpret images in terms of objects. This section features two chapters that review computational and experimental studies of colour appearance, then motion and depth. These chapters are followed by a chapter with many demonstrations concerning object perception. Topics such as colour appearance, cortical colour-blindness, motion flow, motion appearance, motion physiology and visual illusions are also included in this part of the book. "Foundations of Vision" is suitable for courses on vision science in psychology, neuroscience, engineering or computer science departments, and is suitable for upper-level undergraduates and graduate students. The text contains special study exercises at the end of most chapters. The questions aim to enrich the main material and point the way to additional material in the literature. Finally, the book has an appendix consisting of four parts: an introduction to linear systems methods; a discussion of monitor calibration; an introduction to Bayesian classifiers; and a discussion of optic flow computation.
Inattentional Blindness 豆瓣
Author: Arien Mack / Irvin Rock The MIT Press 2000 - 7
Many people believe that merely by opening their eyes, they see everything in their field of view; in fact, a line of psychological research has been taken as evidence of the existence of so-called preattentional perception. In Inattentional Blindness, Arien Mack and Irvin Rock make the radical claim that there is no such thing--that there is no conscious perception of the visual world without attention to it.
The authors present a narrative chronicle of their research. Thus, the reader follows the trail that led to the final conclusions, learning why initial hypotheses and explanations were discarded or revised, and how new questions arose along the way. The phenomenon of inattentional blindness has theoretical importance for cognitive psychologists studying perception, attention, and consciousness, as well as for philosophers and neuroscientists interested in the problem of consciousness.