Natural Language

Understanding & Generating Text & Speech

Natural Language Processing enables people and computers to communicate, and enables automatic translation so that people can interact easily with others around the world.


Progress on building computer systems that process natural language in any meaningful sense requires considering language as part of a larger communicative situation.

Regarding language as communication requires consideration of what is said (literally), what is intended, and the relationship between the two.

- Barbara Grosz, Utterance and Objective

The value to our society of being able to communicate with computers in everyday natural language cannot be overstated. Imagine asking your computer "Does this candidate have a good record on the environment?" or "When is the next televised National League baseball game?" Or being able to tell your PC "Please format my homework the way my English professor likes it." Commercial products can already do some of these things, and AI scientists expect many more in the next decade. One goal of AI work in natural language is to enable communication between people and computers without resorting to memorization of complex commands and procedures. Automatic translation---enabling scientists, business people and just plain folks to interact easily with people around the world---is another goal. Both are just part of the broad field of AI and natural language, along with the cognitive science aspect of using computers to study how humans understand language.

Definition of the Area

"I'm Sorry Dave, I'm Afraid I Can't Do That": Linguistics, Statistics, and Natural-Language Processing circa 2001. By Lillian Lee, Cornell University. From Section 6 of the National Academy of Sciences publication Computer Science: Reflections on the Field, Reflections from the Field (2004), Computer Science and Telecommunications Board (CSTB). "According to many pop-culture visions of the future, technology will eventually produce the Machine That Can Speak to Us. ... Natural-language processing, or NLP, is the field of computer science devoted to creating such machines, that is, enabling computers to use human languages both as input and as output. The area is quite broad, encompassing problems ranging from simultaneous multi-language translation to advanced search engine development to the design of computer interfaces capable of combining speech, diagrams, and other modalities simultaneously. A natural consequence of this wide range of inquiry is the integration of ideas from computer science with work from many other fields, including linguistics, which provides models of language; psychology, which provides models of cognitive processes; information theory, which provides models of communication; and mathematics and statistics, which provide tools for analyzing and acquiring such models."

Good Starting Places

What is NLP. From the Natural Language Processing Research Group at the University of Sheffield Department of Computer Science. " Natural Language Processing (NLP) is both a modern computational technology and a method of investigating and evaluating claims about human language itself. Some prefer the term Computational Linguistics in order to capture this latter function, but NLP is a term that links back into the history of Artificial Intelligence (AI), the general study of cognitive function by computational processes, normally with an emphasis on the role of knowledge representations, that is to say the need for representations of our knowledge of the world in order to understand human language with computers. Natural Language Processing (NLP) is the use of computers to process written and spoken language for some practical, useful, purpose: to translate languages, to get information from the web on text data banks so as to answer questions, to carry on conversations with machines, so as to get advice about, say, pensions and so on. These are only examples of major types of NLP, and there is also a huge range of lesser but interesting applications, e.g. getting a computer to decide if one newspaper story has been rewritten from another or not. NLP is not simply applications but the core technical methods and theories that the major tasks above divide up into, such as Machine Learning techniques, which is automating the construction and adaptation of machine dictionaries, modeling human agents' beliefs and desires etc. This last is closer to Artificial Intelligence, and is an essential component of NLP if computers are to engage in realistic conversations: they must, like us, have an internal model of the humans they converse with."

What is Computational Linguistics? Hans Uszkoreit, CL Department, University of the Saarland, Germany. 2000. A short, non-technical overview of this exciting field.

Doing Conversation with Machines By Peter Wallis. AISB Quarterly No. 126, Spring 2008. "...conventional approaches to human language technology ...have tended to focus on either generic machine learning over larger and larger collections of recorded human behaviour, or focused on information flow. The tendency is to ignore the fact that people are social animals and that, in human-human conversation, the primary role of language is to manage social relations... . Such behaviour is mostly invisible to us humans as it is just common-sense. The challenge for those working in this area of artificial intelligence is to come up with some means of capturing this common-sense in a form that is amenable to programming."

Natural Language. A summary by Patrick Doyle. Very informative, though there are some spots that are quite technical. Downloadable .doc file.

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. By Daniel Jurafsky and James H. Martin. Prentice-Hall, 2000. Chapter 1 (Introduction) is available as a downloadable .pdf file as are the resources for all of the chapters.

Natural Language Processing FAQ. By Dragomir R. Radev. Dept. of Computer Science, Columbia University. A set of frequently-asked questions about computational linguistics with informative answers.

General Readings

Natural Language Learning at UT Austin. "Natural language processing systems are difficult to build, and machine learning methods can help automate their construction significantly. Our research in learning for natural language mainly involves applying inductive logic programming and other relational learning techniques to constructing database interfaces and information extraction systems from supervised examples. However, we have also conducted research in learning for syntactic parsing, machine translation, word-sense disambiguation, and morphology (past tense generation)." Links to many relevant articles.

Glossary of Linguistic Terms. Compiled by Dr. Peter Coxhead of The University of Birmingham School of Computer Science for his students.

The Futurist - The Intelligent Internet. The Promise of Smart Computers and E-Commerce. By William E. Halal. Government Computer News Daily News (June 23, 2004). "Scientific advances are making it possible for people to talk to smart computers, while more enterprises are exploiting the commercial potential of the Internet. ... [F]orecasts conducted under the TechCast Project at George Washington University indicate that 20 commercial aspects of Internet use should reach 30% 'take-off' adoption levels during the second half of this decade to rejuvenate the economy. Meanwhile, the project's technology scanning finds that advances in speech recognition, artificial intelligence, powerful computers, virtual environments, and flat wall monitors are producing a 'conversational' human-machine interface. These powerful trends will drive the next generation of information technology into the mainstream by about 2010. ... The following are a few of the advances in speech recognition, artificial intelligence, powerful chips, virtual environments, and flat-screen wall monitors that are likely to produce this intelligent interface. ... IBM has a Super Human Speech Recognition Program to greatly improve accuracy, and in the next decade Microsoft's program is expected to reduce the error rate of speech recognition, matching human capabilities. ... MIT is planning to demonstrate their Project Oxygen, which features a voice-machine interface. ... Amtrak, Wells Fargo, Land's End, and many other organizations are replacing keypad-menu call centers with speech-recognition systems because they improve customer service and recover investment in a year or two. ... General Motors OnStar driver assistance system relies primarily on voice commands, with live staff for backup; the number of subscribers has grown from 200,000 to 2 million and is expected to increase by 1 million per year. 
The Lexus DVD Navigation System responds to over 100 commands and guides the driver with voice and visual directions."

Artificial Intelligence. [Radio broadcast; audio available.] Reported by Shay Zeller for The Front Porch. New Hampshire Public Radio (July 12, 2006). "Dartmouth College is celebrating 50 years of Artificial Intelligence this week with a special conference that takes a look forward and a look back at the field. We'll find out how AI has evolved since its inception and how far scientists have come to creating the technological brain that's been depicted in science fiction for decades. We'll also look at the philosophical and ethical questions that go along with creating machines that emulate the human mind. Our guest are: Eugene Charniak, professor of Computer Science at Brown University. Charniak's expertise is in language development, and he's presenting a speech at the conference entitled 'Why Natural Language Processing is Now Statistical Natural Language Processing.' James H. Moor, professor of Philosophy at Dartmouth. He's the conference's main organizer."

Experts Use AI to Help GIs Learn Arabic. By Eric Mankin. USC News (June 21, 2004). " To teach soldiers basic Arabic quickly, USC computer scientists are developing a system that merges artificial intelligence with computer game techniques. The Rapid Tactical Language Training System, created by the USC Viterbi School of Engineering's Center for Research in Technology for Education (CARTE) and partners, tests soldier students with videogame missions in animated virtual environments where, to pass, the students must successfully phrase questions and understand answers in Arabic."

An Overview of Empirical Natural Language Processing. By Eric Brill and Raymond J. Mooney (1997). AI Magazine 18(4): Winter 1997, 13-24. "In recent years, there has been a resurgence in research on empirical methods in natural language processing. These methods employ learning techniques to automatically extract linguistic knowledge from natural language corpora rather than require the system developer to manually encode the requisite knowledge. The current special issue reviews recent research in empirical methods in speech recognition, syntactic parsing, semantic processing, information extraction, and machine translation. This article presents an introduction to the series of specialized articles on these topics and attempts to describe and explain the growing interest in using learning methods to aid the development of natural language processing systems."
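The corpus-driven approach Brill and Mooney describe can be seen in miniature in a most-frequent-tag baseline: rather than hand-coding the fact that "saw" is usually a verb, the system counts tags in an annotated corpus. A minimal sketch in Python; the corpus here is invented for illustration (real systems train on resources such as the Penn Treebank):

```python
from collections import Counter, defaultdict

def train_baseline_tagger(tagged_corpus):
    """Learn each word's most frequent part-of-speech tag from data."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word.lower()][tag] += 1
    # Pick the single most common tag seen for each word.
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

# Toy annotated corpus (invented for this example).
CORPUS = [("the", "DET"), ("saw", "NOUN"), ("cut", "VERB"),
          ("she", "PRON"), ("saw", "VERB"), ("it", "PRON"),
          ("he", "PRON"), ("saw", "VERB"), ("them", "PRON")]

model = train_baseline_tagger(CORPUS)
```

Here `model["saw"]` comes out as `"VERB"` because that reading outnumbers the noun reading two to one in the corpus; no rule was written by hand, which is exactly the shift the article describes.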

Visit the homepage of Daniel Klein, Assistant Professor, Computer Science Division, University of California at Berkeley, and recipient of "the [2006] Grace Murray Hopper Award for the design of the first machine learning system capable of inferring a high-quality grammar for English and other languages directly from text without human annotations or supervision." (See the March 2007 ACM press release.)

Chatbots / Chatterbots

  • What is a chatbot? An explanation from, 2009.
  • Alice, the Chat Robot. Winner of the 2000, 2001 & 2004 Loebner Prize.
  • Ask ALEX. Available from JURIST. "ALEX is an AI-based 'bot' programmed to help you locate basic legal information online. But be warned - she's very experimental, and she sometimes has a bit of an attitude! ... ALEX cannot provide legal advice. If you have a legal problem, consult a lawyer."
  • Brian is a computer program that thinks it's an 18 year old college student. It was written as an entry in the 1998 Loebner Competition, where it won third place out of six entries. From Joe Strout.
  • Chatterbot. From Wikipedia.
  • Chatterbots. Hosted by Simon Laven. An extensive collection that includes classic chatterbots, complex chatterbots, friendly chatterbots, non-English bots (German, Spanish, French, and other languages), and much more.
  • Chatterbot links from the Open Directory project.
  • Eliza - "a friend you could never have before". One of the earliest chatterbots.
  • George - winner of the 2005 Loebner Prize. [See Brit's bot chats way to AI medal.]
    • Video > Meet George - Is this Web site a foreshadow of robots and computers taking over the world? Video Podcast from Nightline Online Sign of the Times (September 18, 2006). Hosted by Terry Moran and reported by Nick Watt. Meet the chatbot, George, and his inventor, Rollo Carpenter.
    • Video > Robot enjoys online chat. BBC News. "George is an online robot that has sufficient artificial intelligence to chat with real people, in a variety of languages. Rory Cellan-Jones visited UK firm Televirtual to meet their prize-winning virtual employee."
  • Jabberwacky. As stated on the About Jabberwacky page, it "is an artificial intelligence - a chat robot, often known as a 'chatbot' or 'chatterbot'. It aims to simulate natural human chat in an interesting, entertaining and humorous manner. Jabberwacky is different. It learns. In some ways it models the way humans learn language, facts, context and rules."
  • Joan - winner of the 2006 Loebner Prize. [See also Programmer wins £1,000 for most human creation.]
  • "John Lennon Artificial Intelligence Project (JLAIP) is recreating the personality of the late Beatle, John Lennon, by programming an Artificial Intelligence (AI) engine with Lennon's own words and thoughts. Triumph PC's breakthroughs take the field of AI to an entirely new level, thus making Persona-Bots™ (robots inhabited with unique and authentic human personalities) possible and further blurring the precarious line between man and machine." (Also see Mind games with John.)
  • MyCyberTwin: Artificial Intelligence to Promote Feature Film 'Flatland.' Press release available from Prime Newswire (August 16, 2007). "You're an independent filmmaker. You don't have a big Hollywood marketing budget. How do you market your film? Make robots, of course! Or, to be more precise, you make 'chatbots.' ... 'No one likes chatbots that pretend to be humans,' said [Ladd] Ehlinger. 'But I wondered if they could be put to better use? To be entertaining in and of themselves, to answer questions about my film, to introduce people to the world of Flatland?' said Ehlinger. Flatland, an animated science fiction feature film based on the 1884 novel by Edwin A. Abbott, is popular with mathematicians and computer scientists for its explorations into such heady subjects as dimensionality and the nature of reality. ... MyCyberTwin ( is a new technology providing intelligent software clones who can have life-like conversations on behalf of their human twins. MyCyberTwin users can easily create online clones that can chat on websites and through social networks such as MySpace, blogs, dating sites, Second Life and MSN instant messaging. ... Soon the Internet may be filled with all sorts of fictional characters you can chat with. Maybe you could discuss the movie 'Shrek' with Shrek himself, or have a shouting match with Darth Vader, or flirt with Brad Pitt or Angelina Jolie. And it all started with A simple Square. You can chat with A Square and other Flatlanders at"
  • " is a software robot (also known as a bot) hosting service. From any browser, you may create and publish your own robots to anyone via the web. We believe that our technology yields the fastest bots available on the Internet. The bots are based on AIML and spring entirely from the work of Dr. Richard Wallace and the A.L.I.C.E. and AIML free software community based at www.AliceBot.Org." As of this posting (8/02), there is no charge to create your own Chat Bot.
  • The Personality Forge. "Come on in, and chat with bots and botmasters, then create your own artificial intelligence personality, and turn it loose to chat with both real people and other chat bots. Here you'll find thousands of AI personalities...." Maintained by Benji Adams.
  • "Ramona is the photorealistic avatar host of She's also the first live virtual performing and recording artist. Read about her history, check out her pictures, and listen to her songs!" And of course, chat with her.
  • Sgt. Star: ‘Star’ power Army uses artificial intelligence to lure new recruits. By William Jackson. GCN [Government Computer News, February 19, 2007]. "The Army has launched a virtual guide to lead visitors through its recruiting Web site, using artificial intelligence to replace online chat with live recruiters. Accurate answers by 'Sgt. Star' (for Strong, Trained And Ready) to users’ questions not only have reduced the number of live chat sessions but also increased traffic to and nearly quadrupled the length of the average visit since his rollout last August. At a time when the Army is facing growing challenges in meeting its recruiting goals, Sgt. Star appeals to a key demographic of young, tech-savvy males being sought by the service, said Gary Bishop, deputy director of the Strategic Outreach Directorate of the Army’s Accessions Command. ... The core technology is distinguished by the ability to understand natural language and to learn over time."
  • dialogues with colorful personalities of early ai
  • Why did the chicken cross the road? See our AI News Toon!
  • To find out who coined the term, chatterbot, see our Namesakes page (which is where you'll also meet the man who designed the Turing Test).
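Most of the classic chatterbots listed above descend from the same simple technique: ordered pattern-match rules whose response templates echo back a "reflected" fragment of the user's input. A minimal ELIZA-style sketch, with rules invented purely for illustration:

```python
import re

# First- and second-person swaps so echoed fragments sound natural.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my", "are": "am"}

# Ordered rules: first pattern that matches wins; the last is a catch-all.
RULES = [
    (re.compile(r"i need (.*)", re.I), "Why do you need {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r"(.*) mother(.*)", re.I), "Tell me more about your family."),
    (re.compile(r".*"), "Please tell me more."),
]

def reflect(text):
    """Swap pronouns word by word ('my notebook' -> 'your notebook')."""
    return " ".join(REFLECTIONS.get(w, w) for w in text.lower().split())

def respond(utterance):
    for pattern, template in RULES:
        match = pattern.match(utterance)
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
```

For example, `respond("I need my notebook")` yields "Why do you need your notebook?" No understanding is involved, which is why Weizenbaum himself was dismayed at how readily people confided in ELIZA.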

Whatever happened to machines that think? By Justin Mullins. New Scientist (April 23, 2005; Issue 2496: pages 32 - 37). "Clever computers are everywhere. From robotic lawnmowers to intelligent lighting, washing machines and even car engines that self-diagnose faults, there's a silicon brain in just about every modern device you can think of. But can you honestly call any machine intelligent in a meaningful sense of the word? One rainy afternoon last February I decided to find out. I switched on the computer in my study, and logged on to, home to one of the leading artificial intelligences on the planet, to see what the state-of-the-art has to offer. ..."

"Computational Linguistics is the only publication devoted exclusively to the design and analysis of natural language processing systems. From this unique quarterly, university and industry linguists, computational linguists, artificial intelligence (AI) investigators, cognitive scientists, speech specialists, and philosophers get information about computational aspects of research on language, linguistics, and the psychology of language processing and performance. Published by The MIT Press for: The Association for Computational Linguistics." Abstracts are available online.

Natural Language Understanding. By Avron Barr (1980). AI Magazine 1(1): 5-10. "This is an excerpt from the Handbook of Artificial Intelligence, a compendium of hundreds of articles about AI ideas, techniques, and programs being prepared at Stanford University by AI researchers and students from across the country." Don't miss the fascinating section: Early History.

Empirical Methods in Information Extraction. By Claire Cardie (1997). AI Magazine 18(4): 65-79. "This article surveys the use of empirical, machine-learning methods for a particular natural language-understanding task: information extraction. The author presents a generic architecture for information-extraction systems and then surveys the learning algorithms that have been developed to address the problems of accuracy, portability, and knowledge acquisition for each component of the architecture."

Duo-Mining - Combining Data and Text Mining. By Guy Creese (September 16, 2004). "As standalone capabilities, the pattern-finding technologies of data mining and text mining have been around for years. However, it is only recently that enterprises have started to use the two in tandem - and have discovered that it is a combination that is worth more than the sum of its parts. First of all, what are data mining and text mining? They are similar in that they both 'mine' large amounts of data, looking for meaningful patterns. However, what they analyze is quite different. ... Collections and recovery departments in banks and credit card companies have used duo-mining to good effect. Using data mining to look at repayment trends, these enterprises have a good idea on who is going to default on a loan, for example. When logs from the collection agents are added to the mix, the understanding gets even better. For example, text mining can understand the difference in intent between, 'I will pay,' 'I won't pay,' 'I paid' and generate a propensity to pay score - which, in turn, can be data mined. To take another example, if a customer says, 'I can't pay because a tree fell on my house;' all of a sudden it is clear that it's not a 'bad' delinquency - but rather a sales opportunity for a home loan."
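The text-mining step in Creese's collections example can be sketched as keyword-pattern intent labeling whose output then feeds the data-mining side. The patterns and labels below are illustrative assumptions, not any vendor's actual rules:

```python
import re

# Map free-text collection-agent notes to a coarse intent label.
# Order matters: "won't pay" must be checked before "will pay".
INTENT_PATTERNS = [
    (re.compile(r"\bwon'?t pay\b|\brefuse\b", re.I), "refusal"),
    (re.compile(r"\bpaid\b", re.I), "paid"),
    (re.compile(r"\bwill pay\b|\bpromise\b", re.I), "promise_to_pay"),
    (re.compile(r"\bcan'?t pay\b", re.I), "hardship"),
]

def classify_note(note):
    """Return the first matching intent label, or 'unknown'."""
    for pattern, label in INTENT_PATTERNS:
        if pattern.search(note):
            return label
    return "unknown"
```

So "I can't pay because a tree fell on my house" is labeled `hardship` rather than `refusal`, which is the distinction the article argues makes the combined mining worth more than either part alone. Real systems learn such cues statistically rather than from a hand-written list.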

Natural Language Processing. Courseware from Professor Jason Eisner.

Check out IBM's Natural Language Processing projects and their Natural Language Processing research overview.

dialogues with colorful personalities of early ai. By Guven Guzeldere and Stefano Franchi. (1995). From Constructions of the Mind: Artificial Intelligence and the Humanities, a special issue of the Stanford Humanities Review, Volume 4, Issue 2. "Of all the legacies of the era of the sixties, three colorful, not to say garrulous, "personalities" that emerged from the early days of artificial intelligence research are worth mentioning: ELIZA, the Rogerian psychotherapist; PARRY, the paranoid; and (as part of a younger generation) RACTER, the "artificially insane" raconteur. All three of these "characters" are natural language processing systems that can "converse" with human beings (or with one another) in English."

LifeCode: A Deployed Application for Automated Medical Coding. By Daniel T. Heinze, Mark Morsch, Ronald Sheffer, Michelle Jimmink, Mark Jennings, William Morris, and Amy Morsch. AI Magazine 22(2): 76-88 (Summer 2001). This paper is based on the authors' presentation at the Twelfth Innovative Applications of Artificial Intelligence Conference (IAAI-2000). "LifeCode is a natural language processing (NLP) and expert system that extracts demographic and clinical information from free-text clinical records."

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. By Daniel Jurafsky and James H. Martin. Prentice-Hall, 2000. The Preface and Chapter 1 are available online.

"I'm sorry Dave, I'm afraid I can't do that": Linguistics, Statistics, and Natural Language Processing circa 2001. By Lillian Lee, Cornell Natural Language Processing Group. In Computer Science: Reflections on the Field, Reflections from the Field (Report of the National Academies' Study on the Fundamentals of Computer Science), pp. 111-118, 2004.

  • Also from the Cornell Natural Language Processing Group: Sentiment Analysis. "CS professors Claire Cardie and Lillian Lee are working on sentiment-analysis technologies for extracting and summarizing opinions from unstructured human-authored documents. They envision systems that (a) find reviews, editorials, and other expressions of opinion on the Web and (b) create condensed versions of the material or graphical summaries of the overall consensus."
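At its simplest, the kind of sentiment analysis described above can be approximated with opinion-word lexicons and a vote over documents. The word lists here are invented stand-ins for the much richer cues real systems learn from data:

```python
from collections import Counter

# Tiny illustrative opinion lexicons (assumed, not from any real system).
POSITIVE = {"good", "great", "excellent", "love", "wonderful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}

def sentiment(review):
    """Label one review by counting positive vs. negative words."""
    words = review.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def summarize(reviews):
    """Condense many reviews into an overall consensus label."""
    return Counter(sentiment(r) for r in reviews).most_common(1)[0][0]
```

A lexicon vote like this famously fails on negation and sarcasm ("not bad at all"), which is precisely why the Cornell group's learning-based approach is needed.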

A Performance Evaluation of Text-Analysis Technologies. By Wendy Lehnert and Beth Sundheim (1991). AI Magazine 12 (3): 81-94. "A performance evaluation of 15 text-analysis systems conducted to assess the state of the art for detailed information extraction from unconstrained continuous text. ... Based on multiple strategies for computing each metric, the competing systems were evaluated for recall, precision, and overgeneration. The results support the claim that systems incorporating natural language-processing techniques are more effective than systems based on stochastic techniques alone."
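The three metrics named in that evaluation have simple set-based definitions, sketched here in the usual MUC style; the actual scoring was more elaborate (it also credited partial matches):

```python
def score(system, key):
    """Score extraction output against an answer key.

    system, key: sets of extracted facts (e.g. slot-filler pairs).
    recall         = correct / answers in the key
    precision      = correct / answers the system produced
    overgeneration = spurious / answers the system produced
    """
    correct = len(system & key)
    spurious = len(system - key)
    produced = len(system) or 1   # avoid division by zero on empty output
    possible = len(key) or 1
    return {
        "recall": correct / possible,
        "precision": correct / produced,
        "overgeneration": spurious / produced,
    }
```

For example, a system that produces four facts of which two are in a three-fact key scores recall 2/3, precision 0.5, and overgeneration 0.5: the overgeneration metric penalizes systems that hedge by emitting everything.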

Natural Language Understanding and Semantics. Section 1.2.4 of Chapter One (available online) of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005). "One of the long-standing goals of artificial intelligence is the creation of programs that are capable of understanding and generating human language. Not only does the ability to use and understand natural language seem to be a fundamental aspect of human intelligence, but also its successful automation would have an incredible impact on the usability and effectiveness of computers themselves. ... Understanding natural language involves much more than parsing sentences into their individual parts of speech and looking those words up in a dictionary. Real understanding depends on extensive background knowledge about the domain of discourse and the idioms used in that domain as well as an ability to apply general contextual knowledge to resolve the omissions and ambiguities that are a normal part of human speech."
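Luger's point that understanding goes beyond dictionary lookup shows up already at the single-word level: a lexicon lists "saw" as both noun and verb, and only context can choose between them. A toy disambiguation rule, purely illustrative:

```python
# A lexicon alone leaves "saw" ambiguous between two parts of speech.
LEXICON = {"saw": {"NOUN", "VERB"}, "the": {"DET"}, "a": {"DET"},
           "i": {"PRON"}, "man": {"NOUN"}}

def tag_saw(prev_word):
    """Pick a tag for 'saw' from its left context (toy rule, assumed)."""
    if "DET" in LEXICON.get(prev_word.lower(), set()):
        return "NOUN"   # "the saw", "a saw"
    return "VERB"       # "I saw ..."
```

Even this one-word case needs context; resolving idioms, ellipsis, and domain knowledge, as the excerpt argues, requires far more than any such local rule can provide.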

Getting Started on Natural Language Processing with Python. By Nitin Madnani. Crossroads, The ACM Student Magazine 13(4) Fall 2007. "The intent of this article is to introduce readers to the area of natural language processing, commonly referred to as NLP. However, rather than just describing the salient concepts of NLP, this article uses the Python programming language to illustrate them as well. For readers unfamiliar with Python, the article provides a number of references to learn how to program in Python."
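In the spirit of that article, a first NLP exercise needs nothing beyond the Python standard library: tokenize a text crudely and count word frequencies (the article itself goes much further with dedicated toolkits):

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on non-letters: a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text, n=3):
    """Return the n most common tokens with their counts."""
    return Counter(tokenize(text)).most_common(n)
```

Running `word_frequencies("the cat saw the dog and the dog ran")` puts `("the", 3)` first, and immediately raises the real questions of the field: what counts as a word, and which words carry meaning?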

Computers That Speak Your Language - Voice recognition that finally holds up its end of a conversation is revolutionizing customer service. Now the goal is to make natural language the way to find any type of information, anywhere. By Wade Roush. Technology Review (June 2003). "Building a truly interactive customer service system like Nuance's requires solutions to each of the major challenges in natural-language processing: accurately transforming human speech into machine-readable text; analyzing the text's vocabulary and structure to extract meaning; generating a sensible response; and replying in a human-sounding voice."

Computing's too important to be left to men. BCS managing editor Brian Runciman interviewed Karen Sparck-Jones, winner of the 2007 BCS Lovelace Medal. The British Computer Society (March 2007). "[Q] By way of introduction, can you tell us something about your work? [A] In some respects I'm not a central computing person, on the other hand the area I've worked in has become more central and important to computing. I've always worked in what I like to call natural language information processing. That is to say dealing with information in natural language and information that is conveyed by natural language, because that's what we use. ..."

Chatbot bids to fool humans - A computer program designed to talk like a human is preparing for its biggest test in its bid to be truly "intelligent". By Jo Twist. BBC (September 22, 2003). "Jabberwacky lives on a computer hard drive, tells jokes, uses slang, sometimes swears and can be quite a confrontational conversationalist. What sets this chatty AI (artificial intelligence) chatbot apart from others is the more it natters, the more it learns. The bot is the only UK finalist in this year's Loebner Prize and is hoping to chat its way to a gold medal for its creator, Rollo Carpenter. The Loebner Prize is the annual competition to find the computer with the most convincing conversational skills and started in 1990. Jabberwacky will join eight other international finalists in October, when they pit their wits against flesh and blood judges to see if they can pass as one of them. It is the ultimate Turing Test, which was designed by mathematician Alan Turing to see whether computers 'think' and have 'intelligence'."

Course Lecture Notes

Graduate Course on Computational Models of Discourse as taught in Spring 2004. Prof. Regina Barzilay. Made available through MIT OpenCourseWare.

Related Resources

The Association for Computational Linguistics (ACL) is the "international scientific and professional society for people working on problems involving natural language and computation."

ACL NLP/CL Universe. "Web catalog/search engine that is devoted to Natural Language Processing and Computational Linguistics Web sites. It has existed since March 18, 1995." Maintained by Dragomir R. Radev for the ACL.

AI on the Web: Natural Language Processing. A resource companion to Stuart Russell and Peter Norvig's "Artificial Intelligence: A Modern Approach" with links to reference material, people, research groups, books, companies and much more.

Natural Language Research in Turkey. Kemik, the Natural Language Processing Workgroup in the Computer Engineering Department of Yildiz Technical University, Turkey.

National Centre for Text Mining (NaCTeM): "We provide text mining services in response to the requirements of the UK academic community. Our initial focus is on applications in the biological and medical domains, where the major successes in the mining of scientific texts have so far occurred. We also make significant contributions to the text mining research community, both nationally and internationally."

Natural Language Group. Information Sciences Institute, University of Southern California.

The Natural Language Processing Dictionary (NLP Dictionary). Compiled by Bill Wilson, Associate Professor in the Artificial Intelligence Group, School of Computer Science and Engineering, University of NSW. "You should use The NLP Dictionary to clarify or revise concepts that you have already met. The NLP Dictionary is not a suitable way to begin to learn about NLP."

Natural Language Processing Group, Cornell University.

Natural Language Processing Group, Department of Artificial Intelligence, University of Edinburgh.

"The goal of the [Microsoft] Natural Language Processing (NLP) group is to design and build a computer system that will analyze, understand, and generate languages that humans use naturally, so that eventually you can address your computer as though you were addressing another person. This goal is not easy to reach. ... The challenges we face stem from the highly ambiguous nature of natural language."

Natural Language Processing Laboratory, University of Pittsburgh. "We are pursuing research in a wide range of natural language processing problems, including discourse and dialogue, spoken language processing, affective computing, natural language learning, statistical parsing, and machine translation." Be sure to check out their projects.

Natural Language Processing Research Group at the University of Sheffield Department of Computer Science.

Natural Language Program. Artificial Intelligence Center, SRI. "The SRI AI Center Natural Language Program does research on natural language processing theory and applications. The Program has three subgroups. Multimedia/Multimodal Interfaces ... Spoken Language Systems ... Written Language Systems." Be sure to follow their links to projects, applications, and more!

The Natural Language Software Registry. "The Natural Language Software Registry (NLSR) [fourth edition] is a concise summary of the capabilities and sources of a large amount of natural language processing (NLP) software available to the NLP community. It comprises academic, commercial and proprietary software with specifications and terms on which it can be acquired clearly indicated." From the Language Technology Lab of the German Research Centre for Artificial Intelligence (DFKI GmbH).

The North American Computational Linguistics Olympiad (NAMCLO): "Like former [Linguistics] Olympiads, NAMCLO is a Linguistics contest. It challenges you to demonstrate your ability to understand and analyze human language. Unlike former contests, however, the NAMCLO focuses on Computational Linguistics problems, in addition to general linguistic ones." In addition to contest information, the site offers resources such as:

  • Problem Sets
  • Information about Language Technology: "Language technology is often called Human Language Technology (HLT) and consists of Computational Linguistics (or CL) and Speech Technology at its core and includes many application-oriented aspects of them as well. Language technology is closely connected to Computer Science and Linguistics."
    • Language Technology Areas - Here are the general language technology areas [with links to definitions]:
      • Machine Translation
      • Information Retrieval and Extraction
      • Natural Language Processing
      • Question Answering
      • Computational Biology
      • Speech Recognition
      • Speech Synthesis
      • Speaker Identification and Verification
      • Dialogue Systems

Stanford NLP Group. "A distinguishing feature of the Stanford NLP Group is our effective combination of sophisticated and deep linguistic modeling and data analysis with innovative probabilistic and machine learning approaches to NLP."

  • Be sure to see Christopher Manning's annotated list of statistical natural language processing and corpus-based computational linguistics resources.
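To give a flavor of the corpus-based methods these resources cover, here is a minimal sketch of a maximum-likelihood bigram language model, the simplest kind of statistical model discussed in the texts above. The toy corpus and function name are illustrative, not drawn from any of the listed resources; real systems train on millions of words and apply smoothing to handle unseen word pairs.

```python
from collections import Counter

# A toy corpus; real statistical NLP systems use far larger text collections.
corpus = "the cat sat on the mat . the dog sat on the log .".split()

# Count single words and adjacent word pairs.
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev, word):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

# "the" occurs 4 times; one of those occurrences is followed by "cat".
print(bigram_prob("the", "cat"))
```

A model like this assigns zero probability to any word pair absent from its training text, which is why practical systems (as Charniak's and Manning's books explain) combine such counts with smoothing techniques.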

The Turing Center: "a multidisciplinary research center at the University of Washington, investigating problems at the crossroads of natural language processing, data mining, Web search, and the Semantic Web. ... Our mission is to advance the philosophy, science, and technology of pan-lingual communication and collaboration among human and artificial agents."

Xerox Research Centre Europe (XRCE) - Parsing & Semantics: "ParSem concentrates on automatically making sense of electronic documents, by semantically analyzing them. ParSem concentrates on two main research lines of natural language processing: robust parsing and semantics."

  • Robust Parsing: "Robust parsing provides mechanisms for identifying major syntactic structures and major functional relations between words on large collections of unrestricted documents (ex: Web pages, newspapers, scientific literature, encyclopedias). ... Major applications include contextual entity recognition, lexical and structural disambiguation, coreference resolution and more globally knowledge extraction."
  • Semantics: "With the goal of transforming documents into “meaningful spaces”, the main focus has to be semantics. Semantics is everywhere, hidden in completely different types of documents (e.g. text, images, videos, programs and audio) and at different levels (e.g. document content, document structure). Because most of the “semantics” that is nowadays accessible in documents lies in texts, we concentrate on the semantic content analysis of the textual parts of documents. ... Our current research themes include: Ontology Acquisition ... Semantic Disambiguation ... Linguistic Normalization ... Co-reference ... Discourse Analysis ..."
  • Demos & Videos

Other References Offline

Aikins, Janice, Rodney Brooks, William Clancey, et al. 1981. Natural Language Processing Systems. In The Handbook of Artificial Intelligence, Vol. I, ed. Barr, Avron and Edward A. Feigenbaum, 283-321. Stanford/Los Altos, CA: HeurisTech Press/William Kaufmann, Inc.

Allen, J. F. 1994. Natural Language Understanding. Redwood City, CA: Benjamin/Cummings. A new edition of a classic work.

Bobrow, Daniel. 1968. Natural Language Input for a Computer Problem Solving System. In Semantic Information Processing, ed. Minsky, Marvin, 133-215. Cambridge, MA: MIT Press.

Charniak, E. 1993. Statistical Language Learning. Cambridge, MA: MIT Press.

Cohen, P., J. Morgan, and M. Pollack. 1990. Intentions in Communication. Cambridge, MA: MIT Press.

Grosz, Barbara J., Martha E. Pollack, and Candace L. Sidner. 1989. Discourse. In Foundations of Cognitive Science, ed. Posner, M., 437-468. Cambridge, MA: MIT Press.

Grosz, Barbara J., Karen Sparck Jones, and Bonnie L. Webber, editors. 1986. Readings in Natural Language Processing. San Mateo, CA: Morgan Kaufmann.

Mahesh, Kavi, and Sergei Nirenburg. 1997. Knowledge-Based Systems for Natural Language. In The Computer Science and Engineering Handbook, ed. Allen B. Tucker, Jr., 637-653. Boca Raton, FL: CRC Press, Inc.

McKeown, K., and W. Swartout. 1987. Language Generation and Explanation. In Annual Review of Computer Science, Vol. 2, Palo Alto, CA: Annual Reviews.

Patterson, Dan W. 1990. Natural Language Processing. In Introduction to Artificial Intelligence and Expert Systems by Dan W. Patterson, 227-270. Englewood Cliffs, NJ: Prentice Hall.

Schank, Roger C. 1975. The Structure of Episodes in Memory. In Computation and Intelligence: Collected Readings, ed. Luger, George F., 236-259. Menlo Park/Cambridge, MA: AAAI Press/The MIT Press, 1995.

Weizenbaum, J. 1966. ELIZA--A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9 (1): 36-45. A pioneering work.

Winograd, T. 1972. Understanding Natural Language. New York: Academic Press. A pioneering work.

Page last modified on July 23, 2012, at 07:58 AM