What is parsing in natural language processing
Natural language parsing
The term parsing describes the breaking down of an object into its individual parts. Parsers are indispensable in computer science: they occur in every compiler, and for many programming languages there are parser generators that produce program code for decomposing user input. Regular and context-free languages, which have long been studied in theoretical computer science, often serve as the theoretical foundation for these parsers.
These classes of formal languages have their origin in linguistics. They were introduced by Noam Chomsky with the aim of finding a suitable formalism for describing natural languages, i.e. languages that are spoken and written by people. Even though such a formalization has not yet been achieved, parsers remain important in computational linguistics and in the machine processing of natural languages, whose aim is the automated processing of natural languages by computer. Possible applications include the translation and summarization of texts, or chatbots that can communicate with people on a wide variety of topics. Parsers support these applications by providing a syntactic analysis (that is, an analysis of the grammatical structure) of the natural language input.
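To make the idea of syntactic analysis concrete, here is a minimal sketch of a CYK recognizer for a context-free grammar in Chomsky normal form. The grammar, the lexicon, and the function name `cyk_recognize` are hypothetical and chosen only for illustration; they are not taken from the tools described on this page.

```python
from itertools import product

# Toy grammar in Chomsky normal form (an assumption made for this example).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds all nonterminals that derive words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    # Combine shorter spans into longer ones.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY_RULES.get((left, right), set())
    return start in chart[0][n]


if __name__ == "__main__":
    print(cyk_recognize("the dog sees the cat".split()))
    print(cyk_recognize("the dog sees".split()))
```

The chart is indexed by span boundaries, so each cell depends only on strictly shorter spans. A full parser would additionally store back-pointers in each cell to recover the tree.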
Mildly context-sensitive languages
Natural languages soon turned out to be neither regular nor context-free, while the class of the more powerful context-sensitive languages is too complex for practical applications. For this reason, so-called mildly context-sensitive languages are studied. Representatives of this class are linear context-free rewriting systems (LCFRS) and the syntactically similar and semantically equivalent multiple context-free grammars (MCFG). These formalisms model, in particular, discontinuous phrases, i.e. parts of a sentence that form a logical unit but do not consist of consecutive words in the sentence. This phenomenon occurs, among others, in languages with free word order such as German, Swiss German and Dutch, but also in English.
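The notion of a discontinuous phrase can be illustrated with a short sketch that splits the word positions of a phrase into maximal contiguous blocks. The example sentence, the position sets, and the helper names `contiguous_blocks` and `block_count` are assumptions made for illustration; they do not come from a particular corpus or tool.

```python
def contiguous_blocks(positions):
    """Group word positions into maximal runs of consecutive indices.

    Returns a list of (first, last) pairs, one per block.
    """
    blocks = []
    for p in sorted(set(positions)):
        if blocks and p == blocks[-1][1] + 1:
            blocks[-1][1] = p          # extend the current block
        else:
            blocks.append([p, p])      # start a new block
    return [tuple(b) for b in blocks]


def block_count(positions):
    """Number of contiguous blocks; a value above 1 means discontinuous."""
    return len(contiguous_blocks(positions))


if __name__ == "__main__":
    sentence = "A man arrived who was wearing a hat".split()
    # Hypothetical analysis: the noun phrase "A man ... who was wearing a hat"
    # is interrupted by the verb "arrived".
    noun_phrase = {0, 1, 3, 4, 5, 6, 7}
    verb = {2}
    for name, span in [("noun phrase", noun_phrase), ("verb", verb)]:
        words = [sentence[i] for i in sorted(span)]
        print(name, contiguous_blocks(span), words)
```

In this hypothetical analysis the noun phrase covers two blocks, while the verb covers one. A context-free grammar can only assign a single contiguous span to each phrase, which is why such structures call for a more expressive formalism.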
Part of our working group is dedicated to researching such mildly context-sensitive grammars. On the one hand, this includes theoretical work, such as a Chomsky-Schützenberger characterization of weighted MCFG and an automaton characterization. Since the degree of discontinuity that an MCFG can represent affects its processing complexity, we also investigate approximation techniques. These are meant to yield less expressive grammars with lower processing complexity while preserving the language of the given grammar as far as possible. In this context we have also introduced so-called hybrid grammars, which synchronize, for example, an LCFRS with a tree-generating grammar. Hybrid grammars allow some of the complexity to be shifted into the tree-generating grammar, thereby reducing the processing complexity for practical applications. Another contribution in this area is the development of a reversible lexicalization method for MCFG.
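The degree of discontinuity mentioned above can be made tangible with a toy MCFG in which a nonterminal derives a tuple of strings rather than a single string. The sketch below hard-codes one small grammar for the language of all strings a^n b^n c^n, which is not context-free. The grammar and the function name `derive_anbncn` are illustrative assumptions, not the group's implementation.

```python
# Toy MCFG (hypothetical), written with one tuple component per string part:
#   A(ε, ε, ε)
#   A(a x, b y, c z)  <-  A(x, y, z)
#   S(x y z)          <-  A(x, y, z)
# The nonterminal A derives triples of strings, i.e. three separate parts.


def derive_anbncn(n):
    """Apply the recursive A-rule n times, then the S-rule, and return the string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    x, y, z = "", "", ""                      # rule A(ε, ε, ε)
    for _ in range(n):
        x, y, z = "a" + x, "b" + y, "c" + z   # rule A(a x, b y, c z)
    return x + y + z                          # rule S(x y z)


if __name__ == "__main__":
    for n in range(4):
        print(n, repr(derive_anbncn(n)))
```

Each additional component that a nonterminal may derive adds span boundaries that a parser has to track, which is one way to see why more discontinuity raises the processing cost. This particular grammar uses three components for readability and is not claimed to be minimal.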
Based on our theoretical research, we implement parsing algorithms and evaluate them on linguistic data sets, so-called corpora. Among other things, this work has produced the applications rustomata and panda-parser. We also investigate grammar-based parsers that are controlled by neural models.
Presentation slides on the topic
- Slides presented by Heiko Vogler at CAI 2017
- Prof. Dr.-Ing. habil. Dr. h.c./Univ. Szeged
Tel.: +49 (0) 351 463-38232
Fax: +49 (0) 351 463-37959