| Comment | By completing the course, participants will obtain the knowledge and skills to solve a wide range of applied problems in Natural Language Processing (NLP). To achieve this goal, participants will get to know successful methods for solving sub-problems such as text representation, information extraction, text mining, language modeling, and similarity detection. They will understand the conceptual requirements of specific NLP tasks and be able to devise approaches to address these tasks in practice. They will also be able to assess the strengths and limitations of state-of-the-art NLP approaches and to propose solutions for interdisciplinary NLP problems.
The lecture will cover the following topics: 
Introduction
 Course structure, schedule, projects, requirements, specifics
 Course topics, motivation
 Overview of the field
 
Text representation
 Words, sentences, paragraphs, documents
 Text processing, regular expressions, tokenization, stemming, lemmatization
 Bag-of-Words, weighting schemes (e.g., tf-idf), information retrieval
 Minimum edit distance
 Language models, N-grams, perplexity, information gain, smoothing
 Word sense, lexical databases, distance measures
 
Word embeddings and dense vector representations
 Vector representation
 Recap on NLP representations before 2013
 word2vec, GloVe, fastText
 Paragraph-Vectors
 Multi-Sense Embeddings
 ELMo, USE
 
Applications
 Lexical databases, lexical semantics
 Word sense disambiguation, semantic similarity
 Part-of-speech tagging, parsing
 Word similarity, word dissimilarity, distance measures
 Text classification
 Sentiment analysis / evaluation
 Named entity recognition, information extraction, relation extraction
 Question answering, chatbots, dialog systems
 Text summarization
 Machine translation
 Fake news detection
 Plagiarism / paraphrase detection
 Math retrieval, MathML
 Automatic detection of political opinions
 Online harassment detection
 Collaboration network analysis
 
Participants (teamwork is possible) will carry out an applied research project that addresses complex NLP downstream tasks and subtasks, such as:
 Word similarity
 Document and sentence classification
 Named entity recognition
 Question answering systems
 Text summarization
 Objective and subjective classification
 Sentiment analysis
 Part-of-speech tagging
 Compositional knowledge entailment (entailment, contradiction, neutral)
 Relation extraction and parsing
 Machine translation
 ...
 
Applications that participants can address in their projects include but are not limited to:
 Plagiarism and paraphrase detection
 Social media analysis
 Fake news identification and classification
 Spell checking
 Detection of political opinions
 Identification of opinion polarity
 Online harassment and bias identification systems
 Collaboration network analysis | 
											
				| Prerequisites | The course is taught in English. Basic knowledge of Python (e.g., branches, loops, object orientation) is required to complete the course. Experience with NumPy, scikit-learn, pandas, and other libraries in the SciPy ecosystem is beneficial but not mandatory. For participants who are unfamiliar with Python, a fast-paced introduction to the essentials of the language will be provided. |
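
As a rough gauge of the Python level named in the prerequisites (branches, loops, object orientation), the following is a minimal sketch and not taken from the course materials. It applies those constructs to one of the listed topics, minimum edit distance. The class name `EditDistance` and the parameter `substitution_cost` are hypothetical names chosen for illustration.

```python
class EditDistance:
    """Levenshtein distance with unit costs for insertion and deletion."""

    def __init__(self, substitution_cost=1):
        # Hypothetical parameter: cost charged when two characters differ.
        self.substitution_cost = substitution_cost

    def distance(self, source, target):
        # previous[j] is the distance between the source prefix processed
        # so far (before the current character) and target[:j].
        previous = list(range(len(target) + 1))
        for i, s_char in enumerate(source, start=1):
            # Turning source[:i] into the empty string takes i deletions.
            current = [i]
            for j, t_char in enumerate(target, start=1):
                if s_char == t_char:
                    cost = 0
                else:
                    cost = self.substitution_cost
                current.append(min(
                    previous[j] + 1,         # delete s_char
                    current[j - 1] + 1,      # insert t_char
                    previous[j - 1] + cost,  # match or substitute
                ))
            previous = current
        return previous[-1]


metric = EditDistance()
print(metric.distance("kitten", "sitting"))  # 3
```

Participants who can read this sketch without difficulty most likely meet the stated Python requirement. Prior experience with NumPy, scikit-learn, or pandas is not needed to follow it.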