
Applied Natural Language Processing and Text Mining - Detail View

Course type: Lecture/Exercise    Course number: 192ELE203603
Semester: WiSe 2019/20    Weekly hours per semester (SWS): 4
Expected participants:    Max. participants:
Enrollment: Registration is not required for this course.
Language: English
Hyperlink: http://dke.uni-wuppertal.de/de/teaching
Dates
  Day Time Frequency Duration Room Instructor Cancelled on Max. participants
Thu. 10:00 to 12:00 weekly 17.10.2019 to 30.01.2020  FC Campus Freudenberg - FC.00.10     20
Thu. 12:00 to 14:00 weekly 24.10.2019 to 30.01.2020  FC Campus Freudenberg - FC.00.10     20
Individual dates:
  • 24.10.2019
  • 31.10.2019
  • 07.11.2019
  • 14.11.2019
  • 21.11.2019
  • 28.11.2019
  • 05.12.2019
  • 12.12.2019
  • 19.12.2019
  • 02.01.2020
  • 09.01.2020
  • 16.01.2020
  • 23.01.2020
  • 30.01.2020

Assigned persons
Person Responsibility
Gipp, Bela, Univ.-Prof. Dr. responsible
Meuschke, Norman assisting
Degree Program Exam regulations version Semester
Master's (universities) Informationstechnologie 20111 -
Master's (universities) Wirtschaftsing.Informat. 20171 -
Master's (universities) Informatik 20181 -
Assignment to institutions

By completing the course, participants will obtain the knowledge and skills to solve a wide range of applied problems in Natural Language Processing. To achieve this goal, the participants will get to know successful methods for solving sub-problems, such as text representation, information extraction, text mining, language modeling, and similarity detection. The participants will understand the conceptual requirements of specific NLP tasks and be able to devise approaches to address these tasks in practice. The participants will be able to assess the strengths and limitations of state-of-the-art NLP approaches and to propose solutions for interdisciplinary NLP problems.


The lecture will cover the following topics:

  • Introduction
    Course structure, schedule, projects, requirements, specifics
    Course topics, motivation
    Overview of the field
  • Text representation
    Words, sentences, paragraphs, documents
    Text processing, regular expressions, tokenization, stemming, lemmatization
    Bag-of-Words, weighting schemes (e.g., tf-idf), information retrieval
    Minimum edit distance
    Language models, N-grams, perplexity, information gain, smoothing
    Word sense, lexical databases, distance measures
  • Word embeddings and dense vector representations
    Vector representation
    Recap on NLP representations before 2013
    word2vec, GloVe, fastText
    Multi-Sense Embeddings
    ELMo, USE
  • Applications
    Lexical databases, lexical semantics
    Word sense disambiguation, semantic similarity
    Part-of-speech tagging, parsing
    Word similarity, word dissimilarity, distance measures
    Text classification
    Sentiment analysis / evaluation
    Named entity recognition, information extraction, relation extraction
    Question answering, chatbots, dialog systems
    Text summarization
    Machine translation
    Fake news detection
    Plagiarism / paraphrase detection
    Math retrieval, MathML
    Automatic detection of political opinions
    Online harassment detection
    Collaboration network analysis
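
To make one of the text-representation topics above more concrete, here is a minimal sketch of bag-of-words with tf-idf weighting in plain Python. It is not taken from the course materials: the function names `tokenize` and `tf_idf`, the toy corpus, and the choice of raw term frequency with idf = log(N / df) are all assumptions made for illustration, and other weighting variants are equally common.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf(corpus):
    """Return one {term: weight} dict per document.

    Assumes raw term frequency and idf = log(N / df); this is only one
    of several weighting schemes in use.
    """
    docs = [Counter(tokenize(doc)) for doc in corpus]
    n_docs = len(docs)
    # Document frequency: the number of documents each term occurs in.
    # Iterating over a Counter yields each distinct term once per document.
    df = Counter(term for doc in docs for term in doc)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in doc.items()}
        for doc in docs
    ]


# Toy corpus, made up for illustration only.
corpus = ["the cat sat", "the dog sat", "the dog barked"]
weights = tf_idf(corpus)
```

In this toy corpus, a term that occurs in every document ("the") receives weight zero, while a term unique to one document ("cat") receives the highest weight in that document.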

Participants (teamwork is possible) will carry out an applied research project that addresses complex NLP downstream tasks and subtasks, such as:

  • Word similarity
  • Document and sentence classification
  • Named entity recognition
  • Question answering systems
  • Text summarization
  • Objective and subjective classification
  • Sentiment analysis
  • Part-of-speech tagging
  • Compositional knowledge entailment (entailment, contradiction, neutral)
  • Relation extraction and parsing
  • Machine translation
  • ...

Applications that participants can address in their projects include but are not limited to:

  • Plagiarism and paraphrase detection
  • Social media analysis
  • Fake news identification and classification
  • Spell checking
  • Detection of political opinions
  • Identification of opinion polarity
  • Online harassment and bias identification systems
  • Collaboration network analysis
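
As a rough illustration of how a lecture topic can feed into one of these applications, the sketch below uses minimum edit distance (Levenshtein distance with unit costs) for a very simple spell-checking suggestion. The names `edit_distance` and `suggest` and the mini-vocabulary are hypothetical and chosen only for this example; a real project would likely add word frequencies, context, or a larger lexicon.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def suggest(word, vocabulary):
    """Return the vocabulary entry closest to `word` by edit distance."""
    return min(vocabulary, key=lambda candidate: edit_distance(word, candidate))


# Hypothetical mini-vocabulary, for illustration only.
vocabulary = ["language", "processing", "mining", "embedding"]
print(suggest("langauge", vocabulary))
```

Keeping only the previous row of the dynamic-programming table keeps memory linear in the length of the target string, which is sufficient when only the distance (not the alignment) is needed.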

The course is taught in English. Basic knowledge of Python (e.g., branches, loops, object orientation) is required to complete the course. Experience with NumPy, scikit-learn, pandas, and other libraries in the SciPy ecosystem is beneficial but not mandatory. For participants who are unfamiliar with Python, a fast-paced introduction to the essentials of the language will be provided.


The format of the final exam depends on the number of participants. With roughly 10 or fewer participants, the exam will be oral; otherwise, it will be written. The final exam can include questions specific to the applied research project (component B).

This course is not classified in the course catalog. The course is from the semester WiSe 2019/20; current semester: SoSe 2020.

2007 WUSEL-Team Bergische Universität Wuppertal