Machine-Learning Java Natural Language Processing
- Retina: an API performing complex NLP operations (disambiguation, classification, streaming text filtering, etc.) as quickly and intuitively as the brain. [See the Tutorial Video](https://www.youtube.com/watch?v=CsF4pd7fGF0).
- Stanford CoreNLP provides a set of natural language analysis tools which can take raw English language text input and give the base forms of words.
- A natural language parser is a program that works out the grammatical structure of sentences.
- A Part-Of-Speech Tagger (POS Tagger).
- Stanford NER is a Java implementation of a Named Entity Recognizer.
- Tokenization of raw text is a standard pre-processing step for many NLP tasks.
- Tregex is a utility for matching patterns in trees, based on tree relationships and regular expression matches on nodes (the name is short for "tree regular expressions").
- Stanford Phrasal is a state-of-the-art statistical phrase-based machine translation system, written in Java.
- A tokenizer divides text into a sequence of tokens, which roughly correspond to "words".
- SUTime is a library for recognizing and normalizing time expressions.
- Learning entities from unlabeled text, starting with seed sets and using patterns in an iterative fashion.
- A Java implementation of Twitter's text processing library.
- A Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
- A machine-learning-based toolkit for the processing of natural language text.
- A toolkit for processing text using computational linguistics.
- Apache Clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for information extraction from electronic medical record clinical free-text.
- This project collects a number of core libraries for Natural Language Processing (NLP) developed in the University of Illinois' Cognitive Computation Group, for example `illinois-core-utilities`, which provides a set of NLP-friendly data structures and a number of NLP-related utilities that support writing NLP applications, running experiments, etc.; `illinois-edison`, a library for feature extraction from `illinois-core-utilities` data structures; and many other packages.
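
As noted above, a tokenizer divides text into a sequence of tokens that roughly correspond to "words". The following is a minimal sketch of that idea using only the JDK's regular expression classes. It is not how any of the listed libraries implement tokenization; the class name `TokenizerSketch` and the token rule (a run of letters or digits, or a single non-whitespace symbol) are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of any library listed above.
public class TokenizerSketch {

    // Assumed token rule: a run of letters/digits, or one non-space symbol.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|[^\\s\\p{L}\\p{N}]");

    // Returns the tokens of the input text in order of appearance.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, world!"));
    }
}
```

Under this naive rule, punctuation becomes its own token and a hyphenated word such as "pre-processing" is split into three tokens. Production tokenizers typically handle such cases (along with abbreviations, URLs, and contractions) with considerably more care.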