Royce  Reinger

Royce Reinger

1657828080

Open-nlp: Ruby Bindings to The OpenNLP Java Toolkit

About

This library provides high-level Ruby bindings to the Open NLP package, a Java machine learning toolkit for natural language processing (NLP). This gem is compatible with Ruby 1.9.2 and 1.9.3 as well as JRuby 1.7.1. It is tested on both Java 6 and Java 7.

Installing

First, install the gem: gem install open-nlp. Then, download the JARs and English language models in one package (80 MB).

Place the contents of the extracted archive inside the /bin/ folder of the open-nlp gem (e.g. [...]/gems/open-nlp-0.x.x/bin/).

Alternatively, from a terminal window, cd to the gem's folder and run:

wget http://www.louismullie.com/treat/open-nlp-english.zip
unzip -o open-nlp-english.zip -d bin/

Afterwards, you may individually download the appropriate models for other languages from the open-nlp website.

Configuring

After installing and requiring the gem (require 'open-nlp'), you may want to set some of the following configuration options.

# Set an alternative path to look for the JAR files.
# Default is gem's bin folder.
OpenNLP.jar_path = '/path_to_jars/'

# Set an alternative path to look for the model files.
# Default is gem's bin folder.
OpenNLP.model_path = '/path_to_models/'

# Pass some alternative arguments to the Java VM.
# Default is ['-Xms512M', '-Xmx1024M'].
OpenNLP.jvm_args = ['-option1', '-option2']

# Redirect VM output to log.txt
OpenNLP.log_file = 'log.txt'

# Set default models for a language.
OpenNLP.use :language

Examples

Simple tokenizer

OpenNLP.load

sent = "The death of the poet was kept from his poems."
tokenizer = OpenNLP::SimpleTokenizer.new

tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]

Maximum entropy tokenizer, chunker and POS tagger


OpenNLP.load

chunker   = OpenNLP::ChunkerME.new
tokenizer = OpenNLP::TokenizerME.new
tagger    = OpenNLP::POSTaggerME.new

sent   = "The death of the poet was kept from his poems."

tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]

tags   = tagger.tag(tokens).to_a
# => %w[DT NN IN DT NN VBD VBN IN PRP$ NNS .]

chunks = chunker.chunk(tokens, tags).to_a
# => %w[B-NP I-NP B-PP B-NP I-NP B-VP I-VP B-PP B-NP I-NP O]

Abstract Bottom-Up Parser

OpenNLP.load

sent      = "The death of the poet was kept from his poems."
parser = OpenNLP::Parser.new
parse = parser.parse(sent)

parse.get_text.should eql sent

parse.get_span.get_start.should eql 0
parse.get_span.get_end.should eql 46
parse.get_child_count.should eql 1

child = parse.get_children[0]

child.text # => "The death of the poet was kept from his poems."
child.get_child_count # => 3
child.get_head_index #=> 5
child.get_type # => "S"

Maximum Entropy Name Finder*

OpenNLP.load

text = File.read('./spec/sample.txt').gsub!("\n", "")

tokenizer   = OpenNLP::TokenizerME.new
segmenter   = OpenNLP::SentenceDetectorME.new
ner_models  = ['person', 'time', 'money']

ner_finders = ner_models.map do |model|
  OpenNLP::NameFinderME.new("en-ner-#{model}.bin")
end

sentences = segmenter.sent_detect(text)
named_entities = []

sentences.each do |sentence|

  tokens = tokenizer.tokenize(sentence)
  
  ner_models.each_with_index do |model,i|
    finder = ner_finders[i]
    name_spans = finder.find(tokens)
    name_probs = finder.probs()
    name_spans.each_with_index do |name_span,j|
      start = name_span.get_start
      stop  = name_span.get_end-1
      slice = tokens[start..stop].to_a
      prob  = name_probs[j]
      named_entities << [slice, model, prob]
    end
  end

end

Loading specific models

Just pass the name of the model file to the constructor. The gem will search for the file in the OpenNLP.model_path folder.

OpenNLP.load

tokenizer = OpenNLP::TokenizerME.new('en-token.bin')
tagger = OpenNLP::POSTaggerME.new('en-pos-perceptron.bin')
name_finder = OpenNLP::NameFinderME.new('en-ner-person.bin')
# etc.

Loading specific classes

You may want to load specific classes from the OpenNLP library that are not loaded by default. The gem provides an API to do this:

# Default base class is opennlp.tools.
OpenNLP.load_class('SomeClassName')  
# => OpenNLP::SomeClassName

# Here, we specify another base class.
OpenNLP.load_class('SomeOtherClass', 'opennlp.tools.namefind')
# => OpenNLP::SomeOtherClass

Contributing

Fork the project and send me a pull request! Config updates for other languages are welcome.

Author: louismullie
Source Code: https://github.com/louismullie/open-nlp 
License: View license

#ruby #open #nlp #java 

What is GEEK

Buddha Community

Open-nlp: Ruby Bindings to The OpenNLP Java Toolkit
Tyrique  Littel

Tyrique Littel

1600135200

How to Install OpenJDK 11 on CentOS 8

What is OpenJDK?

OpenJDk or Open Java Development Kit is a free, open-source framework of the Java Platform, Standard Edition (or Java SE). It contains the virtual machine, the Java Class Library, and the Java compiler. The difference between the Oracle OpenJDK and Oracle JDK is that OpenJDK is a source code reference point for the open-source model. Simultaneously, the Oracle JDK is a continuation or advanced model of the OpenJDK, which is not open source and requires a license to use.

In this article, we will be installing OpenJDK on Centos 8.

#tutorials #alternatives #centos #centos 8 #configuration #dnf #frameworks #java #java development kit #java ee #java environment variables #java framework #java jdk #java jre #java platform #java sdk #java se #jdk #jre #open java development kit #open source #openjdk #openjdk 11 #openjdk 8 #openjdk runtime environment

Royce  Reinger

Royce Reinger

1657828080

Open-nlp: Ruby Bindings to The OpenNLP Java Toolkit

About

This library provides high-level Ruby bindings to the Open NLP package, a Java machine learning toolkit for natural language processing (NLP). This gem is compatible with Ruby 1.9.2 and 1.9.3 as well as JRuby 1.7.1. It is tested on both Java 6 and Java 7.

Installing

First, install the gem: gem install open-nlp. Then, download the JARs and English language models in one package (80 MB).

Place the contents of the extracted archive inside the /bin/ folder of the open-nlp gem (e.g. [...]/gems/open-nlp-0.x.x/bin/).

Alternatively, from a terminal window, cd to the gem's folder and run:

wget http://www.louismullie.com/treat/open-nlp-english.zip
unzip -o open-nlp-english.zip -d bin/

Afterwards, you may individually download the appropriate models for other languages from the open-nlp website.

Configuring

After installing and requiring the gem (require 'open-nlp'), you may want to set some of the following configuration options.

# Set an alternative path to look for the JAR files.
# Default is gem's bin folder.
OpenNLP.jar_path = '/path_to_jars/'

# Set an alternative path to look for the model files.
# Default is gem's bin folder.
OpenNLP.model_path = '/path_to_models/'

# Pass some alternative arguments to the Java VM.
# Default is ['-Xms512M', '-Xmx1024M'].
OpenNLP.jvm_args = ['-option1', '-option2']

# Redirect VM output to log.txt
OpenNLP.log_file = 'log.txt'

# Set default models for a language.
OpenNLP.use :language

Examples

Simple tokenizer

OpenNLP.load

sent = "The death of the poet was kept from his poems."
tokenizer = OpenNLP::SimpleTokenizer.new

tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]

Maximum entropy tokenizer, chunker and POS tagger


OpenNLP.load

chunker   = OpenNLP::ChunkerME.new
tokenizer = OpenNLP::TokenizerME.new
tagger    = OpenNLP::POSTaggerME.new

sent   = "The death of the poet was kept from his poems."

tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]

tags   = tagger.tag(tokens).to_a
# => %w[DT NN IN DT NN VBD VBN IN PRP$ NNS .]

chunks = chunker.chunk(tokens, tags).to_a
# => %w[B-NP I-NP B-PP B-NP I-NP B-VP I-VP B-PP B-NP I-NP O]

Abstract Bottom-Up Parser

OpenNLP.load

sent      = "The death of the poet was kept from his poems."
parser = OpenNLP::Parser.new
parse = parser.parse(sent)

parse.get_text.should eql sent

parse.get_span.get_start.should eql 0
parse.get_span.get_end.should eql 46
parse.get_child_count.should eql 1

child = parse.get_children[0]

child.text # => "The death of the poet was kept from his poems."
child.get_child_count # => 3
child.get_head_index #=> 5
child.get_type # => "S"

Maximum Entropy Name Finder*

OpenNLP.load

text = File.read('./spec/sample.txt').gsub!("\n", "")

tokenizer   = OpenNLP::TokenizerME.new
segmenter   = OpenNLP::SentenceDetectorME.new
ner_models  = ['person', 'time', 'money']

ner_finders = ner_models.map do |model|
  OpenNLP::NameFinderME.new("en-ner-#{model}.bin")
end

sentences = segmenter.sent_detect(text)
named_entities = []

sentences.each do |sentence|

  tokens = tokenizer.tokenize(sentence)
  
  ner_models.each_with_index do |model,i|
    finder = ner_finders[i]
    name_spans = finder.find(tokens)
    name_probs = finder.probs()
    name_spans.each_with_index do |name_span,j|
      start = name_span.get_start
      stop  = name_span.get_end-1
      slice = tokens[start..stop].to_a
      prob  = name_probs[j]
      named_entities << [slice, model, prob]
    end
  end

end

Loading specific models

Just pass the name of the model file to the constructor. The gem will search for the file in the OpenNLP.model_path folder.

OpenNLP.load

tokenizer = OpenNLP::TokenizerME.new('en-token.bin')
tagger = OpenNLP::POSTaggerME.new('en-pos-perceptron.bin')
name_finder = OpenNLP::NameFinderME.new('en-ner-person.bin')
# etc.

Loading specific classes

You may want to load specific classes from the OpenNLP library that are not loaded by default. The gem provides an API to do this:

# Default base class is opennlp.tools.
OpenNLP.load_class('SomeClassName')  
# => OpenNLP::SomeClassName

# Here, we specify another base class.
OpenNLP.load_class('SomeOtherClass', 'opennlp.tools.namefind')
# => OpenNLP::SomeOtherClass

Contributing

Fork the project and send me a pull request! Config updates for other languages are welcome.

Author: louismullie
Source Code: https://github.com/louismullie/open-nlp 
License: View license

#ruby #open #nlp #java 

Samanta  Moore

Samanta Moore

1620458875

Going Beyond Java 8: Local Variable Type Inference (var) - DZone Java

According to some surveys, such as JetBrains’s great survey, Java 8 is currently the most used version of Java, despite being a 2014 release.

What you are reading is one in a series of articles titled ‘Going beyond Java 8,’ inspired by the contents of my book, Java for Aliens. These articles will guide you step-by-step through the most important features introduced to the language, starting from version 9. The aim is to make you aware of how important it is to move forward from Java 8, explaining the enormous advantages that the latest versions of the language offer.

In this article, we will talk about the most important new feature introduced with Java 10. Officially called local variable type inference, this feature is better known as the **introduction of the word **var. Despite the complicated name, it is actually quite a simple feature to use. However, some observations need to be made before we can see the impact that the introduction of the word var has on other pre-existing characteristics.

#java #java 11 #java 10 #java 12 #var #java 14 #java 13 #java 15 #verbosity

8 Open-Source Tools To Start Your NLP Journey

Teaching machines to understand human context can be a daunting task. With the current evolving landscape, Natural Language Processing (NLP) has turned out to be an extraordinary breakthrough with its advancements in semantic and linguistic knowledge. NLP is vastly leveraged by businesses to build customised chatbots and voice assistants using its optical character and speed recognition techniques along with text simplification.

To address the current requirements of NLP, there are many open-source NLP tools, which are free and flexible enough for developers to customise it according to their needs. Not only these tools will help businesses analyse the required information from the unstructured text but also help in dealing with text analysis problems like classification, word ambiguity, sentiment analysis etc.

Here are eight NLP toolkits, in no particular order, that can help any enthusiast start their journey with Natural language Processing.


Also Read: Deep Learning-Based Text Analysis Tools NLP Enthusiasts Can Use To Parse Text

1| Natural Language Toolkit (NLTK)

About: Natural Language Toolkit aka NLTK is an open-source platform primarily used for Python programming which analyses human language. The platform has been trained on more than 50 corpora and lexical resources, including multilingual WordNet. Along with that, NLTK also includes many text processing libraries which can be used for text classification tokenisation, parsing, and semantic reasoning, to name a few. The platform is vastly used by students, linguists, educators as well as researchers to analyse text and make meaning out of it.


#developers corner #learning nlp #natural language processing #natural language processing tools #nlp #nlp career #nlp tools #open source nlp tools #opensource nlp tools

Ray  Patel

Ray Patel

1623348300

Top 8 Java Open Source Projects You Should Get Your Hands-on [2021]

Learning about Java is no easy feat. It’s a prevalent and in-demand programming language with applications in numerous sectors. We all know that if you want to learn a new skill, the best way to do so is through using it. That’s why we recommend working on projects.

So if you’re a Java student, then you’ve come to the right place as this article will help you learn about the most popular Java open source projects. This way, you’d have a firm grasp of industry trends and the programming language’s applications.

However, before we discuss its various projects, it’s crucial to examine the place where you can get those projects – GitHub. Let’s begin.

#full stack development #java open source projects #java projects #open source projects #top 8 java open source projects #java open source projects