Maria Lopez

In layman's words, how would you explain a non-fungible token?

If you've been following blockchain news lately, you've probably heard the term "NFT." The most talked-about NFTs and their million-dollar bids are attracting both artists and collectors. However, most people are unaware of what NFTs actually are and how they work. In this article, we'll look at NFTs, the technology behind them, and their real-world applications. We'll also look at where to buy NFTs safely.

Contents

NFTs: What You Should Know
Understanding fungibility
Understanding non-fungibility
Understanding digital assets
Understanding blockchain
Understanding the distinction between cryptocurrency and non-fungible tokens
Understanding how NFTs work
Final Thoughts


NFTs: What You Should Know
Non-fungible tokens are a form of digital asset certification that can apply to almost any type of media. Though films, music, and art are the most prevalent NFT media, digital documents, writing, and other content also meet the requirements. Let's look at some key terminology in the field of NFT crypto:

Understanding fungibility
Fungibility, in simple terms, means that something can be exchanged. Fungible items can be swapped for other items of equal worth, such as silver, gold, currency, or wheat. Readily available commodities generally fall into the fungible category.

Understanding non-fungibility
Non-fungible items are one-of-a-kind and irreplaceable, for example, an ancient coin or a signed autograph.

Understanding digital assets
Anything that exists in digital format is a digital asset. These assets typically come with permission to use, which means anybody can copy, replicate, or change them, and that makes ownership hard to establish. Digital assets include things like audio, documents, photographs, and video.

Understanding blockchain
A blockchain is a type of database that stores digital data as records across its network. Unlike a conventional database, however, a blockchain consists of a series of interconnected blocks. This chain of blocks forms a distributed ledger, a shared collection of information that records all of the chain's activity. If you want to understand more about blockchain, consider consulting a qualified NFT expert.

Every blockchain ledger is stored on thousands of different servers around the world, and everyone connected to the network can see each user's entries and double-check each new entry. This arrangement is referred to as a peer-to-peer network, and it operates on shared-ledger technology. It is also what makes it practically impossible to modify or change the data in a block once it has been recorded.
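To see why chaining blocks makes tampering evident, here is a toy Ruby sketch. It is purely illustrative (the Block struct, the sample transactions, and the valid? helper are invented for this example, and no production blockchain works this simply): each block commits to the hash of the previous block, so altering an old record breaks every link that follows it.

require "digest"

# A minimal, hypothetical block: it stores the digest of the block before it.
Block = Struct.new(:index, :data, :prev_hash) do
  def digest
    Digest::SHA256.hexdigest([index, data, prev_hash].join("|"))
  end
end

genesis = Block.new(0, "genesis", "0")
second  = Block.new(1, "Alice pays Bob 5", genesis.digest)
third   = Block.new(2, "Bob pays Carol 2", second.digest)
chain   = [genesis, second, third]

# The chain is valid only if every block references the previous block's digest.
def valid?(chain)
  chain.each_cons(2).all? { |prev, blk| blk.prev_hash == prev.digest }
end

valid?(chain)                        #=> true
second.data = "Alice pays Bob 500"   # tamper with an old record
valid?(chain)                        #=> false, the link to the third block breaks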

Understanding the distinction between cryptocurrency and non-fungible tokens
Cryptocurrencies and NFTs are built on the same underlying technology, and individuals must keep a digital wallet with sufficient cryptocurrency in order to purchase NFTs. The basic functions of cryptocurrencies and NFTs, however, are vastly different.

Cryptocurrencies aim to function as digital money, letting you buy and sell items or store value, and, like dollars and other fiat currencies, they are fungible by nature. NFTs, on the other hand, are one-of-a-kind tokens that represent ownership of a specific digital item. If you want to learn everything there is to know about this subject, consider hiring a competent NFT expert. Various NFT marketplaces, such as OpenSea, Rarible, and others, allow you to purchase and trade NFTs.

Understanding how NFTs work
Although other blockchains also support NFTs, the majority are created on the Ethereum blockchain. Because a blockchain can be viewed by anybody, it is easy to trace and validate ownership rights over an NFT, while the token owner can remain anonymous at the same time.

NFTs can tokenize a wide range of digital items, including artworks, gaming accessories, snippets, and complete films from well-known broadcasts. NBA Top Shot is also one of the best-known NFT marketplaces.

The file size of the digital asset does not matter when linking it to the NFT that carries the ownership rights, because the asset itself remains stored independently of the blockchain.

However, bear in mind that when purchasing an NFT, licensing and copyright may not be included; there are, of course, exceptions in some circumstances. NFT use cases are expanding beyond traditional digital artwork, thanks to recent improvements in the underlying blockchain technology.

Final Thoughts
NFTs are making it easier for artists to sell their work on the internet, and they give creators a way to profit from each subsequent sale of their artwork. Buyers, on the other hand, can verify the legitimacy of a piece of digital art and display their NFT collection wherever they choose. If you're thinking about investing in NFTs, keep in mind that the market is volatile: some NFTs sell for millions, while others remain unsold for long periods.

This was a simple description of non-fungible tokens, intended to give beginners a basic understanding. If you want to learn more, join NFTiCally and benefit from the advice of a highly experienced NFT expert.


Words Counted: A Ruby Natural Language Processor.

WordsCounted

We are all in the gutter, but some of us are looking at the stars.

-- Oscar Wilde

WordsCounted is a Ruby NLP (natural language processor). WordsCounted lets you implement powerful tokenisation strategies with a very flexible tokeniser class.

Are you using WordsCounted to do something interesting? Please tell me about it.

 

Demo

Visit this website for one example of what you can do with WordsCounted.

Features

  • Out of the box, get the following data from any string or readable file, or URL:
    • Token count and unique token count
    • Token densities, frequencies, and lengths
    • Char count and average chars per token
    • The longest tokens and their lengths
    • The most frequent tokens and their frequencies.
  • A flexible way to exclude tokens from the tokeniser. You can pass a string, regexp, symbol, lambda, or an array of any combination of those types for powerful tokenisation strategies.
  • Pass your own regexp rules to the tokeniser if you prefer. The default regexp filters special characters but keeps hyphens and apostrophes. It also plays nicely with diacritics (UTF and unicode characters): Bayrūt is treated as ["Bayrūt"] and not ["Bayr", "ū", "t"], for example.
  • Opens and reads files. Pass in a file path or a url instead of a string.

Installation

Add this line to your application's Gemfile:

gem 'words_counted'

And then execute:

$ bundle

Or install it yourself as:

$ gem install words_counted

Usage

Pass in a string or a file path, and an optional filter and/or regexp.

counter = WordsCounted.count(
  "We are all in the gutter, but some of us are looking at the stars."
)

# Using a file
counter = WordsCounted.from_file("path/or/url/to/my/file.txt")

.count and .from_file are convenience methods that take an input, tokenise it, and return an instance of WordsCounted::Counter initialized with the tokens. The WordsCounted::Tokeniser and WordsCounted::Counter classes can be used alone, however.
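For example, here is a minimal sketch of using the two classes directly. It assumes, as the sentence above implies, that WordsCounted::Counter.new accepts the token array returned by the tokeniser:

tokens  = WordsCounted::Tokeniser.new("We are all in the gutter").tokenise
counter = WordsCounted::Counter.new(tokens)

counter.token_count      #=> 6
counter.uniq_token_count #=> 6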

API

WordsCounted

WordsCounted.count(input, options = {})

Tokenises input and initializes a WordsCounted::Counter object with the resulting tokens.

counter = WordsCounted.count("Hello Beirut!")

Accepts two options: exclude and regexp. See Excluding tokens from the analyser and Passing in a custom regexp respectively.
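For instance, a small sketch passing the exclude option straight to .count (the full range of exclusion filters is covered later):

counter = WordsCounted.count("Hello Beirut! Hello world!", exclude: "hello")
counter.tokens #=> ["beirut", "world"]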

WordsCounted.from_file(path, options = {})

Reads and tokenises a file, and initializes a WordsCounted::Counter object with the resulting tokens.

counter = WordsCounted.from_file("hello_beirut.txt")

Accepts the same options as .count.

Tokeniser

The tokeniser allows you to tokenise text in a variety of ways. You can pass in your own rules for tokenisation, and apply a powerful filter with any combination of rules as long as they can boil down into a lambda.

Out of the box the tokeniser includes only alpha chars. Hyphenated tokens and tokens with apostrophes are considered a single token.
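For example, with the default pattern a hyphenated word with an apostrophe comes back as a single token:

WordsCounted::Tokeniser.new("Mother-in-law's recipe").tokenise
#=> ["mother-in-law's", "recipe"]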

#tokenise([pattern: TOKEN_REGEXP, exclude: nil])

tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise

# With `exclude`
tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise(exclude: "hello")

# With `pattern`
tokeniser = WordsCounted::Tokeniser.new("I <3 Beirut!").tokenise(pattern: /[a-z]/i)

See Excluding tokens from the analyser and Passing in a custom regexp for more information.

Counter

The WordsCounted::Counter class allows you to collect various statistics from an array of tokens.

#token_count

Returns the token count of a given string.

counter.token_count #=> 15

#token_frequency

Returns a sorted (unstable) two-dimensional array where each element is a token and its frequency. The array is sorted by frequency in descending order.

counter.token_frequency

[
  ["the", 2],
  ["are", 2],
  ["we",  1],
  # ...
  ["all", 1]
]

#most_frequent_tokens

Returns a hash where each key-value pair is a token and its frequency.

counter.most_frequent_tokens

{ "are" => 2, "the" => 2 }

#token_lengths

Returns a sorted (unstable) two-dimensional array where each element contains a token and its length. The array is sorted by length in descending order.

counter.token_lengths

[
  ["looking", 7],
  ["gutter",  6],
  ["stars",   5],
  # ...
  ["in",      2]
]

#longest_tokens

Returns a hash where each key-value pair is a token and its length.

counter.longest_tokens

{ "looking" => 7 }

#token_density([ precision: 2 ])

Returns a sorted (unstable) two-dimensional array where each element contains a token and its density as a float, rounded to a precision of two. The array is sorted by density in descending order. It accepts a precision argument, which must be a float.

counter.token_density

[
  ["are",     0.13],
  ["the",     0.13],
  ["but",     0.07 ],
  # ...
  ["we",      0.07 ]
]
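For example, assuming the keyword form shown in the signature above, a higher precision gives finer-grained densities for the same Oscar Wilde sentence (2/15 and 1/15 respectively):

counter.token_density(precision: 4)

[
  ["are",     0.1333],
  ["the",     0.1333],
  ["but",     0.0667],
  # ...
  ["we",      0.0667]
]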

#char_count

Returns the char count of tokens.

counter.char_count #=> 76

#average_chars_per_token([ precision: 2 ])

Returns the average char count per token rounded to two decimal places. Accepts a precision argument which defaults to two. Precision must be a float.

counter.average_chars_per_token #=> 4

#uniq_token_count

Returns the number of unique tokens.

counter.uniq_token_count #=> 13

Excluding tokens from the tokeniser

You can exclude anything you want from the input by passing the exclude option. The exclude option accepts a variety of filters and is extremely flexible.

  1. A space-delimited string. The filter will normalise the string.
  2. A regular expression.
  3. A lambda.
  4. A symbol that names a predicate method. For example :odd?.
  5. An array of any combination of the above.

tokeniser =
  WordsCounted::Tokeniser.new(
    "Magnificent! That was magnificent, Trevor."
  )

# Using a string
tokeniser.tokenise(exclude: "was magnificent")
# => ["that", "trevor"]

# Using a regular expression
tokeniser.tokenise(exclude: /trevor/)
# => ["magnificent", "that", "was", "magnificent"]

# Using a lambda
tokeniser.tokenise(exclude: ->(t) { t.length < 4 })
# => ["magnificent", "that", "magnificent", "trevor"]

# Using symbol
tokeniser = WordsCounted::Tokeniser.new("Hello! محمد")
tokeniser.tokenise(exclude: :ascii_only?)
# => ["محمد"]

# Using an array
tokeniser = WordsCounted::Tokeniser.new(
  "Hello! اسماءنا هي محمد، كارولينا، سامي، وداني"
)
tokeniser.tokenise(
  exclude: [:ascii_only?, /محمد/, ->(t) { t.length > 6}, "و"]
)
# => ["هي", "سامي", "وداني"]

Passing in a custom regexp

The default regexp accounts for letters, hyphenated tokens, and apostrophes. This means twenty-one is treated as one token. So is Mohamad's.

/[\p{Alpha}\-']+/

You can pass your own criteria as a Ruby regular expression to split your string as desired.

For example, if you wanted to include numbers, you can override the regular expression:

counter = WordsCounted.count("Numbers 1, 2, and 3", pattern: /[\p{Alnum}\-']+/)
counter.tokens
#=> ["numbers", "1", "2", "and", "3"]

Opening and reading files

Use the from_file method to open files. from_file accepts the same options as .count. The file path can be a URL.

counter = WordsCounted.from_file("url/or/path/to/file.text")

Gotchas

A hyphen used in lieu of an em or en dash will form part of the token. This affects the tokeniser algorithm.

counter = WordsCounted.count("How do you do?-you are well, I see.")
counter.token_frequency

[
  ["do",   2],
  ["how",  1],
  ["you",  1],
  ["-you", 1], # WTF, mate!
  ["are",  1],
  # ...
]

In this example -you and you are separate tokens. Also, the tokeniser does not include numbers by default. Remember that you can pass your own regular expression if the default behaviour does not fit your needs.
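One possible workaround, sketched here with a custom pattern that simply drops the hyphen from the default character class, is to tokenise on letters and apostrophes only, so that -you and you collapse into the same token (the order of tokens with equal counts may vary, since the sort is unstable):

counter = WordsCounted.count(
  "How do you do?-you are well, I see.",
  pattern: /[\p{Alpha}']+/
)
counter.token_frequency
#=> [["do", 2], ["you", 2], ["how", 1], ["are", 1], ["well", 1], ["i", 1], ["see", 1]]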

A note on case sensitivity

The program will normalise (downcase) all incoming strings for consistency and filters.
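For example, differently cased spellings of the same word count as one unique token:

WordsCounted.count("Hello hello").uniq_token_count #=> 1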

Roadmap

Ability to open URLs

def self.from_url
  # open url and send string here after removing html
end

Contributors

See contributors.

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Author: abitdodgy
Source code: https://github.com/abitdodgy/words_counted
License: MIT license

#ruby  #ruby-on-rails 

jennifer kate

Non-Fungible Token Development | NFT Token Development Company | ERC721 Token Development

Non-Fungible Tokens (NFTs) are undoubtedly the biggest buzz in the world right now. Crypto collectibles are gaining traction in industries like artwork, e-commerce, entertainment, gaming, social media, and sports. Today, digital collectibles have an enormous market capitalization of $18.15 billion, according to CoinMarketCap, and a daily trading volume of a whopping $2.02 billion. Hence, this is the right time for entrepreneurs to kickstart NFT development, as demand from investors is high.

What are the various services offered by a Non-Fungible Token Development Company?

Creation of Ethereum and TRON compatible assets - This includes the creation of ERC-721 and ERC-1155 standard assets on the Ethereum blockchain network. In addition, a skilled Non-Fungible Token Development company will establish TRC-721 based assets on the sturdy TRON blockchain network, with experienced developers creating TRC-721 crypto collectibles built on the TRC-165 interface. Importantly, TRC-721 is fully compatible with ERC-721.

Integration with different digital wallets - Investors can safeguard their Non-Fungible Tokens (NFTs) on multiple digital wallets. Their crypto collectibles are protected by passwords and private keys. A reputed Non-Fungible Token Development company will offer buyers access to Binance Chain Wallet, Coinbase Wallet, Dapper, Fortmatic, MetaMask, MyEtherWallet (MEW), Portis, Torus Wallet, Trust Wallet, WalletConnect, and WalletLink.

White-label clone NFT marketplace solutions - Cryptopreneurs can attract a huge number of institutional and retail investors by getting a White-label NFT marketplace solution. Experts in NFT development will build clone solutions of Axie Infinity, Bitski, Cryptograph, CryptoKitties, CryptoPunks, Decentraland, eBay, Gods Unchained, Meebits, NBA Top Shot, Momint, Nifty Gateway, OpenSea, Polka City, Rarible, Solible, Sorare, SuperRare, WazirX, and Xcad Network.

These white-label NFT marketplaces operate on multiple blockchain networks, including Binance Smart Chain (BSC), Cardano, EOS, Ethereum, Flow, Polkadot, Solana, Stellar, and TRON. Entrepreneurs benefit from decentralization, high scalability, immutability, and low transaction processing fees.

Wrapping Up

Unquestionably, this is an exciting time for entrepreneurs as new NFT marketplaces are getting launched across the world. Importantly, some crypto collectibles like artwork, source codes, and sports videos are selling for several millions of dollars. Hence, collaborate with a top-notch Non-Fungible Token Development company and make a big impact in the crypto world soon.

#nft development #non-fungible token development #nft token development

Divya Raj


15 Best NFT Marketplaces to Buy and Sell Non-Fungible Tokens in 2021

NFTs are digital items created on Ethereum blockchain technology. An NFT platform is a marketplace where NFTs are bought and sold, and a digital wallet is a necessity for every user of such a platform. These tokens are bought and sold with the help of cryptocurrencies.
There are many marketplaces to sell/buy NFTs, but the right type of marketplace depends entirely on the kind of NFT you want to buy or sell.
The best NFT marketplaces to buy and sell NFTs in 2021 are:

  1. OpenSea
  2. Rarible
  3. SuperRare
  4. Foundation
  5. Atomic Market
  6. Nifty Gateway
  7. NBA Top Shots…

Explore a lot more here - https://blog.digitalogy.co/best-nft-marketplaces-to-buy-and-sell-non-fungible-tokens/

#bestnftmarketplace #top non-fungible-token marketplaces #top nft marketplaces #best nft marketplace #best non-fungible-token marketplace #biggest nft marketplace

Non-Fungible Tokens: The Technology That Made a 10-sec Video Worth USD 6.6 Million

Do you know what Non-Fungible Tokens (NFTs) are? A Non-Fungible Token is a kind of crypto-asset based on the contemporary technology of blockchain. In other words, blockchain technology endows NFTs with the most secure trading available on digital platforms.

Every day, crypto technology evolves, and we observe a change in how things are bought and sold in the digital world. If Salvator Mundi by Leonardo da Vinci could fetch hundreds of millions of dollars, the digital sale of a 10-second video can also take you by storm.

A Miami-based art enthusiast, Pablo Rodriguez-Fraile, bought a 10-second video for USD 67,000. No other investment could have boosted his balance sheet with profit like this: he sold the same art for an astonishing USD 6.6 million! The incident set a benchmark for capitalizing on digital investments.

What is the video all about?
Known on the Internet as Beeple, Mike Winkelmann is the digital artist who made the video, which was approved and authenticated via blockchain, with digital signatures certifying that it is an original work.

The artist says, “You can go into the Louvre and take a picture of the Mona Lisa, and you can have it there, but it doesn’t have any value because you don’t have the provenance or the history or the work. Again, the reality here is that this is very, very valuable because of who is behind [it]. It’s a full career of a multi-generational, like a generational career in this space of being the best of the best.” Because the art was approved through blockchain, its genuineness could not be disputed by stakeholders; the hefty amount he earned corroborated this notion.

What are Non-Fungible Tokens (NFT)?
While other online digital trading methods are treacherous and insecure, NFTs are the most foolproof way to buy digitally crafted art and other media.

It is a kind of crypto-asset based on the contemporary technology of Blockchain. In other words, the technology of Blockchain endows NFTs with the most secure trading on digital platforms.

A couple of years back, the NFT market was estimated to be worth hardly USD 42 billion, a figure that skyrocketed with growth of over 700% by the end of the pandemic year 2020. The NFT marketplace is expected to break all records as entire business ecosystems turn virtual.

NFTs can include almost anything, from digital artwork to sports cards to plots of real estate. Nevertheless, NFTs caught everybody's eye when the NBA's Top Shot portal opened up to users to purchase and trade NFTs in the form of video clips from the games.

Objectives of NFT
The primary objective of NFTs is to protect ownership and preserve the value of goods. Hence, secure certification of ownership means a lot in today's market, where forging and duplicating products are rampant.

NFT technology makes ownership indisputable and hence adds worth to any product or good; moreover, the digital items it records are meticulously tracked and well organized.

Another objective of NFTs is to recognize artists' work and reward them with what they deserve. The makers of an artwork not only earn when it first sells, but also continue earning whenever the tokens are traded afterwards.

Owners can earn anywhere from 2.5% to 10% of the selling price. For instance, YouTube sensation Logan Paul sold a cartoon image of himself as an NFT worth USD 5 million.

Key features of NFT
NFTs have been used chiefly for collectibles. The most significant advantage of a digital collectible over a physical one is that every NFT carries distinguishing data, making it both different from any other NFT and easy to verify.

Furthermore, having a unique identity rules out forging and the circulation of duplicate goods. Even more, it is easy to trace forged goods back to their original issuer; no one can repudiate their first posting or release, because the evidence is concrete.

NFTs are different from general cryptocurrencies because no two NFTs are identical, even when they belong to the same platform or collection. Here are the key features of NFTs that make them the most advanced crypto technology.

Secure trading – because they are not interchangeable, collectibles are unique and cannot be forged or substituted for one another on the same platform.

Undivided – NFTs are inseparable and cannot be broken into denominations the way Bitcoin can be split into satoshis. Each behaves like one entity, one indivisible item.

Unbreakable – you cannot split an NFT into further NFTs or coins. Literally indestructible, the token cannot be removed or duplicated, and ownership remains intact and secure for years.

Authenticated – the ownership data remains secure and stored. Digital artworks can be traced back to their original creators even after decades. No third-party verification is required, because authenticity is already certified in the stored data.

Popularity and future
NFTs have revolutionized the gaming and collectible domains, which now see spending of more than USD 170 million. Gamers have become exclusive owners of items and assets within their games; moreover, in some cases, professional gamblers have legally monetized their casino games and virtual theme parks.

The attire, avatars, and other digital items they acquire in games can be used as currency or sold on. Artists now have global audiences, and their works are recognized worldwide with an ‘authentication stamp’. Royalties and direct selling are working wonderfully, which makes NFTs the most reliable form of digital trading.

Today, we have quite expensive NFTs, such as Dragon, which is valued at 600 ETH. F1 Delta Time’s one-of-a-kind “1-1-1” race car brought in an impressive 415.9 ETH, whereas NBA superstar LeBron James’ Top Shot digital card sold for a whopping hundred thousand dollars.

With immense popularity and a promising, foreseeable future, more and more stakeholders are investing in and buying digital currencies. And those who believe in quick yet secure success are leaning toward NFTs.

Continue reading : https://www.linkedin.com/pulse/non-fungible-tokens-technology-made-10-sec-video-worth-ashish-parmar

#fungible #non-fungible-tokens #technology #digital-currency
