Top Distributed Computing Tokens by Volume

In this article, you will see the top distributed computing tokens by trading volume as of September 27, 2022.

Internet Computer - ICP

Internet Computer (ICP) is a decentralized blockchain network officially launched in May 2021 by the DFINITY Foundation, with the global goal of pushing the boundaries of Internet functionality and supporting smart contract development at a larger scale. The project's advantages include a skilled team of crypto enthusiasts and support for smart contracts, allowing developers to create dApps. Moreover, ICP offers almost endless scalability and a high level of security maintained through a system of checks.

The comprehensive mission of Internet Computer is to turn the public Internet into a world-class computing platform. The developers position Internet Computer as a revolutionary network because the blockchain is publicly available and equipped with impressive functionality. In addition, the project provides a borderless environment for smart contracts that run at web speed with reduced computational costs.

With Internet Computer, developers can create dApps and deploy their code directly on the Internet. Furthermore, transactions on ICP can be completed in less than a second and at low cost. According to the creators, the project enables users to opt out of centralized services and commercial cloud providers.

Buy ICP
Link
Elrond - EGLD

Elrond is a blockchain protocol that seeks to offer extremely fast transaction speeds by using sharding. The project describes itself as a technology ecosystem for the new internet, which includes fintech, decentralized finance and the Internet of Things. Its smart contracts execution platform is reportedly capable of 15,000 transactions per second, six-second latency and a $0.001 transaction cost.

The blockchain has a native token known as eGold, or EGLD, that is used for paying network fees, staking and rewarding validators.

Buy EGLD
Link
Ravencoin - RVN

Ravencoin is a digital peer-to-peer (P2P) network that aims to implement a use case specific blockchain, designed to efficiently handle one specific function: the transfer of assets from one party to another.

Built on a fork of the Bitcoin code, Ravencoin was announced on Oct. 31, 2017 and released binaries for mining on Jan. 3, 2018 with what is called a fair launch: no premine, ICO or masternodes. It was named in reference to the TV show Game of Thrones.

Buy RVN
Link
Mina - MINA

Mina Protocol is a minimal “succinct blockchain” built to curtail computational requirements in order to run DApps more efficiently.

Mina has been described as the world’s lightest blockchain since its size is designed to remain constant despite growth in usage. Furthermore, it remains balanced in terms of security and decentralization. The project was rebranded from Coda Protocol to Mina in October 2020.
 

Buy MINA
Link
Ankr - ANKR

Ankr is a decentralized blockchain infrastructure provider that operates an array of nodes globally distributed across over 50 Proof-of-Stake networks. This infrastructure helps drive the growth of the crypto economy while powering a full suite of multi-chain tools for Web3 users:

Ankr Build: Ankr provides comprehensive blockchain developer solutions, including traditional APIs, a decentralized multi-chain network of public RPC nodes used to access blockchain data and execute code, and tools like Ankr Scan to view on-chain information across blockchains.

Ankr Earn: Ankr Earn makes staking, liquid staking, and other yield-earning opportunities easy and accessible to any crypto investor. Ankr builds a scalable and decentralized staking infrastructure solution that aims to solve the capital inefficiency of Proof-of-Stake networks and similar blockchain consensus mechanisms.

Ankr Learn: Through Ankr’s learning tools, tutorials, and docs, anyone can become a better user or developer of blockchain-based systems.

Buy ANKR
Link
Golem - GLM

The Golem Network is a decentralized computation network, a new way of distributing redundant computing power to those who are in need of it, on-demand. It creates a peer-to-peer network where users join on an equal basis to buy and sell computation, splitting up complicated tasks into smaller subtasks in the network. In Golem there’s no central authority and no user is more or less important than another.

GLM (formerly GNT, the Golem Network Token) is needed to pay for computations on the network and is the currency that drives the marketplace. As a Requestor, you set a bid for the amount of GLM you are willing to pay to have your task completed. As a Provider, you earn GLM by computing tasks for Requestors. You can set your minimum and maximum price thresholds in your settings.
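To illustrate the bid/threshold mechanic described above, here is a toy sketch (all names and numbers are hypothetical, not Golem's actual matching logic):

```ruby
# Toy sketch: a provider accepts a task when the requestor's bid falls
# within the provider's configured price thresholds. Illustrative only;
# this is not Golem's implementation.
Provider = Struct.new(:min_price, :max_price)

def accepts?(provider, bid)
  bid.between?(provider.min_price, provider.max_price)
end

provider = Provider.new(0.5, 2.0) # illustrative GLM price range
accepts?(provider, 1.0) # => true
accepts?(provider, 0.2) # => false
```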

Buy GLM
Link
Flux - FLUX

Flux is the cryptocurrency that powers the Flux ecosystem. It has a number of uses including purchasing resources, collateralizing nodes and fuelling transactions on FluxOS, as well as rewarding both miners and FluxNode operators for providing computational resources.

The Flux ecosystem is devoted to empowering everyone to develop, deploy and use the decentralized Internet of the future: Web3.

The Flux ecosystem consists of: a native, minable POW cryptocurrency ($FLUX), a powerful decentralized computational Flux Network (FluxNodes), a Linux based operating system (FluxOS), the premier digital asset platform (Zelcore) and, finally, the Flux blockchain for on-chain governance, economics and parallel assets to provide interoperability with other blockchains and DeFi access.

Buy FLUX
Link
Ontology - ONT

Ontology is a project designed to bring trust, privacy, and security to Web3 through decentralized identity and data solutions. It is building the infrastructure to provide trusted access to Web3, allowing individuals and enterprises to rest assured that through regulatory compliant digital identity solutions, users and their privacy come first.

The Ontology blockchain is a high speed, low cost public blockchain. It is designed to bring decentralized identity and data solutions to Web3, with the goal of increasing privacy, transparency, and trust. To achieve this, users and enterprises are provided with the flexibility to build blockchain-based solutions that suit their needs, while also ensuring regulatory compliance. Through Ontology’s Ethereum Virtual Machine (EVM), Ontology ensures frictionless compatibility with Ethereum, the first step in the creation of the Ontology Multi-Virtual Machine and further interoperability for the chain.

Buy ONT
Link
RenderToken - RNDR

RenderToken (RNDR) is a distributed GPU rendering network built on top of the Ethereum blockchain, aiming to connect artists and studios in need of GPU compute power with mining partners willing to rent out their GPU capabilities.
 
Buy RNDR
Link
Ontology Gas - ONG

Ontology is a high performance, open source blockchain specializing in digital identity and data.

Ontology's infrastructure supports robust cross-chain collaboration and Layer 2 scalability, offering businesses the flexibility to design a blockchain that suits their needs.

With a suite of decentralized identity and data sharing protocols to enhance speed, security, and trust, Ontology’s features include ONT ID, a mobile digital ID application and DID used throughout the ecosystem, and DDXF, a decentralized data exchange, and collaboration framework.

Ontology adopts a dual-token model, with both ONT and ONG as utility tokens. Ontology decouples ONT and ONG to alleviate the risk of turbulent fluctuations of the native “asset” value on the gas fee.

ONT is used as the staking tool and the time, cost of staking and operating costs of the nodes are considered to be inputs. ONG is used as a value-anchoring tool for on-chain applications and is used in the transactions on the chain.

Buy ONG
Link
iExec RLC - RLC

iExec is the leading provider of blockchain-based decentralized computing. Blockchain is utilized to organize a market network where people can monetize their computing power as well as applications and even datasets.

It does this by providing on-demand access to cloud computing resources. iExec can support applications in fields such as big data, healthcare, AI, rendering and fintech.

Buy RLC
Link
Cartesi - CTSI

Cartesi - The Blockchain OS is a decentralized Layer-2 infrastructure that supports Linux and mainstream software components. For the first time, developers can code scalable smart contracts with rich software tools, libraries, and the services they’re used to, bridging the gap between mainstream software and blockchain.

Cartesi is enabling millions of new startups and their developers to use The Blockchain OS and bring Linux applications on board. With a groundbreaking virtual machine, optimistic rollups, and side-chains, Cartesi paves the way for developers of all kinds to build the next generation of blockchain apps.

Buy CTSI
Link
QuarkChain - QKC

The QuarkChain Network is a permissionless blockchain architecture that aims to meet global commercial standards. It aims to provide a secure, decentralized, and scalable blockchain solution that delivers 100,000+ on-chain TPS.
 
Buy QKC
Link
aelf - ELF

aelf is an open-source blockchain network designed as a complete business solution. Its ‘one main-chain + multiple side-chains’ structure lets developers independently deploy or run DApps on individual side-chains to achieve resource isolation. aelf adopts parallel processing and the AEDPoS consensus mechanism. Based on the cross-chain technology of the main-chain index and verification mechanisms, aelf achieves secure communication between the main-chain and all side-chains, which in turn allows direct interoperability between side-chains.

aelf meets the governance needs of varying applications by providing different models, including a Parliament Governance Model, an Association Governance Model, and a Referendum Governance Model. Through the incentive model, the network is equipped with a self-sustainable system and can roll out self-development on a practical basis. Simultaneously, developers can debug, develop and deploy applications based on a mature IDE, provided by aelf.

aelf has launched aelf Enterprise, an enterprise-level integrated blockchain solution. aelf Enterprise is based on the requirements of different business scenarios. To meet the requirements of several industries including supply chain management, credit establishment, user incentives, and property protection, aelf Enterprise provides enterprise-level users with a flexible, but practical modularized blockchain solution. This promotes the hand-in-hand development of both Blockchain and other core economies.

Buy ELF
Link
CONUN - CON

CONUN describes itself as a platform for building a horizontally distributed desktop computing system that can share idle computing power to handle projects that require high-performance computing resources. The platform is reportedly able to leverage the idle computing power of personal computers and smartphones.
Buy CON
Link
ARPA Chain - ARPA

ARPA is a blockchain-based layer 2 solution for privacy-preserving computation, enabled by Multi-Party Computation (“MPC”). The goal of ARPA is to separate data utility from ownership and enable data renting. ARPA’s MPC protocol creates ways for multiple entities to collaboratively analyze data and extract data synergies while keeping each party’s data input private and secure.

Developers can build privacy-preserving dApps on blockchains compatible with ARPA. Some immediate use cases include credit anti-fraud, secure data wallet, precision marketing, joint AI model training, and key management systems. For example, banks using the ARPA network can share their credit blacklist with each other for risk management purposes without exposing their customer data or privacy.

Buy ARPA
Link
AIOZ Network - AIOZ

AIOZ Network is a distributed CDN built on its own blockchain. On AIOZ Network, users share redundant memory, storage and bandwidth resources to create a vast CDN capable of powering streaming platforms anywhere in the world. The project aims to change the way the world streams video.

To better understand this, imagine that you're watching a video on your phone. Today that video streams from a content delivery network (CDN). A CDN is a system of servers in various locations storing and delivering content to viewers and their devices - like a video you watch on your phone.

AIOZ Network creates a distributed content delivery network (dCDN) and represents a major shift in the way the world streams video. On a dCDN, a video comes from one of many Nodes - a regular person paid to store and deliver content from their device with the help of an app. The app harnesses the device's unused resources such as extra computing power, bandwidth, and storage.

Buy AIOZ
Link
Akash Network - AKT

Akash is the world's first decentralized cloud computing marketplace, and the DeCloud for DeFi.
Buy AKT
Link
Aleph.im - ALEPH

Aleph.im is an open-source cross-chain network featuring a decentralized database, including file storage, computing, and a decentralized identity (DID) framework.

Aleph.im’s core mission is to help decentralized apps and protocols strip off the centralized parts of their stack, achieving a fully decentralized architecture. You can think of aleph.im as a decentralized AWS or Firebase. Aleph.im is focused on supercharging the DeFi ecosystem.

Buy ALEPH
Link
Phantasma - SOUL

Phantasma is a fully interoperable, decentralized, feature-rich blockchain.

With its innovative staking mechanism, dual-token system, sustainable tokenomics model and advanced eco-friendly smartNFTs, the chain is designed to be used for digital goods and services for communications, entertainment, marketplaces and on-chain storage solutions for dApp creators and enterprise clients.

It has proven to be the platform of choice for Gaming and NFTs. Not only because of its low minting and transaction fees but also because of its smartNFT technology that has been built on chain level and its pledge to become the first certified carbon negative blockchain.

Buy SOUL
Link
DxChain Token - DX

DxChain is a blockchain network designed to facilitate big data processing and machine learning. Beta-launched in 2018, DxChain’s main goal is to allow its users to safely exchange big data sets and potentially benefit from improved analytics based on this data.

DX token generation is based on the value and quality of data users submit through the platform.

With decentralized data storage, DxChain offers users a secure environment to perform machine learning experiments and tests. Blockchain technology also provides increased computational power for machine learning and big data processing.

Buy DX
Link
PlatON - LAT

PlatON, initiated and driven by the LatticeX Foundation, is a next-generation Internet infrastructure protocol based on the fundamental properties of blockchain and supported by the privacy-preserving computation network. “Computing interoperability” is its core feature.

By building a computing system assembled by Verifiable Computation, Secure Multi-Party Computation, Zero-Knowledge Proof, Homomorphic Encryption and other cryptographic algorithms and blockchain technology, PlatON provides a public infrastructure in open source architecture for global artificial intelligence, distributed application developers, data providers and various organizations, communities and individuals with computing needs.

Buy LAT
Link
CUDOS - CUDOS

CUDOS powers a decentralised compute network that will interoperate with multiple blockchain ecosystems to provide the following benefits:

  • A trusted layer 1 validator network built on the Tendermint protocol.
  • Wasm compatibility, so smart contracts can be deployed on CUDOS using next-generation languages (e.g. Golang, Rust, Java) as long as they compile to WebAssembly.
  • Cross-chain (horizontal) interoperability thanks to the network’s Inter-Blockchain Communication (IBC) integration, allowing Cudos Network smart contracts to interface with multiple networks.
  • 10x lower transaction and gas costs compared to those on PoW networks.
  • A massively scalable network to facilitate more sophisticated smart contract operations, with higher performance of anywhere between 200 and 500 peak TPS.
  • Access to a globally distributed layer 3 network of secure cloud and compute resources.
  • Turing-complete solutions for non-Turing-complete layer 1 blockchain networks.

Buy CUDOS
Link
Deeper Network - DPR

Deeper Network is based in Silicon Valley (Santa Clara, CA). Its technology combines blockchain, network security, and the sharing economy to create a global peer-to-peer network that empowers real users of the internet and paves the way for the next generation of the web.

Deeper Network is built on Polkadot and has been elected by the Parity committee to participate in the Substrate Builder's Program for its visionary concept of bridging the gap to Web 3.0. 

Buy DPR
Link
SONM (BEP-20) - SNM

Sonm provides cloud services based on distributed customer-level hardware, including PCs, mining equipment, and servers. Users can either rent out their own hardware or use someone else’s computing power for their needs.

The SNM token is the internal currency of the Sonm computing power marketplace. With SNM, users can access the resources provided by Sonm.

Buy SNM
Link
ArcBlock - ABT

ArcBlock is a platform for building and deploying decentralized blockchain applications. It bills itself as a complete blockchain 3.0 product platform to build, deploy and manage Apps easily.

The ABT ERC-20 token functions as payment in the Arcblock ecosystem.

Buy ABT
Link
Zenon - ZNN

Zenon was launched as a PoS/MS hybrid cryptocurrency in March 2019 and proposes a sharding-based decentralized architecture called the Network of Momentum (NoM), which aims to build upon existing blockchain and DAG architectures.

Due to sharding, the network will reportedly be able to scale linearly as the number of nodes grows. The protocol differs from traditional blockchain consensus as transactions are not treated in batches, but asynchronously processed within shards, with the overall state of the network verified and validated at the end of each epoch (each epoch having a random timeframe). The network will also feature a Turing complete scripting language that will allow developers to build and run zApps, create digital assets, and allow low-resource devices to participate in the network.

Buy ZNN
Link
Cellframe - CELL

Cellframe Network is a scalable, open-source, next-generation platform for building and bridging blockchains and services secured by post-quantum encryption.

Cellframe offers an environment for enterprises and developers to build a vast array of products ranging from simple low-level t-dApps to whole other blockchains on top of Cellframe Network.

Cellframe can provide extremely high transaction throughput based on an original sharding implementation. In addition, post-quantum cryptography makes the system resistant to hacking by quantum computers.

Cellframe Network is built with a unique implementation of dual-layer sharding, conditional transactions and multiparty computations. This allows seamless interoperability, as well as fast and economical transactions, all secured by integrated quantum safety measures.

Buy CELL
Link
GamerCoin - GHX

GamerCoin is the first licensed token for gamers and the native utility token of the GamerHash platform. GamerHash’s mission is to provide gamers with a simple and free tool to reap the rewards of blockchain mining and put their idle computing power to use without them having to do much.

Its solution is software that utilizes the computer’s idle computation power and mines cryptocurrencies in the background. The algorithm matches the most optimal cryptocurrency to mine to the user’s hardware configuration and converts the altcoins automatically into bitcoin after mining them. Funds are transferred once a day to users.

GamerHash claims to have built one of the biggest self-financing supercomputers on the planet based on gaming computers voluntarily contributed by players. 

Buy GHX
Link
ProximaX - XPX

ProximaX is an enterprise-grade infrastructure and development platform that integrates blockchain technology with distributed and decentralized service layers: storage, streaming, database, and Supercontract (enhanced smart contracts). Built on proven technologies, it is an all-in-one, easy-to-use platform which can be extended with more service layers without compromising performance. The ProximaX platform is available in private, public, and hybrid network configurations.
Buy XPX
Link
MultiVAC - MTV

MultiVAC is an innovative sharding solution and pioneering flexible computing framework.

As a high-throughput, flexible public blockchain platform, MultiVAC proposes an all-dimensional sharding solution to increase the TPS of the blockchain. It is also the first project to propose a flexible computing framework, letting developers trade off freely within the blockchain trilemma.

Everyone can participate in operating a node without competition; nodes can be run on a laptop or any ordinary PC.

Buy MTV
Link
Snetwork - SNET

Snetwork (SNET), the Distributed Shared Cloud Computing Network, is a cryptocurrency that operates on the Ethereum platform.
 
Buy SNET
Link

Source image: Coinmarketcap


Thanks for visiting and reading this article! Please share if you liked it!

#blockchain #bitcoin #cryptocurrency #coin #token 


Words Counted: A Ruby Natural Language Processor.

WordsCounted

We are all in the gutter, but some of us are looking at the stars.

-- Oscar Wilde

WordsCounted is a Ruby NLP (natural language processor). WordsCounted lets you implement powerful tokenisation strategies with a very flexible tokeniser class.

Are you using WordsCounted to do something interesting? Please tell me about it.

 

Demo

Visit this website for one example of what you can do with WordsCounted.

Features

  • Out of the box, get the following data from any string or readable file, or URL:
    • Token count and unique token count
    • Token densities, frequencies, and lengths
    • Char count and average chars per token
    • The longest tokens and their lengths
    • The most frequent tokens and their frequencies.
  • A flexible way to exclude tokens from the tokeniser. You can pass a string, regexp, symbol, lambda, or an array of any combination of those types for powerful tokenisation strategies.
  • Pass your own regexp rules to the tokeniser if you prefer. The default regexp filters special characters but keeps hyphens and apostrophes. It also plays nicely with diacritics (UTF and unicode characters): Bayrūt is treated as ["Bayrūt"] and not ["Bayr", "ū", "t"], for example.
  • Opens and reads files. Pass in a file path or a url instead of a string.

Installation

Add this line to your application's Gemfile:

gem 'words_counted'

And then execute:

$ bundle

Or install it yourself as:

$ gem install words_counted

Usage

Pass in a string or a file path, and an optional filter and/or regexp.

counter = WordsCounted.count(
  "We are all in the gutter, but some of us are looking at the stars."
)

# Using a file
counter = WordsCounted.from_file("path/or/url/to/my/file.txt")

.count and .from_file are convenience methods that take an input, tokenise it, and return an instance of WordsCounted::Counter initialized with the tokens. The WordsCounted::Tokeniser and WordsCounted::Counter classes can be used alone, however.

API

WordsCounted

WordsCounted.count(input, options = {})

Tokenises input and initializes a WordsCounted::Counter object with the resulting tokens.

counter = WordsCounted.count("Hello Beirut!")

Accepts two options: exclude and regexp. See Excluding tokens from the analyser and Passing in a custom regexp respectively.

WordsCounted.from_file(path, options = {})

Reads and tokenises a file, and initializes a WordsCounted::Counter object with the resulting tokens.

counter = WordsCounted.from_file("hello_beirut.txt")

Accepts the same options as .count.

Tokeniser

The tokeniser allows you to tokenise text in a variety of ways. You can pass in your own rules for tokenisation, and apply a powerful filter with any combination of rules as long as they can boil down into a lambda.

Out of the box the tokeniser includes only alpha chars. Hyphenated tokens and tokens with apostrophes are considered a single token.

#tokenise([pattern: TOKEN_REGEXP, exclude: nil])

tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise

# With `exclude`
tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise(exclude: "hello")

# With `pattern`
tokeniser = WordsCounted::Tokeniser.new("I <3 Beirut!").tokenise(pattern: /[a-z]/i)

See Excluding tokens from the analyser and Passing in a custom regexp for more information.

Counter

The WordsCounted::Counter class allows you to collect various statistics from an array of tokens.

#token_count

Returns the token count of a given string.

counter.token_count #=> 15

#token_frequency

Returns a sorted (unstable) two-dimensional array where each element is a token and its frequency. The array is sorted by frequency in descending order.

counter.token_frequency

[
  ["the", 2],
  ["are", 2],
  ["we",  1],
  # ...
  ["all", 1]
]

#most_frequent_tokens

Returns a hash where each key-value pair is a token and its frequency.

counter.most_frequent_tokens

{ "are" => 2, "the" => 2 }

#token_lengths

Returns a sorted (unstable) two-dimensional array where each element contains a token and its length. The array is sorted by length in descending order.

counter.token_lengths

[
  ["looking", 7],
  ["gutter",  6],
  ["stars",   5],
  # ...
  ["in",      2]
]

#longest_tokens

Returns a hash where each key-value pair is a token and its length.

counter.longest_tokens

{ "looking" => 7 }

#token_density([ precision: 2 ])

Returns a sorted (unstable) two-dimensional array where each element contains a token and its density as a float, rounded to a precision of two. The array is sorted by density in descending order. It accepts a precision argument.

counter.token_density

[
  ["are",     0.13],
  ["the",     0.13],
  ["but",     0.07 ],
  # ...
  ["we",      0.07 ]
]
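For intuition, the numbers above can be reproduced with a short self-contained sketch (a reimplementation of the apparent formula, not the gem's source): each token's frequency divided by the total token count, rounded.

```ruby
# Self-contained sketch: density = token frequency / total token count,
# rounded to two decimal places. Mirrors the output shown above; this
# is not the gem's actual implementation.
tokens = %w[we are all in the gutter but some of us are looking at the stars]
total  = tokens.length.to_f

density = tokens.tally
                .map { |token, count| [token, (count / total).round(2)] }
                .sort_by { |_, d| -d }
```

Here "are" and "the" each occur twice among 15 tokens, giving 2/15 ≈ 0.13, while single-occurrence tokens yield 1/15 ≈ 0.07, matching the array above.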

#char_count

Returns the char count of tokens.

counter.char_count #=> 76

#average_chars_per_token([ precision: 2 ])

Returns the average char count per token, rounded to two decimal places. Accepts a precision argument which defaults to two.

counter.average_chars_per_token #=> 4

#uniq_token_count

Returns the number of unique tokens.

counter.uniq_token_count #=> 13

Excluding tokens from the tokeniser

You can exclude anything you want from the input by passing the exclude option. The exclude option accepts a variety of filters and is extremely flexible.

  1. A space-delimited string. The filter will normalise the string.
  2. A regular expression.
  3. A lambda.
  4. A symbol that names a predicate method. For example :odd?.
  5. An array of any combination of the above.
tokeniser =
  WordsCounted::Tokeniser.new(
    "Magnificent! That was magnificent, Trevor."
  )

# Using a string
tokeniser.tokenise(exclude: "was magnificent")
# => ["that", "trevor"]

# Using a regular expression
tokeniser.tokenise(exclude: /trevor/)
# => ["magnificent", "that", "was", "magnificent"]

# Using a lambda
tokeniser.tokenise(exclude: ->(t) { t.length < 4 })
# => ["magnificent", "that", "magnificent", "trevor"]

# Using symbol
tokeniser = WordsCounted::Tokeniser.new("Hello! محمد")
tokeniser.tokenise(exclude: :ascii_only?)
# => ["محمد"]

# Using an array
tokeniser = WordsCounted::Tokeniser.new(
  "Hello! اسماءنا هي محمد، كارولينا، سامي، وداني"
)
tokeniser.tokenise(
  exclude: [:ascii_only?, /محمد/, ->(t) { t.length > 6}, "و"]
)
# => ["هي", "سامي", "وداني"]

Passing in a custom regexp

The default regexp accounts for letters, hyphenated tokens, and apostrophes. This means twenty-one is treated as one token. So is Mohamad's.

/[\p{Alpha}\-']+/

You can pass your own criteria as a Ruby regular expression to split your string as desired.

For example, if you wanted to include numbers, you can override the regular expression:

counter = WordsCounted.count("Numbers 1, 2, and 3", pattern: /[\p{Alnum}\-']+/)
counter.tokens
#=> ["numbers", "1", "2", "and", "3"]

Opening and reading files

Use the from_file method to open files. from_file accepts the same options as .count. The file path can be a URL.

counter = WordsCounted.from_file("url/or/path/to/file.text")

Gotchas

A hyphen used in lieu of an em or en dash will form part of the token. This affects the tokeniser algorithm.

counter = WordsCounted.count("How do you do?-you are well, I see.")
counter.token_frequency

[
  ["do",   2],
  ["how",  1],
  ["you",  1],
  ["-you", 1], # WTF, mate!
  ["are",  1],
  # ...
]

In this example -you and you are separate tokens. Also, the tokeniser does not include numbers by default. Remember that you can pass your own regular expression if the default behaviour does not fit your needs.

A note on case sensitivity

The program will normalise (downcase) all incoming strings for consistency and filters.
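That normalisation, combined with the default regexp shown earlier, can be sketched in a few self-contained lines (a hypothetical reimplementation for illustration, not the gem's code):

```ruby
# Hypothetical reimplementation of the default pipeline: scan with the
# default token regexp, then downcase every token for consistency.
TOKEN_REGEXP = /[\p{Alpha}\-']+/

def tokenise(input)
  input.scan(TOKEN_REGEXP).map(&:downcase)
end

tokenise("Hello HELLO, Bayrūt!") # => ["hello", "hello", "bayrūt"]
```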

Roadmap

Ability to open URLs

def self.from_url
  # open url and send string here after removing html
end

Contributors

See contributors.

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Author: abitdodgy
Source code: https://github.com/abitdodgy/words_counted
License: MIT license

#ruby  #ruby-on-rails 

Royce  Reinger

Royce Reinger

1658068560

WordsCounted: A Ruby Natural Language Processor

WordsCounted

We are all in the gutter, but some of us are looking at the stars.

-- Oscar Wilde

WordsCounted is a Ruby NLP (natural language processor). WordsCounted lets you implement powerful tokensation strategies with a very flexible tokeniser class.

Features

  • Out of the box, get the following data from any string or readable file, or URL:
    • Token count and unique token count
    • Token densities, frequencies, and lengths
    • Char count and average chars per token
    • The longest tokens and their lengths
    • The most frequent tokens and their frequencies.
  • A flexible way to exclude tokens from the tokeniser. You can pass a string, regexp, symbol, lambda, or an array of any combination of those types for powerful tokenisation strategies.
  • Pass your own regexp rules to the tokeniser if you prefer. The default regexp filters special characters but keeps hyphens and apostrophes. It also plays nicely with diacritics (UTF and unicode characters): Bayrūt is treated as ["Bayrūt"] and not ["Bayr", "ū", "t"], for example.
  • Opens and reads files. Pass in a file path or a url instead of a string.

Installation

Add this line to your application's Gemfile:

gem 'words_counted'

And then execute:

$ bundle

Or install it yourself as:

$ gem install words_counted

Usage

Pass in a string or a file path, and an optional filter and/or regexp.

counter = WordsCounted.count(
  "We are all in the gutter, but some of us are looking at the stars."
)

# Using a file
counter = WordsCounted.from_file("path/or/url/to/my/file.txt")

.count and .from_file are convenience methods that take an input, tokenise it, and return an instance of WordsCounted::Counter initialized with the tokens. The WordsCounted::Tokeniser and WordsCounted::Counter classes can be used alone, however.

API

WordsCounted

WordsCounted.count(input, options = {})

Tokenises input and initializes a WordsCounted::Counter object with the resulting tokens.

counter = WordsCounted.count("Hello Beirut!")

Accepts two options: exclude and regexp. See Excluding tokens from the tokeniser and Passing in a custom regexp respectively.

WordsCounted.from_file(path, options = {})

Reads and tokenises a file, and initializes a WordsCounted::Counter object with the resulting tokens.

counter = WordsCounted.from_file("hello_beirut.txt")

Accepts the same options as .count.

Tokeniser

The tokeniser allows you to tokenise text in a variety of ways. You can pass in your own rules for tokenisation and apply a powerful filter with any combination of rules, as long as they can be boiled down to a lambda.

Out of the box the tokeniser includes only alpha chars. Hyphenated tokens and tokens with apostrophes are considered a single token.

#tokenise([pattern: TOKEN_REGEXP, exclude: nil])

tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise

# With `exclude`
tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise(exclude: "hello")

# With `pattern`
tokeniser = WordsCounted::Tokeniser.new("I <3 Beirut!").tokenise(pattern: /[a-z]/i)

See Excluding tokens from the tokeniser and Passing in a custom regexp for more information.

Counter

The WordsCounted::Counter class allows you to collect various statistics from an array of tokens.

#token_count

Returns the token count of a given string.

counter.token_count #=> 15

#token_frequency

Returns a sorted (unstable) two-dimensional array where each element is a token and its frequency. The array is sorted by frequency in descending order.

counter.token_frequency

[
  ["the", 2],
  ["are", 2],
  ["we",  1],
  # ...
  ["all", 1]
]
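A frequency table like this can be sketched in plain Ruby with Enumerable#tally (illustrative only; the gem wraps this up for you):

```ruby
tokens = %w[we are all in the gutter but some of us are looking at the stars]

# Count occurrences, then sort by descending count. Ties may come out
# in any order, which is why the sort is described as unstable.
token_frequency = tokens.tally.sort_by { |_token, count| -count }

p token_frequency.first(2)
```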

#most_frequent_tokens

Returns a hash where each key-value pair is a token and its frequency.

counter.most_frequent_tokens

{ "are" => 2, "the" => 2 }

#token_lengths

Returns a sorted (unstable) two-dimensional array where each element contains a token and its length. The array is sorted by length in descending order.

counter.token_lengths

[
  ["looking", 7],
  ["gutter",  6],
  ["stars",   5],
  # ...
  ["in",      2]
]

#longest_tokens

Returns a hash where each key-value pair is a token and its length.

counter.longest_tokens

{ "looking" => 7 }

#token_density([ precision: 2 ])

Returns a sorted (unstable) two-dimensional array where each element contains a token and its density as a float, rounded to a precision of two. The array is sorted by density in descending order. It accepts a precision argument, which must be a float.

counter.token_density

[
  ["are",     0.13],
  ["the",     0.13],
  ["but",     0.07],
  # ...
  ["we",      0.07]
]
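Density here is simply a token's count divided by the total token count. A plain-Ruby sketch reproducing the same numbers:

```ruby
tokens = %w[we are all in the gutter but some of us are looking at the stars]
total  = tokens.length.to_f  # 15 tokens

# Each token's share of the total, rounded to two decimal places.
token_density = tokens.tally
                      .map { |token, count| [token, (count / total).round(2)] }
                      .sort_by { |_token, density| -density }

p token_density.assoc("are")  # => ["are", 0.13]
p token_density.assoc("but")  # => ["but", 0.07]
```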

#char_count

Returns the char count of tokens.

counter.char_count #=> 76

#average_chars_per_token([ precision: 2 ])

Returns the average char count per token rounded to two decimal places. Accepts a precision argument which defaults to two. Precision must be a float.

counter.average_chars_per_token #=> 4

#uniq_token_count

Returns the number of unique tokens.

counter.uniq_token_count #=> 13

Excluding tokens from the tokeniser

You can exclude anything you want from the input by passing the exclude option. The exclude option accepts a variety of filters and is extremely flexible.

  1. A space-delimited string. The filter will normalise the string.
  2. A regular expression.
  3. A lambda.
  4. A symbol that names a predicate method. For example :odd?.
  5. An array of any combination of the above.
tokeniser =
  WordsCounted::Tokeniser.new(
    "Magnificent! That was magnificent, Trevor."
  )

# Using a string
tokeniser.tokenise(exclude: "was magnificent")
# => ["that", "trevor"]

# Using a regular expression
tokeniser.tokenise(exclude: /trevor/)
# => ["magnificent", "that", "was", "magnificent"]

# Using a lambda
tokeniser.tokenise(exclude: ->(t) { t.length < 4 })
# => ["magnificent", "that", "magnificent", "trevor"]

# Using symbol
tokeniser = WordsCounted::Tokeniser.new("Hello! محمد")
tokeniser.tokenise(exclude: :ascii_only?)
# => ["محمد"]

# Using an array
tokeniser = WordsCounted::Tokeniser.new(
  "Hello! اسماءنا هي محمد، كارولينا، سامي، وداني"
)
tokeniser.tokenise(
  exclude: [:ascii_only?, /محمد/, ->(t) { t.length > 6}, "و"]
)
# => ["هي", "سامي", "وداني"]
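The "boils down to a lambda" idea behind these filters can be sketched in plain Ruby. The helper name below is illustrative, not the gem's internals:

```ruby
# Convert any supported filter type into a lambda; an array combines
# its members with a logical OR.
def to_filter(rule)
  case rule
  when String then ->(t) { rule.downcase.split.include?(t) }
  when Regexp then ->(t) { t.match?(rule) }
  when Symbol then ->(t) { t.public_send(rule) }
  when Proc   then rule
  when Array  then ->(t) { rule.any? { |r| to_filter(r).call(t) } }
  end
end

tokens = %w[magnificent that was magnificent trevor]
filter = to_filter(["was", ->(t) { t.length < 4 }])

p tokens.reject { |t| filter.call(t) }
# => ["magnificent", "that", "magnificent", "trevor"]
```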

Passing in a custom regexp

The default regexp accounts for letters, hyphenated tokens, and apostrophes. This means twenty-one is treated as one token. So is Mohamad's.

/[\p{Alpha}\-']+/

You can pass your own criteria as a Ruby regular expression to split your string as desired.

For example, if you wanted to include numbers, you can override the regular expression:

counter = WordsCounted.count("Numbers 1, 2, and 3", pattern: /[\p{Alnum}\-']+/)
counter.tokens
#=> ["numbers", "1", "2", "and", "3"]

Opening and reading files

Use the from_file method to open files. from_file accepts the same options as .count. The file path can be a URL.

counter = WordsCounted.from_file("url/or/path/to/file.text")

Gotchas

A hyphen used in lieu of an em or en dash will form part of the token, because the default pattern treats hyphens as token characters.

counter = WordsCounted.count("How do you do?-you are well, I see.")
counter.token_frequency

[
  ["do",   2],
  ["how",  1],
  ["you",  1],
  ["-you", 1], # WTF, mate!
  ["are",  1],
  # ...
]

In this example -you and you are separate tokens. Also, the tokeniser does not include numbers by default. Remember that you can pass your own regular expression if the default behaviour does not fit your needs.
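If your input uses hyphens in place of dashes, one possible workaround is to pre-process the string before counting. The regular expression below is an illustration, not part of the gem:

```ruby
text = "How do you do?-you are well, I see."

# Turn a hyphen that directly follows punctuation into a space, so
# "?-you" tokenises as "you" rather than "-you".
cleaned = text.gsub(/(?<=[[:punct:]])-/, " ")

tokens = cleaned.scan(/[\p{Alpha}\-']+/).map(&:downcase)
p tokens.tally["you"]  # => 2
```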

A note on case sensitivity

The program normalises (downcases) all incoming strings for consistency and to make filtering predictable.

Roadmap

Ability to open URLs

def self.from_url
  # open url and send string here after removing html
end
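A hedged sketch of how that roadmap item might strip markup before counting; a real implementation would use a proper HTML parser (e.g. Nokogiri) rather than regular expressions:

```ruby
# Naive tag stripper for a from_url sketch. Regex-based HTML handling
# is fragile; this is only a placeholder illustration.
def strip_html(html)
  html.gsub(%r{<script.*?</script>}mi, " ")
      .gsub(/<[^>]+>/, " ")
      .squeeze(" ")
      .strip
end

p strip_html("<p>We are <b>all</b> in the gutter.</p>")
# => "We are all in the gutter."
```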

Are you using WordsCounted to do something interesting? Please tell me about it.


RubyDoc documentation.

Demo

Visit this website for one example of what you can do with WordsCounted.


Contributors

See contributors.

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Author: Abitdodgy
Source Code: https://github.com/abitdodgy/words_counted 
License: MIT license

#ruby #nlp 

Lokesh Kumar

Top 10 Trending Technologies Must Learn in 2021 | igmGuru

Technology has made the world far more productive: many tasks are now handled by automated technical processes, so you no longer have to do them by hand. This article covers some important new technologies on the market, organised by career preference. Let's look at the top trending technologies of 2021 and the impression they will make on the world in the coming years.

  1. Data Science
    First on the list of newest technologies is, surprisingly, Data Science. Data Science is the discipline that helps make sense of complicated data. Companies produce very large amounts of data every day, including sales data, customer profile information, server data, business data, and financial records. Almost all of this big data is unstructured. The role of a data scientist is to convert these unstructured datasets into structured ones. The structured data is then examined to identify trends and patterns, which help in understanding the company's business performance and customer retention, and how they can be improved.

  2. DevOps
    Next is DevOps, a technology that combines two disciplines: development (Dev) and operations (Ops). Its processes and tooling let teams deliver value to their customers continuously. DevOps spans IT operations, development, security, quality, and engineering, keeping them synchronised and cooperating to build better, more dependable products. By embracing a DevOps culture with modern tools and techniques, a company gains the capacity to respond better to consumer requirements, build confidence in the applications it constructs, and accomplish business goals faster. This earns DevOps its place in the top 10 trending technologies.

  3. Machine learning
    Next is machine learning, which is being adopted across all categories of companies and industries, generating high demand for skilled professionals. The machine learning market is expected to grow to $8.81 billion by 2022. Machine learning is primarily used for data mining, data analytics, and pattern recognition, and it has earned a solid reputation in today's industry. This makes machine learning one of the top 10 trending technologies. Get the best machine learning course and make yourself future-ready.

To learn more, see Top 10 Trending Technologies in 2021.

You may also read more blogs mentioned below

How to Become a Salesforce Developer

Python VS R Programming

The Scope of Hadoop and Big Data in 2021

#top trending technologies #top 10 trending technologies #top 10 trending technologies in 2021 #top trending technologies in 2021 #top 5 trending technologies in 2021 #top 5 trending technologies

Ruth Nabimanya

#NoBrainers: You Need A High Performing Low Latency Distributed Database | Hacker Noon

…but which industries benefit the most from it?

There are certain industries that greatly benefit from high-performing, low-latency, geo-distributed technologies, while other organizations might be more focused on vertically scaling architectures.

This is dependent on numerous factors including the data pipeline, network, data structure, type of product or solution, short and long term goals, etc.

While there are currently many databases and tools that provide vertical scaling capabilities, there are not many that focus on horizontal scaling – but there’s still a need for both.

#database #distributed-systems #performance #distributed-computing #data-management #scaling #edge-computing #cloud-computing

aaron silva

SafeMoon Clone | Create A DeFi Token Like SafeMoon | DeFi token like SafeMoon

SafeMoon is a decentralized finance (DeFi) token that combines RFI tokenomics with an auto-liquidity-generating protocol. A DeFi token like SafeMoon has reached mainstream standards on the Binance Smart Chain. Its success and popularity have been immense, prompting many business firms to adopt this style of cryptocurrency as an alternative.

A DeFi token like SafeMoon works much like other crypto tokens, the only difference being that it charges a 10% fee on every sale, of which 5% is redistributed to the remaining SafeMoon holders. This feature rewards owners for holding onto their tokens.

Read More @ https://bit.ly/3oFbJoJ

#create a defi token like safemoon #defi token like safemoon #safemoon token #safemoon token clone #defi token