1615885699
SubQuery’s aim is to help Polkadot/Substrate projects build better dApps by allowing anyone to reliably find and consume data faster. Our service will allow users to extract, transform, persist, and query data initially, as well as connect and present data in the future. Our aim is to make this a core piece of infrastructure for the Substrate/Polkadot ecosystem, just as The Graph has become for Ethereum.
SubQuery is here to help you transform and query the world’s data for a web3.0 future.
For the Web 3.0 dream to be realised, it’s got to be as fast (if not faster) than centralised networks for the end user.
That’s why we’re incredibly proud to announce SubQuery, an open source project that allows users to run an indexer across their chain to build a dataset that can be queried with GraphQL. This suite of tools includes a command line interface to allow projects to generate their own SubQuery project, defining how the indexer should traverse and aggregate their own network. There’s a SubQuery node package that indexes the network and supports GraphQL queries. With the help of these tools, anyone can create and run queries easily.
The Web3 Foundation Open Grants Program Deliverable
The Web3 Foundation Open Grants Program has enabled us to build SubQuery, an open source project that allows users to run an indexer across their chain to build a dataset that can be queried with GraphQL.
This suite of tools includes @subql/cli, to allow projects to generate their own SubQuery project, defining how the indexer should traverse and aggregate their own network. As part of our proposal, we have provided a basic tutorial that shows users how to use the cli to index their network that you can follow here. We’ve even provided more detailed developer documentation for more advanced usages.
Secondly, there’s a SubQuery node package that loads the defined SubQuery project created by the CLI and then indexes the network to a Postgres database. Using Hasura, you can run GraphQL queries right away over indexed tables. With the help of these tools, and the community support material that we’re always improving, anyone can create and run queries easily.
You can get started right away by following our example on the SubQuery Github repository: https://github.com/OnFinality-io/subql. Additionally, you can find out more by reading our SubQuery docs: https://doc.subquery.network/ or visiting our new website
We’re incredibly grateful for the support provided by Web3 Foundation to help us carry out this project for the community. Web3 Foundation funds research and development teams building the technology stack of the decentralized web. It was established in Zug, Switzerland by Ethereum co-founder and former chief technology officer Dr. Gavin Wood. Polkadot is the Foundation’s flagship project.
We saw Polkadot’s potential early and right from the start it felt natural to focus our efforts there. The core premise of Polkadot is to create a thriving community of developers, users, and businesses that will tap into its multichain interoperability — that community is going to need a service that allows them to reliably find and consume data quickly.
Polkadot’s unique architecture means that we can focus on one network and then be able to support multiple current and future chains with ease. Even though Polkadot is still under development, we will be there ready to help the next generation of blockchain developers create the next big dApp.
Since announcing SubQuery to the world just a month ago, the response and feedback we’ve received from the Polkadot community has been overwhelmingly positive. We’ve gained thousands of followers on our channels, and there have been over 1,353 installs from NPM. It’s inspiring to receive all the messages of support, and to see the engagement from the community. It’s time for us to give something back!
We’ve been working overtime over the last month to release the next major stage of our roadmap for SubQuery. Today we’re announcing the release of the SubQuery Explorer.
SubQuery Explorer is an online hosted service that provides access to published SubQuery projects made by contributors around the world and managed by the SubQuery team. It furthers our mission to support Polkadot developers by providing infrastructure services by making accessing Polkadot network data even easier.
Today, anyone can query and extract Polkadot network data in only minutes and at no cost.
The SubQuery explorer makes getting started easy. We’ve prebuilt SubQuery projects for two use cases (more about these below) and have indexed each network. We’re hosting these SubQuery nodes online and allow anyone to query each for free. These managed nodes will be monitored and run by the SubQuery team at a performance level that will allow production apps to use and rely on them.
You’ll also note that the SubQuery Explorer provides a playground for discovering available data with example queries. You can play around with each SubQuery Graph using this explorer without implementing anything in code. Additionally, we’ve made some small improvements to our documentation to better support developers on their journey to better query and analyse the world’s Polkadot data.
You can quickly find total staking revenue awarded to any account since the beginning of time by querying their account address. This subquery project indexes and records the accounts participating in the staking bond on the blockchain. The continued indexing will find out obtained staking reward and slashes for this account and aggregates their sums to a database.
You can quickly see the minimum staking amount required for a validator to be elected. This project is an excellent example of implementing query states in the mapping function. It first finds the active staking Era through a state query and records the validators of this session staking amount by each. It then calculates the minimum staking amount and the total amount staked in this Era. Lastly, it records the maximum number of nominators that can be rewarded.
SubQuery is a data aggregation layer that will operate between the layer-1 blockchain (Acala) and DApp layer. The solution aggregates and organizes data from Acala and other blockchains, serving up well-structured data for developers to use for a wide array of projects. This service allows DApp developers to focus on their core use case and front-end, without needing to waste time on building a custom backend for data processing.
Acala is a firm believer and a long-term builder for the multi-chain future — reducing liquidity fragmentation, increasing composability, and enabling DeFi accessibility to everyone. Acala is a specialized blockchain focusing on decentralized finance (DeFi), and created multiple DeFi primitives that became a DeFi hub and infrastructure serving the Polkadot and Kusama ecosystems. The team has built products including a multi-collateralized stablecoin (aUSD — The Acala Dollar), an automated market maker (AMM) DEX, a tokenized staked asset called Liquid DOT (LDOT), and implemented a bring-your-own-gas feature allowing gas fees to be payable in any supported assets such as stablecoins. Acala’s parachain plans to play the role of DeFi hub for Polkadot and a landing pad that aggregates assets and liquidity from a variety of blockchains.
Today when you access the SubQuery Explorer you’ll be welcomed with a new Acala SubQuery Project. This SubQuery dynamically tracks all the extrinsic data created on Acala and can quickly show derived aggregated stats for the following:
You can play around with the Acala SubQuery Graph using the Explorer without implementing anything in code. Additionally, we’ve documented the types that you can specify in each GraphQL request when analysing Acala’s data.
Below is a simple example of how a user can quickly and easily see the previous 5 transfer events using the ACA token over the Acala Mandala network. You can see here that we use simple GraphQL language to sort and retrieve this data to any client. DApps can use this to monitor loan positions, and jump on auctions etc to help liquidate collaterals.
A slightly more complex example follows, where we inspect a single account and retrieve all token swap events made by it. A portfolio DApp might use this data to create an overview of the holder’s account and token performance, revenue from staking, liquidity provisioning, and expenses on borrowing.
SubQuery Explorer is an online hosted service that provides access to published SubQuery projects made by contributors around the world and managed by the SubQuery team. Its mission is to ease access to Polkadot network data by providing infrastructure services to help developers achieve more.
SubQuery allows every Substrate/Polkadot team to process and query their data. The project is inspired by the growth of data protocols serving the application layer and its aim is to help Polkadot/Substrate projects build better dApps by allowing anyone to reliably find and consume data faster. Today, anyone can query and extract Polkadot network data in only minutes and at no cost.
Start querying data on the new SubQuery Explorer
Build your own SubQuery project by following our SubQuery docs
Visit our website
Talk to us on:
🔺DISCLAIMER: Trading Cryptocurrency is VERY risky. Make sure that you understand these risks if you are a beginner. The Information in the post is my OPINION and not financial advice. You are responsible for what you do with your funds
Learn about Cryptocurrency in this article ☞ What You Should Know Before Investing in Cryptocurrency - For Beginner
Top exchanges for token-coin trading. Follow instructions and make unlimited money
☞ https://www.binance.com
☞ https://www.bittrex.com
☞ https://www.poloniex.com
☞ https://www.bitfinex.com
☞ https://www.huobi.com
☞ https://www.mxc.ai
☞ https://www.probit.com
☞ https://www.gate.io
☞ https://www.coinbase.com
I hope this post will help you. If you liked this, please sharing it with others. Thank you!
#blockchain #bitcoin #subquery network
1615934150
Your reviews are great!
1615939405
Thank you Ryan
1615885699
SubQuery’s aim is to help Polkadot/Substrate projects build better dApps by allowing anyone to reliably find and consume data faster. Our service will allow users to extract, transform, persist, and query data initially, as well as connect and present data in the future. Our aim is to make this a core piece of infrastructure for the Substrate/Polkadot ecosystem, just as The Graph has become for Ethereum.
SubQuery is here to help you transform and query the world’s data for a web3.0 future.
For the Web 3.0 dream to be realised, it’s got to be as fast (if not faster) than centralised networks for the end user.
That’s why we’re incredibly proud to announce SubQuery, an open source project that allows users to run an indexer across their chain to build a dataset that can be queried with GraphQL. This suite of tools includes a command line interface to allow projects to generate their own SubQuery project, defining how the indexer should traverse and aggregate their own network. There’s a SubQuery node package that indexes the network and supports GraphQL queries. With the help of these tools, anyone can create and run queries easily.
The Web3 Foundation Open Grants Program Deliverable
The Web3 Foundation Open Grants Program has enabled us to build SubQuery, an open source project that allows users to run an indexer across their chain to build a dataset that can be queried with GraphQL.
This suite of tools includes @subql/cli, to allow projects to generate their own SubQuery project, defining how the indexer should traverse and aggregate their own network. As part of our proposal, we have provided a basic tutorial that shows users how to use the cli to index their network that you can follow here. We’ve even provided more detailed developer documentation for more advanced usages.
Secondly, there’s a SubQuery node package that loads the defined SubQuery project created by the CLI and then indexes the network to a Postgres database. Using Hasura, you can run GraphQL queries right away over indexed tables. With the help of these tools, and the community support material that we’re always improving, anyone can create and run queries easily.
You can get started right away by following our example on the SubQuery Github repository: https://github.com/OnFinality-io/subql. Additionally, you can find out more by reading our SubQuery docs: https://doc.subquery.network/ or visiting our new website
We’re incredibly grateful for the support provided by Web3 Foundation to help us carry out this project for the community. Web3 Foundation funds research and development teams building the technology stack of the decentralized web. It was established in Zug, Switzerland by Ethereum co-founder and former chief technology officer Dr. Gavin Wood. Polkadot is the Foundation’s flagship project.
We saw Polkadot’s potential early and right from the start it felt natural to focus our efforts there. The core premise of Polkadot is to create a thriving community of developers, users, and businesses that will tap into its multichain interoperability — that community is going to need a service that allows them to reliably find and consume data quickly.
Polkadot’s unique architecture means that we can focus on one network and then be able to support multiple current and future chains with ease. Even though Polkadot is still under development, we will be there ready to help the next generation of blockchain developers create the next big dApp.
Since announcing SubQuery to the world just a month ago, the response and feedback we’ve received from the Polkadot community has been overwhelmingly positive. We’ve gained thousands of followers on our channels, and there have been over 1,353 installs from NPM. It’s inspiring to receive all the messages of support, and to see the engagement from the community. It’s time for us to give something back!
We’ve been working overtime over the last month to release the next major stage of our roadmap for SubQuery. Today we’re announcing the release of the SubQuery Explorer.
SubQuery Explorer is an online hosted service that provides access to published SubQuery projects made by contributors around the world and managed by the SubQuery team. It furthers our mission to support Polkadot developers by providing infrastructure services by making accessing Polkadot network data even easier.
Today, anyone can query and extract Polkadot network data in only minutes and at no cost.
The SubQuery explorer makes getting started easy. We’ve prebuilt SubQuery projects for two use cases (more about these below) and have indexed each network. We’re hosting these SubQuery nodes online and allow anyone to query each for free. These managed nodes will be monitored and run by the SubQuery team at a performance level that will allow production apps to use and rely on them.
You’ll also note that the SubQuery Explorer provides a playground for discovering available data with example queries. You can play around with each SubQuery Graph using this explorer without implementing anything in code. Additionally, we’ve made some small improvements to our documentation to better support developers on their journey to better query and analyse the world’s Polkadot data.
You can quickly find total staking revenue awarded to any account since the beginning of time by querying their account address. This subquery project indexes and records the accounts participating in the staking bond on the blockchain. The continued indexing will find out obtained staking reward and slashes for this account and aggregates their sums to a database.
You can quickly see the minimum staking amount required for a validator to be elected. This project is an excellent example of implementing query states in the mapping function. It first finds the active staking Era through a state query and records the validators of this session staking amount by each. It then calculates the minimum staking amount and the total amount staked in this Era. Lastly, it records the maximum number of nominators that can be rewarded.
SubQuery is a data aggregation layer that will operate between the layer-1 blockchain (Acala) and DApp layer. The solution aggregates and organizes data from Acala and other blockchains, serving up well-structured data for developers to use for a wide array of projects. This service allows DApp developers to focus on their core use case and front-end, without needing to waste time on building a custom backend for data processing.
Acala is a firm believer and a long-term builder for the multi-chain future — reducing liquidity fragmentation, increasing composability, and enabling DeFi accessibility to everyone. Acala is a specialized blockchain focusing on decentralized finance (DeFi), and created multiple DeFi primitives that became a DeFi hub and infrastructure serving the Polkadot and Kusama ecosystems. The team has built products including a multi-collateralized stablecoin (aUSD — The Acala Dollar), an automated market maker (AMM) DEX, a tokenized staked asset called Liquid DOT (LDOT), and implemented a bring-your-own-gas feature allowing gas fees to be payable in any supported assets such as stablecoins. Acala’s parachain plans to play the role of DeFi hub for Polkadot and a landing pad that aggregates assets and liquidity from a variety of blockchains.
Today when you access the SubQuery Explorer you’ll be welcomed with a new Acala SubQuery Project. This SubQuery dynamically tracks all the extrinsic data created on Acala and can quickly show derived aggregated stats for the following:
You can play around with the Acala SubQuery Graph using the Explorer without implementing anything in code. Additionally, we’ve documented the types that you can specify in each GraphQL request when analysing Acala’s data.
Below is a simple example of how a user can quickly and easily see the previous 5 transfer events using the ACA token over the Acala Mandala network. You can see here that we use simple GraphQL language to sort and retrieve this data to any client. DApps can use this to monitor loan positions, and jump on auctions etc to help liquidate collaterals.
A slightly more complex example follows, where we inspect a single account and retrieve all token swap events made by it. A portfolio DApp might use this data to create an overview of the holder’s account and token performance, revenue from staking, liquidity provisioning, and expenses on borrowing.
SubQuery Explorer is an online hosted service that provides access to published SubQuery projects made by contributors around the world and managed by the SubQuery team. Its mission is to ease access to Polkadot network data by providing infrastructure services to help developers achieve more.
SubQuery allows every Substrate/Polkadot team to process and query their data. The project is inspired by the growth of data protocols serving the application layer and its aim is to help Polkadot/Substrate projects build better dApps by allowing anyone to reliably find and consume data faster. Today, anyone can query and extract Polkadot network data in only minutes and at no cost.
Start querying data on the new SubQuery Explorer
Build your own SubQuery project by following our SubQuery docs
Visit our website
Talk to us on:
🔺DISCLAIMER: Trading Cryptocurrency is VERY risky. Make sure that you understand these risks if you are a beginner. The Information in the post is my OPINION and not financial advice. You are responsible for what you do with your funds
Learn about Cryptocurrency in this article ☞ What You Should Know Before Investing in Cryptocurrency - For Beginner
Top exchanges for token-coin trading. Follow instructions and make unlimited money
☞ https://www.binance.com
☞ https://www.bittrex.com
☞ https://www.poloniex.com
☞ https://www.bitfinex.com
☞ https://www.huobi.com
☞ https://www.mxc.ai
☞ https://www.probit.com
☞ https://www.gate.io
☞ https://www.coinbase.com
I hope this post will help you. If you liked this, please sharing it with others. Thank you!
#blockchain #bitcoin #subquery network
1659601560
We are all in the gutter, but some of us are looking at the stars.
-- Oscar Wilde
WordsCounted is a Ruby NLP (natural language processor). WordsCounted lets you implement powerful tokensation strategies with a very flexible tokeniser class.
Are you using WordsCounted to do something interesting? Please tell me about it.
Visit this website for one example of what you can do with WordsCounted.
["Bayrūt"]
and not ["Bayr", "ū", "t"]
, for example.Add this line to your application's Gemfile:
gem 'words_counted'
And then execute:
$ bundle
Or install it yourself as:
$ gem install words_counted
Pass in a string or a file path, and an optional filter and/or regexp.
counter = WordsCounted.count(
"We are all in the gutter, but some of us are looking at the stars."
)
# Using a file
counter = WordsCounted.from_file("path/or/url/to/my/file.txt")
.count
and .from_file
are convenience methods that take an input, tokenise it, and return an instance of WordsCounted::Counter
initialized with the tokens. The WordsCounted::Tokeniser
and WordsCounted::Counter
classes can be used alone, however.
WordsCounted.count(input, options = {})
Tokenises input and initializes a WordsCounted::Counter
object with the resulting tokens.
counter = WordsCounted.count("Hello Beirut!")
Accepts two options: exclude
and regexp
. See Excluding tokens from the analyser and Passing in a custom regexp respectively.
WordsCounted.from_file(path, options = {})
Reads and tokenises a file, and initializes a WordsCounted::Counter
object with the resulting tokens.
counter = WordsCounted.from_file("hello_beirut.txt")
Accepts the same options as .count
.
The tokeniser allows you to tokenise text in a variety of ways. You can pass in your own rules for tokenisation, and apply a powerful filter with any combination of rules as long as they can boil down into a lambda.
Out of the box the tokeniser includes only alpha chars. Hyphenated tokens and tokens with apostrophes are considered a single token.
#tokenise([pattern: TOKEN_REGEXP, exclude: nil])
tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise
# With `exclude`
tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise(exclude: "hello")
# With `pattern`
tokeniser = WordsCounted::Tokeniser.new("I <3 Beirut!").tokenise(pattern: /[a-z]/i)
See Excluding tokens from the analyser and Passing in a custom regexp for more information.
The WordsCounted::Counter
class allows you to collect various statistics from an array of tokens.
#token_count
Returns the token count of a given string.
counter.token_count #=> 15
#token_frequency
Returns a sorted (unstable) two-dimensional array where each element is a token and its frequency. The array is sorted by frequency in descending order.
counter.token_frequency
[
["the", 2],
["are", 2],
["we", 1],
# ...
["all", 1]
]
#most_frequent_tokens
Returns a hash where each key-value pair is a token and its frequency.
counter.most_frequent_tokens
{ "are" => 2, "the" => 2 }
#token_lengths
Returns a sorted (unstable) two-dimentional array where each element contains a token and its length. The array is sorted by length in descending order.
counter.token_lengths
[
["looking", 7],
["gutter", 6],
["stars", 5],
# ...
["in", 2]
]
#longest_tokens
Returns a hash where each key-value pair is a token and its length.
counter.longest_tokens
{ "looking" => 7 }
#token_density([ precision: 2 ])
Returns a sorted (unstable) two-dimentional array where each element contains a token and its density as a float, rounded to a precision of two. The array is sorted by density in descending order. It accepts a precision
argument, which must be a float.
counter.token_density
[
["are", 0.13],
["the", 0.13],
["but", 0.07 ],
# ...
["we", 0.07 ]
]
#char_count
Returns the char count of tokens.
counter.char_count #=> 76
#average_chars_per_token([ precision: 2 ])
Returns the average char count per token rounded to two decimal places. Accepts a precision argument which defaults to two. Precision must be a float.
counter.average_chars_per_token #=> 4
#uniq_token_count
Returns the number of unique tokens.
counter.uniq_token_count #=> 13
You can exclude anything you want from the input by passing the exclude
option. The exclude option accepts a variety of filters and is extremely flexible.
:odd?
.tokeniser =
WordsCounted::Tokeniser.new(
"Magnificent! That was magnificent, Trevor."
)
# Using a string
tokeniser.tokenise(exclude: "was magnificent")
# => ["that", "trevor"]
# Using a regular expression
tokeniser.tokenise(exclude: /trevor/)
# => ["magnificent", "that", "was", "magnificent"]
# Using a lambda
tokeniser.tokenise(exclude: ->(t) { t.length < 4 })
# => ["magnificent", "that", "magnificent", "trevor"]
# Using symbol
tokeniser = WordsCounted::Tokeniser.new("Hello! محمد")
tokeniser.tokenise(exclude: :ascii_only?)
# => ["محمد"]
# Using an array
tokeniser = WordsCounted::Tokeniser.new(
"Hello! اسماءنا هي محمد، كارولينا، سامي، وداني"
)
tokeniser.tokenise(
exclude: [:ascii_only?, /محمد/, ->(t) { t.length > 6}, "و"]
)
# => ["هي", "سامي", "وداني"]
The default regexp accounts for letters, hyphenated tokens, and apostrophes. This means twenty-one is treated as one token. So is Mohamad's.
/[\p{Alpha}\-']+/
You can pass your own criteria as a Ruby regular expression to split your string as desired.
For example, if you wanted to include numbers, you can override the regular expression:
counter = WordsCounted.count("Numbers 1, 2, and 3", pattern: /[\p{Alnum}\-']+/)
counter.tokens
#=> ["numbers", "1", "2", "and", "3"]
Use the from_file
method to open files. from_file
accepts the same options as .count
. The file path can be a URL.
counter = WordsCounted.from_file("url/or/path/to/file.text")
A hyphen used in leu of an em or en dash will form part of the token. This affects the tokeniser algorithm.
counter = WordsCounted.count("How do you do?-you are well, I see.")
counter.token_frequency
[
["do", 2],
["how", 1],
["you", 1],
["-you", 1], # WTF, mate!
["are", 1],
# ...
]
In this example -you
and you
are separate tokens. Also, the tokeniser does not include numbers by default. Remember that you can pass your own regular expression if the default behaviour does not fit your needs.
The program will normalise (downcase) all incoming strings for consistency and filters.
def self.from_url
# open url and send string here after removing html
end
See contributors.
git checkout -b my-new-feature
)git commit -am 'Add some feature'
)git push origin my-new-feature
)Author: abitdodgy
Source code: https://github.com/abitdodgy/words_counted
License: MIT license
#ruby #ruby-on-rails
1658068560
WordsCounted
We are all in the gutter, but some of us are looking at the stars.
-- Oscar Wilde
WordsCounted is a Ruby NLP (natural language processor). WordsCounted lets you implement powerful tokensation strategies with a very flexible tokeniser class.
["Bayrūt"]
and not ["Bayr", "ū", "t"]
, for example.Add this line to your application's Gemfile:
gem 'words_counted'
And then execute:
$ bundle
Or install it yourself as:
$ gem install words_counted
Pass in a string or a file path, and an optional filter and/or regexp.
counter = WordsCounted.count(
"We are all in the gutter, but some of us are looking at the stars."
)
# Using a file
counter = WordsCounted.from_file("path/or/url/to/my/file.txt")
.count
and .from_file
are convenience methods that take an input, tokenise it, and return an instance of WordsCounted::Counter
initialized with the tokens. The WordsCounted::Tokeniser
and WordsCounted::Counter
classes can be used alone, however.
WordsCounted.count(input, options = {})
Tokenises input and initializes a WordsCounted::Counter
object with the resulting tokens.
counter = WordsCounted.count("Hello Beirut!")
Accepts two options: exclude
and regexp
. See Excluding tokens from the analyser and Passing in a custom regexp respectively.
WordsCounted.from_file(path, options = {})
Reads and tokenises a file, and initializes a WordsCounted::Counter
object with the resulting tokens.
counter = WordsCounted.from_file("hello_beirut.txt")
Accepts the same options as .count
.
The tokeniser allows you to tokenise text in a variety of ways. You can pass in your own rules for tokenisation, and apply a powerful filter with any combination of rules as long as they can boil down into a lambda.
Out of the box the tokeniser includes only alpha chars. Hyphenated tokens and tokens with apostrophes are considered a single token.
#tokenise([pattern: TOKEN_REGEXP, exclude: nil])
tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise
# With `exclude`
tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise(exclude: "hello")
# With `pattern`
tokeniser = WordsCounted::Tokeniser.new("I <3 Beirut!").tokenise(pattern: /[a-z]/i)
See Excluding tokens from the analyser and Passing in a custom regexp for more information.
The WordsCounted::Counter
class allows you to collect various statistics from an array of tokens.
#token_count
Returns the token count of a given string.
counter.token_count #=> 15
#token_frequency
Returns a sorted (unstable) two-dimensional array where each element is a token and its frequency. The array is sorted by frequency in descending order.
counter.token_frequency
[
["the", 2],
["are", 2],
["we", 1],
# ...
["all", 1]
]
#most_frequent_tokens
Returns a hash where each key-value pair is a token and its frequency.
counter.most_frequent_tokens
{ "are" => 2, "the" => 2 }
#token_lengths
Returns a sorted (unstable) two-dimentional array where each element contains a token and its length. The array is sorted by length in descending order.
counter.token_lengths
[
["looking", 7],
["gutter", 6],
["stars", 5],
# ...
["in", 2]
]
#longest_tokens
Returns a hash where each key-value pair is a token and its length.
counter.longest_tokens
{ "looking" => 7 }
#token_density([ precision: 2 ])
Returns a sorted (unstable) two-dimentional array where each element contains a token and its density as a float, rounded to a precision of two. The array is sorted by density in descending order. It accepts a precision
argument, which must be a float.
counter.token_density
[
["are", 0.13],
["the", 0.13],
["but", 0.07 ],
# ...
["we", 0.07 ]
]
#char_count
Returns the char count of tokens.
counter.char_count #=> 76
#average_chars_per_token([ precision: 2 ])
Returns the average char count per token rounded to two decimal places. Accepts a precision argument which defaults to two. Precision must be a float.
counter.average_chars_per_token #=> 4
#uniq_token_count
Returns the number of unique tokens.
counter.uniq_token_count #=> 13
You can exclude anything you want from the input by passing the exclude
option. The exclude option accepts a variety of filters and is extremely flexible.
:odd?
.tokeniser =
WordsCounted::Tokeniser.new(
"Magnificent! That was magnificent, Trevor."
)
# Using a string
tokeniser.tokenise(exclude: "was magnificent")
# => ["that", "trevor"]
# Using a regular expression
tokeniser.tokenise(exclude: /trevor/)
# => ["magnificent", "that", "was", "magnificent"]
# Using a lambda
tokeniser.tokenise(exclude: ->(t) { t.length < 4 })
# => ["magnificent", "that", "magnificent", "trevor"]
# Using symbol
tokeniser = WordsCounted::Tokeniser.new("Hello! محمد")
tokeniser.tokenise(exclude: :ascii_only?)
# => ["محمد"]
# Using an array
tokeniser = WordsCounted::Tokeniser.new(
"Hello! اسماءنا هي محمد، كارولينا، سامي، وداني"
)
tokeniser.tokenise(
exclude: [:ascii_only?, /محمد/, ->(t) { t.length > 6}, "و"]
)
# => ["هي", "سامي", "وداني"]
The default regexp accounts for letters, hyphenated tokens, and apostrophes. This means twenty-one is treated as one token. So is Mohamad's.
/[\p{Alpha}\-']+/
You can pass your own criteria as a Ruby regular expression to split your string as desired.
For example, if you wanted to include numbers, you can override the regular expression:
counter = WordsCounted.count("Numbers 1, 2, and 3", pattern: /[\p{Alnum}\-']+/)
counter.tokens
#=> ["numbers", "1", "2", "and", "3"]
Use the from_file
method to open files. from_file
accepts the same options as .count
. The file path can be a URL.
counter = WordsCounted.from_file("url/or/path/to/file.text")
A hyphen used in leu of an em or en dash will form part of the token. This affects the tokeniser algorithm.
counter = WordsCounted.count("How do you do?-you are well, I see.")
counter.token_frequency
[
["do", 2],
["how", 1],
["you", 1],
["-you", 1], # WTF, mate!
["are", 1],
# ...
]
In this example -you
and you
are separate tokens. Also, the tokeniser does not include numbers by default. Remember that you can pass your own regular expression if the default behaviour does not fit your needs.
The program will normalise (downcase) all incoming strings for consistency and filters.
def self.from_url
# open url and send string here after removing html
end
Are you using WordsCounted to do something interesting? Please tell me about it.
Visit this website for one example of what you can do with WordsCounted.
Contributors
See contributors.
git checkout -b my-new-feature
)git commit -am 'Add some feature'
)git push origin my-new-feature
)Author: Abitdodgy
Source Code: https://github.com/abitdodgy/words_counted
License: MIT license
1624658400
Hey guys, in this video I review PAID NETWORK. This is a DeFi project that aims to solve complex legal process using decentralised protocols and DeFi products for 2021.
PAID Network is an ecosystem DAPP that leverages blockchain technology to deliver DeFi powered SMART Agreements to make business exponentially more efficient. We allow users to create their own policy, to ensure they Get PAID.
📺 The video in this post was made by Crypto expat
The origin of the article: https://www.youtube.com/watch?v=ZIU5javfL90
🔺 DISCLAIMER: The article is for information sharing. The content of this video is solely the opinions of the speaker who is not a licensed financial advisor or registered investment advisor. Not investment advice or legal advice.
Cryptocurrency trading is VERY risky. Make sure you understand these risks and that you are responsible for what you do with your money
🔥 If you’re a beginner. I believe the article below will be useful to you ☞ What You Should Know Before Investing in Cryptocurrency - For Beginner
⭐ ⭐ ⭐The project is of interest to the community. Join to Get free ‘GEEK coin’ (GEEKCASH coin)!
☞ **-----CLICK HERE-----**⭐ ⭐ ⭐
Thanks for visiting and watching! Please don’t forget to leave a like, comment and share!
#bitcoin #blockchain #paid network #paid network review #token sale #paid network review, is it worth investing in? token sale coming soon !!
1622197808
SafeMoon is a decentralized finance (DeFi) token. This token consists of RFI tokenomics and auto-liquidity generating protocol. A DeFi token like SafeMoon has reached the mainstream standards under the Binance Smart Chain. Its success and popularity have been immense, thus, making the majority of the business firms adopt this style of cryptocurrency as an alternative.
A DeFi token like SafeMoon is almost similar to the other crypto-token, but the only difference being that it charges a 10% transaction fee from the users who sell their tokens, in which 5% of the fee is distributed to the remaining SafeMoon owners. This feature rewards the owners for holding onto their tokens.
Read More @ https://bit.ly/3oFbJoJ
#create a defi token like safemoon #defi token like safemoon #safemoon token #safemoon token clone #defi token