Royce Reinger

1657287687

CommonRegexRuby: Find A Lot Of Kinds Of Common Information In A String

CommonRegexRuby

CommonRegex port for Ruby

Find a lot of kinds of common information in a string.

Pull requests welcome!

Please note that this is currently English/US specific.

Installation

To install CommonRegexRuby, just run:

    $ gem install commonregex

Now you can use the CommonRegex class. See the API and the examples below.

API

Instance methods return results for the text passed to the constructor. Class methods take the text as a parameter and return results for it.

Possible instance and class methods:

  • get_dates([text])
  • get_times([text])
  • get_phones([text])
  • get_links([text])
  • get_emails([text])
  • get_ipv4([text])
  • get_ipv6([text])
  • get_hex_colors([text])
  • get_acronyms([text])
  • get_money([text])
  • get_percentages([text]) (matches percentages between 0.00% and 100.00%)
  • get_credit_cards([text])
  • get_addresses([text])

Examples

    # A trailing backslash joins adjacent string literals into one string.
    text = "John, please get that article on www.linkedin.com to me by 5:00PM\n" \
           "on Jan 9th 2012. 4:00 would be ideal, actually. If you have any questions,\n" \
           "you can reach my associate at (012)-345-6789 or associative@mail.com.\n" \
           "I'll be on UK during the whole week on a J.R.R. Tolkien convention."

    common_regex = CommonRegex.new(text)
    puts common_regex.get_dates
    # ["Jan 9th 2012"]
    puts common_regex.get_times
    # ["5:00PM", "4:00"]
    puts common_regex.get_phones
    # ["(012)-345-6789"]
    puts common_regex.get_links
    # ["www.linkedin.com"]
    puts common_regex.get_emails
    # ["associative@mail.com"]
    puts common_regex.get_acronyms
    # ["UK", "J.R.R."]

Alternatively, you can use class methods.


    puts CommonRegex.get_times 'When are you free? Do you want to meet up for coffee at 4:00?'
    # ["4:00"]
    puts CommonRegex.get_money 'They said the price was US$5,000.90, actually it is US$3,900.5. It\'s $1100.4 less, can you imagine this?'
    # ["US$5,000.90", "US$3,900.5", "$1100.4"]
    puts CommonRegex.get_percentages 'I\'m 99.9999999% sure that I\'ll get a raise of 5%.'
    # ["99.9999999%", "5%"]
    puts CommonRegex.get_ipv6 'The IPv6 address for localhost is 0:0:0:0:0:0:0:1, or alternatively, ::1.'
    # ["0:0:0:0:0:0:0:1", "::1"]
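
For a rough idea of how this kind of extraction works, here is a minimal sketch built on plain String#scan. The SIMPLE_TIME pattern and the simple_times helper are hypothetical names made up for this illustration, and the pattern is far cruder than whatever expressions the gem actually ships. It only shows the general idea of scanning a text for every match of a pattern.

```ruby
# Hypothetical, simplified pattern: one or two digits, a colon, two digits,
# and an optional AM/PM suffix. This is not the gem's real expression.
SIMPLE_TIME = /\b\d{1,2}:\d{2}\s?(?:[AP]M)?/i

# Returns every substring of text that looks like a time.
def simple_times(text)
  text.scan(SIMPLE_TIME).map(&:strip)
end

puts simple_times('Call me at 5:00PM, or at 4:00 if you can.').inspect
# ["5:00PM", "4:00"]
```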

CommonRegex Ports

There are CommonRegex ports for other languages, see here

Author: talyssonoc
Source Code: https://github.com/talyssonoc/CommonRegexRuby 
License: 

#ruby #string #regex 

ActiveInteraction: Manage Application-Specific Business Logic in Ruby

ActiveInteraction

ActiveInteraction manages application-specific business logic. It's an implementation of service objects designed to blend seamlessly into Rails. 


ActiveInteraction gives you a place to put your business logic. It also helps you write safer code by validating that your inputs conform to your expectations. If ActiveModel deals with your nouns, then ActiveInteraction handles your verbs.

API Documentation

Installation

Add it to your Gemfile:

gem 'active_interaction', '~> 5.1'

Or install it manually:

$ gem install active_interaction --version '~> 5.1'

This project uses Semantic Versioning. Check out GitHub releases for a detailed list of changes.

Basic usage

To define an interaction, create a subclass of ActiveInteraction::Base. Then you need to do two things:

  • Define your inputs. Use class filter methods to define what you expect your inputs to look like. For instance, if you need a boolean flag for pepperoni, use boolean :pepperoni. Check out the filters section for all the available options.
  • Define your business logic. Do this by implementing the #execute method. Each input you defined will be available as the type you specified. If any of the inputs are invalid, #execute won't be run. Filters are responsible for checking your inputs. Check out the validations section if you need more than that.

That covers the basics. Let's put it all together into a simple example that squares a number.

require 'active_interaction'

class Square < ActiveInteraction::Base
  float :x

  def execute
    x**2
  end
end

Call .run on your interaction to execute it. You must pass a single hash to .run. It will return an instance of your interaction. By convention, we call this an outcome. You can use the #valid? method to ask the outcome if it's valid. If it's invalid, take a look at its errors with #errors. In either case, the value returned from #execute will be stored in #result.

outcome = Square.run(x: 'two point one')
outcome.valid?
# => nil
outcome.errors.messages
# => {:x=>["is not a valid float"]}

outcome = Square.run(x: 2.1)
outcome.valid?
# => true
outcome.result
# => 4.41

You can also use .run! to execute interactions. It's like .run but more dangerous. It doesn't return an outcome. If the outcome would be invalid, it will instead raise an error. But if the outcome would be valid, it simply returns the result.

Square.run!(x: 'two point one')
# ActiveInteraction::InvalidInteractionError: X is not a valid float
Square.run!(x: 2.1)
# => 4.41
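
The difference between the two entry points can be sketched in plain Ruby. MiniSquare below is a hypothetical stand-in written for this illustration. It is not how ActiveInteraction is implemented, and its InvalidError class is made up. It only mirrors the contract described above: .run always returns an outcome, while .run! returns the bare result or raises.

```ruby
# Hypothetical stand-in for the run / run! contract. Not the gem's code.
class MiniSquare
  class InvalidError < StandardError; end

  attr_reader :errors, :result

  # Always returns an outcome object, valid or not.
  def self.run(inputs)
    new(inputs).tap(&:call)
  end

  # Returns the result directly, or raises if the outcome is invalid.
  def self.run!(inputs)
    outcome = run(inputs)
    raise InvalidError, outcome.errors.join(', ') unless outcome.valid?

    outcome.result
  end

  def initialize(inputs)
    @inputs = inputs
    @errors = []
    @result = nil
  end

  def valid?
    errors.empty?
  end

  # Checks the input first and only computes a result when it is usable.
  def call
    x = @inputs[:x]
    if x.is_a?(Numeric)
      @result = x.to_f**2
    else
      @errors << 'X is not a valid float'
    end
  end
end

puts MiniSquare.run(x: 'two point one').valid? # false
puts MiniSquare.run!(x: 3)                     # 9.0
```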

Validations

ActiveInteraction checks your inputs. Often you'll want more than that. For instance, you may want an input to be a string with at least one non-whitespace character. Instead of writing your own validation for that, you can use validations from ActiveModel.

These validations aren't provided by ActiveInteraction. They're from ActiveModel. You can also use any custom validations you wrote yourself in your interactions.

class SayHello < ActiveInteraction::Base
  string :name

  validates :name,
    presence: true

  def execute
    "Hello, #{name}!"
  end
end

When you run this interaction, two things will happen. First ActiveInteraction will check your inputs. Then ActiveModel will validate them. If both of those are happy, it will be executed.

SayHello.run!(name: nil)
# ActiveInteraction::InvalidInteractionError: Name is required

SayHello.run!(name: '')
# ActiveInteraction::InvalidInteractionError: Name can't be blank

SayHello.run!(name: 'Taylor')
# => "Hello, Taylor!"

Filters

You can define filters inside an interaction using the appropriate class method. Each method has the same signature:

  • Some symbolic names. These are the attributes to create.
  • An optional hash of options. Each filter supports at least these two options:
      • default is the fallback value to use if nil is given. To make a filter optional, set default: nil.
      • desc is a human-readable description of the input. This can be useful for generating documentation. For more information about this, read the descriptions section.
  • An optional block of sub-filters. Only array and hash filters support this. Other filters will ignore blocks when given to them.

Let's take a look at an example filter. It defines three inputs: x, y, and z. Those inputs are optional and they all share the same description ("an example filter").

array :x, :y, :z,
  default: nil,
  desc: 'an example filter' do
    # Some filters support sub-filters here.
  end

In general, filters accept values of the type they correspond to, plus a few alternatives that can be reasonably coerced. Typically the coercions come from Rails, so "1" can be interpreted as the boolean value true, the string "1", or the number 1.

Basic Filters

Array

In addition to accepting arrays, array inputs will convert ActiveRecord::Relations into arrays.

class ArrayInteraction < ActiveInteraction::Base
  array :toppings

  def execute
    toppings.size
  end
end

ArrayInteraction.run!(toppings: 'everything')
# ActiveInteraction::InvalidInteractionError: Toppings is not a valid array
ArrayInteraction.run!(toppings: [:cheese, 'pepperoni'])
# => 2

Use a block to constrain the types of elements an array can contain. Note that you can only have one filter inside an array block, and it must not have a name.

array :birthdays do
  date
end

For interface, object, and record filters, the name of the array filter will be singularized and used to determine the type of value passed. In the example below, the objects passed would need to be of type Cow.

array :cows do
  object
end

You can override this by passing the necessary information to the inner filter.

array :managers do
  object class: People
end

Errors that occur will be indexed based on the Rails configuration setting index_nested_attribute_errors. You can also manually override this setting with the :index_errors option. In this state it is possible to get multiple errors from a single filter.

class ArrayInteraction < ActiveInteraction::Base
  array :favorite_numbers, index_errors: true do
    integer
  end

  def execute
    favorite_numbers
  end
end

ArrayInteraction.run(favorite_numbers: [8, 'bazillion']).errors.details
# => {:"favorite_numbers[1]"=>[{:error=>:invalid_type, :type=>"array"}]}

With :index_errors set to false the error would have been:

{:favorite_numbers=>[{:error=>:invalid_type, :type=>"array"}]}

Boolean

Boolean filters convert the strings "1", "true", and "on" (case-insensitive) into true. They also convert "0", "false", and "off" into false. Blank strings will be treated as nil.

class BooleanInteraction < ActiveInteraction::Base
  boolean :kool_aid

  def execute
    'Oh yeah!' if kool_aid
  end
end

BooleanInteraction.run!(kool_aid: 1)
# ActiveInteraction::InvalidInteractionError: Kool aid is not a valid boolean
BooleanInteraction.run!(kool_aid: true)
# => "Oh yeah!"
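
The string rules above can be written down as a small plain-Ruby function. This is a sketch under assumptions, not the gem's code: coerce_boolean and the two constants are hypothetical names, case-insensitivity is applied to both lists, and any non-string value other than true or false is rejected, in line with the kool_aid: 1 example.

```ruby
# Hypothetical helper mirroring the coercion rules described above.
TRUE_STRINGS = %w[1 true on].freeze
FALSE_STRINGS = %w[0 false off].freeze

def coerce_boolean(value)
  return value if [true, false].include?(value)
  raise ArgumentError, "#{value.inspect} is not a valid boolean" unless value.is_a?(String)

  # Blank strings are treated as nil rather than as an error.
  return nil if value.strip.empty?

  normalized = value.downcase
  return true if TRUE_STRINGS.include?(normalized)
  return false if FALSE_STRINGS.include?(normalized)

  raise ArgumentError, "#{value.inspect} is not a valid boolean"
end

puts coerce_boolean('ON').inspect  # true
puts coerce_boolean('off').inspect # false
puts coerce_boolean('').inspect    # nil
```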

File

File filters also accept TempFiles and anything that responds to #rewind. That means that you can pass the params from uploading files via forms in Rails.

class FileInteraction < ActiveInteraction::Base
  file :readme

  def execute
    readme.size
  end
end

FileInteraction.run!(readme: 'README.md')
# ActiveInteraction::InvalidInteractionError: Readme is not a valid file
FileInteraction.run!(readme: File.open('README.md'))
# => 21563

Hash

Hash filters accept hashes. The expected value types are given by passing a block and nesting other filters. You can have any number of filters inside a hash, including other hashes.

class HashInteraction < ActiveInteraction::Base
  hash :preferences do
    boolean :newsletter
    boolean :sweepstakes
  end

  def execute
    puts 'Thanks for joining the newsletter!' if preferences[:newsletter]
    puts 'Good luck in the sweepstakes!' if preferences[:sweepstakes]
  end
end

HashInteraction.run!(preferences: 'yes, no')
# ActiveInteraction::InvalidInteractionError: Preferences is not a valid hash
HashInteraction.run!(preferences: { newsletter: true, 'sweepstakes' => false })
# Thanks for joining the newsletter!
# => nil

Setting default hash values can be tricky. The default value has to be either nil or {}. Use nil to make the hash optional. Use {} if you want to set some defaults for values inside the hash.

hash :optional,
  default: nil
# => {:optional=>nil}

hash :with_defaults,
  default: {} do
    boolean :likes_cookies,
      default: true
  end
# => {:with_defaults=>{:likes_cookies=>true}}

By default, hashes remove any keys that aren't given as nested filters. To allow all hash keys, set strip: false. In general we don't recommend doing this, but it's sometimes necessary.

hash :stuff,
  strip: false

String

String filters define inputs that only accept strings.

class StringInteraction < ActiveInteraction::Base
  string :name

  def execute
    "Hello, #{name}!"
  end
end

StringInteraction.run!(name: 0xDEADBEEF)
# ActiveInteraction::InvalidInteractionError: Name is not a valid string
StringInteraction.run!(name: 'Taylor')
# => "Hello, Taylor!"

String filters strip leading and trailing whitespace by default. To disable this, set the strip option to false.

string :comment,
  strip: false

Symbol

Symbol filters define inputs that accept symbols. Strings will be converted into symbols.

class SymbolInteraction < ActiveInteraction::Base
  symbol :method

  def execute
    method.to_proc
  end
end

SymbolInteraction.run!(method: -> {})
# ActiveInteraction::InvalidInteractionError: Method is not a valid symbol
SymbolInteraction.run!(method: :object_id)
# => #<Proc:0x007fdc9ba94118>

Dates and times

Filters that work with dates and times behave similarly. By default, they all convert strings into their expected data types using .parse. Blank strings will be treated as nil. If you give the format option, they will instead convert strings using .strptime. Note that formats won't work with DateTime and Time filters if a time zone is set.

Date

class DateInteraction < ActiveInteraction::Base
  date :birthday

  def execute
    birthday + (18 * 365)
  end
end

DateInteraction.run!(birthday: 'yesterday')
# ActiveInteraction::InvalidInteractionError: Birthday is not a valid date
DateInteraction.run!(birthday: Date.new(1989, 9, 1))
# => #<Date: 2007-08-28 ((2454341j,0s,0n),+0s,2299161j)>

date :birthday,
  format: '%Y-%m-%d'
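
The choice between .parse and .strptime can be illustrated with Ruby's standard Date class. The coerce_date helper is a hypothetical name for this sketch. It follows the rules described for date filters but is not taken from the gem.

```ruby
require 'date'

# Hypothetical helper: uses strptime when a format is given, parse otherwise.
def coerce_date(value, format: nil)
  return value if value.is_a?(Date)
  return nil if value.strip.empty? # blank strings are treated as nil

  format ? Date.strptime(value, format) : Date.parse(value)
end

puts coerce_date('1989-09-01')                     # 1989-09-01
puts coerce_date('01/09/1989', format: '%d/%m/%Y') # 1989-09-01
```

An explicit format matters when the string is ambiguous: 01/09/1989 is 1 September under %d/%m/%Y but 9 January under %m/%d/%Y.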

DateTime

class DateTimeInteraction < ActiveInteraction::Base
  date_time :now

  def execute
    now.iso8601
  end
end

DateTimeInteraction.run!(now: 'now')
# ActiveInteraction::InvalidInteractionError: Now is not a valid date time
DateTimeInteraction.run!(now: DateTime.now)
# => "2015-03-11T11:04:40-05:00"

date_time :start,
  format: '%Y-%m-%dT%H:%M:%S'

Time

In addition to converting strings with .parse (or .strptime), time filters convert numbers with .at.

class TimeInteraction < ActiveInteraction::Base
  time :epoch

  def execute
    Time.now - epoch
  end
end

TimeInteraction.run!(epoch: 'a long, long time ago')
# ActiveInteraction::InvalidInteractionError: Epoch is not a valid time
TimeInteraction.run!(epoch: Time.new(1970))
# => 1426068362.5136619

time :start,
  format: '%Y-%m-%dT%H:%M:%S'
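
As a sketch of the extra numeric case, the following plain-Ruby helper converts numbers with Time.at and strings with Time.parse or Time.strptime. The name coerce_time is hypothetical and the code is an illustration of the described behavior, not the gem's implementation.

```ruby
require 'time' # adds Time.parse and Time.strptime

# Hypothetical helper mirroring the rules described for time filters.
def coerce_time(value, format: nil)
  case value
  when Time then value
  when Numeric then Time.at(value) # seconds since the Unix epoch
  when String
    format ? Time.strptime(value, format) : Time.parse(value)
  else
    raise ArgumentError, "#{value.inspect} is not a valid time"
  end
end

puts coerce_time(0) == Time.at(0) # true
puts coerce_time('2015-03-11T11:04:40', format: '%Y-%m-%dT%H:%M:%S').hour # 11
```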

Numbers

All numeric filters accept numeric input. They will also convert strings using the appropriate method from Kernel (like .Float). Blank strings will be treated as nil.

Decimal

class DecimalInteraction < ActiveInteraction::Base
  decimal :price

  def execute
    price * 1.0825
  end
end

DecimalInteraction.run!(price: 'one ninety-nine')
# ActiveInteraction::InvalidInteractionError: Price is not a valid decimal
DecimalInteraction.run!(price: BigDecimal(1.99, 2))
# => #<BigDecimal:7fe792a42028,'0.2165E1',18(45)>

To specify the number of significant digits, use the digits option.

decimal :dollars,
  digits: 2

Float

class FloatInteraction < ActiveInteraction::Base
  float :x

  def execute
    x**2
  end
end

FloatInteraction.run!(x: 'two point one')
# ActiveInteraction::InvalidInteractionError: X is not a valid float
FloatInteraction.run!(x: 2.1)
# => 4.41

Integer

class IntegerInteraction < ActiveInteraction::Base
  integer :limit

  def execute
    limit.downto(0).to_a
  end
end

IntegerInteraction.run!(limit: 'ten')
# ActiveInteraction::InvalidInteractionError: Limit is not a valid integer
IntegerInteraction.run!(limit: 10)
# => [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

When a String is passed into an integer input, the value will be coerced. A default base of 10 is used though it may be overridden with the base option. If a base of 0 is provided, the coercion will respect radix indicators present in the string.

class IntegerInteraction < ActiveInteraction::Base
  integer :limit1
  integer :limit2, base: 8
  integer :limit3, base: 0

  def execute
    [limit1, limit2, limit3]
  end
end

IntegerInteraction.run!(limit1: 71, limit2: 71, limit3: 71)
# => [71, 71, 71]
IntegerInteraction.run!(limit1: "071", limit2: "071", limit3: "0x71")
# => [71, 57, 113]
IntegerInteraction.run!(limit1: "08", limit2: "08", limit3: "08")
# ActiveInteraction::InvalidInteractionError: Limit2 is not a valid integer, Limit3 is not a valid integer
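
The results above line up with how Ruby's own Kernel#Integer treats its base argument. The sketch below calls Kernel#Integer directly. The parse_integer wrapper is a hypothetical helper that returns nil instead of raising, purely to keep the example short.

```ruby
# Hypothetical wrapper around Kernel#Integer; returns nil for invalid strings.
def parse_integer(string, base: 10)
  Integer(string, base)
rescue ArgumentError
  nil
end

puts parse_integer('071').inspect           # 71  (base 10, leading zero ignored)
puts parse_integer('071', base: 8).inspect  # 57  (7 * 8 + 1)
puts parse_integer('0x71', base: 0).inspect # 113 (radix indicator respected)
puts parse_integer('08', base: 8).inspect   # nil (8 is not an octal digit)
puts parse_integer('08', base: 0).inspect   # nil (leading 0 means octal)
```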

Advanced Filters

Interface

Interface filters allow you to specify an interface that the passed value must meet in order to pass. The name of the interface is used to look for a constant inside the ancestor listing for the passed value. This allows for a variety of checks depending on what's passed. Class instances are checked for an included module or an inherited ancestor class. Classes are checked for an extended module or an inherited ancestor class. Modules are checked for an extended module.

class InterfaceInteraction < ActiveInteraction::Base
  interface :exception

  def execute
    exception
  end
end

InterfaceInteraction.run!(exception: Exception)
# ActiveInteraction::InvalidInteractionError: Exception is not a valid interface
InterfaceInteraction.run!(exception: NameError) # a subclass of Exception
# => NameError

You can use :from to specify a class or module. This would be the equivalent of what's above.

class InterfaceInteraction < ActiveInteraction::Base
  interface :error,
    from: Exception

  def execute
    error
  end
end

You can also create an anonymous interface on the fly by passing the methods option.

class InterfaceInteraction < ActiveInteraction::Base
  interface :serializer,
    methods: %i[dump load]

  def execute
    input = '{ "is_json" : true }'
    object = serializer.load(input)
    output = serializer.dump(object)

    output
  end
end

require 'json'

InterfaceInteraction.run!(serializer: Object.new)
# ActiveInteraction::InvalidInteractionError: Serializer is not a valid interface
InterfaceInteraction.run!(serializer: JSON)
# => "{\"is_json\":true}"
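
Conceptually, a methods-based interface is a duck-typing check. A minimal sketch of that idea, assuming the check boils down to respond_to? (the implements? helper is a hypothetical name, not part of the gem):

```ruby
require 'json'

# Hypothetical helper: true when the object responds to every listed method.
def implements?(object, methods)
  methods.all? { |name| object.respond_to?(name) }
end

puts implements?(JSON, %i[dump load])       # true
puts implements?(Object.new, %i[dump load]) # false
```

Note that respond_to? ignores private methods by default, which is why a bare Object does not qualify even though Kernel defines a private load.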

Object

Object filters allow you to require an instance of a particular class or one of its subclasses.

class Cow
  def moo
    'Moo!'
  end
end

class ObjectInteraction < ActiveInteraction::Base
  object :cow

  def execute
    cow.moo
  end
end

ObjectInteraction.run!(cow: Object.new)
# ActiveInteraction::InvalidInteractionError: Cow is not a valid object
ObjectInteraction.run!(cow: Cow.new)
# => "Moo!"

The class name is automatically determined by the filter name. If your filter name is different than your class name, use the class option. It can be either the class, a string, or a symbol.

object :dolly1,
  class: Sheep
object :dolly2,
  class: 'Sheep'
object :dolly3,
  class: :Sheep

If you have value objects or you would like to build one object from another, you can use the converter option. It is only called if the value provided is not an instance of the class or one of its subclasses. The converter option accepts a symbol that specifies a class method on the object class or a proc. Both will be passed the value and any errors thrown inside the converter will cause the value to be considered invalid. Any returned value that is not the correct class will also be treated as invalid. Any default that is not an instance of the class or subclass and is not nil will also be converted.

class ObjectInteraction < ActiveInteraction::Base
  object :ip_address,
    class: IPAddr,
    converter: :new

  def execute
    ip_address
  end
end

ObjectInteraction.run!(ip_address: '192.168.1.1')
# => #<IPAddr: IPv4:192.168.1.1/255.255.255.255>

ObjectInteraction.run!(ip_address: 1)
# ActiveInteraction::InvalidInteractionError: Ip address is not a valid object
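
The converter rules can be summarized in a few lines of plain Ruby. The convert_object helper below is a hypothetical sketch of the described behavior, not the gem's code. It returns nil where the filter would report an invalid value.

```ruby
require 'ipaddr'

# Hypothetical helper following the converter rules described above.
def convert_object(value, klass, converter)
  # Instances of the class (or a subclass) are passed through untouched.
  return value if value.is_a?(klass)

  converted =
    if converter.respond_to?(:call)
      converter.call(value)               # a proc receives the value
    else
      klass.public_send(converter, value) # a symbol names a class method
    end

  # A result of the wrong class counts as invalid.
  converted.is_a?(klass) ? converted : nil
rescue StandardError
  # Any error raised inside the converter also counts as invalid.
  nil
end

puts convert_object('192.168.1.1', IPAddr, :new) # 192.168.1.1
puts convert_object(1, IPAddr, :new).inspect     # nil
```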

Record

Record filters allow you to require an instance of a particular class (or one of its subclasses) or a value that can be used to locate an instance of the object. If the value does not match, it will call find on the class of the record. This is particularly useful when working with ActiveRecord objects. Like an object filter, the class is derived from the name passed but can be specified with the class option. Any default that is not an instance of the class or subclass and is not nil will also be found. Blank strings passed in will be treated as nil.

class RecordInteraction < ActiveInteraction::Base
  record :encoding

  def execute
    encoding
  end
end

> RecordInteraction.run!(encoding: Encoding::US_ASCII)
=> #<Encoding:US-ASCII>

> RecordInteraction.run!(encoding: 'ascii')
=> #<Encoding:US-ASCII>

A different method can be specified by providing a symbol to the finder option.

Rails

ActiveInteraction plays nicely with Rails. You can use interactions to handle your business logic instead of models or controllers. To see how it all works, let's take a look at a complete example of a controller with the typical resourceful actions.

Setup

We recommend putting your interactions in app/interactions. It's also very helpful to group them by model. That way you can look in app/interactions/accounts for all the ways you can interact with accounts.

- app/
  - controllers/
    - accounts_controller.rb
  - interactions/
    - accounts/
      - create_account.rb
      - destroy_account.rb
      - find_account.rb
      - list_accounts.rb
      - update_account.rb
  - models/
    - account.rb
  - views/
    - account/
      - edit.html.erb
      - index.html.erb
      - new.html.erb
      - show.html.erb

Controller

Index

# GET /accounts
def index
  @accounts = ListAccounts.run!
end

Since we're not passing any inputs to ListAccounts, it makes sense to use .run! instead of .run. If it failed, that would mean we probably messed up writing the interaction.

class ListAccounts < ActiveInteraction::Base
  def execute
    Account.not_deleted.order(last_name: :asc, first_name: :asc)
  end
end

Show

Up next is the show action. For this one we'll define a helper method to handle raising the correct errors. We have to do this because calling .run! would raise an ActiveInteraction::InvalidInteractionError instead of an ActiveRecord::RecordNotFound. That means Rails would render a 500 instead of a 404.

# GET /accounts/:id
def show
  @account = find_account!
end

private

def find_account!
  outcome = FindAccount.run(params)

  if outcome.valid?
    outcome.result
  else
    fail ActiveRecord::RecordNotFound, outcome.errors.full_messages.to_sentence
  end
end

This probably looks a little different than you're used to. Rails commonly handles this with a before_filter that sets the @account instance variable. Why is all this interaction code better? Two reasons: One, you can reuse the FindAccount interaction in other places, like your API controller or a Resque task. And two, if you want to change how accounts are found, you only have to change one place.

Inside the interaction, we could use #find instead of #find_by_id. That way we wouldn't need the #find_account! helper method in the controller because the error would bubble all the way up. However, you should try to avoid raising errors from interactions. If you do, you'll have to deal with raised exceptions as well as the validity of the outcome.

class FindAccount < ActiveInteraction::Base
  integer :id

  def execute
    account = Account.not_deleted.find_by_id(id)

    if account
      account
    else
      errors.add(:id, 'does not exist')
    end
  end
end

Note that it's perfectly fine to add errors during execution. Not all errors have to come from checking or validation.

New

The new action will be a little different than the ones we've looked at so far. Instead of calling .run or .run!, it's going to initialize a new interaction. This is possible because interactions behave like ActiveModels.

# GET /accounts/new
def new
  @account = CreateAccount.new
end

Since interactions behave like ActiveModels, we can use ActiveModel validations with them. We'll use validations here to make sure that the first and last names are not blank. The validations section goes into more detail about this.

class CreateAccount < ActiveInteraction::Base
  string :first_name, :last_name

  validates :first_name, :last_name,
    presence: true

  def to_model
    Account.new
  end

  def execute
    account = Account.new(inputs)

    unless account.save
      errors.merge!(account.errors)
    end

    account
  end
end

We used a couple of advanced features here. The #to_model method helps determine the correct form to use in the view. Check out the section on forms for more about that. Inside #execute, we merge errors. This is a convenient way to move errors from one object to another. Read more about it in the errors section.

Create

The create action has a lot in common with the new action. Both of them use the CreateAccount interaction. And if creating the account fails, this action falls back to rendering the new action.

# POST /accounts
def create
  outcome = CreateAccount.run(params.fetch(:account, {}))

  if outcome.valid?
    redirect_to(outcome.result)
  else
    @account = outcome
    render(:new)
  end
end

Note that we have to pass a hash to .run. Passing nil is an error.

Since we're using an interaction, we don't need strong parameters. The interaction will ignore any inputs that weren't defined by filters. So you can forget about params.require and params.permit because interactions handle that for you.

Destroy

The destroy action will reuse the #find_account! helper method we wrote earlier.

# DELETE /accounts/:id
def destroy
  DestroyAccount.run!(account: find_account!)
  redirect_to(accounts_url)
end

In this simple example, the destroy interaction doesn't do much. It's not clear that you gain anything by putting it in an interaction. But in the future, when you need to do more than account.destroy, you'll only have to update one spot.

class DestroyAccount < ActiveInteraction::Base
  object :account

  def execute
    account.destroy
  end
end

Edit

Just like the destroy action, editing uses the #find_account! helper. Then it creates a new interaction instance to use as a form object.

# GET /accounts/:id/edit
def edit
  account = find_account!
  @account = UpdateAccount.new(
    account: account,
    first_name: account.first_name,
    last_name: account.last_name)
end

The interaction that updates accounts is more complicated than the others. It requires an account to update, but the other inputs are optional. If they're missing, it'll ignore those attributes. If they're present, it'll update them.

class UpdateAccount < ActiveInteraction::Base
  object :account

  string :first_name, :last_name,
    default: nil

  validates :first_name,
    presence: true,
    unless: -> { first_name.nil? }
  validates :last_name,
    presence: true,
    unless: -> { last_name.nil? }

  def execute
    account.first_name = first_name if first_name.present?
    account.last_name = last_name if last_name.present?

    unless account.save
      errors.merge!(account.errors)
    end

    account
  end
end

Update

Hopefully you've gotten the hang of this by now. We'll use #find_account! to get the account. Then we'll build up the inputs for UpdateAccount. Then we'll run the interaction and either redirect to the updated account or back to the edit page.

# PUT /accounts/:id
def update
  inputs = { account: find_account! }.reverse_merge(params[:account])
  outcome = UpdateAccount.run(inputs)

  if outcome.valid?
    redirect_to(outcome.result)
  else
    @account = outcome
    render(:edit)
  end
end

Advanced usage

Callbacks

ActiveSupport::Callbacks provides a powerful framework for defining callbacks. ActiveInteraction uses that framework to allow hooking into various parts of an interaction's lifecycle.

class Increment < ActiveInteraction::Base
  set_callback :filter, :before, -> { puts 'before filter' }

  integer :x

  set_callback :validate, :after, -> { puts 'after validate' }

  validates :x,
    numericality: { greater_than_or_equal_to: 0 }

  set_callback :execute, :around, lambda { |_interaction, block|
    puts '>>>'
    block.call
    puts '<<<'
  }

  def execute
    puts 'executing'
    x + 1
  end
end

Increment.run!(x: 1)
# before filter
# after validate
# >>>
# executing
# <<<
# => 2

In order, the available callbacks are filter, validate, and execute. You can set before, after, or around on any of them.

Composition

You can run interactions from within other interactions with #compose. If the interaction is successful, it'll return the result (just like if you had called it with .run!). If something went wrong, execution will halt immediately and the errors will be moved onto the caller.

class Add < ActiveInteraction::Base
  integer :x, :y

  def execute
    x + y
  end
end

class AddThree < ActiveInteraction::Base
  integer :x

  def execute
    compose(Add, x: x, y: 3)
  end
end

AddThree.run!(x: 5)
# => 8

To bring in filters from another interaction, use .import_filters. Combined with inputs, delegating to another interaction is a piece of cake.

class AddAndDouble < ActiveInteraction::Base
  import_filters Add

  def execute
    compose(Add, inputs) * 2
  end
end

Note that errors in composed interactions have a few tricky cases. See the errors section for more information about them.

Defaults

The default value for an input can take on many different forms. Setting the default to nil makes the input optional. Setting it to some value makes that the default value for that input. Setting it to a lambda will lazily set the default value for that input. That means the value will be computed when the interaction is run, as opposed to when it is defined.

Lambda defaults are evaluated in the context of the interaction, so you can use the values of other inputs in them.

# This input is optional.
time :a, default: nil
# This input defaults to `Time.at(123)`.
time :b, default: Time.at(123)
# This input lazily defaults to `Time.now`.
time :c, default: -> { Time.now }
# This input defaults to the value of `c` plus 10 seconds.
time :d, default: -> { c + 10 }
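
One way to picture "evaluated in the context of the interaction" is Ruby's instance_exec, which runs a lambda with self set to a given object. The LazyDefaults class below is a hypothetical plain-Ruby sketch of that idea. It is an assumption about the mechanism, not the gem's source.

```ruby
# Hypothetical sketch: defaults are lambdas evaluated against the instance.
class LazyDefaults
  DEFAULTS = {
    c: -> { Time.at(100) },
    d: -> { c + 10 } # refers to the value of another input
  }.freeze

  def initialize(inputs = {})
    @inputs = inputs
    @values = {}
  end

  DEFAULTS.each_key do |name|
    define_method(name) do
      @values.fetch(name) do
        default = DEFAULTS.fetch(name)
        # The default is computed only when the input was not given.
        @values[name] = @inputs.fetch(name) { instance_exec(&default) }
      end
    end
  end
end

puts LazyDefaults.new.d == Time.at(110)               # true
puts LazyDefaults.new(c: Time.at(0)).d == Time.at(10) # true
```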

Descriptions

Use the desc option to provide human-readable descriptions of filters. You should prefer these to comments because they can be used to generate documentation. The interaction class has a .filters method that returns a hash of filters. Each filter has a #desc method that returns the description.

class Descriptive < ActiveInteraction::Base
  string :first_name,
    desc: 'your first name'
  string :last_name,
    desc: 'your last name'
end

Descriptive.filters.each do |name, filter|
  puts "#{name}: #{filter.desc}"
end
# first_name: your first name
# last_name: your last name

Errors

ActiveInteraction provides detailed errors for easier introspection and testing of errors. Detailed errors improve on regular errors by adding a symbol that represents the type of error that has occurred. Let's look at an example where an item is purchased using a credit card.

class BuyItem < ActiveInteraction::Base
  object :credit_card, :item
  hash :options do
    boolean :gift_wrapped
  end

  def execute
    order = credit_card.purchase(item)
    notify(credit_card.account)
    order
  end

  private def notify(account)
    # ...
  end
end

Having missing or invalid inputs causes the interaction to fail and return errors.

outcome = BuyItem.run(item: 'Thing', options: { gift_wrapped: 'yes' })
outcome.errors.messages
# => {:credit_card=>["is required"], :item=>["is not a valid object"], :"options.gift_wrapped"=>["is not a valid boolean"]}

Determining the type of error based on the string is difficult if not impossible. Calling #details instead of #messages on errors gives you the same list of errors with a testable label representing the error.

outcome.errors.details
# => {:credit_card=>[{:error=>:missing}], :item=>[{:error=>:invalid_type, :type=>"object"}], :"options.gift_wrapped"=>[{:error=>:invalid_type, :type=>"boolean"}]}

Detailed errors can also be manually added during the execute call by passing a symbol to #add instead of a string.

def execute
  errors.add(:monster, :no_passage)
end

ActiveInteraction also supports merging errors. This is useful if you want to delegate validation to some other object. For example, if you have an interaction that updates a record, you might want that record to validate itself. By using the #merge! helper on errors, you can do exactly that.

class UpdateThing < ActiveInteraction::Base
  object :thing

  def execute
    unless thing.save
      errors.merge!(thing.errors)
    end

    thing
  end
end

When a composed interaction fails, its errors are merged onto the caller. This generally produces good error messages, but there are a few cases to look out for.

class Inner < ActiveInteraction::Base
  boolean :x, :y
end

class Outer < ActiveInteraction::Base
  string :x
  boolean :z, default: nil

  def execute
    compose(Inner, x: x, y: z)
  end
end

outcome = Outer.run(x: 'yes')
outcome.errors.details
# => { :x    => [{ :error => :invalid_type, :type => "boolean" }],
#      :base => [{ :error => "Y is required" }] }
outcome.errors.full_messages.join(' and ')
# => "X is not a valid boolean and Y is required"

Since both interactions have an input called x, the inner error for that input is moved to the x error on the outer interaction. This results in a misleading error that claims the input x is not a valid boolean even though it's a string on the outer interaction.

Since only the inner interaction has an input called y, the inner error for that input is moved to the base error on the outer interaction. This results in a confusing error that claims the input y is required even though it's not present on the outer interaction.

Forms

The outcome returned by .run can be used in forms as though it were an ActiveModel object. You can also create a form object by calling .new on the interaction.

Given an application with an Account model we'll create a new Account using the CreateAccount interaction.

# GET /accounts/new
def new
  @account = CreateAccount.new
end

# POST /accounts
def create
  outcome = CreateAccount.run(params.fetch(:account, {}))

  if outcome.valid?
    redirect_to(outcome.result)
  else
    @account = outcome
    render(:new)
  end
end

The form used to create a new Account has slightly more information on the form_for call than you might expect.

<%= form_for @account, as: :account, url: accounts_path do |f| %>
  <%= f.text_field :first_name %>
  <%= f.text_field :last_name %>
  <%= f.submit 'Create' %>
<% end %>

This is necessary because we want the form to act like it is creating a new Account. Defining to_model on the CreateAccount interaction tells the form to treat our interaction like an Account.

class CreateAccount < ActiveInteraction::Base
  # ...

  def to_model
    Account.new
  end
end

Now our form_for call knows how to generate the correct URL and param name (i.e. params[:account]).

# app/views/accounts/new.html.erb
<%= form_for @account do |f| %>
  <%# ... %>
<% end %>

If you have an interaction that updates an Account, you can define to_model to return the object you're updating.

class UpdateAccount < ActiveInteraction::Base
  # ...

  object :account

  def to_model
    account
  end
end

ActiveInteraction also supports formtastic and simple_form. The filters used to define the inputs on your interaction will relay type information to these gems. As a result, form fields will automatically use the appropriate input type.

Shared input options

It can be convenient to apply the same options to a bunch of inputs. One common use case is making many inputs optional. Instead of setting default: nil on each one of them, you can use with_options to reduce duplication.

with_options default: nil do
  date :birthday
  string :name
  boolean :wants_cake
end

Optional inputs

Optional inputs can be defined by using the :default option as described in the filters section. Within the interaction, provided and default values are merged to create inputs. There are times when it is useful to know whether a value was passed to run or is the result of a filter default. In particular, it is useful when nil is an acceptable value. For example, you may optionally track your users' birthdays. You can use the inputs.given? predicate to see if an input was even passed to run. With inputs.given? you can also check the input of a hash or array filter by passing a series of keys or indexes to check.

class UpdateUser < ActiveInteraction::Base
  object :user
  date :birthday,
    default: nil

  def execute
    user.birthday = birthday if inputs.given?(:birthday)
    errors.merge!(user.errors) unless user.save
    user
  end
end

Now you have a few options. If you don't want to update their birthday, leave it out of the hash. If you want to remove their birthday, set birthday: nil. And if you want to update it, pass in the new value as usual.

user = User.find(...)

# Don't update their birthday.
UpdateUser.run!(user: user)

# Remove their birthday.
UpdateUser.run!(user: user, birthday: nil)

# Update their birthday.
UpdateUser.run!(user: user, birthday: Date.new(2000, 1, 2))

Translations

ActiveInteraction is i18n aware out of the box! All you have to do is add translations to your project. In Rails, these typically go into config/locales. For example, let's say that for some reason you want to print everything out backwards. Simply add translations for ActiveInteraction to your hsilgne locale.

# config/locales/hsilgne.yml
hsilgne:
  active_interaction:
    types:
      array: yarra
      boolean: naeloob
      date: etad
      date_time: emit etad
      decimal: lamiced
      file: elif
      float: taolf
      hash: hsah
      integer: regetni
      interface: ecafretni
      object: tcejbo
      string: gnirts
      symbol: lobmys
      time: emit
    errors:
      messages:
        invalid: dilavni si
        invalid_type: '%{type} dilav a ton si'
        missing: deriuqer si

Then set your locale and run interactions like normal.

class I18nInteraction < ActiveInteraction::Base
  string :name
end

I18nInteraction.run(name: false).errors.messages[:name]
# => ["is not a valid string"]

I18n.locale = :hsilgne
I18nInteraction.run(name: false).errors.messages[:name]
# => ["gnirts dilav a ton si"]

Everything else works like an ActiveRecord entry. For example, to rename an attribute you can use attributes.

Here we'll rename the num attribute on an interaction named product:

en:
  active_interaction:
    attributes:
      product:
        num: 'Number'

Credits

ActiveInteraction is brought to you by Aaron Lasseigne. Along with Aaron, Taylor Fausak helped create and maintain ActiveInteraction but has since moved on.

If you want to contribute to ActiveInteraction, please read our contribution guidelines. A complete list of contributors is available on GitHub.

ActiveInteraction is licensed under the MIT License.


Author: AaronLasseigne
Source code: https://github.com/AaronLasseigne/active_interaction
License: MIT license

#ruby 

Anissa  Barrows

Anissa Barrows

1669099573

What Is Face Recognition? Facial Recognition with Python and OpenCV

In this article, we will learn what face recognition is and how it differs from face detection. We will briefly go over the theory of face recognition and then jump to the coding section. By the end of this article, you will be able to write a face recognition program that recognises faces in images as well as in a live webcam feed.

What is Face Detection?

In computer vision, one essential problem is to automatically detect objects in an image without human intervention. Face detection is one such problem: we detect human faces in an image. Individual faces differ slightly, but overall it is safe to say that certain features are common to all human faces. There are various face detection algorithms; the Viola-Jones algorithm is one of the oldest methods still in use today, and we will use it later in the article. You can read about the Viola-Jones algorithm after completing this article, as I’ll link it at the end.

Face detection is usually the first step towards many face-related technologies, such as face recognition or verification. However, face detection also has very useful applications of its own. The most successful of these is probably photo taking: when you take a photo of your friends, the face detection algorithm built into your digital camera detects where the faces are and adjusts the focus accordingly.

What is Face Recognition?

Now that we have algorithms that can detect faces, can we also recognise whose faces they are?

Face recognition is a method of identifying or verifying the identity of an individual using their face. There are various algorithms that can do face recognition but their accuracy might vary. Here I am going to describe how we do face recognition using deep learning.

So let us now understand how we recognise faces using deep learning. We use face embeddings, in which each face is converted into a vector; this technique is called deep metric learning. Let me divide the process into three simple steps for easy understanding:

Face Detection: The very first task is to detect faces in the image or video stream. Once we know the exact location/coordinates of a face, we extract it for further processing.
 

Feature Extraction: Now that we have cropped the face out of the image, we extract features from it. Here we use face embeddings to do so. A neural network takes an image of a person’s face as input and outputs a vector that represents the most important features of the face. In machine learning, such a vector is called an embedding, so we call this vector a face embedding. Now how does this help in recognising the faces of different people?
 

During training, the network learns to output similar vectors for faces that look similar. For example, if I have multiple images of my face taken at different times, some features of my face might change, but not by much. So the vectors associated with these faces are similar or, in short, very close to each other in the vector space. Take a look at the diagram below for a rough idea:

After training, the network outputs vectors that are closer to each other (more similar) for faces of the same person (which look similar). The above vectors now transform into:

We are not going to train such a network here as it takes a significant amount of data and computation power to train such networks. We will use a pre-trained network trained by Davis King on a dataset of ~3 million images. The network outputs a vector of 128 numbers which represent the most important features of a face.

Now that we know how this network works, let us see how we use this network on our own data. We pass all the images in our data to this pre-trained network to get the respective embeddings and save these embeddings in a file for the next step.

Comparing faces: Now that we have face embeddings for every face in our data saved in a file, the next step is to recognise a new image that is not in our data. First we compute the face embedding for the new image using the same network we used above, and then we compare this embedding with the embeddings we already have. We recognise the face if the generated embedding is close or similar to any other embedding, as shown below:

So we passed in two images, one of Vladimir Putin and the other of George W. Bush. In our example above, we did not save embeddings for Putin, but we did save embeddings for Bush. Thus, when we compare the two new embeddings with the existing ones, the vector for Bush is close to the other face embeddings of Bush, whereas the face embedding of Putin is not close to any other embedding, so the program cannot recognise him.
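
The comparison step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how the pre-trained network or the face_recognition library is implemented: the three-number vectors stand in for real 128-number embeddings, the labels and values are made up, and the 0.6 threshold is an assumed cut-off chosen to mirror the default tolerance of face_recognition.compare_faces.

```python
import math

# Toy 3-number "embeddings" (made-up values); a real network outputs 128 numbers per face.
known_embeddings = {
    "bush_1": [0.10, 0.80, 0.30],
    "bush_2": [0.12, 0.78, 0.33],
}


def closest_match(embedding, known, threshold=0.6):
    """Return the label of the nearest known embedding, or None if none is close enough."""
    best_label, best_distance = None, float("inf")
    for label, vector in known.items():
        distance = math.dist(embedding, vector)  # Euclidean distance between the two vectors
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= threshold else None


new_bush = [0.11, 0.79, 0.31]   # close to the saved Bush vectors
new_putin = [0.90, 0.10, 0.95]  # far from everything we saved

print(closest_match(new_bush, known_embeddings))
print(closest_match(new_putin, known_embeddings))
```

The first call finds a saved vector within the threshold and returns its label; the second finds nothing close enough and returns None, which corresponds to the "cannot recognise him" case described above.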

What is OpenCV?

In the field of Artificial Intelligence, computer vision is one of the most interesting and challenging areas. Computer vision acts as a bridge between computer software and the visual world around us. It allows software to understand and learn from what it sees in its surroundings. For example, determining which fruit is shown based on its colour, shape and size. This task is very easy for the human brain; in a computer vision pipeline, however, we first gather the data, then perform data processing, and then train the model to distinguish between fruits based on their size, shape and colour.

Currently, various packages are available for machine learning, deep learning and computer vision tasks. OpenCV is one of the most widely used libraries for such computer vision work. It is an open-source library that can be used from several programming languages, such as Python and R, and it runs on most platforms, including Windows, Linux and MacOS.

To know more about how face recognition works on opencv, check out the free course on face recognition in opencv.

Advantages of OpenCV:

  • OpenCV is an open-source library and is free of cost.
  • As compared to other libraries, it is fast since it is written in C/C++.
  • It works well on systems with less RAM.
  • It supports most operating systems, such as Windows, Linux and MacOS.

Installation: 

Here we will focus on installing OpenCV for Python only. We can install OpenCV using pip or conda (for an Anaconda environment).

  1. Using pip: 

Using pip, OpenCV can be installed by running the following command in the command prompt.

pip install opencv-python

  2. Anaconda:

If you are using an Anaconda environment, you can either run the above command in the Anaconda prompt or run the following command instead.

conda install -c conda-forge opencv

Face Recognition using Python

In this section, we shall implement face recognition using OpenCV and Python. First, let us see the libraries we will need and how to install them:

  • OpenCV
  • dlib
  • face_recognition

OpenCV is an image and video processing library and is used for image and video analysis, like facial detection, license plate reading, photo editing, advanced robotic vision, optical character recognition, and a whole lot more.
 

The dlib library, maintained by Davis King, contains an implementation of “deep metric learning”, which is used to construct the face embeddings used for the actual recognition process.
 

The face_recognition library, created by Adam Geitgey, wraps around dlib’s facial recognition functionality. It is very easy to work with, and we will use it in our code. Remember to install the dlib library before you install face_recognition.
 

To install OpenCV, type the following in the command prompt:
 

pip install opencv-python

I have tried various ways to install dlib on Windows but the easiest of all of them is via Anaconda. First, install Anaconda (here is a guide to install it) and then use this command in your command prompt:
 

conda install -c conda-forge dlib

Next, to install face_recognition, type the following in the command prompt:

pip install face_recognition

Now that we have all the dependencies installed, let us start coding. We will create three files. The first takes our dataset, extracts a face embedding for each face using dlib, and saves these embeddings in a file.
 

In the other two files, we compare new faces with the saved embeddings: one recognises faces in images and the other recognises faces in a live webcam feed.
 

Extracting features from Face

First, you need to get a dataset or create one of your own. Just make sure to arrange the images in folders, with each folder containing images of just one person.
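
The folder arrangement matters because the folder name becomes the label for every image inside it. The sketch below illustrates this with a hypothetical layout; the folder and file names are invented for the example, and PurePosixPath is used only so that the forward-slash paths behave the same on every operating system.

```python
from pathlib import PurePosixPath

# Hypothetical dataset layout: one sub-folder per person inside "Images".
image_paths = [
    "Images/george_bush/001.jpg",
    "Images/george_bush/002.jpg",
    "Images/barack_obama/001.jpg",
]


def person_name(image_path):
    """The label for an image is the name of the folder that contains it."""
    return PurePosixPath(image_path).parent.name


labels = [person_name(p) for p in image_paths]
print(labels)  # ['george_bush', 'george_bush', 'barack_obama']
```

If images of two people end up in the same folder, both will be stored under the same name, so it is worth checking the layout before extracting embeddings.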

Next, save the dataset in the same folder in which you are going to create the script. Now here is the code:

from imutils import paths
import face_recognition
import pickle
import cv2
import os

# get the path of each image file in the folder named Images
# Images here contains my data (one folder per person)
imagePaths = list(paths.list_images('Images'))
knownEncodings = []
knownNames = []

# loop over the image paths
for imagePath in imagePaths:
    # extract the person name from the image path (the parent folder name)
    name = imagePath.split(os.path.sep)[-2]
    # load the input image and convert it from BGR (OpenCV ordering)
    # to dlib ordering (RGB)
    image = cv2.imread(imagePath)
    if image is None:
        # skip files that OpenCV cannot read
        continue
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # use face_recognition to locate faces
    boxes = face_recognition.face_locations(rgb, model='hog')
    # compute the facial embedding for each face
    encodings = face_recognition.face_encodings(rgb, boxes)
    # loop over the encodings
    for encoding in encodings:
        knownEncodings.append(encoding)
        knownNames.append(name)

# save encodings along with their names in the dictionary data
data = {"encodings": knownEncodings, "names": knownNames}
# use pickle to save data into a file for later use
with open("face_enc", "wb") as f:
    f.write(pickle.dumps(data))

Now that we have stored the embeddings in a file named “face_enc”, we can use them to recognise faces in images or in a live video stream.
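
Before moving on, it can help to confirm that the saved file loads back correctly. The following is a small sketch using made-up data in place of real embeddings and a temporary folder instead of the real “face_enc” file, so it runs without dlib or a dataset; only the dictionary shape ("encodings" and "names" as parallel lists) is taken from the script above.

```python
import os
import pickle
import tempfile

# Made-up stand-in for the dictionary saved by the extraction script.
data = {"encodings": [[0.1, 0.2], [0.3, 0.4]], "names": ["alice", "alice"]}

# Write to a temporary folder so the example does not touch a real face_enc file.
path = os.path.join(tempfile.mkdtemp(), "face_enc")
with open(path, "wb") as f:
    pickle.dump(data, f)

with open(path, "rb") as f:
    loaded = pickle.load(f)

# Every encoding must have exactly one name at the same index.
assert len(loaded["encodings"]) == len(loaded["names"])
print(loaded["names"])
```

Note that pickle files should only be loaded when you created them yourself, since unpickling untrusted data can execute arbitrary code.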

Face Recognition in Live webcam Feed

Here is the script to recognise faces on a live webcam feed:

import face_recognition
import pickle
import cv2
import os

# find the path of the xml file containing the haarcascade
cascPathface = os.path.dirname(
    cv2.__file__) + "/data/haarcascade_frontalface_alt2.xml"
# load the haarcascade in the cascade classifier
faceCascade = cv2.CascadeClassifier(cascPathface)
# load the known faces and embeddings saved in the last file
with open('face_enc', "rb") as f:
    data = pickle.loads(f.read())

print("Streaming started")
video_capture = cv2.VideoCapture(0)
# loop over frames from the webcam stream
while True:
    # grab the next frame; stop if the camera returns nothing
    ret, frame = video_capture.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(gray,
                                         scaleFactor=1.1,
                                         minNeighbors=5,
                                         minSize=(60, 60),
                                         flags=cv2.CASCADE_SCALE_IMAGE)
    # convert the input frame from BGR to RGB
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # compute the facial embeddings for the faces in the frame
    encodings = face_recognition.face_encodings(rgb)
    names = []
    # loop over the facial embeddings in case
    # we have multiple embeddings for multiple faces
    for encoding in encodings:
        # compare this encoding with the encodings in data["encodings"];
        # matches holds True for the embeddings it matches closely
        # and False for the rest
        matches = face_recognition.compare_faces(data["encodings"],
                                                 encoding)
        # the name stays "Unknown" if no encoding matches
        name = "Unknown"
        # check to see if we have found a match
        if True in matches:
            # find the positions at which we get True and store them
            matchedIdxs = [i for (i, b) in enumerate(matches) if b]
            counts = {}
            # loop over the matched indexes and maintain a count for
            # each recognized face
            for i in matchedIdxs:
                # look up the name stored at this index
                name = data["names"][i]
                # increase the count for the name we got
                counts[name] = counts.get(name, 0) + 1
            # pick the name with the highest count
            name = max(counts, key=counts.get)
        # update the list of names (including "Unknown")
        names.append(name)

    # loop over the detected faces once all names are known and
    # draw the box and the predicted name on the frame
    # (boxes and names are paired by position)
    for ((x, y, w, h), name) in zip(faces, names):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, name, (x, y), cv2.FONT_HERSHEY_SIMPLEX,
                    0.75, (0, 255, 0), 2)
    cv2.imshow("Frame", frame)
    # press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video_capture.release()
cv2.destroyAllWindows()
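
The voting step in the script above can be looked at in isolation. The sketch below is a simplified stand-in that uses collections.Counter instead of a hand-built dictionary; the function name vote and the sample names are made up for illustration, while the inputs mirror the list of booleans returned by compare_faces and the parallel data["names"] list.

```python
from collections import Counter


def vote(matches, names, unknown="Unknown"):
    """Pick the name that matched most often; matches and names are parallel lists."""
    counts = Counter(name for matched, name in zip(matches, names) if matched)
    if not counts:
        # no saved embedding was close enough
        return unknown
    return counts.most_common(1)[0][0]


names = ["bush", "bush", "obama", "bush", "obama"]
print(vote([True, True, True, False, False], names))  # two votes for bush, one for obama
print(vote([False] * 5, names))                       # nothing matched
```

Counting votes rather than taking the first match makes the result less sensitive to a single saved embedding that happens to lie close to the wrong person.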

https://www.youtube.com/watch?v=fLnGdkZxRkg

Although in the example above we have used a Haar cascade to detect faces, you can also use face_recognition.face_locations to detect a face, as we did in the previous script.
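
If you switch to face_recognition.face_locations, keep in mind that it returns each box as (top, right, bottom, left), whereas the drawing code expects OpenCV-style (x, y, w, h). A small conversion helper, sketched below with a hypothetical name and made-up coordinates, bridges the two. Passing the same boxes to face_recognition.face_encodings(rgb, boxes) also keeps boxes and encodings in the same order, which the Haar cascade approach does not guarantee.

```python
def css_box_to_xywh(box):
    """Convert a (top, right, bottom, left) box to OpenCV-style (x, y, w, h)."""
    top, right, bottom, left = box
    return (left, top, right - left, bottom - top)


# Made-up box in the order used by face_recognition.face_locations.
print(css_box_to_xywh((50, 220, 180, 90)))  # (90, 50, 130, 130)
```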

Face Recognition in Images

The script for detecting and recognising faces in images is almost the same as what you saw above. Try it yourself, and if you get stuck, take a look at the code below:

import face_recognition
import pickle
import cv2
import os

# find the path of the xml file containing the haarcascade
cascPathface = os.path.dirname(
    cv2.__file__) + "/data/haarcascade_frontalface_alt2.xml"
# load the haarcascade in the cascade classifier
faceCascade = cv2.CascadeClassifier(cascPathface)
# load the known faces and embeddings saved in the last file
with open('face_enc', "rb") as f:
    data = pickle.loads(f.read())

# placeholder: set this to the path of the image you want to recognise faces in
image_path = "Path-to-img"
image = cv2.imread(image_path)
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# convert the image to greyscale for the haarcascade
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = faceCascade.detectMultiScale(gray,
                                     scaleFactor=1.1,
                                     minNeighbors=5,
                                     minSize=(60, 60),
                                     flags=cv2.CASCADE_SCALE_IMAGE)

# compute the facial embeddings for the faces in the image
encodings = face_recognition.face_encodings(rgb)
names = []
# loop over the facial embeddings in case
# we have multiple embeddings for multiple faces
for encoding in encodings:
    # compare this encoding with the encodings in data["encodings"];
    # matches holds True for the embeddings it matches closely
    # and False for the rest
    matches = face_recognition.compare_faces(data["encodings"],
                                             encoding)
    # the name stays "Unknown" if no encoding matches
    name = "Unknown"
    # check to see if we have found a match
    if True in matches:
        # find the positions at which we get True and store them
        matchedIdxs = [i for (i, b) in enumerate(matches) if b]
        counts = {}
        # loop over the matched indexes and maintain a count for
        # each recognized face
        for i in matchedIdxs:
            # look up the name stored at this index
            name = data["names"][i]
            # increase the count for the name we got
            counts[name] = counts.get(name, 0) + 1
        # pick the name with the highest count
        name = max(counts, key=counts.get)
    # update the list of names (including "Unknown")
    names.append(name)

# loop over the detected faces once all names are known and
# draw the box and the predicted name on the image
# (boxes and names are paired by position)
for ((x, y, w, h), name) in zip(faces, names):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, name, (x, y), cv2.FONT_HERSHEY_SIMPLEX,
                0.75, (0, 255, 0), 2)

# show the result until a key is pressed
cv2.imshow("Frame", image)
cv2.waitKey(0)

Output:

[Images: input and output]

This brings us to the end of this article where we learned about face recognition.

You can also upskill with Great Learning’s PGP Artificial Intelligence and Machine Learning Course. The course offers mentorship from industry leaders, and you will also have the opportunity to work on real-time industry-relevant projects.


Original article source at: https://www.mygreatlearning.com

#python #opencv 

Muhammad  Price

Muhammad Price

1659511140

Roadie: Making HTML Emails Comfortable for The Ruby Rockstars

Roadie 

  
Warning: This gem is now in passive maintenance mode.

Making HTML emails comfortable for the Ruby rockstars

Roadie tries to make sending HTML emails a little less painful by inlining stylesheets and rewriting relative URLs for you inside your emails.

How does it work?

Email clients have bad support for stylesheets, and some of them block stylesheets from downloading. The easiest way to handle this is to work with inline styles (style="..."), but that is error prone and hard to work with as you cannot use classes and/or reuse styling over your HTML.

This gem makes this easier by automatically inlining stylesheets into the document. You give Roadie your CSS, or let it find it by itself from the <link> and <style> tags in the markup, and it will go through all of the selectors assigning the styles to the matching elements. Careful attention has been put into selectors being applied in the correct order, so it should behave just like in the browser.

"Dynamic" selectors (:hover, :visited, :focus, etc.), or selectors not understood by Nokogiri will be inlined into a single <style> element for those email clients that support it. This changes specificity a great deal for these rules, so it might not work 100% out of the box. (See more about this below)

Roadie also rewrites all relative URLs in the email to an absolute counterpart, making images you insert and those referenced in your stylesheets work. No more headaches about how to write the stylesheets while still having them work with emails from your acceptance environments. You can disable this on specific elements using a data-roadie-ignore marker.

Features

  • Writes CSS styles inline.
    • Respects !important styles.
    • Does not overwrite styles already present in the style attribute of tags.
    • Supports the same CSS selectors as Nokogiri; use CSS3 selectors in your emails!
    • Keeps :hover, @media { ... } and friends around in a separate <style> element.
  • Makes image urls absolute.
    • Hostname and port configurable on a per-environment basis.
    • Can be disabled on individual elements.
  • Makes link hrefs and img srcs absolute.
  • Automatically adds proper HTML skeleton when missing; you don't have to create a layout for emails.
    • Also supports HTML fragments / partial documents, where layout is not added.
  • Allows you to inject stylesheets in a number of ways, at runtime.
  • Removes data-roadie-ignore markers before finishing the HTML.

Install & Usage

Add this gem to your Gemfile as recommended by Rubygems and run bundle install.

gem 'roadie', '~> 4.0'

Your document instance can be configured with several options:

  • url_options - Dictates how absolute URLs should be built.
  • keep_uninlinable_css - Set to false to skip CSS that cannot be inlined.
  • merge_media_queries - Set to false to not group media queries. Some users might prefer to not group rules within media queries because it will result in rules getting reordered. For example,
@media(max-width: 600px) { .col-6 { display: block; } }
@media(max-width: 400px) { .col-12 { display: inline-block; } }
@media(max-width: 600px) { .col-12 { display: block; } }
    will become
@media(max-width: 600px) { .col-6 { display: block; } .col-12 { display: block; } }
@media(max-width: 400px) { .col-12 { display: inline-block; } }
  • asset_providers - A list of asset providers that are invoked when CSS files are referenced. See below.
  • external_asset_providers - A list of asset providers that are invoked when absolute CSS URLs are referenced. See below.
  • before_transformation - A callback run before transformation starts.
  • after_transformation - A callback run after transformation is completed.

Making URLs absolute

In order to make URLs absolute you need to first configure the URL options of the document.

html = '... <a href="/about-us">Read more!</a> ...'
document = Roadie::Document.new html
document.url_options = {host: "myapp.com", protocol: "https"}
document.transform
  # => "... <a href=\"https://myapp.com/about-us\">Read more!</a> ..."

The following URLs will be rewritten for you:

  • a[href] (HTML)
  • img[src] (HTML)
  • url() (CSS)

You can disable rewriting on individual elements by adding a data-roadie-ignore marker on them. CSS will still be inlined on those elements, but URLs will not be rewritten.

<a href="|UNSUBSCRIBE_URL|" data-roadie-ignore>Unsubscribe</a>

Referenced stylesheets

By default, style and link elements in the email document's head are processed along with the stylesheets and removed from the head.

You can set a special data-roadie-ignore attribute on style and link tags that you want to ignore (the attribute will be removed, however). This is the place to put things like :hover selectors that you want to have for email clients allowing them.

Style and link elements with media="print" are also ignored.

<head>
  <link rel="stylesheet" type="text/css" href="/assets/emails/rock.css">         <!-- Will be inlined with normal providers -->
  <link rel="stylesheet" type="text/css" href="http://www.metal.org/metal.css">  <!-- Will be inlined with external providers, *IF* specified; otherwise ignored. -->
  <link rel="stylesheet" type="text/css" href="/assets/jazz.css" media="print">  <!-- Will NOT be inlined; print style -->
  <link rel="stylesheet" type="text/css" href="/ambient.css" data-roadie-ignore> <!-- Will NOT be inlined; ignored -->
  <style></style>                    <!-- Will be inlined -->
  <style data-roadie-ignore></style> <!-- Will NOT be inlined; ignored -->
</head>

Roadie will use the given asset providers to look for the actual CSS that is referenced. If you don't change the default, it will use the Roadie::FilesystemProvider which looks for stylesheets on the filesystem, relative to the current working directory.

Example:

# /home/user/foo/stylesheets/primary.css
body { color: green; }

# /home/user/foo/script.rb
html = <<-HTML
<html>
  <head>
  <link rel="stylesheet" type="text/css" href="/stylesheets/primary.css">
  </head>
  <body>
  </body>
</html>
HTML

Dir.pwd # => "/home/user/foo"
document = Roadie::Document.new html
document.transform # =>
                   # <!DOCTYPE html>
                   # <html>
                   #   <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>
                   #   <body style="color:green;"></body>
                   # </html>

If a referenced stylesheet cannot be found, the #transform method will raise a Roadie::CssNotFound error. If you instead want to ignore missing stylesheets, you can use the NullProvider.

Configuring providers

You can write your own providers if you need very specific behavior for your app, or you can use the built-in providers. Providers come in two groups: normal and external. Normal providers handle paths without host information (/style/foo.css) while external providers handle URLs with host information (//example.com/foo.css, localhost:3001/bar.css, and so on).

The default configuration is to not have any external providers configured, which will cause those referenced stylesheets to be ignored. Adding one or more providers for external assets causes all of them to be searched and inlined, so if you only want this to happen to specific stylesheets you need to add ignore markers to every other stylesheet (see above).

Included providers:

  • FilesystemProvider – Looks for files on the filesystem, relative to the given directory unless otherwise specified.
  • ProviderList – Wraps a list of other providers and searches them in order. The asset_providers setting is an instance of this. It behaves a lot like an array, so you can push, pop, shift and unshift to it.
  • NullProvider – Does not actually provide anything, it always finds empty stylesheets. Use this in tests or if you want to ignore stylesheets that cannot be found by your other providers (or if you want to force the other providers to never run).
  • NetHttpProvider – Downloads stylesheets using Net::HTTP. Can be given a whitelist of hosts to download from.
  • CachedProvider – Wraps another provider (or ProviderList) and caches responses inside the provided cache store.
  • PathRewriterProvider – Rewrites the passed path and then passes it on to another provider (or ProviderList).

If you want to search several locations on the filesystem, you can declare that:

document.asset_providers = [
  Roadie::FilesystemProvider.new(App.root.join("resources", "stylesheets")),
  Roadie::FilesystemProvider.new(App.root.join("system", "uploads", "stylesheets")),
]

NullProvider

If you want to ignore stylesheets that cannot be found instead of crashing, push the NullProvider to the end:

# Don't crash on missing assets
document.asset_providers << Roadie::NullProvider.new

# Don't download assets in tests
document.external_asset_providers.unshift Roadie::NullProvider.new

Note: This will cause the referenced stylesheet to be removed from the source code, so email clients will never see it either.

NetHttpProvider

The NetHttpProvider will download the URLs that it is given using Ruby's standard Net::HTTP library.

You can give it a whitelist of hosts that downloads are allowed from:

document.external_asset_providers << Roadie::NetHttpProvider.new(
  whitelist: ["myapp.com", "assets.myapp.com", "cdn.cdnnetwork.co.jp"],
)
document.external_asset_providers << Roadie::NetHttpProvider.new # Allows every host

CachedProvider

You might want to cache the results of providers so they do not repeat the same work several times. If you are sending several emails quickly from the same process, this might also save a lot of time on parsing the stylesheets if you use in-memory storage such as a hash.

You can wrap any other kind of providers with it, even a ProviderList:

document.external_asset_providers = Roadie::CachedProvider.new(document.external_asset_providers, my_cache)

If you don't pass a cache backend, it will use a normal Hash. The cache store must follow this protocol:

my_cache["key"] = some_stylesheet_instance # => #<Roadie::Stylesheet instance>
my_cache["key"]                            # => #<Roadie::Stylesheet instance>
my_cache["missing"]                        # => nil

Warning: The default Hash store will never be cleared, so make sure you don't allow the number of unique asset paths to grow too large in a single run. This is especially important if you run Roadie in a daemon that accepts arbitrary documents, and/or if you use hash digests in your filenames. Making a new instance of CachedProvider will use a new Hash instance.

You can implement your own custom cache store by implementing the [] and []= methods.

class MyRoadieMemcacheStore
  def initialize(memcache)
    @memcache = memcache
  end

  def [](path)
    css = @memcache.read("assets/#{path}/css")
    if css
      name = @memcache.read("assets/#{path}/name") || "cached #{path}"
      Roadie::Stylesheet.new(name, css)
    end
  end

  def []=(path, stylesheet)
    @memcache.write("assets/#{path}/css", stylesheet.to_s)
    @memcache.write("assets/#{path}/name", stylesheet.name)
    stylesheet # You need to return the set Stylesheet
  end
end

document.external_asset_providers = Roadie::CachedProvider.new(
  document.external_asset_providers,
  MyRoadieMemcacheStore.new(MemcacheClient.instance)
)

If you are using RSpec, you can test your implementation by using the shared examples for the "roadie cache store" role:

require "roadie/rspec"

describe MyRoadieMemcacheStore do
  let(:memcache_client) { MemcacheClient.instance }
  subject { MyRoadieMemcacheStore.new(memcache_client) }

  it_behaves_like "roadie cache store" do
    before { memcache_client.clear }
  end
end
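The warning above about the default Hash store never being cleared can be addressed with a store that evicts old entries. The following is a minimal sketch, not something Roadie ships: the class name BoundedCacheStore and its max_entries parameter are hypothetical, and it only relies on the [] and []= protocol described above.

```ruby
# Hypothetical size-bounded cache store; not part of Roadie.
class BoundedCacheStore
  def initialize(max_entries = 100)
    @max_entries = max_entries
    @entries = {} # Ruby hashes keep insertion order, oldest first
  end

  # Returns the cached value (or nil) and marks the entry as recently used.
  def [](path)
    return nil unless @entries.key?(path)
    value = @entries.delete(path)
    @entries[path] = value
  end

  # Stores the value, evicting the least recently used entry when full.
  def []=(path, stylesheet)
    @entries.delete(path)
    @entries[path] = stylesheet
    @entries.delete(@entries.keys.first) if @entries.size > @max_entries
    stylesheet
  end

  def size
    @entries.size
  end
end

store = BoundedCacheStore.new(2)
store["a.css"] = "a"
store["b.css"] = "b"
store["a.css"]       # touch a.css so b.css becomes the oldest entry
store["c.css"] = "c" # evicts b.css
```

Strings stand in for stylesheet instances here so the sketch runs without the gem. An instance of such a store could then be passed as the second argument to Roadie::CachedProvider.new, in the same way as my_cache above.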

PathRewriterProvider

With this provider, you can rewrite the paths that are searched in order to more easily support another provider. Examples could include rewriting absolute URLs into something that can be found on the filesystem, or to access internal hosts instead of external ones.

filesystem = Roadie::FilesystemProvider.new("assets")
document.asset_providers << Roadie::PathRewriterProvider.new(filesystem) do |path|
  path.sub('stylesheets', 'css').downcase
end

document.external_asset_providers = Roadie::PathRewriterProvider.new(filesystem) do |url|
  if url =~ /myapp\.com/
    URI.parse(url).path.sub(%r{^/assets}, '')
  else
    url
  end
end
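To see what the second rewriting block above does to individual URLs, the same logic can be run on its own, outside of Roadie. This is only an illustration; the sample URLs are made up.

```ruby
require "uri"

# Same logic as the block passed to PathRewriterProvider above,
# extracted into a lambda so it can be tried in isolation.
rewrite = lambda do |url|
  if url =~ /myapp\.com/
    URI.parse(url).path.sub(%r{^/assets}, "")
  else
    url
  end
end

puts rewrite.call("https://myapp.com/assets/css/main.css")
puts rewrite.call("https://cdn.example.org/theme.css")
```

URLs on myapp.com are reduced to a path with the /assets prefix removed, which the wrapped FilesystemProvider can then look up; any other URL is passed through unchanged.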

You can also wrap a list, for example to implement external_asset_providers by composing the normal asset_providers:

document.external_asset_providers =
  Roadie::PathRewriterProvider.new(document.asset_providers) do |url|
    URI.parse(url).path
  end

Writing your own provider

Writing your own provider is also easy. You need to provide:

  • #find_stylesheet(name), returning either a Roadie::Stylesheet or nil.
  • #find_stylesheet!(name), returning either a Roadie::Stylesheet or raising Roadie::CssNotFound.

class UserAssetsProvider
  def initialize(user_collection)
    @user_collection = user_collection
  end

  def find_stylesheet(name)
    if name =~ %r{^/users/(\d+)\.css$}
      user = @user_collection.find_user($1)
      Roadie::Stylesheet.new("user #{user.id} stylesheet", user.stylesheet)
    end
  end

  def find_stylesheet!(name)
    find_stylesheet(name) or
      raise Roadie::CssNotFound.new(
        css_name: name, message: "does not match a user stylesheet", provider: self
      )
  end

  # Instead of implementing #find_stylesheet!, you could also:
  #     include Roadie::AssetProvider
  # That will give you a default implementation without any error message. If
  # you have multiple error cases, it's recommended that you implement
  # #find_stylesheet! without #find_stylesheet and raise with an explanatory
  # error message.
end

# Try to look for a user stylesheet first, then fall back to normal filesystem lookup.
document.asset_providers = [
  UserAssetsProvider.new(app),
  Roadie::FilesystemProvider.new('./stylesheets'),
]
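The two-method contract can also be tried without Roadie installed. The sketch below is an assumption-laden illustration: SketchStylesheet and SketchCssNotFound are local stand-ins for Roadie::Stylesheet and Roadie::CssNotFound, and InMemoryProvider is a hypothetical provider that serves CSS from a Hash.

```ruby
# Stand-ins so the sketch runs without the gem. A real provider would use
# Roadie::Stylesheet and Roadie::CssNotFound instead.
SketchStylesheet = Struct.new(:name, :css)
class SketchCssNotFound < StandardError; end

# Hypothetical provider that serves stylesheets from a Hash.
class InMemoryProvider
  def initialize(stylesheets)
    @stylesheets = stylesheets # { "/path.css" => "css source" }
  end

  # Returns a stylesheet, or nil when the name is unknown.
  def find_stylesheet(name)
    css = @stylesheets[name]
    SketchStylesheet.new("in-memory #{name}", css) if css
  end

  # Returns a stylesheet, or raises when the name is unknown.
  def find_stylesheet!(name)
    find_stylesheet(name) or
      raise SketchCssNotFound, "#{name} is not registered in memory"
  end
end

provider = InMemoryProvider.new("/main.css" => "p { color: green; }")
puts provider.find_stylesheet("/main.css").css
puts provider.find_stylesheet("/missing.css").inspect
```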

You can test for compliance by using the built-in RSpec examples:

require 'spec_helper'
require 'roadie/rspec'

describe MyOwnProvider do
  # Will use the default `subject` (MyOwnProvider.new)
  it_behaves_like "roadie asset provider", valid_name: "found.css", invalid_name: "does_not_exist.css"

  # Extra setup just for these tests:
  it_behaves_like "roadie asset provider", valid_name: "found.css", invalid_name: "does_not_exist.css" do
    subject { MyOwnProvider.new(...) }
    before { stub_dependencies }
  end
end

Keeping CSS that is impossible to inline

Some CSS is impossible to inline properly. :hover and ::after come to mind. Roadie tries its best to keep these around by injecting them inside a new <style> element in the <head> (or at the beginning of the partial if transforming a partial document).

The problem here is that Roadie cannot possibly adjust the specificity for you, so they will not apply the same way as they did before the styles were inlined.

Another caveat is that a lot of email clients do not support this (which is the entire point of inlining in the first place), so don't put anything important in here. Always handle the case of these selectors not being part of the email.

Specificity problems

Inlined styles will have much higher specificity than styles in a <style>. Here's an example:

<style>p:hover { color: blue; }</style>
<p style="color: green;">Hello world</p>

When hovering over this <p>, the color will not change as the color: green rule takes precedence. You can get it to work by adding !important to the :hover rule.

It would be foolish to try to inject !important into every rule automatically, so this is a manual process.

Turning it off

If you'd rather skip this and drop the styles that cannot be inlined, you can turn off this feature by setting the keep_uninlinable_css option to false.

document.keep_uninlinable_css = false

Callbacks

Callbacks allow you to do custom work on documents before and after they are transformed. The Nokogiri document tree is passed to the callable along with the Roadie::Document instance:

class TrackNewsletterLinks
  def call(dom, document)
    dom.css("a").each { |link| fix_link(link) }
  end

  def fix_link(link)
    divider = (link['href'] =~ /\?/ ? '&' : '?')
    link['href'] = link['href'] + divider + 'source=newsletter'
  end
end

document.before_transformation = ->(dom, document) {
  logger.debug "Inlining document with title #{dom.at_css('head > title').try(:text)}"
}
document.after_transformation = TrackNewsletterLinks.new
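The link-rewriting rule used by the callback can be checked in isolation. The helper below is a hypothetical, plain-Ruby version of fix_link that works on strings instead of Nokogiri nodes; the name add_source_param and its source parameter are assumptions made for this sketch.

```ruby
# Appends a tracking parameter, using "&" when the href already has a
# query string and "?" otherwise.
def add_source_param(href, source = "newsletter")
  divider = href.include?("?") ? "&" : "?"
  "#{href}#{divider}source=#{source}"
end

puts add_source_param("/about-us")
puts add_source_param("/search?q=roadie")
```

Note that this simple rule does not account for URL fragments (#section), so links containing one would need extra handling.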

XHTML vs HTML

You can configure the underlying HTML/XML engine to output XHTML or HTML (which is the default). One use case for this is that { tokens usually get escaped to &#123;, which would be a problem if you then pass the resulting HTML on to some other templating engine that uses those tokens (like Handlebars or Mustache).

document.mode = :xhtml

This will also affect the emitted <!DOCTYPE> if transforming a full document. Partial documents do not have a <!DOCTYPE>.

Build Status

Tested with GitHub CI using:

  • MRI 2.6
  • MRI 2.7
  • MRI 3.0
  • MRI 3.1

Let me know if you want any other runtime supported officially.

Versioning

This project follows Semantic Versioning and has done so since version 1.0.0.

FAQ

Why is my markup changed in subtle ways?

Roadie uses Nokogiri to parse and regenerate the HTML of your email, which means that some unintentional changes might show up.

One example would be that Nokogiri might remove your &nbsp;s in some cases.

Another example is Nokogiri's lack of HTML5 support, so certain new elements might have spaces removed. I recommend you don't use HTML5 in emails anyway because of bad email client support (that includes web mail!).

I'm getting segmentation faults (or other C-like problems)! What should I do?

Roadie uses Nokogiri to parse the HTML of your email, so any C-like problems such as segfaults most likely originate there. The best way to fix this is to first upgrade libxml2 on your system and then reinstall Nokogiri. For instructions on how to do this on most platforms, see Nokogiri's official install guide.

What happened to my @keyframes?

The CSS Parser used in Roadie does not handle keyframes. I don't think any email clients do either, but if you want to keep on trying you can add them manually to a <style> element (or a separate referenced stylesheet) and tell Roadie not to touch them.

My @media queries are reordered, how can I fix this?

Different @media query blocks with the same conditions are merged by default, which will change the order in some cases. You can disable this by setting merge_media_queries to false. (See the Install & Usage section above.)

How do I get rid of the <body> elements that are added?

It sounds like you want to transform a partial document. Maybe you are building partials or template fragments to later place in other documents. Use Document#transform_partial instead of Document#transform in order to treat the HTML as a partial document.

Can I skip URL rewriting on a specific element?

If you add the data-roadie-ignore attribute on an element, URL rewriting will not be performed on that element. This could be really useful for you if you intend to send the email through some other rendering pipeline that replaces some placeholders/variables.

<a href="/about-us">About us</a>
<a href="|UNSUBSCRIBE_URL|" data-roadie-ignore>Unsubscribe</a>

Note that this will not skip CSS inlining on the element; it will still get the correct styles applied.

What should I do about "Invalid URL" errors?

If the URL is invalid on purpose, see Can I skip URL rewriting on a specific element? above. Otherwise, you can try to parse it yourself using Ruby's URI class and see if you can figure it out.

require "uri"
URI.parse("https://example.com/best image.jpg") # raises
URI.parse("https://example.com/best%20image.jpg") # Works!
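When tracking down such errors, it can help to check URLs before they reach Roadie. The following is a small sketch using only Ruby's standard library; the helper names parseable_url? and escape_spaces are hypothetical, and replacing spaces is just one possible repair for the example above, not a general URL sanitizer.

```ruby
require "uri"

# Returns true when Ruby's URI class accepts the string.
def parseable_url?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

# Percent-encodes literal spaces, the problem in the example above.
def escape_spaces(url)
  url.gsub(" ", "%20")
end

url = "https://example.com/best image.jpg"
puts parseable_url?(url)
puts parseable_url?(escape_spaces(url))
```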

Documentation

Running specs

bundle install
rake

Security

Roadie is set up with the assumption that all CSS and HTML passing through it is under your control. It is not recommended to run arbitrary HTML with the default settings.

Care has been given to try to secure all file system accesses, but it is never guaranteed that someone cannot access something they should not be able to access.

In order to secure Roadie against file system access, only use your own asset providers that you yourself can secure against your particular environment.

If you have found any security vulnerability, please email me at magnus.bergmark+security@gmail.com to disclose it. For very sensitive issues, please use my public GPG key. You can also encrypt your message with my public key and open an issue if you do not want to email me directly. Thank you.

History and contributors

This gem was previously tied to Rails. It is now framework-agnostic and supports any type of HTML document. If you want to use it with Rails, check out roadie-rails.

Major contributors to Roadie:

You can see all contributors on GitHub.

License

(The MIT License)

Copyright (c) 2009-2022 Magnus Bergmark, Jim Neath / Purify, and contributors.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Author: Mange
Source code: https://github.com/Mange/roadie
License: MIT license

#ruby   #ruby-on-rails #html 

Royce Reinger

1657287687

CommonRegexRuby: Find A Lot Of Kinds Of Common information In A String

CommonRegexRuby

CommonRegex port for Ruby

Find a lot of kinds of common information in a string.

Pull requests welcome!

Please note that this is currently English/US specific.

Installation

To install CommonRegexRuby, just run:

    $ gem install commonregex

Now you can use the CommonRegex class. See the API and the examples below.

API

Instance methods return the results for the text passed to the constructor. Class methods take the text as a parameter and return the results for it.

Possible instance and class methods:

  • get_dates([text])
  • get_times([text])
  • get_phones([text])
  • get_links([text])
  • get_emails([text])
  • get_ipv4([text])
  • get_ipv6([text])
  • get_hex_colors([text])
  • get_acronyms([text])
  • get_money([text])
  • get_percentages([text]) (matches percentages between 0.00% and 100.00%)
  • get_credit_cards([text])
  • get_addresses([text])

Examples

    require 'commonregex'

    # The trailing backslashes join the adjacent string literals into one string.
    text = "John, please get that article on www.linkedin.com to me by 5:00PM\n" \
      "on Jan 9th 2012. 4:00 would be ideal, actually. If you have any questions,\n" \
      "you can reach my associate at (012)-345-6789 or associative@mail.com.\n" \
      "I'll be on UK during the whole week on a J.R.R. Tolkien convention."

    common_regex = CommonRegex.new(text)
    puts common_regex.get_dates
    # ["Jan 9th 2012"]
    puts common_regex.get_times
    # ["5:00PM", "4:00"]
    puts common_regex.get_phones
    # ["(012)-345-6789"]
    puts common_regex.get_links
    # ["www.linkedin.com"]
    puts common_regex.get_emails
    # ["associative@mail.com"]
    puts common_regex.get_acronyms
    # ["UK", "J.R.R."]

Alternatively, you can use class methods.


    puts CommonRegex.get_times 'When are you free? Do you want to meet up for coffee at 4:00?'
    # ["4:00"]
    puts CommonRegex.get_money 'They said the price was US$5,000.90, actually it is US$3,900.5. It\'s $1100.4 less, can you imagine this?'
    # ["US$5,000.90", "US$3,900.5", "$1100.4"]
    puts CommonRegex.get_percentages 'I\'m 99.9999999% sure that I\'ll get a raise of 5%.'
    # ["99.9999999%", "5%"]
    puts CommonRegex.get_ipv6 'The IPv6 address for localhost is 0:0:0:0:0:0:0:1, or alternatively, ::1.'
    # ["0:0:0:0:0:0:0:1", "::1"]
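The split between instance methods (text given to the constructor) and class methods (text given per call) can be illustrated without the gem. The class below is a hypothetical stand-in written for this sketch: TinyExtractor is not part of CommonRegex, and its TIME pattern is a deliberately simplified assumption, not the pattern the gem uses.

```ruby
class TinyExtractor
  # Simplified, illustrative pattern; not the one CommonRegex uses.
  TIME = /\b\d{1,2}:\d{2}(?:\s?[AP]M)?/i

  # Class method: the text is passed on every call.
  def self.get_times(text)
    text.scan(TIME)
  end

  def initialize(text)
    @text = text
  end

  # Instance method: defaults to the text given to the constructor.
  def get_times(text = @text)
    self.class.get_times(text)
  end
end

sample = "Coffee at 4:00 or maybe 5:30PM?"
p TinyExtractor.new(sample).get_times
p TinyExtractor.get_times(sample)
```

Both calls return the same matches; the instance form is convenient when several extractors are run over the same text.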

CommonRegex Ports

There are CommonRegex ports for other languages, see here

Author: talyssonoc
Source Code: https://github.com/talyssonoc/CommonRegexRuby 
License: 

#ruby #string #regex 

Beth Cooper

1659753600

Easily Create A Hierarchy That Supports Your ActiveRecord Model

Closure Tree

Closure_tree lets your ActiveRecord models act as nodes in a tree data structure.

Common applications include modeling hierarchical data, like tags, threaded comments, page graphs in CMSes, and tracking user referrals.

Dramatically more performant than ancestry and acts_as_tree, and even more awesome than awesome_nested_set, closure_tree has some great features:

  • Best-in-class select performance:
    • Fetch your whole ancestor lineage in 1 SELECT.
    • Grab all your descendants in 1 SELECT.
    • Get all your siblings in 1 SELECT.
    • Fetch all descendants as a nested hash in 1 SELECT.
    • Find a node by ancestry path in 1 SELECT.
  • Best-in-class mutation performance:
    • 2 SQL INSERTs on node creation
    • 3 SQL INSERT/UPDATEs on node reparenting
  • Support for concurrency (using with_advisory_lock)
  • Tested against ActiveRecord 6.0+ with Ruby 2.7+
  • Support for reparenting children (and all their descendants)
  • Support for single-table inheritance (STI) within the hierarchy
  • find_or_create_by_path for building out heterogeneous hierarchies quickly and conveniently
  • Support for deterministic ordering
  • Support for preordered traversal of descendants
  • Support for rendering trees in DOT format, using Graphviz
  • Excellent test coverage in a comprehensive variety of environments

See Bill Karwin's excellent Models for hierarchical data presentation for a description of different tree storage algorithms.

Installation

Note that closure_tree only supports ActiveRecord 6.0 and later, and has test coverage for MySQL, PostgreSQL, and SQLite.

Add gem 'closure_tree' to your Gemfile

Run bundle install

Add has_closure_tree (or acts_as_tree, which is an alias of the same method) to your hierarchical model:

class Tag < ActiveRecord::Base
  has_closure_tree
end

class AnotherTag < ActiveRecord::Base
  acts_as_tree
end

Make sure you check out the large number of options that has_closure_tree accepts.

IMPORTANT: Make sure you add has_closure_tree after attr_accessible and self.table_name = lines in your model.

If you're already using other hierarchical gems, like ancestry or acts_as_tree, please refer to the warning section!

Add a migration to add a parent_id column to the hierarchical model. You may want to also add a column for deterministic ordering of children, but that's optional.

class AddParentIdToTag < ActiveRecord::Migration[6.0]
  def change
    add_column :tags, :parent_id, :integer
  end
end

The column must be nullable. Root nodes have a NULL parent_id.

Run rails g closure_tree:migration tag (and replace tag with your model name) to create the closure tree table for your model.

By default the table name will be the model's table name, followed by "_hierarchies". Note that by calling has_closure_tree, a "virtual model" (in this case, TagHierarchy) will be created dynamically. You don't need to create it.

Run rake db:migrate

If you're migrating from another system where your model already has a parent_id column, run Tag.rebuild! and your tag_hierarchies table will be truncated and rebuilt.

If you're starting from scratch you don't need to call rebuild!.

NOTE: Run rails g closure_tree:config to create an initializer with extra configurations. (Optional)
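To make the role of the hierarchies table more concrete, here is a plain-Ruby sketch of the closure table idea: every node stores one row per ancestor (plus one row pointing at itself), so ancestor and descendant lookups are a single filter over one table. This is an in-memory illustration only, not closure_tree's implementation; the class ClosureTableSketch and its methods are hypothetical, and the row fields are modeled on an ancestor/descendant/generations layout.

```ruby
class ClosureTableSketch
  Row = Struct.new(:ancestor, :descendant, :generations)

  def initialize
    @rows = []
  end

  # Adds a node: one self-referencing row, plus one row for every
  # ancestor of the parent (including the parent itself).
  def add(node, parent = nil)
    new_rows = [Row.new(node, node, 0)]
    if parent
      @rows.each do |row|
        next unless row.descendant == parent
        new_rows << Row.new(row.ancestor, node, row.generations + 1)
      end
    end
    @rows.concat(new_rows)
    node
  end

  # Nearest ancestor first; a single pass over the rows.
  def ancestors(node)
    @rows.select { |r| r.descendant == node && r.generations > 0 }
         .sort_by(&:generations).map(&:ancestor)
  end

  # Nearest descendants first; a single pass over the rows.
  def descendants(node)
    @rows.select { |r| r.ancestor == node && r.generations > 0 }
         .sort_by(&:generations).map(&:descendant)
  end
end

tree = ClosureTableSketch.new
tree.add("grandparent")
tree.add("parent", "grandparent")
tree.add("child", "parent")
p tree.ancestors("child")
p tree.descendants("grandparent")
```

The trade-off is more rows written per insert in exchange for reads that never need recursion.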

Warning

As stated above, using multiple hierarchy gems (like ancestry or nested set) on the same model will most likely result in pain, suffering, hair loss, tooth decay, heel-related ailments, and gingivitis. Assume things will break.

Usage

Creation

Create a root node:

grandparent = Tag.create(name: 'Grandparent')

Child nodes can be created directly through the children collection:

parent = grandparent.children.create(name: 'Parent')
child1 = parent.children.create(name: 'First Child')

Or by appending to the children collection:

child2 = Tag.new(name: 'Second Child')
parent.children << child2

Or by calling the "add_child" method:

child3 = Tag.new(name: 'Third Child')
parent.add_child child3

Or by setting the parent on the child:

Tag.create(name: 'Fourth Child', parent: parent)

Then:

grandparent.self_and_descendants.collect(&:name)
=> ["Grandparent", "Parent", "First Child", "Second Child", "Third Child", "Fourth Child"]

child1.ancestry_path
=> ["Grandparent", "Parent", "First Child"]

find_or_create_by_path

You can use find_by_path as well as find_or_create_by_path to look up nodes by "ancestry paths".

If you provide an array of strings to these methods, they reference the name column in your model, which can be overridden with the :name_column option provided to has_closure_tree.

child = Tag.find_or_create_by_path(%w[grandparent parent child])

As of v5.0.0, find_or_create_by_path can also take an array of attribute hashes:

child = Tag.find_or_create_by_path([
  {name: 'Grandparent', title: 'Sr.'},
  {name: 'Parent', title: 'Mrs.'},
  {name: 'Child', title: 'Jr.'}
])

If you're using STI, the attribute hashes can contain the sti_name and things work as expected:

child = Label.find_or_create_by_path([
  {type: 'DateLabel', name: '2014'},
  {type: 'DateLabel', name: 'August'},
  {type: 'DateLabel', name: '5'},
  {type: 'EventLabel', name: 'Visit the Getty Center'}
])
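The find-or-create behavior along a path can be sketched in plain Ruby: walk the path one name at a time, reuse a matching child when one exists, and create it otherwise. The PathNode struct and the find_or_create_path helper are hypothetical names for this illustration; closure_tree does this work against the database instead.

```ruby
PathNode = Struct.new(:name, :parent, :children) do
  # Names from the root down to this node.
  def ancestry_path
    parent ? parent.ancestry_path + [name] : [name]
  end
end

# Walks `path` starting at the list of root nodes, creating any node
# that is missing, and returns the node at the end of the path.
def find_or_create_path(roots, path)
  siblings = roots
  node = nil
  path.each do |name|
    found = siblings.find { |candidate| candidate.name == name }
    unless found
      found = PathNode.new(name, node, [])
      siblings << found
    end
    node = found
    siblings = node.children
  end
  node
end

roots = []
child = find_or_create_path(roots, %w[grandparent parent child])
p child.ancestry_path
p find_or_create_path(roots, %w[grandparent parent child]).equal?(child)
```

Calling the helper a second time with the same path returns the existing node rather than creating duplicates.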

Moving nodes around the tree

Nodes can be moved around to other parents, and closure_tree moves the node's descendancy to the new parent for you:

d = Tag.find_or_create_by_path %w[a b c d]
h = Tag.find_or_create_by_path %w[e f g h]
e = h.root
d.add_child(e) # "d.children << e" would work too, of course
h.ancestry_path
=> ["a", "b", "c", "d", "e", "f", "g", "h"]

When it is more convenient to simply change the parent_id of a node directly (for example, when dealing with a form <select>), closure_tree will handle the necessary changes automatically when the record is saved:

j = Tag.find 102
j.self_and_ancestor_ids
=> [102, 87, 77]
j.update parent_id: 96
j.self_and_ancestor_ids
=> [102, 96, 95, 78]

Nested hashes

hash_tree provides a method for rendering a subtree as an ordered nested hash:

b = Tag.find_or_create_by_path %w(a b)
a = b.parent
b2 = Tag.find_or_create_by_path %w(a b2)
d1 = b.find_or_create_by_path %w(c1 d1)
c1 = d1.parent
d2 = b.find_or_create_by_path %w(c2 d2)
c2 = d2.parent

Tag.hash_tree
=> {a => {b => {c1 => {d1 => {}}, c2 => {d2 => {}}}, b2 => {}}}

Tag.hash_tree(:limit_depth => 2)
=> {a => {b => {}, b2 => {}}}

b.hash_tree
=> {b => {c1 => {d1 => {}}, c2 => {d2 => {}}}}

b.hash_tree(:limit_depth => 2)
=> {b => {c1 => {}, c2 => {}}}

If your tree is large (or might become so), use :limit_depth.

Without this option, hash_tree will load the entire contents of that table into RAM. Your server may not be happy trying to do this.
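The effect of a depth limit on the nested hash can be sketched in plain Ruby. The helper below is hypothetical (nested_hash is not a closure_tree method) and works on an in-memory map of children rather than on database rows; it mirrors the example tree above, with strings standing in for the Tag records.

```ruby
# children_of maps a node to its ordered children.
# limit_depth counts levels including the starting nodes; nil means no limit.
def nested_hash(children_of, nodes, limit_depth = nil)
  return {} if limit_depth && limit_depth < 1
  next_limit = limit_depth && limit_depth - 1
  nodes.each_with_object({}) do |node, hash|
    hash[node] = nested_hash(children_of, children_of.fetch(node, []), next_limit)
  end
end

children_of = {
  "a"  => ["b", "b2"],
  "b"  => ["c1", "c2"],
  "c1" => ["d1"],
  "c2" => ["d2"]
}

p nested_hash(children_of, ["a"])
p nested_hash(children_of, ["a"], 2)
p nested_hash(children_of, ["b"], 2)
```

With a limit, the recursion stops before visiting deeper levels, which is what keeps the amount of loaded data bounded.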

HT: ancestry and elhoyos

Eager loading

Since most of closure_tree's methods (e.g. children) return regular ActiveRecord scopes, you can use the includes method for eager loading, e.g.

comment.children.includes(:author)

However, note that the above approach only eager loads the requested associations for the immediate children of comment. If you want to walk through the entire tree, you may still end up making many queries and loading duplicate copies of objects.

In some cases, a viable alternative is the following:

comment.self_and_descendants.includes(:author)

This would load authors for comment and all its descendants in a constant number of queries. However, the return value is an array of Comments, and the tree structure is thus lost, which makes it difficult to walk the tree using elegant recursive algorithms.

A third option is to use has_closure_tree_root on the model that is composed by the closure_tree model (e.g. a Post may be composed by a tree of Comments). So in post.rb, you would do:

# app/models/post.rb
has_closure_tree_root :root_comment

This gives you a plain has_one association (root_comment) to the root Comment (i.e. that with null parent_id).

It also gives you a method called root_comment_including_tree, which you can invoke as follows:

a_post.root_comment_including_tree(:author)

The result of this call will be the root Comment with all descendants and associations loaded in a constant number of queries. Inverse associations are also set up on all nodes, so as you walk the tree, calling children or parent on any node will not trigger any further queries and no duplicate copies of objects are loaded into memory.

The class and foreign key of root_comment are assumed to be Comment and post_id, respectively. These can be overridden in the usual way.

The same caveat stated above with hash_tree also applies here: this method will load the entire tree into memory. If the tree is very large, this may be a bad idea, in which case using the eager loading methods above may be preferred.

Graph visualization

to_dot_digraph is suitable for passing into Graphviz.

For example, for the above tree, write out the DOT file with Ruby:

File.open("example.dot", "w") { |f| f.write(Tag.root.to_dot_digraph) }

Then, in a shell, dot -Tpng example.dot > example.png, which produces:

Example tree

If you want to customize the label value, override the #to_digraph_label instance method in your model.

Just for kicks, this is the test tree I used for proving that preordered tree traversal was correct:

Preordered test tree

Available options

When you include has_closure_tree in your model, you can provide a hash to override the following defaults:

  • :parent_column_name to override the column name of the parent foreign key in the model's table. This defaults to "parent_id".
  • :hierarchy_class_name to override the hierarchy class name. This defaults to the singular name of the model + "Hierarchy", like TagHierarchy.
  • :hierarchy_table_name to override the hierarchy table name. This defaults to the singular name of the model + "_hierarchies", like tag_hierarchies.
  • :dependent determines what happens when a node is destroyed. Defaults to nullify.
    • :nullify will simply set the parent column to null. Each child node will be considered a "root" node. This is the default.
    • :delete_all will delete all descendant nodes (which circumvents the destroy hooks)
    • :destroy will destroy all descendant nodes (which runs the destroy hooks on each child node)
    • nil does nothing with descendant nodes
  • :name_column used by #find_or_create_by_path, #find_by_path, and ancestry_path instance methods. This is primarily useful if the model only has one required field (like a "tag").
  • :order used to set up deterministic ordering
  • :touch delegates to the belongs_to annotation for the parent, so touching cascades to all children (the performance of this for deep trees isn't currently optimal).

Accessing Data

Class methods

Tag.root returns an arbitrary root node

Tag.roots returns all root nodes

Tag.leaves returns all leaf nodes

Tag.hash_tree returns an ordered, nested hash that can be depth-limited.

Tag.find_by_path(path, attributes) returns the node whose name path is path. See (#find_or_create_by_path).

Tag.find_or_create_by_path(path, attributes) returns the node whose name path is path, and will create the node if it doesn't exist already. See (#find_or_create_by_path).

Tag.find_all_by_generation(generation_level) returns the descendant nodes that are generation_level away from a root. Tag.find_all_by_generation(0) is equivalent to Tag.roots.

Tag.with_ancestor(ancestors) scopes to all descendants whose ancestor(s) is/are in the given list.

Tag.with_descendant(descendants) scopes to all ancestors whose descendant(s) is/are in the given list.

Tag.lowest_common_ancestor(descendants) finds the lowest common ancestor of the descendants.
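As a rough illustration of what "lowest common ancestor" means, the sketch below computes it from ancestry paths (arrays of names from the root down): the answer is the last element of the longest shared prefix. The helper lca_by_path is hypothetical and is not how closure_tree computes the result, which is done against the database.

```ruby
# paths: array of ancestry paths, each ordered from root to node.
# Returns the deepest name shared by all paths, or nil if there is none.
def lca_by_path(paths)
  return nil if paths.empty?
  shared = paths.first.zip(*paths.drop(1)).take_while do |names|
    names.uniq.size == 1
  end
  shared.empty? ? nil : shared.last.first
end

p lca_by_path([%w[a b c d], %w[a b e]])
p lca_by_path([%w[a b], %w[x y]])
```

Nodes in different trees share no prefix, so the helper returns nil for them.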

Instance methods

tag.root returns the root for this node

tag.root? returns true if this is a root node

tag.root_of?(node) returns true if current node is root of another one

tag.child? returns true if this is a child node. It has a parent.

tag.leaf? returns true if this is a leaf node. It has no children.

tag.leaves is scoped to all leaf nodes in self_and_descendants.

tag.depth returns the depth, or "generation", for this node in the tree. A root node will have a value of 0.

tag.parent returns the node's immediate parent. Root nodes will return nil.

tag.parent_of?(node) returns true if current node is parent of another one

tag.children is a has_many of immediate children (just those nodes whose parent is the current node).

tag.child_ids is an array of the IDs of the children.

tag.child_of?(node) returns true if current node is child of another one

tag.ancestors is an ordered scope of [ parent, grandparent, great grandparent, … ]. Note that the size of this scope will always equal tag.depth.

tag.ancestor_ids is an array of the IDs of the ancestors.

tag.ancestor_of?(node) returns true if current node is ancestor of another one

tag.self_and_ancestors returns a scope containing self, parent, grandparent, great grandparent, etc.

tag.self_and_ancestors_ids returns IDs containing self, parent, grandparent, great grandparent, etc.

tag.siblings returns a scope containing all nodes with the same parent as tag, excluding self.

tag.sibling_ids returns an array of the IDs of the siblings.

tag.self_and_siblings returns a scope containing all nodes with the same parent as tag, including self.

tag.descendants returns a scope of all children, childrens' children, etc., excluding self, ordered by depth.

tag.descendant_ids returns an array of the IDs of the descendants.

tag.descendant_of?(node) returns true if current node is descendant of another one

tag.self_and_descendants returns a scope of self, all children, childrens' children, etc., ordered by depth.

tag.self_and_descendant_ids returns IDs of self, all children, childrens' children, etc., ordered by depth.

tag.family_of?(node) returns true if current node and another one have the same root.

tag.hash_tree returns an ordered, nested hash that can be depth-limited.

tag.find_by_path(path) returns the node whose name path from tag is path. See (#find_or_create_by_path).

tag.find_or_create_by_path(path) returns the node whose name path from tag is path, and will create the node if it doesn't exist already. See (#find_or_create_by_path).

tag.find_all_by_generation(generation_level) returns the descendant nodes that are generation_level away from tag.

  • tag.find_all_by_generation(0).to_a == [tag]
  • tag.find_all_by_generation(1) == tag.children
  • tag.find_all_by_generation(2) will return the tag's grandchildren, and so on.

tag.destroy will destroy a node and do something to its children, which is determined by the :dependent option passed to has_closure_tree.

Polymorphic hierarchies with STI

Polymorphic models using single table inheritance (STI) are supported:

  1. Create a db migration that adds a String column named type to your model
  2. Subclass the model class. You only need to add has_closure_tree to your base class:

class Tag < ActiveRecord::Base
  has_closure_tree
end
class WhenTag < Tag ; end
class WhereTag < Tag ; end
class WhatTag < Tag ; end

Note that if you call rebuild! on any of the subclasses, the complete Tag hierarchy will be emptied, thus taking the hierarchies of all other subclasses with it (issue #275). However, only the hierarchies for the class rebuild! was called on will be rebuilt, leaving the other subclasses without hierarchy entries.

You can work around that by overriding the rebuild! class method in all your STI subclasses and calling the superclass's rebuild! method:

class WhatTag < Tag
  def self.rebuild!
    Tag.rebuild!
  end
end

This way, the complete hierarchy including all subclasses will be rebuilt.

Deterministic ordering

By default, children will be ordered by your database engine, which may not be what you want.

If you want to order children alphabetically, and your model has a name column, you'd do this:

class Tag < ActiveRecord::Base
  has_closure_tree order: 'name'
end

If you want a specific order, add a new integer column to your model in a migration:

t.integer :sort_order

and in your model:

class OrderedTag < ActiveRecord::Base
  has_closure_tree order: 'sort_order', numeric_order: true
end

When you enable order, you'll also have the following new methods injected into your model:

  • tag.siblings_before is a scope containing all nodes with the same parent as tag, whose sort order column is less than self. These will be ordered properly, so the last element in scope will be the sibling immediately before self
  • tag.siblings_after is a scope containing all nodes with the same parent as tag, whose sort order column is more than self. These will be ordered properly, so the first element in scope will be the sibling immediately "after" self

If your order column is an integer attribute, you'll also have these:

The class method #roots_and_descendants_preordered, which returns all nodes in your tree, pre-ordered.

node1.self_and_descendants_preordered which will return descendants, pre-ordered.

node1.append_child(node2) (which is an alias to add_child), which will

  1. set node2's parent to node1
  2. set node2's sort order to place node2 last in the children array

node1.prepend_child(node2) which will

  1. set node2's parent to node1
  2. set node2's sort order to place node2 first in the children array

Note that all of node1's children's sort_orders will be incremented.

node1.prepend_sibling(node2) which will

  1. set node2 to the same parent as node1,
  2. set node2's order column to 1 less than node1's value, and
  3. increment the order_column of all children of node1's parent whose order_column is > node2's new value by 1.

node1.append_sibling(node2) which will

  1. set node2 to the same parent as node1,
  2. set node2's order column to 1 more than node1's value, and
  3. increment the order_column of all children of node1's parent whose order_column is > node2's new value by 1.

root = OrderedTag.create(name: 'root')
a = root.append_child(OrderedTag.new(name: 'a'))
b = OrderedTag.create(name: 'b')
c = OrderedTag.create(name: 'c')

# We have to call 'root.reload.children' because root won't be in sync with the database otherwise:

a.append_sibling(b)
root.reload.children.pluck(:name)
=> ["a", "b"]

a.prepend_sibling(b)
root.reload.children.pluck(:name)
=> ["b", "a"]

a.append_sibling(c)
root.reload.children.pluck(:name)
=> ["b", "a", "c"]

b.append_sibling(c)
root.reload.children.pluck(:name)
=> ["b", "c", "a"]
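The sequence above can be replayed with a small plain-Ruby model of one parent's children. This is only a sketch: SiblingList and its methods are hypothetical names, there is no database, and it simply renumbers every sibling after each move, which is a simplification and not necessarily how closure_tree updates the order column.

```ruby
# Plain-Ruby model of ordered siblings (illustrative only; no database involved).
# Raises if `anchor` is not in the list; a real implementation would validate that.
class SiblingList
  def initialize
    @names = [] # position in the array stands in for the sort_order column
  end

  # Place `name` last, like append_child.
  def append_child(name)
    move(name) { @names.length }
  end

  # Place `moved` immediately before `anchor`.
  def prepend_sibling(anchor, moved)
    move(moved) { @names.index(anchor) }
  end

  # Place `moved` immediately after `anchor`.
  def append_sibling(anchor, moved)
    move(moved) { @names.index(anchor) + 1 }
  end

  # Names paired with their renumbered sort_order, starting at 0.
  def sort_orders
    @names.each_with_index.to_h
  end

  def names
    @names.dup
  end

  private

  # Remove the node if it is already a sibling, then insert it at the
  # position computed by the block (evaluated after the removal).
  def move(name)
    @names.delete(name)
    @names.insert(yield, name)
    self
  end
end

children = SiblingList.new
children.append_child('a')
children.append_sibling('a', 'b')
children.prepend_sibling('a', 'b')
children.append_sibling('a', 'c')
children.append_sibling('b', 'c')

p children.names # => ["b", "c", "a"]
p children.sort_orders
```

Renumbering from 0 after every move keeps the sort orders dense and free of ties, at the cost of touching every sibling on each change.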

Ordering Roots

With numeric ordering, root nodes are, by default, assigned order values globally across the whole database table. So for instance if you have 5 nodes with no parent, they will be ordered 0 through 4 by default. If your model represents many separate trees and you have a lot of records, this can cause performance problems, and doesn't really make much sense.

You can disable this default behavior by passing dont_order_roots: true as an option to your declaration:

has_closure_tree order: 'sort_order', numeric_order: true, dont_order_roots: true

In this case, calling prepend_sibling and append_sibling on a root node or calling roots_and_descendants_preordered on the model will raise a RootOrderingDisabledError.

The dont_order_roots option will be ignored unless numeric_order is set to true.
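For orientation, a pre-ordered listing visits each node before its children, with siblings taken in ascending sort order. The plain-Ruby sketch below is an illustration only (the Node struct, the NODES data, and the preordered helper are assumptions, not closure_tree code). It shows that a whole-table listing starts by sorting the root nodes themselves, which is presumably why such a listing is unavailable when root ordering is disabled.

```ruby
# Illustrative pre-order traversal over in-memory nodes (no closure_tree involved).
Node = Struct.new(:name, :parent, :sort_order)

NODES = [
  Node.new('root', nil, 0),
  Node.new('b', 'root', 1),
  Node.new('a', 'root', 0),
  Node.new('a1', 'a', 0),
  Node.new('other_root', nil, 1)
].freeze

# Visit each child of `parent_name` in ascending sort_order,
# emitting the child first and then its own subtree.
def preordered(nodes, parent_name = nil)
  nodes.select { |n| n.parent == parent_name }
       .sort_by(&:sort_order)
       .flat_map { |n| [n.name] + preordered(nodes, n.name) }
end

p preordered(NODES)         # => ["root", "a", "a1", "b", "other_root"]
p preordered(NODES, 'root') # => ["a", "a1", "b"]
```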

Concurrency

Several methods, especially #rebuild! and #find_or_create_by_path, are not safe to run concurrently without locking. #find_or_create_by_path, for example, may create duplicate nodes.

Database row-level locks work correctly with PostgreSQL, but MySQL's row-level locking is broken, and erroneously reports deadlocks where there are none. To work around this, and have a consistent implementation for both MySQL and PostgreSQL, with_advisory_lock is used automatically to ensure correctness.

If you are already managing concurrency elsewhere in your application, and want to disable the use of with_advisory_lock, pass with_advisory_lock: false in the options hash:

class Tag < ActiveRecord::Base
  has_closure_tree with_advisory_lock: false
end

Note that you will eventually have data corruption if you disable advisory locks, write to your database with multiple threads, and don't provide an alternative mutex.
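The duplicate-node problem is a check-then-insert race. The sketch below illustrates the idea in plain Ruby, with an in-process Mutex standing in for a database advisory lock; NODES, LOCK, and find_or_create are hypothetical names, and nothing here is taken from closure_tree or with_advisory_lock.

```ruby
# Illustrative only: an in-process Mutex stands in for a database advisory lock.
NODES = []
LOCK = Mutex.new

# Check-then-insert. Without the lock, two threads can both see "missing"
# and both insert, which is how duplicate nodes appear.
def find_or_create(name)
  LOCK.synchronize do
    existing = NODES.find { |n| n == name }
    next existing if existing

    sleep 0.01 # widen the window between the check and the insert
    NODES << name
    name
  end
end

threads = 10.times.map { Thread.new { find_or_create('grandparent') } }
threads.each(&:join)

p NODES.length # => 1
```

A Mutex only protects threads inside one process; coordinating several processes or servers needs a lock that lives in the database, or another shared mutex.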

I18n

You can customize error messages using I18n:

en-US:
  closure_tree:
    loop_error: Your descendant cannot be your parent!

FAQ

Are there any how-to articles on how to use this gem?

Yup! Ilya Bodrov wrote Nested Comments with Rails.

Does this work well with #default_scope?

No. Please see issue 86 for details.

Can I update parentage with update_attribute?

No. update_attribute skips the validation hook that is required for maintaining the hierarchy table.

Can I assign a parent to multiple children with #update_all?

No. Please see issue 197 for details.

Does this gem support multiple parents?

No. This gem's API is based on the assumption that each node has either 0 or 1 parent.

The underlying closure tree structure will support multiple parents, but there would be many breaking-API changes to support it. I'm open to suggestions and pull requests.

How do I use this with test fixtures?

Test fixtures aren't going to be running your after_save hooks after inserting all your fixture data, so you need to call .rebuild! before your test runs. There's an example in the spec tag_spec.rb:

  describe "Tag with fixtures" do
    fixtures :tags
    before :each do
      Tag.rebuild! # <- required if you use fixtures
    end
  end

However, if you're just starting with Rails, may I humbly suggest you adopt a factory library, rather than using fixtures? Lots of people have written about this already.

There are many lock-* files in my project directory after test runs

This is expected if you aren't using MySQL or PostgreSQL for your tests.

SQLite doesn't have advisory locks, so we resort to file locking, which will only work if the FLOCK_DIR is set consistently for all ruby processes.

In your spec_helper.rb or minitest_helper.rb, add a before and after block:

before do
  ENV['FLOCK_DIR'] = Dir.mktmpdir
end

after do
  FileUtils.remove_entry_secure ENV['FLOCK_DIR']
end
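To see where the lock-* files come from, here is a minimal sketch of file-based locking with Ruby's File#flock. It illustrates the general technique under stated assumptions and is not the with_advisory_lock source: with_file_lock is a hypothetical helper, and the "lock-" file naming is assumed from the file names mentioned above.

```ruby
require 'tmpdir'
require 'fileutils'

# Illustrative file-based lock: every process that opens the same path and
# calls flock(LOCK_EX) waits until the current holder releases it.
def with_file_lock(lock_name, dir)
  path = File.join(dir, "lock-#{lock_name}")
  File.open(path, File::RDWR | File::CREAT, 0o644) do |file|
    file.flock(File::LOCK_EX) # blocks until no other process holds the lock
    begin
      yield
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end

flock_dir = Dir.mktmpdir
result = with_file_lock('tag-rebuild', flock_dir) { :done }
leftover = Dir.children(flock_dir)
FileUtils.remove_entry_secure(flock_dir)

p result   # => :done
p leftover # => ["lock-tag-rebuild"]
```

The lock file stays behind after the lock is released, which is why the directory has to be cleaned up, and two processes only exclude each other when they resolve the same directory.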

bundle install says Gem::Ext::BuildError: ERROR: Failed to build gem native extension

When building from source, the mysql2, pg, and sqlite3 gems need their native client libraries installed on your system. Note that this error isn't specific to ClosureTree.

On Ubuntu/Debian systems, run:

sudo apt-get install libpq-dev libsqlite3-dev libmysqlclient-dev
bundle install

Object destroy fails with MySQL v5.7+

A bug was introduced in MySQL's query optimizer. See the workaround here.

Hierarchy maintenance errors from MySQL v5.7.9-v5.7.10

Upgrade to MySQL 5.7.12 or later if you see this issue:

Mysql2::Error: You can't specify target table '*_hierarchies' for update in FROM clause

Testing with Closure Tree

Closure tree comes with some RSpec 2/3 matchers which you may use for your tests:

require 'spec_helper'
require 'closure_tree/test/matcher'

describe Category do
 # Should syntax
 it { should be_a_closure_tree }
 # Expect syntax
 it { is_expected.to be_a_closure_tree }
end

describe Label do
 # Should syntax
 it { should be_a_closure_tree.ordered }
 # Expect syntax
 it { is_expected.to be_a_closure_tree.ordered }
end

describe TodoList::Item do
 # Should syntax
 it { should be_a_closure_tree.ordered(:priority_order) }
 # Expect syntax
 it { is_expected.to be_a_closure_tree.ordered(:priority_order) }
end

Testing

Closure tree is tested under every valid combination of

  • Ruby 2.7+
  • ActiveRecord 6.0+
  • PostgreSQL, MySQL, and SQLite. Concurrency tests are only run with MySQL and PostgreSQL.

Assuming you're using rbenv, you can use tests.sh to run the test matrix locally.

Change log

See the change log.


Author:  ClosureTree
Source code: https://github.com/ClosureTree/closure_tree
License: MIT license

#ruby   #ruby-on-rails