Mike Kozey

Memoized: Decorators to Cache Returned Values

Overview

Memoized saves the previously computed value of a function:

final sum = Memoized(() => 1.to(999999999).sum());
print(sum());
print(sum()); // returned immediately

with LRU Cache (Fibonacci series)

// Memoized1<ReturnType, ArgumentType>
late final Memoized1<int, int> fib;
fib = Memoized1((int n) {
  if (n <= 1) return n;
  return fib(n-1) + fib(n-2);
});

print(fib(80));

Memoized1 through Memoized5 maintain an LRU cache that uses the function arguments as the key and the computed result as the value, as follows.

LruMap<ArgumentType, ReturnType>
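
For example, with a small capacity the least recently used entry is evicted first. A minimal sketch of that behavior, assuming the capacity parameter shown later in this README (the square function is just an illustration):

import 'package:memoized/memoized.dart';

void main() {
  // With capacity 2, only the two most recently used keys stay cached.
  final square = Memoized1<int, int>((int n) => n * n, capacity: 2);
  print(square(2)); // computed and cached
  print(square(3)); // computed and cached
  print(square(4)); // computed; the entry for 2 is evicted (least recently used)
  print(square(2)); // recomputed, since its entry was evicted
}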

Installing

dependencies:
  memoized:

import 'package:memoized/memoized.dart';

Memoized

Wrap a function to store the previously computed value and return this value on the next call.

Basic

Iterable<int> numbers = 1.to(30000000);
final calculateSum = (() => numbers.sum()).memo;

print(time(calculateSum));
print(time(calculateSum));  // It returns the memoized value.

expire

Iterable<int> numbers = 1.to(30000000);
final calculateSum = (() => numbers.sum()).memo;
print(calculateSum());

numbers = 1.to(9043483);
calculateSum.expire(); // not recomputed yet

numbers = 1.to(45000000);
calculateSum.expire(); // not recomputed yet

final value = calculateSum(); // recomputed at this point

in Class

class IntervalTimer {
  final List<Duration> timers = [/* durations */];

  late final totalDuration = _totalDurationImpl.memo;
  Duration _totalDurationImpl() =>
      timers.fold<Duration>(Duration.zero, (p, v) => p + v);
}

Memoized1 ~ 5

Decorator to wrap a function with a memoizing callable that saves up to the capacity most recent calls.

fibonacci

// Memoized1<ReturnType, ArgumentType>
late final Memoized1<int, int> fib;
fib = Memoized1((int n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
});

print(fib(90));

expire

print(await fetchDocument('hello'));
fetchDocument.expire('hello');
print(await fetchDocument('hello'));
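
This assumes fetchDocument is itself a Memoized1-wrapped function; a minimal sketch of such a definition, assuming Memoized1 accepts an async function through its generic return type (the fetch body is a stand-in, not a real network call):

import 'package:memoized/memoized.dart';

// Hypothetical async fetcher: the Future<String> result is cached per id,
// and expire(id) drops only the cached entry for that argument.
final fetchDocument = Memoized1<Future<String>, String>(
  (String id) async => 'contents of $id',
);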

in Class

class Test {
  late final Memoized1<int, int> fib = Memoized1(
    (n) => n <= 1 ? n : fib(n - 1) + fib(n - 2),
  );
}

capacity

class Test {
  late final Memoized1<int, int> fib = Memoized1(
    (n) => n <= 1 ? n : fib(n - 1) + fib(n - 2),
    capacity: 10, // default == 128
  );
}

Installing

Use this package as a library

Depend on it

Run this command:

With Dart:

 $ dart pub add memoized

With Flutter:

 $ flutter pub add memoized

This will add a line like this to your package's pubspec.yaml (and run an implicit dart pub get):

dependencies:
  memoized: ^1.5.0

Alternatively, your editor might support dart pub get or flutter pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:memoized/memoized.dart';

example/memoized.dart

import 'package:memoized/memoized.dart';
import './timer.dart'; // local helper that provides the time() function used below

class MemoizedExample {
  late final sumRange = Memoized2((int begin, int end) {
    int sum = 0;
    for (int i = begin; i <= end; i++) {
      sum += i;
    }
    return sum;
  });

  static final one = BigInt.one;
  static final two = BigInt.two;
  late final Memoized1<BigInt, BigInt> fibonacci = Memoized1((n) {
    if (n <= one) return n;
    return fibonacci(n - one) + fibonacci(n - two);
  });
}

void main() {
  final example = MemoizedExample();
  time(() => example.sumRange(0, 999999999));
  time(() => example.sumRange(0, 999999999)); // cached data

  // recursion by using cached data (top-down dynamic programming)
  time(() => example.fibonacci(BigInt.parse('300')));
}

Author: sylfreee 
Source Code: https://github.com/sylfreee/memoized 
License: MIT license

#flutter #dart #decorators 

Edward Jackson

PySpark Cheat Sheet: Spark in Python

This PySpark cheat sheet with code samples covers the basics like initializing Spark in Python, loading data, sorting, and repartitioning.

Apache Spark is generally known as a fast, general and open-source engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. It allows you to speed up analytic applications by up to 100 times compared to other technologies on the market today. You can interface Spark with Python through "PySpark", the Spark Python API that exposes the Spark programming model to Python.

Even though working with Spark will remind you in many ways of working with Pandas DataFrames, you'll also see that it can be tough getting familiar with all the functions that you can use to query, transform, and inspect your data. What's more, if you've never worked with any other programming language or if you're new to the field, it might be hard to distinguish between the various RDD operations.

Let's face it: map() and flatMap() are different enough, but it might still come as a challenge to decide which one you really need when you're faced with them in your analysis. Or what about other functions, like reduce() and reduceByKey()?
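
A quick side-by-side comparison may help (a sketch, assuming a local SparkContext named sc as created later in this cheat sheet):

from pyspark import SparkContext
sc = SparkContext(master='local[2]')

rdd = sc.parallelize([('a', 1), ('a', 2), ('b', 3)])
rdd.map(lambda x: [x[0], x[1]]).collect()           # one output per input: [['a', 1], ['a', 2], ['b', 3]]
rdd.flatMap(lambda x: [x[0], x[1]]).collect()       # flattened: ['a', 1, 'a', 2, 'b', 3]
rdd.map(lambda x: x[1]).reduce(lambda a, b: a + b)  # folds the whole RDD to one value: 6
rdd.reduceByKey(lambda a, b: a + b).collect()       # merges values per key: [('a', 3), ('b', 3)]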

PySpark cheat sheet

Even though the documentation is very elaborate, it never hurts to have a cheat sheet by your side, especially when you're just getting into it.

This PySpark cheat sheet covers the basics, from initializing Spark and loading your data, to retrieving RDD information, sorting, filtering and sampling your data. But that's not all. You'll also see that topics such as repartitioning, iterating, merging, saving your data and stopping the SparkContext are included in the cheat sheet. 

Note that the examples in the document take small data sets to illustrate the effect of specific functions on your data. In real life data analysis, you'll be using Spark to analyze big data.

PySpark is the Spark Python API that exposes the Spark programming model to Python.

Initializing Spark 

SparkContext 

>>> from pyspark import SparkContext
>>> sc = SparkContext(master = 'local[2]')

Inspect SparkContext 

>>> sc.version #Retrieve SparkContext version
>>> sc.pythonVer #Retrieve Python version
>>> sc.master #Master URL to connect to
>>> str(sc.sparkHome) #Path where Spark is installed on worker nodes
>>> str(sc.sparkUser()) #Retrieve name of the Spark User running SparkContext
>>> sc.appName #Return application name
>>> sc.applicationId #Retrieve application ID
>>> sc.defaultParallelism #Return default level of parallelism
>>> sc.defaultMinPartitions #Default minimum number of partitions for RDDs

Configuration 

>>> from pyspark import SparkConf, SparkContext
>>> conf = (SparkConf()
     .setMaster("local")
     .setAppName("My app")
     .set("spark.executor.memory", "1g"))
>>> sc = SparkContext(conf = conf)

Using the Shell 

In the PySpark shell, a special interpreter-aware SparkContext is already created in the variable called sc.

$ ./bin/spark-shell --master local[2]
$ ./bin/pyspark --master local[4] --py-files code.py

Set which master the context connects to with the --master argument, and add Python .zip, .egg or .py files to the runtime path by passing a comma-separated list to --py-files.

Loading Data 

Parallelized Collections 

>>> rdd = sc.parallelize([('a',7),('a',2),('b',2)])
>>> rdd2 = sc.parallelize([('a',2),('d',1),('b',1)])
>>> rdd3 = sc.parallelize(range(100))
>>> rdd = sc.parallelize([("a",["x","y","z"]),
               ("b" ["p","r,"])])

External Data 

Read either one text file from HDFS, a local file system or any Hadoop-supported file system URI with textFile(), or read in a directory of text files with wholeTextFiles(). 

>>> textFile = sc.textFile("/my/directory/*.txt")
>>> textFile2 = sc.wholeTextFiles("/my/directory/")

Retrieving RDD Information 

Basic Information 

>>> rdd.getNumPartitions() #List the number of partitions
>>> rdd.count() #Count RDD instances
3
>>> rdd.countByKey() #Count RDD instances by key
defaultdict(<type 'int'>,{'a':2,'b':1})
>>> rdd.countByValue() #Count RDD instances by value
defaultdict(<type 'int'>,{('b',2):1,('a',2):1,('a',7):1})
>>> rdd.collectAsMap() #Return (key,value) pairs as a dictionary
{'a': 2, 'b': 2}
>>> rdd3.sum() #Sum of RDD elements
4950
>>> sc.parallelize([]).isEmpty() #Check whether RDD is empty
True

Summary 

>>> rdd3.max() #Maximum value of RDD elements 
99
>>> rdd3.min() #Minimum value of RDD elements
0
>>> rdd3.mean() #Mean value of RDD elements 
49.5
>>> rdd3.stdev() #Standard deviation of RDD elements 
28.866070047722118
>>> rdd3.variance() #Compute variance of RDD elements 
833.25
>>> rdd3.histogram(3) #Compute histogram by bins
([0,33,66,99],[33,33,34])
>>> rdd3.stats() #Summary statistics (count, mean, stdev, max & min)

Applying Functions 

#Apply a function to each RDD element
>>> rdd.map(lambda x: x+(x[1],x[0])).collect()
[('a',7,7,'a'),('a',2,2,'a'),('b',2,2,'b')]
#Apply a function to each RDD element and flatten the result
>>> rdd5 = rdd.flatMap(lambda x: x+(x[1],x[0]))
>>> rdd5.collect()
['a',7,7,'a','a',2,2,'a','b',2,2,'b']
#Apply a flatMap function to each (key,value) pair of rdd4 without changing the keys
>>> rdd4.flatMapValues(lambda x: x).collect()
[('a','x'),('a','y'),('a','z'),('b','p'),('b','r')]

Selecting Data

Getting

>>> rdd.collect() #Return a list with all RDD elements 
[('a', 7), ('a', 2), ('b', 2)]
>>> rdd.take(2) #Take first 2 RDD elements 
[('a', 7),  ('a', 2)]
>>> rdd.first() #Take first RDD element
('a', 7)
>>> rdd.top(2) #Take top 2 RDD elements 
[('b', 2), ('a', 7)]

Sampling

>>> rdd3.sample(False, 0.15, 81).collect() #Return sampled subset of rdd3
[3,4,27,31,40,41,42,43,60,76,79,80,86,97]

Filtering

>>> rdd.filter(lambda x: "a" in x).collect() #Filter the RDD
[('a',7),('a',2)]
>>> rdd5.distinct().collect() #Return distinct RDD values
['a',2,'b',7]
>>> rdd.keys().collect() #Return (key,value) RDD's keys
['a','a','b']

Iterating 

>>> def g(x): print(x)
>>> rdd.foreach(g) #Apply a function to all RDD elements
('a', 7)
('b', 2)
('a', 2)

Reshaping Data 

Reducing

>>> rdd.reduceByKey(lambda x,y : x+y).collect() #Merge the rdd values for each key
[('a',9),('b',2)]
>>> rdd.reduce(lambda a, b: a+ b) #Merge the rdd values
('a', 7, 'a' , 2 , 'b' , 2)

 

Grouping by

>>> rdd3.groupBy(lambda x: x % 2) #Return RDD of grouped values
          .mapValues(list)
          .collect()
>>> rdd.groupByKey() #Group rdd by key
          .mapValues(list)
          .collect() 
[('a',[7,2]),('b',[2])]

Aggregating

>>> seqOp = (lambda x,y: (x[0]+y,x[1]+1))
>>> combOp = (lambda x,y: (x[0]+y[0],x[1]+y[1]))
#Aggregate RDD elements of each partition and then the results
>>> rdd3.aggregate((0,0),seqOp,combOp)
(4950,100)
#Aggregate values of each RDD key
>>> rdd.aggregateByKey((0,0),seqOp,combOp).collect()
[('a',(9,2)),('b',(2,1))]
>>> from operator import add
#Aggregate the elements of each partition, and then the results
>>> rdd3.fold(0,add)
4950
#Merge the values for each key
>>> rdd.foldByKey(0, add).collect()
[('a',9),('b',2)]
#Create tuples of RDD elements by applying a function
>>> rdd3.keyBy(lambda x: x+x).collect()

Mathematical Operations 

>>> rdd.subtract(rdd2).collect() #Return each rdd value not contained in rdd2
[('b',2),('a',7)]
#Return each (key,value) pair of rdd2 with no matching key in rdd
>>> rdd2.subtractByKey(rdd).collect()
[('d',1)]
>>> rdd.cartesian(rdd2).collect() #Return the Cartesian product of rdd and rdd2

Sort 

>>> rdd2.sortBy(lambda x: x[1]).collect() #Sort RDD by given function
[('d',1),('b',1),('a',2)]
>>> rdd2.sortByKey().collect() #Sort (key,value) RDD by key
[('a',2),('b',1),('d',1)]

Repartitioning 

>>> rdd.repartition(4) #New RDD with 4 partitions
>>> rdd.coalesce(1) #Decrease the number of partitions in the RDD to 1

Saving 

>>> rdd.saveAsTextFile("rdd.txt")
>>> rdd.saveAsHadoopFile("hdfs:// namenodehost/parent/child",
               'org.apache.hadoop.mapred.TextOutputFormat')

Stopping SparkContext 

>>> sc.stop()

Execution 

$ ./bin/spark-submit examples/src/main/python/pi.py

Have this Cheat Sheet at your fingertips

Original article source at https://www.datacamp.com

#pyspark #cheatsheet #spark #python

Callum Slater

PySpark Cheat Sheet: Spark DataFrames in Python

This PySpark SQL cheat sheet is your handy companion to Apache Spark DataFrames in Python and includes code samples.

You'll probably already know about Apache Spark, the fast, general and open-source engine for big data processing; it has built-in modules for streaming, SQL, machine learning and graph processing. Spark allows you to speed up analytic applications by up to 100 times compared to other technologies on the market today. Interfacing Spark with Python is easy with PySpark: this Spark Python API exposes the Spark programming model to Python.

Now, it's time to tackle the Spark SQL module, which is meant for structured data processing, and the DataFrame API, which is not only available in Python, but also in Scala, Java, and R.

Without further ado, here's the cheat sheet:

PySpark SQL cheat sheet

This PySpark SQL cheat sheet covers the basics of working with Apache Spark DataFrames in Python: from initializing the SparkSession to creating DataFrames, inspecting the data, handling duplicate values, querying, adding, updating or removing columns, grouping, filtering or sorting data. You'll also see how to run SQL queries programmatically, how to save your data to parquet and JSON files, and how to stop your SparkSession.

Spark SQL is Apache Spark's module for working with structured data.

Initializing SparkSession 
 

A SparkSession can be used to create DataFrames, register DataFrames as tables, execute SQL over tables, cache tables, and read parquet files.

>>> from pyspark.sql import SparkSession
>>> spark = SparkSession \
     .builder\
     .appName("Python Spark SQL basic example") \
     .config("spark.some.config.option", "some-value") \
     .getOrCreate()

Creating DataFrames
 

From RDDs

>>> from pyspark.sql import Row
>>> from pyspark.sql.types import *

Infer Schema

>>> sc = spark.sparkContext
>>> lines = sc.textFile("people.txt")
>>> parts = lines.map(lambda l: l.split(","))
>>> people = parts.map(lambda p: Row(name=p[0], age=int(p[1])))
>>> peopledf = spark.createDataFrame(people)

Specify Schema

>>> people = parts.map(lambda p: Row(name=p[0],
               age=int(p[1].strip())))
>>>  schemaString = "name age"
>>> fields = [StructField(field_name, StringType(), True) for field_name in schemaString.split()]
>>> schema = StructType(fields)
>>> spark.createDataFrame(people, schema).show()

 

From Spark Data Sources

JSON

>>>  df = spark.read.json("customer.json")
>>> df.show()

>>>  df2 = spark.read.load("people.json", format="json")

Parquet files

>>> df3 = spark.read.load("users.parquet")

TXT files

>>> df4 = spark.read.text("people.txt")

Filter 

#Filter entries of age, only keep those records of which the values are >24
>>> df.filter(df["age"]>24).show()

Duplicate Values 

>>> df = df.dropDuplicates()

Queries 
 

>>> from pyspark.sql import functions as F
>>> from pyspark.sql.functions import explode

Select

>>> df.select("firstName").show() #Show all entries in firstName column
>>> df.select("firstName","lastName") \
      .show()
>>> df.select("firstName", #Show all entries in firstName, age and type
              "age",
              explode("phoneNumber") \
              .alias("contactInfo")) \
      .select("contactInfo.type",
              "firstName",
              "age") \
      .show()
>>> df.select(df["firstName"],df["age"]+ 1) #Show all entries in firstName and age, .show() add 1 to the entries of age
>>> df.select(df['age'] > 24).show() #Show all entries where age >24

When

>>> df.select("firstName", #Show firstName and 0 or 1 depending on age >30
               F.when(df.age > 30, 1) \
              .otherwise(0)) \
      .show()
>>> df[df.firstName.isin("Jane","Boris")] \ #Show firstName if in the given options
      .collect()

Like 

>>> df.select("firstName", #Show firstName, and lastName is TRUE if lastName is like Smith
              df.lastName.like("Smith")) \
     .show()

Startswith - Endswith 

>>> df.select("firstName", #Show firstName, and TRUE if lastName starts with Sm
              df.lastName \
                .startswith("Sm")) \
      .show()
>>> df.select(df.lastName.endswith("th"))\ #Show last names ending in th
      .show()

Substring 

>>> df.select(df.firstName.substr(1, 3) \ #Return substrings of firstName
                          .alias("name")) \
        .collect()

Between 

>>> df.select(df.age.between(22, 24)) \ #Show age: values are TRUE if between 22 and 24
          .show()

Add, Update & Remove Columns 

Adding Columns

 >>> df = df.withColumn('city',df.address.city) \
            .withColumn('postalCode',df.address.postalCode) \
            .withColumn('state',df.address.state) \
            .withColumn('streetAddress',df.address.streetAddress) \
            .withColumn('telePhoneNumber', explode(df.phoneNumber.number)) \
            .withColumn('telePhoneType', explode(df.phoneNumber.type)) 

Updating Columns

>>> df = df.withColumnRenamed('telePhoneNumber', 'phoneNumber')

Removing Columns

  >>> df = df.drop("address", "phoneNumber")
 >>> df = df.drop(df.address).drop(df.phoneNumber)
 

Missing & Replacing Values 
 

>>> df.na.fill(50).show() #Replace null values
>>> df.na.drop().show() #Return new df omitting rows with null values
>>> df.na \ #Return new df replacing one value with another
      .replace(10, 20) \
      .show()

GroupBy 

>>> df.groupBy("age")\ #Group by age, count the members in the groups
      .count() \
      .show()

Sort 
 

>>> peopledf.sort(peopledf.age.desc()).collect()
>>> df.sort("age", ascending=False).collect()
>>> df.orderBy(["age","city"],ascending=[0,1])\
     .collect()

Repartitioning 

>>> df.repartition(10)\ #df with 10 partitions
      .rdd \
      .getNumPartitions()
>>> df.coalesce(1).rdd.getNumPartitions() #df with 1 partition

Running Queries Programmatically 
 

Registering DataFrames as Views

>>> peopledf.createGlobalTempView("people")
>>> df.createTempView("customer")
>>> df.createOrReplaceTempView("customer")

Query Views

>>> df5 = spark.sql("SELECT * FROM customer").show()
>>> peopledf2 = spark.sql("SELECT * FROM global_temp.people")\
               .show()

Inspect Data 
 

>>> df.dtypes #Return df column names and data types
>>> df.show() #Display the content of df
>>> df.head() #Return first n rows
>>> df.first() #Return first row
>>> df.take(2) #Return the first n rows
>>> df.schema #Return the schema of df
>>> df.describe().show() #Compute summary statistics
>>> df.columns #Return the columns of df
>>> df.count() #Count the number of rows in df
>>> df.distinct().count() #Count the number of distinct rows in df
>>> df.printSchema() #Print the schema of df
>>> df.explain() #Print the (logical and physical) plans

Output

Data Structures 
 

>>> rdd1 = df.rdd #Convert df into an RDD
>>> df.toJSON().first() #Convert df into an RDD of strings
>>> df.toPandas() #Return the contents of df as a Pandas DataFrame

Write & Save to Files 

>>> df.select("firstName", "city")\
       .write \
       .save("nameAndCity.parquet")
 >>> df.select("firstName", "age") \
       .write \
       .save("namesAndAges.json",format="json")

Stopping SparkSession 

>>> spark.stop()

Have this Cheat Sheet at your fingertips

Original article source at https://www.datacamp.com

#pyspark #cheatsheet #spark #dataframes #python #bigdata

Royce Reinger

Calculates Edit Distance using Damerau-Levenshtein Algorithm

damerau-levenshtein

The damerau-levenshtein gem allows you to find the edit distance between two UTF-8 or ASCII encoded strings with O(N*M) efficiency.

This gem implements the pure Levenshtein algorithm and the Damerau modification of it (where a transposition of 2 adjacent characters counts as 1 edit distance). It also includes the Boehmer & Rees (2008) modification of the Damerau algorithm, where transpositions of blocks bigger than 1 character are taken into account as well (Rees 2014).

require "damerau-levenshtein"
DamerauLevenshtein.distance("Something", "Smoething") #returns 1

The gem also returns a diff between two strings according to the Levenshtein algorithm. The diff is expressed with the tags <ins>, <del>, and <subst>, which make it possible to highlight the difference between strings in a flexible way.

require "damerau-levenshtein"
differ = DamerauLevenshtein::Differ.new
differ.run("corn", "cron")
# output: ["c<subst>or</subst>n", "c<subst>ro</subst>n"]

Dependencies

sudo apt-get install build-essential libgmp3-dev

Installation

gem install damerau-levenshtein

Examples

require "damerau-levenshtein"
dl = DamerauLevenshtein
  • compare using Damerau Levenshtein algorithm
dl.distance("Something", "Smoething") #returns 1
  • compare using Levensthein algorithm
dl.distance("Something", "Smoething", 0) #returns 2
  • compare using Boehmer & Rees modification
dl.distance("Something", "meSothing", 2) #returns 2 instead of 4
  • comparison of words with UTF-8 characters should work fine:
dl.distance("Sjöstedt", "Sjostedt") #returns 1
  • compare two arrays
dl.array_distance([1,2,3,5], [1,2,3,4]) #returns 1
  • return diff between two strings
differ = DamerauLevenshtein::Differ.new
differ.run("Something", "smthg")
  • return diff between two strings in raw format
differ = DamerauLevenshtein::Differ.new
differ.format = :raw
differ.run("Something", "smthg")

API Description

Methods

DamerauLevenshtein.version

DamerauLevenshtein.version
#returns version number of the gem

DamerauLevenshtein.distance

DamerauLevenshtein.distance(string1, string2, block_size, max_distance)
#returns edit distance between 2 strings

DamerauLevenshtein.string_distance(string1, string2, block_size, max_distance)
# an alias for .distance

DamerauLevenshtein.array_distance(array1, array2, block_size, max_distance)
# returns edit distance between 2 arrays of integers

DamerauLevenshtein.distance and .array_distance take 4 arguments:

  • string1 (array1 for .array_distance)
  • string2 (array2 for .array_distance)
  • block_size (default is 1)
  • max_distance (default is 10)

block_size determines maximum number of characters in a transposition block:

block_size = 0
(transposition does not count -- it is a pure Levenshtein algorithm)

block_size = 1
(transposition between 2 adjacent characters --
it is the pure Damerau-Levenshtein algorithm)

block_size = 2
(transposition between blocks as big as 2 characters -- so abcd and cdab
count as edit distance 2, not 4)

block_size = 3
(transposition between blocks as big as 3 characters --
so abcdef and defabc count as edit distance 3, not 6)

etc.
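
These block transpositions can be checked directly; a small sketch using the abcd/cdab example from above:

require "damerau-levenshtein"

# With block_size 1, the 2-character blocks are not recognized as a
# transposition, so every position counts as an edit.
DamerauLevenshtein.distance("abcd", "cdab", 1) #returns 4
# With block_size 2, "ab" and "cd" transpose as whole blocks.
DamerauLevenshtein.distance("abcd", "cdab", 2) #returns 2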

max_distance is a threshold after which the algorithm gives up and returns max_distance instead of the real edit distance.

The Levenshtein algorithm is expensive, so it makes sense to give up when the edit distance is becoming too big. The max_distance argument does just that.


DamerauLevenshtein.distance("abcdefg", "1234567", 0, 3)
# output: 4 -- it gave up when edit distance exceeded 3

DamerauLevenshtein::Differ

differ = DamerauLevenshtein::Differ.new creates an instance of new differ class to return difference between two strings

differ.format shows current format for diff. Default is :tag format

differ.format = :raw changes current format for diffs. Possible values are :tag and :raw

differ.run("String1", "String2") returns difference between two strings.

For example:

differ = DamerauLevenshtein::Differ.new
differ.run("Something", "smthng")
# output: ["<ins>S</ins><subst>o</subst>m<ins>e</ins>th<ins>i</ins>ng",
#          "<del>S</del><subst>s</subst>m<del>e</del>th<del>i</del>ng"]

Or with parsing:

require "damerau-levenshtein"
require "nokogiri"

differ = DamerauLevenshtein::Differ.new
res = differ.run("Something", "Smothing!")
nodes = Nokogiri::XML("<root>#{res.first}</root>")

markup = nodes.root.children.map do |n|
  case n.name
  when "text"
    n.text
  when "del"
    "~~#{n.children.first.text}~~"
  when "ins"
    "*#{n.children.first.text}*"
  when "subst"
    "**#{n.children.first.text}**"
  end
end.join("")

puts markup

Output

S*o*m**e**thing~~!~~

Contributing to damerau-levenshtein

  • Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet
  • Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it
  • Fork the project
  • Start a feature/bugfix branch
  • Commit and push until you are happy with your contribution
  • Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
  • Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or it is otherwise necessary, that is fine, but please isolate it to its own commit so I can cherry-pick around it.

Versioning

This gem follows the practices of Semantic Versioning.

Download Details: 

Author: GlobalNamesArchitecture
Source Code: https://github.com/GlobalNamesArchitecture/damerau-levenshtein 
License: MIT license

#ruby #algorithm 

Josefa Corwin

A Template Language That Completely Separates Structure and Logic/Ruby

Curly

Curly is a template language that completely separates structure and logic. Instead of interspersing your HTML with snippets of Ruby, all logic is moved to a presenter class.

Installing

Installing Curly is as simple as running gem install curly-templates. If you're using Bundler to manage your dependencies, add this to your Gemfile:

gem 'curly-templates'

Curly can also install an application layout file, replacing the .erb file commonly created by Rails. If you wish to use this, run the curly:install generator.

$ rails generate curly:install

How to use Curly

In order to use Curly for a view or partial, use the suffix .curly instead of .erb, e.g. app/views/posts/_comment.html.curly. Curly will look for a corresponding presenter class named Posts::CommentPresenter. By convention, these are placed in app/presenters/, so in this case the presenter would reside in app/presenters/posts/comment_presenter.rb. Note that presenters for partials are not prepended with an underscore.

Add some HTML to the partial template along with some Curly components:

<!-- app/views/posts/_comment.html.curly -->
<div class="comment">
  <p>
    {{author_link}} posted {{time_ago}} ago.
  </p>

  {{body}}

  {{#author?}}
    <p>{{deletion_link}}</p>
  {{/author?}}
</div>

The presenter will be responsible for providing the data for the components. Add the necessary Ruby code to the presenter:

# app/presenters/posts/comment_presenter.rb
class Posts::CommentPresenter < Curly::Presenter
  presents :comment

  def body
    SafeMarkdown.render(@comment.body)
  end

  def author_link
    link_to @comment.author.name, @comment.author, rel: "author"
  end

  def deletion_link
    link_to "Delete", @comment, method: :delete
  end

  def time_ago
    time_ago_in_words(@comment.created_at)
  end

  def author?
    @comment.author == current_user
  end
end

The partial can now be rendered like any other, e.g. by calling

render 'comment', comment: comment
render comment
render collection: post.comments

Curly components are surrounded by curly brackets, e.g. {{hello}}. They always map to a public method on the presenter class, in this case #hello. Methods ending in a question mark can be used for conditional blocks, e.g. {{#admin?}} ... {{/admin?}}.

Identifiers

Curly components can specify an identifier using the so-called dot notation: {{x.y.z}}. This can be very useful if the data you're accessing is hierarchical in nature. One common example is I18n:

<h1>{{i18n.homepage.header}}</h1>
# In the presenter, the identifier is passed as an argument to the method. The
# argument will always be a String.
def i18n(key)
  translate(key)
end

The identifier is separated from the component name with a dot. If the presenter method has a default value for the argument, the identifier is optional – otherwise it's mandatory.
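
For example, a presenter method with a default argument value makes the identifier optional (a hypothetical component, not part of the Curly API itself):

class Posts::ShowPresenter < Curly::Presenter
  # {{avatar}} renders the default small size, while {{avatar.large}}
  # passes "large" as the identifier argument.
  def avatar(size = "small")
    image_tag("avatar-#{size}.png")
  end
end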

Attributes

In addition to an identifier, Curly components can be annotated with attributes. These are key-value pairs that affect how a component is rendered.

The syntax is reminiscent of HTML:

<div>{{sidebar rows=3 width=200px title="I'm the sidebar!"}}</div>

The presenter method that implements the component must have a matching keyword argument:

def sidebar(rows: "1", width: "100px", title:); end

All argument values will be strings. A compilation error will be raised if

  • an attribute is used in a component without a matching keyword argument being present in the method definition; or
  • a required keyword argument in the method definition is not set as an attribute in the component.

You can define default values using Ruby's own syntax. Additionally, if the presenter method accepts arbitrary keyword arguments using the **doublesplat syntax then all attributes will be valid for the component, e.g.

def greetings(**names)
  names.map {|name, greeting| "#{name}: #{greeting}!" }.join("\n")
end
{{greetings alice=hello bob=hi}}
<!-- The above would be rendered as: -->
alice: hello!
bob: hi!

Note that since keyword arguments in Ruby are represented as Symbol objects, which are not garbage collected in Ruby versions less than 2.2, accepting arbitrary attributes represents a security vulnerability if your application allows untrusted Curly templates to be rendered. Only use this feature with trusted templates if you're not on Ruby 2.2 yet.

Conditional blocks

If there is some content you only want rendered under specific circumstances, you can use conditional blocks. The {{#admin?}}...{{/admin?}} syntax will only render the content of the block if the admin? method on the presenter returns true, while the {{^admin?}}...{{/admin?}} syntax will only render the content if it returns false.

Both forms can have an identifier: {{#locale.en?}}...{{/locale.en?}} will only render the block if the locale? method on the presenter returns true given the argument "en". Here's how to implement that method in the presenter:

class SomePresenter < Curly::Presenter
  # Allows rendering content only if the locale matches a specified identifier.
  def locale?(identifier)
    current_locale == identifier
  end
end

Furthermore, attributes can be set on the block. These only need to be specified when opening the block, not when closing it:

{{#square? width=3 height=3}}
  <p>It's square!</p>
{{/square?}}

Attributes work the same way as they do for normal components.

Collection blocks

Sometimes you want to render one or more items within the current template, and splitting out a separate template and rendering that in the presenter is too much overhead. You can instead define the template that should be used to render the items inline in the current template using the collection block syntax.

Collection blocks are opened using an asterisk:

{{*comments}}
  <li>{{body}} ({{author_name}})</li>
{{/comments}}

The presenter will need to expose the method #comments, which should return a collection of objects:

class Posts::ShowPresenter < Curly::Presenter
  presents :post

  def comments
    @post.comments
  end
end

The template within the collection block will be used to render each item, and it will be backed by a presenter named after the component – in this case, comments. The name will be singularized and Curly will try to find the presenter class in the following order:

  • Posts::ShowPresenter::CommentPresenter
  • Posts::CommentPresenter
  • CommentPresenter

This allows you some flexibility with regards to how you want to organize these nested templates and presenters.

Note that the nested template will only have access to the methods on the nested presenter, but all variables passed to the "parent" presenter will be forwarded to the nested presenter. In addition, the current item in the collection will be passed, as well as that item's index in the collection:

class Posts::CommentPresenter < Curly::Presenter
  presents :post, :comment, :comment_counter

  def number
    # `comment_counter` is automatically set to the item's index in the collection,
    # starting with 1.
    @comment_counter
  end

  def body
    @comment.body
  end

  def author_name
    @comment.author.name
  end
end

Collection blocks are an alternative to splitting out a separate template and rendering that from the presenter – which solution is best depends on your use case.

Context blocks

While collection blocks allow you to define the template that should be used to render items in a collection right within the parent template, context blocks allow you to define the template for an arbitrary context. This is very powerful, and can be used to define widget-style components and helpers, and provide an easy way to work with structured data. Let's say you have a comment form on your page, and you'd rather keep the template inline. A simple template could look like:

<!-- post.html.curly -->
<h1>{{title}}</h1>
{{body}}

{{@comment_form}}
  <b>Name: </b> {{name_field}}<br>
  <b>E-mail: </b> {{email_field}}<br>
  {{comment_field}}

  {{submit_button}}
{{/comment_form}}

Note that an @ character is used to denote a context block. Like with collection blocks, a separate presenter class is used within the block, and a simple convention is used to find it. The name of the context component (in this case, comment_form) will be camel cased, and the current presenter's namespace will be searched:

class PostPresenter < Curly::Presenter
  presents :post
  def title; @post.title; end
  def body; markdown(@post.body); end

  # A context block method *must* take a block argument. The return value
  # of the method will be used when rendering. Calling the block argument will
  # render the nested template. If you pass a value when calling the block
  # argument it will be passed to the presenter.
  def comment_form(&block)
    form_for(Comment.new, &block)
  end

  # The presenter name is automatically deduced.
  class CommentFormPresenter < Curly::Presenter
    # The value passed to the block argument will be passed in a parameter named
    # after the component.
    presents :comment_form

    # Any parameters passed to the parent presenter will be forwarded to this
    # presenter as well.
    presents :post

    def name_field
      @comment_form.text_field :name
    end

    # ...
  end
end

Context blocks were designed to work well with Rails' helper methods such as form_for and content_tag, but you can also work directly with the block. For instance, if you want to directly control the value that is passed to the nested presenter, you can call the call method on the block yourself:

def author(&block)
  content_tag :div, class: "author" do
    # The return value of `call` will be the result of rendering the nested template
    # with the argument. You can post-process the string if you want.
    block.call(@post.author)
  end
end

Context shorthand syntax

If you find yourself opening a context block just in order to use a single component, e.g. {{@author}}{{name}}{{/author}}, you can use the shorthand syntax instead: {{author:name}}. This works for all component types, e.g.

{{#author:admin?}}
  <p>The author is an admin!</p>
{{/author:admin?}}

The syntax works for nested contexts as well, e.g. {{comment:author:name}}. Any identifier and attributes are passed to the target component, which in this example would be {{name}}.

Setting up state

Although most code in Curly presenters should be free of side effects, sometimes side effects are required. One common example is defining content for a content_for block.

If a Curly presenter class defines a setup! method, it will be called before the view is rendered:

class PostPresenter < Curly::Presenter
  presents :post

  def setup!
    content_for :title, post.title

    content_for :sidebar do
      render 'post_sidebar', post: post
    end
  end
end

Escaping Curly syntax

In order to have {{ appear verbatim in the rendered HTML, use the triple Curly escape syntax:

This is {{{escaped}}.

You don't need to escape the closing }}.

Comments

If you want to add comments to your Curly templates that are not visible in the rendered HTML, use the following syntax:

{{! This is some interesting stuff }}

Presenters

Presenters are classes that inherit from Curly::Presenter – they're usually placed in app/presenters/, but you can put them anywhere you'd like. The name of the presenter classes match the virtual path of the view they're part of, so if your controller is rendering posts/show, the Posts::ShowPresenter class will be used. Note that Curly is only used to render a view if a template can be found – in this case, at app/views/posts/show.html.curly.

Presenters can declare a list of accepted variables using the presents method:

class Posts::ShowPresenter < Curly::Presenter
  presents :post
end

A variable can have a default value:

class Posts::ShowPresenter < Curly::Presenter
  presents :post
  presents :comment, default: nil
end

Any public method defined on the presenter is made available to the template as a component:

class Posts::ShowPresenter < Curly::Presenter
  presents :post

  def title
    @post.title
  end

  def author_link
    # You can call any Rails helper from within a presenter instance:
    link_to author.name, profile_path(author), rel: "author"
  end

  private

  # Private methods are not available to the template, so they're safe to
  # use.
  def author
    @post.author
  end
end

Presenter methods can even take an argument. Say your Curly template has the content {{t.welcome_message}}, where welcome_message is an I18n key. The following presenter method would make the lookup work:

def t(key)
  translate(key)
end

That way, simple "functions" can be added to the Curly language. Make sure these do not have any side effects, though, as an important part of Curly is the idempotence of the templates.

Layouts and content blocks

Both layouts and content blocks (see content_for) use yield to signal that content can be inserted. Curly works just like ERB, so calling yield with no arguments will make the view usable as a layout, while passing a Symbol will make it try to read a content block with the given name:

# Given you have the following Curly template in
# app/views/layouts/application.html.curly
#
#   <html>
#     <head>
#       <title>{{title}}</title>
#     </head>
#     <body>
#       <div id="sidebar">{{sidebar}}</div>
#       {{body}}
#     </body>
#   </html>
#
class ApplicationLayout < Curly::Presenter
  def title
    "You can use methods just like in any other presenter!"
  end

  def sidebar
    # A view can call `content_for(:sidebar) { "some HTML here" }`
    yield :sidebar
  end

  def body
    # The view will be rendered and inserted here:
    yield
  end
end

Rails helper methods

In order to make a Rails helper method available as a component in your template, use the exposes_helper method:

class Layouts::ApplicationPresenter < Curly::Presenter
  # The components {{sign_in_path}} and {{root_path}} are made available.
  exposes_helper :sign_in_path, :root_path
end

Testing

Presenters can be tested directly, but sometimes it makes sense to integrate with Rails on some levels. Currently, only RSpec is directly supported, but you can easily instantiate a presenter:

SomePresenter.new(context, assigns)

context is a view context, i.e. an object that responds to render, has all the helper methods you expect, etc. You can pass in a test double and see what you need to stub out. assigns is the hash containing the controller and local assigns. You need to pass in a key for each argument the presenter expects.

Testing with RSpec

In order to test presenters with RSpec, make sure you have rspec-rails in your Gemfile. Given the following presenter:

# app/presenters/posts/show_presenter.rb
class Posts::ShowPresenter < Curly::Presenter
  presents :post

  def body
    Markdown.render(@post.body)
  end
end

You can test the presenter methods like this:

# You can put this in your `spec_helper.rb`.
require 'curly/rspec'

# spec/presenters/posts/show_presenter_spec.rb
describe Posts::ShowPresenter, type: :presenter do
  describe "#body" do
    it "renders the post's body as Markdown" do
      assign(:post, double(:post, body: "**hello!**"))
      expect(presenter.body).to eq "<strong>hello!</strong>"
    end
  end
end

Note that your spec must be tagged with type: :presenter.

Examples

Here is a simple Curly template – it will be looked up by Rails automatically.

<!-- app/views/posts/show.html.curly -->
<h1>{{title}}</h1>
<p class="author">{{author}}</p>
<p>{{description}}</p>

{{comment_form}}

<div class="comments">
  {{comments}}
</div>

When rendering the template, a presenter is automatically instantiated with the variables assigned in the controller or the render call. The presenter declares the variables it expects with presents, which takes a list of variables names.

# app/presenters/posts/show_presenter.rb
class Posts::ShowPresenter < Curly::Presenter
  presents :post

  def title
    @post.title
  end

  def author
    link_to(@post.author.name, @post.author, rel: "author")
  end

  def description
    Markdown.new(@post.description).to_html.html_safe
  end

  def comments
    render 'comment', collection: @post.comments
  end

  def comment_form
    if @post.comments_allowed?
      render 'comment_form', post: @post
    else
      content_tag(:p, "Comments are disabled for this post")
    end
  end
end

Caching

Caching is handled at two levels in Curly – statically and dynamically. Static caching concerns changes to your code and templates introduced by deploys. If you do not wish to clear your entire cache every time you deploy, you need a way to indicate that some view, helper, or other piece of logic has changed.

Dynamic caching concerns changes that happen on the fly, usually made by your users in the running system. You wish to cache a view or a partial and have it expire whenever some data is updated – usually whenever a specific record is changed.

Dynamic Caching

Because of the way logic is contained in presenters, caching entire views or partials by the data they present becomes exceedingly straightforward. Simply define a #cache_key method that returns a non-nil object, and the return value will be used to cache the template.

Whereas in ERB you would include the cache call in the template itself:

<% cache([@post, signed_in?]) do %>
  ...
<% end %>

In Curly you would instead declare it in the presenter:

class Posts::ShowPresenter < Curly::Presenter
  presents :post

  def cache_key
    [@post, signed_in?]
  end
end

Likewise, you can add a #cache_duration method if you wish to automatically expire the fragment cache:

class Posts::ShowPresenter < Curly::Presenter
  ...

  def cache_duration
    30.minutes
  end
end

In order to set any cache option, define a #cache_options method that returns a Hash of options:

class Posts::ShowPresenter < Curly::Presenter
  ...

  def cache_options
    { compress: true, namespace: "my-app" }
  end
end

Static Caching

Static caching will only be enabled for presenters that define a non-nil #cache_key method (see Dynamic Caching).

In order to make a deploy expire the cache for a specific view, set the version of the view to something new, usually by incrementing by one:

class Posts::ShowPresenter < Curly::Presenter
  version 3

  def cache_key
    # Some objects
  end
end

This will change the cache keys for all instances of that view, effectively expiring the old cache entries.

This works well for views, or for partials that are rendered in views that themselves are not cached. If the partial is nested within a view that is cached, however, the outer cache will not be expired. The solution is to register that the inner partial is a dependency of the outer one such that Curly can automatically deduce that the outer partial cache should be expired:

class Posts::ShowPresenter < Curly::Presenter
  version 3
  depends_on 'posts/comment'

  def cache_key
    # Some objects
  end
end

class Posts::CommentPresenter < Curly::Presenter
  version 4

  def cache_key
    # Some objects
  end
end

Now, if the version of Posts::CommentPresenter is bumped, the cache keys for both presenters would change. You can register any number of view paths with depends_on.

Curly integrates well with the caching mechanism in Rails 4 (or Cache Digests in Rails 3), so the dependencies defined with depends_on will be tracked by Rails. This will allow you to deploy changes to your templates and have the relevant caches automatically expire.

Thanks

Thanks to Zendesk for sponsoring the work on Curly.


Author: zendesk
Source code: https://github.com/zendesk/curly

#ruby   #ruby-on-rails 
