Now, Transformers are being applied for keyword spotting

The Transformer architecture has been successfully applied across various domains, including language processing, computer vision, and time series analysis. Researchers have now found yet another domain that can benefit from the Transformer: keyword spotting.


Edward Jackson


PySpark Cheat Sheet: Spark in Python

This PySpark cheat sheet with code samples covers the basics like initializing Spark in Python, loading data, sorting, and repartitioning.

Apache Spark is generally known as a fast, general-purpose, open-source engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing. It allows you to speed up analytic applications by up to 100 times compared to other technologies on the market today. You can interface Spark with Python through "PySpark", the Spark Python API that exposes the Spark programming model to Python.

Even though working with Spark will remind you in many ways of working with Pandas DataFrames, you'll also see that it can be tough getting familiar with all the functions that you can use to query, transform, and inspect your data. What's more, if you've never worked with any other programming language or if you're new to the field, it might be hard to distinguish between the different RDD operations.

Let's face it, map() and flatMap() are different enough, but it might still come as a challenge to decide which one you really need when you're faced with them in your analysis. Or what about other functions, like reduce() and reduceByKey()?
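As a rough sketch of the difference, here is what these four operations look like in plain Python, using ordinary lists instead of distributed RDDs (the Spark methods behave analogously; this is a local illustration, not actual Spark code):

```python
from functools import reduce
from collections import defaultdict

pairs = [('a', 7), ('a', 2), ('b', 2)]

# map-like: exactly one output element per input element
mapped = [x + (x[1], x[0]) for x in pairs]

# flatMap-like: each input element may yield several output elements,
# and the results are flattened into a single sequence
flat = [y for x in pairs for y in x + (x[1], x[0])]

# reduce-like: merge ALL values into one result, ignoring keys
total = reduce(lambda a, b: a + b, [v for _, v in pairs])  # 7 + 2 + 2 = 11

# reduceByKey-like: merge values independently within each key
by_key = defaultdict(int)
for k, v in pairs:
    by_key[k] += v  # {'a': 9, 'b': 2}
```

Note how `mapped` keeps three nested tuples while `flat` spills their contents into one twelve-element list; that flattening step is the whole difference between map() and flatMap().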

PySpark cheat sheet

Even though the documentation is very elaborate, it never hurts to have a cheat sheet by your side, especially when you're just getting into it.

This PySpark cheat sheet covers the basics, from initializing Spark and loading your data, to retrieving RDD information, sorting, filtering and sampling your data. But that's not all. You'll also see that topics such as repartitioning, iterating, merging, saving your data and stopping the SparkContext are included in the cheat sheet. 

Note that the examples in the document use small data sets to illustrate the effect of specific functions on your data. In real-life data analysis, you'll be using Spark to analyze big data.

PySpark is the Spark Python API that exposes the Spark programming model to Python.

Initializing Spark 


>>> from pyspark import SparkContext
>>> sc = SparkContext(master = 'local[2]')

Inspect SparkContext 

>>> sc.version #Retrieve SparkContext version
>>> sc.pythonVer #Retrieve Python version
>>> sc.master #Master URL to connect to
>>> str(sc.sparkHome) #Path where Spark is installed on worker nodes
>>> str(sc.sparkUser()) #Retrieve name of the Spark User running SparkContext
>>> sc.appName #Return application name
>>> sc.applicationId #Retrieve application ID
>>> sc.defaultParallelism #Return default level of parallelism
>>> sc.defaultMinPartitions #Default minimum number of partitions for RDDs


>>> from pyspark import SparkConf, SparkContext
>>> conf = (SparkConf()
         .setAppName("My app")
         .set("spark.executor.memory", "1g"))
>>> sc = SparkContext(conf=conf)

Using the Shell 

In the PySpark shell, a special interpreter-aware SparkContext is already created in the variable called sc.

$ ./bin/spark-shell --master local[2]
$ ./bin/pyspark --master local[4] --py-files

Set which master the context connects to with the --master argument, and add Python .zip, .egg or .py files to the runtime path by passing a comma-separated list to --py-files.

Loading Data 

Parallelized Collections 

>>> rdd = sc.parallelize([('a',7),('a',2),('b',2)])
>>> rdd2 = sc.parallelize([('a',2),('d',1),('b',1)])
>>> rdd3 = sc.parallelize(range(100))
>>> rdd4 = sc.parallelize([("a",["x","y","z"]),
               ("b",["p","r"])])

External Data 

Read either one text file from HDFS, a local file system or any Hadoop-supported file system URI with textFile(), or read in a directory of text files with wholeTextFiles(). 

>>> textFile = sc.textFile("/my/directory/*.txt")
>>> textFile2 = sc.wholeTextFiles("/my/directory/")

Retrieving RDD Information 

Basic Information 

>>> rdd.getNumPartitions() #List the number of partitions
>>> rdd.count() #Count RDD instances
3
>>> rdd.countByKey() #Count RDD instances by key
defaultdict(<type 'int'>,{'a':2,'b':1})
>>> rdd.countByValue() #Count RDD instances by value
defaultdict(<type 'int'>,{('b',2):1,('a',2):1,('a',7):1})
>>> rdd.collectAsMap() #Return (key,value) pairs as a dictionary
   {'a': 2, 'b': 2}
>>> rdd3.sum() #Sum of RDD elements
4950
>>> sc.parallelize([]).isEmpty() #Check whether RDD is empty
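To see why collectAsMap() keeps ('a', 2) but drops ('a', 7), it helps to mimic these counting actions in plain Python (a local sketch, not Spark):

```python
from collections import Counter

pairs = [('a', 7), ('a', 2), ('b', 2)]  # same contents as rdd

# countByKey: how many elements share each key
count_by_key = Counter(k for k, _ in pairs)  # {'a': 2, 'b': 1}

# countByValue: how many times each complete element occurs
count_by_value = Counter(pairs)  # each pair occurs exactly once here

# collectAsMap: building a dict lets a later (key, value) pair overwrite
# an earlier one with the same key, so ('a', 7) is replaced by ('a', 2)
as_map = dict(pairs)  # {'a': 2, 'b': 2}
```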


>>> rdd3.max() #Maximum value of RDD elements 
>>> rdd3.min() #Minimum value of RDD elements
>>> rdd3.mean() #Mean value of RDD elements 
>>> rdd3.stdev() #Standard deviation of RDD elements 
>>> rdd3.variance() #Compute variance of RDD elements 
>>> rdd3.histogram(3) #Compute histogram by bins
>>> rdd3.stats() #Summary statistics (count, mean, stdev, max & min)
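Since rdd3 holds range(100), these statistics are easy to verify by hand in plain Python. Note that Spark's variance() and stdev() are the population versions, dividing by n rather than n - 1:

```python
data = list(range(100))  # the same values as rdd3
n = len(data)

mean = sum(data) / n                               # 49.5
variance = sum((x - mean) ** 2 for x in data) / n  # population variance: 833.25
stdev = variance ** 0.5                            # population standard deviation
```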

Applying Functions 

#Apply a function to each RDD element
>>> rdd.map(lambda x: x+(x[1],x[0])).collect()
[('a',7,7,'a'), ('a',2,2,'a'), ('b',2,2,'b')]
#Apply a function to each RDD element and flatten the result
>>> rdd5 = rdd.flatMap(lambda x: x+(x[1],x[0]))
>>> rdd5.collect()
['a',7,7,'a','a',2,2,'a','b',2,2,'b']
#Apply a flatMap function to each (key,value) pair of rdd4 without changing the keys
>>> rdd4.flatMapValues(lambda x: x).collect()
[('a', 'x'), ('a', 'y'), ('a', 'z'),('b', 'p'),('b', 'r')]

Selecting Data


>>> rdd.collect() #Return a list with all RDD elements 
[('a', 7), ('a', 2), ('b', 2)]
>>> rdd.take(2) #Take first 2 RDD elements 
[('a', 7),  ('a', 2)]
>>> rdd.first() #Take first RDD element
('a', 7)
>>> rdd.top(2) #Take top 2 RDD elements
[('b', 2), ('a', 7)]


>>> rdd3.sample(False, 0.15, 81).collect() #Return sampled subset of rdd3


>>> rdd.filter(lambda x: "a" in x).collect() #Filter the RDD
>>> rdd5.distinct().collect() #Return distinct RDD values
['a',2,'b',7]
>>> rdd.keys().collect() #Return (key,value) RDD's keys
['a',  'a',  'b']


>>> def g(x): print(x)
>>> rdd.foreach(g) #Apply a function to all RDD elements
('a', 7)
('b', 2)
('a', 2)

Reshaping Data 


>>> rdd.reduceByKey(lambda x,y: x+y).collect() #Merge the rdd values for each key
[('a',9), ('b',2)]
>>> rdd.reduce(lambda a, b: a+ b) #Merge the rdd values
('a', 7, 'a' , 2 , 'b' , 2)


Grouping by

>>> rdd3.groupBy(lambda x: x % 2) #Return RDD of grouped values
>>> rdd.groupByKey() #Group rdd by key


>>> seqOp = (lambda x,y: (x[0]+y,x[1]+1))
>>> combOp = (lambda x,y:(x[0]+y[0],x[1]+y[1]))
#Aggregate RDD elements of each partition and then the results
>>> rdd3.aggregate((0,0),seqOp,combOp) 
#Aggregate values of each RDD key
>>> rdd.aggregateByKey((0,0),seqOp,combOp).collect()
[('a',(9,2)), ('b',(2,1))]
#Aggregate the elements of each partition, and then the results
>>> from operator import add
>>> rdd3.fold(0,add)
#Merge the values for each key
>>> rdd.foldByKey(0, add).collect()
[('a' ,9), ('b' ,2)]
#Create tuples of RDD elements by applying a function
>>> rdd3.keyBy(lambda x: x+x).collect()
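What aggregate() computes can be sketched in plain Python by pretending rdd3 has two partitions: seqOp folds each element into a per-partition accumulator, then combOp merges the per-partition results. This is a local illustration of the semantics, not Spark code:

```python
from functools import reduce

seqOp = lambda acc, x: (acc[0] + x, acc[1] + 1)    # fold one element into (sum, count)
combOp = lambda a, b: (a[0] + b[0], a[1] + b[1])   # merge two partial (sum, count) results

data = list(range(100))
partitions = [data[:50], data[50:]]  # pretend the RDD has 2 partitions

# run seqOp within each partition, starting from the zero value (0, 0)
partials = [reduce(seqOp, part, (0, 0)) for part in partitions]

# then merge the partial results across partitions with combOp
result = reduce(combOp, partials)  # (4950, 100): sum and count of rdd3
```

The same two-level pattern explains the aggregateByKey result above: (9, 2) for key 'a' is the sum and count of its values 7 and 2.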

Mathematical Operations 

>>> rdd.subtract(rdd2).collect() #Return each rdd value not contained in rdd2
[('b' ,2), ('a' ,7)]
#Return each (key,value) pair of rdd2 with no matching key in rdd
>>> rdd2.subtractByKey(rdd).collect()
[('d', 1)]
>>> rdd.cartesian(rdd2).collect() #Return the Cartesian product of rdd and rdd2
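These set-like pair operations can also be mimicked with plain Python lists (a local sketch of the semantics, not Spark):

```python
rdd = [('a', 7), ('a', 2), ('b', 2)]
rdd2 = [('a', 2), ('d', 1), ('b', 1)]

# subtract: keep elements of rdd that do not appear in rdd2 at all;
# ('a', 2) drops out because the identical pair exists in rdd2
subtracted = [x for x in rdd if x not in rdd2]  # [('a', 7), ('b', 2)]

# subtractByKey: keep pairs of rdd2 whose KEY never occurs in rdd
rdd_keys = {k for k, _ in rdd}
by_key = [(k, v) for k, v in rdd2 if k not in rdd_keys]  # [('d', 1)]

# cartesian: every pairing of an rdd element with an rdd2 element
product = [(x, y) for x in rdd for y in rdd2]  # 3 * 3 = 9 pairs
```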


>>> rdd2.sortBy(lambda x: x[1]).collect() #Sort RDD by given function
>>> rdd2.sortByKey().collect() #Sort (key, value) RDD by key
[('a' ,2), ('b' ,1), ('d' ,1)]


>>> rdd.repartition(4) #New RDD with 4 partitions
>>> rdd.coalesce(1) #Decrease the number of partitions in the RDD to 1


>>> rdd.saveAsTextFile("rdd.txt")
>>> rdd.saveAsHadoopFile("hdfs://namenodehost/parent/child",
                         'org.apache.hadoop.mapred.TextOutputFormat')

Stopping SparkContext 

>>> sc.stop()


$ ./bin/spark-submit examples/src/main/python/


#pyspark #cheatsheet #spark #python

Brandon Adams


Using the This Keyword in Java | What is the This Keyword?

In this video, we use the this keyword in Java to access an object’s instance variables, invoke an object’s instance methods, and return the current object. Thank you for watching and happy coding!


#keyword #java #using the this keyword in java #what is the this keyword

Ajay Kapoor


Digital Transformation Consulting Services & solutions

Compete in this digital-first world with PixelCrayons' advanced-level digital transformation consulting services. With 16+ years of domain expertise, we have transformed thousands of companies digitally. Our insight-led, unique, and mindful thinking process helps organizations realize digital capital from business outcomes.

Let our expert digital transformation consultants partner with you to solve even the most complex business problems at speed and at scale.

Digital transformation company in India

#digital transformation agency #top digital transformation companies in india #digital transformation companies in india #digital transformation services india #digital transformation consulting firms

Chelsie Towne


A Deep Dive Into the Transformer Architecture – The Transformer Models

Transformers for Natural Language Processing

It may seem like a long time since the world of natural language processing (NLP) was transformed by the seminal “Attention is All You Need” paper by Vaswani et al., but in fact that was less than 3 years ago. The relative recency of the introduction of transformer architectures and the ubiquity with which they have upended language tasks speaks to the rapid rate of progress in machine learning and artificial intelligence. There’s no better time than now to gain a deep understanding of the inner workings of transformer architectures, especially with transformer models making big inroads into diverse new applications like predicting chemical reactions and reinforcement learning.

Whether you’re an old hand or you’re only paying attention to transformer-style architectures for the first time, this article should offer something for you. First, we’ll dive deep into the fundamental concepts used to build the original 2017 Transformer. Then we’ll touch on some of the developments implemented in subsequent transformer models. Where appropriate, we’ll point out some limitations and how modern models inheriting ideas from the original Transformer are trying to overcome various shortcomings or improve performance.

What Do Transformers Do?

Transformers are the current state-of-the-art type of model for dealing with sequences. Perhaps the most prominent application of these models is in text processing tasks, and the most prominent of these is machine translation. In fact, transformers and their conceptual progeny have infiltrated just about every benchmark leaderboard in natural language processing (NLP), from question answering to grammar correction. In many ways transformer architectures are undergoing a surge in development similar to what we saw with convolutional neural networks following the 2012 ImageNet competition, for better and for worse.

#natural language processing #ai artificial intelligence #transformers #transformer architecture #transformer models

En Momin


A Simple Flutter API to Manage REST API Requests Easily

api_manager: a simple Flutter API to manage REST API requests easily, with the help of flutter dio.

Get started


Add dependency

  api_manager: $latest_version

Super simple to use

import 'package:api_manager/api_manager.dart';

void main() async {
  ApiResponse response = await ApiManager().request(
    requestType: RequestType.GET,
    route: "your route",
  );
  print(response);
}

Config in a base manager

class ApiRepository {
  static final ApiRepository _instance = ApiRepository._internal(); /// singleton api repository
  ApiManager _apiManager;

  factory ApiRepository() {
    return _instance;
  }

  /// base configuration for api manager
  ApiRepository._internal() {
    _apiManager = ApiManager();
    _apiManager.options.baseUrl = BASE_URL; /// EX: BASE_URL = 
    _apiManager.options.connectTimeout = 100000;
    _apiManager.options.receiveTimeout = 100000;
    _apiManager.enableLogging(responseBody: true, requestBody: false); /// enable api logging EX: response, request, headers etc
    _apiManager.enableAuthTokenCheck(() => "access_token"); /// EX: JWT/PASSPORT auth token stored in cache
  }
}


Suppose we have a response model like this:

class SampleResponse {
  String name;
  int id;

  SampleResponse.fromJson(jsonMap)
      : name = jsonMap['name'],
        id = jsonMap['id'];
}

and the actual API response JSON structure is:

    {
        "data": {
            "name": "md afratul kaoser taohid",
            "id": "id"
        }
    }

Now we perform a GET request:

 Future<ApiResponse<SampleResponse>> getRequestSample() async =>
      await _apiManager.request<SampleResponse>(
        requestType: RequestType.GET,
        route: 'api_route',
        requestParams: {"userId": 12}, /// add params if required
        isAuthRequired: true, /// when true, adds an authorization header via enableAuthTokenCheck()
        responseBodySerializer: (jsonMap) {
          return SampleResponse.fromJson(jsonMap); /// parse the json response into a dart model class
        },
      );

Now we perform a POST request:

 Future<ApiResponse<SampleResponse>> postRequestSample() async =>
      await _apiManager.request<SampleResponse>(
        requestType: RequestType.POST,
        route: 'api_route',
        requestBody: {"userId": 12}, /// add POST request body
        isAuthRequired: true, /// when true, adds an authorization header via enableAuthTokenCheck()
        responseBodySerializer: (jsonMap) {
          return SampleResponse.fromJson(jsonMap); /// parse the json response into a dart model class
        },
      );

Now we perform a multipart file upload request:

  Future<ApiResponse<void>> updateProfilePicture(
    String filePath,
  ) async {
    MultipartFile multipartFile =
        await _apiManager.getMultipartFileData(filePath);
    FormData formData = FormData.fromMap({'picture': multipartFile});

    return await _apiManager.request(
      requestType: RequestType.POST,
      isAuthRequired: true,
      requestBody: formData,
      route: 'api_route',
    );
  }

Use this package as a library

Depend on it

Run this command:

With Flutter:

 $ flutter pub add api_manager

This will add a line like this to your package's pubspec.yaml (and run an implicit dart pub get):

  api_manager: ^0.1.29

Alternatively, your editor might support flutter pub get. Check the docs for your editor to learn more.

Import it

Now in your Dart code, you can use:

import 'package:api_manager/api_manager.dart';


//void main() async {
//  ApiManager _apiManager = ApiManager();
//  _apiManager.options.baseUrl = $base_url;
//  _apiManager.responseBodyWrapper("data");
//  ApiResponse<List<dynamic>> response = await _apiManager.request(
//    requestType: RequestType.GET,
//    route: $route,
//    responseBodySerializer: (jsonMap) {
//      return jsonMap as List;
//    },
//  );
//  print(response);
//}

GitHub :

#flutter  #restapi