Let’s Build Backend for Friends in Our Social Network App using Spring

In this tutorial, we are going to add a Friend feature, with a Spring Boot backend, to the demo social network site we are building.


In the previous tutorial, we added GitHub OAuth with JWT-based authentication.

In this tutorial, we will extend the existing app into a social network site, like Facebook, where users can add each other as friends and have real-time chat and video/voice calls.

Here, we will build the backend of the friend feature.

Enhance Amazon Aurora Read/Write Capability with ShardingSphere-JDBC

1. Introduction

Amazon Aurora is a relational database management system (RDBMS) developed by AWS (Amazon Web Services). Aurora gives you the performance and availability of commercial-grade databases with full MySQL and PostgreSQL compatibility. In terms of performance, Aurora MySQL and Aurora PostgreSQL have shown throughput increases of up to 5x over stock MySQL and 3x over stock PostgreSQL, respectively, on similar hardware. In terms of scalability, Aurora brings enhancements and innovations in both storage and compute, and in both horizontal and vertical scaling.

Aurora supports up to 128 TB of storage and scales the storage layer dynamically in units of 10 GB. On the compute side, Aurora supports scalable configurations with multiple read replicas: each Region can have an additional 15 Aurora Replicas. In addition, Aurora provides a multi-primary architecture that supports four read/write nodes. Its Serverless architecture allows vertical scaling with typical latency under a second, while Global Database enables a single database cluster to span multiple AWS Regions with low latency.

Aurora already provides great scalability with the growth of user data volume. Can it handle more data and support more concurrent access? You may consider using sharding to support the configuration of multiple underlying Aurora clusters. To this end, a series of blogs, including this one, provides you with a reference in choosing between Proxy and JDBC for sharding.

1.1 Why sharding is needed

AWS Aurora offers a single relational database. Its hosting architectures (primary-secondary, multi-primary, Global Database, and others) can satisfy the architectural scenarios above. However, Aurora doesn’t directly support sharding, and sharding takes a variety of forms, such as vertical and horizontal. If we want to increase data capacity further, several problems have to be solved: cross-node joins and associated queries, distributed transactions, SQL sorting, pagination, function calculation, global primary keys, capacity planning, and secondary capacity expansion after sharding.

1.2 Sharding methods

It is generally accepted that MySQL queries perform best when a table holds fewer than 10 million rows, because the height of its B-tree index then stays between 3 and 5. Data sharding reduces the amount of data in a single table and, at the same time, distributes the read and write load across different data nodes. It can be divided into vertical sharding and horizontal sharding.
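As a rough illustration of why row count and index height are linked, the sketch below estimates how many rows a B+tree of a given height can address. The 16 KB page size is InnoDB's default; the 14-byte index entry (an 8-byte key plus a 6-byte child pointer) and the 1 KB average row size are assumptions chosen for illustration, not measurements from this article.

```java
final class BTreeCapacityEstimate {
    static final long PAGE_BYTES = 16 * 1024;  // InnoDB default page size
    static final long INDEX_ENTRY_BYTES = 14;  // assumed: 8-byte key + 6-byte child pointer
    static final long ROW_BYTES = 1024;        // assumed average row size

    /** Rows addressable by a B+tree with the given number of levels (height >= 1). */
    static long maxRows(int height) {
        long fanout = PAGE_BYTES / INDEX_ENTRY_BYTES;  // entries per internal page
        long rowsPerLeaf = PAGE_BYTES / ROW_BYTES;     // rows per leaf page
        long leaves = 1;
        for (int level = 1; level < height; level++) {
            leaves *= fanout;                          // each extra level multiplies the leaf count
        }
        return leaves * rowsPerLeaf;
    }

    public static void main(String[] args) {
        for (int h = 2; h <= 4; h++) {
            System.out.println("height " + h + " -> about " + maxRows(h) + " rows");
        }
    }
}
```

Under these assumptions, each additional level (and therefore each additional page read per lookup) only appears after a large jump in table size. Wider rows or keys lower those thresholds.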

1. Advantages of vertical sharding

  • Decouples business systems and makes their boundaries clearer.
  • Implement hierarchical management, maintenance, monitoring, and expansion to data of different businesses, like micro-service governance.
  • In high concurrency scenarios, vertical sharding removes the bottleneck of IO, database connections, and hardware resources on a single machine to some extent.

2. Disadvantages of vertical sharding

  • After the database is split, joins can only be implemented by aggregating results through interfaces, which increases development complexity.
  • After the database is split, distributed transactions are complex to process.
  • A single table can still hold a large amount of data, so horizontal sharding is still required.

3. Advantages of horizontal sharding

  • It removes the performance bottleneck caused by large data volumes and high concurrency on a single database, and increases system stability and load capacity.
  • Only minor modifications are needed on the application client, and the business modules do not need to be split.

4. Disadvantages of horizontal sharding

  • Transaction consistency across shards is hard to guarantee.
  • Cross-database joins and associated queries perform poorly.
  • Repeated data expansion is difficult, and maintenance is a heavy workload.

Based on the analysis above and the available studies of popular sharding middleware, we selected ShardingSphere, an open source product, and combined it with Amazon Aurora to show how the two together support various forms of sharding and how they solve the problems that sharding brings.

2. ShardingSphere introduction

ShardingSphere is an open source ecosystem consisting of a set of distributed database middleware solutions. It includes three independent products: Sharding-JDBC, Sharding-Proxy, and Sharding-Sidecar.

The characteristics of Sharding-JDBC are:

  1. The client connects directly to the database. Sharding-JDBC is delivered as a jar and requires no extra deployment or dependencies.
  2. It can be considered an enhanced JDBC driver that is fully compatible with JDBC and all kinds of ORM frameworks.
  3. It works with any JDBC-based ORM framework, such as JPA, Hibernate, MyBatis, or Spring JDBC Template, as well as with direct use of JDBC.
  4. It supports any third-party database connection pool, such as DBCP, C3P0, BoneCP, Druid, or HikariCP.
  5. It supports any database that implements the JDBC standard: MySQL, Oracle, SQL Server, PostgreSQL, and any other database accessible through JDBC.
  6. It adopts a decentralized architecture, which suits high-performance, lightweight OLTP applications developed in Java.

Hybrid Structure Integrating Sharding-JDBC and Applications

Sharding-JDBC’s core concepts

Data node: The smallest unit of a data slice, consisting of a data source name and a data table, such as ds_0.product_order_0.

Actual table: The physical table that really exists in the horizontal sharding database, such as product order tables: product_order_0, product_order_1, and product_order_2.

Logic table: The logical name for horizontally sharded tables that share the same schema. For instance, the logic table for the order tables product_order_0, product_order_1, and product_order_2 is product_order.

Binding table: A primary table and a joined table that share the same sharding rules. For example, the product_order and product_order_item tables are both sharded by order_id, so they are binding tables of each other. Multi-table join queries between binding tables do not produce a Cartesian product of shards, so query efficiency improves greatly.

Broadcast table: A table that exists in every sharded data source. Its schema and data must be consistent across all databases. It suits small tables that need to be joined with large sharded tables, such as dictionary tables and configuration tables.
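The relationship between a logic table, its actual tables, and its data nodes can be sketched as follows. The class and method names are hypothetical, and this is not how Sharding-JDBC parses its own configuration; the sketch only enumerates the names by hand, assuming data sources named ds_0, ds_1, and so on.

```java
import java.util.ArrayList;
import java.util.List;

final class DataNodeSketch {
    /** Lists every data node ("dataSource.actualTable") behind one logic table. */
    static List<String> dataNodes(String logicTable, int dataSourceCount, int tablesPerSource) {
        List<String> nodes = new ArrayList<>();
        for (int ds = 0; ds < dataSourceCount; ds++) {
            for (int t = 0; t < tablesPerSource; t++) {
                // actual table = logic table name + numeric suffix
                nodes.add("ds_" + ds + "." + logicTable + "_" + t);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Logic table product_order, split into 3 actual tables in each of 2 data sources.
        System.out.println(dataNodes("product_order", 2, 3));
    }
}
```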

3. Testing ShardingSphere-JDBC

3.1 Example project

Download the example project code locally. To keep the test code stable, we use the shardingsphere-example-4.0.0 version.

git clone https://github.com/apache/shardingsphere-example.git

Project description:

shardingsphere-example
  ├── example-core
  │   ├── config-utility
  │   ├── example-api
  │   ├── example-raw-jdbc
  │   ├── example-spring-jpa #spring+jpa integration-based entity,repository
  │   └── example-spring-mybatis
  ├── sharding-jdbc-example
  │   ├── sharding-example
  │   │   ├── sharding-raw-jdbc-example
  │   │   ├── sharding-spring-boot-jpa-example #integration-based sharding-jdbc functions
  │   │   ├── sharding-spring-boot-mybatis-example
  │   │   ├── sharding-spring-namespace-jpa-example
  │   │   └── sharding-spring-namespace-mybatis-example
  │   ├── orchestration-example
  │   │   ├── orchestration-raw-jdbc-example
  │   │   ├── orchestration-spring-boot-example #integration-based sharding-jdbc governance function
  │   │   └── orchestration-spring-namespace-example
  │   ├── transaction-example
  │   │   ├── transaction-2pc-xa-example #sharding-jdbc sample of two-phase commit for a distributed transaction
  │   │   └── transaction-base-seata-example #sharding-jdbc distributed transaction seata sample
  │   ├── other-feature-example
  │   │   ├── hint-example
  │   │   └── encrypt-example
  ├── sharding-proxy-example
  │   └── sharding-proxy-boot-mybatis-example
  └── src/resources
        └── manual_schema.sql  

Configuration file description:

application-master-slave.properties              #read/write splitting profile
application-sharding-databases-tables.properties #database and table sharding profile
application-sharding-databases.properties        #database sharding-only profile
application-sharding-master-slave.properties     #sharding and read/write splitting profile
application-sharding-tables.properties           #table sharding-only profile
application.properties                           #spring boot profile

Code logic description:

The following is the entry class of the Spring Boot application. Execute it to run the project.

The execution logic of demo is as follows:

3.2 Verifying read/write splitting

As business grows, write and read requests can be split across different database nodes to increase the processing capability of the entire database cluster. Aurora uses a reader/writer endpoint to meet users' requirements to write and read with strong consistency, and a read-only endpoint to meet the requirements to read without strong consistency. Aurora's read and write latency is within single-digit milliseconds, much lower than MySQL's binlog-based logical replication, so a large share of the load can be directed to a read-only endpoint.

Through the one primary and multiple secondary configuration, query requests can be evenly distributed to multiple data replicas, which further improves the processing capability of the system. Read/write splitting can improve the throughput and availability of system, but it can also lead to data inconsistency. Aurora provides a primary/secondary architecture in a fully managed form, but applications on the upper-layer still need to manage multiple data sources when interacting with Aurora, routing SQL requests to different nodes based on the read/write type of SQL statements and certain routing policies.

ShardingSphere-JDBC provides read/write splitting and is integrated into the application, so the complex wiring between the application and the database cluster can be moved out of application code. Developers manage shards through configuration files and combine them with ORM frameworks such as Spring JPA and MyBatis to separate this duplicated logic from the code entirely, which greatly improves maintainability and reduces the coupling between code and database.
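To make the routing idea concrete, here is a minimal sketch of the kind of decision an application would otherwise have to code by hand: writes go to the primary, reads rotate across replicas. The class is hypothetical and deliberately naive (it classifies statements by their first keyword only); it is not Sharding-JDBC's implementation. The data source names match the ones used in the configuration in this article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

final class ReadWriteRouterSketch {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger counter = new AtomicInteger();

    ReadWriteRouterSketch(String master, List<String> slaves) {
        this.master = master;
        this.slaves = slaves;
    }

    /** Returns the name of the data source a statement would be sent to. */
    String route(String sql) {
        boolean isRead = sql.trim().toLowerCase(Locale.ROOT).startsWith("select");
        if (!isRead) {
            return master;  // INSERT, UPDATE, DELETE, and DDL go to the primary
        }
        // Round robin: each read takes the next replica in turn.
        int index = Math.floorMod(counter.getAndIncrement(), slaves.size());
        return slaves.get(index);
    }

    public static void main(String[] args) {
        ReadWriteRouterSketch router =
                new ReadWriteRouterSketch("ds_master", Arrays.asList("ds_slave_0", "ds_slave_1"));
        System.out.println(router.route("INSERT INTO t_order (user_id) VALUES (1)"));
        System.out.println(router.route("SELECT * FROM t_order"));
        System.out.println(router.route("SELECT * FROM t_order_item"));
    }
}
```

With a middleware layer, this logic lives in configuration instead of being repeated in every data access class.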

3.2.1 Setting up the database environment

Create a set of Aurora MySQL read/write splitting clusters. The model is db.r5.2xlarge. Each set of clusters has one write node and two read nodes.

3.2.2 Configuring Sharding-JDBC

application.properties spring boot master profile description:

Replace the placeholder values with your own environment configuration.

# Jpa automatically creates and drops data tables based on entities
spring.jpa.properties.hibernate.hbm2ddl.auto=create-drop
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
spring.jpa.properties.hibernate.show_sql=true

#spring.profiles.active=sharding-databases
#spring.profiles.active=sharding-tables
#spring.profiles.active=sharding-databases-tables
#Activate master-slave configuration item so that sharding-jdbc can use master-slave profile
spring.profiles.active=master-slave
#spring.profiles.active=sharding-master-slave

application-master-slave.properties sharding-jdbc profile description:

spring.shardingsphere.datasource.names=ds_master,ds_slave_0,ds_slave_1
# data source-master
spring.shardingsphere.datasource.ds_master.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_master.password=Your master DB password
spring.shardingsphere.datasource.ds_master.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_master.jdbc-url=Your primary DB data source url
spring.shardingsphere.datasource.ds_master.username=Your primary DB username
# data source-slave
spring.shardingsphere.datasource.ds_slave_0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_slave_0.password= Your slave DB password
spring.shardingsphere.datasource.ds_slave_0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_slave_0.jdbc-url=Your slave DB data source url
spring.shardingsphere.datasource.ds_slave_0.username= Your slave DB username
# data source-slave
spring.shardingsphere.datasource.ds_slave_1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_slave_1.password= Your slave DB password
spring.shardingsphere.datasource.ds_slave_1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_slave_1.jdbc-url= Your slave DB data source url
spring.shardingsphere.datasource.ds_slave_1.username= Your slave DB username
# Routing Policy Configuration
spring.shardingsphere.masterslave.load-balance-algorithm-type=round_robin
spring.shardingsphere.masterslave.name=ds_ms
spring.shardingsphere.masterslave.master-data-source-name=ds_master
spring.shardingsphere.masterslave.slave-data-source-names=ds_slave_0,ds_slave_1
# sharding-jdbc configures the information storage mode
spring.shardingsphere.mode.type=Memory
# enable the shardingsphere SQL log; the output shows the conversion from logical SQL to actual SQL
spring.shardingsphere.props.sql.show=true

 

3.2.3 Test and verification process description

  • Test environment data initialization: Spring JPA initialization automatically creates tables for testing.

  • Write data to the master instance

As shown in the ShardingSphere-SQL log figure below, the write SQL is executed on the ds_master data source.

  • Data query operations are performed on the slave library.

As shown in the ShardingSphere-SQL log below, the read SQL is executed on the ds_slave data sources in round-robin fashion.

[INFO ] 2022-04-02 19:43:39,376 --main-- [ShardingSphere-SQL] Rule Type: master-slave 
[INFO ] 2022-04-02 19:43:39,376 --main-- [ShardingSphere-SQL] SQL: select orderentit0_.order_id as order_id1_1_, orderentit0_.address_id as address_2_1_, 
orderentit0_.status as status3_1_, orderentit0_.user_id as user_id4_1_ from t_order orderentit0_ ::: DataSources: ds_slave_0 
---------------------------- Print OrderItem Data -------------------
Hibernate: select orderiteme1_.order_item_id as order_it1_2_, orderiteme1_.order_id as order_id2_2_, orderiteme1_.status as status3_2_, orderiteme1_.user_id 
as user_id4_2_ from t_order orderentit0_ cross join t_order_item orderiteme1_ where orderentit0_.order_id=orderiteme1_.order_id
[INFO ] 2022-04-02 19:43:40,898 --main-- [ShardingSphere-SQL] Rule Type: master-slave 
[INFO ] 2022-04-02 19:43:40,898 --main-- [ShardingSphere-SQL] SQL: select orderiteme1_.order_item_id as order_it1_2_, orderiteme1_.order_id as order_id2_2_, orderiteme1_.status as status3_2_, 
orderiteme1_.user_id as user_id4_2_ from t_order orderentit0_ cross join t_order_item orderiteme1_ where orderentit0_.order_id=orderiteme1_.order_id ::: DataSources: ds_slave_1 

Note: As shown in the figure below, if there are both reads and writes in a transaction, Sharding-JDBC routes both read and write operations to the master library. If the read/write requests are not in the same transaction, the corresponding read requests are distributed to different read nodes according to the routing policy.

@Override
@Transactional // When a transaction is started, both read and write in the transaction go through the master library. When closed, read goes through the slave library and write goes through the master library
public void processSuccess() throws SQLException {
    System.out.println("-------------- Process Success Begin ---------------");
    List<Long> orderIds = insertData();
    printData();
    deleteData(orderIds);
    printData();
    System.out.println("-------------- Process Success Finish --------------");
}
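The routing rule described in the note above can be sketched as a small piece of state. This is a simplified model of the behaviour as the article states it, with hypothetical names; it does not reproduce Sharding-JDBC's internals.

```java
final class TransactionRoutingSketch {
    enum Target { MASTER, SLAVE }

    private boolean inTransaction;

    void begin() { inTransaction = true; }

    void commit() { inTransaction = false; }

    /** Picks the node role for one statement. */
    Target route(boolean isWrite) {
        // Writes always go to the master; inside a transaction, reads follow them there.
        if (isWrite || inTransaction) {
            return Target.MASTER;
        }
        return Target.SLAVE;
    }

    public static void main(String[] args) {
        TransactionRoutingSketch session = new TransactionRoutingSketch();
        System.out.println("read outside transaction -> " + session.route(false));
        session.begin();
        System.out.println("write inside transaction -> " + session.route(true));
        System.out.println("read inside transaction  -> " + session.route(false));
        session.commit();
        System.out.println("read after commit        -> " + session.route(false));
    }
}
```

Keeping reads on the master inside a transaction avoids reading stale data from a replica that has not yet received the transaction's own writes.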

3.2.4 Verifying Aurora failover scenario

The Aurora database environment adopts the configuration described in Section 3.2.1.

3.2.4.1 Verification process description

1. Start the Spring Boot project.

2. Perform a failover on Aurora’s console.

3. Execute the REST API request.

4. Repeatedly execute POST (http://localhost:8088/save-user) until the API call fails to write to Aurora and then eventually succeeds again.

5. The following figure shows the process of executing code failover. It takes about 37 seconds from the time when the latest SQL write is successfully performed to the time when the next SQL write is successfully performed. That is, the application can be automatically recovered from Aurora failover, and the recovery time is about 37 seconds.
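A probe like the one used in steps 4 and 5 can be sketched as a retry loop that keeps calling an operation until it succeeds and reports how long recovery took. The helper below is hypothetical and uses a simulated operation in place of the real POST to /save-user, so the timing it prints says nothing about Aurora itself.

```java
import java.util.concurrent.Callable;

final class FailoverProbeSketch {
    /** Calls the operation until it succeeds; returns the number of failed attempts. */
    static int retryUntilSuccess(Callable<?> operation, int maxAttempts, long pauseMillis) {
        int failures = 0;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                operation.call();
                return failures;
            } catch (Exception e) {
                failures++;
                try {
                    Thread.sleep(pauseMillis);  // wait before the next attempt
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", interrupted);
                }
            }
        }
        throw new IllegalStateException("no success after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Stand-in for the write request: fails three times, then succeeds.
        int[] calls = {0};
        long start = System.nanoTime();
        int failures = retryUntilSuccess(() -> {
            if (++calls[0] <= 3) {
                throw new java.io.IOException("simulated write failure during failover");
            }
            return "ok";
        }, 10, 50);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(failures + " failed attempts, recovered after ~" + elapsedMillis + " ms");
    }
}
```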

3.3 Testing table sharding-only function

3.3.1 Configuring Sharding-JDBC

application.properties spring boot master profile description

# Jpa automatically creates and drops data tables based on entities
spring.jpa.properties.hibernate.hbm2ddl.auto=create-drop
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
spring.jpa.properties.hibernate.show_sql=true
#spring.profiles.active=sharding-databases
# Activate sharding-tables configuration items
spring.profiles.active=sharding-tables
#spring.profiles.active=sharding-databases-tables
#spring.profiles.active=master-slave
#spring.profiles.active=sharding-master-slave

application-sharding-tables.properties sharding-jdbc profile description

## configure primary-key policy
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=123
spring.shardingsphere.sharding.tables.t_order_item.actual-data-nodes=ds.t_order_item_$->{0..1}
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.algorithm-expression=t_order_item_$->{order_id % 2}
spring.shardingsphere.sharding.tables.t_order_item.key-generator.column=order_item_id
spring.shardingsphere.sharding.tables.t_order_item.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order_item.key-generator.props.worker.id=123
# configure the binding relation of t_order and t_order_item
spring.shardingsphere.sharding.binding-tables[0]=t_order,t_order_item
# configure broadcast tables
spring.shardingsphere.sharding.broadcast-tables=t_address
# sharding-jdbc mode
spring.shardingsphere.mode.type=Memory
# start shardingsphere log
spring.shardingsphere.props.sql.show=true
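The SNOWFLAKE key generator configured above produces 64-bit IDs that combine a timestamp, a worker ID, and a per-millisecond sequence. The sketch below illustrates that general scheme with a 10-bit worker ID and a 12-bit sequence. The class is hypothetical, the epoch constant is an assumption chosen for illustration, and the code is not ShardingSphere's generator.

```java
final class SnowflakeSketch {
    private static final long WORKER_ID_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;
    private static final long EPOCH_MILLIS = 1_477_958_400_000L;  // assumed epoch: 2016-11-01 UTC

    private final long workerId;
    private long lastMillis = -1L;
    private long sequence;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId >= (1L << WORKER_ID_BITS)) {
            throw new IllegalArgumentException("workerId must be in [0, 1024)");
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMillis) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMillis) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - EPOCH_MILLIS) << (WORKER_ID_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    /** Extracts the worker ID bits from a generated ID. */
    static long workerIdOf(long id) {
        return (id >> SEQUENCE_BITS) & ((1L << WORKER_ID_BITS) - 1);
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(123);  // worker.id=123 as in the profile
        long id = generator.nextId();
        System.out.println(id + " was generated by worker " + workerIdOf(id));
    }
}
```

Because the worker ID is embedded in every key, each application instance that generates keys needs its own worker.id to keep IDs unique across instances.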

 

3.3.2 Test and verification process description

1. DDL operation

JPA automatically creates tables for testing. With Sharding-JDBC routing rules configured, the client executes DDL and Sharding-JDBC automatically creates the corresponding tables according to the table splitting rules. Because t_address is a broadcast table and there is only one master instance, a single t_address table is created. Creating t_order produces two physical tables, t_order_0 and t_order_1.

2. Write operation

As shown in the figure below, the logic SQL inserts a record into t_order. When Sharding-JDBC executes it, the data is distributed to t_order_0 or t_order_1 according to the table splitting rules.

When t_order and t_order_item are bound, the records associated with order_item and order are placed on the same physical table.
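The inline expressions t_order_$->{order_id % 2} and t_order_item_$->{order_id % 2} from the profile can be mirrored in a few lines of Java. The helper is hypothetical and the order IDs are made up; it only shows why an order and its items end up in tables with the same suffix.

```java
final class TableShardingSketch {
    /** Mirrors the inline expression <logicTable>_$->{order_id % 2}. */
    static String actualTable(String logicTable, long orderId) {
        // floorMod equals % for the positive IDs used here and never yields a negative suffix.
        return logicTable + "_" + Math.floorMod(orderId, 2L);
    }

    public static void main(String[] args) {
        long[] orderIds = {1001L, 1002L};  // illustrative values
        for (long orderId : orderIds) {
            System.out.println("order_id " + orderId + " -> "
                    + actualTable("t_order", orderId) + ", "
                    + actualTable("t_order_item", orderId));
        }
    }
}
```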

3. Read operation

As shown in the figure below, perform the join query operations to order and order_item under the binding table, and the physical shard is precisely located based on the binding relationship.

The join query operations on order and order_item under the unbound table will traverse all shards.
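The difference between the two cases can be counted directly. For a join that carries no sharding-key condition, the hypothetical sketch below lists the actual-table pairs that have to be visited with and without a binding relationship. It models the routing outcome only, not how Sharding-JDBC computes it.

```java
import java.util.ArrayList;
import java.util.List;

final class BindingJoinSketch {
    /** Actual-table pairs a join between t_order and t_order_item must visit. */
    static List<String> joinRoutes(int shardCount, boolean bound) {
        List<String> routes = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            for (int j = 0; j < shardCount; j++) {
                // Bound tables only pair shards with the same suffix;
                // unbound tables need every combination (a Cartesian product).
                if (!bound || i == j) {
                    routes.add("t_order_" + i + " JOIN t_order_item_" + j);
                }
            }
        }
        return routes;
    }

    public static void main(String[] args) {
        System.out.println("bound:   " + joinRoutes(2, true));
        System.out.println("unbound: " + joinRoutes(2, false));
    }
}
```

With two shards the gap is 2 versus 4 statements; it grows quadratically with the shard count.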

3.4 Testing database sharding-only function

3.4.1 Setting up the database environment

Create two instances on Aurora: ds_0 and ds_1

When the sharding-spring-boot-jpa-example project is started, the tables t_order, t_order_item, and t_address will be created on the two Aurora instances.

3.4.2 Configuring Sharding-JDBC

application.properties spring boot master profile description

# Jpa automatically creates and drops data tables based on entities
spring.jpa.properties.hibernate.hbm2ddl.auto=create
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
spring.jpa.properties.hibernate.show_sql=true

# Activate sharding-databases configuration items
spring.profiles.active=sharding-databases
#spring.profiles.active=sharding-tables
#spring.profiles.active=sharding-databases-tables
#spring.profiles.active=master-slave
#spring.profiles.active=sharding-master-slave

application-sharding-databases.properties sharding-jdbc profile description

spring.shardingsphere.datasource.names=ds_0,ds_1
# ds_0
spring.shardingsphere.datasource.ds_0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_0.jdbc-url= 
spring.shardingsphere.datasource.ds_0.username= 
spring.shardingsphere.datasource.ds_0.password=
# ds_1
spring.shardingsphere.datasource.ds_1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_1.jdbc-url= 
spring.shardingsphere.datasource.ds_1.username= 
spring.shardingsphere.datasource.ds_1.password=
spring.shardingsphere.sharding.default-database-strategy.inline.sharding-column=user_id
spring.shardingsphere.sharding.default-database-strategy.inline.algorithm-expression=ds_$->{user_id % 2}
spring.shardingsphere.sharding.binding-tables=t_order,t_order_item
spring.shardingsphere.sharding.broadcast-tables=t_address
spring.shardingsphere.sharding.default-data-source-name=ds_0

spring.shardingsphere.sharding.tables.t_order.actual-data-nodes=ds_$->{0..1}.t_order
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=123
spring.shardingsphere.sharding.tables.t_order_item.actual-data-nodes=ds_$->{0..1}.t_order_item
spring.shardingsphere.sharding.tables.t_order_item.key-generator.column=order_item_id
spring.shardingsphere.sharding.tables.t_order_item.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order_item.key-generator.props.worker.id=123
# sharding-jdbc mode
spring.shardingsphere.mode.type=Memory
# start shardingsphere log
spring.shardingsphere.props.sql.show=true

 

3.4.3 Test and verification process description

1. DDL operation

JPA automatically creates tables for testing. With Sharding-JDBC’s database sharding and routing rules configured, the client executes DDL and Sharding-JDBC automatically creates the corresponding tables according to the sharding rules. Because t_address is a broadcast table, its physical table is created on both ds_0 and ds_1. As a result, the three tables t_address, t_order, and t_order_item are each created on ds_0 and ds_1.

2. Write operation

For the broadcast table t_address, each record written will also be written to the t_address tables of ds_0 and ds_1.

Records for the sharded tables t_order and t_order_item are written to the table on the corresponding instance according to the database sharding column and routing policy.

3. Read operation

Queries on order are routed to the corresponding Aurora instance according to the database sharding routing rules.
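The database routing rule ds_$->{user_id % 2} can be sketched as below. The helper is hypothetical; passing null models a statement whose condition does not include the sharding column, in which case every data source has to be visited.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DatabaseShardingSketch {
    static final List<String> ALL_DATA_SOURCES = Arrays.asList("ds_0", "ds_1");

    /** Returns the data sources a statement on t_order must visit. */
    static List<String> route(Long userId) {
        if (userId == null) {
            return ALL_DATA_SOURCES;  // no sharding key: query every data source
        }
        // Mirrors the inline expression ds_$->{user_id % 2}.
        return Collections.singletonList("ds_" + Math.floorMod(userId, 2L));
    }

    public static void main(String[] args) {
        System.out.println("user_id = 7    -> " + route(7L));
        System.out.println("user_id = 8    -> " + route(8L));
        System.out.println("no user_id     -> " + route(null));
    }
}
```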

Queries on address: since address is a broadcast table, one of the nodes holding a copy of address is selected at random and queried.
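Broadcast-table behaviour (write everywhere, read from any one copy) can be modelled with in-memory lists standing in for the t_address table on each data source. The class is a hypothetical illustration, not Sharding-JDBC code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

final class BroadcastTableSketch {
    // One in-memory "t_address" per data source.
    private final Map<String, List<String>> addressTables = new LinkedHashMap<>();

    BroadcastTableSketch(String... dataSources) {
        for (String ds : dataSources) {
            addressTables.put(ds, new ArrayList<>());
        }
    }

    /** A write to a broadcast table is applied to every data source. */
    void insert(String row) {
        for (List<String> table : addressTables.values()) {
            table.add(row);
        }
    }

    /** A read picks one data source at random; all copies hold the same rows. */
    List<String> selectAll() {
        List<String> names = new ArrayList<>(addressTables.keySet());
        String chosen = names.get(ThreadLocalRandom.current().nextInt(names.size()));
        return addressTables.get(chosen);
    }

    /** Number of data sources that contain the given row. */
    long copies(String row) {
        return addressTables.values().stream().filter(table -> table.contains(row)).count();
    }

    public static void main(String[] args) {
        BroadcastTableSketch address = new BroadcastTableSketch("ds_0", "ds_1");
        address.insert("address_1");
        System.out.println("copies: " + address.copies("address_1"));
        System.out.println("read:   " + address.selectAll());
    }
}
```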

As shown in the figure below, perform the join query operations to order and order_item under the binding table, and the physical shard is precisely located based on the binding relationship.

3.5 Verifying sharding function

3.5.1 Setting up the database environment

As shown in the figure below, create two instances on Aurora: ds_0 and ds_1

When the sharding-spring-boot-jpa-example project is started, the physical tables t_order_0, t_order_1, t_order_item_0, and t_order_item_1 and the global table t_address will be created on both Aurora instances.

3.5.2 Configuring Sharding-JDBC

application.properties spring boot master profile description

# Jpa automatically creates and drops data tables based on entities
spring.jpa.properties.hibernate.hbm2ddl.auto=create
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
spring.jpa.properties.hibernate.show_sql=true
# Activate sharding-databases-tables configuration items
#spring.profiles.active=sharding-databases
#spring.profiles.active=sharding-tables
spring.profiles.active=sharding-databases-tables
#spring.profiles.active=master-slave
#spring.profiles.active=sharding-master-slave

application-sharding-databases-tables.properties sharding-jdbc profile description

spring.shardingsphere.datasource.names=ds_0,ds_1
# ds_0
spring.shardingsphere.datasource.ds_0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_0.jdbc-url= 306/dev?useSSL=false&characterEncoding=utf-8
spring.shardingsphere.datasource.ds_0.username= 
spring.shardingsphere.datasource.ds_0.password=
spring.shardingsphere.datasource.ds_0.max-active=16
# ds_1
spring.shardingsphere.datasource.ds_1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_1.jdbc-url= 
spring.shardingsphere.datasource.ds_1.username= 
spring.shardingsphere.datasource.ds_1.password=
spring.shardingsphere.datasource.ds_1.max-active=16
# default library splitting policy
spring.shardingsphere.sharding.default-database-strategy.inline.sharding-column=user_id
spring.shardingsphere.sharding.default-database-strategy.inline.algorithm-expression=ds_$->{user_id % 2}
spring.shardingsphere.sharding.binding-tables=t_order,t_order_item
spring.shardingsphere.sharding.broadcast-tables=t_address
# Tables that do not meet the library splitting policy are placed on ds_0
spring.shardingsphere.sharding.default-data-source-name=ds_0
# t_order table splitting policy
spring.shardingsphere.sharding.tables.t_order.actual-data-nodes=ds_$->{0..1}.t_order_$->{0..1}
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.algorithm-expression=t_order_$->{order_id % 2}
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=123
# t_order_item table splitting policy
spring.shardingsphere.sharding.tables.t_order_item.actual-data-nodes=ds_$->{0..1}.t_order_item_$->{0..1}
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.algorithm-expression=t_order_item_$->{order_id % 2}
spring.shardingsphere.sharding.tables.t_order_item.key-generator.column=order_item_id
spring.shardingsphere.sharding.tables.t_order_item.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order_item.key-generator.props.worker.id=123
# sharding-jdbc mode
spring.shardingsphere.mode.type=Memory
# start shardingsphere log
spring.shardingsphere.props.sql.show=true

 

3.5.3 Test and verification process description

1. DDL operation

JPA automatically creates tables for testing. With Sharding-JDBC’s sharding and routing rules configured, the client executes DDL and Sharding-JDBC automatically creates the corresponding tables according to the table splitting rules. Because t_address is a broadcast table, it is created on both ds_0 and ds_1. As a result, the three tables t_address, t_order, and t_order_item are each created on ds_0 and ds_1.

2. Write operation

For the broadcast table t_address, each record written will also be written to the t_address tables of ds_0 and ds_1.

Records for the sharded tables t_order and t_order_item are written to the table on the corresponding instance according to the database and table sharding columns and routing policy.
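When both strategies are active, a row's data node is determined by two independent keys: user_id selects the data source and order_id selects the table. A minimal sketch with a hypothetical helper and made-up IDs:

```java
final class DatabaseTableShardingSketch {
    /** Combines ds_$->{user_id % 2} with <logicTable>_$->{order_id % 2}. */
    static String dataNode(String logicTable, long userId, long orderId) {
        return "ds_" + Math.floorMod(userId, 2L)
                + "." + logicTable + "_" + Math.floorMod(orderId, 2L);
    }

    public static void main(String[] args) {
        long[][] samples = {{2, 10}, {2, 11}, {3, 10}, {3, 11}};  // {user_id, order_id}
        for (long[] s : samples) {
            System.out.println("user_id " + s[0] + ", order_id " + s[1]
                    + " -> " + dataNode("t_order", s[0], s[1]));
        }
    }
}
```

The four sample rows land on four different data nodes, which is how the load is spread across both instances and both tables.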

3. Read operation

The read operation is similar to the database sharding verification described in Section 3.4.3.

3.6 Testing database sharding, table sharding and read/write splitting function

3.6.1 Setting up the database environment

The following figure shows the physical table of the created database instance.

3.6.2 Configuring Sharding-JDBC

application.properties spring boot master profile description

# Jpa automatically creates and drops data tables based on entities
spring.jpa.properties.hibernate.hbm2ddl.auto=create
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
spring.jpa.properties.hibernate.show_sql=true

# activate sharding-master-slave configuration items
#spring.profiles.active=sharding-databases
#spring.profiles.active=sharding-tables
#spring.profiles.active=sharding-databases-tables
#spring.profiles.active=master-slave
spring.profiles.active=sharding-master-slave

application-sharding-master-slave.properties sharding-jdbc profile description

The url, name and password of the database need to be changed to your own database parameters.

spring.shardingsphere.datasource.names=ds_master_0,ds_master_1,ds_master_0_slave_0,ds_master_0_slave_1,ds_master_1_slave_0,ds_master_1_slave_1
spring.shardingsphere.datasource.ds_master_0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_master_0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_master_0.jdbc-url= 
spring.shardingsphere.datasource.ds_master_0.username= 
spring.shardingsphere.datasource.ds_master_0.password=
spring.shardingsphere.datasource.ds_master_0.max-active=16
spring.shardingsphere.datasource.ds_master_0_slave_0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_master_0_slave_0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_master_0_slave_0.jdbc-url= 
spring.shardingsphere.datasource.ds_master_0_slave_0.username= 
spring.shardingsphere.datasource.ds_master_0_slave_0.password=
spring.shardingsphere.datasource.ds_master_0_slave_0.max-active=16
spring.shardingsphere.datasource.ds_master_0_slave_1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_master_0_slave_1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_master_0_slave_1.jdbc-url= 
spring.shardingsphere.datasource.ds_master_0_slave_1.username= 
spring.shardingsphere.datasource.ds_master_0_slave_1.password=
spring.shardingsphere.datasource.ds_master_0_slave_1.max-active=16
spring.shardingsphere.datasource.ds_master_1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_master_1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_master_1.jdbc-url= 
spring.shardingsphere.datasource.ds_master_1.username= 
spring.shardingsphere.datasource.ds_master_1.password=
spring.shardingsphere.datasource.ds_master_1.max-active=16
spring.shardingsphere.datasource.ds_master_1_slave_0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_master_1_slave_0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_master_1_slave_0.jdbc-url=
spring.shardingsphere.datasource.ds_master_1_slave_0.username=
spring.shardingsphere.datasource.ds_master_1_slave_0.password=
spring.shardingsphere.datasource.ds_master_1_slave_0.max-active=16
spring.shardingsphere.datasource.ds_master_1_slave_1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds_master_1_slave_1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds_master_1_slave_1.jdbc-url=
spring.shardingsphere.datasource.ds_master_1_slave_1.username=admin
spring.shardingsphere.datasource.ds_master_1_slave_1.password=
spring.shardingsphere.datasource.ds_master_1_slave_1.max-active=16
spring.shardingsphere.sharding.default-database-strategy.inline.sharding-column=user_id
spring.shardingsphere.sharding.default-database-strategy.inline.algorithm-expression=ds_$->{user_id % 2}
spring.shardingsphere.sharding.binding-tables=t_order,t_order_item
spring.shardingsphere.sharding.broadcast-tables=t_address
spring.shardingsphere.sharding.default-data-source-name=ds_master_0
spring.shardingsphere.sharding.tables.t_order.actual-data-nodes=ds_$->{0..1}.t_order_$->{0..1}
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.algorithm-expression=t_order_$->{order_id % 2}
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order.key-generator.props.worker.id=123
spring.shardingsphere.sharding.tables.t_order_item.actual-data-nodes=ds_$->{0..1}.t_order_item_$->{0..1}
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.algorithm-expression=t_order_item_$->{order_id % 2}
spring.shardingsphere.sharding.tables.t_order_item.key-generator.column=order_item_id
spring.shardingsphere.sharding.tables.t_order_item.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order_item.key-generator.props.worker.id=123
# master/slave data source and slave data source configuration
spring.shardingsphere.sharding.master-slave-rules.ds_0.master-data-source-name=ds_master_0
spring.shardingsphere.sharding.master-slave-rules.ds_0.slave-data-source-names=ds_master_0_slave_0, ds_master_0_slave_1
spring.shardingsphere.sharding.master-slave-rules.ds_1.master-data-source-name=ds_master_1
spring.shardingsphere.sharding.master-slave-rules.ds_1.slave-data-source-names=ds_master_1_slave_0, ds_master_1_slave_1
# sharding-jdbc mode
spring.shardingsphere.mode.type=Memory
# start shardingsphere log
spring.shardingsphere.props.sql.show=true

 

3.6.3 Test and verification process description

1. DDL operation

JPA automatically creates tables for testing. Once Sharding-JDBC’s database sharding and routing rules are configured, the client executes the DDL and Sharding-JDBC automatically creates the corresponding tables according to the table sharding rules. Because t_address is a broadcast table, it is created on both ds_0 and ds_1. As a result, all three tables (t_address, t_order and t_order_item) are created on both ds_0 and ds_1.

2. Write operation

For the broadcast table t_address, each record is written to the t_address tables on both ds_0 and ds_1.

For the sharded tables t_order and t_order_item, each record is written to the table on the corresponding instance according to the sharding column and routing policy.
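To make the routing concrete, here is a minimal Python sketch of how the inline expressions in the profile resolve a row to a data node. It is not ShardingSphere code: `route_order` and `MASTER_SLAVE_RULES` are hypothetical names that only mirror the `ds_$->{user_id % 2}` and `t_order_$->{order_id % 2}` expressions and the master/slave rule names from the configuration.

```python
# Illustrative only: mimics the inline sharding expressions from the profile.
# route_order and MASTER_SLAVE_RULES are hypothetical, not ShardingSphere APIs.

MASTER_SLAVE_RULES = {
    "ds_0": {"master": "ds_master_0",
             "slaves": ["ds_master_0_slave_0", "ds_master_0_slave_1"]},
    "ds_1": {"master": "ds_master_1",
             "slaves": ["ds_master_1_slave_0", "ds_master_1_slave_1"]},
}


def route_order(user_id: int, order_id: int) -> dict:
    """Resolve the logical data source and physical table for a t_order row."""
    logical_ds = f"ds_{user_id % 2}"    # default-database-strategy on user_id
    table = f"t_order_{order_id % 2}"   # table-strategy on order_id
    rule = MASTER_SLAVE_RULES[logical_ds]
    return {
        "logical_ds": logical_ds,
        "table": table,
        "write_to": rule["master"],     # writes go to the master
        "read_from": rule["slaves"],    # reads may be served by the slaves
    }


print(route_order(user_id=7, order_id=10))
```

An odd `user_id` lands on `ds_1` and an even `order_id` on `t_order_0`, so this row would be written to `ds_master_1` under the assumptions above.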

3. Read operation

The join query operations on order and order_item under the binding table are shown below.

3. Conclusion

As an open source product focusing on database enhancement, ShardingSphere is pretty good in terms of its community activity, product maturity and documentation richness.

Among its products, ShardingSphere-JDBC is a sharding solution based on the client-side, which supports all sharding scenarios. And there’s no need to introduce an intermediate layer like Proxy, so the complexity of operation and maintenance is reduced. Its latency is theoretically lower than Proxy due to the lack of intermediate layer. In addition, ShardingSphere-JDBC can support a variety of relational databases based on SQL standards such as MySQL/PostgreSQL/Oracle/SQL Server, etc.

However, due to the integration of Sharding-JDBC with the application program, it only supports Java language for now, and is strongly dependent on the application programs. Nevertheless, Sharding-JDBC separates all sharding configuration from the application program, which brings relatively small changes when switching to other middleware.

In conclusion, Sharding-JDBC is a good choice if you use a Java-based system, have to interconnect with different relational databases, and don’t want to bother with introducing an intermediate layer.

Author

Sun Jinhua

A senior solution architect at AWS, Sun is responsible for providing customers with cloud-related design and consulting services. Before joining AWS, he ran his own business, specializing in building e-commerce platforms and designing the overall architecture for e-commerce platforms of automotive companies. He worked in a global leading communication equipment company as a senior engineer, responsible for the development and architecture design of multiple subsystems of LTE equipment system. He has rich experience in architecture design for high-concurrency and high-availability systems, microservice architecture design, databases, middleware, IoT, etc.


Lina Biyinzika

1678051620

A Practical Guide of Unsupervised Learning Algorithms

This machine learning tutorial is a practical guide to unsupervised learning algorithms. Machine learning is a fast-growing technology that allows computers to learn from past data and predict future outcomes. It uses numerous algorithms to build mathematical models and predict trends. Machine learning (ML) has widespread applications in industry, including speech recognition, image recognition, churn prediction, email filtering, chatbot development, recommender systems, and much more.

Machine learning (ML) can be classified into three main categories: supervised, unsupervised, and reinforcement learning. In supervised learning, the model is trained on labeled data. In unsupervised learning, the model is given unlabeled data and has to find structure in it on its own. Reinforcement learning is feedback-based learning in which the agent collects a reward for each correct action and gets a penalty for a wrong decision. The goal of the learning agent is to maximize its reward and reduce the error.

What is Unsupervised Learning?

In unsupervised learning, the model learns from unlabeled data without proper supervision.

Unsupervised learning uses machine learning techniques to cluster unlabeled data based on similarities and differences. The unsupervised algorithms discover hidden patterns in data without human supervision. Unsupervised learning aims to arrange the raw data into new features or groups together with similar patterns of data.

For instance, to predict the churn rate, we provide unlabeled data to our model for prediction. There is no information given that customers have churned or not. The model will analyze the data and find hidden patterns to categorize into two clusters: churned and non-churned customers.

Unsupervised Learning Approaches

Unsupervised algorithms can be used for three tasks—clustering, dimensionality reduction, and association. Below, we will highlight some commonly used clustering and association algorithms.

Clustering Techniques

Clustering, or cluster analysis, is a popular data mining technique for unsupervised learning. The clustering approach works to group non-labeled data based on similarities and differences. Unlike supervised learning, clustering algorithms discover natural groupings in data. 

A good clustering method produces high-quality clusters with high intra-class similarity (data within a cluster is similar) and low inter-class similarity (data in one cluster is dissimilar to data in other clusters).

It can be defined as the grouping of data points into clusters of similar data points. Objects in the same group are similar to each other and share few similarities with objects in other groups. Here, we will discuss two popular clustering techniques: K-Means clustering and DBScan clustering.

K-Means Clustering

K-Means is one of the simplest unsupervised techniques used to solve clustering problems. It groups the unlabeled data into clusters. The K value defines the number of clusters, so you need to tell the algorithm how many to create.

K-Means is a centroid-based algorithm in which each cluster is associated with a centroid. The goal is to minimize the sum of the squared distances between the data points and the centroids of their assigned clusters.

It is an iterative approach that breaks down the unlabeled data into different clusters so that each data point belongs to a group with similar characteristics.

K-means clustering performs two tasks:

  1. Using an iterative process to find the best positions for the K centroids.
  2. Assigning each data point to its closest centroid. The data points closest to a particular centroid form a cluster.
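The two tasks above can be sketched in plain Python. This is a minimal illustration of the standard iteration (often called Lloyd's algorithm), not the scikit-learn implementation used later in this tutorial; the function name `kmeans`, the random initialization from data points, and the toy data are assumptions made for the example.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, labels


data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(data, k=2)
print(centroids, labels)
```

On this toy data the first three points end up in one cluster and the last three in the other, whichever two points are drawn as the starting centroids.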

 

An illustration of K-means clustering. Image source

DBScan Clustering

“DBScan” stands for “Density-based spatial clustering of applications with noise.” There are three main words in DBscan: density, clustering, and noise. Therefore, this algorithm uses the notion of density-based clustering to form clusters and detect the noise.

Clusters are usually dense regions that are separated by lower density regions. Unlike the k-means algorithm, which works only on well-separated clusters, DBscan has a wider scope and can create clusters within the cluster. It discovers clusters of various shapes and sizes from a large set of data, which consists of noise and outliers.

There are two parameters in the DBScan algorithm:

minPts: The threshold, that is, the minimum number of points that must be grouped together for a region to be considered dense.

eps(ε): The distance measure used to locate the points in the neighborhood. 
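To show how the two parameters interact, here is a compact pure-Python sketch of the DBSCAN procedure. It is an illustration under simple assumptions (Euclidean distance, brute-force neighbor search), and the names `dbscan`, `eps`, and `min_pts` are chosen for this example rather than taken from any library.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1        # i is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

The two dense squares become clusters 0 and 1, while the isolated point at (50, 50) has too few neighbors within eps and is labeled as noise.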


 

An illustration of density-based clustering. Image Source 

Association Rule Mining

Association rule mining is a popular data mining technique. It finds interesting correlations among large numbers of data items. An association rule shows how frequently items occur together in a transaction.

Market Basket Analysis is a typical example of association rule mining that finds relationships between items in a grocery store. It enables retailers to identify and analyze the associations between items that people frequently buy.

Important terminology used in association rules:

Support: It tells us how frequently an item or a combination of items appears across all transactions.

Confidence: It tells us how often the items A and B occur together, given the number of times A occurs.

Lift: The lift indicates the strength of a rule over the random occurrence of A and B. For instance, if the lift value of A->B is 5, then customers who buy A are five times as likely to buy B as customers in general.
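The three measures can be computed directly from a list of transactions. The sketch below uses a small made-up basket list; the function names are chosen for this example, and the formulas are the usual definitions: support(A) = share of transactions containing A, confidence(A->B) = support(A and B) / support(A), and lift(A->B) = confidence(A->B) / support(B).

```python
# Made-up transactions for illustration only.
transactions = [
    {"bread", "coffee"},
    {"bread", "coffee", "cake"},
    {"coffee"},
    {"bread"},
    {"tea"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(a, b):
    """How often b appears in transactions that contain a."""
    return support(a | b) / support(a)


def lift(a, b):
    """Confidence of a -> b relative to how common b is overall."""
    return confidence(a, b) / support(b)


a, b = {"bread"}, {"coffee"}
print(support(a | b), confidence(a, b), lift(a, b))
```

A lift above 1 means the two items appear together more often than they would if they were bought independently.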

The Apriori algorithm is a well-known association rule based technique.

Apriori algorithm 

The Apriori algorithm was proposed by R. Agrawal and R. Srikant in 1994 to find the frequent itemsets in a dataset. The algorithm’s name is based on the fact that it uses prior knowledge of frequently occurring itemsets.

The Apriori algorithm finds frequently occurring items with minimum support. 

It consists of two steps:

  • Generation of candidate itemsets.
  • Removing items that are infrequent and don’t fulfill the criteria of minimum support.
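The two steps can be sketched in a few lines of plain Python. This is a simplified illustration, not the mlxtend implementation used later in this tutorial; `simple_apriori` is a hypothetical name and the transactions are made up.

```python
from itertools import combinations


def simple_apriori(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    size = 1
    while current:
        # Pruning step: drop candidates below the minimum support.
        kept = {c: support(c) for c in current}
        kept = {c: s for c, s in kept.items() if s >= min_support}
        frequent.update(kept)
        # Candidate generation: join frequent itemsets into sets one item
        # larger, keeping only those whose smaller subsets are all frequent.
        size += 1
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        current = [c for c in candidates
                   if all(frozenset(s) in kept for s in combinations(c, size - 1))]
    return frequent


baskets = [
    {"bread", "coffee"},
    {"bread", "coffee", "cake"},
    {"coffee"},
    {"bread"},
    {"tea"},
]
for itemset, sup in simple_apriori(baskets, min_support=0.4).items():
    print(sorted(itemset), sup)
```

Because any superset of an infrequent itemset is also infrequent, pruning early keeps the number of candidates small at each level.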

Practical Implementation of Unsupervised Algorithms 

In this tutorial, you will learn about the implementation of various unsupervised algorithms in Python. Scikit-learn is a powerful Python library widely used for unsupervised learning tasks. It is an open-source library that provides numerous robust algorithms, including classification, dimensionality reduction, and clustering techniques. For association rules, we will use the mlxtend library.

Let’s begin!

1. K-Means algorithm 

Now let’s dive deep into the implementation of the K-Means algorithm in Python. We’ll break down each code snippet so that you can understand it easily.

Import libraries

First of all, we will import the required libraries and get access to the functions.

#Let's import the required libraries
import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns

Loading the dataset 

The dataset is taken from the Kaggle website. You can easily download it from the given link. To load the dataset, we use the pd.read_csv() function. head() returns the first five rows of the dataset.

my_data = pd.read_csv('Customers_Mall.csv')
my_data.head()

The dataset contains five columns: customer ID, gender, age, annual income in (K$), and spending score from 1-100. 

Data Preprocessing 

The info() function is used to get quick information about the dataset. It shows the number of entries, columns, total non-null values, memory usage, and datatypes. 

my_data.info()


 

To check the missing values in the dataset, we use isnull().sum(), which returns the total number of null values.

 

#Check missing values 
my_data.isnull().sum()


 

The box plot or whisker plot is used to detect outliers in the dataset. It also shows a statistical five number summary, which includes the minimum, first quartile, median (2nd quartile), third quartile, and maximum.
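As a rough sketch of what the box plot encodes, the five-number summary and the common 1.5 × IQR outlier rule (the default whisker length in Matplotlib box plots) can be computed with the standard library. The income values below are made up for illustration, and the quartile method is an assumption; plotting libraries may interpolate quartiles slightly differently.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def iqr_outliers(values):
    """Values lying more than 1.5 * IQR outside the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


incomes = [15, 20, 40, 54, 61, 63, 78, 87, 137]  # made-up annual incomes (k$)
print(five_number_summary(incomes))
print(iqr_outliers(incomes))
```

Points flagged by this rule are the ones a box plot draws as individual markers beyond the whiskers.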

my_data.boxplot(figsize=(8,4))

Using Box Plot, we’ve detected an outlier in the annual income column. Now we will try to remove it before training our model. 

#let's remove outlier from data
med =61
my_data["Annual Income (k$)"] = np.where(my_data["Annual Income (k$)"] >
 120,med,my_data['Annual Income (k$)'])


The outlier in the annual income column has now been replaced with the value 61. To confirm, we draw the box plot again.

my_data.boxplot(figsize=(8,5))

Data Visualization

A histogram is used to illustrate the important features of the distribution of data. The hist() function is used to show the distribution of data in each numerical column.

my_data.hist(figsize=(6,6)) 

The correlation heatmap is used to find the potential relationships between variables in the data and to display the strength of those relationships. To display the heatmap, we have used the seaborn plotting library.

plt.figure(figsize=(10,6))
# Correlate numeric columns only; the gender column holds strings
sns.heatmap(my_data.select_dtypes('number').corr(), annot=True, cmap='icefire').set_title('seaborn')
plt.show()

Choosing the Best K Value

The iloc indexer selects rows and columns by integer position. It enables us to pick values that belong to specific rows or columns. Here, we’ve chosen the annual income and spending score columns.

 

X_val = my_data.iloc[:, 3:].values
X_val

 

# Loading Kmeans Library

from sklearn.cluster import KMeans

Now we will select the best value for K using the elbow method. It is used to determine the optimal number of clusters in K-means clustering.

my_val = []

for i in range(1,11):

    kmeans = KMeans(n_clusters = i, init='k-means++', random_state = 123)

    kmeans.fit(X_val)

    my_val.append(kmeans.inertia_)

sklearn.cluster.KMeans takes the number of clusters along with other initialization parameters, and its inertia_ attribute holds the sum of squared distances of the samples to their closest cluster center. To display the result, just call the variable.

my_val

#Visualization of clusters using the elbow method
plt.plot(range(1,11),my_val)
plt.xlabel('The No of clusters')
plt.ylabel('Outcome')
plt.title('The Elbow Method')
plt.show()

With the elbow method, the plot of inertia against the number of clusters looks like an arm, and the point where the curve bends (the elbow) marks the best value of K. In this case, we’ve taken K=3 as the optimal value.
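The quantity plotted on the vertical axis of the elbow curve is the inertia: the sum of squared distances from each point to its nearest centroid. The short sketch below computes it by hand for made-up points and hand-picked centroids, to show why adding a well-placed cluster makes it drop sharply; the function name `inertia` mirrors scikit-learn's attribute but is defined here only for illustration.

```python
def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )


# Two well-separated pairs of points (made-up data).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

one_cluster = inertia(points, [(5.0, 1.0)])                 # single centroid in the middle
two_clusters = inertia(points, [(0.0, 1.0), (10.0, 1.0)])   # one centroid per pair
print(one_cluster, two_clusters)
```

Once every natural group has its own centroid, extra clusters reduce the inertia only slightly, which is what produces the bend in the curve.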

kmeans = KMeans(n_clusters = 3, init='k-means++')
kmeans.fit(X_val)

#To show centroids of clusters
kmeans.cluster_centers_

#Prediction of K-Means clustering
y_kmeans = kmeans.fit_predict(X_val)
y_kmeans

Splitting the dataset into three clusters

The scatter plot is used to show how our dataset has been grouped into three clusters, together with the cluster centroids.

plt.scatter(X_val[y_kmeans == 0,0], X_val[y_kmeans == 0,1], c='red',s=100)

plt.scatter(X_val[y_kmeans == 1,0], X_val[y_kmeans == 1,1], c='green',s=100)

plt.scatter(X_val[y_kmeans == 2,0], X_val[y_kmeans == 2,1], c='orange',s=100)

plt.scatter(kmeans.cluster_centers_[:,0], kmeans.cluster_centers_[:,1], s=300, c='brown')

plt.title('K-Means Unsupervised Learning')

plt.show()


2. Apriori Algorithm

To implement the apriori algorithm, we will utilize “The Bread Basket” dataset. The dataset is available on Kaggle and you can download it from the link. This algorithm suggests products based on the user’s purchase history. Walmart has greatly utilized the algorithm to recommend relevant items to its users.

Let’s implement the Apriori algorithm in Python. 

Import libraries 

To implement the algorithm, we need to import some important libraries.

import pandas as pd

import matplotlib.pyplot as plt

import numpy as np

import seaborn as sns

Loading the dataset 

The dataset contains five columns and 20507 entries. date_time is a prominent column, and we can extract many vital insights from it.

my_data= pd.read_csv("bread basket.csv")
my_data.head()

Data Preprocessing 

Convert the date_time column into a proper datetime format.

my_data['date_time'] = pd.to_datetime(my_data['date_time'])

#Total number of unique transactions

my_data['Transaction'].nunique()

Now we want to derive new columns from date_time to extract meaningful information from the data.

#Let's extract date

my_data['date'] = my_data['date_time'].dt.date

#Let's extract time

my_data['time'] = my_data['date_time'].dt.time

#Extract month and replacing it with String

my_data['month'] = my_data['date_time'].dt.month

my_data['month'] = my_data['month'].replace((1,2,3,4,5,6,7,8,9,10,11,12), 

                                          ('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug',

                                          'Sep','Oct','Nov','Dec'))

#Extract hour

my_data['hour'] = my_data['date_time'].dt.hour

# Replacing hours with text

hr_num = (1,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23)

hr_obj = ('1-2','7-8','8-9','9-10','10-11','11-12','12-13','13-14','14-15',
          '15-16','16-17','17-18','18-19','19-20','20-21','21-22','22-23','23-24')

my_data['hour'] = my_data['hour'].replace(hr_num, hr_obj)

# Extracting weekday and replacing it with String

my_data['weekday'] = my_data['date_time'].dt.weekday

my_data['weekday'] = my_data['weekday'].replace((0,1,2,3,4,5,6),
                                          ('Mon','Tues','Wed','Thur','Fri','Sat','Sun'))

#Now drop date_time column

my_data.drop('date_time', axis = 1, inplace = True)

After extracting the date, time, month, hour, and weekday columns, we dropped the date_time column.

Now to display, we simply use the head() function to see the changes in the dataset.

my_data.head()

# cleaning the item column

my_data['Item'] = my_data['Item'].str.strip()

my_data['Item'] = my_data['Item'].str.lower()

my_data.head()

Data Visualization 

To display the top 10 items purchased by customers, we used a barplot() of the seaborn library. 

plt.figure(figsize=(10,5))

sns.barplot(x=my_data.Item.value_counts().head(10).index, y=my_data.Item.value_counts().head(10).values,palette='RdYlGn')

plt.xlabel('No of Items', size = 17)

plt.xticks(rotation=45)

plt.ylabel('Total Items', size = 18)

plt.title('Top 10 Items purchased', color = 'blue', size = 23)

plt.show()


From the graph, coffee is the top item purchased by the customers, followed by bread.

Now, to display the number of orders received each month, the groupby() function is used along with barplot() to visually show the results.

mon_Tran = my_data.groupby('month')['Transaction'].count().reset_index()

# groupby sorts months alphabetically; mon_order restores calendar order
mon_Tran.loc[:,"mon_order"] = [4,8,12,2,1,7,6,3,5,11,10,9]

mon_Tran.sort_values("mon_order",inplace=True)

plt.figure(figsize=(12,5))

sns.barplot(data = mon_Tran, x = "month", y = "Transaction")

plt.xlabel('Months', size = 14)

plt.ylabel('Monthly Orders', size = 14)

plt.title('No of orders received each month', color = 'blue', size = 18)

plt.show()

To show the number of orders received each day, we applied groupby() to the weekday column.

wk_Tran = my_data.groupby('weekday')['Transaction'].count().reset_index()

wk_Tran.loc[:,"wk_ord"] = [4,0,5,6,3,1,2]

wk_Tran.sort_values("wk_ord",inplace=True)

plt.figure(figsize=(11,4))

sns.barplot(data = wk_Tran, x = "weekday", y = "Transaction",palette='RdYlGn')

plt.xlabel('Week Day', size = 14)

plt.ylabel('Per day orders', size = 14)

plt.title('Orders received per day', color = 'blue', size = 18)

plt.show()


 Implementation of the Apriori Algorithm 

We import the mlxtend library to implement the association rules and count the number of items.

from mlxtend.frequent_patterns import association_rules, apriori

tran_str= my_data.groupby(['Transaction', 'Item'])['Item'].count().reset_index(name ='Count')

tran_str.head(8)


Now we’ll make a mxn matrix where m=transaction and n=items, and each row represents whether the item was in the transaction or not.

Mar_baskt = tran_str.pivot_table(index='Transaction', columns='Item', values='Count', aggfunc='sum').fillna(0)

Mar_baskt.head()


We want to make a function that returns 0 and 1. 0 means that the item wasn’t present in the transaction, while 1 means the item exists.

def encode(val):

    if val<=0:

        return 0

    if val>=1:

        return 1

#Let's apply the function to the dataset

Basket=Mar_baskt.applymap(encode)

Basket.head()


#using apriori algorithm with min_support set to 0.01 (1%)

freq_items = apriori(Basket, min_support = 0.01,use_colnames = True)

freq_items.head()

The association_rules() function generates association rules from the frequent itemsets. Here we keep rules with a lift of at least 1 and sort them by confidence.

App_rule= association_rules(freq_items, metric = "lift", min_threshold = 1)

App_rule.sort_values('confidence', ascending = False, inplace = True)

App_rule.head()

From the above implementation, the rule with the highest confidence pairs toast with coffee, with a lift value of 1.47 and a confidence value of 0.70.

3. Principal Component Analysis 

Principal component analysis (PCA) is one of the most widely used unsupervised learning techniques. It can be used for various tasks, including dimensionality reduction, information compression, exploratory data analysis and Data de-noising.

Let’s use the PCA algorithm!

First we import the required libraries to implement this algorithm.

import numpy as np 

import pandas as pd

import matplotlib.pyplot as plt

import seaborn as sns

%matplotlib inline

from sklearn.decomposition import PCA

from sklearn.datasets import load_digits

Loading the Dataset 

To implement the PCA algorithm, we use the load_digits dataset from Scikit-learn, which can be loaded with the command below. The dataset contains image data with 1797 entries and 64 columns.

 

#Load the dataset

my_data= load_digits()

#Creating features

X_value = my_data.data

#Creating target

#Let's check the shape of X_value

X_value.shape

 


 

 

#Each image is 8x8 pixels, therefore 64 values

my_data.images[10]

#Let's display the image

plt.gray()

plt.matshow(my_data.images[34])

plt.show()

Now let’s project data from 64 columns to 16 to show how 16 dimensions classify the data.

X_val = my_data.data 

y_val = my_data.target

my_pca = PCA(16)  

X_projection = my_pca.fit_transform(X_val)

print(X_val.shape)

print(X_projection.shape)


Using a colormap, we can visualize how the projected data points separate by digit class along the first two principal components. Now we’ll select the optimal number of dimensions (principal components) by which data can be reduced into lower dimensions.

# plt.get_cmap replaces plt.cm.get_cmap, which newer Matplotlib releases removed
plt.scatter(X_projection[:, 0], X_projection[:, 1], c=y_val, edgecolor='white',
            cmap=plt.get_cmap("gist_heat",12))

plt.colorbar();

pca=PCA().fit(X_val)

plt.plot(np.cumsum(pca.explained_variance_ratio_))

plt.xlabel('Principal components')

plt.ylabel('Explained variance')

Based on the graph, about 12 components are enough to explain roughly 80% of the variance, which is far cheaper than working with all 64 features. Thus, we’ve reduced the large number of dimensions to 12 to avoid the curse of dimensionality.
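Instead of reading the number of components off the cumulative variance plot by eye, it can be computed. The helper below is a small sketch with a hypothetical name, `components_needed`, and made-up ratios rather than the digits dataset; with a fitted PCA object you would pass its `explained_variance_ratio_` array instead.

```python
from itertools import accumulate


def components_needed(explained_variance_ratio, threshold):
    """Smallest number of leading components whose cumulative ratio reaches threshold."""
    for count, total in enumerate(accumulate(explained_variance_ratio), start=1):
        if total >= threshold:
            return count
    return None  # threshold not reachable with the given ratios


ratios = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]  # made-up values for illustration
print(components_needed(ratios, 0.75))
```

The ratios are sorted from largest to smallest, as PCA returns them, so the first component count that crosses the threshold is also the smallest one.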



#Let's visualize how it looks like

Unsupervised_pca = PCA(12)  

X_pro = Unsupervised_pca.fit_transform(X_val)

print("New Data Shape is =>",X_pro.shape)

#Let's Create a scatter plot

# plt.get_cmap replaces plt.cm.get_cmap, which newer Matplotlib releases removed
plt.scatter(X_pro[:, 0], X_pro[:, 1], c=y_val, edgecolor='white',
            cmap=plt.get_cmap("nipy_spectral",10))

plt.colorbar();

Wrapping Up 


In this machine learning tutorial, we’ve implemented the K-Means, Apriori, and PCA algorithms. These are some of the most widely used algorithms, with numerous industrial applications, and they solve many real-world problems. For instance, K-means clustering is used in astronomy to study stellar and galaxy spectra, solar polarization spectra, and X-ray spectra. Apriori is used by retail stores to optimize their product inventory.


Original article sourced at: https://thedatascientist.com

#machine-learning 

How to Add a Social Share Button in Laravel 8

The Laravel Share package lets you dynamically generate social share buttons for popular social networks to increase social media engagement.

This lets site visitors easily share content with their social media connections and networks.

In this tutorial, I show how you can add social share links to your Laravel 8 project using the Laravel Share package.

1. Install the package

Install the package using Composer:

composer require jorenvanhocht/laravel-share

2. Update app.php

  • Open the config/app.php file.
  • Add Jorenvh\Share\Providers\ShareServiceProvider::class to 'providers':
'providers' => [
      ....
      ....
      ....  
      Jorenvh\Share\Providers\ShareServiceProvider::class,
];
  • Add 'Share' => Jorenvh\Share\ShareFacade::class to 'aliases':
'aliases' => [
     .... 
     .... 
     .... 
     'Share' => Jorenvh\Share\ShareFacade::class,
];

3. Publish the package

Run the command:

php artisan vendor:publish --provider="Jorenvh\Share\Providers\ShareServiceProvider"

4. Route

  • Open the routes/web.php file.
  • Create a route:
    • / – Load the index view.

Completed code

<?php

use Illuminate\Support\Facades\Route;
use App\Http\Controllers\PageController;

Route::get('/', [PageController::class, 'index']);

5. Controller

  • Create the PageController controller.
php artisan make:controller PageController
  • Open the app/Http/Controllers/PageController.php file.
  • Create 1 method:

index() – Create a share link using Share::page() and assign it to $shareButtons1. Similarly, create 2 more links and assign them to variables.

Load the index view and pass $shareButtons1, $shareButtons2, and $shareButtons3.

Completed code

<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;

class PageController extends Controller
{
         public function index(){

               // Share button 1
               $shareButtons1 = \Share::page(
                     'https://makitweb.com/datatables-ajax-pagination-with-search-and-sort-in-laravel-8/'
               )
               ->facebook()
               ->twitter()
               ->linkedin()
               ->telegram()
               ->whatsapp() 
               ->reddit();

               // Share button 2
               $shareButtons2 = \Share::page(
                     'https://makitweb.com/how-to-make-autocomplete-search-using-jquery-ui-in-laravel-8/'
               )
               ->facebook()
               ->twitter()
               ->linkedin()
               ->telegram();

               // Share button 3
               $shareButtons3 = \Share::page(
                      'https://makitweb.com/how-to-upload-multiple-files-with-vue-js-and-php/'
               )
               ->facebook()
               ->twitter()
               ->linkedin()
               ->telegram()
               ->whatsapp() 
               ->reddit();

               // Load index view
               return view('index')
                     ->with('shareButtons1',$shareButtons1 )
                     ->with('shareButtons2',$shareButtons2 )
                     ->with('shareButtons3',$shareButtons3 );
         }
}

6. View

Create an index.blade.php file in the resources/views/ folder.

Include Bootstrap, Font Awesome CSS, jQuery, and js/share.js:

<!-- CSS -->
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.1/dist/css/bootstrap.min.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"/>

<!-- jQuery -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>

<!-- Share JS -->
<script src="{{ asset('js/share.js') }}"></script>

Add CSS to customize the social share links.

Display the social share links using:

{!! $shareButtons1 !!}

Similarly, display the other 2: {!! $shareButtons2 !!} and {!! $shareButtons3 !!}.

Completed code

<!DOCTYPE html>
<html>
<head>
     <title>Add social share button in Laravel 8 with Laravel Share</title>

     <!-- Meta -->
     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
     <meta charset="utf-8">
     <meta name="viewport" content="width=device-width, initial-scale=1.0">

     <!-- CSS -->
     <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.1/dist/css/bootstrap.min.css" rel="stylesheet">
     <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"/>

     <!-- jQuery -->
     <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>

     <!-- Share JS -->
     <script src="{{ asset('js/share.js') }}"></script>

     <style>
     #social-links ul{
          padding-left: 0;
     }
     #social-links ul li {
          display: inline-block;
     } 
     #social-links ul li a {
          padding: 6px;
          border: 1px solid #ccc;
          border-radius: 5px;
          margin: 1px;
          font-size: 25px;
     }
     #social-links .fa-facebook{
           color: #0d6efd;
     }
     #social-links .fa-twitter{
           color: deepskyblue;
     }
     #social-links .fa-linkedin{
           color: #0e76a8;
     }
     #social-links .fa-whatsapp{
          color: #25D366;
     }
     #social-links .fa-reddit{
          color: #FF4500;
     }
     #social-links .fa-telegram{
          color: #0088cc;
     }
     </style>
</head>
<body>

    <div class='container'>

         <!-- Post 1 -->
         <div class='row mt-5'>
               <h2>Datatables AJAX pagination with Search and Sort in Laravel 8</h2>

               <p>With pagination, it is easier to display a huge list of data on the page.</p>

               <p>You can create pagination with and without AJAX.</p>

               <p>Many jQuery plugins are available for adding pagination. One of them is DataTables.</p>

               <p>In this tutorial, I show how you can add Datatables AJAX pagination without the Laravel package in Laravel 8.</p>

               <!-- Social Share buttons 1 -->
               <div class="social-btn-sp">
                     {!! $shareButtons1 !!}
               </div> 
          </div>

          <!-- Post 2 -->
          <div class='row mt-5'>
                 <h2>How to make Autocomplete search using jQuery UI in Laravel 8</h2>

                 <p>jQuery UI has different types of widgets available, one of them is autocomplete.</p>

                 <p>Data is loaded according to the input after initializing autocomplete on a textbox. The user can select an option from the suggestion list.</p>

                 <p>In this tutorial, I show how you can make autocomplete search using jQuery UI in Laravel 8.</p>

                 <!-- Social Share buttons 2 -->
                 <div class="social-btn-sp">
                        {!! $shareButtons2 !!}
                 </div>
           </div>

           <!-- Post 3 -->
           <div class='row mt-5 mb-5'>
                 <h2>How to upload multiple files with Vue.js and PHP</h2>

                 <p>Instead of adding multiple file elements, you can use a single file element for allowing the user to upload more than one file.</p>

                 <p>Use the FormData object to pass the selected files to PHP for upload.</p>

                 <p>In this tutorial, I show how you can upload multiple files using Vue.js and PHP.</p>

                 <!-- Social Share buttons 3 -->
                 <div class="social-btn-sp">
                      {!! $shareButtons3 !!}
                 </div>
           </div>

     </div>
</body>
</html>

7. Demo

View demo


8. Conclusion

In the example, I hard-coded the links, but you can set them dynamically.

Customize the design using CSS, and control the number of visible social icons from the controller.
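One possible way to set the links and the visible icons dynamically is to keep the URLs and network names in an array and loop over it. The sketch below is an assumption about how that could look, not code from the original tutorial. ShareStub is a hypothetical stand-in so the example runs outside Laravel; in a real controller you would call \Share::page($url) instead. The $posts data shape and the buildShareButtons() helper are also made up for illustration.

```php
<?php

// Hypothetical stand-in for the Share facade so this sketch runs outside Laravel.
// In a real controller, replace `new ShareStub($url)` with `\Share::page($url)`.
final class ShareStub
{
    private $url;
    private $networks = [];

    public function __construct(string $url)
    {
        $this->url = $url;
    }

    // Records calls such as facebook() or twitter() and returns $this for chaining.
    public function __call(string $name, array $arguments): self
    {
        $this->networks[] = $name;
        return $this;
    }

    public function describe(): string
    {
        return $this->url . ' => ' . implode(', ', $this->networks);
    }
}

// Assumed data shape; in a real app this might come from a database query.
$posts = [
    'shareButtons1' => [
        'url' => 'https://makitweb.com/datatables-ajax-pagination-with-search-and-sort-in-laravel-8/',
        'networks' => ['facebook', 'twitter', 'linkedin', 'telegram', 'whatsapp', 'reddit'],
    ],
    'shareButtons2' => [
        'url' => 'https://makitweb.com/how-to-make-autocomplete-search-using-jquery-ui-in-laravel-8/',
        'networks' => ['facebook', 'twitter', 'linkedin', 'telegram'],
    ],
];

// Builds one share object per post, enabling only the listed networks.
function buildShareButtons(array $posts): array
{
    $buttons = [];
    foreach ($posts as $name => $post) {
        $share = new ShareStub($post['url']);
        foreach ($post['networks'] as $network) {
            $share = $share->{$network}(); // dynamic method call, e.g. ->facebook()
        }
        $buttons[$name] = $share;
    }
    return $buttons;
}

foreach (buildShareButtons($posts) as $name => $share) {
    echo $name, ': ', $share->describe(), PHP_EOL;
}
```

If the network names ever come from user input or configuration, it is safer to check them against a fixed list of allowed names before calling them as methods.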

Using the Laravel Share package, you can share links to:

  • Facebook,
  • Twitter,
  • LinkedIn,
  • WhatsApp,
  • Reddit, and
  • Telegram

Source: https://makitweb.com

#php #laravel 

Jarrod Douglas

1658370780

Add a Social Share Button in Laravel 8 with Laravel Share

The Laravel Share package lets you dynamically generate social share buttons for popular social networks to increase social media engagement.

These buttons let website visitors easily share content with their social media connections and networks.

In this tutorial, I show how you can add social share links to your Laravel 8 project using the Laravel Share package.

1. Install the package

Install the package using Composer:

composer require jorenvanhocht/laravel-share

2. Update app.php

  • Open the config/app.php file.
  • Add Jorenvh\Share\Providers\ShareServiceProvider::class to 'providers':
'providers' => [
      ....
      ....
      ....  
      Jorenvh\Share\Providers\ShareServiceProvider::class,
];
  • Add 'Share' => Jorenvh\Share\ShareFacade::class to 'aliases':
'aliases' => [
     .... 
     .... 
     .... 
     'Share' => Jorenvh\Share\ShareFacade::class,
];

3. Publish the package

Run the command:

php artisan vendor:publish --provider="Jorenvh\Share\Providers\ShareServiceProvider"

4. Route

  • Open the routes/web.php file.
  • Create a route:
    • / – Load the index view.

Completed code

<?php

use Illuminate\Support\Facades\Route;
use App\Http\Controllers\PageController;

Route::get('/', [PageController::class, 'index']);

5. Controller

  • Create a PageController controller.
php artisan make:controller PageController
  • Open the app/Http/Controllers/PageController.php file.
  • Create 1 method:

index() - Create a share link using Share::page() and assign it to $shareButtons1. Similarly, create 2 more links and assign them to variables.

Load the index view and pass $shareButtons1, $shareButtons2, and $shareButtons3.

Completed code

<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;

class PageController extends Controller
{
         public function index(){

               // Share button 1
               $shareButtons1 = \Share::page(
                     'https://makitweb.com/datatables-ajax-pagination-with-search-and-sort-in-laravel-8/'
               )
               ->facebook()
               ->twitter()
               ->linkedin()
               ->telegram()
               ->whatsapp() 
               ->reddit();

               // Share button 2
               $shareButtons2 = \Share::page(
                     'https://makitweb.com/how-to-make-autocomplete-search-using-jquery-ui-in-laravel-8/'
               )
               ->facebook()
               ->twitter()
               ->linkedin()
               ->telegram();

               // Share button 3
               $shareButtons3 = \Share::page(
                      'https://makitweb.com/how-to-upload-multiple-files-with-vue-js-and-php/'
               )
               ->facebook()
               ->twitter()
               ->linkedin()
               ->telegram()
               ->whatsapp() 
               ->reddit();

               // Load index view
               return view('index')
                     ->with('shareButtons1',$shareButtons1 )
                     ->with('shareButtons2',$shareButtons2 )
                     ->with('shareButtons3',$shareButtons3 );
         }
}

6. View

Create an index.blade.php file in the resources/views/ folder.

Include Bootstrap, Font Awesome CSS, jQuery, and js/share.js:

<!-- CSS -->
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.1/dist/css/bootstrap.min.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"/>

<!-- jQuery -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>

<!-- Share JS -->
<script src="{{ asset('js/share.js') }}"></script>

Added CSS to customize the social share links.

Display the social share links using:

{!! $shareButtons1 !!}

Similarly, display the other two: {!! $shareButtons2 !!} and {!! $shareButtons3 !!}.

Completed code

<!DOCTYPE html>
<html>
<head>
     <title>Add social share button in Laravel 8 with Laravel Share</title>

     <!-- Meta -->
     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
     <meta charset="utf-8">
     <meta name="viewport" content="width=device-width, initial-scale=1.0">

     <!-- CSS -->
     <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.1/dist/css/bootstrap.min.css" rel="stylesheet">
     <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"/>

     <!-- jQuery -->
     <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>

     <!-- Share JS -->
     <script src="{{ asset('js/share.js') }}"></script>

     <style>
     #social-links ul{
          padding-left: 0;
     }
     #social-links ul li {
          display: inline-block;
     } 
     #social-links ul li a {
          padding: 6px;
          border: 1px solid #ccc;
          border-radius: 5px;
          margin: 1px;
          font-size: 25px;
     }
     #social-links .fa-facebook{
           color: #0d6efd;
     }
     #social-links .fa-twitter{
           color: deepskyblue;
     }
     #social-links .fa-linkedin{
           color: #0e76a8;
     }
     #social-links .fa-whatsapp{
          color: #25D366;
     }
     #social-links .fa-reddit{
          color: #FF4500;
     }
     #social-links .fa-telegram{
          color: #0088cc;
     }
     </style>
</head>
<body>

    <div class='container'>

         <!-- Post 1 -->
         <div class='row mt-5'>
               <h2>Datatables AJAX pagination with Search and Sort in Laravel 8</h2>

               <p>With pagination, it is easier to display a huge list of data on the page.</p>

               <p>You can create pagination with and without AJAX.</p>

               <p>Many jQuery plugins are available for adding pagination. One of them is DataTables.</p>

               <p>In this tutorial, I show how you can add Datatables AJAX pagination without the Laravel package in Laravel 8.</p>

               <!-- Social Share buttons 1 -->
               <div class="social-btn-sp">
                     {!! $shareButtons1 !!}
               </div> 
          </div>

          <!-- Post 2 -->
          <div class='row mt-5'>
                 <h2>How to make Autocomplete search using jQuery UI in Laravel 8</h2>

                 <p>jQuery UI has different types of widgets available, one of them is autocomplete.</p>

                 <p>Data is loaded according to the input after initializing autocomplete on a textbox. The user can select an option from the suggestion list.</p>

                 <p>In this tutorial, I show how you can make autocomplete search using jQuery UI in Laravel 8.</p>

                 <!-- Social Share buttons 2 -->
                 <div class="social-btn-sp">
                        {!! $shareButtons2 !!}
                 </div>
           </div>

           <!-- Post 3 -->
           <div class='row mt-5 mb-5'>
                 <h2>How to upload multiple files with Vue.js and PHP</h2>

                 <p>Instead of adding multiple file elements, you can use a single file element for allowing the user to upload more than one file.</p>

                 <p>Use the FormData object to pass the selected files to PHP for upload.</p>

                 <p>In this tutorial, I show how you can upload multiple files using Vue.js and PHP.</p>

                 <!-- Social Share buttons 3 -->
                 <div class="social-btn-sp">
                      {!! $shareButtons3 !!}
                 </div>
           </div>

     </div>
</body>
</html>

7. Demo

View demo


8. Conclusion

In the example, I hard-coded the links, but you can set them dynamically.

Customize the design using CSS, and control the number of visible social icons from the controller.

Using the Laravel Share package, you can share links to:

  • Facebook,
  • Twitter,
  • LinkedIn,
  • WhatsApp,
  • Reddit, and
  • Telegram
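
To make the list above more concrete, a share link is essentially a URL on the target network that carries the page address as a URL-encoded query parameter. The plain-PHP sketch below shows the idea using commonly used public share endpoints. buildShareUrls() is a hypothetical helper written for illustration, and the exact URLs that Laravel Share generates may differ.

```php
<?php

/**
 * Hypothetical helper, for illustration only: builds one share URL per network.
 * The endpoints are commonly used public share URLs, not taken from the package.
 */
function buildShareUrls(string $pageUrl, string $title): array
{
    // Encode both values so they are safe to place in a query string.
    $u = rawurlencode($pageUrl);
    $t = rawurlencode($title);

    return [
        'facebook' => "https://www.facebook.com/sharer/sharer.php?u={$u}",
        'twitter'  => "https://twitter.com/intent/tweet?text={$t}&url={$u}",
        'linkedin' => "https://www.linkedin.com/sharing/share-offsite/?url={$u}",
        'whatsapp' => "https://wa.me/?text={$u}",
        'reddit'   => "https://www.reddit.com/submit?title={$t}&url={$u}",
        'telegram' => "https://t.me/share/url?url={$u}&text={$t}",
    ];
}

$links = buildShareUrls(
    'https://makitweb.com/how-to-upload-multiple-files-with-vue-js-and-php/',
    'How to upload multiple files with Vue.js and PHP'
);

foreach ($links as $network => $href) {
    echo $network, ': ', $href, PHP_EOL;
}
```

In the tutorial itself none of this is written by hand: the chained calls such as ->facebook() and ->twitter() on Share::page() produce the links for you.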

Source: https://makitweb.com

#php #laravel