Karim Aya

1665544549

How WebSockets Work | Real-time Websocket Server Creation

A WebSocket is a persistent bi-directional communication channel between a client (e.g. a browser) and a backend service. In contrast with HTTP request/response connections, websockets can transport any number of protocols and provide server-to-client message delivery without polling.

In today's article we will learn how WebSockets work and how to create a real-time Websocket server in the simplest way.

Table Of Contents

How WebSockets work

Making a WebSocket

  • 1: Creating our Server
  • 2: Connect on Frontend

How WebSockets work

WebSocket was developed to work around the limitations of HTTP. In HTTP, a client first asks for a resource, and the server responds with the requested data. The server cannot send anything the client has not asked for, so every exchange is initiated by the client. To work around this limitation, developers resorted to long polling, but that ties up a lot of server resources, which is where WebSockets come in.

WebSocket is a full-duplex, message-based protocol that runs over TCP and lets the server push data to the client at any time. This is where WebSocket differs from HTTP. The connection starts as an ordinary HTTP request, but the underlying TCP connection is kept open after the HTTP response, so the server and the client can keep exchanging messages over it. WebSockets let developers build “real-time” applications without resorting to long polling.

Although both protocols rely on TCP, WebSocket is a separate protocol from HTTP. Its design “enables it to work over HTTP/HTTPS ports 80 and 443 along with supporting HTTP/HTTPS proxies and intermediaries.” This is where its compatibility with HTTP comes from. To achieve this compatibility, the WebSocket handshake uses the HTTP Upgrade header to switch the connection from HTTP/HTTPS to the WebSocket protocol.
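To make the handshake a little more concrete, here is a minimal Python sketch of how a server derives the Sec-WebSocket-Accept response header from the client's Sec-WebSocket-Key header, following RFC 6455. The function name accept_key is made up for illustration. In the tutorial below, the browser and express-ws handle this step for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455; every WebSocket server uses the same value.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key header."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Sample key taken from RFC 6455
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client checks this value before treating the connection as upgraded, which guards against servers that do not actually speak the WebSocket protocol.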

Making a WebSocket

A WebSocket connection therefore has two parts: the server, and the local machine that the user is on. We'll use Node.js for our server, but other languages also support WebSockets.

When the user accesses our website, we load a file with some JavaScript which contains a connection string to our WebSocket. Meanwhile, in our backend, we will have a WebSocket set up that the user will connect to.

To create a WebSocket we follow these two steps:

1: Creating our Server

Let's start by making our Node.js web server for the WebSocket connection. For this, we're going to use an Express server with an additional package called express-ws. This package lets us use ws in the same way we might use get with Express.

If you don't have Node.js installed, you'll need to install it first. Once installed, create a new folder called server-websocket.

Once in the folder, install the dependencies by running each of the following commands:

npm i express
npm i express-ws

The path and url modules used below are built into Node.js, so they don't need to be installed.

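After that, make a file called index.js and put in the following code. The file uses ES module import syntax, so Node.js needs "type": "module" in the folder's package.json (or the file can be named index.mjs instead):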

// Import path and url dependencies
import path from 'path'
import { fileURLToPath } from 'url'

// Get the directory and file path
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

// Import express, expressWs, and http
import express from 'express'
import expressWs from 'express-ws'
import http from 'http'

// Our port
let port = 3000;

// App and server
let app = express();
let server = http.createServer(app).listen(port);    

// Apply expressWs
expressWs(app, server);

app.use(express.static(__dirname + '/views'));

// Get the route / 
app.get('/', (req, res) => {
    res.status(200).send("Welcome to our app");
});

// This lets the server pick up the '/ws' WebSocket route
app.ws('/ws', async function(ws, req) {
    // Start listening for messages on this connection
    ws.on('message', async function(msg) {
        // When a message arrives, we'll console log it on the server
        console.log(msg);
    });
});

The last clause, app.ws, refers to the WebSocket, and that's what we'll try to connect to on the frontend. For the time being, the WebSocket only console logs a message, whenever it receives one from the frontend. Let's change that so it sends something back:

// Get the /ws WebSocket route
app.ws('/ws', async function(ws, req) {
    ws.on('message', async function(msg) {
        // What was the message?
        console.log(msg);
        // Send back some data
        ws.send(JSON.stringify({
            "append" : true,
            "returnText" : "I am using WebSockets!"
        }));
    });
});

Now whenever this WebSocket connection receives data, it will send back the object containing append and returnText. We'll also console log the message that the server has received.

We can then manipulate this object in our frontend, to display or change views for the user.

Save that file in your server-websocket folder as index.js. Then from your terminal, in the server-websocket folder, run the following command:

node index.js

2: Connect on Frontend

Now we have a running websocket server, but no way to connect to it. We want to achieve something like this:

  • A user visits our site.
  • We initiate a WebSocket connection from our Javascript file.
  • The user successfully connects to the WebSocket, and sends a message to the WebSocket once connected.
  • We can then send data back to the user, now that they have a live connection to our WebSocket server, creating real time data exchange.

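For our demo, let's start by making two files: index.html and local.js, both of which are frontend files. Put them in a views folder next to index.js, since that is the folder our Express server serves static files from. Next, let's put the following in our index.html file: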

<script src="local.js"></script>
<p>Welcome to WebSockets. Click here to start receiving messages.</p>
<button id="websocket-button">Click me</button>
<div id="websocket-returns"></div>

Next, we need to connect the user to our WebSocket, via the local.js file. Our local.js file will ultimately look like this:

// @connect
// Connect to the websocket
let socket;
// This will let us create a connection to our Server websocket.
// For this to work, your websocket needs to be running with node index.js
const connect = function() {
    // Return a promise, which will wait for the socket to open
    return new Promise((resolve, reject) => {
        // This calculates the link to the websocket. 
        const socketProtocol = (window.location.protocol === 'https:' ? 'wss:' : 'ws:')
        const port = 3000;
        const socketUrl = `${socketProtocol}//${window.location.hostname}:${port}/ws/`
        
        // Define socket
        // If you are running your websocket on localhost, you can change
        // socketUrl to 'ws://localhost:3000/ws', as we are running our websocket
        // on port 3000 from the previous websocket code.
        socket = new WebSocket(socketUrl);

        // This will fire once the socket opens
        socket.onopen = (e) => {
            // Send a little test data, which we can use on the server if we want
            socket.send(JSON.stringify({ "loaded" : true }));
            // Resolve the promise - we are connected
            resolve();
        }

        // This will fire when the server sends the user a message
        socket.onmessage = (data) => {
            console.log(data);
            // Any data from the server can be manipulated here.
            let parsedData = JSON.parse(data.data);
            if(parsedData.append === true) {
                const newEl = document.createElement('p');
                newEl.textContent = parsedData.returnText;
                document.getElementById('websocket-returns').appendChild(newEl);
            }
        }

        // This will fire on error
        socket.onerror = (e) => {
            // Return an error if any occurs
            console.log(e);
            resolve();
            // Try to connect again
            connect();
        }
    });
}

// @isOpen
// check if a websocket is open
const isOpen = function(ws) { 
    return ws.readyState === ws.OPEN 
}

// When the document has loaded
document.addEventListener('DOMContentLoaded', function() {
    // Connect to the websocket
    connect();
    // And add our event listeners
    document.getElementById('websocket-button').addEventListener('click', function(e) {
        if(isOpen(socket)) {
            socket.send(JSON.stringify({
                "data" : "this is our data to send",
                "other" : "this can be in any format"
            }))
        }
    });
});

This might look like a lot, but let's break it down. In our connect function we start by constructing our WebSocket URL. On a local machine this could simply be written as ws://localhost:3000/ws, since our WebSocket server runs on port 3000 and listens on the /ws route. Above, it is configured to pick ws: or wss: automatically, depending on whether the page is served over HTTP or HTTPS.
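The same URL logic can be sketched in Python, purely for illustration. The function name websocket_url is hypothetical, and the browser code above remains the version the page actually runs.

```python
def websocket_url(page_protocol: str, hostname: str, port: int = 3000, path: str = "/ws/") -> str:
    """Mirror the frontend logic: secure pages get wss:, everything else gets ws:."""
    socket_protocol = "wss:" if page_protocol == "https:" else "ws:"
    return f"{socket_protocol}//{hostname}:{port}{path}"

print(websocket_url("http:", "localhost"))     # ws://localhost:3000/ws/
print(websocket_url("https:", "example.com"))  # wss://example.com:3000/ws/
```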

We then pass some event listeners to our newly created WebSocket. All of our event listeners, and the URL to connect to the WebSocket server sit within our connect() function - the purpose of which is to connect to our WebSocket server.

Our WebSocket event listeners look like this:

  • socket.onopen - if the connection is successful and open, this fires.
  • socket.onmessage - any time the server sends a message to us, this fires. In our example, we will append a new element to our user's HTML if they receive data which has append set to true.
  • socket.onerror - if the connection fails, or an error occurs, this will fire.

Now that we have a connect() function which lets us connect to our WebSocket server, we have to run it. We start by waiting for the page to load, using DOMContentLoaded. Then we connect to our WebSocket using the connect() function.

Finally, we attach an event listener to the button on our HTML page, which when clicked will send some data to our WebSocket using the socket.send() function. When the server receives this data, it sends back its own data, as the server's 'message' event will fire.

// When the document has loaded
document.addEventListener('DOMContentLoaded', function() {
    // Connect to the websocket
    connect();
    // And add our event listeners
    document.getElementById('websocket-button').addEventListener('click', function(e) {
        if(isOpen(socket)) {
            socket.send(JSON.stringify({
                "data" : "this is our data to send",
                "other" : "this can be in any format"
            }))
        }
    });
});

Since our onmessage event handler on our WebSocket fires whenever new data comes from the WebSocket server, clicking the button causes the WebSocket server to send a message back to us - thus creating a new HTML <p> element.

That's all we want to introduce to you. Happy coding!!!

#websocket #javascript 

How to Scrape Data from Twitter Using Tweepy and Snscrape

If you are a data enthusiast, you'll probably agree that one of the richest sources of real-world data is social media. Sites like Twitter are full of data.

You can use the data you get from social media in a number of ways, such as sentiment analysis (analyzing people's thoughts) on a specific issue or field of interest.

There are several ways to scrape (or gather) data from Twitter. In this article, we will look at two of them: using Tweepy and Snscrape.

We will learn a method to scrape public conversations from people on a specific trending topic, as well as tweets from a particular user.

Now, without further ado, let's get started.

Tweepy vs Snscrape: Introduction to Our Scraping Tools

Before we get into the implementation of each tool, let's try to understand the differences and limits of each one.

Tweepy

Tweepy is a Python library for integrating with the Twitter API. Because Tweepy is connected to the Twitter API, you can perform complex queries in addition to scraping tweets. It lets you take advantage of all of the Twitter API's capabilities.

But there are some drawbacks, such as the fact that its standard API only allows you to collect tweets from up to a week back (that is, Tweepy does not allow retrieval of tweets beyond a one-week window, so historical data retrieval is not possible).

There are also limits on how many tweets you can retrieve from a user's account. You can read more about Tweepy's functionality here.

Snscrape

Snscrape is another approach for scraping information from Twitter that does not require the use of an API. Snscrape lets you scrape basic information such as a user's profile, tweet content, source, and so on.

Snscrape is not limited to Twitter; it can also scrape content from other prominent social networks such as Facebook, Instagram, and others.

Its advantages are that there are no limits on the number of tweets you can retrieve or on the window of tweets (that is, the date range of tweets). So Snscrape lets you retrieve old data.

Its one disadvantage is that it lacks the other functionality of Tweepy. Still, if you only want to scrape tweets, Snscrape is enough.

Now that we've clarified the distinction between the two methods, let's go over their implementation one by one.

How to Use Tweepy to Scrape Tweets

Before we begin using Tweepy, we first need to make sure our Twitter credentials are ready. With them, we can connect Tweepy to our API key and start scraping.

If you do not have Twitter credentials, you can register for a Twitter developer account. You will be asked some basic questions about how you intend to use the Twitter API. After that, you can begin the implementation.

The first step is to install the Tweepy library on your local machine, which you can do by typing:

pip install git+https://github.com/tweepy/tweepy.git

How to Scrape Tweets from a User on Twitter

Now that we've installed the Tweepy library, let's scrape 100 tweets from a user called john on Twitter. We'll look at the full code that lets us do this and discuss it in detail so we can understand what's going on:

import tweepy
import pandas as pd  # needed for pd.DataFrame below
import time  # needed for time.sleep below

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


username = "john"
no_of_tweets =100


try:
    #The number of tweets we want to retrieved from the user
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))
    time.sleep(3)

Now let's go over each part of the code in the block above.

import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)

In the code above, we imported the Tweepy library, then created some variables to store our Twitter credentials (the Tweepy authentication handler requires four of our Twitter credentials). We then pass those variables to the Tweepy authentication handler and save the result in another variable.

The last statement is where we instantiate the Tweepy API and pass in the required parameters.

username = "john"
no_of_tweets =100


try:
    #The number of tweets we want to retrieved from the user
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))

In the code above, we set the name of the user (the @name on Twitter) whose tweets we want to retrieve, as well as the number of tweets. We then wrap the call in an exception handler to help us catch errors more effectively.

After that, api.user_timeline() returns a collection of the most recent tweets posted by the user we chose in the screen_name parameter, limited to the number of tweets we asked for.

In the next line of code, we pick some attributes we want to retrieve from each tweet and save them in a list. To see more attributes you can retrieve from a tweet, read this.

In the last chunk of code, we create a dataframe and pass in the list we built along with the column names we created.

Note that the column names must be in the same order as the attributes you passed into the attributes container (that is, the order in which you listed those attributes when retrieving them from the tweet).
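To see why the order matters, here is a small standard-library sketch (no Tweepy or pandas required). The sample row is made-up data for illustration. Each value is paired with a column name purely by position, so a reordered column list silently mislabels the data.

```python
from datetime import datetime

# One made-up row in the same attribute order used above:
# created_at, favorite_count, source, text
row = [datetime(2022, 7, 5, 12, 0), 42, "Twitter Web App", "Hello world"]

columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
wrong_columns = ["Tweet", "Date Created", "Number of Likes", "Source of Tweet"]

# Values are matched to column names purely by position
labelled = dict(zip(columns, row))
mislabelled = dict(zip(wrong_columns, row))

print(labelled["Number of Likes"])     # 42
print(mislabelled["Number of Likes"])  # Twitter Web App
```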

If you followed the steps correctly, you should have something like this:

image-17

Image by author

Now that we're done, let's go over one more example before we move on to the Snscrape implementation.

How to Scrape Tweets from a Text Search

In this method, we will retrieve tweets based on a search. You can do that like this:

import tweepy
import pandas as pd  # needed for pd.DataFrame below

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


search_query = "sex for grades"
no_of_tweets =150


try:
    #The number of tweets we want to retrieved from the search
    tweets = api.search_tweets(q=search_query, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.user.name, tweet.created_at, tweet.favorite_count, tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))

The code above is similar to the previous code, except that we changed the API method from api.user_timeline() to api.search_tweets(). We've also added tweet.user.name to the attributes container list.

In the code above, you can see that we accessed two attributes. This is because if we only pass tweet.user, it returns the whole user object. So we also have to name the attribute we want to retrieve from the user object, which is name.

You can go here to see a list of additional attributes that you can retrieve from a user object. Now you should see something like this once you run it:

image-18

Image by author.

Alright, that just about wraps up the Tweepy implementation. Just remember that there is a limit on the number of tweets you can retrieve, and you can't retrieve tweets more than 7 days old using Tweepy.

How to Use Snscrape to Scrape Tweets

As I mentioned previously, Snscrape does not require Twitter credentials (an API key) to access it. There is also no limit on the number of tweets you can fetch.

For this example, though, we'll just retrieve the same tweets as in the previous example, but using Snscrape instead.

To use Snscrape, we must first install its library on our PC. You can do that by typing:

pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git

How to Scrape Tweets from a User with Snscrape

Snscrape includes two methods for getting tweets from Twitter: the command-line interface (CLI) and a Python wrapper. Just keep in mind that the Python wrapper is currently undocumented, but we can still get by with trial and error.

In this example, we will use the Python wrapper because it is more intuitive than the CLI method. But if you get stuck with some code, you can always turn to the GitHub community for help. The contributors will be happy to help you.

To retrieve tweets from a particular user, we can do the following:

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Created a list to append all tweet attributes(data)
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:  # stop once 100 tweets have been collected
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

Let's go over some of the code that you might not understand at first glance:

for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:  # stop once 100 tweets have been collected
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
  
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

In the code above, sntwitter.TwitterSearchScraper returns an object of tweets from the user name we passed to it (which is john).

As I mentioned earlier, Snscrape has no limit on the number of tweets, so it will return however many tweets that user has. To deal with this, we add the enumerate function, which iterates through the object and adds a counter so we can stop after the user's 100 most recent tweets.
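An alternative way to cap an unbounded iterator is itertools.islice from the standard library. The sketch below uses a stand-in generator so it runs without Snscrape or network access. fake_tweets is hypothetical and only imitates the way get_items() keeps yielding results.

```python
from itertools import count, islice

def fake_tweets():
    """Stand-in for get_items(): yields items indefinitely."""
    for n in count(1):
        yield f"tweet {n}"

# islice stops after exactly 100 items, with no manual counter or break
first_100 = list(islice(fake_tweets(), 100))

print(len(first_100))  # 100
print(first_100[-1])   # tweet 100
```

Whether to use enumerate with a break or islice is mostly a matter of taste; islice simply makes the limit explicit in one place.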

You can see that the syntax for the attributes we get from each tweet looks like the one from Tweepy. This is the list of attributes we can get from a Snscrape tweet, curated by Martin Beck.

Sns.Scrape

Credit: Martin Beck

More attributes may be added, as the Snscrape library is still in development. For instance, in the image above, source has been replaced with sourceLabel. If you pass only source, it returns an object.

If you run the code above, you should also see something like this:

image-19

Image by author

Now let's do the same for scraping by search.

How to Scrape Tweets from a Text Search with Snscrape

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Creating list to append tweet data to
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('sex for grades since:2021-07-05 until:2022-07-06').get_items()):
    if i >= 150:  # stop once 150 tweets have been collected
        break
    attributes_container.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe to load the list
tweets_df = pd.DataFrame(attributes_container, columns=["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"])

Again, you can access a lot of historical data using Snscrape (unlike Tweepy, whose standard API cannot go back more than 7 days; the premium API allows 30 days). So we can pass the date from which we want to start the search and the date at which we want it to end to the sntwitter.TwitterSearchScraper() method.

What we've done in the preceding code is basically what we discussed before. The only thing to bear in mind is that until works similarly to the range function in Python (that is, it excludes the last value). So if you want to get tweets from today, you need to pass the day after today in the until parameter.
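As a minimal sketch of that rule, the helper below builds a search string whose until date is one day past the last day you want included. The function name build_query is hypothetical; only the since: and until: operators come from the example above.

```python
from datetime import date, timedelta

def build_query(text: str, first_day: date, last_day: date) -> str:
    """Build a search string that includes tweets posted on last_day."""
    # until: is exclusive, so move it one day past the last day we want
    until = last_day + timedelta(days=1)
    return f"{text} since:{first_day.isoformat()} until:{until.isoformat()}"

print(build_query("python", date(2022, 7, 1), date(2022, 7, 5)))
# python since:2022-07-01 until:2022-07-06
```

The resulting string could then be passed to sntwitter.TwitterSearchScraper() in place of the hard-coded query.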

image-21

Image by author.

Now you also know how to scrape tweets with Snscrape!

When to Use Each Approach

Now that we've seen how each method works, you might be wondering when to use which.

Well, there is no universal rule for when to use each method. It all comes down to a matter of preference and your use case.

If you want to acquire an endless number of tweets, you should use Snscrape. But if you want to use extra features that Snscrape cannot provide (like geolocation, for example), then you should definitely use Tweepy. It integrates directly with the Twitter API and provides complete functionality.

Even so, Snscrape is the most commonly used method for basic scraping.

Conclusion

In this article, we learned how to scrape data from Twitter with Python using Tweepy and Snscrape. But this was only a brief overview of how each approach works. You can learn more by exploring the web for additional information.

I've included some useful resources that you can use if you need additional information. Thank you for reading.

 Source: https://www.freecodecamp.org/news/python-web-scraping-tutorial/

#python #web 

LaravelS: Glue for using Swoole in Laravel Or Lumen

🚀 LaravelS is an out-of-the-box adapter between Swoole and Laravel/Lumen.

Please Watch this repository to get the latest updates.

 _                               _  _____ 
| |                             | |/ ____|
| |     __ _ _ __ __ ___   _____| | (___  
| |    / _` | '__/ _` \ \ / / _ \ |\___ \ 
| |___| (_| | | | (_| |\ V /  __/ |____) |
|______\__,_|_|  \__,_| \_/ \___|_|_____/ 

中文文档

Features

Built-in Http/WebSocket server

Multi-port mixed protocol

Custom process

Memory resident

Asynchronous event listening

Asynchronous task queue

Millisecond cron job

Common Components

Gracefully reload

Automatically reload after modifying code

Support Laravel/Lumen both, good compatibility

Simple & Out of the box

Benchmark

Which is the fastest web framework?

TechEmpower Framework Benchmarks

Requirements

Dependency     Requirement
PHP            >= 5.5.9 (PHP 7+ recommended)
Swoole         >= 1.7.19 (PHP 5 no longer supported since 2.0.12; 4.5.0+ recommended)
Laravel/Lumen  >= 5.1 (8.0+ recommended)

Install

1.Require package via Composer(packagist).

composer require "hhxsv5/laravel-s:~3.7.0" -vvv
# Make sure that your composer.lock file is under the VCS

2.Register service provider(pick one of two).

Laravel: in config/app.php file, Laravel 5.5+ supports package discovery automatically, you should skip this step

'providers' => [
    //...
    Hhxsv5\LaravelS\Illuminate\LaravelSServiceProvider::class,
],

Lumen: in bootstrap/app.php file

$app->register(Hhxsv5\LaravelS\Illuminate\LaravelSServiceProvider::class);

3.Publish configuration and binaries.

After upgrading LaravelS, you need to republish; click here to see the change notes of each version.

php artisan laravels publish
# Configuration: config/laravels.php
# Binary: bin/laravels bin/fswatch bin/inotify

4.Change config/laravels.php: listen_ip, listen_port, refer Settings.

5.Performance tuning

Adjust kernel parameters

Number of Workers: LaravelS uses Swoole's synchronous IO mode. The larger the worker_num setting, the better the concurrency performance, but the higher the memory usage and process-switching overhead. If one request takes 100ms, at least 100 Worker processes are needed to provide 1000 QPS of concurrency. The calculation is: worker_num = 1000QPS/(1s/100ms) = 100. Incremental pressure testing is still needed to find the best worker_num.

Number of Task Workers
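The worker_num estimate above can be written as a small calculation. This is only a sketch of the arithmetic, assuming each worker handles one request at a time; min_workers is a hypothetical helper, not part of LaravelS or Swoole.

```python
import math

def min_workers(target_qps: float, request_time_ms: float) -> int:
    """Lower bound on worker_num when each worker handles one request at a time."""
    # A worker that needs request_time_ms per request completes this many per second
    requests_per_worker_per_second = 1000 / request_time_ms
    return math.ceil(target_qps / requests_per_worker_per_second)

print(min_workers(1000, 100))  # 100
print(min_workers(1000, 50))   # 50
```

The result is a starting point for pressure testing rather than a final setting, since memory and process-switching overhead grow with the number of workers.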

Run

Please read the Important notices (IMPORTANT) carefully before running.

  • Commands: php bin/laravels {start|stop|restart|reload|info|help}.

Command  Description
start    Start LaravelS; list the processes with "ps -ef|grep laravels"
stop     Stop LaravelS and trigger the onStop method of Custom process
restart  Restart LaravelS: stop gracefully before starting. The service is unavailable until startup is complete
reload   Reload all Task/Worker/Timer processes that contain your business code, and trigger the onReload method of Custom process. CANNOT reload Master/Manager processes. After modifying config/laravels.php, you must call restart instead
info     Display component version information
help     Display help information

  • Boot options for the commands start and restart.

Option          Description
-d|--daemonize  Run as a daemon; this option overrides the swoole.daemonize setting in laravels.php
-e|--env        The environment the command should run under; for example, --env=testing uses the configuration file .env.testing first. This feature requires Laravel 5.2+
-i|--ignore     Ignore checking the PID file of the Master process
-x|--x-version  The version (branch) of the current project, stored in $_ENV/$_SERVER; access it via $_ENV['X_VERSION'], $_SERVER['X_VERSION'], or $request->server->get('X_VERSION')

  • Runtime files: start will automatically execute php artisan laravels config and generate these files. Developers generally don't need to pay attention to them, and it's recommended to add them to .gitignore.

File                                   Description
storage/laravels.conf                  LaravelS's runtime configuration file
storage/laravels.pid                   PID file of the Master process
storage/laravels-timer-process.pid     PID file of the Timer process
storage/laravels-custom-processes.pid  PID file of all custom processes

Deploy

It is recommended to supervise the main process with Supervisord. This requires starting without the -d option and setting swoole.daemonize to false.

[program:laravel-s-test]
directory=/var/www/laravel-s-test
command=/usr/local/bin/php bin/laravels start -i
numprocs=1
autostart=true
autorestart=true
startretries=3
user=www-data
redirect_stderr=true
stdout_logfile=/var/log/supervisor/%(program_name)s.log

Cooperate with Nginx (Recommended)

Demo.

gzip on;
gzip_min_length 1024;
gzip_comp_level 2;
gzip_types text/plain text/css text/javascript application/json application/javascript application/x-javascript application/xml application/x-httpd-php image/jpeg image/gif image/png font/ttf font/otf image/svg+xml;
gzip_vary on;
gzip_disable "msie6";
upstream swoole {
    # Connect IP:Port
    server 127.0.0.1:5200 weight=5 max_fails=3 fail_timeout=30s;
    # Connect UnixSocket Stream file, tips: put the socket file in the /dev/shm directory to get better performance
    #server unix:/yourpath/laravel-s-test/storage/laravels.sock weight=5 max_fails=3 fail_timeout=30s;
    #server 192.168.1.1:5200 weight=3 max_fails=3 fail_timeout=30s;
    #server 192.168.1.2:5200 backup;
    keepalive 16;
}
server {
    listen 80;
    # Don't forget to bind the host
    server_name laravels.com;
    root /yourpath/laravel-s-test/public;
    access_log /yourpath/log/nginx/$server_name.access.log  main;
    autoindex off;
    index index.html index.htm;
    # Nginx handles the static resources(recommend enabling gzip), LaravelS handles the dynamic resource.
    location / {
        try_files $uri @laravels;
    }
    # Respond with 404 directly when a PHP file is requested, to avoid exposing public/*.php
    #location ~* \.php$ {
    #    return 404;
    #}
    location @laravels {
        # proxy_connect_timeout 60s;
        # proxy_send_timeout 60s;
        # proxy_read_timeout 120s;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header Scheme $scheme;
        proxy_set_header Server-Protocol $server_protocol;
        proxy_set_header Server-Name $server_name;
        proxy_set_header Server-Addr $server_addr;
        proxy_set_header Server-Port $server_port;
        # "swoole" is the upstream
        proxy_pass http://swoole;
    }
}

Cooperate with Apache

LoadModule proxy_module /yourpath/modules/mod_proxy.so
LoadModule proxy_balancer_module /yourpath/modules/mod_proxy_balancer.so
LoadModule lbmethod_byrequests_module /yourpath/modules/mod_lbmethod_byrequests.so
LoadModule proxy_http_module /yourpath/modules/mod_proxy_http.so
LoadModule slotmem_shm_module /yourpath/modules/mod_slotmem_shm.so
LoadModule rewrite_module /yourpath/modules/mod_rewrite.so
LoadModule remoteip_module /yourpath/modules/mod_remoteip.so
LoadModule deflate_module /yourpath/modules/mod_deflate.so

<IfModule deflate_module>
    SetOutputFilter DEFLATE
    DeflateCompressionLevel 2
    AddOutputFilterByType DEFLATE text/html text/plain text/css text/javascript application/json application/javascript application/x-javascript application/xml application/x-httpd-php image/jpeg image/gif image/png font/ttf font/otf image/svg+xml
</IfModule>

<VirtualHost *:80>
    # Don't forget to bind the host
    ServerName www.laravels.com
    ServerAdmin hhxsv5@sina.com

    DocumentRoot /yourpath/laravel-s-test/public
    DirectoryIndex index.html index.htm
    <Directory "/">
        AllowOverride None
        Require all granted
    </Directory>

    RemoteIPHeader X-Forwarded-For

    ProxyRequests Off
    ProxyPreserveHost On
    <Proxy balancer://laravels>  
        BalancerMember http://192.168.1.1:5200 loadfactor=7
        #BalancerMember http://192.168.1.2:5200 loadfactor=3
        #BalancerMember http://192.168.1.3:5200 loadfactor=1 status=+H
        ProxySet lbmethod=byrequests
    </Proxy>
    #ProxyPass / balancer://laravels/
    #ProxyPassReverse / balancer://laravels/

    # Apache handles the static resources, LaravelS handles the dynamic resource.
    RewriteEngine On
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
    RewriteRule ^/(.*)$ balancer://laravels%{REQUEST_URI} [P,L]

    ErrorLog ${APACHE_LOG_DIR}/www.laravels.com.error.log
    CustomLog ${APACHE_LOG_DIR}/www.laravels.com.access.log combined
</VirtualHost>

Enable WebSocket server

The listening address of the WebSocket server is the same as that of the HTTP server.

1.Create a WebSocket handler class that implements the WebSocketHandlerInterface interface. The instance is created automatically at startup; you do not need to instantiate it manually.

namespace App\Services;
use Hhxsv5\LaravelS\Swoole\WebSocketHandlerInterface;
use Swoole\Http\Request;
use Swoole\Http\Response;
use Swoole\WebSocket\Frame;
use Swoole\WebSocket\Server;
/**
 * @see https://www.swoole.co.uk/docs/modules/swoole-websocket-server
 */
class WebSocketService implements WebSocketHandlerInterface
{
    // Declare constructor without parameters
    public function __construct()
    {
    }
    // public function onHandShake(Request $request, Response $response)
    // {
    //     // Custom handshake: https://www.swoole.co.uk/docs/modules/swoole-websocket-server-on-handshake
    //     // The onOpen event will be triggered automatically after a successful handshake
    // }
    public function onOpen(Server $server, Request $request)
    {
        // Before the onOpen event is triggered, the HTTP request to establish the WebSocket has passed the Laravel route,
        // so Laravel's Request, Auth information are readable, Session is readable and writable, but only in the onOpen event.
        // \Log::info('New WebSocket connection', [$request->fd, request()->all(), session()->getId(), session('xxx'), session(['yyy' => time()])]);
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
        $server->push($request->fd, 'Welcome to LaravelS');
    }
    public function onMessage(Server $server, Frame $frame)
    {
        // \Log::info('Received message', [$frame->fd, $frame->data, $frame->opcode, $frame->finish]);
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
        $server->push($frame->fd, date('Y-m-d H:i:s'));
    }
    public function onClose(Server $server, $fd, $reactorId)
    {
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
    }
}

2.Modify config/laravels.php.

// ...
'websocket'      => [
    'enable'  => true, // Note: set enable to true
    'handler' => \App\Services\WebSocketService::class,
],
'swoole'         => [
    //...
    // Must set dispatch_mode in (2, 4, 5), see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
    'dispatch_mode' => 2,
    //...
],
// ...

3.Optional: use SwooleTable to bind FD & UserId; see the Swoole Table Demo. You can also use other global storage services such as Redis/Memcached/MySQL, but be careful: FDs may conflict between multiple Swoole servers.
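
One way to avoid FD collisions when several Swoole servers share the same external store is to prefix each FD with an identifier for the server that owns it. The sketch below is only an illustration: connectionKey(), parseConnectionKey(), and the server IDs node-a and node-b are hypothetical names, not part of LaravelS or Swoole.

```php
<?php
// Hypothetical helper: build a store key that is unique across servers.
// $serverId is any stable identifier you assign to each Swoole instance.
function connectionKey(string $serverId, int $fd): string
{
    return $serverId . ':' . $fd;
}

// Hypothetical helper: split a key back into [serverId, fd].
// The FD is everything after the last colon.
function parseConnectionKey(string $key): array
{
    $pos = strrpos($key, ':');
    if ($pos === false) {
        throw new InvalidArgumentException("Malformed connection key: {$key}");
    }
    return [substr($key, 0, $pos), (int) substr($key, $pos + 1)];
}

// The same FD on two servers no longer maps to the same key.
$keyA = connectionKey('node-a', 1);
$keyB = connectionKey('node-b', 1);
var_dump($keyA !== $keyB); // bool(true)
```

The composite key can then be used wherever a bare FD would otherwise be stored, for example as the value in a userId-to-connection map in Redis.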

4.Cooperate with Nginx (Recommended)

Refer to WebSocket Proxy

map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
upstream swoole {
    # Connect IP:Port
    server 127.0.0.1:5200 weight=5 max_fails=3 fail_timeout=30s;
    # Connect UnixSocket Stream file, tips: put the socket file in the /dev/shm directory to get better performance
    #server unix:/yourpath/laravel-s-test/storage/laravels.sock weight=5 max_fails=3 fail_timeout=30s;
    #server 192.168.1.1:5200 weight=3 max_fails=3 fail_timeout=30s;
    #server 192.168.1.2:5200 backup;
    keepalive 16;
}
server {
    listen 80;
    # Don't forget to bind the host
    server_name laravels.com;
    root /yourpath/laravel-s-test/public;
    access_log /yourpath/log/nginx/$server_name.access.log  main;
    autoindex off;
    index index.html index.htm;
    # Nginx handles the static resources(recommend enabling gzip), LaravelS handles the dynamic resource.
    location / {
        try_files $uri @laravels;
    }
    # Respond with 404 directly when a PHP file is requested, to avoid exposing public/*.php
    #location ~* \.php$ {
    #    return 404;
    #}
    # Http and WebSocket are concomitant, Nginx identifies them by "location"
    # !!! The location of WebSocket is "/ws"
    # Javascript: var ws = new WebSocket("ws://laravels.com/ws");
    location =/ws {
        # proxy_connect_timeout 60s;
        # proxy_send_timeout 60s;
        # proxy_read_timeout: Nginx will close the connection if the proxied server does not send data to Nginx in 60 seconds; At the same time, this close behavior is also affected by heartbeat setting of Swoole.
        # proxy_read_timeout 60s;
        proxy_http_version 1.1;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header Scheme $scheme;
        proxy_set_header Server-Protocol $server_protocol;
        proxy_set_header Server-Name $server_name;
        proxy_set_header Server-Addr $server_addr;
        proxy_set_header Server-Port $server_port;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_pass http://swoole;
    }
    location @laravels {
        # proxy_connect_timeout 60s;
        # proxy_send_timeout 60s;
        # proxy_read_timeout 60s;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header Scheme $scheme;
        proxy_set_header Server-Protocol $server_protocol;
        proxy_set_header Server-Name $server_name;
        proxy_set_header Server-Addr $server_addr;
        proxy_set_header Server-Port $server_port;
        proxy_pass http://swoole;
    }
}

5.Heartbeat setting

Heartbeat setting of Swoole

// config/laravels.php
'swoole' => [
    //...
    // All connections are traversed every 60 seconds. If a connection does not send any data to the server within 600 seconds, the connection will be forced to close.
    'heartbeat_idle_time'      => 600,
    'heartbeat_check_interval' => 60,
    //...
],

Proxy read timeout of Nginx

# Nginx will close the connection if the proxied server does not send data to Nginx in 60 seconds
proxy_read_timeout 60s;
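
The two timeouts watch different directions: Swoole's heartbeat closes a connection when the client stays silent, while Nginx's proxy_read_timeout closes it when the proxied server stays silent. The sketch below works through which side would close a completely silent connection first. The function name firstToCloseSilentConnection() is hypothetical, and the model assumes Swoole closes an idle connection at the first heartbeat scan after heartbeat_idle_time has elapsed.

```php
<?php
// Hypothetical helper: decide which side closes a fully silent connection first.
// All arguments are in seconds.
function firstToCloseSilentConnection(int $idleTime, int $checkInterval, int $proxyReadTimeout): string
{
    // Swoole only notices idleness during a scan, so the actual close happens
    // somewhere between $idleTime and $idleTime + $checkInterval.
    $swooleEarliest = $idleTime;
    $swooleLatest   = $idleTime + $checkInterval;

    if ($proxyReadTimeout < $swooleEarliest) {
        return 'nginx';
    }
    if ($proxyReadTimeout > $swooleLatest) {
        return 'swoole';
    }
    return 'either'; // depends on where the scan falls
}

// Values from this article: heartbeat 600/60, proxy_read_timeout 60s
echo firstToCloseSilentConnection(600, 60, 60), PHP_EOL; // nginx
```

With the values used here, Nginx would close a silent connection well before Swoole's heartbeat does, so either the server needs to send something at least every 60 seconds or proxy_read_timeout needs to be raised.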

6.Push data in controller

namespace App\Http\Controllers;
class TestController extends Controller
{
    public function push()
    {
        $fd = 1; // Find fd by userId from a map [userId=>fd].
        /**@var \Swoole\WebSocket\Server $swoole */
        $swoole = app('swoole');
        $success = $swoole->push($fd, 'Push data to fd#1 in Controller');
        var_dump($success);
    }
}

Listen events

System events

Usually, you can reset/destroy some global/static variables, or change the current Request/Response object.

laravels.received_request: fired after LaravelS has converted Swoole\Http\Request to Illuminate\Http\Request, and before Laravel's Kernel handles the request.

// Edit file `app/Providers/EventServiceProvider.php`, add the following code into method `boot`
// If there is no $events variable, you can also call the Facade \Event::listen().
$events->listen('laravels.received_request', function (\Illuminate\Http\Request $req, $app) {
    $req->query->set('get_key', 'hhxsv5');// Change query of request
    $req->request->set('post_key', 'hhxsv5'); // Change post of request
});

laravels.generated_response: fired after Laravel's Kernel has handled the request, and before LaravelS converts Illuminate\Http\Response to Swoole\Http\Response.

// Edit file `app/Providers/EventServiceProvider.php`, add the following code into method `boot`
// If there is no $events variable, you can also call the Facade \Event::listen().
$events->listen('laravels.generated_response', function (\Illuminate\Http\Request $req, \Symfony\Component\HttpFoundation\Response $rsp, $app) {
    $rsp->headers->set('header-key', 'hhxsv5');// Change header of response
});

Customized asynchronous events

This feature depends on Swoole's AsyncTask, so you need to set swoole.task_worker_num in config/laravels.php first. The performance of asynchronous event processing depends on the number of Swoole task processes, so set task_worker_num appropriately.
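
A minimal fragment of config/laravels.php showing where task_worker_num goes. The value 4 is an arbitrary assumption for illustration; tune it to your workload.

```php
// config/laravels.php (fragment)
'swoole' => [
    //...
    'task_worker_num' => 4, // assumed value: number of Swoole task processes
    //...
],
```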

1.Create event class.

use Hhxsv5\LaravelS\Swoole\Task\Event;
class TestEvent extends Event
{
    protected $listeners = [
        // Listener list
        TestListener1::class,
        // TestListener2::class,
    ];
    private $data;
    public function __construct($data)
    {
        $this->data = $data;
    }
    public function getData()
    {
        return $this->data;
    }
}

2.Create listener class.

use Hhxsv5\LaravelS\Swoole\Task\Task;
use Hhxsv5\LaravelS\Swoole\Task\Listener;
class TestListener1 extends Listener
{
    /**
     * @var TestEvent
     */
    protected $event;
    
    public function handle()
    {
        \Log::info(__CLASS__ . ':handle start', [$this->event->getData()]);
        sleep(2); // Simulate slow code
        // Deliver task in CronJob, but NOT support callback finish() of task.
        // Note: Modify task_ipc_mode to 1 or 2 in config/laravels.php, see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
        $ret = Task::deliver(new TestTask('task data'));
        var_dump($ret);
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
    }
}

3.Fire event.

// Create instance of event and fire it, "fire" is asynchronous.
use Hhxsv5\LaravelS\Swoole\Task\Event;
$event = new TestEvent('event data');
// $event->delay(10); // Delay 10 seconds to fire event
// $event->setTries(3); // When an error occurs, try 3 times in total
$success = Event::fire($event);
var_dump($success); // Returns true on success, otherwise false

Asynchronous task queue

This feature depends on Swoole's AsyncTask, so you need to set swoole.task_worker_num in config/laravels.php first. The performance of task processing depends on the number of Swoole task processes, so set task_worker_num appropriately.

1.Create task class.

use Hhxsv5\LaravelS\Swoole\Task\Task;
class TestTask extends Task
{
    private $data;
    private $result;
    public function __construct($data)
    {
        $this->data = $data;
    }
    // The logic of task handling, run in task process, CAN NOT deliver task
    public function handle()
    {
        \Log::info(__CLASS__ . ':handle start', [$this->data]);
        sleep(2); // Simulate slow code
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
        $this->result = 'the result of ' . $this->data;
    }
    // Optional, finish event, the logic of after task handling, run in worker process, CAN deliver task 
    public function finish()
    {
        \Log::info(__CLASS__ . ':finish start', [$this->result]);
        Task::deliver(new TestTask2('task2 data')); // Deliver the other task
    }
}

2.Deliver task.

// Create instance of TestTask and deliver it, "deliver" is asynchronous.
use Hhxsv5\LaravelS\Swoole\Task\Task;
$task = new TestTask('task data');
// $task->delay(3);// delay 3 seconds to deliver task
// $task->setTries(3); // When an error occurs, try 3 times in total
$ret = Task::deliver($task);
var_dump($ret); // Returns true on success, otherwise false

Millisecond cron job

A cron job wrapper based on Swoole's Millisecond Timer, which can replace Linux Crontab.

1.Create cron job class.

namespace App\Jobs\Timer;
use App\Tasks\TestTask;
use Swoole\Coroutine;
use Hhxsv5\LaravelS\Swoole\Task\Task;
use Hhxsv5\LaravelS\Swoole\Timer\CronJob;
class TestCronJob extends CronJob
{
    protected $i = 0;
    // !!! The `interval` and `isImmediate` of a cron job can be configured in two ways (pick one): override the corresponding methods, or pass parameters when registering the cron job.
    // --- Override the corresponding method to return the configuration: begin
    public function interval()
    {
        return 1000;// Run every 1000ms
    }
    public function isImmediate()
    {
        return false;// Whether to trigger `run` immediately after setting up
    }
    // --- Override the corresponding method to return the configuration: end
    public function run()
    {
        \Log::info(__METHOD__, ['start', $this->i, microtime(true)]);
        // do something
        // sleep(1); // Swoole < 2.1
        Coroutine::sleep(1); // Swoole>=2.1 Coroutine will be automatically created for run().
        $this->i++;
        \Log::info(__METHOD__, ['end', $this->i, microtime(true)]);

        if ($this->i >= 10) { // Run 10 times only
            \Log::info(__METHOD__, ['stop', $this->i, microtime(true)]);
            $this->stop(); // Stop this cron job, but it will run again after restart/reload.
            // Deliver task in CronJob, but NOT support callback finish() of task.
            // Note: Modify task_ipc_mode to 1 or 2 in config/laravels.php, see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
            $ret = Task::deliver(new TestTask('task data'));
            var_dump($ret);
        }
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
    }
}

2.Register cron job.

// Register cron jobs in file "config/laravels.php"
[
    // ...
    'timer'          => [
        'enable' => true, // Enable Timer
        'jobs'   => [ // The list of cron job
            // Enable LaravelScheduleJob to run `php artisan schedule:run` every 1 minute, replace Linux Crontab
            // \Hhxsv5\LaravelS\Illuminate\LaravelScheduleJob::class,
            // Two ways to configure parameters:
            // [\App\Jobs\Timer\TestCronJob::class, [1000, true]], // Pass in parameters when registering
            \App\Jobs\Timer\TestCronJob::class, // Override the corresponding method to return the configuration
        ],
        'max_wait_time' => 5, // Max waiting time of reloading
        // Enable the global lock to ensure that only one instance starts the timer when deploying multiple instances. This feature depends on Redis, please see https://laravel.com/docs/7.x/redis
        'global_lock'     => false,
        'global_lock_key' => config('app.name', 'Laravel'),
    ],
    // ...
];

3.Note: when you build a server cluster, every instance launches its own timers, so make sure only one instance runs the timer to avoid running the same task repeatedly.
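
The timer configuration shown in step 2 already contains a global_lock switch for this case. A minimal sketch of enabling it, assuming Redis is configured for the Laravel application as the config comment requires:

```php
// config/laravels.php (fragment)
'timer' => [
    'enable'          => true,
    'jobs'            => [
        \App\Jobs\Timer\TestCronJob::class,
    ],
    'max_wait_time'   => 5,
    // Only one instance in the cluster starts the timer; depends on Redis
    'global_lock'     => true,
    'global_lock_key' => config('app.name', 'Laravel'),
],
```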

4.Since v3.4.0, LaravelS supports hot restarting [Reload] the Timer process. After LaravelS receives the SIGUSR1 signal, it waits max_wait_time (default 5) seconds for the process to end, then the Manager process pulls up the Timer process again.

5.If you only need to use minute-level scheduled tasks, it is recommended to enable Hhxsv5\LaravelS\Illuminate\LaravelScheduleJob instead of Linux Crontab, so that you can follow the coding habits of Laravel task scheduling and configure Kernel.

// app/Console/Kernel.php
protected function schedule(Schedule $schedule)
{
    // runInBackground() will start a new child process to execute the task. This is asynchronous and will not affect the execution timing of other tasks.
    $schedule->command(TestCommand::class)->runInBackground()->everyMinute();
}

Automatically reload after modifying code

Via inotify, support Linux only.

1.Install inotify extension.

2.Turn on the switch in Settings.

3.Notice: file change events are only received when the files are modified inside Linux. It's recommended to use the latest Docker. Vagrant Solution.

Via fswatch, support OS X/Linux/Windows.

1.Install fswatch.

2.Run command in your project root directory.

# Watch current directory
./bin/fswatch
# Watch app directory
./bin/fswatch ./app

Via inotifywait, support Linux.

1.Install inotify-tools.

2.Run command in your project root directory.

# Watch current directory
./bin/inotify
# Watch app directory
./bin/inotify ./app

When none of the above methods work, the last resort is to set max_request=1 and worker_num=1, so that the Worker process restarts after processing each request. Performance with this setting is very poor, so use it only in a development environment.
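
A minimal fragment showing this fallback, assuming both keys sit under the swoole array of config/laravels.php like the other Swoole options in this article:

```php
// config/laravels.php (fragment, development only)
'swoole' => [
    //...
    'max_request' => 1, // restart the Worker process after every request
    'worker_num'  => 1,
    //...
],
```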

Get the instance of SwooleServer in your project

/**
 * $swoole is the instance of `Swoole\WebSocket\Server` if enable WebSocket server, otherwise `Swoole\Http\Server`
 * @var \Swoole\WebSocket\Server|\Swoole\Http\Server $swoole
 */
$swoole = app('swoole');
var_dump($swoole->stats());
$swoole->push($fd, 'Push WebSocket message');

Use SwooleTable

1.Define Table, support multiple.

All defined tables are created before Swoole starts.

// in file "config/laravels.php"
[
    // ...
    'swoole_tables'  => [
        // Scene:bind UserId & FD in WebSocket
        'ws' => [// The key is the table name; the suffix "Table" is added to avoid naming conflicts. This defines a table named "wsTable"
            'size'   => 102400,// The max size
            'column' => [// Define the columns
                ['name' => 'value', 'type' => \Swoole\Table::TYPE_INT, 'size' => 8],
            ],
        ],
        //...Define the other tables
    ],
    // ...
];

2.Access Table: all table instances will be bound on SwooleServer, access by app('swoole')->xxxTable.

namespace App\Services;
use Hhxsv5\LaravelS\Swoole\WebSocketHandlerInterface;
use Swoole\Http\Request;
use Swoole\WebSocket\Frame;
use Swoole\WebSocket\Server;
class WebSocketService implements WebSocketHandlerInterface
{
    /**@var \Swoole\Table $wsTable */
    private $wsTable;
    public function __construct()
    {
        $this->wsTable = app('swoole')->wsTable;
    }
    // Scene:bind UserId & FD in WebSocket
    public function onOpen(Server $server, Request $request)
    {
        // var_dump(app('swoole') === $server);// The same instance
        /**
         * Get the currently logged in user
         * This feature requires that the path to establish a WebSocket connection go through middleware such as Authenticate.
         * E.g:
         * Browser side: var ws = new WebSocket("ws://127.0.0.1:5200/ws");
         * Then the /ws route in Laravel needs to add the middleware like Authenticate.
         * Route::get('/ws', function () {
         *     // Respond any content with status code 200
         *     return 'websocket';
         * })->middleware(['auth']);
         */
        // $user = Auth::user();
        // $userId = $user ? $user->id : 0; // 0 means a guest user who is not logged in
        $userId = mt_rand(1000, 10000);
        // if (!$userId) {
        //     // Disconnect the connections of unlogged users
        //     $server->disconnect($request->fd);
        //     return;
        // }
        $this->wsTable->set('uid:' . $userId, ['value' => $request->fd]);// Bind map uid to fd
        $this->wsTable->set('fd:' . $request->fd, ['value' => $userId]);// Bind map fd to uid
        $server->push($request->fd, "Welcome to LaravelS #{$request->fd}");
    }
    public function onMessage(Server $server, Frame $frame)
    {
        // Broadcast
        foreach ($this->wsTable as $key => $row) {
            if (strpos($key, 'uid:') === 0 && $server->isEstablished($row['value'])) {
                $content = sprintf('Broadcast: new message "%s" from #%d', $frame->data, $frame->fd);
                $server->push($row['value'], $content);
            }
        }
    }
    public function onClose(Server $server, $fd, $reactorId)
    {
        $uid = $this->wsTable->get('fd:' . $fd);
        if ($uid !== false) {
            $this->wsTable->del('uid:' . $uid['value']); // Unbind uid map
        }
        $this->wsTable->del('fd:' . $fd);// Unbind fd map
        $server->push($fd, "Goodbye #{$fd}");
    }
}

Multi-port mixed protocol

For more information, please refer to Swoole Server AddListener

To let the main server support more protocols than just HTTP and WebSocket, LaravelS brings in Swoole's multi-port mixed protocol feature and names it Socket. Now you can easily build TCP/UDP applications on top of Laravel.

Create Socket handler class, and extend Hhxsv5\LaravelS\Swoole\Socket\{TcpSocket|UdpSocket|Http|WebSocket}.

namespace App\Sockets;
use Hhxsv5\LaravelS\Swoole\Socket\TcpSocket;
use Swoole\Server;
class TestTcpSocket extends TcpSocket
{
    public function onConnect(Server $server, $fd, $reactorId)
    {
        \Log::info('New TCP connection', [$fd]);
        $server->send($fd, 'Welcome to LaravelS.');
    }
    public function onReceive(Server $server, $fd, $reactorId, $data)
    {
        \Log::info('Received data', [$fd, $data]);
        $server->send($fd, 'LaravelS: ' . $data);
        if ($data === "quit\r\n") {
            $server->send($fd, 'LaravelS: bye' . PHP_EOL);
            $server->close($fd);
        }
    }
    public function onClose(Server $server, $fd, $reactorId)
    {
        \Log::info('Close TCP connection', [$fd]);
        $server->send($fd, 'Goodbye');
    }
}

These Socket connections share the same worker processes with your HTTP/WebSocket connections. So it won't be a problem at all if you want to deliver tasks, use SwooleTable, even Laravel components such as DB, Eloquent and so on. At the same time, you can access Swoole\Server\Port object directly by member property swoolePort.

public function onReceive(Server $server, $fd, $reactorId, $data)
{
    $port = $this->swoolePort; // Get the `Swoole\Server\Port` object
}

// app/Http/Controllers/TestController.php
namespace App\Http\Controllers;
class TestController extends Controller
{
    public function test()
    {
        /**@var \Swoole\Http\Server|\Swoole\WebSocket\Server $swoole */
        $swoole = app('swoole');
        // $swoole->ports: Traverse all Port objects, https://www.swoole.co.uk/docs/modules/swoole-server/multiple-ports
        $port = $swoole->ports[0]; // Get the `Swoole\Server\Port` object; $swoole->ports[0] is the port of the main server
        foreach ($port->connections as $fd) { // Traverse all connections
            // $swoole->send($fd, 'Send tcp message');
            // if($swoole->isEstablished($fd)) {
            //     $swoole->push($fd, 'Send websocket message');
            // }
        }
    }
}

Register Sockets.

// Edit `config/laravels.php`
//...
'sockets' => [
    [
        'host'     => '127.0.0.1',
        'port'     => 5291,
        'type'     => SWOOLE_SOCK_TCP,// Socket type: SWOOLE_SOCK_TCP/SWOOLE_SOCK_TCP6/SWOOLE_SOCK_UDP/SWOOLE_SOCK_UDP6/SWOOLE_UNIX_DGRAM/SWOOLE_UNIX_STREAM
        'settings' => [// Swoole settings:https://www.swoole.co.uk/docs/modules/swoole-server-methods#swoole_server-addlistener
            'open_eof_check' => true,
            'package_eof'    => "\r\n",
        ],
        'handler'  => \App\Sockets\TestTcpSocket::class,
        'enable'   => true, // whether to enable, default true
    ],
],

The heartbeat configuration can only be set on the main server and cannot be configured on a Socket; the Socket inherits the heartbeat configuration of the main server.

For TCP sockets, the onConnect and onClose events are blocked when Swoole's dispatch_mode is 1 or 3. To unblock these two events, set dispatch_mode to 2, 4, or 5.

'swoole' => [
    //...
    'dispatch_mode' => 2,
    //...
],

Test.

TCP: telnet 127.0.0.1 5291

UDP: [Linux] echo "Hello LaravelS" > /dev/udp/127.0.0.1/5292

Register example of other protocols.

  • UDP
  • Http
  • WebSocket: The main server must turn on WebSocket, that is, set websocket.enable to true.

Coroutine

Swoole Coroutine

Warning: Code in coroutines does not execute in a fixed order, so request-level data should be isolated by coroutine ID. However, Laravel/Lumen contains many singletons and static attributes, so data from different requests can affect each other, which is unsafe. For example, the database connection is a singleton, so the same database connection shares the same PDO resource. This is fine in synchronous blocking mode, but it does not work in asynchronous coroutine mode: each query needs its own connection and the IO state of each connection must be maintained, which requires a connection pool.

DO NOT enable the coroutine, only the custom process can use the coroutine.
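
The hazard described in the warning can be reproduced without Swoole. The sketch below uses plain PHP generators as a stand-in for coroutines (an assumption made purely for illustration; real Swoole coroutines are scheduled by the runtime), and a hypothetical CurrentUser class standing in for any singleton or static attribute.

```php
<?php
// Hypothetical request-level state kept in a static attribute.
final class CurrentUser
{
    public static $name = null;
}

// Each "request" stores its user, yields at a simulated I/O wait,
// then reads the static back after being resumed.
function handleRequest(string $user): Generator
{
    CurrentUser::$name = $user;
    yield; // another request may run while this one waits
    return $user . ' sees ' . CurrentUser::$name;
}

$alice = handleRequest('alice');
$bob   = handleRequest('bob');

$alice->current(); // alice runs up to her I/O wait
$bob->current();   // bob runs up to his I/O wait and overwrites the static
$alice->next();    // alice resumes and reads bob's data
$bob->next();      // bob resumes and reads his own data

echo $alice->getReturn(), PHP_EOL; // alice sees bob
echo $bob->getReturn(), PHP_EOL;   // bob sees bob
```

Because the static attribute is shared, the first request ends up reading the second request's data once the two are interleaved, which is the same kind of cross-request leak the warning describes.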

Custom process

Custom processes let developers create special worker processes for monitoring, reporting, or other special tasks. Refer to addProcess.

Create a Process class that implements CustomProcessInterface.

namespace App\Processes;
use App\Tasks\TestTask;
use Hhxsv5\LaravelS\Swoole\Process\CustomProcessInterface;
use Hhxsv5\LaravelS\Swoole\Task\Task;
use Swoole\Coroutine;
use Swoole\Http\Server;
use Swoole\Process;
class TestProcess implements CustomProcessInterface
{
    /**
     * @var bool Quit tag for Reload updates
     */
    private static $quit = false;

    public static function callback(Server $swoole, Process $process)
    {
        // The callback method must not exit. If it exits, the Manager process will automatically re-create the process
        while (!self::$quit) {
            \Log::info('Test process: running');
            // sleep(1); // Swoole < 2.1
            Coroutine::sleep(1); // Swoole>=2.1: Coroutine & Runtime will be automatically enabled for callback().
            // Tasks can be delivered from a custom process, but the task's finish() callback is NOT supported.
            // Note: Modify task_ipc_mode to 1 or 2 in config/laravels.php, see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
            $ret = Task::deliver(new TestTask('task data'));
            var_dump($ret);
            // The upper layer will catch the exception thrown in the callback and record it in the Swoole log, and then this process will exit. The Manager process will re-create the process after 3 seconds, so developers need to try/catch to catch the exception by themselves to avoid frequent process creation.
            // throw new \Exception('an exception');
        }
    }
    // Requirements: LaravelS >= v3.4.0 & callback() must be async non-blocking program.
    public static function onReload(Server $swoole, Process $process)
    {
        // Stop the process...
        // Then end process
        \Log::info('Test process: reloading');
        self::$quit = true;
        // $process->exit(0); // Force exit process
    }
    // Requirements: LaravelS >= v3.7.4 & callback() must be async non-blocking program.
    public static function onStop(Server $swoole, Process $process)
    {
        // Stop the process...
        // Then end process
        \Log::info('Test process: stopping');
        self::$quit = true;
        // $process->exit(0); // Force exit process
    }
}

Register TestProcess.

// Edit `config/laravels.php`
// ...
'processes' => [
    'test' => [ // Key name is process name
        'class'    => \App\Processes\TestProcess::class,
        'redirect' => false, // Whether redirect stdin/stdout, true or false
        'pipe'     => 0,     // The type of pipeline, 0: no pipeline 1: SOCK_STREAM 2: SOCK_DGRAM
        'enable'   => true,  // Whether to enable, default true
        //'num'    => 3   // To create multiple processes of this class, default is 1
        //'queue'    => [ // Enable message queue as inter-process communication, configure empty array means use default parameters
        //    'msg_key'  => 0,    // The key of the message queue. Default: ftok(__FILE__, 1).
        //    'mode'     => 2,    // Communication mode, default is 2, which means contention mode
        //    'capacity' => 8192, // The length of a single message, is limited by the operating system kernel parameters. The default is 8192, and the maximum is 65536
        //],
        //'restart_interval' => 5, // After the process exits abnormally, how many seconds to wait before restarting the process, default 5 seconds
    ],
],

Note: callback() must not exit. If it exits, the Manager process will re-create the process.

Example: Write data to a custom process.

// config/laravels.php
'processes' => [
    'test' => [
        'class'    => \App\Processes\TestProcess::class,
        'redirect' => false,
        'pipe'     => 1,
    ],
],
// app/Processes/TestProcess.php
public static function callback(Server $swoole, Process $process)
{
    while ($data = $process->read()) {
        \Log::info('TestProcess: read data', [$data]);
        $process->write('TestProcess: ' . $data);
    }
}
// app/Http/Controllers/TestController.php
public function testProcessWrite()
{
    /**@var \Swoole\Process $process */
    $process = app('swoole')->customProcesses['test'];
    $process->write('TestController: write data' . time());
    var_dump($process->read());
}

Common components

Apollo

LaravelS will pull the Apollo configuration and write it to the .env file when starting. At the same time, LaravelS will start the custom process apollo to monitor the configuration and automatically reload when the configuration changes.

Enable Apollo: add --enable-apollo and Apollo parameters to the startup parameters.

php bin/laravels start --enable-apollo --apollo-server=http://127.0.0.1:8080 --apollo-app-id=LARAVEL-S-TEST

Support hot updates(optional).

// Edit `config/laravels.php`
'processes' => Hhxsv5\LaravelS\Components\Apollo\Process::getDefinition(),
// When there are other custom process configurations
'processes' => [
    'test' => [
        'class'    => \App\Processes\TestProcess::class,
        'redirect' => false,
        'pipe'     => 1,
    ],
    // ...
] + Hhxsv5\LaravelS\Components\Apollo\Process::getDefinition(),

List of available parameters.

| Parameter | Description | Default | Demo |
| --- | --- | --- | --- |
| apollo-server | Apollo server URL | - | --apollo-server=http://127.0.0.1:8080 |
| apollo-app-id | Apollo APP ID | - | --apollo-app-id=LARAVEL-S-TEST |
| apollo-namespaces | The namespace to which the APP belongs; can be specified multiple times | application | --apollo-namespaces=application --apollo-namespaces=env |
| apollo-cluster | The cluster to which the APP belongs | default | --apollo-cluster=default |
| apollo-client-ip | IP of current instance, can also be used for grayscale publishing | Local intranet IP | --apollo-client-ip=10.2.1.83 |
| apollo-pull-timeout | Timeout (seconds) when pulling configuration | 5 | --apollo-pull-timeout=5 |
| apollo-backup-old-env | Whether to back up the old .env configuration file when updating it | false | --apollo-backup-old-env |

Prometheus

Supports Prometheus monitoring and alerting, with Grafana for viewing the metrics visually. Please refer to Docker Compose for setting up the Prometheus and Grafana environment.

Requires the APCu extension >= 5.0.0; install it with pecl install apcu.

Copy the configuration file prometheus.php to the config directory of your project. Modify the configuration as appropriate.

# Execute commands in the project root directory
cp vendor/hhxsv5/laravel-s/config/prometheus.php config/

If your project is Lumen, you also need to manually load the configuration $app->configure('prometheus'); in bootstrap/app.php.

Configure global middleware: Hhxsv5\LaravelS\Components\Prometheus\RequestMiddleware::class. In order to count the request time consumption as accurately as possible, RequestMiddleware must be the first global middleware, which needs to be placed in front of other middleware.

Register ServiceProvider: Hhxsv5\LaravelS\Components\Prometheus\ServiceProvider::class.

Configure the CollectorProcess in config/laravels.php to collect the metrics of Swoole Worker/Task/Timer processes regularly.

'processes' => Hhxsv5\LaravelS\Components\Prometheus\CollectorProcess::getDefinition(),

Create the route to output metrics.

use Hhxsv5\LaravelS\Components\Prometheus\Exporter;

Route::get('/actuator/prometheus', function () {
    $result = app(Exporter::class)->render();
    return response($result, 200, ['Content-Type' => Exporter::REDNER_MIME_TYPE]);
});

Complete the configuration of Prometheus and start it.

global:
  scrape_interval: 5s
  scrape_timeout: 5s
  evaluation_interval: 30s
scrape_configs:
- job_name: laravel-s-test
  honor_timestamps: true
  metrics_path: /actuator/prometheus
  scheme: http
  follow_redirects: true
  static_configs:
  - targets:
    - 127.0.0.1:5200 # The ip and port of the monitored service
# Dynamically discovered using one of the supported service-discovery mechanisms
# https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
# - job_name: laravels-eureka
#   honor_timestamps: true
#   scrape_interval: 5s
#   metrics_path: /actuator/prometheus
#   scheme: http
#   follow_redirects: true
  # eureka_sd_configs:
  # - server: http://127.0.0.1:8080/eureka
  #   follow_redirects: true
  #   refresh_interval: 5s

Start Grafana, then import panel json.

Grafana Dashboard

Other features

Configure Swoole events

Supported events:

| Event | Interface | When happened |
| --- | --- | --- |
| ServerStart | Hhxsv5\LaravelS\Swoole\Events\ServerStartInterface | Occurs when the Master process is starting. This event should not handle complex business logic, and can only do some simple initialization work. |
| ServerStop | Hhxsv5\LaravelS\Swoole\Events\ServerStopInterface | Occurs when the server exits normally. CANNOT use async or coroutine related APIs in this event. |
| WorkerStart | Hhxsv5\LaravelS\Swoole\Events\WorkerStartInterface | Occurs after the Worker/Task process is started and the Laravel initialization has been completed. |
| WorkerStop | Hhxsv5\LaravelS\Swoole\Events\WorkerStopInterface | Occurs after the Worker/Task process exits normally. |
| WorkerError | Hhxsv5\LaravelS\Swoole\Events\WorkerErrorInterface | Occurs when an exception or fatal error occurs in the Worker/Task process. |

1.Create an event class to implement the corresponding interface.

namespace App\Events;
use Hhxsv5\LaravelS\Swoole\Events\ServerStartInterface;
use Swoole\Atomic;
use Swoole\Http\Server;
class ServerStartEvent implements ServerStartInterface
{
    public function __construct()
    {
    }
    public function handle(Server $server)
    {
        // Initialize a global counter (available across processes)
        $server->atomicCount = new Atomic(2233);

        // Invoked in controller: app('swoole')->atomicCount->get();
    }
}
namespace App\Events;
use Hhxsv5\LaravelS\Swoole\Events\WorkerStartInterface;
use Swoole\Http\Server;
class WorkerStartEvent implements WorkerStartInterface
{
    public function __construct()
    {
    }
    public function handle(Server $server, $workerId)
    {
        // Initialize a database connection pool
        // DatabaseConnectionPool::init();
    }
}

2.Configuration.

// Edit `config/laravels.php`
'event_handlers' => [
    'ServerStart' => [\App\Events\ServerStartEvent::class], // Trigger events in array order
    'WorkerStart' => [\App\Events\WorkerStartEvent::class],
],

Serverless

Alibaba Cloud Function Compute

Function Compute.

1.Modify bootstrap/app.php and set the storage directory. Because the project directory is read-only, only the /tmp directory can be read and written.

$app->useStoragePath(env('APP_STORAGE_PATH', '/tmp/storage'));

2.Create a shell script laravels_bootstrap and grant executable permission.

#!/usr/bin/env bash
set +e

# Create storage-related directories
mkdir -p /tmp/storage/app/public
mkdir -p /tmp/storage/framework/cache
mkdir -p /tmp/storage/framework/sessions
mkdir -p /tmp/storage/framework/testing
mkdir -p /tmp/storage/framework/views
mkdir -p /tmp/storage/logs

# Set the environment variable APP_STORAGE_PATH, please make sure it's the same as APP_STORAGE_PATH in .env
export APP_STORAGE_PATH=/tmp/storage

# Start LaravelS
php bin/laravels start

3.Configure template.yml.

ROSTemplateFormatVersion: '2015-09-01'
Transform: 'Aliyun::Serverless-2018-04-03'
Resources:
  laravel-s-demo:
    Type: 'Aliyun::Serverless::Service'
    Properties:
      Description: 'LaravelS Demo for Serverless'
    fc-laravel-s:
      Type: 'Aliyun::Serverless::Function'
      Properties:
        Handler: laravels.handler
        Runtime: custom
        MemorySize: 512
        Timeout: 30
        CodeUri: ./
        InstanceConcurrency: 10
        EnvironmentVariables:
          BOOTSTRAP_FILE: laravels_bootstrap

Important notices

Singleton Issue

Under FPM mode, singleton instances are instantiated and recycled in every request: request start=>instantiate instance=>request end=>recycle instance.

Under Swoole Server, all singleton instances are held in memory, so their lifetime differs from FPM: request start=>instantiate instance=>request end=>do not recycle singleton instance. The developer therefore needs to maintain the state of singleton instances in every request.

Common solutions:

Write an XxxCleaner class to clean up the singleton object state. This class implements the interface Hhxsv5\LaravelS\Illuminate\Cleaners\CleanerInterface and is then registered in cleaners of laravels.php.

Reset the state of singleton instances with a Middleware.

Re-register the ServiceProvider: add XxxServiceProvider to register_providers in laravels.php, so that the singleton instances are reinitialized in every request.

Cleaners

Configuration cleaners.

Known issues

Known issues: a package of known issues and solutions.

Debugging method

Logging; if you want to output to the console, you can use stderr, Log::channel('stderr')->debug('debug message').

Laravel Dump Server(Laravel 5.7 has been integrated by default).

Read request

Read request by Illuminate\Http\Request Object, $_ENV is readable, $_SERVER is partially readable, CANNOT USE $_GET/$_POST/$_FILES/$_COOKIE/$_REQUEST/$_SESSION/$GLOBALS.

public function form(\Illuminate\Http\Request $request)
{
    $name = $request->input('name');
    $all = $request->all();
    $sessionId = $request->cookie('sessionId');
    $photo = $request->file('photo');
    // Call getContent() to get the raw POST body, instead of file_get_contents('php://input')
    $rawContent = $request->getContent();
    //...
}

Output response

Respond with an Illuminate\Http\Response object. This is compatible with echo/var_dump()/print_r(), but you CANNOT USE the functions dd()/exit()/die()/header()/setcookie()/http_response_code().

public function json()
{
    return response()->json(['time' => time()])->header('header1', 'value1')->withCookie('c1', 'v1');
}

Persistent connection

Singleton connection will be resident in memory, it is recommended to turn on persistent connection for better performance.

  1. Database connection, it will reconnect automatically immediately after disconnect.
// config/database.php
'connections' => [
    'my_conn' => [
        'driver'    => 'mysql',
        'host'      => env('DB_MY_CONN_HOST', 'localhost'),
        'port'      => env('DB_MY_CONN_PORT', 3306),
        'database'  => env('DB_MY_CONN_DATABASE', 'forge'),
        'username'  => env('DB_MY_CONN_USERNAME', 'forge'),
        'password'  => env('DB_MY_CONN_PASSWORD', ''),
        'charset'   => 'utf8mb4',
        'collation' => 'utf8mb4_unicode_ci',
        'prefix'    => '',
        'strict'    => false,
        'options'   => [
            // Enable persistent connection
            \PDO::ATTR_PERSISTENT => true,
        ],
    ],
],
  2. Redis connection, it won't reconnect automatically immediately after disconnect; it will throw a lost-connection exception and reconnect the next time. Make sure you SELECT the correct DB before operating Redis every time.
// config/database.php
'redis' => [
    'client' => env('REDIS_CLIENT', 'phpredis'), // It is recommended to use phpredis for better performance.
    'default' => [
        'host'       => env('REDIS_HOST', 'localhost'),
        'password'   => env('REDIS_PASSWORD', null),
        'port'       => env('REDIS_PORT', 6379),
        'database'   => 0,
        'persistent' => true, // Enable persistent connection
    ],
],

About memory leaks

Avoid using global variables. If necessary, please clean or reset them manually.

Infinitely appending element into static/global variable will lead to OOM(Out of Memory).

class Test
{
    public static $array = [];
    public static $string = '';
}

// Controller
public function test(Request $req)
{
    // Out of Memory
    Test::$array[] = $req->input('param1');
    Test::$string .= $req->input('param2');
}

Memory leak detection method

1. Modify config/laravels.php: worker_num=1, max_request=1000000, remember to change it back after test;

2. Add routing /debug-memory-leak without route middleware to observe the memory changes of the Worker process;

3. Start LaravelS and request /debug-memory-leak until diff_mem is less than or equal to zero; if diff_mem is always greater than zero, it means that there may be a memory leak in Global Middleware or Laravel Framework;

4. After completing Step 3, alternately request the business routes and /debug-memory-leak (it is recommended to use ab/wrk to make a large number of requests for business routes); the initial increase in memory is normal. After a large number of requests for the business routes, if diff_mem is always greater than zero and curr_mem continues to increase, there is a high probability of a memory leak; if curr_mem always changes within a certain range and does not continue to increase, there is a low probability of a memory leak.

5. If you still can't solve it, max_request is the last guarantee.

Linux kernel parameter adjustment

Linux kernel parameter adjustment

Pressure test

Pressure test

Author: hhxsv5
Source Code: https://github.com/hhxsv5/laravel-s 
License: MIT License

#php #laravel #http 

Hoang Ha

1657764000

How to Scrape Data from Twitter Using Tweepy and Snscrape

You can use data from social media in a number of ways, such as sentiment analysis (analyzing people's opinions) on a specific issue or field of interest.

There are several ways you can scrape (or gather) data from Twitter. And in this article, we will look at two of those ways: using Tweepy and Snscrape.

We will learn a method to scrape public conversations from people on a specific trending topic, as well as tweets from a particular user.

Now without further ado, let’s get started.

Tweepy vs Snscrape – Introduction to Our Scraping Tools

Before we get into the implementation of each tool, let's go over the differences between them and the limits of each.

Tweepy

Tweepy is a Python library for integrating with the Twitter API. Because Tweepy is connected with the Twitter API, you can perform complex queries in addition to scraping tweets. It enables you to take advantage of all of the Twitter API's capabilities.

But there are some drawbacks. The standard Twitter API that Tweepy talks to only lets you collect tweets from roughly the past week, so you cannot retrieve historical data this way.

Also, there are limits to how many tweets you can retrieve from a user's account. You can read more about Tweepy's functionalities here.

Snscrape

Snscrape is another approach for scraping information from Twitter that does not require the use of an API. Snscrape allows you to scrape basic information such as a user's profile, tweet content, source, and so on.

Snscrape is not limited to Twitter, but can also scrape content from other prominent social media networks like Facebook, Instagram, and others.

Its advantages are that there are no limits to the number of tweets you can retrieve or the window of tweets (that is, the date range of tweets). So Snscrape allows you to retrieve old data.

But the one disadvantage is that it lacks all the other functionalities of Tweepy – still, if you only want to scrape tweets, Snscrape would be enough.

Now that we've clarified the distinction between the two methods, let's go over their implementation one by one.

How to Use Tweepy to Scrape Tweets

Before we begin using Tweepy, we must first make sure that our Twitter credentials are ready. With that, we can connect Tweepy to our API key and begin scraping.

If you do not have Twitter credentials, you can register for a Twitter developer account by going here. You will be asked some basic questions about how you intend to use the Twitter API. After that, you can begin the implementation.

The first step is to install the Tweepy library on your local machine, which you can do by typing:

pip install git+https://github.com/tweepy/tweepy.git

How to Scrape Tweets from a User on Twitter

Now that we’ve installed the Tweepy library, let’s scrape 100 tweets from a user called john on Twitter. We'll look at the full code implementation that will let us do this and discuss it in detail so we can grasp what’s going on:

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


username = "john"
no_of_tweets = 100


try:
    # Retrieve the most recent tweets from the user's timeline
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)

    # Pull some attributes from each tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count, tweet.source, tweet.text] for tweet in tweets]

    # Column names for the dataframe, in the same order as the attributes above
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]

    # Create the dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except Exception as e:
    print('Status Failed On,', str(e))

Now let's go over each part of the code in the above block.

import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)

In the above code, we imported the Tweepy library and created four variables to store our Twitter credentials (the Tweepy authentication handler requires all four). We then passed those variables into the Tweepy authentication handler and saved the result in another variable, auth.

In the last statement, we instantiated the Tweepy API and passed in the required parameters.
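Hardcoding the four credentials in the script makes them easy to leak, for example by committing the file to a public repository. As a minimal sketch, they could instead be read from environment variables. The variable names below and the helper load_twitter_credentials are assumptions made for illustration; they are not part of Tweepy.

```python
import os

# Hypothetical environment variable names; use whatever fits your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env=os.environ):
    """Return the four credentials as a tuple, in REQUIRED_VARS order.

    Raises KeyError listing every variable that is missing or empty.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED_VARS)
```

Because the tuple follows the same order as the handler's arguments, it could then be unpacked straight into the call: auth = tweepy.OAuth1UserHandler(*load_twitter_credentials()).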

username = "john"
no_of_tweets = 100


try:
    # Retrieve the most recent tweets from the user's timeline
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)

    # Pull some attributes from each tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count, tweet.source, tweet.text] for tweet in tweets]

    # Column names for the dataframe, in the same order as the attributes above
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]

    # Create the dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except Exception as e:
    print('Status Failed On,', str(e))

In the above code, we set the name of the user (the @name on Twitter) whose tweets we want to retrieve, along with the number of tweets. We then wrapped the request in an exception handler so that any error is caught and reported instead of crashing the script.

After that, api.user_timeline() returns a collection of the most recent tweets posted by the user we picked in the screen_name parameter, up to the number of tweets set in count.

In the next line of code, we picked the attributes we want to retrieve from each tweet and saved them into a list. To see more attributes you can retrieve from a tweet, read this.

In the last chunk of code, we created a dataframe and passed in the list we built along with the column names.

Note that the column names must be in the same order as the attributes in the attributes container (that is, the order in which you listed those attributes when retrieving them from the tweet).
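To make the ordering requirement concrete, here is a small sketch in plain Python that pairs each value with its column name and rejects rows of the wrong width. The helper to_records and the sample row are made up for illustration; the column names are the ones used above.

```python
COLUMNS = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]


def to_records(rows, columns=COLUMNS):
    """Pair each value with its column name, rejecting rows of the wrong width."""
    records = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got {len(row)}")
        # zip() pairs by position, which is why the two orders must match.
        records.append(dict(zip(columns, row)))
    return records


# Made-up sample row, in the same order as the attributes list above.
sample_rows = [["2022-07-01 10:00:00", 12, "Twitter Web App", "Hello world"]]
records = to_records(sample_rows)
print(records[0]["Number of Likes"])  # 12
```

If the attributes were listed in a different order than the column names, nothing would fail; the values would simply end up under the wrong headings, which is why the order is worth double-checking.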

If you correctly followed the steps I described, you should have something like this:

image-17

Image by Author

Now that we are done, let's go over one more example before we move into the Snscrape implementation.

How to Scrape Tweets from a Text Search

In this method, we will retrieve tweets based on a search. You can do that like this:

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


search_query = "sex for grades"
no_of_tweets = 150


try:
    # The standard search endpoint returns at most 100 tweets per request,
    # so page through api.search_tweets() with a Cursor until we have no_of_tweets
    tweets = tweepy.Cursor(api.search_tweets, q=search_query, count=100).items(no_of_tweets)

    # Pull some attributes from each tweet
    attributes_container = [[tweet.user.name, tweet.created_at, tweet.favorite_count, tweet.source, tweet.text] for tweet in tweets]

    # Column names for the dataframe, in the same order as the attributes above
    columns = ["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"]

    # Create the dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except Exception as e:
    print('Status Failed On,', str(e))

The above code is similar to the previous code, except that we changed the API method from api.user_timeline() to api.search_tweets(). We've also added tweet.user.name to the attributes container list.

In the code above, you can see that we chained two attributes. This is because tweet.user on its own returns the whole user object. So we must also name the attribute we want to retrieve from the user object, which is name.

You can go here to see a list of additional attributes that you can retrieve from a user object. Now you should see something like this once you run it:

image-18

Image by Author.

Alright, that just about wraps up the Tweepy implementation. Just remember that there is a limit to the number of tweets you can retrieve, and you can not retrieve tweets more than 7 days old using Tweepy.

How to Use Snscrape to Scrape Tweets

As I mentioned previously, Snscrape does not require Twitter credentials (API key) to access it. There is also no limit to the number of tweets you can fetch.

For this example, though, we'll just retrieve the same tweets as in the previous example, but using Snscrape instead.

To use Snscrape, we must first install its library on our PC. You can do that by typing:

pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git

How to Scrape Tweets from a User with Snscrape

Snscrape includes two methods for getting tweets from Twitter: the command line interface (CLI) and a Python Wrapper. Just keep in mind that the Python Wrapper is currently undocumented – but we can still get by with trial and error.

In this example, we will use the Python Wrapper because it is more intuitive than the CLI method. But if you get stuck with some code, you can always turn to the GitHub community for assistance. The contributors will be happy to help you.

To retrieve tweets from a particular user, we can do the following:

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Created a list to append all tweet attributes(data)
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:  # enumerate starts at 0, so this stops after 100 tweets
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

Let's go over some of the code that you might not understand at first glance:

for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:  # enumerate starts at 0, so this stops after 100 tweets
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
  
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

In the above code, sntwitter.TwitterSearchScraper returns an object of tweets from the user whose name we passed into it (which is john).

As I mentioned earlier, Snscrape does not limit the number of tweets, so it will return every tweet it can find from that user. To cap this, we use the enumerate function, which iterates through the object and adds a counter, so we can stop after the most recent 100 tweets from the user.
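The same capping idea can be sketched with itertools.islice from the standard library, which stops pulling from an iterator after a fixed number of items. The generator fake_tweet_stream below is only a stand-in for the scraper's get_items() so that the example runs without network access; it is an assumption for illustration, not Snscrape code.

```python
from itertools import islice


def fake_tweet_stream():
    """Stand-in for scraper.get_items(): an endless generator of items."""
    n = 0
    while True:
        yield f"tweet {n}"
        n += 1


def take(items, limit):
    """Return at most `limit` items from any iterable, then stop pulling."""
    return list(islice(items, limit))


first_100 = take(fake_tweet_stream(), 100)
print(len(first_100))  # 100
```

One thing to watch with the counter approach: enumerate starts at 0, so a check written as i > 100 lets 101 items through before breaking. With islice the limit is stated directly, which avoids that kind of off-by-one.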

You can see that the attributes syntax we get from each tweet looks like the one from Tweepy. Below is the list of attributes that we can get from a Snscrape tweet, curated by Martin Beck.

Sns.Scrape

Credit: Martin Beck

More attributes might be added, as the Snscrape library is still in development. For instance, in the above image, source has been replaced with sourceLabel. If you pass in only source, it will return an object.

If you run the above code, you should see something like this as well:

image-19

Image by Author

Now let's do the same for scraping by search.

How to Scrape Tweets from a Text Search with Snscrape

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Creating list to append tweet data to
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('sex for grades since:2021-07-05 until:2022-07-06').get_items()):
    if i >= 150:  # enumerate starts at 0, so this stops after 150 tweets
        break
    attributes_container.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe to load the list
tweets_df = pd.DataFrame(attributes_container, columns=["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"])

Again, you can access a lot of historical data using Snscrape (unlike Tweepy, where the standard API only reaches back 7 days and the premium API 30 days). So we can pass the date from which we want to start the search and the date we want it to end into the sntwitter.TwitterSearchScraper() method.

What we've done in the preceding code is basically what we discussed before. The only thing to bear in mind is that until works similarly to the range function in Python (that is, it excludes the last integer). So if you want to get tweets from today, you need to include the day after today in the "until" parameter.
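Since until is exclusive, one way to avoid forgetting the extra day is to build the query string from dates and add the day in code. This is a small sketch using only the standard library; build_search_query is a hypothetical helper, and the search text is just an example.

```python
from datetime import date, timedelta


def build_search_query(text, since, last_day):
    """Build a search string whose date window includes `last_day`.

    `until:` is exclusive, so it is set to the day after `last_day`.
    """
    until = last_day + timedelta(days=1)
    return f"{text} since:{since.isoformat()} until:{until.isoformat()}"


query = build_search_query("python", date(2022, 7, 1), date(2022, 7, 5))
print(query)  # python since:2022-07-01 until:2022-07-06
```

The resulting string has the same shape as the one passed to sntwitter.TwitterSearchScraper() above, so it could be used in its place.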

image-21

Image by Author.

Now you know how to scrape tweets with Snscrape, too!

When to use each approach

Now that we've seen how each method works, you might be wondering when to use which.

Well, there is no universal rule for when to use each method. Everything comes down to a matter of preference and your use case.

If you want to acquire an endless number of tweets, you should use Snscrape. But if you want to use extra features that Snscrape cannot provide (like geolocation, for example), then you should definitely use Tweepy. It is directly integrated with the Twitter API and provides complete functionality.

Even so, Snscrape is the most commonly used method for basic scraping.

Conclusion

In this article, we learned how to scrape data from Twitter with Python using Tweepy and Snscrape. But this was only a brief overview of how each approach works. You can learn more by exploring the web for additional information.

I've included some useful resources that you can use if you need additional information. Thank you for reading.

 Source: https://www.freecodecamp.org/news/python-web-scraping-tutorial/

#python #web 

许 志强

1657769340

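How to Scrape Data from Twitter Using Tweepy and Snscrape

If you are a data enthusiast, you will probably agree that social media is one of the richest sources of real-world data. Sites like Twitter are full of data.

You can use data from social media in a number of ways, such as sentiment analysis (analyzing people's opinions) on a specific issue or field of interest.

There are several ways you can scrape (or gather) data from Twitter. In this article, we will look at two of them: using Tweepy and Snscrape.

We will learn a method to scrape public conversations from people on a specific trending topic, as well as tweets from a particular user.

Now, without further ado, let's get started.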

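Tweepy vs Snscrape – Introduction to Our Scraping Tools

Before we get into the implementation of each tool, let's go over the differences between them and the limits of each.

Tweepy

Tweepy is a Python library for integrating with the Twitter API. Because Tweepy is connected with the Twitter API, you can perform complex queries in addition to scraping tweets. It enables you to take advantage of all of the Twitter API's capabilities.

But there are some drawbacks. The standard Twitter API that Tweepy talks to only lets you collect tweets from roughly the past week, so you cannot retrieve historical data this way.

Also, there are limits to how many tweets you can retrieve from a user's account. You can read more about Tweepy's functionalities here.

Snscrape

Snscrape is another approach for scraping information from Twitter that does not require the use of an API. Snscrape allows you to scrape basic information such as a user's profile, tweet content, source, and so on.

Snscrape is not limited to Twitter, but can also scrape content from other prominent social media networks like Facebook, Instagram, and others.

Its advantages are that there are no limits to the number of tweets you can retrieve or the window of tweets (that is, the date range of tweets). So Snscrape allows you to retrieve old data.

But the one disadvantage is that it lacks all the other functionalities of Tweepy. Still, if you only want to scrape tweets, Snscrape would be enough.

Now that we've clarified the distinction between the two methods, let's go over their implementation one by one.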

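How to Use Tweepy to Scrape Tweets

Before we begin using Tweepy, we must first make sure that our Twitter credentials are ready. With that, we can connect Tweepy to our API key and begin scraping.

If you do not have Twitter credentials, you can register for a Twitter developer account by going here. You will be asked some basic questions about how you intend to use the Twitter API. After that, you can begin the implementation.

The first step is to install the Tweepy library on your local machine, which you can do by typing: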

pip install git+https://github.com/tweepy/tweepy.git

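How to Scrape Tweets from a User on Twitter

Now that we've installed the Tweepy library, let's scrape 100 tweets from a user called john on Twitter. We'll look at the full code implementation that lets us do this and discuss it in detail so we can grasp what's going on: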

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


username = "john"
no_of_tweets = 100


try:
    # Retrieve the most recent tweets from the user's timeline
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)

    # Pull some attributes from each tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count, tweet.source, tweet.text] for tweet in tweets]

    # Column names for the dataframe, in the same order as the attributes above
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]

    # Create the dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except Exception as e:
    print('Status Failed On,', str(e))

Now let's go over each part of the code in the above block.

import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)

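In the above code, we imported the Tweepy library and created four variables to store our Twitter credentials (the Tweepy authentication handler requires all four). We then passed those variables into the Tweepy authentication handler and saved the result in another variable, auth.

In the last statement, we instantiated the Tweepy API and passed in the required parameters.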

username = "john"
no_of_tweets = 100


try:
    # Retrieve the most recent tweets from the user's timeline
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)

    # Pull some attributes from each tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count, tweet.source, tweet.text] for tweet in tweets]

    # Column names for the dataframe, in the same order as the attributes above
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]

    # Create the dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except Exception as e:
    print('Status Failed On,', str(e))

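In the above code, we set the name of the user (the @name on Twitter) whose tweets we want to retrieve, along with the number of tweets. We then wrapped the request in an exception handler so that any error is caught and reported instead of crashing the script.

After that, api.user_timeline() returns a collection of the most recent tweets posted by the user we picked in the screen_name parameter, up to the number of tweets set in count.

In the next line of code, we picked the attributes we want to retrieve from each tweet and saved them into a list. To see more attributes you can retrieve from a tweet, read this.

In the last chunk of code, we created a dataframe and passed in the list we built along with the column names.

Note that the column names must be in the same order as the attributes in the attributes container (that is, the order in which you listed those attributes when retrieving them from the tweet).

If you correctly followed the steps I described, you should have something like this:

image-17

Image by Author

Now that we are done, let's go over one more example before we move into the Snscrape implementation.

How to Scrape Tweets from a Text Search

In this method, we will retrieve tweets based on a search. You can do that like this: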

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


search_query = "sex for grades"
no_of_tweets =150


try:
    #The number of tweets we want to retrieve from the search
    tweets = api.search_tweets(q=search_query, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.user.name, tweet.created_at, tweet.favorite_count, tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))

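The code above is similar to the previous code, except that we changed the API method from api.user_timeline() to api.search_tweets(). We also added tweet.user.name to the attributes container list.

In the code above, you can see that we pass in two attributes, tweet.user and name. This is because tweet.user on its own returns the whole user object, so we also have to name the attribute we want to retrieve from that object, which is name.

The user object carries additional attributes that can be retrieved in the same way. Now, once you run the code, you should see something like this:

Image 18

Image by the author.

Alright, that wraps up the Tweepy implementation. Remember that there is a limit on the number of tweets you can retrieve, and that you can't use Tweepy to retrieve tweets more than 7 days old.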

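How to Use Snscrape to Scrape Tweets

As I mentioned earlier, Snscrape does not require Twitter credentials (API keys) to access it. There is also no limit on the number of tweets you can fetch.

For this example, though, we will retrieve the same tweets as in the previous example, this time using Snscrape.

To use Snscrape, we first have to install its library on our PC. You can do that by typing: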

pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git

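How to Use Snscrape to Scrape a User's Tweets

Snscrape offers two ways to get tweets from Twitter: the command line interface (CLI) and a Python wrapper. Keep in mind that the Python wrapper is currently undocumented, but we can still get by through trial and error.

In this example, we will use the Python wrapper because it is more intuitive than the CLI. If you run into problems with the code, you can always ask the GitHub community for help. The contributors will be happy to help you.

To retrieve tweets from a particular user, we can do the following: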

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Created a list to append all tweet attributes(data)
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:  # stop after 100 tweets (i starts at 0)
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

Let's go over some of the code that you might not understand at first glance:

for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:  # stop after 100 tweets (i starts at 0)
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
  
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

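In the code above, sntwitter.TwitterSearchScraper returns an object containing the tweets from the username we passed to it (john).

As I mentioned earlier, Snscrape places no limit on the number of tweets, so it will return however many tweets that user has posted. To keep this in check, we add the enumerate function, which iterates through the object and attaches a counter, so that we can stop after the user's 100 most recent tweets.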
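The capping pattern can be tried on its own, without network access. In this sketch, fake_items is a hypothetical stand-in for get_items() that never runs out of items. It shows the counter-based cap next to an equivalent one built on itertools.islice from the standard library.

```python
from itertools import count, islice

def fake_items():
    """Hypothetical stand-in for get_items(): yields items without end."""
    for n in count():
        yield f"tweet {n}"

# Counter-based cap: stop once 100 items have been collected.
# Note the >= comparison; `i > 100` would let 101 items through.
collected = []
for i, item in enumerate(fake_items()):
    if i >= 100:
        break
    collected.append(item)

# Equivalent cap using itertools.islice.
sliced = list(islice(fake_items(), 100))

print(len(collected), len(sliced))  # 100 100
```

Both versions stop pulling from the generator as soon as the cap is reached, so no extra items are requested.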

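You can see that the attribute syntax for each tweet is similar to Tweepy's. The following list, curated by Martin Beck, shows the attributes we can get from a Snscrape tweet.

Sns.Scrape

Credit: Martin Beck

More attributes may be added, since the Snscrape library is still in development. For example, source in the image above has been replaced by sourceLabel. If you pass in only source, it returns an object.

If you run the code above, you should see something like this as well:

Image 19

Image by the author.

Now let's do the same for scraping by search.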

How to Use Snscrape to Scrape Tweets from a Text Search

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Creating list to append tweet data to
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('sex for grades since:2021-07-05 until:2022-07-06').get_items()):
    if i >= 150:  # stop after 150 tweets (i starts at 0)
        break
    attributes_container.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe to load the list
tweets_df = pd.DataFrame(attributes_container, columns=["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"])

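Again, Snscrape gives you access to a lot of historical data (unlike Tweepy, whose standard API cannot go back more than 7 days; the premium API goes back 30 days). So we can pass the date we want the search to start from and the date we want it to end on into the sntwitter.TwitterSearchScraper() method.

What we did in the preceding code is basically what we discussed before. The only thing to keep in mind is that until works like Python's range function: it excludes the end value. So if you want tweets up to and including today, you need to pass the day after today to until.

Image 21

Image by the author.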
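Because until excludes its end date, it can help to compute that date rather than type it by hand. The helper below is a minimal sketch: build_query is a hypothetical name, and it only assembles the search string in the since:/until: form used in the example above.

```python
from datetime import date, timedelta

def build_query(text, since, last_day):
    """Build a search string whose results include last_day itself.

    'until' is exclusive, so it is set to the day after last_day.
    """
    until = last_day + timedelta(days=1)
    return f"{text} since:{since.isoformat()} until:{until.isoformat()}"

query = build_query("python", date(2022, 7, 1), date(2022, 7, 5))
print(query)  # python since:2022-07-01 until:2022-07-06
```

The resulting string can be passed to sntwitter.TwitterSearchScraper() in place of a hand-written query.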

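Now you know how to scrape tweets with Snscrape, too!

When to Use Each Approach

Now that we have seen how each method works, you may be wondering when to use which.

Well, there is no universal rule for when to use each method. It all comes down to your preference and your use case.

If you want to collect an unlimited number of tweets, you should use Snscrape. But if you need extra features that Snscrape cannot provide (such as geolocation), then you should definitely use Tweepy. It integrates directly with the Twitter API and provides the full functionality.

Even so, Snscrape is the most commonly used method for basic scraping.

Conclusion

In this article, we learned how to scrape Twitter data in Python using Tweepy and Snscrape. This was only a brief overview of how each approach works; you can learn more by exploring the web for additional information.

I've included some useful resources that you can use if you need more information. Thank you for reading.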

Source: https://www.freecodecamp.org/news/python-web-scraping-tutorial/

#python #web 

Veronica Roob

1648869960

LaravelS: An Out-Of-The-Box Adapter Between Swoole and Laravel/Lumen

 _                               _  _____ 
| |                             | |/ ____|
| |     __ _ _ __ __ ___   _____| | (___  
| |    / _` | '__/ _` \ \ / / _ \ |\___ \ 
| |___| (_| | | | (_| |\ V /  __/ |____) |
|______\__,_|_|  \__,_| \_/ \___|_|_____/ 
                                           

🚀 LaravelS is an out-of-the-box adapter between Swoole and Laravel/Lumen.

Please Watch this repository to get the latest updates.

Chinese documentation

Features

Built-in Http/WebSocket server

Multi-port mixed protocol

Custom process

Memory resident

Asynchronous event listening

Asynchronous task queue

Millisecond cron job

Common Components

Graceful reload

Automatic reload after modifying code

Supports both Laravel and Lumen, with good compatibility

Simple & Out of the box

Benchmark

Which is the fastest web framework?

TechEmpower Framework Benchmarks

Requirements

Dependency      Requirement
PHP             >= 5.5.9 (PHP 7+ recommended)
Swoole          >= 1.7.19 (PHP 5 no longer supported since 2.0.12; 4.5.0+ recommended)
Laravel/Lumen   >= 5.1 (8.0+ recommended)

Install

1.Require package via Composer(packagist).

composer require "hhxsv5/laravel-s:~3.7.0" -vvv
# Make sure that your composer.lock file is under the VCS

2.Register service provider(pick one of two).

Laravel: in the config/app.php file. Laravel 5.5+ supports automatic package discovery, so you can skip this step there.

'providers' => [
    //...
    Hhxsv5\LaravelS\Illuminate\LaravelSServiceProvider::class,
],

Lumen: in bootstrap/app.php file

$app->register(Hhxsv5\LaravelS\Illuminate\LaravelSServiceProvider::class);

3.Publish configuration and binaries.

After upgrading LaravelS, you need to republish; click here to see the change notes of each version.

php artisan laravels publish
# Configuration: config/laravels.php
# Binary: bin/laravels bin/fswatch bin/inotify

4.Change config/laravels.php: set listen_ip and listen_port, refer to Settings.

5.Performance tuning

Adjust kernel parameters

Number of Workers: LaravelS uses Swoole's synchronous IO mode. The larger the worker_num setting, the better the concurrency performance, at the cost of more memory usage and more process-switching overhead. If one request takes 100ms, then at least 100 Worker processes are needed to serve 1000 QPS. The calculation is: worker_num = 1000QPS/(1s/100ms) = 100. Incremental load testing is still needed to find the best worker_num.

Number of Task Workers
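The sizing rule for worker_num can be written as a small function. This is a rough sketch that assumes each worker handles one request at a time, as in synchronous IO mode; estimate_worker_num is a hypothetical helper, not part of LaravelS. A worker that is busy for 100 ms per request serves 10 requests per second, so 1000 QPS needs 1000 / 10 = 100 workers.

```python
import math

def estimate_worker_num(target_qps, request_ms):
    """Rough lower bound on workers when each worker handles one request at a time."""
    # Each worker completes 1000 / request_ms requests per second,
    # so the count is target_qps / (1000 / request_ms), rounded up.
    return math.ceil(target_qps * request_ms / 1000)

print(estimate_worker_num(1000, 100))  # 100
```

Treat the result as a starting point for load testing, since memory use and process switching also grow with the number of workers.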

Run

Please read the Important notices (IMPORTANT) carefully before running.

  • Commands: php bin/laravels {start|stop|restart|reload|info|help}.
Command   Description
start     Start LaravelS; list the processes with "ps -ef|grep laravels"
stop      Stop LaravelS, and trigger the onStop method of Custom process
restart   Restart LaravelS: stop gracefully before starting; the service is unavailable until startup is complete
reload    Reload all Task/Worker/Timer processes that contain your business code, and trigger the onReload method of Custom process. CANNOT reload Master/Manager processes. After modifying config/laravels.php, you must use restart instead
info      Display component version information
help      Display help information
  • Boot options for the commands start and restart.
Option           Description
-d|--daemonize   Run as a daemon; this option overrides the swoole.daemonize setting in laravels.php
-e|--env         The environment the command should run under; for example, --env=testing uses the configuration file .env.testing first. This feature requires Laravel 5.2+
-i|--ignore      Ignore checking the PID file of the Master process
-x|--x-version   The version (branch) of the current project, stored in $_ENV/$_SERVER; access it via $_ENV['X_VERSION'], $_SERVER['X_VERSION'] or $request->server->get('X_VERSION')
  • Runtime files: start automatically executes php artisan laravels config and generates these files. Developers generally don't need to pay attention to them; it's recommended to add them to .gitignore.
File                                    Description
storage/laravels.conf                   LaravelS's runtime configuration file
storage/laravels.pid                    PID file of the Master process
storage/laravels-timer-process.pid      PID file of the Timer process
storage/laravels-custom-processes.pid   PID file of all custom processes

Deploy

It is recommended to supervise the main process with Supervisord. This requires running without the -d option and setting swoole.daemonize to false.

[program:laravel-s-test]
directory=/var/www/laravel-s-test
command=/usr/local/bin/php bin/laravels start -i
numprocs=1
autostart=true
autorestart=true
startretries=3
user=www-data
redirect_stderr=true
stdout_logfile=/var/log/supervisor/%(program_name)s.log

Cooperate with Nginx (Recommended)

Demo.

gzip on;
gzip_min_length 1024;
gzip_comp_level 2;
gzip_types text/plain text/css text/javascript application/json application/javascript application/x-javascript application/xml application/x-httpd-php image/jpeg image/gif image/png font/ttf font/otf image/svg+xml;
gzip_vary on;
gzip_disable "msie6";
upstream swoole {
    # Connect IP:Port
    server 127.0.0.1:5200 weight=5 max_fails=3 fail_timeout=30s;
    # Connect UnixSocket Stream file, tips: put the socket file in the /dev/shm directory to get better performance
    #server unix:/yourpath/laravel-s-test/storage/laravels.sock weight=5 max_fails=3 fail_timeout=30s;
    #server 192.168.1.1:5200 weight=3 max_fails=3 fail_timeout=30s;
    #server 192.168.1.2:5200 backup;
    keepalive 16;
}
server {
    listen 80;
    # Don't forget to bind the host
    server_name laravels.com;
    root /yourpath/laravel-s-test/public;
    access_log /yourpath/log/nginx/$server_name.access.log  main;
    autoindex off;
    index index.html index.htm;
    # Nginx handles the static resources(recommend enabling gzip), LaravelS handles the dynamic resource.
    location / {
        try_files $uri @laravels;
    }
    # Response 404 directly when request the PHP file, to avoid exposing public/*.php
    #location ~* \.php$ {
    #    return 404;
    #}
    location @laravels {
        # proxy_connect_timeout 60s;
        # proxy_send_timeout 60s;
        # proxy_read_timeout 120s;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header Scheme $scheme;
        proxy_set_header Server-Protocol $server_protocol;
        proxy_set_header Server-Name $server_name;
        proxy_set_header Server-Addr $server_addr;
        proxy_set_header Server-Port $server_port;
        # "swoole" is the upstream
        proxy_pass http://swoole;
    }
}

Cooperate with Apache

LoadModule proxy_module /yourpath/modules/mod_proxy.so
LoadModule proxy_balancer_module /yourpath/modules/mod_proxy_balancer.so
LoadModule lbmethod_byrequests_module /yourpath/modules/mod_lbmethod_byrequests.so
LoadModule proxy_http_module /yourpath/modules/mod_proxy_http.so
LoadModule slotmem_shm_module /yourpath/modules/mod_slotmem_shm.so
LoadModule rewrite_module /yourpath/modules/mod_rewrite.so
LoadModule remoteip_module /yourpath/modules/mod_remoteip.so
LoadModule deflate_module /yourpath/modules/mod_deflate.so

<IfModule deflate_module>
    SetOutputFilter DEFLATE
    DeflateCompressionLevel 2
    AddOutputFilterByType DEFLATE text/html text/plain text/css text/javascript application/json application/javascript application/x-javascript application/xml application/x-httpd-php image/jpeg image/gif image/png font/ttf font/otf image/svg+xml
</IfModule>

<VirtualHost *:80>
    # Don't forget to bind the host
    ServerName www.laravels.com
    ServerAdmin hhxsv5@sina.com

    DocumentRoot /yourpath/laravel-s-test/public
    DirectoryIndex index.html index.htm
    <Directory "/">
        AllowOverride None
        Require all granted
    </Directory>

    RemoteIPHeader X-Forwarded-For

    ProxyRequests Off
    ProxyPreserveHost On
    <Proxy balancer://laravels>  
        BalancerMember http://192.168.1.1:5200 loadfactor=7
        #BalancerMember http://192.168.1.2:5200 loadfactor=3
        #BalancerMember http://192.168.1.3:5200 loadfactor=1 status=+H
        ProxySet lbmethod=byrequests
    </Proxy>
    #ProxyPass / balancer://laravels/
    #ProxyPassReverse / balancer://laravels/

    # Apache handles the static resources, LaravelS handles the dynamic resource.
    RewriteEngine On
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-d
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
    RewriteRule ^/(.*)$ balancer://laravels%{REQUEST_URI} [P,L]

    ErrorLog ${APACHE_LOG_DIR}/www.laravels.com.error.log
    CustomLog ${APACHE_LOG_DIR}/www.laravels.com.access.log combined
</VirtualHost>

Enable WebSocket server

The listening address of the WebSocket server is the same as that of the HTTP server.

1.Create a WebSocket handler class that implements the interface WebSocketHandlerInterface. The handler is instantiated automatically at startup; you do not need to create it manually.

namespace App\Services;
use Hhxsv5\LaravelS\Swoole\WebSocketHandlerInterface;
use Swoole\Http\Request;
use Swoole\Http\Response;
use Swoole\WebSocket\Frame;
use Swoole\WebSocket\Server;
/**
 * @see https://www.swoole.co.uk/docs/modules/swoole-websocket-server
 */
class WebSocketService implements WebSocketHandlerInterface
{
    // Declare constructor without parameters
    public function __construct()
    {
    }
    // public function onHandShake(Request $request, Response $response)
    // {
           // Custom handshake: https://www.swoole.co.uk/docs/modules/swoole-websocket-server-on-handshake
           // The onOpen event will be triggered automatically after a successful handshake
    // }
    public function onOpen(Server $server, Request $request)
    {
        // Before the onOpen event is triggered, the HTTP request to establish the WebSocket has passed the Laravel route,
        // so Laravel's Request, Auth information are readable, Session is readable and writable, but only in the onOpen event.
        // \Log::info('New WebSocket connection', [$request->fd, request()->all(), session()->getId(), session('xxx'), session(['yyy' => time()])]);
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
        $server->push($request->fd, 'Welcome to LaravelS');
    }
    public function onMessage(Server $server, Frame $frame)
    {
        // \Log::info('Received message', [$frame->fd, $frame->data, $frame->opcode, $frame->finish]);
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
        $server->push($frame->fd, date('Y-m-d H:i:s'));
    }
    public function onClose(Server $server, $fd, $reactorId)
    {
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
    }
}

2.Modify config/laravels.php.

// ...
'websocket'      => [
    'enable'  => true, // Note: set enable to true
    'handler' => \App\Services\WebSocketService::class,
],
'swoole'         => [
    //...
    // Must set dispatch_mode in (2, 4, 5), see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
    'dispatch_mode' => 2,
    //...
],
// ...

3.Use SwooleTable to bind FD & UserId (optional), see the Swoole Table Demo. You can also use other global storage services such as Redis/Memcached/MySQL, but be careful: FDs may conflict between multiple Swoole servers.

4.Cooperate with Nginx (Recommended)

Refer to WebSocket Proxy

map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
upstream swoole {
    # Connect IP:Port
    server 127.0.0.1:5200 weight=5 max_fails=3 fail_timeout=30s;
    # Connect UnixSocket Stream file, tips: put the socket file in the /dev/shm directory to get better performance
    #server unix:/yourpath/laravel-s-test/storage/laravels.sock weight=5 max_fails=3 fail_timeout=30s;
    #server 192.168.1.1:5200 weight=3 max_fails=3 fail_timeout=30s;
    #server 192.168.1.2:5200 backup;
    keepalive 16;
}
server {
    listen 80;
    # Don't forget to bind the host
    server_name laravels.com;
    root /yourpath/laravel-s-test/public;
    access_log /yourpath/log/nginx/$server_name.access.log  main;
    autoindex off;
    index index.html index.htm;
    # Nginx handles the static resources(recommend enabling gzip), LaravelS handles the dynamic resource.
    location / {
        try_files $uri @laravels;
    }
    # Response 404 directly when request the PHP file, to avoid exposing public/*.php
    #location ~* \.php$ {
    #    return 404;
    #}
    # Http and WebSocket are concomitant, Nginx identifies them by "location"
    # !!! The location of WebSocket is "/ws"
    # Javascript: var ws = new WebSocket("ws://laravels.com/ws");
    location =/ws {
        # proxy_connect_timeout 60s;
        # proxy_send_timeout 60s;
        # proxy_read_timeout: Nginx will close the connection if the proxied server does not send data to Nginx in 60 seconds; At the same time, this close behavior is also affected by heartbeat setting of Swoole.
        # proxy_read_timeout 60s;
        proxy_http_version 1.1;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header Scheme $scheme;
        proxy_set_header Server-Protocol $server_protocol;
        proxy_set_header Server-Name $server_name;
        proxy_set_header Server-Addr $server_addr;
        proxy_set_header Server-Port $server_port;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_pass http://swoole;
    }
    location @laravels {
        # proxy_connect_timeout 60s;
        # proxy_send_timeout 60s;
        # proxy_read_timeout 60s;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Real-PORT $remote_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header Scheme $scheme;
        proxy_set_header Server-Protocol $server_protocol;
        proxy_set_header Server-Name $server_name;
        proxy_set_header Server-Addr $server_addr;
        proxy_set_header Server-Port $server_port;
        proxy_pass http://swoole;
    }
}
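The map block at the top of the Nginx configuration above chooses the Connection header that is forwarded to Swoole on the /ws location. As a rough illustration in Python (not Nginx syntax; connection_header is a hypothetical name), the mapping behaves like this: an empty Upgrade header from the client yields "close", and any other value yields "upgrade".

```python
def connection_header(http_upgrade):
    """Mirror of the map block: empty Upgrade header -> "close", anything else -> "upgrade"."""
    return "close" if http_upgrade == "" else "upgrade"

print(connection_header("websocket"))  # upgrade
print(connection_header(""))           # close
```

This is why a plain HTTP request that reaches /ws is not kept open as a WebSocket connection, while a browser's WebSocket handshake is passed through with the upgrade intact.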

5.Heartbeat setting

Heartbeat setting of Swoole

// config/laravels.php
'swoole' => [
    //...
    // All connections are traversed every 60 seconds. If a connection does not send any data to the server within 600 seconds, the connection will be forced to close.
    'heartbeat_idle_time'      => 600,
    'heartbeat_check_interval' => 60,
    //...
],

Proxy read timeout of Nginx

# Nginx will close the connection if the proxied server does not send data to Nginx in 60 seconds
proxy_read_timeout 60s;
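The two Swoole heartbeat settings describe a periodic sweep: every heartbeat_check_interval seconds, connections that have been silent for longer than heartbeat_idle_time are closed. The Python sketch below only illustrates that rule with made-up timestamps. It is not how Swoole implements the check, and idle_connections is a hypothetical name.

```python
def idle_connections(last_received, now, idle_time):
    """Return the fds whose last data arrived more than idle_time seconds ago."""
    return sorted(fd for fd, ts in last_received.items() if now - ts > idle_time)

# Hypothetical state: fd -> timestamp (seconds) of the last data received.
last_received = {1: 0, 2: 500, 3: 1000}

# One sweep at t=1000 with heartbeat_idle_time=600.
print(idle_connections(last_received, now=1000, idle_time=600))  # [1]
```

A client that sends any data, such as a periodic ping message, more often than the idle time keeps its connection out of this list.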

6.Push data in controller

namespace App\Http\Controllers;
class TestController extends Controller
{
    public function push()
    {
        $fd = 1; // Find fd by userId from a map [userId=>fd].
        /**@var \Swoole\WebSocket\Server $swoole */
        $swoole = app('swoole');
        $success = $swoole->push($fd, 'Push data to fd#1 in Controller');
        var_dump($success);
    }
}

Listen events

System events

Usually, you can reset/destroy some global/static variables, or change the current Request/Response object.

laravels.received_request: fired after LaravelS has parsed Swoole\Http\Request into Illuminate\Http\Request, before Laravel's Kernel handles the request.

// Edit file `app/Providers/EventServiceProvider.php`, add the following code into method `boot`
// If no variable $events, you can also call Facade \Event::listen(). 
$events->listen('laravels.received_request', function (\Illuminate\Http\Request $req, $app) {
    $req->query->set('get_key', 'hhxsv5');// Change query of request
    $req->request->set('post_key', 'hhxsv5'); // Change post of request
});

laravels.generated_response: fired after Laravel's Kernel has handled the request, before LaravelS parses Illuminate\Http\Response into Swoole\Http\Response.

// Edit file `app/Providers/EventServiceProvider.php`, add the following code into method `boot`
// If no variable $events, you can also call Facade \Event::listen(). 
$events->listen('laravels.generated_response', function (\Illuminate\Http\Request $req, \Symfony\Component\HttpFoundation\Response $rsp, $app) {
    $rsp->headers->set('header-key', 'hhxsv5');// Change header of response
});

Customized asynchronous events

This feature depends on Swoole's AsyncTask, so you need to set swoole.task_worker_num in config/laravels.php first. The performance of asynchronous event processing depends on the number of Swoole task processes, so set task_worker_num appropriately.

1.Create event class.

use Hhxsv5\LaravelS\Swoole\Task\Event;
class TestEvent extends Event
{
    protected $listeners = [
        // Listener list
        TestListener1::class,
        // TestListener2::class,
    ];
    private $data;
    public function __construct($data)
    {
        $this->data = $data;
    }
    public function getData()
    {
        return $this->data;
    }
}

2.Create listener class.

use Hhxsv5\LaravelS\Swoole\Task\Task;
use Hhxsv5\LaravelS\Swoole\Task\Listener;
class TestListener1 extends Listener
{
    /**
     * @var TestEvent
     */
    protected $event;
    
    public function handle()
    {
        \Log::info(__CLASS__ . ':handle start', [$this->event->getData()]);
        sleep(2);// Simulate the slow codes
        // Deliver task in CronJob, but NOT support callback finish() of task.
        // Note: Modify task_ipc_mode to 1 or 2 in config/laravels.php, see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
        $ret = Task::deliver(new TestTask('task data'));
        var_dump($ret);
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
    }
}

3.Fire event.

// Create instance of event and fire it, "fire" is asynchronous.
use Hhxsv5\LaravelS\Swoole\Task\Event;
$event = new TestEvent('event data');
// $event->delay(10); // Delay 10 seconds to fire event
// $event->setTries(3); // When an error occurs, try 3 times in total
$success = Event::fire($event);
var_dump($success);// Return true if success, otherwise false

Asynchronous task queue

This feature depends on Swoole's AsyncTask, so you need to set swoole.task_worker_num in config/laravels.php first. The performance of task processing depends on the number of Swoole task processes, so set task_worker_num appropriately.

1.Create task class.

use Hhxsv5\LaravelS\Swoole\Task\Task;
class TestTask extends Task
{
    private $data;
    private $result;
    public function __construct($data)
    {
        $this->data = $data;
    }
    // The logic of task handling, run in task process, CAN NOT deliver task
    public function handle()
    {
        \Log::info(__CLASS__ . ':handle start', [$this->data]);
        sleep(2);// Simulate the slow codes
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
        $this->result = 'the result of ' . $this->data;
    }
    // Optional, finish event, the logic of after task handling, run in worker process, CAN deliver task 
    public function finish()
    {
        \Log::info(__CLASS__ . ':finish start', [$this->result]);
        Task::deliver(new TestTask2('task2 data')); // Deliver the other task
    }
}

2.Deliver task.

// Create instance of TestTask and deliver it, "deliver" is asynchronous.
use Hhxsv5\LaravelS\Swoole\Task\Task;
$task = new TestTask('task data');
// $task->delay(3);// delay 3 seconds to deliver task
// $task->setTries(3); // When an error occurs, try 3 times in total
$ret = Task::deliver($task);
var_dump($ret);// Return true if success, otherwise false
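The setTries(3) comment above reads "try 3 times in total", which means the first attempt counts toward the limit. A generic Python sketch of that counting rule follows. It is not LaravelS code; run_with_tries and flaky_job are hypothetical names used only for illustration.

```python
def run_with_tries(job, tries):
    """Call job() up to `tries` times in total; return True on the first success."""
    for _ in range(tries):
        try:
            job()
            return True
        except Exception:
            continue
    return False

attempts = []

def flaky_job():
    """Hypothetical job that fails twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("temporary failure")

ok = run_with_tries(flaky_job, 3)
print(ok, len(attempts))  # True 3
```

With a limit of 2, the same job would be abandoned after its second failure and the function would return False.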

Millisecond cron job

A cron job wrapper based on Swoole's Millisecond Timer, replacing Linux Crontab.

1.Create cron job class.

namespace App\Jobs\Timer;
use App\Tasks\TestTask;
use Swoole\Coroutine;
use Hhxsv5\LaravelS\Swoole\Task\Task;
use Hhxsv5\LaravelS\Swoole\Timer\CronJob;
class TestCronJob extends CronJob
{
    protected $i = 0;
    // !!! The `interval` and `isImmediate` of a cron job can be configured in two ways (pick one): override the corresponding methods, or pass parameters when registering the cron job.
    // --- Override the corresponding method to return the configuration: begin
    public function interval()
    {
        return 1000;// Run every 1000ms
    }
    public function isImmediate()
    {
        return false;// Whether to trigger `run` immediately after setting up
    }
    // --- Override the corresponding method to return the configuration: end
    public function run()
    {
        \Log::info(__METHOD__, ['start', $this->i, microtime(true)]);
        // do something
        // sleep(1); // Swoole < 2.1
        Coroutine::sleep(1); // Swoole>=2.1 Coroutine will be automatically created for run().
        $this->i++;
        \Log::info(__METHOD__, ['end', $this->i, microtime(true)]);

        if ($this->i >= 10) { // Run 10 times only
            \Log::info(__METHOD__, ['stop', $this->i, microtime(true)]);
            $this->stop(); // Stop this cron job, but it will run again after restart/reload.
            // Deliver task in CronJob, but NOT support callback finish() of task.
            // Note: Modify task_ipc_mode to 1 or 2 in config/laravels.php, see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
            $ret = Task::deliver(new TestTask('task data'));
            var_dump($ret);
        }
        // The exceptions thrown here will be caught by the upper layer and recorded in the Swoole log. Developers need to try/catch manually.
    }
}

2.Register cron job.

// Register cron jobs in file "config/laravels.php"
[
    // ...
    'timer'          => [
        'enable' => true, // Enable Timer
        'jobs'   => [ // The list of cron job
            // Enable LaravelScheduleJob to run `php artisan schedule:run` every 1 minute, replace Linux Crontab
            // \Hhxsv5\LaravelS\Illuminate\LaravelScheduleJob::class,
            // Two ways to configure parameters:
            // [\App\Jobs\Timer\TestCronJob::class, [1000, true]], // Pass in parameters when registering
            \App\Jobs\Timer\TestCronJob::class, // Override the corresponding method to return the configuration
        ],
        'max_wait_time' => 5, // Max waiting time of reloading
        // Enable the global lock to ensure that only one instance starts the timer when deploying multiple instances. This feature depends on Redis, please see https://laravel.com/docs/7.x/redis
        'global_lock'     => false,
        'global_lock_key' => config('app.name', 'Laravel'),
    ],
    // ...
];

3.Note: when you build a server cluster, each instance launches its own timer, so make sure that only one timer is launched to avoid running the same task repeatedly.

4.LaravelS v3.4.0 and later support hot restart [Reload] of the Timer process. After LaravelS receives the SIGUSR1 signal, it waits max_wait_time (default 5) seconds for the process to end, then the Manager process starts the Timer process again.

5.If you only need to use minute-level scheduled tasks, it is recommended to enable Hhxsv5\LaravelS\Illuminate\LaravelScheduleJob instead of Linux Crontab, so that you can follow the coding habits of Laravel task scheduling and configure Kernel.

// app/Console/Kernel.php
protected function schedule(Schedule $schedule)
{
    // runInBackground() will start a new child process to execute the task. This is asynchronous and will not affect the execution timing of other tasks.
    $schedule->command(TestCommand::class)->runInBackground()->everyMinute();
}

Automatically reload after modifying code

Via inotify, supports Linux only.

1.Install inotify extension.

2.Turn on the switch in Settings.

3.Notice: file change events are only received when the file is modified within Linux. It's recommended to use the latest Docker. Vagrant Solution.

Via fswatch, supports OS X/Linux/Windows.

1.Install fswatch.

2.Run command in your project root directory.

# Watch current directory
./bin/fswatch
# Watch app directory
./bin/fswatch ./app

Via inotifywait, supports Linux.

1.Install inotify-tools.

2.Run command in your project root directory.

# Watch current directory
./bin/inotify
# Watch app directory
./bin/inotify ./app

When none of the above methods work, the last resort is to set max_request=1,worker_num=1 so that the Worker process restarts after processing each request. The performance of this method is very poor, so use it only in a development environment.

Get the instance of SwooleServer in your project

/**
 * $swoole is the instance of `Swoole\WebSocket\Server` if enable WebSocket server, otherwise `Swoole\Http\Server`
 * @var \Swoole\WebSocket\Server|\Swoole\Http\Server $swoole
 */
$swoole = app('swoole');
var_dump($swoole->stats());
$swoole->push($fd, 'Push WebSocket message');

Use SwooleTable

1.Define tables; multiple tables are supported.

All defined tables are created before Swoole starts.

// in file "config/laravels.php"
[
    // ...
    'swoole_tables'  => [
        // Scene:bind UserId & FD in WebSocket
        'ws' => [// The Key is table name, will add suffix "Table" to avoid naming conflicts. Here defined a table named "wsTable"
            'size'   => 102400,// The max size
            'column' => [// Define the columns
                ['name' => 'value', 'type' => \Swoole\Table::TYPE_INT, 'size' => 8],
            ],
        ],
        //...Define the other tables
    ],
    // ...
];

2.Access Table: all table instances will be bound on SwooleServer, access by app('swoole')->xxxTable.

namespace App\Services;
use Hhxsv5\LaravelS\Swoole\WebSocketHandlerInterface;
use Swoole\Http\Request;
use Swoole\WebSocket\Frame;
use Swoole\WebSocket\Server;
class WebSocketService implements WebSocketHandlerInterface
{
    /**@var \Swoole\Table $wsTable */
    private $wsTable;
    public function __construct()
    {
        $this->wsTable = app('swoole')->wsTable;
    }
    // Scene:bind UserId & FD in WebSocket
    public function onOpen(Server $server, Request $request)
    {
        // var_dump(app('swoole') === $server);// The same instance
        /**
         * Get the currently logged in user
         * This feature requires that the path to establish a WebSocket connection go through middleware such as Authenticate.
         * E.g:
         * Browser side: var ws = new WebSocket("ws://127.0.0.1:5200/ws");
         * Then the /ws route in Laravel needs to add the middleware like Authenticate.
         * Route::get('/ws', function () {
         *     // Respond any content with status code 200
         *     return 'websocket';
         * })->middleware(['auth']);
         */
        // $user = Auth::user();
        // $userId = $user ? $user->id : 0; // 0 means a guest user who is not logged in
        $userId = mt_rand(1000, 10000);
        // if (!$userId) {
        //     // Disconnect the connections of unlogged users
        //     $server->disconnect($request->fd);
        //     return;
        // }
        $this->wsTable->set('uid:' . $userId, ['value' => $request->fd]);// Bind map uid to fd
        $this->wsTable->set('fd:' . $request->fd, ['value' => $userId]);// Bind map fd to uid
        $server->push($request->fd, "Welcome to LaravelS #{$request->fd}");
    }
    public function onMessage(Server $server, Frame $frame)
    {
        // Broadcast
        foreach ($this->wsTable as $key => $row) {
            if (strpos($key, 'uid:') === 0 && $server->isEstablished($row['value'])) {
                $content = sprintf('Broadcast: new message "%s" from #%d', $frame->data, $frame->fd);
                $server->push($row['value'], $content);
            }
        }
    }
    public function onClose(Server $server, $fd, $reactorId)
    {
        $uid = $this->wsTable->get('fd:' . $fd);
        if ($uid !== false) {
            $this->wsTable->del('uid:' . $uid['value']); // Unbind uid map
        }
        $this->wsTable->del('fd:' . $fd);// Unbind fd map
        $server->push($fd, "Goodbye #{$fd}");
    }
}
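The two-way binding that onOpen() and onClose() maintain can be tried out without a running server. The following is a minimal sketch in plain PHP, assuming an ordinary array in place of the Swoole table; ConnectionMap and its methods are hypothetical names used only for illustration and are not part of LaravelS or Swoole.

```php
<?php
// Hypothetical stand-in for the "ws" table: a plain array keyed by "uid:<id>" and "fd:<fd>".
final class ConnectionMap
{
    /** @var array<string,int> */
    private $rows = [];

    /** Bind both directions, as onOpen() does. */
    public function bind(int $userId, int $fd): void
    {
        $this->rows['uid:' . $userId] = $fd;
        $this->rows['fd:' . $fd] = $userId;
    }

    /** Remove both directions for a closed connection, as onClose() does. */
    public function unbindFd(int $fd): void
    {
        $fdKey = 'fd:' . $fd;
        if (isset($this->rows[$fdKey])) {
            unset($this->rows['uid:' . $this->rows[$fdKey]]);
        }
        unset($this->rows[$fdKey]);
    }

    /** Look up the connection of a user, or null if the user is not connected. */
    public function fdOf(int $userId): ?int
    {
        return $this->rows['uid:' . $userId] ?? null;
    }

    /** Collect the fds of all bound users, as the broadcast loop in onMessage() does. */
    public function allFds(): array
    {
        $fds = [];
        foreach ($this->rows as $key => $value) {
            if (strpos($key, 'uid:') === 0) {
                $fds[] = $value;
            }
        }
        return $fds;
    }
}

$map = new ConnectionMap();
$map->bind(1001, 7);
$map->bind(1002, 8);
$map->unbindFd(7); // connection 7 closed: both "fd:7" and "uid:1001" are removed
```

Keeping both key families in one table means a close event, which only carries the fd, can still find and remove the matching uid entry.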

Multi-port mixed protocol

For more information, please refer to Swoole Server AddListener

To let the main server support more protocols than just HTTP and WebSocket, LaravelS exposes Swoole's multi-port mixed protocol feature under the name Socket. With it, you can build TCP/UDP applications on top of Laravel.

Create a Socket handler class that extends Hhxsv5\LaravelS\Swoole\Socket\{TcpSocket|UdpSocket|Http|WebSocket}.

namespace App\Sockets;
use Hhxsv5\LaravelS\Swoole\Socket\TcpSocket;
use Swoole\Server;
class TestTcpSocket extends TcpSocket
{
    public function onConnect(Server $server, $fd, $reactorId)
    {
        \Log::info('New TCP connection', [$fd]);
        $server->send($fd, 'Welcome to LaravelS.');
    }
    public function onReceive(Server $server, $fd, $reactorId, $data)
    {
        \Log::info('Received data', [$fd, $data]);
        $server->send($fd, 'LaravelS: ' . $data);
        if ($data === "quit\r\n") {
            $server->send($fd, 'LaravelS: bye' . PHP_EOL);
            $server->close($fd);
        }
    }
    public function onClose(Server $server, $fd, $reactorId)
    {
        \Log::info('Close TCP connection', [$fd]);
        $server->send($fd, 'Goodbye');
    }
}

These Socket connections share the same worker processes as your HTTP/WebSocket connections, so you can deliver tasks, use SwooleTable, and use Laravel components such as DB and Eloquent from them. You can also access the Swoole\Server\Port object directly through the member property swoolePort.

public function onReceive(Server $server, $fd, $reactorId, $data)
{
    $port = $this->swoolePort; // Get the `Swoole\Server\Port` object
}

namespace App\Http\Controllers;
class TestController extends Controller
{
    public function test()
    {
        /**@var \Swoole\Http\Server|\Swoole\WebSocket\Server $swoole */
        $swoole = app('swoole');
        // $swoole->ports: Traverse all Port objects, https://www.swoole.co.uk/docs/modules/swoole-server/multiple-ports
        $port = $swoole->ports[0]; // Get the `Swoole\Server\Port` object, $swoole->ports[0] is the port of the main server
        foreach ($port->connections as $fd) { // Traverse all connections
            // $swoole->send($fd, 'Send tcp message');
            // if($swoole->isEstablished($fd)) {
            //     $swoole->push($fd, 'Send websocket message');
            // }
        }
    }
}

Register Sockets.

// Edit `config/laravels.php`
//...
'sockets' => [
    [
        'host'     => '127.0.0.1',
        'port'     => 5291,
        'type'     => SWOOLE_SOCK_TCP,// Socket type: SWOOLE_SOCK_TCP/SWOOLE_SOCK_TCP6/SWOOLE_SOCK_UDP/SWOOLE_SOCK_UDP6/SWOOLE_UNIX_DGRAM/SWOOLE_UNIX_STREAM
        'settings' => [// Swoole settings:https://www.swoole.co.uk/docs/modules/swoole-server-methods#swoole_server-addlistener
            'open_eof_check' => true,
            'package_eof'    => "\r\n",
        ],
        'handler'  => \App\Sockets\TestTcpSocket::class,
        'enable'   => true, // whether to enable, default true
    ],
],
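The package_eof setting above declares "\r\n" as the end-of-package marker, which is why the handler compares $data against "quit\r\n". The idea behind delimiter-based framing can be shown in plain PHP. This is a conceptual sketch only: extractPackages() is a hypothetical helper, and it does not reproduce how Swoole itself buffers or delivers data under open_eof_check.

```php
<?php
/**
 * Split a receive buffer into complete packages terminated by $eof.
 * Returns [list of complete packages (delimiter included), unfinished remainder].
 * Hypothetical helper for illustration; not part of Swoole or LaravelS.
 */
function extractPackages(string $buffer, string $eof = "\r\n"): array
{
    $packages = [];
    $eofLength = strlen($eof);
    while (($pos = strpos($buffer, $eof)) !== false) {
        // Everything up to and including the delimiter is one complete package.
        $packages[] = substr($buffer, 0, $pos + $eofLength);
        $buffer = substr($buffer, $pos + $eofLength);
    }
    return [$packages, $buffer];
}

// Two complete packages followed by a partial one that is still waiting for its delimiter.
[$packages, $rest] = extractPackages("hello\r\nquit\r\npart");
```

The remainder would be kept and prepended to the next chunk of received bytes, so a message split across reads is only handed over once its delimiter arrives.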

The heartbeat configuration can only be set on the main server and cannot be configured per Socket; each Socket inherits the heartbeat configuration of the main server.

For TCP sockets, the onConnect and onClose events are suppressed (not triggered) when Swoole's dispatch_mode is 1 or 3. To receive these two events, set dispatch_mode to 2, 4, or 5.

'swoole' => [
    //...
    'dispatch_mode' => 2,
    //...
],

Test.

TCP: telnet 127.0.0.1 5291

UDP: [Linux] echo "Hello LaravelS" > /dev/udp/127.0.0.1/5292

Registration examples for other protocols.

  • UDP
  • Http
  • WebSocket: The main server must turn on WebSocket, that is, set websocket.enable to true.

Coroutine

Swoole Coroutine

Warning: Code inside coroutines does not execute in a fixed order, so request-level data must be isolated by coroutine ID. Laravel/Lumen, however, relies on many singletons and static attributes, so data from different requests can affect each other, which is unsafe. For example, the database connection is a singleton, and the same database connection shares the same PDO resource. This is fine in synchronous blocking mode but does not work in asynchronous coroutine mode, where each query needs its own connection and its own IO state, which requires a connection pool.

DO NOT enable the coroutine; only custom processes can use coroutines.

Custom process

Custom processes let developers create special worker processes for monitoring, reporting, or other special tasks. Refer to addProcess.

Create a Process class that implements CustomProcessInterface.

namespace App\Processes;
use App\Tasks\TestTask;
use Hhxsv5\LaravelS\Swoole\Process\CustomProcessInterface;
use Hhxsv5\LaravelS\Swoole\Task\Task;
use Swoole\Coroutine;
use Swoole\Http\Server;
use Swoole\Process;
class TestProcess implements CustomProcessInterface
{
    /**
     * @var bool Quit tag for Reload updates
     */
    private static $quit = false;

    public static function callback(Server $swoole, Process $process)
    {
        // The callback method must not exit. If it exits, the Manager process automatically re-creates the process
        while (!self::$quit) {
            \Log::info('Test process: running');
            // sleep(1); // Swoole < 2.1
            Coroutine::sleep(1); // Swoole>=2.1: Coroutine & Runtime will be automatically enabled for callback().
            // Deliver a task in the custom process; the task's finish() callback is NOT supported.
            // Note: Modify task_ipc_mode to 1 or 2 in config/laravels.php, see https://www.swoole.co.uk/docs/modules/swoole-server/configuration
            $ret = Task::deliver(new TestTask('task data'));
            var_dump($ret);
            // The upper layer will catch the exception thrown in the callback and record it in the Swoole log, and then this process will exit. The Manager process will re-create the process after 3 seconds, so developers need to try/catch to catch the exception by themselves to avoid frequent process creation.
            // throw new \Exception('an exception');
        }
    }
    // Requirements: LaravelS >= v3.4.0 & callback() must be async non-blocking program.
    public static function onReload(Server $swoole, Process $process)
    {
        // Stop the process...
        // Then end process
        \Log::info('Test process: reloading');
        self::$quit = true;
        // $process->exit(0); // Force exit process
    }
    // Requirements: LaravelS >= v3.7.4 & callback() must be async non-blocking program.
    public static function onStop(Server $swoole, Process $process)
    {
        // Stop the process...
        // Then end process
        \Log::info('Test process: stopping');
        self::$quit = true;
        // $process->exit(0); // Force exit process
    }
}

Register TestProcess.

// Edit `config/laravels.php`
// ...
'processes' => [
    'test' => [ // Key name is process name
        'class'    => \App\Processes\TestProcess::class,
        'redirect' => false, // Whether to redirect stdin/stdout, true or false
        'pipe'     => 0,     // The type of pipeline, 0: no pipeline 1: SOCK_STREAM 2: SOCK_DGRAM
        'enable'   => true,  // Whether to enable, default true
        //'num'    => 3   // To create multiple processes of this class, default is 1
        //'queue'    => [ // Enable message queue as inter-process communication, configure empty array means use default parameters
        //    'msg_key'  => 0,    // The key of the message queue. Default: ftok(__FILE__, 1).
        //    'mode'     => 2,    // Communication mode, default is 2, which means contention mode
        //    'capacity' => 8192, // The length of a single message, is limited by the operating system kernel parameters. The default is 8192, and the maximum is 65536
        //],
        //'restart_interval' => 5, // After the process exits abnormally, how many seconds to wait before restarting the process, default 5 seconds
    ],
],

Note: callback() must not return. If it does, the Manager process will re-create the process.
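Because an uncaught exception ends callback() and triggers a restart, it helps to catch errors inside the loop itself. The sketch below shows the pattern in plain PHP, assuming a bounded loop so that it can run standalone; runGuarded() and its parameters are hypothetical names. In a real callback() the loop condition would be while (!self::$quit) and the error handler would typically write to the log.

```php
<?php
/**
 * Run $job $iterations times, catching anything it throws so the loop survives.
 * Returns the number of iterations that failed.
 * Hypothetical helper for illustration; not part of LaravelS.
 */
function runGuarded(callable $job, int $iterations, callable $onError): int
{
    $failures = 0;
    for ($i = 0; $i < $iterations; $i++) {
        try {
            $job($i);
        } catch (\Throwable $e) {
            // Record the failure and keep looping instead of letting the process exit.
            $failures++;
            $onError($e);
        }
    }
    return $failures;
}

$errors = [];
$failures = runGuarded(
    function (int $i) {
        // Simulate a job that fails on every odd iteration.
        if ($i % 2 === 1) {
            throw new \RuntimeException("job $i failed");
        }
    },
    4,
    function (\Throwable $e) use (&$errors) {
        $errors[] = $e->getMessage();
    }
);
```

Catching \Throwable rather than \Exception also covers PHP errors such as TypeError, which would otherwise end the process in the same way.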

Example: Write data to a custom process.

// config/laravels.php
'processes' => [
    'test' => [
        'class'    => \App\Processes\TestProcess::class,
        'redirect' => false,
        'pipe'     => 1,
    ],
],
// app/Processes/TestProcess.php
public static function callback(Server $swoole, Process $process)
{
    while ($data = $process->read()) {
        \Log::info('TestProcess: read data', [$data]);
        $process->write('TestProcess: ' . $data);
    }
}
// app/Http/Controllers/TestController.php
public function testProcessWrite()
{
    /**@var \Swoole\Process $process */
    $process = app('swoole')->customProcesses['test'];
    $process->write('TestController: write data' . time());
    var_dump($process->read());
}

Common components

Apollo

LaravelS will pull the Apollo configuration and write it to the .env file when starting. At the same time, LaravelS will start the custom process apollo to monitor the configuration and automatically reload when the configuration changes.

Enable Apollo: add --enable-apollo and Apollo parameters to the startup parameters.

php bin/laravels start --enable-apollo --apollo-server=http://127.0.0.1:8080 --apollo-app-id=LARAVEL-S-TEST

Support hot updates(optional).

// Edit `config/laravels.php`
'processes' => Hhxsv5\LaravelS\Components\Apollo\Process::getDefinition(),
// When there are other custom process configurations
'processes' => [
    'test' => [
        'class'    => \App\Processes\TestProcess::class,
        'redirect' => false,
        'pipe'     => 1,
    ],
    // ...
] + Hhxsv5\LaravelS\Components\Apollo\Process::getDefinition(),

List of available parameters.

| Parameter | Description | Default | Demo |
| --- | --- | --- | --- |
| apollo-server | Apollo server URL | - | --apollo-server=http://127.0.0.1:8080 |
| apollo-app-id | Apollo APP ID | - | --apollo-app-id=LARAVEL-S-TEST |
| apollo-namespaces | The namespace to which the APP belongs; multiple namespaces can be specified | application | --apollo-namespaces=application --apollo-namespaces=env |
| apollo-cluster | The cluster to which the APP belongs | default | --apollo-cluster=default |
| apollo-client-ip | IP of the current instance; can also be used for grayscale publishing | Local intranet IP | --apollo-client-ip=10.2.1.83 |
| apollo-pull-timeout | Timeout (in seconds) when pulling configuration | 5 | --apollo-pull-timeout=5 |
| apollo-backup-old-env | Whether to back up the old .env file when updating it | false | --apollo-backup-old-env |

Prometheus

Supports Prometheus monitoring and alerting, with Grafana for visualizing the monitoring metrics. Please refer to Docker Compose for setting up the Prometheus and Grafana environment.

Requires the APCu extension >= 5.0.0; install it with pecl install apcu.

Copy the configuration file prometheus.php to the config directory of your project. Modify the configuration as appropriate.

# Execute commands in the project root directory
cp vendor/hhxsv5/laravel-s/config/prometheus.php config/

If your project is Lumen, you also need to manually load the configuration $app->configure('prometheus'); in bootstrap/app.php.

Configure global middleware: Hhxsv5\LaravelS\Components\Prometheus\RequestMiddleware::class. To measure request time as accurately as possible, RequestMiddleware must be the first global middleware, placed before all other middleware.

Register ServiceProvider: Hhxsv5\LaravelS\Components\Prometheus\ServiceProvider::class.

Configure the CollectorProcess in config/laravels.php to collect the metrics of Swoole Worker/Task/Timer processes regularly.

'processes' => Hhxsv5\LaravelS\Components\Prometheus\CollectorProcess::getDefinition(),

Create the route to output metrics.

use Hhxsv5\LaravelS\Components\Prometheus\Exporter;

Route::get('/actuator/prometheus', function () {
    $result = app(Exporter::class)->render();
    return response($result, 200, ['Content-Type' => Exporter::REDNER_MIME_TYPE]);
});

Complete the configuration of Prometheus and start it.

global:
  scrape_interval: 5s
  scrape_timeout: 5s
  evaluation_interval: 30s
scrape_configs:
- job_name: laravel-s-test
  honor_timestamps: true
  metrics_path: /actuator/prometheus
  scheme: http
  follow_redirects: true
  static_configs:
  - targets:
    - 127.0.0.1:5200 # The ip and port of the monitored service
# Dynamically discovered using one of the supported service-discovery mechanisms
# https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
# - job_name: laravels-eureka
#   honor_timestamps: true
#   scrape_interval: 5s
#   metrics_path: /actuator/prometheus
#   scheme: http
#   follow_redirects: true
#   eureka_sd_configs:
#   - server: http://127.0.0.1:8080/eureka
#     follow_redirects: true
#     refresh_interval: 5s

Start Grafana, then import the panel JSON.

Grafana Dashboard

Other features

Configure Swoole events

Supported events:

| Event | Interface | When happened |
| --- | --- | --- |
| ServerStart | Hhxsv5\LaravelS\Swoole\Events\ServerStartInterface | Occurs when the Master process is starting, this event should not handle complex business logic, and can only do some simple work of initialization. |
| ServerStop | Hhxsv5\LaravelS\Swoole\Events\ServerStopInterface | Occurs when the server exits normally, CANNOT use async or coroutine related APIs in this event. |
| WorkerStart | Hhxsv5\LaravelS\Swoole\Events\WorkerStartInterface | Occurs after the Worker/Task process is started, and the Laravel initialization has been completed. |
| WorkerStop | Hhxsv5\LaravelS\Swoole\Events\WorkerStopInterface | Occurs after the Worker/Task process exits normally |
| WorkerError | Hhxsv5\LaravelS\Swoole\Events\WorkerErrorInterface | Occurs when an exception or fatal error occurs in the Worker/Task process |

1. Create an event class to implement the corresponding interface.

namespace App\Events;
use Hhxsv5\LaravelS\Swoole\Events\ServerStartInterface;
use Swoole\Atomic;
use Swoole\Http\Server;
class ServerStartEvent implements ServerStartInterface
{
    public function __construct()
    {
    }
    public function handle(Server $server)
    {
        // Initialize a global counter (available across processes)
        $server->atomicCount = new Atomic(2233);

        // Invoked in controller: app('swoole')->atomicCount->get();
    }
}

namespace App\Events;
use Hhxsv5\LaravelS\Swoole\Events\WorkerStartInterface;
use Swoole\Http\Server;
class WorkerStartEvent implements WorkerStartInterface
{
    public function __construct()
    {
    }
    public function handle(Server $server, $workerId)
    {
        // Initialize a database connection pool
        // DatabaseConnectionPool::init();
    }
}

2. Configuration.

// Edit `config/laravels.php`
'event_handlers' => [
    'ServerStart' => [\App\Events\ServerStartEvent::class], // Trigger events in array order
    'WorkerStart' => [\App\Events\WorkerStartEvent::class],
],

Serverless

Alibaba Cloud Function Compute

Function Compute.

1. Modify bootstrap/app.php and set the storage directory. The project directory is read-only; only the /tmp directory can be read and written.

$app->useStoragePath(env('APP_STORAGE_PATH', '/tmp/storage'));

2. Create a shell script named laravels_bootstrap and grant it executable permission.

#!/usr/bin/env bash
set +e

# Create storage-related directories
mkdir -p /tmp/storage/app/public
mkdir -p /tmp/storage/framework/cache
mkdir -p /tmp/storage/framework/sessions
mkdir -p /tmp/storage/framework/testing
mkdir -p /tmp/storage/framework/views
mkdir -p /tmp/storage/logs

# Set the environment variable APP_STORAGE_PATH, please make sure it's the same as APP_STORAGE_PATH in .env
export APP_STORAGE_PATH=/tmp/storage

# Start LaravelS
php bin/laravels start

3. Configure template.xml.

ROSTemplateFormatVersion: '2015-09-01'
Transform: 'Aliyun::Serverless-2018-04-03'
Resources:
  laravel-s-demo:
    Type: 'Aliyun::Serverless::Service'
    Properties:
      Description: 'LaravelS Demo for Serverless'
    fc-laravel-s:
      Type: 'Aliyun::Serverless::Function'
      Properties:
        Handler: laravels.handler
        Runtime: custom
        MemorySize: 512
        Timeout: 30
        CodeUri: ./
        InstanceConcurrency: 10
        EnvironmentVariables:
          BOOTSTRAP_FILE: laravels_bootstrap

Important notices

Singleton Issue

Under FPM mode, singleton instances are instantiated and recycled in every request: request start => instantiate instance => request end => recycle instance.

Under Swoole Server, all singleton instances are held in memory and outlive the request, unlike under FPM: request start => instantiate instance => request end => singleton instance is not recycled. Developers therefore need to maintain the state of singleton instances in every request.

Common solutions:

Write a XxxCleaner class to clean up the singleton object state. This class implements the interface Hhxsv5\LaravelS\Illuminate\Cleaners\CleanerInterface and is then registered in cleaners of laravels.php.

Reset the state of singleton instances with a middleware.

Re-register the ServiceProvider: add XxxServiceProvider to register_providers in laravels.php so that singleton instances are reinitialized in every request.
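The effect of a resident singleton can be reproduced in plain PHP. The sketch below is not LaravelS code: RequestContext, handleRequest(), and reset() are hypothetical names, and the reset() call only stands in for whatever a cleaner, middleware, or re-registered provider would do at the end of a request.

```php
<?php
// Hypothetical singleton that accumulates per-request state.
final class RequestContext
{
    private static $instance = null;
    private $tags = [];

    public static function instance(): self
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    public function addTag(string $tag): void
    {
        $this->tags[] = $tag;
    }

    public function tags(): array
    {
        return $this->tags;
    }

    /** Return the singleton to its initial state. */
    public function reset(): void
    {
        $this->tags = [];
    }
}

// Hypothetical request handler. In a resident process it runs many times against the same instance.
function handleRequest(string $tag, bool $cleanAfter): array
{
    $context = RequestContext::instance();
    $context->addTag($tag);
    $seen = $context->tags();
    if ($cleanAfter) {
        $context->reset(); // what a cleaner or middleware would do after the request
    }
    return $seen;
}

$first  = handleRequest('a', false); // no cleanup: state stays behind
$second = handleRequest('b', true);  // still sees the leftover 'a' from the first request
$third  = handleRequest('c', true);  // starts clean because the second request reset the state
```

Under FPM the leak in the second call could not happen, because the process state is discarded between requests.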

Cleaners

Configuration cleaners.

Known issues

Known issues: a package of known issues and solutions.

Debugging method

Logging: to output to the console, use the stderr channel, e.g. Log::channel('stderr')->debug('debug message').

Laravel Dump Server (integrated by default in Laravel 5.7).

Read request

Read the request through the Illuminate\Http\Request object. $_ENV is readable and $_SERVER is partially readable, but you CANNOT USE $_GET/$_POST/$_FILES/$_COOKIE/$_REQUEST/$_SESSION/$GLOBALS.

public function form(\Illuminate\Http\Request $request)
{
    $name = $request->input('name');
    $all = $request->all();
    $sessionId = $request->cookie('sessionId');
    $photo = $request->file('photo');
    // Call getContent() to get the raw POST body, instead of file_get_contents('php://input')
    $rawContent = $request->getContent();
    //...
}

Output response

Respond with the Illuminate\Http\Response object. echo/var_dump()/print_r() are compatible, but you CANNOT USE the functions dd()/exit()/die()/header()/setcookie()/http_response_code().

public function json()
{
    return response()->json(['time' => time()])->header('header1', 'value1')->withCookie('c1', 'v1');
}

Persistent connection

Singleton connections are resident in memory; it is recommended to turn on persistent connections for better performance.

Database connection: it reconnects automatically and immediately after a disconnect.

// config/database.php
'connections' => [
    'my_conn' => [
        'driver'    => 'mysql',
        'host'      => env('DB_MY_CONN_HOST', 'localhost'),
        'port'      => env('DB_MY_CONN_PORT', 3306),
        'database'  => env('DB_MY_CONN_DATABASE', 'forge'),
        'username'  => env('DB_MY_CONN_USERNAME', 'forge'),
        'password'  => env('DB_MY_CONN_PASSWORD', ''),
        'charset'   => 'utf8mb4',
        'collation' => 'utf8mb4_unicode_ci',
        'prefix'    => '',
        'strict'    => false,
        'options'   => [
            // Enable persistent connection
            \PDO::ATTR_PERSISTENT => true,
        ],
    ],
],

Redis connection: it does not reconnect automatically right after a disconnect. An exception about the lost connection is thrown, and the connection is re-established on the next use. Make sure the correct DB is selected (SELECT DB) before every Redis operation.

// config/database.php
'redis' => [
    'client' => env('REDIS_CLIENT', 'phpredis'), // It is recommended to use phpredis for better performance.
    'default' => [
        'host'       => env('REDIS_HOST', 'localhost'),
        'password'   => env('REDIS_PASSWORD', null),
        'port'       => env('REDIS_PORT', 6379),
        'database'   => 0,
        'persistent' => true, // Enable persistent connection
    ],
],

About memory leaks

Avoid using global variables. If necessary, please clean or reset them manually.

Infinitely appending elements to a static/global variable will lead to OOM (Out of Memory).

class Test
{
    public static $array = [];
    public static $string = '';
}

// Controller
public function test(Request $req)
{
    // These static members grow on every request and eventually exhaust memory
    Test::$array[] = $req->input('param1');
    Test::$string .= $req->input('param2');
}
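If request data really has to be kept in a static member, one way to keep memory bounded is to cap the collection. The following is a minimal sketch in plain PHP; RecentParams and MAX_ITEMS are hypothetical names, and the limit of 100 is an arbitrary value chosen for illustration.

```php
<?php
// Hypothetical bounded buffer: keeps only the most recent MAX_ITEMS values.
final class RecentParams
{
    const MAX_ITEMS = 100;

    private static $items = [];

    public static function push($value): void
    {
        self::$items[] = $value;
        if (count(self::$items) > self::MAX_ITEMS) {
            // Drop the oldest entries so the array never exceeds MAX_ITEMS.
            self::$items = array_slice(self::$items, -self::MAX_ITEMS);
        }
    }

    public static function all(): array
    {
        return self::$items;
    }
}

// Simulate 250 requests, each appending one value.
for ($i = 0; $i < 250; $i++) {
    RecentParams::push($i);
}
$kept = RecentParams::all();
```

The array stops growing once it reaches the cap, so the worker's memory use for this buffer stays flat no matter how many requests it serves.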

Memory leak detection method

1. Modify config/laravels.php: set worker_num=1 and max_request=1000000. Remember to change them back after the test.

2. Add a route /debug-memory-leak without route middleware to observe the memory changes of the Worker process.

3. Start LaravelS and request /debug-memory-leak until diff_mem is less than or equal to zero. If diff_mem is always greater than zero, there may be a memory leak in the global middleware or the Laravel framework.

4. After completing Step 3, alternately request the business routes and /debug-memory-leak (ab/wrk is recommended for making a large number of requests to the business routes). The initial increase in memory is normal. After a large number of requests to the business routes, if diff_mem is always greater than zero and curr_mem keeps increasing, a memory leak is highly probable; if curr_mem stays within a certain range and does not keep increasing, a memory leak is unlikely.

5. If you still can't solve it, max_request is the last guarantee.
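The steps above rely on the values curr_mem and diff_mem but do not show how they are produced. One possible way to compute them is sketched below in plain PHP, assuming the /debug-memory-leak route simply returns this array; memorySnapshot() and the prev_mem key are hypothetical names, and the route wiring is left out so the snippet runs on its own.

```php
<?php
/**
 * Report the current memory usage and its change since the previous call.
 * Hypothetical helper for illustration; the static variable survives between
 * calls only because the worker process stays resident.
 */
function memorySnapshot(): array
{
    static $previous = 0;

    $current = memory_get_usage();
    $stats = [
        'prev_mem' => $previous,
        'curr_mem' => $current,
        'diff_mem' => $current - $previous,
    ];
    $previous = $current;

    return $stats;
}

$first  = memorySnapshot(); // first call: prev_mem is 0, so diff_mem equals curr_mem
$second = memorySnapshot(); // later calls: diff_mem is the growth since the previous call
```

With worker_num set to 1, every request reaches the same process, so consecutive calls compare like with like.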

Linux kernel parameter adjustment

Linux kernel parameter adjustment

Pressure test

Pressure test

Alternatives

License

MIT

Author: hhxsv5
Source Code: https://github.com/hhxsv5/laravel-s
License: MIT License

#php #laravel