chris morris

1611581830

How Much Does it Cost to Create ERC20 Token? | Token Creation Services

Ethereum remains the second-largest blockchain, and ERC20 tokens are a lot more beneficial if you are creating tokens on Ethereum. Check out this blog post to learn the cost of creating ERC20 tokens - https://bit.ly/36choKN

#costtocreateerc20token #createerc20token #erc20tokendevelopmentcost

Easter Deckow

1655630160

PyTumblr: A Python Tumblr API v2 Client

PyTumblr

Installation

Install via pip:

$ pip install pytumblr

Install from source:

$ git clone https://github.com/tumblr/pytumblr.git
$ cd pytumblr
$ python setup.py install

Usage

Create a client

A pytumblr.TumblrRestClient is the object you'll make all of your calls to the Tumblr API through. Creating one is this easy:

client = pytumblr.TumblrRestClient(
    '<consumer_key>',
    '<consumer_secret>',
    '<oauth_token>',
    '<oauth_secret>',
)

client.info() # Grabs the current user information

A few easy ways to get your credentials are:

  1. The built-in interactive_console.py tool (if you already have a consumer key & secret)
  2. The Tumblr API console at https://api.tumblr.com/console
  3. Get sample login code at https://api.tumblr.com/console/calls/user/info

Supported Methods

User Methods

client.info() # get information about the authenticating user
client.dashboard() # get the dashboard for the authenticating user
client.likes() # get the likes for the authenticating user
client.following() # get the blogs followed by the authenticating user

client.follow('codingjester.tumblr.com') # follow a blog
client.unfollow('codingjester.tumblr.com') # unfollow a blog

client.like(id, reblogkey) # like a post
client.unlike(id, reblogkey) # unlike a post

Blog Methods

client.blog_info(blogName) # get information about a blog
client.posts(blogName, **params) # get posts for a blog
client.avatar(blogName) # get the avatar for a blog
client.blog_likes(blogName) # get the likes on a blog
client.followers(blogName) # get the followers of a blog
client.blog_following(blogName) # get the publicly exposed blogs that [blogName] follows
client.queue(blogName) # get the queue for a given blog
client.submission(blogName) # get the submissions for a given blog

Post Methods

Creating posts

PyTumblr lets you create every post type that Tumblr supports. A few default options can be used with any post type.

The default options are described below.

  • state - a string, the state of the post. Supported values are published, draft, queue, private
  • tags - a list, a list of strings that you want tagged on the post. eg: ["testing", "magic", "1"]
  • tweet - a string, the string of the customized tweet you want. eg: "Man I love my mega awesome post!"
  • date - a string, the customized GMT date that you want
  • format - a string, the format that your post is in. Supported values are html or markdown
  • slug - a string, the slug for the url of the post you want

We'll use these default options throughout the examples while showcasing all the specific post types.
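As a rough illustration of the default options above, the sketch below validates them before they are handed to one of the create_* calls. The helper name build_post_options and its error messages are hypothetical and not part of PyTumblr; only the option names and allowed values come from the list above.

```python
# Hypothetical helper (not part of PyTumblr): checks the default post options
# described above before they are passed to a client.create_* call.
ALLOWED_STATES = {"published", "draft", "queue", "private"}
ALLOWED_FORMATS = {"html", "markdown"}


def build_post_options(state="published", tags=None, format="html", slug=None):
    """Return a dict of validated keyword arguments for a create_* call."""
    if state not in ALLOWED_STATES:
        raise ValueError(f"unsupported state: {state!r}")
    if format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {format!r}")
    if tags is not None and not isinstance(tags, list):
        # Tags must be a list of strings, not a comma-separated string.
        raise TypeError("tags must be a list of strings")

    options = {"state": state, "format": format}
    if tags:
        options["tags"] = tags
    if slug:
        options["slug"] = slug
    return options


options = build_post_options(state="draft", tags=["testing", "magic"], slug="my-post")
print(options)
```

The resulting dict could then be unpacked into a call such as client.create_text(blogName, title="Testing", body="...", **options).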

Creating a photo post

Creating a photo post supports a bunch of different options plus the described default options:

  • caption - a string, the user supplied caption
  • link - a string, the "click-through" url for the photo
  • source - a string, the url for the photo you want to use (use this or the data parameter)
  • data - a list or string, a list of filepaths or a single file path for multipart file upload

#Creates a photo post using a source URL
client.create_photo(blogName, state="published", tags=["testing", "ok"],
                    source="https://68.media.tumblr.com/b965fbb2e501610a29d80ffb6fb3e1ad/tumblr_n55vdeTse11rn1906o1_500.jpg")

#Creates a photo post using a local filepath
client.create_photo(blogName, state="queue", tags=["testing", "ok"],
                    tweet="Woah this is an incredible sweet post [URL]",
                    data="/Users/johnb/path/to/my/image.jpg")

#Creates a photoset post using several local filepaths
client.create_photo(blogName, state="draft", tags=["jb is cool"], format="markdown",
                    data=["/Users/johnb/path/to/my/image.jpg", "/Users/johnb/Pictures/kittens.jpg"],
                    caption="## Mega sweet kittens")

Creating a text post

Creating a text post supports the same options as default and just two other parameters:

  • title - a string, the optional title for the post. Supports markdown or html
  • body - a string, the body of the post. Supports markdown or html

#Creating a text post
client.create_text(blogName, state="published", slug="testing-text-posts", title="Testing", body="testing1 2 3 4")

Creating a quote post

Creating a quote post supports the same options as default and two other parameters:

  • quote - a string, the full text of the quote. Supports markdown or html
  • source - a string, the cited source. HTML supported

#Creating a quote post
client.create_quote(blogName, state="queue", quote="I am the Walrus", source="Ringo")

Creating a link post

Creating a link post supports the same options as default and three other parameters:

  • title - a string, the title of the post that you want. Supports HTML entities.
  • url - a string, the url that you want to create a link post for.
  • description - a string, the description of the link that you have

#Create a link post
client.create_link(blogName, title="I like to search things, you should too.", url="https://duckduckgo.com",
                   description="Search is pretty cool when a duck does it.")

Creating a chat post

Creating a chat post supports the same options as default and two other parameters:

  • title - a string, the title of the chat post
  • conversation - a string, the text of the conversation/chat, with dialogue labels (no html)

#Create a chat post
chat = """John: Testing can be fun!
Renee: Testing is tedious and so are you.
John: Aw.
"""
client.create_chat(blogName, title="Renee just doesn't understand.", conversation=chat, tags=["renee", "testing"])

Creating an audio post

Creating an audio post allows for all default options and has three other parameters. The only thing to keep in mind while dealing with audio posts is to make sure that you use either the external_url parameter or data. You cannot use both at the same time.

  • caption - a string, the caption for your post
  • external_url - a string, the url of the site that hosts the audio file
  • data - a string, the filepath of the audio file you want to upload to Tumblr

#Creating an audio file
client.create_audio(blogName, caption="Rock out.", data="/Users/johnb/Music/my/new/sweet/album.mp3")

#lets use soundcloud!
client.create_audio(blogName, caption="Mega rock out.", external_url="https://soundcloud.com/skrillex/sets/recess")
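The either/or rule for external_url and data can be checked before calling the client. The sketch below is one way to do that; audio_source_kwargs is a hypothetical helper, not a PyTumblr function.

```python
def audio_source_kwargs(external_url=None, data=None):
    """Return exactly one of external_url / data as keyword arguments."""
    # Both set, or both missing, are rejected: exactly one source is required.
    if (external_url is None) == (data is None):
        raise ValueError("pass exactly one of external_url or data")
    if external_url is not None:
        return {"external_url": external_url}
    return {"data": data}


kwargs = audio_source_kwargs(external_url="https://soundcloud.com/skrillex/sets/recess")
print(kwargs)
```

The returned dict could be unpacked into client.create_audio(blogName, caption="Rock out.", **kwargs).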

Creating a video post

Creating a video post allows for all default options and has three other options. Like the other post types, it has some restrictions. You cannot use the embed and data parameters at the same time.

  • caption - a string, the caption for your post
  • embed - a string, the HTML embed code for the video
  • data - a string, the path of the file you want to upload

#Creating an upload from YouTube
client.create_video(blogName, caption="Jon Snow. Mega ridiculous sword.",
                    embed="http://www.youtube.com/watch?v=40pUYLacrj4")

#Creating a video post from local file
client.create_video(blogName, caption="testing", data="/Users/johnb/testing/ok/blah.mov")

Editing a post

Updating a post requires knowing what type of post you're updating. You can supply any of the options given above when updating the post.

client.edit_post(blogName, id=post_id, type="text", title="Updated")
client.edit_post(blogName, id=post_id, type="photo", data="/Users/johnb/mega/awesome.jpg")

Reblogging a Post

Reblogging a post just requires knowing the post id and the reblog key, which is supplied in the JSON of any post object.

client.reblog(blogName, id=125356, reblog_key="reblog_key")

Deleting a post

Deleting just requires that you own the post and have the post id

client.delete_post(blogName, 123456) # Deletes your post :(

A note on tags: When passing tags, as params, please pass them as a list (not a comma-separated string):

client.create_text(blogName, tags=['hello', 'world'], ...)

Getting notes for a post

In order to get the notes for a post, you need to have the post id and the blog that it is on.

data = client.notes(blogName, id='123456')

The results include a timestamp you can use to make future calls.

data = client.notes(blogName, id='123456', before_timestamp=data["_links"]["next"]["query_params"]["before_timestamp"])
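Repeating that call until no next link is returned walks through all the notes of a post. The generator below is a sketch under assumptions: it expects each response to carry its notes under a "notes" key and to omit _links.next on the last page. The names iter_notes and FakeClient (a stand-in used to exercise the loop without network access) are hypothetical and not part of PyTumblr.

```python
def iter_notes(client, blog_name, post_id):
    """Yield notes page by page, following the before_timestamp cursor."""
    params = {}
    while True:
        data = client.notes(blog_name, id=post_id, **params)
        # Assumption: the notes for this page live under the "notes" key.
        yield from data.get("notes", [])
        next_link = data.get("_links", {}).get("next")
        if not next_link:
            break
        params = {"before_timestamp": next_link["query_params"]["before_timestamp"]}


class FakeClient:
    """Stand-in for pytumblr.TumblrRestClient that serves two canned pages."""

    def notes(self, blog_name, id, before_timestamp=None):
        if before_timestamp is None:
            return {
                "notes": [{"type": "like"}, {"type": "reblog"}],
                "_links": {"next": {"query_params": {"before_timestamp": "100"}}},
            }
        return {"notes": [{"type": "like"}]}


all_notes = list(iter_notes(FakeClient(), "example-blog", "123456"))
print(len(all_notes))  # 3 with the canned pages above
```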

Tagged Methods

# get posts with a given tag
client.tagged(tag, **params)

Using the interactive console

This client comes with a nice interactive console to run you through the OAuth process, grab your tokens (and store them for future use).

You'll need pyyaml installed to run it, but then it's just:

$ python interactive_console.py

and away you go! Tokens are stored in ~/.tumblr and are also shared by other Tumblr API clients like the Ruby client.

Running tests

The tests (and coverage reports) are run with nose, like this:

python setup.py test

Author: tumblr
Source Code: https://github.com/tumblr/pytumblr
License: Apache-2.0 license

#python #api 

How much does it cost to create an online learning app?

Are you thinking of launching an E-learning app in the market?

Then you first need to understand the concept of E-learning in more detail, the types of E-learning apps, and the market demand for them.

Today, every industry relies on technology to maximize its profits, as people love to use technology to fulfill their basic requirements. Every industry now provides online services via web apps or mobile apps.

What are the basic features an E-learning app contains?

Features list for a learner panel:

  • Easy registration and login module for learners.
  • Easy navigation to the courses and study material.
  • Learners can search for courses by applying various filters.
  • Learners are notified whenever a new course is added to the platform.
  • Learners can purchase courses through online payment.
  • Learners can access quizzes and mock tests.
  • Learners can post questions and answers.
  • Learners can chat directly with the tutor to clear doubts.
  • Learners can check their history or list of purchased courses.
  • Learners can easily track their progress through reports generated in-app.

Features list for a tutor panel:

  • Tutors can easily set up and manage their accounts.
  • Tutors can easily update or modify their uploaded courses.
  • Tutors are notified whenever a learner posts a question.
  • Tutors can manage the payment module.
  • Tutors can clear learners' doubts through the chat module.

Features list for an admin panel:

  • Manage learner data.
  • Manage tutor data.
  • Manage the courses.
  • Manage and define the categories or subcategories of courses.
  • Manage the premium and subscription packages.
  • Manage payments.
  • Manage the chats and discussion forum.
  • Content management system.
  • Generate reports and perform analysis.

What are the factors on which the cost of the E-learning app depends?

The cost of an E-learning app depends on several factors. Let me list the factors affecting the cost:

  • The UI/UX design of the app.
  • The size of the app.
  • The features or functionality you want to add to your E-learning app.
  • The platform chosen for development, which can be Android, iOS, or both.

How much does it cost to develop an E-learning app?

As discussed, the cost of an E-learning app depends heavily on these factors. We at AppClues Infotech are a leading app development company. We help you develop an E-learning app by providing the best solution and a unique UI/UX design.

You can hire our experienced and expert Android and iOS developers.

Here is the approximate timeline and cost of developing an E-learning app:

Timeline:

  • App Design:- 7 Working Days
  • Android App Development:- 25 Working Days
  • iOS App Development:- 25 Working Days
  • Web Backend & APIs:- 30 Working Days
  • Testing, Bug fixing, and Deployment:- 5 Working Days

Costing:
The approximate cost of developing an E-learning app is $30,000-$70,000.

#how much does it cost to develop an e-learning app #how much does it cost to create e-learning #how much does it cost to develop a educational app #how much does cost to make an e-learning app #how to create an educational app #e-learning mobile app development cost and features

Tamale Moses

1669003576

Exploring Mutable and Immutable in Python

In this Python article, let's learn about Mutable and Immutable in Python. 

Mutable and Immutable in Python

Mutable is a fancy way of saying that the internal state of the object is changed/mutated. So, the simplest definition is: An object whose internal state can be changed is mutable. On the other hand, immutable doesn’t allow any change in the object once it has been created.

Both of these states are integral to Python data structure. If you want to become more knowledgeable in the entire Python Data Structure, take this free course which covers multiple data structures in Python including tuple data structure which is immutable. You will also receive a certificate on completion which is sure to add value to your portfolio.

Mutable Definition

Mutable is when something is changeable or has the ability to change. In Python, ‘mutable’ is the ability of objects to change their values. These are often the objects that store a collection of data.

Immutable Definition

Immutable means that no change is possible over time. In Python, if the value of an object cannot be changed over time, then it is known as immutable. Once created, the value of these objects is permanent.

List of Mutable and Immutable objects

Objects of built-in type that are mutable are:

  • Lists
  • Sets
  • Dictionaries
  • User-Defined Classes (It purely depends upon the user to define the characteristics) 

Objects of built-in type that are immutable are:

  • Numbers (Integer, Rational, Float, Decimal, Complex & Booleans)
  • Strings
  • Tuples
  • Frozen Sets
  • User-Defined Classes (It purely depends upon the user to define the characteristics)

Object mutability is one of the core characteristics of Python's data model. Though mutable and immutable in Python is a very basic concept, it can at times be a little confusing due to the non-transitive nature of immutability.

Objects in Python

In Python, everything is treated as an object. Every object has these three attributes:

  • Identity – This refers to the address that the object refers to in the computer’s memory.
  • Type – This refers to the kind of object that is created. For example- integer, list, string etc. 
  • Value – This refers to the value stored by the object. For example – List=[1,2,3] would hold the numbers 1,2 and 3

While the ID and type of an object cannot be changed once it is created, the value can be changed for mutable objects.

Check out this free python certificate course to get started with Python.

Mutable Objects in Python

I believe, rather than diving deep into the theory aspects of mutable and immutable in Python, a simple code would be the best way to depict what it means in Python. Hence, let us discuss the below code step-by-step:

#Creating a list which contains name of Indian cities  

cities = ['Delhi', 'Mumbai', 'Kolkata']

# Printing the elements from the list cities, separated by a comma & space

for city in cities:
    print(city, end=', ')

Output [1]: Delhi, Mumbai, Kolkata,

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(cities)))

Output [2]: 0x1691d7de8c8

#Adding a new city to the list cities

cities.append('Chennai')

#Printing the elements from the list cities, separated by a comma & space

for city in cities:
    print(city, end=', ')

Output [3]: Delhi, Mumbai, Kolkata, Chennai,

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(cities)))

Output [4]: 0x1691d7de8c8

The above example shows us that we were able to change the internal state of the object ‘cities’ by adding one more city ‘Chennai’ to it, yet, the memory address of the object did not change. This confirms that we did not create a new object, rather, the same object was changed or mutated. Hence, we can say that the object which is a type of list with reference variable name ‘cities’ is a MUTABLE OBJECT.

Let us now discuss the term IMMUTABLE. Considering that we understood what mutable stands for, it is obvious that the definition of immutable will have ‘NOT’ included in it. Here is the simplest definition of immutable– An object whose internal state can NOT be changed is IMMUTABLE.

If you pay attention to the error messages that Python raises, you can identify the immutable objects. For instance, consider the code below and the associated error message raised while trying to change the value of a tuple at index 0.

#Creating a Tuple with variable name ‘foo’

foo = (1, 2)

#Changing the index[0] value from 1 to 3

foo[0] = 3
	
TypeError: 'tuple' object does not support item assignment 

Immutable Objects in Python

Once again, a simple code would be the best way to depict what immutable stands for. Hence, let us discuss the below code step-by-step:

#Creating a Tuple which contains English name of weekdays

weekdays = 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'

# Printing the elements of tuple weekdays

print(weekdays)

Output [1]: ('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday')

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(weekdays)))

Output [2]: 0x1691cc35090

#tuples are immutable, so you cannot add new elements; instead, we use the + operator to merge tuples and add a new imaginary day to the tuple 'weekdays'

weekdays += 'Pythonday',

#Printing the elements of tuple weekdays

print(weekdays)

Output [3]: ('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Pythonday')

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(weekdays)))

Output [4]: 0x1691cc8ad68

The above example shows that the same variable name can end up referencing a different object. The ID (memory location) of the old and the new tuple is not the same: we were not able to change the internal state of the original seven-element tuple. Instead, the Python interpreter created a new object in memory, and the variable name ‘weekdays’ started referencing the new object with eight elements in it. Hence, we can say that the tuple referenced by the variable name ‘weekdays’ is an IMMUTABLE OBJECT.

Where can you use mutable and immutable objects:

Mutable objects can be used where you want to allow for any updates. For example, you have a list of employee names in your organizations, and that needs to be updated every time a new member is hired. You can create a mutable list, and it can be updated easily.

Immutability offers a lot of useful applications to different sensitive tasks we do in a network centred environment where we allow for parallel processing. By creating immutable objects, you seal the values and ensure that no threads can invoke overwrite/update to your data. This is also useful in situations where you would like to write a piece of code that cannot be modified. For example, a debug code that attempts to find the value of an immutable object.
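One everyday consequence of immutability is that immutable built-ins such as tuples can be used as dictionary keys, while lists cannot. The snippet below is a small illustration; the variable names and coordinates are made up for the example.

```python
# Tuples are hashable (when their elements are), so they can be dictionary keys.
office_locations = {(28.61, 77.21): "Delhi", (19.08, 72.88): "Mumbai"}
print(office_locations[(28.61, 77.21)])

# Lists are mutable and therefore unhashable; using one as a key fails.
try:
    office_locations[[13.08, 80.27]] = "Chennai"
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)

print(list_key_error)
```

Because a key can never change after insertion, the dictionary can rely on its hash staying valid for as long as the entry exists.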

Watch outs:  Non transitive nature of Immutability:

OK! Now we do understand what mutable & immutable objects in Python are. Let’s go ahead and discuss the combination of these two and explore the possibilities. Let’s discuss, as to how will it behave if you have an immutable object which contains the mutable object(s)? Or vice versa? Let us again use a code to understand this behaviour–

#creating a tuple (immutable object) which contains 2 lists (mutable) as its elements

#The elements (lists) contains the name, age & gender 

person = (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the tuple

print(person)

Output [1]: (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(person)))

Output [2]: 0x1691ef47f88

#Changing the age for the 1st element. Selecting 1st element of tuple by using indexing [0] then 2nd element of the list by using indexing [1] and assigning a new value for age as 4

person[0][1] = 4

#printing the updated tuple

print(person)

Output [3]: (['Ayaan', 4, 'Male'], ['Aaradhya', 8, 'Female'])

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(person)))

Output [4]: 0x1691ef47f88

In the above code, you can see that the object ‘person’ is immutable since it is a tuple. However, it has two lists as its elements, and we can change the state of those lists (lists being mutable). So, here we did not change the object references inside the tuple, but a referenced object was mutated.

Same way, let’s explore how it will behave if you have a mutable object which contains an immutable object? Let us again use a code to understand the behaviour–

#creating a list (mutable object) which contains tuples (immutable) as its elements

list1 = [(1, 2, 3), (4, 5, 6)]

#printing the list

print(list1)

Output [1]: [(1, 2, 3), (4, 5, 6)]

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(list1)))

Output [2]: 0x1691d5b13c8	

#changing object reference at index 0

list1[0] = (7, 8, 9)

#printing the list

print(list1)

Output [3]: [(7, 8, 9), (4, 5, 6)]

#printing the location of the object created in the memory address in hexadecimal format

print(hex(id(list1)))

Output [4]: 0x1691d5b13c8

As an individual, it completely depends upon you and your requirements as to what kind of data structure you would like to create with a combination of mutable & immutable objects. I hope that this information will help you while deciding the type of object you would like to select going forward.

Before I end our discussion on IMMUTABILITY, allow me to add a CAVEAT regarding strings and integers. There is an exception, and you may see some surprising results when checking object identity for immutable types. For instance:

#creating an object of integer type with value 10 and reference variable name ‘x’

x = 10
 

#printing the value of ‘x’

print(x)

Output [1]: 10

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(x)))

Output [2]: 0x538fb560

#creating an object of integer type with value 10 and reference variable name ‘y’

y = 10

#printing the value of ‘y’

print(y)

Output [3]: 10

#Printing the location of the object created in the memory address in hexadecimal format

print(hex(id(y)))

Output [4]: 0x538fb560

Based on our discussion so far, you might expect x and y to have different memory addresses, since each assignment appears to create a new integer object. However, as shown in the above code, both names have the same memory address. This does not contradict immutability: CPython caches small integers (from -5 to 256) and reuses the same object for each of them, which is safe precisely because integers cannot be mutated.
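The shared address is usually explained by an implementation detail: CPython keeps a cache of small integers (commonly documented as -5 through 256) and hands out the same object each time. The sketch below assumes CPython; other interpreters may behave differently, and larger integers may or may not be shared, so no output is claimed for the last line.

```python
x = 10
y = 10

# Both names refer to the same cached int object in CPython.
same_small_int = x is y
print(same_small_int)

# Rebinding x does not mutate the int 10; x simply points to another object.
old_id = id(x)
x = x + 1
rebound_to_new_object = id(x) != old_id
print(rebound_to_new_object)
print(y)  # y still refers to 10

# Integers built at runtime outside the small-int range are not guaranteed
# to be shared, so identity checks on them should not be relied upon.
a = int("1000")
b = int("1000")
print(a == b, a is b)
```

Comparing values with == rather than identities with is avoids depending on this caching behaviour.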

Immutability of Tuple

Tuples are immutable and hence cannot be changed once they are created in Python. They support the same read-only sequence operations as strings, which are also immutable: the index operator selects an element from a tuple just as it selects a character from a string, but neither type supports item assignment.

Exceptions in immutability

As with most rules, there are exceptions to immutability in Python too. Not all immutable objects are really immutable. This may raise some doubts, so let us take an example to understand it.

Consider a tuple ‘tup’.

Now, if we consider the tuple tup = ('GreatLearning', [4, 3, 1, 2]):

We see that the tuple has elements of different data types. The first element is a string, which is immutable. The second element is a list, which is mutable. The tuple itself is an immutable data type, so it cannot change which objects it contains. But the list inside it can change its contents. So, the references held by an immutable object cannot be changed, but the mutable objects it refers to can still change their values.
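A short sketch of the tuple described above makes the distinction concrete. The variable names tuple_id and rebind_error are introduced here only for illustration.

```python
tup = ("GreatLearning", [4, 3, 1, 2])
tuple_id = id(tup)

# The list inside the tuple can still be mutated in place.
tup[1].sort()
tup[1].append(5)
print(tup)

# The tuple itself is the same object; only its element's contents changed.
print(id(tup) == tuple_id)

# Rebinding a slot of the tuple is what immutability forbids.
try:
    tup[1] = [9, 9, 9]
    rebind_error = None
except TypeError as exc:
    rebind_error = str(exc)
print(rebind_error)
```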

FAQs

1. Difference between mutable vs immutable in Python?

Mutable Object | Immutable Object
State of the object can be modified after it is created. | State of the object can’t be modified once it is created.
They are not thread safe. | They are thread safe.
Mutable classes are not final. | It is important to make the class final before creating an immutable object.

2. What are the mutable and immutable data types in Python?

  • Some mutable data types in Python are:

list, dictionary, set, user-defined classes.

  • Some immutable data types are: 

int, float, decimal, bool, string, tuple, range.

3. Are lists mutable in Python?

Lists in Python are mutable data types as the elements of the list can be modified, individual elements can be replaced, and the order of elements can be changed even after the list has been created.
(Examples related to lists have been discussed earlier in this blog.)

4. Why are tuples called immutable types?

Tuple and list data structures are very similar, but one big difference between the data types is that lists are mutable, whereas tuples are immutable. The reason for the tuple’s immutability is that once the elements are added to the tuple and the tuple has been created; it remains unchanged.

A programmer would always prefer building a code that can be reused instead of making the whole data object again. Still, even though tuples are immutable, like lists, they can contain any Python object, including mutable objects.

5. Are sets mutable in Python?

A set is an iterable, unordered collection that can be used to perform mathematical set operations (like union, intersection, difference etc.). Every element in a set is unique and must be hashable (in practice, immutable), i.e. no duplicate values are allowed, and the stored values can’t be changed in place. However, we can add or remove items from the set as the set itself is mutable.
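The following sketch shows both halves of that statement: the set can grow and shrink in place, while unhashable elements such as lists are rejected. The names used are illustrative only.

```python
langs = {"python", "go"}
set_id = id(langs)

langs.add("rust")       # the set itself is mutable
langs.discard("go")
print(sorted(langs))
print(id(langs) == set_id)

# Elements must be hashable, so a list cannot be stored in a set.
try:
    langs.add(["java", "kotlin"])
    added_list = True
except TypeError:
    added_list = False
print(added_list)

# A frozenset is the immutable counterpart and has no add() method.
frozen = frozenset(langs)
print(hasattr(frozen, "add"))
```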

6. Are strings mutable in Python?

Strings are not mutable in Python. A string is an immutable data type, which means that its value cannot be updated in place.
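A minimal check of this behaviour, with variable names chosen just for the example:

```python
greeting = "hello"
original_id = id(greeting)

# Item assignment is rejected because str is immutable.
try:
    greeting[0] = "J"
    assigned = True
except TypeError:
    assigned = False
print(assigned)

# String methods return new strings and leave the original untouched.
shouted = greeting.upper()
print(greeting, shouted)
print(id(shouted) == original_id)
```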

Join Great Learning Academy’s free online courses and upgrade your skills today.


Original article source at: https://www.mygreatlearning.com

#python 

How to Scrape Data from Twitter Using Tweepy and Snscrape

If you are a data enthusiast, you will probably agree that one of the richest sources of real-world data is social media. Sites like Twitter are full of data.

You can use the data you get from social media in several ways, such as sentiment analysis (analyzing people's thoughts) on a specific subject or field of interest.

There are several ways to scrape (or collect) data from Twitter. In this article, we will look at two of them: using Tweepy and Snscrape.

We will learn a method for scraping public conversations about a specific trending topic, as well as tweets from a specific user.

Now, without further ado, let's get started.

Tweepy vs Snscrape: Introduction to Our Scraping Tools

Before we get into the implementation of each tool, let's try to understand the differences and limits of each one.

Tweepy

Tweepy is a Python library for integrating with the Twitter API. Because Tweepy is connected to the Twitter API, you can perform complex queries in addition to scraping tweets. It lets you take advantage of all the capabilities of the Twitter API.

But there are some drawbacks, such as the fact that its standard API only lets you collect tweets from up to a week back (that is, Tweepy does not allow retrieval of tweets beyond a one-week window, so historical data retrieval is not permitted).

Also, there are limits on how many tweets you can retrieve from a user's account. You can read more about Tweepy's features in the Tweepy documentation.

Snscrape

Snscrape is another approach for scraping information from Twitter that does not require the use of an API. Snscrape lets you scrape basic information such as a user's profile, tweet content, source, and so on.

Snscrape is not limited to Twitter; it can also scrape content from other prominent social networks such as Facebook, Instagram, and others.

Its advantages are that there are no limits on the number of tweets you can retrieve or on the tweet window (that is, the date range of the tweets). So Snscrape lets you retrieve old data.

Its one disadvantage is that it lacks all the other features of Tweepy. Still, if you only want to scrape tweets, Snscrape is enough.

Now that we have clarified the distinction between the two methods, let's go over their implementation one by one.

How to Use Tweepy to Scrape Tweets

Before we start using Tweepy, we must first make sure that our Twitter credentials are ready. With them, we can connect Tweepy to our API key and start scraping.

If you do not have Twitter credentials, you can register for a Twitter developer account on the Twitter developer portal. You will be asked some basic questions about how you intend to use the Twitter API. After that, you can begin the implementation.

The first step is to install the Tweepy library on your local machine, which you can do by typing:

pip install git+https://github.com/tweepy/tweepy.git

How to Scrape Tweets from a User on Twitter

Now that we have installed the Tweepy library, let's scrape 100 tweets from a user called john on Twitter. We will look at the full code implementation that lets us do this and discuss it in detail so we can understand what is going on:

import time

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


username = "john"
no_of_tweets =100


try:
    #The number of tweets we want to retrieved from the user
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))
    time.sleep(3)

Now let's go over each part of the code in the block above.

import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)

In the code above, we import the Tweepy library and create some variables that store our Twitter credentials (the Tweepy authentication handler requires four of our Twitter credentials). We then pass those variables into the Tweepy authentication handler and save the result in another variable.

Then the last statement is where we instantiate the Tweepy API and pass in the required parameters.

username = "john"
no_of_tweets =100


try:
    # Retrieve the user's most recent tweets, up to no_of_tweets
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))

In the code above, we set the username (the @name on Twitter) whose tweets we want to retrieve, along with the number of tweets. We then wrap the calls in an exception handler so that any error is caught and reported.

After that, api.user_timeline() returns a collection of the most recent tweets posted by the user we chose in the screen_name parameter, up to the number of tweets requested in count.

In the next line of code, we pick some attributes we want to retrieve from each tweet and save them to a list. To see more attributes you can retrieve from a tweet, see the Twitter API documentation for the tweet object.

In the last chunk of code, we create a dataframe and pass in the list we created along with the column names we defined.

Note that the column names must be in the same order as the attributes in the attributes container (that is, the order in which you listed the attributes when retrieving them from each tweet).
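Here is a minimal sketch of why the order matters. It uses plain Python and made-up sample rows rather than real tweets, and it assumes only that a dataframe labels columns by position, much as zip pairs values with names here. If the names were listed in a different order, the values would end up under the wrong labels.

```python
# Hypothetical sample rows, standing in for the attributes pulled from each tweet.
rows = [
    ["2022-07-05", 12, "Twitter Web App", "First sample tweet"],
    ["2022-07-04", 3, "Twitter for iPhone", "Second sample tweet"],
]

# Column names listed in the same order as the values inside each row.
columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]

# Pair every value with its column name by position.
records = [dict(zip(columns, row)) for row in rows]

print(records[0]["Number of Likes"])
print(records[1]["Source of Tweet"])
```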

If you correctly followed the steps I described, you should have something like this:

image-17

Image by Author

Now that we're done, let's go over one more example before we move on to the Snscrape implementation.

How to Scrape Tweets from a Text Search

In this method, we'll retrieve tweets based on a search. You can do it like this:

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


search_query = "sex for grades"
no_of_tweets =150


try:
    # search_tweets returns at most 100 tweets per request, so Cursor pages through the results
    tweets = tweepy.Cursor(api.search_tweets, q=search_query, count=100).items(no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.user.name, tweet.created_at, tweet.favorite_count, tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))

The code above is similar to the previous code, except that we changed the API method from api.user_timeline() to api.search_tweets(). We also added tweet.user.name to the attributes container list.

In the code above, you can see that we chain two attributes. This is because tweet.user on its own would just return a user object. So we also have to name the attribute we want to retrieve from that user object, which is name.
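The chained access can be illustrated without calling the API. The sketch below uses SimpleNamespace objects as hypothetical stand-ins for a tweet and its user, with only the few fields needed for the illustration. Real Tweepy objects carry many more.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a tweet returned by Tweepy.
tweet = SimpleNamespace(
    text="Sample tweet text",
    user=SimpleNamespace(name="John Doe", screen_name="john"),
)

# tweet.user is a whole object, not a single printable value.
user_object = tweet.user

# Chaining a second attribute pulls one field out of that object.
display_name = tweet.user.name
handle = tweet.user.screen_name

print(display_name, handle)
```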

You can consult the Twitter API documentation for a list of additional attributes that can be retrieved from a user object. Once you run the code, you should see something like this:

image-18

Image by Author.

All right, that just about wraps up the Tweepy implementation. Just remember that there is a limit to the number of tweets you can retrieve, and that you cannot retrieve tweets more than 7 days old using Tweepy with the standard API.

How to Use Snscrape to Scrape Tweets

As I mentioned previously, Snscrape does not require Twitter credentials (an API key) to access it. There is also no limit to the number of tweets you can fetch.

For this example, though, we'll just retrieve the same tweets as in the previous example, but using Snscrape instead.

To use Snscrape, we must first install its library on our PC. You can do that by typing:

pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git

How to Scrape Tweets from a User with Snscrape

Snscrape includes two ways to get tweets from Twitter: the command line interface (CLI) and a Python wrapper. Just keep in mind that the Python wrapper is currently not documented, but we can still get by with trial and error.

In this example, we will use the Python wrapper because it is more intuitive than the CLI method. But if you get stuck on some code, you can always turn to the GitHub community for assistance. The contributors will be happy to help you.

To retrieve tweets from a particular user, we can do the following:

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Created a list to append all tweet attributes(data)
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

Let's go over some of the code that you might not understand at first glance:

for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
  
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

In the code above, sntwitter.TwitterSearchScraper builds a scraper for the query we pass in (from:john, that is, tweets from the user john), and its get_items() method yields the matching tweets one at a time.

As I mentioned earlier, Snscrape has no limit on the number of tweets, so it will return however many tweets that user has. To keep this in check, we add the enumerate function, which iterates through the object and attaches a counter so that we can stop after the user's 100 most recent tweets.
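The counter pattern can be tried without touching Twitter at all. The sketch below swaps in a hypothetical endless generator, fake_tweets, in place of get_items(), and shows both the enumerate approach and itertools.islice, a standard-library alternative that applies the same cap.

```python
from itertools import islice

def fake_tweets():
    """Hypothetical endless generator standing in for get_items()."""
    n = 0
    while True:
        yield f"tweet {n}"
        n += 1

limit = 100

# enumerate attaches a counter, and the explicit check stops the loop at the limit.
collected = []
for i, tweet in enumerate(fake_tweets()):
    if i >= limit:
        break
    collected.append(tweet)

# islice expresses the same cap without a manual counter.
also_collected = list(islice(fake_tweets(), limit))

print(len(collected), len(also_collected))
```

Since the counter starts at 0, stopping when it reaches the limit keeps exactly that many items.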

You can see that the attribute syntax we use on each tweet looks like the one from Tweepy. This is the list of attributes we can get from a Snscrape tweet, curated by Martin Beck.

Sns.Scrape

Credit: Martin Beck

More attributes might be added, as the Snscrape library is still in development. For instance, in the image above, source has been replaced with sourceLabel. If you pass in only source, it will return an object.

If you run the code above, you should see something like this as well:

image-19

Image by Author

Now let's do the same for scraping by search.

How to Scrape Tweets from a Text Search with Snscrape

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Creating list to append tweet data to
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('sex for grades since:2021-07-05 until:2022-07-06').get_items()):
    if i >= 150:
        break
    attributes_container.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe to load the list
tweets_df = pd.DataFrame(attributes_container, columns=["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"])

Again, you can access a lot of historical data using Snscrape (unlike Tweepy, whose standard API cannot go back more than 7 days; the premium API allows 30 days). So we can pass in the date from which we want to start the search and the date at which we want it to end, using the since: and until: operators in the query string given to sntwitter.TwitterSearchScraper().

What we did in the preceding code is basically what we discussed before. The only thing to bear in mind is that until works like the range function in Python (that is, it excludes the last value). So if you want to get tweets from today, you need to pass the day after today in the until parameter.
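A small sketch of that off-by-one adjustment, using only the standard library. The helper name build_query and the sample search text are made up for illustration and are not part of Snscrape. It simply adds one day to the last date you want included.

```python
from datetime import date, timedelta

def build_query(text, start, end_inclusive):
    """Build a search string whose until: value is one day past the last wanted day."""
    until = end_inclusive + timedelta(days=1)
    return f"{text} since:{start.isoformat()} until:{until.isoformat()}"

# To include tweets posted on 2022-07-05, the until: value must be 2022-07-06.
query = build_query("python", date(2022, 7, 1), date(2022, 7, 5))
print(query)
```

The resulting string can then be passed to the scraper in place of a handwritten query.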

image-21

Image by Author.

Now you know how to scrape tweets with Snscrape, too!

When to Use Each Approach

Now that we've seen how each method works, you might be wondering when to use which.

Well, there is no universal rule for when to use each method. It all comes down to personal preference and your use case.

If you want to acquire a very large number of tweets, you should use Snscrape. But if you want to use extra features that Snscrape cannot provide (like geolocation, for example), then you should definitely use Tweepy. It is directly integrated with the Twitter API and provides complete functionality.

Even so, Snscrape is the most commonly used method for basic scraping.

Conclusion

In this article, we learned how to scrape data from Twitter with Python using Tweepy and Snscrape. But this was only a brief overview of how each approach works. You can learn more by exploring the web for additional information.

I've included some useful resources that you can use if you need additional information. Thank you for reading.

Source: https://www.freecodecamp.org/news/python-web-scraping-tutorial/

#python #web 

许 志强

许 志强

1657769340

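How to Scrape Data from Twitter Using Tweepy and Snscrape

If you are a data enthusiast, you'll likely agree that social media is one of the richest sources of real-world data. Sites like Twitter are full of data.

You can use the data you get from social media in a number of ways, such as sentiment analysis (analyzing people's thoughts) on a specific issue or field of interest.

There are several ways you can scrape (or gather) data from Twitter. In this article, we will look at two of them: using Tweepy and Snscrape.

We will learn a method to scrape public conversations from people on a specific trending topic, as well as tweets from a particular user.

Now, without further ado, let's get started.

Tweepy vs Snscrape: An Introduction to Our Scraping Tools

Now, before we get into the implementation of each tool, let's try to grasp the differences and limits of each.

Tweepy

Tweepy is a Python library for integrating with the Twitter API. Because Tweepy is connected with the Twitter API, you can perform complex queries in addition to scraping tweets. It enables you to take advantage of all of the Twitter API's capabilities.

But there are some drawbacks. For example, its standard API only allows you to collect tweets going back up to a week (that is, Tweepy does not allow recovery of tweets beyond a one-week window, so historical data retrieval is not permitted).

Also, there are limits on how many tweets you can retrieve from a user's account. You can read more about Tweepy's functionality in its documentation.

Snscrape

Snscrape is another approach for scraping information from Twitter that does not require the use of an API. Snscrape allows you to scrape basic information such as a user's profile, tweet content, source, and so on.

Snscrape is not limited to Twitter; it can also scrape content from other prominent social media networks such as Facebook, Instagram, and others.

Its advantage is that there are no limits on the number of tweets you can retrieve or on the window of tweets (that is, the date range of tweets). So Snscrape allows you to retrieve old data.

But the one disadvantage is that it lacks all the other functionality of Tweepy. Still, if you only want to scrape tweets, Snscrape is enough.

Now that we've clarified the distinction between the two methods, let's go over their implementation one by one.

How to Use Tweepy to Scrape Tweets

Before we begin using Tweepy, we must first make sure that our Twitter credentials are ready. With them, we can connect Tweepy to our API key and start scraping.

If you do not have Twitter credentials, you can register for a Twitter developer account. You will be asked some basic questions about how you intend to use the Twitter API. After that, you can begin the implementation.

The first step is to install the Tweepy library on your local machine, which you can do by typing: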

pip install git+https://github.com/tweepy/tweepy.git

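How to Scrape Tweets from a User on Twitter

Now that we've installed the Tweepy library, let's scrape 100 tweets from a user called john on Twitter. We'll look at the full code implementation that lets us do this and discuss it in detail so we can understand what's going on: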

import time

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


username = "john"
no_of_tweets =100


try:
    # Retrieve the user's most recent tweets, up to no_of_tweets
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))
    time.sleep(3)

Now let's go over each part of the code in the block above.

import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)

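In the code above, we import the Tweepy library and create some variables that store our Twitter credentials (the Tweepy authentication handler requires four of our Twitter credentials). We then pass those variables into the Tweepy authentication handler and save the result in another variable.

Then the last statement is where we instantiate the Tweepy API and pass in the required parameters.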

username = "john"
no_of_tweets =100


try:
    # Retrieve the user's most recent tweets, up to no_of_tweets
    tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))

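In the code above, we set the username (the @name on Twitter) whose tweets we want to retrieve, along with the number of tweets. We then wrap the calls in an exception handler so that any error is caught and reported.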
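One way to build on that handler is to retry after a short pause instead of only printing the error. The sketch below is an illustration under assumptions, not part of the original tutorial: fetch_with_retry and flaky_fetch are hypothetical names, and flaky_fetch stands in for a call such as api.user_timeline(...). It catches Exception rather than BaseException so that pressing Ctrl+C still stops the script.

```python
import time

def fetch_with_retry(fetch, attempts=3, delay=0.0):
    """Call fetch(); on failure, print the error, wait, and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as e:  # narrower than BaseException
            print(f"Attempt {attempt} failed: {e}")
            if attempt == attempts:
                raise
            time.sleep(delay)

# Hypothetical flaky call that fails twice before succeeding.
calls = {"count": 0}

def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return ["tweet a", "tweet b"]

result = fetch_with_retry(flaky_fetch)
print(result, calls["count"])
```

In a real script you would likely pass a non-zero delay, such as the 3 seconds used in the full example above.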

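After that, api.user_timeline() returns a collection of the most recent tweets posted by the user we chose in the screen_name parameter, up to the number of tweets requested in count.

In the next line of code, we pick some attributes we want to retrieve from each tweet and save them to a list. To see more attributes you can retrieve from a tweet, see the Twitter API documentation for the tweet object.

In the last chunk of code, we create a dataframe and pass in the list we created along with the column names we defined.

Note that the column names must be in the same order as the attributes in the attributes container (that is, the order in which you listed the attributes when retrieving them from each tweet).

If you correctly followed the steps I described, you should have something like this:

image-17

Image by Author

Now that we're done, let's go over one more example before we move on to the Snscrape implementation.

How to Scrape Tweets from a Text Search

In this method, we'll retrieve tweets based on a search. You can do it like this: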

import pandas as pd
import tweepy

consumer_key = "XXXX" #Your API/Consumer key 
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX"    #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key

#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
    consumer_key, consumer_secret,
    access_token, access_token_secret
)

#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)


search_query = "sex for grades"
no_of_tweets =150


try:
    # search_tweets returns at most 100 tweets per request, so Cursor pages through the results
    tweets = tweepy.Cursor(api.search_tweets, q=search_query, count=100).items(no_of_tweets)
    
    #Pulling Some attributes from the tweet
    attributes_container = [[tweet.user.name, tweet.created_at, tweet.favorite_count, tweet.source,  tweet.text] for tweet in tweets]

    #Creation of column list to rename the columns in the dataframe
    columns = ["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
    
    #Creation of Dataframe
    tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
    print('Status Failed On,',str(e))

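The code above is similar to the previous code, except that we changed the API method from api.user_timeline() to api.search_tweets(). We also added tweet.user.name to the attributes container list.

In the code above, you can see that we chain two attributes. This is because tweet.user on its own would just return a user object. So we also have to name the attribute we want to retrieve from that user object, which is name.

You can consult the Twitter API documentation for a list of additional attributes that can be retrieved from a user object. Once you run the code, you should see something like this:

image-18

Image by Author.

All right, that just about wraps up the Tweepy implementation. Just remember that there is a limit to the number of tweets you can retrieve, and that you cannot retrieve tweets more than 7 days old using Tweepy with the standard API.

How to Use Snscrape to Scrape Tweets

As I mentioned previously, Snscrape does not require Twitter credentials (an API key) to access it. There is also no limit to the number of tweets you can fetch.

For this example, though, we'll just retrieve the same tweets as in the previous example, but using Snscrape instead.

To use Snscrape, we must first install its library on our PC. You can do that by typing: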

pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git

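How to Scrape Tweets from a User with Snscrape

Snscrape includes two ways to get tweets from Twitter: the command line interface (CLI) and a Python wrapper. Just keep in mind that the Python wrapper is currently not documented, but we can still get by with trial and error.

In this example, we will use the Python wrapper because it is more intuitive than the CLI method. But if you get stuck on some code, you can always turn to the GitHub community for assistance. The contributors will be happy to help you.

To retrieve tweets from a particular user, we can do the following: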

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Created a list to append all tweet attributes(data)
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

Let's go over some of the code that you might not understand at first glance:

for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
    if i >= 100:
        break
    attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
  
# Creating a dataframe from the tweets list above 
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])

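In the code above, sntwitter.TwitterSearchScraper builds a scraper for the query we pass in (from:john, that is, tweets from the user john), and its get_items() method yields the matching tweets one at a time.

As I mentioned earlier, Snscrape has no limit on the number of tweets, so it will return however many tweets that user has. To keep this in check, we add the enumerate function, which iterates through the object and attaches a counter so that we can stop after the user's 100 most recent tweets.

You can see that the attribute syntax we use on each tweet looks like the one from Tweepy. This is the list of attributes we can get from a Snscrape tweet, curated by Martin Beck.

Sns.Scrape

Credit: Martin Beck

More attributes might be added, as the Snscrape library is still in development. For instance, in the image above, source has been replaced with sourceLabel. If you pass in only source, it will return an object.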
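Because attribute names can differ between library versions, it can help to read them defensively. The sketch below is a generic Python pattern, not an Snscrape feature. The two tweet objects are hypothetical stand-ins built with SimpleNamespace, and extract is a made-up helper that falls back to None when an attribute is missing.

```python
from types import SimpleNamespace

# Hypothetical tweet objects: one exposes sourceLabel, the other does not.
newer = SimpleNamespace(content="hello", sourceLabel="Twitter Web App")
older = SimpleNamespace(content="hi")

def extract(tweet, names):
    """Return the requested attributes, using None for any that are missing."""
    return [getattr(tweet, name, None) for name in names]

wanted = ["content", "sourceLabel"]
print(extract(newer, wanted))
print(extract(older, wanted))
```

A missing field then shows up as an empty cell in the dataframe instead of stopping the whole scrape with an AttributeError.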

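If you run the code above, you should see something like this as well:

image-19

Image by Author

Now let's do the same for scraping by search.

How to Scrape Tweets from a Text Search with Snscrape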

import snscrape.modules.twitter as sntwitter
import pandas as pd

# Creating list to append tweet data to
attributes_container = []

# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('sex for grades since:2021-07-05 until:2022-07-06').get_items()):
    if i >= 150:
        break
    attributes_container.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
    
# Creating a dataframe to load the list
tweets_df = pd.DataFrame(attributes_container, columns=["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"])

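Again, you can access a lot of historical data using Snscrape (unlike Tweepy, whose standard API cannot go back more than 7 days; the premium API allows 30 days). So we can pass in the date from which we want to start the search and the date at which we want it to end, using the since: and until: operators in the query string given to sntwitter.TwitterSearchScraper().

What we did in the preceding code is basically what we discussed before. The only thing to bear in mind is that until works like the range function in Python (that is, it excludes the last value). So if you want to get tweets from today, you need to pass the day after today in the until parameter.

image-21

Image by Author.

Now you know how to scrape tweets with Snscrape, too!

When to Use Each Approach

Now that we've seen how each method works, you might be wondering when to use which.

Well, there is no universal rule for when to use each method. It all comes down to personal preference and your use case.

If you want to acquire a very large number of tweets, you should use Snscrape. But if you want to use extra features that Snscrape cannot provide (like geolocation, for example), then you should definitely use Tweepy. It is directly integrated with the Twitter API and provides complete functionality.

Even so, Snscrape is the most commonly used method for basic scraping.

Conclusion

In this article, we learned how to scrape data from Twitter with Python using Tweepy and Snscrape. But this was only a brief overview of how each approach works. You can learn more by exploring the web for additional information.

I've included some useful resources that you can use if you need additional information. Thank you for reading.

Source: https://www.freecodecamp.org/news/python-web-scraping-tutorial/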

#python #web