1611581830
Ethereum stays the second-largest Blockhchain and Erc20 tokens are lot more beneficial if you create Ethereum tokens. Check out this Blog to know the cost to create ERC20 tokens - https://bit.ly/36choKN
#costtocreateerc20token #createerc20token #erc20tokendevelopmentcost
1655630160
Install via pip:
$ pip install pytumblr
Install from source:
$ git clone https://github.com/tumblr/pytumblr.git
$ cd pytumblr
$ python setup.py install
A pytumblr.TumblrRestClient
is the object you'll make all of your calls to the Tumblr API through. Creating one is this easy:
client = pytumblr.TumblrRestClient(
'<consumer_key>',
'<consumer_secret>',
'<oauth_token>',
'<oauth_secret>',
)
client.info() # Grabs the current user information
Two easy ways to get your credentials to are:
interactive_console.py
tool (if you already have a consumer key & secret)client.info() # get information about the authenticating user
client.dashboard() # get the dashboard for the authenticating user
client.likes() # get the likes for the authenticating user
client.following() # get the blogs followed by the authenticating user
client.follow('codingjester.tumblr.com') # follow a blog
client.unfollow('codingjester.tumblr.com') # unfollow a blog
client.like(id, reblogkey) # like a post
client.unlike(id, reblogkey) # unlike a post
client.blog_info(blogName) # get information about a blog
client.posts(blogName, **params) # get posts for a blog
client.avatar(blogName) # get the avatar for a blog
client.blog_likes(blogName) # get the likes on a blog
client.followers(blogName) # get the followers of a blog
client.blog_following(blogName) # get the publicly exposed blogs that [blogName] follows
client.queue(blogName) # get the queue for a given blog
client.submission(blogName) # get the submissions for a given blog
Creating posts
PyTumblr lets you create all of the various types that Tumblr supports. When using these types there are a few defaults that are able to be used with any post type.
The default supported types are described below.
We'll show examples throughout of these default examples while showcasing all the specific post types.
Creating a photo post
Creating a photo post supports a bunch of different options plus the described default options * caption - a string, the user supplied caption * link - a string, the "click-through" url for the photo * source - a string, the url for the photo you want to use (use this or the data parameter) * data - a list or string, a list of filepaths or a single file path for multipart file upload
#Creates a photo post using a source URL
client.create_photo(blogName, state="published", tags=["testing", "ok"],
source="https://68.media.tumblr.com/b965fbb2e501610a29d80ffb6fb3e1ad/tumblr_n55vdeTse11rn1906o1_500.jpg")
#Creates a photo post using a local filepath
client.create_photo(blogName, state="queue", tags=["testing", "ok"],
tweet="Woah this is an incredible sweet post [URL]",
data="/Users/johnb/path/to/my/image.jpg")
#Creates a photoset post using several local filepaths
client.create_photo(blogName, state="draft", tags=["jb is cool"], format="markdown",
data=["/Users/johnb/path/to/my/image.jpg", "/Users/johnb/Pictures/kittens.jpg"],
caption="## Mega sweet kittens")
Creating a text post
Creating a text post supports the same options as default and just a two other parameters * title - a string, the optional title for the post. Supports markdown or html * body - a string, the body of the of the post. Supports markdown or html
#Creating a text post
client.create_text(blogName, state="published", slug="testing-text-posts", title="Testing", body="testing1 2 3 4")
Creating a quote post
Creating a quote post supports the same options as default and two other parameter * quote - a string, the full text of the qote. Supports markdown or html * source - a string, the cited source. HTML supported
#Creating a quote post
client.create_quote(blogName, state="queue", quote="I am the Walrus", source="Ringo")
Creating a link post
#Create a link post
client.create_link(blogName, title="I like to search things, you should too.", url="https://duckduckgo.com",
description="Search is pretty cool when a duck does it.")
Creating a chat post
Creating a chat post supports the same options as default and two other parameters * title - a string, the title of the chat post * conversation - a string, the text of the conversation/chat, with diablog labels (no html)
#Create a chat post
chat = """John: Testing can be fun!
Renee: Testing is tedious and so are you.
John: Aw.
"""
client.create_chat(blogName, title="Renee just doesn't understand.", conversation=chat, tags=["renee", "testing"])
Creating an audio post
Creating an audio post allows for all default options and a has 3 other parameters. The only thing to keep in mind while dealing with audio posts is to make sure that you use the external_url parameter or data. You cannot use both at the same time. * caption - a string, the caption for your post * external_url - a string, the url of the site that hosts the audio file * data - a string, the filepath of the audio file you want to upload to Tumblr
#Creating an audio file
client.create_audio(blogName, caption="Rock out.", data="/Users/johnb/Music/my/new/sweet/album.mp3")
#lets use soundcloud!
client.create_audio(blogName, caption="Mega rock out.", external_url="https://soundcloud.com/skrillex/sets/recess")
Creating a video post
Creating a video post allows for all default options and has three other options. Like the other post types, it has some restrictions. You cannot use the embed and data parameters at the same time. * caption - a string, the caption for your post * embed - a string, the HTML embed code for the video * data - a string, the path of the file you want to upload
#Creating an upload from YouTube
client.create_video(blogName, caption="Jon Snow. Mega ridiculous sword.",
embed="http://www.youtube.com/watch?v=40pUYLacrj4")
#Creating a video post from local file
client.create_video(blogName, caption="testing", data="/Users/johnb/testing/ok/blah.mov")
Editing a post
Updating a post requires you knowing what type a post you're updating. You'll be able to supply to the post any of the options given above for updates.
client.edit_post(blogName, id=post_id, type="text", title="Updated")
client.edit_post(blogName, id=post_id, type="photo", data="/Users/johnb/mega/awesome.jpg")
Reblogging a Post
Reblogging a post just requires knowing the post id and the reblog key, which is supplied in the JSON of any post object.
client.reblog(blogName, id=125356, reblog_key="reblog_key")
Deleting a post
Deleting just requires that you own the post and have the post id
client.delete_post(blogName, 123456) # Deletes your post :(
A note on tags: When passing tags, as params, please pass them as a list (not a comma-separated string):
client.create_text(blogName, tags=['hello', 'world'], ...)
Getting notes for a post
In order to get the notes for a post, you need to have the post id and the blog that it is on.
data = client.notes(blogName, id='123456')
The results include a timestamp you can use to make future calls.
data = client.notes(blogName, id='123456', before_timestamp=data["_links"]["next"]["query_params"]["before_timestamp"])
# get posts with a given tag
client.tagged(tag, **params)
This client comes with a nice interactive console to run you through the OAuth process, grab your tokens (and store them for future use).
You'll need pyyaml
installed to run it, but then it's just:
$ python interactive-console.py
and away you go! Tokens are stored in ~/.tumblr
and are also shared by other Tumblr API clients like the Ruby client.
The tests (and coverage reports) are run with nose, like this:
python setup.py test
Author: tumblr
Source Code: https://github.com/tumblr/pytumblr
License: Apache-2.0 license
1624338997
Are you thinking of executing an E-learning app in the market?
Then firstly you need to understand the concept of E-learning in more detail and also know about the types of E-learning app and what is the E-learning app demand in the market.
In this present time, every industry is taking the help of technology for maximizing their profits, as people love to use the technology for fulfilling their basic requirements. Every industry is now providing online services via web apps or mobile apps.
What are the Basic features an E-learning app contains?
Features list for a learner Panel:
Features list for a tutor Panel:
Feature list for an Admin Panel:
What are the factors on which the cost of the E-learning app depends?
The cost of the E-learning app is depended on some of the factors. Let me list down the factors affecting the cost of an E-learning app:
How much does it cost to develop an E-learning app?
As we have discussed the cost of an E-learning app is highly depends on some of the factors. We are at AppClues Infotech, which is a leading app development industry. We help you to develop an E-learning app by providing you with the best solution and Unique UI/UX design.
We can offer you to hire experienced and expert android as well as an IOS developer.
So here we are providing you with the approximate timeline and cost of developing an E-learning app:
Timeline:
Costing:
The approximate cost of developing an E-learning app is $30,000-$70,000.
#how much does it cost to develop an e-learning app #how much does it cost to create e-learning #how much does it cost to develop a educational app #how much does cost to make an e-learning app #how to create an educational app #e-learning mobile app development cost and features
1669003576
In this Python article, let's learn about Mutable and Immutable in Python.
Mutable is a fancy way of saying that the internal state of the object is changed/mutated. So, the simplest definition is: An object whose internal state can be changed is mutable. On the other hand, immutable doesn’t allow any change in the object once it has been created.
Both of these states are integral to Python data structure. If you want to become more knowledgeable in the entire Python Data Structure, take this free course which covers multiple data structures in Python including tuple data structure which is immutable. You will also receive a certificate on completion which is sure to add value to your portfolio.
Mutable is when something is changeable or has the ability to change. In Python, ‘mutable’ is the ability of objects to change their values. These are often the objects that store a collection of data.
Immutable is the when no change is possible over time. In Python, if the value of an object cannot be changed over time, then it is known as immutable. Once created, the value of these objects is permanent.
Objects of built-in type that are mutable are:
Objects of built-in type that are immutable are:
Object mutability is one of the characteristics that makes Python a dynamically typed language. Though Mutable and Immutable in Python is a very basic concept, it can at times be a little confusing due to the intransitive nature of immutability.
In Python, everything is treated as an object. Every object has these three attributes:
While ID and Type cannot be changed once it’s created, values can be changed for Mutable objects.
Check out this free python certificate course to get started with Python.
I believe, rather than diving deep into the theory aspects of mutable and immutable in Python, a simple code would be the best way to depict what it means in Python. Hence, let us discuss the below code step-by-step:
#Creating a list which contains name of Indian cities
cities = [‘Delhi’, ‘Mumbai’, ‘Kolkata’]
# Printing the elements from the list cities, separated by a comma & space
for city in cities:
print(city, end=’, ’)
Output [1]: Delhi, Mumbai, Kolkata
#Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(cities)))
Output [2]: 0x1691d7de8c8
#Adding a new city to the list cities
cities.append(‘Chennai’)
#Printing the elements from the list cities, separated by a comma & space
for city in cities:
print(city, end=’, ’)
Output [3]: Delhi, Mumbai, Kolkata, Chennai
#Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(cities)))
Output [4]: 0x1691d7de8c8
The above example shows us that we were able to change the internal state of the object ‘cities’ by adding one more city ‘Chennai’ to it, yet, the memory address of the object did not change. This confirms that we did not create a new object, rather, the same object was changed or mutated. Hence, we can say that the object which is a type of list with reference variable name ‘cities’ is a MUTABLE OBJECT.
Let us now discuss the term IMMUTABLE. Considering that we understood what mutable stands for, it is obvious that the definition of immutable will have ‘NOT’ included in it. Here is the simplest definition of immutable– An object whose internal state can NOT be changed is IMMUTABLE.
Again, if you try and concentrate on different error messages, you have encountered, thrown by the respective IDE; you use you would be able to identify the immutable objects in Python. For instance, consider the below code & associated error message with it, while trying to change the value of a Tuple at index 0.
#Creating a Tuple with variable name ‘foo’
foo = (1, 2)
#Changing the index[0] value from 1 to 3
foo[0] = 3
TypeError: 'tuple' object does not support item assignment
Once again, a simple code would be the best way to depict what immutable stands for. Hence, let us discuss the below code step-by-step:
#Creating a Tuple which contains English name of weekdays
weekdays = ‘Sunday’, ‘Monday’, ‘Tuesday’, ‘Wednesday’, ‘Thursday’, ‘Friday’, ‘Saturday’
# Printing the elements of tuple weekdays
print(weekdays)
Output [1]: (‘Sunday’, ‘Monday’, ‘Tuesday’, ‘Wednesday’, ‘Thursday’, ‘Friday’, ‘Saturday’)
#Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(weekdays)))
Output [2]: 0x1691cc35090
#tuples are immutable, so you cannot add new elements, hence, using merge of tuples with the # + operator to add a new imaginary day in the tuple ‘weekdays’
weekdays += ‘Pythonday’,
#Printing the elements of tuple weekdays
print(weekdays)
Output [3]: (‘Sunday’, ‘Monday’, ‘Tuesday’, ‘Wednesday’, ‘Thursday’, ‘Friday’, ‘Saturday’, ‘Pythonday’)
#Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(weekdays)))
Output [4]: 0x1691cc8ad68
This above example shows that we were able to use the same variable name that is referencing an object which is a type of tuple with seven elements in it. However, the ID or the memory location of the old & new tuple is not the same. We were not able to change the internal state of the object ‘weekdays’. The Python program manager created a new object in the memory address and the variable name ‘weekdays’ started referencing the new object with eight elements in it. Hence, we can say that the object which is a type of tuple with reference variable name ‘weekdays’ is an IMMUTABLE OBJECT.
Also Read: Understanding the Exploratory Data Analysis (EDA) in Python
Where can you use mutable and immutable objects:
Mutable objects can be used where you want to allow for any updates. For example, you have a list of employee names in your organizations, and that needs to be updated every time a new member is hired. You can create a mutable list, and it can be updated easily.
Immutability offers a lot of useful applications to different sensitive tasks we do in a network centred environment where we allow for parallel processing. By creating immutable objects, you seal the values and ensure that no threads can invoke overwrite/update to your data. This is also useful in situations where you would like to write a piece of code that cannot be modified. For example, a debug code that attempts to find the value of an immutable object.
Watch outs: Non transitive nature of Immutability:
OK! Now we do understand what mutable & immutable objects in Python are. Let’s go ahead and discuss the combination of these two and explore the possibilities. Let’s discuss, as to how will it behave if you have an immutable object which contains the mutable object(s)? Or vice versa? Let us again use a code to understand this behaviour–
#creating a tuple (immutable object) which contains 2 lists(mutable) as it’s elements
#The elements (lists) contains the name, age & gender
person = (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])
#printing the tuple
print(person)
Output [1]: (['Ayaan', 5, 'Male'], ['Aaradhya', 8, 'Female'])
#printing the location of the object created in the memory address in hexadecimal format
print(hex(id(person)))
Output [2]: 0x1691ef47f88
#Changing the age for the 1st element. Selecting 1st element of tuple by using indexing [0] then 2nd element of the list by using indexing [1] and assigning a new value for age as 4
person[0][1] = 4
#printing the updated tuple
print(person)
Output [3]: (['Ayaan', 4, 'Male'], ['Aaradhya', 8, 'Female'])
#printing the location of the object created in the memory address in hexadecimal format
print(hex(id(person)))
Output [4]: 0x1691ef47f88
In the above code, you can see that the object ‘person’ is immutable since it is a type of tuple. However, it has two lists as it’s elements, and we can change the state of lists (lists being mutable). So, here we did not change the object reference inside the Tuple, but the referenced object was mutated.
Also Read: Real-Time Object Detection Using TensorFlow
Same way, let’s explore how it will behave if you have a mutable object which contains an immutable object? Let us again use a code to understand the behaviour–
#creating a list (mutable object) which contains tuples(immutable) as it’s elements
list1 = [(1, 2, 3), (4, 5, 6)]
#printing the list
print(list1)
Output [1]: [(1, 2, 3), (4, 5, 6)]
#printing the location of the object created in the memory address in hexadecimal format
print(hex(id(list1)))
Output [2]: 0x1691d5b13c8
#changing object reference at index 0
list1[0] = (7, 8, 9)
#printing the list
Output [3]: [(7, 8, 9), (4, 5, 6)]
#printing the location of the object created in the memory address in hexadecimal format
print(hex(id(list1)))
Output [4]: 0x1691d5b13c8
As an individual, it completely depends upon you and your requirements as to what kind of data structure you would like to create with a combination of mutable & immutable objects. I hope that this information will help you while deciding the type of object you would like to select going forward.
Before I end our discussion on IMMUTABILITY, allow me to use the word ‘CAVITE’ when we discuss the String and Integers. There is an exception, and you may see some surprising results while checking the truthiness for immutability. For instance:
#creating an object of integer type with value 10 and reference variable name ‘x’
x = 10
#printing the value of ‘x’
print(x)
Output [1]: 10
#Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(x)))
Output [2]: 0x538fb560
#creating an object of integer type with value 10 and reference variable name ‘y’
y = 10
#printing the value of ‘y’
print(y)
Output [3]: 10
#Printing the location of the object created in the memory address in hexadecimal format
print(hex(id(y)))
Output [4]: 0x538fb560
As per our discussion and understanding, so far, the memory address for x & y should have been different, since, 10 is an instance of Integer class which is immutable. However, as shown in the above code, it has the same memory address. This is not something that we expected. It seems that what we have understood and discussed, has an exception as well.
Quick check – Python Data Structures
Tuples are immutable and hence cannot have any changes in them once they are created in Python. This is because they support the same sequence operations as strings. We all know that strings are immutable. The index operator will select an element from a tuple just like in a string. Hence, they are immutable.
Like all, there are exceptions in the immutability in python too. Not all immutable objects are really mutable. This will lead to a lot of doubts in your mind. Let us just take an example to understand this.
Consider a tuple ‘tup’.
Now, if we consider tuple tup = (‘GreatLearning’,[4,3,1,2]) ;
We see that the tuple has elements of different data types. The first element here is a string which as we all know is immutable in nature. The second element is a list which we all know is mutable. Now, we all know that the tuple itself is an immutable data type. It cannot change its contents. But, the list inside it can change its contents. So, the value of the Immutable objects cannot be changed but its constituent objects can. change its value.
Mutable Object | Immutable Object |
State of the object can be modified after it is created. | State of the object can’t be modified once it is created. |
They are not thread safe. | They are thread safe |
Mutable classes are not final. | It is important to make the class final before creating an immutable object. |
list, dictionary, set, user-defined classes.
int, float, decimal, bool, string, tuple, range.
Lists in Python are mutable data types as the elements of the list can be modified, individual elements can be replaced, and the order of elements can be changed even after the list has been created.
(Examples related to lists have been discussed earlier in this blog.)
Tuple and list data structures are very similar, but one big difference between the data types is that lists are mutable, whereas tuples are immutable. The reason for the tuple’s immutability is that once the elements are added to the tuple and the tuple has been created; it remains unchanged.
A programmer would always prefer building a code that can be reused instead of making the whole data object again. Still, even though tuples are immutable, like lists, they can contain any Python object, including mutable objects.
A set is an iterable unordered collection of data type which can be used to perform mathematical operations (like union, intersection, difference etc.). Every element in a set is unique and immutable, i.e. no duplicate values should be there, and the values can’t be changed. However, we can add or remove items from the set as the set itself is mutable.
Strings are not mutable in Python. Strings are a immutable data types which means that its value cannot be updated.
Join Great Learning Academy’s free online courses and upgrade your skills today.
Original article source at: https://www.mygreatlearning.com
1657771200
Se você é um entusiasta de dados, provavelmente concordará que uma das fontes mais ricas de dados do mundo real são as mídias sociais. Sites como o Twitter estão cheios de dados.
Você pode usar os dados obtidos nas mídias sociais de várias maneiras, como análise de sentimentos (analisando os pensamentos das pessoas) sobre um assunto ou campo de interesse específico.
Existem várias maneiras de extrair (ou coletar) dados do Twitter. E neste artigo, veremos duas dessas maneiras: usando o Tweepy e o Snscrape.
Aprenderemos um método para extrair conversas públicas de pessoas sobre um tópico de tendência específico, bem como tweets de um usuário específico.
Agora sem mais delongas, vamos começar.
Agora, antes de entrarmos na implementação de cada plataforma, vamos tentar entender as diferenças e os limites de cada plataforma.
Tweepy é uma biblioteca Python para integração com a API do Twitter. Como o Tweepy está conectado à API do Twitter, você pode realizar consultas complexas além de extrair tweets. Ele permite que você aproveite todos os recursos da API do Twitter.
Mas existem algumas desvantagens – como o fato de que sua API padrão só permite coletar tweets por até uma semana (ou seja, o Tweepy não permite a recuperação de tweets além de uma janela de semana, portanto, a recuperação de dados históricos não é permitida).
Além disso, há limites para quantos tweets você pode recuperar da conta de um usuário. Você pode ler mais sobre as funcionalidades do Tweepy aqui .
Snscrape é outra abordagem para extrair informações do Twitter que não requer o uso de uma API. O Snscrape permite extrair informações básicas, como o perfil de um usuário, conteúdo do tweet, fonte e assim por diante.
O Snscrape não se limita ao Twitter, mas também pode extrair conteúdo de outras redes sociais proeminentes, como Facebook, Instagram e outros.
Suas vantagens são que não há limites para o número de tweets que você pode recuperar ou a janela de tweets (ou seja, o intervalo de datas dos tweets). Então Snscrape permite que você recupere dados antigos.
Mas a única desvantagem é que ele não possui todas as outras funcionalidades do Tweepy – ainda assim, se você quiser apenas raspar tweets, o Snscrape seria suficiente.
Agora que esclarecemos a distinção entre os dois métodos, vamos analisar sua implementação um por um.
Antes de começarmos a usar o Tweepy, devemos primeiro ter certeza de que nossas credenciais do Twitter estão prontas. Com isso, podemos conectar o Tweepy à nossa chave de API e começar a raspar.
Se você não tiver credenciais do Twitter, poderá se registrar para uma conta de desenvolvedor do Twitter acessando aqui . Serão feitas algumas perguntas básicas sobre como você pretende usar a API do Twitter. Depois disso, você pode começar a implementação.
O primeiro passo é instalar a biblioteca Tweepy em sua máquina local, o que você pode fazer digitando:
pip install git+https://github.com/tweepy/tweepy.git
Agora que instalamos a biblioteca Tweepy, vamos extrair 100 tweets de um usuário chamado john
no Twitter. Veremos a implementação completa do código que nos permitirá fazer isso e discutiremos em detalhes para que possamos entender o que está acontecendo:
import tweepy
consumer_key = "XXXX" #Your API/Consumer key
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX" #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key
#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
consumer_key, consumer_secret,
access_token, access_token_secret
)
#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)
username = "john"
no_of_tweets =100
try:
#The number of tweets we want to retrieved from the user
tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
#Pulling Some attributes from the tweet
attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source, tweet.text] for tweet in tweets]
#Creation of column list to rename the columns in the dataframe
columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
#Creation of Dataframe
tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
print('Status Failed On,',str(e))
time.sleep(3)
Agora vamos examinar cada parte do código no bloco acima.
import tweepy
consumer_key = "XXXX" #Your API/Consumer key
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX" #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key
#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
consumer_key, consumer_secret,
access_token, access_token_secret
)
#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)
No código acima, importamos a biblioteca Tweepy para nosso código e criamos algumas variáveis nas quais armazenamos nossas credenciais do Twitter (o manipulador de autenticação do Tweepy requer quatro de nossas credenciais do Twitter). Então, passamos essas variáveis para o manipulador de autenticação Tweepy e as salvamos em outra variável.
Em seguida, a última instrução de chamada é onde instanciamos a API do Tweepy e passamos os parâmetros require.
username = "john"
no_of_tweets =100
try:
#The number of tweets we want to retrieved from the user
tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
#Pulling Some attributes from the tweet
attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source, tweet.text] for tweet in tweets]
#Creation of column list to rename the columns in the dataframe
columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
#Creation of Dataframe
tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
print('Status Failed On,',str(e))
No código acima, criamos o nome do usuário (o @name no Twitter) do qual queremos recuperar os tweets e também o número de tweets. Em seguida, criamos um manipulador de exceção para nos ajudar a detectar erros de maneira mais eficaz.
Depois disso, o api.user_timeline()
retorna uma coleção dos tweets mais recentes postados pelo usuário que escolhemos no screen_name
parâmetro e o número de tweets que você deseja recuperar.
Na próxima linha de código, passamos alguns atributos que queremos recuperar de cada tweet e os salvamos em uma lista. Para ver mais atributos que você pode recuperar de um tweet, leia isto .
No último pedaço de código criamos um dataframe e passamos a lista que criamos junto com os nomes da coluna que criamos.
Observe que os nomes das colunas devem estar na sequência de como você os passou para o contêiner de atributos (ou seja, como você passou esses atributos em uma lista quando estava recuperando os atributos do tweet).
Se você seguiu corretamente os passos que descrevi, você deve ter algo assim:
Imagem do autor
Agora que terminamos, vamos ver mais um exemplo antes de passarmos para a implementação do Snscrape.
Neste método, estaremos recuperando um tweet com base em uma pesquisa. Você pode fazer assim:
import tweepy
consumer_key = "XXXX" #Your API/Consumer key
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX" #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key
#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
consumer_key, consumer_secret,
access_token, access_token_secret
)
#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)
search_query = "sex for grades"
no_of_tweets =150
try:
#The number of tweets we want to retrieved from the search
tweets = api.search_tweets(q=search_query, count=no_of_tweets)
#Pulling Some attributes from the tweet
attributes_container = [[tweet.user.name, tweet.created_at, tweet.favorite_count, tweet.source, tweet.text] for tweet in tweets]
#Creation of column list to rename the columns in the dataframe
columns = ["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
#Creation of Dataframe
tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
print('Status Failed On,',str(e))
O código acima é semelhante ao código anterior, exceto que alteramos o método da API de api.user_timeline()
para api.search_tweets()
. Também adicionamos tweet.user.name
à lista de contêineres de atributos.
No código acima, você pode ver que passamos dois atributos. Isso ocorre porque se passarmos apenas tweet.user
, ele retornaria apenas um objeto de usuário de dicionário. Portanto, também devemos passar outro atributo que queremos recuperar do objeto de usuário, que é name
.
Você pode acessar aqui para ver uma lista de atributos adicionais que podem ser recuperados de um objeto de usuário. Agora você deve ver algo assim depois de executá-lo:
Imagem do Autor.
Tudo bem, isso praticamente encerra a implementação do Tweepy. Apenas lembre-se de que há um limite para o número de tweets que você pode recuperar, e você não pode recuperar tweets com mais de 7 dias usando o Tweepy.
Como mencionei anteriormente, o Snscrape não requer credenciais do Twitter (chave de API) para acessá-lo. Também não há limite para o número de tweets que você pode buscar.
Para este exemplo, porém, apenas recuperaremos os mesmos tweets do exemplo anterior, mas usando o Snscrape.
Para usar o Snscrape, devemos primeiro instalar sua biblioteca em nosso PC. Você pode fazer isso digitando:
pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git
O Snscrape inclui dois métodos para obter tweets do Twitter: a interface de linha de comando (CLI) e um Python Wrapper. Apenas tenha em mente que o Python Wrapper não está documentado no momento – mas ainda podemos nos virar com tentativa e erro.
Neste exemplo, usaremos o Python Wrapper porque é mais intuitivo que o método CLI. Mas se você ficar preso a algum código, sempre poderá recorrer à comunidade do GitHub para obter assistência. Os colaboradores terão prazer em ajudá-lo.
Para recuperar tweets de um usuário específico, podemos fazer o seguinte:
import snscrape.modules.twitter as sntwitter
import pandas as pd
# Created a list to append all tweet attributes(data)
attributes_container = []
# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
if i>100:
break
attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
# Creating a dataframe from the tweets list above
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])
Vamos revisar alguns dos códigos que você pode não entender à primeira vista:
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
if i>100:
break
attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
# Creating a dataframe from the tweets list above
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])
No código acima, o que o sntwitter.TwitterSearchScaper
faz é retornar um objeto de tweets do nome do usuário que passamos para ele (que é john).
Como mencionei anteriormente, o Snscrape não tem limites no número de tweets, então ele retornará quantos tweets desse usuário. Para ajudar com isso, precisamos adicionar a função enumerate que irá percorrer o objeto e adicionar um contador para que possamos acessar os 100 tweets mais recentes do usuário.
Você pode ver que a sintaxe de atributos que obtemos de cada tweet se parece com a do Tweepy. Esta é a lista de atributos que podemos obter do tweet do Snscrape, com curadoria de Martin Beck.
Crédito: Martin Beck
Mais atributos podem ser adicionados, pois a biblioteca Snscrape ainda está em desenvolvimento. Como por exemplo na imagem acima, source
foi substituído por sourceLabel
. Se você passar apenas source
ele retornará um objeto.
Se você executar o código acima, deverá ver algo assim também:
Imagem do autor
Agora vamos fazer o mesmo para raspagem por pesquisa.
import snscrape.modules.twitter as sntwitter
import pandas as pd
# Creating list to append tweet data to
attributes_container = []
# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('sex for grades since:2021-07-05 until:2022-07-06').get_items()):
if i>150:
break
attributes_container.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
# Creating a dataframe to load the list
tweets_df = pd.DataFrame(attributes_container, columns=["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"])
Novamente, você pode acessar muitos dados históricos usando o Snscrape (ao contrário do Tweepy, pois sua API padrão não pode exceder 7 dias. A API premium é de 30 dias). Assim, podemos passar a data a partir da qual queremos iniciar a pesquisa e a data em que queremos que ela termine no sntwitter.TwitterSearchScraper()
método.
O que fizemos no código anterior é basicamente o que discutimos antes. A única coisa a ter em mente é que até funciona de forma semelhante à função range em Python (ou seja, exclui o último inteiro). Portanto, se você deseja obter tweets de hoje, precisa incluir o dia depois de hoje no parâmetro "até".
Imagem do Autor.
Agora você também sabe como raspar tweets com o Snscrape!
Agora que vimos como cada método funciona, você deve estar se perguntando quando usar qual.
Bem, não existe uma regra universal para quando utilizar cada método. Tudo se resume a uma preferência de assunto e seu caso de uso.
Se você deseja adquirir um número infinito de tweets, deve usar o Snscrape. Mas se você quiser usar recursos extras que o Snscrape não pode fornecer (como geolocalização, por exemplo), você deve definitivamente usar o Tweepy. Ele é integrado diretamente com a API do Twitter e oferece funcionalidade completa.
Mesmo assim, o Snscrape é o método mais comumente usado para raspagem básica.
Conclusão
Neste artigo, aprendemos como extrair dados do Python usando Tweepy e Snscrape. Mas esta foi apenas uma breve visão geral de como cada abordagem funciona. Você pode aprender mais explorando a web para obter informações adicionais.
Incluí alguns recursos úteis que você pode usar se precisar de informações adicionais. Obrigado por ler.
Fonte: https://www.freecodecamp.org/news/python-web-scraping-tutorial/
1657769340
如果您是数据爱好者,您可能会同意社交媒体是现实世界数据中最丰富的来源之一。像 Twitter 这样的网站充满了数据。
您可以通过多种方式使用从社交媒体获得的数据,例如针对特定问题或感兴趣领域的情绪分析(分析人们的想法)。
您可以通过多种方式从 Twitter 上抓取(或收集)数据。在本文中,我们将研究其中两种方式:使用 Tweepy 和 Snscrap。
我们将学习一种方法来抓取人们关于特定趋势主题的公开对话,以及来自特定用户的推文。
现在事不宜迟,让我们开始吧。
现在,在我们进入每个平台的实现之前,让我们尝试掌握每个平台的差异和限制。
Tweepy 是一个用于与 Twitter API 集成的 Python 库。因为 Tweepy 与 Twitter API 连接,除了抓取推文之外,您还可以执行复杂的查询。它使您能够利用 Twitter API 的所有功能。
但也有一些缺点——比如它的标准 API 只允许您收集长达一周的推文(也就是说,Tweepy 不允许恢复超过一周窗口的推文,因此不允许检索历史数据)。
此外,您可以从用户帐户中检索多少条推文也是有限制的。您可以在此处阅读有关 Tweepy 功能的更多信息。
Snscape 是另一种从 Twitter 上抓取信息的方法,不需要使用 API。Snscrape 允许您抓取基本信息,例如用户的个人资料、推文内容、来源等。
Snscape 不仅限于 Twitter,还可以从其他著名的社交媒体网络(如 Facebook、Instagram 等)中抓取内容。
它的优点是可以检索的推文数量或推文窗口(即推文的日期范围)没有限制。因此,Snscape 允许您检索旧数据。
但一个缺点是它缺乏 Tweepy 的所有其他功能——不过,如果你只想抓取推文,Snscrap 就足够了。
现在我们已经阐明了这两种方法之间的区别,让我们一一来看看它们的实现。
在我们开始使用 Tweepy 之前,我们必须首先确保我们的 Twitter 凭据已准备好。有了它,我们可以将 Tweepy 连接到我们的 API 密钥并开始抓取。
如果您没有 Twitter 凭据,您可以前往此处注册 Twitter 开发者帐户。您将被问及一些关于您打算如何使用 Twitter API 的基本问题。之后,您可以开始实施。
第一步是在你的本地机器上安装 Tweepy 库,你可以通过键入:
pip install git+https://github.com/tweepy/tweepy.git
现在我们已经安装了 Tweepy 库,让我们从john
Twitter 上调用的用户那里抓取 100 条推文。我们将查看完整的代码实现,让我们这样做并详细讨论它,以便我们了解发生了什么:
import tweepy
consumer_key = "XXXX" #Your API/Consumer key
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX" #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key
#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
consumer_key, consumer_secret,
access_token, access_token_secret
)
#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)
username = "john"
no_of_tweets =100
try:
#The number of tweets we want to retrieved from the user
tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
#Pulling Some attributes from the tweet
attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source, tweet.text] for tweet in tweets]
#Creation of column list to rename the columns in the dataframe
columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
#Creation of Dataframe
tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
print('Status Failed On,',str(e))
time.sleep(3)
现在让我们回顾一下上面代码块中的每一部分代码。
import tweepy
consumer_key = "XXXX" #Your API/Consumer key
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX" #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key
#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
consumer_key, consumer_secret,
access_token, access_token_secret
)
#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)
在上面的代码中,我们将 Tweepy 库导入到我们的代码中,然后我们创建了一些变量来存储我们的 Twitter 凭据(Tweepy 身份验证处理程序需要我们的四个 Twitter 凭据)。所以我们然后将这些变量传递给 Tweepy 身份验证处理程序并将它们保存到另一个变量中。
然后最后一个调用语句是我们实例化 Tweepy API 并传入 require 参数的地方。
username = "john"
no_of_tweets =100
try:
#The number of tweets we want to retrieved from the user
tweets = api.user_timeline(screen_name=username, count=no_of_tweets)
#Pulling Some attributes from the tweet
attributes_container = [[tweet.created_at, tweet.favorite_count,tweet.source, tweet.text] for tweet in tweets]
#Creation of column list to rename the columns in the dataframe
columns = ["Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
#Creation of Dataframe
tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
print('Status Failed On,',str(e))
在上面的代码中,我们创建了要从中检索推文的用户名(Twitter 中的@name)以及推文的数量。然后我们创建了一个异常处理程序来帮助我们以更有效的方式捕获错误。
之后,api.user_timeline()
返回我们在参数中选择的用户发布的最新推文的集合以及screen_name
您要检索的推文数量。
在下一行代码中,我们传入了一些我们想从每条推文中检索的属性,并将它们保存到一个列表中。要查看可以从推文中检索到的更多属性,请阅读此。
在最后一段代码中,我们创建了一个数据框,并传入了我们创建的列表以及我们创建的列的名称。
请注意,列名必须按照您将它们传递到属性容器的顺序(即,当您从推文中检索属性时,您如何在列表中传递这些属性)。
如果你正确地按照我描述的步骤,你应该有这样的:
作者图片
现在我们已经完成了,在我们进入 Snscrap 实现之前,让我们再看一个例子。
在这种方法中,我们将根据搜索检索推文。你可以这样做:
import tweepy
consumer_key = "XXXX" #Your API/Consumer key
consumer_secret = "XXXX" #Your API/Consumer Secret Key
access_token = "XXXX" #Your Access token key
access_token_secret = "XXXX" #Your Access token Secret key
#Pass in our twitter API authentication key
auth = tweepy.OAuth1UserHandler(
consumer_key, consumer_secret,
access_token, access_token_secret
)
#Instantiate the tweepy API
api = tweepy.API(auth, wait_on_rate_limit=True)
search_query = "sex for grades"
no_of_tweets =150
try:
#The number of tweets we want to retrieved from the search
tweets = api.search_tweets(q=search_query, count=no_of_tweets)
#Pulling Some attributes from the tweet
attributes_container = [[tweet.user.name, tweet.created_at, tweet.favorite_count, tweet.source, tweet.text] for tweet in tweets]
#Creation of column list to rename the columns in the dataframe
columns = ["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"]
#Creation of Dataframe
tweets_df = pd.DataFrame(attributes_container, columns=columns)
except BaseException as e:
print('Status Failed On,',str(e))
上面的代码与前面的代码类似,只是我们将 API 方法从 更改api.user_timeline()
为api.search_tweets()
。我们还添加tweet.user.name
了属性容器列表。
在上面的代码中,你可以看到我们传入了两个属性。这是因为如果我们只传入tweet.user
,它只会返回一个字典用户对象。所以我们还必须传入另一个我们想从用户对象中检索的属性,即name
.
您可以在此处查看可以从用户对象中检索的附加属性列表。现在,一旦您运行它,您应该会看到类似这样的内容:
图片由作者提供。
好的,这就是 Tweepy 的实现。请记住,您可以检索的推文数量是有限制的,并且您不能使用 Tweepy 检索超过 7 天的推文。
正如我之前提到的,Snscrape 不需要 Twitter 凭据(API 密钥)来访问它。您可以获取的推文数量也没有限制。
但是,对于这个示例,我们将只检索与上一个示例相同的推文,但使用 Snscape。
要使用 Snscrap,我们必须首先在我们的 PC 上安装它的库。您可以通过键入:
pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git
Snscrape 包括两种从 Twitter 获取推文的方法:命令行界面 (CLI) 和 Python Wrapper。请记住,Python Wrapper 目前没有文档记录——但我们仍然可以通过反复试验来度过难关。
在本例中,我们将使用 Python Wrapper,因为它比 CLI 方法更直观。但是,如果您遇到一些代码问题,您可以随时向 GitHub 社区寻求帮助。贡献者将很乐意为您提供帮助。
要检索特定用户的推文,我们可以执行以下操作:
import snscrape.modules.twitter as sntwitter
import pandas as pd
# Created a list to append all tweet attributes(data)
attributes_container = []
# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
if i>100:
break
attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
# Creating a dataframe from the tweets list above
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])
让我们复习一下你可能第一眼看不懂的一些代码:
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('from:john').get_items()):
if i>100:
break
attributes_container.append([tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
# Creating a dataframe from the tweets list above
tweets_df = pd.DataFrame(attributes_container, columns=["Date Created", "Number of Likes", "Source of Tweet", "Tweets"])
在上面的代码中,所做的sntwitter.TwitterSearchScaper
是从我们传递给它的用户名(即 john)中返回一个推文对象。
正如我之前提到的,Snscrape 对推文的数量没有限制,因此它会返回来自该用户的许多推文。为了解决这个问题,我们需要添加枚举函数,该函数将遍历对象并添加一个计数器,以便我们可以访问用户最近的 100 条推文。
您可以看到,我们从每条推文中获得的属性语法与 Tweepy 中的类似。这些是我们可以从 Martin Beck 策划的 Snscape 推文中获得的属性列表。
学分:马丁贝克
可能会添加更多属性,因为 Snscape 库仍在开发中。例如上图中的,source
已替换为sourceLabel
. 如果你只传入source
它会返回一个对象。
如果你运行上面的代码,你应该也会看到类似这样的东西:
作者图片
现在让我们对通过搜索进行抓取做同样的事情。
import snscrape.modules.twitter as sntwitter
import pandas as pd
# Creating list to append tweet data to
attributes_container = []
# Using TwitterSearchScraper to scrape data and append tweets to list
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('sex for grades since:2021-07-05 until:2022-07-06').get_items()):
if i>150:
break
attributes_container.append([tweet.user.username, tweet.date, tweet.likeCount, tweet.sourceLabel, tweet.content])
# Creating a dataframe to load the list
tweets_df = pd.DataFrame(attributes_container, columns=["User", "Date Created", "Number of Likes", "Source of Tweet", "Tweet"])
同样,您可以使用 Snscrape 访问大量历史数据(与 Tweepy 不同,因为它的标准 API 不能超过 7 天。高级 API 是 30 天。)。所以我们可以在方法中传入我们想要开始搜索的日期和想要结束的日期sntwitter.TwitterSearchScraper()
。
我们在前面的代码中所做的基本上就是我们之前讨论过的。唯一要记住的是,直到与 Python 中的范围函数类似(也就是说,它不包括最后一个整数)。因此,如果您想从今天开始获取推文,则需要在“直到”参数中包含今天之后的一天。
作者的形象。
现在您也知道如何使用 Snscape 抓取推文了!
现在我们已经了解了每种方法的工作原理,您可能想知道何时使用哪种方法。
好吧,对于何时使用每种方法没有通用规则。一切都取决于问题偏好和您的用例。
如果你想获得无穷无尽的推文,你应该使用 Snscrap。但是,如果您想使用 Snscrape 无法提供的额外功能(例如地理定位),那么您绝对应该使用 Tweepy。它直接与 Twitter API 集成并提供完整的功能。
即便如此,Snscrape 是最常用的基本刮削方法。
结论
在本文中,我们学习了如何使用 Tweepy 和 Snscrap 从 Python 中抓取数据。但这只是对每种方法如何工作的简要概述。您可以通过浏览网络了解更多信息以获取更多信息。
我提供了一些有用的资源,如果您需要更多信息,可以使用它们。感谢您的阅读。
来源:https ://www.freecodecamp.org/news/python-web-scraping-tutorial/