1594329660
Data is the new oil, but it is definitely not cheap. We have data flowing in from all directions (web, apps, social media, etc.), and it is imperative that data scientists are able to mine some of it. In this blog, we will learn how to quickly scrape data from a website (for fun) using the Python library ‘BeautifulSoup’.
Plan of action
Anyone who has worked in customer experience or the hospitality industry understands the importance of customer satisfaction. NPS, or Net Promoter Score, is considered a benchmark for customer experience. Although NPS is a specially designed survey, there are other ways to understand customer sentiment. One of them is **customer feedback and ratings on the App Store** (of course, only if your app is available there).
So here is what we will do:
→ Take a random app (e.g., Facebook)
→ Go to its iTunes reviews
→ Extract the rating, comments, date, etc. that different users have given
→ Export them in a clean ‘csv/xlsx’ format.
Beautiful Soup _(aka BS4)_ is a Python package for parsing HTML and XML documents. It creates a parse tree for parsed pages that can be used to extract data from HTML, which is useful for web scraping. It runs on Python 3; releases up to 4.9.3 also support Python 2.7.
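The plan above (extract the rating, comment, and date of each review, then export to CSV) can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample feed, its tag names, and the helper names `extract_reviews` and `reviews_to_csv` are made up for the example and do not reflect the real review feed. To keep the sketch runnable without installing anything, it uses the standard library's `xml.etree.ElementTree` as a stand-in for the parse-tree step; with Beautiful Soup the same loop would typically be written around `soup.find_all("entry")`.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample; a real review feed will be laid out differently.
SAMPLE_FEED = """\
<feed>
  <entry>
    <updated>2020-07-01</updated>
    <rating>5</rating>
    <title>Great app</title>
    <content>Works well on my phone.</content>
  </entry>
  <entry>
    <updated>2020-07-02</updated>
    <rating>2</rating>
    <title>Too many ads</title>
    <content>The feed is cluttered.</content>
  </entry>
</feed>
"""

FIELDS = ["date", "rating", "title", "comment"]


def extract_reviews(xml_text):
    """Walk the parse tree and return one dict per <entry> element."""
    root = ET.fromstring(xml_text)
    reviews = []
    for entry in root.iter("entry"):
        reviews.append({
            "date": entry.findtext("updated", default=""),
            "rating": int(entry.findtext("rating", default="0")),
            "title": entry.findtext("title", default=""),
            "comment": entry.findtext("content", default=""),
        })
    return reviews


def reviews_to_csv(reviews):
    """Serialize the review dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(reviews)
    return buffer.getvalue()


reviews = extract_reviews(SAMPLE_FEED)
csv_text = reviews_to_csv(reviews)
print(csv_text)
```

To write a file instead of a string, the same `csv.DictWriter` can be pointed at `open("reviews.csv", "w", newline="")`; the file name here is likewise just a placeholder.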
iTunes makes it really easy to get app reviews from the Apple App Store. Facebook’s app id is 284882215, and we just need to add it to the following URL
#data-science #xml #data-mining #data analysis
1603029420
Network Neutrality, which literally protects freedom of speech, has become a controversial concept in the U.S.
The Internet is an essential commodity in contemporary life. No one disagrees. However, not everyone agrees on the relevance of network neutrality.
Net neutrality was founded on the idea that the Internet is open to all, with all websites treated equally, whatever the platform used to access them.
It upholds the idea that Internet Service Providers (ISPs) like Verizon and Comcast should not move selected data into “fast lanes” so that users can access it faster while blocking, slowing down, or otherwise discriminating against other content so that users cannot access it easily.
The idea is also that Internet access should be provided like a utility, without discrimination in how the service is delivered. A city’s water supply, for example, is a utility that affords the same water pressure to all, regardless of who the user is or why they consume it.
In other words, an ISP should not be allowed to make a huge global corporation’s website faster than a small business’s website. Tim Berners-Lee, the inventor of the World Wide Web, himself says,
“It’s time to recognize the internet as a basic human right. It means guaranteeing affordable access for all, ensuring internet packets are delivered without commercial or political discrimination, and protecting the privacy and freedom of web users regardless of where they live.”
In fact, the United Nations Human Rights Council, in 2012, determined that connecting to the internet is a human right. The UN Resolution condemned all attempts to block free speech online, and stated in conclusion, that “the same rights that people have offline must also be protected online, in particular, freedom of expression.” The resolution was updated and unanimously re-adopted twice, in 2014 and in 2016.
This principle of being fair to all content and websites took on enhanced significance during the global stay-at-home orders and the extensive remote work that followed.
#internet #freedom #rights #internet-as-a-right #universal-rights #good-company #latest-tech-stories #net-neutrality
1624595434
When scraping a website with Python, it’s common to use the urllib or the Requests libraries to send GET requests to the server in order to receive its information.
However, you’ll eventually need to send some information to the website yourself before receiving the data you want, maybe because it’s necessary to perform a log-in or to interact somehow with the page.
To execute such interactions, Selenium is a frequently used tool. However, it comes with some downsides: it is a bit slow and can be quite unstable at times. The alternative is to send a POST request containing the information the website needs, using the Requests library.
In fact, compared to Requests, Selenium is a very slow approach, since it actually opens a browser to navigate through the websites you collect data from. Of course, depending on the problem, you will eventually need to use it, but in other situations a POST request may be your best option, which makes it an important tool for your web scraping toolbox.
In this article, we’ll see a brief introduction to the POST method and how it can be implemented to improve your web scraping routines.
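As a rough sketch of what such a POST request looks like, the snippet below builds one with urllib from the standard library, so it runs without installing anything. The URL, the form field names, and the helper `build_post_request` are assumptions made up for illustration; a real site will expect its own fields. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and form fields, for illustration only.
LOGIN_URL = "https://example.com/login"
payload = {"username": "alice", "password": "s3cret"}


def build_post_request(url, form_fields):
    """Return a urllib Request that will be sent as a form-encoded POST."""
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    # Supplying `data` is what makes urllib use POST instead of GET.
    request = urllib.request.Request(url, data=body)
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


request = build_post_request(LOGIN_URL, payload)
print(request.get_method())
print(request.data)

# Sending it would look like this (left commented out to avoid network access):
# with urllib.request.urlopen(request) as response:
#     html = response.read().decode("utf-8")
```

With the Requests library, the equivalent call is `requests.post(LOGIN_URL, data=payload)`, which handles the form encoding for you.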
#python #web-scraping #requests #web-scraping-with-python #data-science #data-collection #python-tutorials #data-scraping
1620203018
HTML stands for Hypertext Markup Language, while XML stands for Extensible Markup Language. The purpose of HTML is to display data, with a focus on how the data looks. HTML therefore describes a web page’s structure and displays information, whereas XML structures, stores, and transfers information and describes what the data is.
In this article, HTML and XML shall be discussed in detail to understand the differences between them.
Hypertext Markup Language (HTML) is a markup language that displays data and describes a web page’s structure. Hypertext facilitates browsing the web through the hyperlinks an HTML page contains. Clicking a hyperlink takes you to another place on the internet, and there is no set order in which links must be followed.
Extensible Markup Language (XML) is a markup language developed by the World Wide Web Consortium (W3C). XML defines a set of rules for encoding documents in a format that can be read by both humans and machines. By using tags, XML defines the document’s structure and how it should be stored and transported. It enables the creation of web applications and web pages and is a dynamic language that transports data. It’s often used as the basis for many other document formats, some of which are as follows.
#html #html vs xml #xml
1619678404
“Markup language” refers to the way tags are used to define the page layout and the elements within the page. An HTML document consists of various HTML elements, each comprising tags and their content. HTML enables linking between documents, is static, and can ignore small errors. In HTML, some closing tags are optional. It can be defined as a markup language that makes the text more dynamic and interactive.
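The difference in strictness can be seen with a small experiment. This is a sketch using Python's standard-library parsers, not a browser: `html.parser` tokenizes markup with missing closing tags without complaint, while `xml.etree.ElementTree` rejects the same text as not well-formed. The snippet and the helper names `html_start_tags` and `is_well_formed_xml` are made up for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Illustrative snippet: the <li> and <p> elements are never closed.
SNIPPET = "<ul><li>first<li>second</ul><p>no closing tag"


class TagCollector(HTMLParser):
    """Record every start tag the lenient HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(markup):
    """Feed markup to the HTML parser and return the start tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


def is_well_formed_xml(markup):
    """Return True if the strict XML parser accepts the markup as-is."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(html_start_tags(SNIPPET))
print(is_well_formed_xml(SNIPPET))
print(is_well_formed_xml("<ul><li>first</li><li>second</li></ul>"))
```

The HTML parser reports all four start tags even though three elements are left open, whereas the XML parser only accepts the version in which every tag is closed.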
#software development #html #html vs xml #xml