Web Scraping Mountain Weather Forecasts using Python and a Raspberry Pi

Originally published by Andres Vourakis at https://towardsdatascience.com/web-scraping-mountain-weather-forecasts-using-python-and-a-raspberry-pi-f215fdf82c6b

Extracting data from a website without an API.

Motivation

Before walking you through the project, let me tell you a little bit about the motivation behind it. Aside from Data Science and Machine Learning, my other passion is spending time in the mountains. A trip to any mountain requires careful planning in order to minimize the risks, and that means paying close attention to the weather conditions as the summit day approaches. My absolute favorite website for this is Mountain-Forecast.com, which gives you the weather forecast for almost any mountain in the world at different elevations. The only problem is that it doesn’t offer any historical data (as far as I can tell), which can be useful when deciding whether to make the trip or wait for better conditions.

This problem had been in the back of my mind for a while, and I finally decided to do something about it. Below, I’ll describe how I wrote a web scraper for Mountain-Forecast.com using Python and Beautiful Soup, and set it up on a Raspberry Pi to collect the data on a daily basis.

If you’d rather skip to the code, check out the repository on GitHub.

Web Scraping

Inspecting the Website

In order to figure out which elements I needed to target, I started by inspecting the source code of the page. This is easily done by right-clicking on the element of interest and selecting Inspect. This brings up the HTML, where we can see the element that contains each field.

Lucky for me, the forecast information for every mountain is contained within a table. The only problem is that each day has multiple sub-columns associated with it (i.e., AM, PM, and night), so I would need to figure out a way to iterate through them. In addition, since weather forecasts are provided at different elevations, I would need to extract the link for each elevation and scrape each one individually.

Similarly, I inspected the directory containing the URLs for the highest 100 mountains in the United States.

This seemed like a much easier task, since all I needed from the table were the URLs and mountain names, in no specific order.

Parsing the Web Page using Beautiful Soup

After familiarizing myself with the HTML structure of the page, it was time to get started.

My first task was to collect the URLs for the mountains I was interested in. I wrote a couple of functions to store the information in a dictionary, where the key is the mountain name and the value is a list of all the URLs associated with it (one per elevation). Then I used the pickle module to serialize the dictionary and save it to a file so that it could be easily retrieved when needed. Here is the code I wrote to do that:

import datetime
import os
import pickle
import time
from urllib.parse import urljoin

import numpy as np
import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

# (datetime, numpy and pandas are used by the parsing and saving code further down)

def load_urls(urls_filename):
    """ Returns dictionary of mountain urls saved in a pickle file """

    full_path = os.path.join(os.getcwd(), urls_filename)

    with open(full_path, 'rb') as file:
        urls = pickle.load(file)

    return urls

def dump_urls(mountain_urls, urls_filename):
    """ Saves dictionary of mountain urls as a pickle file """

    full_path = os.path.join(os.getcwd(), urls_filename)

    with open(full_path, 'wb') as file:
        pickle.dump(mountain_urls, file)

def get_urls_by_elevation(url):
    """ Given a mountain url, returns a list of its urls by elevation """

    base_url = 'https://www.mountain-forecast.com/'
    full_url = urljoin(base_url, url)

    time.sleep(1)  # Delay to not bombard the website with requests
    page = requests.get(full_url)
    soup = bs(page.content, 'html.parser')

    elevation_items = soup.find('ul', attrs={'class': 'b-elevation__container'}).find_all('a', attrs={'class': 'js-elevation-link'})

    return [urljoin(base_url, item['href']) for item in elevation_items]

def get_mountains_urls(urls_filename='mountains_urls.pickle', url='https://www.mountain-forecast.com/countries/United-States?top100=yes'):
    """ Returns dictionary of mountain urls

    If a file with urls doesn't exist, then create a new one using "url" and return it
    """

    try:
        mountain_urls = load_urls(urls_filename)

    except:  # Is this better than checking if the file exists? Should I catch specific errors?

        directory_url = url

        page = requests.get(directory_url)
        soup = bs(page.content, 'html.parser')

        mountain_items = soup.find('ul', attrs={'class': 'b-list-table'}).find_all('li')
        mountain_urls = {item.find('a').get_text(): get_urls_by_elevation(item.find('a')['href']) for item in mountain_items}

        dump_urls(mountain_urls, urls_filename)

    finally:
        return mountain_urls
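
With these helpers in place, a single call either loads the cached dictionary or scrapes the top-100 directory and builds it from scratch (the sample key below is purely illustrative):

mountain_urls = get_mountains_urls()  # e.g. {'Mount Whitney': ['<url for each elevation>', ...], ...}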

My next task was to collect the weather forecast for each mountain in the dictionary. I used requests to get the content of the page and beautifulsoup4 to parse it.

# url is one elevation page for a given mountain (see mountain_urls above)
page = requests.get(url)
soup = bs(page.content, 'html.parser')

# Get data from header
forecast_table = soup.find('table', attrs={'class': 'forecast__table forecast__table--js'})
days = forecast_table.find('tr', attrs={'data-row': 'days'}).find_all('td')

# Get rows from body
times = forecast_table.find('tr', attrs={'data-row': 'time'}).find_all('td')
winds = forecast_table.find('tr', attrs={'data-row': 'wind'}).find_all('img')  # Use "img" instead of "td" to get direction of wind
summaries = forecast_table.find('tr', attrs={'data-row': 'summary'}).find_all('td')
rains = forecast_table.find('tr', attrs={'data-row': 'rain'}).find_all('td')
snows = forecast_table.find('tr', attrs={'data-row': 'snow'}).find_all('td')
max_temps = forecast_table.find('tr', attrs={'data-row': 'max-temperature'}).find_all('td')
min_temps = forecast_table.find('tr', attrs={'data-row': 'min-temperature'}).find_all('td')
chills = forecast_table.find('tr', attrs={'data-row': 'chill'}).find_all('td')
freezings = forecast_table.find('tr', attrs={'data-row': 'freezing-level'}).find_all('td')
sunrises = forecast_table.find('tr', attrs={'data-row': 'sunrise'}).find_all('td')
sunsets = forecast_table.find('tr', attrs={'data-row': 'sunset'}).find_all('td')

# Iterate over days
# Note: clean() is a small text-normalization helper, and rows and
# mountain_name are defined by the caller; all three live elsewhere in the repo.
for i, day in enumerate(days):
    current_day = clean(day.get_text())
    elevation = url.rsplit('/', 1)[-1]
    num_cols = int(day['data-columns'])

    if current_day != '':

        date = str(datetime.date(datetime.date.today().year, datetime.date.today().month, int(current_day.split(' ')[1])))  # Avoid using date format. Pandas adds 00:00:00 for some reason. Figure out better way to format

        # Iterate over forecast
        for j in range(i, i + num_cols):

            time_cell = clean(times[j].get_text())
            wind = clean(winds[j]['alt'])
            summary = clean(summaries[j].get_text())
            rain = clean(rains[j].get_text())
            snow = clean(snows[j].get_text())
            max_temp = clean(max_temps[j].get_text())
            min_temp = clean(min_temps[j].get_text())
            chill = clean(chills[j].get_text())
            freezing = clean(freezings[j].get_text())
            sunrise = clean(sunrises[j].get_text())
            sunset = clean(sunsets[j].get_text())

            rows.append(np.array([mountain_name, date, elevation, time_cell, wind, summary, rain, snow, max_temp, min_temp, chill, freezing, sunrise, sunset]))

As you can see from the code, I manually saved each element of interest into its own variable instead of iterating through the rows generically. It isn’t pretty, but I decided to do it that way since I wasn’t interested in all of the rows (e.g., weather maps and the freezing scale), and a few of them needed to be handled differently from the rest. A sketch of the more generic alternative follows below.
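
For reference, a more generic pass could collect every row of interest with a single dict comprehension. This is just a sketch of the alternative I decided against; it assumes the same forecast_table as above, and the wind row still needs its special img handling:

# Hypothetical generic extraction: one dict of cells keyed by data-row name,
# instead of one variable per row.
row_names = ['time', 'summary', 'rain', 'snow', 'max-temperature',
             'min-temperature', 'chill', 'freezing-level', 'sunrise', 'sunset']
cells = {name: forecast_table.find('tr', attrs={'data-row': name}).find_all('td')
         for name in row_names}
cells['wind'] = forecast_table.find('tr', attrs={'data-row': 'wind'}).find_all('img')  # wind direction lives in the img 'alt'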

Saving the data

Since my goal was to scrape daily, and the forecasts get updated every day, it was important to figure out a way to update old forecasts instead of creating duplicates, and to append the new ones. I used the pandas module to turn the data into a DataFrame (a two-dimensional data structure consisting of rows and columns) so I could easily manipulate it and then save it as a CSV file. Here is what the code looks like:

def save_data(rows):
    """ Saves the collected forecasts into a CSV file

    If the file already exists, then it updates the old forecasts
    as necessary and/or appends new ones.
    """

    column_names = ['mountain', 'date', 'elevation', 'time', 'wind', 'summary', 'rain', 'snow', 'max_temperature', 'min_temperature', 'chill', 'freezing_level', 'sunrise', 'sunset']

    today = datetime.date.today()
    dataset_name = os.path.join(os.getcwd(), '{:02d}{}_mountain_forecasts.csv'.format(today.month, today.year))  # e.g. 042019_mountain_forecasts.csv

    try:
        new_df = pd.DataFrame(rows, columns=column_names)
        old_df = pd.read_csv(dataset_name, dtype=object)

        new_df.set_index(column_names[:4], inplace=True)
        old_df.set_index(column_names[:4], inplace=True)

        # Update old forecasts and append new ones
        old_df.update(new_df)
        only_include = ~old_df.index.isin(new_df.index)
        combined = pd.concat([old_df[only_include], new_df])
        combined.to_csv(dataset_name)

    except FileNotFoundError:
        new_df.to_csv(dataset_name, index=False)
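
Tying it all together, the daily job essentially loads the URL dictionary, scrapes every elevation page, and hands the accumulated rows to save_data(). Here is a minimal sketch of such a driver; scrape_mountain() is a name I’m using for illustration (not necessarily the repository’s), assumed to wrap the forecast-parsing code shown earlier:

# Hypothetical top-level driver for the daily run. scrape_mountain() is
# assumed to append one entry to "rows" per forecast cell it parses.
def main():
    rows = []
    mountain_urls = get_mountains_urls()

    for mountain_name, urls in mountain_urls.items():
        for url in urls:  # one forecast page per elevation
            scrape_mountain(mountain_name, url, rows)

    save_data(rows)

if __name__ == '__main__':
    main()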

Once the data collection and manipulation were done, I ended up with a table with one row per mountain, date, elevation, and time of day, and columns for wind, summary, rain, snow, temperatures, wind chill, freezing level, sunrise, and sunset.


Running the Scraper on Raspberry Pi

The Raspberry Pi is a low-cost, credit-card-sized computer that can be used for a variety of projects, like retro-gaming emulation, home automation, robotics, or, in this case, web scraping. Running the scraper on a Raspberry Pi can be a better alternative to leaving your personal desktop or laptop running all the time, or investing in a server.

Setting it up

First I needed to install an operating system on the Raspberry Pi. I chose Raspbian Stretch Lite, a Debian-based operating system with no graphical desktop, just a terminal.

After installing Raspbian Stretch Lite, I used the command sudo raspi-config to open the configuration tool and change the password, expand the filesystem, change the hostname, and enable SSH.

Finally, I ran sudo apt-get update && sudo apt-get upgrade to make sure everything was up to date, and proceeded to install all of the dependencies necessary to run my script (i.e., Pandas, Beautiful Soup 4, etc.).
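
On a fresh Raspbian install, the Python packages this script needs can typically be pulled in with pip, assuming Python 3 and pip3 are already present: pip3 install requests beautifulsoup4 pandas numpy.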

Automating the script

In order to schedule the script to run daily, I used cron, a time-based job scheduler for Unix-like operating systems (e.g., Ubuntu, Raspbian, macOS). With the following crontab entry, the script was scheduled to run daily at 10:00 AM:

0 10 * * * /usr/bin/python3 /home/pi/scraper.py
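
The entry goes into the crontab via crontab -e. One common variation, which I’ll note here as an optional tweak, is to redirect the script’s output to a log file so unattended runs are easier to debug:

0 10 * * * /usr/bin/python3 /home/pi/scraper.py >> /home/pi/scraper.log 2>&1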

Since SSH is enabled on the Raspberry Pi, the final set-up is completely headless: I can connect to it from a terminal on my personal laptop or phone (no need for an extra monitor and keyboard) and keep an eye on the scraper.

I hope you enjoyed this walk-through and that it inspires you to code your own web scraper using Python and a Raspberry Pi. If you have any questions or feedback, I’ll be happy to read them in the comments below :)
