A Step-by-Step Tutorial for Conducting Sentiment Analysis

Following the steps from my previous articles, I preprocessed the text data and transformed the “cleaned” data into a sparse matrix. Please follow the links to check out more details.

Now I am at the last step of conducting news sentiment analysis on WTI crude oil future prices. In this article, I will discuss the use of logistic regression, and some interest results I found from my project. I have some background introduction to this project here.

Define and Construct the Target Value

As discussed briefly in my previous articles, conducting sentiment analysis is solving a classification problem (usually binary) by machine learning models and text data. Solving a classification problem is solving a supervised machine learning problem, which requires both features and target values when training the model. If it is a binary classification problem, the target values are usually positive sentiment and negative sentiment. They are assigned and detailedly defined depending on the context of your research question.

Take my project as an example, the purpose of my project is to predict the change of the crude oil future prices from recently released news articles. I define the positive news as the ones that would predict the price increase, while the negative ones would predict the price decrease. Since I have already collected and transformed the text data and will use them as the features, I now need to assign the target values for my dataset.

The target value of my project is the directions of price change with respect to different news articles. I collected the high-frequency trading data from Bloomberg for the WTI crude oil future close price, which is updating every five minutes.

#machine-learning #gridsearchcv #logistic-regression #sentiment-analysis #data-science

What is GEEK

Buddha Community

A Step-by-Step Tutorial for Conducting Sentiment Analysis
John  Smith

John Smith


Find the Best Restaurant Mobile App Development Company in Abu Dhbai

The era of mobile app development has completely changed the scenario for businesses in regions like Abu Dhabi. Restaurants and food delivery businesses are experiencing huge benefits via smart business applications. The invention and development of the food ordering app have helped all-scale businesses reach new customers and boost sales and profit. 

As a result, many business owners are searching for the best restaurant mobile app development company in Abu Dhabi. If you are also searching for the same, this article is helpful for you. It will let you know the step-by-step process to hire the right team of restaurant mobile app developers. 

Step-by-Step Process to Find the Best Restaurant App Development Company

Searching for the top mobile app development company in Abu Dhabi? Don't know the best way to search for professionals? Don't panic! Here is the step-by-step process to hire the best professionals. 

#Step 1 – Know the Company's Culture

Knowing the organization's culture is very crucial before finalizing a food ordering app development company in Abu Dhabi. An organization's personality is shaped by its common beliefs, goals, practices, or company culture. So, digging into the company culture reveals the core beliefs of the organization, its objectives, and its development team. 

Now, you might be wondering, how will you identify the company's culture? Well, you can take reference from the following sources – 

  • Social media posts 
  • App development process
  • About us Page
  • Client testimonials

#Step 2 - Refer to Clients' Reviews

Another best way to choose the On-demand app development firm for your restaurant business is to refer to the clients' reviews. Reviews are frequently available on the organization's website with a tag of "Reviews" or "Testimonials." It's important to read the reviews as they will help you determine how happy customers are with the company's app development process. 

You can also assess a company's abilities through reviews and customer testimonials. They can let you know if the mobile app developers create a valuable app or not. 

#Step 3 – Analyze the App Development Process

Regardless of the company's size or scope, adhering to the restaurant delivery app development process will ensure the success of your business application. Knowing the processes an app developer follows in designing and producing a top-notch app will help you know the working process. Organizations follow different app development approaches, so getting well-versed in the process is essential before finalizing any mobile app development company. 

#Step 4 – Consider Previous Experience

Besides considering other factors, considering the previous experience of the developers is a must. You can obtain a broad sense of the developer's capacity to assist you in creating a unique mobile application for a restaurant business.

You can also find out if the developers' have contributed to the creation of other successful applications or not. It will help you know the working capacity of a particular developer or organization. Prior experience is essential to evaluating their work. For instance, whether they haven't previously produced an app similar to yours or not. 

#Step 5 – Check for Their Technical Support

As you expect a working and successful restaurant mobile app for your business, checking on this factor is a must. A well-established organization is nothing without a good technical support team. So, ensure whatever restaurant mobile app development company you choose they must be well-equipped with a team of dedicated developers, designers, and testers. 

Strong tech support from your mobile app developers will help you identify new bugs and fix them bugs on time. All this will ensure the application's success. 

#Step 6 – Analyze Design Standards

Besides focusing on an organization's development, testing, and technical support, you should check the design standards. An appealing design is crucial in attracting new users and keeping the existing ones stick to your services. So, spend some time analyzing the design standards of an organization. Now, you might be wondering, how will you do it? Simple! By looking at the organization's portfolio. 

Whether hiring an iPhone app development company or any other, these steps apply to all. So, don't miss these steps. 

#Step 7 – Know Their Location

Finally, the last yet very crucial factor that will not only help you finalize the right person for your restaurant mobile app development but will also decide the mobile app development cost. So, you have to choose the location of the developers wisely, as it is a crucial factor in defining the cost. 

Summing Up!!!

Restaurant mobile applications have taken the food industry to heights none have ever considered. As a result, the demand for restaurant mobile app development companies has risen greatly, which is why businesses find it difficult to finalize the right person. But, we hope that after referring to this article, it will now be easier to hire dedicated developers under the desired budget. So, begin the hiring process now and get a well-craft food ordering app in hand. 

Dylan  Iqbal

Dylan Iqbal


Matplotlib Cheat Sheet: Plotting in Python

This Matplotlib cheat sheet introduces you to the basics that you need to plot your data with Python and includes code samples.

Data visualization and storytelling with your data are essential skills that every data scientist needs to communicate insights gained from analyses effectively to any audience out there. 

For most beginners, the first package that they use to get in touch with data visualization and storytelling is, naturally, Matplotlib: it is a Python 2D plotting library that enables users to make publication-quality figures. But, what might be even more convincing is the fact that other packages, such as Pandas, intend to build more plotting integration with Matplotlib as time goes on.

However, what might slow down beginners is the fact that this package is pretty extensive. There is so much that you can do with it and it might be hard to still keep a structure when you're learning how to work with Matplotlib.   

DataCamp has created a Matplotlib cheat sheet for those who might already know how to use the package to their advantage to make beautiful plots in Python, but that still want to keep a one-page reference handy. Of course, for those who don't know how to work with Matplotlib, this might be the extra push be convinced and to finally get started with data visualization in Python. 

You'll see that this cheat sheet presents you with the six basic steps that you can go through to make beautiful plots. 

Check out the infographic by clicking on the button below:

Python Matplotlib cheat sheet

With this handy reference, you'll familiarize yourself in no time with the basics of Matplotlib: you'll learn how you can prepare your data, create a new plot, use some basic plotting routines to your advantage, add customizations to your plots, and save, show and close the plots that you make.

What might have looked difficult before will definitely be more clear once you start using this cheat sheet! Use it in combination with the Matplotlib Gallery, the documentation.


Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.

Prepare the Data 

1D Data 

>>> import numpy as np
>>> x = np.linspace(0, 10, 100)
>>> y = np.cos(x)
>>> z = np.sin(x)

2D Data or Images 

>>> data = 2 * np.random.random((10, 10))
>>> data2 = 3 * np.random.random((10, 10))
>>> Y, X = np.mgrid[-3:3:100j, -3:3:100j]
>>> U = 1 X** 2 + Y
>>> V = 1 + X Y**2
>>> from matplotlib.cbook import get_sample_data
>>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy'))

Create Plot

>>> import matplotlib.pyplot as plt


>>> fig = plt.figure()
>>> fig2 = plt.figure(figsize=plt.figaspect(2.0))


>>> fig.add_axes()
>>> ax1 = fig.add_subplot(221) #row-col-num
>>> ax3 = fig.add_subplot(212)
>>> fig3, axes = plt.subplots(nrows=2,ncols=2)
>>> fig4, axes2 = plt.subplots(ncols=3)

Save Plot 

>>> plt.savefig('foo.png') #Save figures
>>> plt.savefig('foo.png',  transparent=True) #Save transparent figures

Show Plot

>>> plt.show()

Plotting Routines 

1D Data 

>>> fig, ax = plt.subplots()
>>> lines = ax.plot(x,y) #Draw points with lines or markers connecting them
>>> ax.scatter(x,y) #Draw unconnected points, scaled or colored
>>> axes[0,0].bar([1,2,3],[3,4,5]) #Plot vertical rectangles (constant width)
>>> axes[1,0].barh([0.5,1,2.5],[0,1,2]) #Plot horiontal rectangles (constant height)
>>> axes[1,1].axhline(0.45) #Draw a horizontal line across axes
>>> axes[0,1].axvline(0.65) #Draw a vertical line across axes
>>> ax.fill(x,y,color='blue') #Draw filled polygons
>>> ax.fill_between(x,y,color='yellow') #Fill between y values and 0

2D Data 

>>> fig, ax = plt.subplots()
>>> im = ax.imshow(img, #Colormapped or RGB arrays
      cmap= 'gist_earth', 
      interpolation= 'nearest',
>>> axes2[0].pcolor(data2) #Pseudocolor plot of 2D array
>>> axes2[0].pcolormesh(data) #Pseudocolor plot of 2D array
>>> CS = plt.contour(Y,X,U) #Plot contours
>>> axes2[2].contourf(data1) #Plot filled contours
>>> axes2[2]= ax.clabel(CS) #Label a contour plot

Vector Fields 

>>> axes[0,1].arrow(0,0,0.5,0.5) #Add an arrow to the axes
>>> axes[1,1].quiver(y,z) #Plot a 2D field of arrows
>>> axes[0,1].streamplot(X,Y,U,V) #Plot a 2D field of arrows

Data Distributions 

>>> ax1.hist(y) #Plot a histogram
>>> ax3.boxplot(y) #Make a box and whisker plot
>>> ax3.violinplot(z)  #Make a violin plot

Plot Anatomy & Workflow 

Plot Anatomy 




The basic steps to creating plots with matplotlib are:

1 Prepare Data
2 Create Plot
3 Plot
4 Customized Plot
5 Save Plot
6 Show Plot

>>> import matplotlib.pyplot as plt
>>> x = [1,2,3,4]  #Step 1
>>> y = [10,20,25,30] 
>>> fig = plt.figure() #Step 2
>>> ax = fig.add_subplot(111) #Step 3
>>> ax.plot(x, y, color= 'lightblue', linewidth=3)  #Step 3, 4
>>> ax.scatter([2,4,6],
          color= 'darkgreen',
          marker= '^' )
>>> ax.set_xlim(1, 6.5)
>>> plt.savefig('foo.png' ) #Step 5
>>> plt.show() #Step 6

Close and Clear 

>>> plt.cla()  #Clear an axis
>>> plt.clf(). #Clear the entire figure
>>> plt.close(). #Close a window

Plotting Customize Plot 

Colors, Color Bars & Color Maps 

>>> plt.plot(x, x, x, x**2, x, x** 3)
>>> ax.plot(x, y, alpha = 0.4)
>>> ax.plot(x, y, c= 'k')
>>> fig.colorbar(im, orientation= 'horizontal')
>>> im = ax.imshow(img,
            cmap= 'seismic' )


>>> fig, ax = plt.subplots()
>>> ax.scatter(x,y,marker= ".")
>>> ax.plot(x,y,marker= "o")


>>> plt.plot(x,y,linewidth=4.0)
>>> plt.plot(x,y,ls= 'solid') 
>>> plt.plot(x,y,ls= '--') 
>>> plt.plot(x,y,'--' ,x**2,y**2,'-.' ) 
>>> plt.setp(lines,color= 'r',linewidth=4.0)

Text & Annotations 

>>> ax.text(1,
           'Example Graph', 
            style= 'italic' )
>>> ax.annotate("Sine", 
xy=(8, 0),
xycoords= 'data', 
xytext=(10.5, 0),
textcoords= 'data', 
arrowprops=dict(arrowstyle= "->", 


>>> plt.title(r '$sigma_i=15$', fontsize=20)

Limits, Legends and Layouts 

Limits & Autoscaling 

>>> ax.margins(x=0.0,y=0.1) #Add padding to a plot
>>> ax.axis('equal')  #Set the aspect ratio of the plot to 1
>>> ax.set(xlim=[0,10.5],ylim=[-1.5,1.5])  #Set limits for x-and y-axis
>>> ax.set_xlim(0,10.5) #Set limits for x-axis


>>> ax.set(title= 'An Example Axes',  #Set a title and x-and y-axis labels
            ylabel= 'Y-Axis', 
            xlabel= 'X-Axis')
>>> ax.legend(loc= 'best')  #No overlapping plot elements


>>> ax.xaxis.set(ticks=range(1,5),  #Manually set x-ticks
             ticklabels=[3,100, 12,"foo" ])
>>> ax.tick_params(axis= 'y', #Make y-ticks longer and go in and out
             direction= 'inout', 

Subplot Spacing 

>>> fig3.subplots_adjust(wspace=0.5,   #Adjust the spacing between subplots
>>> fig.tight_layout() #Fit subplot(s) in to the figure area

Axis Spines 

>>> ax1.spines[ 'top'].set_visible(False) #Make the top axis line for a plot invisible
>>> ax1.spines['bottom' ].set_position(( 'outward',10))  #Move the bottom axis line outward

Have this Cheat Sheet at your fingertips

Original article source at https://www.datacamp.com

#matplotlib #cheatsheet #python

Sofia  Maggio

Sofia Maggio


Sentiment Analysis in Python using Machine Learning

Sentiment analysis or opinion mining is a simple task of understanding the emotions of the writer of a particular text. What was the intent of the writer when writing a certain thing?

We use various natural language processing (NLP) and text analysis tools to figure out what could be subjective information. We need to identify, extract and quantify such details from the text for easier classification and working with the data.

But why do we need sentiment analysis?

Sentiment analysis serves as a fundamental aspect of dealing with customers on online portals and websites for the companies. They do this all the time to classify a comment as a query, complaint, suggestion, opinion, or just love for a product. This way they can easily sort through the comments or questions and prioritize what they need to handle first and even order them in a way that looks better. Companies sometimes even try to delete content that has a negative sentiment attached to it.

It is an easy way to understand and analyze public reception and perception of different ideas and concepts, or a newly launched product, maybe an event or a government policy.

Emotion understanding and sentiment analysis play a huge role in collaborative filtering based recommendation systems. Grouping together people who have similar reactions to a certain product and showing them related products. Like recommending movies to people by grouping them with others that have similar perceptions for a certain show or movie.

Lastly, they are also used for spam filtering and removing unwanted content.

How does sentiment analysis work?

NLP or natural language processing is the basic concept on which sentiment analysis is built upon. Natural language processing is a superclass of sentiment analysis that deals with understanding all kinds of things from a piece of text.

NLP is the branch of AI dealing with texts, giving machines the ability to understand and derive from the text. For tasks such as virtual assistant, query solving, creating and maintaining human-like conversations, summarizing texts, spam detection, sentiment analysis, etc. it includes everything from counting the number of words to a machine writing a story, indistinguishable from human texts.

Sentiment analysis can be classified into various categories based on various criteria. Depending upon the scope it can be classified into document-level sentiment analysis, sentence level sentiment analysis, and sub sentence level or phrase level sentiment analysis.

Also, a very common classification is based on what needs to be done with the data or the reason for sentiment analysis. Examples of which are

  • Simple classification of text into positive, negative or neutral. It may also advance into fine grained answers like very positive or moderately positive.
  • Aspect-based sentiment analysis- where we figure out the sentiment along with a specific aspect it is related to. Like identifying sentiments regarding various aspects or parts of a car in user reviews, identifying what feature or aspect was appreciated or disliked.
  • The sentiment along with an action associated with it. Like mails written to customer support. Understanding if it is a query or complaint or suggestion etc

Based on what needs to be done and what kind of data we need to work with there are two major methods of tackling this problem.

  • Matching rules based sentiment analysis: There is a predefined list of words for each type of sentiment needed and then the text or document is matched with the lists. The algorithm then determines which type of words or which sentiment is more prevalent in it.
  • This type of rule based sentiment analysis is easy to implement, but lacks flexibility and does not account for context.
  • Automatic sentiment analysis: They are mostly based on supervised machine learning algorithms and are actually very useful in understanding complicated texts. Algorithms in this category include support vector machine, linear regression, rnn, and its types. This is what we are gonna explore and learn more about.

In this machine learning project, we will use recurrent neural network for sentiment analysis in python.

#machine learning tutorials #machine learning project #machine learning sentiment analysis #python sentiment analysis #sentiment analysis

Garry Taylor

Garry Taylor


Python Data Visualization: Bokeh Cheat Sheet

A handy cheat sheet for interactive plotting and statistical charts with Bokeh.

Bokeh distinguishes itself from other Python visualization libraries such as Matplotlib or Seaborn in the fact that it is an interactive visualization library that is ideal for anyone who would like to quickly and easily create interactive plots, dashboards, and data applications. 

Bokeh is also known for enabling high-performance visual presentation of large data sets in modern web browsers. 

For data scientists, Bokeh is the ideal tool to build statistical charts quickly and easily; But there are also other advantages, such as the various output options and the fact that you can embed your visualizations in applications. And let's not forget that the wide variety of visualization customization options makes this Python library an indispensable tool for your data science toolbox.

Now, DataCamp has created a Bokeh cheat sheet for those who have already taken the course and that still want a handy one-page reference or for those who need an extra push to get started.

In short, you'll see that this cheat sheet not only presents you with the five steps that you can go through to make beautiful plots but will also introduce you to the basics of statistical charts. 

Python Bokeh Cheat Sheet

In no time, this Bokeh cheat sheet will make you familiar with how you can prepare your data, create a new plot, add renderers for your data with custom visualizations, output your plot and save or show it. And the creation of basic statistical charts will hold no secrets for you any longer. 

Boost your Python data visualizations now with the help of Bokeh! :)

Plotting With Bokeh

The Python interactive visualization library Bokeh enables high-performance visual presentation of large datasets in modern web browsers.

Bokeh's mid-level general-purpose bokeh. plotting interface is centered around two main components: data and glyphs.

The basic steps to creating plots with the bokeh. plotting interface are:

  1. Prepare some data (Python lists, NumPy arrays, Pandas DataFrames and other sequences of values)
  2. Create a new plot
  3. Add renderers for your data, with visual customizations
  4. Specify where to generate the output
  5. Show or save the results
>>> from bokeh.plotting import figure
>>> from bokeh.io import output_file, show
>>> x = [1, 2, 3, 4, 5] #Step 1
>>> y = [6, 7, 2, 4, 5]
>>> p = figure(title="simple line example", #Step 2
>>> p.line(x, y, legend="Temp.", line_width=2) #Step 3
>>> output_file("lines.html") #Step 4
>>> show(p) #Step 5

1. Data 

Under the hood, your data is converted to Column Data Sources. You can also do this manually:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.OataFrame(np.array([[33.9,4,65, 'US'], [32.4, 4, 66, 'Asia'], [21.4, 4, 109, 'Europe']]),
                     columns= ['mpg', 'cyl',   'hp',   'origin'],
                      index=['Toyota', 'Fiat', 'Volvo'])

>>> from bokeh.models import ColumnOataSource
>>> cds_df = ColumnOataSource(df)

2. Plotting 

>>> from bokeh.plotting import figure
>>>p1= figure(plot_width=300, tools='pan,box_zoom')
>>> p2 = figure(plot_width=300, plot_height=300,
x_range=(0, 8), y_range=(0, 8))
>>> p3 = figure()

3. Renderers & Visual Customizations 


Scatter Markers 
Bokeh Scatter Markers

>>> p1.circle(np.array([1,2,3]), np.array([3,2,1]), fill_color='white')
>>> p2.square(np.array([1.5,3.5,5.5]), [1,4,3],
color='blue', size=1)

Line Glyphs 

Bokeh Line Glyphs

>>> pl.line([1,2,3,4], [3,4,5,6], line_width=2)
>>> p2.multi_line(pd.DataFrame([[1,2,3],[5,6,7]]),

Customized Glyphs

Selection and Non-Selection Glyphs 

Selection Glyphs

>>> p = figure(tools='box_select')
>>> p. circle ('mpg', 'cyl', source=cds_df,

Hover Glyphs

Hover Glyphs

>>> from bokeh.models import HoverTool
>>>hover= HoverTool(tooltips=None, mode='vline')
>>> p3.add_tools(hover)

Color Mapping 

Bokeh Colormapping Glyphs

>>> from bokeh.models import CategoricalColorMapper
>>> color_mapper = CategoricalColorMapper(
             factors= ['US', 'Asia', 'Europe'],
             palette= ['blue', 'red', 'green'])
>>>  p3. circle ('mpg', 'cyl', source=cds_df,
                 transform=color_mapper), legend='Origin')

4. Output & Export 


>>> from bokeh.io import output_notebook, show
>>> output_notebook()


Standalone HTML 

>>> from bokeh.embed import file_html
>>> from bokeh.resources import CON
>>> html = file_html(p, CON, "my_plot")

>>> from  bokeh.io  import  output_file,  show
>>> output_file('my_bar_chart.html',  mode='cdn')


>>> from bokeh.embed import components
>>> script, div= components(p)


>>> from bokeh.io import export_png
>>> export_png(p, filename="plot.png")


>>> from bokeh.io import export_svgs
>>> p. output_backend = "svg"
>>> export_svgs(p,filename="plot.svg")

Legend Location 

Inside Plot Area 

>>> p.legend.location = 'bottom left'

Outside Plot Area 

>>> from bokeh.models import Legend
>>> r1 = p2.asterisk(np.array([1,2,3]), np.array([3,2,1])
>>> r2 = p2.line([1,2,3,4], [3,4,5,6])
>>> legend = Legend(items=[("One" ,[p1, r1]),("Two",[r2])], location=(0, -30))
>>> p.add_layout(legend, 'right')

Legend Background & Border 

>>> p.legend. border_line_color = "navy"
>>> p.legend.background_fill_color = "white"

Legend Orientation 

>>> p.legend.orientation = "horizontal"
>>> p.legend.orientation = "vertical"

Rows & Columns Layout


>>> from bokeh.layouts import row
>>>layout= row(p1,p2,p3)


>>> from bokeh.layouts import columns
>>>layout= column(p1,p2,p3)

Nesting Rows & Columns 

>>>layout= row(column(p1,p2), p3)

Grid Layout 

>>> from bokeh.layouts import gridplot
>>> rowl = [p1,p2]
>>> row2 = [p3]
>>> layout = gridplot([[p1, p2],[p3]])

Tabbed Layout 

>>> from bokeh.models.widgets import Panel, Tabs
>>> tab1 = Panel(child=p1, title="tab1")
>>> tab2 = Panel(child=p2, title="tab2")
>>> layout = Tabs(tabs=[tab1, tab2])

Linked Plots

Linked Axes 

Linked Axes
>>> p2.x_range = p1.x_range
>>> p2.y_range = p1.y_range

Linked Brushing 

>>> p4 = figure(plot_width = 100, tools='box_select,lasso_select')
>>> p4.circle('mpg', 'cyl' , source=cds_df)
>>> p5 = figure(plot_width = 200, tools='box_select,lasso_select')
>>> p5.circle('mpg', 'hp', source=cds df)
>>>layout= row(p4,p5)

5. Show or Save Your Plots  

>>> show(p1)
>>> show(layout)
>>> save(p1)

Have this Cheat Sheet at your fingertips

Original article source at https://www.datacamp.com

#python #datavisualization #bokeh #cheatsheet

Autumn  Blick

Autumn Blick


R Tutorial: Better Blog Post Analysis with googleAnalyticsR

In my previous role as a marketing data analyst for a blogging company, one of my most important tasks was to track how blog posts performed.

On the surface, it’s a fairly straightforward goal. With Google Analytics, you can quickly get just about any metric you need for your blog posts, for any date range.

But when it comes to comparing blog post performance, things get a bit trickier.

For example, let’s say we want to compare the performance of the blog posts we published on the Dataquest blog in June (using the month of June as our date range).

But wait… two blog posts with more than 1,000 pageviews were published earlier in the month, And the two with fewer than 500 pageviews were published at the end of the month. That’s hardly a fair comparison!

My first solution to this problem was to look up each post individually, so that I could make an even comparison of how each post performed in their first day, first week, first month, etc.

However, that required a lot of manual copy-and-paste work, which was extremely tedious if I wanted to compare more than a few posts, date ranges, or metrics at a time.

But then, I learned R, and realized that there was a much better way.

In this post, we’ll walk through how it’s done, so you can do my better blog post analysis for yourself!

What we’ll need

To complete this tutorial, you’ll need basic knowledge of R syntax and the tidyverse, and access to a Google Analytics account.

Not yet familiar with the basics of R? We can help with that! Our interactive online courses teach you R from scratch, with no prior programming experience required. Sign up and start today!

You’ll also need the dyplrlubridate, and stringr packages installed — which, as a reminder, you can do with the install.packages() command.

Finally, you will need a CSV of the blog posts you want to analyze. Here’s what’s in my dataset:

post_url: the page path of the blog post

post_date: the date the post was published (formatted m/d/yy)

category: the blog category the post was published in (optional)

title: the title of the blog post (optional)

Depending on your content management system, there may be a way for you to automate gathering this data — but that’s out of the scope of this tutorial!

For this tutorial, we’ll use a manually-gathered dataset of the past ten Dataquest blog posts.

#data science tutorials #promote #r #r tutorial #r tutorials #rstats #tutorial #tutorials