For starters, I work with a client who is considered one of the biggest retail giants in my country (India). They operate a chain of more than 180 stores pan India and have 27 brands registered under their name. Now you might wonder why I am talking about my client when I was supposed to talk about analytics and the work I do. The reason I am giving this background is that I want you to imagine the amount of data that gets generated every day. Just to give you a flavor, the data points captured include the number of walk-ins, sales in terms of both revenue and product quantity, goods returned, loyalty programs, etc. These and many other data points are collected for all 180+ stores. Just imagine the volume of data generated every single day.
I work in the field of retail analytics. The scope for analysis in the retail space is tremendous: marketing, logistics, loyalty programs, customer segmentation, store segmentation, and more. For any analytics project, it is very important to define the objective and scope. Doing so creates the base-level framework we need to execute, and it helps us understand the kind of data points we would need to gather to perform our analysis. Gathering data is simpler if we know what we are looking for and where to find it. In my case, I have to deal with multiple sources of data: MySQL and PostgreSQL databases, and even CSV files for some static data. Once you gather the data points you need, the next step is to clean that data. I have come across so many articles where people share their experience of spending 60–70% of their time gathering and cleaning data. This is 100% true in my case. I spend a considerable amount of time writing SQL queries to get the right data points with the right calculations.
Personal experience highlight: the data you pull from the database for your analysis and the data the management team sees in various reports might differ. In my case, the reports presented to the management team contained a lot of internal filters that I was totally unaware of. I learned this the hard way. However, things like this happen in an organization, and they become learning experiences for future projects.
Alright, I’ll tell you what I do. But before that, I need to explain one term, a term that describes my entire project. It’s called Sales Per Square Foot, aka SPSF. Some even refer to it as SPF.
You might have visited a retail outlet of a specific brand at some point in time. When you enter the store, you fall in awe of just how huge it is in terms of space. Many have multiple floors. Now, this huge space has its own set of pros and cons.
Pros:
Cons:
So, when you visit these stores, you'll come across specific brands placed in specific locations. Why do you think they are where they are? A lot of research and data analysis goes into deciding brand placement within a store. There’s also an entire discipline of visual merchandising connected to it.
Be patient. Everything is coming together.
In the retail industry, it is very important to measure the performance of each individual store. This helps the company understand which of its stores are performing best and which are not. Based on this, it can make critical decisions like expanding or shutting down a specific store. There are various metrics for measuring store performance. One such measure is SPSF.
Let’s dissect the term: Sales Per Square Foot (SPSF). Commonsensically, I do what the term says: I measure the SPSF of stores and analyze the data to find patterns, then come up with suggestions on how we can improve it. Let’s take an example:
Say store A generates revenue of Rs. 1,00,00,000 (1 crore) in the fiscal year 2018–2019, and the total carpet area of the store is 20,000 sq. ft.
SPSF of store A = 1,00,00,000 / 20,000
Thus, the SPSF of store A = Rs. 500 per sq. ft.
To interpret this result in simple language: store A generated Rs. 500 for every square foot of area in the fiscal year 2018–2019. What an interesting way to measure store performance.
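To make the arithmetic concrete, here is a minimal Python sketch of the SPSF calculation; the store names and figures below are hypothetical illustrations, not client data:
# SPSF = revenue / carpet area (hypothetical figures)
stores = {
    "Store A": {"revenue": 1_00_00_000, "carpet_area_sqft": 20_000},
    "Store B": {"revenue": 75_00_000, "carpet_area_sqft": 10_000},
}

for name, s in stores.items():
    spsf = s["revenue"] / s["carpet_area_sqft"]
    print(f"{name}: SPSF = Rs. {spsf:.0f} per sq. ft.")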
However, there is one critical point I want to highlight. A typical store consists of cash counters, changing rooms (in the case of an apparel outlet), escalators, elevators, walking areas, display fixtures, a storage room, etc. One could argue that the space occupied by escalators or elevators isn’t where we keep products for our customers to purchase, and hence that space doesn’t contribute to the overall revenue of the store. Let’s call this “non-selling” space. So now we have two types of space in the same store: one where the actual products are displayed, called “selling space”, and the other where no products are on display (cash counters, elevators, escalators, storage rooms), called “non-selling space”. Let me surprise you with another fact based on my experience.
#data-analysis #data-science #analytics #retail-industry
This article is a practical guide to unsupervised learning algorithms in machine learning. Machine learning is a fast-growing technology that allows computers to learn from the past and predict the future. It uses numerous algorithms to build mathematical models and predict future trends. Machine learning (ML) has widespread applications in industry, including speech recognition, image recognition, churn prediction, email filtering, chatbot development, recommender systems, and much more.
Machine learning (ML) can be classified into three main categories: supervised, unsupervised, and reinforcement learning. In supervised learning, the model is trained on labeled data, while in unsupervised learning, unlabeled data is provided to the model. Reinforcement learning is feedback-based learning in which an agent collects a reward for each correct action and gets a penalty for a wrong decision. The goal of the learning agent is to maximize reward and reduce error.
In unsupervised learning, the model learns from unlabeled data without supervision.
Unsupervised learning uses machine learning techniques to cluster unlabeled data based on similarities and differences. The algorithms discover hidden patterns in data without human intervention, aiming to arrange raw data into new features or to group together data with similar patterns.
For instance, to predict the churn rate, we provide unlabeled data to our model. There is no label indicating whether a customer has churned or not. The model analyzes the data and finds hidden patterns to categorize customers into two clusters: churned and non-churned.
Unsupervised algorithms can be used for three tasks—clustering, dimensionality reduction, and association. Below, we will highlight some commonly used clustering and association algorithms.
Clustering, or cluster analysis, is a popular data mining technique for unsupervised learning. The clustering approach works to group non-labeled data based on similarities and differences. Unlike supervised learning, clustering algorithms discover natural groupings in data.
A good clustering method produces high-quality clusters with high intra-class similarity (data within a cluster is similar) and low inter-class similarity (data in one cluster is dissimilar to data in other clusters).
Clustering can be defined as the grouping of data points into clusters such that similar data points end up in the same group, and each group has little in common with the others. Here, we will discuss two popular clustering techniques: K-Means clustering and DBSCAN clustering.
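As an aside not in the original tutorial, this notion of cluster quality can be quantified with the silhouette score; the following is a minimal scikit-learn sketch on synthetic data (the dataset and parameter values are illustrative assumptions):
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Toy data: three well-separated blobs
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)
labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# Close to 1 = tight, well-separated clusters; near 0 = overlapping clusters
print(f"Silhouette score: {silhouette_score(X, labels):.2f}")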
K-Means is the simplest unsupervised technique used to solve clustering problems. It groups unlabeled data into clusters. The value K defines the number of clusters; you tell the algorithm how many clusters to create.
K-Means is a centroid-based algorithm in which each cluster is associated with a centroid. The goal is to minimize the sum of the distances between the data points and their corresponding cluster centroids.
It is an iterative approach that breaks down the unlabeled data into different clusters so that each data point belongs to a group with similar characteristics.
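For reference (this formula is not spelled out in the original tutorial, but it is the standard K-Means objective), the quantity being minimized is the within-cluster sum of squared distances:

J = \sum_{i=1}^{K} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2

where C_i is the set of points assigned to cluster i and \mu_i is that cluster's centroid.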
K-Means clustering performs two tasks:
1. It determines the best positions for the K center points (centroids) through an iterative process.
2. It assigns each data point to its closest centroid.
An illustration of K-means clustering. Image source
“DBSCAN” stands for “Density-Based Spatial Clustering of Applications with Noise.” There are three key ideas in DBSCAN: density, clustering, and noise. The algorithm uses the notion of density-based clustering to form clusters and detect noise.
Clusters are usually dense regions separated by regions of lower density. Unlike the K-Means algorithm, which works best on well-separated clusters, DBSCAN has a wider scope and can even find clusters within clusters. It discovers clusters of various shapes and sizes from large datasets that contain noise and outliers.
There are two parameters in the DBSCAN algorithm (a minimal usage sketch follows the illustration below):
minPts: the threshold, i.e., the minimum number of points that must be grouped together for a region to be considered dense.
eps (ε): the distance measure used to locate points in the neighborhood of any point.
An illustration of density-based clustering. Image Source
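The tutorial stops at the conceptual level for DBSCAN, so here is a minimal scikit-learn sketch; the two-moons toy dataset and the eps/min_samples values are assumptions chosen purely for illustration:
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaved half-moons: a shape that centroid-based K-Means separates poorly
X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

# eps = neighborhood radius, min_samples = the minPts threshold
db = DBSCAN(eps=0.3, min_samples=5).fit(X)

# Points labeled -1 are noise; other labels are cluster IDs
print("Labels found:", set(db.labels_))
With these settings, DBSCAN typically recovers the two crescent-shaped clusters that a centroid-based method would split incorrectly.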
Association rule mining is a popular data mining technique that finds interesting correlations among large numbers of data items. An association rule shows how frequently items occur together in a transaction.
Market Basket Analysis is a typical example of association rule mining: it finds relationships between items in a grocery store, enabling retailers to identify and analyze the associations between items that people frequently buy together.
Important terminology used in association rules:
Support: how frequently an item (or combination of items) appears in the dataset, i.e., the fraction of transactions that contain it.
Confidence: how often items A and B occur together, relative to the number of transactions that contain A.
Lift: the strength of a rule over the random co-occurrence of A and B. For instance, if the rule A -> B has a lift of 5, buying A makes buying B five times more likely than chance. (A worked example follows below.)
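To ground these terms, here is a small self-contained Python sketch that computes support, confidence, and lift for a hypothetical rule bread -> butter; the toy transactions are made up for illustration:
# Toy transactions, made up for illustration
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]
n = len(transactions)

sup_bread = sum("bread" in t for t in transactions) / n
sup_butter = sum("butter" in t for t in transactions) / n
sup_both = sum({"bread", "butter"} <= t for t in transactions) / n

confidence = sup_both / sup_bread   # P(butter | bread)
lift = confidence / sup_butter      # strength vs. random co-occurrence

print(f"support={sup_both:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
# -> support=0.60, confidence=0.75, lift=0.94 (lift < 1: weaker than chance here)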
The Apriori algorithm is a well-known association rule mining technique.
The Apriori algorithm was proposed by R. Agrawal and R. Srikant in 1994 to find frequent itemsets in a dataset. The algorithm’s name comes from the fact that it uses prior knowledge of frequently occurring itemsets.
The Apriori algorithm finds frequently occurring items with minimum support.
It consists of two steps:
1. Join: generate candidate k-itemsets from the frequent (k-1)-itemsets.
2. Prune: discard candidates that contain any infrequent subset, keeping only itemsets that meet the minimum support.
In this tutorial, you will learn about the implementation of various unsupervised algorithms in Python. Scikit-learn is a powerful, open-source Python library widely used for unsupervised learning tasks; it provides numerous robust algorithms for classification, regression, clustering, and dimensionality reduction. (For association rules, we will use the mlxtend library later on.)
Let’s begin!
Now let’s dive deep into the implementation of the K-Means algorithm in Python. We’ll break down each code snippet so that you can understand it easily.
First of all, we will import the required libraries and get access to the functions.
#Let's import the required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
The dataset is taken from the Kaggle website; you can easily download it from the given link. To load the dataset, we use the pd.read_csv() function, and head() returns the first five rows.
my_data = pd.read_csv('Customers_Mall.csv')
my_data.head()
The dataset contains five columns: customer ID, gender, age, annual income (k$), and spending score (1-100).
The info() function is used to get quick information about the dataset. It shows the number of entries, columns, total non-null values, memory usage, and datatypes.
my_data.info()
To check the missing values in the dataset, we use isnull().sum(), which returns the total number of null values.
#Check missing values
my_data.isnull().sum()
The box plot or whisker plot is used to detect outliers in the dataset. It also shows a statistical five number summary, which includes the minimum, first quartile, median (2nd quartile), third quartile, and maximum.
my_data.boxplot(figsize=(8,4))
Using the box plot, we’ve detected an outlier in the annual income column. We will now remove it before training our model.
#Let's remove the outlier by replacing values above 120 with the median
med = 61
my_data["Annual Income (k$)"] = np.where(my_data["Annual Income (k$)"] > 120,
                                         med, my_data["Annual Income (k$)"])
The outlier in the annual income column has now been removed; to confirm, we use the box plot again.
my_data.boxplot(figsize=(8,5))
A histogram is used to illustrate the important features of the distribution of data. The hist() function is used to show the distribution of data in each numerical column.
my_data.hist(figsize=(6,6))
The correlation heatmap is used to find the potential relationships between variables in the data and to display the strength of those relationships. To display the heatmap, we have used the seaborn plotting library.
plt.figure(figsize=(10,6))
#numeric_only=True restricts the correlation to numeric columns
sns.heatmap(my_data.corr(numeric_only=True), annot=True, cmap='icefire').set_title('seaborn')
plt.show()
The iloc indexer is used to select rows and columns of the dataframe by position. Here, we’ve selected the annual income and spending score columns.
X_val = my_data.iloc[:, 3:].values
X_val
# Loading Kmeans Library
from sklearn.cluster import KMeans
Now we will select the best value for K using the Elbow method, which is used to determine the optimal number of clusters in K-Means clustering.
my_val = []
for i in range(1,11):
    kmeans = KMeans(n_clusters = i, init='k-means++', random_state = 123)
    kmeans.fit(X_val)
    my_val.append(kmeans.inertia_)
sklearn.cluster.KMeans() sets the number of clusters along with other initialization parameters. To display the resulting inertia values, just call the variable.
my_val
#Visualization of the elbow method
plt.plot(range(1,11), my_val)
plt.xlabel('Number of clusters')
plt.ylabel('Inertia (within-cluster sum of squares)')
plt.title('The Elbow Method')
plt.show()
In the Elbow method, the plot looks like an arm, and the elbow of the arm gives the best value of K. In this case, we’ve taken K=3 as the optimal value.
kmeans = KMeans(n_clusters = 3, init='k-means++')
kmeans.fit(X_val)
#To show centroids of clusters
kmeans.cluster_centers_
#Prediction of K-Means clustering
y_kmeans = kmeans.fit_predict(X_val)
y_kmeans
The scatter graph is used to plot the classification results of our dataset into three clusters.
plt.scatter(X_val[y_kmeans == 0,0], X_val[y_kmeans == 0,1], c='red',s=100)
plt.scatter(X_val[y_kmeans == 1,0], X_val[y_kmeans == 1,1], c='green',s=100)
plt.scatter(X_val[y_kmeans == 2,0], X_val[y_kmeans == 2,1], c='orange',s=100)
plt.scatter(kmeans.cluster_centers_[:,0], kmeans.cluster_centers_[:,1], s=300, c='brown')
plt.title('K-Means Unsupervised Learning')
plt.show()
To implement the Apriori algorithm, we will utilize “The Bread Basket” dataset. The dataset is available on Kaggle, and you can download it from the link. An algorithm of this kind suggests products based on the user’s purchase history; Walmart, for example, has made extensive use of it to recommend relevant items to its users.
Let’s implement the Apriori algorithm in Python.
To implement the algorithm, we need to import some important libraries.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
The dataset contains five columns and 20507 entries. The date_time column is a prominent one, and we can extract many vital insights from it.
my_data= pd.read_csv("bread basket.csv")
my_data.head()
Convert the date_time column into a proper datetime format.
my_data['date_time'] = pd.to_datetime(my_data['date_time'])
#Total number of unique transactions
my_data['Transaction'].nunique()
Now we want to derive new columns from date_time to extract meaningful information from the data.
#Let's extract date
my_data['date'] = my_data['date_time'].dt.date
#Let's extract time
my_data['time'] = my_data['date_time'].dt.time
#Extract month and replace the month number with its name
my_data['month'] = my_data['date_time'].dt.month
my_data['month'] = my_data['month'].replace((1,2,3,4,5,6,7,8,9,10,11,12),
('Jan','Feb','Mar','Apr','May','Jun','Jul','Aug',
'Sep','Oct','Nov','Dec'))
#Extract hour
#Extract hour
my_data['hour'] = my_data['date_time'].dt.hour
# Replacing hour numbers with text ranges
hr_num = (1,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23)
hr_obj = ('1-2','7-8','8-9','9-10','10-11','11-12','12-13','13-14','14-15',
          '15-16','16-17','17-18','18-19','19-20','20-21','21-22','22-23','23-24')
my_data['hour'] = my_data['hour'].replace(hr_num, hr_obj)
# Extracting weekday and replacing it with its name
my_data['weekday'] = my_data['date_time'].dt.weekday
my_data['weekday'] = my_data['weekday'].replace((0,1,2,3,4,5,6),
                     ('Mon','Tues','Wed','Thur','Fri','Sat','Sun'))
#Now drop the date_time column
my_data.drop('date_time', axis = 1, inplace = True)
After extracting the date, time, month, hour, and weekday columns, we dropped the original date_time column.
Now to display, we simply use the head() function to see the changes in the dataset.
my_data.head()
# Cleaning the Item column: strip whitespace and lowercase
my_data['Item'] = my_data['Item'].str.strip()
my_data['Item'] = my_data['Item'].str.lower()
my_data.head()
To display the top 10 items purchased by customers, we used a barplot() of the seaborn library.
plt.figure(figsize=(10,5))
sns.barplot(x=my_data.Item.value_counts().head(10).index, y=my_data.Item.value_counts().head(10).values,palette='RdYlGn')
plt.xlabel('Items', size = 17)
plt.xticks(rotation=45)
plt.ylabel('Count of items', size = 18)
plt.title('Top 10 Items purchased', color = 'blue', size = 23)
plt.show()
From the graph, coffee is the top item purchased by the customers, followed by bread.
Now, to display the number of orders received each month, the groupby() function is used along with barplot() to visually show the results.
mon_Tran =my_data.groupby('month')['Transaction'].count().reset_index()
mon_Tran.loc[:,"mon_order"] =[4,8,12,2,1,7,6,3,5,11,10,9]
mon_Tran.sort_values("mon_order",inplace=True)
plt.figure(figsize=(12,5))
sns.barplot(data = mon_Tran, x = "month", y = "Transaction")
plt.xlabel('Months', size = 14)
plt.ylabel('Monthly Orders', size = 14)
plt.title('No of orders received each month', color = 'blue', size = 18)
plt.show()
To show the number of orders received each day, we applied groupby() to the weekday column.
wk_Tran = my_data.groupby('weekday')['Transaction'].count().reset_index()
wk_Tran.loc[:,"wk_ord"] = [4,0,5,6,3,1,2]
wk_Tran.sort_values("wk_ord",inplace=True)
plt.figure(figsize=(11,4))
sns.barplot(data = wk_Tran, x = "weekday", y = "Transaction",palette='RdYlGn')
plt.xlabel('Week Day', size = 14)
plt.ylabel('Per day orders', size = 14)
plt.title('Orders received per day', color = 'blue', size = 18)
plt.show()
We import apriori and association_rules from the mlxtend library to implement association rule mining, and first count the occurrences of each item per transaction.
from mlxtend.frequent_patterns import association_rules, apriori
tran_str= my_data.groupby(['Transaction', 'Item'])['Item'].count().reset_index(name ='Count')
tran_str.head(8)
Now we’ll make an m x n matrix, where m = transactions and n = items; each cell holds how many times the item appeared in that transaction (missing items are filled with 0).
Mar_baskt = tran_str.pivot_table(index='Transaction', columns='Item', values='Count', aggfunc='sum').fillna(0)
Mar_baskt.head()
We want a function that returns 0 or 1: 0 means the item wasn’t present in the transaction, while 1 means it was.
def encode(val):
    if val <= 0:
        return 0
    if val >= 1:
        return 1
#Let's apply the function to the dataset
Basket=Mar_baskt.applymap(encode)
Basket.head()
#Using the apriori algorithm with min_support=0.01, i.e., items in at least 1% of transactions
freq_items = apriori(Basket, min_support = 0.01, use_colnames = True)
freq_items.head()
We use the association_rules() function to generate rules from the frequent itemsets.
App_rule= association_rules(freq_items, metric = "lift", min_threshold = 1)
App_rule.sort_values('confidence', ascending = False, inplace = True)
App_rule.head()
From the above implementation, the strongest rule involves coffee and toast, with a lift value of 1.47 and a confidence value of 0.70.
Principal component analysis (PCA) is one of the most widely used unsupervised learning techniques. It can be used for various tasks, including dimensionality reduction, information compression, exploratory data analysis, and data de-noising.
Let’s use the PCA algorithm!
First we import the required libraries to implement this algorithm.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from sklearn.decomposition import PCA
from sklearn.datasets import load_digits
To implement the PCA algorithm, we use Scikit-learn’s load_digits dataset, which can easily be loaded using the command below. The dataset contains image data with 1797 entries and 64 columns (one per pixel of each 8x8 image).
#Load the dataset
my_data= load_digits()
#Creating features
X_value = my_data.data
#Creating target
y_value = my_data.target
#Let's check the shape of X_value
X_value.shape
#Each image is 8x8 pixels, i.e., 64 features
my_data.images[10]
#Let's display the image
plt.gray()
plt.matshow(my_data.images[34])
plt.show()
Now let’s project data from 64 columns to 16 to show how 16 dimensions classify the data.
X_val = my_data.data
y_val = my_data.target
my_pca = PCA(16)
X_projection = my_pca.fit_transform(X_val)
print(X_val.shape)
print(X_projection.shape)
Using a colormap, we visualize the projected data; even the first two components separate many of the digit classes. Now we’ll select the optimal number of principal components to which the data can be reduced.
plt.scatter(X_projection[:, 0], X_projection[:, 1], c=y_val, edgecolor='white',
cmap=plt.cm.get_cmap("gist_heat",12))
plt.colorbar();
pca = PCA().fit(X_val)
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('Principal components')
plt.ylabel('Cumulative explained variance')
Based on the above graph, only 12 components are required to explain more than 80% of the variance, which is still better than computing all 64 features. Thus, we’ve reduced the large number of dimensions down to 12 to avoid the curse of dimensionality.
#Let's visualize how the reduced data looks
Unsupervised_pca = PCA(12)
X_pro = Unsupervised_pca.fit_transform(X_val)
print("New Data Shape is =>",X_pro.shape)
#Let's Create a scatter plot
plt.scatter(X_pro[:, 0], X_pro[:, 1], c=y_val, edgecolor='white',
cmap=plt.cm.get_cmap("nipy_spectral",10))
plt.colorbar();
In this machine learning tutorial, we’ve implemented the K-Means, Apriori, and PCA algorithms. These are some of the most widely used algorithms; they have numerous industrial applications and solve many real-world problems. For instance, K-Means clustering is used in astronomy to study stellar and galaxy spectra, solar polarization spectra, and X-ray spectra, and Apriori is used by retail stores to optimize their product inventory.
Dreaming of becoming a data scientist or data analyst even without a university or college degree? Do you need data science and analysis knowledge to earn a promotion in your current role?
Are you interested in securing your dream job in data science and analysis and looking for a way to get started? We can help you. With over 10 years of experience in data science and data analysis, we will teach you the rubrics, guiding you with one-on-one lessons from the fundamentals until you become a pro.
Our courses are affordable and easy to understand, with numerous exercises and assignments you can learn from. On completing our courses, you’ll be equipped with the technical and practical skills to take on any data science or data analysis role, collaborate effectively within teams, and help businesses meet and exceed their objectives by extracting actionable insights from data.
Original article sourced at: https://thedatascientist.com
Flutter Console Coverage Test
This small Dart tool generates a Flutter coverage test report in the console.
Add a line like this to your package's pubspec.yaml (and run an implicit flutter pub get):
dev_dependencies:
test_cov_console: ^0.2.2
flutter pub get
Running "flutter pub get" in coverage... 0.5s
flutter test --coverage
00:02 +1: All tests passed!
flutter pub run test_cov_console
---------------------------------------------|---------|---------|---------|-------------------|
File |% Branch | % Funcs | % Lines | Uncovered Line #s |
---------------------------------------------|---------|---------|---------|-------------------|
lib/src/ | | | | |
print_cov.dart | 100.00 | 100.00 | 88.37 |...,149,205,206,207|
print_cov_constants.dart | 0.00 | 0.00 | 0.00 | no unit testing|
lib/ | | | | |
test_cov_console.dart | 0.00 | 0.00 | 0.00 | no unit testing|
---------------------------------------------|---------|---------|---------|-------------------|
All files with unit testing | 100.00 | 100.00 | 88.37 | |
---------------------------------------------|---------|---------|---------|-------------------|
If not given a FILE, "coverage/lcov.info" will be used.
-f, --file=<FILE> The target lcov.info file to be reported
-e, --exclude=<STRING1,STRING2,...> A list of substrings matching files without unit testing
                                    to be excluded from the report
-l, --line It will print Lines & Uncovered Lines only
Branch & Functions coverage percentage will not be printed
-i, --ignore It will not print any file without unit testing
-m, --multi Report from multiple lcov.info files
-c, --csv Output to CSV file
-o, --output=<CSV-FILE> Full path of output CSV file
If not given, "coverage/test_cov_console.csv" will be used
-t, --total Print only the total coverage
Note: it will ignore all other option (if any), except -m
-p, --pass=<MINIMUM> Print only whether the total coverage passed the MINIMUM value or not
If the value >= MINIMUM, it will print PASSED, otherwise FAILED
Note: it will ignore all other option (if any), except -m
-h, --help Show this help
flutter pub run test_cov_console --file=coverage/lcov.info --exclude=_constants,_mock
---------------------------------------------|---------|---------|---------|-------------------|
File |% Branch | % Funcs | % Lines | Uncovered Line #s |
---------------------------------------------|---------|---------|---------|-------------------|
lib/src/ | | | | |
print_cov.dart | 100.00 | 100.00 | 88.37 |...,149,205,206,207|
lib/ | | | | |
test_cov_console.dart | 0.00 | 0.00 | 0.00 | no unit testing|
---------------------------------------------|---------|---------|---------|-------------------|
All files with unit testing | 100.00 | 100.00 | 88.37 | |
---------------------------------------------|---------|---------|---------|-------------------|
It supports running against multiple lcov.info files with the following directory structures:
1. No root module
<root>/<module_a>
<root>/<module_a>/coverage/lcov.info
<root>/<module_a>/lib/src
<root>/<module_b>
<root>/<module_b>/coverage/lcov.info
<root>/<module_b>/lib/src
...
2. With root module
<root>/coverage/lcov.info
<root>/lib/src
<root>/<module_a>
<root>/<module_a>/coverage/lcov.info
<root>/<module_a>/lib/src
<root>/<module_b>
<root>/<module_b>/coverage/lcov.info
<root>/<module_b>/lib/src
...
You must run test_cov_console in the <root> dir; the report will be grouped by module. Here is the sample output for the directory structure 'with root module':
flutter pub run test_cov_console --file=coverage/lcov.info --exclude=_constants,_mock --multi
---------------------------------------------|---------|---------|---------|-------------------|
File |% Branch | % Funcs | % Lines | Uncovered Line #s |
---------------------------------------------|---------|---------|---------|-------------------|
lib/src/ | | | | |
print_cov.dart | 100.00 | 100.00 | 88.37 |...,149,205,206,207|
lib/ | | | | |
test_cov_console.dart | 0.00 | 0.00 | 0.00 | no unit testing|
---------------------------------------------|---------|---------|---------|-------------------|
All files with unit testing | 100.00 | 100.00 | 88.37 | |
---------------------------------------------|---------|---------|---------|-------------------|
---------------------------------------------|---------|---------|---------|-------------------|
File - module_a - |% Branch | % Funcs | % Lines | Uncovered Line #s |
---------------------------------------------|---------|---------|---------|-------------------|
lib/src/ | | | | |
print_cov.dart | 100.00 | 100.00 | 88.37 |...,149,205,206,207|
lib/ | | | | |
test_cov_console.dart | 0.00 | 0.00 | 0.00 | no unit testing|
---------------------------------------------|---------|---------|---------|-------------------|
All files with unit testing | 100.00 | 100.00 | 88.37 | |
---------------------------------------------|---------|---------|---------|-------------------|
---------------------------------------------|---------|---------|---------|-------------------|
File - module_b - |% Branch | % Funcs | % Lines | Uncovered Line #s |
---------------------------------------------|---------|---------|---------|-------------------|
lib/src/ | | | | |
print_cov.dart | 100.00 | 100.00 | 88.37 |...,149,205,206,207|
lib/ | | | | |
test_cov_console.dart | 0.00 | 0.00 | 0.00 | no unit testing|
---------------------------------------------|---------|---------|---------|-------------------|
All files with unit testing | 100.00 | 100.00 | 88.37 | |
---------------------------------------------|---------|---------|---------|-------------------|
flutter pub run test_cov_console -c --output=coverage/test_coverage.csv
Sample CSV output file:
File,% Branch,% Funcs,% Lines,Uncovered Line #s
lib/,,,,
test_cov_console.dart,0.00,0.00,0.00,no unit testing
lib/src/,,,,
parser.dart,100.00,100.00,97.22,"97"
parser_constants.dart,100.00,100.00,100.00,""
print_cov.dart,100.00,100.00,82.91,"29,49,51,52,171,174,177,180,183,184,185,186,187,188,279,324,325,387,388,389,390,391,392,393,394,395,398"
print_cov_constants.dart,0.00,0.00,0.00,no unit testing
All files with unit testing,100.00,100.00,86.07,""
You can install the package from the command line:
dart pub global activate test_cov_console
The package has the following executables:
$ test_cov_console
Run this command:
With Dart:
$ dart pub add test_cov_console
With Flutter:
$ flutter pub add test_cov_console
This will add a line like this to your package's pubspec.yaml (and run an implicit dart pub get):
dependencies:
test_cov_console: ^0.2.2
Alternatively, your editor might support dart pub get or flutter pub get. Check the docs for your editor to learn more.
Now in your Dart code, you can use:
import 'package:test_cov_console/test_cov_console.dart';
example/lib/main.dart
import 'package:flutter/material.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  // This widget is the root of your application.
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Demo',
      theme: ThemeData(
        // This is the theme of your application.
        //
        // Try running your application with "flutter run". You'll see the
        // application has a blue toolbar. Then, without quitting the app, try
        // changing the primarySwatch below to Colors.green and then invoke
        // "hot reload" (press "r" in the console where you ran "flutter run",
        // or simply save your changes to "hot reload" in a Flutter IDE).
        // Notice that the counter didn't reset back to zero; the application
        // is not restarted.
        primarySwatch: Colors.blue,
        // This makes the visual density adapt to the platform that you run
        // the app on. For desktop platforms, the controls will be smaller and
        // closer together (more dense) than on mobile platforms.
        visualDensity: VisualDensity.adaptivePlatformDensity,
      ),
      home: MyHomePage(title: 'Flutter Demo Home Page'),
    );
  }
}

class MyHomePage extends StatefulWidget {
  MyHomePage({Key? key, required this.title}) : super(key: key);

  // This widget is the home page of your application. It is stateful, meaning
  // that it has a State object (defined below) that contains fields that affect
  // how it looks.

  // This class is the configuration for the state. It holds the values (in this
  // case the title) provided by the parent (in this case the App widget) and
  // used by the build method of the State. Fields in a Widget subclass are
  // always marked "final".

  final String title;

  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  int _counter = 0;

  void _incrementCounter() {
    setState(() {
      // This call to setState tells the Flutter framework that something has
      // changed in this State, which causes it to rerun the build method below
      // so that the display can reflect the updated values. If we changed
      // _counter without calling setState(), then the build method would not be
      // called again, and so nothing would appear to happen.
      _counter++;
    });
  }

  @override
  Widget build(BuildContext context) {
    // This method is rerun every time setState is called, for instance as done
    // by the _incrementCounter method above.
    //
    // The Flutter framework has been optimized to make rerunning build methods
    // fast, so that you can just rebuild anything that needs updating rather
    // than having to individually change instances of widgets.
    return Scaffold(
      appBar: AppBar(
        // Here we take the value from the MyHomePage object that was created by
        // the App.build method, and use it to set our appbar title.
        title: Text(widget.title),
      ),
      body: Center(
        // Center is a layout widget. It takes a single child and positions it
        // in the middle of the parent.
        child: Column(
          // Column is also a layout widget. It takes a list of children and
          // arranges them vertically. By default, it sizes itself to fit its
          // children horizontally, and tries to be as tall as its parent.
          //
          // Invoke "debug painting" (press "p" in the console, choose the
          // "Toggle Debug Paint" action from the Flutter Inspector in Android
          // Studio, or the "Toggle Debug Paint" command in Visual Studio Code)
          // to see the wireframe for each widget.
          //
          // Column has various properties to control how it sizes itself and
          // how it positions its children. Here we use mainAxisAlignment to
          // center the children vertically; the main axis here is the vertical
          // axis because Columns are vertical (the cross axis would be
          // horizontal).
          mainAxisAlignment: MainAxisAlignment.center,
          children: <Widget>[
            Text(
              'You have pushed the button this many times:',
            ),
            Text(
              '$_counter',
              style: Theme.of(context).textTheme.headline4,
            ),
          ],
        ),
      ),
      floatingActionButton: FloatingActionButton(
        onPressed: _incrementCounter,
        tooltip: 'Increment',
        child: Icon(Icons.add),
      ), // This trailing comma makes auto-formatting nicer for build methods.
    );
  }
}
Author: DigitalKatalis
Source Code: https://github.com/DigitalKatalis/test_cov_console
License: BSD-3-Clause license
Data management, analytics, data science, and real-time systems will converge this year enabling new automated and self-learning solutions for real-time business operations.
The global pandemic of 2020 has upended social behaviors and business operations. Working from home is the new normal for many, and technology has accelerated and opened new lines of business. Retail and travel have been hit hard, and tech-savvy companies are reinventing e-commerce and in-store channels to survive and thrive. In biotech, pharma, and healthcare, analytics command centers have become the center of operations, much like network operation centers in transport and logistics during pre-COVID times.
While data management and analytics have been critical to strategy and growth over the last decade, COVID-19 has propelled these functions into the center of business operations. Data science and analytics have become a focal point for business leaders to make critical decisions like how to adapt business in this new order of supply and demand and forecast what lies ahead.
In the next year, I anticipate a convergence of data, analytics, integration, and DevOps to create an environment for rapid development of AI-infused applications that address business challenges and opportunities. We will see a proliferation of API-led microservices developer environments for real-time data integration, the emergence of data hubs as a bridge between at-rest and in-motion data assets, and event-enabled analytics with deeper collaboration between data scientists, DevOps, and ModelOps developers. From this, an ML engineer persona will emerge.
#analytics #artificial intelligence technologies #big data #big data analysis tools #from our experts #machine learning #real-time decisions #real-time analytics #real-time data #real-time data analytics
Today’s digitally-savvy, continuously connected consumers have no patience for data latency and data silos that are the main obstacles to a relevant, personalized CX.
Ambitious marketers recognize the power of customer experience (CX) to differentiate and have a material impact on revenue. However, achieving this outcome is easier said than done: Today’s always-on, continuously connected consumers expect brands to orchestrate interactions across multiple channels, in a personalized and relevant way. Anything less, and brands run the risk of introducing friction into a customer experience which will quickly drive customers to a competitor.
Consider a 2019 Harris Poll, in which 63% of consumers surveyed said that a personalized CX is now part of the standard service they expect, and 37% claimed they would stop doing business with a company that fails to offer such an experience. Brands, though, struggle to meet this expectation; marketers in the Harris Poll survey said that real-time engagement (50%) and customer understanding (48%) were the top challenges in providing an exceptional CX.
See also: More Than Personal: Customer Experiences in the Post-Digital Age
So what counts as real-time? The simplest answer is that real-time is whatever it needs to be to move at the speed of the customer. For all intents and purposes, this usually rules out batch processing of customer data as sufficient for delivering a personalized, relevant experience in the cadence of a customer journey, particularly as consumers move to more digital channels and a digital-first experience.
A typical website browsing experience, for instance, may entail a visit to numerous landing pages, a chatbot session, a filled and abandoned shopping cart, or even a timed-out browser as a guest checks out a competitor’s website. A consumer may even bounce in and out of the website to check emails or engage with a mobile app. A brand that possesses a single view of the customer by integrating every conceivable piece of customer data, and makes that single view accessible in real-time, is capable of intelligently orchestrating the entire experience to provide the customer with a next-best action the moment the customer appears in a channel, or even on a specific landing page.
Ensuring the single view accurately reflects the behaviors, preferences, wants and needs of the customer at the precise moment a real-time decision is rendered requires not just real-time access, but that all the underlying customer data that constitute the single view is also in real-time. A real-time decision made on data that is minutes or in some cases seconds old – as a website browsing experience shows – may not be relevant to the customer’s journey in the precise moment of interaction.
#analytics #customer experience management #from our experts #customer experience management #machine learning #real-time analytics #real-time decisions