Pratiksha

1633671753

Benefits of Edge Computing

1. Security

While the proliferation of IoT edge computing devices increases the overall attack surface of the network, it also offers some important security benefits. Traditional cloud computing architecture is centralized, making it especially vulnerable to distributed denial-of-service (DDoS) attacks and power outages. Edge computing distributes processing, storage, and applications across a wide range of devices and data centers, making it difficult for any single disruption to take down the entire network.

Another major concern with IoT edge computing devices is that they can be used as a point of entry for cyberattacks, allowing malware or other intrusions to infect the network from a single weak spot. While this is a real threat, the distributed nature of edge computing architecture makes it easier to implement security protocols that can seal off compromised portions without shutting down the entire network.

Because more data is processed on local devices rather than being sent back to a central data center, edge computing also reduces the amount of data at risk at any one time. There is less data that can be intercepted in transit, and even if a device is compromised, it will only contain the data it has collected locally rather than the trove of data that a compromised centralized server could expose.

Even when an edge computing architecture incorporates specialized edge data centers, these often provide additional security measures to guard against DDoS attacks and other cyber threats.

A quality edge data center should offer a variety of tools that clients can use to secure and monitor their networks in real time.

2. Speed

Speed is vital to any company's core business. Take the financial sector's reliance on high-frequency trading algorithms, for example. A slowdown of mere milliseconds in those algorithms can have costly consequences. In the health care industry, where the stakes are very high, losing a fraction of a second can be a matter of life or death.

For businesses that provide data-driven services to customers, lagging speeds can frustrate customers and cause long-term damage to the brand. This may not sound as bad as life and death, but poor network performance and slow speeds could mean the end of your company altogether. Speed is no longer just a competitive advantage; it is a best practice.

The most important advantage of edge computing is its ability to increase network performance by reducing latency. Because IoT edge computing devices process data locally or in nearby edge data centers, the data they collect does not have to travel nearly as far as it would under a traditional cloud architecture.

In today's world, it is easy to forget that data does not travel instantaneously; it is bound by the same laws of physics as everything else in the known universe. Current fiber-optic technology allows data to travel at about 2/3 the speed of light, moving from New York to San Francisco in about 21 milliseconds.
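The 21-millisecond figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes a straight-line distance of roughly 4,130 km between the two cities (an approximation chosen for illustration; real cable routes are longer) and ignores switching and queuing delays. The function name and constants are made up for this example.

```python
# Rough one-way propagation delay through optical fiber.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum
FIBER_FRACTION = 2 / 3              # fraction of c quoted in the text
NY_SF_DISTANCE_KM = 4_130           # assumed straight-line distance, not a real cable route


def propagation_delay_ms(distance_km: float, fraction_of_c: float = FIBER_FRACTION) -> float:
    """Return the one-way delay in milliseconds for a signal travelling distance_km."""
    speed_km_s = SPEED_OF_LIGHT_KM_S * fraction_of_c
    return distance_km / speed_km_s * 1000


delay_ms = propagation_delay_ms(NY_SF_DISTANCE_KM)
print(f"New York -> San Francisco: {delay_ms:.1f} ms")
```

Under these assumptions the result lands close to 21 ms. The same function shows why proximity matters: a data center 50 km away contributes well under a millisecond of propagation delay.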

However, as more and more data is transmitted, digital traffic jams are almost certain to occur. By 2020, the world had produced an estimated 44 zettabytes of data (one zettabyte equals a trillion gigabytes). By 2025, an estimated 463 exabytes of data (one exabyte equals a billion gigabytes) will be processed daily.

There is also the "last mile" bottleneck, where data must be routed through local network connections before reaching its final destination. Depending on the quality of these connections, the "last mile" can add anywhere between 10 and 65 milliseconds of latency.

By processing data closer to the source and reducing the physical distance it must travel, edge computing can significantly reduce latency. This means higher speeds for end users, with latency measured in microseconds rather than milliseconds. Considering that even a single moment of latency or downtime can cost companies thousands of dollars, the speed benefits of edge computing are very important for your network.

3. Scalability

As companies grow, they cannot always anticipate their IT infrastructure needs. Building a dedicated data center is an expensive proposition, making it very difficult to plan for the future.

In addition to the high up-front cost of construction and ongoing maintenance, there is the question of future needs. Traditional private facilities place an artificial constraint on growth, locking companies into forecasts of their future computing needs. If business growth exceeds expectations, they may not be able to take advantage of opportunities due to insufficient computing resources.

Fortunately, the development of cloud-based technology and edge computing has made it easier for businesses to scale their operations. Computing, storage, and analytics capabilities are increasingly being bundled into devices with smaller footprints that can be situated closer to end users.

Expanding data collection and analysis no longer requires companies to establish centralized, private data centers, which are expensive to build, maintain, and replace when it is time to grow again. By combining colocation services with regional edge data centers, organizations can expand their edge network reach quickly and cost-effectively. As they grow, the flexibility of edge computing enables them to adapt quickly to changing markets and scale their data and computing needs efficiently.

In short, edge computing provides a more cost-effective route to scalability, allowing companies to expand their computing capacity through a combination of IoT devices and edge data centers. The use of processing-capable edge computing devices also reduces the cost of growth because each new device does not impose substantial bandwidth demands on the core of the network.

4. Versatility

The scalability of edge computing also makes it versatile. By partnering with local edge data centers, companies can easily target desirable markets without investing in expensive infrastructure expansion.

Edge data centers allow them to serve end users efficiently with minimal physical distance or latency. This is especially important for content providers who want to deliver uninterrupted streaming services. Nor do they tie companies down with a heavy footprint, allowing them to shift quickly to other markets if economic conditions change.

Edge computing also enables IoT devices to collect unprecedented amounts of actionable data. Instead of waiting for people to log in with devices and interact with central servers, edge computing devices are always on, always connected, and always generating data for future analysis.

The unstructured data collected by edge networks can be processed locally to deliver faster services or be sent back to the core of the network, where powerful analytics and machine learning programs will dissect it to identify trends and notable data points. Armed with this knowledge, companies can make better decisions and meet real market needs more effectively.

By incorporating new IoT devices into their edge network architecture, companies can provide new and better services to their customers without completely overhauling their IT infrastructure. Purpose-designed devices offer many exciting opportunities for organizations that value innovation as a way to drive growth. This is a great benefit to industries that want to expand network reach into regions with limited connectivity (such as the health care, agriculture, and manufacturing sectors).

5. Reliability

Given the security benefits offered by edge computing, it should come as no surprise that it offers better reliability as well. With IoT edge computing devices and edge data centers located close to end users, there is less chance of a remote network problem affecting local customers. Even in the event of a nearby data center outage, IoT edge computing devices will continue to operate effectively on their own because they handle vital processing functions natively.

By processing data close to the source and prioritizing traffic, edge computing reduces the amount of data flowing to and from the main network, resulting in lower latency and greater overall speed. Physical distance is also important for performance.

By locating edge systems in data centers geographically closer to end users and distributing processing accordingly, companies can significantly reduce the distance data must travel before services can be delivered. These edge networks ensure a faster, seamless experience for their customers, who expect access to their content and applications instantly, anywhere, at any time.

With so many edge computing devices and edge data centers connected to the network, it becomes very difficult for any single failure to shut down service completely. Data can be rerouted through multiple pathways to ensure users retain access to the products and information they need. Effectively incorporating IoT edge computing devices and edge data centers into a comprehensive edge architecture can therefore provide unparalleled reliability.

Zelma Gerlach

1621616520

Edge Computing: Device Edge vs. Cloud Edge

It sometimes makes sense to treat edge computing not as a generic category but as two distinct types of architectures: cloud edge and device edge.

Most people talk about edge computing as a singular type of architecture. But in some respects, it makes sense to think of edge computing as two fundamentally distinct types of architectures: Device edge and cloud edge.

Although a device edge and a cloud edge operate in similar ways from an architectural perspective, they cater to different types of use cases, and they pose different challenges.

Here’s a breakdown of how device edge and cloud edge compare.

Edge computing, defined

First, let’s briefly define edge computing itself.

Edge computing is any type of architecture in which workloads are hosted closer to the “edge” of the network — which typically means closer to end-users — than they would be in conventional architectures that centralize processing and data storage inside large data centers.

#cloud #edge computing #cloud computing #device edge #cloud edge

Juanita Apio

1623173160

Computing on the EDGE

Most companies today are moving to the cloud for their computation and storage needs. The cloud provides a one-stop solution for services across many areas, be it large-scale processing, ML model training and deployment, or big data storage and analysis. This, however, requires moving data, video, or audio to the cloud for processing and storage, which has certain shortcomings compared with doing the work at the client, such as:

  • Network latency
  • Network cost and bandwidth
  • Privacy
  • Single point of failure

On the other hand, the cloud has its own advantages, which I will not discuss right now. With all this in mind, how about a hybrid approach where a few requirements are moved to the client and some remain in the cloud? This is where edge computing comes into the picture. According to Wikipedia, here is the definition:

“Edge computing is a distributed computing paradigm that brings computation and data storage closer to the location where it is needed, to improve response times and save bandwidth.”

Edge has a lot of use cases, such as

  • Trained ML models (especially for video and audio) sitting closer to the edge for inferencing or prediction.
  • IoT data analysis for large-scale machines right at the edge

Look at Gartner hype cycle for emerging technologies. Edge is gaining momentum.

There are many platforms on the market that specialise in edge deployments, ranging from cloud solutions like Azure IoT Hub and AWS Greengrass …, to open source like KubeEdge and EdgeX Foundry, and third party like Intellisite.

I will focus this article on using one of the platforms for building an “Attendance platform” on the edge using facial recognition. I will add as many links as possible for your references.

Let us start with taking the first step and defining the requirements

  • Capture video from the camera
  • Recognise faces based on trained ML model
  • Display the video feed with recognised faces on the monitor
  • Log attendance in a database
  • Collect logs and metrics
  • Save unrecognised images to a central repository for retraining and improving model
  • Multi site deployments

Choosing a platform

Choosing the right platform from so many options was a bit tricky. For the POC, we looked at a few aspects of each platform

  • Pricing
  • Infrastructure maintenance
  • Learning curve
  • Ease of use

There were other metrics as well, but these were top of mind. Azure IoT looked pretty good on the above criteria. We also looked at KubeEdge, which provides Kubernetes-based deployments on the edge. It is open source and looked promising. Given the many components (cloud and edge) involved and the maintenance overhead, we decided not to move ahead with open source. We were already using Azure for other cloud infrastructure, which also made choosing this platform a little easier.

Leading platform players

Designing the solution

Azure IoT hub provided 2 main components. One is the cloud component responsible for managing the deployments on edge and collection of data from them. The other is the edge component consisting of

  • Edge Agent : manages deployment and monitoring of modules on the IoT Edge device
  • Edge Hub : handles communications between modules on the IoT Edge device, and between the device and IoT Hub.

I will not go into the details; you can find more details here about Azure IoT Edge. In brief, Azure IoT Edge requires modules to be packaged as containers, which can be pushed to the edge. The edge device first needs to be registered with the IoT Hub. Once the Edge agent connects with the hub, you can push your modules using a deployment.json file. The container runtime that Azure IoT Edge uses is Moby.

We used Azure IoT free tier which was sufficient for our POC. Check the pricing here

As per the requirements of the POC, this is what we came up with

The solution consists of various containers that are deployed on the edge, as well as a few cloud deployments. I will talk about each component in detail as we move ahead.

As part of the POC, we assumed 2 sites where attendance needs to be taken at multiple gates. To simulate this, we created 4 Ubuntu machines. This is the Ubuntu desktop image we used. For attendance, we created videos containing still photos of a few film stars and sportspeople. These videos are used to simulate the cameras, one for each gate.

Modules in action

Camera module

It captures the IP camera feed and pushes the frames for consumption

  • It uses python opencv for capture. For the POC, we read video files pushed inside the container.
  • Frames published to zeromq (brokerless message queue).
  • Used python3-opencv docker container as base image and pyzmq module for mq. Check this blog on how to use zeromq with python.

The module was configured through a number of environment variables, one being the sampling rate of the video frames. Processing all frames requires high memory and CPU, so it is advisable to drop frames to reduce CPU load. This can be done in either the camera module or the inferencing module.
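The frame-dropping idea can be sketched in a few lines of plain Python. This is not the module's actual code: the environment variable name `FRAME_SAMPLE_EVERY_N` and both helper functions are hypothetical, and strings stand in for the images that would really come from OpenCV.

```python
import os
from typing import Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def sampling_interval(default: int = 5) -> int:
    """Read the hypothetical FRAME_SAMPLE_EVERY_N variable; fall back to default."""
    raw = os.environ.get("FRAME_SAMPLE_EVERY_N", str(default))
    try:
        value = int(raw)
    except ValueError:
        return default
    return max(1, value)  # an interval below 1 would make no sense


def sample_frames(frames: Iterable[Frame], every_n: int) -> Iterator[Frame]:
    """Yield every every_n-th frame (starting with the first) and drop the rest."""
    for index, frame in enumerate(frames):
        if index % every_n == 0:
            yield frame


# Stand-in "frames": in the real module these would be images read with OpenCV.
fake_frames = [f"frame-{i}" for i in range(10)]
kept = list(sample_frames(fake_frames, every_n=5))
print(kept)  # ['frame-0', 'frame-5']
```

Because `sample_frames` is a generator, it can wrap a live capture loop without buffering the whole video, and the same helper could sit in either the camera module or the inference module.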

Inference Module

  • Used a pre-existing face recognition deep learning model for our inferencing needs.
  • Trained the model with easily available images of film stars and sportspeople.
  • The model was deliberately not trained on a couple of the faces present in the video, to showcase the undetected-image use case. These undetected images were stored in ADLS Gen2, explained in the storage module.
  • Python pyzmq module was used to consume frames published by the camera module.
  • Not every frame was processed; some frames were dropped based on the configuration set via environment variables.
  • Once an image was recognised, a message (JSON) for attendance was sent to the cloud using the IoT Edge hub. Use this to specify routes in your deployment file.
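As an illustration of the last step, an attendance message might be assembled as below. The article does not show the real schema, so every field name here is an assumption, and the actual send through the IoT Edge hub is left out.

```python
import json
from datetime import datetime, timezone


def build_attendance_message(person_id: str, gate_id: str, confidence: float) -> str:
    """Serialize one recognition event as a JSON string (hypothetical schema)."""
    payload = {
        "person_id": person_id,
        "gate_id": gate_id,
        "confidence": round(confidence, 3),
        # Timestamp in UTC so events from different sites can be ordered.
        "recognised_at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(payload)


message = build_attendance_message("employee-042", "site1-gate2", 0.9731)
print(message)
```

Including a gate identifier in each message is one simple way to tell multi-site events apart once they reach the cloud.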

#deep-learning #edge-computing #azure #edge

How to Predict Housing Prices with Linear Regression?

The final objective is to estimate the price of a house in a Boston suburb. The data were collected for the Boston Standard Metropolitan Statistical Area in 1970. To examine and modify the data, we will use several techniques such as data pre-processing and feature engineering. After that, we'll apply a statistical model, such as a regression model, to predict and monitor the real estate market.

Project Outline:

  • EDA
  • Feature Engineering
  • Pick and Train a Model
  • Interpret
  • Conclusion

EDA

Before using a statistical model, EDA is a good step to go through in order to:

  • Understand the data set.
  • Check whether any information is missing.
  • Find outliers.
  • Add, alter, or eliminate features to get more out of the data.

Importing the Libraries

    # Import the libraries
    # Dataframe/numerical libraries
    import pandas as pd
    import numpy as np
    # Data visualization
    import plotly.express as px
    import matplotlib
    import matplotlib.pyplot as plt
    import seaborn as sns
    # Machine learning model
    from sklearn.linear_model import LinearRegression

Reading the Dataset with Pandas

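    # Reading the data (the file has no header row, so the column names are supplied here)
    path = './housing.csv'
    columns = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS',
               'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
    housing_df = pd.read_csv(path, header=None, names=columns, sep=r'\s+')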

        CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  PTRATIO       B  LSTAT  MEDV
0    0.00632  18.0   2.31     0  0.538  6.575  65.2  4.0900    1  296.0     15.3  396.90   4.98  24.0
1    0.02731   0.0   7.07     0  0.469  6.421  78.9  4.9671    2  242.0     17.8  396.90   9.14  21.6
2    0.02729   0.0   7.07     0  0.469  7.185  61.1  4.9671    2  242.0     17.8  392.83   4.03  34.7
3    0.03237   0.0   2.18     0  0.458  6.998  45.8  6.0622    3  222.0     18.7  394.63   2.94  33.4
4    0.06905   0.0   2.18     0  0.458  7.147  54.2  6.0622    3  222.0     18.7  396.90   5.33  36.2
..       ...   ...    ...   ...    ...    ...   ...     ...  ...    ...      ...     ...    ...   ...
501  0.06263   0.0  11.93     0  0.573  6.593  69.1  2.4786    1  273.0     21.0  391.99   9.67  22.4
502  0.04527   0.0  11.93     0  0.573  6.120  76.7  2.2875    1  273.0     21.0  396.90   9.08  20.6
503  0.06076   0.0  11.93     0  0.573  6.976  91.0  2.1675    1  273.0     21.0  396.90   5.64  23.9
504  0.10959   0.0  11.93     0  0.573  6.794  89.3  2.3889    1  273.0     21.0  393.45   6.48  22.0
505  0.04741   0.0  11.93     0  0.573  6.030  80.8  2.5050    1  273.0     21.0  396.90   7.88  11.9

Have a Look at the Columns

CRIM: The per capita crime rate by town.

ZN: The proportion of residential land zoned for lots over 25,000 square feet.

INDUS: The proportion of non-retail business acres per town.

CHAS: A dummy variable indicating whether the tract bounds the Charles River (1 if it does, 0 otherwise).

NOX: The nitric oxides concentration (parts per 10 million).

RM: The average number of rooms per dwelling.

AGE: The proportion of owner-occupied units built before 1940.

DIS: The weighted distances to five Boston employment centers.

RAD: The index of accessibility to radial highways.

TAX: The full-value property tax rate per $10,000.

B: B = 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents in each town.

PTRATIO: The pupil-to-teacher ratio in each town.

LSTAT: The percentage of the population of lower socioeconomic status.

MEDV: The median value of owner-occupied homes, in thousands of dollars.

Data Preprocessing

    # Check if there are any missing values.
    housing_df.isna().sum()

    CRIM       0
    ZN         0
    INDUS      0
    CHAS       0
    NOX        0
    RM         0
    AGE        0
    DIS        0
    RAD        0
    TAX        0
    PTRATIO    0
    B          0
    LSTAT      0
    MEDV       0
    dtype: int64

No missing values are found

We examine our data's mean, standard deviation, and percentiles.

housing_df.describe()

Graph Data

             CRIM          ZN       INDUS        CHAS         NOX          RM         AGE         DIS         RAD         TAX     PTRATIO           B       LSTAT        MEDV
count  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000  506.000000
mean     3.613524   11.363636   11.136779    0.069170    0.554695    6.284634   68.574901    3.795043    9.549407  408.237154   18.455534  356.674032   12.653063   22.532806
std      8.601545   23.322453    6.860353    0.253994    0.115878    0.702617   28.148861    2.105710    8.707259  168.537116    2.164946   91.294864    7.141062    9.197104
min      0.006320    0.000000    0.460000    0.000000    0.385000    3.561000    2.900000    1.129600    1.000000  187.000000   12.600000    0.320000    1.730000    5.000000
25%      0.082045    0.000000    5.190000    0.000000    0.449000    5.885500   45.025000    2.100175    4.000000  279.000000   17.400000  375.377500    6.950000   17.025000
50%      0.256510    0.000000    9.690000    0.000000    0.538000    6.208500   77.500000    3.207450    5.000000  330.000000   19.050000  391.440000   11.360000   21.200000
75%      3.677083   12.500000   18.100000    0.000000    0.624000    6.623500   94.075000    5.188425   24.000000  666.000000   20.200000  396.225000   16.955000   25.000000
max     88.976200  100.000000   27.740000    1.000000    0.871000    8.780000  100.000000   12.126500   24.000000  711.000000   22.000000  396.900000   37.970000   50.000000

The crime (CRIM), zoning (ZN), industry (INDUS), nitric oxides (NOX), and 'B' columns appear to have multiple outliers at first look because the minimum and maximum values are so far apart. In the AGE column, the mean and the median (Q2, the 50th percentile) do not match.

We might double-check it by examining the distribution of each column.

Inferences

  1. The crime rate is mostly low. The majority of values are in the range of 0 to 25, with a few very large values and many values near zero.
  2. The majority of residential land is zoned for less than 25,000 square feet. Land zones larger than 25,000 square feet represent a small portion of the dataset.
  3. The percentage of non-retail commercial acres is mostly split between two ranges: 0-13 and 13-23.
  4. The majority of the properties are not bordered by the river; only a tiny portion of the data is.
  5. The nitric oxide concentration ranges from about 0.3 to 0.7, with a small bump toward 0.8. It is acceptable to keep values in the range of 0.1 to 1.
  6. The number of rooms tends to cluster around the average.
  7. With time, the proportion of owner-occupied units rises.
  8. The number of observations falls as the weighted distance to the five employment centers grows. It could indicate that individuals choose to live near high-employment areas.
  9. People choose to live in places with limited access to highways (index 0-10). There is an outlier at the 30th percentile.
  10. The majority of property tax rates are in the range of 200-450, with large outliers around 700.
  11. The percentage of people with lower status tends to cluster around the median. The majority of persons are of lower social standing.

Removing all outliers makes the model overly general, so it underfits. Keeping all outliers makes the model overfit: it becomes excessively precise on the training data and learns the noise in the data.

The approach is to find a happy medium that keeps the model from becoming overly precise while still generalising well when faced with a new set of data.

We'll keep only the rows with TAX below 600 because there's a huge anomaly in the TAX column around 600.

new_df=housing_df[housing_df['TAX']<600]

Looking at the Distribution

The overall distribution, particularly the TAX, PTRATIO, and RAD, has improved slightly.

Correlation

Perfect correlation is denoted by the lightest values. Medium correlation between the columns is represented by the reds, while negative correlation is represented by black.

With a value of 0.89, we can see that 'MEDV', the median price we wish to predict, is strongly correlated with the number of rooms 'RM'. It is followed by the residential land 'ZN' with a value of 0.32 and the 'B' column with a value of 0.19.

The metrics that are most connected with price will be plotted.

Feature Engineering

Feature Scaling

Gradient descent is aided by feature scaling, which ensures that all features are on the same scale. It makes locating the optimum much easier.

Standardization is one strategy to employ. It replaces each value with (value - mean) / standard deviation, so that the feature has a mean of approximately zero and a standard deviation of one.

    def standard(X):
        '''Standardize each column of X to zero mean and unit standard deviation'''
        mu = np.mean(X, axis=0)    # per-column mean
        std = np.std(X, axis=0)    # per-column standard deviation
        sta = (X - mu) / std       # standardization
        return mu, std, sta

    mu, std, sta = standard(X)
    X = sta
    X

          CRIM         ZN      INDUS       CHAS        NOX         RM        AGE        DIS        RAD        TAX    PTRATIO          B      LSTAT
0    -0.609129   0.092792  -1.019125  -0.280976   0.258670   0.279135   0.162095  -0.167660  -2.105767  -0.235130  -1.136863   0.401318  -0.933659
1    -0.575698  -0.598153  -0.225291  -0.280976  -0.423795   0.049252   0.648266   0.250975  -1.496334  -1.032339  -0.004175   0.401318  -0.219350
2    -0.575730  -0.598153  -0.225291  -0.280976  -0.423795   1.189708   0.016599   0.250975  -1.496334  -1.032339  -0.004175   0.298315  -1.096782
3    -0.567639  -0.598153  -1.040806  -0.280976  -0.532594   0.910565  -0.526350   0.773661  -0.886900  -1.327601   0.403593   0.343869  -1.283945
4    -0.509220  -0.598153  -1.040806  -0.280976  -0.532594   1.132984  -0.228261   0.773661  -0.886900  -1.327601   0.403593   0.401318  -0.873561
..         ...        ...        ...        ...        ...        ...        ...        ...        ...        ...        ...        ...        ...
501  -0.519445  -0.598153   0.585220  -0.280976   0.604848   0.306004   0.300494  -0.936773  -2.105767  -0.574682   1.445666   0.277056  -0.128344
502  -0.547094  -0.598153   0.585220  -0.280976   0.604848  -0.400063   0.570195  -1.027984  -2.105767  -0.574682   1.445666   0.401318  -0.229652
503  -0.522423  -0.598153   0.585220  -0.280976   0.604848   0.877725   1.077657  -1.085260  -2.105767  -0.574682   1.445666   0.401318  -0.820331
504  -0.444652  -0.598153   0.585220  -0.280976   0.604848   0.606046   1.017329  -0.979587  -2.105767  -0.574682   1.445666   0.314006  -0.676095
505  -0.543685  -0.598153   0.585220  -0.280976   0.604848  -0.534410   0.715691  -0.924173  -2.105767  -0.574682   1.445666   0.401318  -0.435703

Choose and Train the Model

For the sake of the project, we'll apply linear regression.

Typically, we run numerous models and select the best one based on a particular criterion.

Linear regression is a supervised machine learning model in which the response is continuous.

Form of Linear Regression

y = θ₁ + θ₂X for a single input, or y = θ₁ + θ₂X₁ + θ₃X₂ + θ₄X₃ for several inputs

y is the target you will be predicting

θ is the coefficient (θ₁ is the intercept)

X is the input

We will use scikit-learn (sklearn) to develop and train the model

    # Import the libraries to train the model
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression

We use the train/test approach: the model learns from one part of the data (the training set) and predicts on another part (the test set).

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
    # Create and train the model
    model = LinearRegression().fit(X_train, y_train)
    # Generate predictions
    predictions_test = model.predict(X_test)
    # Inspect the learned parameters
    coefficient = model.coef_
    intercept = model.intercept_
    print(coefficient, intercept)

    [7.22218258] 24.66379606613584

In this example, the model learns the following hypothesis (the exact numbers change from run to run because the split is random):

Price = 24.85 + 7.18 * Room

It is interpreted as:

For a given house:

A one-unit increase in the (standardized) number of rooms is associated with a 7.18-unit increase in price.

As a side note, this is an association, not a cause!

Interpretation

You will need a metric to determine how well the hypothesis fits. The RMSE approach will be used.

Root Mean Square Error (RMSE) is defined as the square root of the mean of the squared errors. The error is the difference between the true and predicted values. RMSE is popular because it is expressed in y-units, which is the median price of a home in our scenario.

    def rmse(predict, actual):
        return np.sqrt(np.mean(np.square(predict - actual)))

    # Split the data into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
    # Create and train the model
    model = LinearRegression().fit(X_train, y_train)
    # Generate predictions
    predictions_test = model.predict(X_test)
    # Compute loss to evaluate the model
    coefficient = model.coef_
    intercept = model.intercept_
    print(coefficient, intercept)
    loss = rmse(predictions_test, y_test)
    print('loss: ', loss)
    print(model.score(X_test, y_test))  # R^2 score

    [7.43327725] 24.912055881970886
    loss: 3.9673165450580714
    0.7552661033654667

The loss is 3.96.

Because the y-units are the median value of owner-occupied homes in thousands of dollars, the predictions are typically off by about 3,960 dollars.

When training the model, you will see high variance depending on how you divide the data: the coefficient and intercept vary. This is because the train/test approach places a randomly chosen set of rows in either the train or the test set. As a result, our hypothesis changes each time the dataset is split.

This problem can be solved using a technique called cross-validation.
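To make the idea concrete, here is a minimal k-fold cross-validation sketch written with the standard library only. It fits a one-feature least-squares line, mirroring the single-feature model above, and it runs on synthetic data rather than the housing set; the helper names (`fit_line`, `k_fold_rmse`) are invented for this example. With scikit-learn, `cross_val_score` provides the same idea ready-made.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    return (sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)) ** 0.5


def k_fold_rmse(xs, ys, k=5, seed=1):
    """Return the RMSE measured on each of the k held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [intercept + slope * xs[i] for i in fold]
        scores.append(rmse(predictions, [ys[i] for i in fold]))
    return scores


# Synthetic data only (not the housing set): y = 25 + 7x plus Gaussian noise.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [25 + 7 * x + rng.gauss(0, 3) for x in xs]

scores = k_fold_rmse(xs, ys)
print("per-fold RMSE:", [round(s, 2) for s in scores])
print("mean RMSE:", round(mean(scores), 2))
```

Every row is used for testing exactly once, so the averaged RMSE depends far less on one lucky or unlucky split than a single train/test division does.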

Improving the Model

With 'Forward Selection,' we'll iterate through each feature to help us choose the number of features to include in our model.

Forward Selection

  1. Choose the most appropriate variable (in our case, based on high correlation)
  2. Add the next best variable to the model
  3. Repeat until some predetermined condition is met.

We'll use a random state of 1 so that each iteration yields the same outcome.

    cols = []
    los = []
    los_train = []
    scor = []
    i = 0
    while i < len(high_corr_var):
        cols.append(high_corr_var[i])

        # Select input variables
        X = new_df[cols]

        # Mean normalization
        mu, std, sta = standard(X)
        X = sta

        # Split the data into training and testing sets
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

        # Fit the model to the training set
        lnreg = LinearRegression().fit(X_train, y_train)

        # Make predictions on the training set
        prediction_train = lnreg.predict(X_train)

        # Make predictions on the testing set
        prediction = lnreg.predict(X_test)

        # Compute the loss on the testing and training sets
        loss = rmse(prediction, y_test)
        loss_train = rmse(prediction_train, y_train)
        los_train.append(loss_train)
        los.append(loss)

        # Compute the score
        score = lnreg.score(X_test, y_test)
        scor.append(score)

        i += 1

With a smaller set of variables, the loss is large and the model overgeneralizes. With a large number of variables, the loss is reduced, but if the model grows too precise, it may not generalize well to new data.

In order for our model to generalize well to another set of data, we might use 6 or 7 features. Features are chosen in descending order of how strongly they correlate with price.

    high_corr_var

    ['RM', 'ZN', 'B', 'CHAS', 'RAD', 'DIS', 'CRIM', 'NOX', 'AGE', 'TAX', 'INDUS', 'PTRATIO', 'LSTAT']

'RM' has the highest price correlation, and 'LSTAT' has a negative price correlation.

    # Create a list of feature names
    feature_cols = ['RM', 'ZN', 'B', 'CHAS', 'RAD', 'CRIM', 'DIS', 'NOX']
    # Select input variables
    X = new_df[feature_cols]
    # Feature engineering: standardize before splitting so the scaled features are the ones used
    mu, std, sta = standard(X)
    X = sta
    # Split the data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
    # Fit the model to the training data
    lnreg = LinearRegression().fit(X_train, y_train)
    # Make predictions on the testing set
    prediction = lnreg.predict(X_test)
    # Compute the loss and the R^2 score
    loss = rmse(prediction, y_test)
    print('loss: ', loss)
    print(lnreg.score(X_test, y_test))

    loss: 3.212659865936143
    0.8582338376696363

The test set yielded a loss of 3.21 and an R² score of about 0.86.

Other hyperparameters, such as alpha, the learning rate used when a model is trained with gradient descent, could still be tuned to improve our model. Alternatively, return to the preprocessing section and work on improving the feature distributions.

https://www.websitescraper.com/how-to-predict-housing-prices-with-linear-regression.php

Sheldon Grant

1654988880

Learning-edge-computing: Notes and Code Examples About Edge Computing

Learning Edge Computing

Projects to gather notes and examples around edge computing.

Notes:

Edge computing background

Over the recent years more and more IoT devices have been deployed and these devices are creating more and more data that we want to use in some way. What is currently most often the case is that these IoT devices are connected to some sort of gateway what will route the data to a cloud service for processing (analysis, processing, storing etc.).

The number of deployed devices is increasing every day and more and more data needs to be handled, which is going to cause issues with bandwidth. There are also more devices that require lower latency from their services. For example, self-driving cars (connected cars) might not have time to wait for cloud service responses, and surveillance cameras generate huge amounts of data. These are some of the driving forces behind moving networked computing resources closer to where the data is created.

Most IoT devices are resource constrained: they might not have powerful processors, or they might be limited to battery power and therefore need to do as little processing as possible. These devices "can't" really send their data directly to a cloud; instead, they send small amounts of data to a gateway node, which in turn sends it along to some cloud service, generally speaking. This is called a Cloud-centric Internet of Things (CIoT).

CIoT:
                                                          +-------------------+
                                                          |                   |
                                                          |                   |
                                                          |                   |
   +----------+      +-------+                            |   Cloud Services  |
   |IoT Device|<---->|Gateway|<-------------------------->|                   |
   +----------+  +-->|       |                            |                   |
   +----------+  |   +-------+                            |                   |
   |IoT Device|<-+                                        |                   |
   +----------+                                           +-------------------+

Note that in this case the gateway is acting more like a router and does not store or process the data from the IoT devices.

This architecture has some issues: as more and more IoT devices are deployed, more and more data is going to be transmitted to the cloud services, which is going to cause bandwidth issues.
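A back-of-the-envelope calculation makes the bandwidth concern concrete. The device counts, message size, and send rate below are invented numbers, and `uplink_mbps` is a hypothetical helper, so treat this as a sketch rather than a measurement.

```python
def uplink_mbps(devices, bytes_per_message, messages_per_second):
    """Aggregate uplink bandwidth in megabits per second."""
    bits_per_second = devices * bytes_per_message * messages_per_second * 8
    return bits_per_second / 1_000_000

# Assumed workload: each device sends one 200-byte message per second.
small = uplink_mbps(devices=100, bytes_per_message=200, messages_per_second=1)
large = uplink_mbps(devices=100_000, bytes_per_message=200, messages_per_second=1)

print(f"100 devices:     {small:.2f} Mbit/s")
print(f"100,000 devices: {large:.2f} Mbit/s")
```

Under these assumptions the traffic towards the cloud grows linearly with the number of devices when the gateway forwards everything unchanged.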

There is also an issue with latency for some types of applications; for example, a self-driving car might not be able to wait for a packet to be transported to and from a cloud service. Another issue is that an application might not tolerate being disconnected from the cloud service. Again, a self-driving car must be able to continue if such a break occurs.
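One way to picture the disconnect problem is a decision function that prefers the cloud but falls back to a local rule when the cloud cannot be reached. Everything here is hypothetical: the `CloudUnavailable` exception, the `decide` function, and the braking thresholds are made up for the sketch.

```python
class CloudUnavailable(Exception):
    """Raised by the (hypothetical) cloud client when the link is down or too slow."""

def decide(reading, ask_cloud, decide_locally):
    """Try the cloud first; fall back to the local rule if it cannot answer."""
    try:
        return ask_cloud(reading), "cloud"
    except CloudUnavailable:
        return decide_locally(reading), "local"

def local_brake_rule(distance_m):
    # Simple onboard rule that needs no network access.
    return "brake" if distance_m < 10.0 else "continue"

def offline_cloud(_reading):
    # Simulates a lost connection.
    raise CloudUnavailable("no connectivity")

def online_cloud(distance_m):
    # Simulates a cloud service with a slightly different rule.
    return "brake" if distance_m < 12.0 else "continue"

print(decide(5.0, offline_cloud, local_brake_rule))
print(decide(11.0, online_cloud, local_brake_rule))
```

The point of the design is that the local rule is always available, so losing the connection degrades the quality of the decision rather than preventing one.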

So the idea is to move some of the cloud service functionality closer to the IoT devices, to the edge of the network. This functionality includes computing, storage, and networking. The resources that provide it are called edge servers/computers:

Edge computing:
                                                          +-------------------+
                                                          |                   |
                                                          |                   |
                     +--------+                           |                   |
   +----------+      |Edge    |                           |   Cloud Services  |
   |IoT Device|<---->|compute |<------------------------->|                   |
   +----------+  +-->|resource|                           |                   |
   +----------+  |   +--------+                           |                   |
   |IoT Device|<-+                                        |                   |
   +----------+                                           +-------------------+

An edge server is a compute resource located where, or close to where, data is being generated. So it receives data from IoT devices like sensors and can store, process, and/or send the data to the cloud (or all three I guess). But data does not need to be sent to the cloud and might be processed by the edge compute resource itself.
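The store/process/send roles described above can be sketched as a tiny class. This is only an illustration under assumed names: `EdgeNode`, its `threshold` parameter, and the `send_to_cloud` callback are hypothetical and do not come from any particular edge platform.

```python
class EdgeNode:
    """Toy edge compute resource: stores every reading, forwards only notable ones."""

    def __init__(self, threshold, send_to_cloud):
        self.threshold = threshold
        self.send_to_cloud = send_to_cloud  # callback standing in for a cloud client
        self.local_store = []

    def handle(self, device_id, value):
        # Store: keep the raw reading at the edge.
        self.local_store.append((device_id, value))
        # Process: decide locally whether the reading is interesting.
        if value > self.threshold:
            # Send: only notable readings travel to the cloud.
            self.send_to_cloud({"device": device_id, "value": value})
            return True
        return False

uploaded = []
node = EdgeNode(threshold=30.0, send_to_cloud=uploaded.append)
for device_id, value in [("s1", 21.0), ("s2", 35.5), ("s1", 22.5)]:
    node.handle(device_id, value)

print(len(node.local_store), "readings stored at the edge")
print(len(uploaded), "readings sent to the cloud")
```

In this sketch the decision about what is worth sending is made next to the sensors, so ordinary readings never leave the edge.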

Now, the environments where these compute resources are located will look very different. For example, let's say that I'm at home watching IP-based TV or using an application on a Wi-Fi connected device. Moving an application closer to my location would mean placing/deploying it perhaps in my internet service provider's (ISP) network or somewhere in Sweden (I think Netflix does this, for example). I imagine that doing this would be like deploying in a Kubernetes-like environment; at least it would be a non-resource-constrained environment where a full operating system and memory resources are available. The runtime used in this case could be any runtime for Java, JavaScript (Node.js, Deno), DotNet, etc.:

                                                          +-------------------+
                Internet Service Provider environment     |                   |
                                                          |                   |
                     +--------+                           |                   |
   +----------+      |Edge    |                           |   Cloud Services  |
   | IP TV    |<---->|compute |<------------------------->|                   |
   +----------+      |resource|                           |                   |
                     +--------+                           |                   |
                    "Normal servers"                      |                   |
                                                          +-------------------+

Now, let's say I switch to my mobile phone and start using the application on it. This would now be using my telco operator and going over their network. Placing the same application closer would in this case mean placing it in the telco operator's environment (like in a base station). This environment is now similar to a cloud operator environment, since telcos have moved from hardware-specific network devices to virtualized software that can run on commodity hardware and be managed in much the same way as a cloud environment, using platforms like Kubernetes. So in this case we have access to a similar non-resource-constrained environment, where I expect the runtime to be the same as in the previous example, that is, any runtime for Java, JavaScript (Node.js, Deno), DotNet, etc.

                                                          +-------------------+
                Telco Operator environment                |                   |
                                                          |                   |
                     +--------+                           |                   |
   +-------------+   |Edge    |                           |   Cloud Services  |
   | Mobile Phone|-->|compute |<------------------------->|                   |
   +-------------+   |resource|                           |                   |
                     +--------+                           |                   |
                   "Normal servers"                       |                   |
                                                          +-------------------+

But there are also other types of Edges, which could be on factory floors, located in hospitals, spread out across a city, or in cars, where smaller devices containing edge compute resources need to be placed closer to where data is generated so that it can be acted upon in the shortest time possible. These can also act as aggregators and limit the amount of data being sent to backend cloud applications.

                Public environments                       +-------------------+
                Factory environments                      |                   |
                Embedded in products (cars and others)    |                   |
                                                          |                   |
                     +--------+                           |                   |
   +----------+      |Edge    |                           |   Cloud Services  |
   | IP TV    |<---->|compute |<------------------------->|                   |
   +----------+      |resource|                           |                   |
                     +--------+                           |                   |
                  "Constrained compute devices"           |                   |
                                                          +-------------------+
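The aggregator role mentioned above can be sketched as follows: instead of forwarding every raw reading, the edge device sends one summary per window. The function names `summarize` and `aggregate` and the window size are assumptions made for this example.

```python
def summarize(readings):
    """Collapse a window of raw readings into a single summary record."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }

def aggregate(readings, window):
    """Split readings into consecutive windows and summarize each one."""
    return [
        summarize(readings[start:start + window])
        for start in range(0, len(readings), window)
    ]

# Six raw readings (made up) become two summary messages.
raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
summaries = aggregate(raw, window=3)

print(len(raw), "raw readings ->", len(summaries), "messages to the cloud")
print(summaries[0])
```

The trade-off is that the backend only sees the summaries, so the window size decides how much detail is lost in exchange for the bandwidth saved.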

So what options are there for deploying to these resource constrained environments? I currently don't know the answer to this question.

I initially thought of the edge compute resources as normal servers in a rack, for example, but these can be small dedicated devices (small single-board computers) like a LattePanda, a UDOO BOLT, a Khadas Edge V, or a Jetson Nano.

Now, an extension of the Edge compute resource is to have a mini cloud of sorts that has some of the same features as a cloud, like scalability, manageability, and the rest of the functionality that enterprise cloud providers offer. This is a layer between the edge server and the cloud (or perhaps it even replaces the edge server; this is not clear to me yet). What are these things called? Well, they are called Fog (as in cloud, but closer to the ground, or something like that):

Fog computing:
                                                          +-------------------+
                                                          |                   |
                                                          |                   |
                     +--------+        +--------+         |                   |
   +----------+      |Edge    |        |  Fog   |         |   Cloud Services  |
   |IoT Device|<---->|compute |<-------| layer  |-------->|                   |
   +----------+  +-->|resource|        |        |         |                   |
   +----------+  |   +--------+        +--------+         |                   |
   |IoT Device|<-+                                        |                   |
   +----------+                                           +-------------------+

This idea, called Fog computing, was coined by Cisco in 2014, and later in 2015 IBM coined the term Edge computing.

The Fog layer receives data from the edge layer and can further filter it down, or can act on the data with or without sending it through to the cloud services. This saves bandwidth and also reduces latency.

OpenFog

Fog layer:
 +-------------+ +---------------+ +------------+ +----------+ +----------+
 | Compute     | | Acceleration  | | Networking | | Control  | | Storage  |
 +-------------+ +---------------+ +------------+ +----------+ +----------+

Compute: VM/Containers, iPaaS/SaaS, On-Demand data Processing (ODP), Context as a Service (CaaS)
Acceleration: Network Function virtualization/Software Defined Networking, GPU/FPGA
Networking: TCP/UDP IP, HTTP/CoAP, XMPP/MQTT/AMQP, 802.15.4 (ZigBee), Z-Wave, Bluetooth
Control: Deployment, Actuation, Mediation, Security
Storage: Caching

Multi-access Edge Computing (MEC)

See MEC.

Use cases

Autonomous vehicles

These will generate huge amounts of data and need to respond in real time. This will require them to have onboard Edge compute resources that can handle this. These will b

Cloud Gaming

Cloud gaming companies are looking to build edge servers as close to gamers as possible in order to reduce latency and provide a fully responsive and immersive gaming experience.

Health Care

Healthcare data comes from numerous medical devices, including those in doctors' offices, in hospitals, and from consumer wearables bought by patients themselves. But all that data doesn't need to be moved to centralized servers for analysis and storage, a process that could create bandwidth congestion and an explosion in storage needs.

In this case, artificial intelligence (AI) and machine learning capable edge compute resources might need to be deployed at the Edge (somewhere at the hospital; I imagine these as small network-connected devices rather than large servers). These would help medical staff make decisions in real time and minimize the number of false alarms.

Industrial IoT

There are a lot of sensors and IoT devices in industrial environments, and having edge compute resources closer to where this data is used provides low latency, so that immediate responses to problems are possible. Edge compute resources with AI/ML can also help improve quality assurance.

Security

Surveillance systems can benefit from the low latency and reliability of edge computing because it’s often necessary to respond to security threats within seconds. Edge computing also significantly reduces bandwidth costs in video surveillance, since the vast majority of surveillance footage requires no response.
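As a rough sketch of how an edge device could avoid uploading footage that requires no response, the example below keeps only frames that differ noticeably from the previous one. Frames are modelled as flat lists of grayscale values, and the names `frame_changed` and `frames_to_upload` as well as both thresholds are assumptions for illustration; real systems use far more robust detection.

```python
def frame_changed(previous, current, pixel_delta=10, changed_fraction=0.05):
    """True if enough pixels differ between two equally sized grayscale frames."""
    changed = sum(1 for p, c in zip(previous, current) if abs(p - c) > pixel_delta)
    return changed / len(current) > changed_fraction

def frames_to_upload(frames):
    """Return the indices of frames that differ from the frame before them."""
    kept = []
    for index in range(1, len(frames)):
        if frame_changed(frames[index - 1], frames[index]):
            kept.append(index)
    return kept

# Four tiny synthetic frames: nothing happens, then an object appears and stays.
static = [0] * 100
moved = [0] * 90 + [255] * 10
frames = [static, static, moved, moved]

print("frames worth uploading:", frames_to_upload(frames))
```

Only the frame where the scene changes is selected, so the unchanged frames never need to leave the camera site.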

Author: Danbev
Source Code: https://github.com/danbev/learning-edge-computing 
License: 

#node #nodejs #computing #edge 

Alec  Nikolaus

1596036600

Edge Is Taking Data to a Higher Level

This article is an introduction to edge computing. Let’s have a look at what edge computing is and its advantages.

Introduction

For years, the processing and storage of data in interconnected computer systems have been based on cloud computing. Cloud computing relies on centralized data storage, where all devices performing internet operations depend on the efficiency of the cloud service provider.

Because the data is centralized, concerns have been raised about this infrastructure, including security and speed of operation. A single breach can affect a large number of users. Moreover, people’s right to privacy may be violated, since service providers have the opportunity to access and monitor people’s details and demographic characteristics.

Users may also experience latency when data is transmitted from the cloud to the end user, due to factors such as traffic and distance.
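The effect of distance alone can be estimated with simple arithmetic. The sketch below assumes signals travel through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) and ignores routing, queuing, and processing delays, so real round trips are longer. The distances and the name `round_trip_ms` are invented for the example.

```python
# Approximate signal speed in optical fiber, about two thirds of c.
SPEED_IN_FIBER_KM_PER_S = 200_000

def round_trip_ms(distance_km):
    """Propagation-only round-trip time in milliseconds (no routing or queuing)."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_S * 1000

# Assumed distances: a far-away cloud region versus a nearby edge site.
cloud_rtt = round_trip_ms(2000)
edge_rtt = round_trip_ms(20)

print(f"cloud, 2000 km away: {cloud_rtt:.1f} ms")
print(f"edge,    20 km away: {edge_rtt:.1f} ms")
```

Under these assumptions the distant data center costs about 20 ms per round trip from propagation alone, while the nearby site costs about 0.2 ms.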

The introduction of edge computing has proved effective in addressing the problems associated with cloud computing. Let’s have a look at what edge computing is and its advantages.

Edge Computing From a Broad Perspective

Edge computing brings internet data closer to the end user. Service providers do this by installing edge devices close to the end user. A system of interconnected nodes enables the transfer of data from one edge device to another, which makes information easier to access.

Response time, which has been a critical concern especially for heavy commercial consumers, is improved by edge computing. Since the edge devices are close to the end user, the time it takes information to travel from one end user to another, or from an end user to an AI system on the edge devices, is minimized. In addition, the congestion seen in cloud computing is reduced, since each decentralized edge device serves fewer users, which improves response time and rate.

What Is So Unique in Edge Computing

An edge device can be described as a system that makes data available to users at their location and delivers it to them.

Many systems, such as CCTV cameras and traffic systems at roundabouts and other critical points, depend heavily on real-time processing of data and find edge computing useful. The CCTV cameras collect a huge amount of data, which can be as high as 10 GB per second, especially in a moving car over about a mile. If the data has to be transferred to the cloud for AI (artificial intelligence) to assist in processing it, the resulting latency can lead to poor decision making, especially in self-driving cars or other AI-dependent systems.

Edge devices enable real-time processing of large volumes of data over the shortest distance, greatly reducing the latency experienced with cloud computing. While cloud computing may be efficient at handling huge data sets because of its capacity and its specialized, sophisticated hardware, edge devices are unmatched at processing real-time data.

#cloud computing #data #edge computing #edge #interner of things #cloud