Modern app development with SwiftUI and Combine eliminates a lot of boilerplate code. Tools like Playgrounds extend this further by allowing quick prototyping. One common task is loading arbitrary content (JSON or binary) from the network and displaying it in a SwiftUI view. The other day I was looking for a simple yet elegant way to do this in quick prototypes. I came up with a reusable view that can load arbitrary content, and after a couple of iterations I discovered some interesting tricks to share.
Content Loading stages: initial, in progress, success, and failure
Content loading is at least a three-stage process: besides the initial state, a load can be in progress and then finish with either success or failure.
SwiftUI encourages creating small, reusable views and using composition to create the complete picture. Each stage of the content loading process will require a view, and the container view will compose the result.
The content loading process can be represented using an enum with associated values. Swift provides the Result type, which could be used to represent the completed process. However, I find it more convenient to represent the result as separate success and failure cases, because of the added support for switch statements in function builders (Xcode 12).
It is easy to see what we are aiming for: the container view can switch over the loading state to provide a corresponding view.
enum RemoteContentLoadingState<Value> {
    case initial
    case inProgress
    case success(_ value: Value)
    case failure(_ error: Error)
}
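To sketch where we are heading (the names here are my illustration, not the article's final code), a container view can switch over this state directly in its body:

```swift
import SwiftUI

// A hypothetical container view that renders a subview per loading stage.
struct LoadingStateView<Value, Content: View>: View {
    let state: RemoteContentLoadingState<Value>
    let content: (Value) -> Content

    var body: some View {
        // switch in a view builder requires Xcode 12 / Swift 5.3.
        switch state {
        case .initial:
            Text("Ready to load")
        case .inProgress:
            ProgressView()
        case .success(let value):
            content(value)
        case .failure(let error):
            Text(error.localizedDescription)
        }
    }
}
```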
You probably know by now that a view in SwiftUI cannot load content by itself. This is because SwiftUI views are value types; content loading requires back-and-forth communication, which is possible only with reference types.
The way to provide remote content to a SwiftUI view is by using the ObservedObject property wrapper and the ObservableObject protocol. The ObservableObject protocol synthesizes a publisher that emits before the object changes. Typically, you would create a class that conforms to ObservableObject and reference it using the ObservedObject property wrapper from your view. Because we are building a reusable view, it makes sense to inject the ObservableObject. To start, declare the RemoteContent protocol.
protocol RemoteContent : ObservableObject {

    associatedtype Value

    // The current stage of the loading process.
    var loadingState: RemoteContentLoadingState<Value> { get }

    // Begin and cancel loading of the remote content.
    func load()
    func cancel()
}
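As a rough sketch of how a conforming type might look (my illustration, assuming a JSON endpoint and Combine's URLSession publisher, not necessarily the article's final implementation):

```swift
import Combine
import Foundation

// A hypothetical RemoteContent conformance that downloads and decodes JSON.
final class DecodableRemoteContent<Value: Decodable>: RemoteContent {
    @Published private(set) var loadingState: RemoteContentLoadingState<Value> = .initial

    private let url: URL
    private var cancellable: AnyCancellable?

    init(url: URL) {
        self.url = url
    }

    func load() {
        loadingState = .inProgress
        cancellable = URLSession.shared.dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Value.self, decoder: JSONDecoder())
            .map(RemoteContentLoadingState.success)
            .catch { Just(RemoteContentLoadingState.failure($0)) }
            .receive(on: DispatchQueue.main)
            .assign(to: \.loadingState, on: self)
    }

    func cancel() {
        cancellable?.cancel()
        loadingState = .initial
    }
}
```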
#ios #apple #xcode #swiftui #swift
Run C# scripts from the .NET CLI, define NuGet packages inline and edit/debug them in VS Code - all of that with full language services support from OmniSharp.
| Name | Version | Framework(s) |
|---|---|---|
| dotnet-script (global tool) | | net6.0, net5.0, netcoreapp3.1 |
| Dotnet.Script (CLI as Nuget) | | net6.0, net5.0, netcoreapp3.1 |
| Dotnet.Script.Core | | netcoreapp3.1, netstandard2.0 |
| Dotnet.Script.DependencyModel | | netstandard2.0 |
| Dotnet.Script.DependencyModel.Nuget | | netstandard2.0 |
The only thing we need to install is the .NET Core 3.1 or .NET 5.0 SDK. .NET Core 2.1 introduced the concept of global tools, meaning that you can install dotnet-script using nothing but the .NET CLI.
dotnet tool install -g dotnet-script
You can invoke the tool using the following command: dotnet-script
Tool 'dotnet-script' (version '0.22.0') was successfully installed.
The advantage of this approach is that you can use the same command for installation across all platforms. .NET Core SDK also supports viewing a list of installed tools and their uninstallation.
dotnet tool list -g
Package Id Version Commands
---------------------------------------------
dotnet-script 0.22.0 dotnet-script
dotnet tool uninstall dotnet-script -g
Tool 'dotnet-script' (version '0.22.0') was successfully uninstalled.
On Windows, we can also install dotnet-script using Chocolatey.
choco install dotnet.script
We also provide a PowerShell script for installation.
(new-object Net.WebClient).DownloadString("https://raw.githubusercontent.com/filipw/dotnet-script/master/install/install.ps1") | iex
On Linux and macOS, we can use a shell script:
curl -s https://raw.githubusercontent.com/filipw/dotnet-script/master/install/install.sh | bash
If permission is denied, we can try with sudo:
curl -s https://raw.githubusercontent.com/filipw/dotnet-script/master/install/install.sh | sudo bash
A Dockerfile for running dotnet-script in a Linux container is available. Build:
cd build
docker build -t dotnet-script -f Dockerfile ..
And run:
docker run -it dotnet-script --version
You can manually download all the releases in zip format from the GitHub releases page.
Our typical helloworld.csx might look like this:
Console.WriteLine("Hello world!");
That is all it takes, and we can execute the script. Arguments are accessible via the global Args collection.
dotnet script helloworld.csx
Simply create a folder somewhere on your system and issue the following command.
dotnet script init
This will create main.csx along with the launch configuration needed to debug the script in VS Code.
.
├── .vscode
│ └── launch.json
├── main.csx
└── omnisharp.json
We can also initialize a folder using a custom filename.
dotnet script init custom.csx
Instead of main.csx, which is the default, we now have a file named custom.csx.
.
├── .vscode
│ └── launch.json
├── custom.csx
└── omnisharp.json
Note: Executing dotnet script init inside a folder that already contains one or more script files will not create the main.csx file.
Scripts can be executed directly from the shell as if they were executables.
foo.csx arg1 arg2 arg3
OSX/Linux
Just like all scripts on OSX/Linux, you need a #! directive and you must mark the file as executable via chmod +x foo.csx. If you use dotnet script init to create your .csx, it will automatically have the #! directive and be marked as executable.
The OSX/Linux shebang directive should be #!/usr/bin/env dotnet-script
#!/usr/bin/env dotnet-script
Console.WriteLine("Hello world");
You can execute your script using dotnet script or dotnet-script, both of which let you pass arguments to control your script execution:
foo.csx arg1 arg2 arg3
dotnet script foo.csx -- arg1 arg2 arg3
dotnet-script foo.csx -- arg1 arg2 arg3
All arguments after -- are passed to the script in the following way:
dotnet script foo.csx -- arg1 arg2 arg3
Then you can access the arguments in the script context using the global Args collection:
foreach (var arg in Args)
{
Console.WriteLine(arg);
}
All arguments before -- are processed by dotnet script. For example, the following command line
dotnet script -d foo.csx -- -d
will pass the -d before -- to dotnet script and enable debug mode, whereas the -d after -- is passed to the script for its own interpretation of the argument.
dotnet script has built-in support for referencing NuGet packages directly from within the script.
#r "nuget: AutoMapper, 6.1.0"
Note: Omnisharp needs to be restarted after adding a new package reference
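For example, a small script along these lines (my illustrative sketch, using the AutoMapper version shown above) can use the referenced package right away:

```csharp
#r "nuget: AutoMapper, 6.1.0"

using AutoMapper;

public class Source { public string Name { get; set; } }
public class Destination { public string Name { get; set; } }

// Configure a trivial mapping and run it.
var config = new MapperConfiguration(cfg => cfg.CreateMap<Source, Destination>());
var mapper = config.CreateMapper();
var destination = mapper.Map<Destination>(new Source { Name = "dotnet-script" });
Console.WriteLine(destination.Name); // prints "dotnet-script"
```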
We can define package sources using a NuGet.Config file in the script root folder. In addition to being used during execution of the script, it will also be used by OmniSharp, which provides language services for packages resolved from these package sources.
As an alternative to maintaining a local NuGet.Config file, we can define these package sources globally, either at the user level or at the computer level, as described in Configuring NuGet Behaviour.
It is also possible to specify package sources when executing the script.
dotnet script foo.csx -s https://SomePackageSource
Multiple package sources can be specified like this:
dotnet script foo.csx -s https://SomePackageSource -s https://AnotherPackageSource
dotnet-script can create a standalone executable or DLL for your script.
| Switch | Long switch | Description |
|---|---|---|
| -o | --output | Directory where the published executable should be placed. Defaults to a 'publish' folder in the current directory. |
| -n | --name | The name for the generated DLL (executable not supported at this time). Defaults to the name of the script. |
| | --dll | Publish to a .dll instead of an executable. |
| -c | --configuration | Configuration to use for publishing the script [Release/Debug]. Default is "Debug". |
| -d | --debug | Enables debug output. |
| -r | --runtime | The runtime used when publishing the self-contained executable. Defaults to your current runtime. |
You can run the executable directly, independent of a dotnet installation, while the DLL can be run using the dotnet CLI like this:
dotnet script exec {path_to_dll} -- arg1 arg2
We provide two types of caching, the dependency cache and the execution cache, which are explained in detail below. In order for either of these caches to be enabled, all NuGet package references must be specified using an exact version number. The reason for this constraint is that we need to make sure we don't execute a script with a stale dependency graph.
In order to resolve the dependencies for a script, a dotnet restore is executed under the hood to produce a project.assets.json file, from which we can figure out all the dependencies we need to add to the compilation. This is an out-of-process operation and represents a significant overhead to the script execution. So this cache works by looking at all the dependencies specified in the script(s), either in the form of NuGet package references or assembly file references. If these dependencies match the dependencies from the last script execution, we skip the restore and read the dependencies from the already generated project.assets.json file. If any of the dependencies have changed, we must restore again to obtain the new dependency graph.
In order to execute a script it needs to be compiled first and since that is a CPU and time consuming operation, we make sure that we only compile when the source code has changed. This works by creating a SHA256 hash from all the script files involved in the execution. This hash is written to a temporary location along with the DLL that represents the result of the script compilation. When a script is executed the hash is computed and compared with the hash from the previous compilation. If they match there is no need to recompile and we run from the already compiled DLL. If the hashes don't match, the cache is invalidated and we recompile.
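As a rough illustration of that idea (a simplified sketch, not dotnet-script's actual implementation; it assumes dotnet-script's default using directives for System.IO, System.Linq, System.Text and System.Collections.Generic), the cache key could be computed like this:

```csharp
using System.Security.Cryptography;

// Hash the combined source of all script files; recompile only when the hash changes.
string ComputeScriptHash(IEnumerable<string> scriptFiles)
{
    var combinedSource = string.Concat(scriptFiles.OrderBy(f => f).Select(File.ReadAllText));
    using var sha = SHA256.Create();
    return Convert.ToHexString(sha.ComputeHash(Encoding.UTF8.GetBytes(combinedSource)));
}
```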
You can override this automatic caching by passing the --no-cache flag, which bypasses both caches and causes dependency resolution and script compilation to happen every time we execute the script.
The temporary location used for caches is a sub-directory named dotnet-script under (in order of priority):

- DOTNET_SCRIPT_CACHE_LOCATION, if defined and its value is not empty
- $XDG_CACHE_HOME if defined, otherwise $HOME/.cache (Linux)
- ~/Library/Caches (macOS)
- Path.GetTempPath for the platform

The days of debugging scripts using Console.WriteLine are over. One major feature of dotnet script is the ability to debug scripts directly in VS Code. Just set a breakpoint anywhere in your script file(s) and hit F5 (start debugging).
Script packages are a way of organizing reusable scripts into NuGet packages that can be consumed by other scripts. This means that we now can leverage scripting infrastructure without the need for any kind of bootstrapping.
A script package is just a regular NuGet package that contains script files inside the content or contentFiles folder.
The following example shows how the scripts are laid out inside the NuGet package according to the standard convention.
└── contentFiles
└── csx
└── netstandard2.0
└── main.csx
This example contains just the main.csx file in the root folder, but packages may have multiple script files, either in the root folder or in subfolders below it.
When loading a script package we will look for an entry point script to be loaded, identified by convention: for example, main.csx in the root folder. If the entry point script cannot be determined, we will simply load all the script files in the package.
The advantage of using an entry point script is that we can control loading other scripts from the package.
To consume a script package, all we need to do is specify the NuGet package in the #load directive.
The following example loads the simple-targets package that contains script files to be included in our script.
#load "nuget:simple-targets-csx, 6.0.0"
using static SimpleTargets;
var targets = new TargetDictionary();
targets.Add("default", () => Console.WriteLine("Hello, world!"));
Run(Args, targets);
Note: Debugging also works for script packages, so we can easily step into the scripts that are brought in using the #load directive.
Scripts don't actually have to exist locally on the machine. We can also execute scripts that are made available on an http(s) endpoint.
This means that we can create a Gist on GitHub and execute it just by providing the URL to the Gist.
This Gist contains a script that prints out "Hello World". We can execute the script like this:
dotnet script https://gist.githubusercontent.com/seesharper/5d6859509ea8364a1fdf66bbf5b7923d/raw/0a32bac2c3ea807f9379a38e251d93e39c8131cb/HelloWorld.csx
That is a pretty long URL, so why not make it a TinyURL, like this:
dotnet script https://tinyurl.com/y8cda9zt
A pretty common scenario is that we have logic that is relative to the script path. We don't want to require the user to be in a certain directory for these paths to resolve correctly so here is how to provide the script path and the script folder regardless of the current working directory.
using System.Runtime.CompilerServices;

// [CallerFilePath] captures the full path of the script file that calls the method.
public static string GetScriptPath([CallerFilePath] string path = null) => path;
public static string GetScriptFolder([CallerFilePath] string path = null) => Path.GetDirectoryName(path);
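A quick usage sketch (the data file name here is hypothetical):

```csharp
// Resolve a file that lives next to the script, regardless of the working directory.
var dataFile = Path.Combine(GetScriptFolder(), "data.txt");
Console.WriteLine(File.ReadAllText(dataFile));
```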
Tip: Put these methods as top-level methods in a separate script file and #load that file wherever access to the script path and/or folder is needed.
This release contains a C# REPL (Read-Evaluate-Print Loop). The REPL mode ("interactive mode") is started by executing dotnet-script without any arguments.
The interactive mode allows you to supply individual C# code blocks and have them executed as soon as you press Enter. The REPL is configured with the same default set of assembly references and using statements as regular CSX script execution.
Once dotnet-script starts, you will see a prompt for input and can start typing C# code there.
~$ dotnet script
> var x = 1;
> x+x
2
If you submit an unterminated expression into the REPL (no ; at the end), it will be evaluated and the result will be serialized using a formatter and printed in the output. This is a bit more interesting than just calling ToString() on the object, because it attempts to capture the actual structure of the object. For example:
~$ dotnet script
> var x = new List<string>();
> x.Add("foo");
> x
List<string>(1) { "foo" }
> x.Add("bar");
> x
List<string>(2) { "foo", "bar" }
>
The REPL also supports inline NuGet packages, meaning that NuGet packages can be installed into the REPL from within the REPL. This is done via our #r and #load from NuGet support and uses identical syntax.
~$ dotnet script
> #r "nuget: Automapper, 6.1.1"
> using AutoMapper;
> typeof(MapperConfiguration)
[AutoMapper.MapperConfiguration]
> #load "nuget: simple-targets-csx, 6.0.0";
> using static SimpleTargets;
> typeof(TargetDictionary)
[Submission#0+SimpleTargets+TargetDictionary]
Using Roslyn syntax parsing, we also support a multiline REPL mode. This means that if you have an uncompleted code block and press Enter, we will automatically enter multiline mode, indicated by the * character. This is particularly useful for declaring classes and other more complex constructs.
~$ dotnet script
> class Foo {
* public string Bar {get; set;}
* }
> var foo = new Foo();
Aside from the regular C# script code, you can invoke the following commands (directives) from within the REPL:
| Command | Description |
|---|---|
| #load | Load a script into the REPL (same as #load usage in CSX) |
| #r | Load an assembly into the REPL (same as #r usage in CSX) |
| #reset | Reset the REPL back to initial state (without restarting it) |
| #cls | Clear the console screen without resetting the REPL state |
| #exit | Exits the REPL |
You can execute a CSX script and, at the end of it, drop yourself into the context of the REPL. This way, the REPL becomes "seeded" with your code - all the classes, methods or variables are available in the REPL context. This is achieved by running a script with the -i flag.
For example, given the following CSX script:
var msg = "Hello World";
Console.WriteLine(msg);
When you run this with the -i flag, Hello World is printed, the REPL starts, and the msg variable is available in the REPL context.
~$ dotnet script foo.csx -i
Hello World
>
You can also seed the REPL from inside the REPL - at any point - by invoking a #load directive pointed at a specific file. For example:
~$ dotnet script
> #load "foo.csx"
Hello World
>
The following example shows how we can pipe data in and out of a script.
The UpperCase.csx script simply converts the standard input to upper case and writes it back out to standard output.
using (var streamReader = new StreamReader(Console.OpenStandardInput()))
{
Write(streamReader.ReadToEnd().ToUpper());
}
We can now simply pipe the output from one command into our script like this.
echo "This is some text" | dotnet script UpperCase.csx
THIS IS SOME TEXT
The first thing we need to do is add the following to the launch.json file, which allows VS Code to debug a running process.
{
"name": ".NET Core Attach",
"type": "coreclr",
"request": "attach",
"processId": "${command:pickProcess}"
}
To debug this script we need a way to attach the debugger in VS Code and the simplest thing we can do here is to wait for the debugger to attach by adding this method somewhere.
public static void WaitForDebugger()
{
    Console.WriteLine("Attach Debugger (VS Code)");
    // Spin until a debugger attaches to the process.
    while (!Debugger.IsAttached)
    {
    }
}
To debug the script when executing it from the command line, we can do something like this:
WaitForDebugger();
using (var streamReader = new StreamReader(Console.OpenStandardInput()))
{
Write(streamReader.ReadToEnd().ToUpper()); // <- SET BREAKPOINT HERE
}
Now when we run the script from the command line, we will get:
$ echo "This is some text" | dotnet script UpperCase.csx
Attach Debugger (VS Code)
This gives us a chance to attach the debugger before stepping into the script: from VS Code, select the .NET Core Attach debugger and pick the process that represents the executing script.
Once that is done we should see our breakpoint being hit.
By default, scripts are compiled using the debug configuration. This is to ensure that we can debug a script in VS Code as well as attach a debugger to long-running scripts.
There are, however, situations where we might need to execute a script that is compiled with the release configuration. For instance, running benchmarks using BenchmarkDotNet is not possible unless the script is compiled with the release configuration.
We can specify this when executing the script.
dotnet script foo.csx -c release
Starting from version 0.50.0, dotnet-script supports .NET Core 3.0 and all the C# 8 features. The way we deal with nullable reference types in dotnet-script is that we turn every warning related to nullable reference types into a compiler error. This means every warning between CS8600 and CS8655 is treated as an error when compiling the script.
Nullable reference types are turned off by default, and we enable them using the #nullable enable compiler directive. This means that existing scripts will continue to work, but we can now opt in to this new feature.
#!/usr/bin/env dotnet-script
#nullable enable
string name = null;
Trying to execute the script will result in the following error:
main.csx(5,15): error CS8625: Cannot convert null literal to non-nullable reference type.
We will also see this when working with scripts in VS Code under the problems panel.
Download Details:
Author: filipw
Source Code: https://github.com/filipw/dotnet-script
License: MIT License
Hello Guys 🖐🖐🖐🖐
In this video I'm going to show how to convert a SwiftUI view to a UIKit view in just three simple steps, using SwiftUI's UIHostingController for UIKit integration (Xcode 12).
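In outline, the conversion relies on UIKit's UIHostingController; a minimal sketch (my illustration, not necessarily the video's exact code) looks like this:

```swift
import SwiftUI
import UIKit

// Step 1: the SwiftUI view you want to show from UIKit.
struct ContentView: View {
    var body: some View {
        Text("Hello from SwiftUI")
    }
}

final class SomeViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        // Step 2: wrap the SwiftUI view in a UIHostingController.
        let hosting = UIHostingController(rootView: ContentView())
        // Step 3: embed it like any other child view controller.
        addChild(hosting)
        hosting.view.frame = view.bounds
        view.addSubview(hosting.view)
        hosting.didMove(toParent: self)
    }
}
```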
► My MacBook Specs
M1 MacBook Pro(16GB)
Xcode Version: 12.5
macOS Version: 11.3 Big Sur
#swiftui view #uikit view #swiftui #uikit
In this article, we go through a step-by-step Machine Learning tutorial for beginners. It covers both the basics and intermediate concepts of machine learning and is designed for students and working professionals who are complete beginners. By the end of this tutorial, you will be able to build machine learning models that can perform complex tasks such as predicting the price of a house or recognizing the species of an iris from the dimensions of its petals and sepals. If you are not a complete beginner and are a bit familiar with Machine Learning, I would suggest starting with subtopic eight, i.e., Types of Machine Learning.
Before jumping into the tutorial, you should be familiar with Pandas and NumPy. This is important to understand the implementation part. There are no prerequisites for understanding the theory. Here are the subtopics that we are going to discuss in this tutorial:
Arthur Samuel coined the term Machine Learning in the year 1959. He was a pioneer in Artificial Intelligence and computer gaming, and defined Machine Learning as “Field of study that gives computers the capability to learn without being explicitly programmed”.
In simple terms, Machine Learning is an application of Artificial Intelligence (AI) which enables a program (software) to learn from experience and improve at a task without being explicitly programmed. For example, how would you write a program that can identify fruits based on their various properties, such as colour, shape, size, or any other property?
One approach is to hardcode everything: make some rules and use them to identify the fruits. This may seem like the only way and it can work, but one can never write perfect rules that apply to all cases. This problem can be solved much more easily using machine learning, without any hand-written rules, which makes the solution more robust and practical. You will see how we use machine learning for this task in the coming sections.
Thus, we can say that Machine Learning is the study of making machines more human-like in their behaviour and decision making by giving them the ability to learn with minimum human intervention, i.e., no explicit programming. Now the question arises, how can a program attain any experience and from where does it learn? The answer is data. Data is also called the fuel for Machine Learning and we can safely say that there is no machine learning without data.
You may be wondering: if the term Machine Learning was introduced back in 1959, why was there so little mention of it until recent years? Note that Machine Learning needs huge computational power, a lot of data, and devices capable of storing such vast data. We have only recently reached a point where we have all these requirements and can practice Machine Learning.
Are you wondering how is Machine Learning different from traditional programming? Well, in traditional programming, we would feed the input data and a well written and tested program into a machine to generate output. When it comes to machine learning, input data along with the output associated with the data is fed into the machine during the learning phase, and it works out a program for itself.
Machine Learning today has all the attention it needs. Machine Learning can automate many tasks, especially the ones that only humans can perform with their innate intelligence. Replicating this intelligence to machines can be achieved only with the help of machine learning.
With the help of Machine Learning, businesses can automate routine tasks. It also helps in automating and quickly creating models for data analysis. Various industries depend on vast quantities of data to optimize their operations and make intelligent decisions. Machine Learning helps in creating models that can process and analyze large amounts of complex data to deliver accurate results. These models are precise and scalable and function with less turnaround time. By building such precise Machine Learning models, businesses can leverage profitable opportunities and avoid unknown risks.
Image recognition, text generation, and many other use cases are finding applications in the real world. This is increasing the scope for machine learning experts to shine as sought-after professionals.
A machine learning model learns from the historical data fed to it and then builds prediction algorithms to predict the output for a new set of data that comes in as input to the system. The accuracy of these models depends on the quality and amount of input data. A large amount of data helps build a better model that predicts the output more accurately.
Suppose we have a complex problem at hand that requires making some predictions. Instead of writing code by hand, we can feed the given data to generic machine learning algorithms, and with their help, the machine develops its logic and predicts the output. Machine learning has transformed the way we approach business and social problems, and our way of thinking about them.
Nowadays, we can see some amazing applications of ML such as in self-driving cars, Natural Language Processing and many more. But Machine learning has been here for over 70 years now. It all started in 1943, when neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper about neurons, and how they work. They decided to create a model of this using an electrical circuit, and therefore, the neural network was born.
In 1950, Alan Turing created the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human. In 1952, Arthur Samuel wrote the first computer learning program. The program was the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program.
Just after a few years, in 1957, Frank Rosenblatt designed the first neural network for computers (the perceptron), which simulates the thought processes of the human brain. Later, in 1967, the “nearest neighbor” algorithm was written, allowing computers to begin using very basic pattern recognition. This could be used to map a route for travelling salesmen, starting at a random city but ensuring they visit all cities during a short tour.
But we can say that in the 1990s we saw a big change. Now work on machine learning shifted from a knowledge-driven approach to a data-driven approach. Scientists began to create programs for computers to analyze large amounts of data and draw conclusions or “learn” from the results.
In 1997, IBM's Deep Blue became the first computer chess-playing system to beat a reigning world chess champion. Deep Blue used the computing power of the 1990s to perform large-scale searches of potential moves and select the best move. Nearly a decade later, in 2006, Geoffrey Hinton coined the term "deep learning" to explain new algorithms that help computers distinguish objects and text in images and videos.
The year 2012 saw the publication of an influential research paper by Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever, describing a model that can dramatically reduce the error rate in image recognition systems. Meanwhile, Google’s X Lab developed a machine learning algorithm capable of autonomously browsing YouTube videos to identify the videos that contain cats. In 2016 AlphaGo (created by researchers at Google DeepMind to play the ancient Chinese game of Go) won four out of five matches against Lee Sedol, who has been the world’s top Go player for over a decade.
And now in 2020, OpenAI released GPT-3 which is the most powerful language model ever. It can write creative fiction, generate functioning code, compose thoughtful business memos and much more. Its possible use cases are limited only by our imaginations.
1. Automation: Nowadays, your Gmail account has a spam folder that contains all the spam emails. You might be wondering how Gmail knows that all these emails are spam. This is the work of Machine Learning: it recognizes spam emails, making it easy to automate this process. The ability to automate repetitive tasks is one of the biggest characteristics of machine learning. A huge number of organizations are already using machine-learning-powered paperwork and email automation. In the financial sector, for example, a huge number of repetitive, data-heavy and predictable tasks need to be performed. Because of this, the sector uses different types of machine learning solutions to a great extent.
2. Improved customer experience: For any business, one of the most crucial ways to drive engagement, promote brand loyalty, and establish long-lasting customer relationships is by providing a customized experience and better services. Machine Learning helps us achieve both. Have you ever noticed that whenever you open a shopping site or see ads on the internet, they are mostly about something you recently searched for? This is because machine learning has enabled us to build accurate recommendation systems that customize the user experience. As for service, most companies nowadays have a chatbot available 24×7. An example is Eva from AirAsia airlines. These bots provide intelligent answers, and sometimes you might not even notice that you are having a conversation with a bot. These bots use Machine Learning, which helps them provide a good user experience.
3. Automated data visualization: In the past, we have seen a huge amount of data being generated by companies and individuals. Take an example of companies like Google, Twitter, Facebook. How much data are they generating per day? We can use this data and visualize the notable relationships, thus giving businesses the ability to make better decisions that can actually benefit both companies as well as customers. With the help of user-friendly automated data visualization platforms such as AutoViz, businesses can obtain a wealth of new insights in an effort to increase productivity in their processes.
4. Business intelligence: Machine learning characteristics, when merged with big data analytics can help companies to find solutions to the problems that can help the businesses to grow and generate more profit. From retail to financial services to healthcare, and many more, ML has already become one of the most effective technologies to boost business operations.
Python provides flexibility in choosing between object-oriented programming or scripting. There is also no need to recompile the code; developers can implement any changes and instantly see the results. You can use Python along with other languages to achieve the desired functionality and results.
Python is a versatile programming language and can run on any platform, including Windows, macOS, Linux, Unix, and others. While migrating from one platform to another, the code needs only minor adaptations and changes, and it is ready to work on the new platform. To build a strong foundation and cover basic concepts, you can enroll in a Python machine learning course that will help you power ahead in your career.
Here is a summary of the benefits of using Python for Machine Learning problems:
Machine learning is broadly divided into three categories, discussed below: supervised learning, unsupervised learning, and reinforcement learning.
Let us start with an easy example, say you are teaching a kid to differentiate dogs from cats. How would you do it?
You may show him/her a dog and say “here is a dog” and when you encounter a cat you would point it out as a cat. When you show the kid enough dogs and cats, he may learn to differentiate between them. If he is trained well, he may be able to recognize different breeds of dogs which he hasn’t even seen.
Similarly, in Supervised Learning, we have two sets of variables. One is called the target variable, or label (the variable we want to predict), and the other is called the features (variables that help us predict the target variable). We show the program (model) the features and the label associated with them, and the program finds the underlying pattern in the data. Take this example of a dataset where we want to predict the price of a house given its size. The price, which is the target variable, depends upon the size, which is a feature.
| Number of rooms | Price |
|---|---|
| 1 | $100 |
| 3 | $300 |
| 5 | $500 |
In a real dataset, we will have many more rows and more than one feature, such as size, location, number of floors, and many more.
Thus, we can say that the supervised learning model has a set of input variables (x), and an output variable (y). An algorithm identifies the mapping function between the input and output variables. The relationship is y = f(x).
The learning is monitored or supervised in the sense that we already know the output, and the algorithm is corrected each time to optimize its results. The algorithm is trained over the data set and amended until it achieves an acceptable level of performance.
We can group the supervised learning problems as:
Regression problems – Used to predict future values and the model is trained with the historical data. E.g., Predicting the future price of a house.
Classification problems – Various labels train the algorithm to identify items within a specific category. E.g., Dog or cat( as mentioned in the above example), Apple or an orange, Beer or wine or water.
In this approach, we have no target variable; we have only the input variables (features) at hand. The algorithm learns by itself and discovers the underlying structure in the data.
The goal is to decipher the underlying distribution in the data to gain more knowledge about the data.
We can group the unsupervised learning problems as:
Clustering: This means bundling data points with similar characteristics together. E.g., grouping users based on search history (see the sketch after this list).
Association: Here, we discover the rules that govern meaningful associations among the data set. E.g., People who watch ‘X’ will also watch ‘Y’.
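To make the clustering idea concrete, here is a tiny sketch using scikit-learn (the data is made up for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy features for six users (e.g., two numeric traits derived from search history).
users = np.array([[1, 2], [1, 4], [1, 0],
                  [10, 2], [10, 4], [10, 0]])

# Group the users into two clusters of similar characteristics.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(users)
print(kmeans.labels_)  # cluster assignment per user, e.g. [1 1 1 0 0 0]
```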
In this approach, machine learning models are trained to make a series of decisions based on the rewards and feedback they receive for their actions. The machine learns to achieve a goal in complex and uncertain situations and is rewarded each time it achieves it during the learning period.
Reinforcement learning is different from supervised learning in the sense that there is no answer available, so the reinforcement agent decides the steps to perform a task. The machine learns from its own experiences when there is no training data set present.
In this tutorial, we are going to mainly focus on Supervised Learning and Unsupervised learning as these are quite easy to understand and implement.
This may be the most time-consuming and difficult process in your journey of Machine Learning. There are many algorithms in Machine Learning and you don’t need to know them all in order to get started. But I would suggest, once you start practising Machine Learning, start learning about the most popular algorithms out there such as:
Here, I am going to give a brief overview of one of the simplest algorithms in Machine learning, the K-nearest neighbor Algorithm (which is a Supervised learning algorithm) and show how we can use it for Regression as well as for classification. I would highly recommend checking the Linear Regression and Logistic Regression as we are going to implement them and compare the results with KNN(K-nearest neighbor) algorithm in the implementation part.
You may want to note that there are usually separate algorithms for regression problems and classification problems. But by modifying an algorithm, we can use it for both classification and regression, as you will see below.
KNN belongs to a group of lazy learners. As opposed to eager learners such as logistic regression, SVM, neural nets, lazy learners just store the training data in memory. During the training phase, KNN arranges the data (sort of indexing process) in order to find the closest neighbours efficiently during the inference phase. Otherwise, it would have to compare each new case during inference with the whole dataset making it quite inefficient.
If you are wondering what the training phase, eager learners, and lazy learners are: for now, just remember that the training phase is when an algorithm learns from the data provided to it. For example, if you have gone through the Linear Regression algorithm linked above, during the training phase the algorithm tries to find the best-fit line, a process that includes a lot of computation and hence takes a lot of time; this type of algorithm is called an eager learner. On the other hand, lazy learners like KNN do not involve much computation and hence train faster.
Now let us see how we can use K-NN for classification. Here is a hypothetical dataset that tries to predict whether a person is male or female (label) on the basis of height and weight (features).
| Height (cm) (feature) | Weight (kg) (feature) | Gender (label) |
|---|---|---|
| 187 | 80 | Male |
| 165 | 50 | Female |
| 199 | 99 | Male |
| 145 | 70 | Female |
| 180 | 87 | Male |
| 178 | 65 | Female |
| 187 | 60 | Male |
Now let us plot these points:
Now we have a new point that we want to classify, given that its height is 190 cm and weight is 100 Kg. Here is how K-NN will classify this point:
Now let us apply this algorithm to our own dataset. Let us first plot the new data point.
Now let us take k=3 i.e, we will see the three closest points to the new point:
Therefore, it is classified as Male:
Now let us take the value of k=5 and see what happens:
As we can see, four of the points closest to our new data point are males and just one is female, so we go with the majority and classify it as Male again. When doing binary classification, always select an odd value of K to avoid ties.
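We can sanity-check this example with scikit-learn's KNeighborsClassifier (a quick illustrative script):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# The height/weight dataset from the table above.
X = np.array([[187, 80], [165, 50], [199, 99], [145, 70],
              [180, 87], [178, 65], [187, 60]])
y = np.array(["Male", "Female", "Male", "Female", "Male", "Female", "Male"])

for k in (3, 5):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(k, clf.predict([[190, 100]]))  # 'Male' for both k=3 and k=5
```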
We have seen how we can use K-NN for classification. Now let us see what changes are needed to use it for regression. The algorithm is almost the same; there is just one difference. In classification, we took the majority of all the nearest points; here, we take the average of all the nearest points and use that as the predicted value. Let us again take the same example, but now we have to predict the weight (label) of a person given their height (feature).
| Height (cm) (feature) | Weight (kg) (label) |
|---|---|
| 187 | 80 |
| 165 | 50 |
| 199 | 99 |
| 145 | 70 |
| 180 | 87 |
| 178 | 65 |
| 187 | 60 |
Now we have a new data point with a height of 160 cm; we will predict its weight by taking the values of K as 1, 2 and 4.
When K=1: The closest point to 160cm in our data is 165cm which has a weight of 50, so we conclude that the predicted weight is 50 itself.
When K=2: The two closest points are 165 and 145 which have weights equal to 50 and 70 respectively. Taking average we say that the predicted weight is (50+70)/2=60.
When K=4: Repeating the same process with the four closest points (165, 178, 180 and 145 cm, with weights 50, 65, 87 and 70), we get (50+65+87+70)/4 = 68 as the predicted weight.
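The same numbers fall out of scikit-learn's KNeighborsRegressor, which averages the K nearest weights:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Heights (feature) and weights (label) from the table above.
heights = np.array([[187], [165], [199], [145], [180], [178], [187]])
weights = np.array([80, 50, 99, 70, 87, 65, 60])

for k in (1, 2, 4):
    model = KNeighborsRegressor(n_neighbors=k).fit(heights, weights)
    print(k, model.predict([[160]]))  # K=1 -> 50.0, K=2 -> 60.0, K=4 -> 68.0
```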
You might be thinking that this is really simple and there is nothing special about Machine Learning; it is just basic mathematics. But remember, this is the simplest algorithm, and you will see much more complex algorithms once you move ahead in this journey.
At this stage, you must have a vague idea of how machine learning works; don't worry if you are still confused. Also, if you want to go a bit deeper now, here is an excellent article - Gradient Descent in Machine Learning - which discusses how we use an optimization technique called gradient descent to find the best-fit line in linear regression.
There are plenty of machine learning algorithms and it could be a tough task to decide which algorithm to choose for a specific application. The choice of the algorithm will depend on the objective of the problem you are trying to solve.
Let us take an example task: predicting the type of fruit among three varieties, i.e., apple, banana, and orange, based on the colour of the fruit. The picture depicts the results of ten different algorithms. The picture on the top left shows the dataset. The data is classified into three categories: red, light blue and dark blue. There are some groupings. For instance, in the second image, everything in the upper left belongs to the red category; in the middle part there is a mixture of uncertainty and light blue, while the bottom corresponds to the dark category. The other images show different algorithms and how they try to classify the data.
I wish Machine Learning were just applying algorithms to your data and getting back predicted values, but it is not that simple. There are several steps in Machine Learning which are a must for every project.
For evaluating the model, we hold out a portion of data called test data and do not use this data to train the model. Later, we use test data to evaluate various metrics.
The results of predictive models can be viewed in various forms, such as a confusion matrix, root-mean-squared error (RMSE), AUC-ROC, etc.
TP (True Positive) is the number of values predicted to be positive by the algorithm that were actually positive in the dataset. TN (True Negative) represents the number of values predicted not to belong to the positive class that actually do not belong to it. FP (False Positive) depicts the number of instances misclassified as belonging to the positive class that are actually part of the negative class. FN (False Negative) shows the number of instances classified as the negative class that actually belong to the positive class.
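A small sketch with scikit-learn shows how these four counts are read off a confusion matrix (the labels here are made up):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0]  # actual classes
y_pred = [1, 0, 0, 1, 0, 1]  # predicted classes

# For binary labels [0, 1], ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, tn, fp, fn)  # 2 2 1 1
```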
Now, in regression problems, we usually use RMSE as the evaluation metric. In this evaluation technique, we use the error term.
Let's say you feed a model some input X and the model predicts 10, but the actual value is 5. This difference between your prediction (10) and the actual observation (5) is the error term (prediction - actual). The formula to calculate RMSE is:
RMSE = sqrt( (1/N) * sum over i of (prediction_i - actual_i)^2 )
where N is the total number of samples for which we are calculating RMSE.
In a good model, the RMSE should be as low as possible and there should not be much difference between RMSE calculated over training data and RMSE calculated over the testing set.
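To compute RMSE in practice, a couple of lines of NumPy suffice (toy numbers for illustration):

```python
import numpy as np

predictions = np.array([10.0, 7.0, 3.0])
actuals = np.array([5.0, 7.0, 4.0])

# Root of the mean of the squared error terms.
rmse = np.sqrt(np.mean((predictions - actuals) ** 2))
print(rmse)  # ~2.94
```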
Although there are many languages that can be used for machine learning, in my opinion Python is hands down the best programming language for Machine Learning applications, for the various reasons mentioned in the section below. Other programming languages that could be used for Machine Learning applications are R, C++, JavaScript, Java, C#, Julia, Shell, TypeScript, and Scala. R is also a really good language to get started with machine learning.
Python is famous for its readability and relatively lower complexity compared to other programming languages. Machine Learning applications involve complex concepts like calculus and linear algebra which take a lot of effort and time to implement. Python helps reduce this burden with quick implementation, letting the Machine Learning engineer validate an idea. You can check out the Python Tutorial to get a basic understanding of the language. Another benefit of using Python for Machine Learning is the pre-built libraries. There are different packages for different types of applications, as mentioned below:
Before moving on to the implementation part, you need to download some important software and libraries. Anaconda is an open-source distribution that makes it easy to perform Python/R data science and machine learning on a single machine. It contains almost all the libraries we need. In this tutorial, we are mostly going to use the scikit-learn library, a free machine learning library for the Python programming language.
Now, we are going to implement all that we learnt till now. We will solve a Regression problem and then a Classification problem using the seven steps mentioned above.
Implementation of a Regression problem
We have a problem of predicting the prices of the house given some features such as size, number of rooms and many more. So let us get started:
The dataset we are using is called the Boston Housing dataset. Each record in the database describes a Boston suburb or town. The data was drawn from the Boston Standard Metropolitan Statistical Area (SMSA) in 1970. The attributes are defined as follows (taken from the UCI Machine Learning Repository).
Here is a link to download this dataset.
Now, after opening the file, you can see the data about house sales. This dataset is not in proper tabular form; in fact, there are no column names, and each value is separated by spaces. We are going to use Pandas to put it into proper tabular form. We will provide it with a list containing column names and use the delimiter '\s+', which means that entries are separated by one or more spaces.
1. Collect the data: We are going to import all the necessary libraries, such as Pandas and NumPy, and then import the data file into a pandas DataFrame.
import numpy as np
import pandas as pd
column_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX','PTRATIO', 'B', 'LSTAT', 'MEDV']
bos1 = pd.read_csv('housing.csv', delimiter=r"\s+", names=column_names)
2. Preprocess the data: The next step is to pre-process the data. For this dataset, we can see that there are no NaN (missing) values, and all the data is numeric rather than strings, so we won't face any errors when training the model. Let us just divide our data into training data and testing data, such that 70% of the data is training data and the rest is testing data. We could also scale our data to make the predictions more accurate, but for now let us keep it simple.
bos1.isna().sum()
from sklearn.model_selection import train_test_split
X=np.array(bos1.iloc[:,0:13])
Y=np.array(bos1["MEDV"])
#testing data size is of 30% of entire data
x_train, x_test, y_train, y_test =train_test_split(X,Y, test_size = 0.30, random_state =5)
3. Choose a model: For this particular problem, we are going to use two supervised learning algorithms that can solve regression problems and later compare their results. One algorithm is K-NN (K-Nearest Neighbors), which is explained above, and the other is Linear Regression. I would highly recommend checking it out in case you haven't already.
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
#load our first model
lr = LinearRegression()
#train the model on training data
lr.fit(x_train,y_train)
#predict the testing data so that we can later evaluate the model
pred_lr = lr.predict(x_test)
#load the second model
Nn=KNeighborsRegressor(3)
Nn.fit(x_train,y_train)
pred_Nn = Nn.predict(x_test)
4. Hyperparameter tuning: Since this is a beginner's tutorial, I am only going to tune the value of K in the K-NN model. I will just use a for loop and check the results for k ranging from 1 to 50. K-NN is extremely fast on a small dataset like ours, so it won't take much time. There are much more advanced ways of doing this, which you can find linked in the steps of Machine Learning section above.
import sklearn.metrics
for i in range(1, 50):
    model = KNeighborsRegressor(i)
    model.fit(x_train, y_train)
    pred_y = model.predict(x_test)
    mse = sklearn.metrics.mean_squared_error(y_test, pred_y, squared=False)
    print("{} error for k = {}".format(mse, i))
Output:
From the output, we can see that the error is least for k=3, which justifies setting K=3 while training the model.
5. Evaluating the model: For evaluating the model, we are going to use the mean_squared_error() method from the scikit-learn library. Remember to set the parameter squared to False to get the RMSE.
#error for linear regression
mse_lr= sklearn.metrics.mean_squared_error(y_test, pred_lr,squared=False)
print("error for Linear Regression = {}".format(mse_lr))
#error for K-NN
mse_Nn= sklearn.metrics.mean_squared_error(y_test, pred_Nn,squared=False)
print("error for K-NN = {}".format(mse_Nn))
Now, from the results, we can conclude that Linear Regression performs better than K-NN for this particular dataset. But it is not necessarily the case that Linear Regression always performs better than K-NN; it completely depends upon the data we are working with.
6. Prediction: Now we can use the models to predict the prices of houses using the predict function, as we did above. When predicting prices, make sure we supply all the features that were present when training the model.
Here is the whole script:
import numpy as np
import pandas as pd
import sklearn.metrics
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
column_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
bos1 = pd.read_csv('housing.csv', delimiter=r"\s+", names=column_names)
X=np.array(bos1.iloc[:,0:13])
Y=np.array(bos1["MEDV"])
#testing data size is of 30% of entire data
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.30, random_state=5)
#load our first model
lr = LinearRegression()
#train the model on training data
lr.fit(x_train,y_train)
#predict the testing data so that we can later evaluate the model
pred_lr = lr.predict(x_test)
#load the second model
Nn = KNeighborsRegressor(3)
Nn.fit(x_train,y_train)
pred_Nn = Nn.predict(x_test)
#error for linear regression
mse_lr= sklearn.metrics.mean_squared_error(y_test, pred_lr,squared=False)
print("error for Linear Regression = {}".format(mse_lr))
#error for K-NN
mse_Nn= sklearn.metrics.mean_squared_error(y_test, pred_Nn,squared=False)
print("error for K-NN = {}".format(mse_Nn))
In this section, we will solve a popular classification problem known as the Iris classification problem. The Iris dataset was used in R.A. Fisher's classic 1936 paper, The Use of Multiple Measurements in Taxonomic Problems, and can also be found in the UCI Machine Learning Repository.
It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other. The columns in this dataset are:
Different species of iris
We don’t need to download this dataset as scikit-learn library already contains this dataset and we can simply import it from there. So let us start coding this up:
from sklearn.datasets import load_iris
iris = load_iris()
X=iris.data
Y=iris.target
print(X)
print(Y)
As we can see, the features come as a list of four items per sample, and at the bottom we get a list containing the labels, which have been transformed into numbers; the model cannot understand names that are strings, so we encode each name as a number. This has already been done by the scikit-learn developers.
from sklearn.model_selection import train_test_split
#testing data size is of 30% of entire data
x_train, x_test, y_train, y_test =train_test_split(X,Y, test_size = 0.3, random_state =5)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
#fitting our model to train and test
Nn = KNeighborsClassifier(8)
Nn.fit(x_train,y_train)
#the score() method calculates the accuracy of model.
print("Accuracy for K-NN is ",Nn.score(x_test,y_test))
Lr = LogisticRegression()
Lr.fit(x_train,y_train)
print("Accuracy for Logistic Regression is ",Lr.score(x_test,y_test))
1. Easily identifies trends and patterns
Machine Learning can review large volumes of data and discover specific trends and patterns that would not be apparent to humans. For instance, for e-commerce websites like Amazon and Flipkart, it serves to understand the browsing behaviour and purchase histories of users in order to cater to them with the right products, deals, and reminders, and it uses the results to show them relevant advertisements.
2. Continuous Improvement
We are continuously generating new data, and when we provide this data to the Machine Learning model, it can upgrade over time and improve its performance and accuracy. We can say it is like gaining experience: models keep improving in accuracy and efficiency, which lets them make better decisions.
3. Handling multidimensional and multi-variety data
Machine Learning algorithms are good at handling data that are multidimensional and multi-variety, and they can do this in dynamic or uncertain environments.
4. Wide Applications
You could be an e-tailer or a healthcare provider and make Machine Learning work for you. Where it does apply, it holds the capability to help deliver a much more personal experience to customers while also targeting the right customers.
1. Data Acquisition
Machine Learning requires a massive amount of data sets to train on, and these should be inclusive/unbiased, and of good quality. There can also be times where we must wait for new data to be generated.
2. Time and Resources
Machine Learning needs enough time to let the algorithms learn and develop enough to fulfill their purpose with a considerable amount of accuracy and relevancy. It also needs massive resources to function. This can mean additional requirements of computer power for you.
3. Interpretation of Results
Another major challenge is the ability to accurately interpret the results generated by the algorithms. You must also carefully choose the algorithms for your purpose. Sometimes, based on your analysis, you might select an algorithm, yet it may still not be the best model for the problem.
4. High error-susceptibility
Machine Learning is autonomous but highly susceptible to errors. Suppose you train an algorithm with data sets small enough to not be inclusive. You end up with biased predictions coming from a biased training set. This leads to irrelevant advertisements being displayed to customers. In the case of Machine Learning, such blunders can set off a chain of errors that can go undetected for long periods of time. And when they do get noticed, it takes quite some time to recognize the source of the issue, and even longer to correct it.
Machine Learning can be a competitive advantage to any company, be it a top MNC or a startup, as things that are currently done manually will be done tomorrow by machines. With the introduction of projects such as self-driving cars and Sophia (a humanoid robot developed by Hong Kong-based company Hanson Robotics), we have already been given a glimpse of what the future can be. The Machine Learning revolution will stay with us for long, and so will the future of Machine Learning.
How do I start learning Machine Learning?
You first need to start with the basics. You need to understand the prerequisites, which include learning Linear Algebra and Multivariate Calculus, Statistics, and Python. Then you need to learn several ML concepts, which include terminology of Machine Learning, types of Machine Learning, and Resources of Machine Learning. The third step is taking part in competitions. You can also take up a free online statistics for machine learning course and understand the foundational concepts.
Is Machine Learning easy for beginners?
Machine Learning is not the easiest subject to learn. One of the main difficulties is debugging models when they do not behave as expected. However, if you study the right resources, you will be able to learn Machine Learning without any hassles.
What is a simple example of Machine Learning?
Recommendation Engines (Netflix); Sorting, tagging and categorizing photos (Yelp); Customer Lifetime Value (Asos); Self-Driving Cars (Waymo); Education (Duolingo); Determining Credit Worthiness (Deserve); Patient Sickness Predictions (KenSci); and Targeted Emails (Optimail).
Can I learn Machine Learning in 3 months?
Machine Learning is vast and consists of several things. Therefore, it will take you around six months to learn it, provided you spend at least 5-6 hours every day. Also, the time taken to learn Machine Learning depends a lot on your mathematical and analytical skills.
Does Machine Learning require coding?
If you are learning traditional Machine Learning, it would require you to know software programming as it will help you to write machine learning algorithms. However, through some online educational platforms, you do not need to know coding to learn Machine Learning.
Is Machine Learning a good career?
Machine Learning is one of the best careers at present. Whether it is for the current demand, job, and salary growth, Machine Learning Engineer is one of the best profiles. You need to be very good at data, automation, and algorithms.
Can I learn Machine Learning without Python?
To learn Machine Learning, you need to have some basic knowledge of Python. Anaconda is a Python distribution supported by all major operating systems, such as Windows and Linux. It offers an overall package for machine learning, including matplotlib, scikit-learn, and NumPy.
Where can I practice Machine Learning?
The online platforms where you can practice Machine Learning include CloudXLab, Google Colab, Kaggle, MachineHack, and OpenML.
Where can I learn Machine Learning for free?
You can learn the basics of Machine Learning from online platforms like Great Learning. You can enroll in the Beginners Machine Learning course and get the certificate for free. The course is easy and perfect for beginners to start with.
Original article source at: https://www.mygreatlearning.com
1646698200
What is face recognition? Or, what is recognition? When you look at an apple, your mind immediately tells you that this is an apple. This process of your mind telling you that this is an apple is, in simple words, recognition. So what is face recognition then? I am sure you have guessed it right. When you look at your friend walking down the street, or at a picture of him, you recognize that he is your friend Paulo. Interestingly, when you look at your friend or a picture of him, you look at his face first, before looking at anything else. Ever wondered why you do that? This is so that you can recognize him by looking at his face. Well, this is you doing face recognition.
But the real question is, how does face recognition work? It is quite simple and intuitive. Take a real-life example: when you meet someone for the first time, you don't recognize him, right? While he talks or shakes hands with you, you look at his face, eyes, nose, mouth, color and overall look. This is your mind learning, or training, for face recognition of that person by gathering face data. Then he tells you that his name is Paulo. At this point your mind knows that the face data it just learned belongs to Paulo. Now your mind is trained and ready to do face recognition on Paulo's face. The next time you see Paulo, or his face in a picture, you will immediately recognize him. This is how face recognition works. The more you meet Paulo, the more data your mind will collect about him, especially his face, and the better you will become at recognizing him.
Now the next question is, how do we code face recognition with OpenCV? After all, this is the only reason why you are reading this article, right? OK then. You might say that our mind can do these things easily, but coding them into a computer must be difficult. Don't worry, it is not. Thanks to OpenCV, coding face recognition is almost as easy as it feels. The coding steps for face recognition are the same as the real-life example we discussed above.
OpenCV comes equipped with a built-in face recognizer; all you have to do is feed it the face data. It's that simple, and this is how it will look once we are done coding it.
OpenCV has three built-in face recognizers and, thanks to OpenCV's clean coding, you can use any of them by changing just a single line of code. Below are the names of those face recognizers and their OpenCV calls.
cv2.face.createEigenFaceRecognizer()
cv2.face.createFisherFaceRecognizer()
cv2.face.createLBPHFaceRecognizer()
We have got three face recognizers, but do you know which one to use, and when? Or which one is better? I guess not. So why not go through a brief summary of each, what do you say? I am assuming you said yes :) So let's dive into the theory of each.
This algorithm builds on the fact that not all parts of a face are equally important or equally useful. When you look at someone, you recognize him/her by distinct features like the eyes, nose, cheeks and forehead, and how they vary with respect to each other. So you are actually focusing on the areas of maximum change (mathematically speaking, this change is variance) of the face. For example, from the eyes to the nose there is a significant change, and the same is the case from the nose to the mouth. When you look at multiple faces, you compare them by looking at these parts, because they are the most useful and important components of a face. Important because they capture the maximum change among faces, the change that helps you differentiate one face from another. This is exactly how the EigenFaces face recognizer works.
The EigenFaces face recognizer looks at all the training images of all the persons as a whole and tries to extract the components which are important and useful (the components that capture the maximum variance/change) and discards the rest. This way it not only extracts the important components from the training data but also saves memory by discarding the less important ones. These extracted components are called principal components. Below is an image showing the principal components extracted from a list of faces.
Principal Components source
You can see that principal components actually represent faces and these faces are called eigen faces and hence the name of the algorithm.
So this is how the EigenFaces face recognizer trains itself (by extracting principal components). Remember, it also keeps a record of which principal components belong to which person. One thing to note in the above image is that the EigenFaces algorithm also treats illumination as an important component.
Later, during recognition, when you feed a new image to the algorithm, it repeats the same process on that image. It extracts the principal components from the new image, compares them with the components it stored during training, finds the best match and returns the person label associated with that best-match component.
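To make the idea concrete, here is a minimal sketch of the EigenFaces principle using scikit-learn's PCA rather than OpenCV's implementation; the Olivetti faces dataset and the choice of 50 components are purely illustrative assumptions:
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA

faces = fetch_olivetti_faces()              #400 grayscale faces, 64x64 each
X = faces.data                              #each row is a flattened face image

pca = PCA(n_components=50)                  #keep the 50 strongest components
X_reduced = pca.fit_transform(X)            #project faces onto the eigenfaces

#each principal component can be reshaped back into a 64x64 "eigenface"
eigenfaces = pca.components_.reshape((50, 64, 64))
print(X_reduced.shape)                      #(400, 50): compact face descriptors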
Easy peasy, right? Next one is easier than this one.
This algorithm is an improved version of the EigenFaces face recognizer. EigenFaces looks at all the training faces of all the persons at once and finds principal components from all of them combined. By capturing principal components from everyone combined, you are not focusing on the features that discriminate one person from another, but on the features that represent all the persons in the training data as a whole.
This approach has drawbacks. For example, images with sharp changes (like lighting changes, which are not a useful feature at all) may dominate the rest of the images, and you may end up with features that come from an external source like light and are not useful for discrimination at all. In the end, your principal components will represent lighting changes rather than actual facial features.
The FisherFaces algorithm, instead of extracting features that represent all the faces of all the persons, extracts features that discriminate one person from the others. This way, the features of one person do not dominate over the others, and you are left with the features that tell the persons apart.
Below is an image of features extracted using Fisherfaces algorithm.
Fisher Faces source
You can see that the extracted features actually represent faces, and these faces are called Fisher faces, hence the name of the algorithm.
One thing to note here is that even with the FisherFaces algorithm, if multiple persons have images with sharp changes due to external sources like light, those changes will dominate over other features and affect recognition accuracy.
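As a rough illustration of the contrast with PCA, here is a sketch of the FisherFaces idea using scikit-learn's Linear Discriminant Analysis (again not OpenCV's implementation; the dataset and component count are assumptions for demonstration):
from sklearn.datasets import fetch_olivetti_faces
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

faces = fetch_olivetti_faces()
X, y = faces.data, faces.target

#LDA looks for directions that best separate the persons (classes),
#rather than directions of maximum overall variance as PCA does
lda = LinearDiscriminantAnalysis(n_components=30)
X_discriminative = lda.fit_transform(X, y)
print(X_discriminative.shape)               #(400, 30)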
Getting bored with this theory? Don't worry, only one face recognizer is left and then we will dive deep into the coding part.
I wrote a detailed explanation of Local Binary Patterns Histograms in my previous article on face detection using local binary patterns histograms, so here I will just give a brief overview of how it works.
We know that EigenFaces and FisherFaces are both affected by light, and in real life we can't guarantee perfect lighting conditions. The LBPH face recognizer is an improvement designed to overcome this drawback.
The idea is not to look at the image as a whole, but to find the local features of an image. The LBPH algorithm tries to find the local structure of an image, and it does that by comparing each pixel with its neighboring pixels.
Take a 3x3 window and move it over the image; at each move (each local part of the image), compare the pixel at the center with its neighboring pixels. Neighbors with an intensity value greater than or equal to the center pixel are denoted by 1, and the others by 0. Then you read these 0/1 values under the 3x3 window in clockwise order, and you get a binary pattern like 11100011 that is local to some area of the image. Do this over the whole image and you will have a list of local binary patterns.
LBP Labeling
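Here is a tiny numeric sketch of that thresholding step; the pixel values in the window are made up for illustration and reproduce the 11100011 pattern mentioned above:
import numpy as np

window = np.array([[90, 80, 60],
                   [70, 50, 40],
                   [110, 30, 20]])
center = window[1, 1]

#clockwise neighbors starting at the top-left pixel
neighbors = [window[0, 0], window[0, 1], window[0, 2], window[1, 2],
             window[2, 2], window[2, 1], window[2, 0], window[1, 0]]

#threshold each neighbor against the center pixel: 1 if >= center, else 0
bits = ''.join('1' if n >= center else '0' for n in neighbors)
print(bits, int(bits, 2))                   #11100011 and its decimal value 227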
Now you see why this algorithm has Local Binary Patterns in its name: because you get a list of local binary patterns. And you may be wondering about the Histograms part of LBPH. Well, after you get a list of local binary patterns, you convert each pattern into a decimal number (as shown in the above image) and then you build a histogram of all of those values. A sample histogram looks like this.
Sample Histogram
I guess this answers the question about the histogram part. So, in the end, you will have one histogram for each face image in the training data set. That means that if there were 100 images in the training data set, LBPH will extract 100 histograms after training and store them for later recognition. Remember, the algorithm also keeps track of which histogram belongs to which person.
Later, during recognition, when you feed a new image to the recognizer, it generates a histogram for that new image, compares that histogram with the histograms it already has, finds the best match and returns the person label associated with that best-match histogram.
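For intuition, here is a sketch of comparing two histograms with the chi-square distance, one common choice for this kind of nearest-neighbor histogram matching (the histograms below are random stand-ins, not real LBP data):
import numpy as np

def chi_square_distance(h1, h2, eps=1e-10):
    #smaller distance means the two histograms are more similar
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

h_train = np.random.rand(256); h_train /= h_train.sum()   #stand-in histogram
h_test = np.random.rand(256); h_test /= h_test.sum()      #stand-in histogram
print(chi_square_distance(h_train, h_test))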
Below is a list of faces and their respective local binary pattern images. You can see that the LBP images are not affected by changes in lighting conditions.
LBP Faces source
The theory part is over and now comes the coding part! Ready to dive into coding? Let's get into it then.
Coding Face Recognition with OpenCV
The face recognition process in this tutorial is divided into three steps:
1. Prepare the training data: read the training images for each person together with their integer labels, and detect the face in each image.
2. Train the face recognizer: feed the prepared faces and their labels to the OpenCV face recognizer.
3. Prediction: pass test images of the trained persons to the recognizer and see whether it recognizes them.
To detect faces, I will use the code from my previous article on face detection. So if you have not read it, I encourage you to do so to understand how face detection works and its Python coding.
Before starting the actual coding we need to import the required modules for coding. So let's import them first.
#import OpenCV module
import cv2
#import os module for reading training data directories and paths
import os
#import numpy to convert python lists to numpy arrays as
#it is needed by OpenCV face recognizers
import numpy as np
#matplotlib for displaying our images
import matplotlib.pyplot as plt
%matplotlib inline
The more images used in training, the better. Normally many images are used to train a face recognizer so that it can learn different looks of the same person: with glasses, without glasses, laughing, sad, happy, crying, with beard, without beard, etc. To keep our tutorial simple, we are going to use only 12 images per person.
So our training data consists of 2 persons with 12 images each. All training data is inside the training-data folder, which contains one subfolder for each person, named with the format sLabel (e.g. s1, s2), where Label is the integer label assigned to that person. For example, the folder named s1 contains the images for person 1. The directory structure tree for the training data is as follows:
training-data
|-------------- s1
| |-- 1.jpg
| |-- ...
| |-- 12.jpg
|-------------- s2
| |-- 1.jpg
| |-- ...
| |-- 12.jpg
The test-data folder contains images that we will use to test our face recognizer after it has been successfully trained.
As the OpenCV face recognizer accepts labels as integers, we need to define a mapping between integer labels and persons' actual names, so below I am defining a mapping of persons' integer labels to their respective names.
Note: as we have not assigned label 0 to any person, the mapping for label 0 is empty.
#there is no label 0 in our training data so subject name for index/label 0 is empty
subjects = ["", "Tom Cruise", "Shahrukh Khan"]
You may be wondering why data preparation, right? Well, the OpenCV face recognizer accepts data in a specific format. It accepts two vectors: one vector of the faces of all the persons, and a second vector of integer labels for each face, so that when processing a face the recognizer knows which person that particular face belongs to.
For example, if we had 2 persons and 2 images for each person.
PERSON-1 PERSON-2
img1 img1
img2 img2
Then the prepare data step will produce following face and label vectors.
FACES LABELS
person1_img1_face 1
person1_img2_face 1
person2_img1_face 2
person2_img2_face 2
The data preparation step can be further divided into the following sub-steps:
1. Read all the folder names of the subjects provided in the training data folder; in this tutorial those are s1 and s2.
2. For each subject, extract the label number. Remember that the folders follow the sLabel naming convention, where Label is an integer representing the label we have assigned to that subject. So, for example, the folder name s1 means that the subject has label 1, s2 means the subject's label is 2, and so on. The label extracted in this step is assigned to each face detected in the next step.
3. Read all the images of the subject and detect the face in each image.
4. Add each detected face to a faces vector and its label (extracted in step 2) to a corresponding labels vector.
As mentioned above, to detect faces I am going to use the code from my previous article on face detection. So if you have not read it, I encourage you to do so to understand how face detection works and its coding. Below is the same code.
#function to detect a face using OpenCV
def detect_face(img):
    #convert the test image to grayscale, as the opencv face detector expects gray images
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    #load the OpenCV face detector; I am using LBP which is fast,
    #there is also a more accurate but slow Haar classifier
    face_cascade = cv2.CascadeClassifier('opencv-files/lbpcascade_frontalface.xml')

    #let's detect multiscale images (some faces may be closer to the camera than others)
    #result is a list of face rectangles
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)

    #if no faces are detected then return the original img
    if len(faces) == 0:
        return None, None

    #under the assumption that there will be only one face,
    #extract the face area
    (x, y, w, h) = faces[0]

    #return only the face part of the image
    return gray[y:y+h, x:x+w], faces[0]
I am using OpenCV's LBP face detector. First, I convert the image to grayscale, because most operations in OpenCV are performed on grayscale images. Then I load the LBP face detector using the cv2.CascadeClassifier class and call its detectMultiScale method to detect all the faces in the image. From the detected faces I only pick the first one, under the assumption that there will be only one prominent face per image. As the rectangles returned by detectMultiScale are (x, y, width, height) tuples and not actual face images, I extract the face area from the grayscale image and return both the face image area and the face rectangle.
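As a quick sanity check (not part of the original walkthrough; the image path is an assumption based on the folder layout above), you can run the detector on a single training image:
#try the face detector on one training image (path assumed from the tree above)
img = cv2.imread("training-data/s1/1.jpg")
face, rect = detect_face(img)
if face is not None:
    print("face found at", rect)            #rect is (x, y, width, height)
else:
    print("no face detected")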
Now you have got a face detector and you know the 4 steps to prepare the data, so are you ready to code the prepare data step? Yes? So let's do it.
#this function will read all persons' training images, detect the face in each image
#and will return two lists of exactly the same size: one list
#of faces and another list of labels for each face
def prepare_training_data(data_folder_path):

    #------STEP-1--------
    #get the directories (one directory for each subject) in the data folder
    dirs = os.listdir(data_folder_path)

    #list to hold all subject faces
    faces = []
    #list to hold labels for all subjects
    labels = []

    #let's go through each directory and read the images within it
    for dir_name in dirs:

        #our subject directories start with the letter 's' so
        #ignore any non-relevant directories if any
        if not dir_name.startswith("s"):
            continue

        #------STEP-2--------
        #extract the label number of the subject from dir_name
        #format of dir name = sLabel,
        #so removing the letter 's' from dir_name will give us the label
        label = int(dir_name.replace("s", ""))

        #build the path of the directory containing images for the current subject
        #sample subject_dir_path = "training-data/s1"
        subject_dir_path = data_folder_path + "/" + dir_name

        #get the names of the images inside the given subject directory
        subject_images_names = os.listdir(subject_dir_path)

        #------STEP-3--------
        #go through each image name, read the image,
        #detect the face and add the face to the list of faces
        for image_name in subject_images_names:

            #ignore system files like .DS_Store
            if image_name.startswith("."):
                continue

            #build the image path
            #sample image path = training-data/s1/1.jpg
            image_path = subject_dir_path + "/" + image_name

            #read the image
            image = cv2.imread(image_path)

            #display an image window to show the image
            cv2.imshow("Training on image...", image)
            cv2.waitKey(100)

            #detect the face
            face, rect = detect_face(image)

            #------STEP-4--------
            #for the purpose of this tutorial
            #we will ignore faces that are not detected
            if face is not None:
                #add the face to the list of faces
                faces.append(face)
                #add the label for this face
                labels.append(label)

    cv2.destroyAllWindows()
    cv2.waitKey(1)
    cv2.destroyAllWindows()

    return faces, labels
I have defined a function that takes as parameter the path where the training subjects' folders are stored. This function follows the same 4 data preparation sub-steps mentioned above.
(step-1) I use the os.listdir method to read the names of all the folders stored at the path passed to the function, and I define the faces and labels vectors.
(step-2) After that, I traverse the subjects' folder names and extract the label information from each. As the folder names follow the sLabel naming convention, removing the letter s from a folder name gives us the label assigned to that subject.
(step-3) I then read all the image names of the current subject and traverse those images one by one. I use OpenCV's imshow(window_title, image) along with the waitKey(interval) method to display the current image being traversed; waitKey pauses the code flow for the given interval in milliseconds, and I use a 100ms interval so that we can view each image window for 100ms. I then detect the face in the current image.
(step-4) Finally, I add each detected face and its label to their respective vectors.
But a function can't do anything unless we call it on some data that it has to prepare, right? Don't worry, I have got data of two beautiful and famous celebrities. I am sure you will recognize them!
Let's call this function on images of these beautiful celebrities to prepare data for training of our Face Recognizer. Below is a simple code to do that.
#let's first prepare our training data
#data will be in two lists of same size
#one list will contain all the faces
#and other list will contain respective labels for each face
print("Preparing data...")
faces, labels = prepare_training_data("training-data")
print("Data prepared")
#print total faces and labels
print("Total faces: ", len(faces))
print("Total labels: ", len(labels))
Preparing data...
Data prepared
Total faces: 23
Total labels: 23
This was probably the boring part, right? Don't worry, the fun stuff is coming up next. It's time to train our own face recognizer, so that once trained it can recognize new faces of the persons it was trained on. Ready? OK, then let's train our face recognizer.
As we know, OpenCV comes equipped with three face recognizers.
cv2.face.createEigenFaceRecognizer()
cv2.face.createFisherFaceRecognizer()
cv2.face.createLBPHFaceRecognizer()
I am going to use the LBPH face recognizer, but you can use any face recognizer of your choice. No matter which of OpenCV's face recognizers you use, the code will remain the same. You just have to change one line: the face recognizer initialization line, given below.
#create our LBPH face recognizer
face_recognizer = cv2.face.createLBPHFaceRecognizer()
#or use EigenFaceRecognizer by replacing above line with
#face_recognizer = cv2.face.createEigenFaceRecognizer()
#or use FisherFaceRecognizer by replacing above line with
#face_recognizer = cv2.face.createFisherFaceRecognizer()
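One caveat worth flagging (my note, not the original author's): the factory function names above belong to older OpenCV 3.x contrib builds. In recent opencv-contrib-python releases they were renamed to the _create variants, e.g. cv2.face.LBPHFaceRecognizer_create(). A small version-tolerant sketch:
#pick whichever factory function the installed OpenCV version provides
if hasattr(cv2.face, "LBPHFaceRecognizer_create"):
    face_recognizer = cv2.face.LBPHFaceRecognizer_create()
else:
    face_recognizer = cv2.face.createLBPHFaceRecognizer()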
Now that we have initialized our face recognizer and also prepared our training data, it's time to train it. We will do that by calling the train(faces-vector, labels-vector) method of the face recognizer.
#train our face recognizer of our training faces
face_recognizer.train(faces, np.array(labels))
Did you notice that instead of passing the labels vector directly to the face recognizer, I first convert it to a numpy array? This is because OpenCV expects the labels vector to be a numpy array.
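As an aside (a workflow suggestion of mine, not a step from the original article), a trained OpenCV face recognizer can be persisted to disk and reloaded later, which avoids retraining on every run; recent versions expose write() and read() for this (very old builds used save() and load()):
#persist the trained model and reload it later instead of retraining
face_recognizer.write("trained_face_model.yml")
#face_recognizer.read("trained_face_model.yml")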
Still not satisfied? Want to see some action? The next step is the real action, I promise!
Now comes my favorite part: the prediction part. This is where we actually get to see if our algorithm is recognizing our trained subjects' faces or not. We will take two test images of our celebrities, detect the faces in each of them, and then pass those faces to our trained face recognizer to see if it recognizes them.
Below are some utility functions that we will use to draw a bounding box (rectangle) around a face and put the celebrity's name near the face bounding box.
#function to draw a rectangle on an image
#according to the given (x, y) coordinates and
#the given width and height
def draw_rectangle(img, rect):
    (x, y, w, h) = rect
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)

#function to draw text on a given image starting from
#the passed (x, y) coordinates
def draw_text(img, text, x, y):
    cv2.putText(img, text, (x, y), cv2.FONT_HERSHEY_PLAIN, 1.5, (0, 255, 0), 2)
The first function, draw_rectangle, draws a rectangle on an image based on the passed rectangle coordinates. It uses OpenCV's built-in function cv2.rectangle(img, topLeftPoint, bottomRightPoint, color, lineWidth) to draw the rectangle. We will use it to draw a rectangle around the face detected in a test image.
The second function, draw_text, uses OpenCV's built-in function cv2.putText(img, text, startPoint, font, fontSize, color, lineWidth) to draw text on an image.
Now that we have the drawing functions, we just need to call the face recognizer's predict(face) method to test it on the test images. The following function does the prediction for us.
#this function recognizes the person in the image passed
#and draws a rectangle around the detected face with the name of the
#subject
def predict(test_img):
    #make a copy of the image as we don't want to change the original image
    img = test_img.copy()
    #detect the face in the image
    face, rect = detect_face(img)

    #predict the image using our face recognizer;
    #predict() returns the matched label and a confidence value
    label, confidence = face_recognizer.predict(face)
    #get the name of the respective label returned by the face recognizer
    label_text = subjects[label]

    #draw a rectangle around the detected face
    draw_rectangle(img, rect)
    #draw the name of the predicted person
    draw_text(img, label_text, rect[0], rect[1]-5)

    return img
The predict(face) method returns the predicted label for the given face (recent OpenCV versions actually return a (label, confidence) tuple, which is why the code above unpacks two values). Now that we have the prediction function well defined, the next step is to actually call this function on our test images and display those test images to see if our face recognizer correctly recognized them. So let's do it. This is what we have been waiting for.
print("Predicting images...")
#load test images
test_img1 = cv2.imread("test-data/test1.jpg")
test_img2 = cv2.imread("test-data/test2.jpg")
#perform a prediction
predicted_img1 = predict(test_img1)
predicted_img2 = predict(test_img2)
print("Prediction complete")
#create a figure of 2 plots (one for each test image)
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
#display test image1 result
ax1.imshow(cv2.cvtColor(predicted_img1, cv2.COLOR_BGR2RGB))
#display test image2 result
ax2.imshow(cv2.cvtColor(predicted_img2, cv2.COLOR_BGR2RGB))
#display both images
cv2.imshow("Tom cruise test", predicted_img1)
cv2.imshow("Shahrukh Khan test", predicted_img2)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
cv2.destroyAllWindows()
Predicting images...
Prediction complete
Woohoo! Isn't it beautiful? Indeed, it is!
Face recognition is a fascinating idea to work on, and OpenCV has made it extremely simple and easy for us to code. It takes just a few lines of code to have a fully working face recognition application, and we can switch between all three face recognizers with a single line of code change. It's that simple.
Although the EigenFaces, FisherFaces and LBPH face recognizers are good, there are even better ways to perform face recognition, such as using Histograms of Oriented Gradients (HOGs) and neural networks. The more advanced face recognition systems are nowadays implemented using a combination of OpenCV and machine learning. I have plans to write some articles on those more advanced methods as well, so stay tuned!
Download Details:
Author: informramiz
Source Code: https://github.com/informramiz/opencv-face-recognition-python
License: MIT License