Java Questions

1595676000

Case Study: Breast Cancer Classification Using a Support Vector Machine

In this tutorial, we’re going to create a model to predict whether a patient has a positive breast cancer diagnosis based on several tumor features.

Problem Statement

The breast cancer database is a publicly available dataset from the UCI Machine Learning Repository. It gives information on tumor features such as tumor size, density, and texture.

**Goal:** To create a classification model that predicts whether the cancer diagnosis is benign or malignant based on several features.

Data used: Kaggle-Breast Cancer Prediction Dataset


Step 1: Exploring the Dataset

First, let’s understand our dataset:

# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Jupyter notebook magic: render plots inline
%matplotlib inline
import seaborn as sns
# Import model-building tools from scikit-learn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.svm import SVC
# Load the data
df_cancer = pd.read_csv('Breast_cancer_data.csv')
df_cancer.head()
# Get some information about the dataset
df_cancer.info()
df_cancer.describe()
# Visualize the data: pairwise feature plots colored by diagnosis
sns.pairplot(df_cancer, hue='diagnosis')
# Correlation heatmap of the features and the diagnosis label
plt.figure(figsize=(7, 7))
sns.heatmap(df_cancer[['mean_radius', 'mean_texture', 'mean_perimeter', 'mean_area', 'mean_smoothness', 'diagnosis']].corr(), annot=True)
# Scatter plot on a new figure so it does not draw over the heatmap
plt.figure()
sns.scatterplot(x='mean_texture', y='mean_perimeter', hue='diagnosis', data=df_cancer)
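The imports above bring in `train_test_split`, `metrics`, and `SVC`, which suggests that a train-and-evaluate step follows the exploration. The block below is only a minimal sketch of what such a step might look like, not the author's actual code. It uses scikit-learn's bundled `load_breast_cancer` dataset as a stand-in for `Breast_cancer_data.csv`, and the function name `train_svm`, the 80/20 split, the scaling step, and `random_state` are all assumptions.

```python
from sklearn import metrics
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def train_svm(test_size=0.2, random_state=0):
    """Fit an RBF-kernel SVM and return (fitted model, test accuracy)."""
    # Stand-in data (assumption): scikit-learn's bundled breast cancer dataset.
    # The tutorial itself reads Breast_cancer_data.csv instead.
    X, y = load_breast_cancer(return_X_y=True)

    # Hold out part of the data so accuracy is measured on unseen samples.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )

    # SVMs are sensitive to feature scale, so standardize before fitting.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    model.fit(X_train, y_train)

    accuracy = metrics.accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


if __name__ == "__main__":
    _, acc = train_svm()
    print(f"Test accuracy: {acc:.3f}")
```

Wrapping the scaler and the classifier in one pipeline keeps the scaling statistics tied to the training split, so the test data does not leak into preprocessing.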

#data-science #machine-learning #support-vector-machine #python #kaggle

Chloe  Butler

1667425440

Pdf2gerb: Perl Script Converts PDF Files to Gerber format

pdf2gerb

Perl script converts PDF files to Gerber format

Pdf2Gerb generates Gerber 274X photoplotting and Excellon drill files from PDFs of a PCB. Up to three PDFs are used: the top copper layer, the bottom copper layer (for 2-sided PCBs), and an optional silk screen layer. The PDFs can be created directly from any PDF drawing software, or a PDF print driver can be used to capture the Print output if the drawing software does not directly support output to PDF.

The general workflow is as follows:

  1. Design the PCB using your favorite CAD or drawing software.
  2. Print the top and bottom copper and top silk screen layers to a PDF file.
  3. Run Pdf2Gerb on the PDFs to create Gerber and Excellon files.
  4. Use a Gerber viewer to double-check the output against the original PCB design.
  5. Make adjustments as needed.
  6. Submit the files to a PCB manufacturer.

Please note that Pdf2Gerb does NOT perform DRC (Design Rule Checks), as these will vary according to individual PCB manufacturer conventions and capabilities. Also note that Pdf2Gerb is not perfect, so the output files must always be checked before submitting them. As of version 1.6, Pdf2Gerb supports most PCB elements, such as round and square pads, round holes, traces, SMD pads, ground planes, no-fill areas, and panelization. However, because it interprets the graphical output of a Print function, there are limitations in what it can recognize (or there may be bugs).

See docs/Pdf2Gerb.pdf for install/setup, config, usage, and other info.


pdf2gerb_cfg.pm

#Pdf2Gerb config settings:
#Put this file in same folder/directory as pdf2gerb.pl itself (global settings),
#or copy to another folder/directory with PDFs if you want PCB-specific settings.
#There is only one user of this file, so we don't need a custom package or namespace.
#NOTE: all constants defined in here will be added to main namespace.
#package pdf2gerb_cfg;

use strict; #trap undef vars (easier debug)
use warnings; #other useful info (easier debug)


##############################################################################################
#configurable settings:
#change values here instead of in main pdf2gerb.pl file

use constant WANT_COLORS => ($^O !~ m/Win/); #ANSI colors no worky on Windows? this must be set < first DebugPrint() call

#just a little warning; set realistic expectations:
#DebugPrint("${\(CYAN)}Pdf2Gerb.pl ${\(VERSION)}, $^O O/S\n${\(YELLOW)}${\(BOLD)}${\(ITALIC)}This is EXPERIMENTAL software.  \nGerber files MAY CONTAIN ERRORS.  Please CHECK them before fabrication!${\(RESET)}", 0); #if WANT_DEBUG

use constant METRIC => FALSE; #set to TRUE for metric units (only affect final numbers in output files, not internal arithmetic)
use constant APERTURE_LIMIT => 0; #34; #max #apertures to use; generate warnings if too many apertures are used (0 to not check)
use constant DRILL_FMT => '2.4'; #'2.3'; #'2.4' is the default for PCB fab; change to '2.3' for CNC

use constant WANT_DEBUG => 0; #10; #level of debug wanted; higher == more, lower == less, 0 == none
use constant GERBER_DEBUG => 0; #level of debug to include in Gerber file; DON'T USE FOR FABRICATION
use constant WANT_STREAMS => FALSE; #TRUE; #save decompressed streams to files (for debug)
use constant WANT_ALLINPUT => FALSE; #TRUE; #save entire input stream (for debug ONLY)

#DebugPrint(sprintf("${\(CYAN)}DEBUG: stdout %d, gerber %d, want streams? %d, all input? %d, O/S: $^O, Perl: $]${\(RESET)}\n", WANT_DEBUG, GERBER_DEBUG, WANT_STREAMS, WANT_ALLINPUT), 1);
#DebugPrint(sprintf("max int = %d, min int = %d\n", MAXINT, MININT), 1); 

#define standard trace and pad sizes to reduce scaling or PDF rendering errors:
#This avoids weird aperture settings and replaces them with more standardized values.
#(I'm not sure how photoplotters handle strange sizes).
#Fewer choices here gives more accurate mapping in the final Gerber files.
#units are in inches
use constant TOOL_SIZES => #add more as desired
(
#round or square pads (> 0) and drills (< 0):
    .010, -.001,  #tiny pads for SMD; dummy drill size (too small for practical use, but needed so StandardTool will use this entry)
    .031, -.014,  #used for vias
    .041, -.020,  #smallest non-filled plated hole
    .051, -.025,
    .056, -.029,  #useful for IC pins
    .070, -.033,
    .075, -.040,  #heavier leads
#    .090, -.043,  #NOTE: 600 dpi is not high enough resolution to reliably distinguish between .043" and .046", so choose 1 of the 2 here
    .100, -.046,
    .115, -.052,
    .130, -.061,
    .140, -.067,
    .150, -.079,
    .175, -.088,
    .190, -.093,
    .200, -.100,
    .220, -.110,
    .160, -.125,  #useful for mounting holes
#some additional pad sizes without holes (repeat a previous hole size if you just want the pad size):
    .090, -.040,  #want a .090 pad option, but use dummy hole size
    .065, -.040, #.065 x .065 rect pad
    .035, -.040, #.035 x .065 rect pad
#traces:
    .001,  #too thin for real traces; use only for board outlines
    .006,  #minimum real trace width; mainly used for text
    .008,  #mainly used for mid-sized text, not traces
    .010,  #minimum recommended trace width for low-current signals
    .012,
    .015,  #moderate low-voltage current
    .020,  #heavier trace for power, ground (even if a lighter one is adequate)
    .025,
    .030,  #heavy-current traces; be careful with these ones!
    .040,
    .050,
    .060,
    .080,
    .100,
    .120,
);
#Areas larger than the values below will be filled with parallel lines:
#This cuts down on the number of aperture sizes used.
#Set to 0 to always use an aperture or drill, regardless of size.
use constant { MAX_APERTURE => max((TOOL_SIZES)) + .004, MAX_DRILL => -min((TOOL_SIZES)) + .004 }; #max aperture and drill sizes (plus a little tolerance)
#DebugPrint(sprintf("using %d standard tool sizes: %s, max aper %.3f, max drill %.3f\n", scalar((TOOL_SIZES)), join(", ", (TOOL_SIZES)), MAX_APERTURE, MAX_DRILL), 1);

#NOTE: Compare the PDF to the original CAD file to check the accuracy of the PDF rendering and parsing!
#for example, the CAD software I used generated the following circles for holes:
#CAD hole size:   parsed PDF diameter:      error:
#  .014                .016                +.002
#  .020                .02267              +.00267
#  .025                .026                +.001
#  .029                .03167              +.00267
#  .033                .036                +.003
#  .040                .04267              +.00267
#This was usually ~ .002" - .003" too big compared to the hole as displayed in the CAD software.
#To compensate for PDF rendering errors (either during CAD Print function or PDF parsing logic), adjust the values below as needed.
#units are pixels; for example, a value of 2.4 at 600 dpi = .004 inch, 2 at 600 dpi = .0033"
use constant
{
    HOLE_ADJUST => -0.004 * 600, #-2.6, #holes seemed to be slightly oversized (by .002" - .004"), so shrink them a little
    RNDPAD_ADJUST => -0.003 * 600, #-2, #-2.4, #round pads seemed to be slightly oversized, so shrink them a little
    SQRPAD_ADJUST => +0.001 * 600, #+.5, #square pads are sometimes too small by .00067, so bump them up a little
    RECTPAD_ADJUST => 0, #(pixels) rectangular pads seem to be okay? (not tested much)
    TRACE_ADJUST => 0, #(pixels) traces seemed to be okay?
    REDUCE_TOLERANCE => .001, #(inches) allow this much variation when reducing circles and rects
};

#Also, my CAD's Print function or the PDF print driver I used was a little off for circles, so define some additional adjustment values here:
#Values are added to X/Y coordinates; units are pixels; for example, a value of 1 at 600 dpi would be ~= .002 inch
use constant
{
    CIRCLE_ADJUST_MINX => 0,
    CIRCLE_ADJUST_MINY => -0.001 * 600, #-1, #circles were a little too high, so nudge them a little lower
    CIRCLE_ADJUST_MAXX => +0.001 * 600, #+1, #circles were a little too far to the left, so nudge them a little to the right
    CIRCLE_ADJUST_MAXY => 0,
    SUBST_CIRCLE_CLIPRECT => FALSE, #generate circle and substitute for clip rects (to compensate for the way some CAD software draws circles)
    WANT_CLIPRECT => TRUE, #FALSE, #AI doesn't need clip rect at all? should be on normally?
    RECT_COMPLETION => FALSE, #TRUE, #fill in 4th side of rect when 3 sides found
};

#allow .012 clearance around pads for solder mask:
#This value effectively adjusts pad sizes in the TOOL_SIZES list above (only for solder mask layers).
use constant SOLDER_MARGIN => +.012; #units are inches

#line join/cap styles:
use constant
{
    CAP_NONE => 0, #butt (none); line is exact length
    CAP_ROUND => 1, #round cap/join; line overhangs by a semi-circle at either end
    CAP_SQUARE => 2, #square cap/join; line overhangs by a half square on either end
    CAP_OVERRIDE => FALSE, #cap style overrides drawing logic
};
    
#number of elements in each shape type:
use constant
{
    RECT_SHAPELEN => 6, #x0, y0, x1, y1, count, "rect" (start, end corners)
    LINE_SHAPELEN => 6, #x0, y0, x1, y1, count, "line" (line seg)
    CURVE_SHAPELEN => 10, #xstart, ystart, x0, y0, x1, y1, xend, yend, count, "curve" (bezier 2 points)
    CIRCLE_SHAPELEN => 5, #x, y, r, count, "circle" (center + radius)
};
#const my %SHAPELEN =
#Readonly my %SHAPELEN =>
our %SHAPELEN =
(
    rect => RECT_SHAPELEN,
    line => LINE_SHAPELEN,
    curve => CURVE_SHAPELEN,
    circle => CIRCLE_SHAPELEN,
);

#panelization:
#This will repeat the entire body the number of times indicated along the X or Y axes (files grow accordingly).
#Display elements that overhang PCB boundary can be squashed or left as-is (typically text or other silk screen markings).
#Set "overhangs" TRUE to allow overhangs, FALSE to truncate them.
#xpad and ypad allow margins to be added around outer edge of panelized PCB.
use constant PANELIZE => {'x' => 1, 'y' => 1, 'xpad' => 0, 'ypad' => 0, 'overhangs' => TRUE}; #number of times to repeat in X and Y directions

# Set this to 1 if you need TurboCAD support.
#$turboCAD = FALSE; #is this still needed as an option?

#CIRCAD pad generation uses an appropriate aperture, then moves it (stroke) "a little" - we use this to find pads and distinguish them from PCB holes. 
use constant PAD_STROKE => 0.3; #0.0005 * 600; #units are pixels
#convert very short traces to pads or holes:
use constant TRACE_MINLEN => .001; #units are inches
#use constant ALWAYS_XY => TRUE; #FALSE; #force XY even if X or Y doesn't change; NOTE: needs to be TRUE for all pads to show in FlatCAM and ViewPlot
use constant REMOVE_POLARITY => FALSE; #TRUE; #set to remove subtractive (negative) polarity; NOTE: must be FALSE for ground planes

#PDF uses "points", each point = 1/72 inch
#combined with a PDF scale factor of .12, each unit is 1/72 * .12 = 1/600 inch, which gives 600 dpi resolution
use constant INCHES_PER_POINT => 1/72; #0.0138888889; #multiply point-size by this to get inches

# The precision used when computing a bezier curve. Higher numbers are more precise but slower (and generate larger files).
#$bezierPrecision = 100;
use constant BEZIER_PRECISION => 36; #100; #use const; reduced for faster rendering (mainly used for silk screen and thermal pads)

# Ground planes and silk screen or larger copper rectangles or circles are filled line-by-line using this resolution.
use constant FILL_WIDTH => .01; #fill at most 0.01 inch at a time

# The max number of characters to read into memory
use constant MAX_BYTES => 10 * M; #bumped up to 10 MB, use const

use constant DUP_DRILL1 => TRUE; #FALSE; #kludge: ViewPlot doesn't load drill files that are too small so duplicate first tool

my $runtime = time(); #Time::HiRes::gettimeofday(); #measure my execution time

print STDERR "Loaded config settings from '${\(__FILE__)}'.\n";
1; #last value must be truthful to indicate successful load


#############################################################################################
#junk/experiment:

#use Package::Constants;
#use Exporter qw(import); #https://perldoc.perl.org/Exporter.html

#my $caller = "pdf2gerb::";

#sub cfg
#{
#    my $proto = shift;
#    my $class = ref($proto) || $proto;
#    my $settings =
#    {
#        $WANT_DEBUG => 990, #10; #level of debug wanted; higher == more, lower == less, 0 == none
#    };
#    bless($settings, $class);
#    return $settings;
#}

#use constant HELLO => "hi there2"; #"main::HELLO" => "hi there";
#use constant GOODBYE => 14; #"main::GOODBYE" => 12;

#print STDERR "read cfg file\n";

#our @EXPORT_OK = Package::Constants->list(__PACKAGE__); #https://www.perlmonks.org/?node_id=1072691; NOTE: "_OK" skips short/common names

#print STDERR scalar(@EXPORT_OK) . " consts exported:\n";
#foreach(@EXPORT_OK) { print STDERR "$_\n"; }
#my $val = main::thing("xyz");
#print STDERR "caller gave me $val\n";
#foreach my $arg (@ARGV) { print STDERR "arg $arg\n"; }
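
The unit conversions described in the config comments above can be checked with a few lines of arithmetic. The sketch below is written in Python rather than Perl and is not part of Pdf2Gerb. The function names, the `tolerance` default, and the idea of applying `HOLE_ADJUST` directly to a parsed diameter before snapping it to a standard drill size are assumptions made for illustration.

```python
DPI = 600  # resolution assumed throughout the config comments

# Pixel adjustment copied from the config: shrink holes by 0.004 inch.
HOLE_ADJUST = -0.004 * DPI

# Drill sizes in inches, copied from a few TOOL_SIZES entries
# (the config lists drills as negative numbers; signs are dropped here).
DRILL_SIZES = [0.014, 0.020, 0.025, 0.029, 0.033, 0.040]


def pixels_to_inches(pixels, dpi=DPI):
    """Convert a pixel count to inches at the given resolution."""
    return pixels / dpi


def scaled_units_to_inches(units, scale=0.12):
    """Convert PDF units to inches: a point is 1/72 inch, scaled by 0.12."""
    return units * scale / 72


def snap_to_standard(size, standard_sizes, tolerance=0.004):
    """Return the closest standard size, or None if none is within tolerance."""
    best = min(standard_sizes, key=lambda s: abs(s - size))
    return best if abs(best - size) <= tolerance else None


if __name__ == "__main__":
    # 2.4 pixels at 600 dpi is 0.004 inch.
    print(pixels_to_inches(2.4))
    # One scaled PDF unit is 1/600 inch, hence 600 dpi.
    print(scaled_units_to_inches(1), 1 / 600)
    # A hole parsed as .02267" (CAD size .020") after the hole adjustment.
    adjusted = 0.02267 + pixels_to_inches(HOLE_ADJUST)
    print(snap_to_standard(adjusted, DRILL_SIZES))
```

Keeping the standard-size list short, as the config comments recommend, makes the nearest-size lookup less ambiguous when parsed diameters are off by a few thousandths of an inch.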

Download Details:

Author: swannman
Source Code: https://github.com/swannman/pdf2gerb

License: GPL-3.0 license

#perl 

Shardul Bhatt

1620797149

Python for Freight Forwarding: Proven Case Study for Logistics Company

Python is a popular web development language for enterprise and customer-centric applications. It is one of the top programming languages, according to TIOBE’s index. It has applications in web development, Machine Learning, Data Science, and other domains. The versatility of Python web development makes it the perfect language for applications in every project.

Amidst the hundreds of languages for web application development, Python stands out. It is powerful, scalable, and easy to learn. Python’s capabilities are useful in every sector: technology, FinTech, HealthTech, the freight forwarding industry, and more. The core functionality of Python takes care of all the programming tasks for every feature that needs to be added.

In this article, we will focus on the major aspects of Python that make it suitable for web applications of all kinds. We will then highlight the proficiency of Python using a proven case study that Python developers at BoTree have built. It is freight forwarding software for an international logistics service provider that uses Python in the main technology stack.

Check out Top 10 real-world Python Use Cases and Applications

Let’s look at the case study and capabilities of Python in detail.

Why choose Python for Web Development

Python is now a first choice for web development. Unlike Ruby on Rails, it offers more flexibility in the process. Here are a few reasons why companies should choose Python for web development:

  • Readable: Python has an easily readable syntax that is similar to the English language. Python developers admire the language because it is easy to read, write, and understand. You don’t have to write additional code to express concepts. The emphasis on code readability makes the code easier to maintain and update.
  • Multiple programming paradigms: Python supports multiple programming paradigms, including object-oriented, procedural, and functional styles. It has a dynamic type system and automatic memory management, which simplify the process of building large and complex enterprise-scale applications.
  • Scalable: Python is highly scalable. Because of its built-in capabilities to minimize errors during the development process, it is well suited to freight forwarding software solutions that require processing bills at a huge scale. It is also suitable for enterprise dashboards and other applications that need to handle massive numbers of server requests at once.
  • Versatile: Python is a highly versatile programming language. It has diverse applications in various domains, including statistical analysis, numerical computation, data analytics, and more. Companies can use it for web development or Machine Learning applications. Today, Python plays a crucial role in building data science models and intelligent algorithms.
  • Libraries: One of the biggest reasons to choose Python is its library set. Python has libraries for almost everything, including TensorFlow, Selenium, Apache Spark, Requests, Theano, PyTorch, and many more. The libraries enable adding functionalities and features, simplifying the process of building high-quality web applications.

Check out Top Python Libraries for Data Science to use in 2020

As Python grows in popularity, its community also grows. It has one of the largest developer communities of any programming language, and that community provides support for development problems and training for a wide range of projects.

Let’s look at a proven case study by BoTree Technologies that showcases Python’s capabilities in web development.

Python: Proven Case Study of a Logistics Company

At BoTree, we use Python development services for building dynamic web applications. Today we will discuss a case study on the freight forwarding services industry. We developed it using Python and other technologies. Let’s understand it better.

About the Case Study

We designed the freight forwarding software for a leading international logistics services provider. The system we created collects information from different freight forwarding websites using the bill of lading (B/L) or the container number. The information is then entered into the centralized system automatically for better management of the freight.

The main challenge was the manual processing of bills of lading. The information had to be gathered from a large number of websites, and each website had hundreds or thousands of bills. The manual process was lengthy and time-consuming. Because the freight forwarding companies were based in different geographical locations, the client also faced language barriers while processing the B/L.

Our Technology Stack

The technology stack to add freight forwarding features was simple and powerful. We used Python, PostgreSQL, AWS SQS, EC2, Puppeteer, and Virtual Private Cloud. We offered web development, software testing, and continuous support and maintenance.

The technology stack we used was focused on simplifying the complications in the freight forwarding system. Because the solution had to be scalable, Python was the natural choice for building the web application.

Our Solution

We built a fully serverless architecture. It maps the websites and analyzes the different fields to extract the details required for freight forwarding.

The solution parses data from different websites and matches the fields with the required information. It also takes into account previously parsed data for making the decision.

The collected information is arranged into a structured format. The entire data set is then pushed back to a centralized ERP system. All the data is accumulated in a single place, making it easier to process the B/L without any hassle.
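
The field-matching step described above can be illustrated with a small sketch. This is not BoTree's implementation: the canonical field names, the alias table `FIELD_ALIASES`, and the function `normalize_record` are all hypothetical, and a production system would also need the scraping, queueing, and ERP integration that the article mentions.

```python
# Hypothetical mapping from site-specific labels to one canonical schema.
FIELD_ALIASES = {
    "bl_number": {"b/l no", "bill of lading", "bl number"},
    "container_number": {"container no", "container", "cntr no"},
    "vessel": {"vessel", "vessel name", "ship"},
}


def normalize_record(raw):
    """Map one scraped record (site-specific keys) onto the canonical fields.

    Keys that match no known alias are dropped.
    """
    normalized = {}
    for key, value in raw.items():
        # Compare labels case-insensitively and ignore trailing punctuation.
        label = key.strip().lower().rstrip(".:")
        for canonical, aliases in FIELD_ALIASES.items():
            if label in aliases:
                normalized[canonical] = value.strip()
                break
    return normalized


if __name__ == "__main__":
    scraped = {"B/L No.": " ABC123 ", "Container No": "MSKU1234567", "ETA": "unknown"}
    print(normalize_record(scraped))
```

Keeping the aliases in a plain dictionary means support for a new website can be added by extending the table rather than changing the matching logic.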

The freight forwarding solution consisted of the following features built using Python -

Core Features

  • B/L Processing: The system could easily parse 15,000 B/L in a single day.
  • Efficient delivery: B/L processing became 30% more efficient.
  • Activity log maintenance: There’s a proper record of all the activities that take place in the system.
  • Multiple languages: The freight forwarding software could easily parse B/L in different languages.

Conclusion

Python is a powerful programming language for enterprise-grade applications. Logistics companies heavily benefit from investing in freight forwarding solutions. Shipping systems are essential for managing the timely delivery of products and services. An internal system for B/L processing can enable you to reap the benefits of swift deliveries.

BoTree Technologies is a custom software development company that has Python experts who can build quality applications for enterprises. We have experience in the logistics, healthcare, fintech, education, and multiple other industries.

Originally published at https://www.botreetechnologies.com on May 11, 2021.

#python case study for logistics company #b/l processing system #freight forwarding case study #logistics case study #case study for logistics company #python web development

Multiclass Classification with Support Vector Machines

Support Vector Machines (SVMs) are not new, but they remain a powerful tool for classification because they tend to resist overfitting and perform well in many cases. If you are only interested in a certain topic, just scroll to it. These are the topics, in order:

  • What’s the mathematical concept behind the Support Vector Machine?
  • What is a kernel and what are kernel functions?
  • What is the kernel trick?
  • What is the dual problem of an SVM?
  • How does Multiclass Classification take place?
  • Implementation via Python and scikit-learn

If you are only interested in how it can be implemented using Python and scikit-learn, scroll down to the end!
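
As a rough illustration of the last topic, the following sketch fits a multiclass SVM with scikit-learn. It is not taken from the article: the choice of the bundled iris dataset, the function name `fit_multiclass_svm`, the 70/30 split, and `random_state` are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def fit_multiclass_svm(random_state=0):
    """Fit an RBF-kernel SVC on three classes; return model, accuracy, score shape."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state, stratify=y
    )

    # SVC handles multiclass problems with a one-vs-one scheme: one binary
    # classifier per pair of classes. decision_function_shape="ovo" exposes
    # those pairwise scores directly instead of a one-vs-rest summary.
    clf = SVC(kernel="rbf", decision_function_shape="ovo")
    clf.fit(X_train, y_train)

    accuracy = clf.score(X_test, y_test)
    pairwise_shape = clf.decision_function(X_test).shape
    return clf, accuracy, pairwise_shape


if __name__ == "__main__":
    _, acc, shape = fit_multiclass_svm()
    print(f"Test accuracy: {acc:.3f}, decision function shape: {shape}")
```

With three classes there are three class pairs, so the one-vs-one decision function has three columns per sample.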

#scikit-learn #classification #python #support-vector-machine #machine-learning