Elvis Miranda

1626423711

Python Automated Machine Learning Library for Tabular Data

Simple but powerful Automated Machine Learning library for tabular data. It uses efficient in-memory SAP HANA algorithms to automate routine Data Science tasks.

Table of Contents

  1. About The Project
  2. Getting Started
    • Prerequisites
    • Installation
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact

About the project

❗️ Warning ❗️

The project has been frozen indefinitely 🥶. However, you can still use our web app. Also, this library is an open-source research project and is not part of any official SAP product.

What's this?

This is a simple but accurate Automated Machine Learning library. Built on SAP HANA's powerful in-memory algorithms, it provides high accuracy in multiple machine learning tasks. The library also uses numerous data preprocessing functions to automate routine data cleaning tasks. In short, hana_automl goes through all AutoML steps and makes Data Science work easier.

What is SAP HANA?

From www.sap.com: SAP HANA is a high-performance in-memory database that speeds data-driven, real-time decisions and actions.

Web app

https://share.streamlit.io/dan0nchik/sap-hana-automl/main/web.py

Documentation

https://sap-hana-automl.readthedocs.io/en/latest/index.html

Benchmarks

https://github.com/dan0nchik/SAP-HANA-AutoML/blob/main/comparison_openml.ipynb

ML tasks:

  •  Binary classification
  •  Regression
  •  Multiclass classification
  •  Forecasting

Steps automated:

  •  Data exploration
  •  Data preparation
  •  Feature engineering
  •  Model selection
  •  Model training
  •  Hyperparameter tuning

👇 By the end of summer 2021, the blue part will be fully automated by our library.

Clients

Streamlit client

Getting Started

To get the package up and running, follow these simple steps.

Prerequisites

Make sure you have the following:

1: ✅ Set up SAP HANA (skip this step if you already have an instance with PAL enabled). There are 2 ways to do that.
In HANA Cloud:

  • Create a free trial account
  • Set up an instance
  • Enable PAL (Predictive Analysis Library). This is essential because the library relies on PAL algorithms.

In Virtual Machine:

  • Rent a virtual machine in Azure, AWS, Google Cloud, etc.
  • Install a HANA instance there, or on your PC if it has more than 32 GB of RAM.
  • Enable PAL (Predictive Analysis Library). This is essential because the library relies on PAL algorithms.

2: ✅ Installed software

  • Python > 3.6
    Skip this step if python --version returns > 3.6
  • Cython
pip3 install Cython
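The version requirement above can also be checked from Python itself. The following is a minimal sketch, assuming "Python > 3.6" is read literally as a major.minor version later than 3.6; if 3.6.x turns out to work for you, relax the comparison. The names MIN_VERSION and python_is_supported are hypothetical helpers, not part of hana_automl.

```python
import sys

# Assumption: "Python > 3.6" is read as "major.minor strictly later than 3.6".
MIN_VERSION = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is strictly newer than `minimum`."""
    return tuple(version_info[:2]) > tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} satisfies the prerequisite.")
    else:
        print(f"Python {current} is too old; install a newer interpreter.")
```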

Installation

There are 2 ways to install the library

  • Stable: from PyPI
pip3 install hana_automl
  • Latest: from the repository
pip3 install https://github.com/dan0nchik/SAP-HANA-AutoML/archive/dev.zip
  • Note: the latest version may contain bugs, so be careful!

After installation

Check that PAL (Predictive Analysis Library) is installed and that the required roles are granted

  • Read the docs section about that.
  • If you don't want to read the docs, run this code:
from hana_automl.utils.scripts import setup_user
from hana_ml.dataframe import ConnectionContext

cc = ConnectionContext(address='address', user='user', password='password', port=39015)

# replace with credentials of user that will be created or granted a role to run PAL.
setup_user(connection_context=cc, username='user', password="password")

Usage

From code

Our library in a few lines of code

Connect to database.

from hana_ml.dataframe import ConnectionContext

cc = ConnectionContext(address='address',
                     user='username',
                     password='password',
                     port=1234)

Create AutoML model and fit it.

from hana_automl.automl import AutoML

model = AutoML(cc)
model.fit(
  file_path='path to training dataset', # it may be HANA table/view, or pandas DataFrame
  steps=10, # number of iterations
  target='target', # column to predict
  time_limit=120 # time limit in seconds
)
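The steps and time_limit arguments above both bound the search. As a generic illustration only (this is not hana_automl's implementation, and budgeted_search, sample and evaluate are hypothetical names), a search loop that respects both an iteration cap and a time limit can be sketched like this:

```python
import random
import time


def budgeted_search(evaluate, sample, steps, time_limit, clock=time.monotonic):
    """Try up to `steps` candidates, stopping early once `time_limit` seconds have passed.

    Generic illustration of an iteration cap combined with a time budget;
    it does not reproduce how hana_automl searches internally.
    """
    deadline = clock() + time_limit
    best_params, best_score, tried = None, None, 0
    for _ in range(steps):
        if clock() >= deadline:
            break  # time budget exhausted before all steps were used
        params = sample()
        score = evaluate(params)
        tried += 1
        if best_score is None or score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, tried


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy objective: the score is highest when x is close to 0.3.
    best, score, tried = budgeted_search(
        evaluate=lambda x: -(x - 0.3) ** 2,
        sample=lambda: rng.uniform(-1.0, 1.0),
        steps=10,
        time_limit=120,
    )
    print(f"best x={best:.3f}, score={score:.4f}, candidates tried={tried}")
```

Whichever limit is reached first ends the loop, so a small steps value with a generous time_limit finishes after the last iteration, not when the time runs out.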

Predict.

model.predict(
  file_path='path to test dataset',
  id_column='ID',
  verbose=1
)
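To try the calls above without real data, you need a training file that contains the target column and a test file that contains the ID column. Here is a minimal sketch using only the standard library. It assumes file_path accepts a CSV file, which is not stated above; the feature columns, file names, and the helpers write_csv and make_toy_datasets are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_csv(path, header, rows):
    """Write a header row followed by data rows to `path`."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def make_toy_datasets(directory):
    """Create tiny train/test CSV files and return their paths.

    The training file carries the column to predict ('target');
    the test file omits it but keeps the 'ID' column used by predict().
    """
    directory = Path(directory)
    train_path = directory / "train.csv"
    test_path = directory / "test.csv"
    write_csv(
        train_path,
        ["ID", "feature_a", "feature_b", "target"],
        [[1, 0.5, 10, 0], [2, 1.5, 20, 1], [3, 0.7, 12, 0], [4, 1.9, 25, 1]],
    )
    write_csv(
        test_path,
        ["ID", "feature_a", "feature_b"],
        [[5, 0.6, 11], [6, 1.7, 22]],
    )
    return train_path, test_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        train, test = make_toy_datasets(tmp)
        print(train.read_text())
        print(test.read_text())
```

The returned paths would then be passed as file_path to fit and predict, with target='target' and id_column='ID'.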

For more examples, please refer to the Documentation

How to run Streamlit client

  1. Clone repository: git clone https://github.com/dan0nchik/SAP-HANA-AutoML.git
  2. Install Cython: pip3 install Cython
  3. Install dependencies: pip3 install -r requirements.txt
  4. Run GUI: streamlit run ./web.py

Roadmap

See the open issues for a list of proposed features (and known issues). Feel free to report any bugs :)

Contributing

Any contributions you make are greatly appreciated 👏!

1: Fork the Project

2: Create your Feature Branch (git checkout -b feature/NewFeature)

3: Install dependencies

pip3 install Cython
pip3 install -r requirements.txt

4: Create a credentials.py file in the tests directory. Your files should look like this:

SAP-HANA-AutoML
│   README.md
│   all other files   
│   .....
│
└───tests
    │   test files...
    │   credentials.py

Copy and paste this piece of code there and replace the placeholder values with your credentials:

host = "host"
user = "username"
password = "password"
port = 39015 # or any port you need
schema = "your schema"

Don't worry, this file is in .gitignore, so your credentials won't be seen by anyone.
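As a rough sketch of how a test might consume this file, the helper below maps the credential fields onto the ConnectionContext keyword arguments shown in the Usage section (address, user, password, port). The name connection_kwargs is hypothetical, and how the project's own tests load credentials.py may differ. SimpleNamespace stands in for the imported credentials module here so the example runs without a database.

```python
from types import SimpleNamespace


def connection_kwargs(credentials):
    """Build keyword arguments for ConnectionContext from a credentials object.

    `credentials` is anything exposing host, user, password and port attributes,
    for example the imported tests/credentials.py module. The schema value is
    left out because ConnectionContext is only shown above with these four arguments.
    """
    return {
        "address": credentials.host,  # credentials.py calls it 'host'
        "user": credentials.user,
        "password": credentials.password,
        "port": credentials.port,
    }


if __name__ == "__main__":
    # Stand-in for `import credentials`; the values are placeholders.
    fake_credentials = SimpleNamespace(
        host="host", user="username", password="password", port=39015, schema="your schema"
    )
    kwargs = connection_kwargs(fake_credentials)
    print(kwargs)
    # With hana_ml installed and a reachable instance, one could then call:
    # from hana_ml.dataframe import ConnectionContext
    # cc = ConnectionContext(**kwargs)
```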

5: Make some changes

6: Write tests that cover your code in the tests directory

7: Run tests (from the SAP-HANA-AutoML directory)

pytest

8: Commit your changes (git commit -m 'Add some amazing features')

9: Push to the branch (git push origin feature/NewFeature)

10: Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.
Don't fully understand the license? Check out the MIT license summary.

Download Details:
 

Author: dan0nchik
Download Link: Download The Source Code
Official Website: https://github.com/dan0nchik/SAP-HANA-AutoML 
License: MIT license

#machinelearning #python
