Mysql Statsd: A Python Daemon to Gather Information From MySQL

Deprecation

This project is no longer supported by Spil Games and has been adopted by DB-Art.

The new repository can be found here: MySQL-StatsD @ DB-Art

mysql-statsd

Daemon that gathers statistics from MySQL and sends them to statsd.

Usage / Installation

Install mysql_statsd through pip (pip is a Python package manager; please don't use sudo!):

pip install mysql_statsd

If all went well, you'll now have a new executable called mysql_statsd in your path.

Running mysql_statsd

$ mysql_statsd --config /etc/mysql-statsd.conf

This assumes you placed a config file named mysql-statsd.conf in /etc/.

See our example configuration, or read below for how to configure it.

Running the above command will start mysql_statsd in daemon mode. If you wish to see its output, run the command with -f / --foreground.
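
For example, to run it in the foreground and watch what it sends (using the flags from the usage section below):

$ mysql_statsd --config /etc/mysql-statsd.conf --foreground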

Usage

$ mysql_statsd --help
usage: mysql_statsd.py [-h] [-c FILE] [-d] [-f]

optional arguments:
  -h, --help            show this help message and exit
  -c FILE, --config FILE
                        Configuration file
  -d, --debug           Prints statsd metrics next to sending them
  --dry-run             Print the output that would be sent to statsd without
                        actually sending data somewhere
  -f, --foreground      Dont fork main program

At the moment there is also a daemon script for this package.

You're more than welcome to help us improve it!

Platforms

We would love to support many other kinds of database servers, but currently we support these:

  • MySQL 5.1
  • MySQL 5.5
  • Galera

Both MySQL versions are supported in their Percona flavours as well as vanilla.

Todo:

Support for the following platforms:

  • MySQL 5.6
  • MariaDB

We're looking forward to your pull requests for other platforms!

Development installation

To install the package, set up a Python virtual environment.

Install the requirements (once the virtual environment is active):

pip install -r requirements.txt

NOTE: The MySQL-Python package needs the mysql_config command to be in your path.

There are future plans to replace the mysql-python package with PyMySQL.

After that you're able to run the script with:

$ python mysql_statsd/mysql_statsd.py

Coding standards

We like to stick with the standard Python way of working: PEP-8.

Configuration

The configuration consists of four sections:

  • daemon specific (log/pidfiles)
  • statsd (host, port, prefixes)
  • mysql (connections, queries, etc.)
  • metrics (metrics to be stored including their type)

Daemon

The daemon section allows you to set the paths to your log and pid files.
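
A minimal sketch of this section (the key names here are illustrative assumptions; check the example configuration shipped with the project for the exact ones):

[daemon]
# illustrative paths, adjust to your environment
logfile = /var/log/mysql_statsd.log
pidfile = /var/run/mysql_statsd.pid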

Statsd

The Statsd section allows you to configure the prefix and hostname of the metrics. In our example the prefix has been set to mysql and the hostname is included. This will log the status.com_select metric to: mysql.<hostname>.status.com_select

You can use any prefix that is necessary in your environment.
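
A hedged sketch of this section matching the example above (the key names are illustrative assumptions, not necessarily the project's exact ones):

[statsd]
# illustrative key names; consult the shipped example configuration
host = 127.0.0.1
port = 8125
prefix = mysql
include_hostname = true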

MySQL

The MySQL section allows you to configure the credentials of your mysql host (preferably on localhost) and the queries + timings for the metrics. The queries and timings are configured through the stats_types setting, so take for instance the following example:

stats_types = status, innodb

This will execute both query_status and query_innodb on the MySQL server. The frequency can then be controlled through the time (in milliseconds) set in interval_status and interval_innodb. The complete configuration would be:

stats_types = status, innodb
query_status = SHOW GLOBAL STATUS
interval_status = 1000
query_innodb = SHOW ENGINE INNODB STATUS
interval_innodb = 10000

A special case is the query_commit: since the connection opened by mysql_statsd is kept open and autocommit is turned off by default, the status variables are not updated if your server is set to the REPEATABLE_READ transaction isolation level. Most probably your history list will also skyrocket and your ibdata files will grow fast enough to drain all available disk space. So when in doubt about your transaction isolation: do include the query_commit!
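
A hedged sketch of what that could look like (the commit query and interval below are assumptions; adjust them to your setup):

stats_types = status, innodb, commit
query_commit = COMMIT
interval_commit = 1000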

Now here is the interesting part of mysql_statsd: if you wish to keep track of your own application data inside your application database, you can create your own custom query this way. For example:

stats_types = myapp
query_myapp = SELECT some_metric_name, some_metric_value FROM myapp.metric_table WHERE metric_ts >= DATE_SUB(NOW(), interval 1 MINUTE)
interval_myapp = 60000

This will query your application database every 60 seconds, fetch all the metrics that changed during the past minute, and send them through StatsD. Obviously you need to whitelist them via the metrics section below.

Metrics

The metrics section is basically a whitelist of all metrics you wish to send to Graphite via StatsD. Currently there is no possibility to whitelist all possible metrics, but there is a special case where we do allow wildcarding: for bufferpool_* we whitelist that specific metric for all buffer pools. Don't worry if you haven't configured multiple buffer pools: the output will be omitted by InnoDB and thus not parsed by the preprocessor.

Important to know about the metrics is that you will have to specify what type they are. By default Graphite stores all metrics equally, but they are treated differently per type:

  • Gauge (g for gauge)
  • Rate (r for raw, d for delta)
  • Timer (t for timer)

Gauges are sticky values (like the speedometer in your car). Rates are the number of units that need to be translated to units per second. Timers are the time it took to perform a certain task.
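
As a hedged illustration, whitelist entries could look like this (the metric names and exact syntax are assumptions; consult the example configuration for the real format):

[metrics]
# illustrative entries: d = delta, g = gauge
status.com_select = d
status.threads_connected = g
innodb.history_list = g
innodb.bufferpool_* = g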

An ever-increasing value like com_select can be sent in various ways. If you wish to retain the absolute value of com_select, it is advised to configure it as a gauge. However, if you are going to use it as a rate (queries per second), there is no point in storing it as a gauge first and then later deriving the rate from that gauge. It would be far more accurate to store it as a rate in the first place.

Keep in mind that sending the com_select value as a raw value is a bad habit in this case: StatsD will average out the collected metrics per second, so sending a value of 1,000,000 ten times within a 10-second timeframe will average out to the expected 1,000,000. However, as the processing of metrics also takes a bit of time, the chance of missing one beat is relatively high; you end up sending the value only nine times, hence averaging out to 900,000 once in a while.

The best way to configure com_select as a rate is by defining it as a delta. The delta metric will remember the value as it was during the previous run and will only send the difference between the two values.
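
To illustrate the idea, a minimal Python sketch of how a delta metric behaves (illustrative only, not the project's actual implementation):

# Remember the value from the previous run per metric key.
previous = {}

def delta(key, value):
    """Return the difference with the previous run, or None on the first run."""
    last = previous.get(key)
    previous[key] = value
    if last is None:
        return None  # no baseline yet, nothing to send
    return value - last

print(delta("status.com_select", 1000000))  # None (first run, no baseline)
print(delta("status.com_select", 1000100))  # 100 -> this is what gets sent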

Media:

Art gave a talk about this tool at Percona London 2013: http://www.percona.com/live/mysql-conference-2013/sessions/mysql-performance-monitoring-using-statsd-and-graphite

Contributors

  • spil-jasper
  • thijsdezoete
  • art-spilgames
  • bnkr


Author: db-art
Source Code: https://github.com/db-art/mysql-statsd
License: BSD-3-Clause License
