Deploy Prometheus monitoring system using ansible.
When upgrading from version <= 2.4.0 of this role to version >= 2.4.1, please turn off your Prometheus instance. More information is available in the 2.4.1 release notes.
On a Mac deployer host, gnu-tar is also required: brew install gnu-tar
All variables which can be overridden are stored in the defaults/main.yml file, as well as in the table below.
Name | Default Value | Description |
---|---|---|
prometheus_version | 2.27.0 | Prometheus package version. Also accepts latest as parameter. Only prometheus 2.x is supported |
prometheus_skip_install | false | Prometheus installation tasks get skipped when set to true. |
prometheus_binary_local_dir | "" | Allows using local packages instead of the ones distributed on GitHub. As a parameter it takes a directory where the prometheus AND promtool binaries are stored on the host on which Ansible is run. This overrides the prometheus_version parameter |
prometheus_config_dir | /etc/prometheus | Path to directory with prometheus configuration |
prometheus_db_dir | /var/lib/prometheus | Path to directory with prometheus database |
prometheus_read_only_dirs | [] | Additional paths that Prometheus is allowed to read (useful for SSL certs outside of the config directory) |
prometheus_web_listen_address | "0.0.0.0:9090" | Address on which prometheus will be listening |
prometheus_web_config | {} | A Prometheus web config yaml for configuring TLS and auth (see the example after this table). |
prometheus_web_external_url | "" | External address on which prometheus is available. Useful when behind reverse proxy. Ex. http://example.org/prometheus |
prometheus_storage_retention | "30d" | Data retention period |
prometheus_storage_retention_size | "0" | Data retention period by size |
prometheus_config_flags_extra | {} | Additional configuration flags passed to prometheus binary at startup |
prometheus_alertmanager_config | [] | Configuration responsible for pointing where alertmanagers are. This should be specified as a list in yaml format. It is compatible with the official alertmanager_config format |
prometheus_alert_relabel_configs | [] | Alert relabeling rules. This should be specified as a list in yaml format. It is compatible with the official alert_relabel_configs format |
prometheus_global | { scrape_interval: 60s, scrape_timeout: 15s, evaluation_interval: 15s } | Prometheus global config. Compatible with official configuration |
prometheus_remote_write | [] | Remote write. Compatible with official configuration |
prometheus_remote_read | [] | Remote read. Compatible with official configuration |
prometheus_external_labels | environment: "{{ ansible_fqdn | default(ansible_host) | default(inventory_hostname) }}" | Provide map of additional labels which will be added to any time series or alerts when communicating with external systems |
prometheus_targets | {} | Targets which will be scraped. A better example is provided on our demo site |
prometheus_scrape_configs | defaults/main.yml#L58 | Prometheus scrape jobs provided in same format as in official docs |
prometheus_config_file | "prometheus.yml.j2" | Variable used to provide custom prometheus configuration file in form of ansible template |
prometheus_alert_rules | defaults/main.yml#L81 | Full list of alerting rules which will be copied to {{ prometheus_config_dir }}/rules/ansible_managed.rules. Alerting rules can also be provided by other files located in {{ prometheus_config_dir }}/rules/ which have the *.rules extension |
prometheus_alert_rules_files | defaults/main.yml#L78 | List of folders where ansible will look for files containing alerting rules which will be copied to {{ prometheus_config_dir }}/rules/ . Files must have *.rules extension |
prometheus_static_targets_files | defaults/main.yml#L78 | List of folders where ansible will look for files containing custom static target configuration files which will be copied to {{ prometheus_config_dir }}/file_sd/ . |
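For the prometheus_web_config variable referenced in the table above, a minimal sketch enabling TLS might look like the following (the certificate paths are hypothetical; the keys follow the Prometheus web configuration file format):

prometheus_web_config:
  tls_server_config:
    cert_file: /etc/prometheus/tls/prometheus.crt
    key_file: /etc/prometheus/tls/prometheus.key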
prometheus_scrape_configs and prometheus_targets

prometheus_targets is just a map used to create multiple files located in the "{{ prometheus_config_dir }}/file_sd" directory, where file names are composed from the top-level keys in that map with a .yml suffix. Those files store file_sd scrape target data, and they need to be read in prometheus_scrape_configs.

The part of the prometheus.yml configuration file which describes what is scraped by Prometheus is stored in prometheus_scrape_configs. For this variable the same configuration options as described in the Prometheus docs are used.

Meanwhile, prometheus_targets is our way of adopting the Prometheus file_sd scrape type. It defines a map of files with their content. The top-level keys are the base names of files, which need to have their own scrape job in prometheus_scrape_configs, and the values are the content of those files.

All this means that you CAN use a custom prometheus_scrape_configs with prometheus_targets set to {}. However, when you set anything in prometheus_targets, it needs to be mapped to prometheus_scrape_configs. If it isn't, you'll get an error in preflight checks.

Let's look at our default configuration, which shows all features. By default we have this prometheus_targets:
prometheus_targets:
node: # This is a base file name. File is located in "{{ prometheus_config_dir }}/file_sd/<<BASENAME>>.yml"
- targets: #
- localhost:9100 # All this is a targets section in file_sd format
labels: #
env: test #
Such a config will result in creating one file named node.yml in the {{ prometheus_config_dir }}/file_sd directory.
Next, this file needs to be loaded into the scrape config. Here is a modified version of our default prometheus_scrape_configs:
prometheus_scrape_configs:
- job_name: "prometheus" # Custom scrape job, here using `static_config`
metrics_path: "/metrics"
static_configs:
- targets:
- "localhost:9090"
- job_name: "example-node-file-servicediscovery"
file_sd_configs:
- files:
- "{{ prometheus_config_dir }}/file_sd/node.yml" # This line loads file created from `prometheus_targets`
An example playbook using this role:
---
- hosts: all
roles:
- cloudalchemy.prometheus
vars:
prometheus_targets:
node:
- targets:
- localhost:9100
- demo.cloudalchemy.org:9100
labels:
env: demosite
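Such a playbook can then be run in the usual way (the inventory and playbook file names here are hypothetical):

ansible-playbook -i inventory.ini playbook.yml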
The Prometheus organization provides a demo site for a full monitoring solution based on Prometheus and Grafana. The repository with code and links to running instances is available on GitHub.
Alerting rules are defined in the prometheus_alert_rules variable. The format is almost identical to the one defined in the Prometheus 2.0 documentation. Due to similarities in templating engines, every template should be wrapped in {% raw %} and {% endraw %} statements. An example is provided in the defaults/main.yml file.
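For illustration, a minimal sketch of one such rule (the alert name and threshold are made up; note the {% raw %} and {% endraw %} wrapping around the Prometheus template expression):

prometheus_alert_rules:
  - alert: InstanceDown
    expr: 'up == 0'
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: '{% raw %}Instance {{ $labels.instance }} is down{% endraw %}'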
The preferred way of locally testing the role is to use Docker and molecule (v2.x). You will have to install Docker on your system. See "Get started" for a Docker package suitable for your system. We are using tox to simplify the process of testing on multiple ansible versions. To install tox, execute:
pip3 install tox
To run tests on all ansible versions (WARNING: this can take some time):
tox
To run a custom molecule command in a custom environment with only the default test scenario:
tox -e py35-ansible28 -- molecule test -s default
For more information about molecule go to their docs.
If you would like to run tests on a remote Docker host, just set the DOCKER_HOST environment variable before running the tox tests.
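For example (the remote host address here is hypothetical):

DOCKER_HOST=tcp://192.168.1.10:2376 tox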
Combining molecule and CircleCI allows us to test how new PRs will behave when used with multiple ansible versions and multiple operating systems. This also allows us to create test scenarios for different role configurations. As a result we have a quite large test matrix which takes more time than local testing, so please be patient.
See troubleshooting.
Author: Cloudalchemy
Source Code: https://github.com/cloudalchemy/ansible-prometheus
License: MIT license
A monitoring and logging solution for Docker hosts and containers with Prometheus, Grafana, Loki, cAdvisor, NodeExporter and alerting with AlertManager.
Inspired by dockprom
Full source code is here: https://github.com/ductnn/domolo
Clone this repository on your Docker host, cd into the domolo directory, and run docker-compose up -d:
git clone https://github.com/ductnn/domolo.git
cd domolo
docker-compose up -d
Containers:
- Prometheus: http://<host-ip>:9090
- Pushgateway: http://<host-ip>:9091
- AlertManager: http://<host-ip>:9093
- Grafana: http://<host-ip>:3000
- Loki: http://<host-ip>:3100
Change the credentials in the config file:
GF_SECURITY_ADMIN_USER=admin
GF_SECURITY_ADMIN_PASSWORD=changeme
GF_USERS_ALLOW_SIGN_UP=false
Grafana is preconfigured with dashboards, and Prometheus (the default) and Loki are set up as datasources:
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
access: proxy
orgId: 1
url: http://prometheus:9090
basicAuth: false
isDefault: true
editable: true
- name: Loki
type: loki
access: proxy
jsonData:
maxLines: 1000
basicAuth: false
url: http://loki:3100
isDefault: false
editable: true
Configure Prometheus to receive metrics from node_exporter. First, set up node_exporter on the servers we need to monitor with docker-compose.agents.yml and run the command:
docker-compose -f docker-compose.agents.yml up -d
This file will set up three agents:
node_exporter
cAdvisor
promtail
Then, we need to configure metric scraping on the Prometheus server. To monitor the Prometheus server itself:
scrape_configs:
- job_name: 'nodeexporter'
scrape_interval: 5s
static_configs:
- targets: ['nodeexporter:9100']
To monitor other servers, we need to add external_labels:
external_labels:
monitor: 'docker-host-alpha'
scrape_configs:
- job_name: 'ApiExporter'
scrape_interval: 5s
static_configs:
- targets: ['<IP Server need Monitor>:Port']
Simple dashboards on Grafana:
Node Exporter
Monitor Services
Docker Host
Set up the Loki config in the loki-config file.
To configure log scraping with Promtail, create the file promtail-config.yaml and set it up:
scrape_configs:
  - job_name: container_logs
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log
Create a simple tool to generate logs and containerize it. Navigate to the entrypoint.sh file and run a test:
➜ domolo git:(master) cd fake-logs
➜ fake-logs git:(master) ✗ chmod +x entrypoint.sh
➜ fake-logs git:(master) ✗ ./entrypoint.sh
2022-12-08T13:20:00Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:00Z DEBUG This is a debug log that shows a log that can be ignored.
2022-12-08T13:20:01Z WARN A warning that should be ignored is usually at this level and should be actionable.
2022-12-08T13:20:03Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:05Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:09Z INFO This is less important than debug log and is often used to provide context in the current task.
2022-12-08T13:20:13Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:15Z DEBUG This is a debug log that shows a log that can be ignored.
2022-12-08T13:20:16Z INFO This is less important than debug log and is often used to provide context in the current task.
2022-12-08T13:20:17Z INFO This is less important than debug log and is often used to provide context in the current task.
...
Then, add the fake-logs service to docker-compose.yml:
# Fake Logs
flogs:
image: ductn4/flog:v1 # Set your image name :)
build:
context: ./fake-logs
dockerfile: Dockerfile
container_name: fake-logs
restart: always
networks:
- monitor-net
labels:
org.label-schema.group: "monitoring"
Or check out docker-compose.with-flogs.yml and run: docker-compose -f docker-compose.with-flogs.yml up -d
Navigate to Grafana and open Explore. There we can select labels and view logs. For example, select the label container and view the logs of the fake-logs container.
There are more logs to explore this way: system logs, other containers, and so on.
Give a ⭐ if you like this application ❤️
All contributions are welcome in this project!
The MIT License (MIT). Please see LICENSE for more information.
mtail is a tool for extracting metrics from application logs to be exported into a timeseries database or timeseries calculator for alerting and dashboarding.
It fills a monitoring niche by being the glue between applications that do not export their own internal state (other than via logs) and existing monitoring systems, such that system operators do not need to patch those applications to instrument them or write custom extraction code for every such application.
The extraction is controlled by mtail programs which define patterns and actions:
# simple line counter
counter lines_total
/$/ {
lines_total++
}
Metrics are exported for scraping by a collector as JSON or Prometheus format over HTTP, or can be periodically sent to a collectd, StatsD, or Graphite collector socket.
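For example, when collecting with Prometheus, a scrape job along these lines should work (a sketch: the job name is arbitrary, and the target assumes mtail's default HTTP port of 3903; adjust it to your deployment):

scrape_configs:
  - job_name: mtail
    static_configs:
      - targets:
          - localhost:3903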
Read the programming guide if you want to learn how to write mtail programs.
Ask general questions on the users mailing list: https://groups.google.com/g/mtail-users
There are various ways of installing mtail.
Precompiled binaries for released versions are available in the Releases page on Github. Using the latest production release binary is the recommended way of installing mtail.
Windows, OSX and Linux binaries are available.
The simplest way to get mtail is to go get it directly:
go get github.com/google/mtail/cmd/mtail
This assumes you have a working Go environment with a recent Go version. Usually mtail is tested to work with the last two minor versions (e.g. Go 1.12 and Go 1.11).
If you want to fetch everything, you need to turn on Go Modules to succeed because of the way Go Modules have changed the way go get treats source trees with no Go code at the top level.
GO111MODULE=on go get -u github.com/google/mtail
cd $GOPATH/src/github.com/google/mtail
make install
If you develop the compiler you will need some additional tools, like goyacc, to be able to rebuild the parser.
See the Build instructions for more details.
A Dockerfile is included in this repository for local development as an alternative to installing Go in your environment; it takes care of all the build dependency installation, if you don't care for that.
mtail works best when it is paired with a timeseries-based calculator and alerting tool, like Prometheus.
So what you do is you take the metrics from the log files and you bring them down to the monitoring system?
It deals with the instrumentation so the engineers don't have to! It has the extraction skills! It is good at dealing with log files!!
Full documentation at http://google.github.io/mtail/
Read more about writing mtail programs.
Read more about hacking on mtail.
Read more about deploying mtail and your programs in a monitoring environment.
After that, if you have any questions, please email (and optionally join) the mailing list: https://groups.google.com/forum/#!forum/mtail-users or file a new issue.
Author: Google
Source Code: https://github.com/google/mtail
License: Apache-2.0 license
Log activity inside your Laravel app
The spatie/laravel-activitylog package provides easy-to-use functions to log the activities of the users of your app. It can also automatically log model events. The package stores all activity in the activity_log table.
Here's a demo of how you can use it:
activity()->log('Look, I logged something');
You can retrieve all activity using the Spatie\Activitylog\Models\Activity model.
Activity::all();
Here's a more advanced example:
activity()
->performedOn($anEloquentModel)
->causedBy($user)
->withProperties(['customProperty' => 'customValue'])
->log('Look, I logged something');
$lastLoggedActivity = Activity::all()->last();
$lastLoggedActivity->subject; //returns an instance of an eloquent model
$lastLoggedActivity->causer; //returns an instance of your user model
$lastLoggedActivity->getExtraProperty('customProperty'); //returns 'customValue'
$lastLoggedActivity->description; //returns 'Look, I logged something'
Here's an example of event logging.
$newsItem->name = 'updated name';
$newsItem->save();
//updating the newsItem will cause the logging of an activity
$activity = Activity::all()->last();
$activity->description; //returns 'updated'
$activity->subject; //returns the instance of NewsItem that was saved
Calling $activity->changes() will return this array:
[
'attributes' => [
'name' => 'updated name',
'text' => 'Lorum',
],
'old' => [
'name' => 'original name',
'text' => 'Lorum',
],
];
You'll find the documentation on https://spatie.be/docs/laravel-activitylog/introduction.
Find yourself stuck using the package? Found a bug? Do you have general questions or suggestions for improving the activity log? Feel free to create an issue on GitHub, we'll try to address it as soon as possible.
You can install the package via composer:
composer require spatie/laravel-activitylog
The package will automatically register itself.
You can publish the migration with
php artisan vendor:publish --provider="Spatie\Activitylog\ActivitylogServiceProvider" --tag="activitylog-migrations"
Note: The default migration assumes you are using integers for your model IDs. If you are using UUIDs, or some other format, adjust the format of the subject_id and causer_id fields in the published migration before continuing.
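For instance, a sketch of that adjustment in the published migration using Laravel's schema builder (assuming your models use UUID keys):

$table->nullableUuidMorphs('subject', 'subject');   // replaces the default morphs columns for subject_id/subject_type
$table->nullableUuidMorphs('causer', 'causer');     // replaces the default morphs columns for causer_id/causer_type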
After publishing the migration you can create the activity_log table by running the migrations:
php artisan migrate
You can optionally publish the config file with:
php artisan vendor:publish --provider="Spatie\Activitylog\ActivitylogServiceProvider" --tag="activitylog-config"
Please see CHANGELOG for more information about recent changes.
Please see UPGRADING for details.
composer test
Please see CONTRIBUTING for details.
If you've found a bug regarding security please mail security@spatie.be instead of using the issue tracker.
And a special thanks to Caneco for the logo and Ahmed Nagi for all the work he put in v4.
Author: Spatie
Source Code: https://github.com/spatie/laravel-activitylog
License: MIT license
Thanos is a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added seamlessly on top of existing Prometheus deployments.
Thanos is a CNCF Incubating project.
Thanos leverages the Prometheus 2.0 storage format to cost-efficiently store historical metric data in any object storage while retaining fast query latencies. Additionally, it provides a global query view across all Prometheus installations and can merge data from Prometheus HA pairs on the fly.
Concretely, the aims of the project are unlimited metric storage, a global query view, and seamless integration on top of existing Prometheus deployments. Thanos can be deployed either with a Sidecar or with Receive; architecture diagrams for both deployment models are available in the project documentation.
The philosophy of Thanos and our community is borrowing much from UNIX philosophy and the golang programming language.
The main branch should be stable and usable. Every commit to main builds a docker image named main-<date>-<sha> in quay.io/thanos/thanos and thanosio/thanos on Docker Hub (mirror).
We also perform minor releases every 6 weeks. During each release, we build tarballs for major platforms and release docker images.
See release process docs for details.
Contributions are very welcome! See our CONTRIBUTING.md for more information.
Thanos is an open source project and we value and welcome new contributors and members of the community. Here are ways to get in touch with the community:
See the Adopters List.
See MAINTAINERS.md.
Author: Thanos-io
Source Code: https://github.com/thanos-io/thanos
License: Apache-2.0 license
LibreNMS is an auto-discovering PHP/MySQL/SNMP based network monitoring system which includes support for a wide range of network hardware and operating systems including Cisco, Linux, FreeBSD, Juniper, Brocade, Foundry, HP and many more.
We intend LibreNMS to be a viable project and community.
The Debian Social Contract will be the basis of our priority system, and mutual respect is the basis of our behavior towards others.
Documentation can be found in the doc directory or docs.librenms.org, including instructions for installing and contributing.
There are many ways you can participate in the project.
You can try LibreNMS by downloading a VM image. Currently, an Ubuntu-based image is supplied and has been tested with VirtualBox.
Download one of the VirtualBox images we have available, documentation is provided which details login credentials and setup details.
Author: librenms
Source Code: https://github.com/librenms/librenms
License: View license
It is a self-hosted monitoring tool like "Uptime Robot".
Try it!
It is a temporary live demo; all data will be deleted after 10 minutes. The server is located in Tokyo, so if you live far from there, it may affect your experience. I suggest that you install and try it out for the best demo experience.
VPS is sponsored by Uptime Kuma sponsors on Open Collective! Thank you so much!
docker run -d --restart=always -p 3001:3001 -v uptime-kuma:/app/data --name uptime-kuma louislam/uptime-kuma:1
⚠️ Please use a local volume only. Other types such as NFS are not supported.
Browse to http://localhost:3001 after starting.
Required tools: Node.js, npm, and git.
# Update your npm to the latest version
npm install npm -g
git clone https://github.com/louislam/uptime-kuma.git
cd uptime-kuma
npm run setup
# Option 1. Try it
node server/server.js
# (Recommended) Option 2. Run in background using PM2
# Install PM2 if you don't have it:
npm install pm2 -g && pm2 install pm2-logrotate
# Start Server
pm2 start server/server.js --name uptime-kuma
Browse to http://localhost:3001 after starting.
More useful PM2 Commands
# If you want to see the current console output
pm2 monit
# If you want to add it to startup
pm2 save && pm2 startup
If you need more options or need to browse via a reverse proxy, please read:
https://github.com/louislam/uptime-kuma/wiki/%F0%9F%94%A7-How-to-Install
Please read:
https://github.com/louislam/uptime-kuma/wiki/%F0%9F%86%99-How-to-Update
I will assign requests/issues to the next milestone.
https://github.com/louislam/uptime-kuma/milestones
Project Plan:
https://github.com/users/louislam/projects/4/views/1
Screenshots of the light mode, status page, settings page, and a Telegram notification sample are available in the repository.
If you love this project, please consider giving me a ⭐.
You can discuss or ask for help in issues.
My Reddit account: u/louislamlam. You can mention me if you ask a question on Reddit: r/UptimeKuma.
There are a lot of pull requests right now, but I don't have time to test them all.
If you want to help, you can check this: https://github.com/louislam/uptime-kuma/wiki/Test-Pull-Requests
Check out the latest beta release here: https://github.com/louislam/uptime-kuma/releases
If you want to report a bug or request a new feature, feel free to open a new issue.
If you want to translate Uptime Kuma into your language, please read: https://github.com/louislam/uptime-kuma/tree/master/src/languages
Feel free to correct my grammar in this README, source code, or wiki, as my mother language is not English and my grammar is not that great.
If you want to modify Uptime Kuma, please read this guide and follow the rules here: https://github.com/louislam/uptime-kuma/blob/master/CONTRIBUTING.md
Author: louislam
Source Code: https://github.com/louislam/uptime-kuma
License: MIT license
Node Application Metrics monitoring and profiling agent
Node Application Metrics instruments the Node.js runtime for performance monitoring, providing the monitoring data via an API. Additionally the data can be visualized by using the Node Application Metrics Dashboard.
The data can also be visualized in Eclipse using the IBM Monitoring and Diagnostics Tools - Health Center client. Profiling data is available in Health Center, but is not yet available in the Dashboard. See https://www.ibm.com/developerworks/java/jdk/tools/healthcenter/ for more details.
Node Application Metrics provides the following built-in data collection sources:
Source | Description |
---|---|
Environment | Machine and runtime environment information |
CPU | Process and system CPU |
Memory | Process and system memory usage |
GC | Node/V8 garbage collection statistics |
Event Loop | Event loop latency information |
Loop | Event loop timing metrics |
Function profiling | Node/V8 function profiling (disabled by default) |
HTTP | HTTP request calls made of the application |
HTTP Outbound | HTTP requests made by the application |
socket.io | WebSocket data sent and received by the application |
LevelDB | LevelDB queries made by the application |
MySQL | MySQL queries made by the application |
MongoDB | MongoDB queries made by the application |
PostgreSQL | PostgreSQL queries made by the application |
MQTT | MQTT messages sent and received by the application |
MQLight | MQLight messages sent and received by the application |
Memcached | Data that is stored or manipulated in Memcached |
OracleDB | OracleDB queries made by the application |
Oracle | Oracle queries made by the application |
StrongOracle | StrongOracle database queries made by the application |
Redis | Redis commands issued by the application |
Riak | Riak methods called by the application |
Request tracking | A tree of application requests, events and optionally trace (disabled by default) |
Function trace | Tracing of application function calls that occur during a request (disabled by default) |
Our testing has shown that the performance overhead in terms of processing is minimal, adding less than 0.5% to the CPU usage of your application. The additional memory required is around 20 MB to gather information about your system and application.
We gathered this information by monitoring the sample application Acme Air. We used MongoDB as our datastore and used JMeter to drive load through the program. We have performed this testing with Node.js version 6.10.3.
Appmetrics uses node-gyp to compile and build local binary libraries to enhance execution performance. If the following compilation and build logs contain errors, make sure you have the node-gyp pre-requisites installed (https://github.com/nodejs/node-gyp#installation). If you have them and the build still has errors, see if there are any related issues at https://github.com/RuntimeTools/appmetrics/issues. If there aren't, feel free to open a new issue to report the bug.
You can get Node Application Metrics from 3 different places; the simplest is from the npm registry (npm install appmetrics; requires a compiler).
Node Application Metrics can be configured in two ways: by using the configuration file described below, or via a call to configure(options).
Node Application Metrics comes with a configuration file inside the module installation directory (.../node_modules/appmetrics/appmetrics.properties). This can be used to configure connection options, logging and data source options. Node Application Metrics will attempt to load appmetrics.properties from a number of locations, in order of precedence.
The default configuration has minimal logging enabled, will attempt to send data to a local MQTT server on the default port and has method profiling disabled.
Many of the options provide configuration of the Health Center core agent library and are documented in the Health Center documentation: Health Center configuration properties.
The following options are specific to appmetrics:
com.ibm.diagnostics.healthcenter.data.profiling=[off|on]
Specifies whether method profiling data will be captured. The default value is off. This specifies the value at start-up; it can be enabled and disabled dynamically as the application runs, either by a monitoring client or the API.

In previous versions appmetrics came with an executable, node-hc, which could be used instead of the node command to run your application and load and start appmetrics. This has been removed in version 4.0.0; instead you can use:
$ node --require appmetrics/start app.js
to preload and start appmetrics, or in Node.js from versions 8.0.0 and 6.12.0 onwards, use the NODE_OPTIONS environment variable:
$ export NODE_OPTIONS="--require appmetrics/start"
If you locally install this module with npm then you will additionally have access to the monitoring data via the appmetrics API (see API Documentation).
To load appmetrics and get the monitoring API object, add the following to the start-up code for your application:
var appmetrics = require('appmetrics');
var monitoring = appmetrics.monitor();
The call to appmetrics.monitor() starts the data collection agent, making the data available via the API and to the Health Center client via MQTT.
You should start your application using the node command as usual (not node-hc).
You must call require('appmetrics'); before the require statements for any npm modules you want to monitor. Appmetrics must be initialized first so that it can instrument modules for monitoring as they are loaded. If this is a problem due to the structure of your application, you can require the module on the node command line by using -r or --require, or by setting NODE_OPTIONS as described above to make sure it is pre-loaded.
Once you have loaded appmetrics you can then use the monitoring object to register callbacks and request information about the application:
monitoring.on('initialized', function (env) {
    env = monitoring.getEnvironment();
    for (var entry in env) {
        console.log(entry + ': ' + env[entry]);
    }
});
monitoring.on('cpu', function (cpu) {
    console.log('[' + new Date(cpu.time) + '] CPU: ' + cpu.process);
});
Connecting to the Health Center client is not supported on z/OS. It requires the additional installation of an MQTT broker. The Node Application Metrics agent sends data to the MQTT broker specified in the appmetrics.properties file or set via a call to configure(options). Installation and configuration documentation for the Health Center client is available from the Health Center documentation in IBM Knowledge Center.
Note that both the API and the Health Center client can be used at the same time and will receive the same data. Use of the API requires a local install and application modification (see Modifying your application to use the local installation).
Further information regarding the use of the Health Center client with Node Application Metrics can be found on the appmetrics wiki: Using Node Application Metrics with the Health Center client.
configure(options)
Sets various properties on the appmetrics monitoring agent. If the agent has already been started, this function does nothing.
- options (Object): key value pairs of properties and values to be set on the monitoring agent.

Property name | Property value type | Property description |
---|---|---|
applicationID | string | Specifies a unique identifier for the mqtt connection |
mqtt | string['off'|'on'] | Specifies whether the monitoring agent sends data to the mqtt broker. The default value is 'on' |
mqttHost | string | Specifies the host name of the mqtt broker |
mqttPort | string['[0-9]*'] | Specifies the port number of the mqtt broker |
profiling | string['off'|'on'] | Specifies whether method profiling data will be captured. The default value is 'off' |
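As an illustration, a minimal sketch of configuring the agent programmatically before it starts (the option values are examples only):

var appmetrics = require('appmetrics');
appmetrics.configure({
    mqtt: 'off',       // do not send data to an MQTT broker
    profiling: 'off'   // leave method profiling disabled at start-up
});
var monitoring = appmetrics.monitor(); // starts the agent with the settings above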
start()
Starts the appmetrics monitoring agent. If the agent is already running this function does nothing.

stop()
Stops the appmetrics monitoring agent. If the agent is not running this function does nothing.
enable(type, config)
Enable data generation of the specified data type. Cannot be called until the agent has been started by calling start() or monitor().
- type (String): the type of event to start generating data for. Values of eventloop, profiling, http, http-outbound, mongo, socketio, mqlight, postgresql, mqtt, mysql, redis, riak, memcached, oracledb, oracle, strong-oracle, requests and trace are currently supported. As trace is added to request data, both requests and trace must be enabled in order to receive trace data.
- config (Object, optional): configuration map to be added for the data type being enabled (see setConfig for more information).

The following data types are disabled by default: profiling, requests, trace.
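For example, a sketch of enabling trace data once the agent is running (both types are needed, as noted above):

var appmetrics = require('appmetrics');
var monitoring = appmetrics.monitor(); // starts the agent
appmetrics.enable('requests');
appmetrics.enable('trace'); // trace is attached to request data, so requests must be enabled too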
disable(type)
Disable data generation of the specified data type. Cannot be called until the agent has been started by calling start() or monitor().
- type (String): the type of event to stop generating data for. Values of eventloop, profiling, http, mongo, socketio, mqlight, postgresql, mqtt, mysql, redis, riak, memcached, oracledb, oracle, strong-oracle, requests and trace are currently supported.

setConfig(type, config)
Set the configuration to be applied to a specific data type. The configuration available is specific to the data type.
- type (String): the type of event to apply the configuration to.
- config (Object): key value pairs of configurations to be applied to the specified event. The available configuration options are as follows:

Type | Configuration key | Configuration Value |
---|---|---|
http | filters | (Array) of URL filter Objects, each consisting of a pattern to match and a replacement |
requests | excludeModules | (Array) of String names of modules to exclude from request tracking. |
trace | includeModules | (Array) of String names for modules to include in function tracing. By default only non-module functions are traced when trace is enabled. |
advancedProfiling | threshold | (Number) millisecond run time of an event loop cycle that will trigger profiling |
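For example, a sketch of excluding a module from request tracking (the module name here is hypothetical):

appmetrics.setConfig('requests', {
    excludeModules: ['express'] // skip request tracking for this module
});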
emit(type, data)
Allows custom monitoring events to be added into the Node Application Metrics agent.
- type (String): the name you wish to use for the data. A subsequent event of that type will be raised, allowing callbacks to be registered for it.
- data (Object): the data to be made available with the event. The object must not contain circular references, and by convention should contain a time value representing the milliseconds when the event occurred.

Not supported on z/OS: dumps the V8 heap via heapdump. For more information, see https://github.com/bnoordhuis/node-heapdump/blob/master/README.md
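Returning to emit(type, data), a minimal sketch of raising and consuming a custom event (the event name and payload are made up):

monitoring.on('queue-depth', function (data) {
    console.log('queue depth at ' + new Date(data.time) + ': ' + data.depth);
});
appmetrics.emit('queue-depth', { time: Date.now(), depth: 42 });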
monitor()
Creates a Node Application Metrics agent client instance. This can subsequently be used to get environment data and subscribe to data events. This function will start the appmetrics monitoring agent if it is not already running.
getEnvironment()
Requests an object containing all of the available environment information for the running application. This will not contain all possible environment information until an 'initialized' event has been received.
Event: 'cpu'
Not supported on z/OS. Emitted when a CPU monitoring sample is taken.
- data (Object): the data from the CPU sample:
  - time (Number): the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
  - process (Number): the percentage of CPU used by the Node.js application itself. This is a value between 0.0 and 1.0.
  - system (Number): the percentage of CPU used by the system as a whole. This is a value between 0.0 and 1.0.

Event: 'eventloop'
Emitted every 5 seconds, summarising sample based information of the event loop latency.
- data (Object): the data from the event loop sample:
  - time (Number): the milliseconds when the event was emitted. This can be converted to a Date using new Date(data.time).
  - latency.min (Number): the shortest sampled latency, in milliseconds.
  - latency.max (Number): the longest sampled latency, in milliseconds.
  - latency.avg (Number): the average sampled latency, in milliseconds.

Event: 'gc'
Emitted when a garbage collection (GC) cycle occurs in the underlying V8 runtime.
- data (Object): the data from the GC sample:
  - time (Number): the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
  - type (String): the type of GC cycle, either:
    - 'M': MarkSweepCompact, aka "major"
    - 'S': Scavenge, aka "minor"
    - 'I': IncrementalMarking, aka "incremental" (only exists on node 5.x and greater)
    - 'W': ProcessWeakCallbacks, aka "weakcb" (only exists on node 5.x and greater)
  - size (Number): the size of the JavaScript heap in bytes.
  - used (Number): the amount of memory used on the JavaScript heap in bytes.
  - duration (Number): the duration of the GC cycle in milliseconds.

Event: 'initialized'
Emitted when all possible environment variables have been collected. Use appmetrics.monitor.getEnvironment() to access the available environment variables.

Event: 'loop'
Emitted every 5 seconds, summarising event tick information in the time interval.
- data (Object): the data from the event loop sample:
  - count (Number): the number of event loop ticks in the last interval.
  - minimum (Number): the shortest (i.e. fastest) tick in milliseconds.
  - maximum (Number): the longest (slowest) tick in milliseconds.
  - average (Number): the average tick time in milliseconds.
  - cpu_user (Number): the percentage of 1 CPU used by the event loop thread in user code in the last interval. This is a value between 0.0 and 1.0.
  - cpu_system (Number): the percentage of 1 CPU used by the event loop thread in system code in the last interval. This is a value between 0.0 and 1.0.

Event: 'memory'
Emitted when a memory monitoring sample is taken.
- data (Object): the data from the memory sample:
  - time (Number): the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
  - physical_total (Number): the total amount of RAM available on the system in bytes.
  - physical_used (Number): the total amount of RAM in use on the system in bytes.
  - physical_free (Number): the total amount of free RAM available on the system in bytes.
  - virtual (Number): the memory address space used by the Node.js application in bytes.
  - private (Number): the amount of memory used by the Node.js application that cannot be shared with other processes, in bytes.
  - physical (Number): the amount of RAM used by the Node.js application in bytes.

Event: 'profiling'
Emitted when a profiling sample is available from the underlying V8 runtime.
- data (Object): the data from the profiling sample:
  - time (Number): the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
  - functions (Array): an array of functions that ran during the sample. Each array entry consists of:
    - self (Number): the ID for this function.
    - parent (Number): the ID for this function's caller.
    - name (String): the name of this function.
    - file (String): the file in which this function is defined.
    - line (Number): the line number in the file.
    - count (Number): the number of samples for this function.

Event: 'http'
Emitted when a HTTP/HTTPS request is made of the application.
- data (Object): the data from the HTTP(S) request:
  - time (Number): the milliseconds when the request was made. This can be converted to a Date using new Date(data.time).
  - method (String): the HTTP(S) method used for the request.
  - url (String): the URL on which the request was made.
  - duration (Number): the time taken for the HTTP(S) request to be responded to in ms.
  - header (String): the response header for the HTTP(S) request.
  - contentType (String): the content type of the HTTP(S) request.
  - requestHeader (Object): the request header for the HTTP(S) request.

Event: 'http-outbound'
Emitted when the application makes an outbound HTTP/HTTPS request.
- data (Object): the data from the HTTP(S) request:
  - time (Number): the milliseconds when the request was made. This can be converted to a Date using new Date(data.time).
  - method (String): the HTTP(S) method used for the request.
  - url (String): the URL on which the request was made.
  - contentType (String): the HTTP(S) response content-type.
  - statusCode (String): the HTTP response status code.
  - duration (Number): the time taken for the HTTP(S) request to be responded to in ms.

Event: 'leveldown'
Emitted when a LevelDB query is made using the leveldown module.
- data (Object): the data from the LevelDB query:
  - time (Number): the time in milliseconds when the LevelDB query was made. This can be converted to a Date using new Date(data.time).
  - method (String): the leveldown method being used.
  - key (Object): the key being used for a call to get, put or del (undefined for other methods).
  - value (Object): the value being added to the LevelDB database using the put method (undefined for other methods).
  - opCount (Number): the number of operations carried out by a batch method (undefined for other methods).
  - duration (Number): the time taken for the LevelDB query to be responded to in ms.

Event: 'loopback-datasource-juggler'
Emitted when a function is called on the loopback-datasource-juggler module.
- data (Object): the data from the loopback-datasource-juggler event:
  - time (Number): the time in milliseconds when the event occurred. This can be converted to a Date using new Date(data.time).
  - method (String): the function the juggler has executed.
  - duration (Number): the time taken for the operation to complete.

Event: 'memcached'
Emitted when data is stored, retrieved or modified in Memcached using the memcached module.
- data (Object): the data from the memcached event:
  - time (Number): the milliseconds when the memcached event occurred. This can be converted to a Date using new Date(data.time).
  - method (String): the method used in the memcached client, e.g. set, get, append, delete, etc.
  - key (String): the key associated with the data.
  - duration (Number): the time taken for the operation on the memcached data to occur.

Event: 'mongo'
Emitted when a MongoDB query is made using the mongodb module.
- data (Object): the data from the MongoDB request:
  - time (Number): the milliseconds when the MongoDB query was made. This can be converted to a Date using new Date(data.time).
  - query (String): the query made of the MongoDB database.
  - duration (Number): the time taken for the MongoDB query to be responded to in ms.
  - method (String): the executed method for the query, such as find, update.
  - collection (String): the MongoDB collection name.

Event: 'mqlight'
Emitted when a MQLight message is sent or received.
- data (Object): the data from the MQLight event:
  - time (Number): the time in milliseconds when the MQLight event occurred. This can be converted to a Date using new Date(data.time).
  - clientid (String): the id of the client.
  - data (String): the data sent if a 'send' or 'message', undefined for other calls. Truncated if longer than 25 characters.
  - method (String): the name of the call or event (will be one of 'send' or 'message').
  - topic (String): the topic on which a message is sent/received.
  - qos (Number): the QoS level for a 'send' call, undefined if not set.
  - duration (Number): the time taken in milliseconds.

Event: 'mqtt'
Emitted when a MQTT message is sent or received.
- data (Object): the data from the MQTT event:
  - time (Number): the time in milliseconds when the MQTT event occurred. This can be converted to a Date using new Date(data.time).
  - method (String): the name of the call or event (will be one of 'publish' or 'message').
  - topic (String): the topic on which a message is published or received.
  - qos (Number): the QoS level for the message.
  - duration (Number): the time taken in milliseconds.

Event: 'mysql'
Emitted when a MySQL query is made using the mysql module.
- data (Object): the data from the MySQL query:
  - time (Number): the milliseconds when the MySQL query was made. This can be converted to a Date using new Date(data.time).
  - query (String): the query made of the MySQL database.
  - duration (Number): the time taken for the MySQL query to be responded to in ms.

Event: 'oracle'
Emitted when a query is executed using the oracle module.
- data (Object): the data from the Oracle query:
  - time (Number): the milliseconds when the Oracle query was made. This can be converted to a Date using new Date(data.time).
  - query (String): the query made of the Oracle database.
  - duration (Number): the time taken for the Oracle query to be responded to in ms.

Event: 'oracledb'
Emitted when a query is executed using the oracledb module.
- data (Object): the data from the OracleDB query:
  - time (Number): the milliseconds when the OracleDB query was made. This can be converted to a Date using new Date(data.time).
  - query (String): the query made of the OracleDB database.
  - duration (Number): the time taken for the OracleDB query to be responded to in ms.

Event: 'postgresql'
Emitted when a PostgreSQL query is made to the pg module.
- data (Object): the data from the PostgreSQL query:
  - time (Number): the milliseconds when the PostgreSQL query was made. This can be converted to a Date using new Date(data.time).
  - query (String): the query made of the PostgreSQL database.
  - duration (Number): the time taken for the PostgreSQL query to be responded to in ms.

Event: 'redis'
Emitted when a Redis command is sent.
- data (Object): the data from the Redis event:
  - time (Number): the time in milliseconds when the redis event occurred. This can be converted to a Date using new Date(data.time).
  - cmd (String): the Redis command sent to the server, or 'batch.exec'/'multi.exec' for groups of commands sent using batch/multi calls.
  - duration (Number): the time taken in milliseconds.

Event: 'riak'
Emitted when a Riak method is called using the basho-riak-client module.
- data (Object): the data from the Riak event:
  - time (Number): the time in milliseconds when the riak event occurred. This can be converted to a Date using new Date(data.time).
  - method (String): the Riak method called.
  - options (Object): the options parameter passed to Riak.
  - command (Object): the command parameter used in the execute method.
  - query (String): the query parameter used in the mapReduce method.
  - duration (Number): the time taken in milliseconds.

Event: 'socketio'
Emitted when WebSocket data is sent or received by the application using socketio.
- data (Object): the data from the socket.io request:
  - time (Number): the milliseconds when the event occurred. This can be converted to a Date using new Date(data.time).
  - method (String): whether the event is a broadcast or emit from the application, or a receive from a client.
  - event (String): the name used for the event.
  - duration (Number): the time taken for the event to be sent or for a received event to be handled.

Event: 'strong-oracle'
Emitted when a query is executed using the strong-oracle module.
- data (Object): the data from the Strong Oracle query:
  - time (Number): the milliseconds when the Strong Oracle query was made. This can be converted to a Date using new Date(data.time).
  - query (String): the query made of the database.
  - duration (Number): the time taken for the Strong Oracle query to be responded to in ms.

Event: 'request'
Requests are a special type of event emitted by appmetrics. All the probes named above can also create request events if requests are enabled. However, requests are nested within a root incoming request (usually http). Request events are disabled by default.
- data (Object): the data from the request:
  - time (Number): the milliseconds when the request occurred. This can be converted to a Date using new Date(data.time).
  - type (String): the type of the request event. This is the name of the probe that sent the request data, e.g. http, socketio etc.
  - name (String): the name of the request event. This is the request task, e.g. the url, or the method being used.
  - request (Object): the detailed data for the root request event:
    - type (String): the type of the request event. This is the name of the probe that sent the request data, e.g. http, socketio etc.
    - name (String): the name of the request event. This is the request task, e.g. the url, or the method being used.
    - context (Object): additional context data (usually contains the same data as the associated non-request metric event).
    - stack (String): an optional stack trace for the event call.
    - children (Array): an array of child request events that occurred as part of the overall request event. Child request events may include function trace entries, which will have a type of null.
    - duration (Number): the time taken for the request to complete in ms.
  - duration (Number): the time taken for the overall request to complete in ms.

The Node Application Metrics agent supports a range of runtime environments where a Node.js runtime is available.
Before running npm install appmetrics, ensure the environment variable CC=gcc is set where a compiler is required.
Find below some possible problem scenarios and corresponding diagnostic steps. Updates to troubleshooting information will be made available on the appmetrics wiki: Troubleshooting. If these resources do not help you resolve the issue, you can open an issue on the Node Application Metrics appmetrics issue tracker.
By default, a message similar to the following will be written to console output when Node Application Metrics starts:
[Fri Aug 21 09:36:58 2015] com.ibm.diagnostics.healthcenter.loader INFO: Node Application Metrics 1.0.1-201508210934 (Agent Core 3.0.5.201508210934)
This error indicates there was a problem while loading the native part of the module or one of its dependent libraries. On Windows, appmetrics.node depends on a particular version of the C runtime library, and if it cannot be found this error is the likely result.
Check:
- Does the appmetrics.node file exist in the indicated location? If not, try reinstalling the module.
- For version 1.0.0 on Windows: are msvcr100.dll and msvcp100.dll installed on your Windows system, and do they match the bitness (32-bit or 64-bit) of your Node.js runtime environment? If not, you may be able to install them with the Visual C++ Redistributable Packages for Visual Studio 2010 package from the Microsoft website.
- For version 1.0.1 on Windows: do msvcr120.dll and msvcp120.dll exist in the module installation directory (see Installation) and do they match the bitness of your Node.js runtime environment? If not, try reinstalling the module.

Note: On Windows, the global module installation directory might be shared between multiple Node.js runtime environments. This can cause problems with globally installed modules with native components, particularly if some of the Node.js runtime environments are 32-bit and others are 64-bit, because the native components will only work with those with matching bitness.
This error indicates there was a problem while loading the native part of the module or one of its dependent libraries. On non-Windows platforms, libagentcore.so depends on a particular (minimum) version of the C runtime library, and if it cannot be found this error is the result.
Check:
- Is the required version of libstdc++ installed? You may need to install or update a package in your package manager. If your OS does not supply a package at this version, you may have to install standalone software; consult the documentation or support forums for your OS.
- If you have the correct libstdc++ installed, ensure it is on the system library path, or use a method (such as setting the LD_LIBRARY_PATH environment variable on Linux, or the LIBPATH environment variable on AIX) to add the library to the search path.
If collection is enabled, an absence of method profiling data from a Node.js application could be caused by the type of tasks that are being run by your application -- it may be running long, synchronous tasks that prevent collection events from being scheduled on the event loop.
If a task uses the Node.js thread exclusively then shuts down the Node.js runtime environment, the Health Center agent may not get the opportunity to obtain any profiling data. An example of such an application is the Octane JavaScript benchmark suite, which loads the CPU continuously rather than dividing the load across multiple units of work.
The source code for Node Application Metrics is available in the appmetrics project. Information on working with the source code -- installing from source, developing, contributing -- is available on the appmetrics wiki.
This project is released under an Apache 2.0 open source license.
The npm package for this project uses a semver-parsable X.0.Z version number for releases, where X is incremented for breaking changes to the public API described in this document and Z is incremented for bug fixes and for non-breaking changes to the public API that provide new function.
Non-release versions of this project (for example on github.com/RuntimeTools/appmetrics) will use semver-parsable X.0.Z-dev.B version numbers, where X.0.Z is the last release with Z incremented and B is an integer. For further information on the development process go to the appmetrics wiki: Developing.
This module adopts the Module Long Term Support (LTS) policy, with the following End Of Life (EOL) dates:
Module Version | Release Date | Minimum EOL | EOL With | Status |
---|---|---|---|---|
V4.x.x | Jan 2018 | Dec 2019 | | Maintenance |
V5.x.x | May 2019 | Dec 2020 | | Current |
5.1.1 - Node13 support, bump dependency versions and a trace probe fix.
5.0.5 - zAppmetrics fixes, and bump agentcore for Alpine support.
5.0.3 - Bug fix.
5.0.2 - Bump level of omragentcore.
5.0.1 - Bug fix for incorrect timings on http request.
5.0.0 - Add Node 12 support, remove Node 6 support.
4.0.1 - Bug fix release including adding Node 10 support on Windows (Unix already working).
4.0.0 - Remove node-hc and add support for preloading.
3.1.3 - Packaging fix.
3.1.2 - Bug fixes.
3.1.1 - Node v6 on z/OS support.
3.1.0 - HTTPS probe added. Remove support for Node v7.
3.0.2 - Probe defect for Node 8 support.
3.0.1 - Packaging bug fix to allow build from source if binary not present.
3.0.0 - Remove express probe. Additional data available in http and request events. Code improvements.
2.0.1 - Remove support for Node.js 0.10, 0.12, 5. Add heapdump api call.
1.2.0 - Add file data collection capability and option configuration via api.
1.1.2 - Update agent core to 3.0.10, support Node.js v7.
1.1.1 - Fix node-gyp rebuild failure and don't force MQTT broker to on.
1.1.0 - Bug fixes, improved MongoDB data, updated dependencies, CPU watchdog feature.
1.0.13 - Express probe, strong-supervisor integration.
1.0.12 - Appmetrics now fully open sourced under Apache 2.0 license.
1.0.11 - Bug fixes.
1.0.10 - Bug fixes.
1.0.9 - Loopback and Riak support, bug fixes and update to agent core 3.0.9.
1.0.8 - Oracle support, bug fixes and api tests runnable using 'npm test'.
1.0.7 - StrongOracle support, support for installing with a proxy, expose MongoDB, MQLight and MySQL events to connectors.
1.0.6 - OracleDB support and bug fixes.
1.0.5 - Expose HTTP events to connectors (including MQTT).
1.0.4 - Redis, Leveldown, Postgresql, Memcached, MQLight and MQTT support, higher precision timings, and improved performance.
1.0.3 - Node.js v4 support.
1.0.2 - HTTP, MySQL, MongoDB, request tracking and function tracing support.
1.0.1 - Mac OS X support, io.js v2 support.
1.0.0 - First release.
Author: RuntimeTools
Source Code: https://github.com/RuntimeTools/appmetrics
License: Apache-2.0 license
1654762380
Bosun
Bosun is a time series alerting framework developed by Stack Exchange. Scollector is a metric collection agent.
bosun and scollector are found under the cmd directory. Run go build in the corresponding directories to build each project. There's also a Makefile available for most tasks.
For a full stack with all dependencies, run docker-compose up from the docker directory. Don't forget to rebuild images and containers if you change the code:
$ cd docker
$ docker-compose down
$ docker-compose up --build
If you only need the dependencies (Redis, OpenTSDB, HBase) and would like to run Bosun on your machine directly (e.g. to attach a debugger), you can bring up the dependencies with these three commands from the repository's root:
$ docker run -p 6379:6379 --name redis redis:6
$ docker build -f docker/opentsdb.Dockerfile -t opentsdb .
$ docker run -p 4242:4242 --name opentsdb opentsdb
The OpenTSDB container will be reachable at http://localhost:4242. Redis listens on its default port 6379. Bosun, if brought up in a Docker container, is available at http://localhost:8070.
Install:
- Run make deps and make testdeps to set up all dependencies.
- Run make generate when new static assets (like JS and CSS files) are added or changed.
The w.sh script will automatically build and run bosun in a loop. It will update itself when go/js/ts files change, and it runs in read-only mode, not sending any alerts.
$ cd cmd/bosun
$ ./w.sh
Go Version: See .travis.yml in the root of this repo for the version of Go to use. Generally speaking, you should be able to use newer versions of Go if you are able to build Bosun without error.
Miniprofiler: Press ALT-P to show miniprofiler. This allows you to see timings, as well as the raw queries sent to TSDBs.
Learn more at bosun.org.
Author: Bosun-monitor
Source Code: https://github.com/bosun-monitor/bosun
License: MIT license
1654740000
Balerter is a script-based alerting system.
In the example below we create one ClickHouse datasource, one scripts source, and one alert channel. In the script we run a query against ClickHouse, check the value, and fire (or switch off) the alert.
Pull the Docker image:
docker pull balerter/balerter
docker run \
-v /path/to/config.yml:/opt/config.yml \
-v /path/to/scripts:/opt/scripts \
-v /path/to/cert.crt:/home/user/db.crt \
balerter/balerter -config=/opt/config.yml
Config file config.yml
scripts:
  folder:
    - name: debug-folder
      path: /opt/scripts
      mask: '*.lua'
datasources:
  clickhouse:
    - name: ch1
      host: localhost
      port: 6440
      username: default
      password: secret
      database: default
      sslMode: verified_full
      sslCertPath: /home/user/db.crt
channels:
  slack:
    - name: slack1
      url: https://hooks.slack.com/services/hash
Sample script rps.lua
-- @cron */10 * * * * *
-- @name script1

local minRequestsRPS = 100

local log = require("log")
local ch1 = require("datasource.clickhouse.ch1")

local res, err = ch1.query("SELECT sum(requests) AS rps FROM some_table WHERE date = now()")
if err ~= nil then
    log.error("clickhouse 'ch1' query error: " .. err)
    return
end

local resultRPS = res[1].rps

if resultRPS < minRequestsRPS then
    alert.error("rps-min-limit", "Requests RPS are very small: " .. tostring(resultRPS))
else
    alert.success("rps-min-limit", "Requests RPS ok")
end
Also, you can write tests!
An example:
-- @test script1
-- @name script1-test
test = require('test')
local resp = {
{
rps = 10
}
}
test.datasource('clickhouse.ch1').on('query', 'SELECT sum(requests) AS rps FROM some_table WHERE date = now()').response(resp)
test.alert().assertCalled('error', 'rps-min-limit', 'Requests RPS are very small: 10')
test.alert().assertNotCalled('success', 'rps-min-limit', 'Requests RPS ok')
Full documentation is available at https://balerter.com.
The project is in active development. Features may have breaking changes at any time before v1.0.0.
Author: Balerter
Source Code: https://github.com/balerter/balerter
License: MIT license
1654691160
roumon
A goroutine monitor to keep track of active routines from within your favorite shell.
go install github.com/becheran/roumon@latest
Or download the pre-compiled binaries from the releases page.
Before starting roumon, the Go app which shall be monitored needs to be prepared to export pprof info via HTTP, which means it needs to run a pprof server.
Import pprof into your program:
import _ "net/http/pprof"
Run a webserver which will listen on a specific port:
go func() {
log.Println(http.ListenAndServe("localhost:6060", nil))
}()
Start your program and check that the pprof site is available in your web browser: http://localhost:6060/debug/pprof
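For reference, below is a minimal, self-contained sketch of a program prepared for roumon. The pprof import and server are taken from the snippets above; the busy-wait goroutines are hypothetical stand-ins, added only so there is something to observe:
package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
    "time"
)

func main() {
    // Serve the pprof endpoints that roumon polls.
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // Hypothetical workload: a few long-lived goroutines to observe.
    for i := 0; i < 4; i++ {
        go func() {
            for {
                time.Sleep(time.Second)
            }
        }()
    }

    select {} // block forever so the pprof server keeps serving
}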
Start roumon from your command line interface. Use optional arguments if needed.
For example, roumon -debug=logfile -host=192.168.10.1 -port=8081 will start the routine monitor for the pprof profiles exposed on 192.168.10.1:8081 and write a debug logfile to ./logfile.
Run roumon with -h or --help to see all command line argument options:
Usage of roumon:
-debug string
Path to debug file
-host string
The pprof server IP or hostname (default "localhost")
-port int
The pprof server port (default 6060)
From within the Terminal User Interface (TUI), hit F1 for help, and F10 or ctrl-c to stop the application.
Pull requests and issues are welcome!
Author: Becheran
Source Code: https://github.com/becheran/roumon
License: MIT license
1649085300
Gowl
Gowl is both a process management and a process monitoring tool. An infinite worker pool gives you the ability to control the pool and its processes and to monitor their status.
Using Gowl is easy. First, use go get to install the latest version of the library. This command will install the gowl library along with its dependencies:
go get -u github.com/hamed-yousefi/gowl
Next, include Gowl in your application:
import "github.com/hamed-yousefi/gowl"
Gowl has three main parts: Process, Pool, and Monitor. The process is the smallest unit of the project and is the piece of code that the developer must implement. To do that, Gowl provides an interface for injecting outside code into the pool. The process interface is as follows:
type Process interface {
    Start() error
    Name() string
    PID() PID
}
The process interface has three methods. The Start method contains the user code, and the pool workers use this method to run the process. The Name method returns the process name, which the monitor uses to provide reports. The PID method returns the process id; the process id is unique across the entire pool and is used by the pool and the monitor.
Let's take a look at an example:
type Document struct {
    content string
    hash    string
}

// Start hashes the document content; pool workers call this method.
func (d *Document) Start() error {
    hasher := sha1.New()
    hasher.Write([]byte(d.content))
    d.hash = base64.URLEncoding.EncodeToString(hasher.Sum(nil))
    return nil
}

func (d *Document) Name() string {
    return "hashing-process"
}

func (d *Document) PID() PID {
    return "p-1"
}

func (d *Document) Hash() string {
    return d.hash
}
As you can see, in this example, Document implements the Process interface. So now we can register it into the pool.
Creating a Gowl pool is very easy. You must use the NewPool(size int) function and pass the pool size to it. Pool size indicates the number of workers and the size of the underlying queue that the workers consume processes from. Look at the following example:
pool := gowl.NewPool(4)
In this example, Gowl will create a new instance of a Pool object with four workers and an underlying queue with the size of four.
To start Gowl, you must call the Start() method of the pool object. It will begin to create the workers, and the workers start listening to the queue to consume processes.
To register processes to the pool, you must use the Register(args ...process) method. Pass the processes to the register method, and it will create a new publisher to publish the process list to the queue. You can call it multiple times while the Gowl pool is running.
One of the most remarkable features of Gowl is the ability to control a process after registering it into the pool. You can kill a process before any worker runs it. Killing a process is simple; you only need the process id to do it.
pool.Kill(PID("p-909"))
Gowl is an infinite worker pool. However, you should have control over the pool and decide when you want to start it, register a new process on it, kill a process, or close the pool and terminate the workers. Gowl gives you the option to close the pool via the Close() method of the Pool object, as sketched below.
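Putting these pieces together, here is a minimal, hypothetical sketch of the whole life cycle, built only from the methods named above (NewPool, Start, Register, Kill, Close); exact signatures may differ in the library itself, and it assumes the Document type from the earlier example lives in the same package:
package main

import "github.com/hamed-yousefi/gowl"

func main() {
    // Four workers consuming from an underlying queue of size four.
    pool := gowl.NewPool(4)
    pool.Start()

    // Register processes; in a real program each process needs a unique PID.
    pool.Register(
        &Document{content: "first document"},
        &Document{content: "second document"},
    )

    // Kill a registered process by its id before a worker picks it up.
    pool.Kill(gowl.PID("p-909"))

    // Close the pool and terminate the workers when done.
    pool.Close()
}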
Every process management tool needs a monitoring system to expose its internal stats to the outside world. Gowl gives you a monitoring API to see process and worker stats. You can get the Monitor instance by calling the Monitor() method of the Pool. The monitor object is as follows:
type Monitor interface {
    PoolStatus() pool.Status
    Error(PID) error
    WorkerList() []WorkerName
    WorkerStatus(name WorkerName) worker.Status
    ProcessStats(pid PID) ProcessStats
}
The Monitor gives you the opportunity to get the pool status, process errors, the worker list, worker statuses, and process stats. With the Monitor API, you can create your own monitoring app with ease; for example, you can present pool and worker status in the console in real time, as in the sketch below:
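What follows is a hypothetical snippet built only from the Monitor interface above; it assumes the running pool from the earlier sketch and the standard fmt package, and the real API may differ in detail:
// Obtain the monitor from a running pool.
monitor := pool.Monitor()

// Overall pool status.
fmt.Println("pool status:", monitor.PoolStatus())

// Status of every worker in the pool.
for _, name := range monitor.WorkerList() {
    fmt.Printf("worker %s: %v\n", name, monitor.WorkerStatus(name))
}

// Stats for a single process by its id.
fmt.Printf("process p-1: %+v\n", monitor.ProcessStats(PID("p-1")))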
Author: Hamed-yousefi
Source Code: https://github.com/hamed-yousefi/gowl
License: MIT License
1644714660
gtop
System monitoring dashboard for terminal.
Requirements: Node.js with npm (gtop is distributed as an npm package). Install it globally with:
$ npm install gtop -g
To access the host machine's metrics from a Docker container, you need to assign the host's net and pid namespaces:
$ docker run --rm -it \
--name gtop \
--net="host" \
--pid="host" \
aksakalli/gtop
Start gtop with the gtop command:
$ gtop
To stop gtop use q, or ctrl+c in most shell environments.
You can sort the process table by pressing:
- p: Process ID
- c: CPU usage
- m: Memory usage
If you see question marks or other different characters, try to run it with these environment variables:
$ LANG=en_US.utf8 TERM=xterm-256color gtop
Author: Aksakalli
Source Code: https://github.com/aksakalli/gtop
License: MIT License
1640304000
We use Munin to monitor servers with different resources and critical operations. Munin contacts each server, asks for statistics, and monitors whether a service is running and has enough resources to work.