Nigel Uys

Ansible-prometheus: Deploy Prometheus Monitoring System

Ansible Role: prometheus

Description

Deploy the Prometheus monitoring system using Ansible.

Upgradability notice

When upgrading from version <= 2.4.0 of this role to >= 2.4.1, please turn off your Prometheus instance. More details are in the 2.4.1 release notes.

Requirements

  • Ansible >= 2.7 (It might work on previous versions, but we cannot guarantee it)
  • jmespath on deployer machine. If you are using Ansible from a Python virtualenv, install jmespath to the same virtualenv via pip.
  • gnu-tar on Mac deployer host (brew install gnu-tar)

Role Variables

All variables which can be overridden are stored in the defaults/main.yml file as well as in the table below.

Name | Default Value | Description
prometheus_version | 2.27.0 | Prometheus package version. Also accepts latest as a parameter. Only Prometheus 2.x is supported.
prometheus_skip_install | false | Prometheus installation tasks get skipped when set to true.
prometheus_binary_local_dir | "" | Allows using local packages instead of the ones distributed on GitHub. Takes a directory where the prometheus AND promtool binaries are stored on the host on which Ansible is run. This overrides the prometheus_version parameter.
prometheus_config_dir | /etc/prometheus | Path to the directory with the Prometheus configuration.
prometheus_db_dir | /var/lib/prometheus | Path to the directory with the Prometheus database.
prometheus_read_only_dirs | [] | Additional paths that Prometheus is allowed to read (useful for SSL certs outside of the config directory).
prometheus_web_listen_address | "0.0.0.0:9090" | Address on which Prometheus will be listening.
prometheus_web_config | {} | A Prometheus web config yaml for configuring TLS and auth.
prometheus_web_external_url | "" | External address at which Prometheus is available. Useful when behind a reverse proxy, e.g. http://example.org/prometheus.
prometheus_storage_retention | "30d" | Data retention period.
prometheus_storage_retention_size | "0" | Data retention limit by size.
prometheus_config_flags_extra | {} | Additional configuration flags passed to the Prometheus binary at startup.
prometheus_alertmanager_config | [] | Configuration responsible for pointing to where the alertmanagers are. This should be specified as a list in YAML format. It is compatible with the official alertmanager_config format.
prometheus_alert_relabel_configs | [] | Alert relabeling rules. This should be specified as a list in YAML format. It is compatible with the official alert_relabel_configs format.
prometheus_global | { scrape_interval: 60s, scrape_timeout: 15s, evaluation_interval: 15s } | Prometheus global config. Compatible with the official configuration format.
prometheus_remote_write | [] | Remote write. Compatible with the official configuration format.
prometheus_remote_read | [] | Remote read. Compatible with the official configuration format.
prometheus_external_labels | environment: "{{ ansible_fqdn | default(ansible_host) | default(inventory_hostname) }}" | Map of additional labels which will be added to any time series or alerts when communicating with external systems.
prometheus_targets | {} | Targets which will be scraped. A better example is provided on our demo site.
prometheus_scrape_configs | defaults/main.yml#L58 | Prometheus scrape jobs provided in the same format as in the official docs.
prometheus_config_file | "prometheus.yml.j2" | Variable used to provide a custom Prometheus configuration file in the form of an Ansible template.
prometheus_alert_rules | defaults/main.yml#L81 | Full list of alerting rules which will be copied to {{ prometheus_config_dir }}/rules/ansible_managed.rules. Alerting rules can also be provided by other files located in {{ prometheus_config_dir }}/rules/ which have a *.rules extension.
prometheus_alert_rules_files | defaults/main.yml#L78 | List of folders where Ansible will look for files containing alerting rules which will be copied to {{ prometheus_config_dir }}/rules/. Files must have a *.rules extension.
prometheus_static_targets_files | defaults/main.yml#L78 | List of folders where Ansible will look for files containing custom static target configuration files which will be copied to {{ prometheus_config_dir }}/file_sd/.

Relation between prometheus_scrape_configs and prometheus_targets

Short version

prometheus_targets is just a map used to create multiple files located in the "{{ prometheus_config_dir }}/file_sd" directory, where file names are composed from the top-level keys in that map with a .yml suffix. Those files store file_sd scrape target data, and they need to be read in prometheus_scrape_configs.

Long version

The part of the prometheus.yml configuration file which describes what is scraped by Prometheus is stored in prometheus_scrape_configs. This variable uses the same configuration options as described in the Prometheus docs.

Meanwhile, prometheus_targets is our way of adopting the Prometheus file_sd scrape type. It defines a map of files with their content. The top-level keys are the base names of files, which need to have their own scrape job in prometheus_scrape_configs, and the values are the content of those files.

All this means that you CAN use custom prometheus_scrape_configs with prometheus_targets set to {}. However, when you set anything in prometheus_targets, it needs to be mapped to prometheus_scrape_configs. If it isn't, you'll get an error in preflight checks.

Example

Let's look at our default configuration, which shows all features. By default we have this prometheus_targets:

prometheus_targets:
  node:  # This is a base file name. File is located in "{{ prometheus_config_dir }}/file_sd/<<BASENAME>>.yml"
    - targets:              #
        - localhost:9100    # All this is a targets section in file_sd format
      labels:               #
        env: test           #

Such a config will result in one file named node.yml being created in the {{ prometheus_config_dir }}/file_sd directory.

Next, this file needs to be loaded into the scrape config. Here is a modified version of our default prometheus_scrape_configs:

prometheus_scrape_configs:
  - job_name: "prometheus"    # Custom scrape job, here using `static_config`
    metrics_path: "/metrics"
    static_configs:
      - targets:
          - "localhost:9090"
  - job_name: "example-node-file-servicediscovery"
    file_sd_configs:
      - files:
          - "{{ prometheus_config_dir }}/file_sd/node.yml" # This line loads file created from `prometheus_targets`

Example

Playbook

---
- hosts: all
  roles:
  - cloudalchemy.prometheus
  vars:
    prometheus_targets:
      node:
      - targets:
        - localhost:9100
        - demo.cloudalchemy.org:9100
        labels:
          env: demosite

Demo site

The Prometheus organization provides a demo site for a full monitoring solution based on Prometheus and Grafana. The repository with code and links to running instances is available on GitHub.

Defining alerting rules files

Alerting rules are defined in the prometheus_alert_rules variable. The format is almost identical to the one defined in the Prometheus 2.0 documentation. Due to similarities in templating engines, every template should be wrapped in {% raw %} and {% endraw %} statements. An example is provided in the defaults/main.yml file.
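
For instance, here is a hedged sketch of a single rule (the alert name, expression, and threshold are illustrative, not the role's defaults):

prometheus_alert_rules:
  - alert: InstanceDown
    expr: 'up == 0'
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Instance {% raw %}{{ $labels.instance }}{% endraw %} down"
      description: "{% raw %}{{ $labels.instance }}{% endraw %} has been down for more than 5 minutes."

The {% raw %} wrapping keeps Ansible's Jinja2 engine from evaluating Prometheus's own {{ ... }} template expressions.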

Local Testing

The preferred way of locally testing the role is to use Docker and molecule (v2.x). You will have to install Docker on your system. See "Get started" for a Docker package suitable for your system. We are using tox to simplify the process of testing on multiple Ansible versions. To install tox, execute:

pip3 install tox

To run tests on all Ansible versions (WARNING: this can take some time):

tox

To run a custom molecule command in a custom environment with only the default test scenario:

tox -e py35-ansible28 -- molecule test -s default

For more information about molecule go to their docs.

If you would like to run tests on a remote Docker host, just specify the DOCKER_HOST variable before running the tox tests.
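
For example (the host address here is hypothetical):

DOCKER_HOST=tcp://192.0.2.10:2376 tox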

CircleCI

Combining molecule and CircleCI allows us to test how new PRs will behave when used with multiple Ansible versions and multiple operating systems. This also allows us to create test scenarios for different role configurations. As a result we have quite a large test matrix which takes more time than local testing, so please be patient.

Contributing

See contributor guideline.

Troubleshooting

See troubleshooting.

Download Details:

Author: Cloudalchemy
Source Code: https://github.com/cloudalchemy/ansible-prometheus 
License: MIT license

#ansible #prometheus #monitoring

Duc Tran

A solution for Monitoring and Logging Containers

A monitoring and logging solution for Docker hosts and containers with Prometheus, Grafana, Loki, cAdvisor, NodeExporter and alerting with AlertManager.

Inspired by dockprom

Full source code in here: https://github.com/ductnn/domolo

Install

Clone this repository on your Docker host, cd into the domolo directory and run docker-compose up -d:

git clone https://github.com/ductnn/domolo.git
cd domolo
docker-compose up -d

Containers:

  • Prometheus (metrics database): http://<host-ip>:9090
  • Prometheus-Pushgateway (push acceptor for ephemeral and batch jobs): http://<host-ip>:9091
  • AlertManager (alerts management): http://<host-ip>:9093
  • Grafana (visualize metrics): http://<host-ip>:3000
  • Loki (like Prometheus, but for logs): http://<host-ip>:3100
  • Promtail (the agent responsible for gathering logs and sending them to Loki)
  • NodeExporter (host metrics collector)
  • cAdvisor (containers metrics collector)
  • Caddy (reverse proxy and basic auth provider for prometheus and alertmanager)

Grafana

Change the credentials in the config file:

GF_SECURITY_ADMIN_USER=admin
GF_SECURITY_ADMIN_PASSWORD=changeme
GF_USERS_ALLOW_SIGN_UP=false

Grafana is preconfigured with dashboards; Prometheus (default) and Loki are set up in datasources:

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    orgId: 1
    url: http://prometheus:9090
    basicAuth: false
    isDefault: true
    editable: true

  - name: Loki
    type: loki
    access: proxy
    jsonData:
      maxLines: 1000
    basicAuth: false
    url: http://loki:3100
    isDefault: false
    editable: true

 

Prometheus + Node Exporter

Configure Prometheus to receive metrics from node_exporter. First, set up node_exporter on the servers we need to monitor using docker-compose.agents.yml and run the command:

docker-compose -f docker-compose.agents.yml up -d

This file will set up three agents (a minimal sketch of the node_exporter service follows the list):

  • node_exporter
  • cAdvisor
  • promtail
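
A hedged sketch of the node_exporter part of such an agents file (the image tag and mounts are assumptions, not necessarily the repository's exact file):

version: "3"

services:
  nodeexporter:
    image: prom/node-exporter:v1.3.1  # assumed tag
    container_name: nodeexporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
    ports:
      - "9100:9100"
    restart: unless-stopped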

Then we need to configure metric scraping on the Prometheus server.

To live-monitor the Prometheus server itself:

scrape_configs:
  - job_name: 'nodeexporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['nodeexporter:9100']

To monitor another server, we need to add external_labels:

external_labels:
  monitor: 'docker-host-alpha'

scrape_configs:
  - job_name: 'ApiExporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['<IP Server need Monitor>:Port']

 

Grafana Dashboards

Simple dashboards in Grafana:

  • Node Exporter
  • Monitor Services
  • Docker Host

Loki

Set up the Loki config in the loki-config file.

To scrape logs with Promtail, create a promtail-config.yaml file and configure the following jobs (a fuller config skeleton follows the two snippets):

  • Scrape container logs:
- job_name: container_logs
  docker_sd_configs:
    - host: unix:///var/run/docker.sock
      refresh_interval: 5s
  relabel_configs:
    - source_labels: ['__meta_docker_container_name']
      regex: '/(.*)'
      target_label: 'container'
  • Scrape system logs:
- job_name: system
  static_configs:
  - targets:
      - localhost
    labels:
      job: varlogs
      __path__: /var/log/*log
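
These scrape_configs entries live inside a complete Promtail config; a minimal skeleton (the port and file paths are typical defaults, treat them as assumptions) might look like:

server:
  http_listen_port: 9080  # Promtail's own HTTP port

positions:
  filename: /tmp/positions.yaml  # where Promtail records how far it has read

clients:
  - url: http://loki:3100/loki/api/v1/push  # push logs to the Loki container

scrape_configs:
  # ... the container_logs and system jobs shown above ...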

Demo

Create a simple tool that generates logs and containerize it. Navigate to the entrypoint.sh file and run a test:

➜  domolo git:(master) cd fake-logs
➜  fake-logs git:(master) ✗ chmod +x entrypoint.sh
➜  fake-logs git:(master) ✗ ./entrypoint.sh
2022-12-08T13:20:00Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:00Z DEBUG This is a debug log that shows a log that can be ignored.
2022-12-08T13:20:01Z WARN A warning that should be ignored is usually at this level and should be actionable.
2022-12-08T13:20:03Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:05Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:09Z INFO This is less important than debug log and is often used to provide context in the current task.
2022-12-08T13:20:13Z ERROR An error is usually an exception that has been caught and not handled.
2022-12-08T13:20:15Z DEBUG This is a debug log that shows a log that can be ignored.
2022-12-08T13:20:16Z INFO This is less important than debug log and is often used to provide context in the current task.
2022-12-08T13:20:17Z INFO This is less important than debug log and is often used to provide context in the current task.
...

Then, add fake-logs to docker-compose.yml:

# Fake Logs
flogs:
  image: ductn4/flog:v1 # Set your image name :)
  build:
    context: ./fake-logs
    dockerfile: Dockerfile
  container_name: fake-logs
  restart: always
  networks:
    - monitor-net
  labels:
    org.label-schema.group: "monitoring"

or check out docker-compose.with-flogs.yml and run docker-compose -f docker-compose.with-flogs.yml up -d

Navigate to Grafana and open Explore.

Now we can select labels and view logs.

For example, select the container label to view the logs of the fake-logs container.

More logs are available as well: system logs, other containers, and so on.

Show your support

Give a ⭐ if you like this application ❤️

Contribution

All contributions are welcome in this project!

License

The MIT License (MIT). Please see LICENSE for more information.

#monitoring  #logging #docker #devops #cloud 
#prometheus #grafana #loki

Elian Harber

Mtail: Extract internal Monitoring Data From Application Logs

mtail - extract internal monitoring data from application logs for collection into a timeseries database

mtail is a tool for extracting metrics from application logs to be exported into a timeseries database or timeseries calculator for alerting and dashboarding.

It fills a monitoring niche by being the glue between applications that do not export their own internal state (other than via logs) and existing monitoring systems, so that system operators do not need to patch those applications to instrument them or write custom extraction code for every such application.

The extraction is controlled by mtail programs which define patterns and actions:

# simple line counter
counter lines_total
/$/ {
  lines_total++
}
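
As a slightly richer sketch (the pattern and metric names are illustrative, not from the repository's examples), a program can dimension a counter by a named capture group:

# count log lines by severity
counter lines_by_level by level
/(?P<level>INFO|WARN|ERROR)/ {
  lines_by_level[$level]++
}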

Metrics are exported for scraping by a collector as JSON or Prometheus format over HTTP, or can be periodically sent to a collectd, StatsD, or Graphite collector socket.

Read the programming guide if you want to learn how to write mtail programs.

Ask general questions on the users mailing list: https://groups.google.com/g/mtail-users

Installation

There are various ways of installing mtail.

Precompiled binaries

Precompiled binaries for released versions are available in the Releases page on Github. Using the latest production release binary is the recommended way of installing mtail.

Windows, OSX and Linux binaries are available.

Building from source

The simplest way to get mtail is to go get it directly.

go get github.com/google/mtail/cmd/mtail

This assumes you have a working Go environment with a recent Go version. Usually mtail is tested to work with the last two minor versions (e.g. Go 1.12 and Go 1.11).

If you want to fetch everything, you need to turn on Go Modules to succeed because of the way Go Modules have changed the way go get treats source trees with no Go code at the top level.

GO111MODULE=on go get -u github.com/google/mtail
cd $GOPATH/src/github.com/google/mtail
make install

If you develop the compiler you will need some additional tools like goyacc to be able to rebuild the parser.

See the Build instructions for more details.

A Dockerfile is included in this repository for local development as an alternative to installing Go in your environment, and takes care of all the build dependency installation, if you don't care for that.

Deployment

mtail works best when paired with a timeseries-based calculator and alerting tool, like Prometheus.

So what you do is you take the metrics from the log files and you bring them down to the monitoring system?

It deals with the instrumentation so the engineers don't have to! It has the extraction skills! It is good at dealing with log files!!

Read More

Full documentation at http://google.github.io/mtail/

Read more about writing mtail programs.

Read more about hacking on mtail

Read more about deploying mtail and your programs in a monitoring environment

After that, if you have any questions, please email (and optionally join) the mailing list: https://groups.google.com/forum/#!forum/mtail-users or file a new issue.

Download Details:

Author: Google
Source Code: https://github.com/google/mtail 
License: Apache-2.0 license

#go #golang #calculator #monitoring 


Laravel-activitylog: Log Activity inside Your Laravel App

Laravel Activity Log

Log activity inside your Laravel app

The spatie/laravel-activitylog package provides easy-to-use functions to log the activities of the users of your app. It can also automatically log model events. The package stores all activity in the activity_log table.

Here's a demo of how you can use it:

activity()->log('Look, I logged something');

You can retrieve all activity using the Spatie\Activitylog\Models\Activity model.

Activity::all();

Here's a more advanced example:

activity()
   ->performedOn($anEloquentModel)
   ->causedBy($user)
   ->withProperties(['customProperty' => 'customValue'])
   ->log('Look, I logged something');

$lastLoggedActivity = Activity::all()->last();

$lastLoggedActivity->subject; //returns an instance of an eloquent model
$lastLoggedActivity->causer; //returns an instance of your user model
$lastLoggedActivity->getExtraProperty('customProperty'); //returns 'customValue'
$lastLoggedActivity->description; //returns 'Look, I logged something'

Here's an example of event logging.

$newsItem->name = 'updated name';
$newsItem->save();

//updating the newsItem will cause the logging of an activity
$activity = Activity::all()->last();

$activity->description; //returns 'updated'
$activity->subject; //returns the instance of NewsItem that was saved
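
For model events to be recorded like this, the model needs the LogsActivity trait; here is a minimal sketch (v4-style options, with the attribute names assumed):

<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Spatie\Activitylog\LogOptions;
use Spatie\Activitylog\Traits\LogsActivity;

class NewsItem extends Model
{
    use LogsActivity;

    protected $fillable = ['name', 'text'];

    // Log only these attributes on model events (created/updated/deleted).
    public function getActivitylogOptions(): LogOptions
    {
        return LogOptions::defaults()->logOnly(['name', 'text']);
    }
}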

Calling $activity->changes() will return this array:

[
    'attributes' => [
        'name' => 'updated name',
        'text' => 'Lorum',
    ],
    'old' => [
        'name' => 'original name',
        'text' => 'Lorum',
    ],
];

Documentation

You'll find the documentation on https://spatie.be/docs/laravel-activitylog/introduction.

Find yourself stuck using the package? Found a bug? Do you have general questions or suggestions for improving the activity log? Feel free to create an issue on GitHub, we'll try to address it as soon as possible.

Installation

You can install the package via composer:

composer require spatie/laravel-activitylog

The package will automatically register itself.

You can publish the migration with:

php artisan vendor:publish --provider="Spatie\Activitylog\ActivitylogServiceProvider" --tag="activitylog-migrations"

Note: The default migration assumes you are using integers for your model IDs. If you are using UUIDs, or some other format, adjust the format of the subject_id and causer_id fields in the published migration before continuing.

After publishing the migration you can create the activity_log table by running the migrations:

php artisan migrate

You can optionally publish the config file with:

php artisan vendor:publish --provider="Spatie\Activitylog\ActivitylogServiceProvider" --tag="activitylog-config"

Changelog

Please see CHANGELOG for more information about recent changes.

Upgrading

Please see UPGRADING for details.

Testing

composer test

Contributing

Please see CONTRIBUTING for details.

Security

If you've found a bug regarding security please mail security@spatie.be instead of using the issue tracker.

Credits

A special thanks to Caneco for the logo and to Ahmed Nagi for all the work he put into v4.

Download Details:

Author: Spatie
Source Code: https://github.com/spatie/laravel-activitylog 
License: MIT license

#php #laravel #monitoring #log 

Elian Harber

Highly Available Prometheus Setup with Long Term Storage Capabilities

Overview

Thanos is a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added seamlessly on top of existing Prometheus deployments.

Thanos is a CNCF Incubating project.

Thanos leverages the Prometheus 2.0 storage format to cost-efficiently store historical metric data in any object storage while retaining fast query latencies. Additionally, it provides a global query view across all Prometheus installations and can merge data from Prometheus HA pairs on the fly.

Concretely the aims of the project are:

  1. Global query view of metrics.
  2. Unlimited retention of metrics.
  3. High availability of components, including Prometheus.

Features

  • Global querying view across all connected Prometheus servers
  • Deduplication and merging of metrics collected from Prometheus HA pairs
  • Seamless integration with existing Prometheus setups
  • Any object storage as its only, optional dependency
  • Downsampling historical data for massive query speedup
  • Cross-cluster federation
  • Fault-tolerant query routing
  • Simple gRPC "Store API" for unified data access across all metric data
  • Easy integration points for custom metric providers

Architecture Overview

Deployment with Sidecar (architecture diagram in the repository).

Deployment with Receive (architecture diagram in the repository).

Thanos Philosophy

The philosophy of Thanos and our community borrows much from the UNIX philosophy and the Go programming language.

  • Each subcommand should do one thing and do it well
    • e.g. thanos query proxies incoming calls to known store API endpoints, merging the results
  • Write components that work together
    • e.g. blocks should be stored in native Prometheus format
  • Make it easy to read, write, and run components
    • e.g. reduce complexity in system design and implementation

Releases

The main branch should be stable and usable. Every commit to main builds a docker image named main-<date>-<sha> in quay.io/thanos/thanos and the thanosio/thanos Docker Hub mirror.

We also perform minor releases every 6 weeks.

During that, we build tarballs for major platforms and release docker images.

See release process docs for details.

Getting Started

Contributing

Contributions are very welcome! See our CONTRIBUTING.md for more information.

Community

Thanos is an open source project and we value and welcome new contributors and members of the community.

Adopters

See Adopters List.

Maintainers

See MAINTAINERS.md

Download Details:

Author: Thanos-io
Source Code: https://github.com/thanos-io/thanos 
License: Apache-2.0 license

#go #golang #storage #monitoring 


Librenms: Community-based GPL-licensed Network Monitoring System

Introduction

LibreNMS is an auto-discovering PHP/MySQL/SNMP-based network monitoring system which includes support for a wide range of network hardware and operating systems including Cisco, Linux, FreeBSD, Juniper, Brocade, Foundry, HP and many more.

We intend LibreNMS to be a viable project and community that:

  • encourages contribution,
  • focuses on the needs of its users, and
  • offers a welcoming, friendly environment for everyone.

The Debian Social Contract will be the basis of our priority system, and mutual respect is the basis of our behavior towards others.

Documentation

Documentation can be found in the doc directory or docs.librenms.org, including instructions for installing and contributing.

Participating

You can participate in the project in several ways; see the contributing instructions in the documentation.

VM image

You can try LibreNMS by downloading a VM image. Currently, an Ubuntu-based image is supplied and has been tested with VirtualBox.

Download one of the VirtualBox images we have available; documentation is provided detailing login credentials and setup.

Download Details:

Author: librenms
Source Code: https://github.com/librenms/librenms 
License: View license

#php #laravel #monitoring #network 

Reid Rohan

Uptime-kuma: A Fancy Self-hosted Monitoring tool

Uptime Kuma

It is a self-hosted monitoring tool like "Uptime Robot".

🥔 Live Demo

Try it!

https://demo.uptime.kuma.pet

It is a temporary live demo; all data will be deleted after 10 minutes. The server is located in Tokyo, so if you live far from there, it may affect your experience. I suggest that you install and try it out for the best demo experience.

VPS is sponsored by Uptime Kuma sponsors on Open Collective! Thank you so much!

⭐ Features

  • Monitoring uptime for HTTP(s) / TCP / HTTP(s) Keyword / Ping / DNS Record / Push / Steam Game Server / Docker Containers.
  • Fancy, Reactive, Fast UI/UX.
  • Notifications via Telegram, Discord, Gotify, Slack, Pushover, Email (SMTP), and 90+ notification services, click here for the full list.
  • 20-second intervals.
  • Multi Languages
  • Multiple Status Pages
  • Map Status Page to Domain
  • Ping Chart
  • Certificate Info
  • Proxy Support
  • 2FA available

🔧 How to Install

🐳 Docker

docker run -d --restart=always -p 3001:3001 -v uptime-kuma:/app/data --name uptime-kuma louislam/uptime-kuma:1

⚠️ Please use a local volume only. Other types such as NFS are not supported.

Browse to http://localhost:3001 after starting.

💪🏻 Non-Docker

Required Tools:

# Update your npm to the latest version
npm install npm -g

git clone https://github.com/louislam/uptime-kuma.git
cd uptime-kuma
npm run setup

# Option 1. Try it
node server/server.js

# (Recommended) Option 2. Run in background using PM2
# Install PM2 if you don't have it: 
npm install pm2 -g && pm2 install pm2-logrotate

# Start Server
pm2 start server/server.js --name uptime-kuma

Browse to http://localhost:3001 after starting.

More useful PM2 Commands

# If you want to see the current console output
pm2 monit

# If you want to add it to startup
pm2 save && pm2 startup

Advanced Installation

If you need more options or need to browse via a reverse proxy, please read:

https://github.com/louislam/uptime-kuma/wiki/%F0%9F%94%A7-How-to-Install

🆙 How to Update

Please read:

https://github.com/louislam/uptime-kuma/wiki/%F0%9F%86%99-How-to-Update

🆕 What's Next?

I will mark requests/issues to the next milestone.

https://github.com/louislam/uptime-kuma/milestones

Project Plan:

https://github.com/users/louislam/projects/4/views/1

🖼 More Screenshots

Light Mode:

Status Page:

Settings Page:

Telegram Notification Sample:

Motivation

  • I was looking for a self-hosted monitoring tool like "Uptime Robot", but it is hard to find a suitable one. One of the close ones is statping. Unfortunately, it is not stable and no longer maintained.
  • Want to build a fancy UI.
  • Learn Vue 3 and vite.js.
  • Show the power of Bootstrap 5.
  • Try to use WebSocket with SPA instead of REST API.
  • Deploy my first Docker image to Docker Hub.

If you love this project, please consider giving me a ⭐.

🗣️ Discussion

Issues Page

You can discuss or ask for help in issues.

Subreddit

My Reddit account: u/louislamlam.
You can mention me if you ask a question on Reddit: r/UptimeKuma.

Contribute

Test Pull Requests

There are a lot of pull requests right now, but I don't have time to test them all.

If you want to help, you can check this: https://github.com/louislam/uptime-kuma/wiki/Test-Pull-Requests

Test Beta Version

Check out the latest beta release here: https://github.com/louislam/uptime-kuma/releases

Bug Reports / Feature Requests

If you want to report a bug or request a new feature, feel free to open a new issue.

Translations

If you want to translate Uptime Kuma into your language, please read: https://github.com/louislam/uptime-kuma/tree/master/src/languages

Feel free to correct my grammar in this README, source code, or wiki, as my mother language is not English and my grammar is not that great.

Create Pull Requests

If you want to modify Uptime Kuma, please read this guide and follow the rules here: https://github.com/louislam/uptime-kuma/blob/master/CONTRIBUTING.md

Download Details:

Author: louislam
Source Code: https://github.com/louislam/uptime-kuma 
License: MIT license

#javascript #docker #monitoring

Gordon Taylor

Appmetrics: Node Application Metrics

Node Application Metrics

Node Application Metrics monitoring and profiling agent       

Node Application Metrics instruments the Node.js runtime for performance monitoring, providing the monitoring data via an API. Additionally the data can be visualized by using the Node Application Metrics Dashboard.

The data can also be visualized in Eclipse using the IBM Monitoring and Diagnostics Tools - Health Center client. Profiling data is available in Health Center, but is not yet available in the Dashboard. See https://www.ibm.com/developerworks/java/jdk/tools/healthcenter/ for more details.

Node Application Metrics provides the following built-in data collection sources:

Source | Description
Environment | Machine and runtime environment information
CPU | Process and system CPU
Memory | Process and system memory usage
GC | Node/V8 garbage collection statistics
Event Loop | Event loop latency information
Loop | Event loop timing metrics
Function profiling | Node/V8 function profiling (disabled by default)
HTTP | HTTP request calls made of the application
HTTP Outbound | HTTP requests made by the application
socket.io | WebSocket data sent and received by the application
LevelDB | LevelDB queries made by the application
MySQL | MySQL queries made by the application
MongoDB | MongoDB queries made by the application
PostgreSQL | PostgreSQL queries made by the application
MQTT | MQTT messages sent and received by the application
MQLight | MQLight messages sent and received by the application
Memcached | Data that is stored or manipulated in Memcached
OracleDB | OracleDB queries made by the application
Oracle | Oracle queries made by the application
StrongOracle | StrongOracle database queries made by the application
Redis | Redis commands issued by the application
Riak | Riak methods called by the application
Request tracking | A tree of application requests, events and optionally trace (disabled by default)
Function trace | Tracing of application function calls that occur during a request (disabled by default)

Performance overhead

Our testing has shown that the performance overhead in terms of processing is minimal, adding less than 0.5% to the CPU usage of your application. The additional memory required is around 20 MB to gather information about your system and application.

We gathered this information by monitoring the sample application Acme Air. We used MongoDB as our datastore and used JMeter to drive load through the program. We performed this testing with Node.js version 6.10.3.

Getting Started

Pre-requisites:

Appmetrics uses node-gyp to compile and build local binary libraries to enhance execution performance. If the following compilation and build logs contain errors, make sure you have the node-gyp prerequisites installed (https://github.com/nodejs/node-gyp#installation). If you have them and the build still has errors, see if there are any related issues at https://github.com/RuntimeTools/appmetrics/issues. If there aren't, feel free to open a new issue to report the bug.

Installation

You can get Node Application Metrics from 3 different places:

Configuring Node Application Metrics

Node Application Metrics can be configured in two ways: by using the configuration file described below, or via a call to configure(options).

Node Application Metrics comes with a configuration file inside the module installation directory (.../node_modules/appmetrics/appmetrics.properties). This can be used to configure connection options, logging and data source options.

Node Application Metrics will attempt to load appmetrics.properties from one of the following locations (in order):

  1. the application directory
  2. the current working directory
  3. the appmetrics module installation directory

The default configuration has minimal logging enabled, will attempt to send data to a local MQTT server on the default port and has method profiling disabled.

Many of the options provide configuration of the Health Center core agent library and are documented in the Health Center documentation: Health Center configuration properties.

The following options are specific to appmetrics:

  • com.ibm.diagnostics.healthcenter.data.profiling=[off|on] Specifies whether method profiling data will be captured. The default value is off. This specifies the value at start-up; it can be enabled and disabled dynamically as the application runs, either by a monitoring client or the API.

Running Node Application Metrics

Preloading appmetrics

In previous versions appmetrics came with an executable, node-hc, which could be used instead of the node command to run your application and load and start appmetrics. This was removed in version 4.0.0; instead you can use:

$ node --require appmetrics/start app.js

to preload and start appmetrics, or in Node.js from versions 8.0.0 and 6.12.0 onwards, use the NODE_OPTIONS environment variable:

$ export NODE_OPTIONS="--require appmetrics/start"

Modifying your application to use appmetrics

If you locally install this module with npm then you will additionally have access to the monitoring data via the appmetrics API (see API Documentation).

To load appmetrics and get the monitoring API object, add the following to the start-up code for your application:

var appmetrics = require('appmetrics');
var monitoring = appmetrics.monitor();

The call to appmetrics.monitor() starts the data collection agent, making the data available via the API and to the Health Center client via MQTT.

You should start your application using the node command as usual (not node-hc).

You must call require('appmetrics'); before the require statements for any npm modules you want to monitor. Appmetrics must be initialized first so that it can instrument modules for monitoring as they are loaded. If this is a problem due to the structure of your application you can require the module on the node command line by using -r or --require or by setting NODE_OPTIONS as described above to make sure it is pre-loaded.

Once you have loaded appmetrics you can then use the monitoring object to register callbacks and request information about the application:

monitoring.on('initialized', function (env) {
    env = monitoring.getEnvironment();
    for (var entry in env) {
        console.log(entry + ':' + env[entry]);
    }
});

monitoring.on('cpu', function (cpu) {
    console.log('[' + new Date(cpu.time) + '] CPU: ' + cpu.process);
});

Health Center Eclipse IDE client

Not supported on z/OS

Connecting to the client

Connecting to the Health Center client requires the additional installation of a MQTT broker. The Node Application Metrics agent sends data to the MQTT broker specified in the appmetrics.properties file or set via a call to configure(options). Installation and configuration documentation for the Health Center client is available from the Health Center documentation in IBM Knowledge Center.

Note that both the API and the Health Center client can be used at the same time and will receive the same data. Use of the API requires a local install and application modification (see Modifying your application to use the local installation).

Further information regarding the use of the Health Center client with Node Application Metrics can be found on the appmetrics wiki: Using Node Application Metrics with the Health Center client.

API Documentation

appmetrics.configure(options)

Sets various properties on the appmetrics monitoring agent. If the agent has already been started, this function does nothing.

  • options(Object) key value pairs of properties and values to be set on the monitoring agent.
Property name | Property value type | Property description
applicationID | string | Specifies a unique identifier for the mqtt connection
mqtt | string['off'|'on'] | Specifies whether the monitoring agent sends data to the mqtt broker. The default value is 'on'
mqttHost | string | Specifies the host name of the mqtt broker
mqttPort | string['[0-9]*'] | Specifies the port number of the mqtt broker
profiling | string['off'|'on'] | Specifies whether method profiling data will be captured. The default value is 'off'
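
For example, a minimal sketch (the property values are illustrative) that disables MQTT and enables profiling before the agent starts:

var appmetrics = require('appmetrics');

// configure() must run before the agent starts; afterwards it does nothing.
appmetrics.configure({ mqtt: 'off', profiling: 'on' });
appmetrics.start();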

appmetrics.start()

Starts the appmetrics monitoring agent. If the agent is already running this function does nothing.

appmetrics.stop()

Stops the appmetrics monitoring agent. If the agent is not running this function does nothing.

appmetrics.enable(type, config)

Enable data generation of the specified data type. Cannot be called until the agent has been started by calling start() or monitor().

  • type (String) the type of event to start generating data for. Values of eventloop, profiling, http, http-outbound, mongo, socketio, mqlight, postgresql, mqtt, mysql, redis, riak, memcached, oracledb, oracle, strong-oracle, requests and trace are currently supported. As trace is added to request data, both requests and trace must be enabled in order to receive trace data.
  • config (Object) (optional) configuration map to be added for the data type being enabled. (see setConfig) for more information.

The following data types are disabled by default: profiling, requests, trace

appmetrics.disable(type)

Disable data generation of the specified data type. Cannot be called until the agent has been started by calling start() or monitor().

  • type (String) the type of event to stop generating data for. Values of eventloop, profiling, http, mongo, socketio, mqlight, postgresql, mqtt, mysql, redis, riak, memcached, oracledb, oracle, strong-oracle, requests and trace are currently supported.

appmetrics.setConfig(type, config)

Set the configuration to be applied to a specific data type. The configuration available is specific to the data type.

  • type (String) the type of event to apply the configuration to.
  • config (Object) key value pairs of configurations to be applied to the specified event. The available configuration options are as follows:
Type | Configuration key | Configuration value
http | filters | (Array) of URL filter Objects consisting of:
  • pattern (String) a regular expression pattern to match HTTP method and URL against, e.g. 'GET /favicon.ico$'
  • to (String) a conversion for the URL to allow grouping. A value of '' causes the URL to be ignored.
requests | excludeModules | (Array) of String names of modules to exclude from request tracking.
trace | includeModules | (Array) of String names for modules to include in function tracing. By default only non-module functions are traced when trace is enabled.
advancedProfiling | threshold | (Number) millisecond run time of an event loop cycle that will trigger profiling.
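
A short sketch applying the http filter documented above (the pattern is the docs' own example):

var appmetrics = require('appmetrics');
appmetrics.monitor(); // start the agent

// Ignore favicon requests in the http event data.
appmetrics.setConfig('http', {
    filters: [
        { pattern: 'GET /favicon.ico$', to: '' }
    ]
});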

appmetrics.emit(type, data)

Allows custom monitoring events to be added into the Node Application Metrics agent.

  • type (String) the name you wish to use for the data. A subsequent event of that type will be raised, allowing callbacks to be registered for it.
  • data (Object) the data to be made available with the event. The object must not contain circular references, and by convention should contain a time value representing the milliseconds when the event occurred.
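
A minimal sketch of raising and consuming a custom event (the event name and payload fields are illustrative):

var appmetrics = require('appmetrics');
var monitoring = appmetrics.monitor();

// Subscribers use the same type name that is passed to emit().
monitoring.on('queueDepth', function (data) {
    console.log('[' + new Date(data.time) + '] queue depth: ' + data.depth);
});

// By convention the payload carries a time value in milliseconds.
appmetrics.emit('queueDepth', { time: Date.now(), depth: 42 });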

appmetrics.writeSnapshot([filename],[callback])

Not supported on z/OS. Dumps the V8 heap via heapdump. For more information, see https://github.com/bnoordhuis/node-heapdump/blob/master/README.md

appmetrics.monitor()

Creates a Node Application Metrics agent client instance. This can subsequently be used to get environment data and subscribe to data events. This function will start the appmetrics monitoring agent if it is not already running.

appmetrics.monitor.getEnvironment()

Requests an object containing all of the available environment information for the running application. This will not contain all possible environment information until an 'initialized' event has been received.

Event: 'cpu'

Not supported on z/OS. Emitted when a CPU monitoring sample is taken.

  • data (Object) the data from the CPU sample:
    • time (Number) the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
    • process (Number) the percentage of CPU used by the Node.js application itself. This is a value between 0.0 and 1.0.
    • system (Number) the percentage of CPU used by the system as a whole. This is a value between 0.0 and 1.0.

Event: 'eventloop'

Emitted every 5 seconds, summarising sampled event loop latency information.

  • data (Object) the data from the event loop sample:
    • time (Number) the milliseconds when the event was emitted. This can be converted to a Date using new Date(data.time).
    • latency.min (Number) the shortest sampled latency, in milliseconds.
    • latency.max (Number) the longest sampled latency, in milliseconds.
    • latency.avg (Number) the average sampled latency, in milliseconds.

Event: 'gc'

Emitted when a garbage collection (GC) cycle occurs in the underlying V8 runtime.

  • data (Object) the data from the GC sample:
    • time (Number) the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
    • type (String) the type of GC cycle, either:
      • 'M': MarkSweepCompact, aka "major"
      • 'S': Scavenge, aka "minor"
      • 'I': IncrementalMarking, aka "incremental" (only exists on node 5.x and greater)
      • 'W': ProcessWeakCallbacks, aka "weakcb" (only exists on node 5.x and greater)
    • size (Number) the size of the JavaScript heap in bytes.
    • used (Number) the amount of memory used on the JavaScript heap in bytes.
    • duration (Number) the duration of the GC cycle in milliseconds.

Event: 'initialized'

Emitted when all possible environment variables have been collected. Use appmetrics.monitor.getEnvironment() to access the available environment variables.

Event: 'loop'

Emitted every 5 seconds, summarising event tick information in the time interval.

  • data (Object) the data from the event loop sample:
    • count (Number) the number of event loop ticks in the last interval.
    • minimum (Number) the shortest (i.e. fastest) tick in milliseconds.
    • maximum (Number) the longest (slowest) tick in milliseconds.
    • average (Number) the average tick time in milliseconds.
    • cpu_user (Number) the percentage of 1 CPU used by the event loop thread in user code in the last interval. This is a value between 0.0 and 1.0.
    • cpu_system (Number) the percentage of 1 CPU used by the event loop thread in system code in the last interval. This is a value between 0.0 and 1.0.

Event: 'memory'

Emitted when a memory monitoring sample is taken.

  • data (Object) the data from the memory sample:
    • time (Number) the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
    • physical_total (Number) the total amount of RAM available on the system in bytes.
    • physical_used (Number) the total amount of RAM in use on the system in bytes.
    • physical_free (Number) the total amount of free RAM available on the system in bytes.
    • virtual (Number) the memory address space used by the Node.js application in bytes.
    • private (Number) the amount of memory used by the Node.js application that cannot be shared with other processes, in bytes.
    • physical (Number) the amount of RAM used by the Node.js application in bytes.

Event: 'profiling'

Emitted when a profiling sample is available from the underlying V8 runtime.

  • data (Object) the data from the profiling sample:
    • time (Number) the milliseconds when the sample was taken. This can be converted to a Date using new Date(data.time).
    • functions (Array) an array of functions that ran during the sample. Each array entry consists of:
      • self (Number) the ID for this function.
      • parent (Number) the ID for this function's caller.
      • name (String) the name of this function.
      • file (String) the file in which this function is defined.
      • line (Number) the line number in the file.
      • count (Number) the number of samples for this function.

API: Dependency Events (probes)

Event: 'http'/'https'

Emitted when a HTTP/HTTPS request is made of the application.

  • data (Object) the data from the HTTP(S) request:
    • time (Number) the milliseconds when the request was made. This can be converted to a Date using new Date(data.time).
    • method (String) the HTTP(S) method used for the request.
    • url (String) the URL on which the request was made.
    • duration (Number) the time taken for the HTTP(S) request to be responded to in ms.
    • header (String) the response header for the HTTP(S) request.
    • contentType (String) the content type of the HTTP(S) request.
    • requestHeader (Object) the request header for HTTP(S) request.

Event: 'http-outbound'/'https-outbound'

Emitted when the application makes an outbound HTTP/HTTPS request.

  • data (Object) the data from the HTTP(S) request:
    • time (Number) the milliseconds when the request was made. This can be converted to a Date using new Date(data.time).
    • method (String) the HTTP(S) method used for the request.
    • url (String) the URL on which the request was made.
    • contentType (String) the HTTP(S) response content-type.
    • statusCode (String) the HTTP response status code.
    • duration (Number) the time taken for the HTTP(S) request to be responded to in ms.
    • 'requestHeaders' (Object) the HTTP(S) request headers.

Event: 'leveldown'

Emitted when a LevelDB query is made using the leveldown module.

  • data (Object) the data from the LevelDB query:
    • time (Number) the time in milliseconds when the LevelDB query was made. This can be converted to a Date using new Date(data.time).
    • method (String) The leveldown method being used.
    • key (Object) The key being used for a call to get, put or del (Undefined for other methods)
    • value (Object) The value being added to the LevelDB database using the put method (Undefined for other methods)
    • opCount (Number) The number of operations carried out by a batch method (Undefined for other methods)
    • duration (Number) the time taken for the LevelDB query to be responded to in ms.

Event: 'loopback-datasource-juggler'

Emitted when a function is called on the loopback-datasource-juggler module

  • data (Object) the data from the loopback-datasource-juggler event:
    • time (Number) the time in milliseconds when the event occurred. This can be converted to a Date using new Date(data.time)
    • method (String) the function the juggler has executed
    • duration (Number) the time taken for the operation to complete.

Event: 'memcached'

Emitted when data is stored, retrieved or modified in Memcached using the memcached module.

  • data (Object) the data from the memcached event:
    • time (Number) the milliseconds when the memcached event occurred. This can be converted to a Date using new Date(data.time)
    • method (String) the method used in the memcached client, eg set, get, append, delete, etc.
    • key (String) the key associated with the data.
    • duration (Number) the time taken for the operation on the memcached data to occur.

Event: 'mongo'

Emitted when a MongoDB query is made using the mongodb module.

  • data (Object) the data from the MongoDB request:
    • time (Number) the milliseconds when the MongoDB query was made. This can be converted to a Date using new Date(data.time)
    • query (String) the query made of the MongoDB database.
    • duration (Number) the time taken for the MongoDB query to be responded to in ms.
    • method (String) the executed method for the query, such as find, update.
    • collection (String) the MongoDB collection name.

Event: 'mqlight'

Emitted when a MQLight message is sent or received.

  • data (Object) the data from the MQLight event:
    • time (Number) the time in milliseconds when the MQLight event occurred. This can be converted to a Date using new Date(data.time).
    • clientid (String) the id of the client.
    • data (String) the data sent if a 'send' or 'message', undefined for other calls. Truncated if longer than 25 characters.
    • method (String) the name of the call or event (will be one of 'send' or 'message').
    • topic (String) the topic on which a message is sent/received.
    • qos (Number) the QoS level for a 'send' call, undefined if not set.
    • duration (Number) the time taken in milliseconds.

Event: 'mqtt'

Emitted when a MQTT message is sent or received.

  • data (Object) the data from the MQTT event:
    • time (Number) the time in milliseconds when the MQTT event occurred. This can be converted to a Date using new Date(data.time).
    • method (String) the name of the call or event (will be one of 'publish' or 'message').
    • topic (String) the topic on which a message is published or received.
    • qos (Number) the QoS level for the message.
    • duration (Number) the time taken in milliseconds.

Event: 'mysql'

Emitted when a MySQL query is made using the mysql module.

  • data (Object) the data from the MySQL query:
    • time (Number) the milliseconds when the MySQL query was made. This can be converted to a Date using new Date(data.time).
    • query (String) the query made of the MySQL database.
    • duration (Number) the time taken for the MySQL query to be responded to in ms.

Event: 'oracle'

Emitted when a query is executed using the oracle module.

  • data (Object) the data from the Oracle query:
    • time (Number) the milliseconds when the Oracle query was made. This can be converted to a Date using new Date(data.time).
    • query (String) the query made of the Oracle database.
    • duration (Number) the time taken for the Oracle query to be responded to in ms.

Event: 'oracledb'

Emitted when a query is executed using the oracledb module.

  • data (Object) the data from the OracleDB query:
    • time (Number) the milliseconds when the OracleDB query was made. This can be converted to a Date using new Date(data.time).
    • query (String) the query made of the OracleDB database.
    • duration (Number) the time taken for the OracleDB query to be responded to in ms.

Event: 'postgres'

Emitted when a PostgreSQL query is made to the pg module.

  • data (Object) the data from the PostgreSQL query:
    • time (Number) the milliseconds when the PostgreSQL query was made. This can be converted to a Date using new Date(data.time).
    • query (String) the query made of the PostgreSQL database.
    • duration (Number) the time taken for the PostgreSQL query to be responded to in ms.

Event: 'redis'

Emitted when a Redis command is sent.

  • data (Object) the data from the Redis event:
    • time (Number) the time in milliseconds when the redis event occurred. This can be converted to a Date using new Date(data.time).
    • cmd (String) the Redis command sent to the server or 'batch.exec'/'multi.exec' for groups of command sent using batch/multi calls.
    • duration (Number) the time taken in milliseconds.

Event: 'riak'

Emitted when a Riak method is called using the basho-riak-client module.

  • data (Object) the data from the Riak event:
    • time (Number) the time in milliseconds when the riak event occurred. This can be converted to a Date using new Date(data.time).
    • method (String) the Riak method called.
    • options (Object) the options parameter passed to Riak.
    • command (Object) the command parameter used in the execute method.
    • query (String) the query parameter used in the mapReduce method.
    • duration (Number) the time taken in milliseconds.

Event: 'socketio'

Emitted when WebSocket data is sent or received by the application using socketio.

  • data (Object) the data from the socket.io request:
    • time (Number) the milliseconds when the event occurred. This can be converted to a Date using new Date(data.time).
    • method (String) whether the event is a broadcast or emit from the application, or a receive from a client.
    • event (String) the name used for the event.
    • duration (Number) the time taken for event to be sent or for a received event to be handled.

Event: 'strong-oracle'

Emitted when a query is executed using the strong-oracle module.

  • data (Object) the data from the Strong Oracle query:
    • time (Number) the milliseconds when the Strong Oracle query was made. This can be converted to a Date using new Date(data.time).
    • query (String) the query made of the database.
    • duration (Number) the time taken for the Strong Oracle query to be responded to in ms.

API: Requests

Event: 'request'

Requests are a special type of event emitted by appmetrics. All the probes named above can also create request events if requests are enabled. However requests are nested within a root incoming request (usually http). Request events are disabled by default.

  • data (Object) the data from the request:
    • time (Number) the milliseconds when the request occurred. This can be converted to a Date using new Date(data.time).
    • type (String) The type of the request event. This is the name of the probe that sent the request data, e.g. http, socketio etc.
    • name (String) The name of the request event. This is the request task, eg. the url, or the method being used.
    • request (Object) the detailed data for the root request event:
      • type (String) The type of the request event. This is the name of the probe that sent the request data, e.g. http, socketio etc.
      • name (String) The name of the request event. This is the request task, eg. the url, or the method being used.
      • context (Object) Additional context data (usually contains the same data as the associated non-request metric event).
      • stack (String) An optional stack trace for the event call.
      • children (Array) An array of child request events that occurred as part of the overall request event. Child request events may include function trace entries, which will have a type of null.
      • duration (Number) the time taken for the request to complete in ms.
    • duration (Number) the time taken for the overall request to complete in ms.

Supported platforms

The Node Application Metrics agent supports the following runtime environments where a Node.js runtime is available:

  • Node.js v10, 12, 14 on:
    • 64-bit Windows (x64)
    • 64-bit Linux (x64, ppc64, ppc64le, s390x)
    • 64-bit AIX (ppc64)
    • 64-bit IBM i (ppc64)
      • Before running npm install appmetrics, ensure the environment variable CC=gcc is set.
      • Functionality for Memory and CPU stats is not fully implemented and is currently under construction.
    • 64-bit macOS (x64)

Troubleshooting

Find below some possible problem scenarios and corresponding diagnostic steps. Updates to troubleshooting information will be made available on the appmetrics wiki: Troubleshooting. If these resources do not help you resolve the issue, you can open an issue on the Node Application Metrics appmetrics issue tracker.

Checking Node Application Metrics has started

By default, a message similar to the following will be written to console output when Node Application Metrics starts:

[Fri Aug 21 09:36:58 2015] com.ibm.diagnostics.healthcenter.loader INFO: Node Application Metrics 1.0.1-201508210934 (Agent Core 3.0.5.201508210934)

Error "The specified module could not be found ... appmetrics.node"

This error indicates there was a problem while loading the native part of the module or one of its dependent libraries. On Windows, appmetrics.node depends on a particular version of the C runtime library and if it cannot be found this error is the likely result.

Check:

  • Does the appmetrics.node file exist in the indicated location? If not, try reinstalling the module.
  • For version 1.0.0 on Windows: are msvcr100.dll and msvcp100.dll installed on your Windows system, and do they match the bitness (32-bit or 64-bit) of your Node.js runtime environment? If not, you may be able to install them with the Visual C++ Redistributable Packages for Visual Studio 2010 package from the Microsoft website.
  • For version 1.0.1 on Windows: do msvcr120.dll and msvcp120.dll exist in the module installation directory (see Installation), and do they match the bitness of your Node.js runtime environment? If not, try reinstalling the module.

Note: On Windows, the global module installation directory might be shared between multiple Node.js runtime environments. This can cause problems with globally installed modules with native components, particularly if some of the Node.js runtime environments are 32-bit and others are 64-bit, because the native components will only work with environments of matching bitness.

Error "Failed to open library .../libagentcore.so: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found"

This error indicates there was a problem while loading the native part of the module or one of its dependent libraries. On non-Windows platforms, libagentcore.so depends on a particular (minimum) version of the C runtime library and if it cannot be found this error is the result.

Check:

  • Your system has the required version of libstdc++ installed. You may need to install or update a package in your package manager. If your OS does not supply a package at this version, you may have to install standalone software - consult the documentation or support forums for your OS.
  • If you have an appropriate version of libstdc++ installed, ensure it is on the system library path, or use a method (such as setting the LD_LIBRARY_PATH environment variable on Linux, or the LIBPATH environment variable on AIX) to add the library to the search path.

No profiling data present for Node.js applications

Method profiling data is not collected by default; check Configuring Node Application Metrics for information on how to enable it.

If collection is enabled, an absence of method profiling data from a Node.js application could be caused by the type of tasks that are being run by your application -- it may be running long, synchronous tasks that prevent collection events from being scheduled on the event loop.

If a task uses the Node.js thread exclusively then shuts down the Node.js runtime environment, the Health Center agent may not get the opportunity to obtain any profiling data. An example of such an application is the Octane JavaScript benchmark suite, which loads the CPU continuously rather than dividing the load across multiple units of work.

Source code

The source code for Node Application Metrics is available in the appmetrics project. Information on working with the source code -- installing from source, developing, contributing -- is available on the appmetrics wiki.

License

This project is released under an Apache 2.0 open source license.

Versioning scheme

The npm package for this project uses a semver-parsable X.0.Z version number for releases, where X is incremented for breaking changes to the public API described in this document and Z is incremented for bug fixes and for non-breaking changes to the public API that provide new function.

Development versions

Non-release versions of this project (for example on github.com/RuntimeTools/appmetrics) will use semver-parsable X.0.Z-dev.B version numbers, where X.0.Z is the last release with Z incremented and B is an integer. For further information on the development process go to the appmetrics wiki: Developing.

Module Long Term Support Policy

This module adopts the Module Long Term Support (LTS) policy, with the following End Of Life (EOL) dates:

| Module Version | Release Date | Minimum EOL | EOL With | Status      |
|----------------|--------------|-------------|----------|-------------|
| V4.x.x         | Jan 2018     | Dec 2019    |          | Maintenance |
| V5.x.x         | May 2019     | Dec 2020    |          | Current     |

Version

5.1.1

Release History

5.1.1 - Node 13 support, bump dependency versions and a trace probe fix.
5.0.5 - zAppmetrics fixes, and bump agentcore for Alpine support.
5.0.3 - Bug fix.
5.0.2 - Bump level of omragentcore.
5.0.1 - Bug fix for incorrect timings on HTTP requests.
5.0.0 - Add Node 12 support, remove Node 6 support.
4.0.1 - Bug fix release including adding Node 10 support on Windows (Unix already working).
4.0.0 - Remove node-hc and add support for preloading.
3.1.3 - Packaging fix.
3.1.2 - Bug fixes.
3.1.1 - Node v6 on z/OS support.
3.1.0 - HTTPS probe added. Remove support for Node v7.
3.0.2 - Probe defect for Node 8 support.
3.0.1 - Packaging bug fix to allow build from source if binary not present.
3.0.0 - Remove express probe. Additional data available in http and request events. Code improvements.
2.0.1 - Remove support for Node.js 0.10, 0.12, 5. Add heapdump api call.
1.2.0 - Add file data collection capability and option configuration via api.
1.1.2 - Update agent core to 3.0.10, support Node.js v7.
1.1.1 - Fix node-gyp rebuild failure and don't force the MQTT broker on.
1.1.0 - Bug fixes, improved MongoDB data, updated dependencies, CPU watchdog feature.
1.0.13 - Express probe, strong-supervisor integration.
1.0.12 - Appmetrics now fully open sourced under Apache 2.0 license.
1.0.11 - Bug fixes.
1.0.10 - Bug fixes.
1.0.9 - Loopback and Riak support, bug fixes and update to agent core 3.0.9.
1.0.8 - Oracle support, bug fixes and api tests runnable using 'npm test'.
1.0.7 - StrongOracle support, support for installing with a proxy, expose MongoDB, MQLight and MySQL events to connectors.
1.0.6 - OracleDB support and bug fixes.
1.0.5 - Expose HTTP events to connectors (including MQTT).
1.0.4 - Redis, Leveldown, Postgresql, Memcached, MQLight and MQTT support, higher precision timings, and improved performance.
1.0.3 - Node.js v4 support.
1.0.2 - HTTP, MySQL, MongoDB, request tracking and function tracing support.
1.0.1 - Mac OS X support, io.js v2 support.
1.0.0 - First release.

Download Details:

Author: RuntimeTools
Source Code: https://github.com/RuntimeTools/appmetrics 
License: Apache-2.0 license

#javascript #nodejs #monitoring #metrics 

Rupert Beatty

1658268120

Laravel-activitylog: Log Activity inside Your Laravel App

Log activity inside your Laravel app

The spatie/laravel-activitylog package provides easy-to-use functions to log the activities of the users of your app. It can also automatically log model events. The package stores all activity in the activity_log table.

Here's a demo of how you can use it:

activity()->log('Look, I logged something');

You can retrieve all activity using the Spatie\Activitylog\Models\Activity model.

Activity::all();

Here's a more advanced example:

activity()
   ->performedOn($anEloquentModel)
   ->causedBy($user)
   ->withProperties(['customProperty' => 'customValue'])
   ->log('Look, I logged something');

$lastLoggedActivity = Activity::all()->last();

$lastLoggedActivity->subject; //returns an instance of an eloquent model
$lastLoggedActivity->causer; //returns an instance of your user model
$lastLoggedActivity->getExtraProperty('customProperty'); //returns 'customValue'
$lastLoggedActivity->description; //returns 'Look, I logged something'

Here's an example of event logging.

$newsItem->name = 'updated name';
$newsItem->save();

//updating the newsItem will cause the logging of an activity
$activity = Activity::all()->last();

$activity->description; //returns 'updated'
$activity->subject; //returns the instance of NewsItem that was saved

Calling $activity->changes() will return this array:

[
    'attributes' => [
        'name' => 'updated name',
        'text' => 'Lorum',
    ],
    'old' => [
        'name' => 'original name',
        'text' => 'Lorum',
    ],
];

Installation

You can install the package via composer:

composer require spatie/laravel-activitylog

The package will automatically register itself.

You can publish the migration with:

php artisan vendor:publish --provider="Spatie\Activitylog\ActivitylogServiceProvider" --tag="activitylog-migrations"

Note: The default migration assumes you are using integers for your model IDs. If you are using UUIDs, or some other format, adjust the format of the subject_id and causer_id fields in the published migration before continuing.

After publishing the migration you can create the activity_log table by running the migrations:

php artisan migrate

You can optionally publish the config file with:

php artisan vendor:publish --provider="Spatie\Activitylog\ActivitylogServiceProvider" --tag="activitylog-config"

Changelog

Please see CHANGELOG for more information about recent changes.

Upgrading

Please see UPGRADING for details.

Testing

composer test

Documentation

You'll find the documentation on https://spatie.be/docs/laravel-activitylog/introduction.

Find yourself stuck using the package? Found a bug? Do you have general questions or suggestions for improving the activity log? Feel free to create an issue on GitHub, we'll try to address it as soon as possible.

Contributing

Please see CONTRIBUTING for details.

Support us

We invest a lot of resources into creating best in class open source packages. You can support us by buying one of our paid products.

We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall.

Security

If you've found a bug regarding security please mail security@spatie.be instead of using the issue tracker.

Credits

A special thanks to Caneco for the logo and to Ahmed Nagi for all the work he put into v4.

Author: Spatie
Source Code: https://github.com/spatie/laravel-activitylog 
License: MIT license

#laravel #php #logging #monitoring 

Elian Harber

1654762380

Bosun: Time Series Alerting Framework

Bosun

Bosun is a time series alerting framework developed by Stack Exchange. Scollector is a metric collection agent. 

Building

bosun and scollector are found under the cmd directory. Run go build in the corresponding directories to build each project. There's also a Makefile available for most tasks.
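
For example, from the repository's root:

$ cd cmd/bosun && go build
$ cd ../scollector && go build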

Running

For a full stack with all dependencies, run docker-compose up from the docker directory. Don't forget to rebuild images and containers if you change the code:

$ cd docker
$ docker-compose down
$ docker-compose up --build

If you only need the dependencies (Redis, OpenTSDB, HBase) and would like to run Bosun on your machine directly (e.g. to attach a debugger), you can bring up the dependencies with these three commands from the repository's root:

$ docker run -p 6379:6379 --name redis redis:6
$ docker build -f docker/opentsdb.Dockerfile -t opentsdb .
$ docker run -p 4242:4242 --name opentsdb opentsdb

The OpenTSDB container will be reachable at http://localhost:4242. Redis listens on its default port 6379. Bosun, if brought up in a Docker container, is available at http://localhost:8070.

Developing

Install:

  • Run make deps and make testdeps to set up all dependencies.
  • Run make generate when new static assets (like JS and CSS files) are added or changed.

The w.sh script will automatically build and run bosun in a loop. It will update itself when go/js/ts files change, and it runs in read-only mode, not sending any alerts.

$ cd cmd/bosun
$ ./w.sh

Go Version:

  • See the version number in .travis.yml in the root of this repo for the version of Go to use. Generally speaking, you should be able to use newer versions of Go if you are able to build Bosun without error.

Miniprofiler:

  • Bosun includes miniprofiler in the web UI which can help with debugging. The key combination ALT-P will show miniprofiler. This allows you to see timings, as well as the raw queries sent to TSDBs.

Learn more at bosun.org.

Author: Bosun-monitor
Source Code: https://github.com/bosun-monitor/bosun 
License: MIT license

#go #golang #monitoring 

Elian Harber

1654740000

Balerter: Script Based Alerting Manager

Balerter is a script-based alerting system.

In your script you may:

  • obtain needed data from different data sources (Prometheus, ClickHouse, Postgres, external HTTP APIs, etc.)
  • analyze the data and make a decision about the alert status
  • change alert statuses and receive notifications about it

In the example below we create one ClickHouse data source, one scripts source, and one alert channel. In the script we run a query against ClickHouse, check the value, and fire the alert (or switch it off).

Notification channels

  • Slack
  • Telegram
  • Syslog
  • Desktop Notify
  • Email
  • Discord
  • Webhook
  • Prometheus Alertmanager
  • Prometheus AlertmanagerReceiver
  • Twilio Voice (phone calls)

Datasources

  • Clickhouse
  • Prometheus
  • Postgres
  • MySQL
  • Loki
  • Any external API with http lua module

Example

docker pull balerter/balerter
docker run \
    -v /path/to/config.yml:/opt/config.yml \
    -v /path/to/scripts:/opt/scripts \
    -v /path/to/cert.crt:/home/user/db.crt \
    balerter/balerter -config=/opt/config.yml

Config file config.yml

scripts:
  folder:
    - name: debug-folder
      path: /opt/scripts
      mask: '*.lua'

datasources:
  clickhouse:
    - name: ch1
      host: localhost
      port: 6440
      username: default
      password: secret
      database: default
      sslMode: verified_full
      sslCertPath: /home/user/db.crt

channels:
  slack:
    - name: slack1
      url: https://hooks.slack.com/services/hash

Sample script rps.lua

-- @cron */10 * * * * *
-- @name script1

local minRequestsRPS = 100

local log = require("log")
local ch1 = require("datasource.clickhouse.ch1")

local res, err = ch1.query("SELECT sum(requests) AS rps FROM some_table WHERE date = now()")
if err ~= nil then
    log.error("clickhouse 'ch1' query error: " .. err)
    return
end

local resultRPS = res[1].rps

if resultRPS < minRequestsRPS then
    alert.error("rps-min-limit", "Requests RPS are very small: " .. tostring(resultRPS))
else
    alert.success("rps-min-limit", "Requests RPS ok")
end 

Also, you can write tests!

An example:

-- @test script1
-- @name script1-test

test = require('test')

local resp = {
    {
        rps = 10
    }
} 

test.datasource('clickhouse.ch1').on('query', 'SELECT sum(requests) AS rps FROM some_table WHERE date = now()').response(resp)

test.alert().assertCalled('error', 'rps-min-limit', 'Requests RPS are very small: 10')
test.alert().assertNotCalled('success', 'rps-min-limit', 'Requests RPS ok')

Full documentation is available at https://balerter.com

The project is in active development. Features may have breaking changes at any time before v1.0.0.

Author: Balerter
Source Code: https://github.com/balerter/balerter 
License: MIT license

#go #golang #monitoring 

Elian Harber

1654691160

Roumon: Universal Goroutine Monitor using Pprof and Termui

roumon  

A goroutine monitor to keep track of active routines from within your favorite shell.

Features

  • Track live state of all active goroutines
  • Terminal user interface written with termui 🤓
  • Simple to integrate pprof server for live monitoring
  • Dynamic history of goroutine count
  • Full-text filtering
  • Overview of routine states

Demo

[Screenshot: roumon terminal UI]

Installation

go install github.com/becheran/roumon@latest

Or download the pre-compiled binaries from the releases page.

Usage

Before starting roumon, the Go app to be monitored needs to be prepared to export pprof info via HTTP.

pprof

The program to be monitored needs to run a pprof server.

Import pprof into your program:

import _ "net/http/pprof"

Run a webserver which will listen on a specific port:

go func() {
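    // Serve the pprof endpoints registered by the net/http/pprof import.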
    log.Println(http.ListenAndServe("localhost:6060", nil))
}()

Start your program and check that the pprof site is available in your web browser: http://localhost:6060/debug/pprof
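
Putting it together, a minimal sketch of a program that exposes pprof (with your application logic left out) might look like this:

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // side-effect import: registers the /debug/pprof handlers
)

func main() {
    // Expose pprof on localhost:6060 in the background.
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // Your application logic would run here; this sketch just blocks forever.
    select {}
}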

roumon

Start roumon from your command line interface. Use optional arguments if needed.

For example, roumon -debug=logfile -host=192.168.10.1 -port=8081 will start the routine monitor for the pprof profiles exposed at 192.168.10.1:8081 and write a debug logfile to ./logfile.

Run roumon with -h or --help to see all commandline argument options:

Usage of roumon:
  -debug string
        Path to debug file 
  -host string
        The pprof server IP or hostname (default "localhost")
  -port int
        The pprof server port (default 6060)

From within the Terminal User Interface (TUI), hit F1 for help, and F10 or Ctrl-C to stop the application.

Contributing

Pull requests and issues are welcome!

Author: Becheran
Source Code: https://github.com/becheran/roumon 
License: MIT license

#go #golang #monitoring 


Gowl: A Process Management and Process Monitoring tool At once

Gowl

Gowl is a process management and process monitoring tool at once. An infinite worker pool gives you the ability to control the pool and processes and monitor their status.

Install

Using Gowl is easy. First, use go get to install the latest version of the library. This command will install gowl along with its dependencies:

go get -u github.com/hamed-yousefi/gowl

Next, include Gowl in your application:

import "github.com/hamed-yousefi/gowl"

How to use

Gowl has three main parts: Process, Pool, and Monitor. The process is the smallest unit of this project; it is the piece of code that the developer must implement. To support that, Gowl provides an interface for injecting outside code into the pool. The process interface is as follows:

type Process interface {
   Start() error
   Name() string
   PID() PID
}

The process interface has three methods. The Start function contains the user code, and the pool workers use this function to run the process. The Name function returns the process name, and the monitor uses this function to provide reports. The PID function returns the process id. The process id is unique in the entire pool, and it will be used by the pool and monitor.

Let's take a look at an example:

import (
   "crypto/sha1"
   "encoding/base64"
)

type Document struct {
   content string
   hash    string
}

// Start hashes the document content; the pool workers call this method.
func (d *Document) Start() error {
   hasher := sha1.New()
   hasher.Write([]byte(d.content))
   d.hash = base64.URLEncoding.EncodeToString(hasher.Sum(nil))
   return nil
}

func (d *Document) Name() string {
   return "hashing-process"
}

func (d *Document) PID() PID {
   return "p-1"
}

func (d *Document) Hash() string {
   return d.hash
}

As you can see, in this example, Document implements the Process interface. So now we can register it into the pool.

Pool

Creating a Gowl pool is very easy. You must use the NewPool(size int) function and pass the pool size to it. The pool size indicates the number of workers and the size of the underlying queue that workers consume processes from. Look at the following example:

pool := gowl.NewPool(4)

In this example, Gowl will create a new instance of a Pool object with four workers and an underlying queue with the size of four.

Start

To start Gowl, you must call the Start() method of the pool object. It will begin to create the workers, and the workers start listening to the queue to consume processes.

Register process

To register processes to the pool, you must use the Register(args ...process) method. Pass the processes to the register method, and it will create a new publisher to publish the process list to the queue. You can call it multiple times while the Gowl pool is running, as shown in the sketch below.
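
For instance, a rough sketch (assuming the Document type implemented above; note that in a real application every process must report a unique PID):

// Start the pool, then feed it work.
pool := gowl.NewPool(4)
pool.Start()
pool.Register(&Document{content: "some text to hash"})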

Kill process

One of the most remarkable features of Gowl is the ability to control a process after registering it into the pool. You can kill a process before any worker runs it. Killing a process is simple, and you only need the process id to do it:

pool.Kill(PID("p-909"))

Close

Gowl is an infinite worker pool. However, you should have control over the pool and decide when you want to start it, register a new process on it, kill a process, and close the pool to terminate the workers. Gowl gives you the option to close the pool via the Close() method of the Pool object.

Monitor

Every process management tool needs a monitoring system to expose its internal stats to the outside world. Gowl gives you a monitoring API to see process and worker stats.

You can get the Monitor instance by calling the Monitor() method of the Pool. The monitor object is as follows:

type Monitor interface {
   PoolStatus() pool.Status
   Error(PID) error
   WorkerList() []WorkerName
   WorkerStatus(name WorkerName) worker.Status
   ProcessStats(pid PID) ProcessStats
}

The Monitor gives you the opportunity to get the pool status, process errors, the worker list, worker status, and process stats. With the Monitor API, you can create your own monitoring app with ease. The following example uses the Monitor API to present the stats in the console in real time.

[Screenshot: process monitoring]

Also, you can use the Monitor API to show worker status in the console:

[Screenshot: worker monitoring]
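
As a rough illustration (a sketch only; it assumes the Pool and Monitor types behave as documented above), a simple console reporter could poll this API:

monitor := pool.Monitor()

// Report the pool status and the status of every worker.
fmt.Printf("pool status: %v\n", monitor.PoolStatus())
for _, name := range monitor.WorkerList() {
   fmt.Printf("worker %v: %v\n", name, monitor.WorkerStatus(name))
}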

Author: Hamed-yousefi
Source Code: https://github.com/hamed-yousefi/gowl 
License: MIT License

#go #golang #monitoring 


Gtop: System Monitoring Dashboard for The Terminal

gtop

[Screen recording]

System monitoring dashboard for terminal.

Requirements

  • Linux / OSX / Windows (partial support)
  • Node.js >= v8

Installation

$ npm install gtop -g

Docker

You need to assign the host's network and PID namespaces to access the metrics of the host machine.

$ docker run --rm -it \
    --name gtop \
    --net="host" \
    --pid="host" \
    aksakalli/gtop

Usage

Start gtop with the gtop command:

$ gtop

To stop gtop use q, or ctrl+c in most shell environments.

You can sort the process table by pressing:

  • p: Process Id
  • c: CPU usage
  • m: Memory usage

Troubleshooting

If you see question marks or other different characters, try to run it with these environment variables:

$ LANG=en_US.utf8 TERM=xterm-256color gtop

Author: Aksakalli
Source Code: https://github.com/aksakalli/gtop 
License: MIT License

#node #monitoring 

Lenna Kihn

1640304000

Munin Server Monitor Setup Guide

We use Munin to monitor servers with different resources and critical operations. Munin will contact each server, ask for statistics, and monitor whether each service is running and has enough resources to work.

#monitoring 
