The rapid shift to remote work has enterprises looking for ways to meet the operations needs of suddenly distributed teams, and open source options can help get them there.

The recent mad rush to scale to remote work may prove to be a key chapter in DevOps and AIOps evolution. This need for rapid, widescale change is creating a real conundrum concerning AIOps, DevOps, and ITSM, as organizations seek the best monitoring and incident response solution for their now distributed enterprises.

The key question both the DevOps and IT service management (ITSM) communities need to answer is how quickly they can pivot and adapt to increasing demands for operational intelligence.

What’s AIOps?

Artificial intelligence for IT Operations (AIOps) brings together artificial intelligence (AI), analytics, and machine learning (ML) to automate the identification and remediation of IT operations issues.

An AIOps system learns from your operational data and adapts its responses accordingly, so it won't necessarily do the same thing each time. AIOps systems can also evaluate many candidate solutions to a problem, including ones that developers may miss in their own analysis of an infrastructure issue. However, we aren't yet at a place where AIOps systems, open source or proprietary, can replace experienced systems administrators and other operations team members.

Some of the better-known open source contributions to AIOps include:

  • Prometheus is the first tool that comes to mind when discussing open source monitoring solutions. It's a graduated Cloud Native Computing Foundation (CNCF) project that focuses on monitoring for site reliability engineering (SRE). It simplifies pulling numerical metrics from a metrics endpoint.
  • Grafana is an open source metric analytics and visualization suite. It is popular among Prometheus users for visualizing their metrics.
  • Elastic Stack is a suite of open source products from Elastic designed to help users search, analyze, and visualize data from any type of source, in any format, in real time. Built around Elasticsearch, the stack provides monitoring and logging solutions.
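
To make "pulling numerical metrics from a metrics endpoint" concrete, here is a minimal sketch of reading the plain-text format that a Prometheus-style endpoint exposes. The sample payload, the metric names, and the parse_metrics helper are illustrative assumptions, not part of Prometheus itself, and the parser assumes samples carry no trailing timestamp. In practice the Prometheus server does this scraping for you.

```python
# Hypothetical payload, shaped like the text a /metrics endpoint returns.
SAMPLE = """\
# HELP http_requests_total Total HTTP requests served.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="500"} 3
# HELP process_cpu_seconds_total CPU time consumed.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 12.5
"""


def parse_metrics(text):
    """Map each series (name plus labels) to its numeric value.

    Simplifying assumption: every sample line is "<series> <value>"
    with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is whatever follows the last space on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


metrics = parse_metrics(SAMPLE)
for series, value in metrics.items():
    print(series, value)
```

Every sample ends up as a number keyed by a series name, which is the kind of consistent, structured data that later analysis depends on.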

None of these three technologies uses AI to resolve issues, but they are still foundational to the practice of AIOps, since consistent, structured data is required to inform decisions. A skilled engineering team, SRE or otherwise, could add open source technologies like TensorFlow or tooling from the SciPy ecosystem to reach automated, statistically sound conclusions about infrastructure.
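
As a rough illustration of what a statistically grounded conclusion about infrastructure might look like, the sketch below flags metric samples that fall far outside a baseline window using a simple z-score test. It uses Python's standard statistics module rather than SciPy or TensorFlow to stay self-contained. The find_anomalies function, the latency values, the window size, and the threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, baseline_size=10, threshold=3.0):
    """Return (index, value) pairs that deviate from the baseline.

    The first `baseline_size` samples define "normal"; later samples
    more than `threshold` standard deviations from the baseline mean
    are flagged.
    """
    baseline = values[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline gives no scale to measure against;
        # this sketch simply reports nothing in that case.
        return []
    return [
        (i, v)
        for i, v in enumerate(values[baseline_size:], start=baseline_size)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical request latencies in milliseconds, e.g. exported from a
# monitoring system.
latencies_ms = [101, 99, 100, 102, 98, 100, 101, 99, 100, 100, 102, 250, 99]
anomalies = find_anomalies(latencies_ms)
print(anomalies)
```

A fixed baseline and a single threshold are deliberately naive. A real deployment would more likely use rolling windows, seasonality-aware models, or learned detectors, and would still leave the remediation decision to an engineer.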


Solving the AIOps, DevOps, and ITSM conundrum