Operational Excellence in IT Service Management
Mehmet Özgür Depren, Technical Sales Manager - IBM Middleware

The Next IT Operations Focus: Big Data
Focus on operational objectives has seen a significant uptick since 2013.

IBM Continues to Invest Heavily in Analytics
- More than $17B in acquisitions since 2005, more than any other company
- Most comprehensive portfolio, from business to IT analytics, while most other vendors offer only point solutions
- C&SI's suite of analytics products leverages best-of-breed capabilities from across all of IBM's portfolio
Portfolio timeline, 2015 back to 2005: Social Analytics/Consumer Insight, Workload Optimized Systems, Advanced Case Management, Content Analytics, Decision Management, Stream Computing, Pervasive Content, pureScale, pureXML, Deep Compression, Developer Productivity, Autonomic Operations

IT Operations Analytics Solves New Challenges
Reducing and preventing outages and slowdowns for the 24/7 application world
[Diagram labels: The Network, End Users, Web Servers, Devices, Databases, App Servers]

IT Operations Analytics can help:
1. Never set performance thresholds manually again
2. Identify potential issues before customers are impacted
3. Isolate the problem through analysis of all your IT data

Understanding IBM Operations Analytics
Business outcomes and the capabilities behind them:
- Predict (Proactive Outage Avoidance): predict problems before they occur
- Search (Faster Problem Resolution): search quickly across massive amounts of data
- Optimize (Optimized Performance): optimize across your IT app infrastructure

Operations Analytics capabilities
[Diagram labels: IBM Big Data Platform, Streams, SPSS, InfoSphere BigInsights, Rave, Watson; IBM or 3rd Party Solutions; Operational Environment; Application Performance; Cloud Insights; Documentation; System & Log Monitoring; Transactions; Assets & Workorders; Alerts, Alarms & Events]
Applications | Systems | Workloads | Wireless | Network | Voice | Security | Mainframe | Storage | Assets

IBM Solution for IT Operations Analytics: Our Capabilities and Why IBM
- Predict: predict problems before they become service impacting
- Search: diagnose application and infrastructure issues using all your operational data
- Optimize: ensure your IT infrastructure is operating as efficiently as possible

- Avoid outages while reducing threshold management costs: Consolidated Communications detects 100 percent of its major incidents, including silent failures, and eliminated the human-intensive task of managing manual thresholds, saving $300,000 annually.
- Resolve problems faster: Barclays Bank was able to search and diagnose problems 60% faster to quickly resolve application and infrastructure issues. In addition, it identified customer patterns from log data and applied them to channel intelligence.
- Improve operational efficiency: advanced event analytics has allowed Claranet to reduce the number of trouble tickets and focus more time and resources on what truly matters to its customers.

- 60% faster creation of custom, high-impact, mobile-ready operations dashboards
- 50% faster application diagnostics
- 30% reduction in operator event load
- 20% reduction in storage requirements over competitive offerings
- #1 leadership position in Operations Management solutions

IBM Operations Analytics Predictive Insights (Predict)
Challenge: Reacting to performance thresholds is not enough. IT staff must become proactive to ensure mission-critical apps never go down.
- Automated Threshold Maintenance: no complex manual intervention to set up and maintain, with 5 times faster processing
- Anomaly Detection: alerting before potential issues become service impacting, enabling IT to shift from reactive to proactive
- On-Prem and SaaS: Predictive Insights is now available as a service, providing additional value to our Performance Management solutions
- Supports Heterogeneous Environments: out-of-the-box integrations to IBM APM/ITM or 3rd-party monitoring solutions

Why aren't operations teams proactive today?
- Too much data to analyze manually
- Existing analytic techniques, such as standard thresholds, are not up to the task: they cannot detect problems while they are emerging (before business impact)
- Set the performance threshold too high, and there is insufficient warning before total failure
- Set the performance threshold too low, and there is too much noise, so everything is ignored
- If there is no early detection before the outage, operations teams can only react while the outage is already in effect and already losing money

Learn relationships between metrics without static thresholds
Predictive Insights learns the normal historical range of a metric and raises an alarm if the metric falls outside this range.
Watson DNA inside

European Telco (Flatline; Targeting Situation Detections)
Stopped (crashed) application: regular load absent. Customer Relationship Management system for a large telco. 100 applications monitored by a Compuware system (40 million metrics). In this example the regular load on one of the servers has changed, indicating an application problem.

European Gambling Website (Adaptive Threshold; Automated Dynamic Thresholds and Early Detection)
High disk latency. A gambling website application monitored by HP. Coming up to a busy sporting event, traffic increased, causing stress on the system and a negative customer experience. With early detection from Predictive Insights (PI), the latency issue could have been tackled in time to avoid this.

Large US Bank (Adaptive Threshold; Automated Dynamic Thresholds and Early Detection)
Connection leak. These are WebSphere metrics taken from the CA Wily performance management system. The number of actual connections to the WebSphere application server has increased dramatically. The poolsize and bytesInUse metrics are also affected, indicating either increased demand or a problem with connections not being freed up.
Insight: poolsize and bytesInUse on the same node are also behaving anomalously at the same time and are related to each other.

European Bank (Significant Trend; Targeting Situation Detections)
Disk thrashing. File server under stress as file control operations and bytes per second increase. This sudden change can be traced back to a patch that was applied.
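
The idea of learning a normal historical range and alarming outside it can be sketched in a few lines. This is a minimal illustration and not a description of how Predictive Insights works internally: it assumes a single metric, an hour-of-day baseline, and a band of k standard deviations around the mean. The function names and sample numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def learn_baseline(history, k=3.0):
    """Learn a (low, high) band per hour of day from (hour, value) samples.

    Assumption of this sketch: "normal" is mean +/- k standard deviations
    of past values seen at the same hour. Hours with fewer than two
    samples get no band.
    """
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)

    baseline = {}
    for hour, values in by_hour.items():
        if len(values) >= 2:
            centre, spread = mean(values), stdev(values)
            baseline[hour] = (centre - k * spread, centre + k * spread)
    return baseline


def is_anomalous(baseline, hour, value):
    """Return True when value falls outside the learned band for that hour."""
    if hour not in baseline:
        return False  # nothing learned for this hour, so no alarm
    low, high = baseline[hour]
    return value < low or value > high


# Hypothetical request rates: busy at 09:00, quiet at 03:00.
history = [(9, v) for v in (100, 102, 98, 101, 99)] + \
          [(3, v) for v in (10, 12, 9, 11, 10)]
baseline = learn_baseline(history)

print(is_anomalous(baseline, 9, 0))    # True: flatline during a busy hour
print(is_anomalous(baseline, 9, 101))  # False: ordinary busy-hour load
print(is_anomalous(baseline, 3, 100))  # True: busy-hour load in the middle of the night
```

A single static threshold struggles with this data: set at, say, 150, it would miss both the flatline at 09:00 and the unusual load at 03:00, while the per-hour band flags both.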

A sample of technologies Predictive Insights integrates with
IBM ITM/TDD and IBM APM, IBM OMEGAMON, HP BAC and Topaz, IBM TNPM, Aircom Optima

Predictive Insights as a Service
Performance Management + Predictive Insights
- Integrated threshold automation and maintenance
- Anomaly detection
- Get ahead of potential application and resource outages
Learn, Explore, and Try; Continuous Delivery

IBM Operations Analytics Log Analysis (Search)
Challenge: Diagnosing service problems in applications and the infrastructure supporting them involves quickly analyzing enormous amounts of both structured and unstructured data.
- Breadth of Searchable Data: search across all of your IT operational data to quickly resolve issues
- Expert Advice: any competitor can isolate problems; IBM helps clients quickly resolve them
- Mainframe Support: search System z (zLinux and z/OS) logs in addition to all your other data
- Embedded Analytics: out-of-the-box integrations to IBM APM/ITM or 3rd-party monitoring solutions

IBM Operations Analytics Log Analysis collects large volumes of structured and semi-structured data and transforms it through analytics into actionable intelligence.
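
As a rough illustration of turning semi-structured log text into searchable fields, the sketch below parses WebSphere-style lines with a regular expression and filters the results. It is not how Log Analysis or its Insight Packs are implemented. The line layout, the message-code shape, the function names, and the sample message texts are all assumptions made for the example.

```python
import re

# Assumed layout: [timestamp] thread-id component level message
LINE_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<thread>[0-9a-fA-F]{8})\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?P<message>.*)$"
)
# Assumed message-code shape: four letters, four digits, one severity letter.
CODE_PATTERN = re.compile(r"\b[A-Z]{4}\d{4}[A-Z]\b")


def parse_line(line):
    """Split one log line into named fields, or return None if it does not fit."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    code = CODE_PATTERN.search(record["message"])
    record["code"] = code.group(0) if code else None
    return record


def search(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


# Hypothetical sample lines; the message texts are placeholders.
sample_lines = [
    "[10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E SRVE0068E: hypothetical error text",
    "[10/9/12 5:51:39:101 GMT+05:30] 0000006b webapp I SRVE0242I: hypothetical info text",
    "free-form text that does not match the layout",
]
records = [r for r in map(parse_line, sample_lines) if r is not None]
errors = search(records, level="E")
print(errors[0]["code"])  # SRVE0068E
```

Once lines are normalized into records like these, the same fields can drive searches and dashboard counts, for example the number of errors per message code.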

[Diagram labels: Collect (Logs, Metrics, Events, Documentation); Consolidate; Normalize; Search and Visualize; Insight Packs; IT Operations, App Support, Service Desk]

Current challenge
Application owner: "I got a trouble ticket on my application. I want to quickly find the root cause, fix it, and restore the app/service ASAP."
There is a large volume of data to collect and analyze, and manual correlation takes hours or days to find the root cause of the problem. Logs for the problem window cannot be found. The work is highly dependent on SME skills. It's an art.
Data sources: core files; logs and traces; events; metrics; transactions; config
Sample log entry: [10/9/12 5:51:38:295 GMT+05:30] 0000006a servlet E com.ibm.ws.webcontainer.servlet.ServletWrapper service SRVE0068E: [...] the servlet TradeAppServlet in application DayTrader2-EE5. Exception created : javax.servlet.ServletException: [...]

Solution
IBM Operations Analytics Log Analysis can provide insights from all the data in a few clicks. The app owner can search through the data and use dashboards to find the root cause in minutes.
Inputs brought together in IBM Operations Analytics Log Analysis: metrics, expert knowledge, events, tickets, logs, and transaction details from the app DB.
Transaction details (Tx#, date, status): 108978, 23-Jul-2013, started; 108978, 23-Jul-2013, To IN

Out of the Box Insight Packs (IBM provided)
IBM WebSphere Application Server, IBM DB2, web access logs, Windows events, syslog, Java core, IBM MQ Series, IBM Integration Bus (Message Broker), Delimiter Separated Value (DSV) log files

Partner provided
Microsoft SharePoint, Microsoft Exchange, Microsoft SQL Server, Microsoft Active Directory, Tivoli Storage Manager, IBM Systems Disk Storage 8000, IBM AIX Errpt, IBM HTTP Server, HP LiveSite, HP TeamSite, Oracle Database, VMware ESXi, Oracle Siebel
https://developer.ibm.com/itoa/

IBM Netcool Operations Insight (Optimize)
Modern dashboards, fully mobile: visualize the performance and health of your entire operations environment. Out-of-the-box integration.
- 98% reduction in critical events: ~22 critical and ~100 major events per week
- Improved focus and utilization of first- and second-line staff
Analytics to increase event value:
- v1.1: 30% reduction in events to operations
- v1.2: almost 50% reduction in repeating events
- v1.3: 90% reduction for known event classes

Event Analytics: Seasonal Event Identification
Improve efficiency by identifying and resolving recurring problems.
Large bank: 7% of Priority 1 tickets, and 30% of lower-severity tickets, were raised by events that were highly seasonal.
A report on event history identifies seasonal events, sorted by confidence level and frequency. Drill-down shows the time distribution of events so that peaks can be investigated.
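
Seasonal event identification of this kind can be approximated by counting how often each event lands in the same weekday-and-hour slot. The sketch below is an illustration under stated assumptions, not the product's algorithm: "confidence" here is simply the share of an event's occurrences that fall in its most common slot, and the event names and timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def seasonal_profile(timestamps):
    """Find the (weekday, hour) slot where an event occurs most often.

    Returns (weekday, hour, confidence, frequency). weekday follows
    datetime.weekday(): Monday is 0, Sunday is 6. "confidence" is an
    assumption of this sketch: the share of occurrences in the top slot.
    """
    if not timestamps:
        raise ValueError("no occurrences to analyse")
    slots = Counter((ts.weekday(), ts.hour) for ts in timestamps)
    (weekday, hour), frequency = slots.most_common(1)[0]
    return weekday, hour, frequency / len(timestamps), frequency


def rank_seasonal(events):
    """Sort event names by confidence, then by frequency, highest first."""
    profiles = {name: seasonal_profile(ts) for name, ts in events.items()}
    return sorted(profiles,
                  key=lambda name: (profiles[name][2], profiles[name][3]),
                  reverse=True)


# Hypothetical history: heartbeat failures on four Sundays at 06:00 plus one
# stray occurrence, and disk alerts scattered across days and hours.
events = {
    "heartbeat_failure": [datetime(2015, 3, d, 6, 5) for d in (1, 8, 15, 22)]
    + [datetime(2015, 3, 11, 14, 30)],
    "disk_alert": [datetime(2015, 3, 2, 1, 0), datetime(2015, 3, 5, 9, 0),
                   datetime(2015, 3, 9, 17, 0)],
}
print(seasonal_profile(events["heartbeat_failure"]))  # (6, 6, 0.8, 4)
print(rank_seasonal(events))  # ['heartbeat_failure', 'disk_alert']
```

An event with a high share in one slot is a candidate for a schedule-aware threshold or a maintenance-window suppression rule.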
Thresholds can be better aligned to seasonal peaks, reducing events.

Seasonality analysis of events
1. MS SCOM Health Service Heartbeat failures often happen on Sundays at 06:00, probably due to regular maintenance
2. A specific Oracle database is not accessible every day at 21:00, probably due to a daily restart or backup
3. A node gives file system alerts every day around 01:00, probably due to a daily batch job

Related Events Grouping: relationships I know about (Known Event Analysis)
- Grouping and correlation providing powerful situation management of active events
- Out-of-the-box domain expertise for known event relationships
- Vendor and technology dependent
- Significant reduction of incidents presented to the operator
- Extendable by Business Partners and clients with no coding required

Event Analytics, Related Event Analytics: relationships I don't know about
Improve efficiency: reduce actionable events by grouping events that always occur together.
- Automatic detection of event clusters
- Leverages machine learning to analyze the historical event archive and identify groups of events that always occur together
- Presents each identified relationship to the administrator
- Presents proposed automated actions: Watch, Deploy, Archive, or Do nothing
- Groups events in the Event Viewer
"It is very beneficial to have a tool that can turn historical event data into an event group with a single root event. It helps us turn the data into logic."
Increase operator efficiency by up to 90% with out-of-the-box alert reduction and advanced alert analytics.

Future of Service Management
Visibility, Control, Automation
Real-time Analytics and Visualization; Problem Isolation; Data Correlation; Outage Avoidance; Integration; Optimization; Insight & Care; Predictive Analytics

Thank You