
SLAs and KPIs in
Application Outsourcing

We help establish the SLAs & KPIs that actually matter to the business and help remediate service delivery when performance expectations are not being met. 

Performance Management Concepts:

  • Common types of SLAs and KPIs

  • The Top 5 SLA mistakes to avoid 

  • Missing SLAs - remediation of service delivery

  • Aligning IT metrics with business outcomes

  • Enabling multi-vendor coordination and reducing finger-pointing

What are Service Level Agreements (SLAs) and Key Performance Indicators (KPIs)?

The SLA structure is one of the most complex and distinctive aspects of an outsourcing engagement. While you may have worked with operational measures within an internal IT organization, managing IT performance under an outsourced arrangement is very different.

Definitions:

​

Service Level Agreements (SLAs):  the primary measures that the client and vendor agree will be used to evaluate the vendor's performance under the contract. The SLA commercial construct includes provisions for financial penalties, and even termination for cause, in the event the vendor fails to meet the agreed-upon performance standards.

 

Key Performance Indicators (KPIs):  a second tier of measures that are important to the client but do not rise to the level of having financial penalties associated with them. The vendor is expected to adhere to the performance targets in the same manner as an SLA, and agreements normally include provisions that allow the client to promote a KPI to an SLA in the event the vendor is not performing as expected.

 

The definition of each SLA and KPI will include the following items:

​

  • Name and description – including the formula for the measure. It is also important that the parties agree on the source of the data to be used for the calculation.

 

  • Performance requirement – the performance target for each task or event

​​

​

  • Expected performance – the percentage of successful tasks or events that the vendor should deliver

 

  • Minimum performance – the percentage below which the assessment of penalties may be triggered

​

  • Significant minimum – a threshold applied to each individual event (not the aggregate); if any single event falls below it, penalties will be incurred

 

Measures:  each key service performed by the vendor will have one or more quantifiable performance metrics identified. As tasks are delivered, they are evaluated against the agreed-upon performance requirement and determined to have either passed or failed. At the end of the reporting period, the passed/failed results are totaled.

Performance Dashboard

Example:​

 

Incident Management typically has two types of measures: response time and resolution time. The performance requirements vary based on the severity of the incident, so with four severity levels there are actually 8 individual measures that are evaluated independently.

 

SLA Definition – Incident Management:                                                                                          

 

Performance Requirements:

  Severity   Resolution Time     Expected   Minimum
  1          2 hours             99%        97%
  2          4 hours             95%        90%
  3          5 business days     90%        85%
  4          30 business days    90%        85%

 

At the end of the month, each incident that has been closed is evaluated against the response and resolution times (individually) and marked as passed or failed, and the results are rolled up into a percentage. For our example, we will use the Severity 2 Resolution measure and assume there were 10 Severity 2 incidents, 9 of which were resolved within 4 hours, making the Severity 2 Incident Resolution result 90%.

 

In this case the vendor did NOT meet the expected performance of 95% and has fallen to the minimum performance threshold of 90%, so an evaluation as to whether penalties should be applied needs to be conducted.
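As a rough illustration (not tied to any specific contract or tooling), the monthly attainment check for a single measure can be sketched as follows; the thresholds and counts mirror the Severity 2 Resolution example above:

```python
# Minimal sketch of a monthly attainment check for one SLA measure.
# The thresholds and counts mirror the Severity 2 Resolution example above;
# in practice both would come from the agreement and the agreed data source.

def attainment(passed: int, total: int) -> float:
    """Percentage of closed events that met the performance requirement."""
    return 100.0 * passed / total if total else 100.0

EXPECTED = 95.0   # expected performance for Severity 2 Resolution
MINIMUM = 90.0    # minimum performance threshold

result = attainment(passed=9, total=10)   # 9 of 10 incidents resolved within 4 hours
print(f"Severity 2 Resolution attainment: {result:.1f}%")

if result >= EXPECTED:
    print("Met expected performance")
elif result >= MINIMUM:
    print("Missed expected performance but at/above minimum - review with the vendor")
else:
    print("Below minimum performance - evaluate the application of penalties")
```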

Intermediate - Using SLA(s) and KPI(s) to Manage Vendor Performance

In the example in the previous section, the vendor missed the SLA for Severity 2 Incident Resolution. We will use that example to illustrate how the service management group would determine the appropriate disposition of that result.

Definitions:

​

Exemptions:  one of the core principles of an SLA framework is that the vendor can only be held accountable for something that they control. In the event that an outside party caused the vendor to miss the performance target, then that task would be excluded from the calculation. 

​

Any exclusion requires the client's agreement. If the revised calculation results in a measurement that is equal to or higher than the minimum performance target, then the penalties do not apply.

 

Law of Small Numbers:  there is a principle in the SLA framework that a single miss should not trigger a financial penalty. In our Severity 2 Resolution example above, the vendor missed 1 out of 10 incidents, producing a 90% performance rating where the target was 95%. For a single miss not to cause this measure to be missed, there need to be at least 20 Severity 2 incidents in the month, i.e. 19 successful events against a total of 20 = 95% performance.

​

The typical approach to remedy this situation would be to carry over the results to the following month and aggregate the volume of the two months to reach the required volume of incidents for evaluation.
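A minimal sketch of that carry-over logic, assuming the parties agreed that at least 20 events are required before the measure is scored (the volume implied by the example above); the monthly counts are illustrative:

```python
# Illustrative "Law of Small Numbers" carry-over, assuming a contractual
# minimum of 20 events before a measure is scored for penalty purposes.

MIN_SAMPLE = 20  # assumed minimum volume required for evaluation

def evaluate_with_carryover(months):
    """months: list of (passed, total) tuples in chronological order."""
    passed = total = 0
    for month_passed, month_total in months:
        passed += month_passed
        total += month_total
        if total >= MIN_SAMPLE:
            yield round(100.0 * passed / total, 1)  # score, then reset the pool
            passed = total = 0
        else:
            yield None  # not enough volume yet - carry the results forward

# Example: 9 of 10 incidents passed in month 1, 12 of 12 in month 2
print(list(evaluate_with_carryover([(9, 10), (12, 12)])))  # [None, 95.5]
```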

​

Exception:  some measures are so important that a single miss can cause a financial penalty, for example application availability.

 

Earn Back:   in some agreements, the vendor has the right to earn back penalties that are incurred. The ability to earn back is based on the achievement or overachievement of performance targets in subsequent months.

​

It is worth noting that Earn Back provisions are no longer common in the competitive outsourcing marketplace.

​

Application Tiers:  It is not unusual for large-scale organizations to split their application portfolios into different groups (ex: Gold, Silver, Bronze) based on their criticality. In this case, each application tier would have different service levels associated with it.

​

As an example, Gold applications would have higher requirements for system availability than the Silver applications, such as Gold 99.99% availability and Silver 99.9% availability.
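For perspective, the gap between those tiers can be expressed as a monthly downtime budget; the quick calculation below assumes a 30-day month:

```python
# Rough downtime budget per application tier, assuming a 30-day month.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

for tier, availability in [("Gold", 99.99), ("Silver", 99.9)]:
    allowed_downtime = MINUTES_PER_MONTH * (1 - availability / 100)
    print(f"{tier}: {availability}% availability allows ~{allowed_downtime:.1f} minutes of downtime per month")
```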

Dead Bands:  when a vendor presents a solution, it is based on a series of assumptions, one of the most critical being the volume of work. As an example, in application support engagements the number of incidents per month is an important factor in staffing and has a direct impact on the vendor's ability to meet the agreed-upon service levels.

 

Dead Bands are constructed to indicate when variability is deemed high enough to impact the vendor's ability to perform, thereby giving the vendor relief from service level penalties.

  

In the example shown, the expected volume of incidents is 1,500 per month. The parties have agreed to establish the dead bands at +/- 500 incidents per month. In January and February, the incident volumes were within the expected range.

Dead Band:  establishes expected volume and upper and lower levels before SLAs are suspended

In March there was an unexpected surge of incidents that resulted in volumes above the upper range. Provided that these incidents were not the result of something the vendor did, relief would be granted for SLA misses that were “volume related” during that month. SLAs such as Incident Remediation for Severity 3-4 might be excused because Severity 3-4 incidents make up the vast majority of the incidents. Remediation for Severity 1-2 incidents would not be excused because their volume is typically less than 5% of the total.

​

June, July and August are all above the upper range. When 3 or more months are outside the range (above or below), a formal meeting is called to review the expected volume. Adjustments can be made either to the service level performance requirements (no cost impact) or to the staffing needed to align with the new expected volume (potential cost impact +/-).
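A simple sketch of how that dead-band check might be tracked, using the expected volume and band width from the example above; the monthly incident counts are illustrative:

```python
# Illustrative dead-band tracking using the figures from the example above:
# expected volume of 1,500 incidents per month with a +/- 500 band, and a
# formal volume review once 3 or more months fall outside the band.

EXPECTED_VOLUME = 1500
BAND = 500
REVIEW_TRIGGER = 3

def outside_band(volume: int) -> bool:
    return abs(volume - EXPECTED_VOLUME) > BAND

monthly_volumes = {"Jan": 1450, "Feb": 1600, "Mar": 2300,
                   "Jun": 2100, "Jul": 2250, "Aug": 2050}

months_outside = [month for month, vol in monthly_volumes.items() if outside_band(vol)]
print("Months outside the dead band:", months_outside)

if len(months_outside) >= REVIEW_TRIGGER:
    print("Call a formal meeting to review the expected volume")
```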

Advanced Topics - Critical Concepts for Contracting SLA(s) and KPI(s)

The following topics will be points of negotiation during the contracting phase or will be used in dispute resolution when there are issues regarding the quality of a vendor's performance.

Due Diligence: 

 

Prior to contract signature, vendors will want to conduct Due Diligence to show that the SLA performance targets requested by the client are achievable and that the current IT organization has met these targets historically. The rationale is that the vendor has relied upon the current environment as the basis for estimating the solution, and if the SLA results have not been achieved historically, then the solution is likely underestimated.

​

The best example of this would be application availability. If a client is requiring an uptime of 99.99% but the systems have never been available more than 99.2%, then the SLA requested is not appropriate.

​

The easiest approach for conducting Due Diligence is to obtain the last 6 months of SLA data and compare the results against the performance targets being requested. The measures and data utilized must conform to those incorporated in the agreement; for the results to be valid, the formulas and data sources cannot be varied.
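A lightweight way to frame that comparison; the measures and figures below are illustrative rather than taken from any actual engagement:

```python
# Illustrative due diligence comparison: requested targets vs. historical results
# (e.g. averaged over the last 6 months, using the agreement's own formulas).

requested_targets = {
    "Application Availability": 99.99,
    "Severity 1 Resolution": 99.0,
    "Severity 2 Resolution": 95.0,
}

historical_results = {
    "Application Availability": 99.2,
    "Severity 1 Resolution": 99.3,
    "Severity 2 Resolution": 93.5,
}

for measure, target in requested_targets.items():
    actual = historical_results.get(measure)
    achievable = actual is not None and actual >= target
    status = "historically met" if achievable else "NOT met historically - flag for negotiation"
    print(f"{measure}: target {target}%, historical {actual}% -> {status}")
```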

​

This DOES NOT apply to all measures; only those that are heavily dependent on the current environment should be subject to Due Diligence. Measures within the vendor's control, such as quality measures, on-time/on-budget delivery, and training compliance, are examples that would NOT require the client to have achieved the performance targets for the vendor to accept accountability. The most common examples of measures that DO require Due Diligence include System – Application Availability, Severity 1-2 Incident Resolution, Application Performance, N-1 Upgrade Compliance, etc.

Activities undertaken pre-contract

Options for Resolving Due Diligence Issues:  In virtually every engagement, there are applications that do not meet the system availability requirements and systems whose patch levels are behind the new targets.

​

One option is simply to exclude the offending systems from the SLA calculations until they are either brought into compliance or the SLA performance targets are adjusted to reflect the current environment.

​

The work to bring the systems into compliance can be performed by the client, or the vendor can submit a project estimate to complete it.

Contract Items:  

 

When do SLAs Start:  As part of the contracting process, the vendor and client will agree on a transition plan and timeline. Transition covers the period when the vendor secures the staffing to assume responsibility for the services and performs knowledge transfer from the incumbent resources. The date on which the vendor's resources take over responsibility for services is referred to as the “Service Commencement Date” (SCD).

​

The SCD acts as the anchor for date commitments, ex: SCD + 30 indicates that a deliverable is due 30 days after the Service Commencement Date. This eliminates the need to restate dates in the event the go-live date moves.

​

Service Levels are captured and reported beginning on the Service Commencement Date; however, when the penalty structure begins varies by measure. The reason for the delay is that while the new resources have been trained and are operational, they have not had enough experience for financial penalties to be applied and are given time to gain additional proficiency. A standard delay from SCD to penalties is approximately 3 months, i.e. “SCD + 90”.
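Because every commitment keys off the SCD, the date arithmetic is easy to automate; a minimal sketch with an assumed commencement date:

```python
# Minimal sketch of the "SCD + N" convention: commitments anchored to the
# Service Commencement Date shift automatically if the go-live date moves.
from datetime import date, timedelta

scd = date(2025, 4, 1)  # illustrative Service Commencement Date

milestones = {
    "First deliverable due (SCD + 30)": 30,
    "Penalty structure begins (SCD + 90)": 90,
}

for name, offset_days in milestones.items():
    print(f"{name}: {scd + timedelta(days=offset_days)}")
```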

​

In some situations, not enough is known about the environment for service level performance targets to be established at the time of contracting. These measures are deemed to need a “Baseline”, where the vendor's actual performance during the first 3 to 6 months of operation is used to calculate what the performance targets will be. Measures where the targets are agreed but the penalty start is delayed are referred to as “Burn-In” measures; Burn-In is the time given to the resources to come further up to speed while penalties do not apply.

Activities negotiated during contracting

SLAs with 100% Targets:  because “perfection is not achievable,” service levels should not have an expected performance rating of 100%. There are rare cases where the vendor may agree, such as compliance with laws or company policy, or adherence to training or certification requirements, but these should be the exception, not the norm.

​

Productivity:  often vendors will commit to improving the environment and eradicating systemic problems, which will result in productivity savings. Be sure that these productivity improvements are reflected in the agreement in such a way that the vendor's staff capacity cannot be reduced without proof that the output remains at the same level. For example, a commitment to increase development productivity could otherwise result in a reduction of developers without the same level of development work being delivered.

Service Offerings - SLA(s) & KPI(s):

Enablement

OAS will work with the appropriate parties to help establish a Performance Management structure and the capability needed to manage the implementation and ongoing execution of performance governance. We can support your efforts during design, contract negotiation, due diligence and baselining. The traditional SLA and KPI structure can be expanded to include a Balanced Scorecard that incorporates measures evaluating IT's impact on the business. In this role, OAS can be your advisor in the room as you work with your IT vendors, or can work with internal IT to implement a performance management structure.

Remediation

OAS will work with the delivery organizations when service delivery is not meeting performance targets or there is a disagreement over the application of the commercial constructs that govern the handling of changes and/or exceptions. If performance targets are not being achieved, OAS will work across organizations to facilitate the identification of the Root Cause of the deviations and help construct a roadmap to success. In the event there is a lack of alignment on how to move forward, or obstacles within the current framework, OAS will facilitate joint solutioning sessions to arrive at agreed-upon solutions.
