COLOCATION'S HIDDEN FLAW: LACK OF AGENCY

Everything you do online, whether using social media, doing online payments, tracking accounts for a large company, pretty much anything where a computer (or mobile phone) is connected to the internet, relies on data centers. To put it simply, they are the backbone of our digital world. And every major company in every business sector relies on them for their day-to-day operations. 


But they’re extremely expensive, take a long time to build, and require a highly skilled maintenance staff.  For a company where managing critical facilities is not part of its core business, these costs can be formidable; enter the colocation vendor.


Colo vendors offer ready-built data center spaces, where power, cooling, and fire/life-safety systems are all available in abundant supply, with the staff already onsite to maintain those core systems.  An enterprise IT company merely needs to sign a contract for adequate floor space and estimated power needs and voilà, they’re immediately in business!  No massive sunk costs, no waiting 2-3 years to have a new data center built, no expensive critical facilities staff to hire and train, etc.  It’s a VERY attractive option for companies that want to avoid these costs, and it can massively increase their agility within the business world.


The colocation market is huge: of the ~5,300 data centers in the USA alone, roughly 60% are owned by colo vendors.  [Interestingly, about 25% of the colo market is consumed by IT and telecom companies; this makes sense, as the large cloud companies are in the business of running their own data centers but utilize colo vendors as their networking pipelines to the world.]

But the ugly underbelly of the colo market is the lack of agency for enterprise IT clients.


You see, those clients have no insight (or oversight) to make sure the data centers where they’ve put their company’s crown jewels are being properly maintained or have adequate cybersecurity protection.  The clients are utterly blind to the potential risks of improper maintenance management and are therefore exposed to unplanned outages.


Enter the SOC-2 certification process, where colocation companies are audited by third parties to verify that the colo vendor has proper controls in place for a variety of things, the two biggest rocks being cybersecurity and critical facilities maintenance.


Most enterprise IT organizations depend upon the SOC-2 certification as the standard metric to judge whether a given colo facility will be safe for their equipment.


Let me state with absolute clarity: 


The SOC-2 is a fraud on its face. 


How do I dare say such a thing? 


Simple: the only requirement to be a SOC-2 auditor is that you must be a CPA. 


A Certified Public Accountant, that's it. This also applies to the ISO 9001:2015 or ISO 27001:2013 standards. 


Yes, CPAs are attempting to validate the maintenance practices of sophisticated systems that require decades of experience to fully understand. 


It’s the same as a CPA verifying the maintenance was done properly on the airliner you’re about to board; would you really bet your LIFE on that???


And yet the truth is that enterprise IT organizations- including the big cloud-computing companies- blindly trust that the colocation facilities they lease are being adequately maintained and use the SOC-2 as validation of that belief.


But the SOC-2 certification process is a veneer of legitimacy, or as a colleague stated, “It’s like the ‘Good Housekeeping Seal of Approval.’  It sounds impressive, but it’s a façade.”


THE MANY FLAWS WITH THE SOC-2 AUDIT PROCESS


CPAs Are NOT Capable of Performing Availability Audits


Let me restate the obvious: SOC-2 availability audits performed by CPAs are intrinsically and fatally flawed because CPAs lack the deep engineering knowledge necessary to properly assess the critical infrastructure of a data center or colocation facility.


Quite by accident, and outside of the work environment, I recently met a very senior SOC-2 auditor for one of the “big four” accounting firms, who expressed his disillusionment with being an auditor.  He said auditors are often bewildered, wondering how to effectively audit the company they’re sent to; a lack of understanding about the core business, a lack of technical knowledge, and insufficient time on-site to gain insight are major problems.  Often, he stated, they simply copy previous audit reports and just update the numbers.  Occasionally, they must resort to Google searches to try to figure out what they should be looking for.  When I asked if they had any kind of training program, he said there was nothing; they’re simply thrown in, to learn on the job.


Blanket Coverage


SOC-2 audits are a sampling of the records of the colocation company, with a few sample records from individual sites thrown in to check whether the controls are being followed.  Restated, the SOC-2 audits do not look at individual sites, but cover the entire colo company, whether they have 2 data centers or 20: it’s a blanket certification.


The problem is that I have seen multiple examples where two sites were within a few miles of each other, both owned by the same company, where one site was being managed well, and the other site very poorly.  The team managing the site was usually the deciding factor.


Timing and Frequency Problems


The SOC-2 audits are supposed to be performed on an annual basis.  The problem with this is that if a SOC-2 audit is finished in mid-January (typical), and a major risk item comes up in February, the client is unaware of the risk for a year, assuming the auditors find it on the next cycle- which they rarely do, due to their lack of technical knowledge.  Worse, many colocation companies defer their audits for an extended period, up to a year, so your company may be at risk for up to two full years before you even become aware of the problem!
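The timing argument above is simple date arithmetic, and a minimal sketch makes the gap concrete. The function names, the example dates, and the month-level granularity are assumptions for illustration only; the sketch also assumes the risk appears before the next audit and that the next audit actually detects it, which is the optimistic case.

```python
from datetime import date


def month_index(d: date) -> int:
    """Convert a date to a running month count so months can be subtracted."""
    return d.year * 12 + (d.month - 1)


def blind_window_months(last_audit: date, risk_appears: date,
                        interval_months: int = 12,
                        deferral_months: int = 0) -> int:
    """Months a client stays unaware of a risk that appears after an audit.

    Assumes (hypothetically) that the next audit happens exactly
    interval_months + deferral_months after the last one and finds the risk.
    """
    next_audit = month_index(last_audit) + interval_months + deferral_months
    return next_audit - month_index(risk_appears)


# Illustrative dates only: audit finished mid-January, risk appears in February.
last_audit = date(2024, 1, 15)
risk = date(2024, 2, 10)

annual = blind_window_months(last_audit, risk)                        # annual cycle
deferred = blind_window_months(last_audit, risk, deferral_months=12)  # audit deferred a year
quarterly = blind_window_months(last_audit, risk, interval_months=3)  # quarterly cycle
print(annual, deferred, quarterly)
```

Under these assumptions the deferred annual cycle leaves the client blind for nearly two years, while a quarterly cycle cuts the window to a couple of months.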


Existing Audit Procedures


Current audit processes follow a script, which essentially documents that the auditors looked at documents.  They look at tickets, schedules, signatures, and other “evidentiary” data, to verify that facilities are being properly maintained.  In a typical business, an auditor can verify business data by:


  • Verifying the appropriate records are present

  • Reviewing the records for validity, and if necessary,

  • Performing a physical inspection of the assets in question (i.e., check inventory).


The script breaks down in the mission-critical space because the auditors lack the technical knowledge to determine if the appropriate records are present and if the data indicates underlying problems.  They certainly won’t have sufficient knowledge to physically inspect the assets in question.  As one manager noted, “They’re great at chasing paperwork, they’re able to document that they looked at documents, dates, times and ticket numbers…  but the underlying data itself is no good.”


Hiding Evidence to Game the System


The script the SOC-2 auditors follow is to inspect the maintenance records stored in the Computerized Maintenance Management System (CMMS) that all colocation companies use to track issues from detection to completion.  However, they never look at the equipment records submitted by the maintenance vendors themselves. 


Let me give you an example: the local Caterpillar vendor performs maintenance on the CAT diesel generators in a data center.  The maintenance report contains all steps taken in the job, the parameters that were measured, the defects that were found, and a list of recommendations at the end of the report.  Contractually, if a defect was present in a generator and was not detected by the vendor- whether due to negligence or failure to follow procedure- the vendor would be liable for repairs if the generator subsequently experienced a catastrophic failure.  So it behooves the vendor to be as detailed and accurate as possible in their maintenance reports, whether for generators, UPS units, UPS batteries, chillers, CRAC units, etc.  The REAL data is always contained in those reports.


Now you may ask, “Of course that makes sense, but why bring this up?” 


In recent audits for one of the biggest colo companies in the world, I found multiple, serious defects highlighted in the vendor reports, yet the corresponding CMMS records were clean; there were no notes as to the defects found by the vendors!  I asked the compliance manager (who was participating in the audit) why the CMMS tickets didn’t mention these serious defects, and he answered that it was their policy not to record any defects in the CMMS, because it would make them “late” (AKA look bad).  In a later meeting attended by the IT clients- who simply couldn’t believe what I told them- he repeated this statement.  I later discovered that this policy is in effect for their entire global portfolio.


Because CPAs cannot intelligently read vendor reports, they simply rely on the CMMS records, where defects are supposed to be documented, and tracked to resolution. When the colo vendor intentionally hides those defects, the SOC-2 auditors are faced with a "clean" document- no defects, as a matter of policy. Thus, they always get a clean SOC-2 certificate every year.
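The cross-check described above, comparing what the vendor reported against what the CMMS ticket admits, can be sketched in a few lines. This is only an illustration: the record layouts, field names, and the idea of matching on a shared work-order number are all assumptions, since real CMMS exports and vendor reports vary widely and the vendor findings usually have to be extracted by someone who can read the report.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VendorFinding:
    """One defect taken from a vendor maintenance report (hypothetical layout)."""
    work_order: str  # assumed key linking the vendor report to a CMMS ticket
    asset: str
    defect: str


@dataclass
class CmmsTicket:
    """One CMMS ticket (hypothetical layout)."""
    work_order: str
    asset: str
    defects: list[str] = field(default_factory=list)  # defects recorded on the ticket


def unrecorded_defects(findings: list[VendorFinding],
                       tickets: list[CmmsTicket]) -> list[VendorFinding]:
    """Return vendor findings whose CMMS ticket is missing or lists no defects.

    This flags the "clean ticket" pattern: the vendor found something,
    but the system of record the auditors read shows nothing.
    """
    tickets_by_wo = {t.work_order: t for t in tickets}
    flagged = []
    for finding in findings:
        ticket = tickets_by_wo.get(finding.work_order)
        if ticket is None or not ticket.defects:
            flagged.append(finding)
    return flagged


# Illustrative data only.
findings = [
    VendorFinding("WO-1", "GEN-1", "coolant leak"),
    VendorFinding("WO-2", "UPS-1", "weak battery string"),
]
tickets = [
    CmmsTicket("WO-1", "GEN-1"),                                   # clean ticket
    CmmsTicket("WO-2", "UPS-1", defects=["weak battery string"]),  # defect recorded
]
for f in unrecorded_defects(findings, tickets):
    print(f"{f.work_order} {f.asset}: '{f.defect}' not recorded in CMMS")
```

The check is deliberately coarse: it does not try to match defect wording, only whether a ticket with vendor-reported defects shows none at all.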


He Who Pays the Piper Calls the Tune


SOC-2 audits are a product bought and paid for by colocation companies as a cost of doing business.


They are not paid for directly by the enterprise IT organizations, who are merely renters. 


This results in an obvious conflict of interest; SOC-2 auditors must “deliver the goods,” or they’ll not be in business for long.

Indeed, if word gets out to the colocation industry that a particular CPA firm is more stringent in its SOC-2 audit practices, it’ll be relegated to competing against H&R Block for tax returns.


Bad-Actor SOC-2 Auditors


There should not be a problem with “bad actors” in the CPA space, as they have a code of ethics they’re required to adhere to.  But unfortunately, like lawyers, there are indeed bad actors, and I’ve seen it first-hand.


One of the top-10 data center providers in the world released its latest SOC-2 certification with no exceptions noted regarding its preventative maintenance program. The compliance manager of the data center provider attested to the auditing company that there were no known issues and that all controls were in place for the period in question, with no deviations- exactly what you would expect to see from one of the biggest data center companies in the world.


The problem is that I have been auditing them for years. During the period of this audit certificate, the company had a variety of critical “misses” in engineering management regarding preventative maintenance. 


Finally, in a dispute with the commercial management company chosen to acquire and manage the talent maintaining the data centers, they lost six months of engineering records and were able to reconstruct- partially- only three months of records.  The other three months of records were irretrievably lost.  In other words, half of the records needed to prove that availability was being assured, were missing or incomplete.


The logical question is this: how could the CPA firm certify them as passing the audit when literally half the required records were lost? 


The answer: They could NOT.  The best the CPA firm could say in a legitimate audit was something to the effect that there were not enough records to reach a determination.


Thus, the SOC-2 audit certification from this particular CPA company was fraudulent.


Just as the disillusioned SOC-2 auditor had described, they simply copied information from the previous year, updated some numbers, and then signed off.  And the compliance manager intentionally hid the problems from the auditors.


Case Study


Rather than repeat myself, I will refer you to my previous essay, Beyond Compliance: Vanguard’s Journey to Perfect Uptime, which details the audit program I developed and deployed for Vanguard Financial.  The result of that program was zero downtime for the entire global portfolio in terms of availability for 6 ½ years. 


Some quick takeaways from that essay:


  • Of the dozens of defects I discovered that represented a significant risk to IT clients, NONE were ever disclosed to those stakeholders; they were only discovered by my audits.

  • Whenever defects were discovered that represented a serious risk to ongoing operations, colocation provider representatives would almost always attempt to downplay the risk to the stakeholders, even to the point of mendaciousness.

  • Every colocation provider I audited was SOC-2 “certified.”

  • The best-managed colocation providers were very open in their audits and were very accommodating about scheduling, sharing of information, etc.  It seemed that the worse the colocation company was in keeping their equipment properly maintained, the more they relied on the SOC-2 and did their best to stave off any substantive audits.


When compared to the probabilities of downtime as published in the Uptime Institute 2023 outage report, where 60% of respondents had suffered an outage in the previous three years, the likelihood of the portfolio under my audit program NOT suffering an outage by chance was 2.04 x 10^-126.  In other words, even measured against other critical facilities management professionals, my audit program delivered results that should have been statistically impossible, yet that is exactly what happened.
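For readers who want to see how a figure of this kind can arise, here is a rough sketch of the arithmetic. It is not the calculation behind the number quoted above: the essay does not state the number of sites or the method used, so the site count below is purely hypothetical, and the model assumes every site fails independently at the survey's three-year rate, which is a strong simplification. The computation is done in log10 space because probabilities this small are easier to read as exponents.

```python
import math


def log10_prob_no_outage(p_outage_3yr: float, years: float, sites: int) -> float:
    """log10 of the chance that no site has an outage over the whole period.

    Assumptions (not from the source): each site independently avoids an
    outage in any three-year window with probability 1 - p_outage_3yr, and
    that rate scales geometrically with the length of the period.
    """
    p_clean_3yr = 1.0 - p_outage_3yr
    per_site = (years / 3.0) * math.log10(p_clean_3yr)  # one site, whole period
    return sites * per_site                             # all sites, independent


# 60% three-year outage rate and 6.5 years come from the text;
# the site count of 100 is a made-up placeholder.
exponent = log10_prob_no_outage(p_outage_3yr=0.60, years=6.5, sites=100)
print(f"P(no outage anywhere) is about 10^{exponent:.1f}")
```

Even with a modest hypothetical site count, the exponent becomes very negative very quickly, which is the point the comparison is making.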


Amerruss LLC: Your Beacon in the Fog of Data Center Uncertainty


While the financial benefits derived from using the services of colocation providers are undeniable, the downside is that you now have a lack of agency. 


You’ve put your destiny in the hands of someone else, who often is not worthy of such trust.   Enterprise IT organizations have no means to validate their interests are being protected, and that they’re getting what they’re paying for. 

When I fully understood the deceptive nature of the SOC-2 certifications and the implications, I was staggered: every major company, in every business sector, is at risk.


I therefore formed Amerruss LLC to correct the lack of agency that endangers all companies and their customers- all of the US.

 

AMERRUSS:  Restoring Agency To Your Company


Amerruss LLC is the only company in the world uniquely positioned to restore agency to your enterprise IT organization, so you can be assured of having truly resilient colocation spaces in which to base your IT assets.

Let me be clear: We offer the only proven availability assurance audit program in the world, bar NONE. 

We achieve incredible results by performing the following steps:


  1. Establish open lines of communication between colocation vendor and IT client, so that critical facilities maintenance can be discussed openly.

  2. Perform a deep-dive analysis of all technical assets supporting the IT client, whether the client is located in a small cage or leases an entire building, to have a road map of what assets support the client’s equipment.  If this is a new facility, that would also include reviewing all commissioning documentation to be certain that there are no hidden defects from construction.

  3. Document the findings of step #2 in basic diagrams, to be included with the audit results, so clients can easily understand how any defects detected fit into the holistic picture.

  4. Verify the maintenance calendars match industry standards, and then monitor the adherence of the colocation provider to that calendar.  Any significant deviations require investigation.  Trends of deferment would be defects, to be investigated.

  5. Perform a deep dive of sufficient maintenance records of critical facilities assets to achieve at least a 95% confidence level (as a practical matter, it's FAR better than 95%).  This includes reading CMMS reports and all vendor reports, and examining the recorded data, to derive underlying trends in equipment health.

  6. Investigate and document any defects noted.  All defects that introduce real risk to the

  7. Follow up on any defects from past audits that needed correction.

  8. Report the results to the client- and the colocation vendor, to maintain full transparency.

  9. Repeat quarterly. Subsequent audits are much faster, and the costs are therefore nominal, since all core data is developed during the first audit cycle.  This offers clients as close to real-time feedback as possible, keeping a pulse on ongoing maintenance operations and VERY quickly exposing any new risks that develop.
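Step 5 above talks about reviewing enough records to reach at least 95% confidence. The essay does not say how that sample is sized, so the following is only one common way to estimate it: Cochran's sample-size formula with a finite population correction. The z-value of 1.96 (95% confidence), the 5% margin of error, and the worst-case proportion of 0.5 are all assumptions chosen for illustration, not parameters of the Amerruss program.

```python
import math


def sample_size(population: int, z: float = 1.96,
                margin: float = 0.05, p: float = 0.5) -> int:
    """Records to review out of `population` for the given confidence and margin.

    Uses Cochran's formula, n0 = z^2 * p * (1 - p) / margin^2, followed by
    the finite population correction, n = n0 / (1 + (n0 - 1) / population).
    """
    if population <= 0:
        return 0
    n0 = (z ** 2) * p * (1.0 - p) / (margin ** 2)
    n = n0 / (1.0 + (n0 - 1.0) / population)
    return min(population, math.ceil(n))


# Hypothetical example: a site with 400 maintenance records in the audit period.
print(sample_size(400))
```

For small record sets the formula quickly approaches "review everything," which fits the note in step 5 that the practical coverage ends up far better than 95%.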


It is important to understand that any defects found are contractually the responsibility of the colocation vendor to mitigate, as the colocation contracts always stipulate the equipment will be maintained under industry practices and manufacturer recommendations.  Thus, all repair costs are the responsibility of the colo vendor, NOT the IT client.


Take the Helm: Partner with Amerruss for Data Center Excellence


With our audit program, you no longer need to continually expand your IT portfolio to more and more colocation facilities in a redundancy “arms race,” and you can save millions by having our audit program resolutely monitor the equipment that keeps your company safe and secure.


You won’t lose sleep at night, wondering if your company is exposed to hidden availability risks; we keep a sharp eye on things for you.


With increasing risks to the electric grid, you can no longer rely on an ersatz certificate like the SOC-2; you need to know the critical facilities components supporting your business are being properly maintained, so that WHEN utility interruptions occur, you’re sailing through them without issue.


Visit us at Amerruss LLC, and sleep better at night.



