2023 Frontier Technology of Artificial Intelligence — Machine Learning Observability KellyOnTech

Jan 13, 2023


In 2022, North American early-stage venture capital firms favored projects in cloud infrastructure, network security, and machine learning, including the generative artificial intelligence I introduced earlier.

Today I am introducing a cutting-edge artificial intelligence technology for 2023: machine learning observability.

According to recent research, the use of real-time machine learning will increase significantly in the next three years. This trend brings a big challenge. For example, we have all dealt with chatbots whose answers start to feel a little off as the conversation goes on. One of the reasons is a well-known machine learning problem: model drift.

What is Model Drift?

Simply put, model drift is the degradation of a model’s predictive ability as the real-world environment changes. It has many causes, including changes in the digital environment and the resulting changes in the relationships between variables.

There are two main types of model drift:

Concept drift: The relationship between the features and the target variable changes. Concept drift occurs when the function the model learned to map features to the dependent variable is no longer appropriate for the current context. For example, the definition of a lifestyle or entertainment essential has changed over time: 10 years ago it might have been a TV, and now it’s a mobile phone.
Data drift: The underlying distribution of the features changes over time. This can happen for many reasons; for example, the shift in pathological feature values caused by the COVID-19 epidemic is a case of data drift.
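
One common way to put a number on data drift is the Population Stability Index (PSI), which compares how a single feature is distributed in a reference sample (for example, the training data) and in a recent sample. The sketch below is only an illustration, not a method described in this article: the function name, the 10 equal-width bins, and the 1e-6 floor for empty bins are all assumptions.

```python
import math


def population_stability_index(reference, current, bins=10):
    """PSI between two non-empty samples of one numeric feature (hypothetical helper)."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bucket_shares(reference)
    cur = bucket_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Illustrative data: the same feature before and after its values shift upward.
reference = [i / 100 for i in range(1000)]
shifted = [v + 3 for v in reference]

print(population_stability_index(reference, reference))  # identical samples give 0
print(population_stability_index(reference, shifted))    # a clear shift gives a large value
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on the feature and the application.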

Why Does the Model Drift?

NLP (natural language processing) algorithms are often used for spam filtering. Based on keywords extracted from emails, each email is classified as spam or non-spam to protect users from spam attacks.

Take the lottery-winning spam we’ve all received. The model learns feature words such as “very high winning amount” and “lottery draw” to recognize this kind of spam. However, spam keeps evolving and introducing new elements, such as emails notifying you that your membership has been terminated before its expiry date and that you need to click a button to appeal. The model has never seen these emails before, so its performance degrades.
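
The effect can be reproduced with a toy keyword filter. This is a deliberately simplified sketch, not how a production NLP spam filter works: the keyword set, the scoring rule, the 0.15 threshold, and both example emails are assumptions made up for illustration.

```python
# Hypothetical vocabulary "learned" from older lottery-style spam.
SPAM_KEYWORDS = {"lottery", "winning", "prize", "jackpot"}


def spam_score(text, keywords=SPAM_KEYWORDS):
    """Fraction of the words in the text that are known spam keywords."""
    words = [w.strip(".,!?:;\"'").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in keywords for w in words) / len(words)


def is_spam(text, threshold=0.15):
    return spam_score(text) >= threshold


old_spam = "Congratulations! You hold the winning lottery ticket, claim your prize"
new_spam = "Your membership has been terminated early, click the button to appeal"

print(is_spam(old_spam))  # True: 3 of 10 words are known keywords
print(is_spam(new_spam))  # False: none of the learned keywords appear
```

The filter has not changed between the two calls; the spam has. That gap between what the model learned and what it now sees is the drift described above.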

What Can Go Wrong with Machine Learning Models?

When you deploy a machine learning model, you may find that it performs worse than it did in offline validation. This type of problem is often referred to as training/serving skew.

One reason is the model drift discussed above. The distribution of the data a machine learning model is exposed to may change over time, which is often referred to as data drift or feature drift. The drift may be gradual or may occur overnight, and it degrades model performance.

Another reason is messy data, especially low-fidelity data sources. In the real world, it is difficult to guarantee the quality or freshness of data, because data changes over time. If an external data source is introduced, its reliability is even more questionable. In research labs, by contrast, thousands of hours are often spent creating high-quality datasets with minimal noise and accurate labels.

What is Machine Learning Observability?

Machine learning observability is the practice of gaining insight into model performance at all stages of the model development cycle. It enables machine learning practitioners to optimize models by finding the root cause of why a model is behaving in a certain way during model building, post-deployment, and in the long-term production lifecycle.

Image source: Arize AI. ML Engineer Lifecycle.

The key metrics for machine learning observability measure how quickly problems are discovered and resolved.

1. Detection time: The first key goal of machine learning observability is to surface possible problems with a model in a timely manner. A good observability solution reduces the time required to detect them, so that practitioners can fix problems before enterprise customers notice.

2. Time to resolution: Once a problem is detected, this measures how quickly the observability tool helps the machine learning team find the root cause so it can be fixed. A good solution guides the model owner to understand changes in the input data distribution, feature transformations, or model prediction expectations, and suggests fixes.
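
Both metrics can be computed from a simple incident log. The sketch below assumes each incident records three timestamps; the `Incident` class, its field names, and the example dates are hypothetical and do not come from any particular observability tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """One model problem; the field names are illustrative assumptions."""
    started: datetime   # when the model began misbehaving
    detected: datetime  # when monitoring flagged the problem
    resolved: datetime  # when the root cause was fixed


def mean_detection_time(incidents):
    """Average delay between a problem starting and being detected."""
    total = sum((i.detected - i.started for i in incidents), timedelta())
    return total / len(incidents)


def mean_time_to_resolution(incidents):
    """Average delay between detection and the fix."""
    total = sum((i.resolved - i.detected for i in incidents), timedelta())
    return total / len(incidents)


# Made-up example log with two incidents.
incidents = [
    Incident(datetime(2023, 1, 2, 9), datetime(2023, 1, 2, 11), datetime(2023, 1, 2, 12)),
    Incident(datetime(2023, 1, 5, 8), datetime(2023, 1, 5, 12), datetime(2023, 1, 5, 15)),
]

print(mean_detection_time(incidents))      # 3:00:00
print(mean_time_to_resolution(incidents))  # 2:00:00
```

Tracking the two numbers separately shows whether a team's bottleneck is noticing problems or diagnosing them.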

Which Company Provides a Machine Learning Observability Platform?

Today I am introducing a relatively new start-up, Arize AI, which was founded in 2020 and is headquartered in Berkeley, California. The company offers a machine learning observability platform that monitors machine learning models and provides analytics and troubleshooting.

The two founders of Arize AI, Jason Lopatecki and Aparna Dhinakaran, have some experiences in common. First, they both received their undergraduate degrees in electrical engineering and computer science from the University of California, Berkeley.

Image source: Arize AI. Co-founders Jason Lopatecki and Aparna Dhinakaran

In addition, they are former colleagues who used to work together at TubeMogul. TubeMogul is an enterprise software company for brand advertising, an end-to-end advertising platform that aims to bridge the gap between traditional TV and digital formats. It was acquired by Adobe for $540 million at the end of 2016.

Jason helped build the company’s machine learning team, and Aparna was a data scientist with a Ph.D. in computer science from Cornell University. She later joined Uber as part of its famed Michelangelo team.

Both founders have personally experienced how time-consuming and laborious it is to build, train, deploy, and deliver machine learning models, and how many issues with real-world performance arise throughout the lifecycle after deployment.

They felt that something fundamental was missing in the MLOps tool chain. So together they decided to start a company focused on bringing transparency and effective performance improvements to machine learning models through Arize’s purpose-built machine learning observability platform.

In September 2022, TCV, a veteran U.S. technology venture capital firm, led a $38 million Series B round of funding for Arize AI.

What Was One of the Biggest Losses Caused by AI Machine Learning Errors in 2022?

In 2022, a company lost about US$110 million due to errors in its machine learning system.

The NYSE-listed company is Unity Software Inc., a San Francisco-based video game software developer. Its product is a platform for creating and manipulating interactive real-time 3D (RT3D) content. The company was founded in Denmark in 2004 as Over the Edge Entertainment and changed its name in 2007.

A glitch in the machine learning model behind its Audience Pinpointer tool caused a severe loss of accuracy. Audience Pinpointer is a machine learning-based ad targeting tool that uses the data Unity has accumulated to help marketers better reach specific audiences.

Are Machine Learning Model Problems Universal?

Do you think this kind of error is an isolated case?

Machine learning model failures have happened to hundreds of companies, and model issues can lurk undiscovered in every industry. According to a recent paper, 47 Fortune 500 companies listed artificial intelligence and machine learning as a risk factor in their most recent annual financial reports, up 20.5% year-over-year.


For more on global cutting-edge technology, China’s technology strategy, and entrepreneurial projects, see the recently published English book “Strategic Development of Technology in China”.
