How IKEA Retail Standardizes Docker Images for Efficient Machine Learning Model Deployment

What do Docker and IKEA Retail have in common? Both companies have changed how products are built, stored, and shipped. In IKEA Retail’s case, they created the market of flat-packed furniture items, which made everything from shipping, warehousing, and delivering their furniture to the end location much easier and more cost effective. This parallels what Docker has done for developers. Docker has changed the way that software is built, shipped, and stored, with Docker Images taking up much less space “shelf” space. 

In this post, contributing authors Karan Honavar and Fernando Dorado Rueda from IKEA Retail walk through their MLOps solution, built with Docker.

V2 banner unlocking the power of production standardizing docker images for efficient ml model deployment

Machine learning (ML) deployment, the act of shifting an ML model from the developmental stage to a live production environment, is paramount to translating complex algorithms into real-world solutions. Yet, this intricate process isn’t without its challenges, including:

  1. Complexity and opacity: With ML models often veiled in complexity, deciphering their logic can be taxing. This obscurity impedes trust and complicates the explanation of decisions to stakeholders.
  2. Adaptation to changing data patterns: The shifting landscape of real-world data can deviate from training sets, causing “concept drift.” Addressing this requires vigilant retraining, an arduous task that wastes time and resources.
  3. Real-time data processing: Handling the deluge of data necessary for accurate predictions can burden systems and impede scalability.
  4. Varied deployment methods: Whether deployed locally, in the cloud, or via web services, each method brings unique challenges, adding layers of complexity to an already intricate procedure.
  5. Security and compliance: Ensuring that ML models align with rigorous regulations, particularly around private information, necessitates a focus on lawful implementation.
  6. Ongoing maintenance and monitoring: The journey doesn’t end with deployment. Constant monitoring is vital to sustain the model’s health and address emerging concerns.

These factors represent substantial obstacles, but they are not insurmountable. We can streamline the journey from the laboratory to the real world by standardizing Docker images for efficient ML model deployment. 

This article will delve into the creation, measurement, deployment, and interaction with Dockerized ML models. We will demystify the complexities and demonstrate how Docker can catalyze cutting-edge concepts into tangible benefits.

Standardization deployment process via Docker

In the dynamic realm of today’s data-driven enterprises, such as our case at IKEA Retail, the multitude of tools and deployment strategies serves both as a boon and a burden. Innovation thrives, but so too does complexity, giving rise to inconsistency and delays. The antidote? Standardization. It’s more than just a buzzword; it’s a method to pave the way to efficiency, compliance, and seamless integration.

Enter Docker, the unsung hero in this narrative. In the evolving field of ML deployment, Docker offers agility and uniformity. It has reshaped the landscape by offering a consistent environment from development to production. The beauty of Docker lies in its containerization technology, enabling developers to wrap up an application with all the parts it needs, such as libraries and other dependencies, and ship it all out as one package.

At IKEA Retail, diverse teams — including hybrid data scientist teams and R&D units — conceptualize and develop models, each selecting drivers and packaging libraries according to their preferences and requirements. Although virtual environments provide a certain level of support, they can also present compatibility challenges when transitioning to a production environment. 

This is where Docker becomes an essential tool in our daily operations, offering simplification and a marked acceleration in the development and deployment process. Here are key advantages:

  • Portability: With Docker, the friction between different computing environments melts away. A container runs uniformly, regardless of where it’s deployed, bringing robustness to the entire pipeline.
  • Efficiency: Docker’s lightweight nature ensures that resources are optimally utilized, thereby reducing overheads and boosting performance.
  • Scalability: With Docker, scaling your application or ML models horizontally becomes a breeze. It aligns perfectly with the orchestrated symphony that large-scale deployment demands.

Then, there’s Seldon-Core, a solution chosen by IKEA Retail’s forward-thinking MLOps (machine learning operations) team. Why? Because it transforms ML models into production-ready microservices, regardless of the model’s origin (TensorFlow, PyTorch, H2O, etc.) or language (Python, Java, etc.). But that’s not all. Seldon-Core scales precisely, enabling everything from advanced metrics and logging to explainers and A/B testing.

This combination of Docker and Seldon-Core forms the heart of our exploration today. Together, they sketch the blueprint for a revolution in ML deployment. This synergy is no mere technical alliance; it’s a transformative collaboration that redefines deploying, monitoring, and interacting with ML models.

Through the looking glass of IKEA Retail’s experience, we’ll unearth how this robust duo — Docker and Seldon-Core — can turn a convoluted task into a streamlined, agile operation and how you can harness real-time metrics for profound insights.

Dive into this new MLOps era with us. Unlock efficiency, scalability, and a strategic advantage in ML production. Your innovation journey begins here, with Docker and Seldon-Core leading the way. This is more than a solution; it’s a paradigm shift.

In the rest of this article, we will cover deployment steps, including model preparation, encapsulating the model into an Docker image, and testing. Let’s get started.

Prerequisites

The following items must be present to replicate this example:

  • Docker: Ensure Docker is up and running, easily achievable through solutions like Docker Desktop
  • Python: Have a local installation at the ready (+3.7)

Model preparation

Model training and simple evaluation

Embarking on the journey to deploying an ML model is much like crafting a masterpiece: The canvas must be prepared, and every brushstroke must be deliberate. However, the focus of this exploration isn’t the art itself but rather the frame that holds it — the standardization of ML models, regardless of their creation or the frameworks used.

The primary objective of this demonstration is not to augment the model’s performance but rather to elucidate the seamless transition from local development to production deployment. It is imperative to note that the methodology we present is universally applicable across different models and frameworks. Therefore, we have chosen a straightforward model as a representative example. This choice is intentional, allowing readers to concentrate on the underlying process flows, which can be readily adapted to more sophisticated models that may require refined hyperparameter tuning and meticulous model selection. 

By focusing on these foundational principles, we aim to provide a versatile and accessible guide that transcends the specificities of individual models or use cases. Let’s delve into this process.

To align with our ethos of transparency and consumer privacy and to facilitate your engagement with this approach, a public dataset is employed for a binary classification task.

In the following code excerpt, you’ll find the essence of our training approach, reflecting how we transform raw data into a model ready for real-world challenges:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
import os
import pickle
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
 
# Load the breast cancer dataset
X, y = datasets.load_breast_cancer(return_X_y=True)
 
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.9, random_state=0)
 
# Combine X_test and y_test into a single DataFrame
X_test_df = pd.DataFrame(X_test, columns=[f"feature_{i}" for i in range(X_test.shape[1])])
y_test_df = pd.DataFrame(y_test, columns=["target"])
 
df_test = pd.concat([X_test_df, y_test_df], axis=1)
 
# Define the path to store models
model_path = "models/"
 
# Create the folder if it doesn't exist
if not os.path.exists(model_path):
    os.makedirs(model_path)
 
# Define a list of classifier parameters
parameters = [
    {"clf": LogisticRegression(solver="liblinear", multi_class="ovr"), "name": f"{model_path}/binary-lr.joblib"},
    {"clf": Perceptron(eta0=0.1, random_state=0), "name": f"{model_path}/binary-percept.joblib"},
]
 
# Iterate through each parameter configuration
for param in parameters:
    clf = param["clf"# Retrieve the classifier from the parameter dictionary
    clf.fit(X_train, y_train)  # Fit the classifier on the training data
    # Save the trained model to a file using pickle
    model_filename = f"{param['name']}"
    with open(model_filename, 'wb') as model_file:
        pickle.dump(clf, model_file)
    print(f"Model saved in {model_filename}")
 
# Simple Model Evaluation
model_path = 'models/binary-lr.joblib'
with open(model_path, 'rb') as model_file:
    loaded_model = pickle.load(model_file)
 
# Make predictions using the loaded model
predictions = loaded_model.predict(X_test)
 
# Calculate metrics (accuracy, precision, recall, f1-score)
accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions)
recall = recall_score(y_test, predictions)
f1 = f1_score(y_test, predictions)

Model class creation

With the model files primed, the task at hand shifts to the crafting of the model class — an essential architectural element that will later reside within the Docker image. Like a skilled sculptor, we must shape this class, adhering to the exacting standards proposed by Seldon:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
import joblib
import logging
 
class Score:
    """
    Class to hold metrics for binary classification, including true positives (TP), false positives (FP),
    true negatives (TN), and false negatives (FN).
    """
    def __init__(self, TP=0, FP=0, TN=0, FN=0):
        self.TP = TP  # True Positives
        self.FP = FP  # False Positives
        self.TN = TN  # True Negatives
        self.FN = FN  # False Negatives
 
 
class DockerModel:
    """
    Class for loading and predicting using a pre-trained model, handling feedback to update metrics,
    and providing those metrics.
    """
    result = {}  # Dictionary to store input data
 
    def __init__(self, model_name="models/binary-lr.joblib"):
        """
        Initialize DockerModel with metrics and model name.
        :param model_name: Path to the pre-trained model.
        """
        self.scores = Score(0, 0, 0, 0)
        self.loaded = False
        self.model_name = model_name
 
    def load(self):
        """
        Load the model from the provided path.
        """
        self.model = joblib.load(self.model_name)
        logging.info(f"Model {self.model_name} Loaded")
 
    def predict(self, X, features_names=None, meta=None):
        """
        Predict the target using the loaded model.
        :param X: Features for prediction.
        :param features_names: Names of the features, optional.
        :param meta: Additional metadata, optional.
        :return: Predicted target values.
        """
        self.result['shape_input_data'] = str(X.shape)
        logging.info(f"Received request: {X}")
        if not self.loaded:
            self.load()
            self.loaded = True
        predictions = self.model.predict(X)
        return predictions
 
    def send_feedback(self, features, feature_names, reward, truth, routing=""):
        """
        Provide feedback on predictions and update the metrics.
        :param features: Features used for prediction.
        :param feature_names: Names of the features.
        :param reward: Reward signal, not used in this context.
        :param truth: Ground truth target values.
        :param routing: Routing information, optional.
        :return: Empty list as return value is not used.
        """
        predicted = self.predict(features)
        logging.info(f"Predicted: {predicted[0]}, Truth: {truth[0]}")
        if int(truth[0]) == 1:
            if int(predicted[0]) == int(truth[0]):
                self.scores.TP += 1
            else:
                self.scores.FN += 1
        else:
            if int(predicted[0]) == int(truth[0]):
                self.scores.TN += 1
            else:
                self.scores.FP += 1
        return []  # Ignore return statement as its not used
 
    def calculate_metrics(self):
        """
        Calculate the accuracy, precision, recall, and F1-score.
        :return: accuracy, precision, recall, f1_score
        """
        total_samples = self.scores.TP + self.scores.TN + self.scores.FP + self.scores.FN
 
        # Check if there are any samples to avoid division by zero
        if total_samples == 0:
            logging.warning("No samples available to calculate metrics.")
            return 0, 0, 0, 0  # Return zeros for all metrics if no samples
 
        accuracy = (self.scores.TP + self.scores.TN) / total_samples
 
        # Check if there are any positive predictions to calculate precision
        positive_predictions = self.scores.TP + self.scores.FP
        precision = self.scores.TP / positive_predictions if positive_predictions != 0 else 0
 
        # Check if there are any actual positives to calculate recall
        actual_positives = self.scores.TP + self.scores.FN
        recall = self.scores.TP / actual_positives if actual_positives != 0 else 0
 
        # Check if precision and recall are non-zero to calculate F1-score
        if precision + recall == 0:
            f1_score = 0
        else:
            f1_score = 2 * (precision * recall) / (precision + recall)
 
        # Return the calculated metrics
        return accuracy, precision, recall, f1_score
 
 
    def metrics(self):
        """
        Generate metrics for monitoring.
        :return: List of dictionaries containing accuracy, precision, recall, and f1_score.
        """
        accuracy, precision, recall, f1_score = self.calculate_metrics()
        return [
            {"type": "GAUGE", "key": "accuracy", "value": accuracy},
            {"type": "GAUGE", "key": "precision", "value": precision},
            {"type": "GAUGE", "key": "recall", "value": recall},
            {"type": "GAUGE", "key": "f1_score", "value": f1_score},
        ]
         
    def tags(self):
        """
        Retrieve metadata when generating predictions
        :return: Dictionary the intermediate information
        """
        return self.result

Let’s delve into the details of the functions and classes within the DockerModel class that encapsulates these four essential aspects:

  1. Loading and predicting:
    • load(): This function is responsible for importing the pretrained model from the provided path. It’s usually called internally before making predictions to ensure the model is available.
    • predict(X, features_names=None, meta=None): This function deploys the loaded model to make predictions. It takes in the input features X, optional features_names, and optional metadata meta, returning the predicted target values.
  2. Feedback handling:
    • send_feedback(features, feature_names, reward, truth, routing=""): This function is vital in adapting the model to real-world feedback. It accepts the input data, truth values, and other parameters to assess the model’s performance. The feedback updates the model’s understanding, and the metrics are calculated and stored for real-time analysis. This facilitates continuous retraining of the model.
  3. Metrics calculation:
    • calculate_metrics(): This function calculates the essential metrics of accuracy, precision, recall, and F1-score. These metrics provide quantitative insights into the model’s performance, enabling constant monitoring and potential improvement.
    • Score class: This auxiliary class is used within the DockerModel to hold metrics for binary classification, including true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). It helps keep track of these parameters, which are vital for calculating the aforementioned metrics.
  4. Monitoring assistance:
    • metrics(): This function generates the metrics for model monitoring. It returns a list of dictionaries containing the calculated accuracy, precision, recall, and F1 score. These metrics are compliant with Prometheus Metrics, facilitating real-time monitoring and analysis.
    • tags(): This function is designed to retrieve custom metadata data when generating predictions, aiding in monitoring and debugging. It returns a dictionary, which can help track and understand the nature of the requests.

Together, these functions and classes form a cohesive and robust structure that supports the entire lifecycle of an ML model. From the moment of inception (loading and predicting) through its growth (feedback handling) and evaluation (metrics calculation), to its continuous vigilance (monitoring assistance), the architecture is designed to standardize and streamline the process of deploying and maintaining ML models in a real-world production environment.

This model class is more than code; it’s the vessel that carries our ML model from a local environment to the vast sea of production. It’s the vehicle for standardization, unlocking efficiency and consistency in deploying models.

At this stage, we’ve prepared the canvas and outlined the masterpiece. Now, it’s time to dive deeper and explore how this model is encapsulated into a Docker image, an adventure that blends technology and strategy to redefine ML deployment. 

Testing model locally

Before venturing into creating a Docker image, testing the model locally is vital. This step acts as a rehearsal before the main event, providing a chance to ensure that the model is performing as expected with the testing data.

The importance of local testing lies in its ability to catch issues early, avoiding potential complications later in the deployment process. Following the example code provided below, it can confirm that the model is ready for its next phase if it provides the expected prediction in the expected format:

1
2
3
from DockerModel import DockerModel
demoModel = DockerModel()
demoModel.predict(X_test) # Can take the entire testing dataset or individual predictions

The expected output should match the format of the class labels you anticipate from the model. If everything works correctly, you’re assured that the model is well prepared for the next grand step: encapsulation within a Docker image.

Local testing is more than a technical process; it’s a quality assurance measure that stands as a gatekeeper, ensuring that only a well-prepared model moves forward. It illustrates the meticulous care taken in the deployment process, reflecting a commitment to excellence that transcends code and resonates with the core values of standardization and efficiency.

With the local testing accomplished, we stand on the threshold of a new frontier: creating the Docker image. Let’s continue this exciting journey, knowing each step is a stride toward innovation and mastery in ML deployment.

Encapsulating the model into a Docker image

In our IKEA Retail MLOps view, a model is not simply a collection of code. Rather, it is a sophisticated assembly comprising code, dependencies, and ML artifacts, all encapsulated within a versioned and registered Docker image. This composition is carefully designed, reflecting the meticulous planning of the physical infrastructure.

What is Docker’s role in MLOps?

Docker plays a vital role in MLOps, providing a standardized environment that streamlines the transition from development to production:

  • Streamlining deployment: Docker containers encapsulate everything an ML model needs to run, easing the deployment process.
  • Facilitating collaboration: Using Docker, data scientists and engineers can ensure that models and their dependencies remain consistent across different stages of development.
  • Enhancing model reproducibility: Docker provides a uniform environment that enhances the reproducibility of models, a critical aspect in machine learning.
  • Integrating with orchestration tools: Docker can be used with orchestration platforms like Kubernetes, enabling automated deployment, scaling, and management of containerized applications.

Docker and containerization are more than technology tools; they catalyze innovation and efficiency in MLOps. Ensuring consistency, scalability and agility, Docker unlocks new potential and opens the way for a more agile and robust ML deployment process. Whether you are a developer, a data scientist, or an IT professional, understanding Docker is critical to navigating the complex and multifaceted landscape of modern data-driven applications.

Dockerfile creation

Creating a Dockerfile is like sketching the architectural plan of a building. It outlines the instructions for creating a Docker image to run the application in a coherent, isolated environment. This design ensures that the entire model — including its code, dependencies, and unique ML artifacts — is treated as a cohesive entity, aligning with the overarching vision of IKEA Retail’s MLOps approach.

In our case, we have created a Dockerfile with the express purpose of encapsulating not only the code but all the corresponding artifacts of the model. This deliberate design facilitates a smooth transition to production, effectively bridging the gap between development and deployment.

We used the following Dockerfile for this demonstration, which represents a tangible example of how IKEA Retail’s MLOps approach is achieved through thoughtful engineering and strategic implementation.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
# Use an official Python runtime as a parent image.
# Using a slim image for a smaller final size and reduced attack surface.
FROM python:3.9-slim
 
# Set the maintainer label for metadata.
LABEL maintainer="<a href="/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="4c2a293e222d22282328233e2d2823623e3929282d0c25222b272d622f2321">[email protected]</a>"
 
# Set environment variables for a consistent build behavior.
# Disabling the buffer helps to log messages synchronously.
ENV PYTHONUNBUFFERED=1
 
# Set a working directory inside the container to store all our project files.
WORKDIR /app
 
# First, copy the requirements file to leverage Docker's cache for dependencies.
# By doing this first, changes to the code will not invalidate the cached dependencies.
COPY requirements.txt requirements.txt
 
# Install the required packages listed in the requirements file.
# It's a good practice to include the --no-cache-dir flag to prevent the caching of dependencies
# that aren't necessary for executing the application.
RUN pip install --no-cache-dir -r requirements.txt
 
# Copy the rest of the code and model files into the image.
COPY DockerModel.py DockerModel.py
COPY models/    models/
 
# Expose ports that the application will run on.
# Port 5000 for GRPC
# Port 9000 for REST
EXPOSE 5000 9000
 
# Set environment variables used by the application.
ENV MODEL_NAME DockerModel
ENV SERVICE_TYPE MODEL
 
# Change the owner of the directory to user 8888 for security purposes.
# It can prevent unauthorised write access by the application itself.
# Make sure to run the application as this non-root user later if applicable.
RUN chown -R 8888 /app
 
# Use the exec form of CMD so that the application you run will receive UNIX signals.
# This is helpful for graceful shutdown.
# Here we're using seldon-core-microservice to serve the model.
CMD exec seldon-core-microservice $MODEL_NAME --service-type $SERVICE_TYPE

This Dockerfile contains different parts:

  • FROM python:3.9-slim: This line chooses the official Python 3.9 slim image as the parent image. It is favored for its reduced size and attack surface, enhancing both efficiency and security.
  • LABEL maintainer="[email protected]": A metadata label that specifies the maintainer of the image, providing contact information.
  • ENV PYTHONUNBUFFERED=1: Disabling Python’s output buffering ensures that log messages are emitted synchronously, aiding in debugging and log analysis.
  • WORKDIR /app: Sets the working directory inside the container to /app, a centralized location for all project files.
  • COPY requirements.txt requirements.txt: Copies the requirements file into the image. Doing this before copying the rest of the code leverages Docker’s caching mechanism, making future builds faster. This file must contain the “seldon-core” package:
1
2
3
4
5
pandas==1.3.5
requests==2.28.1
numpy==1.20
seldon-core==1.14.1
scikit-learn==1.0.2
  • RUN pip install --no-cache-dir -r requirements.txt: Installs required packages as listed in the requirements file. The flag -no-cache-dir prevents unnecessary caching of dependencies, reducing the image size.
  • COPY DockerModel.py DockerModel.py: Copies the main Python file into the image.
  • COPY models/ models/: Copies the model files into the image.
  • EXPOSE 5000 9000: Exposes ports 5000 (GRPC) and 9000 (REST), allowing communication with the application inside the container.
  • ENV MODEL_NAME DockerModel: Sets the environment variable for the model name.
  • ENV SERVICE_TYPE MODEL: Sets the environment variable for the service type.
  • RUN chown -R 8888 /app: Changes the owner of the directory to user 8888. Running the application as a non-root user helps mitigate the risk of unauthorized write access.
  • CMD exec seldon-core-microservice $MODEL_NAME --service-type $SERVICE_TYPE: Executes the command to start the service using seldon-core-microservice. It also includes the model name and service type as parameters. Using exec ensures the application receives UNIX signals, facilitating graceful shutdown.

Building and pushing Docker image

1. Installing Docker Desktop

If not already installed, Docker Desktop is recommended for this task. Docker Desktop provides a graphical user interface that simplifies the process of building, running, and managing Docker containers. Docker Desktop also supports Kubernetes, offering an easy way to create a local cluster.

2. Navigating to the Project directory

Open a terminal or command prompt.

Navigate to the folder where the Dockerfile and other necessary files are located.

3. Building the Image

Execute the command: docker build . -t docker-model:1.0.0.

  • docker build . instructs Docker to build the image using the current directory (.).
  • -t docker-model:1.0.0 assigns a name (docker-model) and tag (1.0.0) to the image.

The build process will follow the instructions defined in the Dockerfile, creating a Docker image encapsulating the entire environment needed to run the model.

4. Pushing the image

If needed, the image can be pushed to a container registry like Docker Hub, or a private registry within an organization.

For this demonstration, the image is being kept in the local container registry, simplifying the process and removing the need for authentication with an external registry.

Deploy ML model using Docker: Unleash it into the world

Once the Docker image is built, running it is relatively straightforward. Let’s break down this process:

1
docker run --rm --name docker-model -p 9000:9000 docker-model:1.0.0

Components of the command:

  1. docker run: This is the base command to run a Docker container.
  2. -rm: This flag ensures that the Docker container is automatically removed once it’s stopped. It helps keep the environment clean, especially when you run containers for testing or short-lived tasks.
  3. -name docker-model: Assigns a name to the running container.
  4. p 9000:9000: This maps port 9000 on the host machine to port 9000 on the Docker container. The format is p <host_port>:<container_port>. Because the Dockerfile mentions that the application will be exposing ports 5000 for GRPC and 9000 for REST, this command makes sure the REST endpoint is available to external users or applications through port 9000 on the host.
  5. docker-model:1.0.0: This specifies the name and tag of the Docker image to run. docker-model is the name, and 1.0.0 is the version tag we assigned during the build process.

What happens next

On executing the command, Docker will initiate a container instance from the docker-model:1.0.0 image.

The application within the Docker container will start and begin listening for requests on port 9000 (as specified).

With the port mapping, any incoming requests on port 9000 of the host machine will be forwarded to port 9000 of the Docker container.

The application can now be accessed and interacted with as if it were running natively on the host machine.

Test deployed model using Docker

With the Docker image in place, it’s time to see the model in action.

Generate predictions

The path from model to prediction is a delicate process, requiring an understanding of the specific input-output type that Seldon accommodates (e.g., ndarray, JSON data, STRDATA).

In our scenario, the model anticipates an array, and thus, the key in our payload is “ndarray.” Here’s how we orchestrate this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
import requests
import json
 
 
def send_prediction_request(data):
     
    # Create the headers for the request
    headers = {'Content-Type': 'application/json'}
 
    try:
        # Send the POST request
        response = requests.post(URL, headers=headers, json=data)
         
        # Check if the request was successful
        response.raise_for_status() # Will raise HTTPError if the HTTP request returned an unsuccessful status code
         
        # If successful, return the JSON data
        return response.json()
    except requests.ConnectionError:
        raise Exception("Failed to connect to the server. Is it running?")
    except requests.Timeout:
        raise Exception("Request timed out. Please try again later.")
    except requests.RequestException as err:
        # For any other requests exceptions, re-raise it
        raise Exception(f"An error occurred with your request: {err}")
 
X_test
 
# Define the data payload (We can also use X_test[0:1].tolist() instead of the raw array)
data_payload = {
    "data": {
        "ndarray": [
            [
                1.340e+01, 2.052e+01, 8.864e+01, 5.567e+02, 1.106e-01, 1.469e-01,
                1.445e-01, 8.172e-02, 2.116e-01, 7.325e-02, 3.906e-01, 9.306e-01,
                3.093e+00, 3.367e+01, 5.414e-03, 2.265e-02, 3.452e-02, 1.334e-02,
                1.705e-02, 4.005e-03, 1.641e+01, 2.966e+01, 1.133e+02, 8.444e+02,
                1.574e-01, 3.856e-01, 5.106e-01, 2.051e-01, 3.585e-01, 1.109e-01
            ]
        ]
    }
}
 
 
# Get the response and print it
try:
    response = send_prediction_request(data_payload)
    pretty_json_response = json.dumps(response, indent=4
    print(pretty_json_response)
except Exception as err:
    print(err)

The prediction of our model will be similar to this dictionary:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
{
    "data": {
        "names": [],
        "ndarray": [
            0
        ]
    },
    "meta": {
        "metrics": [
            {
                "key": "accuracy",
                "type": "GAUGE",
                "value": 0
            },
            {
                "key": "precision",
                "type": "GAUGE",
                "value": 0
            },
            {
                "key": "recall",
                "type": "GAUGE",
                "value": 0
            },
            {
                "key": "f1_score",
                "type": "GAUGE",
                "value": 0
            }
        ],
        "tags": {
            "shape_input_data": "(1, 30)"
        }
    }
}

The response from the model will contain several keys:

  • "data": Provides the generated output by our model. In our case, it’s the predicted class.
  • "meta": Contains metadata and model metrics. It shows the actual values of the classification metrics, including accuracy, precision, recall, and f1_score.
  • "tags": Contains intermediate metadata. This could include anything you want to track, such as the shape of the input data.

The structure outlined above ensures that not only can we evaluate the final predictions, but we also gain insights into intermediate results. These insights can be instrumental in understanding predictions and debugging any potential issues.

This stage marks a significant milestone in our journey from training a model to deploying and testing it within a Docker container. We’ve seen how to standardize an ML model and how to set it up for real-world predictions. With this foundation, you’re well-equipped to scale, monitor, and further integrate this model into a full-fledged production environment.

Send feedback in real-time and calculate metrics

The provisioned /feedback endpoint facilitates this learning by allowing truth values to be sent back to the model once they are available. As these truth values are received, the model’s metrics are updated and can be scraped by other tools for real-time analysis and monitoring. In the following code snippet, we iterate over the test dataset and send the truth value to the /feedback endpoint, using a POST request:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
import requests
import json
 
 
def send_prediction_feedback(data):
     
    # Create the headers for the request
    headers = {'Content-Type': 'application/json'}
 
    try:
        # Send the POST request
        response = requests.post(URL, headers=headers, json=data)
         
        # Check if the request was successful
        response.raise_for_status() # Will raise HTTPError if the HTTP request returned an unsuccessful status code
         
        # If successful, return the JSON data
        return response.json()
    except requests.ConnectionError:
        raise Exception("Failed to connect to the server. Is it running?")
    except requests.Timeout:
        raise Exception("Request timed out. Please try again later.")
    except requests.RequestException as err:
        # For any other requests exceptions, re-raise it
        raise Exception(f"An error occurred with your request: {err}")
 
 
 
for i in range(len(X_test)):
    payload = {'request': {'data': {'ndarray': [X_test[i].tolist()]}}, 'truth': {'data': {'ndarray': [int(y_test[i])]}}}
 
    # Get the response and print it
    try:
        response = send_prediction_feedback(payload)
        pretty_json_response = json.dumps(response, indent=4# Pretty-print JSON
        print(pretty_json_response)
    except Exception as err:
        print(err)

After processing the feedback, the model calculates and returns key metrics, including accuracy, precision, recall, and F1-score. These metrics are then available for analysis:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
{
    "data": {
        "ndarray": []
    },
    "meta": {
        "metrics": [
            {
                "key": "accuracy",
                "type": "GAUGE",
                "value": 0.92607003
            },
            {
                "key": "precision",
                "type": "GAUGE",
                "value": 0.9528302
            },
            {
                "key": "recall",
                "type": "GAUGE",
                "value": 0.9294478
            },
            {
                "key": "f1_score",
                "type": "GAUGE",
                "value": 0.9409938
            }
        ],
        "tags": {
            "shape_input_data": "(1, 30)"Ω
        }
    }
}

What makes this approach truly powerful is that the model’s evolution is no longer confined to the training phase. Instead, it’s in a continual state of learning, adjustment, and refinement, based on real-world feedback.

This way, we’re not just deploying a static prediction engine but fostering an evolving intelligent system that can better align itself with the changing landscape of data it interprets. It’s a holistic approach to machine learning deployment that encourages continuous improvement and real-time adaptation.

Conclusions

At IKEA Retail, Docker has become an indispensable element in our daily MLOps activities, serving as a catalyst that accelerates the development and deployment of models, especially when transitioning to production. The transformative impact of Docker unfolds through a spectrum of benefits that not only streamlines our workflow but also fortifies it:

  • Standardization: Docker orchestrates a consistent environment during the development and deployment of any ML model, fostering uniformity and coherence across the lifecycle.
  • Compatibility: With support for diverse environments and seamless multi-cloud or on-premise integration, Docker bridges gaps and ensures a harmonious workflow.
  • Isolation: Docker ensures that applications and resources are segregated, offering an isolated environment that prioritizes efficiency and integrity.
  • Security: Beyond mere isolation, Docker amplifies security by completely segregating applications from each other. This robust separation empowers us with precise control over traffic flow and management, laying a strong foundation of trust.

These attributes translate into tangible advantages in our MLOps journey, sculpting a landscape that’s not only innovative but also robust:

  • Agile development and deployment environment: Docker ignites a highly responsive development and deployment environment, enabling seamless creation, updating, and deployment of ML models.
  • Optimized resource utilization: Utilize compute/GPU resources efficiently within a shared model, maximizing performance without compromising flexibility.
  • Scalable deployment: Docker’s architecture allows for the scalable deployment of ML models, adapting effortlessly to growing demands.
  • Smooth release cycles: Integrating seamlessly with our existing CI/CD pipelines, Docker smoothens the model release cycle, ensuring a continuous flow of innovation.
  • Effortless integration with monitoring tools: Docker’s compatibility extends to monitoring stacks like Prometheus + Grafana, creating a cohesive ecosystem fully aligned with our MLOps approach when creating and deploying models in production.

The convergence of these benefits elevates IKEA Retail’s MLOps strategy, transforming it into a symphony of efficiency, security, and innovation. Docker is not merely a tool: Docker is a philosophy that resonates with our pursuit of excellence. Docker is the bridge that connects creativity with reality, and innovation with execution.

In the complex world of ML deployment, we’ve explored a path less trodden but profoundly rewarding. We’ve tapped into the transformative power of standardization, unlocking an agile and responsive way to deploy and engage with ML models in real-time.

But this is not a conclusion; it’s a threshold. New landscapes beckon, brimming with opportunities for growth, exploration, and innovation. The following steps will continue the current approach: 

  • Scaling with Kubernetes: Unleash the colossal potential of Kubernetes, a beacon of flexibility and resilience, guiding you to a horizon of unbounded possibilities.
  • Applying real-time monitoring and alerting systems based on open source technologies, such as Prometheus and Grafana.
  • Connecting a data-drift detector for real-time detection: Deployment and integration of drift detectors to detect changes in data in real-time.

We hope this exploration will empower you to redefine your paths, ignite new ideas, and push the boundaries of what’s possible. The gateway to an extraordinary future is open, and the key is in our hands.

Learn more