How to Set Up Logging with Arize (Step by Step)
In this tutorial, we’re going to set up logging with Arize to ensure our machine learning models are performing as expected. Building logging solutions can seem overwhelming at first, but with structured practices, it becomes manageable — even enjoyable.
Prerequisites
- Python 3.7+
- The `arize` package (`pip install arize`)
- Familiarity with Python's built-in `logging` library
Step 1: Setting Up Your Environment
The first thing you need is a working environment. For most developers, this is straightforward; however, it’s crucial to ensure your dependencies are correctly set up to avoid headaches later on.
```bash
# Create a virtual environment (recommended)
python -m venv arize_logging_env
source arize_logging_env/bin/activate  # On Windows: arize_logging_env\Scripts\activate

# Install the Arize package
pip install arize
```
Why go through this trouble? Virtual environments isolate your project's dependencies, avoiding conflicts with globally installed packages. Trust me: you don't want version conflicts on your hands, especially when you're logging important metrics.
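If you want to confirm the virtual environment is actually active before installing anything, a quick standard-library check will tell you (this is plain stdlib Python, nothing Arize-specific):

```python
import sys

def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the original interpreter.
    return sys.prefix != sys.base_prefix

print("virtual environment active:", in_virtualenv())
```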
Step 2: Import Necessary Libraries
Now that the environment is in place, the next step is importing the necessary libraries. We'll use Python's built-in `logging` library alongside Arize's pandas logging client, plus the helper types the client uses to describe your data. (Exact import paths can shift between SDK versions, so check the Arize docs if these fail for you.)

```python
import logging

import pandas as pd

from arize.pandas.logger import Client
from arize.utils.types import Environments, ModelTypes, Schema
```
You might wonder why we need both logging and Arize’s logging client. The native logging library offers flexibility and configurability, while Arize provides a specialized interface for metrics and model monitoring. Remember, using the right tools for the job makes everything easier.
Step 3: Configure the Logger
With the libraries imported, we need to set up our logger. The configuration will determine how and where logging messages are displayed or stored.
```python
# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler()]
)
logger = logging.getLogger(__name__)
```
Here’s the deal: configuring your logger is just as much about aesthetics as it is about functionality. Choose a log format that works for you. By using `StreamHandler`, you ensure logs appear in your console. This is handy for debugging during the development phase.
Step 4: Connecting the Arize Client
Next, instantiate the Arize client so it can send data to your account. It needs two credentials: a space key and an API key.

```python
# Connect to Arize
arize_client = Client(
    space_key='your_space_key',
    api_key='your_api_key'
)
```

Make sure to replace `your_space_key` and `your_api_key` with the actual values from your Arize account; you'll find them in the Arize settings section. Note that newer SDK versions identify the workspace with `space_id` rather than `space_key`, so if the constructor complains, check Arize's getting-started documentation, where this is laid out nicely.
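Hardcoding credentials is fine for a quick experiment, but for anything shared or deployed, read them from the environment instead. The variable names below are a convention I'm assuming here, not anything Arize mandates:

```python
import os

# Read credentials from environment variables (names are our own convention)
space_key = os.environ.get("ARIZE_SPACE_KEY", "your_space_key")
api_key = os.environ.get("ARIZE_API_KEY", "your_api_key")

# Fail fast if the placeholders were never replaced
if "your_" in space_key or "your_" in api_key:
    print("Warning: Arize credentials look like placeholders")
```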
Step 5: Creating a Log Function
We need a helper that ties the built-in logger and the Arize client together: it sends records to Arize and mirrors the outcome to your console. The pandas client logs DataFrames described by a `Schema`, so the helper assembles one from the inputs. (Names like `ModelTypes.NUMERIC` reflect recent `arize` releases; treat this as a sketch and check the SDK docs for your version.)

```python
def log_to_arize(model_id, model_version, input_data, prediction, actual):
    try:
        df = pd.DataFrame(input_data)  # the pandas client logs DataFrames
        feature_columns = list(df.columns)
        df["prediction"] = prediction
        df["actual"] = actual
        schema = Schema(
            feature_column_names=feature_columns,
            prediction_label_column_name="prediction",
            actual_label_column_name="actual",
        )
        response = arize_client.log(
            dataframe=df,
            schema=schema,
            model_id=model_id,
            model_version=model_version,
            model_type=ModelTypes.NUMERIC,
            environment=Environments.PRODUCTION,
        )
        if response.status_code != 200:
            raise RuntimeError(f"Arize returned status {response.status_code}")
        logger.info("Logging successful for model_id: %s", model_id)
    except Exception as e:
        logger.error("Error logging to Arize: %s", e)
```
What’s important to note here is the try-except block. This is going to save your sanity later on should anything fail during the logging process. It’s the classic fail gracefully approach — don’t just crash and burn. Instead, provide meaningful feedback so you can address the issue swiftly.
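Failing gracefully is good; recovering automatically is better. For transient failures (network blips, brief service hiccups), a small retry helper with exponential backoff goes a long way. This is a generic sketch, not part of the Arize SDK:

```python
import time
import logging

logger = logging.getLogger(__name__)

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("attempt %d failed (%s); retrying in %.1fs",
                           attempt, exc, delay)
            time.sleep(delay)
```

You'd wrap the call site, e.g. `with_retries(lambda: log_to_arize(model_id, model_version, input_data, predictions, actuals))`.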
Step 6: Implement Logging in Your Workflow
Now you can start logging your models at various points in your workflow. For example, when predicting on new data samples, call the `log_to_arize` function to log inputs, predictions, and actual values.
```python
# Sample input data
input_data = [{"feature1": 0.2, "feature2": 0.5}]
predictions = [0.9]
actuals = [1.0]

# Example model identifiers
model_id = "sample_model"
model_version = "v1.0"

# Log the prediction
log_to_arize(model_id, model_version, input_data, predictions, actuals)
```
Logging at prediction time is essential in production environments, where monitoring and evaluating model performance is crucial. Correlating real-time logs with the rest of your pipeline can be challenging, and routing everything through a single helper like this keeps that process consistent.
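One practical way to make that correlation easier is to attach a shared prediction ID to both your application logs and the record you send to Arize. A stdlib sketch:

```python
import uuid
import logging

logger = logging.getLogger(__name__)

def make_prediction_id() -> str:
    # A UUID shared between app logs and the Arize record lets you
    # join the two when debugging a specific prediction later.
    return str(uuid.uuid4())

prediction_id = make_prediction_id()
logger.info("served prediction %s", prediction_id)
```

You'd then include `prediction_id` as a column in the DataFrame you log to Arize.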
The Gotchas
Every developer knows that production environments are filled with pitfalls that aren’t always covered in tutorials. Here are some common issues you’ll likely encounter.
- Log Volume: Logging unbounded data will quickly hit storage and rate limits. Batch your records and cap how much you log per event instead of sending everything.
- Insufficient Permissions: Make sure the API key you’re using has sufficient privileges in Arize to perform logging. You might be surprised how many errors arise from permissions issues.
- Latency Issues: If you’re logging synchronously, especially in high-traffic scenarios, it can introduce latency into your processing pipeline. Consider using asynchronous logging to mitigate this.
Remember, addressing these issues early on can save you from future headaches, especially in critical use cases where performance and reliability are paramount.
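To make the latency point concrete, here is one way to move logging off the hot path with a background worker thread. This is a generic queue-and-worker pattern sketch, not an Arize feature:

```python
import queue
import threading

class AsyncLogWorker:
    """Enqueue records and let a background thread forward them,
    so the request path never waits on the network."""

    def __init__(self, handler):
        self._q = queue.Queue()
        self._handler = handler  # e.g. a function that calls log_to_arize
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, record):
        # Returns immediately; the worker thread does the slow part
        self._q.put(record)

    def _run(self):
        while True:
            record = self._q.get()
            if record is None:  # sentinel: shut down
                break
            self._handler(record)

    def close(self):
        # Flush remaining records, then stop the worker
        self._q.put(None)
        self._thread.join()
```

For example, `worker = AsyncLogWorker(lambda rec: log_to_arize(**rec))` lets a request handler call `worker.submit(...)` and return without waiting.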
Full Code
Alright, after all those steps, here’s a working example all in one place. This will help you tie everything together. Just make sure you replace the placeholder keys with your actual values.
```python
import logging

import pandas as pd

from arize.pandas.logger import Client
from arize.utils.types import Environments, ModelTypes, Schema

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[logging.StreamHandler()]
)
logger = logging.getLogger(__name__)

# Connect to Arize
arize_client = Client(
    space_key='your_space_key',
    api_key='your_api_key'
)

def log_to_arize(model_id, model_version, input_data, prediction, actual):
    try:
        df = pd.DataFrame(input_data)  # the pandas client logs DataFrames
        feature_columns = list(df.columns)
        df["prediction"] = prediction
        df["actual"] = actual
        schema = Schema(
            feature_column_names=feature_columns,
            prediction_label_column_name="prediction",
            actual_label_column_name="actual",
        )
        response = arize_client.log(
            dataframe=df,
            schema=schema,
            model_id=model_id,
            model_version=model_version,
            model_type=ModelTypes.NUMERIC,
            environment=Environments.PRODUCTION,
        )
        if response.status_code != 200:
            raise RuntimeError(f"Arize returned status {response.status_code}")
        logger.info("Logging successful for model_id: %s", model_id)
    except Exception as e:
        logger.error("Error logging to Arize: %s", e)

# Sample input data
input_data = [{"feature1": 0.2, "feature2": 0.5}]
predictions = [0.9]
actuals = [1.0]

# Example model identifiers
model_id = "sample_model"
model_version = "v1.0"

# Log the prediction
log_to_arize(model_id, model_version, input_data, predictions, actuals)
```
What’s Next
Your next step after setting up logging with Arize should be to implement a monitoring dashboard. This allows real-time insight into your model’s performance. You can use a tool like Grafana or Metabase to visualize logs and detect anomalies. This will help you correlate logs with operational metrics, which is essential for keeping your models in check.
FAQ
What should I do if my API key is invalid?
Double-check the API key in your Arize account’s settings. If it still doesn’t work, try regenerating the key or contacting Arize support for help.
How can I log performance metrics aside from predictions?
Arize computes metrics such as accuracy, confusion matrices, and ROC curves from the predictions and actuals you log, so the main thing is to log both consistently. You can also adapt the `log_to_arize` function to send additional columns, such as tags or extra features, alongside your main logs.
Is there a way to test logging without deploying my model?
Absolutely! You can create mock data and call each method independently to ensure that your logging setups perform without needing your model to be operational.
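For instance, you can stand in a mock for the Arize client with the standard library's `unittest.mock` and verify the logging path without any network traffic:

```python
from unittest.mock import MagicMock

# A mock that mimics the client's log() returning an HTTP-like response
mock_client = MagicMock()
mock_client.log.return_value = MagicMock(status_code=200)

# Exercise the code path exactly as production would, but against the mock
response = mock_client.log(model_id="sample_model", model_version="v1.0")

assert mock_client.log.called
assert response.status_code == 200
```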
Recommendations for Different Developer Personas
If you’re a data scientist, focus on getting familiar with the logging features first, especially how they interface with your workflows. Models are supposed to improve, and understanding how to monitor that evolution is crucial.
For software engineers, I’d recommend diving deeper into asynchronous logging methods for high-performance systems. Synchronous logging can slow down processes significantly, which might not be apparent until you’re at scale.
Finally, if you’re a machine learning engineer, prioritize integrating logging into your CI/CD pipeline to ensure that performance metrics are logged every time you deploy a new version. This ensures ongoing visibility into how your model behaves in production.
Data as of March 22, 2026. Sources: Arize Getting Started, Arize Audit Log
Related Articles
- Scaling AI Agents in Production: A Case Study in Automated Customer Support
- AI agent load balancing strategies
- AI Funding Insights: WSJ’s Latest Analysis for AI Startups
🕒 Originally published: March 22, 2026