Classification Under Open Set Conditions#
Author: Nathan A. Mahynski
Date: 2024/09/23
Description: Building classifiers that work in the “open world.”
Conventional classifiers often assume there exists a finite set of known classes.
To train such a model, these classes should be sampled (often evenly) in such a way that during testing / deployment the new observations will come from the same distribution as the training samples (IID). Class balancing and cross-validation are common tools used to handle uncertainty in the latter of these assumptions. However, in the real world there are often many other (possibly infinitely many) classes that are not available at training time which a classifier might encounter when deployed.
This image is from Scheirer et al., “Toward Open Set Recognition” (2012) which originally formalized the open set recognition (OSR) problem:

Note that the “face verification” problem is essentially a one-class authentication problem. This “open set” of possibilities means that a classifier should be able to recognize the known classes seen during training, but also recognize when a test case is “none of the above.” There are a variety of algorithms designed to do this and related tasks. Here are a few references that summarize some ontologies:
Yang et al., Generalized Out-of-Distribution Detection: A Survey
Geng et al., Recent Advances in Open Set Recognition: A Survey
The OSR task refers to when the model should be able to identify known classes and reject unknown ones. By some classification schemes, the process of simply rejecting an input as being from an unknown class is referred to as a “reject” option. OSR tasks are closely related to one-class classifiers (OCC) used for authentication purposes. An OSR-capable model can be constructed by chaining together multiple OCCs, each designed to recognize a single class. Soft PLS-DA is another example of a model capable of handling open-set conditions. Another ad hoc, but general, way to handle an OSR task is to combine an outlier detector with a closed set classifier (blue pathway below). Green and red pathways illustrate “compliant” and “rigorous” OCC training schemes, respectively. The “pipeline” is any composite model, possibly including other preprocessing steps.

The outlier detector illustrated here determines if a sample is out of distribution (OOD), and sends only those in distribution (ID) to the classifier. In this way, the outlier detector determines if the input is coming from a “known” region of parameter space which the classifier should be responsible for. If not, the input is simply assigned to an “unknown” class. Otherwise, the closed-set classifier is assumed to be responsible for identifying the input as one of its known classes. The outlier detector itself may use a variety of different assumptions depending on which detector is used, as may the classifier, but this combination method is very general and can be applied with different sorts of outlier detectors and classification models.
An input point can shift from ID to OOD in two ways:
The first is by semantic shift; this occurs when a new, previously unknown, class appears. For example, a new animal the model did not see during training.
The other is by covariate shift; this occurs when variance in a known class increases, usually due to a new “type” of that class. For example, if we trained a model to recognize cats vs. dogs using high resolution images, but at test time a hand-drawn image of a dog is submitted.
The above workflow will try to catch both, though the main application lies in detecting semantic shifts since we typically attempt to build datasets that have representative examples of all sorts of the known classes. Thus, we try to control covariate shift by carefully preparing the training data; this is, of course, imperfect and not always possible.
We have implemented a basic OpenSetClassifier model in PyChemAuth which we illustrate for a variety of different conditions below. Depending on the classification model being used the performance metric may vary.
The OpenSetClassifier works as follows:
1. Select known classes to use from training data (typically all that are available).
2. Send knowns to outlier detector (OD) to train it.
3. Use fitted OD to predict inliers on training data.
4. Send inlying, known class examples to classifier to train on.
During test time, new samples are first given to the OD; if considered an inlier, then the classifier is called to identify the sample, otherwise it is labeled as “unknown”.
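The training and test-time steps above can be sketched with a toy, standard-library-only model. This is not the PyChemAuth implementation: the `ToyOpenSetClassifier` class, its 1D nearest-neighbor outlier rule, its nearest-centroid classifier, and the data are all hypothetical stand-ins chosen only to make the flow of samples visible.

```python
from statistics import mean


class ToyOpenSetClassifier:
    """Hypothetical 1D illustration: outlier detector (OD) chained to a closed-set classifier."""

    def __init__(self, radius=0.5, k=2, unknown_class="UNKNOWN"):
        self.radius = radius              # Max distance to the k-th nearest training point
        self.k = k                        # k=2 so a training point does not vouch for itself
        self.unknown_class = unknown_class

    def _is_inlier(self, x):
        # Unsupervised OD: a point is an inlier if it sits in a dense part of the training data
        dists = sorted(abs(x - t) for t in self.train_)
        return dists[self.k - 1] <= self.radius

    def fit(self, X, y):
        # Steps 1-2: train the OD on all known samples (labels are not used here)
        self.train_ = list(X)
        # Step 3: use the fitted OD to keep only the inlying training samples
        kept = [(x, label) for x, label in zip(X, y) if self._is_inlier(x)]
        # Step 4: train the closed-set classifier (nearest centroid) on those inliers
        classes = sorted({label for _, label in kept})
        self.centroids_ = {c: mean(x for x, label in kept if label == c) for c in classes}
        return self

    def predict(self, X):
        preds = []
        for x in X:
            if not self._is_inlier(x):
                preds.append(self.unknown_class)  # OD rejects: never reaches the classifier
            else:
                preds.append(min(self.centroids_, key=lambda c: abs(x - self.centroids_[c])))
        return preds


X = [1.0, 1.2, 0.9, 1.1, 3.0, 5.0, 5.1, 4.9, 5.2]  # 3.0 is an isolated training sample
y = ["a", "a", "a", "a", "a", "b", "b", "b", "b"]
model = ToyOpenSetClassifier().fit(X, y)
preds = model.predict([1.05, 5.0, 3.1, 9.0])
print(preds)  # ['a', 'b', 'UNKNOWN', 'UNKNOWN']
```

Without the outlier check, the nearest-centroid rule alone would have assigned 9.0 to class "b"; the chained detector is what turns the unbounded class regions into finite ones.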
Note that if the underlying classification model is capable of detecting outliers (or declining to accept a sample, as in OCC) we should be able to combine those (1) rejected because the test sample belongs to a known alternative (“known unknown”) with those (2) rejected because they are from an unknown alternative class (“unknown unknown”) to compute performance metrics. Sometimes it is useful to keep these separate in order to identify whether a sample was rejected by the model or by the outlier detector. Examples of this are illustrated below.
There are many classification models available in both PyChemAuth and scikit-learn; the latter also offers various outlier detection routines, but there are many more available in compatible libraries such as pyOD.
[1]:
if 'google.colab' in str(get_ipython()):
    !pip install git+https://github.com/mahynski/pychemauth@main
    import os
    os.kill(os.getpid(), 9) # Automatically restart the runtime to reload libraries
[2]:
try:
    import pychemauth
except ImportError:
    raise ImportError("pychemauth not installed")
import matplotlib.pyplot as plt
%matplotlib inline
import watermark
%load_ext watermark
%load_ext autoreload
%autoreload 2
[3]:
import sklearn
import imblearn
import copy
import numpy as np
from pychemauth.classifier.osr import OpenSetClassifier
from pychemauth.classifier.plsda import PLSDA
from pychemauth.manifold.elliptic import EllipticManifold_Model
from pychemauth.classifier import osr
from pychemauth.classifier.simca import SIMCA_Authenticator
from pychemauth.datasets import make_pgaa_images
from pychemauth import utils
from pyts.image import GramianAngularField
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.model_selection import GridSearchCV
[4]:
%watermark -t -m -v --iversions
Python implementation: CPython
Python version : 3.11.4
IPython version : 8.14.0
Compiler : GCC 12.2.0
OS : Linux
Release : 6.5.0-45-generic
Machine : x86_64
Processor : x86_64
CPU cores : 40
Architecture: 64bit
matplotlib: 3.7.2
watermark : 2.4.3
sklearn : 1.3.0
pychemauth: 0.0.0b4
numpy : 1.26.4
imblearn : 0.11.0
Load Example Data#
[5]:
from sklearn.datasets import load_iris as load_data
X, y = load_data(return_X_y=True, as_frame=True)
# Let's turn the indices into names
names = dict(zip(np.arange(3), ['setosa', 'versicolor', 'virginica']))
y = y.apply(lambda x: names[x])
X.head()
[5]:
| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
[6]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X.values,
    y.values,
    shuffle=True,
    random_state=42,
    test_size=0.2,
    stratify=y # It is usually important to balance the test and train set so they have the same fraction of classes
)
[7]:
for class_ in ['setosa', 'versicolor', 'virginica']:
    mask = y_train == class_
    plt.plot(X_train[mask][:,2], X_train[mask][:,3], 'o', label=class_)
plt.xlabel('Column 3 (petal length)')
plt.ylabel('Column 4 (petal width)')
_ = plt.legend(loc='best')
Case 1: Multiclass Hard Model#
Hard models divide the input / latent space into disjoint regions for each known class. By adding an outlier detector (OD), we can effectively turn this discriminator into a class model that uses OD to create finite regions of space in which we should expect the hard model to work correctly.
Recall from our example using PLS-DA, that hard PLS-DA with 3 components (and other hyperparameters below) has a 2D space which can divide the three iris flowers reasonably well. Here, let’s just train with 2 classes to emulate the case where we have an incomplete knowledge of all possible flowers.
[8]:
# Leave out versicolor on purpose
X_train_ = X_train[y_train != 'versicolor']
y_train_ = y_train[y_train != 'versicolor']
Case 1a: Using a Flexible Outlier Detector#
Here we will use an isolation forest to detect outliers. The advantage of this model is that it does not make assumptions about the underlying distribution of the data. This is important if the data is multimodal (for example, if you have multiple classes and each has a different center) so for multiclass problems this is recommended. See this example for an illustration of how isolation forest works.
The main assumption with an isolation forest is that there are dense cluster(s) of data which determine where “inliers” should be located. This is a reasonable assumption for many chemometric authentication problems.
[9]:
osc = OpenSetClassifier(
    clf_model=PLSDA,
    clf_kwargs={
        'n_components':3,
        'alpha':0.05,
        'gamma':0.01,
        'style':"hard",
        'scale_x':True,
    },
    outlier_model=IsolationForest,
    outlier_kwargs={
        'n_estimators':100,
        'max_samples':1.0,
        'contamination': 0.15, # In practice, this is very important to tune correctly
        'max_features':1.0,
        'bootstrap':True,
        'random_state':42,
    },
    score_metric='TEFF',
    clf_style='hard',
    unknown_class="UNKNOWN FLOWER",
)
[10]:
osc.fit(X_train_, y_train_)
[10]:
OpenSetClassifier(clf_kwargs={'alpha': 0.05, 'gamma': 0.01, 'n_components': 3,
'scale_x': True, 'style': 'hard'},
clf_model=<class 'pychemauth.classifier.plsda.PLSDA'>,
outlier_kwargs={'bootstrap': True, 'contamination': 0.15,
'max_features': 1.0, 'max_samples': 1.0,
'n_estimators': 100, 'random_state': 42},
outlier_model=<class 'sklearn.ensemble._iforest.IsolationForest'>,
unknown_class='UNKNOWN FLOWER')
[11]:
osc.predict(X_train_)[:10]
[11]:
['setosa',
'UNKNOWN FLOWER',
'setosa',
'virginica',
'virginica',
'virginica',
'UNKNOWN FLOWER',
'virginica',
'setosa',
'setosa']
[12]:
res = osc.figures_of_merit(osc.predict(X_train_), y_train_)
[13]:
osc.score(X_train_, y_train_)
[13]:
0.9219544457292888
[14]:
res['TEFF']
[14]:
0.9219544457292888
[15]:
res['CM']
[15]:
| | setosa | virginica | UNKNOWN FLOWER |
|---|---|---|---|
| setosa | 35 | 0 | 5 |
| virginica | 0 | 33 | 7 |
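As a worked check, the TEFF value reported above can be reproduced by hand from this confusion matrix. The sketch below assumes that total sensitivity (TSNS) is the fraction of samples assigned to their own class, that total specificity (TSPS) is one minus the fraction assigned to a wrong known class, and that TEFF is their geometric mean; the TSNS and TSPS definitions are assumptions made for illustration and may differ in detail from what PyChemAuth computes.

```python
import math

# Confusion matrix from the training-set evaluation (rows: actual, columns: predicted)
cm = {
    "setosa": {"setosa": 35, "virginica": 0, "UNKNOWN FLOWER": 5},
    "virginica": {"setosa": 0, "virginica": 33, "UNKNOWN FLOWER": 7},
}

total = sum(sum(row.values()) for row in cm.values())  # 80 training samples
correct = sum(cm[c][c] for c in cm)                     # Assigned to their own class
wrong_known = sum(
    n
    for actual, row in cm.items()
    for predicted, n in row.items()
    if predicted in cm and predicted != actual          # Assigned to a different known class
)

tsns = correct / total               # Assumed definition of total sensitivity
tsps = 1.0 - wrong_known / total     # Assumed definition of total specificity
teff = math.sqrt(tsns * tsps)        # Geometric mean of the two
print(tsns, tsps, teff)
```

Under these assumptions every loss in TEFF here comes from known samples being sent to the unknown class, not from confusion between setosa and virginica.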
[16]:
# The isolation forest sets its threshold so that the requested contamination fraction (0.15) of the training data is flagged as unknown.
res['CM']['UNKNOWN FLOWER'].sum() / (40 + 40)
[16]:
0.15
[17]:
# The underlying hard model only recognizes setosa and virginica
_ = osc.fitted_classification_model.visualize(styles=['hard'])
[18]:
# And it can only predict membership to setosa and virginica
print('Inlier?\tPrediction\tActual')
print('------------------------------')
for inlier, pred, act in zip(
    osc.fitted_outlier_model.predict(X_train_)[:10],
    osc.fitted_classification_model.predict(X_train_)[:10],
    y_train_[:10]
):
    print(inlier, '\t', pred, '\t', act)
Inlier? Prediction Actual
------------------------------
1 setosa setosa
-1 virginica virginica
1 setosa setosa
1 virginica virginica
1 virginica virginica
1 virginica virginica
-1 virginica virginica
1 virginica virginica
1 setosa setosa
1 setosa setosa
[19]:
# Clearly nearly all the intervening versicolor (green) will be classified incorrectly as virginica (orange).
fig, ax = plt.subplots(nrows=1, ncols=1)
for i, class_ in enumerate(['setosa', 'virginica', 'versicolor']):
    mask = y_train == class_
    projection = osc.fitted_classification_model.transform(X_train[mask])
    ax.plot(projection.ravel(), [i]*projection.shape[0], 'o', label=class_)
ax = osc.fitted_classification_model.visualize(styles=['hard'], ax=ax, show_training=False)
_ = ax.legend(loc='best')
_ = ax.set_ylim(-1, 3)
[20]:
for i, label in enumerate(['setosa', 'virginica', 'versicolor']):
    mask = y_train == label
    X_ = X_train[mask]
    inlier_ = osc.fitted_outlier_model.predict(X_)
    for marker, alpha, value in zip(['*', 'o'], [0.5, 1.0], [-1, +1]):
        out_mask = inlier_ == value
        plt.plot(X_[out_mask, 2], X_[out_mask, 3], marker=marker, lw=0, alpha=alpha, label=f'{label} {"inlier" if value==1 else "outlier"}', color=f'C{i}')
plt.legend(loc='upper left')
plt.xlabel('Column 3 (petal length)')
plt.ylabel('Column 4 (petal width)')
[20]:
Text(0, 0.5, 'Column 4 (petal width)')
[21]:
# However, the OSC model is able to identify most of the versicolor samples as being from an unknown class!
res = osc.figures_of_merit(osc.predict(X_train), y_train)
res['CM']
[21]:
| | setosa | virginica | UNKNOWN FLOWER |
|---|---|---|---|
| setosa | 35 | 0 | 5 |
| versicolor | 0 | 3 | 37 |
| virginica | 0 | 33 | 7 |
Case 1b: Using an Outlier Detector with Strong Assumptions#
Here we will use an elliptical manifold to determine if a point is an outlier or inlier. This is akin to using the robust covariance approach illustrated here and assumes that the inlying data is all located inside a single “ellipse”. For one-class classification instances, this can be reasonable, but should be avoided for multiclass problems since this will draw an ellipse around multiple class centers, possibly overlapping the region “in between” which makes the model susceptible to issues if there are alternative classes there. In this example data, the versicolor class tends to be “in between” the other 2 and nicely demonstrates the issue.
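The "in between" problem can be seen in one dimension with only the standard library. The sketch below is a deliberate simplification: it uses a plain mean and standard deviation (not the robust MCD estimate or the model's actual threshold), and the petal-length-like numbers and the `z_max` cutoff are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical 1D measurements for two well-separated known classes
setosa_like = [1.3, 1.4, 1.5, 1.4, 1.6]
virginica_like = [5.1, 5.5, 5.8, 6.0, 5.6]
known = setosa_like + virginica_like

# A single "ellipse" in 1D is just an interval centered on the pooled mean
center, spread = mean(known), stdev(known)


def is_inlier(x, z_max=2.0):
    """Inlier if the 1D Mahalanobis distance (a z-score) is within z_max."""
    return abs(x - center) / spread <= z_max


print(center)            # The center falls in the empty gap between the two classes
print(is_inlier(4.2))    # A versicolor-like point in the gap is accepted: True
print(is_inlier(12.0))   # Only points far outside both classes are rejected: False
```

Because the pooled center lands in the gap between the classes, the region that should be most suspicious is treated as the most typical one.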
[22]:
# If we do not provide a dr_model then the EllipticManifold simply fits an ellipse in the original data space.
# For this example, let's just use the last 2 columns so that we will be able to plot the results nicely.
setosa_virginica = EllipticManifold_Model(
    alpha=0.05,
    robust=True, # Estimate the covariance matrix for the Mahalanobis distance using a robust approach (MCD)
    center='score', # Center the ellipse around the empirical mean of the projected data
)
_ = setosa_virginica.fit(
    X_train_[:, 2:],
    ["INLIERS"] * np.sum(y_train != 'versicolor')
)
[23]:
# Clearly, the versicolor falls between the other 2, so this elliptical boundary is not expected to be a good outlier
# detector.
_ = setosa_virginica.visualize(
    [
        X_train[y_train == 'setosa'][:, 2:],
        X_train[y_train == 'versicolor'][:, 2:],
        X_train[y_train == 'virginica'][:, 2:]
    ],
    ['setosa', 'versicolor', 'virginica']
)
[24]:
# Essentially all of the versicolor is considered to be an 'inlier' and will be
# sent to the classifier, which has never seen this class before.
setosa_virginica.predict(X_train[y_train == 'versicolor', 2:])
[24]:
array([ True, True, True, True, True, False, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, False, True, True,
True, True, True, True])
[25]:
osc = OpenSetClassifier(
    clf_model=PLSDA,
    clf_kwargs={
        'n_components':1,
        'alpha':0.05,
        'gamma':0.01,
        'style':"hard",
        'scale_x':True,
    },
    outlier_model=EllipticManifold_Model,
    outlier_kwargs={
        'alpha':0.05,
        'robust':True, # Estimate the covariance matrix for the Mahalanobis distance using a robust approach (MCD)
        'center':'score'
    },
    score_metric='TEFF',
    clf_style='hard',
    unknown_class="UNKNOWN FLOWER",
)
[26]:
osc.fit(X_train_[:, 2:], y_train_)
[26]:
OpenSetClassifier(clf_kwargs={'alpha': 0.05, 'gamma': 0.01, 'n_components': 1,
'scale_x': True, 'style': 'hard'},
clf_model=<class 'pychemauth.classifier.plsda.PLSDA'>,
outlier_kwargs={'alpha': 0.05, 'center': 'score',
'robust': True},
outlier_model=<class 'pychemauth.manifold.elliptic.EllipticManifold_Model'>,
unknown_class='UNKNOWN FLOWER')
[27]:
osc.predict(X_train_[:, 2:])[:10]
[27]:
['setosa',
'virginica',
'setosa',
'virginica',
'virginica',
'virginica',
'UNKNOWN FLOWER',
'UNKNOWN FLOWER',
'setosa',
'setosa']
[28]:
res = osc.figures_of_merit(osc.predict(X_train[:, 2:]), y_train)
[29]:
res['CM']
[29]:
| | setosa | virginica | UNKNOWN FLOWER |
|---|---|---|---|
| setosa | 40 | 0 | 0 |
| versicolor | 4 | 34 | 2 |
| virginica | 0 | 28 | 12 |
Case 2: Multiclass Soft Model#
Soft models can assign a point to multiple (or no) classes, but only those known at training time. For example, soft PLS-DA will learn the \(N\) specific classes known when trained, and learn to predict 1, >1, or none of those specific labels. By adding an outlier detector we might be able to further protect against bad predictions when covariate or semantic shifts occur (at test time) and the model encounters an unusual input; we would hope this would be recognized as entirely novel by the PLS-DA model, but this model’s behavior is biased based on its training data and unexpected predictions can result from unexpected inputs.
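The difference between hard and soft assignment can be sketched in a few lines. This is not how PLS-DA decides membership; the per-class centers, spreads, `z_max` cutoff, and the `hard_assign` / `soft_assign` helpers are hypothetical and serve only to show that a soft rule can return one, several, or none of the known labels while a hard rule always returns exactly one.

```python
# Hypothetical 1D class models: name -> (center, spread)
class_models = {"A": (1.0, 0.5), "B": (2.5, 0.5)}


def hard_assign(x, models):
    """Hard rule: always pick the closest known class, however far away it is."""
    return min(models, key=lambda name: abs(x - models[name][0]))


def soft_assign(x, models, z_max=2.0, not_assigned="NOT ASSIGNED"):
    """Soft rule: accept every class whose bounded region contains x, possibly none."""
    members = [
        name for name, (center, spread) in models.items()
        if abs(x - center) <= z_max * spread
    ]
    return members if members else [not_assigned]


print(soft_assign(1.0, class_models))  # ['A']
print(soft_assign(1.8, class_models))  # ['A', 'B'] (the two regions overlap here)
print(soft_assign(5.0, class_models))  # ['NOT ASSIGNED']
print(hard_assign(5.0, class_models))  # 'B', even though 5.0 is far from both classes
```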
[30]:
osc = OpenSetClassifier(
    clf_model=PLSDA,
    clf_kwargs={
        'n_components':3,
        'alpha':0.05,
        'gamma':0.01,
        'style':"soft",
        'not_assigned':"UNKNOWN FLOWER",
        'scale_x':True,
    },
    outlier_model=IsolationForest,
    outlier_kwargs={
        'n_estimators':100,
        'max_samples':1.0,
        'contamination':0.15, # In practice, this is very important to tune correctly
        'max_features':1.0,
        'bootstrap':True,
        'random_state':42,
    },
    score_metric='TEFF',
    clf_style='soft',
    unknown_class="OUTLIER",
)
[31]:
osc.fit(X_train_, y_train_)
[31]:
OpenSetClassifier(clf_kwargs={'alpha': 0.05, 'gamma': 0.01, 'n_components': 3,
'not_assigned': 'UNKNOWN FLOWER', 'scale_x': True,
'style': 'soft'},
clf_model=<class 'pychemauth.classifier.plsda.PLSDA'>,
clf_style='soft',
outlier_kwargs={'bootstrap': True, 'contamination': 0.15,
'max_features': 1.0, 'max_samples': 1.0,
'n_estimators': 100, 'random_state': 42},
outlier_model=<class 'sklearn.ensemble._iforest.IsolationForest'>,
unknown_class='OUTLIER')
[32]:
_ = osc.fitted_classification_model.visualize(styles=['hard', 'soft'])
[33]:
# The predictions can return either OUTLIER (OD identified this point) or UNKNOWN FLOWER (soft model identified this point)
osc.predict(X_train_[:10])
[33]:
[['setosa'],
['OUTLIER'],
['setosa'],
['virginica'],
['UNKNOWN FLOWER'],
['virginica'],
['OUTLIER'],
['virginica'],
['setosa'],
['setosa']]
[34]:
# For the purpose of computing metrics these are combined under the OSC's label (here, that is OUTLIER)
res = osc.figures_of_merit(osc.predict(X_train_[:10]), y_train_[:10])
res['CM']
[34]:
| | setosa | virginica | OUTLIER |
|---|---|---|---|
| setosa | 4 | 0 | 0 |
| virginica | 0 | 3 | 3 |
[35]:
res = osc.figures_of_merit(osc.predict(X_train_), y_train_)
res['CM']
[35]:
| | setosa | virginica | OUTLIER |
|---|---|---|---|
| setosa | 33 | 0 | 7 |
| virginica | 0 | 32 | 8 |
The difference can be useful if you want to understand which model rejected the point.
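As a small illustration, the first ten predictions shown earlier can be tallied by which component rejected them, and then merged under a single label the way the confusion matrix does. The `source` mapping and variable names below are introduced only for this sketch; the predictions are copied from the output above.

```python
from collections import Counter

# First ten soft predictions, copied from the output shown earlier
preds = [['setosa'], ['OUTLIER'], ['setosa'], ['virginica'], ['UNKNOWN FLOWER'],
         ['virginica'], ['OUTLIER'], ['virginica'], ['setosa'], ['setosa']]

# Which component is responsible for each rejection label
source = {'OUTLIER': 'outlier detector', 'UNKNOWN FLOWER': 'soft PLS-DA'}

# Keep the labels separate to see who rejected what
by_source = Counter(source.get(p[0], 'accepted') for p in preds)
print(by_source)

# Merge both rejection labels under the OSC's label, as is done for the metrics
merged = ['OUTLIER' if p[0] in source else p[0] for p in preds]
print(merged.count('OUTLIER'))  # 3, matching the OUTLIER column of the 10-sample confusion matrix
```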
[36]:
for i, label in enumerate(['setosa', 'virginica', 'versicolor']):
    mask = y_train == label
    X_ = X_train[mask]
    preds_ = osc.predict(X_)
    outlier_mask = np.array([x_[0] == 'OUTLIER' for x_ in preds_]) # Determined to be OOD by the outlier detector
    unknown_mask = np.array([x_[0] == 'UNKNOWN FLOWER' for x_ in preds_]) # PLS-DA determined this was out of its known bounds
    inlier_mask = (~outlier_mask) & (~unknown_mask)
    for mask_, marker_, alpha_, tag_ in zip([inlier_mask, outlier_mask, unknown_mask], ['o', '^', '*'], [0.3, 1.0, 1.0], ['inlier', 'outlier', 'unknown']):
        plt.plot(X_[mask_, 2], X_[mask_, 3], marker=marker_, lw=0, color=f'C{i}', alpha=alpha_, label=f'{label}->{tag_}')
plt.legend(loc='upper left')
plt.xlabel('Column 3 (petal length)')
plt.ylabel('Column 4 (petal width)')
[36]:
Text(0, 0.5, 'Column 4 (petal width)')
[37]:
# Note that the triangular points were caught by the outlier detector, but it is possible that the PLS-DA model would
# have caught them too.
clf_, od_, both_, neither_ = 0, 0, 0, 0
od_correct_ = 0
clf_correct_ = 0
for a, b, label in zip(
    osc.fitted_classification_model.predict(X_train),
    osc.fitted_outlier_model.predict(X_train),
    y_train
):
    clf_caught, od_caught = 0, 0
    if a[0] == 'UNKNOWN FLOWER':
        clf_caught = 1
    if b == -1:
        od_caught = 1
    if (a[0] == 'UNKNOWN FLOWER') and (label == 'versicolor'):
        clf_correct_ += 1
    if (b == -1) and (label == 'versicolor'):
        od_correct_ += 1
    clf_ += clf_caught
    od_ += od_caught
    both_ += (1 if (clf_caught==1 and od_caught==1) else 0)
    neither_ += (1 if (clf_caught==0 and od_caught==0) else 0)
[38]:
# In most cases, the model and the outlier detector both reacted the same way to a point.
clf_, od_, both_, neither_
[38]:
(45, 49, 40, 66)
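These four counts can be cross-checked with inclusion-exclusion. The sketch below only rearranges the numbers printed above; the derived names (`either`, `only_clf`, `only_od`, `agreement`) are introduced here for illustration.

```python
# Counts printed above: rejected by the classifier, by the OD, by both, and by neither
clf_, od_, both_, neither_ = 45, 49, 40, 66

either = clf_ + od_ - both_   # Rejected by at least one component
only_clf = clf_ - both_       # Rejected by soft PLS-DA but passed by the OD
only_od = od_ - both_         # Rejected by the OD but passed by soft PLS-DA
total = either + neither_     # Should equal the 120 training samples (3 classes x 40)

# Fraction of samples on which the two components agree (both reject or both accept)
agreement = (both_ + neither_) / total
print(either, only_clf, only_od, total, round(agreement, 3))  # 54 5 9 120 0.883
```

The two components therefore disagree on only 14 of the 120 samples.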
[39]:
# They each correctly identified a novel point 37/40 times
clf_correct_, od_correct_
[39]:
(37, 37)
[40]:
# The OD rejects 37/40 versicolor points, allowing 3 to get through. However, only 1 was incorrectly
# assigned to a known class (virginica); the other 2 must have been rejected by the soft PLS-DA model. Thus, together they
# make a more specific model than either alone (below the PLS-DA model is shown to fail to reject 2 points).
[41]:
res = osc.figures_of_merit(osc.predict(X_train), y_train)
res['CM']
[41]:
| | setosa | virginica | OUTLIER |
|---|---|---|---|
| setosa | 33 | 0 | 7 |
| versicolor | 0 | 1 | 39 |
| virginica | 0 | 32 | 8 |
[42]:
# The PLS-DA model alone is less specific against the versicolor challenge class. However, it is more sensitive to both
# known classes.
soft_plsda = PLSDA(
    **{
        'n_components':3,
        'alpha':0.05,
        'gamma':0.01,
        'style':"soft",
        'not_assigned':"UNKNOWN FLOWER",
        'scale_x':True,
    }
)
_ = soft_plsda.fit(X_train_, y_train_)
res = soft_plsda.figures_of_merit(soft_plsda.predict(X_train), y_train)
res['CM']
[42]:
| | setosa | virginica | UNKNOWN FLOWER |
|---|---|---|---|
| setosa | 35 | 0 | 5 |
| versicolor | 0 | 2 | 38 |
| virginica | 0 | 38 | 2 |
Case 3: Convert a Binary OvA Discriminator into a Binary Class Model#
Often a single (e.g., authentic) class may be well sampled while many possible alternatives are collected into a single “alternative” class, then a pseudo-binary model is trained in a “one-vs-all” (OvA) fashion. The lack of a well-defined, well-sampled alternative class is one of the main reasons such models are typically inappropriate for authentication applications. However, it is possible to convert a binary OvA discriminator into a binary class model by adding an outlier detector. This assumes that the two classes (authentic vs. known alternatives) are well sampled, so that the outlier detector is simply responsible for learning when covariate or semantic shifts occur and a sample moves out of the region of input space that a classifier is expected to be familiar with. In this sense, we have a model which understands 2 classes: the authentic and the known alternatives, and both class spaces are bounded due to the outlier detector.
By adding an outlier detector, we can at least determine if a new sample looks different from both the known authentic class and the set of known alternatives. These “unknown unknowns” are typically the reason that models fail under open set conditions. The example below is not quite an authenticator since it uses both classes to evaluate the model performance, but we may consider it a class model since the outlier detector will reject any sample that is far away from the learned (finite) class space. Since alternatives are used during training, this is a “compliant” approach rather than a “rigorous” one.
[43]:
# Pretend that setosa is the target class and virginica is the only known alternative when the model is initially trained.
# For example, perhaps the species versicolor has not been discovered yet.
target = 'setosa'
known_alternatives = ['virginica'] # In principle, the training set could contain other known alternatives, too
unknown_alternatives = ['versicolor'] # The list of alternatives encountered at test time that were not available during training
mask = (y_train == target) | np.array([y_ in known_alternatives for y_ in y_train])
[44]:
osc = OpenSetClassifier(
    clf_model=RandomForestClassifier,
    clf_kwargs={
        'n_estimators':100,
        'max_features':'sqrt',
        'random_state':42,
        'class_weight':'balanced'
    },
    outlier_model=IsolationForest,
    outlier_kwargs={
        'n_estimators':100,
        'contamination':0.15, # In practice, this is very important to tune correctly
        'random_state':42,
    },
    inlier_value=1,
    unknown_class='UNKNOWN',
    score_metric='TEFF',
    clf_style='hard',
    score_using="all"
)
[45]:
osc.fit(X_train[mask], y_train[mask])
[45]:
OpenSetClassifier(clf_kwargs={'class_weight': 'balanced',
'max_features': 'sqrt', 'n_estimators': 100,
'random_state': 42},
clf_model=<class 'sklearn.ensemble._forest.RandomForestClassifier'>,
outlier_kwargs={'contamination': 0.15, 'n_estimators': 100,
'random_state': 42},
outlier_model=<class 'sklearn.ensemble._iforest.IsolationForest'>,
unknown_class='UNKNOWN')
[46]:
osc.predict(X_train[mask])[:10]
[46]:
['setosa',
'UNKNOWN',
'setosa',
'virginica',
'virginica',
'virginica',
'UNKNOWN',
'UNKNOWN',
'setosa',
'setosa']
[47]:
res = osc.figures_of_merit(osc.predict(X_train[mask]), y_train[mask])
res
[47]:
{'CM': setosa virginica UNKNOWN
setosa 36 0 4
virginica 0 32 8,
'I': setosa 40
virginica 40
dtype: int64,
'CSNS': setosa 0.9
virginica 0.8
dtype: float64,
'CSPS': setosa 1.0
virginica 1.0
dtype: float64,
'CEFF': setosa 0.948683
virginica 0.894427
dtype: float64,
'TSNS': 0.85,
'TSPS': 1.0,
'TEFF': 0.9219544457292888,
'ACC': 0.85}
[48]:
res['CM']
[48]:
| | setosa | virginica | UNKNOWN |
|---|---|---|---|
| setosa | 36 | 0 | 4 |
| virginica | 0 | 32 | 8 |
[49]:
osc.score(X_train, y_train)
[49]:
0.9142392101268318
[50]:
res = osc.figures_of_merit(osc.predict(X_train), y_train)
[51]:
res['CM']
[51]:
| | setosa | virginica | UNKNOWN |
|---|---|---|---|
| setosa | 36 | 0 | 4 |
| versicolor | 0 | 2 | 38 |
| virginica | 0 | 32 | 8 |
Case 4: Convert a Binary OvA Discriminator into a One-Class Classifier#
If we evaluate the model using one-class model metrics we can consider this a one-class classifier (OCC) instead of a binary one. This effectively still predicts class membership (Y/N) but rejection can come from either the discriminative classifier or the outlier detector. You can analyze the source of the rejections as previously illustrated with the multiclass soft model example.
[52]:
# Re-label training data into authentic vs. alternative - hold out the versicolor for a later test
def convert_y_to_binary_format(y, target, alternative_class='KNOWN ALTERNATIVE'):
    y_binary = y.copy()
    y_binary[y_binary != target] = alternative_class
    return y_binary
y_train_binary = convert_y_to_binary_format(y_train[mask], target)
X_train_binary = X_train[mask]
[53]:
y_train_binary[:10]
[53]:
array(['setosa', 'KNOWN ALTERNATIVE', 'setosa', 'KNOWN ALTERNATIVE',
'KNOWN ALTERNATIVE', 'KNOWN ALTERNATIVE', 'KNOWN ALTERNATIVE',
'KNOWN ALTERNATIVE', 'setosa', 'setosa'], dtype=object)
[54]:
osc = OpenSetClassifier(
    clf_model=RandomForestClassifier,
    clf_kwargs={
        'n_estimators':100,
        'max_features':'sqrt',
        'random_state':42,
        'class_weight':'balanced'
    },
    outlier_model=IsolationForest,
    outlier_kwargs={
        'n_estimators':100,
        'contamination':0.15, # In practice, this is very important to tune correctly
        'random_state':42,
    },
    inlier_value=1,
    unknown_class='UNKNOWN',
    score_metric='TEFF',
    clf_style='hard',
    score_using=target # Now score the model using only the target class as the positive class, others are all negative
)
[55]:
osc.fit(X_train_binary, y_train_binary)
[55]:
OpenSetClassifier(clf_kwargs={'class_weight': 'balanced',
'max_features': 'sqrt', 'n_estimators': 100,
'random_state': 42},
clf_model=<class 'sklearn.ensemble._forest.RandomForestClassifier'>,
outlier_kwargs={'contamination': 0.15, 'n_estimators': 100,
'random_state': 42},
outlier_model=<class 'sklearn.ensemble._iforest.IsolationForest'>,
score_using='setosa', unknown_class='UNKNOWN')
[56]:
osc.score(X_train, y_train)
[56]:
0.9486832980505138
[57]:
# Predict on the entire dataset which also includes versicolor as an unknown unknown
res = osc.figures_of_merit(osc.predict(X_train), y_train)
[58]:
res # The results are now reported as if this is an OCC
[58]:
{'CM': KNOWN ALTERNATIVE setosa UNKNOWN
setosa 0 36 4
versicolor 2 0 38
virginica 32 0 8,
'I': KNOWN ALTERNATIVE 0
setosa 40
versicolor 40
virginica 40
dtype: int64,
'CSPS': {'KNOWN ALTERNATIVE': nan, 'versicolor': 1.0, 'virginica': 1.0},
'TSNS': 0.9,
'TSPS': 1.0,
'TEFF': 0.9486832980505138,
'ACC': 0.9666666666666667}
[59]:
res['CM']
[59]:
| | KNOWN ALTERNATIVE | setosa | UNKNOWN |
|---|---|---|---|
| setosa | 0 | 36 | 4 |
| versicolor | 2 | 0 | 38 |
| virginica | 32 | 0 | 8 |
[60]:
tsns = 36 / 40. # 36 correctly recognized as setosa
tsps = (40 + 40) / (40 + 40.) # All others correctly NOT assigned to setosa
teff = (tsns * tsps)**0.5
tsns, tsps, teff
[60]:
(0.9, 1.0, 0.9486832980505138)
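The reported ACC can be checked the same way. The sketch below assumes that, in the one-class setting, accuracy is the fraction of all samples that are either targets accepted as setosa or non-targets not assigned to setosa; that definition is an assumption made for illustration, although it reproduces the value shown in the figures of merit above.

```python
# Counts read from the one-class confusion matrix above
tp = 36        # setosa correctly assigned to setosa
fn = 4         # setosa sent to UNKNOWN
tn = 40 + 40   # versicolor and virginica, none of which were assigned to setosa
fp = 0         # non-setosa samples assigned to setosa

# Assumed definition: correct accept/reject decisions over all decisions
acc = (tp + tn) / (tp + fn + tn + fp)
print(acc)
```

Note that the 2 versicolor samples labeled KNOWN ALTERNATIVE still count as correct here, since from the target's point of view they were rejected.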
Case 5: Enabling OCCs to Detect New Alternative Classes#
Compliant OCCs contain a bias since they are trained using information about known alternatives. Rigorous models use only information about the target class. Compliant models tend to appear more performant on the surface, but the underlying bias is hard to quantify. Adding an outlier detector to this can help detect covariate or semantic shift away from the target + known alternative classes.
[61]:
osc = OpenSetClassifier(
clf_model=SIMCA_Authenticator,
clf_kwargs={
'n_components':3,
'alpha':0.05,
'gamma':0.01,
'target_class':target,
'style':'dd-simca',
'use':'compliant', # Allows the model to use all classes that make it through the outlier detector
'scale_x':True,
'robust':'semi',
'sft':False,
},
outlier_model=IsolationForest,
outlier_kwargs={
'n_estimators':100,
'contamination':0.15, # In practice, this is very important to tune correctly
'random_state':42,
},
inlier_value=1,
unknown_class="UNKNOWN",
score_metric='TEFF',
clf_style='hard', # SIMCA returns Y/N membership which is hard
score_using=target # Now score the model using only the target class as the positive class, others are all negative
)
[62]:
osc.fit(X_train_binary, y_train_binary)
[62]:
OpenSetClassifier(clf_kwargs={'alpha': 0.05, 'gamma': 0.01, 'n_components': 3,
'robust': 'semi', 'scale_x': True, 'sft': False,
'style': 'dd-simca', 'target_class': 'setosa',
'use': 'compliant'},
clf_model=<class 'pychemauth.classifier.simca.SIMCA_Authenticator'>,
outlier_kwargs={'contamination': 0.15, 'n_estimators': 100,
'random_state': 42},
outlier_model=<class 'sklearn.ensemble._iforest.IsolationForest'>,
score_using='setosa', unknown_class='UNKNOWN')
[63]:
# OCCs predict True/False while the outlier detector returns the "UNKNOWN" string, so let's convert these to the same format.
def convert_predictions_to_occ(X, target, alternative_class='KNOWN ALTERNATIVE'):
    pred_ = osc.predict(X)
    preds = []
    for i in range(len(pred_)):
        if pred_[i] == True:
            preds.append(target)
        elif pred_[i] == False:
            preds.append(alternative_class)
        else:
            preds.append(pred_[i])
    return preds
[64]:
# 2 versicolor were (incorrectly) considered to belong to a known alternative (virginica), but most were classified as
# entirely novel (UNKNOWN) by the outlier detector! The virginica were mostly recognized correctly as a known alternative.
res = osc.figures_of_merit(convert_predictions_to_occ(X_train, target), y_train)
res['CM']
[64]:
| | KNOWN ALTERNATIVE | setosa | UNKNOWN |
|---|---|---|---|
| setosa | 6 | 30 | 4 |
| versicolor | 2 | 0 | 38 |
| virginica | 32 | 0 | 8 |
[65]:
tsns = 30 / 40.
tsps = (40 + 40) / 80.
acc = (30 + (2+38) + (32+8)) / (40 + 40 + 40.)
tsns, tsps, acc
[65]:
(0.75, 1.0, 0.9166666666666666)
[66]:
res
[66]:
{'CM': KNOWN ALTERNATIVE setosa UNKNOWN
setosa 6 30 4
versicolor 2 0 38
virginica 32 0 8,
'I': KNOWN ALTERNATIVE 0
setosa 40
versicolor 40
virginica 40
dtype: int64,
'CSPS': {'KNOWN ALTERNATIVE': nan, 'versicolor': 1.0, 'virginica': 1.0},
'TSNS': 0.75,
'TSPS': 1.0,
'TEFF': 0.8660254037844386,
'ACC': 0.9166666666666666}
Case 6: OSR Models with Prefit Deep Classifiers#
We can also use deep models (such as feed-forward or convolutional neural networks) as classifiers! However, given they are expensive to train and optimize, it is best to do this separately. In this procedure, a closed-set model is trained to recognize a set of classes and saved to disk. Then, to apply this model under open-set conditions, we can combine it with an outlier detector to detect OOD points. Thus, during training of the OpenSetClassifier only the outlier detector needs to be trained.
Deep OOD Methods
There are many methods to perform OOD detection with deep neural networks. Two of the simplest and most general are (1) the maximum softmax probability and (2) energy-based scores.
Briefly, the idea behind (1) is that if you have a deep classifier which outputs class probabilities, you predict the class with the highest one. If the highest probability is not very high, then the model is likely “confused”; the prediction is its “best guess”, but it is not very good in absolute terms. Thus, the maximum probability over all the possible classes can be taken as a measure of the model’s confidence in its response. This is just a baseline; though it is not perfect, it can work surprisingly well.
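As a minimal illustration of the maximum softmax probability idea, the sketch below uses NumPy only. The helper names `softmax` and `max_softmax_score` and the example logits are made up for illustration; this is not the implementation used by `osr.DeepOOD.Softmax`.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def max_softmax_score(probabilities):
    """Confidence score: the largest class probability for each sample."""
    return np.max(probabilities, axis=1)

# Two hypothetical samples: one confidently assigned, one "confused"
logits = np.array([
    [8.0, 0.5, -1.0],  # One class clearly dominates
    [0.4, 0.3, 0.2],   # All classes look about equally likely
])
probs = softmax(logits)
scores = max_softmax_score(probs)
predicted = probs.argmax(axis=1)  # The "best guess" class in each case

print(predicted, scores)
```

Both samples receive a predicted class, but the second one has a score only slightly above 1/3, so it would be a natural candidate for rejection as "unknown."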
The idea behind (2) is that you can take the logits associated with each possible class (probability = softmax(logit) = exp(logit) / sum(exp(all logits))) and compute a “Helmholtz free energy” for this set. This is based on approximately the same logic as (1), but has certain mathematical properties that make it more appealing. Observations with a low free energy are “stable” and considered ID, while those with a high free energy are OOD. Because it is convention to have a lower bound on the ID, below which the OOD samples lie, the negative free energy is taken as the score.
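To make this concrete, the sketch below computes the negative free energy in its commonly used form, \(-E = T \log \sum_i \exp(z_i / T)\), for logits \(z_i\) and temperature \(T\). The function name is hypothetical, and whether `osr.DeepOOD.Energy` applies the temperature in exactly this way is an assumption. The first three rows are the logits printed later in this notebook; the last row is a made-up "diffuse" example.

```python
import numpy as np

def negative_free_energy(logits, T=1.0):
    """Score = -E = T * logsumexp(logits / T); higher means more ID-like."""
    z = np.asarray(logits, dtype=float) / T
    z_max = z.max(axis=1, keepdims=True)  # Subtract the max for stability
    lse = z_max[:, 0] + np.log(np.exp(z - z_max).sum(axis=1))
    return T * lse

logits = np.array([
    [-0.887215, -13.445352, -36.17355, -16.329908, -25.561745,
     -6.1167665, -61.026443, 35.647213, -26.671394, -61.906696],
    [-35.21933, 3.7290795, -10.589668, 27.360508, -3.0751612,
     -12.396232, -45.34586, -54.73005, 4.23312, -34.614193],
    [-27.502474, 7.347114, 4.994167, -5.9313126, 30.80856,
     -12.1603, -39.82242, -47.232296, -17.320854, -16.742466],
    [0.0] * 10,  # Hypothetical sample with no preferred class
])
scores = negative_free_energy(logits, T=1.0)
print(scores)
```

When one logit dominates, the score is essentially that logit; when all logits are similar and small, the score is low, which is the behavior expected of a "confused" model.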
In both cases, a threshold can be defined by computing these scores over the test set, choosing a Type I error rate (e.g., \(\alpha = 0.05\)), then determining the score value below which that fraction of the samples falls. Samples which score below this threshold are considered OOD, while samples which score above it are considered ID; the former are simply labeled “unknown” while the latter are sent to the closed-set classifier for identification.
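One simple way to realize such a threshold is to take the \(\alpha\)-quantile of the scores from known ID samples. The sketch below uses synthetic scores and hypothetical helper names (`fit_threshold`, `is_inlier`); it illustrates the idea and is not necessarily how the PyChemAuth detectors determine their thresholds.

```python
import numpy as np

def fit_threshold(id_scores, alpha=0.05):
    """Score below which a fraction alpha of the ID scores falls."""
    return np.quantile(id_scores, alpha)

def is_inlier(scores, threshold):
    """True where a sample scores at or above the threshold (treated as ID)."""
    return np.asarray(scores) >= threshold

# Synthetic scores, purely for illustration
rng = np.random.default_rng(0)
id_scores = rng.normal(loc=10.0, scale=1.0, size=2000)  # Confident, ID-like
ood_scores = rng.normal(loc=4.0, scale=1.0, size=200)   # Low scores, OOD-like

threshold = fit_threshold(id_scores, alpha=0.05)
frac_id_rejected = 1.0 - is_inlier(id_scores, threshold).mean()
frac_ood_accepted = is_inlier(ood_scores, threshold).mean()
print(threshold, frac_id_rejected, frac_ood_accepted)
```

By construction, about 5% of the ID samples are rejected (the Type I error), while the fraction of OOD samples accepted depends entirely on how well the two score distributions separate.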
In an example architecture like the one shown below, the softmax scores are extracted at the end of the model (rightmost gray arrow) while the energy-based scores (logits) are extracted before the softmax layer (middle gray arrow). Such a model will be used in the calculations below.
You can read more at Mahynski, N.A., Sheen, D.A., Paul, R.L. et al. Encoding PGAA spectra as images for material classification with convolutional neural networks. J Radioanal Nucl Chem (2025).

Load models and data
[92]:
# Let's use a pretrained classifier for this demonstration
model = utils.HuggingFace.from_pretrained(
model_id="mahynski/2d-cnn-demo",
)
[93]:
# Observe how this relates to the figure above
model.summary()
Model: "model_7"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_11 (InputLayer) [(None, 2631, 2631, 1)] 0
conv2d_2 (Conv2D) (None, 2631, 2631, 3) 3
rescaling_2 (Rescaling) (None, 2631, 2631, 3) 0
tf.math.truediv_2 (TFOpLam (None, 2631, 2631, 3) 0
bda)
tf.math.subtract_2 (TFOpLa (None, 2631, 2631, 3) 0
mbda)
mobilenet_1.00_224 (Functi (None, 82, 82, 1024) 3228864
onal)
global_average_pooling2d_2 (None, 1024) 0
(GlobalAveragePooling2D)
dropout_2 (Dropout) (None, 1024) 0
dense_7 (Dense) (None, 10) 10250
=================================================================
Total params: 3239117 (12.36 MB)
Trainable params: 10250 (40.04 KB)
Non-trainable params: 3228867 (12.32 MB)
_________________________________________________________________
[5]:
# This model was trained on 2D "imaged" PGAA spectra - let's write that dataset to disk
res = make_pgaa_images(
transformer=GramianAngularField(method='difference'),
exclude_classes=['Carbon Powder', 'Phosphate Rock', 'Zircaloy'],
valid_range=(0, 2631),
renormalize=True,
test_size=0.2,
random_state=42,
directory='./2d-dataset/',
overwrite=True,
)
Transforming train set: 100%|█████████████████████████████████████████████████████████| 243/243 [00:50<00:00, 4.83it/s]
Transforming test set: 100%|████████████████████████████████████████████████████████████| 61/61 [00:13<00:00, 4.46it/s]
[7]:
# This dataset contains 10 classes indexed from [0, 9]
encoder = res[-1]
[8]:
# Load the challenge data
res_challenge = make_pgaa_images(
transformer=GramianAngularField(method='difference'),
exclude_classes=encoder.classes_, # Exclude the ones we already trained on
valid_range=(0, 2631),
renormalize=True,
test_size=0.0,
)
X_challenge, _, y_challenge, _, _, encoder_challenge = res_challenge
# Map the y_challenge classes of [0, 1, 2] -> [10, 11, 12]
y_challenge += 10
# Split the challenge set (classes [10, 11, 12]) into test/train folds
Xc_train, Xc_test, yc_train, yc_test = sklearn.model_selection.train_test_split(
X_challenge, y_challenge, test_size=0.2, stratify=y_challenge, random_state=42, shuffle=True
)
# Write to disk to use as loaders
_ = utils.write_dataset(
directory='./2d-dataset/train',
X=Xc_train,
y=yc_train,
overwrite=False,
augment=True
)
_ = utils.write_dataset(
directory='./2d-dataset/test',
X=Xc_test,
y=yc_test,
overwrite=False,
augment=True
)
[9]:
train_loader = utils.NNTools.build_loader('./2d-dataset/train/', batch_size=10)
test_loader = utils.NNTools.build_loader('./2d-dataset/test/', batch_size=10)
OOD Models
[22]:
def build_classifier(ood_type, featurize=False, updated_kwargs={}):
    kwargs = { # Baseline kwargs for OSR
        'clf_prefit':True,
        'known_classes':np.arange(10), # The model has 10 classes in it
        'inlier_value':True,
        'unknown_class':-1,
        'score_metric':'TEFF',
        'clf_style':'hard',
        'score_using':'all'
    }
    if ood_type.lower() == 'softmax':
        outlier_kwargs = {
            'model': None if featurize else utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo"),
            'alpha': 0.05
        }
        outlier_kwargs.update(updated_kwargs)
        kwargs.update({
            'clf_model': osr.DeepOOD.SoftmaxFeatureClf() if featurize else utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo"),
            'outlier_model': osr.DeepOOD.Softmax,
            'outlier_kwargs': outlier_kwargs
        })
    elif ood_type.lower() == 'energy':
        outlier_kwargs = {
            'model': None if featurize else utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo"),
            'alpha': 0.05,
            'T': 1.0
        }
        outlier_kwargs.update(updated_kwargs)
        kwargs.update({
            'clf_model': osr.DeepOOD.EnergyFeatureClf() if featurize else utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo"),
            'outlier_model': osr.DeepOOD.Energy,
            'outlier_kwargs': outlier_kwargs,
        })
    else:
        raise Exception(f'unrecognized {ood_type}')
    return osr.OpenSetClassifier(**kwargs)
[25]:
softmax_model = build_classifier(ood_type='softmax', featurize=False)
_ = softmax_model.fit(train_loader)
[27]:
energy_model = build_classifier(ood_type='energy', featurize=False)
_ = energy_model.fit(train_loader)
[28]:
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 4))
axes[0][0].set_title('Softmax Model')
_ = softmax_model.fitted_outlier_model.visualize(
X_test=test_loader, test_label='Test Set (ID)', bins=np.linspace(0, 1, 100+1),
ax=axes[0][0]
)
axes[0][0].set_yscale('log')
_ = softmax_model.fitted_outlier_model.visualize(
X_test=X_challenge, test_label='Challenge Set (OOD)',
no_train=True, bins=np.linspace(0, 1, 100+1), ax=axes[1][0]
)
axes[0][1].set_title('Energy-based Model')
_ = energy_model.fitted_outlier_model.visualize(
X_test=test_loader, test_label='Test Set (ID)', bins=25,
ax=axes[0][1]
)
_ = energy_model.fitted_outlier_model.visualize(
X_test=X_challenge, test_label='Challenge Set (OOD)',
no_train=True, bins=25, ax=axes[1][1]
)
1/1 [==============================] - 9s 9s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 5s 5s/step
2024-09-24 14:06:23.486113: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 9310457856 exceeds 10% of free system memory.
2024-09-24 14:06:24.412880: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 9324612864 exceeds 10% of free system memory.
1/1 [==============================] - 17s 17s/step
1/1 [==============================] - 9s 9s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 5s 5s/step
2024-09-24 14:07:49.122783: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 9310457856 exceeds 10% of free system memory.
2024-09-24 14:07:50.035909: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 9324612864 exceeds 10% of free system memory.
1/1 [==============================] - 18s 18s/step
OSR Training Without Featurization
It is possible to greatly accelerate this training, especially during cross-validation, by featurizing the data. We will discuss that in the next section, but for now let’s proceed without doing this.
Let’s build a model with some known unknowns present during training. This “compliant” model will be able to see some things it should recognize as OOD, and this can help it select certain hyperparameters during cross-validation. Here, we will not do full CV since it would take too much time; instead, we will just fit a single model.
[29]:
# This is just explicitly what the "build_classifier" function is doing. Here, the classifier is the CNN model from HF.
# We happen to also be using that to determine which points are OOD. In principle, you could mix and match, i.e.,
# use another model (besides the classifier) as an OOD detector, but this is a self-consistent way to do it, and only
# requires a single model to be trained.
model = osr.OpenSetClassifier(
clf_prefit=True,
known_classes=np.arange(10), # The model recognizes 10 classes
inlier_value=True,
unknown_class=-1,
score_metric='TEFF',
clf_style='hard',
score_using='all',
clf_model=utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo"),
outlier_model=osr.DeepOOD.Softmax,
outlier_kwargs = {
'model': utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo"),
'alpha': 0.05
}
)
[30]:
# Observe how long this takes to train! This is because we have to push our inputs through the large CNN base each time.
model.fit(train_loader)
Iterating through all batches to summarize, be patient...: 0it [00:00, ?it/s]
WARNING:tensorflow:5 out of the last 17 calls to <function Model.make_predict_function.<locals>.predict_function at 0x77194735eca0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has reduce_retracing=True option that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for more details.
1/1 [==============================] - 9s 9s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 3s 3s/step
[30]:
OpenSetClassifier(clf_model=<keras.src.engine.functional.Functional object at 0x77183c30b790>,
clf_prefit=True, inlier_value=True,
known_classes=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
outlier_kwargs={'alpha': 0.05,
'model': <keras.src.engine.functional.Functional object at 0x7718542aa510>},
outlier_model=<class 'pychemauth.classifier.osr.DeepOOD.Softmax'>,
unknown_class=-1)
[31]:
y_test = []
for _, y_batch_ in test_loader:
    if y_batch_.size > 0:
        y_test.append(y_batch_)
y_test = np.concatenate(y_test)
[32]:
fom_test = model.figures_of_merit(model.predict(test_loader), y_test)
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 5s 5s/step
7/7 [==============================] - 52s 7s/step
[33]:
fom_test['CM']
[33]:
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | -1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 1 | 0 | 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 1 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
| 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 11 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
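A confusion matrix like the one above can be condensed into a few summary rates. The sketch below re-enters the counts from the table by hand (rows are true classes 0 to 12, columns are predictions 0 to 9 followed by the unknown label -1) and computes the fraction of ID samples rejected, the fraction of OOD samples detected, and the closed-set accuracy on ID samples. The variable names are illustrative only.

```python
import numpy as np

n_known = 10  # Classes 0-9 are known; 10-12 are the challenge (OOD) classes
cm = np.zeros((13, n_known + 1), dtype=int)  # Last column is the -1 label

# Diagonal: ID samples assigned to their correct class
for k, n in enumerate([5, 19, 8, 3, 4, 4, 3, 8, 3, 2]):
    cm[k, k] = n
# ID samples flagged as unknown
cm[0, -1] = 1
cm[5, -1] = 1
# OOD samples: flagged as unknown, or confused with a known class
cm[10, -1] = 1
cm[11, 2] = 1
cm[11, -1] = 1
cm[12, -1] = 2

id_rows, ood_rows = cm[:n_known], cm[n_known:]
id_rejection_rate = id_rows[:, -1].sum() / id_rows.sum()
ood_detection_rate = ood_rows[:, -1].sum() / ood_rows.sum()
id_accuracy = np.trace(id_rows[:, :n_known]) / id_rows.sum()

print(f"ID rejected: {id_rejection_rate:.3f}")
print(f"OOD detected: {ood_detection_rate:.3f}")
print(f"ID correctly classified: {id_accuracy:.3f}")
```

With so few challenge samples in the test fold (5 in total), the OOD detection rate is a very coarse estimate and should be read accordingly.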
[34]:
ax = model.fitted_outlier_model.visualize(
X_test=X_challenge,
test_label='Challenge Set (OOD)',
bins=np.linspace(0, 1, 50+1)
)
ax.set_yscale('log')
2024-09-24 14:15:59.217725: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 9310457856 exceeds 10% of free system memory.
1/1 [==============================] - 17s 17s/step
Recommended: OSR Training With Featurization
As an alternative training procedure, we could run our data through the classification model up to the point where the OOD model takes the data as its input, creating a “featurized” data set. The “new” closed-set classifier in the OSR is really just the “rest” of the original model which accepts the featurized data as inputs and finishes the transformation producing the expected outputs, \(y\). This means we only need to run the training set through the majority of the classification model once, which can significantly reduce computational cost! This is especially important when using cross validation to optimize hyperparameters.
The recommended procedure is:
1. Select the OOD method desired and determine the featurizer associated with this (examples below).
2. Featurize the data (\(X_{\rm data} \rightarrow X_{\rm feature}\)) by running it through this part of the original model.
3. Create a new classifier, \(F_{\rm new}\), from just the “remainder” of the prefit classification model (this is frozen anyway); thus, \(F_{\rm new}(X_{\rm feature}) = y\) vs. the original case where \(F_{\rm orig}(X_{\rm data}) = y\).
4. Perform CV to optimize the OOD detector using the (\(X_{\rm feature}\), \(y\)) data.
5. Create a new final OSR model with these optimized hyperparameters and train once using the original data (\(X_{\rm data}\), \(y\)) so that this can accept the original, unfeaturized input in the future.
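The key point of this procedure is that \(F_{\rm new}(X_{\rm feature})\) reproduces \(F_{\rm orig}(X_{\rm data})\) while the expensive part of the model is only evaluated once. The toy sketch below demonstrates this with a tiny NumPy "network" whose weights and function names are entirely made up; it stands in for the real CNN and uses an energy-style split where the featurizer outputs logits.

```python
import numpy as np

rng = np.random.default_rng(0)
W_base = rng.normal(size=(6, 4))  # Stand-in for the large, frozen CNN base
W_head = rng.normal(size=(4, 3))  # Stand-in for the final dense layer

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def f_orig(X_data):
    """Original closed-set model: raw data -> predicted class."""
    logits = np.tanh(X_data @ W_base) @ W_head
    return softmax(logits).argmax(axis=1)

def featurizer(X_data):
    """Everything up to the point where the OOD detector reads its input (logits)."""
    return np.tanh(X_data @ W_base) @ W_head

def f_new(X_feature):
    """The 'remainder' of the model: logits -> predicted class."""
    return softmax(X_feature).argmax(axis=1)

X_data = rng.normal(size=(20, 6))
X_feature = featurizer(X_data)  # Expensive step, done once before CV

# Every CV fold can now reuse X_feature without touching the base again
same = np.array_equal(f_new(X_feature), f_orig(X_data))
print(X_feature.shape, same)
```

During cross-validation only the cheap remainder and the OOD detector are refit or re-evaluated on slices of `X_feature`, which is where the savings come from.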
Below the procedure is illustrated for the two Deep OOD methods discussed above; in this notebook another method called DIME is discussed and illustrated.
Softmax OOD
[35]:
# For softmax OOD we use the softmax probabilities directly, so the "featurizer" is just the model itself!
featurizer = utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo")
# These are just the probabilities for each observation in X_train
X_feature = featurizer.predict(train_loader)
26/26 [==============================] - 219s 8s/step
[36]:
# This is also a much smaller matrix so it can be more easily held in memory - no need for a data loader here
X_feature[:3]
[36]:
array([[1.3592508e-16, 4.7793682e-22, 6.4363411e-32, 2.6706921e-23,
2.6138906e-27, 7.2800362e-19, 0.0000000e+00, 9.9999994e-01,
8.6173289e-28, 0.0000000e+00],
[6.6362346e-28, 5.4575948e-11, 3.2994977e-17, 9.9999994e-01,
6.0528073e-14, 5.4183482e-18, 2.6547629e-32, 2.2311170e-36,
9.0344759e-11, 1.2154283e-27],
[4.7406639e-26, 6.4688026e-11, 6.1510879e-12, 1.1068182e-16,
9.9999994e-01, 2.1820309e-19, 2.1152146e-31, 1.2802273e-34,
1.2521640e-21, 2.2328113e-21]], dtype=float32)
[37]:
# The classification model is set to osr.DeepOOD.SoftmaxFeatureClf() by "build_classifier", which is basically a pass
# through since the featurizer is the entire model itself in this case.
pipeline = imblearn.pipeline.Pipeline(
steps=[
("osr", build_classifier(ood_type='softmax', featurize=True))
]
)
[38]:
?osr.DeepOOD.SoftmaxFeatureClf
[39]:
y_train = []
for _, y_batch_ in train_loader:
    if y_batch_.size > 0:
        y_train.append(y_batch_)
y_train = np.concatenate(y_train)
[40]:
param_grid = [{
'osr__outlier_kwargs': [
{
'model':None, # This tells the Softmax OOD to expect "featurized" data as the input (the probabilities)
'alpha':a
} for a in np.logspace(-3, np.log10(0.5), 10)],
}]
gs = GridSearchCV(
estimator=pipeline,
param_grid=param_grid,
n_jobs=-1,
cv=sklearn.model_selection.StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
error_score=0,
refit=True
)
_ = gs.fit(X_feature, y_train)
[41]:
optimal_hyperparameters = copy.copy(gs.best_params_['osr__outlier_kwargs'])
optimal_hyperparameters.pop('model')
[42]:
optimal_hyperparameters
[42]:
{'alpha': 0.001}
[43]:
final_pipeline = imblearn.pipeline.Pipeline(
steps=[
("osr", build_classifier(ood_type='softmax', featurize=False, updated_kwargs=optimal_hyperparameters))
]
)
final_pipeline.fit(train_loader)
Iterating through all batches to summarize, be patient...: 0it [00:00, ?it/s]
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 2s 2s/step
[43]:
Pipeline(steps=[('osr',
OpenSetClassifier(clf_model=<keras.src.engine.functional.Functional object at 0x771947499bd0>,
clf_prefit=True, inlier_value=True,
known_classes=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
outlier_kwargs={'alpha': 0.001,
'model': <keras.src.engine.functional.Functional object at 0x7718546eb250>},
outlier_model=<class 'pychemauth.classifier.osr.DeepOOD.Softmax'>,
unknown_class=-1))])
[44]:
fom_test = final_pipeline.named_steps['osr'].figures_of_merit(final_pipeline.predict(test_loader), y_test)
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 5s 5s/step
7/7 [==============================] - 56s 8s/step
[45]:
# Decent performance, but one challenge class (11) seems to get confused with class 2
fom_test['CM']
[45]:
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | -1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
| 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 11 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
[46]:
ax = final_pipeline.named_steps['osr'].fitted_outlier_model.visualize(
X_test=X_challenge,
test_label='Challenge Set (OOD)',
bins=np.linspace(0, 1, 50+1)
)
ax.set_yscale('log')
1/1 [==============================] - 18s 18s/step
Energy-based OOD
[47]:
# For energy OOD we use the raw logits directly, so the "featurizer" is just the model itself less the final activation
featurizer = utils.HuggingFace.from_pretrained(model_id="mahynski/2d-cnn-demo")
featurizer.layers[-1].activation = None # Deactivate to get logits
X_feature = featurizer.predict(train_loader)
26/26 [==============================] - 223s 9s/step
[48]:
X_feature[:3]
[48]:
array([[ -0.887215 , -13.445352 , -36.17355 , -16.329908 , -25.561745 ,
-6.1167665, -61.026443 , 35.647213 , -26.671394 , -61.906696 ],
[-35.21933 , 3.7290795, -10.589668 , 27.360508 , -3.0751612,
-12.396232 , -45.34586 , -54.73005 , 4.23312 , -34.614193 ],
[-27.502474 , 7.347114 , 4.994167 , -5.9313126, 30.80856 ,
-12.1603 , -39.82242 , -47.232296 , -17.320854 , -16.742466 ]],
dtype=float32)
[49]:
# The classification model is set to osr.DeepOOD.EnergyFeatureClf() by "build_classifier", which is just a softmax layer
# since the featurizer was the whole model except for the final softmax activation function that turns logits into
# probabilities.
pipeline = imblearn.pipeline.Pipeline(
    steps=[
        ("osr", build_classifier(ood_type='energy', featurize=True))
    ]
)
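As a rough illustration of what "just a softmax layer" means here, the sketch below converts logits into class probabilities with plain NumPy. It is not the PyChemAuth classifier itself; the function name `softmax` and the two rounded example rows are chosen only for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max(axis=1, keepdims=True)
    expz = np.exp(z)
    return expz / expz.sum(axis=1, keepdims=True)


# Two rows of logits (rounded) of the kind produced by the featurizer.
example_logits = np.array([
    [-0.89, -13.45, -36.17, -16.33, -25.56, -6.12, -61.03, 35.65, -26.67, -61.91],
    [-35.22, 3.73, -10.59, 27.36, -3.08, -12.40, -45.35, -54.73, 4.23, -34.61],
])

probabilities = softmax(example_logits)
predicted_class = probabilities.argmax(axis=1)  # index of the largest logit per row
print(predicted_class)
```

The predicted class is simply the index of the largest logit, so the closed-set prediction needs no training beyond the network that produced the logits.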
[50]:
?osr.DeepOOD.EnergyFeatureClf
[51]:
param_grid = [{
    'osr__outlier_kwargs': [
        {
            'model': None,
            'alpha': a,
            'T': t
        } for a in np.logspace(-3, np.log10(0.5), 10) for t in [0.1, 1.0, 10.0]
    ],
}]
gs = GridSearchCV(
    estimator=pipeline,
    param_grid=param_grid,
    n_jobs=-1,
    cv=sklearn.model_selection.StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    error_score=0,
    refit=True
)
_ = gs.fit(X_feature, y_train)
[52]:
optimal_hyperparameters = copy.copy(gs.best_params_['osr__outlier_kwargs'])
optimal_hyperparameters.pop('model')
[53]:
optimal_hyperparameters
[53]:
{'alpha': 0.001, 'T': 0.1}
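The grid above passes whole dictionaries as candidate values for `osr__outlier_kwargs`, so `alpha` and `T` are varied together. As a small sketch (using scikit-learn's `ParameterGrid` purely to count candidates; the variable names are illustrative), the search space can be enumerated like this:

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

alphas = np.logspace(-3, np.log10(0.5), 10)  # 10 values from 0.001 to 0.5
temperatures = [0.1, 1.0, 10.0]

# Each candidate is one complete kwargs dictionary, as in the grid search above.
candidates = [
    {'model': None, 'alpha': a, 'T': t}
    for a in alphas
    for t in temperatures
]
grid = ParameterGrid([{'osr__outlier_kwargs': candidates}])

n_candidates = len(grid)        # number of (alpha, T) combinations
n_cv_fits = n_candidates * 5    # one fit per candidate per fold with 5-fold CV
print(n_candidates, n_cv_fits)
```

Note that the selected values, `alpha=0.001` and `T=0.1`, both sit at the lower edge of their ranges, so it may be worth checking whether a wider grid changes the result.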
[54]:
final_pipeline = imblearn.pipeline.Pipeline(
    steps=[
        ("osr", build_classifier(ood_type='energy', featurize=False, updated_kwargs=optimal_hyperparameters))
    ]
)
final_pipeline.fit(train_loader)
Iterating through all batches to summarize, be patient...: 0it [00:00, ?it/s]
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 2s 2s/step
[54]:
Pipeline(steps=[('osr',
OpenSetClassifier(clf_model=<keras.src.engine.functional.Functional object at 0x771a6b8b3e50>,
clf_prefit=True, inlier_value=True,
known_classes=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
outlier_kwargs={'T': 0.1, 'alpha': 0.001,
'model': <keras.src.engine.functional.Functional object at 0x771a6b9e3010>},
outlier_model=<class 'pychemauth.classifier.osr.DeepOOD.Energy'>,
unknown_class=-1))])
[55]:
fom_test = final_pipeline.named_steps['osr'].figures_of_merit(final_pipeline.predict(test_loader), y_test)
1/1 [==============================] - 8s 8s/step
1/1 [==============================] - 5s 5s/step
7/7 [==============================] - 58s 8s/step
[56]:
fom_test['CM']
[56]:
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | -1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
| 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 11 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
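A short sketch of how this table can be summarized is given below. It rebuilds the counts by hand with pandas and assumes that rows are the actual classes and columns are the predicted classes, with `-1` denoting the unknown class. That orientation is an assumption, as are the variable names.

```python
import pandas as pd

predicted_labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -1]
cm = pd.DataFrame(0, index=range(13), columns=predicted_labels)

# Known classes 0-9: every test point falls on the diagonal.
for k, n in enumerate([6, 19, 8, 3, 4, 5, 3, 8, 3, 2]):
    cm.loc[k, k] = n

# Classes 10-12 were not seen during training.
cm.loc[10, -1] = 1   # rejected as unknown
cm.loc[11, 2] = 2    # assigned to known class 2
cm.loc[12, -1] = 2   # rejected as unknown

known = list(range(10))
known_total = int(cm.loc[known].to_numpy().sum())
known_correct = int(sum(cm.loc[k, k] for k in known))

unknown_rows = cm.loc[[10, 11, 12]]
unknown_total = int(unknown_rows.to_numpy().sum())
unknown_rejected = int(unknown_rows.loc[:, -1].sum())

print(f"known classes: {known_correct}/{known_total} correct")
print(f"unknown classes: {unknown_rejected}/{unknown_total} rejected")
```

Under that reading, all 61 test points from known classes are classified correctly, while 3 of the 5 points from unseen classes are rejected and the 2 points from class 11 are assigned to class 2.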
[57]:
ax = final_pipeline.named_steps['osr'].fitted_outlier_model.visualize(
    X_test=X_challenge,
    test_label='Challenge Set (OOD)',
)
ax.set_yscale('log')
1/1 [==============================] - 18s 18s/step
Common pitfalls#
Some things to keep in mind:
It is typically best to let the OpenSetClassifier infer the known classes from the training data. If you specify them manually and omit some classes, training points belonging to the omitted classes are ignored during training. This also changes how the model is scored.
However, manually specifying the known classes can be useful when building “compliant” models, where we want the model to be aware of certain “known unknowns” during training, as illustrated above.
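The bookkeeping behind the first pitfall can be sketched with NumPy alone. The labels and the incomplete `known_classes` list below are hypothetical, and the code only illustrates which training points would fall outside a manually specified list; it does not reproduce the internals of OpenSetClassifier.

```python
import numpy as np

# Hypothetical training labels containing four classes.
y_train = np.array([0, 1, 2, 2, 3, 3, 3])

# A manually specified list that accidentally leaves out class 3.
known_classes = np.array([0, 1, 2])

# Points whose label is not in the list are the ones that would be ignored.
ignored = ~np.isin(y_train, known_classes)
print(f"{ignored.sum()} of {y_train.size} training points fall outside known_classes")

# Inferring the classes from the data avoids the omission.
inferred_classes = np.unique(y_train)
print(inferred_classes)
```

A quick check of this kind before fitting makes it easy to see whether an omission is intentional (a "known unknown") or a mistake.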