Feature Store Integration
Feature stores are an essential building block of a modern MLOps setup. For an introduction to the concept and an overview of available options and vendors, see the Feature Store Comparison & Evaluation on the MLOps Community website.
Anovos provides integration with Feast, a widely used open source feature store, out of the box. Using the same abstractions, it is straightforward to integrate Anovos with other feature stores.
If there is a particular feature store integration you'd like to see supported by Anovos, let us know!
Using Anovos with Feast
The following guide describes how to use Anovos to push data to Feast. We assume that you are familiar with the fundamentals of both Anovos workflows and Feast.
For an introduction to Feast, see 📖 the Feast Quickstart guide.
Prerequisites
In order to use Anovos with Feast, you need to install it:
Next, we'll instantiate a new Feast repository:
🤓 Note: You can also use an existing repository. In this case, Anovos will simply add a new file anovos.py containing the feature definitions as well as the output file to the existing repository.
Adding the Feast export to your Anovos workflow
To export data to Feast at the end of a workflow run, you need to add the write_feast_features block to the configuration file. (To learn more about the configuration file in general and the available options, see 📖 the configuration file documentation.)
You can use the following template as a starting point:
write_feast_features:
  file_path: "../anovos_repo/"               # the location of your Feast repository
  entity:
    name: "income"                           # the Feast entity
    description: "this entity is a ...."     # the entity description used by Feast
    id_col: 'ifa'                            # the primary key column to identify this entity by
  file_source:
    description: 'data source description'   # the data source description used by Feast
    owner: "me@business.com"                 # the data source owner registered in Feast
    timestamp_col: 'event_time'              # the name of the logical timestamp at which the feature was observed
    create_timestamp_col: 'create_time_col'  # the name of the physical timestamp (wallclock time)
                                             # of when the feature value was computed
  feature_view:
    name: 'income_view'                      # the name of the generated feature view
    owner: 'view@owner.com'                  # the view owner registered in Feast
    ttl_in_seconds: 36000000                 # the time to live in seconds for features in this view.
                                             # Feast will use this value to look backwards when performing
                                             # point-in-time joins
  service_name: 'income_feature_service'     # the name of the feature service generated by the workflow
Let's break this down!
The entity block generates an entity definition in Feast. This block and all its child elements are mandatory.
The name element specifies the entity name. The description element provides a human-readable description to be displayed in the Feast UI. The id_col element specifies the primary key of the entity.
The file_source block generates a file source definition in Feast. This block and all its children are mandatory.
The owner element specifies the owner of the file data source as an email address. The timestamp_col and create_timestamp_col elements refer to the timestamp columns used when retrieving data.
file_source:
  description: 'data source description'
  owner: "me@business.com"
  timestamp_col: 'event_time'
  create_timestamp_col: 'create_time_col'
The feature_view block generates a feature view definition in Feast. This block and all its children are mandatory.
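To make the role of the feature view's ttl_in_seconds element more concrete, here is a minimal sketch in plain Python of the lookback window it defines. It does not call Feast. The helper name `is_within_ttl` and the sample timestamps are assumptions made for illustration, and Feast's actual point-in-time join logic is more involved than this.

```python
from datetime import datetime, timedelta

# TTL value taken from the configuration template in this guide
TTL_IN_SECONDS = 36_000_000
ttl = timedelta(seconds=TTL_IN_SECONDS)


def is_within_ttl(feature_time: datetime, entity_time: datetime, ttl: timedelta) -> bool:
    """Hypothetical helper: True if a feature observed at `feature_time`
    falls into the lookback window that ends at `entity_time`."""
    return entity_time - ttl <= feature_time <= entity_time


# Illustrative timestamps only
entity_time = datetime(2023, 1, 1)

print(ttl.days)  # number of whole days covered by the TTL
print(is_within_ttl(datetime(2022, 6, 1), entity_time, ttl))  # inside the window
print(is_within_ttl(datetime(2021, 6, 1), entity_time, ttl))  # older than the TTL
print(is_within_ttl(datetime(2023, 2, 1), entity_time, ttl))  # later than the entity timestamp
```

With the template value, the window spans a little more than 416 days, so only feature values observed within that period before the entity timestamp would be candidates for the join.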
The service_name element generates a feature service definition in Feast. This element is optional.
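Since three of the blocks and all their children are mandatory, it can help to check a configuration before running the workflow. The following is a minimal sketch, assuming the write_feast_features block has already been parsed into a Python dictionary. The helper `find_missing_keys` is hypothetical and not part of Anovos. The guide does not state whether file_path is mandatory, so it is not checked here.

```python
# Mandatory blocks and their children, as described in this guide.
# `service_name` is optional and therefore not listed.
MANDATORY_KEYS = {
    "entity": ["name", "description", "id_col"],
    "file_source": ["description", "owner", "timestamp_col", "create_timestamp_col"],
    "feature_view": ["name", "owner", "ttl_in_seconds"],
}


def find_missing_keys(feast_config: dict) -> list:
    """Hypothetical helper: list the mandatory entries that are missing
    from a parsed `write_feast_features` block."""
    missing = []
    for block, children in MANDATORY_KEYS.items():
        section = feast_config.get(block)
        if not isinstance(section, dict):
            # The whole block is absent (or not a mapping)
            missing.append(block)
            continue
        missing.extend(f"{block}.{child}" for child in children if child not in section)
    return missing


# Illustrative configuration with one mandatory child left out on purpose
example = {
    "file_path": "../anovos_repo/",
    "entity": {"name": "income", "description": "...", "id_col": "ifa"},
    "file_source": {
        "description": "...",
        "owner": "me@business.com",
        "timestamp_col": "event_time",
    },
    "feature_view": {"name": "income_view", "owner": "view@owner.com", "ttl_in_seconds": 36000000},
}

print(find_missing_keys(example))
```

In this example, the only reported entry is `file_source.create_timestamp_col`, because it was omitted from the dictionary.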
Setting the repartition value to 1 in write_main
The current version of the Feast integration only supports adding single output files to Feast repositories. Therefore, you must set the repartition attribute to 1 in the write_main configuration.
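A small guard can catch a wrong repartition value before the workflow runs. The sketch below assumes the write_main block has been parsed into a dictionary. Both helper functions are hypothetical, and the lookup under a nested `file_configs` key is an assumption about the configuration layout, so adapt it to where the attribute sits in your own file.

```python
def get_repartition(write_main: dict):
    """Hypothetical helper: look up `repartition` in a parsed `write_main`
    block, at the top level or under `file_configs` (assumed layout)."""
    if "repartition" in write_main:
        return write_main["repartition"]
    return (write_main.get("file_configs") or {}).get("repartition")


def check_single_output_file(write_main: dict) -> None:
    """Raise a ValueError unless the output is written as a single file."""
    repartition = get_repartition(write_main)
    if repartition != 1:
        raise ValueError(
            f"Feast export needs repartition=1 in write_main, found {repartition!r}"
        )


# Illustrative values only; this call passes without raising
check_single_output_file({"file_type": "parquet", "file_configs": {"repartition": 1}})
```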
Exporting data to Feast
First, run your Anovos workflow with the configuration above.
Once the workflow has finished, switch to the folder of the anovos_feature repository, apply the changed feature definitions, and materialize the features:
To verify that the features have been loaded correctly, you can check them using Feast's UI. Run
and access the Feast UI at http://127.0.0.1:8888.
The UI gives a real-time overview of the data sources, entities, feature views, etc. of the entire feature repository (i.e., across multiple .py files that contain feature definitions).
Retrieve feature data from Feast
The following script shows how to access historical data, e.g., for the purpose of training an ML model.
For more information, see the Feast documentation on feature retrieval.
Documentation on how to specify event_time and its use in point-in-time joins can be found here.
from datetime import datetime

import feast
import pandas as pd

repo_path = "./anovos_repo"
fs = feast.FeatureStore(repo_path=repo_path)

# ACCESS HISTORICAL FEATURES

# Entity dataframe: either read it directly from the parquet file generated
# by the Anovos workflow or build it manually, as done here
income_entities = pd.DataFrame.from_dict(
    {
        "ifa": [
            "27a",
            "30a",
            "475a",
            "965a",
            "1678a",
            "1698a",
            "1807a",
            "1951a",
            "2041a",
            "2215a",
        ],
        "event_time": [
            datetime.now(),
            datetime.now(),
            datetime.now(),
            datetime.now(),
            datetime.now(),
            datetime.now(),
            datetime.now(),
            datetime.now(),
            datetime.now(),
            datetime.now(),
        ],
    }
)

# Alternative 1: retrieve features via explicit specification
income_features_df = fs.get_historical_features(
    entity_df=income_entities,
    features=[
        "income_view:income",
        "income_view:latent_0",
        "income_view:latent_1",
        "income_view:latent_2",
        "income_view:latent_3",
    ],
).to_df()
print(income_features_df.head())

# Alternative 2: retrieve features using the feature service
feature_service = fs.get_feature_service("income_feature_service")
income_features_by_service_df = fs.get_historical_features(
    features=feature_service, entity_df=income_entities
).to_df()
print(income_features_by_service_df.head())

# Now, you can use the features to train your model
...
Integrating Anovos with other feature stores
We're exploring further support for Feast and the integration of Anovos with other feature stores. Let us know which capabilities you'd like to see in future versions of Anovos!