Aggregations

Similar to derived features, aggregated features involve transformations on data within a time slice. For example, computing the average passenger count over the past 20 minutes for a taxi company represents an aggregated feature.

The event timestamp matters for aggregated time windows: Aligned uses it to set up the logic that restricts each computation to the time window of interest. The taxi example below shows this.
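To make the windowing idea concrete, here is a minimal sketch in plain Python of what a one-hour aggregation computes for a single entity at a given event time. It does not use Aligned and is not how Aligned implements it. The `rides` rows, the `window_aggregates` helper, the half-open window `(event_time - window, event_time]`, and the use of sample variance are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean, variance

# Hypothetical rows: (vendor_id, pickuped_at, number_of_passengers)
rides = [
    (1, datetime(2024, 1, 1, 10, 5), 2),
    (1, datetime(2024, 1, 1, 10, 40), 4),
    (1, datetime(2024, 1, 1, 11, 0), 3),
    (1, datetime(2024, 1, 1, 8, 30), 6),   # older than one hour, excluded
    (2, datetime(2024, 1, 1, 10, 50), 1),  # different vendor, excluded
]


def window_aggregates(rows, vendor_id, event_time, window=timedelta(hours=1)):
    """Aggregate passenger counts for one vendor over the window ending at event_time.

    Assumption: the window is (event_time - window, event_time], and the
    variance is the sample variance, which needs at least two values.
    """
    start = event_time - window
    values = [
        passengers
        for vendor, timestamp, passengers in rows
        if vendor == vendor_id and start < timestamp <= event_time
    ]
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": mean(values) if values else None,
        "variance": variance(values) if len(values) >= 2 else None,
    }


result = window_aggregates(rides, vendor_id=1, event_time=datetime(2024, 1, 1, 11, 0))
print(result)
```

Only the three rides for vendor 1 between 10:00 and 11:00 contribute to the result. Without an event timestamp on each row there would be no way to decide which rows fall inside the window, which is why the feature view needs one.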

Example

The following is a modified version of the taxi model example. Here we aggregate over the number_of_passengers feature and compute the mean, sum, variance, and count over a one-hour window. We also give each feature an appropriate name, such as passenger_hour_mean.

from aligned import feature_view, EventTimestamp, Int32

@feature_view(...)
class TaxiVendor:

    vendor_id = Int32().as_entity()
    pickuped_at = EventTimestamp()

    number_of_passengers = Int32().is_required().lower_bound(0)

    passenger_hour_aggregate = number_of_passengers.aggregate().over(hours=1)

    passenger_hour_mean = passenger_hour_aggregate.mean()
    passenger_hour_sum = passenger_hour_aggregate.sum()
    passenger_hour_variance = passenger_hour_aggregate.variance()
    passenger_hour_count = passenger_hour_aggregate.count()

We can now load our data as usual, but we must also provide an event timestamp so that the aggregations are computed over the correct time window.
