Model vs. Serving complexity
In the early stage of our product, we started with simple logistic regression models to estimate the likelihood of users sending/accepting offers. The models were trained offline using scikit-learn. The training set was obtained using a “log and learn” approach (logging signals exactly as they were at serving time) over ~90 different signals, and the learned weights were injected into our serving layer.
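To illustrate what injecting learned weights into a serving layer can look like, here is a minimal sketch of logistic regression scoring. The signal names, weight values, and the `score` helper are hypothetical and only stand in for the ~90 real signals; treating a missing signal as 0.0 is also an assumption.

```python
import math

# Hypothetical learned weights, as they might be exported from offline training.
# Names and values are illustrative only.
WEIGHTS = {"pickup_detour_minutes": -0.30, "past_acceptance_rate": 2.0}
INTERCEPT = -1.0


def score(signals, weights=WEIGHTS, intercept=INTERCEPT):
    """Return the estimated probability for one candidate.

    `signals` maps signal name -> numeric value. Signals missing from the
    request are treated as 0.0 (an assumption made for this sketch).
    """
    z = intercept + sum(w * signals.get(name, 0.0) for name, w in weights.items())
    # Numerically stable sigmoid: avoid exp() overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    candidate = {"pickup_detour_minutes": 4.0, "past_acceptance_rate": 0.8}
    print(score(candidate))
```

Serving a model this way needs nothing more than a weighted sum and a sigmoid, which is why the weights can be copied into any serving stack. Tree ensembles do not reduce to a fixed weight vector, so the same trick does not carry over to them.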
Even though those models were doing a pretty good job, we saw through offline experiments the great potential of more advanced non-linear models, such as gradient boosted regression classifiers, for our ranking task.
Implementing a fast in-memory serving layer that supports such advanced models would require non-trivial effort, as well as an ongoing maintenance cost. A much simpler option was to delegate the serving layer to an external managed service that can be called via a REST API. However, we needed to be sure that it wouldn’t add too much latency to the overall flow.
To make our decision, we ran a quick POC using the AI Platform Online Prediction service, which sounded like a potentially great fit for our needs at the serving layer.
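The latency question raised above can be checked with a small timing harness. The sketch below is one possible way to do it, not the measurement used in the POC: `measure_latency_ms`, `percentile`, and `fake_predict` are hypothetical names, and `fake_predict` stands in for a real REST call to the managed service.

```python
import math
import time


def measure_latency_ms(call, n_requests=200):
    """Invoke `call()` n_requests times; return sorted latencies in milliseconds."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return sorted(latencies)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def fake_predict():
    # Placeholder for the remote prediction request (hypothetical).
    time.sleep(0.001)


if __name__ == "__main__":
    samples = measure_latency_ms(fake_predict, n_requests=50)
    print("p50 ms:", percentile(samples, 50))
    print("p95 ms:", percentile(samples, 95))
```

Tail percentiles such as p95 or p99 are usually more informative than the mean here, because a ranking request is only as fast as its slowest dependency.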
A quick (and successful) POC
We trained our gradient boosted models over our ~90 signals using scikit-learn, serialized the model as a pickle file, and simply deployed it as-is to the Google Cloud AI Platform. Done. We got a fully managed serving layer for our advanced model through a REST API. From there, we just had to connect it to our Java serving layer (there are a lot of important details in making that work, but they are unrelated to the pure model serving layer).
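For readers unfamiliar with the REST side, here is a rough sketch of how a client might build an online prediction request for a deployed scikit-learn model. It assumes the standard AI Platform `:predict` request shape, a JSON object with an `instances` list of feature rows. The signal names, project and model identifiers, and both helper functions are hypothetical, and authentication and the HTTP call itself are left out.

```python
import json

# Feature order must match the column order used at training time.
# These names are hypothetical placeholders for the ~90 real signals.
SIGNAL_ORDER = ["pickup_detour_minutes", "past_acceptance_rate"]


def build_predict_body(candidates, signal_order=SIGNAL_ORDER):
    """Serialize candidates (dicts of signal -> value) into a predict request body.

    Raises KeyError if a candidate lacks one of the expected signals.
    """
    rows = [[float(c[name]) for name in signal_order] for c in candidates]
    return json.dumps({"instances": rows})


def predict_url(project, model):
    """Build the online prediction URL for a model's default version."""
    return f"https://ml.googleapis.com/v1/projects/{project}/models/{model}:predict"


if __name__ == "__main__":
    body = build_predict_body(
        [{"pickup_detour_minutes": 3, "past_acceptance_rate": 0.5}]
    )
    print(predict_url("my-project", "my-model"))
    print(body)
```

Sending all candidates of one ranking request as a single batch of rows keeps it to one network round trip, which matters when latency is the main concern.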
Below is a very high-level schema of what our offline/online training/serving architecture looks like. The carpool serving layer is responsible for a lot of logic around computing/fetching the relevant candidates to score, but we focus here on the pure ML scoring part. Google Cloud AI Platform plays a key role in that architecture. It greatly increases our velocity by providing us with an immediate, managed, and robust serving layer for our models, and it allows us to focus on improving our features and modelling.