World of Software
Copyright © All Rights Reserved. World of Software.

PCIC Model Design: Category-Level Repurchase Prediction and Frequency‑Recency Item Ranking | HackerNoon

News Room
Last updated: 2025/08/12 at 1:02 AM
News Room Published 12 August 2025

Table of Links

Abstract and 1 Introduction

  2. Literature Review
  3. Model
  4. Experiments
  5. Deployment Journey
  6. Future Directions and References

3 MODEL

3.1 Category level repurchase modeling

We use category-level features to predict each customer's likelihood of repurchasing items. Features are built per customer from their purchase history: the last m days of purchase data are used to generate labels for training the category-level model, and all purchase history before those m days is used to generate the features. Any category in which a customer repurchased an item during the m-day window is assigned label 1; all other categories are assigned label 0. The main features used to train the model are described in the following subsections.

3.1.1 Survival Analysis. Survival analysis focuses on the expected time until an event of interest occurs. It differs from traditional regression in that part of the training data can only be partially observed; such observations are said to be censored. For a censored observation, we only know that the event time is greater than the time at the point of censoring. In the retail scenario, we treat the purchase of an item within a category as an event. For each category, repeat purchase data can then be used to construct a life table across customers, which allows us to predict repeat purchase risk as a function of time. A life table summarizes the events and censored cases over time. At time 0, all observations (reference purchases) are still at risk, which means that they have neither repeated the purchase (the event) nor been censored. As events and censored cases occur, observations drop out of the risk set.

Repeat purchase data can be used to compute a few useful features:

1. hazard (eq. 1) is the probability of the event occurring on the kth day, conditional on the event not having occurred before day k. It approximates the probability that an event (repurchase) occurs in a given time interval, given that the user has remained event-free (no purchase) up to that time.

2. cum_hazard (eq. 2) is the cumulative sum of the hazard over time.

3. survival (eq. 3) is the probability of the event occurring after day k or, equivalently, the proportion of observations that have not yet experienced the event by day k.

4. cum_survival (eq. 4) is the probability of the event occurring within ±3 days of today. We define this additional feature because many grocery customers shop once a week.

5. normalized_risk (eq. 5) is the risk associated with the user-category pair today as a fraction of the risk on the day of purchase.

6. normalized_event (eq. 6) is the event probability on the given day, normalized by the combined event and censored population.
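
As a rough illustration of the first three features, the sketch below builds a discrete life table from repurchase durations using the textbook definitions (hazard = events on day k divided by the number still at risk, cumulative hazard = running sum, survival = running product of 1 - hazard). The function name `life_table`, its arguments, and the sample data are hypothetical and not taken from the paper; the paper-specific features cum_survival, normalized_risk and normalized_event are not reproduced here.

```python
from collections import Counter


def life_table(durations, observed, horizon):
    """Build a discrete life table for days 1..horizon (illustrative only).

    durations[i] -- days from a reference purchase to the repurchase (event)
                    or to the end of observation (censoring); assumed >= 1
    observed[i]  -- True if the repurchase was seen, False if censored
    """
    events = Counter(d for d, seen in zip(durations, observed) if seen)
    censored = Counter(d for d, seen in zip(durations, observed) if not seen)

    at_risk = len(durations)  # at time 0 every reference purchase is at risk
    cum_hazard = 0.0
    survival = 1.0
    rows = []
    for day in range(1, horizon + 1):
        # Hazard: share of the current risk set that repurchases on this day.
        hazard = events[day] / at_risk if at_risk else 0.0
        cum_hazard += hazard
        survival *= 1.0 - hazard
        rows.append({"day": day, "at_risk": at_risk, "hazard": hazard,
                     "cum_hazard": cum_hazard, "survival": survival})
        # Events and censored cases both leave the risk set.
        at_risk -= events[day] + censored[day]
    return rows


# Made-up data: three repurchases on day 7, two on day 14, one case censored on day 10.
table = life_table([7, 7, 7, 14, 14, 10],
                   [True, True, True, True, True, False],
                   horizon=14)
for row in table:
    if row["hazard"] > 0:
        print(row)
```

With this made-up data the hazard is nonzero only on days 7 and 14, mirroring the weekly cadence described for grocery items.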

Building this model gives a population-level view of the item repurchase rate. For example, we observe that people mostly repurchase bananas every 7 days and cleaning supplies every 21 days, so the hazard function for each of these categories peaks at that interval between purchases. Given the date on which a customer last purchased an item, we can use survival analysis to predict the date of repurchase or the probability of repurchase after n days.

3.1.2 ARIMA models. Autoregressive Integrated Moving Average (ARIMA) models are useful for short-term forecasts on nonstationary time series problems. For each customer and category, we characterize the purchase pattern using ARIMA and predict the next day of purchase. ARIMA models have three parameters (p, d, q), where p is the order of the autoregressive model, d is the degree of differencing, and q is the order of the moving-average model. We build one ARIMA model that observes the past dates of purchases within a category to predict the next one, and a second model that considers the quantity of the item purchased to predict the customer's current rate of consumption (say, X uses 2 oz of shampoo daily). The consumption rate is then used to predict the date when the customer will likely run out of the item. For each customer-category pair, we train these models and use their forecasts, ARIMA(date) and ARIMA(rate), as features.
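
To make the date model concrete, here is a minimal sketch of an ARIMA(1, 1, 0) forecast with drift, fitted by conditional least squares in plain Python. The order (1, 1, 0), the estimation method, the fallback for short histories, and the function name `forecast_next_purchase_day` are all assumptions for illustration; the paper does not state which orders or library were used. In practice a library implementation such as the `ARIMA` class in `statsmodels.tsa.arima.model` would be the more usual choice.

```python
def forecast_next_purchase_day(purchase_days):
    """Forecast the next purchase day with a hand-rolled ARIMA(1, 1, 0) plus drift.

    purchase_days -- increasing day indices of past purchases in one category
    """
    if len(purchase_days) < 2:
        raise ValueError("need at least two purchases")

    # d = 1: difference once, turning purchase days into inter-purchase gaps.
    gaps = [b - a for a, b in zip(purchase_days, purchase_days[1:])]
    if len(gaps) < 3:
        # Too little data to fit the AR term: fall back to the mean gap.
        return purchase_days[-1] + sum(gaps) / len(gaps)

    # p = 1: regress each gap on the previous gap (conditional least squares).
    x, y = gaps[:-1], gaps[1:]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    var = sum((xi - x_mean) ** 2 for xi in x)
    cov = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    phi = cov / var if var > 0 else 0.0   # AR(1) coefficient
    const = y_mean - phi * x_mean         # drift term

    next_gap = const + phi * gaps[-1]
    return purchase_days[-1] + next_gap


weekly = forecast_next_purchase_day([0, 7, 14, 21, 28])
alternating = forecast_next_purchase_day([0, 6, 14, 20, 28])
print(weekly, alternating)
```

The same routine could be applied to a series of consumption rates to obtain an ARIMA(rate)-style feature, again under the assumptions stated above.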

3.1.3 Other features. We consider three more behavioral category-level features: NumPurchases, the number of times a given customer has purchased from the category; tripsSinceLastPurchased, the number of purchases in other categories the customer has made since last purchasing in this category; and daysSincelastPurchased, the number of days between today and the last date on which the customer made a purchase in this category.
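
A minimal sketch of how these three features might be computed from a single customer's purchase log follows. The input layout (a list of `(date, category)` tuples), the function name `behavioral_features`, and the decision to count one trip per calendar date are assumptions made for the example.

```python
from datetime import date


def behavioral_features(purchases, category, today):
    """Compute the three behavioral features for one customer and category.

    purchases -- list of (purchase_date, category) tuples for one customer
    Assumption: every distinct purchase date counts as one shopping trip.
    """
    dates_in_category = [d for d, c in purchases if c == category]
    if not dates_in_category:
        return None  # the customer has never bought from this category

    last_purchase = max(dates_in_category)
    # Trips made after the last purchase in this category; by construction
    # they contain only other categories.
    later_trips = {d for d, _ in purchases if d > last_purchase}

    return {
        "NumPurchases": len(dates_in_category),
        "tripsSinceLastPurchased": len(later_trips),
        "daysSincelastPurchased": (today - last_purchase).days,
    }


history = [
    (date(2022, 7, 1), "dairy"),
    (date(2022, 7, 1), "produce"),
    (date(2022, 7, 8), "dairy"),
    (date(2022, 7, 12), "produce"),
    (date(2022, 7, 15), "snacks"),
    (date(2022, 7, 15), "produce"),
]
dairy = behavioral_features(history, "dairy", today=date(2022, 7, 20))
print(dairy)
```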

3.1.4 Model training. We take the past 1.5 years of user shopping data to train the model, to ensure we capture a yearly cadence. The last m days of data are held out to generate labels. For example, we may take the Jan 2021 to July 24, 2022 dataset to generate features for all guests. For those guests who shopped during July 25 to 31 (m = 7), we generate labels 0 and 1 for categories not shopped and shopped, respectively. The 6 features from the survival model, the 2 predictions from the two ARIMA models, and the 3 other features mentioned earlier are generated for each user and category pair.

Table 1: Some characteristics of datasets considered for evaluation
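
The feature/label split described in this subsection can be sketched as follows. The record layout, the function name `split_and_label`, and the choice to label only categories the guest had already bought before the cutoff (so that label 1 means a repurchase) are assumptions for illustration.

```python
from datetime import date, timedelta


def split_and_label(purchases, cutoff, m=7):
    """Split a purchase log into feature history and category labels.

    purchases -- list of (customer, purchase_date, category) tuples
    cutoff    -- first day of the m-day label window
    Assumption: only categories already present in a guest's history are labeled.
    """
    window_end = cutoff + timedelta(days=m)
    history = [p for p in purchases if p[1] < cutoff]

    seen_before = {}   # customer -> categories bought before the cutoff
    for customer, _, category in history:
        seen_before.setdefault(customer, set()).add(category)

    bought_in_window = {}   # customer -> categories bought in the label window
    for customer, day, category in purchases:
        if cutoff <= day < window_end:
            bought_in_window.setdefault(customer, set()).add(category)

    labels = {}
    # Only guests who shopped during the label window receive labels.
    for customer, shopped in bought_in_window.items():
        for category in seen_before.get(customer, set()):
            labels[(customer, category)] = 1 if category in shopped else 0
    return history, labels


log = [
    ("A", date(2022, 7, 1), "dairy"),
    ("A", date(2022, 7, 10), "produce"),
    ("B", date(2022, 7, 2), "dairy"),
    ("A", date(2022, 7, 26), "dairy"),
    ("A", date(2022, 7, 27), "snacks"),
]
history, labels = split_and_label(log, cutoff=date(2022, 7, 25), m=7)
print(labels)
```

Guest B did not shop during the label window and so receives no labels, matching the example in the text.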

We trained a 2-layer neural network on the category-level guest purchase dataset. We kept the network light because the number of input features is small (11) and because it needs to scale to a large number of users. The best-performing network was composed of 2 fully connected layers (10 and 5 neurons) with sigmoid activations. The output layer is passed through a softmax, and the logistic loss function is used for optimization.
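
The architecture described above can be sketched as a plain-Python forward pass: 11 inputs, hidden layers of 10 and 5 sigmoid units, and a softmax output scored with the logistic (cross-entropy) loss. The two-class output, the random initialization, and all function names are assumptions for illustration; training (backpropagation) is omitted.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, params):
    hidden = features
    for weights, biases in params[:-1]:       # sigmoid hidden layers
        hidden = [sigmoid(z) for z in dense(hidden, weights, biases)]
    weights, biases = params[-1]              # softmax output layer
    return softmax(dense(hidden, weights, biases))


def log_loss(probs, label):
    """Logistic loss for one example with an integer class label."""
    return -math.log(probs[label])


rng = random.Random(0)
sizes = [11, 10, 5, 2]  # assumed: 2 outputs = (no repurchase, repurchase)
params = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
features = [rng.random() for _ in range(11)]  # placeholder feature vector
probs = forward(features, params)
print(probs, log_loss(probs, label=1))
```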

3.2 Intra-category Product Ranking

In general, we observed that a customer is most likely to repurchase the items they buy most frequently or bought most recently. The two main features used to rank products within a category are therefore frequency (Freq) and recency (Rec) of purchase. We wanted to combine the two to arrive at optimal ranks; however, recency is measured in days while frequency is a count. To put them on a common scale, we convert both into ranks. Item Frequency Rank (IFR) and Item Recency Rank (IRR) are obtained by ranking, respectively, the frequency counts and the number of days since the last purchase of an item (DaysSincePurchase): IFR = Rk(Freq), IRR = Rk(DaysSincePurchase). We combine the ranks using a weighted average, rank again, then divide the resulting rank by the number of times the item has been bought (NIB). This insight was based on user feedback and will be discussed in later sections. Equation 5.2 shows how the final Item Rank (IR) is calculated:

IR = Rk(α · IFR + β · IRR) / NIB

where the parameters α and β were obtained using an exhaustive grid search over the range [0, 1].
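
The ranking procedure described in this section (rank by frequency and by recency, take a weighted average, rank again, divide by NIB) can be sketched as below. Several details are assumptions for the example: rank 1 is treated as best, ties share the lowest rank, a lower final score means a better position, NIB is set equal to the frequency count, and the weights 0.5/0.5 are placeholders rather than the grid-searched values.

```python
def rank(values, descending=False):
    """Rank values so that 1 is best; tied values share the lowest rank."""
    if descending:
        return [1 + sum(other > v for other in values) for v in values]
    return [1 + sum(other < v for other in values) for v in values]


def item_rank(freq, days_since_purchase, nib, alpha, beta):
    """Final item scores: re-ranked weighted average of IFR and IRR, divided by NIB."""
    ifr = rank(freq, descending=True)   # most frequently bought -> rank 1
    irr = rank(days_since_purchase)     # most recently bought   -> rank 1
    combined = [alpha * f + beta * r for f, r in zip(ifr, irr)]
    reranked = rank(combined)
    return [r / n for r, n in zip(reranked, nib)]


items = ["milk", "eggs", "bread"]
freq = [5, 3, 1]     # purchase counts (made-up data)
days = [2, 1, 10]    # days since last purchase (made-up data)
scores = item_rank(freq, days, nib=freq, alpha=0.5, beta=0.5)
ordered = [name for _, name in sorted(zip(scores, items))]
print(ordered)
```

Here milk and eggs tie after re-ranking, and dividing by NIB breaks the tie in favor of the item bought more often.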

3.3 Model output

Authors:

(1) Amit Pande, Data Sciences, Target Corporation, Brooklyn Park, Minnesota, USA;

(2) Kunal Ghosh, Data Sciences, Target Corporation, Brooklyn Park, Minnesota, USA;

(3) Rankyung Park, Data Sciences, Target Corporation, Brooklyn Park, Minnesota, USA.

