Tech stack: Rails, Sidekiq, PostgreSQL, Redis
For each tenant application, we want to calculate the fees of its orders. Every week, starting on Monday, the previous week's sum becomes available, and we charge each app that total fee.
We have to calculate this in a reliable way:
- A fee is not charged twice
- A fee is not omitted due to errors
- The dataset will grow over time, so the table that holds ORDER_ID, APP_ID, AMOUNT, FEE, CHARGED_AT will become big.
First approach: a cron job that calculates the sum of the order fees per app every Monday for the past week, running several times to avoid problems with downtime. Maybe I could use a GROUP BY? The con of this approach is that I foresee a huge spike of calculations on a given day.
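A minimal sketch of how the first approach could look, not a definitive implementation. It assumes a hypothetical order_fees table with the columns listed above and a hypothetical weekly_charges table with a unique index on (app_id, week_start). Both names are illustrative, and the $1/$2 placeholders follow PostgreSQL's positional parameter style. Because of the ON CONFLICT clause, running the job several times on Monday does not store or charge the same week twice.

```ruby
require "date"

# Returns [week_start, week_end) for the week before the one containing
# `today`: previous Monday (inclusive) up to this week's Monday (exclusive).
def previous_week_range(today = Date.today)
  this_monday = today - (today.cwday - 1) # cwday: Monday = 1 .. Sunday = 7
  [this_monday - 7, this_monday]
end

# One aggregate per app for the week, stored at most once.
# Table names order_fees and weekly_charges are assumptions.
WEEKLY_CHARGES_SQL = <<~SQL
  INSERT INTO weekly_charges (app_id, week_start, total_fee)
  SELECT app_id, $1::date, SUM(fee)
  FROM order_fees
  WHERE charged_at >= $1::date AND charged_at < $2::date
  GROUP BY app_id
  ON CONFLICT (app_id, week_start) DO NOTHING
SQL

week_start, week_end = previous_week_range(Date.new(2024, 5, 15)) # a Wednesday
puts "#{week_start} .. #{week_end}" # prints "2024-05-06 .. 2024-05-13"
```

The trade-off of DO NOTHING is that a rerun never changes a total that was already stored, so rows that arrive late for an already-closed week would need separate handling.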
The table will become bigger and bigger as time goes by:
- Of course I can add indexes on APP_ID and CHARGED_AT to avoid sequential scans.
- I can have a DB replica just for reads.
- Maybe this table can be "pruned" over time and moved somewhere else?
Second approach: every time we know the fee of an order:
- Try to create a "weekly report" in the database. If it doesn't exist, initialize the sum to 0.
- Update the record inside the database, so that concurrent updates are applied sequentially:
UPDATE AMOUNT = AMOUNT + FEE
- The problem that I find here is that I have to guarantee that an order fee is not charged twice.
So, could I somehow, in a single transaction, do:
BEGIN TRANSACTION
UPDATE AMOUNT = AMOUNT + FEE
ORDER UPDATE WEEKLY_FEE = TRUE
COMMIT TRANSACTION
I have two doubts about this approach:
- Does it really scale better with the dataset? We are running micro-transactions against the database every time a fee is known. We will have more orders and apps over time, so this table can grow quickly.
- It feels like the double charge can still happen in my code. Maybe adding something like IF WEEKLY_FEE ? 0 : FEE to the UPDATE AMOUNT = can help?
BEGIN TRANSACTION
CURRENT_FEE = WEEKLY_FEE ? 0 : FEE
UPDATE AMOUNT = AMOUNT + CURRENT_FEE
ORDER UPDATE WEEKLY_FEE = TRUE
COMMIT TRANSACTION
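One possible way to push the "only if not yet counted" check into the database is sketched below. It is an illustration under assumptions, not a tested design. The tables orders (with a boolean weekly_fee flag) and weekly_reports (with a unique index on (app_id, week_start)) are hypothetical names. The statement first claims the order by flipping the flag, and adds the fee only when the claim matched a row. The small in-memory model after it mirrors that rule purely to show the intended behavior when a job is retried.

```ruby
# Single-statement form of the transaction above (PostgreSQL syntax).
# Assumed tables: orders(order_id, app_id, fee, charged_at, weekly_fee)
# and weekly_reports(app_id, week_start, amount).
CLAIM_AND_ADD_SQL = <<~SQL
  WITH claimed AS (
    UPDATE orders
    SET weekly_fee = TRUE
    WHERE order_id = $1 AND weekly_fee = FALSE
    RETURNING app_id, fee, date_trunc('week', charged_at)::date AS week_start
  )
  INSERT INTO weekly_reports (app_id, week_start, amount)
  SELECT app_id, week_start, fee FROM claimed
  ON CONFLICT (app_id, week_start)
  DO UPDATE SET amount = weekly_reports.amount + EXCLUDED.amount
SQL

# In-memory model of the same rule (no database involved).
Order = Struct.new(:id, :app_id, :fee, :weekly_fee)

# Adds the fee to the app's total only if the order was not counted yet.
# Returns true when the fee was added, false when the call was a no-op.
def add_fee_once(order, report)
  return false if order.weekly_fee
  order.weekly_fee = true
  report[order.app_id] += order.fee
  true
end

report = Hash.new(0)                 # app_id => weekly total (in cents)
order = Order.new(1, 42, 150, false)
add_fee_once(order, report)          # first delivery adds 150
add_fee_once(order, report)          # simulated retry is a no-op
puts report[42]                      # prints 150
```

In PostgreSQL's default READ COMMITTED isolation, a second concurrent UPDATE on the same order row waits for the first to commit and then re-checks its WHERE clause, so it matches no row and contributes nothing to the sum.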
Am I overcomplicating this? Could the first approach be optimized for performance instead?