Why aren’t you de-duping the table before processing?
I need to dedupe the to-be-processed data against the data that's already in the "final" table. We are working with hundreds of millions of products, which is why we thought about "simply" processing random batches of the remaining data. But thanks to the many replies, I've already learned that our approach was wrong from the start.
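For anyone landing here later: the standard alternative to random batches is an anti-join, i.e. selecting only the rows whose key does not yet exist in the final table. Here is a minimal sketch in Python/sqlite3, with hypothetical table and column names (`staging`, `final`, `product_id`) since the real schema isn't shown:

```python
import sqlite3

# Hypothetical schema: `staging` holds the to-be-processed rows, `final`
# the already-processed ones; both are keyed by `product_id`.
conn = sqlite3.connect("products.db")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS staging (product_id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE IF NOT EXISTS final   (product_id INTEGER PRIMARY KEY, payload TEXT);
""")

# Anti-join: stream only the staging rows whose key is absent from final.
# With an index on final.product_id this skips already-processed rows up
# front, instead of sampling random batches and checking them afterwards.
cur = conn.execute("""
    SELECT s.product_id, s.payload
    FROM staging AS s
    WHERE NOT EXISTS (
        SELECT 1 FROM final AS f WHERE f.product_id = s.product_id
    )
""")
for product_id, payload in cur:
    pass  # process the row here, then insert it into final
```

At hundreds of millions of rows the key column on the final table needs an index (the PRIMARY KEY above provides one), otherwise the anti-join degrades into a full scan per row.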
Thanks for this input. This was the argument that convinced my boss to migrate to a modern server. I admit that I see many flaws in our design, and we are now working on refactoring both our architecture and our approach.
Thanks also to the numerous other answers pointing me in the right direction (hopefully).