Next-Level Western Blot Duplicate Detection

Western Blots are among the most difficult scientific images to analyse for duplication. They often lack distinctive texture or structure, and many look strikingly similar to one another, which can result in a high false positive rate. Our current detection system performs well, but feedback from researchers and editors showed us that some especially challenging cases were slipping through.

Over the past few months, our team has been building a stronger system designed specifically to handle these edge cases, and the results are a major step forward.

How We Improved Western Blot Detection

To improve detection, we analysed thousands of Western Blot duplicates identified on PubPeer. These real-world cases helped us understand recurring patterns and highlighted situations where conventional algorithms struggled.

From this foundation, we worked closely with research integrity experts to assemble a large, curated dataset of Western Blot duplicates. This became the basis for training a new machine learning model, purpose-built for this image type.

Key advances in the new system include:

  • Specialised training corpus
    Thousands of confirmed duplicates curated with expert input
  • Next-generation detection model
    Optimised to capture subtle similarities that generic approaches miss
  • Robustness to transformations
    Tested against common transformations such as rotation, scaling, contrast or colour shifts, quality loss, and flipping, all areas where the new model shows far stronger performance
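
As a rough illustration of what testing against such transformations can involve, the sketch below builds flipped, rotated, and contrast-shifted variants of a toy greyscale grid. It is not our test harness: the helper names (flip_horizontal, rotate_90_clockwise, adjust_contrast, make_variants) and the tiny sample image are assumptions made purely for this example.

```python
# Illustrative only: images are toy 2D greyscale grids (lists of rows, values 0-255).
# All helper names here are hypothetical and not part of the detection system.

def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]

def rotate_90_clockwise(img):
    """Rotate the grid 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def adjust_contrast(img, factor):
    """Scale each pixel's distance from mid-grey (128), clamped to 0-255."""
    return [
        [max(0, min(255, round(128 + (p - 128) * factor))) for p in row]
        for row in img
    ]

def make_variants(img):
    """Return the original plus a few transformed copies to compare against it."""
    return {
        "original": img,
        "flipped": flip_horizontal(img),
        "rotated": rotate_90_clockwise(img),
        "low_contrast": adjust_contrast(img, 0.5),
    }

# A tiny stand-in for a blot crop: two rows, three columns.
band = [[10, 200, 30], [40, 220, 60]]
variants = make_variants(band)
```

A robust detector would be expected to match every variant back to the original, so each entry in variants forms a known duplicate pair for evaluation.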

What This Means for Users

With this release, duplicate detection between Western Blot images has become substantially more accurate: the new model achieves 90% accuracy on 443 verified duplicates collected from PubPeer. Many of the most challenging cases that previously went undetected are now flagged correctly, which reduces the risk of false negatives and gives researchers, editors, and institutions greater confidence in their results.
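
Because every case in that PubPeer set is a verified duplicate, one way to read the 90% figure is as a detection rate (recall): the share of known duplicates that get flagged. The sketch below shows that calculation on a made-up set of ten outcomes; the function name detection_rate and the sample values are assumptions for illustration, not our evaluation code or data.

```python
def detection_rate(flags):
    """Fraction of known duplicate pairs that were flagged (recall)."""
    if not flags:
        raise ValueError("need at least one known duplicate")
    return sum(flags) / len(flags)

# Hypothetical outcomes for ten known duplicate pairs: True means "flagged".
outcomes = [True] * 9 + [False]

rate = detection_rate(outcomes)     # 0.9
missed = outcomes.count(False)      # 1 false negative
print(f"detected {rate:.0%}, missed {missed}")
```

A test set made up only of true duplicates measures missed cases; it says nothing about false positives, which would need a separate set of non-duplicate pairs.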

To demonstrate the impact of this upgrade, we tested the new model against some of the most difficult Western Blot cases shared by researchers and editors. Some examples are shown below:

This means researchers, editors, and institutions can work with greater confidence that duplications will be caught, even in the toughest cases.

Looking Ahead

What we developed is not just a model for Western Blots. We have created a new duplicate detection technology that can be adapted to other use cases. Next, we will extend from duplicate detection between Western Blot images to duplicate detection within single Western Blot subimages. Then, over the coming months, we will roll out the new technology to flow cytometry images, microscopy images, and graphs, strengthening duplicate detection across an even broader range of figures.

By listening to feedback and analysing real-world cases, we are building ever-stronger tools to support the research community. Western Blot detection is just the latest step in this process, and more improvements are already on the way.