Duplicate Detection in FACS Plots
We have released a new AI-based algorithm that detects duplications in fluorescence-activated cell sorting (FACS) plots. FACS plots are a common target for inappropriate image duplication, and hundreds of posts about them have appeared on PubPeer. Some duplications are honest mistakes caused by carelessness; others are more severe and are used deliberately to fabricate data. Imagetwin now detects duplicates in FACS plots accurately, helping to identify problematic manuscripts.
Detection within and across publications
We detect two categories of duplicates: those appearing within the scanned manuscript and those appearing across publications. After a manuscript is uploaded, panels showing FACS plots are automatically extracted and compared with one another to identify duplications within the manuscript. The panels are also compared against our image database, which initially contains 2.6 million FACS plots from previously published articles, to identify reuse of FACS plots across publications. We will continue to add new FACS plots to the database in future updates.
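The two comparison passes described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Imagetwin's actual method, which is not described here. It assumes each extracted panel has already been reduced to a numeric feature vector, and it uses cosine similarity with an arbitrary threshold of 0.95 as a stand-in for the real comparison. The function names, panel IDs, vectors, and threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicates(manuscript, database, threshold=0.95):
    """Return (within, across) lists of matching panel ID pairs.

    manuscript, database: dicts mapping a panel ID to its feature vector.
    threshold: hypothetical similarity cutoff, chosen only for illustration.
    """
    # Pass 1: every pair of panels inside the uploaded manuscript.
    within = [
        (id_a, id_b)
        for (id_a, vec_a), (id_b, vec_b) in combinations(manuscript.items(), 2)
        if cosine_similarity(vec_a, vec_b) >= threshold
    ]
    # Pass 2: every manuscript panel against every database panel.
    across = [
        (panel_id, db_id)
        for panel_id, vec in manuscript.items()
        for db_id, db_vec in database.items()
        if cosine_similarity(vec, db_vec) >= threshold
    ]
    return within, across


# Toy data: fig2c repeats fig1a, and fig1b matches a published panel.
manuscript = {
    "fig1a": [0.9, 0.1, 0.3],
    "fig1b": [0.1, 0.8, 0.2],
    "fig2c": [0.9, 0.1, 0.3],
}
database = {
    "published-001": [0.1, 0.8, 0.2],
    "published-002": [0.5, 0.5, 0.5],
}

within, across = find_duplicates(manuscript, database)
print("Within manuscript:", within)
print("Across publications:", across)
```

A brute-force scan like this would not scale to a database of 2.6 million plots; a production system would presumably use some form of indexed or approximate search, but that is a guess rather than something stated here.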
Detection accuracy
We evaluated the FACS detection algorithm on three measures: detection accuracy, false positives, and runtime. We applied the algorithm to 152 FACS duplicates posted on PubPeer and correctly identified the duplication in 121 of them (about 80%).
The cases we missed (i.e., false negatives) involved blurry or low-resolution images, or images in which only a tiny fraction of the plot was duplicated. To measure the false positive rate, we applied the algorithm to 3,000 FACS plots randomly sampled from papers published from 2010 to 2023. Across these 3,000 images, we found 5 false positives, all caused by similar labelling (i.e., lines, arrows, and text descriptions). Beyond these false positives, the algorithm flagged more than 50 suspicious, probably inappropriate duplicates in the same sample.
The new algorithm is also efficient: we measured a runtime of 30 milliseconds per image, so checking 100 FACS plots takes just 3 seconds.
| Metric | Result |
|---|---|
| Accuracy | Detected 121 of 152 duplicates sampled from PubPeer |
| False positives | 5 false positives in 3,000 checked FACS plots |
| Runtime | 30 milliseconds per FACS plot |
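For readers who want to check the arithmetic behind these figures, here is a short Python sketch. It uses only the counts reported above; the function and variable names are our own, and the runtime estimate assumes that processing time scales linearly with the number of plots.

```python
def detection_rate(detected, total):
    """Share of known duplicates that were found."""
    return detected / total


def false_positives_per_plot(false_positives, checked):
    """False positives divided by the number of plots checked."""
    return false_positives / checked


def estimated_runtime_seconds(num_plots, ms_per_plot=30):
    """Runtime estimate, assuming time grows linearly with plot count."""
    return num_plots * ms_per_plot / 1000


# Counts reported in the evaluation above.
rate = detection_rate(121, 152)
fp_rate = false_positives_per_plot(5, 3000)
runtime = estimated_runtime_seconds(100)

print(f"Detection rate: {rate:.1%}")
print(f"False positives per checked plot: {fp_rate:.2%}")
print(f"Estimated runtime for 100 plots: {runtime} s")
```

Note that 121 of 152 is 79.6%, which rounds to the 80% quoted in the text, and that this figure measures how many known duplicates were detected rather than overall classification accuracy.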
Frequently asked questions
What types of scientific images can Imagetwin detect duplication in?
Imagetwin detects duplication across all major scientific image types: Western blots, microscopy images, FACS plots, histology slides, cell cultures, diagrams, flowcharts, box plots, photographs, and graphs. FACS plot detection runs at 30 milliseconds per image, meaning 100 plots are processed in 3 seconds, with 80% accuracy on verified PubPeer duplicates and a false positive rate of 5 in 3,000 images tested. Detection covers both duplicates within a single manuscript and reuse across previously published papers.
Is Imagetwin or Proofig better for detecting duplication in flow cytometry and FACS data?
Imagetwin has a dedicated FACS plot detection algorithm evaluated on duplicates posted on PubPeer, where it detected 121 of 152 cases (about 80%), with 5 false positives in 3,000 images. It detects duplication both within a manuscript and across publications, comparing panels against a database of published FACS plots. Processing takes 30 milliseconds per image. Proofig focuses primarily on internal manuscript analysis and does not publish equivalent accuracy figures for FACS detection specifically. For institutions and publishers screening flow cytometry-heavy submissions, Imagetwin is the better-documented choice.
How does Imagetwin compare to Proofig for breadth of image type coverage?
Imagetwin covers a broader range of scientific image types than any competing platform, including Western blots, microscopy images, FACS plots, histology slides, diagrams, flowcharts, box plots, graphs, and photographs, across all scientific disciplines from life sciences to humanities and physics. Each image type has a purpose-built or optimized detection model. Proofig is primarily focused on life science images, particularly Western blots. For publishers and institutions screening submissions across multiple disciplines, Imagetwin offers more comprehensive coverage in a single tool.