Thursday, December 26, 2024

FDA Sources Sought: ThinkTrends Software Support for Optical Character Recognition (OCR) PDF Extraction

Notice ID: 75F40123Q00417

Description

As part of the BEST platform, the contractor shall provide an AI-based Optical Character Recognition (OCR) function that serves as a PDF extraction solution. The solution shall extract data tables from bioanalytical reports in PDF format with greater than 90% accuracy, so that reviewers can access the data in a format amenable to filtering specific data sets, performing calculations, and other analyses critical to their processes. The government should be able to leverage this solution for other projects that are part of the Remote Regulatory Assessment (RRA) initiatives. RRA allows FDA investigators to perform FDA-regulated sponsor site inspections remotely. It includes voluntary interactive evaluations (such as remote livestreaming video of operations, teleconferences, and screen sharing) in addition to requests to review records and other information under existing statutory or regulatory authority. Throughout the pandemic, the FDA has used these tools, domestically and abroad, to help the agency conduct oversight, mitigate risk, and meet critical public health needs. The BEST platform described above is an example of RRA.
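The notice does not say how the 90% accuracy level would be measured. As an illustration only, the sketch below assumes a cell-level metric: the share of cells in a hand-verified reference table that the extracted table reproduces exactly at the same row and column. The names `cell_accuracy` and `meets_threshold` are hypothetical and do not come from the notice or from any ThinkTrends product.

```python
from typing import List

Table = List[List[str]]  # a table as rows of cell strings


def cell_accuracy(extracted: Table, reference: Table) -> float:
    """Fraction of reference cells matched exactly at the same position.

    Assumption: accuracy is scored per cell, ignoring surrounding
    whitespace. The notice itself does not define the metric.
    """
    total = sum(len(row) for row in reference)
    if total == 0:
        raise ValueError("reference table has no cells")
    matches = 0
    for r, ref_row in enumerate(reference):
        # Rows missing from the extraction count as all-wrong.
        ext_row = extracted[r] if r < len(extracted) else []
        for c, ref_cell in enumerate(ref_row):
            if c < len(ext_row) and ext_row[c].strip() == ref_cell.strip():
                matches += 1
    return matches / total


def meets_threshold(extracted: Table, reference: Table,
                    threshold: float = 0.90) -> bool:
    """True when accuracy is strictly greater than the threshold."""
    return cell_accuracy(extracted, reference) > threshold
```

Under this assumed metric, a six-cell table with one misread cell scores 5/6 and would fall short of the target.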

Task 1 – ThinkTrends Software Support for OCR PDF Extraction

ThinkTrends is a data mining and AI workflow automation platform that lets users create custom AI models for detecting objects in images with minimal coding. This technology may significantly improve the accuracy of OCR text and table extraction. ThinkTrends OCR API and ThinkTrends PDF AI are the ThinkTrends products that facilitate PDF extraction, conversion, verification, and automation for bioanalytical reports.
The OCR PDF extraction requirements (implementation and support) for the government’s remote inspection activities are:

  • Integrate OCR tool with Study Data Platform (SDP) loading module. The
    integration should:

    • use real-time API calls
    • build a batch process
    • create an end-to-end workflow
  • Build API call to populate SDP with extracted data from the bioanalytical reports
  • Provide ability to integrate with downstream systems to BEST and RRA
  • The OCR data extraction process shall be able to extract tables, table
    titles/names, page numbers, and other elements from PDF files
  • Extract tables from bioanalytical reports
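The batch requirement in the list above can be pictured with a minimal sketch. Everything here is hypothetical: the notice gives no API details, so the OCR tool and the SDP loading module are represented by two caller-supplied functions (`extract` and `load`), and `ExtractedTable` is an assumed record that holds the elements the notice names (table title, page number, table contents).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class ExtractedTable:
    """Hypothetical record for one table pulled from a PDF."""
    title: str
    page: int
    rows: List[List[str]]


# Stand-ins for the real integrations, which the notice does not specify.
Extractor = Callable[[str], List[ExtractedTable]]  # PDF path -> tables
Loader = Callable[[str, ExtractedTable], None]     # push one table to SDP


def run_batch(pdf_paths: Iterable[str], extract: Extractor,
              load: Loader) -> Dict[str, object]:
    """Extract tables from each PDF and pass them to the loader.

    A failure on one file is recorded and the batch continues, so a
    single bad report does not stop the whole run.
    """
    loaded = 0
    failed: List[str] = []
    for path in pdf_paths:
        try:
            for table in extract(path):
                load(path, table)
                loaded += 1
        except Exception:
            failed.append(path)
    return {"tables_loaded": loaded, "failed_files": failed}
```

Passing the extractor and loader in as functions keeps the sketch independent of any vendor API. The same loop could sit behind a real-time call for a single file or run as a scheduled job.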

