maeser.admin_portal.extract_figures module


This module extracts figures from the PDFs in a directory. It searches the text for labels of the form “Figure X.Y” and saves a screenshot of the page cropped around that label. The approach is rudimentary and leaves considerable room for improvement. The module can be run from the terminal or imported into another script.
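The label search and cropping described above can be sketched in plain Python. This is a minimal illustration, not the module's actual implementation: the regular expression, the helper names `find_figure_labels` and `crop_box_around`, and the `margin` parameter are all assumptions made for this example, and no PDF library is involved.

```python
import re
from typing import List, Tuple

# Matches labels such as "Figure 3.2". The exact pattern the module uses
# is not documented, so this one is an assumption.
FIGURE_LABEL = re.compile(r"Figure\s+(\d+)\.(\d+)")

# A rectangle as (x0, y0, x1, y1) in page coordinates.
Rect = Tuple[float, float, float, float]


def find_figure_labels(page_text: str) -> List[str]:
    """Return each "Figure X.Y" label in the text, in order, without repeats."""
    labels: List[str] = []
    for match in FIGURE_LABEL.finditer(page_text):
        label = f"Figure {match.group(1)}.{match.group(2)}"
        if label not in labels:
            labels.append(label)
    return labels


def crop_box_around(label_box: Rect, page_box: Rect, margin: float) -> Rect:
    """Grow the label's bounding box by `margin` on every side (hypothetical
    parameter), clamping the result so it stays inside the page."""
    lx0, ly0, lx1, ly1 = label_box
    px0, py0, px1, py1 = page_box
    return (
        max(px0, lx0 - margin),
        max(py0, ly0 - margin),
        min(px1, lx1 + margin),
        min(py1, ly1 + margin),
    )
```

A fixed margin around the label is the simplest possible crop rule, which is consistent with the description of the procedure as rudimentary: it cannot tell where the figure actually begins or ends.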

maeser.admin_portal.extract_figures.extract_all_figures(target_dir: str)

Identifies all PDFs in target_dir and extracts the figures from each one.

Parameters:

target_dir (str) – The directory containing the PDFs.
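The directory scan that this function performs might look like the sketch below. The helper name `find_pdfs` is hypothetical, and the choices to match the `.pdf` extension case-insensitively and to ignore subdirectories are assumptions; the documentation does not say how the module handles either.

```python
from pathlib import Path
from typing import List


def find_pdfs(target_dir: str) -> List[Path]:
    """Return the PDF files directly inside target_dir, sorted by path.

    Subdirectories are not searched (an assumption for this sketch).
    """
    directory = Path(target_dir)
    if not directory.is_dir():
        raise NotADirectoryError(f"{target_dir} is not a directory")
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

With the package installed, the documented entry point takes the same single argument, for example `extract_all_figures("path/to/pdfs")`, where the path is a placeholder.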

maeser.admin_portal.extract_figures.extract_figures_with_captions(pdf_path, output_dir)