Module `utilities.protocols`

Define protocols for external classes that cannot be imported directly because of circular imports.
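The pattern can be illustrated with a minimal sketch. Every name below (`SummaryProtocol`, `FakeIntegrator`, `describe`, `render_summary`) is hypothetical and not taken from the codebase; the sketch only shows how a `typing.Protocol` lets a consumer annotate against an interface without importing the concrete class.

```python
from typing import Protocol


class SummaryProtocol(Protocol):
    """Hypothetical protocol standing in for one defined in this module."""

    def describe(self) -> str: ...


class FakeIntegrator:
    """Hypothetical concrete class that lives in another module.

    It satisfies the protocol structurally; explicit subclassing is optional.
    """

    def describe(self) -> str:
        return "fake integrator"


def render_summary(integrator: SummaryProtocol) -> str:
    # The consumer depends only on the protocol, so it never has to import
    # the concrete class and no import cycle is created.
    return f"summary: {integrator.describe()}"


summary = render_summary(FakeIntegrator())
```

A static type checker accepts `FakeIntegrator` here because it has a matching `describe` method. The classes documented below are additionally listed as explicit subclasses of their protocols.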

Classes

class DocuVisionIntegratorProtocol (*args, **kwargs)

Define externally accessed methods for the DocuVisionIntegrator class. See ./app/integrators/docuvision_integrator.py for class implementation.

Ancestors

typing.Protocol
typing.Generic

Subclasses

DocuVisionIntegrator

Class variables

var api_key : str
var base_path : str
var base_url : str
var page_map : dict[str, dict[str, list[int]]]
var split_by_pid : bool

Instance variables

prop pdf_library : dict[str, typing.Any]: Dict of {case_doc_id: PDFLibEntry} where each entry represents the combined_pdf of a DocuVisionCase created by a child DocuVisionTask instance.

Merged into S3Batch.pdf_library in aws_s3_batch.py for file matching and other downstream processes.
prop pdfs_by_doc_id : dict[str, bytes]: Returns a dict of pdf bytes split and/or concatenated by task pids.
prop results : dict[str, list[dict[str, str]]]: Process responses from all child tasks to produce results according to the current instance settings.

Methods

def create_tasks(self, documents: dict[str, typing.Any] | None = None)

Obtain upload location and post PDFs. If mock_ids were defined, create mock tasks for each id and collect the existing responses.

Args

documents : dict[str, lu.PDFLibProto]: dict of {filename: PDFLibProto} where PDFLibProto is a namedtuple of (body, meta) where body is a bytes object and meta is a dict of metadata for the pdf. Optional. Extends self.documents if supplied.

def job_dict_entries(self, extracted_data: dict[str, dict[str, typing.Any]]) ‑> dict[str, dict[str, typing.Any]]

Dict of {job_id: job_dict} for all tasks in self._tasks where each job_dict contains values for db columns 'input', 'comments', and 'note'.

Called by aws_s3_batch.py to recombine the values for the columns noted above with their corresponding output from table_transformer.py.

Args

extracted_data : dict[str, dict[str, Any]]: TableTransformer output data supplied from aws_s3_batch.S3Batch.transformed.

Returns

dict[str, dict[str, Any]]: dict of {job_id: job_dict} for all tasks in self._tasks.
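
The recombination described above might look roughly like the following sketch. It assumes, without confirmation from the source, that `extracted_data` is keyed by job_id and that each task exposes `job_id`, `input`, `comments`, and `note` values; `merge_job_dicts` and the task dicts are hypothetical stand-ins, not the actual implementation.

```python
from typing import Any

DB_COLUMNS = ("input", "comments", "note")


def merge_job_dicts(
    tasks: list[dict[str, Any]],
    extracted_data: dict[str, dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Hypothetical helper: build {job_id: job_dict} for every task."""
    merged: dict[str, dict[str, Any]] = {}
    for task in tasks:
        job_id = task["job_id"]
        # Start from the transformer output (assumed to be keyed by job_id).
        job_dict = dict(extracted_data.get(job_id, {}))
        # Add the column values that the task itself carries.
        job_dict.update({col: task[col] for col in DB_COLUMNS})
        merged[job_id] = job_dict
    return merged


tasks = [{"job_id": "j1", "input": "a.pdf", "comments": "", "note": "ok"}]
entries = merge_job_dicts(tasks, {"j1": {"patient": "example"}})
```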

def reset(self, **kwargs)

Reset the dataclass to prepare for a new facility by clearing all documents, tasks, results, and internal variables.

KwArgs

documents : dict[str, lu.PDFLibProto]: dict of {filename: PDFLibProto} where PDFLibProto is a namedtuple of (body, meta) where body is a bytes object and meta is a dict of metadata for the pdf.
page_map : dict[str, dict[str, list[int]]]: dict of {new_doc_id: {old_doc_id: [page_nums]}} where new_doc_id is the doc_id for the combined pdf created by docuvision and old_doc_id is the doc_id for the original pdf. page_nums is a list of page numbers from the original pdf that were included in the combined pdf.
out_dir : str: path to directory where output json files will be written. If None, no files will be written.
split_by_pid : bool: if True, docuvision will split each pdf into separate documents based on patient id. If False, docuvision will combine all pdfs into a single document.
fail_on_error : bool: if True, raise an error if any task fails to post or any response fails to be collected. Defaults to False.
mock_ids : list[int]: list of task_ids to use for mock responses. Defaults to [].
default_dos : str: default date of service to use if no date of service is extracted from the facesheet. Defaults to gvars.DEFAULT_DOS.
api_secret_name : str: name of secret in AWS secretsmanager containing the base_url, base_path, and api_key values for the docuvision API.
dv_preferred_networks : list[str] | None: list of preferred Docuvision Neural Networks. Created for facilities where people manually upload 1-page PDFs.
table_converters : dict[str, Callable[[list[str]], list[dict[str, str]]]]: function reference for processing '*Table' labels returned by DV-1.
dv_required_page_types : set[str]: if supplied, a case will only be created for a pid if at least one of the pages assigned to that pid has a type in this set.
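
To make the nested `page_map` shape concrete, here is a small sketch. The doc ids and page numbers are invented for illustration, and `source_pages` is a hypothetical helper that is not part of the documented class.

```python
# Shape: {new_doc_id: {old_doc_id: [page_nums]}}
page_map: dict[str, dict[str, list[int]]] = {
    "combined_001": {"scan_a": [1, 2], "scan_b": [4]},
}


def source_pages(
    page_map: dict[str, dict[str, list[int]]], new_doc_id: str
) -> list[tuple[str, int]]:
    """Hypothetical helper: list (old_doc_id, page_num) pairs for a combined pdf."""
    return [
        (old_doc_id, page_num)
        for old_doc_id, page_nums in page_map[new_doc_id].items()
        for page_num in page_nums
    ]


pairs = source_pages(page_map, "combined_001")
```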

def tables_for(self, doc_id: str, section: str = 'DocuVision', sep: str = '.') ‑> dict[str, list[dict[str, str]]]

Get results for the supplied doc_id in a tabular format suitable for downstream processing in table_transformer.py.

Args

doc_id : str: doc_id for the document to retrieve results for.
section : str: section name to use for the table. Defaults to "DocuVision".
sep : str: separator to use for table keys. Defaults to ".".

Returns

dict[str, list[dict[str, str]]]: dict of {table_name: [table_rows]} where each table_row is a dict of {label: value}.
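
The return shape can be sketched with invented data. The table names, labels, and values below are hypothetical, and joining `section` and a table name with `sep` is only an assumption about how the keys are built; the source does not spell this out.

```python
# Shape: {table_name: [table_rows]}, each table_row being {label: value}.
tables: dict[str, list[dict[str, str]]] = {
    "DocuVision.Patient": [{"FirstName": "Jane", "LastName": "Doe"}],
    "DocuVision.Procedures": [{"Code": "00100"}, {"Code": "00300"}],
}


def row_counts(tables: dict[str, list[dict[str, str]]]) -> dict[str, int]:
    """Hypothetical consumer: count the rows in each table."""
    return {name: len(rows) for name, rows in tables.items()}


counts = row_counts(tables)
```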

def task_attr_list(self, attr: str) ‑> list

List the specified attribute for all tasks in self._tasks.
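
A minimal sketch of what such a method could look like, assuming tasks are plain objects whose attributes can be read with `getattr`. `SketchTask` and `SketchIntegrator` are hypothetical stand-ins, not the real task or integrator classes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchTask:
    """Hypothetical task with two illustrative attributes."""

    task_id: int
    status: str


@dataclass
class SketchIntegrator:
    """Hypothetical integrator holding its tasks in self._tasks."""

    _tasks: list[SketchTask] = field(default_factory=list)

    def task_attr_list(self, attr: str) -> list[Any]:
        # Read the same attribute from every task, preserving task order.
        return [getattr(task, attr) for task in self._tasks]


integrator = SketchIntegrator(
    _tasks=[SketchTask(1, "done"), SketchTask(2, "pending")]
)
statuses = integrator.task_attr_list("status")
```

In this sketch an unknown attribute name raises `AttributeError`; whether the real method behaves the same way is not stated in the source.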

class ProviderIntegratorProtocol (*args, **kwargs)

Define externally accessed methods for the ProviderIntegrator class.

Ancestors

typing.Protocol
typing.Generic

Subclasses

ProviderIntegrator

Class variables

var api_url : str
var full_name : vStr
var is_anes_provider : bool
var mode : str | None
var npi : str | vStr
var public_only : bool

Instance variables

prop last_api_response : dict[str, typing.Any] | None: last public API query response
prop last_url_params : dict[str, typing.Any]: last public API query parameters
prop query_name: name of last query function

Methods

def search(self, is_anes_provider: bool, full_name: str | vStr, npi: str | vStr = '', mode: str | None = None) ‑> tuple[vStr, vStr]: progressively search local DB and public API with supplied provider data and return (a) fully populated vStr objects for the provider name and NPI upon a successful lookup or (b) an 'original value only' vStr object for the provider name and a "null" vStr upon failure.