Module utilities.protocols

Define protocols for external classes that cannot be imported directly because of circular imports.
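
As a rough illustration of why a protocol helps here, the sketch below type-checks a caller against a trimmed-down protocol instead of importing the concrete class. `IntegratorLike`, `FakeIntegrator`, and `describe` are hypothetical names invented for this example; only `split_by_pid` and `task_attr_list` come from the documented interface.

```python
from typing import Protocol


class IntegratorLike(Protocol):
    """Hypothetical, trimmed-down protocol; not the real class definition."""

    split_by_pid: bool

    def task_attr_list(self, attr: str) -> list: ...


class FakeIntegrator:
    """Stand-in for a concrete class that would live in another module."""

    split_by_pid = True

    def task_attr_list(self, attr: str) -> list:
        # Assumption for the demo: pretend there is exactly one task.
        return [attr]


def describe(integrator: IntegratorLike) -> str:
    # Only the protocol is referenced here, so this module never has to
    # import the concrete class (which is what would create the import cycle).
    mode = "split" if integrator.split_by_pid else "combined"
    return f"{mode}:{len(integrator.task_attr_list('task_id'))}"
```

Because protocols are matched structurally, `FakeIntegrator` satisfies `IntegratorLike` without inheriting from it.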

Classes

class DocuVisionIntegratorProtocol (*args, **kwargs)

Define externally accessed methods for the DocuVisionIntegrator class. See ./app/integrators/docuvision_integrator.py for class implementation.

Ancestors

  • typing.Protocol
  • typing.Generic

Class variables

var api_key : str
var base_path : str
var base_url : str
var page_map : dict[str, dict[str, list[int]]]
var split_by_pid : bool

Instance variables

prop pdf_library : dict[str, typing.Any]

Dict of {case_doc_id: PDFLibEntry} where each entry represents the combined_pdf of a DocuVisionCase created by a child DocuVisionTask instance.

Merged into S3Batch.pdf_library in aws_s3_batch.py for file matching and other downstream processes.

prop pdfs_by_doc_id : dict[str, bytes]

Returns a dict of pdf bytes split and/or concatenated by task pids.

prop results : dict[str, list[dict[str, str]]]

Process responses from all child tasks to produce results according to the current instance settings.

Methods

def create_tasks(self, documents: dict[str, typing.Any] | None = None)

Obtain upload location and post PDFs. If mock_ids were defined, create mock tasks for each id and collect the existing responses.

Args

documents : dict[str, lu.PDFLibProto]
Optional dict of {filename: PDFLibProto}, where PDFLibProto is a namedtuple of (body, meta): body is a bytes object and meta is a dict of metadata for the PDF. Extends self.documents if supplied.

def job_dict_entries(self, extracted_data: dict[str, dict[str, typing.Any]]) ‑> dict[str, dict[str, typing.Any]]

Dict of {job_id: job_dict} for all tasks in self._tasks where each job_dict contains values for db columns 'input', 'comments', and 'note'.

Called by aws_s3_batch.py to recombine the values for the columns noted above with their corresponding output from table_transformer.py.

Args

extracted_data : dict[str, dict[str, Any]]
TableTransformer output data supplied from aws_s3_batch.S3Batch.transformed.

Returns

dict[str, dict[str, Any]]
dict of {job_id: job_dict} for all tasks in self._tasks.
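
The recombination step described above might look roughly like the following. This is only a sketch: `merge_job_dicts` is a hypothetical helper, and it assumes that `extracted_data` is keyed by the same job_id values as the per-task column dicts, which the documentation does not state.

```python
from typing import Any


def merge_job_dicts(
    job_columns: dict[str, dict[str, Any]],
    extracted_data: dict[str, dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Combine per-job 'input'/'comments'/'note' values with extracted output."""
    merged: dict[str, dict[str, Any]] = {}
    for job_id, columns in job_columns.items():
        # Start from the transformer output (if any), then layer the
        # task-level columns on top so they take precedence on key clashes.
        merged[job_id] = {**extracted_data.get(job_id, {}), **columns}
    return merged


# Made-up sample data for illustration only.
job_columns = {"job-1": {"input": "a.pdf", "comments": "", "note": "ok"}}
extracted = {"job-1": {"patient_name": "DOE, JANE"}}
result = merge_job_dicts(job_columns, extracted)
```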

def reset(self, **kwargs)

Reset the dataclass to prepare for a new facility by clearing all documents, tasks, results, and internal variables.

KwArgs

documents : dict[str, lu.PDFLibProto]
dict of {filename: PDFLibProto} where PDFLibProto is a namedtuple of (body, meta) where body is a bytes object and meta is a dict of metadata for the pdf.
page_map : dict[str, dict[str, list[int]]]
dict of {new_doc_id: {old_doc_id: [page_nums]}} where new_doc_id is the doc_id for the combined pdf created by docuvision and old_doc_id is the doc_id for the original pdf. page_nums is a list of page numbers from the original pdf that were included in the combined pdf.
out_dir : str
path to directory where output json files will be written. If None, no files will be written.
split_by_pid : bool
if True, docuvision will split each pdf into separate documents based on patient id. If False, docuvision will combine all pdfs into a single document.
fail_on_error : bool
if True, raise an error if any task fails to post or any response fails to be collected. Defaults to False.
mock_ids : list[int]
list of task_ids to use for mock responses. Defaults to [].
default_dos : str
default date of service to use if no date of service is extracted from the facesheet. Defaults to gvars.DEFAULT_DOS.
api_secret_name : str
name of secret in AWS secretsmanager containing the base_url, base_path, and api_key values for the docuvision API.
dv_preferred_networks : list[str] | None
list of preferred DocuVision neural networks. Created for facilities where people manually upload 1-page PDFs.
table_converters : dict[str, Callable[[list[str]], list[dict[str, str]]]]
dict of function references for processing '*Table' labels returned by DV-1.
dv_required_page_types : set[str]
if supplied, a case will only be created for a pid if at least one of the pages assigned to that pid has a type in this set.
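
To make the nested page_map shape concrete, the sketch below walks a small, made-up mapping and lists which original pages feed a combined document. The doc ids and the `source_pages` helper are hypothetical and exist only for this example.

```python
# Shape: {new_doc_id: {old_doc_id: [page_nums]}}; the values are invented.
page_map: dict[str, dict[str, list[int]]] = {
    "combined-1": {"orig-a": [1, 2], "orig-b": [4]},
}


def source_pages(
    page_map: dict[str, dict[str, list[int]]], new_doc_id: str
) -> list[tuple[str, int]]:
    """Return (old_doc_id, page_num) pairs that make up new_doc_id."""
    return [
        (old_doc_id, page_num)
        for old_doc_id, page_nums in page_map.get(new_doc_id, {}).items()
        for page_num in page_nums
    ]
```

An unknown new_doc_id simply yields an empty list, since the lookup falls back to an empty dict.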

def tables_for(self, doc_id: str, section: str = 'DocuVision', sep: str = '.') ‑> dict[str, list[dict[str, str]]]

Get results for the supplied doc_id in a tabular format suitable for downstream processing in table_transformer.py.

Args

doc_id : str
doc_id for the document to retrieve results for.
section : str
section name to use for the table. Defaults to "DocuVision".
sep : str
separator to use for table keys. Defaults to ".".

Returns

dict[str, list[dict[str, str]]]
dict of {table_name: [table_rows]} where each table_row is a dict of {label: value}.
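
A consumer of this return shape could iterate it as sketched below. The sample table name "DocuVision.Facesheet" is an assumption about how section and sep might be combined into a key; the documentation does not spell out the key format. `iter_cells` is a hypothetical helper.

```python
from collections.abc import Iterator


def iter_cells(
    tables: dict[str, list[dict[str, str]]],
) -> Iterator[tuple[str, int, str, str]]:
    """Yield (table_name, row_index, label, value) for every cell."""
    for table_name, rows in tables.items():
        for row_index, row in enumerate(rows):
            for label, value in row.items():
                yield table_name, row_index, label, value


# Made-up data in the documented {table_name: [table_rows]} shape.
sample_tables = {
    "DocuVision.Facesheet": [
        {"patient_name": "DOE, JANE", "dob": "1970-01-01"},
    ],
}
cells = list(iter_cells(sample_tables))
```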

def task_attr_list(self, attr: str) ‑> list

List the specified attribute for all tasks in self._tasks.
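
One plausible way to satisfy this method is a `getattr` comprehension over the task list, sketched below. `DemoTask` and `DemoIntegrator` are hypothetical stand-ins; the real implementation may differ.

```python
from dataclasses import dataclass, field


@dataclass
class DemoTask:
    """Hypothetical task with two illustrative attributes."""

    task_id: int
    status: str


@dataclass
class DemoIntegrator:
    """Hypothetical stand-in that only models the _tasks list."""

    _tasks: list[DemoTask] = field(default_factory=list)

    def task_attr_list(self, attr: str) -> list:
        # Raises AttributeError if a task lacks the requested attribute.
        return [getattr(task, attr) for task in self._tasks]


demo = DemoIntegrator([DemoTask(1, "done"), DemoTask(2, "pending")])
```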

class ProviderIntegratorProtocol (*args, **kwargs)

Define externally accessed methods for the ProviderIntegrator class.

Ancestors

  • typing.Protocol
  • typing.Generic

Class variables

var api_url : str
var full_name : vStr
var is_anes_provider : bool
var mode : str | None
var npi : str | vStr
var public_only : bool

Instance variables

prop last_api_response : dict[str, typing.Any] | None

last public API query response

prop last_url_params : dict[str, typing.Any]

last public API query parameters

prop query_name

name of last query function

Methods

def search(self, is_anes_provider: bool, full_name: str | vStr, npi: str | vStr = '', mode: str | None = None) ‑> tuple[vStr, vStr]

Progressively search the local DB and the public API with the supplied provider data. On a successful lookup, return fully populated vStr objects for the provider name and NPI. On failure, return an 'original value only' vStr object for the provider name and a "null" vStr for the NPI.