Module utilities.protocols
Define protocols for inaccessible (due to circular imports) external classes
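The general technique can be sketched as follows. This is a minimal illustration rather than the module's actual code: IntegratorLike, FakeIntegrator, and count_library_entries are hypothetical names. The consumer is typed against a typing.Protocol, so it never has to import the concrete class.

```python
from typing import Any, Protocol


class IntegratorLike(Protocol):
    """Hypothetical protocol: lists only the members that callers use."""

    @property
    def pdf_library(self) -> dict[str, Any]: ...

    def task_attr_list(self, attr: str) -> list: ...


def count_library_entries(integrator: IntegratorLike) -> int:
    """Depends on the protocol only, so the concrete class is never imported."""
    return len(integrator.pdf_library)


class FakeIntegrator:
    """Stand-in for a concrete class; it does not inherit from the protocol."""

    def __init__(self) -> None:
        self._library: dict[str, Any] = {"doc-1": b"%PDF-1.7"}

    @property
    def pdf_library(self) -> dict[str, Any]:
        return self._library

    def task_attr_list(self, attr: str) -> list:
        return []


# Structural typing: FakeIntegrator satisfies IntegratorLike by shape alone.
entry_count = count_library_entries(FakeIntegrator())
```

Because matching is structural, the concrete class can live in a module that imports the consumer without creating an import cycle.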
Classes
class DocuVisionIntegratorProtocol (*args, **kwargs)
-
Define externally accessed methods for the DocuVisionIntegrator class. See ./app/integrators/docuvision_integrator.py for class implementation.
Ancestors
- typing.Protocol
- typing.Generic
Class variables
var api_key : str
var base_path : str
var base_url : str
var page_map : dict[str, dict[str, list[int]]]
var split_by_pid : bool
Instance variables
prop pdf_library : dict[str, typing.Any]
-
Dict of {case_doc_id: PDFLibEntry} where each entry represents the combined_pdf of a DocuVisionCase created by a child DocuVisionTask instance.
Merged into S3Batch.pdf_library in aws_s3_batch.py for file matching and other downstream processes.
prop pdfs_by_doc_id : dict[str, bytes]
-
Returns a dict of pdf bytes split and/or concatenated by task pids.
prop results : dict[str, list[dict[str, str]]]
-
Process responses from all child tasks to produce results according to the current instance settings.
Methods
def create_tasks(self, documents: dict[str, typing.Any] | None = None)
-
Obtain upload location and post PDFs. If mock_ids are defined, create mock tasks for each id and collect the existing responses.
Args
documents
:dict[str, lu.PDFLibProto]
- dict of {filename: PDFLibProto} where PDFLibProto is a namedtuple of (body, meta) where body is a bytes object and meta is a dict of metadata for the pdf. Optional. Extends self.documents if supplied.
def job_dict_entries(self, extracted_data: dict[str, dict[str, typing.Any]]) ‑> dict[str, dict[str, typing.Any]]
-
Dict of {job_id: job_dict} for all tasks in self._tasks where each job_dict contains values for db columns 'input', 'comments', and 'note'.
Called by aws_s3_batch.py to recombine the values for the columns noted above with their corresponding output from table_transformer.py.
Args
extracted_data
:dict[str, dict[str, Any]]
- TableTransformer output data supplied from aws_s3_batch.S3Batch.transformed.
Returns
dict[str, dict[str, Any]]
- dict of {job_id: job_dict} for all tasks in self._tasks.
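The recombination step described above might look like the sketch below. It assumes that extracted_data is keyed by the same job_id values as the job dicts, which the source does not state; merge_job_entries and all sample values are hypothetical.

```python
from typing import Any


def merge_job_entries(
    job_entries: dict[str, dict[str, Any]],
    extracted_data: dict[str, dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Overlay each job_dict on the transformer output for the same job_id.

    Assumption: both mappings share job_id keys. Values from job_entries
    ('input', 'comments', 'note') win on key collisions.
    """
    merged: dict[str, dict[str, Any]] = {}
    for job_id, job_dict in job_entries.items():
        merged[job_id] = {**extracted_data.get(job_id, {}), **job_dict}
    return merged


# Hypothetical sample data for illustration only.
job_entries = {"job-1": {"input": "a.pdf", "comments": "", "note": "n"}}
extracted = {"job-1": {"patient_name": "DOE"}, "job-2": {"patient_name": "ROE"}}
merged = merge_job_entries(job_entries, extracted)
```

Only job ids present in job_entries appear in the result, so transformer rows without a matching task are dropped in this sketch.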
def reset(self, **kwargs)
-
Reset the dataclass to prepare for a new facility by clearing all documents, tasks, results, and internal variables.
KwArgs
documents
:dict[str, lu.PDFLibProto]
- dict of {filename: PDFLibProto} where PDFLibProto is a namedtuple of (body, meta) where body is a bytes object and meta is a dict of metadata for the pdf.
page_map
:dict[str, dict[str, list[int]]]
- dict of {new_doc_id: {old_doc_id: [page_nums]}} where new_doc_id is the doc_id for the combined pdf created by docuvision and old_doc_id is the doc_id for the original pdf. page_nums is a list of page numbers from the original pdf that were included in the combined pdf.
out_dir
:str
- path to directory where output json files will be written. If None, no files will be written.
split_by_pid
:bool
- if True, docuvision will split each pdf into separate documents based on patient id. If False, docuvision will combine all pdfs into a single document.
fail_on_error
:bool
- if True, raise an error if any task fails to post or any response fails to be collected. Defaults to False.
mock_ids
:list[int]
- list of task_ids to use for mock responses. Defaults to [].
default_dos
:str
- default date of service to use if no date of service is extracted from the facesheet. Defaults to gvars.DEFAULT_DOS.
api_secret_name
:str
- name of secret in AWS secretsmanager containing the base_url, base_path, and api_key values for the docuvision API.
dv_preferred_networks
:list[str] | None
- list of preferred Docuvision Neural Networks. Created for facilities where users manually upload 1-page PDFs.
table_converters
:dict[str, Callable[[list[str]], list[dict[str, str]]]]
- function reference for processing '*Table' labels returned by DV-1.
dv_required_page_types
:set[str]
- if supplied, a case will only be created for a pid if at least one of the pages assigned to that pid has a type in this set.
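To make the page_map shape concrete, here is a small sketch. The document ids and the source_pages helper are hypothetical and not part of the class.

```python
# {new_doc_id: {old_doc_id: [page_nums]}}, with made-up ids for illustration.
page_map: dict[str, dict[str, list[int]]] = {
    "combined-001": {"upload-A": [1, 2], "upload-B": [4]},
}


def source_pages(
    page_map: dict[str, dict[str, list[int]]], new_doc_id: str
) -> list[tuple[str, int]]:
    """List (old_doc_id, page_num) pairs that make up one combined PDF."""
    return [
        (old_doc_id, page_num)
        for old_doc_id, page_nums in page_map.get(new_doc_id, {}).items()
        for page_num in page_nums
    ]


pages = source_pages(page_map, "combined-001")
```

An unknown new_doc_id yields an empty list rather than raising, which is a design choice of this sketch only.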
def tables_for(self, doc_id: str, section: str = 'DocuVision', sep: str = '.') ‑> dict[str, list[dict[str, str]]]
-
Get results for the supplied doc_id in a tabular format suitable for downstream processing in table_transformer.py.
Args
doc_id
:str
- doc_id for the document to retrieve results for.
section
:str
- section name to use for the table. Defaults to "DocuVision".
sep
:str
- separator to use for table keys. Defaults to ".".
Returns
dict[str, list[dict[str, str]]]
- dict of {table_name: [table_rows]} where each table_row is a dict of {label: value}.
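The return shape can be illustrated with the sketch below. It assumes that table keys are formed by joining section and a table name with sep; the source does not confirm this, and build_tables plus the sample labels are hypothetical.

```python
def build_tables(
    raw_tables: dict[str, list[dict[str, str]]],
    section: str = "DocuVision",
    sep: str = ".",
) -> dict[str, list[dict[str, str]]]:
    """Return {table_name: [table_rows]}, each row being {label: value}.

    Assumption: table_name is f"{section}{sep}{name}".
    """
    return {
        f"{section}{sep}{name}": [dict(row) for row in rows]
        for name, rows in raw_tables.items()
    }


# Hypothetical input for illustration only.
raw = {"MedicationTable": [{"drug": "propofol", "dose": "200 mg"}]}
tables = build_tables(raw)
custom = build_tables(raw, section="DV", sep="/")
```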
def task_attr_list(self, attr: str) ‑> list
-
List the specified attribute for all tasks in self._tasks.
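A method with this behaviour could be as small as the sketch below. FakeTask and its fields are hypothetical stand-ins for the real task objects, and the free function takes the task list explicitly instead of reading self._tasks.

```python
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass
class FakeTask:
    """Hypothetical task with two illustrative attributes."""

    task_id: int
    status: str


def task_attr_list(tasks: Iterable[Any], attr: str) -> list:
    """Collect one attribute from every task, in task order."""
    return [getattr(task, attr) for task in tasks]


tasks = [FakeTask(1, "done"), FakeTask(2, "pending")]
statuses = task_attr_list(tasks, "status")
```

In this sketch a misspelled attribute name raises AttributeError rather than being silently skipped.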
class ProviderIntegratorProtocol (*args, **kwargs)
-
Define externally accessed methods for the ProviderIntegrator class.
Ancestors
- typing.Protocol
- typing.Generic
Class variables
var api_url : str
var full_name : vStr
var is_anes_provider : bool
var mode : str | None
var npi : str | vStr
var public_only : bool
Instance variables
prop last_api_response : dict[str, typing.Any] | None
-
last public API query response
prop last_url_params : dict[str, typing.Any]
-
last public API query parameters
prop query_name
-
name of last query function
Methods
def search(self, is_anes_provider: bool, full_name: str | vStr, npi: str | vStr = '', mode: str | None = None) ‑> tuple[vStr, vStr]
-
Progressively search the local DB and the public API with the supplied provider data. Returns (a) fully populated vStr objects for the provider name and NPI on a successful lookup, or (b) an 'original value only' vStr object for the provider name and a "null" vStr on failure.
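The progressive lookup with its two outcomes can be sketched as below. vStr is not defined in this module's documentation, so FakeVStr is a hypothetical stand-in, and the lookup callables and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FakeVStr:
    """Hypothetical stand-in for vStr: an original value plus a validated one."""

    original: str
    validated: Optional[str] = None


# A lookup returns (name, npi) on a hit, or None on a miss.
Lookup = Callable[[str, str], Optional[tuple[str, str]]]


def progressive_search(
    full_name: str, npi: str, lookups: list[Lookup]
) -> tuple[FakeVStr, FakeVStr]:
    """Try each source in order and stop at the first hit."""
    for lookup in lookups:
        hit = lookup(full_name, npi)
        if hit is not None:
            found_name, found_npi = hit
            return FakeVStr(full_name, found_name), FakeVStr(npi, found_npi)
    # Failure: 'original value only' name and an empty ("null") NPI.
    return FakeVStr(full_name), FakeVStr("")


def local_db(name: str, npi: str) -> Optional[tuple[str, str]]:
    return None  # simulated miss in the local DB


def public_api(name: str, npi: str) -> Optional[tuple[str, str]]:
    # Simulated public API hit for one made-up provider.
    return ("JANE DOE", "1234567890") if name.lower() == "jane doe" else None


hit_name, hit_npi = progressive_search("Jane Doe", "", [local_db, public_api])
miss_name, miss_npi = progressive_search("Nobody", "", [local_db, public_api])
```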