Package dbops

Init file for the db_push package containing all DB operations.

Sub-modules

dbops.create_batch

DB function for creating new batch IDs.

dbops.db_dataclasses

Dataclasses and NamedTuples representing DB objects.

dbops.db_utils

DB utility functions.

dbops.insurance_providers

Selects and updates for records in common_db.shared.insurance_providers.

dbops.providers

DB functions for pushing providers to the facility list.

dbops.push_analysis_jobs

Inserts and updates for ClaimMaker DB tables analysis_jobs and analysis_job_claims

dbops.selects

DB select functions for jobs.

dbops.state_enums

Defines enums for batch and job states, for convenient access.

Functions

def clone_as_new_case(existing: dict[str, typing.Any]) ‑> dict[str, typing.Any]

Create a copy of a case from the DB such that it can be imported under a new ID. Also clears patient_info, schedule, and input from the existing case to avoid updates to the existing case's DB data.
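
A minimal usage sketch; the field names in existing_case beyond those documented (patient_info, schedule, input) are illustrative:

    from dbops import clone_as_new_case

    existing_case = {
        "id": 1234,  # hypothetical case id
        "patient_info": {"name": "DOE, JANE"},
        "schedule": {"dos": "2024-01-15"},
        "input": {"source": "facesheet.pdf"},
    }
    new_case = clone_as_new_case(existing_case)
    # Per the docstring, patient_info, schedule, and input are cleared on the
    # existing dict so later writes cannot touch the original case's DB data.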

def create_batches(extracted_dict: dict[str, dict[str, typing.Any]], provider_name: str, provider_facility: str, first_dos: str, mock: bool = False)

Iterate through job data, creating batches when required. Returns a list of created and/or pre-existing batch IDs, one for each job in the extracted dict. An example call follows.
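
A hedged example (provider and facility values are hypothetical, and the assumption is that mock=True avoids real DB writes):

    from dbops import create_batches

    extracted = {"doe_jane_20240115.pdf": {"patient_info": {"name": "DOE, JANE"}}}
    batch_ids = create_batches(
        extracted,
        provider_name="NPH LLC",       # hypothetical provider
        provider_facility="chaph",     # hypothetical facility code
        first_dos="2024-01-15",
        mock=True,                     # assumption: skip real DB writes
    )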

def db_push(json_result: dict, patient_identifier: str)

Push data extracted for a single patient to a ClaimMaker DB.

Args

json_result : dict
data extracted for a single patient in the Hank AI format. Must contain a valid "batch_id" entry, generally set by a prior call to the create_batches() function defined in dbops/create_batch.py. If json_result["datasource"] == "database", json_result must also contain a valid "id" entry, and the json_result data will update the job having that id rather than being inserted as a new record.
patient_identifier : str
unique identifier for the patient encounter, typically the source filename.

Returns

bool
True if successful, False otherwise.
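
A sketch of a typical call, assuming a batch id obtained from create_batches(); keys other than the documented "batch_id", "datasource", and "id" are illustrative:

    from dbops import db_push

    json_result = {
        "batch_id": 42,
        "datasource": "pdf",  # "database" would instead update the job with json_result["id"]
        "patient_info": {"name": "DOE, JANE"},
    }
    if not db_push(json_result, patient_identifier="doe_jane_20240115.pdf"):
        print("push failed")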
def existing_entities_with_new_as_pdf(pt_dict: dict[str, typing.Any]) ‑> dict[str, str | list[dict[str, str | bool]]]

Get the entities currently attached to the case and append a newly received doc in the form of a PDF. Used when new docs are received after the case has been charge entered. pt_dict must contain a valid extracted documentation set (at least one JSON-type doc) and a valid job id, or an error will result.

def initialize_data_entry_fields(job: dict[str, typing.Any]) ‑> dict[str, typing.Any]

Initialize data entry fields as vStr custom attributes.

def job_library(provider_facility: str, dates_of_service: collections.abc.Sequence[str], job_ids: collections.abc.Sequence[int] = (-1,)) ‑> dict[str, dict[str, typing.Any]]

Get a dictionary of jobs for a batch specified by facility and date of service. Principal use case is matching previously created jobs that have no documents to newly extracted documents as they come in.
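
For example (the facility code and date are illustrative):

    from dbops import job_library

    jobs = job_library("chaph", dates_of_service=("2024-01-15",))
    for key, job in jobs.items():
        print(key, job)  # the inner dict's keys are not assumed here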

def minimized_query(query_string) ‑> str

Convenience function for eliminating superfluous whitespace from query strings.
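
For example, assuming the function collapses runs of whitespace to single spaces:

    from dbops import minimized_query

    q = minimized_query("""
        SELECT id, state
        FROM jobs
        WHERE batch_id = %s
    """)
    # q == "SELECT id, state FROM jobs WHERE batch_id = %s" (under the assumption above)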

def select_insurance_provider(insurance_entry: dict[str, Any]) ‑> InsuranceRecord

Given an extracted patient_info->'insurance' entry, find matches from common_db.shared.insurance_providers.

Args

insurance_entry : dict[str, Any]
a patient_info->'insurance' entry from an extracted case

Returns

InsuranceRecord
the matching record from insurance_providers (no result is returned if more than one match is discovered).
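
A usage sketch; the entry's keys are illustrative, and treating "no result" as a falsy return value is an assumption:

    from dbops import select_insurance_provider

    entry = {"name": "ACME HEALTH PLAN", "payer_id": "12345"}  # hypothetical keys
    record = select_insurance_provider(entry)
    if not record:  # assumption: zero matches or an ambiguous match yields a falsy result
        print("no unambiguous match")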
def select_providers(predicate: str, predicate_params: tuple, strict=True) ‑> list[ProviderRecord]

Supplied with a WHERE clause and parameters to populate it, return a list of matching providers from the local table.

Args

predicate : str
the query predicate. 'where ' is assumed (DO NOT SUPPLY 'WHERE ')
predicate_params : tuple
tuple of params matching %s entries in predicate
strict : bool
if True, add facility condition to supplied predicate

Returns

list[ProviderRecord]
list of matched providers from the local list
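
For example, matching on NPI (the column name is an assumption; note the predicate carries no leading 'WHERE'):

    from dbops import select_providers

    matches = select_providers(
        "npi = %s",          # hypothetical column name
        ("1234567890",),
        strict=False,        # skip the implicit facility condition
    )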
def set_batch_state(batch_id: int | str, state: str) ‑> int

Set the batch with the specified id to the supplied state value. Raises ValueError if the batch does not exist or state is not one of: GatheringInformation, New, Reconciled, Assigned, In-progress, Completed, ChargesEntered, Error.
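
For example (the batch id is illustrative):

    from dbops import set_batch_state

    try:
        set_batch_state(42, "Reconciled")
    except ValueError:
        print("unknown batch or invalid state")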

def set_db_globals(db_host, db_user, db_pwd, db_name, no_db) ‑> bool

Sets global variables for the module, to be used in all subsequent DB ops. Called by extract_s3.py.

def simple_select(select_list: str, table_expression: str, sort_specification: str = '', *, params: tuple[typing.Any, ...] | dict[str, typing.Any] = (), formatter: collections.abc.Callable[[collections.abc.Sequence[tuple[typing.Any, ...]]], ~T] = <function _formatter_default>) ‑> ~T

Execute a generic query and return the results. Use the formatter kwarg to specify a custom output type; e.g., if cursor.fetchall() returns a list of 2-length tuples with unique hashables in the first position, passing formatter=dict will return a dict with appropriate typing. The first 3 arguments will be compiled into the select query as follows: f"SELECT {select_list} FROM {table_expression} {sort_specification}" Refer to https://www.postgresql.org/docs/10/queries-overview.html. Note that sort_specification includes everything after the table_expression, including all WHERE conditions, GROUP BYs, and ORDER BYs. The caller MUST supply all required keywords (i.e. "WHERE", "GROUP BY", etc.).
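
A sketch of the formatter=dict case described above (table and column names are hypothetical):

    from dbops import simple_select

    states_by_doc = simple_select(
        "documentId, state",                        # hypothetical columns
        "jobs",                                     # hypothetical table
        "WHERE batchId = %s ORDER BY documentId",   # caller supplies the WHERE keyword
        params=(42,),
        formatter=dict,  # 2-tuples with unique first elements -> dict
    )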

def simple_select_from_common(select_list: str, table_expression: str, sort_specification: str = '', *, params: tuple[typing.Any, ...] | dict[str, typing.Any] = (), formatter: collections.abc.Callable[[collections.abc.Sequence[tuple[typing.Any, ...]]], ~T] = <function _formatter_default>) ‑> ~T

Same as simple_select() defined above, but targets common_db in place of the facility-specific DB. See the simple_select() docstring for more info.

def unmatched_pdf_reference(provider_facility: str, dates_of_service: collections.abc.Sequence[str], document_id_not_like: str = '%_combined.pdf', state_in: collections.abc.Sequence[str] = ('New', 'InProgress', 'OnHold')) ‑> dict[str, utilities.library_utils.PDFLibReference]

Get all PDFs that have NOT been combined (i.e., documentId does not end in '_combined.pdf'). Allows intraop records and facesheets to be combined and re-extracted when the two documents don't post on the same day. nphllc.com's chaph facility is the principal use case.

def update_batch_dos(batch_id: int | str, job_ids: collections.abc.Sequence[str] | collections.abc.Sequence[int], new_dos: str, facility: str) ‑> bool

Set the DOS of the batch and jobs with the specified ids to the supplied DOS. Raises ValueError if:

  • the batch contains job ids not in job_ids
  • the batch is not in state New
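
For example (ids and facility are illustrative):

    from dbops import update_batch_dos

    update_batch_dos(42, job_ids=[101, 102], new_dos="2024-01-16", facility="chaph")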

def update_insurance_provider(ins_record: InsuranceRecord) ‑> bool

Insert the provider info into the facility table if not already present. If a provider with the supplied NPI exists, update lastReferenceDate.

def update_job_on_error(job_id: int | str, input_update: dict[str, typing.Any] | None = None) ‑> int

Update the DB after a failed DocuVision extraction.

Set job state to SystemError and append error message to comments. Update input with any new data from the extraction.

Args

job_id : int
the job id.
input_update : dict[str, Any]
new data extracted from the PDF.

Returns

int
the updated job id.
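
A minimal sketch (the job id and the input_update keys are hypothetical):

    from dbops import update_job_on_error

    # Flag the failed job and persist any partial extraction data.
    update_job_on_error(101, input_update={"pages": 3})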
def upsert_provider(provider: ProviderRecord) ‑> bool

Insert the provider info into the facility table if not already present. If a provider with the supplied NPI exists, update lastReferenceDate.

Classes

class BatchInfo (date_of_service: datetime.date | datetime.datetime | str = datetime.datetime(1, 1, 1, 0, 0), batch_id: int = 0, batch_state: str | BatchState = '')

Container for the date of service, id, and state of a batch.

Class variables

var batch_id : int
var batch_state : str | BatchState
var date_of_service : datetime.date | datetime.datetime | str

Instance variables

prop active : bool

Based on state. True once reconciled until coding complete.

prop closed : bool

Based on state. True if cases should not be added.

prop is_new : bool

Based on state. True if not released to coders.

prop open : bool

Based on state. True if cases can be added.

Methods

def to_dict(self) ‑> dict[str, typing.Any]

Return the asdict() representation.
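
A usage sketch, passing the state as a plain string and assuming state "New" reads as not yet released to coders:

    from dbops import BatchInfo

    info = BatchInfo(date_of_service="2024-01-15", batch_id=42, batch_state="New")
    if info.is_new:  # assumption: "New" counts as not released
        print(info.to_dict())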

class BatchState (value, names=None, *, module=None, qualname=None, type=None, start=1)

Valid batch states used in UI

Ancestors

  • builtins.str
  • enum.Enum

Class variables

var ASGN
var CHGD
var COMP
var ERR
var EXP
var INFO
var IN_P
var NEW
var RCND
class JobState (value, names=None, *, module=None, qualname=None, type=None, start=1)

Valid job states used in UI

Ancestors

  • builtins.str
  • enum.Enum

Class variables

var CODE_ERR
var COMP
var HOLD
var IGN
var IN_P
var NEW
var NO_CODE
var SYS_ERR
class ProviderData (name: dict[str, str] = <factory>, aliases: list[str] = <factory>, licenses: list[dict] = <factory>, facilities: list[str] = <factory>, usStates: list[str] = <factory>)

Represents the JSON content contained in the provider_data column.

Attributes

name : dict[str, str]
name dict as returned by public API
aliases : list[str]
list of full names linked to this provider
licenses : list[dict]
licenses as returned by public API
facilities : list[str]
all facilities at which the provider has worked
usStates : list[str]
all states referenced in API result

Class variables

var aliases : list[str]
var facilities : list[str]
var licenses : list[dict]
var name : dict[str, str]
var usStates : list[str]

Methods

def merge(self, other: ProviderData) ‑> bool

Merge this object's fields with other, returning True if any data in self is adjusted.

def to_dict(self, include_licenses=True) ‑> dict[str, dict | list]

Return asdict(self) after deduplicating list[str] fields.
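
A usage sketch of merge() and to_dict(); field values are illustrative, and the assumption is that a new alias counts as adjusted data:

    from dbops import ProviderData

    a = ProviderData(aliases=["DOE, JANE"], facilities=["chaph"])
    b = ProviderData(aliases=["DOE, JANE A"], facilities=["chaph"])
    if a.merge(b):  # True under the assumption above: b contributes a new alias
        print(a.to_dict(include_licenses=False))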

class ProviderRecord (npi: str, full_name: str, specialty: str, provider_data: ProviderData = <factory>, selected: bool = False, cached: bool = False)

Represents the complete set of data required to update or insert records into the shared.providers table

Attributes

npi : str
a numeric string containing a national provider identifier
full_name : str
name in LAST, FIRST [MIDDLE], [CRED, e.g. MD]
specialty : str
provider_type from NLM clinicaltables API
provider_data : ProviderData
metadata, e.g. facilities supported

Class variables

var cached : bool
var full_name : str
var npi : str
var provider_data : ProviderData
var selected : bool
var specialty : str

Static methods

def from_query(row: tuple[str, str, str, dict]) ‑> ProviderRecord

Construct a new instance from a single DB row.

Args

row : tuple[str, str, str, dict[str, list]]
row value from db

Returns

ProviderRecord
ProviderRecord for the row with selected=True
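
A hedged construction example (the row values are illustrative):

    from dbops import ProviderRecord

    row = ("1234567890", "DOE, JANE, MD", "Anesthesiology", {"aliases": []})
    record = ProviderRecord.from_query(row)
    # Per the docstring, the returned record has selected=True.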

Instance variables

prop provider_record : ProviderRecord

Included as a convenience for easier ProviderIntegrator caching behavior

prop updated : bool

Indicates whether the record was updated, e.g. to add a facility.

Methods

def as_params(self) ‑> dict[str, str]

Return asdict(self) with provider_data converted via json.dumps().

def set_cached(self) ‑> ProviderRecord

Set the cached attribute to bool(self) and return the record.

class TrimEncoder (*args, **kwargs)

Custom json encoder for pretty printing nested dicts

Constructor for JSONEncoder, with sensible defaults.

If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.

If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.

If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause a RecursionError). Otherwise, no such check takes place.

If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

If specified, default is a function that gets called for objects that can't otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.

Ancestors

  • json.encoder.JSONEncoder

Methods

def encode(self, o)

Return a JSON string representation of a Python data structure.

>>> from json.encoder import JSONEncoder
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'