Module specs._types
Spec definitions. TypedDicts are used for easy serialization.
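The serialization point can be illustrated with a minimal sketch. ExampleSpec below is a hypothetical stand-in, not one of the classes in this module; it only shows that a TypedDict instance is an ordinary dict at runtime, so the standard json module can handle it without a custom encoder (provided the values themselves are JSON-compatible).

```python
import json
from typing import TypedDict


class ExampleSpec(TypedDict):
    """Hypothetical spec used only for illustration."""

    db_secret_name: str
    summary_key_addendum: list[str]


spec: ExampleSpec = {
    "db_secret_name": "example/db",
    "summary_key_addendum": ["extra_key"],
}

# At runtime a TypedDict instance is a plain dict, not a custom class.
is_plain_dict = type(spec) is dict

# Round-trip through JSON without any custom encoder or decoder.
round_tripped = json.loads(json.dumps(spec))
```

Specs that hold callables (for example ReduceSpec or TableSpec) would not serialize this way; the sketch applies only to specs whose values are plain data.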
Sub-modules
specs._types._client-
TypedDict definitions for client_specs.py
specs._types._match-
TypedDict definitions for matchops classes.
specs._types._section-
TypedDict definition for section specs.
specs._types._summary-
Type definitions for summary and condense specs.
specs._types._table-
TypedDicts for use in table_specs.py
Classes
class ClientSpecStatic (*args, **kwargs)-
Top level keys that must be present in ClientSpec objects
Attributes
db_secret_name:str- Name of the secret containing the database credentials.
api_secret_name:str- Name of the secret containing the api credentials.
managed_fields:dict[str, ManagedField]- client level standard fields.
managed_fields_context_rules:list[ContextRule]- client level context rules.
summary_key_addendum:list[str]- additional summary keys specific to client.
summary_map_addendum:dict[str, ReduceSpec]- dict of additional summary reduce specs.
Ancestors
- builtins.dict
Class variables
var api_secret_name : str
var db_secret_name : str
var managed_fields : dict[str, utilities.managed_fields.ManagedField]
var managed_fields_context_rules : list[utilities.managed_fields.ContextRule]
var summary_key_addendum : list[str]
var summary_map_addendum : dict[str, ReduceSpec]
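Because these keys must all be present, a candidate dict can be checked against the TypedDict's required keys. The sketch below uses a hypothetical stand-in class (ClientSpecStaticSketch) with Any in place of ManagedField, ContextRule, and ReduceSpec, since those types are defined elsewhere; missing_keys is an illustrative helper, not part of this module.

```python
from typing import Any, TypedDict


class ClientSpecStaticSketch(TypedDict):
    """Stand-in mirroring the documented keys; value types are simplified."""

    db_secret_name: str
    api_secret_name: str
    managed_fields: dict[str, Any]
    managed_fields_context_rules: list[Any]
    summary_key_addendum: list[str]
    summary_map_addendum: dict[str, Any]


def missing_keys(candidate: dict[str, Any]) -> set[str]:
    """Return the required keys that the candidate dict does not define."""
    return set(ClientSpecStaticSketch.__required_keys__ - set(candidate))


# A deliberately incomplete candidate, for demonstration only.
partial_spec = {"db_secret_name": "example/db", "api_secret_name": "example/api"}
absent = missing_keys(partial_spec)
```

A static type checker reports the same omissions at analysis time; a runtime check like this one is only useful for dicts loaded from external sources.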
class FacilitySpec (*args, **kwargs)-
Facility entry in client_specs.builtin_client_specs.
Attributes
azure_secret_name:str- stores endpoint and subscription key for Azure Computer Vision OCR operations.
dest_prefix:str- The final s3 folder specification for processed PDFs.
extract_func:partial | pu.ExtractFunc- Function used for text extraction.
facility_name:str- MUST MATCH FACILITY NAME FROM ACE SALESFORCE.
failed_prefix:str- The final s3 folder specification for failed PDFs.
first_dos:str- The first date of service to be coded for facility.
insurance_integration_mode:Literal['0', '1'] | None- overrides equivalent env var when set.
max_keys:int- Override max keys passed to extract_buckets for this facility.
match_specs_key:str- Key for this facility in match_specs.
output_dir:str | None- Optional directory for saving intermediate outputs.
provider_integration_mode:Literal['0', '1', '2'] | None- overrides equivalent env var when set.
s3_prefix:str- Non-filename portion of source PDF s3 keys.
section_specs_key:str- Key in section_specs.
send_func:Callable- S3Batch output processing function.
source_prefix:str- The final s3 folder specification for unprocessed PDFs.
summary_key_addendum:list[str]- List of additional summary keys valid for this facility.
summary_map_addendum:dict[str, ReduceSpec]- dict of additional summary reduce specs.
summary_specs_key:str- Key for summary type in summary_specs.
table_specs_key:str | None- Key for facility type in table_specs.
transform_specs_key:str- Key for facility type in transform_specs.
use_autocoding:bool- Flag to enable/disable autocoding for the facility.
use_docuvision:bool- Flag to enable/disable docuvision for the facility.
managed_fields:dict[str, ManagedField]- Custom standard field values (optional).
managed_fields_context_rules:list[ContextRule]- facility level context rules.
dv_preferred_networks:list[str] | None- List of DocuVision neural networks preferred for this facility.
dv_required_page_types:set[str] | None- if supplied, docuvision will only create a case for a pid if at least one of the pages assigned to that pid has a noteType in this set.
send_reject_notifications:bool- if True, include a UserNotification for each rejected s3 file input in the ClaimMaker Alert email to the client. Defaults to False.
file_groups:list[S3FileGroup]- list of file groups specifying how to process every file type according to a matched regex
Ancestors
- builtins.dict
Class variables
var azure_secret_name : str
var dest_prefix : str
var dv_preferred_networks : list[str] | None
var dv_required_page_types : set[str] | None
var extract_func : functools.partial[dict[str, utilities.utils.FileContentsEntry]] | collections.abc.Callable[[dict[str, utilities.library_utils.PDFLibProto]], dict[str, utilities.utils.FileContentsEntry]]
var facility_name : str
var failed_prefix : str
var file_groups : list[utilities.client_utils.S3FileGroup]
var first_dos : str
var insurance_integration_mode : Optional[Literal['0', '1']]
var managed_fields : dict[str, utilities.managed_fields.ManagedField]
var managed_fields_context_rules : list[utilities.managed_fields.ContextRule]
var match_specs_key : str
var max_keys : int
var output_dir : str | None
var provider_integration_mode : Optional[Literal['0', '1', '2']]
var s3_prefix : str
var section_specs_key : str
var send_func : collections.abc.Callable[..., dict[str, bool]]
var send_reject_notifications : bool
var source_prefix : str
var summary_key_addendum : list[str]
var summary_map_addendum : dict[str, ReduceSpec]
var summary_specs_key : str
var table_specs_key : str | None
var transform_specs_key : str
var use_autocoding : bool
var use_docuvision : bool
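The "overrides equivalent env var when set" behaviour of insurance_integration_mode and provider_integration_mode might be resolved along the following lines. This is a sketch under stated assumptions: the environment variable name INSURANCE_INTEGRATION_MODE, the default of "0", and the resolve_mode helper are all hypothetical and do not come from this module.

```python
import os
from typing import Mapping, Optional


def resolve_mode(
    spec_value: Optional[str],
    env_var: str,
    environ: Mapping[str, str] = os.environ,
    default: str = "0",  # assumed fallback; the real default is not documented
) -> str:
    """Prefer the facility-level value; otherwise fall back to the env var."""
    if spec_value is not None:
        return spec_value
    return environ.get(env_var, default)


# Hypothetical environment and facility entries, for demonstration only.
fake_env = {"INSURANCE_INTEGRATION_MODE": "1"}
facility_with_override = {"insurance_integration_mode": "0"}
facility_without_override = {"insurance_integration_mode": None}

overridden = resolve_mode(
    facility_with_override["insurance_integration_mode"],
    "INSURANCE_INTEGRATION_MODE",
    fake_env,
)
inherited = resolve_mode(
    facility_without_override["insurance_integration_mode"],
    "INSURANCE_INTEGRATION_MODE",
    fake_env,
)
```

Passing the environment in as a mapping keeps the helper testable without mutating os.environ.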
class MatchSpec (*args, **kwargs)-
Spec definition for reference file dataframe matching.
Attributes
schedule_other:partial[DataFrameMatcher]- partial function for matching other csvs to schedule.
schedule_demo:partial[DataFrameMatcher]- partial function for matching demographics to schedule.
Ancestors
- builtins.dict
Class variables
var schedule_demo : functools.partial[DataFrameMatcher]
var schedule_other : functools.partial[DataFrameMatcher]
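The partial[DataFrameMatcher] type means each entry is a pre-configured constructor that yields a matcher once the remaining arguments are supplied. The sketch below shows that pattern with a hypothetical stand-in class (MatcherSketch) operating on lists of dicts; the real DataFrameMatcher, its parameters, and the column names used here are assumptions, not taken from this module.

```python
from dataclasses import dataclass
from functools import partial


@dataclass
class MatcherSketch:
    """Hypothetical stand-in for DataFrameMatcher, using lists of row dicts."""

    on: str  # assumed parameter: the column shared by both inputs
    left: list[dict]
    right: list[dict]

    def match(self) -> list[tuple[dict, dict]]:
        # Index the right-hand rows by key, then pair each left row with its match.
        index = {row[self.on]: row for row in self.right}
        return [
            (row, index[row[self.on]])
            for row in self.left
            if row[self.on] in index
        ]


# Configuration is fixed up front; the data is supplied later at call time.
match_spec = {
    "schedule_demo": partial(MatcherSketch, on="mrn"),
    "schedule_other": partial(MatcherSketch, on="account"),
}

schedule = [{"mrn": "1", "name": "A"}, {"mrn": "2", "name": "B"}]
demographics = [{"mrn": "2", "dob": "1970-01-01"}]
pairs = match_spec["schedule_demo"](left=schedule, right=demographics).match()
```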
class ReduceSpec (*args, **kwargs)-
TypedDict defining the fields required in a summary_map entry.
Attributes
reduce:Callable- function to reduce a list of values to a single value
key_filters:list[str]- if the source table key contains a value in this list, exclude its value from consideration.
value_filters:list[Callable]- if any of these functions returns True for a candidate value, exclude it from consideration.
queries:dict[str, Callable]- collect candidate values by querying the source table for keys that match the regex defined in the query key and reduce the results using the function defined in the query value.
Ancestors
- builtins.dict
Class variables
var key_filters : list[str]
var queries : dict[str, collections.abc.Callable[[list[str]], str | bool | list[typing.Any]]]
var reduce : collections.abc.Callable[[collections.abc.Sequence[typing.Any]], str | bool | list[typing.Any]] | str
var value_filters : list[collections.abc.Callable[..., bool]]
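One way the four ReduceSpec fields could interact is sketched below. The apply_reduce_spec helper is hypothetical, and it assumes a flat dict[str, str] source table, a substring test for key_filters, and a callable reduce (the str form permitted by the type annotation is not handled). The actual evaluation order in the summary code may differ.

```python
import re
from typing import Any


def apply_reduce_spec(table: dict[str, str], spec: dict[str, Any]) -> Any:
    """Collect candidates per query, filter them, then reduce to one value."""
    candidates = []
    for pattern, query_reduce in spec["queries"].items():
        matched = [
            value
            for key, value in table.items()
            if re.search(pattern, key)  # key matches the query regex
            and not any(f in key for f in spec["key_filters"])
            and not any(check(value) for check in spec["value_filters"])
        ]
        if matched:
            # Each query reduces its own matches with its own function.
            candidates.append(query_reduce(matched))
    return spec["reduce"](candidates)


# Hypothetical table and spec, for demonstration only.
table = {
    "Patient Name": "Jane Doe",
    "Guarantor Name": "John Doe",
    "Preferred Name": "  ",
}
spec = {
    "reduce": lambda values: values[0] if values else "",
    "key_filters": ["Guarantor"],
    "value_filters": [lambda v: not v.strip()],
    "queries": {r"Name$": lambda values: values[0]},
}
result = apply_reduce_spec(table, spec)
```

Here "Guarantor Name" is dropped by the key filter and the blank "Preferred Name" by the value filter, leaving a single candidate.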
class SectionSpec (*args, **kwargs)-
Section spec TypedDict definition.
Attributes
exact_titles:list[str]- list of exact section titles to match
force_names:list[ForceNameTuple]- list of tuples of check functions and names to force if the check function returns True when passed a current section's title.
strip_ends:list[str]- list of strings to strip from the end of a section title.
heading_breaks:list[HBTuple]- list of tuples of break strings to remove unwanted data from section titles.
sect_start_checks:list[Callable]- list of functions to check if a section has started when passed the list of remaining extracted lines of text.
end_sect_latches:list[LTTuple]- list of tuples containing a latch function that should return True when passed the list of remaining lines if the section is ending, a trigger function to end the section based on the current line, and an unlatch function to clear the "section ending" latch based on the current line.
sect_start_dqs:list[su.SectStartDisqualifier]- list of su.SectStartDisqualifier. Disqualify section starts based on the currently extracting section name and the remaining extracted text.
line_roll_checks:list[su.LineRollCheck]- list of LineRollCheck tuples defining tests to detect and functions to correct improper line wrapping in the source document.
wrap_lines:bool- if True, search for horizontally distributed table layouts and move tables in the rightmost column such that they appear below the table in the leftmost column.
document_strippers:list[StripperTuple]- list of tuples defining document stripper classes and their kwargs.
Ancestors
- builtins.dict
Class variables
var document_strippers : list[utilities.section_utils.StripperTuple]
var end_sect_latches : list[utilities.section_utils.LTTuple]
var exact_titles : list[str]
var force_names : list[utilities.section_utils.ForceNameTuple]
var heading_breaks : list[utilities.section_utils.HBTuple]
var line_roll_checks : list[utilities.section_utils.LineRollCheck]
var sect_start_checks : list[collections.abc.Callable[[collections.abc.Sequence[str]], bool]]
var sect_start_dqs : list[utilities.section_utils.SectStartDisqualifier]
var strip_ends : list[str]
var wrap_lines : bool
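The latch/trigger/unlatch behaviour described for end_sect_latches can be read as a small state machine. The sketch below is one plausible interpretation, not the module's implementation: LatchSketch is a hypothetical stand-in for LTTuple, and the order in which the three functions are consulted on each line is an assumption.

```python
from typing import Callable, NamedTuple, Sequence


class LatchSketch(NamedTuple):
    """Hypothetical stand-in for LTTuple."""

    latch: Callable[[Sequence[str]], bool]  # sees the remaining lines
    trigger: Callable[[str], bool]  # sees the current line
    unlatch: Callable[[str], bool]  # sees the current line


def find_section_end(lines: Sequence[str], latches: Sequence[LatchSketch]) -> int:
    """Return the index of the line that ends the section, or len(lines)."""
    latched = [False] * len(latches)
    for i, line in enumerate(lines):
        for n, entry in enumerate(latches):
            if latched[n]:
                if entry.trigger(line):
                    return i  # section ends on this line
                if entry.unlatch(line):
                    latched[n] = False  # false alarm: clear the latch
            elif entry.latch(lines[i:]):
                latched[n] = True  # section may be ending soon
    return len(lines)


# Hypothetical rule: a signature block arms the latch, a physician line ends
# the section, and an addendum marker cancels the pending end.
signature_latch = LatchSketch(
    latch=lambda remaining: remaining[0].startswith("Signed by"),
    trigger=lambda line: line.startswith("Dr."),
    unlatch=lambda line: line.startswith("Addendum"),
)

ended_at = find_section_end(
    ["Vitals", "BP 120/80", "Signed by:", "", "Dr. Smith", "Next section"],
    [signature_latch],
)
not_ended = find_section_end(
    ["Signed by:", "Addendum", "Dr. Smith"],
    [signature_latch],
)
```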
class SummarySpec (*args, **kwargs)-
Typed dict for required summary spec keys.
Attributes
summary_func:str- must be a valid summary function name from TableTransformer()
summary_args:dict[str, Any]- kwargs for summary_func
summary_key_addendum:list[str]- list of valid output keys not appearing in summary_map
summary_map:CondenseSpec- map of output keys to ReduceSpecs
summary_meets_claimmaker_standard:bool- enables "claimmaker only" operations. See aws_s3_batch.py for more information.
Ancestors
- builtins.dict
Class variables
var summary_args : dict[str, typing.Any]
var summary_func : str
var summary_key_addendum : list[str]
var summary_map : dict[str, ReduceSpec]
var summary_meets_claimmaker_standard : bool
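Since summary_key_addendum lists valid output keys that do not appear in summary_map, the full set of valid keys is presumably the union of the two, extended by any client-level or facility-level addenda. The sketch below illustrates that reading; valid_summary_keys is a hypothetical helper, and how the real code combines these sources is not shown in this module.

```python
from typing import Any, Mapping


def valid_summary_keys(
    summary_spec: Mapping[str, Any],
    *addenda_holders: Mapping[str, Any],
) -> set[str]:
    """Union of summary_map keys, the spec's addendum, and any extra addenda."""
    keys = set(summary_spec["summary_map"]) | set(summary_spec["summary_key_addendum"])
    for holder in addenda_holders:
        # Client and facility specs both document these two optional sources.
        keys |= set(holder.get("summary_key_addendum", []))
        keys |= set(holder.get("summary_map_addendum", {}))
    return keys


# Hypothetical specs, for demonstration only; ReduceSpec values are elided.
summary_spec = {
    "summary_map": {"patient_name": {}, "dob": {}},
    "summary_key_addendum": ["source_file"],
}
client_spec = {"summary_key_addendum": ["client_id"], "summary_map_addendum": {}}
facility_spec = {"summary_map_addendum": {"room_number": {}}}

all_keys = valid_summary_keys(summary_spec, client_spec, facility_spec)
```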
class TableSpec (*args, **kwargs)-
TypedDict representing args/configuration items for parsing the tables of a single section.
Attributes
save_full_text:bool- if True, save the full text of the table
split_table_columns:list[str]- list of column names to split on
row_indent:int- lines indented by at least this many spaces are appended to the list of lines in the previous table.
force_table_names:list[ForceNameTuple]- list of tuples of check functions and names to force if the check function returns True when passed a current table's title.
rollup_cascade_reference:RollupCascadeManager- container class for defining rollup/cascade operations.
heading_indent:int- lines indented by exactly this many spaces are automatically considered to be table headings.
stripped_head_key:str- key to use when storing data stripped from a table heading.
heading_contains:list[str]- list of strings to check for in the line to detect it as a table heading.
start_checks:list[TableStartCheck]- list of tuples of check functions for starting new tables.
end_checks:list[TableEndCheck]- list of tuples of check functions for ending the current table.
interpreter:Callable- function to interpret the table
interpreter_kwargs:SwappingInterpreterKwArgs- kwargs for interpreter
process_residual:bool- if True, process residual text after removing all lines assigned to other tables.
heading_breaks:list[HBTuple]- list of tuples of break strings to remove unwanted data from table titles. Inherited from section specs.
Ancestors
- builtins.dict
Class variables
var end_checks : list[utilities.table_utils.TableEndCheck]
var force_table_names : list[utilities.section_utils.ForceNameTuple]
var heading_breaks : list[utilities.section_utils.HBTuple]
var heading_contains : list[str]
var heading_indent : int
var interpreter : collections.abc.Callable[..., list[dict[str, str]] | utilities.table_utils.SubtableParser]
var interpreter_kwargs : utilities.table_utils.SwappingInterpreterKwArgs
var process_residual : bool
var rollup_cascade_reference : RollupCascadeManager
var row_indent : int
var save_full_text : bool
var split_table_columns : list[str]
var start_checks : list[utilities.table_utils.TableStartCheck]
var stripped_head_key : str
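The indentation rules for heading_indent, row_indent, and heading_contains can be sketched as a simple line classifier. The group_tables helper below is hypothetical and ignores start_checks, end_checks, and the interpreter; treating unclassified lines as residual text is an assumption loosely modelled on process_residual.

```python
from typing import Sequence


def group_tables(
    lines: Sequence[str],
    heading_indent: int,
    row_indent: int,
    heading_contains: Sequence[str] = (),
) -> tuple[list[tuple[str, list[str]]], list[str]]:
    """Split lines into (heading, rows) tables plus unassigned residual lines."""
    tables: list[tuple[str, list[str]]] = []
    residual: list[str] = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        indent = len(line) - len(line.lstrip(" "))
        if indent == heading_indent or any(s in line for s in heading_contains):
            tables.append((line.strip(), []))  # start a new table
        elif indent >= row_indent and tables:
            tables[-1][1].append(line.strip())  # append to the previous table
        else:
            residual.append(line.strip())
    return tables, residual


# Hypothetical extracted text, for demonstration only.
sample = [
    "Medications",
    "  Aspirin 81mg",
    "  Lisinopril 10mg",
    " stray note",
    "Allergies",
    "  Penicillin",
]
tables, residual = group_tables(sample, heading_indent=0, row_indent=2)
```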