Output Map Types
The output map custom types.
OutputTrackerGitFilter
Bases: TypedDict
Parsed git filter TypedDict used for output map formatting.
Attributes:
Name | Type | Description |
---|---|---|
filter | tuple[str, list[str]] | Tuple representing the filter type (include or exclude) and the filter values. |
Source code in bcorag/custom_types/output_map_types.py
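The block below is not the source listing itself. It is a minimal sketch of how a TypedDict with the single documented `filter` attribute could be declared and used. The class is a local stand-in that mirrors the attribute table, and the literal strings `"exclude"`, `"tests"`, and `"docs"` are assumptions chosen for illustration.

```python
from typing import TypedDict


class OutputTrackerGitFilter(TypedDict):
    """Local stand-in mirroring the documented attribute table."""

    filter: tuple[str, list[str]]


# Hypothetical values: exclude two directories when indexing a repository.
directory_filter: OutputTrackerGitFilter = {"filter": ("exclude", ["tests", "docs"])}

# The tuple unpacks into the filter type and the filter values.
filter_type, filter_values = directory_filter["filter"]
print(filter_type, filter_values)
```

At runtime a TypedDict instance is an ordinary dict; the field types are only enforced by static type checkers.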
OutputTrackerParamSet
Bases: TypedDict
Parameter set for a run.
Attributes:
Name | Type | Description |
---|---|---|
loader | str | The data loader used for the run. |
vector_store | str | The vector store used for the run. |
llm | str | The LLM name used for the run. |
embedding_model | str | The embedding model used for the run. |
similarity_top_k | int | The similarity top k value used for the run. |
chunking_config | str | The chunking strategy used for the run. |
git_user | Optional[str] | The user who owns the github repository included in the document ingestion for the run (if applicable). |
git_repo | Optional[str] | The github repository included in the document ingestion for the run (if applicable). |
git_branch | Optional[str] | The github repository branch indexed during the document ingestion for the run (if applicable). |
directory_git_filter | Optional[OutputTrackerGitFilter] | The directory filter used for indexing the github repository (if applicable). |
file_ext_git_filter | Optional[OutputTrackerGitFilter] | The file extension filter used for indexing the github repository (if applicable). |
other_docs | Optional[list[str]] | The file path to any additional documentation included in the documents. |
OutputTrackerRunsEntry
Bases: TypedDict
Specific file data under a parameter set.
Attributes:
Name | Type | Description |
---|---|---|
index | int | The index for the run (the index represents the run number for that specific domain parameter set). |
timestamp | str | The timestamp for the run. |
txt_file | str | File path to the raw output dump text file. |
json_file | str | File path to the JSON output file. |
source_node_file | str | File path to the source node text file. |
elapsed_time | float | The elapsed time (in seconds) for how long the domain generation took. |
version | str | The version of the bcorag tool used. |
OutputTrackerEntry
Bases: TypedDict
Entry in the output map under a specific domain hash string.
Attributes:
Name | Type | Description |
---|---|---|
curr_index | int | The most recent run index. |
params | OutputTrackerParamSet | The parameter set for the run. |
runs | list[OutputTrackerRunsEntry] | The list of runs for this parameter set. |
OutputTrackerDomainEntry
Bases: TypedDict
Entry for a specific domain.
Note: this is not the ideal way to do this. Ideally, the hash string for the parameter set would itself be the key, instead of forcing the OutputTrackerDomainField to be kept as a list of objects. However, there doesn't seem to be a good, Pythonic way to do this while enforcing type safety with static type checkers. As they currently exist, TypedDicts require that all keys be specified when the type is defined. I would rather not specify regular dictionaries with extensive and verbose type annotations, and I expect these output map files to be small enough that linear-time lookups won't cause serious issues.
Attributes:
Name | Type | Description |
---|---|---|
hash_str | str | The hash of the parameter set used for run collision identification. |
entries | OutputTrackerEntry | The run objects. |
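Because each domain is stored as a list of objects rather than a dictionary keyed by hash, finding the entry for a parameter set means scanning the list. The sketch below shows one way that linear lookup could look. The helper name `find_entry` and the sample data are hypothetical, and plain dicts stand in for the TypedDicts to keep the example short.

```python
from typing import Any, Optional


def find_entry(
    domain_entries: list[dict[str, Any]], hash_str: str
) -> Optional[dict[str, Any]]:
    """Return the `entries` object whose hash matches, or None if absent."""
    for domain_entry in domain_entries:
        if domain_entry["hash_str"] == hash_str:
            return domain_entry["entries"]
    return None


# Hypothetical domain list containing two parameter sets.
usability = [
    {"hash_str": "abc123", "entries": {"curr_index": 2, "params": {}, "runs": []}},
    {"hash_str": "def456", "entries": {"curr_index": 1, "params": {}, "runs": []}},
]

match = find_entry(usability, "abc123")
missing = find_entry(usability, "not-present")
```

The scan is O(n) in the number of parameter sets per domain, which is the trade-off the note above accepts in exchange for fully typed keys.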
OutputTrackerFile
Bases: TypedDict
Top level schema for the output file.
Attributes:
Name | Type | Description |
---|---|---|
usability | list[OutputTrackerDomainEntry] | The output map for the usability domain. |
io | list[OutputTrackerDomainEntry] | The output map for the io domain. |
description | list[OutputTrackerDomainEntry] | The output map for the description domain. |
execution | list[OutputTrackerDomainEntry] | The output map for the execution domain. |
parametric | list[OutputTrackerDomainEntry] | The output map for the parametric domain. |
error | list[OutputTrackerDomainEntry] | The output map for the error domain. |
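To show how the types above nest, here is a sketch of a complete output map with a single usability run. Every value is a hypothetical placeholder, and printing the structure as JSON is done purely for display; the actual on-disk format is an assumption.

```python
import json

# All values below are hypothetical placeholders chosen for illustration.
params = {
    "loader": "example-loader",
    "vector_store": "example-store",
    "llm": "example-llm",
    "embedding_model": "example-embedding",
    "similarity_top_k": 3,
    "chunking_config": "example-chunking",
    "git_user": None,
    "git_repo": None,
    "git_branch": None,
    "directory_git_filter": None,
    "file_ext_git_filter": None,
    "other_docs": None,
}

run = {
    "index": 1,
    "timestamp": "2024-01-01 12:00:00",
    "txt_file": "output/usability-1.txt",
    "json_file": "output/usability-1.json",
    "source_node_file": "output/usability-1-sources.txt",
    "elapsed_time": 12.5,
    "version": "0.0.0",
}

# File -> domain list -> {hash_str, entries} -> {curr_index, params, runs}.
output_map = {
    "usability": [
        {
            "hash_str": "abc123",
            "entries": {"curr_index": 1, "params": params, "runs": [run]},
        }
    ],
    "io": [],
    "description": [],
    "execution": [],
    "parametric": [],
    "error": [],
}

print(json.dumps(output_map, indent=2))
```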
create_output_tracker_git_filter(filter)
Constructor for the OutputTrackerGitFilter TypedDict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter | tuple[str, list[str]] | Tuple representing the filter type (include or exclude) and the filter values. | required |
Returns:
Type | Description |
---|---|
OutputTrackerGitFilter | |
create_output_tracker_param_set(loader, vector_store, llm, embedding_model, similarity_top_k, chunking_config, git_user, git_repo, git_branch, directory_git_filter=None, file_ext_git_filter=None, other_docs=None)
Constructor for the OutputTrackerParamSet TypedDict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
loader | str | The data loader used for the run. | required |
vector_store | str | The vector store used for the run. | required |
llm | str | The LLM name used for the run. | required |
embedding_model | str | The embedding model used for the run. | required |
similarity_top_k | int | The similarity top k value used for the run. | required |
chunking_config | str | The chunking strategy used for the run. | required |
git_user | Optional[str] | The user who owns the github repository included in the document ingestion for the run (if applicable). | required |
git_repo | Optional[str] | The github repository included in the document ingestion for the run (if applicable). | required |
git_branch | Optional[str] | The github repository branch indexed during the document ingestion for the run (if applicable). | required |
directory_git_filter | Optional[OutputTrackerGitFilter] | The directory filter used for indexing the github repository (if applicable). | None |
file_ext_git_filter | Optional[OutputTrackerGitFilter] | The file extension filter used for indexing the github repository (if applicable). | None |
other_docs | Optional[list[str]] | The file path to any additional documentation included in the documents. | None |
Returns:
Type | Description |
---|---|
OutputTrackerParamSet | |
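The signature above makes the three git arguments required even though their type is Optional, while the two filters and `other_docs` default to None. A minimal sketch of a constructor with that shape follows. The types and function are local stand-ins written from the tables on this page, not the library's actual source, and the argument values are hypothetical.

```python
from typing import Optional, TypedDict


class OutputTrackerGitFilter(TypedDict):
    filter: tuple[str, list[str]]


class OutputTrackerParamSet(TypedDict):
    loader: str
    vector_store: str
    llm: str
    embedding_model: str
    similarity_top_k: int
    chunking_config: str
    git_user: Optional[str]
    git_repo: Optional[str]
    git_branch: Optional[str]
    directory_git_filter: Optional[OutputTrackerGitFilter]
    file_ext_git_filter: Optional[OutputTrackerGitFilter]
    other_docs: Optional[list[str]]


def create_output_tracker_param_set(
    loader: str,
    vector_store: str,
    llm: str,
    embedding_model: str,
    similarity_top_k: int,
    chunking_config: str,
    git_user: Optional[str],
    git_repo: Optional[str],
    git_branch: Optional[str],
    directory_git_filter: Optional[OutputTrackerGitFilter] = None,
    file_ext_git_filter: Optional[OutputTrackerGitFilter] = None,
    other_docs: Optional[list[str]] = None,
) -> OutputTrackerParamSet:
    """Stand-in constructor: packs the arguments into the TypedDict."""
    return {
        "loader": loader,
        "vector_store": vector_store,
        "llm": llm,
        "embedding_model": embedding_model,
        "similarity_top_k": similarity_top_k,
        "chunking_config": chunking_config,
        "git_user": git_user,
        "git_repo": git_repo,
        "git_branch": git_branch,
        "directory_git_filter": directory_git_filter,
        "file_ext_git_filter": file_ext_git_filter,
        "other_docs": other_docs,
    }


# A run without any github repository: the git arguments must still be passed.
params = create_output_tracker_param_set(
    loader="example-loader",
    vector_store="example-store",
    llm="example-llm",
    embedding_model="example-embedding",
    similarity_top_k=3,
    chunking_config="example-chunking",
    git_user=None,
    git_repo=None,
    git_branch=None,
)
```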
create_output_tracker_runs_entry(index, timestamp, txt_file, json_file, source_node_file, elapsed_time, version=__version__)
Constructor for the OutputTrackerRunsEntry TypedDict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
index | int | The index for the run (the index represents the run number for that specific domain parameter set). | required |
timestamp | str | The timestamp for the run. | required |
txt_file | str | File path to the raw output dump text file. | required |
json_file | str | File path to the JSON output file. | required |
source_node_file | str | File path to the source node text file. | required |
elapsed_time | float | The elapsed time (in seconds) for how long the domain generation took. | required |
version | str | The version of the bcorag tool used. | __version__ |
create_output_tracker_entry(curr_index, params, runs)
Constructor for the OutputTrackerEntry TypedDict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
curr_index | int | The most recent run index. | required |
params | OutputTrackerParamSet | The parameter set for the run. | required |
runs | list[OutputTrackerRunsEntry] | The list of runs for this parameter set. | required |
Returns:
Type | Description |
---|---|
OutputTrackerEntry | |
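Since `curr_index` holds the most recent run index and each run carries its own `index`, recording a new run plausibly means appending to `runs` and advancing `curr_index`. The sketch below illustrates that bookkeeping. The helper `append_run` is hypothetical, and numbering a new run as `curr_index + 1` is an assumption rather than documented behavior.

```python
from typing import Any


def append_run(entry: dict[str, Any], run: dict[str, Any]) -> dict[str, Any]:
    """Return a new entry with `run` appended and `curr_index` advanced."""
    return {
        "curr_index": run["index"],
        "params": entry["params"],
        "runs": [*entry["runs"], run],
    }


# Hypothetical entry with a single existing run.
entry = {"curr_index": 1, "params": {}, "runs": [{"index": 1}]}

# Assumption: the next run is numbered one past the most recent index.
new_run = {"index": entry["curr_index"] + 1}
updated = append_run(entry, new_run)
```

Returning a new dict instead of mutating the argument leaves the original entry untouched, which makes the update easy to test.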
create_output_tracker_domain_entry(hash_str, entries)
Constructor for the OutputTrackerDomainEntry TypedDict.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
hash_str | str | The hash of the parameter set used for run collision identification. | required |
entries | OutputTrackerEntry | The run objects. | required |
Returns:
Type | Description |
---|---|
OutputTrackerDomainEntry | |
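This page does not say how `hash_str` is computed. As an illustration only, one possible way to derive a stable hash from a parameter set is to serialize it with sorted keys and hash the result. The function name `param_set_hash` and the choice of JSON plus MD5 are assumptions and may differ from what bcorag actually does.

```python
import hashlib
import json
from typing import Any


def param_set_hash(params: dict[str, Any]) -> str:
    """Hash a parameter set so that key order does not affect the result."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Hypothetical parameter sets: a and b differ only in key order.
a = {"loader": "example-loader", "similarity_top_k": 3}
b = {"similarity_top_k": 3, "loader": "example-loader"}
c = {"loader": "example-loader", "similarity_top_k": 5}

same = param_set_hash(a) == param_set_hash(b)
different = param_set_hash(a) != param_set_hash(c)
```

With a scheme like this, two runs that use identical parameters collide on the same `hash_str` and can be grouped under one entry, while any change to a parameter yields a new entry.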
default_output_tracker_file()
Creates an empty, default output tracker file instance.
Returns:
Type | Description |
---|---|
OutputTrackerFile | |
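A minimal sketch of what an empty tracker file might look like, assuming that "empty" means each of the six documented domain keys maps to an empty list. The function name below is a hypothetical stand-in, not the library function itself.

```python
# Domain keys taken from the OutputTrackerFile attribute table.
DOMAINS = ("usability", "io", "description", "execution", "parametric", "error")


def default_output_tracker_file_sketch() -> dict[str, list]:
    """Build a tracker with a fresh, empty list for every domain."""
    return {domain: [] for domain in DOMAINS}


tracker = default_output_tracker_file_sketch()
```

The comprehension creates a separate list per domain, so appending to one domain never affects another.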