For comparison only: DEV branch #61
+3,401,682
−271,515
Merged
152 commits
fa49f0f  add obsolete classes due to renamed patterns (jamesamcl)
690f708  add sssom mapping generation (jamesamcl)
28cfef2  Add obsoletion module (matentzn)
9027175  inheres_in_part_of (jamesamcl)
2fedb71  Add method to to postprocess modified patterns after matching (matentzn)
7fbee3e  Merge branch 'dev' of https://github.com/obophenotype/upheno-dev into… (matentzn)
add9688  Update upheno_prepare.py (matentzn)
4ad5740  fix inheres_in pattern generation (jamesamcl)
78965b9  add missing method to lib.py (jamesamcl)
f07a464  add out path for sssom (jamesamcl)
9b6d25e  add mappings file wip (jamesamcl)
4193f75  remove exit from prepare (jamesamcl)
fe43954  remove missing bridges (jamesamcl)
2e6c4ca  Merge branch 'master' into dev (matentzn)
3b309d9  reinstate excluded patterns with comment (jamesamcl)
71a03c5  Merge branch 'dev' of github.com:obophenotype/upheno-dev into dev (jamesamcl)
35015cf  Updated patterns (matentzn)
0c329bb  Create upheno-pattern-deriver.ipynb (matentzn)
0432c8a  Updated patterns (matentzn)
ee9af35  add +x to upheno_pipeline.sh (jamesamcl)
e7b0d8a  Add changed patterns generation step (matentzn)
1966b9a  Merge branch 'dev' of https://github.com/obophenotype/upheno-dev into… (matentzn)
0e0fa3c  fix variable same name as function (jamesamcl)
41e06f1  Update upheno_create_profiles.py (matentzn)
3f4cb53  Update upheno_create_profiles.py (matentzn)
7456884  Update upheno_create_profiles.py (matentzn)
e4b323b  Review changed patterns (matentzn)
e783292  Updating remaining classes (matentzn)
e0b3ba9  Updated some more patterns (matentzn)
a84ad08  Add a few more rewrite rules (matentzn)
4d10c56  Update upheno-pattern-deriver.ipynb (matentzn)
de437e3  Update upheno-config.yaml (jamesamcl)
157cfea  Update upheno_create_profiles.py (matentzn)
fed7de1  Merge branch 'dev' of https://github.com/obophenotype/upheno-dev into… (matentzn)
fe758e2  filter out bfo (jamesamcl)
4dfb83f  Update scripts and files (matentzn)
67cccc5  Update components and add new top level classes (matentzn)
9e31002  Remove three patterns for matches (matentzn)
98878a5  Update some phenotype patterns (matentzn)
62c1905  Huge refactor (matentzn)
682fd5d  Update some data files (matentzn)
c8c94d7  Update upheno_prepare.py (matentzn)
92ac6d9  Update lexical_mapping.py (matentzn)
f14488a  Update lib.py (matentzn)
55680e9  Update upheno-config.yaml (matentzn)
420f775  Update upheno.Makefile (matentzn)
793cb2f  Update upheno_pipeline.sh (matentzn)
06a63f9  Huge refactor, no words (matentzn)
a95da92  Updated files during release (matentzn)
67299a9  Merge branch 'master' into dev (matentzn)
47ba698  Refactor uPheno makefile continued (matentzn)
c922bf2  Update catalog-v001.xml (matentzn)
a9788c1  Update import (matentzn)
986d849  Update merged_import.owl (matentzn)
35dcb4c  State of latest fixed version of uPheno (matentzn)
89df4e2  Update upheno_id_map.txt (matentzn)
941c9dd  Create upheno_id_map_december_2023.txt (matentzn)
8102d97  set SOT for upheno_id_map to github agaimn (matentzn)
443378c  Update upheno-config.yaml (matentzn)
8f56da2  Add uPheno fillers to version control (matentzn)
abf8127  More huge refactors (matentzn)
4817c7b  Update python depedencies (matentzn)
a30df2d  More refactoring (matentzn)
597d5ea  Update upheno.Makefile (matentzn)
4a36bf2  Update patterns (matentzn)
7b8a55a  Update upheno_build.py (matentzn)
0a54e3b  refactor (matentzn)
c6cb03a  Update fillers (matentzn)
fe24d2e  Fix fillers pipeline (matentzn)
b6eb68e  Update upheno.Makefile (matentzn)
72fcd69  Update upheno fillers (matentzn)
4fd297e  Update fillers (matentzn)
47e7217  Update upheno_id_map.txt (matentzn)
248cd20  Make sure upheno map is updated correctly (matentzn)
0d278fe  Update upheno.Makefile (matentzn)
3526c62  Update lib.py (matentzn)
0ae1bf9  Update upheno_create_profiles.py (matentzn)
8dc82f1  Add all uPheno patterns to patterns dir (matentzn)
518ca35  Add automatic from upheno map (matentzn)
4317b81  Add all changed patterns to pattern directory (matentzn)
89184ab  Update a few pattern names (matentzn)
2abbfd5  Update pattern names (matentzn)
2a85058  Remove modified patterns (matentzn)
40b540b  Remove modified patterns (matentzn)
af1cca6  Update pipelines (matentzn)
8dd05cd  Add missing cols to obsolete tsv (matentzn)
46bd970  Remove obsolete classes form set (matentzn)
a52c40c  Update DOSDP (matentzn)
69f7882  More refactoring (matentzn)
29c0f7c  Get rid of some patterns we dont need anymore (matentzn)
bdfe3c2  Another huge refactor (matentzn)
e378993  Add obsoletion pipeline (matentzn)
16a9cd5  Update merged.owl goal (matentzn)
ace925f  Update definitions.owl (matentzn)
189e4ee  Update obsolete.tsv (matentzn)
1c1e2a2  Update upheno-odk.yaml (matentzn)
165ae54  Update abnormalAnatomicalEntity.tsv (matentzn)
48a8c1c  Updated patterns (matentzn)
cfe807d  Update upheno-deprecated.owl (matentzn)
6140e9f  Create upheno_qc.ipynb (matentzn)
a0512de  Obsolete all of the gocc_anatomical_entity cases (matentzn)
3668ff5  Apply all cc_cl removals (matentzn)
296573d  Get rid of resistence_to_entity_chebi case in uPheno (matentzn)
f3d50db  Dealt with bp_and_mf (matentzn)
18f4132  Update obsolete.tsv (matentzn)
b0bfdee  Remove some obsoleted classes from patterns (matentzn)
c96d7bb  Deleting a lot of redundant class definitions (matentzn)
8cfae0b  Updated release files (matentzn)
cecfc2a  Update imports and components (matentzn)
5442cde  Update ODK (matentzn)
b0af5e3  ODK 1.5.2 update (matentzn)
43be67d  Update upheno-edit.owl (matentzn)
67d6e7b  Remove all NBO phenotypes from abnormalBiologicalProcess pattern (matentzn)
cdeef4e  Update definitions.owl (matentzn)
f361dfb  Remove some more classes (matentzn)
2bc12f1  Remove some redundant classes from patterns (matentzn)
c5c5958  Update component files (matentzn)
52fa451  Update scripts (matentzn)
5b3c3c6  Update MISC (matentzn)
8744032  Update scripts (matentzn)
6094b5b  Update patterns (mainly removals) (matentzn)
ffd42b3  Update imports files (matentzn)
495c35a  Remove merged import which has become to large to be handled (matentzn)
680213c  Adding remaining SSPOs to imports (matentzn)
535ed65  Update bridge files (matentzn)
beca8b3  Obsolete 8 redundant classes (matentzn)
ba2d541  Update components and mappings (matentzn)
8f93933  Update templates (matentzn)
0bf5b98  Update pipelines (matentzn)
f45150c  Update remove duplicative classes from pattern files (matentzn)
5dcfd61  ODK config fixes (matentzn)
d437b65  Update upheno.Makefile (matentzn)
e84feaa  Add anatomy mappings into upheno pattern matching process (matentzn)
cce685c  Add Uberon sssom file to uPheno (matentzn)
eae3bf1  Update components and mappings (matentzn)
c9c7e40  Add a few more mapping sets to upheno odk config (matentzn)
7b47abe  Update COmponents and mappings (matentzn)
e62ee96  Update obsolete.tsv (matentzn)
7c39ab3  Update uPheno deprecated (matentzn)
c41d47a  Remove a few obsolete c-s matches (matentzn)
60ffca9  Providing some last tweaks to the ontology (matentzn)
2a5cf80  Update upheno.Makefile (matentzn)
8c0a9ca  Update upheno.Makefile (matentzn)
3e0f60a  Add root alignments (matentzn)
aa78439  Various updates (matentzn)
cf2ab7c  Update uPheno release files (matentzn)
49e8e82  Remove upheno.owl (matentzn)
abe17c5  Delete upheno-base.owl (matentzn)
a7a9069  Update makefile (matentzn)
0413f08  MISC custom makefile updates (matentzn)
c858992  Add lexical matching command to CLI (matentzn)
75726f0  Update release files (matentzn)
@@ -1,7 +1,7 @@
 # ----------------------------------------
 # Makefile for upheno
 # Generated using ontology-development-kit
-# ODK Version: v1.4
+# ODK Version: v1.4.3
 # ----------------------------------------
 # IMPORTANT: DO NOT EDIT THIS FILE. To override default make goals, use upheno.Makefile instead

@@ -43,14 +43,14 @@ REPORT_PROFILE_OPTS =
 OBO_FORMAT_OPTIONS =
 SPARQL_VALIDATION_CHECKS = owldef-self-reference iri-range label-with-iri multiple-replaced_by
 SPARQL_EXPORTS = basic-report class-count-by-prefix edges xrefs obsoletes synonyms
-ODK_VERSION_MAKEFILE = v1.4
+ODK_VERSION_MAKEFILE = v1.4.3

 TODAY ?= $(shell date +%Y-%m-%d)
 OBODATE ?= $(shell date +'%d:%m:%Y %H:%M')
 VERSION= $(TODAY)
 ANNOTATE_ONTOLOGY_VERSION = annotate -V $(ONTBASE)/releases/$(VERSION)/$@ --annotation owl:versionInfo $(VERSION)
 ANNOTATE_CONVERT_FILE = annotate --ontology-iri $(ONTBASE)/$@ $(ANNOTATE_ONTOLOGY_VERSION) convert -f ofn --output $@.tmp.owl && mv $@.tmp.owl $@
-OTHER_SRC = $(PATTERNDIR)/definitions.owl
+OTHER_SRC = $(PATTERNDIR)/definitions.owl $(COMPONENTSDIR)/upheno-deprecated.owl
 ONTOLOGYTERMS = $(TMPDIR)/ontologyterms.txt
 EDIT_PREPROCESSED = $(TMPDIR)/$(ONT)-preprocess.owl
 PATTERNDIR= ../patterns
@@ -345,6 +345,45 @@ refresh-%:
 no-mirror-refresh-%:
 	$(MAKE) IMP=true IMP_LARGE=true MIR=false PAT=false $(IMPORTDIR)/$*_import.owl -B

+
+# ----------------------------------------
+# Components
+# ----------------------------------------
+# Some ontologies contain external and internal components. A component is included in the ontology in its entirety.
+
+COMP=true # Global parameter to bypass component generation
+
+.PHONY: all_components
+all_components: $(OTHER_SRC)
+
+.PHONY: recreate-components
+recreate-components:
+	$(MAKE) COMP=true IMP=false MIR=true PAT=true IMP_LARGE=false all_components -B
+
+.PHONY: no-mirror-recreate-components
+no-mirror-recreate-components:
+	$(MAKE) COMP=true IMP=false MIR=false PAT=true IMP_LARGE=false all_components -B
+
+.PHONY: recreate-%
+recreate-%:
+	$(MAKE) COMP=true IMP=false IMP_LARGE=false MIR=true PAT=true $(COMPONENTSDIR)/$*.owl -B
+
+.PHONY: no-mirror-recreate-%
+no-mirror-recreate-%:
+	$(MAKE) COMP=true IMP=false IMP_LARGE=false MIR=false PAT=true $(COMPONENTSDIR)/$*.owl -B
+
+$(COMPONENTSDIR)/%: | $(COMPONENTSDIR)
+	touch $@
+.PRECIOUS: $(COMPONENTSDIR)/%
+
+
+
+$(COMPONENTSDIR)/upheno-deprecated.owl: $(TEMPLATEDIR)/obsolete.tsv
+	if [ $(COMP) = true ] ; then $(ROBOT) template \
+		$(patsubst %, --template %, $^) \
+		$(ANNOTATE_CONVERT_FILE); fi
+
+.PRECIOUS: $(COMPONENTSDIR)/upheno-deprecated.owl
 # ----------------------------------------
 # Mirroring upstream ontologies
 # ----------------------------------------
@@ -0,0 +1,106 @@
+
+import pandas as pd
+import os
+import yaml
+import glob
+import argparse
+from sssom.context import get_converter
+from sssom.parsers import from_sssom_dataframe
+from sssom.writers import write_table
+
+def main():
+    parser = argparse.ArgumentParser(description='Create SSSOM file from upheno id map and pattern matches')
+    parser.add_argument('--upheno_id_map', type=str, help='upheno id map file')
+    parser.add_argument('--patterns_dir', type=str, help='directory containing pattern files')
+    parser.add_argument('--matches_dir', type=str, help='directory containing pattern matches')
+    args = parser.parse_args()
+    create_upheno_sssom(args.upheno_id_map, args.patterns_dir, args.matches_dir)
+
+def get_id_columns(pattern_file):
+    try:
+        with open(pattern_file, "r") as stream:
+            pattern_json = yaml.safe_load(stream)
+            idcolumns = list(pattern_json["vars"].keys())
+            return idcolumns
+    except Exception as exc:
+        print("Could not get id columns: " + pattern_file)
+        return None
+
+def create_upheno_sssom(upheno_id_map, patterns_dir, matches_dir):
+
+    all_pattern_matches_map = dict()
+
+    for pattern_match_tsv in glob.glob(matches_dir + "/**/*.tsv"):
+        pattern_name = os.path.basename( pattern_match_tsv ).split(".")[0]
+        df = pd.read_csv(pattern_match_tsv, sep='\t')
+        if pattern_name in all_pattern_matches_map:
+            all_pattern_matches_map[pattern_name] = pd.concat([ all_pattern_matches_map[pattern_name], df ])
+        else:
+            all_pattern_matches_map[pattern_name] = df
+
+
+
+    cache_pattern_file_to_idcolumn = dict()
+
+    df = pd.read_csv(upheno_id_map, sep='\t')
+
+    sssom = []
+
+    converter = get_converter()
+
+    for index, row in df.iterrows():
+        tokens = row['id'].split('-')
+        fillers = tokens[:-1]
+        pattern_name = tokens[-1].split('.')[0]
+        pattern_file = pattern_name + ".yaml"
+        id_columns = cache_pattern_file_to_idcolumn.get(pattern_file)
+        if id_columns == None:
+            id_columns = get_id_columns(os.path.join(patterns_dir, pattern_file))
+            cache_pattern_file_to_idcolumn[pattern_file] = id_columns
+        if id_columns == None:
+            continue
+        # print(tokens)
+        # print(pattern_file)
+        # print(id_columns)
+        # print(fillers)
+        tsv_df = all_pattern_matches_map[pattern_name]
+        #filtered = tsv[lambda df: filter_row(df, id_columns, fillers) ]
+
+        mask = pd.Series(True, index=tsv_df.index)
+        for col, filler in zip(id_columns, fillers):
+            mask = mask & (tsv_df[col] == filler)
+        subset_df = tsv_df[mask]
+
+        # print(subset_df)
+
+        upheno_id = row['defined_class']
+
+        for index, row in subset_df.iterrows():
+            species_specific_id = row['defined_class']
+            sssom.append([
+                converter.compress(upheno_id),
+                "semapv:crossSpeciesExactMatch",
+                converter.compress(species_specific_id),
+                "semapv:LogicalMatching"
+            ])
+
+    df_out = pd.DataFrame(sssom, columns=['subject_id', 'predicate_id', 'object_id', 'mapping_justification'])
+
+    meta = dict()
+    meta['mapping_set_id'] = 'https://data.monarchinitiative.org/mappings/upheno/upheno-species-independent.sssom.tsv'
+    msdf = from_sssom_dataframe(df_out, prefix_map=converter, meta=meta)
+    msdf.clean_prefix_map()
+    write_table(msdf, open("upheno-species-independent.sssom.tsv", "w"))
+
+def filter_row(df, id_columns, fillers):
+    n = 0
+    while n < len(id_columns):
+        column = id_columns[n]
+        filler = fillers[n]
+        if df[column] != filler:
+            return False
+    return True
+
+if __name__ == "__main__":
+    main()
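A note on the `filter_row` helper at the end of this file: as committed, its `while` loop never advances `n`, so a row whose first id column matches would loop forever. The helper is only referenced from a commented-out line, so the script itself is unaffected. A minimal terminating sketch follows, assuming the first argument is a single row that supports `row[column]` lookup (a plain dict here, or a pandas Series); the column names and values in the usage lines are hypothetical.

```python
def filter_row(row, id_columns, fillers):
    """Return True when every id column of `row` holds its expected filler."""
    # zip pairs each column with its filler and stops at the shorter sequence,
    # so there is no manual index that can be left un-incremented.
    return all(row[column] == filler for column, filler in zip(id_columns, fillers))


# Hypothetical row and columns, for illustration only.
example_row = {"entity": "A", "quality": "B", "defined_class": "X"}
matches = filter_row(example_row, ["entity", "quality"], ["A", "B"])
differs = filter_row(example_row, ["entity", "quality"], ["A", "C"])
```

This mirrors the `zip(id_columns, fillers)` pairing that the boolean-mask loop in `create_upheno_sssom` already uses, applied to one row at a time.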
@jamesamcl add --output param here.
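One way the requested parameter could look is sketched below. This assumes the comment refers to the SSSOM generation script in this diff, whose output file name is hardcoded; the page does not show which line the comment is attached to. The `build_parser` helper and the default value are illustrative choices, not part of the PR.

```python
import argparse


def build_parser():
    """Argument parser for the SSSOM script with an assumed --output option."""
    parser = argparse.ArgumentParser(
        description='Create SSSOM file from upheno id map and pattern matches')
    parser.add_argument('--upheno_id_map', type=str, help='upheno id map file')
    parser.add_argument('--patterns_dir', type=str, help='directory containing pattern files')
    parser.add_argument('--matches_dir', type=str, help='directory containing pattern matches')
    # Assumed addition: the default keeps the file name the script currently hardcodes,
    # so existing invocations would behave as before.
    parser.add_argument('--output', type=str,
                        default='upheno-species-independent.sssom.tsv',
                        help='path of the SSSOM TSV file to write')
    return parser


# Parse an explicit path and the default case without touching sys.argv.
explicit = build_parser().parse_args(['--output', 'mappings/upheno.sssom.tsv'])
default = build_parser().parse_args([])
```

With this in place, `main()` would pass `args.output` on to `create_upheno_sssom`, which could then open that path in a `with` block before calling `write_table`, so the file handle is also closed explicitly.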