Ortholog mapping not finding all orthologs #962

Open
dlesper opened this issue Jan 6, 2025 · 2 comments

dlesper commented Jan 6, 2025

Searching for gene SCGB2A1 in the dataset collection "scRNA-seq - Crista, mouse (Bermingham-McDonogh)" incorrectly returns the message "Neither the given gene symbol(s) nor corresponding orthologs were found in this dataset."

However, searching for gene SCGB2A2 in the same dataset collection does return values. SCGB2A2 is an ortholog of SCGB2A1.

Screenshot 2025-01-06 at 12 51 25 PM

Screenshot 2025-01-06 at 12 51 44 PM

dlesper added the bug ("Something isn't working") label Jan 6, 2025

adkinsrs commented Jan 6, 2025

I spot-checked one of the datasets in that profile: Scgb2a2 is in the dataset, but Scgb2a1 is not. So if the ortholog feature mapping file has no entry saying that Scgb2a1 maps to Scgb2a2 (or to an organism-specific casing variant of it), the ortholog will not be found.

@adkinsrs

How orthologs are determined:

  1. First, attempt a case-insensitive match of the searched gene against the genes in the dataset. If no match is found, go to 2.
  2. If a preferred organism is saved as a user default, map genes from that organism to the dataset organism. If the user has no default organism, go through each organism and attempt to find mappings, stopping as soon as any gene has a mapping in the feature mapping file. If nothing maps, go to 3.
  3. Throw the "Orthology mapping not found" error.
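
For illustration, the three steps above could be sketched roughly as follows. This is not the actual gEAR code: the function names, the `OrthologNotFoundError` class, and the nested-dictionary layout standing in for the feature mapping file are all hypothetical. The sketch also assumes that once an organism has a mapping entry for the gene, the mapped symbols are checked against the dataset and no further organisms are tried.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical stand-in for the feature mapping file:
# mapping[source_organism][dataset_organism][lowercased_gene] -> mapped symbols
OrthologMap = Dict[str, Dict[str, Dict[str, List[str]]]]


class OrthologNotFoundError(LookupError):
    """Neither the searched gene nor a mapped ortholog is in the dataset."""


def find_in_dataset(gene: str, dataset_genes: Iterable[str]) -> Optional[str]:
    """Step 1: case-insensitive match, returning the dataset's own spelling."""
    wanted = gene.lower()
    for candidate in dataset_genes:
        if candidate.lower() == wanted:
            return candidate
    return None


def resolve_gene(
    gene: str,
    dataset_genes: Iterable[str],
    dataset_organism: str,
    mapping: OrthologMap,
    preferred_organism: Optional[str] = None,
) -> List[str]:
    """Return the dataset gene symbols to display for the searched gene."""
    dataset_genes = list(dataset_genes)

    # Step 1: the searched symbol itself may already be in the dataset.
    direct = find_in_dataset(gene, dataset_genes)
    if direct is not None:
        return [direct]

    # Step 2: use the user's default organism if set, else try each organism.
    sources = [preferred_organism] if preferred_organism else list(mapping)
    for source in sources:
        by_gene = mapping.get(source, {}).get(dataset_organism, {})
        mapped = by_gene.get(gene.lower(), [])
        if mapped:
            found = [find_in_dataset(symbol, dataset_genes) for symbol in mapped]
            hits = [symbol for symbol in found if symbol is not None]
            if hits:
                return hits
            break  # assumption: stop at the first organism that has a mapping

    # Step 3: nothing matched directly and nothing mapped.
    raise OrthologNotFoundError(f"Orthology mapping not found for {gene!r}")
```

With a dataset containing only Scgb2a2, a search for SCGB2A1 reaches step 3 unless the mapping holds an entry from SCGB2A1 to Scgb2a2, which matches the behavior reported in this issue.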

We discussed having this error link to wiki documentation explaining what we use for orthology mapping. Currently we use the Alliance of Genome Resources orthology mapping file found here (https://www.alliancegenome.org/downloads#orthology). Note that gEAR reformats this mapping file to fit our needs, but we make no assertions about orthology beyond what the original Alliance orthology file contains. The wiki should also be kept up to date on the orthology version we use, and should link to alternative orthology mappings for cases where a requested organism mapping does not exist in the Alliance mapping file.
