layout: page
title: Bachelor/Master

The group for computational biology and bioinformatics at the Institute for Medical Genetics at the Charité offers challenging internships as well as bachelor's and master's thesis projects for interested students of Computer Science and Bioinformatics.

Next Generation Sequencing

Second-generation sequencing techniques are revolutionizing genetics. These high-throughput techniques generate several orders of magnitude more sequencing data than earlier methods and pose new challenges for bioinformaticians. We use a Genome Analyzer II for genome-wide mutation screening. The following projects could be tackled within the scope of a software engineering internship or a Berufspraktikum (professional internship) of 8 weeks:

  • After a sequencing run, the raw data consists of millions of short sequence reads that have to be aligned to a reference sequence before further analysis. After the alignment, one wants to know how well the genomic region of interest is covered by short reads. The goal of this software internship is to develop an automated pipeline that computes the coverage of the target region and further meaningful statistics of a sequencing run.
  • NGS data sets as well as array CGH data sets generate a plethora of candidate mutations and copy number variations that have to be validated using standard techniques. This requires the design of oligonucleotide primer sequences. The goal of this software internship is to further develop a pipeline that generates the appropriate validation primer sequences for a given genomic position or interval.
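The coverage computation described in the first project can be sketched as follows. This is only a minimal illustration, not the group's pipeline: the class `CoverageSketch`, its method names, and the representation of aligned reads as half-open `{start, end}` intervals on a single chromosome are all assumptions made for this example. A real pipeline would read alignments from a BAM file and handle several target regions.

```java
import java.util.Arrays;

// Hypothetical sketch: per-base coverage of one target region.
class CoverageSketch {

    /**
     * Per-base read depth over the half-open target [targetStart, targetEnd).
     * Each read is given as {start, end}, also half-open, on the same chromosome.
     */
    static int[] perBaseCoverage(int targetStart, int targetEnd, int[][] reads) {
        int length = targetEnd - targetStart;
        int[] diff = new int[length + 1]; // difference array: +1 where a read starts, -1 where it ends
        for (int[] read : reads) {
            int from = Math.max(read[0], targetStart) - targetStart;
            int to = Math.min(read[1], targetEnd) - targetStart;
            if (from >= to) {
                continue; // read does not overlap the target
            }
            diff[from]++;
            diff[to]--;
        }
        int[] depth = new int[length];
        int running = 0;
        for (int i = 0; i < length; i++) {
            running += diff[i]; // prefix sum turns the difference array into depths
            depth[i] = running;
        }
        return depth;
    }

    /** Mean depth over the target (0 for an empty target). */
    static double meanCoverage(int[] depth) {
        return Arrays.stream(depth).average().orElse(0.0);
    }

    /** Fraction of target bases covered by at least minDepth reads. */
    static double fractionCoveredAtLeast(int[] depth, int minDepth) {
        if (depth.length == 0) {
            return 0.0;
        }
        long covered = Arrays.stream(depth).filter(d -> d >= minDepth).count();
        return (double) covered / depth.length;
    }

    public static void main(String[] args) {
        int[][] reads = { {95, 105}, {100, 108}, {104, 112} };
        int[] depth = perBaseCoverage(100, 110, reads);
        System.out.println(Arrays.toString(depth)); // [2, 2, 2, 2, 3, 2, 2, 2, 1, 1]
        System.out.println("mean depth: " + meanCoverage(depth));
        System.out.println("fraction >= 2x: " + fractionCoveredAtLeast(depth, 2));
    }
}
```

The difference array keeps the cost linear in the number of reads plus the target length, which matters when a run produces millions of reads.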

Human Phenotype Ontology

The Human Phenotype Ontology (HPO) project requires new interfaces for exploring, visualising and improving the ontology. The student should develop Java servlets for these tasks.

A further topic is the application of the HPO to different fields in the life science domain, especially the transfer of knowledge from rare and genetic diseases to other fields.

Possible projects are:

  • Implementing and testing new algorithmic ideas to improve the performance of disease gene prediction
  • Exploring the common disease annotations of the HPO
  • Improving the user interfaces

All software projects deal with topics that are also of high interest to the research community. Committed students may be able to publish their work in scientific journals. Students interested in the projects described above should contact Peter Robinson, Sebastian Köhler or Peter Krawitz. Ideally, the student should already have good programming skills in Java (!) and some experience with Perl, R, Matlab, C++ or comparable programming languages.

Master's thesis: Genome variant simulator

The falling cost of next-generation sequencing is opening a new era of genetic data generation. Thousands of genomes and exomes are being sequenced around the world. It is only a matter of time until the vision of sequencing the genome of every patient comes true. New algorithms therefore need to be developed to deal with this flood of genetic information.

Recently, the Institute for Medical Genetics and Human Genetics started sequencing whole genomes of patients to identify the causative mutations of their genetic diseases. Our computational biology group has therefore developed algorithms (and is still developing new ones) and combines them with the Human Phenotype Ontology to create tools that efficiently find the causative mutation.

To benchmark our tools, we need artificial genomes into which we spike a possible disease-causing mutation. A widely used approach is to take a public genome from the 1000 Genomes Project. However, these genomes are highly curated, so benchmarks based on them may be biased. To test our tools under extreme conditions, we developed a new tool that randomly samples a new genome based on the variant frequencies of specific populations. In this Master's thesis, the tool should be developed further so that it reproduces real sequencing data more faithfully.
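The idea of sampling a genome from population variant frequencies can be illustrated with a short sketch. It does not show the existing tool: the class `GenomeSamplerSketch`, the `Variant` fields, and the tab-separated output are hypothetical, and the sketch assumes that the two alleles at each site are drawn independently with the population allele frequency (Hardy-Weinberg equilibrium) and that sites are independent of each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: sample a diploid genome from population allele frequencies.
class GenomeSamplerSketch {

    /** A known variant site with its alternative-allele frequency in one population. */
    static final class Variant {
        final String chrom;
        final int pos;
        final String ref;
        final String alt;
        final double alleleFrequency;

        Variant(String chrom, int pos, String ref, String alt, double alleleFrequency) {
            this.chrom = chrom;
            this.pos = pos;
            this.ref = ref;
            this.alt = alt;
            this.alleleFrequency = alleleFrequency;
        }
    }

    /** Number of alternative alleles (0, 1 or 2), drawing both alleles independently. */
    static int sampleAltAlleleCount(double alleleFrequency, Random rng) {
        int count = 0;
        if (rng.nextDouble() < alleleFrequency) {
            count++;
        }
        if (rng.nextDouble() < alleleFrequency) {
            count++;
        }
        return count;
    }

    /** Returns one tab-separated line per site at which the sampled genome carries the variant. */
    static List<String> sampleGenome(List<Variant> variants, Random rng) {
        List<String> lines = new ArrayList<>();
        for (Variant v : variants) {
            int altCount = sampleAltAlleleCount(v.alleleFrequency, rng);
            if (altCount == 0) {
                continue; // homozygous reference: nothing to report
            }
            String genotype = (altCount == 2) ? "1/1" : "0/1";
            lines.add(v.chrom + "\t" + v.pos + "\t" + v.ref + "\t" + v.alt + "\t" + genotype);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Variant> variants = new ArrayList<>();
        variants.add(new Variant("chr1", 1000, "A", "G", 0.30)); // made-up example sites
        variants.add(new Variant("chr1", 2000, "C", "T", 0.01));
        for (String line : sampleGenome(variants, new Random(42L))) {
            System.out.println(line);
        }
    }
}
```

Passing a seeded `Random` makes a sampled genome reproducible, which is useful when the same artificial genome has to be reused across benchmark runs.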

Milestones

  1. Implementing a variant-calling noise simulator that takes the complexity of the genome (mappability of reads) into account.
  2. Implementing a simulator for de novo variants that uses real de novo mutation rates for different parts of the genome.
  3. Developing and implementing a strategy to simulate the inheritance of variants given a pedigree.
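As a starting point for milestones 2 and 3, the inheritance of a single variant within a parent-child trio could be sketched as below. This is an illustrative assumption, not a prescribed design: the class `PedigreeSketch`, its method names, and the single `deNovoRate` parameter are hypothetical. The sketch assumes Mendelian transmission (each parent passes on one of its two alleles with equal probability) and models a de novo event as a transmitted reference allele turning into the alternative allele.

```java
import java.util.Random;

// Hypothetical sketch: Mendelian inheritance of one variant in a trio, with optional de novo events.
class PedigreeSketch {

    /**
     * Decides whether a parent carrying parentAltCount (0, 1 or 2) alternative alleles
     * transmits the alternative allele. A transmitted reference allele becomes
     * alternative with probability deNovoRate.
     */
    static boolean transmitsAlt(int parentAltCount, double deNovoRate, Random rng) {
        // nextInt(2) is 0 or 1: never below 0, below 1 half of the time, always below 2
        boolean alt = rng.nextInt(2) < parentAltCount;
        if (!alt && rng.nextDouble() < deNovoRate) {
            alt = true; // de novo mutation on the transmitted allele
        }
        return alt;
    }

    /** Number of alternative alleles (0, 1 or 2) in the child at this site. */
    static int childAltAlleleCount(int motherAltCount, int fatherAltCount,
                                   double deNovoRate, Random rng) {
        int count = 0;
        if (transmitsAlt(motherAltCount, deNovoRate, rng)) {
            count++;
        }
        if (transmitsAlt(fatherAltCount, deNovoRate, rng)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Random rng = new Random(7L);
        // Heterozygous mother, homozygous-reference father, no de novo events
        int child = childAltAlleleCount(1, 0, 0.0, rng);
        System.out.println("child alt allele count: " + child);
    }
}
```

A larger pedigree could be simulated by visiting individuals in generation order and applying `childAltAlleleCount` to each child once both parents have genotypes. Region-specific de novo rates, as named in milestone 2, would replace the single `deNovoRate` parameter.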

Prerequisites

  1. Good knowledge of Java and of programming in general.
  2. Motivation and enthusiasm for the topic.
  3. An interest in working on larger software projects.

Contact: Max Schubach