Data Merge #64
LGTM, nice clean PR. Ready to merge!
# In[2]:


data_directory = "../7.collab-data/data/"
Consider resolving this path.
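One way to read this suggestion is to turn the relative string into an absolute path with `pathlib`. The sketch below is an assumption about what the reviewer has in mind, not the change made in the PR; the variable name `relative_data_directory` is hypothetical.

```python
from pathlib import Path

# Relative location copied from the diff; the intermediate name is hypothetical.
relative_data_directory = Path("../7.collab-data/data/")

# resolve() returns an absolute path and collapses the ".." segment.
# Note: it resolves against the current working directory, not the notebook's location.
data_directory = relative_data_directory.resolve()

print(data_directory)
```

Because the result still depends on where the notebook or script is launched from, this only makes the location explicit; it does not make it independent of the working directory.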
@@ -0,0 +1,77 @@
#!/usr/bin/env python
# coding: utf-8

Consider adding a markdown cell describing this file
rnaseq_df = rnaseq_df.set_index('GeneID')  # Set the GeneID column as index in df2
rnaseq_df.index.name = 'Symbol (GeneID)'  # Rename index to match the new column name

# Step 3: Map the Symbol (GeneID) values from df1 to df2 index
I like these step # documentation comments, the flow is nice.
This PR merges the two DataFrames from the collaborator into a single DataFrame with the rows and columns needed for the RNAseq model.
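As a rough illustration of the kind of merge described here, the sketch below follows the steps visible in the diff (set `GeneID` as the index, rename the index, map labels from `df1` onto it). The data is made up, and the label format `"SYMBOL (ID)"` in `df1`, the `sample_a` column, and the names `id_to_label` and `merged_df` are all assumptions for illustration, not the collaborator's actual schema or the code in this PR.

```python
import pandas as pd

# Hypothetical stand-ins for the collaborator's two tables; all values are made up.
df1 = pd.DataFrame({"Symbol (GeneID)": ["TP53 (7157)", "BRCA1 (672)"]})
rnaseq_df = pd.DataFrame(
    {"GeneID": [7157, 672, 1], "sample_a": [10.0, 20.0, 30.0]}
)

# Set GeneID as the index and rename the index, as in the diff.
rnaseq_df = rnaseq_df.set_index("GeneID")
rnaseq_df.index.name = "Symbol (GeneID)"

# Assumption: the numeric ID is the parenthesised suffix of each df1 label.
gene_ids = (
    df1["Symbol (GeneID)"].str.extract(r"\((\d+)\)$", expand=False).astype(int)
)
id_to_label = dict(zip(gene_ids, df1["Symbol (GeneID)"]))

# Keep only genes that appear in df1, then replace numeric IDs with the labels.
keep = rnaseq_df.index.isin(list(id_to_label))
merged_df = rnaseq_df.loc[keep].rename(index=id_to_label)

print(merged_df)
```

Filtering before renaming avoids leaving a mix of numeric IDs and string labels in the index when some genes have no match in `df1`.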