COVID-Mat2Vec

Applying Mat2Vec to COVID papers dataset

I ran Mat2Vec (https://github.com/materialsintelligence/mat2vec) on journal abstracts about COVID-19 instead of journal abstracts about materials. One by one, I searched for each term on the tasks page of the Kaggle COVID competition. For each term, the program returned a list of words from the COVID documents that are closely associated with it.

The results were interesting to me. I removed any words from the search results that seemed generic.
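The lookup and filtering described above can be sketched roughly as a nearest-neighbor search by cosine similarity, which is how word2vec-style models usually rank associated words. This is only an illustration, not the code I ran: the toy vectors, the `GENERIC` word set, and the `most_associated` helper are all hypothetical stand-ins for a trained model and for the manual filtering step.

```python
import math

# Hypothetical toy embeddings; a trained model would supply learned vectors.
VECTORS = {
    "vaccine": [0.9, 0.1, 0.0],
    "structure": [0.8, 0.2, 0.1],
    "generated": [0.7, 0.3, 0.0],
    "also": [0.85, 0.15, 0.05],
    "cough": [0.0, 0.1, 0.9],
}

# Hypothetical set of words judged too generic to be informative.
GENERIC = {"also", "however", "using"}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_associated(term, vectors, generic=frozenset(), topn=3):
    """Return the topn words closest to term, skipping generic words."""
    query = vectors[term]
    scored = [
        (word, cosine(query, vec))
        for word, vec in vectors.items()
        if word != term and word not in generic
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:topn]]


print(most_associated("vaccine", VECTORS, GENERIC))
```

Passing an empty `generic` set shows why the filtering matters: a generic word such as "also" can easily rank first.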

Three particularly interesting insights:

  1. The word "green" shows up very often. I'm not sure what this is in relation to.
  2. The word "bone" shows up with "risks" in a couple of places.
    I haven't heard anything relating COVID to bones.
  3. "Evolved" is associated with "designed" and "led".

Other interesting word associations:

transmission: green

incubation: mice, road

stable: genetically, green, end, lack, chain

environment: core, green

environmental: green, physicians, reducing, administration

risk: protected

risks: bone, green

origin: natural, numerous, genotype, hosts, human, physiological

genetic: risks, especially, virulence

evolution: green, patterns, rapidly, virulence

evolved: designed, led

vaccine: generated, directly, structure, green

therapeutic: immune, particles

therapeutics: interactions, genotype, core, selection, added

test: cough

tests: plasma, regression

testing: green, users, directly, standard, genotype

ethical: green

medical: green, four

diagnostics: bone, virulence, reagents, regression, expected

surveillance: control, national, evaluated

social: long

sharing: green, four, risks, genetically, expected

share: mayroon

information: core
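
The observation that "green" shows up very often can be checked by tallying the lists above. A minimal sketch: the `ASSOCIATIONS` dictionary is copied by hand from the rows above, and `count_neighbors` is a hypothetical helper name.

```python
from collections import Counter

# Search term -> associated words, copied by hand from the lists above.
ASSOCIATIONS = {
    "transmission": ["green"],
    "incubation": ["mice", "road"],
    "stable": ["genetically", "green", "end", "lack", "chain"],
    "environment": ["core", "green"],
    "environmental": ["green", "physicians", "reducing", "administration"],
    "risk": ["protected"],
    "risks": ["bone", "green"],
    "origin": ["natural", "numerous", "genotype", "hosts", "human", "physiological"],
    "genetic": ["risks", "especially", "virulence"],
    "evolution": ["green", "patterns", "rapidly", "virulence"],
    "evolved": ["designed", "led"],
    "vaccine": ["generated", "directly", "structure", "green"],
    "therapeutic": ["immune", "particles"],
    "therapeutics": ["interactions", "genotype", "core", "selection", "added"],
    "test": ["cough"],
    "tests": ["plasma", "regression"],
    "testing": ["green", "users", "directly", "standard", "genotype"],
    "ethical": ["green"],
    "medical": ["green", "four"],
    "diagnostics": ["bone", "virulence", "reagents", "regression", "expected"],
    "surveillance": ["control", "national", "evaluated"],
    "social": ["long"],
    "sharing": ["green", "four", "risks", "genetically", "expected"],
    "share": ["mayroon"],
    "information": ["core"],
}


def count_neighbors(associations):
    """Count how many search terms each associated word appears under."""
    counts = Counter()
    for words in associations.values():
        counts.update(set(words))  # count a word once per search term
    return counts


counts = count_neighbors(ASSOCIATIONS)
for word, n in counts.most_common(5):
    print(f"{word}: {n} of {len(ASSOCIATIONS)} terms")
```

With the rows as listed, "green" appears under 11 of the 25 search terms, far more than any other word.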