Hathi collections return only 2 columns "leader" and "fields" #612

Open
jacobthill opened this issue Jan 15, 2025 · 0 comments
The collections coming from HathiTrust (Michigan and McGill) return only two columns. The first, "leader", looks like this: 04267ctm a2200769 i 4500. The second, "fields", holds the entire record as a single list of tag/value entries and looks like this:

[{'001': '102869317'}, {'003': 'MiAaHDL'}, {'005': '20220519000000.0'}, {'006': 'm d '}, {'007': 'cr bn ---auaua'}, {'008': '180920q17001750xx 000 0 ara d'}, {'035': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'sdr-ia-qmm.1053626200'}]}}, {'035': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': '(OCoLC)1053626200'}]}}, {'040': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'LGG'}, {'b': 'eng'}, {'e': 'rda'}, {'c': 'LGG'}, {'d': 'LGG'}, {'d': 'OCLCO'}, {'d': 'OCLCQ'}]}}, {'043': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'aw-----'}, {'a': 'n-cn-qu'}]}}, {'049': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'LGG'}]}}, {'050': {'ind1': ' ', 'ind2': '4', 'subfields': [{'a': 'R128.3'}, {'b': '.R39 1700z'}]}}, {'055': {'ind1': '1', 'ind2': '9', 'subfields': [{'a': 'McGill University, Bib.Osl. 0450'}]}}, {'066': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'c': '(3'}, {'c': '(4'}]}}, {'100': {'ind1': '1', 'ind2': ' ', 'subfields': [{'6': '880-01'}, {'a': 'Rāzī, Abū Bakr Muḥammad ibn Zakarīyā,'}, {'d': '865?-925?'}, {'e': 'author.'}]}}, {'240': {'ind1': '1', 'ind2': '0', 'subfields': [{'6': '880-02'}, {'a': 'al-Fāk̲h̲ir fī al-t̤ibb.'}]}}, {'245': {'ind1': '1', 'ind2': '0', 'subfields': [{'6': '880-03'}, {'a': 'al-Kunnās al-fāk̲h̲ir /'}, {'c': 'lʼAbī Bakr Muḥammad bin Zakariyyā al-Rāzī.'}]}}, {'246': {'ind1': '3', 'ind2': '0', 'subfields': [{'6': '880-04'}, {'a': 'Kitāb al-Kunnāsh'}]}}, {'264': {'ind1': ' ', 'ind2': '0', 'subfields': [{'c': '[between 1700 and 1750]'}]}}, {'300': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': '855 pages ;'}, {'c': '27 x 14 cm'}]}}, {'336': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'text'}, {'b': 'txt'}, {'2': 'rdacontent'}]}}, {'337': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'unmediated'}, {'b': 'n'}, {'2': 'rdamedia'}]}}, {'338': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'volume'}, {'b': 'nc'}, {'2': 'rdacarrier'}]}}, {'500': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Dimensions of textblock: 18 x 7 cm.'}]}}, {'500': 
{'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'On the whole practice of medicine.'}]}}, {'500': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'The first and last leaves are of a later date and bear references, mainly in Arabic, in a modern hand to the pages of the book.'}]}}, {'500': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Two volumes bound in one, with one series of original Arabic pagination.'}]}}, {'500': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Some marginal corrections, divided into two parts, catch words.'}]}}, {'500': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': "Fine, laid Oriental paper, elegant Persian Nasta'alīq, rubricated."}]}}, {'500': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'6': '880-05'}, {'a': 'Manuscript.'}]}}, {'510': {'ind1': '4', 'ind2': ' ', 'subfields': [{'a': 'Gacek, Adam. Arabic manuscripts in the libraries of the McGill University,'}, {'c': '107'}]}}, {'510': {'ind1': '4', 'ind2': ' ', 'subfields': [{'a': 'Osler'}, {'c': '450'}]}}, {'538': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Mode of access: Internet.'}]}}, {'546': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Text in Arabic.'}]}}, {'561': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': "Inserted part of a letter descriptive of the MS. from Dr. Sa'eed and some notes by Sir W. Osler."}, {'5': 'CaQMM'}]}}, {'561': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Manuscript comes from the Library of Sir Willian Osler and was purchased in Persia in 1917 through Dr. Neligan.'}, {'5': 'CaQMM'}]}}, {'563': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Black leather binding, written in yellow and pink papers, hinges repaired with white cotton, stab sewn.'}, {'5': 'CaQMM'}]}}, {'591': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': "Copy in McGill Library's Osler Library of the History of Medicine, Robertson Collection copy: B.O. 
450"}]}}, {'592': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': "Copy in McGill Library's Osler Library of the History of Medicine, Robertson Collection: Inserted part of a letter descriptive of the MS. from Dr. Sa'eed and some notes by Sir W. Osler."}]}}, {'592': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': "Copy in McGill Library's Osler Library of the History of Medicine, Robertson Collection: Manuscript comes from the Library of Sir Willian Osler and was purchased in Persia in 1917 through Dr. Neligan."}]}}, {'593': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': "Copy in McGill Library's Osler Library of the History of Medicine, Robertson Collection: Black leather binding, written in yellow and pink papers, hinges repaired with white cotton, stab sewn"}]}}, {'594': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Gacek, Adam. Arabic manuscripts in the libraries of the McGill University, 107'}]}}, {'594': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'Osler'}, {'c': '450'}]}}, {'650': {'ind1': ' ', 'ind2': '7', 'subfields': [{'a': 'Medicine, Medieval.'}, {'2': 'fast'}, {'0': '(OCoLC)fst01015277'}]}}, {'650': {'ind1': ' ', 'ind2': '7', 'subfields': [{'a': 'Medicine, Arab.'}, {'2': 'fast'}, {'0': '(OCoLC)fst01015186'}]}}, {'650': {'ind1': ' ', 'ind2': '7', 'subfields': [{'a': 'Manuscripts, Arabic.'}, {'2': 'fast'}, {'0': '(OCoLC)fst01008278'}]}}, {'650': {'ind1': ' ', 'ind2': '0', 'subfields': [{'a': 'Manuscripts, Arabic'}, {'z': 'Québec (Province)'}, {'z': 'Montréal.'}]}}, {'650': {'ind1': ' ', 'ind2': '0', 'subfields': [{'a': 'Medicine, Arab'}]}}, {'650': {'ind1': ' ', 'ind2': '0', 'subfields': [{'a': 'Medicine, Medieval'}, {'z': 'Middle East.'}]}}, {'651': {'ind1': ' ', 'ind2': '7', 'subfields': [{'a': 'Québec'}, {'z': 'Montréal.'}, {'2': 'fast'}, {'0': '(OCoLC)fst01210434'}]}}, {'651': {'ind1': ' ', 'ind2': '7', 'subfields': [{'a': 'Middle East.'}, {'2': 'fast'}, {'0': '(OCoLC)fst01241586'}]}}, {'655': {'ind1': ' ', 'ind2': '7', 'subfields': [{'a': 
'Manuscripts.'}, {'2': 'fast'}, {'0': '(OCoLC)fst01424060'}]}}, {'710': {'ind1': '2', 'ind2': ' ', 'subfields': [{'a': 'Osler Library.'}, {'k': 'Manuscript.'}, {'n': '450.'}]}}, {'791': {'ind1': '2', 'ind2': ' ', 'subfields': [{'a': 'Osler Library.'}, {'k': 'Manuscript.'}, {'n': '450.'}]}}, {'880': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'6': '500-05/(3/r'}, {'a': 'Explicit : فى باب ادرار العروق و قطعه فهذا اخر الکلام فىها.'}]}}, {'880': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'6': '500-00/(3/r'}, {'a': 'Incipit : اللهم اعصمنا من الزلل و اعدنا من الخلل ...'}]}}, {'880': {'ind1': '3', 'ind2': '0', 'subfields': [{'6': '246-04/(4/r'}, {'a': 'کتاب الکناش'}]}}, {'880': {'ind1': '1', 'ind2': '0', 'subfields': [{'6': '245-03/(3/r'}, {'a': 'الکناش الفاخر /'}, {'c': 'لأبي بكر محمد بن زكريا الرازي.'}]}}, {'880': {'ind1': '1', 'ind2': '0', 'subfields': [{'6': '240-02/(3/r'}, {'a': 'فاخر قي الطب'}]}}, {'880': {'ind1': '1', 'ind2': ' ', 'subfields': [{'6': '100-01/(3/r'}, {'a': 'الرازي، أبو بكر محمد بن زكريا,'}, {'d': '865?-925?'}]}}, {'CID': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': '102869317'}]}}, {'DAT': {'ind1': '0', 'ind2': ' ', 'subfields': [{'a': '20220225050334.3'}, {'b': '20220519000000.0'}]}}, {'DAT': {'ind1': '1', 'ind2': ' ', 'subfields': [{'a': '20220519140503.0'}, {'b': '2022-05-22T18:01:00Z'}]}}, {'DAT': {'ind1': '2', 'ind2': ' ', 'subfields': [{'a': '2022-05-22T17:30:02Z'}]}}, {'CAT': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'SDR-IA-QMM'}, {'d': 'WMS'}, {'l': '[prepare.pl](http://prepare.pl/)-004-008'}]}}, {'FMT': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'a': 'BK'}]}}, {'HOL': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'0': 'sdr-ia-qmm.1053626200'}, {'a': 'mcg'}, {'b': 'SDR'}, {'c': 'QMM'}, {'p': 'mcg.ark:/13960/s2037xz6ndv'}, {'s': 'QMM'}, {'1': '1053626200'}, {'8': 'ia.McGillLibrary-osl_robe_0450-19416'}]}}, {'974': {'ind1': ' ', 'ind2': ' ', 'subfields': [{'8': 'ia.McGillLibrary-osl_robe_0450-19416'}, {'b': 'QMM'}, {'c': 'QMM'}, {'d': 
'20220522'}, {'s': 'mcgill'}, {'u': 'mcg.ark:/13960/s2037xz6ndv'}, {'y': '1750'}, {'r': 'pd'}, {'q': 'bib'}, {'t': 'non-US bib date2 < 1900'}]}}]

Each field in the "fields" list needs to be broken out into its own column instead of being stored together in a single "fields" column.
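As a rough illustration of what "broken out into its own column" could mean for the structure shown above, here is a minimal Python sketch. It is not the project's implementation: the function name `flatten_marc_record`, the `tag_code` column naming, and the " | " separator for repeated fields are all assumptions made for this example. It treats string values (such as 001) as control fields and dict values as data fields with a "subfields" list, which matches the sample record.

```python
from collections import defaultdict


def flatten_marc_record(leader, fields, sep=" | "):
    """Flatten one record (leader + list of field dicts) into a {column: value} dict.

    Hypothetical helper: the column naming and separator are illustrative choices.
    """
    columns = defaultdict(list)
    for field in fields:
        for tag, value in field.items():
            if isinstance(value, str):
                # Control fields (e.g. 001, 003, 008) hold a plain string.
                columns[tag].append(value)
            else:
                # Data fields hold indicators plus a list of one-key subfield dicts.
                for subfield in value.get("subfields", []):
                    for code, text in subfield.items():
                        columns[f"{tag}_{code}"].append(text)
    row = {"leader": leader}
    # Repeated tags/subfields (e.g. two 035 fields) are joined into one cell.
    row.update({name: sep.join(values) for name, values in columns.items()})
    return row


# A few entries copied from the record above.
sample_fields = [
    {"001": "102869317"},
    {"035": {"ind1": " ", "ind2": " ", "subfields": [{"a": "sdr-ia-qmm.1053626200"}]}},
    {"035": {"ind1": " ", "ind2": " ", "subfields": [{"a": "(OCoLC)1053626200"}]}},
    {"300": {"ind1": " ", "ind2": " ", "subfields": [{"a": "855 pages ;"}, {"c": "27 x 14 cm"}]}},
]

row = flatten_marc_record("04267ctm a2200769 i 4500", sample_fields)
print(sorted(row))
```

A list of such row dicts could then be turned into a table, for example with `pandas.DataFrame(rows)`. Whether repeated fields should be joined, kept as lists, or numbered into separate columns is a design choice this sketch does not settle, and indicators are dropped here for brevity.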

aaron-collier self-assigned this Jan 15, 2025