Biblical data, including translations, tagged original-language texts, Second Temple literature, early church writings, dictionaries, and cross-references.
Legend:
- 🏷️ Morphologically Tagged
- 🌲 Syntax Trees
- 💬 Discourse Analysis
- Aligned Bible Texts, automatic and/or manually corrected.
- Bible corpus - A multilingual parallel corpus created from translations of the Bible.
- Gratis Bible (OSIS XML)
- Open English Bible - A CC0 Bible translation.
- Parallel corpora from eBible.org (verse-per-line txt; source) - Made for use with NLP; not ideal for finding book/chapter/verse divisions.
- Unfolding Word Translations - Resources developed for Bible translators. See especially their Literal Translation and Simplified Translation.
- Zefania Bibles - A corpus of 140+ Bibles in 63 languages (and some English/German resources such as concordances). The Bibles are formatted in "Zefania XML". Some include Strong's tagging.
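
Verse-per-line corpora such as the eBible.org entry above carry no book/chapter/verse markers in the text itself, so translations are typically aligned by line index. The following is a minimal sketch of that idea, not the corpus's own tooling. The function name `align_verse_lines`, the reference labels, and the toy placeholder lines are all assumptions made for illustration; a real corpus would supply its own reference list and text files.

```python
from itertools import zip_longest


def align_verse_lines(refs, source_lines, target_lines):
    """Pair two verse-per-line texts by line index.

    `refs` is an assumed list of verse labels, one per line.
    Lines that are missing or blank in either text are skipped,
    since not every translation contains every verse.
    """
    aligned = []
    for ref, src, tgt in zip_longest(refs, source_lines, target_lines, fillvalue=""):
        ref, src, tgt = ref.strip(), src.strip(), tgt.strip()
        if ref and src and tgt:
            aligned.append((ref, src, tgt))
    return aligned


# Toy stand-ins for three lines of two translation files (not real Bible text).
refs = ["GEN 1:1", "GEN 1:2", "GEN 1:3"]
english = ["english verse one", "", "english verse three"]
german = ["deutscher Vers eins", "deutscher Vers zwei", "deutscher Vers drei"]

pairs = align_verse_lines(refs, english, german)
for ref, src, tgt in pairs:
    print(ref, "|", src, "|", tgt)
```

Skipping blank lines rather than shifting indices keeps the remaining pairs aligned, which is the property that makes this format convenient for NLP.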
OT
- ETCBC BHSa (TextFabric) 🌲 🏷️
- ETCBC BHSa - Hierarchical XML format 🌲 🏷️ - For those who prefer XML to TextFabric, in canonical and reordered versions
- Macula Hebrew 🌲 - One of the most developed datasets. Combines multiple sources, with clear provenance!
- MorphHB 🏷️ - Crowdsourced tagging of the OT
- Peshitta (TextFabric)
- Speaker Quotations for the whole Bible in various translations and the original languages. 💬
- STEPBible Data 🏷️ - One of the most developed datasets
LXX
- CCAT LXX in sqlite 🏷️
- LXX Codex Alexandrinus
- STEPBible Data 🏷️ - Appears to only be available upon request
- Swete's LXX Text from 1KY corrected 🏷️
NT
- Byzantine Majority Text 🏷️
- SBLGNT - Source data for the SBL GNT published by Logos.
- SBLGNT Tagged by MorphGNT 🏷️
- Levinsohn's Greek New Testament Discourse Features 💬
- Macula Greek 🌲 🏷️ - One of the most developed datasets. Combines multiple sources, with clear provenance!
- NA1904 Tagged by MorphGNT 🏷️
- OpenText Context Annotations 💬 🌲 - Extracted from the forthcoming OpenText 2.0 syntax data. These include pericopes, speaker turns (which map to the Speaker Quotations dataset above), moves within turns, and tokens/expressions for mapping to other datasets.
- PROIEL Treebanks (GNT, Vulgate, other NTs + more) 🌲
- SBLGNT and Nestle1904 with syntax trees by the Global Bible Initiative 🌲 🏷️
- Statistical Restoration GNT 🏷️ - An approach to constructing a critical NT based on the earliest evidence (formerly the Bunning Heuristic Prototype GNT)
- STEPBible Data 🏷️ - One of the most developed datasets!
- Syriac New Testament (TextFabric)
- Online-Critical-Pseudepigrapha - Multiple versions of the Pseudepigrapha, powering http://pseudepigrapha.com
- (English) https://github.com/scrollmapper/bible_databases_deuterocanonical
- (English/Hebrew) https://github.com/Sefaria/Sefaria-Export
- https://github.com/ETCBC/dss (TextFabric) 🏷️
- https://github.com/ETCBC/extrabiblical
- https://github.com/Sefaria/Sefaria-Export
- (English) Ante- and Post-Nicene Fathers (TEI XML)
- (Greek) Apostolic Fathers, hand-corrected.
- (Greek) Clement of Alexandria
- (Greek) Justin Martyr
- (Greek) Patristics (TextFabric)
- CCEL Reference Mappings - Converted to sqlite. Note: the original mappings are no longer available at the CCEL URL.
- Copenhagen Alliance Versification Mappings
- STEPBible TVTMS - Translators Versification Traditions with Methodology for Standardisation
- Abbott-Smith (NT)
- BDB (OT)
- Jeffrey Dodson's Greek Lexicon (NT)
- Koine Greek English Dictionary - CC0. An updated Strong's (likely NT Greek only).
- UBS Dictionary of Biblical Hebrew & Greek - CC-BY-SA. Extracted from the SDBH and SDGNT.
- Strong's (OT + NT)
- Koine Greek to Chinese Dictionary - CC0. Apparently a conversion of the "Koine Greek English Dictionary" to Chinese. Because it is Strong's-based, it likely covers only NT Greek. "This dictionary contains Chinese glosses where a gloss is available from the biblical text database."
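
The versification mappings listed above exist because verse numbering differs between traditions; for example, English Psalm 51:1 corresponds to Hebrew Psalm 51:3, since the Hebrew numbering counts the superscription. The sketch below shows one way a lookup over such a mapping might work. The table layout, the book codes, and the function name `to_hebrew_versification` are illustrative assumptions and do not reflect the format of any dataset in this list.

```python
# Hypothetical mapping table: (book, chapter, verse) in common English
# numbering -> the corresponding reference in Hebrew (MT) numbering.
# Only three well-known differences are included for illustration.
ENGLISH_TO_HEBREW = {
    ("PSA", 51, 1): ("PSA", 51, 3),
    ("JOL", 2, 28): ("JOL", 3, 1),
    ("MAL", 4, 1): ("MAL", 3, 19),
}


def to_hebrew_versification(ref):
    """Return the Hebrew-numbered reference, or `ref` unchanged if unmapped."""
    return ENGLISH_TO_HEBREW.get(ref, ref)


print(to_hebrew_versification(("PSA", 51, 1)))
print(to_hebrew_versification(("GEN", 1, 1)))
```

Falling back to the input reference reflects the fact that most verses share the same number in both traditions, so a mapping only needs to record the exceptions.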