13 datasets found

Individuals in the Holy Bible
kaggle.com
zip
Updated Dec 6, 2023
Cite
The Devastator (2023). Individuals in the Holy Bible [Dataset]. https://www.kaggle.com/datasets/thedevastator/individuals-in-the-holy-bible
Explore at:
zip(463563 bytes)Available download formats
Dataset updated
Dec 6, 2023
Authors
The Devastator
Description
Individuals in the Holy Bible

Biblical Individuals: Mentions, Verses, and Notes

By Brady Stephenson [source]

About this dataset

The Holy Bible, a revered text studied by students, scholars, critics, and the curious for centuries, encompasses a rich tapestry of stories featuring numerous individuals. The BibleData-PersonVerseTanakh dataset provides an extensive collection of information about these individuals mentioned in each chapter and verse across the entire Bible. It offers unique identifiers (corresponding to the BibleData-Person and BibleData-PersonLabel datasets) alongside valuable notes for study and verification purposes. Each individual's entry includes their specific label, denoting their distinct identification within the Bible's narrative.

One vital aspect of this dataset is the person_label_count column, which quantifies the frequency with which an individual is referenced throughout the entirety of the Bible. This numerical value presents an insightful metric to gauge significant figures or recurring characters present in biblical narratives.

Furthermore, the dataset also encompasses a wealth of detailed annotations provided under the person_verse_notes column. These notes offer additional contextual information related to each individual mentioned in respective verses throughout various chapters. Researchers and enthusiasts can delve into these annotations for deeper comprehension or clarification surrounding specific biblical characters.

For easier reference and cross-referencing purposes, an essential attribute is presented through the person_verse_sequence column. This field not only identifies chapter and verse references but consolidates them into concise textual representations aligned with each particular individual's mention within scripture.

The dataset covers every book from Genesis 1:1 through Malachi 4:6, which is the Old Testament (Tanakh) rather than both Testaments, as of its last update on June 24, 2023. It is presented as complete as of that date, but it remains open to edits or corrections if any discrepancies are identified.

Envisioned as a tool for both rigorous academic study and personal exploration, the data provided here complements the historical and spiritual significance of the Holy Bible.

How to use the dataset

Welcome to the comprehensive dataset of individuals mentioned in every chapter and verse of the Holy Bible. This guide will help you navigate and make the most of this valuable resource. Whether you are a student, scholar, critic, or just curious about the Bible, this dataset will provide you with unique identifiers and notes for study and verification purposes.

Understanding the Columns

This dataset contains several columns that provide important information about the individuals mentioned in the Bible. Here's a breakdown of each column:

person_label: This column contains a unique identifier for each individual mentioned in the Bible. It is a text-based label that can be used to reference specific individuals throughout your analysis.

person_label_count: This column indicates how many times each individual is mentioned in the Bible. It is an integer value that can help you understand their significance or prominence within biblical texts.

person_verse_sequence: This column provides chapter and verse references where each individual is mentioned in the Bible. The chapter and verse references are given as text entries, allowing you to easily locate specific instances where an individual appears.

person_verse_notes: This column includes any additional notes or information related to each individual mentioned in their corresponding verses. These notes can provide historical context, interpretation insights, or other relevant details to enrich your understanding of biblical characters.
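As a rough illustration of the column layout above, the following sketch loads a few rows and ranks individuals by person_label_count. The four column names come from the description; the sample rows, the labels `person_a` and `person_b`, the `ref_…` values, and the presence of a header row are placeholders and assumptions, not values taken from the dataset.

```python
import csv
import io

# Hypothetical sample in the four-column layout described above.
# Labels, counts, references, and notes are placeholders, not dataset values.
SAMPLE_CSV = """person_label,person_label_count,person_verse_sequence,person_verse_notes
person_a,3,ref_1,placeholder note
person_a,3,ref_2,
person_a,3,ref_3,
person_b,1,ref_4,placeholder note
"""


def load_rows(text):
    """Parse CSV text into dicts, converting the count column to int."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["person_label_count"] = int(row["person_label_count"])
    return rows


def rank_by_count(rows):
    """Return (label, count) pairs, most frequently mentioned first."""
    counts = {row["person_label"]: row["person_label_count"] for row in rows}
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


rows = load_rows(SAMPLE_CSV)
ranking = rank_by_count(rows)
print(ranking)
```

With the real file, `load_rows` would be given the text of the downloaded CSV instead of `SAMPLE_CSV`.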

Exploring and Analyzing the Dataset

To make full use of this dataset, consider incorporating these steps into your analysis:

Data Exploration: Start by exploring some summary statistics or descriptive measures using columns like person_label_count to understand overall patterns and frequencies concerning individuals' mentions.

Study Individual Characters: Pick specific individuals from the person_label column based on your research interest or personal curiosity about biblical figures who appear frequently (higher count) or less often (lower count). Use these unique identifiers to trace their journeys and roles in different chapters and verses.

**Inte...
Bible Corpus
kaggle.com
zip
Updated Jun 16, 2017
Cite
Oswin Rahadiyan Hartono (2017). Bible Corpus [Dataset]. https://www.kaggle.com/datasets/oswinrh/bible/code
Explore at:
zip(184125756 bytes)Available download formats
Dataset updated
Jun 16, 2017
Authors
Oswin Rahadiyan Hartono
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

The Bible (Biblia in Greek) is a collection of sacred texts or scriptures that Jews and Christians consider to be a product of divine inspiration and a record of the relationship between God and humans (Wiki). For data mining purposes, the Bible texts can be used for NLP, classification, sentiment analysis, and other topics at the intersection of data science and theology.

Content

Here you will find the following bible versions in sql, sqlite, xml, csv, and json format:

American Standard-ASV1901 (ASV)

Bible in Basic English (BBE)

Darby English Bible (DARBY)

King James Version (KJV)

Webster's Bible (WBT)

World English Bible (WEB)

Young's Literal Translation (YLT)

Each verse is accessed by a unique key, the combination of the BOOK+CHAPTER+VERSE id.

Example:

Genesis 1:1 (Genesis chapter 1, verse 1) = 01001001 (01 001 001)

Exodus 2:3 (Exodus chapter 2, verse 3) = 02002003 (02 002 003)

The verse-id system is used for faster, simplified queries.

For instance: 01001001 - 02001005 would capture all verses between Genesis 1:1 through Exodus 1:5.

Written simply:

SELECT * FROM bible.t_asv WHERE id BETWEEN 01001001 AND 02001005
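The verse-id scheme above can be sketched in a few lines. This is a minimal illustration, assuming the id is stored as an integer (so the leading zero is only a display convention) and using an in-memory SQLite table; the helper names and the `note` column are hypothetical, while the table name `t_asv` and the `id` column follow the query above.

```python
import sqlite3


def verse_id(book, chapter, verse):
    """Pack book, chapter, and verse numbers into one BBCCCVVV integer."""
    if not (1 <= book <= 99 and 1 <= chapter <= 999 and 1 <= verse <= 999):
        raise ValueError("book, chapter, or verse out of range")
    return book * 1_000_000 + chapter * 1_000 + verse


def split_verse_id(vid):
    """Unpack a verse id back into (book, chapter, verse)."""
    book, rest = divmod(vid, 1_000_000)
    chapter, verse = divmod(rest, 1_000)
    return book, chapter, verse


def format_verse_id(vid):
    """Render the id as the zero-padded 8-digit string used in the examples."""
    return f"{vid:08d}"


# In-memory stand-in for a verse table; the note text is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_asv (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany(
    "INSERT INTO t_asv VALUES (?, ?)",
    [
        (verse_id(1, 1, 1), "placeholder"),  # Genesis 1:1
        (verse_id(2, 1, 5), "placeholder"),  # Exodus 1:5
        (verse_id(2, 2, 3), "placeholder"),  # Exodus 2:3
    ],
)

# Range query equivalent to the BETWEEN example above.
selected = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM t_asv WHERE id BETWEEN ? AND ? ORDER BY id",
        (verse_id(1, 1, 1), verse_id(2, 1, 5)),
    )
]
print([format_verse_id(v) for v in selected])
```

Because the book number occupies the most significant digits, numeric order of the ids matches canonical reading order, which is what makes a plain BETWEEN range work.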

Coordinating Tables

There is also a number-to-book key (key_english table), a cross-reference list (cross_reference table), and a bible key containing meta information about the included translations (bible_version_key table). See the SQL table layout below. These tables work together to provide a solid basis for a bible-reading and cross-referencing app. In addition, each book is marked with a particular genre, mapped in the number-to-genre key (key_genre_english table), and common abbreviations for each book can be looked up in the abbreviations list (key_abbreviations_english table). While it's expected that your programs would use the verse-id system, book #, chapter #, and verse # columns have been included in the bible versions tables.

A Valuable Cross-Reference Table

A very special and valuable addition to these databases is the extensive cross-reference table. It was created from the project at http://www.openbible.info/labs/cross-references/. See the included .txt version from the http://www.openbible.info website. It's extremely useful in bible study for discovering related scriptures. For any given verse, you simply query vid (verse id), and a list of rows will be returned. Each of those rows has a rank (r) for relevance, a start verse (sv), and an end verse (ev) if there is one.
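A lookup against the cross-reference table might look like the sketch below. The column names vid, r, sv, and ev and the table name cross_reference come from the description; everything else is an assumption: the rows are made-up ids rather than real cross-references, a missing end verse is assumed to be stored as 0 or NULL, and results are simply sorted by ascending r because the description does not say whether a lower or higher rank means more relevant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cross_reference (vid INTEGER, r INTEGER, sv INTEGER, ev INTEGER)"
)
# Made-up rows for illustration only; these are not real cross-references.
conn.executemany(
    "INSERT INTO cross_reference VALUES (?, ?, ?, ?)",
    [
        (1001001, 1, 2001005, 0),        # single-verse reference, no end verse
        (1001001, 2, 2002003, 2002005),  # reference spanning a verse range
        (2001005, 1, 1001001, None),
    ],
)


def related_verses(conn, vid):
    """Return (rank, start_verse, end_verse_or_None) rows for one verse id."""
    cursor = conn.execute(
        "SELECT r, sv, ev FROM cross_reference WHERE vid = ? ORDER BY r",
        (vid,),
    )
    # Treat 0 and NULL alike as "no end verse" (an assumption about storage).
    return [(r, sv, ev if ev else None) for r, sv, ev in cursor]


refs = related_verses(conn, 1001001)
print(refs)
```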

Basic Web Interaction

The web folder contains two PHP files. Edit the first few lines of index.php to match your server's settings. Place these in a folder on your webserver. The references search box accepts multiple comma-separated values (e.g. John 3:16, Rom 3:23, 1 Jn 1:9, Romans 10:9-10). You can also link directly to a verse by altering the URI: http://localhost/index.php?b=John 3:16, Rom 3:23, 1 Jn 1:9, Romans 10:9-10

bible-mysql.sql (MySQL) is the main database and most feature-oriented due to contributions from developers. It is suggested you use that for most things, or at least convert the information from it.

cross_references-mysql.sql (MySQL) is the cross-reference table. It has been separated to become an optional feature. This is converted from the project at http://www.openbible.info/labs/cross-references/.

bible-sqlite.db (SQLite) is a basic simplified database for simpler applications (includes cross-references too).

cross_references.txt is the source cross-reference file obtained from http://www.openbible.info/labs/cross-references/

In the CSV folder, you will find (in the same order as the other formats):

bible_version_key.csv (image: http://i.imgur.com/S9JialN.png)

key_abbreviations_english.csv (image: http://i.imgur.com/v59SpQs.png)

key_english.csv (image: http://i.imgur.com/BbKMQgF.png)

key_genre_english.csv (image: http://i.imgur.com/lJVVW2C.png)

t_asv.csv, t_bbe.csv, t_dby.csv, t_wbt.csv, t_web.csv, t_ylt.csv (image: http://i.imgur.com/jJ4cf4q.png)

Acknowledgements

On behalf of the original contributors (GitHub)

Inspirations

WordNet as an additional semantic resource for NLP
BibleData
kaggle.com
zip
Updated Nov 16, 2025
Cite
Brady Stephenson (2025). BibleData [Dataset]. https://www.kaggle.com/datasets/bradystephenson/bibledata/code
Explore at:
zip(12088321 bytes)Available download formats
Dataset updated
Nov 16, 2025
Authors
Brady Stephenson
License
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0)https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Description
Context

Read by students, scholars, critics, and the curious for millennia, the Holy Bible is the most translated, most widely published, and most examined text in history. Unfortunately, the information in Scripture largely has remained unstructured and not easily parsed, examined, processed, or enriched with modern technology. This series of datasets is intended to make the information in the Bible accessible to these technologies.

Content

There are several files and, while some are complete (minus any corrections that are identified over time), many are still in development.

BibleData-Reference [complete]

Based on the Unified Scripture XML (USX) labels for each book, this dataset provides unique identifiers for each book, chapter, and verse in the 66 books of the Bible. The reference IDs in this dataset are used throughout other BibleData datasets. The dataset is complete (but open to corrections) as of 4/7/2021.

BibleData-Commandments [complete]

This dataset contains detailed information about the (traditionally enumerated) 613 commandments found in the Bible with chapter and verse references (keyed to the BibleData-Reference dataset) and related source texts in Hebrew, Greek, and English to facilitate individual verification and study. This dataset is complete (but open to corrections) as of 1/31/2021.

HebrewStrongs [complete]

A dataset of all the words used in the Hebrew Bible as organized by James Strong in his Hebrew and Chaldee Dictionary. Based on the work of Matthias Mueller (https://christthetruth.net/2013/07/15/strongs-goes-excel/), this data is organized by Strong's number, one entry per number. This dataset is complete (but open for corrections) as of 2/29/2020.

NavesTopicalDictionary [complete]

Naves 1897 Topical Dictionary in structured data format. This data is alphabetically organized, with one entry per topic. This dataset is complete (but open for corrections) as of 6/5/2020.

HitchcocksBibleNamesDictionary [complete]

Roswell D. Hitchcock's Bible Names Dictionary (1869) in a structured data format. This data is alphabetically organized, with one entry per name. This dataset is complete (but open for corrections) as of 6/5/2020.

The Alamo Polyglot [complete]

This dataset is the work of a Bible student in San Antonio, Texas, USA, seeking to integrate the plain-text (UTF-8) data of ancient Bible manuscripts into one source and make it freely available to others for their own studies. This dataset is complete (but open for additions or corrections!) as of 8/29/2020.

This single, parallel view of various texts and translations of Scripture includes:
- World English Bible [WEB]
- King James Version [KJV]
- Leningrad Codex [BHS]
- Jewish Publication Society [JPS 1917]
- Codex Alexandrinus
- Brenton's English Translation of Alexandrinus [BET]
- Samaritan Pentateuch [SP]
- Samaritan Pentateuch In English [SPE]
- Targum Onkelos
- Targum Onkelos in English [TOE]

BibleData-Book [in progress]

This dataset contains basic information about each of the 66 books in the Bible: book names in English, Hebrew, and Greek (along with transliterations and meanings of those names), chapter and verse counts, along with details of who wrote each book, when, and where (if known). This dataset is still in development as of 5/16/2020.

BibleData-Person [in progress]

Information about each named individual in the Bible with chapter and verse references (keyed to the BibleData-Reference dataset) to facilitate individual verification and study. This list serves as the foundation for the BibleData-PersonLabel (with English, Hebrew, and Greek labels including chapter and verse) and the BibleData-PersonRelationship (noting parental, marital, or other relationships between individuals in the Bible). This dataset is incomplete (only Genesis 1:1-Psalm 1:1) but still in development as of 12/18/2021.

BibleData-PersonLabel [in progress]

This dataset contains detailed information about the English, Hebrew, and Greek labels (proper names, titles, etc.) given to individuals mentioned in the Bible (keyed to the BibleData-Person dataset) along with their meanings and the chapter and verse references (keyed to the BibleData-Reference dataset) to facilitate individual verification and study. This dataset is incomplete (only Genesis 1:1-Psalm 1:1) but still in development as of 12/18/2021.

BibleData-PersonRelationship [in progress]

This dataset contains information about the relationships between individuals named in the Bible including unique identifiers for each person (keyed to the BibleData-Person dataset), the relationship type (father, son, mother, daughter, wife, husband, etc), and notes along with chapter and verse references (keyed to the BibleData-Reference dataset) to facilitate individual verification and...
Bible Person Verse Descriptions
kaggle.com
zip
Updated Dec 6, 2023
Cite
The Devastator (2023). Bible Person Verse Descriptions [Dataset]. https://www.kaggle.com/datasets/thedevastator/bible-person-verse-descriptions
Explore at:
zip(580291 bytes)Available download formats
Dataset updated
Dec 6, 2023
Authors
The Devastator
Description
Bible Person Verse Descriptions

Occurrences and details of individuals named in the Bible

By Brady Stephenson [source]

About this dataset

The Holy Bible, a revered and influential text cherished by students, scholars, critics, and the curious alike for centuries, holds an unparalleled significance in human history. This dataset named BibleData-PersonVerse offers comprehensive information regarding the multitude of individuals mentioned throughout its chapters and verses. By providing unique identifiers (which correspond to the BibleData-Person and BibleData-PersonLabel datasets) as well as detailed notes about these individuals, this dataset intends to facilitate individual verification and promote in-depth study.

As a work in progress aiming for completion (currently covering Genesis 1:1 through JAS 1:7), this dataset remains under constant development as of now.

Included within this dataset are columns such as person_label, which denotes a distinctive identifier assigned to each individual named in the Biblical verses. Meanwhile, person_label_count signifies the frequency of occurrences of an individual's name across various scripture verses. Moreover, person_verse_sequence represents the sequential numbering assigned to each verse where an individual is mentioned.

Additionally, researchers can benefit from person_verse_notes, offering supplementary information or relevant details pertaining to each person described within these verses.

By leveraging this rich repository of data on Biblical figures intertwined with their contextual references (aligned with the BibleData-Reference dataset), scholars and enthusiasts can embark on extensive research endeavors exploring diverse aspects of biblical characters throughout history.

In short, the BibleData-PersonVerse dataset is a valuable resource for studying the many individuals featured across the chapters and verses of the Holy Scriptures.

How to use the dataset

1. Understanding the Dataset

The dataset consists of several columns that provide specific information about each individual:

person_label: A unique identifier for the individual named in the Bible verse.

person_label_count: The count of occurrences of that individual's name in the Bible.

person_verse_sequence: The sequence number of the verse where the individual is mentioned.

person_verse_notes: Additional notes or information about that person mentioned in the verse.

2. Exploring Individuals by Name

To explore a specific character from the Holy Bible using this dataset:

a) Identify their unique identifier from column person_label.

b) Use their unique identifier to search for all verses where they are mentioned.

c) Explore their characteristics, roles, and relationships through their corresponding verses.

d) Refer to column person_verse_notes for any additional information related to that person.
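Steps a) through d) can be sketched as an index from person_label to that individual's rows. This is a minimal illustration under stated assumptions: the rows are placeholders rather than dataset values, and person_verse_sequence is treated as an integer that can be sorted, following the column description above.

```python
from collections import defaultdict

# Placeholder rows; labels, sequence numbers, and notes are invented.
ROWS = [
    {"person_label": "person_a", "person_verse_sequence": 2, "person_verse_notes": ""},
    {"person_label": "person_b", "person_verse_sequence": 1, "person_verse_notes": "placeholder note"},
    {"person_label": "person_a", "person_verse_sequence": 1, "person_verse_notes": "placeholder note"},
]


def index_by_label(rows):
    """Group rows by person_label, each group ordered by verse sequence."""
    index = defaultdict(list)
    for row in rows:
        index[row["person_label"]].append(row)
    for mentions in index.values():
        mentions.sort(key=lambda row: row["person_verse_sequence"])
    return dict(index)


def notes_for(index, label):
    """Collect the non-empty notes recorded for one individual."""
    return [
        row["person_verse_notes"]
        for row in index.get(label, [])
        if row["person_verse_notes"]
    ]


index = index_by_label(ROWS)
sequence_a = [row["person_verse_sequence"] for row in index["person_a"]]
print(sequence_a, notes_for(index, "person_a"))
```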

3. Analyzing Frequency of Individual Names

The column person_label_count provides valuable insights into how frequently an individual's name appears in different parts of Scripture.

a) Sort or filter based on person_label_count to identify individuals with high occurrences across various chapters and verses.

b) Analyze patterns within a certain book or section by filtering based on references (such as book names).

c) Compare frequencies between different individuals using their counts to gain a better understanding of their significance within biblical texts.

4. Cross-referencing with Other Datasets

This dataset is designed to work in conjunction with the related datasets BibleData-Person and BibleData-Reference.

a) Utilize the person_label identifier to connect data between different datasets for comprehensive analysis.

b) Explore referenced verses from this dataset by referring to the corresponding chapters, books, and verses in the BibleData-Reference dataset.

c) Combine information from various datasets to deepen your understanding of individual personalities, their relationships, genealogies, or historical context.
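Connecting this dataset to a related one amounts to a keyed join. The sketch below joins on person_label, as suggested in a); the second table's `display_name` field and all row values are hypothetical, since the column layout of BibleData-Person is not given here.

```python
# Placeholder rows standing in for two related datasets.
PERSON_VERSE_ROWS = [
    {"person_label": "person_a", "person_verse_sequence": 1},
    {"person_label": "person_b", "person_verse_sequence": 1},
    {"person_label": "person_x", "person_verse_sequence": 1},
]
PERSON_ROWS = [
    {"person_label": "person_a", "display_name": "placeholder name A"},
    {"person_label": "person_b", "display_name": "placeholder name B"},
]


def join_on(left_rows, right_rows, key):
    """Left-join two lists of dicts on a shared key.

    Returns (joined, unmatched): merged rows where a match exists, and the
    left rows that have no counterpart on the right.
    """
    lookup = {row[key]: row for row in right_rows}
    joined, unmatched = [], []
    for row in left_rows:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row)
        else:
            joined.append({**match, **row})
    return joined, unmatched


joined, unmatched = join_on(PERSON_VERSE_ROWS, PERSON_ROWS, "person_label")
print(len(joined), [row["person_label"] for row in unmatched])
```

Keeping the unmatched rows separate makes gaps visible, which matters when one of the datasets is still in progress.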

5. Contributing and Data Completeness

This dataset is a

Research Ideas

Analysis of Biblical figures: This dataset can be used to analyze and study the occurrences of different individuals mentioned in the Bible. By examining the count of occurrences, researchers can gain insights into the prominence and significance of certain individuals in various chapters and verses.

Studying relationship...
Swedes' outlook on life, religion and the bible 1984/1985
researchdata.se
Updated Oct 29, 2025
Cite
Thorleif Pettersson; Jörgen Straarup (2025). Swedes' outlook on life, religion and the bible 1984/1985 [Dataset]. http://doi.org/10.5878/z961-1576
Explore at:
(158501), (236426), (298302)Available download formats
Unique identifier
https://doi.org/10.5878/z961-1576
Dataset updated
Oct 29, 2025
Dataset provided by
Religionssociologiska institutet
Authors
Thorleif Pettersson; Jörgen Straarup
Time period covered
Apr 23, 1984 - May 23, 1984
Area covered
Sweden
Description
In September 1981 a new Swedish translation of the New Testament was published. The main purpose of this survey is to show the possession and use of the Bible among the Swedish population. Respondents were asked about their interest in issues concerning religion and outlook of life, if they believe in God and about their relation toward the Christian faith, how often they attend church and how often they pray. The major part of the questions addressed people who used to read the Bible. They were asked how and why they read the Bible and which Bible translation they use. Furthermore they were asked about their opinion on the new translation of the New Testament.

The study consists of

Interviews and a survey including persons between 16 - 74 years.

Telephone interviews with persons between 65-99 years.

Pilot study
Bible Verses from King James Version
kaggle.com
zip
Updated Mar 16, 2017
Cite
Brian Liao (2017). Bible Verses from King James Version [Dataset]. https://www.kaggle.com/phyred23/bibleverses
Explore at:
zip(1481194 bytes)Available download formats
Dataset updated
Mar 16, 2017
Authors
Brian Liao
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Context

On qxczv.pw, a 4chan styled board, an anon posted the whole Bible in King James Version. I chose to scrape it and format it into a Bible data set.

Content

Data is in CSV in the format: citation, book, chapter, verse, text. For example:
- citation: Genesis 1:1
- book: Genesis
- chapter: 1
- verse: 1
- text: "In the beginning God created the heaven and the earth. "
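Since the citation column repeats what the book, chapter, and verse columns hold, a quick consistency check is possible. The following sketch parses a citation string and compares it with the other columns; the regular expression, the function names, and the assumption that the file has a header row are mine, and the single sample row is the example given above.

```python
import csv
import io
import re

# Book name (may contain spaces or a leading number), then chapter:verse.
CITATION = re.compile(r"^(?P<book>.+?)\s+(?P<chapter>\d+):(?P<verse>\d+)$")


def parse_citation(citation):
    """Split a citation such as 'Genesis 1:1' into (book, chapter, verse)."""
    match = CITATION.match(citation.strip())
    if match is None:
        raise ValueError(f"unrecognised citation: {citation!r}")
    return match["book"], int(match["chapter"]), int(match["verse"])


def row_is_consistent(row):
    """Check that the citation agrees with the book/chapter/verse columns."""
    expected = (row["book"], int(row["chapter"]), int(row["verse"]))
    return parse_citation(row["citation"]) == expected


SAMPLE_CSV = (
    "citation,book,chapter,verse,text\n"
    'Genesis 1:1,Genesis,1,1,"In the beginning God created the heaven and the earth. "\n'
)
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(parse_citation(rows[0]["citation"]), row_is_consistent(rows[0]))
```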

Acknowledgements

I'd like to thank qxczv.pw, Andrew Palmer, Jessica Butterfield, Gary Handwerk, Brian Wurtz, and the whole Lake Washington High School. Papa Bless.

Inspiration

I am unsure what analysis can be done with this data set, but I think graphing word distributions or running natural language processing on it could be interesting. Send me a PM if you have any ideas.
The King James Bible
kaggle.com
zip
Updated Apr 27, 2023
Cite
K King (2023). The King James Bible [Dataset]. https://www.kaggle.com/datasets/kk99807/the-king-james-bible/code
Explore at:
zip(1423031 bytes)Available download formats
Dataset updated
Apr 27, 2023
Authors
K King
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About Dataset

The bible is considered a sacred book by Christians. It has a rich and varied history. The first section of the bible (the Old Testament) was originally written in Hebrew and tells the story of the Israelite people. It also includes books of religious law, poetry, and prophecy. The second section of the bible (the New Testament) was originally written in Greek and tells of the life of Jesus Christ and the development of the early Christian church.

In the 1600's, the Protestant Reformation drove translations of the bible into local languages. This particular translation was authorized in 1604 by King James I of England for use by the Church of England. It has since grown to become the most popular English version of the bible.

Inspiration

This would be a great dataset for Natural Language Processing (NLP) techniques. Aspects of the bible that would be interesting to identify and explore with NLP:

The Old Testament contains many examples of a Hebrew literary technique called parallelism.

There are interesting examples of chiasmus in the bible, where there is symmetry in the text that can even span chapters.

King Solomon, reported to be the wisest man on earth, authored the book of Proverbs. In the initial chapter, he references the "riddles" of the wise man. Are there riddles to be discovered here?
Bible Timeline - Acts
kaggle.com
zip
Updated Sep 4, 2021
Cite
Cristián Munizaga (2021). Bible Timeline - Acts [Dataset]. https://www.kaggle.com/cmunizaga/bible-timeline-acts
Explore at:
zip(1014 bytes)Available download formats
Dataset updated
Sep 4, 2021
Authors
Cristián Munizaga
Description
Context

The book of Acts of the Apostles covers a period of approximately 30 years, from about AD 30 to AD 62, based on traditionally accepted timeframes. This dataset provides data for timeline analysis.

Content

- Order: number of order of historical event
- Year_ AD: AD year of historical event
- Title: title of historical event
- Book: bible book of historical event (always Acts for this dataset)
- Chapter: chapter of the bible book that records the historical event
- Verse: verse where the historical event begins
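A time-based aggregation over these columns could look like the sketch below. The events, years, and chapter numbers are placeholders rather than rows from the dataset, and the year column is written as `Year_AD` here, which is an assumption about the exact header spelling.

```python
from collections import Counter

# Placeholder events in the column layout listed above; not real dataset rows.
EVENTS = [
    {"Order": 1, "Year_AD": 30, "Title": "placeholder event A", "Book": "Acts", "Chapter": 1, "Verse": 1},
    {"Order": 2, "Year_AD": 30, "Title": "placeholder event B", "Book": "Acts", "Chapter": 2, "Verse": 1},
    {"Order": 3, "Year_AD": 62, "Title": "placeholder event C", "Book": "Acts", "Chapter": 28, "Verse": 1},
]


def events_per_year(events):
    """Count how many events fall in each year."""
    return Counter(event["Year_AD"] for event in events)


def events_between(events, start_year, end_year):
    """Return titles of events within [start_year, end_year], in Order."""
    selected = [e for e in events if start_year <= e["Year_AD"] <= end_year]
    return [e["Title"] for e in sorted(selected, key=lambda e: e["Order"])]


per_year = events_per_year(EVENTS)
early = events_between(EVENTS, 30, 40)
print(dict(per_year), early)
```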

Acknowledgements

Bible Timeline © 2010 by Rich Valkanet, Discovery Bible and Biblos.com. All Dates are Approximate. Timeline based on traditionally accepted timeframes and general consensus of a variety of sources, including Wilmington's Guide to the Bible, A Survey of Israel's History (Wood), The Mysterious Numbers of the Hebrew Kings (Thiele), ESV Study Bible, The Treasury of Scripture Knowledge, International Standard Bible Encyclopedia, and Easton's Bible Dictionary.

Bible Timeline by Bible Hub https://biblehub.com/timeline/

Photo by Jordan Benton from Pexels https://www.pexels.com/photo/shallow-focus-of-clear-hourglass-1095601/

Inspiration

What if instead of limiting ourselves to analyzing the Bible by books, chapters and verses, we could also analyze it with aggregations based on time and chronology? Something like this would allow us to use a new type of cross-references based on time and not on words or quotes. This timeline dataset could add an additional frame of reference and be useful for crossing it with datasets to perform more complex analysis on the biblical text. Let's start with the book of Acts!
The World English Bible
kaggle.com
zip
Updated Feb 27, 2018
Cite
Kyubyong Park (2018). The World English Bible [Dataset]. https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset
Explore at:
zip(10558178926 bytes)Available download formats
Dataset updated
Feb 27, 2018
Authors
Kyubyong Park
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Context

The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its audio recordings are freely available at http://www.audiotreasure.com/. The only problem when you use those in speech-relevant tasks is that each file is too long. That's why I split each audio file such that an audio clip is equivalent to a verse. Subsequently I aligned them to the text.

Content

This dataset is composed of the following:
- README.md
- wav files sampled at 12,000 Hz (12 kHz)
- transcript.txt.

transcript.txt is in a tab-delimited format. The first column contains the audio file path, the second contains the script, and the rightmost column contains the duration of the audio file.
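Reading the transcript file can be sketched as follows. The file paths and texts are placeholders, the duration is assumed to be a number of seconds, and the parser takes the first, second, and last fields of each line so that it also tolerates any additional columns between the script and the duration.

```python
import csv
import io

# Placeholder lines in the tab-delimited layout described above.
SAMPLE_TRANSCRIPT = (
    "clips/a.wav\tplaceholder text one\t1.5\n"
    "clips/b.wav\tplaceholder text two\t2.25\n"
)


def read_transcript(lines):
    """Yield (path, script, duration) from tab-delimited transcript lines."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # First field is the path, second the script, last the duration.
        yield fields[0], fields[1], float(fields[-1])


entries = list(read_transcript(io.StringIO(SAMPLE_TRANSCRIPT)))
total_duration = sum(duration for _, _, duration in entries)
print(len(entries), total_duration)
```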

Acknowledgements

I would like to show my respect to Dave, the host of www.audiotreasure.com and the reader of the audio files.

Reference

You may want to check my project using this dataset at https://github.com/Kyubyong/tacotron.
The Bible and The Quran: Sentiment Analysis.
kaggle.com
zip
Updated Aug 1, 2024
Cite
Patrick L Ford (2024). The Bible and The Quran: Sentiment Analysis. [Dataset]. https://www.kaggle.com/datasets/patricklford/bible-and-quran-sentiment-analysis
Explore at:
zip(1713644 bytes)Available download formats
Dataset updated
Aug 1, 2024
Authors
Patrick L Ford
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Introduction

Text analysis, also known as text mining or natural language processing (NLP), is a branch of computer science and artificial intelligence that involves the extraction of useful information and knowledge from unstructured text data. It encompasses a wide range of techniques and applications, from sentiment analysis and topic modelling to information retrieval and machine translation.

Core Concepts in Text Analysis

Before delving into specific tools and techniques, it's essential to understand some fundamental concepts:
- Tokenization: The process of breaking down text into individual words or tokens.
- Stop word removal: Eliminating common words (like "the," "and," "of") that often carry little semantic value.
- Stemming and Lemmatization: Reducing words to their root form to improve analysis accuracy.
- Part-of-speech tagging: Identifying the grammatical role of words (noun, verb, adjective, etc.).
- Named entity recognition (NER): Recognising and classifying named entities (people, organisations, locations, etc.).
- Sentiment Analysis (Bing and NRC): Sentiment analysis aims to determine the emotional tone behind a piece of text. It's widely used in social media monitoring, customer feedback analysis, and market research.
- Bing Sentiment Analysis: Microsoft's Bing offers a sentiment analysis API that provides polarity scores (positive, negative, neutral) for text. It's relatively easy to use and integrates well with other Bing services. However, it might not be as granular as other options.
- NRC Sentiment Analysis: The National Research Council (NRC) lexicon is a widely used resource for sentiment analysis. It assigns multiple emotions (anger, fear, joy, sadness, surprise, disgust) to words, allowing for more nuanced analysis. It's often used as a baseline for comparison with other sentiment analysis methods.
- Word counts: A simple but informative metric that measures the frequency of words in a text. It can be used to identify keywords, identify the most common topics, and compare texts.
- Lexical diversity: This metric measures the variety of words used in a text. It can help assess the complexity and richness of language. Common measures include type-token ratio (TTR) and lexical density.
- Word clouds (also known as tag clouds): Word clouds are visual representations of text data where the size of each word corresponds to its frequency or importance. They are useful for quickly identifying prominent terms and themes in a text.
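A few of the concepts above (tokenization, stop word removal, word counts, and type-token ratio) can be illustrated with the standard library alone. This is a minimal sketch, not the method used to produce the dataset's visualisations: the tokenizer is a simple regular expression and the stop-word list is a tiny hand-picked set chosen for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real analyses use much larger lists.
STOP_WORDS = {"the", "and", "of", "in", "a", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]


def type_token_ratio(tokens):
    """Unique tokens divided by total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


text = "In the beginning God created the heaven and the earth."
tokens = tokenize(text)
content_words = remove_stop_words(tokens)
counts = Counter(tokens)
ttr = type_token_ratio(tokens)
print(content_words, counts.most_common(1), ttr)
```

Type-token ratio falls as texts get longer, so it is best compared between samples of similar length.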

Advanced Text Analysis Techniques

Beyond the basics, text analysis offers a wealth of sophisticated techniques:
- Topic modelling: Uncovers hidden thematic structures within a large collection of documents.
- Text classification: Categorises text into predefined categories (e.g., spam/not spam, news/sports).
- Named entity recognition (NER): Identifies and classifies named entities (people, organisations, locations, etc.).
- Relationship extraction: Discovers relationships between entities in text (e.g., "Apple acquired Beats").
- Machine translation: Translates text from one language to another.

Applications of Text Analysis

Text analysis has a wide range of applications across various industries:
- Social media monitoring: Analysing public sentiment, identifying trends, and tracking brand reputation.
- Customer service: Analysing customer feedback to improve products and services.
- Market research: Understanding customer preferences and market trends.
- Healthcare: Extracting information from medical records, literature, and patient reviews.
- Legal: Analysing legal documents for information extraction and discovery.

Challenges and Considerations

Text analysis is not without its challenges:
- Ambiguity: Natural language is inherently ambiguous, making it difficult for computers to interpret meaning accurately.
- Data quality: The quality of the text data can significantly impact the results of analysis.
- Computational resources: Some text analysis techniques, especially those involving deep learning, require significant computational power.

By understanding the core concepts and techniques of text analysis, we can harness the power of text data to extract valuable insights and drive decision-making.

Visualisations: Old_Testament_KJ_Bible.csv and Quran_english.csv

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F30ac54122eb49d5e712b5133316e9654%2FScreenshot%202024-08-04%2015.46.50.png?generation=1722784916593731&alt=media

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13231939%2F0701222f04c944c7b8064d20e9a6278b%2FScreenshot%202024-08-04%2015.47.45.png?generation=1722784617149634&alt=media

![](https://www.googleapis.com/downloa...
Sound Mind Bible Word Study
kaggle.com
zip
Updated Nov 20, 2022
Cite
JORGE GARCIA-INIGUEZ (2022). Sound Mind Bible Word Study [Dataset]. https://www.kaggle.com/datasets/jorgegarciainiguez/sound-mind-bible-word-study/code
Explore at:
Available download formats: zip (12276 bytes)
Dataset updated
Nov 20, 2022
Authors
JORGE GARCIA-INIGUEZ
Description
One of my favorite things to do is study the usage of Hebrew and Greek words throughout the Bible. I use study tools such as Strong's Concordance, Englishman's Hebrew and Greek Concordances, Thayer's Greek Lexicon, and Gesenius' Hebrew Lexicon to gather details about the word or words of study. I then analyze the usage across all passages in order to get a better understanding of the Word of God as well as derive any specific themes from the Bible.

This notebook explores the usage of two Greek word groups used to designate sound-mindedness in the New Testament.

G3524/5 - lit. "abstaining from wine"; sober, sober-minded; by metaphor, self-control, aware, watchful, in possession of one's faculties

G4993/8 - being of sound mind, in one's right mind

The Excel spreadsheet associated with this notebook is a collection of data acquired from the study tools mentioned above. I decided to try to use Pandas to analyze the word usage and provide various breakdowns to spot anomalies and deviations.

Here is a summary of the Excel data contents.

Group - Word group for the Greek word, derived from the first few characters of its Strong's number so that related Greek words are grouped together.

Word - Strong's number for the Greek word.

Passage - The book of the Bible where the word is found.

Translation - The translation of the word in the specific Version.

Version - The translation version such as King James Version (KJV), New International Version (NIV), Diaglott (DIAG), etc.

Comments - Miscellaneous comments about the passage usage of the word. For example, any marginal reference from the translators about the word.
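
The kind of Pandas breakdown described above might look like the following sketch. The rows are placeholders invented for illustration; they are not the spreadsheet's contents and should not be read as verified translation data. Taking the first four characters of the Strong's number as the group is also an assumption, since the description only says "first few characters".

```python
import pandas as pd

# Placeholder rows invented for this example, NOT the author's spreadsheet.
rows = [
    ("G3524", "1 Timothy", "vigilant", "KJV"),
    ("G3524", "1 Timothy", "temperate", "NIV"),
    ("G3525", "1 Peter", "sober", "KJV"),
    ("G4993", "Mark", "in his right mind", "KJV"),
    ("G4998", "Titus", "sober", "KJV"),
]
df = pd.DataFrame(rows, columns=["Word", "Passage", "Translation", "Version"])

# Derive the word group from the leading characters of the Strong's number.
# Four characters is an assumption made for this sketch.
df["Group"] = df["Word"].str[:4]

# How many occurrences fall into each word group.
usage_by_group = df.groupby("Group").size()

# How each version renders the words: one row per translation, one column per version.
translations = pd.crosstab(df["Translation"], df["Version"])

print(usage_by_group)
print(translations)
```

With the real spreadsheet, a call such as `pd.read_excel` on the downloaded file would replace the inline rows; the file name is not given in the description.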
Understanding the Douay-Rheims (complete) Bible.
kaggle.com
zip
Updated Jan 27, 2025
Cite
Patrick L Ford (2025). Understanding the Douay-Rheims (complete) Bible. [Dataset]. https://www.kaggle.com/datasets/patricklford/understanding-the-douay-rheims-complete-bible
Explore at:
Available download formats: zip (1801181 bytes)
Dataset updated
Jan 27, 2025
Authors
Patrick L Ford
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Understanding the Douay-Rheims Bible: A Sentiment Analysis

Introduction

Text analysis, also known as text mining or natural language processing (NLP), is a branch of computer science and artificial intelligence that involves the extraction of useful information and knowledge from unstructured text data. It encompasses a wide range of techniques and applications, from sentiment analysis and topic modelling to information retrieval and machine translation.

Core Concepts in Text Analysis

Before delving into specific tools and techniques, it's essential to understand some fundamental concepts:
- Tokenization: The process of breaking down text into individual words or tokens.
- Stop word removal: Eliminating common words (like "the," "and," "of") that often carry little semantic value.
- Stemming and Lemmatization: Reducing words to their root form to improve analysis accuracy.
- Part-of-speech tagging: Identifying the grammatical role of words (noun, verb, adjective, etc.).
- Named entity recognition (NER): Recognising and classifying named entities (people, organisations, locations, etc.).
- Sentiment Analysis (Bing and NRC): Sentiment analysis aims to determine the emotional tone behind a piece of text. It's widely used in social media monitoring, customer feedback analysis, and market research.
- Bing Sentiment Analysis: The Bing lexicon, compiled by Bing Liu and collaborators, classifies individual words as positive or negative. It's relatively easy to use. However, it is less granular than options that distinguish individual emotions.
- NRC Sentiment Analysis: The NRC Emotion Lexicon, compiled at the National Research Council (NRC) of Canada, is a widely used resource for sentiment analysis. It associates words with eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, trust) as well as positive and negative sentiment, allowing for more nuanced analysis. It's often used as a baseline for comparison with other sentiment analysis methods.
- Word counts: A simple but informative metric that measures the frequency of words in a text. It can be used to identify keywords, identify the most common topics, and compare texts.
- Lexical diversity: This metric measures the variety of words used in a text. It can help assess the complexity and richness of language. Common measures include type-token ratio (TTR) and lexical density.
- Word clouds (also known as tag clouds): Word clouds are visual representations of text data where the size of each word corresponds to its frequency or importance. They are useful for quickly identifying prominent terms and themes in a text.
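
As a rough sketch of how tokenization, stop word removal, and lexicon-based sentiment scoring fit together, consider the example below. The stop-word set and the polarity dictionary are tiny hypothetical stand-ins, not the Bing or NRC lexicons, and the sentence is made up for illustration.

```python
import re

# Tiny illustrative word lists. Real analyses use full stop-word lists and
# published lexicons; these entries are assumptions for the example only.
STOP_WORDS = {"the", "and", "of", "a", "is", "in", "to"}
POLARITY = {"joy": 1, "love": 1, "peace": 1, "fear": -1, "wrath": -1, "sorrow": -1}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop common words that carry little semantic value."""
    return [t for t in tokens if t not in STOP_WORDS]


def polarity_score(tokens):
    """Sum of +1/-1 word scores; words missing from the lexicon count as 0."""
    return sum(POLARITY.get(t, 0) for t in tokens)


sentence = "The fear of the storm turned to joy and peace."
content = remove_stop_words(tokenize(sentence))

print(content)                  # tokens left after stop word removal
print(polarity_score(content))  # positive total means net positive wording
```

A word-level lexicon ignores negation and context, which is one reason such scores are usually treated as a baseline.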

Advanced Text Analysis Techniques

Beyond the basics, text analysis offers a wealth of sophisticated techniques:
- Topic modelling: Uncovers hidden thematic structures within a large collection of documents.
- Text classification: Categorises text into predefined categories (e.g., spam/not spam, news/sports).
- Named entity recognition (NER): Identifies and classifies named entities (people, organisations, locations, etc.).
- Relationship extraction: Discovers relationships between entities in text (e.g., "Apple acquired Beats").
- Machine translation: Translates text from one language to another.
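
To make the text classification item concrete, here is a minimal sketch of one common approach, a multinomial naive Bayes classifier with add-one smoothing, written with the standard library only. The training sentences and labels are invented for the example, and nothing here is claimed to be the approach used for the dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability under naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing the probability.
            seen = word_counts[label][word]
            score += math.log((seen + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training examples for illustration only.
training = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting notes for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]
model = train(training)

print(classify("claim your free prize", model))
print(classify("notes from the team meeting", model))
```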

Applications of Text Analysis

Text analysis has a wide range of applications across various industries:
- Social media monitoring: Analysing public sentiment, identifying trends, and tracking brand reputation.
- Customer service: Analysing customer feedback to improve products and services.
- Market research: Understanding customer preferences and market trends.
- Healthcare: Extracting information from medical records, literature, and patient reviews.
- Legal: Analysing legal documents for information extraction and discovery.

Challenges and Considerations

Text analysis is not without its challenges:
- Ambiguity: Natural language is inherently ambiguous, making it difficult for computers to interpret meaning accurately.
- Data quality: The quality of the text data can significantly impact the results of analysis.
- Computational resources: Some text analysis techniques, especially those involving deep learning, require significant computational power.

By understanding the core concepts and techniques of text analysis, we can harness the power of text data to extract valuable insights and drive decision-making.

The Douay-Rheims Bible

The Douay-Rheims Bible, an English translation of the Latin Vulgate, has been a cornerstone of Catholic tradition for centuries. Originally translated by members of the English College at Douay and Rheims in the late 16th century, this Bible reflects the Catholic Church's emphasis on accuracy and reverence in conveying the Word of God. Its initial publication occurred in two stages: the New Testament in 1582 and the Old Tes...
Billy Graham's Crusade Notebooks
kaggle.com
zip
Updated Feb 11, 2021
Cite
Siddhartha Gupta (2021). Billy Graham's Crusade Notebooks [Dataset]. https://www.kaggle.com/sidddhero97/billy-grahams-notebook
Explore at:
Available download formats: zip (314866 bytes)
Dataset updated
Feb 11, 2021
Authors
Siddhartha Gupta
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Billy Graham is one of the most famous televangelists of all time. Once interested merely in picking up women, he became one of the most hardworking and popular evangelists the world has ever seen. He held his tours, which he called "crusades", all over the world.

Content

This data might just be scratching the surface, but it shows that Billy Graham toured close to 610 cities in his lifetime.

Acknowledgements

Wheaton College holds a repository of his works (Collection 265): https://www2.wheaton.edu/bgc/archives/guides/265.htm

Inspiration

I wish to explore questions such as: How much of the Bible did Billy Graham actually cite? Did he prefer particular books of the Bible, or any other text he referred to often? Which countries and cities did he tour? Did his preferences for cities and countries change over time?
Cite

The Devastator (2023). Individuals in the Holy Bible [Dataset]. https://www.kaggle.com/datasets/thedevastator/individuals-in-the-holy-bible

Individuals in the Holy Bible

Biblical Individuals: Mentions, Verses, and Notes

Explore at:

Available download formats: zip (463563 bytes)

Dataset updated

Dec 6, 2023

Authors

The Devastator

Description

By Brady Stephenson [source]

About this dataset

The Holy Bible, a revered text studied by students, scholars, critics, and the curious for centuries, encompasses a rich tapestry of stories featuring numerous individuals. The BibleData-PersonVerseTanakh dataset provides an extensive collection of information about these individuals mentioned in each chapter and verse across the entire Bible. It offers unique identifiers (corresponding to the BibleData-Person and BibleData-PersonLabel datasets) alongside valuable notes for study and verification purposes. Each individual's entry includes their specific label, denoting their distinct identification within the Bible's narrative.

One vital aspect of this dataset is the person_label_count column, which quantifies the frequency with which an individual is referenced throughout the entirety of the Bible. This numerical value presents an insightful metric to gauge significant figures or recurring characters present in biblical narratives.

Furthermore, the dataset also encompasses a wealth of detailed annotations provided under the person_verse_notes column. These notes offer additional contextual information related to each individual mentioned in respective verses throughout various chapters. Researchers and enthusiasts can delve into these annotations for deeper comprehension or clarification surrounding specific biblical characters.

For easier reference and cross-referencing purposes, an essential attribute is presented through the person_verse_sequence column. This field not only identifies chapter and verse references but consolidates them into concise textual representations aligned with each particular individual's mention within scripture.

As of its last update on June 24, 2023, the dataset covers every book from Genesis 1:1 through Malachi 4:6, the range that the Tanakh in the dataset's name refers to. It aims to capture every individual identified in those texts up to that date, and it remains open to edits or corrections if any discrepancies are found.

Envisioned as a tool for rigorous academic study and personal exploration alike, the data complements the historical and spiritual significance of the Holy Bible.

How to use the dataset

Welcome to the comprehensive dataset of individuals mentioned in every chapter and verse of the Holy Bible. This guide will help you navigate and make the most of this valuable resource. Whether you are a student, scholar, critic, or just curious about the Bible, this dataset will provide you with unique identifiers and notes for study and verification purposes.

Understanding the Columns

This dataset contains several columns that provide important information about the individuals mentioned in the Bible. Here's a breakdown of each column:

person_label: This column contains a unique identifier for each individual mentioned in the Bible. It is a text-based label that can be used to reference specific individuals throughout your analysis.

person_label_count: This column indicates how many times each individual is mentioned in the Bible. It is an integer value that can help you understand their significance or prominence within biblical texts.

person_verse_sequence: This column provides chapter and verse references where each individual is mentioned in the Bible. The chapter and verse references are given as text entries, allowing you to easily locate specific instances where an individual appears.

person_verse_notes: This column includes any additional notes or information related to each individual mentioned in their corresponding verses. These notes can provide historical context, interpretation insights, or other relevant details to enrich your understanding of biblical characters.

Exploring and Analyzing the Dataset

To make full use of this dataset, consider incorporating these steps into your analysis:

Data Exploration: Start by exploring some summary statistics or descriptive measures using columns like person_label_count to understand overall patterns and frequencies concerning individuals' mentions.

Study Individual Characters: Pick specific individuals from the person_label column based on your research interest or personal curiosity about biblical figures who appear frequently (higher count) or less often (lower count). Use these unique identifiers to trace their journeys and roles in different chapters and verses.

**Inte...
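
The data exploration and character selection steps described above could be sketched as follows, using only the Python standard library. The rows are invented placeholders in the column layout described above, and the sketch assumes one row per individual; the real file's values, formatting, and row granularity may differ.

```python
import csv
import io
import statistics

# Invented placeholder rows in the described column layout. These are not
# values from the dataset, and the real formatting may differ.
SAMPLE_CSV = """person_label,person_label_count,person_verse_sequence,person_verse_notes
Person_A,12,1,first mention
Person_B,3,1,
Person_C,45,2,recurring figure
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Mention count per individual, assuming one row per person_label.
counts = {row["person_label"]: int(row["person_label_count"]) for row in rows}

# Summary statistics over the mention counts.
print(statistics.mean(counts.values()), max(counts.values()))

# Individuals ranked from most to least mentioned.
ranked = sorted(counts, key=counts.get, reverse=True)
print(ranked)

# Rows that carry a note, as a starting point for closer study.
with_notes = [row["person_label"] for row in rows if row["person_verse_notes"]]
print(with_notes)
```

To run this against the downloaded file, the `io.StringIO` wrapper would be replaced by an opened file handle.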
