100+ datasets found
  1. Gratis, OH Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Gratis, OH Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b235d8fd-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Gratis
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Gratis by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Gratis across both sexes and to determine which sex constitutes the majority.

    Key observations

    Females form a slight majority, accounting for 50.0% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Gratis.
    • % of Total Population: This column displays each gender's share of the total population of Gratis. Please note that the percentages may not add up to exactly 100% due to rounding of values.
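As a rough illustration of how the percentage column relates to the population column, the sketch below derives each group's share from its count. The counts are hypothetical placeholders, not values from the Gratis dataset, and `percent_of_total` is a name made up for this example.

```python
def percent_of_total(population_by_group):
    """Return each group's share of the total population, in percent (2 decimals)."""
    total = sum(population_by_group.values())
    if total == 0:
        raise ValueError("total population is zero")
    return {
        group: round(100 * count / total, 2)
        for group, count in population_by_group.items()
    }


# Hypothetical counts, used only to exercise the function.
sample_counts = {"Male": 437, "Female": 438}
shares = percent_of_total(sample_counts)
majority = max(shares, key=shares.get)  # group with the largest share
```

Because each share is rounded independently, the rounded values are not guaranteed to sum to exactly 100 for every input, which is the caveat the column description mentions.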

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Gratis Population by Race & Ethnicity.

  2. Reddit: /r/travel

    • kaggle.com
    zip
    Updated Dec 18, 2022
    Cite
    The Devastator (2022). Reddit: /r/travel [Dataset]. https://www.kaggle.com/datasets/thedevastator/uncovering-travel-experiences-desires-and-opinio
    Available download formats: zip (369897 bytes)
    Dataset updated
    Dec 18, 2022
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Reddit: /r/travel

    An Exploration of Users & Posts

    By Reddit [source]

    About this dataset

    Traveling can be an incredibly exciting and rewarding experience; it is the perfect way to break away from the everyday routine and explore new cultures, sights, and sounds. For those planning a travel-related adventure – whether international or local – having access to real-user experiences in the form of advice and recommendations can mean the difference between a fantastic journey and a costly mistake. That's why this dataset of Reddit posts history on 'travel' is particularly useful for exploring Reddit users' opinions, desires, and experiences with their travel endeavors.

    This dataset contains information on more than 750 Reddit posts about travel, as well as thousands of related comments collected over an extended period of time. For every post, the data includes the title, score (number of upvotes), URL, number of comments, and creation date and time.

    Taken together, these attributes provide insight into user sentiment about various aspects of travel: What topics are users most interested in? What do they think are the best (or worst) destinations? Are there any tips or pitfalls that could inform our own decisions when planning the next journey? Analyzing this information can support better-informed decisions during trip planning.

    How to use the dataset

    This dataset provides insight into the opinions, desires, and experiences of Redditors regarding travel-related activities. The data consists of posts and comments collected from the 'travel' subreddit. To get started, note that each post includes fields such as title, score, ID, URL, number of comments, and creation timestamp. These can be used to understand the kinds of conversations happening in the forum about travel-related topics.

    Research Ideas

    • Analyzing user sentiment around various topics in the travel industry such as airlines, hotels, attractions and experiences.
    • Comparing time of year to the frequency of posts related to summer vacation or other holiday specific activities.
    • Examining which geographical locations generate the most interest among Redditors, and applying this data to marketing campaigns for those areas

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: travel.csv

    | Column name | Description |
    |:--------------|:--------------------------------------------------------|
    | title | The title of the post. (String) |
    | score | The number of upvotes the post has received. (Integer) |
    | url | The URL of the post. (String) |
    | comms_num | The number of comments the post has received. (Integer) |
    | created | The date and time the post was created. (DateTime) |
    | body | The body of the post. (String) |
    | timestamp | The date and time the post was last updated. (DateTime) |
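A minimal sketch of reading a file with this column layout and ranking posts by score might look like the following. The sample rows are invented for illustration and do not come from travel.csv; the helper names `load_posts` and `top_posts` are likewise hypothetical, and the date columns are left as plain strings because their exact format is not stated in the listing.

```python
import csv
import io

# Hypothetical rows that follow the documented column layout;
# the real travel.csv is not reproduced here.
SAMPLE_CSV = """title,score,url,comms_num,created,body,timestamp
Packing tips for a week in Japan,120,https://example.com/a,45,2022-12-02 18:00:00,Some text,2022-12-02 18:00:00
Is travel insurance worth it?,87,https://example.com/b,60,2022-12-03 21:46:40,Some text,2022-12-03 21:46:40
"""


def load_posts(handle):
    """Read posts from a CSV file object and convert the integer columns."""
    posts = []
    for row in csv.DictReader(handle):
        row["score"] = int(row["score"])
        row["comms_num"] = int(row["comms_num"])
        posts.append(row)
    return posts


def top_posts(posts, n=1):
    """Return the n highest-scoring posts."""
    return sorted(posts, key=lambda post: post["score"], reverse=True)[:n]


posts = load_posts(io.StringIO(SAMPLE_CSV))
best = top_posts(posts)
```

To run the same functions against a downloaded copy, the `io.StringIO` object could be swapped for a file opened with `open("travel.csv", newline="", encoding="utf-8")`, assuming the file sits in the working directory.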

    Acknowledgements

    If you use this dataset in your research, please credit Reddit.

  3. Job Offers Web Scraping Search

    • kaggle.com
    zip
    Updated Feb 11, 2023
    Cite
    The Devastator (2023). Job Offers Web Scraping Search [Dataset]. https://www.kaggle.com/datasets/thedevastator/job-offers-web-scraping-search
    Available download formats: zip (5322 bytes)
    Dataset updated
    Feb 11, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Job Offers Web Scraping Search

    Targeted Results to Find the Optimal Work Solution

    By [source]

    About this dataset

    This dataset collects job offers obtained through web scraping and filtered by specific keywords, locations, and times. This data gives users rich and precise search capabilities to uncover the best working solution for them. With the information collected, users can explore options that match their personal situation, skillset, and preferences in terms of location and schedule. The columns provide detailed information about job titles, employer names, locations, time frames, and other necessary parameters so you can make a smart choice for your next career opportunity.

    How to use the dataset

    This dataset is a great resource for those looking to find an optimal work solution based on keywords, location and time parameters. With this information, users can quickly and easily search through job offers that best fit their needs. Here are some tips on how to use this dataset to its fullest potential:

    • Start by identifying what type of job offer you want to find. The keyword column will help you narrow down your search by allowing you to search for job postings that contain the word or phrase you are looking for.

    • Next, consider where the job is located. The Location column tells you where each posting is from, so make sure it is somewhere that suits your needs.

    • Finally, consider when the position is available. The Time frame column indicates when each posting was made and whether the role is full-time, part-time, or casual/temporary, so check that it meets your requirements before applying.

    • Additionally, if details such as hours per week or further schedule information are important criteria, see the Horari and Temps_Oferta columns. Once all three criteria (keywords, location, and time frame) have been checked, look at the Empresa (Company Name) and Nom_Oferta (Post Name) columns to get an idea of who the employer would be.

      All these pieces of data put together should give any motivated individual what they need to seek out an optimal work solution. Keep hunting, and good luck!

    Research Ideas

    • Machine learning can be used to group job offers in order to identify similarities and differences between them. This could allow users to target their search for a work solution more specifically.
    • The data can be used to compare job offerings across different areas or types of jobs, enabling users to make better-informed decisions about their career options and goals.
    • It may also provide insight into the local job market, enabling companies and employers to identify potential new opportunities or trends that may previously have gone unnoticed.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: web_scraping_information_offers.csv

    | Column name | Description |
    |:-----------------|:------------------------------------|
    | Nom_Oferta | Name of the job offer. (String) |
    | Empresa | Company offering the job. (String) |
    | Ubicació | Location of the job offer. (String) |
    | Temps_Oferta | Time of the job offer. (String) |
    | Horari | Schedule of the job offer. (String) |


  4. College Springs, IA Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). College Springs, IA Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b2297cea-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    College Springs
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of College Springs by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of College Springs across both sexes and to determine which sex constitutes the majority.

    Key observations

    Males form the majority, accounting for 56.68% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in College Springs.
    • % of Total Population: This column displays each gender's share of the total population of College Springs. Please note that the percentages may not add up to exactly 100% due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for College Springs Population by Race & Ethnicity.

  5. Our415 Events and Activities

    • data.sfgov.org
    • s.cnmilf.com
    • +1 more
    Updated Nov 27, 2025
    + more versions
    Cite
    (2025). Our415 Events and Activities [Dataset]. https://data.sfgov.org/Economy-and-Community/Our415-Events-and-Activities/8i3s-ih2a
    Available download formats: xlsx, csv, kmz, application/geo+json, kml, xml
    Dataset updated
    Nov 27, 2025
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    A. SUMMARY San Francisco offers numerous events and activities tailored for children, youth, and families. However, finding and navigating the disparate sources of information can be a major challenge. Our415.org seeks to simplify this by consolidating all relevant details, ensuring that families can easily find what they need, when they need it. It also encourages discovery of new interests and things to do. This dataset compiles current and upcoming events and activities in San Francisco for children, youth, and their families.

    B. HOW THE DATASET IS CREATED This dataset is a consolidation of multiple datasets from contributing City agencies and departments as well as Community Based Organizations. Currently, the information in the dataset is sourced from Rec Park’s activities catalog, SF Public Library’s events calendar, Department of Early Childhood’s family events calendar, and Support for Families' family events calendar. Rec Park activities include any “Open” activities appropriate for ages 0-24, and SF Public Library, Department of Early Childhood, and Support for Families events include events going into the next month.

    C. UPDATE PROCESS The dataset will be updated on a daily basis, reflecting changes to the source data.

    D. HOW TO USE THIS DATASET Taxonomy related fields and eligibility fields are either AI-determined or assigned through a DCYF-created crosswalk. These values are determined for the purposes of categorization and search functionality on Our415.org. Use with caution - errors may exist.

  6. Daily Social Media Active Users

    • kaggle.com
    zip
    Updated May 5, 2025
    Cite
    Shaik Barood Mohammed Umar Adnaan Faiz (2025). Daily Social Media Active Users [Dataset]. https://www.kaggle.com/datasets/umeradnaan/daily-social-media-active-users
    Available download formats: zip (126814 bytes)
    Dataset updated
    May 5, 2025
    Authors
    Shaik Barood Mohammed Umar Adnaan Faiz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The "Daily Social Media Active Users" dataset provides a comprehensive and dynamic look into the digital presence and activity of global users across major social media platforms. The data was generated to simulate real-world usage patterns for 13 popular platforms, including Facebook, YouTube, WhatsApp, Instagram, WeChat, TikTok, Telegram, Snapchat, X (formerly Twitter), Pinterest, Reddit, Threads, LinkedIn, and Quora. This dataset contains 10,000 rows and includes several key fields that offer insights into user demographics, engagement, and usage habits.

    Dataset Breakdown:

    • Platform: The name of the social media platform where the user activity is tracked. It includes globally recognized platforms, such as Facebook, YouTube, and TikTok, that are known for their large, active user bases.

    • Owner: The company or entity that owns and operates the platform. Examples include Meta for Facebook, Instagram, and WhatsApp, Google for YouTube, and ByteDance for TikTok.

    • Primary Usage: This category identifies the primary function of each platform. Social media platforms differ in their primary usage, whether it's for social networking, messaging, multimedia sharing, professional networking, or more.

    • Country: The geographical region where the user is located. The dataset simulates global coverage, showcasing users from diverse locations and regions. It helps in understanding how user behavior varies across different countries.

    • Daily Time Spent (min): This field tracks how much time a user spends on a given platform on a daily basis, expressed in minutes. Time spent data is critical for understanding user engagement levels and the popularity of specific platforms.

    • Verified Account: Indicates whether the user has a verified account. This feature mimics real-world patterns where verified users (often public figures, businesses, or influencers) have enhanced status on social media platforms.

    • Date Joined: The date when the user registered or started using the platform. This data simulates user account history and can provide insights into user retention trends or platform growth over time.
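To make the field list concrete, here is a small sketch that averages daily time spent per platform. It assumes each row is available as a dictionary keyed by the field names listed above; the three records are hypothetical stand-ins rather than rows from the dataset, and `average_minutes_by_platform` is a name chosen for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records keyed by the field names from the dataset breakdown.
records = [
    {"Platform": "Facebook", "Daily Time Spent (min)": 40, "Verified Account": False},
    {"Platform": "Facebook", "Daily Time Spent (min)": 60, "Verified Account": True},
    {"Platform": "TikTok", "Daily Time Spent (min)": 90, "Verified Account": False},
]


def average_minutes_by_platform(rows):
    """Group rows by platform and average the daily minutes within each group."""
    minutes = defaultdict(list)
    for row in rows:
        minutes[row["Platform"]].append(row["Daily Time Spent (min)"])
    return {platform: mean(values) for platform, values in minutes.items()}


averages = average_minutes_by_platform(records)
```

The same grouping pattern could be keyed on Country or Verified Account instead of Platform to compare engagement across those dimensions.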

    Context and Use Cases:

    This synthetic dataset is designed to offer a privacy-friendly alternative for analytics, research, and machine learning purposes. Given the complexities and privacy concerns around using real user data, especially in the context of social media, this dataset offers a clean and secure way to develop, test, and fine-tune applications, models, and algorithms without the risks of handling sensitive or personal information.

    Researchers, data scientists, and developers can use this dataset to:

    • Model User Behavior: By analyzing patterns in daily time spent, verified status, and country of origin, users can model and predict social media engagement behavior.

    • Test Analytics Tools: Social media monitoring and analytics platforms can use this dataset to simulate user activity and optimize their tools for engagement tracking, reporting, and visualization.

    • Train Machine Learning Algorithms: The dataset can be used to train models for various tasks like user segmentation, recommendation systems, or churn prediction based on engagement metrics.

    • Create Dashboards: This dataset can serve as the foundation for creating user-friendly dashboards that visualize user trends, platform comparisons, and engagement patterns across the globe.

    • Conduct Market Research: Business intelligence teams can use the data to understand how various demographics use social media, offering valuable insights into the most engaged regions, platform preferences, and usage behaviors.

    • Sources of Inspiration: This dataset is inspired by public data from industry reports, such as those from Statista, DataReportal, and other market research platforms. These sources provide insights into the global user base and usage statistics of popular social media platforms. The synthetic nature of this dataset allows for the use of realistic engagement metrics without violating any privacy concerns, making it an ideal tool for educational, analytical, and research purposes.

    The structure and design of the dataset are based on real-world usage patterns and aim to represent a variety of users from different backgrounds, countries, and activity levels. This diversity makes it an ideal candidate for testing data-driven solutions and exploring social media trends.

    Future Considerations:

    As the social media landscape continues to evolve, this dataset can be updated or extended to include new platforms, engagement metrics, or user behaviors. Future iterations may incorporate features like post frequency, follower counts, engagement rates (likes, comments, shares), or even sentiment analysis from user-generated content.

    By leveraging this dataset, analysts and data scientists can create better, more effective strategies ...

  7. Bangladesh Number Dataset

    • listtodata.com
    .csv, .xls, .txt
    Updated Jul 17, 2025
    Cite
    List to Data (2025). Bangladesh Number Dataset [Dataset]. https://listtodata.com/bangladesh-dataset
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Jul 17, 2025
    Authors
    List to Data
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Area covered
    Bangladesh
    Variables measured
    phone numbers, Email Address, full name, Address, City, State, gender, age, income, ip address
    Description

    Bangladesh number dataset provides contact information from trusted sources. We only collect phone numbers from reliable sources and define this information. To ensure transparency, we also provide the source URL to show where the information was collected from. In addition, we offer 24/7 support. If you have a question or need help, we’re always here. However, we care about accuracy, so we carefully collect the Bangladesh number dataset from trusted sources. You may rely on this data for business or personal use. With customer support, you’ll never have to wait when you need help or more information. We use opt-in data to respect privacy. This way, we contact only people who want to hear from you.

    Bangladesh phone data gives you access to contacts in Bangladesh. Here you can filter information by gender, age, and relationship status. This makes it easy to find exactly the people you want to connect with. We define this data by ensuring it follows all GDPR rules to keep it safe and legal. Our system works hard to remove any invalid data so you get only accurate and valid numbers. List to Data is a helpful website for finding important phone numbers quickly. Also, our Bangladesh phone data is suitable for doing business targeting specific groups. You can easily filter your list to focus on specific types of customers. Since we remove invalid data regularly, you don’t have to deal with old or useless numbers. We assure you that all data follows strict GDPR rules, so you can use it without any problems.

    Bangladesh phone number list is a collection of phone numbers from people in Bangladesh. We define this list by providing 100% correct and valid phone numbers that are ready to use. Also, we offer a replacement guarantee if you ever receive an invalid number. This means you will always have accurate data. We collect phone numbers that we provide based on customer’s permission.

    Moreover, we work hard to provide the best Bangladesh phone number list for businesses and personal use. We gather data correctly, so you won’t have to worry about getting outdated or incorrect information. Our replacement guarantee means you’ll always have valid numbers, so you can relax and feel confident.

  8. Department of Community Resources & Services Online Data Sources

    • opendata.howardcountymd.gov
    • data.wu.ac.at
    csv, xlsx, xml
    Updated Oct 28, 2019
    Cite
    Department of Community Resources & Services (2019). Department of Community Resources & Services Online Data Sources [Dataset]. https://opendata.howardcountymd.gov/w/kdeq-r7qc/j72c-n6z5?cur=LdI0ncE4AfX&from=n10jJ2BVdMM
    Available download formats: xml, csv, xlsx
    Dataset updated
    Oct 28, 2019
    Dataset authored and provided by
    Department of Community Resources & Services
    Description

    This dataset lists various data sources used within the Department of Community Resources & Services for various internal and external reports. It allows individuals and organizations to identify the type of data they are looking for and the geographical level at which they need it (i.e., National, State, County, etc.). This dataset will be updated every quarter and should be utilized for research purposes.

  9. Data Set Knowledge Graph (DSKG)

    • zenodo.org
    • nde-dev.biothings.io
    bin, xml
    Updated Feb 18, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Michael Färber; David Lamprecht (2021). Data Set Knowledge Graph (DSKG) [Dataset]. http://doi.org/10.5281/zenodo.4478921
    Available download formats: bin, xml
    Dataset updated
    Feb 18, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michael Färber; David Lamprecht
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present the Data Set Knowledge Graph (DSKG.org), an RDF dataset about datasets that are linked to publications (modeled in the Microsoft Academic Knowledge Graph, MAKG) that mention the datasets. The metadata of the datasets is based on datasets that are registered in OpenAIRE and Wikidata.

    What exactly do we provide?

    1. Periodically updated RDF dump files of the Data Set Knowledge Graph.
    2. URI resolution of the Data Set Knowledge Graph within the Linked Open Data.
    3. A publicly accessible SPARQL endpoint containing the latest Dataset Knowledge Graph data.

    How big is the Dataset Knowledge Graph?

    The Dataset Knowledge Graph models, among others,

    • 2,208 datasets from all scientific disciplines
    • 813,551 links to 634,803 unique papers
    • 1,169 authors of datasets
    • 208 ORCID IDs.

    Potential use cases:

    • Use the DSKG for the development of semantic search engines (e.g. use the metadata of the linked publications of the datasets for advanced search capabilities)
    • Easier data integration by using the RDF standard vocabulary DCAT and by linking resources to other data sources (e.g., combining the DSKG with other dataset collections in RDF).
    • Data analysis to measure and award the provisioning of datasets (e.g., determine the scientific influence of datasets and authors).
  10. Find Ideal Location for Business in Bangladesh

    • data.mendeley.com
    Updated Sep 22, 2021
    Cite
    Faisal Bin Ashraf (2021). Find Ideal Location for Business in Bangladesh [Dataset]. http://doi.org/10.17632/v2k2jvjwrh.1
    Dataset updated
    Sep 22, 2021
    Authors
    Faisal Bin Ashraf
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bangladesh
    Description

    The dataset has 21 columns that carry the features (questions) of 988 respondents. The efficiency of any machine learning model depends heavily on its raw initial dataset, so we had to be extra careful in gathering our information. For our particular problem, we needed data that was not only authentic but also versatile enough to capture the proper information from relevant sources. Hence we opted to build our dataset by dispatching a survey questionnaire among targeted audiences. First, we built the questionnaire from inquiries drawn up after keen observation. By studying the behavior of our intended audience, we came up with factual and informative queries that generated appropriate data. Our prime audience was people who were highly interested in buying fashion accessories, so we created a questionnaire that emphasized questions related to that field. We had a total of twenty-one well-revised questions that gave us an overview of all the answers needed within the scope of our system. As such, we had the opportunity to gather over half a thousand authentic leads and settled on our initial raw dataset accordingly.

  11. d

    Data from: Land Cover Trends Dataset, 2000-2011

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 26, 2025
    Cite
    U.S. Geological Survey (2025). Land Cover Trends Dataset, 2000-2011 [Dataset]. https://catalog.data.gov/dataset/land-cover-trends-dataset-2000-2011
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    U.S. Geological Survey scientists, funded by the Climate and Land Use Change Research and Development Program, developed a dataset of 2006 and 2011 land use and land cover (LULC) information for selected 100-km² sample blocks within 29 EPA Level 3 ecoregions across the conterminous United States. The data was collected for validation of new and existing national scale LULC datasets developed from remotely sensed data sources. The data can also be used with the previously published Land Cover Trends Dataset: 1973-2000 (http://pubs.usgs.gov/ds/844/), to assess land-use/land-cover change in selected ecoregions over a 37-year study period. LULC data for 2006 and 2011 was manually delineated using the same sample block classification procedures as the previous Land Cover Trends project. The methodology is based on a statistical sampling approach, manual classification of land use and land cover, and post-classification comparisons of land cover across different dates. Landsat Thematic Mapper and Enhanced Thematic Mapper Plus imagery was interpreted using a modified Anderson Level I classification scheme. Landsat data was acquired from the National Land Cover Database (NLCD) collection of images. For the 2006 and 2011 update, ecoregion specific alterations in the sampling density were made to expedite the completion of manual block interpretations. The data collection process started with the 2000 date from the previous assessment and any needed corrections were made before interpreting the next two dates of 2006 and 2011 imagery. The 2000 land cover was copied and any changes seen in the 2006 Landsat images were digitized into a new 2006 land cover image. Similarly, the 2011 land cover image was created after completing the 2006 delineation. Results from analysis of these data include ecoregion based statistical estimates of the amount of LULC change per time period, ranking of the most common types of conversions, rates of change, and percent composition.

    Overall estimated amount of change per ecoregion from 2001 to 2011 ranged from a low of 370 km² in the Northern Basin and Range Ecoregion to a high of 78,782 km² in the Southeastern Plains Ecoregion. The Southeastern Plains Ecoregion continues to encompass the most intense forest harvesting and regrowth in the country. Forest harvesting and regrowth rates in the southeastern U.S. and Pacific Northwest continued at late 20th century levels. The land use and land cover data collected by this study is ideally suited for training, validation, and regional assessments of land use and land cover change in the U.S. because it is collected using manual interpretation techniques of Landsat data aided by high resolution photography.

    The 2001-2011 Land Cover Trends Dataset is provided in an Albers Conical Equal Area projection using the NAD 1983 datum. The sample blocks have a 30-meter resolution and file names follow a specific naming convention that includes the number of the ecoregion containing the block, the block number, and the Landsat image date. The data files are organized by ecoregion, and are available in the ERDAS Imagine (.img) format.

  12. N

    Hood River County, OR Population Breakdown by Gender Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Hood River County, OR Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b239af4f-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Hood River County
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Hood River County by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Hood River County across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of female population, with 50.07% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of the gender in Hood River County is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Hood River County's total population. Please note that the sum of all percentages may not equal one due to rounding of values.
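
    The column layout above can be explored with a short script. The following is a minimal sketch, assuming a CSV export whose header uses the documented Gender and Population column names; the sample counts are hypothetical and are not taken from the dataset, and the `percent_shares` helper is introduced only for illustration. It recomputes each gender's share of the total and reports the majority.

```python
import csv
import io

# Hypothetical sample rows in the documented column layout; the counts are
# illustrative only and do not come from the published dataset.
SAMPLE_CSV = """Gender,Population
Male,499
Female,501
"""


def percent_shares(rows, digits=2):
    """Return {gender: percent of total population}, rounded to `digits`."""
    counts = {row["Gender"]: int(row["Population"]) for row in rows}
    total = sum(counts.values())
    return {gender: round(100 * n / total, digits) for gender, n in counts.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
shares = percent_shares(rows)
# The gender with the largest share is the majority.
majority = max(shares, key=shares.get)
print(shares, majority)
```

    Because each share is rounded independently, comparing the sum of the shares against an exact total is best done with a small tolerance.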

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Hood River County Population by Race & Ethnicity.

  13. N

    Portland, ME Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Portland, ME Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b24d655c-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Portland, Maine
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Portland by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Portland across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of female population, with 51.9% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of the gender in Portland is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Portland's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Portland Population by Race & Ethnicity.

  14. 10 Years Bug-Fix Dataset (PROMISE'19)

    • figshare.com
    • search.datacite.org
    zip
    Updated Sep 27, 2021
    Cite
    Renan Vieira (2021). 10 Years Bug-Fix Dataset (PROMISE'19) [Dataset]. http://doi.org/10.6084/m9.figshare.8852084.v5
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 27, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Renan Vieira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Replication Package of the paper "From Reports to Bug-Fix Commits: A 10 Years Dataset of Bug-Fixing Activity from 55 Apache's Open Source Projects"

    ABSTRACT: Bugs appear in almost any software development. Solving all or at least a large part of them requires a great deal of time, effort, and budget. Software projects typically use issue tracking systems as a way to report and monitor bug-fixing tasks. In recent years, several researchers have been conducting bug tracking analysis to better understand the problem and thus provide means to reduce costs and improve the efficiency of the bug-fixing task. In this paper, we introduce a new dataset composed of more than 70,000 bug-fix reports from 10 years of bug-fixing activity of 55 projects from the Apache Software Foundation, distributed in 9 categories. We have mined this information from Jira issue track system concerning two different perspectives of reports with closed/resolved status: static (the latest version of reports) and dynamic (the changes that have occurred in reports over time). We also extract information from the commits (if they exist) that fix such bugs from their respective version-control system (Git). We also provide a change analysis that occurs in the reports as a way of illustrating and characterizing the proposed dataset. Once the data extraction process is an error-prone nontrivial task, we believe such initiatives like this could be useful to support researchers in further more detailed investigations.

    You can find the full paper at: https://doi.org/10.1145/3345629.3345639

    If you use this dataset for your research, please reference the following paper:

    @inproceedings{Vieira:2019:RBC:3345629.3345639,
      author = {Vieira, Renan and da Silva, Ant\^{o}nio and Rocha, Lincoln and Gomes, Jo\~{a}o Paulo},
      title = {From Reports to Bug-Fix Commits: A 10 Years Dataset of Bug-Fixing Activity from 55 Apache's Open Source Projects},
      booktitle = {Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering},
      series = {PROMISE'19},
      year = {2019},
      isbn = {978-1-4503-7233-6},
      location = {Recife, Brazil},
      pages = {80--89},
      numpages = {10},
      url = {http://doi.acm.org/10.1145/3345629.3345639},
      doi = {10.1145/3345629.3345639},
      acmid = {3345639},
      publisher = {ACM},
      address = {New York, NY, USA},
      keywords = {Bug-Fix Dataset, Mining Software Repositories, Software Traceability},
    }

    P.S: We added a new dataset version (v1.0.1). In this version, we fix the git commit features that track the src and test files. More info can be found in the fix-script.py file.

  15. N

    Lakeville, IN Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Lakeville, IN Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b23e1188-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Lakeville
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Lakeville by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Lakeville across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a majority of female population, with 58.0% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of the gender in Lakeville is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Lakeville's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Lakeville Population by Race & Ethnicity.

  16. N

    Pueblo, CO Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Pueblo, CO Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b24e29cb-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Pueblo, Colorado
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Pueblo by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Pueblo across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight majority of female population, with 50.09% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of the gender in Pueblo is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Pueblo's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Pueblo Population by Race & Ethnicity.

  17. F

    SlideImages

    • data.uni-hannover.de
    • service.tib.eu
    tar, zip
    Updated Jan 20, 2022
    Cite
    TIB (2022). SlideImages [Dataset]. https://data.uni-hannover.de/dataset/slideimages
    Explore at:
    Available download formats: tar, zip
    Dataset updated
    Jan 20, 2022
    Dataset authored and provided by
    TIB
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Please note: this archive requires support for dangling symlinks, which excludes the Windows operating system.

    To use this dataset, you will need to download the MS COCO 2017 detection images and expand them to a folder called coco17 in the train_val_combined directory. The download can be found here: https://cocodataset.org/#download You will also need to download the AI2D image description dataset and expand them to a folder called ai2d in the train_val_combined directory. The download can be found here: https://prior.allenai.org/projects/diagram-understanding
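
    The setup steps above can be checked before training. The following is a minimal sketch, assuming the archive has been extracted so that train_val_combined sits directly under a root folder of your choosing; the helper names are hypothetical and are not part of the dataset. It reports which of the two required folders are still missing and lists symlinks whose targets do not exist yet.

```python
import os
from pathlib import Path

# Folder names taken from the dataset description.
REQUIRED_DIRS = ("coco17", "ai2d")


def missing_dirs(root):
    """Return the required folders that are absent under train_val_combined."""
    # Assumption: train_val_combined is a direct child of `root`.
    base = Path(root) / "train_val_combined"
    return [name for name in REQUIRED_DIRS if not (base / name).is_dir()]


def dangling_symlinks(root):
    """Yield paths of symlinks under `root` whose targets do not exist."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # os.path.exists follows the link, so it is False for a dangling one.
            if os.path.islink(path) and not os.path.exists(path):
                yield path
```

    Once the MS COCO and AI2D images are in place, `missing_dirs` should return an empty list; any paths still reported by `dangling_symlinks` point at files that were not found where the archive expects them.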

    License Notes for Train and Val: Since the images in this dataset come from different sources, they are bound by different licenses.

    Images for bar charts, x-y plots, maps, pie charts, tables, and technical drawings were downloaded directly from Wikimedia Commons. License and authorship information is stored independently for each image in these categories in the wikimedia_commons_licenses.csv file. Each row (note: some rows are multi-line) is formatted so:

    Images in the slides category were taken from presentations which were downloaded from Wikimedia Commons. The names of the presentations on Wikimedia Commons omit the trailing underscore, number, and file extension, and end with .pdf instead. The source materials' licenses are shown in source_slices_licenses.csv.

    Wikimedia commons photos' information page can be found at "https://commons.wikimedia.org/wiki/File:

    License Notes for Testing: The testing images have been uploaded to SlideWiki by SlideWiki users. The image authorship and copyright information is available in authors.csv.

    Further information can be found for each image using the SlideWiki file service. Documentation is available at https://fileservice.slidewiki.org/documentation#/ and in particular: metadata is available at "https://fileservice.slidewiki.org/metadata/

    This is the SlideImages dataset, which has been assembled for the SlideImages paper. If you find the dataset useful, please cite our paper: https://doi.org/10.1007/978-3-030-45442-5_36

  18. N

    Roanoke, IN Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Roanoke, IN Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b24fd620-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Roanoke, Indiana
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Roanoke by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Roanoke across both sexes and to determine which sex constitutes the majority.

    Key observations

    Females make up a slight majority, at 51.9% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender :

    Please note that the American Community Survey asks about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: This column shows the population of each gender in Roanoke.
    • % of Total Population: This column displays each gender's share of Roanoke's total population as a percentage. Please note that the percentages may not sum to exactly 100% due to rounding.
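The percentage column can be reproduced from the two population counts. The following is a minimal sketch, not Neilsberg's own code; the function name `gender_shares` and the counts in the usage note are hypothetical. It also shows why a tolerance is useful when checking that the rounded shares add up to 100.

```python
def gender_shares(male: int, female: int, ndigits: int = 2) -> dict[str, float]:
    """Return each group's share of the total population, in percent."""
    total = male + female
    if total <= 0:
        raise ValueError("total population must be positive")
    # Each share is rounded independently, so the rounded values are not
    # guaranteed to add up to exactly 100.
    return {
        "Male": round(100 * male / total, ndigits),
        "Female": round(100 * female / total, ndigits),
    }


def shares_sum_to_100(shares: dict[str, float], tolerance: float = 0.05) -> bool:
    """Check the total against 100 with a tolerance that absorbs rounding."""
    return abs(sum(shares.values()) - 100.0) <= tolerance
```

For example, `gender_shares(male=1, female=3)` gives a 25% / 75% split, and `shares_sum_to_100` accepts it.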

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Roanoke Population by Race & Ethnicity.

  19. N

    Michigan Population Breakdown by Gender Dataset: Male and Female Population...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Michigan Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b2442216-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Michigan
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Michigan by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Michigan across both sexes and to determine which sex constitutes the majority.

    Key observations

    Females make up a slight majority, at 50.43% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender :

    Please note that the American Community Survey asks about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: This column shows the population of each gender in Michigan.
    • % of Total Population: This column displays each gender's share of Michigan's total population as a percentage. Please note that the percentages may not sum to exactly 100% due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Michigan Population by Race & Ethnicity.

  20. Replication Package: Unboxing Default Argument Breaking Changes in 1 + 2...

    • zenodo.org
    application/gzip
    Updated Jul 15, 2024
    Cite
    João Eduardo Montandon; Luciana Lourdes Silva; Cristiano Politowski; Daniel Prates; Arthur Bonifácio; Ghizlane El Boussaidi (2024). Replication Package: Unboxing Default Argument Breaking Changes in 1 + 2 Data Science Libraries in Python [Dataset]. http://doi.org/10.5281/zenodo.11584961
    Available download formats: application/gzip
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    João Eduardo Montandon; Luciana Lourdes Silva; Cristiano Politowski; Daniel Prates; Arthur Bonifácio; Ghizlane El Boussaidi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Replication Package

    This repository contains data and source files needed to replicate our work described in the paper "Unboxing Default Argument Breaking Changes in Scikit Learn".

    Requirements

    We recommend the following requirements to replicate our study:

    1. Internet access
    2. At least 100GB of space
    3. Docker installed
    4. Git installed

    Package Structure

    We relied on Docker containers to provide a working environment that is easier to replicate. Specifically, the package contains the following containers and supporting files:

    • data-analysis, an R-based Container we used to run our data analysis.
    • data-collection, a Python Container we used to collect Scikit's default arguments and detect them in client applications.
    • database, a Postgres Container we used to store clients' data, obtained from Grotov et al.
    • storage, a directory used to store the data processed in data-analysis and data-collection. This directory is shared by both containers.
    • docker-compose.yml, the Docker Compose file that configures all containers used in the package.

    In the remainder of this document, we describe how to set up each container properly.

    Using VSCode to Set Up the Package

    We selected VSCode as the IDE of choice because its extensions allow us to develop our scripts directly inside the containers. In this package, we provide configuration parameters for both the data-analysis and data-collection containers. This way, you can open and work in each container directly from VSCode without any further configuration.

    You first need to set up the containers:

    $ cd /replication/package/folder
    $ docker-compose build
    $ docker-compose up
    # Wait for Docker to create and start all containers
    

    Then, you can open them in Visual Studio Code:

    1. Open VSCode in project root folder
    2. Access the command palette and select "Dev Container: Reopen in Container"
      1. Select either Data Collection or Data Analysis.
    3. Start working

    If you want/need a more customized organization, the remainder of this file describes it in detail.

    Longest Road: Manual Package Setup

    Database Setup

    The database container automatically restores the dump in dump_matroskin.tar on its first launch. To set up and run the container, you should:

    Build an image:

    $ cd ./database
    $ docker build --tag 'dabc-database' .
    $ docker image ls
    REPOSITORY      TAG       IMAGE ID       CREATED          SIZE
    dabc-database   latest    b6f8af99c90d   50 minutes ago   18.5GB
    

    Create and enter inside the container:

    $ docker run -it --name dabc-database-1 dabc-database
    $ docker exec -it dabc-database-1 /bin/bash
    root# psql -U postgres -h localhost -d jupyter-notebooks
    jupyter-notebooks=# \dt
                 List of relations
     Schema |       Name        | Type  | Owner
    --------+-------------------+-------+-------
     public | Cell              | table | root
     public | Code_cell         | table | root
     public | Md_cell           | table | root
     public | Notebook          | table | root
     public | Notebook_features | table | root
     public | Notebook_metadata | table | root
     public | repository        | table | root
    

    If you see the table list above, your database is properly set up.

    Note that this database extends the one provided by Grotov et al. We added three columns to the table Notebook_features (API_functions_calls, defined_functions_calls, and other_functions_calls) containing the function calls performed by each client in the database.
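To give a sense of what "function calls performed by each client" can mean in practice, here is a minimal sketch that lists the calls found in a piece of notebook code using Python's standard ast module. It is an illustration only, not the modified Matroskin implementation; the helper names and the sample cell in the usage note are hypothetical.

```python
import ast
from typing import Optional


def _dotted_name(node: ast.AST) -> Optional[str]:
    """Turn a Name/Attribute chain such as `model.fit` into a dotted string."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = _dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None  # e.g. a call on a subscript or on another call's result


def called_function_names(source: str) -> list[str]:
    """Return the sorted, de-duplicated names of all calls in `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name is not None:
                names.add(name)
    return sorted(names)
```

Applied to a cell containing `model = KMeans(n_clusters=3)` followed by `model.fit(X)`, the function reports the calls `KMeans` and `model.fit`. Separating library API calls from user-defined ones, as the three columns do, would need extra information (imports and local definitions) that this sketch does not track.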

    Data Collection Setup

    This container is responsible for collecting the data to answer our research questions. It has the following structure:

    • dabcs.py, extracts DABCs from Scikit Learn source code and exports them to a CSV file.
    • dabcs-clients.py, extracts function calls from clients and exports them to a CSV file. We rely on a modified version of Matroskin to collect the function calls. You can find the tool's source code in the matroskin directory.
    • Makefile, commands to set up and run both dabcs.py and dabcs-clients.py
    • matroskin, the directory containing the modified version of the matroskin tool. We extended the library to collect the function calls performed on the client notebooks of Grotov's dataset.
    • storage, a docker volume where the data-collection container saves the exported data. This data will be used later in Data Analysis.
    • requirements.txt, Python dependencies adopted in this module.
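As a rough illustration of the kind of change dabcs.py looks for, the sketch below reads "default argument breaking change" as a parameter whose default value differs between two versions of a function. It compares two live callables with inspect.signature for simplicity; this is an assumption made for the example, not the package's approach, which works on Scikit Learn source code. The functions `make_model_v1` and `make_model_v2` are hypothetical stand-ins for two library versions.

```python
import inspect
from typing import Any, Callable


def default_argument_changes(
    old_func: Callable[..., Any], new_func: Callable[..., Any]
) -> dict[str, tuple[Any, Any]]:
    """Map parameter name -> (old default, new default) where the default changed."""
    old_params = inspect.signature(old_func).parameters
    new_params = inspect.signature(new_func).parameters
    empty = inspect.Parameter.empty
    changes = {}
    for name, old_param in old_params.items():
        new_param = new_params.get(name)
        if new_param is None:
            continue  # parameter was removed; not a default-value change
        if old_param.default is empty or new_param.default is empty:
            continue  # at least one version has no default to compare
        if old_param.default != new_param.default:
            changes[name] = (old_param.default, new_param.default)
    return changes


# Hypothetical "before" and "after" versions of the same API function.
def make_model_v1(n_estimators=10, criterion="gini"):
    return None


def make_model_v2(n_estimators=100, criterion="gini", verbose=False):
    return None
```

Calling `default_argument_changes(make_model_v1, make_model_v2)` reports only `n_estimators`, since `criterion` kept its default and `verbose` is new.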

    Note that the container will automatically configure this module for you, e.g., install dependencies, configure matroskin, download scikit learn source code, etc. For this, you must run the following commands:

    $ cd ./data-collection
    $ docker build --tag "data-collection" .
    $ docker run -it -d --name data-collection-1 -v $(pwd)/:/data-collection -v $(pwd)/../storage/:/data-collection/storage/ data-collection
    $ docker exec -it data-collection-1 /bin/bash
    $ ls
    Dockerfile Makefile config.yml dabcs-clients.py dabcs.py matroskin storage requirements.txt utils.py
    

    If you see the project files, the container is configured correctly.

    Data Analysis Setup

    We use this container to analyze the data produced by the Data Collection container. It has the following structure:

    • dependencies.R, an R script containing the dependencies used in our data analysis.
    • data-analysis.Rmd, the R notebook we used to perform our data analysis
    • datasets, a docker volume pointing to the storage directory.

    Execute the following commands to run this container:

    $ cd ./data-analysis
    $ docker build --tag "data-analysis" .
    $ docker run -it -d --name data-analysis-1 -v $(pwd)/:/data-analysis -v $(pwd)/../storage/:/data-collection/datasets/ data-analysis
    $ docker exec -it data-analysis-1 /bin/bash
    $ ls
    data-analysis.Rmd datasets dependencies.R Dockerfile figures Makefile
    

    If you see the project files, the container is configured correctly.

    A note on the shared storage folder

    As mentioned, the storage folder is mounted as a volume and shared between the data-collection and data-analysis containers. We compressed the contents of this folder due to space constraints. Therefore, before you start working on Data Collection or Data Analysis, make sure you have extracted the compressed files. You can do this by running the Makefile inside the storage folder.

    $ make unzip # extract files
    $ ls
    clients-dabcs.csv clients-validation.csv dabcs.csv Makefile scikit-learn-versions.csv versions.csv
    $ make zip # compress files
    $ ls
    csv-files.tar.gz Makefile
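
If make is not available, the same round trip can be approximated in Python. This is a sketch under assumptions: the archive name csv-files.tar.gz and the behaviour (CSV files disappear after zipping, the archive disappears after unzipping) are inferred from the listings above, the function names are hypothetical, and the actual Makefile targets may work differently.

```python
import tarfile
from pathlib import Path


def zip_csvs(folder, archive_name="csv-files.tar.gz"):
    """Pack every CSV file in `folder` into one gzip-compressed tar archive."""
    folder = Path(folder)
    archive = folder / archive_name
    csv_files = sorted(folder.glob("*.csv"))
    with tarfile.open(archive, "w:gz") as tar:
        for csv_file in csv_files:
            # Store only the file name so extraction recreates a flat folder.
            tar.add(csv_file, arcname=csv_file.name)
    for csv_file in csv_files:
        csv_file.unlink()  # assumption: originals are removed after packing
    return archive


def unzip_csvs(folder, archive_name="csv-files.tar.gz"):
    """Extract the archive into `folder` and return the CSV file names found."""
    folder = Path(folder)
    archive = folder / archive_name
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(folder)
    archive.unlink()  # assumption: the archive is removed after extraction
    return sorted(path.name for path in folder.glob("*.csv"))
```

Calling `unzip_csvs("storage")` before working in either container, and `zip_csvs("storage")` afterwards, would mirror the two make targets under these assumptions.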