58 datasets found
  1. Replication Data for: "The Problem of False Positives in Automated Census...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Oct 20, 2023
    Cite
    Tyler Anbinder; Cormac O Grada; Dylan Connor; Simone A. Wegge (2023). Replication Data for: "The Problem of False Positives in Automated Census Linking: Nineteenth-Century New York's Irish Immigrants as a Case Study" [Dataset]. http://doi.org/10.7910/DVN/NHV2IH
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 20, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Tyler Anbinder; Cormac O Grada; Dylan Connor; Simone A. Wegge
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    New York
    Description

    Data sets referred to in our article that examines the problem of false positives in automated census linking.

  2. Automatically assembling a full census of an academic field

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated Jun 1, 2023
    Cite
    Allison C. Morgan; Samuel F. Way; Aaron Clauset (2023). Automatically assembling a full census of an academic field [Dataset]. http://doi.org/10.1371/journal.pone.0202223
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Allison C. Morgan; Samuel F. Way; Aaron Clauset
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain, complicating efforts to study workforce dynamics and the effects of policy. This is particularly true in computer science, which lacks a single, all-encompassing directory or professional organization. A full census of computer science would serve many purposes, not the least of which is a better understanding of the trends and causes of unequal representation in computing. Previous academic census efforts have relied on narrow or biased samples, or on professional society membership rolls. A full census can be constructed directly from online departmental faculty directories, but doing so by hand is expensive and time-consuming. Here, we introduce a topical web crawler for automating the collection of faculty information from web-based department rosters, and demonstrate the resulting system on the 205 PhD-granting computer science departments in the U.S. and Canada. This method can quickly construct a complete census of the field, and achieve over 99% precision and recall. We conclude by comparing the resulting 2017 census to a hand-curated 2011 census to quantify turnover and retention in computer science, in general and for female faculty in particular, demonstrating the types of analysis made possible by automated census construction.
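
    The two measurements mentioned in this description (precision and recall of a crawled roster, and turnover between two census years) can be sketched with plain set operations. This is only an illustration under stated assumptions: the function names, the normalized name strings, and the toy rosters are hypothetical and do not come from the dataset or the authors' crawler.

```python
def precision_recall(found: set[str], truth: set[str]) -> tuple[float, float]:
    """Precision and recall of a crawled roster against a hand-checked one."""
    true_positives = len(found & truth)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def retention_and_turnover(earlier: set[str], later: set[str]) -> dict[str, float]:
    """Share of the earlier census still present later, plus departures and arrivals."""
    retained = earlier & later
    return {
        "retention_rate": len(retained) / len(earlier) if earlier else 0.0,
        "departed": len(earlier - later),
        "arrived": len(later - earlier),
    }


# Hypothetical toy rosters (not data from the study).
hand_curated = {"a. lovelace", "g. hopper", "k. johnson", "b. liskov"}
crawled = {"a. lovelace", "g. hopper", "k. johnson", "j. doe"}

precision, recall = precision_recall(crawled, hand_curated)
summary = retention_and_turnover(hand_curated, crawled)
print(precision, recall, summary)
```

    Exact string matching is the simplest possible identity rule; a real comparison would need name normalization and disambiguation, which this sketch leaves out.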

  3. Data and Code for: Automated Linking of Historical Data

    • openicpsr.org
    delimited
    Updated Mar 1, 2021
    Cite
    Ran Abramitzky; Leah Boustan; Katherine Eriksson; James Feigenbaum; Santiago Pérez (2021). Data and Code for: Automated Linking of Historical Data [Dataset]. http://doi.org/10.3886/E133781V1
    Explore at:
    Available download formats: delimited
    Dataset updated
    Mar 1, 2021
    Dataset provided by
    American Economic Association
    Authors
    Ran Abramitzky; Leah Boustan; Katherine Eriksson; James Feigenbaum; Santiago Pérez
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1850 - 1940
    Area covered
    United States; Norway
    Description

    The recent digitization of complete count census data is an extraordinary opportunity for social scientists to create large longitudinal datasets by linking individuals from one census to another or from other sources to the census. We evaluate different automated methods for record linkage, performing a series of comparisons across methods and against hand linking. We have three main findings that lead us to conclude that automated methods perform well. First, a number of automated methods generate very low (less than 5%) false positive rates. The automated methods trace out a frontier illustrating the tradeoff between the false positive rate and the (true) match rate. Relative to more conservative automated algorithms, humans tend to link more observations but at a cost of higher rates of false positives. Second, when human linkers and algorithms use the same linking variables, there is relatively little disagreement between them. Third, across a number of plausible analyses, coefficient estimates and parameters of interest are very similar when using linked samples based on each of the different automated methods. We provide code and Stata commands to implement the various automated methods.
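
    The tradeoff between the false positive rate and the (true) match rate can be illustrated with a small scoring function. This is a hedged sketch, not the authors' Stata code: it assumes the false positive rate is the share of checked automated links that disagree with a hand-linked reference, and the true match rate is the share of all records that received a correct link. The function name and the toy record IDs are hypothetical.

```python
def evaluate_links(
    auto_links: dict[str, str],
    hand_links: dict[str, str],
    n_records: int,
) -> tuple[float, float]:
    """Return (false_positive_rate, true_match_rate) for one linking method.

    Both dicts map a record ID in the earlier source to a record ID in the later one.
    Only automated links that also have a hand link can be checked.
    """
    checked = [rec for rec in auto_links if rec in hand_links]
    wrong = sum(1 for rec in checked if auto_links[rec] != hand_links[rec])
    correct = len(checked) - wrong
    false_positive_rate = wrong / len(checked) if checked else 0.0
    true_match_rate = correct / n_records if n_records else 0.0
    return false_positive_rate, true_match_rate


# Hypothetical toy data: five records to link, four of them hand-linked.
hand = {"a1": "b1", "a2": "b2", "a3": "b3", "a4": "b4"}
conservative = {"a1": "b1", "a2": "b2"}
aggressive = {"a1": "b1", "a2": "b2", "a3": "b9", "a4": "b4"}

print(evaluate_links(conservative, hand, n_records=5))
print(evaluate_links(aggressive, hand, n_records=5))
```

    In this toy case the conservative method links fewer records but makes no errors, while the aggressive one links more at the cost of a wrong link, which is the shape of the frontier the description refers to.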

  4. Average predicted ancestry and variance in predicted ancestry for candidate...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 1, 2023
    Cite
    Tamar E. Crum; Robert D. Schnabel; Jared E. Decker; Luciana C. A. Regitano; Jeremy F. Taylor (2023). Average predicted ancestry and variance in predicted ancestry for candidate reference breed individuals when filtered on minimum predicted ancestry. [Dataset]. http://doi.org/10.1371/journal.pone.0221471.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Tamar E. Crum; Robert D. Schnabel; Jared E. Decker; Luciana C. A. Regitano; Jeremy F. Taylor
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Average predicted ancestry and variance in predicted ancestry for candidate reference breed individuals when filtered on minimum predicted ancestry.

  5. CRUMBLER: A tool for the prediction of ancestry in cattle

    • plos.figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Tamar E. Crum; Robert D. Schnabel; Jared E. Decker; Luciana C. A. Regitano; Jeremy F. Taylor (2023). CRUMBLER: A tool for the prediction of ancestry in cattle [Dataset]. http://doi.org/10.1371/journal.pone.0221471
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Tamar E. Crum; Robert D. Schnabel; Jared E. Decker; Luciana C. A. Regitano; Jeremy F. Taylor
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In many beef and some dairy production systems, crossbreeding is used to take advantage of breed complementarity and heterosis. Admixed animals are frequently identified by their coat color and body conformation phenotypes; however, without pedigree information it is not possible to identify the expected breed composition of an admixed animal, and in the presence of selection, the actual composition may differ from expectation. As the roles of DNA and genotype data become more pervasive in animal agriculture, a systematic method for estimating the breed composition (the proportions of an animal’s genome originating from ancestral pure breeds) has utility for a variety of downstream analyses including the estimation of genomic breeding values for crossbred animals, the estimation of quantitative trait locus effects, and heterosis and heterosis retention in advanced generation composite animals. Currently, there is no automated or semi-automated ancestry estimation platform for cattle, and the objective of this study was to evaluate the utility of extant public software for ancestry estimation and determine the effects of reference population size and composition and number of utilized single nucleotide polymorphism loci on ancestry estimation. We also sought to develop an analysis pipeline that would simplify this process for members of the livestock genomics research community. We developed and tested a tool, “CRUMBLER”, to estimate the global ancestry of cattle using ADMIXTURE and SNPweights based on a defined reference panel. CRUMBLER was developed and evaluated in cattle, but is a species-agnostic pipeline that facilitates the streamlined estimation of breed composition for individuals with potentially complex ancestries using publicly available global ancestry software and a specified reference population SNP dataset.
    We developed the reference panel from a large cattle genotype data set and breed association pedigree information using iterative analyses to identify purebred individuals that were representative of each breed. We also evaluated the numbers of markers necessary for breed composition estimation and simulated genotypes for advanced generation composite animals to evaluate the precision of the developed tool. The developed CRUMBLER pipeline extracts a specified subset of genotypes that is common to all current commercially available genotyping platforms, processes these into the file formats required for the analysis software, and predicts admixture proportions using the specified reference population allele frequencies.

  6. Number of individuals for each reference breed assigned to their breed of...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1 more
    xls
    Updated Jun 20, 2023
    Cite
    Tamar E. Crum; Robert D. Schnabel; Jared E. Decker; Luciana C. A. Regitano; Jeremy F. Taylor (2023). Number of individuals for each reference breed assigned to their breed of registration by minimum ancestry threshold. [Dataset]. http://doi.org/10.1371/journal.pone.0221471.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 20, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Tamar E. Crum; Robert D. Schnabel; Jared E. Decker; Luciana C. A. Regitano; Jeremy F. Taylor
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Number of individuals for each reference breed assigned to their breed of registration by minimum ancestry threshold.

  7. Access to Jobs in Virginia by census block 2022 (auto, bike, walk & transit)...

    • data.virginia.gov
    csv, esri map package +1
    Updated Apr 15, 2025
    Cite
    Office of INTERMODAL Planning and Investment (2025). Access to Jobs in Virginia by census block 2022 (auto, bike, walk & transit) [Dataset]. https://data.virginia.gov/dataset/access-to-jobs-in-virginia-by-census-block-2022-auto-bike-walk-transit
    Explore at:
    Available download formats: csv (54413201), csv (50605134), csv (49287204), shp (151025314), csv (49052537), esri map package (373988069)
    Dataset updated
    Apr 15, 2025
    Dataset authored and provided by
    Office of INTERMODAL Planning and Investment
    Area covered
    Virginia
    Description

    The Accessibility Observatory at the University of Minnesota produces census block level access to jobs data by four modes (auto, bike, walk and transit). https://www.cts.umn.edu/programs/ao

    Accessibility is the ease and feasibility of reaching valued destinations. It can be measured for a wide array of transportation modes, to different types of destinations, and at different times of day. There are a variety of ways to define accessibility, but the number of destinations reachable within a given travel time is the most comprehensible and transparent as well as the most directly comparable across different geographies.

    Reports published by mode have detailed information on how these data are produced.
    Auto: https://hdl.handle.net/11299/266465 Bike: https://hdl.handle.net/11299/266466 Walk: https://hdl.handle.net/11299/266468 Transit: https://hdl.handle.net/11299/266467

    These data are provided in two formats. The first format is an ESRI Map Package that includes feature classes for each mode. The second format is a zipped shapefile of census blocks and four CSV files, one for each mode. For every census block, for each mode, there is a reported value of the number of jobs that can be reached within a specified time threshold. The bicycle mode uses Level of Traffic Stress (LTS) 3 (medium stress) to define usable routes.
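
    The "number of jobs reachable within a specified time threshold" described here is a cumulative count, and can be sketched as follows. This is a minimal illustration, not the Accessibility Observatory's method: the function name, the block IDs, the travel times, and the job counts are all hypothetical, and real travel times would come from a routing model for each mode.

```python
def jobs_reachable(
    travel_minutes: dict[str, float],
    jobs_by_block: dict[str, int],
    threshold_minutes: float,
) -> int:
    """Count jobs in every destination block reachable within the time threshold.

    travel_minutes maps a destination block ID to the travel time from one origin block.
    """
    return sum(
        jobs_by_block.get(block, 0)
        for block, minutes in travel_minutes.items()
        if minutes <= threshold_minutes
    )


# Hypothetical travel times (minutes) from a single origin block, for one mode.
travel = {"B1": 5, "B2": 18, "B3": 32, "B4": 55}
jobs = {"B1": 120, "B2": 800, "B3": 450, "B4": 2000}

within_30 = jobs_reachable(travel, jobs, threshold_minutes=30)
within_60 = jobs_reachable(travel, jobs, threshold_minutes=60)
print(within_30, within_60)
```

    Repeating the calculation per origin block and per mode would give one value per block and mode, which matches the layout of the four CSV files described above.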

  8. Data from: Eastern Canada Flocks: Images and manually annotated bird...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    txt, zip
    Updated Jun 4, 2022
    Cite
    Marcos Cruz; Javier González-Villa; Josée Lefebvre; Scott Gilliland; Francis St-Pierre; Matthew English; Christine Lepage (2022). Eastern Canada Flocks: Images and manually annotated bird positions [Dataset]. http://doi.org/10.5061/dryad.98sf7m0hx
    Explore at:
    Available download formats: txt, zip
    Dataset updated
    Jun 4, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marcos Cruz; Javier González-Villa; Josée Lefebvre; Scott Gilliland; Francis St-Pierre; Matthew English; Christine Lepage
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The Eastern Canada (ECA) Flocks data set consists of manually annotated images from the Common Eider (COEI, Somateria mollissima) Winter Survey and the Greater Snow Geese (GSGO, Anser caerulescens atlanticus) Spring Survey. The images were taken in Eastern Canada using fixed-wing aircraft and manually annotated with ImageJ's Cell counter plugins. We selected and annotated the ECA Flocks images in order to test the precision of the CountEm flock size estimation method. ECA Flocks includes 179 COEI and 99 GSGO single flock images. We cut each image manually to a rectangle that excluded large parts of the image with no birds. Both versions (original and cut) of each image are available in the data set. We manually annotated 637,555 (124,309 COEI and 514,235 GSGO) bird positions in the cut images from both surveys. Each bird has an associated "Type" which refers to species and/or sex. Sex identification was only possible for adult common eiders, since females and immature males are brown birds whereas adult males have mainly white plumage. 64,484 males and 58,029 females were identified in the COEI images, as well as 1,796 birds of other species. 504,891 Snow Geese and 9,344 birds of other species were labeled in the GSGO images. A .csv file including all annotated bird positions and types is available for each image. The COEI photos of the ECA Flocks data set were taken in 2006 and 2018, and the GSGO photos in 2016-2018. We selected these photos in order to include images with different quality and resolution. COEI and GSGO flock sizes range from 6 to 4,154 and from 43 to 36,241, respectively. There is high variability in light conditions, backgrounds, and the number and spatial arrangement of birds across the images. The data set is therefore potentially useful to test the precision of methods for analyzing imagery to estimate the abundance of animals by directly detecting, identifying and counting individuals.

  9. Data from: Towards a fully automated underwater census for fish assemblages...

    • datadryad.org
    • search.dataone.org
    zip
    Updated Dec 17, 2024
    Cite
    Cecile Sabourault; Kilian Bürgi (2024). Towards a fully automated underwater census for fish assemblages in the Mediterranean Sea [Dataset]. http://doi.org/10.5061/dryad.f7m0cfz6f
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 17, 2024
    Dataset provided by
    Dryad
    Authors
    Cecile Sabourault; Kilian Bürgi
    Time period covered
    Dec 2, 2024
    Area covered
    Mediterranean Sea
    Description

    Data from: Towards a fully automated underwater census for fish assemblages in the Mediterranean Sea

    https://doi.org/10.5061/dryad.f7m0cfz6f

    Description of the data and file structure

    1. Study area and data collection

    The training dataset (DATA_T) was gathered at eight different locations in the Mediterranean Sea along the French Riviera, following the same UVC protocol at each site. Depths ranged from 1 to 37 m, and sampling was carried out throughout 2022 (cold and warm seasons) to cover the full range of conditions and possible fish occurrences.

    The experimental dataset (DATA_E) was recorded in October 2023 in and around two protected areas, one no-take zone (Cap Roux) and one Natura2000 site (Corniche Varoise), which both have elevated biodiversity. A total of 64 videos, each corresponding to a transect, from 14 sites (8 on seagrass meadows and 6 on rocky substrates) were evaluated and compared. Each site consists of...

  10. Economic Census 2005 - India

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Oct 5, 2021
    Cite
    Central Statistical Office (2021). Economic Census 2005 - India [Dataset]. https://datacatalog.ihsn.org/catalog/3384
    Explore at:
    Dataset updated
    Oct 5, 2021
    Dataset authored and provided by
    Central Statistical Office
    Time period covered
    2005
    Area covered
    India
    Description

    Abstract

    The Central Statistical Organization (CSO) conducted the fifth Economic Census in 2005 in all the States/UTs in collaboration with the State Directorates of Economics and Statistics. The first Economic Census was conducted in 1977, covering only non-agricultural establishments, and the three Economic Censuses subsequently carried out in 1980, 1990 and 1998 covered all agricultural and non-agricultural enterprises except those engaged in crop production and plantation. There was no change in the coverage of the fifth Economic Census as compared to the fourth Economic Census. The Economic Census not only provides an updated frame for detailed follow-up surveys but also gives basic entrepreneurial data for planning and development, especially for the unorganized sector of the economy.

    There are certain new features in the fifth Economic Census. Addresses of the enterprises employing 10 workers or more were collected for the first time in the fifth Economic Census through Address Slip. At present the country does not maintain a Business Register. The directory of enterprises to be generated from the Address Slip would be the basic input for preparation of a Business Register. For the first time, data collected in the fifth Economic Census are processed through Intelligent Character Recognition (ICR) Technology.

    The EC-2005 "ALL INDIA REPORT" contains the all-India figures on the number of enterprises and their employment, cross-classified according to their locations, major activity groups, type of establishment, size class of employment, etc. The disaggregated data for States/UTs are also included in the report.

    Geographic coverage

    All the States/UTs in the country

    Analysis unit

    Establishment

    Universe

    Economic Census (EC) is the complete count of all entrepreneurial units located within the geographical boundaries of the country. All units engaged in the production or distribution of goods or services other than for the sole purpose of own consumption are counted. While all units engaged in nonagricultural activities are covered, in the agricultural sector units in crop production and plantation activities are excluded.

    Kind of data

    Census/enumeration data [cen]

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    All questionnaires are provided as external resources

    Cleaning operations

    Intelligent Character Recognition (ICR) technology, which is also known as Automated Forms Processing, was used to process the EC-2005 data. Automated Forms Processing technology enables the user to process documents from their images or directly from paper and convert them to computer readable data.

    The schedules of the Fifth EC were scanned/digitized at the fifteen regional Data Processing Centres of the Registrar General of India (RGI). After running the edit programme, the error list files were handed over to the State Governments for corrections. The DES officials of the State Governments corrected the error files in two or three cycles and then sent the data files to RGI Headquarters for final touches before they were sent to the Computer Centre, MOSPI. Remaining errors in the data files were further reduced by applying auto corrections at the Computer Centre.

  11. MES-Weigh Scale Integration For Lot Genealogy Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). MES-Weigh Scale Integration For Lot Genealogy Market Research Report 2033 [Dataset]. https://dataintelo.com/report/mes-weigh-scale-integration-for-lot-genealogy-market
    Explore at:
    Available download formats: csv, pptx, pdf
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    MES-Weigh Scale Integration for Lot Genealogy Market Outlook



    According to our latest research, the MES-Weigh Scale Integration for Lot Genealogy market size reached USD 1.17 billion in 2024 globally, driven by the increasing need for traceability and quality assurance in manufacturing. The market is projected to grow at a robust CAGR of 9.2% from 2025 to 2033, reaching an estimated USD 2.66 billion by 2033. This growth is primarily fueled by the rising adoption of digital manufacturing solutions, stringent regulatory requirements, and the growing emphasis on operational efficiency and product quality across industries.
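
    A forecast stated as a base value, a compound annual growth rate (CAGR), and an end value can be checked against the standard compound-growth formula. The sketch below is illustrative only: the function names are hypothetical, and treating 2024 to 2033 as nine compounding periods is an assumption, since the report does not say how it counts periods.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Compound a starting value forward: start * (1 + rate) ** years."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Growth rate that turns start_value into end_value over the given years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the description (USD billions); nine periods is an assumption.
projected_2033 = project_value(1.17, 0.092, years=9)
rate_needed = implied_cagr(1.17, 2.66, years=9)
print(f"projected 2033 value: {projected_2033:.2f}")
print(f"CAGR implied by the quoted endpoints: {rate_needed:.2%}")
```

    Comparing the projected value with the quoted end value, or the implied rate with the quoted rate, gives a quick consistency check on any forecast stated in this form.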




    The primary growth factor for the MES-Weigh Scale Integration for Lot Genealogy market is the increasing demand for end-to-end traceability in manufacturing processes. With global supply chains becoming more complex and customer expectations for product quality rising, manufacturers are under pressure to provide detailed genealogy records for each lot produced. Integrating Manufacturing Execution Systems (MES) with weigh scales enables real-time data capture and automated lot tracking, significantly reducing human errors and ensuring compliance with regulatory standards. This integration is particularly critical in highly regulated industries such as pharmaceuticals and food & beverage, where accurate lot genealogy is essential for product recalls, audits, and quality assurance. As a result, organizations are investing heavily in advanced MES solutions and weigh scale integration to streamline operations and enhance transparency throughout the production lifecycle.




    Another significant driver for market growth is the rapid advancement of digital technologies and the Industrial Internet of Things (IIoT). Modern weigh scales equipped with smart sensors and connectivity features can seamlessly integrate with MES platforms, enabling real-time monitoring, data analytics, and predictive maintenance. The proliferation of cloud-based solutions further accelerates adoption by providing scalable, flexible, and cost-effective deployment options. With manufacturers seeking to optimize production efficiency, reduce downtime, and minimize waste, the integration of MES and weigh scale systems is becoming a strategic imperative. This trend is especially prominent among large enterprises with complex manufacturing environments, but small and medium enterprises (SMEs) are also increasingly recognizing the value of such integrations to remain competitive.




    Regulatory compliance and quality control requirements continue to intensify across global markets, further propelling the adoption of MES-Weigh Scale Integration for Lot Genealogy solutions. Governments and industry bodies are enforcing stricter standards related to product safety, traceability, and documentation, particularly in sectors such as pharmaceuticals, food & beverage, and chemicals. Non-compliance can result in severe penalties, product recalls, and reputational damage. By leveraging integrated systems, manufacturers can automate data collection, ensure accurate lot tracking, and generate comprehensive audit trails, thereby mitigating compliance risks. The growing focus on sustainability and responsible manufacturing practices also encourages organizations to adopt solutions that enable precise resource tracking and waste reduction.




    From a regional perspective, North America currently dominates the MES-Weigh Scale Integration for Lot Genealogy market, accounting for the largest share in 2024. This leadership position is attributed to the presence of advanced manufacturing industries, high regulatory standards, and early adoption of digital technologies. However, the Asia Pacific region is expected to witness the fastest growth through 2033, driven by rapid industrialization, expanding manufacturing bases, and increasing investments in smart factory solutions. Europe remains a significant market, supported by stringent regulatory frameworks and a strong focus on quality assurance. Meanwhile, Latin America and the Middle East & Africa are gradually emerging as promising markets, fueled by growing awareness and modernization efforts in their manufacturing sectors.



    Component Analysis



    The MES-Weigh Scale Integration for Lot Genealogy market is segmented into software, hardware, and services, each playing a pivotal role in enabling seamless integration and efficient lot genealogy management. The software segment is the backbone of this market, comprising MES pl

  12. AO auto tracts 2023

    • geodot.mass.gov
    • gis.data.mass.gov
    • +2 more
    Updated Jun 6, 2025
    + more versions
    Cite
    Massachusetts geoDOT (2025). AO auto tracts 2023 [Dataset]. https://geodot.mass.gov/datasets/ao-auto-tracts-2023
    Explore at:
    Dataset updated
    Jun 6, 2025
    Dataset authored and provided by
    Massachusetts geoDOT
    Area covered
    Description

    2020 Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and are reviewed and updated by local participants prior to each decennial census as part of the Census Bureau’s Participant Statistical Areas Program (PSAP). The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of decennial census data. Census tracts generally have a total population size between 1,200 and 8,000 people with an optimum size of 4,000 people. The spatial size of census tracts varies widely depending on the density of settlement. Ideally, census tract boundaries remain stable over time to facilitate statistical comparisons from census to census. However, physical changes in street patterns caused by highway construction, new development, and so forth, may require boundary revisions. In addition, significant changes in population may result in splitting or combining census tracts. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy, but tracts can cross the same kinds of boundaries that block groups can. Census tract numbers have up to a 4-character basic number and may have an optional 2-character suffix. The census tract numbers (used as names) eliminate any leading zeroes and append a suffix only if required. The 6-digit census tract codes, however, include leading zeroes and have an implied decimal point for the suffix. Census tract codes (000100 to 998999) are unique within a county or equivalent area.
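
    The relationship between a 6-digit tract code and the tract number used as a name can be sketched as a small conversion. This is an illustration under one stated assumption: a suffix of "00" is treated as "no suffix required". The function name is hypothetical.

```python
def tract_name(code: str) -> str:
    """Convert a 6-digit census tract code to the tract number used as its name.

    The first four digits are the basic number (leading zeroes dropped); the last
    two are the suffix after the implied decimal point. A "00" suffix is assumed
    to mean that no suffix is needed.
    """
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected a 6-digit tract code, got {code!r}")
    basic = code[:4].lstrip("0") or "0"
    suffix = code[4:]
    return basic if suffix == "00" else f"{basic}.{suffix}"


for example in ("000100", "012345", "998999"):
    print(example, "->", tract_name(example))
```

    Because tract codes are unique only within a county or equivalent area, a code needs its state and county identifiers alongside it to identify a tract uniquely.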

  13. Developing Techniques to Census and Monitor Colonial Waterbirds with...

    • data.amerigeoss.org
    pdf
    Updated Jan 1, 2011
    Cite
    United States (2011). Developing Techniques to Census and Monitor Colonial Waterbirds with Automated Digital Image Processing and Perimeter-Ground Surveys [SSP Proposal] [Dataset]. https://data.amerigeoss.org/sk/dataset/developing-techniques-to-census-and-monitor-colonial-waterbirds-with-automated-digital-image-pr2
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jan 1, 2011
    Dataset provided by
    United States
    Description

    The goal of this proposal is to develop methods for estimating the size of breeding populations for different waterbird species and to provide protocols for monitoring colonial species at a remote nesting colony. The specific objectives of this study are: 1) to develop, assess, and evaluate the accuracy of an automated, pixel-based mapping method to estimate populations of American White Pelicans, Double-crested Cormorants, and gulls at Chase Lake NWR, and 2) to develop, assess, and evaluate the accuracy of a perimeter-count method to estimate populations of egrets and herons at Chase Lake NWR. Development of survey techniques will enhance the capability of the USFWS for detecting disease events and monitoring colonial waterbirds at Chase Lake and other National Wildlife Refuges.

    Pages 1-5 contain the proposal, page 6 offers approvals and submittal, and pages 7-10 outline the budget.

  14. Illuminating Tycho's Rays: Automated Crater Census Uncovers Equilibrium...

    • data-staging.niaid.nih.gov
    Updated Apr 2, 2025
    Cite
    Bugiolacchi, Roberto; Luo, Shengda (2025). Illuminating Tycho's Rays: Automated Crater Census Uncovers Equilibrium Dynamics and Regolith Stratification on the Lunar Surface [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_14263626
    Explore at:
    Dataset updated
    Apr 2, 2025
    Dataset provided by
    Southern University of Science and Technology, Shenzhen, Guangdong, China
    Macau University of Science and Technology
    Authors
    Bugiolacchi, Roberto; Luo, Shengda
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Excel files containing all crater data are output from ArcMap.

    DATA_HUMA_craters_raw contains two files, for the *186 and *808 NAC images, listing crater coordinates and diameters (km). These are all the manual labels for the NACs, including data not included in the analysis (the full set).

    DATA_YOLO_craters_raw contains several files for each sub-area, with coordinates, diameter, and additional information.

    DATA_selected_craters is a table that formed the basis for the analysis. It lists the crater sizes for each area (summed) and gives the surface area of the regions under investigation.

  15. G

    Forensic Genetic Genealogy Software Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 3, 2025
    Growth Market Reports (2025). Forensic Genetic Genealogy Software Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/forensic-genetic-genealogy-software-market
    Explore at:
    Available download formats: pdf, pptx, csv
    Dataset updated
    Oct 3, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Forensic Genetic Genealogy Software Market Outlook



    As per our latest research, the global forensic genetic genealogy software market size reached USD 292.6 million in 2024, reflecting a robust expansion driven by the growing adoption of advanced DNA analysis in criminal justice and family tracing applications. The market is projected to grow at a CAGR of 14.7% from 2025 to 2033, reaching a forecasted value of USD 931.2 million by 2033. This remarkable growth is underpinned by the increasing demand for sophisticated investigative tools among law enforcement and government agencies, as well as the rising public awareness about the potential of genetic genealogy in solving cold cases and reuniting families.




    One of the primary growth factors for the forensic genetic genealogy software market is the surge in the use of DNA databases for criminal investigations and missing persons identification. Law enforcement agencies across the globe are increasingly leveraging advanced genetic genealogy platforms to solve complex crimes that have remained unresolved for years. The integration of genealogical data with forensic DNA analysis has revolutionized investigative workflows, enabling the identification of suspects and victims through distant familial connections. This capability has led to a significant number of high-profile case resolutions, which in turn has spurred further investments in software development and database expansion. The growing sophistication of these platforms, with features such as AI-driven relationship mapping and automated kinship analysis, is further enhancing their adoption and effectiveness.




    Another key driver is the expanding application of forensic genetic genealogy software beyond traditional law enforcement. Organizations involved in adoption and family reunification, as well as humanitarian groups dealing with disaster victim identification, are increasingly utilizing these tools to trace lineage and establish biological relationships. The availability of cloud-based deployment options has democratized access to these solutions, allowing smaller agencies and non-profit organizations to harness the power of genetic genealogy without the need for extensive IT infrastructure. Additionally, the collaboration between public and private sector entities is fostering data sharing and interoperability standards, which is crucial for scaling the impact of forensic genealogy in diverse investigative contexts.




    Technological advancements in both genomics and software analytics are playing a pivotal role in shaping the forensic genetic genealogy software market. The integration of next-generation sequencing (NGS) technologies with user-friendly software interfaces has made it possible to process large volumes of DNA data rapidly and accurately. Innovations in data privacy and security protocols are also addressing ethical concerns, thereby increasing public trust and participation in genetic genealogy databases. Furthermore, legislative support in several countries, aiming to streamline the use of genetic data in criminal justice while safeguarding individual rights, is contributing to a favorable regulatory environment for market growth. As a result, the forensic genetic genealogy software market is expected to witness sustained demand from a broadening array of stakeholders.




    From a regional perspective, North America continues to dominate the forensic genetic genealogy software market, accounting for the largest revenue share in 2024, followed by Europe and Asia Pacific. The United States, in particular, has witnessed a surge in adoption due to the proactive stance of law enforcement agencies and the presence of leading software providers. Europe is experiencing rapid growth, driven by cross-border collaborations and increasing investments in forensic science infrastructure. Meanwhile, Asia Pacific is emerging as a high-potential market, with countries such as China, Japan, and Australia investing in digital transformation initiatives within their forensic departments. The Middle East & Africa and Latin America are gradually catching up, supported by international partnerships and capacity-building programs.



  16. F

    Advance Retail Sales: Auto and Other Motor Vehicle Dealers

    • fred.stlouisfed.org
    json
    Updated Nov 25, 2025
    (2025). Advance Retail Sales: Auto and Other Motor Vehicle Dealers [Dataset]. https://fred.stlouisfed.org/series/RSAOMVN
    Explore at:
    Available download formats: json
    Dataset updated
    Nov 25, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Description

    Graph and download economic data for Advance Retail Sales: Auto and Other Motor Vehicle Dealers (RSAOMVN) from Jan 1992 to Sep 2025 about retail trade, vehicles, sales, retail, and USA.

  17. u

    Utah Census Blocks 2020

    • opendata.gis.utah.gov
    • sgid-utah.opendata.arcgis.com
    Updated Feb 27, 2021
    Utah Automated Geographic Reference Center (AGRC) (2021). Utah Census Blocks 2020 [Dataset]. https://opendata.gis.utah.gov/datasets/utah-census-blocks-2020
    Explore at:
    Dataset updated
    Feb 27, 2021
    Dataset authored and provided by
    Utah Automated Geographic Reference Center (AGRC)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    Last Update: 02/2021

    This dataset was downloaded from the 2020 Census Redistricting Data (P.L. 94-171) page. All 2020 census boundaries are current to January 1, 2020. The Census Bureau will release the first set of corresponding demographic data in September 2021 (the 2020 Census Redistricting P.L. 94-171 Summary Files). Following that release, AGRC will append the demographic data to the existing 2020 geographies served on this page.

    Blocks are the smallest geographic areas and the basis for all tabulated census data. Blocks are statistical areas bounded by visible features, such as streets, roads, streams, and railroad tracks, and by nonvisible boundaries, such as selected property lines and city, township, school district, and county limits. A block often represents a typical city block area; however, as with other statistical geographies, rural-area blocks may be very large in spatial size.

    Visit the SGID 2020 Census data page for more information.

  18. A

    Maryland Very High Risk Census Tracts - Very High Risk Census Tracts

    • data.amerigeoss.org
    • data.imap.maryland.gov
    csv, esri rest +4
    Updated Aug 2, 2019
    AmeriGEO ArcGIS (2019). Maryland Very High Risk Census Tracts - Very High Risk Census Tracts [Dataset]. https://data.amerigeoss.org/es_AR/dataset/maryland-very-high-risk-census-tracts-very-high-risk-census-tracts
    Explore at:
    Available download formats: zip, csv, esri rest, geojson, html, kml
    Dataset updated
    Aug 2, 2019
    Dataset provided by
    AmeriGEO ArcGIS
    Area covered
    Maryland
    Description

    Each of the State of Maryland’s 1,406 2010 census tracts was analyzed to determine whether it represented a typical census tract as defined by the U.S. Bureau of the Census. Nationally, a typical census tract optimally has 4,000 inhabitants, though tracts generally range from 1,200 to 8,000 persons. In Maryland the average census tract contains 4,106 persons. Nationally the housing unit threshold for each census tract generally ranges from 480 to 3,200 housing units, with an optimum size of 1,600 housing units. In Maryland the average census tract contains 1,692 housing units. The Emergency Management Planning Database and the Emergency Planning Vulnerable Population Index are intended to help State agency emergency officials plan tactics, develop strategies, allocate resources, and prioritize responses for emergencies, and to identify potentially vulnerable population areas for special attention.



    Statewide, there are 222 census tracts containing persons at “Very High” socioeconomic risk or vulnerability in the event of an emergency. “Very High” risk census tracts account for 16 percent of the State’s 1,390 specified census tracts. These census tracts are located throughout the State in 20 of 24 jurisdictions. There are 773,808 persons living in these areas, making up 13.4 percent of the State’s 2010 Census population of 5,773,552 persons.


    This is a MD iMAP hosted service layer. Find more information at https://imap.maryland.gov.


  19. d

    Postal Code Conversion File [Canada], October 2005, Census of Canada 2001

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 11, 2024
    Statistics Canada. Geography Division (2024). Postal Code Conversion File [Canada], October 2005, Census of Canada 2001 [Dataset]. http://doi.org/10.5683/SP3/GXOMIV
    Explore at:
    Dataset updated
    Dec 11, 2024
    Dataset provided by
    Borealis
    Authors
    Statistics Canada. Geography Division
    Area covered
    Canada
    Description

    The Postal Code Conversion File (PCCF) is a digital file that provides a correspondence between the Canada Post Corporation (CPC) six-character postal code and Statistics Canada's standard geographic areas for which census data and other statistics are produced. Through the link between postal codes and standard geographic areas, the PCCF permits the integration of data from various sources. The Single Link Indicator provides one best link for every postal code, as there are multiple records for many postal codes. To obtain the postal code conversion file or for questions, consult the DLI contact at your educational institution. The geographic coordinates attached to each postal code on the PCCF are commonly used to map the distribution of data for spatial analysis (e.g., clients, activities). The location information is a powerful tool for planning or research purposes. In April 1983, the Geography Division released the first version of the Postal Code Conversion File, which linked postal codes to census geographic areas and included geographic coordinates. Since then, the file has been updated on a regular basis to reflect postal code changes provided by Canada Post Corporation. Every five years, the postal code linkages on the Postal Code Conversion File are “converted” to the latest census geographic areas. The original Postal Code Conversion File was linked to the 1981 Census geographic areas. Since then, the Postal Code Conversion File has undergone four “conversions”, following the 1986, 1991, 1996 and 2001 censuses. A revised automated system was used for the 1996-2001 conversion. The 2001 Census postal codes reported by respondents were used to validate the Postal Code Conversion File links.

  20. R

    MES-Weigh Scale Integration for Lot Genealogy Market Research Report 2033

    • researchintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Research Intelo (2025). MES-Weigh Scale Integration for Lot Genealogy Market Research Report 2033 [Dataset]. https://researchintelo.com/report/mes-weigh-scale-integration-for-lot-genealogy-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Research Intelo
    License

    https://researchintelo.com/privacy-and-policy

    Time period covered
    2024 - 2033
    Area covered
    Global
    Description

    MES-Weigh Scale Integration for Lot Genealogy Market Outlook



    According to our latest research, the Global MES-Weigh Scale Integration for Lot Genealogy market size was valued at $1.2 billion in 2024 and is projected to reach $3.5 billion by 2033, expanding at a CAGR of 12.8% during 2024–2033. One of the primary drivers fueling this robust growth is the increasing demand for real-time traceability and data accuracy in highly regulated industries such as pharmaceuticals, food & beverage, and chemicals. As manufacturers strive to comply with stringent quality standards and regulatory mandates, the integration of Manufacturing Execution Systems (MES) with weigh scale technologies for lot genealogy has become essential. This integration not only enhances operational efficiency but also ensures complete traceability of materials and products throughout the manufacturing process, minimizing risk and maximizing accountability.



    Regional Outlook



    North America currently holds the largest share of the global MES-Weigh Scale Integration for Lot Genealogy market, accounting for over 38% of the total market value in 2024. This dominance is attributed to the region’s mature manufacturing sector, advanced technological infrastructure, and the early adoption of Industry 4.0 principles. The United States, in particular, has seen widespread implementation of MES and weigh scale integration in industries such as pharmaceuticals and food & beverage, driven by strict regulatory requirements from agencies like the FDA. Additionally, the presence of leading MES software providers and weigh scale manufacturers in North America further bolsters the region’s market leadership. Robust investment in automation and digital transformation across manufacturing operations continues to sustain North America’s position as a market leader.



    The Asia Pacific region is anticipated to register the fastest growth, with a projected CAGR of 16.1% from 2024 to 2033. This rapid expansion is primarily driven by the burgeoning manufacturing sectors in China, India, and Southeast Asia, where companies are increasingly investing in automation and digitalization to enhance productivity and competitiveness. Government initiatives supporting smart manufacturing, coupled with rising foreign direct investments and the establishment of new production facilities, are accelerating the adoption of MES-weigh scale integration solutions. Furthermore, the growing focus on food safety, pharmaceutical traceability, and export compliance in Asia Pacific countries is compelling manufacturers to embrace advanced lot genealogy systems, thereby fueling market growth in the region.



    Emerging economies in Latin America and the Middle East & Africa are gradually adopting MES-weigh scale integration for lot genealogy, though market penetration remains relatively low compared to developed regions. In these markets, challenges such as limited technological infrastructure, high initial investment costs, and a lack of skilled personnel often hinder widespread adoption. However, increasing awareness of the benefits of traceability, coupled with evolving regulatory frameworks and the need to access global supply chains, is creating new opportunities. Localized demand is also being shaped by region-specific regulations and the growing presence of multinational manufacturers, which are introducing best practices and advanced technologies to these markets.



    Report Scope





    Attributes          Details
    Report Title        MES-Weigh Scale Integration for Lot Genealogy Market Research Report 2033
    By Component        Software, Hardware, Services
    By Deployment Mode  On-Premises, Cloud
    By Application      Pharmaceuticals, Food & Beverage, Chemicals, Electronics, Automotive, Others
    By Enterprise Size  Small and Medium Enterprises, Large Enterprises

