2 datasets found
  1.

    3-digit occupation code images from the Norwegian census of 1950 - Manual...

    • dataverse.harvard.edu
    • dataverse.no
    • +1 more
    Updated Jul 3, 2023
    Cite
    Harvard Dataverse (2023). 3-digit occupation code images from the Norwegian census of 1950 - Manual review dataset [Dataset]. http://doi.org/10.18710/LYXKN1
    Explore at:
    Available download formats: text/comma-separated-values (54006), txt (7270), zip (1860373835)
    Dataset updated
    Jul 3, 2023
    Dataset provided by
    Harvard Dataverse
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.null/customlicense?persistentId=doi:10.18710/LYXKN1

    Time period covered
    Dec 1, 1950
    Area covered
    Norway
    Dataset funded by
    The Research Council of Norway
    UiT The Arctic University of Norway, interdisciplinary strategic project High North Population Studies
    Description

    This dataset consists of images of handwritten 3-digit occupation codes from the Norwegian population census of 1950. Statistics Norway added the occupation codes to the census sheets after the census was concluded, in order to create aggregated occupational statistics for the entire population. According to Statistics Norway’s official publications (https://www.ssb.no/historisk-statistikk/folketellinger/folketellingen-1950, booklet 4, page 81), the coding standard used in the 1950 census is very similar to the standard used in the 1920 census; cf. the 13th booklet published for the 1920 census (https://www.ssb.no/historisk-statistikk/folketellinger/folketellingen-1920; note that this booklet is only available in Norwegian).

    In short, an occupation code is a 3-digit number that corresponds to a given occupation or type of occupation. The official list of occupation codes provided by Statistics Norway contains 339 unique codes. The codes are not, in general, sequential or hierarchical, although some subgroupings are. The list can be found under Files.

    The images were extracted algorithmically from the original census sheet images. This process was not flawless and led to additional images being extracted; these can contain written occupation titles or be entirely blank.

    The dataset consists of 90,000 unique images and 9,000 images that were randomly selected and copied from the unique images. All of them were used in a research project (preprint: https://doi.org/10.48550/arXiv.2306.16126; the author list can be found in the preprint) in which we tried to find a more efficient way of reviewing and correcting classification results from a machine learning model when those results did not pass a pre-set confidence threshold.

    This was a follow-up to our previous article, which describes the initial project and the creation of our model in more detail (“Lessons Learned Developing and Using a Machine Learning Model to Automatically Transcribe 2.3 Million Handwritten Occupation Codes”, https://doi.org/10.51964/hlcs11331).
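The review step mentioned in the description, where classification results that do not pass a pre-set confidence threshold are sent to manual review, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Prediction` fields, the `split_for_review` helper, the threshold value of 0.9, the example codes, and the image identifiers are all assumptions made for illustration. The sketch also treats a predicted code that is absent from the list of valid codes as needing review, which is a design choice of this example rather than something the description states.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    image_id: str      # hypothetical identifier for one extracted image
    code: str          # predicted 3-digit occupation code, kept as a string
    confidence: float  # model confidence in the range [0, 1]


def split_for_review(predictions, valid_codes, threshold=0.9):
    """Split predictions into accepted ones and ones needing manual review.

    A prediction is accepted only if its code is in valid_codes and its
    confidence is at least the threshold (0.9 is an assumed value).
    """
    accepted, review = [], []
    for p in predictions:
        if p.code in valid_codes and p.confidence >= threshold:
            accepted.append(p)
        else:
            review.append(p)
    return accepted, review


# Hypothetical codes and predictions, for illustration only.
valid_codes = {"101", "245", "733"}
predictions = [
    Prediction("img_001", "101", 0.98),  # confident and valid: accepted
    Prediction("img_002", "245", 0.41),  # below threshold: manual review
    Prediction("img_003", "999", 0.97),  # not in the code list: manual review
]
accepted, review = split_for_review(predictions, valid_codes)
```

In this sketch the codes are kept as strings so that any leading zeros would be preserved; the real list of 339 codes would be loaded from the file provided with the dataset.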

  2.

    3-digit occupation code images from the Norwegian census of 1950 - Manual...

    • b2find.dkrz.de
    Updated Jun 22, 2024
    Cite
    (2024). 3-digit occupation code images from the Norwegian census of 1950 - Manual review dataset - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/efbcedaa-3811-583f-95da-38371daf5ae8
    Dataset updated
    Jun 22, 2024


