2 datasets found

3-digit occupation code images from the Norwegian census of 1950 - Manual...
dataverse.harvard.edu
dataverse.no
+1 more
Updated Jul 3, 2023
Cite
Harvard Dataverse (2023). 3-digit occupation code images from the Norwegian census of 1950 - Manual review dataset [Dataset]. http://doi.org/10.18710/LYXKN1
Available download formats: text/comma-separated-values (54006), txt (7270), zip (1860373835)
Unique identifier
https://doi.org/10.18710/LYXKN1
Dataset updated
Jul 3, 2023
Dataset provided by
Harvard Dataverse
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.null/customlicense?persistentId=doi:10.18710/LYXKN1
Time period covered
Dec 1, 1950
Area covered
Norway
Dataset funded by
The Research Council of Norway
UiT The Arctic University of Norway, interdisciplinary strategic project High North Population Studies
Description
This dataset consists of images containing handwritten 3-digit occupation codes from the Norwegian population census of 1950. Statistics Norway added the occupation codes to the census sheets after the census was concluded, in order to create aggregated occupational statistics for the entire population. According to Statistics Norway’s official publications (https://www.ssb.no/historisk-statistikk/folketellinger/folketellingen-1950, booklet 4, page 81), the coding standard used in the 1950 census is very similar to the standard used in the 1920 census; cf. the 13th booklet published for the 1920 census (https://www.ssb.no/historisk-statistikk/folketellinger/folketellingen-1920; note that this booklet is only available in Norwegian). In short, an occupation code is a 3-digit number that corresponds to a given occupation or type of occupation. According to the official list of occupation codes provided by Statistics Norway, there are 339 unique codes. The codes are not, in general, sequential or hierarchical, although some subgroupings are. The list can be found under Files. Note that the images were extracted algorithmically from the original census sheet images. This process was not flawless and led to additional images being extracted; these can contain written occupation titles or be entirely blank. The dataset consists of 90,000 unique images, plus 9,000 images that were randomly selected and copied from the unique images. All of them were used in a research project (preprint: https://doi.org/10.48550/arXiv.2306.16126; the author list can be found in the preprint) in which we tried to find a more efficient way of reviewing and correcting classification results from a machine learning model when those results did not pass a pre-set confidence threshold.
This was a follow-up to our previous article, which describes the initial project and the creation of our model in more detail (“Lessons Learned Developing and Using a Machine Learning Model to Automatically Transcribe 2.3 Million Handwritten Occupation Codes”, https://doi.org/10.51964/hlcs11331).
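The review workflow mentioned in the description, where model results below a pre-set confidence threshold are sent to manual review, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Prediction` record, the `split_by_confidence` helper, the file names, and the threshold of 0.9 are hypothetical and are not taken from the dataset or the linked articles.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical record for one classified image (not the dataset's schema)."""
    image_id: str
    code: str          # predicted 3-digit occupation code, kept as a string
    confidence: float  # model confidence in the range [0, 1]


def split_by_confidence(predictions, threshold=0.9):
    """Separate predictions into auto-accepted and manual-review lists.

    The threshold is an assumed example value, not the one used by the authors.
    """
    accepted, needs_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            needs_review.append(p)
    return accepted, needs_review


# Made-up example predictions; the image names are placeholders.
preds = [
    Prediction("img_0001.jpg", "123", 0.98),
    Prediction("img_0002.jpg", "045", 0.41),
    Prediction("img_0003.jpg", "310", 0.90),
]

accepted, needs_review = split_by_confidence(preds, threshold=0.9)
print(len(accepted), "accepted;", len(needs_review), "sent to manual review")
```

Keeping the code as a string preserves leading zeros (for example "045"), which would be lost if the 3-digit codes were stored as integers.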
3-digit occupation code images from the Norwegian census of 1950 - Manual...
b2find.dkrz.de
Updated Jun 22, 2024
Cite
(2024). 3-digit occupation code images from the Norwegian census of 1950 - Manual review dataset - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/efbcedaa-3811-583f-95da-38371daf5ae8
Dataset updated
Jun 22, 2024