100+ datasets found
  1. Data from: USAGE OF DISSIMILARITY MEASURES AND MULTIDIMENSIONAL SCALING FOR...

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • s.cnmilf.com
    • +2 more
    Updated Feb 19, 2025
    Cite
    nasa.gov (2025). USAGE OF DISSIMILARITY MEASURES AND MULTIDIMENSIONAL SCALING FOR LARGE SCALE SOLAR DATA ANALYSIS [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/usage-of-dissimilarity-measures-and-multidimensional-scaling-for-large-scale-solar-data-an
    Explore at:
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    USAGE OF DISSIMILARITY MEASURES AND MULTIDIMENSIONAL SCALING FOR LARGE SCALE SOLAR DATA ANALYSIS. Juan M. Banda, Rafal Angryk. ABSTRACT: This work describes the application of several dissimilarity measures combined with multidimensional scaling for large-scale solar data analysis. Using the first solar domain-specific benchmark data set that contains multiple types of phenomena, we investigated combinations of different image parameters with different dissimilarity measures in order to determine which combinations allow us to differentiate our solar data within each class and versus the rest of the classes. In this work we also address the issue of reducing dimensionality by applying multidimensional scaling to the dissimilarity matrices produced by the previously mentioned combinations. By applying multidimensional scaling we can investigate how many resulting components are needed in order to maintain a good representation of our data (in an artificial dimensional space) and how many can be discarded in order to economize our storage costs. We present a comparative analysis between different classifiers in order to determine the amount of dimensionality reduction that can be achieved with a given combination of image parameters, dissimilarity measure, and multidimensional scaling.
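    As a rough illustration of the pipeline described above (not the authors' code), the sketch below computes a dissimilarity matrix from hypothetical image-parameter vectors and embeds it with multidimensional scaling; the metric choice and dimensions are placeholders.

    import numpy as np
    from scipy.spatial.distance import pdist, squareform
    from sklearn.manifold import MDS

    # Hypothetical image-parameter matrix: rows = solar images, columns = extracted parameters.
    rng = np.random.default_rng(0)
    X = rng.random((200, 64))

    # One of many possible dissimilarity measures (the paper compares several).
    D = squareform(pdist(X, metric="cityblock"))

    # Metric MDS on the precomputed dissimilarity matrix; n_components controls how many
    # dimensions are kept before feeding a classifier.
    mds = MDS(n_components=10, dissimilarity="precomputed", random_state=0)
    embedding = mds.fit_transform(D)
    print(embedding.shape)  # (200, 10)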

  2. Data from: Do Agile Scaling Approaches Make A Difference? An Empirical...

    • zenodo.org
    • data.niaid.nih.gov
    Updated Oct 2, 2023
    Cite
    Christiaan Verwijs; Christiaan Verwijs; Daniel Russo; Daniel Russo (2023). Do Agile Scaling Approaches Make A Difference? An Empirical Comparison of Team Effectiveness Across Popular Scaling Approaches [Dataset]. http://doi.org/10.5281/zenodo.8396487
    Explore at:
    Dataset updated
    Oct 2, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christiaan Verwijs; Christiaan Verwijs; Daniel Russo; Daniel Russo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This bundle contains supplementary materials for the upcoming academic publication "Do Agile Scaling Approaches Make A Difference? An Empirical Comparison of Team Effectiveness Across Popular Scaling Approaches" by Christiaan Verwijs and Daniel Russo. Included in the bundle are the dataset and SPSS syntaxes. This replication package is made available by C. Verwijs under a "Creative Commons Attribution Non-Commercial Share-Alike 4.0 International" license (CC BY-NC-SA 4.0).

    About the dataset

    The dataset (SPSS) contains anonymized response data from 15,078 team members, aggregated into 4,013 Agile teams that participated via scrumteamsurvey.org. Stakeholder evaluations from 1,841 stakeholders were also collected for 529 of those teams. Data were gathered between September 2021 and September 2023. We removed careless responses from the individual response data and removed all data that could potentially identify teams, individuals, or their parent organizations. Because we wanted to analyze our measures at the team level, we calculated a team-level mean for each item in the survey. Such aggregation is only justified when at least 10% of the variance exists at the team level (Hair, 2019), which was the case (ICC = 35-50%). No data were missing at the team level.
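    As a hedged sketch of the team-level aggregation and the ICC check mentioned above (column names are hypothetical, not taken from the bundle), a one-way ICC(1) per item could be computed in Python as follows:

    import pandas as pd

    def icc1(df, group_col, value_col):
        """One-way ANOVA ICC(1): share of variance located at the group (team) level."""
        g = df.groupby(group_col)[value_col]
        n_groups, k = g.ngroups, g.size().mean()
        grand_mean = df[value_col].mean()
        ms_between = (g.size() * (g.mean() - grand_mean) ** 2).sum() / (n_groups - 1)
        ms_within = ((df[value_col] - g.transform("mean")) ** 2).sum() / (len(df) - n_groups)
        return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

    # responses = pd.read_csv("responses.csv")            # hypothetical individual-level export
    # print(icc1(responses, "team_id", "item_1"))         # aggregate only if enough team-level variance
    # team_means = responses.groupby("team_id").mean()    # team-level means used for analysis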

    Question labels and option labels are provided separately in Questions.csv. To conform to the privacy statement of scrumteamsurvey.org, the bundle does not include response data from before the team-level aggregation.

    About the SPSS syntaxes

    The bundle includes the syntaxes we used to prepare the dataset from the raw import, as well as the syntax we used to generate descriptives. These are provided mostly so that other researchers can verify our procedure.

  3. Dataset for Scales Mound, IL Census Bureau Demographics and Population...

    • neilsberg.com
    Updated Jul 24, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Dataset for Scales Mound, IL Census Bureau Demographics and Population Distribution Across Age // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b7b2ec07-5460-11ee-804b-3860777c1fe6/
    Explore at:
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Scales Mound, Illinois
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Scales Mound population by age. The dataset can be utilized to understand the age distribution and demographics of Scales Mound.

    Content

    The dataset comprises the following three datasets:

    • Scales Mound, IL Age Group Population Dataset: A complete breakdown of Scales Mound age demographics from 0 to 85 years, distributed across 18 age groups
    • Scales Mound, IL Age Cohorts Dataset: Children, Working Adults, and Seniors in Scales Mound - Population and Percentage Analysis
    • Scales Mound, IL Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

  4. Beneath the SURFace: An MRI-like View into the Life of a 21st Century...

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Aug 26, 2020
    Cite
    Codreanu, Valeriu (2020). Beneath the SURFace: An MRI-like View into the Life of a 21st Century Datacenter [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3878142
    Explore at:
    Dataset updated
    Aug 26, 2020
    Dataset provided by
    Codreanu, Valeriu
    Laursen Olason, Kristian Valur
    Iosup, Alexandru
    Uta, Alexandru
    Podareanu, Damian
    Melis, Paul
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a trace archive of metrics collected from the Lisa cluster at SURFsara, associated with the article published in USENIX ;login: in July 2020.

    The GitHub repository contains documentation as well as the scripts required to replicate the work from the ;login: article: https://github.com/sara-nl/SURFace

    Real-world data can be instrumental in answering detailed questions: How do we know which assumptions regarding large-scale systems are realistic? How do we know that the systems we build are practical? How do we know which metrics are important to assess when analyzing performance? To answer such questions, we need to collect and share operational traces containing real-world, detailed data. Not only is the presence of low-level metrics significant, but they also help avoid biases through their variety. To address variety, there exist several types of archives, such as the Parallel Workloads Archive, the Grid Workloads Archive, and the Google or Microsoft logs (the Appendix gives a multi-decade overview). However, such traces mostly focus on higher-level scheduling decisions and high-level, job-based resource utilization (e.g., consumed CPU and memory). Thus, they do not provide vital information to system administrators or researchers analyzing the full-stack or the OS-level operation of datacenters.

    The traces we are sharing have the finest granularity of all other open-source traces published so far. In addition to scheduler-level logs, they contain over 100 low-level, server-based metrics, going to the granularity of page-faults or bytes transferred through a NIC.

    The SURF archive

    Datacenters already exhibit unprecedented scale and are becoming increasingly more complex. Moreover, such computer systems have begun having a significant impact on the environment, for example, training some machine learning models has sizable carbon footprints. As our recent work on modern datacenter networks shows, low-level data is key to understanding full-stack operation, including high-level application behavior. We advocate it is time to start using such data more systematically, unlocking its potential in helping us understand how to make (datacenter) systems more efficient. We advocate that our data can contribute to a more holistic approach, looking at how the multitude of these systems work together in a large-scale datacenter.

    This archive contains data from the Dutch National Infrastructure, Lisa.

    Description of the Lisa system

    Description of the Cartesius system

    We gather metrics, at 15-second intervals, from several data sources:

    Slurm: all job, task, and scheduler related data, such as running time, queueing time, failures, servers involved in the execution, organization in partitions, and scheduling policies.

    NVIDIA Management Library (NVML): per GPU, data such as power metrics, temperature, fan speed, or used memory.

    IPMI: per server, data such as power metrics and temperature.

    OS-level: from either procfs, sockstat, or netstat data: low-level OS metrics, regarding the state of each server, including CPU, disk, memory, network utilization, context switches, and interrupts.

    We also release other kinds of novel information, related to datacenter topology and organization.

    The audience we envision using these metrics is composed of systems researchers, infrastructure developers and designers, system administrators, and software developers for large-scale infrastructure. The frequency of collecting data is uniquely high for open-source data, which could allow these experts unprecedented views into the operation of a real datacenter.

    • Note: For the GPU metrics, a number of nodes were introduced to the system in late February/early March; these specific nodes have no data available for January and February, which may cause irregularities. The GitHub repository will contain code snippets showing how to filter the data so that this is not a problem, and how to graph the Parquet data (this is pending update in the next few days).
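    Until those snippets land in the repository, a minimal pandas sketch along these lines (file and column names are assumptions, not the archive's actual schema) shows the kind of filtering meant above:

    import pandas as pd

    # Hypothetical file/column names; consult the SURFace GitHub repository for the real layout.
    gpu = pd.read_parquet("gpu_metrics.parquet")
    gpu["timestamp"] = pd.to_datetime(gpu["timestamp"], unit="s")

    # Drop January/February readings for nodes that were only added in late Feb/early March.
    gpu = gpu[gpu["timestamp"] >= "2020-03-01"]
    hourly_power = gpu.set_index("timestamp")["power_watts"].resample("1H").mean()
    print(hourly_power.head())
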
  5. Does Size Matter? Scaling of CO2 Emissions and U.S

    • data.subak.org
    • plos.figshare.com
    dta
    Updated Feb 16, 2023
    Cite
    Figshare (2023). Does Size Matter? Scaling of CO2 Emissions and U.S [Dataset]. http://doi.org/10.1371/journal.pone.0064727
    Explore at:
    dta (available download formats)
    Dataset updated
    Feb 16, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Urban areas consume more than 66% of the world’s energy and generate more than 70% of global greenhouse gas emissions. With the world’s population expected to reach 10 billion by 2100, nearly 90% of whom will live in urban areas, a critical question for planetary sustainability is how the size of cities affects energy use and carbon dioxide (CO2) emissions. Are larger cities more energy and emissions efficient than smaller ones? Do larger cities exhibit gains from economies of scale with regard to emissions? Here we examine the relationship between city size and CO2 emissions for U.S. metropolitan areas using a production accounting allocation of emissions. We find that for the time period of 1999–2008, CO2 emissions scale proportionally with urban population size. Contrary to theoretical expectations, larger cities are not more emissions efficient than smaller ones.
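    The scaling claim can be checked with a simple log-log regression; the sketch below uses made-up numbers purely to illustrate the calculation, not the paper's data.

    import numpy as np

    # Hypothetical metropolitan areas: population and annual CO2 emissions (tonnes).
    population = np.array([5e4, 2e5, 8e5, 3e6, 1.2e7])
    emissions = np.array([3.1e5, 1.2e6, 4.9e6, 1.8e7, 7.4e7])

    # Fit E = c * P**beta on log-log axes; beta close to 1 means emissions scale proportionally with size.
    beta, log_c = np.polyfit(np.log(population), np.log(emissions), 1)
    print(f"scaling exponent beta = {beta:.2f}")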

  6. Data from: Large-scale integration of single-cell transcriptomic data...

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    zip
    Updated Dec 14, 2021
    Cite
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
    Explore at:
    zip (available download formats)
    Dataset updated
    Dec 14, 2021
    Dataset provided by
    Cornell University
    Authors
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

    Methods Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute of Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

    Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

    Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

    Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1 = 28 bp, Read 2 = 120 bp, Index 1 = 10 bp, and Index 2 = 10 bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loupe Browser v4.0.0 software (10x Genomics). Sequencing data were then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

    Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

    Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset, individually. BCmvn optimization was used for PK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default PN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch-correction with three tools, independently- Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
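    The authors worked in R (SoupX, Seurat, DoubletFinder); as a rough Python/scanpy analogue of just the quality-filtering thresholds described above (a sketch under those assumptions, not the study's pipeline), one could write:

    import scanpy as sc

    adata = sc.read_10x_mtx("filtered_feature_bc_matrix/")   # hypothetical cellranger output path

    # Flag mitochondrial genes (mouse naming) and compute per-cell QC metrics.
    adata.var["mt"] = adata.var_names.str.startswith("mt-")
    sc.pp.calculate_qc_metrics(adata, qc_vars=["mt"], percent_top=None, log1p=False, inplace=True)

    # Thresholds mirroring the description: >=750 features, >=1000 transcripts, <=30% mitochondrial.
    adata = adata[(adata.obs.n_genes_by_counts >= 750)
                  & (adata.obs.total_counts >= 1000)
                  & (adata.obs.pct_counts_mt <= 30)].copy()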

    Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.

    Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

    Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myoctes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using

  7. Data from: ANES Scales Often Don't Measure What You Think They Measure: An...

    • osf.io
    Updated Sep 16, 2020
    Cite
    Matthew Pietryka; Randall C. MacIntosh (2020). ANES Scales Often Don't Measure What You Think They Measure: An ERPC2016 Analysis [Dataset]. https://osf.io/9n4zs
    Explore at:
    Dataset updated
    Sep 16, 2020
    Dataset provided by
    Center For Open Science
    Authors
    Matthew Pietryka; Randall C. MacIntosh
    Description

    Political surveys often include multi-item scales to measure individual predispositions such as authoritarianism, egalitarianism, or racial resentment. Scholars use these scales to examine group differences in these predispositions, comparing women to men, rich to poor, or Republicans to Democrats. Such research implicitly assumes that, say, Republicans' and Democrats' responses to the egalitarianism scale measure the same construct in the same metric. This research rarely evaluates whether the data possess the characteristics necessary to justify this equivalence assumption. We present a framework to test this assumption and correct scales when it fails to hold. Examining 13 commonly used scales on the 2012 and 2016 ANES, we find widespread violations of the equivalence assumption. These violations often bias the estimated magnitude or direction of theoretically important group differences. These results suggest we must reevaluate what we think we know about the causes and consequences of authoritarianism, egalitarianism, and other predispositions.

  8. PromptCloud Ecommerce Data - Web Scraping & Data Extraction from Online...

    • datarade.ai
    .json, .xml, .csv
    Updated Nov 21, 2023
    Cite
    PromptCloud (2023). PromptCloud Ecommerce Data - Web Scraping & Data Extraction from Online Marketplaces Globally | Custom Data Extraction Services | 99% Data Accuracy [Dataset]. https://datarade.ai/data-products/promptcloud-ecommerce-data-web-scraping-data-extraction-f-promptcloud
    Explore at:
    .json, .xml, .csv (available download formats)
    Dataset updated
    Nov 21, 2023
    Dataset authored and provided by
    PromptCloud
    Area covered
    Greece, Virgin Islands (British), Falkland Islands (Malvinas), Tokelau, Canada, Panama, Pakistan, Mongolia, Åland Islands, Bolivia (Plurinational State of)
    Description

    You can quickly implement eCommerce data scraping projects by following a few easy steps, and you will see that our core focus is on data quality and speed of implementation.

    We can fulfill your large scale data scraping requirements even on complex sites without any coding in the shortest time possible. We have ready-to-use eCommerce scraping recipes as a result of our vast experience in building large-scale web crawlers for multiple clients across different verticals, catering to various use cases, including, but not limited to:

    1. Product Price Tracking
    2. Product Demand Analysis
    3. Product Trends
    4. Sentiment Analysis
    5. Seller Analysis
    6. Competitor Monitoring

    We are committed to putting data at the heart of your business. Reach out for a no-frills PromptCloud experience: professional, technologically ahead, and reliable.

  9. National Hydrography Dataset Plus Version 2.1

    • resilience.climate.gov
    • oregonwaterdata.org
    • +1 more
    Updated Aug 16, 2022
    + more versions
    Cite
    Esri (2022). National Hydrography Dataset Plus Version 2.1 [Dataset]. https://resilience.climate.gov/maps/4bd9b6892530404abfe13645fcb5099a
    Explore at:
    Dataset updated
    Aug 16, 2022
    Dataset authored and provided by
    Esri (http://esri.com/)
    Area covered
    North Pacific Ocean, Pacific Ocean
    Description

    The National Hydrography Dataset Plus (NHDPlus) maps the lakes, ponds, streams, rivers and other surface waters of the United States. Created by the US EPA Office of Water and the US Geological Survey, the NHDPlus provides mean annual and monthly flow estimates for rivers and streams. Additional attributes provide connections between features, facilitating complicated analyses. For more information on the NHDPlus dataset see the NHDPlus v2 User Guide.

    Dataset Summary

    • Phenomenon Mapped: Surface waters and related features of the United States and associated territories, not including Alaska.
    • Geographic Extent: The United States, not including Alaska, Puerto Rico, Guam, US Virgin Islands, Marshall Islands, Northern Mariana Islands, Palau, Federated States of Micronesia, and American Samoa.
    • Projection: Web Mercator Auxiliary Sphere
    • Visible Scale: Visible at all scales, but the layer draws best at scales larger than 1:1,000,000.
    • Source: EPA and USGS
    • Update Frequency: There is no new data since this 2019 version, so no updates are planned.
    • Publication Date: March 13, 2019

    Prior to publication, the NHDPlus network and non-network flowline feature classes were combined into a single flowline layer. Similarly, the NHDPlus Area and Waterbody feature classes were merged under a single schema. Attribute fields were added to the flowline and waterbody layers to simplify symbology and enhance the layer's pop-ups. Fields added include Pop-up Title, Pop-up Subtitle, On or Off Network (flowlines only), Esri Symbology (waterbodies only), and Feature Code Description. All other attributes are from the original NHDPlus dataset. No-data values of -9999 and -9998 were converted to Null values for many of the flowline fields.

    What can you do with this layer?

    Feature layers work throughout the ArcGIS system. Generally your workflow with feature layers will begin in ArcGIS Online or ArcGIS Pro. Below are just a few of the things you can do with a feature service in Online and Pro.

    ArcGIS Online

    • Add this layer to a map in the map viewer. The layer is limited to scales of approximately 1:1,000,000 or larger, but a vector tile layer created from the same data can be used at smaller scales to produce a web map that displays across the full range of scales. The layer, or a map containing it, can be used in an application.
    • Change the layer's transparency and set its visibility range.
    • Open the layer's attribute table and make selections. Selections made in the map or table are reflected in the other. Center on selection zooms to features selected in the map or table, and show selected records displays the selected records in the table.
    • Apply filters. For example, you can set a filter to show larger streams and rivers using the mean annual flow attribute or the stream order attribute (see the Python sketch after this description).
    • Change the layer's style and symbology.
    • Add labels and set their properties.
    • Customize the pop-up.
    • Use as an input to the ArcGIS Online analysis tools. This layer works well as a reference layer with the trace downstream and watershed tools. The buffer tool can be used to draw protective boundaries around streams, and the extract data tool can be used to create copies of portions of the data.

    ArcGIS Pro

    • Add this layer to a 2D or 3D map.
    • Use as an input to geoprocessing. For example, copy features allows you to select and then export portions of the data to a new feature class.
    • Change the symbology and the attribute field used to symbolize the data.
    • Open the table and make interactive selections with the map.
    • Modify the pop-ups.
    • Apply definition queries to create subsets of the layer.

    This layer is part of the ArcGIS Living Atlas of the World, which provides an easy way to explore the landscape layers and many other beautiful and authoritative maps on hundreds of topics.

    Questions? Please leave a comment below if you have a question about this layer, and we will get back to you as soon as possible.
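    The filter example from the list above could look like this with the ArcGIS API for Python; the layer index and field names (e.g. StreamOrde) are assumptions to be checked against the layer's actual schema.

    from arcgis.gis import GIS

    gis = GIS()  # anonymous connection to ArcGIS Online
    item = gis.content.get("4bd9b6892530404abfe13645fcb5099a")  # item id from the URL above
    flowlines = item.layers[0]  # assuming the flowline layer is the first layer

    # Filter to larger rivers using the stream order attribute (field names are assumptions).
    result = flowlines.query(where="StreamOrde >= 6",
                             out_fields="GNIS_Name,StreamOrde",
                             result_record_count=20)
    print(len(result.features))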

  10. Data from: Does conspicuousness scale linearly with colour distance? a test...

    • zenodo.org
    • datadryad.org
    zip
    Updated Jun 3, 2022
    Cite
    Karen Cheney; Karen Cheney (2022). Does conspicuousness scale linearly with colour distance? a test using reef fish [Dataset]. http://doi.org/10.5061/dryad.jq2bvq874
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 3, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Karen Cheney; Karen Cheney
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    To be effective, animal colour signals must attract attention – and therefore need to be conspicuous. To understand signal function, it is useful to evaluate their conspicuousness to relevant viewers under various environmental conditions, including when visual scenes are cluttered by objects of varying colour. A widely used metric of colour difference (ΔS) is based on the Receptor Noise Limited (RNL) model, which was originally proposed to determine when two similar colours appear different from one another, termed the discrimination threshold (or JND, just noticeable difference). Estimates of the perceptual distances between colours that exceed this threshold – termed 'suprathreshold' colour differences – often assume that a colour's conspicuousness scales linearly with colour distance, and that this scale is independent of direction in colour space. Currently, there is little behavioural evidence to support these assumptions. This study evaluated the relationship between ΔS and conspicuousness in suprathreshold colours using an Ishihara-style test with a coral reef fish, Rhinecanthus aculeatus. As our measure of conspicuousness, we tested whether fish, when presented with two colourful targets, preferred to peck at the one with a greater ΔS from the average distractor colour. We found the relationship between ΔS and conspicuousness followed a sigmoidal function, with high ΔS colours perceived as equally conspicuous. We found that the relationship between ΔS and conspicuousness varied across colour space (i.e. for different hues). The sigmoidal detectability curve was little affected by colour variation in the background or when colour distance was calculated using a model that does not correct for receptor noise. These results suggest that the RNL model may provide accurate estimates of perceptual distance for small suprathreshold colour distances, even in complex viewing environments, but must be used with caution for perceptual distances exceeding 10 ΔS.
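    To make the sigmoidal relationship concrete, the sketch below fits a logistic detectability curve to hypothetical choice data (the numbers are invented, not the study's measurements).

    import numpy as np
    from scipy.optimize import curve_fit

    # Hypothetical proportion of correct target choices at increasing colour distance (Delta S).
    delta_s = np.array([1, 2, 4, 6, 8, 10, 14, 20], dtype=float)
    p_correct = np.array([0.52, 0.58, 0.70, 0.82, 0.90, 0.94, 0.96, 0.97])

    def sigmoid(x, x0, k):
        # Logistic rising from chance (0.5) towards 1, mirroring the sigmoidal curve described above.
        return 0.5 + 0.5 / (1.0 + np.exp(-k * (x - x0)))

    (x0, k), _ = curve_fit(sigmoid, delta_s, p_correct, p0=[5.0, 0.5])
    print(f"half-saturation at Delta S = {x0:.1f}, slope k = {k:.2f}")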

  11. Data from: Species richness change across spatial scales

    • zenodo.org
    • borealisdata.ca
    • +5 more
    zip
    Updated Jun 1, 2022
    + more versions
    Cite
    Jonathan M. Chase; Brian J. McGill; Patrick L. Thompson; Laura H. Antão; Amanda E. Bates; Shane A. Blowes; Maria Dornelas; Andrew Gonzalez; Anne E. Magurran; Sarah R. Supp; Marten Winter; Anne D. Bjorkmann; Helge Bruelheide; Jarrett E.K. Byrnes; Juliano Sarmento Cabral; Robin Ehali; Catalina Gomez; Hector M. Guzman; Forest Isbell; Isla H. Myers-Smith; Holly P. Jones; Jessica Hines; Mark Vellend; Conor Waldock; Mary O'Connor; Jonathan M. Chase; Brian J. McGill; Patrick L. Thompson; Laura H. Antão; Amanda E. Bates; Shane A. Blowes; Maria Dornelas; Andrew Gonzalez; Anne E. Magurran; Sarah R. Supp; Marten Winter; Anne D. Bjorkmann; Helge Bruelheide; Jarrett E.K. Byrnes; Juliano Sarmento Cabral; Robin Ehali; Catalina Gomez; Hector M. Guzman; Forest Isbell; Isla H. Myers-Smith; Holly P. Jones; Jessica Hines; Mark Vellend; Conor Waldock; Mary O'Connor (2022). Data from: Species richness change across spatial scales [Dataset]. http://doi.org/10.5061/dryad.2jk717g
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 1, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jonathan M. Chase; Brian J. McGill; Patrick L. Thompson; Laura H. Antão; Amanda E. Bates; Shane A. Blowes; Maria Dornelas; Andrew Gonzalez; Anne E. Magurran; Sarah R. Supp; Marten Winter; Anne D. Bjorkmann; Helge Bruelheide; Jarrett E.K. Byrnes; Juliano Sarmento Cabral; Robin Ehali; Catalina Gomez; Hector M. Guzman; Forest Isbell; Isla H. Myers-Smith; Holly P. Jones; Jessica Hines; Mark Vellend; Conor Waldock; Mary O'Connor; Jonathan M. Chase; Brian J. McGill; Patrick L. Thompson; Laura H. Antão; Amanda E. Bates; Shane A. Blowes; Maria Dornelas; Andrew Gonzalez; Anne E. Magurran; Sarah R. Supp; Marten Winter; Anne D. Bjorkmann; Helge Bruelheide; Jarrett E.K. Byrnes; Juliano Sarmento Cabral; Robin Ehali; Catalina Gomez; Hector M. Guzman; Forest Isbell; Isla H. Myers-Smith; Holly P. Jones; Jessica Hines; Mark Vellend; Conor Waldock; Mary O'Connor
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Humans have elevated global extinction rates and thus lowered global-scale species richness. However, there is no a priori reason to expect that losses of global species richness should always, or even often, trickle down to losses of species richness at regional and local scales, even though this relationship is often assumed. Here, we show that scale can modulate our estimates of species richness change through time in the face of anthropogenic pressures, but not in a unidirectional way. Instead, the magnitude of species richness change through time can increase, decrease, reverse, or be unimodal across spatial scales. Using several case studies, we show different forms of scale-dependent richness change through time in the face of anthropogenic pressures. For example, Central American corals show a homogenization pattern, where small scale richness is largely unchanged through time, while larger scale richness change is highly negative. Alternatively, birds in North America showed a differentiation effect, where species richness was again largely unchanged through time at small scales, but was more positive at larger scales. Finally, we collated data from a heterogeneous set of studies of different taxa measured through time from sites ranging from small plots to entire continents, and found highly variable patterns that nevertheless imply complex scale-dependence in several taxa. In summary, understanding how biodiversity is changing in the Anthropocene requires an explicit recognition of the influence of spatial scale, and we conclude with some recommendations for how to better incorporate scale into our estimates of change.

  12. Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles &...

    • datarade.ai
    Updated Jan 1, 2022
    + more versions
    Cite
    Success.ai (2022). Success.ai | LinkedIn Full Dataset | Enrichment API – 700M Public Profiles & 70M Companies – Best Price and Quality Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-linkedin-full-dataset-enrichment-api-700m-pu-success-ai
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txt (available download formats)
    Dataset updated
    Jan 1, 2022
    Dataset provided by
    Area covered
    Greenland, Nicaragua, Equatorial Guinea, Jordan, Svalbard and Jan Mayen, Qatar, Guatemala, Saint Barthélemy, United Republic of, Tunisia
    Description

    Success.ai’s LinkedIn Data Solutions offer unparalleled access to a vast dataset of 700 million public LinkedIn profiles and 70 million LinkedIn company records, making it one of the most comprehensive and reliable LinkedIn datasets available on the market today. Our employee data and LinkedIn data are ideal for businesses looking to streamline recruitment efforts, build highly targeted lead lists, or develop personalized B2B marketing campaigns.

    Whether you’re looking for recruiting data, conducting investment research, or seeking to enrich your CRM systems with accurate and up-to-date LinkedIn profile data, Success.ai provides everything you need with pinpoint precision. By tapping into LinkedIn company data, you’ll have access to over 40 critical data points per profile, including education, professional history, and skills.

    Key Benefits of Success.ai’s LinkedIn Data: Our LinkedIn data solution offers more than just a dataset. With GDPR-compliant data, AI-enhanced accuracy, and a price match guarantee, Success.ai ensures you receive the highest-quality data at the best price in the market. Our datasets are delivered in Parquet format for easy integration into your systems, and with millions of profiles updated daily, you can trust that you’re always working with fresh, relevant data.
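    For example, a delivered Parquet extract can be loaded in a couple of lines of Python (the file name here is hypothetical):

    import pandas as pd

    profiles = pd.read_parquet("linkedin_profiles.parquet")  # hypothetical extract file name
    print(profiles.shape)
    print(profiles.columns.tolist()[:10])  # inspect the available data points per profile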

    API Integration: Our datasets are easily accessible via API, allowing for seamless integration into your existing systems. This ensures that you can automate data retrieval and update processes, maintaining the flow of fresh, accurate information directly into your applications.

    Global Reach and Industry Coverage: Our LinkedIn data covers professionals across all industries and sectors, providing you with detailed insights into businesses around the world. Our geographic coverage spans 259M profiles in the United States, 22M in the United Kingdom, 27M in India, and thousands of profiles in regions such as Europe, Latin America, and Asia Pacific. With LinkedIn company data, you can access profiles of top companies from the United States (6M+), United Kingdom (2M+), and beyond, helping you scale your outreach globally.

    Why Choose Success.ai’s LinkedIn Data: Success.ai stands out for its tailored approach and white-glove service, making it easy for businesses to receive exactly the data they need without managing complex data platforms. Our dedicated Success Managers will curate and deliver your dataset based on your specific requirements, so you can focus on what matters most—reaching the right audience. Whether you’re sourcing employee data, LinkedIn profile data, or recruiting data, our service ensures a seamless experience with 99% data accuracy.

    • Best Price Guarantee: We offer unbeatable pricing on LinkedIn data, and we’ll match any competitor.
    • Global Scale: Access 700 million LinkedIn profiles and 70 million company records globally.
    • AI-Verified Accuracy: Enjoy 99% data accuracy through our advanced AI and manual validation processes.
    • Real-Time Data: Profiles are updated daily, ensuring you always have the most relevant insights.
    • Tailored Solutions: Get custom-curated LinkedIn data delivered directly, without managing platforms.
    • Ethically Sourced Data: Compliant with global privacy laws, ensuring responsible data usage.
    • Comprehensive Profiles: Over 40 data points per profile, including job titles, skills, and company details.
    • Wide Industry Coverage: Covering sectors from tech to finance across regions like the US, UK, Europe, and Asia.

    Key Use Cases:

    • Sales Prospecting and Lead Generation: Build targeted lead lists using LinkedIn company data and professional profiles, helping sales teams engage decision-makers at high-value accounts.
    • Recruitment and Talent Sourcing: Use LinkedIn profile data to identify and reach top candidates globally. Our employee data includes work history, skills, and education, providing all the details you need for successful recruitment.
    • Account-Based Marketing (ABM): Use our LinkedIn company data to tailor marketing campaigns to key accounts, making your outreach efforts more personalized and effective.
    • Investment Research & Due Diligence: Identify companies with strong growth potential using LinkedIn company data. Access key data points such as funding history, employee count, and company trends to fuel investment decisions.
    • Competitor Analysis: Stay ahead of your competition by tracking hiring trends, employee movement, and company growth through LinkedIn data. Use these insights to adjust your market strategy and improve your competitive positioning.
    • CRM Data Enrichment: Enhance your CRM systems with real-time updates from Success.ai’s LinkedIn data, ensuring that your sales and marketing teams are always working with accurate and up-to-date information.
    • Comprehensive Data Points for LinkedIn Profiles: Our LinkedIn profile data includes over 40 key data points for every individual and company, ensuring a complete understandin...
  13. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • zenodo.org
    • explore.openaire.eu
    zip
    Updated Oct 20, 2022
    + more versions
    Cite
    Sofia Yfantidou; Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Stefanos Efstathiou; Athena Vakali; Athena Vakali; Joao Palotti; Joao Palotti; Dimitrios Panteleimon Giakatos; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas; Šarūnas Girdzijauskas; Christina Karagianni; Andrei Kazlouski; Elena Ferrari (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. http://doi.org/10.5281/zenodo.6832242
    Explore at:
    zip (available download formats)
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sofia Yfantidou; Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Stefanos Efstathiou; Athena Vakali; Athena Vakali; Joao Palotti; Joao Palotti; Dimitrios Panteleimon Giakatos; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas; Šarūnas Girdzijauskas; Christina Karagianni; Andrei Kazlouski; Elena Ferrari
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
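    For instance, a minimal pandas snippet (with hypothetical file and column names, since the exact CSV names depend on the bundle you download) looks like this:

    import pandas as pd

    daily = pd.read_csv("fitbit_daily.csv", parse_dates=["date"])  # hypothetical file/column names
    print(daily.head())
    print(daily.dtypes)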

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have the MongoDB Database Tools installed.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit 

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema 

    For surveys data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys 

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:

    {
      _id: 
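    Once the collections are restored, they can be queried from Python with pymongo; a minimal sketch, assuming the default local MongoDB used in the mongorestore commands above:

    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    db = client["rais_anonymized"]

    print(db.list_collection_names())      # expected: fitbit, sema, surveys
    doc = db["fitbit"].find_one()          # inspect one document's structure
    print(list(doc.keys()) if doc else "collection is empty")
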
  14. Data from: Block-GP: Scalable Gaussian Process Regression for Multimodal...

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +1 more
    application/rdfxml +5
    Updated Jun 26, 2018
    + more versions
    Cite
    (2018). Block-GP: Scalable Gaussian Process Regression for Multimodal Data [Dataset]. https://data.nasa.gov/dataset/Block-GP-Scalable-Gaussian-Process-Regression-for-/s4q6-i7fd
    Explore at:
    json, csv, xml, application/rssxml, tsv, application/rdfxml (available download formats)
    Dataset updated
    Jun 26, 2018
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Regression problems on massive data sets are ubiquitous in many application domains including the Internet, earth and space sciences, and finances. In many cases, regression algorithms such as linear regression or neural networks attempt to fit the target variable as a function of the input variables without regard to the underlying joint distribution of the variables. As a result, these global models are not sensitive to variations in the local structure of the input space. Several algorithms, including the mixture of experts model, classification and regression trees (CART), and others have been developed, motivated by the fact that a variability in the local distribution of inputs may be reflective of a significant change in the target variable. While these methods can handle the non-stationarity in the relationships to varying degrees, they are often not scalable and, therefore, not used in large scale data mining applications. In this paper we develop Block-GP, a Gaussian Process regression framework for multimodal data, that can be an order of magnitude more scalable than existing state-of-the-art nonlinear regression algorithms. The framework builds local Gaussian Processes on semantically meaningful partitions of the data and provides higher prediction accuracy than a single global model with very high confidence. The method relies on approximating the covariance matrix of the entire input space by smaller covariance matrices that can be modeled independently, and can therefore be parallelized for faster execution. Theoretical analysis and empirical studies on various synthetic and real data sets show high accuracy and scalability of Block-GP compared to existing nonlinear regression techniques.
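    To illustrate the general idea of fitting local Gaussian Processes on partitions of the input space (a toy sketch with scikit-learn, not the authors' Block-GP implementation; k-means stands in here for their semantically meaningful partitioning):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(600, 2))
    y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + 0.05 * rng.standard_normal(600)

    # Partition the input space, then fit one local GP per block (blocks can be fit in parallel).
    km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X)
    local_gps = {k: GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
                    .fit(X[km.labels_ == k], y[km.labels_ == k])
                 for k in range(6)}

    # Route each query point to its block and predict with that block's GP.
    X_new = rng.uniform(-3, 3, size=(5, 2))
    blocks = km.predict(X_new)
    y_pred = np.array([local_gps[b].predict(x.reshape(1, -1))[0] for b, x in zip(blocks, X_new)])
    print(y_pred)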

  15. Data from: A large-scale fMRI dataset for the visual processing of...

    • openneuro.org
    Updated Jul 9, 2023
    + more versions
    Cite
    Zhengxin Gong; Ming Zhou; Yuxuan Dai; Yushan Wen; Youyi Liu; Zonglei Zhen (2023). A large-scale fMRI dataset for the visual processing of naturalistic scenes [Dataset]. http://doi.org/10.18112/openneuro.ds004496.v2.1.2
    Explore at:
    Dataset updated
    Jul 9, 2023
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Zhengxin Gong; Ming Zhou; Yuxuan Dai; Yushan Wen; Youyi Liu; Zonglei Zhen
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Summary

    One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires records of brain responses under massive amounts of naturalistic stimuli. Although the scientific community has put in a lot of effort to collect large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,120 naturalistic images from 30 participants. NOD strives for a balance between sampling variation between individuals and sampling variation between stimuli. This enables NOD to be utilized not only for determining whether an observation is generalizable across many individuals, but also for testing whether a response pattern is generalized to a variety of naturalistic stimuli. We anticipate that the NOD together with existing naturalistic neuroimaging datasets will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.

    Data record

    The data were organized according to the Brain-Imaging-Data-Structure (BIDS) Specification version 1.7.0 and can be accessed from the OpenNeuro public repository (accession number: ds004496). In short, raw data of each subject were stored in “sub-

    Stimulus images The stimulus images for different fMRI experiments are deposited in separate folders: “stimuli/imagenet”, “stimuli/coco”, “stimuli/prf”, and “stimuli/floc”. Each experiment folder contains corresponding stimulus images, and the auxiliary files can be found within the “info” subfolder.

    Raw MRI data Each participant folder consists of several session folders: anat, coco, imagenet, prf, floc. Each session folder in turn includes “anat”, “func”, or “fmap” folders for corresponding modality data. The scan information for each session is provided in a TSV file.

    Preprocessed volume data from fMRIprep The preprocessed volume-based fMRI data are in subject's native space, saved as “sub-

    Preprocessed surface-based data from ciftify The preprocessed surface-based data are in standard fsLR space, saved as “sub-

    Brain activation data from surface-based GLM analyses The brain activation data are derived from GLM analyses on the standard fsLR space, saved as “sub-

  16. Scales Mound, IL Age Group Population Dataset: A Complete Breakdown of...

    • neilsberg.com
    csv, json
    Updated Feb 22, 2025
    Cite
    Neilsberg Research (2025). Scales Mound, IL Age Group Population Dataset: A Complete Breakdown of Scales Mound Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/scales-mound-il-population-by-age/
    Explore at:
    csv, json (available download formats)
    Dataset updated
    Feb 22, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Scales Mound, Illinois
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. Ages between 0 and 85 were divided into roughly five-year buckets, and ages 85 and over were aggregated into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
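
    As an illustration of the bucketing described above (this is a sketch of the grouping logic only, not Neilsberg's actual pipeline), ages 0-84 can be assigned to five-year buckets and ages 85 and over to a single group, for example with pandas:

```python
import pandas as pd

# Illustrative sketch of the age bucketing described above (not the
# provider's actual pipeline): ages 0-84 fall into 5-year buckets,
# ages 85 and over into a single group.
ages = pd.Series([3, 12, 14, 37, 62, 85, 91])

edges = list(range(0, 90, 5)) + [float("inf")]              # 0, 5, ..., 85, inf
labels = [f"{lo} to {lo + 4} years" for lo in range(0, 85, 5)] + ["85 years and over"]
labels[0] = "Under 5 years"

buckets = pd.cut(ages, bins=edges, labels=labels, right=False)
print(buckets.value_counts().sort_index())
```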
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Scales Mound population distribution across 18 age groups. It lists the population in each age group along with that group's percentage of the total population of Scales Mound. The dataset can be used to understand the population distribution of Scales Mound by age; for example, it can be used to identify the largest age group in Scales Mound.

    Key observations

    The largest age group in Scales Mound, IL was 10 to 14 years, with a population of 54 (12.24%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in Scales Mound, IL was 85 years and over, with a population of 7 (1.59%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates
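
    As a quick consistency check, these shares imply a total population of roughly 54 / 0.1224 ≈ 441 residents, and 7 / 441 ≈ 1.59%, which matches the reported percentage for the 85 years and over group.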

    Content

    When available, the data consist of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: The age group under consideration.
    • Population: The population of the specific age group in Scales Mound.
    • % of Total Population: The population of each age group as a proportion of Scales Mound's total population. Please note that the percentages may not sum to exactly 100% due to rounding.
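
    Assuming the dataset has been downloaded as a CSV with the columns listed above (the file name below is hypothetical), the largest age group noted in the key observations can be recovered with a few lines of pandas:

```python
import pandas as pd

# Minimal sketch, assuming a local CSV export with the columns listed above.
# The file name is hypothetical; the column names follow the data dictionary.
df = pd.read_csv("scales-mound-il-population-by-age.csv")

largest = df.loc[df["Population"].idxmax()]
print(f"Largest age group: {largest['Age Group']} "
      f"({largest['Population']} people, {largest['% of Total Population']}%)")
```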

    Good to know

    Margin of Error

    The data in this dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Scales Mound Population by Age. You can refer to the main dataset here.

  17. Data from: Distributed Monitoring of the R2 Statistic for Linear Regression

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +1more
    application/rdfxml +5
    Updated Jun 26, 2018
    Cite
    (2018). Distributed Monitoring of the R2 Statistic for Linear Regression [Dataset]. https://data.nasa.gov/dataset/Distributed-Monitoring-of-the-R2-Statistic-for-Lin/mcn5-47py
    Explore at:
    Available download formats: csv, json, tsv, application/rssxml, xml, application/rdfxml
    Dataset updated
    Jun 26, 2018
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes' data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo --- a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (R2 statistic). When the nodes collectively determine that R2 has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.
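
    The quantity being monitored is the ordinary coefficient of determination, R² = 1 - SS_res / SS_tot. The sketch below is only a centralized illustration of that statistic and of the threshold trigger described above; it is not the paper's DReMo algorithm, which performs the check in a distributed, communication-efficient way.

```python
import numpy as np

# Conceptual sketch only: compute the R^2 statistic that is being monitored
# and flag when it drops below a fixed threshold. The paper's DReMo method
# is distributed and communication-efficient; this centralized check simply
# illustrates the quantity and the trigger condition.
def r_squared(X, y, beta):
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
beta_true = np.array([1.5, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.5, size=500)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

THRESHOLD = 0.8
r2 = r_squared(X, y, beta_hat)
if r2 < THRESHOLD:
    print(f"R^2 = {r2:.3f} < {THRESHOLD}: recompute and rebroadcast the model")
else:
    print(f"R^2 = {r2:.3f}: current model still acceptable")
```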

  18. US Employee Data | Accurate Contact Information, Job Experience, LinkedIn...

    • datarade.ai
    .json, .csv, .xls
    Updated Aug 22, 2023
    + more versions
    Cite
    Salutary Data (2023). US Employee Data | Accurate Contact Information, Job Experience, LinkedIn URLs + More | Recruiting / HR [Dataset]. https://datarade.ai/data-products/salutary-data-us-employee-data-accurate-contact-informati-salutary-data
    Explore at:
    Available download formats: .json, .csv, .xls
    Dataset updated
    Aug 22, 2023
    Dataset authored and provided by
    Salutary Data
    Area covered
    United States of America
    Description

    Salutary Data is a boutique B2B contact and company data provider committed to delivering high-quality data for sales intelligence, lead generation, marketing, recruiting, employee data / HR, identity resolution, and ML / AI. Our database currently consists of 148MM+ highly curated B2B contacts (US only), along with 4M+ companies, and is updated regularly to ensure we have the most up-to-date information.

    We can enrich your in-house data (CRM enrichment, lead enrichment, etc.) and provide you with a custom dataset (such as a lead list) tailored to your target audience specifications and data use case. We also support large-scale data licensing to software providers and agencies that intend to redistribute our data to their customers and end users.

    What makes Salutary unique?
    • We offer our clients a truly unique, one-stop aggregation of best-of-breed quality data sources. Our supplier network consists of numerous established, high-quality suppliers that are rigorously vetted.
    • We leverage third-party verification vendors to ensure phone numbers and emails are accurate and connect to the right person. Additionally, we deploy automated and manual verification techniques to ensure we have the latest job information for contacts.
    • We're reasonably priced and easy to work with.

    Products:
    • API Suite
    • Web UI
    • Full and Custom Data Feeds

    Services:
    • Data Enrichment: We assess the fill-rate gaps and profile your customer file for the purpose of appending fields, updating information, and/or rendering net new “look alike” prospects for your campaigns.
    • ABM Match & Append: Send us your domain or other company-related files, and we'll match your Account Based Marketing targets and provide you with B2B contacts to campaign. Optionally include your suppression file to avoid any redundant records.
    • Verification (“Cleaning/Hygiene”) Services: Address the 2% per month aging issue on contact records. We will identify duplicate records and contacts no longer at the company, remove your email hard bounces, and update or replace titles and phones. This is right up our alley and leverages our existing internal and external processes and systems.

  19. Replication Data for: Scale length does matter: Recommendations for...

    • b2find.dkrz.de
    Updated Oct 23, 2023
    + more versions
    Cite
    (2023). Replication Data for: Scale length does matter: Recommendations for Measurement Invariance Testing with Categorical Factor Analysis and Item Response Theory Approaches - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/56e1cc99-15be-507c-b825-302c5a8db41f
    Explore at:
    Dataset updated
    Oct 23, 2023
    Description

    These files contain the code and data used to evaluate the impact of MG-CCFA- and MG-IRT-based hypotheses and testing strategies on the power to detect violations of measurement invariance (MI). To this end, we performed two simulation studies. The first study simulated an invariance scenario in which parameters were invariant between groups; the second simulated a non-invariance scenario in which model parameters varied between groups. See the file "Data Report - Replication Data for - Scale length does matter - Recommendations for Measurement Invariance Testing.pdf" for, among other things, more information about the files and how they relate.
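
    As a rough illustration of the two scenarios (this is not the replication code itself, which uses MG-CCFA and MG-IRT models), group data can be simulated from a one-factor model whose loadings either match across groups or differ for one item:

```python
import numpy as np

# Conceptual sketch (not the replication code): generate ordinal item
# responses for two groups under a one-factor model, with loadings that are
# equal across groups (invariance) or differ for one item (non-invariance).
rng = np.random.default_rng(1)

def simulate_group(n, loadings, thresholds=(-0.5, 0.5)):
    eta = rng.normal(size=n)                            # latent factor scores
    latent = np.outer(eta, loadings) + rng.normal(size=(n, len(loadings)))
    return np.digitize(latent, bins=thresholds)         # 3-category items

loadings_ref = np.array([0.8, 0.7, 0.6, 0.7])
loadings_noninv = loadings_ref.copy()
loadings_noninv[0] = 0.3                                # violate MI on item 1

group_1 = simulate_group(300, loadings_ref)             # invariance scenario
group_2 = simulate_group(300, loadings_noninv)          # non-invariance scenario
print(group_1.shape, group_2.shape)
```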

  20. Large Scale International Boundaries (LSIB)

    • data.amerigeoss.org
    shp
    Updated Jan 17, 2024
    Cite
    UN Humanitarian Data Exchange (2024). Large Scale International Boundaries (LSIB) [Dataset]. https://data.amerigeoss.org/dataset/large-scale-international-boundaries-lsib
    Explore at:
    Available download formats: shp (46321649)
    Dataset updated
    Jan 17, 2024
    Dataset provided by
    United Nations (http://un.org/)
    Description

    Large Scale International Boundaries

    Version 11.1 Release Date: August 22, 2022

    Overview

    The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. These data and their derivatives are the only international boundary lines approved for U.S. Government use. They reflect U.S. Government policy, and not necessarily de facto limits of control. This dataset is a National Geospatial Data Asset.

    Details

    Sources for these data include treaties, relevant maps, and data from boundary commissions and national mapping agencies. Where available, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery of the data involves analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground.

    Attributes

    The dataset uses the following attributes:

    • Country Code: Country-level codes are from the Geopolitical Entities, Names, and Codes Standard (GENC). The Q2 code denotes a line representing a boundary associated with an area not in GENC.
    • Country Names: Names approved by the U.S. Board on Geographic Names (BGN). Names for lines associated with a Q2 code are descriptive and are not necessarily BGN-approved.
    • Label: Required text label for the line segment where scale permits.
    • Rank/Status: Rank 1: International Boundary; Rank 2: Other Line of International Separation; Rank 3: Special Line.
    • Notes: Explanation of any applicable special circumstances.

    Cartographic Usage: Depiction of the LSIB requires a visual differentiation between the three categories of boundaries: International Boundaries (Rank 1), Other Lines of International Separation (Rank 2), and Special Lines (Rank 3). Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the “Label” field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Additional cartographic information can be found in Guidance Bulletins (https://hiu.state.gov/data/cartographic_guidance_bulletins/) published by the Office of the Geographer and Global Issues. Please direct inquiries to internationalboundaries@state.gov.
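
    Since the data are distributed as a shapefile, they can be loaded with standard GIS tooling. The sketch below assumes a local copy of the shapefile and guesses at a field name such as "RANK"; both the file name and the exact attribute field names are assumptions and should be checked against the file's own schema before use.

```python
import geopandas as gpd

# Minimal sketch, assuming the LSIB shapefile has been downloaded locally.
# The file name and the attribute field names (e.g. "RANK") are assumptions;
# inspect lsib.columns to confirm the actual schema.
lsib = gpd.read_file("LSIB.shp")

# Rank 1 = International Boundary, per the attribute list above.
# (Depending on the export, the rank may be stored as text, e.g. "1".)
rank1 = lsib[lsib["RANK"].astype(str) == "1"]
print(f"{len(rank1)} of {len(lsib)} line segments are Rank 1 international boundaries")
```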

    Credits

    The lines in the LSIB dataset are the product of decades of collaboration between geographers at the Department of State and the National Geospatial-Intelligence Agency with contributions from the Central Intelligence Agency and the UK Defence Geographic Centre. Attribution is welcome: U.S. Department of State, Office of the Geographer and Global Issues.

    Changes from Prior Release

    This version of the LSIB contains changes and accuracy refinements for the following line segments. These changes reflect improvements in spatial accuracy derived from newly available source materials, an ongoing review process, or the publication of new treaties or agreements. Changes to lines include:

    • Akrotiri (UK) / Cyprus
    • Albania / Montenegro
    • Albania / Greece
    • Albania / North Macedonia
    • Armenia / Turkey
    • Austria / Czechia
    • Austria / Slovakia
    • Austria / Hungary
    • Austria / Slovenia
    • Austria / Germany
    • Austria / Italy
    • Austria / Switzerland
    • Azerbaijan / Turkey
    • Azerbaijan / Iran
    • Belarus / Latvia
    • Belarus / Russia
    • Belarus / Ukraine
    • Belarus / Poland
    • Bhutan / India
    • Bhutan / China
    • Bulgaria / Turkey
    • Bulgaria / Romania
    • Bulgaria / Serbia
    • Bulgaria / Romania
    • China / Tajikistan
    • China / India
    • Croatia / Slovenia
    • Croatia / Hungary
    • Croatia / Serbia
    • Croatia / Montenegro
    • Czechia / Slovakia
    • Czechia / Poland
    • Czechia / Germany
    • Finland / Russia
    • Finland / Norway
    • Finland / Sweden
    • France / Italy
    • Georgia / Turkey
    • Germany / Poland
    • Germany / Switzerland
    • Greece / North Macedonia
    • Guyana / Suriname
    • Hungary / Slovenia
    • Hungary / Serbia
    • Hungary / Romania
    • Hungary / Ukraine
    • Iran / Turkey
    • Iraq / Turkey
    • Italy / Slovenia
    • Italy / Switzerland
    • Italy / Vatican City
    • Italy / San Marino
    • Kazakhstan / Russia
    • Kazakhstan / Uzbekistan
    • Kosovo / North Macedonia
    • Kosovo / Serbia
    • Kyrgyzstan / Tajikistan
    • Kyrgyzstan / Uzbekistan
    • Latvia / Russia
    • Latvia / Lithuania
    • Lithuania / Poland
    • Lithuania / Russia
    • Moldova / Ukraine
    • Moldova / Romania
    • Norway / Russia
    • Norway / Sweden
    • Poland / Russia
    • Poland / Ukraine
    • Poland / Slovakia
    • Romania / Ukraine
    • Romania / Serbia
    • Russia / Ukraine
    • Syria / Turkey
    • Tajikistan / Uzbekistan

    This release also contains topology fixes, land boundary terminus refinements, and tripoint adjustments.

    Copyright Notice and Disclaimer

    While U.S. Government works prepared by employees of the U.S. Government as part of their official duties are not subject to Federal copyright protection (see 17 U.S.C. § 105), copyrighted material incorporated in U.S. Government works retains its copyright protection. The works on or made available through download from the U.S. Department of State’s website may not be used in any manner that infringes any intellectual property rights or other proprietary rights held by any third party. Use of any copyrighted material beyond what is allowed by fair use or other exemptions may require appropriate permission from the relevant rightsholder. With respect to works on or made available through download from the U.S. Department of State’s website, neither the U.S. Government nor any of its agencies, employees, agents, or contractors make any representations or warranties—express, implied, or statutory—as to the validity, accuracy, completeness, or fitness for a particular purpose; nor represent that use of such works would not infringe privately owned rights; nor assume any liability resulting from use of such works; and shall in no way be liable for any costs, expenses, claims, or demands arising out of use of such works.
