81 datasets found
  1. TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR -...

    • datarade.ai
    .json, .csv, .xls
    Updated Sep 16, 2024
    Cite
    TagX (2024). TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR - CCPA Compliant [Dataset]. https://datarade.ai/data-products/tagx-web-browsing-clickstream-data-300k-users-north-america-tagx
    Available download formats: .json, .csv, .xls
    Dataset updated
    Sep 16, 2024
    Dataset authored and provided by
    TagX
    Area covered
    United States
    Description

    TagX Web Browsing Clickstream Data: Unveiling Digital Behavior Across North America and EU

    Unique Insights into Online User Behavior

    TagX Web Browsing clickstream Data offers an unparalleled window into the digital lives of 1 million users across North America and the European Union. This comprehensive dataset stands out in the market due to its breadth, depth, and stringent compliance with data protection regulations.

    What Makes Our Data Unique?

    Extensive Geographic Coverage: Spanning two major markets, our data provides a holistic view of web browsing patterns in developed economies.
    Large User Base: With 300K active users, our dataset offers statistically significant insights across various demographics and user segments.
    GDPR and CCPA Compliance: We prioritize user privacy and data protection, ensuring that our data collection and processing methods adhere to the strictest regulatory standards.
    Real-time Updates: Our clickstream data is continuously refreshed, providing up-to-the-minute insights into evolving online trends and user behaviors.
    Granular Data Points: We capture a wide array of metrics, including time spent on websites, click patterns, search queries, and user journey flows.

    Data Sourcing: Ethical and Transparent

    Our web browsing clickstream data is sourced through a network of partnered websites and applications. Users explicitly opt-in to data collection, ensuring transparency and consent. We employ advanced anonymization techniques to protect individual privacy while maintaining the integrity and value of the aggregated data. Key aspects of our data sourcing process include:

    Voluntary user participation through clear opt-in mechanisms
    Regular audits of data collection methods to ensure ongoing compliance
    Collaboration with privacy experts to implement best practices in data anonymization
    Continuous monitoring of regulatory landscapes to adapt our processes as needed

    Primary Use Cases and Verticals

    TagX Web Browsing clickstream Data serves a multitude of industries and use cases, including but not limited to:

    Digital Marketing and Advertising:

    Audience segmentation and targeting
    Campaign performance optimization
    Competitor analysis and benchmarking

    E-commerce and Retail:

    Customer journey mapping
    Product recommendation enhancements
    Cart abandonment analysis

    Media and Entertainment:

    Content consumption trends
    Audience engagement metrics
    Cross-platform user behavior analysis

    Financial Services:

    Risk assessment based on online behavior
    Fraud detection through anomaly identification
    Investment trend analysis

    Technology and Software:

    User experience optimization
    Feature adoption tracking
    Competitive intelligence

    Market Research and Consulting:

    Consumer behavior studies
    Industry trend analysis
    Digital transformation strategies

    Integration with Broader Data Offering

    TagX Web Browsing clickstream Data is a cornerstone of our comprehensive digital intelligence suite. It seamlessly integrates with our other data products to provide a 360-degree view of online user behavior:

    Social Media Engagement Data: Combine clickstream insights with social media interactions for a holistic understanding of digital footprints.
    Mobile App Usage Data: Cross-reference web browsing patterns with mobile app usage to map the complete digital journey.
    Purchase Intent Signals: Enrich clickstream data with purchase intent indicators to power predictive analytics and targeted marketing efforts.
    Demographic Overlays: Enhance web browsing data with demographic information for more precise audience segmentation and targeting.

    By leveraging these complementary datasets, businesses can unlock deeper insights and drive more impactful strategies across their digital initiatives.

    Data Quality and Scale

    We pride ourselves on delivering high-quality, reliable data at scale:

    Rigorous Data Cleaning: Advanced algorithms filter out bot traffic, VPNs, and other non-human interactions.
    Regular Quality Checks: Our data science team conducts ongoing audits to ensure data accuracy and consistency.
    Scalable Infrastructure: Our robust data processing pipeline can handle billions of daily events, ensuring comprehensive coverage.
    Historical Data Availability: Access up to 24 months of historical data for trend analysis and longitudinal studies.
    Customizable Data Feeds: Tailor the data delivery to your specific needs, from raw clickstream events to aggregated insights.

    Empowering Data-Driven Decision Making

    In today's digital-first world, understanding online user behavior is crucial for businesses across all sectors. TagX Web Browsing clickstream Data empowers organizations to make informed decisions, optimize their digital strategies, and stay ahead of the competition. Whether you're a marketer looking to refine your targeting, a product manager seeking to enhance user experience, or a researcher exploring digital trends, our cli...

  2. Comprehensive Dataset Collection of Related Tables from a Renowned Swedish...

    • store.smartdatahub.io
    Updated Aug 26, 2024
    + more versions
    Cite
    (2024). Comprehensive Dataset Collection of Related Tables from a Renowned Swedish Website [Dataset]. https://store.smartdatahub.io/dataset/se_lantmateriet_vasterbotten_zip (This service has been deprecated; please visit https://www.smartdatahub.io/ to access data.)
    Dataset updated
    Aug 26, 2024
    Area covered
    Sweden
    Description

    The data collection encompasses a variety of tables sourced from the Swedish website 'Lantmäteriet' (National Land Survey). Each table is a unique assemblage of related data, offering a comprehensive overview of various aspects. Given the diverse range of tables available, the collection is able to provide a thorough examination of its subject. The dataset collection is meticulously organized into rows and columns, making it easier for the user to navigate and understand the data.

  3. Third-party trackers captured with PrivacyScore

    • zenodo.org
    Updated Jan 23, 2025
    Cite
    Ana Pop Stefanija (2025). Third-party trackers captured with PrivacyScore [Dataset]. http://doi.org/10.5281/zenodo.14719661
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ana Pop Stefanija
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dataset of trackers as captured by PrivacyScore (privacyscore.org) on February 19th, 2019. Data collection with PrivacyScore was done one time only, as the presence of trackers is tied to the website, not the research subject. The data collection reflects the state of each website on that particular date. The data showed that the number of third-party embeds (third parties that provide services to the first party) is 575 for only ten websites, set by 328 unique companies, and the number of third-party calls is 172. The queried websites were: Wired, The Guardian, Ars Technica, EuraktivJobs, Forumotion, Motherboard (VICE), Politico EU. The trackers found were triangulated with data from Better.fyi and WhoTracksMe in order to detect the purpose of the tracking and the tracking type. Visual analysis is provided in the published paper (see details below).

  4. Global Data Scraping Tools Market Research Report: By Deployment Mode...

    • wiseguyreports.com
    Updated Jul 23, 2024
    Cite
    Wiseguy Research Consultants Pvt Ltd (2024). Global Data Scraping Tools Market Research Report: By Deployment Mode (Cloud, Web, On-Premises), By Data Source (Websites, Social Media, E-commerce Platforms, Databases, Flat Files), By Extraction Type (Structured Data, Semi-Structured Data, Unstructured Data), By Cloud Type (SaaS, PaaS, IaaS), By Application (Market Research, Price Monitoring, Lead Generation, Sentiment Analysis, Data Integration) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/data-scraping-tools-market
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Jan 7, 2024
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 3.24 (USD Billion)
    MARKET SIZE 2024: 3.73 (USD Billion)
    MARKET SIZE 2032: 11.46 (USD Billion)
    SEGMENTS COVERED: Deployment Mode, Data Source, Extraction Type, Cloud Type, Application, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: (1) AI-powered data extraction; (2) Growing demand for structured data; (3) Cloud-based data scraping services; (4) Real-time web data extraction; (5) Increased use of web scraping for business intelligence
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: Dexi.io, Cheerio, ScrapingBee, Import.io, Scrapinghub, 80legs, Bright Data, Mozenda, Phantombuster, Helium Scraper, ScraperAPI, Octoparse, Apify, ParseHub, Diffbot
    MARKET FORECAST PERIOD: 2024 - 2032
    KEY MARKET OPPORTUNITIES: Automation for efficient data collection; Real-time data extraction for enhanced decision-making; Cloud-based tools for scalability and flexibility; AI-powered tools for advanced data analysis; Increased demand for web scraping in various industries
    COMPOUND ANNUAL GROWTH RATE (CAGR): 15.06% (2024 - 2032)
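
    The stated growth rate can be cross-checked against the market-size figures above. The following is a minimal Python sketch, assuming the CAGR is computed from the 2024 and 2032 market sizes over the eight-year forecast period; the function name `cagr` is illustrative and not part of the report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures taken from the listing (USD billion).
market_2024 = 3.73
market_2032 = 11.46

rate = cagr(market_2024, market_2032, years=2032 - 2024)
print(f"CAGR 2024-2032: {rate:.2%}")
```

    Under that assumption the result agrees with the listed 15.06% to two decimal places.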
  5. Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data...

    • datarade.ai
    .json, .csv, .xls
    Updated Sep 7, 2024
    Cite
    Altosight (2024). Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data Points | Bypassing All CAPTCHAs & Blocking Mechanisms | GDPR Compliant [Dataset]. https://datarade.ai/data-products/altosight-ai-custom-web-scraping-data-100-global-free-altosight
    Available download formats: .json, .csv, .xls
    Dataset updated
    Sep 7, 2024
    Dataset authored and provided by
    Altosight
    Area covered
    Wallis and Futuna, Tajikistan, Chile, Svalbard and Jan Mayen, Paraguay, Guatemala, Czech Republic, Singapore, Côte d'Ivoire, Greenland
    Description

    Altosight | AI Custom Web Scraping Data

    ✦ Altosight provides global web scraping data services with AI-powered technology that bypasses CAPTCHAs, blocking mechanisms, and handles dynamic content.

    We extract data from marketplaces like Amazon, aggregators, e-commerce, and real estate websites, ensuring comprehensive and accurate results.

    ✦ Our solution offers free unlimited data points across any project, with no additional setup costs.

    We deliver data through flexible methods such as API, CSV, JSON, and FTP, all at no extra charge.

    ― Key Use Cases ―

    ➤ Price Monitoring & Repricing Solutions

    🔹 Automatic repricing, AI-driven repricing, and custom repricing rules

    🔹 Receive price suggestions via API or CSV to stay competitive

    🔹 Track competitors in real-time or at scheduled intervals

    ➤ E-commerce Optimization

    🔹 Extract product prices, reviews, ratings, images, and trends

    🔹 Identify trending products and enhance your e-commerce strategy

    🔹 Build dropshipping tools or marketplace optimization platforms with our data

    ➤ Product Assortment Analysis

    🔹 Extract the entire product catalog from competitor websites

    🔹 Analyze product assortment to refine your own offerings and identify gaps

    🔹 Understand competitor strategies and optimize your product lineup

    ➤ Marketplaces & Aggregators

    🔹 Crawl entire product categories and track best-sellers

    🔹 Monitor position changes across categories

    🔹 Identify which eRetailers sell specific brands and which SKUs for better market analysis

    ➤ Business Website Data

    🔹 Extract detailed company profiles, including financial statements, key personnel, industry reports, and market trends, enabling in-depth competitor and market analysis

    🔹 Collect customer reviews and ratings from business websites to analyze brand sentiment and product performance, helping businesses refine their strategies

    ➤ Domain Name Data

    🔹 Access comprehensive data, including domain registration details, ownership information, expiration dates, and contact information. Ideal for market research, brand monitoring, lead generation, and cybersecurity efforts

    ➤ Real Estate Data

    🔹 Access property listings, prices, and availability

    🔹 Analyze trends and opportunities for investment or sales strategies

    ― Data Collection & Quality ―

    ► Publicly Sourced Data: Altosight collects web scraping data from publicly available websites, online platforms, and industry-specific aggregators

    ► AI-Powered Scraping: Our technology handles dynamic content, JavaScript-heavy sites, and pagination, ensuring complete data extraction

    ► High Data Quality: We clean and structure unstructured data, ensuring it is reliable, accurate, and delivered in formats such as API, CSV, JSON, and more

    ► Industry Coverage: We serve industries including e-commerce, real estate, travel, finance, and more. Our solution supports use cases like market research, competitive analysis, and business intelligence

    ► Bulk Data Extraction: We support large-scale data extraction from multiple websites, allowing you to gather millions of data points across industries in a single project

    ► Scalable Infrastructure: Our platform is built to scale with your needs, allowing seamless extraction for projects of any size, from small pilot projects to ongoing, large-scale data extraction

    ― Why Choose Altosight? ―

    ✔ Unlimited Data Points: Altosight offers unlimited free attributes, meaning you can extract as many data points from a page as you need without extra charges

    ✔ Proprietary Anti-Blocking Technology: Altosight utilizes proprietary techniques to bypass blocking mechanisms, including CAPTCHAs, Cloudflare, and other obstacles. This ensures uninterrupted access to data, no matter how complex the target websites are

    ✔ Flexible Across Industries: Our crawlers easily adapt across industries, including e-commerce, real estate, finance, and more. We offer customized data solutions tailored to specific needs

    ✔ GDPR & CCPA Compliance: Your data is handled securely and ethically, ensuring compliance with GDPR, CCPA and other regulations

    ✔ No Setup or Infrastructure Costs: Start scraping without worrying about additional costs. We provide a hassle-free experience with fast project deployment

    ✔ Free Data Delivery Methods: Receive your data via API, CSV, JSON, or FTP at no extra charge. We ensure seamless integration with your systems

    ✔ Fast Support: Our team is always available via phone and email, resolving over 90% of support tickets within the same day

    ― Custom Projects & Real-Time Data ―

    ✦ Tailored Solutions: Every business has unique needs, which is why Altosight offers custom data projects. Contact us for a feasibility analysis, and we’ll design a solution that fits your goals

    ✦ Real-Time Data: Whether you need real-time data delivery or scheduled updates, we provide the flexibility to receive data when you need it. Track price changes, monitor product trends, or gather...

  6. Fish data collected during 2015 and 2016 at 9 sites at the Ten Thousand...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Jul 6, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Fish data collected during 2015 and 2016 at 9 sites at the Ten Thousand Islands National Wildlife Refuge, Florida [Dataset]. https://catalog.data.gov/dataset/fish-data-collected-during-2015-and-2016-at-9-sites-at-the-ten-thousand-islands-national-w
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Florida, Ten Thousand Islands National Wildlife Refuge
    Description

    Field sampling occurred at locations within Ten Thousand Islands National Wildlife Refuge on three transects along the natural salinity gradient of increasing salinity to the coastal south. We used three replicates per tier (east-west) for a total of nine sampling sites. Sites were approximately 1300 m apart in all directions. Sampling events occurred every 3–4 weeks from January to May for 7 sampling events in 2015 and 5 in 2016. These dates were selected to capture signals of the natural variation in water levels and salinity that occur during the transition from the wet season to the dry season. Fish traps were deployed at each of the nine sites and then retrieved the following day, allowing 24 hours soak time. Two minnow traps (Gee Minnow Trap; 22.9 x 44.5 cm, 0.6 cm mesh, 1.9 cm diameter opening) and two Breder traps (15 cm x 15 cm x 30 cm, 12 mm opening width, 15 cm opening height) were placed at each sampling site because the two trap types have the potential to catch different fish species.

  7. Network Traffic Analysis: Data and Code

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 12, 2024
    Cite
    Honig, Joshua (2024). Network Traffic Analysis: Data and Code [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11479410
    Dataset updated
    Jun 12, 2024
    Dataset provided by
    Ferrell, Nathan
    Soni, Shreena
    Honig, Joshua
    Moran, Madeline
    Homan, Sophia
    Chan-Tin, Eric
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Code:

    Packet_Features_Generator.py & Features.py

    To run this code:

    pkt_features.py [-h] -i TXTFILE [-x X] [-y Y] [-z Z] [-ml] [-s S] -j

    -h, --help    show this help message and exit
    -i TXTFILE    input text file
    -x X          Add first X number of total packets as features.
    -y Y          Add first Y number of negative packets as features.
    -z Z          Add first Z number of positive packets as features.
    -ml           Output to text file all websites in the format of websiteNumber1,feature1,feature2,...
    -s S          Generate samples using size s.
    -j

    Purpose:

    Turns a text file containing lists of incoming and outgoing network packet sizes into separate website objects with associated features.

    Uses Features.py to calculate the features.
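
    The -x, -y, and -z options described above can be illustrated with a short Python sketch. This is not the code from Features.py: the function names `first_n` and `packet_features` are hypothetical, and zero-padding short captures to a fixed width is an assumption made here so that every feature row has the same length.

```python
from typing import List


def first_n(values: List[int], n: int) -> List[int]:
    """Return the first n values, zero-padded so every row has the same width."""
    head = values[:n]
    return head + [0] * (n - len(head))


def packet_features(packets: List[int], x: int = 0, y: int = 0, z: int = 0) -> List[int]:
    """Concatenate the first x packets overall, first y negative, and first z positive."""
    negatives = [p for p in packets if p < 0]
    positives = [p for p in packets if p > 0]
    return first_n(packets, x) + first_n(negatives, y) + first_n(positives, z)


# Made-up signed packet sizes for one capture.
sample = [512, -1460, -1460, 64, -320]
features = packet_features(sample, x=3, y=2, z=3)
print(features)
```

    Here the last feature is a padding zero because the sample contains only two positive packets.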

    startMachineLearning.sh & machineLearning.py

    To run this code:

    bash startMachineLearning.sh

    This code then runs machineLearning.py in a tmux session with the necessary file paths and flags

    Options (to be edited within this file):

    --evaluate-only to test 5 fold cross validation accuracy

    --test-scaling-normalization to test 6 different combinations of scalers and normalizers

    Note: once the best combination is determined, it should be added to the data_preprocessing function in machineLearning.py for future use

    --grid-search to test the best grid search hyperparameters - note: the possible hyperparameters must be added to train_model under 'if not evaluateOnly:' - once best hyperparameters are determined, add them to train_model under 'if evaluateOnly:'

    Purpose:

    Using the .ml file generated by Packet_Features_Generator.py & Features.py, this program trains a RandomForest Classifier on the provided data and provides results using cross validation. These results include the best scaling and normalization options for each data set as well as the best grid search hyperparameters based on the provided ranges.
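
    The 5-fold cross validation mentioned above can be sketched in plain Python. This only shows how samples might be split into folds; it does not train a RandomForest classifier, and the function name `k_fold_indices`, the shuffling seed, and the sample count are assumptions for illustration rather than details of machineLearning.py.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (train, test) index pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(n_samples=23, k=5)
for train, test in splits:
    print(len(train), len(test))
```

    Each sample lands in exactly one test fold, so averaging the per-fold accuracy uses every sample for testing once.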

    Data

    Encrypted network traffic was collected on an isolated computer visiting different Wikipedia and New York Times articles, different Google search queries (collected in the form of their autocomplete results and their results page), and different actions taken on a Virtual Reality headset.

    Data for this experiment was stored and analyzed in the form of a txt file for each experiment which contains:

    The first number on each line is a classification number denoting which website, query, or VR action is taking place.

    The remaining numbers in each line denote:

    The size of a packet,

    and the direction it is traveling.

    negative numbers denote incoming packets

    positive numbers denote outgoing packets
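
    The line format described above can be read with a few lines of Python. This is a sketch under stated assumptions: the field separator is not specified in the description, so the hypothetical `parse_capture_line` accepts whitespace or commas, and the example line is made up.

```python
from typing import List, Tuple


def parse_capture_line(line: str) -> Tuple[int, List[int], List[int]]:
    """Split one line into (class label, incoming sizes, outgoing sizes)."""
    # Assumption: fields are separated by whitespace and/or commas.
    fields = [int(token) for token in line.replace(",", " ").split()]
    label, packets = fields[0], fields[1:]
    incoming = [-p for p in packets if p < 0]  # negative numbers are incoming packets
    outgoing = [p for p in packets if p > 0]   # positive numbers are outgoing packets
    return label, incoming, outgoing


label, incoming, outgoing = parse_capture_line("7 583 -1460 -1460 120 -52")
print(label, incoming, outgoing)
```

    The sign is stripped from incoming sizes here so that both lists hold plain byte counts; keeping the signed values would be an equally reasonable choice.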

    Figure 4 Data

    This data uses specific lines from the Virtual Reality.txt file.

    The action 'LongText Search' refers to a user searching for "Saint Basils Cathedral" with text in the Wander app.

    The action 'ShortText Search' refers to a user searching for "Mexico" with text in the Wander app.

    The .xlsx and .csv files are identical

    Each file includes (from right to left):

    The original packet data,

    each line of data organized from smallest to largest packet size in order to calculate the mean and standard deviation of each packet capture,

    and the final Cumulative Distribution Function (CDF) calculation that generated the Figure 4 Graph.
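
    The sort, summary-statistics, and CDF steps listed above can be sketched as follows. This is an illustration with made-up packet sizes, not the spreadsheet's actual calculation; the function name `empirical_cdf` is hypothetical, and using the sample (rather than population) standard deviation is an assumption.

```python
import bisect
import statistics
from typing import List, Tuple


def empirical_cdf(sizes: List[int]) -> List[Tuple[int, float]]:
    """Return (size, fraction of packets <= size) pairs in ascending size order."""
    ordered = sorted(sizes)
    n = len(ordered)
    # bisect_right counts how many packets are less than or equal to each size.
    return [(size, bisect.bisect_right(ordered, size) / n) for size in sorted(set(ordered))]


capture = [1460, 52, 583, 120, 1460]  # made-up packet sizes for one capture
mean = statistics.mean(capture)
std = statistics.stdev(capture)  # sample standard deviation (assumption)
cdf = empirical_cdf(capture)

print(mean, round(std, 1))
for size, fraction in cdf:
    print(size, fraction)
```

    Plotting `fraction` against `size` gives a step curve of the kind a CDF figure would show.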

  8. Data for: The Pandemic Journaling Project, Phase One (PJP-1)

    • data.qdr.syr.edu
    3gp +22
    Updated Feb 15, 2024
    Cite
    Sarah S. Willen; Katherine A. Mason (2024). Data for: The Pandemic Journaling Project, Phase One (PJP-1) [Dataset]. http://doi.org/10.5064/F6PXS9ZK
    Available download formats: jpeg, png, gif, webp, mp4, mp4a, mpga, wav, audio/vnd.dlna.adts, qt, 3gp, webm, ogv, wmv, html, pdf, docx, xlsx, bin, ...
mp4a(912224), 3gp(5920182), jpeg(1714504), jpeg(2280388), mpga(4640203), jpeg(3332571), mp4a(1269110), jpeg(1788844), mp4a(4350631), mp4a(1496135), bin(1772535), mpga(371534), jpeg(4221720), mp4a(1486515), mp4a(3758180), jpeg(3413660), jpeg(3451347), mp4(6993330), bin(152038), jpeg(3535829), jpeg(3234324), tiff(-1), jpeg(2251269), jpeg(2600986), bin(1606725), bin(1615540), jpeg(629961), mp4a(1364069), jpeg(849628), jpeg(2384630), jpeg(854035), jpeg(1059910), mp4a(432261), jpeg(6803436), qt(2010499), mp4a(1222788), png(252350), mp4a(561403), mp4a(1301355), jpeg(78430), jpeg(153294), jpeg(3111015), jpeg(3506560), mp4a(1614765), mp4a(4359255), mp4a(1609908), jpeg(3129756), jpeg(1440858), jpeg(24096), mpga(6606764), mp4a(219517), wav(16120364), mp4a(1071439), jpeg(3293381), jpeg(112899), jpeg(2875869), jpeg(4948125), mp4a(1615299), png(3496115), mp4a(1986411), png(586680), jpeg(1897709), jpeg(2273020), jpeg(4022260), jpeg(377213), mp4a(1702687), html(4191543), jpeg(1398077), jpeg(2079488), jpeg(31946), jpeg(1243971), jpeg(2389859), qt(574596), mp4a(532776), jpeg(2730221), mp4a(510562), jpeg(2968414), mp4a(2145487), jpeg(496123), jpeg(4274950), png(548620), jpeg(2124741), png(5709270), jpeg(5322032), mp4a(304846), jpeg(2969836), jpeg(5084546), jpeg(173417), mpga(2814171), pdf(308146), png(7879), png(2155793), jpeg(1568444), jpeg(107669), jpeg(3844552), jpeg(5050854), mp4(59931145), jpeg(26777), bin(3681626), mp4a(1124596), txt(186920), jpeg(520311), bin(416102), mp4a(7284061), jpeg(40281), jpeg(657555), png(1437413), jpeg(2534845), jpeg(445866), jpeg(1237900), jpeg(4250838), bin(156966), tsv(733), qt(3177780), bin(864966), jpeg(11690), mp4a(3045602), mp4a(2449349), bin(748148), jpeg(1825738), jpeg(1990482), mpga(1190436), mp4a(5845364), mp4a(1448064), jpeg(3171202), bin(2501650), jpeg(2273265), mp4a(619603), jpeg(951877), jpeg(63914), mp4a(1271334), jpeg(1976245), mpga(4817983), jpeg(331201), jpeg(129869), jpeg(7445743), jpeg(5717518), jpeg(2968114), mp4a(693312), 
mp4a(264471), jpeg(5399866), jpeg(71431), jpeg(1519243), jpeg(1593696), mp4(4106014), mp4a(705329), mp4a(1148157), jpeg(6046515), mp4a(916096), jpeg(333207), jpeg(3138702), jpeg(417572), mpga(5269701), jpeg(145637), mp4a(802505), png(1017305), jpeg(17907), jpeg(3598845), jpeg(1155643), jpeg(2638302), mp4a(822545), bin(1493618), bin(906790), jpeg(154930), jpeg(953837), zip(11659935), mp4a(1214837), mp4a(1016151), mp4a(3515351), mp4a(3839771), mp4a(1256085), jpeg(4031381), mpga(3309399), jpeg(290224), png(459262), jpeg(48326), jpeg(4736590), jpeg(1964763), jpeg(2042850), jpeg(14911972), jpeg(981139), mp4(8726495), jpeg(455010), mp4a(2202351), jpeg(72668), mpga(970535), jpeg(12825578), mp4a(1931894), jpeg(1726579), jpeg(3996799), jpeg(2413680), jpeg(2299059), png(1038072), mp4a(1467032), jpeg(732955), jpeg(145129), jpeg(4057705), jpeg(1575841), mpga(4266613), jpeg(3444896), mp4a(1095447), jpeg(2423812), 3gp(11381321), png(477408), mp4a(1358807), pdf(155079), jpeg(822164), mp4a(3978276), png(316363), jpeg(3336796), bin(1495558), jpeg(874390), jpeg(278529), jpeg(942247), pdf(129862), jpeg(4954268), jpeg(2572775), jpeg(3062482), qt(89399945), jpeg(2128499), jpeg(2849921), png(1019045), mp4a(3170368), mpga(4747435), jpeg(1371393), jpeg(3550211), mp4a(942819), jpeg(2313418), jpeg(4887470), jpeg(91125), mp4a(2439271), jpeg(2764753), mp4a(3002959), bin(729766), jpeg(798303), bin(2204684)Available download formats
    Dataset updated
    Feb 15, 2024
    Dataset provided by
    Qualitative Data Repository
    Authors
    Sarah S. Willen; Katherine A. Mason
    License

    https://qdr.syr.edu/policies/qdr-restricted-access-conditions

    Time period covered
    May 29, 2020 - May 31, 2022
    Area covered
    Mexico, Canada, Europe, United States, Central America
    Description

    Project Summary

    This dataset contains all qualitative and quantitative data collected in the first phase of the Pandemic Journaling Project (PJP). PJP is a combined journaling platform and interdisciplinary, mixed-methods research study developed by two anthropologists, with support from a team of colleagues and students across the social sciences, humanities, and health fields. PJP launched in Spring 2020 as the COVID-19 pandemic was emerging in the United States. PJP was created in order to “pre-design an archive” of COVID-19 narratives and experiences open to anyone around the world. The project is rooted in a commitment to democratizing knowledge production, in the spirit of “archival activism” and using methods of “grassroots collaborative ethnography” (Willen et al. 2022; Wurtz et al. 2022; Zhang et al. 2020; see also Carney 2021). The motto on the PJP website encapsulates these commitments: “Usually, history is written only by the powerful. When the history of COVID-19 is written, let’s make sure that doesn’t happen.” (A version of this Project Summary with links to the PJP website and other relevant sites is included in the public documentation of the project at QDR.)

    In PJP’s first phase (PJP-1), the project provided a digital space where participants could create weekly journals of their COVID-19 experiences using a smartphone or computer. The platform was designed to be accessible to as wide a range of potential participants as possible. Anyone aged 15 or older, living anywhere in the world, could create journal entries using their choice of text, images, and/or audio recordings. The interface was accessible in English and Spanish, but participants could submit text and audio in any language. PJP-1 ran on a weekly basis from May 2020 to May 2022.

    Data Overview

    This Qualitative Data Repository (QDR) project contains all journal entries and closed-ended survey responses submitted during PJP-1, along with accompanying descriptive and explanatory materials. The dataset includes individual journal entries and accompanying quantitative survey responses from more than 1,800 participants in 55 countries. Of nearly 27,000 journal entries in total, over 2,700 included images and over 300 are audio files. All data were collected via the Qualtrics survey platform. PJP-1 was approved as a research study by the Institutional Review Board (IRB) at the University of Connecticut.

    Participants were introduced to the project in a variety of ways, including through the PJP website as well as professional networks, PJP’s social media accounts (on Facebook, Instagram, and Twitter), and media coverage of the project. Participants provided a single piece of contact information — an email address or mobile phone number — which was used to distribute weekly invitations to participate. This contact information has been stripped from the dataset and will not be accessible to researchers.

    PJP uses a mixed-methods research approach and a dynamic cohort design. After enrolling in PJP-1 via the project’s website, participants received weekly invitations to contribute to their journals via their choice of email or SMS (text message). Each weekly invitation included a link to that week’s journaling prompts and accompanying survey questions. Participants could join at any point, and they could stop participating at any point as well. They also could stop participating and later restart. Retention was encouraged with a monthly raffle of three $100 gift cards. All individuals who had contributed that month were eligible. Regardless of when they joined, all participants received the project’s narrative prompts and accompanying survey questions in the same order.

    In Week 1, before contributing their first journal entries, participants were presented with a baseline survey that collected demographic information, including political leanings, as well as self-reported data about COVID-19 exposure and physical and mental health status. Some of these survey questions were repeated at periodic intervals in subsequent weeks, providing quantitative measures of change over time that can be analyzed in conjunction with participants' qualitative entries. Surveys employed validated questions where possible.

    The core of PJP-1 involved two weekly opportunities to create journal entries in the format of their choice (text, image, and/or audio). Each week, journalers received a link with an invitation to create one entry in response to a recurring narrative prompt (“How has the COVID-19 pandemic affected your life in the past week?”) and a second journal entry in response to their choice of two more tightly focused prompts. Typically the pair of prompts included one focusing on subjective experience (e.g., the impact of the pandemic on relationships, sense of social connectedness, or mental health) and another with an external focus (e.g., key sources of scientific information, trust in government, or COVID-19’s economic impact). Each week,...

  9. The Items Dataset

    • zenodo.org
    Updated Nov 13, 2024
    Cite
    Patrick Egan (2024). The Items Dataset [Dataset]. http://doi.org/10.5281/zenodo.10964134
    Explore at:
    Dataset updated
    Nov 13, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Patrick Egan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset originally created 03/01/2019
    UPDATE: Packaged on 04/18/2019
    UPDATE: Edited README on 04/18/2019

    I. About this Data Set

    This data set is a snapshot of ongoing work, a collaboration between Patrick Egan, Kluge Fellow in Digital Studies, and an intern at the Library of Congress in the American Folklife Center. It contains a combination of metadata from various collections that contain audio recordings of Irish traditional music. The development of this dataset is iterative, and it integrates visualizations that follow the key principles of trust and approachability. The project, entitled “Connections In Sound”, invites you to use and re-use this data.

    The text available in the Items dataset is generated from multiple collections of audio material that were discovered at the American Folklife Center. Each instance of a performance was listed and “sets” or medleys of tunes or songs were split into distinct instances in order to allow machines to read each title separately (whilst still noting that they were part of a group of tunes). The work of the intern was then reviewed before publication, and cross-referenced with the tune index at www.irishtune.info. The Items dataset consists of just over 1000 rows, with new data being added daily in a separate file.

    The collections dataset contains at least 37 rows of collections that were located by a reference librarian at the American Folklife Center. This search was complemented by searches of the collections by the scholar both on the internet at https://catalog.loc.gov and by using card catalogs.

    Updates to these datasets will be announced and published as the project progresses.

    II. What’s included?

    This data set includes:

    • The Items Dataset – a .CSV containing Media Note, OriginalFormat, On Website, Collection Ref, Missing In Duplication, Collection, Outside Link, Performer, Solo/multiple, Sub-item, type of tune, Tune, Position, Location, State, Date, Notes/Composer, Potential Linked Data, Instrument, Additional Notes, Tune Cleanup. This .CSV is the direct export of the Items Google Spreadsheet.

    III. How Was It Created?

    These data were created by a Kluge Fellow in Digital Studies and an intern on this program over the course of three months. By listening, transcribing, reviewing, and tagging audio recordings, these scholars improve access and connect sounds in the American Folklife Collections by focusing on Irish traditional music. Once transcribed and tagged, information in these datasets is reviewed before publication.

    IV. Data Set Field Descriptions

    a) Collections dataset field descriptions

    • ItemId – this is the identifier for the collection that was found at the AFC
    • Viewed – if the collection has been viewed, or accessed in any way by the researchers.
    • On LOC – whether or not there are audio recordings of this collection available on the Library of Congress website.
    • On Other Website – if any of the recordings in this collection are available elsewhere on the internet
    • Original Format – the format that was used during the creation of the recordings that were found within each collection
    • Search – this indicates the type of search that was performed and that resulted in locating recordings and collections within the AFC
    • Collection – the official title for the collection as noted on the Library of Congress website
    • State – The primary state where recordings from the collection were located
    • Other States – The secondary states where recordings from the collection were located
    • Era / Date – The decade or year associated with each collection
    • Call Number – This is the official reference number that is used to locate the collections, both in the urls used on the Library website, and in the reference search for catalog cards (catalog cards can be searched at this address: https://memory.loc.gov/diglib/ihas/html/afccards/afccards-home.html)
    • Finding Aid Online? – Whether or not a finding aid is available for this collection on the internet

    b) Items dataset field descriptions

    • id – the specific identification of the instance of a tune, song or dance within the dataset
    • Media Note – Any information that is included with the original format, such as identification, name of physical item, additional metadata written on the physical item
    • Original Format – The physical format that was used when recording each specific performance. Note: this field is used in order to calculate the number of physical items that were created in each collection such as 32 wax cylinders.
    • On Website? – Whether or not each instance of a performance is available on the Library of Congress website
    • Collection Ref – The official reference number of the collection
    • Missing In Duplication – This column marks if parts of some recordings had been made available on other websites, but not all of the recordings were included in duplication (see recordings from Philadelphia Céilí Group on Villanova University website)
    • Collection – The official title of the collection given by the American Folklife Center
    • Outside Link – If recordings are available on other websites externally
    • Performer – The name of the contributor(s)
    • Solo/multiple – This field is used to calculate the number of solo performers vs group performers in each collection
    • Sub-item – In some cases, physical recordings contained extra details; the sub-item column was used to denote these details
    • Type of item – This column describes each individual item type, as noted by performers and collectors
    • Item – The item title, as noted by performers and collectors. If an item was not described, it was entered as “unidentified”
    • Position – The position on the recording (in some cases during playback, audio cassette player counter markers were used)
    • Location – Local address of the recording
    • State – The state where the recording was made
    • Date – The date that the recording was made
    • Notes/Composer – The stated composer or source of the item recorded
    • Potential Linked Data – If items may be linked to other recordings or data, this column was used to provide examples of potential relationships between them
    • Instrument – The instrument(s) that was used during the performance
    • Additional Notes – Notes about the process of capturing, transcribing and tagging recordings (for researcher and intern collaboration purposes)
    • Tune Cleanup – This column was used to tidy each item so that it could be read by machines, but also so that spelling mistakes from the Item column could be corrected, and as an aid to preserving iterations of the editing process
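
    As a rough illustration of how the Solo/multiple field described above might be used, the following minimal sketch tallies solo and group performances per collection. The column names `Collection` and `Solo/multiple` come from the field descriptions; the sample rows and the values `solo` and `multiple` are assumptions made for illustration, not taken from the actual export.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real export has more columns and rows,
# and its Solo/multiple values may be spelled differently.
SAMPLE_CSV = """Collection,Performer,Solo/multiple,Item
Collection A,Performer 1,solo,unidentified
Collection A,Performers 2 and 3,multiple,Tune X
Collection B,Performer 4,solo,Tune Y
"""


def count_solo_vs_multiple(csv_text):
    """Return a Counter keyed by (collection, solo-or-multiple value)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["Collection"].strip(), row["Solo/multiple"].strip().lower())
        counts[key] += 1
    return counts


counts = count_solo_vs_multiple(SAMPLE_CSV)
for (collection, kind), n in sorted(counts.items()):
    print(f"{collection}: {kind} = {n}")
```

    To run this against the real file, the in-memory sample would be replaced by the contents of the exported .CSV, after checking the actual header spellings.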

    V. Rights statement

    The text in this data set was created by the researcher and intern and can be used in many different ways under creative commons with attribution. All contributions to Connections In Sound are released into the public domain as they are created. Anyone is free to use and re-use this data set in any way they want, provided reference is given to the creators of these datasets.

    VI. Creator and Contributor Information

    Creator: Connections In Sound

    Contributors: Library of Congress Labs

    VII. Contact Information

    Please direct all questions and comments to Patrick Egan via www.twitter.com/drpatrickegan or via his website at www.patrickegan.org. You can also get in touch with the Library of Congress Labs team via LC-Labs@loc.gov.

  10. TagX Data collection for AI/ ML training | LLM data | Data collection for AI...

    • datarade.ai
    .json, .csv, .xls
    Updated Jun 18, 2021
    Cite
    TagX (2021). TagX Data collection for AI/ ML training | LLM data | Data collection for AI development & model finetuning | Text, image, audio, and document data [Dataset]. https://datarade.ai/data-products/data-collection-and-capture-services-tagx
    Explore at:
    Available download formats: .json, .csv, .xls
    Dataset updated
    Jun 18, 2021
    Dataset authored and provided by
    TagX
    Area covered
    Antigua and Barbuda, Belize, Saudi Arabia, Iceland, Equatorial Guinea, Russian Federation, Benin, Djibouti, Colombia, Qatar
    Description

    We offer comprehensive data collection services that cater to a wide range of industries and applications. Whether you require image, audio, or text data, we have the expertise and resources to collect and deliver high-quality data that meets your specific requirements. Our data collection methods include manual collection, web scraping, and other automated techniques that ensure accuracy and completeness of data.

    Our team of experienced data collectors and quality assurance professionals ensures that the data is collected and processed according to the highest standards of quality. We also take great care to ensure that the data we collect is relevant and applicable to your use case. This means that you can rely on us to provide you with clean and useful data that can be used to train machine learning models, improve business processes, or conduct research.

    We are committed to delivering data in the format that you require. Whether you need raw data or a processed dataset, we can deliver the data in your preferred format, including CSV, JSON, or XML. We understand that every project is unique, and we work closely with our clients to ensure that we deliver the data that meets their specific needs. So if you need reliable data collection services for your next project, look no further than us.

  11. Imgur Most Viral and Secret Santa

    • kaggle.com
    Updated Apr 18, 2020
    Cite
    Ghalib93 (2020). Imgur Most Viral and Secret Santa [Dataset]. https://www.kaggle.com/ghalib93/imgur-most-viral-and-secret-santa/code
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.)
    Dataset updated
    Apr 18, 2020
    Dataset provided by
    Kaggle
    Authors
    Ghalib93
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Imgur is an image hosting and sharing website founded in 2009. It became one of the most popular websites around the world, with approximately 250 million users. The website does not require registration and anyone can browse its content. However, an account must be created in order to post. It is famous for an event that it created in 2013 where members register to send gifts to and receive gifts from other members on the website. The event takes place during Christmas time, and people share their gifts via the website, where they post pictures of the process or of what they received under a specific tag. The data provided covers two sections that I think are important to understanding certain patterns within the Imgur community. The first is the Most Viral section and the second is the Secret Santa tag.

    I have participated twice in the Imgur Secret Santa event and have always found funny and interesting posts in its Most Viral section. With the help of the Kaggle community, I would like to identify trends in the data provided and maybe make a comparison between the Secret Santa data and the Most Viral data.

    Content

    There are two Dataframes included and they are almost identical in the number of columns:

    • The first Dataframe is Imgur Most Viral posts. This contains many of the posts that were labelled as Viral by The Imgur community and team using specific algorithms to track number of likes and dislikes across multiple platforms. The posts might be videos, gifs, pictures or just text.

    • The second Dataframe is Imgur Secret Santa Tag. Secret Santa is an annual Imgur tradition where members can sign up to send gifts to and receive gifts from other members during the Christmas holiday. This contains many of the posts that were tagged with Secret Santa by the Imgur community. The posts might be videos, gifs, pictures or just text. There is a (is_viral) column in this Dataframe that is not available in the Most Viral Dataframe since all of the posts there are viral.

      Data Dictionary

      Feature | Type | Dataset | Description
      account_id | object | Imgur_Viral/imgur_secret_santa | Unique Account ID per member
      comment_count | float64 | Imgur_Viral/imgur_secret_santa | Number of comments made in the post
      datetime | float64 | Imgur_Viral/imgur_secret_santa | TimeStamp containing Date and Time Details
      downs | float64 | Imgur_Viral/imgur_secret_santa | Number of dislikes for the post
      favorite_count | float64 | Imgur_Viral/imgur_secret_santa | Number of users who marked the post as a favourite
      id | object | Imgur_Viral/imgur_secret_santa | Unique Post ID. Even if it was posted by the same member, different posts will have different IDs
      images_count | float64 | Imgur_Viral/imgur_secret_santa | Number of images included in the post
      points | float64 | Imgur_Viral/imgur_secret_santa | Each post will have calculated points based on (ups - downs)
      score | float64 | Imgur_Viral/imgur_secret_santa | Ticket number
      tags | object | Imgur_Viral/imgur_secret_santa | Tags are sub albums that the post will show under
      title | object | Imgur_Viral/imgur_secret_santa | Title of the post
      ups | float64 | Imgur_Viral/imgur_secret_santa | Number of likes for the post
      views | float64 | Imgur_Viral/imgur_secret_santa | Number of people that viewed the post
      is_most_viral | boolean | imgur_secret_santa | If the post is viral or not
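
    The data dictionary states that points are calculated as (ups - downs). A minimal sketch of a consistency check for that relationship is given below; the field names `id`, `ups`, `downs` and `points` come from the dictionary, while the sample rows are invented for illustration and are not real Imgur data.

```python
def points_mismatches(rows):
    """Return the ids of posts whose points differ from ups - downs."""
    return [r["id"] for r in rows if r["points"] != r["ups"] - r["downs"]]


# Hypothetical rows using the float64 types listed in the data dictionary.
rows = [
    {"id": "a1", "ups": 120.0, "downs": 20.0, "points": 100.0},
    {"id": "b2", "ups": 50.0, "downs": 5.0, "points": 40.0},  # deliberately inconsistent
]

mismatched = points_mismatches(rows)
print(mismatched)
```

    Whether the real data always satisfies this relationship is not stated in the description, so any mismatches found would need to be investigated rather than treated as errors.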

    Acknowledgements

    I would like to thank Imgur for providing an API that made collecting data from its website easier. With their help we might be able to better understand certain trends that emerge from its community.

    Inspiration

    There is no problem to solve from this data; it is just a fun way to explore and learn more about programming and analyzing data. I hope you enjoy playing with the data as much as I did collecting it and browsing the website.

  12. Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and...

    • ieee-dataport.org
    Updated Oct 21, 2024
    Cite
    Mohamad Amar Irsyad Mohd Aminuddin (2024). Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and Mobile Webpages [Dataset]. https://ieee-dataport.org/documents/website-fingerprinting-dataset-browsing-network-traffic-desktop-and-mobile-webpages
    Explore at:
    Dataset updated
    Oct 21, 2024
    Authors
    Mohamad Amar Irsyad Mohd Aminuddin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dataset of Tor cell files extracted from browsing simulations using Tor Browser. The simulations cover both desktop and mobile webpages. The data were collected with the WFP-Collector tool (https://github.com/irsyadpage/WFP-Collector). All the necessary configuration to perform the simulation is detailed in the tool repository. The webpage URLs were selected using the first 100 websites listed at https://dataforseo.com/free-seo-stats/top-1000-websites. Each webpage URL is visited 90 times in each of the desktop and mobile browsing modes.
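
    The visit counts given in the description imply an overall number of page visits, which can be worked out as in the short sketch below. It assumes exactly one webpage URL per website and that no visit was repeated or discarded; neither assumption is confirmed by the description, and the variable names are illustrative only.

```python
# Figures taken from the dataset description.
NUM_WEBSITES = 100           # first 100 websites from the ranking list
VISITS_PER_MODE = 90         # visits per URL in each browsing mode
BROWSING_MODES = ("desktop", "mobile")

# Assumption: one URL per website and every planned visit was carried out.
visits_per_mode = NUM_WEBSITES * VISITS_PER_MODE
expected_visits = visits_per_mode * len(BROWSING_MODES)

print(visits_per_mode, expected_visits)
```

    The actual number of cell files in the dataset may differ from this figure if some visits failed or were filtered out.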

  13. Greened waste collection sites - Dataset - Vilnius Open Data portal

    • opendata.vilnius.lt
    Updated Sep 30, 2024
    + more versions
    Cite
    (2024). Greened waste collection sites - Dataset - Vilnius Open Data portal [Dataset]. https://opendata.vilnius.lt/dataset/greened-waste-collection-sites
    Explore at:
    Dataset updated
    Sep 30, 2024
    Area covered
    Vilnius
    Description

    The dataset provides information on greened waste collection sites located in the city of Vilnius.

  14. Additional file 8 of Electronic data collection, management and analysis...

    • figshare.com
    xlsx
    Updated Jun 9, 2023
    + more versions
    Cite
    Patrick Keating; Jillian Murray; Karl Schenkel; Laura Merson; Anna Seale (2023). Additional file 8 of Electronic data collection, management and analysis tools used for outbreak response in low- and middle-income countries: a systematic review and stakeholder survey [Dataset]. http://doi.org/10.6084/m9.figshare.16680250.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Patrick Keating; Jillian Murray; Karl Schenkel; Laura Merson; Anna Seale
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 8. Technical characteristics of tools as identified from the review, survey, and tool developers’ websites/direct contact. Dataset that describes the technical characteristics of the electronic tools as identified from the systematic review (2010–2020), survey or from review of software websites or contact with software developers. Where no data were found on a particular characteristic of a tool, “don’t know” was entered in the database and where a tool only had one function (data collection or management or analysis), “NA” for not applicable was added to the relevant columns. The Samaritan’s Purse Reporting System was excluded from this database on request from the organisation.

  15. List of research data repositories that were shut down

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 11, 2024
    Cite
    Schabinger, Rouven (2024). List of research data repositories that were shut down [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7802441
    Explore at:
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Weisweiler, Nina Leonie
    Schabinger, Rouven
    Strecker, Dorothea
    Pampel, Heinz
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset aggregates information about 191 research data repositories that were shut down. The data collection was based on the registry of research data repositories re3data and a comprehensive content analysis of repository websites and related materials. The dataset documents the period in which each repository was active, the risks that led to its shutdown, and the repositories that took over custody of the data afterwards.

  16. Index 100KM: A Comprehensive Dataset Collection sourced from a Prominent...

    • store.smartdatahub.io
    Updated Aug 26, 2024
    Cite
    (2024). Index 100KM: A Comprehensive Dataset Collection sourced from a Prominent Swedish Website [Dataset]. https://store.smartdatahub.io/dataset/se_lantmateriet_index_100km_zip
    Explore at:
    Dataset updated
    Aug 26, 2024
    Description

    The dataset collection titled 'index_100km' is a valuable resource comprising one or multiple tables of related data. The data within these tables is meticulously organized in a structured format of columns and rows for easy understanding and analysis. The data for these tables has been sourced from the website of 'Lantmäteriet' (The Land Survey) in Sweden. This ensures the data's authenticity, given the reputation of the source. The dataset collection is crucial for various analyses and can be significantly useful for researchers and professionals in various fields.

  17. Requirements data sets (user stories)

    • zenodo.org
    • data.mendeley.com
    txt
    Updated Jan 13, 2025
    Cite
    Fabiano Dalpiaz (2025). Requirements data sets (user stories) [Dataset]. http://doi.org/10.17632/7zbk8zsd8y.1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jan 13, 2025
    Dataset provided by
    Mendeley Ltd.
    Authors
    Fabiano Dalpiaz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A collection of 22 data sets of 50+ requirements each, expressed as user stories.

    The dataset has been created by gathering data from web sources and we are not aware of license agreements or intellectual property rights on the requirements / user stories. The curator took utmost diligence in minimizing the risks of copyright infringement by using non-recent data that is less likely to be critical, by sampling a subset of the original requirements collection, and by qualitatively analyzing the requirements. In case of copyright infringement, please contact the dataset curator (Fabiano Dalpiaz, f.dalpiaz@uu.nl) to discuss the possibility of removal of that dataset [see Zenodo's policies]

    The data sets have been originally used to conduct experiments about ambiguity detection with the REVV-Light tool: https://github.com/RELabUU/revv-light

    This collection has been originally published in Mendeley data: https://data.mendeley.com/datasets/7zbk8zsd8y/1

    Overview of the datasets [data and links added in December 2024]

    The following text provides a description of the datasets, including links to the systems and websites, when available. The datasets are organized by macro-category and then by identifier.

    Public administration and transparency

    g02-federalspending.txt (2018) originates from early data in the Federal Spending Transparency project, which pertains to the website that is used to publicly share the spending data for the U.S. government. The website was created because of the Digital Accountability and Transparency Act of 2014 (DATA Act). The specific dataset pertains to a system called DAIMS or Data Broker, which stands for DATA Act Information Model Schema. The sample that was gathered refers to a sub-project related to allowing the government to act as a data broker, thereby providing data to third parties. The data for the Data Broker project is currently not available online, although the backend seems to be hosted in GitHub under a CC0 1.0 Universal license. Current and recent snapshots of federal spending related websites, including many more projects than the one described in the shared collection, can be found here.

    g03-loudoun.txt (2018) is a set of extracted requirements from a document, by the Loudoun County Virginia, that describes the to-be user stories and use cases about a system for land management readiness assessment called Loudoun County LandMARC. The source document can be found here and it is part of the Electronic Land Management System and EPlan Review Project - RFP RFQ issued in March 2018. More information about the overall LandMARC system and services can be found here.

    g04-recycling.txt (2017) concerns a web application where recycling and waste disposal facilities can be searched and located. The application operates through the visualization of a map that the user can interact with. The dataset was obtained from a GitHub website and is the basis of a students' project on web site design; the code is available (no license).

    g05-openspending.txt (2018) is about the OpenSpending project (www), a project of the Open Knowledge Foundation which aims at transparency about how local governments spend money. At the time of the collection, the data was retrieved from a Trello board that is currently unavailable. The sample focuses on publishing, importing and editing datasets, and how the data should be presented. Currently, OpenSpending is managed via a GitHub repository which contains multiple sub-projects with unknown license.

    g11-nsf.txt (2018) refers to a collection of user stories referring to the NSF Site Redesign & Content Discovery project, which originates from a publicly accessible GitHub repository (GPL 2.0 license). In particular, the user stories refer to an early version of the NSF's website. The user stories can be found as closed Issues.

    (Research) data and meta-data management

    g08-frictionless.txt (2016) regards the Frictionless Data project, which offers an open source dataset for building data infrastructures, to be used by researchers, data scientists, and data engineers. Links to the many projects within the Frictionless Data project are on GitHub (with a mix of Unlicense and MIT license) and web. The specific set of user stories was collected in 2016 by GitHub user @danfowler and is stored in a Trello board.

    g14-datahub.txt (2013) concerns the open source project DataHub, which is currently developed via a GitHub repository (the code has Apache License 2.0). DataHub is a data discovery platform which has been developed over multiple years. The specific data set is an initial set of user stories, which we can date back to 2013 thanks to a comment therein.

    g16-mis.txt (2015) is a collection of user stories that pertains to a repository for researchers and archivists. The source of the dataset is a public Trello repository. Although the user stories do not have explicit links to projects, it can be inferred that the stories originate from some project related to the library of Duke University.

    g17-cask.txt (2016) refers to the Cask Data Application Platform (CDAP). CDAP is an open source application platform (GitHub, under Apache License 2.0) that can be used to develop applications within the Apache Hadoop ecosystem, an open-source framework which can be used for distributed processing of large datasets. The user stories are extracted from a document that includes requirements regarding dataset management for Cask 4.0, which includes the scenarios, user stories and a design for the implementation of these user stories. The raw data is available in the following environment.

    g18-neurohub.txt (2012) is concerned with the NeuroHub platform, a neuroscience data management, analysis and collaboration platform for researchers in neuroscience to collect, store, and share data with colleagues or with the research community. The user stories were collected at a time NeuroHub was still a research project sponsored by the UK Joint Information Systems Committee (JISC). For information about the research project from which the requirements were collected, see the following record.

    g22-rdadmp.txt (2018) is a collection of user stories from the Research Data Alliance's working group on DMP Common Standards. Their GitHub repository contains a collection of user stories that were created by asking the community to suggest functionality that should be part of a website that manages data management plans. Each user story is stored as an issue on the GitHub page.

    g23-archivesspace.txt (2012-2013) refers to ArchivesSpace: an open source web application for managing archives information. The application is designed to support core functions in archives administration such as accessioning; description and arrangement of processed materials including analog, hybrid, and born digital content; management of authorities and rights; and reference service. The application supports collection management through collection management records, tracking of events, and a growing number of administrative reports. ArchivesSpace is open source and its

  18. Ecosystem Data

    • tern.org.au
    Updated Jul 18, 2020
    Cite
    Terrestrial Ecosystem Research Network (2020). Ecosystem Data [Dataset]. https://www.tern.org.au/ecosystem-data/
    Explore at:
    Dataset updated
    Jul 18, 2020
    Dataset provided by
    TERN
    Description

    TERN Ecosystem Data Services and Analytics

    TERN Research Data Repository

    Simplify your research data collection with the help of the research data repository managed by the Terrestrial Ecosystem Research Network. Our collection of ecosystem data includes ecoacoustics, bioacoustics, leaf area index information and much more.

    The TERN research data collection provides analysis-ready environment data that facilitates a wide range of ecological research projects undertaken by established and emerging scientists from Australia and around the world. The resources which we provide support scientific investigation in a wide array of environment and climate research fields along with decision-making initiatives.

    Explore Our Ecosystem Data Portals

    Open access ecosystem data collections via the TERN Data Discovery Portal and sub-portals:

    Access all TERN Environment Data

    Discover datasets published by TERN’s observing platforms and collaborators. Search geographically, then browse, query and extract the data via the TERN Data Discovery Portal.

    Search EcoPlots data

    Search, integrate and access Australia’s plot-based ecology survey data.

    Download ausplotsR

    Extract, prepare, visualise and analyse TERN Ecosystem Surveillance monitoring data in R.

    Search EcoImages

    Search and download Leaf Area Index (LAI), Phenocam and Photopoint images.

    Explore our data services

    Tools that support the discovery, analysis and re-use of data:

    Visualise the data

    We’ve teamed up with ANU to provide 50 landscape and ecosystem datasets presented graphically.

    Access CoESRA Virtual Desktop

    A virtual desktop environment that enables users to create, execute and share environmental data simulations.

    Submit data with SHaRED

    Our user-friendly tool to upload your data securely to our environment database so you can contribute to Australia’s ecological research.

    Other data portals, tools and services

    The Soil and Landscape Grid of Australia provides relevant, consistent, comprehensive, nation-wide data in an easily-accessible format. It provides detailed digital maps of the country’s soil and landscape attributes at a finer resolution than ever before in Australia.

    The annual Australia’s Environment products summarise a large number of observations on the trajectory of our natural resources and ecosystems. Use the data explorer to view and download maps, accounts or charts by region and land use type. The website also has national summary reports and report cards for different types of administrative and geographical regions.

    TERN’s ausplotsR is an R Studio package for extracting, preparing, visualising and analysing TERN’s Ecosystem Surveillance monitoring data. Users can use the package to directly access plot-based data on vegetation and soils across Australia, with simple function calls to extract the data and merge them into species occurrence matrices for analysis or to calculate things like basal area and fractional cover.

    The Australian Cosmic-Ray Neutron Soil Moisture Monitoring Network (CosmOz) delivers soil moisture data for 16 sites over an area of about 30 hectares to depths in the soil of between 10 and 50 cm. In 2020, the CosmOz soil moisture network, which is led by CSIRO, is set to be expanded to 23 sites.

    The TERN Mangrove Data Portal provides a diverse range of historical and contemporary remotely-sensed datasets on extent and change of mangrove ecosystems across Australia. It includes multi-scale field measurements of mangrove floristics, structure and biomass, a diverse range of airborne imagery collected since the 1950s, and multispectral and hyperspectral imagery captured by drones, aircraft and satellites.

    The TERN Wetlands and Riparian Zones Data Portal provides access to relevant national to local remotely-sensed datasets and also facilitates the collation and collection of on-ground data that support validation.

    ecocloud provides easy access to large volumes of curated ecosystem science data and tools, a computing platform and resources and tools for innovative research. ecocloud gives you 10GB of persistent storage to keep your code/notebooks so they are ready to go when you start up a server (R or Python environment). It uses the JupyterLab interface, which includes connections to GitHub, Google Drive and Dropbox.

    Analysis Ready Ecosystem Data

    Our research data collection makes it easier for scientists and researchers to investigate and answer their questions by providing them with open data, research and management tools, infrastructure, and site-based research tools.

    The TERN data portal provides open access ecosystem data. Our tools support data discovery, analysis, and re-use. The services which we provide facilitate research, education, and management. We maintain a network of monitoring site and sensor data streams for long-term research as part of our research data repository.

  19. Intensive Supervision for High-Risk Offenders in 14 Sites in the United...

    • icpsr.umich.edu
    • catalog.data.gov
    ascii, sas, spss +1
    Updated May 15, 2013
    Cite
    Petersilia, Joan; Turner, Susan (2013). Intensive Supervision for High-Risk Offenders in 14 Sites in the United States, 1987-1990 [Dataset]. http://doi.org/10.3886/ICPSR06358.v2
    Explore at:
    Available download formats: stata, sas, spss, ascii
    Dataset updated
    May 15, 2013
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Petersilia, Joan; Turner, Susan
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/6358/terms

    Time period covered
    1987 - 1990
    Area covered
    United States
    Description

    In 1986, the Bureau of Justice Assistance (BJA) funded a demonstration project of intensive supervision programs (ISPs), alternatives to control sanctions that involve community sanctions and emphasize stringent conditions and close monitoring of convicted offenders. The primary intent of the demonstration project was to determine the effects of participation in an ISP program on the subsequent behavior of offenders and to test the feasibility of the ISP's stated objectives: (1) to reduce recidivism by providing a seemingly cost-effective alternative to imprisonment, and (2) to provide an intermediate punishment between incarceration and regular probation that allows the punishment to fit the crime. Fourteen sites in nine states participated in the project and each of the selected sites was funded for 18 to 24 months. Individual agencies in each site tailored their ISP programs to their local needs, resources, and contexts, developed their own eligibility criteria, and determined whether probationers met those criteria. While the individual ISP projects differed, each site was required to follow identical procedures regarding random assignment, data collection, and overall program evaluation. Data collection instruments that differed in the amount of drug-related questions asked were used for the six- and twelve-month reviews. The "non-drug" data collection instrument, used in Contra Costa, Ventura, and Los Angeles counties, CA, Marion County, OR, and Milwaukee, WI, gathered drug data only on the number of monthly drug and alcohol tests given to offenders. The "drug" data collection instrument was distributed in Atlanta, Macon, and Waycross, GA, Seattle, WA, Santa Fe, NM, Des Moines, IA, and Winchester, VA. Variables regarding drug use included the number of drug tests ordered, the number of drug tests taken, and the number of positives for alcohol, cocaine, heroin, uppers, downers, quaaludes, LSD/hallucinogens, PCP, marijuana/hashish, and "other". 
    The drug questions on the instrument used in Dallas and Houston, TX, were the same as those asked at the drug sites. Once a site determined that an offender was eligible for inclusion, RAND staff randomly assigned the offender to either the experimental ISP program (prison diversion, enhanced probation, or enhanced parole) or to a control sanction (prison, routine probation, or parole). Assignment periods began in January 1987 and some sites continued to accept cases through January 1990. Each offender was followed for a period of one year, beginning on the day of assignment to the experimental or control program. The six-month and twelve-month review data contain identical variables: the current status of the offender (prison, ISP, or terminated), record of each arrest and/or technical violation, its disposition, and sentence or sanction. Information was also recorded for each month during the follow-up regarding face-to-face contacts, phone and collateral contacts, monitoring and record checks, community service hours, days on electronic surveillance (if applicable), contacts between client and community sponsor, number and type of counseling sessions and training, days in paid employment and earnings, number of drug and alcohol tests taken, and amount of restitution, fines, court costs, and probation fees paid. Background variables include sex, race, age at assignment, prior criminal history, drug use and treatment history, type of current offense, sentence characteristics, conditions imposed, and various items relating to risk of recidivism and need for treatment. For the two Texas sites, information on each arrest and/or technical violation, its disposition, and sentence or sanction was recorded in separate recidivism files (Parts 10 and 17). Dates were converted by RAND to time-lapse variables for the public release files that comprise this data collection.
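
    The description above says dates were converted to time-lapse variables but does not say how. As a minimal sketch only, assuming a time-lapse variable is the number of days between the day of assignment and a later event (the function name, field names, and example dates below are hypothetical and do not come from the ICPSR files), such a conversion might look like this:

```python
from datetime import date


def to_time_lapse(assignment: date, event: date) -> int:
    """Return the number of days from assignment to the event.

    Assumption: the lapse is measured in whole days from the day of
    assignment; the actual convention used by RAND is not documented here.
    """
    return (event - assignment).days


# Hypothetical record: an offender assigned in January 1987 and
# arrested in June of the same year.
assignment_date = date(1987, 1, 15)
arrest_date = date(1987, 6, 2)

lapse_days = to_time_lapse(assignment_date, arrest_date)

# The study followed each offender for one year from assignment,
# so an event counts only if it falls inside that window.
within_follow_up = 0 <= lapse_days <= 365
```

    Storing the lapse rather than the calendar date keeps the ordering and spacing of events relative to assignment while leaving the actual dates out of the public release file.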

  20. Characteristics of data collection, abstraction, and management at audit...

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Stephany N. Duda; Bryan E. Shepherd; Cynthia S. Gadd; Daniel R. Masys; Catherine C. McGowan (2023). Characteristics of data collection, abstraction, and management at audit sites A–G. [Dataset]. http://doi.org/10.1371/journal.pone.0033908.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Stephany N. Duda; Bryan E. Shepherd; Cynthia S. Gadd; Daniel R. Masys; Catherine C. McGowan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Characteristics of data collection, abstraction, and management at audit sites A–G.

