98 datasets found
  1. Unravelling travellers’ route choice behaviour at full-scale urban network...

    • plos.figshare.com
    • figshare.com
    tiff
    Updated May 31, 2023
    Cite
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug (2023). Unravelling travellers’ route choice behaviour at full-scale urban network by focusing on representative OD pairs in computer experiments [Dataset]. http://doi.org/10.1371/journal.pone.0225069
    Available download formats: tiff
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In a city-scale network, trips are made in thousands of origin-destination (OD) pairs connected by multiple routes, resulting in a large number of alternatives with diverse characteristics that influence the route choice behaviour of the travellers. As a consequence, to accurately predict user choices at full network scale, a route choice model should be scalable to suit all possible configurations that may be encountered. In this article, a new methodology to obtain such a model is proposed. The main idea is to use clustering analysis to obtain a small set of representative OD pairs and routes that can be investigated in detail through computer route choice experiments to collect observations on travellers’ behaviour. The results are then scaled up to all other OD pairs in the network. It was found that 9 OD pair configurations are sufficient to represent the network of Lyon, France, composed of 96,096 OD pairs and 559,423 routes. The observations, collected over these nine representative OD pair configurations, were used to estimate three mixed logit models. The predictive accuracy of the three models was tested against the predictive accuracy of the same models (with the same specification), but estimated over randomly selected OD pair configurations. The obtained results show that the models estimated with the representative OD pairs are superior in predictive accuracy, thus suggesting the scaling-up to the entire network of the choices of the participants over the representative OD pair configurations, and validating the methodology in this study.

  2. ODOT District OD Pairs

    • ohiostatefreightplan-ohiodot.hub.arcgis.com
    Updated Apr 22, 2022
    Cite
    Ohio Department of Transportation (2022). ODOT District OD Pairs [Dataset]. https://ohiostatefreightplan-ohiodot.hub.arcgis.com/datasets/odot-district-od-pairs/about
    Dataset updated
    Apr 22, 2022
    Dataset authored and provided by
    Ohio Department of Transportation (https://transportation.ohio.gov/)
    Description

    Table containing data used to join ODOT Districts to Origin and Destination of freight. This table also includes Origin and Destination ODOT District codes and information detailing whether the origin or destination is international or a US state.

  3. Dataset defining representative route network for GLOWOPT market segments

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, png
    Updated Jul 18, 2024
    Cite
    Kaushik Radhakrishnan (2024). Dataset defining representative route network for GLOWOPT market segments [Dataset]. http://doi.org/10.5281/zenodo.5110098
    Available download formats: png, csv, bin
    Dataset updated
    Jul 18, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kaushik Radhakrishnan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For calculating the GLOWOPT representative route network, a forecast model chain was used. The model was calibrated with 2019 flight movement data (unimpeded by COVID-19) and provided forecasted aircraft movements from the year 2019 (~2020) to 2050 in 5-year intervals.

    Two dataset formats are generated from the results of the forecast model chain: a CSV file format and a 4-dimensional array in MATLAB format (.mat).

    CSV Datasets

    For each forecasted year, a CSV file is generated with the IATA codes, region, latitude and longitude of the origin-destination (OD) airports, the representative aircraft type along with the aircraft category, the average load factor and, finally, the distance between the OD pair. The airports worldwide are subdivided into nine regions, namely Africa, Asia, Caribbean, Central America, Europe, Middle East, North America, Oceania and South America. There are a total of seven datasets, one for each forecasted year, i.e. for the years 2019 (~2020), 2025, 2030, 2035, 2040, 2045 and 2050.

    Description of the data labels:

    Origin- Origin airport IATA code

    Origin_Region- Region of the Origin Airport

    Origin_Latitude- Latitude of the Origin Airport

    Origin_Longitude- Longitude of the Origin Airport

    Destination- Destination airport IATA code

    Destination_Region- Region of the Destination Airport

    Destination_Latitude- Latitude of the Destination Airport

    Destination_Longitude- Longitude of the Destination Airport

    AcType- Representative aircraft type

    Load_Factor- Average load factor per flight

    Yearly_Frequency- Total aircraft movements per annum

    RefACType- Aircraft Category based on number of seats (Category 6 represents aircraft with seats 252-301 and category 7 represents aircraft with seats greater than 302.)

    Distance- Great circle distance between Origin and Destination in Km.
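
    The Distance label above is described as a great circle distance in km. As a rough cross-check, the haversine formula can be applied to the latitude and longitude labels. This is only a sketch: the dataset does not state which formula or Earth radius was used, so the 6371 km mean radius is an assumption, and the example row is hypothetical and not taken from the files.

```python
import math

# Assumption: mean Earth radius in km; the dataset does not state the value it used.
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 so rounding error near antipodal points cannot break asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Hypothetical row keyed by the documented labels; the coordinates are illustrative only.
row = {
    "Origin_Latitude": 0.0,
    "Origin_Longitude": 0.0,
    "Destination_Latitude": 0.0,
    "Destination_Longitude": 90.0,
}
quarter_equator_km = great_circle_km(
    row["Origin_Latitude"],
    row["Origin_Longitude"],
    row["Destination_Latitude"],
    row["Destination_Longitude"],
)
```

    Small differences from the published Distance values would be expected if a different radius or an ellipsoidal model was used.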

    MATLAB Datasets

    The dataset generated with MATLAB is a 4-dimensional array with the extension *.mat. The first dimension is the region of the origin airport and the second dimension is the region of the destination airport. The third and fourth dimensions are the aircraft category based on seat numbers and the categorized great circle distances. The entry retrieved for a given index is a 1x1 cell with the IATA codes of the OD pairs, frequency and great circle distance in km.

    The 4D array is categorised such that the user can select the route segment specific to a region or a combination of regions. The range categorisation in combination with an aircraft category additionally offers the user the possibility to select routes depending on their great circle distances. The ranges are categorised to represent very short range (0-2000 km), short range (2000-6000 km), medium range (6000-10000 km) and long range (10000-15000 km).

    Indexing based on the categorisation of the 4D array dataset - Refer to file 'Indexing_MAT_Dataset.PNG'

    For example:

    To derive the OD pairs and yearly frequency of aircraft movements for routes which originate from Europe and are destined for Asia, are operated with category 6 aircraft and are separated by distances between 10,000 and 15,000 km:

    In MATLAB (indexing based on file 'Indexing_MAT_Dataset.PNG'):

    Route_Network(5,2,1,4)

    Description on Index:

    5 – Europe: Origin Region

    2 – Asia: Destination Region

    1 – Category 6: Aircraft Type

    4 – 10000-15000 km: Range
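
    The range categorisation and the example index above can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and assigning a distance that falls exactly on a boundary (for example 2000 km) to the lower range is an assumption, since the description lists overlapping bounds.

```python
from bisect import bisect_left

# Upper bounds (km) of the four range categories listed in the description.
RANGE_UPPER_KM = (2000, 6000, 10000, 15000)


def range_index(distance_km):
    """Return the 1-based (MATLAB-style) range index for a great circle distance."""
    if not 0 <= distance_km <= RANGE_UPPER_KM[-1]:
        raise ValueError("distance is outside the categorised ranges")
    # Assumption: a distance equal to a bound is assigned to the lower range.
    return bisect_left(RANGE_UPPER_KM, distance_km) + 1


def to_zero_based(matlab_index):
    """Convert a 1-based MATLAB index tuple to a 0-based tuple."""
    return tuple(i - 1 for i in matlab_index)


# Europe (5) -> Asia (2), aircraft index 1 (category 6), distance of 12,000 km.
matlab_index = (5, 2, 1, range_index(12000))
python_index = to_zero_based(matlab_index)
```

    The 0-based tuple would only be needed if the array is read outside MATLAB, for instance into a NumPy array, where indexing starts at zero.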

  4. MTA Subway Origin-Destination Ridership Estimate: 2024

    • data.ny.gov
    • datasets.ai
    application/rdfxml +5
    Updated Mar 1, 2025
    Cite
    Metropolitan Transportation Authority (2025). MTA Subway Origin-Destination Ridership Estimate: 2024 [Dataset]. https://data.ny.gov/Transportation/MTA-Subway-Origin-Destination-Ridership-Estimate-2/jsu2-fbtj
    Available download formats: application/rdfxml, csv, json, tsv, application/rssxml, xml
    Dataset updated
    Mar 1, 2025
    Dataset authored and provided by
    Metropolitan Transportation Authority
    Description

    This dataset provides an estimate of subway travel patterns based on scaled-up OMNY and MetroCard return swipe data. It provides estimated passenger volumes for all populated origin-destination (O-D) pairs aggregated by month, day of the week, and hour of day. It also provides the name, ID, and approximate latitude and longitude of the origin and destination subway complexes.

  5. Hyperparameters of the prior distribution h.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug (2023). Hyperparameters of the prior distribution h. [Dataset]. http://doi.org/10.1371/journal.pone.0225069.t001
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Hyperparameters of the prior distribution h.

  6. Maltese crowS-pairs dataset

    • drum.um.edu.mt
    application/csv
    Updated Jun 25, 2024
    Cite
    CLAUDIA BORG; Marthese Borg (2024). Maltese crowS-pairs dataset [Dataset]. http://doi.org/10.60809/drum.26056957.v1
    Available download formats: application/csv
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    University of Malta
    Authors
    CLAUDIA BORG; Marthese Borg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Warning: This dataset contains explicit statements of offensive stereotypes which may be upsetting.

    The study of bias, fairness and social impact in Natural Language Processing (NLP) lacks resources in languages other than English. Our objective is to support the evaluation of bias in language models in a multilingual setting. We use stereotypes across nine types of biases to build a corpus containing contrasting sentence pairs, one sentence that presents a stereotype concerning an underadvantaged group and another minimally changed sentence, concerning a matching advantaged group.

    In total, we produced 11,139 new sentence pairs that cover stereotypes dealing with nine types of biases in seven cultural contexts. We use the final resource for the evaluation of relevant monolingual and multilingual masked language models.

    This file contains the sentence pairs localised to the Maltese context in the Maltese language.

    Other languages are available here: https://gitlab.inria.fr/corpus4ethics/multilingualcrowspairs

    The paper describing this work is available here: https://www.um.edu.mt/library/oar/handle/123456789/121722 and https://aclanthology.org/2024.lrec-main.1545/

    To use this dataset, please use the following citation:

    Karen Fort, Laura Alonso Alemany, Luciana Benotti, Julien Bezançon, Claudia Borg, Marthese Borg, Yongjian Chen, Fanny Ducel, Yoann Dupont, Guido Ivetta, Zhijian Li, Margot Mieskes, Marco Naguib, Yuyan Qian, Matteo Radaelli, Wolfgang S. Schmeisser-Nieto, Emma Raimundo Schulz, Thiziri Saci, Sarah Saidi, et al. 2024. Your Stereotypical Mileage May Vary: Practical Challenges of Evaluating Biases in Multiple Languages and Cultural Contexts. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 17764–17769, Torino, Italia. ELRA and ICCL.

  7. Text-audio pairs (4 of 4)

    • kaggle.com
    zip
    Updated Aug 14, 2024
    Cite
    Jorvan (2024). Text-audio pairs (4 of 4) [Dataset]. https://www.kaggle.com/datasets/jorvan/text-audio-pairs-4-of-4
    Available download formats: zip (0 bytes)
    Dataset updated
    Aug 14, 2024
    Authors
    Jorvan
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This is the fourth of the four datasets that we have created, for audio-text training tasks. These collect pairs of texts and audios, based on the audio-image pairs from our datasets [1, 2, 3]. These are only intended for research purposes.

    For the conversion, .csv tables were created, where audio values were split into 16,000 columns and images were transformed into texts using the public model BLIP [4]. The original images are also preserved for future reference.
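
    The row layout described above can be sketched as one text value followed by 16,000 audio values. This is a hedged illustration only: the column names ("text", "s0" ... "s15999") and the caption are hypothetical, and the link between 16,000 columns and a 1 s clip at 16 kHz is inferred from the description of the related image-audio datasets, not stated for this one.

```python
import csv
import io

# Assumption: 16,000 columns correspond to 1 s of audio sampled at 16 kHz.
SAMPLES_PER_CLIP = 16_000


def audio_text_row(caption, samples):
    """Build one CSV row: a text caption followed by one column per audio sample."""
    if len(samples) != SAMPLES_PER_CLIP:
        raise ValueError(f"expected {SAMPLES_PER_CLIP} samples, got {len(samples)}")
    row = {"text": caption}  # hypothetical column name
    row.update({f"s{i}": value for i, value in enumerate(samples)})
    return row


# Illustrative silent clip and caption; not taken from the dataset.
row = audio_text_row("a person playing a guitar", [0] * SAMPLES_PER_CLIP)

# Write the row to an in-memory CSV to show the resulting width.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
header = buffer.getvalue().splitlines()[0].split(",")
```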

    To allow other researchers a quick evaluation of the potential usefulness of our datasets for their purposes, we have made available a public page where anyone can check 60 random samples that we extracted from all of our data [5].

    [1] Jorge E. León. Image-audio pairs (1 of 3). 2024. url: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-1-of-3.
    [2] Jorge E. León. Image-audio pairs (2 of 3). 2024. url: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-2-of-3.
    [3] Jorge E. León. Image-audio pairs (3 of 3). 2024. url: https://www.kaggle.com/datasets/jorvan/image-audio-pairs-3-of-3.
    [4] Junnan Li et al. “BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation”. In: arXiv 2201.12086 (2022).
    [5] Jorge E. León. AVT Multimodal Dataset. 2024. url: https://jorvan758.github.io/AVT-Multimodal-Dataset/.

  8. Parameters and predicted probabilities of mode choice and ride pass...

    • zenodo.org
    csv
    Updated Oct 1, 2024
    Cite
    Xiyuan Ren; Y.J. Joseph Chow; Venktesh Pandey (2024). Parameters and predicted probabilities of mode choice and ride pass subscription for microtransit in Arlington, TX [Dataset]. http://doi.org/10.5281/zenodo.13379435
    Available download formats: csv
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Xiyuan Ren; Y.J. Joseph Chow; Venktesh Pandey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Arlington, Texas
    Description

    We provide parameters and predicted probabilities of mode choice (at trip level) and ride pass subscription (at individual level) for microtransit in Arlington, TX. The parameters are estimated by an agent-based, nested behavioral model that is developed in the C2SMARTER project “Multi-modal Tripchain Planner for Disadvantaged Travelers to Incentivize Transit Usage” (Award #69A3551747124). We separate the choice related to microtransit into two parts: travel mode choice and ride pass subscription choice. Synthetic population data from Replica Inc. and microtransit service data from City of Arlington are used for estimation.

    In the lower-branch travel mode choice, individuals decide which mode to use by considering factors such as travel time, cost, trip purpose, tour type, and mode-specific preferences. The mode choice set includes driving, walking, biking, carpool, and microtransit. Unlike traditional mode choice models, we allow time and cost parameters to vary across individuals, and we allow mode-specific constants to vary across trip OD pairs. This makes sense when we do not have sufficient data for socioeconomic attributes and built environment variables. The assumption we made here is that the impacts of these unobserved variables are included in the nonparametric distribution of individual and OD pair-level parameters.

    In the upper-branch ride pass subscription choice, individuals decide whether to purchase a weekly ride pass, a monthly ride pass, or no ride pass at all. By subscribing to a ride pass, travelers pay an amount of money in advance and enjoy free microtransit trips until the ride pass expires. The utility of purchasing a ride pass consists of four components: (1) the utility related to the prices of ride passes, (2) the utility related to the change in consumer surplus (or compensating variation) brought by free microtransit trips with a ride pass, (3) the utility specific to microtransit users, and (4) the alternative specific constant of the ride pass. Given the data availability, we consider the ride pass model as a simple MNL model with six parameters to calibrate. These ride pass parameters are calibrated using the Nelder-Mead Simplex Method. The cost function to minimize is the squared distance between the predicted ride pass market share and the observed one.

    Accordingly, this dataset consists of six .csv files:

    • Mode_choice_parameters_weekday.csv
    • Mode_choice_parameters_weekend.csv
    • RidePass_subscription_parameters.csv
    • Mode_choice_probability_weekday.csv
    • Mode_choice_probability_weekend.csv
    • Ridepass_subscription_probability.csv

    Field definition in "Mode_choice_parameters_weekday.csv" and "Mode_choice_parameters_weekend.csv"

    Field Name | Description
    -----------|------------
    iid | IDs of each synthetic individual
    trip_id | IDs of each synthetic trip
    origin_bgrp | FIPs code of the block group where the trip starts
    destination_bgrp | FIPs code of the block group where the trip ends
    B_AUTO_TT | The parameter of auto travel time (vary across individuals)
    B_MICRO_TT | The parameter of microtransit waiting time (vary across individuals)
    B_NON_AUTO_TT | The parameter of non-auto travel time (vary across individuals)
    B_COST | The parameter of travel cost (vary across individuals)
    ASC_MIRCO | The alternative specific constant of microtransit (vary across trip OD pairs)
    ASC_DRIVING | The alternative specific constant of driving (vary across trip OD pairs)
    ASC_BIKING | The alternative specific constant of biking (vary across trip OD pairs)
    ASC_WALKING | The alternative specific constant of walking (vary across trip OD pairs)
    MICRO_P_SHOPPING | The interaction effect between microtransit and shopping trip purpose (generic)
    MICRO_P_SCHOOL | The interaction effect between microtransit and school trip purpose (generic)
    MICRO_P_OTHER | The interaction effect between microtransit and other trip purpose (generic)
    MICRO_T_COMMUTE | The interaction effect between microtransit and commute tour type (generic)
    MICRO_T_HOME_BASED | The interaction effect between microtransit and home-based tour type (generic)

    Field definition in "RidePass_subscription_parameters.csv"

    Field Name | Description
    -----------|------------
    B_COST_RP | A transfer factor from trip fare to ride pass price
    B_CS_WEEKDAY | The parameter of increased consumer surplus (due to the ride pass) on weekday
    B_CS_WEEKEND | The parameter of increased consumer surplus (due to the ride pass) on weekend
    B_MICRO_USER | The parameter of a binary variable indicating former microtransit users
    ASC_WRP | The alternative specific constant of subscribing weekly ride pass
    ASC_MRP | The alternative specific constant of subscribing monthly ride pass

    Field definition in "Mode_choice_probability_weekday.csv" and "Mode_choice_probability_weekend.csv"

    Field Name | Description
    -----------|------------
    iid | IDs of each synthetic individual
    trip_id | IDs of each synthetic trip
    origin_bgrp | FIPs code of the block group where the trip starts
    destination_bgrp | FIPs code of the block group where the trip ends
    P_biking | The predicted probability of choosing biking
    P_carpool | The predicted probability of choosing carpool
    P_microtransit | The predicted probability of choosing microtransit
    P_driving | The predicted probability of choosing driving
    P_walking | The predicted probability of choosing walking

    Field definition in "Ridepass_subscription_probability.csv"

    Field Name | Description
    -----------|------------
    iid | IDs of each synthetic individual
    Micro_user | A binary variable indicating whether the individual used microtransit before
    Population_seg | Segment ID of the synthetic individual
    BLOCKGROUP | FIPs code of the home block group
    P_weekly_pass | The predicted probability of subscribing weekly ride pass
    P_monthly_pass | The predicted probability of subscribing monthly ride pass
    P_None | The predicted probability of subscribing no ride pass
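
    The description presents the ride pass choice as a simple MNL model over three alternatives (weekly pass, monthly pass, no pass), calibrated by minimising the squared distance between predicted and observed market shares. A minimal sketch of those two pieces follows. The utility values and observed shares are hypothetical placeholders, not values from the dataset, and the sketch evaluates a single set of utilities, whereas the described model would aggregate probabilities over individuals before comparing shares.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit: P_i = exp(V_i) / sum_j exp(V_j)."""
    v_max = max(utilities.values())  # subtract the maximum for numerical stability
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: value / total for alt, value in exp_v.items()}


def squared_share_error(predicted, observed):
    """Squared distance between predicted and observed market shares."""
    return sum((predicted[alt] - observed[alt]) ** 2 for alt in observed)


# Hypothetical systematic utilities for one individual (illustrative values only).
utilities = {"weekly": -1.0, "monthly": -1.5, "none": 0.0}
probabilities = mnl_probabilities(utilities)

# Hypothetical observed market shares (illustrative values only).
observed_shares = {"weekly": 0.2, "monthly": 0.1, "none": 0.7}
loss = squared_share_error(probabilities, observed_shares)
```

    In a calibration loop, a derivative-free optimiser such as Nelder-Mead would repeatedly adjust the parameters behind the utilities and re-evaluate this loss.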
  9. Digital image analysis tools for pairs of filaments in embedded 3D printing

    • catalog.data.gov
    • data.nist.gov
    Updated Mar 14, 2025
    Cite
    National Institute of Standards and Technology (2025). Digital image analysis tools for pairs of filaments in embedded 3D printing [Dataset]. https://catalog.data.gov/dataset/digital-image-analysis-tools-for-pairs-of-filaments-in-embedded-3d-printing
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    In embedded 3D printing, a nozzle is embedded into a support bath and extrudes filaments or droplets into the bath. This repository includes Python code for analyzing and managing images and videos of the printing process during extrusion of single filaments, pairs of filaments, and triplets of filaments. The link to the GitHub release goes to the state of the code when the paper was submitted. From there, you can also access the current state of the code.

  10. Number of registered Filipino au pairs 2020-2023, by destination

    • statista.com
    Updated Aug 12, 2024
    Cite
    Statista (2024). Number of registered Filipino au pairs 2020-2023, by destination [Dataset]. https://www.statista.com/statistics/1482463/philippines-registered-filipino-au-pairs-by-destination/
    Dataset updated
    Aug 12, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Philippines
    Description

    The Netherlands was the leading destination of registered Filipino au pairs in 2023 at 642. This was followed by Germany and Denmark, with 359 and 278 au pairs, respectively.

  11. MXL models estimation results.

    • plos.figshare.com
    xls
    Updated Jun 9, 2023
    Cite
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug (2023). MXL models estimation results. [Dataset]. http://doi.org/10.1371/journal.pone.0225069.t003
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MXL models estimation results.

  12. Image-audio pairs (3 of 3)

    • kaggle.com
    zip
    Updated Jun 28, 2024
    Cite
    Jorvan (2024). Image-audio pairs (3 of 3) [Dataset]. https://www.kaggle.com/datasets/jorvan/image-audio-pairs-3-of-3
    Available download formats: zip (0 bytes)
    Dataset updated
    Jun 28, 2024
    Authors
    Jorvan
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This is the third of the three datasets that we have created for audio-image training tasks. These collect pairs of images and audio clips, extracted from the same intervals of time in videos from the public datasets MUSIC dataset [1, 2], AudioSetZSL [3, 4] and SoundNet [5, 6]. As AudioSetZSL is only intended for research purposes, so are our datasets. The images are 512x512 .png files in RGB24 color format. The audio clips are 1 s long, monophonic .wav files sampled at 16 kHz with 16-bit depth. A small slip made us omit naming the first .zip file as 1.zip. However, this has no meaningful impact: our data is still perfectly usable and is also quite varied. In addition, you can easily distinguish the pairs by their numeric names (which were assigned in ascending order).

    [1] Hang Zhao and Andrew Rouditchenko. MUSIC Dataset from Sound of Pixels. 2018. url: https://github.com/roudimit/MUSIC_dataset.
    [2] Hang Zhao et al. “The Sound of Pixels”. In: Proceedings of the 15th European Conference on Computer Vision. 2018, pp. 587-604.
    [3] Kranti Kumar Parida. AudioSetZSL. 2019. url: https://github.com/krantiparida/AudioSetZSL.
    [4] Kranti Kumar Parida et al. “Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zero-shot Classification and Retrieval of Videos”. In: Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision. 2020, pp. 3240-3249.
    [5] Yusuf Aytar, Carl Vondrick and Antonio Torralba. “SoundNet: Learning Sound Representations from Unlabeled Video”. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, pp. 892-900.
    [6] Bart Thomee et al. “YFCC100M: the new data in multimedia research”. In: Communications of the ACM 59.2 (2016), pp. 64-73.

  13. PP-ind: A Repository of Industrial Pair Programming Research Data

    • zenodo.org
    txt
    Updated Feb 15, 2021
    Cite
    Franz Zieris; Lutz Prechelt (2021). PP-ind: A Repository of Industrial Pair Programming Research Data [Dataset]. http://doi.org/10.5281/zenodo.4529143
    Available download formats: txt
    Dataset updated
    Feb 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Franz Zieris; Lutz Prechelt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PP-ind is a repository of research data on industrial pair programming sessions. Since 2007, our research group has collected audio-video-recordings and questionnaire data in 13 companies. A total of 57 developers worked together (mostly in groups of two, but also three or four) in 67 sessions with a mean length of 1:35 hours. A separate tech report provides many details on how this data was collected.

    While we cannot share the original video recordings due to confidentiality agreements, we do provide transcripts of the pairs' dialog in this data set. Note that we transcribe our data on an as-needed basis. Early versions of this data set will therefore contain only a few partial transcripts, which will be amended over time.

    Files named "session--transcript.txt" contain original quotations in the language spoken by the recorded developers. For non-English sessions, we also provide non-authoritative "session--transcript_translated.txt" files (following the same as-needed rule for translating the originals). All our analyses, however, are performed on the raw data as reflected in the original transcripts. See file "transcription-notation.txt" for details on the special notation we use.

  14. Data from: Persuasive-Pairs

    • huggingface.co
    Updated Nov 8, 2024
    Cite
    Amalie Pauli (2024). Persuasive-Pairs [Dataset]. https://huggingface.co/datasets/APauli/Persuasive-Pairs
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 8, 2024
    Authors
    Amalie Pauli
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Persuasive Pairs

    The dataset consists of pairs of short texts: one from news, a debate or a chat (see the field 'source' to see where the text originates from), and one rewritten by an LLM to contain more or less persuasive language. The pairs are judged on degrees of persuasive language by three annotators: the task is to select which text contains more persuasive language and how much more on an ordinal scale with 'marginally', 'moderately', or 'heavily' more. Flatten out the score is a 6-point… See the full description on the dataset page: https://huggingface.co/datasets/APauli/Persuasive-Pairs.

  15. Monthly app downloads of Pairs in Japan 2024

    • statista.com
    Updated Jul 9, 2025
    Cite
    Statista (2025). Monthly app downloads of Pairs in Japan 2024 [Dataset]. https://www.statista.com/statistics/1245001/japan-monthly-number-of-app-downloads-pairs/
    Dataset updated
    Jul 9, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2024 - Dec 2024
    Area covered
    Japan
    Description

    The matchmaking app Pairs was downloaded about **** thousand times in Japan in December 2024. The total number of downloads during that year reached more than *** million. The app was released by Eureka, Inc. in 2012.

  16. 23 Pairs of Identical Twins Face Image Data

    • m.nexdata.ai
    • nexdata.ai
    Updated Jul 7, 2025
    Cite
    Nexdata (2025). 23 Pairs of Identical Twins Face Image Data [Dataset]. https://m.nexdata.ai/datasets/computervision/1007?source=Github
    Dataset updated
    Jul 7, 2025
    Dataset authored and provided by
    Nexdata
    Variables measured
    Device, Accuracy, Data size, Data format, Data diversity, Race distribution, Collection environment
    Description

    23 Pairs of Identical Twins Face Image Data. The collection scenes include indoor and outdoor scenes. The data diversity includes multiple face angles, multiple face postures, close-ups of eyes, multiple light conditions and multiple age groups. This dataset can be used for tasks such as twins' face recognition.

  17. Dataset of Pairs of an Image and Tags for Cataloging Image-based Records

    • narcis.nl
    • data.mendeley.com
    Updated Apr 19, 2022
    Cite
    Suzuki, T (via Mendeley Data) (2022). Dataset of Pairs of an Image and Tags for Cataloging Image-based Records [Dataset]. http://doi.org/10.17632/msyc6mzvhg.2
    Dataset updated
    Apr 19, 2022
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Suzuki, T (via Mendeley Data)
    Description

    Brief Explanation

    This dataset was created to develop and evaluate a cataloging system that assigns appropriate metadata to an image record for database management in digital libraries. It is intended for evaluating a task in which, given an image and its assigned tags, an appropriate Wikipedia page is selected for each of the given tags.

    A main characteristic of the dataset is that it includes ambiguous tags. Thus, the visual contents of images are not unique to their tags. For example, it includes a tag 'mouse' that has a double meaning and refers not to the mammal but to the computer input device. The annotations are the Wikipedia articles judged by humans to be the correct entities for the tags.

    The dataset offers both data and programs that reproduce experiments on the above-mentioned task. Its data consist of image sources and annotations. The image sources are URLs of 420 images uploaded to Flickr. The annotations are a total of 2,464 relevant Wikipedia pages manually judged for the tags of the images. The dataset also provides programs in a Jupyter notebook (scripts.ipynb) for running some baseline methods on the designated task and evaluating the results.

    ## Structure of the Dataset

    1. data directory
       1.1. image_URL.txt: lists the URLs of the image files.
       1.2. rels.txt: lists the correct Wikipedia pages for each topic in topics.txt.
       1.3. topics.txt: lists the target pairs, each called a topic in this dataset, of an image and a tag to be disambiguated.
       1.4. enwiki_20171001.xml: contains text extracted from the titles and bodies of English Wikipedia articles as of 1 October 2017. It is a modified version of the Wikipedia dump data (https://archive.org/download/enwiki-20171001).
    2. img directory: a placeholder directory for the downloaded image files.
    3. results directory: a placeholder directory for the result files to be evaluated. It holds three sets of baseline results in sub-directories. They contain JSON files, each of which is the result for one topic, and are ready to be evaluated with the evaluation scripts in scripts.ipynb, as a reference for both usage and performance.
    4. scripts.ipynb: this Jupyter notebook contains the scripts for running the baseline methods and the evaluation.

  18. Statistics of the distribution of alternative routes in OD pairs.

    • figshare.com
    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Fanglei Jin; Enjian Yao; Yongsheng Zhang; Shasha Liu (2023). Statistics of the distribution of alternative routes in OD pairs. [Dataset]. http://doi.org/10.1371/journal.pone.0185349.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Fanglei Jin; Enjian Yao; Yongsheng Zhang; Shasha Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistics of the distribution of alternative routes in OD pairs.

  19. Awareness of Pairs Japan 2023

    • statista.com
    Updated Jul 29, 2024
    Cite
    Statista (2024). Awareness of Pairs Japan 2023 [Dataset]. https://www.statista.com/statistics/1434296/japan-awareness-of-pairs/
    Explore at:
    Dataset updated
    Jul 29, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    May 25, 2023 - May 26, 2023
    Area covered
    Japan
    Description

    According to an online survey conducted in Japan in 2023, 39.1 percent of respondents stated they know about the matchmaking app Pairs. 10.5 percent of respondents answered that they have used it before.

  20. Causal Dataset for cause-effect pairs from Tubingen repository

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 3, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Zadorozhnyi, Oleksandr (2023). Causal Dataset for cause-effect pairs from Tubingen repository [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7709406
    Explore at:
    Dataset updated
    May 3, 2023
    Dataset provided by
    Haug, Stephan
    Reifferscheidt, David
    Drton, Mathias
    Zadorozhnyi, Oleksandr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tübingen
    Description

    Cause-effect is a two-dimensional database of two-variable cause-effect pairs chosen from different datasets created by the Max-Planck-Institute for Biological Cybernetics in Tuebingen, Germany.

    Size: 83 datasets of various sizes

    Number of features: 2 in every dataset

    Ground truth: available for every dataset

    Type of Graph: directed

    Extension of the datasets used in the CauseEffectPairs task. Each dataset consists of samples of a pair of statistically dependent random variables, where one variable is known to cause the other. The task is to identify, for each pair, which of the two variables is the cause and which is the effect, using the observed samples only.

    More information about the dataset is contained in the causal_description.html file.

    Reference

    J. M. Mooij, J. Peters, D. Janzing, J. Zscheischler, B. Schoelkopf: “Distinguishing cause from effect using observational data: methods and benchmarks”, Journal of Machine Learning Research 17(32):1-102, 2016

